Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Meta reported 67.32K employees for its fiscal year ending in December 2023. Data for Meta | FB - Employees Total Number, including historical tables and charts, were last updated by Trading Economics in October 2025.
Meta Platforms had ****** full-time employees as of December 2024, down from ****** people in 2023. As of December 2023, more than ******* employees at tech companies worldwide were laid off throughout the year across more than 1,000 companies.

Facebook: how it all began
In 2003, a Harvard sophomore named Mark Zuckerberg hacked into protected areas of the university's computer network in order to find photos of other students. He would then pair two of them next to each other on a program called “Facemash” and ask users to choose the more attractive person. At the beginning of 2004, Zuckerberg launched “The Facebook,” a social network dedicated to Harvard students, which later grew to encompass Columbia, Yale and Stanford. The popularity of this new service skyrocketed, and in mid-2004 Zuckerberg interrupted his studies and moved his operation to Palo Alto, California, in the heart of Silicon Valley. By 2006, Facebook was open to the general public. In 2020, the company reported almost ** billion U.S. dollars in revenue and a net income of ***** billion U.S. dollars. It is also the most popular social network in the world, with *** billion monthly active users as of December 2020.

Facebook employee diversity criticism
Like many other tech companies, Facebook has been criticized for having a diversity problem. As of June 2020, tech positions, as well as management roles, in U.S. offices were overwhelmingly occupied by men. Furthermore, almost ** percent of Facebook employees in the U.S. are White and only *** percent are African-American, which has sparked concern regarding representation and equal opportunities. Around **** percent of senior-level positions are occupied by White employees and only *** percent by Hispanic-Americans.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Meta population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Meta. The dataset can be used to understand the population distribution of Meta by age. For example, using this dataset, we can identify the largest age group in Meta.
Key observations
The largest age group in Meta, MO was 65 to 69 years, with a population of 18 (14.06%), according to the ACS 2019-2023 5-Year Estimates. The smallest age group in Meta, MO was 20 to 24 years, with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Meta Population by Age.
We include a description of the data sets in the meta-data as well as sample code and results from a simulated data set.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: The R code is available on line here: https://github.com/warrenjl/SpGPCW.

Format:

Abstract
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

Description
Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

File format: R workspace file.

Metadata (including data dictionary)
• y: Vector of binary responses (1: preterm birth, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate).

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
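The week-by-week standardization described for the pollution exposures (subtract that week's median, divide by that week's IQR) can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: the function name `standardize_by_week`, the toy input values, and the quartile convention (`method="inclusive"`) are all assumptions, and a different quartile definition would give slightly different scaled values.

```python
from statistics import median, quantiles


def standardize_by_week(exposures):
    """Scale each week (column) by its own median and IQR.

    exposures: list of rows, one per individual, each a list of
    weekly exposure values (hypothetical layout for illustration).
    """
    n_weeks = len(exposures[0])
    # Collect the values of each week across all individuals.
    columns = [[row[w] for row in exposures] for w in range(n_weeks)]

    scaled_columns = []
    for col in columns:
        med = median(col)
        # Quartiles under the "inclusive" convention (an assumption).
        q1, _, q3 = quantiles(col, n=4, method="inclusive")
        iqr = q3 - q1
        scaled_columns.append([(v - med) / iqr for v in col])

    # Transpose back to one row per individual.
    return [list(row) for row in zip(*scaled_columns)]


# Toy data: 5 individuals, 2 weeks (values are made up).
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
z = standardize_by_week(raw)
print(z)
```

Because each week is scaled separately, the two toy columns end up identical after standardization even though their raw scales differ by a factor of ten.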
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Meta population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of Meta. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.
Key observations
The largest age group was 18 to 64 years with a population of 79 (61.72% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age cohorts:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Meta Population by Age.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
“Falling between the cracks”: Investigating the competing challenges experienced by professionals working with people who hoard. Metadata including information about data collection, interview schedule and output data (transcript example).
The attached document is the manual on how to use this portal. It is only accessible to logged-in users.
All Senckenberg employees should be able to log in using their institutional credentials.
IMPORTANT: For your first login, please
Add your full name to your profile!
Contact the admin of your organizational unit to add you to the group
Read further details in the manual.
In case of an update, new versions will be uploaded here.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Although many people with acquired brain injury (ABI) wish to work, getting (back) to work after ABI is not always straightforward. Individual Placement and Support (IPS) is an evidence-based intervention, focused on people who do not have an employer, that was developed to help people with severe mental illnesses obtain and maintain paid work. During IPS, the person is supported by an IPS employment specialist who works together with health care providers and employers. It is a highly client-centered method. The aim of the study is to investigate the feasibility of IPS in people with ABI and to gain insight into its first effects on employment. Please see the metadata file for additional information.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files, e.g., folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files, e.g., 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
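As a rough illustration of the two-level layout described above, a version id can be mapped to its folder with integer division. The helper name `version_path`, the default `ipynb` extension, and the absence of zero-padding in folder names are assumptions made for this sketch; check them against the actual dataset before relying on the result.

```python
def version_path(version_id, extension="ipynb"):
    """Return the relative path for a KernelVersions id.

    Assumes folder names are plain integers without zero-padding
    and that the file is named after the id (both unverified).
    """
    top = version_id // 1_000_000       # e.g. 123 for 123,456,789
    sub = (version_id // 1_000) % 1_000  # e.g. 456 for 123,456,789
    return f"{top}/{sub}/{version_id}.{extension}"


print(version_path(123456789))
```

Joining on the id in this way is what lets the code files be linked back to the KernelVersions rows in Meta Kaggle.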
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type, full-time (FT) and part-time (PT), offering insight into the income landscape within Meta. The dataset can be used to examine gender-based income distribution within the Meta population, aiding data analysis and decision-making.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income brackets:
Variables / Data Columns
Employment type classifications include:
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Meta median household income by race.
As of January 2024, several major technology companies, including Google, Amazon, Meta, and Apple, have implemented return-to-office mandates requiring employees to be in the office at least three days per week. Interestingly, Zoom, a company that played a significant role in facilitating work-from-home activities during the COVID-19 pandemic, has announced a return-to-office mandate of its own requiring employees to work from the office twice per week. In contrast, X (formerly Twitter) has had an office-only policy since Elon Musk acquired Twitter in 2022, requiring all X employees to work from the office for the entire work week.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PLEASE, CITE AS Kalabikhina IE, Kuznetsova PO, Zhuravleva SA (2024) Size and factors of the motherhood penalty in the labour market: A meta-analysis. Population and Economics 8(2): 178-205. https://doi.org/10.3897/popecon.8.e121438
Explanatory note 1: List of papers used in the meta-analysis - see the file "Meta_regression_analysis_papers".
The data is presented in WORD format.
Explanatory note 2: Set of data used in the meta-analysis - see the file "Meta_regression_analysis_table".
The data is presented in EXCEL format.
Description of table headers:
estimate_number - Number of the estimate
paper_number - Number of the paper
paper_name - Paper (year and first author)
paper_excluded - Paper was excluded from the final sample
survey - Data source
table_in_paper - Number of the table with the regression results in the paper
coeff - Regression coefficient for parenthood variable (estimate)
se - SE of the estimate
t - t-value of the estimate
ols - Estimate is obtained using the OLS method
fixed_effects - Estimate is obtained using the fixed effects method
panel - Model considers panel data (for several years)
quintile - Estimate is obtained using the quantile regression method
other - Estimate is obtained using other methods
selection_into_motherhood - Estimate is obtained allowing for selection into motherhood
hackman - Estimate is obtained allowing for selection into employment (Heckman procedure)
annual_earnings - Annual earnings are considered in the model
monthly_wage - Monthly wage is considered in the model
daily_wage - Daily wage is considered in the model
hourly_wage - Hourly wage is considered in the model
min_age_kid - Child's age (minimum)
max_age_kid - Child's age (maximum)
motherhood - Model uses a dummy variable of the presence of children
num_kids - Model uses a variable of the number of children
kid1 - Model uses a variable of the presence of one child
kid2p - Model uses a variable of the presence of two or more children
kid2 - Model uses a variable of the presence of two children
kid3p - Model uses a variable of the presence of three or more children
kid3 - Model uses a variable of the presence of three children
kid4p - Model uses a variable of the presence of four or more children
race/nationality - Model includes a race/ethnicity variable
age - Model includes the age variable
marstat - Model includes the marital status variable
oth_char_hh - Model includes any other variables of other household characteristics
settl_type - Model includes a variable of the type of settlement (urban, rural)
region - Model includes a variable of the region of the country
education - Model includes information on the level of education
experience - Model includes a variable of work experience
pot_experience - Model includes a variable of potential work experience, to be calculated from the data on age and number of years of education
tenure - Model includes a variable of the duration of employment at the current job
interruptions - Model includes a variable of employment interruptions (related to motherhood)
occupation - Model includes an occupation variable
industry - Model includes a variable of the industry of employment
union - Model includes a variable of trade union membership
friendly_conditions - Model includes a variable of the favourable working conditions for mothers (flexible schedule, possibility to work from home, etc.).
hours - Model includes a variable of the number of hours worked
sector - Model includes a variable of the type of employer ownership (public or private)
informal - Model includes a variable of informal employment
size_ent - Model includes a variable of the employer size
min_age_woman - Woman's age (minimum)
max_age_woman - Woman's age (maximum)
mean_age_woman - Woman's age (mean)
restricted - Sample is limited
private - Model considers only private sector employees
state - Model considers only public sector employees
full_time - Model considers only full-time workers
part_time - Model considers only part-time workers
better_educated - Model considers only women with a high level of education
lower_educated - Model considers only women with a low level of education
married - Model includes only married women
single - Model includes only single women
natives - Model includes only native women (born in the country)
immigrants - Model includes only immigrant women (born abroad)
race - Model includes only women of a particular race
min_year - Time period (minimum year)
max_year - Time period (maximum year)
journal - Type of publication
usa - Sample includes women from the USA
western_europe - Sample includes women from Western Europe (Belgium, France, Germany, Luxembourg, the Netherlands, Switzerland)
north_europe - Sample includes women from Northern Europe (Denmark, Finland, Norway, Sweden)
south_europe - Sample includes women from Southern Europe (Greece, Italy, Portugal, Spain)
east_centre_europe - Sample includes women from Central or Eastern Europe (Czechia, Hungary, Poland, Russia, Serbia, Ukraine)
china - Sample includes women from China
Russia - Sample includes women from Russia
others - Sample includes women from other countries
country - Country name
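The `coeff` and `se` columns described above are the usual inputs for pooling estimates. As a hedged illustration only, the sketch below computes a simple fixed-effect (inverse-variance weighted) pooled estimate; this is a swapped-in textbook technique, not the meta-regression the authors ran, and the function name `pooled_fixed_effect` and the numeric values are hypothetical.

```python
import math


def pooled_fixed_effect(estimates):
    """Inverse-variance weighted mean of (coeff, se) pairs.

    estimates: list of (coeff, se) tuples, e.g. read from the
    `coeff` and `se` columns of the table (values here are made up).
    Returns (pooled coefficient, standard error of the pooled value).
    """
    # Each estimate is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * coeff for w, (coeff, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Two hypothetical motherhood-penalty estimates (log-wage points).
example = [(-0.10, 0.02), (-0.05, 0.04)]
pooled, pooled_se = pooled_fixed_effect(example)
print(round(pooled, 4), round(pooled_se, 4))
```

More precise estimates (smaller `se`) dominate the pooled value, which is why the result sits closer to the first estimate than to the plain average of the two.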
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Scholars have conducted numerous studies on how distal antecedents influence motivational states, subsequently affecting employee proactive work behavior. However, there is still debate about the strength of the effects different motivational states have on employee proactive behavior. This paper employs meta-analysis to explore the motivational mechanism proposed in Parker's (2010) model of proactive work behavior. It analyzes the relative strength of the effects of three motivational states—"can do," "reason to," and "energized to"—on employee proactive work behavior. Through literature screening, this study conducted a meta-analysis of 94 Chinese and English studies (with a total sample size of 30,724) that adopted Parker's theoretical model. It compared the correlations between distal antecedents, different motivational perspectives, and proactive work behavior. The results show that all three motivational states positively influence proactive work behavior. "Reason to" has a stronger effect on proactive work behavior compared to the other motivational states, and "can do" is stronger than "energized to," highlighting the importance of "reason to" in promoting employee proactive work behavior. Further discussion on this topic is provided.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
ABSTRACT Meta-analysis is an adequate statistical technique to combine results from different studies, and its use has been growing in the medical field. Thus, not only knowing how to interpret meta-analysis, but also knowing how to perform one, is fundamental today. Therefore, the objective of this article is to present the basic concepts and serve as a guide for conducting a meta-analysis using R and RStudio software. For this, the reader has access to the basic commands in the R and RStudio software necessary for conducting a meta-analysis. The advantage of R is that it is free software. For a better understanding of the commands, two examples are presented in a practical way, in addition to a review of some basic concepts of this statistical technique. It is assumed that the data necessary for the meta-analysis has already been collected; that is, the methodology of systematic review is not discussed. Finally, it is worth remembering that there are many other techniques used in meta-analyses that are not addressed in this work. However, with the two examples used, the article already enables the reader to proceed with good and robust meta-analyses. Level of Evidence V, Expert Opinion.
Although we spend much of our waking hours working, the emotional experience of work, versus non-work, remains unclear. While the large literature on work stress suggests that work generally is aversive, some seminal theory and findings portray working as salubrious and perhaps as an escape from home life. Here, we examine the subjective experience of work (versus non-work) by conducting a quantitative review of 59 primary studies that assessed affect on working days. Meta-analyses of within-day studies indicated that there was no difference in positive affect (PA) between work versus non-work domains. Negative affect (NA) was higher for work than non-work, although the magnitude of difference was small (i.e., .22 SD, an effect size comparable to that of the difference in NA between different leisure activities like watching TV versus playing board games). Moderator analyses revealed that PA was relatively higher at work and NA relatively lower when affect was measured using "real-time" measurement (e.g., Experience Sampling Methodology) versus measured using the Day Reconstruction Method (i.e., real-time reports reveal a more favorable view of work as compared to recall/DRM reports). Additional findings from moderator analyses included significant differences in main effect sizes as a function of the specific affect, and, for PA, as a function of the age of the sample and the time of day when the non-work measurements were taken. Results for the other possible moderators including job complexity and affect intensity were not statistically significant.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the newest types of multimedia involves body-connected interfaces, usually termed haptics. Haptics may use stylus-based tactile interfaces, glove-based systems, handheld controllers, balance boards, or other custom-designed body-computer interfaces. How well do these interfaces help students learn Science, Technology, Engineering, and Mathematics (STEM)? We conducted an updated review of learning STEM with haptics, applying meta-analytic techniques to 21 published articles reporting on 53 effects for factual, inferential, procedural, and transfer STEM learning. This deposit includes the data extracted from those articles and comprises the raw data used in the meta-analytic analyses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A collection of 22 data sets of 50+ requirements each, expressed as user stories.
The dataset has been created by gathering data from web sources and we are not aware of license agreements or intellectual property rights on the requirements / user stories. The curator took utmost diligence in minimizing the risks of copyright infringement by using non-recent data that is less likely to be critical, by sampling a subset of the original requirements collection, and by qualitatively analyzing the requirements. In case of copyright infringement, please contact the dataset curator (Fabiano Dalpiaz, f.dalpiaz@uu.nl) to discuss the possibility of removal of that dataset [see Zenodo's policies].
The data sets have been originally used to conduct experiments about ambiguity detection with the REVV-Light tool: https://github.com/RELabUU/revv-light
This collection has been originally published in Mendeley data: https://data.mendeley.com/datasets/7zbk8zsd8y/1
The following text provides a description of the datasets, including links to the systems and websites, when available. The datasets are organized by macro-category and then by identifier.
g02-federalspending.txt
(2018) originates from early data in the Federal Spending Transparency project, which pertains to the website that is used to publicly share the spending data for the U.S. government. The website was created because of the Digital Accountability and Transparency Act of 2014 (DATA Act). The specific dataset pertains to a system called DAIMS or Data Broker, which stands for DATA Act Information Model Schema. The sample that was gathered refers to a sub-project related to allowing the government to act as a data broker, thereby providing data to third parties. The data for the Data Broker project is currently not available online, although the backend seems to be hosted on GitHub under a CC0 1.0 Universal license. Current and recent snapshots of federal spending related websites, including many more projects than the one described in the shared collection, can be found here.
g03-loudoun.txt
(2018) is a set of requirements extracted from a document by Loudoun County, Virginia, that describes the to-be user stories and use cases for a land management readiness assessment system called Loudoun County LandMARC. The source document can be found here and it is part of the Electronic Land Management System and EPlan Review Project - RFP RFQ issued in March 2018. More information about the overall LandMARC system and services can be found here.
g04-recycling.txt
(2017) concerns a web application where recycling and waste disposal facilities can be searched and located. The application operates through the visualization of a map that the user can interact with. The dataset was obtained from a GitHub website and it is the basis of a students' project on web site design; the code is available (no license).
g05-openspending.txt
(2018) is about the OpenSpending project, a project of the Open Knowledge Foundation which aims at transparency about how local governments spend money. At the time of the collection, the data was retrieved from a Trello board that is currently unavailable. The sample focuses on publishing, importing and editing datasets, and how the data should be presented. Currently, OpenSpending is managed via a GitHub repository which contains multiple sub-projects with unknown license.
g11-nsf.txt
(2018) refers to a collection of user stories referring to the NSF Site Redesign & Content Discovery project, which originates from a publicly accessible GitHub repository (GPL 2.0 license). In particular, the user stories refer to an early version of the NSF's website. The user stories can be found as closed Issues.
g08-frictionless.txt
(2016) regards the Frictionless Data project, which offers an open source dataset for building data infrastructures, to be used by researchers, data scientists, and data engineers. Links to the many projects within the Frictionless Data project are on GitHub (with a mix of Unlicense and MIT license) and web. The specific set of user stories was collected in 2016 by GitHub user @danfowler and is stored in a Trello board.
g14-datahub.txt
(2013) concerns the open source project DataHub, which is currently developed via a GitHub repository (the code has Apache License 2.0). DataHub is a data discovery platform which has been developed over multiple years. The specific data set is an initial set of user stories, which we can date back to 2013 thanks to a comment therein.
g16-mis.txt
(2015) is a collection of user stories that pertains to a repository for researchers and archivists. The source of the dataset is a public Trello repository. Although the user stories do not have explicit links to projects, it can be inferred that the stories originate from some project related to the library of Duke University.
g17-cask.txt
(2016) refers to the Cask Data Application Platform (CDAP). CDAP is an open source application platform (GitHub, under Apache License 2.0) that can be used to develop applications within the Apache Hadoop ecosystem, an open-source framework which can be used for distributed processing of large datasets. The user stories are extracted from a document that includes requirements regarding dataset management for Cask 4.0, which includes the scenarios, user stories and a design for the implementation of these user stories. The raw data is available in the following environment.
g18-neurohub.txt
(2012) is concerned with the NeuroHub platform, a neuroscience data management, analysis and collaboration platform for researchers in neuroscience to collect, store, and share data with colleagues or with the research community. The user stories were collected at a time when NeuroHub was still a research project sponsored by the UK Joint Information Systems Committee (JISC). For information about the research project from which the requirements were collected, see the following record.
g22-rdadmp.txt
(2018) is a collection of user stories from the Research Data Alliance's working group on DMP Common Standards. Their GitHub repository contains a collection of user stories that were created by asking the community to suggest functionality that should be part of a website that manages data management plans. Each user story is stored as an issue on the GitHub page.
g23-archivesspace.txt
(2012-2013) refers to ArchivesSpace: an open source web application for managing archives information. The application is designed to support core functions in archives administration such as accessioning; description and arrangement of processed materials including analog, hybrid, and born-digital content; management of authorities and rights; and reference service. The application supports collection management through collection management records, tracking of events, and a growing number of administrative reports. ArchivesSpace is open source and its
Which country has the most Facebook users?
There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
Facebook – the most used social media
Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
Facebook usage by device
As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is available not only through mobile browsers: the company has also published several mobile apps through which users can access its products and services. As of the third quarter of 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In our work, we have designed and implemented a novel workflow with several heuristic methods to combine state-of-the-art methods related to CVE fix commits gathering. As a consequence of our improvements, we have been able to gather the largest programming language-independent real-world dataset of CVE vulnerabilities with the associated fix commits.
Our dataset containing 26,617 unique CVEs coming from 6,945 unique GitHub projects is, to the best of our knowledge, by far the biggest CVE vulnerability dataset with fix commits available today. These CVEs are associated with 31,883 unique commits that fixed those vulnerabilities. Compared to prior work, our dataset brings about a 397% increase in CVEs, a 295% increase in covered open-source projects, and a 480% increase in commit fixes.
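The relative increases quoted above can be inverted to estimate the size of the prior dataset they are measured against. The sketch below is only an illustration: it assumes each percentage is a relative increase over a single prior dataset, and the resulting baselines are implied approximations (the percentages are rounded), not figures reported in this description.

```python
# Totals and relative increases as stated in the description above.
reported = {
    "CVEs": (26_617, 397),
    "projects": (6_945, 295),
    "fix commits": (31_883, 480),
}


def implied_baseline(total, increase_pct):
    """Invert total = baseline * (1 + increase_pct / 100)."""
    return total / (1 + increase_pct / 100)


# Implied (approximate) sizes of the prior dataset; not reported values.
baselines = {
    name: implied_baseline(total, pct)
    for name, (total, pct) in reported.items()
}

for name, value in baselines.items():
    print(f"{name}: roughly {value:,.0f} in the prior dataset (implied)")
```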
Our larger dataset thus substantially improves over the current real-world vulnerability datasets and enables further progress in research on vulnerability detection and software security. We used the NVD (nvd.nist.gov) and the GitHub Security Advisory Database as the main sources of our pipeline.
We release to the community a 14GB PostgreSQL database that contains information on CVEs up to January 24, 2024, CWEs of each CVE, files and methods changed by each commit, and repository metadata.
Additionally, patch files related to the fix commits are available as a separate package. Furthermore, we make our dataset collection tool also available to the community.
The `cvedataset-patches.zip` file contains fix patches, and `dump_morefixes_27-03-2024_19_52_58.sql.zip` contains a PostgreSQL dump of fixes, together with several other fields such as CVEs, CWEs, repository metadata, commit data, file changes, methods changed, etc.
The MoreFixes data-storage strategy is based on CVEFixes to store CVE fix commits from open-source repositories, and uses a modified version of Prospector (part of ProjectKB from SAP) as a module to detect the fix commits of a CVE. Our full methodology is presented in the paper titled "MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery", which will be published at the PROMISE conference (2024).
For more information about usage and sample queries, visit the GitHub repository: https://github.com/JafarAkhondali/Morefixes
If you are using this dataset, please be aware that the repositories we mined are under different licenses and that you are responsible for handling any licensing issues. The same applies to CVEFixes.
This product uses the NVD API but is not endorsed or certified by the NVD.
This research was partially supported by the Dutch Research Council (NWO) under the project NWA.1215.18.008 Cyber Security by Integrated Design (C-SIDe).
To restore the dataset, you can use the docker-compose file available at the GitHub repository. Default credentials of the dataset after restoring the dump:
POSTGRES_USER=postgrescvedumper
POSTGRES_DB=postgrescvedumper
POSTGRES_PASSWORD=a42a18537d74c3b7e584c769152c3d
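As a minimal sketch of how these defaults might be used, the snippet below parses the three variables and assembles a PostgreSQL connection URI that can be passed to a client such as `psql`. The host `localhost` and port `5432` are assumptions (a locally running container exposing PostgreSQL's default port) and are not stated in this description; adjust them to match the docker-compose setup actually used.

```python
from urllib.parse import quote

# Default credentials as listed above.
ENV_TEXT = """\
POSTGRES_USER=postgrescvedumper
POSTGRES_DB=postgrescvedumper
POSTGRES_PASSWORD=a42a18537d74c3b7e584c769152c3d
"""


def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def build_dsn(env, host="localhost", port=5432):
    """Build a connection URI; host and port are assumed, not documented."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    database = quote(env["POSTGRES_DB"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


env = parse_env(ENV_TEXT)
dsn = build_dsn(env)
print(dsn)
```

For table names and sample queries, refer to the GitHub repository mentioned above; none are assumed here.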