62 datasets found
  1. A dataset from a survey investigating disciplinary differences in data...

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    bin, csv, pdf, txt
    Updated Jul 12, 2024
    Cite
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477
    Available download formats: txt, pdf, bin, csv
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GENERAL INFORMATION

    Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

    Date of data collection: January to March 2022

    Collection instrument: SurveyMonkey

    Funding: Alfred P. Sloan Foundation


    SHARING/ACCESS INFORMATION

    Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

    Links to publications that cite or use the data:

    Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

    Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data:
    A survey investigating disciplinary differences in data citation.
    Zenodo. https://doi.org/10.5281/zenodo.7555266


    DATA & FILE OVERVIEW

    File List

    • Filename: MDCDatacitationReuse2021Codebookv2.pdf
      Codebook
    • Filename: MDCDataCitationReuse2021surveydatav2.csv
      Dataset format in csv
    • Filename: MDCDataCitationReuse2021surveydatav2.sav
      Dataset format in SPSS
    • Filename: MDCDataCitationReuseSurvey2021QNR.pdf
      Questionnaire

    Additional related data collected that was not included in the current data package: Open ended questions asked to respondents


    METHODOLOGICAL INFORMATION

    Description of methods used for collection/generation of data:

    The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

    The survey received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
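
    As a quick arithmetic check on the figures quoted above (a sketch for readers, not part of the dataset documentation): dividing each of the two completed-response counts by the 3,632 responses received shows which one the quoted 68.6% corresponds to. The variable names are illustrative only.

```python
# Figures quoted in the paragraph above.
responses_received = 3632
completed_as_stated = 2509   # "2,509 of which were completed"
final_complete = 2492        # "final total contains 2,492 complete responses"


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rate_from_stated = pct(completed_as_stated, responses_received)
rate_from_final = pct(final_complete, responses_received)

print(rate_from_stated)  # 69.1
print(rate_from_final)   # 68.6
```

    The quoted completion rate therefore matches the 2,492 final responses. The response rates (1.57% and 1.62%) cannot be recomputed in the same way because the number of invitations sent is not stated here.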

    Methods for processing the data:

    Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

    Instrument- or software-specific information needed to interpret the data:

    The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.


    DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

    Number of variables: 95

    Number of cases/rows: 2,492

    Missing data codes: 999 = Not asked

    Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
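
    A minimal sketch of how the coded CSV might be read so that the documented missing-data code (999, "Not asked") is not mistaken for a real answer. The column names and sample rows below are hypothetical; the real variable names are defined in the codebook.

```python
import csv
import io

MISSING_CODE = "999"  # documented above as "Not asked"

# Hypothetical excerpt standing in for the survey CSV; not real data.
sample_csv = io.StringIO(
    "respondent_id,q1,q2\n"
    "1,3,999\n"
    "2,999,2\n"
)


def load_survey_rows(handle, id_field="respondent_id"):
    """Read coded rows, mapping the missing-data code to None.

    The identifier column is left untouched so that an ID that happens
    to equal 999 is not blanked out.
    """
    rows = []
    for record in csv.DictReader(handle):
        cleaned = {}
        for name, value in record.items():
            if name != id_field and value == MISSING_CODE:
                cleaned[name] = None
            else:
                cleaned[name] = value
        rows.append(cleaned)
    return rows


rows = load_survey_rows(sample_csv)
print(rows)
```

    With the real file, the in-memory sample would be replaced by an opened file handle, and the remaining codes would be decoded using the codebook.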

  2. NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • catalog.data.gov
    • datasets.ai
    • +3 more
    Updated Jun 28, 2025
    + more versions
    Cite
    data.ny.gov (2025). NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015 [Dataset]. https://catalog.data.gov/dataset/nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-aver-2013
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.ny.gov
    Area covered
    New York
    Description

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015. Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).
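
    To illustrate what "producing weighted survey estimates" means for household-level records like these, here is a minimal sketch. The field names and values are hypothetical and do not come from the dataset; only the idea of weighting each record by its household weight is taken from the description above.

```python
# Hypothetical household records; field names and values are illustrative only.
households = [
    {"owner_renter": "Owner", "weight": 120.0},
    {"owner_renter": "Renter", "weight": 80.0},
    {"owner_renter": "Renter", "weight": 200.0},
]


def weighted_share(records, field, value, weight_field="weight"):
    """Estimate the population share with records[field] == value.

    Each record counts in proportion to its household weight rather
    than counting once, which is what distinguishes a weighted
    estimate from a raw sample proportion.
    """
    total = sum(r[weight_field] for r in records)
    matching = sum(r[weight_field] for r in records if r[field] == value)
    return matching / total


renter_share = weighted_share(households, "owner_renter", "Renter")
print(renter_share)
```

    In this toy sample two of three records are renters, but the weighted share is 280/400 because the records represent different numbers of households.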

  3. GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version...

    • zenodo.org
    tiff
    Updated Sep 4, 2024
    + more versions
    Cite
    Luling Liu; Xin Cao; Shijie Li; Na Jie (2024). GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version 2.0-test-alpha) [Dataset]. http://doi.org/10.5281/zenodo.11071249
    Available download formats: tiff
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luling Liu; Xin Cao; Shijie Li; Na Jie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Usage Notice

    This version is not recommended for download. Please check the newest version.

    We would like to inform you that the updated GlobPOP dataset (2021-2022) has been made available in version 2.0. The GlobPOP dataset (2021-2022) in the current version is not recommended for your work. The GlobPOP dataset (1990-2020) in the current version is the same as version 1.0.

    Thank you for your continued support of the GlobPOP.

    If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.

    Introduction

    Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development, such as epidemiology, urban planning, and global inequality.

    Here, we present GlobPOP, a new continuous global gridded population product with a high-precision spatial resolution of 30 arcseconds from 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches and is intended to fuse five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan Population datasets, and WorldPop datasets) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.

    With the availability of GlobPOP dataset in both population count and population density formats, researchers and policymakers can leverage our dataset to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.

    Data description

    The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometer in each grid cell).

    Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:

    GlobPOP_Count_30arc_1990_I32

    Field 1: GlobPOP (Global gridded population)
    Field 2: Pixel unit is population "Count" or population "Density"
    Field 3: Spatial resolution is 30 arc seconds
    Field 4: Year "1990"
    Field 5: Data type is I32 (Int 32) or F32 (Float 32)
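
    The five-field naming convention above can be parsed mechanically. The following is a minimal sketch, not code from the GlobPOP repository; the function and class names are hypothetical, and the validation rules only encode what the field list states.

```python
from typing import NamedTuple


class GlobPopName(NamedTuple):
    product: str      # Field 1, e.g. "GlobPOP"
    unit: str         # Field 2, "Count" or "Density"
    resolution: str   # Field 3, e.g. "30arc"
    year: int         # Field 4, e.g. 1990
    dtype: str        # Field 5, "I32" or "F32"


def parse_globpop_filename(filename: str) -> GlobPopName:
    """Split a GlobPOP filename into its five underscore-separated fields."""
    # Drop a trailing extension if one is present.
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {filename!r}")
    product, unit, resolution, year, dtype = parts
    if unit not in ("Count", "Density"):
        raise ValueError(f"unexpected pixel unit: {unit!r}")
    if dtype not in ("I32", "F32"):
        raise ValueError(f"unexpected data type: {dtype!r}")
    return GlobPopName(product, unit, resolution, int(year), dtype)


example = parse_globpop_filename("GlobPOP_Count_30arc_1990_I32")
print(example)
```

    A parser like this makes it easy to filter a directory listing by year or by format before opening any raster.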

    More information

    Please refer to the paper for detailed information:

    Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.

    The fully reproducible codes are publicly available at GitHub: https://github.com/lulingliu/GlobPOP.

  4. Census of Population, 2006 [Canada]: Special Interest Profiles [B2020]

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 2, 2023
    Cite
    Statistics Canada (2023). Census of Population, 2006 [Canada]: Special Interest Profiles [B2020] [Dataset]. http://doi.org/10.5683/SP3/9TET2T
    Dataset updated
    Nov 2, 2023
    Dataset provided by
    Borealis
    Authors
    Statistics Canada
    License

    https://borealisdata.ca/api/datasets/:persistentId/versions/3.1/customlicense?persistentId=doi:10.5683/SP3/9TET2T

    Area covered
    Canada
    Description

    This new product will present data for specific census topics and population groups according to selected demographic, cultural, and socio-economic characteristics. These detailed 'profile-type' tables expand the analytical depth of basic census information. Special interest profiles include: ethnic groups, Aboriginal peoples, occupation, industry, and place of work.

  5. Data from: Category-Adaptive Variable Screening for Ultra-High Dimensional...

    • tandf.figshare.com
    zip
    Updated Aug 16, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jinhan Xie; Yuanyuan Lin; Xiaodong Yan; Niansheng Tang (2023). Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data [Dataset]. http://doi.org/10.6084/m9.figshare.7819544.v4
    Available download formats: zip
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Jinhan Xie; Yuanyuan Lin; Xiaodong Yan; Niansheng Tang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The populations of interest in modern studies are very often heterogeneous. The population heterogeneity, the qualitative nature of the outcome variable and the high dimensionality of the predictors pose significant challenges in statistical analysis. In this article, we introduce a category-adaptive screening procedure with high-dimensional heterogeneous data, which is to detect category-specific important covariates. The proposal is a model-free approach without any specification of a regression model and an adaptive procedure in the sense that the set of active variables is allowed to vary across different categories, thus making it more flexible to accommodate heterogeneity. For response-selective sampling data, another main discovery of this article is that the proposed method works directly without any modification. Under mild regularity conditions, the new procedure is shown to possess the sure screening and ranking consistency properties. Simulation studies contain supportive evidence that the proposed method performs well under various settings and it is effective to extract category-specific information. Applications are illustrated with two real datasets. Supplementary materials for this article are available online.

  6. Annual Population Survey Two-Year Longitudinal Dataset, January 2018 -...

    • beta.ukdataservice.ac.uk
    • datacatalogue.cessda.eu
    Updated 2024
    + more versions
    Cite
    Social Survey Division Office For National Statistics (2024). Annual Population Survey Two-Year Longitudinal Dataset, January 2018 - December 2019 [Dataset]. http://doi.org/10.5255/ukda-sn-8840-1
    Dataset updated
    2024
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    DataCite (https://www.datacite.org/)
    Authors
    Social Survey Division Office For National Statistics
    Description

    The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost.

    The APS allows for analysis to be carried out on detailed subgroups and below regional level. In recent years (particularly with the sample size of the LFS 5 quarter dataset reducing) there has been some interest in producing a two year APS longitudinal dataset to look at any trends that may occur over a year. The APS Two-Year Longitudinal Datasets, covering 2012/13 onwards, have been deposited as a result of this work. Person- and Household-level APS datasets are also available.

    For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation.

    Occupation data for 2021 and 2022
    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022

  7. Factori USA Consumer Graph Data | socio-demographic, location, interest and...

    • datarade.ai
    .json, .csv
    Updated Jul 23, 2022
    + more versions
    Cite
    Factori (2022). Factori USA Consumer Graph Data | socio-demographic, location, interest and intent data | E-Commere |Mobile Apps | Online Services [Dataset]. https://datarade.ai/data-products/factori-usa-consumer-graph-data-socio-demographic-location-factori
    Available download formats: .json, .csv
    Dataset updated
    Jul 23, 2022
    Dataset authored and provided by
    Factori
    Area covered
    United States of America
    Description

    Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.

    Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.

    1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
    2. Demographics - Gender, Age Group, Marital Status, Language etc.
    3. Financial - Income Range, Credit Rating Range, Credit Type, Net worth Range, etc
    4. Persona - Consumer type, Communication preferences, Family type, etc
    5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle etc.
    6. Household - Number of Children, Number of Adults, IP Address, etc.
    7. Behaviours - Brand Affinity, App Usage, Web Browsing etc.
    8. Firmographics - Industry, Company, Occupation, Revenue, etc
    9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price etc.
    10. Auto - Car Make, Model, Type, Year, etc.
    11. Housing - Home type, Home value, Renter/Owner, Year Built etc.

    Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).

    Consumer Graph Use Cases:

    360-Degree Customer View: Get a comprehensive image of customers by the means of internal and external data aggregation.

    Data Enrichment: Leverage Online to offline consumer profiles to build holistic audience segments to improve campaign targeting using user data enrichment.

    Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.

    Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.

    Using Factori Consumer Data graph you can solve use cases like:

    Acquisition Marketing: Expand your reach to new users and customers using lookalike modeling with your first party audiences to extend to other potential consumers with similar traits and attributes.

    Lookalike Modeling: Build lookalike audience segments using your first party audiences as a seed to extend your reach for running marketing campaigns to acquire new users or customers.

    And also: CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, Consumer Behaviour Data.

  8. Data for: Improved Population Mapping for China Using the 3D Building,...

    • data.mendeley.com
    Updated Sep 4, 2024
    + more versions
    Cite
    Zhen Lei (2024). Data for: Improved Population Mapping for China Using the 3D Building, Nighttime Light, Points-of-interest, and Land Use/Cover Data Within a Multiscale Geographically Weighted Regression Model [Dataset]. http://doi.org/10.17632/hwz54s535n.1
    Dataset updated
    Sep 4, 2024
    Authors
    Zhen Lei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Auxiliary Data.gdb:
    • Land_use: original land use data
    • POI_name: point-of-interest data from the Amap platform (name indicates category)

    New_gridded_population_dataset(.gdb): experimental result data, i.e., a gridded population map of mainland China with a resolution of 100 meters

    New_minus_WorldPop_PopulationResidual(.gdb): pixel-level residuals of the new gridded population dataset relative to the WorldPop dataset

    POI_Correlation_Coefficient:
    • Zonal statistical output of POI kernel density values: summary of various POI kernel densities in residential areas of administrative units
    • Summary of POI Pearson correlation coefficients: sum of Pearson's correlation coefficients for 13 types of POIs at a certain bandwidth

    PopulationData_AdministrativeUnitLevel.gdb:
    • Population_data_mainlandChina_level3: population data at the district and county level in mainland China
    • Population_data_Name_level4_Table: township and street-level population data for provinces and municipalities

    Note: Due to the storage space limitation, 3D building, nighttime light, and WorldPop datasets have not been uploaded. To access these publicly available data, please visit the official website via the "Related links" at the bottom. In addition, we are not authorized to share data for the fourth level of administrative boundaries, so we only share the corresponding population data in tabular form.

  9. American Community Survey

    • gstore.unm.edu
    csv, geojson, gml +5
    Updated Mar 6, 2020
    + more versions
    Cite
    Earth Data Analysis Center (2020). American Community Survey [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/adecfea6-fcd7-4c41-8165-165c4490a9da/metadata/FGDC-STD-001-1998.html
    Available download formats: kml(5), csv(5), xls(5), json(5), geojson(5), zip(5), gml(5), shp(5)
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Earth Data Analysis Center
    Time period covered
    2018
    Area covered
    New Mexico (West Bounding Coordinate -109.050173, East Bounding Coordinate -103.001964, North Bounding Coordinate 37.000293, South Bounding Coordinate 31.332172)
    Description

    A broad and generalized selection of 2014-2018 US Census Bureau 2018 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website.

    This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous.

    For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
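
    The reserved 2010 tract code ranges described above can be expressed as a small lookup. This is a minimal sketch: the function name and category labels are hypothetical, and it assumes the four-digit base tract code has already been extracted as an integer.

```python
def classify_tract_code(code: int) -> str:
    """Map a four-digit 2010 census tract base code to its reserved category.

    The ranges follow the description above; the returned labels are
    illustrative names, not official Census Bureau terms.
    """
    if 9400 <= code <= 9499:
        return "american_indian_area"
    if 9800 <= code <= 9899:
        return "special_land_use"
    if 9900 <= code <= 9998:
        return "water_only"
    return "standard"


for sample_code in (101, 9405, 9801, 9950):
    print(sample_code, classify_tract_code(sample_code))
```

    Flagging these tracts first can be useful because special land use and water-only tracts have little or no population and may distort per-tract summaries.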

  10. World Population Statistics - 2023

    • kaggle.com
    Updated Jan 9, 2024
    Cite
    Bhavik Jikadara (2024). World Population Statistics - 2023 [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/world-population-statistics-2023
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bhavik Jikadara
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description
    • The current US Census Bureau world population estimate in June 2019 shows that the current global population is 7,577,130,400 people on Earth, which far exceeds the world population of 7.2 billion in 2015. Our estimate based on UN data shows the world's population surpassing 7.7 billion.
    • China is the most populous country in the world with a population exceeding 1.4 billion. It is one of just two countries with a population of more than 1 billion, with India being the second. As of 2018, India has a population of over 1.355 billion people, and its population growth is expected to continue through at least 2050. By the year 2030, India is expected to become the most populous country in the world. This is because India’s population will grow, while China is projected to see a loss in population.
    • The following 11 countries that are the most populous in the world each have populations exceeding 100 million. These include the United States, Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, Russia, Mexico, Japan, Ethiopia, and the Philippines. Of these nations, all are expected to continue to grow except Russia and Japan, which will see their populations drop by 2030 before falling again significantly by 2050.
    • Many other nations have populations of at least one million, while there are also countries that have just thousands. The smallest population in the world can be found in Vatican City, where only 801 people reside.
    • In 2018, the world’s population growth rate was 1.12%. Every five years since the 1970s, the population growth rate has continued to fall. The world’s population is expected to continue to grow larger but at a much slower pace. By 2030, the population will exceed 8 billion. In 2040, this number will grow to more than 9 billion. In 2055, the number will rise to over 10 billion, and another billion people won’t be added until near the end of the century. The current annual population growth estimates from the United Nations are in the millions - estimating that over 80 million new lives are added yearly.
    • This population growth will be significantly impacted by nine specific countries which are situated to contribute to the population growth more quickly than other nations. These nations include the Democratic Republic of the Congo, Ethiopia, India, Indonesia, Nigeria, Pakistan, Uganda, the United Republic of Tanzania, and the United States of America. Particularly of interest, India is on track to overtake China's position as the most populous country by 2030. Additionally, multiple nations within Africa are expected to double their populations before fertility rates begin to slow entirely.

    Content

    • In this Dataset, we have Historical Population data for every Country/Territory in the world by different parameters like Area Size of the Country/Territory, Name of the Continent, Name of the Capital, Density, Population Growth Rate, Ranking based on Population, World Population Percentage, etc.

    Dataset Glossary (Column-Wise):
    • Rank: Rank by Population.
    • CCA3: 3 Digit Country/Territories Code.
    • Country/Territories: Name of the Country/Territories.
    • Capital: Name of the Capital.
    • Continent: Name of the Continent.
    • 2022 Population: Population of the Country/Territories in the year 2022.
    • 2020 Population: Population of the Country/Territories in the year 2020.
    • 2015 Population: Population of the Country/Territories in the year 2015.
    • 2010 Population: Population of the Country/Territories in the year 2010.
    • 2000 Population: Population of the Country/Territories in the year 2000.
    • 1990 Population: Population of the Country/Territories in the year 1990.
    • 1980 Population: Population of the Country/Territories in the year 1980.
    • 1970 Population: Population of the Country/Territories in the year 1970.
    • Area (km²): Area size of the Country/Territories in square kilometers.
    • Density (per km²): Population Density per square kilometer.
    • Growth Rate: Population Growth Rate by Country/Territories.
    • World Population Percentage: The population percentage by each Country/Territories.
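As a rough illustration of how the columns in the glossary relate to each other, the sketch below recomputes density from `2022 Population` and `Area (km²)` and derives the percentage change between 2020 and 2022. The two rows are made-up placeholders, not values from the dataset, and the exact header spellings are assumed from the glossary. The change figure is a simple two-year difference and is not claimed to match how the dataset's own `Growth Rate` column was produced.

```python
import csv
import io

# Placeholder rows for illustration only; header names are assumed from the glossary.
SAMPLE = """Country/Territories,2022 Population,2020 Population,Area (km²)
Exampleland,1200000,1150000,60000
Samplestan,450000,460000,9000
"""


def summarize(csv_text):
    """Return density (people per km²) and 2020-2022 change (percent) per row."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        pop_2022 = int(rec["2022 Population"])
        pop_2020 = int(rec["2020 Population"])
        area = float(rec["Area (km²)"])
        rows.append({
            "country": rec["Country/Territories"],
            "density": pop_2022 / area,
            "change_pct": (pop_2022 - pop_2020) / pop_2020 * 100,
        })
    return rows


for row in summarize(SAMPLE):
    print(f'{row["country"]}: {row["density"]:.1f} per km², '
          f'{row["change_pct"]:+.2f}% since 2020')
```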
  11. Data from: Population Health data collection for the City of Greater Bendigo...

    • opal.latrobe.edu.au
    • researchdata.edu.au
    xlsx
    Updated Mar 7, 2024
    Cite
    Sandra Leggat; Stephen Begg; Charles Ambrose; Greg D'Arcy (2024). Population Health data collection for the City of Greater Bendigo [Dataset]. http://doi.org/10.4225/22/55BAE9DBD9670
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    La Trobe
    Authors
    Sandra Leggat; Stephen Begg; Charles Ambrose; Greg D'Arcy
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Greater Bendigo City
    Description

    This data collection contains de-identified clinical health service utilisation data from Bendigo Health and the General Practitioners Practices associated with the Loddon Mallee Murray Medicare Local. The collection also includes associated population health data from the ABS, AIHW and the Municipal Health Plans. Health researchers have a major interest in how clinical data can be used to monitor population health and health care in rural and regional Australia through analysing a broad range of factors shown to impact the health of different populations. The Population Health data collection provides students, managers, clinicians and researchers the opportunity to use clinical data in the study of population health, including the analysis of health risk factors, disease trends and health care utilisation and outcomes.

    Temporal range (data time period): 2004 to 2014

    Spatial coverage: Bendigo Latitude -36.7587112, Bendigo Longitude 144.2837459

  12. Population by County 2017

    • gstore.unm.edu
    Updated Mar 6, 2020
    Cite
    (2020). Population by County 2017 [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/cd10009e-a79f-4de5-a12c-87bb5b499e9f/metadata/ISO-19115:2003.html
    Explore at:
    Dataset updated
    Mar 6, 2020
    Time period covered
    2017
    Area covered
    West Bound -109.05017 East Bound -103.00196 North Bound 37.000293 South Bound 31.33217
    Description

    A broad and generalized selection of 2013-2017 US Census Bureau 2017 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico counties). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. As in the decennial census, strict confidentiality laws protect all information that could be used to identify individuals or households.

    The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. The primary advantage of using multiyear estimates is the increased statistical reliability of the data for less populated areas and small population subgroups.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. While each full Data Profile contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website. This dataset is organized by New Mexico county boundaries.

  13. Individuals, ZIP Code Data

    • catalog.data.gov
    • gimi9.com
    Updated Aug 22, 2024
    Cite
    Statistics of Income (SOI) (2024). Individuals, ZIP Code Data [Dataset]. https://catalog.data.gov/dataset/zip-code-data
    Explore at:
    Dataset updated
    Aug 22, 2024
    Dataset provided by
    Statistics of Income (SOI)
    Description

    This annual study provides selected income and tax items classified by State, ZIP Code, and the size of adjusted gross income. These data include the number of returns, which approximates the number of households; the number of personal exemptions, which approximates the population; adjusted gross income; wages and salaries; dividends before exclusion; and interest received. Data are based on U.S. Individual Income Tax Returns (Forms 1040) filed with the IRS. SOI collects these data as part of its Individual Income Tax Return (Form 1040) Statistics program, Data by Geographic Areas, ZIP Code Data.

  14. Current Population Survey

    • data.niaid.nih.gov
    • dataverse.harvard.edu
    Updated May 31, 2011
    Cite
    (2011). Current Population Survey [Dataset]. http://doi.org/10.7910/DVN/35IUVQ
    Explore at:
    Dataset updated
    May 31, 2011
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Users can download data or view data tables on topics related to the labor force of the United States.

    Background

    The Current Population Survey (CPS) is a joint effort between the Bureau of Labor Statistics and the Census Bureau. It provides information and data on the labor force of the United States, such as: employment, unemployment, earnings, hours of work, school enrollment, health, employee benefits and income. The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

    User Functionality

    Users can download data sets or view data tables on their topic of interest. Data can be organized by a variety of demographic variables, including: sex, age, race, marital status and educational attainment. Data is available on a national or state level.

    Data Notes

    The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

  15. Population Estimates for Northern Ireland - Dataset - Open Data NI

    • admin.opendatani.gov.uk
    Updated Jan 17, 2018
    Cite
    (2018). Population Estimates for Northern Ireland - Dataset - Open Data NI [Dataset]. https://admin.opendatani.gov.uk/dataset/population-estimates-for-northern-ireland
    Explore at:
    Dataset updated
    Jan 17, 2018
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    Ireland, Northern Ireland
    Description

    Population estimates relate to the population as of 30th June each year, and therefore are often referred to as mid-year estimates. They are used to allocate public funds to the Northern Ireland Executive through the Barnett formula and are widely used by Northern Ireland government departments for the planning of services, such as health and education. These statistics are also of interest to those involved in research and academia. They are widely used to express other statistics as a rate, and thus enable comparisons across the United Kingdom and other countries. Furthermore, population estimates form the basis for future population statistics such as population projections.

  16. American Community Survey

    • gstore.unm.edu
    csv, geojson, gml +5
    Updated Mar 6, 2020
    Cite
    Earth Data Analysis Center (2020). American Community Survey [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/92f102fa-5d6c-41b6-8cf9-132f78a30e02/metadata/FGDC-STD-001-1998.html
    Explore at:
    Available download formats: csv(5), zip(5), json(5), gml(5), geojson(5), xls(5), shp(5), kml(5)
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Earth Data Analysis Center
    Time period covered
    2017
    Area covered
    New Mexico, West Bounding Coordinate -109.050173 East Bounding Coordinate -103.001964 North Bounding Coordinate 37.000293 South Bounding Coordinate 31.332172
    Description

    A broad and generalized selection of 2013-2017 US Census Bureau 2017 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website.

    This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous.

    For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
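The 2010 tract code ranges described above lend themselves to a small lookup. The following is a minimal sketch, assuming the four-digit basic tract code has already been extracted as an integer; the function name and the returned labels are shorthand invented for this sketch, not Census Bureau terminology.

```python
def classify_tract_code(code):
    """Map a 2010 Census basic tract code to the special ranges in the description.

    `code` is assumed to be the four-digit basic tract code as an integer.
    The labels returned here are illustrative shorthand only.
    """
    if 9400 <= code <= 9499:
        return "American Indian area"
    if 9800 <= code <= 9899:
        return "special land use"
    if 9900 <= code <= 9998:
        return "water only"
    return "standard"


for example in (101, 9400, 9850, 9998):
    print(example, classify_tract_code(example))
```

Codes outside the three reserved ranges, including 9999, fall through to the default label because the description only defines ranges up to 9998.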

  17. ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • analyst-2.ai
    Updated Feb 12, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-average-for-2013-2015-0724/f3a01d19/?iid=020-481&v=presentation
    Explore at:
    Dataset updated
    Feb 12, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York
    Description

    Analysis of ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/8bd0ae94-40d3-4c9b-8a6b-de032e07929f on 12 February 2022.

    --- Dataset description provided by original source is as follows ---

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

    Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).

    --- Original source retains full ownership of the source dataset ---
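The NYSERDA description notes that each row is a household record carrying a household weight, and that users are expected to produce weighted estimates. A minimal sketch of a weighted proportion follows, assuming hypothetical field names (`owner_renter`, `household_weight`) and placeholder records; the actual column names and codes should be taken from the dataset's own documentation. It covers the point estimate only, and standard errors are out of scope.

```python
def weighted_share(records, predicate, weight_key="household_weight"):
    """Weighted proportion of records for which `predicate` is true."""
    total = sum(r[weight_key] for r in records)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    matching = sum(r[weight_key] for r in records if predicate(r))
    return matching / total


# Placeholder records: field names and values are illustrative only.
households = [
    {"owner_renter": "owner", "household_weight": 120.0},
    {"owner_renter": "renter", "household_weight": 80.0},
    {"owner_renter": "owner", "household_weight": 200.0},
]

share = weighted_share(households, lambda r: r["owner_renter"] == "owner")
print(f"Weighted owner share: {share:.2f}")
```

With these placeholder weights the weighted share differs from the unweighted count of two owners out of three records, which is the reason the weight column matters.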

  18. HHS COVID-19 Small Area Estimations Survey - Primary Vaccine Series - Wave...

    • catalog.data.gov
    • healthdata.gov
    • +2more
    Updated Jul 4, 2025
    Cite
    U.S. Department of Health and Human Services (2025). HHS COVID-19 Small Area Estimations Survey - Primary Vaccine Series - Wave 25 [Dataset]. https://catalog.data.gov/dataset/hhs-covid-19-small-area-estimations-survey-primary-vaccine-series-wave-25
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    United States Department of Health and Human Services http://www.hhs.gov/
    Description

    The goal of the Monthly Outcome Survey (MOS) Small Area Estimations (SAE) is to generate estimates of the proportions of adults, by county and month, who were in the population of interest for the U.S. Department of Health and Human Services’ (HHS) We Can Do This COVID-19 Public Education Campaign. These data are designed to be used by practitioners and researchers to understand how county-level COVID-19 vaccination hesitancy changed over time in the United States.

  19. National Population Database

    • data.wu.ac.at
    • gimi9.com
    wms
    Updated Apr 20, 2018
    Cite
    Health and Safety Laboratory (2018). National Population Database [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/NzJkOGJmNjMtN2NjMi00OGI2LThkOTctYTg1ZDQ4MmJmMjlj
    Explore at:
    Available download formats: wms
    Dataset updated
    Apr 20, 2018
    Dataset provided by
    Health and Safety Laboratory
    Description

    The National Population Database (NPD) is a point-based Geographical Information System (GIS) dataset that combines locational information from providers like the Ordnance Survey with population information about those locations, mainly sourced from Government statistics. The points (and sometimes polygons) represent individual buildings, so the NPD allows detailed local analysis for anywhere in Great Britain.

    The Health & Safety Laboratory (HSL) working with Staffordshire University originally created the NPD in 2004 to help its parent organisation, the Health and Safety Executive (HSE), assess the risks to society of major hazard sites e.g. oil refineries, chemical works and gas holders. Of particular interest to HSE were 'sensitive' populations e.g. schools and hospitals where the people at those locations may be more vulnerable to harm and potentially harder to evacuate in an emergency. The data is split into 5 themes: residential, sensitive populations, transport, workplaces and leisure.

    More information about the NPD can be found here:

    https://www.hsl.gov.uk/what-we-do/better-decisions/geoanalytics/national-population-database

    The NPD was created using various datasets available within Government as part of the Public Sector Mapping Agreement (PSMA) and contains other intellectual property so is only available under license and for a fee. Please contact the HSL GIS Team if you would like to discuss gaining access to the sample or full dataset.

  20. World Gender Statistics

    • kaggle.com
    zip
    Updated Nov 28, 2016
    Cite
    World Bank (2016). World Gender Statistics [Dataset]. https://www.kaggle.com/datasets/theworldbank/world-gender-statistics/versions/1
    Explore at:
    Available download formats: zip(0 bytes)
    Dataset updated
    Nov 28, 2016
    Dataset authored and provided by
    World Bank http://worldbank.org/
    Area covered
    World
    Description

    The Gender Statistics database is a comprehensive source for the latest sex-disaggregated data and gender statistics covering demography, education, health, access to economic opportunities, public life and decision-making, and agency.

    The Data

    The data is split into several files, with the main one being Data.csv. The Data.csv contains all the variables of interest in this dataset, while the others are lists of references and general nation-by-nation information.

    Data.csv contains the following fields:

    Data.csv

    • Country.Name: the name of the country
    • Country.Code: the country's code
    • Indicator.Name: the name of the variable that this row represents
    • Indicator.Code: a unique id for the variable
    • 1960 - 2016: one column EACH for the value of the variable in each year it was available
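Because Data.csv stores one column per year, analysis often starts by reshaping it to one row per country, indicator, and year. The sketch below shows that reshaping with the standard library, assuming the year columns are headed by plain four-digit years (the real file may label them differently) and using placeholder rows rather than actual World Bank values.

```python
import csv
import io

ID_COLUMNS = ["Country.Name", "Country.Code", "Indicator.Name", "Indicator.Code"]

# Placeholder rows for illustration; not real World Bank data.
SAMPLE = """Country.Name,Country.Code,Indicator.Name,Indicator.Code,1960,1961
Exampleland,EXL,Hypothetical indicator,HYP.IND,1.5,
Samplestan,SMP,Hypothetical indicator,HYP.IND,2.0,2.5
"""


def to_long(csv_text):
    """Turn one-column-per-year rows into one record per (row, year) with a value."""
    records = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        ids = {key: rec[key] for key in ID_COLUMNS}
        for column, value in rec.items():
            if column in ID_COLUMNS or not value:
                continue  # skip identifier columns and years with no value
            records.append({**ids, "year": int(column), "value": float(value)})
    return records


for record in to_long(SAMPLE):
    print(record["Country.Code"], record["year"], record["value"])
```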

    The other files

    I couldn't find any metadata for these, and I'm not qualified to guess at what each of the variables mean. I'll list the variables for each file, and if anyone has any suggestions (or, even better, actual knowledge/citations) as to what they mean, please leave a note in the comments and I'll add your info to the data description.

    Country-Series.csv

    • CountryCode
    • SeriesCode
    • DESCRIPTION

    Country.csv

    • Country.Code
    • Short.Name
    • Table.Name
    • Long.Name
    • 2-alpha.code
    • Currency.Unit
    • Special.Notes
    • Region
    • Income.Group
    • WB-2.code
    • National.accounts.base.year
    • National.accounts.reference.year
    • SNA.price.valuation
    • Lending.category
    • Other.groups
    • System.of.National.Accounts
    • Alternative.conversion.factor
    • PPP.survey.year
    • Balance.of.Payments.Manual.in.use
    • External.debt.Reporting.status
    • System.of.trade
    • Government.Accounting.concept
    • IMF.data.dissemination.standard
    • Latest.population.census
    • Latest.household.survey
    • Source.of.most.recent.Income.and.expenditure.data
    • Vital.registration.complete
    • Latest.agricultural.census
    • Latest.industrial.data
    • Latest.trade.data
    • Latest.water.withdrawal.data

    FootNote.csv

    • CountryCode
    • SeriesCode
    • Year
    • DESCRIPTION

    Series-Time.csv

    • SeriesCode
    • Year
    • DESCRIPTION

    Series.csv

    • Series.Code
    • Topic
    • Indicator.Name
    • Short.definition
    • Long.definition
    • Unit.of.measure
    • Periodicity
    • Base.Period
    • Other.notes
    • Aggregation.method
    • Limitations.and.exceptions
    • Notes.from.original.source
    • General.comments
    • Source
    • Statistical.concept.and.methodology
    • Development.relevance
    • Related.source.links
    • Other.web.links
    • Related.indicators
    • License.Type

    Acknowledgements

    This dataset was downloaded from The World Bank's Open Data project. The summary of the Terms of Use of this data is as follows:

    • You are free to copy, distribute, adapt, display or include the data in other products for commercial and noncommercial purposes at no cost subject to certain limitations summarized below.

    • You must include attribution for the data you use in the manner indicated in the metadata included with the data.

    • You must not claim or imply that The World Bank endorses your use of the data, or use The World Bank’s logo(s) or trademark(s) in conjunction with such use.

    • Other parties may have ownership interests in some of the materials contained on The World Bank Web site. For example, we maintain a list of some specific data within the Datasets that you may not redistribute or reuse without first contacting the original content provider, as well as information regarding how to contact the original content provider. Before incorporating any data in other products, please check the list: Terms of use: Restricted Data.

    -- [ed. note: this last is not applicable to the Gender Statistics database]

    • The World Bank makes no warranties with respect to the data and you agree The World Bank shall not be liable to you in connection with your use of the data.

    • This is only a summary of the Terms of Use for Datasets Listed in The World Bank Data Catalogue. Please read the actual agreement that controls your use of the Datasets, which is available here: Terms of use for datasets. Also see World Bank Terms and Conditions.

Cite
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477

A dataset from a survey investigating disciplinary differences in data citation

Explore at:
Available download formats: txt, pdf, bin, csv
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

GENERAL INFORMATION

Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

Date of data collection: January to March 2022

Collection instrument: SurveyMonkey

Funding: Alfred P. Sloan Foundation


SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

Links to publications that cite or use the data:

Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266


DATA & FILE OVERVIEW

File List

  • Filename: MDCDatacitationReuse2021Codebookv2.pdf
    Codebook
  • Filename: MDCDataCitationReuse2021surveydatav2.csv
    Dataset format in csv
  • Filename: MDCDataCitationReuse2021surveydatav2.sav
    Dataset format in SPSS
  • Filename: MDCDataCitationReuseSurvey2021QNR.pdf
    Questionnaire

Additional related data collected but not included in the current data package: responses to open-ended questions asked of respondents


METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data:

The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

We received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
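The completion-rate arithmetic in the paragraph above can be checked with a few lines. This is only a sketch using the counts stated there; the stated 68.6% corresponds to the 2,492 responses in the final dataset divided by the 3,632 received, while dividing the 2,509 figure instead gives roughly 69.1%. The number of invitations behind the 1.57% and 1.62% response rates is not given here, so those rates are not recomputed.

```python
def rate(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return numerator / denominator * 100


responses_received = 3632  # stated in the text
completed_stated = 2509    # stated as "completed"
final_complete = 2492      # stated as the final total of complete responses

print(f"final / received:     {rate(final_complete, responses_received):.1f}%")
print(f"completed / received: {rate(completed_stated, responses_received):.1f}%")
```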

Methods for processing the data:

Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

Instrument- or software-specific information needed to interpret the data:

The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.


DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

Number of variables: 95

Number of cases/rows: 2,492

Missing data codes: 999 Not asked

Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
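Since 999 is the code for "Not asked", it should be treated as missing rather than as a numeric answer when the coded CSV is read. The sketch below shows one way to do that with the standard library, assuming hypothetical column names (`respondent_id`, `q1`, `q2`) and placeholder rows; the real variable names are defined in the Codebook.

```python
import csv
import io

NOT_ASKED = "999"  # missing data code stated above

# Placeholder rows; column names are hypothetical, not taken from the Codebook.
SAMPLE = """respondent_id,q1,q2
1,3,999
2,999,5
"""


def load_responses(csv_text, id_columns=("respondent_id",)):
    """Read coded responses, replacing the 'Not asked' code with None.

    Columns listed in `id_columns` are left untouched so that an identifier
    that happens to equal 999 is not mistaken for a missing answer.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            key: value if key in id_columns or value != NOT_ASKED else None
            for key, value in rec.items()
        })
    return rows


for row in load_responses(SAMPLE):
    print(row)
```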
