57 datasets found
  1. A dataset from a survey investigating disciplinary differences in data...

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, pdf, txt
    Updated Jul 12, 2024
    Cite
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477
    Available download formats
    txt, pdf, bin, csv
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GENERAL INFORMATION

    Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

    Date of data collection: January to March 2022

    Collection instrument: SurveyMonkey

    Funding: Alfred P. Sloan Foundation


    SHARING/ACCESS INFORMATION

    Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

    Links to publications that cite or use the data:

    Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

    Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266


    DATA & FILE OVERVIEW

    File List

    • Filename: MDCDatacitationReuse2021Codebookv2.pdf
      Codebook
    • Filename: MDCDataCitationReuse2021surveydatav2.csv
      Dataset format in csv
    • Filename: MDCDataCitationReuse2021surveydatav2.sav
      Dataset format in SPSS
    • Filename: MDCDataCitationReuseSurvey2021QNR.pdf
      Questionnaire

    Additional related data collected but not included in the current data package: open-ended questions asked of respondents


    METHODOLOGICAL INFORMATION

    Description of methods used for collection/generation of data:

    The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

    The survey received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final dataset contains 2,492 complete responses, an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
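
    The rates quoted in this paragraph can be re-derived from the stated counts. The following is a minimal sketch, not part of the dataset documentation; the helper name `pct` is illustrative, and the number of invitations is an assumption inferred from the 1.57% figure rather than a value reported in the description. Note that 68.6% matches 2,492 of 3,632 responses (2,509 of 3,632 would be 69.1%).

```python
def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

responses = 3632        # responses received (from the description)
final_complete = 2492   # complete responses in the final dataset
excluded = 5201         # invalid emails, bounced emails and opt-outs

completion_rate = pct(final_complete, responses)

# ASSUMPTION: the invitation count is not stated; it is inferred from the 1.57% rate.
implied_invitations = round(final_complete / 0.0157)

# Corrected response rate after removing unreachable or opted-out addresses.
corrected_rate = round(100 * final_complete / (implied_invitations - excluded), 2)
```

    Under that inferred invitation count, the corrected rate rounds to the 1.62% reported in the description.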

    Methods for processing the data:

    Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

    Instrument- or software-specific information needed to interpret the data:

    The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.


    DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

    Number of variables: 95

    Number of cases/rows: 2,492

    Missing data codes: 999 Not asked

    Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
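
    When reading the coded CSV, the 999 "Not asked" code should be treated as missing rather than as a numeric answer. A minimal sketch using only the Python standard library follows; the column names `resp_id` and `q5` are hypothetical placeholders, since the real variable names are defined in the codebook.

```python
import csv
import io

MISSING_CODE = "999"  # "Not asked", per the missing data codes listed above

def load_rows(csv_text: str) -> list[dict]:
    """Parse coded survey rows, mapping the 999 code to None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: (None if value == MISSING_CODE else value) for name, value in row.items()}
        for row in reader
    ]

# Hypothetical two-column excerpt, not taken from the real file.
sample = "resp_id,q5\n1,3\n2,999\n"
rows = load_rows(sample)
```

    Mapping the code to `None` at load time keeps 999 from being counted as a response category or averaged into ordinal scores.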

  2. NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • catalog.data.gov
    • datasets.ai
    • +3 more
    Updated Jun 28, 2025
    + more versions
    Cite
    data.ny.gov (2025). NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015 [Dataset]. https://catalog.data.gov/dataset/nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-aver-2013
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.ny.gov
    Area covered
    New York
    Description

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

    Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).
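
    Because each row is one responding household with a household weight, population shares have to be computed from the weights rather than from row counts. The sketch below illustrates the idea in plain Python; the field names `lmi` and `weight` and the three sample records are hypothetical and do not come from the dataset.

```python
def weighted_share(records, flag, weight="weight"):
    """Weighted proportion of households for which record[flag] is true."""
    total = sum(r[weight] for r in records)
    hits = sum(r[weight] for r in records if r[flag])
    return hits / total

# Hypothetical records: each household carries a survey weight.
households = [
    {"lmi": True, "weight": 120.0},
    {"lmi": False, "weight": 80.0},
    {"lmi": True, "weight": 50.0},
]

share = weighted_share(households, "lmi")
```

    In this toy example the unweighted share is 2 of 3 rows, while the weighted share is 170 of 250, which is why the description directs users toward software that supports weighted survey estimates.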

  3. Factori USA Consumer Graph Data | socio-demographic, location, interest and...

    • datarade.ai
    .json, .csv
    Updated Jul 23, 2022
    + more versions
    Cite
    Factori (2022). Factori USA Consumer Graph Data | socio-demographic, location, interest and intent data | E-Commere |Mobile Apps | Online Services [Dataset]. https://datarade.ai/data-products/factori-usa-consumer-graph-data-socio-demographic-location-factori
    Available download formats
    .json, .csv
    Dataset updated
    Jul 23, 2022
    Dataset authored and provided by
    Factori
    Area covered
    United States of America
    Description

    Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.

    Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.

    1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
    2. Demographics - Gender, Age Group, Marital Status, Language, etc.
    3. Financial - Income Range, Credit Rating Range, Credit Type, Net Worth Range, etc.
    4. Persona - Consumer Type, Communication Preferences, Family Type, etc.
    5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle, etc.
    6. Household - Number of Children, Number of Adults, IP Address, etc.
    7. Behaviours - Brand Affinity, App Usage, Web Browsing, etc.
    8. Firmographics - Industry, Company, Occupation, Revenue, etc.
    9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price, etc.
    10. Auto - Car Make, Model, Type, Year, etc.
    11. Housing - Home Type, Home Value, Renter/Owner, Year Built, etc.

    Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).

    Consumer Graph Use Cases:

    360-Degree Customer View: Get a comprehensive image of customers by means of internal and external data aggregation.

    Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.

    Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.

    Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.

    Using Factori Consumer Data graph you can solve use cases like:

    Acquisition Marketing: Expand your reach to new users and customers using lookalike modeling with your first party audiences to extend to other potential consumers with similar traits and attributes.

    Lookalike Modeling: Build lookalike audience segments using your first party audiences as a seed to extend your reach for running marketing campaigns to acquire new users or customers.

    And also: CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, Consumer Behaviour Data

  4. Data from: Category-Adaptive Variable Screening for Ultra-High Dimensional...

    • tandf.figshare.com
    zip
    Updated Aug 16, 2023
    + more versions
    Cite
    Jinhan Xie; Yuanyuan Lin; Xiaodong Yan; Niansheng Tang (2023). Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data [Dataset]. http://doi.org/10.6084/m9.figshare.7819544.v4
    Available download formats
    zip
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Jinhan Xie; Yuanyuan Lin; Xiaodong Yan; Niansheng Tang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The populations of interest in modern studies are very often heterogeneous. The population heterogeneity, the qualitative nature of the outcome variable and the high dimensionality of the predictors pose significant challenges in statistical analysis. In this article, we introduce a category-adaptive screening procedure for high-dimensional heterogeneous data, which detects category-specific important covariates. The proposal is a model-free approach without any specification of a regression model and an adaptive procedure in the sense that the set of active variables is allowed to vary across different categories, thus making it more flexible to accommodate heterogeneity. For response-selective sampling data, another main discovery of this article is that the proposed method works directly without any modification. Under mild regularity conditions, the new procedure is shown to possess the sure screening and ranking consistency properties. Simulation studies provide supportive evidence that the proposed method performs well under various settings and is effective at extracting category-specific information. Applications are illustrated with two real datasets. Supplementary materials for this article are available online.

  5. Census of Population, 2006 [Canada]: Special Interest Profiles [B2020]

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 2, 2023
    Cite
    Statistics Canada (2023). Census of Population, 2006 [Canada]: Special Interest Profiles [B2020] [Dataset]. http://doi.org/10.5683/SP3/9TET2T
    Dataset updated
    Nov 2, 2023
    Dataset provided by
    Borealis
    Authors
    Statistics Canada
    License

    https://borealisdata.ca/api/datasets/:persistentId/versions/3.1/customlicense?persistentId=doi:10.5683/SP3/9TET2T

    Area covered
    Canada
    Description

    This new product will present data for specific census topics and population groups according to selected demographic, cultural, and socio-economic characteristics. These detailed 'profile-type' tables expand the analytical depth of basic census information. Special interest profiles include: ethnic groups, Aboriginal peoples, occupation, industry, and place of work.

  6. Data for: Improved Population Mapping for China Using the 3D Building,...

    • data.mendeley.com
    Updated Sep 4, 2024
    + more versions
    Cite
    Zhen Lei (2024). Data for: Improved Population Mapping for China Using the 3D Building, Nighttime Light, Points-of-interest, and Land Use/Cover Data Within a Multiscale Geographically Weighted Regression Model [Dataset]. http://doi.org/10.17632/hwz54s535n.1
    Dataset updated
    Sep 4, 2024
    Authors
    Zhen Lei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Auxiliary Data.gdb:
    • Land_use: original land use data
    • POI_name: point-of-interest data from the Amap platform (name indicates category)

    New_gridded_population_dataset(.gdb): experimental result data, i.e., a gridded population map of mainland China with a resolution of 100 meters

    New_minus_WorldPop_PopulationResidual(.gdb): pixel-level residuals of the new gridded population dataset relative to the WorldPop dataset

    POI_Correlation_Coefficient:
    • Zonal statistical output of POI kernel density values: summary of various POI kernel densities in residential areas of administrative units
    • Summary of POI Pearson correlation coefficients: sum of Pearson's correlation coefficients for 13 types of POIs at a certain bandwidth

    PopulationData_AdministrativeUnitLevel.gdb:
    • Population_data_mainlandChina_level3: population data at the district and county level in mainland China
    • Population_data_Name_level4_Table: township and street-level population data for provinces and municipalities

    Note: Due to the storage space limitation, 3D building, nighttime light, and WorldPop datasets have not been uploaded. To access these publicly available data, please visit the official website via the "Related links" at the bottom. In addition, we are not authorized to share data for the fourth level of administrative boundaries, so we only share the corresponding population data in tabular form.

  7. Annual Population Survey Two-Year Longitudinal Dataset, January 2018 -...

    • beta.ukdataservice.ac.uk
    • datacatalogue.cessda.eu
    Updated 2024
    + more versions
    Cite
    Social Survey Division Office For National Statistics (2024). Annual Population Survey Two-Year Longitudinal Dataset, January 2018 - December 2019 [Dataset]. http://doi.org/10.5255/ukda-sn-8840-1
    Dataset updated
    2024
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    DataCite (https://www.datacite.org/)
    Authors
    Social Survey Division Office For National Statistics
    Description

    The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost.

    The APS allows for analysis to be carried out on detailed subgroups and below regional level. In recent years (particularly with the sample size of the LFS 5 quarter dataset reducing) there has been some interest in producing a two year APS longitudinal dataset to look at any trends that may occur over a year. The APS Two-Year Longitudinal Datasets, covering 2012/13 onwards, have been deposited as a result of this work. Person- and Household-level APS datasets are also available.

    For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation.

    Occupation data for 2021 and 2022
    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022

  8. American Community Survey

    • gstore.unm.edu
    csv, geojson, gml +5
    Updated Mar 6, 2020
    + more versions
    Cite
    Earth Data Analysis Center (2020). American Community Survey [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/adecfea6-fcd7-4c41-8165-165c4490a9da/metadata/FGDC-STD-001-1998.html
    Available download formats
    kml(5), csv(5), xls(5), json(5), geojson(5), zip(5), gml(5), shp(5)
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Earth Data Analysis Center
    Time period covered
    2018
    Area covered
    New Mexico; West Bounding Coordinate -109.050173, East Bounding Coordinate -103.001964, North Bounding Coordinate 37.000293, South Bounding Coordinate 31.332172
    Description

    A broad and generalized selection of 2014-2018 US Census Bureau 2018 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website.

    This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
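
    The reserved 2010 tract code ranges described above lend themselves to a simple lookup when filtering tracts before analysis. The following is a rough sketch; the function name and the category labels are illustrative choices, and it assumes the code is supplied as the four-digit base number without any suffix.

```python
def tract_category(tract_code: int) -> str:
    """Classify a 2010 Census tract by its four-digit base code."""
    if 9400 <= tract_code <= 9499:
        return "american_indian_area"   # majority American Indian population or reservation land
    if 9800 <= tract_code <= 9899:
        return "special_land_use"       # little or no population, e.g. a National Park
    if 9900 <= tract_code <= 9998:
        return "water_only"             # water area only, no land area
    return "standard"
```

    For example, a population summary might drop tracts classified as `water_only` or `special_land_use`, since they hold little or no population.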

  9. Individuals, ZIP Code Data

    • catalog.data.gov
    • gimi9.com
    Updated Aug 22, 2024
    Cite
    Statistics of Income (SOI) (2024). Individuals, ZIP Code Data [Dataset]. https://catalog.data.gov/dataset/zip-code-data
    Dataset updated
    Aug 22, 2024
    Dataset provided by
    Statistics of Income (SOI)
    Description

    This annual study provides selected income and tax items classified by State, ZIP Code, and the size of adjusted gross income. These data include the number of returns, which approximates the number of households; the number of personal exemptions, which approximates the population; adjusted gross income; wages and salaries; dividends before exclusion; and interest received. Data are based on U.S. Individual Income Tax Returns (Forms 1040) filed with the IRS. SOI collects these data as part of its Individual Income Tax Return (Form 1040) Statistics program, Data by Geographic Areas, ZIP Code Data.

  10. Data from: Population Health data collection for the City of Greater Bendigo...

    • opal.latrobe.edu.au
    • researchdata.edu.au
    xlsx
    Updated Mar 7, 2024
    Cite
    Sandra Leggat; Stephen Begg; Charles Ambrose; Greg D'Arcy (2024). Population Health data collection for the City of Greater Bendigo [Dataset]. http://doi.org/10.4225/22/55BAE9DBD9670
    Available download formats
    xlsx
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    La Trobe
    Authors
    Sandra Leggat; Stephen Begg; Charles Ambrose; Greg D'Arcy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Greater Bendigo City
    Description

    This data collection contains de-identified clinical health service utilisation data from Bendigo Health and the General Practitioners Practices associated with the Loddon Mallee Murray Medicare Local. The collection also includes associated population health data from the ABS, AIHW and the Municipal Health Plans. Health researchers have a major interest in how clinical data can be used to monitor population health and health care in rural and regional Australia through analysing a broad range of factors shown to impact the health of different populations. The Population Health data collection provides students, managers, clinicians and researchers the opportunity to use clinical data in the study of population health, including the analysis of health risk factors, disease trends and health care utilisation and outcomes.

    Temporal range (data time period): 2004 to 2014

    Spatial coverage: Bendigo Latitude -36.758711200000010000, Bendigo Longitude 144.283745899999990000

  11. HHS COVID-19 Small Area Estimations Survey - Primary Vaccine Series - Wave...

    • catalog.data.gov
    • healthdata.gov
    • +2 more
    Updated Jul 4, 2025
    + more versions
    Cite
    U.S. Department of Health and Human Services (2025). HHS COVID-19 Small Area Estimations Survey - Primary Vaccine Series - Wave 25 [Dataset]. https://catalog.data.gov/dataset/hhs-covid-19-small-area-estimations-survey-primary-vaccine-series-wave-25
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Description

    The goal of the Monthly Outcome Survey (MOS) Small Area Estimations (SAE) is to generate estimates of the proportions of adults, by county and month, who were in the population of interest for the U.S. Department of Health and Human Services’ (HHS) We Can Do This COVID-19 Public Education Campaign. These data are designed to be used by practitioners and researchers to understand how county-level COVID-19 vaccination hesitancy changed over time in the United States.

  12. ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • analyst-2.ai
    Updated Feb 12, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-average-for-2013-2015-0724/f3a01d19/?iid=020-481&v=presentation
    Dataset updated
    Feb 12, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York
    Description

    Analysis of ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/8bd0ae94-40d3-4c9b-8a6b-de032e07929f on 12 February 2022.

    --- Dataset description provided by original source is as follows ---

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

    Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).

    --- Original source retains full ownership of the source dataset ---

  13. Current Population Survey

    • data.niaid.nih.gov
    • dataverse.harvard.edu
    Updated May 31, 2011
    Cite
    (2011). Current Population Survey [Dataset]. http://doi.org/10.7910/DVN/35IUVQ
    Dataset updated
    May 31, 2011
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Users can download data or view data tables on topics related to the labor force of the United States.

    Background: The Current Population Survey is a joint effort between the Bureau of Labor Statistics and the Census Bureau. It provides information and data on the labor force of the United States, such as: employment, unemployment, earnings, hours of work, school enrollment, health, employee benefits and income. The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

    User Functionality: Users can download data sets or view data tables on their topic of interest. Data can be organized by a variety of demographic variables, including: sex, age, race, marital status and educational attainment. Data is available on a national or state level.

    Data Notes: The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

  14. Population Estimates for Northern Ireland - Dataset - Open Data NI

    • admin.opendatani.gov.uk
    Updated Jan 17, 2018
    + more versions
    Cite
    (2018). Population Estimates for Northern Ireland - Dataset - Open Data NI [Dataset]. https://admin.opendatani.gov.uk/dataset/population-estimates-for-northern-ireland
    Dataset updated
    Jan 17, 2018
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    Ireland, Northern Ireland
    Description

    Population estimates relate to the population as of 30th June each year, and therefore are often referred to as mid-year estimates. They are used to allocate public funds to the Northern Ireland Executive through the Barnett formula and are widely used by Northern Ireland government departments for the planning of services, such as health and education. These statistics are also of interest to those involved in research and academia. They are widely used to express other statistics as a rate, and thus enable comparisons across the United Kingdom and other countries. Furthermore, population estimates form the basis for future population statistics such as population projections.

  15. GlobPOP: A 31-year (1990-2020) global gridded population dataset generated...

    • zenodo.org
    tiff
    Updated Apr 18, 2025
    + more versions
    Cite
    Luling Liu; Xin Cao; Shijie Li; Na Jie (2025). GlobPOP: A 31-year (1990-2020) global gridded population dataset generated by cluster analysis and statistical learning [Dataset]. http://doi.org/10.5281/zenodo.10088105
    Available download formats: tiff
    Dataset updated
    Apr 18, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Luling Liu; Xin Cao; Shijie Li; Na Jie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Update Notice

    We are pleased to announce that the GlobPOP dataset for the years 2021-2022 has undergone a comprehensive quality check and has now been updated accordingly. Because the update follows the established methodology, which ensures high precision and reliability, it allows for more comprehensive time-series analysis. The updated GlobPOP dataset remains available in GeoTIFF format for easy integration into your existing workflows.

    To reflect these updates, our interactive web application has also been refreshed. Users can now explore the updated national population time-series curves from 1990 to 2022 and compare them with census data. The application can be accessed via the same link: https://globpop.shinyapps.io/GlobPOP/. Thank you for your continued support of GlobPOP; we hope that the updated data will further enhance your research and policy analysis endeavors.

    If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.

    Introduction

    Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development, such as epidemiology, urban planning, and global inequality.

    Here, we present GlobPOP, a new continuous global gridded population product with a high spatial resolution of 30 arcseconds, covering 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches and fuses five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan Population datasets, and WorldPop datasets) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.

    Because the GlobPOP dataset is available in both population count and population density formats, researchers and policymakers can use it to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.

    Data description

    The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometer in each grid cell).

    Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:

    GlobPOP_Count_30arc_1990_I32

    Field 1: GlobPOP (Global gridded population)
    Field 2: Pixel unit is population "Count" or population "Density"
    Field 3: Spatial resolution is 30 arc seconds
    Field 4: Year "1990"
    Field 5: Data type is I32 (Int32) or F32 (Float32)
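
The five-field naming convention described above can be split mechanically. The following is a minimal sketch, not code from the GlobPOP authors: the function name `parse_globpop_name`, the `GlobPopName` record, and the `.tiff` extension used in the example call are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobPopName:
    """Hypothetical container for the five underscore-separated fields."""
    product: str      # Field 1, e.g. "GlobPOP"
    unit: str         # Field 2, "Count" or "Density"
    resolution: str   # Field 3, e.g. "30arc"
    year: int         # Field 4, e.g. 1990
    dtype: str        # Field 5, "I32" or "F32"


def parse_globpop_name(filename: str) -> GlobPopName:
    """Split a GlobPOP-style filename into its five documented fields."""
    # Drop a trailing extension if one is present (the extension is assumed).
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {filename!r}")
    product, unit, resolution, year, dtype = parts
    if unit not in ("Count", "Density"):
        raise ValueError(f"unexpected pixel unit: {unit!r}")
    if dtype not in ("I32", "F32"):
        raise ValueError(f"unexpected data type: {dtype!r}")
    return GlobPopName(product, unit, resolution, int(year), dtype)


info = parse_globpop_name("GlobPOP_Count_30arc_1990_I32.tiff")
print(info.unit, info.year, info.dtype)
```

Validating the unit and data-type fields up front makes a misnamed file fail early instead of being silently grouped with the wrong format.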

    More information

    Please refer to the paper for detailed information:

    Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.

    The fully reproducible code is publicly available on GitHub: https://github.com/lulingliu/GlobPOP.

  16. Data from: Assessing cetacean populations using integrated population...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Mar 13, 2020
    Cite
    Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt (2020). Assessing cetacean populations using integrated population models: an example with Cook Inlet beluga whales [Dataset]. http://doi.org/10.5061/dryad.9zw3r229w
    Available download formats: zip
    Dataset updated
    Mar 13, 2020
    Dataset provided by
    National Oceanic and Atmospheric Administration
    Montana State University
    University of Washington
    Cook Inlet Beluga Whale Photo ID Project-Alaska WildLife Alliance*
    University of St Andrews
    Authors
    Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Cook Inlet
    Description

    Effective conservation and management of animal populations requires knowledge of abundance and trends. For many species, these quantities are estimated using systematic visual surveys. Additional individual-level data are available for some species. Integrated population modelling (IPM) offers a mechanism for leveraging these datasets into a single estimation framework. IPMs that incorporate both population- and individual-level data have previously been developed for birds, but have rarely been applied to cetaceans. Here, we explore how IPMs can be used to improve the assessment of cetacean populations. We combined three types of data that are typically available for cetaceans of conservation concern: population-level visual survey data, individual-level capture-recapture data, and data on anthropogenic mortality. We used this IPM to estimate the population dynamics of the Cook Inlet population of beluga whales (CIBW; Delphinapterus leucas) as a case study. Our state-space IPM included a population process model and three observational submodels: 1) a group detection model to describe group size estimates from aerial survey data; 2) a capture-recapture model to describe individual photographic capture-recapture data; and 3) a Poisson regression model to describe historical hunting data. The IPM produces biologically plausible estimates of population trajectories consistent with all three datasets. The estimated population growth rate since 2000 is less than expected for a recovering population. The estimated juvenile/adult survival rate is also low compared to other cetacean populations, indicating that low survival may be impeding recovery. This work demonstrates the value of integrating various data sources to assess cetacean populations and serves as an example of how multiple, imperfect datasets can be combined to improve our understanding of a population of interest. 
    The model framework is applicable to other cetacean populations and to other taxa for which similar data types are available.

    Methods

    /Data/CIBW_RSideCapHist_McGuire&Stephens.csv contains a matrix of right-side capture histories (1 = captured, 0 = not captured) for each individual (rows) and year (columns). Photographic capture-recapture data were collected by Tamara McGuire. These data are made available here, without restriction, but anyone wishing to use these data is requested to contact tamaracookinletbeluga@gmail.com, who can provide further information on how raw data were processed to provide capture histories.

    /Data/CIBW_HuntData_Mahoney&Shelden2000.xlsx contains the minimum documented number of animals killed (MinKilled) for years between 1950 and 1998 as published in Mahoney and Shelden 2000. Entries which are NA indicate that no data were available for that year.

    /Data/CIBW_Abundance_HobbsEtAl2015.xlsx contains the total group size estimates from Hobbs et al. 2015.

    /Data/CIBW_Abundance_BoydEtAl2019.txt contains an array with dimensions [1:1000, 1:8, 1:11] containing 1000 posterior samples of total group size for up to 8 survey days over 11 years, as described in Boyd et al. 2019.
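
To make the capture-history layout concrete (individuals as rows, years as columns, 1 = captured), here is a minimal sketch that tallies captures per individual and per year. The matrix below is synthetic and the function name is hypothetical; the real CSV may include header or identifier columns that would need to be stripped first.

```python
def summarize_capture_histories(histories):
    """Return (captures per individual, captures per year) for a 0/1 matrix.

    `histories` is a list of rows, one per individual; each row holds one
    0/1 entry per year, matching the layout described in the text.
    """
    per_individual = [sum(row) for row in histories]
    # zip(*histories) transposes rows into year columns.
    per_year = [sum(column) for column in zip(*histories)]
    return per_individual, per_year


# Synthetic example: 3 individuals observed over 3 years (not real data).
toy_histories = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
]
per_individual, per_year = summarize_capture_histories(toy_histories)
print(per_individual)
print(per_year)
```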

  17. Population by County 2017

    • gstore.unm.edu
    Updated Mar 6, 2020
    Cite
    (2020). Population by County 2017 [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/cd10009e-a79f-4de5-a12c-87bb5b499e9f/metadata/ISO-19115:2003.html
    Dataset updated
    Mar 6, 2020
    Time period covered
    2017
    Area covered
    West Bound -109.05017 East Bound -103.00196 North Bound 37.000293 South Bound 31.33217
    Description

    A broad and generalized selection of 2013-2017 US Census Bureau 2017 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico counties). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. As in the decennial census, strict confidentiality laws protect all information that could be used to identify individuals or households.

    The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. The primary advantage of using multiyear estimates is the increased statistical reliability of the data for less populated areas and small population subgroups.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. While each full Data Profile contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website. This dataset is organized by New Mexico county boundaries.

  18. National Population Database

    • data.wu.ac.at
    • gimi9.com
    wms
    Updated Apr 20, 2018
    + more versions
    Cite
    Health and Safety Laboratory (2018). National Population Database [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/NzJkOGJmNjMtN2NjMi00OGI2LThkOTctYTg1ZDQ4MmJmMjlj
    Available download formats: wms
    Dataset updated
    Apr 20, 2018
    Dataset provided by
    Health and Safety Laboratory
    Description

    The National Population Database (NPD) is a point-based Geographical Information System (GIS) dataset that combines locational information from providers like the Ordnance Survey with population information about those locations, mainly sourced from Government statistics. The points (and sometimes polygons) represent individual buildings, so the NPD allows detailed local analysis for anywhere in Great Britain.

    The Health & Safety Laboratory (HSL) working with Staffordshire University originally created the NPD in 2004 to help its parent organisation, the Health and Safety Executive (HSE), assess the risks to society of major hazard sites e.g. oil refineries, chemical works and gas holders. Of particular interest to HSE were 'sensitive' populations e.g. schools and hospitals where the people at those locations may be more vulnerable to harm and potentially harder to evacuate in an emergency. The data is split into 5 themes: residential, sensitive populations, transport, workplaces and leisure.

    More information about the NPD can be found here:

    https://www.hsl.gov.uk/what-we-do/better-decisions/geoanalytics/national-population-database

    The NPD was created using various datasets available within Government as part of the Public Sector Mapping Agreement (PSMA) and contains other intellectual property so is only available under license and for a fee. Please contact the HSL GIS Team if you would like to discuss gaining access to the sample or full dataset.

  19. American Community Survey

    • gstore.unm.edu
    csv, geojson, gml +5
    Updated Mar 6, 2020
    + more versions
    Cite
    Earth Data Analysis Center (2020). American Community Survey [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/92f102fa-5d6c-41b6-8cf9-132f78a30e02/metadata/FGDC-STD-001-1998.html
    Available download formats: csv(5), zip(5), json(5), gml(5), geojson(5), xls(5), shp(5), kml(5)
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Earth Data Analysis Center
    Time period covered
    2017
    Area covered
    West Bounding Coordinate -109.050173 East Bounding Coordinate -103.001964 North Bounding Coordinate 37.000293 South Bounding Coordinate 31.332172, New Mexico
    Description

    A broad and generalized selection of 2013-2017 US Census Bureau 2017 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website.

    This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous.

    For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
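
The reserved 2010 tract code ranges described above can be expressed as a small lookup. This is an illustrative sketch only: the function name and the category labels are hypothetical, and it assumes the input is the four-digit basic tract number, without any suffix digits that a full tract identifier may carry.

```python
def classify_tract_code(code: int) -> str:
    """Map a four-digit 2010 census tract number to its reserved-range category."""
    if 9400 <= code <= 9499:
        # Majority American Indian population or reservation/trust land.
        return "american_indian_area"
    if 9800 <= code <= 9899:
        # Little or no population, large special land use area.
        return "special_land_use"
    if 9900 <= code <= 9998:
        # Water area only, no land area.
        return "water_only"
    return "standard"


for tract in (107, 9402, 9801, 9900):
    print(tract, classify_tract_code(tract))
```

A filter built on this kind of check can help exclude water-only or special land use tracts before computing population-based rates.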

  20. World Bank Indicators of Interest to COVID-19 Outbreak

    • catalog.midasnetwork.us
    xls
    Updated Jul 7, 2023
    + more versions
    Cite
    MIDAS Coordination Center (2023). World Bank Indicators of Interest to COVID-19 Outbreak [Dataset]. https://catalog.midasnetwork.us/collection/83
    Available download formats: xls
    Dataset updated
    Jul 7, 2023
    Dataset authored and provided by
    MIDAS Coordination Center
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Variables measured
    disease, COVID-19, pathogen, Homo sapiens, risk factors, host organism, age-stratified, phenotypic sex, infectious disease, health system capacity, and 3 more
    Dataset funded by
    National Institute of General Medical Sciences
    Description

    The datasets contain different indicators that have been collected over time that might help in analysis, modeling, prediction and projection of the COVID-19 pandemic. Some of these datasets pertain to health and demographic information such as number of beds, hospital workers, mortality from certain conditions, health expenditures, WASH, age of population, number of the population of certain age groups in a country, etc.

Cite
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7853477

A dataset from a survey investigating disciplinary differences in data citation

Available download formats: txt, pdf, bin, csv
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

GENERAL INFORMATION

Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

Date of data collection: January to March 2022

Collection instrument: SurveyMonkey

Funding: Alfred P. Sloan Foundation


SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

Links to publications that cite or use the data:

Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266


DATA & FILE OVERVIEW

File List

  • Filename: MDCDatacitationReuse2021Codebookv2.pdf
    Codebook
  • Filename: MDCDataCitationReuse2021surveydatav2.csv
    Dataset format in csv
  • Filename: MDCDataCitationReuse2021surveydatav2.sav
    Dataset format in SPSS
  • Filename: MDCDataCitationReuseSurvey2021QNR.pdf
    Questionnaire

Additional related data collected that were not included in the current data package: open-ended questions asked to respondents


METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data:

The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

The survey received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final dataset contains 2,492 complete responses, an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails, and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
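
The rates quoted above can be checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions: that the 68.6% completion rate is computed from the 2,492 responses in the final dataset, and that the number of invitations, which is not stated in this description, can be back-calculated from the reported 1.57% uncorrected rate. The variable names are hypothetical.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


received = 3632              # responses received (from the text)
final_complete = 2492        # complete responses in the final dataset
excluded_contacts = 5201     # invalid emails, bounces, and opt-outs
reported_uncorrected = 1.57  # reported uncorrected response rate, in percent

completion_rate = percent(final_complete, received)

# Assumption: invitations are inferred from the rounded 1.57% figure,
# so this is an approximation rather than a reported number.
implied_invited = final_complete / (reported_uncorrected / 100.0)
corrected_rate = percent(final_complete, implied_invited - excluded_contacts)

print(f"completion rate: {completion_rate:.1f}%")
print(f"implied invitations: about {implied_invited:,.0f}")
print(f"corrected response rate: {corrected_rate:.2f}%")
```

Under these assumptions the completion rate comes out at about 68.6% and the corrected response rate at about 1.62%, in line with the reported figures, with roughly 158,700 implied invitations.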

Methods for processing the data:

Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

Instrument- or software-specific information needed to interpret the data:

The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.


DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

Number of variables: 95

Number of cases/rows: 2,492

Missing data codes: 999 Not asked

Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
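
When reading the coded CSV, the documented missing-data code (999, "Not asked") should be treated as missing instead of as a numeric answer. The following is a minimal standard-library sketch; the column names in the sample (`resp_id`, `q1`, `q2`) are hypothetical, and the real variable names should be taken from the Codebook.

```python
import csv
import io

MISSING_CODE = "999"  # documented as "Not asked"


def read_survey_rows(handle, id_columns=("resp_id",)):
    """Read coded survey rows, replacing the 999 code with None.

    Columns listed in `id_columns` (a hypothetical identifier column here)
    are left untouched so that an ID equal to 999 is not blanked out.
    """
    rows = []
    for row in csv.DictReader(handle):
        cleaned = {
            key: (None if key not in id_columns and value == MISSING_CODE else value)
            for key, value in row.items()
        }
        rows.append(cleaned)
    return rows


# Synthetic two-row sample standing in for the real CSV file.
sample = io.StringIO("resp_id,q1,q2\n1,3,999\n999,999,1\n")
rows = read_survey_rows(sample)
print(rows)
```

For the real file, the same function can be called on `open(path, newline="")` in place of the in-memory sample.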
