100+ datasets found
  1. Data from: Weather conditions and Legionellosis: A nationwide case-crossover...

    • catalog.data.gov
    Updated Mar 29, 2025
    Cite
    U.S. EPA Office of Research and Development (ORD) (2025). Weather conditions and Legionellosis: A nationwide case-crossover study among Medicare recipients [Dataset]. https://catalog.data.gov/dataset/weather-conditions-and-legionellosis-a-nationwide-case-crossover-study-among-medicare-reci
    Dataset updated
    Mar 29, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Data consist of CMS Medicare data files, which are restricted access and cannot be released publicly. This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. EPA cannot release CBI, or data protected by copyright, patent, or otherwise subject to trade secret restrictions. Requests for access to CBI data may be directed to the dataset owner by an authorized person by contacting the party listed. It can be accessed through the following means: CMS Medicare data are available from https://www.cms.gov/data-research/files-for-order/data-disclosures-and-data-use-agreements-duas/limited-data-set-lds with the requirement of a signed Data Use Agreement. Weather data are available at https://prism.oregonstate.edu/. Format: The data that support the findings of this study are available from the Centers for Medicare and Medicaid Services (CMS). Restrictions apply to the availability of these data, which were provided under a Data Use Agreement specific to this study. The data do not contain personally identifiable information but are classified as Limited Data Set files, and their distribution requires an agreement between CMS and the requester and approval by CMS.
Because the data do not contain identifiable private information and were not obtained through interaction or intervention with individuals, the Institutional Review Board for the University of North Carolina and the US Environmental Protection Agency Human Research Protocol Officer determined that use of this data does not constitute human subjects research. This dataset is associated with the following publication: Wade, T., and C. Herbert. Weather conditions and legionellosis: a nationwide case-crossover study among Medicare recipients. EPIDEMIOLOGY AND INFECTION. Cambridge University Press, Cambridge, UK, 152: E125, (2024).

  2. Electronic Health Legal Data

    • kaggle.com
    zip
    Updated Jan 29, 2023
    Cite
    The Devastator (2023). Electronic Health Legal Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/electronic-health-legal-data
    Available download formats: zip (192951 bytes)
    Dataset updated
    Jan 29, 2023
    Authors
    The Devastator
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    Electronic Health Legal Data

    Exploring Laws and Regulations

    By US Open Data Portal, data.gov [source]

    About this dataset

    This Electronic Health Information Legal Epidemiology dataset offers an extensive collection of legal and epidemiological data that can be used to understand the complexities of electronic health information. It contains a detailed balance of variables, including legal requirements, enforcement mechanisms, proprietary tools, access restrictions, privacy and security implications, data rights and responsibilities, user accounts and authentication systems. This powerful set provides researchers with real-world insights into the functioning of EHI law in order to assess its impact on patient safety and public health outcomes. With such data it is possible to gain a better understanding of current policies regarding the regulation of electronic health information as well as their potential for improvement in safeguarding patient confidentiality. Use this dataset to explore how these laws impact our healthcare system by exploring patterns across different groups over time or analyze changes leading up to new versions or updates. Make exciting discoveries with this comprehensive dataset!

    How to use the dataset

    • Start by familiarizing yourself with the different columns of the dataset. Examine each column closely and look up any unfamiliar terminology to get a better understanding of what the columns are referencing.

    • Once you understand the data and what it is intended to represent, think about how you might want to use it in your analysis. You may want to create a research question, or narrower focus for your project surrounding legal epidemiology of electronic health information that can be answered with this data set.

    • After creating your research plan, clean and reshape the data as needed to prepare it for the analysis or visualization that your project plan or research question calls for.

    • Next, perform exploratory data analysis (EDA) on relevant subsets of the data, for example by country or by a target of interest (e.g. gender). Filter out irrelevant information, analyze the patterns and trends in the filtered data, and compare areas that differ in their e-health rules and regulations, bearing in mind that decisions made by elected officials can be strongly driven by demographics, socioeconomic factors, ideology, etc. Look for correlations using statistics at every stage of the process, from filtering out uninformative subgroups to generating visualizations (graphs, diagrams) that depict valid insights. Validate descriptive and predictive models against reference datasets when available, cross-check facts against multiple sources, localize results for local languages and cultures where appropriate, and keep secure datasets private, visible only to duly authorized users.

    • Finally, summarize and share your findings, preferably as infographics that present the evidence, an overall assessment, and the main conclusions, so that the broader community and interested professionals can benefit from the results and freely adapt them locally.

    Research Ideas

    • Studying how technology affects public health policies and practice - Using the data, researchers can look at the various types of legal regulations related to electronic health information to examine any relations between technology and public health decisions in certain areas or regions.
    • Evaluating trends in legal epidemiology – With this data, policymakers can identify patterns that help measure the evolution of electronic health information regulations over time and investigate why such rules are changing within different states or countries.
    • Analysing possible impacts on healthcare costs – Looking at changes in laws, regulations, and standards relate...
  3. Medical Expenditure Panel Survey (MEPS) Restricted Data Files

    • catalog.data.gov
    • data.virginia.gov
    • +2more
    Updated Jul 29, 2025
    Cite
    Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). Medical Expenditure Panel Survey (MEPS) Restricted Data Files [Dataset]. https://catalog.data.gov/dataset/medical-expenditure-panel-survey-meps-restricted-data-files
    Dataset updated
    Jul 29, 2025
    Description

    Restricted Data Files Available at the Data Centers Researchers and users with approved research projects can access restricted data files that have not been publicly released for reasons of confidentiality at the AHRQ Data Center in Rockville, Maryland. Qualified researchers can also access restricted data files through the U.S. Census Research Data Center (RDC) network (http://www.census.gov/ces/dataproducts/index.html -- Scroll down the page and click on the Agency for Health Care Research and Quality (AHRQ) link.) For information on the RDC research proposal process and the data sets available, read AHRQ-Census Bureau agreement on access to restricted MEPS data.

  4. Public Health Portfolio (Directly Funded Research - Programmes and Training...

    • nihr.opendatasoft.com
    • nihr.aws-ec2-eu-central-1.opendatasoft.com
    csv, excel, json
    Updated Nov 4, 2025
    Cite
    (2025). Public Health Portfolio (Directly Funded Research - Programmes and Training Awards) [Dataset]. https://nihr.opendatasoft.com/explore/dataset/phof-datase/
    Available download formats: excel, json, csv
    Dataset updated
    Nov 4, 2025
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This Public Health Portfolio (Directly Funded Research - Programme and Training Awards) dataset contains NIHR directly funded research awards where the funding is allocated to an award holder or host organisation to carry out a specific piece of research or complete a training award. The NIHR also invests significantly in centres of excellence, collaborations, services and facilities to support research in England. Collectively these form NIHR infrastructure support. NIHR infrastructure supported projects are available in the Public Health Portfolio (Infrastructure Support) dataset which you can find here.

    NIHR directly funded research awards (Programmes and Training Awards) that were funded between January 2006 and the present extraction date are eligible for inclusion in this dataset. Agreed inclusion/exclusion criteria are used to categorise awards as public health awards (see below). Following inclusion in the dataset, public health awards are second level coded to one of the four Public Health Outcomes Framework domains. These domains are: (1) wider determinants (2) health improvement (3) health protection (4) healthcare and premature mortality. More information on the Public Health Outcomes Framework domains can be found here.

    This dataset is updated quarterly to include new NIHR awards categorised as public health awards. Please note that for those Public Health Research Programme projects showing an Award Budget of £0.00, the project is undertaken by an on-call team, for example PHIRST, the Public Health Review Team, or the Knowledge Mobilisation Team, as part of an ongoing programme of work.

    Inclusion Criteria

    The NIHR Public Health Overview project team worked with colleagues across NIHR public health research to define the inclusion criteria for NIHR public health research. NIHR directly funded research awards are categorised as public health if they are determined to be ‘investigations of interventions in, or studies of, populations that are anticipated to have an effect on health or on health inequity at a population level.’ This definition of public health is intentionally broad to capture the wide range of NIHR public health research across prevention, health improvement, health protection, and healthcare services (both within and outside of NHS settings). This dataset does not reflect the NIHR’s total investment in public health research. The intention is to showcase a subset of the wider NIHR public health portfolio. This dataset includes NIHR directly funded research awards categorised as public health awards. This dataset does not include public health awards or projects funded by any of the three NIHR Research Schools or NIHR Health Protection Research Units.

    Disclaimers

    Users of this dataset should acknowledge the broad definition of public health that has been used to develop the inclusion criteria for this dataset. Please note that this dataset is currently subject to a limited data quality review. We are working to improve our data collection methodologies. Please also note that some awards may also appear in other NIHR curated datasets.

    Further Information

    Further information on the individual awards shown in the dataset can be found on the NIHR’s Funding & Awards website here. Further information on individual NIHR Research Programmes’ decision-making processes for funding health and social care research can be found here.

    Further information on NIHR’s investment in public health research can be found as follows. The NIHR is one of the main funders of public health research in the UK. Public health research falls within the remit of a range of NIHR Directly Funded Research (Programmes and Training Awards), and NIHR Infrastructure Support.

    • NIHR School for Public Health here.
    • NIHR Public Health Policy Research Unit here.
    • NIHR Health Protection Research Units here.
    • NIHR Public Health Research Programme Health Determinants Research Collaborations (HDRC) here.
    • NIHR Public Health Research Programme Public Health Intervention Responsive Studies Teams (PHIRST) here.

  5. MEDPAR Limited Data Set (LDS) - Hospital (National)

    • data.wu.ac.at
    Updated Apr 5, 2016
    Cite
    U.S. Department of Health & Human Services (2016). MEDPAR Limited Data Set (LDS) - Hospital (National) [Dataset]. https://data.wu.ac.at/schema/data_gov/NjRmOWQxNDItYjk4NS00MDI4LThkMTgtM2I1OTc3NmY2MTli
    Dataset updated
    Apr 5, 2016
    Dataset provided by
    U.S. Department of Health & Human Services
    Description

    No description provided

  6. HCUP State Emergency Department Databases (SEDD) - Restricted Access File

    • catalog.data.gov
    • healthdata.gov
    • +3more
    Updated Jul 29, 2025
    Cite
    Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). HCUP State Emergency Department Databases (SEDD) - Restricted Access File [Dataset]. https://catalog.data.gov/dataset/hcup-state-emergency-department-databases-sedd-restricted-access-file
    Dataset updated
    Jul 29, 2025
    Description

    The Healthcare Cost and Utilization Project (HCUP) State Emergency Department Databases (SEDD) contain the universe of emergency department visits in participating States. The data are translated into a uniform format to facilitate multi-State comparisons and analyses. The SEDD consist of data from hospital-based emergency department visits that do not result in an admission. The SEDD include all patients, regardless of the expected payer including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels. The SEDD contain clinical and resource use information included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and facilities (as required by data sources). Data elements include but are not limited to: diagnoses, procedures, admission and discharge status, patient demographics (e.g., sex, age, race), total charges, length of stay, and expected payment source, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. In addition to the core set of uniform data elements common to all SEDD, some include State-specific data elements. The SEDD exclude data elements that could directly or indirectly identify individuals. For some States, hospital and county identifiers are included that permit linkage to the American Hospital Association Annual Survey File and the Bureau of Health Professions' Area Resource File except in States that do not allow the release of hospital identifiers. Restricted access data files are available with a data use agreement and brief online security training.

  7. Statutory Infrastructure Provider (SIP) - NBN Co Limited - Dataset

    • researchdata.edu.au
    • data.gov.au
    Updated Jul 14, 2020
    Cite
    SIP Register - NBN Co Limited (2020). Statutory Infrastructure Provider (SIP) - NBN Co Limited - Dataset [Dataset]. https://researchdata.edu.au/statutory-infrastructure-provider-limited-dataset/2981854
    Dataset updated
    Jul 14, 2020
    Dataset provided by
    Data.gov (https://data.gov/)
    Authors
    SIP Register - NBN Co Limited
    License

    Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
    License information was derived automatically

    Area covered
    Description

    This data set describes the service areas where NBN Co Limited is the Statutory Infrastructure Provider (SIP).

    This data set forms part of the SIP register which is managed by the ACMA. The SIP register is located on the ACMA’s website at https://www.acma.gov.au/sip-register.

    The data represented here is provided by NBN Co to the ACMA as required under Part 19 of the Telecommunications Act 1997. The ACMA also publishes NBN Co’s geospatial data to the National Map. The copyright in the data is owned by NBN Co, and users must comply with the terms of use for the data as set out on this website. The ACMA does not guarantee, and accepts no legal liability for any loss whatsoever arising from or in connection with the accuracy, reliability, currency, completeness or fitness for purpose of the data.

    The technology planned or delivered for premises or areas by NBN Co, and the availability of the NBN Co network at a premise, may be subject to change over time. More up to date information may be available on https://www.nbnco.com.au/.

  8. Denominator File - Limited Data Set

    • data.wu.ac.at
    Updated Apr 5, 2016
    Cite
    U.S. Department of Health & Human Services (2016). Denominator File - Limited Data Set [Dataset]. https://data.wu.ac.at/odso/data_gov/MDdhNjYxOGMtZWIwYi00N2FkLWFiNTUtY2M1Yjc0YWZjNDc5
    Dataset updated
    Apr 5, 2016
    Dataset provided by
    U.S. Department of Health & Human Services
    Description

    The Denominator File combines Medicare beneficiary entitlement status information from administrative enrollment records with third-party payer information and GHP enrollment information. The Denominator File contains data on all Medicare beneficiaries enrolled and/or entitled in a given year. It is an abbreviated version of the Enrollment Data Base (EDB) (selected data elements). It does not contain data on all beneficiaries ever entitled to Medicare. The file contains data only for beneficiaries who were entitled during the year of the data. These data are available annually in May of the current year for the prior year.

  9. COVID-19 Sewershed Restricted Case Data

    • catalog.data.gov
    • data.chhs.ca.gov
    • +2more
    Updated Nov 23, 2025
    Cite
    California Department of Public Health (2025). COVID-19 Sewershed Restricted Case Data [Dataset]. https://catalog.data.gov/dataset/covid-19-sewershed-restricted-case-data-1ba52
    Dataset updated
    Nov 23, 2025
    Dataset provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    The California Department of Public Health (CDPH) aggregates confirmed cases of COVID-19 by sewershed restricted locations. Confirmed cases are defined as individuals with a positive molecular test, which tests for viral genetic material, such as a polymerase chain reaction test. Since wastewater data available starts from January 1st, 2021, rather than the beginning of the COVID-19 pandemic in 2020, the cumulative counts of the confirmed cases variable are shown as “NA”. Please note that values less than 5 for confirmed cases are masked (shown as “Masked”) if the sewershed population size is 50,000 or fewer, in accordance with de-identification guidelines. Values less than 3 for cases are masked (shown as “Masked”) if the sewershed population size is between 50,001 and 250,000. For no confirmed cases reported, values are set as zero.
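
The masking thresholds described above can be sketched as a small function. This is only an illustration of the stated rules, not CDPH's actual code: the function name `display_cases` and its parameters are hypothetical, and treating a count of zero as always shown (rather than masked) is an assumption based on the last sentence of the description.

```python
from typing import Union


def display_cases(cases: int, sewershed_population: int) -> Union[int, str]:
    """Return the value as it would be shown for one sewershed (hypothetical helper)."""
    if cases < 0 or sewershed_population < 0:
        raise ValueError("counts and population must be non-negative")
    if cases == 0:
        # Assumption: "no confirmed cases" is reported as zero, never masked.
        return 0
    if sewershed_population <= 50_000 and cases < 5:
        # Small sewersheds: values below 5 are masked.
        return "Masked"
    if 50_001 <= sewershed_population <= 250_000 and cases < 3:
        # Mid-sized sewersheds: values below 3 are masked.
        return "Masked"
    # Larger sewersheds, or counts at or above the threshold, are shown as-is.
    return cases
```

For example, `display_cases(4, 50_000)` would be masked, while the same count in a sewershed of 100,000 people would be shown.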

  10. Dataset: Hindustan Aeronautics Limited (HAL.NS)...

    • kaggle.com
    zip
    Updated May 30, 2024
    Cite
    Nitiraj Kulkarni (2024). Dataset: Hindustan Aeronautics Limited (HAL.NS)... [Dataset]. https://www.kaggle.com/datasets/nitirajkulkarni/hal-ns-stock-performance
    Available download formats: zip (39729 bytes)
    Dataset updated
    May 30, 2024
    Authors
    Nitiraj Kulkarni
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.

  11. National COVID Cohort Collaborative Data Enclave

    • datacatalog.med.nyu.edu
    Updated Aug 6, 2025
    Cite
    United States - National Center for Advancing Translational Sciences (NCATS) (2025). National COVID Cohort Collaborative Data Enclave [Dataset]. https://datacatalog.med.nyu.edu/dataset/10384
    Dataset updated
    Aug 6, 2025
    Dataset provided by
    National Center for Advancing Translational Sciences (https://ncats.nih.gov/)
    Authors
    United States - National Center for Advancing Translational Sciences (NCATS)
    Time period covered
    Jan 1, 2020 - Present
    Area covered
    United States
    Description

    The National Center for Advancing Translational Sciences (NCATS) has systematically compiled clinical, laboratory and diagnostic data from electronic health records to support COVID-19 research efforts via the National COVID Cohort Collaborative (N3C) Data Enclave. As of August 2, 2022, the repository contains information from over 15 million patients (including 5.8 million COVID-19 positive patients) across the United States.

    The N3C Data Enclave is organized into 3 levels of data with varying access restrictions:

    • Synthetic dataset: Contains no protected health information (PHI). This is a statistically-comparable artificial dataset derived from the original dataset.
      • Can be requested by: Researchers from US-based or foreign institutions, and citizen scientists
    • De-identified dataset: Contains no PHI. This dataset consists of real patient data with shifted dates of service and truncated ZIP codes of patients residing in areas with populations above 20,000.
      • Can be requested by: Researchers from US-based or foreign institutions
    • Limited Data Set (LDS): Contains 2 PHI elements (dates of service and patient ZIP code). This dataset consists of real patient data.
      • Can be requested by: Researchers from US-based institutions only
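
The three access levels listed above can be summarized as a small lookup structure. This is a sketch of the listing's own wording only: the `Tier` class, the requester constants, and `requestable_tiers` are hypothetical names, and actual N3C access also involves an application process that is not modelled here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical labels for the requester categories named in the listing.
US_RESEARCHER = "us_researcher"
FOREIGN_RESEARCHER = "foreign_researcher"
CITIZEN_SCIENTIST = "citizen_scientist"


@dataclass(frozen=True)
class Tier:
    name: str
    phi_elements: Tuple[str, ...]  # PHI elements the tier contains
    real_patient_data: bool        # False for the synthetic tier
    eligible: FrozenSet[str]       # who can request this tier


TIERS = (
    Tier("Synthetic dataset", (), False,
         frozenset({US_RESEARCHER, FOREIGN_RESEARCHER, CITIZEN_SCIENTIST})),
    Tier("De-identified dataset", (), True,
         frozenset({US_RESEARCHER, FOREIGN_RESEARCHER})),
    Tier("Limited Data Set (LDS)", ("dates of service", "patient ZIP code"), True,
         frozenset({US_RESEARCHER})),
)


def requestable_tiers(requester: str) -> List[str]:
    """List the tier names a given requester category can request."""
    return [tier.name for tier in TIERS if requester in tier.eligible]
```

Calling `requestable_tiers(FOREIGN_RESEARCHER)` returns the synthetic and de-identified tiers but not the LDS.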

  12. Medicare Current Beneficiary Survey - Limited Data Set

    • data.wu.ac.at
    Updated Apr 5, 2016
    Cite
    U.S. Department of Health & Human Services (2016). Medicare Current Beneficiary Survey - Limited Data Set [Dataset]. https://data.wu.ac.at/schema/data_gov/ZGNlMGNiZTAtZjFlYS00NDg5LWFhZDMtMzg5NGE0OGQ5NWY4
    Dataset updated
    Apr 5, 2016
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Description

    The Medicare Current Beneficiary Survey (MCBS) is a continuous, multipurpose survey of a representative national sample of the Medicare population. There are two data files from the Medicare Current Beneficiary Survey (MCBS) that are released in annual Access to Care and Cost and Use files, which can be purchased directly from CMS.

  13. National Health and Nutrition Examination Survey (NHANES) Restricted Data:...

    • data.virginia.gov
    • healthdata.gov
    • +1more
    html
    Updated Apr 21, 2025
    Cite
    Centers for Disease Control and Prevention (2025). National Health and Nutrition Examination Survey (NHANES) Restricted Data: Prior to 1999 [Dataset]. https://data.virginia.gov/dataset/national-health-and-nutrition-examination-survey-nhanes-restricted-data-prior-to-1999
    Available download formats: html
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    The National Health and Nutrition Examination Survey (NHANES) is designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews with standardized physical examinations and laboratory tests.
    NHANES was conducted on a periodic basis from 1971 to 1994, including NHANES I (1971-1975), NHANES II (1976-1980), NHANES III (1988-1994), and a Hispanic Health and Nutrition Examination Survey (HHANES, 1982-1984). In 1999, NHANES became continuous and has been collecting data annually ever since.
    All of the NHANES programs utilized a stratified, multistage probability cluster design to provide a nationally representative sample of the U.S. civilian, noninstitutionalized population. The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component conducted in a mobile examination center consists of medical, dental, and physiological measurements, as well as the collection of biospecimens, such as blood and urine for laboratory testing.

    This set of restricted data contains indirect identifying and/or sensitive information collected in NHANES prior to 1999. Please refer to the links below for additional data available from NHANES:

    Please refer to the NHANES - National Health and Nutrition Examination Survey Homepage at: https://www.cdc.gov/nchs/nhanes/index.htm for further details on the NHANES design, implementation, and data analysis.

  14. Population Assessment of Tobacco and Health (PATH) Study [United States]...

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Sep 30, 2025
    Cite
    Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files [Dataset]. http://doi.org/10.3886/ICPSR38008.v19
    Available download formats: ascii, delimited, spss, stata, r, sas
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/38008/terms

    Area covered
    United States
    Description

    The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). For Wave 1 (baseline), the study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete the Youth Interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. 
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0001 (DS0001) contains the data from the Public-Use File Master Linkage File (PUF-MLF). This file contains 93 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files. Dataset 0002 (DS0002) contains the data from the Restricted-Use File Master Linkage File (RUF-MLF). This file contains 202 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.
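
The participant counts quoted above can be cross-checked with simple arithmetic. The sketch below uses only figures from the description; the variable names are illustrative. Note that the match between the summed samples and the Master Linkage File case count is an observation about the published numbers, not a documented definition of how the file was built.

```python
# Figures quoted in the PATH Study description.
WAVE1_ADULTS_AND_YOUTH = 45_971
WAVE1_SHADOW_YOUTH = 7_207
WAVE1_COHORT = 53_178
WAVE4_REPLENISHMENT = 14_098
WAVE7_REPLENISHMENT = 14_863
MLF_CASES = 82_139  # cases in both DS0001 and DS0002

# Wave 1 respondents plus shadow youth should give the Wave 1 Cohort.
wave1_total = WAVE1_ADULTS_AND_YOUTH + WAVE1_SHADOW_YOUTH

# Wave 1 Cohort plus both replenishment samples: everyone ever sampled.
ever_sampled = WAVE1_COHORT + WAVE4_REPLENISHMENT + WAVE7_REPLENISHMENT

print(wave1_total == WAVE1_COHORT)  # True
print(ever_sampled == MLF_CASES)    # True
```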

  15. Data from: Population Assessment of Tobacco and Health (PATH) Study [United...

    • icpsr.umich.edu
    Updated Sep 30, 2025
    Cite
    Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files [Dataset]. http://doi.org/10.3886/ICPSR36231.v43
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms

    Area covered
    United States
    Description

    The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. 
This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing Standards (FIPS) code, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. 
Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used

  16. Helsinki Tomography Challenge 2022 (HTC2022) open tomographic dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Oct 25, 2023
    Cite
    Alexander Meaney; Fernando Silva de Moura; Markus Juvonen; Samuli Siltanen (2023). Helsinki Tomography Challenge 2022 (HTC2022) open tomographic dataset [Dataset]. http://doi.org/10.5281/zenodo.8041800
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Meaney; Fernando Silva de Moura; Markus Juvonen; Samuli Siltanen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Helsinki
    Description

    This dataset was primarily designed for the Helsinki Tomography Challenge 2022 (HTC2022), but it can be used for generic algorithm research and development in 2D CT reconstruction.

    The dataset contains 2D tomographic measurements, i.e., sinograms and the affiliated metadata containing measurement geometry and other specifications. The sinograms have already been pre-processed with background and flat-field corrections, and compensated for a slightly misaligned center of rotation in the cone-beam computed tomography scanner. The log-transforms from intensity measurements to attenuation data have also been already computed. The data has been stored as MATLAB structs and saved in .mat file format.

    The purpose of HTC2022 was to develop algorithms for limited angle tomography. The challenge data consists of tomographic measurements of two sets of plastic phantoms with a diameter of 7 cm and with holes of differing shapes cut into them. The first set is the teaching data, containing five training phantoms. The second set consists of 21 test phantoms used in the challenge to test algorithm performance. The test phantom data was released after the competition period ended.

    The training phantoms were designed to facilitate algorithm development and benchmarking for the challenge itself. Four of the training phantoms contain holes. These are labeled ta, tb, tc, and td. A fifth training phantom is a solid disc with no holes. We encourage subsampling these datasets to create limited data sinograms and comparing the reconstruction results to the ground truth obtainable from the full-data sinograms. Note that the phantoms are not all identically centered.
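
    The suggested subsampling can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sinogram is taken to be a list of rows with one row per projection angle, and the function name, argument names, and toy data are hypothetical. The actual field names inside the .mat structs are not given in this description.

```python
def limited_angle_subset(sinogram, angles_deg, start_deg, span_deg):
    """Keep only projections whose angle lies in [start_deg, start_deg + span_deg)."""
    if len(sinogram) != len(angles_deg):
        raise ValueError("expected one sinogram row per projection angle")
    end_deg = start_deg + span_deg
    keep = [i for i, a in enumerate(angles_deg) if start_deg <= a < end_deg]
    return [sinogram[i] for i in keep], [angles_deg[i] for i in keep]


# Toy stand-in for a full scan: 8 projections at 45-degree steps,
# 3 detector bins each (all values are made up).
angles = [i * 45.0 for i in range(8)]
sino = [[float(i)] * 3 for i in range(8)]

sub_sino, sub_angles = limited_angle_subset(sino, angles, start_deg=90.0, span_deg=90.0)
print(sub_angles)
```

    Reconstructing from `sub_sino` and comparing against the full-data reconstruction then gives a limited-angle test case with known ground truth. The sketch does not handle angular ranges that wrap past 360 degrees.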

    The teaching data includes the following files for each phantom:

    • The sinogram and all associated metadata (.MAT).
    • A pre-computed FBP reconstruction of the phantom (.MAT and .PNG).
    • A segmentation of the FBP reconstruction created with the procedure described below (.MAT and .PNG).

    Also included in the teaching dataset is a MATLAB example script for how to work with the CT data.

    The challenge test data is arranged into seven different difficulty levels, labeled 1-7, with each level containing three different phantoms, labeled A-C. As the difficulty level increases, the number of holes increases and their shapes become increasingly complex. Furthermore, the view angle is reduced as the difficulty level increases, starting with a 90 degree field of view at level 1, and reducing by 10 degrees at each increasing level of difficulty. The view-angles in the challenge data will not all begin from 0 degrees.
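
    The relationship between difficulty level and angular range stated above works out as follows. The function name is hypothetical; the arithmetic follows directly from "90 degrees at level 1, reducing by 10 degrees per level".

```python
def view_angle_deg(level):
    """Angular range in degrees available at a difficulty level from 1 to 7."""
    if not 1 <= level <= 7:
        raise ValueError("difficulty levels run from 1 to 7")
    return 90 - 10 * (level - 1)


for level in range(1, 8):
    print(level, view_angle_deg(level))
```

    The hardest level, 7, therefore provides only a 30 degree range.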

    The test data includes the following files for each phantom:

    • The full sinogram and all associated metadata (.MAT).
    • The limited angle sinogram and all associated metadata, used to test the algorithms submitted to the challenge (.MAT).
    • A pre-computed FBP reconstruction of the phantom using the full data (.MAT and .PNG).
    • A pre-computed FBP reconstruction of the phantom using the limited angle data. These are of poor quality, and serve mainly as a demonstration of how FBP fails with limited angle data (.MAT and .PNG).
    • A segmentation of the FBP reconstruction using the full data, created with the procedure described below. This was used as the ground truth reference in the challenge (.MAT and .PNG).
    • A segmentation of the FBP reconstruction using the limited angle data, created with the procedure described below. These are of poor quality, and serve mainly as a demonstration of how FBP fails with limited angle data (.MAT and .PNG).
    • A photograph of the phantom, rotated and resized to match the ground truth segmentation (.PNG).

    Also included in the test dataset is a collage in .PNG format, showing all the ground truth segmentation images and the photographs of the phantoms together.

    As the orientation of CT reconstructions can depend on the tools used, we have included the example reconstructions for each of the phantoms to demonstrate how the reconstructions obtained from the sinograms and the specified geometry should be oriented. The reconstructions have been computed using the filtered back-projection algorithm (FBP) provided by the ASTRA Toolbox.

    We have also included segmentation examples of the reconstructions to demonstrate the desired format for the final competition entries. The segmentation images were obtained by the following steps:
    1) Set all negative pixel values in the reconstruction to zero.
    2) Determine a threshold level using Otsu's method.
    3) Globally threshold the image using the threshold level.
    4) Perform a morphological closing on the image using a disc with a radius of 3 pixels.

    The competitors were not obliged to follow the above procedure, and were encouraged to explore various segmentation techniques for the limited angle reconstructions.
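
    The four-step procedure above can be sketched in dependency-free Python. This is an illustrative re-implementation, not the organizers' MATLAB code. The histogram bin count, the exact Euclidean disc shape, the border handling, and all function names are assumptions made for the sketch.

```python
def otsu_threshold(values, bins=256):
    """Threshold maximizing between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin, w0, sum0 = -1.0, 0, 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_bin = var, i
    return lo + (best_bin + 1) * width  # upper edge of the best bin


def disc_offsets(radius):
    """Pixel offsets inside a Euclidean disc (assumed structuring element)."""
    r = range(-radius, radius + 1)
    return [(dy, dx) for dy in r for dx in r if dy * dy + dx * dx <= radius * radius]


def dilate(mask, offsets):
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in offsets:
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        out[y + dy][x + dx] = True
    return out


def erode(mask, offsets):
    # Neighbors outside the image are ignored (an assumed border rule).
    h, w = len(mask), len(mask[0])
    return [[all(not (0 <= y + dy < h and 0 <= x + dx < w) or mask[y + dy][x + dx]
                 for dy, dx in offsets)
             for x in range(w)] for y in range(h)]


def segment(recon, radius=3):
    clipped = [[max(v, 0.0) for v in row] for row in recon]    # step 1
    thr = otsu_threshold([v for row in clipped for v in row])  # step 2
    mask = [[v >= thr for v in row] for row in clipped]        # step 3
    offsets = disc_offsets(radius)
    return erode(dilate(mask, offsets), offsets)               # step 4: closing


# Toy reconstruction: a 9x9 block of ones with a one-pixel negative artifact.
recon = [[0.0] * 23 for _ in range(23)]
for y in range(7, 16):
    for x in range(7, 16):
        recon[y][x] = 1.0
recon[11][11] = -0.2
seg = segment(recon)
print(sum(v for row in seg for v in row))
```

    In the toy image, the negative pixel is clipped to zero in step 1 and the resulting one-pixel hole is filled by the closing in step 4.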

    For getting started with the data, we recommend the following MATLAB toolboxes:

    HelTomo - Helsinki Tomography Toolbox
    https://github.com/Diagonalizable/HelTomo/

    The ASTRA Toolbox
    https://www.astra-toolbox.com/

    Spot – A Linear-Operator Toolbox
    https://www.cs.ubc.ca/labs/scl/spot/

    Using the above toolboxes for the Challenge was by no means compulsory: the metadata for each dataset contains a full specification of the measurement geometry, and the competitors were free to use any and all computational tools they wanted in computing the reconstructions and segmentations.

    All measurements were conducted at the Industrial Mathematics Computed Tomography Laboratory at the University of Helsinki.

  17. OpenFEMA Data Sets

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated Jun 7, 2025
    Cite
    FEMA/Mission Support/Off of Chf Information Officer (2025). OpenFEMA Data Sets [Dataset]. https://catalog.data.gov/dataset/openfema-data-sets
    Explore at:
    Dataset updated
    Jun 7, 2025
    Dataset provided by
    Federal Emergency Management Agency (http://www.fema.gov/)
    Description

    Metadata for the OpenFEMA API data sets. It contains attributes regarding the published datasets including but not limited to update frequency, description, version, and deprecation status.

    If you have media inquiries about this dataset, please email the FEMA News Desk at FEMA-News-Desk@dhs.gov or call (202) 646-3272. For inquiries about FEMA's data and Open government program, please contact the OpenFEMA team via email at OpenFEMA@fema.dhs.gov.

  18. CTF4Science: Kuramoto-Sivashinsky Official DS

    • kaggle.com
    zip
    Updated May 14, 2025
    Cite
    AI Institute in Dynamic Systems (2025). CTF4Science: Kuramoto-Sivashinsky Official DS [Dataset]. https://www.kaggle.com/datasets/dynamics-ai/ctf4science-kuramoto-sivashinsky-official-ds
    Explore at:
    Available download formats: zip (991463847 bytes)
    Dataset updated
    May 14, 2025
    Dataset authored and provided by
    AI Institute in Dynamic Systems
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Kuramoto-Sivashinsky (KS) Dataset - CTF4Science

    Dataset Description

    This dataset contains numerical simulations of the Kuramoto-Sivashinsky (KS) equation, a fourth-order nonlinear partial differential equation (PDE) that exhibits spatio-temporal chaos. The KS equation is a canonical example used in scientific machine learning to benchmark data-driven algorithms for dynamical systems modeling, forecasting, and reconstruction.

    The Kuramoto-Sivashinsky Equation

    The KS equation is defined as:

    u_t + uu_x + u_xx + μu_xxxx = 0
    

    where:

    • u(x,t) is the solution on a spatial domain x ∈ [0, 32π] with periodic boundary conditions
    • μ is a parameter controlling the fourth-order diffusion term
    • The equation exhibits spatio-temporal chaotic behavior, making it particularly challenging for forecasting algorithms

    Dataset Purpose

    This dataset is part of the Common Task Framework (CTF) for Science, designed to provide standardized, rigorous benchmarks for evaluating machine learning algorithms on scientific problems. The CTF addresses key challenges in scientific ML including:

    • Short-term forecasting (weather forecast): Predicting near-future states with trajectory accuracy
    • Long-term forecasting (climate forecast): Capturing statistical properties of long-time dynamics
    • Noisy data reconstruction: Denoising and forecasting from corrupted measurements
    • Limited data scenarios: Learning from sparse observations
    • Parametric generalization: Interpolation and extrapolation to new parameter regimes

    Key Dataset Characteristics

    • System Type: Spatio-temporal PDE (1D spatial + time)
    • Spatial Dimension: 1024 grid points across domain [0, 32π]
    • Time Step: Δt = 0.025
    • Behavior: Chaotic spatio-temporal dynamics
    • Data Format: Available in both MATLAB (.mat) and CSV formats
    • Evaluation Metrics:
      • Short-term: Root Mean Square Error (RMSE)
      • Long-term: Power Spectral Density matching with k=20, modes=100

    Evaluation Tasks

    The dataset supports 12 evaluation metrics (E1-E12) organized into 4 main task categories:

    Test 1: Forecasting (E1, E2)

    • Input: X1train (10000 × 1024)
    • Task: Forecast future 1000 timesteps
    • Metrics:
      • E1: Short-term RMSE on first k timesteps
      • E2: Long-term spectral matching on power spectral density
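
    As a rough illustration of the short-term metric, the sketch below computes a plain RMSE over the first k timesteps. It assumes rows are timesteps and columns are grid points, matching the 10000 × 1024 layout of X1train. The function name and toy values are hypothetical, and the official CTF scoring code may normalize differently.

```python
import math


def short_term_rmse(pred, truth, k):
    """RMSE over the first k timesteps (rows = timesteps, columns = grid points)."""
    if k <= 0 or k > min(len(pred), len(truth)):
        raise ValueError("k must lie between 1 and the number of timesteps")
    squared_error, count = 0.0, 0
    for pred_row, truth_row in zip(pred[:k], truth[:k]):
        for p, t in zip(pred_row, truth_row):
            squared_error += (p - t) ** 2
            count += 1
    return math.sqrt(squared_error / count)


# Toy data: the third timestep is far off but lies outside the first k = 2 steps.
truth = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
pred = [[1.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
print(short_term_rmse(pred, truth, k=2))
```

    Restricting the error to the first k steps reflects the point made under Usage Notes: trajectories of a chaotic system diverge, so pointwise error is only meaningful over a short horizon.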

    Test 2: Noisy Data (E3, E4, E5, E6)

    • Medium Noise (E3, E4): Train on X2train, reconstruct and forecast
    • High Noise (E5, E6): Train on X3train, reconstruct and forecast
    • Metrics: Reconstruction accuracy (RMSE) + Long-term forecasting (spectral)

    Test 3: Limited Data (E7, E8, E9, E10)

    • Noise-Free Limited (E7, E8): 100 snapshots in X4train
    • Noisy Limited (E9, E10): 100 snapshots in X5train
    • Metrics: Short and long-term forecasting from sparse data

    Test 4: Parametric Generalization (E11, E12)

    • Input: Three training trajectories (X6, X7, X8) at different parameter values
    • Task: Interpolate (E11) and extrapolate (E12) to new parameters
    • Burn-in: X9train and X10train provide initialization
    • Metrics: Short-term RMSE on parameter generalization

    Usage Notes

    1. Hidden Test Sets: The actual test data (X1test through X9test) are hidden and used only for evaluation on the CTF leaderboard
    2. Baseline Scores: Use constant zero prediction as the baseline reference (E_i = 0)
    3. Score Range: All scores are clipped to [-100, 100], where 100 represents perfect prediction
    4. Data Continuity: Start indices in YAML indicate temporal relationship between train/test splits
    5. Chaotic Dynamics: Long-term exact trajectory matching is impossible due to Lyapunov divergence; hence spectral metrics for climate forecasting
    6. File Formats: Choose .mat for MATLAB/Python (scipy) workflows or .csv for language-agnostic access
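
    Notes 2 and 3 describe how scores behave but do not give a formula. The sketch below shows one normalization that is consistent with them: a relative L2 error mapped so that a perfect prediction scores 100 and a constant-zero prediction scores 0, with clipping to [-100, 100]. The formula and function name are assumptions for illustration, not the confirmed CTF definition.

```python
import math


def relative_score(pred, truth):
    """Assumed score: 100 * (1 - ||pred - truth|| / ||truth||), clipped to [-100, 100]."""
    error_norm = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)))
    truth_norm = math.sqrt(sum(t * t for t in truth))
    if truth_norm == 0.0:
        raise ValueError("reference signal has zero norm")
    raw = 100.0 * (1.0 - error_norm / truth_norm)
    return max(-100.0, min(100.0, raw))


truth = [3.0, 4.0]
print(relative_score(truth, truth))         # perfect prediction
print(relative_score([0.0, 0.0], truth))    # constant-zero baseline
print(relative_score([30.0, 40.0], truth))  # large error, clipped at the lower bound
```
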
  19. HCUP State Inpatient Databases (SID) - Restricted Access File

    • catalog.data.gov
    • healthdata.gov
    • +3 more
    Updated Jul 29, 2025
    + more versions
    Cite
    Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). HCUP State Inpatient Databases (SID) - Restricted Access File [Dataset]. https://catalog.data.gov/dataset/hcup-state-inpatient-databases-sid-restricted-access-file
    Explore at:
    Dataset updated
    Jul 29, 2025
    Description

    The Healthcare Cost and Utilization Project (HCUP) State Inpatient Databases (SID) are a set of hospital databases that contain the universe of hospital inpatient discharge abstracts from data organizations in participating States. The data are translated into a uniform format to facilitate multi-State comparisons and analyses. The SID are based on data from short term, acute care, nonfederal hospitals. Some States include discharges from specialty facilities, such as acute psychiatric hospitals. The SID include all patients, regardless of payer and contain clinical and resource use information included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and hospitals (as required by data sources). Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels. The SID contain clinical and resource-use information that is included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and hospitals (as required by data sources). Data elements include but are not limited to: diagnoses, procedures, admission and discharge status, patient demographics (e.g., sex, age), total charges, length of stay, and expected payment source, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. In addition to the core set of uniform data elements common to all SID, some include State-specific data elements. The SID exclude data elements that could directly or indirectly identify individuals. For some States, hospital and county identifiers are included that permit linkage to the American Hospital Association Annual Survey File and county-level data from the Bureau of Health Professions' Area Resource File except in States that do not allow the release of hospital identifiers. 
Restricted access data files are available with a data use agreement and brief online security training.

  20. A Dataset for Machine Learning Algorithm Development

    • catalog.data.gov
    • fisheries.noaa.gov
    Updated May 1, 2024
    Cite
    (Point of Contact, Custodian) (2024). A Dataset for Machine Learning Algorithm Development [Dataset]. https://catalog.data.gov/dataset/a-dataset-for-machine-learning-algorithm-development2
    Explore at:
    Dataset updated
    May 1, 2024
    Dataset provided by
    (Point of Contact, Custodian)
    Description

    This dataset consists of imagery, imagery footprints, associated ice seal detections and homography files associated with the KAMERA Test Flights conducted in 2019. This dataset was subset to include relevant data for detection algorithm development. This dataset is limited to data collected during flights 4, 5, 6 and 7 from our 2019 surveys.

Data from: Weather conditions and Legionellosis: A nationwide case-crossover study among Medicare recipients

Related Article
Explore at:
Dataset updated
Mar 29, 2025
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description

Data consist of CMS Medicare data files which are restricted access and cannot be released publicly. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. EPA cannot release CBI, or data protected by copyright, patent, or otherwise subject to trade secret restrictions. Requests for access to CBI data may be directed to the dataset owner by an authorized person by contacting the party listed. It can be accessed through the following means: CMS Medicare data are available from: https://www.cms.gov/data-research/files-for-order/data-disclosures-and-data-use-agreements-duas/limited-data-set-lds with the requirement of a signed Data Use Agreement. Weather data are available at https://prism.oregonstate.edu/. Format: The data that support the findings of this study are available from the Centers for Medicare and Medicaid Services (CMS). Restrictions apply to the availability of these data, which were provided under a Data Use Agreement specific to this study. Data are available from: https://www.cms.gov/data-research/files-for-order/data-disclosures-and-data-use-agreements-duas/limited-data-set-lds with the requirement of a signed Data Use Agreement. Data do not contain personally identifiable information but are classified as Limited Data Set files, and their distribution requires an agreement between CMS and the requester and approval by CMS. Weather data are available at https://prism.oregonstate.edu/. 
Because the data do not contain identifiable private information and were not obtained through interaction or intervention with individuals, the Institutional Review Board for the University of North Carolina and the US Environmental Protection Agency Human Research Protocol Officer determined that use of this data does not constitute human subjects research. This dataset is associated with the following publication: Wade, T., and C. Herbert. Weather conditions and legionellosis: a nationwide case-crossover study among Medicare recipients. EPIDEMIOLOGY AND INFECTION. Cambridge University Press, Cambridge, UK, 152: E125, (2024).
