This case surveillance publicly available dataset has 32 elements for all COVID-19 cases shared with CDC and includes demographics, geography (county and state of residence), any exposure history, disease severity indicators and outcomes, and presence of any underlying medical conditions and risk behaviors. This dataset requires a registration process and a data use agreement.
CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Data: Restricted access, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. Access requires a registration process and a data use agreement. (32 data elements)
Requesting Access to the COVID-19 Case Surveillance Restricted Access Detailed Data
Please review the following documents to determine your interest in accessing the COVID-19 Case Surveillance Restricted Access Detailed Data file:
1) CDC COVID-19 Case Surveillance Restricted Access Detailed Data: Summary, Guidance, Limitations Information, and Restricted Access Data Use Agreement Information
2) Data Dictionary for the COVID-19 Case Surveillance Restricted Access Detailed Data
The next step is to complete the Registration Information and Data Use Restrictions Agreement (RIDURA). Once complete, CDC will review your agreement. After access is granted, Ask SRRG (eocevent394@cdc.gov) will email you information about how to access the data through GitHub. If you have questions about obtaining access, email eocevent394@cdc.gov.
Overview
The COVID-19 case surveillance database includes patient-level data reported by U.S. states and autonomous reporting entities, including New York City and the District of Columbia, as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification. The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and are shared voluntarily with CDC. For more information, visit: <a href="https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-c
https://www.icpsr.umich.edu/web/ICPSR/studies/39057/terms
The Michigan Public Policy Survey (MPPS) is a program of state-wide surveys of local government leaders in Michigan. The MPPS is designed to fill an important information gap in the policymaking process. While there are ongoing surveys of the business community and of the citizens of Michigan, before the MPPS there were no ongoing surveys of local government officials that were representative of all general purpose local governments in the state. Therefore, while we knew the policy priorities and views of the state's businesses and citizens, we knew very little about the views of the local officials who are so important to the economies and community life throughout Michigan. The MPPS was launched in 2009 by the Center for Local, State, and Urban Policy (CLOSUP) at the University of Michigan and is conducted in partnership with the Michigan Association of Counties, Michigan Municipal League, and Michigan Townships Association. The associations provide CLOSUP with contact information for the survey's respondents, and consult on survey topics. CLOSUP makes all decisions on survey design, data analysis, and reporting, and receives no funding support from the associations. The surveys investigate local officials' opinions and perspectives on a variety of important public policy issues and solicit factual information about their localities relevant to policymaking. Over time, the program has covered issues such as fiscal, budgetary and operational policy, fiscal health, public sector compensation, workforce development, local-state governmental relations, intergovernmental collaboration, economic development strategies and initiatives such as placemaking and economic gardening, the role of local government in environmental sustainability, energy topics such as hydraulic fracturing ("fracking") and wind power, trust in government, views on state policymaker performance, opinions on the impacts of the Federal Stimulus Program (ARRA), and more. 
The program will investigate many other issues relevant to local and state policy in the future. A searchable database of every question the MPPS has asked is available on CLOSUP's website. Results of MPPS surveys are currently available as reports, and via online data tables. The MPPS datasets are being released in two forms: public-use datasets and restricted-use datasets. Unlike the public-use datasets, the restricted-use datasets represent full MPPS survey waves, and include all of the survey questions from a wave. Restricted-use datasets also allow for multiple waves to be linked together for longitudinal analysis. The MPPS staff do still modify these restricted-use datasets to remove jurisdiction and respondent identifiers and to recode other variables in order to protect confidentiality. However, it is theoretically possible that a researcher might be able, in some rare cases, to use enough variables from a full dataset to identify a unique jurisdiction, so access to these datasets is restricted and approved on a case-by-case basis. CLOSUP encourages researchers interested in the MPPS to review the codebooks included in this data collection to see the full list of variables, including those not found in the public-use datasets, and to explore the MPPS data using the public-use datasets. The codebooks for these restricted-use datasets are available for download on CLOSUP's website.
https://www.icpsr.umich.edu/web/ICPSR/studies/37484/terms
The Michigan Public Policy Survey (MPPS) is a program of state-wide surveys of local government leaders in Michigan. The MPPS is designed to fill an important information gap in the policymaking process. While there are ongoing surveys of the business community and of the citizens of Michigan, before the MPPS there were no ongoing surveys of local government officials that were representative of all general purpose local governments in the state. Therefore, while we knew the policy priorities and views of the state's businesses and citizens, we knew very little about the views of the local officials who are so important to the economies and community life throughout Michigan.
The MPPS was launched in 2009 by the Center for Local, State, and Urban Policy (CLOSUP) at the University of Michigan and is conducted in partnership with the Michigan Association of Counties, Michigan Municipal League, and Michigan Townships Association. The associations provide CLOSUP with contact information for the survey's respondents, and consult on survey topics. CLOSUP makes all decisions on survey design, data analysis, and reporting, and receives no funding support from the associations.
The surveys investigate local officials' opinions and perspectives on a variety of important public policy issues and solicit factual information about their localities relevant to policymaking. Over time, the program has covered issues such as fiscal, budgetary and operational policy, fiscal health, public sector compensation, workforce development, local-state governmental relations, intergovernmental collaboration, economic development strategies and initiatives such as placemaking and economic gardening, the role of local government in environmental sustainability, energy topics such as hydraulic fracturing ("fracking") and wind power, trust in government, views on state policymaker performance, opinions on the impacts of the Federal Stimulus Program (ARRA), and more. The program will investigate many other issues relevant to local and state policy in the future. A searchable database of every question the MPPS has asked is available on CLOSUP's website. Results of MPPS surveys are currently available as reports, and via online data tables.
The MPPS datasets are being released in two forms: public-use datasets and restricted-use datasets. The public-use datasets are available on OpenICPSR. Unlike the public-use datasets, the restricted-use datasets represent full MPPS survey waves, and include all of the survey questions from a wave. Restricted-use datasets also allow for multiple waves to be linked together for longitudinal analysis. The MPPS staff do still modify these restricted-use datasets to remove jurisdiction and respondent identifiers and to recode other variables in order to protect confidentiality. However, it is theoretically possible that a researcher might be able, in some rare cases, to use enough variables from a full dataset to identify a unique jurisdiction, so access to these datasets is restricted and approved on a case-by-case basis. CLOSUP encourages researchers interested in the MPPS to review the codebooks included in this data collection to see the full list of variables, including those not found in the public-use datasets, and to explore the MPPS data using the public-use datasets. The codebooks for these restricted-use datasets are available for download on CLOSUP's website.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Governmental organizations collect and manage a variety of different types of data at different levels in order to fulfil their official tasks. These include geographical, environmental, meteorological, demographic, health, traffic, transport, financial and economic data. Access to this data has traditionally been severely restricted. Over the past ten years, however, there has been a global trend towards a more open data policy, which has been promoted by directives such as GeoIDG, the PSI Directive and INSPIRE. In Germany, the federal states and their authorities have also introduced an open data policy and make some of this data available to the public via platforms such as Destatis or GDI-DE (Open Government Data). This data is used for a variety of purposes, including determining location, analysing environmental trends, transport planning, health planning and more. Although this data is increasingly being used for scientific research, its full potential often remains unrealised, especially for large datasets. Despite the high quality of public authority data, further adaptation to the FAIR principles (Findable, Accessible, Interoperable, Reusable) is necessary to improve its reusability for research. However, data protection regulations and legal frameworks may impose restrictions that make it necessary to anonymise the data or comply with modern data standards. Nevertheless, government data is a valuable resource that makes a significant contribution to increasing knowledge in all scientific disciplines. As part of a pilot project funded by NFDI4Earth, the Deutscher Wetterdienst (DWD) and the German Climate Computing Centre (DKRZ) worked together to facilitate access to data from public authorities, increase the visibility of this data and increase the number of users from various disciplines. The aim was to make the data available in standardised and FAIR-compliant formats for research and other public applications. 
The DWD's COSMO-REA6 reanalysis dataset (Kaspar et al. 2020), which is of central importance for climate modelling, analyses and energy applications in Europe, was selected as an application example. The standardisation process involved the conversion of regulatory data standards into domain-specific climate research standards and required close collaboration between DWD and DKRZ. After careful curation and quality checking, the dataset was made accessible via the ESGF infrastructure and archived in the WDCC for the long term, taking into account aspects of licensing and authorship. The project's insights and lessons learned were incorporated into a blueprint (Anders et al. 2024), providing guidance on making data from other authorities accessible and usable for both research and the public. Overall, the entire process can be divided into 5 sub-steps: (1) determination and classification of the need, (2) survey of the feasibility, (3) implementation, (4) feedback and follow-up, (5) dissemination. This blueprint outlines generalizable steps and aspects applicable across domains and collaborators, offering a framework for optimizing the use of governmental data in diverse fields.
This dataset contains electronic health records used to study associations between PFAS occurrence and multimorbidity in a random sample of UNC Healthcare system patients. The dataset contains the medical record number to uniquely identify each individual as well as information on PFAS occurrence at the zip code level, the zip code of residence for each individual, chronic disease diagnoses, patient demographics, and neighborhood socioeconomic information from the 2010 US Census.
This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.
It can be accessed through the following means: Because this data has PII from electronic health records, the data can only be accessed with an approved IRB application. Project analytic code is available at L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Cavin Ward/PFAS Chronic Disease and Multimorbidity.
Format: This data is formatted as an R dataframe and an associated comma-delimited flat text file. The data has the medical record number to uniquely identify each individual (which also serves as the primary key for the dataset), as well as information on the occurrence of PFAS contamination at the zip code level, socioeconomic data at the census tract level from the 2010 US Census, demographics, and the presence of chronic disease as well as multimorbidity (the presence of two or more chronic diseases).
This dataset is associated with the following publication: Ward-Caviness, C., J. Moyer, A. Weaver, R. Devlin, and D. Diazsanchez. Associations between PFAS occurrence and multimorbidity as observed in an electronic health record cohort. Environmental Epidemiology. Wolters Kluwer, Alphen aan den Rijn, NETHERLANDS, 6(4): p e217, (2022).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team, except for aggregation of individual case count data into daily counts when that was the best data available for a disease and location. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format. All geographic locations at the country and admin1 level have been represented at the same geographic level as in the data source, provided an ISO code or codes could be identified, unless the data source specifies that the location is listed at an inaccurate geographical level. For more information about decisions made by the curation team, recommended data processing steps, and the data sources used, please see the README that is included in the dataset download ZIP file.
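The one stated exception to unmodified counts, aggregating individual case records into daily counts, can be sketched as follows. This is a minimal illustration only, not the curation team's actual code; the record layout (a report date and a location code per case) and all values are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical individual case records: one (report_date, location) pair per case.
# The layout and values are illustrative assumptions, not Project Tycho data.
cases = [
    (date(2020, 3, 1), "US-NY"),
    (date(2020, 3, 1), "US-NY"),
    (date(2020, 3, 1), "US-CA"),
    (date(2020, 3, 2), "US-NY"),
]

def daily_counts(records):
    """Count individual case records per (date, location) pair."""
    return Counter(records)

counts = daily_counts(cases)
for (day, location), n in sorted(counts.items()):
    print(day.isoformat(), location, n)
```

Because each case contributes exactly one record, the daily counts always sum to the number of input records, which gives a simple consistency check after aggregation.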
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Access to image-based resources is fundamental to research and the transmission of cultural knowledge. Digital access offers the potential for scholars to employ heritage collections internationally via the internet. However, most of the internet’s image-based resources have been locked up in silos, and access at resolutions useful for research has been restricted to bespoke, locally-built applications. The International Image Interoperability Framework (IIIF) can solve this problem using Application Programming Interfaces (APIs) which allow images and metadata held in different digital collections to be accessed in a standardised format.
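As an illustration of what a standardised format means in practice, the IIIF Image API addresses an image, or a region of it, through a fixed URL pattern: identifier/region/size/rotation/quality.format. The sketch below builds such a URL; the server address and image identifier are hypothetical, and the default parameter values follow version 3.0 of the Image API (earlier versions use "full" rather than "max" for the full size).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    The identifier is percent-encoded so that characters such as "/"
    do not break the path structure of the URL.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier, for illustration only.
base = "https://iiif.example.org/image"
whole_image = iiif_image_url(base, "ms/folio 1r")
# Region is given as x,y,width,height in pixels of the full image.
detail = iiif_image_url(base, "ms/folio 1r", region="100,200,800,120")
print(whole_image)
print(detail)
```

Because every compliant server accepts the same pattern, a viewer or annotation tool can request the same region of an image from any collection without collection-specific code.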
In the EU-funded MUYA-IIIF Proof of Concept project (project 694612, the MUYA-IIIF PoC), the School of Oriental and African Studies (SOAS) has addressed the problem of tools and infrastructure needed to realize the potential of IIIF in practical workflows for researchers in the social sciences and humanities. Until now, the large number of organizations worldwide that provide access to their image-based resources via IIIF have supported only viewing, and not routine research processes such as standards-based scientific annotation. The MUYA-IIIF PoC has worked with the British Library (BL) and the hasdai Partnership of CERN and Data Futures GmbH to build and employ a general-purpose IIIF annotation workflow, extending research conducted under the earlier Multimedia Yasna ERC Advanced Grant. Significantly, the MUYA-IIIF PoC has addressed not only problems of annotation workflows but also the creation of new, reusable primary data resources from research employing annotation, which can be preserved using the W3C's Web Annotation Data Model (WADM) standards and state-of-the-art InvenioRDM repository technology.
Specifically, MUYA-IIIF has annotated textual structure in key Avestan manuscripts from multiple collections, including from the British Library, to connect the digitized manuscript imagery with structured transcriptions of the text it bears, enabling analysis and searching.
While many institutions internationally now provide IIIF data resources based on their manuscript collections, very few of these are yet compatible with standards-based annotation. The Oxford MA in digital scholarship found, as recently as Fall 2022, that it needed to convert libraries' IIIF resources before being able to annotate them. In contrast, MUYA-IIIF has produced a new WADM-compliant IIIF service, which can now be freely annotated by scholars, and it has also created annotations of all of the stanzas of the Zoroastrian Yasna ceremony. In turn, this has permitted reuse of existing research investment using Text Encoding Initiative (TEI) analysis of the Yasna. As a result a significant speed-up has been achieved in developing comprehensive interactive transcription of the Yasna manuscripts.
The second part of the MUYA-IIIF project has addressed sustainability and reuse of this new digital collection: in contrast, many data resources in the Humanities and in cultural heritage become vulnerable to technology obsolescence. In particular, WADM annotations are stand-off in nature—they are stored separately from the digitized manuscript imagery and demand new approaches for effective preservation and accessibility for the wider research community. MUYA-IIIF has therefore worked with the hasdai partnership to gain access to new repository technology developed in the InvenioRDM consortium, which supports annotation. InvenioRDM is the software platform on which the upgrade of the European Commission's OpenAIRE trusted Zenodo repository is based.
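The stand-off nature of WADM annotations can be seen in a small sketch. The structure below follows the W3C Web Annotation Data Model (a textual body, and a target that points at a region of an image by URI); the annotation and image URIs, the region, and the transcription text are hypothetical and are not taken from the MUYA-IIIF corpus.

```python
import json

def make_annotation(anno_id, image_uri, xywh, text):
    """Build a minimal WADM annotation as a plain dictionary.

    The annotation stores only a reference to the image (its URI plus a
    region selector); the image itself lives elsewhere.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
        },
        "target": {
            "source": image_uri,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={xywh}",
            },
        },
    }

# Hypothetical identifiers and content, for illustration only.
annotation = make_annotation(
    "https://example.org/annotations/1",
    "https://iiif.example.org/image/folio-1r",
    "100,200,800,120",
    "placeholder transcription of one stanza",
)
serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Since the annotation is an independent JSON document, it has to be preserved, versioned and resolved separately from the imagery it describes, which is why repository support for annotations matters here.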
To support such long-term access and reuse, the project's outputs comprise four components, which together form a sustainable data resource on which not only SOAS but also the external research community can build.
The new MUYA InvenioRDM corpus repository is supported in the long-term through the hasdai Partnership, and the Zenodo record is supported by the EC's OpenAIRE program. In this way MUYA-IIIF has created a new sustainability benchmark for digital research investments using scientific annotation, by assembling standards-based infrastructures and making very long-term costs of operation of complex data resources forecastable in concrete terms.
Implementation of the project
The project was divided into work-packages, reflecting three main activities:
The annotation workflow for the project employed the freizo anəstor platform developed by Data Futures GmbH and employed by institutions in Europe and the U.S. including CERN, Heidelberg, Oxford and Notre Dame, and this was configured for work on the Yasna manuscript. Anəstor can generate multiple versions of Open Annotation Data Model and Web Annotation Data Model (WADM) annotations, to address differences between existing, current and future standards-based WADM research environments, and it is being integrated with the Zenodo global catch-all repository of OpenAIRE.
The annotation workflow provided security for SOAS scholars through ORCID authentication, so that their work was protected from unauthorized modification, and also allowed their contributions to be tracked and credited for citation. In addition the workflow exported annotation collections in a preservable form for efficient access by the research community and for preservation.
Developing an InvenioRDM corpus repository for the MUYA-IIIF project enabled the digital version of the manuscript, together with the British Library metadata, to be presented without restrictions on the internet, and allowed the annotation collections to form a foundation for future research via downloadable JSON datasets (JSON is technology-agnostic and can be employed by a wide range of current and future research software applications).
SOAS now plans to extend the MUYA-IIIF repository with additional manuscripts based on fieldwork in India and Iran and through collaborations with other institutions worldwide. Long-term hosting of this data resource by the hasdai Partnership is already organized for 10 years and new developments such as the Oxford Common File Layout (OCFL) are enabling both very long-term preservation using LTO tape libraries and also cross repository interoperability for resilience as technologies continue to evolve.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
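Because one dataset can hold several series, it can help to separate them before any analysis. The following is a minimal sketch under stated assumptions: the field names (start, end, source, acquisition, count) and all values are hypothetical and do not reflect the actual Project Tycho column names, which are documented in the README shipped with each dataset.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are illustrative assumptions only.
rows = [
    {"start": "2000-01-02", "end": "2000-01-08", "source": "CDC", "acquisition": "domestic", "count": 5},
    {"start": "2000-01-09", "end": "2000-01-15", "source": "CDC", "acquisition": "domestic", "count": 7},
    {"start": "2000-01-02", "end": "2000-01-08", "source": "WHO", "acquisition": "domestic", "count": 6},
    {"start": "2000-01-02", "end": "2000-01-08", "source": "CDC", "acquisition": "abroad", "count": 1},
]

def split_into_series(rows, attribute_keys=("source", "acquisition")):
    """Group rows into series keyed by the values of their attributes."""
    series = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in attribute_keys)
        series[key].append((row["start"], row["end"], row["count"]))
    return dict(series)

series = split_into_series(rows)
for key, intervals in series.items():
    print(key, intervals)
```

Keeping series apart matters because two sources may report the same cases for the same interval; summing counts across all rows of a dataset could therefore count those cases twice.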
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team, except for aggregation of individual case count data into daily counts when that was the best data available for a disease and location. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format. All geographic locations at the country and admin1 level have been represented at the same geographic level as in the data source, provided an ISO code or codes could be identified, unless the data source specifies that the location is listed at an inaccurate geographical level. For more information about decisions made by the curation team, recommended data processing steps, and the data sources used, please see the README that is included in the dataset download ZIP file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team, except for aggregation of individual case count data into daily counts when that was the best data available for a disease and location. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format. All geographic locations at the country and admin1 level have been represented at the same geographic level as in the data source, provided an ISO code or codes could be identified, unless the data source specifies that the location is listed at an inaccurate geographical level. For more information about decisions made by the curation team, recommended data processing steps, and the data sources used, please see the README that is included in the dataset download ZIP file.
Understanding Society (UK Household Longitudinal Study), which began in 2009, is conducted by the Institute for Social and Economic Research (ISER) at the University of Essex, and the survey research organisations Verian Group and NatCen. It builds on and incorporates the British Household Panel Survey (BHPS), which began in 1991.
For full details of the main Understanding Society study, see SN 6614.
The Understanding Society: Waves 1-14, 2009-2023 and Harmonised BHPS: Waves 1-18, 1991-2009: Secure Access dataset contains British National Grid postcode grid references (at 1m resolution) for the unit postcode of each household surveyed, derived from the ONS National Statistics Postcode Directory (ONSPD). Grid references are presented in terms of Eastings and Northings, which are distances in metres (east and north, respectively) from the origin (0,0), which lies to the west of the Scilly Isles. Each grid reference is given a positional quality indicator to denote the accuracy of the grid reference. In the majority of cases, the assigned grid reference relates to the building of the matched address closest to the postcode mean. The grid references provided for Northern Ireland postcodes use the Irish National Grid system that covers all of Ireland and is independent of the British National Grid. No grid references are provided for postcodes in the Channel Islands and the Isle of Man.
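Since eastings and northings are plain distances in metres from a grid origin, the straight-line distance between two households on the same grid follows from Pythagoras. The sketch below is illustrative only: the tuple layout and the grid labels "british" and "irish" are assumptions, not the dataset's variable names or codes, and the coordinates are invented. It also refuses to compare points on different grids, since the British and Irish systems have independent origins.

```python
import math

def grid_distance_m(a, b):
    """Planar distance in metres between two grid references.

    Each argument is a hypothetical (easting, northing, grid) tuple,
    where grid is "british" or "irish".
    """
    if a[2] != b[2]:
        raise ValueError("grid references use different grid systems")
    return math.hypot(b[0] - a[0], b[1] - a[1])

# Invented coordinates, for illustration only.
home = (530000, 180000, "british")
other = (530300, 180400, "british")
print(grid_distance_m(home, other))
```

Any distance computed this way inherits the accuracy of the underlying grid references, so the positional quality indicator should be checked before interpreting small distances.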
The Secure Access version includes all files in the Special Licence version (see SN 6931 for full details), plus a file for each wave that contains four variables relating to the National Grid Reference for each household: easting, northing, positional quality indicator (w_osgrdind), and a variable identifying whether it relates to the British or Irish grid system. The Secure Access version also contains a data file with full dates of birth for Understanding Society and BHPS respondents, which includes the day of birth variable, which is only available in this study.
Related UK Data Archive studies:
The Secure Access version of the dataset has more restrictive access conditions than standard End User Licence or Special Licence access datasets (see 'Access' section). Further details and links to the less restrictive versions can be found on the Understanding Society series Key data webpage.
International Data Access Network (IDAN)
These data are now available to researchers based outside the UK. Selected UKDS SecureLab/controlled datasets from the Institute for Social and Economic Research (ISER) and the Centre for Longitudinal Studies (CLS) have been made available under the International Data Access Network (IDAN) scheme, via a Safe Room access point at one of the UKDS IDAN partners. Prospective users should read the UKDS SecureLab application guide for non-ONS data for researchers outside the UK via Safe Room Remote Desktop Access. Further details about the IDAN scheme can be found on the UKDS International Data Access Network webpage and the IDAN website.
Latest edition information
For the 17th edition (November 2024), Wave 14 data has been added. Other minor changes and corrections have also been made to Waves 1-13. Please refer to the revisions document for full details.
m_hhresp and n_hhresp files updated, December 2024
In the previous release (17th edition, November 2024), there was an issue with household income estimates in m_hhresp and n_hhresp where a household resides in a new local authority (approx. 300 households in wave 14). The issue has been corrected and imputation models re-estimated and imputed values updated for the full sample. Imputed values will therefore change compared to the versions in the original release. The variables affected are w_ficountax_dv, w_fihhmnnet3_dv, n_fihhmnnet4_dv and n_ctband_dv.
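Variable names written with a leading `w_` are wave-generic templates, while names such as `n_fihhmnnet4_dv` carry the prefix of a specific wave. The following sketch expands a template into a wave-specific name. It assumes the usual Understanding Society convention that wave 1 uses `a_`, wave 2 uses `b_`, and so on (so wave 14 uses `n_`); the helper names are hypothetical, and harmonised BHPS waves use a different prefix scheme that is not covered here.

```python
import string


def wave_prefix(wave: int) -> str:
    """Return the assumed letter prefix for an Understanding Society wave."""
    if not 1 <= wave <= 26:
        raise ValueError("wave must be between 1 and 26")
    return string.ascii_lowercase[wave - 1] + "_"


def expand_variable(template: str, wave: int) -> str:
    """Replace the generic 'w_' prefix with the prefix for the given wave."""
    if not template.startswith("w_"):
        raise ValueError("template must start with 'w_'")
    return wave_prefix(wave) + template[2:]


print(expand_variable("w_ficountax_dv", 13))   # m_ficountax_dv
print(expand_variable("w_fihhmnnet3_dv", 14))  # n_fihhmnnet3_dv
```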
Suitable data analysis software
These data are provided by the depositor in Stata format. Users are strongly advised to analyse them in Stata. Transfer to other formats may result in unforeseen issues. Stata SE or MP software is needed to analyse the larger files, which contain over 2,047 variables.
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
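The two processing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions: only the attribute name "PartOfCumulativeCountSeries" comes from the description, while the other keys (`start`, `end`, `count`), the 0/1 coding of the attribute, the sample rows, and the assumption that fixed intervals are back-to-back weeks are all hypothetical.

```python
from datetime import date, timedelta

# Made-up rows; real datasets have more attributes and other column names.
rows = [
    {"start": date(1950, 1, 1), "end": date(1950, 1, 7), "count": 12,
     "PartOfCumulativeCountSeries": 0},
    {"start": date(1950, 1, 15), "end": date(1950, 1, 21), "count": 0,
     "PartOfCumulativeCountSeries": 0},
    {"start": date(1950, 1, 1), "end": date(1950, 1, 14), "count": 30,
     "PartOfCumulativeCountSeries": 1},
]

# Step 1: separate cumulative from fixed-interval series.
fixed = [r for r in rows if r["PartOfCumulativeCountSeries"] == 0]
cumulative = [r for r in rows if r["PartOfCumulativeCountSeries"] == 1]


def fill_missing_weeks(series):
    """Insert weekly intervals that are absent from the series.

    Absent intervals get a count of None (not reported), which keeps them
    distinct from intervals with a reported count of zero.
    """
    by_start = {r["start"]: r["count"] for r in series}
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        end = current + timedelta(days=6)
        filled.append((current, end, by_start.get(current)))
        current += timedelta(days=7)
    return filled


# Step 2: make the gaps in the fixed-interval series explicit.
weekly = fill_missing_weeks(fixed)
for start, end, count in weekly:
    print(start, end, count)
```

In this sketch the week starting 1950-01-08 appears with a count of `None`, whereas the week starting 1950-01-15 keeps its reported zero.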
description: The USGS has contracted with SPOT Image Corporation to acquire and provide Satellite Pour l'Observation de la Terre (SPOT) satellite data for calendar years 2010 and 2011. Under the North America Data Buy agreement, SPOT Image will provide moderate-resolution data from their SPOT 4 and 5 satellites over the conterminous United States and parts of Canada and Mexico through the receiving capabilities at the USGS EROS Center. The French space agency, Centre National d'Etudes Spatiales (CNES), owns and operates the SPOT satellite system. SPOT Image Corporation is a subsidiary of the SPOT Image group, which provides worldwide distribution of their imagery. Under the licensing arrangements of the North America Data Buy contract, access is limited to U.S. Federal civil Government agency users and U.S. State and local government users. The 2011 contract expands usage to include U.S. tribal governments. Qualified users must be logged in to EarthExplorer to gain access to this dataset. An explanation of data restrictions and limitations is displayed when accessing these collections and must be agreed upon before gaining access to these data. The USGS SPOT 4 and 5 datasets provide North American coverage between 53 deg north latitude and 23.5 deg north latitude in calendar year 2010. The coverage for 2011 extends from 55 deg north latitude to 23.5 deg north latitude. SPOT satellites carry imaging instruments that operate with panchromatic and multispectral sensors. The SPOT 4 payload includes two High Resolution Visible and Infrared (HRVIR) sensors, and SPOT 5 utilizes two High Resolution Geometric (HRG) instruments. Each sensor has a swath of 60 km and has an oblique viewing capability of 27 deg on each side of vertical. The sensors can operate independently to observe separate targets or in tandem to cover a larger swath in a single pass. 
Each scene in this collection is approximately 60 km by 60 km and is referenced to World Geodetic System 84 (WGS 84) datum.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de456405
Abstract (en): This survey investigated health insurance coverage, as well as access to and use of health services, in each of ten states. With the goal of remedying the previous lack of state-level data, the survey was conducted to aid in defining problems of insurance coverage and to analyze the impacts of states' policy options. The main unit of observation is the health insurance family, which includes the head, spouse, and their children up to age 18, or to age 23 if they were in school. Variables on health insurance coverage include the types of coverage respondents carried (Medicare, Medicaid, additional state or federal programs, and private policies), sources of private policy coverage, premiums paid for private policies, and number of months uninsured during the last year. Access to health care is measured by variables such as the type of usual health care provider, the amount of time it usually took to get to the doctor's office, and whether needed medical care was not received during the previous year. Variables on the utilization of health care include the number of overnight hospital stays, the number of visits to doctors, age at first DPT (diphtheria, whooping cough, and tetanus) shot, age at first oral polio immunization, and the number of months since the most recent breast exam and Pap smear. The survey also elicited self-reported health status and opinions on the health care system, gauged satisfaction/dissatisfaction with health services received, and gathered information on employment, income, education, migration, age, sex, marital status, race, Hispanic origin, and citizenship. Civilian, noninstitutionalized population of Colorado, Florida, Minnesota, New Mexico, New York, North Dakota, Oklahoma, Oregon, Vermont, and Washington. Samples sufficient to produce approximately 2,000 families with completed interviews were drawn in each state. Families containing one or more Medicaid or uninsured persons were oversampled. 
2005-06-22 An SPSS setup file for Part 1 has been added to the collection and the SAS setup file has been enhanced. 1999-12-29 A file with FIPS state and county codes, which can be merged with Part 1, Main Data File, has been added as Part 2. This file is restricted from dissemination. To obtain this file, researchers must agree to the terms and conditions of a restricted data use agreement in accordance with existing servicing policies. 1997-11-18 A report, "Data Cleaning Procedures for the 1993 Robert Wood Johnson Foundation Family Health Insurance Survey," has been added to the documentation for this study. All documentation is now available as a PDF file. Funding institution(s): Robert Wood Johnson Foundation. Mode of data collection: computer-assisted telephone interview (CATI), face-to-face interview. The data files for this collection are blank-delimited. Part 1, Main Data File, is a person-level file with family-level variables repeated on each record. The data files in this collection may be linked by common ID variables.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de455858
Abstract (en): The purpose of this data collection is to provide an official public record of the business of the federal courts. The data originate from district and appellate court offices throughout the United States. Information was obtained at two points in the life of a case: filing and termination. The termination data contain information on both filing and terminations, while the pending data contain only filing information. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: performed consistency checks; checked for undocumented or out-of-range codes. Universe: All federal court cases in the United States in 2003. Smallest Geographic Unit: County. 2015-09-15 Six data files were created with docket numbers blanked for Parts 1, 3, and 5, and with docket numbers containing original values for Parts 2, 4, and 6. 2012-06-18 The civil data files were updated. 2011-04-12 All parts are being moved to restricted access and will be available only using the restricted access procedures. 2005-04-22 Data files for Part 4, Civil Pending, 2003, in both restricted and unrestricted forms and Part 5, Criminal Data, 2003, have been added along with corresponding SAS and SPSS setup files and codebooks in PDF format. 2005-01-07 A restricted data file for Part 3, Civil Terminations, 2003, has been added to the data collection. The unrestricted data file for Part 3 and its corresponding SAS and SPSS setup files have been updated. The codebook has been modified to reflect these changes. Funding institution(s): United States Department of Justice. Office of Justice Programs. Bureau of Justice Statistics.
Starting with the year 2001, each year of data for Federal Court Cases is released by ICPSR as a separate study number. Federal Court Cases data for the years 1970-2000 can be found in FEDERAL COURT CASES: INTEGRATED DATA BASE, 1970-2000 (ICPSR 8429).
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Unit (PSU)s and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. 
This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. 
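The Wave 1 figures quoted above are arithmetically consistent, and the role of PERSONID as the linking key can be illustrated with a small sketch. Only PERSONID and the participant and case counts come from the description; the records, the `answer` field, and the `state_abbr` field name are made up for the example and do not reflect the actual file layout.

```python
# Consistency of the reported Wave 1 counts.
adult_cases = 32_320           # DS1011, Wave 1 Adult Questionnaire
youth_cases = 13_651           # DS1012, Wave 1 Youth and Parent Questionnaire
shadow_youth = 7_207           # youth ages 9 to 11 sampled at Wave 1

wave1_respondents = adult_cases + youth_cases
wave1_cohort = wave1_respondents + shadow_youth
print(wave1_respondents)  # 45971
print(wave1_cohort)       # 53178

# Linking a State Identifier file to questionnaire data on PERSONID.
# All records below are invented placeholders.
questionnaire = [
    {"PERSONID": "P001", "answer": "yes"},
    {"PERSONID": "P002", "answer": "no"},
]
state_identifier = [
    {"PERSONID": "P001", "state_abbr": "MI"},
    {"PERSONID": "P002", "state_abbr": "OH"},
]

# Index the state file by PERSONID, then attach the state to each record.
state_by_person = {row["PERSONID"]: row for row in state_identifier}
linked = [
    {**record, **state_by_person[record["PERSONID"]]}
    for record in questionnaire
    if record["PERSONID"] in state_by_person
]
print(linked[0])
```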
Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used
Understanding Society (the UK Household Longitudinal Study), which began in 2009, is conducted by the Institute for Social and Economic Research (ISER) at the University of Essex, and the survey research organisations Verian Group (formerly Kantar Public) and NatCen. It builds on and incorporates the British Household Panel Survey (BHPS), which began in 1991.
The Understanding Society: Calendar Year Dataset, 2021, is designed to enable cross-sectional analysis of individuals and households relating specifically to their annual interviews conducted in the year 2021, and therefore combines data collected in three waves (Waves 11, 12 and 13). It has been produced from the same data collected in the main Understanding Society study and released in the longitudinal datasets SN 6614 (End User Licence) and SN 6931 (Special Licence). Such cross-sectional analysis can, however, only involve variables that are collected in every wave in order to have data for the full sample panel. The 2021 dataset is the second of a series of planned Calendar Year Datasets to facilitate cross-sectional analysis of specific years. Full details of the Calendar Year Dataset sample structure (including why some individual interviews from 2022 are included), data structure and additional supporting information can be found in the document '9194_calendar_year_dataset_2020_user_guide'.
As a multi-topic study, Understanding Society aims to understand the short- and long-term effects of social and economic change in the UK at the household and individual levels. The study has a strong emphasis on domains of family and social ties, employment, education, financial resources, and health. Understanding Society is an annual survey of each adult member of a nationally representative sample. The same individuals are re-interviewed in each wave approximately 12 months apart. When individuals move, they are followed within the UK, and anyone joining their households is also interviewed as long as they are living with them. The fieldwork period for a single wave is 24 months. Data collection uses computer-assisted personal interviewing (CAPI) and web interviews (from wave 7) and includes a telephone mop-up. From March 2020 (the end of wave 10 and 2nd year of wave 11), due to the coronavirus pandemic, face-to-face interviews were suspended, and the survey has been conducted by web and telephone only but otherwise has continued as before. One person completes the household questionnaire. Each person aged 16 or older participates in the individual adult interview and self-completed questionnaire. Youths aged 10 to 15 are asked to respond to a paper self-completion questionnaire. In 2020, an additional frequent web survey was separately issued to sample members to capture data on the rapid changes in people’s lives due to the COVID-19 pandemic (see SN 8644). The COVID-19 Survey data are not included in this dataset.
Further information may be found on the Understanding Society main stage webpage and links to publications based on the study can be found on the Understanding Society Latest Research webpage.
Co-funders
In addition to the Economic and Social Research Council, co-funders for the study included the Department of Work and Pensions, the Department for Education, the Department for Transport, the Department of Culture, Media and Sport, the Department for Community and Local Government, the Department of Health, the Scottish Government, the Welsh Assembly Government, the Northern Ireland Executive, the Department of Environment and Rural Affairs, and the Food Standards Agency.
End User Licence and Special Licence versions:
There are two versions of the Calendar Year 2021 data. One is available under the standard End User Licence (EUL) agreement, and the other is a Special Licence (SL) version. The SL version contains month and year of birth variables instead of just age, more detailed country and occupation coding for a number of variables and various income variables have not been top-coded (see xxxx_eul_vs_sl_variable_differences for more details). Users are advised to first obtain the standard EUL version of the data to see if they are sufficient for their research requirements. The SL data have more restrictive access conditions; prospective users of the SL version will need to complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables in order to get permission to use that version. The main longitudinal versions of the Understanding Society study may be found under SNs 6614 (EUL) and 6931 (SL).
Low- and Medium-level geographical identifiers produced for the mainstage longitudinal dataset can be used with this Calendar Year 2021 dataset, subject to SL access conditions. See the User Guide for further details.
Suitable data analysis software
These data are provided by the depositor in Stata format. Users are strongly advised to analyse them in Stata. Transfer to other formats may result in unforeseen issues. Stata SE or MP software is needed to analyse the larger files, which contain about 1,900 variables.
This case surveillance publicly available dataset has 32 elements for all COVID-19 cases shared with CDC and includes demographics, geography (county and state of residence), any exposure history, disease severity indicators and outcomes, and presence of any underlying medical conditions and risk behaviors. This dataset requires a registration process and a data use agreement. CDC has three COVID-19 case surveillance datasets: COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements) COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements) COVID-19 Case Surveillance Restricted Access Data: Restricted access, patient-level dataset with clinical (including symptoms), demographics, and county and state of residence. Access requires a registration process and a data use agreement. (32 data elements) Requesting Access to the COVID-19 Case Surveillance Restricted Access Detailed Data Please review the following documents to determine your interest in accessing the COVID-19 Case Surveillance Restricted Access Detailed Data file: 1) CDC COVID-19 Case Surveillance Restricted Access Detailed Data: Summary, Guidance, Limitations Information, and Restricted Access Data Use Agreement Information 2) Data Dictionary for the COVID-19 Case Surveillance Restricted Access Detailed Data The next step is to complete the Registration Information and Data Use Restrictions Agreement (RIDURA). Once complete, CDC will review your agreement. After access is granted, Ask SRRG (eocevent394@cdc.gov) will email you information about how to access the data through GitHub. If you have questions about obtaining access, email eocevent394@cdc.gov. Overview The COVID-19 case surveillance database includes patient-level data reported by U.S. 
states and autonomous reporting entities, including New York City, the District of Columbia, as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification. The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and are shared voluntarily with CDC. For more information, visit: <a href="https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-c