42 datasets found

18 excel spreadsheets by species and year giving reproduction and growth...
catalog.data.gov
data.wu.ac.at
Updated Aug 17, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp
Explore at:
Dataset updated
Aug 17, 2024
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
Excel spreadsheets by species (4 letter code is abbreviation for genus and species used in study, year 2010 or 2011 is year data collected, SH indicates data for Science Hub, date is date of file preparation). The data in a file are described in a read me file which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the column in the date set for chemical analysis. In this file one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk , D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee , and M. Plocher. Plant reproduction is altered by simulated herbicide drift toconstructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).
g
Employee Travel 2021 (Excel)
opendata.greatersudbury.ca
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated Sep 1, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
City of Greater Sudbury (2021). Employee Travel 2021 (Excel) [Dataset]. https://opendata.greatersudbury.ca/documents/7d73d365118b46e4828f52fea7c8ce3a
Explore at:
Dataset updated
Sep 1, 2021
Dataset authored and provided by
City of Greater Sudbury
Description
Download Employee Travel Excel SheetThis dataset contains information about the employee travel expenses for the year 2021. Details are provided on the employee (name, title, department), the travel (dates, location, purpose) and the cost (expenses, recoveries). Expenses are broken down in separate tabs by Quarter (Q1, Q2, Q3 and Q4). Updated quarterly when expenses are prepared. Expenses for other years are available in separate datasets.
Data from: Current and projected research data storage needs of Agricultural...
catalog.data.gov
agdatacommons.nal.usda.gov
+2more
Updated Apr 21, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
Explore at:
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Servicehttps://www.ars.usda.gov/
Description
The USDA Agricultural Research Service (ARS) recently established SCINet , which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly. From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate response to a data management expert in their unit, to all members of their unit, or to themselves collate responses from their unit before reporting in the survey. Larger storage ranges cover vastly different amounts of data so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," those 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values. Resources in this dataset:Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdfResource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop down not shown here. Resource Software Recommended: Adobe Acrobat,url: https://get.adobe.com/reader/ Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csvResource Description: CSV file includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This information is that same data as in the Excel spreadsheet (also provided).Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsxResource Description: MS Excel worksheet that Includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed.Resource Software Recommended: Microsoft Excel,url: https://products.office.com/en-us/excel
Sample Student Data
figshare.com
xls
Updated Aug 2, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Carrie Ellis (2022). Sample Student Data [Dataset]. http://doi.org/10.6084/m9.figshare.20419434.v1
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.20419434.v1
Dataset updated
Aug 2, 2022
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Carrie Ellis
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In "Sample Student Data", there are 6 sheets. There are three sheets with sample datasets, one for each of the three different exercise protocols described (CrP Sample Dataset, Glycolytic Dataset, Oxidative Dataset). Additionally, there are three sheets with sample graphs created using one of the three datasets (CrP Sample Graph, Glycolytic Graph, Oxidative Graph). Each dataset and graph pairs are from different subjects. · CrP Sample Dataset and CrP Sample Graph: This is an example of a dataset and graph created from an exercise protocol designed to stress the creatine phosphate system. Here, the subject was a track and field athlete who threw the shot put for the DeSales University track team. The NIRS monitor was placed on the right triceps muscle, and the student threw the shot put six times with a minute rest in between throws. Data was collected telemetrically by the NIRS device and then downloaded after the student had completed the protocol. · Glycolytic Dataset and Glycolytic Graph: This is an example of a dataset and graph created from an exercise protocol designed to stress the glycolytic energy system. In this example, the subject performed continuous squat jumps for 30 seconds, followed by a 90 second rest period, for a total of three exercise bouts. The NIRS monitor was place on the left gastrocnemius muscle. Here again, data was collected telemetrically by the NIRS device and then downloaded after he had completed the protocol. · Oxidative Dataset and Oxidative Graph: In this example, the dataset and graph are from an exercise protocol designed to stress the oxidative system. Here, the student held a sustained, light-intensity, isometric biceps contraction (pushing against a table). The NIRS monitor was attached to the left biceps muscle belly. Here, data was collected by a student observing the SmO2 values displayed on a secondary device; specifically, a smartphone with the IPSensorMan APP displaying data. The recorder student observed and recorded the data on an Excel Spreadsheet, and marked the times that exercise began and ended on the Spreadsheet.
B
Data Cleaning Sample
borealisdata.ca
Updated Jul 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/ZCN177
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Sample data for exercises in Further Adventures in Data Cleaning.
Enterprise Survey 2009-2019, Panel Data - Slovenia
microdata.worldbank.org
catalog.ihsn.org
Updated Aug 6, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
World Bank Group (WBG) (2020). Enterprise Survey 2009-2019, Panel Data - Slovenia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3762
Explore at:
Dataset updated
Aug 6, 2020
Dataset provided by
European Bank for Reconstruction and Developmenthttp://ebrd.com/
European Investment Bankhttp://eib.org/
World Bankhttp://topics.nytimes.com/top/reference/timestopics/organizations/w/world_bank/index.html
Time period covered
2008 - 2019
Area covered
Slovenia
Description
Abstract

The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.

The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.

As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.

Geographic coverage

National

Analysis unit

The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

Universe

As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).

Kind of data

Sample survey data [ssd]

Sampling procedure

The sample for Slovenia ES 2009, 2013, 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.

Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.

For the Slovenia 2009 ES, industry stratification was designed in the way that follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals) sample sizes were inflated by about 12% to account for under sampling in firms in service industries.

For Slovenia 2013 ES, industry stratification was designed in the way that follows: the universe was stratified into one manufacturing industry, and two service industries (retail, and other services).

Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).

For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.

For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.

For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.

Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).

Mode of data collection

Computer Assisted Personal Interview [capi]

Research instrument

Questionnaires have common questions (core module) and respectfully additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.

Response rate

Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.

Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.

For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.

For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates in the Slovenia may be selection bias and not frame inaccuracy.

For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.

Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
n
Dataset of development of business during the COVID-19 crisis
narcis.nl
data.mendeley.com
Updated Nov 9, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Litvinova, T (via Mendeley Data) (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
Explore at:
Unique identifier
https://doi.org/10.17632/9vvrd34f8t.1
Dataset updated
Nov 9, 2020
Dataset provided by
Data Archiving and Networked Services (DANS)
Authors
Litvinova, T (via Mendeley Data)
Description
To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second full of pandemics), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages were calculated and the change (increase) in indicators such as profitability and profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. Dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. The dataset is flexible data that can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Due to the fact that the data in the dataset are not ready-made numbers, but formulas, when adding and / or changing the values in the original table at the beginning of the dataset, most of the subsequent tables will be automatically recalculated and the graphs will be updated. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
T
Nuclear Medicine National Headquarter System
data.va.gov
datahub.va.gov
+5more
application/rdfxml +5
Updated Sep 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2019). Nuclear Medicine National Headquarter System [Dataset]. https://www.data.va.gov/dataset/Nuclear-Medicine-National-Headquarter-System/x6z5-25xw
Explore at:
tsv, application/rdfxml, csv, xml, json, application/rssxmlAvailable download formats
Dataset updated
Sep 12, 2019
Description
The Nuclear Medicine National HQ System database is a series of MS Excel spreadsheets and Access Database Tables by fiscal year. They consist of information from all Veterans Affairs Medical Centers (VAMCs) performing or contracting nuclear medicine services in Veterans Affairs medical facilities. The medical centers are required to complete questionnaires annually (RCS 10-0010-Nuclear Medicine Service Annual Report). The information is then manually entered into the Access Tables, which includes: * Distribution and cost of in-house VA - Contract Physician Services, whether contracted services are made via sharing agreement (with another VA medical facility or other government medical providers) or with private providers. * Workload data for the performance and/or purchase of PET/CT studies. * Organizational structure of services. * Updated changes in key imaging service personnel (chiefs, chief technicians, radiation safety officers). * Workload data on the number and type of studies (scans) performed, including Medicare Relative Value Units (RVUs), also referred to as Weighted Work Units (WWUs). WWUs are a workload measure calculated as the product of a study's Current Procedural Terminology (CPT) code, which consists of total work costs (the cost of physician medical expertise and time), and total practice costs (the costs of running a practice, such as equipment, supplies, salaries, utilities etc). Medicare combines WWUs together with one other parameter to derive RVUs, a workload measure widely used in the health care industry. WWUs allow Nuclear Medicine to account for the complexity of each study in assessing workload, that some studies are more time consuming and require higher levels of expertise. This gives a more accurate picture of workload; productivity etc than using just 'total studies' would yield. * A detailed Full-Time Equivalent Employee (FTEE) grid, and staffing distributions of FTEEs across nuclear medicine services. * Information on Radiation Safety Committees and Radiation Safety Officers (RSOs). Beginning in 2011 this will include data collection on part-time and non VA (contract) RSOs; other affiliations they may have and if so to whom they report (supervision) at their VA medical center.Collection of data on nuclear medicine services' progress in meeting the special needs of our female veterans. Revolving documentation of all major VA-owned gamma cameras (by type) and computer systems, their specifications and ages. * Revolving data collection for PET/CT cameras owned or leased by VA; and the numbers and types of PET/CT studies performed on VA patients whether produced on-site, via mobile PET/CT contract or from non-VA providers in the community.* Types of educational training/certification programs available at VA sites * Ongoing funded research projects by Nuclear Medicine (NM) staff, identified by source of funding and research purpose. * Data on physician-specific quality indicators at each nuclear medicine service.* Academic achievements by NM staff, including published books/chapters, journals and abstracts. * Information from polling field sites re: relevant issues and programs Headquarters needs to address. * Results of a Congressionally mandated contracted quality assessment exercise, also known as a Proficiency study. Study results are analyzed for comparison within VA facilities (for example by mission or size), and against participating private sector health care groups. * Information collected on current issues in nuclear medicine as they arise. Radiation Safety Committee structures and membership, Radiation Safety Officer information and information on how nuclear medicine services provided for female Veterans are examples of current issues.The database is now stored completely within MS Access Database Tables with output still presented in the form of Excel graphs and tables.
Results of subgroup meta-analysis of good knowledge, favorable attitudes,...
plos.figshare.com
xls
Updated Jun 9, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Beshada Zerfu Woldegeorgis; Amene Abebe Kerbo; Mohammed Suleiman Obsa; Taklu Marama Mokonnon (2023). Results of subgroup meta-analysis of good knowledge, favorable attitudes, and good practices. [Dataset]. http://doi.org/10.1371/journal.pone.0287042.t002
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0287042.t002
Dataset updated
Jun 9, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Beshada Zerfu Woldegeorgis; Amene Abebe Kerbo; Mohammed Suleiman Obsa; Taklu Marama Mokonnon
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Results of subgroup meta-analysis of good knowledge, favorable attitudes, and good practices.
d
Practice Level Prescribing Data
digital.nhs.uk
Updated Feb 25, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2020). Practice Level Prescribing Data [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/practice-level-prescribing-data
Explore at:
Dataset updated
Feb 25, 2020
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditionshttps://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Time period covered
Dec 1, 2019 - Dec 31, 2019
Description
Published: 25th February 2020 - The NHSBSA ‘One Drug Database’ project is underway to provide a consistent, single source of prescriptions information. The first release of new data was due in February 2020 (December 2019 data) on the NHSBSA website, however this has been delayed a month. As a result this is an additional and FINAL release of PLP data by NHS Digital. This means that from March 2020 the PLP data will ONLY be available from the NHSBSA website. There will also be differences in the way data is presented. You can read more about the project, and how this affects the NHSBSA systems and reports in the related link at the bottom of this web page. Practice level prescribing data is a list of all medicines, dressings and appliances that are prescribed by all practices in England and dispensed in the community each month. A zip file is available which users are able to download and extract all 3 files locally. What does the data cover? Practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, including GP practices, the following information is presented at presentation level for each medicine, dressing and appliance, (by presentation name): the total number of items prescribed and dispensed the total net ingredient cost the total actual cost the total quantity The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to practices. Practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation. Warning: Large file size (over 1GB). Each monthly data set is large (over 10 million rows), but can be viewed using add-ons to existing software, such as the Microsoft PowerPivot add-on for Excel, to handle larger data sets. The Microsoft PowerPivot add-on for Excel is available using the link in the 'Related Links' section below. Once PowerPivot has been installed, to load the large files, please follow the instructions below. Note that it may take at least 20 to 30 minutes to load one monthly file. 1. Start Excel as normal 2. Click on the PowerPivot tab 3. Click on the PowerPivot Window icon (top left) 4. In the PowerPivot Window, click on the "From Other Sources" icon 5. In the Table Import Wizard e.g. scroll to the bottom and select Text File 6. Browse to the file you want to open and choose the file extension you require e.g. CSV Once the data has been imported you can view it in a spreadsheet.
Market Basket Analysis
kaggle.com
Updated Dec 9, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 9, 2021
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Aslan Ahmedov
Description
Market Basket Analysis

Market basket analysis with Apriori algorithm

The retailer wants to target customers with suggestions on itemset that a customer is most likely to purchase .I was given dataset contains data of a retailer; the transaction data provides data around all the transactions that have happened over a period of time. Retailer will use result to grove in his industry and provide for customer suggestions on itemset, we be able increase customer engagement and improve customer experience and identify customer behavior. I will solve this problem with use Association Rules type of unsupervised learning technique that checks for the dependency of one data item on another data item.

Introduction

Association Rule is most used when you are planning to build association in different objects in a set. It works when you are planning to find frequent patterns in a transaction database. It can tell you what items do customers frequently buy together and it allows retailer to identify relationships between the items.

An Example of Association Rules

Assume there are 100 customers, 10 of them bought Computer Mouth, 9 bought Mat for Mouse and 8 bought both of them. - bought Computer Mouth => bought Mat for Mouse - support = P(Mouth & Mat) = 8/100 = 0.08 - confidence = support/P(Mat for Mouse) = 0.08/0.09 = 0.89 - lift = confidence/P(Computer Mouth) = 0.89/0.10 = 8.9 This just simple example. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

Strategy

Data Import

Data Understanding and Exploration

Transformation of the data – so that is ready to be consumed by the association rules algorithm

Running association rules

Exploring the rules generated

Filtering the generated rules

Visualization of Rule

Dataset Description

File name: Assignment-1_Data

List name: retaildata

File format: . xlsx

Number of Row: 522065

Number of Attributes: 7

BillNo: 6-digit number assigned to each transaction. Nominal.

Itemname: Product name. Nominal.

Quantity: The quantities of each product per transaction. Numeric.

Date: The day and time when each transaction was generated. Numeric.

Price: Product price. Numeric.

CustomerID: 5-digit number assigned to each customer. Nominal.

Country: Name of the country where each customer resides. Nominal.

https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png">

Libraries in R

First, we need to load required libraries. Shortly I describe all libraries.

arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).

arulesViz - Extends package 'arules' with various visualization. techniques for association rules and item-sets. The package also includes several interactive visualizations for rule exploration.

tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.

readxl - Read Excel Files in R.

plyr - Tools for Splitting, Applying and Combining Data.

ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

knitr - Dynamic Report generation in R.

magrittr- Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.

dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png">

Data Pre-processing

Next, we need to upload Assignment-1_Data. xlsx to R to read the dataset.Now we can see our data in R.

https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png"> https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png">

After we will clear our data frame, will remove missing values.

https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png">

To apply Association Rule mining, we need to convert dataframe into transaction data to make all items that are bought together in one invoice will be in ...
Survey on Interest Rate Controls 2019 - Albania, Algeria, Anguilla...and 103...
microdata.worldbank.org
catalog.ihsn.org
+1more
Updated Oct 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
World Bank Group - Finance, Competitiveness and Innovation Global Practice (2023). Survey on Interest Rate Controls 2019 - Albania, Algeria, Anguilla...and 103 more [Dataset]. https://microdata.worldbank.org/index.php/catalog/3812
Explore at:
Dataset updated
Oct 26, 2023
Dataset provided by
World Bankhttp://topics.nytimes.com/top/reference/timestopics/organizations/w/world_bank/index.html
Authors
World Bank Group - Finance, Competitiveness and Innovation Global Practice
Time period covered
2019
Area covered
Algeria, Anguilla...and 103 more, Albania
Description
Abstract

The Survey on Interest Rate Controls 2020 was conducted as a World Bank Group study on interest rate controls (IRCs) in lending and deposit markets around the world. The study aims to identify the different types of formal (or de jure) controls, the countries that apply then, how they implement them, and the reasons for doing so. The objective of the study is to advance knowledge on this topic by providing an evidence base for investigating the impact of IRCs on economic outcomes.

The survey investigates present IRCs in each surveyed country, the reasons why they have been applied, the framework and resources associated with their application and the details as to their level and functioning. The focus is on legal forms of control (i.e. codified into law) as opposed to de facto controls. The new database on interest rate controls, a popular form of financial repression is based on a survey of 108 countries, representing 88 percent of global gross domestic product. The interest rate controls presented in this dataset were in effect in 2019.

Geographic coverage

Global Survey, covering 108 countries, representing 88 percent of global GDP.

Analysis unit

Regulation at the national level.

Universe

Banking supervisors and Local Banking Associations.

Kind of data

Sample survey data [ssd]

Mode of data collection

Mail Questionnaire [mail]

Research instrument

Bank supervisors and banking associations were provided with a standard excel file with five parts. The survey was structured in five parts, each placed in a different excel sheet. Part A: Introduction. Countries with no IRCs in place were asked to only answer this sheet and leave the rest blank. Part B: Presented the definitions of controls, institutions, products and additional aspects that will be covered in the survey. Part C: Introduced a set of qualitative questions to describe the IRCs in place. Part D: Displayed a set of tables to quantitatively describe the IRCs in place. Part E: Laid out the final set of questions, covering sanctions and control mechanisms that support the IRCs' enforcement. The questionnaire is provided in the Documentation section in pdf and excel.
Sensitivity analysis of pooled prevalence with each study removed one by...
plos.figshare.com
xls
Updated Jun 9, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Beshada Zerfu Woldegeorgis; Amene Abebe Kerbo; Mohammed Suleiman Obsa; Taklu Marama Mokonnon (2023). Sensitivity analysis of pooled prevalence with each study removed one by one. [Dataset]. http://doi.org/10.1371/journal.pone.0287042.t004
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0287042.t004
Dataset updated
Jun 9, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Beshada Zerfu Woldegeorgis; Amene Abebe Kerbo; Mohammed Suleiman Obsa; Taklu Marama Mokonnon
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sensitivity analysis of pooled prevalence with each study removed one by one.
f
Level of knowledge, attitude, and practice towards COVID-19 infection...
plos.figshare.com
xls
Updated Jun 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ayenew Mose; Amare Zewdie; Tadesse Sahle (2023). Level of knowledge, attitude, and practice towards COVID-19 infection prevention among pregnant women in Ethiopia. [Dataset]. http://doi.org/10.1371/journal.pone.0276692.t002
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0276692.t002
Dataset updated
Jun 13, 2023
Dataset provided by
PLOS ONE
Authors
Ayenew Mose; Amare Zewdie; Tadesse Sahle
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Ethiopia
Description
Level of knowledge, attitude, and practice towards COVID-19 infection prevention among pregnant women in Ethiopia.
f
Characteristics of included studies in this systematic review and...
plos.figshare.com
xls
Updated Jun 13, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ayenew Mose; Amare Zewdie; Tadesse Sahle (2023). Characteristics of included studies in this systematic review and meta-analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0276692.t001
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0276692.t001
Dataset updated
Jun 13, 2023
Dataset provided by
PLOS ONE
Authors
Ayenew Mose; Amare Zewdie; Tadesse Sahle
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Characteristics of included studies in this systematic review and meta-analysis.
o
The rural depopulation in the province of Valladolid: press analysis
openicpsr.org
Updated Jun 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rosa Maria Arraez-Betancort (2025). The rural depopulation in the province of Valladolid: press analysis [Dataset]. http://doi.org/10.3886/E232441V1
Explore at:
Unique identifier
https://doi.org/10.3886/E232441V1
Dataset updated
Jun 5, 2025
Dataset provided by
Miguel de Cervantes European University (UEMC)
Authors
Rosa Maria Arraez-Betancort
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Valladolid, Castilla y León (Spain)
Description
The Excel database is part of a study that addresses the treatment of rural generalist journalism on the abandonment of the countryside in a Spanish province (Valladolid). Based on the global data presented here, a content analysis of 226 journalistic pieces from two digital newspapers was carried out: El Norte de Castilla digital (the only regional newspaper in the autonomous community to which Valladolid belongs) and El Día de Valladolid digital (the strictly local newspaper) in order to detect causes, consequences and solutions provided in journalism on the rural exodus from economic, political, socio-cultural and environmental parameters. The main objectives of the study were twofold. On the one hand, to find out how rural journalism deals with the abandonment of the countryside for Valladolid's readers. Two secondary objectives were derived from this objective: (1) To find out what are the causes, consequences and solutions provided by rural journalism on the exodus from the countryside to the citizens of the province of Valladolid; and (2) To find out what is the professional practice of rural journalism on the abandonment of the countryside. On the other hand, the second main objective was to establish proposals for improvement in rural journalism in order to facilitate its professional practice and contribute in some way to solving the problem of rural depopulation. The data were collected between 1 February and 30 November 2023. In addition to the two aforementioned digital newspapers, the data include those of four specialised digital agricultural magazines: ‘Empresa agraria’, ‘Campo.es’, ‘Tierras.Agricultura’ and ‘Agronews Castilla y León’. It should be noted that, once all the data had been incorporated, several transformations were carried out with the PowerQuery Editor tool, which allowed the data to be positioned in an appropriate way in order to subsequently generate dynamic tables in Excel.
Airline Dataset
kaggle.com
Updated Sep 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sourav Banerjee (2023). Airline Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 26, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sourav Banerjee
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.

Content

This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.

Dataset Glossary (Column-wise)

Passenger ID - Unique identifier for each passenger

First Name - First name of the passenger

Last Name - Last name of the passenger

Gender - Gender of the passenger

Age - Age of the passenger

Nationality - Nationality of the passenger

Airport Name - Name of the airport where the passenger boarded

Airport Country Code - Country code of the airport's location

Country Name - Name of the country the airport is located in

Airport Continent - Continent where the airport is situated

Continents - Continents involved in the flight route

Departure Date - Date when the flight departed

Arrival Airport - Destination airport of the flight

Pilot Name - Name of the pilot operating the flight

Flight Status - Current status of the flight (e.g., on-time, delayed, canceled)

Structure of the Dataset

https://i.imgur.com/cUFuMeU.png" alt="">

Acknowledgement

The dataset provided here is a simulated example and was generated using the online platform found at Mockaroo. This web-based tool offers a service that enables the creation of customizable Synthetic datasets that closely resemble real data. It is primarily intended for use by developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.

Cover Photo by: Kevin Woblick on Unsplash

Thumbnail by: Airplane icons created by Freepik - Flaticon
Albero study: a longitudinal database of the social network and personal...
zenodo.org
bin, csv
Updated Mar 26, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Isidro Maya Jariego; Isidro Maya Jariego; Daniel Holgado Ramos; Daniel Holgado Ramos; Deniza Alieva; Deniza Alieva (2021). Albero study: a longitudinal database of the social network and personal networks of a cohort of students at the end of high school [Dataset]. http://doi.org/10.5281/zenodo.3532048
Explore at:
bin, csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.3532048
Dataset updated
Mar 26, 2021
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Isidro Maya Jariego; Isidro Maya Jariego; Daniel Holgado Ramos; Daniel Holgado Ramos; Deniza Alieva; Deniza Alieva
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ABSTRACT

The Albero study analyzes the personal transitions of a cohort of high school students at the end of their studies. The data consist of (a) the longitudinal social network of the students, before (n = 69) and after (n = 57) finishing their studies; and (b) the longitudinal study of the personal networks of each of the participants in the research. The two observations of the complete social network are presented in two matrices in Excel format. For each respondent, two square matrices of 45 alters of their personal networks are provided, also in Excel format. For each respondent, both psychological sense of community and frequency of commuting is provided in a SAV file (SPSS). The database allows the combined analysis of social networks and personal networks of the same set of individuals.

INTRODUCTION

Ecological transitions are key moments in the life of an individual that occur as a result of a change of role or context. This is the case, for example, of the completion of high school studies, when young people start their university studies or try to enter the labor market. These transitions are turning points that carry a risk or an opportunity (Seidman & French, 2004). That is why they have received special attention in research and psychological practice, both from a developmental point of view and in the situational analysis of stress or in the implementation of preventive strategies.

The data we present in this article describe the ecological transition of a group of young people from Alcala de Guadaira, a town located about 16 kilometers from Seville. Specifically, in the “Albero” study we monitored the transition of a cohort of secondary school students at the end of the last pre-university academic year. It is a turning point in which most of them began a metropolitan lifestyle, with more displacements to the capital and a slight decrease in identification with the place of residence (Maya-Jariego, Holgado & Lubbers, 2018).

Normative transitions, such as the completion of studies, affect a group of individuals simultaneously, so they can be analyzed both individually and collectively. From an individual point of view, each student stops attending the institute, which is replaced by new interaction contexts. Consequently, the structure and composition of their personal networks are transformed. From a collective point of view, the network of friendships of the cohort of high school students enters into a gradual process of disintegration and fragmentation into subgroups (Maya-Jariego, Lubbers & Molina, 2019).

These two levels, individual and collective, were evaluated in the “Albero” study. One of the peculiarities of this database is that we combine the analysis of a complete social network with a survey of personal networks in the same set of individuals, with a longitudinal design before and after finishing high school. This allows combining the study of the multiple contexts in which each individual participates, assessed through the analysis of a sample of personal networks (Maya-Jariego, 2018), with the in-depth analysis of a specific context (the relationships between a promotion of students in the institute), through the analysis of the complete network of interactions. This potentially allows us to examine the covariation of the social network with the individual differences in the structure of personal networks.

PARTICIPANTS

The social network and personal networks of the students of the last two years of high school of an institute of Alcala de Guadaira (Seville) were analyzed. The longitudinal follow-up covered approximately a year and a half. The first wave was composed of 31 men (44.9%) and 38 women (55.1%) who live in Alcala de Guadaira, and who mostly expect to live in Alcala (36.2%) or in Seville (37.7%) in the future. In the second wave, information was obtained from 27 men (47.4%) and 30 women (52.6%).

DATE STRUCTURE AND ARCHIVES FORMAT

The data is organized in two longitudinal observations, with information on the complete social network of the cohort of students of the last year, the personal networks of each individual and complementary information on the sense of community and frequency of metropolitan movements, among other variables.

Social network

The file “Red_Social_t1.xlsx” is a valued matrix of 69 actors that gathers the relations of knowledge and friendship between the cohort of students of the last year of high school in the first observation. The file “Red_Social_t2.xlsx” is a valued matrix of 57 actors obtained 17 months after the first observation.

The data is organized in two longitudinal observations, with information on the complete social network of the cohort of students of the last year, the personal networks of each individual and complementary information on the sense of community and frequency of metropolitan movements, among other variables.

In order to generate each complete social network, the list of 77 students enrolled in the last year of high school was passed to the respondents, asking that in each case they indicate the type of relationship, according to the following values: 1, “his/her name sounds familiar"; 2, "I know him/her"; 3, "we talk from time to time"; 4, "we have good relationship"; and 5, "we are friends." The two resulting complete networks are represented in Figure 2. In the second observation, it is a comparatively less dense network, reflecting the gradual disintegration process that the student group has initiated.

Personal networks

Also in this case the information is organized in two observations. The compressed file “Redes_Personales_t1.csv” includes 69 folders, corresponding to personal networks. Each folder includes a valued matrix of 45 alters in CSV format. Likewise, in each case a graphic representation of the network obtained with Visone (Brandes and Wagner, 2004) is included. Relationship values range from 0 (do not know each other) to 2 (know each other very well).

Second, the compressed file “Redes_Personales_t2.csv” includes 57 folders, with the information equivalent to each respondent referred to the second observation, that is, 17 months after the first interview. The structure of the data is the same as in the first observation.

Sense of community and metropolitan displacements

The SPSS file “Albero.sav” collects the survey data, together with some information-summary of the network data related to each respondent. The 69 rows correspond to the 69 individuals interviewed, and the 118 columns to the variables related to each of them in T1 and T2, according to the following list:

• Socio-economic data.

• Data on habitual residence.

• Information on intercity journeys.

• Identity and sense of community.

• Personal network indicators.

• Social network indicators.

DATA ACCESS

Social networks and personal networks are available in CSV format. This allows its use directly with UCINET, Visone, Pajek or Gephi, among others, and they can be exported as Excel or text format files, to be used with other programs.

The visual representation of the personal networks of the respondents in both waves is available in the following album of the Graphic Gallery of Personal Networks on Flickr: <https://www.flickr.com/photos/25906481@N07/albums/72157667029974755>.

In previous work we analyzed the effects of personal networks on the longitudinal evolution of the socio-centric network. It also includes additional details about the instruments applied. In case of using the data, please quote the following reference:

Maya-Jariego, I., Holgado, D. & Lubbers, M. J. (2018). Efectos de la estructura de las redes personales en la red sociocéntrica de una cohorte de estudiantes en transición de la enseñanza secundaria a la universidad. Universitas Psychologica, 17(1), 86-98. https://doi.org/10.11144/Javeriana.upsy17-1.eerp

The English version of this article can be downloaded from: https://tinyurl.com/yy9s2byl

CONCLUSION

The database of the “Albero” study allows us to explore the co-evolution of social networks and personal networks. In this way, we can examine the mutual dependence of individual trajectories and the structure of the relationships of the cohort of students as a whole. The complete social network corresponds to the same context of interaction: the secondary school. However, personal networks collect information from the different contexts in which the individual participates. The structural properties of personal networks may partly explain individual differences in the position of each student in the entire social network. In turn, the properties of the entire social network partly determine the structure of opportunities in which individual trajectories are displayed.

The longitudinal character and the combination of the personal networks of individuals with a common complete social network, make this database have unique characteristics. It may be of interest both for multi-level analysis and for the study of individual differences.

ACKNOWLEDGEMENTS

The fieldwork for this study was supported by the Complementary Actions of the Ministry of Education and Science (SEJ2005-25683), and was part of the project “Dynamics of actors and networks across levels: individuals,
COVID-19 Case Surveillance Public Use Data
data.cdc.gov
opendatalab.com
+6more
application/rdfxml +5
Updated Jul 9, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CDC Data, Analytics and Visualization Task Force (2024). COVID-19 Case Surveillance Public Use Data [Dataset]. https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf
Explore at:
application/rdfxml, tsv, csv, json, xml, application/rssxmlAvailable download formats
Dataset updated
Jul 9, 2024
Dataset provided by
Centers for Disease Control and Preventionhttp://www.cdc.gov/
Authors
CDC Data, Analytics and Visualization Task Force
License
https://www.usa.gov/government-workshttps://www.usa.gov/government-works
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.

CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Detailed Data: Restricted access, patient-level dataset with clinical and symptom data, demographics, and state and county of residence. Access requires a registration process and a data use agreement. (33 data elements)
The following apply to all three datasets:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data cells are suppressed to protect individual privacy.
The datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensures that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.

For more information: NNDSS Supports the COVID-19 Response | CDC.

The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.

COVID-19 Case Reports

COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.

All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.

Data are Considered Provisional

The COVID-19 case surveillance data are dynamic; case reports can be modified at any time by the jurisdictions sharing COVID-19 data with CDC. CDC may update prior cases shared with CDC based on any updated information from jurisdictions. For instance, as new information is gathered about previously reported cases, health departments provide updated data to CDC. As more information and data become available, analyses might find changes in surveillance data and trends during a previously reported time window. Data may also be shared late with CDC due to the volume of COVID-19 cases.
Annual finalized data: To create the final NNDSS data used in the annual tables, CDC works carefully with the reporting jurisdictions to reconcile the data received during the year until each state or territorial epidemiologist confirms that the data from their area are correct.
Access Addressing Gaps in Public Health Reporting of Race and Ethnicity for COVID-19, a report from the Council of State and Territorial Epidemiologists, to better understand the challenges in completing race and ethnicity data for COVID-19 and recommendations for improvement.

Data Limitations

To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.

Data Quality Assurance Procedures

CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
Questions that have been left unanswered (blank) on the case report form are reclassified to a Missing value, if applicable to the question. For example, in the question “Was the individual hospitalized?” where the possible answer choices include “Yes,” “No,” or “Unknown,” the blank value is recoded to Missing because the case report form did not include a response to the question.
Logic checks are performed for date data. If an illogical date has been provided, CDC reviews the data with the reporting jurisdiction. For example, if a symptom onset date in the future is reported to CDC, this value is set to null until the reporting jurisdiction updates the date appropriately.
Additional data quality processing to recode free text data is ongoing. Data on symptoms, race and ethnicity, and healthcare worker status have been prioritized.

Data Suppression

To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.

For questions, please contact Ask SRRG (eocevent394@cdc.gov).

Additional COVID-19 Data

COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Z
Implementación de un dataset de referencias bibliográficas biomédicas: caso...
data.niaid.nih.gov
Updated Nov 7, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Antonio Fernandez (2021). Implementación de un dataset de referencias bibliográficas biomédicas: caso de estudio del gen de fusión TMPRSS2:ERG [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5648434
Explore at:
Dataset updated
Nov 7, 2021
Dataset authored and provided by
Antonio Fernandez
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is obtained from a web-scraper with results of the query TMPRSS2:ERG in pubmed bibliography database. The dataset consist on an excel file with 915 references including: authors, pubmed id, title, short citation, long citation, link to the url, abstract, DOI and journal of publication.

This data is obtained with a web scraper coded in python for the first practice of the asignature M2.851 - Tipología y ciclo de vida de los datos of the Master Degree in Data Science of the UOC.

Facebook

Twitter

Click to copy link

Link copied

Cite

U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp

18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry.

Explore at:

Dataset updated

Aug 17, 2024

Dataset provided by

United States Environmental Protection Agencyhttp://www.epa.gov/

Description

Excel spreadsheets by species (4 letter code is abbreviation for genus and species used in study, year 2010 or 2011 is year data collected, SH indicates data for Science Hub, date is date of file preparation). The data in a file are described in a read me file which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the column in the date set for chemical analysis. In this file one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk , D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee , and M. Plocher. Plant reproduction is altered by simulated herbicide drift toconstructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).

Clear search

Close search

Google apps

Main menu

18 excel spreadsheets by species and year giving reproduction and growth...

Employee Travel 2021 (Excel)

Data from: Current and projected research data storage needs of Agricultural...

Sample Student Data

Data Cleaning Sample

Enterprise Survey 2009-2019, Panel Data - Slovenia

Abstract

Geographic coverage

Analysis unit

Universe

Kind of data

Sampling procedure

Mode of data collection

Research instrument

Response rate

Dataset of development of business during the COVID-19 crisis

Nuclear Medicine National Headquarter System

Results of subgroup meta-analysis of good knowledge, favorable attitudes,...

Practice Level Prescribing Data

Market Basket Analysis

Market Basket Analysis

Introduction

An Example of Association Rules

Strategy

Dataset Description

Libraries in R

Data Pre-processing

Survey on Interest Rate Controls 2019 - Albania, Algeria, Anguilla...and 103...

Abstract

Geographic coverage

Analysis unit

Universe

Kind of data

Mode of data collection

Research instrument

Sensitivity analysis of pooled prevalence with each study removed one by...

Level of knowledge, attitude, and practice towards COVID-19 infection...

Characteristics of included studies in this systematic review and...

The rural depopulation in the province of Valladolid: press analysis

Airline Dataset

Context

Content

Dataset Glossary (Column-wise)

Structure of the Dataset

Acknowledgement

Albero study: a longitudinal database of the social network and personal...

COVID-19 Case Surveillance Public Use Data

CDC has three COVID-19 case surveillance datasets:

Overview

COVID-19 Case Reports

Data are Considered Provisional

Data Limitations

Data Quality Assurance Procedures

Data Suppression

Additional COVID-19 Data

Implementación de un dataset de referencias bibliográficas biomédicas: caso...

18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry.See More Versions

18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry.