100+ datasets found

Reporting behavior from WHO COVID-19 public data
search.dataone.org
data.niaid.nih.gov
Updated Jul 14, 2025
Cite
Auss Abbood (2023). Reporting behavior from WHO COVID-19 public data [Dataset]. http://doi.org/10.5061/dryad.9s4mw6mmb
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.9s4mw6mmb
Dataset updated
Jul 14, 2025
Dataset provided by
Dryad Digital Repository
Authors
Auss Abbood
Time period covered
Dec 16, 2022
Description
Objective: Daily COVID-19 data reported by the World Health Organization (WHO) may provide the basis for ad hoc political decisions, including travel restrictions. Data reported by countries, however, are heterogeneous, and metrics to evaluate their quality are scarce. In this work, we analyzed COVID-19 case counts provided by WHO and developed tools to evaluate country-specific reporting behaviors.

Methods: In this retrospective cross-sectional study, COVID-19 data reported daily to WHO from 3rd January 2020 until 14th June 2021 were analyzed. We proposed the concepts of binary reporting rate and relative reporting behavior and performed descriptive analyses for all countries with these metrics. We developed a score to evaluate the consistency of incidence and binary reporting rates. Further, we performed spectral clustering of the binary reporting rate and relative reporting behavior to identify salient patterns in these metrics.

Results: Our final analysis included 222 countries and regions...

Data collection: COVID-19 data was downloaded from WHO. Using a public repository, we added the countries' full names to the WHO data set, merging both data sets on the two-letter abbreviation of each country. The provided COVID-19 data covers January 2020 until June 2021. We uploaded the final data set used for the analyses of this paper.

Data processing: We processed data using a Jupyter Notebook with a Python kernel and publicly available external libraries. This upload contains the required Jupyter Notebook (reporting_behavior.ipynb) with all analyses and some additional work, a README, and the conda environment yml (env.yml).

Any text editor or spreadsheet program, including Microsoft Excel and its free alternatives, can open the uploaded CSV file. Any web browser and some code editors (like the freely available Visual Studio Code) can show the uploaded Jupyter Notebook if the required Python environment is set up correctly.
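The merge described under data collection (attaching full country names via two-letter codes) can be sketched roughly as follows. This is a minimal illustration, not the authors' notebook code: the column names (Country_code, New_cases, code, name) and the sample rows are assumptions made up for the example.

```python
import csv
import io

# Hypothetical stand-ins for the two inputs; real column names may differ.
who_csv = io.StringIO(
    "Date_reported,Country_code,New_cases\n"
    "2020-03-01,DE,51\n"
    "2020-03-01,FR,43\n"
    "2020-03-01,XX,0\n"
)
names_csv = io.StringIO(
    "code,name\n"
    "DE,Germany\n"
    "FR,France\n"
)

def add_country_names(who_rows, name_rows):
    """Left-join full country names onto WHO rows by two-letter code."""
    lookup = {row["code"]: row["name"] for row in name_rows}
    merged = []
    for row in who_rows:
        out = dict(row)
        # Unmatched codes get an empty name rather than being dropped.
        out["Country_name"] = lookup.get(row["Country_code"], "")
        merged.append(out)
    return merged

merged = add_country_names(csv.DictReader(who_csv), csv.DictReader(names_csv))
for row in merged:
    print(row["Country_code"], row["Country_name"], row["New_cases"])
```

Keeping unmatched rows (a left join) makes it easy to spot codes that the name list does not cover.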
GitTables 1M - CSV files
zenodo.org
zip
Updated Jun 6, 2022
Cite
Madelon Hulsebos; Çağatay Demiralp; Paul Groth (2022). GitTables 1M - CSV files [Dataset]. http://doi.org/10.5281/zenodo.6515973
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6515973
Dataset updated
Jun 6, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Madelon Hulsebos; Çağatay Demiralp; Paul Groth
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
This dataset contains >800K CSV files behind the GitTables 1M corpus.

For more information about the GitTables corpus, visit:

- our website for GitTables, or

- the main GitTables download page on Zenodo.
Customer Dataset csv
kaggle.com
Updated Mar 22, 2023
Cite
Moses Moncy (2023). Customer Dataset csv [Dataset]. https://www.kaggle.com/datasets/mosesmoncy/customer-dataset-csv
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Moses Moncy
Description
Dataset

This dataset was created by Moses Moncy

Contents
UCI and OpenML Data Sets for Ordinal Quantification
zenodo.org
data.niaid.nih.gov
zip
Updated Jul 25, 2023
Cite
Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. http://doi.org/10.5281/zenodo.8177302
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8177302
Dataset updated
Jul 25, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.

We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.

Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
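The two protocols can be illustrated with a short sketch. It draws label distributions uniformly from the probability simplex (APP) and then keeps the smoothest 20% (APP-OQ). The roughness measure used here, a sum of squared second differences, is an assumption chosen for illustration; the description above does not say how smoothness is measured, and the function names are hypothetical.

```python
import random

def sample_app(n_classes, n_samples, seed=0):
    """Draw prevalence vectors uniformly from the probability simplex."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Normalized i.i.d. exponential draws are uniform on the simplex
        # (equivalent to a flat Dirichlet distribution).
        draws = [rng.expovariate(1.0) for _ in range(n_classes)]
        total = sum(draws)
        samples.append([d / total for d in draws])
    return samples

def roughness(p):
    """Assumed smoothness measure: sum of squared second differences."""
    return sum((p[i - 1] - 2 * p[i] + p[i + 1]) ** 2 for i in range(1, len(p) - 1))

def sample_app_oq(n_classes, n_samples, keep=0.2, seed=0):
    """Keep the smoothest fraction `keep` of the APP samples."""
    samples = sample_app(n_classes, n_samples, seed)
    samples.sort(key=roughness)
    return samples[: int(len(samples) * keep)]

app = sample_app(n_classes=5, n_samples=1000)
app_oq = sample_app_oq(n_classes=5, n_samples=1000)
print(len(app), len(app_oq))
```

To replicate the published samples exactly, use the provided index files rather than resampling; this sketch only shows the idea behind the protocols.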

Usage

You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.

Data Extraction: In your terminal, you can call either

make

(recommended), or

julia --project="." --eval "using Pkg; Pkg.instantiate()"
julia --project="." extract-oq.jl

Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.
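Once a file has been extracted, the quantity of interest in quantification, the distribution of class labels, can be read off the first column. A minimal sketch, assuming only the "class_label" column named above; the feature column and the rows are invented for illustration:

```python
import csv
import io
from collections import Counter

# Hypothetical miniature file; only "class_label" comes from the description.
sample_csv = io.StringIO(
    "class_label,feature_1\n"
    "1,0.3\n"
    "2,0.7\n"
    "2,0.1\n"
    "3,0.9\n"
)

def class_prevalences(handle):
    """Return the relative frequency of each ordinal class label."""
    reader = csv.DictReader(handle)
    counts = Counter(int(row["class_label"]) for row in reader)
    total = sum(counts.values())
    return {label: counts[label] / total for label in sorted(counts)}

prevalences = class_prevalences(sample_csv)
print(prevalences)
```

For a real file, replace the in-memory sample with open(path, newline="").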

Further Reading

Implementation of our experiments: https://github.com/mirkobunse/regularized-oq
Reference count CSV dataset of all bibliographic resources in OpenCitations...
figshare.com
zip
Updated Dec 11, 2023
Cite
OpenCitations (2023). Reference count CSV dataset of all bibliographic resources in OpenCitations Index [Dataset]. http://doi.org/10.6084/m9.figshare.24747498.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.24747498.v1
Dataset updated
Dec 11, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
OpenCitations
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
A CSV dataset containing the number of references of each bibliographic entity identified by an OMID in the OpenCitations Index (https://opencitations.net/index). The dataset is based on the last release of the OpenCitations Index (https://opencitations.net/download) – November 2023. The size of the zipped archive is 0.35 GB, while the size of the unzipped CSV file is 1.7 GB. The CSV dataset contains the reference count of 71,805,806 bibliographic entities. The first column (omid) lists the entities, while the second column (references) indicates the corresponding number of outgoing references.
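Because the unzipped file is about 1.7 GB, it is usually more practical to stream it row by row than to load it whole. The sketch below assumes the two column names given above (omid, references); the sample OMID values are made up for illustration.

```python
import csv
import io

def summarize_reference_counts(handle):
    """Stream rows one at a time so the full file never sits in memory."""
    n_entities = 0
    total_refs = 0
    max_row = None
    for row in csv.DictReader(handle):
        count = int(row["references"])
        n_entities += 1
        total_refs += count
        if max_row is None or count > max_row[1]:
            max_row = (row["omid"], count)
    return n_entities, total_refs, max_row

# Hypothetical rows; the identifiers are placeholders, not real OMIDs.
demo = io.StringIO(
    "omid,references\n"
    "omid:br/0001,12\n"
    "omid:br/0002,0\n"
    "omid:br/0003,45\n"
)
n_entities, total_refs, max_row = summarize_reference_counts(demo)
print(n_entities, total_refs, max_row)
```

With the real file, pass open(path, newline="", encoding="utf-8") instead of the in-memory sample.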
Gravity Data for Island of Hawai`i.csv
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Gravity Data for Island of Hawai`i.csv [Dataset]. https://catalog.data.gov/dataset/gravity-data-for-island-of-hawaii-csv
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Area covered
Island of Hawai'i, Hawaii
Description
This data set includes gravity measurements for the Island of Hawai`i collected as the source data for "Deep magmatic structures of Hawaiian volcanoes, imaged by three-dimensional gravity models" (Kauahikaua, Hildenbrand, and Webring, 2000). Data for 3,611 observations are stored as a single table and disseminated in .CSV format. Each observation record includes values for field station ID, latitude and longitude (in both Old Hawaiian and WGS84 projections), elevation, and Observed Gravity value. See associated publication for reduction and interpretation of these data.
MDCOVID19 TotalNumberReleasedFromIsolation
data-maryland.opendata.arcgis.com
data.imap.maryland.gov
Updated May 22, 2020
Cite
ArcGIS Online for Maryland (2020). MDCOVID19 TotalNumberReleasedFromIsolation [Dataset]. https://data-maryland.opendata.arcgis.com/datasets/maryland::mdcovid19-totalnumberreleasedfromisolation/about
Explore at:
Dataset updated
May 22, 2020
Dataset provided by
Authors
ArcGIS Online for Maryland
Description
Summary: The cumulative number of COVID-19 positive Maryland residents who have been released from home isolation.

Description: The MD COVID-19 - Total Number Released from Isolation data layer is a collection of the statewide cumulative total of individuals who tested positive for COVID-19 that have been reported each day by each local health department via the ESSENCE system as having been released from home isolation. As "recovery" can mean different things as people experience COVID-19 disease to varying degrees of severity, MDH reports on individuals released from isolation. "Released from isolation" refers to those who have met criteria and are well enough to be released from home isolation. Some of these individuals may have been hospitalized at some point.

COVID-19 is a disease caused by a respiratory virus first identified in Wuhan, Hubei Province, China in December 2019. COVID-19 is a new virus that hasn't caused illness in humans before. Worldwide, COVID-19 has resulted in thousands of infections, causing illness and in some cases death. Cases have spread to countries throughout the world, with more cases reported daily. The Maryland Department of Health reports daily on COVID-19 cases by county.
Annotated Benchmark of Real-World Data for Approximate Functional Dependency...
zenodo.org
data.niaid.nih.gov
csv
Updated Jul 1, 2023
Cite
Marcel Parciak; Sebastiaan Weytjens; Frank Neven; Niel Hens; Liesbet M. Peeters; Stijn Vansummeren (2023). Annotated Benchmark of Real-World Data for Approximate Functional Dependency Discovery [Dataset]. http://doi.org/10.5281/zenodo.8098909
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8098909
Dataset updated
Jul 1, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marcel Parciak; Sebastiaan Weytjens; Frank Neven; Niel Hens; Liesbet M. Peeters; Stijn Vansummeren
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Annotated Benchmark of Real-World Data for Approximate Functional Dependency Discovery

This collection consists of ten open access relations commonly used by the data management community. In addition to the relations themselves (please take note of the references to the original sources below), we added three lists in this collection that describe approximate functional dependencies found in the relations. These lists are the result of a manual annotation process performed by two independent individuals by consulting the respective schemas of the relations and identifying column combinations where one column implies another based on its semantics. As an example, in the claims.csv file, the AirportCode implies AirportName, as each code should be unique for a given airport.

The file ground_truth.csv is a comma-separated file containing approximate functional dependencies. The column "table" names the relation we refer to; "lhs" and "rhs" reference two columns of that relation for which we found that, semantically, lhs implies rhs.

The files excluded_candidates.csv and included_candidates.csv list all column combinations that were excluded or included in the manual annotation, respectively. We excluded a candidate if there was no tuple where both attributes had a value or if the g3_prime value was too small.
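To make the notion of an approximate functional dependency concrete, the sketch below computes the classic g3 error for a candidate lhs -> rhs: the fraction of rows that would have to be removed for the dependency to hold exactly. This is an illustration only. The collection's g3_prime value is not defined in this description and may be computed differently, and the sample rows are invented.

```python
from collections import Counter, defaultdict

def g3_error(rows, lhs, rhs):
    """Fraction of rows that must be removed for lhs -> rhs to hold exactly."""
    groups = defaultdict(Counter)
    n = 0
    for row in rows:
        # Skip tuples where either attribute has no value.
        if row.get(lhs) in (None, "") or row.get(rhs) in (None, ""):
            continue
        groups[row[lhs]][row[rhs]] += 1
        n += 1
    if n == 0:
        return None  # no tuple has both values
    # For each lhs value, keep only the rows with its most frequent rhs value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return (n - kept) / n

# Hypothetical rows in the spirit of claims.csv.
rows = [
    {"AirportCode": "JFK", "AirportName": "John F. Kennedy International"},
    {"AirportCode": "JFK", "AirportName": "John F. Kennedy International"},
    {"AirportCode": "JFK", "AirportName": "JFK Intl"},
    {"AirportCode": "LAX", "AirportName": "Los Angeles International"},
]
error = g3_error(rows, "AirportCode", "AirportName")
print(error)
```

An error of 0 means the dependency holds exactly; small positive values indicate a dependency that holds apart from a few inconsistent rows.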

Dataset References

adult.csv: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

claims.csv: TSA Claims Data 2002 to 2006, published by the U.S. Department of Homeland Security.

dblp10k.csv: Frequency-aware Similarity Measures. Lange, Dustin; Naumann, Felix (2011). 243–248. Made available as DBLP Dataset 2.

hospital.csv: Hospital dataset used in Johann Birnick, Thomas Bläsius, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, and Martin Schirneck. 2020. Hitting set enumeration with partial information for unique column combination discovery. Proc. VLDB Endow. 13, 12 (August 2020), 2270–2283. https://doi.org/10.14778/3407790.3407824. Made available as part of the dataset collection for that paper.

t_biocase_... files: t_bioc_... files used in Johann Birnick, Thomas Bläsius, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, and Martin Schirneck. 2020. Hitting set enumeration with partial information for unique column combination discovery. Proc. VLDB Endow. 13, 12 (August 2020), 2270–2283. https://doi.org/10.14778/3407790.3407824. Made available as part of the dataset collection for that paper.

tax.csv: Tax dataset used in Johann Birnick, Thomas Bläsius, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, and Martin Schirneck. 2020. Hitting set enumeration with partial information for unique column combination discovery. Proc. VLDB Endow. 13, 12 (August 2020), 2270–2283. https://doi.org/10.14778/3407790.3407824. Made available as part of the dataset collection for that paper.
HR Dataset.csv
kaggle.com
Updated Mar 8, 2024
Cite
Fahad Rehman (2024). HR Dataset.csv [Dataset]. https://www.kaggle.com/datasets/fahadrehman07/hr-comma-sep-csv
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 8, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Fahad Rehman
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
🟡Please Upvote my dataset If you like It.✨

This dataset contains valuable employee information over time that can be analyzed to help optimize key HR functions. Some potential use cases include:

Attrition analysis: Identify factors correlated with attrition like department, role, salary, etc. Segment high-risk employees. Predict future attrition.

Performance management: Analyze the relationship between metrics like ratings and salary increments. Recommend performance improvement programs.

Workforce planning: Forecast staffing needs based on historical hiring/turnover trends. Determine optimal recruitment strategies.

Compensation analysis: Benchmark salaries vs performance, and experience. Identify pay inequities. Inform compensation policies.

Diversity monitoring: Assess diversity metrics like gender ratio over roles, and departments. Identify underrepresented groups.

Succession planning: Identify high-potential candidates and critical roles. Predict internal promotions/replacements in advance.

Given its longitudinal employee data and multiple variables, this dataset provides rich opportunities for exploration, predictive modeling, and actionable insights. With a large sample size, it can uncover subtle patterns. Cleaning, joining with other contextual data sources can yield even deeper insights. This makes it a valuable starting point for many organizational studies and evidence-based decision-making.


This dataset contains information about different attributes of employees from a company. It includes 1000 employee records and 12 feature columns.

The columns are:

satisfaction_level: Employee satisfaction score (1-5 scale)

last_evaluation: Score on last evaluation (1-5 scale)

number_project: Number of projects employee worked on

average_monthly_hours: Average hours worked in a month

time_spend_company: Number of years spent with the company

work_accident: If an employee had a workplace accident (yes/no)

left: If an employee has left the company (yes/no)

promotion_last_5years: Number of promotions in last 5 years

Department: Department of the employee

Salary: Annual salary of employee
Drug consumption database: original.csv
figshare.le.ac.uk
txt
Updated May 30, 2023
Cite
Elaine Fehrman; Vincent Egan; Evgeny Mirkes (2023). Drug consumption database: original.csv [Dataset]. http://doi.org/10.25392/leicester.data.7588415.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.25392/leicester.data.7588415.v1
Dataset updated
May 30, 2023
Dataset provided by
University of Leicester
Authors
Elaine Fehrman; Vincent Egan; Evgeny Mirkes
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Drug consumption database with the original values of the attributes. DescriptionDB.pdf contains a detailed description of the database.
Data pipeline Validation And Load Testing using Multiple CSV Files
data.niaid.nih.gov
explore.openaire.eu
Updated Mar 26, 2021
Cite
Pelle Jakovits (2021). Data pipeline Validation And Load Testing using Multiple CSV Files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4636797
Explore at:
Dataset updated
Mar 26, 2021
Dataset provided by
Pelle Jakovits
Mainak Adhikari
Afsana Khan
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset has a CSV file that contains around 32000 Twitter tweets. From this single CSV file, 100 CSV files were created, each containing 320 tweets. Those 100 CSV files are used to validate and test (performance/load testing) the data pipeline components.
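The split described above (one file of about 32000 rows into 100 files of 320 rows each) can be reproduced along these lines. This is a sketch under assumptions: the column names and the generated rows are placeholders, and repeating the header in every part is a design choice made here, not something the description confirms.

```python
import csv
import io

def split_rows(rows, chunk_size):
    """Yield consecutive lists of at most chunk_size rows."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def split_csv(handle, chunk_size):
    """Return one CSV text per chunk, repeating the header in each."""
    reader = csv.reader(handle)
    header = next(reader)
    parts = []
    for chunk in split_rows(reader, chunk_size):
        out = io.StringIO()
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(chunk)
        parts.append(out.getvalue())
    return parts

# Hypothetical input: 32000 placeholder rows standing in for the tweets.
source = io.StringIO()
source_writer = csv.writer(source)
source_writer.writerow(["id", "text"])
source_writer.writerows([i, f"tweet {i}"] for i in range(32000))
source.seek(0)

parts = split_csv(source, chunk_size=320)
print(len(parts))
```

Writing each element of parts to its own file then yields the set of small files used for load testing.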
MDCOVID19 ConfirmedDeathsByRaceAndEthnicityDistribution
hub.arcgis.com
data.imap.maryland.gov
Updated May 22, 2020
Cite
ArcGIS Online for Maryland (2020). MDCOVID19 ConfirmedDeathsByRaceAndEthnicityDistribution [Dataset]. https://hub.arcgis.com/datasets/312715a843064ef18879eb726f64c63a
Explore at:
Dataset updated
May 22, 2020
Dataset provided by
Authors
ArcGIS Online for Maryland
Description
Summary: The cumulative number of confirmed COVID-19-related deaths among Maryland residents by race and ethnicity: African American; White; Hispanic; Asian; Other; Unknown.

Description: The MD COVID-19 - Confirmed Deaths by Race and Ethnicity Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by categories of race and ethnicity. A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Probable deaths are available from the MD COVID-19 - Probable Deaths by Race and Ethnicity Distribution data layer.

COVID-19 is a disease caused by a respiratory virus first identified in Wuhan, Hubei Province, China in December 2019. COVID-19 is a new virus that hasn't caused illness in humans before. Worldwide, COVID-19 has resulted in thousands of infections, causing illness and in some cases death. Cases have spread to countries throughout the world, with more cases reported daily. The Maryland Department of Health reports daily on COVID-19 cases by county.
Trustpilot reviews data in CSV format
crawlfeeds.com
csv, zip
Updated May 8, 2025
Cite
Crawl Feeds (2025). Trustpilot reviews data in CSV format [Dataset]. https://crawlfeeds.com/datasets/trustpilot-reviews-data-in-csv-format
Explore at:
Available download formats: zip, csv
Dataset updated
May 8, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Access our Trustpilot Reviews Data in CSV Format, offering a comprehensive collection of customer reviews from Trustpilot.

This dataset includes detailed reviews, ratings, and feedback across various industries and businesses. Available in a convenient CSV format, it is ideal for market research, sentiment analysis, and competitive benchmarking.

Leverage this data to gain insights into customer satisfaction, identify trends, and enhance your business strategies. Whether you're analyzing consumer sentiment or conducting competitive analysis, this dataset provides valuable information to support your needs.
CSV file used in statistical analyses
data.gov.au
researchdata.edu.au
csv
Updated Dec 10, 2014
Cite
The Commonwealth Scientific and Industrial Research Organisation (2014). CSV file used in statistical analyses [Dataset]. https://data.gov.au/dataset/ds-dap-csiro%3A10809
Explore at:
Available download formats: csv
Dataset updated
Dec 10, 2014
Dataset provided by
CSIRO (http://www.csiro.au/)
Description
A csv file containing the tidal frequencies used for statistical analyses in the paper "Estimating Freshwater Flows From Tidally-Affected Hydrographic Data" by Dan Pagendam and Don Percival. The metadata and files (if any) are available to the public.
Cult Beauty Products with Ingredients Dataset (CSV)
crawlfeeds.com
csv, zip
Updated Jun 30, 2025
Cite
Crawl Feeds (2025). Cult Beauty Products with Ingredients Dataset (CSV) [Dataset]. https://crawlfeeds.com/datasets/cult-beauty-products-with-ingredients-dataset-csv
Explore at:
Available download formats: csv, zip
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
This curated dataset contains only products from CultBeauty.com that include detailed ingredient information, ideal for brands, formulators, analysts, and researchers seeking transparency in cosmetics and skincare data.

It focuses on ingredient-rich listings — allowing deep analysis of formulation trends, compliance mapping, and clean beauty initiatives. Whether you're building an internal database or powering an AI model, this dataset offers a clean, structured foundation for insight.

What’s Included:

Product Name

Brand

Full Ingredient List

Category

Product URL

Price (if available)

Description

Image links

Timestamps

Use Cases:

Ingredient analysis for clean beauty scoring

Competitor formulation comparison

Cosmetic safety mapping (e.g., for allergen research)

Building training sets for AI/ML models in skincare

Trend monitoring across skincare and cosmetic products

Update Frequency:

Monthly or on demand
EPA FRS Facilities Single File CSV Download for the State of Arkansas
catalog.data.gov
Updated Nov 29, 2020
Cite
U.S. EPA Office of Environmental Information (OEI) - Office of Information Collection (OIC) (2020). EPA FRS Facilities Single File CSV Download for the State of Arkansas [Dataset]. https://catalog.data.gov/dataset/epa-frs-facilities-single-file-csv-download-for-the-state-of-arkansas
Explore at:
Dataset updated
Nov 29, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Area covered
Arkansas
Description
The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using vigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.
Salary data csv
kaggle.com
Updated Jun 13, 2023
Cite
SHAMANTH (2023). Salary data csv [Dataset]. https://www.kaggle.com/datasets/sham04/salary-data-csv
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 13, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
SHAMANTH
Description
Dataset

This dataset was created by SHAMANTH

Contents
Gender statistics from World Bank - main CSV file only
figshare.com
txt
Updated May 30, 2023
Cite
Matthew Brett (2023). Gender statistics from World Bank - main CSV file only [Dataset]. http://doi.org/10.6084/m9.figshare.9904934.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.9904934.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
Matthew Brett
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Main CSV file extracted from zip file download of World Bank gender statistics file. Copy of data as of 25th September 2019.
MDCOVID19 ProbableDeathsByAgeDistribution
arc-gis-hub-home-arcgishub.hub.arcgis.com
data.imap.maryland.gov
Updated May 22, 2020
Cite
ArcGIS Online for Maryland (2020). MDCOVID19 ProbableDeathsByAgeDistribution [Dataset]. https://arc-gis-hub-home-arcgishub.hub.arcgis.com/maps/maryland::mdcovid19-probabledeathsbyagedistribution
Explore at:
Dataset updated
May 22, 2020
Dataset provided by
Authors
ArcGIS Online for Maryland
Description
Summary: The cumulative number of probable COVID-19 deaths among Maryland residents by age: 0-9; 10-19; 20-29; 30-39; 40-49; 50-59; 60-69; 70-79; 80+; Unknown.

Description: The MD COVID-19 - Probable Deaths by Age Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by designated age ranges. A death is classified as probable if the person's death certificate notes COVID-19 to be a probable, suspect or presumed cause or condition. Probable deaths have not yet been confirmed by a laboratory test. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Confirmed deaths are available from the MD COVID-19 - Confirmed Deaths by Age Distribution data layer.

COVID-19 is a disease caused by a respiratory virus first identified in Wuhan, Hubei Province, China in December 2019. COVID-19 is a new virus that hasn't caused illness in humans before. Worldwide, COVID-19 has resulted in thousands of infections, causing illness and in some cases death. Cases have spread to countries throughout the world, with more cases reported daily. The Maryland Department of Health reports daily on COVID-19 cases by county.
Data from: Data on the Construction Processes of Regression Models
jstagedata.jst.go.jp
jpeg
Updated Jul 27, 2023
Cite
Taichi Kimura; Riko Iwamoto; Mikio Yoshida; Tatsuya Takahashi; Shuji Sasabe; Yoshiyuki Shirakawa (2023). Data on the Construction Processes of Regression Models [Dataset]. http://doi.org/10.50931/data.kona.22180318.v2
Explore at:
Available download formats: jpeg
Unique identifier
https://doi.org/10.50931/data.kona.22180318.v2
Dataset updated
Jul 27, 2023
Dataset provided by
Hosokawa Powder Technology Foundation
Authors
Taichi Kimura; Riko Iwamoto; Mikio Yoshida; Tatsuya Takahashi; Shuji Sasabe; Yoshiyuki Shirakawa
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This CSV dataset (files numbered 1–8) documents the construction of regression models using machine learning methods; the data are used to plot Figs. 2–7. The CSV file 1.LSM_R^2 (plotting Fig. 2) shows the relationship between estimated values and actual values when the least-squares method was used to construct the model. In the CSV file 2.PCR_R^2 (plotting Fig. 3), the number of principal components was varied from 1 to 5 while constructing a model by principal component regression. The data in the CSV file 3.SVR_R^2 (plotting Fig. 4) are the result of model construction by support vector regression. The hyperparameters were chosen by searching all combinations of the listed candidates for the maximum R^2 value. When a deep neural network was applied to the construction of a regression model, N_Neur., N_H.L. and N_L.T. were varied. The CSV file 4.DNN_HL (plotting Fig. 5a) shows the changes in the relationship between estimated values and actual values at each N_H.L. Similarly, the CSV files 5.DNN_ Neur (plotting Fig. 5b) and 6.DNN_LT (plotting Fig. 5c) show the changes in this relationship when N_Neur. or N_L.T. was varied. The data in the CSV file 7.DNN_R^2 (plotting Fig. 6) are the result obtained with the optimal N_Neur., N_H.L. and N_L.T. In the CSV file 8.R^2 (plotting Fig. 7), the validity of each machine learning method is compared by showing the optimal results for each method.

Experimental conditions

Supply volume of the raw material: 25–125 mL

Addition rate of TiO2: 5.0–15.0 wt%

Operation time: 1–15 min

Rotation speed: 2,200–5,700 min^-1

Temperature: 295–319 K

Nomenclature

N_Neur.: the number of neurons

N_H.L.: the number of hidden layers

N_L.T.: the number of learning times
