MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Data types
sequence: 30 datapoints
structure: 30 datapoints
Conversion report
Over a total of 30 datapoints, there are:
OUTPUT
ALL: 30 valid datapoints
INCLUDED: 6 duplicate sequences with different structure / dms / shape
MODIFIED
0 multiple sequences with the same reference (renamed reference)
FILTERED OUT
0 invalid datapoints (ex: sequence with non-regular characters)
0 datapoints with bad structures
0 duplicate sequences with… See the full description on the dataset page: https://huggingface.co/datasets/rouskinlab/lncRNA.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A new group contribution (GC) quantitative structure-property relationship (QSPR) for estimating the density (ρ) of pure ionic liquids (ILs) as a function of temperature (T) and pressure (p) is developed on the basis of the most comprehensive collection of volumetric data reported so far (in total 41,250 data points, deposited for 2267 ILs from diverse chemical families). The model was established on a carefully revised, evaluated, and reduced data set, whereas the adopted GC methodology follows the approach proposed previously [Ind. Eng. Chem. Res. 2012, 51, 591−604]. However, a novel approach is proposed to model both the temperature and pressure dependence. The idea consists of an independent representation of the reference density ρ0 at T0 = 298.15 K and p0 = 0.1 MPa and a dimensionless correction f(T, p) = ρ(T, p)/ρ0 for other conditions of temperature and pressure. Three common machine learning algorithms are employed to represent the quantitative structure-property relationship between the studied property end points, GCs, T, and p, namely, multiple linear regression, feed-forward artificial neural network, and least-squares support vector machine. On the basis of a detailed statistical analysis of the resulting models, including both internal and external stability checks by means of common statistical procedures such as cross-validation, y-scrambling, and "hold-out" testing, the final model is selected and recommended. The impact of the type of cation and anion on the accuracy of the calculations is highlighted and discussed. The performance of the new model is finally demonstrated by comparing it with similar methods published recently in the literature.
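In compact form, the temperature and pressure dependence described above can be written as follows (a restatement of the relation in the abstract, with p_0 denoting the reference pressure):

\rho(T, p) = \rho_0 \cdot f(T, p), \qquad \rho_0 = \rho(T_0, p_0), \qquad T_0 = 298.15\,\mathrm{K}, \quad p_0 = 0.1\,\mathrm{MPa}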
https://creativecommons.org/publicdomain/zero/1.0/
I read an article yesterday that got my mind storming. An article by the World Bank from August 15th, 2022 explains it better; it is quoted below.
I already have a project I have been working on since Feb 2021 that tries to solve this problem; it is listed in my datasets.
This dataset showcases statistics over the past 6-7 decades, covering the production of 150+ unique crops, 50+ livestock elements, land distribution by usage, and population. Aspiring data scientists can try to extract insights that incentivize the optimal use and distribution of natural resources.
Record high food prices have triggered a global crisis that will drive millions more into extreme poverty, magnifying hunger and malnutrition, while threatening to erase hard-won gains in development. The war in Ukraine, supply chain disruptions, and the continued economic fallout of the COVID-19 pandemic are reversing years of development gains and pushing food prices to all-time highs. Rising food prices have a greater impact on people in low- and middle-income countries, since they spend a larger share of their income on food than people in high-income countries. This brief looks at rising food insecurity and World Bank responses to date.
Seeds are a key pathway for plant population recovery following disturbance. To prevent germination during unsuitable conditions, most species produce dormant seeds. In fire-prone regions, physical dormancy (PY) enables seeds to germinate after fire. The thermal niche, incorporating seed dormancy and mortality temperature responses, has not been characterised for PY seeds from fire-prone environments. We aimed to assess variation in thermal thresholds between species with PY seeds and whether the pyro-thermal niche is aligned with seed mass, ecosystem type or phylogenetic relatedness. We collected post heat-shock germination data for 58 Australian species that produce PY seeds. We applied species-specific thermal performance curves to define three critical thresholds (DRT50, dormancy release temperature; Topt, optimum dormancy release temperature; and LT50, lethal temperature), which define the pyro-thermal niche. Each species was assigned a mean seed weight and ecosystem type. We constructed a p...

Species selection and data acquisition. We set out to acquire seed germination data following heat-shock for as many species as possible across temperate Australia. However, to provide accurate estimates of threshold conditions, we applied three rules that allowed data from multiple sources to be brought together: 1) seeds needed to be heated from ~40 °C through to at least 120 °C (or higher if there was no evidence of seed mortality at 120 °C), 2) seeds had to be heated for either 5 min and/or 10 min, and 3) there were more than two non-zero germination data points within the treatments (e.g., seeds treated at 40 °C, 60 °C, 80 °C, 100 °C and 120 °C, but only recording ~0% germination at 80 and 100 °C, would be discarded). Where there were insufficient data points to model the response, these datasets were removed, as fitting response curves to two data points induces significant error into hypothetical responses. However, this was a rare occurrence, and only appeared once across all s...

# Data from: Defining the pyro-thermal niche: do seed traits, ecosystem type and phylogeny influence thermal thresholds in seeds with physical dormancy
https://doi.org/10.5061/dryad.j9kd51cm3
This dataset provides the full data requirements to implement the "pyrothermal niche" markdown file in R. It also includes model fitting figures for each of the species used in the analysis.
Description of the data and file structure
The script provided calls upon each of the included .csv files and the phylogenetic tree. Most of the csv files are subsets of the two main datasets, the germination data and master data.csv. The figures included here are directly associated with the species-specific model selection undertaken in the manuscript. Each figure indicates the fit of the 7 models used and includes suitability metrics for each model. For furt...
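As an illustration only (the repository's own analysis is the R markdown file mentioned above, and the 7 candidate models are not listed here), a hedged Python sketch of the general idea: fit a thermal performance curve to heat-shock germination data and read a threshold temperature from it. The functional form, file name, and column names are assumptions, not the manuscript's models:

import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

# Hypothetical file and column names; the repository's germination CSVs differ.
df = pd.read_csv("germination_data.csv")
temp = df["treatment_temperature_c"].to_numpy(dtype=float)
germ = df["germination_proportion"].to_numpy(dtype=float)

def thermal_performance(t, peak, t_opt, width):
    # A simple Gaussian-shaped performance curve; an assumption for illustration,
    # not one of the 7 models used in the manuscript.
    return peak * np.exp(-((t - t_opt) ** 2) / (2 * width ** 2))

params, _ = curve_fit(thermal_performance, temp, germ, p0=[1.0, 80.0, 20.0])
peak, t_opt, width = params

# Under this fitted curve, Topt is the temperature of maximum dormancy release.
print(f"Topt ~ {t_opt:.1f} C, peak germination ~ {peak:.2f}")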
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Repository with supplemental data to:
Characterization of alpha and beta interactions in liquid xenon. Jörg, F., Cichon, D., Eurin, G. et al. Eur. Phys. J. C 82, 361 (2022) 10.1140/epjc/s10052-022-10259-3
A pre-print of the article is available on arXiv: 2109.13735
Note: When re-using the data, please make sure to cite the article (and not only the dataset)
The files contain the measured data points (as well as their statistical and systematic uncertainties) as shown in the publication.
All datasets are stored in the .csv format.
Minimum working example to plot the drift velocity using the 83mKr data:
import numpy as np
import matplotlib.pyplot as plt

# load the data set
data = np.loadtxt("20220427_drift_velocity_hexe_kr83m.csv", delimiter=",")

# Plot the systematic uncertainty on the drift field
plt.errorbar(data[:,0], data[:,2], xerr=data[:,1], fmt="o", capsize=2, ecolor="darkgray",
             alpha=0.7, elinewidth=3, color="black")

# Plot the actual data points
plt.errorbar(data[:,0], data[:,2], yerr=data[:,3], fmt="o", color="black")

# Label the axes and define the range
plt.ylabel("Drift Velocity [mm/µs]")
plt.xlabel("Drift Field [kV/cm]")
plt.xscale("log")
plt.xlim(0.006, 2)
plt.ylim(0, 2.4)
plt.show()
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Preliminary Dataset: the article has not yet been peer-reviewed. A pre-print is available on arXiv: 2109.13735
Repository with supplemental data to: Characterization of alpha and electron interactions in liquid xenon
Note: When re-using the data, please make sure to cite the article (and not only the dataset)
The files contain the measured data points (as well as their statistical and systematic uncertainties) as shown in the publication. All datasets are stored in the .csv format.
20210924_yields_hexe_kr83m.csv This file contains the normalized light and charge yields as a function of the applied field from the measurement with the 83mKr source. The data is shown in Figure 16 (dots) of the publication.
20210924_yields_hexe_rn222.csv This file contains the normalized light and charge yields as a function of the applied field from the measurement with the 222Rn source. The data is shown in Figure 17 (blue-ish points) of the publication.
20210924_drift_velocity_hexe_rn222.csv This file contains the measured electron drift velocity in liquid xenon at a temperature of 174.4 K as a function of the field. The data was acquired using the 222Rn source. The drift velocity is given in units of mm/µs, and the datapoints are shown in Figure 19 (black dots) of the publication.
20210924_drift_velocity_hexe_kr83m.csv This file contains the measured electron drift velocity in liquid xenon at a temperature of 174.4 K as a function of the field. The data was acquired using the 83mKr source. The drift velocity is given in units of mm/µs; these points are not displayed in the publication for visibility reasons.
Minimum working example to plot the drift velocity using the 83mKr data:
import numpy as np
import matplotlib.pyplot as plt

# load the data set
data = np.loadtxt("20210924_drift_velocity_hexe_kr83m.csv", delimiter=",")

# Plot the systematic uncertainty on the drift field
plt.errorbar(data[:,0], data[:,2], xerr=data[:,1], fmt="o", capsize=2, ecolor="darkgray",
             alpha=0.7, elinewidth=3, color="black")

# Plot the actual data points
plt.errorbar(data[:,0], data[:,2], yerr=data[:,3], fmt="o", color="black")

# Label the axes and define the range
plt.ylabel("Drift Velocity [mm/µs]")
plt.xlabel("Drift Field [kV/cm]")
plt.xscale("log")
plt.xlim(0.006, 2)
plt.ylim(0, 2.4)
plt.show()
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data contained in the zip files constitute the main research data of the publication entitled "Retarded room temperature Hamaker coefficients between bulk elemental metals". They are provided in .txt file format.
"Identical metals in vacuum.zip" contains the room temperature Hamaker coefficients between 26 identical elemental polycrystalline metals that are embedded in vacuum computed from the full Lifshitz theory as a function of the separation of the metallic semi-spaces within 0-200nm. The employed discretization scheme is the following: for l = 0 − 1 nm, (\Delta{l}=0.1\,nm) which corresponds to 11 data points, for l = 1−200 nm: (\Delta{l}=1\,nm) which corresponds to 200 data points. The computation of the imaginary argument dielectric function of metals is based on the full spectral method combined with a Drude model low frequency extrapolation technique which has been implemented with input from extended-in-frequency dielectric data that range from the far infra-red region to the soft X-ray region of the electromagnetic spectrum.
"Identical metals in water (Fiedler et al).zip" contains the room temperature Hamaker coefficients between 26 identical elemental polycrystalline metals that are embedded in pure water computed from the full Lifshitz theory as a function of the separation of the metallic semi-spaces within 0-200nm. The employed discretization scheme is the following: for l = 0 − 1 nm, (\Delta{l}=0.1\,nm) which corresponds to 11 data points, for l = 1−200 nm: (\Delta{l}=1\,nm) which corresponds to 200 data points. The computation of the imaginary argument dielectric function of metals is based on the full spectral method combined with a Drude model low frequency extrapolation technique which has been implemented with input from extended-in-frequency dielectric data that range from the far infra-red region to the soft X-ray region of the electromagnetic spectrum. The computation of the imaginary argument dielectric function of pure water is based on the simple spectral method which has been implemented with input from the Fiedler et al. dielectric parameterization.
"Identical metals in water (Parsegian-Weiss).zip" contains the room temperature Hamaker coefficients between 26 identical elemental polycrystalline metals that are embedded in pure water computed from the full Lifshitz theory as a function of the separation of the metallic semi-spaces within 0-200nm. The employed discretization scheme is the following: for l = 0 − 1 nm, (\Delta{l}=0.1\,nm) which corresponds to 11 data points, for l = 1−200 nm: (\Delta{l}=1\,nm) which corresponds to 200 data points. The computation of the imaginary argument dielectric function of metals is based on the full spectral method combined with a Drude model low frequency extrapolation technique which has been implemented with input from extended-in-frequency dielectric data that range from the far infra-red region to the soft X-ray region of the electromagnetic spectrum. The computation of the imaginary argument dielectric function of pure water is based on the simple spectral method which has been implemented with input from the Parsegian-Weiss dielectric parameterization.
"Identical metals in water (Roth-Lenhoff).zip" contains the room temperature Hamaker coefficients between 26 identical elemental polycrystalline metals that are embedded in pure water computed from the full Lifshitz theory as a function of the separation of the metallic semi-spaces within 0-200nm. The employed discretization scheme is the following: for l = 0 − 1 nm, (\Delta{l}=0.1\,nm) which corresponds to 11 data points, for l = 1−200 nm: (\Delta{l}=1\,nm) which corresponds to 200 data points. The computation of the imaginary argument dielectric function of metals is based on the full spectral method combined with a Drude model low frequency extrapolation technique which has been implemented with input from extended-in-frequency dielectric data that range from the far infra-red region to the soft X-ray region of the electromagnetic spectrum. The computation of the imaginary argument dielectric function of pure water is based on the simple spectral method which has been implemented with input from the Roth-Lenhoff dielectric parameterization.
All Hamaker coefficients are given in zJ and all separations are given in nm.
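A minimal loading sketch, assuming each extracted .txt file contains two whitespace-separated columns (separation in nm, Hamaker coefficient in zJ) as described above; the file name used here is hypothetical:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical file name taken from one of the extracted zip archives;
# adjust to the actual per-metal file names inside the archives.
separation_nm, hamaker_zJ = np.loadtxt("Au_in_vacuum.txt", unpack=True)

# 11 points for l = 0-1 nm (step 0.1 nm) plus 200 points for l = 1-200 nm (step 1 nm)
print(f"{len(separation_nm)} data points loaded")

plt.plot(separation_nm, hamaker_zJ, marker="o", linestyle="-")
plt.xlabel("Separation l [nm]")
plt.ylabel("Hamaker coefficient [zJ]")
plt.xscale("log")
plt.show()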
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The uploaded data set was generated by the fuzzy model published in T. Galli, F. Chiclana, and F. Siewe. Genetic algorithm-based fuzzy inference system for describing execution tracing quality. Mathematics, 9(21), 2021. ISSN 2227-7390. doi: https://doi.org/10.3390/math9212822. URL https://www.mdpi.com/2571-5577/4/1/20.
The goal of the data generation is to make the published model available in the form of data points in a 5D space, which facilitates the construction of simpler models to approximate the original model. The names of the columns in the .csv file correspond to the quality properties of execution tracing: (1) accuracy, (2) legibility, (3) implementation, and (4) security, while column (5) contains the execution tracing quality derived from the fuzzy model. The indices in brackets give the column indices in the .csv file.
All variables lie in the continuous range [0, 100], where 100 means the best possible quality and 0 the complete lack of quality or of the given quality property. While generating the data, each of the 4 inputs was increased from 0 to 100 inclusive with a step size of 5, i.e. 21 values per input, and the model's output was collected for every combination (21^4 = 194,481 data points).
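A minimal sketch of how such a grid could be enumerated, assuming a callable fuzzy_model(accuracy, legibility, implementation, security) stands in for the published fuzzy inference system (the function body and the output file name are hypothetical placeholders):

import itertools
import csv

def fuzzy_model(accuracy, legibility, implementation, security):
    # Placeholder for the published fuzzy inference system;
    # it should return an execution tracing quality value in [0, 100].
    return (accuracy + legibility + implementation + security) / 4.0

values = range(0, 101, 5)  # 21 values per input: 0, 5, ..., 100

with open("execution_tracing_quality_grid.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["accuracy", "legibility", "implementation", "security", "quality"])
    for a, l, i, s in itertools.product(values, repeat=4):  # 21**4 = 194481 rows
        writer.writerow([a, l, i, s, fuzzy_model(a, l, i, s)])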
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database linking crystal structure, Materials Project ID, and experimental enthalpy of hydride formation in various metals/intermetallics. Information on the source of the experimental value is provided, along with the DOI where available. The data is also labeled according to the data_set value, where 1 labels data points used in training, 2 labels data points used for validation, and 3 labels data points used in the test. data_set = 0 marks data points not used in model development. In addition, the model and scaler are provided. More details can be found in K. Batalovic et al., 'Predicting heat of hydride formation by the graph neural network – exploring structure-property relation for metal hydrides'.
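A minimal sketch of splitting the database by the data_set label described above, assuming the records are available as a CSV file (the file name used here is hypothetical):

import pandas as pd

# Hypothetical file name; the column "data_set" follows the labeling described above.
df = pd.read_csv("hydride_formation_database.csv")

train = df[df["data_set"] == 1]       # used in training
validation = df[df["data_set"] == 2]  # used for validation
test = df[df["data_set"] == 3]        # used in the test
unused = df[df["data_set"] == 0]      # not used in model development

print(len(train), len(validation), len(test), len(unused))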
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: To analyze the influence path of the interaction between the unit environment, achievement transformation willingness, and achievement transformation cognition on achievement transformation output, in order to provide a basis for optimizing the achievement transformation environment of medical and health institutions and improving the efficiency of the transformation of scientific and technological achievements.
Methods: Through a questionnaire survey, 292 data points were obtained. SPSS 20.0 was used to conduct cross-table chi-square analysis and binary logistic regression analysis on the willingness, cognition, and output of scientific and technological achievements transformation. The PROCESS 14.0 plug-in was used to analyze the mediating effect of transformation cognition and the moderating effect of the unit environment.
Results: Achievement transformation willingness has a significant positive impact on achievement transformation cognition and achievement transformation output. Achievement transformation cognition has a positive impact on achievement transformation output, and the mediating effect of transformation cognition is significant and partial. The unit environment has a negative moderating effect on the influence of achievement transformation willingness on achievement transformation cognition, and a positive moderating effect on the influence of achievement transformation willingness on achievement transformation output.
Conclusion: Personal factors and the unit's achievement transformation environment have an obvious influence on the willingness and cognition of achievement transformation. It is necessary to optimize the unit environment for the transformation of scientific and technological achievements, improve the policy system through which medical and health personnel benefit from the transformation of scientific and technological achievements, implement the disposal, income, and distribution rights of medical and health institutions over the transformation of scientific and technological achievements, and stimulate the enthusiasm of medical and health personnel for the transformation of scientific and technological achievements.
Mass distributions for selected $D^0 D^0$ candidates with the $D^0$ background subtracted. Uncertainties on the data points are statistical only...
The household incomes chart shows how many households fall into each of the income brackets specified by Statistics Canada.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CESNET-TimeSeries24: The dataset for network traffic forecasting and anomaly detection
The dataset called CESNET-TimeSeries24 was collected by long-term monitoring of selected statistical metrics for 40 weeks for each IP address on the ISP network CESNET3 (Czech Education and Science Network). The dataset encompasses network traffic from more than 275,000 active IP addresses assigned to a wide variety of devices, including office computers, NATs, servers, WiFi routers, honeypots, and video-game consoles found in dormitories. Moreover, the dataset is rich in network anomaly types, since it contains all types of anomalies, ensuring a comprehensive evaluation of anomaly detection methods. Last but not least, the CESNET-TimeSeries24 dataset provides traffic time series on institutional and IP-subnet levels to cover all possible anomaly detection or forecasting scopes. Overall, the time series dataset was created from 66 billion IP flows that contain 4 trillion packets carrying approximately 3.7 petabytes of data. The CESNET-TimeSeries24 dataset is a complex real-world dataset that will bring insights into the evaluation of forecasting models in real-world environments.
Please cite the usage of our dataset as:
Koumar, J., Hynek, K., Čejka, T. et al. CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting. Sci Data 12, 338 (2025). https://doi.org/10.1038/s41597-025-04603-x

@Article{cesnettimeseries24,
  author={Koumar, Josef and Hynek, Karel and {\v{C}}ejka, Tom{\'a}{\v{s}} and {\v{S}}i{\v{s}}ka, Pavel},
  title={CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting},
  journal={Scientific Data},
  year={2025},
  month={Feb},
  day={26},
  volume={12},
  number={1},
  pages={338},
  issn={2052-4463},
  doi={10.1038/s41597-025-04603-x},
  url={https://doi.org/10.1038/s41597-025-04603-x}
}
Time series
We create evenly spaced time series for each IP address by aggregating IP flow records into time series datapoints. The created datapoints represent the behavior of IP addresses within a defined time window of 10 minutes. The vector of time-series metrics v_{ip, i} describes the IP address ip in the i-th time window. Thus, IP flows for vector v_{ip, i} are captured in time windows starting at t_i and ending at t_{i+1}. The time series are built from these datapoints.
Datapoints created by the aggregation of IP flows contain the following time-series metrics:
Simple volumetric metrics: the number of IP flows, the number of packets, and the transmitted data size (i.e. number of bytes)
Unique volumetric metrics: the number of unique destination IP addresses, the number of unique destination Autonomous System Numbers (ASNs), and the number of unique destination transport layer ports. The aggregation of unique volumetric metrics is memory intensive since all unique values must be stored in an array. We used a server with 41 GB of RAM, which was enough for 10-minute aggregation on the ISP network.
Ratios metrics: the ratio of UDP/TCP packets, the ratio of UDP/TCP transmitted data size, the direction ratio of packets, and the direction ratio of transmitted data size
Average metrics: the average flow duration, and the average Time To Live (TTL)
Multiple time aggregation: The original datapoints in the dataset are aggregated over 10 minutes of network traffic. The size of the aggregation interval influences anomaly detection procedures, mainly the training speed of the detection model. However, 10-minute intervals can be too short for longitudinal anomaly detection methods. Therefore, we added two more aggregation intervals to the dataset: 1 hour and 1 day.
Time series of institutions: We identify 283 institutions inside the CESNET3 network. These time series, aggregated per institution ID, provide a view of each institution's data.
Time series of institutional subnets: We identify 548 institution subnets inside the CESNET3 network. These time series, aggregated per institution subnet, provide a view of each subnet's data.
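A minimal sketch of the 10-minute aggregation described above, assuming IP flow records are available as a pandas DataFrame with hypothetical columns src_ip, time_start, packets, bytes, dst_ip, dst_asn, dst_port, protocol, and duration; it illustrates the metric definitions, not the original processing pipeline:

import pandas as pd

def aggregate_10min(flows: pd.DataFrame) -> pd.DataFrame:
    flows = flows.copy()
    # Assign each flow to a 10-minute window based on its start time.
    flows["window"] = flows["time_start"].dt.floor("10min")
    # Packets carried by TCP flows only, used for the TCP/UDP ratio below
    # (assumes every flow is either TCP or UDP).
    flows["tcp_packets"] = flows["packets"].where(flows["protocol"] == "TCP", 0)

    datapoints = flows.groupby(["src_ip", "window"]).agg(
        n_flows=("packets", "size"),
        n_packets=("packets", "sum"),
        n_bytes=("bytes", "sum"),
        n_dest_ip=("dst_ip", "nunique"),
        n_dest_asn=("dst_asn", "nunique"),
        n_dest_port=("dst_port", "nunique"),
        avg_duration=("duration", "mean"),
        tcp_packets=("tcp_packets", "sum"),
    )
    # 1 = all packets sent over TCP, 0 = all packets sent over UDP.
    datapoints["tcp_udp_ratio_packets"] = datapoints["tcp_packets"] / datapoints["n_packets"]
    return datapoints.drop(columns="tcp_packets").reset_index()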
Data Records
The file hierarchy is described below:
cesnet-timeseries24/
|- institution_subnets/
| |- agg_10_minutes/.csv
| |- agg_1_hour/.csv
| |- agg_1_day/.csv
| |- identifiers.csv
|- institutions/
| |- agg_10_minutes/.csv
| |- agg_1_hour/.csv
| |- agg_1_day/.csv
| |- identifiers.csv
|- ip_addresses_full/
| |- agg_10_minutes//.csv
| |- agg_1_hour//.csv
| |- agg_1_day//.csv
| |- identifiers.csv
|- ip_addresses_sample/
| |- agg_10_minutes/.csv
| |- agg_1_hour/.csv
| |- agg_1_day/.csv
| |- identifiers.csv
|- times/
| |- times_10_minutes.csv
| |- times_1_hour.csv
| |- times_1_day.csv
|- ids_relationship.csv
|- weekends_and_holidays.csv
The following list describes time series data fields in CSV files:
id_time: Unique identifier for each aggregation interval within the time series, used to segment the dataset into specific time periods for analysis.
n_flows: Total number of flows observed in the aggregation interval, indicating the volume of distinct sessions or connections for the IP address.
n_packets: Total number of packets transmitted during the aggregation interval, reflecting the packet-level traffic volume for the IP address.
n_bytes: Total number of bytes transmitted during the aggregation interval, representing the data volume for the IP address.
n_dest_ip: Number of unique destination IP addresses contacted by the IP address during the aggregation interval, showing the diversity of endpoints reached.
n_dest_asn: Number of unique destination Autonomous System Numbers (ASNs) contacted by the IP address during the aggregation interval, indicating the diversity of networks reached.
n_dest_port: Number of unique destination transport layer ports contacted by the IP address during the aggregation interval, representing the variety of services accessed.
tcp_udp_ratio_packets: Ratio of packets sent using TCP versus UDP by the IP address during the aggregation interval, providing insight into the transport protocol usage pattern. This metric belongs to the interval <0, 1> where 1 is when all packets are sent over TCP, and 0 is when all packets are sent over UDP.
tcp_udp_ratio_bytes: Ratio of bytes sent using TCP versus UDP by the IP address during the aggregation interval, highlighting the data volume distribution between protocols. This metric belongs to the interval <0, 1> with the same rule as tcp_udp_ratio_packets.
dir_ratio_packets: Ratio of packet directions (inbound versus outbound) for the IP address during the aggregation interval, indicating the balance of traffic flow directions. This metric belongs to the interval <0, 1>, where 1 is when all packets are sent in the outgoing direction from the monitored IP address, and 0 is when all packets are sent in the incoming direction to the monitored IP address.
dir_ratio_bytes: Ratio of byte directions (inbound versus outbound) for the IP address during the aggregation interval, showing the data volume distribution in traffic flows. This metric belongs to the interval <0, 1> with the same rule as dir_ratio_packets.
avg_duration: Average duration of IP flows for the IP address during the aggregation interval, measuring the typical session length.
avg_ttl: Average Time To Live (TTL) of IP flows for the IP address during the aggregation interval, providing insight into the lifespan of packets.
Moreover, the time series created by re-aggregation contain the following time series metrics instead of n_dest_ip, n_dest_asn, and n_dest_port:
sum_n_dest_ip: Sum of numbers of unique destination IP addresses.
avg_n_dest_ip: The average number of unique destination IP addresses.
std_n_dest_ip: Standard deviation of numbers of unique destination IP addresses.
sum_n_dest_asn: Sum of numbers of unique destination ASNs.
avg_n_dest_asn: The average number of unique destination ASNs.
std_n_dest_asn: Standard deviation of numbers of unique destination ASNs.
sum_n_dest_port: Sum of numbers of unique destination transport layer ports.
avg_n_dest_port: The average number of unique destination transport layer ports.
std_n_dest_port: Standard deviation of numbers of unique destination transport layer ports.
Moreover, files identifiers.csv in each dataset type contain IDs of time series that are present in the dataset. Furthermore, the ids_relationship.csv file contains a relationship between IP addresses, Institutions, and institution subnets. The weekends_and_holidays.csv contains information about the non-working days in the Czech Republic.
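A minimal loading sketch under the file layout above, assuming each per-entity CSV can be joined to the corresponding times file via the id_time field (the institution ID in the file name is hypothetical; the available IDs are listed in identifiers.csv):

import pandas as pd

# Hypothetical institution ID "42"; see institutions/identifiers.csv for valid IDs.
series = pd.read_csv("cesnet-timeseries24/institutions/agg_10_minutes/42.csv")
times = pd.read_csv("cesnet-timeseries24/times/times_10_minutes.csv")

# Attach the timestamp of each aggregation interval to the metric values.
series = series.merge(times, on="id_time", how="left")
print(series[["id_time", "n_flows", "n_packets", "n_bytes"]].head())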
The data is a synthetic univariate time series.
This data set is designed for testing indexing schemes in time series databases. The data appears highly periodic, but never exactly repeats itself. This feature is designed to challenge the indexing tasks.
This data set is designed for testing indexing schemes in time series databases. It is a much larger dataset than has been used in any published study (that we are currently aware of). It contains one million data points. The data has been split into 10 sections to facilitate testing (see below). We recommend building the index with 9 of the 100,000-datapoint sections and randomly extracting a query shape from the 10th section. (Some previously published work seems to have used queries that were also used to build the indexing structure; this will produce optimistic results.) The data are interesting because they have structure at different resolutions. Each of the 10 sections was generated by independent invocations of a generating function; its equation was provided as an image in the original description and is not reproduced here.
In that equation, rand(x) produces a random integer between zero and x. The data appears highly periodic, but never exactly repeats itself. This feature is designed to challenge the indexing structure.
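Since the generating equation itself is not reproduced above, the following sketch only illustrates the kind of pseudo-periodic generator described: a sum of sinusoids whose frequencies are perturbed by rand(x) at every sample. The weights and frequency ranges are assumptions for demonstration, not the original formula:

import numpy as np

rng = np.random.default_rng()

def pseudo_periodic_section(n_points: int = 100_000) -> np.ndarray:
    # Illustrative generator in the spirit of the description above: each sample
    # perturbs the sinusoid frequencies by a random integer between zero and x
    # (the rand(x) of the text), so the signal looks periodic but never repeats.
    t = np.arange(n_points) / n_points
    y = np.zeros(n_points)
    for i in range(3, 8):
        rand_x = rng.integers(0, 2**i + 1, size=n_points)  # rand(2**i), drawn per sample
        y += (1 / 2**i) * np.sin(2 * np.pi * (2**(2 + i) + rand_x) * t)
    return y

section = pseudo_periodic_section()
print(section.min(), section.max())  # values stay roughly within [-0.5, 0.5]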
The data is stored in one ASCII file. There are 10 columns, 100,000 rows. All data points are in the range -0.5 to +0.5. Rows are separated by carriage returns, columns by spaces.
Acknowledgements, Copyright Information, and Availability: Freely available for research use.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains raw, unprocessed data files pertaining to the management tool 'Benchmarking'. The data originates from five distinct sources, each reflecting different facets of the tool's prominence and usage over time. Files preserve the original metrics and temporal granularity before any comparative normalization or harmonization.

Data Sources & File Details:

Google Trends File (Prefix: GT_): Metric: Relative Search Interest (RSI) Index (0-100 scale). Keywords Used: "benchmarking" + "benchmarking management". Time Period: January 2004 - January 2025 (Native Monthly Resolution). Scope: Global Web Search, broad categorization. Extraction Date: Data extracted January 2025. Notes: Index relative to peak interest within the period for these terms. Reflects public/professional search interest trends. Based on probabilistic sampling. Source URL: Google Trends Query

Google Books Ngram Viewer File (Prefix: GB_): Metric: Annual Relative Frequency (% of total n-grams in the corpus). Keywords Used: Benchmarking. Time Period: 1950 - 2022 (Annual Resolution). Corpus: English. Parameters: Case Insensitive OFF, Smoothing 0. Extraction Date: Data extracted January 2025. Notes: Reflects term usage frequency in Google's digitized book corpus. Subject to corpus limitations (English bias, coverage). Source URL: Ngram Viewer Query

Crossref.org File (Prefix: CR_): Metric: Absolute count of publications per month matching keywords. Keywords Used: "benchmarking" AND ("process" OR "management" OR "performance" OR "best practices" OR "implementation" OR "approach" OR "evaluation" OR "methodology"). Time Period: 1950 - 2025 (Queried for monthly counts based on publication date metadata). Search Fields: Title, Abstract. Extraction Date: Data extracted January 2025. Notes: Reflects volume of relevant academic publications indexed by Crossref. Deduplicated using DOIs; records without DOIs omitted. Source URL: Crossref Search Query

Bain & Co. Survey - Usability File (Prefix: BU_): Metric: Original Percentage (%) of executives reporting tool usage. Tool Names/Years Included: Benchmarking (1993, 1996, 1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017). Respondent Profile: CEOs, CFOs, COOs, other senior leaders; global, multi-sector. Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., et al., various years: 1994, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017). Note: Tool not included in the 2022 survey data. Data Compilation Period: July 2024 - January 2025. Notes: Data points correspond to specific survey years. Sample sizes: 1993/500; 1996/784; 1999/475; 2000/214; 2002/708; 2004/960; 2006/1221; 2008/1430; 2010/1230; 2012/1208; 2014/1067; 2017/1268.

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Metric: Original Average Satisfaction Score (Scale 0-5). Tool Names/Years Included: Benchmarking (1993, 1996, 1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017). Respondent Profile: CEOs, CFOs, COOs, other senior leaders; global, multi-sector. Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., et al., various years: 1994, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017). Note: Tool not included in the 2022 survey data. Data Compilation Period: July 2024 - January 2025. Notes: Data points correspond to specific survey years. Sample sizes: 1993/500; 1996/784; 1999/475; 2000/214; 2002/708; 2004/960; 2006/1221; 2008/1430; 2010/1230; 2012/1208; 2014/1067; 2017/1268. Reflects subjective executive perception of utility.
File Naming Convention: Files generally follow the pattern PREFIX_Tool.csv, where the PREFIX indicates the data source:
GT_: Google Trends
GB_: Google Books Ngram
CR_: Crossref.org (Count Data for this Raw Dataset)
BU_: Bain & Company Survey (Usability)
BS_: Bain & Company Survey (Satisfaction)
The essential identification comes from the PREFIX and the Tool Name segment. This dataset resides within the 'Management Tool Source Data (Raw Extracts)' Dataverse.
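A minimal sketch of resolving the source from a file name under this convention (the example file name is hypothetical):

SOURCE_BY_PREFIX = {
    "GT": "Google Trends",
    "GB": "Google Books Ngram",
    "CR": "Crossref.org",
    "BU": "Bain & Company Survey (Usability)",
    "BS": "Bain & Company Survey (Satisfaction)",
}

def describe(filename: str) -> str:
    # PREFIX_Tool.csv -> "<source>: <tool>"
    prefix, tool = filename.removesuffix(".csv").split("_", 1)
    return f"{SOURCE_BY_PREFIX[prefix]}: {tool}"

print(describe("GT_Benchmarking.csv"))  # hypothetical file name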
Population is the sum of births plus in-migration, and it signifies the total market size possible in the area. This is an important metric for economic developers to measure their economic health and investment attraction. Businesses also use this as a metric for market size when evaluating startup, expansion or relocation decisions.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive view of the aging process of lithium-ion batteries, facilitating the estimation of their Remaining Useful Life (RUL). Originally sourced from NASA's open repository, the dataset has undergone meticulous preprocessing to enhance its analytical utility. The data is presented in a user-friendly CSV format after extracting relevant features from the original .mat files.
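A minimal sketch of the kind of .mat-to-CSV conversion described above, assuming a generic MATLAB file; the file name and the placeholder column names are hypothetical and not taken from the original preprocessing:

from scipy.io import loadmat
import pandas as pd

# Hypothetical input file name; adjust to the actual NASA .mat files.
mat = loadmat("battery_cycles.mat", squeeze_me=True)

# Inspect the variables stored in the .mat file before deciding what to extract.
print([key for key in mat.keys() if not key.startswith("__")])

# Once the relevant arrays are identified, a flat table can be written out as CSV.
# The column names below are placeholders, not the columns of this dataset.
df = pd.DataFrame({"cycle_index": [], "capacity": [], "temperature": []})
df.to_csv("battery_features.csv", index=False)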
Battery Performance Metrics:
Environmental Conditions:
Identification Attributes:
Processed Data:
Labels:
Battery Health Monitoring:
Data Science and Machine Learning:
Research and Development:
The dataset was retrieved from NASA's publicly available data repositories. It has been preprocessed to align with research and industrial standards for usability in analytical tasks.
Leverage this dataset to enhance your understanding of lithium-ion battery degradation and build models that could revolutionize energy storage solutions.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains raw, unprocessed data files pertaining to the management tool 'Zero-Based Budgeting' (ZBB), including related concepts like Priority Based Budgeting. The data originates from five distinct sources, each reflecting different facets of the tool's prominence and usage over time. Files preserve the original metrics and temporal granularity before any comparative normalization or harmonization.

Data Sources & File Details:

Google Trends File (Prefix: GT_): Metric: Relative Search Interest (RSI) Index (0-100 scale). Keywords Used: "zero based budgeting" + "priority based budgeting" + "zero based budgeting management". Time Period: January 2004 - January 2025 (Native Monthly Resolution). Scope: Global Web Search, broad categorization. Extraction Date: Data extracted January 2025. Notes: Index relative to peak interest within the period for these terms. Reflects public/professional search interest trends. Based on probabilistic sampling. Source URL: Google Trends Query

Google Books Ngram Viewer File (Prefix: GB_): Metric: Annual Relative Frequency (% of total n-grams in the corpus). Keywords Used: Zero Based Budgeting + Priority Based Budgeting + Program Budgeting. Time Period: 1950 - 2022 (Annual Resolution). Corpus: English. Parameters: Case Insensitive OFF, Smoothing 0. Extraction Date: Data extracted January 2025. Notes: Reflects term usage frequency in Google's digitized book corpus. Subject to corpus limitations (English bias, coverage). Source URL: Ngram Viewer Query

Crossref.org File (Prefix: CR_): Metric: Absolute count of publications per month matching keywords. Keywords Used: ("zero based budgeting" OR "priority based budgeting" OR "program budgeting") AND ("management" OR "financial" OR "budgeting process" OR "planning" OR "control" OR "system"). Time Period: 1950 - 2025 (Queried for monthly counts based on publication date metadata). Search Fields: Title, Abstract. Extraction Date: Data extracted January 2025. Notes: Reflects volume of relevant academic publications indexed by Crossref. Deduplicated using DOIs; records without DOIs omitted. Source URL: Crossref Search Query

Bain & Co. Survey - Usability File (Prefix: BU_): Metric: Original Percentage (%) of executives reporting tool usage. Tool Names/Years Included: Zero-Based Budgeting (2012, 2014, 2017, 2022). Respondent Profile: CEOs, CFOs, COOs, other senior leaders from multinational corporations and medium-sized enterprises across strategy, marketing, HR, etc.; global, multi-sector. Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., Ronan C. et al., various years: 2013, 2015, 2017, 2023). Data Compilation Period: July 2024 - January 2025. Notes: Data points correspond to specific survey years. Sample sizes: 2012/1208; 2014/1067; 2017/1268; 2022/1068.

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Metric: Original Average Satisfaction Score (Scale 0-5). Tool Names/Years Included: Zero-Based Budgeting (2012, 2014, 2017, 2022). Respondent Profile: CEOs, CFOs, COOs, other senior leaders from multinational corporations and medium-sized enterprises across strategy, marketing, HR, etc.; global, multi-sector. Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., Ronan C. et al., various years: 2013, 2015, 2017, 2023). Data Compilation Period: July 2024 - January 2025. Notes: Data points correspond to specific survey years. Sample sizes: 2012/1208; 2014/1067; 2017/1268; 2022/1068. Reflects subjective executive perception of utility.
File Naming Convention: Files generally follow the pattern PREFIX_Tool.csv, where the PREFIX indicates the data source:
GT_: Google Trends
GB_: Google Books Ngram
CR_: Crossref.org (Count Data for this Raw Dataset)
BU_: Bain & Company Survey (Usability)
BS_: Bain & Company Survey (Satisfaction)
The essential identification comes from the PREFIX and the Tool Name segment. This dataset resides within the 'Management Tool Source Data (Raw Extracts)' Dataverse.
By US Open Data Portal, data.gov [source]
This dataset provides a list of all Home Health Agencies registered with Medicare. Contained within this dataset is information on each agency's address, phone number, type of ownership, quality measure ratings and other associated data points. With this valuable insight into the operations of each Home Health Care Agency, you can make informed decisions about your care needs. Learn more about the services offered at each agency and how they are rated according to their quality measure ratings. From dedicated nursing care services to speech pathology to medical social services - get all the information you need with this comprehensive look at U.S.-based Home Health Care Agencies!
Are you looking to learn more about Home Health Care Agencies registered with Medicare? This dataset can provide quality measure ratings, addresses, phone numbers, types of services offered and other information that may be helpful when researching Home Health Care Agencies.
This guide will explain how to use the data in this dataset to gain a better understanding of Home Health Care Agencies registered with Medicare.
First, you will need to become familiar with the columns in the dataset. A list of all columns and their associated descriptions is provided above for your reference. Once you understand each column’s purpose, it will be easier for you to decide what metrics or variables are most important for your own research.
Next, use this data to compare various facets between different Home Health Care Agencies, such as type of ownership, services offered, and quality measure ratings like the star rating (from 0-5 stars) or the CMS certification number. Collecting information from multiple sources such as public reviews or customer feedback can help supplement these numerical metrics in order to paint a more accurate picture of each agency's performance and customer satisfaction level.
Finally, once you have collected enough data points on one particular agency, or for a comparison between multiple agencies, conduct further analysis using statistical methods like correlation matrices to determine any patterns in the data set that may reveal valuable insights into the research topic at hand.
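A minimal sketch of such a correlation check, assuming the CSV has been loaded with pandas; the numeric column names used here are hypothetical, since the column listing below is truncated:

import pandas as pd

# Hypothetical numeric column names; substitute the actual columns from csv-1.csv.
df = pd.read_csv("csv-1.csv")
numeric_columns = ["quality_of_patient_care_star_rating", "episodes_billed_to_medicare"]

# Pairwise Pearson correlations between the selected numeric metrics.
correlation_matrix = df[numeric_columns].corr()
print(correlation_matrix)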
- Using the data to compare quality of care ratings between agencies, so people can make better informed decisions about which agency to hire for home health services.
- Analyzing the costs associated with different types of home health care services, such as nursing care and physical therapy, in order to determine where money could be saved in health care budgets.
- Evaluating the performance of certain agencies by analyzing the number of episodes billed to Medicare compared to the national averages, allowing agencies with lower numbers of billing episodes to be identified and monitored more closely if necessary.
If you use this dataset in your research, please credit the original authors. Data Source
Unknown License - Please check the dataset description for more information.
File: csv-1.csv
| Column name | Description |
|:----------------------------------------...