Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics of machine learning and data-processing algorithms.
https://www.emergenresearch.com/purpose-of-privacy-policy
The Synthetic Data Generation Market is expected to reach a valuation of USD 36.09 Billion in 2033, growing at a CAGR of 39.45%. The research report classifies the market by share, trend, and demand, and segments it by Data Type, Modeling Type, Offering, Application, End Use, and Regional Outlook.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Clinical data is instrumental to medical research, machine learning (ML) model development, and advancing surgical care, but access is often constrained by privacy regulations and missing data. Synthetic data offers a promising solution to preserve privacy while enabling broader data access. Recent advances in large language models (LLMs) provide an opportunity to generate synthetic data with reduced reliance on domain expertise, computational resources, and pre-training.

Objective: This study aims to assess the feasibility of generating realistic tabular clinical data with OpenAI's GPT-4o using zero-shot prompting, and to evaluate the fidelity of the LLM-generated data by comparing its statistical properties to the Vital Signs DataBase (VitalDB), a real-world open-source perioperative dataset.

Methods: In Phase 1, GPT-4o was prompted to generate a dataset from qualitative descriptions of 13 clinical parameters. The resulting data was assessed for general errors, plausibility of outputs, and cross-verification of related parameters. In Phase 2, GPT-4o was prompted to generate a dataset using descriptive statistics of the VitalDB dataset. Fidelity was assessed using two-sample t-tests, two-sample proportion tests, and 95% confidence interval (CI) overlap.

Results: In Phase 1, GPT-4o generated a complete and structured dataset comprising 6,166 case files. The dataset was plausible in range and correctly calculated body mass index for all case files based on the respective heights and weights. Statistical comparison between the LLM-generated datasets and VitalDB showed that the Phase 2 data achieved high fidelity: it demonstrated statistical similarity in 12/13 (92.31%) parameters, with no statistically significant differences observed in 6/6 (100.0%) categorical/binary and 6/7 (85.71%) continuous parameters. Overlap of 95% CIs was observed in 6/7 (85.71%) continuous parameters.

Conclusion: Zero-shot prompting with GPT-4o can generate realistic tabular synthetic datasets that replicate key statistical properties of real-world perioperative data. This study highlights the potential of LLMs as a novel and accessible modality for synthetic data generation, which may address critical barriers in clinical data access and eliminate the need for technical expertise, extensive computational resources, and pre-training. Further research is warranted to enhance fidelity and investigate the use of LLMs to amplify and augment datasets, preserve multivariate relationships, and train robust ML models.
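The Phase 2 fidelity assessment described above (two-sample t-tests, two-sample proportion tests, and 95% CI overlap) can be reproduced in outline with standard statistical tooling. The following is a minimal sketch, not the study's actual code; the CSV file names and column names such as "age" and "sex" are assumptions for illustration only.

```python
# Sketch: compare one continuous and one categorical parameter between a real
# dataset and an LLM-generated synthetic dataset (hypothetical file/column names).
import numpy as np
import pandas as pd
from scipy import stats

def compare_continuous(real: pd.Series, synthetic: pd.Series, alpha: float = 0.05):
    """Welch two-sample t-test plus a simple 95% CI overlap check for the means."""
    t_stat, p_value = stats.ttest_ind(real, synthetic, equal_var=False)
    cis = []
    for s in (real, synthetic):
        se = s.std(ddof=1) / np.sqrt(len(s))
        cis.append((s.mean() - 1.96 * se, s.mean() + 1.96 * se))
    overlap = cis[0][0] <= cis[1][1] and cis[1][0] <= cis[0][1]
    return {"p_value": p_value, "similar": p_value >= alpha, "ci_overlap": overlap}

def compare_proportion(real: pd.Series, synthetic: pd.Series, category, alpha: float = 0.05):
    """Two-sample proportion (z) test for one category of a binary/categorical column."""
    counts = np.array([(real == category).sum(), (synthetic == category).sum()])
    nobs = np.array([len(real), len(synthetic)])
    p_pool = counts.sum() / nobs.sum()
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / nobs[0] + 1 / nobs[1]))
    z = (counts[0] / nobs[0] - counts[1] / nobs[1]) / se
    p_value = 2 * stats.norm.sf(abs(z))
    return {"p_value": p_value, "similar": p_value >= alpha}

# Example usage (hypothetical file and column names):
# real_df, synth_df = pd.read_csv("vitaldb.csv"), pd.read_csv("gpt4o_synthetic.csv")
# print(compare_continuous(real_df["age"], synth_df["age"]))
# print(compare_proportion(real_df["sex"], synth_df["sex"], category="F"))
```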
https://www.rootsanalysis.com/privacy.html
The global synthetic data market size is projected to grow from USD 0.4 billion in the current year to USD 19.22 billion by 2035, representing a CAGR of 42.14% over the forecast period to 2035.
https://www.verifiedmarketresearch.com/privacy-policy/
Test Data Management Market size was valued at USD 1.54 Billion in 2024 and is projected to reach USD 2.97 Billion by 2031, growing at a CAGR of 11.19% from 2024 to 2031.
Test Data Management Market Drivers
Increasing Data Volumes: The exponential growth in data generated by businesses necessitates efficient management of test data. Effective TDM solutions help organizations handle large volumes of data, ensuring accurate and reliable testing processes.
Need for Regulatory Compliance: Stringent data privacy regulations, such as GDPR, HIPAA, and CCPA, require organizations to protect sensitive data. TDM solutions help ensure compliance by masking or anonymizing sensitive data used in testing environments.
Adoption of DevOps and Agile Methodologies: The shift towards DevOps and Agile development practices increases the demand for TDM solutions. These methodologies require continuous testing and integration, necessitating efficient management of test data to maintain quality and speed.
https://market.us/privacy-policy/
The Synthetic Data Generation Market is estimated to reach USD 6,637.9 Mn by 2034, riding on a strong 35.9% CAGR during the forecast period.
https://www.futuremarketinsights.com/privacy-policy
The synthetic data generation market is projected to be worth US$ 300 million in 2024 and is anticipated to reach US$ 13.0 billion by 2034, surging at a CAGR of 45.9% during the forecast period 2024 to 2034.
Attributes | Key Insights |
---|---|
Synthetic Data Generation Market Estimated Size in 2024 | US$ 300 million |
Projected Market Value in 2034 | US$ 13.0 billion |
Value-based CAGR from 2024 to 2034 | 45.9% |
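As a quick consistency check (not part of the report itself), the compound annual growth rate implied by the table's endpoints can be recomputed directly:

```python
# CAGR over n years = (end / start) ** (1 / n) - 1
# Using the table's endpoints: US$ 0.3 bn in 2024 to US$ 13.0 bn in 2034 (10 years).
start, end, years = 0.3, 13.0, 10
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # ~45.8%, consistent with the reported 45.9% after rounding
```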
Country-wise Insights
Countries | Forecast CAGRs from 2024 to 2034 |
---|---|
The United States | 46.2% |
The United Kingdom | 47.2% |
China | 46.8% |
Japan | 47.0% |
Korea | 47.3% |
Category-wise Insights
Category | CAGR through 2034 |
---|---|
Tabular Data | 45.7% |
Sandwich Assays | 45.5% |
Report Scope
Attribute | Details |
---|---|
Estimated Market Size in 2024 | US$ 0.3 billion |
Projected Market Valuation in 2034 | US$ 13.0 billion |
Value-based CAGR 2024 to 2034 | 45.9% |
Forecast Period | 2024 to 2034 |
Historical Data Available for | 2019 to 2023 |
Market Analysis | Value in US$ Billion |
Key Regions Covered |
Key Market Segments Covered |
Key Countries Profiled |
Key Companies Profiled |
The Synthea-generated data is provided here as 1,000-person (1k), 100,000-person (100k), and 2,800,000-person (2.8m) data sets in the OMOP Common Data Model format. Synthea™ is a synthetic patient generator that models the medical history of synthetic patients. Our mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions. It can be used without restriction for a variety of secondary uses in academia, research, industry, and government (although a citation would be appreciated). You can read our first academic paper here: https://doi.org/10.1093/jamia/ocx079
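As an illustration of how OMOP-formatted data like this is typically consumed (a sketch under assumed file names, not instructions from the Synthea release), the person and condition_occurrence tables can be joined on person_id:

```python
# Sketch only: assumes the OMOP CDM tables are exported as one CSV per table,
# with standard column names such as person_id. File names are placeholders.
import pandas as pd

person = pd.read_csv("person.csv")
conditions = pd.read_csv("condition_occurrence.csv")

one_patient = person.iloc[0]["person_id"]
history = conditions[conditions["person_id"] == one_patient]
print(f"Patient {one_patient}: {len(history)} condition records")
```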
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This is one of two collection records. Please see the link below for the other collection of associated audio files.
Both collections together comprise an open clinical dataset of three sets of 101 nursing handover records, very similar to real documents in Australian English. Each record consists of a patient profile, spoken free-form text document, written free-form text document, and written structured document.
This collection contains 3 sets of text documents.
Data Set 1 for Training and Development
The data set, released in June 2014, includes the following documents:
Folder initialisation: Initialisation details for speech recognition using Dragon Medical 11.0 (i.e., i) DOCX for the written, free-form text document that originates from the Dragon software release and ii) WMA for the spoken, free-form text document by the RN)
Folder 100profiles: 100 patient profiles (DOCX)
Folder 101writtenfreetextreports: 101 written, free-form text documents (TXT)
Folder 100x6speechrecognised: 100 speech-recognized, written, free-form text documents for six Dragon vocabularies (TXT)
Folder 101informationextraction: 101 written, structured documents for information extraction that include i) the reference standard text, ii) features used by our best system, iii) form categories with respect to the reference standard, and iv) form categories with respect to our best information extraction system (TXT in CRF++ format).
An Independent Data Set 2
The aforementioned data set was supplemented in April 2015 with an independent set that was used as a test set in the CLEFeHealth 2015 Task 1a on clinical speech recognition and can be used as a validation set in the CLEFeHealth 2016 Task 1 on handover information extraction. Hence, when using this set, please avoid its repeated use in evaluation – we do not wish to overfit to these data sets.
The set released in April 2015 consists of 100 patient profiles (DOCX), 100 written, and 100 speech-recognized, written, free-form text documents for the Dragon vocabulary of Nursing (TXT). The set released in November 2015 consists of the respective 100 written free-form text documents (TXT) and 100 written, structured documents for information extraction.
An Independent Data Set 3
For evaluation purposes, the aforementioned data sets were supplemented in April 2016 with an independent set of another 100 synthetic cases.
Lineage: Data creation included the following steps: generation of patient profiles; creation of written, free-form text documents; development of a structured handover form; using this form and the written, free-form text documents to create written, structured documents; creation of spoken, free-form text documents; using a speech recognition engine with different vocabularies to convert the spoken documents to written, free-form text; and using an information extraction system to fill out the handover form from the written, free-form text documents.
See Suominen et al. (2015) in the links below for a detailed description and examples.
Data scarcity presents a significant obstacle in the field of biomedicine, where acquiring diverse and sufficient datasets can be costly and challenging. Synthetic data generation offers a potential solution to this problem by expanding dataset sizes, thereby enabling the training of more robust and generalizable machine learning models. Although previous studies have explored synthetic data generation for cancer diagnosis, they have predominantly focused on single-modality settings, such as whole-slide image tiles or RNA-Seq data. To bridge this gap, we propose a novel approach, RNA-Cascaded-Diffusion-Model or RNA-CDM, for performing RNA-to-image synthesis in a multi-cancer context, drawing inspiration from successful text-to-image synthesis models used in natural images. In our approach, we employ a variational auto-encoder to reduce the dimensionality of a patient's gene expression profile, effectively distinguishing between different types of cancer. Subsequently, we employ a cascad...

# RNA-CDM Generated One Million Synthetic Images
https://doi.org/10.5061/dryad.6djh9w174
One million synthetic digital pathology images were generated using the RNA-CDM model presented in the paper "RNA-to-image multi-cancer synthesis using cascaded diffusion models".
There are ten different h5 files per cancer type (TCGA-CESC, TCGA-COAD, TCGA-KIRP, TCGA-GBM, TCGA-LUAD). Each h5 file contains 20,000 images. The key is the tile number, ranging from 0-20,000 in the first file and from 180,000-200,000 in the last file. The tiles are saved as numpy arrays.
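A minimal sketch of reading tiles from one of these files with h5py, assuming the layout described above (one array per tile, keyed by tile number); the file name below is a placeholder:

```python
import h5py
import numpy as np

with h5py.File("TCGA-LUAD_tiles_part0.h5", "r") as f:
    keys = sorted(f.keys(), key=int)      # tile numbers as strings, e.g. "0" .. "19999"
    first_tile = np.array(f[keys[0]])     # one synthetic image tile as a numpy array
    print(len(keys), first_tile.shape, first_tile.dtype)
```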
The code used to generate this data is available under an academic license at https://rna-cdm.stanford.edu.
Carrillo-Perez, F., Pizurica, M., Zheng, Y. et al. Generation of synthetic whole-slide image tiles of tumours from RNA-sequencing data via cascaded diffusion models...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data.
The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated a collection of synthetic insider threat test datasets. These datasets provide both synthetic background data and data from synthetic malicious actors. For more background on this data, please see the paper, Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data. Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2). Generally, later releases include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme file that provides detailed notes about the features of that release. The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
https://researchdatafinder.qut.edu.au/display/n6417
Test data created for the ACRV Robotic Vision Challenge 1 (see https://competitions.codalab.org/competitions/20940). Synthetic image data generated from Unreal Engine. QUT Research Data Repository dataset resource available for download.
https://www.marketresearchintellect.com/privacy-policy
The size and share of the market are categorized based on Type (Implementation, Consulting, Support and Maintenance) and Application (Data subsetting, Data masking, Data profiling and analysis, Data compliance and security, Synthetic test data generation, Others) and geographical regions (North America, Europe, Asia-Pacific, South America, and Middle East and Africa).
SYNTHETIC: This dataset contains 10 tumor and normal pairs of synthetic WGS data of colorectal cancer, simulated in the standard format of Illumina paired-end reads. The NEAT read simulator (version 3.0, https://github.com/zstephens/neat-genreads) was used to synthesize these 10 pairs of tumor and normal WGS data. During data generation, simulation parameters (i.e., sequencing error statistics, read fragment length distribution and GC% coverage bias) were learned from data models provided by NEAT. The average sequencing depth for tumor and normal samples aims to reach around 110X and 60X, respectively. To generate the synthetic normal WGS data for each sample, a germline variant profile from a real patient is randomly down-sampled to retain 50% of that patient's germline variants. It is then mixed with an in silico germline variant profile modelled randomly using an average mutation rate of 0.001, constituting the full germline profile for the normal synthetic WGS data. To generate the synthetic tumor WGS data for each sample, a pre-defined somatic short variant profile (SNVs+Indels) learned from a real CRC patient is added to the germline variant profile used for the normal synthetic WGS data of the same patient, and this combined profile is used to produce the simulated sequences. Neither a copy number profile nor a structural variation profile is introduced into the tumor synthetic WGS data. Tumor content and ploidy are assumed to be 100% and 2, respectively.
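The germline mixing step described above can be illustrated with a short sketch. This is a toy version of the logic only (50% down-sampling of real variants plus randomly modelled variants at an average rate of 0.001), not the NEAT-based pipeline itself; the data structures are assumptions.

```python
# Assumes variant profiles are simple lists/iterables of variant records.
import random

def build_synthetic_germline(real_variants, candidate_sites, mutation_rate=0.001, seed=0):
    rng = random.Random(seed)
    kept = [v for v in real_variants if rng.random() < 0.5]            # down-sample 50%
    modelled = [s for s in candidate_sites if rng.random() < mutation_rate]  # in silico variants
    return kept + modelled                                             # full germline profile

# germline_profile = build_synthetic_germline(real_variants, candidate_sites)
```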
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivity peptide identification using either curated or predicted spectral libraries as well as robust false discovery control through its own decoy library generation algorithm. MS Ana identifies on average 36% more spectrum matches and 4% more proteins than database search in a benchmark test on single-shot human cell-line data. Further, we demonstrate the quality of the result validation with tests on synthetic peptide pools and show the importance of library selection through a comparison of library search performance with different configurations of publicly available human spectral libraries.
100% synthetic. Based on model-released photos. Can be used for any purpose except those violating the law. Worldwide. Different backgrounds: colored, transparent, photographic. Diversity: ethnicity, demographics, facial expressions, and poses.
The INDIGO Change Detection Reference Dataset

Description
This graffiti-centred change detection dataset was developed in the context of INDIGO, a research project focusing on the documentation, analysis and dissemination of graffiti along Vienna's Donaukanal. The dataset aims to support the development and assessment of change detection algorithms.

The dataset was collected from a test site approximately 50 meters in length along Vienna's Donaukanal during 11 days between 2022/10/21 and 2022/12/01. Various cameras with different settings were used, resulting in a total of 29 data collection sessions or "epochs" (see "EpochIDs.jpg" for details). Each epoch contains 17 images generated from 29 distinct 3D models with different textures. In total, the dataset comprises 6,902 unique image pairs, along with corresponding reference change maps. Additionally, exclusion masks are provided to ignore parts of the scene that might be irrelevant, such as the background.

To summarise, the dataset, labelled as "Data.zip", includes the following:
- Synthetic Images: Colour images created within Agisoft Metashape Professional 1.8.4, generated by rendering views from 17 artificial cameras observing 29 differently textured versions of the same 3D surface model.
- Change Maps: Binary images that were manually and programmatically generated, using a Python script, from two synthetic graffiti images. These maps highlight the areas where changes have occurred.
- Exclusion Masks: Binary images manually created from synthetic graffiti images to identify "no data" areas or irrelevant ground pixels.

Image Acquisition
Image acquisition involved the use of two different camera setups. The first two datasets (ID 1 and 2; cf. "EpochIDs.jpg") were obtained using a Nikon Z 7II camera with a pixel count of 45.4 MP, paired with a Nikon NIKKOR Z 20 mm lens. For the remaining image datasets (ID 3-29), a triple GoPro setup was employed, featuring three GoPro cameras (two GoPro HERO 10 cameras and one GoPro HERO 11) securely mounted within a frame. This triple-camera setup was utilised on nine different days with varying camera settings, resulting in the acquisition of 27 image datasets in total (nine days with three datasets each).

Data Structure
The "Data.zip" file contains two subfolders:

1_ImagesAndChangeMaps: This folder contains the primary dataset. Each subfolder corresponds to a specific epoch. Within each epoch folder resides a subfolder for every other epoch with which a distinct epoch pair can be created. Note that the pair "Epoch Y and Epoch Z" is equivalent to "Epoch Z and Epoch Y", so the latter combinations are not included in this dataset. Each sub-subfolder, organised by epoch, contains 17 further subfolders, which hold the image data. These consist of:
- Two synthetic images rendered from the same synthetic camera ("X_Y.jpg" and "X_Z.jpg")
- The corresponding binary reference change map depicting the graffiti-related differences between the two images ("X_YZ.png"). Black areas denote new graffiti (i.e. "change"), and white denotes "no change".

"DataStructure.png" provides a visual explanation of how the dataset was created. The filenames follow this pattern:
- X: the ID number of the synthetic camera. In total, 17 synthetic cameras were placed along the test site.
- Y: the reference epoch (i.e. the "older" epoch)
- Z: the "new" epoch

2_ExclusionMasks: This folder contains the binary exclusion masks. They were manually created from synthetic graffiti images and identify "no data" areas or areas considered irrelevant, such as "ground pixels". Two exclusion masks were generated for each of the 17 synthetic cameras:
- "groundMasks": depict ground pixels, which are usually irrelevant for the detection of graffiti
- "noDataMasks": depict "background" for which no data is available.

A detailed dataset description (including detailed explanations of the data creation) is part of a journal paper currently in preparation. The paper will be linked here for further clarification as soon as it is available.

Licensing
Due to the nature of the three image types, this dataset comes with two licenses:

Synthetic images: These come with an In Copyright license (for the rights usage terms, see https://rightsstatements.org/page/InC/1.0/?language=en). The copyright lies with:
- the Ludwig Boltzmann Gesellschaft (https://d-nb.info/gnd/1024204324)
- the TU Wien (https://d-nb.info/gnd/55426-1)
- one or more anonymous graffiti creator(s) upon whose work these images are based.
The first two entities are also the licensors of these images.

Change maps and masks: These are openly licensed via CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0). In this case, the copyright lies with:
- the Ludwig Boltzmann Gesellschaft (https://d-nb.info/gnd/1024204324)
- the TU Wien (https://d-nb.info/gnd/55426-1)
Both institutes are also the licensors of these images.

Every synthetic image, change map and mask has this licensing information embedded as IPTC photo metadata. In addition, the images' IPTC metadata also provide a short image description, the image creator and the creator's identity (in the form of an ORCiD).
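As an illustration of how one image pair and its reference change map might be consumed, following the naming pattern described above (black pixels in the change map mark new graffiti), here is a minimal sketch; the directory layout, epoch numbers, and camera ID below are hypothetical placeholders.

```python
from PIL import Image
import numpy as np

# Hypothetical pair: reference epoch 5, new epoch 7, synthetic camera 3.
pair_dir = "Data/1_ImagesAndChangeMaps/5/7/3"
old_img = np.array(Image.open(f"{pair_dir}/3_5.jpg"))
new_img = np.array(Image.open(f"{pair_dir}/3_7.jpg"))
change_map = np.array(Image.open(f"{pair_dir}/3_57.png").convert("L"))

changed = change_map < 128          # black pixels denote "change" (new graffiti)
print(f"Changed fraction of the scene: {changed.mean():.2%}")
```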
https://www.marketresearchintellect.com/privacy-policy
The size and share of the market are categorized based on Type (Implementation, Consulting, Support and Maintenance, Training and Education) and Application (Data subsetting, Data masking, Data profiling and analysis, Data compliance and security, Synthetic test data generation, Others (data provisioning and data monitoring)) and geographical regions (North America, Europe, Asia-Pacific, South America, and Middle East and Africa).
In order to anticipate the impact of local public policies, a synthetic population reflecting the characteristics of the local population provides a valuable test bed. While synthetic population datasets are now available for several countries, there is no open-source synthetic population for Canada. We propose an open-source synthetic population of individuals and households at a fine geographical level for Canada for the years 2021, 2023 and 2030. Based on 2016 census data and population projections, the synthetic individuals have detailed socio-demographic attributes, including age, sex, income, education level, employment status and geographic location, and are grouped into households. A comparison of the 2021 synthetic population with 2021 census data over various geographical areas validates the reliability of the synthetic dataset. Users can extract populations from the dataset for specific zones, to explore 'what if' scenarios on present and future populations. They can extend the dataset using local survey data to add new characteristics to individuals. Users can also run the code to generate populations for years up to 2042.
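As an illustration of the "extract a population for a specific zone" use case described above, the sketch below filters a tabular export by a geography column; the file name, column names, and zone identifier are hypothetical, since the exact schema is documented with the dataset itself.

```python
import pandas as pd

# Hypothetical tabular export of synthetic individuals with a zone column.
individuals = pd.read_csv("synthetic_population_2021.csv")
zone = individuals[individuals["zone_id"] == "35200001"]   # one hypothetical zone
print(zone.groupby("age_group").size())                    # age structure of that zone
```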
To capture the full social and economic benefits of AI, new technologies must be sensitive to the diverse needs of the whole population. This means understanding and reflecting the complexity of individual needs, the variety of perceptions, and the constraints that might guide interaction with AI. This challenge is no more relevant than in building AI systems for older populations, where the role, potential, and outstanding challenges are all highly significant.
The RAIM (Responsible Automation for Inclusive Mobility) project will address how on-demand, electric autonomous vehicles (EAVs) might be integrated within public transport systems in the UK and Canada to meet the complex needs of older populations, resulting in improved social, economic, and health outcomes. The research adopts a multidisciplinary methodology, integrating qualitative perspectives and quantitative data analysis into AI-generated population simulations and supply optimisation. Throughout the project, there is a firm commitment to interdisciplinary interaction and learning, with researchers drawn from urban geography, ageing population health, transport planning and engineering, and artificial intelligence.
The RAIM project will produce a diverse set of outputs that are intended to promote change and discussion in transport policymaking and planning. As a primary goal, the project will simulate and evaluate the feasibility of an on-demand EAV system for older populations. This requires advances in the understanding and prediction of the complex interaction of physical and cognitive constraints, preferences, locations, lifestyles and mobility needs within older populations, which differ significantly from those of other portions of society. With these patterns of demand captured and modelled, new methods for meeting this demand through optimisation of on-demand EAVs will be required. The project will adopt a forward-looking, interdisciplinary approach to the application of AI within these research domains, including using Deep Learning to model human behaviour, Deep Reinforcement Learning to optimise the supply of EAVs, and generative modelling to estimate population distributions.
A second component of the research involves exploring the potential adoption of on-demand EAVs for ageing populations within two regions of interest. The two areas of interest, Manitoba, Canada, and the West Midlands, UK, are facing the combined challenges of growing older populations, service issues, and declining patronage on existing services for older travellers. The RAIM project has established partnerships with key local partners, including local transport authorities (Winnipeg Transit in Canada and Transport for West Midlands in the UK) as well as local support groups and industry bodies. These partnerships will provide insights and guidance on the feasibility of new AV-based mobility interventions, and a direct route to influencing future transport policy. As part of this work, the project will propose new approaches for assessing the economic case for transport infrastructure investment, by addressing the wider benefits of improved mobility in older populations.
At the heart of the project is a commitment to enhancing collaboration between academic communities in the UK and Canada. RAIM puts in place opportunities for cross-national learning and collaboration between partner organisations, ensuring that the challenges faced in relation to ageing mobility and AI are shared. RAIM furthermore will support the development of a next generation of researchers, through interdisciplinary mentoring, training, and networking opportunities.
Overview
This database is meant to evaluate the performance of denoising and delineation algorithms for PPG signals affected by noise. The noise generator allows applying the algorithms under test to an artificially corrupted reference PPG signal and comparing their output to the output obtained with the original signal. Moreover, the noise generator can produce artifacts of variable intensity, permitting the evaluation of the algorithms' performance against different noise levels. The reference signal is a PPG sample of a healthy subject at rest during a relaxation session.

Database
The database includes 1 recording of 72 seconds of synchronous PPG and ECG signals sampled at 250 Hz using a Medicom device, ABP-10 module (Medicom MTD Ltd., Russia). It was collected from a healthy subject during an induced relaxation by guided autogenic relaxation. For more information about the data collection, please refer to the following publication: https://pubmed.ncbi.nlm.nih.gov/30094756/. In addition, PPG signals corrupted by the noise generator at different levels are also included in the database.

Realistic noise generator
Motion artifacts in PPG signals generally appear in the form of sudden spikes (in correspondence with the subject's movement) and slowly varying offsets (baseline wander) due to the changes in distance between the skin and the sensor after every sudden movement. For this reason, conventional noise generators — using random noise drawn from different distributions such as Gaussian or Poissonian — do not allow a proper evaluation of an algorithm's performance, as they can only provide unrealistic noise compared to that commonly found in PPG signals. To overcome this issue, we designed a more realistic synthetic noise generator that can simulate those two behaviours, enabling us to corrupt a reference signal with different noise levels. The details about noise generation are available in the reference paper.

Data Files
The reference PPG signal can be found in Datasets\GoodSignals\PPG and the simultaneously acquired ECG in Datasets\GoodSignals\ECG. The folder Datasets\NoisySignals contains 340 noisy PPG signals affected by different levels of noise. The names describe the intensity of the noise (evaluated in terms of the standard deviation of the random noise used as input for the noise generator; see the reference paper). Five noisy signals are produced for every noise level by running the noise generator with five random seeds each (for noise generation). Naming convention: ppg_stdx_y denotes the y-th noisy PPG signal produced using noise with a standard deviation of x. Datasets\BPMs contains the ground truth for the heart-rate estimation, computed in windows of 8 s with an overlap of 2 s.

Code
The folder Code contains the MATLAB scripts to generate the noisy files by generating the realistic noise with the function noiseGenerator.

When referencing this material, please cite: Masinelli, G.; Dell'Agnola, F.; Valdés, A.A.; Atienza, D. SPARE: A Spectral Peak Recovery Algorithm for PPG Signals Pulsewave Reconstruction in Multimodal Wearable Devices. Sensors 2021, 21, 2725. https://doi.org/10.3390/s21082725
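A minimal sketch of pairing one noisy signal with the clean reference when evaluating a denoiser, following the naming convention above; the file names, text format, and the identity "denoiser" below are placeholders, not part of the release.

```python
import numpy as np

fs = 250  # sampling rate in Hz, as stated in the database description

# Hypothetical file names/format; adapt to the actual release layout.
clean = np.loadtxt("Datasets/GoodSignals/PPG/ppg_reference.txt")
noisy = np.loadtxt("Datasets/NoisySignals/ppg_std0.5_1.txt")

# Placeholder for the algorithm under test: here the "denoiser" is the identity.
denoised = noisy
n = min(len(clean), len(denoised))
rmse = np.sqrt(np.mean((denoised[:n] - clean[:n]) ** 2))
print(f"RMSE over {n / fs:.0f} s: {rmse:.4f}")
```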