Overview
This is the data archive for the paper "Copula-based synthetic data augmentation for machine-learning emulators". It contains the model outputs (see the results folder) and the Singularity image for (optionally) re-running the experiments.
For the Python tool used to generate synthetic data, please refer to Synthia.
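As a rough illustration of the underlying idea (not Synthia's actual API, and not the paper's exact method), a Gaussian copula can be fitted and sampled with plain NumPy/SciPy: transform each variable to normal scores via its ranks, estimate the correlation of those scores, then sample correlated normals and map them back through the empirical quantiles.

```python
# Illustrative Gaussian-copula sketch using only NumPy/SciPy.
# This is a toy for intuition; it does not reflect Synthia's API.
import numpy as np
from scipy import stats

def fit_gaussian_copula(data):
    """Estimate the copula correlation from normal scores of the ranks."""
    n, _ = data.shape
    u = stats.rankdata(data, axis=0) / (n + 1)  # ranks mapped into (0, 1)
    z = stats.norm.ppf(u)                       # normal scores
    return np.corrcoef(z, rowvar=False)

def sample_gaussian_copula(data, corr, n_samples, rng):
    """Draw correlated normals and invert each empirical marginal."""
    d = data.shape[1]
    z = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u = stats.norm.cdf(z)
    # Map uniforms back through each column's empirical quantile function.
    return np.column_stack(
        [np.quantile(data[:, j], u[:, j]) for j in range(d)]
    )

rng = np.random.default_rng(0)
# Toy data: two correlated variables.
data = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)
corr = fit_gaussian_copula(data)
synthetic = sample_gaussian_copula(data, corr, 1000, rng)
```

Because the marginals are inverted empirically, the synthetic samples keep each variable's distribution while the copula reproduces the dependence structure.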
Requirements
Although PBS is not a strict requirement, it is required to run the helper scripts included in this repository. Please note that, depending on your system settings and resource availability, you may need to modify the PBS parameters at the top of the submit scripts stored in the hpc directory (e.g. #PBS -lwalltime=72:00:00).
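For example, a submit-script header might be adapted as follows. These directives are placeholders for illustration, not values taken from this repository; queue names and resource limits depend entirely on your system.

```shell
#PBS -lwalltime=24:00:00          # shorter walltime for a smaller run
#PBS -lselect=1:ncpus=8:mem=32gb  # placeholder resource request
#PBS -q short                     # placeholder: a queue on your system
```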
Usage
To reproduce the results from the experiments described in the paper, first fit all copula models to the reduced NWP-SAF dataset with:
qsub hpc/fit.sh
Then, to generate synthetic data, run all machine learning model configurations, and compute the relevant statistics, use:
qsub hpc/stats.sh
qsub hpc/ml_control.sh
qsub hpc/ml_synth.sh
Finally, to plot all artifacts included in the paper, use:
qsub hpc/plot.sh
Licence
Code released under MIT license. Data from the reduced NWP-SAF dataset released under CC BY 4.0.
Synthetic Data Generation Market Size 2025-2029
The synthetic data generation market size is forecast to increase by USD 4.39 billion, at a CAGR of 61.1% between 2024 and 2029.
The market is experiencing significant growth, driven by the escalating demand for data privacy protection. With increasing concerns over data security and the potential risks associated with using real data, synthetic data is gaining traction as a viable alternative. Furthermore, the deployment of large language models is fueling market expansion, as these models can generate vast amounts of realistic and diverse data, reducing the reliance on real-world data sources.

However, the high costs associated with high-end generative models pose a challenge for market participants. These models require substantial computational resources and expertise to develop and implement effectively. Companies seeking to capitalize on market opportunities must navigate these challenges by investing in research and development to create more cost-effective solutions, or by partnering with specialists in the field.

Overall, the market presents significant potential for innovation and growth, particularly in industries where data privacy is a priority and large language models can be effectively utilized.
What will be the Size of the Synthetic Data Generation Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
The market continues to evolve, driven by the increasing demand for data-driven insights across various sectors. Data processing is a crucial aspect of this market, with a focus on ensuring data integrity, privacy, and security. Privacy-preserving techniques, such as data masking and anonymization, are essential in maintaining confidentiality while enabling data sharing.

Real-time data processing and data simulation are key applications of synthetic data, enabling predictive modeling and data consistency. Data management and workflow automation are integral components of synthetic data platforms, with cloud computing and model deployment facilitating scalability and flexibility. Data governance frameworks and compliance regulations play a significant role in ensuring data quality and security.
Deep learning models, variational autoencoders (VAEs), and neural networks are essential tools for model training and optimization, while API integration and batch data processing streamline the data pipeline. Machine learning models and data visualization provide valuable insights, while edge computing enables data processing at the source. Data augmentation and data transformation are essential techniques for enhancing the quality and quantity of synthetic data. Data warehousing and data analytics provide a centralized platform for managing and deriving insights from large datasets. Synthetic data generation continues to unfold, with ongoing research and development in areas such as federated learning, homomorphic encryption, statistical modeling, and software development.
The market's dynamic nature reflects the evolving needs of businesses and the continuous advancements in data technology.
How is this Synthetic Data Generation Industry segmented?
The synthetic data generation industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in USD million for the period 2025-2029, as well as historical data from 2019-2023, for the following segments.

End-user: Healthcare and life sciences; Retail and e-commerce; Transportation and logistics; IT and telecommunication; BFSI and others
Type: Agent-based modelling; Direct modelling
Application: AI and ML model training; Data privacy; Simulation and testing; Others
Product: Tabular data; Text data; Image and video data; Others
Geography: North America (US, Canada, Mexico); Europe (France, Germany, Italy, UK); APAC (China, India, Japan); Rest of World (ROW)
By End-user Insights
The healthcare and life sciences segment is estimated to witness significant growth during the forecast period. In the rapidly evolving data landscape, the market is gaining significant traction, particularly in the healthcare and life sciences sector. With a growing emphasis on data-driven decision-making and stringent data privacy regulations, synthetic data has emerged as a viable alternative to real data for applications including data processing, data preprocessing, data cleaning, data labeling, data augmentation, and predictive modeling. Medical imaging data, such as MRI scans and X-rays, are essential for diagnosis and treatment planning. However, sharing real patient data for research or for training machine learning algorithms can pose significant privacy risks. Synthetic data generation addresses this challenge by producing realistic medical imaging data, ensuring data privacy while enabling research.
The synthetic data solution market is experiencing robust growth, driven by increasing demand for data privacy compliance (GDPR, CCPA), the need for large, diverse datasets for AI/ML model training, and the rising costs and difficulties associated with obtaining real-world data. The market, currently estimated at $2 billion in 2025, is projected to grow at a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated $12 billion by 2033.

This expansion is fueled by several key trends: the maturation of synthetic data generation techniques, the increasing adoption of cloud-based solutions offering scalability and cost-effectiveness, and the growing recognition of synthetic data's crucial role in overcoming data bias and enhancing model accuracy. Key application areas driving this growth are financial services, where synthetic data helps in fraud detection and risk management, and the retail sector, which benefits from improved customer segmentation and personalized marketing strategies. The medical industry also presents a significant opportunity, with synthetic data enabling the development of innovative diagnostic tools and personalized treatments while protecting patient privacy.

The competitive landscape is dynamic, with established players like Baidu competing alongside innovative startups such as LightWheel AI and Hanyi Innovation Technology. While the North American market currently holds a significant share, the Asia-Pacific region, particularly China and India, is poised for substantial growth due to increasing digitalization and the burgeoning AI market. Challenges remain, however, including the need to ensure the quality and realism of synthetic data and the ongoing development of robust validation and verification methods; overcoming these hurdles will be crucial to unlocking the full potential of this rapidly evolving market.
On-premises solutions are currently more prevalent, but the shift towards cloud-based solutions is expected to accelerate, driven by the benefits of scalability and accessibility.
The synthetic data generation market is experiencing explosive growth, driven by the increasing need for high-quality data in various applications, including AI/ML model training, data privacy compliance, and software testing. The market, currently estimated at $2 billion in 2025, is projected to grow at a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated $10 billion by 2033.

This expansion is fueled by several key factors. Firstly, the rising adoption of artificial intelligence and machine learning across industries demands large, high-quality datasets, often unavailable due to privacy concerns or data scarcity; synthetic data provides a solution by generating realistic, privacy-preserving datasets that mirror real-world data without compromising sensitive information. Secondly, stringent data privacy regulations like GDPR and CCPA are compelling organizations to explore alternative data solutions, making synthetic data a crucial tool for compliance. Finally, advancements in generative AI models and algorithms are improving the quality and realism of synthetic data, expanding its applicability across domains. Major players like Microsoft, Google, and AWS are actively investing in this space, driving further market expansion.

The market segmentation reveals a diverse landscape with numerous specialized solutions. While large technology firms dominate the broader market, smaller, more agile companies are making significant inroads with specialized offerings focused on specific industry needs or data types. The geographical distribution is expected to be skewed towards North America and Europe initially, given the high concentration of technology companies and early adoption of advanced data technologies. However, growing awareness and increasing data needs in other regions are expected to drive substantial market growth in Asia-Pacific and other emerging markets in the coming years.
The competitive landscape is characterized by a mix of established players and innovative startups, leading to continuous innovation and expansion of market applications. This dynamic environment indicates sustained growth in the foreseeable future, driven by an increasing recognition of synthetic data's potential to address critical data challenges across industries.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose: In the context of lung cancer screening, the scarcity of well-labeled medical images poses a significant challenge to implementing supervised deep learning methods. While data augmentation is an effective technique for countering the difficulties caused by insufficient data, it has not been fully explored in the context of lung cancer screening. In this study, we analyzed state-of-the-art (SOTA) data augmentation techniques for binary lung cancer prediction.

Methods: To comprehensively evaluate the efficiency of data augmentation approaches, we considered the nested case-control National Lung Screening Trial (NLST) cohort comprising 253 individuals who had the commonly used CT scans without contrast. The CT scans were pre-processed into three-dimensional volumes based on the lung nodule annotations. We then evaluated five basic (online) and two generative model-based (offline) data augmentation methods with ten SOTA 3D deep learning-based lung cancer prediction models.

Results: The performance improvement from data augmentation was highly dependent on the approach used. The CutMix method yielded the highest average improvement across all three metrics: 1.07%, 3.29%, and 1.19% for accuracy, F1 score, and AUC, respectively. MobileNetV2 with a simple data augmentation approach achieved the best AUC of 0.8719 among all lung cancer predictors, a 7.62% improvement over baseline. Furthermore, the MED-DDPM data augmentation approach improved prediction performance by rebalancing the training set and adding a moderate amount of synthetic data.

Conclusions: The effectiveness of online and offline data augmentation methods was highly sensitive to the prediction model, highlighting the importance of carefully selecting the optimal data augmentation method. Our findings suggest that certain traditional methods can provide more stable and higher performance than SOTA online data augmentation approaches. Overall, these results offer meaningful insights for the development and clinical integration of data-augmented deep learning tools for lung cancer screening.
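For intuition, the CutMix method mentioned above can be sketched in a few lines of NumPy: a random box is cut from one image and pasted into another, and the labels are mixed in proportion to the pasted area. This is a generic 2D illustration, not the study's 3D implementation.

```python
# Minimal 2D CutMix sketch (illustrative only; the study applied
# augmentation to 3D CT volumes, shown here in 2D for brevity).
import numpy as np

def cutmix(img_a, img_b, label_a, label_b, lam, rng):
    """Paste a random box from img_b into img_a; mix labels by area."""
    h, w = img_a.shape[:2]
    # Box side lengths follow the area fraction (1 - lam).
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    # Recompute lambda from the actual (possibly clipped) pasted area.
    lam_adj = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    return mixed, lam_adj * label_a + (1.0 - lam_adj) * label_b

rng = np.random.default_rng(0)
a, b = np.zeros((64, 64)), np.ones((64, 64))
img, lbl = cutmix(a, b, 0.0, 1.0, lam=0.7, rng=rng)
```

With labels 0 and 1, the mixed label equals the fraction of pixels pasted from the second image, which keeps the soft label consistent with the pixel content.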
The global healthcare data collection and labeling market is experiencing robust growth, driven by the increasing adoption of artificial intelligence (AI) and machine learning (ML) in healthcare. The rising volume of patient data generated through electronic health records (EHRs), wearable devices, and medical imaging necessitates efficient and accurate data labeling for training sophisticated AI algorithms. This demand fuels the market's expansion. While precise market sizing figures require further details, a reasonable estimate, considering the current growth trajectory of related AI and healthcare sectors, would place the 2025 market value at approximately $2 billion, with a Compound Annual Growth Rate (CAGR) of 15-20% projected through 2033. Key drivers include the need for improved diagnostic accuracy, personalized medicine, and drug discovery, all heavily reliant on high-quality labeled datasets. Furthermore, regulatory compliance mandates around data privacy and security are indirectly driving the adoption of specialized data collection and labeling services, ensuring data integrity and patient confidentiality. The market is segmented based on data type (imaging, text, sensor data), labeling method (supervised, unsupervised, semi-supervised), service type (data annotation, data augmentation, model training), and end-user (hospitals, pharmaceutical companies, research institutions). Companies like Alegion, Appen, and iMerit are key players, offering a range of services to meet diverse healthcare data needs. However, challenges remain, including data heterogeneity, scalability concerns related to large datasets, and the potential for bias in labeled data. Addressing these challenges requires continuous innovation in data collection methodologies, advanced labeling techniques, and the development of robust quality control measures. 
Future market growth will hinge on the successful integration of advanced technologies like synthetic data generation and automated labeling tools, aiming to reduce costs and accelerate the development of AI-powered healthcare solutions.
MIT License: https://opensource.org/licenses/MIT
Flowchart: https://qiangli.de/imgs/flowchart2%20(1).png
An Explainable Visual Benchmark Dataset for Robustness Evaluation. A Dataset for Image Background Exploration!
Blur Background, Segmented Background, AI-generated Background, Bias of Tools During Annotation, Color in Background, Random Background with Real Environment
Website: XimageNet-12
Here, we try to understand how image backgrounds affect computer vision models on tasks such as detection and classification. Building on the baseline work of Li et al. at ICLR 2022, Explainable AI: Object Recognition With Help From Background, we are enlarging the dataset and analyzing the following topics: blur background, segmented background, AI-generated background, bias of tools during annotation, color in background, dependent factor in background, latent-space distance of foreground, and random background with real environment. Ultimately, we also define a mathematical equation for robustness scores. If you are interested in how we built this, or would like to join the research project, please feel free to collaborate with us!
In this paper, we propose an explainable visual dataset, XIMAGENET-12, to evaluate the robustness of visual models. XIMAGENET-12 consists of over 200K images with 15,410 manual semantic annotations. Specifically, we deliberately selected 12 categories from ImageNet, representing objects commonly encountered in practical life. To simulate real-world situations, we incorporated six diverse scenarios, such as overexposure, blurring, and color changes. We further develop a quantitative criterion for robustness assessment, allowing for a nuanced understanding of how visual models perform under varying conditions, notably in relation to the background.
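The dataset's actual robustness criterion is defined in the authors' publication; as a generic illustration of the idea only (a hypothetical metric, not XIMAGENET-12's definition), one simple score compares a model's accuracy under a perturbed scenario against its accuracy on clean images:

```python
# Hypothetical, generic robustness score -- NOT the XIMAGENET-12
# criterion, which is defined in the authors' paper.
import numpy as np

def robustness_score(clean_correct, perturbed_correct):
    """Ratio of perturbed-scenario accuracy to clean accuracy."""
    acc_clean = np.mean(clean_correct)
    acc_pert = np.mean(perturbed_correct)
    return acc_pert / acc_clean if acc_clean > 0 else 0.0

clean = np.array([1, 1, 1, 0, 1])  # 80% accuracy on clean images
pert = np.array([1, 0, 1, 0, 1])   # 60% under, e.g., blurred backgrounds
score = robustness_score(clean, pert)
```

A score of 1.0 means the perturbation costs the model nothing; lower values indicate greater sensitivity to the background change.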
We employed a combination of tools and methodologies to generate the images in this dataset, ensuring both efficiency and quality in the annotation and synthesis processes.
For a detailed breakdown of our prompt engineering and hyperparameters, we invite you to consult our upcoming paper. This publication will provide comprehensive insights into our methodologies, enabling a deeper understanding of the image generation process.
Description:
The Ornamental Flower Plants dataset is curated to assist in the development of accurate image classification models for various species of ornamental plants. Ideal for machine learning practitioners, researchers, and botanists, this dataset is structured to facilitate training and testing models that focus on plant recognition and classification. It contributes to enhancing biodiversity monitoring, plant taxonomy, and conservation through technological advancements.
Dataset Features
Image Format: The dataset contains high-quality images in JPEG format, all resized to 224×224 pixels to ensure uniformity in model training.
Split Structure:
Training Set: Contains approximately 700 images of ornamental flowers, capturing various angles, lighting conditions, and environments to simulate real-world data variability.
Test Set: Comprises 150 images used for model evaluation, allowing for accurate performance measurement after training.
Categorization: Each image in the dataset is categorized by flower species, ensuring diversity in floral morphology, color, size, and regional varieties.
Context and Importance
In the modern era, flower identification has numerous applications ranging from educational tools to horticultural management. With increasing environmental concerns, this dataset serves as a valuable resource for automating plant recognition systems. Whether integrated into mobile applications for hobbyists or leveraged in large-scale agricultural systems, this dataset can assist in identifying and cataloging plant species effortlessly.
Data Enhancement Opportunities
Extended Variety: Expanding the dataset by adding more species of flowers, including those from rare or endangered categories, would greatly enhance the classification scope.
Additional Metadata: Adding extra information such as blooming seasons, geographic regions, and common uses (e.g., medicinal, decorative) would increase the dataset's applicability across different fields like environmental science and agriculture.
Augmentation Techniques: To further improve model generalization, data augmentation (such as rotating, flipping, and varying brightness) could be applied to create synthetic variations of the original images.
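The augmentation techniques mentioned above can be sketched in plain NumPy (an illustrative example; the dataset itself ships unaugmented 224×224 JPEGs, and the specific transforms and ranges here are assumptions):

```python
# Illustrative image augmentation sketch: random flip, 90-degree
# rotation, and brightness jitter on a NumPy image array.
import numpy as np

def augment(img, rng):
    """Return a randomly flipped, rotated, brightness-jittered copy."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                         # horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4))) # random 90-degree turn
    img = img * rng.uniform(0.8, 1.2)              # brightness jitter
    return np.clip(img, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
out = augment(img, rng)
```

Applying such transforms on the fly during training multiplies the effective variety of a small dataset without storing extra files on disk.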
Inspiration for Model Development
The inspiration behind this dataset is to push the boundaries of ornamental plant identification by building models that outperform existing flower classification tools. The goal is to develop an intuitive, easy-to-use system capable of classifying multiple flower species with high precision. Such systems can be useful in conservation efforts, eco-tourism, educational tools, or gardening aids.
Potential Applications
Mobile Applications: Use the dataset to develop apps that allow users to snap a picture and identify flowers instantly.
Agricultural Systems: Employ the dataset for AI-driven tools that assist farmers in monitoring ornamental plant health and identifying potential threats.
Conservation Efforts: Aid botanists and environmentalists in cataloging and preserving endangered flower species through automated systems.
Conclusion
The Ornamental Flower Plants dataset is an excellent starting point for developing sophisticated image classification models tailored to plant recognition. Its potential to be expanded and its diverse applications across industries make it an invaluable resource for AI practitioners and researchers working on environmental and botanical projects.
This dataset is sourced from Kaggle.