100+ datasets found
  1. Messy data for data cleaning exercise - Dataset - openAFRICA

    • open.africa
    Updated Oct 6, 2021
    Cite
    (2021). Messy data for data cleaning exercise - Dataset - openAFRICA [Dataset]. https://open.africa/dataset/messy-data-for-data-cleaning-exercise
    Dataset updated
    Oct 6, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A messy dataset for demonstrating "how to clean data using spreadsheet". The dataset was intentionally formatted to be messy for the purpose of demonstration. It was collated from https://openafrica.net/dataset/historic-and-projected-rainfall-and-runoff-for-4-lake-victoria-sub-regions

  2. Data Cleaning Tools Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data Cleaning Tools Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-cleaning-tools-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Cleaning Tools Market Outlook



    As of 2023, the global market size for data cleaning tools is estimated at $2.5 billion, with projections indicating that it will reach approximately $7.1 billion by 2032, reflecting a robust CAGR of 12.1% during the forecast period. This growth is primarily driven by the increasing importance of data quality in business intelligence and analytics workflows across various industries.
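The figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The sketch below is only an illustration: it assumes the growth compounds over the nine years from 2023 to 2032, which the report does not state explicitly, and the function names are hypothetical.

```python
def implied_cagr(start_value, end_value, years):
    """Return the constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Compound start_value at a fixed annual rate for the given number of years."""
    return start_value * (1 + rate) ** years


# Figures taken from the description: $2.5 billion (2023), $7.1 billion (2032), 12.1% CAGR.
# Assumption: nine compounding years between the two endpoints.
years = 2032 - 2023
rate = implied_cagr(2.5, 7.1, years)
projected = project(2.5, 0.121, years)

print(f"Implied CAGR: {rate:.1%}")
print(f"Value projected at 12.1%: ${projected:.2f} billion")
```

Under that assumption the implied rate and the stated rate should land close to each other; a large gap would suggest that the report uses a different base year.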



    The growth of the data cleaning tools market can be attributed to several critical factors. Firstly, the exponential increase in data generation across industries necessitates efficient tools to manage data quality. Poor data quality can result in significant financial losses, inefficient business processes, and faulty decision-making. Organizations recognize the value of clean, accurate data in driving business insights and operational efficiency, thereby propelling the adoption of data cleaning tools. Additionally, regulatory requirements and compliance standards also push companies to maintain high data quality standards, further driving market growth.



    Another significant growth factor is the rising adoption of AI and machine learning technologies. These advanced technologies rely heavily on high-quality data to deliver accurate results. Data cleaning tools play a crucial role in preparing datasets for AI and machine learning models, ensuring that the data is free from errors, inconsistencies, and redundancies. This surge in the use of AI and machine learning across various sectors like healthcare, finance, and retail is driving the demand for efficient data cleaning solutions.



    The proliferation of big data analytics is another critical factor contributing to market growth. Big data analytics enables organizations to uncover hidden patterns, correlations, and insights from large datasets. However, the effectiveness of big data analytics is contingent upon the quality of the data being analyzed. Data cleaning tools help in sanitizing large datasets, making them suitable for analysis and thus enhancing the accuracy and reliability of analytics outcomes. This trend is expected to continue, fueling the demand for data cleaning tools.



    In terms of regional growth, North America holds a dominant position in the data cleaning tools market. The region's strong technological infrastructure, coupled with the presence of major market players and a high adoption rate of advanced data management solutions, contributes to its leadership. However, the Asia Pacific region is anticipated to witness the highest growth rate during the forecast period. The rapid digitization of businesses, increasing investments in IT infrastructure, and a growing focus on data-driven decision-making are key factors driving the market in this region.



    As organizations strive to maintain high data quality standards, the role of an Email List Cleaning Service becomes increasingly vital. These services ensure that email databases are free from invalid addresses, duplicates, and outdated information, thereby enhancing the effectiveness of marketing campaigns and communications. By leveraging sophisticated algorithms and validation techniques, email list cleaning services help businesses improve their email deliverability rates and reduce the risk of being flagged as spam. This not only optimizes marketing efforts but also protects the reputation of the sender. As a result, the demand for such services is expected to grow alongside the broader data cleaning tools market, as companies recognize the importance of maintaining clean and accurate contact lists.



    Component Analysis



    The data cleaning tools market can be segmented by component into software and services. The software segment encompasses various tools and platforms designed for data cleaning, while the services segment includes consultancy, implementation, and maintenance services provided by vendors.



    The software segment holds the largest market share and is expected to continue leading during the forecast period. This dominance can be attributed to the increasing adoption of automated data cleaning solutions that offer high efficiency and accuracy. These software solutions are equipped with advanced algorithms and functionalities that can handle large volumes of data, identify errors, and correct them without manual intervention. The rising adoption of cloud-based data cleaning software further bolsters this segment, as it offers scalability and ease of

  3. Data Cleaning Sample

    • borealisdata.ca
    • dataone.org
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

  4. Data Cleansing Software Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data Cleansing Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-cleansing-software-market
    Available download formats: pdf, csv, pptx
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Cleansing Software Market Outlook



    The global data cleansing software market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach around USD 4.2 billion by 2032, exhibiting a compound annual growth rate (CAGR) of 12.5% during the forecast period. This substantial growth can be attributed to the increasing importance of maintaining clean and reliable data for business intelligence and analytics, which are driving the adoption of data cleansing solutions across various industries.



    The proliferation of big data and the growing emphasis on data-driven decision-making are significant growth factors for the data cleansing software market. As organizations collect vast amounts of data from multiple sources, ensuring that this data is accurate, consistent, and complete becomes critical for deriving actionable insights. Data cleansing software helps organizations eliminate inaccuracies, inconsistencies, and redundancies, thereby enhancing the quality of their data and improving overall operational efficiency. Additionally, the rising adoption of advanced analytics and artificial intelligence (AI) technologies further fuels the demand for data cleansing software, as clean data is essential for the accuracy and reliability of these technologies.



    Another key driver of market growth is the increasing regulatory pressure for data compliance and governance. Governments and regulatory bodies across the globe are implementing stringent data protection regulations, such as the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) in the United States. These regulations mandate organizations to ensure the accuracy and security of the personal data they handle. Data cleansing software assists organizations in complying with these regulations by identifying and rectifying inaccuracies in their data repositories, thus minimizing the risk of non-compliance and hefty penalties.



    The growing trend of digital transformation across various industries also contributes to the expanding data cleansing software market. As businesses transition to digital platforms, they generate and accumulate enormous volumes of data. To derive meaningful insights and maintain a competitive edge, it is imperative for organizations to maintain high-quality data. Data cleansing software plays a pivotal role in this process by enabling organizations to streamline their data management practices and ensure the integrity of their data. Furthermore, the increasing adoption of cloud-based solutions provides additional impetus to the market, as cloud platforms facilitate seamless integration and scalability of data cleansing tools.



    Regionally, North America holds a dominant position in the data cleansing software market, driven by the presence of numerous technology giants and the rapid adoption of advanced data management solutions. The region is expected to continue its dominance during the forecast period, supported by the strong emphasis on data quality and compliance. Europe is also a significant market, with countries like Germany, the UK, and France showing substantial demand for data cleansing solutions. The Asia Pacific region is poised for significant growth, fueled by the increasing digitalization of businesses and the rising awareness of data quality's importance. Emerging economies in Latin America and the Middle East & Africa are also expected to witness steady growth, driven by the growing adoption of data-driven technologies.



    The role of Data Quality Tools cannot be overstated in the context of data cleansing software. These tools are integral in ensuring that the data being processed is not only clean but also of high quality, which is crucial for accurate analytics and decision-making. Data Quality Tools help in profiling, monitoring, and cleansing data, thereby ensuring that organizations can trust their data for strategic decisions. As organizations increasingly rely on data-driven insights, the demand for robust Data Quality Tools is expected to rise. These tools offer functionalities such as data validation, standardization, and enrichment, which are essential for maintaining the integrity of data across various platforms and applications. The integration of these tools with data cleansing software enhances the overall data management capabilities of organizations, enabling them to achieve greater operational efficiency and compliance with data regulations.



    Component Analysis



    The data cle

  5. Coresignal | Clean Data | Company Data | AI-Enriched Datasets | Global /...

    • datarade.ai
    .json, .csv
    + more versions
    Cite
    Coresignal, Coresignal | Clean Data | Company Data | AI-Enriched Datasets | Global / 35M+ Records / Updated Weekly [Dataset]. https://datarade.ai/data-products/coresignal-clean-data-company-data-ai-enriched-datasets-coresignal
    Available download formats: .json, .csv
    Dataset authored and provided by
    Coresignal
    Area covered
    Hungary, Guatemala, Guinea-Bissau, Guadeloupe, Niue, Panama, Namibia, Saint Barthélemy, Chile, Andorra
    Description

    This clean dataset is a refined version of our company datasets, consisting of 35M+ data records.

    It’s an excellent data solution for companies with limited data engineering capabilities and those who want to reduce their time to value. You get filtered, cleaned, unified, and standardized B2B data. After cleaning, this data is also enriched by leveraging a carefully instructed large language model (LLM).

    AI-powered data enrichment offers more accurate information in key data fields, such as company descriptions. It also produces over 20 additional data points that are very valuable to B2B businesses. Enhancing and highlighting the most important information in web data contributes to quicker time to value, making data processing much faster and easier.

    For your convenience, you can choose from multiple data formats (Parquet, JSON, JSONL, or CSV) and select suitable delivery frequency (quarterly, monthly, or weekly).

    Coresignal is a leading public business data provider in the web data sphere with an extensive focus on firmographic data and public employee profiles. More than 3B data records in different categories enable companies to build data-driven products and generate actionable insights. Coresignal is exceptional in terms of data freshness, with 890M+ records updated monthly for unprecedented accuracy and relevance.

  6. Clean Data.csv

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    txt
    Updated Dec 3, 2023
    Cite
    Zaid Hattab (2023). Clean Data.csv [Dataset]. http://doi.org/10.6084/m9.figshare.24718401.v1
    Available download formats: txt
    Dataset updated
    Dec 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Zaid Hattab
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This subset of the Oregon Health Insurance Experiment (OHIE) data contains 12,229 individuals who satisfied the inclusion criteria and responded to the in-person survey by October 2010. It has been used to explore the heterogeneity of the effects of the lottery and of insurance on a number of outcomes.

  7. clean data

    • kaggle.com
    Updated Aug 16, 2023
    Cite
    ACHEMLAL YOUSRA (2023). clean data [Dataset]. https://www.kaggle.com/datasets/achemlalyousra/clean-data/suggestions
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ACHEMLAL YOUSRA
    Description

    Dataset

    This dataset was created by ACHEMLAL YOUSRA

    Contents

  8. Jyutping Project - Raw Data and Clean Data

    • rdr.ucl.ac.uk
    application/csv
    Updated Aug 19, 2024
    + more versions
    Cite
    Joseph Lam (2024). Jyutping Project - Raw Data and Clean Data [Dataset]. http://doi.org/10.5522/04/26504347.v1
    Available download formats: application/csv
    Dataset updated
    Aug 19, 2024
    Dataset provided by
    University College London
    Authors
    Joseph Lam
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Raw and clean data for the Jyutping project, submitted to the International Journal of Epidemiology. All data were openly available at the time of scraping. I retained only Chinese names and Hong Kong Government romanised English names. This project aims to describe the problem of non-standardised romanisation and its impact on data linkage. The included data allow researchers to replicate my process of extracting Jyutping and Pinyin from Chinese characters. A fair amount of manual screening and reviewing was required, so the code itself is not fully automated. The code is stored on my personal GitHub (https://github.com/Jo-Lam/Jyutping_project/tree/main). Please cite this data resource: doi:10.5522/04/26504347

  9. alpaca-cleaned

    • huggingface.co
    Updated Mar 30, 2023
    + more versions
    Cite
    Gene Ruebsamen (2023). alpaca-cleaned [Dataset]. https://huggingface.co/datasets/yahma/alpaca-cleaned
    Dataset updated
    Mar 30, 2023
    Authors
    Gene Ruebsamen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for Alpaca-Cleaned

    Repository: https://github.com/gururise/AlpacaDataCleaned

      Dataset Description
    

    This is a cleaned version of the original Alpaca Dataset released by Stanford. The following issues have been identified in the original release and fixed in this dataset:

    Hallucinations: Many instructions in the original dataset had instructions referencing data on the internet, which just caused GPT3 to hallucinate an answer.

    "instruction":"Summarize the… See the full description on the dataset page: https://huggingface.co/datasets/yahma/alpaca-cleaned.

  10. NSF/NCAR C-130 CN Clean Data

    • data.ucar.edu
    • ckanprod.ucar.edu
    ascii
    Updated Aug 1, 2025
    Cite
    Antony D. Clarke (2025). NSF/NCAR C-130 CN Clean Data [Dataset]. http://doi.org/10.26023/FM4V-3BER-7X0J
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    Antony D. Clarke
    Time period covered
    Oct 31, 1995 - Dec 23, 1995
    Description

    Condensation Nuclei (CN) data collected by the University of Hawaii group (Clarke) in ACE1. All of the variables are average values for 15 second intervals. This dataset is a composite of all of the clean data files.

  11. Equity in Healthcare Clean DataSets

    • kaggle.com
    Updated Feb 21, 2024
    Cite
    Anopsy (2024). Equity in Healthcare Clean DataSets [Dataset]. https://www.kaggle.com/datasets/anopsy/equity-in-healthcare-clean-datasets
    Dataset updated
    Feb 21, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anopsy
    Description

    This dataset is based on train and test dataset from this competition: https://www.kaggle.com/competitions/widsdatathon2024-challenge1 .

    What did I change?
    1. I dropped 2 columns that contained too little data.
    2. Using machine learning, I imputed "payer_type", "patient_race" and "bmi".
    3. Using "patient_zip3", I filled missing values in "patient_state", "Region" and "Division".
    4. Using SimpleImputer, I imputed the few missing numeric values in "Ozone", "PM2.5" and other columns.
    5. I created some new features, based on demographic features, that may be a bit more informative.
    6. I tokenized the 'breast_cancer_diagnosis_desc' column.

    If you're interested in how I did that, check these notebooks: https://www.kaggle.com/code/anopsy/ml-for-missing-values for "bmi"; for the new features, see https://www.kaggle.com/code/anopsy/fe-and-xgb-on-clean-data
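As an illustration of the third change listed above (filling "patient_state" from "patient_zip3"), here is a minimal sketch in plain Python. It is not the author's code, which lives in the linked notebooks. The helper name, the majority-vote rule, and the sample rows are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def fill_state_from_zip3(rows):
    """Fill missing 'patient_state' values using the most common state seen for each zip3."""
    # Count which states appear alongside each zip3 prefix in the rows where state is known.
    votes = defaultdict(Counter)
    for row in rows:
        if row.get("patient_state") is not None:
            votes[row["patient_zip3"]][row["patient_state"]] += 1

    # Keep the most frequent state per zip3 as the lookup value.
    lookup = {zip3: counts.most_common(1)[0][0] for zip3, counts in votes.items()}

    # Return copies so the input rows stay untouched.
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row.get("patient_state") is None:
            # Rows whose zip3 never appears with a known state remain None.
            new_row["patient_state"] = lookup.get(new_row["patient_zip3"])
        filled.append(new_row)
    return filled


# Hypothetical sample rows, not taken from the dataset.
rows = [
    {"patient_zip3": "100", "patient_state": "NY"},
    {"patient_zip3": "100", "patient_state": None},
    {"patient_zip3": "941", "patient_state": "CA"},
    {"patient_zip3": "606", "patient_state": None},
]
filled = fill_state_from_zip3(rows)
```

The same lookup idea would extend to "Region" and "Division". A zip3 prefix that never co-occurs with a known state is left missing rather than guessed.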

    According to the description of the original dataset, it's a "39k record dataset (split into training and test sets) representing patients and their characteristics (age, race, BMI, zip code), their diagnosis and treatment information (breast cancer diagnosis code, metastatic cancer diagnosis code, metastatic cancer treatments, … etc.), their geo (zip-code level) demographic data (income, education, rent, race, poverty, …etc), as well as toxic air quality data (Ozone, PM25 and NO2)."

  12. Clean data (script number 4)

    • figshare.com
    bin
    Updated May 22, 2025
    Cite
    Violeta Berdejo Espinola (2025). Clean data (script number 4) [Dataset]. http://doi.org/10.6084/m9.figshare.29036840.v4
    Available download formats: bin
    Dataset updated
    May 22, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Violeta Berdejo Espinola
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw labelled data used for analysis

  13. The fasta clean data

    • figshare.com
    application/x-rar
    Updated Jul 29, 2021
    Cite
    Lifeng Zhu (2021). The fasta clean data [Dataset]. http://doi.org/10.6084/m9.figshare.15073920.v1
    Available download formats: application/x-rar
    Dataset updated
    Jul 29, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Lifeng Zhu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The clean dataset of 16S rRNA gene sequences

  14. Clean Data Input for HHLocation Study

    • dataverse.harvard.edu
    tsv
    Updated Dec 7, 2015
    Cite
    Harvard Dataverse (2015). Clean Data Input for HHLocation Study [Dataset]. http://doi.org/10.7910/DVN/QDEMVF
    Available download formats: tsv (372781372), tsv (130402754)
    Dataset updated
    Dec 7, 2015
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is the cleaned household location data for the reproducible HHLocation case study

  15. salami-processed-enriched-clean-data-trunc

    • huggingface.co
    Updated Jan 29, 2025
    + more versions
    Cite
    Taufiq Syed (2025). salami-processed-enriched-clean-data-trunc [Dataset]. https://huggingface.co/datasets/taufiqsyed/salami-processed-enriched-clean-data-trunc
    Dataset updated
    Jan 29, 2025
    Authors
    Taufiq Syed
    Description

    taufiqsyed/salami-processed-enriched-clean-data-trunc dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. Marambio DMPS clean data

    • zenodo.org
    tar
    Updated Jan 12, 2022
    Cite
    German Perez Fogwill; German Perez Fogwill (2022). Marambio DMPS clean data [Dataset]. http://doi.org/10.5281/zenodo.5494515
    Available download formats: tar
    Dataset updated
    Jan 12, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    German Perez Fogwill; German Perez Fogwill
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Des Moines Independent Community School District
    Description

    DMPS data inverted with Matlab

  17. CLEAN DATA

    • kaggle.com
    Updated Apr 19, 2024
    Cite
    Md Minhazul Islam (2024). CLEAN DATA [Dataset]. https://www.kaggle.com/datasets/minannu/clean-data/code
    Dataset updated
    Apr 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Md Minhazul Islam
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Tanbeer Jubaer

    Released under Apache 2.0

    Contents

  18. Peru Imports from Ireland of Dish Washing Machines, Machinery for Cleaning

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Apr 16, 2024
    Cite
    TRADING ECONOMICS (2024). Peru Imports from Ireland of Dish Washing Machines, Machinery for Cleaning [Dataset]. https://tradingeconomics.com/peru/imports/ireland/machines-dishwash-clean-control-fill-packing
    Available download formats: excel, json, xml, csv
    Dataset updated
    Apr 16, 2024
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Dec 31, 2025
    Area covered
    Peru
    Description

    Peru Imports from Ireland of Dish Washing Machines, Machinery for Cleaning was US$244 during 2024, according to the United Nations COMTRADE database on international trade. Peru Imports from Ireland of Dish Washing Machines, Machinery for Cleaning - data, historical chart and statistics - was last updated in August 2025.

  19. clean data & vizualitation of data

    • kaggle.com
    Updated Apr 23, 2024
    Cite
    Joshua Richards (2024). clean data & vizualitation of data [Dataset]. https://www.kaggle.com/datasets/jermain119/clean-data-and-vizualitation-of-data
    Dataset updated
    Apr 23, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Joshua Richards
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Joshua Richards

    Released under Apache 2.0

    Contents

  20. Untitled Item

    • figshare.com
    xlsx
    Updated Apr 29, 2024
    + more versions
    Cite
    Yingzhe Xiong (2024). Untitled Item [Dataset]. http://doi.org/10.6084/m9.figshare.25712202.v3
    Available download formats: xlsx
    Dataset updated
    Apr 29, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Yingzhe Xiong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    source data
