77 datasets found
  1. Data Cleaning Sample

    • borealisdata.ca
    • dataone.org
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

  2. Employee data cleaning exercise

    • kaggle.com
    zip
    Updated May 29, 2024
    Cite
    Davin Shaun (2024). Employee data cleaning exercise [Dataset]. https://www.kaggle.com/datasets/davinshaun/employee-data-cleaning-exercise
    Explore at:
    Available download formats: zip (29608 bytes)
    Dataset updated
    May 29, 2024
    Authors
    Davin Shaun
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Davin Shaun

    Released under MIT

    Contents

  3. Yield Data Cleaning Software Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Yield Data Cleaning Software Market Research Report 2033 [Dataset]. https://dataintelo.com/report/yield-data-cleaning-software-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Yield Data Cleaning Software Market Outlook



    According to our latest research, the global Yield Data Cleaning Software market size in 2024 stands at USD 1.14 billion, with a robust compound annual growth rate (CAGR) of 13.2% expected from 2025 to 2033. By the end of 2033, the market is forecasted to reach USD 3.42 billion. This remarkable market expansion is being driven by the increasing adoption of precision agriculture technologies, the proliferation of big data analytics in farming, and the rising need for accurate, real-time agricultural data to optimize yields and resource efficiency.
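
    As a quick arithmetic check on the figures above, the following Python sketch compounds the 2024 base at the stated CAGR and also derives the growth rate implied by the two endpoint values. It assumes nine compounding years (2024 to 2033); the function and variable names are illustrative and do not come from the report.

```python
def project(base: float, rate: float, years: int) -> float:
    """Compound `base` at annual `rate` for `years` years."""
    return base * (1.0 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the description (USD billion).
BASE_2024 = 1.14
FORECAST_2033 = 3.42
STATED_CAGR = 0.132
YEARS = 2033 - 2024  # assumption: nine compounding periods

projected = project(BASE_2024, STATED_CAGR, YEARS)
implied = implied_cagr(BASE_2024, FORECAST_2033, YEARS)

print(f"Projected 2033 value at the stated CAGR: {projected:.2f}")
print(f"CAGR implied by the two endpoints: {implied:.1%}")
```

    Under this assumption the projection comes out slightly above the quoted USD 3.42 billion, and the endpoint-implied rate slightly below 13.2%. A gap of that size could come from rounding or from a different base year in the report.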




    One of the primary growth factors fueling the Yield Data Cleaning Software market is the rapid digital transformation within the agriculture sector. The integration of advanced sensors, IoT devices, and GPS-enabled machinery has led to an exponential increase in the volume of raw agricultural data generated on farms. However, this data often contains inconsistencies, errors, and redundancies due to equipment malfunctions, environmental factors, and human error. Yield Data Cleaning Software plays a critical role by automating the cleansing, validation, and normalization of such datasets, ensuring that only high-quality, actionable information is used for decision-making. As a result, farmers and agribusinesses can make more informed choices, leading to improved crop yields, efficient resource allocation, and reduced operational costs.




    Another significant driver is the growing emphasis on sustainable agriculture and environmental stewardship. Governments and regulatory bodies across the globe are increasingly mandating the adoption of data-driven practices to minimize the environmental impact of farming activities. Yield Data Cleaning Software enables stakeholders to monitor and analyze field performance accurately, track input usage, and comply with sustainability standards. Moreover, the software’s ability to integrate seamlessly with farm management platforms and analytics tools enhances its value proposition. This trend is further bolstered by the rising demand for traceability and transparency in the food supply chain, compelling agribusinesses to invest in robust data management solutions.




    The market is also witnessing substantial investments from technology providers, venture capitalists, and agricultural equipment manufacturers. Strategic partnerships and collaborations are becoming commonplace, with companies seeking to enhance their product offerings and expand their geographical footprint. The increasing awareness among farmers about the benefits of data accuracy and the availability of user-friendly, customizable software solutions are further accelerating market growth. Additionally, ongoing advancements in artificial intelligence (AI) and machine learning (ML) are enabling more sophisticated data cleaning algorithms, which can handle larger datasets and deliver deeper insights, thereby expanding the market’s potential applications.




    Regionally, North America continues to dominate the Yield Data Cleaning Software market, supported by its advanced agricultural infrastructure, high rate of technology adoption, and significant investments in agri-tech startups. Europe follows closely, driven by stringent environmental regulations and a strong focus on sustainable farming practices. The Asia Pacific region is emerging as a high-growth market, fueled by the rapid modernization of agriculture, government initiatives to boost food security, and increasing awareness among farmers about the benefits of digital solutions. Latin America and the Middle East & Africa are also showing promising growth trajectories, albeit from a smaller base, as they gradually embrace precision agriculture technologies.



    Component Analysis



    The Yield Data Cleaning Software market is bifurcated by component into Software and Services. The software segment currently accounts for the largest share of the market, underpinned by the increasing adoption of integrated farm management solutions and the demand for user-friendly platforms that can seamlessly process vast amounts of agricultural data. Modern yield data cleaning software solutions are equipped with advanced algorithms capable of detecting and rectifying data anomalies, thus ensuring the integrity and reliability of yield datasets. As the complexity of agricultural operations grows, the need for scalable, customizable software that can adapt to

  4. Labor Force Survey, LFS 2006 - Egypt

    • erfdataportal.com
    Updated Feb 5, 2023
    Cite
    Central Agency For Public Mobilization And Statistics (2023). Labor Force Survey, LFS 2006 - Egypt [Dataset]. https://www.erfdataportal.com/index.php/catalog/146
    Dataset updated
    Feb 5, 2023
    Dataset provided by
    Central Agency for Public Mobilization and Statistics https://www.capmas.gov.eg/
    Economic Research Forum
    Time period covered
    2006
    Area covered
    Egypt
    Description

    Abstract

    THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE CENTRAL AGENCY FOR PUBLIC MOBILIZATION AND STATISTICS (CAPMAS)

    In any society, the human element is the basis of the work force that carries out all service and production activities. It is therefore essential to produce labor force statistics and studies on the growth of manpower and the distribution of the labor force by type and characteristics.

    In this context, the Central Agency for Public Mobilization and Statistics conducts the "Quarterly Labor Force Survey", which includes data on the size of manpower and the labor force (employed and unemployed) and their geographical distribution by characteristics.

    By the end of each year, CAPMAS issues the annual aggregated labor force bulletin publication that includes the results of the quarterly survey rounds that represent the manpower and labor force characteristics during the year.

    ----> Historical Review of the Labor Force Survey:

    1- The first Labor Force Survey was undertaken in 1957. The first round was conducted in November of that year, and the survey has continued in successive rounds (quarterly, bi-annually, or annually) to the present.

    2- Starting with the October 2006 round, the fieldwork of the labor force survey was developed to focus on the following two points: a. Using the panel sample, which is part of the survey sample, to monitor dynamic changes in the labor market. b. Improving the questionnaire to include more questions that help to better define each household member's relationship to the labor force (employed, unemployed, out of labor force, etc.), and re-ordering some of the existing questions more logically.

    3- Starting with the January 2008 round, the methodology was developed to collect a more representative sample during the survey year. This is done by distributing the sample of each governorate into five groups; the questionnaires are collected from each group separately every 15 days for 3 months (in the middle and at the end of the month).

    ----> The survey aims at covering the following topics:

    1- Measuring the size of the Egyptian labor force among civilians (for all governorates of the republic) by their different characteristics.

    2- Measuring the employment rate at national level and different geographical areas.

    3- Measuring the distribution of employed people by the following characteristics: gender, age, educational status, occupation, economic activity, and sector.

    4- Measuring unemployment rate at different geographic areas.

    5- Measuring the distribution of unemployed people by the following characteristics: gender, age, educational status, unemployment type "ever employed/never employed", occupation, economic activity, and sector for people who have ever worked.

    The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum as part of a major project that started in 2009, during which extensive efforts were made to acquire, clean, harmonize, preserve, and disseminate micro data from existing labor force surveys in several Arab countries.

    Geographic coverage

    Covering a sample of urban and rural areas in all the governorates.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey covered a national sample of households and all individuals permanently residing in surveyed households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Sample Design and Selection

    The sample of the LFS 2006 survey is a simple systematic random sample.

    ----> Sample Size

    The sample size varied by quarter: Q1 = 19,429, Q2 = 19,419, Q3 = 19,119, and Q4 = 18,835 households, for an annual total of 76,802 households. These households are distributed at the governorate level (urban/rural).

    A more detailed description of the different sampling stages and allocation of sample across governorates is provided in the Methodology document available among external resources in Arabic.
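
    For readers unfamiliar with the term, systematic random sampling picks a random starting point and then takes every k-th unit from an ordered list. The Python sketch below illustrates the general technique only. The frame of household IDs, the function name, and the fractional-interval variant are assumptions and do not describe the actual CAPMAS procedure. The last lines check that the quarterly sample sizes quoted above add up to the annual total.

```python
import random


def systematic_sample(frame, n, rng=None):
    """Draw n units from an ordered frame using a random start and a fixed interval."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and len(frame)")
    rng = rng or random.Random()
    step = len(frame) / n          # sampling interval (may be fractional)
    start = rng.random() * step    # random start within the first interval
    last = len(frame) - 1
    # Clamp the index to guard against floating-point rounding at the upper edge.
    return [frame[min(int(start + i * step), last)] for i in range(n)]


# Hypothetical frame of 1,000 household IDs; select 10 of them.
households = list(range(1, 1001))
sample = systematic_sample(households, 10, random.Random(2006))

# Quarterly household counts quoted in the text.
quarterly = {"Q1": 19429, "Q2": 19419, "Q3": 19119, "Q4": 18835}
annual_total = sum(quarterly.values())

print(sample)
print(annual_total)
```

    The quarterly counts sum to 76,802, which matches the annual total stated in the text.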

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire design follows the latest International Labor Organization (ILO) concepts and definitions of labor force, employment, and unemployment.

    The questionnaire comprises 3 tables in addition to the identification and geographic data of household on the cover page.

    ----> Table 1- Demographic and employment characteristics and basic data for all household individuals

    Including: gender, age, educational status, marital status, residence mobility and current work status

    ----> Table 2- Employment characteristics table

    This table is filled in for individuals who were employed at the time of the survey or who were engaged in work during the reference week, and provides information on:
    - Relationship to employer: employer, self-employed, waged worker, and unpaid family worker
    - Economic activity
    - Sector
    - Occupation
    - Effective working hours
    - Work place
    - Average monthly wage

    ----> Table 3- Unemployment characteristics table

    This table is filled in for all unemployed individuals who satisfied the unemployment criteria, and provides information on:
    - Type of unemployment (unemployed, unemployed ever worked)
    - Economic activity and occupation in the last job held before becoming unemployed
    - Last unemployment duration in months
    - Main reason for unemployment

    Cleaning operations

    ----> Raw Data

    Office editing is one of the main stages of the survey. It started once the questionnaires were received from the field and was carried out by the selected work groups. It includes: a- Editing of coverage and completeness. b- Editing of consistency.

    ----> Harmonized Data

    • STATA is used to clean the datasets and SPSS is used to harmonize them.
    • The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.
    • All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.
    • A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.
    • A post-harmonization cleaning process is then conducted on the data.
    • Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.
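
    The workflow in the list above (clean, merge to the individual level, recode and rename into harmonized variables, then clean again) can be sketched in a few lines. The Economic Research Forum's programs are written in STATA and SPSS and are not shown in the source. The Python below is only an illustration, and every variable name, code value, and mapping in it is hypothetical.

```python
# Hypothetical raw files: one record per individual, keyed by household and person ID.
demographics = [
    {"hh": 1, "pid": 1, "SEX": "1", "AGE": " 34"},
    {"hh": 1, "pid": 2, "SEX": "2", "AGE": "29 "},
    {"hh": 2, "pid": 1, "SEX": "2", "AGE": "-9"},   # -9 used here as a missing code
]
employment = [
    {"hh": 1, "pid": 1, "WRKSTAT": "1"},
    {"hh": 1, "pid": 2, "WRKSTAT": "2"},
    {"hh": 2, "pid": 1, "WRKSTAT": "3"},
]

# Hypothetical country-specific rename and recode maps.
RENAME = {"SEX": "sex", "AGE": "age", "WRKSTAT": "labor_status"}
RECODE = {
    "sex": {"1": "male", "2": "female"},
    "labor_status": {"1": "employed", "2": "unemployed", "3": "out of labor force"},
}


def clean(rows):
    """Step 1: strip stray whitespace from every string value."""
    return [{k: v.strip() if isinstance(v, str) else v for k, v in r.items()} for r in rows]


def merge(*files):
    """Step 2: merge cleaned files into one individual-level record per (hh, pid)."""
    merged = {}
    for rows in files:
        for r in rows:
            merged.setdefault((r["hh"], r["pid"]), {}).update(r)
    return [merged[key] for key in sorted(merged)]


def harmonize(row):
    """Step 3: rename variables and recode values to the harmonized scheme."""
    out = {"hh": row["hh"], "pid": row["pid"]}
    for raw_name, new_name in RENAME.items():
        value = row.get(raw_name)
        out[new_name] = RECODE.get(new_name, {}).get(value, value)
    return out


def post_clean(row):
    """Step 4: convert age to an integer and blank out implausible values."""
    age = int(row["age"])
    row["age"] = age if 0 <= age <= 120 else None
    return row


individuals = [post_clean(harmonize(r)) for r in merge(clean(demographics), clean(employment))]
households = sorted({r["hh"] for r in individuals})  # household-level view of the same data
print(individuals)
```

    Keeping each step as its own function mirrors the staged process described in the list and makes it easy to rerun a single stage when a raw file changes.
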
  5. Movies Data 1916 - 2016

    • kaggle.com
    zip
    Updated Jan 30, 2019
    Cite
    Ashish (2019). Movies Data 1916 - 2016 [Dataset]. https://www.kaggle.com/datasets/ashydv/movies-data-1916-2016
    Explore at:
    Available download formats: zip (567534 bytes)
    Dataset updated
    Jan 30, 2019
    Authors
    Ashish
    Description

    Dataset

    This dataset was created by Ashish

    Contents

  6. Slovenia - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Feb 23, 2021
    Cite
    TRADING ECONOMICS (2021). Slovenia - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/slovenia/service-producer-prices-cleaning-activities-eurostat-data.html
    Explore at:
    Available download formats: excel, csv, json, xml
    Dataset updated
    Feb 23, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Slovenia
    Description

    Slovenia - Service producer prices: Cleaning activities was 135.20 points in March 2024, according to Eurostat. Trading Economics provides the current actual value, a historical data chart, and related indicators for Slovenia - Service producer prices: Cleaning activities, last updated from Eurostat in December 2025. Historically, Slovenia - Service producer prices: Cleaning activities reached a record high of 135.20 points in March 2024 and a record low of 64.70 points in June 2006.

  7. Surveys of Data Professionals (Alex the Analyst)

    • kaggle.com
    zip
    Updated Nov 27, 2023
    Cite
    Stewie (2023). Surveys of Data Professionals (Alex the Analyst) [Dataset]. https://www.kaggle.com/datasets/alexenderjunior/surveys-of-data-professionals-alex-the-analyst
    Explore at:
    Available download formats: zip (81050 bytes)
    Dataset updated
    Nov 27, 2023
    Authors
    Stewie
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    [Dataset Name] - About This Dataset

    Overview

    This dataset is used in a data cleaning project based on the raw data from Alex the Analyst's Power BI tutorial series. The original dataset can be found here.

    Context

    The dataset is employed in a mini project that involves cleaning and preparing data for analysis. It is part of a series of exercises aimed at enhancing skills in data cleaning using Pandas.

    Content

    The dataset contains information related to [provide a brief description of the data, e.g., sales, customer information, etc.]. The columns cover various aspects such as [list key columns and their meanings].

    Acknowledgements

    The original dataset is sourced from Alex the Analyst's Power BI tutorial series. Special thanks to [provide credit or acknowledgment] for making the dataset available.

    Citation

    If you use this dataset in your work, please cite it as follows:

    How to Use

    1. Download the dataset from this link.
    2. Explore the Jupyter Notebook in the associated repository for insights into the data cleaning process.

    Feel free to reach out for any additional information or clarification. Happy analyzing!

  8. Euro Area - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Aug 18, 2021
    Cite
    TRADING ECONOMICS (2021). Euro Area - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/euro-area/service-producer-prices-cleaning-activities-eurostat-data.html
    Explore at:
    Available download formats: xml, json, excel, csv
    Dataset updated
    Aug 18, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Euro Area
    Description

    Euro Area - Service producer prices: Cleaning activities was 118.60 points in September 2023, according to Eurostat. Trading Economics provides the current actual value, a historical data chart, and related indicators for Euro Area - Service producer prices: Cleaning activities, last updated from Eurostat in November 2025. Historically, Euro Area - Service producer prices: Cleaning activities reached a record high of 118.60 points in September 2023 and a record low of 86.70 points in March 2006.

  9. Denmark - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Apr 1, 2021
    Cite
    TRADING ECONOMICS (2021). Denmark - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/denmark/service-producer-prices-cleaning-activities-eurostat-data.html
    Explore at:
    Available download formats: excel, csv, json, xml
    Dataset updated
    Apr 1, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Denmark
    Description

    Denmark - Service producer prices: Cleaning activities was 106.00 points in December 2020, according to Eurostat. Trading Economics provides the current actual value, a historical data chart, and related indicators for Denmark - Service producer prices: Cleaning activities, last updated from Eurostat in December 2025. Historically, Denmark - Service producer prices: Cleaning activities reached a record high of 106.00 points in December 2020 and a record low of 79.90 points in March 2006.

  10. Sweden Services Turnover Index: wda: Cleaning Activities

    • ceicdata.com
    Updated Jan 15, 2025
    Cite
    CEICdata.com (2025). Sweden Services Turnover Index: wda: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/sweden/services-turnover-index-2015100/services-turnover-index-wda-cleaning-activities
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 1, 2017 - Mar 1, 2018
    Area covered
    Sweden
    Description

    Sweden Services Turnover Index: wda: Cleaning Activities data was reported at 107.600 (2015=100) in Mar 2018, an increase from the previous figure of 100.600 (2015=100) for Feb 2018. The data is updated monthly, with a median of 75.000 (2015=100) from Jan 2000 to Mar 2018 across 219 observations. It reached an all-time high of 119.500 (2015=100) in Oct 2016 and a record low of 32.700 (2015=100) in Jan 2000. The series remains in active status in CEIC and is reported by Statistics Sweden. The data is categorized under Global Database’s Sweden – Table SE.H005: Services Turnover Index: 2015=100.

  11. Sweden - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 11, 2021
    Cite
    TRADING ECONOMICS (2021). Sweden - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/sweden/service-producer-prices-cleaning-activities-eurostat-data.html
    Explore at:
    Available download formats: json, csv, excel, xml
    Dataset updated
    Jul 11, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Sweden
    Description

    Sweden - Service producer prices: Cleaning activities was 119.30 points in March 2021, according to Eurostat. Trading Economics provides the current actual value, a historical data chart, and related indicators for Sweden - Service producer prices: Cleaning activities, last updated from Eurostat in November 2025. Historically, Sweden - Service producer prices: Cleaning activities reached a record high of 119.30 points in March 2021 and a record low of 99.30 points in March 2015.

  12. General Household Survey, Panel 2018-2019 - Nigeria

    • microdata.fao.org
    Updated Nov 8, 2022
    Cite
    National Bureau of Statistics (2022). General Household Survey, Panel 2018-2019 - Nigeria [Dataset]. https://microdata.fao.org/index.php/catalog/1374
    Dataset updated
    Nov 8, 2022
    Dataset authored and provided by
    National Bureau of Statistics
    Time period covered
    2018 - 2019
    Area covered
    Nigeria
    Description

    Abstract

    The General Household Survey-Panel (GHS-Panel) is implemented in collaboration with the World Bank Living Standards Measurement Study (LSMS) team as part of the Integrated Surveys on Agriculture (ISA) program. The objectives of the GHS-Panel include the development of an innovative model for collecting agricultural data, interinstitutional collaboration, and comprehensive analysis of welfare indicators and socio-economic characteristics. The GHS-Panel is a nationally representative survey of approximately 5,000 households, which are also representative of the six geopolitical zones. The 2018/19 round is the fourth round of the survey, with prior rounds conducted in 2010/11, 2012/13, and 2015/16. GHS-Panel households were visited twice: first after the planting season (post-planting) between July and September 2018, and second after the harvest season (post-harvest) between January and February 2019.

    Geographic coverage

    National: the survey covered all 36 states and the Federal Capital Territory (FCT).

    Analysis unit

    Households, Individuals, Agricultural plots, Communities

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The original GHS-Panel sample consisted of 5,000 households across 500 enumeration areas (EAs) and was designed to be representative at the national level as well as at the zonal level. The complete sampling information for the GHS-Panel is described in the Basic Information Document for GHS-Panel 2010/2011. However, after nearly a decade of visiting the same households, a partial refresh of the GHS-Panel sample was implemented in Wave 4. For the partial refresh, a new set of 360 EAs was randomly selected, consisting of 60 EAs per zone. The refresh EAs were selected from the same sampling frame as the original GHS-Panel sample in 2010 (the "master frame").

    A listing of all households was conducted in the 360 EAs and 10 households were randomly selected in each EA, resulting in a total refresh sample of approximately 3,600 households. In addition to these 3,600 refresh households, a subsample of the original 5,000 GHS-Panel households from 2010 was selected for inclusion in the new sample. This "long panel" sample was designed to be nationally representative to enable continued longitudinal analysis for the sample going back to 2010. The long panel sample consisted of 159 EAs systematically selected across the 6 geopolitical Zones. The systematic selection ensured that the distribution of EAs across the 6 Zones (and urban and rural areas within) is proportional to the original GHS-Panel sample.

    Interviewers attempted to interview all households that originally resided in the 159 EAs and were successfully interviewed in the previous visit in 2016. This includes households that had moved away from their original location in 2010. In all, interviewers attempted to interview 1,507 households from the original panel sample. The combined sample of refresh and long panel EAs consisted of 519 EAs. The total number of households that were successfully interviewed in both visits was 4,976.
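
    The enumeration-area and household counts in this section can be cross-checked with simple arithmetic. The short Python sketch below uses only figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the sampling description.
ZONES = 6
REFRESH_EAS_PER_ZONE = 60
HOUSEHOLDS_PER_REFRESH_EA = 10
LONG_PANEL_EAS = 159

refresh_eas = ZONES * REFRESH_EAS_PER_ZONE                      # refresh EAs across all zones
refresh_households = refresh_eas * HOUSEHOLDS_PER_REFRESH_EA    # nominal refresh sample size
combined_eas = refresh_eas + LONG_PANEL_EAS                     # refresh plus long panel EAs

print(refresh_eas, refresh_households, combined_eas)
```

    The results (360 refresh EAs, 3,600 refresh households, and 519 combined EAs) agree with the totals given in the text.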

    Sampling deviation

    While the combined sample generally maintains both national and Zonal representativeness of the original GHS-Panel sample, the security situation in the North East of Nigeria prevented full coverage of the Zone. Due to security concerns, rural areas of Borno state were fully excluded from the refresh sample and some inaccessible urban areas were also excluded. Security concerns also prevented interviewers from visiting some communities in other parts of the country where conflict events were occurring. Refresh EAs that could not be accessed were replaced with another randomly selected EA in the Zone so as not to compromise the sample size. As a result, the combined sample is representative of areas of Nigeria that were accessible during 2018/19. The sample will not reflect conditions in areas that were undergoing conflict during that period. This compromise was necessary to ensure the safety of interviewers.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Cleaning operations

    CAPI: For the first time in the GHS-Panel, the Wave 4 exercise was conducted using Computer Assisted Personal Interview (CAPI) techniques. All the questionnaires (household, agriculture, and community) were implemented in both the post-planting and post-harvest visits of Wave 4 using the CAPI software Survey Solutions. The Survey Solutions software was developed and is maintained by the Survey Unit within the Development Economics Data Group (DECDG) at the World Bank. Each enumerator was given a tablet, which they used to conduct the interviews. Overall, implementation of the survey using Survey Solutions CAPI was highly successful, as it allowed for timely availability of the data from completed interviews.

    DATA COMMUNICATION SYSTEM: The data communication system used in Wave 4 was highly automated. Each field team was given a mobile modem to allow for internet connectivity and daily synchronization of their tablets. This ensured that the head office in Abuja had access to the data in real time. Once an interview is completed and uploaded to the server, the data is first reviewed by the Data Editors.

    The data is also downloaded from the server, and a Stata do-file is run on the downloaded data to check for additional errors that were not captured by the Survey Solutions application. An Excel error file is generated after the Stata do-file is run on the raw dataset. Information contained in the Excel error files is communicated back to the respective field interviewers for action. This was done on a daily basis throughout the duration of the survey, in both the post-planting and post-harvest visits.

    DATA CLEANING: The data cleaning process was done in three main stages. The first stage was to ensure proper quality control during the fieldwork. This was achieved in part by incorporating validation and consistency checks into the Survey Solutions application used for the data collection, designed to highlight many of the errors that occurred during the fieldwork. The second stage of cleaning involved the use of Data Editors and Data Assistants (Headquarters in Survey Solutions). As indicated above, once an interview is completed and uploaded to the server, the Data Editors review the completed interview for inconsistencies and extreme values. Depending on the outcome, they can either approve or reject the case. If rejected, the case goes back to the respective interviewer's tablet upon synchronization. Special care was taken to see that the households included in the data matched the selected sample; where there were differences, these were properly assessed and documented.

    The agriculture data were also checked to ensure that the plots identified in the main sections merged with the plot information identified in the other sections. Additional errors observed were compiled into error reports that were regularly sent to the teams. These errors were then corrected based on re-visits to the household on the instruction of the supervisor. The data that had gone through this first stage of cleaning was then approved by the Data Editor. After the Data Editor's approval of the interview on the Survey Solutions server, the Headquarters also reviews it and, depending on the outcome, can either reject or approve it.

    The third stage of cleaning involved a comprehensive review of the final raw data following the first and second stages of cleaning. Every variable was examined individually for (1) consistency with other sections and variables, (2) out-of-range responses, and (3) outliers. However, special care was taken to avoid making strong assumptions when resolving potential errors. Some minor errors remain in the data where the diagnosis and/or solution were unclear to the data cleaning team.
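
    The three checks named for the third cleaning stage (consistency with other variables, out-of-range responses, and outliers) can be illustrated with a small sketch. The source says that Stata do-files were used and does not publish their rules, so the Python below is an example under stated assumptions: the variable names, the valid ranges, and the 1.5 x IQR outlier fence are all hypothetical choices.

```python
from statistics import quantiles

# Hypothetical household records.
records = [
    {"hh": 1, "hh_size": 4, "members_listed": 4, "plot_area_ha": 0.8},
    {"hh": 2, "hh_size": 6, "members_listed": 5, "plot_area_ha": 1.1},   # inconsistent
    {"hh": 3, "hh_size": 0, "members_listed": 0, "plot_area_ha": 0.9},   # out of range
    {"hh": 4, "hh_size": 3, "members_listed": 3, "plot_area_ha": 1.0},
    {"hh": 5, "hh_size": 5, "members_listed": 5, "plot_area_ha": 1.2},
    {"hh": 6, "hh_size": 2, "members_listed": 2, "plot_area_ha": 45.0},  # outlier
    {"hh": 7, "hh_size": 7, "members_listed": 7, "plot_area_ha": 0.7},
    {"hh": 8, "hh_size": 4, "members_listed": 4, "plot_area_ha": 1.3},
]

VALID_RANGES = {"hh_size": (1, 30), "plot_area_ha": (0.0, 100.0)}  # assumed limits


def inconsistent(rows):
    """(1) Consistency: household size must match the number of listed members."""
    return [r["hh"] for r in rows if r["hh_size"] != r["members_listed"]]


def out_of_range(rows):
    """(2) Range: flag (household, variable) pairs outside the assumed limits."""
    return [(r["hh"], var) for r in rows
            for var, (lo, hi) in VALID_RANGES.items() if not lo <= r[var] <= hi]


def outliers(rows, var, k=1.5):
    """(3) Outliers: flag values more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles([r[var] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r["hh"] for r in rows if not low <= r[var] <= high]


# Flag cases for review rather than editing them automatically.
report = {
    "inconsistent": inconsistent(records),
    "out_of_range": out_of_range(records),
    "outliers": outliers(records, "plot_area_ha"),
}
print(report)
```

    The sketch only flags records and changes nothing, in keeping with the stated aim of avoiding strong assumptions when resolving potential errors.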

  13. S

    Spain SSPI: Cleaning Activities

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). Spain SSPI: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/spain/services-sector-price-index-2015100/sspi-cleaning-activities
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2021 - Dec 1, 2023
    Area covered
    Spain
    Description

    Spain SSPI: Cleaning Activities data was reported at 104.778 (2015=100) in Dec 2023, an increase from the previous figure of 104.678 (2015=100) for Sep 2023. The data is updated quarterly, with a median of 100.907 (2015=100) from Mar 2007 to Dec 2023 across 68 observations. The data reached an all-time high of 104.974 (2015=100) in Jun 2023 and a record low of 97.321 (2015=100) in Mar 2007. The series remains active in CEIC and is reported by the National Statistics Institute. It is categorized under Global Database’s Spain – Table ES.I037: Services Sector Price Index: 2015=100.

  14. T

    Finland - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated May 18, 2021
    Cite
    TRADING ECONOMICS (2021). Finland - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/finland/service-producer-prices-cleaning-activities-eurostat-data.html
    Available download formats: csv, json, xml, excel
    Dataset updated
    May 18, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Finland
    Description

    Finland - Service producer prices: Cleaning activities was 106.80 points in December 2020, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart, and related indicators for Finland - Service producer prices: Cleaning activities, last updated from EUROSTAT in December 2025. Historically, Finland - Service producer prices: Cleaning activities reached a record high of 106.80 points in December 2020 and a record low of 78.30 points in March 2005.

  15. D

    Cleaning SLA Compliance Computer Vision Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Cleaning SLA Compliance Computer Vision Market Research Report 2033 [Dataset]. https://dataintelo.com/report/cleaning-sla-compliance-computer-vision-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Cleaning SLA Compliance Computer Vision Market Outlook

    According to our latest research, the global Cleaning SLA Compliance Computer Vision market size reached USD 1.87 billion in 2024, with a robust year-on-year growth driven by increasing automation and digital transformation across various sectors. The market is expected to exhibit a CAGR of 22.3% during the forecast period, reaching a projected value of USD 6.93 billion by 2033. The primary growth factor fueling this expansion is the growing demand for real-time monitoring and compliance verification in cleaning operations, particularly within commercial and industrial environments where adherence to Service Level Agreements (SLAs) is critical for operational efficiency and regulatory compliance.
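
    For readers who want to relate a starting value, an ending value, and a compound annual growth rate, the standard relationship end = start × (1 + rate)^years can be sketched as follows. The function names are illustrative, and the choice of nine compounding years (2024 to 2033) is an assumption, since the excerpt does not state the base year of its forecast period; the quoted figures need not reconcile exactly under that assumption.

```python
def implied_cagr(start_value, end_value, years):
    """Growth rate r such that start_value * (1 + r) ** years == end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at `rate` for `years`."""
    return start_value * (1 + rate) ** years


# Figures quoted in the report excerpt (USD billion).
start, end = 1.87, 6.93
years = 9  # assumption: 2024 to 2033; the excerpt does not state the base year

print(f"Implied CAGR over {years} years: {implied_cagr(start, end, years):.1%}")
print(f"Value after {years} years at 22.3%: {project(start, 0.223, years):.2f}")
```

    Comparing the two printed values with the quoted rate and end value is a quick way to see which forecast window a report's headline numbers actually assume.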

    The surge in demand for Cleaning SLA Compliance Computer Vision solutions is largely attributed to the heightened emphasis on hygiene and cleanliness standards post-pandemic. Organizations are increasingly leveraging computer vision technologies to automate the monitoring and verification of cleaning tasks, ensuring that SLAs are consistently met without manual oversight. The integration of artificial intelligence and machine learning with computer vision has enabled advanced analytics, anomaly detection, and predictive maintenance, further enhancing the reliability and precision of compliance solutions. This technological evolution is significantly reducing operational costs, minimizing human error, and providing actionable insights for continuous improvement, thereby acting as a key growth driver for the market.

    Another critical growth factor is the rapid digitalization of facility management processes across sectors such as healthcare, hospitality, and retail. The adoption of smart building technologies and IoT-enabled devices has created a conducive environment for the deployment of computer vision-based SLA compliance solutions. These systems provide real-time data on cleaning activities, occupancy, and environmental conditions, enabling facility managers to optimize resource allocation and ensure compliance with contractual obligations. Furthermore, regulatory pressures and the need for transparent reporting have prompted organizations to invest in automated compliance verification tools, thereby accelerating market growth.

    The proliferation of cloud-based platforms and scalable deployment models is also propelling the Cleaning SLA Compliance Computer Vision market forward. Cloud deployment offers seamless integration, centralized data management, and remote accessibility, making it an attractive option for enterprises with distributed operations. Additionally, the increasing availability of affordable hardware components, coupled with advancements in edge computing, is democratizing access to sophisticated computer vision solutions. This democratization is fostering innovation and encouraging small and medium-sized enterprises (SMEs) to adopt these technologies, further expanding the market’s reach and potential.

    From a regional perspective, North America continues to dominate the market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The presence of major technology providers, early adoption of automation, and stringent regulatory frameworks are key factors contributing to North America’s leadership. Meanwhile, the Asia Pacific region is witnessing the fastest growth, driven by rapid urbanization, infrastructure development, and increasing investments in smart building technologies. Latin America and the Middle East & Africa are also emerging as promising markets, supported by growing awareness and government initiatives aimed at improving public health and safety standards.

    Component Analysis

    The component segment of the Cleaning SLA Compliance Computer Vision market is categorized into software, hardware, and services, each playing a pivotal role in the overall ecosystem. Software solutions form the backbone of compliance monitoring, leveraging advanced algorithms for image recognition, anomaly detection, and data analytics. These platforms are designed to process large volumes of visual data in real-time, enabling organizations to track cleaning activities, detect missed spots, and generate compliance reports automatically. The software segment is experiencing significant growth, driven by continuous advancements in AI and machine learning, which are enhancing the accuracy and scalability of computer vision appl

  16. U

    United Kingdom SIT: Cleaning Activities

    • ceicdata.com
    Updated Nov 15, 2025
    Cite
    CEICdata.com (2025). United Kingdom SIT: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/united-kingdom/industrial-turnover-value-services/sit-cleaning-activities
    Dataset updated
    Nov 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2017 - Jun 1, 2018
    Area covered
    United Kingdom
    Variables measured
    Industrial Sales / Turnover
    Description

    United Kingdom SIT: Cleaning Activities data was reported at 821.500 GBP mn in Sep 2018, an increase from the previous figure of 795.800 GBP mn for Aug 2018. The data is updated monthly, with a median of 600.500 GBP mn from Jan 1998 to Sep 2018 across 249 observations. The data reached an all-time high of 821.500 GBP mn in Sep 2018 and a record low of 0.000 GBP mn in Dec 1999. The series remains active in CEIC and is reported by the Office for National Statistics. It is categorized under Global Database’s United Kingdom – Table UK.C002: Industrial Turnover Value: Services.

  17. T

    Turkey Turnover Index: TS: NACE 2: AS: Cleaning Activities

    • ceicdata.com
    Updated Oct 15, 2025
    Cite
    CEICdata.com (2025). Turkey Turnover Index: TS: NACE 2: AS: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/turkey/turnover-index-2005100-by-trade-and-services/turnover-index-ts-nace-2-as-cleaning-activities
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2010 - Dec 1, 2012
    Area covered
    Türkiye
    Variables measured
    Domestic Trade
    Description

    Turkey Turnover Index: TS: NACE 2: AS: Cleaning Activities data was reported at 314.200 (2005=100) in Dec 2012, an increase from the previous figure of 289.500 (2005=100) for Sep 2012. The data is updated quarterly, with a median of 278.550 (2005=100) from Mar 2009 to Dec 2012 across 16 observations. The data reached an all-time high of 314.300 (2005=100) in Jun 2012 and a record low of 170.700 (2005=100) in Mar 2009. The series remains active in CEIC and is reported by the Turkish Statistical Institute. It is categorized under Global Database’s Turkey – Table TR.H022: Turnover Index: 2005=100: by Trade and Services.

  18. N

    Netherlands Gross Hourly Earnings Index: CS: BS: RO: Cleaning Activities

    • ceicdata.com
    Updated Jan 15, 2025
    Cite
    CEICdata.com (2025). Netherlands Gross Hourly Earnings Index: CS: BS: RO: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/netherlands/gross-hourly-earnings-index-standard-industrial-classification-2008/gross-hourly-earnings-index-cs-bs-ro-cleaning-activities
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2017 - May 1, 2018
    Area covered
    Netherlands
    Variables measured
    Wage/Earnings
    Description

    Netherlands Gross Hourly Earnings Index: CS: BS: RO: Cleaning Activities data was reported at 119.700 (2010=100) in Oct 2018, unchanged from the previous figure of 119.700 (2010=100) for Sep 2018. The data is updated monthly, with a median of 108.200 (2010=100) from Jan 2010 to Oct 2018 across 106 observations. The data reached an all-time high of 119.700 (2010=100) in Oct 2018 and a record low of 99.500 (2010=100) in Jun 2010. The series remains active in CEIC and is reported by Statistics Netherlands. It is categorized under Global Database’s Netherlands – Table NL.G026: Gross Hourly Earnings Index: Standard Industrial Classification 2008.

  19. T

    Poland - Service producer prices: Cleaning activities

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Aug 29, 2020
    Cite
    TRADING ECONOMICS (2020). Poland - Service producer prices: Cleaning activities [Dataset]. https://tradingeconomics.com/poland/service-producer-prices-cleaning-activities-eurostat-data.html
    Available download formats: xml, excel, csv, json
    Dataset updated
    Aug 29, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Poland
    Description

    Poland - Service producer prices: Cleaning activities was 118.10 points in December 2023, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart, and related indicators for Poland - Service producer prices: Cleaning activities, last updated from EUROSTAT in December 2025. Historically, Poland - Service producer prices: Cleaning activities reached a record high of 118.10 points in December 2023 and a record low of 66.50 points in March 2006.

  20. T

    Turkey Wage & Salary Index: TS: NACE 2: AS: Cleaning Activities

    • ceicdata.com
    Cite
    CEICdata.com, Turkey Wage & Salary Index: TS: NACE 2: AS: Cleaning Activities [Dataset]. https://www.ceicdata.com/en/turkey/gross-wage-and-salary-index-2005100-by-trade-and-services/wage--salary-index-ts-nace-2-as-cleaning-activities
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2010 - Dec 1, 2012
    Area covered
    Turkey
    Variables measured
    Wage/Earnings
    Description

    Turkey Wage & Salary Index: TS: NACE 2: AS: Cleaning Activities data was reported at 349.800 (2005=100) in Dec 2012, an increase from the previous figure of 334.600 (2005=100) for Sep 2012. The data is updated quarterly, with a median of 319.250 (2005=100) from Mar 2009 to Dec 2012 across 16 observations. The data reached an all-time high of 365.800 (2005=100) in Dec 2011 and a record low of 226.300 (2005=100) in Mar 2009. The series remains active in CEIC and is reported by the Turkish Statistical Institute. It is categorized under Global Database’s Turkey – Table TR.G055: Gross Wage and Salary Index: 2005=100: by Trade and Services.

Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177

Data Cleaning Sample

Explore at:
167 scholarly articles cite this dataset (View in Google Scholar)
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

Sample data for exercises in Further Adventures in Data Cleaning.
