63 datasets found
  1. Data Cleaning Sample

    • borealisdata.ca
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

  2. Enterprise Survey 2009-2019, Panel Data - Slovenia

    • microdata.worldbank.org
    Updated Aug 6, 2020
    Cite
    Enterprise Survey 2009-2019, Panel Data - Slovenia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3762
    Dataset updated
    Aug 6, 2020
    Dataset provided by
    European Bank for Reconstruction and Development (http://ebrd.com/)
    World Bank (http://worldbank.org/)
    World Bank Group (http://www.worldbank.org/)
    European Investment Bank (EIB)
    Time period covered
    2008 - 2019
    Area covered
    Slovenia
    Description

    Abstract

    The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.

    The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.

    As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.

    Geographic coverage

    National

    Analysis unit

    The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

    Universe

    As is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The samples for Slovenia ES 2009, 2013, and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.

    Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information on the industries and regions chosen are included in the attached Excel file (Sampling Report.xls) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information on the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports, respectively, in Appendix E.

    For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response when requesting sensitive financial data, and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for undersampling of firms in service industries.

    For Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail and other services).

    Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
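
    The 2019 industry stratification can be expressed as a lookup on two-digit ISIC Rev. 4 codes. The following is a minimal sketch only: the code lists are copied from the description above, while the function name `industry_stratum` and the choice to return `None` for out-of-scope codes are assumptions made for illustration, not part of the survey documentation.

```python
from typing import Optional

# ISIC Rev. 4 two-digit codes, copied from the 2019 description above.
MANUFACTURING = set(range(10, 34))  # codes 10-33 inclusive
RETAIL = {47}
OTHER_SERVICES = {41, 42, 43, 45, 46, 49, 50, 51, 52, 53, 55, 56, 58, 61, 62, 79, 95}


def industry_stratum(isic_code: int) -> Optional[str]:
    """Map a two-digit ISIC code to its 2019 stratum (hypothetical helper).

    Returns None for codes outside the listed strata; that convention is
    an assumption of this sketch.
    """
    if isic_code in MANUFACTURING:
        return "Manufacturing"
    if isic_code in RETAIL:
        return "Retail"
    if isic_code in OTHER_SERVICES:
        return "Other Services"
    return None
```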

    For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
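
    The size bands above translate directly into a small classifier. This is a sketch under stated assumptions: the function name `size_stratum` is hypothetical, and returning `None` for establishments with fewer than 5 permanent full-time workers reflects only that the bands start at 5, not a rule stated in the documentation.

```python
from typing import Optional


def size_stratum(permanent_full_time_workers: int) -> Optional[str]:
    """Classify an establishment by the ES size bands (hypothetical helper).

    Bands from the description: small 5-19, medium 20-99, large 100+.
    Counts below 5 fall outside the bands and return None (assumption).
    """
    if permanent_full_time_workers < 5:
        return None
    if permanent_full_time_workers <= 19:
        return "small"
    if permanent_full_time_workers <= 99:
        return "medium"
    return "large"
```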

    For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.

    For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.

    Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing-specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail-specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.

    Response rate

    Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.

    Item non-response was addressed by two strategies: (a) For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). (b) Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
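
    Because refusals are stored as the numeric code -8, they need to be separated from genuine values before computing any statistic. The sketch below shows one way to do that in plain Python; the helper name `split_refusals` and the sample answers are made up for illustration and do not come from the survey data.

```python
REFUSAL_CODE = -8  # refusal-to-respond code described above


def split_refusals(responses):
    """Return (valid_values, refusal_count) for a list of numeric answers.

    Hypothetical helper: it treats only -8 as a refusal and leaves every
    other value untouched.
    """
    valid = [value for value in responses if value != REFUSAL_CODE]
    return valid, len(responses) - len(valid)


# Made-up answers to a single sensitive question.
answers = [12, -8, 30, 7, -8]
valid, refused = split_refusals(answers)
mean_of_valid = sum(valid) / len(valid)  # the mean ignores the refusals
```

    Averaging the raw list instead would pull the mean down, since each -8 would be counted as a real answer.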

    For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.

    For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.

    For 2013, the share of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 44%.

    Finally, for 2019, the share of interviews per contacted establishment was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
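
    The 2009 figure is reported as contacts per interview, while the 2013 and 2019 figures are reported as interviews per contact. Taking the reciprocal of the 2009 figure puts the three waves on one scale. This is a rough sketch that assumes "contacted establishment" is defined the same way in all three waves, which the description does not confirm; the function name is hypothetical.

```python
def interviews_per_contact(contacts_per_interview: float) -> float:
    """Convert 'contacts per realized interview' into 'interviews per contact'."""
    return 1.0 / contacts_per_interview


# Figures taken from the response-rate description above.
rates = {
    2009: interviews_per_contact(6.18),  # reported as 6.18 contacts per interview
    2013: 0.25,                          # reported directly as 25%
    2019: 0.097,                         # reported directly as 9.7%
}

for year, rate in sorted(rates.items()):
    print(f"{year}: {rate:.1%} of contacted establishments were interviewed")
```

    Under that assumption the 2009 rate comes to about 16.2%, between the 2013 and 2019 values.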

  3. Annual Retail Store Data, 2000 [Canada] [Excel]

    • dataverse.scholarsportal.info
    • borealisdata.ca
    pdf, xls
    Updated Nov 17, 2021
    Cite
    Scholars Portal Dataverse (2021). Annual Retail Store Data, 2000 [Canada] [Excel] [Dataset]. https://dataverse.scholarsportal.info/dataset.xhtml;jsessionid=1283d69ee2dd528c9011fe4a2fe3?persistentId=hdl%3A10864%2F11351&version=&q=&fileTypeGroupFacet=&fileAccess=&fileTag=%22Tables%22&fileSortField=&fileSortOrder=
    Explore at:
    Available download formats: xls(2165760), xls(29696), xls(2920448), pdf(76787), pdf(158404), xls(34816), xls(2754048), pdf(81084), pdf(71183), xls(34304), xls(625664), xls(2707968), xls(695808), pdf(70673), pdf(72585), xls(576512), xls(609792), xls(28672), pdf(60236), pdf(30338), pdf(87181), pdf(84140), pdf(92012), xls(610304), pdf(74439), xls(2471424), pdf(73788), xls(30208), pdf(74478), pdf(53645)
    Dataset updated
    Nov 17, 2021
    Dataset provided by
    Scholars Portal Dataverse
    Area covered
    Canada
    Description

    The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.

  4. Example Student Data.xlsx

    • figshare.com
    xlsx
    Updated Jun 3, 2022
    Cite
    Carrie Ellis (2022). Example Student Data.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.19985453.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Carrie Ellis
    Description

    In the attached Excel file, "Example Student Data", there are 6 sheets. Three sheets contain sample datasets, one for each of the three different exercise protocols described. The other three sheets contain sample graphs, each created from one of the three datasets.

    · Sheets 1 and 2: This is an example of a dataset and graph created from an exercise protocol designed to stress the creatine phosphate system. Here, the subject was a track and field athlete who threw the shot put for the DeSales University track team. The NIRS monitor was placed on the right triceps muscle, and the student threw the shot put six times with a minute of rest between throws. Data was collected telemetrically by the NIRS device and then downloaded after the student had completed the protocol.

    · Sheets 3 and 4: This is an example of a dataset and graph created from an exercise protocol designed to stress the glycolytic energy system. In this example, the subject performed continuous squat jumps for 30 seconds, followed by a 90-second rest period, for a total of three exercise bouts. The NIRS monitor was placed on the left gastrocnemius muscle. Here again, data was collected telemetrically by the NIRS device and then downloaded after he had completed the protocol.

    · Sheets 5 and 6: In this example, the dataset and graph are from an exercise protocol designed to stress the oxidative system. Here, the student held a light-intensity, isometric biceps contraction (pushing against a table). The NIRS monitor was attached to the left biceps muscle belly. Here, data was collected by a student observing the SmO2 values displayed on a secondary device, specifically a smartphone running the IPSensorMan app. The recording student observed and recorded the data in an Excel spreadsheet and marked the times that exercise began and ended on the spreadsheet.

  5. Data from: Delta Neighborhood Physical Activity Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Jun 5, 2025
    Cite
    Agricultural Research Service (2025). Delta Neighborhood Physical Activity Study [Dataset]. https://catalog.data.gov/dataset/delta-neighborhood-physical-activity-study-f82d7
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    The Delta Neighborhood Physical Activity Study was an observational study designed to assess characteristics of neighborhood built environments associated with physical activity. It was an ancillary study to the Delta Healthy Sprouts Project and therefore included towns and neighborhoods in which Delta Healthy Sprouts participants resided. The 12 towns were located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys between August 2016 and September 2017 using the Rural Active Living Assessment (RALA) tools and the Community Park Audit Tool (CPAT). Scale scores for the RALA Programs and Policies Assessment and the Town-Wide Assessment were computed using the scoring algorithms provided for these tools via SAS software programming. The Street Segment Assessment and CPAT do not have associated scoring algorithms and therefore no scores are provided for them. Because the towns were not randomly selected and the sample size is small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Dataset one contains data collected with the RALA Programs and Policies Assessment (PPA) tool. Dataset two contains data collected with the RALA Town-Wide Assessment (TWA) tool. Dataset three contains data collected with the RALA Street Segment Assessment (SSA) tool. Dataset four contains data collected with the Community Park Audit Tool (CPAT). [Note: title changed 9/4/2020 to reflect study name]

    Resources in this dataset:

    Resource Title: Dataset One RALA PPA Data Dictionary. File Name: RALA PPA Data Dictionary.csv. Resource Description: Data dictionary for dataset one collected using the RALA PPA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Two RALA TWA Data Dictionary. File Name: RALA TWA Data Dictionary.csv. Resource Description: Data dictionary for dataset two collected using the RALA TWA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Three RALA SSA Data Dictionary. File Name: RALA SSA Data Dictionary.csv. Resource Description: Data dictionary for dataset three collected using the RALA SSA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Four CPAT Data Dictionary. File Name: CPAT Data Dictionary.csv. Resource Description: Data dictionary for dataset four collected using the CPAT. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset One RALA PPA. File Name: RALA PPA Data.csv. Resource Description: Data collected using the RALA PPA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Two RALA TWA. File Name: RALA TWA Data.csv. Resource Description: Data collected using the RALA TWA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Three RALA SSA. File Name: RALA SSA Data.csv. Resource Description: Data collected using the RALA SSA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Four CPAT. File Name: CPAT Data.csv. Resource Description: Data collected using the CPAT. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Data Dictionary. File Name: DataDictionary_RALA_PPA_SSA_TWA_CPAT.csv. Resource Description: This is a combined data dictionary from each of the 4 dataset files in this set.

  6. Data from: Delta Produce Sources Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Delta Produce Sources Study [Dataset]. https://catalog.data.gov/dataset/delta-produce-sources-study-51a7a
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The Delta Produce Sources Study was an observational study designed to measure and compare food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information pertaining to source of fresh produce and also for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Resources in this dataset:

    Resource Title: Delta Produce Sources Study dataset. File Name: DPS Data Public.csv. Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv. Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  7. INFORMAS Protocol: Food Labelling - Annex 2: Excel spreadsheet containing a...

    • catalogue.data.govt.nz
    Updated Feb 1, 2001
    Cite
    (2001). INFORMAS Protocol: Food Labelling - Annex 2: Excel spreadsheet containing a sample of the different variables for analysis. - Dataset - data.govt.nz - discover and use data [Dataset]. https://catalogue.data.govt.nz/dataset/oai-figshare-com-article-5673580
    Dataset updated
    Feb 1, 2001
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    Annex 2 INFORMAS Food Labelling Module Protocol

  8. Sample description table for Proteomics data file submission to PRIDE,...

    • fairdomhub.org
    xlsx
    Updated Mar 26, 2020
    Cite
    Alexander Graf; Katja Baerenfaller; Willi Gruissem (2020). Sample description table for Proteomics data file submission to PRIDE, PXD006848 [Dataset]. https://fairdomhub.org/data_files/3704
    Explore at:
    Available download formats: xlsx (80.1 KB)
    Dataset updated
    Mar 26, 2020
    Authors
    Alexander Graf; Katja Baerenfaller; Willi Gruissem
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This Excel file lists the samples uploaded in PRIDE. The table “Table Sorted PP and Replicates” in the Excel file has all the relevant annotation.

    There are more than the expected 168 samples in the PRIDE upload for the following reasons:

    First, all of the measurements from the experiment had been uploaded, including files for measurements that were repeated because of problems during the MS run. These samples are not annotated in the table. Second, we had included 4 Gold Standard samples (2 replicates on each of the two large gels used to process all samples). These 4 gold standard samples in 7 fractions explain 28 extra samples. Third, we did not have 168 but 166 samples in the photoperiod set. Fractions 1 and 2 of sample 43 (Photoperiod 2, bio replicate 1, tech. replicate 2) were lost during sample preparation. While the remaining fractions were measured and are included in the PRIDE upload and the table, this sample was not used in the data analysis. Photoperiod 2 bio rep. 1 was only used with one technical replicate in the calculations.

  9. Finsheet - Stock Price in Excel and Google Sheet

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 24, 2022
    Cite
    Tuan Do (2022). Finsheet - Stock Price in Excel and Google Sheet [Dataset]. http://doi.org/10.7910/DVN/ZD9XVF
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 24, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Tuan Do
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset contains the valuation template that researchers can use to retrieve real-time stock prices in Excel and Google Sheets. The dataset is provided by Finsheet, the leading financial data provider for spreadsheet users. To get more financial data, visit the website and explore their functions. For instance, if a researcher would like to get the last 30 years of income statements for Meta Platforms Inc, the syntax would be =FS_EquityFullFinancials("FB", "ic", "FY", 30). In addition, the following syntax returns the latest stock price for Caterpillar Inc right in your spreadsheet: =FS_Latest("CAT"). If you need assistance with any of the functions, feel free to reach out to their customer support team. To get started, install their Excel and Google Sheets add-on.

  10. RT-PCR Excel Template

    • fairdomhub.org
    application/excel
    Updated Feb 12, 2020
    Cite
    Katy Wolstencroft (2020). RT-PCR Excel Template [Dataset]. https://fairdomhub.org/data_files/930
    Explore at:
    Available download formats: application/excel (113 KB)
    Dataset updated
    Feb 12, 2020
    Authors
    Katy Wolstencroft
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    An example of a JERM-compliant template for RT-PCR data

    This template was taken from the GEO website (http://www.ncbi.nlm.nih.gov/geo/info/spreadsheet.html) and modified to conform to the SysMO-JERM (Just enough Results Model) for transcriptomics.

  11. Data from: How many samples are needed to prove the absence of contamination...

    • geolsoc.figshare.com
    xlsx
    Updated Feb 26, 2019
    Cite
    John A. Heathcote (2019). How many samples are needed to prove the absence of contamination - an example using arsenic? [Dataset]. http://doi.org/10.6084/m9.figshare.7770920.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Feb 26, 2019
    Dataset provided by
    Geological Society of London (http://www.geolsoc.org.uk/)
    Authors
    John A. Heathcote
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    An Excel spreadsheet containing the full dataset, showing its sub-sampling

  12. Standard Sample Description V2 Structural Metadata

    • data.niaid.nih.gov
    Updated Feb 3, 2020
    Cite
    European Food Safety Authority (2020). Standard Sample Description V2 Structural Metadata [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1215986
    Dataset updated
    Feb 3, 2020
    Dataset provided by
    The European Food Safety Authority (http://www.efsa.europa.eu/)
    Authors
    European Food Safety Authority
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Standard Sample Description V2 is a specification aimed at harmonising the collection of analytical measurement data for the presence of harmful or beneficial chemical substances in food, feed and water. The specification is a list of standardised data elements (items describing characteristics of samples or analytical results such as country of origin, product, analytical method, limit of detection, result, etc.), linked to controlled terminologies. This specification uses EFSA FoodEx2 to describe sampled foods.

    This file has been prepared to support the publication of data and interoperability. This file indicates which data elements from the specification will not be published to ensure full protection of confidential/sensitive information, for example personal data in accordance with Regulation (EC) No 45/2001 and to protect commercial interests, including intellectual property as specified in Article 4(2), first indent, of Regulation (EC) No 1049/2001.

    The Excel table contains information about the structural metadata elements of the data collection and their fact tables.

    The column name shows the name of the element (e.g. localOrg). The column description describes how the content has to be interpreted. The column code gives the corresponding code of the structural metadata element. The column optional states whether the structural metadata element is optional; if it is not, the element is mandatory. The column dataType contains the type that can be used to fill the structural metadata element and the maximum length of the field. The possible types are text and number. The column catalogue contains the name of the catalogue from which the content of the structural metadata element has to be picked (e.g. COUNTRY). The column data protection states whether the structural metadata element will be published (yes = will not be published, no = will be published).
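
    As an illustration of the data protection column, the sketch below lists the elements that would be published. It is not based on the actual file: the CSV layout, the exact header spellings, the dataType notation, and both sample rows (including the element name `sampCountry` and the codes) are assumptions made up for this example. Only `localOrg`, `COUNTRY`, and the yes/no meaning come from the description above.

```python
import csv
import io

# Hypothetical excerpt; the real file is an Excel table whose exact
# headers and values may differ from this made-up sample.
SAMPLE_CSV = """name,code,optional,dataType,catalogue,data protection
localOrg,X.01,yes,text(100),,yes
sampCountry,X.02,no,text(2),COUNTRY,no
"""


def publishable_elements(rows):
    """Return names of elements whose data protection flag is 'no'.

    Per the description: yes = will not be published, no = will be published.
    """
    return [
        row["name"]
        for row in rows
        if row["data protection"].strip().lower() == "no"
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
published = publishable_elements(rows)
```

    Note the inverted sense of the flag: a value of yes marks an element as protected, so the filter keeps the rows marked no.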

  13. S1 Table. Late Silurian zircon U–Pb ages from the Ludlow and Downton Bone...

    • geolsoc.figshare.com
    xlsx
    Updated Aug 10, 2020
    Cite
    Elizabeth J. Catlos; Darren F. Mark; Stephanie Suarez; Michael E. Brookfield; C. Giles Miller; Axel K. Schmitt; Vincent Gallagher; Anne Kelly (2020). S1 Table. Late Silurian zircon U–Pb ages from the Ludlow and Downton Bone Beds, Welsh Basin, UK [Dataset]. http://doi.org/10.6084/m9.figshare.12783006.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 10, 2020
    Dataset provided by
    Geological Society of London (http://www.geolsoc.org.uk/)
    Authors
    Elizabeth J. Catlos; Darren F. Mark; Stephanie Suarez; Michael E. Brookfield; C. Giles Miller; Axel K. Schmitt; Vincent Gallagher; Anne Kelly
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    S1 Table. Excel file that contains detailed information regarding the SIMS analyses, including all samples and standard data

  14. Data to Support the Development of Rapid GC-MS Methods for Seized Drug...

    • catalog.data.gov
    • datasets.ai
    Updated Feb 23, 2023
    Cite
    National Institute of Standards and Technology (2023). Data to Support the Development of Rapid GC-MS Methods for Seized Drug Analysis [Dataset]. https://catalog.data.gov/dataset/data-to-support-the-development-of-rapid-gc-ms-methods-for-seized-drug-analysis
    Dataset updated
    Feb 23, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This dataset contains raw datafiles that support the development of rapid gas chromatography mass spectrometry (GC-MS) methods for seized drug analysis. Files are provided in the native ".D" format collected from an Agilent GC-MS system. Files can be opened using Agilent proprietary software or freely available software such as AMDIS (which can be downloaded at chemdata.nist.gov). Included here is data of seized drug mixtures and adjudicated case samples that were analyzed as part of the method development process for rapid GC-MS. Information about the naming of datafiles and the contents of each mixture and case sample can be found in the associated Excel sheet ("File Names and Comments.xlsx").

  15. Sediment sample locations and analysis collected in the vicinity of Buffalo...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Sediment sample locations and analysis collected in the vicinity of Buffalo Reef, Michigan, within Lake Superior during USGS Field Activity 2018-043-FA (Microsoft Excel file). [Dataset]. https://catalog.data.gov/dataset/sediment-sample-locations-and-analysis-collected-in-the-vicinity-of-buffalo-reef-michigan-
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Lake Superior, Michigan
    Description

    In September 2018, the USGS Woods Hole Coastal and Marine Science Center (WHCMSC), in collaboration with the US Army Corps of Engineers (USACE), conducted high-resolution geophysical mapping and sediment sampling to determine the distribution of historical mine tailings on the floor of Lake Superior. Large amounts of waste material from copper mining, locally known as “stamp sands”, were dumped into the lake in the early 20th century, with wide-reaching consequences that have continued into the present day. Mapping was focused offshore of the town of Gay on the Keweenaw Peninsula of Michigan, where ongoing erosion and re-deposition of the stamp sands has buried miles of native, white-sand beaches and is steadily encroaching south onto Buffalo Reef, a large area of cobble/boulder substrate that supports productive fisheries in the lake. The objectives of this cooperative mapping project are to develop a framework for scientific research and provide baseline information required for management of resources within the coastal zone of northern Michigan. High-resolution bathymetry and backscatter data reveal the irregular topography of the shallow, cobble-covered Buffalo Reef, and the relatively smooth, finer-grained sediment that covers adjacent, deeper parts of the lake floor. Previous research used numerous sediment samples to determine the general distribution of mine tailings on the lake floor in this area, but little information existed on the extent and thickness of the surficial deposits. The main priority of this project is to image the near-surface stratigraphy, specifically the surficial sand and mud that threaten to cover the reef, with seismic-reflection profiling systems. In addition to continuous coverage of bathymetric and backscatter data, this report includes a dense grid of closely spaced seismic profiles, which will guide efforts to mitigate the impacts on Buffalo Reef from contamination by the shifting stamp sands.

  16. CNS Data for Box Core Sample P-1 taken at P1-92-14 (Excel) [Cutter, G.]

    • data.ucar.edu
    • arcticdata.io
    • +1 more
    excel
    Updated Dec 26, 2024
    Cite
    Greg Cutter (2024). CNS Data for Box Core Sample P-1 taken at P1-92-14 (Excel) [Cutter, G.] [Dataset]. http://doi.org/10.5065/D64J0C5G
    Available download formats: excel
    Dataset updated
    Dec 26, 2024
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    Greg Cutter
    Time period covered
    Jan 1, 1995 - Dec 31, 1995
    Description

    This data represents the carbon, nitrogen, and silicon content of the Piston Core sample P-1 taken at P1-92-AR.

  17. GP Practice Prescribing Presentation-level Data - July 2014

    • digital.nhs.uk
    csv, zip
    Updated Oct 31, 2014
    Cite
    (2014). GP Practice Prescribing Presentation-level Data - July 2014 [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/practice-level-prescribing-data
    Available download formats: csv (1.4 GB), zip (257.7 MB), csv (1.7 MB), csv (275.8 kB)
    Dataset updated
    Oct 31, 2014
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Jul 1, 2014 - Jul 31, 2014
    Area covered
    United Kingdom
    Description

    Warning: Large file size (over 1GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or equivalent on Mac OSX). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet.

    Alternatively, add-ons to existing software that handle larger data sets, such as the Microsoft PowerPivot add-on for Excel, can be used. The PowerPivot add-on is available from Microsoft: http://office.microsoft.com/en-gb/excel/download-power-pivot-HA101959985.aspx

    Once PowerPivot has been installed, follow the instructions below to load the large files. Note that it may take at least 20 to 30 minutes to load one monthly file.

    1. Start Excel as normal
    2. Click on the PowerPivot tab
    3. Click on the PowerPivot Window icon (top left)
    4. In the PowerPivot Window, click on the "From Other Sources" icon
    5. In the Table Import Wizard, scroll to the bottom and select Text File
    6. Browse to the file you want to open and choose the file extension you require, e.g. CSV

    Once the data has been imported you can view it in a spreadsheet.

    What does the data cover?

    General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name):

    - the total number of items prescribed and dispensed
    - the total net ingredient cost
    - the total actual cost
    - the total quantity

    The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices.

    GP practices are identified only by their national code, so an additional data file, linked to the first by the practice code, provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file, linked to the first by the BNF code, provides the chemical name for that presentation.
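As an alternative to opening a multi-million-row file in a spreadsheet, the rows for one practice can be streamed and linked to the two additional files in a few lines of Python. This is a minimal sketch, not the publisher's method: the column names (`practice_code`, `bnf_code`, `practice_name`, `chemical_name`, `items`, `quantity`) and the sample values are assumptions made for illustration, and the real files may use different headers and a different matching rule for BNF codes.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def load_lookup(handle: TextIO, key: str, value: str) -> Dict[str, str]:
    """Read a small lookup file fully into memory as {key: value}."""
    return {row[key]: row[value] for row in csv.DictReader(handle)}


def rows_for_practice(
    handle: TextIO,
    practice_code: str,
    practices: Dict[str, str],
    chemicals: Dict[str, str],
) -> Iterator[Dict[str, str]]:
    """Stream the large file row by row, keeping only one practice.

    Each kept row is extended with the practice name and chemical name
    from the two lookup tables (empty string if the code is unknown).
    """
    for row in csv.DictReader(handle):
        if row["practice_code"] != practice_code:
            continue
        enriched = dict(row)
        enriched["practice_name"] = practices.get(row["practice_code"], "")
        enriched["chemical_name"] = chemicals.get(row["bnf_code"], "")
        yield enriched


# Tiny made-up samples standing in for the three files (assumed headers).
prescribing_csv = (
    "practice_code,bnf_code,items,quantity\n"
    "A001,0101010,3,90\n"
    "B002,0101010,1,30\n"
    "A001,0202020,2,56\n"
)
practices_csv = "practice_code,practice_name\nA001,Example Surgery\nB002,Another Surgery\n"
chemicals_csv = "bnf_code,chemical_name\n0101010,Chemical A\n0202020,Chemical B\n"

practice_lookup = load_lookup(io.StringIO(practices_csv), "practice_code", "practice_name")
chemical_lookup = load_lookup(io.StringIO(chemicals_csv), "bnf_code", "chemical_name")
selected = list(
    rows_for_practice(io.StringIO(prescribing_csv), "A001", practice_lookup, chemical_lookup)
)
```

With real data, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. Because the main file is read one row at a time, memory use stays small regardless of file size; only the two lookup files are held in memory.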

  18. Sample 193 Form Template

    • data.iowadot.gov
    Updated Jan 17, 2018
    Cite
    Iowa Department of Transportation (2018). Sample 193 Form Template [Dataset]. https://data.iowadot.gov/datasets/sample-193-form-template/about
    Dataset updated
    Jan 17, 2018
    Dataset authored and provided by
    Iowa Department of Transportation (https://iowadot.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel template for Survey123 that lets you create your own field sample locations for testing by District Materials labs. It is for Sample Form 193.

  19. Dataset of companies’ profitability, government debt, Financial Statements'...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Mar 14, 2023
    Cite
    Mahfoudh Mgammal; Ebrahim Al-Matari (2023). Dataset of companies’ profitability, government debt, Financial Statements' Key Indicators and earnings in an emerging market: Developing a panel and time series database of value-added tax rate increase impacts [Dataset]. http://doi.org/10.7910/DVN/HEL3YG
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Mar 14, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Mahfoudh Mgammal; Ebrahim Al-Matari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Yemen
    Description

    The dataset included with this article contains three files describing and defining the sample and variables for VAT impact. Excel file 1 consists of all raw and filtered data for the variables of the panel data sample. Excel file 2 contains time-series and cross-sectional data for nonfinancial firms listed on the Saudi market for the second and third quarters of 2019 and the third and fourth quarters of 2020. Excel file 3 presents the raw data for the variables used in measuring company profitability in the panel data sample.

  20. Sample list for the SoS RARE project (Security of Supply of Rare Earths)...

    • metadata.bgs.ac.uk
    • data-search.nerc.ac.uk
    html
    Updated Apr 29, 2021
    Cite
    British Geological Survey (2021). Sample list for the SoS RARE project (Security of Supply of Rare Earths) (NERC Grant NE/M01116X/1) [Dataset]. https://metadata.bgs.ac.uk/geonetwork/srv/api/records/c298c88b-98ed-4c87-e054-002128a47908
    Available download formats: html
    Dataset updated
    Apr 29, 2021
    Dataset provided by
    British Geological Survey (https://www.bgs.ac.uk/)
    License

    http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations

    Time period covered
    Jan 1, 2015 - Dec 31, 2017
    Area covered
    Earth
    Description

    This dataset is an overall sample list, as an Excel spreadsheet, providing details of the major sample suites collected during the SoS RARE project (2015-2017). It includes location details for samples collected in a range of localities worldwide. The samples are chiefly rocks and soils. Most material is still held by the institutions that did the work, as recorded in the sample list. The dataset also includes partial details of work that has been done on the samples. However, time constraints have prevented a complete update of the spreadsheet, so it is uploaded here to provide the best possible record of the available sample material.

Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177

Data Cleaning Sample

Explore at:
151 scholarly articles cite this dataset (View in Google Scholar)
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

Sample data for exercises in Further Adventures in Data Cleaning.
