10 datasets found
  1. ocr-receipts-text-detection

    • huggingface.co
    Updated Sep 19, 2023
    Cite
    ocr-receipts-text-detection [Dataset]. https://huggingface.co/datasets/TrainingDataPro/ocr-receipts-text-detection
    Dataset updated
    Sep 19, 2023
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The Grocery Store Receipts Dataset is a collection of photos captured from various grocery store receipts. This dataset is specifically designed for tasks related to Optical Character Recognition (OCR) and is useful for retail. Each image in the dataset is accompanied by bounding box annotations, indicating the precise locations of specific text segments on the receipts. The text segments are categorized into four classes: item, store, date_time and total.
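
As a rough illustration of the annotation scheme described above, the sketch below models one bounding-box annotation and groups annotations by class. Only the four class names come from the dataset description; the `BoxAnnotation` type, its corner-coordinate layout, and the `group_by_class` helper are hypothetical and may not match the dataset's actual file format.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four classes named in the dataset description.
CLASSES = ("item", "store", "date_time", "total")


@dataclass(frozen=True)
class BoxAnnotation:
    """Hypothetical record: one labelled text segment on a receipt image."""
    label: str
    x_min: float  # assumed pixel coordinates of the box corners
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self) -> None:
        # Reject labels outside the four documented classes.
        if self.label not in CLASSES:
            raise ValueError(f"unknown class: {self.label!r}")
        # Reject boxes with no area.
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("box must have positive width and height")


def group_by_class(annotations: list[BoxAnnotation]) -> dict[str, list[BoxAnnotation]]:
    """Collect annotations under their class label, preserving input order."""
    grouped: dict[str, list[BoxAnnotation]] = defaultdict(list)
    for ann in annotations:
        grouped[ann.label].append(ann)
    return dict(grouped)
```

Grouping by class makes it straightforward to, for example, crop only the `total` regions before passing them to an OCR engine.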

  2. Vietnamese Receipts MC_OCR 2021

    • kaggle.com
    zip
    Updated Apr 8, 2022
    Cite
    DoMixi1989 (2022). Vietnamese Receipts MC_OCR 2021 [Dataset]. https://www.kaggle.com/datasets/domixi1989/vietnamese-receipts-mc-ocr-2021
    Available download formats: zip (2271709772 bytes)
    Dataset updated
    Apr 8, 2022
    Authors
    DoMixi1989
    Description

    Dataset

    This dataset was created by DoMixi1989

    Contents

  3. Adding receipts files

    • kaggle.com
    zip
    Updated Jan 13, 2019
    Cite
    Premshah (2019). Adding receipts files [Dataset]. https://www.kaggle.com/datasets/premshah/adding-receipts-files/discussion
    Available download formats: zip (49732 bytes)
    Dataset updated
    Jan 13, 2019
    Authors
    Premshah
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Premshah

    Released under CC BY-SA 4.0

    Contents

  4. image invoice

    • kaggle.com
    Updated Feb 9, 2024
    Cite
    chakir hicham (2024). image invoice [Dataset]. https://www.kaggle.com/datasets/chakirhicham/image-invoice/data
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    chakir hicham
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by chakir hicham

    Released under Apache 2.0

    Contents

  5. invoices-donut-data-v1

    • huggingface.co
    Updated May 11, 2023
    Cite
    invoices-donut-data-v1 [Dataset]. https://huggingface.co/datasets/katanaml-org/invoices-donut-data-v1
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 11, 2023
    Dataset authored and provided by
    Katana ML
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Invoices (Sparrow)

    This dataset contains 500 invoice documents, annotated and processed so that they are ready for fine-tuning the Donut ML model. The annotation and data preparation were done by the Katana ML team. Sparrow is an open-source data extraction solution by Katana ML. Original dataset info: Kozłowski, Marek; Weichbroth, Paweł (2021), “Samples of electronic invoices”, Mendeley Data, V2, doi: 10.17632/tnj49gpmtz.2

  6. Invoice

    • kaggle.com
    zip
    Updated Oct 28, 2019
    Cite
    Sid (2019). Invoice [Dataset]. https://www.kaggle.com/sidasj/invoice
    Available download formats: zip (13886 bytes)
    Dataset updated
    Oct 28, 2019
    Authors
    Sid
    Description

    Dataset

    This dataset was created by Sid

    Contents

  7. Contracts to Invoices Conversion (pdf_to_json)

    • kaggle.com
    Updated Sep 29, 2024
    Cite
    Eduard Balamatiuc (2024). Contracts to Invoices Conversion (pdf_to_json) [Dataset]. https://www.kaggle.com/datasets/eduardbalamatiuc/contracts-to-invoices-conversion-pdf-to-json/data
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 29, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Eduard Balamatiuc
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Eduard Balamatiuc

    Released under Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

    Contents

  8. Taiwan Consumption channel invoice statistics

    • kaggle.com
    zip
    Updated Jun 15, 2021
    Cite
    ChenWenSheng (2021). Taiwan Consumption channel invoice statistics [Dataset]. https://www.kaggle.com/datasets/chenwensheng/taiwan-consumption-channel-invoice-statistics/versions/1
    Available download formats: zip (825396 bytes)
    Dataset updated
    Jun 15, 2021
    Authors
    ChenWenSheng
    Area covered
    Taiwan
    Description

    Dataset

    This dataset was created by ChenWenSheng

    Contents

  9. Recipes by Ingredients

    • kaggle.com
    zip
    Updated Jan 5, 2021
    Cite
    Alin Cijov (2021). Recipes by Ingredients [Dataset]. https://www.kaggle.com/alincijov/cooking-ingredients
    Available download formats: zip (1533668 bytes)
    Dataset updated
    Jan 5, 2021
    Authors
    Alin Cijov
    Description

    Cooking Ingredients

    Use different algorithms to find the best recipe based on ingredients.

  10. Transportation Dataset

    • kaggle.com
    Updated Oct 2, 2023
    Cite
    Amit Zala (2023). Transportation Dataset [Dataset]. https://www.kaggle.com/datasets/amitzala/transportation-dataset
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Amit Zala
    License

    CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    SUMMARY This table contains data on the percent of residents aged 16 years and older by mode of transportation to work for California, its regions, counties, cities/towns, and census tracts. Data is from the U.S. Census Bureau, Decennial Census and American Community Survey. The table is part of a series of indicators in the Healthy Communities Data and Indicators Project of the Office of Health Equity. Commute trips to work represent 19% of travel miles in the United States. The predominant mode, the automobile, offers extraordinary personal mobility and independence, but it is also associated with health hazards, such as air pollution, motor vehicle crashes, pedestrian injuries and fatalities, and sedentary lifestyles. Automobile commuting has been linked to stress-related health problems. Active modes of transport (bicycling and walking, alone and in combination with public transit) offer opportunities for physical activity, which is associated with lower rates of heart disease and stroke, diabetes, colon and breast cancer, dementia and depression. The risk of injury and death in collisions is higher in urban areas with more concentrated vehicle and pedestrian activity. Bus and rail passengers have a lower risk of injury in collisions than motorcyclists, pedestrians, and bicyclists. Minority communities bear a disproportionate share of pedestrian-car fatalities; Native American male pedestrians experience four times the death rate of White or Asian pedestrians, and African-Americans and Latinos experience twice the rate of Whites or Asians. More information about the data table and a data dictionary can be found in the About/Attachments section.

    ind_id - Indicator ID
    ind_definition - Definition of indicator in plain language
    reportyear - Year that the indicator was reported
    race_eth_code - Numeric code for a race/ethnicity group
    race_eth_name - Name of race/ethnic group
    geotype - Type of geographic unit
    geotypevalue - Value of geographic unit
    geoname - Name of a geographic unit
    county_name - Name of county that geotype is in
    county_fips - FIPS code of the county that geotype is in
    region_name - MPO-based region name; see MPO_County list tab
    region_code - MPO-based region code; see MPO_County list tab
    mode - Mode of transportation short name
    mode_name - Mode of transportation long name
    pop_total - Denominator
    pop_mode - Numerator
    percent - Percent of Residents Mode of Transportation to Work, Population Aged 16 Years and Older
    LL_95CI_percent - The lower limit of 95% confidence interval
    UL_95CI_percent - The upper limit of 95% confidence interval
    percent_se - Standard error of the percent mode of transportation
    percent_rse - Relative standard error (se/value) expressed as a percent
    CA_decile - California decile
    CA_RR - Rate ratio to California rate
    version - Date/time stamp of a version of data
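
The relationship between the pop_mode, pop_total, percent, percent_se and percent_rse fields in the data dictionary above can be sketched as follows. This is a minimal illustration, assuming that percent is 100 times numerator over denominator and that percent_se is expressed in percentage points; the function name `percent_and_rse` is hypothetical and is not part of the dataset.

```python
def percent_and_rse(pop_mode: float, pop_total: float, percent_se: float) -> tuple[float, float]:
    """Return (percent, percent_rse) for one row.

    Assumptions (not confirmed by the source):
      percent     = 100 * pop_mode / pop_total
      percent_rse = 100 * percent_se / percent   (se/value expressed as a percent)
    """
    if pop_total <= 0:
        raise ValueError("pop_total (denominator) must be positive")
    percent = 100.0 * pop_mode / pop_total
    if percent == 0:
        raise ValueError("relative standard error is undefined when percent is 0")
    percent_rse = 100.0 * percent_se / percent
    return percent, percent_rse


# Example: 250 of 1,000 residents use a given mode, with a standard error of 2.5 points.
percent, rse = percent_and_rse(pop_mode=250, pop_total=1000, percent_se=2.5)
```

A high relative standard error signals that an estimate for a small geography is statistically unreliable, so rows could be filtered on percent_rse before comparison.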

