The bbox column is [x, y, width, height]. ymean is the y position of the mean of the box. line is the line number calculated using ymean.
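The column description above does not spell out how ymean and line are derived, so the following is only a minimal sketch under stated assumptions: y is taken to be the top edge of the box, ymean is taken to be the vertical centre (y + height / 2), and boxes are grouped into lines whenever consecutive ymean values differ by more than a tolerance. The helper names `y_mean` and `assign_lines` and the `tolerance` parameter are hypothetical and are not part of the dataset.

```python
from typing import List, Sequence


def y_mean(bbox: Sequence[float]) -> float:
    """Vertical centre of a [x, y, width, height] box.

    Assumption: y is the top edge, so the centre is y + height / 2.
    """
    _x, y, _width, height = bbox
    return y + height / 2.0


def assign_lines(bboxes: Sequence[Sequence[float]], tolerance: float = 10.0) -> List[int]:
    """Assign a 0-based line number to each box from its ymean.

    Hypothetical rule (not confirmed by the dataset card): walk the boxes in
    order of increasing ymean and start a new line whenever the gap to the
    previous ymean exceeds `tolerance`.
    """
    ymeans = [y_mean(b) for b in bboxes]
    order = sorted(range(len(bboxes)), key=lambda i: ymeans[i])

    lines = [0] * len(bboxes)
    current_line = 0
    previous = None
    for i in order:
        if previous is not None and ymeans[i] - previous > tolerance:
            current_line += 1
        lines[i] = current_line
        previous = ymeans[i]
    return lines


# Two boxes on the same text line, one box on the next line.
boxes = [[10, 100, 50, 20], [70, 102, 40, 20], [10, 140, 60, 20]]
print([y_mean(b) for b in boxes])
print(assign_lines(boxes))
```

The tolerance would need tuning to the image resolution and font size; a value near half the typical box height is a plausible starting point.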
Donut 🍩 : OCR-Free Document Understanding Transformer (ECCV 2022) -- SynthDoG datasets
For more information, please visit https://github.com/clovaai/donut
The links to the SynthDoG-generated datasets are here:
synthdog-en: English, 0.5M.
synthdog-zh: Chinese, 0.5M.
synthdog-ja: Japanese, 0.5M.
synthdog-ko: Korean, 0.5M.
To generate synthetic datasets with our SynthDoG, please see ./synthdog/README.md and our paper for details.
How to Cite
If you find this work useful… See the full description on the dataset page: https://huggingface.co/datasets/naver-clova-ix/synthdog-ko.
DonutVQA Dataset
This dataset is derived from the donut-vqa dataset, reformatting the test split with modified field names, so that it can be used in the ViDoRe benchmark. The text_description column contains OCR text extracted from the images using EasyOCR.
Disclaimer
This dataset may contain publicly available images or text data. All data is provided for research and educational purposes only. If you are the rights holder of any content and have concerns regarding… See the full description on the dataset page: https://huggingface.co/datasets/jinaai/donut_vqa.