100+ datasets found

chart-to-text
huggingface.co
Updated Oct 28, 2024
Cite
Saad Obaid ul Islam (2024). chart-to-text [Dataset]. https://huggingface.co/datasets/saadob12/chart-to-text
Dataset updated
Oct 28, 2024
Authors
Saad Obaid ul Islam
Description
Tackling Hallucinations in Neural Chart Summarization

Introduction

The trained model used for the investigations and the state-of-the-art (SOTA) improvements are detailed in the paper "Tackling Hallucinations in Neural Chart Summarization". This repo contains the optimized input prompts and the summaries after NLI filtering.

Abstract

Hallucinations in text generation occur when the system produces text that is not grounded in the input. In this work, we address the problem of… See the full description on the dataset page: https://huggingface.co/datasets/saadob12/chart-to-text.
Chart Text Detection Dataset
universe.roboflow.com
zip
Updated Sep 26, 2024
Cite
minhngoncoding (2024). Chart Text Detection Dataset [Dataset]. https://universe.roboflow.com/minhngoncoding/chart-text-detection
Available download formats: zip
Dataset updated
Sep 26, 2024
Dataset authored and provided by
minhngoncoding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Text Bounding Boxes
Description
Chart Text Detection

## Overview
Chart Text Detection is a dataset for object detection tasks. It contains Text annotations for 6,399 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Effective comment data and chart data
scidb.cn
Updated Apr 25, 2022
Cite
Li Shancheng (2022). Effective comment data and chart data [Dataset]. http://doi.org/10.57760/sciencedb.01715
Unique identifier
https://doi.org/10.57760/sciencedb.01715
Dataset updated
Apr 25, 2022
Dataset provided by
Science Data Bank
Authors
Li Shancheng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains two files: one holds all valid comment text data used in the paper (297,774 items in total); the other holds the data required to draw the main graphs in the paper.
TEXT - Chart patterns
atmatix.pl
Updated Aug 8, 2025
Cite
ATmatix (2025). TEXT - Chart patterns [Dataset]. https://www.atmatix.pl/en/patterns/all/wse/TEXT
Dataset updated
Aug 8, 2025
Dataset provided by
ATmatix
License
https://www.atmatix.pl/help/terms-of-service#copyright
Description
TEXT (TXT) - Text SA - Technical analysis chart patterns - pattern list, candlestick charts and statistics
EconBiz Images for Text Extraction from Scholarly Figures
live.european-language-grid.eu
data.niaid.nih.gov
json
Updated Apr 17, 2024
Cite
(2024). EconBiz Images for Text Extraction from Scholarly Figures [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7506
Available download formats: json
Dataset updated
Apr 17, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
"Scholarly figures are data visualizations like bar charts, pie charts, line graphs, maps, scatter plots or similar figures. Text extraction from scholarly figures is useful in many application scenarios, since text in scholarly figures often contains information that is not present in the surrounding text. This dataset is a corpus of 121 scholarly figures from the economics domain evaluating text extraction tools. We randomly extracted these figures from a corpus of 288,000 open access publications from EconBiz. The dataset resembles a wide variety of scholarly figures from bar charts to maps. We manually labeled the figures to create the gold standard.
We adjusted the provided gold standard to have a uniform format for all datasets. Each figure is accompanied by a TSV (tab-separated values) file in which each entry corresponds to a text line with the following structure:
X-coordinate of the center of the bounding box in pixels
Y-coordinate of the center of the bounding box in pixels
Width of the bounding box in pixels
Height of the bounding box in pixels
Rotation angle around its center in degrees
Text inside the bounding box
In addition, we provide the ground truth in JSON format. A schema file is included in each dataset as well. The dataset is accompanied by a ReadMe file with further information about the figures and their origin.
If you use this dataset in your own work, please cite one of the papers in the references."
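The TSV layout described above could be read along the following lines. This is a minimal sketch, not the dataset authors' tooling: it assumes the files have no header row, that the columns appear in the order listed, and that the five numeric columns parse as floats. The `TextLine` class, the `read_text_lines` helper, and the sample rows are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TextLine:
    """One gold-standard text line (field order assumed from the description)."""
    x_center: float
    y_center: float
    width: float
    height: float
    rotation_deg: float
    text: str

    def axis_aligned_corners(self):
        """Return (left, top, right, bottom), deliberately ignoring rotation."""
        half_w, half_h = self.width / 2, self.height / 2
        return (self.x_center - half_w, self.y_center - half_h,
                self.x_center + half_w, self.y_center + half_h)


def read_text_lines(handle):
    """Parse tab-separated rows of: x, y, width, height, rotation, text."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    lines = []
    for row in reader:
        if len(row) < 6:
            continue  # skip blank or malformed rows
        x, y, w, h, angle = (float(value) for value in row[:5])
        lines.append(TextLine(x, y, w, h, angle, row[5]))
    return lines


# Made-up sample rows standing in for one figure's TSV file.
sample = io.StringIO("120\t45\t80\t20\t0\tGDP growth\n30\t200\t60\t16\t90\tPercent\n")
boxes = read_text_lines(sample)
print(boxes[0].axis_aligned_corners())
```

Converting from center-based to corner-based boxes is a common first step before comparing against the output of a text extraction tool; handling the rotation angle is left out here on purpose.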
Open Text | OTC - Debt
tradingeconomics.com
csv, excel, json, xml
Updated Jun 15, 2025
Cite
TRADING ECONOMICS (2025). Open Text | OTC - Debt [Dataset]. https://tradingeconomics.com/otc:cn:debt
Available download formats: xml, json, excel, csv
Dataset updated
Jun 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 11, 2025
Area covered
Canada
Description
Open Text reported $4.57M in debt for its fiscal quarter ending in June 2025. Data for Open Text | OTC - Debt, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Open Text | OTC - Market Capitalization
tradingeconomics.com
csv, excel, json, xml
Updated Feb 22, 2018
Cite
TRADING ECONOMICS (2018). Open Text | OTC - Market Capitalization [Dataset]. https://tradingeconomics.com/otc:cn:market-capitalization
Available download formats: csv, xml, excel, json
Dataset updated
Feb 22, 2018
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 12, 2025
Area covered
Canada
Description
Open Text reported CAD10.96B in market capitalization in August 2025, based on the latest stock price and the number of outstanding shares. Data for Open Text | OTC - Market Capitalization, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Sentence/Table Pair Data from Wikipedia for Pre-training with...
data.niaid.nih.gov
zenodo.org
Updated Oct 29, 2021
Cite
Huan Sun (2021). Sentence/Table Pair Data from Wikipedia for Pre-training with Distant-Supervision [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5612315
Dataset updated
Oct 29, 2021
Dataset provided by
Alyssa Lees
Huan Sun
Yu Su
You Wu
Cong Yu
Xiang Deng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset used for pre-training in "ReasonBERT: Pre-trained to Reason with Distant Supervision", EMNLP'21.

There are two files:

sentence_pairs_for_pretrain_no_tokenization.tar.gz -> Contain only sentences as evidence, Text-only

table_pairs_for_pretrain_no_tokenization.tar.gz -> At least one piece of evidence is a table, Hybrid

The data is chunked into multiple tar files for easy loading. We use WebDataset, a PyTorch Dataset (IterableDataset) implementation providing efficient sequential/streaming data access.

For pre-training code, or if you have any questions, please check our GitHub repo https://github.com/sunlab-osu/ReasonBERT

Below is a sample code snippet to load the data

    import webdataset as wds

    # Path to the uncompressed files; this should be a directory with a set of tar files.
    # Brace expansion for shard ranges uses two dots: {000000..000763}.
    url = './sentence_multi_pairs_for_pretrain_no_tokenization/{000000..000763}.tar'
    dataset = (
        wds.WebDataset(url)  # this class was called wds.Dataset in older WebDataset releases
        .shuffle(1000)  # cache 1000 samples and shuffle
        .decode()
        .to_tuple("json")
        .batched(20)  # group every 20 examples into a batch
    )

Please see the WebDataset documentation for more details about how to use it as a dataloader for PyTorch.

You can also iterate through all examples and dump them in your preferred data format.
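Dumping the examples to another format does not strictly require WebDataset. The sketch below uses only the Python standard library and writes one JSON object per line (JSONL). It assumes that each sample is stored as a `.json` member inside the tar shards, which the `.to_tuple("json")` call in the snippet suggests but the description does not state outright. The helper names and the tiny demo shard are hypothetical.

```python
import io
import json
import tarfile
import tempfile
from pathlib import Path


def iter_json_samples(shard_dir):
    """Yield decoded JSON objects from every .tar shard in shard_dir."""
    for shard in sorted(Path(shard_dir).glob("*.tar")):
        with tarfile.open(shard) as tar:
            for member in tar:
                # Assumption: samples are regular files ending in ".json".
                if member.isfile() and member.name.endswith(".json"):
                    yield json.load(tar.extractfile(member))


def dump_jsonl(shard_dir, out_path):
    """Write one JSON object per line and return the number written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for sample in iter_json_samples(shard_dir):
            out.write(json.dumps(sample, ensure_ascii=False) + "\n")
            count += 1
    return count


# Self-contained demo: build one fake shard, then dump it.
with tempfile.TemporaryDirectory() as tmp:
    payload = json.dumps({"s1_text": "Sils is a municipality.", "pairs": []}).encode("utf-8")
    info = tarfile.TarInfo(name="000000.json")
    info.size = len(payload)
    with tarfile.open(Path(tmp) / "000000.tar", "w") as tar:
        tar.addfile(info, io.BytesIO(payload))
    written = dump_jsonl(tmp, Path(tmp) / "dump.jsonl")
    first_line = (Path(tmp) / "dump.jsonl").read_text(encoding="utf-8").splitlines()[0]
```

For the real data, `shard_dir` would point at the directory of uncompressed tar files; the streaming loop keeps only one sample in memory at a time.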

Below we show how the data is organized with two examples.

Text-only

    {'s1_text': 'Sils is a municipality in the comarca of Selva, in Catalonia, Spain.', # query sentence
     's1_all_links': {
         'Sils,_Girona': [[0, 4]],
         'municipality': [[10, 22]],
         'Comarques_of_Catalonia': [[30, 37]],
         'Selva': [[41, 46]],
         'Catalonia': [[51, 60]]
     }, # list of entities and their mentions in the sentence (start, end location)
     'pairs': [ # other sentences that share common entity pair with the query, group by shared entity pairs
         {
             'pair': ['Comarques_of_Catalonia', 'Selva'], # the common entity pair
             's1_pair_locs': [[[30, 37]], [[41, 46]]], # mention of the entity pair in the query
             's2s': [ # list of other sentences that contain the common entity pair, or evidence
                 {
                     'md5': '2777e32bddd6ec414f0bc7a0b7fea331',
                     'text': 'Selva is a coastal comarque (county) in Catalonia, Spain, located between the mountain range known as the Serralada Transversal or Puigsacalm and the Costa Brava (part of the Mediterranean coast). Unusually, it is divided between the provinces of Girona and Barcelona, with Fogars de la Selva being part of Barcelona province and all other municipalities falling inside Girona province. Also unusually, its capital, Santa Coloma de Farners, is no longer among its larger municipalities, with the coastal towns of Blanes and Lloret de Mar having far surpassed it in size.',
                     's_loc': [0, 27], # in addition to the sentence containing the common entity pair, we also keep its surrounding context. 's_loc' is the start/end location of the actual evidence sentence
                     'pair_locs': [ # mentions of the entity pair in the evidence
                         [[19, 27]], # mentions of entity 1
                         [[0, 5], [288, 293]] # mentions of entity 2
                     ],
                     'all_links': {
                         'Selva': [[0, 5], [288, 293]],
                         'Comarques_of_Catalonia': [[19, 27]],
                         'Catalonia': [[40, 49]]
                     }
                 }
             ,...] # there are multiple evidence sentences
         },
     ,...] # there are multiple entity pairs in the query
    }
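The span annotations in the text-only example can be turned back into surface strings by slicing the sentence. A minimal sketch, assuming the offsets are character positions with an exclusive end, as in Python slicing (the sample values are consistent with that reading, but it is an assumption). `mention_texts` is a hypothetical helper, not part of the ReasonBERT code.

```python
def mention_texts(text, links):
    """Map each entity to the substrings covered by its [start, end] spans."""
    return {
        entity: [text[start:end] for start, end in spans]
        for entity, spans in links.items()
    }


# Query sentence and links copied from the text-only example.
record = {
    "s1_text": "Sils is a municipality in the comarca of Selva, in Catalonia, Spain.",
    "s1_all_links": {
        "Sils,_Girona": [[0, 4]],
        "municipality": [[10, 22]],
        "Comarques_of_Catalonia": [[30, 37]],
        "Selva": [[41, 46]],
        "Catalonia": [[51, 60]],
    },
}

mentions = mention_texts(record["s1_text"], record["s1_all_links"])
for entity, surface_forms in mentions.items():
    print(entity, "->", surface_forms)
```

Note that the linked entity name and its surface form can differ: here the entity `Comarques_of_Catalonia` is mentioned in the sentence as "comarca".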

Hybrid

    {'s1_text': 'The 2006 Major League Baseball All-Star Game was the 77th playing of the midseason exhibition baseball game between the all-stars of the American League (AL) and National League (NL), the two leagues comprising Major League Baseball.',
     's1_all_links': {...}, # same as text-only
     'sentence_pairs': [{'pair': ..., 's1_pair_locs': ..., 's2s': [...]}], # same as text-only
     'table_pairs': [
         'tid': 'Major_League_Baseball-1',
         'text':[
             ['World Series Records', 'World Series Records', ...],
             ['Team', 'Number of Series won', ...],
             ['St. Louis Cardinals (NL)', '11', ...],
         ...] # table content, list of rows
         'index':[
             [[0, 0], [0, 1], ...],
             [[1, 0], [1, 1], ...],
         ...] # index of each cell [row_id, col_id]. we keep only a table snippet, but the index here is from the original table.
         'value_ranks':[
             [0, 0, ...],
             [0, 0, ...],
             [0, 10, ...],
         ...] # if the cell contain numeric value/date, this is its rank ordered from small to large, follow TAPAS
         'value_inv_ranks': [], # inverse rank
         'all_links':{
             'St._Louis_Cardinals': {
                 '2': [
                     [[2, 0], [0, 19]], # [[row_id, col_id], [start, end]]
                 ] # list of mentions in the second row, the key is row_id
             },
             'CARDINAL:11': {'2': [[[2, 1], [0, 2]]], '8': [[[8, 3], [0, 2]]]},
         }
         'name': '', # table name, if exists
         'pairs': {
             'pair': ['American_League', 'National_League'],
             's1_pair_locs': [[[137, 152]], [[162, 177]]], # mention in the query
             'table_pair_locs': {
                 '17': [ # mention of entity pair in row 17
                     [
                         [[17, 0], [3, 18]],
                         [[17, 1], [3, 18]],
                         [[17, 2], [3, 18]],
                         [[17, 3], [3, 18]]
                     ], # mention of the first entity
                     [
                         [[17, 0], [21, 36]],
                         [[17, 1], [21, 36]],
                     ] # mention of the second entity
                 ]
             }
         }
     ]
    }
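Because the hybrid example keeps only a table snippet while `index` holds original-table coordinates, resolving a table mention takes one extra lookup. The sketch below is one possible reading, not the authors' code: it assumes `index` lines up cell by cell with `text`, and that `[start, end]` are character offsets with an exclusive end. The table is a trimmed copy of the sample; the third `index` row is filled in by assumption, and `table_mention_texts` is a hypothetical helper.

```python
def table_mention_texts(table):
    """Resolve [[row_id, col_id], [start, end]] mentions to cell substrings."""
    # Map original-table coordinates to the snippet's cell text.
    cell_text = {}
    for text_row, index_row in zip(table["text"], table["index"]):
        for value, (row_id, col_id) in zip(text_row, index_row):
            cell_text[(row_id, col_id)] = value

    resolved = {}
    for entity, rows in table["all_links"].items():
        for mentions in rows.values():
            for (row_id, col_id), (start, end) in mentions:
                key = (row_id, col_id)
                if key in cell_text:  # mentions outside the snippet are skipped
                    resolved.setdefault(entity, []).append(cell_text[key][start:end])
    return resolved


# Trimmed copy of the hybrid example (two columns, three rows).
table = {
    "text": [
        ["World Series Records", "World Series Records"],
        ["Team", "Number of Series won"],
        ["St. Louis Cardinals (NL)", "11"],
    ],
    "index": [[[0, 0], [0, 1]], [[1, 0], [1, 1]], [[2, 0], [2, 1]]],  # last row assumed
    "all_links": {
        "St._Louis_Cardinals": {"2": [[[2, 0], [0, 19]]]},
        "CARDINAL:11": {"2": [[[2, 1], [0, 2]]], "8": [[[8, 3], [0, 2]]]},
    },
}

resolved = table_mention_texts(table)
```

The mention of `CARDINAL:11` in row 8 is skipped in this demo because that row is not part of the trimmed snippet.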
Power BI Financials Data
kaggle.com
Updated May 6, 2024
Cite
Sanjana Murthy (2024). Power BI Financials Data [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/power-bi-financials-data/discussion?sort=undefined
Dataset updated
May 6, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Sanjana Murthy
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset

This dataset was created by Sanjana Murthy

Released under CC BY-NC-SA 4.0

Contents
Data from: Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph...
zenodo.org
data.niaid.nih.gov
zip
Updated May 23, 2023
Cite
Nandana Mihindukulasooriya; Sanju Tiwari; Carlos F. Enguix; Kusum Lata (2023). Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text [Dataset]. http://doi.org/10.5281/zenodo.7916716
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7916716
Dataset updated
May 23, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Nandana Mihindukulasooriya; Sanju Tiwari; Carlos F. Enguix; Kusum Lata
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the repository for ISWC 2023 Resource Track submission for Text2KGBench: Benchmark for Ontology-Driven Knowledge Graph Generation from Text. Text2KGBench is a benchmark to evaluate the capabilities of language models to generate KGs from natural language text guided by an ontology. Given an input ontology and a set of sentences, the task is to extract facts from the text while complying with the given ontology (concepts, relations, domain/range constraints) and being faithful to the input sentences.

It contains two datasets (i) Wikidata-TekGen with 10 ontologies and 13,474 sentences and (ii) DBpedia-WebNLG with 19 ontologies and 4,860 sentences.

An example

An example test sentence:

Test Sentence: {"id": "ont_music_test_n", "sent": "\"The Loco-Motion\" is a 1962 pop song written by American songwriters Gerry Goffin and Carole King."}

An example of ontology:

Ontology: Music Ontology

Expected Output:

    {
      "id": "ont_k_music_test_n",
      "sent": "\"The Loco-Motion\" is a 1962 pop song written by American songwriters Gerry Goffin and Carole King.",
      "triples": [
        {"sub": "The Loco-Motion", "rel": "publication date", "obj": "01 January 1962"},
        {"sub": "The Loco-Motion", "rel": "lyrics by", "obj": "Gerry Goffin"},
        {"sub": "The Loco-Motion", "rel": "lyrics by", "obj": "Carole King"}
      ]
    }
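One way to picture the "complying with the given ontology" requirement is a check that every extracted relation belongs to the ontology's relation set. The sketch below is only an illustration under stated assumptions: `ALLOWED_RELATIONS` is a made-up two-relation subset, not the actual Music Ontology file, and `nonconforming_triples` is a hypothetical helper rather than the benchmark's evaluation script. The second triple in `candidate` is invented to show a violation.

```python
# Hypothetical subset of relations; the real ontology files define the full set.
ALLOWED_RELATIONS = {"publication date", "lyrics by"}


def nonconforming_triples(record, allowed_relations):
    """Return the triples whose relation is not defined in the ontology."""
    return [t for t in record["triples"] if t["rel"] not in allowed_relations]


# Expected output copied from the example above.
expected = {
    "id": "ont_k_music_test_n",
    "sent": "\"The Loco-Motion\" is a 1962 pop song written by American "
            "songwriters Gerry Goffin and Carole King.",
    "triples": [
        {"sub": "The Loco-Motion", "rel": "publication date", "obj": "01 January 1962"},
        {"sub": "The Loco-Motion", "rel": "lyrics by", "obj": "Gerry Goffin"},
        {"sub": "The Loco-Motion", "rel": "lyrics by", "obj": "Carole King"},
    ],
}

# Invented model output with one relation outside the assumed subset.
candidate = {
    "id": expected["id"],
    "sent": expected["sent"],
    "triples": [
        {"sub": "The Loco-Motion", "rel": "lyrics by", "obj": "Carole King"},
        {"sub": "The Loco-Motion", "rel": "recorded in", "obj": "1962"},
    ],
}

violations = nonconforming_triples(candidate, ALLOWED_RELATIONS)
print(violations)
```

A full conformance check would also cover concepts and domain/range constraints, which this sketch leaves out.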

The data is released under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY 4.0) License.

The repo is structured as follows.

Text2KGBench
  src: the source code used for generation and evaluation, and baseline
    benchmark: the code used to generate the benchmark
    evaluation: evaluation scripts for calculating the results
    baseline: code for generating the baselines including prompts, sentence similarities, and LLM client.
  data: the benchmark datasets and baseline data. There are two datasets: wikidata_tekgen and dbpedia_webnlg.
    wikidata_tekgen: Wikidata-TekGen Dataset
      ontologies: 10 ontologies used by this dataset
      train: training data
      test: test data
      manually_verified_sentences: ids of a subset of test cases manually validated
      unseen_sentences: new sentences that are added by the authors which are not part of Wikipedia
        test unseen test unseen test sentences
        ground_truth: ground truth for unseen test sentences.
      ground_truth: ground truth for the test data
      baselines: data related to running the baselines.
        test_train_sent_similarity: for each test case, 5 most similar train sentences generated using SBERT T5-XXL model.
        prompts: prompts corresponding to each test file
          unseen prompts unseen prompts for the unseen test cases
        Alpaca-LoRA-13B: data related to the Alpaca-LoRA model
          llm_responses: raw LLM responses and extracted triples
          eval_metrics: ontology-level and aggregated evaluation results
          unseen results results for the unseen test cases
            llm_responses: raw LLM responses and extracted triples
            eval_metrics: ontology-level and aggregated evaluation results
        Vicuna-13B: data related to the Vicuna-13B model
          llm_responses: raw LLM responses and extracted triples
          eval_metrics: ontology-level and aggregated evaluation results
    dbpedia_webnlg: DBpedia Dataset
      ontologies: 19 ontologies used by this dataset
      train: training data
      test: test data
      ground_truth: ground truth for the test data
      baselines: data related to running the baselines.
        test_train_sent_similarity: for each test case, 5 most similar train sentences generated using SBERT T5-XXL model.
        prompts: prompts corresponding to each test file
        Alpaca-LoRA-13B: data related to the Alpaca-LoRA model
          llm_responses: raw LLM responses and extracted triples
          eval_metrics: ontology-level and aggregated evaluation results
        Vicuna-13B: data related to the Vicuna-13B model
          llm_responses: raw LLM responses and extracted triples
          eval_metrics: ontology-level and aggregated evaluation results

This benchmark contains data derived from the TekGen corpus (part of the KELM corpus) [1] released under CC BY-SA 2.0 license and WebNLG 3.0 corpus [2] released under CC BY-NC-SA 4.0 license.

[1] Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. 2021. Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3554–3565, Online. Association for Computational Linguistics.

[2] Claire Gardent, Anastasia Shimorina, Shashi Narayan, and Laura Perez-Beltrachini. 2017. Creating Training Corpora for NLG Micro-Planners. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages
Statistics of TABLE and TEXT.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Sep 11, 2024
Cite
Kawazoe, Yoshimasa; Hori, Satoko; Aramaki, Eiji; Nishiyama, Tomohiro; Yada, Shuntaro; Imai, Shungo; Wakamiya, Shoko (2024). Statistics of TABLE and TEXT. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001351015
Dataset updated
Sep 11, 2024
Authors
Kawazoe, Yoshimasa; Hori, Satoko; Aramaki, Eiji; Nishiyama, Tomohiro; Yada, Shuntaro; Imai, Shungo; Wakamiya, Shoko
Description
Real-world data (RWD) in the medical field, such as electronic health records (EHRs) and medication orders, are receiving increasing attention from researchers and practitioners. While structured data have played a vital role thus far, unstructured data represented by text (e.g., discharge summaries) are not effectively utilized because of the difficulty in extracting medical information. We evaluated the information gained by supplementing structured data with clinical concepts extracted from unstructured text by leveraging natural language processing techniques. Using a machine learning-based pretrained named entity recognition tool, we extracted disease and medication names from real discharge summaries in a Japanese hospital and linked them to medical concepts using medical term dictionaries. By comparing the diseases and medications mentioned in the text with medical codes in tabular diagnosis records, we found that: (1) the text data contained richer information on patient symptoms than tabular diagnosis records, whereas the medication-order table stored more injection data than text. In addition, (2) extractable information regarding specific diseases showed surprisingly small intersections among text, diagnosis records, and medication orders. Text data can thus be a useful supplement for RWD mining, which is further demonstrated by (3) our practical application system for drug safety evaluation, which exhaustively visualizes suspicious adverse drug effects caused by the simultaneous use of anticancer drug pairs. We conclude that proper use of textual information extraction can lead to better outcomes in medical RWD mining.
Michigan Stratigraphic Nomenclature Chart
datadiscoverystudio.org
data.wu.ac.at
pdf
Updated Feb 8, 2013
Cite
Steve Wilson (2013). Michigan Stratigraphic Nomenclature Chart [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/3975fade7f464b649d7cd44ff47f81ce/html
Available download formats: pdf
Dataset updated
Feb 8, 2013
Authors
Steve Wilson
Description
Large format chart of Michigan stratigraphic formations. For information or to download this resource, please see links provided.
Data and code for: Variational Graph Author Topic Modeling
researchdata.smu.edu.sg
zip
Updated Jun 3, 2023
Cite
ZHANG, CE (SMU); Hady Wirawan LAUW (2023). Data and code for: Variational Graph Author Topic Modeling [Dataset]. http://doi.org/10.25440/smu.21378237.v1
Available download formats: zip
Unique identifier
https://doi.org/10.25440/smu.21378237.v1
Dataset updated
Jun 3, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
ZHANG, CE (SMU); Hady Wirawan LAUW
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the TensorFlow implementation of the KDD 2022 paper "Variational Graph Author Topic Modeling" by Delvin Ce Zhang and Hady W. Lauw.

VGATM is a Graph Neural Network model that extracts interpretable topics for documents with authors and venues. The document topics can then be used for downstream tasks such as document classification and citation prediction.
Open Text | OTC - Stock Price | Live Quote | Historical Chart
tradingeconomics.com
csv, excel, json, xml
Updated May 28, 2017
Cite
TRADING ECONOMICS (2017). Open Text | OTC - Stock Price | Live Quote | Historical Chart [Dataset]. https://tradingeconomics.com/otc:cn
Available download formats: json, xml, csv, excel
Dataset updated
May 28, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 12, 2025
Area covered
Canada
Description
Open Text stock price, live market quote, shares value, historical data, intraday chart, earnings per share and news.
Publication text: code, data, and new measures
zenodo.org
csv
Updated Jul 11, 2024
Cite
Sam Arts; Nicola Melluso; Reinhilde Veugelers; Leonidas Aristodemou (2024). Publication text: code, data, and new measures [Dataset]. http://doi.org/10.5281/zenodo.8283353
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8283353
Dataset updated
Jul 11, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Sam Arts; Nicola Melluso; Reinhilde Veugelers; Leonidas Aristodemou
License
Attribution-NonCommercial 1.0 (CC BY-NC 1.0): https://creativecommons.org/licenses/by-nc/1.0/
License information was derived automatically
Description
This Zenodo page describes data collection, processing, and different open access data files related to the text of scientific publications from Microsoft Academic Graph (MAG) (now OpenAlex). If you use the code or data, please cite the following paper:
Arts S, Melluso N, Veugelers R (2023). Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text. https://doi.org/10.48550/arXiv.2309.16437
Open Text | OTC - Dividend Yield
tradingeconomics.com
csv, excel, json, xml
Updated Mar 15, 2025
Cite
TRADING ECONOMICS (2025). Open Text | OTC - Dividend Yield [Dataset]. https://tradingeconomics.com/otc:cn:dy
Available download formats: json, excel, xml, csv
Dataset updated
Mar 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 10, 2025
Area covered
Canada
Description
Open Text reported a dividend yield of 4.16 for its fiscal quarter ending in March 2025. Data for Open Text | OTC - Dividend Yield, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Open Text | OTC - Assets
tradingeconomics.com
csv, excel, json, xml
Updated Jun 15, 2025
Cite
TRADING ECONOMICS (2025). Open Text | OTC - Assets [Dataset]. https://tradingeconomics.com/otc:cn:assets
Available download formats: xml, excel, json, csv
Dataset updated
Jun 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 12, 2025
Area covered
Canada
Description
Open Text reported $1.11M in assets for its fiscal quarter ending in June 2025. Data for Open Text | OTC - Assets, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Open Text | OTC - Equity Capital And Reserves
tradingeconomics.com
csv, excel, json, xml
Updated Jun 15, 2025
Cite
TRADING ECONOMICS (2025). Open Text | OTC - Equity Capital And Reserves [Dataset]. https://tradingeconomics.com/otc:cn:equity-capital-and-reserves
Available download formats: json, excel, csv, xml
Dataset updated
Jun 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 12, 2025
Area covered
Canada
Description
Open Text reported $-3966000 in equity capital and reserves for its fiscal quarter ending in June 2025. Data for Open Text | OTC - Equity Capital And Reserves, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Open Text | OTC - Employees Total Number
tradingeconomics.com
csv, excel, json, xml
Cite
TRADING ECONOMICS, Open Text | OTC - Employees Total Number [Dataset]. https://tradingeconomics.com/otc:cn:employees
Available download formats: csv, xml, excel, json
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 10, 2025
Area covered
Canada
Description
Open Text reported 14.8K employees for its fiscal year ending in June 2022. Data for Open Text | OTC - Employees Total Number, including historical data, tables and charts, was last updated by Trading Economics in August 2025.
Open Text | OTC - Ebitda
tradingeconomics.com
csv, excel, json, xml
Updated Jun 15, 2025
Cite
TRADING ECONOMICS (2025). Open Text | OTC - Ebitda [Dataset]. https://tradingeconomics.com/otc:cn:ebitda
Available download formats: json, excel, xml, csv
Dataset updated
Jun 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Aug 11, 2025
Area covered
Canada
Description
Open Text reported $-827000 in EBITDA for its fiscal quarter ending in June 2025. Data for Open Text | OTC - Ebitda, including historical data, tables and charts, was last updated by Trading Economics in August 2025.

chart-to-text (saadob12/chart-to-text): 240 scholarly articles cite this dataset, according to Google Scholar.
