MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🧠 Visual-TableQA: Open-Domain Benchmark for Reasoning over Table Images
Welcome to Visual-TableQA, a project designed to generate high-quality synthetic question-answer datasets associated with images of tables. This resource is ideal for training and evaluating models on visually grounded table understanding tasks such as document QA, table parsing, and multimodal reasoning.
🚀 Latest Update
We have refreshed the dataset with newly generated QA pairs created by… See the full description on the dataset page: https://huggingface.co/datasets/AI-4-Everyone/Visual-TableQA.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
A collection of tables drawn from the Open Research Knowledge Graph (ORKG) infrastructure, with a set of questions about these tables.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset was created by WikiDocument Dataset
Released under CC BY-SA 3.0
We present AdhesiveTableQA
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains resources, namely TempTabQA, developed for the paper: Gupta, V., Kandoi, P., Vora, M., Zhang, S., He, Y., Reinanda, R., Srikumar, V., TempTabQA: Temporal Question Answering for Semi-Structured Tables. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023.
TempTabQA is a dataset of 11,454 question-answer pairs extracted from Wikipedia Infobox tables and annotated by human annotators. We provide two test sets instead of one: the Head set, covering popular, frequent domains, and the Tail set, covering rarer domains.
The annotation files follow the structure below:
Maindata
Carefully read the `LICENCE` for non-academic usage.
Note: Wherever required, consider 2022 as the build date for the dataset.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
GRI-QA
GRI-QA is a benchmark for Table Question Answering (QA) over environmental data extracted from corporate sustainability reports, following the Global Reporting Initiative (GRI) standards. It contains 4,000+ questions across 204 tables from English-language reports of European companies, covering extractive, comparative, quantitative, multi-step, and multi-table reasoning.
Tasks
(Multi) Table QA on real-world corporate sustainability data
Question types: extra… See the full description on the dataset page: https://huggingface.co/datasets/lucacontalbo/GRI-QA.
hirundo-io/500-telecomm-personnel-table-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
HCT-QA: Human-Centric Tables Question Answering
HCT-QA is a benchmark dataset designed to evaluate large language models (LLMs) on question answering over complex, human-centric tables (HCTs). These tables often appear in documents such as research papers, reports, and webpages and present significant challenges for traditional table QA due to their non-standard layouts and compositional structure. The dataset includes:
2,188 real-world tables with 9,835 human-annotated QA pairs 4… See the full description on the dataset page: https://huggingface.co/datasets/qcri-ai/HCTQA.
This dataset was created by Somaditya Singh
Dataset Card for "spider-tableQA-pretraining"
Usage
import pandas as pd
from datasets import load_dataset

spider_tableQA_pretraining = load_dataset("vaishali/spider-tableQA-pretraining")

for sample in spider_tableQA_pretraining['train']:
    sql_query = sample['query']
    input_table_names = sample["table_names"]
    input_tables = [pd.read_json(table, orient='split') for table in sample['tables']]
    answer = pd.read_json(sample['answer'], orient='split')
# flattened… See the full description on the dataset page: https://huggingface.co/datasets/vaishali/spider-tableQA-pretraining.
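The usage snippet above decodes each table with `pd.read_json(..., orient='split')`. As a minimal offline sketch of that decoding step, assuming the tables are serialized as split-oriented JSON strings, the example below uses a made-up table string (not taken from the dataset) and a hypothetical helper named `decode_table`. The string is wrapped in `StringIO` because recent pandas releases deprecate passing literal JSON text directly to `read_json`.

```python
from io import StringIO

import pandas as pd

# Hypothetical split-oriented JSON table (illustrative only, not from the dataset).
table_json = (
    '{"columns": ["name", "age"], '
    '"index": [0, 1], '
    '"data": [["Ann", 31], ["Bob", 45]]}'
)


def decode_table(serialized: str) -> pd.DataFrame:
    """Decode one split-oriented JSON string into a DataFrame."""
    # StringIO gives read_json a file-like object instead of a raw string.
    return pd.read_json(StringIO(serialized), orient="split")


table = decode_table(table_json)
print(table)
```

The same helper could be mapped over `sample['tables']` in the loop above, provided the dataset stores its tables in this orientation.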
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Qatar QA: PPP Conversion Factor: Private Consumption data was reported at 2.823 QAR/Intl $ in 2016. This records an increase from the previous number of 2.779 QAR/Intl $ for 2015. The data is updated yearly, averaging 2.111 QAR/Intl $ (median) from Dec 1990 to 2016, with 27 observations. The data reached an all-time high of 2.938 QAR/Intl $ in 2008 and a record low of 1.992 QAR/Intl $ in 1994. The series remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database’s Qatar – Table QA.World Bank: Gross Domestic Product: Purchasing Power Parity. The purchasing power parity conversion factor is the number of units of a country's currency required to buy the same amount of goods and services in the domestic market as a U.S. dollar would buy in the United States. This conversion factor is for private consumption (i.e., household final consumption expenditure). For most economies, PPP figures are extrapolated from the 2011 International Comparison Program (ICP) benchmark estimates or imputed using a statistical model based on the 2011 ICP. For 47 high- and upper middle-income economies, conversion factors are provided by Eurostat and the Organisation for Economic Co-operation and Development (OECD). Source: World Bank, International Comparison Program database.
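As a rough illustration of how such a conversion factor is applied, the sketch below converts an amount in Qatari riyals to international dollars using the 2016 factor quoted above, and computes the relative change of the factor from 2015 to 2016. The 10,000 QAR amount and the function name are hypothetical and chosen only for the example.

```python
# PPP conversion factors for private consumption quoted above (QAR per international $).
PPP_2015 = 2.779
PPP_2016 = 2.823


def to_international_dollars(amount_qar: float, ppp_factor: float) -> float:
    """Convert local-currency spending to international dollars."""
    return amount_qar / ppp_factor


# Hypothetical household expenditure of 10,000 QAR in 2016.
spending_intl = to_international_dollars(10_000, PPP_2016)

# Relative change of the factor between 2015 and 2016, in percent.
change_pct = (PPP_2016 - PPP_2015) / PPP_2015 * 100

print(f"{spending_intl:.2f} international $")
print(f"{change_pct:.2f} % change")
```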
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for TableBench
📚 Paper
🏆 Leaderboard
💻 Code
Dataset Summary
TableBench is a comprehensive and complex benchmark designed to evaluate Table Question Answering (TableQA) capabilities, aligning closely with the "Reasoning Complexity of Questions" dimension in real-world Table QA scenarios. It covers 18 question categories across 4 major categories—including… See the full description on the dataset page: https://huggingface.co/datasets/Multilingual-Multimodal-NLP/TableBench.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Abundance is characterized by VA = very abundant; C = common; F = few; R = rare; B = barren; EB = essentially barren. Preservation is characterized by P = poor; M = moderate; G = good; E = etched; O = overgrown. Lowercase letters indicate material considered to be reworked.
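The legend above can be read as two lookup tables plus a case rule. The following is a minimal sketch of a decoder under that reading; the function name `decode` and the assumption that a code is written entirely in lowercase when the material is reworked are illustrative, not part of the dataset description.

```python
# Legend codes as listed in the description above.
ABUNDANCE = {
    "VA": "very abundant",
    "C": "common",
    "F": "few",
    "R": "rare",
    "B": "barren",
    "EB": "essentially barren",
}
PRESERVATION = {
    "P": "poor",
    "M": "moderate",
    "G": "good",
    "E": "etched",
    "O": "overgrown",
}


def decode(code: str, legend: dict[str, str]) -> tuple[str, bool]:
    """Return (description, reworked) for a legend code.

    Assumption: a fully lowercase code marks reworked material.
    """
    return legend[code.upper()], code.islower()


print(decode("VA", ABUNDANCE))
print(decode("g", PRESERVATION))
```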
Dataset Card for "geoQuery-tableQA"
Usage
import pandas as pd
from datasets import load_dataset

geoQuery_tableQA = load_dataset("vaishali/geoQuery-tableQA")

for sample in geoQuery_tableQA['train']:
    question = sample['question']
    input_table_names = sample["table_names"]
    input_tables = [pd.read_json(table, orient='split') for table in sample['tables']]
    answer = pd.read_json(sample['answer'], orient='split')
# flattened input/output input_to_model =… See the full description on the dataset page: https://huggingface.co/datasets/vaishali/geoQuery-tableQA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Qatar QA: Imports: cif: Emerging and Developing Economies: Western Hemisphere: Nicaragua data was reported at 0.277 USD mn in 2015. The data is updated yearly, averaging 0.277 USD mn (median) for Dec 2015, with 1 observation. The series remains in active status in CEIC and is reported by the International Monetary Fund. The data is categorized under Global Database’s Qatar – Table QA.IMF.DOT: Imports: cif: by Country: Annual.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Qatar QA: Gross Intake Ratio in First Grade of Primary Education: Female: % of Relevant Age Group data was reported at 109.484 % in 2016. This records a decrease from the previous number of 112.190 % for 2015. The data is updated yearly, averaging 93.356 % (median) from Dec 1971 to 2016, with 40 observations. The data reached an all-time high of 112.973 % in 2009 and a record low of 55.654 % in 1992. The series remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database’s Qatar – Table QA.World Bank: Education Statistics. Gross intake ratio in first grade of primary education is the number of new entrants in the first grade of primary education regardless of age, expressed as a percentage of the population of the official primary entrance age. Source: UNESCO Institute for Statistics. Aggregation: weighted average. Each economy is classified based on the World Bank Group's fiscal year 2018 classification (July 1, 2017-June 30, 2018).
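The definition above amounts to a simple ratio, which also explains why values above 100 % are possible: new entrants are counted regardless of age, while the denominator covers only the official entrance age. A minimal sketch follows, with hypothetical counts and a hypothetical function name chosen for illustration.

```python
def gross_intake_ratio(new_entrants: int, entrance_age_population: int) -> float:
    """New first-grade entrants of any age, as a percentage of the
    population at the official primary entrance age."""
    if entrance_age_population <= 0:
        raise ValueError("entrance_age_population must be positive")
    return new_entrants / entrance_age_population * 100


# Hypothetical counts: more entrants than children of the official entrance age,
# for example because some entrants are older or younger than that age.
ratio = gross_intake_ratio(new_entrants=10_948, entrance_age_population=10_000)
print(f"{ratio:.2f} %")
```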
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MMTabQA Dataset (HuggingFace Format)
This is the MMTabQA benchmark (EMNLP Findings 2024) converted to HuggingFace Dataset format. MMTabQA is a multimodal table question answering benchmark where tables contain both text and images. It combines four existing table QA datasets (WikiTableQuestions, WikiSQL, FeTaQA, HybridQA) with images replacing certain entity mentions.
Related Work: CAPTR
This dataset conversion was created as part of our research on CAPTR (Caption-based… See the full description on the dataset page: https://huggingface.co/datasets/lenglaender/mmtabqa.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Qatar QA: Quality of Port Infrastructure: WEF: 1=Extremely Underdeveloped To 7=Well Developed and Efficient by International Standards data was reported at 5.600 NA in 2017. This stayed constant from the previous number of 5.600 NA for 2016. The data is updated yearly, averaging 5.400 NA (median) from Dec 2007 to 2017, with 11 observations. The data reached an all-time high of 5.600 NA in 2017 and a record low of 4.369 NA in 2007. The series remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database’s Qatar – Table QA.World Bank.WDI: Transportation. The Quality of Port Infrastructure measures business executives' perception of their country's port facilities. Data are from the World Economic Forum's Executive Opinion Survey, conducted for 30 years in collaboration with 150 partner institutes. The 2009 round included more than 13,000 respondents from 133 countries. Sampling follows a dual stratification based on company size and the sector of activity. Data are collected online or through in-person interviews. Responses are aggregated using sector-weighted averaging. The data for the latest year are combined with the data for the previous year to create a two-year moving average. Scores range from 1 (port infrastructure considered extremely underdeveloped) to 7 (port infrastructure considered efficient by international standards). Respondents in landlocked countries were asked how accessible port facilities are (1 = extremely inaccessible; 7 = extremely accessible). Source: World Economic Forum, Global Competitiveness Report. Aggregation: unweighted average.
MYMY-young/symdataset-tableqa-all dataset hosted on Hugging Face and contributed by the HF Datasets community