100+ datasets found
  1. CBCD:A Chinese Bar Chart Dataset for Data Extraction

    • scidb.cn
    Updated Nov 14, 2025
    Cite
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan (2025). CBCD:A Chinese Bar Chart Dataset for Data Extraction [Dataset]. http://doi.org/10.57760/sciencedb.j00240.00052
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Nov 14, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Most existing chart datasets are in English, and there are almost no open-source Chinese chart datasets, which limits research on and applications of Chinese charts. This dataset follows the construction method of the DVQA dataset to build a chart dataset for the Chinese-language setting. To keep the data realistic and practical, we consulted the website of the National Bureau of Statistics and selected 24 data label categories that are widely used in practice, comprising 262 specific labels. These categories cover areas such as society and the economy, demographics, and industrial development. To increase diversity, we also define 10 numerical dimensions, which provide a wide range of values and several value types and can therefore simulate the data distributions and variations encountered in real applications. The dataset covers several types of Chinese bar charts: conventional vertical and horizontal bar charts, and more challenging stacked bar charts, so that methods can be tested on charts of differing complexity. Each chart type also carries attribute labels, including but not limited to whether data labels are shown and whether the text is rotated (for example by 45° or 90°). These details bring the dataset closer to real-world charts and place higher demands on data extraction methods.
    In addition to the chart images, the dataset provides the corresponding data table and title text for each chart, which are needed to understand the chart content and to verify the accuracy of extracted results. The chart images were generated with Matplotlib, a widely used Python data visualization library, chosen for its rich feature set, flexible configuration options, and compatibility. Matplotlib allows precise control over chart details, from the drawing of data points and the annotation of axes to legends and titles, so the generated images meet the research needs and remain readable. The dataset consists of 58,712 pairs of Chinese bar charts and corresponding data tables, divided into training, validation, and test sets in a 7:2:1 ratio.
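To make the 7:2:1 split concrete, the sketch below divides 58,712 item indices into three parts. It is only an illustration: the description does not say how the authors shuffled or rounded, so the function name `split_indices`, the seed, and the choice to give the rounding remainder to the test set are all assumptions.

```python
import random

def split_indices(n_items, ratios=(7, 2, 1), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test parts.

    Assumption: train and validation sizes are rounded down and the
    remainder goes to the test set; the dataset authors may round differently.
    """
    total = sum(ratios)
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(58712)
print(len(train), len(val), len(test))  # 41098 11742 5872
```

Under this rounding rule the three parts hold 41,098, 11,742, and 5,872 pairs; the published split files may differ by one or two items per part.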

  2. Data for: Stacked bar chart of MSW by context type.

    • repository.cam.ac.uk
    ods
    Updated Nov 9, 2017
    Cite
    Mills, Philip (2017). Data for: Stacked bar chart of MSW by context type. [Dataset]. http://doi.org/10.17863/CAM.14480
    Explore at:
    ods (7478 bytes)
    Dataset updated
    Nov 9, 2017
    Dataset provided by
    Apollo
    University of Cambridge
    Authors
    Mills, Philip
    License

    https://www.rioxx.net/licenses/all-rights-reserved/

    Description

    Ceramic building material quantification data by context type for Illus. 5.21.

  3. Electric Vehicle Charging Stations stacked bar chart

    • data.wu.ac.at
    csv, json, xml
    Updated Aug 4, 2017
    Cite
    US Department of Energy (2017). Electric Vehicle Charging Stations stacked bar chart [Dataset]. https://data.wu.ac.at/schema/performance_smcgov_org/aXNzZS1oY2g1
    Explore at:
    xml, json, csv
    Dataset updated
    Aug 4, 2017
    Dataset provided by
    US Department of Energy
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Location, type, and access information for electric vehicle charging stations in San Mateo County

  4. Top company types by companies in Stanford

    • workwithdata.com
    Updated May 6, 2025
    Cite
    Work With Data (2025). Top company types by companies in Stanford [Dataset]. https://www.workwithdata.com/charts/companies?agg=count&chart=hbar&f=1&fcol0=city&fop0=%3D&fval0=Stanford&x=company_type&y=records
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This horizontal bar chart displays companies by company type using the aggregation count in Stanford. The data is about companies.
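The chart URL in the citation for this entry appears to encode the same choices the description names (count aggregation, horizontal bar chart, filter on city). The sketch below unpacks the query string with the standard library. The reading of each parameter (`agg`, `chart`, `fcol0`, `fop0`, `fval0`, `x`, `y`) is inferred from the URL and the description, not from any documented Work With Data API.

```python
from urllib.parse import urlsplit, parse_qs

# Chart URL copied from the citation of this entry.
url = (
    "https://www.workwithdata.com/charts/companies"
    "?agg=count&chart=hbar&f=1&fcol0=city&fop0=%3D&fval0=Stanford"
    "&x=company_type&y=records"
)

# parse_qs returns a list per key and percent-decodes values (%3D becomes "=").
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

print(params["agg"], params["chart"], params["x"], params["y"])
print(params["fcol0"], params["fop0"], params["fval0"])
```

Read this way, the URL asks for a count of records grouped by `company_type`, drawn as a horizontal bar chart, with the filter `city = Stanford`.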

  5. Stacked Bar Chart by Hour

    • dune.com
    Updated Jun 26, 2025
    Cite
    fungi_agents (2025). Stacked Bar Chart by Hour [Dataset]. https://dune.com/discover/content/relevant?q=author%3Afungi_agents&resource-type=queries
    Dataset updated
    Jun 26, 2025
    Authors
    fungi_agents
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: Stacked Bar Chart by Hour

  6. Top company types by companies in Athens

    • workwithdata.com
    Updated May 6, 2025
    Cite
    Work With Data (2025). Top company types by companies in Athens [Dataset]. https://www.workwithdata.com/charts/companies?agg=count&chart=hbar&f=1&fcol0=city&fop0=%3D&fval0=Athens&x=company_type&y=records
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This horizontal bar chart displays companies by company type using the aggregation count in Athens. The data is about companies.

  7. bar chart

    • dune.com
    Updated Oct 14, 2025
    Cite
    niveditha_sivan (2025). bar chart [Dataset]. https://dune.com/discover/content/trending?q=author%3Aniveditha_sivan&resource-type=queries
    Dataset updated
    Oct 14, 2025
    Authors
    niveditha_sivan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: bar chart

  8. Data_Sheet_1_Graph schema and best graph type to compare discrete groups:...

    • frontiersin.figshare.com
    docx
    Updated Jun 4, 2023
    Cite
    Fang Zhao; Robert Gaschler (2023). Data_Sheet_1_Graph schema and best graph type to compare discrete groups: Bar, line, and pie.docx [Dataset]. http://doi.org/10.3389/fpsyg.2022.991420.s001
    Explore at:
    docx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Frontiers
    Authors
    Fang Zhao; Robert Gaschler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Different graph types may differ in their suitability to support group comparisons, due to the underlying graph schemas. This study examined whether graph schemas are based on perceptual features (i.e., each graph type, e.g., bar or line graph, has its own graph schema) or common invariant structures (i.e., graph types share common schemas). Furthermore, it was of interest which graph type (bar, line, or pie) is optimal for comparing discrete groups. A switching paradigm was used in three experiments. Two graph types were examined at a time (Experiment 1: bar vs. line, Experiment 2: bar vs. pie, Experiment 3: line vs. pie). On each trial, participants received a data graph presenting the data from three groups and were to determine the numerical difference of group A and group B displayed in the graph. We scrutinized whether switching the type of graph from one trial to the next prolonged RTs. The slowing of RTs in switch trials in comparison to trials with only one graph type can indicate to what extent the graph schemas differ. As switch costs were observed in all pairings of graph types, none of the different pairs of graph types tested seems to fully share a common schema. Interestingly, there was tentative evidence for differences in switch costs among different pairings of graph types. Smaller switch costs in Experiment 1 suggested that the graph schemas of bar and line graphs overlap more strongly than those of bar graphs and pie graphs or line graphs and pie graphs. This implies that results were not in line with completely distinct schemas for different graph types either. Taken together, the pattern of results is consistent with a hierarchical view according to which a graph schema consists of parts shared for different graphs and parts that are specific for each graph type. Apart from investigating graph schemas, the study provided evidence for performance differences among graph types. 
We found that bar graphs yielded the fastest group comparisons compared to line graphs and pie graphs, suggesting that they are the most suitable when used to compare discrete groups.

  9. Global BAR Graph Array Market Research Report: By Application (Consumer...

    • wiseguyreports.com
    Updated Sep 15, 2025
    Cite
    (2025). Global BAR Graph Array Market Research Report: By Application (Consumer Electronics, Automotive, Industrial Automation, Telecommunications), By Product Type (Single-Channel BAR Graph Array, Multi-Channel BAR Graph Array, High-Resolution BAR Graph Array), By Technology (Analog Technology, Digital Technology, Mixed-Technology), By End Use (Commercial, Residential, Industrial) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/bar-graph-array-market
    Dataset updated
    Sep 15, 2025
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Sep 25, 2025
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2023
    REGIONS COVERED: North America, Europe, APAC, South America, MEA
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2024: 2113.7 (USD Million)
    MARKET SIZE 2025: 2263.7 (USD Million)
    MARKET SIZE 2035: 4500.0 (USD Million)
    SEGMENTS COVERED: Application, Product Type, Technology, End Use, Regional
    COUNTRIES COVERED: US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
    KEY MARKET DYNAMICS: Growing consumer electronics demand, Increasing adoption in automotive applications, Advancements in display technology, Rising need for visual data representation, Expansion of IoT and smart devices
    MARKET FORECAST UNITS: USD Million
    KEY COMPANIES PROFILED: Microchip Technology, Analog Devices, Cirrus Logic, ON Semiconductor, Texas Instruments, Infineon Technologies, Qualcomm, NXP Semiconductors, STMicroelectronics, Maxim Integrated, Rohm Semiconductor, Broadcom
    MARKET FORECAST PERIOD: 2025 - 2035
    KEY MARKET OPPORTUNITIES: Growing demand in IoT devices, Expansion in consumer electronics, Increasing automation in industries, Advancements in display technology, Rising interest in smart home solutions
    COMPOUND ANNUAL GROWTH RATE (CAGR): 7.1% (2025 - 2035)
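
The growth rate quoted for this report can be checked against its own market-size figures. This is a minimal sketch, assuming the standard compound-annual-growth-rate formula and the 2025 and 2035 values as endpoints; the report does not state how its figure was derived, and the helper name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1

# Market sizes in USD million, taken from the listing for this report.
rate = cagr(2263.7, 4500.0, 2035 - 2025)
print(f"{rate:.1%}")  # 7.1%
```

Under these assumptions the listed 2025 and 2035 values are consistent with the stated 7.1% rate.
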
  10. Bar Chart IL

    • dune.com
    Updated Jul 24, 2025
    Cite
    ck1005 (2025). Bar Chart IL [Dataset]. https://dune.com/discover/content/trending?q=author%3Ack1005&resource-type=queries
    Dataset updated
    Jul 24, 2025
    Authors
    ck1005
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: Bar Chart IL

  11. Department of State Initial Business Filings, Stacked Bar Chart: Beginning...

    • data.wu.ac.at
    csv, json, xml
    Updated Sep 29, 2015
    Cite
    Department of State (2015). Department of State Initial Business Filings, Stacked Bar Chart: Beginning 1991 [Dataset]. https://data.wu.ac.at/schema/data_ny_gov/czk2ei02OWh2
    Explore at:
    xml, csv, json
    Dataset updated
    Sep 29, 2015
    Dataset provided by
    Department of State
    Description

    The dataset includes demographic information setting forth the number of filings made by business entities with the Department of State’s Division of Corporations. Such filings are categorized by type and filer.

  12. Top employee types by companies in Dearborn

    • workwithdata.com
    Updated May 6, 2025
    Cite
    Work With Data (2025). Top employee types by companies in Dearborn [Dataset]. https://www.workwithdata.com/charts/companies?agg=count&chart=hbar&f=1&fcol0=city&fop0=%3D&fval0=Dearborn&x=employee_type&y=records
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Dearborn
    Description

    This horizontal bar chart displays companies by employee type using the aggregation count in Dearborn. The data is about companies.

  13. Bar Graph Displays Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 28, 2025
    Cite
    Data Insights Market (2025). Bar Graph Displays Report [Dataset]. https://www.datainsightsmarket.com/reports/bar-graph-displays-169232
    Explore at:
    pdf, doc, ppt
    Dataset updated
    Jan 28, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global bar graph displays market is anticipated to experience remarkable growth in the coming years, driven by increasing demand from various end-user industries. The market size was valued at USD XXX million in 2025 and is projected to reach USD XX million by 2033, exhibiting a CAGR of XX% from 2025 to 2033. This growth can be attributed to factors such as technological advancements, rising demand for visual data representation, and increasing adoption in sectors like electronics, medical, and aerospace. Among the key segments, the LED and LCD display types are expected to witness significant growth, owing to their superior brightness, clarity, and energy efficiency. The major regions driving the market include North America, Europe, and Asia Pacific. North America holds a dominant market share, with the United States being a notable contributor. The Asia Pacific region is projected to grow at a higher rate during the forecast period, driven by the rapidly expanding electronics and semiconductor industries in countries like China, India, and Japan. Key players in the bar graph displays market include akYtec, Everlight Electronics, Kingbright, Sifam Tinsley, and Texmate, among others. These companies are focusing on innovation, strategic partnerships, and geographical expansion to enhance their market presence.

  14. Super Market dataset

    • kaggle.com
    zip
    Updated Nov 4, 2025
    Cite
    Chiamaka Ndubuisi (2025). Super Market dataset [Dataset]. https://www.kaggle.com/datasets/chiamakandubuisi/super-market-dataset
    Explore at:
    zip (215497 bytes)
    Dataset updated
    Nov 4, 2025
    Authors
    Chiamaka Ndubuisi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Problem Statements for Data Visualization – Supermarket Sales Dataset

    1. Sales Performance Across Branches
       Management wants to understand how sales performance varies across supermarket branches in Lagos, Abuja, Ogun, and Port Harcourt to identify the best-performing locations and areas that need improvement.
       Suggested Visualizations:
       • Bar chart comparing total sales and profit by branch
       • Map chart showing sales by city
       • KPI cards: Total Sales, Profit, and Average Transaction Value per branch

    2. Customer Purchase Behavior
       The marketing team needs insights into how different customer types (Member vs Normal) and genders influence purchase trends and average spending.
       Suggested Visualizations:
       • Pie chart for customer type distribution
       • Bar chart for average spend by gender
       • Segmented comparison of total sales by customer type

    3. Product Line Performance
       The business wants to know which product categories drive the highest revenue, quantity sold, and customer satisfaction to optimize stock levels and marketing focus.
       Suggested Visualizations:
       • Bar chart showing total sales by product line
       • Column chart comparing average rating per product line
       • Profit margin chart by product line

    4. Sales Trends Over Time
       The management team wants to monitor sales trends over time to identify peak periods, track seasonal variations, and plan future promotions accordingly.
       Suggested Visualizations:
       • Line chart showing monthly or weekly sales trend
       • Seasonal decomposition (sales by month)
       • Trendline showing revenue growth

    5. Payment Method Analysis
       The finance department needs to evaluate payment method usage (Cash, E-wallet, Credit Card) across cities to improve payment convenience and reduce transaction delays.
       Suggested Visualizations:
       • Donut or bar chart showing share of payment methods
       • City-level breakdown of preferred payment type
       • Correlation between payment method and average transaction value

    6. Customer Satisfaction Insights
       The customer experience team wants to explore how customer ratings relate to sales amount, product type, and branch performance to identify drivers of customer satisfaction.
       Suggested Visualizations:
       • Scatter plot of rating vs total purchase amount
       • Heat map of average rating by branch and product line
       • KPI card showing average customer rating
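
Most of the bar charts suggested in this description rest on the same step: summing a numeric column per category. The sketch below shows that aggregation with the standard library only. The rows, the column names `branch` and `total`, and the helper `total_by` are hypothetical placeholders; the real file's columns and values may differ.

```python
from collections import defaultdict

# Hypothetical sample rows, for illustration only (not taken from the dataset).
rows = [
    {"branch": "Lagos", "total": 120.0},
    {"branch": "Abuja", "total": 80.5},
    {"branch": "Lagos", "total": 40.0},
    {"branch": "Ogun", "total": 55.25},
]

def total_by(rows, key, value):
    """Sum the `value` column for each distinct entry in the `key` column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)

sales = total_by(rows, "branch", "total")

# Largest first, which is the usual ordering for a comparison bar chart.
for branch, amount in sorted(sales.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{branch:<10} {amount:>8.2f}")
```

The resulting category-to-total mapping is what a plotting library would take as the bar labels and bar heights.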

  15. Current Supply APY — bar chart

    • dune.com
    Updated Aug 27, 2025
    Cite
    sammy123 (2025). Current Supply APY — bar chart [Dataset]. https://dune.com/discover/content/relevant?q=author:sammy123&resource-type=queries
    Dataset updated
    Aug 27, 2025
    Authors
    sammy123
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: Current Supply APY — bar chart

  16. Fee Bar Chart

    • dune.com
    Updated Jul 24, 2025
    Cite
    ck1005 (2025). Fee Bar Chart [Dataset]. https://dune.com/discover/content/trending?q=author%3Ack1005&resource-type=queries
    Dataset updated
    Jul 24, 2025
    Authors
    ck1005
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: Fee Bar Chart

  17. DPSCS Department-Wide Totals: FIRM Assaults on Inmates: Total FIRM Assaults...

    • data.wu.ac.at
    csv, json, xml
    Updated Jul 13, 2017
    Cite
    DPSCS (2017). DPSCS Department-Wide Totals: FIRM Assaults on Inmates: Total FIRM Assaults by Type, Stacked Bar Chart [Dataset]. https://data.wu.ac.at/odso/data_maryland_gov/dXFhNS10NHZh
    Explore at:
    xml, json, csv
    Dataset updated
    Jul 13, 2017
    Dataset provided by
    DPSCS
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    The Department of Public Safety and Correctional Services (DPSCS) submits these data to the Governor's Office each month for each of Maryland's prisons and jails. This dataset shows totals across those facilities: population totals, contraband seizures, searches, assaults, hearing officer reports, disciplinary action, identification document issuance, and IWIF statistics. Statistical analyses and data formatting are performed by Department of Information Technology (DoIT).

  18. Top revenue types by company's employees where industry equals Broadline...

    • workwithdata.com
    Updated May 6, 2025
    Cite
    Work With Data (2025). Top revenue types by company's employees where industry equals Broadline Retail [Dataset]. https://www.workwithdata.com/charts/companies?agg=sum&chart=hbar&f=1&fcol0=industry&fop0=%3D&fval0=Broadline+Retail&x=revenue_type&y=employees
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This horizontal bar chart displays employees (people) by revenue type using the aggregation sum. The data is filtered where the industry is Broadline Retail. The data is about companies.

  19. 2. Bar chart — Unique minters per day:

    • dune.com
    Updated May 12, 2025
    Cite
    tokensaler (2025). 2. Bar chart — Unique minters per day: [Dataset]. https://dune.com/discover/content/relevant?q=author:tokensaler&resource-type=queries
    Dataset updated
    May 12, 2025
    Authors
    tokensaler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: 2. Bar chart — Unique minters per day:

  20. Distribution of employees per revenue type in Richland

    • workwithdata.com
    Updated May 6, 2025
    Cite
    Work With Data (2025). Distribution of employees per revenue type in Richland [Dataset]. https://www.workwithdata.com/charts/companies?agg=sum&chart=bar&f=1&fcol0=city&fop0=%3D&fval0=Richland&x=revenue_type&y=employees
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This bar chart displays employees (people) by revenue type using the aggregation sum in Richland. The data is about companies.

CBCD:A Chinese Bar Chart Dataset for Data Extraction

315 scholarly articles cite this dataset (View in Google Scholar)