100+ datasets found
  1. Data from: Statistical Graphs in Costa Rica Textbooks for Primary Education

    • scielo.figshare.com
    jpeg
    Updated Jun 3, 2023
    Cite
    Maynor Jiménez-Castro; Pedro Arteaga; Carmen Batanero (2023). Statistical Graphs in Costa Rica Textbooks for Primary Education [Dataset]. http://doi.org/10.6084/m9.figshare.12171666.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    SciELO journals
    Authors
    Maynor Jiménez-Castro; Pedro Arteaga; Carmen Batanero
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Costa Rica
    Description

    Abstract: The aim of this work was to analyze the statistical graphs included in the two most frequently used series of textbooks in Costa Rican basic education. We analyze the type of graph, its semiotic complexity, and the data context, as well as the type of task, the reading level required to complete the task, and the purpose of the graph within the task. We observed a predominance of bar graphs, the third level of semiotic complexity (representing a distribution), the second reading level (reading between the data), work and school contexts, reading and computation tasks, and an analysis purpose. We describe the differences across grades and between the two publishers, as well as differences from and similarities to the results of other textbook studies carried out in Spain and Chile.

  2. Data from: United States Geological Survey Digital Cartographic Data...

    • icpsr.umich.edu
    • datasearch.gesis.org
    ascii
    Updated Jan 18, 2006
    Cite
    United States Department of the Interior. United States Geological Survey (2006). United States Geological Survey Digital Cartographic Data Standards: Digital Line Graphs from 1:2,000,000-Scale Maps [Dataset]. http://doi.org/10.3886/ICPSR08379.v1
    Explore at:
    Available download formats: ascii
    Dataset updated
    Jan 18, 2006
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States Department of the Interior. United States Geological Survey
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/8379/terms

    Area covered
    New York, Rhode Island, New Hampshire, Connecticut, Vermont, Maine, United States
    Description

    This dataset consists of cartographic data in digital line graph (DLG) form for the northeastern states (Connecticut, Maine, Massachusetts, New Hampshire, New York, Rhode Island and Vermont). Information is presented on two planimetric base categories, political boundaries and administrative boundaries, each available in two formats: the topologically structured format and a simpler format optimized for graphic display. These DLG data can be used to plot base maps and for various kinds of spatial analysis. They may also be combined with other geographically referenced data, for example the Geographic Names Information System, to facilitate analysis.

  3. Data Visualization Cheat sheets and Resources

    • kaggle.com
    zip
    Updated May 31, 2022
    Cite
    Kash (2022). Data Visualization Cheat sheets and Resources [Dataset]. https://www.kaggle.com/kaushiksuresh147/data-visualization-cheat-cheats-and-resources
    Explore at:
    Available download formats: zip (133638507 bytes)
    Dataset updated
    May 31, 2022
    Authors
    Kash
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Data Visualization Corpus


    Data Visualization

    Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.

    In the world of Big Data, data visualization tools and technologies are essential for analyzing massive amounts of information and making data-driven decisions.

    The Data Visualization Corpus

    The Data Visualization corpus consists of:

    • 32 cheat sheets: These cover visualization techniques and tricks from A to Z, Python and R visualization cheat sheets, types of charts and their significance, storytelling with data, etc.

    • 32 charts: The corpus also includes information on a range of data visualization charts, along with their Python code, d3.js code, and presentations that explain each chart clearly.

    • Some recommended books on data visualization that every data scientist should read:

      1. Beautiful Visualization by Julie Steele and Noah Iliinsky
      2. Information Dashboard Design by Stephen Few
      3. Knowledge is beautiful by David McCandless (Short abstract)
      4. The Functional Art: An Introduction to Information Graphics and Visualization by Alberto Cairo
      5. The Visual Display of Quantitative Information by Edward R. Tufte
      6. storytelling with data: a data visualization guide for business professionals by Cole Nussbaumer Knaflic
      7. Research paper - Cheat Sheets for Data Visualization Techniques by Zezhong Wang, Lovisa Sundin, Dave Murray-Rust, Benjamin Bach

    Suggestions:

    If you find any books, cheat sheets, or charts missing, or would like to suggest new documents, please let me know in the discussion section.

    Resources:

    Request to kaggle users:

    • Kaggle users are kindly requested to create notebooks on the visualization charts that interest them, using a dataset of their own choice, since many beginners and experts could find them useful.

    • Create interactive EDA that combines animation with data visualization charts to show how to approach data and extract insights from it.

    Suggestion and queries:

    Feel free to use this dataset's discussion platform to ask questions about the data visualization corpus and data visualization techniques.

    Kindly upvote the dataset if you find it useful or if you wish to appreciate the effort taken to gather this corpus! Thank you and have a great day!

  4. Data from: Aspects of University Students' Graph Sense in a Virtual Learning...

    • scielo.figshare.com
    jpeg
    Updated Jun 3, 2023
    Cite
    Fabiana Chagas de Andrade; Carolina Vieira Schiller; Dione Aparecido Ferreira da Silva; Larissa Pereira Menezes; Alexandre Sousa da Silva (2023). Aspects of University Students' Graph Sense in a Virtual Learning Environment [Dataset]. http://doi.org/10.6084/m9.figshare.14304727.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    SciELO journals
    Authors
    Fabiana Chagas de Andrade; Carolina Vieira Schiller; Dione Aparecido Ferreira da Silva; Larissa Pereira Menezes; Alexandre Sousa da Silva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: To break with the traditional model of Basic Statistics classes in Higher Education, we drew on Statistical Literacy and Critical Education to develop an activity about graphic interpretation, which took place in a Virtual Learning Environment (VLE) as a complement to classroom meetings. Twenty-three engineering students from a public higher education institution in Rio de Janeiro took part in the research. Our objective was to analyze elements of graphic comprehension in an activity that consisted of identifying incorrect statistical graphs conveyed by the media, followed by argumentation and interaction among students about these errors. The main results showed that elements of Graph Sense were present in the discussions and were the target of the students' critical analysis. The VLE facilitated communication and fostered student participation and writing, so digital technologies and activities built on collaboration and interaction are important for statistical development, although such development is a gradual process.

  5. Classes Knowledge Graph

    • kaggle.com
    zip
    Updated Aug 31, 2024
    Cite
    Afroz (2024). Classes Knowledge Graph [Dataset]. https://www.kaggle.com/datasets/pythonafroz/dbpedia-classes-knowledge-graph
    Explore at:
    Available download formats: zip (174050111 bytes)
    Dataset updated
    Aug 31, 2024
    Authors
    Afroz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    DBPedia Classes

    DBpedia is a knowledge graph extracted from Wikipedia, providing structured data about real-world entities and their relationships. DBpedia Classes are the core building blocks of this knowledge graph, representing different categories or types of entities.

    Key Concepts:

    • Entity: A real-world object, such as a person, place, thing, or concept.
    • Class: A group of entities that share common properties or characteristics.
    • Instance: A specific member of a class.

    Examples of DBPedia Classes:

    • Person: Represents individuals, e.g., "Barack Obama," "Albert Einstein."
    • Place: Represents locations, e.g., "Paris," "Mount Everest."
    • Organization: Represents groups, institutions, or companies, e.g., "Google," "United Nations."
    • Event: Represents occurrences, e.g., "World Cup," "French Revolution."
    • Artwork: Represents creative works, e.g., "Mona Lisa," "Star Wars."

    Hierarchy and Relationships:

    DBpedia classes often have a hierarchical structure, where subclasses inherit properties from their parent classes. For example, the class "Person" might have subclasses like "Politician," "Scientist," and "Artist."

    Relationships between classes are also important. For instance, a "Person" might have a "birthPlace" relationship with a "Place," or an "Artist" might have a "hasArtwork" relationship with an "Artwork."
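
    The hierarchy and relationships described above can be sketched with a toy model. This is only an illustration: the class names and the "birthPlace" and "hasArtwork" properties come from the examples in the text, while the dictionaries and the inherited_properties helper are hypothetical and do not reflect the actual DBpedia ontology files or any DBpedia API.

```python
# Toy class hierarchy: each class maps to its parent (None for a root).
# Assumption: single inheritance, which is a simplification for illustration.
PARENT = {
    "Person": None,
    "Politician": "Person",
    "Scientist": "Person",
    "Artist": "Person",
}

# Properties declared directly on a class (names taken from the text's examples).
PROPERTIES = {
    "Person": {"birthPlace"},
    "Artist": {"hasArtwork"},
}


def inherited_properties(cls):
    """Collect properties declared on cls and on all of its ancestors."""
    props = set()
    while cls is not None:
        props |= PROPERTIES.get(cls, set())
        cls = PARENT.get(cls)
    return props


print(sorted(inherited_properties("Artist")))
```

    In this sketch an "Artist" ends up with both its own property and the one declared on "Person", which is the inheritance behaviour the description refers to.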

    Applications of DBPedia Classes:

    Semantic Search: DBPedia classes can be used to enhance search results by understanding the context and meaning of queries.

    Knowledge Graph Construction: DBPedia classes form the foundation of knowledge graphs, which can be used for various applications like question answering, recommendation systems, and data integration.

    Data Analysis: DBPedia classes can be used to analyze and extract insights from large datasets.

  6. NLP feature set variables for TwiBot-20.

    • plos.figshare.com
    xls
    Updated Dec 23, 2024
    Cite
    Agata Skorupka (2024). NLP feature set variables for TwiBot-20. [Dataset]. http://doi.org/10.1371/journal.pone.0315849.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 23, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Agata Skorupka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The study examines different graph-based methods of detecting anomalous activities on digital markets, proposing the most efficient way to increase market actors’ protection and reduce information asymmetry. Anomalies are defined here as both bots and fraudulent users (who can be either bots or real people). The methods are compared against each other and against state-of-the-art results from the literature, and a new algorithm is proposed. The goal is to find an efficient method suitable for threat detection, in terms of both predictive performance and computational efficiency. It should scale well and remain robust to advances in the newest technologies. The article uses three publicly accessible graph-based datasets: one describing the Twitter social network (TwiBot-20) and two describing Bitcoin cryptocurrency markets (Bitcoin OTC and Bitcoin Alpha). In the former, an anomaly is defined as a bot, as opposed to a human user, whereas in the latter, an anomaly is a user who conducted a fraudulent transaction, which may (but does not have to) imply being a bot. The study finds that graph-based data is a better-performing predictor than text data. It compares different graph algorithms for extracting feature sets for anomaly detection models and states that methods based on nodes’ statistics result in better model performance than state-of-the-art graph embeddings. They also yield a significant improvement in computational efficiency, which often means reducing the time by hours or enabling modeling on significantly larger graphs (usually not feasible in the case of embeddings). On that basis, the article proposes its own graph-based statistics algorithm. Furthermore, using embeddings requires two engineering choices: the type of embedding and its dimension. The research examines whether there are types of graph embeddings and dimensions that perform significantly better than others.
    The answer turned out to be dataset-specific and needed to be tailored on a case-by-case basis, adding even more engineering overhead to using embeddings (building a leaderboard over a grid of embedding instances, each of which takes hours to generate). This, again, speaks in favor of the proposed algorithm based on nodes’ statistics, which makes this engineering overhead redundant.

  7. Further education and skills - Underlying Charts Data

    • explore-education-statistics.service.gov.uk
    Updated Nov 28, 2024
    + more versions
    Cite
    Department for Education (2024). Further education and skills - Underlying Charts Data [Dataset]. https://explore-education-statistics.service.gov.uk/data-catalogue/data-set/c0579bf7-96fd-4771-9034-e8642b529114
    Explore at:
    Dataset updated
    Nov 28, 2024
    Dataset authored and provided by
    Department for Education (https://gov.uk/dfe)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Historical time series of headline adult (19+) further education and skills learner participation, containing breakdowns by provision type and in some cases level. Also includes some all-age apprenticeship participation figures.
    Academic years: 2005/06 to 2023/24 full academic years
    Indicators: Participation
    Filter: Provision type, Age group, Level

  8. Data from: Transformation of Type Graphs with Inheritance for Ensuring...

    • resodate.org
    Updated Jun 17, 2020
    Cite
    Frank Hermann; Hartmut Ehrig; Claudia Ermel (2020). Transformation of Type Graphs with Inheritance for Ensuring Security in E-Government Networks (Long Version) [Dataset]. http://doi.org/10.14279/depositonce-10280
    Explore at:
    Dataset updated
    Jun 17, 2020
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Frank Hermann; Hartmut Ehrig; Claudia Ermel
    Description

    E-government services usually process large amounts of confidential data. Therefore, security requirements for the communication between components have to be adhered to strictly. Hence, it is of primary interest that developers can analyze their modularized models of actual systems and detect critical patterns. For this purpose, we present a general and formal framework for critical pattern detection and user-driven correction, as well as possibilities for automatic analysis and verification at meta-model level. The technique is based on the formal theory of graph transformation, which we extend to transformations of type graphs with inheritance within a type graph hierarchy. We apply the framework to specify relevant security requirements. The extended theory is shown to fulfil the conditions of a weak adhesive HLR category, allowing us to transfer analysis techniques and results shown for this abstract framework of graph transformation. In particular, we discuss how confluence analysis and parallelization can be used to enable parallel critical pattern detection and elimination.

  9. CBCD:A Chinese Bar Chart Dataset for Data Extraction

    • scidb.cn
    Updated Nov 14, 2025
    Cite
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan (2025). CBCD:A Chinese Bar Chart Dataset for Data Extraction [Dataset]. http://doi.org/10.57760/sciencedb.j00240.00052
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 14, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Ma Qiuping; Zhang Qi; Bi Hangshuo; Zhao Xiaofan
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Currently, most existing chart datasets are in English, and there are almost no open-source Chinese chart datasets, which limits research and applications involving Chinese charts. This dataset draws on the construction method of the DVQA dataset to create a chart dataset focused on the Chinese-language setting. To ensure the authenticity and practicality of the dataset, we first referred to the authoritative website of the National Bureau of Statistics and selected 24 data label categories widely used in practice, totaling 262 specific labels. These label categories cover multiple important areas such as socio-economic, demographic, and industrial development. In addition, to further enhance the diversity and practicality of the dataset, we set 10 different numerical dimensions. These dimensions provide a rich range of values and include multiple value types, which can simulate the data distributions and changes encountered in real application scenarios. The dataset includes several types of Chinese bar charts to cover situations that arise in practice: conventional vertical and horizontal bar charts, as well as more challenging stacked bar charts that test the performance of extraction methods on charts of different complexity. Each chart type also carries diverse attribute labels, including but not limited to whether data labels are present and whether the text is rotated by 45°, 90°, etc. These details bring the dataset closer to real-world application scenarios while placing higher demands on data extraction methods.
    In addition to the charts themselves, the dataset provides the corresponding data table and title text for each chart, which is crucial for understanding the content of the chart and verifying the accuracy of the extracted results. The chart images were generated with Matplotlib, a widely used data visualization library for the Python programming language, chosen for its rich features, flexible configuration options, and compatibility. With Matplotlib, every detail of a chart can be controlled, from the drawing of data points to the annotation of coordinate axes, and from the addition of legends to the setting of titles, so that the generated chart images meet the research needs and remain readable. The dataset consists of 58712 pairs of Chinese bar charts and corresponding data tables, divided into training, validation, and testing sets in a 7:2:1 ratio.
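
    As a rough illustration of a 7:2:1 split over 58712 items, the sketch below shuffles item indices and cuts them into three parts. It is not the authors' procedure: the split_indices helper, the fixed seed, and the rounding rule (floor for training and validation, remainder to testing) are all assumptions, so the resulting subset sizes need not match the published ones.

```python
import random


def split_indices(n_items, ratios=(7, 2, 1), seed=0):
    """Shuffle indices 0..n_items-1 and split them by the given integer ratios.

    Assumption: train and validation sizes are rounded down, and the
    test set receives whatever remains.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(58712)
print(len(train), len(val), len(test))
```

    Under these rounding assumptions the three parts always sum to the full 58712 pairs and do not overlap.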

  10. Data from: Grammar transformations of topographic feature type annotations...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 29, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Grammar transformations of topographic feature type annotations of the U.S. to structured graph data. [Dataset]. https://catalog.data.gov/dataset/grammar-transformations-of-topographic-feature-type-annotations-of-the-u-s-to-structured-g
    Explore at:
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. Objectives of our study were to analyze the semantic structure of input definitions, use this information to build triple structures of RDF graph data, upload our lexicon to a knowledge graph software, and perform SPARQL queries on the data. Upon completion of this study, SPARQL queries were proven to effectively convey graph triples which displayed semantic significance. These data represent and characterize the lexicon of our input text which are used to form graph triples. These data were collected in 2024 by passing text through multiple Python programs utilizing spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before data was processed by the Python programs, input definitions were first rewritten as natural language and formatted as tabular data. Passages were then tokenized and characterized by their part-of-speech, tag, dependency relation, dependency head, and lemma. Each word within the lexicon was tokenized. A stop-words list was utilized only to remove punctuation and symbols from the text, excluding hyphenated words (ex. bowl-shaped) which remained as such. The tokens’ lemmas were then aggregated and totaled to find their recurrences within the lexicon. This procedure was repeated for tokenizing noun chunks using the same glossary definitions.
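
    The lemma aggregation step described above can be sketched as follows. This is a simplified stand-in, not the study's code: the token records are hypothetical hand-written examples rather than spaCy output, and punctuation and symbols are filtered by part-of-speech tag here instead of by the stop-words list the description mentions.

```python
from collections import Counter

# Hypothetical token records; in the study these fields would come from
# an NLP pipeline rather than being written by hand.
tokens = [
    {"text": "basins", "lemma": "basin", "pos": "NOUN"},
    {"text": "bowl-shaped", "lemma": "bowl-shaped", "pos": "ADJ"},
    {"text": ",", "lemma": ",", "pos": "PUNCT"},
    {"text": "basin", "lemma": "basin", "pos": "NOUN"},
]

# Assumption: punctuation and symbols are identified by these POS labels.
EXCLUDED_POS = {"PUNCT", "SYM"}


def lemma_counts(records):
    """Count how often each lemma occurs, skipping punctuation and symbols."""
    return Counter(r["lemma"] for r in records if r["pos"] not in EXCLUDED_POS)


counts = lemma_counts(tokens)
print(counts.most_common())
```

    Hyphenated words such as "bowl-shaped" stay intact as a single lemma in this sketch, in line with the description.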

  11. Data from: OpenAIRE Graph Beginner's Kit Dataset

    • zenodo.org
    • pub.uni-bielefeld.de
    tar
    Updated Aug 20, 2023
    + more versions
    Cite
    Miriam Baglioni; Claudio Atzori; Alessia Bardi; Gianbattista Bloisi; Sandro La Bruzzo; Paolo Manghi; Harry Dimitropoulos; Andrea Mannocci; Ioannis Foufoulas; Marek Horst; Michele De Bonis; Michele Artini; Thanasis Vergoulis; Serafeim Chatzopoulos; Dimitris Pierrakos; Antonis Lempesis; Andreas Czerniak; Jochen Schirrwagen; Alexandros Ioannidis; Katerina Iatropoulou; Argiro Kokogiannaki (2023). OpenAIRE Graph Beginner's Kit Dataset [Dataset]. http://doi.org/10.5281/zenodo.8223812
    Explore at:
    Available download formats: tar
    Dataset updated
    Aug 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Miriam Baglioni; Claudio Atzori; Alessia Bardi; Gianbattista Bloisi; Sandro La Bruzzo; Paolo Manghi; Harry Dimitropoulos; Andrea Mannocci; Ioannis Foufoulas; Marek Horst; Michele De Bonis; Michele Artini; Thanasis Vergoulis; Serafeim Chatzopoulos; Dimitris Pierrakos; Antonis Lempesis; Andreas Czerniak; Jochen Schirrwagen; Alexandros Ioannidis; Katerina Iatropoulou; Argiro Kokogiannaki
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The OpenAIRE Graph is an Open Access dataset containing metadata about research products (literature, datasets, software, etc.) linked to other entities of the research ecosystem like organisations, project grants, and data sources.

    The large size of the OpenAIRE Graph is a major impediment for beginners who want to familiarise themselves with the underlying data model and explore its contents. Working with the Graph in its full size typically requires access to a huge distributed computing infrastructure, which is not easily accessible to everyone.

    The OpenAIRE Beginner’s Kit aims to address this issue. It consists of two components:

    • A subset of the OpenAIRE Graph composed of the research products published between 2022-12-28 and 2023-07-31, all the entities connected to them and the respective relationships. The subset is composed of the following parts:

      • publication.tar: metadata records about research literature (includes types of publications listed here)

      • dataset.tar: metadata records about research data (includes the subtypes listed here)

      • software.tar: metadata records about research software (includes the subtypes listed here)

      • otherresearchproduct.tar: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed here)

      • organization.tar: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.

      • datasource.tar: metadata records about data sources whose content is available in the OpenAIRE Graph. They include institutional and thematic repositories, journals, aggregators, funders' databases.

      • project.tar: metadata records about project grants.

      • relation.tar: metadata records about relations between entities in the graph.

      • communities_infrastructures.tar: metadata records about research communities and research infrastructures

        Each file is a tar archive containing gz files, each with one JSON record per line. Each record complies with the schema available at http://doi.org/10.5281/zenodo.8238874.

    • The code to analyse the data. It is available on GitHub. Just download the archive, unzip/untar it and follow the instructions in the README file (no need to clone the GitHub repository).
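
    The file layout described above (a tar archive of gz files, one JSON record per line) can be read with the Python standard library alone. The sketch below is not the kit's own analysis code: the iter_records helper, the member name, and the "id" field are hypothetical, and a tiny in-memory archive stands in for a real file such as publication.tar.

```python
import gzip
import io
import json
import tarfile


def iter_records(tar_fileobj):
    """Yield one dict per JSON line from every .gz member of a tar archive."""
    with tarfile.open(fileobj=tar_fileobj, mode="r:*") as archive:
        for member in archive:
            if not (member.isfile() and member.name.endswith(".gz")):
                continue
            raw = archive.extractfile(member)
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    if line.strip():
                        yield json.loads(line)


def build_demo_archive():
    """Build a tiny in-memory tar with one gz member (illustrative data only)."""
    payload = gzip.compress(b'{"id": "a"}\n{"id": "b"}\n')
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="publication/part-00000.json.gz")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


records = list(iter_records(build_demo_archive()))
print(records)
```

    For a downloaded archive, the same helper could be called with an open binary file handle, for example open("publication.tar", "rb"). Streaming the records one at a time avoids unpacking the whole archive to disk.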


  12. Stack Exchange Graphs (SNAP)

    • kaggle.com
    zip
    Updated Dec 16, 2021
    Cite
    Subhajit Sahu (2021). Stack Exchange Graphs (SNAP) [Dataset]. https://www.kaggle.com/datasets/wolfram77/graphs-snap-sx
    Explore at:
    Available download formats: zip (1480133729 bytes)
    Dataset updated
    Dec 16, 2021
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ask Ubuntu temporal network

    https://snap.stanford.edu/data/sx-askubuntu.html

    Dataset information

    This is a temporal network of interactions on the stack exchange web site
    Ask Ubuntu (http://askubuntu.com/). There are three different types of
    interactions represented by a directed edge (u, v, t):

    • user u answered user v's question at time t (in the graph sx-askubuntu-a2q)
    • user u commented on user v's question at time t (in the graph sx-askubuntu-c2q)
    • user u commented on user v's answer at time t (in the graph sx-askubuntu-c2a)

    The graph sx-askubuntu contains the union of these graphs. These graphs
    were constructed from the Stack Exchange Data Dump. Node ID numbers
    correspond to the 'OwnerUserId' tag in that data dump.

    Dataset statistics (sx-askubuntu)
    Nodes 159,316
    Temporal Edges 964,437
    Edges in static graph 596,933
    Time span 2613 days

    Dataset statistics (sx-askubuntu-a2q)
    Nodes 137,517
    Temporal Edges 280,102
    Edges in static graph 262,106
    Time span 2613 days

    Dataset statistics (sx-askubuntu-c2q)
    Nodes 79,155
    Temporal Edges 327,513
    Edges in static graph 198,852
    Time span 2047 days

    Dataset statistics (sx-askubuntu-c2a)
    Nodes 75,555
    Temporal Edges 356,822
    Edges in static graph 178,210
    Time span 2418 days

    Source (citation)
    Ashwin Paranjape, Austin R. Benson, and Jure Leskovec. "Motifs in Temporal Networks." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017.

    Files
    File Description
    sx-askubuntu.txt.gz All interactions
    sx-askubuntu-a2q.txt.gz Answers to questions
    sx-askubuntu-c2q.txt.gz Comments to questions
    sx-askubuntu-c2a.txt.gz Comments to answers

    Data format

    SRC DST UNIXTS                             
    

    where edges are separated by a new line and

    SRC: id of the source node (a user)
    DST: id of the target node (a user)
    UNIXTS: Unix timestamp (seconds since the epoch)
                   ...
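
    A minimal sketch of reading the edge format above, assuming whitespace-separated integer fields, one edge per line. The TemporalEdge type, the parse_edges helper, and the sample lines are hypothetical. Treating "edges in static graph" as the number of distinct ordered (source, target) pairs is also an assumption.

```python
from collections import namedtuple

TemporalEdge = namedtuple("TemporalEdge", ["src", "dst", "unixts"])


def parse_edges(lines):
    """Parse 'SRC DST UNIXTS' lines into TemporalEdge tuples, skipping others."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        src, dst, unixts = map(int, parts)
        yield TemporalEdge(src, dst, unixts)


# Made-up sample; a real file could be read with
# gzip.open("sx-askubuntu.txt.gz", "rt") instead.
sample = """1 2 100
1 2 200
2 3 150
"""

edges = list(parse_edges(sample.splitlines()))
static_edges = {(e.src, e.dst) for e in edges}  # collapse repeated interactions
nodes = {n for e in edges for n in (e.src, e.dst)}
print(len(edges), len(static_edges), len(nodes))
```

    In the sample, two interactions between the same pair of users count as two temporal edges but only one static edge.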
    
  13. Data from: Nowhere dense classes of graphs

    • resodate.org
    Updated Feb 16, 2016
    Cite
    Sebastian Siebertz (2016). Nowhere dense classes of graphs [Dataset]. http://doi.org/10.14279/depositonce-5011
    Explore at:
    Dataset updated
    Feb 16, 2016
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Sebastian Siebertz
    Description

    We show that every first-order property of graphs can be decided in almost linear time on every nowhere dense class of graphs. For graph classes closed under taking subgraphs, our result is optimal (under a standard complexity theoretic assumption): it was known before that for all classes C of graphs closed under taking subgraphs, if deciding first-order properties of graphs in C is fixed-parameter tractable, parameterized by the length of the input formula, then C must be nowhere dense. Nowhere dense graph classes form a large variety of classes of sparse graphs including the class of planar graphs, actually all classes with excluded minors, and also bounded degree graphs and graph classes of bounded expansion. For our proof, we provide two new characterisations of nowhere dense classes of graphs. The first characterisation is in terms of a game, which explains the local structure of graphs from nowhere dense classes. The second characterisation is by the existence of sparse neighbourhood covers. On the logical side, we prove a rank-preserving version of Gaifman's locality theorem. The characterisation by neighbourhood covers is based on a characterisation of nowhere dense classes by generalised colouring numbers. We show several new bounds for the generalised colouring numbers on restricted graph classes, such as for proper minor closed classes and for planar graphs. Finally, we study the parameterized complexity of the first-order model-checking problem on structures where an ordering is available to be used in formulas. We show that first-order logic on ordered structures as well as on structures with a successor relation is essentially intractable on nearly all interesting classes. 
    On the other hand, we show that the model-checking problem of order-invariant monadic second-order logic is tractable essentially on the same classes as plain monadic second-order logic and that the model-checking problem for successor-invariant first-order logic is tractable on planar graphs.

  14. d

    Device Graph Data | 10+ Identity Types | 1500M+ Global Devices| CCPA...

    • datarade.ai
    Updated Aug 21, 2024
    + more versions
    Cite
    DRAKO (2024). Device Graph Data | 10+ Identity Types | 1500M+ Global Devices| CCPA Compliant [Dataset]. https://datarade.ai/data-products/drako-device-graph-data-usa-canada-comprehensive-insi-drako
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Aug 21, 2024
    Dataset authored and provided by
    DRAKO
    Area covered
    Philippines, Mozambique, Cyprus, Brazil, Bahamas, Eritrea, Tonga, Lao People's Democratic Republic, South Sudan, Aruba
    Description

    DRAKO is a leader in providing Device Graph Data, focusing on understanding the relationships between consumer devices and identities. Our data allows businesses to create holistic profiles of users, track engagement across platforms, and measure the effectiveness of advertising efforts.

    Device Graph Data is essential for accurate audience targeting, cross-device attribution, and understanding consumer journeys. By integrating data from multiple sources, we provide a unified view of user interactions, helping businesses make informed decisions.

    Key Features:
    - Comprehensive device mapping to understand user behaviour across multiple platforms
    - Detailed Identity Graph Data for cross-device identification and engagement tracking
    - Integration with Connected TV Data for enhanced insights into video consumption habits
    - Mobile Attribution Data to measure the effectiveness of mobile campaigns
    - Customizable analytics to segment audiences based on device usage and demographics
    - Some ID types offered: AAID, idfa, Unified ID 2.0, AFAI, MSAI, RIDA, AAID_CTV, IDFA_CTV

    Use Cases:
    - Cross-device marketing strategies
    - Attribution modelling and campaign performance measurement
    - Audience segmentation and targeting
    - Enhanced insights for Connected TV advertising
    - Comprehensive consumer journey mapping

    Data Compliance: All of our Device Graph Data is sourced responsibly and adheres to industry standards for data privacy and protection. We ensure that user identities are handled with care, providing insights without compromising individual privacy.

    Data Quality: DRAKO employs robust validation techniques to ensure the accuracy and reliability of our Device Graph Data. Our quality assurance processes include continuous monitoring and updates to maintain data integrity and relevance.

  15. H

    United States Cancer Statistics (USCS)

    • dataverse.harvard.edu
    Updated May 4, 2011
    Cite
    Harvard Dataverse (2011). United States Cancer Statistics (USCS) [Dataset]. http://doi.org/10.7910/DVN/JBJVUW
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 4, 2011
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    Users can download the data set and static graphs, tables and charts regarding cancers in the United States.

    Background: The United States Cancer Statistics is a web-based report created by the Centers for Disease Control and Prevention, in partnership with the National Cancer Institute (NCI) and the North American Association of Central Cancer Registries (NAACCR). The site contains cancer incidence and cancer mortality data. Specific information includes: the top ten cancers, state vs. national comparisons, selected cancers, childhood cancer, cancers grouped by state/region, cancers grouped by race/ethnicity and brain cancers by tumor type.

    User Functionality: Users can view static graphs, tables and charts, which can be downloaded. Within childhood cancer, users can view by year and by cancer type and age group or by cancer type and racial/ethnic group. Otherwise, users can view data by female, male or male and female. Users may also download the entire data sets directly.

    Data Notes: The data sources for the cancer incidence data are the CDC's National Program for Cancer Registries (NPCR) and NCI's Surveillance, Epidemiology and End Results (SEER). CDC's National Vital Statistics System (NVSS) collects the data on cancer mortality. Data is available for each year between 1999 and 2007 or for 2003-2007 combined. The site does not specify when new data becomes available.

  16. Global Graph Database Market Size By Type (Labeled Property Graph, Resource...

    • verifiedmarketresearch.com
    Updated Oct 6, 2025
    Cite
    VERIFIED MARKET RESEARCH (2025). Global Graph Database Market Size By Type (Labeled Property Graph, Resource Description Framework), By Application (Fraud Detection, Recommendation Engines), By Component (Software, Services), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/graph-database-market/
    Explore at:
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    Global
    Description

    Graph Database Market size was valued at USD 2.86 Billion in 2024 and is projected to reach USD 14.58 Billion by 2032, growing at a CAGR of 22.6% from 2026 to 2032.

    Global Graph Database Market Drivers

    The growth and development of the Graph Database Market is attributed to certain main market drivers. These factors have a big impact on how graph databases are demanded and adopted in different sectors. Several of the major market forces are as follows:

    - Growth of Connected Data: Graph databases are excellent at expressing and querying relationships as businesses work with datasets that are more complex and interconnected. Graph databases are becoming more and more in demand as connected data gains significance across multiple industries.
    - Knowledge Graph Emergence: In fields like artificial intelligence, machine learning, and data analytics, knowledge graphs, which arrange information in a graph structure, are becoming more and more popular. Knowledge graphs can only be created and queried via graph databases, which is what is causing their widespread use.
    - Analytics and Machine Learning Advancements: Graph databases handle relationships and patterns in data effectively, enabling applications related to advanced analytics and machine learning. Graph databases are becoming more and more in demand when combined with analytics and machine learning as businesses want to extract more insights from their data.
    - Real-Time Data Processing: Graph databases can process data in real-time, which makes them appropriate for applications that need quick answers and insights. In situations like fraud detection, recommendation systems, and network analysis, this is especially helpful.
    - Increasing Need for Security and Fraud Detection: Graph databases are useful for fraud security and detection applications because they can identify patterns and abnormalities in linked data. The growing need for graph databases in security solutions is a result of the ongoing evolution of cybersecurity threats.

  17. T

    Hungary - Distribution of population by household types: Single person

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Sep 15, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Hungary - Distribution of population by household types: Single person [Dataset]. https://tradingeconomics.com/hungary/distribution-of-population-by-household-types-single-person-eurostat-data.html
    Explore at:
    Available download formats: xml, json, excel, csv
    Dataset updated
    Sep 15, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Hungary
    Description

    Hungary - Distribution of population by household types: Single person was 13.80% in December of 2024, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Hungary - Distribution of population by household types: Single person - last updated from EUROSTAT in November of 2025. Historically, Hungary - Distribution of population by household types: Single person reached a record high of 14.50% in December of 2017 and a record low of 9.20% in December of 2010.

  18. f

    Statistics information of datasets.

    • figshare.com
    xls
    Updated Oct 23, 2025
    + more versions
    Cite
    Zhen Xie; Wenzhe Hou; Feiyang Wu; Hao Xu (2025). Statistics information of datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0334724.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Oct 23, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Zhen Xie; Wenzhe Hou; Feiyang Wu; Hao Xu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Graphs are a representative type of fundamental data structure. They are capable of representing complex association relationships in diverse domains. For large-scale graph processing, stream graphs have become efficient tools to process dynamically evolving graph data. When processing stream graphs, subgraph counting is a key technique, which faces significant computational challenges due to its #P-complete nature. This work introduces StreamSC, a novel framework that efficiently estimates subgraph counting results on stream graphs through two key innovations: (i) it is the first learning-based framework to address the subgraph counting problem focused on stream graphs; and (ii) it addresses the challenges arising from dynamic changes of the data graph caused by the insertion or deletion of edges. Experiments on 5 real-world graphs show the superiority of StreamSC in accuracy and efficiency.

  19. Z

    Wikipedia Knowledge Graph dataset

    • data-staging.niaid.nih.gov
    • produccioncientifica.ugr.es
    • +2more
    Updated Jul 17, 2024
    Cite
    Arroyo-Machado, Wenceslao; Torres-Salinas, Daniel; Costas, Rodrigo (2024). Wikipedia Knowledge Graph dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_6346899
    Explore at:
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    University of Granada
    Centre for Science and Technology Studies (CWTS)
    Authors
    Arroyo-Machado, Wenceslao; Torres-Salinas, Daniel; Costas, Rodrigo
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Wikipedia is the largest and most-read free online encyclopedia in existence. As such, Wikipedia offers a large amount of data on all its own contents and the interactions around them, as well as different types of open data sources. This makes Wikipedia a unique data source that can be analyzed with quantitative data science techniques. However, the enormous amount of data makes it difficult to have an overview, and sometimes many of the analytical possibilities that Wikipedia offers remain unknown. In order to reduce the complexity of identifying and collecting data on Wikipedia and to expand its analytical potential, after collecting different data from various sources and processing them, we have generated a dedicated Wikipedia Knowledge Graph aimed at facilitating the analysis and contextualization of the activity and relations of Wikipedia pages, in this case limited to its English edition. We share this Knowledge Graph dataset in an open way, aiming to be useful for a wide range of researchers, such as informetricians, sociologists or data scientists.

    There are a total of 9 files, all of them in tsv format, and they have been built under a relational structure. The main file, which acts as the core of the dataset, is the page file. Alongside it there are 4 files with different entities related to the Wikipedia pages (category, url, pub and page_property files) and 4 other files that act as "intermediate tables", making it possible to connect the pages both with those entities and with other pages (page_category, page_url, page_pub and page_link files).

    The document Dataset_summary includes a detailed description of the dataset.
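
    The relational layout described above (a core page file, entity files, and intermediate tables linking them) can be illustrated with a minimal sketch. The column names used here (page_id, category_id, title, name) and the sample rows are hypothetical and are not taken from the dataset; the Dataset_summary document is the place to check the real schema. The sketch joins pages to categories through a page_category-style intermediate table using only the Python standard library.

```python
import csv
import io

# Hypothetical miniature stand-ins for three of the nine TSV files.
# Column names and rows are assumptions for illustration only;
# consult Dataset_summary for the real schema.
PAGE_TSV = "page_id\ttitle\n1\tGraph theory\n2\tSweden\n"
CATEGORY_TSV = "category_id\tname\n10\tMathematics\n20\tCountries\n"
PAGE_CATEGORY_TSV = "page_id\tcategory_id\n1\t10\n2\t20\n"


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def categories_by_page(pages, categories, links):
    """Join pages to categories through the intermediate table."""
    titles = {row["page_id"]: row["title"] for row in pages}
    names = {row["category_id"]: row["name"] for row in categories}
    result = {}
    for link in links:
        title = titles.get(link["page_id"])
        name = names.get(link["category_id"])
        # Skip links that point at a page or category we did not load.
        if title is not None and name is not None:
            result.setdefault(title, []).append(name)
    return result


joined = categories_by_page(
    read_tsv(PAGE_TSV), read_tsv(CATEGORY_TSV), read_tsv(PAGE_CATEGORY_TSV)
)
print(joined)
```

    With the real files, the in-memory strings would be replaced by opening each tsv file, and the same pattern would presumably apply to the other intermediate tables (page_url, page_pub, page_link).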

    Thanks to Nees Jan van Eck and the Centre for Science and Technology Studies (CWTS) for the valuable comments and suggestions.

  20. T

    Sweden - Distribution of population by household types: Single person

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Sep 15, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Sweden - Distribution of population by household types: Single person [Dataset]. https://tradingeconomics.com/sweden/distribution-of-population-by-household-types-single-person-eurostat-data.html
    Explore at:
    Available download formats: csv, excel, xml, json
    Dataset updated
    Sep 15, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Sweden
    Description

    Sweden - Distribution of population by household types: Single person was 22.20% in December of 2024, according to EUROSTAT. Trading Economics provides the current actual value, a historical data chart and related indicators for Sweden - Distribution of population by household types: Single person - last updated from EUROSTAT in December of 2025. Historically, Sweden - Distribution of population by household types: Single person reached a record high of 24.10% in December of 2023 and a record low of 19.80% in December of 2012.

