Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset information
Internet topology graph. From traceroutes run daily in 2005 -
http://www.caida.org/tools/measurement/skitter. From several scattered sources
to million destinations. 1.7 million nodes, 11 million edges.
Dataset statistics
Nodes 1696415
Edges 11095298
Nodes in largest WCC 1694616 (0.999)
Edges in largest WCC 11094209 (1.000)
Nodes in largest SCC 1694616 (0.999)
Edges in largest SCC 11094209 (1.000)
Average clustering coefficient 0.2963
Number of triangles 28769868
Fraction of closed triangles 0.005387
Diameter (longest shortest path) 25
90-percentile effective diameter 5.9
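The node, edge and largest-WCC figures above can be recomputed from an edge list. The following is a minimal sketch, assuming the file is a whitespace-separated edge list with `#` comment lines (an assumption about the layout of as-skitter.txt.gz, not something stated here); the function names are illustrative only. It uses a union-find structure, which treats the graph as undirected.

```python
import gzip
from collections import Counter


def read_edges(lines):
    """Yield (u, v) integer pairs, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        yield int(u), int(v)


def load_edges(path):
    """Read a gzipped edge list; the file layout is an assumption."""
    with gzip.open(path, "rt") as handle:
        return list(read_edges(handle))


def largest_component(edges):
    """Return (nodes, nodes_in_largest_wcc, edges_in_largest_wcc)."""
    edges = list(edges)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    node_sizes = Counter(find(n) for n in list(parent))
    edge_sizes = Counter(find(u) for u, _ in edges)
    root, nodes_in_wcc = node_sizes.most_common(1)[0]
    return len(parent), nodes_in_wcc, edge_sizes[root]


sample = ["# toy graph", "1\t2", "2\t3", "4\t5"]
stats = largest_component(read_edges(sample))
```

On the real file, `largest_component(load_edges("as-skitter.txt.gz"))` would be the equivalent call. The fractions in parentheses above are the ratios of the component counts to the totals, for example 1694616 / 1696415 rounds to 0.999.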
Source (citation)
J. Leskovec, J. Kleinberg and C. Faloutsos. Graphs over Time: Densification
Laws, Shrinking Diameters and Possible Explanations. ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD), 2005.
Files
File Description
as-skitter.txt.gz AS from traceroutes run daily in 2005 by skitter
The source data file contains the raw data of graphs and charts not included in S1–S4 Data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time series data for the statistic Compilation of government finance statistics and country Uruguay. Indicator Definition: Compilation of government finance statistics refers to the Government Finance Statistics Manual (GFSM) in use for compiling the data. It provides guidelines on the institutional structure of governments and the presentation of fiscal data in a format similar to business accounting with a balance sheet and income statement plus guidelines on the treatment of exchange rate and other valuation adjustments. The latest manual GFSM2014 is harmonized with the SNA2008.
In this activity, students use real water chemistry data and descriptive statistics in Excel to examine primary productivity in an urban estuary of the Salish Sea. They will consider how actual data do or do not support expected annual trends.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data analytics as a field is currently at a crucial point in its development, as a commoditization takes place in the context of increasing amounts of data, more user diversity, and automated analysis solutions, the latter potentially eliminating the need for expert analysts. A central hypothesis of the present paper is that data visualizations should be adapted to both the user and the context. This idea was initially addressed in Study 1, which demonstrated substantial interindividual variability among a group of experts when freely choosing an option to visualize data sets. To lay the theoretical groundwork for a systematic, taxonomic approach, a user model combining user traits, states, strategies, and actions was proposed and further evaluated empirically in Studies 2 and 3. The results implied that for adapting to user traits, statistical expertise is a relevant dimension that should be considered. Additionally, for adapting to user states different user intentions such as monitoring and analysis should be accounted for. These results were used to develop a taxonomy which adapts visualization recommendations to these (and other) factors. A preliminary attempt to validate the taxonomy in Study 4 tested its visualization recommendations with a group of experts. While the corresponding results were somewhat ambiguous overall, some aspects nevertheless supported the claim that a user-adaptive data visualization approach based on the principles outlined in the taxonomy can indeed be useful. While the present approach to user adaptivity is still in its infancy and should be extended (e.g., by testing more participants), the general approach appears to be very promising.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
U.S. Granted Utility Patents Originating in an Undetermined Statistical Area in California was 0.00000 Number in January of 2015, according to the United States Federal Reserve. Historically, U.S. Granted Utility Patents Originating in an Undetermined Statistical Area in California reached a record high of 1.00000 in January of 2006 and a record low of 0.00000 in January of 2001. Trading Economics provides the current actual value, a historical data chart and related indicators for U.S. Granted Utility Patents Originating in an Undetermined Statistical Area in California, last updated from the United States Federal Reserve in December of 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time series data for the statistic Compilation of government finance statistics and country Brunei Darussalam. Indicator Definition: Compilation of government finance statistics refers to the Government Finance Statistics Manual (GFSM) in use for compiling the data. It provides guidelines on the institutional structure of governments and the presentation of fiscal data in a format similar to business accounting with a balance sheet and income statement plus guidelines on the treatment of exchange rate and other valuation adjustments. The latest manual GFSM2014 is harmonized with the SNA2008.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing World hunger statistics by year from 2001 to 2022.
Salmonella pangenome graph and variant call data for 539,283 genomes

Description: Salmonella enterica causes human disease and decreases agricultural production. The overall goal of this project is to generate a large database of S. enterica variants with 539,283 samples and 236,069 features for applications in machine learning and genomics. We transformed single nucleotide polymorphism (SNP) data into reduced-dimensional representations that tolerate missing data, based on disentangled variational autoencoders. TFRecord files were made with custom Python scripts that parsed the variant call format (VCF) files into sparse tensors and combined them with the Salmonella In Silico Typing Resource (SISTR) serotype data.

The data directory contains:

- The tar file of TFRecords: tfrecords.tar (103 GB). The TFRecords are organized first by how they were genotyped. The mpileup records were created with Mpileup, and the gvg records were created with graph variant calling. Each of these directories holds batches of ~10,000 sequence reads named Sra10k_XX.tfrecord.gz (00--54). File Sra10k_99.tfrecord.gz contains incomplete SRAs. Each TFRecord contains the shape of the tensor, the indices of non-zero variants, sample name, serotype, and sparse values. Value 99 was assigned to '.' records.

- The file output.tar (11.4 TB) contains the .vcf files used to create the TFRecords above. The data in here is contained more succinctly in the TFRecord format. This data will not normally be used.

- A tar file of metadata files for the samples: metadata (95 MB). Sequence read archive (SRA) accessions were downloaded using edirect/eutilities and saved as SraAccList.txt.

    esearch -db sra -query "txid28901[Organism:exp] AND (cluster_public[prop] AND 'biomol dna'[Properties] AND 'library layout paired'[Properties] AND 'platform illumina'[Properties] AND 'strategy wgs'[Properties] OR 'strategy wga'[Properties] OR 'strategy wcs'[Properties] OR 'strategy clone'[Properties] OR 'strategy finishing'[Properties] OR 'strategy validation'[Properties])" | efetch -format runinfo -mode xml | xtract -pattern Row -element Run > SraAccList.txt

  Google BigQuery was used to download metadata for the SRA accessions from the National Institutes of Health (NIH).

    SELECT * FROM nih-sra-datastore.sra.metadata as metadata INNER JOIN {table_id} as leiacc ON metadata.acc = leiacc.accID;

  Files were processed into batches of ~10,000 and named Sra_completed_XX.csv (00--53).

- A VCF document mapping the TFRecord data to the positions in the graph subjected to the Type strain LT2: mapping/DRR452337.gvg.vcf-with_TFRecord_in_1st_column.txt

- Scripts for creating and reading TFRecord data (code):
  - reading_and_parsing_fns.py defines functions for converting VCFs of variants called using gvg to sparse tensors and makes the TFRecord files.
  - gvg_to_tfrecord.py creates TFRecords from the sparse tensors.

- Tutorial for using the TFRecords: Example_logistic_regression.md

- Pangenome graph files and references used for variant calling and genotyping (pangenome):
  - refPlus100.fasta.gz contains the genomes of the 101 Salmonella strains without plasmids used for construction of the pangenome graph.
  - salm.100.NC_003197_v2.d2_complete.gfa.gz is the complete 101 Salmonella strain pangenome graph in Graphical Fragment Assembly (GFA2) Format 2.0, including alt nodes used for genotyping.
  - salm.100.NC_003197_v2.full.gfa.gz is the full graph including alt nodes.
  - salm.100.NC_003197_v2.full.vcf.gz is a VCF of the file above.
  - genotyped.gvg.vcf contains the genotype calls in VCF format.
  - paths.txt contains the paths of the graph.

SCINet users: The data folder can be accessed/retrieved with a valid SCINet account at this location: /LTS/ADCdatastorage/NAL/published/node28083194/
See the SCINet File Transfer guide for more information on moving large files: https://scinet.usda.gov/guides/data/datatransfer

Globus users: The files can also be accessed through Globus by following this data link. The user will need to log in to Globus in order to access this data. User accounts are free of charge with several options for signing on. Instructions for creating an account are on the login page.
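The sparse layout described for each TFRecord (tensor shape, indices of non-zero variants, sparse values, with 99 standing in for '.' records) can be illustrated with a small, library-free sketch. This is not the project's own code: the function name and the assumption that each call is a string holding either an integer allele index or '.' are hypothetical, chosen only to show the idea.

```python
MISSING = 99  # value the description says was assigned to '.' records


def to_sparse(calls):
    """Convert dense per-site calls into (shape, indices, values).

    `calls` is assumed to be a list of strings, each an integer allele
    index such as "0" or "1", or "." for a missing call.
    """
    indices, values = [], []
    for position, call in enumerate(calls):
        value = MISSING if call == "." else int(call)
        if value != 0:  # only non-zero entries are stored
            indices.append(position)
            values.append(value)
    return (len(calls),), indices, values


shape, indices, values = to_sparse(["0", "1", ".", "0", "2"])
```

Storing only the non-zero positions keeps each record small when most sites match the reference, which is presumably why the TFRecords are far smaller than the VCF archive.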
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for International Merchandise Trade Statistics: Imports: Commodities for Poland (POLXTIMVA01CXMLQ) from Q1 1980 to Q2 2025 about Poland, imports, and trade.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing Africa immigration statistics by year from N/A to N/A.
Graffiti is an urban phenomenon that is increasingly attracting the interest of the sciences. To the best of our knowledge, no suitable data corpora have been available for systematic research until now. The Information System Graffiti in Germany project (Ingrid) closes this gap by dealing with graffiti image collections that have been made available to the project for public use. Within Ingrid, the graffiti images are collected, digitized and annotated. With this work, we aim to give researchers in particular rapid access to a comprehensive data source on Ingrid. We present IngridKG, an RDF knowledge graph of annotated graffiti that abides by the Linked Data and FAIR principles. We update IngridKG weekly by adding newly annotated graffiti to the knowledge graph. Our generation pipeline applies RDF data conversion, link discovery and data fusion approaches to the original data. The current version of IngridKG contains 460,640,154 triples and is linked to 3 other knowledge graphs by over 200,000 links. In our use case studies, we demonstrate the usefulness of our knowledge graph for different applications.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Business process event data modeled as labeled property graphs
Data Format
-----------
The dataset comprises one labeled property graph in two different file formats.
#1) Neo4j .dump format
A neo4j (https://neo4j.com) database dump that contains the entire graph and can be imported into a fresh neo4j database instance using the following command, see also the neo4j documentation: https://neo4j.com/docs/
/bin/neo4j-admin.(bat|sh) load --database=graph.db --from=
The .dump was created with Neo4j v3.5.
#2) .graphml format
A .zip file containing a .graphml file of the entire graph
Data Schema
-----------
The graph is a labeled property graph over business process event data. Each graph uses the following concepts
:Event nodes - each event node describes a discrete event, i.e., an atomic observation described by attribute "Activity" that occurred at the given "timestamp"
:Entity nodes - each entity node describes an entity (e.g., an object or a user), it has an EntityType and an identifier (attribute "ID")
:Log nodes - describes a collection of events that were recorded together, most graphs only contain one log node
:Class nodes - each class node describes a type of observation that has been recorded, e.g., the different types of activities that can be observed, :Class nodes group events into sets of identical observations
:CORR relationships - from :Event to :Entity nodes, describes whether an event is correlated to a specific entity; an event can be correlated to multiple entities
:DF relationships - "directly-followed by" between two :Event nodes describes which event is directly-followed by which other event; both events in a :DF relationship must be correlated to the same entity node. All :DF relationships form a directed acyclic graph.
:HAS relationship - from a :Log to an :Event node, describes which events had been recorded in which event log
:OBSERVES relationship - from an :Event to a :Class node, describes to which event class an event belongs, i.e., which activity was observed in the graph
:REL relationship - placeholder for any structural relationship between two :Entity nodes
The concepts are further defined in Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases. CoRR abs/2005.14552 (2020) https://arxiv.org/abs/2005.14552
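To make the :CORR and :DF concepts above concrete, here is a minimal in-memory sketch of how "directly-followed by" pairs could be derived per entity. It assumes that events correlated to the same entity are ordered by timestamp and that consecutive events form a :DF pair; that construction rule, the function name and the toy identifiers are assumptions for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def derive_df(events, corr):
    """Return (from_event, to_event, entity) triples.

    events: dict mapping event id -> timestamp
    corr:   iterable of (event id, entity id) pairs, i.e. :CORR edges
    """
    by_entity = defaultdict(list)
    for event_id, entity_id in corr:
        by_entity[entity_id].append(event_id)

    df = []
    for entity_id, event_ids in by_entity.items():
        # Order by timestamp; the event id breaks ties deterministically.
        ordered = sorted(event_ids, key=lambda e: (events[e], e))
        for earlier, later in zip(ordered, ordered[1:]):
            df.append((earlier, later, entity_id))
    return df


events = {"e1": 1, "e2": 2, "e3": 3}
corr = [("e1", "app1"), ("e3", "app1"), ("e2", "app1"),
        ("e2", "w1"), ("e3", "w1")]
df_edges = sorted(derive_df(events, corr))
```

Because an event can be correlated to several entities, the same pair of events may be linked once per shared entity, as with e2 and e3 in the toy data.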
Data Contents
-------------
neo4j-bpic15-2021-02-17 (.dump|.graphml.zip)
An integrated graph describing the raw event data of the entire BPI Challenge 2015 dataset.
van Dongen, B.F. (Boudewijn) (2015): BPI Challenge 2015. 4TU.ResearchData. Collection. https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1
This data is provided by five Dutch municipalities. The data contains all building permit applications over a period of approximately four years. There are many different activities present, denoted by both codes (attribute concept:name) and labels, both in Dutch (attribute taskNameNL) and in English (attribute taskNameEN). The cases in the log contain information on the main application as well as objection procedures in various stages. Furthermore, information is available about the resource that carried out the task and on the cost of the application (attribute SUMleges).

The processes in the five municipalities should be identical, but may differ slightly. Especially when changes are made to procedures, rules or regulations, the time at which these changes are pushed into the five municipalities may differ. Of course, over the four-year period, the underlying processes have changed.

The municipalities have a number of questions, namely:
- What are the roles of the people involved in the various stages of the process and how do these roles differ across municipalities?
- What are the possible points for improvement on the organizational structure for each of the municipalities?
- The employees of two of the five municipalities have physically moved into the same location recently. Did this lead to a change in the processes and if so, what is different?
- Some of the procedures will be outsourced from 2018, i.e. they will be removed from the process and the applicant needs to have these activities performed by an external party before submitting the application. What will be the effect of this on the organizational structures in the five municipalities?
- Where are differences in throughput times between the municipalities and how can these be explained?
- What are the differences in control flow between the municipalities?

There are five different log files available in this collection. Events are labeled with both a code and a Dutch and English label. Each activity code consists of three parts: two digits, a variable number of characters, and then three digits. The first two digits as well as the characters indicate the subprocess the activity belongs to. For instance ‘01_HOOFD_xxx’ indicates the main process and ‘01_BB_xxx’ indicates the ‘objections and complaints’ (‘Beroep en Bezwaar’ in Dutch) subprocess. The last three digits hint at the order in which activities are executed, where the first digit often indicates a phase within a process. Each trace and each event contain several data attributes that can be used for various checks and predictions. Furthermore, some employees may have performed tasks for different municipalities, i.e. if the employee number is the same, it is safe to assume the same person is being identified.
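The three-part structure of the activity codes can be split with a regular expression. This is a sketch under assumptions: the underscore separators are inferred from the two examples in the text, the function name is hypothetical, and the concrete codes used below ("01_HOOFD_010", "01_BB_540") are made-up inputs that follow the described pattern rather than values confirmed to occur in the log.

```python
import re

# Two digits, a variable-length middle part, three digits.
ACTIVITY_CODE = re.compile(r"^(\d{2})_(.+)_(\d{3})$")


def parse_activity_code(code):
    """Split an activity code into subprocess, order digits and phase."""
    match = ACTIVITY_CODE.match(code)
    if match is None:
        raise ValueError(f"not a recognised activity code: {code!r}")
    prefix, letters, order = match.groups()
    return {
        "subprocess": f"{prefix}_{letters}",  # e.g. main process or objections
        "order": order,                       # hints at execution order
        "phase": order[0],                    # first digit often marks a phase
    }


main = parse_activity_code("01_HOOFD_010")
objection = parse_activity_code("01_BB_540")
```

Grouping events by the `subprocess` value would separate the main process from the objections and complaints subprocess before any ordering analysis.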
The data contains the following entities and their events
- Application - a building permit application handled in one of five Dutch municipalities
- Case_R - a user or worker involved in handling the application
- Responsible_actor - a user or worker designated as responsible actor for an activity
- monitoringResource - a user or worker designated as monitoring resource for an activity
The data contains 5 event log nodes as the data was integrated from 5 different event logs from 5 different systems.
Data Size
---------
BPIC15, nodes: 268851, relationships: 2620418
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spain: Internet users, percent of population: The latest value from 2023 is 95.4 percent, an increase from 94.5 percent in 2022. In comparison, the world average is 72.46 percent, based on data from 177 countries. Historically, the average for Spain from 1990 to 2023 is 46.36 percent. The minimum value, 0.01 percent, was reached in 1990 while the maximum of 95.4 percent was recorded in 2023.
https://www.datainsightsmarket.com/privacy-policy
The medical knowledge graph market is booming, projected to reach $6 billion by 2033, driven by AI, big data, and the need for improved healthcare data analysis. Learn about key market trends, leading companies, and future growth opportunities in this insightful report.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bureau of Labor Statistics Industrialized Countries - Import Price Index: Other electrical equipment and component manufacturing for Industrialized Countries was 140.50000 Index 2010=100 in December of 2019, according to the United States Federal Reserve. Historically, Bureau of Labor Statistics Industrialized Countries - Import Price Index: Other electrical equipment and component manufacturing for Industrialized Countries reached a record high of 140.50000 in December of 2019 and a record low of 92.60000 in January of 2017. Trading Economics provides the current actual value, a historical data chart and related indicators for Bureau of Labor Statistics Industrialized Countries - Import Price Index: Other electrical equipment and component manufacturing for Industrialized Countries, last updated from the United States Federal Reserve in November of 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Philippines: Political corruption index: The latest value from 2024 is 0.849 index points, an increase from 0.83 index points in 2023. In comparison, the world average is 0.483 index points, based on data from 171 countries. Historically, the average for the Philippines from 1960 to 2024 is 0.782 index points. The minimum value, 0.711 index points, was reached in 1992 while the maximum of 0.89 index points was recorded in 1979.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bureau of Labor Statistics Pacific Rim - Import Price Index by Origin (NAICS): Plastics Product Manufacturing for Pacific Rim was 101.10000 Index 2010=100 in August of 2025, according to the United States Federal Reserve. Historically, Bureau of Labor Statistics Pacific Rim - Import Price Index by Origin (NAICS): Plastics Product Manufacturing for Pacific Rim reached a record high of 109.50000 in February of 2021 and a record low of 96.40000 in December of 2016. Trading Economics provides the current actual value, a historical data chart and related indicators for Bureau of Labor Statistics Pacific Rim - Import Price Index by Origin (NAICS): Plastics Product Manufacturing for Pacific Rim, last updated from the United States Federal Reserve in November of 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Jordan: PISA science scores: The latest value from 2022 is 374.527 index points, a decline from 429.252 index points in 2018. In comparison, the world average is 449.005 index points, based on data from 78 countries. Historically, the average for Jordan from 2006 to 2022 is 409.863 index points. The minimum value, 374.527 index points, was reached in 2022 while the maximum of 429.252 index points was recorded in 2018.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
305 Global import shipment records of Graph with prices, volume & current Buyer's suppliers relationships based on actual Global export trade database.