This part of the data release includes graphical representation (figures) of data from sediment cores collected in 2009 offshore of Palos Verdes, California. This file graphically presents combined data for each core (one core per page). Data on each figure are continuous core photograph, CT scan (where available), graphic diagram core description (graphic legend included at right; visual grain size scale of clay, silt, very fine sand [vf], fine sand [f], medium sand [med], coarse sand [c], and very coarse sand [vc]), multi-sensor core logger (MSCL) p-wave velocity (meters per second) and gamma-ray density (grams per cc), radiocarbon age (calibrated years before present) with analytical error (years), and pie charts that present grain-size data as percent sand (white), silt (light gray), and clay (dark gray). This is one of seven files included in this U.S. Geological Survey data release that include data from a set of sediment cores acquired from the continental slope, offshore Los Angeles and the Palos Verdes Peninsula, adjacent to the Palos Verdes Fault. Gravity cores were collected by the USGS in 2009 (cruise ID S-I2-09-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=SI209SC), and vibracores were collected with the Monterey Bay Aquarium Research Institute's remotely operated vehicle (ROV) Doc Ricketts in 2010 (cruise ID W-1-10-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=W110SC). One spreadsheet (PalosVerdesCores_Info.xlsx) contains core name, location, and length. One spreadsheet (PalosVerdesCores_MSCLdata.xlsx) contains Multi-Sensor Core Logger P-wave velocity, gamma-ray density, and magnetic susceptibility whole-core logs. One zipped folder of .bmp files (PalosVerdesCores_Photos.zip) contains continuous core photographs of the archive half of each core. One spreadsheet (PalosVerdesCores_GrainSize.xlsx) contains laser particle grain size sample information and analytical results. 
One spreadsheet (PalosVerdesCores_Radiocarbon.xlsx) contains radiocarbon sample information, results, and calibrated ages. One zipped folder of DICOM files (PalosVerdesCores_CT.zip) contains raw computed tomography (CT) image files. One .pdf file (PalosVerdesCores_Figures.pdf) contains combined displays of data for each core, including graphic diagram descriptive logs. This particular metadata file describes the information contained in the file PalosVerdesCores_Figures.pdf. All cores are archived by the U.S. Geological Survey Pacific Coastal and Marine Science Center.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains the graph representation of structures in the Materials Project (www.materialsproject.org) and target properties, including formation energy per atom, band gap, and, for a subset of 5830 structures, the shear moduli G_{VRH} and bulk moduli K_{VRH}. This data is part of our paper "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals".
Change log:
v5. Minor change of filename.
v4. Add dummy state variables.
v3. For the graph dictionaries, we modify the "node" key to "atom" and the "distance" key to "bond" to match the latest MEGNet API.
v2. Minor change of description.
v1. Initial upload.
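The key renaming described in the v3 change-log entry can be illustrated with a minimal sketch. The helper name `migrate_graph_keys`, the extra keys, and all values below are hypothetical and are not taken from the dataset or from the MEGNet API; only the two renames ("node" to "atom", "distance" to "bond") come from the description above.

```python
# Hypothetical helper: rename legacy graph-dictionary keys to the names
# mentioned in the v3 change-log entry ("node" -> "atom", "distance" -> "bond").
KEY_RENAMES = {"node": "atom", "distance": "bond"}


def migrate_graph_keys(graph: dict) -> dict:
    """Return a copy of `graph` with legacy keys renamed; other keys are kept."""
    return {KEY_RENAMES.get(key, key): value for key, value in graph.items()}


# Toy graph dictionary; the "index1"/"index2" keys and all values are made up.
old_graph = {
    "node": [3, 8],
    "distance": [1.6, 1.6],
    "index1": [0, 1],
    "index2": [1, 0],
}
new_graph = migrate_graph_keys(old_graph)
print(sorted(new_graph))
```

A helper like this would only be needed when reading files from versions before v3; later versions already use the new key names according to the change log.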
This part of the data release includes graphical representation (figures) of data from sediment cores collected in 2014 in Monterey Canyon. It is one of five files included in this U.S. Geological Survey data release that include data from a set of sediment cores acquired from the continental slope, north of Monterey Canyon, offshore central California. Vibracores and push cores were collected with the Monterey Bay Aquarium Research Institute’s (MBARI’s) remotely operated vehicle (ROV) Doc Ricketts in 2014 (cruise ID 2014-615-FA). One spreadsheet (NorthernFlankMontereyCanyonCores_Info.xlsx) contains core name, location, and length. One spreadsheet (NorthernFlankMontereyCanyonCores_MSCLdata.xlsx) contains Multi-Sensor Core Logger P-wave velocity and gamma-ray density whole-core logs of vibracores. One zipped folder of .bmp files (NorthernFlankMontereyCanyonCores_Photos.zip) contains continuous core photographs of the archive half of each vibracore. One spreadsheet (NorthernFlankMontereyCanyonCores_Radiocarbon.xlsx) contains radiocarbon sample information, results, and calibrated ages. One .pdf file (NorthernFlankMontereyCanyonCores_Figures.pdf) contains combined displays of data for each vibracore, including graphic diagram descriptive logs. This particular metadata file describes the information contained in the file NorthernFlankMontereyCanyon_Figures.pdf. All vibracores are archived by the U.S. Geological Survey Pacific Coastal and Marine Science Center. Other remaining core material, if available, is archived at MBARI.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Metal–organic frameworks (MOFs) hold great potential in gas separation and storage. Graph neural networks (GNNs) have proven effective in exploring structure–property relationships and discovering new MOF structures. Unlike molecular graphs, crystal graphs must account for periodicity and repeating patterns. MOFs’ specific features at different scales, such as covalent bonds, functional groups, and global structures, influenced by interatomic interactions, exert varying degrees of impact on gas adsorption or selectivity. Moreover, redundant interatomic interactions hinder training accuracy, leading to overfitting. This research introduces a construction method for multiscale crystal graphs, which considers specific features at different scales by decomposing the crystal graph into multiple subgraphs based on interatomic interactions within varying distance ranges. Additionally, it takes into account the global structure of the crystal by encoding the periodic patterns of the unit cells. We propose MSAIGNN, a multiscale atomic interaction graph neural network with a self-attention-based graph pooling mechanism, which incorporates three-body bond angle information, accounts for structural features at different scales, and minimizes interference from redundant interactions. Compared with traditional methods, MSAIGNN demonstrates higher prediction accuracy in assessing single-component adsorption, gas separation, and structural features. Visualization of attention scores confirms effective learning of structural features at different scales, highlighting MSAIGNN’s interpretability. Overall, MSAIGNN offers a novel, efficient, multilayered, and interpretable approach for property prediction of complex porous crystal structures like MOFs using deep learning.
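The idea of decomposing a crystal graph into subgraphs by interatomic distance range can be sketched roughly as follows. This is not the MSAIGNN implementation: the distance cutoffs, the function name `multiscale_subgraphs`, and the toy coordinates are all assumptions made for illustration, and periodic images of the unit cell are ignored.

```python
import math
from itertools import combinations

# Assumed distance ranges in angstroms; the description does not give the cutoffs.
DISTANCE_RANGES = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]


def multiscale_subgraphs(positions, ranges=DISTANCE_RANGES):
    """Split atom pairs into one edge list per distance range [lo, hi)."""
    subgraphs = [[] for _ in ranges]
    for i, j in combinations(range(len(positions)), 2):
        d = math.dist(positions[i], positions[j])
        for k, (lo, hi) in enumerate(ranges):
            if lo <= d < hi:
                subgraphs[k].append((i, j, d))
                break  # each pair belongs to at most one range
    return subgraphs


# Toy coordinates (not a real MOF); periodic boundary conditions are not applied.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 3.0, 0.0), (5.0, 0.0, 0.0)]
short, medium, long_range = multiscale_subgraphs(atoms)
print(len(short), len(medium), len(long_range))
```

Pairs farther apart than the largest cutoff are dropped entirely, which is one simple way to limit the redundant long-range interactions mentioned above.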
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset contains graph instances used to test various optimization techniques for Roman domination problems. All instances are artificially generated. There are six subsets of graph instances provided: bipartite, grid, net, planar, random and recursive graphs.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis, as many published systems have used outdated datasets or subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format; instead, the different, partly inconsistent datasets use proprietary data representations. Additionally, the characteristics of datasets are typically not reflected by the dataset maintainers nor by the system publishers. To overcome these problems, we established an ontology, the Question Answering Dataset Ontology (QADO), for representing the QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, LC-QuAD series, RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we performed intensive analyses of the datasets to identify their characteristics, making it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.
Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.
Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.
New Zealand's official employment and unemployment statistics are sourced from the Household Labour Force Survey. Data on the number of people employed in New Zealand and the unemployment rate is available from 1970.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
This data collection contains test Word Usage Graphs (WUGs) for English. Find a description of the data format, code to process the data and further datasets on the WUGsite.
The data is provided for testing purposes and thus contains specific data cases, which are sometimes artificially created, sometimes picked from existing data sets. The data contains the following cases:
afternoon_nn: sampled from DWUG EN 2.0.1. 200 uses partly annotated by multiple annotators with 427 judgments. Has clear cluster structure with only one cluster, no graded change, no binary change, and medium agreement of 0.62 Krippendorff's alpha.
arm: standard textbook example for semantic proximity (see reference below). Fully connected graph with six word uses, annotated by author.
plane_nn: sampled from DWUG EN 2.0.1. 200 uses partly annotated by multiple annotators with 1152 judgments. Has clear cluster structure, high graded change, binary change, and high agreement of 0.82 Krippendorff's alpha.
target: similar to arm, but with only three repeated sentences. Fully connected graph with 8 word uses, annotated by author. Pairs with the same sentence (exactly the same string) are annotated with 4; pairs with different strings are annotated with 1.
Please find more information in the paper referenced below.
Version: 1.2.0, 30.06.2023. Remove instances files as these should be inferred from judgments when aggregating.
Reference
Dominik Schlechtweg. 2023. Human and Computational Measurement of Lexical Semantic Change. PhD thesis. University of Stuttgart.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Weighted graph representation of a road network in selected regions. Derived from Open Street Map https://www.openstreetmap.org. The dataset can be used as input for the betweenness centrality algorithm implemented here: https://code.it4i.cz/ADAS/betweenness.
Archive contents
The archive contains the following folders.
CZE
Static graphs of three major cities in the Czech Republic (Praha, Brno, Ostrava) and the entire Czech road network. Weighted by length of the road segments in metres.
PT
Static graphs of Lisbon, Porto and the entire Portuguese road network. Weighted by length of the road segments in metres.
Data format
Standard UTF-8 encoded CSV files, separated by semicolon with the following columns:
id1: (Type: unsigned long) - start node
id2: (Type: unsigned long) - end node
dist: (Type: unsigned long) - weight of the edge (length in metres, unless described otherwise)
edge_id: (Type: unsigned long) - unique edge identifier
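As a rough illustration of the format above, the following sketch reads a semicolon-separated edge list into an adjacency dictionary. It assumes the files start with a header row naming the four columns and treats each row as a directed edge from id1 to id2; neither assumption is confirmed by the description, and the function name and sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def load_weighted_graph(handle):
    """Read semicolon-separated edges into {node: [(neighbour, dist, edge_id), ...]}."""
    adjacency = defaultdict(list)
    # Assumption: the first row is a header with the column names id1;id2;dist;edge_id.
    reader = csv.DictReader(handle, delimiter=";")
    for row in reader:
        start, end = int(row["id1"]), int(row["id2"])
        adjacency[start].append((end, int(row["dist"]), int(row["edge_id"])))
    return adjacency


# Made-up sample rows standing in for one of the CSV files.
sample = io.StringIO("id1;id2;dist;edge_id\n1;2;120;10\n2;3;75;11\n1;3;300;12\n")
graph = load_weighted_graph(sample)
print(dict(graph))
```

For a real file, the `io.StringIO` object would be replaced by `open(path, encoding="utf-8", newline="")`.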
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The graph representation of the "A collection of public dataset graphs for the HRA" dataset.
Digital line graph (DLG) data are digital representations of cartographic information. DLGs of map features are converted to digital form from maps and related sources. Intermediate-scale DLG data are derived from USGS 1:100,000-scale 30- by 60-minute quadrangle maps. If these maps are not available, Bureau of Land Management planimetric maps at a scale of 1:100,000 are used. Intermediate-scale DLGs are sold in five categories: (1) Public Land Survey System; (2) boundaries; (3) transportation; (4) hydrography; and (5) hypsography. All DLG data distributed by the USGS are DLG - Level 3 (DLG-3), which means the data contain a full range of attribute codes, have full topological structuring, and have passed certain quality-control checks.
https://www.archivemarketresearch.com/privacy-policy
The global knowledge graph technology market is projected to reach a value of USD 4.7 billion by 2033, exhibiting a CAGR of 10.3% from 2025 to 2033. The surge in data volume and the increasing adoption of artificial intelligence (AI) and machine learning (ML) are the key factors driving the growth of this market. The increasing need for effective data management and analysis is also contributing to the market's expansion. Key market trends include the shift towards unstructured knowledge graphs, the integration of knowledge graphs with natural language processing, and the increasing use of knowledge graphs in enterprise applications. Based on type, the market is segmented into structured knowledge graphs and unstructured knowledge graphs. Structured knowledge graphs are more common and are used in a wide range of applications, including search engines, question answering systems, and recommender systems. Unstructured knowledge graphs are less common but are becoming increasingly popular as they can represent more complex and nuanced relationships. Based on application, the market is segmented into medical, finance, education, and others. The medical segment is the largest and is expected to continue to grow as knowledge graphs are used to improve patient care and outcomes. The finance segment is also growing rapidly as knowledge graphs are used to improve risk management, fraud detection, and customer segmentation. The education segment is also growing as knowledge graphs are used to improve student learning and engagement.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For the K4 graph, a coloring type (K4,K4;n) is an edge coloring of the complete graph Kn that contains neither a K4 subgraph in the first color (represented by non-edges in the graph) nor a K4 subgraph in the second color (represented by edges in the graph). The Ramsey number R(4,4) is the smallest natural number n such that every edge coloring of the complete graph Kn contains a subgraph isomorphic to K4 in the first color (non-edges in the graph) or a subgraph isomorphic to K4 in the second color (edges in the graph). Coloring types (K4,K4;n) exist for n<R(4,4). The dataset consists of 14 files containing all non-isomorphic graphs that are coloring types (K4,K4;n) for 1<n<16.
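The definition can be checked by brute force on small graphs. The sketch below assumes a graph is given as a vertex count and an edge list (the dataset's actual file format is not described here), and the function name is hypothetical. A graph is a coloring type (K4,K4;n) exactly when no four vertices are pairwise adjacent and no four vertices are pairwise non-adjacent.

```python
from itertools import combinations


def is_k4_k4_coloring(n, edges):
    """True if the graph on vertices 0..n-1 has no K4 and no independent set of size 4."""
    edge_set = {frozenset(edge) for edge in edges}
    for quad in combinations(range(n), 4):
        # Count how many of the 6 vertex pairs in this 4-subset are edges.
        present = sum(frozenset(pair) in edge_set for pair in combinations(quad, 2))
        if present == 0 or present == 6:  # K4 in the first or in the second color
            return False
    return True


# The 5-cycle has clique number 2 and independence number 2, so it qualifies.
c5 = [(i, (i + 1) % 5) for i in range(5)]
k4 = list(combinations(range(4), 2))
print(is_k4_k4_coloring(5, c5), is_k4_k4_coloring(4, k4), is_k4_k4_coloring(4, []))
```

The check examines every 4-subset of vertices, which is affordable for the graph sizes in this dataset (n below 16) but not intended for large graphs.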
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Freebase is amongst the largest public cross-domain knowledge graphs. It possesses three main data modeling idiosyncrasies. It has a strong type system; its properties are purposefully represented in reverse pairs; and it uses mediator objects to represent multiary relationships. These design choices are important in modeling the real world. But they also pose nontrivial challenges for research on embedding models for knowledge graph completion, especially when models are developed and evaluated agnostically of these idiosyncrasies. We make available several variants of the Freebase dataset by inclusion and exclusion of these data modeling idiosyncrasies. This is the first-ever publicly available full-scale Freebase dataset that has gone through proper preparation.
Dataset Details
The dataset consists of the four variants of Freebase dataset as well as related mapping/support files. For each variant, we made three kinds of files available:
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
This data collection contains diachronic Word Usage Graphs (WUGs) for English. Find a description of the data format, code to process the data and further datasets on the WUGsite.
See previous versions for additional testsets.
Please find more information on the provided data in the paper referenced below.
Version: 2.0.1, 30.11.2022. Assigns noise uses the cluster label '-1' instead of removing them. Important: Version 2.0.0 extends previous versions with one more annotation round and new clusterings.
Reference
Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Python scripts were developed to analyze and visualize data encoded in the provided .txt files. MATLAB scripts generate the raw time series data. Two examples of simulated results are provided as Graphics Interchange Format (GIF) files. For more details, two READ ME files are included:
This dataset is a collection of undirected and unweighted LFR benchmark graphs as proposed by Lancichinetti et al. [1]. We generated the graphs using the code provided by Santo Fortunato on his personal website [2], embedded in our evaluation framework [3], with two different parameter sets. Let N denote the number of vertices in the network, then
Maximum community size: 0.2N (Set A); 0.1N (Set B)
Minimum community size: 0.05N (Set A); 10 (Set B)
Maximum node degree: 0.19N (Set A); 0.19N (Set B)
Community size distribution exponent: 1.0 (Set A); 1.0 (Set B)
Degree distribution exponent: 2.0 (Set A); 2.0 (Set B)
All other parameters assume default values. We provide graphs with different combinations of average degree, network size and mixing parameter for the given parameter sets:
Set A: For average degrees in {15, 25, 50} we provide network sizes in {300, 600, 1200}, each with 20 different mixing parameters linearly spaced in [0.2, 0.8]. For each configuration we provide 100 benchmark graphs.
Set A: For average degrees in {15, 25, 50} we provide mixing parameters in {0.35, 0.45, 0.55}, each with network sizes in {300, 450, 600, 900, 1200, 1800, 2400, 3600, 4800, 6200, 9600, 19200}. For each configuration we provide 50 benchmark graphs.
Set B: For average degrees in {20} we provide network sizes in {300, 600, 1200, 2400}, each with 20 different mixing parameters linearly spaced in [0.2, 0.8]. For each configuration we provide 100 benchmark graphs.
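To make the parameter grid concrete, the sketch below enumerates the first Set A configuration family and computes the size-dependent community bounds. It is only an illustration of the listed numbers, not the generator code referenced in [2] or [3]; the function names are hypothetical, and the mixing parameters are rounded to six decimals for readability.

```python
from itertools import product


def linspace(start, stop, count):
    """Return `count` evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (count - 1)
    return [round(start + i * step, 6) for i in range(count)]


def community_size_bounds(n, parameter_set):
    """Minimum and maximum community size for n vertices, per the listed parameters."""
    if parameter_set == "A":
        return 0.05 * n, 0.2 * n
    if parameter_set == "B":
        return 10, 0.1 * n
    raise ValueError(f"unknown parameter set: {parameter_set}")


# First Set A family: average degree x network size x mixing parameter.
mixing = linspace(0.2, 0.8, 20)
set_a_grid = list(product([15, 25, 50], [300, 600, 1200], mixing))
print(len(set_a_grid), "configurations,", len(set_a_grid) * 100, "graphs")
```

With 100 graphs per configuration, this family alone accounts for 3 x 3 x 20 x 100 benchmark graphs.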
Benchmark graphs are given in edge list format. Further, for each benchmark graph we provide ground truth communities as membership list and as structured datatype (.json), its generating random seeds and basic network statistics.
[1] Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Physical Review E 78(4):046110, https://doi.org/10.1103/PhysRevE.78.046110
[2] https://www.santofortunato.net/resources, Accessed: 19 Jan 2021
[3] https://github.com/synwalk/synwalk-analysis, Accessed: 19 Jan 2021
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
In exploring some of the concepts around Directed Acyclic Graphs and OLab in the assessment of clinical decision making, we have been juggling ideas around layered and interconnected DAGs. Some of these explorations led us to the concept of heterogeneous graphs.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The ARG Database is a huge collection of labeled and unlabeled graphs realized by the MIVIA Group. The aim of this collection is to provide the graph research community with a standard test ground for the benchmarking of graph matching algorithms.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description. This is the data used in the experiment of the following conference paper:
N. Arınık, R. Figueiredo, and V. Labatut, “Signed Graph Analysis for the Interpretation of Voting Behavior,” in International Conference on Knowledge Technologies and Data-driven Business - International Workshop on Social Network Analysis and Digital Humanities, Graz, AT, 2017, vol. 2025. ⟨hal-01583133⟩
Source code. The source code is available on GitHub: https://github.com/CompNet/NetVotes
Citation. If you use the data or source code, please cite the above paper.
@InProceedings{Arinik2017,
  author = {Arınık, Nejat and Figueiredo, Rosa and Labatut, Vincent},
  title = {Signed Graph Analysis for the Interpretation of Voting Behavior},
  booktitle = {International Conference on Knowledge Technologies and Data-driven Business - International Workshop on Social Network Analysis and Digital Humanities},
  year = {2017},
  volume = {2025},
  series = {CEUR Workshop Proceedings},
  address = {Graz, AT},
  url = {http://ceur-ws.org/Vol-2025/paper_rssna_1.pdf},
}
Details.
----------------------
# COMPARISON RESULTS
The 'material-stats' folder contains all the comparison results obtained for Ex-CC and ILS-CC. The csv files associated with plots are also provided.
The folder structure is as follows:
* material-stats/
** execTimePerf: The plot shows the execution time of Ex-CC and ILS-CC based on randomly generated complete networks of different size.
** graphStructureAnalysis: The plots show the weights and links statistics for all instances.
** ILS-CC-vs-Ex-CC: The folder contains 4 different comparisons between Ex-CC and ILS-CC: Imbalance difference, number of detected clusters, difference of the number of detected clusters, NMI (Normalized Mutual Information)
----------------------
Funding: Agorantic FR 3621, FMJH Program Gaspard Monge in optimization and operation research (Project 2015-2842H)