100+ datasets found
  1. MIVIA ARG Dataset

    • mivia.unisa.it
    • zenodo.org
    text/vf-format
    Updated Jan 1, 2013
    Cite
    MIVIA Lab (2013). MIVIA ARG Dataset [Dataset]. http://doi.org/10.1016/S0167-8655(02)00253-2
    Available download formats: text/vf-format
    Dataset updated
    Jan 1, 2013
    Dataset authored and provided by
    MIVIA Lab
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The ARG Database is a large collection of labeled and unlabeled graphs created by the MIVIA Group. Its aim is to provide the graph research community with a standard test ground for benchmarking graph matching algorithms.

  2. Data from: PDD Graph: Bridging Electronic Medical Records and Biomedical...

    • springernature.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Meng Wang; Jiaheng Zhang; Jun Liu; Wei Hu; Sen Wang; Xue Li; Wenqiang Liu (2023). PDD Graph: Bridging Electronic Medical Records and Biomedical Knowledge Graphs via Entity Linking [Dataset]. http://doi.org/10.6084/m9.figshare.5242138
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Meng Wang; Jiaheng Zhang; Jun Liu; Wei Hu; Sen Wang; Xue Li; Wenqiang Liu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Patient-drug-disease (PDD) Graph dataset, utilising electronic medical records (EMRs) and biomedical knowledge graphs. The novel framework to construct the PDD graph is described in the associated publication.

    PDD is an RDF graph consisting of PDD facts, where a PDD fact is represented by an RDF triple to indicate that a patient takes a drug or a patient is diagnosed with a disease. For instance: (pdd:274671, pdd:diagnosed, sepsis)

    Data files are in .nt N-Triples format, a line-based syntax for an RDF graph. They can be opened with any openly available text-editing software.

    • diagnose_icd_information.nt - contains RDF triples mapping patients to diagnoses. For example: (pdd:18740, pdd:diagnosed, icd99592), where pdd:18740 is a patient entity and icd99592 is the ICD-9 code of sepsis.
    • drug_patients.nt - contains RDF triples mapping patients to drugs. For example: (pdd:18740, pdd:prescribed, aspirin), where pdd:18740 is a patient entity and aspirin is the drug's name.

    Background: Electronic medical records contain multi-format electronic medical data that hold an abundance of medical knowledge. Faced with patients' symptoms, experienced caregivers make the right medical decisions based on their professional knowledge, which accurately grasps relationships between symptoms, diagnoses and corresponding treatments. In the associated paper, we aim to capture these relationships by constructing a large and high-quality heterogeneous graph linking patients, diseases, and drugs (PDD) in EMRs. Specifically, we propose a novel framework to extract important medical entities from MIMIC-III (Medical Information Mart for Intensive Care III) and automatically link them with the existing biomedical knowledge graphs, including ICD-9 ontology and DrugBank.

    The PDD graph presented in this paper is accessible on the Web via the SPARQL endpoint as well as in .nt format in this repository, and provides a pathway for medical discovery and applications, such as effective treatment recommendations.

    De-identification: It is necessary to mention that MIMIC-III contains clinical information of patients. Although the protected health information was de-identified, researchers who seek to use more clinical data should complete an online training course and then apply for permission to download the complete MIMIC-III dataset: https://mimic.physionet.org/
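    As a rough illustration of the line-based N-Triples layout mentioned in this description, the Python sketch below parses IRI-only triples and groups diagnoses by patient. The `example.org` IRIs, the regular expression, and the helper names `parse_ntriples` and `group_by_patient` are assumptions made for illustration only; the actual files may use different namespaces and may contain literals or blank nodes, which this sketch does not handle.

```python
import re
from collections import defaultdict

# Matches one simple N-Triples statement whose subject, predicate and object
# are all IRIs: <s> <p> <o> .  Literals and blank nodes are not handled.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def parse_ntriples(lines):
    """Yield (subject, predicate, object) for each IRI-only triple line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match:
            yield match.groups()


def group_by_patient(triples, predicate_suffix):
    """Map each subject to the objects of predicates ending in predicate_suffix."""
    grouped = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate.endswith(predicate_suffix):
            grouped[subject].append(obj)
    return dict(grouped)


# Hypothetical sample lines; the example.org IRIs are placeholders, not the
# namespaces used by the published dataset.
SAMPLE = [
    '<http://example.org/pdd/18740> <http://example.org/pdd/diagnosed> <http://example.org/icd9/99592> .',
    '<http://example.org/pdd/18740> <http://example.org/pdd/prescribed> <http://example.org/drug/aspirin> .',
]

diagnoses = group_by_patient(parse_ntriples(SAMPLE), '/diagnosed')
print(diagnoses)
```

    For real files, a full RDF library is likely the safer choice, since it also handles literals, escapes and blank nodes.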

  3. MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets

    • ieee-dataport.org
    Updated Jan 26, 2025
    Cite
    Mohsen Koohi (2025). MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets [Dataset]. https://ieee-dataport.org/open-access/ms-biographs-trillion-scale-sequence-similarity-graph-datasets
    Dataset updated
    Jan 26, 2025
    Authors
    Mohsen Koohi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    MS-BioGraphs are a family of sequence similarity graph datasets with up to 2.5 trillion edges. The graphs have weighted edges and are presented in compressed WebGraph format. The datasets include symmetric and asymmetric graphs. The largest graph was created by matching sequences in the Metaclust dataset with 1.7 billion sequences. These real-world graph datasets are useful for measuring contributions in High-Performance Computing and High-Performance Graph Processing.

  4. NLP-KnowledgeGraph

    • huggingface.co
    Updated Feb 28, 2024
    Cite
    Vishnu Nandakumar (2024). NLP-KnowledgeGraph [Dataset]. https://huggingface.co/datasets/vishnun/NLP-KnowledgeGraph
    Dataset updated
    Feb 28, 2024
    Authors
    Vishnu Nandakumar
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for Dataset Name

      Dataset Summary

    KG dataset created using the spaCy PoS tagger and dependency parser.

      Supported Tasks and Leaderboards

    Can be leveraged for token classification for detection of knowledge graph entities and relations.

      Languages

    English

      Dataset Structure

      Data Instances

    [More Information Needed]

      Data Fields

    Important fields for the token classification task are

    • tokens - tokenized text
    • tags - Tags…

    See the full description on the dataset page: https://huggingface.co/datasets/vishnun/NLP-KnowledgeGraph.

  5. er-graph-1k-60k

    • networkrepository.com
    csv
    Updated Aug 1, 2021
    Cite
    Network Data Repository (2021). er-graph-1k-60k [Dataset]. https://networkrepository.com/er-graph-1k-60k.php
    Available download formats: csv
    Dataset updated
    Aug 1, 2021
    Dataset authored and provided by
    Network Data Repository
    License

    https://networkrepository.com/policy.php

    Description

    Generated Graphs

  6. The F-score of different methods.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 9, 2023
    Cite
    Qingju Jiao; Peige Zhao; Hanjin Zhang; Yahong Han; Guoying Liu (2023). The F-score of different methods. [Dataset]. http://doi.org/10.1371/journal.pone.0287001.t003
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Qingju Jiao; Peige Zhao; Hanjin Zhang; Yahong Han; Guoying Liu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Most current graph neural networks (GNNs) are designed from a methodological point of view and rarely consider the inherent characteristics of the graph. Although these characteristics may affect the performance of GNNs, very few methods have been proposed to address the issue. In this work, we mainly focus on improving the performance of graph convolutional networks (GCNs) on graphs without node features. To this end, we propose a method called t-hopGCN, which describes t-hop neighbors by the shortest path between two nodes and then uses the adjacency matrix of t-hop neighbors as features to perform node classification. Experimental results show that t-hopGCN can significantly improve the performance of node classification on graphs without node features. More importantly, adding the adjacency matrix of t-hop neighbors can improve the performance of existing popular GNNs on node classification.
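    The idea of a t-hop adjacency matrix can be sketched in a few lines of Python. This is one possible reading of the description, not the authors' implementation: it assumes that node j is a t-hop neighbor of node i exactly when their shortest-path distance equals t, and the function names `shortest_path_lengths` and `t_hop_adjacency` are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """BFS distances from source in an unweighted graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def t_hop_adjacency(adj, t):
    """Return an n x n 0/1 matrix whose entry [i][j] is 1 iff dist(i, j) == t.

    Assumption: "t-hop neighbor" means shortest-path distance exactly t.
    """
    n = len(adj)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j, d in shortest_path_lengths(adj, i).items():
            if d == t:
                matrix[i][j] = 1
    return matrix


# Path graph 0 - 1 - 2 - 3, stored as adjacency lists.
path = [[1], [0, 2], [1, 3], [2]]
two_hop = t_hop_adjacency(path, 2)
for row in two_hop:
    print(row)
```

    Each row of the resulting matrix could then serve as a feature vector for the corresponding node when no other node features are available.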

  7. Citation Network Graph

    • shibatadb.com
    Updated Aug 17, 2025
    Cite
    Yubetsu (2025). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/yVJrjFoS
    Dataset updated
    Aug 17, 2025
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 46 papers and 72 citation links related to "Robotic exploration as graph construction".

  8. Communication Graphs

    • kaggle.com
    Updated Nov 15, 2021
    Cite
    Subhajit Sahu (2021). Communication Graphs [Dataset]. https://www.kaggle.com/wolfram77/graphs-communication/code
    Dataset updated
    Nov 15, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    email-EuAll: EU email communication network

    The network was generated using email data from a large European research institution. For a period from October 2003 to May 2005 (18 months) we have anonymized information about all incoming and outgoing email of the research institution. For each sent or received email message we know the time, the sender and the recipient of the email. Overall we have 3,038,531 emails between 287,755 different email addresses. Note that we have a complete email graph for only 1,258 email addresses that come from the research institution. Furthermore, there are 34,203 email addresses that both sent and received email within the span of our dataset. All other email addresses are either non-existing, mistyped or spam.

    Given a set of email messages, each node corresponds to an email address. We create a directed edge between nodes i and j, if i sent at least one message to j.
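    The edge rule above is simple enough to sketch directly. The following Python snippet is an illustration under stated assumptions, not the code used to build the dataset: messages are assumed to be (sender, recipient) pairs, and the names `build_email_graph` and `senders_and_receivers` are hypothetical.

```python
def build_email_graph(messages):
    """Return the set of directed edges (i, j) such that i sent j at least one message.

    `messages` is an iterable of (sender, recipient) pairs; repeated messages
    between the same pair collapse into a single edge.
    """
    edges = set()
    for sender, recipient in messages:
        edges.add((sender, recipient))
    return edges


def senders_and_receivers(edges):
    """Return the addresses that appear both as a sender and as a recipient."""
    senders = {i for i, _ in edges}
    receivers = {j for _, j in edges}
    return senders & receivers


# Toy message log with made-up addresses.
messages = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
edges = build_email_graph(messages)
print(sorted(edges))
print(sorted(senders_and_receivers(edges)))
```

    The second helper mirrors the distinction drawn in the description between addresses that both sent and received email and those that appear on only one side.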

    email-Enron: Enron email network

    The Enron email communication network covers all the email communication within a dataset of around half a million emails. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation. Nodes of the network are email addresses, and if an address i sent at least one email to address j, the graph contains an undirected edge from i to j. Note that non-Enron email addresses act as sinks and sources in the network, as we only observe their communication with the Enron email addresses.

    The Enron email data was originally released by William Cohen at CMU.

    wiki-Talk: Wikipedia Talk network

    Wikipedia is a free encyclopedia written collaboratively by volunteers around the world. Each registered user has a talk page, that she and other users can edit in order to communicate and discuss updates to various articles on Wikipedia. Using the latest complete dump of Wikipedia page edit history (from January 3 2008) we extracted all user talk page changes and created a network.

    The network contains all the users and discussion from the inception of Wikipedia till January 2008. Nodes in the network represent Wikipedia users and a directed edge from node i to node j represents that user i at least once edited a talk page of user j.

    comm-f2f-Resistance: Dynamic Face-to-Face Interaction Networks

    The dynamic face-to-face interaction networks represent the interactions that happen during discussions between a group of participants playing the Resistance game. This dataset contains networks extracted from 62 games. Each game is played by 5-8 participants and lasts between 45 and 60 minutes. We extract dynamically evolving networks from the free-form discussions using the ICAF algorithm. The extracted networks are used to characterize and detect group deceptive behavior using the DeceptionRank algorithm.

    The networks are weighted, directed and temporal. Each node represents a participant. At each 1/3 second, a directed edge from node u to v is weighted by the probability of participant u looking at participant v or the laptop. Additionally, we also provide a binary version where an edge from u to v indicates participant u looks at participant v (or the laptop).

    Stanford Network Analysis Platform (SNAP) is a general purpose, high performance system for analysis and manipulation of large networks. Graphs consist of nodes and directed/undirected/multiple edges between the graph nodes. Networks are graphs with data on nodes and/or edges of the network.

    The core SNAP library is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. Besides scalability to large graphs, an additional strength of SNAP is that nodes, edges and attributes in a graph or a network can be changed dynamically during the computation.

    SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in Nov, 2009. SNAP uses a general purpose STL (Standard Template Library)-like library GLib developed at Jozef Stefan Institute. SNAP and GLib are being actively developed and used in numerous academic and industrial projects.

    http://snap.stanford.edu/data/index.html#email

  9. Data from: IngridKG: A FAIR Knowledge Graph of Graffiti

    • ris.uni-paderborn.de
    • data.niaid.nih.gov
    • +1 more
    Updated Aug 16, 2023
    + more versions
    Cite
    Ahmed, Abdullah Fathi Ahmed; Morim da Silva, Ana Alexandra; Ngonga Ngomo, Axel-Cyrille; Niemann, Sven; Pestryakova, Svetlana; Sherif, Mohamed (2023). IngridKG: A FAIR Knowledge Graph of Graffiti [Dataset]. https://ris.uni-paderborn.de/record/45558
    Dataset updated
    Aug 16, 2023
    Authors
    Ahmed, Abdullah Fathi Ahmed; Morim da Silva, Ana Alexandra; Ngonga Ngomo, Axel-Cyrille; Niemann, Sven; Pestryakova, Svetlana; Sherif, Mohamed
    Description

    Graffiti is an urban phenomenon that is increasingly attracting the interest of the sciences. To the best of our knowledge, no suitable data corpora have been available for systematic research until now. The Information System Graffiti in Germany project (Ingrid) closes this gap by dealing with graffiti image collections that have been made available to the project for public use. Within Ingrid, the graffiti images are collected, digitized and annotated. With this work, we aim to support rapid access to a comprehensive data source on Ingrid, targeted especially at researchers. In particular, we present IngridKG, an RDF knowledge graph of annotated graffiti that abides by the Linked Data and FAIR principles. We update IngridKG weekly by adding newly annotated graffiti to our knowledge graph. Our generation pipeline applies RDF data conversion, link discovery and data fusion approaches to the original data. The current version of IngridKG contains 460,640,154 triples and is linked to 3 other knowledge graphs by over 200,000 links. In our use case studies, we demonstrate the usefulness of our knowledge graph for different applications.

  10. OpenAIRE Graph dataset: new collected projects

    • data.europa.eu
    • data.niaid.nih.gov
    unknown
    Updated Nov 28, 2023
    + more versions
    Cite
    Zenodo (2023). OpenAIRE Graph dataset: new collected projects [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-10160505?locale=en
    Available download formats: unknown (2088960)
    Dataset updated
    Nov 28, 2023
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The dataset includes metadata about project grants collected by OpenAIRE until September 2023. This dump involves

    • 563 HE (Horizon Europe) new projects
    • 162 FCT (Fundação para a Ciência e a Tecnologia) new projects
    • 3154 NWO (Netherlands Organisation for Scientific Research) new projects
    • 350 FWF (Austrian Science Fund) new projects
    • 634 SNSF (Swiss National Science Foundation) new projects
    • 1 WT (Wellcome Trust) new project
    • 11 TWCF (Templeton World Charity Foundation) new projects; this is also a new funder

    This dataset also contains 157 new Irish funders for which we do not have a curated list of projects; they are therefore associated with a special type of project called unidentified.

  11. OpenAIRE Graph dataset: new collected projects

    • zenodo.org
    • data.europa.eu
    tar
    Updated Apr 17, 2024
    Cite
    Miriam Baglioni; Alessia Bardi; Harry Dimitropoulos; Claudio Atzori; Paolo Manghi (2024). OpenAIRE Graph dataset: new collected projects [Dataset]. http://doi.org/10.5281/zenodo.10948409
    Available download formats: tar
    Dataset updated
    Apr 17, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Miriam Baglioni; Alessia Bardi; Harry Dimitropoulos; Claudio Atzori; Paolo Manghi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The dataset includes metadata about project grants collected by OpenAIRE until February 2024. This dump involves

    • 393 HE (Horizon Europe) new projects
    • 2 WT (Wellcome Trust) new projects
    • 2766 ARC (Australian Research Council) new projects

  12. Citation Network Graph

    • shibatadb.com
    Updated Dec 24, 2014
    Cite
    Yubetsu (2014). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/U7bfefYP
    Dataset updated
    Dec 24, 2014
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 41 papers and 100 citation links related to "Secret-Sharing Schemes for Very Dense Graphs".

  13. Notations in this paper.

    • figshare.com
    xls
    Updated Jun 21, 2023
    + more versions
    Cite
    Faezeh Faez; Negin Hashemi Dijujin; Mahdieh Soleymani Baghshah; Hamid R. Rabiee (2023). Notations in this paper. [Dataset]. http://doi.org/10.1371/journal.pone.0277887.t001
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Faezeh Faez; Negin Hashemi Dijujin; Mahdieh Soleymani Baghshah; Hamid R. Rabiee
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Notations in this paper.

  14. Citation Network Graph

    • shibatadb.com
    Updated Sep 23, 2011
    Cite
    Yubetsu (2011). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/wf6x6Dk2
    Dataset updated
    Sep 23, 2011
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 44 papers and 88 citation links related to "Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction".

  15. Replication Data for: Geometry-Complete Perceptron Networks for 3D Molecular...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Nov 6, 2022
    Cite
    Alex Morehead; Jianlin Cheng (2022). Replication Data for: Geometry-Complete Perceptron Networks for 3D Molecular Graphs [Dataset]. http://doi.org/10.5281/zenodo.7293186
    Available download formats: application/gzip
    Dataset updated
    Nov 6, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alex Morehead; Jianlin Cheng
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Included are preprocessed data files for the Newtonian many-body systems modeling task described in our accompanying manuscript.

  16. Salmonella enterica pangenome graph and variant call data for 539,283...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Jul 11, 2025
    Cite
    Agricultural Research Service (2025). Salmonella enterica pangenome graph and variant call data for 539,283 genomes [Dataset]. https://catalog.data.gov/dataset/isalmonella-enterica-ipangenome-graph-and-variant-call-data-for-539283-genomes
    Dataset updated
    Jul 11, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Salmonella pangenome graph and variant call data for 539,283 genomes

    Description: Salmonella enterica causes human disease and decreases agricultural production. The overall goal of this project is to generate a large database of S. enterica variants with 539,283 samples and 236,069 features for applications in machine learning and genomics. We transformed single nucleotide polymorphism (SNP) data into reduced dimensional representations which are tolerant of missing data based on disentangled variational autoencoders. TFRecord files were made with custom Python scripts that parsed the variant call format (VCF) files into sparse tensors and combined them with the Salmonella In Silico Typing Resource (SISTR) serotype data.

    The data directory contains:

    • The tar file of TFRecords: tfrecords.tar (103 GB). The TFRecords are organized first by how they were genotyped. mpileup records were created with Mpileup, and the gvg records were created with graph variant calling. Each of these directories holds batches of ~10,000 sequence reads named Sra10k_XX.tfrecord.gz (00-54). File Sra10k_99.tfrecord.gz contains incomplete SRAs. Each TFRecord contains the shape of the tensor, the indices of non-zero variants, sample name, serotype, and sparse values. Value 99 was assigned to '.' records.
    • The file output.tar (11.4 TB) contains the .vcf files used to create the TFRecords above. The data in here is contained more succinctly in the TFRecord format. This data will not normally be used.
    • A tar file of metadata files for the samples, metadata (95 MB). Sequence read archive (SRA) accessions were downloaded using edirect/eutilities and saved as SraAccList.txt.

        esearch -db sra -query "txid28901[Organism:exp] AND (cluster_public[prop] AND 'biomol dna'[Properties] AND 'library layout paired'[Properties] AND 'platform illumina'[Properties] AND 'strategy wgs'[Properties] OR 'strategy wga'[Properties] OR 'strategy wcs'[Properties] OR 'strategy clone'[Properties] OR 'strategy finishing'[Properties] OR 'strategy validation'[Properties])" | efetch -format runinfo -mode xml | xtract -pattern Row -element Run > SraAccList.txt

      Google BigQuery was used to download metadata for the SRA accessions from the National Institutes of Health (NIH).

        SELECT * FROM nih-sra-datastore.sra.metadata as metadata INNER JOIN {table_id} as leiacc ON metadata.acc = leiacc.accID;

      Files were processed into batches of ~10,000 and named Sra_completed_XX.csv (00-53).
    • A VCF document mapping the TFRecord data to the positions in the graph subjected to the Type strain LT2: mapping/DRR452337.gvg.vcf-with_TFRecord_in_1st_column.txt
    • Scripts for creating and reading TFRecord data: code.reading_and_parsing_fns.py defines functions for converting VCFs of variants called using gvg to sparse tensors and makes the TFRecord files. gvg_to_tfrecord.py creates TFRecords from the sparse tensors.
    • Tutorial for using the TFRecords: Example_logistic_regression.md
    • Pangenome graph files and references used for variant calling and genotyping:
      • pangenome.refPlus100.fasta.gz, which contains the genomes of the 101 Salmonella strains without plasmids used for construction of the pangenome graph.
      • salm.100.NC_003197_v2.d2_complete.gfa.gz, the complete 101 Salmonella strain pangenome graph in Graphical Fragment Assembly (GFA2) Format 2.0, including alt nodes used for genotyping.
      • salm.100.NC_003197_v2.full.gfa.gz, the full graph including alt nodes.
      • salm.100.NC_003197_v2.full.vcf.gz, a VCF of the file above.
      • genotyped.gvg.vcf, the genotype calls in VCF format.
      • paths.txt, the paths of the graph.

    SCINet users: The data folder can be accessed/retrieved with a valid SCINet account at this location: /LTS/ADCdatastorage/NAL/published/node28083194/ See the SCINet File Transfer guide for more information on moving large files: https://scinet.usda.gov/guides/data/datatransfer

    Globus users: The files can also be accessed through Globus by following this data link. The user will need to log in to Globus in order to access this data. User accounts are free of charge with several options for signing on. Instructions for creating an account are on the login page.
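    To make the sparse layout in this description more concrete (tensor shape, indices of non-zero variants, sparse values, with 99 marking '.' records), here is a small pure-Python sketch. It does not read real TFRecord files: the dictionary keys, the one-dimensional shape, the sample values, and the function names `densify` and `missing_fraction` are all assumptions made for illustration.

```python
MISSING = 99  # per the description, value 99 was assigned to '.' records


def densify(record):
    """Expand a 1-D sparse record into a dense list; unlisted positions are 0."""
    (length,) = record["shape"]  # assumes a one-dimensional tensor
    dense = [0] * length
    for index, value in zip(record["indices"], record["values"]):
        dense[index] = value
    return dense


def missing_fraction(record):
    """Fraction of all positions whose stored value is the missing marker."""
    (length,) = record["shape"]
    missing = sum(1 for value in record["values"] if value == MISSING)
    return missing / length


# Hypothetical record; field names and contents are illustrative only.
example = {
    "sample_name": "SAMPLE_0001",
    "serotype": "Typhimurium",
    "shape": [8],
    "indices": [1, 4, 6],
    "values": [1, 99, 2],
}

print(densify(example))
print(missing_fraction(example))
```

    The tutorial file named in the description is the authoritative guide to reading the actual records; this sketch only shows how a shape, index list and value list combine into a dense vector.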

  17. Employed full time: Wage and salary workers: Biomedical engineers...

    • fred.stlouisfed.org
    json
    Updated Jan 22, 2025
    + more versions
    Cite
    (2025). Employed full time: Wage and salary workers: Biomedical engineers occupations: 16 years and over [Dataset]. https://fred.stlouisfed.org/series/LEU0254478700A
    Available download formats: json
    Dataset updated
    Jan 22, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for Employed full time: Wage and salary workers: Biomedical engineers occupations: 16 years and over (LEU0254478700A) from 2000 to 2024 about biomedical, engineering, occupation, full-time, salaries, workers, 16 years +, wages, employment, and USA.

  18. Unit Labor Costs for Professional, Scientific, and Technical Services:...

    • fred.stlouisfed.org
    json
    Updated Dec 2, 2024
    + more versions
    Cite
    (2024). Unit Labor Costs for Professional, Scientific, and Technical Services: Advertising Agencies (NAICS 54181) in the United States [Dataset]. https://fred.stlouisfed.org/series/IPUMN54181U101000000
    Available download formats: json
    Dataset updated
    Dec 2, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    United States
    Description

    Graph and download economic data for Unit Labor Costs for Professional, Scientific, and Technical Services: Advertising Agencies (NAICS 54181) in the United States (IPUMN54181U101000000) from 1988 to 2022 about advertisement, science, agency, unit labor cost, professional, NAICS, services, and USA.

  19. Assembly graph

    • figshare.com
    txt
    Updated Aug 24, 2020
    Cite
    Cameron Thrash (2020). Assembly graph [Dataset]. http://doi.org/10.6084/m9.figshare.12857879.v1
    Available download formats: txt
    Dataset updated
    Aug 24, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Cameron Thrash
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Assembly graph from Unicycler hybrid assembly using Illumina and Nanopore reads.

  20. Citation Network Graph

    • shibatadb.com
    Updated May 15, 2012
    Cite
    Yubetsu (2012). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/o6Bb38ik
    Dataset updated
    May 15, 2012
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 44 papers and 110 citation links related to "Performance of histogram descriptors for the classification of 3D laser range data in urban environments".

MIVIA ARG Dataset

35 scholarly articles cite this dataset (View in Google Scholar)