36 datasets found

OGB-LSC Dataset
paperswithcode.com
Updated Jan 25, 2024
Weihua Hu; Matthias Fey; Hongyu Ren; Maho Nakata; Yuxiao Dong; Jure Leskovec (2024). OGB-LSC Dataset [Dataset]. https://paperswithcode.com/dataset/ogb-lsc
Authors
Weihua Hu; Matthias Fey; Hongyu Ren; Maho Nakata; Yuxiao Dong; Jure Leskovec
Description
OGB Large-Scale Challenge (OGB-LSC) is a collection of three real-world datasets for advancing the state-of-the-art in large-scale graph ML. OGB-LSC provides graph datasets that are orders of magnitude larger than existing ones and covers three core graph learning tasks -- link prediction, graph regression, and node classification.

OGB-LSC consists of three datasets: MAG240M-LSC, WikiKG90M-LSC, and PCQM4M-LSC. Each dataset offers an independent task.

MAG240M-LSC is a heterogeneous academic graph, and the task is to predict the subject areas of papers situated in the heterogeneous graph (node classification). WikiKG90M-LSC is a knowledge graph, and the task is to impute missing triplets (link prediction). PCQM4M-LSC is a quantum chemistry dataset, and the task is to predict an important molecular property, the HOMO-LUMO gap, of a given molecule (graph regression).
OGB Dataset
paperswithcode.com
Updated Jul 19, 2021
Weihua Hu; Matthias Fey; Marinka Zitnik; Yuxiao Dong; Hongyu Ren; Bowen Liu; Michele Catasta; Jure Leskovec (2021). OGB Dataset [Dataset]. https://paperswithcode.com/dataset/ogb
Authors
Weihua Hu; Matthias Fey; Marinka Zitnik; Yuxiao Dong; Hongyu Ren; Bowen Liu; Michele Catasta; Jure Leskovec
Description
The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner. OGB is a community-driven initiative in active development.
ogbg_molpcba
tensorflow.org
Updated Dec 14, 2022
(2022). ogbg_molpcba [Dataset]. https://www.tensorflow.org/datasets/catalog/ogbg_molpcba
Description
'ogbg-molpcba' is a molecular dataset sampled from PubChem BioAssay. It is a graph prediction dataset from the Open Graph Benchmark (OGB).

This dataset is experimental, and the API is subject to change in future releases.

The description below is adapted from the OGB paper:

Input Format

All the molecules are pre-processed using RDKit ([1]).

Each graph represents a molecule, where nodes are atoms, and edges are chemical bonds.

Input node features are 9-dimensional, containing atomic number and chirality, as well as additional atom features such as formal charge and whether the atom is in a ring.

Input edge features are 3-dimensional, containing bond type, bond stereochemistry, as well as an additional bond feature indicating whether the bond is conjugated.

The exact description of all features is available at https://github.com/snap-stanford/ogb/blob/master/ogb/utils/features.py.
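As a rough illustration of the input format above, the sketch below stores one molecule as a small graph container and checks the 9-dimensional node and 3-dimensional edge feature shapes. The `MolGraph` class and the integer feature codes are hypothetical placeholders, not the OGB encoding; the actual feature definitions are in the features.py file linked above.

```python
from dataclasses import dataclass
from typing import List, Tuple

NODE_DIM = 9  # atom feature length stated in the description
EDGE_DIM = 3  # bond feature length stated in the description


@dataclass
class MolGraph:
    """Hypothetical container: atoms are nodes, bonds are edges."""
    node_feat: List[List[int]]    # one 9-dim vector per atom
    edges: List[Tuple[int, int]]  # (atom_i, atom_j) per bond
    edge_feat: List[List[int]]    # one 3-dim vector per bond

    def validate(self) -> None:
        """Raise ValueError if the shapes do not match the stated format."""
        if any(len(f) != NODE_DIM for f in self.node_feat):
            raise ValueError("node features must be 9-dimensional")
        if len(self.edges) != len(self.edge_feat):
            raise ValueError("need one feature vector per bond")
        if any(len(f) != EDGE_DIM for f in self.edge_feat):
            raise ValueError("edge features must be 3-dimensional")
        n = len(self.node_feat)
        if any(not (0 <= i < n and 0 <= j < n) for i, j in self.edges):
            raise ValueError("bond endpoint out of range")


# Toy three-atom chain; the integer codes are made up for illustration.
toy = MolGraph(
    node_feat=[
        [6, 0, 4, 5, 3, 0, 2, 0, 0],
        [6, 0, 4, 5, 2, 0, 2, 0, 0],
        [8, 0, 2, 5, 1, 0, 2, 0, 0],
    ],
    edges=[(0, 1), (1, 2)],
    edge_feat=[[0, 0, 0], [0, 0, 0]],
)
toy.validate()
```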

Prediction

The task is to predict 128 different biological activities (inactive/active). See [2] and [3] for more description about these targets. Not all targets apply to each molecule: missing targets are indicated by NaNs.
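Because missing targets are marked with NaN, a training loss has to skip those entries. A minimal sketch of one way to do that follows, using a masked binary cross-entropy on raw logits; the function name `masked_bce` and the toy values are illustrative assumptions, not part of the dataset API.

```python
import math
from typing import List


def masked_bce(logits: List[List[float]], labels: List[List[float]]) -> float:
    """Mean binary cross-entropy over entries whose label is not NaN."""
    total, count = 0.0, 0
    for row_logits, row_labels in zip(logits, labels):
        for z, y in zip(row_logits, row_labels):
            if math.isnan(y):
                continue  # this target does not apply to this molecule
            # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    if count == 0:
        raise ValueError("no labeled entries")
    return total / count


nan = float("nan")
# Two molecules, three targets (the real dataset has 128 targets).
logits = [[0.0, 2.0, -1.0], [1.5, 0.0, 0.0]]
labels = [[1.0, nan, 0.0], [nan, nan, 1.0]]
loss = masked_bce(logits, labels)
```

Only three of the six entries contribute to `loss` here; the NaN positions are ignored rather than treated as inactive.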

References

[1]: Greg Landrum, et al. 'RDKit: Open-source cheminformatics'. URL: https://github.com/rdkit/rdkit

[2]: Bharath Ramsundar, Steven Kearnes, Patrick Riley, Dale Webster, David Konerding and Vijay Pande. 'Massively Multitask Networks for Drug Discovery'. URL: https://arxiv.org/pdf/1502.02072.pdf

[3]: Zhenqin Wu, Bharath Ramsundar, Evan N Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, and Vijay Pande. MoleculeNet: a benchmark for molecular machine learning. Chemical Science, 9(2):513-530, 2018.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('ogbg_molpcba', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/ogbg_molpcba-0.1.3.png
OGB(Open Graph Benchmark)
opendatalab.com
zip
Updated May 1, 2020
Harvard University (2020). OGB(Open Graph Benchmark) [Dataset]. https://opendatalab.com/OpenDataLab/OGB
Available download formats: zip
Dataset provided by
Microsoft Research (https://www.microsoft.com/research/)
Harvard University
Technical University of Dortmund
Stanford University
Description
The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner. OGB is a community-driven initiative in active development.
OGB
huggingface.co
Updated Jan 25, 2024
Zhikai chen (2024). OGB [Dataset]. https://huggingface.co/datasets/zkchen/OGB
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Authors
Zhikai chen
Description
zkchen/OGB dataset hosted on Hugging Face and contributed by the HF Datasets community
OGBN-Proteins (Processed for PyG)
kaggle.com
zip
Updated Feb 27, 2021
Redao da Taupl (2021). OGBN-Proteins (Processed for PyG) [Dataset]. https://www.kaggle.com/dataup1/ogbn-proteins
Available download formats: zip (677947148 bytes)
Authors
Redao da Taupl
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
OGBN-Proteins

Webpage: https://ogb.stanford.edu/docs/nodeprop/#ogbn-proteins

Usage in Python

    import os.path as osp

    import datatable as dt  # assumed source of dt.fread; the original snippet omits this import
    import pandas as pd
    import torch
    import torch_geometric.transforms as T
    # Assumed location of this helper inside the ogb package.
    from ogb.io.read_graph_raw import read_nodesplitidx_split_hetero
    from ogb.nodeproppred import PygNodePropPredDataset


    class PygOgbnProteins(PygNodePropPredDataset):
        def __init__(self, meta_csv=None):
            # Read the dataset from the Kaggle input directory.
            root, name, transform = '/kaggle/input', 'ogbn-proteins', T.ToSparseTensor()
            if meta_csv is None:
                meta_csv = osp.join(root, name, 'ogbn-master.csv')
            master = pd.read_csv(meta_csv, index_col=0)
            meta_dict = master[name]
            meta_dict['dir_path'] = osp.join(root, name)
            super().__init__(name=name, root=root, transform=transform, meta_dict=meta_dict)

        def get_idx_split(self, split_type=None):
            if split_type is None:
                split_type = self.meta_info['split']
            path = osp.join(self.root, 'split', split_type)
            # Use the pre-saved split dictionary when it exists.
            if osp.isfile(osp.join(path, 'split_dict.pt')):
                return torch.load(osp.join(path, 'split_dict.pt'))
            if self.is_hetero:
                train_idx_dict, valid_idx_dict, test_idx_dict = read_nodesplitidx_split_hetero(path)
                for nodetype in train_idx_dict.keys():
                    train_idx_dict[nodetype] = torch.from_numpy(train_idx_dict[nodetype]).to(torch.long)
                    valid_idx_dict[nodetype] = torch.from_numpy(valid_idx_dict[nodetype]).to(torch.long)
                    test_idx_dict[nodetype] = torch.from_numpy(test_idx_dict[nodetype]).to(torch.long)
                return {'train': train_idx_dict, 'valid': valid_idx_dict, 'test': test_idx_dict}
            else:
                # Each CSV holds one column of node indices.
                train_idx = dt.fread(osp.join(path, 'train.csv'), header=None).to_numpy().T[0]
                train_idx = torch.from_numpy(train_idx).to(torch.long)
                valid_idx = dt.fread(osp.join(path, 'valid.csv'), header=None).to_numpy().T[0]
                valid_idx = torch.from_numpy(valid_idx).to(torch.long)
                test_idx = dt.fread(osp.join(path, 'test.csv'), header=None).to_numpy().T[0]
                test_idx = torch.from_numpy(test_idx).to(torch.long)
                return {'train': train_idx, 'valid': valid_idx, 'test': test_idx}


    dataset = PygOgbnProteins()
    split_idx = dataset.get_idx_split()
    train_idx, valid_idx, test_idx = split_idx['train'], split_idx['valid'], split_idx['test']
    graph = dataset[0]  # PyG Graph object

Description

Graph: The ogbn-proteins dataset is an undirected, weighted, and typed (according to species) graph. Nodes represent proteins, and edges indicate different types of biologically meaningful associations between proteins, e.g., physical interactions, co-expression or homology [1,2]. All edges come with 8-dimensional features, where each dimension represents the strength of a single association type and takes values between 0 and 1 (the larger the value is, the stronger the association is). The proteins come from 8 species.

Prediction task: The task is to predict the presence of protein functions in a multi-label binary classification setup, where there are 112 kinds of labels to predict in total. The performance is measured by the average of ROC-AUC scores across the 112 tasks.
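The metric above can be sketched in plain Python: compute ROC-AUC per label column, then average. This is an illustrative, quadratic-time version for small inputs, not the OGB Evaluator itself; the function names are hypothetical, and skipping columns that contain only one class is an assumption of this sketch.

```python
from typing import List, Optional


def roc_auc(y_true: List[int], y_score: List[float]) -> Optional[float]:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when a class is missing.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_roc_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Average per-task ROC-AUC; rows are nodes, columns are tasks."""
    num_tasks = len(y_true[0])
    aucs = []
    for t in range(num_tasks):
        auc = roc_auc([row[t] for row in y_true], [row[t] for row in y_score])
        if auc is not None:  # assumption: skip tasks with only one class
            aucs.append(auc)
    if not aucs:
        raise ValueError("no task has both classes")
    return sum(aucs) / len(aucs)


# Four proteins, two label columns (the real task has 112).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.1, 0.4], [0.8, 0.7], [0.3, 0.6]]
score = mean_roc_auc(y_true, y_score)
```

In this toy input the first column is ranked perfectly and the second gets three of four pairs right, so the average lands between the two per-task values.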

Dataset splitting: The authors split the protein nodes into training/validation/test sets according to the species which the proteins come from. This enables the evaluation of the generalization performance of the model across different species.
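A species-based split of this kind can be sketched as follows. The species IDs and the choice of which species are held out are arbitrary placeholders, not the official assignment, and `split_by_species` is a hypothetical helper.

```python
from typing import Dict, List, Sequence, Set


def split_by_species(
    node_species: Sequence[int],
    valid_species: Set[int],
    test_species: Set[int],
) -> Dict[str, List[int]]:
    """Assign each node to a split based on the species it belongs to."""
    if valid_species & test_species:
        raise ValueError("validation and test species must not overlap")
    split: Dict[str, List[int]] = {"train": [], "valid": [], "test": []}
    for node_id, species in enumerate(node_species):
        if species in test_species:
            split["test"].append(node_id)
        elif species in valid_species:
            split["valid"].append(node_id)
        else:
            split["train"].append(node_id)
    return split


# Hypothetical species IDs for seven proteins.
node_species = [0, 0, 1, 2, 1, 3, 2]
split_idx = split_by_species(node_species, valid_species={2}, test_species={3})
```

Since every protein of a held-out species lands in the same split, validation and test scores reflect performance on species the model never saw during training.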

Note: For undirected graphs, the loaded graphs will have the doubled number of edges because the bidirectional edges will be added automatically.
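The doubling mentioned in the note can be illustrated with a small sketch that adds the reverse of each edge and copies its features. The helper name `to_bidirectional` and the toy data are assumptions for illustration; this is not the loader's actual code.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def to_bidirectional(
    edges: List[Edge], feats: List[List[float]]
) -> Tuple[List[Edge], List[List[float]]]:
    """Add the reverse of every edge, copying its feature vector."""
    if len(edges) != len(feats):
        raise ValueError("need one feature vector per edge")
    out_edges = list(edges) + [(dst, src) for src, dst in edges]
    out_feats = [list(f) for f in feats] + [list(f) for f in feats]
    return out_edges, out_feats


# Three undirected edges with toy 2-dim features (the dataset uses 8 dims).
edges = [(0, 1), (1, 2), (0, 2)]
feats = [[0.5, 0.1], [0.9, 0.0], [0.2, 0.7]]
bi_edges, bi_feats = to_bidirectional(edges, feats)
```

Three undirected edges become six directed ones. By the same arithmetic, and assuming the published edge count of 39,561,252 refers to undirected edges, a loaded graph would report 79,122,504 edges.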

Summary

Package      #Nodes    #Edges       Split Type   Task Type                           Metric
ogb>=1.1.1   132,534   39,561,252   Species      Multi-label binary classification   ROC-AUC

Open Graph Benchmark

Website: https://ogb.stanford.edu

The Open Graph Benchmark (OGB) [3] is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner.

References

[1] Damian Szklarczyk, Annika L Gable, David Lyon, Alexander Junge, Stefan Wyder, Jaime Huerta-Cepas, Milan Simonovic, Nadezhda T Doncheva, John H Morris, Peer Bork, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research, 47(D1):D607–D613, 2019.

[2] Gene Ontology Consortium. The gene ontology resource: 20 years and still going strong. Nucleic Acids Research, 47(D1):D330–D338, 2018.

[3] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. Open graph benchmark: Datasets for machine learning on graphs. Advances in Neural Information Processing Systems, pp. 22118–22133, 2020.

Disclaimer

I am NOT the author of this dataset. It was downloaded from its official website. I assume no responsibility or liability for the content in this dataset. Any questions, problems or issues, please contact the original authors at their website or their GitHub repo.
CORA, Citeseer, Pubmed, OGB arXiv - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
(2024). CORA, Citeseer, Pubmed, OGB arXiv - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/cora--citeseer--pubmed--ogb-arxiv
Description
CORA, Citeseer, Pubmed, OGB arXiv
Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure Leskovec...
service.tib.eu
Updated Dec 2, 2024
Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure Leskovec (2024). OGB-LSC [Dataset]. https://doi.org/10.57702/lsm2j4pu. https://service.tib.eu/ldmservice/dataset/ogb-lsc
Description
OGB-LSC provides three large-scale, realistic benchmark datasets covering the core graph ML tasks of node classification, link prediction, and graph regression.
Benchmark Data for Chemprop
data.niaid.nih.gov
Updated Nov 9, 2023
Graff, David E. (2023). Benchmark Data for Chemprop [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8174267
Dataset provided by
Chung, Yunsie
Green, William H.
Vermeire, Florence H.
Heid, Esther
Greenman, Kevin P.
Li, Shih-Cheng
McGill, Charles J.
Wu, Haoyang
Graff, David E.
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets and splits of the manuscript "Chemprop: Machine Learning Package for Chemical Property Prediction." Train, validation and test splits are located within each folder, as well as additional data necessary for some of the benchmarks. To train Chemprop models, refer to our code repository to obtain ready-to-use scripts to train machine learning models for each of the systems. Available benchmarking systems:

hiv: HIV replication inhibition from MoleculeNet and OGB with scaffold splits
pcba_random: Biological activities from MoleculeNet with random splits (with missing targets filled in with zeros as provided by MoleculeNet)
pcba_random_nans: Biological activities from MoleculeNet with random splits and data format to match OGB (with missing targets not filled in with zeros)
pcba_scaffold: Biological activities from OGB with scaffold splits
qm9_multitask: DFT calculated properties from MoleculeNet and OGB, trained as a multi-task model
qm9_u0: DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target U0 only
qm9_gap: DFT calculated properties from MoleculeNet and OGB, trained as a single-task model on the target gap only
sampl: Water-octanol partition coefficients, used to predict molecules from the SAMPL6, 7 and 9 challenges
atom_bond_137k: Quantum-mechanical atom and bond descriptors
bde: Bond dissociation enthalpies trained as single-task model
bde_charges: Bond dissociation enthalpies trained as multi-task model together with atomic partial charges
charges_eps_4: Partial charges at a dielectric constant of 4 (in protein)
charges_eps_78: Partial charges at a dielectric constant of 78 (in water)
barriers_e2: Reaction barrier heights of E2 reactions
barriers_sn2: Reaction barrier heights of SN2 reactions
barriers_cycloadd: Reaction barrier heights of cycloaddition reactions
barriers_rdb7: Reaction barrier heights in the RDB7 dataset
barriers_rgd1: Reaction barrier heights in the RGD1-CNHO dataset
multi_molecule: UV/Vis peak absorption wavelengths in different solvents
ir: IR Spectra
pcqm4mv2: HOMO-LUMO gaps of the PCQM4Mv2 dataset
uncertainty_ensemble: Uncertainty estimation using an ensemble using the QM9 gap dataset
uncertainty_evidential: Uncertainty estimation using evidential learning using the QM9 gap dataset
uncertainty_mve: Uncertainty estimation using mean-variance estimation using the QM9 gap dataset
timing: Timing benchmark using subsets of QM9 gap

Version: This version of the dataset (Version 2) is compatible with all versions of Chemprop (supporting the respective functionality). Version 1 of this dataset is compatible with all versions except Chemprop v.1.6.1, which cannot process the charges_eps_4 and charges_eps_78 datasets (all other benchmarks work as expected). We therefore recommend always using Version 2 of the dataset (with reformatted charges_eps_4 and charges_eps_78 datasets), since it is compatible with all versions of Chemprop. For use with any other ML software, you can use any version.
Ets Ogb Commerce General Imp Exp | See Full Import/Export Data | Eximpedia
eximpedia.app
Updated Feb 18, 2025
Seair Exim (2025). Ets Ogb Commerce General Imp Exp | See Full Import/Export Data | Eximpedia [Dataset]. https://www.eximpedia.app/
Available download formats: .bin, .xml, .csv, .xls
Dataset provided by
Eximpedia Export Import Trade Data
Eximpedia PTE LTD
Authors
Seair Exim
Area covered
Jamaica, Panama, Saint Barthélemy, India, Djibouti, Jersey, Gabon, Switzerland, Micronesia (Federated States of), Libya
Description
Eximpedia export-import trade data lets you search trade records and active exporters, importers, buyers, suppliers, and manufacturers from over 209 countries.
Vintage Ogb (Name) - Reverse Whois Lookup
whoisdatacenter.com
csv
AllHeart Web Inc, Vintage Ogb (Name) - Reverse Whois Lookup [Dataset]. https://whoisdatacenter.com/name/Vintage-Ogb/
Available download formats: csv
Dataset provided by
AllHeart Web
Authors
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 1, 2025
Description
Investigate historical ownership changes and registration details by initiating a reverse Whois lookup for the name Vintage Ogb.
pjf-podcast-qa-sharegpt
huggingface.co
Updated Mar 15, 2020
ogb (2020). pjf-podcast-qa-sharegpt [Dataset]. https://huggingface.co/datasets/ogbrandt/pjf-podcast-qa-sharegpt
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Authors
ogb
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Used TheBloke/OpenHermes-2-Mistral-7B-GPTQ to convert chunks into QA pairs used for finetuning
Calcium time series from OGB labeled V1 neurons in awake or anesthesized...
figshare.com
bin
Updated Jan 19, 2016
Pieter Goltstein (2016). Calcium time series from OGB labeled V1 neurons in awake or anesthesized mice. [Dataset]. http://doi.org/10.6084/m9.figshare.1287764.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.1287764.v1
Dataset provided by
figshare
Authors
Pieter Goltstein
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Calcium time series from OGB labeled V1 neurons in awake or anesthesized mice. Data published in: Pieter M. Goltstein, Jorrit S. Montijn, Cyriel M.A. Pennartz. (2015). Effects of isoflurane anesthesia on ensemble patterns of Ca2+ activity in mouse V1: Reduced direction selectivity independent of increased correlations in cellular activity. PLOS ONE.
xn--biberciimento-ogb.com - Historical whois Lookup
whoisdatacenter.com
csv
Updated Feb 23, 2018
AllHeart Web Inc (2018). xn--biberciimento-ogb.com - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/xn--biberciimento-ogb.com/
Available download formats: csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jun 17, 2025
Description
Explore the historical Whois records related to xn--biberciimento-ogb.com (Domain). Get insights into ownership history and changes over time.
pjf_llama_instruction_prep
huggingface.co
Updated Mar 15, 2020
ogb (2020). pjf_llama_instruction_prep [Dataset]. https://huggingface.co/datasets/ogbrandt/pjf_llama_instruction_prep
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Authors
ogb
Description
ogbrandt/pjf_llama_instruction_prep dataset hosted on Hugging Face and contributed by the HF Datasets community
Amelia Ad1 Llc Importer and Ogb Engineerding Bv Exporter Data to USA
seair.co.in
Updated Feb 18, 2024
Seair Exim (2024). Amelia Ad1 Llc Importer and Ogb Engineerding Bv Exporter Data to USA [Dataset]. https://www.seair.co.in
Available download formats: .bin, .xml, .csv, .xls
Dataset provided by
Seair Exim Solutions
Authors
Seair Exim
Area covered
United States
Description
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
xn--digitalebrn-ogb.com - Historical whois Lookup
whoisdatacenter.com
csv
Updated Feb 24, 2017
AllHeart Web Inc (2017). xn--digitalebrn-ogb.com - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/xn--digitalebrn-ogb.com/
Available download formats: csv
Dataset provided by
AllHeart Web
Authors
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 11, 2025
Description
Explore the historical Whois records related to xn--digitalebrn-ogb.com (Domain). Get insights into ownership history and changes over time.
nous-pjf
huggingface.co
Updated Mar 15, 2020
ogb (2020). nous-pjf [Dataset]. https://huggingface.co/datasets/ogbrandt/nous-pjf
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Authors
ogb
Description
ogbrandt/nous-pjf dataset hosted on Hugging Face and contributed by the HF Datasets community
Geom3D_data
huggingface.co
shengchao, Geom3D_data [Dataset]. https://huggingface.co/datasets/chao1224/Geom3D_data
Authors
shengchao
Description
Specifications of Dataset Download in Geom3D

We provide both the raw and processed data at this HuggingFace link.

PCQM4Mv2

    mkdir -p pcqm4mv2/raw
    cd pcqm4mv2/raw
    wget http://ogb-data.stanford.edu/data/lsc/pcqm4m-v2-train.sdf.tar.gz
    tar -xf pcqm4m-v2-train.sdf.tar.gz

    wget http://ogb-data.stanford.edu/data/lsc/pcqm4m-v2.zip
    unzip pcqm4m-v2.zip
    mv pcqm4m-v2/raw/data.csv.gz .
    rm pcqm4m-v2.zip
    rm -rf pcqm4m-v2

GEOM

wget… See the full description on the dataset page: https://huggingface.co/datasets/chao1224/Geom3D_data.
gpt4_preference_rlaif
huggingface.co
Updated Feb 18, 2024
ogb (2024). gpt4_preference_rlaif [Dataset]. https://huggingface.co/datasets/ogbrandt/gpt4_preference_rlaif
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Authors
ogb
Description
ogbrandt/gpt4_preference_rlaif dataset hosted on Hugging Face and contributed by the HF Datasets community

OGB-LSC Dataset: 454 scholarly articles cite this dataset.
