Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Python in a nutshell : a desktop quick reference. It features 7 columns including author, publication date, language, and book publisher.
This is a dataset for classifying citation intents in academic papers. The main citation intent label for each JSON object is specified with the label key, while the citation context is specified with the context key. Example:
{
  'string': 'In chacma baboons, male-infant relationships can be linked to both
             formation of friendships and paternity success [30,31].',
  'sectionName': 'Introduction',
  'label': 'background',
  'citingPaperId': '7a6b2d4b405439',
  'citedPaperId': '9d1abadc55b5e0',
  ...
}
You may obtain the full information about the paper using the provided paper ids with the Semantic Scholar API (https://api.semanticscholar.org/).
The labels are: Method, Background, Result
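Example of looking up one of these papers (a minimal sketch, not part of the dataset docs; the Graph API endpoint and field list are assumptions based on the public Semantic Scholar API documentation, and `example` stands for a parsed dataset record):
import requests

def fetch_paper(paper_id, fields="title,year,authors"):
    # Query the Semantic Scholar Graph API for basic metadata about a paper id.
    url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}"
    resp = requests.get(url, params={"fields": fields}, timeout=30)
    resp.raise_for_status()
    return resp.json()

# e.g. fetch_paper(example["citedPaperId"])["title"]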
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('scicite', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
This is a sample project highlighting some basic methodologies in working with the DataCite public data file and Data Citation Corpus on Redivis.
Using the transform interface, we extract all records associated with DOIs for Stanford datasets on Redivis. We then make a simple plot using a Python notebook to see DOI issuance over time. The nested nature of some of the public data file fields makes exploration a bit challenging; future work could break this dataset into multiple related tables for easier analysis.
We can also join with the Data Citation Corpus to find all citations referencing Stanford-on-Redivis DOIs (the citation corpus is a work in progress, and doesn't currently capture many of the citations in the literature).
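As an illustration of the plotting step, here is a minimal sketch that assumes the extracted records have already been pulled into a pandas DataFrame with a DOI creation date column (the column name and sample values below are hypothetical, not the actual DataCite schema):
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical records for Stanford-on-Redivis DOIs; only the creation date matters here.
dois = pd.DataFrame({"created": pd.to_datetime(["2021-03-01", "2021-07-15", "2022-02-10"])})

# Count DOIs issued per year and plot the trend.
per_year = dois["created"].dt.year.value_counts().sort_index()
per_year.plot(kind="bar", xlabel="Year", ylabel="DOIs issued", title="DOI issuance over time")
plt.tight_layout()
plt.show()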
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These files accompany an article published in the Law and Courts Newsletter.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 1 row and is filtered where the book is Python pocket reference. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 1 row and is filtered where the book is Python : the complete reference. It features 4 columns including authors, books, and publication dates.
The MNIST database of handwritten digits.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset represents structured metadata and contextual information about references added to Wikipedia articles, in a JSON format. Each record represents an individual Wikipedia article revision with all the tags parsed, as stored in Wikipedia's XML dumps, including information about: 1) the context(s) in which the reference occurs within the article, such as the surrounding text, parent section title, and section level; 2) structured data and bibliographic metadata included within the reference itself (such as any citation template used, external links, and any known persistent identifiers); and 3) additional data/metadata about the reference itself (the reference name, its raw content, and, if applicable, the revision ID associated with the reference addition/deletion/change). The data is available as a set of compressed JSON files, extracted from the July 1, 2017 XML dump of English Wikipedia. Other languages may be added to this dataset in the future. The JSON schema and Python parsing libraries used to generate the data are in the references.
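A rough sketch for inspecting one of the compressed JSON files (assuming one JSON record per line; the filename below is hypothetical and the printed keys are whatever the actual schema defines):
import gzip
import json

# Stream a few records from one compressed dump file (path is illustrative).
with gzip.open("enwiki-20170701-references-part1.json.gz", "rt", encoding="utf-8") as fh:
    for i, line in enumerate(fh):
        record = json.loads(line)
        print(sorted(record.keys()))  # list the fields available on each record
        if i >= 2:
            break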
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The NewsMediaBias-Plus dataset is designed for the analysis of media bias and disinformation by combining textual and visual data from news articles. It aims to support research in detecting, categorizing, and understanding biased reporting in media outlets.
NewsMediaBias-Plus pairs news articles with relevant images and annotations indicating perceived biases and the reliability of the content. It adds a multimodal dimension for bias detection in news media.
unique_id: Unique identifier for each news item. Each unique_id matches an image for the same article.
outlet: The publisher of the article.
headline: The headline of the article.
article_text: The full content of the news article.
image_description: Description of the paired image.
image: The file path of the associated image.
date_published: The date the article was published.
source_url: The original URL of the article.
canonical_link: The canonical URL of the article.
new_categories: Categories assigned to the article.
news_categories_confidence_scores: Confidence scores for each category.
text_label: Indicates the likelihood of the article being disinformation:
  Likely: Likely to be disinformation.
  Unlikely: Unlikely to be disinformation.
multimodal_label: Indicates the likelihood of disinformation from the combination of the text snippet and image content:
  Likely: Likely to be disinformation.
  Unlikely: Unlikely to be disinformation.
Load the dataset into Python:
from datasets import load_dataset
ds = load_dataset("vector-institute/newsmediabias-plus")
print(ds) # View structure and splits
print(ds['train'][0]) # Access the first record of the train split
print(ds['train'][:5]) # Access the first five records
from datasets import load_dataset
# Load the dataset in streaming mode
streamed_dataset = load_dataset("vector-institute/newsmediabias-plus", streaming=True)
# Get an iterable dataset
dataset_iterable = streamed_dataset['train'].take(5)
# Print the records
for record in dataset_iterable:
  print(record)
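Building on the fields above, a small follow-on sketch filters the streamed split down to articles annotated as likely disinformation:
from datasets import load_dataset

streamed_dataset = load_dataset("vector-institute/newsmediabias-plus", streaming=True)

# Keep only records whose text-level annotation marks them as likely disinformation.
likely = streamed_dataset["train"].filter(lambda record: record["text_label"] == "Likely")

for record in likely.take(3):
    print(record["headline"], "->", record["text_label"])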
Contributions are welcome! To contribute, fork the repository and create a pull request with your changes.
This dataset is released under a non-commercial license. See the LICENSE file for more details.
Please cite the dataset using this BibTeX entry:
@misc{vector_institute_2024_newsmediabias_plus,
title={NewsMediaBias-Plus: A Multimodal Dataset for Analyzing Media Bias},
author={Vector Institute Research Team},
year={2024},
url={https://huggingface.co/datasets/vector-institute/newsmediabias-plus}
}
For questions or support, contact Shaina Raza at: shaina.raza@vectorinstitute.ai
Disclaimer: The labels Likely and Unlikely are based on LLM annotations and expert assessments, intended for informational use only. They should not be considered final judgments.
Guidance: This dataset is for research purposes. Cross-reference findings with other reliable sources before drawing conclusions. The dataset aims to encourage critical thinking, not provide definitive classifications.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Machine learning pocket reference : working with structured data in Python. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This dataset contains two hyperspectral images and one multispectral image for anomaly detection, along with their corresponding binary pixel masks. They were initially used for real-time anomaly detection in line-scanning, but they can be used for any anomaly detection task.
They are in .npy file format (TIFF or GeoTIFF variants will be added in the future), with the image arrays stored in (height, width, channels) order. The SNP dataset was collected using sentinelhub, and the Synthetic dataset was collected from AVIRIS. The Python code used to analyse these datasets can be found at: https://github.com/WiseGamgee/HyperAD
How to Get Started
All that is needed to load these datasets is Python (preferably 3.8+) and the NumPy package. Example code for loading the beach dataset, assuming it is placed in a folder called "data" alongside the Python script:
import numpy as np
hsi_array = np.load("data/beach_hsi.npy")
n_pixels, n_lines, n_bands = hsi_array.shape
print(f"This dataset has {n_pixels} pixels, {n_lines} lines, and {n_bands} bands.")

mask_array = np.load("data/beach_mask.npy")
m_pixels, m_lines = mask_array.shape
print(f"The corresponding anomaly mask is {m_pixels} pixels by {m_lines} lines.")
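A quick visual sanity check (a sketch, assuming matplotlib is installed and reusing the arrays loaded above) is to display one spectral band next to the anomaly mask:
import matplotlib.pyplot as plt

fig, (ax_img, ax_mask) = plt.subplots(1, 2, figsize=(10, 4))
ax_img.imshow(hsi_array[:, :, 0], cmap="gray")   # first spectral band of the beach image
ax_img.set_title("Beach HSI (band 0)")
ax_mask.imshow(mask_array, cmap="gray")          # binary anomaly mask
ax_mask.set_title("Anomaly mask")
plt.show()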
Citing the Datasets
If you use any of these datasets, please cite the following paper:
@article{garske2024erx,
  title={ERX - a Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line-Scanning},
  author={Garske, Samuel and Evans, Bradley and Artlett, Christopher and Wong, KC},
  journal={arXiv preprint arXiv:2408.14947},
  year={2024}
}
If you use the beach dataset please cite the following paper as well (original source):
@article{mao2022openhsi,
  title={OpenHSI: A complete open-source hyperspectral imaging solution for everyone},
  author={Mao, Yiwei and Betters, Christopher H and Evans, Bradley and Artlett, Christopher P and Leon-Saval, Sergio G and Garske, Samuel and Cairns, Iver H and Cocks, Terry and Winter, Robert and Dell, Timothy},
  journal={Remote Sensing},
  volume={14},
  number={9},
  pages={2244},
  year={2022},
  publisher={MDPI}
}
Empirical line methods (ELM) are frequently used to correct images from aerial remote sensing. Remote sensing of aquatic environments captures only a small amount of energy because the water absorbs much of it, so the water's signal response is proportionally small compared to other land-surface targets. This dataset presents resources and results of a new approach to calibrating empirical lines that combines reference calibration panels with water samples. We optimized the method using Python algorithms until it reached the best result.
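For readers unfamiliar with the technique, the sketch below shows the generic empirical line idea for a single band (a linear fit from measured radiance of reference targets to their known reflectance); it is an illustration with made-up values, not the authors' exact calibration procedure:
import numpy as np

# Known surface reflectance of the reference targets (calibration panels plus a
# water sample) and the radiance the sensor measured for them. Values are illustrative.
known_reflectance = np.array([0.05, 0.20, 0.50, 0.02])
measured_radiance = np.array([12.0, 35.0, 80.0, 8.5])

# Per-band empirical line: fit gain/offset, then convert a whole radiance band.
gain, offset = np.polyfit(measured_radiance, known_reflectance, deg=1)
radiance_band = np.random.uniform(5, 90, size=(100, 100))
reflectance_band = gain * radiance_band + offset
print(f"gain={gain:.4f}, offset={offset:.4f}")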
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cifar10', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the metadata of the datasets published in 85 Dataverse installations and information about each installation's metadata blocks. It also includes the lists of pre-defined licenses or terms of use that dataset depositors can apply to the datasets they publish in the 58 installations that were running versions of the Dataverse software that include that feature. The data is useful for reporting on the quality of dataset and file-level metadata within and across Dataverse installations and for improving understanding of how certain Dataverse features and metadata fields are used. Curators and other researchers can use this dataset to explore how well Dataverse software and the repositories using the software help depositors describe data.
How the metadata was downloaded
The dataset metadata and metadata block JSON files were downloaded from each installation between August 22 and August 28, 2023 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one column named "hostname" listing each installation URL in which I was able to create an account and another column named "apikey" listing my accounts' API tokens. The Python script expects this CSV file and uses the listed API tokens to get metadata and other information from installations that require API tokens.
How the files are organized
├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation)_2023.08.22-2023.08.28.csv
│   ├── contributor(citation)_2023.08.22-2023.08.28.csv
│   ├── data_source(citation)_2023.08.22-2023.08.28.csv
│   ├── ...
│   └── topic_classification(citation)_2023.08.22-2023.08.28.csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2023.08.27_12.59.59.zip
│   │   ├── dataset_pids_Abacus_2023.08.27_12.59.59.csv
│   │   ├── Dataverse_JSON_metadata_2023.08.27_12.59.59
│   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
│   │   │   └── ...
│   │   └── metadatablocks_v5.6
│   │       ├── astrophysics_v5.6.json
│   │       ├── biomedical_v5.6.json
│   │       ├── citation_v5.6.json
│   │       ├── ...
│   │       └── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2023.08.26_22.14.04.zip
│   ├── ADA_Dataverse_2023.08.27_13.16.20.zip
│   ├── Arca_Dados_2023.08.27_13.34.09.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2023.08.27_19.24.15.zip
├── dataverse_installations_summary_2023.08.28.csv
├── dataset_pids_from_most_known_dataverse_installations_2023.08.csv
├── license_options_for_each_dataverse_installation_2023.09.05.csv
└── metadatablocks_from_most_known_dataverse_installations_2023.09.05.csv
This dataset contains two directories and four CSV files not in a directory. One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the citation metadata block and geospatial metadata block of datasets in the 85 Dataverse installations. For example, author(citation)_2023.08.22-2023.08.28.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 85 installations, where there's a row for author names, affiliations, identifier types and identifiers.
The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 85 zipped files, one for each of the 85 Dataverse installations whose dataset metadata I was able to download. Each zip file contains a CSV file and two sub-directories: The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation as well as a column to indicate if the Python script was able to download the Dataverse JSON metadata for each dataset. It also includes the alias/identifier and category of the Dataverse collection that the dataset is in. One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema. The Dataverse JSON export of the latest version of each dataset includes "(latest_version)" in the file name. This should help those who are interested in the metadata of only the latest version of each dataset. The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I included them so that they can be used when extracting metadata from the dataset's Dataverse JSON exports. The dataverse_installations_summary_2023.08.28.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata...
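As a starting point for analysis, one of the CSV files in the first directory can be loaded directly (a sketch; the columns each file contains are not documented here, so inspect them after loading):
import pandas as pd

# Load the "Author" citation metadata exported from the 85 installations.
authors = pd.read_csv(
    "csv_files_with_metadata_from_most_known_dataverse_installations/"
    "author(citation)_2023.08.22-2023.08.28.csv"
)
print(authors.shape)
print(authors.columns.tolist())  # inspect which author fields are present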
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
4,914 perovskite oxides containing composition data, lattice constants, and formation + vacancy formation energies. All perovskites are of the form ABO3. Adapted from a dataset presented by Emery and Wolverton. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset described in: Emery, A. A. & Wolverton, C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci. Data 4:170153, doi: 10.1038/sdata.2017.153 (2017).
Data sourced from: Emery, A. A., & Wolverton, C. Figshare, http://dx.doi.org/10.6084/m9.figshare.5334142 (2017).
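A hedged sketch of the recommended matminer access path (the exact dataset key is not stated above, so the snippet first lists the available names rather than guessing one):
from matminer.datasets import get_available_datasets, load_dataset

# Print the dataset keys matminer ships with, then load the perovskite entry by its key.
print(get_available_datasets())
# df = load_dataset("<perovskite_dataset_key>")  # replace with the key found in the listing
# print(df.head())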
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data for Figure Atlas.2 from Atlas of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6).
Figure Atlas.2 shows WGI reference regions used in the (a) AR5 and (b) AR6 reports.
How to cite this dataset
When citing this dataset, please include both the data citation below (under 'Citable as') and the following citations: For the report component from which the figure originates: Gutiérrez, J.M., R.G. Jones, G.T. Narisma, L.M. Alves, M. Amjad, I.V. Gorodetskaya, M. Grose, N.A.B. Klutse, S. Krakovska, J. Li, D. Martínez-Castro, L.O. Mearns, S.H. Mernild, T. Ngo-Duc, B. van den Hurk, and J.-H. Yoon, 2021: Atlas. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 1927–2058, doi:10.1017/9781009157896.021
Iturbide, M. et al., 2021: Repository supporting the implementation of FAIR principles in the IPCC-WG1 Interactive Atlas. Zenodo. Retrieved from: http://doi.org/10.5281/zenodo.5171760
Figure subpanels
The figure has two panels, with data provided for both panels in the master GitHub repository linked in the documentation.
Data provided in relation to figure
This dataset contains the corner coordinates defining each reference region for the second panel of the figure, which contains coordinate information at a 0.44º resolution. The repository directory 'reference-regions' contains the reference regions as polygons in different formats (CSV with coordinates, R data, shapefile and geojson), together with R and Python notebooks illustrating the use of these regions with worked examples.
Data for reference regions for AR5 can be found here: https://catalogue.ceda.ac.uk/uuid/a3b6d7f93e5c4ea986f3622eeee2b96f
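As an illustration of using the polygon formats mentioned above (a sketch; the geojson filename inside the 'reference-regions' directory is assumed, not verified):
import geopandas as gpd

# Read the AR6 reference-region polygons and plot them (filename is illustrative).
regions = gpd.read_file("reference-regions/IPCC-WGI-reference-regions-v4.geojson")
print(regions.head())  # region names/acronyms and their geometries
regions.plot(edgecolor="black", figsize=(10, 5))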
CMIP5 and CMIP6 are the fifth and sixth phases of the Coupled Model Intercomparison Project. CORDEX is the Coordinated Regional Downscaling Experiment from the WCRP. AR5 and AR6 refer to the 5th and 6th Assessment Reports of the IPCC. WGI stands for Working Group I.
Notes on reproducing the figure from the provided data
Data and figures produced by the Jupyter notebooks live inside the notebooks directory. The notebooks describe step by step the basic process followed to generate some key figures of the AR6 WGI Atlas and some products underpinning the Interactive Atlas, such as reference regions, global warming levels, and aggregated datasets. They include comments and hints to extend the analysis, thus promoting reusability of the results. These notebooks are provided as guidance for practitioners and are more user-friendly than the code provided as scripts in the reproducibility folder.
Some of the notebooks require access to large data volumes outside this repository. To speed up execution, in addition to the full data-access code, we provide a data-loading shortcut that uses intermediate results stored in the auxiliary-material folder of this repository. To test other parameter settings, the full data-access instructions should be followed, which can involve long waiting times.
Sources of additional information
The following weblinks are provided in the Related Documents section of this catalogue record:
- Link to the figure on the IPCC AR6 website
- Link to the report component containing the figure (Atlas)
- Link to the Supplementary Material for Atlas, which contains details on the input data used in Table Atlas.SM.15
- Link to the code for the figure, archived on Zenodo
- Link to the notebooks needed for reproducing the figure, from GitHub
- Link to the IPCC AR5 reference regions dataset
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Metallic glass formation data for binary alloys, collected from various experimental techniques such as melt-spinning or mechanical alloying. This dataset covers all compositions with an interval of 5 at.% in 59 binary systems, containing a total of 5959 alloys. The target property of this dataset is the glass forming ability (GFA), i.e. whether the composition can form monolithic glass or not, which is either 1 for glass forming or 0 for non-full glass forming.
The V2 versions of this dataset have been cleaned to remove duplicate data points. Any entries with identical formula and both negative and positive GFA classes were combined into a single entry with a positive GFA class.
Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in: Machine Learning Approach for Prediction and Understanding of Glass-Forming Ability. Y. T. Sun†§, H. Y. Bai†§, M. Z. Li*‡, and W. H. Wang*†§. † Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China; ‡ Department of Physics, Beijing Key Laboratory of Optoelectronic Functional Materials & Micro-nano Devices, Renmin University of China, Beijing 100872, People’s Republic of China; § University of Chinese Academy of Science, Beijing 100049, People’s Republic of China. J. Phys. Chem. Lett., 2017, 8 (14), pp 3434–3439. DOI: 10.1021/acs.jpclett.7b01046. Publication Date (Web): July 11, 2017.
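A minimal matminer access sketch (the dataset key "glass_binary_v2" and the GFA column name are assumptions matching the V2 description above; verify them against matminer's dataset listing):
from matminer.datasets import load_dataset

# Load the cleaned V2 binary metallic glass dataset (key assumed to be "glass_binary_v2").
df = load_dataset("glass_binary_v2")
print(df.head())
print(df["gfa"].value_counts())  # glass-forming (1) vs not (0); column name assumed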
This dataset features both data and code related to the research article titled "Rayleigh Invariance Enables Estimation of Effective CO2 Fluxes Resulting from Convective Dissolution in Water-Filled Fractures." It includes raw data packaged in tarball format, including Python scripts used to derive the results presented in the publication. High-resolution raw data for contour plots is available upon request.
1. Download the Dataset: Download the dataset file using Access Dataset. Ensure you have sufficient disk space available for storing and processing the dataset.
2. Extract the Dataset: Once the dataset file is downloaded, extract its contents. The dataset is compressed in tar.xz format; use appropriate tools to extract it. For example, on Linux you can use the following commands:
tar -xf Publication_CCS.tar.xz
tar -xf Publication_Karst.tar.xz
tar -xf Validation_Sim.tar.xz
This will create directories containing the dataset files.
3. Install Required Python Packages: Before running any code, ensure you have the necessary Python (version 3.10 tested) packages installed. The required packages and their versions are listed in the requirements.txt file. You can install them using pip:
pip install -r requirements.txt
4. Run the Post-Processing Script: After extracting the dataset and installing the required Python packages, you can run the provided post-processing script (post_process.py), which is designed to replicate all the plots from the publication based on the dataset. Execute the script using Python:
python3 post_process.py
This script will generate the plots and output them to the specified directory.
5. Explore and Analyze: Once the script has completed running, you can explore the generated plots to gain insights from the dataset. Feel free to modify the script or use the dataset in your own analysis and experiments. High-resolution data, such as the vtu files for contour plots, is available upon request; please feel free to reach out if needed.
6. Small Grid Study: There is a tarball for the data that was generated to study the grid used in the related publication:
tar -xf Publication_CCS.tar.xz
If you unpack the tarball and have the requirements from above installed, you can use the Python script to generate the plots.
7. Citation: If you use this dataset in your research or publication, please cite the original source appropriately to give credit to the authors and contributors.
MultiPL-T Python Sources
Citation
If you use this dataset, we request that you cite our work:
@misc{cassano:multipl-t,
  title={Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs},
  author={Federico Cassano and John Gouwar and Francesca Lucchetti and Claire Schlesinger and Anders Freeman and Carolyn Jane Anderson and Molly Q Feldman and Michael Greenberg and Abhinav Jangda and Arjun Guha},
  year={2024}…
See the full description on the dataset page: https://huggingface.co/datasets/nuprl/stack-dedup-python-testgen-starcoder-filter-v2-dedup.
This dataset contains additional data for the publication "A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry". Its goal is to enable interested people to reproduce the citation analysis carried out in the aforementioned publication.
Prerequisites
The following software versions were used for the Python version of this dataset:
Python: 3.8.6
Scholarly: 1.2.0
Pyzotero: 1.4.24
Numpy: 1.20.1
Contents
results/: Contains the .csv files that were the results of the citation analysis. Paper groupings follow the ones outlined in the publication.
scripts/: Contains scripts to perform the citation analysis.
Zotero.cached.pkl: Contains the cached Zotero library.
Usage
In order to reproduce the results of the citation analysis, you can use citation_analysis.py in conjunction with the cached Zotero library. Manual additions can be verified using the check_consistency script. Please note that you will need a Tor key for the citation analysis, and access to our Zotero library if you don't want to use the cached version. If you need this access, simply contact us.
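To get a quick overview of the result files (a sketch; the individual CSV filenames and columns are not listed here):
import glob
import pandas as pd

# Load every results CSV from the citation analysis and report its shape.
for path in glob.glob("results/*.csv"):
    df = pd.read_csv(path)
    print(path, df.shape)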