96 datasets found
  1. Dataset of books called Python in a nutshell : a desktop quick reference

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Python in a nutshell : a desktop quick reference [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Python+in+a+nutshell+%3A+a+desktop+quick+reference
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Python in a nutshell : a desktop quick reference. It features 7 columns including author, publication date, language, and book publisher.

  2. scicite

    • tensorflow.org
    • opendatalab.com
    • +1 more
    Updated Dec 23, 2022
    Cite
    (2022). scicite [Dataset]. https://www.tensorflow.org/datasets/catalog/scicite
    Explore at:
    Dataset updated
    Dec 23, 2022
    Description

    This is a dataset for classifying citation intents in academic papers. The main citation intent label for each JSON object is specified with the label key, while the citation context is specified with the context key. Example:

    {
     'string': 'In chacma baboons, male-infant relationships can be linked to both
      formation of friendships and paternity success [30,31].',
     'sectionName': 'Introduction',
     'label': 'background',
     'citingPaperId': '7a6b2d4b405439',
     'citedPaperId': '9d1abadc55b5e0',
     ...
     }
    

    You may obtain the full information about the paper using the provided paper ids with the Semantic Scholar API (https://api.semanticscholar.org/).
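
    For example, a minimal lookup sketch using the requests package, assuming the current Semantic Scholar Graph API endpoint (the paper ID below is just the truncated example from above, so substitute a full ID from the dataset):

    import requests

    # Look up a paper's metadata by its Semantic Scholar paper ID.
    # The ID here is illustrative only; use a full ID from the dataset.
    paper_id = "7a6b2d4b405439"
    url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}"
    response = requests.get(url, params={"fields": "title,year,abstract"})
    response.raise_for_status()
    print(response.json())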

    The labels are: Method, Background, Result

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('scicite', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

  3. DataCite public data exploration

    • redivis.com
    Updated Apr 29, 2025
    Cite
    Ian Mathews (2025). DataCite public data exploration [Dataset]. https://redivis.com/workflows/hx1e-a6w8vmwsx
    Explore at:
    Dataset updated
    Apr 29, 2025
    Dataset provided by
    Redivis Inc.
    Authors
    Ian Mathews
    Description

    This is a sample project highlighting some basic methodologies in working with the DataCite public data file and Data Citation Corpus on Redivis.

    Using the transform interface, we extract all records associated with DOIs for Stanford datasets on Redivis. We then make a simple plot using a python notebook to see DOI issuance over time. The nested nature of some of the public data file fields makes exploration a bit challenging; future work could break this dataset into multiple related tables for easier analysis.
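
    As a rough illustration of the plotting step (not the actual notebook), the sketch below counts DOIs per registration year with pandas and matplotlib; the CSV file name and the "created" timestamp column are hypothetical placeholders for whatever the transform exports:

    import pandas as pd
    import matplotlib.pyplot as plt

    # Hypothetical export of the extracted DataCite records; adjust the file
    # name and the date column to match the actual table schema.
    records = pd.read_csv("stanford_redivis_dois.csv", parse_dates=["created"])

    dois_per_year = records["created"].dt.year.value_counts().sort_index()
    dois_per_year.plot(kind="bar")
    plt.xlabel("Year")
    plt.ylabel("DOIs issued")
    plt.title("DOI issuance over time")
    plt.tight_layout()
    plt.show()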

    We can also join with the Data Citation Corpus to find all citations referencing Stanford-on-Redivis DOIs (the citation corpus is a work in progress, and doesn't currently capture many of the citations in the literature).

  4. How to Extract Legal Citations using Python (for the complete beginner)

    • dataverse.harvard.edu
    Updated Jun 8, 2022
    Cite
    Rachael K. Hinkle (2022). How to Extract Legal Citations using Python (for the complete beginner) [Dataset]. http://doi.org/10.7910/DVN/9LNCLF
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Jun 8, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Rachael K. Hinkle
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    These files accompany an article published in the Law and Courts Newsletter.

  5. Dataset of book subjects that contain Python pocket reference

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of book subjects that contain Python pocket reference [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Python+pocket+reference&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 1 row and is filtered where the book is Python pocket reference. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  6. Dataset of authors, books and publication dates of book subjects where books...

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of authors, books and publication dates of book subjects where books equals Python : the complete reference [Dataset]. https://www.workwithdata.com/datasets/book-subjects?col=book_subject%2Cj0-author%2Cj0-book%2Cj0-publication_date&f=1&fcol0=j0-book&fop0=%3D&fval0=Python+%3A+the+complete+reference&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 1 row and is filtered where the book is Python : the complete reference. It features 4 columns: book subject, authors, books, and publication dates.

  7. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  8. Citations with contexts in Wikipedia

    • figshare.com
    html
    Updated May 30, 2023
    Cite
    Aaron Halfaker; Meen Chul Kim; Andrea Forte; Dario Taraborelli (2023). Citations with contexts in Wikipedia [Dataset]. http://doi.org/10.6084/m9.figshare.5588842.v1
    Explore at:
    Available download formats: html
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aaron Halfaker; Meen Chul Kim; Andrea Forte; Dario Taraborelli
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset represents structured metadata and contextual information about references added to Wikipedia articles in a JSON format. Each record represents an individual Wikipedia article revision with all the tags parsed, as stored in Wikipedia's XML dumps, including information about:

    1. the context(s) in which the reference occurs within the article, such as the surrounding text, parent section title, and section level;
    2. structured data and bibliographic metadata included within the reference itself (such as: any citation template used, external links, any known persistent identifiers);
    3. additional data/metadata about the reference itself (the reference name, its raw content, and, if applicable, the revision ID associated with reference addition/deletion/change).

    The data is available as a set of compressed JSON files, extracted from the July 1, 2017 XML dump of English Wikipedia. Other languages may be added to this dataset in the future. The JSON schema and Python parsing libraries used to generate the data are in the references.
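
    A minimal parsing sketch, assuming the dump is a set of bzip2-compressed JSON Lines files (one revision record per line) and that records expose page and reference fields; check the actual archive layout and schema before relying on it:

    import bz2
    import json

    # File name, compression, and key names are assumptions about the dump format.
    path = "enwiki-20170701-citations.json.bz2"

    with bz2.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            revision = json.loads(line)
            # Each record describes one article revision and its parsed references.
            print(revision.get("page_title"), len(revision.get("references", [])))
            break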

  9. NewsMediaBias-Plus Dataset

    • zenodo.org
    • huggingface.co
    bin, zip
    Updated Nov 29, 2024
    Cite
    Shaina Raza; Shaina Raza (2024). NewsMediaBias-Plus Dataset [Dataset]. http://doi.org/10.5281/zenodo.13961155
    Explore at:
    Available download formats: bin, zip
    Dataset updated
    Nov 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shaina Raza; Shaina Raza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NewsMediaBias-Plus Dataset

    Overview

    The NewsMediaBias-Plus dataset is designed for the analysis of media bias and disinformation by combining textual and visual data from news articles. It aims to support research in detecting, categorizing, and understanding biased reporting in media outlets.

    Dataset Description

    NewsMediaBias-Plus pairs news articles with relevant images and annotations indicating perceived biases and the reliability of the content. It adds a multimodal dimension for bias detection in news media.

    Contents

    • unique_id: Unique identifier for each news item. Each unique_id matches an image for the same article.
    • outlet: The publisher of the article.
    • headline: The headline of the article.
    • article_text: The full content of the news article.
    • image_description: Description of the paired image.
    • image: The file path of the associated image.
    • date_published: The date the article was published.
    • source_url: The original URL of the article.
    • canonical_link: The canonical URL of the article.
    • new_categories: Categories assigned to the article.
    • news_categories_confidence_scores: Confidence scores for each category.

    Annotation Labels

    • text_label: Indicates the likelihood of the article being disinformation:

      • Likely: Likely to be disinformation.
      • Unlikely: Unlikely to be disinformation.
    • multimodal_label: Indicates the likelihood of disinformation from the combination of the text snippet and image content:

      • Likely: Likely to be disinformation.
      • Unlikely: Unlikely to be disinformation.

    Getting Started

    Prerequisites

    • Python 3.6+
    • Pandas
    • Hugging Face Datasets
    • Hugging Face Hub

    Installation

    Load the dataset into Python:

    from datasets import load_dataset

    ds = load_dataset("vector-institute/newsmediabias-plus")
    print(ds)               # View structure and splits
    print(ds['train'][0])   # Access the first record of the train split
    print(ds['train'][:5])  # Access the first five records

    Load a Few Records

    from datasets import load_dataset

    # Load the dataset in streaming mode
    streamed_dataset = load_dataset("vector-institute/newsmediabias-plus", streaming=True)

    # Get an iterable dataset
    dataset_iterable = streamed_dataset['train'].take(5)

    # Print the records
    for record in dataset_iterable:
        print(record)
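
    Building on the loading code above, a small sketch that keeps only articles whose text annotation marks them as likely disinformation (field names follow the Contents and Annotation Labels sections; the "train" split is assumed as in the examples above):

    from datasets import load_dataset

    ds = load_dataset("vector-institute/newsmediabias-plus", split="train")

    # Keep records labelled as likely disinformation on the text annotation.
    likely = ds.filter(lambda record: record["text_label"] == "Likely")
    print(f"{len(likely)} of {len(ds)} articles are labelled 'Likely'")
    print(likely[0]["outlet"], "-", likely[0]["headline"])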

    Contributions

    Contributions are welcome! You can:

    • Add Data: Contribute more data points.
    • Refine Annotations: Improve annotation accuracy.
    • Share Usage Examples: Help others use the dataset effectively.

    To contribute, fork the repository and create a pull request with your changes.

    License

    This dataset is released under a non-commercial license. See the LICENSE file for more details.

    Citation

    Please cite the dataset using this BibTeX entry:

    @misc{vector_institute_2024_newsmediabias_plus,
      title={NewsMediaBias-Plus: A Multimodal Dataset for Analyzing Media Bias},
      author={Vector Institute Research Team},
      year={2024},
      url={https://huggingface.co/datasets/vector-institute/newsmediabias-plus}
    }

    Contact

    For questions or support, contact Shaina Raza at: shaina.raza@vectorinstitute.ai

    Disclaimer and User Guidance

    Disclaimer: The labels Likely and Unlikely are based on LLM annotations and expert assessments, intended for informational use only. They should not be considered final judgments.

    Guidance: This dataset is for research purposes. Cross-reference findings with other reliable sources before drawing conclusions. The dataset aims to encourage critical thinking, not provide definitive classifications.

  10. Dataset of books called Machine learning pocket reference : working with...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Machine learning pocket reference : working with structured data in Python [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Machine+learning+pocket+reference+%3A+working+with+structured+data+in+Python
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Machine learning pocket reference : working with structured data in Python. It features 7 columns including author, publication date, language, and book publisher.

  11. Three Annotated Anomaly Detection Datasets for Line-Scan Algorithms

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 29, 2024
    Cite
    Garske, Samuel (2024). Three Annotated Anomaly Detection Datasets for Line-Scan Algorithms [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13370799
    Explore at:
    Dataset updated
    Aug 29, 2024
    Dataset provided by
    Mao, Yiwei
    Garske, Samuel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary

    This dataset contains two hyperspectral images and one multispectral image for anomaly detection, together with their corresponding binary pixel masks. They were initially used for real-time anomaly detection in line-scanning, but they can be used for any anomaly detection task.

    They are in .npy file format (tiff or geotiff variants will be added in the future), with the image arrays ordered as (height, width, channels). The SNP dataset was collected using sentinelhub, and the Synthetic dataset was collected from AVIRIS. The Python code used to analyse these datasets can be found at: https://github.com/WiseGamgee/HyperAD

    How to Get Started

    All that is needed to load these datasets is Python (preferably 3.8+) and the NumPy package. Example code for loading the beach dataset, assuming it is placed in a folder called "data" alongside the Python script:

    import numpy as np

    # Load image file
    hsi_array = np.load("data/beach_hsi.npy")
    n_pixels, n_lines, n_bands = hsi_array.shape
    print(f"This dataset has {n_pixels} pixels, {n_lines} lines, and {n_bands} bands.")

    # Load image mask
    mask_array = np.load("data/beach_mask.npy")
    m_pixels, m_lines = mask_array.shape
    print(f"The corresponding anomaly mask is {m_pixels} pixels by {m_lines} lines.")

    Citing the Datasets

    If you use any of these datasets, please cite the following paper:

    @article{garske2024erx,
      title={ERX - a Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line-Scanning},
      author={Garske, Samuel and Evans, Bradley and Artlett, Christopher and Wong, KC},
      journal={arXiv preprint arXiv:2408.14947},
      year={2024}
    }

    If you use the beach dataset please cite the following paper as well (original source):

    @article{mao2022openhsi,
      title={OpenHSI: A complete open-source hyperspectral imaging solution for everyone},
      author={Mao, Yiwei and Betters, Christopher H and Evans, Bradley and Artlett, Christopher P and Leon-Saval, Sergio G and Garske, Samuel and Cairns, Iver H and Cocks, Terry and Winter, Robert and Dell, Timothy},
      journal={Remote Sensing},
      volume={14},
      number={9},
      pages={2244},
      year={2022},
      publisher={MDPI}
    }

  12. Python algorithms and dataset of empirical line method applied to inland...

    • ieee-dataport.org
    Updated Mar 9, 2019
    Cite
    Alisson Fernando Carmo (2019). Python algorithms and dataset of empirical line method applied to inland water hyperspectral images combining reference targets and in situ water measurements [Dataset]. https://ieee-dataport.org/open-access/python-algorithms-and-dataset-empirical-line-method-applied-inland-water-hyperspectral
    Explore at:
    Dataset updated
    Mar 9, 2019
    Authors
    Alisson Fernando Carmo
    Description

    Empirical line methods (ELM) are frequently used to correct images from aerial remote sensing. Remote sensing of aquatic environments captures only a small amount of energy because the water absorbs much of it, so the signal from water is proportionally smaller than that of other land-surface targets. This dataset presents some resources and results of a new approach to calibrating empirical lines that combines reference calibration panels with water samples. We optimize the method using Python algorithms until it reaches the best result.
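
    For readers unfamiliar with the technique, the sketch below is a generic empirical-line calibration (not the authors' code): per band, a linear fit maps the at-sensor radiance of reference targets to their known reflectance, and the fitted line is then applied to every image pixel. All numbers are made-up placeholders.

    import numpy as np

    # Made-up example values for a single band: at-sensor radiance and known
    # reflectance of the reference targets (real data has one pair per band).
    target_radiance = np.array([0.12, 0.35, 0.58, 0.81])
    target_reflectance = np.array([0.05, 0.20, 0.40, 0.60])

    # Fit the empirical line: reflectance = gain * radiance + offset
    gain, offset = np.polyfit(target_radiance, target_reflectance, 1)

    # Apply the calibration to a (hypothetical) band of image radiance values
    band_radiance = np.array([[0.20, 0.44], [0.63, 0.75]])
    band_reflectance = gain * band_radiance + offset
    print(band_reflectance)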

  13. cifar10

    • tensorflow.org
    • opendatalab.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). cifar10 [Dataset]. https://www.tensorflow.org/datasets/catalog/cifar10
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('cifar10', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png

  14. Dataset metadata of known Dataverse installations, August 2023

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Aug 30, 2024
    + more versions
    Cite
    Julian Gautier (2024). Dataset metadata of known Dataverse installations, August 2023 [Dataset]. http://doi.org/10.7910/DVN/8FEGUV
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Aug 30, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Julian Gautier
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains the metadata of the datasets published in 85 Dataverse installations and information about each installation's metadata blocks. It also includes the lists of pre-defined licenses or terms of use that dataset depositors can apply to the datasets they publish in the 58 installations that were running versions of the Dataverse software that include that feature. The data is useful for reporting on the quality of dataset and file-level metadata within and across Dataverse installations and improving understandings about how certain Dataverse features and metadata fields are used. Curators and other researchers can use this dataset to explore how well Dataverse software and the repositories using the software help depositors describe data.

    How the metadata was downloaded

    The dataset metadata and metadata block JSON files were downloaded from each installation between August 22 and August 28, 2023 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one column named "hostname" listing each installation URL in which I was able to create an account and another column named "apikey" listing my accounts' API tokens. The Python script expects the CSV file and the listed API tokens to get metadata and other information from installations that require API tokens.

    How the files are organized

    ├── csv_files_with_metadata_from_most_known_dataverse_installations
    │   ├── author(citation)_2023.08.22-2023.08.28.csv
    │   ├── contributor(citation)_2023.08.22-2023.08.28.csv
    │   ├── data_source(citation)_2023.08.22-2023.08.28.csv
    │   ├── ...
    │   └── topic_classification(citation)_2023.08.22-2023.08.28.csv
    ├── dataverse_json_metadata_from_each_known_dataverse_installation
    │   ├── Abacus_2023.08.27_12.59.59.zip
    │   ├── dataset_pids_Abacus_2023.08.27_12.59.59.csv
    │   ├── Dataverse_JSON_metadata_2023.08.27_12.59.59
    │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
    │   │   └── ...
    │   ├── metadatablocks_v5.6
    │   │   ├── astrophysics_v5.6.json
    │   │   ├── biomedical_v5.6.json
    │   │   ├── citation_v5.6.json
    │   │   ├── ...
    │   │   └── socialscience_v5.6.json
    │   ├── ACSS_Dataverse_2023.08.26_22.14.04.zip
    │   ├── ADA_Dataverse_2023.08.27_13.16.20.zip
    │   ├── Arca_Dados_2023.08.27_13.34.09.zip
    │   ├── ...
    │   └── World_Agroforestry_-_Research_Data_Repository_2023.08.27_19.24.15.zip
    ├── dataverse_installations_summary_2023.08.28.csv
    ├── dataset_pids_from_most_known_dataverse_installations_2023.08.csv
    ├── license_options_for_each_dataverse_installation_2023.09.05.csv
    └── metadatablocks_from_most_known_dataverse_installations_2023.09.05.csv

    This dataset contains two directories and four CSV files not in a directory. One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the citation metadata block and geospatial metadata block of datasets in the 85 Dataverse installations. For example, author(citation)_2023.08.22-2023.08.28.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 85 installations, where there's a row for author names, affiliations, identifier types and identifiers.

    The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 85 zipped files, one for each of the 85 Dataverse installations whose dataset metadata I was able to download. Each zip file contains a CSV file and two sub-directories. The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation as well as a column to indicate if the Python script was able to download the Dataverse JSON metadata for each dataset. It also includes the alias/identifier and category of the Dataverse collection that the dataset is in. One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema. The Dataverse JSON export of the latest version of each dataset includes "(latest_version)" in the file name. This should help those who are interested in the metadata of only the latest version of each dataset. The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I included them so that they can be used when extracting metadata from the dataset's Dataverse JSON exports.

    The dataverse_installations_summary_2023.08.28.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata...
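
    A simplified sketch of that collection step (not the original script): it reads the hostname/apikey CSV described above and queries each installation's Search API. The /api/search endpoint and the X-Dataverse-key header are standard Dataverse API features, but the CSV file name and the response handling here are assumptions; see the linked script for the full logic.

    import csv
    import requests

    # "installations.csv" is a placeholder for the two-column file described above.
    with open("installations.csv", newline="") as f:
        for row in csv.DictReader(f):
            hostname, apikey = row["hostname"], row["apikey"]
            # List published datasets via the Dataverse Search API.
            resp = requests.get(
                f"{hostname}/api/search",
                params={"q": "*", "type": "dataset", "per_page": 10},
                headers={"X-Dataverse-key": apikey} if apikey else {},
            )
            resp.raise_for_status()
            print(hostname, resp.json()["data"]["total_count"])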

  15. Wolverton Oxides Data

    • figshare.com
    • search.datacite.org
    application/gzip
    Updated Jun 1, 2023
    Cite
    Antoine Emery; Christopher Wolverton; Hacking Materials (2023). Wolverton Oxides Data [Dataset]. http://doi.org/10.6084/m9.figshare.7250417.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Antoine Emery; Christopher Wolverton; Hacking Materials
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    4,914 perovskite oxides containing composition data, lattice constants, and formation + vacancy formation energies. All perovskites are of the form ABO3. Adapted from a dataset presented by Emery and Wolverton. Available as Monty Encoder encoded JSON and as CSV. Recommended access method is with the matminer Python package using the datasets module.

    Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.

    Dataset described in: Emery, A. A. & Wolverton, C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci. Data 4:170153, doi: 10.1038/sdata.2017.153 (2017).

    Data sourced from: Emery, A. A., & Wolverton, C. Figshare, http://dx.doi.org/10.6084/m9.figshare.5334142 (2017).
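
    A minimal sketch of that access path; the dataset key "wolverton_oxides" is an assumption, so check matminer's dataset listing for the exact name:

    from matminer.datasets import load_dataset

    # Load the perovskite oxides data as a pandas DataFrame.
    df = load_dataset("wolverton_oxides")
    print(df.shape)
    print(df.columns.tolist())
    print(df.head())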

  16. Atlas of the Working Group I Contribution to the IPCC Sixth Assessment...

    • catalogue.ceda.ac.uk
    Updated Jun 19, 2023
    Cite
    Maialen Iturbide; José Manuel Gutiérrez; Joaquín Bedia; Ezequiel Cimadevilla; Javier Díez-Sierra; Rodrigo Manzanas; Ana Casanueva; Jorge Baño-Medina; Josipa Milovac; Sixto Milovac; Antonio S. Cofiño; Daniel San Martín; Markel García-Díez; Mathias Hauser; David Huard; Özge Yelekci; Jesús Fernández (2023). Atlas of the Working Group I Contribution to the IPCC Sixth Assessment Report - data for Figure Atlas.2 (v20221104) [Dataset]. https://catalogue.ceda.ac.uk/uuid/789ad030299342ea99534edfb62450d9
    Explore at:
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    Authors
    Maialen Iturbide; José Manuel Gutiérrez; Joaquín Bedia; Ezequiel Cimadevilla; Javier Díez-Sierra; Rodrigo Manzanas; Ana Casanueva; Jorge Baño-Medina; Josipa Milovac; Sixto Milovac; Antonio S. Cofiño; Daniel San Martín; Markel García-Díez; Mathias Hauser; David Huard; Özge Yelekci; Jesús Fernández
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1850 - Dec 31, 2099
    Area covered
    Earth
    Description

    Data for Figure Atlas.2 from Atlas of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6).

    Figure Atlas.2 shows WGI reference regions used in the (a) AR5 and (b) AR6 reports.

    How to cite this dataset

    When citing this dataset, please include both the data citation below (under 'Citable as') and the following citations: For the report component from which the figure originates: Gutiérrez, J.M., R.G. Jones, G.T. Narisma, L.M. Alves, M. Amjad, I.V. Gorodetskaya, M. Grose, N.A.B. Klutse, S. Krakovska, J. Li, D. Martínez-Castro, L.O. Mearns, S.H. Mernild, T. Ngo-Duc, B. van den Hurk, and J.-H. Yoon, 2021: Atlas. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 1927–2058, doi:10.1017/9781009157896.021

    Iturbide, M. et al., 2021: Repository supporting the implementation of FAIR principles in the IPCC-WG1 Interactive Atlas. Zenodo. Retrieved from: http://doi.org/10.5281/zenodo.5171760

    Figure subpanels

    The figure has two panels, with data provided for both panels in the master GitHub repository linked in the documentation.

    Data provided in relation to figure

    This dataset contains the corner coordinates defining each reference region for the second panel of the figure, which contain coordinate information at a 0.44º resolution. The repository directory 'reference-regions' contains data provided for the reference regions as polygons in different formats (CSV with coordinates, R data, shapefile and geojson) together with R and Python notebooks illustrating the use of these regions with worked examples.
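
    For instance, the geojson polygons can be read with geopandas; this is a sketch, and the file name below is a placeholder for whichever file ships in the 'reference-regions' directory:

    import geopandas as gpd

    # Placeholder path -- use the geojson provided in the 'reference-regions' directory.
    regions = gpd.read_file("reference-regions/IPCC-WGI-reference-regions.geojson")

    print(regions.head())                              # one row per reference region polygon
    regions.plot(edgecolor="black", facecolor="none")  # quick look at the region outlines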

    Data for reference regions for AR5 can be found here: https://catalogue.ceda.ac.uk/uuid/a3b6d7f93e5c4ea986f3622eeee2b96f

    CMIP5 is the fifth phase of the Coupled Model Intercomparison Project. CMIP6 is the sixth phase of the Coupled Model Intercomparison Project. CORDEX is the Coordinated Regional Downscaling Experiment from the WCRP. AR5 and AR6 refer to the Fifth and Sixth Assessment Reports of the IPCC. WGI stands for Working Group I.

    Notes on reproducing the figure from the provided data

    Data and figures produced by the Jupyter notebooks live inside the notebooks directory. The notebooks describe, step by step, the basic process followed to generate some key figures of the AR6 WGI Atlas and some products underpinning the Interactive Atlas, such as reference regions, global warming levels, and aggregated datasets. They include comments and hints to extend the analysis, thus promoting reusability of the results. These notebooks are provided as guidance for practitioners, being more user-friendly than the code provided as scripts in the reproducibility folder.

    Some of the notebooks require access to large data volumes outside this repository. To speed up the execution of the notebooks, in addition to the full code to access the data, we provide a data-loading shortcut by storing intermediate results in the auxiliary-material folder of this repository. To test other parameter settings, the full data access instructions should be followed, which can involve long waiting times.

    Sources of additional information

    The following weblinks are provided in the Related Documents section of this catalogue record:

    • Link to the figure on the IPCC AR6 website
    • Link to the report component containing the figure (Atlas)
    • Link to the Supplementary Material for Atlas, which contains details on the input data used in Table Atlas.SM.15
    • Link to the code for the figure, archived on Zenodo
    • Link to the necessary notebooks for reproducing the figure from GitHub
    • Link to IPCC AR5 reference regions dataset

  17. Glass Binary Data

    • figshare.com
    application/gzip
    Updated Jun 3, 2023
    Cite
    Y. T. Sun; H. Y. Bai; M. Z. Li; W. H. Wang; Hacking Materials (2023). Glass Binary Data [Dataset]. http://doi.org/10.6084/m9.figshare.7268507.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Y. T. Sun; H. Y. Bai; M. Z. Li; W. H. Wang; Hacking Materials
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Metallic glass formation data for binary alloys, collected from various experimental techniques such as melt-spinning or mechanical alloying. This dataset covers all compositions with an interval of 5 at.% in 59 binary systems, containing a total of 5959 alloys in the dataset. The target property of this dataset is the glass forming ability (GFA), i.e. whether the composition can form monolithic glass or not, which is either 1 for glass forming or 0 for non-full glass forming.

    The V2 versions of this dataset have been cleaned to remove duplicate data points. Any entries with identical formula and both negative and positive GFA classes were combined to a single entry with a positive GFA class.

    Data is available as Monty Encoder encoded JSON and as the source CSV file. Recommended access method is with the matminer Python package using the datasets module.

    Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.

    Dataset discussed in: Machine Learning Approach for Prediction and Understanding of Glass-Forming Ability. Y. T. Sun†§, H. Y. Bai†§, M. Z. Li*‡, and W. H. Wang*†§. J. Phys. Chem. Lett., 2017, 8 (14), pp 3434–3439. DOI: 10.1021/acs.jpclett.7b01046. Publication Date (Web): July 11, 2017.

    Affiliations: † Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China; ‡ Department of Physics, Beijing Key Laboratory of Optoelectronic Functional Materials & Micro-nano Devices, Renmin University of China, Beijing 100872, People's Republic of China; § University of Chinese Academy of Science, Beijing 100049, People's Republic of China.

  18. Replication Data for: Rayleigh invariance allows the estimation of effective...

    • b2find.eudat.eu
    Updated Aug 5, 2025
    + more versions
    Cite
    (2025). Replication Data for: Rayleigh invariance allows the estimation of effective CO2 fluxes due to convective dissolution into water-filled fractures - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/5f55a338-ab06-57bc-b6c2-da042eef1406
    Explore at:
    Dataset updated
    Aug 5, 2025
    Description

    This dataset features both data and code related to the research article titled "Rayleigh Invariance Enables Estimation of Effective CO2 Fluxes Resulting from Convective Dissolution in Water-Filled Fractures". It includes raw data packaged in tarball format, including the Python scripts used to derive the results presented in the publication. High-resolution raw data for contour plots is available upon request.

    1. Download the dataset: Download the dataset file using Access Dataset. Ensure you have sufficient disk space available for storing and processing the dataset.
    2. Extract the dataset: Once the dataset file is downloaded, extract its contents. The dataset is compressed in tar.xz format; use appropriate tools to extract it. For example, on Linux you can use the following commands: tar -xf Publication_CCS.tar.xz, tar -xf Publication_Karst.tar.xz, tar -xf Validation_Sim.tar.xz. This will create a directory containing the dataset files.
    3. Install required Python packages: Before running any code, ensure you have the necessary Python (version 3.10 tested) packages installed. The required packages and their versions are listed in the requirements.txt file. You can install them using pip: pip install -r requirements.txt
    4. Run the post-processing script: After extracting the dataset and installing the required Python packages, you can run the provided post-processing script (post_process.py), which is designed to replicate all the plots from the publication based on the dataset. Execute the script using Python: python3 post_process.py. This script will generate the plots and output them to the specified directory.
    5. Explore and analyze: Once the script has completed running, you can explore the generated plots to gain insights from the dataset. Feel free to modify the script or use the dataset in your own analysis and experiments. High-resolution data, such as the vtu's for the contour plots, is available upon request; please feel free to reach out if needed.
    6. Small grid study: There is a tarball for the data that was generated to study the grid used in the related publication: tar -xf Publication_CCS.tar.xz. If you unpack the tarball and have the requirements from above installed, you can use the Python script to generate the plots.
    7. Citation: If you use this dataset in your research or publication, please cite the original source appropriately to give credit to the authors and contributors.

  19. stack-dedup-python-testgen-starcoder-filter-v2-dedup

    • huggingface.co
    Updated Jul 20, 2024
    + more versions
    Cite
    Northeastern University Programming Research Lab (2024). stack-dedup-python-testgen-starcoder-filter-v2-dedup [Dataset]. https://huggingface.co/datasets/nuprl/stack-dedup-python-testgen-starcoder-filter-v2-dedup
    Explore at:
    Dataset updated
    Jul 20, 2024
    Dataset authored and provided by
    Northeastern University Programming Research Lab
    Description

    MultiPL-T Python Sources

    Citation

    If you use this dataset we request that you cite our work:

    @misc{cassano:multipl-t,
      title={Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs},
      author={Federico Cassano and John Gouwar and Francesca Lucchetti and Claire Schlesinger and Anders Freeman and Carolyn Jane Anderson and Molly Q Feldman and Michael Greenberg and Abhinav Jangda and Arjun Guha},
      year={2024}…

    See the full description on the dataset page: https://huggingface.co/datasets/nuprl/stack-dedup-python-testgen-starcoder-filter-v2-dedup.
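
    A minimal loading sketch for this dataset; the "train" split name is an assumption, so check the dataset page for the available splits:

    from datasets import load_dataset

    # Stream the dataset to avoid downloading everything up front.
    ds = load_dataset(
        "nuprl/stack-dedup-python-testgen-starcoder-filter-v2-dedup",
        split="train",
        streaming=True,
    )
    for example in ds.take(2):
        print(example)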

  20. Dataset and scripts for A Deep Dive into Machine Learning Density Functional...

    • b2find.eudat.eu
    Updated Oct 2, 2021
    + more versions
    Cite
    (2021). Dataset and scripts for A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/ca6d2cfb-39c1-5cb2-93ec-86cd256a765e
    Explore at:
    Dataset updated
    Oct 2, 2021
    Description

    This dataset contains additional data for the publication "A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry". Its goal is to enable interested people to reproduce the citation analysis carried out in the aforementioned publication.

    Prerequisites

    The following software versions were used for the Python version of this dataset:

    • Python: 3.8.6
    • Scholarly: 1.2.0
    • Pyzotero: 1.4.24
    • Numpy: 1.20.1

    Contents

    • results/: Contains the .csv files that were the results of the citation analysis. Paper groupings follow the ones outlined in the publication.
    • scripts/: Contains scripts to perform the citation analysis.
    • Zotero.cached.pkl: Contains the cached Zotero library.

    Usage

    In order to reproduce the results of the citation analysis, you can use citation_analysis.py in conjunction with the cached Zotero library. Manual additions can be verified using the check_consistency script. Please note that you will need a Tor key for the citation analysis, and access to our Zotero library if you don't want to use the cached version. If you need this access, simply contact us.
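
    For orientation, a minimal Pyzotero sketch for fetching a library directly (the library ID, type, and API key are placeholders; the provided scripts use the cached Zotero.cached.pkl instead):

    from pyzotero import zotero

    # Placeholders: fill in the group/user library ID and an API key with read access.
    library = zotero.Zotero(library_id="1234567", library_type="group", api_key="YOUR_API_KEY")

    # Retrieve every item in the library (pages through the API automatically).
    items = library.everything(library.items())
    print(f"Fetched {len(items)} items")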
