100+ datasets found
  1. VegeNet - Image datasets and Codes

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 27, 2022
    Cite
    Tan, Jo Yen (2022). VegeNet - Image datasets and Codes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7254507
    Explore at:
    Dataset updated
    Oct 27, 2022
    Dataset authored and provided by
    Tan, Jo Yen
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Compilation of Python code for data preprocessing and VegeNet building, as well as image datasets (zip files).

Image datasets:

(1) vege_original : Images of vegetables captured manually during the data acquisition stage.

(2) vege_cropped_renamed : Images from (1), cropped to remove background areas, with image labels renamed.

(3) non-vege images : Images of non-vegetable foods, so the CNN can recognize foods other than vegetables.

(4) food_image_dataset : Complete set of vege (2) and non-vege (3) images for architecture building.

(5) food_image_dataset_split : Image dataset (4) split into train and test sets.

process : Images created during cropping (the pre-processing step) to produce dataset (2).
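A minimal loading sketch for the split image set, assuming the archive unpacks into train/ and test/ folders with one sub-folder per class (the paths and image size below are assumptions, not documented by the dataset):

```python
# Hedged sketch: load the (assumed) train/test folders from
# food_image_dataset_split using Keras utilities.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "food_image_dataset_split/train",  # assumed path inside the unzipped archive
    image_size=(224, 224),             # assumed target size
    batch_size=32,
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "food_image_dataset_split/test",   # assumed path
    image_size=(224, 224),
    batch_size=32,
)
print(train_ds.class_names)            # vegetable and non-vegetable classes
```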

  2. Python-codes

    • huggingface.co
    Updated Sep 13, 2023
    Cite
    Arjun G Ravi (2023). Python-codes [Dataset]. https://huggingface.co/datasets/Arjun-G-Ravi/Python-codes
    Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 13, 2023
    Authors
    Arjun G Ravi
    License

MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Dataset Name

Please note that this dataset may not be perfect and may contain a very small quantity of non-Python code.

      Dataset Summary
    

The dataset contains a collection of Python questions and their code. It is meant for training models to be efficient at Python-specific coding. The dataset has two features: 'question' and 'code'. An example is: {'question': 'Create a function that takes in a string… See the full description on the dataset page: https://huggingface.co/datasets/Arjun-G-Ravi/Python-codes.
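A minimal loading sketch with the Hugging Face datasets library (the repo id comes from the citation above; the split name is an assumption):

```python
# Load the question/code pairs from the Hugging Face Hub.
from datasets import load_dataset

ds = load_dataset("Arjun-G-Ravi/Python-codes", split="train")  # split name assumed
print(ds.features)        # expect the 'question' and 'code' fields
print(ds[0]["question"])  # first question text
```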

  3. Date-Dataset

    • kaggle.com
    Updated Aug 18, 2021
    Cite
    nishtha kukreti (2021). Date-Dataset [Dataset]. https://www.kaggle.com/nishthakukreti/datedataset/code
    Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 18, 2021
    Dataset provided by
Kaggle: http://kaggle.com/
    Authors
    nishtha kukreti
    Description

    Context

This is a random date dataset that I generated with a Python script, for building a machine learning model that tags dates in any given document.

    Content

This dataset records whether a given word or sequence of words is a date or not.


    Inspiration

Implement a machine learning or deep learning model, or train a custom spaCy pipeline, to tag dates and other parts of speech.
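A minimal sketch of the kind of tagging this dataset targets, using a pretrained spaCy pipeline (the model name and sample text are illustrative, not part of the dataset):

```python
# Tag DATE entities with a pretrained spaCy model (illustrative only;
# the dataset itself is meant for training a custom tagger).
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm
doc = nlp("The invoice was issued on 12 March 2021 and paid in April.")
for ent in doc.ents:
    if ent.label_ == "DATE":
        print(ent.text, ent.label_)
```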

  4. VSAT-2D Example Dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Apr 9, 2021
    Cite
    Hugo Méndez-Hernández; Hugo Méndez-Hernández (2021). VSAT-2D Example Dataset [Dataset]. http://doi.org/10.5281/zenodo.4671110
    Explore at:
application/gzip (available download formats)
    Dataset updated
    Apr 9, 2021
    Dataset provided by
Zenodo: http://zenodo.org/
    Authors
    Hugo Méndez-Hernández; Hugo Méndez-Hernández
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Dataset to run the Example.py script of the Valparaíso Stacking Analysis Tool (VSAT-2D). VSAT-2D provides a series of tools for selecting, stacking, and analyzing moment-0 intensity maps from interferometric datasets. It is intended for stacking samples of moment-0 maps extracted from interferometric datasets belonging to large extragalactic catalogs: subsamples of galaxies can be selected by their available properties (e.g. redshift, stellar mass, star formation rate), and diverse composite spectra (e.g. median, average, weighted average, histogram) can be generated. However, VSAT-2D can also be used on smaller datasets containing any type of astronomical object.

VSAT-2D can be downloaded from its GitHub repository.

  5. example dataset with Python files

    • dataverse-training.tdl.org
    Updated Mar 7, 2025
    Cite
    Michael Shensky; Michael Shensky (2025). example dataset with Python files [Dataset]. http://doi.org/10.33536/FK2/C4QFGQ
    Explore at:
txt(7804), application/x-ipynb+json(18788) (available download formats)
    Dataset updated
    Mar 7, 2025
    Dataset provided by
    Texas Data Repository ***TRAINING*** Dataverse
    Authors
    Michael Shensky; Michael Shensky
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    example dataset with Python files

  6. Data from: Code4ML: a Large-scale Dataset of annotated Machine Learning Code...

    • zenodo.org
    csv
    Updated Sep 15, 2023
    + more versions
    Cite
    Anonymous authors; Anonymous authors (2023). Code4ML: a Large-scale Dataset of annotated Machine Learning Code [Dataset]. http://doi.org/10.5281/zenodo.6607065
    Explore at:
csv (available download formats)
    Dataset updated
    Sep 15, 2023
    Dataset provided by
Zenodo: http://zenodo.org/
    Authors
    Anonymous authors; Anonymous authors
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present Code4ML: a Large-scale Dataset of annotated Machine Learning Code, a corpus of Python code snippets, competition summaries, and data descriptions from Kaggle.

The data is organized in a table structure. Code4ML includes several main objects: competition information, raw code blocks collected from Kaggle, and manually marked-up snippets. Each table is stored in .csv format.

Each competition has a text description and metadata reflecting the competition and dataset characteristics as well as the evaluation metrics (competitions.csv). The corresponding datasets can be loaded using the Kaggle API and data sources.

The code blocks themselves and their metadata are collected into data frames according to the publishing year of the initial kernels. The current version of the corpus includes two code-block files: snippets from kernels up to 2020 (сode_blocks_upto_20.csv) and those from 2021 (сode_blocks_21.csv), with corresponding metadata. The corpus consists of 2,743,615 ML code blocks collected from 107,524 Jupyter notebooks.

Marked-up code blocks have the following metadata: an anonymized id, the format of the data used (for example, table or audio), the id of the semantic type, a flag for code errors, the estimated relevance to the semantic class (from 1 to 5), the id of the parent notebook, and the name of the competition. The current version of the corpus has ~12,000 labeled snippets (markup_data_20220415.csv).

Since the marked-up code block data contains the numeric id of each block's semantic type, we also provide a mapping from this number to the semantic type and subclass (actual_graph_2022-06-01.csv).

    The dataset can help solve various problems, including code synthesis from a prompt in natural language, code autocompletion, and semantic code classification.
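A minimal exploration sketch with pandas, using the file names quoted above (the column names are assumptions based on the description, not a documented schema):

```python
# Join labeled snippets with the semantic-type mapping (column names assumed).
import pandas as pd

snippets = pd.read_csv("markup_data_20220415.csv")
types = pd.read_csv("actual_graph_2022-06-01.csv")

print(snippets.columns.tolist())  # inspect the real schema first
# Assumed join key; replace with the actual semantic-type id columns.
merged = snippets.merge(types, left_on="semantic_type_id", right_on="id", how="left")
print(merged.head())
```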

  7. Dataset of books called Natural language processing : Python and NLTK :...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Natural language processing : Python and NLTK : learning path : learn to build expert NLP and machine learning projects using NLTK and other Python libraries [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Natural+language+processing+%3A+Python+and+NLTK+%3A+learning+path+%3A+learn+to+build+expert+NLP+and+machine+learning+projects+using+NLTK+and+other+Python+libraries
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Natural language processing : Python and NLTK : learning path : learn to build expert NLP and machine learning projects using NLTK and other Python libraries. It features 7 columns including author, publication date, language, and book publisher.

  8. python-codes-25k

    • huggingface.co
    Cite
    FLOCK4H, python-codes-25k [Dataset]. https://huggingface.co/datasets/flytech/python-codes-25k
    Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    FLOCK4H
    Description

    License

    MIT

This is a Cleaned Python Dataset Covering 25,000 Instructional Tasks

Overview
    

The dataset has 4 key features (fields): instruction, input, output, and text. It's a rich source of Python code and tasks, and extends into behavioral aspects.

Dataset Statistics

• Total Entries: 24,813
• Unique Instructions: 24,580
• Unique Inputs: 3,666
• Unique Outputs: 24,581
• Unique Texts: 24,813
• Average Tokens per Example: 508

      Features… See the full description on the dataset page: https://huggingface.co/datasets/flytech/python-codes-25k.
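A minimal loading sketch with the Hugging Face datasets library (the repo id comes from the dataset page above; the split name is an assumption):

```python
# Inspect the instruction/input/output/text fields described above.
from datasets import load_dataset

ds = load_dataset("flytech/python-codes-25k", split="train")  # split name assumed
example = ds[0]
for field in ("instruction", "input", "output", "text"):
    print(field, "->", str(example[field])[:80])
```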
    
  9. I-GUIDE Dataset Catalog Example

    • search.dataone.org
    • hydroshare.org
    • +1more
    Updated Dec 30, 2023
    Cite
    Anthony M. Castronova (2023). I-GUIDE Dataset Catalog Example [Dataset]. https://search.dataone.org/view/sha256%3A861df8feeaa7d4c7ae4eee87e2cde45c1998cbfc1e3e8e879d5a0e627bab8ceb
    Explore at:
    Dataset updated
    Dec 30, 2023
    Dataset provided by
    Hydroshare
    Authors
    Anthony M. Castronova
    Description

    This resource contains a Jupyter notebook that demonstrates how someone can query the I-GUIDE data catalog, retrieve data, and execute a code workflow.

  10. Dataset metadata of known Dataverse installations, August 2023

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Aug 30, 2024
    + more versions
    Cite
    Julian Gautier (2024). Dataset metadata of known Dataverse installations, August 2023 [Dataset]. http://doi.org/10.7910/DVN/8FEGUV
    Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 30, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Julian Gautier
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

This dataset contains the metadata of the datasets published in 85 Dataverse installations and information about each installation's metadata blocks. It also includes the lists of pre-defined licenses or terms of use that dataset depositors can apply to the datasets they publish in the 58 installations that were running versions of the Dataverse software that include that feature. The data is useful for reporting on the quality of dataset and file-level metadata within and across Dataverse installations and improving understandings about how certain Dataverse features and metadata fields are used. Curators and other researchers can use this dataset to explore how well Dataverse software and the repositories using the software help depositors describe data.

How the metadata was downloaded

The dataset metadata and metadata block JSON files were downloaded from each installation between August 22 and August 28, 2023 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one column named "hostname" listing each installation URL in which I was able to create an account and another column named "apikey" listing my accounts' API tokens. The Python script expects the CSV file and the listed API tokens to get metadata and other information from installations that require API tokens.

How the files are organized

├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation)_2023.08.22-2023.08.28.csv
│   ├── contributor(citation)_2023.08.22-2023.08.28.csv
│   ├── data_source(citation)_2023.08.22-2023.08.28.csv
│   ├── ...
│   └── topic_classification(citation)_2023.08.22-2023.08.28.csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2023.08.27_12.59.59.zip
│   │   ├── dataset_pids_Abacus_2023.08.27_12.59.59.csv
│   │   ├── Dataverse_JSON_metadata_2023.08.27_12.59.59
│   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
│   │   │   ├── ...
│   │   ├── metadatablocks_v5.6
│   │   │   ├── astrophysics_v5.6.json
│   │   │   ├── biomedical_v5.6.json
│   │   │   ├── citation_v5.6.json
│   │   │   ├── ...
│   │   │   └── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2023.08.26_22.14.04.zip
│   ├── ADA_Dataverse_2023.08.27_13.16.20.zip
│   ├── Arca_Dados_2023.08.27_13.34.09.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2023.08.27_19.24.15.zip
├── dataverse_installations_summary_2023.08.28.csv
├── dataset_pids_from_most_known_dataverse_installations_2023.08.csv
├── license_options_for_each_dataverse_installation_2023.09.05.csv
└── metadatablocks_from_most_known_dataverse_installations_2023.09.05.csv

This dataset contains two directories and four CSV files not in a directory.

One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the citation metadata block and geospatial metadata block of datasets in the 85 Dataverse installations. For example, author(citation)_2023.08.22-2023.08.28.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 85 installations, where there's a row for author names, affiliations, identifier types and identifiers.

The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 85 zipped files, one for each of the 85 Dataverse installations whose dataset metadata I was able to download. Each zip file contains a CSV file and two sub-directories:

The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation as well as a column to indicate if the Python script was able to download the Dataverse JSON metadata for each dataset. It also includes the alias/identifier and category of the Dataverse collection that the dataset is in.

One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema. The Dataverse JSON export of the latest version of each dataset includes "(latest_version)" in the file name. This should help those who are interested in the metadata of only the latest version of each dataset.

The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I included them so that they can be used when extracting metadata from the dataset's Dataverse JSON exports.

The dataverse_installations_summary_2023.08.28.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata...
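A minimal sketch of the two-column CSV the download script expects, per the description above (the file name here is hypothetical; the column names come from the description):

```python
# Build the hostname/apikey CSV the download script reads (file name assumed).
import csv

rows = [
    {"hostname": "https://dataverse.harvard.edu", "apikey": "YOUR_API_TOKEN"},
]
with open("installation_api_tokens.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["hostname", "apikey"])
    writer.writeheader()
    writer.writerows(rows)
```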

  11. Projet Python Dataset

    • universe.roboflow.com
    zip
    Updated May 8, 2023
    Cite
    fichesmat (2023). Projet Python Dataset [Dataset]. https://universe.roboflow.com/fichesmat/projet-python/model/2
    Explore at:
zip (available download formats)
    Dataset updated
    May 8, 2023
    Dataset authored and provided by
    fichesmat
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Generative Adversarial Network
    Description

    Projet Python

    ## Overview
    
    Projet Python is a dataset for classification tasks - it contains Generative Adversarial Network annotations for 620 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
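A hedged download sketch using the roboflow Python package (the workspace, project, and version are inferred from the dataset URL above; the API key and export format are placeholders):

```python
# Download version 2 of the projet-python dataset (identifiers inferred
# from the Roboflow Universe URL; api_key is a placeholder).
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("fichesmat").project("projet-python")
dataset = project.version(2).download("folder")  # classification export format assumed
print(dataset.location)
```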
    
  12. Example datasets for demonstrating and testing the vreg python package

    • zenodo.org
    bin
    Updated May 28, 2025
    Cite
    Steven Sourbron; Steven Sourbron (2025). Example datasets for demonstrating and testing the vreg python package [Dataset]. http://doi.org/10.5281/zenodo.15535700
    Explore at:
bin (available download formats)
    Dataset updated
    May 28, 2025
    Dataset provided by
Zenodo: http://zenodo.org/
    Authors
    Steven Sourbron; Steven Sourbron
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Example data included with the Python package vreg (openmiblab.github.io/vreg).

    The package includes an API for downloading and reading the data through the function vreg.fetch.

    The database includes kidney MRI data of different types from a single subject of the iBEAt study:

    • Dixon_out_phase: A 3D coronal opposed-phase volume from a DIXON scan of the abdomen.
    • Dixon_in_phase: In-phase acquisition of the same volume.
    • Dixon_water: Water map computed from in- and out phase.
    • Dixon_fat: Fat map computed from in- and out phase.
    • kidneys: A 3D binary mask covering the kidneys on the coronal DIXON scan.
    • left_kidney: A 3D mask covering the left kidney on the DIXON scan.
    • right_kidney: A 3D mask covering the right kidney on the DIXON scan.
    • MTR: A 3D magnetization transfer ratio volume with oblique orientation aligned with the axis of the kidneys.
    • T1: A 5-slice 2D T1-map through the kidneys in oblique orientations aligned with the axis of the kidneys. Since the slices have gaps between them, each is stored as a separate file.
    • T2star: A 5-slice 2D T2* map through the kidneys in oblique orientations aligned with the axis of the kidneys. Since the slices have gaps between them, each is stored as a separate file.

    Version history

• v0.0.1: data format changed to NIfTI.
    • v0.0.0: data uploaded in pickle format.
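A minimal usage sketch via the vreg.fetch function mentioned above (calling fetch with a dataset name, and the return type, are assumptions based on the listing):

```python
# Fetch example volumes by name (names taken from the list above).
import vreg

water = vreg.fetch("Dixon_water")   # assumed to download and return the water map
left = vreg.fetch("left_kidney")    # 3D mask covering the left kidney
print(type(water))
```
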
  13. Dataset of books series that contain Building machine learning systems with...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of books series that contain Building machine learning systems with Python : master the art of machine learning with Python and build effective machine learning sytems with this intensive hands-on guide [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Building+machine+learning+systems+with+Python+:+master+the+art+of+machine+learning+with+Python+and+build+effective+machine+learning+sytems+with+this+intensive+hands-on+guide&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 25, 2024
    Dataset authored and provided by
    Work With Data
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered where the books is Building machine learning systems with Python : master the art of machine learning with Python and build effective machine learning sytems with this intensive hands-on guide. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  14. VSAT-3D Example Dataset

    • data.niaid.nih.gov
    Updated Apr 9, 2021
    Cite
    Méndez-Hernández, Hugo (2021). VSAT-3D Example Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4671100
    Explore at:
    Dataset updated
    Apr 9, 2021
    Dataset authored and provided by
    Méndez-Hernández, Hugo
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Dataset to run the Example.py script of the Valparaíso Stacking Analysis Tool (VSAT-3D). VSAT-3D provides a series of tools for selecting, stacking, and analyzing 3D spectra. It is intended for stacking samples of datacubes extracted from interferometric datasets belonging to large extragalactic catalogs: subsamples of galaxies can be selected by their available properties (e.g. redshift, stellar mass, star formation rate), and diverse composite spectra (e.g. median, average, weighted average, histogram) can be generated. However, VSAT-3D can also be used on smaller datasets containing any type of astronomical object.

VSAT-3D can be downloaded from its GitHub repository.

  15. Python scripts used to generate the figures in "An algorithm to identify...

    • datasets.ai
    • s.cnmilf.com
    • +2more
    Updated Aug 8, 2024
    Cite
    National Institute of Standards and Technology (2024). Python scripts used to generate the figures in "An algorithm to identify vapor-liquid-liquid equilibria of binary mixtures from vapor-liquid equilibria" [Dataset]. https://datasets.ai/datasets/python-scripts-used-to-generate-the-figures-in-an-algorithm-to-identify-vapor-liquid-liqui
    Explore at:
Available download formats
    Dataset updated
    Aug 8, 2024
    Dataset authored and provided by
National Institute of Standards and Technology: http://www.nist.gov/
    Description

The files in this repository can be used to generate the complete set of figures in the paper "An algorithm to identify vapor-liquid-liquid equilibria from vapor-liquid equilibria". The zip file, when expanded, includes a conda environment to populate the dependencies and a set of Python scripts. Running make_figures.py regenerates all the figures, demonstrating how to use the algorithm.

  16. Dataset of book subjects that contain Scientific computing with Python 3 :...

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of book subjects that contain Scientific computing with Python 3 : an example-rich, comprehensive guide for all of your Python computational needs [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Scientific+computing+with+Python+3+:+an+example-rich%2C+comprehensive+guide+for+all+of+your+Python+computational+needs&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 2 rows and is filtered where the books is Scientific computing with Python 3 : an example-rich, comprehensive guide for all of your Python computational needs. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  17. Dataset of book subjects that contain Kivy : interactive applications in...

    • workwithdata.com
    Updated Nov 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2024). Dataset of book subjects that contain Kivy : interactive applications in Python : create cross-platform UI/UX applications and games in Python [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Kivy+:+interactive+applications+in+Python+:+create+cross-platform+UI/UX+applications+and+games+in+Python&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 3 rows and is filtered where the books is Kivy : interactive applications in Python : create cross-platform UI/UX applications and games in Python. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  18. (HS 2) Automate Workflows using Jupyter notebook to create Large Extent...

    • search.dataone.org
    • hydroshare.org
    Updated Oct 19, 2024
    + more versions
    Cite
    Young-Don Choi (2024). (HS 2) Automate Workflows using Jupyter notebook to create Large Extent Spatial Datasets [Dataset]. http://doi.org/10.4211/hs.a52df87347ef47c388d9633925cde9ad
    Explore at:
    Dataset updated
    Oct 19, 2024
    Dataset provided by
    Hydroshare
    Authors
    Young-Don Choi
    Description

We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio xarray extension; rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save the GeoTIFF data as NetCDF. These procedures resulted in the creation of three HydroShare resources (HS 3, HS 4 and HS 5) for sharing state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software, the Jupyter notebook development was undertaken on a Windows OS.
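A minimal GeoTIFF-to-NetCDF sketch with rioxarray and xarray, along the lines described above (the file names and attribute values are placeholders, not the resource's actual paths):

```python
# Convert a GeoTIFF to NetCDF, adding metadata along the way (paths assumed).
import rioxarray

da = rioxarray.open_rasterio("state_les.tif")     # read GeoTIFF as an xarray DataArray
da.attrs["description"] = "state-scale LES data"  # illustrative metadata addition
ds = da.to_dataset(name="les")                    # wrap in a Dataset for NetCDF output
ds.to_netcdf("state_les.nc")
```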

  19. initial_dataset

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Michal Lazniewski (2025). initial_dataset [Dataset]. https://huggingface.co/datasets/mlazniewski/initial_dataset
    Explore at:
    Dataset updated
    Jun 1, 2025
    Authors
    Michal Lazniewski
    Description

    My Cool Dataset

This dataset is an example of how to create and upload a dataset card using Python. I use it only to practice manipulating the dataset itself: adding new data, removing data, and fixing typos. A minimal upload sketch follows the details below.

      Dataset Details
    

Language: English
License: MIT
Tags: text-classification, example
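A minimal create-and-upload sketch with the Hugging Face datasets library (the example rows are placeholders; pushing requires an authenticated Hugging Face session):

```python
# Create a tiny dataset and push it to the Hub (rows are placeholders).
from datasets import Dataset

ds = Dataset.from_dict({
    "text": ["an example sentence", "another example"],
    "label": [0, 1],
})
ds.push_to_hub("mlazniewski/initial_dataset")  # repo id from the citation above
```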

  20. Python Test Dataset

    • universe.roboflow.com
    zip
    Updated Jan 27, 2024
    Cite
    Python Test (2024). Python Test Dataset [Dataset]. https://universe.roboflow.com/python-test/python-test/model/2
    Explore at:
zip (available download formats)
    Dataset updated
    Jan 27, 2024
    Dataset authored and provided by
    Python Test
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Atk Bounding Boxes
    Description

    Python Test

    ## Overview
    
    Python Test is a dataset for object detection tasks - it contains Atk annotations for 970 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    