100+ datasets found
  1. metadata

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). metadata [Dataset]. https://catalog.data.gov/dataset/metadata-f2500
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The dataset consists of public domain acute and chronic toxicity and chemistry data for algal species. Data are accessible at: https://envirotoxdatabase.org/ Data include algal species, chemical identification, and the concentrations that do and do not affect algal growth.

  2. the-stack-metadata

    • huggingface.co
    Updated Apr 16, 2023
    Cite
    BigCode (2023). the-stack-metadata [Dataset]. https://huggingface.co/datasets/bigcode/the-stack-metadata
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 16, 2023
    Dataset authored and provided by
    BigCode
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for The Stack Metadata

      Changelog
    

    Release Description

    v1.1 This is the first release of the metadata. It is for The Stack v1.1

    v1.2 Metadata dataset matching The Stack v1.2

      Dataset Summary
    

    This is a set of additional information for repositories used for The Stack. It contains file paths and detected licenses, as well as some other information for the repositories.

      Supported Tasks and Leaderboards
    

    The main task is to recreate… See the full description on the dataset page: https://huggingface.co/datasets/bigcode/the-stack-metadata.

  3. IMDB & TMDB Movie Metadata Big Dataset (over 1M)

    • kaggle.com
    zip
    Updated Aug 5, 2024
    Cite
    Shubham Chandra (2024). IMDB & TMDB Movie Metadata Big Dataset (over 1M) [Dataset]. https://www.kaggle.com/datasets/shubhamchandra235/imdb-and-tmdb-movie-metadata-big-dataset-1m
    Explore at:
    Available download formats: zip (416807108 bytes)
    Dataset updated
    Aug 5, 2024
    Authors
    Shubham Chandra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Title: IMDB & TMDB Movie Metadata Big Dataset (>1M)

    Subtitle: A Comprehensive Dataset Featuring Detailed Metadata of Movies (IMDB, TMDB). Over 1M Rows & 42 Features: Metadata, Ratings, Genres, Cast, Crew, Sentiment Analysis and many more...

    Detailed Description:

    Overview: This comprehensive dataset merges the extensive film data available from both IMDB and TMDB, offering a rich resource for movie enthusiasts, data scientists, and researchers. With over 1 million rows and 42 detailed features, this dataset provides in-depth information about a wide variety of movies, spanning different genres, periods, and production backgrounds.

    File Information:

    1. File Size: ≈ 1GB
    2. Format: CSV (Comma-Separated Values)

    Column Descriptors/Key Features:

    1. ID: Unique identifier for each movie.
    2. Title: The official title of the movie.
    3. Vote Average: Average rating received by the movie.
    4. Vote Count: Number of votes the movie has received.
    5. Status: Current status of the movie (e.g., Released, Post-Production).
    6. Release Date: Official release date of the movie.
    7. Revenue: Box office revenue generated by the movie.
    8. Runtime: Duration of the movie in minutes.
    9. Adult: Indicates if the movie is for adults.
    10. Genres: List of genres the movie belongs to.
    11. Overview Sentiment: Sentiment analysis of the movie's overview text.
    12. Cast: List of main actors in the movie.
    13. Crew: List of key crew members, including directors, producers, and writers.
    14. Genres List: Detailed genres in list format.
    15. Keywords: List of relevant keywords associated with the movie.
    16. Director of Photography: Name of the cinematographer.
    17. Producers: Names of the producers.
    18. Music Composer: Name of the music composer.

    Additional Features:

    1. Unnamed 0: Index column.
    2. Star1, Star2, Star3, Star4: Names of the top-billed stars.
    3. Writer: Name(s) of the writer(s).
    4. Original Language: Original language of the movie.
    5. Original Title: Original title if different from the main title.
    6. Popularity: Popularity score of the movie.
    7. Budget: Budget allocated for the movie.
    8. Tagline: Promotional tagline of the movie.
    9. Production Companies: Companies involved in the production.
    10. Production Countries: Countries where the movie was produced.
    11. Spoken Languages: Languages spoken in the movie.
    12. Homepage: Official website of the movie.
    13. IMDB ID: Unique identifier on IMDB.
    14. TMDB ID: Unique identifier on TMDB.
    15. Video: Indicates if there is a video associated.
    16. Poster Path: Path to the movie poster image.
    17. Backdrop Path: Path to the backdrop image.
    18. Release Year: Year the movie was released.
    19. Collection Name: Name of the collection the movie belongs to.
    20. Collection ID: Unique identifier for the collection.
    21. Genres ID: Unique identifier for the genres.
    22. Original Language Code: Code for the original language.
    23. Overview: Brief summary of the movie.
    24. All Combined Keywords: Combined keywords in a single field.

    Potential Use Cases:

    • Sentiment Analysis: Analyze audience sentiment towards movies based on reviews and ratings.
    • Recommendation Systems: Build models to recommend movies based on user preferences and viewing history.
    • Market Analysis: Study trends in the movie industry, including genre popularity and revenue patterns.
    • Content Analysis: Investigate the thematic content and diversity of movies over time.
    • Data Visualization: Create visual representations of movie data to uncover hidden insights.
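
    As a rough illustration of the market-analysis use case, the sketch below averages revenue per genre using only the Python standard library. The column names `genres` and `revenue`, the comma-separated genre format, and the in-memory sample rows are assumptions made for the example; they are not confirmed by the dataset listing and may need adjusting against the real CSV header.

```python
import csv
import io
from collections import defaultdict


def mean_revenue_by_genre(csv_file, genre_col="genres", revenue_col="revenue"):
    """Return {genre: mean revenue}. Column names are assumed, not verified."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        try:
            revenue = float(row[revenue_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric revenue
        # Assumption: genres are stored as one comma-separated string
        for genre in (row.get(genre_col) or "").split(","):
            genre = genre.strip()
            if genre:
                totals[genre] += revenue
                counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


# Hypothetical in-memory rows standing in for the real file
sample = io.StringIO(
    "title,genres,revenue\n"
    'A,"Horror, Thriller",100\n'
    "B,Horror,300\n"
    "C,Drama,\n"
)
result = mean_revenue_by_genre(sample)
print(result)
```

    For the real file, the same function could be called with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.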

  4. Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection)

    • crawlfeeds.com
    csv, zip
    Updated Aug 23, 2025
    Cite
    Crawl Feeds (2025). Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection) [Dataset]. https://crawlfeeds.com/datasets/movies-tv-shows-metadata-dataset-190k-records-horror-heavy-collection
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 23, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    This comprehensive dataset features detailed metadata for over 190,000 movies and TV shows, with a strong concentration in the Horror genre. It is ideal for entertainment research, machine learning models, genre-specific trend analysis, and content recommendation systems.

    Each record contains rich information, making it perfect for streaming platforms, film industry analysts, or academic media researchers.

    Primary Genre Focus: Horror

    Use Cases:

    • Build movie recommendation systems or genre classifiers

    • Train NLP models on movie descriptions

    • Analyze Horror content trends over time

    • Explore box office vs. rating correlations

    • Enrich entertainment datasets with directorial and cast metadata

  5. Data from: Sample metadata

    • fairdomhub.org
    xlsx
    Updated Jul 1, 2021
    Cite
    Thomas Harvey (2021). Sample metadata [Dataset]. https://fairdomhub.org/data_files/1440
    Explore at:
    Available download formats: xlsx (43.9 KB)
    Dataset updated
    Jul 1, 2021
    Authors
    Thomas Harvey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information on samples submitted for RNAseq

    Rows are individual samples

    Columns are: ID Sample Name Date sampled Species Sex Tissue Geographic location Date extracted Extracted by Nanodrop Conc. (ng/µl) 260/280 260/230 RIN Plate ID Position Index name Index Seq Qubit BR kit Conc. (ng/ul) BioAnalyzer Conc. (ng/ul) BioAnalyzer bp (region 200-1200) Submission reference Date submitted Conc. (nM) Volume provided PE/SE Number of reads Read length

  6. arxiv-metadata-dataset

    • huggingface.co
    Updated Jun 30, 2015
    Cite
    Sumuk Shashidhar (2015). arxiv-metadata-dataset [Dataset]. https://huggingface.co/datasets/sumuks/arxiv-metadata-dataset
    Explore at:
    Dataset updated
    Jun 30, 2015
    Authors
    Sumuk Shashidhar
    Description

    sumuks/arxiv-metadata-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Metadata for the analysis dataset.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jan 13, 2025
    Cite
    Rougeron, Virginie; Berry, Antoine; Prugnolle, Franck; Trape, Jean-François; Fontecha, Gustavo A.; Arnathau, Céline; Pradines, Bruno; Houzé, Sandrine; Severini, Carlo; Sáenz, Fabian E.; Fontaine, Michael C.; Noya, Oscar; Degrugillier, Fanny; Lefebvre, Margaux J. M. (2025). Metadata for the analysis dataset. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001319987
    Explore at:
    Dataset updated
    Jan 13, 2025
    Authors
    Rougeron, Virginie; Berry, Antoine; Prugnolle, Franck; Trape, Jean-François; Fontecha, Gustavo A.; Arnathau, Céline; Pradines, Bruno; Houzé, Sandrine; Severini, Carlo; Sáenz, Fabian E.; Fontaine, Michael C.; Noya, Oscar; Degrugillier, Fanny; Lefebvre, Margaux J. M.
    Description

    For each sample, the number of reads and the mean coverage are indicated. Each sample's metadata includes the percentage of the genome covered by at least 1X (% > 1X), 5X (% > 5X), and 10X (% > 10X) sequencing depth. When available, the latitude and longitude are specified. NA: information not available. (XLSX)
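
    To make the coverage columns concrete, here is a minimal sketch of how mean coverage and the percentage of the genome covered at a given depth could be computed from per-base read depths. The function names and the depth values are hypothetical, and the comparison uses "at least" (>=), following the wording of the description rather than the column labels; the authors' actual pipeline is not described in this listing.

```python
def mean_coverage(depths):
    """Mean sequencing depth across all positions."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


def breadth_of_coverage(depths, thresholds=(1, 5, 10)):
    """Percent of positions covered by at least each threshold depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    return {
        t: 100.0 * sum(1 for d in depths if d >= t) / total
        for t in thresholds
    }


# Hypothetical per-base read depths for a tiny eight-position "genome"
depths = [0, 1, 4, 5, 9, 10, 12, 30]
stats = breadth_of_coverage(depths)
avg = mean_coverage(depths)
print(avg, stats)
```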

  8. Metadata

    • catalog.data.gov
    Updated Mar 24, 2024
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). Metadata [Dataset]. https://catalog.data.gov/dataset/metadata-dbea5
    Explore at:
    Dataset updated
    Mar 24, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Dataset includes CMAQ predicted results. This dataset is not publicly accessible because: Shanghai Jiao Tong University created the dataset - EPA does not have the dataset. It can be accessed through the following means: Contact - Ping Liu, School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China, email: ping_liu@sjtu.edu.cn. Format: Dataset includes CMAQ output files using netcdf format. This dataset is associated with the following publication: Chen, H., P. Liu, Q. Wang, R. Huang, and G. Sarwar. Impact and pathway of halogens on atmospheric oxidants in coastal city clusters in the Yangtze River Delta region in China. Atmospheric Pollution Research. Turkish National Committee for Air Pollution Research and Control, Izmir, TURKEY, 15(2): N/A, (2024).

  9. Enterprise Metadata Repository (EMR)

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Nov 22, 2025
    Cite
    Social Security Administration (2025). Enterprise Metadata Repository (EMR) [Dataset]. https://catalog.data.gov/dataset/enterprise-metadata-repository-emr
    Explore at:
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    Social Security Administration (http://ssa.gov/)
    Description

    Stores physical and logical information about relational databases and record structures to assist in data identification and management.

  10. Common Metadata Elements for Cataloging Biomedical Datasets

    • figshare.com
    xlsx
    Updated Jan 20, 2016
    Cite
    Kevin Read (2016). Common Metadata Elements for Cataloging Biomedical Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.1496573.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kevin Read
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset outlines a proposed set of core, minimal metadata elements that can be used to describe biomedical datasets, such as those resulting from research funded by the National Institutes of Health. It can inform efforts to better catalog or index such data to improve discoverability. The proposed metadata elements are based on an analysis of the metadata schemas used in a set of NIH-supported data sharing repositories. Common elements from these data repositories were identified, mapped to existing data-specific metadata standards from existing multidisciplinary data repositories, DataCite and Dryad, and compared with metadata used in MEDLINE records to establish a sustainable and integrated metadata schema. From the mappings, we developed a preliminary set of minimal metadata elements that can be used to describe NIH-funded datasets. Please see the readme file for more details about the individual sheets within the spreadsheet.

  11. Garner Valley DAS Metadata

    • catalog.data.gov
    • gdr.openei.org
    • +3more
    Updated Jan 20, 2025
    Cite
    University of Wisconsin (2025). Garner Valley DAS Metadata [Dataset]. https://catalog.data.gov/dataset/garner-valley-das-metadata-b4222
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    University of Wisconsin
    Description

    Metadata for the data collected at the NEES@UCSB Garner Valley Downhole Array field site on September 10-12, 2013 as part of the larger PoroTomo project.

  12. movie-metadata

    • huggingface.co
    Cite
    datadruids, movie-metadata [Dataset]. https://huggingface.co/datasets/ada-datadruids/movie-metadata
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset authored and provided by
    datadruids
    Description

    ada-datadruids/movie-metadata dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. OpenScience Slovenia document metadata dataset

    • narcis.nl
    • data.mendeley.com
    Updated Mar 9, 2021
    Cite
    Borovič, M (via Mendeley Data) (2021). OpenScience Slovenia document metadata dataset [Dataset]. http://doi.org/10.17632/7wh9xvvmgk.3
    Explore at:
    Dataset updated
    Mar 9, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Borovič, M (via Mendeley Data)
    Area covered
    Slovenia
    Description

    The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of the establishment of the Slovenian Open-Access Infrastructure which defined a unified document collection process and cataloguing for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields, representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The potential of this dataset lies especially in text mining and text classification tasks and can also be used in development or benchmarking of content-based recommender systems on real-world data.

  14. UNIFESP X-ray Body Part - DICOM METADATA CSV

    • kaggle.com
    zip
    Updated Apr 5, 2022
    Cite
    Icaro Bombonato (2022). UNIFESP X-ray Body Part - DICOM METADATA CSV [Dataset]. https://www.kaggle.com/datasets/ibombonato/unifesp-xray-body-part-dicom-metadata-csv
    Explore at:
    Available download formats: zip (282385 bytes)
    Dataset updated
    Apr 5, 2022
    Authors
    Icaro Bombonato
    Description

    This is the metadata from DICOM files for UNIFESP X-ray Body Part Competition in csv format

    Competition and original dataset:

    https://www.kaggle.com/competitions/unifesp-x-ray-body-part-classifier/

    Acknowledgements: We thank Sarah Lustosa Haiek, Julia Tagliaferri, Lucas Diniz, and Rogerio Jadjiski for annotating this dataset. We thank the PI Nitamar Abdala, MD, PhD, for supporting this work. We thank Ernandez, our PACS admin, and Jefferson, our IT manager. We thank MD.ai for providing the annotation platform.

  15. Metadata Form Template

    • performance.tempe.gov
    • datasets.ai
    • +8more
    Updated Jun 5, 2020
    Cite
    City of Tempe (2020). Metadata Form Template [Dataset]. https://performance.tempe.gov/documents/c450d13c28ed4b1888ed6ab9d0363473
    Explore at:
    Dataset updated
    Jun 5, 2020
    Dataset authored and provided by
    City of Tempe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Metadata form template for Tempe Open Data.

  16. Metadata dataset

    • kaggle.com
    zip
    Updated Apr 26, 2021
    Cite
    Nicole Wong98 (2021). Metadata dataset [Dataset]. https://www.kaggle.com/datasets/nicolewong98/metadata-dataset
    Explore at:
    Available download formats: zip (1896422671 bytes)
    Dataset updated
    Apr 26, 2021
    Authors
    Nicole Wong98
    Description

    Dataset

    This dataset was created by Nicole Wong98

    Contents

  17. Metadata Catalogue

    • spenergynetworks.opendatasoft.com
    csv, excel, json
    Updated Nov 1, 2025
    Cite
    (2025). Metadata Catalogue [Dataset]. https://spenergynetworks.opendatasoft.com/explore/dataset/metadata-catalogue/
    Explore at:
    Available download formats: csv, json, excel
    Dataset updated
    Nov 1, 2025
    Description

    A dataset containing the metadata for all openly published datasets on the SP Energy Networks Open Data Portal. All metadata conforms to the Dublin Core metadata standard - a set of 15 'core' elements. Download dataset metadata (JSON). If you wish to provide feedback at a dataset or row level, please click on the “Feedback” tab above.
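
    The 15 elements referred to here are those of the Dublin Core Metadata Element Set. The sketch below lists them and checks a record against that set; the `check_record` helper and the example record are hypothetical, and the field names actually used by the portal's JSON export are not confirmed by this listing.

```python
# The 15 elements of the Dublin Core Metadata Element Set
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def check_record(record):
    """Report keys outside the 15 core elements and core elements not present.

    Hypothetical helper: the standard itself treats every element as optional,
    so 'missing' is informational rather than an error.
    """
    keys = {key.lower() for key in record}
    return {
        "unknown": sorted(keys - DUBLIN_CORE_ELEMENTS),
        "missing": sorted(DUBLIN_CORE_ELEMENTS - keys),
    }


# Made-up record for illustration only
record = {
    "Title": "Metadata Catalogue",
    "Creator": "Example publisher",
    "Date": "2025-11-01",
    "Licence": "unspecified",
}
report = check_record(record)
print(report["unknown"], len(report["missing"]))
```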

  18. Gede Heritage Spatial Documentation Metadata Dataset

    • zivahub.uct.ac.za
    jpeg
    Updated Nov 29, 2025
    Cite
    Heinz Rüther; Roshan Bhurtha; Ralph Schröder; Stephen Wessels; Bruce McDonald (2025). Gede Heritage Spatial Documentation Metadata Dataset [Dataset]. http://doi.org/10.25375/uct.11854452.v2
    Explore at:
    Available download formats: jpeg
    Dataset updated
    Nov 29, 2025
    Dataset provided by
    University of Cape Town
    Authors
    Heinz Rüther; Roshan Bhurtha; Ralph Schröder; Stephen Wessels; Bruce McDonald
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This master metadata spreadsheet documents all of the Gede ruins heritage items published by the Zamani Project.

    The information in this site description is provided for contextual purposes only and should not be regarded as a primary source.

    Gede is a Swahili archaeological site comprising coral stone structures, including mosques, houses, and tombs arranged within a walled town layout. Architectural features such as mihrabs, water cisterns, and decorative niches reflect Islamic influence and urban planning. Excavations have revealed trade goods and domestic artifacts, indicating participation in Indian Ocean commerce. Gede provides insights into Swahili cultural identity, religious practice, and economic networks.

    Gede is listed as the UNESCO World Heritage Site, 'The Historic Town and Archaeological Site of Gedi'.

    The Zamani Project seeks to increase awareness and knowledge of tangible cultural heritage in Africa and internationally by creating metrically accurate digital representations of historical sites. Digital spatial data of cultural heritage sites can be used for research and education, for restoration and conservation, and as a record for future generations. The Zamani Project operates as a non-profit organisation within the University of Cape Town.

    Special thanks to the Saville Foundation, and the Andrew W. Mellon Foundation, among others, for their contributions to the digital documentation of this heritage site.

    If you believe any information in this description is incorrect, please contact the repository administrators.

  19. GOLD metadata

    • bioregistry.io
    Updated Apr 27, 2021
    Cite
    (2021). GOLD metadata [Dataset]. https://bioregistry.io/gold.meta
    Explore at:
    Dataset updated
    Apr 27, 2021
    Description
    • DEPRECATION NOTE - Please keep in mind that this namespace has been superseded by the ‘gold’ prefix at https://registry.identifiers.org/registry/gold. This namespace is kept here to support already existing citations; new ones should use the ‘gold’ namespace.

    GOLD (Genomes OnLine Database) is a resource for centralized monitoring of genome and metagenome projects worldwide. It stores information on complete and ongoing projects, along with their associated metadata. This collection references metadata associated with samples.

  20. Metadata An analysis of degradation in low-cost particulate matter sensors

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 15, 2023
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). Metadata An analysis of degradation in low-cost particulate matter sensors [Dataset]. https://catalog.data.gov/dataset/metadata-an-analysis-of-degradation-in-low-cost-particulate-matter-sensors
    Explore at:
    Dataset updated
    Apr 15, 2023
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This file describes where to find the dataset used for this paper (PurpleAir and AQS) and the data fields used in the analysis. Contact the corresponding author for access to the code used to generate the dataset. This dataset is associated with the following publication: deSouza, P., K. Barkjohn, A. Clements, J. Lee, R. Kahn, and B. Crawford. An analysis of degradation in low-cost particulate matter sensors. Environmental Science: Atmospheres. Royal Society of Chemistry, Cambridge, UK, NA, (2023).
