100+ datasets found
  1. danbooru2023-metadata-database

    • huggingface.co
    Updated Jan 11, 2024
    Cite
    Shih-Ying Yeh (2024). danbooru2023-metadata-database [Dataset]. https://huggingface.co/datasets/KBlueLeaf/danbooru2023-metadata-database
    Dataset updated
    Jan 11, 2024
    Authors
    Shih-Ying Yeh
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Metadata Database for Danbooru2023

    Danbooru 2023 dataset: https://huggingface.co/datasets/nyanko7/danbooru2023 The latest entry in this database is id 7,866,491, which is newer than nyanko7's dataset. This dataset contains a SQLite db file that holds all the tag and post metadata. A Peewee ORM config file is provided too; please check it for more information (especially on how posts and tags are linked together). The original data is from the official dump of the posts info.… See the full description on the dataset page: https://huggingface.co/datasets/KBlueLeaf/danbooru2023-metadata-database.
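
    Since the description only says that the metadata ships as a SQLite file, a reasonable first step is to list its tables and columns before relying on the Peewee config. The following is a minimal sketch using Python's standard sqlite3 module. The function name list_tables is illustrative, and the table and column names it reports depend entirely on the downloaded file; none are assumed here.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the PRAGMA.
            quoted = '"' + name.replace('"', '""') + '"'
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[name] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()
```

    Calling list_tables with the path of the downloaded db file (the actual file name is not given in the description) returns a plain dictionary that can be printed or compared against the Peewee model definitions.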

  2. Tethys Acoustic Metadata Database

    • catalog.data.gov
    • fisheries.noaa.gov
    Updated May 24, 2025
    Cite
    (Point of Contact, Custodian) (2025). Tethys Acoustic Metadata Database [Dataset]. https://catalog.data.gov/dataset/tethys-acoustic-metadata-database1
    Dataset updated
    May 24, 2025
    Dataset provided by
    (Point of Contact, Custodian)
    Description

    The Tethys database houses the metadata associated with the acoustic data collection efforts by the Passive Acoustic Group. These metadata include dates, locations and sampling rate, among other things. The database platform itself was developed by colleagues at San Diego State University, and is freely available and open source. See citation details for website link.

  3. Searchable Index of Metadata Aggregators

    • data-staging.niaid.nih.gov
    Updated Jan 29, 2022
    Cite
    Li, Winnie Ak Wai; Payne, Karen (2022). Searchable Index of Metadata Aggregators [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4589049
    Dataset updated
    Jan 29, 2022
    Dataset provided by
    International Technology Office
    Authors
    Li, Winnie Ak Wai; Payne, Karen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Searchable Index of Metadata Aggregators is a database that stores general information about metadata aggregators. This database is accompanied by the “A WDS guide to Metadata Aggregators for Repository Managers”. The Searchable Index of Metadata Aggregators is an up-to-date catalogue of Dataset Metadata Aggregators (DMAs), implemented as an access database. It was designed to fill in a gap found by the Harvestable Metadata Services Working Group (HMetS-WG) members of the World Data System’s International Technology Office (WDS-ITO). These include up-to-date resources giving an overview of current infrastructures used to syndicate dataset metadata. The database contains information on DMAs' supported metadata standards and software interfaces, as well as documentation on how to be aggregated by each.

    The WDS Guide to Metadata Aggregators is a guidance document for the associated Searchable Index of Metadata Aggregators. We have defined DMAs as federated service infrastructures that foster the findability and accessibility of data products by enabling access to multiple, distributed metadata records via a single search interface. This guide gives a description of this catalogue and general guidance on how to use it. In the sections that follow, we give a short background to the Harvestable Metadata Services-Working Group project. Then, we outline the project's research methodology and the properties of the searchable index. Finally, we discuss this project's limitations, as well as its future development. Providing metadata to aggregators can significantly improve the findability of research data products.

    Together, this guidance document and dataset package are designed to provide research data repository managers with options for participation in federated research data systems, and support institutional repositories' harvestable metadata service implementation strategies. In addition, as developers in the global research data management community seek to create pathways and workflows across data, software and compute resources, we anticipate that they're likely to prioritize connecting sites, organizations and services that have already done a lot of work harmonizing content from disparate providers. In this context, this resource will be helpful for creating roadmaps and implementation plans for integration across science clouds.

  4. Data from: Metadata File connected to the OYUP database

    • zenodo.org
    • openagrar.de
    • +2more
    bin
    Updated Apr 30, 2025
    Cite
    Maike Krauss; Bernhard Schlatter; José Manuel Blanco-Moreno; Jaime Vila-Traver; Gabriele Ridolfi (2025). Metadata File connected to the OYUP database [Dataset]. http://doi.org/10.5281/zenodo.15276083
    Available download formats: bin
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maike Krauss; Bernhard Schlatter; José Manuel Blanco-Moreno; Jaime Vila-Traver; Gabriele Ridolfi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is part of the database compiled as an outcome of Work Area 1 in project OrganicYieldsUP. This Excel file describes the content of the OYUP relational database for each table and column.

    The main sheet "table_schema_oyup" contains:

    ORDINAL_POSITION_per_table: Unique row number for each table. Can be used for sorting.
    TABLE_NAME: Name of Table being part of the relational database.
    COLUMN_NAME: Name of Column in the respective table.
    COLUMN DESCRIPTION: Description of the Column content.
    DATA_TYPE: SQLite data type of the Column.
    TABLE_COLUMN_ID: Letter-based ID for the Column that was used during data upload into the database. Can be used to link gap filling information to the gap filled indicator.

    The additional sheet "quality_indicator_description" contains:

    PARAMETER: Quality indicator (unit) reported
    CROP: Crop that the quality indicator refers to
    DESCRIPTION: Description of the quality indicator (unit)

  5. REDIRE Database metadata subject counts

    • redire.uni-bonn.de
    Updated Sep 24, 2025
    Cite
    (2025). REDIRE Database metadata subject counts [Dataset]. https://redire.uni-bonn.de/data.html
    Dataset updated
    Sep 24, 2025
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Unique values and counts of metadata subject fields.

  6. Enterprise Metadata Repository (EMR)

    • data.wu.ac.at
    • s.cnmilf.com
    • +1more
    Updated Aug 31, 2018
    Cite
    Social Security Administration (2018). Enterprise Metadata Repository (EMR) [Dataset]. https://data.wu.ac.at/schema/data_gov/NDU5NTU2MjktZjg1Yy00NjI0LWI2M2UtMzc3ZTU2ZDVjMzFk
    Dataset updated
    Aug 31, 2018
    Dataset provided by
    Social Security Administration (http://ssa.gov/)
    Description

    Stores physical and logical information about relational databases and record structures to assist in data identification and management.

  7. Seaborne metadata database

    • researchdata.edu.au
    • data.csiro.au
    datadownload
    Updated Jul 9, 2024
    Cite
    Victoria Graham; Jeremy De Valck; Diane Jarvis; Anthea Coggan; Akshat Sehgal; Lauren Stevens; Petina Pert (2024). Seaborne metadata database [Dataset]. http://doi.org/10.25919/WKKK-6G61
    Available download formats: datadownload
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    Commonwealth Scientific and Industrial Research Organisation
    Authors
    Victoria Graham; Jeremy De Valck; Diane Jarvis; Anthea Coggan; Akshat Sehgal; Lauren Stevens; Petina Pert
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 1, 2021 - Jul 31, 2024
    Description

    The SEABORNE (Sustainable UsE And Benefits fOR mariNE) project has consolidated and synthesised existing information about who is using the Reef, how it is being used and what the benefits are from this use. SEABORNE began in November 2021, and initially, we were provided with a list of potential datasets relevant to our project in a spreadsheet. To add to this, we continued to search various data portals online for additional datasets relevant to our project, particularly focusing on the Great Barrier Reef. We recorded these initially in an Excel spreadsheet. We then transferred this to an MS Access database and developed a more user-friendly entry form. Within the MS Access database, there is one table that stores all the metadata records entered and another table that stores the static preview images. There are 58 fields (which have been described in a data dictionary), some of which are mandatory. At the moment there are 3 metadata records entered and we expect this to grow to 50-100 records by the completion of the project. Lineage: Data was produced by examining each dataset's metadata and documenting various features of each of the individual datasets and how useful they were for examining ecosystem services. Data was initially entered in Excel, then migrated to an MS Access database, and then imported or read in by a Shiny R app.

  8. Cnidarian Microbiome Metadata

    • figshare.com
    txt
    Updated Feb 6, 2023
    Cite
    Mark McCauley (2023). Cnidarian Microbiome Metadata [Dataset]. http://doi.org/10.6084/m9.figshare.22012886.v2
    Available download formats: txt
    Dataset updated
    Feb 6, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mark McCauley
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Metadata for the 16,012 microbial samples included in this database. Metadata was collated from the originally published studies, available supplementary information, and online databases.

  9. Data from: Metadata Standard

    • fairsharing.org
    Updated Jun 28, 2017
    Cite
    University of Oxford, Dept. of Engineering Science, Data Readiness Group (2017). Metadata Standard [Dataset]. https://fairsharing.org/
    Dataset updated
    Jun 28, 2017
    Dataset authored and provided by
    University of Oxford, Dept. of Engineering Science, Data Readiness Group
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    A manually curated registry of standards, split into three types - Terminology Artifacts (ontologies, e.g. Gene Ontology), Models and Formats (conceptual schema, formats, data models, e.g. FASTA), and Reporting Guidelines (e.g. the ARRIVE guidelines for in vivo animal testing). These are linked to the databases that implement them and the funder and journal publisher data policies that recommend or endorse their use.

  10. Extracted Schemas from the Life Sciences Linked Open Data Cloud

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Maulik Kamdar (2023). Extracted Schemas from the Life Sciences Linked Open Data Cloud [Dataset]. http://doi.org/10.6084/m9.figshare.12402425.v2
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Maulik Kamdar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is related to the manuscript "An empirical meta-analysis of the life sciences linked open data on the web" published at Nature Scientific Data. If you use the dataset, please cite the manuscript as follows:

    Kamdar, M.R., Musen, M.A. An empirical meta-analysis of the life sciences linked open data on the web. Sci Data 8, 24 (2021). https://doi.org/10.1038/s41597-021-00797-y

    We have extracted schemas from more than 80 publicly available biomedical linked data graphs in the Life Sciences Linked Open Data (LSLOD) cloud into an LSLOD schema graph and conducted an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud. The dataset published here contains the following files:

    - The set of Linked Data Graphs from the LSLOD cloud from which schemas are extracted.
    - Refined Sets of extracted classes, object properties, data properties, and datatypes, shared across the Linked Data Graphs on LSLOD cloud. Where the schema element is reused from a Linked Open Vocabulary or an ontology, it is explicitly indicated.
    - The LSLOD Schema Graph, which contains all the above extracted schema elements interlinked with each other based on the underlying content. Sample instances and sample assertions are also provided along with broad level characteristics of the modeled content.

    The LSLOD Schema Graph is saved as a JSON Pickle File. To read the JSON object in this Pickle file use the Python command as follows:

        with open('LSLOD-Schema-Graph.json.pickle', 'rb') as infile:
            x = pickle.load(infile, encoding='iso-8859-1')

    Check the Referenced Link for more details on this research, raw data files, and code references.
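
    The loading command quoted in the description does not show the import it needs or what to do with the result. Below is a minimal, self-contained sketch that wraps the same pickle.load call and adds a small type-agnostic summary. The helper names load_schema_graph and summarize are illustrative, and the structure of the loaded object is not assumed. Note that unpickling can execute arbitrary code, so only load pickle files from a source you trust.

```python
import pickle


def load_schema_graph(path):
    """Load a pickled object using the encoding named in the dataset description."""
    with open(path, "rb") as infile:
        return pickle.load(infile, encoding="iso-8859-1")


def summarize(obj, limit=5):
    """Return a small summary of the loaded object without assuming its structure."""
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["size"] = len(obj)
    if isinstance(obj, dict):
        # Show only the first few keys so large graphs stay readable.
        summary["sample_keys"] = list(obj)[:limit]
    return summary
```

    With the file from this dataset, summarize(load_schema_graph('LSLOD-Schema-Graph.json.pickle')) gives a quick look at the top-level type and size before exploring further.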

  11. PoliS-Lombardy - Catalogue of data, metadata and databases

    • gimi9.com
    Cite
    PoliS-Lombardy - Catalogue of data, metadata and databases | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_el-rb6i-vtfi/
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Lombardy
    Description

    Catalogue of data, metadata and databases that PoliS-Lombardia manages on behalf of the Region pursuant to Article 52(1) of the Digital Administration Code.

  12. A database of reviewed datasets to investigate the use of metadata and...

    • data.4tu.nl
    zip
    Updated Oct 22, 2025
    Cite
    Florian J. Ellsäßer; Alice Nikuze (2025). A database of reviewed datasets to investigate the use of metadata and adoption of metadata standards for Uncrewed Aerial Vehicle (UAV) data [Dataset]. http://doi.org/10.4121/d845f33d-e199-4c96-8a1f-1db2ad9f2a9c.v1
    Available download formats: zip
    Dataset updated
    Oct 22, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Florian J. Ellsäßer; Alice Nikuze
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The database was developed as part of a research project investigating the use and adoption of metadata standards for UAV (Uncrewed Aerial Vehicle) data. It compiles a list of published datasets containing UAV data or products generated based on UAV data identified through a systematic search of public data repositories. The search covered established data platforms, including DANS, 4TU.ResearchData, DataONE Science Data Bank, DRYAD, Figshare and Zenodo. In addition, a broader internet search using search engines such as Google, DuckDuckGo, Bing, and Perplexity was conducted to identify other publicly accessible UAV datasets. Only datasets with a persistent identifier, such as a DOI (Digital Object Identifier), were included.

  13. Metadata for FIA P3 data on lichen

    • catalog.data.gov
    • datasets.ai
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Metadata for FIA P3 data on lichen [Dataset]. https://catalog.data.gov/dataset/metadata-for-fia-p3-data-on-lichen
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These data describe the abundance of individual lichen species across the U.S. as recorded in the Forest Health and Monitoring dataset of the Forest Inventory and Analysis program (i.e. Phase 3 plots). This dataset is not publicly accessible because: These data are already housed on the USFS Forest Inventory and Analysis site (see below). It can be accessed through the following means: The lichen data for this product are from the USDA Forest Service (USFS) Forest Inventory and Analysis (FIA) Phase 3 (P3) dataset - Forest Health and Monitoring. The metadata and database description for the FIA-P3 is here (https://www.fia.fs.fed.us/library/database-documentation/). The data itself is located at the USFS Data Mart here (https://apps.fs.usda.gov/fia/datamart/CSV/datamart_csv.html) in two files: “LICHEN_PLOT_SUMMARY.zip,” and “LICHEN_VISIT.zip.” Point of contact: Linda Geiser, lgeiser@fs.fed.us. Format: The data are in .csv format.

  14. Metadata of the database on requests for public access to Council documents

    • data.europa.eu
    • data.wu.ac.at
    rdf xml, unknown, zip
    Cite
    Council of the European Union, Metadata of the database on requests for public access to Council documents [Dataset]. https://data.europa.eu/data/datasets/public-access-requests?locale=sl
    Available download formats: zip, unknown, rdf xml
    Dataset authored and provided by
    Council of the European Union (http://consilium.europa.eu/)
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    Each year the Council publishes an annual report on the implementation of regulation 1049/2001 on access to documents. The annual report contains statistical information on the requests for public access received by the Council. With the exception of personal data, information on such requests is public.

    This dataset contains the following information on the requests for public access to documents received by the Council:

    1. General information on the applicant (anonymous): professional activity of applicant; geographic origin

    2. General Information on the request: request number; type of request (initial request, confirmatory application); date of request; deadline to reply; extended deadline to reply; date of reply; effort spent; follow-up; policy area(s).

    3. Information on the requested documents: publication status (public or not); type of reply; document category; document number

  15. Metadata of the Danube Delta Database

    • dataportal.ponderful.eu
    Updated Jun 23, 2017
    Cite
    (2017). Metadata of the Danube Delta Database - Dataset - CKAN [Dataset]. https://dataportal.ponderful.eu/dataset/metadata-of-the-danube-delta-database
    Dataset updated
    Jun 23, 2017
    Area covered
    Danube River, Danube Delta
    Description

    A description of biological and ecological data of the Danube delta lakes and channels is presented. The biological indicators refer to aquatic macrophytes, fish, zooplankton, and macro-invertebrates. Environmental data include physico-chemical data as well as hydrological parameters. More information on this dataset can be found in the Freshwater Metadatabase - MARS_12 (http://www.freshwatermetadata.eu/metadb/bf_mdb_view.php?entryID=MARS_12).

  16. River Recreation Research Database timeline metadata

    • lib.uidaho.edu
    json
    Updated Jan 2, 2024
    Cite
    (2024). River Recreation Research Database timeline metadata [Dataset]. https://www.lib.uidaho.edu/digital/rrrd/data.html
    Available download formats: json
    Dataset updated
    Jan 2, 2024
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-based metadata formatted for TimelineJS or other applications.

  17. WormBiome - Metadata file

    • data.niaid.nih.gov
    Updated Feb 21, 2024
    Cite
    Assie, Adrien (2024). WormBiome - Metadata file [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10139660
    Dataset updated
    Feb 21, 2024
    Dataset provided by
    Baylor College of Medicine
    Authors
    Assie, Adrien
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file is the metadata associated with all the genomic annotation curation in the Wormbiome collection.

    The Wormbiome collection is an online database dedicated to centralizing all the information related to bacteria associated with C. elegans. More information is available at wormbiome.org.

  18. International Comparative Legislatures Metadata Database: Using Roll Call...

    • data.library.wustl.edu
    sql
    Updated Aug 22, 2016
    Cite
    Crisp, Brian; Gabel, Matt; Carrubba, Clifford (2016). International Comparative Legislatures Metadata Database: Using Roll Call Votes to Understand Legislative Behavior [Dataset]. http://doi.org/10.7936/K7Z60NG4
    Available download formats: sql (4179434)
    Dataset updated
    Aug 22, 2016
    Dataset provided by
    Emory University
    Washington University in St. Louis
    Authors
    Crisp, Brian; Gabel, Matt; Carrubba, Clifford
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In most democracies, the public record of legislative votes in national and local parliaments is an important basis for holding elected officials accountable. In political science, that record is also an important source of data on legislator and party behavior. In practice, many legislatures create a public record of the votes cast by individual legislators for only a fraction of the issues on which votes occur. These recorded votes often are not a representative sample of all votes cast and may exhibit systematic biases that have implications for political accountability and for the science of political behavior. Therefore, understanding the characteristics of the issues that receive a publicly recorded vote (a roll-call vote) is essential to our understanding of democratic processes and evaluating the limits of scientific inferences that can be drawn from roll-call data. This data set advances our understanding of the voting record through examination of national parliamentary bodies around the world.

  19. REDIRE Database Metadata Facets

    • redire.uni-bonn.de
    Updated Sep 24, 2025
    Cite
    (2025). REDIRE Database Metadata Facets [Dataset]. https://redire.uni-bonn.de/data.html
    Dataset updated
    Sep 24, 2025
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Unique values and counts of metadata facet fields.

  20. 🎬 Movies Metadata Cleaned Dataset (1900–2025)

    • kaggle.com
    zip
    Updated Nov 7, 2025
    Cite
    Mustafa Sayed Saeed (2025). 🎬 Movies Metadata Cleaned Dataset (1900–2025) [Dataset]. https://www.kaggle.com/datasets/mustafasayed1181/movies-metadata-cleaned-dataset-19002025
    Available download formats: zip (115088509 bytes)
    Dataset updated
    Nov 7, 2025
    Authors
    Mustafa Sayed Saeed
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    🎬 Overview

    This dataset contains a cleaned and structured collection of movie metadata sourced from The Movie Database (TMDB), covering films released between 1900 and 2025. It includes over 946,000 movies with detailed information such as genres, production companies, budgets, revenues, popularity, ratings, and more.

    This dataset is ideal for data science, analytics, and machine learning projects related to the film industry — including trend analysis, box office prediction, and recommendation systems.

    📊 Dataset Columns Description

    Column: Description
    id: Unique movie identifier
    title: Official movie title
    adult: Boolean flag indicating adult content
    original_language: Original spoken language (ISO 639-1 code)
    origin_country: List of production countries
    release_date: Movie release date
    genre_names: List of genres associated with the movie
    production_company_names: Names of involved production companies
    budget: Reported production budget (USD)
    revenue: Worldwide gross revenue (USD)
    runtime: Duration in minutes
    popularity: Popularity score (as provided by TMDB)
    vote_average: Average user rating
    vote_count: Number of votes received

    🧠 Potential Use Cases

    🎥 Movie trend analysis across decades

    💰 Budget vs. revenue ROI exploration

    ⭐ Predictive modeling for ratings or popularity

    🌍 Cross-cultural film analysis by countries and languages

    🧩 Recommender systems and content-based filtering projects
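
    As an illustration of the budget vs. revenue use case listed above, here is a minimal Python sketch that computes return on investment, (revenue - budget) / budget, per title. It assumes the title, budget and revenue columns named in the column description. The sample rows are made up for illustration, and treating a budget of 0 as "not reported" is an assumption, not something the dataset page states.

```python
import csv
import io


def roi_by_title(rows):
    """Return {title: (revenue - budget) / budget} for rows with a positive budget."""
    result = {}
    for row in rows:
        try:
            title = row["title"]
            budget = float(row["budget"])
            revenue = float(row["revenue"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or non-numeric values
        if budget <= 0:
            continue  # assumption: a non-positive budget means it was not reported
        result[title] = (revenue - budget) / budget
    return result


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE = """title,budget,revenue
Example A,100,250
Example B,0,50
Example C,200,100
"""

sample_roi = roi_by_title(csv.DictReader(io.StringIO(SAMPLE)))
print(sample_roi)
```

    The same function accepts any iterable of dictionaries, so it can be fed from csv.DictReader over the real file once its name and encoding are known.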

    ⚙️ Data Source & Attribution

    The data in this dataset was collected and preprocessed using the TMDB API. All movie information is © TMDB, provided under their Terms of Use.

    This dataset is not endorsed or certified by TMDB. Users must comply with TMDB’s attribution and API usage policies when using this data.

    🙌 Acknowledgements

    Special thanks to The Movie Database (TMDB) for providing open access to their rich movie metadata. Dataset cleaned, organized, and published by Mustafa Sayed Said 🧑‍💻.

    🏷️ Tags

    movies film cinema tmdb data-cleaning machine-learning dataset EDA entertainment analytics
