100+ datasets found
  1. O*NET Database

    • onetcenter.org
    excel, mysql, oracle +2
    Updated Dec 16, 2025
    Cite
    National Center for O*NET Development (2025). O*NET Database [Dataset]. https://www.onetcenter.org/database.html
    Available download formats: oracle, sql server, text, mysql, excel
    Dataset updated
    Dec 16, 2025
    Dataset provided by
    Occupational Information Network
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Dataset funded by
    US Department of Labor, Employment and Training Administration
    Description

    The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.

    Data content areas include:

    • Worker Characteristics (e.g., Abilities, Interests, Work Styles)
    • Worker Requirements (e.g., Education, Knowledge, Skills)
    • Experience Requirements (e.g., On-the-Job Training, Work Experience)
    • Occupational Requirements (e.g., Detailed Work Activities, Work Context)
    • Occupation-Specific Information (e.g., Job Titles, Tasks, Technology Skills)

  2. DWR Continuous Data Download Links

    • catalog.data.gov
    • data.ca.gov
    • +1 more
    Updated Jan 23, 2026
    + more versions
    Cite
    California Department of Water Resources (2026). DWR Continuous Data Download Links [Dataset]. https://catalog.data.gov/dataset/dwr-continuous-data-download-links-90cc9
    Dataset updated
    Jan 23, 2026
    Dataset provided by
    California Department of Water Resources
    Description

    Stations and a table of download links for time-series data, from DWR's continuous environmental monitoring database. For more information, see DWR's Water Data Library, continuous data section: https://wdl.water.ca.gov/ContinuousData.aspx, where this data is also available.

  3. E-commerce dataset by Olist (SQLite)

    • kaggle.com
    zip
    Updated Apr 28, 2024
    Cite
    Terenci Claramunt (2024). E-commerce dataset by Olist (SQLite) [Dataset]. https://www.kaggle.com/datasets/terencicp/e-commerce-dataset-by-olist-as-an-sqlite-database
    Available download formats: zip (51085670 bytes)
    Dataset updated
    Apr 28, 2024
    Authors
    Terenci Claramunt
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    I imported the two Olist Kaggle datasets into an SQLite database. I modified the original table names to make them shorter and easier to understand. Here's the Entity-Relationship Diagram of the resulting SQLite database:

    Database Schema: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2473556%2F23a7d4d8cd99e36e32e57303eb804fff%2Fdb-schema.png?generation=1714391550829633&alt=media

    Data sources:

    https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce

    https://www.kaggle.com/datasets/olistbr/marketing-funnel-olist


    I used this database as a data source for my notebook:

    SQL Challenge: E-commerce data analysis
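
    A quick way to get oriented in an SQLite file like this one is to list its tables and row counts before writing queries. The sketch below uses only Python's standard sqlite3 module; the file name in the usage note is a placeholder, since the listing does not state the name of the file inside the zip, and the two tables in the demo are made up.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of user tables in a SQLite database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's own bookkeeping tables (their names start with "sqlite_").
    return [name for (name,) in rows if not name.startswith("sqlite_")]


def row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Map each user table to its number of rows."""
    counts = {}
    for table in list_tables(conn):
        # Names come from sqlite_master, so double-quoting them is enough here.
        (n,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        counts[table] = n
    return counts


if __name__ == "__main__":
    # In-memory stand-in with hypothetical tables, not the real Olist schema.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE orders (id INTEGER)")
    demo.execute("CREATE TABLE customers (id INTEGER)")
    demo.execute("INSERT INTO orders VALUES (1), (2)")
    print(row_counts(demo))
```

    To run it against the downloaded database, replace the in-memory connection with sqlite3.connect("path/to/extracted-file.sqlite"), where the path is whatever the unzipped archive actually contains.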

  4. classicmodels

    • kaggle.com
    zip
    Updated Dec 10, 2022
    + more versions
    Cite
    Marta Tavares (2022). classicmodels [Dataset]. https://www.kaggle.com/datasets/martatavares/classicmodels
    Available download formats: zip (72431 bytes)
    Dataset updated
    Dec 10, 2022
    Authors
    Marta Tavares
    Description

    MySQL Classicmodels sample database

    The MySQL sample database schema consists of the following tables:

    • Customers: stores customer’s data.
    • Products: stores a list of scale model cars.
    • ProductLines: stores a list of product line categories.
    • Orders: stores sales orders placed by customers.
    • OrderDetails: stores sales order line items for each sales order.
    • Payments: stores payments made by customers based on their accounts.
    • Employees: stores all employee information as well as the organization structure such as who reports to whom.
    • Offices: stores sales office data.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F8652778%2Fefc56365be54c0e2591a1aefa5041f36%2FMySQL-Sample-Database-Schema.png?generation=1670498341027618&alt=media
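
    The Orders and OrderDetails relationship described above (one order, many line items) can be sketched as a join that totals each order. This is an illustration only: it uses SQLite through Python rather than MySQL, the rows are made up, and the column names (orderNumber, quantityOrdered, priceEach) are assumptions about the schema that the listing itself does not spell out.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in with made-up rows (not the real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (orderNumber INTEGER PRIMARY KEY, customerNumber INTEGER);
        CREATE TABLE orderdetails (
            orderNumber INTEGER REFERENCES orders(orderNumber),
            productCode TEXT,
            quantityOrdered INTEGER,
            priceEach REAL
        );
        INSERT INTO orders VALUES (10100, 1), (10101, 2);
        INSERT INTO orderdetails VALUES
            (10100, 'S10_0001', 2, 10.0),
            (10100, 'S10_0002', 1, 5.5),
            (10101, 'S10_0001', 3, 10.0);
        """
    )
    return conn


def order_totals(conn: sqlite3.Connection) -> list[tuple[int, float]]:
    """Sum quantity * unit price over the line items of each order."""
    return conn.execute(
        """
        SELECT o.orderNumber, SUM(d.quantityOrdered * d.priceEach) AS total
        FROM orders AS o
        JOIN orderdetails AS d ON d.orderNumber = o.orderNumber
        GROUP BY o.orderNumber
        ORDER BY o.orderNumber
        """
    ).fetchall()


if __name__ == "__main__":
    for order_number, total in order_totals(demo_connection()):
        print(order_number, total)
```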

  5. Dr. Duke's Phytochemical and Ethnobotanical Databases

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Dec 2, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Dr. Duke's Phytochemical and Ethnobotanical Databases [Dataset]. https://catalog.data.gov/dataset/dr-dukes-phytochemical-and-ethnobotanical-databases-0849e
    Dataset updated
    Dec 2, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Of interest to pharmaceutical, nutritional, and biomedical researchers, as well as individuals and companies involved with alternative therapies and herbal products, this database is one of the world's leading repositories of ethnobotanical data. It evolved out of the extensive compilations by the former Chief of USDA's Economic Botany Laboratory in the Agricultural Research Service in Beltsville, Maryland, in particular his popular Handbook of phytochemical constituents of GRAS herbs and other economic plants (CRC Press, Boca Raton, FL, 1992). In addition to Duke's own publications, the database documents phytochemical information and quantitative data collected over many years through research results presented at meetings and symposia, and findings from the published scientific literature. The current Phytochemical and Ethnobotanical databases facilitate plant, chemical, bioactivity, and ethnobotany searches. A large number of plants and their chemical profiles are covered, and data are structured to support browsing and searching in several user-focused ways. For example, users can:

    • get a list of chemicals and activities for a specific plant of interest, using either its scientific or common name
    • download a list of chemicals and their known activities in PDF or spreadsheet form
    • find plants with chemicals known for a specific biological activity
    • display a list of chemicals with their LD toxicity data
    • find plants with potential cancer-preventing activity
    • display a list of plants for a given ethnobotanical use
    • find out which plants have the highest levels of a specific chemical

    References to the supporting scientific publications are provided for each specific result.

    Resources in this dataset:

    • Resource Title: Duke-Source-CSV.zip. File Name: Duke-Source-CSV.zip. Resource Description: Dr. Duke's Phytochemistry and Ethnobotany - raw database tables for archival purposes. Visit https://phytochem.nal.usda.gov/phytochem/search for the interactive web version of the database.
    • Resource Title: Data Dictionary (preliminary). File Name: DrDukesDatabaseDataDictionary-prelim.csv. Resource Description: This Data Dictionary describes the columns for each table. [Note that this is in progress and some variables are yet to be defined or are unused in the current implementation. Please send comments/suggestions to nal-adc-curator@ars.usda.gov]

  6. Chinook Database

    • kaggle.com
    zip
    Updated Nov 7, 2023
    Cite
    Rana Sabry (2023). Chinook Database [Dataset]. https://www.kaggle.com/datasets/ranasabrii/chinook
    Available download formats: zip (448874 bytes)
    Dataset updated
    Nov 7, 2023
    Authors
    Rana Sabry
    Description

    The Chinook database was created as an alternative to the Northwind database. It represents a digital media store, including tables for artists, albums, media tracks, invoices and customers.

    The Chinook database is available on GitHub. It’s available for various DBMSs including MySQL, SQL Server, SQL Server Compact, PostgreSQL, Oracle, DB2, and of course, SQLite.

  7. Download CSV DB

    • maclookup.app
    json
    Updated Jan 30, 2026
    Cite
    (2026). Download CSV DB [Dataset]. https://maclookup.app/downloads/csv-database
    Available download formats: json
    Dataset updated
    Jan 30, 2026
    Description

    Free, daily updated MAC prefix and vendor CSV database. Download now for accurate device identification.

  8. 🇺🇸 US Zip Codes Database (Oct 04 2024 update)

    • kaggle.com
    zip
    Updated Oct 10, 2024
    Cite
    BwandoWando (2024). 🇺🇸 US Zip Codes Database (Oct 04 2024 update) [Dataset]. https://www.kaggle.com/datasets/bwandowando/us-zip-codes-database-from-simplemaps-com
    Available download formats: zip (4195930 bytes)
    Dataset updated
    Oct 10, 2024
    Authors
    BwandoWando
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1842206%2F4408fd0c0561e4a48a03776b784ed650%2Fzip2.jpeg?generation=1728526740859651&alt=media

    US Zip Codes Database

    We're proud to offer a simple, accurate and up-to-date database of US Zip Codes. It's been built from the ground up using authoritative sources including the U.S. Postal Service™, U.S. Census Bureau, National Weather Service, American Community Survey, and the IRS.

    • Up-to-date: Data updated as of October 8, 2024. Includes data from the most recent American Community Survey (2022)!
    • Comprehensive: 41,618 unique zip codes including ZCTA, unique, military, and PO box zips.
    • Useful fields: From latitude and longitude to household income.
    • Accurate: Aggregated from official sources and precisely geocoded to latitude and longitude.
    • Simple: A single CSV file, concise field names, only one entry per zip code.

    From https://simplemaps.com/data/us-zips

    Image

    Generated with Bing Image Generator

    Note

    I just downloaded and uploaded it here. All credits to https://simplemaps.com/data/us-zips

  9. GDB Databases

    • zenodo.org
    application/gzip, bin
    Updated Sep 1, 2022
    Cite
    Tobias Fink; Lorenz C. Blum; Lars Ruddigkeit; Ruud van Deursen; Jean-Louis Reymond (2022). GDB Databases [Dataset]. http://doi.org/10.5281/zenodo.5172018
    Available download formats: bin, application/gzip
    Dataset updated
    Sep 1, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Tobias Fink; Lorenz C. Blum; Lars Ruddigkeit; Ruud van Deursen; Jean-Louis Reymond
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    About

    GDB-11 enumerates small organic molecules up to 11 atoms of C, N, O and F following simple chemical stability and synthetic feasibility rules.
    GDB-13 enumerates small organic molecules up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules. With 977 468 314 structures, GDB-13 is the largest publicly available small organic molecule database to date.

    How to cite

    To cite GDB-11, please reference:

    Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physico-chemical properties, compound classes and drug discovery. Fink, T.; Reymond, J.-L. J. Chem. Inf. Model. 2007, 47, 342-353.

    Virtual Exploration of the Small Molecule Chemical Universe below 160 Daltons. Fink, T.; Bruggesser, H.; Reymond, J.-L. Angew. Chem. Int. Ed. 2005, 44, 1504-1508.

    To cite GDB-13, please reference:

    970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13. Blum L. C.; Reymond J.-L. J. Am. Chem. Soc., 2009, 131, 8732-8733.

    To cite GDB-17, please reference:

    Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. Ruddigkeit Lars, van Deursen Ruud, Blum L. C.; Reymond J.-L. J. Chem. Inf. Model., 2012, 52, 2864-2875.

    Download

    You can download the databases and subsets of them using the links provided. All the molecules are stored in dearomatized, canonized SMILES format and compressed as tar/gz archives (for Windows users: download 7-zip to open the archives).
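
    For the files listed below with a .smi.gz suffix, the molecules can be streamed without unpacking the whole archive to disk. The following is a minimal sketch, assuming each line holds one SMILES string as its first whitespace-separated field (the listing does not document the exact line layout, so any trailing columns are simply ignored). The .tgz downloads are tar archives and would need Python's tarfile module instead.

```python
import gzip
from typing import Iterator


def iter_smiles(path: str) -> Iterator[str]:
    """Yield the first field of every non-empty line in a gzip-compressed .smi file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                # Assumption: the SMILES string is the first field on the line.
                yield fields[0]


def count_smiles(path: str) -> int:
    """Count molecules by streaming, so memory use stays flat for large files."""
    return sum(1 for _ in iter_smiles(path))
```

    For example, count_smiles("GDB17.50000000.smi.gz") would walk the GDB-17 set line by line, assuming that file has been downloaded to the working directory.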


    GDB-17
    GDB-17-Set (50 million) GDB17.50000000.smi.gz 314 MB
    Lead-like Set (100-350 MW & 1-3 clogP)(11 million) GDB17.50000000LL.smi.gz 75 MB
    Lead-like Set (100-350 MW & 1-3 clogP) without small rings (3-4 ring atoms)(0.8 million) GDB17.50000000LLnoSR.smi.gz 55 MB

    GDB-13
    Entire GDB-13 (including all C/N/O/Cl/S molecules) gdb13.tgz 2.6 GB
    GDB-13 Subsets (The sum of all the subsets below correspond to the entire GDB-13 above)
    Graph subset (saturated hydrocarbons) gdb13.g.tgz 1.1 MB
    Skeleton subset (unsaturated hydrocarbons) gdb13.sk.tgz 14 MB
    Only carbon & nitrogen containing molecules gdb13.cn.tgz 443 MB
    Only carbon & oxygen containing molecules gdb13.co.tgz 299 MB
    Only carbon & nitrogen & oxygen containing molecules gdb13.cno.tgz 1.8 GB
    Chlorine & sulphur containing molecules gdb13.cls.tgz 189 MB

    GDB-13 Subsets (For details please refer to the Table 2 in J Comput Aided Mol Des 2011 25:637 to 647)
    GDB-13 Subset AB (~635 Millions) AB.smi.gz 2.4 GB
    GDB-13 Subset ABC (~441 Millions) ABC.smi.gz 1.7 GB
    GDB-13 Subset ABCD (~277 Millions) ABCD.smi.gz 1.1 GB
    GDB-13 Subset ABCDE (~140 Millions) ABCDE.smi.gz 565 MB
    GDB-13 Subset ABCDEF (~43 Millions) ABCDEF.smi.gz 171 MB
    GDB-13 Subset ABCDEFG (~13 Millions) ABCDEFG.smi.gz 50 MB
    GDB-13 Subset ABCDEFGH (~1.4 Millions) ABCDEFGH.smi.gz 6.2 MB
    GDB-13 Random Sample. Annotated with frequency and log-likelihood (Please refer to Exploring the GDB-13 chemical space using deep generative models)
    GDB-13 Random Sample (1 Million) gdb13.1M.freq.ll.smi.gz 14.8 MB

    FDB-17
    FDB-17 FDB-17-fragmentset.smi.gz 62.2 MB


    GDB4c
    GDB4c (SMILES) GDB4c.smi.gz 6.2 MB
    GDB4c3D (SMILES) GDB4c3D.smi.gz 161 MB
    GDB4c3D (SDF) GDB4c3D.sdf.tar.gz 2 GB


    Other
    GDBMedChem (SMILES) GDBMedChem.smi 276 MB
    GDBChEMBL (SMILES) GDBChEMBL.smi 353.6 MB
    GDB-13 random selection (1 million) gdb13.rand1M.smi.gz 7.2 MB
    Fragment-like subset (Rule of three) gdb13.frl.tgz 1.2 GB
    Dark matter universe up to 9 heavy atoms dmu9.tgz 87 MB

    GDB-11
    Entire GDB-11 (including all C/N/O/F molecules) gdb11.tgz 122 MB
    Fragrance Like Subsets: For details please refer to Ruddigkeit et al. Journal of Cheminformatics 2014, 6:27
    FragranceDB (SuperScent + Flavornet) FragranceDB.smi 56 KB
    TasteDB (SuperSweet + BitterDB) TasteDB.smi 44 KB
    FragranceDB.FL (Fragrance-like subset of FragranceDB) FragranceDB.FL.smi 32 KB
    ChEMBL.FL (Fragrance-like subset of ChEMBL) ChEMBL.FL.smi 452 KB
    PubChem.FL (Fragrance-like subset of PubChem) PubChem.FL.smi 20 MB
    ZINC.FL (Fragrance-like subset of ZINC) ZINC.FL.smi 1.3 MB
    GDB-13.FL (Fragrance-like subset of GDB-13) GDB-13.FL.smi.gz 165 MB

    Terms and conditions: The GDB databases may be downloaded free of charge. In published research involving GDB, cite the appropriate references mentioned above. GDB must not be used as part of or in patents. GDB and large portions thereof must not be redistributed without the express written permission of Jean-Louis Reymond.

  10. Example Data Files

    • redivis.com
    application/jsonl +7
    Updated Jan 30, 2025
    + more versions
    Cite
    Redivis Demo Organization (2025). Example Data Files [Dataset]. https://redivis.com/datasets/yz1s-d09009dbb
    Available download formats: sas, csv, spss, avro, stata, arrow, application/jsonl, parquet
    Dataset updated
    Jan 30, 2025
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Description

    Abstract

    This is an example dataset demonstrating new non-tabular data file functionality on Redivis.

    Methodology

    Redivis now supports uploading arbitrary files to datasets. Alongside existing support for tabular data, this expands the breadth of data on Redivis and opens up novel research opportunities. Datasets can have millions of files, and each file can be up to 5 terabytes.

    While you can upload literally any file type, this dataset demonstrates previews for some common file formats:

    These previews are enabled by numerous contributions in the open source and academic community, including:

    Usage

    This dataset primarily consists of two folders that can be used to train and evaluate image classification models (cats vs. dogs, of course). The files in the training images folder have already been classified, while those in the test images folder have not.

    This dataset also contains another folder of example file types that have built-in previews on Redivis. You can upload any file type to Redivis, and download these files and work with them in your notebooks. However, we endeavor to provide interactive previews for common file types when it is feasible in a web browser environment. Contact us if you'd like to see a preview added for a new file format!

    Beyond previewing and downloading files, many use cases will utilize the redivis-python and redivis-r client libraries to stream files to a computational environment (either within Redivis notebooks or elsewhere) for further analysis.

    You can view an example image classification project using this dataset here.

    This analysis is reproduced from [its original publication.](%3Chttps://towardsdatascience.com/image-classifier-cats-vs-dogs-with-convolu

  11. Northwind and Chinook DataBase

    • kaggle.com
    zip
    Updated Jun 19, 2024
    Cite
    RCURIOSO (2024). Northwind and Chinook DataBase [Dataset]. https://www.kaggle.com/datasets/rcurioso/northwind-and-chinook-database/code
    Available download formats: zip (461230 bytes)
    Dataset updated
    Jun 19, 2024
    Authors
    RCURIOSO
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Northwind Database

    The Northwind database is a sample database originally created by Microsoft and used as the basis for its tutorials across a variety of database products for decades. The Northwind database contains sales data for a fictitious company called "Northwind Traders", which imports and exports specialty foods from around the world. The Northwind database is an excellent tutorial schema for a small-business ERP, with customers, orders, inventory, purchasing, suppliers, shipping, employees, and single-entry accounting. The Northwind database has since been ported to a variety of non-Microsoft databases, including PostgreSQL.

    The Northwind dataset includes sample data for the following.

    • Suppliers: Suppliers and vendors of Northwind
    • Customers: Customers who buy products from Northwind
    • Employees: Employee details of Northwind Traders
    • Products: Product information
    • Shippers: The details of the shippers who ship the products from the traders to the end customers.
    • Orders and Order Details: Sales order transactions taking place between the customers and the company.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13411583%2Fa52a5bbc3d8842abfdfcfe608b7a8d25%2FNorthwind_E-R_Diagram.png?generation=1718785485874540&alt=media

    Chinook DataBase

    Chinook is a sample database available for SQL Server, Oracle, MySQL, etc. It can be created by running a single SQL script. The Chinook database is an alternative to the Northwind database, ideal for demos and for testing ORM tools that target single or multiple database servers.

    The Chinook data model represents a digital media store, including tables for artists, albums, media tracks, invoices and customers.

    The media-related data was created using real data from an iTunes library. Customer and employee information was created manually using fictitious names, addresses that can be located on Google Maps, and other well-formatted data (phone, fax, email, etc.). Sales information is auto-generated using random data for a four-year period.

    Why the name Chinook? The name of this sample database was based on the Northwind database. Chinooks are winds in the interior West of North America, where the Canadian Prairies and Great Plains meet various mountain ranges. Chinooks are most prevalent over southern Alberta in Canada. Chinook is a good name choice for a database that is intended as an alternative to Northwind.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13411583%2Fd856e0358e3a572d50f1aba5e171c1c6%2FChinook%20DataBase.png?generation=1718785749657445&alt=media

  12. OECD Regional database

    • catalog.data.gov
    Updated Mar 30, 2021
    Cite
    U.S. Department of State (2021). OECD Regional database [Dataset]. https://catalog.data.gov/dataset/oecd-regional-database
    Dataset updated
    Mar 30, 2021
    Dataset provided by
    United States Department of State http://state.gov/
    Description

    The OECD regional database is delivered through the viewer OECD eXplorer, an interactive mapping tool designed to let users explore, download and visualize data with maps, histograms, scatterplots and others. The database comprises a set of comparable statistics on about 2000 regions in the 33 OECD countries, on topics such as population, economic output, productivity, labor market, education and innovation, to highlight differences within countries.

  13. Walmart Products Dataset – Free Product Data CSV

    • crawlfeeds.com
    csv, zip
    Updated Dec 2, 2025
    Cite
    Crawl Feeds (2025). Walmart Products Dataset – Free Product Data CSV [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
    Available download formats: zip, csv
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.

    Key Features

    Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.

    CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.

    Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.

    Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.

    Who Benefits?

    • Data analysts & researchers exploring e-commerce trends or product catalog data.
    • Developers & data scientists building price-comparison tools, recommendation engines or ML models.
    • E-commerce strategists/marketers needing product metadata for competitive analysis or market research.
    • Students/hobbyists needing a free dataset for learning or demo projects.

    Why Use This Dataset Instead of Manual Scraping?

    • Time-saving: No need to write scrapers or deal with rate limits.
    • Clean, structured data: All records are verified and already formatted in CSV, saving hours of cleaning.
    • Risk-free: Avoid Terms-of-Service issues or IP blocks that come with manual scraping.
    • Instant access: Free and immediately downloadable.

  14. World Administrative Boundaries

    • geopostcodes.com
    csv
    Updated Apr 28, 2024
    Cite
    GeoPostcodes (2024). World Administrative Boundaries [Dataset]. https://www.geopostcodes.com/world-administrative-boundaries/
    Available download formats: csv
    Dataset updated
    Apr 28, 2024
    Dataset authored and provided by
    GeoPostcodes
    Area covered
    World
    Description

    Our World Administrative Boundaries Database offers comprehensive postal code data for spatial analysis, including postal and administrative areas. This dataset contains accurate and up-to-date information on all administrative divisions, cities, and zip codes, making it an invaluable resource for various applications such as address capture and validation, map and visualization, reporting and business intelligence (BI), master data management, logistics and supply chain management, and sales and marketing. Our location data packages are available in various formats, including CSV, optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more. Product features include fully and accurately geocoded data, multi-language support with address names in local and foreign languages, comprehensive city definitions, and the option to combine map data with UNLOCODE and IATA codes, time zones, and daylight saving times. Companies choose our location databases for their enterprise-grade service, reduction in integration time and cost by 30%, and weekly updates to ensure the highest quality.

  15. EPA Facility Registry Service (FRS): Facility Interests Dataset Download

    • catalog.data.gov
    • data.cnra.ca.gov
    • +3 more
    Updated Feb 10, 2026
    + more versions
    Cite
    U.S. Environmental Protection Agency, Office of Environmental Information (Publisher) (2026). EPA Facility Registry Service (FRS): Facility Interests Dataset Download [Dataset]. https://catalog.data.gov/dataset/epa-facility-registry-service-frs-facility-interests-dataset-download9
    Dataset updated
    Feb 10, 2026
    Dataset provided by
    United States Environmental Protection Agency http://www.epa.gov/
    Description

    This downloadable data package consists of location and facility identification information from EPA's Facility Registry Service (FRS) for all sites that are available in the FRS individual feature layers. The layers comprise the FRS major program databases, including:

    • Assessment Cleanup and Redevelopment Exchange System (ACRES): brownfields sites
    • Air Facility System (AFS): stationary sources of air pollution
    • ICIS-AIR (AIR): stationary sources of air pollution
    • Bureau of Indian Affairs (BIA): schools data on Indian land
    • Base Realignment and Closure (BRAC) facilities
    • Clean Air Markets Division Business System (CAMDBS): market-based air pollution control programs
    • Superfund Enterprise Management System (SEMS): hazardous waste sites
    • Integrated Compliance Information System (ICIS): integrated enforcement and compliance information
    • National Compliance Database (NCDB): Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA) and the Toxic Substances Control Act (TSCA)
    • National Pollutant Discharge Elimination System (NPDES) module of ICIS: NPDES surface water permits
    • Radiation Information Database (RADINFO): radiation and radioactivity facilities
    • RACT/BACT/LAER Clearinghouse (RBLC): best available air pollution technology requirements
    • Resource Conservation and Recovery Act Information System (RCRAInfo): tracks generators, transporters, treaters, storers, and disposers of hazardous waste
    • Toxic Release Inventory (TRI): certain industries that use, manufacture, treat, or transport more than 650 toxic chemicals
    • Emission Inventory System (EIS): inventory of large stationary sources and voluntarily-reported smaller sources of air point pollution emitters
    • Spill Prevention, Control, and Countermeasure (SPCC) and Facility Response Plan (FRP) subject facilities
    • Electronic Greenhouse Gas Reporting Tool (E-GGRT): large greenhouse gas emitters
    • Emissions and Generation Resource Integrated Database (EGRID): power plants

    The Facility Registry Service (FRS) identifies and geospatially locates facilities, sites or places subject to environmental regulations or of environmental interest. Using vigorous verification and data management procedures, FRS integrates facility data from EPA's national program systems, other federal agencies, and State and tribal master facility records and provides EPA with a centrally managed, single source of comprehensive and authoritative information on facilities. This data set contains the FRS facilities that link to the programs listed above once the program data has been integrated into the FRS database. Additional information on FRS is available at the EPA website https://www.epa.gov/enviro/facility-registry-service-frs. Included in this package are a file geodatabase, Esri ArcMap map document and an XML file of this metadata record. Full FGDC metadata records for each layer are contained in the database.

  16. Gridded Soil Survey Geographic Database (gSSURGO)

    • agdatacommons.nal.usda.gov
    bin
    Updated Nov 22, 2025
    Cite
    USDA Natural Resources Conservation Service (2025). Gridded Soil Survey Geographic Database (gSSURGO) [Dataset]. http://doi.org/10.15482/USDA.ADC/1255234
    Available download formats: bin
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    United States Department of Agriculture (http://usda.gov/)
    Natural Resources Conservation Service (http://www.nrcs.usda.gov/)
    Authors
    USDA Natural Resources Conservation Service
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset is called the Gridded SSURGO (gSSURGO) Database and is derived from the Soil Survey Geographic (SSURGO) Database. SSURGO is generally the most detailed level of soil geographic data developed by the National Cooperative Soil Survey (NCSS) in accordance with NCSS mapping standards. The tabular data represent the soil attributes and are derived from properties and characteristics stored in the National Soil Information System (NASIS). The gSSURGO data were prepared by merging traditional SSURGO digital vector map and tabular data into State-wide extents, and adding a State-wide gridded map layer derived from the vector data, plus a new value added look up (valu) table containing "ready to map" attributes. The gridded map layer is offered in an ArcGIS file geodatabase raster format. The raster and vector map data have a State-wide extent. The raster map data have a 10 meter cell size that approximates the vector polygons in an Albers Equal Area projection. Each cell (and polygon) is linked to a map unit identifier called the map unit key. A unique map unit key is used to link raster cells and polygons to attribute tables, including the new value added look up (valu) table that contains additional derived data. The value added look up (valu) table contains attribute data summarized to the map unit level using best practice generalization methods intended to meet the needs of most users. The generalization methods include map unit component weighted averages and the percent of the map unit meeting a given criterion.

    Resources in this dataset:
    Resource Title: gSSURGO downloads page. File Name: Web Page, url: https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053628#value (Download gSSURGO Databases)

    Other resources include introduction to gSSURGO, User Guide (PDF; 4.22 MB), SSURGO/gSSURGO ArcTools, Valu1 (Value Added Look Up) Table, Metadata, Recommended Data Citations, Technical Information, Sample gSSURGO Map Themes
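
    The two generalization methods named in the description (component weighted averages and percent of the map unit meeting a criterion) can be illustrated with a small sketch. Everything below is hypothetical: the `Component` structure, the field names, the sample values, and the choice to skip components with missing values are assumptions for illustration, not the actual schema or procedure behind the valu table.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    """One soil component of a map unit (hypothetical structure, not the gSSURGO schema)."""
    mukey: str               # map unit key that ties the component to its map unit
    percent: float           # share of the map unit covered by this component (0-100)
    value: Optional[float]   # attribute value; None when the attribute is not populated


def _populated(components: List[Component], mukey: str) -> List[Component]:
    """Components of one map unit that have a populated attribute value."""
    return [c for c in components if c.mukey == mukey and c.value is not None]


def weighted_average(components: List[Component], mukey: str) -> Optional[float]:
    """Component-percent weighted average of `value` for one map unit.

    Assumption: components with missing values are skipped and the remaining
    weights are renormalized. Returns None when nothing is populated.
    """
    rows = _populated(components, mukey)
    total = sum(c.percent for c in rows)
    if total == 0:
        return None
    return sum(c.percent * c.value for c in rows) / total


def percent_meeting(components: List[Component], mukey: str,
                    criterion: Callable[[float], bool]) -> float:
    """Summed component percent of one map unit whose value satisfies `criterion`."""
    return sum(c.percent for c in _populated(components, mukey) if criterion(c.value))


# Made-up sample data: one map unit with three components, one of them unpopulated.
SAMPLE_COMPONENTS = [
    Component("100", 60.0, 2.0),
    Component("100", 30.0, 4.0),
    Component("100", 10.0, None),
    Component("200", 100.0, 5.0),
]

if __name__ == "__main__":
    print(weighted_average(SAMPLE_COMPONENTS, "100"))
    print(percent_meeting(SAMPLE_COMPONENTS, "100", lambda v: v >= 3.0))
```

    Keying both functions on the map unit key mirrors how the description says raster cells and polygons are linked to attribute tables; a real workflow would read the components from the geodatabase rather than from an in-memory list.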

  17. Complete Antivirus Database

    • comodo.com
    cav
    Updated Apr 15, 2010
    Cite
    Comodo (2010). Complete Antivirus Database [Dataset]. https://www.comodo.com/home/internet-security/updates/vdp/database.php
    Available download formats: cav
    Dataset updated
    Apr 15, 2010
    Dataset provided by
    Comodo Group (http://www.comodo.com/)
    Authors
    Comodo
    License

    https://www.comodo.com/home/internet-security/updates/vdp/database.php

    Description

    The complete Comodo Internet Security database is available for download...

  18. Data from: Global Terrorism Database

    • catalog.data.gov
    • datasets.ai
    Updated May 30, 2023
    Cite
    University of Maryland (UMD) (2023). Global Terrorism Database [Dataset]. https://catalog.data.gov/dataset/global-terrorism-database
    Explore at:
    Dataset updated
    May 30, 2023
    Dataset provided by
    University of Maryland (UMD)
    Description

    The Global Terrorism Database™ (GTD) is an open-source database including information on terrorist events around the world from 1970 through 2020 (with annual updates planned for the future). Unlike many other event databases, the GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 200,000 cases.

  19. Kraken2 Human database

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Feb 15, 2024
    Cite
    Michael B. Hall (2024). Kraken2 Human database [Dataset]. http://doi.org/10.5281/zenodo.8339700
    Available download formats: application/gzip
    Dataset updated
    Feb 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael B. Hall
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A kraken2 database built from only the Human library on 29/06/2023. This archive contains only the three files required by kraken2: hash.k2d, opts.k2d, and taxo.k2d.

    The commands used to download and build this database are:

    k2 download-taxonomy --db db/
    k2 download-library --db db/ --library human
    k2 build --kmer-len 35 --minimizer-len 31 --minimizer-spaces 7 --threads 8 --db db/
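
    After unpacking the archive (or running the build above), it can be useful to confirm that the three files the description lists are actually present before pointing kraken2 at the directory. The sketch below is a minimal check, not part of kraken2 itself; the function name `missing_k2_files` and the `db/` directory in the demo are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# The three files the dataset description says kraken2 requires.
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_k2_files(db_dir):
    """Return the required database files that are absent from `db_dir`.

    Hypothetical helper: it only checks that each file exists as a regular
    file; it does not validate the contents.
    """
    db = Path(db_dir)
    return [name for name in REQUIRED_K2_FILES if not (db / name).is_file()]


if __name__ == "__main__":
    # Demo on a throwaway directory that deliberately lacks taxo.k2d.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("hash.k2d", "opts.k2d"):
            (Path(tmp) / name).touch()
        print("missing from demo directory:", missing_k2_files(tmp))

    # Assumed location of the real database, matching the --db db/ flag above.
    print("missing from db/:", missing_k2_files("db"))
```

    An empty list means all three files are in place; any names returned point to an incomplete download or extraction.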

  20. YMDB - Yeast Metabolome Database

    • dknet.org
    • rrid.site
    • +2 more
    Updated Aug 12, 2024
    Cite
    (2024). YMDB - Yeast Metabolome Database [Dataset]. http://identifiers.org/RRID:SCR_005890
    Explore at:
    Dataset updated
    Aug 12, 2024
    Description

    A manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker's yeast and Brewer's yeast). This database covers metabolites described in textbooks, scientific journals, metabolic reconstructions and other electronic databases. YMDB contains metabolites arising from normal S. cerevisiae metabolism under defined laboratory conditions as well as metabolites generated by S. cerevisiae when used in baking and in the production of wines, beers and spirits. YMDB currently contains 2027 small molecules with 857 associated enzymes and 138 associated transporters. Each small molecule has 48 data fields describing the metabolite, its chemical properties and links to spectral and chemical databases. Each enzyme/transporter is linked to its associated metabolites and has 30 data fields describing both the gene and corresponding protein. Users may search through the YMDB using a variety of database-specific tools. The simple text query supports general text queries of the textual component of the database. By selecting either metabolites or proteins in the search for field it is possible to restrict the search and the returned results to only those data associated with metabolites or with proteins. Clicking on the Browse button generates a tabular synopsis of YMDB's content. This browser view allows users to casually scroll through the database or re-sort its contents. Clicking on a given MetaboCard button brings up the full data content for the corresponding metabolite. A complete explanation of all the YMDB fields and sources is available. Under the Search link users will find a number of search options listed in a pull-down menu. The Chem Query option allows users to draw (using MarvinSketch applet or a ChemSketch applet) or to type (SMILES string) a chemical compound and to search the YMDB for chemicals similar or identical to the query compound. 
The Advanced Search option supports a more sophisticated text search of the text portion of YMDB. The Sequence Search button allows users to conduct BLASTP (protein) sequence searches of all sequences contained in YMDB. Both single and multiple sequence (i.e. whole proteome) BLAST queries are supported. YMDB also supports a Data Extractor option that allows specific data fields or combinations of data fields to be searched and/or extracted. Spectral searches of YMDB's reference compound NMR and MS spectral data are also supported through its MS, MS/MS, GC/MS and NMR Spectra Search links. Users may download YMDB's complete textual data, chemical structures and sequence data by clicking on the Download button.
