100+ datasets found

dataset-card-example
huggingface.co
Updated Sep 28, 2023
Cite
Templates (2023). dataset-card-example [Dataset]. https://huggingface.co/datasets/templates/dataset-card-example
Dataset updated
Sep 28, 2023
Dataset authored and provided by
Templates
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details

Dataset Description

Curated by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]

Dataset Sources [optional]… See the full description on the dataset page: https://huggingface.co/datasets/templates/dataset-card-example.
An example data set for exploration of Multiple Linear Regression
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). An example data set for exploration of Multiple Linear Regression [Dataset]. https://catalog.data.gov/dataset/an-example-data-set-for-exploration-of-multiple-linear-regression
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
This data set contains example data for exploring the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
Open Data Guidance - Privacy for Open Datasets
catalog.data.gov
data.oregon.gov
Updated Aug 7, 2021
Cite
data.oregon.gov (2021). Open Data Guidance - Privacy for Open Datasets [Dataset]. https://catalog.data.gov/dataset/open-data-guidance-privacy-for-open-datasets
Dataset updated
Aug 7, 2021
Dataset provided by
data.oregon.gov
Description
This document provides guidance to State agencies on evaluating datasets with PII, PHI, or other forms of private or confidential data. This guidance includes a sample risk benefit analysis form and process to enable agencies to evaluate datasets for publication and help select appropriate privacy protections for open datasets.
Training images
redivis.com
Updated Aug 17, 2022
Cite
Redivis Demo Organization (2022). Training images [Dataset]. https://redivis.com/datasets/yz1s-d09009dbb
Dataset updated
Aug 17, 2022
Dataset provided by
Redivis Inc.
Authors
Redivis Demo Organization
Time period covered
Aug 8, 2022
Description
This is an auto-generated index table corresponding to a folder of files in this dataset with the same name. This table can be used to extract a subset of files based on their metadata, which can then be used for further analysis. You can view the contents of specific files by navigating to the "cells" tab and clicking on an individual file_kd.
AirfRANS_clipped
huggingface.co
Updated May 5, 2025
Cite
PLAID-datasets (2025). AirfRANS_clipped [Dataset]. https://huggingface.co/datasets/PLAID-datasets/AirfRANS_clipped
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 5, 2025
Dataset authored and provided by
PLAID-datasets
License
https://choosealicense.com/licenses/odbl/
Description
Dataset Card

This dataset contains a single huggingface split, named 'all_samples'. Each sample contains a single huggingface feature, named "sample". Samples are instances of plaid.containers.sample.Sample. Mesh objects included in samples follow the CGNS standard and can be converted into Muscat.Containers.Mesh.Mesh. Example commands:

import pickle
from datasets import load_dataset
from plaid.containers.sample import Sample

Load the dataset

dataset =… See the full description on the dataset page: https://huggingface.co/datasets/PLAID-datasets/AirfRANS_clipped.
example-space-to-dataset-image-zip
huggingface.co
Updated Jun 16, 2023
Cite
Lucain Pouget (2023). example-space-to-dataset-image-zip [Dataset]. https://huggingface.co/datasets/Wauplin/example-space-to-dataset-image-zip
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 16, 2023
Authors
Lucain Pouget
Description
Demo to save data from a Space to a Dataset. Goal is to provide reusable snippets of code.

Documentation: https://huggingface.co/docs/huggingface_hub/main/en/guides/upload#scheduled-uploads
Space: https://huggingface.co/spaces/Wauplin/space_to_dataset_saver/
JSON dataset: https://huggingface.co/datasets/Wauplin/example-commit-scheduler-json
Image dataset: https://huggingface.co/datasets/Wauplin/example-commit-scheduler-image
Image (zipped) dataset:… See the full description on the dataset page: https://huggingface.co/datasets/Wauplin/example-space-to-dataset-image-zip.
example datasets
figshare.com
zip
Updated Jan 20, 2025
Cite
soda-inria (2025). example datasets [Dataset]. http://doi.org/10.6084/m9.figshare.28241549.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.28241549.v1
Dataset updated
Jan 20, 2025
Dataset provided by
figshare
Authors
soda-inria
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
example dataset
Sample dataset
carlvlewis.github.io
siciliahub.github.io
+7 more
api, csv, shp
Updated Sep 22, 2018
Cite
(2018). Sample dataset [Dataset]. https://carlvlewis.github.io/jkan/datasets/sample-dataset/
Available download formats: shp, api, csv
Dataset updated
Sep 22, 2018
Description
This is an example dataset that comes with a new installation of JKAN
LinkedIn Datasets
brightdata.com
.json, .csv, .xlsx
Updated Dec 17, 2021
Cite
Bright Data (2021). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 17, 2021
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions.

Dataset Features

Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.

Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.

Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

Customizable Subsets for Specific Needs

Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

Popular Use Cases

Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.

Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.

Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.

Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.

AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.
Meta-Dataset Dataset
paperswithcode.com
Cite
Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle, Meta-Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/meta-dataset
Authors
Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle
Description
The Meta-Dataset benchmark is a large few-shot learning benchmark and consists of multiple datasets of different data distributions. It does not restrict few-shot tasks to have fixed ways and shots, thus representing a more realistic scenario. It consists of 10 datasets from diverse domains:

ILSVRC-2012 (the ImageNet dataset, consisting of natural images with 1000 categories)
Omniglot (hand-written characters, 1623 classes)
Aircraft (dataset of aircraft images, 100 classes)
CUB-200-2011 (dataset of birds, 200 classes)
Describable Textures (different kinds of texture images with 43 categories)
Quick Draw (black and white sketches of 345 different categories)
Fungi (a large dataset of mushrooms with 1500 categories)
VGG Flower (dataset of flower images with 102 categories)
Traffic Signs (German traffic sign images with 43 classes)
MSCOCO (images collected from Flickr, 80 classes)

All datasets except Traffic Signs and MSCOCO have training, validation, and test splits (proportioned roughly 70%, 15%, 15%). Traffic Signs and MSCOCO are reserved for testing only.
News Datasets
brightdata.com
.json, .csv, .xlsx
Cite
Bright Data, News Datasets [Dataset]. https://brightdata.com/products/datasets/news
Available download formats: .json, .csv, .xlsx
Dataset authored and provided by
Bright Data
License
https://brightdata.com/license
Area covered
Worldwide
Description
Stay ahead with our comprehensive News Dataset, designed for businesses, analysts, and researchers to track global events, monitor media trends, and extract valuable insights from news sources worldwide.

Dataset Features

News Articles: Access structured news data, including headlines, summaries, full articles, publication dates, and source details. Ideal for media monitoring and sentiment analysis.

Publisher & Source Information: Extract details about news publishers, including domain, region, and credibility indicators.

Sentiment & Topic Classification: Analyze news sentiment, categorize articles by topic, and track emerging trends in real time.

Historical & Real-Time Data: Retrieve historical archives or access continuously updated news feeds for up-to-date insights.

Customizable Subsets for Specific Needs

Our News Dataset is fully customizable, allowing you to filter data based on publication date, region, topic, sentiment, or specific news sources. Whether you need broad coverage for trend analysis or focused data for competitive intelligence, we tailor the dataset to your needs.

Popular Use Cases

Media Monitoring & Reputation Management: Track brand mentions, analyze media coverage, and assess public sentiment.

Market & Competitive Intelligence: Monitor industry trends, competitor activity, and emerging market opportunities.

AI & Machine Learning Training: Use structured news data to train AI models for sentiment analysis, topic classification, and predictive analytics.

Financial & Investment Research: Analyze news impact on stock markets, commodities, and economic indicators.

Policy & Risk Analysis: Track regulatory changes, geopolitical events, and crisis developments in real time.

Whether you're analyzing market trends, monitoring brand reputation, or training AI models, our News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Audience Targeting Data I US Consumer | Behavioral Intelligence | Purchase,...
datarade.ai
.csv, .xls
Updated Nov 14, 2023
Cite
Allforce (formerly Solution Publishing) (2023). Audience Targeting Data I US Consumer | Behavioral Intelligence | Purchase, Shopper, Lifestyle Data | Verified Email, Phone, Address [Dataset]. https://datarade.ai/data-categories/consumer-data/datasets
Available download formats: .csv, .xls
Dataset updated
Nov 14, 2023
Dataset authored and provided by
Allforce (formerly Solution Publishing)
Area covered
United States of America
Description
Access high-fidelity consumer data powered by our proprietary modeling technology that provides the most comprehensive consumer intelligence, accurate targeting, first-party data enrichment, and personalization at scale. Our deterministic dataset, anchored in the purchasing habits of over 140 million U.S. consumers, delivers superior targeting performance with proven 70% increase in ROAS.

Core Data Assets

Transactional Data Foundation: Real purchasing behavior from over 140 million U.S. consumers with 8.5 billion behavioral signals across 250 million adults. Seven years of daily credit card and debit card purchase data aggregated from all major credit cards sourced from more than 300 national banks, capturing $2+ trillion in annual discretionary spending.

Consumer Demographics & Lifestyle: Comprehensive profiles including age, income, household composition, geographic distribution, education, employment, and lifestyle indicators. Our proprietary taxonomy organizes consumer spending across 8,000+ brands and 2,500+ merchants, from major retailers to emerging direct-to-consumer brands.

Behavioral Segmentation: 150+ custom consumer communities including demographic groups (Gen Z, Millennials, Gen X), lifestyle segments (Health & Fitness Enthusiasts, Tech Early Adopters, Luxury Shoppers), and behavioral categories (Deal Seekers, Brand Loyalists, Premium Service Users, Streaming Subscribers).

Purchase Intelligence: Deep insights into consumer spending patterns across entertainment, fitness, fashion, technology, travel, dining, and retail categories. Our models identify cross-category purchasing behaviors, seasonal trends, and brand switching patterns to optimize targeting strategies.

Advanced Modeling Technology

Our proprietary consumer intelligence engine combines deterministic transaction-based data with Smart Audience Engineering that transforms first-party signals from anonymized website traffic, behavioral indicators, and CRM enrichment into precision-modeled segments. Unlike traditional data providers who sell static lists, our AI-powered predictive modeling continuously learns and optimizes for unprecedented precision and superior conversion outcomes.

Performance Advantages: Audiences built on user-level transactional data deliver 70% increase in ROAS compared to traditional targeting methods. Weekly-optimized audiences with performance narratives eliminate wasted ad spend by 20-30%, while our deterministic AI models analyze hundreds of attributes and conversion-validated signals to identify prospects with genuine purchase intent, not just lookalike behaviors.
Event Detection Dataset
data.mendeley.com
datosdeinvestigacion.conicet.gov.ar
+2 more
Updated Jul 11, 2020
Cite
Mariano Maisonnave (2020). Event Detection Dataset [Dataset]. http://doi.org/10.17632/7d54rvzxkr.1
Unique identifier
https://doi.org/10.17632/7d54rvzxkr.1
Dataset updated
Jul 11, 2020
Authors
Mariano Maisonnave
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a manually labeled data set for the task of Event Detection (ED). The task of ED consists of identifying event triggers, the words that most clearly indicate the occurrence of an event.

The data set consists of 2,200 news extracts from The New York Times (NYT) Annotated Corpus, separated into training (2,000) and testing (200) sets. Each news extract contains the plain text with the labels (event mentions), along with two metadata fields (publication date and an identifier).

Labels description: We consider as an event any ongoing real-world event or situation reported in the news articles. It is important to distinguish those events and situations that are in progress (or are reported as fresh events) at the moment the news is delivered from past events that are simply brought back, future events, hypothetical events, or events that will not take place. In our data set we only labeled the first type as events. Based on this criterion, some words that are typically considered events are labeled as non-event triggers if they do not refer to ongoing events at the time the analyzed news is released. Take for instance the following news extract: "devaluation is not a realistic option to the current account deficit since it would only contribute to weakening the credibility of economic policies as it did during the last crisis." The only word labeled as an event trigger in this example is "deficit" because it is the only ongoing event referred to in the news. Note that the words "devaluation", "weakening" and "crisis" could be labeled as event triggers in other news extracts, where the context of use of these words is different, but not in the given example.

Further information: For a more detailed description of the data set and the data collection process please visit: https://cs.uns.edu.ar/~mmaisonnave/resources/ED_data.

Data format: The dataset is split in two folders: training and testing. The first folder contains 2,000 XML files. The second folder contains 200 XML files. Each XML file has the following format.

<?xml version="1.0" encoding="UTF-8"?>

The first three tags (pubdate, file-id and sent-idx) contain metadata. The first is the publication date of the news article that contained the text extract. The next two together form a unique identifier for the text extract: file-id uniquely identifies a news article, which can hold several text extracts, and sent-idx is the index that identifies the text extract inside the full article.

The last tag (sentence) defines the beginning and end of the text extract. Inside that text are the tags. Each of these tags surrounds one word that was manually labeled as an event trigger.
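The tag layout described above could be read with Python's standard library roughly as follows. This is only a sketch: the listing does not show the root element or the name of the tag that wraps each event trigger, so the `<extract>` root and the `<event>` tag below are hypothetical placeholders, as are the sample date and identifiers. The parser avoids depending on the trigger tag name by collecting every child element found inside `sentence`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample file. The root name <extract>, the trigger tag <event>,
# and the metadata values are assumptions made for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<extract>
  <pubdate>2007-01-01</pubdate>
  <file-id>12345</file-id>
  <sent-idx>3</sent-idx>
  <sentence>devaluation is not a realistic option to the current account <event>deficit</event> since it would only weaken credibility.</sentence>
</extract>
"""


def parse_extract(xml_bytes):
    """Return the metadata, plain text, and labeled trigger words of one extract."""
    root = ET.fromstring(xml_bytes)
    sentence = root.find("sentence")
    return {
        "pubdate": root.findtext("pubdate"),
        "file_id": root.findtext("file-id"),
        "sent_idx": root.findtext("sent-idx"),
        # itertext() walks the sentence text and the text inside nested tags.
        "text": "".join(sentence.itertext()),
        # Every child element of <sentence> is treated as one labeled trigger.
        "triggers": [child.text for child in sentence],
    }


record = parse_extract(SAMPLE)
print(record["triggers"])
```

Collecting children by position instead of by name keeps the sketch usable once the real trigger tag name is known; only the sample string would need to change.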
Developer Community and Code Datasets
datarade.ai
Cite
Oxylabs, Developer Community and Code Datasets [Dataset]. https://datarade.ai/data-products/developer-community-and-code-datasets-oxylabs
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset authored and provided by
Oxylabs
Area covered
Philippines, Tuvalu, Bahamas, El Salvador, Marshall Islands, South Sudan, Guyana, Saint Pierre and Miquelon, United Kingdom, Djibouti
Description
Unlock the power of ready-to-use data sourced from developer communities and repositories with Developer Community and Code Datasets.

Data Sources:

GitHub: Access comprehensive data about GitHub repositories, developer profiles, contributions, issues, social interactions, and more.

StackShare: Receive information about companies, their technology stacks, reviews, tools, services, trends, and more.

DockerHub: Dive into data from container images, repositories, developer profiles, contributions, usage statistics, and more.

Developer Community and Code Datasets are a treasure trove of public data points gathered from tech communities and code repositories across the web.

With our datasets, you'll receive:

Usernames;

Companies;

Locations;

Job Titles;

Follower Counts;

Contact Details;

Employability Statuses;

And More.

Choose from various output formats, storage options, and delivery frequencies:

Get datasets in CSV, JSON, or other preferred formats.

Opt for data delivery via SFTP or directly to your cloud storage, such as AWS S3.

Receive datasets either once or as per your agreed-upon schedule.

Why choose our Datasets?

Fresh and accurate data: Access complete, clean, and structured data from scraping professionals, ensuring the highest quality.

Time and resource savings: Let us handle data extraction and processing cost-effectively, freeing your resources for strategic tasks.

Customized solutions: Share your unique data needs, and we'll tailor our data harvesting approach to fit your requirements perfectly.

Legal compliance: Partner with a trusted leader in ethical data collection. Oxylabs is trusted by Fortune 500 companies and adheres to GDPR and CCPA standards.

Pricing Options:

Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

Experience a seamless journey with Oxylabs:

Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.

Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.

Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.

Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

Empower your data-driven decisions with Oxylabs Developer Community and Code Datasets!
Sample Leads Dataset
kaggle.com
Updated Jun 24, 2022
Cite
ThatSean (2022). Sample Leads Dataset [Dataset]. https://www.kaggle.com/datasets/thatsean/sample-leads-dataset
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 24, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ThatSean
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is based on the Sample Leads Dataset and is intended to allow some simple filtering by lead source. I modified this dataset to support an upcoming Towards Data Science article walking through the process. The link will be shared once the article is published.
Clustering Data Sets With 2 Examples
kaggle.com
zip
Updated Sep 9, 2019
Cite
Manohar Reddy (2019). Clustering Data Sets With 2 Examples [Dataset]. https://www.kaggle.com/manohar676/clustering-data-sets-with-2-examples
Available download formats: zip (1905 bytes)
Dataset updated
Sep 9, 2019
Authors
Manohar Reddy
Description
Dataset

This dataset was created by Manohar Reddy

Contents
Malware Analysis Datasets: Top-1000 PE Imports
ieee-dataport.org
Updated Nov 8, 2019
Cite
Angelo Oliveira (2019). Malware Analysis Datasets: Top-1000 PE Imports [Dataset]. https://ieee-dataport.org/open-access/malware-analysis-datasets-top-1000-pe-imports
Dataset updated
Nov 8, 2019
Authors
Angelo Oliveira
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: Top-1000 imported functions extracted from the 'pe_imports' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.
fashion_mnist
tensorflow.org
opendatalab.com
+3 more
Updated Jun 1, 2024
Cite
(2024). fashion_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/fashion_mnist
Dataset updated
Jun 1, 2024
Description
Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png
PDEBench Datasets
darus.uni-stuttgart.de
opendatalab.com
Updated Feb 13, 2024
Cite
Makoto Takamoto; Timothy Praditia; Raphael Leiteritz; Dan MacKinlay; Francesco Alesiani; Dirk Pflüger; Mathias Niepert (2024). PDEBench Datasets [Dataset]. http://doi.org/10.18419/DARUS-2986
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.18419/DARUS-2986
Dataset updated
Feb 13, 2024
Dataset provided by
DaRUS
Authors
Makoto Takamoto; Timothy Praditia; Raphael Leiteritz; Dan MacKinlay; Francesco Alesiani; Dirk Pflüger; Mathias Niepert
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset funded by
DFG
Description
This dataset contains benchmark data generated by numerical simulation of different PDEs, namely 1D advection, 1D Burgers', 1D and 2D diffusion-reaction, 1D diffusion-sorption, 1D, 2D, and 3D compressible Navier-Stokes, 2D Darcy flow, and 2D shallow water equation. The dataset is intended to advance the scientific ML research area. In general, the data are stored in HDF5 format, with the array dimensions packed according to the convention [b,t,x1,...,xd,v], where b is the batch size (i.e. number of samples), t is the time dimension, x1,...,xd are the spatial dimensions, and v is the number of channels (i.e. number of variables of interest). More detailed information is also provided in our GitHub repository (https://github.com/pdebench/PDEBench) and in our paper submitted to the NeurIPS 2022 Benchmark track.
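As a rough illustration of the [b,t,x1,...,xd,v] convention, the sketch below splits an array shape into its named parts. The helper name `describe_layout` and the example shape are hypothetical and not taken from the PDEBench repository; in practice the shape would come from an array read with an HDF5 library, and the description only says the convention holds "in general", so individual files may differ.

```python
def describe_layout(shape):
    """Split a shape following [b, t, x1, ..., xd, v] into named parts."""
    if len(shape) < 4:
        # At least one spatial dimension is needed between t and v.
        raise ValueError("expected at least (b, t, x1, v)")
    batch, time, *spatial, channels = shape
    return {
        "batch": batch,             # b: number of samples
        "time": time,               # t: number of time steps
        "spatial": tuple(spatial),  # x1, ..., xd: grid size per spatial axis
        "channels": channels,       # v: number of variables of interest
    }


# Hypothetical shape for a 2D problem with 4 variables of interest.
layout = describe_layout((1000, 21, 128, 128, 4))
print(layout)
```

The number of spatial axes is inferred from the shape length, so the same helper covers the 1D, 2D, and 3D cases listed in the description.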
Data from: Active Sonar Data Set
data.mendeley.com
search.datacite.org
Updated Oct 9, 2017
Cite
Mohammad Khishe (2017). Active Sonar Data Set [Dataset]. http://doi.org/10.17632/fyxjjwzphf.1
Unique identifier
https://doi.org/10.17632/fyxjjwzphf.1
Dataset updated
Oct 9, 2017
Authors
Mohammad Khishe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this data set, 6 objects (2 targets and 4 non-targets) lie on the sandy sea bottom. In this experiment, the transmitted signal is a Wide-Band Linear Frequency Modulated (WLFM) pulse covering the frequency range 5-110 kHz. The targets lying on the bottom are rotated through 180 degrees with 1-degree accuracy by an electric motor. Backscattered echoes are accumulated at ranges of up to 10 meters from the target. A good data set plays a key role in sonar target classification. Because the previous stage produces a massive amount of raw data, heavy computation is to be expected. To reduce the computational burden of feature extraction and classification, it is essential to detect the targets within the total received data; the intensity of the received signal is used for this. Multi-path propagation, secondary reflections, and reverberation must be taken into account because the region is shallow. After the detection stage and before feature extraction, the researcher uses a matched filter to try to eliminate artifacts.
