Meta-Dataset is a large-scale few-shot learning benchmark consisting of multiple datasets with different data distributions. It does not restrict few-shot tasks to fixed numbers of ways and shots, thus representing a more realistic scenario. It consists of 10 datasets from diverse domains:
- ILSVRC-2012 (the ImageNet dataset; natural images, 1000 categories)
- Omniglot (hand-written characters, 1623 classes)
- Aircraft (aircraft images, 100 classes)
- CUB-200-2011 (birds, 200 classes)
- Describable Textures (texture images, 43 categories)
- Quick Draw (black-and-white sketches, 345 categories)
- Fungi (a large dataset of mushrooms, 1500 categories)
- VGG Flower (flower images, 102 categories)
- Traffic Signs (German traffic sign images, 43 classes)
- MSCOCO (images collected from Flickr, 80 classes)
All datasets except Traffic Signs and MSCOCO have training, validation, and test splits (roughly 70% / 15% / 15%). Traffic Signs and MSCOCO are reserved for testing only.
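The variable-way, variable-shot episode protocol can be sketched as follows. This is a minimal illustration only, not the benchmark's actual sampler; the bounds, function name, and class names are invented for the example:

```python
import random

def sample_episode(class_pool, max_ways=50, max_shots=10, query_size=10):
    """Sample one variable-way, variable-shot episode.

    Unlike fixed N-way K-shot benchmarks, both the number of classes
    (ways) and the support size per class (shots) vary per episode.
    """
    ways = random.randint(5, min(max_ways, len(class_pool)))
    classes = random.sample(sorted(class_pool), ways)
    episode = {}
    for c in classes:
        shots = random.randint(1, max_shots)  # support size varies per class
        episode[c] = {"support": shots, "query": query_size}
    return episode

random.seed(0)
ep = sample_episode({f"class_{i}" for i in range(100)})
print(len(ep))  # number of ways in this episode
```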
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Face For Small Large is a dataset for object detection tasks; it contains 389 images annotated with faces.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description of the INSPIRE Download Service (predefined Atom): Development plan "Große Wiesen 2. Änderung" of the municipality of Rettert. The link(s) for downloading the datasets are dynamically generated from GetMap calls to a WMS interface.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Excel population distribution across 18 age groups. It lists the population in each age group along with each group's percentage of the total population of Excel. The dataset can be used to understand the population distribution of Excel by age. For example, using this dataset, we can identify the largest age group in Excel.
Key observations
The largest age group in Excel, AL was the 45 to 49 years group, with a population of 74 (15.64%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in Excel, AL was the 85 years and over group, with a population of 2 (0.42%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Excel Population by Age. You can refer to it here.
The Microsoft PowerPivot add-on for Excel can be used to handle larger data sets. The add-on is available using the link in the 'Related Links' section: https://www.microsoft.com/en-us/download/details.aspx?id=43348

Once PowerPivot has been installed, to load the large files, please follow the instructions below:
1. Start Excel as normal
2. Click on the PowerPivot tab
3. Click on the PowerPivot Window icon (top left)
4. In the PowerPivot Window, click on the "From Other Sources" icon
5. In the Table Import Wizard, scroll to the bottom and select Text File
6. Browse to the file you want to open and choose the file extension you require, e.g. CSV

Please read the notes below to ensure correct understanding of the data.

Fewer than 5 items: please be aware that the exact number of items is not released where the total number of items falls below 5 for certain drug/patient combinations. Where suppression has been applied, a * is shown in place of the number of items; please read this as 1-4 items.
Suppressions have been applied to items and NIC where items are lower than 5, and to quantity where both quantity and items are lower than 5, for the following drugs and identified genders as per the sensitive drug list:
- BNF Paragraph 60401 (Female Sex Hormones & Their Modulators), where the gender identified on the prescription is Male
- BNF Paragraph 60402 (Male Sex Hormones And Antagonists), where the gender identified on the prescription is Female
- BNF Paragraph 70201 (Preparations For Vaginal/Vulval Changes), where the gender identified on the prescription is Male
- BNF Paragraph 70202 (Vaginal And Vulval Infections), where the gender identified on the prescription is Male
- BNF Paragraph 70301 (Combined Hormonal Contraceptives/Systems), where the gender identified on the prescription is Male
- BNF Paragraph 70302 (Progestogen-only Contraceptives), where the gender identified on the prescription is Male
- BNF Paragraph 80302 (Progestogens), where the gender identified on the prescription is Male
- BNF Paragraph 70405 (Drugs For Erectile Dysfunction), where the gender identified on the prescription is Female
- BNF Paragraph 70406 (Drugs For Premature Ejaculation), where the gender identified on the prescription is Female

This is because the patients could be identified when this data is combined with other information that may be in the public domain or reasonably available. This information falls under the exemption in section 40, subsections 2 and 3A (a), of the Freedom of Information Act. This is because it would breach the first data protection principle, as: a. it is not fair to disclose patients' personal details to the world, and doing so is likely to cause damage or distress; b. these details are not of sufficient interest to the public to warrant an intrusion into the privacy of the patients. Please click the below web link to see the exemption in full.
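As an illustration, the "fewer than 5 items" rule for sensitive code/gender pairs could be applied in code along these lines. This is a hedged sketch only; the field names (`bnf_paragraph`, `gender`, `items`) are invented for the example and are not the dataset's actual column names:

```python
# Sensitive (BNF paragraph code, prescription gender) pairs from the drug list.
SENSITIVE = {
    ("60401", "Male"), ("60402", "Female"), ("70201", "Male"),
    ("70202", "Male"), ("70301", "Male"), ("70302", "Male"),
    ("80302", "Male"), ("70405", "Female"), ("70406", "Female"),
}

def suppress(row):
    """Replace small item counts with '*' for sensitive code/gender pairs.

    Field names are illustrative. Read '*' as 1-4 items.
    """
    row = dict(row)  # avoid mutating the caller's record
    if (row["bnf_paragraph"], row["gender"]) in SENSITIVE and row["items"] < 5:
        row["items"] = "*"
    return row

print(suppress({"bnf_paragraph": "70405", "gender": "Female", "items": 3}))
```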
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description of the INSPIRE Download Service (predefined Atom): Development Plan Große Garten Böhl-Iggelheim. The link(s) for downloading the datasets are dynamically generated from GetMap calls to a WMS interface.
S3DIS comprises 6 colored 3D point clouds from 6 large-scale indoor areas, along with semantic instance annotations for 12 object categories (wall, floor, ceiling, beam, column, window, door, sofa, desk, chair, bookcase, and board).
The Stanford Large-Scale 3D Indoor Spaces (S3DIS) dataset is composed of colored 3D point clouds of six large-scale indoor areas from three different buildings, covering approximately 935, 965, 450, 1700, 870, and 1100 square meters respectively (6020 square meters in total). These areas show diverse properties in architectural style and appearance and mainly include office areas, educational and exhibition spaces, and conference rooms; personal offices, restrooms, open spaces, lobbies, stairways, and hallways are commonly found therein. The point clouds were generated automatically, without any manual intervention, using the Matterport scanner. The dataset also includes semantic instance annotations on the point clouds for 12 semantic elements: structural elements (ceiling, floor, wall, beam, column, window, and door) and commonly found items and furniture (table, chair, sofa, bookcase, and board).
NEWSROOM is a large dataset for training and evaluating summarization systems. It contains 1.3 million articles and summaries written by authors and editors in the newsrooms of 38 major publications.
Dataset features include:
And additional features:
This dataset can be downloaded upon request. Unzip all the contents (train.jsonl, dev.jsonl, test.jsonl) into the tfds folder.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('newsroom', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
This dataset is released along with the paper "A Large Scale Benchmark for Uplift Modeling" by Eustache Diemert, Artem Betlei, Christophe Renaudin (Criteo AI Lab) and Massih-Reza Amini (LIG, Grenoble INP).
This work was published in: AdKDD 2018 Workshop, in conjunction with KDD 2018.
This dataset is constructed by assembling data resulting from several incrementality tests, a particular randomized trial procedure where a random part of the population is prevented from being targeted by advertising. It consists of 25M rows, each one representing a user with 11 features, a treatment indicator, and 2 labels (visits and conversions).
Here is a detailed description of the fields (they are comma-separated in the file):
The dataset was collected and prepared with uplift prediction in mind as the main task. Additionally, we can foresee related uses such as (but not limited to):
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('criteo', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
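As a sketch of the core uplift task on such data, the quantity of interest is the difference in outcome rate between treated and untreated users. The toy rows below are invented for illustration (the real file has 11 features plus a visits label in addition to the treatment indicator and conversion label):

```python
# Toy rows: (treatment, conversion). Illustrative only.
rows = [(1, 1), (1, 0), (1, 1), (1, 0), (0, 0), (0, 1), (0, 0), (0, 0)]

def uplift(rows):
    """Average treatment effect estimate: treated rate minus control rate."""
    treated = [c for t, c in rows if t == 1]
    control = [c for t, c in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

print(uplift(rows))  # 0.5 - 0.25 = 0.25
```

Uplift models then try to predict this effect per user rather than per population, so that advertising can be targeted at users it actually influences.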
Open Government Licence: http://reference.data.gov.uk/id/open-government-licence
A Digital Terrain Model (DTM) is a digital file consisting of a grid of regularly spaced points of known height which, when used with other digital data such as maps or orthophotographs, can provide a 3D image of the land surface. 10m and 50m DTMs are available. This is a large dataset and will take some time to download. Please be patient. By download or use of this dataset you agree to abide by the Open Government Data Licence.
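To illustrate how such a grid of regularly spaced heights yields a continuous surface, here is a minimal bilinear-interpolation sketch. The grid values, spacing, and function name are invented for the example, not taken from the dataset:

```python
def height_at(grid, spacing, x, y):
    """Bilinearly interpolate a height from a regular DTM grid.

    `grid[row][col]` holds heights at points `spacing` metres apart;
    (x, y) is a position in metres inside the grid.
    """
    cx, cy = x / spacing, y / spacing
    c0, r0 = int(cx), int(cy)          # lower-left grid cell corner
    fx, fy = cx - c0, cy - r0          # fractional position within the cell
    h00 = grid[r0][c0]
    h10 = grid[r0][c0 + 1]
    h01 = grid[r0 + 1][c0]
    h11 = grid[r0 + 1][c0 + 1]
    top = h00 * (1 - fx) + h10 * fx
    bot = h01 * (1 - fx) + h11 * fx
    return top * (1 - fy) + bot * fy

grid = [[10.0, 12.0], [14.0, 16.0]]    # 2x2 toy DTM, 10 m spacing
print(height_at(grid, 10.0, 5.0, 5.0))  # cell midpoint -> 13.0
```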
The Census of Agriculture, produced by the United States Department of Agriculture (USDA), provides a complete count of Texas' farms, ranches and the people who grow our food. The census is conducted every five years, most recently in 2022, and provides an in-depth look at the agricultural industry. The complete census includes over 260 separate commodities; this dataset is a subset of 23 commodities selected for publishing. This layer was produced from data obtained from the USDA National Agriculture Statistics Service (NASS) Large Datasets download page. The data were transformed and prepared for publishing using the Pivot Table geoprocessing tool in ArcGIS Pro and joined to county boundaries. The county boundaries are 2022 vintage and come from Living Atlas ACS 2022 feature layers.

Attributes: note that some values are suppressed as "Withheld to avoid disclosing data for individual operations", "Not applicable", or "Less than half the rounding unit". These have been coded in the data as -999, -888, and -777 respectively.

Commodities: Almonds, Animal Totals, Barley, Cattle, Chickens, Corn, Cotton, Crop Totals, Govt Programs, Grain, Grapes, Hay, Hogs, Labor, Machinery Totals, Rice, Sorghum, Soybean, Tractors, Trucks, Turkeys, Wheat, Winter Wheat
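When analyzing this layer, the sentinel codes above should be distinguished from real counts. A small lookup handles this; the sketch below is illustrative (the field handling is not part of the published layer):

```python
# Sentinel codes used in the layer, per the dataset description.
CODES = {
    -999: "Withheld to avoid disclosing data for individual operations",
    -888: "Not applicable",
    -777: "Less than half the rounding unit",
}

def decode(value):
    """Map sentinel values to None (missing); pass real counts through."""
    return None if value in CODES else value

print([decode(v) for v in [1200, -999, -777]])  # [1200, None, None]
```

Treating the sentinels as missing values (rather than leaving them as negative numbers) avoids skewing any sums or averages computed over counties.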
GNU Free Documentation License: https://choosealicense.com/licenses/gfdl/
This is a subset of the wikimedia/wikipedia dataset. Code for creating this dataset:

from datasets import load_dataset, Dataset
from sentence_transformers import SentenceTransformer
from tqdm import tqdm

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

dataset = load_dataset(
    "wikimedia/wikipedia", "20231101.en", split="train", streaming=True
)

data = Dataset.from_dict({})
for i, entry in… See the full description on the dataset page: https://huggingface.co/datasets/not-lain/wikipedia-small-3000-embedded.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Taupō District Council large file download application. Source lidar, contour and imagery files are available for download. Flood Hazard data relating to Plan Change 34 of the Taupō District Plan is also available for download. Taupō District Council does not make any representation or give any warranty as to the accuracy or exhaustiveness of the data provided for download via this application. The data provided is indicative only and does not purport to be a complete database of all information in Taupō District Council's possession or control. Taupō District Council shall not be liable for any loss, damage, cost or expense (whether direct or indirect) arising from reliance upon or use of any data provided, or Council's failure to provide this data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the accompanying dataset to the following paper: https://www.nature.com/articles/s41597-023-01975-w
Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.
If you use Caravan in your research, please cite not only Caravan itself but also the source datasets, to pay respect to the amount of work that went into creating them and that made Caravan possible in the first place.
All current development and additional community extensions can be found at https://github.com/kratzert/Caravan
Channel Log:
23 May 2022: Version 0.2 - Resolved a bug when renaming the LamaH gauge ids from the LamaH ids to the official gauge ids provided as "govnr" in the LamaH dataset attribute files.
24 May 2022: Version 0.3 - Fixed gaps in forcing data in some "camels" (US) basins.
15 June 2022: Version 0.4 - Fixed replacing negative CAMELS US values with NaN (-999 in CAMELS indicates missing observation).
1 December 2022: Version 0.4 - Added 4298 basins in the US, Canada and Mexico (part of HYSETS), now totalling 6830 basins. Fixed a bug in the computation of catchment attributes that are defined as pour point properties, where sometimes the wrong HydroATLAS polygon was picked. Restructured the attribute files and added more metadata (station name and country).
16 January 2023: Version 1.0 - Version of the official paper release. No changes in the data but added a static copy of the accompanying code of the paper. For the most up to date version, please check https://github.com/kratzert/Caravan
10 May 2023: Version 1.1 - No data change, just update data description.
17 May 2023: Version 1.2 - Updated a handful of attribute values that were affected by a bug in their derivation. See https://github.com/kratzert/Caravan/issues/22 for details.
16 April 2024: Version 1.4 - Added 9130 gauges from the original source datasets that were initially not included because of the area thresholds (i.e. basins smaller than 100 sq km or larger than 2000 sq km). Also extended the forcing period for all gauges (including the original ones) to 1950-2023. Added two download options that include timeseries data only, as either csv files (Caravan-csv.tar.xz) or netcdf files (Caravan-nc.tar.xz). Including the large basins also required an update to the Earth Engine code.
16 Jan 2025: Version 1.5 - Added FAO Penman-Monteith PET (potential_evaporation_sum_FAO_PENMAN_MONTEITH) and renamed the ERA5-LAND potential_evaporation band to potential_evaporation_sum_ERA5_LAND. Also added all PET-related climate indices derived with the Penman-Monteith PET band (suffix "_FAO_PM") and renamed the old PET-related indices accordingly (suffix "_ERA5_LAND").
A Digital Terrain Model (DTM) is a digital file consisting of a grid of regularly spaced points of known height which, when used with other digital data such as maps or orthophotographs, can provide a 3D image of the land surface. This download contains OSNI 10k sheet numbers 201-250. This is a large dataset and will take some time to download. Please be patient. This service is published for OpenData. By download or use of this dataset you agree to abide by the LPS Open Government Data Licence. Please note for Open Data NI users: the Esri REST API is not broken; it will not open on its own in a web browser, but the URL can be copied and used in desktop applications and web maps.
A large-scale hierarchical dataset of diverse student activities collected by Santa, a multi-platform self-study solution equipped with an artificial intelligence tutoring system. EdNet contains 131,441,538 interactions from 784,309 students collected over more than 2 years, making it the largest ITS dataset released to the public so far.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Big Rock by race. It includes the population of Big Rock across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Big Rock across relevant racial categories.
Key observations
The percent distribution of Big Rock population by race (across all racial categories recognized by the U.S. Census Bureau): 93.04% are white, 0.16% are American Indian and Alaska Native, 1.80% are Asian, 0.25% are some other race and 4.75% are multiracial.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Big Rock Population by Race & Ethnicity. You can refer to it here.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Open Government Licence: http://reference.data.gov.uk/id/open-government-licence
A Digital Terrain Model (DTM) is a digital file consisting of a grid of regularly spaced points of known height which, when used with other digital data such as maps or orthophotographs, can provide a 3D image of the land surface. This download contains OSNI 10k sheet numbers 1-50.
https://crawlfeeds.com/privacy_policy
The Fox News Dataset is a comprehensive collection of over 1 million news articles, offering an unparalleled resource for analyzing media narratives, public discourse, and political trends. Covering articles up to the year 2023, this dataset is a treasure trove for researchers, analysts, and businesses interested in gaining deeper insights into the topics and trends covered by Fox News.
This large dataset is ideal for:
Discover additional resources for your research needs by visiting our news dataset collection. These datasets are tailored to support diverse analytical applications, including sentiment analysis and trend modeling.
The Fox News Dataset is a must-have for anyone interested in exploring large-scale media data and leveraging it for advanced analysis. Ready to dive into this wealth of information? Download the dataset now in CSV format and start uncovering the stories behind the headlines.