62 datasets found
  1. Fused Image dataset for convolutional neural Network-based crack Detection...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 20, 2023
    Cite
    Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6383043
    Explore at:
    Dataset updated
    Apr 20, 2023
    Authors
    Shanglian Zhou; Carlos Canchila; Wei Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

    The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration). The filtered range data were generated by applying frequency-domain filtering to eliminate image disturbances (e.g., surface variations and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact of different types of image data on deep convolutional neural network (DCNN) performance.
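
    As a quick orientation, the sketch below loads one patch and its ground-truth crack map and reports the labeled crack fraction. The folder layout and file names are hypothetical, not part of the dataset's documented structure:

      # Hypothetical paths; adjust to the layout of the downloaded archive.
      import numpy as np
      from PIL import Image

      patch = np.array(Image.open("fused/patch_0001.png"))    # one 256x256 image patch
      mask = np.array(Image.open("fused_gt/patch_0001.png"))  # its pixel-level crack map
      assert patch.shape[:2] == mask.shape[:2] == (256, 256)
      print(f"Crack pixels: {(mask > 0).mean():.2%}")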

    If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

    In addition, an image dataset for crack classification has also been published at [6].

    References:

    [1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

    [2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

    [3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

    [4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

    [5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

    [6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

  2. 🛒 Supermarket Data

    • kaggle.com
    zip
    Updated Jul 19, 2024
    Cite
    mexwell (2024). 🛒 Supermarket Data [Dataset]. https://www.kaggle.com/datasets/mexwell/supermarket-data/versions/1
    Explore at:
    zip (78427538 bytes)
    Dataset updated
    Jul 19, 2024
    Authors
    mexwell
    Description

    This is the dataset released as a companion to the paper “Explaining the Product Range Effect in Purchase Data”, presented at the BigData 2013 conference.

    • supermarket_distances: three columns. The first column is the customer id, the second is the shop id, and the third is the distance between the customer’s house and the shop location. The distance is calculated in meters as a straight line, so it does not take the road graph into account.
    • supermarket_prices: two columns. The first column is the product id and the second is its unit price. The price is in euros and is calculated as the average unit price over the time span of the dataset.
    • supermarket_purchases: four columns. The first column is the customer id, the second is the product id, the third is the shop id, and the fourth is the total number of items of that product that the customer bought in that particular shop. The data are recorded from January 2007 to December 2011. (A sketch of joining these three files follows below.)
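
    A minimal pandas sketch of joining the three files described above. The .csv extension, separator, and header-less layout are assumptions, not documented properties of the release:

      import pandas as pd

      distances = pd.read_csv("supermarket_distances.csv",
                              names=["customer_id", "shop_id", "distance_m"])
      prices = pd.read_csv("supermarket_prices.csv",
                           names=["product_id", "unit_price"])
      purchases = pd.read_csv("supermarket_purchases.csv",
                              names=["customer_id", "product_id", "shop_id", "items"])

      # Total spend per customer/shop pair, with the home-shop distance attached.
      spend = (purchases.merge(prices, on="product_id")
                        .assign(spend=lambda d: d["items"] * d["unit_price"])
                        .groupby(["customer_id", "shop_id"], as_index=False)["spend"].sum()
                        .merge(distances, on=["customer_id", "shop_id"]))
      print(spend.head())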

    Citation

    Pennacchioli, D., Coscia, M., Rinzivillo, S., Pedreschi, D. and Giannotti, F., Explaining the Product Range Effect in Purchase Data. In BigData, 2013.

    Acknowledgement

    Photo by Eduardo Soares on Unsplash

  3. Data from: FISBe: A real-world benchmark dataset for instance segmentation...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, json +3
    Updated Apr 2, 2024
    Cite
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. http://doi.org/10.5281/zenodo.10875063
    Explore at:
    zip, text/x-python, bin, json, txt
    Dataset updated
    Apr 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 26, 2024
    Description

    General

    For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

    Summary

    • A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains
      • 30 completely labeled (segmented) images
      • 71 partly labeled images
      • altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes 30-60 min on average, and a difficult one can take up to 4 hours)
    • To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects
    • A set of metrics and a novel ranking score for respective meaningful method benchmarking
    • An evaluation of three baseline methods in terms of the above metrics and score

    Abstract

    Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

    Dataset documentation:

    We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

    >> FISBe Datasheet

    Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

    Files

    • fisbe_v1.0_{completely,partly}.zip
      • contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.
    • fisbe_v1.0_mips.zip
      • maximum intensity projections of all samples, for convenience.
    • sample_list_per_split.txt
      • a simple list of all samples and the subset they are in, for convenience.
    • view_data.py
      • a simple python script to visualize samples, see below for more information on how to use it.
    • dim_neurons_val_and_test_sets.json
      • a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.
    • Readme.md
      • general information

    How to work with the image files

    Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
    For each image, we provide a pixel-wise instance segmentation for all separable neurons.
    Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays, based on an open-source specification).
    The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
    The segmentation mask for each neuron is stored in a separate channel.
    The order of dimensions is CZYX.

    We recommend working in a virtual environment, e.g., by using conda:

    conda create -y -n flylight-env -c conda-forge python=3.9
    conda activate flylight-env

    How to open zarr files

    1. Install the python zarr package:
      pip install zarr
    2. Open a zarr file with:

      import zarr

      # "sample.zarr" is a placeholder for one of the per-sample zarr files;
      # the array paths follow the view_data.py script below.
      raw = zarr.open("sample.zarr", mode='r', path="volumes/raw")
      seg = zarr.open("sample.zarr", mode='r', path="volumes/gt_instances")

      # optional: load into memory as a numpy array
      import numpy as np
      raw_np = np.array(raw)

    Zarr arrays are read lazily on-demand.
    Many functions that expect numpy arrays also work with zarr arrays.
    Optionally, the arrays can also explicitly be converted to numpy arrays.
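
    For example (a minimal sketch reusing the raw array opened above; the index 100 is an arbitrary z position), a single slice can be read without loading the whole volume:

      # Lazily read one z-slice of the first channel; only the chunks
      # covering that slice are fetched from storage.
      slice_yx = raw[0, 100]   # numpy array of shape (Y, X)
      print(raw.shape, slice_yx.shape)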

    How to view zarr image files

    We recommend using napari to view the image data.

    1. Install napari:
      pip install "napari[all]"
    2. Save the following Python script:

      import zarr, sys, napari

      raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
      gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

      viewer = napari.Viewer(ndisplay=3)
      for idx, gt in enumerate(gts):
          viewer.add_labels(
              gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
      viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
      viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
      viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
      napari.run()

    3. Execute:
      python view_data.py <path/to/sample.zarr>

    Metrics

    • S: Average of avF1 and C
    • avF1: Average F1 Score
    • C: Average ground truth coverage
    • clDice_TP: Average true positives clDice
    • FS: Number of false splits
    • FM: Number of false merges
    • tp: Relative number of true positives

    For more information on our selected metrics and formal definitions please see our paper.

    Baseline

    To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN), and a non-learnt, application-specific color clustering from Duan et al.
    For detailed information on the methods and the quantitative results please see our paper.

    License

    The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    Citation

    If you use FISBe in your research, please use the following BibTeX entry:

    @misc{mais2024fisbe,
     title =    {FISBe: A real-world benchmark dataset for instance
             segmentation of long-range thin filamentous structures},
     author =    {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
             Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena
             Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
     year =     2024,
     eprint =    {2404.00130},
     archivePrefix ={arXiv},
     primaryClass = {cs.CV}
    }

    Acknowledgments

    We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable
    discussions.
    P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
    This work was co-funded by Helmholtz Imaging.

    Changelog

    There have been no changes to the dataset so far.
    All future changes will be listed on the changelog page.

    Contributing

    If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

    All contributions are welcome!

  4. South Range, MI Population Breakdown by Gender

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    + more versions
    Cite
    Neilsberg Research (2023). South Range, MI Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/658fcb29-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    json, csv
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Michigan, South Range
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of South Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of South Range across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of male population, with 50.54% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data on biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in the South Range is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of the total population of South Range. Please note that the percentages may not total 100% due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for South Range Population by Gender. You can refer to the same here.

  5. Grass Range, MT Population Breakdown by Gender

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    + more versions
    Cite
    Neilsberg Research (2023). Grass Range, MT Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/649529eb-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    json, csv
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Montana, Grass Range
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Grass Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Grass Range across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 52.63% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data on biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in the Grass Range is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of the total population of Grass Range. Please note that the percentages may not total 100% due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Grass Range Population by Gender. You can refer to the same here.

  6. mmWave-based Fitness Activity Recognition Dataset

    • zenodo.org
    png, zip
    Updated Jul 12, 2024
    Cite
    Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen (2024). mmWave-based Fitness Activity Recognition Dataset [Dataset]. http://doi.org/10.5281/zenodo.7793613
    Explore at:
    zip, png
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo
    Authors
    Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description:

    This mmWave dataset is used for fitness activity identification. The dataset (FA Dataset) contains 14 common daily fitness activities. The data were captured by the TI-AWR1642 mmWave radar. The dataset can be used by fellow researchers to reproduce the original work or to further explore other machine-learning problems in the domain of mmWave signals.

    Format: .png

    Section 1: Device Configuration

    Section 2: Data Format

    We provide our mmWave data as heatmaps for this dataset. The data files are in .png format. The details are as follows:

    • 14 activities are included in the FA Dataset.
    • 2 participants are included in the FA Dataset.
    • FA_d_p_i_u_j.png:
      • d represents the date on which the fitness data were collected.
      • p represents the environment in which the fitness data were collected.
      • i represents the fitness activity type index.
      • u represents the user id.
      • j represents the sample index.
    • Example:
      • FA_20220101_lab_1_2_3 represents the 3rd data sample of user 2 performing activity 1, collected in the lab (a filename-parsing sketch follows this list).
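
    A minimal sketch of parsing this documented naming scheme (the helper function is ours, for illustration only):

      from pathlib import Path

      def parse_fa_name(path):
          """Split e.g. 'FA_20220101_lab_1_2_3.png' into its documented fields."""
          _, date, env, activity, user, sample = Path(path).stem.split("_")
          return {"date": date, "environment": env,
                  "activity": int(activity), "user": int(user), "sample": int(sample)}

      print(parse_fa_name("FA_20220101_lab_1_2_3.png"))
      # {'date': '20220101', 'environment': 'lab', 'activity': 1, 'user': 2, 'sample': 3}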

    Section 3: Experimental Setup

    • We place the mmWave device on a table with a height of 60 cm.
    • The participants are asked to perform fitness activities 2 m in front of the mmWave device.
    • The data were collected in a lab with a size of 5.0 m × 3.0 m.

    Section 4: Data Description

    • We develop a spatial-temporal heatmap that integrates multiple activity features, including the range of movement, velocity, and time duration of each activity repetition.

    • We first derive the Doppler-range map of the user's activity by calculating the Range-FFT and Doppler-FFT. Then, we generate the spatial-temporal heatmap by accumulating the velocity at every distance in every Doppler-range map. Next, we normalize the derived velocity information and present the velocity-distance relationship in the time dimension. In this way, we transform the original instantaneous velocity-distance relationship into a more comprehensive spatial-temporal heatmap that describes the whole activity (see the sketch after the activity table below).

    • As shown in the attached figure, in each spatial-temporal heatmap the horizontal axis represents the time duration of an activity repetition, while the vertical axis represents the range of movement. Velocity is represented by color.

    • We create 14 zip files to store the dataset. The 14 zip files start with "FA", and each contains repetitions of the same fitness activity.

    14 common daily activities and their corresponding files

    File Name  Activity Type            File Name  Activity Type
    FA1        Crunches                 FA8        Squats
    FA2        Elbow plank and reach    FA9        Burpees
    FA3        Leg raise                FA10       Chest squeezes
    FA4        Lunges                   FA11       High knees
    FA5        Mountain climber         FA12       Side leg raise
    FA6        Punches                  FA13       Side to side chops
    FA7        Push ups                 FA14       Turning kicks
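
    The heatmap pipeline described above can be sketched in numpy as follows. This is a minimal illustration under assumed radar-cube dimensions (frames × chirps × samples); it is not the authors' exact processing chain:

      import numpy as np

      def spatial_temporal_heatmap(cube):
          """cube: complex radar cube of shape (frames, chirps, samples)."""
          range_fft = np.fft.fft(cube, axis=2)  # Range-FFT over fast time
          doppler_fft = np.fft.fftshift(np.fft.fft(range_fft, axis=1), axes=1)  # Doppler-FFT
          # Accumulate velocity energy at every range bin in each frame,
          # yielding one range profile per time step.
          heatmap = np.abs(doppler_fft).sum(axis=1).T  # shape: (range, time)
          return heatmap / heatmap.max()               # normalize to [0, 1]

      demo_cube = np.random.randn(64, 128, 256) + 1j * np.random.randn(64, 128, 256)
      print(spatial_temporal_heatmap(demo_cube).shape)  # (256, 64): range bins x frames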

    Section 5: Raw Data and Data Processing Algorithms

    • We also provide the mmWave raw data (.mat format), stored in the same zip file as the corresponding heatmap datasets. Each .mat file stores one set of activity repetitions (e.g., 4 repetitions) from the same user.
      • For example: FA_d_p_i_u_j.mat:
        • d represents the date on which the data were collected.
        • p represents the environment in which the data were collected.
        • i represents the activity type index.
        • u represents the user id.
        • j represents the set index.
    • We plan to provide the data processing algorithms (heatmap_generation.py) to load the mmWave raw data and generate the corresponding heatmap data.

    Section 6: Citations

    If your paper is related to our work, please cite our papers as follows.

    https://ieeexplore.ieee.org/document/9868878/

    Xie, Yucheng, Ruizhe Jiang, Xiaonan Guo, Yan Wang, Jerry Cheng, and Yingying Chen. "mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave." In 2022 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2022.

    Bibtex:

    @inproceedings{xie2022mmfit,
      title={mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave},
      author={Xie, Yucheng and Jiang, Ruizhe and Guo, Xiaonan and Wang, Yan and Cheng, Jerry and Chen, Yingying},
      booktitle={2022 International Conference on Computer Communications and Networks (ICCCN)},
      pages={1--10},
      year={2022},
      organization={IEEE}
    }

  7. ANN development + final testing datasets

    • data.niaid.nih.gov
    • resodate.org
    • +1 more
    Updated Jan 24, 2020
    Cite
    Authors (2020). ANN development + final testing datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1445865
    Explore at:
    Dataset updated
    Jan 24, 2020
    Authors
    Authors
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    File name definitions:

    '...v_50_175_250_300...' - dataset for velocity ranges [50, 175] + [250, 300] m/s

    '...v_175_250...' - dataset for velocity range [175, 250] m/s

    'ANNdevelop...' - used to perform 9 parametric sub-analyses where, in each one, many ANNs are developed (trained, validated and tested) and the one yielding the best results is selected

    'ANNtest...' - used to test the best ANN from each aforementioned parametric sub-analysis, aiming to find the best ANN model; this dataset includes the 'ANNdevelop...' counterpart

    Where to find the input (independent) and target (dependent) variable values for each dataset/Excel file?

    input values in 'IN' sheet

    target values in 'TARGET' sheet

    Where to find the results from the best ANN model (for each target/output variable and each velocity range)?

    open the corresponding Excel file; the expected (target) vs. ANN (output) results are written in the 'TARGET vs OUTPUT' sheet (a reading sketch follows below)
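
    A minimal pandas sketch of reading these sheets (the file name is a placeholder; the sheet names 'IN', 'TARGET', and 'TARGET vs OUTPUT' follow the description above):

      import pandas as pd

      path = "ANNdevelop_v_175_250.xlsx"  # placeholder file name
      inputs = pd.read_excel(path, sheet_name="IN")                  # independent variables
      targets = pd.read_excel(path, sheet_name="TARGET")             # dependent variables
      results = pd.read_excel(path, sheet_name="TARGET vs OUTPUT")   # expected vs ANN output
      print(inputs.shape, targets.shape, results.shape)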

    Check reference below (to be added when the paper is published)

    https://www.researchgate.net/publication/328849817_11_Neural_Networks_-_Max_Disp_-_Railway_Beams

  8. INSPIRE Priority Data Set (Compliant) - Species range

    • inspire-geoportal.ec.europa.eu
    • inspire-geoportal.lt
    • +1 more
    Updated Aug 26, 2020
    + more versions
    Cite
    Construction Sector Development Agency (2020). INSPIRE Priority Data Set (Compliant) - Species range [Dataset]. https://inspire-geoportal.ec.europa.eu/srv/api/records/bfcc7a93-dd66-453b-b7f5-9fc4a868e69f
    Explore at:
    www:download-1.0-http--download, www:link-1.0-http--link, ogc:wms-1.3.0-http-get-map
    Dataset updated
    Aug 26, 2020
    Dataset provided by
    Construction Sector Development Agency
    State Service for Protected Areas under the Ministry of Environment
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations

    Area covered
    Description

    INSPIRE Priority Data Set (Compliant) - Species range

  9. Street Network Database SND

    • catalog.data.gov
    • data.seattle.gov
    • +2 more
    Updated Oct 4, 2025
    + more versions
    Cite
    City of Seattle ArcGIS Online (2025). Street Network Database SND [Dataset]. https://catalog.data.gov/dataset/street-network-database-snd-1712b
    Explore at:
    Dataset updated
    Oct 4, 2025
    Dataset provided by
    City of Seattle ArcGIS Online
    Description

    The pathway representation consists of segments and intersection elements. A segment is a linear graphic element that represents a continuous physical travel path, terminated by a path end (dead end) or a physical intersection with other travel paths. Segments have one street name, one address range, and one set of segment characteristics. A segment may have zero or multiple alias street names. Segment types included are Freeways, Highways, Streets, Alleys (named only), Railroads, Walkways, and Bike lanes. SNDSEG_PV is a linear feature class representing the SND Segment Feature, with attributes for Street name, Address Range, Alias Street name, and segment Characteristics objects. Part of the Address Range and all of the Street name objects are logically shared with the Discrete Address Point-Master Address File layer. Appropriate uses include:

    • Cartography - Used to depict the City's transportation network location and connections, typically on smaller-scaled maps or images where a single-line representation is appropriate. Also used to depict specific classifications of roadway use, typically at smaller scales, and to label transportation network feature names and associated address ranges, typically on larger-scaled maps.
    • Geocode reference - Used as a source for derived reference data for address validation and theoretical address location.
    • Address Range data repository - This data store is the City's address range repository, defining address ranges in association with transportation network features.
    • Polygon boundary reference - Used to define various area boundaries in other feature classes where coincident with the transportation network. Does not contain polygon features.
    • Address-based extracts - Used to create flat-file extracts, typically indexed by address, with reference to business data typically associated with transportation network features.
    • Thematic linear location reference - By providing unique, stable identifiers for each linear feature, thematic data is associated with specific transportation network features via these identifiers.
    • Thematic intersection location reference - By providing unique, stable identifiers for each intersection feature, thematic data is associated with specific transportation network features via these identifiers.
    • Network route tracing - Used as a source for derived reference data to determine point-to-point travel paths or optimal stop allocation along a travel path.
    • Topological connections with segments - Used to provide a specific definition of location for each transportation network feature, and a specific definition of connection between features (defines where the streets are and the relationship between them, e.g., 4th Ave is west of 5th Ave and 4th Ave does intersect with Cherry St).
    • Event location reference - Used as a source for derived reference data to locate events and for linear referencing.

    Data source is TRANSPO.SNDSEG_PV. Updated weekly.

  10. Israel-Palestine Conflict Tweets Dataset

    • kaggle.com
    zip
    Updated Jan 1, 2024
    + more versions
    Cite
    MehyarMlaweh (2024). Israel-Palestine Conflict Tweets Dataset [Dataset]. https://www.kaggle.com/datasets/mehyarmlaweh/israel-palestine-conflict-tweets-dataset
    Explore at:
    zip (2016138 bytes)
    Dataset updated
    Jan 1, 2024
    Authors
    MehyarMlaweh
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Israel
    Description

    This dataset contains tweets related to the Israel-Palestine conflict from October 17, 2023, to December 17, 2023. It includes information on tweet IDs, links, text, date, likes, and comments, categorized into different ranges of like counts.

    Dataset Details

    • Date Range: October 17, 2023 - December 17, 2023
    • Total Tweets: 15,478
    • Unique Tweets: 14,854

    Data Description

    The dataset consists of the following columns:

    • id: Unique identifier for the tweet
    • link: URL link to the tweet
    • text: Text content of the tweet
    • date: Date and time when the tweet was posted
    • likes: Number of likes the tweet received
    • comments: Number of comments the tweet received
    • Label: Like count range category
    • Count: Number of tweets in the like count range category

    How to Process the Data

    To process the dataset, you can use the following Python code. This code reads the CSV file, cleans the tweets, tokenizes and lemmatizes the text, and filters out non-English tweets.

    Required Libraries

    Make sure you have the following libraries installed:

    pip install pandas nltk langdetect
    

    Data Processing Code

    Here’s the code to process the tweets:

    import pandas as pd
    import re
    import nltk
    from nltk.tokenize import word_tokenize
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer
    from langdetect import detect, LangDetectException

    # One-time NLTK resource downloads (uncomment on first run):
    # nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')

    # Define the TweetProcessor class
    class TweetProcessor:
        def __init__(self, file_path):
            """Initialize the object with the path to the CSV file."""
            self.df = pd.read_csv(file_path)
            # Convert 'text' column to string type
            self.df['text'] = self.df['text'].astype(str)

        def clean_tweet(self, tweet):
            """Clean a tweet by removing links, special characters, and extra spaces."""
            # Remove links (http and https)
            tweet = re.sub(r'http\S+', '', tweet, flags=re.MULTILINE)
            # Remove special characters and numbers
            tweet = re.sub(r'\W', ' ', tweet)
            # Replace multiple spaces with a single space
            tweet = re.sub(r'\s+', ' ', tweet)
            # Remove leading and trailing spaces
            tweet = tweet.strip()
            return tweet

        def tokenize_and_lemmatize(self, tweet):
            """Tokenize and lemmatize a tweet: lowercase, drop stopwords, lemmatize."""
            # Tokenize the text
            tokens = word_tokenize(tweet)
            # Keep alphabetic tokens only, converted to lowercase
            tokens = [word.lower() for word in tokens if word.isalpha()]
            # Remove stopwords
            stop_words = set(stopwords.words('english'))
            tokens = [word for word in tokens if word not in stop_words]
            # Lemmatize the tokens
            lemmatizer = WordNetLemmatizer()
            tokens = [lemmatizer.lemmatize(word) for word in tokens]
            # Join tokens back into a single string
            return ' '.join(tokens)

        def process_tweets(self):
            """Apply language filtering, cleaning, and lemmatization to the DataFrame."""
            def lang(x):
                try:
                    return detect(x) == 'en'
                except LangDetectException:
                    return False
            # Keep English tweets only
            self.df = self.df[self.df['text'].apply(lang)]
            # Apply cleaning function
            self.df['cleaned_text'] = self.df['text'].apply(self.clean_tweet)
            # Apply tokenization and lemmatization function
            self.df['tokenized_and_lemmatized'] = self.df['cleaned_text'].apply(self.tokenize_and_lemmatize)
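    A short usage sketch (the CSV file name is a placeholder for the dataset's export):

      processor = TweetProcessor("israel_palestine_tweets.csv")  # hypothetical file name
      processor.process_tweets()
      print(processor.df[['text', 'cleaned_text', 'tokenized_and_lemmatized']].head())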


    Usage

    This dataset can be used for various research purposes, including sentiment analysis, trend analysis, and event impact studies related to the Israel-Palestine conflict. For questions or feedback, please contact:

    • Name: Mehyar Mlaweh
    • Email: mehyarmlaweh0@gmail.com
  11. Guns Close Range Dataset

    • universe.roboflow.com
    zip
    Updated Oct 22, 2025
    Cite
    Computer vision (2025). Guns Close Range Dataset [Dataset]. https://universe.roboflow.com/computer-vision-kcsdu/guns-close-range-7hqvz/dataset/1
    Explore at:
    zip
    Dataset updated
    Oct 22, 2025
    Dataset authored and provided by
    Computer vision
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Objects Objects Obj 2SfO Bounding Boxes
    Description

    Guns Close Range

    ## Overview

    Guns Close Range is a dataset for object detection tasks - it contains Objects Objects Objects Obj 2SfO annotations for 682 images.

    ## Getting Started

    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
  12. Archery ranges Business Data for United States

    • poidata.io
    csv, json
    Updated Nov 30, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Business Data Provider (2025). Archery ranges Business Data for United States [Dataset]. https://www.poidata.io/report/archery-range/united-states
    Explore at:
    json, csv
    Dataset updated
    Nov 30, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Business Categories, Geographic Coordinates
    Description

    Comprehensive dataset containing 1,852 verified archery range businesses in the United States, with complete contact information, ratings, reviews, and location data.

  13. housing

    • kaggle.com
    zip
    Updated Sep 22, 2023
    Cite
    HappyRautela (2023). housing [Dataset]. https://www.kaggle.com/datasets/happyrautela/housing
    Explore at:
    zip (809785 bytes)
    Dataset updated
    Sep 22, 2023
    Authors
    HappyRautela
    Description

    The following exercise contains questions based on the housing dataset.

    1. How many houses have a waterfront? a. 21000 b. 21450 c. 163 d. 173

    2. How many houses have 2 floors? a. 2692 b. 8241 c. 10680 d. 161

    3. How many houses built before 1960 have a waterfront? a. 80 b. 7309 c. 90 d. 92

    4. What is the price of the most expensive house having more than 4 bathrooms? a. 7700000 b. 187000 c. 290000 d. 399000

    5. If the ‘price’ column contains outliers, how can you clean the data and remove the redundancies? a. Calculate the IQR range and drop the values outside the range. b. Calculate the p-value and remove the values less than 0.05. c. Calculate the correlation coefficient of the price column and remove the values less than the correlation coefficient. d. Calculate the Z-score of the price column and remove the values less than the z-score.

    6. What are the various parameters that can be used to determine the dependent variables in the housing data to determine the price of the house? a. Correlation coefficients b. Z-score c. IQR Range d. Range of the Features

    7. If we get the r2 score as 0.38, what inferences can we make about the model and its efficiency? a. The model is 38% accurate, and shows poor efficiency. b. The model is showing 0.38% discrepancies in the outcomes. c. Low difference between observed and fitted values. d. High difference between observed and fitted values.

    8. If the metrics show that the p-value for the grade column is 0.092, what inferences can we make about the grade column? a. Significant in presence of other variables. b. Highly significant in presence of other variables. c. Insignificant in presence of other variables. d. None of the above

    9. If the Variance Inflation Factor value for a feature is considerably higher than the other features, what can we say about that column/feature? a. High multicollinearity b. Low multicollinearity c. Both A and B d. None of the above
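
    As a worked sketch of option (a) in question 5, the IQR rule keeps values inside [Q1 - 1.5×IQR, Q3 + 1.5×IQR]; the numbers below are illustrative, not drawn from the dataset:

      import pandas as pd

      # Illustrative prices; the last value is an obvious outlier.
      df = pd.DataFrame({"price": [221900, 538000, 180000, 604000, 7700000]})
      q1, q3 = df["price"].quantile([0.25, 0.75])
      iqr = q3 - q1
      lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
      cleaned = df[df["price"].between(lower, upper)]  # drop values outside the IQR fence
      print(cleaned)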

  14. Data from: A comprehensive analysis of autocorrelation and bias in home...

    • borealisdata.ca
    • datasetcatalog.nlm.nih.gov
    • +1 more
    Updated May 19, 2021
    + more versions
    Cite
    Michael J. Noonan; Marlee A. Tucker; Christen H. Fleming; Tom S. Akre; Susan C. Alberts; Abdullahi H. Ali; Jeanne Altmann; Pamela C. Antunes; Jerrold L. Belant; Dean Beyer; Niels Blaum; Katrin Böhning-Gaese; Laury Cullen Jr.; Rogerio de Paula Cunha; Jasja Dekker; Jonathan Drescher-Lehman; Nina Farwig; Claudia Fichtel; Christina Fischer; Adam T. Ford; Jacob R. Goheen; René Janssen; Florian Jeltsch; Matthew Kauffman; Peter M. Kappeler; Flávia Koch; Scott LaPoint; A. Catherine Markham; Emilia Patricia Medici; Ronaldo G. Morato; Ran Nathan; Luiz Gustavo R. Oliveira-Santos; Kirk A. Olson; Bruce D. Patterson; Agustin Paviolo; Emiliano E. Ramalho; Sascha Rosner; Nuria Selva; Agnieszka Sergiel; Marina X. da Silva; Orr Spiegel; Peter Thompson; Wiebke Ullmann; Filip Zięba; Tomasz Zwijacz-Kozica; William F. Fagan; Thomas Mueller; Justin M. Calabrese (2021). Data from: A comprehensive analysis of autocorrelation and bias in home range estimation [Dataset]. http://doi.org/10.5683/SP2/OAJTAO
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 19, 2021
    Dataset provided by
    Borealis
    Authors
    Michael J. Noonan; Marlee A. Tucker; Christen H. Fleming; Tom S. Akre; Susan C. Alberts; Abdullahi H. Ali; Jeanne Altmann; Pamela C. Antunes; Jerrold L. Belant; Dean Beyer; Niels Blaum; Katrin Böhning-Gaese; Laury Cullen Jr.; Rogerio de Paula Cunha; Jasja Dekker; Jonathan Drescher-Lehman; Nina Farwig; Claudia Fichtel; Christina Fischer; Adam T. Ford; Jacob R. Goheen; René Janssen; Florian Jeltsch; Matthew Kauffman; Peter M. Kappeler; Flávia Koch; Scott LaPoint; A. Catherine Markham; Emilia Patricia Medici; Ronaldo G. Morato; Ran Nathan; Luiz Gustavo R. Oliveira-Santos; Kirk A. Olson; Bruce D. Patterson; Agustin Paviolo; Emiliano E. Ramalho; Sascha Rosner; Nuria Selva; Agnieszka Sergiel; Marina X. da Silva; Orr Spiegel; Peter Thompson; Wiebke Ullmann; Filip Zięba; Tomasz Zwijacz-Kozica; William F. Fagan; Thomas Mueller; Justin M. Calabrese
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Global
    Dataset funded by
    National Science Foundation
    Description

    Abstract

    Home range estimation is routine practice in ecological research. While advances in animal tracking technology have increased our capacity to collect data to support home range analysis, these same advances have also resulted in increasingly autocorrelated data. Consequently, the question of which home range estimator to use on modern, highly autocorrelated tracking data remains open. This question is particularly relevant given that most estimators assume independently sampled data. Here, we provide a comprehensive evaluation of the effects of autocorrelation on home range estimation. We base our study on an extensive dataset of GPS locations from 369 individuals representing 27 species distributed across 5 continents. We first assemble a broad array of home range estimators, including Kernel Density Estimation (KDE) with four bandwidth optimizers (Gaussian reference function, autocorrelated-Gaussian reference function (AKDE), Silverman's rule of thumb, and least squares cross-validation), Minimum Convex Polygon, and Local Convex Hull methods. Notably, all of these estimators except AKDE assume independent and identically distributed (IID) data. We then employ half-sample cross-validation to objectively quantify estimator performance, and the recently introduced effective sample size for home range area estimation ($\hat{N}_\mathrm{area}$) to quantify the information content of each dataset. We found that AKDE 95% area estimates were larger than conventional IID-based estimates by a mean factor of 2. The median number of cross-validated locations included in the holdout sets by AKDE 95% (or 50%) estimates was 95.3% (or 50.1%), confirming the larger AKDE ranges were appropriately selective at the specified quantile. Conversely, conventional estimates exhibited negative bias that increased with decreasing $\hat{N}_\mathrm{area}$. To contextualize our empirical results, we performed a detailed simulation study to tease apart how sampling frequency, sampling duration, and the focal animal's movement conspire to affect range estimates. Paralleling our empirical results, the simulation study demonstrated that AKDE was generally more accurate than conventional methods, particularly for small $\hat{N}_\mathrm{area}$. While 72% of the 369 empirical datasets had >1000 total observations, only 4% had an $\hat{N}_\mathrm{area}$ >1000, whereas 30% had an $\hat{N}_\mathrm{area}$ <30. In this frequently encountered scenario of small $\hat{N}_\mathrm{area}$, AKDE was the only estimator capable of producing an accurate home range estimate on autocorrelated data.

    Usage notes

    Empirical GPS tracking data: anonymised, empirical tracking data used to estimate home range areas based on various home range estimators (Anonymised_Data.zip).

  15. Grass Range, MT households by income brackets: family, non-family, and...

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Grass Range, MT households by income brackets: family, non-family, and total, in 2023 inflation-adjusted dollars [Dataset]. https://www.neilsberg.com/insights/grass-range-mt-median-household-income/
    Explore at:
    csv, json
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Montana, Grass Range
    Variables measured
    Income Level, All households, Family households, Non-Family households, Percent of All households, Percent of Family households, Percent of Non-Family households
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across the income brackets mentioned above, following an initial analysis and categorization. The percentages of all, family, and non-family households were computed by grouping data as applicable. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents a breakdown of households across various income brackets in Grass Range, MT, as reported by the U.S. Census Bureau. The Census Bureau classifies households into different categories, including total households, family households, and non-family households. Our analysis of U.S. Census Bureau American Community Survey data for Grass Range, MT reveals how household income distribution varies among these categories. The dataset highlights the variation in number of households with income, offering valuable insights into the distribution of Grass Range households based on income levels.

    Key observations

    • For Family Households: In Grass Range, the majority of family households, representing NA%, earn NA, showcasing a substantial share of the community families falling within this income bracket. Conversely, the minority of family households, comprising NA%, have incomes falling NA, representing a smaller but still significant segment of the community.
    • For Non-Family Households: In Grass Range, the majority of non-family households, accounting for NA%, have income NA, indicating that a substantial portion of non-family households falls within this income bracket. On the other hand, the minority of non-family households, comprising NA%, earn NA, representing a smaller, yet notable, portion of non-family households in the community.
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Less than $10,000
    • $10,000 to $14,999
    • $15,000 to $19,999
    • $20,000 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $59,999
    • $60,000 to $74,999
    • $75,000 to $99,999
    • $100,000 to $124,999
    • $125,000 to $149,999
    • $150,000 to $199,999
    • $200,000 or more

    Variables / Data Columns

    • Income Level: The income level represents the income brackets ranging from Less than $10,000 to $200,000 or more in Grass Range, MT (As mentioned above).
    • All Households: Count of households for the specified income level
    • % All Households: Percentage of households at the specified income level relative to the total households in Grass Range, MT
    • Family Households: Count of family households for the specified income level
    • % Family Households: Percentage of family households at the specified income level relative to the total family households in Grass Range, MT
    • Non-Family Households: Count of non-family households for the specified income level
    • % Non-Family Households: Percentage of non-family households at the specified income level relative to the total non-family households in Grass Range, MT

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Grass Range median household income. You can refer to the same here.

  16. Data from: How can appropriate hue ranges be selected for sequential color...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Apr 10, 2024
    Cite
    Taisheng Chen; Xi Lv; Kun Hu; Menglin Chen; Lu Cheng; Weixing Jiang (2024). How can appropriate hue ranges be selected for sequential color schemes on choropleth maps? A quantitative evaluation using map-reading experiments [Dataset]. http://doi.org/10.6084/m9.figshare.25572387.v3
    Explore at:
    xlsx
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Taisheng Chen; Xi Lv; Kun Hu; Menglin Chen; Lu Cheng; Weixing Jiang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We recruited 414 college students to participate in the experiment. Through the experiment, we collected their visual data and organized them according to different visual indicators. We then processed the data through qualitative and quantitative analysis to obtain the final results.

  17. Data from: Haploids adapt faster than diploids across a range of...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Dec 7, 2010
    + more versions
    Cite
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto (2010). Haploids adapt faster than diploids across a range of environments [Dataset]. http://doi.org/10.5061/dryad.8048
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 7, 2010
    Dataset provided by
    Dryad
    Authors
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto
    Time period covered
    Dec 7, 2010
    Description

    The data package contains the following files:

    • dataall.csv: Raw dataset for the rate-of-adaptation calculations (Figure 1) and related statistics.
    • Competition Analysis.R: R code to analyze the raw rate-of-adaptation data.
    • datacount.csv: Raw data to calculate effective population sizes.
    • Cell Count Ne.R: R code used to analyze effective population sizes (Figure 2).
    • what is h.R: R code to produce Figures 3, S4, and S5 and to determine the best estimate of the dominance coefficient in each environment. Note: the competition and effective population size R code must be run first in the same session.

  18. mmWave-based Activity Recognition Dataset

    • zenodo.org
    png, zip
    Updated Jul 12, 2024
    Cite
    Yucheng Xie; Ruizhe Jiang; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen (2024). mmWave-based Activity Recognition Dataset [Dataset]. http://doi.org/10.5281/zenodo.7678020
    Explore at:
    Available download formats: png, zip
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo
    Authors
    Yucheng Xie; Ruizhe Jiang; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These mmWave datasets are used for activity verification. The collection contains two datasets: the first (FA Dataset) covers 14 common daily activities, and the second (EA Dataset) covers 5 kinds of eating activities. The data are captured by the mmWave radar TI-AWR1642. The dataset can be used by fellow researchers to reproduce the original work or to further explore other machine-learning problems in the domain of mmWave signals.

    Format: .png

    Section 1: Device Configuration

    Section 2: Data Format

    We provide our mmWave data as heatmaps for both datasets. The data files are in PNG format; the details are as follows:

    FA Dataset

    • 2 participants are included in the FA Dataset.
    • 14 activities are included in the FA Dataset.
    • FA_d_p_i_u_j.png:
      • d represents the date on which the data were collected.
      • p represents the environment in which the data were collected.
      • i represents the activity type index.
      • u represents the user ID.
      • j represents the sample index.
    • Example:
      • FA_20220101_lab_1_2_3 represents the 3rd data sample of user 2 performing activity 1, collected in the lab on 2022-01-01.

    EA Dataset

    • 2 participants are included in the EA Dataset.
    • 5 activities are included in the EA Dataset.
    • EA_d_p_i_u_j.png (a minimal sketch for parsing this naming convention follows this list):
      • d represents the date on which the data were collected.
      • p represents the environment in which the data were collected.
      • i represents the activity type index.
      • u represents the user ID.
      • j represents the sample index.
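
    The FA and EA filenames share one pattern, so they can be parsed with a single regular expression. A minimal Python sketch follows; the eight-digit date format is inferred from the single example above, so treat it as an assumption:

    import re

    # FA_/EA_ naming convention, e.g. "FA_20220101_lab_1_2_3.png".
    PATTERN = re.compile(
        r"^(?P<dataset>FA|EA)_(?P<date>\d{8})_(?P<environment>[A-Za-z]+)_"
        r"(?P<activity>\d+)_(?P<user>\d+)_(?P<sample>\d+)(?:\.png)?$"
    )

    def parse_name(filename: str) -> dict:
        """Split a heatmap filename into its metadata fields."""
        match = PATTERN.match(filename)
        if match is None:
            raise ValueError("unexpected filename: " + filename)
        return match.groupdict()

    print(parse_name("FA_20220101_lab_1_2_3.png"))
    # {'dataset': 'FA', 'date': '20220101', 'environment': 'lab',
    #  'activity': '1', 'user': '2', 'sample': '3'}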

    Section 3: Experimental Setup

    FA Dataset

    • We place the mmWave device on a table with a height of 60 cm.
    • The participants are asked to perform fitness activities in front of the mmWave device at a distance of 2 m.
    • The data are collected in a lab with a size of 5.0 m × 3.0 m.

    EA Dataset

    • We place the mmWave device on a table with a height of 60 cm.
    • The participants are asked to eat with different utensils (i.e., fork, fork&knife, spoon, chopsticks, bare hand) in front of the mmWave device at a distance of 1 m.
    • The data are collected in a lab with a size of 5.0 m × 3.0 m.

    Section 4: Data Description

    • We develop a spatial-temporal heatmap that integrates multiple activity features, including the range of movement, velocity, and time duration of each activity repetition.

    • We first derive the Doppler-range map of the user's activity by calculating the Range-FFT and Doppler-FFT. Then, we generate the spatial-temporal heatmap by accumulating the velocity at every distance in every Doppler-range map. Next, we normalize the derived velocity information and present the velocity-distance relationship in the time dimension. In this way, we transfer the original instantaneous velocity-distance relationship into a more comprehensive spatial-temporal heatmap that describes the process of a whole activity (a sketch of this pipeline appears after this list).

    • As shown in the attached figure, in each spatial-temporal heatmap the horizontal axis represents the time duration of an activity repetition, while the vertical axis represents the range of movement. The velocity is represented by color.

    • We create 2 folders to store the two datasets respectively. In the FA folder, there are 14 subfolders, each containing repetitions of the same fitness activity. In the EA folder, there are 5 subfolders, each containing repetitions with a different utensil.
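
    The following numpy sketch illustrates the heatmap pipeline described above under assumptions of our own (the radar cube layout and a power-weighted accumulation of per-bin velocities); it is not the authors' exact implementation:

    import numpy as np

    def spatial_temporal_heatmap(cube, v_max=2.0):
        """cube: complex radar data with shape (n_frames, n_chirps, n_samples)."""
        n_frames, n_chirps, n_samples = cube.shape
        # Radial velocity associated with each (shifted) Doppler bin.
        velocities = np.fft.fftshift(np.fft.fftfreq(n_chirps)) * 2.0 * v_max

        heatmap = np.zeros((n_samples, n_frames))
        for t in range(n_frames):
            rng = np.fft.fft(cube[t], axis=1)                  # Range-FFT (fast time)
            dop = np.fft.fftshift(np.fft.fft(rng, axis=0), 0)  # Doppler-FFT (slow time)
            power = np.abs(dop) ** 2                           # (n_chirps, n_samples)
            # Accumulate velocity over all Doppler bins at every range bin.
            heatmap[:, t] = np.abs(velocities) @ power
        return heatmap / heatmap.max()                         # normalize to [0, 1]

    # Vertical axis: range of movement; horizontal axis: time; color: velocity.
    demo = spatial_temporal_heatmap(
        np.random.randn(50, 64, 128) + 1j * np.random.randn(50, 64, 128))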
    14 common daily activities and their corresponding folders:

    • FA1: Crunches
    • FA2: Elbow plank and reach
    • FA3: Leg raise
    • FA4: Lunges
    • FA5: Mountain climber
    • FA6: Punches
    • FA7: Push ups
    • FA8: Squats
    • FA9: Burpees
    • FA10: Chest squeezes
    • FA11: High knees
    • FA12: Side leg raise
    • FA13: Side to side chops
    • FA14: Turning kicks

    5 eating activities and their corresponding folders:

    • EA1: Eating with chopsticks
    • EA2: Eating with fork
    • EA3: Eating with bare hand
    • EA4: Eating with fork&knife
    • EA5: Eating with spoon

    Section 5: Raw Data and Data Processing Algorithms

    • We also provide the mmWave raw data (.mat format), stored in the same folders corresponding to the heatmap datasets. Each .mat file stores one set of activity repetitions (e.g., 4 repetitions) from the same user.
      • For example: EA_d_p_i_u_j.mat:
        • d represents the date on which the data were collected.
        • p represents the environment in which the data were collected.
        • i represents the activity type index.
        • u represents the user ID.
        • j represents the set index.
    • We plan to provide the data processing algorithms (heatmap_generation.py) to load the mmWave raw data and generate the corresponding heatmap data; a minimal loading sketch follows.
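
    A minimal sketch for loading the raw .mat files with scipy. The variable names inside the files are unknown to us and treated as assumptions; heatmap_generation.py is the authors' planned script and is not reproduced here:

    from pathlib import Path
    from scipy.io import loadmat

    for mat_path in sorted(Path("EA").rglob("*.mat")):
        contents = loadmat(str(mat_path))
        # Skip MATLAB bookkeeping keys and take the first data array.
        data_keys = [k for k in contents if not k.startswith("__")]
        raw = contents[data_keys[0]]
        print(mat_path.name, raw.shape)
        # `raw` could then feed a heatmap routine such as the sketch in Section 4.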

    Section 6: Citations

    If your paper is related to our work, please cite our papers as follows.

    https://ieeexplore.ieee.org/document/9868878/

    Xie, Yucheng, Ruizhe Jiang, Xiaonan Guo, Yan Wang, Jerry Cheng, and Yingying Chen. "mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave." In 2022 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2022.

    Bibtex:

    @inproceedings{xie2022mmfit,
      title={mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave},
      author={Xie, Yucheng and Jiang, Ruizhe and Guo, Xiaonan and Wang, Yan and Cheng, Jerry and Chen, Yingying},
      booktitle={2022 International Conference on Computer Communications and Networks (ICCCN)},
      pages={1--10},
      year={2022},
      organization={IEEE}
    }

    https://www.sciencedirect.com/science/article/abs/pii/S2352648321000532

    Xie, Yucheng, Ruizhe Jiang, Xiaonan Guo, Yan Wang, Jerry Cheng, and Yingying Chen. "mmEat: Millimeter wave-enabled environment-invariant eating behavior monitoring." Smart Health 23 (2022): 100236.

    Bibtex:

    @article{xie2022mmeat,
      title={mmEat: Millimeter wave-enabled environment-invariant eating behavior monitoring},
      author={Xie, Yucheng and Jiang, Ruizhe and Guo, Xiaonan and Wang, Yan and Cheng, Jerry and Chen, Yingying},
      journal={Smart Health},
      volume={23},
      pages={100236},
      year={2022},
      publisher={Elsevier}
    }

  19. Global tidal variables

    • researchdata.se
    • data.europa.eu
    Updated Jun 11, 2019
    + more versions
    Cite
    Matthias Obst (2019). Global tidal variables [Dataset]. http://doi.org/10.5879/c49r-x993
    Explore at:
    Available download formats (120227407)
    Dataset updated
    Jun 11, 2019
    Dataset provided by
    University of Gothenburg
    Authors
    Matthias Obst
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains global tidal variables in the form of GeoTIFF raster layers generated by Vestbo et al. (2018). The raster layers were generated using the Finite Element Solution oceanographic model (FES2012), provided by Noveltis, Legos, and CLS Space Oceanography Division and distributed by AVISO+ (http://www.aviso.altimetry.fr). FES2012 includes 32 tidal constituents in total, distributed on 1/16° grids (amplitude and phase), corresponding to 3.75 arc-minutes.

    The dataset contains the following five raster layers, plus the algorithm for calling the FES program (written in C); a minimal reading sketch follows this list:

    • Annual average cycle amplitude in cm
    • Maximum annual cycle amplitude in cm
    • Annual standard deviation of cycle amplitude in cm
    • Annual average duration of tidal cycles in hours
    • Annual number of cycles
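
    A minimal sketch for inspecting one of the GeoTIFF layers with rasterio; the file name below is hypothetical, so substitute the layer names from the actual download:

    import rasterio

    with rasterio.open("annual_average_cycle_amplitude.tif") as src:
        amplitude_cm = src.read(1)    # first band, values in cm
        print(src.crs, src.res)       # CRS and pixel size (~1/16 degree)
        print("max amplitude:", float(amplitude_cm.max()), "cm")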

    A detailed description of the data generation procedure is provided in the original paper (Vestbo et al. 2018).

    Reference: Vestbo S, Obst M, Quevedo-Fernandez F, Intanai I, Funch P (2018). Present and Potential Future Distributions of Asian Horseshoe Crabs Determine Areas for Conservation. Frontiers in Marine Science. doi: 10.3389/fmars.2018.00164 https://www.frontiersin.org/articles/10.3389/fmars.2018.00164/abstract


  20. Data from: GALILEO VENUS RANGE FIX RAW DATA V1.0

    • datasets.ai
    • s.cnmilf.com
    • +1more
    Updated Jan 31, 2023
    + more versions
    Cite
    National Aeronautics and Space Administration (2023). GALILEO VENUS RANGE FIX RAW DATA V1.0 [Dataset]. https://datasets.ai/datasets/galileo-venus-range-fix-raw-data-v1-0-0943a
    Explore at:
    Dataset updated
    Jan 31, 2023
    Dataset authored and provided by
    National Aeronautics and Space Administration
    Description

    Raw radio tracking data used to determine the precise distance to Venus (and improve knowledge of the Astronomical Unit) from the Galileo flyby on 10 February 1990.

