54 datasets found

INSPIRE Priority Data Set (Compliant) - Species range
inspire-geoportal.ec.europa.eu
inspire-geoportal.lt
Updated Aug 26, 2020
Cite
Construction Sector Development Agency (2020). INSPIRE Priority Data Set (Compliant) - Species range [Dataset]. https://inspire-geoportal.ec.europa.eu/srv/api/records/bfcc7a93-dd66-453b-b7f5-9fc4a868e69f
Explore at:
Available download formats: www:download-1.0-http--download, www:link-1.0-http--link, ogc:wms-1.3.0-http-get-map
Dataset updated
Aug 26, 2020
Dataset provided by
State Service for Protected Areas under the Ministry of Environment
Construction Sector Development Agency
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
Area covered

Description
INSPIRE Priority Data Set (Compliant) - Species range
Path loss at 5G high frequency range in South Asia
kaggle.com
Updated Apr 25, 2023
Cite
S M MEHEDI ZAMAN (2023). Path loss at 5G high frequency range in South Asia [Dataset]. https://www.kaggle.com/datasets/smmehedizaman/path-loss-at-5g-high-frequency-range-in-south-asia
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 25, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
S M MEHEDI ZAMAN
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Area covered
Asia, South Asia
Description
This dataset has been generated using NYUSIM 3.0 mm-Wave channel simulator software, which takes into account atmospheric data such as rain rate, humidity, barometric pressure, and temperature. The input data was collected over the course of a year in South Asia. As a result, the dataset provides an accurate representation of the seasonal variations in mm-wave channel characteristics in these areas. The dataset includes a total of 2835 records, each of which contains T-R Separation Distance (m), Time Delay (ns), Received Power (dBm), Phase (rad), Azimuth AoD (degree), Elevation AoD (degree), Azimuth AoA (degree), Elevation AoA (degree), RMS Delay Spread (ns), Season, Frequency, and Path Loss (dB). Four main seasons have been considered in this dataset: Spring, Summer, Fall, and Winter. Each season is subdivided into three parts (i.e., low, medium, and high) to accurately include the atmospheric variations in a season. To simulate the path loss, realistic Tx and Rx heights, an NLoS environment, and mean human blockage attenuation effects have been taken into consideration. The data has been preprocessed and normalized to ensure consistency and ease of use. Researchers in the field of mm-wave communications and networking can use this dataset to study the impact of atmospheric conditions on mm-wave channel characteristics and develop more accurate models for predicting channel behavior. The dataset can also be used to evaluate the performance of different communication protocols and signal processing techniques under varying weather conditions. Note that while the data was collected specifically in the South Asia region, the high correlation between the weather patterns in this region and other areas means that the dataset may also be applicable to other regions with similar atmospheric conditions.

Acknowledgements

The paper in which the dataset was proposed is available at: https://ieeexplore.ieee.org/abstract/document/10307972

Citation

If you use this dataset, please cite the following paper:

Rashed Hasan Ratul, S. M. Mehedi Zaman, Hasib Arman Chowdhury, Md. Zayed Hassan Sagor, Mohammad Tawhid Kawser, and Mirza Muntasir Nishat, “Atmospheric Influence on the Path Loss at High Frequencies for Deployment of 5G Cellular Communication Networks,” 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2023, pp. 1–6. https://doi.org/10.1109/ICCCNT56998.2023.10307972

BibTeX

```bibtex
@inproceedings{Ratul2023Atmospheric,
  author    = {Ratul, Rashed Hasan and Zaman, S. M. Mehedi and Chowdhury, Hasib Arman and Sagor, Md. Zayed Hassan and Kawser, Mohammad Tawhid and Nishat, Mirza Muntasir},
  title     = {Atmospheric Influence on the Path Loss at High Frequencies for Deployment of {5G} Cellular Communication Networks},
  booktitle = {2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT)},
  year      = {2023},
  pages     = {1--6},
  doi       = {10.1109/ICCCNT56998.2023.10307972},
  keywords  = {Wireless communication; Fluctuations; Rain; 5G mobile communication; Atmospheric modeling; Simulation; Predictive models; 5G-NR; mm-wave propagation; path loss; atmospheric influence; NYUSIM; ML}
}
```
Fused Image dataset for convolutional neural Network-based crack Detection...
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 20, 2023
Cite
Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. http://doi.org/10.5281/zenodo.6383044
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6383044
Dataset updated
Apr 20, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Shanglian Zhou; Carlos Canchila; Wei Song
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.

If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

In addition, an image dataset for crack classification has also been published at [6].

References:

[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

[5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

Credit Card Eligibility Data: Determining Factors

kaggle.com

zip

Updated May 18, 2024

Cite

Rohit Sharma (2024). Credit Card Eligibility Data: Determining Factors [Dataset]. https://www.kaggle.com/datasets/rohit265/credit-card-eligibility-data-determining-factors

Explore at:

Available download formats: zip (303227 bytes)

Dataset updated

May 18, 2024

Authors

Rohit Sharma

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Description of the Credit Card Eligibility Data: Determining Factors

The Credit Card Eligibility Dataset: Determining Factors is a comprehensive collection of variables aimed at understanding the factors that influence an individual's eligibility for a credit card. This dataset encompasses a wide range of demographic, financial, and personal attributes that are commonly considered by financial institutions when assessing an individual's suitability for credit.

Each row in the dataset represents a unique individual, identified by a unique ID, with associated attributes ranging from basic demographic information such as gender and age, to financial indicators like total income and employment status. Additionally, the dataset includes variables related to familial status, housing, education, and occupation, providing a holistic view of the individual's background and circumstances.

Variable	Description
ID	An identifier for each individual (customer).
Gender	The gender of the individual.
Own_car	A binary feature indicating whether the individual owns a car.
Own_property	A binary feature indicating whether the individual owns a property.
Work_phone	A binary feature indicating whether the individual has a work phone.
Phone	A binary feature indicating whether the individual has a phone.
Email	A binary feature indicating whether the individual has provided an email address.
Unemployed	A binary feature indicating whether the individual is unemployed.
Num_children	The number of children the individual has.
Num_family	The total number of family members.
Account_length	The length of the individual's account with a bank or financial institution.
Total_income	The total income of the individual.
Age	The age of the individual.
Years_employed	The number of years the individual has been employed.
Income_type	The type of income (e.g., employed, self-employed, etc.).
Education_type	The education level of the individual.
Family_status	The family status of the individual.
Housing_type	The type of housing the individual lives in.
Occupation_type	The type of occupation the individual is engaged in.
Target	The target variable for the classification task, indicating whether the individual is eligible for a credit card or not (e.g., Yes/No, 1/0).
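As a rough illustration of how the variable table above might be used, the sketch below checks that the columns described as binary features hold only 0 or 1. The 0/1 encoding is an assumption (the table only says "binary"), and the sample rows are invented for illustration; they are not taken from the dataset.

```python
import csv
import io

# Column names copied from the variable table above.
BINARY_COLUMNS = ["Own_car", "Own_property", "Work_phone", "Phone", "Email", "Unemployed"]

# Hypothetical rows, invented for illustration only (not real dataset records).
SAMPLE_CSV = """ID,Own_car,Own_property,Work_phone,Phone,Email,Unemployed,Target
1,1,0,0,1,0,0,1
2,0,1,2,0,0,1,0
"""


def invalid_binary_fields(row):
    """Return the binary columns in `row` whose value is not '0' or '1'."""
    # Assumption: binary features are stored as the strings '0' and '1'.
    return [col for col in BINARY_COLUMNS if row.get(col) not in ("0", "1")]


def check_rows(text):
    """Map each row ID to its invalid binary columns (rows without problems are omitted)."""
    problems = {}
    for row in csv.DictReader(io.StringIO(text)):
        bad = invalid_binary_fields(row)
        if bad:
            problems[row["ID"]] = bad
    return problems


if __name__ == "__main__":
    print(check_rows(SAMPLE_CSV))
```

If the real file encodes these features differently (for example Yes/No), the accepted values in `invalid_binary_fields` would need to change accordingly.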

Researchers, analysts, and financial institutions can leverage this dataset to gain insights into the key factors influencing credit card eligibility and to develop predictive models that assist in automating the credit assessment process. By understanding the relationship between various attributes and credit card eligibility, stakeholders can make more informed decisions, improve risk assessment strategies, and enhance customer targeting and segmentation efforts.

This dataset is valuable for a wide range of applications within the financial industry, including credit risk management, customer relationship management, and marketing analytics. Furthermore, it provides a valuable resource for academic research and educational purposes, enabling students and researchers to explore the intricate dynamics of credit card eligibility determination.

South Range, MI Population Breakdown by Gender
neilsberg.com
csv, json
Updated Sep 14, 2023
Cite
Neilsberg Research (2023). South Range, MI Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/658fcb29-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
Available download formats: json, csv
Dataset updated
Sep 14, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Michigan, South Range
Variables measured
Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of South Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of South Range across both sexes and to determine which sex constitutes the majority.

Key observations

There is a slight majority of male population, with 50.54% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Scope of gender :

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

Variables / Data Columns

Gender: This column displays the Gender (Male / Female)

Population: The population of each gender in South Range is shown in this column.

% of Total Population: This column displays the percentage distribution of each gender as a proportion of the South Range total population. Please note that the sum of all percentages may not equal 100% due to rounding of values.
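The relationship between the Population and % of Total Population columns can be sketched as follows. This is a minimal illustration, assuming the percentage is simply each count divided by the sum of the two counts and rounded to two decimals; the function name and the counts in the usage line are hypothetical.

```python
def gender_shares(male, female):
    """Return each gender's share of the total population, in percent."""
    total = male + female
    if total == 0:
        raise ValueError("total population must be greater than zero")
    # Assumption: shares are rounded to two decimal places for display.
    return {
        "Male": round(100 * male / total, 2),
        "Female": round(100 * female / total, 2),
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the dataset.
    print(gender_shares(1, 3))
```

Because each share is rounded independently, the displayed values are approximations of the underlying ratios.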

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for South Range Population by Gender.
Dataset for "High-throughput phenotyping to characterise range use behaviour...
entrepot.recherche.data.gouv.fr
bin +4
Updated Jan 31, 2024
Cite
Julie Collet; Claire Bonnefous; Karine Germain; Laure Ravon; Ludovic Calandreau; Vanessa Guesdon; Anne Collin; Elisabeth Le Bihan-Duval; Sandrine Mignon-Grasteau (2024). Dataset for "High-throughput phenotyping to characterise range use behaviour in broiler chickens" [Dataset]. http://doi.org/10.57745/JUDHTG
Explore at:
Available download formats: tsv(13468), bin(7829), bin(7706), txt(1910), tsv(5600), text/comma-separated-values(1374092123), tsv(12835), bin(7008), text/comma-separated-values(1057246321), text/comma-separated-values(2204116241), type/x-r-syntax(69557), tsv(44362)
Unique identifier
https://doi.org/10.57745/JUDHTG
Dataset updated
Jan 31, 2024
Dataset provided by
Recherche Data Gouv
Authors
Julie Collet; Claire Bonnefous; Karine Germain; Laure Ravon; Ludovic Calandreau; Vanessa Guesdon; Anne Collin; Elisabeth Le Bihan-Duval; Sandrine Mignon-Grasteau
License
https://spdx.org/licenses/etalab-2.0.html
Time period covered
Mar 31, 2021 - Dec 23, 2021
Dataset funded by
European Commission
Description
A key characteristic of free-range chicken farming is to enable chickens to spend time outdoors. However, each chicken may use the available areas for roaming in variable ways. To check if, and how, broilers use their outdoor range at an individual level, we need to reliably characterise range use behaviour. Traditional methods relying on visual scans require significant time investment and only provide discontinuous information. Passive RFID (Radio Frequency Identification) systems enable tracking individually tagged chickens when they go through pop-holes; hence they only provide partial information on the movements of individual chickens. Here, we describe a new method to measure chickens’ range use and test its reliability on three ranges each containing a different breed. We used an active RFID system to localise chickens in their barn, or in one of nine zones of their range, every 30 seconds and assessed range-use behaviour in 600 chickens belonging to three breeds of slow- or medium-growing broilers used for outdoor production (all < 40 g daily weight gain). From those real-time locations, we determined five measures to describe daily range use: time spent in the barn, number of outdoor accesses, number of zones visited in a day, gregariousness (an index that increases when birds spend time in zones where other birds are), and number of zone changes. Principal Component Analyses (PCAs) were performed on those measures, in each production system, to create two synthetic indicators of chickens’ range use behaviour. Our dataset includes the files needed to calibrate the system (supplementary materials), the data files used in the publication and the associated codes.
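To make the daily measures more concrete, here is a minimal sketch of how four of the five could be derived from one bird's sequence of 30-second location fixes. It is not the authors' code: the input encoding ("barn" or a zone number), the function name, and the exact definitions of an outdoor access (a barn-to-zone transition) and a zone change (a move between two different outdoor zones) are assumptions. Gregariousness is omitted because it needs the positions of the other birds and its formula is not given here.

```python
SAMPLE_INTERVAL_S = 30  # the description states one location fix every 30 seconds


def daily_range_use(locations, barn="barn"):
    """Summarise one bird's day from an ordered list of location fixes.

    `locations` holds either the barn label or an outdoor zone identifier
    (hypothetical encoding chosen for this sketch).
    """
    time_in_barn_s = sum(1 for loc in locations if loc == barn) * SAMPLE_INTERVAL_S
    outdoor_accesses = 0
    zone_changes = 0
    for prev, curr in zip(locations, locations[1:]):
        if prev == barn and curr != barn:
            # Assumed definition: leaving the barn for any zone is one access.
            outdoor_accesses += 1
        elif prev != barn and curr != barn and prev != curr:
            # Assumed definition: moving between two different outdoor zones.
            zone_changes += 1
    zones_visited = len({loc for loc in locations if loc != barn})
    return {
        "time_in_barn_s": time_in_barn_s,
        "outdoor_accesses": outdoor_accesses,
        "zones_visited": zones_visited,
        "zone_changes": zone_changes,
    }


if __name__ == "__main__":
    # Hypothetical sequence of fixes for illustration.
    print(daily_range_use(["barn", "barn", 1, 1, 2, "barn", 3, 3]))
```

A bird that starts the day outdoors would not register an access under this definition, which is one reason the real pipeline may count accesses differently.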
ANN development + final testing datasets
data.niaid.nih.gov
resodate.org
Updated Jan 24, 2020
Cite
Authors (2020). ANN development + final testing datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1445865
Explore at:
Dataset updated
Jan 24, 2020
Authors
Authors
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
File name definitions:

'...v_50_175_250_300...' - dataset for velocity ranges [50, 175] + [250, 300] m/s

'...v_175_250...' - dataset for velocity range [175, 250] m/s

'ANNdevelop...' - used to perform 9 parametric sub-analyses where, in each one, many ANNs are developed (trained, validated and tested) and the one yielding the best results is selected

'ANNtest...' - used to test the best ANN from each aforementioned parametric sub-analysis, aiming to find the best ANN model; this dataset includes the 'ANNdevelop...' counterpart
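The velocity tags in the file names above can be read as interval lists. The sketch below, under the assumption that all interval bounds are inclusive, returns every tag whose range contains a given velocity; the dictionary and function names are hypothetical. Since 175 and 250 m/s appear as bounds of both ranges, those values match both tags rather than being assigned arbitrarily to one.

```python
# Velocity intervals in m/s, keyed by the tag that appears in the file names.
VELOCITY_RANGES = {
    "v_50_175_250_300": [(50, 175), (250, 300)],
    "v_175_250": [(175, 250)],
}


def matching_tags(velocity):
    """Return the file-name tags whose velocity range contains `velocity`."""
    # Assumption: bounds are inclusive, as suggested by the bracket notation.
    return [
        tag
        for tag, intervals in VELOCITY_RANGES.items()
        if any(low <= velocity <= high for low, high in intervals)
    ]


if __name__ == "__main__":
    for v in (100, 200, 175, 400):
        print(v, matching_tags(v))
```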

Where to find the input (independent) and target (dependent) variable values for each dataset/Excel file?

input values in 'IN' sheet

target values in 'TARGET' sheet

Where to find the results from the best ANN model (for each target/output variable and each velocity range)?

open the corresponding excel file and the expected (target) vs ANN (output) results are written in 'TARGET vs OUTPUT' sheet

Check reference below (to be added when the paper is published)

https://www.researchgate.net/publication/328849817_11_Neural_Networks_-_Max_Disp_-_Railway_Beams
Simulation Data Set
catalog.data.gov
s.cnmilf.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)

• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code

Abstract

We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description

“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)

Required R packages:

• For running “CWVS_LMC.txt”:
  • msm: Sampling from the truncated normal distribution
  • mnormt: Sampling from the multivariate normal distribution
  • BayesLogit: Sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
  • plotrix: Plotting the posterior means and credible intervals

Instructions for Use

Reproducibility (Mandatory)

What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

How to use the information:

• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data

The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability

Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
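The weekly standardization described for this dataset (subtract the week's median exposure, then divide by the week's IQR) can be sketched in a few lines. This is only an illustration in Python, not the authors' R code: the function name is hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since the description does not say how the IQR was computed.

```python
import statistics


def standardize_week(exposures):
    """Standardize one week's exposures as (x - median) / IQR."""
    median = statistics.median(exposures)
    # Assumption: quartiles use the 'inclusive' interpolation method.
    q1, _, q3 = statistics.quantiles(exposures, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; exposures cannot be standardized")
    return [(x - median) / iqr for x in exposures]


if __name__ == "__main__":
    # Hypothetical exposure values for a single week.
    print(standardize_week([1, 2, 3, 4, 5]))
```

Because the released data omit the medians and IQRs, this transformation cannot be inverted from the published values alone, which is the privacy property the description points to.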
Grass Range, MT Population Breakdown by Gender
neilsberg.com
csv, json
Updated Sep 14, 2023
Cite
Neilsberg Research (2023). Grass Range, MT Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/649529eb-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
Available download formats: json, csv
Dataset updated
Sep 14, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Montana, Grass Range
Variables measured
Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Grass Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Grass Range across both sexes and to determine which sex constitutes the majority.

Key observations

There is a slight majority of female population, with 52.63% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Scope of gender :

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

Variables / Data Columns

Gender: This column displays the Gender (Male / Female)

Population: The population of each gender in Grass Range is shown in this column.

% of Total Population: This column displays the percentage distribution of each gender as a proportion of the Grass Range total population. Please note that the sum of all percentages may not equal 100% due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Grass Range Population by Gender.
Data from: FISBe: A real-world benchmark dataset for instance segmentation...
zenodo.org
data.niaid.nih.gov
bin, json +3
Updated Apr 2, 2024
Cite
Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. http://doi.org/10.5281/zenodo.10875063
Explore at:
Available download formats: zip, text/x-python, bin, json, txt
Unique identifier
https://doi.org/10.5281/zenodo.10875063
Dataset updated
Apr 2, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Feb 26, 2024
Description
General

For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

Summary

A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains

30 completely labeled (segmented) images

71 partly labeled images

altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30-60 min on average, yet a difficult one can take up to 4 hours)

To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects

A set of metrics and a novel ranking score for respective meaningful method benchmarking

An evaluation of three baseline methods in terms of the above metrics and score

Abstract

Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

Dataset documentation:

We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

>> FISBe Datasheet

Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

Files

fisbe_v1.0_{completely,partly}.zip

contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.

fisbe_v1.0_mips.zip

maximum intensity projections of all samples, for convenience.

sample_list_per_split.txt

a simple list of all samples and the subset they are in, for convenience.

view_data.py

a simple python script to visualize samples, see below for more information on how to use it.

dim_neurons_val_and_test_sets.json

a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.

Readme.md

general information

How to work with the image files

Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
For each image, we provide a pixel-wise instance segmentation for all separable neurons.
Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
The segmentation mask for each neuron is stored in a separate channel.
The order of dimensions is CZYX.
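
As a minimal sketch of what the CZYX layout means in practice, the snippet below collapses a per-neuron mask array (one channel per neuron) into a single ZYX label volume using NumPy. The function name `masks_to_label_volume` is hypothetical, and the sketch assumes that nonzero voxels in a channel belong to that neuron; the actual encoding of "gt_instances" may differ, so please check it against a real sample.

```python
import numpy as np


def masks_to_label_volume(gt_instances: np.ndarray) -> np.ndarray:
    """Collapse per-neuron channels (C, Z, Y, X) into one (Z, Y, X) label volume.

    Assumption: a nonzero voxel in channel k belongs to neuron k.
    Label 0 is background; label k + 1 marks the first channel found at a voxel.
    Overlaps between neurons are lost in this simplified representation.
    """
    labels = np.zeros(gt_instances.shape[1:], dtype=np.uint16)
    for channel_idx, mask in enumerate(gt_instances):
        # Only write where no earlier channel has already claimed the voxel.
        free = (mask > 0) & (labels == 0)
        labels[free] = channel_idx + 1
    return labels


# Tiny synthetic example: 2 neurons in a 1 x 2 x 3 (Z, Y, X) volume.
demo = np.zeros((2, 1, 2, 3), dtype=np.uint8)
demo[0, 0, 0, :2] = 1   # neuron 0 covers x = 0..1 in the first row
demo[1, 0, 0, 1:] = 1   # neuron 1 covers x = 1..2 in the first row (overlap at x = 1)
demo_labels = masks_to_label_volume(demo)
```

Keeping one channel per neuron, as the dataset does, preserves overlapping neurons; the collapsed volume is only a convenience for tools that expect a single label image.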

We recommend working in a virtual environment, e.g., by using conda:

conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env

How to open zarr files

Install the python zarr package:
pip install zarr

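Open a zarr file with:

import zarr
# path_to_zarr is a placeholder for the location of one sample's zarr file;
# the array paths are the ones used in the viewer script below
raw = zarr.open(path_to_zarr, mode='r', path="volumes/raw")
seg = zarr.open(path_to_zarr, mode='r', path="volumes/gt_instances")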

# optional:
import numpy as np
raw_np = np.array(raw)

Zarr arrays are read lazily on-demand.
Many functions that expect numpy arrays also work with zarr arrays.
Optionally, the arrays can also explicitly be converted to numpy arrays.

How to view zarr image files

We recommend using napari to view the image data.

Install napari:
pip install "napari[all]"

Save the following Python script:

import zarr, sys, napari

# Load the raw image and the per-neuron masks from the zarr file given on the command line
raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

viewer = napari.Viewer(ndisplay=3)
# Add one labels layer per neuron instance (first axis of gt_instances)
for idx, gt in enumerate(gts):
    viewer.add_labels(
        gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
# Show the three raw color channels as additive red, green and blue layers
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()

Execute:
python view_data.py <path_to_zarr>

Metrics

S: Average of avF1 and C

avF1: Average F1 Score

C: Average ground truth coverage

clDice_TP: Average true positives clDice

FS: Number of false splits

FM: Number of false merges

tp: Relative number of true positives

For more information on our selected metrics and formal definitions please see our paper.
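
As a rough illustration of how the ranking score combines its two components, the sketch below averages avF1 and C as listed above, and includes the standard F1 formula for reference. The function names are hypothetical, and how avF1 and C are themselves computed (matching, thresholds, averaging) is defined in the paper and not reproduced here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ranking_score(avf1: float, coverage: float) -> float:
    """S as listed above: the plain average of avF1 and C."""
    return 0.5 * (avf1 + coverage)


# Illustrative values only, not results from the paper.
example_f1 = f1_score(tp=8, fp=2, fn=2)
example_s = ranking_score(avf1=0.6, coverage=0.4)
```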

Baseline

To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al.
For detailed information on the methods and the quantitative results please see our paper.

License

The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Citation

If you use FISBe in your research, please use the following BibTeX entry:

@misc{mais2024fisbe, title = {FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures}, author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller}, year = 2024, eprint = {2404.00130}, archivePrefix ={arXiv}, primaryClass = {cs.CV} }

Acknowledgments

We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable
discussions.
P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
This work was co-funded by Helmholtz Imaging.

Changelog

There have been no changes to the dataset so far.
All future changes will be listed on the changelog page.

Contributing

If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

All contributions are welcome!
Grass Range, MT annual income distribution by work experience and gender...
neilsberg.com
csv, json
Updated Feb 27, 2025
+ more versions
Cite
Neilsberg Research (2025). Grass Range, MT annual income distribution by work experience and gender dataset: Number of individuals ages 15+ with income, 2023 // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/grass-range-mt-income-by-gender/
Explore at:
Available download formats: json, csv
Dataset updated
Feb 27, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Montana, Grass Range
Variables measured
Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time, Number of males working full time for a given income bracket, Number of males working part time for a given income bracket, Number of females working full time for a given income bracket, Number of females working part time for a given income bracket
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the number of individuals for both the genders (Male and Female), within each income bracket we conducted an initial analysis and categorization of the American Community Survey data. Households are categorized, and median incomes are reported based on the self-identified gender of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorizing them by gender (men and women) and employment type, full-time (FT) and part-time (PT), offering valuable insights into the diverse income landscapes within Grass Range. The dataset can be utilized to gain insights into gender-based income distribution within the Grass Range population, aiding in data analysis and decision-making.

Key observations

Employment patterns: Within Grass Range, among individuals aged 15 years and older with income, there were 26 men and 37 women in the workforce. Among them, 8 men were engaged in full-time, year-round employment, while 5 women were in full-time, year-round roles.

Annual income under $24,999: Of the male population working full-time, 12.50% fell within the income range of under $24,999, while none of the female population working full-time was represented in the same income bracket.

Annual income above $100,000: 37.50% of men in full-time roles earned incomes exceeding $100,000, while no women in full-time positions earned within this income bracket.

Refer to the research insights for more key observations on more income brackets (Annual income under $24,999, Annual income between $25,000 and $49,999, Annual income between $50,000 and $74,999, Annual income between $75,000 and $99,999 and Annual income above $100,000) and employment types (full-time year-round and part-time).

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Income brackets:

$1 to $2,499 or loss

$2,500 to $4,999

$5,000 to $7,499

$7,500 to $9,999

$10,000 to $12,499

$12,500 to $14,999

$15,000 to $17,499

$17,500 to $19,999

$20,000 to $22,499

$22,500 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $54,999

$55,000 to $64,999

$65,000 to $74,999

$75,000 to $99,999

$100,000 or more

Variables / Data Columns

Income Bracket: This column showcases 20 income brackets ranging from $1 to $100,000+.

Full-Time Males: The count of males employed full-time year-round and earning within a specified income bracket

Part-Time Males: The count of males employed part-time and earning within a specified income bracket

Full-Time Females: The count of females employed full-time year-round and earning within a specified income bracket

Part-Time Females: The count of females employed part-time and earning within a specified income bracket

Employment type classifications include:

Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.

Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Grass Range median household income by race.
Data from: Contrasting effects of host or local specialization: widespread...
data.niaid.nih.gov
ourarchive.otago.ac.nz
+3more
zip
Updated Mar 13, 2024
+ more versions
Cite
Daniela de Angeli Dutra; Gabriel Moreira Félix; Robert Poulin (2024). Contrasting effects of host or local specialization: widespread haemosporidians are host generalist whereas local specialists are locally abundant [Dataset]. http://doi.org/10.5061/dryad.j3tx95xfb
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.j3tx95xfb
Dataset updated
Mar 13, 2024
Dataset provided by
University of Otago
Universidade Estadual de Campinas (UNICAMP)
Authors
Daniela de Angeli Dutra; Gabriel Moreira Félix; Robert Poulin
License
https://spdx.org/licenses/CC0-1.0.html
Description
Aim: Despite the wide distribution of many parasites around the globe, the range of individual species varies significantly even among phylogenetically related taxa. Since parasites need suitable hosts to complete their development, parasite geographical and environmental ranges should be limited to communities where their hosts are found. Parasites may also suffer from a trade-off between being locally abundant or widely dispersed. We hypothesize that the geographical and environmental ranges of parasites are negatively associated with their host specificity and their local abundance.

Location: Worldwide

Time period: 2009 to 2021

Major taxa studied: Avian haemosporidian parasites

Methods: We tested these hypotheses using a global database which comprises data on avian haemosporidian parasites from across the world. For each parasite lineage, we computed five metrics: phylogenetic host-range, environmental range, geographical range, and their mean local and total number of observations in the database. Phylogenetic generalized least squares models were run to evaluate the influence of phylogenetic host-range and total and local abundances on geographical and environmental range. In addition, we analysed separately the two regions with the largest amount of available data: Europe and South America.

Results: We evaluated 401 lineages from 757 localities and observed that generalism (i.e. phylogenetic host range) is positively associated with both the parasites’ geographical and environmental ranges at global and European scales. For South America, generalism is associated only with geographical range. Finally, mean local abundance (mean local number of parasite occurrences) was negatively related to geographical and environmental range. This pattern was detected worldwide and in South America, but not in Europe.

Main Conclusions: We demonstrate that parasite specificity is linked to both their geographical and environmental ranges.
The fact that locally abundant parasites present restricted ranges indicates a trade-off between these two traits. This trade-off, however, only becomes evident when sufficiently heterogeneous host communities are considered.

Methods

We compiled data on haemosporidian lineages from the MalAvi database (http://130.235.244.92/Malavi/, Bensch et al. 2009) including all the data available from the “Grand Lineage Summary” representing Plasmodium and Haemoproteus genera from wild birds and that contained information regarding location. After checking for duplicated sequences, this dataset comprised a total of ~6200 sequenced parasites representing 1602 distinct lineages (775 Plasmodium and 827 Haemoproteus) collected from 1139 different host species and 757 localities from all continents except Antarctica (Supplementary figure 1, Supplementary Table 1). The parasite lineages deposited in MalAvi are based on a cyt b fragment of 478 bp. This dataset was used to calculate the parasites’ geographical, environmental and phylogenetic ranges.

Geographical range

All analyses in this study were performed using R version 4.02. In order to estimate the geographical range of each parasite lineage, we applied the R package “GeoRange” (Boyle, 2017) and chose the variable minimum spanning tree distance (i.e., shortest total distance of all lines connecting each locality where a particular lineage has been found). Using the function “create.matrix” from the “fossil” package, we created a matrix of lineages and coordinates and employed the function “GeoRange_MultiTaxa” to calculate the minimum spanning tree distance for each parasite lineage (i.e. the shortest total distance in kilometers of all lines connecting each locality). Therefore, as at least two distinct sites are necessary to calculate this distance, parasites observed in a single locality could not have their geographical range estimated.
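
To make the minimum spanning tree distance more concrete, here is a minimal Python sketch of the idea. It is not the GeoRange implementation: it assumes great-circle (haversine) distances with a mean Earth radius of 6371 km and uses Prim's algorithm, and the function names are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # assumed mean Earth radius


def mst_length_km(sites):
    """Total edge length of the minimum spanning tree over the sites (Prim's algorithm).

    Returns None when fewer than two sites are given, since no distance is defined.
    """
    if len(sites) < 2:
        return None
    # Cheapest known connection from the growing tree to every remaining site.
    best = {i: haversine_km(sites[0], sites[i]) for i in range(1, len(sites))}
    total = 0.0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for i in best:
            best[i] = min(best[i], haversine_km(sites[nxt], sites[i]))
    return total


# Three illustrative sites on the equator at longitudes 0, 1 and 3 degrees.
example_range = mst_length_km([(0.0, 0.0), (0.0, 1.0), (0.0, 3.0)])
single_site = mst_length_km([(10.0, 20.0)])
```

A lineage recorded at a single site yields None in this sketch, which mirrors the restriction stated above that such lineages have no geographical range estimate.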
For this reason, only parasites observed in two or more localities were considered in our phylogenetically controlled least squares (PGLS) models.

Host and Environmental diversity

Traditionally, ecologists use Shannon entropy to measure diversity in ecological assemblages (Pielou, 1966). The Shannon entropy of a set of elements is related to the degree of uncertainty someone would have about the identity of a randomly selected element of that set (Jost, 2006). Thus, Shannon entropy matches our intuitive notion of biodiversity, as the more diverse an assemblage is, the more uncertainty there is regarding which species a randomly selected individual belongs to. Shannon diversity increases with both the assemblage richness (e.g., the number of species) and evenness (e.g., uniformity in abundance among species). To compare the diversity of assemblages that vary in richness and evenness in a more intuitive manner, we can normalize diversities by Hill numbers (Chao et al., 2014b). The Hill number of an assemblage represents the effective number of species in the assemblage, i.e., the number of equally abundant species that are needed to give the same value of the diversity metric in that assemblage. Hill numbers can be extended to incorporate phylogenetic information. In that case, instead of species, we are measuring the effective number of phylogenetic entities in the assemblage. Here, we computed phylogenetic host-range as the phylogenetic Hill number associated with the assemblage of hosts found infected by a given parasite. Analyses were performed using the function “hill_phylo” from the “hillr” package (Chao et al., 2014a). Hill numbers are parameterized by a parameter “q” that determines the sensitivity of the metric to relative species abundance. Different “q” values produce Hill numbers associated with different diversity metrics. We set q = 1 to compute the Hill number associated with Shannon diversity.
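
The link between Shannon entropy and the Hill number of order q = 1 can be illustrated with a short Python sketch. This is the plain (non-phylogenetic) Hill number computed from abundance counts, not the phylogenetic version computed by “hill_phylo”; the function name and the example counts are hypothetical.

```python
import math


def hill_number(counts, q=1.0):
    """Effective number of species for abundance counts, for a given order q.

    For q = 1 the Hill number is the exponential of the Shannon entropy;
    for other q it is (sum of p_i ** q) ** (1 / (1 - q)).
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-12:
        shannon = -sum(pi * math.log(pi) for pi in p)
        return math.exp(shannon)
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Illustrative host assemblages (counts of infected individuals per host species).
even_hosts = hill_number([5, 5, 5, 5])      # four equally common hosts
skewed_hosts = hill_number([97, 1, 1, 1])   # one dominant host
```

With four equally abundant hosts the effective number of hosts is four, while the skewed assemblage gives a value close to one, which is the sense in which low Hill numbers indicate specialization.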
Here, low Hill numbers indicate specialization on a narrow phylogenetic range of hosts, whereas a higher Hill number indicates generalism across a broader phylogenetic spectrum of hosts. We also used Hill numbers to compute the environmental range of sites occupied by each parasite lineage. Firstly, we collected the 19 bioclimatic variables from WorldClim version 2 (http://www.worldclim.com/version2) for all sites used in this study (N = 713). Then, we standardized the 19 variables by centering and scaling them by their respective mean and standard deviation. Thereafter, we computed the pairwise Euclidean environmental distance among all sites and used this distance to compute a dissimilarity cluster. Finally, as for the phylogenetic Hill number, we used this dissimilarity cluster to compute the environmental Hill number of the assemblage of sites occupied by each parasite lineage. The environmental Hill number for each parasite can be interpreted as the effective number of environmental conditions in which a parasite lineage occurs. Thus, the higher the environmental Hill number, the more generalist the parasite is regarding the environmental conditions in which it can occur.

Parasite phylogenetic tree

A Bayesian phylogenetic reconstruction was performed. We built a tree for all parasite sequences for which we were able to estimate the parasite’s geographical, environmental and phylogenetic ranges (see above); this represented 401 distinct parasite lineages. This inference was produced using MrBayes 3.2.2 (Ronquist & Huelsenbeck, 2003) with the GTR + I + G model of nucleotide evolution, as recommended by ModelTest (Posada & Crandall, 1998), which selects the best-fit nucleotide substitution model for a set of genetic sequences. We ran four Markov chains simultaneously for a total of 7.5 million generations that were sampled every 1000 generations.
The first 1250 million trees (25%) were discarded as a burn-in step and the remaining trees were used to calculate the posterior probabilities of each estimated node in the final consensus tree. Our final tree obtained a cumulative posterior probability of 0.999. Leucocytozoon caulleryi was used as the outgroup to root the phylogenetic tree as Leucocytozoon spp. represents a basal group within avian haemosporidians (Pacheco et al., 2020).
Guns Close Range Dataset
universe.roboflow.com
zip
Updated Oct 22, 2025
Cite
Computer vision (2025). Guns Close Range Dataset [Dataset]. https://universe.roboflow.com/computer-vision-kcsdu/guns-close-range-7hqvz/dataset/1
Explore at:
Available download formats: zip
Dataset updated
Oct 22, 2025
Dataset authored and provided by
Computer vision
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Objects Objects Objects Obj 2SfO Bounding Boxes
Description
Guns Close Range

Overview

Guns Close Range is a dataset for object detection tasks; it contains Objects Objects Objects Obj 2SfO annotations for 682 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the CC BY 4.0 license.
mmWave-based Fitness Activity Recognition Dataset
zenodo.org
png, zip
Updated Jul 12, 2024
Cite
Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen (2024). mmWave-based Fitness Activity Recognition Dataset [Dataset]. http://doi.org/10.5281/zenodo.7793613
Explore at:
Available download formats: zip, png
Unique identifier
https://doi.org/10.5281/zenodo.7793613
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo
Authors
Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description:
This mmWave dataset is used for fitness activity identification. The dataset (FA Dataset) contains 14 common daily fitness activities. The data are captured by the mmWave radar TI-AWR1642. The dataset can be used by fellow researchers to reproduce the original work or to further explore other machine-learning problems in the domain of mmWave signals.
Format: .png format
Section 1: Device Configuration
The device is a commodity mmWave radar, the TI AWR1642, which integrates a 2 × 4 antenna array. Detailed information about it can be found at https://www.ti.com/product/AWR1642#:~:text=The%20AWR1642%20is%20an%20ideal,of%2076%20to%2081%20GHz.
A TI DCA1000EVM data capture card is used to collect data from the mmWave device and send the data to a laptop. Detailed information can be found at https://www.ti.com/tool/DCA1000EVM?keyMatch=DCA1000EVM.
The mmWave radar operates at frequencies in the range of 77 to 81 GHz. The sampling rate is fixed at 100 frames per second and each frame has 17 chirps.
Section 2: Data Format
We provide our mmWave data as heatmaps for this dataset. The data files are in the png format. Details are as follows:
14 activities are included in the FA Dataset.
2 participants are included in the FA Dataset.
FA_d_p_i_u_j.png:
d represents the date on which the fitness data were collected
p represents the environment in which the fitness data were collected
i represents the fitness activity type index
u represents the user id
j represents the sample index
Example:
FA_20220101_lab_1_2_3 represents the 3rd data sample of user 2 of activity 1 collected in the lab
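
The naming scheme above can be parsed with a small Python helper, sketched below. The helper name and the exact pattern are assumptions: the sketch takes the date to be eight digits and the environment to be a single token without underscores, as in the example above, and accepts an optional file extension.

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    date: str          # d: collection date, e.g. "20220101"
    environment: str   # p: collection environment, e.g. "lab"
    activity: int      # i: fitness activity type index
    user: int          # u: user id
    sample: int        # j: sample index


# Assumed pattern: FA_<8-digit date>_<environment>_<i>_<u>_<j>[.<extension>]
_NAME_PATTERN = re.compile(r"^FA_(\d{8})_([^_]+)_(\d+)_(\d+)_(\d+)(?:\.\w+)?$")


def parse_sample_name(name: str) -> SampleName:
    """Split a file name of the form FA_d_p_i_u_j into its fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected file name: {name!r}")
    d, p, i, u, j = match.groups()
    return SampleName(d, p, int(i), int(u), int(j))


parsed = parse_sample_name("FA_20220101_lab_1_2_3.png")
```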
Section 3: Experimental Setup
We place the mmWave device on a table with a height of 60 cm.
The participants are asked to perform the fitness activities in front of the mmWave device at a distance of 2 m.
The data are collected in a lab with a size of 5.0 m × 3.0 m.
Section 4: Data Description
We develop a spatial-temporal heatmap that integrates multiple activity features, including the range of movement, velocity, and time duration of each activity repetition.

We first derive the Doppler-range map of the users' activity by calculating the Range-FFT and Doppler-FFT. Then, we generate the spatial-temporal heatmap by accumulating the velocity of every distance in every Doppler-range map. Next, we normalize the derived velocity information and present the velocity-distance relationship in the time dimension. In this way, we transform the original instantaneous velocity-distance relationship into a more comprehensive spatial-temporal heatmap which describes the process of a whole activity.
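
The first step, deriving a Doppler-range map with a Range-FFT followed by a Doppler-FFT, can be sketched with NumPy as below. This is a generic illustration rather than the processing code of this dataset: it assumes one frame is a complex array of shape (chirps, samples per chirp), and the 64 samples per chirp and the synthetic target are made-up values; only the 17 chirps per frame come from the description above.

```python
import numpy as np


def range_doppler_map(frame: np.ndarray) -> np.ndarray:
    """Magnitude Doppler-range map of one frame shaped (n_chirps, n_samples).

    The FFT along the samples of each chirp (fast time) resolves range; the FFT
    across chirps (slow time) resolves Doppler. The Doppler axis is shifted so
    that zero velocity sits in the middle row.
    """
    range_fft = np.fft.fft(frame, axis=1)
    doppler_fft = np.fft.fft(range_fft, axis=0)
    return np.abs(np.fft.fftshift(doppler_fft, axes=0))


# Synthetic single-target frame: 17 chirps, 64 samples per chirp (assumed).
n_chirps, n_samples = 17, 64
chirp_idx = np.arange(n_chirps)[:, None]
sample_idx = np.arange(n_samples)[None, :]
range_bin, doppler_bin = 10, 3
frame = np.exp(2j * np.pi * (range_bin * sample_idx / n_samples
                             + doppler_bin * chirp_idx / n_chirps))

rd_map = range_doppler_map(frame)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(rd_map), rd_map.shape))
```

After the shift, Doppler bin 3 appears at row 3 + 17 // 2 = 11, so the peak of the synthetic target is found at row 11, column 10.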

As shown in the attached figure, in each spatial-temporal heatmap, the horizontal axis represents the time duration of an activity repetition while the vertical axis represents the range of movement. The velocity is represented by color.

The dataset is stored in 14 zip files whose names start with "FA"; each contains repetitions of the same fitness activity.
14 common daily activities and their corresponding files
File name: Activity type
FA1: Crunches
FA2: Elbow plank and reach
FA3: Leg raise
FA4: Lunges
FA5: Mountain climber
FA6: Punches
FA7: Push ups
FA8: Squats
FA9: Burpees
FA10: Chest squeezes
FA11: High knees
FA12: Side leg raise
FA13: Side to side chops
FA14: Turning kicks

Section 5: Raw Data and Data Processing Algorithms
We also provide the mmWave raw data (.mat format), stored in the same zip files as the corresponding heatmaps. Each .mat file stores one set of activity repetitions (e.g., 4 repetitions) from the same user.
For example: FA_d_p_i_u_j.mat:
d represents the date on which the data were collected
p represents the environment in which the data were collected
i represents the activity type index
u represents the user id
j represents the set index
We plan to provide the data processing algorithms (heatmap_generation.py) to load the mmWave raw data and generate the corresponding heatmap data.
Section 6: Citations
If your paper is related to our work, please cite our paper as follows.
https://ieeexplore.ieee.org/document/9868878/
Xie, Yucheng, Ruizhe Jiang, Xiaonan Guo, Yan Wang, Jerry Cheng, and Yingying Chen. "mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave." In 2022 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2022.
Bibtex:
@inproceedings{xie2022mmfit,
title={mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave},
author={Xie, Yucheng and Jiang, Ruizhe and Guo, Xiaonan and Wang, Yan and Cheng, Jerry and Chen, Yingying},
booktitle={2022 International Conference on Computer Communications and Networks (ICCCN)},
pages={1--10},
year={2022},
organization={IEEE}
}
Data from: GALILEO VENUS RANGE FIX RAW DATA V1.0
catalog.data.gov
datasets.ai
+1more
Updated Aug 22, 2025
+ more versions
Cite
National Aeronautics and Space Administration (2025). GALILEO VENUS RANGE FIX RAW DATA V1.0 [Dataset]. https://catalog.data.gov/dataset/galileo-venus-range-fix-raw-data-v1-0-0943a
Dataset updated
Aug 22, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Raw radio tracking data used to determine the precise distance to Venus (and improve knowledge of the Astronomical Unit) from the Galileo flyby on 10 February 1990.
South Range, MI households by income brackets: family, non-family, and...
neilsberg.com
csv, json
Updated Mar 3, 2025
Cite
Neilsberg Research (2025). South Range, MI households by income brackets: family, non-family, and total, in 2023 inflation-adjusted dollars [Dataset]. https://www.neilsberg.com/insights/south-range-mi-median-household-income/
Explore at:
Available download formats: json, csv
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Michigan, South Range
Variables measured
Income Level, All households, Family households, Non-Family households, Percent of All households, Percent of Family households, Percent of Non-Family households
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income brackets (mentioned above) following an initial analysis and categorization. The percentage of all, family and nonfamily households were collected by grouping data as applicable. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents a breakdown of households across various income brackets in South Range, MI, as reported by the U.S. Census Bureau. The Census Bureau classifies households into different categories, including total households, family households, and non-family households. Our analysis of U.S. Census Bureau American Community Survey data for South Range, MI reveals how household income distribution varies among these categories. The dataset highlights the variation in number of households with income, offering valuable insights into the distribution of South Range households based on income levels.

Key observations

For Family Households: In South Range, the majority of family households, representing 21.9%, earn $60,000 to $74,999, showcasing a substantial share of the community families falling within this income bracket. Conversely, the minority of family households, comprising 1.46%, have incomes falling $150,000 to $199,999, representing a smaller but still significant segment of the community.

For Non-Family Households: In South Range, the largest share of non-family households, 20.93%, have incomes of less than $10,000. On the other hand, the smallest share of non-family households, 0.0%, earn $150,000 to $199,999.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Income Levels:

Less than $10,000

$10,000 to $14,999

$15,000 to $19,999

$20,000 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $59,999

$60,000 to $74,999

$75,000 to $99,999

$100,000 to $124,999

$125,000 to $149,999

$150,000 to $199,999

$200,000 or more

Variables / Data Columns

Income Level: The income bracket, ranging from Less than $10,000 to $200,000 or more, in South Range, MI (as listed above).

All Households: Count of households for the specified income level

% All Households: Percentage of households at the specified income level relative to the total households in South Range, MI

Family Households: Count of family households for the specified income level

% Family Households: Percentage of family households at the specified income level relative to the total family households in South Range, MI

Non-Family Households: Count of non-family households for the specified income level

% Non-Family Households: Percentage of non-family households at the specified income level relative to the total non-family households in South Range, MI
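
The percentage columns described above are each bracket's count divided by the total for that household type. A minimal sketch of that arithmetic follows; the counts and the function name `percent_shares` are hypothetical and are not taken from the dataset.

```python
# Hypothetical household counts per income bracket (not values from the dataset).
counts = {
    "Less than $10,000": 12,
    "$10,000 to $14,999": 8,
    "$60,000 to $74,999": 20,
}


def percent_shares(counts):
    """Return each bracket's share of the total, in percent, rounded to 2 places."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when no households are recorded.
        return {bracket: 0.0 for bracket in counts}
    return {bracket: round(100 * n / total, 2) for bracket, n in counts.items()}


shares = percent_shares(counts)
```

The same function would apply separately to the all, family, and non-family counts, since each percentage is relative to its own total.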

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for South Range median household income.
Summary and methods used to calculate the physical characteristics used to...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Mar 31, 2017
Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J. (2017). Summary and methods used to calculate the physical characteristics used to compare the home range estimators. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001743878
Dataset updated
Mar 31, 2017
Authors
Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J.
Description
Summary and methods used to calculate the physical characteristics used to compare the home range estimators.
🛒 Supermarket Data
kaggle.com
zip
Updated Jul 19, 2024
mexwell (2024). 🛒 Supermarket Data [Dataset]. https://www.kaggle.com/datasets/mexwell/supermarket-data/versions/1
Explore at:
Available download formats: zip (78427538 bytes)
Dataset updated
Jul 19, 2024
Authors
mexwell
Description
This is the dataset released as a companion to the paper “Explaining the Product Range Effect in Purchase Data”, presented at the BigData 2013 conference.

supermarket_distances: three columns. The first column is the customer id, the second is the shop id and the third is the distance between the customer’s house and the shop location. The distance is calculated in meters as a straight line, so it does not take the road graph into account.

supermarket_prices: two columns. The first column is the product id and the second column is its unit price. The price is in Euro and it is calculated as the average unit price for the time span of the dataset.

supermarket_purchases: four columns. The first column is the customer id, the second is the product id, the third is the shop id and the fourth is the total number of items of that product that the customer bought in that particular shop. The data was recorded from January 2007 to December 2011.
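
As an illustration of how the tables described above fit together, the sketch below joins purchases to prices on the product id and estimates total spend per customer. The rows are made up, and the comma delimiter and absence of a header row are assumptions; the real files may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up samples in the column order described above; real files may
# use another delimiter or include a header row.
prices_csv = "101,2.50\n102,0.80\n"
purchases_csv = "1,101,7,4\n1,102,7,10\n2,101,9,1\n"

# product id -> average unit price (Euro)
prices = {int(pid): float(price) for pid, price in csv.reader(io.StringIO(prices_csv))}

# customer id -> estimated total spend (quantity times average unit price)
spend = defaultdict(float)
for customer, product, shop, quantity in csv.reader(io.StringIO(purchases_csv)):
    spend[int(customer)] += prices[int(product)] * int(quantity)
```

Because the price table holds an average unit price over the whole time span, the result is an estimate of spend rather than the amount actually paid.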

Citation

Pennacchioli, D., Coscia, M., Rinzivillo, S., Pedreschi, D. and Giannotti, F., Explaining the Product Range Effect in Purchase Data. In BigData, 2013.

Acknowledgement

Photo by Eduardo Soares on Unsplash
Math Formula Retrieval
kaggle.com
huggingface.co
zip
Updated Dec 2, 2023
The Devastator (2023). Math Formula Retrieval [Dataset]. https://www.kaggle.com/datasets/thedevastator/math-formula-pair-classification-dataset/data
Explore at:
Available download formats: zip (2021716728 bytes)
Dataset updated
Dec 2, 2023
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Math Formula Retrieval

Math Formula Pair Classification Dataset

By ddrg (From Huggingface) [source]

About this dataset

With a total of six columns, including formula1, formula2, and label (binary format), the dataset provides the information needed for analysis and evaluation.

The train.csv file contains a subset of the dataset specifically curated for training purposes. It includes an extensive range of math formula pairs along with their corresponding labels and unique ID names. This allows researchers and data scientists to construct models that can predict whether two given formulas fall within the same category or not.

On the other hand, test.csv serves as an evaluation set. It consists of additional pairs of math formulas accompanied by their respective labels and unique IDs. By evaluating model performance on this test set after training it on train.csv data, researchers can assess how well their models generalize to unseen instances.

With this dataset, researchers can explore mathematics-related applications such as developing pattern recognition algorithms or enhancing educational tools that automatically identify and categorize mathematical formulas.

How to use the dataset

Introduction

Dataset Description

train.csv

The train.csv file contains a set of labeled math formula pairs along with their corresponding labels and formula name IDs. It consists of the following columns:

- formula1: The first mathematical formula in the pair (text).
- formula2: The second mathematical formula in the pair (text).
- label: The classification label indicating whether the pair of formulas belong to the same category or not (binary). A label value of 1 indicates that both formulas belong to the same category, while a label value of 0 indicates different categories.

test.csv

The purpose of the test.csv file is to provide a set of formula pairs along with their labels and formula name IDs for testing and evaluation purposes. It has an identical structure to train.csv, containing columns like formula1, formula2, label, etc.

Task

The main task using this dataset is binary classification, where your objective is to predict whether two mathematical formulas belong to the same category or not based on their textual representation. You can use various machine learning algorithms such as logistic regression, decision trees, random forests, or neural networks for training models on this dataset.
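
Before a formula pair can be fed to any of these algorithms, its text has to be turned into numeric features. One simple possibility, shown here only as an illustration and not something the dataset description prescribes, is the Jaccard similarity between the character bigram sets of the two formulas. The function names are hypothetical.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams in text, ignoring whitespace."""
    compact = "".join(text.split())
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


def jaccard(a, b, n=2):
    """Jaccard similarity between the n-gram sets of two formula strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n characters; treat as no overlap.
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


same = jaccard("a^2 + b^2 = c^2", "a^2+b^2=c^2")
different = jaccard("a^2+b^2=c^2", "E=mc")
```

A score like this could serve as one input feature to a classifier. On its own it only captures surface overlap, so two formulas in the same category that are written very differently would still score low.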

Exploring & Analyzing Data

Before building your model, it's crucial to explore and analyze your data. Here are some steps you can take:

Load both CSV files (train.csv and test.csv) into your preferred data analysis framework or programming language (e.g., Python with libraries like pandas).

Examine the dataset's structure, including the number of rows, columns, and data types.

Check for missing values in the dataset and handle them accordingly.

Visualize the distribution of labels to understand whether it is balanced or imbalanced.
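
The exploration steps above can be sketched with the Python standard library alone. The in-memory sample stands in for train.csv; its rows are made up, and the real file may contain additional columns beyond the three shown.

```python
import csv
import io
from collections import Counter

# Made-up stand-in for train.csv; header names follow the column description
# above, but the real file may contain additional columns.
sample = io.StringIO(
    "formula1,formula2,label\n"
    "a^2+b^2=c^2,c^2=a^2+b^2,1\n"
    "E=mc^2,F=ma,0\n"
    "\\sin^2 x+\\cos^2 x=1,,0\n"
)

rows = list(csv.DictReader(sample))
columns = list(rows[0].keys())

# Count empty cells per column as a simple missing-value check.
missing = {col: sum(1 for r in rows if not r[col].strip()) for col in columns}

# Label distribution, to see whether the classes are balanced.
label_counts = Counter(r["label"] for r in rows)
```

To run this against the actual file, replace `sample` with an opened file handle, for example `open("train.csv", newline="", encoding="utf-8")`.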

Model Building

Once you have analyzed and preprocessed your dataset, you can start building your classification model using various machine learning algorithms:

Split your train.csv data into training and validation sets for model evaluation during training.

Choose a suitable
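
The train/validation split mentioned above might look like the following sketch. The 80/20 ratio, the fixed seed, and the function name `train_validation_split` are assumptions chosen for illustration.

```python
import random


def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, validation) lists."""
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical (formula1, formula2, label) triples.
pairs = [(f"f{i}", f"g{i}", i % 2) for i in range(10)]
train, val = train_validation_split(pairs)
```

If the labels turn out to be imbalanced, a stratified split that preserves the label ratio in both parts may be preferable to this plain shuffle.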

Research Ideas

Math Formula Similarity: This dataset can be used to develop a model that classifies whether two mathematical formulas are similar or not. This can be useful in various applications such as plagiarism detection, identifying duplicate formulas in databases, or suggesting similar formulas based on user input.

Formula Categorization: The dataset can be used to train a model that categorizes mathematical formulas into different classes or categories. For example, the model can classify formulas into algebraic expressions, trigonometric equations, calculus problems, or geometric theorems. This categorization can help organize and search through large collections of mathematical formulas.

Formula Recommendation: Using this dataset, one could build a recommendation system that suggests related math formulas based on user input. By analyzing the similarities between different formula pairs and their corresponding labels, the system could provide recommendations for relevant mathematical concepts that users may need while solving problems or studying specific topics in mathematics.

Acknowle...
Data from: Software for Computing and Annotating Genomic Ranges
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Aug 8, 2013
Carlson, Marc; Carey, Vincent J.; Gentleman, Robert; Lawrence, Michael; Pagès, Hervé; Morgan, Martin T.; Huber, Wolfgang; Aboyoun, Patrick (2013). Software for Computing and Annotating Genomic Ranges [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001647970
Dataset updated
Aug 8, 2013
Authors
Carlson, Marc; Carey, Vincent J.; Gentleman, Robert; Lawrence, Michael; Pagès, Hervé; Morgan, Martin T.; Huber, Wolfgang; Aboyoun, Patrick
Description
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
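
To make the idea of overlap detection between ranges concrete, here is a small Python sketch. It is not the Bioconductor implementation, which is written for R and uses more efficient algorithms; this version compares every pair of ranges and assumes that both ends of a range are inclusive. The function name `find_overlaps` is hypothetical.

```python
def find_overlaps(query, subject):
    """Return (i, j) index pairs where query[i] overlaps subject[j].

    Ranges are (start, end) tuples with both ends inclusive.
    """
    hits = []
    for i, (q_start, q_end) in enumerate(query):
        for j, (s_start, s_end) in enumerate(subject):
            # Two closed intervals overlap when each starts before the other ends.
            if q_start <= s_end and s_start <= q_end:
                hits.append((i, j))
    return hits


query = [(1, 10), (20, 30)]
subject = [(5, 15), (31, 40), (25, 26)]
hits = find_overlaps(query, subject)
```

The pairwise comparison takes time proportional to the product of the two list lengths, so it is only practical for small inputs.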
