54 datasets found
  1. INSPIRE Priority Data Set (Compliant) - Species range

    • inspire-geoportal.ec.europa.eu
    • inspire-geoportal.lt
    • +1more
    Updated Aug 26, 2020
    Cite
    Construction Sector Development Agency (2020). INSPIRE Priority Data Set (Compliant) - Species range [Dataset]. https://inspire-geoportal.ec.europa.eu/srv/api/records/bfcc7a93-dd66-453b-b7f5-9fc4a868e69f
    Explore at:
    Available download formats: www:download-1.0-http--download, www:link-1.0-http--link, ogc:wms-1.3.0-http-get-map
    Dataset updated
    Aug 26, 2020
    Dataset provided by
    State Service for Protected Areas under the Ministry of Environment
    Construction Sector Development Agency
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations

    Area covered
    Description

    INSPIRE Priority Data Set (Compliant) - Species range

  2. Path loss at 5G high frequency range in South Asia

    • kaggle.com
    Updated Apr 25, 2023
    Cite
    S M MEHEDI ZAMAN (2023). Path loss at 5G high frequency range in South Asia [Dataset]. https://www.kaggle.com/datasets/smmehedizaman/path-loss-at-5g-high-frequency-range-in-south-asia
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 25, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    S M MEHEDI ZAMAN
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Asia, South Asia
    Description

    This dataset has been generated using NYUSIM 3.0 mm-Wave channel simulator software, which takes into account atmospheric data such as rain rate, humidity, barometric pressure, and temperature. The input data was collected over the course of a year in South Asia. As a result, the dataset provides an accurate representation of the seasonal variations in mm-wave channel characteristics in these areas.

    The dataset includes a total of 2835 records, each of which contains T-R Separation Distance (m), Time Delay (ns), Received Power (dBm), Phase (rad), Azimuth AoD (degree), Elevation AoD (degree), Azimuth AoA (degree), Elevation AoA (degree), RMS Delay Spread (ns), Season, Frequency, and Path Loss (dB). Four main seasons have been considered in this dataset: Spring, Summer, Fall, and Winter. Each season is subdivided into three parts (i.e., low, medium, and high) to accurately include the atmospheric variations in a season. To simulate the path loss, realistic Tx and Rx height, NLoS environment, and mean human blockage attenuation effects have been taken into consideration. The data has been preprocessed and normalized to ensure consistency and ease of use.

    Researchers in the field of mm-wave communications and networking can use this dataset to study the impact of atmospheric conditions on mm-wave channel characteristics and develop more accurate models for predicting channel behavior. The dataset can also be used to evaluate the performance of different communication protocols and signal processing techniques under varying weather conditions. Note that while the data was collected specifically in the South Asia region, the high correlation between the weather patterns in this region and other areas means that the dataset may also be applicable to other regions with similar atmospheric conditions.

    Acknowledgements The paper in which the dataset was proposed is available on: https://ieeexplore.ieee.org/abstract/document/10307972

    Citation

    If you use this dataset, please cite the following paper:

    Rashed Hasan Ratul, S. M. Mehedi Zaman, Hasib Arman Chowdhury, Md. Zayed Hassan Sagor, Mohammad Tawhid Kawser, and Mirza Muntasir Nishat, “Atmospheric Influence on the Path Loss at High Frequencies for Deployment of 5G Cellular Communication Networks,” 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2023, pp. 1–6. https://doi.org/10.1109/ICCCNT56998.2023.10307972

    BibTeX

    ```bibtex
    @inproceedings{Ratul2023Atmospheric,
      author    = {Ratul, Rashed Hasan and Zaman, S. M. Mehedi and Chowdhury, Hasib Arman and Sagor, Md. Zayed Hassan and Kawser, Mohammad Tawhid and Nishat, Mirza Muntasir},
      title     = {Atmospheric Influence on the Path Loss at High Frequencies for Deployment of {5G} Cellular Communication Networks},
      booktitle = {2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT)},
      year      = {2023},
      pages     = {1--6},
      doi       = {10.1109/ICCCNT56998.2023.10307972},
      keywords  = {Wireless communication; Fluctuations; Rain; 5G mobile communication; Atmospheric modeling; Simulation; Predictive models; 5G-NR; mm-wave propagation; path loss; atmospheric influence; NYUSIM; ML}
    }
    ```

  3. Fused Image dataset for convolutional neural Network-based crack Detection...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 20, 2023
    Cite
    Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. http://doi.org/10.5281/zenodo.6383044
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shanglian Zhou; Carlos Canchila; Wei Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

    The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.

    If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

    In addition, an image dataset for crack classification has also been published at [6].

    References:

    [1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

    [2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

    [3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

    [4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

    [5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

    [6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

  4. Credit Card Eligibility Data: Determining Factors

    • kaggle.com
    zip
    Updated May 18, 2024
    Cite
    Rohit Sharma (2024). Credit Card Eligibility Data: Determining Factors [Dataset]. https://www.kaggle.com/datasets/rohit265/credit-card-eligibility-data-determining-factors
    Explore at:
    Available download formats: zip (303227 bytes)
    Dataset updated
    May 18, 2024
    Authors
    Rohit Sharma
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of the Credit Card Eligibility Data: Determining Factors

    The Credit Card Eligibility Dataset: Determining Factors is a comprehensive collection of variables aimed at understanding the factors that influence an individual's eligibility for a credit card. This dataset encompasses a wide range of demographic, financial, and personal attributes that are commonly considered by financial institutions when assessing an individual's suitability for credit.

    Each row in the dataset represents a unique individual, identified by a unique ID, with associated attributes ranging from basic demographic information such as gender and age, to financial indicators like total income and employment status. Additionally, the dataset includes variables related to familial status, housing, education, and occupation, providing a holistic view of the individual's background and circumstances.

    Variables and their descriptions:

    • ID: An identifier for each individual (customer).
    • Gender: The gender of the individual.
    • Own_car: A binary feature indicating whether the individual owns a car.
    • Own_property: A binary feature indicating whether the individual owns a property.
    • Work_phone: A binary feature indicating whether the individual has a work phone.
    • Phone: A binary feature indicating whether the individual has a phone.
    • Email: A binary feature indicating whether the individual has provided an email address.
    • Unemployed: A binary feature indicating whether the individual is unemployed.
    • Num_children: The number of children the individual has.
    • Num_family: The total number of family members.
    • Account_length: The length of the individual's account with a bank or financial institution.
    • Total_income: The total income of the individual.
    • Age: The age of the individual.
    • Years_employed: The number of years the individual has been employed.
    • Income_type: The type of income (e.g., employed, self-employed, etc.).
    • Education_type: The education level of the individual.
    • Family_status: The family status of the individual.
    • Housing_type: The type of housing the individual lives in.
    • Occupation_type: The type of occupation the individual is engaged in.
    • Target: The target variable for the classification task, indicating whether the individual is eligible for a credit card or not (e.g., Yes/No, 1/0).

    Researchers, analysts, and financial institutions can leverage this dataset to gain insights into the key factors influencing credit card eligibility and to develop predictive models that assist in automating the credit assessment process. By understanding the relationship between various attributes and credit card eligibility, stakeholders can make more informed decisions, improve risk assessment strategies, and enhance customer targeting and segmentation efforts.

    This dataset is valuable for a wide range of applications within the financial industry, including credit risk management, customer relationship management, and marketing analytics. Furthermore, it provides a valuable resource for academic research and educational purposes, enabling students and researchers to explore the intricate dynamics of credit card eligibility determination.

  5. South Range, MI Population Breakdown by Gender

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    Cite
    Neilsberg Research (2023). South Range, MI Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/658fcb29-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Michigan, South Range
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of South Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of South Range across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of male population, with 50.54% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of each gender in South Range is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of South Range's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for South Range Population by Gender.

  6. Dataset for "High-throughput phenotyping to characterise range use behaviour...

    • entrepot.recherche.data.gouv.fr
    bin +4
    Updated Jan 31, 2024
    Cite
    Julie Collet; Claire Bonnefous; Karine Germain; Laure Ravon; Ludovic Calandreau; Vanessa Guesdon; Anne Collin; Elisabeth Le Bihan-Duval; Sandrine Mignon-Grasteau (2024). Dataset for "High-throughput phenotyping to characterise range use behaviour in broiler chickens" [Dataset]. http://doi.org/10.57745/JUDHTG
    Explore at:
    Available download formats: tsv(13468), bin(7829), bin(7706), txt(1910), tsv(5600), text/comma-separated-values(1374092123), tsv(12835), bin(7008), text/comma-separated-values(1057246321), text/comma-separated-values(2204116241), type/x-r-syntax(69557), tsv(44362)
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Julie Collet; Claire Bonnefous; Karine Germain; Laure Ravon; Ludovic Calandreau; Vanessa Guesdon; Anne Collin; Elisabeth Le Bihan-Duval; Sandrine Mignon-Grasteau
    License

    https://spdx.org/licenses/etalab-2.0.html

    Time period covered
    Mar 31, 2021 - Dec 23, 2021
    Dataset funded by
    European Commission
    Description

    A key characteristic of free-range chicken farming is to enable chickens to spend time outdoors. However, each chicken may use the available areas for roaming in variable ways. To check if, and how, broilers use their outdoor range at an individual level, we need to reliably characterise range use behaviour. Traditional methods relying on visual scans require significant time investment and only provide discontinuous information. Passive RFID (Radio Frequency Identification) systems enable tracking individually tagged chickens when they go through pop-holes; hence they only provide partial information on the movements of individual chickens. Here, we describe a new method to measure chickens’ range use and test its reliability on three ranges each containing a different breed. We used an active RFID system to localise chickens in their barn, or in one of nine zones of their range, every 30 seconds and assessed range-use behaviour in 600 chickens belonging to three breeds of slow- or medium-growing broilers used for outdoor production (all < 40g daily weight gain). From those real-time locations, we determined five measures to describe daily range use: time spent in the barn, number of outdoor accesses, number of zones visited in a day, gregariousness (an index that increases when birds spend time in zones where other birds are), and number of zone changes. Principal Component Analyses (PCAs) were performed on those measures, in each production system, to create two synthetic indicators of chickens’ range use behaviour. Our dataset includes the files needed to calibrate the system (supplementary materials), the data files used in the publication and the associated codes.
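
As a rough illustration of how four of the daily measures named above could be derived from 30-second location samples, here is a minimal Python sketch for a single bird. It is not the authors' code. The location encoding (the string "barn" or a zone number 1 to 9), the names `summarise_day` and `DailyRangeUse`, and the exact counting rules (an outdoor access is a barn-to-zone transition; a zone change is a move between two different outdoor zones) are all assumptions made for illustration. Gregariousness is left out because the description does not define its formula.

```python
from dataclasses import dataclass
from typing import Sequence, Union

SAMPLE_SECONDS = 30  # sampling interval stated in the dataset description
BARN = "barn"        # hypothetical label for the indoor location

Location = Union[str, int]  # "barn" or an outdoor zone number 1..9 (assumed encoding)


@dataclass
class DailyRangeUse:
    seconds_in_barn: int
    outdoor_accesses: int
    zones_visited: int
    zone_changes: int


def summarise_day(locations: Sequence[Location]) -> DailyRangeUse:
    """Summarise one bird's day from consecutive 30-second location samples."""
    # Each sample spent in the barn counts for one sampling interval.
    seconds_in_barn = sum(SAMPLE_SECONDS for loc in locations if loc == BARN)
    # Distinct outdoor zones seen at least once during the day.
    zones_visited = len({loc for loc in locations if loc != BARN})

    outdoor_accesses = 0
    zone_changes = 0
    for prev, cur in zip(locations, locations[1:]):
        # Assumed rule: an outdoor access is a barn -> zone transition.
        if prev == BARN and cur != BARN:
            outdoor_accesses += 1
        # Assumed rule: a zone change is a move between two different outdoor zones.
        if prev != BARN and cur != BARN and prev != cur:
            zone_changes += 1

    return DailyRangeUse(seconds_in_barn, outdoor_accesses, zones_visited, zone_changes)


# Made-up sample sequence, purely for illustration.
example = [BARN, BARN, 1, 1, 2, BARN, 3, 3, 1]
summary = summarise_day(example)
print(summary)
```

Other counting rules are equally plausible (for example, counting a return to the barn as a zone change), so the definitions should be checked against the publication before reuse.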

  7. ANN development + final testing datasets

    • data.niaid.nih.gov
    • resodate.org
    • +1more
    Updated Jan 24, 2020
    Cite
    Authors (2020). ANN development + final testing datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1445865
    Explore at:
    Dataset updated
    Jan 24, 2020
    Authors
    Authors
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    File name definitions:

    '...v_50_175_250_300...' - dataset for velocity ranges [50, 175] + [250, 300] m/s

    '...v_175_250...' - dataset for velocity range [175, 250] m/s

    'ANNdevelop...' - used to perform 9 parametric sub-analyses where, in each one, many ANNs are developed (trained, validated and tested) and the one yielding the best results is selected

    'ANNtest...' - used to test the best ANN from each aforementioned parametric sub-analysis, aiming to find the best ANN model; this dataset includes the 'ANNdevelop...' counterpart

    Where to find the input (independent) and target (dependent) variable values for each dataset/Excel file?

    input values in 'IN' sheet

    target values in 'TARGET' sheet

    Where to find the results from the best ANN model (for each target/output variable and each velocity range)?

    open the corresponding excel file and the expected (target) vs ANN (output) results are written in 'TARGET vs OUTPUT' sheet

    Check reference below (to be added when the paper is published)

    https://www.researchgate.net/publication/328849817_11_Neural_Networks_-_Max_Disp_-_Railway_Beams

  8. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary)

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract

    We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description

    “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages:

    • For running “CWVS_LMC.txt”:
      • msm: Sampling from the truncated normal distribution
      • mnormt: Sampling from the multivariate normal distribution
      • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
      • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data

    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability

    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
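
The per-week standardization described above (subtract that week's median exposure, then divide by that week's interquartile range) can be sketched in a few lines. The dataset's own code is in R; the Python below is only an illustration, not the authors' implementation. The function name `standardise_by_week`, the rows-as-individuals layout, the toy input values, and the "inclusive" quartile convention are all assumptions, and a different quartile convention would give slightly different IQRs.

```python
from statistics import median, quantiles
from typing import List


def standardise_by_week(exposures: List[List[float]]) -> List[List[float]]:
    """Standardise each week (column) as (value - median) / IQR.

    `exposures` has one row per individual and one column per week
    (assumed layout, mirroring the description of the matrix z).
    """
    n_weeks = len(exposures[0])
    result = [[0.0] * n_weeks for _ in exposures]
    for week in range(n_weeks):
        column = [row[week] for row in exposures]
        centre = median(column)
        # Quartile convention is an assumption; "inclusive" treats the data
        # as containing the population minimum and maximum.
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1
        if iqr == 0:
            raise ValueError(f"week {week} has zero IQR; cannot standardise")
        for i, value in enumerate(column):
            result[i][week] = (value - centre) / iqr
    return result


# Made-up exposures for five individuals over two weeks (illustration only).
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
z = standardise_by_week(raw)
print(z)
```

Because both toy columns have the same shape, they map to the same standardized values. This shows why publishing only the standardized matrix, without the medians and IQRs, hides the original exposure scale.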

  9. Grass Range, MT Population Breakdown by Gender

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    Cite
    Neilsberg Research (2023). Grass Range, MT Population Breakdown by Gender [Dataset]. https://www.neilsberg.com/research/datasets/649529eb-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Montana, Grass Range
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Grass Range by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Grass Range across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 52.63% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of each gender in Grass Range is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Grass Range's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Grass Range Population by Gender. You can refer the same here

  10. Data from: FISBe: A real-world benchmark dataset for instance segmentation...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, json +3
    Updated Apr 2, 2024
    Cite
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. http://doi.org/10.5281/zenodo.10875063
    Explore at:
    Available download formats: zip, text/x-python, bin, json, txt
    Dataset updated
    Apr 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 26, 2024
    Description

    General

    For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

    Summary

    • A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains
      • 30 completely labeled (segmented) images
      • 71 partly labeled images
      • altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30-60 min on average, yet a difficult one can take up to 4 hours)
    • To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects
    • A set of metrics and a novel ranking score for respective meaningful method benchmarking
    • An evaluation of three baseline methods in terms of the above metrics and score

    Abstract

    Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

    Dataset documentation:

    We provide detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

    >> FISBe Datasheet

    Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

    Files

    • fisbe_v1.0_{completely,partly}.zip
      • contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.
    • fisbe_v1.0_mips.zip
      • maximum intensity projections of all samples, for convenience.
    • sample_list_per_split.txt
      • a simple list of all samples and the subset they are in, for convenience.
    • view_data.py
      • a simple python script to visualize samples, see below for more information on how to use it.
    • dim_neurons_val_and_test_sets.json
      • a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.
    • Readme.md
      • general information

    How to work with the image files

    Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
    For each image, we provide a pixel-wise instance segmentation for all separable neurons.
    Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
    The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
    The segmentation mask for each neuron is stored in a separate channel.
    The order of dimensions is CZYX.

    We recommend working in a virtual environment, e.g., by using conda:

    conda create -y -n flylight-env -c conda-forge python=3.9
    conda activate flylight-env

    How to open zarr files

    1. Install the python zarr package:
      pip install zarr
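    2. Open a zarr file with:

      import zarr

      # placeholder: replace with the path to one sample's zarr file
      path_to_zarr = "path/to/sample.zarr"
      raw = zarr.open(path_to_zarr, mode='r', path="volumes/raw")
      seg = zarr.open(path_to_zarr, mode='r', path="volumes/gt_instances")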

      # optional:
      import numpy as np
      raw_np = np.array(raw)

    Zarr arrays are read lazily on-demand.
    Many functions that expect numpy arrays also work with zarr arrays.
    Optionally, the arrays can also explicitly be converted to numpy arrays.

    How to view zarr image files

    We recommend using napari to view the image data.

    1. Install napari:
      pip install "napari[all]"
    2. Save the following Python script:

      import zarr, sys, napari

      raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
      gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

      viewer = napari.Viewer(ndisplay=3)
      for idx, gt in enumerate(gts):
          viewer.add_labels(
              gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
      viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
      viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
      viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
      napari.run()

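    3. Execute, passing the path to a sample's zarr file as the first argument (the script reads it from sys.argv[1]):
      python view_data.py <path_to_zarr>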

    Metrics

    • S: Average of avF1 and C
    • avF1: Average F1 Score
    • C: Average ground truth coverage
    • clDice_TP: Average true positives clDice
    • FS: Number of false splits
    • FM: Number of false merges
    • tp: Relative number of true positives

    For more information on our selected metrics and formal definitions please see our paper.
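
    As a small illustration of the ranking score S listed above, the sketch below combines avF1 and C. It assumes that "average" means the plain arithmetic mean of the two values; the function and argument names are hypothetical, and the formal definitions are in the paper.

```python
def ranking_score(av_f1: float, coverage: float) -> float:
    """S as described in the metric list: the average of avF1 and C.

    Assumption: "average" is the arithmetic mean of the two values.
    """
    return 0.5 * (av_f1 + coverage)


# Illustrative inputs only, not results from the benchmark.
example_s = ranking_score(0.5, 0.25)
print(example_s)
```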

    Baseline

    To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al.
    For detailed information on the methods and the quantitative results please see our paper.

    License

    The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    Citation

    If you use FISBe in your research, please use the following BibTeX entry:

    @misc{mais2024fisbe,
     title =    {FISBe: A real-world benchmark dataset for instance
             segmentation of long-range thin filamentous structures},
     author =    {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
             Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena
             Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
     year =     2024,
     eprint =    {2404.00130},
     archivePrefix ={arXiv},
     primaryClass = {cs.CV}
    }

    Acknowledgments

    We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions.
    P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
    This work was co-funded by Helmholtz Imaging.

    Changelog

    There have been no changes to the dataset so far.
    All future changes will be listed on the changelog page.

    Contributing

    If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

    All contributions are welcome!

  11. Grass Range, MT annual income distribution by work experience and gender...

    • neilsberg.com
    csv, json
    Updated Feb 27, 2025
    Cite
    Neilsberg Research (2025). Grass Range, MT annual income distribution by work experience and gender dataset: Number of individuals ages 15+ with income, 2023 // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/grass-range-mt-income-by-gender/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 27, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Montana, Grass Range
    Variables measured
    Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time, Number of males working full time for a given income bracket, Number of males working part time for a given income bracket, Number of females working full time for a given income bracket, Number of females working part time for a given income bracket
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the number of individuals of each gender (male and female) within each income bracket, we conducted an initial analysis and categorization of the American Community Survey data. Households are categorized, and median incomes are reported, based on the self-identified gender of the head of the household. For additional information about these estimates, please contact us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type - full-time (FT) and part-time (PT), offering insight into the income landscape within Grass Range. The dataset can be used to study gender-based income distribution within the Grass Range population, aiding data analysis and decision-making.

    Key observations

    • Employment patterns: Within Grass Range, among individuals aged 15 years and older with income, there were 26 men and 37 women in the workforce. Among them, 8 men were engaged in full-time, year-round employment, while 5 women were in full-time, year-round roles.
    • Annual income under $24,999: Of the male population working full-time, 12.50% fell within the income range of under $24,999, while none of the female population working full-time was represented in the same income bracket.
    • Annual income above $100,000: 37.50% of men in full-time roles earned incomes exceeding $100,000, while no women in full-time positions fell within this income bracket.
    • Refer to the research insights for more key observations on more income brackets (annual income under $24,999, between $25,000 and $49,999, between $50,000 and $74,999, between $75,000 and $99,999, and above $100,000) and employment types (full-time year-round and part-time).
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income brackets:

    • $1 to $2,499 or loss
    • $2,500 to $4,999
    • $5,000 to $7,499
    • $7,500 to $9,999
    • $10,000 to $12,499
    • $12,500 to $14,999
    • $15,000 to $17,499
    • $17,500 to $19,999
    • $20,000 to $22,499
    • $22,500 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $54,999
    • $55,000 to $64,999
    • $65,000 to $74,999
    • $75,000 to $99,999
    • $100,000 or more
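
    The bracket list above can be turned into a simple lookup. The sketch below is one possible way to map an annual income to its bracket label with a binary search over the lower bounds; the names are hypothetical, and it assumes whole-dollar amounts and that anything below $2,500 (including losses) belongs to the first bracket.

```python
from bisect import bisect_right

# Lower bound of each of the 20 brackets listed above.
LOWER_BOUNDS = [
    1, 2500, 5000, 7500, 10000, 12500, 15000, 17500, 20000, 22500,
    25000, 30000, 35000, 40000, 45000, 50000, 55000, 65000, 75000, 100000,
]

# Build the labels so that they match the wording of the list.
LABELS = []
for i, lo in enumerate(LOWER_BOUNDS):
    if i == 0:
        LABELS.append("$1 to $2,499 or loss")
    elif i == len(LOWER_BOUNDS) - 1:
        LABELS.append(f"${lo:,} or more")
    else:
        LABELS.append(f"${lo:,} to ${LOWER_BOUNDS[i + 1] - 1:,}")


def income_bracket(income: int) -> str:
    """Return the bracket label for a whole-dollar annual income.

    Assumption: incomes below $2,500, including losses, fall in the first bracket.
    """
    idx = bisect_right(LOWER_BOUNDS, income) - 1
    return LABELS[max(idx, 0)]


print(income_bracket(24999))
print(income_bracket(100000))
```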

    Variables / Data Columns

    • Income Bracket: This column lists 20 income brackets ranging from $1 to $100,000+.
    • Full-Time Males: The count of males employed full-time year-round and earning within a specified income bracket
    • Part-Time Males: The count of males employed part-time and earning within a specified income bracket
    • Full-Time Females: The count of females employed full-time year-round and earning within a specified income bracket
    • Part-Time Females: The count of females employed part-time and earning within a specified income bracket

    Employment type classifications include:

    • Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.
    • Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.
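
    The two definitions above can be expressed as a small rule. The following is a sketch under stated assumptions: the function name and the "other" label are hypothetical, and the "other" case covers workers with 35 or more hours per week but fewer than 50 weeks, which neither definition above addresses.

```python
def employment_type(hours_per_week: float, weeks_worked: int) -> str:
    """Classify a worker using the thresholds given in the definitions above."""
    if hours_per_week < 35:
        return "part-time"
    if weeks_worked >= 50:
        return "full-time, year-round"
    # 35+ hours per week but fewer than 50 weeks: not covered by either definition.
    return "other"


print(employment_type(40, 52))
print(employment_type(20, 52))
```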

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Grass Range median household income by race, which you can refer to for further detail.

  12. Data from: Contrasting effects of host or local specialization: widespread...

    • data.niaid.nih.gov
    • ourarchive.otago.ac.nz
    • +3more
    zip
    Updated Mar 13, 2024
    Cite
    Daniela de Angeli Dutra; Gabriel Moreira Félix; Robert Poulin (2024). Contrasting effects of host or local specialization: widespread haemosporidians are host generalist whereas local specialists are locally abundant [Dataset]. http://doi.org/10.5061/dryad.j3tx95xfb
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 13, 2024
    Dataset provided by
    University of Otago
    Universidade Estadual de Campinas (UNICAMP)
    Authors
    Daniela de Angeli Dutra; Gabriel Moreira Félix; Robert Poulin
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Aim: Despite the wide distribution of many parasites around the globe, the range of individual species varies significantly even among phylogenetically related taxa. Since parasites need suitable hosts to complete their development, parasite geographical and environmental ranges should be limited to communities where their hosts are found. Parasites may also suffer from a trade-off between being locally abundant or widely dispersed. We hypothesize that the geographical and environmental ranges of parasites are negatively associated with their host specificity and their local abundance.

    Location: Worldwide

    Time period: 2009 to 2021

    Major taxa studied: Avian haemosporidian parasites

    Methods: We tested these hypotheses using a global database which comprises data on avian haemosporidian parasites from across the world. For each parasite lineage, we computed five metrics: phylogenetic host-range, environmental range, geographical range, and their mean local and total number of observations in the database. Phylogenetic generalized least squares models were run to evaluate the influence of phylogenetic host-range and total and local abundances on geographical and environmental range. In addition, we analysed separately the two regions with the largest amount of available data: Europe and South America.

    Results: We evaluated 401 lineages from 757 localities and observed that generalism (i.e. phylogenetic host range) associates positively with both the parasites’ geographical and environmental ranges at global and European scales. For South America, generalism only associates with geographical range. Finally, mean local abundance (mean local number of parasite occurrences) was negatively related to geographical and environmental range. This pattern was detected worldwide and in South America, but not in Europe.

    Main Conclusions: We demonstrate that parasite specificity is linked to both their geographical and environmental ranges. The fact that locally abundant parasites present restricted ranges indicates a trade-off between these two traits. This trade-off, however, only becomes evident when sufficiently heterogeneous host communities are considered.

    Methods

    We compiled data on haemosporidian lineages from the MalAvi database (http://130.235.244.92/Malavi/, Bensch et al. 2009) including all the data available from the “Grand Lineage Summary” representing Plasmodium and Haemoproteus genera from wild birds and that contained information regarding location. After checking for duplicated sequences, this dataset comprised a total of ~6200 sequenced parasites representing 1602 distinct lineages (775 Plasmodium and 827 Haemoproteus) collected from 1139 different host species and 757 localities from all continents except Antarctica (Supplementary figure 1, Supplementary Table 1). The parasite lineages deposited in MalAvi are based on a cyt b fragment of 478 bp. This dataset was used to calculate the parasites’ geographical, environmental and phylogenetic ranges.

    Geographical range

    All analyses in this study were performed using R version 4.02. In order to estimate the geographical range of each parasite lineage, we applied the R package “GeoRange” (Boyle, 2017) and chose the variable minimum spanning tree distance (i.e., the shortest total distance of all lines connecting each locality where a particular lineage has been found). Using the function “create.matrix” from the “fossil” package, we created a matrix of lineages and coordinates and employed the function “GeoRange_MultiTaxa” to calculate the minimum spanning tree distance for each parasite lineage (i.e., the shortest total distance in kilometers of all lines connecting each locality). As at least two distinct sites are necessary to calculate this distance, parasites observed in a single locality could not have their geographical range estimated. For this reason, only parasites observed in two or more localities were considered in our phylogenetically controlled least squares (PGLS) models.

    Host and Environmental diversity

    Traditionally, ecologists use Shannon entropy to measure diversity in ecological assemblages (Pielou, 1966). The Shannon entropy of a set of elements is related to the degree of uncertainty someone would have about the identity of a randomly selected element of that set (Jost, 2006). Thus, Shannon entropy matches our intuitive notion of biodiversity: the more diverse an assemblage is, the greater the uncertainty regarding which species a randomly selected individual belongs to. Shannon diversity increases with both the assemblage richness (e.g., the number of species) and evenness (e.g., uniformity in abundance among species). To compare the diversity of assemblages that vary in richness and evenness in a more intuitive manner, we can normalize diversities by Hill numbers (Chao et al., 2014b). The Hill number of an assemblage represents the effective number of species in the assemblage, i.e., the number of equally abundant species that are needed to give the same value of the diversity metric in that assemblage.

    Hill numbers can be extended to incorporate phylogenetic information. In such a case, instead of species, we are measuring the effective number of phylogenetic entities in the assemblage. Here, we computed phylogenetic host-range as the phylogenetic Hill number associated with the assemblage of hosts found infected by a given parasite. Analyses were performed using the function “hill_phylo” from the “hillr” package (Chao et al., 2014a). Hill numbers are parameterized by a parameter “q” that determines the sensitivity of the metric to relative species abundance. Different “q” values produce Hill numbers associated with different diversity metrics. We set q = 1 to compute the Hill number associated with Shannon diversity. Here, low Hill numbers indicate specialization on a narrow phylogenetic range of hosts, whereas a higher Hill number indicates generalism across a broader phylogenetic spectrum of hosts.

    We also used Hill numbers to compute the environmental range of sites occupied by each parasite lineage. Firstly, we collected the 19 bioclimatic variables from WorldClim version 2 (http://www.worldclim.com/version2) for all sites used in this study (N = 713). Then, we standardized the 19 variables by centering and scaling them by their respective mean and standard deviation. Thereafter, we computed the pairwise Euclidean environmental distance among all sites and used this distance to compute a dissimilarity cluster. Finally, as for the phylogenetic Hill number, we used this dissimilarity cluster to compute the environmental Hill number of the assemblage of sites occupied by each parasite lineage. The environmental Hill number for each parasite can be interpreted as the effective number of environmental conditions in which a parasite lineage occurs. Thus, the higher the environmental Hill number, the more generalist the parasite is regarding the environmental conditions in which it can occur.

    Parasite phylogenetic tree

    A Bayesian phylogenetic reconstruction was performed. We built a tree for all parasite sequences for which we were able to estimate the parasite’s geographical, environmental and phylogenetic ranges (see above); this represented 401 distinct parasite lineages. This inference was produced using MrBayes 3.2.2 (Ronquist & Huelsenbeck, 2003) with the GTR + I + G model of nucleotide evolution, as recommended by ModelTest (Posada & Crandall, 1998), which selects the best-fit nucleotide substitution model for a set of genetic sequences. We ran four Markov chains simultaneously for a total of 7.5 million generations that were sampled every 1000 generations. The first 1250 million trees (25%) were discarded as a burn-in step and the remaining trees were used to calculate the posterior probabilities of each estimated node in the final consensus tree. Our final tree obtained a cumulative posterior probability of 0.999. Leucocytozoon caulleryi was used as the outgroup to root the phylogenetic tree as Leucocytozoon spp. represents a basal group within avian haemosporidians (Pacheco et al., 2020).
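
    To make the Hill number idea from the methods above concrete, here is a minimal Python sketch of the ordinary, non-phylogenetic Hill number. It is not the authors' code: the study used the R function “hill_phylo” from the “hillr” package, which additionally weights by phylogeny. The function name and the abundance values are hypothetical. For q = 1 the Hill number is the exponential of Shannon entropy; for other q it is the standard power-sum form.

```python
import math


def hill_number(abundances, q=1.0):
    """Effective number of species (Hill number of order q) for a list of abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    props = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of Shannon entropy.
        shannon = -sum(p * math.log(p) for p in props)
        return math.exp(shannon)
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical host counts: an even assemblage versus a dominated one.
even = hill_number([5, 5, 5, 5])
skewed = hill_number([17, 1, 1, 1])
print(even, skewed)
```

    Four equally abundant hosts give an effective number close to 4, while an assemblage dominated by one host gives a much lower value, which matches the reading of low Hill numbers as specialization.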

  13. Guns Close Range Dataset

    • universe.roboflow.com
    zip
    Updated Oct 22, 2025
    Cite
    Computer vision (2025). Guns Close Range Dataset [Dataset]. https://universe.roboflow.com/computer-vision-kcsdu/guns-close-range-7hqvz/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 22, 2025
    Dataset authored and provided by
    Computer vision
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Objects Objects Obj 2SfO Bounding Boxes
    Description

    Guns Close Range

    ## Overview
    
    Guns Close Range is a dataset for object detection tasks - it contains Objects Objects Objects Obj 2SfO annotations for 682 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  14. mmWave-based Fitness Activity Recognition Dataset

    • zenodo.org
    png, zip
    Updated Jul 12, 2024
    Cite
    Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen (2024). mmWave-based Fitness Activity Recognition Dataset [Dataset]. http://doi.org/10.5281/zenodo.7793613
    Explore at:
    Available download formats: zip, png
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo
    Authors
    Yucheng Xie; Xiaonan Guo; Yan Wang; Jerry Cheng; Yingying Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description:

    This mmWave dataset is used for fitness activity identification. The dataset (FA Dataset) contains 14 common daily fitness activities. The data are captured by the mmWave radar TI-AWR1642. The dataset can be used by fellow researchers to reproduce the original work or to further explore other machine-learning problems in the domain of mmWave signals.

    Format: .png format

    Section 1: Device Configuration

    Section 2: Data Format

    We provide our mmWave data in heatmaps for this dataset. The data file is in the png format. The details are shown in the following:

    • 14 activities are included in the FA Dataset.
    • 2 participants are included in the FA Dataset.
    • FA_d_p_i_u_j.png:
      • d represents the date on which the fitness data were collected.
      • p represents the environment in which the fitness data were collected.
      • i represents fitness activity type index
      • u represents user id
      • j represents sample index
    • Example:
      • FA_20220101_lab_1_2_3 represents the 3rd data sample of user 2 of activity 1 collected in the lab
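
    The naming convention above lends itself to a small parser. The following is a hedged sketch, not code shipped with the dataset: the names `FASample` and `parse_fa_name` are hypothetical, and it assumes that the environment field contains no underscore and that the activity, user and sample fields are integers.

```python
from pathlib import PurePath
from typing import NamedTuple


class FASample(NamedTuple):
    date: str          # d: collection date, e.g. "20220101"
    environment: str   # p: collection environment, e.g. "lab"
    activity: int      # i: fitness activity type index
    user: int          # u: user id
    sample: int        # j: sample index


def parse_fa_name(filename: str) -> FASample:
    """Split a name of the form FA_d_p_i_u_j(.png) into its fields."""
    parts = PurePath(filename).stem.split("_")
    if len(parts) != 6 or parts[0] != "FA":
        raise ValueError(f"unexpected file name: {filename!r}")
    _, d, p, i, u, j = parts
    return FASample(d, p, int(i), int(u), int(j))


parsed = parse_fa_name("FA_20220101_lab_1_2_3.png")
print(parsed)
```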

    Section 3: Experimental Setup

    • We place the mmWave device on a table with a height of 60cm.
    • The participants are asked to perform the fitness activity at a distance of 2m in front of the mmWave device.
    • The data are collected in a lab with a size of 5.0m×3.0m.

    Section 4: Data Description

    • We develop a spatial-temporal heatmap to integrate multiple activity features, including the range of movement, velocity, and time duration of each activity repetition.

    • We first derive the Doppler-range map of the users' activity by calculating Range-FFT and Doppler-FFT. Then, we generate the spatial-temporal heatmap by accumulating the velocity of every distance in every Doppler-range map together. Next, we normalize the derived velocity information and present the velocity-distance relationship in time dimension. In this way, we transfer the original instantaneous velocity-distance relationship to a more comprehensive spatial-temporal heatmap which describes the process of a whole activity.

    • As shown in the attached figure, in each spatial-temporal heatmap, the horizontal axis represents the time duration of an activity repetition while the vertical axis represents the range of movement. The velocity is represented by color.

    • We create 14 zip files to store the dataset. The 14 zip files start with "FA", and each contains repetitions of the same fitness activity.

    14 common daily activities and their corresponding files

    File Name  Activity Type          File Name  Activity Type
    FA1        Crunches               FA8        Squats
    FA2        Elbow plank and reach  FA9        Burpees
    FA3        Leg raise              FA10       Chest squeezes
    FA4        Lunges                 FA11       High knees
    FA5        Mountain climber       FA12       Side leg raise
    FA6        Punches                FA13       Side to side chops
    FA7        Push ups               FA14       Turning kicks

    Section 5: Raw Data and Data Processing Algorithms

    • We also provide the mmWave raw data (.mat format), stored in the same zip file as the corresponding heatmap data. Each .mat file can store one set of activity repetitions (e.g., 4 repetitions) from the same user.
      • For example: FA_d_p_i_u_j.mat:
        • d represents the date on which the data were collected.
        • p represents the environment in which the data were collected.
        • i represents the activity type index
        • u represents the user id
        • j represents the set index
    • We plan to provide the data processing algorithms (heatmap_generation.py) to load the mmWave raw data and generate the corresponding heatmap data.

    Section 6: Citations

    If your paper is related to our work, please cite our paper as follows.

    https://ieeexplore.ieee.org/document/9868878/

    Xie, Yucheng, Ruizhe Jiang, Xiaonan Guo, Yan Wang, Jerry Cheng, and Yingying Chen. "mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave." In 2022 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2022.

    Bibtex:

    @inproceedings{xie2022mmfit,
      title={mmFit: Low-Effort Personalized Fitness Monitoring Using Millimeter Wave},
      author={Xie, Yucheng and Jiang, Ruizhe and Guo, Xiaonan and Wang, Yan and Cheng, Jerry and Chen, Yingying},
      booktitle={2022 International Conference on Computer Communications and Networks (ICCCN)},
      pages={1--10},
      year={2022},
      organization={IEEE}
    }

  15. Data from: GALILEO VENUS RANGE FIX RAW DATA V1.0

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Aug 22, 2025
    Cite
    National Aeronautics and Space Administration (2025). GALILEO VENUS RANGE FIX RAW DATA V1.0 [Dataset]. https://catalog.data.gov/dataset/galileo-venus-range-fix-raw-data-v1-0-0943a
    Explore at:
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Raw radio tracking data used to determine the precise distance to Venus (and improve knowledge of the Astronomical Unit) from the Galileo flyby on 10 February 1990.

  16. N

    South Range, MI households by income brackets: family, non-family, and...

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    Cite
    Neilsberg Research (2025). South Range, MI households by income brackets: family, non-family, and total, in 2023 inflation-adjusted dollars [Dataset]. https://www.neilsberg.com/insights/south-range-mi-median-household-income/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Michigan, South Range
    Variables measured
    Income Level, All households, Family households, Non-Family households, Percent of All households, Percent of Family households, Percent of Non-Family households
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income brackets (mentioned above) following an initial analysis and categorization. The percentage of all, family and nonfamily households were collected by grouping data as applicable. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents a breakdown of households across various income brackets in South Range, MI, as reported by the U.S. Census Bureau. The Census Bureau classifies households into different categories, including total households, family households, and non-family households. Our analysis of U.S. Census Bureau American Community Survey data for South Range, MI reveals how household income distribution varies among these categories. The dataset highlights the variation in number of households with income, offering valuable insights into the distribution of South Range households based on income levels.

    Key observations

    • For Family Households: In South Range, the largest share of family households, 21.9%, earn $60,000 to $74,999. By contrast, only 1.46% of family households have incomes of $150,000 to $199,999.
    • For Non-Family Households: In South Range, the largest share of non-family households, 20.93%, have incomes of less than $10,000. At the other end, no non-family households (0.0%) are estimated to earn $150,000 to $199,999.
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Less than $10,000
    • $10,000 to $14,999
    • $15,000 to $19,999
    • $20,000 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $59,999
    • $60,000 to $74,999
    • $75,000 to $99,999
    • $100,000 to $124,999
    • $125,000 to $149,999
    • $150,000 to $199,999
    • $200,000 or more

    Variables / Data Columns

    • Income Level: The income level represents the income brackets ranging from Less than $10,000 to $200,000 or more in South Range, MI (As mentioned above).
    • All Households: Count of households for the specified income level
    • % All Households: Percentage of households at the specified income level relative to the total households in South Range, MI
    • Family Households: Count of family households for the specified income level
    • % Family Households: Percentage of family households at the specified income level relative to the total family households in South Range, MI
    • Non-Family Households: Count of non-family households for the specified income level
    • % Non-Family Households: Percentage of non-family households at the specified income level relative to the total non-family households in South Range, MI
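
    The percentage columns described above are each count divided by the column total. A minimal sketch of that calculation follows; the function name `percent_share` is illustrative, and the counts in the usage note are made-up placeholders, not values from this dataset.

```python
def percent_share(counts: dict[str, int]) -> dict[str, float]:
    """Return each income level's share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    if total == 0:
        # No households at all: report 0.0 everywhere instead of dividing by zero.
        return {level: 0.0 for level in counts}
    return {level: round(100 * n / total, 2) for level, n in counts.items()}
```

    With placeholder counts of 9, 3 and 0 households in three brackets, the shares come out as 75.0, 25.0 and 0.0 percent. The same function applies unchanged to the family and non-family count columns.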

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for South Range median household income.

  17. f

    Summary and methods used to calculate the physical characteristics used to...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 31, 2017
    Cite
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J. (2017). Summary and methods used to calculate the physical characteristics used to compare the home range estimators. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001743878
    Explore at:
    Dataset updated
    Mar 31, 2017
    Authors
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J.
    Description

    Summary and methods used to calculate the physical characteristics used to compare the home range estimators.

  18. 🛒 Supermarket Data

    • kaggle.com
    zip
    Updated Jul 19, 2024
    Cite
    mexwell (2024). 🛒 Supermarket Data [Dataset]. https://www.kaggle.com/datasets/mexwell/supermarket-data/versions/1
    Explore at:
    Available download formats: zip (78427538 bytes)
    Dataset updated
    Jul 19, 2024
    Authors
    mexwell
    Description

    This is the dataset released as a companion to the paper “Explaining the Product Range Effect in Purchase Data”, presented at the BigData 2013 conference.

    • supermarket_distances: three columns. The first column is the customer id, the second is the shop id and the third is the distance between the customer’s house and the shop location. The distance is calculated in meters as a straight line, so it does not take the road graph into account.
    • supermarket_prices: two columns. The first column is the product id and the second column is its unit price. The price is in Euro and it is calculated as the average unit price for the time span of the dataset.
    • supermarket_purchases: four columns. The first column is the customer id, the second is the product id, the third is the shop id and the fourth is the total number of items of that product that the customer bought in that particular shop. The data was recorded from January 2007 to December 2011.
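
    The three tables share customer, product and shop ids, so they can be joined. As a minimal sketch, the function below combines purchases with prices to estimate how much each customer spent. It assumes rows arrive as tuples in the column order described above; the function name `spend_per_customer` and the ids in the usage note are hypothetical. Because the price table holds an average unit price over the whole time span, the result is an approximation of actual spending.

```python
from collections import defaultdict
from typing import Iterable


def spend_per_customer(
    purchases: Iterable[tuple[int, int, int, int]],
    prices: Iterable[tuple[int, float]],
) -> dict[int, float]:
    """Estimate total spend (in Euro) per customer.

    purchases rows: (customer_id, product_id, shop_id, quantity)
    prices rows:    (product_id, average_unit_price)
    """
    price_of = dict(prices)  # product_id -> average unit price
    totals: dict[int, float] = defaultdict(float)
    for customer_id, product_id, _shop_id, quantity in purchases:
        # Raises KeyError if a purchased product has no price entry.
        totals[customer_id] += quantity * price_of[product_id]
    return dict(totals)
```

    For example, a customer who bought 2 units of a product priced at 1.50 and 1 unit of a product priced at 4.00 gets an estimated spend of 7.00.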

    Citation

    Pennacchioli, D., Coscia, M., Rinzivillo, S., Pedreschi, D. and Giannotti, F., Explaining the Product Range Effect in Purchase Data. In BigData, 2013.

    Acknowledgement

    Photo by Eduardo Soares on Unsplash

  19. Math Formula Retrieval

    • kaggle.com
    • huggingface.co
    zip
    Updated Dec 2, 2023
    Cite
    The Devastator (2023). Math Formula Retrieval [Dataset]. https://www.kaggle.com/datasets/thedevastator/math-formula-pair-classification-dataset/data
    Explore at:
    Available download formats: zip (2021716728 bytes)
    Dataset updated
    Dec 2, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Math Formula Retrieval

    Math Formula Pair Classification Dataset

    By ddrg (From Huggingface) [source]

    About this dataset

    With a total of six columns, including formula1, formula2, and label (binary format), the dataset provides the information needed for analysis and evaluation.

    The train.csv file contains a subset of the dataset specifically curated for training purposes. It includes an extensive range of math formula pairs along with their corresponding labels and unique ID names. This allows researchers and data scientists to construct models that can predict whether two given formulas fall within the same category or not.

    On the other hand, test.csv serves as an evaluation set. It consists of additional pairs of math formulas accompanied by their respective labels and unique IDs. By evaluating model performance on this test set after training it on train.csv data, researchers can assess how well their models generalize to unseen instances.

    With this dataset, researchers can explore mathematics-related applications such as developing pattern recognition algorithms or enhancing educational tools that automatically identify and categorize mathematical formulas.

    How to use the dataset

    Introduction

    Dataset Description

    train.csv

    The train.csv file contains a set of labeled math formula pairs along with their corresponding labels and formula name IDs. It consists of the following columns:

    • formula1: The first mathematical formula in the pair (text).
    • formula2: The second mathematical formula in the pair (text).
    • label: The classification label indicating whether the pair of formulas belong to the same category or not (binary). A label value of 1 indicates that both formulas belong to the same category, while a label value of 0 indicates different categories.

    test.csv

    The purpose of the test.csv file is to provide a set of formula pairs along with their labels and formula name IDs for testing and evaluation purposes. It has an identical structure to train.csv, containing columns like formula1, formula2, label, etc.

    Task

    The main task using this dataset is binary classification, where your objective is to predict whether two mathematical formulas belong to the same category or not based on their textual representation. You can use various machine learning algorithms such as logistic regression, decision trees, random forests, or neural networks for training models on this dataset.

    Exploring & Analyzing Data

    Before building your model, it's crucial to explore and analyze your data. Here are some steps you can take:

    • Load both CSV files (train.csv and test.csv) into your preferred data analysis framework or programming language (e.g., Python with libraries like pandas).
    • Examine the dataset's structure, including the number of rows, columns, and data types.
    • Check for missing values in the dataset and handle them accordingly.
    • Visualize the distribution of labels to understand whether it is balanced or imbalanced.
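
    The exploration steps above can be sketched with the Python standard library alone. This is a minimal illustration, assuming a comma-separated file with a header row and a column named label; the function name `summarize` is hypothetical, and the usage note relies on an invented sample rather than the real train.csv.

```python
import csv
import io
from collections import Counter


def summarize(csv_text: str) -> dict:
    """Report row count, column names, empty cells per column, and label counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    # A cell counts as missing when it is absent or contains only whitespace.
    missing = {
        col: sum(1 for row in rows if not (row.get(col) or "").strip())
        for col in columns
    }
    labels = Counter(
        row["label"] for row in rows if (row.get("label") or "").strip()
    )
    return {
        "n_rows": len(rows),
        "columns": columns,
        "missing": missing,
        "label_counts": dict(labels),
    }
```

    To apply it to a real file, read the file's text first (for instance with `pathlib.Path("train.csv").read_text()`) and pass the result in. A strongly uneven `label_counts` indicates an imbalanced dataset. pandas offers the same checks more concisely for large files.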

    Model Building

    Once you have analyzed and preprocessed your dataset, you can start building your classification model using various machine learning algorithms:

    • Split your train.csv data into training and validation sets for model evaluation during training.
    • Choose a suitable

    Research Ideas

    • Math Formula Similarity: This dataset can be used to develop a model that classifies whether two mathematical formulas are similar or not. This can be useful in various applications such as plagiarism detection, identifying duplicate formulas in databases, or suggesting similar formulas based on user input.
    • Formula Categorization: The dataset can be used to train a model that categorizes mathematical formulas into different classes or categories. For example, the model can classify formulas into algebraic expressions, trigonometric equations, calculus problems, or geometric theorems. This categorization can help organize and search through large collections of mathematical formulas.
    • Formula Recommendation: Using this dataset, one could build a recommendation system that suggests related math formulas based on user input. By analyzing the similarities between different formula pairs and their corresponding labels, the system could provide recommendations for relevant mathematical concepts that users may need while solving problems or studying specific topics in mathematics.

    Acknowle...

  20. f

    Data from: Software for Computing and Annotating Genomic Ranges

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 8, 2013
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Carlson, Marc; Carey, Vincent J.; Gentleman, Robert; Lawrence, Michael; Pagès, Hervé; Morgan, Martin T.; Huber, Wolfgang; Aboyoun, Patrick (2013). Software for Computing and Annotating Genomic Ranges [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001647970
    Explore at:
    Dataset updated
    Aug 8, 2013
    Authors
    Carlson, Marc; Carey, Vincent J.; Gentleman, Robert; Lawrence, Michael; Pagès, Hervé; Morgan, Martin T.; Huber, Wolfgang; Aboyoun, Patrick
    Description

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
