100+ datasets found
  1. Fused Image dataset for convolutional neural Network-based crack Detection...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 20, 2023
    Cite
    Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6383043
    Dataset updated
    Apr 20, 2023
    Authors
    Shanglian Zhou; Carlos Canchila; Wei Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

    The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact of different types of image data on deep convolutional neural network (DCNN) performance.

    If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

    In addition, an image dataset for crack classification has also been published at [6].

    References:

    [1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

    [2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

    [3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

    [4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

    [5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

    [6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

  2. South Range, MI Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). South Range, MI Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e200fba9-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    South Range, Michigan
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of South Range by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for South Range. The dataset can be used to understand the population distribution of South Range by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in South Range. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for South Range.

    Key observations

    Largest age group (population): Male # 20-24 years (49) | Female # 20-24 years (50). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the South Range population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population of South Range.
    • Population (Female): This column shows the female population of South Range.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in South Range for each age group.
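
    As a minimal sketch of how the Gender Ratio column relates to the two population columns, the snippet below computes males per 100 females. The function name, the rounding to one decimal place, and the handling of a zero female count are assumptions made for illustration; the published files may treat these cases differently.

```python
from typing import Optional


def gender_ratio(male: int, female: int) -> Optional[float]:
    """Return males per 100 females, or None when the female count is zero."""
    if female == 0:
        # The ratio is undefined without a denominator (assumed handling).
        return None
    return round(100 * male / female, 1)


# Counts for the 20-24 age group as listed under "Key observations".
print(gender_ratio(49, 50))
```

    A value below 100 means there are fewer males than females in that age group.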

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for South Range Population by Gender.

  3. ANN development + final testing datasets

    • data.niaid.nih.gov
    • resodate.org
    • +1 more
    Updated Jan 24, 2020
    Cite
    Authors (2020). ANN development + final testing datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1445865
    Dataset updated
    Jan 24, 2020
    Authors
    Authors
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    File name definitions:

    '...v_50_175_250_300...' - dataset for velocity ranges [50, 175] + [250, 300] m/s

    '...v_175_250...' - dataset for velocity range [175, 250] m/s

    'ANNdevelop...' - used to perform 9 parametric sub-analyses where, in each one, many ANNs are developed (trained, validated and tested) and the one yielding the best results is selected

    'ANNtest...' - used to test the best ANN from each aforementioned parametric sub-analysis, aiming to find the best ANN model; this dataset includes the 'ANNdevelop...' counterpart

    Where to find the input (independent) and target (dependent) variable values for each dataset/Excel file?

    input values in 'IN' sheet

    target values in 'TARGET' sheet

    Where to find the results from the best ANN model (for each target/output variable and each velocity range)?

    Open the corresponding Excel file; the expected (target) vs ANN (output) results are written in the 'TARGET vs OUTPUT' sheet.

    Check reference below (to be added when the paper is published)

    https://www.researchgate.net/publication/328849817_11_Neural_Networks_-_Max_Disp_-_Railway_Beams

  4. Trends in Total Students (2013-2023): Range View Elementary School

    • publicschoolreview.com
    Cite
    Public School Review, Trends in Total Students (2013-2023): Range View Elementary School [Dataset]. https://www.publicschoolreview.com/range-view-elementary-school-profile
    Dataset authored and provided by
    Public School Review
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset tracks the annual total number of students from 2013 to 2023 for Range View Elementary School.

  5. Range View Road Cross Street Data in Valier, MT

    • ownerly.com
    Updated Dec 11, 2021
    Cite
    Ownerly (2021). Range View Road Cross Street Data in Valier, MT [Dataset]. https://www.ownerly.com/mt/valier/range-view-rd-home-details
    Dataset updated
    Dec 11, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Range View Road, Valier, Montana
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Road cross streets in Valier, MT.

  6. Amazon AWS Recon Data For Finding Origin IP - 93M

    • kaggle.com
    zip
    Updated Sep 17, 2023
    Cite
    Chirag Artani (2023). Amazon AWS Recon Data For Finding Origin IP - 93M [Dataset]. https://www.kaggle.com/datasets/chiragartani/amazon-aws-asn-cidr-ip-to-hostname-recon-data
    Available download formats: zip (225734462 bytes)
    Dataset updated
    Sep 17, 2023
    Authors
    Chirag Artani
    Description

    Our mission with this project is to provide an always up-to-date and freely accessible map of the cloud landscape for every major cloud service provider.

    We've decided to kick things off with collecting SSL certificate data of AWS EC2 machines, considering the value of this data to security researchers. However, we plan to expand the project to include more data and providers in the near future. Your input and suggestions are incredibly valuable to us, so please don't hesitate to reach out on Twitter or Discord and let us know what areas you think we should prioritize next!

    How to find origin IP of any domain or subdomain inside this database?

    For example, to find the origin IP for instacart.com, just search the dataset for instacart.com.

    On Linux you can also use the command line. Download the dataset using curl or wget, cd into its folder, and run: find . -type f -iname "*.csv" -print0 | xargs -0 grep "word"

    For example: find . -type f -iname "*.csv" -print0 | xargs -0 grep "instacart.com"

    The matching lines are then printed as output.
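
    For readers who prefer a script over the shell, the same search can be sketched in Python. This is only an illustration: the function name grep_csv is hypothetical, and unlike grep (which treats the pattern as a regular expression, so "." matches any character) it does a literal substring match.

```python
from pathlib import Path
from typing import Iterator, Tuple


def grep_csv(root: str, needle: str) -> Iterator[Tuple[Path, str]]:
    """Yield (file, line) for each line containing `needle` in CSV files under `root`."""
    for path in Path(root).rglob("*"):
        # Compare the extension case-insensitively, mirroring `-iname "*.csv"`.
        if path.is_file() and path.suffix.lower() == ".csv":
            with path.open(encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    if needle in line:
                        yield path, line.rstrip("\n")


# Example usage from inside the dataset folder:
# for path, line in grep_csv(".", "instacart.com"):
#     print(f"{path}:{line}")
```

    Reading line by line keeps memory use flat, which matters for an archive of this size.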

    How can SSL certificate data benefit you? The SSL data is organized into CSV files, with the following properties collected for every found certificate:

    • IP Address
    • Common Name
    • Organization
    • Country
    • Locality
    • Province
    • Subject Alternative DNS Name
    • Subject Alternative IP address
    • Self-signed (boolean)

    IP Address | Common Name | Organization | Country | Locality | Province | Subject Alternative DNS Name | Subject Alternative IP address | Self-signed
    1.2.3.4 | example.com | Example, Inc. | US | San Francisco | California | example.com | 1.2.3.4 | false
    5.6.7.8 | acme.net | Acme, Inc. | US | Seattle | Washington | *.acme.net | 5.6.7.8 | false

    So what can you do with this data?

    Enumerate subdomains of your target domains: Search for your target's domain names (e.g. example.com) and find hits in the Common Name and Subject Alternative Name fields of the collected certificates. All IP ranges are scanned daily and the dataset is updated accordingly, so you are very likely to find ephemeral hosts before they are taken down.

    Enumerate domains of your target companies: Search for your target's company name (e.g. Example, Inc.), find hits in the Organization field, and explore the associated Common Name and Subject Alternative Name fields. The results will probably include subdomains of the domains you're familiar with, and if you're in luck you might find new root domains expanding the scope.

    Enumerate possible sub-subdomain enumeration targets: If the certificate is issued for a wildcard (e.g. *.foo.example.com), chances are there are other subdomains you can find by brute-forcing there. And you know how effective this technique can be. Here are some wordlists to help you with that!

    💡 Note: Remember to monitor the dataset for daily updates to get notified whenever a new asset comes up!

    Perform IP lookups: Search for an IP address (e.g. 3.122.37.147) to find host names associated with it, and explore the Common Name, Subject Alternative Name, and Organization fields to find more information about that address.

    Discover origin IP addresses to bypass proxy services: When a website is hidden behind security proxy services like Cloudflare, Akamai, Incapsula, and others, it is possible to search for the host name (e.g., example.com) in the dataset. This search may uncover the origin IP address, allowing you to bypass the proxy. We've discussed a similar technique on our blog which you can find here!

    Get a fresh dataset of live web servers: Each IP address in the dataset corresponds to an HTTPS server running on port 443. You can use this data for large-scale research without needing to spend time collecting it yourself.

    Whatever else you can think of: If you use this data for a cool project or research, we would love to hear about it!

    Additionally, below you will find a detailed explanation of our data collection process and how you can implement the same technique to gather information from your own IP ranges.

    TB; DZ (Too big; didn't zoom):

    We kick off the workflow with a simple bash script that retrieves AWS's IP ranges. Using a JQ query, we extract the IP ranges of EC2 machines by filtering for .prefixes[] | select(.service=="EC2") | .ip_prefix. Other services are excluded from this workflow since they don't support custom SSL certificates, making their data irrelevant for our dataset.
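
    To make the filter concrete, here is a rough Python equivalent of the jq expression above. The function name ec2_prefixes is hypothetical, and the sample document uses reserved documentation address ranges rather than real AWS prefixes; only the "prefixes", "service", and "ip_prefix" keys come from the jq filter itself.

```python
import json
from typing import List


def ec2_prefixes(ip_ranges: dict) -> List[str]:
    """Mirror `.prefixes[] | select(.service=="EC2") | .ip_prefix`."""
    return [
        entry["ip_prefix"]
        for entry in ip_ranges["prefixes"]
        if entry.get("service") == "EC2"
    ]


# Made-up sample in the shape the jq filter expects (not real AWS data).
sample = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "service": "EC2"},
    {"ip_prefix": "198.51.100.0/24", "service": "S3"},
    {"ip_prefix": "203.0.113.0/24", "service": "EC2"}
  ]
}
""")

print(ec2_prefixes(sample))
```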

    Then, we use mapcidr to divide the IP ranges obtained in step 1 into smaller ranges, each containing up to 100k hosts (Thanks, ProjectDiscovery team!). This step will be handy in the next step when we run the parallel scanning process.
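
    The splitting idea can be illustrated with Python's standard ipaddress module. This is a stand-in sketch and not mapcidr's actual algorithm: it assumes blocks are cut on power-of-two boundaries, so a 100k limit yields /16 blocks of 65,536 addresses, and it counts all addresses in a block as "hosts". The function name split_cidr is hypothetical.

```python
import ipaddress
from typing import Iterator


def split_cidr(cidr: str, max_hosts: int = 100_000) -> Iterator:
    """Yield subnets of `cidr`, each holding at most `max_hosts` addresses."""
    network = ipaddress.ip_network(cidr)
    # Largest power-of-two block size that does not exceed max_hosts.
    block_bits = max_hosts.bit_length() - 1
    target_prefix = network.max_prefixlen - block_bits
    if network.prefixlen >= target_prefix:
        # Already small enough; pass it through unchanged.
        yield network
    else:
        yield from network.subnets(new_prefix=target_prefix)


# A /14 (262,144 addresses) is cut into four /16 blocks.
for block in split_cidr("10.0.0.0/14"):
    print(block, block.num_addresses)
```

    Each yielded block could then be handed to a separate scanning job.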

    At the time of writing, the EC2 IP ranges include over 57 million IP addresses, so scanning them all on a single machine would be impractical, which is where our file-splitter node comes into play.

    This node iterates through the input from mapcidr and triggers individual jobs for each range. When executing this w...

  7. Grass Range, MT Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Grass Range, MT Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1e392ff-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Grass Range, Montana
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Grass Range by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Grass Range. The dataset can be used to understand the population distribution of Grass Range by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Grass Range. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for Grass Range.

    Key observations

    Largest age group (population): Male # 35-39 years (7) | Female # 70-74 years (36). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Grass Range population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population of Grass Range.
    • Population (Female): This column shows the female population of Grass Range.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Grass Range for each age group.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Grass Range Population by Gender.

  8. Range View Circle Cross Street Data in Silverthorne, CO

    • ownerly.com
    Updated Jan 12, 2022
    Cite
    Ownerly (2022). Range View Circle Cross Street Data in Silverthorne, CO [Dataset]. https://www.ownerly.com/co/silverthorne/range-view-cir-home-details
    Dataset updated
    Jan 12, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    Silverthorne, Colorado, Range View Circle
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Circle cross streets in Silverthorne, CO.

  9. Range View Circle Cross Street Data in Rapid City, SD

    • ownerly.com
    Updated Feb 6, 2022
    Cite
    Ownerly (2022). Range View Circle Cross Street Data in Rapid City, SD [Dataset]. https://www.ownerly.com/sd/rapid-city/range-view-cir-home-details
    Dataset updated
    Feb 6, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    South Dakota, Rapid City, Range View Circle
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Circle cross streets in Rapid City, SD.

  10. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency: http://www.epa.gov/
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary)

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract

    We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description

    “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages:

    • For running “CWVS_LMC.txt”:
        • msm: Sampling from the truncated normal distribution
        • mnormt: Sampling from the multivariate normal distribution
        • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
        • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data

    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability

    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
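
    The weekly standardization described in this entry (subtract the week's median, divide by the week's IQR) can be sketched as follows. This is an illustration in Python rather than the authors' R code; the function names are hypothetical, and the quartile convention (the "inclusive" method of statistics.quantiles) is an assumption, since the description does not say how the IQR was computed.

```python
from statistics import median, quantiles
from typing import List


def standardize_week(exposures: List[float]) -> List[float]:
    """Subtract the median and divide by the IQR for one week's exposures."""
    q1, _, q3 = quantiles(exposures, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; exposures cannot be scaled")
    center = median(exposures)
    return [(value - center) / iqr for value in exposures]


def standardize_by_week(z: List[List[float]]) -> List[List[float]]:
    """Standardize each week (column) of z, where z[i][j] is individual i in week j."""
    columns = [standardize_week(list(column)) for column in zip(*z)]
    # Transpose back so rows are individuals again.
    return [list(row) for row in zip(*columns)]


# Five simulated individuals, two weeks of made-up exposure values.
z = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
print(standardize_by_week(z))
```

    Because each week is centered and scaled on its own, the released values carry no information about the original medians or IQRs, which is the point made in the description.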

  11. housing

    • kaggle.com
    zip
    Updated Sep 22, 2023
    Cite
    HappyRautela (2023). housing [Dataset]. https://www.kaggle.com/datasets/happyrautela/housing
    Available download formats: zip (809785 bytes)
    Dataset updated
    Sep 22, 2023
    Authors
    HappyRautela
    Description

    The exercise after this contains questions that are based on the housing dataset.

    1. How many houses have a waterfront? a. 21000 b. 21450 c. 163 d. 173

    2. How many houses have 2 floors? a. 2692 b. 8241 c. 10680 d. 161

    3. How many houses built before 1960 have a waterfront? a. 80 b. 7309 c. 90 d. 92

    4. What is the price of the most expensive house having more than 4 bathrooms? a. 7700000 b. 187000 c. 290000 d. 399000

    5. For instance, if the ‘price’ column consists of outliers, how can you make the data clean and remove the redundancies? a. Calculate the IQR range and drop the values outside the range. b. Calculate the p-value and remove the values less than 0.05. c. Calculate the correlation coefficient of the price column and remove the values less than the correlation coefficient. d. Calculate the Z-score of the price column and remove the values less than the z-score.

    6. What are the various parameters that can be used to determine the dependent variables in the housing data to determine the price of the house? a. Correlation coefficients b. Z-score c. IQR Range d. Range of the Features

    7. If we get the r2 score as 0.38, what inferences can we make about the model and its efficiency? a. The model is 38% accurate, and shows poor efficiency. b. The model is showing 0.38% discrepancies in the outcomes. c. Low difference between observed and fitted values. d. High difference between observed and fitted values.

    8. If the metrics show that the p-value for the grade column is 0.092, what all inferences can we make about the grade column? a. Significant in presence of other variables. b. Highly significant in presence of other variables c. insignificance in presence of other variables d. None of the above

    9. If the Variance Inflation Factor value for a feature is considerably higher than the other features, what can we say about that column/feature? a. High multicollinearity b. Low multicollinearity c. Both A and B d. None of the above
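
    Question 5 refers to dropping values outside an IQR-based range. A minimal sketch of that idea is shown below. The 1.5 multiplier is a common convention rather than something stated in the exercise, and the prices are made-up values, not rows from the housing dataset.

```python
from statistics import quantiles

def drop_iqr_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    # k=1.5 is an assumed convention; the exercise does not give a multiplier.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical prices with one extreme value
prices = [221900, 538000, 180000, 604000, 510000, 3000000]
cleaned = drop_iqr_outliers(prices)
```

    The extreme price falls above the upper fence and is dropped, while the other five values are kept in their original order.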

  12. Street Video Dataset

    • kaggle.com
    zip
    Updated Oct 18, 2025
    Cite
    Macgence (2025). Street Video Dataset [Dataset]. https://www.kaggle.com/datasets/macgence/street-video-dataset
    Explore at:
    Available download formats: zip (80128 bytes)
    Dataset updated
    Oct 18, 2025
    Authors
    Macgence
    Description

    Improve your computer vision models with our extensive collection of street videos recorded from individuals along the street. This dataset covers a broad range of demographics and scenarios, which can enhance the accuracy of facial recognition and video recognition features in your models. This specialized collection of video data is curated to support research and development in the construction industry, and it provides a rich resource for training and evaluation.

    Metadata Availability: Insights into Participant Details

    Each participant is accompanied by comprehensive metadata, which includes their age, gender, and location. This metadata also covers domain, topic, type, and outcome, providing useful context for both model development and evaluation.

    Specifications:

    • Type: Video
    • Volume: 3000
    • Industry: Video Recognition
    • File Format: MP4
    • Gender Distribution: 50/50
    • Age Range: 18 – 65

    These technical specifications ensure compatibility and optimal performance for a wide range of AI development applications.

    Insights into Image Data:

    The dataset comprises 3000 high-quality videos. Created through collaboration with a network of experts, it captures realistic scenarios and ensures a balanced representation of age, gender, and demographics.

    License:

    Exclusively curated by Macgence, this Video dataset is available for commercial use, empowering AI developers.

    Updates and Customization:

    Consistent updates with fresh videos recorded in varied real-world scenarios keep the dataset relevant. We offer customization options such as adjusting samples and providing datasets tailored to your specific criteria and needs.

    Looking for high-quality datasets to train your AI model? Contact us today to get the dataset you need—fast, reliable, and ready for deployment!

  13. Range View Elementary School

    • publicschoolreview.com
    json, xml
    + more versions
    Cite
    Public School Review, Range View Elementary School [Dataset]. https://www.publicschoolreview.com/range-view-elementary-school-profile
    Explore at:
    Available download formats: xml, json
    Dataset authored and provided by
    Public School Review
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2011 - Dec 31, 2025
    Description

    Historical Dataset of Range View Elementary School is provided by PublicSchoolReview and contains statistics on the following metrics: Total Students Trends Over Years (2013-2023), Total Classroom Teachers Trends Over Years (2013-2023), Distribution of Students By Grade Trends, Student-Teacher Ratio Comparison Over Years (2013-2023), American Indian Student Percentage Comparison Over Years (2011-2023), Asian Student Percentage Comparison Over Years (2021-2022), Hispanic Student Percentage Comparison Over Years (2013-2023), Black Student Percentage Comparison Over Years (2019-2022), White Student Percentage Comparison Over Years (2013-2023), Two or More Races Student Percentage Comparison Over Years (2013-2023), Diversity Score Comparison Over Years (2013-2023), Free Lunch Eligibility Comparison Over Years (2013-2023), Reduced-Price Lunch Eligibility Comparison Over Years (2013-2023), Reading and Language Arts Proficiency Comparison Over Years (2011-2022), Math Proficiency Comparison Over Years (2012-2023), Overall School Rank Trends Over Years (2012-2023).

  14. Data from: U.S. Geological Survey - Gap Analysis Project Species Range Maps...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 20, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). U.S. Geological Survey - Gap Analysis Project Species Range Maps CONUS_2001 [Dataset]. https://catalog.data.gov/dataset/u-s-geological-survey-gap-analysis-project-species-range-maps-conus-2001
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    GAP species range data are coarse representations of the total areal extent a species occupies, in other words the geographic limits within which a species can be found (Morrison and Hall 2002). These data provide the geographic extent within which the USGS Gap Analysis Project delineates areas of suitable habitat for terrestrial vertebrate species in their species habitat maps. The range maps are created by attributing a vector file derived from the 12-digit Hydrologic Unit Dataset (USDA NRCS 2009). Modifications to that dataset are described here < https://www.sciencebase.gov/catalog/item/56d496eee4b015c306f17a42>. Attribution of the season range for each species was based on the literature and online sources (See Cross Reference section of the metadata). Attribution for each hydrologic unit within the range included values for origin (native, introduced, reintroduced, vagrant), occurrence (extant, possibly present, potentially present, extirpated), reproductive use (breeding, non-breeding, both) and season (year-round, summer, winter, migratory, vagrant). These species range data provide the biological context within which to build our species distribution models. Versioning, Naming Conventions and Codes: A composite version code is employed to allow the user to track the spatial extent, the date of the ground conditions, and the iteration of the data set for that extent/date. For example, CONUS_2001v1 represents the spatial extent of the conterminous US (CONUS), the ground condition year of 2001, and the first iteration (v1) for that extent/date. In many cases, a GAP species code is used in conjunction with the version code to identify specific data sets or files (i.e. Cooper’s Hawk Habitat Map named bCOHAx_CONUS_2001v1_HabMap).
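
    The composite version code described above can be split into its parts with a small parser. The sketch below is inferred only from the two examples in the description (CONUS_2001v1 and bCOHAx_CONUS_2001v1_HabMap); the regular expression, the function name, and the field names are assumptions and may not cover every GAP file name.

```python
import re

# Pattern inferred from the two example names in the description (an assumption).
VERSION_CODE = re.compile(
    r"^(?:(?P<species>[A-Za-z]+)_)?"       # optional GAP species code, e.g. bCOHAx
    r"(?P<extent>[A-Z]+)_"                 # spatial extent, e.g. CONUS
    r"(?P<year>\d{4})v(?P<iteration>\d+)"  # ground condition year and iteration
    r"(?:_(?P<product>[A-Za-z]+))?$"       # optional product suffix, e.g. HabMap
)

def parse_version_code(name):
    """Return the components of a GAP-style name as a dict."""
    match = VERSION_CODE.match(name)
    if match is None:
        raise ValueError(f"unrecognized version code: {name}")
    parts = match.groupdict()
    parts["year"] = int(parts["year"])
    parts["iteration"] = int(parts["iteration"])
    return parts

plain = parse_version_code("CONUS_2001v1")
habitat = parse_version_code("bCOHAx_CONUS_2001v1_HabMap")
```

    For the plain version code, the species and product fields come back as None; for the habitat map name, all five fields are filled.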

  15. Range View Circle Cross Street Data in Prescott Valley, AZ

    • ownerly.com
    Updated Dec 5, 2021
    + more versions
    Cite
    Ownerly (2021). Range View Circle Cross Street Data in Prescott Valley, AZ [Dataset]. https://www.ownerly.com/az/prescott-valley/range-view-cir-home-details
    Explore at:
    Dataset updated
    Dec 5, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Arizona, Prescott Valley, Range View Circle
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Circle cross streets in Prescott Valley, AZ.

  16. Range View Drive Cross Street Data in Austin, TX

    • ownerly.com
    Updated Dec 8, 2021
    + more versions
    Cite
    Ownerly (2021). Range View Drive Cross Street Data in Austin, TX [Dataset]. https://www.ownerly.com/tx/austin/range-view-dr-home-details
    Explore at:
    Dataset updated
    Dec 8, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Texas, Austin, Range View Drive
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Drive cross streets in Austin, TX.

  17. Point Cloud Mnist 2D

    • kaggle.com
    zip
    Updated Feb 12, 2020
    Cite
    Cristian Garcia (2020). Point Cloud Mnist 2D [Dataset]. https://www.kaggle.com/datasets/cristiangarcia/pointcloudmnist2d/discussion
    Explore at:
    Available download formats: zip (34176926 bytes)
    Dataset updated
    Feb 12, 2020
    Authors
    Cristian Garcia
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Point Cloud MNIST 2D

    This is a simple dataset for getting started with Machine Learning for point cloud data. It takes the original MNIST and converts each of the non-zero pixels into points in a 2D space. The idea is to classify each collection of points (rather than images) to the same label as in MNIST. The source for generating this dataset can be found in this repository: cgarciae/point-cloud-mnist-2D

    Format

    There are 2 files: train.csv and test.csv. Each file has the columns

    label,x0,y0,v0,x1,y1,v1,...,x350,y350,v350

    where

    • label contains the target label in the range [0, 9]
    • x{i} contains the x position of the pixel/point as viewed in a Cartesian plane in the range [-1, 27].
    • y{i} contains the y position of the pixel/point as viewed in a Cartesian plane in the range [-1, 27].
    • v{i} contains the value of the pixel in the range [-1, 255].

    Padding

    The maximum number of points found in an image was 351; images with fewer points were padded to this length using the following values:

    • x{i} = -1
    • y{i} = -1
    • v{i} = -1
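
    The row layout and padding rule above suggest a simple way to recover the real points of one image. The sketch below assumes each row has already been read as a list of strings (for example with the standard csv module) and that values parse as numbers; the function name and the sample row are hypothetical.

```python
def row_to_points(row):
    """Split one row (label, x0, y0, v0, x1, ...) into a label and its real points."""
    label = int(float(row[0]))
    values = [int(float(v)) for v in row[1:]]
    points = []
    for i in range(0, len(values), 3):
        x, y, v = values[i:i + 3]
        # Padding entries are stored as x = y = v = -1 and carry no information.
        if x == -1 and y == -1 and v == -1:
            continue
        points.append((x, y, v))
    return label, points

# Hypothetical short row: two real points followed by one padding triple
row = ["7", "3", "5", "255", "4", "5", "128", "-1", "-1", "-1"]
label, points = row_to_points(row)
first_point_only = points[:1]  # subsampling to the first N points, here N = 1
```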

    Subsamples

    To make the challenge more interesting you can also try to solve the problem using a subset of points, e.g. the first N. Here are some visualizations of the dataset using different amounts of points:

    50

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F158444%2Fbbf5393884480e3d24772344e079c898%2F50.png?generation=1579911143877077&alt=media

    100

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F158444%2F5a83f6f5f7c5791e3c1c8e9eba2d052b%2F100.png?generation=1579911238988368&alt=media

    200

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F158444%2F202098ed0da35c41ae45dfc32e865972%2F200.png?generation=1579911264286372&alt=media

    351

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F158444%2F5c733566f8d689c5e0fd300440d04da2%2Fmax.png?generation=1579911289750248&alt=media

    Distribution

    This histogram of the distribution of the number of points per image in the dataset can give you a general idea of how difficult each variation can be.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F158444%2F9eb3b463f77a887dae83a7af0eb08c7d%2Flengths.png?generation=1579911380397412&alt=media

  18. RangeDet: In Defense of Range View for LiDAR-Based 3D Object Detection -...

    • service.tib.eu
    • resodate.org
    Updated Dec 2, 2024
    Cite
    (2024). RangeDet: In Defense of Range View for LiDAR-Based 3D Object Detection - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/rangedet--in-defense-of-range-view-for-lidar-based-3d-object-detection
    Explore at:
    Dataset updated
    Dec 2, 2024
    Description

    A LiDAR-based 3D object detection dataset.

  19. Data from: Half interpercentile range (half of the difference between the...

    • catalog.data.gov
    • data.usgs.gov
    • +5more
    Updated Nov 21, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Half interpercentile range (half of the difference between the 16th and 84th percentiles) of wave-current bottom shear stress in the Middle Atlantic Bight for May, 2010 - May, 2011 (MAB_hIPR.SHP) [Dataset]. https://catalog.data.gov/dataset/half-interpercentile-range-half-of-the-difference-between-the-16th-and-84th-percentiles-of
    Explore at:
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
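
    The dataset title defines the half interpercentile range as half of the difference between the 16th and 84th percentiles. A minimal sketch of that calculation on a single time series is given below; the function name, the sample series, and the percentile interpolation method are assumptions and do not come from the USGS processing description.

```python
from statistics import quantiles

def half_interpercentile_range(series):
    """Half of the difference between the 84th and 16th percentiles."""
    # cuts[k - 1] is the k-th percentile; the interpolation method is an assumption.
    cuts = quantiles(series, n=100, method="inclusive")
    return (cuts[83] - cuts[15]) / 2

# Hypothetical hourly bottom shear stress values at one grid point
stress = [float(i) for i in range(101)]
hipr = half_interpercentile_range(stress)
```

    For roughly normally distributed data the 16th and 84th percentiles sit about one standard deviation either side of the mean, so this statistic behaves like a robust spread estimate.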

  20. Research Papers Dataset

    • kaggle.com
    zip
    Updated May 8, 2023
    Cite
    NECHBA MOHAMMED (2023). Research Papers Dataset [Dataset]. https://www.kaggle.com/datasets/nechbamohammed/research-papers-dataset
    Explore at:
    Available download formats: zip (619131172 bytes)
    Dataset updated
    May 8, 2023
    Authors
    NECHBA MOHAMMED
    Description

    Description: This dataset (Version 10) contains a collection of research papers along with various attributes and metadata. It is a comprehensive and diverse dataset that can be used for a wide range of research and analysis tasks. The dataset encompasses papers from different fields of study, including computer science, mathematics, physics, and more.

    Fields in the Dataset:

    • id: A unique identifier for each paper.
    • title: The title of the research paper.
    • authors: The list of authors involved in the paper.
    • venue: The journal or venue where the paper was published.
    • year: The year when the paper was published.
    • n_citation: The number of citations received by the paper.
    • references: A list of paper IDs that are cited by the current paper.
    • abstract: The abstract of the paper.

    Example:

    • "id": "013ea675-bb58-42f8-a423-f5534546b2b1",
    • "title": "Prediction of consensus binding mode geometries for related chemical series of positive allosteric modulators of adenosine and muscarinic acetylcholine receptors",
    • "authors": ["Leon A. Sakkal", "Kyle Z. Rajkowski", "Roger S. Armen"],
    • "venue": "Journal of Computational Chemistry",
    • "year": 2017,
    • "n_citation": 0,
    • "references": ["4f4f200c-0764-4fef-9718-b8bccf303dba", "aa699fbf-fabe-40e4-bd68-46eaf333f7b1"],
    • "abstract": "This paper studies ..."
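
    The fields listed above map naturally onto a typed record. The sketch below assumes each paper is available as a JSON object with exactly these keys; the on-disk layout of the archive is not described here, so the Paper class and the loading step are illustrative only.

```python
import json
from dataclasses import dataclass
from typing import List

@dataclass
class Paper:
    """One record with the fields named in the dataset description."""
    id: str
    title: str
    authors: List[str]
    venue: str
    year: int
    n_citation: int
    references: List[str]
    abstract: str

# The example record from the description, written as a JSON object (assumed format)
raw = """{
  "id": "013ea675-bb58-42f8-a423-f5534546b2b1",
  "title": "Prediction of consensus binding mode geometries for related chemical series of positive allosteric modulators of adenosine and muscarinic acetylcholine receptors",
  "authors": ["Leon A. Sakkal", "Kyle Z. Rajkowski", "Roger S. Armen"],
  "venue": "Journal of Computational Chemistry",
  "year": 2017,
  "n_citation": 0,
  "references": ["4f4f200c-0764-4fef-9718-b8bccf303dba", "aa699fbf-fabe-40e4-bd68-46eaf333f7b1"],
  "abstract": "This paper studies ..."
}"""

paper = Paper(**json.loads(raw))
```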

    Cite: https://www.aminer.cn/citation

Cite
Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6383043

Fused Image dataset for convolutional neural Network-based crack Detection (FIND)

Explore at:
Dataset updated
Apr 20, 2023
Authors
Shanglian Zhou; Carlos Canchila; Wei Song
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.

If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

In addition, an image dataset for crack classification has also been published at [6].

References:

[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

[5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78
