Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.
The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.
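Because the intensity and range images are spatially co-registered, a fused input can be built by combining them pixel by pixel. The sketch below shows a generic channel-stacking fusion of two co-registered patches; it is an illustration only, not the exact fusion operator of refs [2,3], and all array names and values are made up.

```python
import numpy as np

# Illustrative co-registered 256x256 patches; the dataset guarantees
# pixel-to-pixel correspondence between intensity and range images.
rng = np.random.default_rng(0)
intensity = rng.random((256, 256))   # stand-in for a raw intensity patch
elevation = rng.random((256, 256))   # stand-in for a raw range (elevation) patch

# One simple fusion: stack the two modalities as channels of a single
# multi-channel network input.
fused = np.stack([intensity, elevation], axis=-1)
print(fused.shape)  # (256, 256, 2)
```

A DCNN can then consume the stacked array as a two-channel input in place of a single-modality image.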
If you share or use this dataset, please cite [4] and [5] in any relevant documentation.
In addition, an image dataset for crack classification has also been published at [6].
References:
[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873
[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605
[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434
[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678
[5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044
[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78
In support of new permitting workflows associated with anticipated WellSTAR needs, the CalGEM GIS unit extended the existing BLM PLSS Township & Range grid to cover offshore areas within the 3-mile limit of California jurisdiction. The PLSS grid as currently used by CalGEM is a composite of a BLM download (the majority of the data), additions by the DPR, and polygons created by CalGEM to fill in missing areas (the Ranchos, and offshore areas within the 3-mile limit of California jurisdiction). CalGEM is the Geologic Energy Management Division of the California Department of Conservation, formerly the Division of Oil, Gas, and Geothermal Resources (as of January 1, 2020). Update Frequency: As Needed
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of South Range by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for South Range. The dataset can be used to understand the population distribution of South Range by gender and age. For example, it can identify the largest age group for both men and women in South Range. It can also show how the gender ratio changes from birth to the oldest age group, and the male-to-female ratio within each age group.
Key observations
Largest age group (population): Male # 20-24 years (49) | Female # 20-24 years (50). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data on biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
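As a sketch of the kind of analysis this dataset supports, the gender ratio per age group can be computed as males per 100 females. The column names and numbers below are assumptions for illustration, not the dataset's exact schema.

```python
import pandas as pd

# Made-up rows in the shape the dataset describes (age group, male
# population, female population).
df = pd.DataFrame({
    "age_group": ["20-24 years", "25-29 years"],
    "male": [49, 30],
    "female": [50, 40],
})

# Gender ratio per age group, expressed as males per 100 females.
df["males_per_100_females"] = 100 * df["male"] / df["female"]
print(df["males_per_100_females"].tolist())  # [98.0, 75.0]
```

Sorting or plotting this derived column shows how the ratio shifts from younger to older age groups.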
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for South Range Population by Gender; you can refer to it here
This geodatabase of point, line and polygon features is an effort to consolidate all of the range improvement locations on BLM-managed land in Idaho into one database. Currently, the line feature class has some data for all of the BLM field offices except the Coeur d'Alene and Cottonwood field offices. Range improvements are structures intended to enhance rangeland resources, including wildlife, watershed, and livestock management. Examples of range improvements include water troughs, spring headboxes, culverts, fences, water pipelines, gates, wildlife guzzlers, artificial nest structures, reservoirs, developed springs, corrals, exclosures, etc. These structures were first tracked by the Bureau of Land Management (BLM) in the Job Documentation Report (JDR) System in the early 1960s, which was predominantly a paper-based tracking system. In 1988 the JDRs were migrated into and replaced by the automated Range Improvement Project System (RIPS); version 2.0 is in use today. It tracks inventory, status, objectives, treatment, maintenance cycle, maintenance inspection, monetary contributions and reporting. Not all range improvements are documented in the RIPS database; there may be some older range improvements that were built before the JDR tracking system was established. There also may be unauthorized projects that are not in RIPS. Official project files of paper maps, reports, NEPA documents, checklists, etc., document the status of each project and are physically kept in the office with management authority for that project area. In addition, project data is entered into the RIPS system to enable managers to access the data to track progress, run reports, analyze the data, etc. Before Geographic Information System technology, most offices kept paper atlases or overlay systems that mapped the locations of the range improvements.
The objective of this geodatabase is to migrate the location of historic range improvement projects into a GIS for geospatial use with other data and to centralize the range improvement data for the state. This data set is a work in progress and does not have all range improvement projects that are on BLM lands. Some field offices have not migrated their data into this database, and others are partially completed. New projects may have been built but have not been entered into the system. Historic or unauthorized projects may not have case files and are being mapped and documented as they are found. Many field offices are trying to verify the locations and status of range improvements with GPS, and locations may change or projects that have been abandoned or removed on the ground may be deleted. Attributes may be incomplete or inaccurate. This data was created using the standard for range improvements set forth in Idaho IM 2009-044, dated 6/30/2009. However, it does not have all of the fields the standard requires. Fields that are missing from the line feature class that are in the standard are: ALLOT_NO, MGMT_AGCY, ADMIN_ST, ADMIN_OFF, SRCE_AGCY, MAX_PDOP, MAX_HDOP, CORR_TYPE, RCVR_TYPE, GPS_TIME, UPDATE_STA, UNFILT_POS, FILT_POS, DATA_DICTI, GPS_LENGTH, GPS_3DLGTH, AVE_VERT_P, AVE_HORZ_P, WORST_VERT, WORST_HORZ and CONF_LEVEL. Several additional fields have been added that are not part of the standard: top_fence, btm_fence, admin_fo_line and year_checked. There is no National BLM standard for GIS range improvement data at this time. For more information contact us at blm_id_stateoffice@blm.gov.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Grass Range by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Grass Range. The dataset can be used to understand the population distribution of Grass Range by gender and age. For example, it can identify the largest age group for both men and women in Grass Range. It can also show how the gender ratio changes from birth to the oldest age group, and the male-to-female ratio within each age group.
Key observations
Largest age group (population): Male # 35-39 years (7) | Female # 70-74 years (36). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data on biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Grass Range Population by Gender; you can refer to it here
The file contains native and alien ranges of 1380 species worldwide obtained from the Global Invasive Species Database (http://www.iucngisd.org/gisd/) and CABI Invasive Species Compendium (http://www.cabi.org/isc/). The data are used to produce the results shown in Seebens, Essl & Blasius: The intermediate distance hypothesis of biological invasions, which is accepted for publication in Ecology Letters. The file is in csv format containing six columns: Species name, life form, native range, alien range, distance (great circle distance between the centroids of the respective regions) and species weights. More details about the data and the analysis can be found in Seebens et al.
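The distance column (great-circle distance between the centroids of the respective regions) can be reproduced with the haversine formula. This is a minimal sketch assuming centroids are given as latitude/longitude in degrees; the example coordinates are illustrative, not taken from the dataset.

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two lat/lon points, in km."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))

# Distance between two example region centroids (values are illustrative).
d = great_circle_km(52.5, 13.4, 40.7, -74.0)
```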
Context
Bob has started his own mobile company. He wants to give a tough fight to big companies like Apple and Samsung.
He does not know how to estimate the price of the mobiles his company creates. In this competitive mobile phone market you cannot simply assume things, so to solve this problem he collects sales data on mobile phones from various companies.
Bob wants to find some relation between the features of a mobile phone (e.g., RAM, internal memory) and its selling price. But he is not so good at machine learning, so he needs your help to solve this problem.
In this problem you do not have to predict the actual price, but rather a price range indicating how high the price is.
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
This dataset contains ERA5 surface level analysis parameter data. ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see the linked documentation for further details. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record.
Model level analysis and surface forecast data to complement this dataset are also available. Data from a 10 member ensemble, run at lower spatial and temporal resolution, were also produced to provide an uncertainty estimate for the output from the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation producing data in this dataset.
The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. It follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects.
An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed before being released by ECMWF as quality-assured data within 3 months. CEDA holds a 6-month rolling copy of the latest ERA5t data; see the related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Difference between the maximum of the maximum temperature and the minimum of the minimum temperature: let TXi be the daily maximum temperature on day i and TNj be the daily minimum temperature on day j. For Etr, build the difference between the maximum value of TXi per year and the minimum value of TNj per year:

cdo -sub -yearmax TX.nc -yearmin TN.nc out.nc

Climate model data: more information about the climate model data source and methods can be found in the text files of the head data set (DOI: 10.58160/gGzexcbDikobkyvK, see "IsPartOf-DOI").
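The same Etr computation can be sketched in plain Python for a single grid cell, assuming daily TX and TN series with a year label per day (all names and numbers below are illustrative, not taken from the dataset):

```python
import numpy as np

# Hypothetical daily series for one grid cell over two years (365 days each).
rng = np.random.default_rng(0)
years = np.repeat([2000, 2001], 365)
tx = 15 + 10 * rng.standard_normal(730)   # daily maximum temperature TXi
tn = 5 + 10 * rng.standard_normal(730)    # daily minimum temperature TNj

# Etr per year: max(TXi) over the year minus min(TNj) over the year,
# mirroring `cdo -sub -yearmax TX.nc -yearmin TN.nc out.nc`.
etr = {y: tx[years == y].max() - tn[years == y].min() for y in np.unique(years)}
```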
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
This dataset contains ensemble spreads for the ERA5 surface level analysis parameter data ensemble means (see linked dataset). ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see the linked documentation for further details. The ensemble means and spreads are calculated from the ERA5 10-member ensemble, run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation, for which these data have been produced to provide an uncertainty estimate. This dataset contains a limited selection of all available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record.
Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: the spread is calculated by dividing by 10 rather than 9 (N-1). See the linked datasets for ensemble member and ensemble mean data.
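The distinction between the two standard deviations can be sketched in a few lines; the member values below are made up for illustration:

```python
import numpy as np

# Ten ensemble-member values of one variable at one grid point (illustrative).
members = np.array([281.2, 281.5, 280.9, 281.1, 281.4,
                    281.0, 281.3, 281.6, 280.8, 281.2])

# Ensemble spread as described above: standard deviation over all 10 members
# (including the control), dividing by N=10, i.e. ddof=0.
spread = members.std(ddof=0)

# The sample standard deviation divides by N-1=9 (ddof=1) and is always larger.
sample_sd = members.std(ddof=1)
```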
The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. It follows on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects.
An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed before being released by ECMWF as quality-assured data within 3 months. CEDA holds a 6-month rolling copy of the latest ERA5t data; see the related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Project Goals: To identify regions of recently evolved endemic (neo-endemism) mammal species in California and thereby infer areas of rapid evolutionary diversification, which may help guide conservation prioritization and future planning for protected areas. Four species-based GIS rasters of mammalian endemism were produced (see reference for details). This layer is: richness of species distribution models weighted by inverse range size. Abstract: The high rate of anthropogenic impact on natural systems mandates protection of the evolutionary processes that generate and sustain biological diversity. Environmental drivers of diversification include spatial heterogeneity of abiotic and biotic agents of divergent selection, features that suppress gene flow, and climatic or geological processes that open new niche space. To explore how well such proxies perform as surrogates for conservation planning, we need first to map areas with rapid diversification — 'evolutionary hotspots'. Here we combine estimates of range size and divergence time to map spatial patterns of neo-endemism for mammals of California, a global biodiversity hotspot. Neo-endemism is explored at two scales: (i) endemic species, weighted by the inverse of range size and mtDNA sequence divergence from sisters; and (ii) as a surrogate for spatial patterns of phenotypic divergence, endemic subspecies, again using inverse-weighting of range size. The species-level analysis revealed foci of narrowly endemic, young taxa in the central Sierra Nevada, northern and central coast, and Tehachapi and Peninsular Ranges. The subspecies endemism-richness analysis supported the last four areas as hotspots for diversification, but also highlighted additional coastal areas (Monterey to north of San Francisco Bay) and the Inyo Valley to the east.
We suggest these hotspots reflect the major processes shaping mammal neo-endemism: steep environmental gradients, biotic admixture areas, and areas with recent geological/climate change. Anthropogenic changes to both environment and land use will have direct impacts on regions of rapid divergence. However, despite widespread changes to land cover in California, the majority of the hotspots identified here occur in areas with relatively intact ecological landscapes. The geographical scope of conserving evolutionary process is beyond the scale of any single agency or nongovernmental organization. Choosing which land to closely protect and/or purchase will always require close coordination between agencies. Citation: DAVIS, E.B., KOO, M.S., CONROY, C., PATTON, J.L. & MORITZ, C. (2008) The California Hotspots Project: identifying regions of rapid diversification of mammals. Molecular Ecology 17, 120-138. This dataset was reviewed in another manner. Spatial Resolution: 0.0083333338 DD. This layer package was loaded using Data Basin.
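The inverse-range-size weighting that produces this layer can be sketched with a toy stack of binary species presence rasters. This is an illustration of the weighting scheme only; the actual layer is built from species distribution models, and all values below are made up.

```python
import numpy as np

# Toy stack of binary species range rasters (species x rows x cols); 1 = present.
ranges = np.array([
    [[1, 1], [1, 0]],   # widespread species: range size 3 cells
    [[0, 1], [0, 0]],   # narrow endemic: range size 1 cell
], dtype=float)

range_size = ranges.sum(axis=(1, 2))   # cells occupied per species

# Endemism richness: each species contributes 1/range_size to every cell
# it occupies, so narrow endemics dominate the resulting surface.
weighted_richness = (ranges / range_size[:, None, None]).sum(axis=0)
```

Note how the cell shared by both species scores highest, driven almost entirely by the single-cell endemic.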
The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey, which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly. From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate response to a data management expert in their unit, to all members of their unit, or to collate responses from their unit themselves before reporting in the survey.
Larger storage ranges cover vastly different amounts of data, so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," the 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.
Resources in this dataset:
Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF; the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/
Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Resource Description: CSV file that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).
Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
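The per-person storage calculation described in the survey methodology (high end of the reported range divided by 1 for an individual response, or by G for a group response) can be sketched as follows; the record fields and numbers are assumptions for illustration:

```python
# Hypothetical response records: reported storage range in TB and the number
# of individuals the response covers (G = 1 for an individual response).
responses = [
    {"range_tb": (1, 10), "group_size": 1},    # individual response
    {"range_tb": (10, 100), "group_size": 5},  # group response covering 5 people
]

# Per-person storage need: high end of the reported range divided by G.
per_person_tb = [r["range_tb"][1] / r["group_size"] for r in responses]
print(per_person_tb)  # [10.0, 20.0]
```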
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General
For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.
Summary
A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains
30 completely labeled (segmented) images
71 partly labeled images
altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30 and 60 min on average, while a difficult one can take up to 4 hours)
To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects
A set of metrics and a novel ranking score for respective meaningful method benchmarking
An evaluation of three baseline methods in terms of the above metrics and score
Abstract
Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.
Dataset documentation:
We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:
FISBe Datasheet
Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.
Files
fisbe_v1.0_{completely,partly}.zip
contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.
fisbe_v1.0_mips.zip
maximum intensity projections of all samples, for convenience.
sample_list_per_split.txt
a simple list of all samples and the subset they are in, for convenience.
view_data.py
a simple python script to visualize samples, see below for more information on how to use it.
dim_neurons_val_and_test_sets.json
a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.
Readme.md
general information
How to work with the image files
Each sample consists of a single 3d MCFO image of neurons of the fruit fly. For each image, we provide a pixel-wise instance segmentation for all separable neurons. Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification). The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file. The segmentation mask for each neuron is stored in a separate channel. The order of dimensions is CZYX.
We recommend working in a virtual environment, e.g., using conda:
conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env
How to open zarr files
Install the python zarr package:
pip install zarr
Open a zarr file with:

import zarr
raw = zarr.open("<path/to/sample>.zarr", mode='r', path="volumes/raw")
seg = zarr.open("<path/to/sample>.zarr", mode='r', path="volumes/gt_instances")
Zarr arrays are read lazily on-demand. Many functions that expect numpy arrays also work with zarr arrays. Optionally, the arrays can also explicitly be converted to numpy arrays.
How to view zarr image files
We recommend using napari to view the image data.
Install napari:
pip install "napari[all]"
Save the following Python script:
import zarr, sys, napari

raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

viewer = napari.Viewer(ndisplay=3)
for idx, gt in enumerate(gts):
    viewer.add_labels(gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()
Execute:
python view_data.py /R9F03-20181030_62_B5.zarr
Metrics
S: Average of avF1 and C
avF1: Average F1 Score
C: Average ground truth coverage
clDice_TP: Average true positives clDice
FS: Number of false splits
FM: Number of false merges
tp: Relative number of true positives
For more information on our selected metrics and formal definitions please see our paper.
Baseline
To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN), and a non-learnt, application-specific color clustering from Duan et al. For detailed information on the methods and the quantitative results please see our paper.
License
The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Citation
If you use FISBe in your research, please use the following BibTeX entry:
@misc{mais2024fisbe,
  title = {FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures},
  author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
  year = 2024,
  eprint = {2404.00130},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}
Acknowledgments
We thank Aljoscha Nern for providing unpublished MCFO images, as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions. P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program. This work was co-funded by Helmholtz Imaging.
Changelog
There have been no changes to the dataset so far. All future changes will be listed on the changelog page.
Contributing
If you would like to contribute, have encountered any issues, or have any suggestions, please open an issue for the FISBe dataset in the accompanying GitHub repository.
All contributions are welcome!
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
[Paper] [Project Page] [Github]
[🤗 Online Demo]
[🤗 Full Model Card (Diffusers)] [🤗 LoRA Model Card (Diffusers)]
Graph200K is a large-scale dataset containing a wide range of distinct image generation tasks. If you find Graph200K helpful, please consider starring ⭐ the GitHub repo. Thanks!
📰 News
[2025-5-15] 🤗🤗🤗 VisualCloze has been merged into the… See the full description on the dataset page: https://huggingface.co/datasets/VisualCloze/Graph200K.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PhysioIntent database was acquired during a master thesis at INESC TEC. The dataset was built to research human movement intention through biosignals (electromyogram (EMG), electroencephalogram (EEG), and electrocardiogram (ECG)) using the Cyton board from OpenBCI [1]. Inertial data (9-axis) was also recorded with a proprietary device from INESC TEC named iHandU [2]. A camera, a Logitech C270 HD, was also used to record video of each participant's session, thus better supporting the post-processing of the recorded data and the agreement between the protocol and the participant's activity. All data was then synchronized with the aid of a photoresistor, correlating the visual stimuli presented to the user with the signals acquired. The acquisitions are divided into two phases, where the 2nd phase was performed to address setbacks encountered in the 1st phase, such as data loss and synchronization issues. The 1st phase study included 6 healthy volunteers (range of age = 22 to 25; average age = 22.3±0.9; 2 males and 4 females; all right-handed). In the 2nd phase, the study included 3 healthy volunteers (range of age = 20 to 26; average age = 22.6±2.5; 2 males and 1 female; all right-handed). The protocol consists of the execution and imagination of several upper limb movements, which are repeated several times throughout the protocol. There are a total of three different movements during the session: hand grasping, wrist supination, and pick-and-place. Each sequence of movements (imagination and execution), as well as the resting periods, is called a trial. A run is a sequence of trials that ends with a 60 s break. This dataset has two different phases of acquisition: phase 1 has a total of four runs with fifteen trials each, while phase 2 has five runs with eighteen trials each. During phase 1, in every run, each movement is imagined and executed 5 times, corresponding to a total of 20 repetitions per movement during each session.
In phase 2, in every run, each movement was executed and imagined 6 times, resulting in 30 repetitions per movement in each session. In phase 1, four muscles (biceps brachii, triceps brachii, flexor carpi radialis, and extensor digitorum) were measured. For the EEG, the measured channels were: FP1, FP2, FCz, C3, Cz, C4, CP3, CP4, P3, and P4. During phase 2, only one muscle, the extensor digitorum, was measured. For the EEG, the channels measured were: FP1, FP2, FC3, FCz, FC4, C1, C3, Cz, C2, C4, CP3, CP4, P3, and P4. Before the experiments, the participants were informed about the experimental protocols, paradigms, and purpose. After ensuring they understood the information, the participants signed a written consent approved by the DPO of INESC TEC. All files are grouped by subject. You can find detailed descriptions of how the files are organized in the README file. There is also an extra folder called "PhysioIntent supporting material" containing additional material, including a script with functions to help you read the data, a description of the experimental protocol, and the setup created for each phase. For each subject, the data is organized according to the data model ("Subject_data_storage_model"), in which each type of data is placed in a different folder. Regarding biosignals (openBCI/ folder), both raw and processed data are provided. For some subjects, an additional README file documents particular details of the acquisition. [1] Cyton + Daisy Biosensing Boards (16-Channels). (2022). Retrieved 23 August 2022, from https://shop.openbci.com/products [2] Oliveira, Ana, Duarte Dias, Elodie Múrias Lopes, Maria do Carmo Vilas-Boas, and João Paulo Silva Cunha. "SnapKi—An Inertial Easy-to-Adapt Wearable Textile Device for Movement Quantification of Neurological Patients." Sensors 20, no. 14 (2020): 3875.
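The photoresistor-based synchronization described above can be sketched as follows; `sync_offset` is a hypothetical helper for aligning two streams that both recorded the light stimulus (the released data are already synchronized):

```python
import numpy as np

def sync_offset(photo_a, photo_b, threshold=0.5):
    """Estimate the sample offset between two recordings that both
    captured the same light stimulus on a photoresistor channel.

    Returns how many samples stream B lags stream A (negative if it
    leads). Hypothetical illustration of onset-based alignment.
    """
    onset_a = int(np.argmax(photo_a > threshold))  # first sample above threshold
    onset_b = int(np.argmax(photo_b > threshold))
    return onset_b - onset_a

# Toy example: the same step stimulus, recorded with a 25-sample lag.
a = np.zeros(200); a[50:] = 1.0
b = np.zeros(200); b[75:] = 1.0
lag = sync_offset(a, b)
```

Once the lag is known, the EMG/EEG/ECG streams from each device can be shifted onto a common time base.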
Abstract: Home range estimation is routine practice in ecological research. While advances in animal tracking technology have increased our capacity to collect data to support home range analysis, these same advances have also resulted in increasingly autocorrelated data. Consequently, the question of which home range estimator to use on modern, highly autocorrelated tracking data remains open. This question is particularly relevant given that most estimators assume independently sampled data. Here, we provide a comprehensive evaluation of the effects of autocorrelation on home range estimation. We base our study on an extensive dataset of GPS locations from 369 individuals representing 27 species distributed across 5 continents. We first assemble a broad array of home range estimators, including Kernel Density Estimation (KDE) with four bandwidth optimizers (Gaussian reference function, autocorrelated-Gaussian reference function (AKDE), Silverman's rule of thumb, and least squares cross-validation), Minimum Convex Polygon, and Local Convex Hull methods. Notably, all of these estimators except AKDE assume independent and identically distributed (IID) data. We then employ half-sample cross-validation to objectively quantify estimator performance, and the recently introduced effective sample size for home range area estimation (N̂_area) to quantify the information content of each dataset. We found that AKDE 95% area estimates were larger than conventional IID-based estimates by a mean factor of 2. The median number of cross-validated locations included in the holdout sets by AKDE 95% (or 50%) estimates was 95.3% (or 50.1%), confirming the larger AKDE ranges were appropriately selective at the specified quantile. Conversely, conventional estimates exhibited negative bias that increased with decreasing N̂_area.
To contextualize our empirical results, we performed a detailed simulation study to tease apart how sampling frequency, sampling duration, and the focal animal's movement conspire to affect range estimates. Paralleling our empirical results, the simulation study demonstrated that AKDE was generally more accurate than conventional methods, particularly for small N̂_area. While 72% of the 369 empirical datasets had >1000 total observations, only 4% had an N̂_area >1000, and 30% had an N̂_area <30. In this frequently encountered scenario of small N̂_area, AKDE was the only estimator capable of producing an accurate home range estimate on autocorrelated data.
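The half-sample cross-validation idea can be sketched as follows: a minimal illustration using an ordinary Gaussian KDE with its default (Scott's rule) bandwidth on IID toy data, not the AKDE estimator the study advocates:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def holdout_coverage(locs, quantile=0.95):
    """Half-sample cross-validation of a KDE home range.

    Fit a KDE on the first half of the locations, build the `quantile`
    density region from the training points, and report the fraction
    of held-out points falling inside it. Simplified sketch; it does
    not implement AKDE's autocorrelation-adjusted bandwidth.
    """
    half = len(locs) // 2
    train, test = locs[:half], locs[half:]
    kde = gaussian_kde(train.T)  # Scott's-rule bandwidth by default
    # Density cutoff such that `quantile` of training points lie above it.
    cutoff = np.quantile(kde(train.T), 1.0 - quantile)
    return float(np.mean(kde(test.T) >= cutoff))

locs = rng.normal(size=(400, 2))  # IID toy "relocations"
cov = holdout_coverage(locs)      # should land near 0.95 for IID data
```

For IID data the holdout coverage tracks the nominal quantile; on autocorrelated tracks, the paper reports that only AKDE keeps it there.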
Our dataset provides detailed and precise insights into the business, commercial, and industrial aspects of any given area in the USA, including Point of Interest (POI) data and foot traffic. The dataset is divided into 150 m x 150 m areas (geohash 7) and has over 50 variables.
- Use it for different applications: Our combined dataset, which includes POI and foot traffic data, can be employed for various purposes. Different data teams use it to guide retailers and FMCG brands in site selection, fuel marketing intelligence, analyze trade areas, and assess company risk. Our dataset has also proven useful for real estate investment.
- Get reliable data: Our datasets have been processed, enriched, and tested so your data team can use them more quickly and accurately.
- Ideal for training ML models: The high quality of our geographic information layers results from more than seven years of work dedicated to the deep understanding and modeling of geospatial Big Data. Among the features that distinguish this dataset is the use of anonymized and user-compliant mobile device GPS locations, enriched with other alternative and public data.
- Easy to use: Our dataset is user-friendly and can be easily integrated into your current models. We can also deliver your data in different formats, like .csv, according to your analysis requirements.
- Get personalized guidance: In addition to providing reliable datasets, we advise your analysts on their correct implementation. Our data scientists can guide your internal team on the optimal algorithms and models to get the most out of the information we provide (without compromising the security of your internal data).
Answer questions like:
- What places does my target user visit in a particular area?
- Which are the best areas to place a new POS?
- What is the average yearly income of users in a particular area?
- What is the influx of visits that my competition receives?
- What is the volume of traffic surrounding my current POS?
This dataset is useful for getting insights from industries like:
- Retail & FMCG
- Banking, Finance, and Investment
- Car Dealerships
- Real Estate
- Convenience Stores
- Pharma and medical laboratories
- Restaurant chains and franchises
- Clothing chains and franchises
Our dataset includes more than 50 variables, such as:
- Number of pedestrians seen in the area.
- Number of vehicles seen in the area.
- Average speed of movement of the vehicles seen in the area.
- Points of Interest (POIs), in number and type, seen in the area (supermarkets, pharmacies, recreational locations, restaurants, offices, hotels, parking lots, wholesalers, financial services, pet services, shopping malls, among others).
- Average yearly income range (anonymized and aggregated) of the devices seen in the area.
Notes to better understand this dataset:
- POI confidence means the average confidence of POIs in the area. In this case, POIs are any kind of location, such as a restaurant, a hotel, or a library.
- Category confidences, for example "food_drinks_tobacco_retail_confidence", indicate how confident we are in the existence of food/drink/tobacco retail locations in the area.
- We added predictions for The Home Depot and Lowe's Home Improvement stores in the dataset sample. These predictions were the result of a machine-learning model that was trained with the data. Knowing where the current stores are, we can find the most similar areas for new stores to open.
How efficient is a geohash? Geohash is a faster, cost-effective geofencing option that reduces input data load and provides actionable information. Its benefits include faster querying, reduced cost, minimal configuration, and ease of use. A geohash ranges from 1 to 12 characters. The dataset can be split into variable-size geohashes, with the default being geohash 7 (150 m x 150 m).
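For illustration, a geohash-7 cell can be computed with a compact pure-Python encoder (a sketch; production pipelines typically use a dedicated geohash library):

```python
# Base-32 alphabet used by the geohash scheme.
_B32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=7):
    """Encode a lat/lon pair into a geohash of `precision` characters.

    precision=7 yields cells of roughly 150 m x 150 m, matching the
    geohash-7 tiling used by this dataset.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits, even = [], True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if even:
            mid = (lon_lo + lon_hi) / 2
            bits.append(1 if lon >= mid else 0)
            lon_lo, lon_hi = (mid, lon_hi) if lon >= mid else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bits.append(1 if lat >= mid else 0)
            lat_lo, lat_hi = (mid, lat_hi) if lat >= mid else (lat_lo, mid)
        even = not even
    # Pack each group of 5 bits into one base-32 character.
    return "".join(
        _B32[int("".join(map(str, bits[i:i + 5])), 2)]
        for i in range(0, precision * 5, 5)
    )

cell = geohash_encode(57.64911, 10.40744)  # classic test point -> "u4pruyd"
```

Because the prefix of a geohash identifies a containing cell, grouping records by their geohash-7 string is a cheap way to join data to the 150 m x 150 m grid.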
The extend_search extension enhances the CKAN data catalog by adding advanced search capabilities. It focuses on improving how users find datasets by introducing date range filtering based on the 'modified-on' metadata, and enables searching datasets by custodian. By incorporating these features, extend_search makes it easier for users to discover relevant datasets within a CKAN instance. Key Features: Date Range Search Filter: Allows users to filter datasets based on a date range applied to the 'modified-on' metadata field. This feature utilizes the bootstrap-daterangepicker library, crediting Dan Grossman’s work, to provide a user-friendly interface for selecting date ranges. Custodian Search Filter: Introduces the ability to search datasets based on the custodian responsible for the dataset. This facilitates finding datasets managed by specific organizations or individuals. Technical Integration: The extension is installed via standard CKAN extension installation procedures. This involves cloning the repository, installing the required Python packages using pip, installing the extension using setup.py, and enabling the extend_search plugin in the CKAN configuration file (.ini). Benefits & Impact: By implementing the extend_search extension, CKAN installations can improve the findability of datasets, saving users time and effort. Date range filtering is specifically useful when searching for recently updated datasets, while custodian filtering is helpful when looking for datasets managed by specific entities.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in Grass Range, MT, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
[Chart: Grass Range, MT median household income, by household size (in 2022 inflation-adjusted dollars)]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Grass Range median household income. You can refer to the same here
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
[[SUPERSEDED - this dataset is replaced by a later version: https://doi.org/10.7488/ds/2109]] Research into binary network analysis of brain function faces a methodological challenge in selecting an appropriate threshold to binarise edge weights. For EEG, such binarisation should take into account the complex hierarchical structure found in functional connectivity. We explore the density range suitable for such structure and provide a comparison of state-of-the-art binarisation techniques, the recently proposed Cluster-Span Threshold (CST), minimum spanning trees, and union of shortest path graphs, with arbitrary proportional thresholds and weighted networks. We test these techniques on weighted complex hierarchy models by contrasting model realisations with small parametric differences. We also test the robustness of these techniques to random and targeted topological attacks. We reveal that complex hierarchical topology requires a medium-density range binarisation solution, such as the CST, which proves near maximal for distinguishing differences when compared with arbitrary proportional thresholding. Simulated results are validated with the analysis of three relevant EEG datasets: eyes open and closed resting states; visual short-term memory tasks; and resting state Alzheimer's disease with a healthy control group. The CST consistently outperforms other state-of-the-art binarisation methods for topological accuracy and robustness in both synthetic and real data. We provide insights into how the complex hierarchical structure of functional networks is best revealed in medium density ranges and how it safeguards against targeted attacks. These EEG PLI connectivity datasets are used in the analysis of our submitted manuscript: https://arxiv.org/abs/1610.06360.
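One of the baselines compared above, an arbitrary proportional threshold, can be sketched as follows (illustrative only; the CST and spanning-tree methods are more involved):

```python
import numpy as np

def proportional_threshold(W, density):
    """Binarise a symmetric weighted connectivity matrix by keeping
    the strongest `density` fraction of edges.

    This is the "arbitrary proportional threshold" baseline; the CST
    instead selects a density that balances clustering and spanning
    structure in the network.
    """
    n = W.shape[0]
    iu = np.triu_indices(n, k=1)          # upper-triangle edge weights
    weights = W[iu]
    k = max(1, int(round(density * weights.size)))
    cutoff = np.sort(weights)[-k]         # k-th largest weight
    A = np.zeros_like(W, dtype=int)
    A[iu] = (W[iu] >= cutoff).astype(int)
    return A + A.T                        # symmetric binary adjacency

# Toy 4-node PLI-like matrix; keep the top half of the 6 edges.
W = np.array([[0.0, 0.9, 0.8, 0.1],
              [0.9, 0.0, 0.7, 0.2],
              [0.8, 0.7, 0.0, 0.3],
              [0.1, 0.2, 0.3, 0.0]])
A = proportional_threshold(W, 0.5)
```

The choice of `density` is exactly the free parameter the CST is designed to remove.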
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.
The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration). The filtered range data were generated by applying frequency-domain filtering to eliminate image disturbances (e.g., surface variations and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact of different types of image data on deep convolutional neural network (DCNN) performance.
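A crude sketch of the idea behind frequency-domain filtering of range data is shown below: a simple radial high-pass that removes slow surface variation. The actual filter in [1] is more sophisticated and also targets periodic groove patterns specifically; this toy image and cutoff are assumptions for illustration only:

```python
import numpy as np

def highpass_range(img, cutoff=5):
    """Zero Fourier coefficients near DC to suppress slow surface
    variation in a range image. A simplified stand-in for the
    frequency-domain filtering described in [1]."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    keep = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 > cutoff ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * keep)))

# Toy range patch: smooth low-frequency surface undulation plus a
# narrow crack-like depression at column 30.
y, x = np.mgrid[:64, :64]
img = 0.5 * np.sin(2 * np.pi * x / 64) + 0.5 * np.sin(2 * np.pi * y / 64)
img[:, 30] -= 1.0

flat = highpass_range(img)  # undulation removed, crack preserved
```

Because cracks are narrow (high-frequency) while surface undulation is broad (low-frequency), the crack signature survives the filter while the background flattens out.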
If you share or use this dataset, please cite [4] and [5] in any relevant documentation.
In addition, an image dataset for crack classification has also been published at [6].
References:
[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873
[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605
[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434
[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678
[5] Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044
[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78