Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.
The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.
If you share or use this dataset, please cite [4] and [5] in any relevant documentation.
In addition, an image dataset for crack classification has also been published at [6].
References:
[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873
[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605
[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434
[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678
[5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044
[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78
Merged HDR images of many multi-exposure datasets can be improved with accurate exposure estimation.
Five chemicals [2-ethylhexyl 4-hydroxybenzoate (2-EHHB), 4-nonylphenol-branched (4-NP), 4-tert-octylphenol (4-OP), benzyl butyl phthalate (BBP) and dibutyl phthalate (DBP)] were subjected to a 21-day Amphibian Metamorphosis Assay (AMA) following OCSPP 890.1100 test guidelines. The selected chemicals exhibited estrogenic or androgenic bioactivity in high throughput screening data obtained from US EPA ToxCast models. Xenopus laevis larvae were exposed nominally to each chemical at 3.6, 10.9, 33.0 and 100 µg/L, except 4-NP, for which concentrations were 1.8, 5.5, 16.5 and 50 µg/L. Endpoint data collected (daily or on a given study day (SD)) included: mortality (daily), developmental stage (SD 7 and 21), hind limb length (HLL) (SD 7 and 21), snout-vent length (SVL) (SD 7 and 21), wet body weight (BW) (SD 7 and 21), and thyroid histopathology (SD 21). 4-OP and BBP caused accelerated development compared to controls at mean measured concentrations of 39.8 and 3.5 µg/L, respectively. Normalized HLL was increased on SD 21 for all chemicals except 4-NP. Histopathology revealed mild thyroid follicular cell hypertrophy at all BBP concentrations, while moderate thyroid follicular cell hypertrophy occurred at the 105 µg/L BBP concentration. Evidence of accelerated metamorphic development was also observed histopathologically in BBP-treated frogs at concentrations as low as 3.5 µg/L. Increased BW relative to control occurred for all chemicals except 4-OP. An increase in SVL was observed in larvae exposed to 4-NP, BBP and DBP on SD 21. With the exception of 4-NP, the chemicals tested appeared to alter thyroid axis-driven metamorphosis, albeit through different lines of evidence, with BBP and DBP providing the strongest evidence of effects on the thyroid axis. Citation information for this dataset can be found in Data.gov's References section.
https://spdx.org/licenses/CC0-1.0.html
Home-range estimation is an important application of animal tracking data that is frequently complicated by autocorrelation, sampling irregularity, and small effective sample sizes. We introduce a novel, optimal weighting method that accounts for temporal sampling bias in autocorrelated tracking data. This method corrects for irregular and missing data, such that oversampled times are downweighted and undersampled times are upweighted to minimize error in the home-range estimate. We also introduce computationally efficient algorithms that make this method feasible with large datasets. Generally speaking, there are three situations where weight optimization improves the accuracy of home-range estimates: with marine data, where the sampling schedule is highly irregular, with duty cycled data, where the sampling schedule changes during the observation period, and when a small number of home-range crossings are observed, making the beginning and end times more independent and informative than the intermediate times. Using both simulated data and empirical examples including reef manta ray, Mongolian gazelle, and African buffalo, optimal weighting is shown to reduce the error and increase the spatial resolution of home-range estimates. With a conveniently packaged and computationally efficient software implementation, this method broadens the array of datasets with which accurate space-use assessments can be made.
In support of new permitting workflows associated with anticipated WellSTAR needs, the CalGEM GIS unit extended the existing BLM PLSS Township & Range grid to cover offshore areas within the 3-mile limit of California jurisdiction. The PLSS grid as currently used by CalGEM is a composite of a BLM download (the majority of the data), additions by the DPR, and polygons created by CalGEM to fill in missing areas (the Ranchos, and offshore areas within the 3-mile limit of California jurisdiction). CalGEM is the Geologic Energy Management Division of the California Department of Conservation, formerly the Division of Oil, Gas, and Geothermal Resources (as of January 1, 2020). Update Frequency: As Needed
Range improvements are man-made or man-caused features on the landscape designed and implemented for the purpose of improving the available forage, managing the season of use or use patterns, and enhancing the overall rangeland health of areas available for domestic livestock use. Range improvements may occur on private, state, and public lands under the jurisdiction of the Bureau of Land Management (BLM) and/or other federal or state agencies. On public lands managed by the BLM, permittees or lessees (henceforth, “operators”) may be required to install range improvements to meet the terms and conditions of their permits or leases. Often the BLM, operators, and other interested parties work together and jointly contribute to construction. Range improvements are authorized physical modifications or treatments which are designed to improve production of forage; change vegetation composition; control patterns of use; provide water; stabilize soil and water conditions; and restore, protect, and improve the conditions of the rangeland ecosystems to benefit livestock, wild horses and burros, and fish and wildlife. They include, but are not limited to, structures, treatment projects, and use of mechanical devices or modifications achieved through mechanical means. There are two kinds of range improvements: nonstructural and structural. Seeding or prescribed burns are examples of nonstructural range improvements. Fences or facilities such as wells or water pipelines are examples of structural improvements. Many structural improvements are considered permanent, as they are not easily removed from the land. This data standard will only relate to structural range improvement features, as GIS and attribute data related to almost all nonstructural range improvements are stored in other national data standard datasets (e.g. NISIMS, VTRT, NFPORS).
Range improvement data is also available in the Range Improvement System (RIPS), a BLM database used for tracking the establishment and maintenance of range improvements. RIPS is the database of record and contains the data to be used for budgetary and workload planning. This data set shall be comprised of a spatial display of the data in RIPS. The record unique identifier within the RIPS database (RIPS number) will be added to GIS features in this data standard to link spatial depictions of range improvement features to their corresponding RIPS records. Wherever possible, RIPS data shall be used to populate this data set.
Aim: Species adapt differently to contrasting environments, such as open habitats with sparse vegetation and forested habitats with dense forest cover. We investigated colonization patterns in the open and forested environments in the Diagonal of Open Formations and surrounding rain forests (i.e., Amazon and Atlantic Forest) in Brazil, tested whether the diversification rates were affected by the environmental conditions, and identified traits that enabled species to persist in those environments.
Location: South America, Brazil.
Taxon: Squamata, Lizards
Methods: We estimated ancestral ranges to identify range shifts relative to traditional open and forested habitats for all species. We used phylogenetic information and the current distribution of species in open and forested environments. To evaluate whether these environments influenced species diversification, we tested 12 models using a Hidden Geographic State Speciation and Extinction analysis. Finally, we combined phylogenetic ...
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Large language models (LLMs) have shown impressive capabilities in solving a wide range of tasks based on human instructions. However, developing a conversational AI assistant for electronic health record (EHR) data remains challenging due to the lack of large-scale instruction-following datasets. To address this, we present MIMIC-IV-Ext-Instr, a dataset containing over 450K open-ended, instruction-following examples generated using GPT-3.5 on a HIPAA-compliant platform. Derived from the MIMIC-IV EHR database, MIMIC-IV-Ext-Instr spans a wide range of topics and is specifically designed to support instruction-tuning of general-purpose LLMs for diverse clinical applications.
The databases ESTAR, PSTAR, and ASTAR calculate stopping-power and range tables for electrons, protons, or helium ions. Stopping-power and range tables can be calculated for electrons in any user-specified material and for protons and helium ions in 74 materials.
Please cite the following paper when using this dataset: N. Thakur, “MonkeyPox2022Tweets: A large-scale Twitter dataset on the 2022 Monkeypox outbreak, findings from analysis of Tweets, and open research questions,” Infect. Dis. Rep., vol. 14, no. 6, pp. 855–883, 2022, DOI: https://doi.org/10.3390/idr14060087.
Abstract: The mining of Tweets to develop datasets on recent issues, global challenges, pandemics, virus outbreaks, emerging technologies, and trending matters has been of significant interest to the scientific community in the recent past, as such datasets serve as a rich data resource for the investigation of different research questions. Furthermore, the virus outbreaks of the past, such as COVID-19, Ebola, Zika virus, and flu, just to name a few, were associated with various works related to the analysis of the multimodal components of Tweets to infer the different characteristics of conversations on Twitter related to these respective outbreaks. The ongoing outbreak of the monkeypox virus, declared a Global Public Health Emergency (GPHE) by the World Health Organization (WHO), has resulted in a surge of conversations about this outbreak on Twitter, which is generating tremendous amounts of Big Data. There has been no prior work in this field thus far that has focused on mining such conversations to develop a Twitter dataset. Therefore, this work presents an open-access dataset of 571,831 Tweets about monkeypox that have been posted on Twitter since the first detected case of this outbreak on May 7, 2022. The dataset complies with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles for scientific data management.
Data Description: The dataset consists of a total of 571,831 Tweet IDs of the same number of tweets about monkeypox that were posted on Twitter from 7th May 2022 to 11th November 2022 (the most recent date at the time of uploading the most recent version of the dataset). The Tweet IDs are presented in 12 different .txt files based on the timelines of the associated tweets. The following represents the details of these dataset files.
Filename: TweetIDs_Part1.txt (No. of Tweet IDs: 13926, Date Range of the associated Tweet IDs: May 7, 2022, to May 21, 2022)
Filename: TweetIDs_Part2.txt (No. of Tweet IDs: 17705, Date Range of the associated Tweet IDs: May 21, 2022, to May 27, 2022)
Filename: TweetIDs_Part3.txt (No. of Tweet IDs: 17585, Date Range of the associated Tweet IDs: May 27, 2022, to June 5, 2022)
Filename: TweetIDs_Part4.txt (No. of Tweet IDs: 19718, Date Range of the associated Tweet IDs: June 5, 2022, to June 11, 2022)
Filename: TweetIDs_Part5.txt (No. of Tweet IDs: 46718, Date Range of the associated Tweet IDs: June 12, 2022, to June 30, 2022)
Filename: TweetIDs_Part6.txt (No. of Tweet IDs: 138711, Date Range of the associated Tweet IDs: July 1, 2022, to July 23, 2022)
Filename: TweetIDs_Part7.txt (No. of Tweet IDs: 105890, Date Range of the associated Tweet IDs: July 24, 2022, to July 31, 2022)
Filename: TweetIDs_Part8.txt (No. of Tweet IDs: 93959, Date Range of the associated Tweet IDs: August 1, 2022, to August 9, 2022)
Filename: TweetIDs_Part9.txt (No. of Tweet IDs: 50832, Date Range of the associated Tweet IDs: August 10, 2022, to August 24, 2022)
Filename: TweetIDs_Part10.txt (No. of Tweet IDs: 39042, Date Range of the associated Tweet IDs: August 25, 2022, to September 19, 2022)
Filename: TweetIDs_Part11.txt (No. of Tweet IDs: 12341, Date Range of the associated Tweet IDs: September 20, 2022, to October 9, 2022)
Filename: TweetIDs_Part12.txt (No. of Tweet IDs: 15404, Date Range of the associated Tweet IDs: October 10, 2022, to November 11, 2022)
Please note: The dataset contains only Tweet IDs in compliance with the terms and conditions mentioned in the privacy policy, developer agreement, and guidelines for content redistribution of Twitter. The Tweet IDs need to be hydrated to be used. For hydrating this dataset, the Hydrator application (link to download the application: https://github.com/DocNow/hydrator/releases and link to a step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) may be used.
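Before hydrating, it can be useful to confirm that the downloaded files match the counts listed above. The following is a minimal sketch, not part of the dataset's own tooling: it assumes each .txt file stores one Tweet ID per line (the description does not state the layout), and the function names are hypothetical.

```python
from pathlib import Path

# Per-file Tweet ID counts, copied from the dataset description.
EXPECTED_COUNTS = {
    "TweetIDs_Part1.txt": 13926,
    "TweetIDs_Part2.txt": 17705,
    "TweetIDs_Part3.txt": 17585,
    "TweetIDs_Part4.txt": 19718,
    "TweetIDs_Part5.txt": 46718,
    "TweetIDs_Part6.txt": 138711,
    "TweetIDs_Part7.txt": 105890,
    "TweetIDs_Part8.txt": 93959,
    "TweetIDs_Part9.txt": 50832,
    "TweetIDs_Part10.txt": 39042,
    "TweetIDs_Part11.txt": 12341,
    "TweetIDs_Part12.txt": 15404,
}


def parse_tweet_ids(text):
    """Split file contents into Tweet IDs (assumed: one ID per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_counts(folder):
    """Return {filename: (found, expected)} for files whose ID count differs."""
    mismatches = {}
    for name, expected in EXPECTED_COUNTS.items():
        text = (Path(folder) / name).read_text(encoding="utf-8")
        found = len(parse_tweet_ids(text))
        if found != expected:
            mismatches[name] = (found, expected)
    return mismatches


# The listed per-file counts add up to the stated dataset size.
TOTAL_IDS = sum(EXPECTED_COUNTS.values())
```

Calling `check_counts("path/to/download")` on a complete download should return an empty dictionary if the one-ID-per-line assumption holds.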
We analyze the >4{sigma} sources in the most sensitive 100arcmin^2^ area (rms4.5{sigma}) obtained from high-resolution interferometric follow-up observations. The raw SCUBA-2 >4{sigma} limit is fainter than 2.25mJy throughout this region, and deboosting corrections would lower this further. Of the 53 SCUBA-2 sources in this sample, only five have no ALMA detections, while 13% (68% confidence range 7%-19%) have multiple ALMA counterparts. Color-based high-redshift dusty galaxy selection techniques find at most 55% of the total ALMA sample. In addition to using literature spectroscopic and optical/near-infrared photometric redshifts, we estimate far infrared photometric redshifts based on an Arp 220 template. We identify seven z>~4 candidates. We see the expected decline with redshift of the 4.5 and 24{mu}m to 850{mu}m flux ratios, confirming these as good diagnostics of z>~4 candidates. We visually classify 52 ALMA sources, finding 44% (68% confidence range 35%-53%) to be apparent mergers. We calculate rest-frame 2-8keV and 8-28keV luminosities using the 7Ms Chandra X-ray image. Nearly all of the ALMA sources detected at 0.5-2keV are consistent with a known X-ray luminosity to 850{mu}m flux relation for star-forming galaxies, while most of those detected at 2-7keV are moderate-luminosity AGNs that lie just above the 2-7keV detection threshold. The latter largely have substantial obscurations of logN_H_=23-24cm^-2^, but two of the high-redshift candidates may even be Compton thick.
Abstract: Every species experiences limits to its geographic distribution. Some evolutionary models predict that populations at range edges are less well-adapted to their local environments due to drift, expansion load, or swamping gene flow from the range interior. Alternatively, populations near range edges might be uniquely adapted to marginal environments. In this study, we use a database of transplant studies that quantify performance at broad geographic scales to test how local adaptation, site quality, and population quality change from spatial and climatic range centers towards edges. We find that populations from poleward edges perform relatively poorly, both on average across sites (15% lower population quality) and when compared to other populations at home (31% relative fitness disadvantage), consistent with these populations harboring high genetic load. Populations from equatorial edges also perform poorly on average (18% lower population quality) but, in contrast, outperform foreign populations (16% relative fitness advantage), suggesting that populations from equatorial edges have adapted to unique environments. Finally, we find that populations from sites that are thermally extreme relative to the species' niche demonstrate strong local adaptation, regardless of geographic position. Our findings indicate that both nonadaptive processes and adaptive evolution contribute to variation in adaptation across species' ranges.
Methods: This dataset contains fitness data gathered from a systematic literature search of transplant experiments, along with geographic and climatic covariates derived for this study. Included are the final data file and model running scripts, as well as scripts, GBIF occurrence data, and intermediate files demonstrating how spatial and climatic predictors were calculated.
Usage notes: See README file.
Individual percentages, median fluorescent intensities and concentrations for each horse that were used to generate figure graphs are compiled in labeled data tables. (A) Percentage of IgE+ monocytes out of total cells in unsorted, MACS sorted and MACS+FACS sorted samples from 18 different horses in Fig 2D. (B) Percentage of CD23- cells out of total IgE+ monocytes in Fig 3D. (C) Clinical scores of allergic in Fig 4A. (D) Percentage of IgE+ monocytes out of total monocytes in Fig 4C. (E) Percentage of CD16+ cells out of total IgE+ monocytes in Fig 4D. (F) Serum total IgE (ng/ml) measured by bead-based assay in Fig 5A. (G) IgE median fluorescent intensity (MFI) of IgE mAb 176 (Alexa Fluor 488) on IgE+ monocytes in Fig 5B. (H) Combined serum total IgE and IgE MFI on IgE+ monocytes in Fig 5C. (I) Percentage of monocytes out of total IgE+ cells in Fig 6A. (J) Secreted concentration of IL-10 (pg/ml), IL-4 (pg/ml), IFN-γ (MFI) and IL-17A (MFI) as measured by bead-based assay in Fig 6B. (K) Percentage of CD16+ cells out of total IgE- CD14+ monocytes. B-H,K show allergic (n = 7) and nonallergic (n = 7) horses, J shows allergic (n = 8) and nonallergic (n = 8) horses in October 2019. C-H,K show data points collected from April 2018-March 2019. (XLSX)
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Here are a few use cases for this project:
Historical Research: The model could be used by researchers and historians to extract and analyze information from old digitized newspapers. By identifying title, date, and page, it allows them to track chronological events and understand the social, political, and economic context better.
Digital Library Cataloging: This model could help digital libraries in cataloging their newspaper collections. Identifying the different elements (title, date, page) can aid in creating a more refined and exact database, making it easier for users to find the exact newspaper issue they are looking for.
News Monitoring: Media tracking or Public Relations companies can use the model to monitor coverage about a specific event or brand. By identifying text regions and titles it could help in finding articles about a particular subject over a wide range of published newspapers.
Educational Purpose: It can be used for teaching journalism or communication students about the evolution of news-writing styles, formats, and layout designs by observing differences across time and different types of newspapers.
Content Recommendation: For a digital news platform, this model could analyze the users' reading habits (type of title, section, or day of reading most frequently) and recommend similar content to enhance user engagement.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Russian Brainstorming Prompt-Response Dataset, a meticulously curated collection of 2000 prompt and response pairs. This dataset is a valuable resource for enhancing the creative and generative abilities of Language Models (LMs), a critical aspect in advancing generative AI.
Dataset Content: This brainstorming dataset comprises a diverse set of prompts and responses, where the prompt contains instruction, context, constraints, and restrictions while the completion contains the most accurate response list for the given prompt. Both prompts and completions are available in Russian.
These prompt and completion pairs cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more. Each prompt is accompanied by a response, providing valuable information and insights to enhance the language model training process. Both the prompt and response were manually curated by native Russian speakers, and references were taken from diverse sources such as books, news articles, websites, and other reliable references.
This dataset encompasses various prompt types, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. Additionally, you'll find prompts and responses containing rich text elements, such as tables, code, JSON, etc., all in proper markdown format.
Prompt Diversity: To ensure diversity, our brainstorming dataset features prompts of varying complexity levels, ranging from easy to medium and hard. The prompts also vary in length, including short, medium, and long prompts, providing a comprehensive range. Furthermore, the dataset includes prompts with constraints and persona restrictions, making it exceptionally valuable for LLM training.
Response Formats: Our dataset accommodates diverse learning experiences, offering responses across different domains depending on the prompt. For these brainstorming prompts, responses are generally provided in list format. These responses encompass text strings, numerical values, and dates, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
Data Format and Annotation Details: This fully labeled Russian Brainstorming Prompt Completion Dataset is available in both JSON and CSV formats. It includes comprehensive annotation details, including a unique ID, prompt, prompt type, prompt length, prompt complexity, domain, response, and the presence of rich text.
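As an illustration of how the annotation fields listed above might be used, the sketch below tallies records by one annotation field. The JSON key names (`prompt_type`, `prompt_complexity`, and so on) and the placeholder records are assumptions made for this example; the actual key names and file layout in the delivered dataset may differ.

```python
import json
from collections import Counter

# Placeholder records with hypothetical key names; not real dataset content.
SAMPLE_JSON = """
[
  {"id": "0001", "prompt": "<prompt text>", "prompt_type": "instruction",
   "prompt_length": "short", "prompt_complexity": "easy",
   "domain": "science", "response": "<response list>", "rich_text": false},
  {"id": "0002", "prompt": "<prompt text>", "prompt_type": "continuation",
   "prompt_length": "long", "prompt_complexity": "hard",
   "domain": "history", "response": "<response list>", "rich_text": true},
  {"id": "0003", "prompt": "<prompt text>", "prompt_type": "instruction",
   "prompt_length": "medium", "prompt_complexity": "easy",
   "domain": "geography", "response": "<response list>", "rich_text": false}
]
"""


def load_records(text):
    """Parse a JSON array of prompt-response records."""
    return json.loads(text)


def count_by(records, field):
    """Count how many records carry each value of one annotation field."""
    return Counter(record[field] for record in records)


def select(records, **criteria):
    """Keep records whose fields equal every given key=value criterion."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


records = load_records(SAMPLE_JSON)
complexity_counts = count_by(records, "prompt_complexity")
easy_instructions = select(records, prompt_type="instruction",
                           prompt_complexity="easy")
```

The same helpers would apply to the CSV export after reading it with `csv.DictReader`, since both formats are described as carrying the same annotation fields.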
Quality and Accuracy: Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Russian version is grammatically accurate without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used during the construction of this dataset.
Continuous Updates and Customization: The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. We continuously work to expand this dataset, ensuring its ongoing growth and relevance. Additionally, FutureBeeAI offers the flexibility to curate custom brainstorming prompt and completion datasets tailored to specific requirements, providing you with customization options.
License: This dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can leverage this fully labeled and ready-to-deploy Russian Brainstorming Prompt-Completion Dataset to enhance the creative and accurate response generation capabilities of their generative AI models and explore new approaches to NLP tasks.
We aim to identify and characterise binary systems containing red supergiant (RSG) stars in the Small Magellanic Cloud (SMC) using a newly available ultraviolet (UV) point source catalogue obtained using the Ultraviolet Imaging Telescope (UVIT) on board AstroSat. We select a sample of 561 SMC RSGs based on photometric and spectroscopic observations at optical wavelengths and cross-match this with the far-UV point source catalogue using the UVIT F172M filter, finding 88 matches down to an AB magnitude of 20.3, which we interpret as hot companions to the RSGs. Stellar parameters (luminosities, effective temperatures and masses) for both components in all 88 binary systems are determined and we find mass distributions in the ranges 6.2<M/M_{sun}<20.3 for RSGs and 3.7<M/M_{sun}<15.6 for their companions. The most massive RSG binary system in the SMC has a combined mass of 30+/-2M_{sun}, with a mass ratio (q) of 0.94. To determine the intrinsic multiplicity fraction for RSGs in the SMC, we simulate observational biases and find 18.8+/-1.5% for mass ratios in the range 0.3<q<1.0 and orbital periods approximately in the range 3<logP[days]<8. By comparing our results with those of a similar mass on the main-sequence, we determine the fraction of single stars to be ~20% and argue that the orbital period distribution declines rapidly beyond logP~3.5. We study the mass-ratio distribution of RSG binary systems and find that a uniform distribution best describes the data below 14M_{sun}. Above 14M_{sun}, we find a lack of high mass-ratio systems.
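To make the relation between combined mass and mass ratio concrete, the sketch below splits a total mass into component masses. It assumes the mass ratio is defined as q = M_companion / M_RSG with q ≤ 1, which the abstract does not state explicitly; the function name is hypothetical, and the result is only a nominal split that ignores the quoted +/-2 M_{sun} uncertainty.

```python
def component_masses(total_mass, q):
    """Split a combined mass into (primary, companion) masses.

    Assumes q = companion / primary, so that
    total = primary * (1 + q)  ->  primary = total / (1 + q).
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q is assumed to lie in (0, 1]")
    primary = total_mass / (1.0 + q)
    companion = q * primary
    return primary, companion


# Values quoted in the abstract: combined mass 30 M_sun, q = 0.94.
rsg_mass, companion_mass = component_masses(30.0, 0.94)
```

Under this assumption the nominal split is roughly 15.5 and 14.5 solar masses for the RSG and its companion, respectively.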
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General information:
Title of the Paper: "Direct and Indirect Effects of Fungicides on Growth, Feeding, and Pigmentation of the Freshwater Detritivore Asellus aquaticus"
Authors: Akshay Mohan, Blake Matthews, Katja Räsänen
Date of Data Collection: July 2022 - October 2022
Funder: University of Jyväskylä
Publication Date: 15.10.2024
DOI of the Associated Paper: 10.1016/j.ecoenv.2024.117017

Overview of the Dataset:
This dataset comprises two primary files used in the analysis of fungicide effects on Asellus aquaticus:
Rangefinding.csv: The file contains results from a range-finding experiment where three fungicides were tested at different concentrations.
Tebuconazole-Expt.csv: The file includes results from the direct and indirect exposure experiment, where Asellus aquaticus was exposed to Tebuconazole through both water and diet.

Data Files and Structure:
The file “Rangefinding.csv” contains the results of the range-finding study. It has the following columns:
Number: Unique identifier for each entry.
Image_name: Filename of the image taken for each isopod.
Plate_number: Identifier for the experimental plate used.
Fungicide: The type of fungicide used (e.g., Tebuconazole).
Concentration: Fungicide concentration (µg/L).
Initial_size, Final_size: Measured body length in cm (using FIJI v1.54d) of individuals before and after exposure.
Growth_rate: Calculated growth over the study.
Initial_leaf_area, Final_leaf_area: Measured area of leaf discs in mm² (using FIJI v1.54d) before and after exposure.
Feeding_rate: Rate of leaf consumption during the study.
Survival: Binary (1 = alive, 0 = dead).

The file “Tebuconazole final data mod.csv” contains results from the direct and indirect exposure study. It has the following columns:
Number: Unique identifier for each entry.
Image_name: Filename of the image taken for each isopod.
Plate_number: Identifier for the experimental plate used.
Treatment: Combination of diet and exposure.
Diet: Specifies the diet treatment conditions.
Exposure: Specifies the exposure conditions.
Initial_area, Final_area: Measured body area in mm² (using Phenopype v3.3.4) of individuals before and after exposure.
Growth_rate: Calculated growth over the study period.
Initial_leaf_area, Final_leaf_area: Measured area of leaf discs in mm² (using FIJI v1.54d) before and after exposure.
Feeding_rate: Rate of leaf consumption during the study.
Pigmentation_Initial, Pigmentation_final: Measured pigmentation values (using Phenopype v3.3.4) before and after exposure.
Pigmentation_rate: Change in pigmentation measured over the study period.
Survival, Moulting: Binary indicators of survival and moulting.

Methodology:
Data Collection: In both experiments, individual Asellus aquaticus were exposed to varying concentrations of fungicides, and the response variables were measured weekly using digital photographs and analyzed using image analysis software.
Data Processing: Growth rates were calculated based on the differences between initial and final sizes, and feeding rates were based on leaf area consumed. Pigmentation was measured using grayscale values.

Access and Licensing: The dataset is licensed under [CC BY 4.0], allowing reuse with proper attribution. DOI for the dataset will be available after publication on JYX.

Contact Information: For further details or inquiries, contact:
Akshay Mohan - akmohank@jyu.fi
Blake Matthews - blake.matthews@eawag.ch
Katja Räsänen - katja.j.rasanen@jyu.fi
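The data-processing step described above can be sketched as follows. This is an illustration only: the description says growth was "based on the differences between initial and final sizes" and feeding on "leaf area consumed" but gives no exact formula, so the plain differences computed here (with hypothetical output names `size_change` and `leaf_area_consumed`) are an assumption and may not reproduce the dataset's own Growth_rate and Feeding_rate columns, which could be normalised by duration or initial size. The CSV rows are made-up placeholder values.

```python
import csv
import io

# Made-up rows using column names from Rangefinding.csv; values are placeholders.
EXAMPLE_CSV = """Number,Fungicide,Concentration,Initial_size,Final_size,Initial_leaf_area,Final_leaf_area,Survival
1,Tebuconazole,10,0.5,0.75,50.0,32.5,1
2,Tebuconazole,100,0.5,0.5,50.0,48.0,0
"""


def add_derived_columns(rows):
    """Append simple difference-based measures to each row (assumed formulas)."""
    derived = []
    for row in rows:
        new_row = dict(row)
        # Growth as final minus initial body length (cm).
        new_row["size_change"] = float(row["Final_size"]) - float(row["Initial_size"])
        # Feeding as leaf area removed from the disc (mm^2).
        new_row["leaf_area_consumed"] = (
            float(row["Initial_leaf_area"]) - float(row["Final_leaf_area"])
        )
        derived.append(new_row)
    return derived


rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
derived_rows = add_derived_columns(rows)
# Keep only individuals recorded as alive (Survival == 1).
survivors = [r for r in derived_rows if r["Survival"] == "1"]
```

For the real file, `io.StringIO(EXAMPLE_CSV)` would be replaced by an open file handle on Rangefinding.csv.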
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset projects the Global Environmental Stratification (GEnS) (Metzger, 2013) onto palaeoclimatic models of the mid-Holocene and Last Glacial Maximum. A Random Forest Classifier was applied to nine PMIP3 climate scenarios for the mid-Holocene and three scenarios for the Last Glacial Maximum at a 5 arc-minute resolution, following the methodology of Soteriades et al. (2017). This dataset summarises palaeo-environmental change for the mid-Holocene and Last Glacial Maximum using categorical classes of Environmental Zones and Strata.
195 UV-bright stars have been found on two-color 48-inch Schmidt plates centered on the galactic plane, and on one high-latitude plate. This catalog contains sources with (U-B) in the range 0 to -1.5.
Introduction: The Sandage two-color photographic survey was originally made in support of the UHURU X-ray satellite in order to identify the optical counterparts of the detected X-ray sources found in the galactic plane. During inspection of the plates, however, many UV-bright objects fainter than 10th magnitude were seen in the general field. A larger image in the U filter suggested the possibility of a bluer object, as in the case of low-luminosity stars, white dwarfs, novae, CVs, normal early B stars, etc. As these are interesting in themselves, it was decided to publish a catalog for the use of other observers. This multi-color photographic technique has been described, for example, by Haro and Herbig (1955). The survey was concentrated on objects with m(B)~10 or fainter. It employed the Palomar 48-inch (Oschin) Schmidt telescope and was centered on the galactic plane, with overlapping regions covering galactic latitudes of ±9 degrees and extending throughout most of the northern plane (l = 0 deg - 227 deg). Plates were taken primarily by J. Kristian, A.R. Sandage, R.J. Brucato, and Lanning. The data presented here were found following a careful examination of the plates, but it should not be assumed that these data represent a complete survey of the fields examined. The categories were roughly calibrated against photoelectric (U-B) measures, but a full-scale calibration program, including magnitude effects, was not carried out. The numerical (U-B) limits of the tables should therefore not be taken as precise. The blue magnitude of the sources in the finding list has been estimated using these photoelectric values as a guide, but should be considered accurate to only ±0.5 mag because of the difficulty of adjusting to the various plate characteristics.
Positions were measured from images retrieved from the Space Telescope Science Institute collection of Guide Star digital plate scans. The accuracy of positions from the Guide Star Catalog images has been estimated to be on the order of 0.2-0.8 arcsec (Russell et al. 1990). Information provided by Bidelman (private communication) resulted in the discovery that 15 positions for objects listed in Paper II were in error. Investigation indicated that an incorrect header was associated with the scan of the Guide Star plate originally archived onto optical disk. The incorrect astrometric solution, based on an incorrect origin point, was subsequently applied in the positional determination when centroiding the object. The average offset in right ascension is 14.17 seconds of time, with no detectable trend in the numbers. The offsets in declination range from +6.56 arcseconds through zero to -6.85 arcseconds as one progresses from west to east across the plate. This is consistent with a rotation having been introduced into the bad plate solution. Objects with incorrect positions included Lanning 96, 97, 98, 99, 100, 102, 104, 108, 111, 113, 114, 115, 116, 119, and 122. The file uv.dat contains the corrected coordinates.
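To put the right-ascension offset quoted above in angular terms, the following sketch converts seconds of time to arcseconds (1 second of time corresponds to 15 arcseconds of right ascension) and optionally scales by cos(declination) to give the on-sky separation. The function name is hypothetical, and the declination in the example is an illustrative value, not one taken from the catalog.

```python
import math

# 24 h of right ascension span 360 degrees, so 1 s of time = 15 arcsec.
ARCSEC_PER_TIME_SECOND = 15.0


def ra_offset_arcsec(offset_time_seconds, dec_degrees=None):
    """Convert an RA offset in seconds of time to arcseconds.

    With dec_degrees given, the coordinate offset is scaled by
    cos(declination) to approximate the on-sky angular separation.
    """
    coordinate_offset = offset_time_seconds * ARCSEC_PER_TIME_SECOND
    if dec_degrees is None:
        return coordinate_offset
    return coordinate_offset * math.cos(math.radians(dec_degrees))


# Average RA offset reported for the affected Lanning objects.
print(round(ra_offset_arcsec(14.17), 2))
# Same offset projected on the sky at a hypothetical declination of +30 deg.
print(round(ra_offset_arcsec(14.17, dec_degrees=30.0), 2))
```

Even after the cos(declination) projection, an offset of this size is far larger than the 0.2-0.8 arcsec accuracy quoted for Guide Star positions, which is why the affected entries needed corrected coordinates.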
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.
The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact of different types of image data on deep convolutional neural network (DCNN) performance.
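Because the raw intensity and raw range patches are co-registered pixel to pixel, one simple way to combine them is to stack them as channels of a single patch. The sketch below illustrates that idea only: it assumes channel-wise stacking, which may differ from the fusion procedure actually used in [2,3], and the function name and toy values are hypothetical.

```python
def fuse_patches(intensity, range_img):
    """Stack two co-registered single-channel patches into one 2-channel patch.

    Each input is a list of rows; each output pixel is an
    (intensity, range) tuple. Channel stacking is an assumption made
    for illustration, not the documented FIND fusion method.
    """
    same_height = len(intensity) == len(range_img)
    same_widths = all(len(a) == len(b) for a, b in zip(intensity, range_img))
    if not (same_height and same_widths):
        raise ValueError("patches must have identical dimensions")
    return [
        [(i_val, r_val) for i_val, r_val in zip(row_i, row_r)]
        for row_i, row_r in zip(intensity, range_img)
    ]


# Toy 2x2 patches standing in for the 256x256 FIND patches (made-up values).
intensity_patch = [[120, 118], [90, 40]]
range_patch = [[0.02, 0.01], [-0.10, -0.85]]

fused = fuse_patches(intensity_patch, range_patch)
print(fused[1][1])  # intensity and range values at the same pixel location
```

The dimension check reflects the co-registration property: stacking is only meaningful when both patches cover exactly the same pixel grid.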
If you share or use this dataset, please cite [4] and [5] in any relevant documentation.
An image dataset for crack classification has also been published [6].
References:
[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873
[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605
[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434
[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678
[5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044
[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78