100+ datasets found

Genome Aggregation Database
neuinfo.org
scicrunch.org
Updated Jul 19, 2018
Cite
(2018). Genome Aggregation Database [Dataset]. http://identifiers.org/RRID:SCR_014964
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014964
Dataset updated
Jul 19, 2018
Description
Database that aggregates exome and genome sequencing data from large-scale sequencing projects. The gnomAD data set contains individuals sequenced using multiple exome capture methods and sequencing chemistries. Raw data from the projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.
Data from: Prediction of Protein Aggregation Propensity via Data-Driven...
acs.figshare.com
zip
Updated Oct 16, 2023
Cite
Seungpyo Kang; Minseon Kim; Jiwon Sun; Myeonghun Lee; Kyoungmin Min (2023). Prediction of Protein Aggregation Propensity via Data-Driven Approaches [Dataset]. http://doi.org/10.1021/acsbiomaterials.3c01001.s002
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acsbiomaterials.3c01001.s002
Dataset updated
Oct 16, 2023
Dataset provided by
ACS Publications
Authors
Seungpyo Kang; Minseon Kim; Jiwon Sun; Myeonghun Lee; Kyoungmin Min
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Protein aggregation occurs when misfolded or unfolded proteins physically bind together and can promote the development of various amyloid diseases. This study aimed to construct surrogate models for predicting protein aggregation via data-driven methods using two types of databases. First, an aggregation propensity score database was constructed by calculating the scores for protein structures in the Protein Data Bank using Aggrescan3D 2.0. Moreover, feature- and graph-based models for predicting protein aggregation have been developed by using this database. The graph-based model outperformed the feature-based model, resulting in an R² of 0.95, although it intrinsically required protein structures. Second, for the experimental data, a feature-based model was built using the Curated Protein Aggregation Database 2.0 to predict the aggregated intensity curves. In summary, this study suggests approaches that are more effective in predicting protein aggregation, depending on the type of descriptor and the database.
Genome Aggregation Database
bioregistry.io
Updated Dec 19, 2022
Cite
(2022). Genome Aggregation Database [Dataset]. https://bioregistry.io/gnomad
Explore at:
Dataset updated
Dec 19, 2022
License
https://bioregistry.io/spdx:CC0-1.0
Description
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects, and making summary data available for the wider scientific community (from https://gnomad.broadinstitute.org).
Ultimate Rough Aggregation of Metabolic Map
neuinfo.org
dknet.org
Updated Jan 29, 2022
Cite
(2022). Ultimate Rough Aggregation of Metabolic Map [Dataset]. http://identifiers.org/RRID:SCR_014694
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014694
Dataset updated
Jan 29, 2022
Description
Metabolic pathway map that collects metabolic data gathered from multiple public databases and organizes them in one central location.
Aggregate Functions JOINS and SET operations
paper.erudition.co.in
html
Updated May 1, 2024
Cite
Einetic (2024). Aggregate Functions JOINS and SET operations [Dataset]. https://paper.erudition.co.in/makaut/bachelor-in-business-administration-hons-2023-2024/4/database-management-with-sql
Explore at:
Available download formats: html
Dataset updated
May 1, 2024
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question paper solutions for the chapter "Aggregate Functions JOINS and SET operations" of Database Management with SQL, 4th Semester, Bachelor in Business Administration (Hons.) 2023-2024.
Genome Aggregation Database (gnomAD) - Data Lakehouse Ready
registry.opendata.aws
Updated Sep 13, 2021
Cite
Amazon Web Services (2021). Genome Aggregation Database (gnomAD) - Data Lakehouse Ready [Dataset]. https://registry.opendata.aws/gnomad-data-lakehouse-ready/
Explore at:
Dataset updated
Sep 13, 2021
Dataset provided by
Amazon Web Services http://aws.amazon.com/
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. This dataset was derived from summary data from gnomAD release 3.1, available on the Registry of Open Data on AWS for ready enrollment into the Data Lake as Code.
Daily aggregation notebook
search.dataone.org
hydroshare.org
Updated Dec 5, 2021
Cite
Jeff Sadler (2021). Daily aggregation notebook [Dataset]. https://search.dataone.org/view/sha256%3A4859cfa9813183a6f7473feb0daa227bbabd13d9054dd2ca5f395a19933223e8
Explore at:
Dataset updated
Dec 5, 2021
Dataset provided by
Hydroshare
Authors
Jeff Sadler
Description
Python 2 Jupyter notebook that aggregates sub-daily time series observations up to a daily time scale. The code was originally written to aggregate data held in the SQLite database stored in this resource: https://www.hydroshare.org/resource/9e1b23607ac240588ba50d6b5b9a49b5/
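As a rough illustration of what aggregating sub-daily observations to a daily scale can look like, here is a minimal Python 3 sketch using only the standard library. It is not the notebook's code: the table name `observations`, the columns `obs_time` and `value`, and the choice of a daily mean are all assumptions made for the example.

```python
import sqlite3


def daily_means(conn, table="observations"):
    """Return [(day, mean_value, n_obs), ...] ordered by day.

    Assumes a hypothetical table with an ISO-8601 text column `obs_time`
    and a numeric column `value`; the real resource may be laid out differently.
    """
    query = (
        "SELECT date(obs_time) AS day, AVG(value), COUNT(*) "
        f"FROM {table} GROUP BY day ORDER BY day"
    )
    return conn.execute(query).fetchall()


# Small in-memory demonstration with made-up readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (obs_time TEXT, value REAL)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [
        ("2021-12-01 00:00:00", 1.0),
        ("2021-12-01 12:00:00", 3.0),
        ("2021-12-02 06:00:00", 5.0),
    ],
)
daily = daily_means(conn)
```

Swapping `AVG` for `SUM`, `MIN`, or `MAX` would give other daily statistics; which one is appropriate depends on the variable being aggregated.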
Global Aggregation of Stream Silica (GlASS) (ver. 2.0, July 2025)
data.usgs.gov
catalog.data.gov
Updated Jul 11, 2025
Cite
Kathi Jankowski; Keira Johnson; Joanna Carey; Nicholas Lyon; Paul Julian; Sidney Bush; Lienne Sethna; Angel Chen; Adam Wymore; Pirkko Kortelainen; Hjalmar Laudon; Amanda Poste; Diane McKnight; William McDowell; Arial Shogren; Ruth Heindel; Antti Raike; Jeremy Jones; Fred Worrall; Luke Mosley; Pamela Sullivan (2025). Global Aggregation of Stream Silica (GlASS) (ver. 2.0, July 2025) [Dataset]. http://doi.org/10.5066/P138M8AR
Explore at:
Unique identifier
https://doi.org/10.5066/P138M8AR
Dataset updated
Jul 11, 2025
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Authors
Kathi Jankowski; Keira Johnson; Joanna Carey; Nicholas Lyon; Paul Julian; Sidney Bush; Lienne Sethna; Angel Chen; Adam Wymore; Pirkko Kortelainen; Hjalmar Laudon; Amanda Poste; Diane McKnight; William McDowell; Arial Shogren; Ruth Heindel; Antti Raike; Jeremy Jones; Fred Worrall; Luke Mosley; Pamela Sullivan
License
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Time period covered
Jan 1, 1963 - Dec 1, 2023
Description
Riverine silicon (Si) plays a vital role in governing primary production, water quality, and carbon sequestration. The Global Aggregation of Stream Silica (GlASS) database was constructed to assess changes in riverine Si concentrations and fluxes, their relationship to available nutrients, and to evaluate mechanisms driving these patterns. GlASS includes dissolved Si (DSi), dissolved inorganic nitrogen, and dissolved inorganic phosphorus concentrations at daily to quarterly time steps, daily discharge, and watershed characteristics for rivers with drainage areas ranging from less than 1 square kilometer to more than 4 million square kilometers and spanning nine climate zones. Chemistry and discharge data range between years 1963 and 2024. Watershed and climate data range between 1948 and 2024. GlASS uses publicly available datasets, ensuring transparency and reproducibility. Original data sources are cited, data quality assurance workflows are public, and input files to a common load ...
Microwave Single Scattering Properties Database (Horizontally Aligned...
zenodo.org
nc
Updated Jan 8, 2023
Cite
Kamil Mroz; Jussi Leinonen (2023). Microwave Single Scattering Properties Database (Horizontally Aligned Aggregates of Dendrites) [Dataset]. http://doi.org/10.5281/zenodo.7510186
Explore at:
Available download formats: nc
Unique identifier
https://doi.org/10.5281/zenodo.7510186
Dataset updated
Jan 8, 2023
Dataset provided by
Zenodo http://zenodo.org/
Authors
Kamil Mroz; Jussi Leinonen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The database contains physical and microwave single scattering properties of horizontally aligned frozen hydrometeors as large as 11 cm in diameter.

A description of the aggregation model used for particle generation can be found in:
Leinonen, J., and Szyrmer, W. (2015), Radar signatures of snowflake riming: A modeling study, Earth and Space Science, 2, 346–358, doi:10.1002/2015EA000102.
The code used for particle generation is freely available at: https://github.com/jleinonen/aggregation

The scattering properties of the particles were computed with the discrete dipole approximation using the ADDA software package (https://github.com/adda-team/adda).

The terminal velocity of the snowflakes was computed using four hydrodynamical models implemented as part of the snowScatt library (https://github.com/OPTIMICe-team/snowScatt).

Approximately one half of the snowflake structure files and one quarter of scattering properties (for X, Ku, Ka and W band) were generated for the publication of Leinonen and Szyrmer (2015). The remaining part of the dataset was generated using the ALICE High Performance Computing Facility at the University of Leicester.
DataSheet_1_AgTC and AgETL: open-source tools to enhance data collection and...
frontiersin.figshare.com
pdf
Updated Feb 21, 2024
Cite
Luis Vargas-Rojas; To-Chia Ting; Katherine M. Rainey; Matthew Reynolds; Diane R. Wang (2024). DataSheet_1_AgTC and AgETL: open-source tools to enhance data collection and management for plant science research.pdf [Dataset]. http://doi.org/10.3389/fpls.2024.1265073.s001
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fpls.2024.1265073.s001
Dataset updated
Feb 21, 2024
Dataset provided by
Frontiers
Authors
Luis Vargas-Rojas; To-Chia Ting; Katherine M. Rainey; Matthew Reynolds; Diane R. Wang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advancements in phenotyping technology have enabled plant science researchers to gather large volumes of information from their experiments, especially those that evaluate multiple genotypes. To fully leverage these complex and often heterogeneous data sets (i.e. those that differ in format and structure), scientists must invest considerable time in data processing, and data management has emerged as a considerable barrier for downstream application. Here, we propose a pipeline to enhance data collection, processing, and management from plant science studies comprising two newly developed open-source programs. The first, called AgTC, is a series of programming functions that generates comma-separated values file templates to collect data in a standard format using either a lab-based computer or a mobile device. The second series of functions, AgETL, executes steps for an Extract-Transform-Load (ETL) data integration process where data are extracted from heterogeneously formatted files, transformed to meet standard criteria, and loaded into a database. There, data are stored and can be accessed for data analysis-related processes, including dynamic data visualization through web-based tools. Both AgTC and AgETL are flexible for application across plant science experiments without programming knowledge on the part of the domain scientist, and their functions are executed on Jupyter Notebook, a browser-based interactive development environment. Additionally, all parameters are easily customized from central configuration files written in the human-readable YAML format. Using three experiments from research laboratories in university and non-government organization (NGO) settings as test cases, we demonstrate the utility of AgTC and AgETL to streamline critical steps from data collection to analysis in the plant sciences.
Diamond physical and chemical characteristics database
borealisdata.ca
search.dataone.org
Updated Jul 11, 2022
Cite
Thomas Stachel (2022). Diamond physical and chemical characteristics database [Dataset]. http://doi.org/10.7939/DVN/B8VYHV
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7939/DVN/B8VYHV
Dataset updated
Jul 11, 2022
Dataset provided by
Borealis
Authors
Thomas Stachel
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
A database of physical and IR-spectroscopic characteristics (including nitrogen content and aggregation state) and stable isotopic compositions (δ13C and δ15N) of principally inclusion-bearing diamonds.
National Mortgage Database Aggregate Statistics
catalog.data.gov
Updated Mar 11, 2025
Cite
Federal Housing Finance Agency (2025). National Mortgage Database Aggregate Statistics [Dataset]. https://catalog.data.gov/dataset/national-mortgage-database-aggregate-statistics
Explore at:
Dataset updated
Mar 11, 2025
Dataset provided by
Federal Housing Finance Agency https://www.fhfa.gov/
Description
The National Mortgage Database (NMDB®) is a nationally representative five percent sample of residential mortgages in the United States.
Genome of A
figshare.com
txt
Updated Apr 19, 2025
Cite
lihao Guo (2025). Genome of A [Dataset]. http://doi.org/10.6084/m9.figshare.28828337.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.28828337.v1
Dataset updated
Apr 19, 2025
Dataset provided by
Figshare http://figshare.com/
Authors
lihao Guo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Genome assembly
Data from: Leaf area predicts conspecific spatial aggregation of woody...
data.niaid.nih.gov
search.dataone.org
zip
Updated Jul 16, 2024
Cite
Jingjing Xi; Collin Li; Min Wang; Stavros Veresoglou (2024). Leaf area predicts conspecific spatial aggregation of woody species [Dataset]. http://doi.org/10.5061/dryad.4b8gthtn2
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4b8gthtn2
Dataset updated
Jul 16, 2024
Dataset provided by
Sun Yat-sen University
Authors
Jingjing Xi; Collin Li; Min Wang; Stavros Veresoglou
License
https://spdx.org/licenses/CC0-1.0.html
Description
Aim: Addressing how woody plant species are distributed in space can reveal inconspicuous drivers that structure plant communities. The spatial structure of conspecifics varies not only at local scales across co-existing plant species but also at larger biogeographical scales with climatic parameters and habitat properties. The possibility that biogeographical drivers shape the spatial structure of plants, however, has not received sufficient attention. Location: Global synthesis. Time period: 1997 - 2022. Major taxa studied: Woody angiosperms and conifers. Methods: We carried out a quantitative synthesis to capture the interplay between local scale and larger scale drivers. We modelled conspecific spatial aggregation as a binary response through logistic models and Ripley’s L statistics and the distance at which the point process was least random with mixed effects linear models. Our predictors covered a range of plant traits, climatic predictors and descriptors of the habitat. Results: We hypothesized that plant traits, when summarized by local scale predictors, exceed in importance biogeographical drivers in determining the spatial structure of conspecifics across woody systems. This was only the case in relation to the frequency with which we observe aggregated distributions. The probability of observing spatial aggregation and the intensity of it was higher for plant species with large leaves but further depended on climatic parameters and mycorrhiza. Main Conclusions: Compared to climatic variables, plant traits perform poorly in explaining the spatial structure of woody plant species, even though leaf area is a decisive plant trait that is related to whether we observe homogenous spatial aggregation and its intensity. 
Despite the limited variance explained by our models, we found that the spatial structure of woody plants is subject to consistent biogeographical constraints and that these exceed beyond descriptors of individual species, which we captured here through leaf area. Methods On the 8th of September 2022 we carried out a search in the Web of Science with the search string “(Ripley's K function) AND (forest)”. The search yielded 356 hits. We screened those 356 studies for eligibility, first based on the suitability of their article titles and second based on their abstracts (Figure S1). The 240 eligible studies were subsequently screened manually upon reading the entire article based on the following inclusion criteria: (1) The study reported on univariate Ripley's K or L statistics or else it was possible to extract those from figures or maps. (2) The study had been carried out in a woody ecosystem or a rangeland. (3) The univariate Ripley’s K statistics described the distribution of individuals from a single plant species. (4) The authors named the plant species for which the univariate Ripley's K statistics had been described. (5) The landscape (for example a logging area) did not induce conspicuous point processes that could not be corrected within the analysis. We manually processed the remaining 240 studies through reading the main text which reduced the final number of eligible studies to 69. A list of those data sources can be found in Appendix Three. From those studies we extracted the following moderators and we fitted them as predictors in subsequent models: Mean annual temperature: continuous variable. When unreported, we extracted the variable based on coordinates from WorldClim (Fick & Hijmans, 2017). Total annual precipitation: continuous variable. When unreported, we extracted the variable based on coordinates from WorldClim (Fick & Hijmans, 2017). Latitude of the study location: continuous variable. 
When unreported, we extracted the information based on the closest location reported. Longitude of the study location: continuous variable. When unreported, we extracted the information based on the closest location reported. Site area: continuous variable. We extracted the site area from the studies and converted it into a unified unit, square meter. Tree species: categorical variable. Plant traits: we collected data on 7 traits: leaf area (i.e. the size of the leaves), seed mass, wood density, leaf mass per area, tree height, plant species biomass and stem specific density. We first gathered data on tree height, seed mass and leaf area from the subset of common species in TRY (Díaz et al., 2022). We subsequently searched the SID database (Royal Botanic Gardens Kew, 2023) for seed mass data and the ICRAF database (Ketterings et al., 2001) for wood density data. In the cases where we observed no records in those databases we checked the EOL database (http://eol.org). For leaf area, leaf mass per area, tree height, plant species biomass and stem specific density, we extracted them from the EOL database (http://eol.org). We chose these traits to cover as many trait syndromes as possible, but the main criterion which we used to decide on the traits was the feasibility of acquiring them for the plant species in our database. Woody system age: categorical variable. We classified non-woody habitats, plantations and systems that had recently experienced serious disturbances as “young” whereas natural forests or woody stands that had reached maturity as “old”. Mycorrhiza type: categorical variable. We extracted mycorrhizal types for each species from Wang and Qiu (2006). In the cases that we could find no mycorrhizal classification information in the database at a species level we searched instead the database compiled by Delavaux et al. (2021) containing information at a genus level. 
We only extracted mycorrhizal classifications if these supported a single mycorrhizal type at a minimum probability of 85%. Otherwise, we left the plant species unclassified in relation to mycorrhiza. Ripley's L effect size: continuous variable. We first calculated for all distances the ratio between the (1) difference between the Ripley's L statistic and the width of the 95% CI envelope divided by two and (2) the difference between the upper and lower points of the envelope divided by two. A large absolute value suggests a strong deviation from randomness whereas any value below 1 suggests a random process. We identified the location where the absolute value of this ratio was maximum. Ripley's L statistic: continuous variable. We transformed Ripley's K statistics (when they had not been transformed) into Ripley's L statistics. We only used the value at the location where we observed the maximum in absolute value Ripley's K effect size. Distance when Ripley's L peaked: continuous variable describing the distance at which we observed the maximum in absolute value Ripley's L effect size. Köppen climate zone: a categorical variable with 4 levels describing the main climatic zones based on the Köppen classification: A (tropical climates); B (arid climates); C (temperate climates); D (continental climates). We extracted those from the raster files published by Beck et al. (2018). In the cases that we observed multiple values in databases (referring here mainly to plant trait values) per species, we used the median value. In the cases when we had to digitize plots to extract data, we did so with Plot Digitizer v2.6.8.
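The K-to-L transformation mentioned above can be sketched as follows. This assumes the common two-dimensional definition L(r) = sqrt(K(r) / π), under which a completely random pattern gives L(r) = r; the description does not state which variant the authors used, and the function names below are hypothetical.

```python
import math


def ripley_l(k_value):
    """Convert a Ripley's K value to L using L(r) = sqrt(K(r) / pi).

    Assumption: the standard 2-D variance-stabilising form; some studies
    report the centred version L(r) - r instead.
    """
    return math.sqrt(k_value / math.pi)


def centred_l(k_value, r):
    """Centred form L(r) - r, which is 0 under complete spatial randomness."""
    return ripley_l(k_value) - r


# Under complete spatial randomness K(r) = pi * r**2, so L(r) equals r.
r = 2.0
l_at_r = ripley_l(math.pi * r ** 2)
```

Positive values of the centred form indicate aggregation at distance r, negative values indicate regularity.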
EPiC database - Recycled aggregate
figshare.unimelb.edu.au
resodate.org
pdf
Updated Dec 10, 2020
Cite
Robert Crawford; André Stephan; Fabian Prideaux (2020). EPiC database - Recycled aggregate [Dataset]. http://doi.org/10.26188/5da557263ad43
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.26188/5da557263ad43
Dataset updated
Dec 10, 2020
Dataset provided by
The University of Melbourne
Authors
Robert Crawford; André Stephan; Fabian Prideaux
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This material is part of the free Environmental Performance in Construction (EPiC) Database. The EPiC Database contains embodied environmental flow coefficients for 250+ construction materials using a comprehensive hybrid life cycle inventory approach. Recycled aggregate is a cheap and readily available product made from recycled construction materials. It is strong and durable with excellent drainage properties. It is typically comprised of concrete, stone, brick, ceramics, mortar and other common construction materials. It is produced using the waste materials collected from the demolition of building and infrastructure projects. Impurities such as metal, wood and timber are removed via magnets and other sorting techniques. The remaining materials are sorted by size, and crushed into a coarse aggregate. Recycled aggregate is becoming increasingly popular as a replacement for natural aggregates. It is commonly used for: bulk fill, road construction, gravel, and as an aggregate in concrete. When used in concrete, it is typically combined with fly ash or other additives to ensure improved strength and reliability.
FHFA: National Mortgage Database (NMDB®) Aggregate Statistics
datalumos.org
openicpsr.org
Updated Feb 20, 2025
Cite
Federal Housing Finance Agency (2025). FHFA: National Mortgage Database (NMDB®) Aggregate Statistics [Dataset]. http://doi.org/10.3886/E220243V1
Explore at:
Unique identifier
https://doi.org/10.3886/E220243V1
Dataset updated
Feb 20, 2025
Dataset authored and provided by
Federal Housing Finance Agency https://www.fhfa.gov/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The National Mortgage Database (NMDB®) is a nationally representative five percent sample of residential mortgages in the United States. Publication of aggregate data from NMDB is a step toward implementing the statutory requirements of section 1324(c) of the Federal Housing Enterprises Financial Safety and Soundness Act of 1992, as amended by the Housing and Economic Recovery Act of 2008. The statute requires FHFA to conduct a monthly mortgage market survey to collect data on the characteristics of individual mortgages, both Enterprise and non-Enterprise, and to make the data available to the public while protecting the privacy of the borrowers.
Notes:
1) All CSV file headers are now standardized as described in the Data Dictionary and Technical Notes and all CSV files are zipped.
2) Alternate wide format CSV files are available. The wide format may be more easily opened by MS Excel.
Thematic Woody Aggregation for DR Congo - AFRICOVER
data.apps.fao.org
Updated Jul 20, 2024
Cite
(2024). Thematic Woody Aggregation for DR Congo - AFRICOVER [Dataset]. https://data.apps.fao.org/map/catalog/srv/resources/datasets/2f2d5683-67c0-4760-98b3-4ef98a31a96e
Explore at:
Dataset updated
Jul 20, 2024
Description
This dataset is a thematic reaggregated version of the original national Africover landcover multipurpose database. It contains all natural vegetation with a woody component. The original full resolution land cover has been produced from visual interpretation of digitally enhanced LANDSAT TM images (Bands 4,3,2) acquired mainly in the period 2000-2001 (see the "Multipurpose Landcover Database" metadata for more details). This dataset is intended for free public access. Thematic aggregation is the way that the end user customizes the Africover database to fulfil his/her specific requirements. The Africover database gives equal level of detail to Agriculture as well as Natural vegetation or Bare Areas etc. Generally a single user does not need this level of detail for each class type; therefore he/she will enhance the information of one land cover type and will generalize or erase the information related to other land cover aspects. The most powerful way to conduct an aggregation exercise is to use the classifiers as basic elements of the exercise. This gives the user the maximum flexibility on the use of data.
The shape main attributes correspond to the following fields:
- ID
- HECTARES
- WOODY_ID
- WOODY_DESC
You can download a zip archive containing:
- the drc-cult-agg (.shp)
- the DR Congo Classifiers Used (.pdf)
- the DR Congo legend (.pdf and .xls)
- the DR Congo Legend - LCCS Import file (.xls)
- the LCCSglossary_drcongo (.pdf)
- the thematic-aggregation-procedure (.pdf)
- the thematic-aggregation-annex1 (.pdf)
- the thematic-aggregation-annex2 (.pdf)
- the Userlabel Definitions (.pdf)
Database of physicochemical and optical properties of black carbon fractal...
data.niaid.nih.gov
Updated Jun 20, 2023
Cite
Baseerat Romshoo; Thomas Müller; Jaikrishna Patil; Tobias Michels; Marius Kloft; Mira Pöhlker (2023). Database of physicochemical and optical properties of black carbon fractal aggregates [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7523057
Explore at:
Dataset updated
Jun 20, 2023
Authors
Baseerat Romshoo; Thomas Müller; Jaikrishna Patil; Tobias Michels; Marius Kloft; Mira Pöhlker
Description
In order to estimate the climate impact of highly absorbing black carbon (BC) aerosols, it is necessary to know their optical properties. The Lorentz-Mie theory, often used to calculate the optical properties of BC under the spherical morphological assumption, produces discrepancies when compared to measurements. In light of this, researchers are currently investigating the possibility of computing the optical properties of BC using a realistic fractal aggregate morphology. To determine the optical properties of such BC fractal aggregates, the Multiple Sphere T-Matrix method (MSTM) is used, which can take more than 24 hours for a single simulation depending on the aggregate properties. This study provides a highly accurate benchmark machine-learning algorithm that can be used to generate the optical properties of BC fractal aggregate in a fraction of a second. The machine learning algorithm was trained over an extensive database of physicochemical and optical properties of BC fractal aggregates. The extensive training data helped develop an ML algorithm that can accurately predict the optical properties of BC fractal aggregates with an average deviation of less than one percent from their actual values. Specifically, the ML algorithm provides the option to generate the optical properties in the visible spectrum using either kernel ridge regression (KRR) or artificial neural networks (ANN) for a BC fractal aggregate of desired physicochemical properties like size, morphology, and organic coating. The dataset of physicochemical and optical properties of BC fractal aggregates are provided here. The developed ML algorithm for predicting the optical properties of BC fractal aggregates (https://github.com/jaikrishnap/Machine-learning-for-prediction-of-BCFAs) is highly useful for real-world applications due to its wide parameter range, high accuracy, and low computational cost.

Contents

database_optical_properties_black_carbon_fractal_aggregtates.csv, data file, comma-separated values

database_header.txt, metadata, text

Citation for the database:

Romshoo, B., Müller, T., Patil, J., Michels, T., Kloft, M., and Pöhlker, M.: Database of physicochemical and optical properties of black carbon fractal aggregates, Dataset, https://doi.org/10.5281/zenodo.7523058, 2023.
Data_Sheet_1_Integrated Analysis of Multiple Microarray Studies to Identify...
frontiersin.figshare.com
txt
Updated Jun 3, 2023
Cite
Zi-An Chen; Yu-Feng Sun; Quan-Xu Wang; Hui-Hui Ma; Zhi-Zhao Ma; Chuan-Jie Yang (2023). Data_Sheet_1_Integrated Analysis of Multiple Microarray Studies to Identify Novel Gene Signatures in Ulcerative Colitis.CSV [Dataset]. http://doi.org/10.3389/fgene.2021.697514.s001
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.3389/fgene.2021.697514.s001
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers Mediahttp://www.frontiersin.org/
Authors
Zi-An Chen; Yu-Feng Sun; Quan-Xu Wang; Hui-Hui Ma; Zhi-Zhao Ma; Chuan-Jie Yang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Ulcerative colitis (UC) is a chronic, complicated, inflammatory disease with an increasing incidence and prevalence worldwide. However, the intrinsic molecular mechanisms underlying the pathogenesis of UC have not yet been fully elucidated.

Methods: All UC datasets published in the GEO database were analyzed and summarized. Subsequently, the robust rank aggregation (RRA) method was used to identify differentially expressed genes (DEGs) between UC patients and controls. Gene functional annotation and protein-protein interaction (PPI) network analysis were performed to illustrate the potential functions of the DEGs. Important functional modules of the PPI network were identified by molecular complex detection (MCODE), and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed. The results of CytoHubba, a plug-in with integrated algorithms for biomolecular interaction networks, were combined with the RRA analysis to identify the hub genes. Finally, a mouse model of UC was established using dextran sulfate sodium salt (DSS) solution to verify the expression of the hub genes.

Results: A total of 6 datasets met the inclusion criteria (GSE38713, GSE59071, GSE73661, GSE75214, GSE87466, GSE92415). The RRA integrated analysis revealed 208 significant DEGs (132 upregulated genes and 76 downregulated genes). After the PPI network was analyzed with the MCODE plug-in, the modules with the top three scores were listed. The CytoHubba app and RRA identified six hub genes: LCN2, CXCL1, MMP3, IDO1, MMP1, and S100A8. Enrichment analysis showed that these functional modules and hub genes were mainly related to cytokine secretion, immune response, and cancer progression. In the mouse model, the expression of all six hub genes was higher in the UC group than in the control group (P < 0.05).

Conclusion: The hub genes analyzed by the RRA method are highly reliable. These findings improve the understanding of the molecular mechanisms in UC pathogenesis.
Species of Greatest Conservation Need National Database
catalog.data.gov
data.usgs.gov
Updated Nov 12, 2025
U.S. Geological Survey (2025). Species of Greatest Conservation Need National Database [Dataset]. https://catalog.data.gov/dataset/species-of-greatest-conservation-need-national-database
Dataset updated
Nov 12, 2025
Dataset provided by
U.S. Geological Survey
Description
The Species of Greatest Conservation Need National Database is an aggregation of lists from State Wildlife Action Plans. Species of Greatest Conservation Need (SGCN) are wildlife species that need conservation attention as listed in action plans. In this database, we have validated scientific names from original documents against taxonomic authorities to increase consistency among names, enabling aggregation and summary. This database does not replace the information contained in the original State Wildlife Action Plans. The database includes SGCN lists from 56 states, territories, and districts, encompassing action plans spanning from 2005 to 2022. State Wildlife Action Plans are updated at least once every 10 years by the respective wildlife agencies. The SGCN list data from these action plans have been compiled in partnership with individual wildlife management agencies, the United States Fish and Wildlife Service, and the Association of Fish and Wildlife Agencies. The SGCN National Database consists of three data tables: "source_data", "process_data", and "validated_data". Most users will likely find the "sgcn_species_all_records" table, which combines all three tables, most useful for comparing "source_" names with "validated_" names and for aggregating and summarizing by validated name. The "source_data" table provides an archive of all SGCN records listed by conservation authorities over multiple action plans, including the scientific names, common names, locations, and year of action plan. The "process_data" table incorporates processing information, including the archiving and processing dates along with persistent identifiers used for record documentation, while the "validated_data" table provides the taxonomic identities from the matched taxonomic source, including the standardized scientific name, common name, and taxonomic ranks as well as links to supplementary taxonomic information.
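
The benefit of summarizing by validated rather than source names can be sketched as follows. This is only an illustration: the column names (`record_id`, `source_scientific_name`, `state`, `validated_scientific_name`) and every row are invented here and are not taken from the actual SGCN tables, whose schema may differ.

```python
from collections import defaultdict

# Hypothetical rows standing in for the "source_data" table.
# The second row carries a made-up spelling variant of the first name.
source_data = [
    {"record_id": 1, "source_scientific_name": "Ursus americanus", "state": "State A"},
    {"record_id": 2, "source_scientific_name": "Ursus americana", "state": "State B"},
    {"record_id": 3, "source_scientific_name": "Lynx rufus", "state": "State A"},
]

# Hypothetical rows standing in for the "validated_data" table.
validated_data = [
    {"record_id": 1, "validated_scientific_name": "Ursus americanus"},
    {"record_id": 2, "validated_scientific_name": "Ursus americanus"},
    {"record_id": 3, "validated_scientific_name": "Lynx rufus"},
]

# Join the two tables on the (assumed) shared record identifier.
validated_by_id = {
    row["record_id"]: row["validated_scientific_name"] for row in validated_data
}

# Collect the set of states that list each validated name.
states_by_name = defaultdict(set)
for row in source_data:
    name = validated_by_id.get(row["record_id"])
    if name is None:
        continue  # skip records without a validated match
    states_by_name[name].add(row["state"])

state_counts = {name: len(states) for name, states in states_by_name.items()}

# Counting distinct source names instead would treat the variant as a new species.
distinct_source_names = len({row["source_scientific_name"] for row in source_data})
```

With these invented rows, grouping by validated name merges the two spellings into one species listed by two states, whereas grouping by source name would count three distinct names.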

Genome Aggregation Database

RRID:SCR_014964, biotools:gnomad, Genome Aggregation Database (RRID:SCR_014964), gnomAD, gnomAD 2.0, gnomAD Browser, gnomAD version 2.0, Exome Aggregation Consortium

76 scholarly articles cite this dataset (View in Google Scholar)

Genome Aggregation Database

Data from: Prediction of Protein Aggregation Propensity via Data-Driven...

Genome Aggregation Database

Ultimate Rough Aggregation of Metabolic Map

Aggregate Functions JOINS and SET operations

Genome Aggregation Database (gnomAD) - Data Lakehouse Ready

Daily aggregation notebook

Global Aggregation of Stream Silica (GlASS) (ver. 2.0, July 2025)

Microwave Single Scattering Properties Database (Horizontally Aligned...

DataSheet_1_AgTC and AgETL: open-source tools to enhance data collection and...

Diamond physical and chemical characteristics database

National Mortgage Database Aggregate Statistics

Genome of A

Data from: Leaf area predicts conspecific spatial aggregation of woody...

EPiC database - Recycled aggregate

FHFA: National Mortgage Database (NMDB®) Aggregate Statistics

Thematic Woody Aggregation for DR Congo - AFRICOVER

Database of physicochemical and optical properties of black carbon fractal...

Data_Sheet_1_Integrated Analysis of Multiple Microarray Studies to Identify...

Species of Greatest Conservation Need National Database
