100+ datasets found
  1. Genome Aggregation Database

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jul 19, 2018
    Cite
    (2018). Genome Aggregation Database [Dataset]. http://identifiers.org/RRID:SCR_014964
    Description

    Database that aggregates exome and genome sequencing data from large-scale sequencing projects. The gnomAD data set contains individuals sequenced using multiple exome capture methods and sequencing chemistries. Raw data from the projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.

  2. Data from: Prediction of Protein Aggregation Propensity via Data-Driven...

    • acs.figshare.com
    zip
    Updated Oct 16, 2023
    Cite
    Seungpyo Kang; Minseon Kim; Jiwon Sun; Myeonghun Lee; Kyoungmin Min (2023). Prediction of Protein Aggregation Propensity via Data-Driven Approaches [Dataset]. http://doi.org/10.1021/acsbiomaterials.3c01001.s002
    Dataset provided by
    ACS Publications
    Authors
    Seungpyo Kang; Minseon Kim; Jiwon Sun; Myeonghun Lee; Kyoungmin Min
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Protein aggregation occurs when misfolded or unfolded proteins physically bind together and can promote the development of various amyloid diseases. This study aimed to construct surrogate models for predicting protein aggregation via data-driven methods using two types of databases. First, an aggregation propensity score database was constructed by calculating the scores for protein structures in the Protein Data Bank using Aggrescan3D 2.0. Moreover, feature- and graph-based models for predicting protein aggregation have been developed by using this database. The graph-based model outperformed the feature-based model, resulting in an R² of 0.95, although it intrinsically required protein structures. Second, for the experimental data, a feature-based model was built using the Curated Protein Aggregation Database 2.0 to predict the aggregated intensity curves. In summary, this study suggests approaches that are more effective in predicting protein aggregation, depending on the type of descriptor and the database.

  3. Genome Aggregation Database

    • bioregistry.io
    Updated Dec 19, 2022
    Cite
    (2022). Genome Aggregation Database [Dataset]. https://bioregistry.io/gnomad
    License

    https://bioregistry.io/spdx:CC0-1.0

    Description

    The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects, and making summary data available for the wider scientific community (from https://gnomad.broadinstitute.org).

  4. Ultimate Rough Aggregation of Metabolic Map

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Ultimate Rough Aggregation of Metabolic Map [Dataset]. http://identifiers.org/RRID:SCR_014694
    Description

    Metabolic pathway map that collects metabolic data gathered from multiple public databases and organizes them in one central location.

  5. Aggregate Functions JOINS and SET operations

    • paper.erudition.co.in
    html
    Updated May 1, 2024
    Cite
    Einetic (2024). Aggregate Functions JOINS and SET operations [Dataset]. https://paper.erudition.co.in/makaut/bachelor-in-business-administration-hons-2023-2024/4/database-management-with-sql
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question paper solutions for the chapter "Aggregate Functions JOINS and SET operations" of Database Management with SQL, 4th Semester, Bachelor in Business Administration (Hons.) 2023-2024.

  6. Genome Aggregation Database (gnomAD) - Data Lakehouse Ready

    • registry.opendata.aws
    Updated Sep 13, 2021
    Cite
    Amazon Web Services (2021). Genome Aggregation Database (gnomAD) - Data Lakehouse Ready [Dataset]. https://registry.opendata.aws/gnomad-data-lakehouse-ready/
    Dataset provided by
    Amazon Web Services (http://aws.amazon.com/)
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. This dataset was derived from summary data from gnomAD release 3.1, available on the Registry of Open Data on AWS for ready enrollment into the Data Lake as Code.

  7. Daily aggregation notebook

    • search.dataone.org
    • hydroshare.org
    Updated Dec 5, 2021
    Cite
    Jeff Sadler (2021). Daily aggregation notebook [Dataset]. https://search.dataone.org/view/sha256%3A4859cfa9813183a6f7473feb0daa227bbabd13d9054dd2ca5f395a19933223e8
    Dataset provided by
    Hydroshare
    Authors
    Jeff Sadler
    Description

    Python 2 Jupyter notebook that aggregates sub-daily time series observations up to a daily time scale. The code was originally written to aggregate data held in the SQLite database in this resource: https://www.hydroshare.org/resource/9e1b23607ac240588ba50d6b5b9a49b5/
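
    The kind of aggregation described here can be illustrated with a minimal sketch. It is not the notebook's code: it is written for Python 3 using only the standard library, and the function name `aggregate_daily`, the choice of the mean as the daily statistic, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean


def aggregate_daily(observations, reducer=mean):
    """Group (timestamp, value) pairs by calendar day and reduce each group.

    `observations` is any iterable of (datetime, float) pairs; `reducer`
    is the daily statistic (mean here; sum, min, or max would also work).
    """
    by_day = defaultdict(list)
    for timestamp, value in observations:
        by_day[timestamp.date()].append(value)
    # Return the days in chronological order.
    return {day: reducer(values) for day, values in sorted(by_day.items())}


# Hypothetical sub-daily observations spanning two days.
sample = [
    (datetime(2021, 12, 5, 0, 15), 1.0),
    (datetime(2021, 12, 5, 12, 30), 3.0),
    (datetime(2021, 12, 6, 6, 0), 5.0),
]
daily_means = aggregate_daily(sample)
```

    Passing `reducer=sum` instead would suit quantities that accumulate over a day, such as rainfall depth.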

  8. Global Aggregation of Stream Silica (GlASS) (ver. 2.0, July 2025)

    • data.usgs.gov
    • catalog.data.gov
    Updated Jul 11, 2025
    Cite
    Kathi Jankowski; Keira Johnson; Joanna Carey; Nicholas Lyon; Paul Julian; Sidney Bush; Lienne Sethna; Angel Chen; Adam Wymore; Pirkko Kortelainen; Hjalmar Laudon; Amanda Poste; Diane McKnight; William McDowell; Arial Shogren; Ruth Heindel; Antti Raike; Jeremy Jones; Fred Worrall; Luke Mosley; Pamela Sullivan (2025). Global Aggregation of Stream Silica (GlASS) (ver. 2.0, July 2025) [Dataset]. http://doi.org/10.5066/P138M8AR
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Kathi Jankowski; Keira Johnson; Joanna Carey; Nicholas Lyon; Paul Julian; Sidney Bush; Lienne Sethna; Angel Chen; Adam Wymore; Pirkko Kortelainen; Hjalmar Laudon; Amanda Poste; Diane McKnight; William McDowell; Arial Shogren; Ruth Heindel; Antti Raike; Jeremy Jones; Fred Worrall; Luke Mosley; Pamela Sullivan
    License

    U.S. Government Works https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 1963 - Dec 1, 2023
    Description

    Riverine silicon (Si) plays a vital role in governing primary production, water quality, and carbon sequestration. The Global Aggregation of Stream Silica (GlASS) database was constructed to assess changes in riverine Si concentrations and fluxes, their relationship to available nutrients, and to evaluate mechanisms driving these patterns. GlASS includes dissolved Si (DSi), dissolved inorganic nitrogen, and dissolved inorganic phosphorus concentrations at daily to quarterly time steps, daily discharge, and watershed characteristics for rivers with drainage areas ranging from less than 1 square kilometer to more than 4 million square kilometers and spanning nine climate zones. Chemistry and discharge data range between years 1963 and 2024. Watershed and climate data range between 1948 and 2024. GlASS uses publicly available datasets, ensuring transparency and reproducibility. Original data sources are cited, data quality assurance workflows are public, and input files to a common load ...

  9. Microwave Single Scattering Properties Database (Horizontally Aligned...

    • zenodo.org
    nc
    Updated Jan 8, 2023
    Cite
    Kamil Mroz; Jussi Leinonen (2023). Microwave Single Scattering Properties Database (Horizontally Aligned Aggregates of Dendrites) [Dataset]. http://doi.org/10.5281/zenodo.7510186
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kamil Mroz; Jussi Leinonen
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The database contains physical and microwave single scattering properties of horizontally aligned frozen hydrometeors as large as 11 cm in diameter.

    A description of the aggregation model used for particle generation can be found in:
    Leinonen, J., and Szyrmer, W. (2015), Radar signatures of snowflake riming: A modeling study, Earth and Space Science, 2, 346–358, doi:10.1002/2015EA000102.
    The code used for particle generation is freely available at: https://github.com/jleinonen/aggregation

    The scattering properties of the particles were computed with the discrete dipole approximation, using the ADDA software package (https://github.com/adda-team/adda).

    The terminal velocity of the snowflakes was computed using four hydrodynamical models implemented as part of the snowScatt library (https://github.com/OPTIMICe-team/snowScatt).

    Approximately one half of the snowflake structure files and one quarter of scattering properties (for X, Ku, Ka and W band) were generated for the publication of Leinonen and Szyrmer (2015). The remaining part of the dataset was generated using the ALICE High Performance Computing Facility at the University of Leicester.

  10. DataSheet_1_AgTC and AgETL: open-source tools to enhance data collection and...

    • frontiersin.figshare.com
    pdf
    Updated Feb 21, 2024
    Cite
    Luis Vargas-Rojas; To-Chia Ting; Katherine M. Rainey; Matthew Reynolds; Diane R. Wang (2024). DataSheet_1_AgTC and AgETL: open-source tools to enhance data collection and management for plant science research.pdf [Dataset]. http://doi.org/10.3389/fpls.2024.1265073.s001
    Dataset provided by
    Frontiers
    Authors
    Luis Vargas-Rojas; To-Chia Ting; Katherine M. Rainey; Matthew Reynolds; Diane R. Wang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Advancements in phenotyping technology have enabled plant science researchers to gather large volumes of information from their experiments, especially those that evaluate multiple genotypes. To fully leverage these complex and often heterogeneous data sets (i.e. those that differ in format and structure), scientists must invest considerable time in data processing, and data management has emerged as a considerable barrier for downstream application. Here, we propose a pipeline to enhance data collection, processing, and management from plant science studies, comprising two newly developed open-source programs. The first, called AgTC, is a series of programming functions that generates comma-separated values file templates to collect data in a standard format using either a lab-based computer or a mobile device. The second series of functions, AgETL, executes steps for an Extract-Transform-Load (ETL) data integration process where data are extracted from heterogeneously formatted files, transformed to meet standard criteria, and loaded into a database. There, data are stored and can be accessed for data analysis-related processes, including dynamic data visualization through web-based tools. Both AgTC and AgETL are flexible for application across plant science experiments without programming knowledge on the part of the domain scientist, and their functions are executed on Jupyter Notebook, a browser-based interactive development environment. Additionally, all parameters are easily customized from central configuration files written in the human-readable YAML format. Using three experiments from research laboratories in university and non-government organization (NGO) settings as test cases, we demonstrate the utility of AgTC and AgETL to streamline critical steps from data collection to analysis in the plant sciences.

  11. Diamond physical and chemical characteristics database

    • borealisdata.ca
    • search.dataone.org
    Updated Jul 11, 2022
    Cite
    Thomas Stachel (2022). Diamond physical and chemical characteristics database [Dataset]. http://doi.org/10.7939/DVN/B8VYHV
    Dataset provided by
    Borealis
    Authors
    Thomas Stachel
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A database of physical and IR-spectroscopic characteristics (including nitrogen content and aggregation state) and stable isotopic compositions (δ13C and δ15N) of principally inclusion-bearing diamonds.

  12. National Mortgage Database Aggregate Statistics

    • catalog.data.gov
    Updated Mar 11, 2025
    Cite
    Federal Housing Finance Agency (2025). National Mortgage Database Aggregate Statistics [Dataset]. https://catalog.data.gov/dataset/national-mortgage-database-aggregate-statistics
    Dataset provided by
    Federal Housing Finance Agency (https://www.fhfa.gov/)
    Description

    The National Mortgage Database (NMDB®) is a nationally representative five percent sample of residential mortgages in the United States.

  13. Genome of A (A 的基因组)

    • figshare.com
    txt
    Updated Apr 19, 2025
    Cite
    lihao Guo (2025). A 的基因组 [Dataset]. http://doi.org/10.6084/m9.figshare.28828337.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    lihao Guo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Genome assembly

  14. Data from: Leaf area predicts conspecific spatial aggregation of woody...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Jul 16, 2024
    Cite
    Jingjing Xi; Collin Li; Min Wang; Stavros Veresoglou (2024). Leaf area predicts conspecific spatial aggregation of woody species [Dataset]. http://doi.org/10.5061/dryad.4b8gthtn2
    Dataset provided by
    Sun Yat-sen University
    Authors
    Jingjing Xi; Collin Li; Min Wang; Stavros Veresoglou
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Aim: Addressing how woody plant species are distributed in space can reveal inconspicuous drivers that structure plant communities. The spatial structure of conspecifics varies not only at local scales across co-existing plant species but also at larger biogeographical scales with climatic parameters and habitat properties. The possibility that biogeographical drivers shape the spatial structure of plants, however, has not received sufficient attention. Location: Global synthesis. Time period: 1997 - 2022. Major taxa studied: Woody angiosperms and conifers. Methods: We carried out a quantitative synthesis to capture the interplay between local scale and larger scale drivers. We modelled conspecific spatial aggregation as a binary response through logistic models and Ripley’s L statistics and the distance at which the point process was least random with mixed effects linear models. Our predictors covered a range of plant traits, climatic predictors and descriptors of the habitat. Results: We hypothesized that plant traits, when summarized by local scale predictors, exceed in importance biogeographical drivers in determining the spatial structure of conspecifics across woody systems. This was only the case in relation to the frequency with which we observe aggregated distributions. The probability of observing spatial aggregation and the intensity of it was higher for plant species with large leaves but further depended on climatic parameters and mycorrhiza. Main Conclusions: Compared to climatic variables, plant traits perform poorly in explaining the spatial structure of woody plant species, even though leaf area is a decisive plant trait that is related to whether we observe homogenous spatial aggregation and its intensity. 
Despite the limited variance explained by our models, we found that the spatial structure of woody plants is subject to consistent biogeographical constraints and that these exceed beyond descriptors of individual species, which we captured here through leaf area. Methods On the 8th of September 2022 we carried out a search in the Web of Science with the search string “(Ripley's K function) AND (forest)”. The search yielded 356 hits. We screened those 356 studies for eligibility, first based on the suitability of their article titles and second based on their abstracts (Figure S1). The 240 eligible studies were subsequently screened manually upon reading the entire article based on the following inclusion criteria: (1) The study reported on univariate Ripley's K or L statistics or else it was possible to extract those from figures or maps. (2) The study had been carried out in a woody ecosystem or a rangeland. (3) The univariate Ripley’s K statistics described the distribution of individuals from a single plant species. (4) The authors named the plant species for which the univariate Ripley's K statistics had been described. (5) The landscape (for example a logging area) did not induce conspicuous point processes that could not be corrected within the analysis. We manually processed the remaining 240 studies through reading the main text which reduced the final number of eligible studies to 69. A list of those data sources can be found in Appendix Three. From those studies we extracted the following moderators and we fitted them as predictors in subsequent models: Mean annual temperature: continuous variable. When unreported, we extracted the variable based on coordinates from WorldClim (Fick & Hijmans, 2017). Total annual precipitation: continuous variable. When unreported, we extracted the variable based on coordinates from WorldClim (Fick & Hijmans, 2017). Latitude of the study location: continuous variable. 
When unreported, we extracted the information based on the closest location reported. Longitude of the study location: continuous variable. When unreported, we extracted the information based on the closest location reported. Site area: continuous variable. We extracted the site area from the studies and converted it into a unified unit, square meter. Tree species: categorical variable. Plant traits: we collected data on 7 traits: leaf area (i.e. the size of the leaves), seed mass, wood density, leaf mass per area, tree height, plant species biomass and stem specific density. We first gathered data on tree height, seed mass and leaf area from the subset of common species in TRY (Díaz et al., 2022). We subsequently searched for seed mass data in the SID database (Royal Botanic Gardens Kew, 2023) and for wood density data in the ICRAF database (Ketterings et al., 2001). In the cases where we observed no records in those databases we checked the EOL database (http://eol.org). For leaf area, leaf mass per area, tree height, plant species biomass and stem specific density, we extracted them from the EOL database (http://eol.org). We chose these traits to cover as many trait syndromes as possible, but the main criterion we used to decide on the traits was the feasibility of acquiring them for the plant species in our database. Woody system age: categorical variable. We classified non-woody habitats, plantations and systems that had recently experienced serious disturbances as “young” whereas natural forests or woody stands that had reached maturity as “old”. Mycorrhiza type: categorical variable. We extracted mycorrhizal types for each species from Wang and Qiu (2006). In the cases where we could find no mycorrhizal classification information in the database at a species level we searched instead the database compiled by Delavaux et al. (2021) containing information at a genus level. 
We only extracted mycorrhizal classifications if these supported a single mycorrhizal type at a minimum probability of 85%. Otherwise, we left the plant species unclassified in relation to mycorrhiza. Ripley's L effect size: continuous variable. We first calculated for all distances the ratio between the (1) difference between the Ripley's L statistic and the width of the 95% CI envelope divided by two and (2) the difference between the upper and lower points of the envelope divided by two. A large absolute value suggests a strong deviation from randomness whereas any value below 1 suggests a random process. We identified the location where the absolute value of this ratio was maximum. Ripley's L statistic: continuous variable. We transformed Ripley's K statistics (when they had not been transformed) into Ripley's L statistics. We only used the value at the location where we observed the maximum in absolute value Ripley's K effect size. Distance when Ripley's L peaked: continuous variable describing the distance at which we observed the maximum in absolute value Ripley's L effect size. Köppen climate zone: a categorical variable with 4 levels describing the main climatic zones based on the Köppen classification: A (tropical climates); B (arid climates); C (temperate climates); D (continental climates). We extracted those from the raster files published by Beck et al. (2018). In the cases where we observed multiple values in databases (referring here mainly to plant trait values) per species, we used the median value. In the cases when we had to digitize plots to extract data, we did so with Plot Digitizer v2.6.8.
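
The K-to-L transformation and the envelope-based effect size described above can be sketched as follows. This is one possible reading, not the authors' code. It assumes the common variance-stabilising form L(r) = sqrt(K(r) / π), and it assumes the effect size is the distance of L from the midpoint of the 95% envelope divided by the envelope half-width, which matches the statement that absolute values below 1 indicate a random process. All function names and sample numbers are hypothetical.

```python
import math


def ripley_l_from_k(k_value):
    """Convert Ripley's K to L, assuming the form L(r) = sqrt(K(r) / pi)."""
    return math.sqrt(k_value / math.pi)


def envelope_effect_size(l_value, lower, upper):
    """Distance of L from the envelope midpoint, in units of half-width.

    Assumed reading of the description: |result| < 1 means L lies inside
    the 95% envelope (consistent with randomness), and |result| > 1 means
    it lies outside.
    """
    half_width = (upper - lower) / 2.0
    if half_width <= 0:
        raise ValueError("envelope upper bound must exceed lower bound")
    midpoint = (upper + lower) / 2.0
    return (l_value - midpoint) / half_width


def peak_effect(distances, l_values, lowers, uppers):
    """Return (distance, effect size) where |effect size| is largest."""
    effects = [
        envelope_effect_size(l, lo, hi)
        for l, lo, hi in zip(l_values, lowers, uppers)
    ]
    idx = max(range(len(effects)), key=lambda i: abs(effects[i]))
    return distances[idx], effects[idx]


# Hypothetical values at three distances, with a constant envelope [0, 2].
peak_distance, peak_value = peak_effect(
    [1, 2, 3], [0.5, 3.0, 1.0], [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]
)
```

In this toy input the largest deviation occurs at distance 2, where L sits two half-widths above the envelope midpoint.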

  15. EPiC database - Recycled aggregate

    • figshare.unimelb.edu.au
    • resodate.org
    pdf
    Updated Dec 10, 2020
    Cite
    Robert Crawford; André Stephan; Fabian Prideaux (2020). EPiC database - Recycled aggregate [Dataset]. http://doi.org/10.26188/5da557263ad43
    Dataset provided by
    The University of Melbourne
    Authors
    Robert Crawford; André Stephan; Fabian Prideaux
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This material is part of the free Environmental Performance in Construction (EPiC) Database. The EPiC Database contains embodied environmental flow coefficients for 250+ construction materials using a comprehensive hybrid life cycle inventory approach. Recycled aggregate is a cheap and readily available product made from recycled construction materials. It is strong and durable with excellent drainage properties. It is typically comprised of concrete, stone, brick, ceramics, mortar and other common construction materials. It is produced using the waste materials collected from the demolition of building and infrastructure projects. Impurities such as metal, wood and timber are removed via magnets and other sorting techniques. The remaining materials are sorted by size, and crushed into a coarse aggregate. Recycled aggregate is becoming increasingly popular as a replacement for natural aggregates. It is commonly used for: bulk fill, road construction, gravel, and as an aggregate in concrete. When used in concrete, it is typically combined with fly ash or other additives to ensure improved strength and reliability.

  16. FHFA: National Mortgage Database (NMDB®) Aggregate Statistics

    • datalumos.org
    • openicpsr.org
    Updated Feb 20, 2025
    Cite
    Federal Housing Finance Agency (2025). FHFA: National Mortgage Database (NMDB®) Aggregate Statistics [Dataset]. http://doi.org/10.3886/E220243V1
    Dataset authored and provided by
    Federal Housing Finance Agency (https://www.fhfa.gov/)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Mortgage Database (NMDB®) is a nationally representative five percent sample of residential mortgages in the United States. Publication of aggregate data from NMDB is a step toward implementing the statutory requirements of section 1324(c) of the Federal Housing Enterprises Financial Safety and Soundness Act of 1992, as amended by the Housing and Economic Recovery Act of 2008. The statute requires FHFA to conduct a monthly mortgage market survey to collect data on the characteristics of individual mortgages, both Enterprise and non-Enterprise, and to make the data available to the public while protecting the privacy of the borrowers. Notes: 1) All CSV file headers are now standardized as described in the Data Dictionary and Technical Notes and all CSV files are zipped. 2) Alternate wide format CSV files are available. The wide format may be more easily opened by MS Excel.

  17. Thematic Woody Aggregation for DR Congo - AFRICOVER

    • data.apps.fao.org
    Updated Jul 20, 2024
    Cite
    (2024). Thematic Woody Aggregation for DR Congo - AFRICOVER [Dataset]. https://data.apps.fao.org/map/catalog/srv/resources/datasets/2f2d5683-67c0-4760-98b3-4ef98a31a96e
    Description

    This dataset is a thematically reaggregated version of the original national Africover land cover multipurpose database. It contains all natural vegetation with a woody component. The original full-resolution land cover was produced from visual interpretation of digitally enhanced LANDSAT TM images (Bands 4,3,2) acquired mainly in the period 2000-2001 (see the "Multipurpose Landcover Database" metadata for more details). This dataset is intended for free public access. Thematic aggregation is the way that the end user customizes the Africover database to fulfil his/her specific requirements. The Africover database gives an equal level of detail to Agriculture, Natural vegetation, Bare Areas, etc. Generally a single user does not need this level of detail for each class type; therefore he/she will enhance the information of one land cover type and will generalize or erase the information related to other land cover aspects. The most powerful way to conduct an aggregation exercise is to use the classifiers as basic elements of the exercise. This gives the user the maximum flexibility in the use of the data. The main shape attributes correspond to the following fields: ID, HECTARES, WOODY_ID, WOODY_DESC. You can download a zip archive containing: the drc-cult-agg (.shp); the DR Congo Classifiers Used (.pdf); the DR Congo legend (.pdf and .xls); the DR Congo Legend - LCCS Import file (.xls); the LCCSglossary_drcongo (.pdf); the thematic-aggregation-procedure (.pdf); the thematic-aggregation-annex1 (.pdf); the thematic-aggregation-annex2 (.pdf); the Userlabel Definitions (.pdf).

  18. Database of physicochemical and optical properties of black carbon fractal...

    • data.niaid.nih.gov
    Updated Jun 20, 2023
    Cite
    Baseerat Romshoo; Thomas Müller; Jaikrishna Patil; Tobias Michels; Marius Kloft; Mira Pöhlker (2023). Database of physicochemical and optical properties of black carbon fractal aggregates [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7523057
    Authors
    Baseerat Romshoo; Thomas Müller; Jaikrishna Patil; Tobias Michels; Marius Kloft; Mira Pöhlker
    Description

    In order to estimate the climate impact of highly absorbing black carbon (BC) aerosols, it is necessary to know their optical properties. The Lorentz-Mie theory, often used to calculate the optical properties of BC under the spherical morphological assumption, produces discrepancies when compared to measurements. In light of this, researchers are currently investigating the possibility of computing the optical properties of BC using a realistic fractal aggregate morphology. To determine the optical properties of such BC fractal aggregates, the Multiple Sphere T-Matrix method (MSTM) is used, which can take more than 24 hours for a single simulation depending on the aggregate properties. This study provides a highly accurate benchmark machine-learning algorithm that can be used to generate the optical properties of BC fractal aggregates in a fraction of a second. The machine learning algorithm was trained over an extensive database of physicochemical and optical properties of BC fractal aggregates. The extensive training data helped develop an ML algorithm that can accurately predict the optical properties of BC fractal aggregates with an average deviation of less than one percent from their actual values. Specifically, the ML algorithm provides the option to generate the optical properties in the visible spectrum using either kernel ridge regression (KRR) or artificial neural networks (ANN) for a BC fractal aggregate of desired physicochemical properties like size, morphology, and organic coating. The dataset of physicochemical and optical properties of BC fractal aggregates is provided here. The developed ML algorithm for predicting the optical properties of BC fractal aggregates (https://github.com/jaikrishnap/Machine-learning-for-prediction-of-BCFAs) is highly useful for real-world applications due to its wide parameter range, high accuracy, and low computational cost.

    Contents

    database_optical_properties_black_carbon_fractal_aggregtates.csv, data file, comma-separated values

    database_header.txt, metadata, text

    Citation for the database:

    Romshoo, B., Müller, T., Patil, J., Michels, T., Kloft, M., and Pöhlker, M.: Database of physicochemical and optical properties of black carbon fractal aggregates, Dataset, https://doi.org/10.5281/zenodo.7523058, 2023.

  19. Data_Sheet_1_Integrated Analysis of Multiple Microarray Studies to Identify...

    • frontiersin.figshare.com
    txt
    Updated Jun 3, 2023
    Zi-An Chen; Yu-Feng Sun; Quan-Xu Wang; Hui-Hui Ma; Zhi-Zhao Ma; Chuan-Jie Yang (2023). Data_Sheet_1_Integrated Analysis of Multiple Microarray Studies to Identify Novel Gene Signatures in Ulcerative Colitis.CSV [Dataset]. http://doi.org/10.3389/fgene.2021.697514.s001
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Zi-An Chen; Yu-Feng Sun; Quan-Xu Wang; Hui-Hui Ma; Zhi-Zhao Ma; Chuan-Jie Yang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Ulcerative colitis (UC) is a chronic, complicated, inflammatory disease with an increasing incidence and prevalence worldwide. However, the intrinsic molecular mechanisms underlying the pathogenesis of UC have not yet been fully elucidated. Methods: All UC datasets published in the GEO database were analyzed and summarized. Subsequently, the robust rank aggregation (RRA) method was used to identify differentially expressed genes (DEGs) between UC patients and controls. Gene functional annotation and PPI network analysis were performed to illustrate the potential functions of the DEGs. Some important functional modules from the protein-protein interaction (PPI) network were identified by molecular complex detection (MCODE), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG), and analyses were performed. The results of CytoHubba, a plug for integrated algorithm for biomolecular interaction networks combined with RRA analysis, were used to identify the hub genes. Finally, a mouse model of UC was established by dextran sulfate sodium salt (DSS) solution to verify the expression of hub genes. Results: A total of 6 datasets met the inclusion criteria (GSE38713, GSE59071, GSE73661, GSE75214, GSE87466, GSE92415). The RRA integrated analysis revealed 208 significant DEGs (132 upregulated genes and 76 downregulated genes). After constructing the PPI network by MCODE plug, modules with the top three scores were listed. The CytoHubba app and RRA identified six hub genes: LCN2, CXCL1, MMP3, IDO1, MMP1, and S100A8. We found through enrichment analysis that these functional modules and hub genes were mainly related to cytokine secretion, immune response, and cancer progression. With the mouse model, we found that the expression of all six hub genes in the UC group was higher than that in the control group (P < 0.05). Conclusion: The hub genes analyzed by the RRA method are highly reliable. These findings improve the understanding of the molecular mechanisms in UC pathogenesis.

  20. Species of Greatest Conservation Need National Database

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 12, 2025
    U.S. Geological Survey (2025). Species of Greatest Conservation Need National Database [Dataset]. https://catalog.data.gov/dataset/species-of-greatest-conservation-need-national-database
    Dataset updated
    Nov 12, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    The Species of Greatest Conservation Need National Database is an aggregation of lists from State Wildlife Action Plans. Species of Greatest Conservation Need (SGCN) are wildlife species that need conservation attention as listed in action plans. In this database, we have validated scientific names from original documents against taxonomic authorities to increase consistency among names enabling aggregation and summary. This database does not replace the information contained in the original State Wildlife Action Plans. The database includes SGCN lists from 56 states, territories, and districts, encompassing action plans spanning from 2005 to 2022. State Wildlife Action Plans undergo updates at least once every 10 years by respective wildlife agencies. The SGCN list data from these action plans have been compiled in partnership with individual wildlife management agencies, the United States Fish and Wildlife Service, and the Association of Fish and Wildlife Agencies. The SGCN National Database consists of three data tables: "source_data", "process_data", and "validated_data". Most users will likely find the "sgcn_species_all_records" table that combines all three tables most useful to compare "source_" names and "validated_" names and to aggregate and summarize using validated names. The "source_data" table provides an archive of all SGCN records listed by conservation authorities over multiple action plans, which includes the scientific names, common names, locations, and year of action plan. The "process_data" table incorporates processing information, including the archiving and processing dates along with persistent identifiers used for record documentation, while the "validated_data" table provides the taxonomic identities from the matched taxonomic source, including the standardized scientific name, common name, and taxonomic ranks as well as links to supplementary taxonomic information.

Genome Aggregation Database

RRID:SCR_014964, biotools:gnomad, Genome Aggregation Database (RRID:SCR_014964), gnomAD, gnomAD 2.0, gnomAD Browser, gnomAD version 2.0, Exome Aggregation Consortium

76 scholarly articles cite this dataset (View in Google Scholar)