100+ datasets found

Database of homology-derived secondary structure of proteins
bioregistry.io
Updated Jan 16, 2022
Cite
(2022). Database of homology-derived secondary structure of proteins [Dataset]. https://bioregistry.io/hssp
Dataset updated
Jan 16, 2022
Description
HSSP (homology-derived structures of proteins) is a derived database merging structural (2-D and 3-D) and sequence information (1-D). For each protein of known 3D structure from the Protein Data Bank, the database has a file with all sequence homologues, properly aligned to the PDB protein.
Data from: Postfire Debris-Flow Database (Literature Derived)
catalog.data.gov
data.usgs.gov
Updated Nov 26, 2025
Cite
U.S. Geological Survey (2025). Postfire Debris-Flow Database (Literature Derived) [Dataset]. https://catalog.data.gov/dataset/postfire-debris-flow-database-literature-derived
Dataset updated
Nov 26, 2025
Dataset provided by
U.S. Geological Survey
Description
The data presented in this data release represent observations of postfire debris flows that have been collected from publicly available datasets. Data originate from 13 different countries: the United States, Australia, China, Italy, Greece, Portugal, Spain, the United Kingdom, Austria, Switzerland, Canada, South Korea, and Japan. The data are located in the file called “PFDF_database_sortedbyReference.txt” and a description of each column header can be found in both the file “column_headers.txt” and the metadata file (“Post-fire Debris-Flow Database (Literature Derived).xml”). The observations are derived from areas that have been burned by wildfire and are global in nature. However, this dataset is synthesized from information collected by many different researchers for different purposes, and therefore not all fields are available for each of the observations. Missing information is indicated by the value “-9999” in the “PFDF_database_sortedbyReference.txt” file. Note that the text file contains special characters and a mix of date-time formats that reflect the original data provided by the authors. The text may not be displayed correctly if it is opened in proprietary software such as Microsoft Excel, but it will appear correctly when opened in a text editor.
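The missing-value convention described above can be handled at load time. The following is a minimal sketch, not the data provider's tooling: it assumes the text file is tab-delimited with a header row and UTF-8 encoded (neither is confirmed by the description; check “column_headers.txt” and the metadata file), and the function name `read_pfdf` is hypothetical.

```python
import csv

MISSING = "-9999"  # sentinel for missing information, as stated in the description


def _clean(value):
    """Map the -9999 sentinel to None; leave every other value untouched."""
    if isinstance(value, str) and value.strip() == MISSING:
        return None
    return value


def read_pfdf(handle, delimiter="\t"):
    """Yield one dict per observation from an open text handle.

    The tab delimiter is an assumption. Values stay as strings because the
    file mixes date-time formats that should not be parsed blindly.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield {key: _clean(value) for key, value in row.items()}


# Possible usage (encoding is an assumption; errors="replace" keeps the
# special characters mentioned in the description from aborting the read):
# with open("PFDF_database_sortedbyReference.txt", encoding="utf-8",
#           errors="replace", newline="") as fh:
#     observations = list(read_pfdf(fh))
```

Keeping values as strings and converting only the fields needed for a given analysis avoids silently misreading the mixed date-time formats.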
ASTEROID LIGHTCURVE DERIVED DATA V13.0 - Dataset - NASA Open Data Portal
data.nasa.gov
Updated Mar 31, 2025
Cite
nasa.gov (2025). ASTEROID LIGHTCURVE DERIVED DATA V13.0 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/asteroid-lightcurve-derived-data-v13-0-e7aaa
Dataset updated
Mar 31, 2025
Dataset provided by
NASA (http://nasa.gov/)
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
This is a compilation of published rotational parameters derived from lightcurve data for asteroids, based on the Warner et al. (2009) Asteroid Lightcurve Database. This is the version released in March 2012. In addition to reported rotational parameters by individual paper, there is a summary file with the values adopted by Harris, Warner, and Pravec as the most likely correct values for each asteroid. The data set also contains files listing known binary asteroids and 'tumbling' asteroids.
Data from: Global Data Set of Derived Soil Properties, 0.5-Degree Grid...
catalog.data.gov
s.cnmilf.com
Updated Sep 19, 2025
Cite
ORNL_DAAC (2025). Global Data Set of Derived Soil Properties, 0.5-Degree Grid (ISRIC-WISE) [Dataset]. https://catalog.data.gov/dataset/global-data-set-of-derived-soil-properties-0-5-degree-grid-isric-wise-bdb54
Dataset updated
Sep 19, 2025
Dataset provided by
Oak Ridge National Laboratory Distributed Active Archive Center
Description
The World Inventory of Soil Emission Potentials (WISE) database currently contains data for over 4300 soil profiles collected mostly between 1950 and 1995. This database has been used to generate a series of uniform data sets of derived soil properties for each of the 106 soil units considered in the Soil Map of the World (FAO-UNESCO, 1974). These data sets were then linked to a 1/2 degree longitude by 1/2 degree latitude version of the edited and digital Soil Map of the World (FAO, 1995) to generate GIS raster image files for the following variables:

- Total available water capacity (mm water per 1 m soil depth)
- Soil organic carbon density (kg C/m2 for 0-30 cm depth range)
- Soil organic carbon density (kg C/m2 for 0-100 cm depth range)
- Soil carbonate carbon density (kg C/m2 for 0-100 cm depth range)
- Soil pH (0-30 cm depth range)
- Soil pH (30-100 cm depth range)

Data Citation: The data set should be cited as follows: Batjes, N. H. (ed). 2000. Global Data Set of Derived Soil Properties, 0.5-Degree Grid (ISRIC-WISE). Available on-line from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A.
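To look up a value in a 0.5-degree global raster such as the ones described above, a longitude/latitude pair has to be mapped to a grid cell. The sketch below shows one way to do that, assuming a 720 x 360 grid whose first cell sits at the north-west corner (180°W, 90°N) with rows running north to south. That layout is an assumption, not something the description states, so the raster's own header should be checked before use. The name `cell_index` is hypothetical.

```python
CELL = 0.5     # cell size in degrees, from the dataset title
N_COLS = 720   # 360 degrees of longitude / 0.5
N_ROWS = 360   # 180 degrees of latitude / 0.5


def cell_index(lon, lat):
    """Return (row, col) of the 0.5-degree cell containing lon/lat.

    Assumes row 0 is the northernmost row and col 0 starts at 180 W.
    Points exactly on the eastern or southern edge go to the last cell.
    """
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates outside the global grid")
    col = min(int((lon + 180.0) / CELL), N_COLS - 1)
    row = min(int((90.0 - lat) / CELL), N_ROWS - 1)
    return row, col
```

If the actual files store rows south to north, only the `row` expression needs to change.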
Minimal dataset derived from the database.
datasetcatalog.nlm.nih.gov
figshare.com
Updated May 14, 2020
Cite
Molnár, Tamás; Szántó, Kata; Takács, Péter; Szamosi, Tamás; Bálint, Anita; Kunovszki, Péter; Borsi, András; Milassin, Ágnes; Farkas, Klaudia; Gimesi-Országh, Judit; Lakatos, Péter L. (2020). Minimal dataset derived from the database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000583359
Dataset updated
May 14, 2020
Authors
Molnár, Tamás; Szántó, Kata; Takács, Péter; Szamosi, Tamás; Bálint, Anita; Kunovszki, Péter; Borsi, András; Milassin, Ágnes; Farkas, Klaudia; Gimesi-Országh, Judit; Lakatos, Péter L.
Description
Sheet “Prevalence, Incidence”–Prevalent and incident patient numbers. Sheet “Demographics”–Patient counts in age groups, median, mean and standard deviation of age. Sheet “Deaths”–Raw death counts and median age at death. Sheet “Malignancies distribution”–Patient counts with malignancy diagnoses based on 3-digit ICD-10 code. Sheet “CRC epidemiology”–Patient counts with new CRC diagnoses, median age at CRC diagnosis and death, total death counts. Sheet “Survival1”–Survival curve data, OS, UC patients and controls. Sheet “Survival2”–Survival curve data, OS, CRC-UC patients, whole group and age stratified. (XLSX)
Addressing statistical biases in nucleotide-derived protein databases for...
omicsdi.org
Updated Jul 10, 2023
Cite
(2023). Addressing statistical biases in nucleotide-derived protein databases for proteogenomic search strategies. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC3703792
Dataset updated
Jul 10, 2023
Variables measured
Unknown
Description
Proteogenomics has the potential to advance genome annotation through high quality peptide identifications derived from mass spectrometry experiments, which demonstrate a given gene or isoform is expressed and translated at the protein level. This can advance our understanding of genome function, discovering novel genes and gene structure that have not yet been identified or validated. Because of the high-throughput shotgun nature of most proteomics experiments, it is essential to carefully control for false positives and prevent any potential misannotation. A number of statistical procedures to deal with this are in wide use in proteomics, calculating false discovery rate (FDR) and posterior error probability (PEP) values for groups and individual peptide spectrum matches (PSMs). These methods control for multiple testing and exploit decoy databases to estimate statistical significance. Here, we show that database choice has a major effect on these confidence estimates leading to significant differences in the number of PSMs reported. We note that standard target:decoy approaches using six-frame translations of nucleotide sequences, such as assembled transcriptome data, apparently underestimate the confidence assigned to the PSMs. The source of this error stems from the inflated and unusual nature of the six-frame database, where for every target sequence there exists five "incorrect" targets that are unlikely to code for protein. The attendant FDR and PEP estimates lead to fewer accepted PSMs at fixed thresholds, and we show that this effect is a product of the database and statistical modeling and not the search engine. A variety of approaches to limit database size and remove noncoding target sequences are examined and discussed in terms of the altered statistical estimates generated and PSMs reported. 
These results are of importance to groups carrying out proteogenomics, aiming to maximize the validation and discovery of gene structure in sequenced genomes, while still controlling for false positives.
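The target:decoy idea referred to in the abstract can be sketched in a few lines. The example below uses one common estimator (number of decoy PSMs divided by number of target PSMs at or above a score threshold). This choice is an assumption for illustration and is not necessarily the procedure the authors used. The function name `estimate_fdr` and the input layout are hypothetical.

```python
def estimate_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) pairs, with higher scores being
    better. Uses the simple decoys/targets ratio, capped at 1.0.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so there is no error rate to report
    return min(1.0, decoys / targets)
```

Because the estimate depends on how many decoys pass the threshold, the composition and size of the searched database directly affect it, which is the sensitivity the abstract describes.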
Literature-derived human gene-disease network
rrid.site
neuinfo.org
Updated Nov 30, 2025
Cite
(2025). Literature-derived human gene-disease network [Dataset]. http://identifiers.org/RRID:SCR_005653
Unique identifier
https://identifiers.org/RRID:SCR_005653
Dataset updated
Nov 30, 2025
Description
A text mining derived database with focus on extracting and classifying gene-disease associations with respect to several biomolecular conditions. It uses a machine learning based algorithm to extract semantic gene-disease relations from a textual source of interest. The semantic gene-disease relations were extracted with F-measures of 78. More specifically, the textual source utilized here originates from Entrez Gene's GeneRIF (Gene Reference Into Function) database (Mitchell, et al., 2003). LHGDN was created based on a GeneRIF version from March 31st, 2009, consisting of 414241 phrases. These phrases were further restricted to the organism Homo sapiens, which resulted in a total of 178004 phrases. We benchmark our approach on two different tasks. The first task is the identification of semantic relations between diseases and treatments. The available data set consists of manually annotated PubMed abstracts. The second task is the identification of relations between genes and diseases from a set of concise phrases, so-called GeneRIF (Gene Reference Into Function) phrases. In our experimental setting, we do not assume that the entities are given, as is often the case in previous relation extraction work. Rather, the extraction of the entities is solved as a subproblem. Compared with other state-of-the-art approaches, we achieve very competitive results on both data sets. To demonstrate the scalability of our solution, we apply our approach to the complete human GeneRIF database. The resulting gene-disease network contains 34758 semantic associations between 4939 genes and 1745 diseases. The gene-disease network is publicly available as a machine-readable RDF graph. We extend the framework of Conditional Random Fields towards the annotation of semantic relations from text and apply it to the biomedical domain. Our approach is based on a rich set of textual features and achieves a performance that is competitive to leading approaches. 
The model is quite general and can be extended to handle arbitrary biological entities and relation types. The resulting gene-disease network shows that the GeneRIF database provides a rich knowledge source for text mining.
Combined data set derived from new data generated herein and publicly...
plos.figshare.com
xls
Updated May 30, 2023
Cite
Joana Matzen da Silva; Simon Creer; Antonina dos Santos; Ana C. Costa; Marina R. Cunha; Filipe O. Costa; Gary R. Carvalho (2023). Combined data set derived from new data generated herein and publicly available DNA barcoding projects from the Barcode of Life Database. [Dataset]. http://doi.org/10.1371/journal.pone.0019449.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0019449.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Joana Matzen da Silva; Simon Creer; Antonina dos Santos; Ana C. Costa; Marina R. Cunha; Filipe O. Costa; Gary R. Carvalho
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Combined data set derived from new data generated herein and publicly available DNA barcoding projects from the Barcode of Life Database.
DataSheet_1_MFPPDB: a comprehensive multi-functional plant peptide...
frontiersin.figshare.com
pdf
Updated Oct 16, 2023
Cite
Yaozu Yang; Hongwei Wu; Yu Gao; Wei Tong; Ke Li (2023). DataSheet_1_MFPPDB: a comprehensive multi-functional plant peptide database.pdf [Dataset]. http://doi.org/10.3389/fpls.2023.1224394.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fpls.2023.1224394.s001
Dataset updated
Oct 16, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Yaozu Yang; Hongwei Wu; Yu Gao; Wei Tong; Ke Li
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Plants produce a wide range of bioactive peptides as part of their innate defense mechanisms. With the explosive growth in the number of plant-derived peptides, verifying their therapeutic function using traditional experimental methods is resource- and time-consuming. Therefore, it is necessary to predict the therapeutic function of plant-derived peptides more effectively and accurately with reduced waste of resources, and thus expedite the development of plant peptides. We herein developed a repository of plant peptides predicted to have multiple therapeutic functions, named MFPPDB (multi-functional plant peptide database). MFPPDB includes 1,482,409 single- or multiple-function plant-origin therapeutic peptides derived from 121 fundamental plant species. The functional categories of these therapeutic peptides include 41 different features such as anti-bacterial, anti-fungal, anti-HIV, anti-viral, and anti-cancer. The detailed physicochemical information of these peptides is presented in the functional search and physicochemical property search modules, which help users easily access the peptide information by plant species, peptide ID, and function, or by peptide ID, isoelectric point, peptide sequence, and molecular weight through a web-friendly interface. We further matched the predicted peptides to nine state-of-the-art curated functional peptide databases and found that at least 293,408 of the peptides possess functional potential. Overall, MFPPDB integrates a massive number of plant peptides that have single or multiple therapeutic functions, which will facilitate comprehensive research in plant peptidomics. MFPPDB can be freely accessed through http://124.223.195.214:9188/mfppdb/index.
Working definition derived from the NHIS claims database.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Sep 27, 2023
Cite
Son, Sumin; Ahn, Hyeong Sik; Kim, Seung Ho; Lee, Ki-Il; Kim, Ikhee; Mo, Ji-Hun; Kim, Hyun Jung (2023). Working definition derived from the NHIS claims database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000942808
Dataset updated
Sep 27, 2023
Authors
Son, Sumin; Ahn, Hyeong Sik; Kim, Seung Ho; Lee, Ki-Il; Kim, Ikhee; Mo, Ji-Hun; Kim, Hyun Jung
Description
Working definition derived from the NHIS claims database.
MER2 Pancam Science Derived IOF Data Bundle - Dataset - NASA Open Data...
data.nasa.gov
Updated Mar 31, 2025
Cite
nasa.gov (2025). MER2 Pancam Science Derived IOF Data Bundle - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/mer2-pancam-science-derived-iof-data-bundle
Dataset updated
Mar 31, 2025
Dataset provided by
NASA (http://nasa.gov/)
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
This bundle contains derived IOF data from the Panoramic Cameras (Pancam) on Mars Exploration Rover 2 (Spirit). These data were produced by the science team.
BridgeDb: pathway identifier mapping database derived from Wikidata
data.niaid.nih.gov
Updated May 7, 2023
Cite
Willighagen, Egon (2023). BridgeDb: pathway identifier mapping database derived from Wikidata [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7902768
Dataset updated
May 7, 2023
Dataset provided by
Maastricht University
Authors
Willighagen, Egon
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Release of a BridgeDb gene identifier mapping database between Wikidata and Ensembl.

[INFO]: Database finished.
INFO: old database is Wikidata 1.0.0 (build: 20230506)
INFO: new database is Wikidata 1.0.0 (build: 20230506)
INFO: Number of ids in Wd (Wikidata): 153715 (unchanged)
INFO: Number of ids in En (Ensembl): 153637 (unchanged)
INFO: new size is 95 Mb (changed +0.0%)
INFO: total number of identifiers is 307352
INFO: total number of mappings is 307430
HUN AssetList Database v1p2 20150128
data.wu.ac.at
researchdata.edu.au
Updated Oct 9, 2018
Cite
Bioregional Assessment Programme (2018). HUN AssetList Database v1p2 20150128 [Dataset]. https://data.wu.ac.at/schema/data_gov_au/ZmRiMzFhNGEtYTZjNC00ZDYwLWJhNDQtY2IxMjYwY2I1YTdl
Dataset updated
Oct 9, 2018
Dataset provided by
Bioregional Assessment Programme
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

Superseded by HUN AssetList v1.3 20150212 (GUID: dcf8349e-aaed-4d30-80ab-1c8cbad8fe68) on 2/12/2015

This dataset contains the spatial and non-spatial (attribute) components of the Hunter subregion Asset List as an .mdb file, which is readable as an MS Access database or as an ESRI Personal Geodatabase.

Under the BA program, a spatial assets database is developed for each defined bioregional assessment project. The spatial elements that underpin the identification of water dependent assets are identified in the first instance by regional NRM organisations (via the WAIT tool) and supplemented with additional elements from national and state/territory government datasets. A report on the WAIT process for the Hunter is included in the zip file as part of this dataset.

Elements are initially included in the preliminary assets database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). Elements are then grouped into assets which are evaluated by project teams to determine whether they meet the second Materiality Test (M2). Assets meeting both Materiality Tests comprise the water dependent asset list. Descriptions of the assets identified in the Hunter subregion are found in the "AssetList" table of the database.

Assets are the spatial features used by project teams to model scenarios under the BA program. Detailed attribution does not exist at the asset level. Asset attribution includes only the core set of BA-derived attributes reflecting the BA classification hierarchy, as described in Appendix A of "AnR_database_HUN_v1p2_20150128.doc", located in the zip file as part of this dataset.

The "Element_to_Asset" table contains the relationships and identifies the elements that were grouped to create each asset.

Detailed information describing the database structure and content can be found in the document "AnR_database_HUN_v1p2_20150128.doc" located in the zip file.

Some of the source data used in the compilation of this dataset is restricted.

Purpose

The Asset List Database was developed to identify water dependent assets located within the Hunter subregion.

Dataset History

Superseded by HUN AssetList v1.3 20150212 (GUID: dcf8349e-aaed-4d30-80ab-1c8cbad8fe68) on 2/12/2015

This dataset is an update of the previous version of the Hunter asset list database: "Asset list for Hunter - CURRENT"; ID: 51b1e021-2958-4cd3-8daa-ba46ece09d1c, which was updated with the inclusion of data from NSW Department of Primary Industries - Office of Water: HIGH PROBABILITY GROUNDWATER DEPENDENT VEGETATION WITH HIGH ECOLOGICAL VALUE (Hunter-Central Rivers).

Dataset Citation

Bioregional Assessment Programme (2015) HUN AssetList Database v1p2 20150128. Bioregional Assessment Derived Dataset. Viewed 09 October 2018, http://data.bioregionalassessments.gov.au/dataset/64ecd565-bb7c-4f21-951e-f35966b91c99.

Dataset Ancestors

Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013

Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas

Derived From Birds Australia - Important Bird Areas (IBA) 2009

Derived From Hunter CMA GDEs (DRAFT DPI pre-release)

Derived From NSW Office of Water Surface Water Licences Processed for Hunter v1 20140516

Derived From NSW Office of Water Surface Water Offtakes - Hunter v1 24102013

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)

Derived From Asset list for Hunter - CURRENT

Derived From Species Profile and Threats Database (SPRAT) - Australia - Species of National Environmental Significance Database (BA subset - RESTRICTED - Metadata only)

Derived From Ramsar Wetlands of Australia

Derived From Commonwealth Heritage List Spatial Database (CHL)

Derived From GW Element Bores with Unknown FTYPE Hunter NSW Office of Water 20150514

Derived From New South Wales NSW Regional CMA Water Asset Information WAIT tool databases, RESTRICTED Includes ALL Reports

Derived From National Heritage List Spatial Database (NHL) (v2.1)

Derived From Groundwater Entitlement Hunter NSW Office of Water 20150324

Derived From NSW Office of Water combined geodatabase of regulated rivers and water sharing plan regions

Derived From Australia World Heritage Areas

Derived From NSW Office of Water GW licence extract linked to spatial locations for NorthandSouthSydney v3 13032014

Derived From Groundwater Economic Elements Hunter NSW 20150520 PersRem v02

Derived From Directory of Important Wetlands in Australia (DIWA) Spatial Database (Public)

Derived From New South Wales NSW - Regional - CMA - Water Asset Information Tool - WAIT - databases

Derived From Operating Mines OZMIN Geoscience Australia 20150201

Derived From NSW Office of Water - National Groundwater Information System 20141101v02

Derived From Groundwater Economic Assets Hunter NSW 20150331 PersRem

Derived From Australia - Species of National Environmental Significance Database

Derived From Monitoring Power Generation and Water Supply Bores Hunter NOW 20150514

Derived From Northern Rivers CMA GDEs (DRAFT DPI pre-release)

Derived From Australia, Register of the National Estate (RNE) - Spatial Database (RNESDB) Internal

Derived From NSW Office of Water Groundwater Entitlements Spatial Locations

Derived From NSW Office of Water Groundwater Licence Extract, North and South Sydney - Oct 2013

Derived From NSW Office of Water - GW licence extract linked to spatial locations for North and South Sydney v2 20140228

Derived From Collaborative Australian Protected Areas Database (CAPAD) 2010 (Not current release)
MBC Impact and Risk Analysis Database v01
data.wu.ac.at
researchdata.edu.au
Updated Oct 25, 2017
Cite
Bioregional Assessment Programme (2017). MBC Impact and Risk Analysis Database v01 [Dataset]. https://data.wu.ac.at/schema/data_gov_au/NGZlMDQxZmItNzUzNy00MjY5LWFiYzctMTI5NDNhNWNlYmM1
Dataset updated
Oct 25, 2017
Dataset provided by
Bioregional Assessment Programme
Description
Abstract

The Maranoa-Balonne-Condamine Impact and Risk Analysis Database (Analysis Database) is a fit-for-purpose geospatial information system developed for the Impact and Risk Analysis (Component 3-4) products of the Bioregional Assessment Technical Programme (BATP).

The version provided here for public download has been slightly modified to remove restricted material such as the co-ordinates of protected or threatened species. This version was used to populate BA Explorer.

The Analysis Database brings together many of the data sets used in Components 1 and 2 of the assessments and includes hydrology and hydrogeology modelling results, landscape classes and economic, sociocultural and ecological assets. These data sets are listed in the Component 1 and 2 products under the Assessments tab in http://www.bioregionalassessments.gov.au/.

An Analysis Database of common design and schema was implemented for each subregion where a full Impact and Risk Analysis was completed. To populate each database, input datasets were transformed, normalised and inserted into their respective Analysis Databases in accord with the common design and schema. The approach enabled the universal treatment of data analysis across all bioregions despite data being of different specifications and origins.

The Analysis Database includes all the data used for the assessment of the subregion with the exception of those datasets that were not provided to the program with an open access licence. The database is constructed using the Open Source platform PostgreSQL coupled with PostGIS. This technology was considered to better enable the provenance and transparency requirements of the Programme. The files provided here have been prepared using the PostgreSQL version 9.5 SQL Dump function - pg_dump.

A detailed description of the Analysis Database, its design, structure and application is provided in the supporting documentation: http://data.bioregionalassessments.gov.au/dataset/05e851cf-57a5-4127-948a-1b41732d538c

Purpose

The Maranoa-Balonne-Condamine Impact and Risk Analysis Database (Analysis Database) is the geospatial database for completing the Impact and Risk Analysis component of the Maranoa-Balonne-Condamine Bioregional Assessment. This includes the creating of results, tables and maps that appear in the relevant Products of each assessment. The database also manages the data used by the BA Explorer.

An individual instance of the Analysis Database was developed for each subregion where a component 3-4 Impact and Risks Assessment was conducted. With the exception of the subregion-specific data contained within it and the removal of restricted data records, each analysis database is of identical design and structure.

Dataset History

This Analysis Database is an instance of PostgreSQL version 9.5 hosted on Linux Red Hat Enterprise Linux version 4.8.5-4. PostgreSQL geospatial capabilities are provided by PostGIS version 2.2.

Data pre-processing and upload into each PostgreSQL database was completed using FME Desktop (Oracle Edition) version 2016.1.2.1. Analysis data and results are provided to users and systems via the geospatial services of Geoserver version 2.9.1. Scientific analysis and mapping was undertaken by connecting a range of data using a combination of Microsoft Excel, QGIS and ArcMap systems.

During the Programme and for its working life, the Analysis Database was hosted and managed on instances of Amazon Web Services managed by Geoscience Australia and the Bureau of Meteorology.

Dataset Citation

Bioregional Assessment Programme (2017) MBC Impact and Risk Analysis Database v01. Bioregional Assessment Derived Dataset. Viewed 25 October 2017, http://data.bioregionalassessments.gov.au/dataset/69075f3e-67ba-405b-8640-96e6cb2a189a.

Dataset Ancestors

Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements 20131204

Derived From Surface Geology of Australia, 1:1 000 000 scale, 2012 edition

Derived From Asset database for the Maranoa-Balonne-Condamine subregion on 16 June 2015

Derived From South East Queensland GDE (draft)

Derived From Geofabric Surface Cartography - V2.1

Derived From Environmental Asset Database - Commonwealth Environmental Water Office

Derived From QLD Dept of Natural Resources and Mines, Surface Water Entitlements 131204

Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)

Derived From Catchment Scale Land Use of Australia - 2014

Derived From Surface water preliminary assessment extent for the Maranoa-Balonne-Condamine subregion - v02

Derived From MBC Groundwater model domain boundary

Derived From Key Environmental Assets - KEA - of the Murray Darling Basin

Derived From Bioregional Assessment areas v03

Derived From MBC Groundwater model ACRD 5th to 95th percentile drawdown

Derived From Permanent and Semi-Permanent Waterbodies of the Lake Eyre Basin (Queensland and South Australia) (DRAFT)

Derived From Receptors for the Maranoa-Balonne-Condamine subregion

Derived From Bioregional Assessment areas v01

Derived From Bioregional Assessment areas v02

Derived From MBC Assessment Units 20160714 v01

Derived From Victoria - Seamless Geology 2014

Derived From Matters of State environmental significance (version 4.1), Queensland

Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only

Derived From Bioregional Assessment areas v06

Derived From Asset database for the Maranoa-Balonne-Condamine subregion on 9 June 2015

Derived From Queensland wetland data version 3 - wetland areas.

Derived From Groundwater Preliminary Assessment Extent (PAE) for the Maranoa Balonne Condamine (MBC) subregion - v02

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)

Derived From Asset database for the Maranoa-Balonne-Condamine subregion on 05 February 2016

Derived From MBC Groundwater model layer boundaries

Derived From NSW Catchment Management Authority Boundaries 20130917

Derived From Baseline drawdown Layer 1 - Condamine Alluvium

Derived From MBC Assessment unit codified by regional watertable

Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements linked to bores and NGIS v4 28072014

Derived From MBC Assessment Units 20160714 v02

Derived From MBC Groundwater model water balance areas

Derived From Asset database for the Maranoa-Balonne-Condamine subregion on 25 February 2015

Derived From Australia - Species of National Environmental Significance Database

Derived From MBC Groundwater model uncertainty analysis

Derived From Spring vents assessed for the Surat Underground Water Impact Report 2012

Derived From Collaborative Australian Protected Areas Database (CAPAD) 2010 (Not current release)

Derived From Queensland QLD - Regional - NRM - Water Asset Information Tool - WAIT - databases

Derived From [NSW Office of Water GW licence extract linked to spatial
NASA Earthdata
earthdata.nasa.gov
s.cnmilf.com
Updated Jul 24, 1994
Cite
ORNL_CLOUD (1994). NASA Earthdata [Dataset]. http://doi.org/10.3334/ORNLDAAC/117
Unique identifier
https://doi.org/10.3334/ORNLDAAC/117
Dataset updated
Jul 24, 1994
Dataset authored and provided by
ORNL_CLOUD
Description
During the 1989 FIFE field campaign, measurements were made of soil moisture release parameters and hydraulic conductivity. Bulk density and soil moisture release data were collected at five FIFE sites representing the major soil types in the FIFE study area. These data were used to model the porosity, saturated water potential, and the b-factor (the exponent of the power curve function) following the method of Clapp and Hornberger (1978). These soil moisture characteristics can be used to describe plant-available water and water movement through soils.
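The power curve mentioned above is usually written as psi = psi_s * (theta / theta_s) ** (-b), where theta is the volumetric water content, theta_s the saturated water content (taken here as the porosity), psi_s the saturated water potential, and b the exponent. The sketch below evaluates that curve and recovers b from paired observations by a least-squares fit in log space. The function names are hypothetical, and the fit shown is an illustrative approach rather than the exact procedure used to produce the dataset.

```python
import math


def water_potential(theta, theta_s, psi_s, b):
    """Evaluate psi = psi_s * (theta / theta_s) ** (-b)."""
    if not 0.0 < theta <= theta_s:
        raise ValueError("theta must lie in (0, theta_s]")
    return psi_s * (theta / theta_s) ** (-b)


def fit_b(thetas, psis, theta_s, psi_s):
    """Estimate b as the through-origin least-squares slope of
    ln(psi / psi_s) against -ln(theta / theta_s).

    Needs at least one observation with theta < theta_s, and every psi
    must have the same sign as psi_s so the logarithm is defined.
    """
    xs = [-math.log(theta / theta_s) for theta in thetas]
    ys = [math.log(psi / psi_s) for psi in psis]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

Fitting in log space turns the power curve into a straight line, so b can be read off as a slope. Whether theta_s and psi_s are fixed in advance or fitted jointly is a design choice this sketch leaves to the caller.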
Data from: ASTEROID LIGHTCURVE DERIVED DATA V14.0
catalog.data.gov
s.cnmilf.com
Updated Aug 22, 2025
Cite
National Aeronautics and Space Administration (2025). ASTEROID LIGHTCURVE DERIVED DATA V14.0 [Dataset]. https://catalog.data.gov/dataset/asteroid-lightcurve-derived-data-v14-0-53bb4
Dataset updated
Aug 22, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
This is a compilation of published rotational parameters derived from lightcurve data for asteroids, based on the Warner et al. (2009) Asteroid Lightcurve Database. This is the version as of March 1, 2014. In addition to reported rotational parameters by individual paper, there is a summary file with the values adopted by Harris, Warner, and Pravec as the most likely correct values for each asteroid. The data set also contains files listing known binary asteroids and 'tumbling' asteroids.
BIOSYSMOdb: Curated Database for Biodegradation and Bioremediation
data.europa.eu
unknown
Updated Feb 25, 2025
Cite
Zenodo (2025). BIOSYSMOdb: Curated Database for Biodegradation and Bioremediation [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-14795254?locale=et
Available download formats: unknown
Dataset updated
Feb 25, 2025
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
BIOSYSMOdb is a comprehensive and integrative database developed as part of the BIOSYSMO project. This resource centralizes data on metabolic pathways, reactions, enzymes, and degradative organisms to address soil contamination caused by industrial, agricultural, and urban activities. BIOSYSMOdb serves as a bridge between computational and experimental research, offering a unified platform to accelerate bioremediation solutions.
Dataset Description
BIOSYSMOdb integrates curated and synthesized data from major public repositories: EAWAG BBD, MibPOPdb, MetaCyc, UniProt, and KEGG. The database includes:
- Chemical level: Details on compounds relevant for biodegradation.
- Metabolic level: Data on pathways, reactions, enzymes, and organisms associated with degradation.
- Organism level: Information on degradative organisms and their genomic data.
- Protein level: Information on the enzymes responsible for each reaction and their associated sequence data (if available).
Data Structure
The following files are included in the dataset:
- BIOSYSMOdb_Compounds_chemical_iden_v1.0.csv: Compound identifiers inferred for other databases
- BIOSYSMOdb_Compounds_chemical_info_v1.0.csv: Compound information collected from public sources
- BIOSYSMOdb_Compounds_onthology_cod_v1.0.csv: Compound ontology codes derived from ClassyFire
- BIOSYSMOdb_Compounds_onthology_term_v1.0.csv: Compound ontology terms derived from ClassyFire
- BIOSYSMOdb_Pathways_v1.0.csv: Pathways dataset
- BIOSYSMOdb_Reactions_v1.0.csv: Reactions dataset (containing associated substrates, products, enzymes, and pathways)
- BIOSYSMOdb_Enzymes_v1.0.csv: Enzymes dataset (containing associated reactions)
- BIOSYSMOdb_Compounds_v1.0.csv: Compounds principal dataset
- BIOSYSMOdb_Organisms_v1.0.csv: Organisms principal dataset (containing associated pathways and NCBI Genome ID when available)
CSV Descriptions
- Compound ID: Unique identifier for each compound.
- Pathway Name: Name of the metabolic pathway.
- Reaction ID: Identifier for individual reactions.
- Enzyme/Protein ID: Unique identifier for associated enzymes.
- Organism Name: Name of the degradative organism.
Jupyter Notebook for querying BIOSYSMOdb
To facilitate data exploration and connections within the CSV files, a Jupyter Notebook, BIOSYSMO_database_queries, has been created. This notebook enables users to analyze relationships between different datasets and execute relevant queries efficiently.
Data Sources & Licenses
This database includes data derived from diverse databases:
- EAWAG BBD: Data on biodegradation of persistent organic pollutants. Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
- MibPOPdb: Focused on microbial degradation of xenobiotics. Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- MetaCyc: Comprehensive metabolic pathway database. Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
- KEGG: Genomic integration and metabolic networks. Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
- UniProt: Protein sequences. Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- NCBI Genome: Organism genomes. This database is public.
- PubChem: Chemical compounds. This database is public.
- ChEBI: Chemical compounds. Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Licensing and Attribution
This dataset is shared under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). Please credit BIOSYSMOdb and the original sources (EAWAG BBD, MibPOPdb, MetaCyc, ChEBI, PubChem, and KEGG) in any use or derivative works. BIOSYSMOdb was developed as part of the BIOSYSMO project, which has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101060211.
Acknowledgments
- MetaCyc, KEGG, EAWAG BBD, UniProt, NCBI Genome, PubChem, ChEBI, and MibPOPdb: For providing essential data that supported the curation of BIOSYSMOdb.
- BIOSYSMO consortium: For their contributions to the database’s design and development.
We extend our gratitude to the Horizon Europe programme and the European Union for their support in advancing research on bioremediation and biodegradation.
Contact
For inquiries, please contact:
- Contact Name: Main Researcher: Marta Franco de Benito, MSc or Project Coordinator: Sara Gil Guerrero, PhD
- Email: marta.franco@idener.ai // sara.gil@idener.ai
- Institution: IDENER.AI
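The BIOSYSMOdb description says the CSV files can be connected through shared identifiers such as Reaction ID and Pathway Name. The sketch below shows one way such a lookup might be built with the Python standard library. It is only an assumption about the layout: the column headers are borrowed from the "CSV Descriptions" list, while the sample rows, the `REACTIONS_CSV` constant, and the `reactions_by_pathway` helper are hypothetical and do not come from the dataset or its notebook.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for a reactions table.
# The real BIOSYSMOdb_Reactions_v1.0.csv may use different headers and delimiters.
REACTIONS_CSV = """Reaction ID,Pathway Name,Enzyme/Protein ID
R1,example pathway A,E1
R2,example pathway A,E2
R3,example pathway B,E1
"""


def reactions_by_pathway(csv_text):
    """Group reaction identifiers under the pathway name they are listed with."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        index[row["Pathway Name"]].append(row["Reaction ID"])
    return dict(index)


if __name__ == "__main__":
    for pathway, reactions in reactions_by_pathway(REACTIONS_CSV).items():
        print(pathway, reactions)
```

To try this against a downloaded file, the in-memory string could be replaced by the file's contents, after checking the actual column headers first.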
Comprehensive Systems-Biology Database
scicrunch.org
neuinfo.org
Updated Jan 29, 2022
Cite
(2022). Comprehensive Systems-Biology Database [Dataset]. http://identifiers.org/RRID:SCR_008185
Unique identifier
https://identifiers.org/RRID:SCR_008185
Dataset updated
Jan 29, 2022
Description
CSB.DB presents the results of bio-statistical analysis on gene expression data in association with additional biochemical and physiological knowledge. The main aim of this database platform is to provide tools that support insight into life's complexity pyramid, with a special focus on the integration of data from transcript and metabolite profiling experiments. The main focus of the CSB project is the generation of new, easily accessible knowledge about the relationship and the hierarchy of cellular components, and thus progress towards understanding life's complexity pyramid. To this end, statistical and computational algorithms are applied to organism-specific data derived from publicly available multi-parallel technologies, currently expression profiles. The underlying data are derived from various research activities. The CSB project thus provides an integrated and centralized public resource allowing universal access to the generated knowledge: CSB.DB, a Comprehensive Systems-Biology Database. The derived knowledge should support the formulation of new hypotheses about the respective functional involvement of genes beyond their (inter-)relationships. Another major goal of the CSB project is to supply researchers with the information needed to formulate these new hypotheses without demanding any a priori statistical knowledge of the user. The CSB project mainly focuses on applying the required statistical tests and on assisting the user during exploration of results with information and help files to support hypothesis generation.
Data from: Tissue Usage Preference and Intrinsically Disordered Region...
datasetcatalog.nlm.nih.gov
acs.figshare.com
Updated Mar 8, 2024
Cite
Lam, Maggie P. Y.; Brenman, Stella; Lau, Edward; Ng, Dominic C. M.; Black, Alexander; Pandi, Boomathi (2024). Tissue Usage Preference and Intrinsically Disordered Region Remodeling of Alternative Splicing Derived Proteoforms in the Heart [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001322234
Dataset updated
Mar 8, 2024
Authors
Lam, Maggie P. Y.; Brenman, Stella; Lau, Edward; Ng, Dominic C. M.; Black, Alexander; Pandi, Boomathi
Description
A computational analysis of mass spectrometry data was performed to uncover alternative splicing derived protein variants across chambers of the human heart. Evidence for 216 non-canonical isoforms was apparent in the atrium and the ventricle, including 52 isoforms not documented on SwissProt and recovered using an RNA sequencing derived database. Among non-canonical isoforms, 29 show signs of regulation based on statistically significant preferences in tissue usage, including a ventricular enriched protein isoform of tensin-1 (TNS1) and an atrium-enriched PDZ and LIM Domain 3 (PDLIM3) isoform 2 (PDLIM3-2/ALP-H). Examined variant regions that differ between alternative and canonical isoforms are highly enriched with intrinsically disordered regions. Moreover, over two-thirds of such regions are predicted to function in protein binding and RNA binding. The analysis here lends further credence to the notion that alternative splicing diversifies the proteome by rewiring intrinsically disordered regions, which are increasingly recognized to play important roles in the generation of biological function from protein sequences.
Data from: Proteogenomic Gene Structure Validation in the Pineapple Genome
acs.figshare.com
figshare.com
xlsx
Updated Apr 23, 2024
Cite
Norazrin Ariffin; David Wells Newman; Michael G. Nelson; Ronan O’cualain; Simon J. Hubbard (2024). Proteogenomic Gene Structure Validation in the Pineapple Genome [Dataset]. http://doi.org/10.1021/acs.jproteome.3c00675.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jproteome.3c00675.s002
Dataset updated
Apr 23, 2024
Dataset provided by
ACS Publications
Authors
Norazrin Ariffin; David Wells Newman; Michael G. Nelson; Ronan O’cualain; Simon J. Hubbard
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
MD2 pineapple (Ananas comosus) is the second most important tropical crop that preserves crassulacean acid metabolism (CAM), which has high water-use efficiency and is fast becoming the most consumed fresh fruit worldwide. Despite the significance of environmental efficiency and popularity, until very recently, its genome sequence has not been determined and a high-quality annotated proteome has not been available. Here, we have undertaken a pilot proteogenomic study, analyzing the proteome of MD2 pineapple leaves using liquid chromatography-mass spectrometry (LC–MS/MS), which validates 1781 predicted proteins in the annotated F153 (V3) genome. In addition, a further 603 peptide identifications are found that map exclusively to an independent MD2 transcriptome-derived database but are not found in the standard F153 (V3) annotated proteome. Peptide identifications derived from these MD2 transcripts are also cross-referenced to a more recent and complete MD2 genome annotation, resulting in 402 nonoverlapping peptides, which in turn support 30 high-quality gene candidates novel to both pineapple genomes. Many of the validated F153 (V3) genes are also supported by an independent proteomics data set collected for an ornamental pineapple variety. The contigs and peptides have been mapped to the current F153 genome build and are available as bed files to display a custom gene track on the Ensembl Plants region viewer. These analyses add to the knowledge of experimentally validated pineapple genes and demonstrate the utility of transcript-derived proteomics to discover both novel genes and genetic structure in a plant genome, adding value to its annotation.


Cite

(2022). Database of homology-derived secondary structure of proteins [Dataset]. https://bioregistry.io/hssp

Database of homology-derived secondary structure of proteins


16 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Jan 16, 2022

Description

HSSP (homology-derived structures of proteins) is a derived database merging structural (2-D and 3-D) and sequence information (1-D). For each protein of known 3D structure from the Protein Data Bank, the database has a file with all sequence homologues, properly aligned to the PDB protein.

