100+ datasets found
  1. SFLD

    • ebi.ac.uk
    Updated Sep 7, 2018
    Cite
    (2018). SFLD [Dataset]. https://www.ebi.ac.uk/interpro/
    Explore at:
    Dataset updated
    Sep 7, 2018
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SFLD (Structure-Function Linkage Database) is a hierarchical classification of enzymes that relates specific sequence-structure features to specific chemical capabilities.

  2. SMART

    • ebi.ac.uk
    Updated Feb 14, 2020
    Cite
    (2020). SMART [Dataset]. https://www.ebi.ac.uk/interpro/
    Explore at:
    Dataset updated
    Feb 14, 2020
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. SMART is based at EMBL, Heidelberg, Germany.

  3. HAMAP

    • ebi.ac.uk
    Updated Feb 5, 2025
    Cite
    (2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
    Explore at:
    Dataset updated
    Feb 5, 2025
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.

  4. EMBL European Bioinformatics Institute - Nucleotide Sequencing Data

    • geonetwork.soosmap.aq
    Updated Apr 21, 2025
    Cite
    (2025). EMBL European Bioinformatics Institute - Nucleotide Sequencing Data [Dataset]. https://geonetwork.soosmap.aq/geonetwork/srv/search
    Explore at:
    Dataset updated
    Apr 21, 2025
    Description

    The European Molecular Biology Laboratory European Bioinformatics Institute (EMBL-EBI) is international, innovative and interdisciplinary, and a champion of open data in the life sciences. The EMBL-EBI captures and presents globally comprehensive sequence data as part of the International Nucleotide Sequence Database Collaboration. Data provided to GBIF include geotagged environmental sequences with user-provided taxonomic identifications.

    This dataset contains INSDC sequences associated with environmental sample identifiers. The dataset is prepared periodically using the public ENA API (https://www.ebi.ac.uk/ena/portal/api/) by querying data with the search parameters: `environmental_sample=True & host=""`

    EMBL-EBI also publishes other records in separate datasets (https://www.gbif.org/publisher/ada9d123-ddb4-467d-8891-806ea8d94230).

    The data was then processed as follows:

    1. Human sequences were excluded.

    2. For non-CONTIG records, the sample accession number (when available) along with the scientific name were used to identify sequence records corresponding to the same individuals (or group of organisms of the same species in the same sample). Only one record was kept for each scientific name/sample accession number.

    3. Contigs and whole genome shotgun (WGS) records were added individually.

    4. The records that were missing some information were excluded. Only records associated with a specimen voucher or records containing both a location AND a date were kept.

    5. The records associated with the same vouchers are aggregated together.

    6. Many of the remaining records corresponded to individual sequences or reads from the same organisms. In practice, these were "duplicate" occurrence records that weren't filtered out in step 2 because the sample accession number was missing. To identify those potential duplicates, we grouped all the remaining records by `scientific_name`, `collection_date`, `location`, `country`, `identified_by`, `collected_by` and `sample_accession` (when available). Then we excluded the groups that contained more than 50 records. The rationale behind the choice of threshold is explained here: Deduplication v2 gbif/embl-adapter#10 (comment)

    7. To improve the matching of the EBI scientific name to the GBIF backbone taxonomy, we incorporated the ENA taxonomic information. The kingdom, phylum, class, order, family, and genus were obtained from the ENA taxonomy checklist available here: http://ftp.ebi.ac.uk/pub/databases/ena/taxonomy/sdwca.zip

    More information available here: https://github.com/gbif/embl-adapter#readme

    You can find the mapping used to format the EMBL data to Darwin Core Archive here: https://github.com/gbif/embl-adapter/blob/master/DATAMAPPING.md
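
The grouping filter described in step 6 could look roughly like the sketch below. It is not the embl-adapter code: the function name `drop_oversized_groups` and the dict-per-record layout are assumptions made for illustration, and only the grouping fields and the 50-record threshold come from the description.

```python
from collections import defaultdict

# Grouping fields listed in step 6 of the description.
GROUP_KEYS = (
    "scientific_name", "collection_date", "location", "country",
    "identified_by", "collected_by", "sample_accession",
)
MAX_GROUP_SIZE = 50  # threshold stated in the description


def drop_oversized_groups(records, max_size=MAX_GROUP_SIZE):
    """Return the records whose group has at most max_size members.

    Hypothetical helper: records are plain dicts, and a missing field
    (for example sample_accession) is treated as None when grouping.
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(k) for k in GROUP_KEYS)
        groups[key].append(rec)

    kept = []
    for members in groups.values():
        # Groups with more than max_size records are excluded entirely.
        if len(members) <= max_size:
            kept.extend(members)
    return kept
```

A group of exactly 50 records is kept, since the description excludes only groups with more than 50.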

  5. INSDC Host Organism Sequences

    • gbif.org
    • researchdata.edu.au
    • +2 more
    Updated Aug 2, 2025
    Cite
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI) (2025). INSDC Host Organism Sequences [Dataset]. http://doi.org/10.15468/e97kmy
    Explore at:
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    European Bioinformatics Institute http://www.ebi.ac.uk/
    Global Biodiversity Information Facility https://www.gbif.org/
    Authors
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains INSDC sequences associated with host organisms. The dataset is prepared periodically using the public ENA API (https://www.ebi.ac.uk/ena/portal/api/) using the methods described below.

    EMBL-EBI also publishes other records in separate datasets (https://www.gbif.org/publisher/ada9d123-ddb4-467d-8891-806ea8d94230).

    The data was then processed as follows:

    1. Human sequences were excluded.

    2. For non-CONTIG records, the sample accession number (when available) along with the scientific name were used to identify sequence records corresponding to the same individuals (or group of organisms of the same species in the same sample). Only one record was kept for each scientific name/sample accession number.

    3. Contigs and whole genome shotgun (WGS) records were added individually.

    4. The records that were missing some information were excluded. Only records associated with a specimen voucher or records containing both a location AND a date were kept.

    5. The records associated with the same vouchers are aggregated together.

    6. Many of the remaining records corresponded to individual sequences or reads from the same organisms. In practice, these were "duplicate" occurrence records that weren't filtered out in step 2 because the sample accession number was missing. To identify those potential duplicates, we grouped all the remaining records by `scientific_name`, `collection_date`, `location`, `country`, `identified_by`, `collected_by` and `sample_accession` (when available). Then we excluded the groups that contained more than 50 records. The rationale behind the choice of threshold is explained here: https://github.com/gbif/embl-adapter/issues/10#issuecomment-855757978

    7. To improve the matching of the EBI scientific name to the GBIF backbone taxonomy, we incorporated the ENA taxonomic information. The kingdom, phylum, class, order, family, and genus were obtained from the ENA taxonomy checklist available here: http://ftp.ebi.ac.uk/pub/databases/ena/taxonomy/sdwca.zip

    More information available here: https://github.com/gbif/embl-adapter#readme

    You can find the mapping used to format the EMBL data to Darwin Core Archive here: https://github.com/gbif/embl-adapter/blob/master/DATAMAPPING.md
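
Steps 2 and 4 above can be illustrated with a minimal sketch. The helper names, the `specimen_voucher` field name and the choice to keep the first record seen are assumptions; the description says only that one record is kept per scientific name/sample accession pair. The sketch also ignores the CONTIG/non-CONTIG distinction.

```python
def dedupe_by_name_and_sample(records):
    """Step 2 sketch: keep one record per (scientific_name, sample_accession).

    Records without a sample accession cannot be matched this way and are
    passed through unchanged. Keeping the first record seen is an assumption.
    """
    seen = set()
    kept = []
    for rec in records:
        accession = rec.get("sample_accession")
        if not accession:
            kept.append(rec)
            continue
        key = (rec.get("scientific_name"), accession)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


def is_complete(rec):
    """Step 4 sketch: a specimen voucher, or both a location and a date.

    The field name specimen_voucher is hypothetical.
    """
    has_voucher = bool(rec.get("specimen_voucher"))
    has_place_and_date = bool(rec.get("location")) and bool(rec.get("collection_date"))
    return has_voucher or has_place_and_date
```

Records that pass through step 2 without a sample accession are the ones the later grouping step (step 6) is meant to catch.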

  6. INSDC Sequences

    • gbif.org
    • researchdata.edu.au
    Updated Jul 26, 2025
    + more versions
    Cite
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI) (2025). INSDC Sequences [Dataset]. http://doi.org/10.15468/sbmztx
    Explore at:
    Dataset updated
    Jul 26, 2025
    Dataset provided by
    European Bioinformatics Institute http://www.ebi.ac.uk/
    Global Biodiversity Information Facility https://www.gbif.org/
    Authors
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains INSDC sequence records not associated with environmental sample identifiers or host organisms. The dataset is prepared periodically using the public ENA API (https://www.ebi.ac.uk/ena/portal/api/) by querying data with search parameters: `environmental_sample=False & host=""`

    EMBL-EBI also publishes other records in separate datasets (https://www.gbif.org/publisher/ada9d123-ddb4-467d-8891-806ea8d94230).

    The data was then processed as follows:

    1. Human sequences were excluded.

    2. For non-CONTIG records, the sample accession number (when available) along with the scientific name were used to identify sequence records corresponding to the same individuals (or group of organisms of the same species in the same sample). Only one record was kept for each scientific name/sample accession number.

    3. Contigs and whole genome shotgun (WGS) records were added individually.

    4. The records that were missing some information were excluded. Only records associated with a specimen voucher or records containing both a location AND a date were kept.

    5. The records associated with the same vouchers are aggregated together.

    6. Many of the remaining records corresponded to individual sequences or reads from the same organisms. In practice, these were "duplicate" occurrence records that weren't filtered out in step 2 because the sample accession number was missing. To identify those potential duplicates, we grouped all the remaining records by `scientific_name`, `collection_date`, `location`, `country`, `identified_by`, `collected_by` and `sample_accession` (when available). Then we excluded the groups that contained more than 50 records. The rationale behind the choice of threshold is explained here: https://github.com/gbif/embl-adapter/issues/10#issuecomment-855757978

    7. To improve the matching of the EBI scientific name to the GBIF backbone taxonomy, we incorporated the ENA taxonomic information. The kingdom, phylum, class, order, family, and genus were obtained from the ENA taxonomy checklist available here: http://ftp.ebi.ac.uk/pub/databases/ena/taxonomy/sdwca.zip

    More information available here: https://github.com/gbif/embl-adapter#readme

    You can find the mapping used to format the EMBL data to Darwin Core Archive here: https://github.com/gbif/embl-adapter/blob/master/DATAMAPPING.md
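
The INSDC-derived datasets in this listing appear to split records by the `environmental_sample` flag and the `host` field. The sketch below shows one way such a split could be expressed. It assumes that any record with a non-empty host goes to the host dataset, which the descriptions imply but do not state; the function name and the dataset labels are illustrative only.

```python
HOST = "host organism sequences"
ENVIRONMENTAL = "environmental sample sequences"
OTHER = "other sequences"


def target_dataset(rec):
    """Pick a dataset label from the environmental_sample and host fields.

    Assumption: a non-empty host takes priority over the
    environmental_sample flag.
    """
    host = (rec.get("host") or "").strip()
    if host:
        return HOST
    if rec.get("environmental_sample"):
        # Corresponds to environmental_sample=True & host=""
        return ENVIRONMENTAL
    # Corresponds to environmental_sample=False & host=""
    return OTHER
```

For example, `target_dataset({"environmental_sample": False, "host": ""})` returns the label for the dataset described in this entry.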

  7. Blueprint: A human variation panel of genetic influences on epigenomes and...

    • ega-archive.org
    Cite
    Blueprint: A human variation panel of genetic influences on epigenomes and transcriptomes in three immune cell types, (ChIP-Seq for CD4-positive, alpha-beta T cell, on genome GRCh37) [Dataset]. https://ega-archive.org/datasets/EGAD00001002673
    Explore at:
    License

    https://ega-archive.org/dacs/EGAC00001000135

    Description

    ChIP-Seq data for 154 CD4-positive, alpha-beta T cell sample(s). 355 run(s), 265 experiment(s), 250 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816

  8. Active ENA Data Hubs

    • ebi.ac.uk
    Updated Dec 1, 2015
    Cite
    (2015). Active ENA Data Hubs [Dataset]. https://www.ebi.ac.uk/ebisearch/data-coverage
    Explore at:
    Dataset updated
    Dec 1, 2015
    Description

    Overview of active ENA Data Hubs

  9. Arabidopsis thaliana N-terminal Acetylome

    • ebi.ac.uk
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 13, 2020
    Cite
    Jean Baptiste BOYER (2020). Arabidopsis thaliana N-terminal Acetylome [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD016496
    Explore at:
    Dataset updated
    Jul 13, 2020
    Authors
    Jean Baptiste BOYER
    Variables measured
    Proteomics
    Description

    Quantitative study of the N-terminal acetylome variations in Arabidopsis thaliana, looking at the effect of an N-acetyltransferase KO.

  10. Hypoxia_Normoxia_C48_UTR_Watt_2024

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated May 29, 2025
    Cite
    Tyler Cooper (2025). Hypoxia_Normoxia_C48_UTR_Watt_2024 [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD058655
    Explore at:
    Dataset updated
    May 29, 2025
    Authors
    Tyler Cooper
    Variables measured
    Proteomics
    Description

    This data is part of a project assessing transcriptional start site switching and UTR switching at the translational level following hypoxia.

  11. Discovery of new cerebrospinal fluid biomarkers for meningitis in children...

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Nov 18, 2023
    Cite
    Da Qi (2023). Discovery of new cerebrospinal fluid biomarkers for meningitis in children C4PR_LIV [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD000764
    Explore at:
    Dataset updated
    Nov 18, 2023
    Authors
    Da Qi
    Variables measured
    Proteomics
    Description

    Bacterial meningitis is usually fatal without treatment, and prompt and accurate diagnosis, coupled with the timely administration of parenteral antibiotics, is necessary in order to save lives. The diagnosis can sometimes be delayed whilst samples are analysed in a laboratory using traditional methods of microscopy and antigen testing. The objective of our project is to define specific protein signatures in cerebrospinal fluid associated with Streptococcus pneumoniae infection which could lead to the development of assays or point-of-care devices to improve the speed and accuracy of diagnosis, and guide clinicians in the treatment and prognosis of children with bacterial meningitis. The associated research paper is in preparation.

  12. GEO DataSets

    • ebi.ac.uk
    Updated Dec 1, 2015
    + more versions
    Cite
    (2015). GEO DataSets [Dataset]. https://www.ebi.ac.uk/ebisearch/data-coverage
    Explore at:
    Dataset updated
    Dec 1, 2015
    Description

    Gene Expression Omnibus. GEO is a public functional genomics data repository supporting MIAME-compliant data submissions. The GEO DataSets database stores original submitter-supplied records (Series, Samples and Platforms) as well as curated DataSets.

  13. Data from: Competitive binding of STATs to receptor phospho-Tyr motifs...

    • ebi.ac.uk
    + more versions
    Cite
    Stephan Wilmes, Competitive binding of STATs to receptor phospho-Tyr motifs accounts for altered cytokine responses in autoimmune disorders [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD024188
    Explore at:
    Authors
    Stephan Wilmes
    Variables measured
    Proteomics
    Description

    In this study, we compared the effects of two cytokine treatments on the proteome of human Th-1 cells. We used saturating doses of murine single-chain IL-27 (EBI3+p28, 10 nM) and HyperIL-6 (20 nM) and continuously stimulated cells from three donors with the two cytokines for 24 h or left them untreated.

  14. A high-density, organ-specific proteome map for Arabidopsis thaliana

    • ebi.ac.uk
    • data.niaid.nih.gov
    Cite
    Katja Baerenfaller, A high-density, organ-specific proteome map for Arabidopsis thaliana [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PRD000044
    Explore at:
    Authors
    Katja Baerenfaller
    Variables measured
    Proteomics
    Description

    Not available

  15. Proteomic characterization of sorted peroxisomes

    • ebi.ac.uk
    Updated Apr 3, 2025
    Cite
    Tommaso De Marchi (2025). Proteomic characterization of sorted peroxisomes [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD028679
    Explore at:
    Dataset updated
    Apr 3, 2025
    Authors
    Tommaso De Marchi
    Variables measured
    Proteomics
    Description

    Proteomic analysis of sorted peroxisomes (old, young, and middle-aged).

  16. All

    • ebi.ac.uk
    Updated Nov 4, 2019
    Cite
    (2019). All [Dataset]. https://www.ebi.ac.uk/interpro/set/all/
    Explore at:
    Dataset updated
    Nov 4, 2019
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset of the type ? from the database All - version N/A

  17. proteins pulled down by CREB, part 2

    • ebi.ac.uk
    Updated Aug 11, 2019
    Cite
    hui du (2019). proteins pulled down by CREB, part 2 [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD008261
    Explore at:
    Dataset updated
    Aug 11, 2019
    Authors
    hui du
    Variables measured
    Proteomics
    Description

    Hepatocarcinoma is the third leading cause of cancer death in the world. In recent years, research on CREB in hepatocellular carcinoma has become a hotspot, so our research group wants to use mass spectrometry analysis to identify which proteins can bind to CREB, and then explore the links between CREB and hepatocellular carcinoma.

  18. Cohorts - Early Cause

    • ebi.ac.uk
    Updated Dec 1, 2015
    Cite
    (2015). Cohorts - Early Cause [Dataset]. https://www.ebi.ac.uk
    Explore at:
    Dataset updated
    Dec 1, 2015
    Description

    Cohorts metadata for the Early Causes project.

  19. Analysis of SARS-CoV-2 S and human ACE2 protein by LC-MS

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Sep 9, 2020
    + more versions
    Cite
    Peng Zhao (2020). Analysis of SARS-CoV-2 S and human ACE2 protein by LC-MS [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD019940
    Explore at:
    Dataset updated
    Sep 9, 2020
    Authors
    Peng Zhao
    Variables measured
    Proteomics
    Description

    Analysis of intact O-linked glycopeptides for SARS-CoV-2 S and human ACE2 protein by LC-MS

  20. A2A2Y4

    • ebi.ac.uk
    Updated Nov 19, 2015
    Cite
    (2015). A2A2Y4 [Dataset]. https://www.ebi.ac.uk/interpro/protein/A2A2Y4
    Explore at:
    Dataset updated
    Nov 19, 2015
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Putative tumor suppressor gene that may be implicated in the origin and progression of lung cancer
