100+ datasets found
  1. SFLD

    • ebi.ac.uk
    Updated Sep 7, 2018
    Cite
    (2018). SFLD [Dataset]. https://www.ebi.ac.uk/interpro/
    Dataset updated
    Sep 7, 2018
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SFLD (Structure-Function Linkage Database) is a hierarchical classification of enzymes that relates specific sequence-structure features to specific chemical capabilities.

  2. NCBIFAM

    • ebi.ac.uk
    Updated Dec 16, 2024
    Cite
    (2024). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
    Dataset updated
    Dec 16, 2024
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).

  3. SMART

    • ebi.ac.uk
    Updated Feb 14, 2020
    Cite
    (2020). SMART [Dataset]. https://www.ebi.ac.uk/interpro/
    Dataset updated
    Feb 14, 2020
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. SMART is based at EMBL, Heidelberg, Germany.

  4. HAMAP

    • ebi.ac.uk
    Updated Feb 5, 2025
    Cite
    (2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
    Dataset updated
    Feb 5, 2025
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.

  5. EMBL European Bioinformatics Institute - Nucleotide Sequencing Data

    • geonetwork.soosmap.aq
    Updated Apr 21, 2025
    Cite
    (2025). EMBL European Bioinformatics Institute - Nucleotide Sequencing Data [Dataset]. https://geonetwork.soosmap.aq/geonetwork/srv/search
    Dataset updated
    Apr 21, 2025
    Description

    The European Molecular Biology Laboratory European Bioinformatics Institute (EMBL-EBI) is international, innovative and interdisciplinary, and a champion of open data in the life sciences. EMBL-EBI captures and presents globally comprehensive sequence data as part of the International Nucleotide Sequence Database Collaboration. Data provided to GBIF include geotagged environmental sequences with user-provided taxonomic identifications.

    This dataset contains INSDC sequences associated with environmental sample identifiers. The dataset is prepared periodically using the public ENA API (https://www.ebi.ac.uk/ena/portal/api/) by querying data with the search parameters: environmental_sample=True & host=""

    EMBL-EBI also publishes other records in separate datasets (https://www.gbif.org/publisher/ada9d123-ddb4-467d-8891-806ea8d94230).

    The data was then processed as follows:

    1. Human sequences were excluded.

    2. For non-CONTIG records, the sample accession number (when available) along with the scientific name were used to identify sequence records corresponding to the same individuals (or group of organisms of the same species in the same sample). Only one record was kept for each scientific name/sample accession number.

    3. Contigs and whole genome shotgun (WGS) records were added individually.

    4. The records that were missing some information were excluded. Only records associated with a specimen voucher or records containing both a location AND a date were kept.

    5. The records associated with the same vouchers are aggregated together.

    6. Many of the remaining records corresponded to individual sequences or reads from the same organisms. In practice, these were "duplicate" occurrence records that weren't filtered out in STEP 2 because the sample accession was missing. To identify those potential duplicates, we grouped all the remaining records by scientific_name, collection_date, location, country, identified_by, collected_by and sample_accession (when available). Then we excluded the groups that contained more than 50 records. The rationale behind the choice of threshold is explained here: Deduplication v2 gbif/embl-adapter#10 (comment)

    7. To improve the matching of the EBI scientific name to the GBIF backbone taxonomy, we incorporated the ENA taxonomic information. The kingdom, phylum, class, order, family, and genus were obtained from the ENA taxonomy checklist available here: http://ftp.ebi.ac.uk/pub/databases/ena/taxonomy/sdwca.zip

    More information available here: https://github.com/gbif/embl-adapter#readme

    You can find the mapping used to format the EMBL data to Darwin Core Archive here: https://github.com/gbif/embl-adapter/blob/master/DATAMAPPING.md
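
    Steps 2 and 4 of the processing described above can be sketched roughly as follows. This is only an illustration, not the embl-adapter code: records are assumed to be plain dictionaries, the `specimen_voucher` key and both function names are hypothetical, and records without a sample accession are assumed to pass through step 2 untouched.

```python
def keep_one_per_sample(records):
    """Step 2 (sketch): keep the first record seen for each
    (scientific_name, sample_accession) pair.

    Records with no sample accession cannot be matched this way,
    so they are all kept (assumption for this sketch).
    """
    seen = set()
    kept = []
    for rec in records:
        accession = rec.get("sample_accession")
        if not accession:
            kept.append(rec)
            continue
        key = (rec.get("scientific_name"), accession)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


def is_complete(record):
    """Step 4 (sketch): keep a record if it has a specimen voucher,
    or if it has both a location and a collection date."""
    has_voucher = bool(record.get("specimen_voucher"))  # hypothetical field name
    has_place_and_date = bool(record.get("location")) and bool(
        record.get("collection_date")
    )
    return has_voucher or has_place_and_date
```

    A caller would apply `keep_one_per_sample` to the non-CONTIG records first and then keep only those for which `is_complete` returns true.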

  6. INSDC Sequences

    • gbif.org
    • researchdata.edu.au
    • +1 more
    Updated Jul 26, 2025
    Cite
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI) (2025). INSDC Sequences [Dataset]. http://doi.org/10.15468/sbmztx
    Dataset updated
    Jul 26, 2025
    Dataset provided by
    European Bioinformatics Institute http://www.ebi.ac.uk/
    Global Biodiversity Information Facility https://www.gbif.org/
    Authors
    European Bioinformatics Institute (EMBL-EBI); European Bioinformatics Institute (EMBL-EBI)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains INSDC sequence records not associated with environmental sample identifiers or host organisms. The dataset is prepared periodically using the public ENA API (https://www.ebi.ac.uk/ena/portal/api/) by querying data with search parameters: `environmental_sample=False & host=""`

    EMBL-EBI also publishes other records in separate datasets (https://www.gbif.org/publisher/ada9d123-ddb4-467d-8891-806ea8d94230).

    The data was then processed as follows:

    1. Human sequences were excluded.

    2. For non-CONTIG records, the sample accession number (when available) along with the scientific name were used to identify sequence records corresponding to the same individuals (or group of organisms of the same species in the same sample). Only one record was kept for each scientific name/sample accession number.

    3. Contigs and whole genome shotgun (WGS) records were added individually.

    4. The records that were missing some information were excluded. Only records associated with a specimen voucher or records containing both a location AND a date were kept.

    5. The records associated with the same vouchers are aggregated together.

    6. Many of the remaining records corresponded to individual sequences or reads from the same organisms. In practice, these were "duplicate" occurrence records that weren't filtered out in STEP 2 because the sample accession was missing. To identify those potential duplicates, we grouped all the remaining records by `scientific_name`, `collection_date`, `location`, `country`, `identified_by`, `collected_by` and `sample_accession` (when available). Then we excluded the groups that contained more than 50 records. The rationale behind the choice of threshold is explained here: https://github.com/gbif/embl-adapter/issues/10#issuecomment-855757978

    7. To improve the matching of the EBI scientific name to the GBIF backbone taxonomy, we incorporated the ENA taxonomic information. The kingdom, phylum, class, order, family, and genus were obtained from the ENA taxonomy checklist available here: http://ftp.ebi.ac.uk/pub/databases/ena/taxonomy/sdwca.zip

    More information available here: https://github.com/gbif/embl-adapter#readme

    You can find the mapping used to format the EMBL data to Darwin Core Archive here: https://github.com/gbif/embl-adapter/blob/master/DATAMAPPING.md
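
    The grouping rule in step 6 can be illustrated with a short sketch. It is not taken from the embl-adapter repository: records are assumed to be plain dictionaries keyed by the field names listed in step 6, and `drop_oversized_groups` is a hypothetical helper name. The threshold of 50 comes from the description above.

```python
from collections import defaultdict

# Fields used to group potential duplicates (from step 6 of the description).
GROUP_KEYS = (
    "scientific_name",
    "collection_date",
    "location",
    "country",
    "identified_by",
    "collected_by",
    "sample_accession",  # may be missing; treated as None here
)
MAX_GROUP_SIZE = 50  # groups with more records than this are excluded


def drop_oversized_groups(records, max_size=MAX_GROUP_SIZE):
    """Group records by GROUP_KEYS and drop every group larger than max_size."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(k) for k in GROUP_KEYS)
        groups[key].append(rec)

    kept = []
    for members in groups.values():
        if len(members) <= max_size:
            kept.extend(members)
    return kept
```

    Note that a group of exactly 50 records is kept in this sketch, since the description excludes only groups containing more than 50.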

  7. ccRCC_expression

    • ebi.ac.uk
    Updated Oct 15, 2013
    Cite
    Yusuke Sato (2013). ccRCC_expression [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1980/
    Dataset updated
    Oct 15, 2013
    Authors
    Yusuke Sato
    Description

    Gene expression in clear cell RCC was measured for 101 samples.

  8. Transcriptional Profiling of 1,000 human cancer cell lines

    • ebi.ac.uk
    Updated Jun 30, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Francesco Iorio (2015). Transcriptional Profiling of 1,000 human cancer cell lines [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3610/
    Dataset updated
    Jun 30, 2015
    Authors
    Francesco Iorio
    Description

    Basal expression profiles of 1,000 human cancer cell lines in the Genomics of Drug Sensitivity in Cancer (GDSC) panel [upcoming version], profiled using a diverse collection of 265 compounds. We have carried out an extensive computational exploration of the data to determine (1) to what extent the mutational landscape of cancer cell lines recapitulates that seen in primary tumours; (2) what effect the status of these genomic features has on the variation in drug response; (3) whether genomic alterations acting in concert explain more of the variation in drug response; and (4) what the predictive ability of these individual omics data types is, and to what extent it improves when they are combined. [See publication]

  9. Arabidopsis thaliana N-terminal Acetylome

    • ebi.ac.uk
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 13, 2020
    Cite
    Jean Baptiste BOYER (2020). Arabidopsis thaliana N-terminal Acetylome [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD016496
    Dataset updated
    Jul 13, 2020
    Authors
    Jean Baptiste BOYER
    Variables measured
    Proteomics
    Description

    Quantitative study of the N-terminal acetylome variations in Arabidopsis thaliana, looking at the effect of an N-acetyltransferase knockout (KO).

  10. Active ENA Data Hubs

    • ebi.ac.uk
    Updated Dec 1, 2015
    Cite
    (2015). Active ENA Data Hubs [Dataset]. https://www.ebi.ac.uk/ebisearch/data-coverage
    Dataset updated
    Dec 1, 2015
    Description

    Overview of active ENA Data Hubs

  11. Plasma proteomics in children with new-onset type 1 diabetes: a strong tool...

    • ebi.ac.uk
    Updated May 4, 2024
    Cite
    Didier Vertommen (2024). Plasma proteomics in children with new-onset type 1 diabetes: a strong tool to identify partial remission biomarkers [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD049795
    Dataset updated
    May 4, 2024
    Authors
    Didier Vertommen
    Variables measured
    Proteomics
    Description

    Partial remission (PR) occurs in only half of patients with new-onset type 1 diabetes (T1D) and corresponds to a transient period characterized by low daily insulin needs, low glycemic fluctuations and increased endogenous insulin secretion. While identification of new-onset T1D patients with significant residual beta-cell function may foster patient-specific interventions, reliable predictive biomarkers of PR occurrence are currently lacking. We analyzed the plasma of children with new-onset T1D to identify biomarkers present at diagnosis that predicted PR at 3 months post-diagnosis. We first performed an extensive shotgun proteomic analysis using liquid chromatography-tandem mass spectrometry (LC-MS/MS) on the plasma of 16 children with new-onset T1D and quantified nearly 1500 unique proteins, 98 of which correlated significantly with the Insulin-Dose Adjusted glycated hemoglobin A1c score (IDAA1C). We next applied a series of both qualitative and statistical filters that led to the selection of 26 protein candidates associated with pathophysiological mechanisms related to T1D. Finally, we translationally validated several of the candidates using single-shot targeted proteomics (PRM method) on raw plasma. Taken together, we identified plasma biomarkers present at diagnosis that may predict the occurrence of PR in a single mass-spectrometry run. We believe that the identification of new predictive biomarkers of PR and β-cell function is key to stratifying patients with new-onset T1D for β-cell preservation therapies.

  12. Hypoxia_Normoxia_C48_UTR_Watt_2024

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated May 29, 2025
    Cite
    Tyler Cooper (2025). Hypoxia_Normoxia_C48_UTR_Watt_2024 [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD058655
    Dataset updated
    May 29, 2025
    Authors
    Tyler Cooper
    Variables measured
    Proteomics
    Description

    These data are part of a project assessing transcriptional start site switching and UTR switching at the translational level following hypoxia.

  13. Discovery of new cerebrospinal fluid biomarkers for meningitis in children...

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Nov 18, 2023
    Cite
    Da Qi (2023). Discovery of new cerebrospinal fluid biomarkers for meningitis in children C4PR_LIV [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD000764
    Dataset updated
    Nov 18, 2023
    Authors
    Da Qi
    Variables measured
    Proteomics
    Description

    Bacterial meningitis is usually fatal without treatment. Prompt and accurate diagnosis, coupled with the timely administration of parenteral antibiotics, is necessary in order to save lives. The diagnosis can sometimes be delayed whilst samples are analysed in a laboratory using traditional methods of microscopy and antigen testing. The objective of our project is to define specific protein signatures in cerebrospinal fluid associated with Streptococcus pneumoniae infection, which could lead to the development of assays or point-of-care devices to improve the speed and accuracy of diagnosis, and guide clinicians in the treatment and prognosis of children with bacterial meningitis. The associated research paper is in preparation.

  14. Data from: An advanced strategy for comprehensive profiling of...

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Feb 26, 2019
    Cite
    Ivo Hendriks (2019). An advanced strategy for comprehensive profiling of ADP-ribosylation sites using mass spectrometry-based proteomics [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD012243
    Dataset updated
    Feb 26, 2019
    Authors
    Ivo Hendriks
    Variables measured
    Proteomics
    Description

    ADP-ribosylation is a widespread post-translational modification (PTM) with crucial functions in many cellular processes. Here, we describe an in-depth ADP-ribosylome using our Af1521-based proteomics methodology for profiling of ADP-ribosylation sites, by systematically assessing complementary proteolytic digestions and precursor fragmentation through application of electron-transfer higher-energy collisional dissociation (EThcD) and electron transfer dissociation (ETD), respectively. While ETD spectra yielded higher identification scores, EThcD generally proved superior to ETD in identification and localization of ADP-ribosylation sites regardless of protease employed. Notwithstanding, the propensities of complementary proteases and fragmentation methods expanded the detectable repertoire of ADP-ribosylation to an unprecedented depth. This system-wide profiling of the ADP-ribosylome in HeLa cells subjected to DNA damage uncovered >11,000 unique ADP-ribosylated peptides mapping to >7,000 ADP-ribosylation sites, in total modifying over one-third of the human nuclear proteome and highlighting the vast scope of this PTM. High-resolution MS/MS spectra enabled identification of dozens of proteins concomitantly modified by ADP-ribosylation and phosphorylation, revealing a considerable degree of crosstalk on histones. ADP-ribosylation was confidently localized to various amino acid residue types, including less abundantly modified residues, with hundreds of ADP-ribosylation sites pinpointed on histidine, arginine, and tyrosine residues. Functional enrichment analysis suggested modification of these specific residue types is directed in a spatial manner, with tyrosine ADP-ribosylation linked to the ribosome, arginine ADP-ribosylation linked to the endoplasmic reticulum, and histidine ADP-ribosylation linked to the mitochondrion.

  15. Data from: Competitive binding of STATs to receptor phospho-Tyr motifs...

    • ebi.ac.uk
    Cite
    Stephan Wilmes, Competitive binding of STATs to receptor phospho-Tyr motifs accounts for altered cytokine responses in autoimmune disorders [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD024188
    Authors
    Stephan Wilmes
    Variables measured
    Proteomics
    Description

    In this study, we compared the effects of two cytokine treatments on the proteome of human Th-1 cells. We used saturating doses of murine single-chain IL-27 (EBI3+p28, 10 nM) and HyperIL-6 (20 nM) and continuously stimulated cells from three donors with the two cytokines for 24 h, or left them untreated.

  16. Proteomic characterization of sorted peroxisomes

    • ebi.ac.uk
    Updated Apr 3, 2025
    Cite
    Tommaso De Marchi (2025). Proteomic characterization of sorted peroxisomes [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD028679
    Dataset updated
    Apr 3, 2025
    Authors
    Tommaso De Marchi
    Variables measured
    Proteomics
    Description

    Proteomic analysis of sorted peroxisomes (old, young, and middle-aged).

  17. A high-density, organ-specific proteome map for Arabidopsis thaliana

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Apr 25, 2008
    Cite
    Katja Baerenfaller (2008). A high-density, organ-specific proteome map for Arabidopsis thaliana [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PRD000044
    Dataset updated
    Apr 25, 2008
    Authors
    Katja Baerenfaller
    Variables measured
    Proteomics
    Description

    Not available

  18. proteins pulled down by CREB, part 2

    • ebi.ac.uk
    Updated Aug 11, 2019
    Cite
    hui du (2019). proteins pulled down by CREB, part 2 [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD008261
    Dataset updated
    Aug 11, 2019
    Authors
    hui du
    Variables measured
    Proteomics
    Description

    Hepatocarcinoma is the third leading cause of cancer death in the world. In recent years, research on CREB in hepatocellular carcinoma has become a hotspot, so our research group aims to use mass spectrometry to identify which proteins can bind to CREB, and then to explore the links between CREB and hepatocellular carcinoma.

  19. Analysis of SARS-CoV-2 S and human ACE2 protein by LC-MS

    • ebi.ac.uk
    • data.niaid.nih.gov
    Updated Sep 9, 2020
    Cite
    Peng Zhao (2020). Analysis of SARS-CoV-2 S and human ACE2 protein by LC-MS [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD019940
    Dataset updated
    Sep 9, 2020
    Authors
    Peng Zhao
    Variables measured
    Proteomics
    Description

    Analysis of intact O-linked glycopeptides for SARS-CoV-2 S and human ACE2 protein by LC-MS.

  20. A2A2Y4

    • ebi.ac.uk
    Updated Nov 19, 2015
    Cite
    (2015). A2A2Y4 [Dataset]. https://www.ebi.ac.uk/interpro/protein/A2A2Y4
    Dataset updated
    Nov 19, 2015
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Putative tumor suppressor gene that may be implicated in the origin and progression of lung cancer
