100+ datasets found

Human Potential Tumor Associated Antigen database
neuinfo.org
rrid.site
Updated Aug 22, 2024
Cite
(2024). Human Potential Tumor Associated Antigen database [Dataset]. http://identifiers.org/RRID:SCR_002938
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002938
Dataset updated
Aug 22, 2024
Description
To accelerate tumor antigen discovery, we generated a publicly available Human Potential Tumor Associated Antigen database (HPtaa) with potential tumor-associated antigens (pTAAs) identified by in silico computing. The database includes 3518 potential targets and is freely available to academic users. It successfully screened out 41 of 82 known Cancer-Testis antigens, 6 of 18 differentiation antigens, 2 of 2 oncofetal antigens, and 7 of 12 FDA-approved cancer markers that have a Gene ID, and should therefore provide a good platform for identifying cancer target genes. The database draws on expression data from several platforms, including carefully chosen publicly available microarray expression data, GEO SAGE data, and UniGene expression data. Other relevant databases required for TAA discovery, such as CGAP, CCDS, and the Gene Ontology database, were also incorporated. Various strategies and algorithms were developed to integrate the different expression platforms. Known tumor antigens were gathered from the literature and serve as training sets. For each gene, a total tumor specificity penalty was computed from the positive clue penalty for differential expression in human cancers, the corresponding differential ratio, and the normal tissue restriction penalty. We hope this database will help with cancer immunome identification and thereby improve the diagnosis and treatment of human carcinomas.
Serum Antibody Repertoire Profiling Using In Silico Antigen Screen
plos.figshare.com
doc
Updated Jun 1, 2023
Cite
Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov (2023). Serum Antibody Repertoire Profiling Using In Silico Antigen Screen [Dataset]. http://doi.org/10.1371/journal.pone.0067181
Explore at:
doc (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0067181
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Serum antibodies are a valuable source of information on the health state of an organism. Profiles of serum antibody reactivity can be generated by high-throughput sequencing of peptide-coding DNA from combinatorial random peptide phage display libraries selected for binding to serum antibodies. Here we demonstrate that the targets of the immune response, which are recognized by serum antibodies directed against sequential epitopes, can be identified using the serum antibody repertoire profiles generated by high-throughput sequencing. We developed an algorithm that filters the results of a protein database BLAST search for selected peptides to distinguish real antigens recognized by serum antibodies from irrelevant proteins retrieved randomly. When we used this algorithm to analyze serum antibodies from mice immunized with a human protein, we were able to identify the protein used for immunization among the top candidate antigens. When we analyzed a human serum sample from a metastatic melanoma patient, the recombinant protein corresponding to the top candidate from the list generated by the algorithm was recognized by antibodies from metastatic melanoma serum on a western blot, confirming that the method can identify autoantigens recognized by serum antibodies. We also demonstrated that our unbiased method of looking at the repertoire of serum antibodies reveals quantitative information on the epitope composition of the targets of the immune response. A method for deciphering the information contained in serum antibody repertoire profiles may help to identify autoantibodies that can be used for diagnosing and monitoring autoimmune diseases or malignancies.
SV40 Large T-Antigen Mutant Database
neuinfo.org
rrid.site
Updated Jan 29, 2022
Cite
(2022). SV40 Large T-Antigen Mutant Database [Dataset]. http://identifiers.org/RRID:SCR_005313
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_005313
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 15, 2013. The SV40 T antigen database lists viruses and plasmids expressing mutant forms of large T antigen. Each entry contains information regarding the mutant designation, mutant type, virus strain, nucleotide change, amino acid change and pertinent references. Category: Human Genes and Diseases Subcategory: Cancer gene databases
Antibody-Antigen Complex Database
bioregistry.io
Updated Oct 20, 2025
Cite
(2025). Antibody-Antigen Complex Database [Dataset]. https://bioregistry.io/registry/aacdb
Explore at:
Dataset updated
Oct 20, 2025
Description
Identifiers represent antibody-antigen complexes in the Antigen-Antibody Complex Database (AACDB), which provides comprehensive structural and functional annotations including paratope and epitope information, antibody developability data, and antigen-drug target relationships to support immunoinformatics research and therapeutic antibody development.
Immune Epitope Database and Analysis Resource (IEDB)
catalog.data.gov
healthdata.gov
Updated Jul 26, 2023
Cite
National Institutes of Health (NIH) (2023). Immune Epitope Database and Analysis Resource (IEDB) [Dataset]. https://catalog.data.gov/dataset/immune-epitope-database-and-analysis-resource-iedb
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
National Institutes of Health (NIH)
Description
This repository contains antibody/B cell and T cell epitope information and epitope prediction and analysis tools for use by the research community worldwide. Immune epitopes are defined as molecular structures recognized by specific antigen receptors of the immune system, namely antibodies, B cell receptors, and T cell receptors. Immune epitopes from infectious diseases, excluding HIV, and immune-mediated diseases and the accompanying biological information are included.
Blood Group Antigen Gene Mutation Database
dknet.org
rrid.site
Updated Jan 29, 2022
Cite
(2022). Blood Group Antigen Gene Mutation Database [Dataset]. http://identifiers.org/RRID:SCR_002297
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002297
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 23, 2019. BGMUT was a database that provided a publicly accessible platform for DNA sequences and a curated set of blood mutation information. The data archive is available at ftp://ftp.ncbi.nlm.nih.gov/pub/mhc/rbc/Final Archive.
Table3_CAD v1.0: Cancer Antigens Database Platform for Cancer Antigen...
frontiersin.figshare.com
docx
Updated Jun 1, 2023
Cite
Jijun Yu; Luoxuan Wang; Xiangya Kong; Yang Cao; Mengmeng Zhang; Zhaolin Sun; Yang Liu; Jing Wang; Beifen Shen; Xiaochen Bo; Jiannan Feng (2023). Table3_CAD v1.0: Cancer Antigens Database Platform for Cancer Antigen Algorithm Development and Information Exploration.docx [Dataset]. http://doi.org/10.3389/fbioe.2022.819583.s007
Explore at:
docx (available download formats)
Unique identifier
https://doi.org/10.3389/fbioe.2022.819583.s007
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Jijun Yu; Luoxuan Wang; Xiangya Kong; Yang Cao; Mengmeng Zhang; Zhaolin Sun; Yang Liu; Jing Wang; Beifen Shen; Xiaochen Bo; Jiannan Feng
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cancer vaccines have gradually attracted attention for their tremendous preclinical and clinical performance. With the development of next-generation sequencing technologies and related algorithms, pipelines based on sequencing and machine learning methods have become mainstream in cancer antigen prediction; of particular focus are neoantigens, mutation peptides that only exist in tumor cells that lack central tolerance and have fewer side effects. The rapid prediction and filtering of neoantigen peptides are crucial to the development of neoantigen-based cancer vaccines. However, due to the lack of verified neoantigen datasets and insufficient research on the properties of neoantigens, neoantigen prediction algorithms still need to be improved. Here, we recruited verified cancer antigen peptides and collected as much relevant peptide information as possible. Then, we discussed the role of each dataset for algorithm improvement in cancer antigen research, especially neoantigen prediction. A platform, Cancer Antigens Database (CAD, http://cad.bio-it.cn/), was designed to facilitate users to perform a complete exploration of cancer antigens online.
Example proteins and validated epitopes present in the IEDB 3.0 database.
plos.figshare.com
xls
Updated Jun 13, 2023
Cite
Joana Pissarra; Franck Dorkeld; Etienne Loire; Vincent Bonhomme; Denis Sereno; Jean-Loup Lemesre; Philippe Holzmuller (2023). Example proteins and validated epitopes present in the IEDB 3.0 database. [Dataset]. http://doi.org/10.1371/journal.pone.0273494.t001
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0273494.t001
Dataset updated
Jun 13, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Joana Pissarra; Franck Dorkeld; Etienne Loire; Vincent Bonhomme; Denis Sereno; Jean-Loup Lemesre; Philippe Holzmuller
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example proteins and validated epitopes present in the IEDB 3.0 database.
Epitome
scicrunch.org
neuinfo.org
Cite
Epitome [Dataset]. http://identifiers.org/RRID:SCR_007641
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007641
Description
Epitome is a database of structurally inferred antigenic epitopes in proteins. It includes all known antigenic residues and the antibodies that interact with them, including a detailed description of the residues involved in the interaction and their sequence/structure environments. Additionally, interactions can be visualized using an interface to Jmol. The website also contains specialized software, NLProt, that lets users extract protein names and sequences from natural language text, and links to several other databases involved in antibody/antigen interactions. Keywords: antibody/antigen interactions, antigen epitope
Data from: Kabat Database of Sequences of Proteins of Immunological Interest...
neuinfo.org
dknet.org
Updated Jun 27, 2024
Cite
(2024). Kabat Database of Sequences of Proteins of Immunological Interest [Dataset]. http://identifiers.org/RRID:SCR_006465
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006465
Dataset updated
Jun 27, 2024
Description
The Kabat Database determines the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. The Kabat Database searching and analysis tools package is an ASP.NET web-based portal containing lookup tools, sequence matching tools, alignment tools, length distribution tools, positional correlation tools and much more. The searching and analysis tools are custom made for the aligned data sets contained in both the SQL Server and ASCII text flat file formats. The searching and analysis tools may be run on a single PC workstation or in a distributed environment. The analysis tools are written in ASP.NET and C# and are available in Visual Studio .NET 2003/2005/2008 formats. The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences at that time. Bence Jones proteins, mostly from human, were aligned, using the now-known Kabat numbering system, and a quantitative measure, variability, was calculated for every position. Three peaks, at positions 24-34, 50-56 and 89-97, were identified and proposed to form the complementarity determining regions (CDR) of light chains. Subsequently, antibody heavy chain amino acid sequences were also aligned using a different numbering system, since the locations of their CDRs (31-35B, 50-65 and 95-102) are different from those of the light chains. 
CDRL1 starts right after the first invariant Cys 23 of light chains, while CDRH1 is eight amino acid residues away from the first invariant Cys 22 of heavy chains. During the past 30 years, the Kabat database has grown to include nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules and other proteins of immunological interest. It has been used extensively by immunologists to derive useful structural and functional information from the primary sequences of these proteins.
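The variability measure mentioned above is usually given, following Wu and Kabat, as the number of different amino acids observed at a position divided by the frequency of the most common amino acid at that position. The following is a minimal Python sketch of that calculation, assuming the sequences are already aligned to equal length. The function name and the toy alignment are illustrative only and are not taken from the Kabat database or its tools.

```python
from collections import Counter


def kabat_variability(aligned_seqs):
    """Return the Wu-Kabat variability for each column of an alignment.

    variability = (number of distinct residues at the position)
                  / (frequency of the most common residue at the position)
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # Frequency (0..1) of the most common residue in this column.
        top_frequency = counts.most_common(1)[0][1] / len(column)
        scores.append(len(counts) / top_frequency)
    return scores


# Made-up five-residue alignment, for illustration only.
toy_alignment = ["QSVLT", "QSALT", "ESVLT", "QTVLT"]
variability = kabat_variability(toy_alignment)
```

A fully conserved column scores 1.0, and higher scores mark more variable positions. This sketch treats gap characters like any other symbol, which a real analysis would probably want to handle separately.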
ExPASy ABCD database
neuinfo.org
dknet.org
Updated Jan 29, 2022
Cite
(2022). ExPASy ABCD database [Dataset]. http://identifiers.org/RRID:SCR_017401
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_017401
Dataset updated
Jan 29, 2022
Description
Repository of sequenced antibodies, integrating curated information about antibody and its antigen with cross links to standardized databases of chemical and protein entities. Manually curated repository of sequenced antibodies, developed by Geneva Antibody Facility at University of Geneva, in collaboration with CALIPHO and Swiss Prot groups at SIB Swiss Institute of Bioinformatics. Database provides list of sequenced antibodies with their known targets. Each antibody is assigned unique ID number that can be used in academic publications to increase reproducibility of experiments.
Antibody and Nanobody Design Dataset (ANDD)
zenodo.org
zip
Updated Sep 26, 2025
Cite
Yikai Wu; Yikai Wu (2025). Antibody and Nanobody Design Dataset (ANDD) [Dataset]. http://doi.org/10.5281/zenodo.16894086
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.16894086
Dataset updated
Sep 26, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yikai Wu; Yikai Wu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Title: Antibody and Nanobody Design Dataset (ANDD): A Comprehensive Resource with Sequence, Structure, and Binding Affinity Data

DOI: 10.5281/zenodo.16894086

Resource Type: Dataset

Publisher: Zenodo

Publication Year: 2025

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Overview (Abstract):

The Antibody and Nanobody Design Dataset (ANDD) is a unified, large-scale dataset created to overcome the limitations of data fragmentation and incompleteness in antibody and nanobody research. It integrates sequence, structure, antigen information, and binding affinity data from 15 diverse sources, including OAS, PDB, SAbDab, and others. ANDD comprises 48,800 antibody/nanobody sequences, structural data for 25,158 entries, antigen sequences for 12,617 entries, and a total of 9,569 binding affinity values for antibody/nanobody-antigen pairs. A key innovation is the augmentation of experimental affinity data with 5,218 high-quality predictions generated by the ANTIPASTI model. This makes ANDD the largest available dataset of its kind, providing a robust foundation for training and validating deep learning models in therapeutic antibody and nanobody design.

Keywords: Dataset, Antibody Design, Nanobody Design, VHH, Deep Learning, Protein Engineering, Binding Affinity, Therapeutic Antibodies, Computational Biology

Methods (Data Curation and Processing):

The ANDD was constructed through a rigorous multi-step process:

Data Collection: Data was aggregated from 15 primary sources, including both antibody/nanobody-specific databases (e.g., OAS, SAbDab, INDI, sdAb-DB) and general protein databases (e.g., PDB, UniProt, PDBbind).

Integration and Standardization: Data from disparate sources was consolidated into a consistent format, addressing challenges of format inconsistency. Entries were manually validated to exclude non-relevant data (e.g., T-cell receptors).

Affinity Data Augmentation: The ANTIPASTI deep learning model was used to predict and add binding affinity values for entries that had structural data but lacked experimental affinity measurements.

Manual Curation: Web-based data and information from publicly available patents targeting key antigens (HER2, IL-6, CD45, SARS-CoV-2 RBD) were manually extracted to enhance completeness.

Hierarchical Organization: Data is organized in a hierarchical structure, offering four progressively detailed levels: Sequence-only, Sequence+Structure, Sequence+Structure+Antigen, and Sequence+Structure+Antigen+Affinity.

Data Specifications and Format:

The dataset is distributed in two parts:

ANDD.csv: A comprehensive spreadsheet containing all annotated metadata for each entry.

All_structures/ folder: A directory containing the corresponding PDB structure files for entries with structural data.

The ANDD.csv file includes the following key fields (a full description is available in the Data Record section of the paper):

General Info: Source, Update_Date, PDB_ID, Experimental_Method, Ab_or_Nano, Source_Organism.

Chain Details: Entity IDs, Asym IDs, Database Accession Codes, and Macromolecule Names for Heavy (H) and Light (L) chains.

Antigen Details: Ag_Name, Ag_Seq, Ag_Source Organism, and relevant database identifiers.

Sequence Data: Full amino acid sequences for H/L chains and individual CDR regions (H1-H3, L1-L3).

Affinity Data: Experimentally measured or predicted Affinity_Kd(M), ∆Gbinding(kJ), and the Affinity_Method.

Mutation Data: Annotation of any amino acid mutations (Ab/Nano_mutation).
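
The four hierarchical levels described under Methods can in principle be recovered from which of these fields are filled in for an entry. The following is a minimal Python sketch under stated assumptions: the column names PDB_ID, Ag_Seq, and Affinity_Kd(M) come from the field list above, while H_Seq and L_Seq are hypothetical names for the heavy- and light-chain sequence columns, and the rule itself is an illustration, not the authors' documented procedure.

```python
LEVELS = [
    "Sequence-only",
    "Sequence+Structure",
    "Sequence+Structure+Antigen",
    "Sequence+Structure+Antigen+Affinity",
]


def detail_level(entry):
    """Return the most detailed level whose fields are all filled, or None.

    `entry` is a dict of strings, such as a row from csv.DictReader.
    H_Seq and L_Seq are hypothetical column names (assumption).
    """
    def filled(key):
        return bool((entry.get(key) or "").strip())

    # An entry needs at least one chain sequence to count at all.
    if not (filled("H_Seq") or filled("L_Seq")):
        return None

    level = 0
    # Levels are progressive: stop at the first missing layer.
    for key in ("PDB_ID", "Ag_Seq", "Affinity_Kd(M)"):
        if not filled(key):
            break
        level += 1
    return LEVELS[level]


# Placeholder rows, not real ANDD records.
seq_only = {"H_Seq": "EVQLVESGG", "PDB_ID": ""}
with_antigen = {"H_Seq": "EVQLVESGG", "PDB_ID": "XXXX", "Ag_Seq": "MKT"}
level_a = detail_level(seq_only)
level_b = detail_level(with_antigen)
```

Because the levels are treated as progressive, an entry with an affinity value but no structure would still be reported as "Sequence-only" by this sketch.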

Technical Validation:

The quality of ANDD has been ensured through extensive validation:

Manual Curation: A rigorous manual review process was conducted to check for accuracy and consistency between sequence, structure, and affinity data across randomly selected entries.

Affinity Validation with AlphaBind: The experimental Kd values were validated by comparing them against enrichment ratios predicted by the AlphaBind model, showing a significant correlation (Pearson’s r = 0.750).

Cross-Mapping Validation: The internal consistency between Kd and ∆Gbinding values within the dataset was confirmed, showing a perfect correlation (Pearson’s r = 1.000) as per thermodynamic principles.

Proof-of-Concept Application: The dataset's utility was demonstrated by fine-tuning the DiffAb generative model on a subset of ANDD. The fine-tuned model showed significant improvements in generating nanobodies with better predicted binding affinity, structural diversity, and developability metrics.
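
The thermodynamic relation behind the cross-mapping check above is the standard one, ∆G = RT ln(Kd), with Kd expressed relative to a 1 M reference. The following is a minimal Python sketch of that conversion. The temperature of 298.15 K is an assumption, since the description does not state which temperature was used, and the function name is illustrative.

```python
import math

# Gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 8.314462618e-3


def delta_g_kj_per_mol(kd_molar, temperature_k=298.15):
    """Binding free energy from a dissociation constant: dG = R*T*ln(Kd).

    kd_molar is Kd in mol/L (relative to a 1 M standard state).
    temperature_k defaults to 298.15 K, which is an assumption here.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# A 1 nM binder comes out at roughly -51.4 kJ/mol at 298.15 K.
dg_nanomolar = delta_g_kj_per_mol(1e-9)
```

If ∆Gbinding was derived from Kd in this way at a fixed temperature, it is exactly linear in ln(Kd), which would be consistent with the reported correlation of 1.000.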

Potential Uses:

ANDD is designed to accelerate research in computational biology and drug discovery, including:

Training and benchmarking deep learning models for de novo antibody/nanobody sequence and structure generation.

Developing and validating predictive models for antibody-antigen binding affinity.

Studying structure-function relationships in antibody-antigen interactions.

Facilitating the design of optimized therapeutic antibodies and nanobodies with improved specificity and efficacy.

Access and License:

The ANDD dataset is publicly available for download under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Users are free to share and adapt the material for any purpose, even commercially, provided appropriate credit is given to the original authors and this data descriptor is cited.
CTDatabase
dknet.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). CTDatabase [Dataset]. http://identifiers.org/RRID:SCR_007614
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007614 https://identifiers.org/RRID:SCR_007614/resolver
Dataset updated
Jan 29, 2022
Description
A database of information about each Cancer-Testis (CT) gene, its gene products and the immune response induced in cancer patients by these proteins. CT antigens are proteins normally expressed only in the human germ line but that are also present in a significant subset of malignant tumors. The practical importance of these proteins is that due to their restricted expression pattern they are frequently recognized by the immune system of cancer patients. Moreover, this antigenicity has raised the possibility of their being used as vaccines to actively stimulate immune responses in order to combat tumor growth. As a result worldwide research into many aspects of CT antigens is rapidly growing prompting the construction of this database as a resource for investigators involved in this area.
BciPep
neuinfo.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). BciPep [Dataset]. http://identifiers.org/RRID:SCR_007559
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007559
Dataset updated
Jan 29, 2022
Description
Bcipep is a collection of peptides that play a role in humoral immunity. The peptides in the database have varying measures of immunogenicity. This database can assist in the development of methods for predicting B cell epitopes, designing synthetic vaccines, and in disease diagnosis. These peptides lead to the generation of antibodies that combine with antigens and are responsible for host defense, and they can be very useful for subunit vaccine design. The database has 3031 peptide entries. For each peptide, the user can find a range of information, including entry number, peptide sequence, pathogen group, protein source, antigen structure, antibody, etc.
AntiBodies Chemically Defined database
bioregistry.io
Updated Aug 12, 2021
Cite
(2021). AntiBodies Chemically Defined database [Dataset]. https://bioregistry.io/registry/abcd
Explore at:
Dataset updated
Aug 12, 2021
Description
The ABCD (AntiBodies Chemically Defined) database is a manually curated depository of sequenced antibodies.
GlycoEpitope
dknet.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). GlycoEpitope [Dataset]. http://identifiers.org/RRID:SCR_014404
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014404
Dataset updated
Jan 29, 2022
Description
A database of carbohydrate antigens and matching antibodies. Epitopes and antibodies are listed within the database. Users may also search for epitopes and antibodies by keyword, epitope ID, tissue, receptor, enzyme, and other fields.
Data from: Statistical Analysis and Tokenization of Epitopes to Construct...
acs.figshare.com
bin
Updated Sep 20, 2023
Cite
Elena Lopez-Martinez; Aitor Manteca; Noelia Ferruz; Aitziber L. Cortajarena (2023). Statistical Analysis and Tokenization of Epitopes to Construct Artificial Neoepitope Libraries [Dataset]. http://doi.org/10.1021/acssynbio.3c00201.s004
Explore at:
bin (available download formats)
Unique identifier
https://doi.org/10.1021/acssynbio.3c00201.s004
Dataset updated
Sep 20, 2023
Dataset provided by
ACS Publications
Authors
Elena Lopez-Martinez; Aitor Manteca; Noelia Ferruz; Aitziber L. Cortajarena
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Epitopes are specific regions on an antigen’s surface that the immune system recognizes. Epitopes are usually protein regions on foreign immune-stimulating entities such as viruses and bacteria, and in some cases, endogenous proteins may act as antigens. Identifying epitopes is crucial for accelerating the development of vaccines and immunotherapies. However, mapping epitopes in pathogen proteomes is challenging using conventional methods. Screening artificial neoepitope libraries against antibodies can overcome this issue. Here, we applied conventional sequence analysis and methods inspired by natural language processing to reveal specific sequence patterns in the linear epitopes deposited in the Immune Epitope Database (www.iedb.org) that can serve as building blocks for the design of universal epitope libraries. Our results reveal that amino acid frequency in annotated linear epitopes differs from that in the human proteome. Aromatic residues are overrepresented, while the presence of cysteines is practically null in epitopes. Byte pair encoding tokenization shows high frequencies of tryptophan in tokens of 5, 6, and 7 amino acids, corroborating the findings of the conventional sequence analysis. These results can be applied to reduce the diversity of linear epitope libraries by orders of magnitude.
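The conventional part of the analysis described above, comparing amino acid frequencies between sets of sequences, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the peptides are made up rather than taken from the IEDB, and counting only F, W, and Y as aromatic is an assumption.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AROMATIC = "FWY"  # assumption: histidine is not counted as aromatic here


def residue_frequencies(peptides):
    """Return the relative frequency of each standard amino acid."""
    counts = Counter(
        residue
        for peptide in peptides
        for residue in peptide.upper()
        if residue in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def aromatic_fraction(frequencies):
    """Sum the frequencies of the residues listed in AROMATIC."""
    return sum(frequencies[aa] for aa in AROMATIC)


# Made-up peptides, for illustration only.
toy_epitopes = ["WYFA", "GWKL"]
freqs = residue_frequencies(toy_epitopes)
aromatic = aromatic_fraction(freqs)
```

Running the same function on a background set, such as a proteome, and dividing the two frequencies per residue would give a simple over- or under-representation ratio of the kind the description refers to.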
Animal Genome Database
neuinfo.org
rrid.site
Updated Jan 29, 2022
Cite
(2022). Animal Genome Database [Dataset]. http://identifiers.org/RRID:SCR_008165
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008165
Dataset updated
Jan 29, 2022
Description
Database of comparative gene mapping between species to assist the mapping of the genes related to phenotypic traits in livestock. The linkage maps, cytogenetic maps, polymerase chain reaction primers of pig, cattle, mouse and human, and their references have been included in the database, and the correspondence among species has been stipulated in the database. AGP is an animal genome database developed on a Unix workstation and maintained by a relational database management system. It is a joint project of the National Institute of Agrobiological Sciences (NIAS) and the Institute of the Society for Techno-innovation of Agriculture, Forestry and Fisheries (STAFF-Institute), in cooperation with other related research institutes. AGP also contains the Pig Expression Data Explorer (PEDE), a database of porcine EST collections derived from full-length cDNA libraries and full-length sequences of the cDNA clones picked from the EST collection. The EST sequences have been clustered and assembled, and their similarity to sequences in RefSeq and UniGene determined. The PEDE database system was constructed to store sequences and similarity data of swine full-length cDNA libraries and to make them available to users. It provides interfaces for keyword and ID searches of BLAST results and enables users to obtain sequence data and names of clones of interest. Putative SNPs in EST assemblies have been classified according to breed specificity and their effect on coding amino acids, and the assemblies are equipped with an SNP search interface. The database contains porcine nucleotide sequences and cDNA clones that are ready for analyses such as expression in mammalian cells, because of their high likelihood of containing full-length CDS. PEDE will be useful for researchers who want to explore genes that may be responsible for traits such as disease susceptibility. 
The database also offers information regarding major and minor porcine-specific antigens, which might be investigated in regard to the use of pigs as models in various medical research applications.
Data from: Antigen-specific cytometry
catalog.data.gov
data.virginia.gov
Updated Sep 30, 2025
Cite
National Institutes of Health (2025). Antigen-specific cytometry [Dataset]. https://catalog.data.gov/dataset/antigen-specific-cytometry
Explore at:
Dataset updated
Sep 30, 2025
Dataset provided by
National Institutes of Health
Description
From its origins in the 16th century, microscopy has allowed the cell, as the basic unit of eukaryotic life and disease, to be identified and analyzed. Today, quantitative cytometric technologies, either microscope based or flow cytometric, are the most powerful tools to analyze the proliferation, physiology and differentiation of cells generally, and are particularly useful in immunopathology. In combination with monoclonal antibodies (which recognize specific gene products) conjugated to sensitive fluorescent dyes, cell types can be identified according to the genes they express. They can also be isolated using either fluorescence-activated cell sorting (FACS) or magnetic cell sorting (MACS). In the past 20 years, immunofluorescence-based cytometry and cell sorting have become 'state of the art' technologies, mostly serving to identify subsets of lymphocytes and systemic changes in the immune system. Although it is certainly of value for diagnosis and analysis of immunopathology, cytometry did have one major limitation; except in a few experimental situations, it was not possible to focus analysis on those lymphocytes that specifically recognize the relevant antigens in a normal or pathological immune reaction. This drawback has recently been overcome both for B and T lymphocytes, using antigen to identify the cells. Today, a number of exciting new technologies make it possible to analyze and isolate specifically those lymphocytes that are directly involved in the immune reaction to given antigens. These advances will spur research in arthritis considerably.
DataSheet_2_Large-scale template-based structural modeling of T-cell...
frontiersin.figshare.com
pdf
Updated Aug 15, 2023
Cite
Dmitrii S. Shcherbinin; Vadim K. Karnaukhov; Ivan V. Zvyagin; Dmitriy M. Chudakov; Mikhail Shugay (2023). DataSheet_2_Large-scale template-based structural modeling of T-cell receptors with known antigen specificity reveals complementarity features.pdf [Dataset]. http://doi.org/10.3389/fimmu.2023.1224969.s002
Explore at:
pdf (available download formats)
Unique identifier
https://doi.org/10.3389/fimmu.2023.1224969.s002
Dataset updated
Aug 15, 2023
Dataset provided by
Frontiers
Authors
Dmitrii S. Shcherbinin; Vadim K. Karnaukhov; Ivan V. Zvyagin; Dmitriy M. Chudakov; Mikhail Shugay
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: T-cell receptor (TCR) recognition of foreign peptides presented by the major histocompatibility complex (MHC) initiates the adaptive immune response against pathogens. While a large number of TCR sequences specific to different antigenic peptides are known to date, the structural data describing the conformation and contacting residues for TCR-peptide-MHC complexes is relatively limited. In the present study we aim to extend and analyze the set of available structures by performing highly accurate template-based modeling of these complexes using TCR sequences with known specificity. Methods: Identification of CDR3 sequences and their further clustering, based on available spatial structures, V- and J-genes of corresponding T-cell receptors, and epitopes, was performed using the VDJdb database. Modeling of the selected CDR3 loops was conducted using a stepwise introduction of single amino acid substitutions to the template PDB structures, followed by optimization of the TCR-peptide-MHC contacting interface using the Rosetta package applications. Statistical analysis and recursive feature elimination procedures were carried out on computed energy values and properties of contacting amino acid residues between CDR3 loops and peptides, using R. Results: Using the set of 29 complex templates (including a template with SARS-CoV-2 antigen) and 732 specificity records, we built a database of 1585 model structures carrying substitutions in either TCRα or TCRβ chains, with some models representing the result of different mutation pathways for the same final structure. This database allowed us to analyze features of amino acid contacts in TCR-peptide interfaces that govern antigen recognition preferences and interpret these interactions in terms of physicochemical properties of interacting residues. Conclusion: Our results provide a methodology for creating high-quality TCR-peptide-MHC models for antigens of interest that can be utilized to predict TCR specificity.

(2024). Human Potential Tumor Associated Antigen database [Dataset]. http://identifiers.org/RRID:SCR_002938

Human Potential Tumor Associated Antigen database

RRID:SCR_002938, nif-0000-02987, Human Potential Tumor Associated Antigen database (RRID:SCR_002938), HPtaa Database

6 scholarly articles cite this dataset

Unique identifier

https://identifiers.org/RRID:SCR_002938

Dataset updated

Aug 22, 2024

Description

To accelerate tumor antigen discovery, we generated a publicly available Human Potential Tumor Associated Antigen database (HPtaa) containing potential tumor-associated antigens (pTAAs) identified by in silico computing. The database includes 3518 potential targets and is freely available to academic users. It successfully recovered 41 of 82 known Cancer-Testis antigens, 6 of 18 differentiation antigens, 2 of 2 oncofetal antigens, and 7 of 12 FDA-approved cancer markers that have a Gene ID, and should therefore provide a good platform for identifying cancer target genes. The database draws on expression data from several platforms, including carefully chosen publicly available microarray expression data, GEO SAGE data, and UniGene expression data. Other databases relevant to TAA discovery, such as CGAP, CCDS, and the Gene Ontology database, were also incorporated. Various strategies and algorithms were developed to integrate the different expression platforms. Known tumor antigens were gathered from the literature and serve as training sets. For each gene, a total tumor specificity penalty was computed from the positive clue penalty for differential expression in human cancers, the corresponding differential ratio, and the normal tissue restriction penalty. We hope this database will aid cancer immunome identification and thereby help improve the diagnosis and treatment of human carcinomas.
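The recovery counts quoted in the description can be turned into rates with a few lines of arithmetic. The sketch below uses only the counts stated above; the pooled figure assumes the four antigen classes do not overlap, which the description does not confirm.

```python
# Counts taken from the description: (recovered, known) per antigen class.
recovered = {
    "Cancer-Testis antigens": (41, 82),
    "differentiation antigens": (6, 18),
    "oncofetal antigens": (2, 2),
    "FDA-approved cancer markers": (7, 12),
}


def recovery_rates(counts):
    """Return the fraction of known antigens recovered for each class."""
    return {name: hit / total for name, (hit, total) in counts.items()}


rates = recovery_rates(recovered)

# Pooled rate across classes; assumes no gene is counted in two classes.
total_hit = sum(hit for hit, _ in recovered.values())
total_known = sum(total for _, total in recovered.values())
overall = total_hit / total_known

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"overall: {total_hit}/{total_known} = {overall:.1%}")
```

Per-class rates vary widely, from one third for differentiation antigens to all of the oncofetal antigens, so the pooled figure should be read alongside the individual ones.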

Human Potential Tumor Associated Antigen database

Serum Antibody Repertoire Profiling Using In Silico Antigen Screen

SV40 Large T-Antigen Mutant Database

Antibody-Antigen Complex Database

Immune Epitope Database and Analysis Resource (IEDB)

Blood Group Antigen Gene Mutation Database

Table3_CAD v1.0: Cancer Antigens Database Platform for Cancer Antigen...

Example proteins and validated epitopes present in the IEDB 3.0 database.

Epitome

Data from: Kabat Database of Sequences of Proteins of Immunological Interest...

ExPASy ABCD database

Antibody and Nanobody Design Dataset (ANDD)

CTDatabase

BciPep

AntiBodies Chemically Defined database

GlycoEpitope

Data from: Statistical Analysis and Tokenization of Epitopes to Construct...

Animal Genome Database

Data from: Antigen-specific cytometry

DataSheet_2_Large-scale template-based structural modeling of T-cell...
