100+ datasets found
  1. Data from: AntiBody Sequence Database

    • bioregistry.io
    Updated Jan 23, 2025
    Cite
    (2025). AntiBody Sequence Database [Dataset]. https://bioregistry.io/absd
    Dataset updated
    Jan 23, 2025
    Description

    The AntiBody Sequence Database is a public dataset for antibody sequence data. It provides unique identifiers for antibody sequences, including both immunoglobulin and single-chain variable fragment sequences. These are critical for immunological studies, and the database allows users to search and retrieve antibody sequences based on sequence similarity, specificity, and other biological properties.

  2. Structural Antibody Database

    • rrid.site
    • neuinfo.org
    • +2more
    Updated Apr 20, 2022
    Cite
    (2022). Structural Antibody Database [Dataset]. http://identifiers.org/RRID:SCR_022096/resolver?q=*&i=rrid
    Dataset updated
    Apr 20, 2022
    Description

    Database containing all antibody structures available in the PDB, annotated and presented in a consistent fashion. Each structure is annotated with a number of properties, including experimental details, antibody nomenclature (e.g. heavy-light pairings), curated affinity data and sequence annotations. You can use the database to inspect individual structures, create and download datasets for analysis, search the database for structures with sequences similar to your query, and monitor the known structural repertoire of antibodies.

  3. Raw data from external antibody databases and scripts to homogenize and...

    • entrepot.recherche.data.gouv.fr
    application/x-gzip +1
    Updated Feb 4, 2025
    Cite
    Nicolas MAILLET; Simon MALESYS (2025). Raw data from external antibody databases and scripts to homogenize and standardize them used to build AntiBody Sequence Database (for reproducibility) [Dataset]. http://doi.org/10.57745/DDLHWU
    Available download formats: application/x-gzip (620431), application/x-gzip (163643), application/x-gzip (6833391387), text/markdown (12475), application/x-gzip (80726198), application/x-gzip (65497009)
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    Nicolas MAILLET; Simon MALESYS
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/3.1/customlicense?persistentId=doi:10.57745/DDLHWU

    Description

    Reproducibility data for the AntiBody Sequence Database (ABSD) article. This dataset contains the raw data (antibody sequences) extracted on June 20, 2024, from various databases, as well as the scripts needed to ensure the reproducibility of our results. External databases used: ABDB, AbPDB, CoV-AbDab, Genbank, IMGT, PDB, SACS, SAbDab, TheraSAbDab, UniProt, KABAT. Scripts usage: each external database has a corresponding script to format all antibody sequences extracted from it. A final script merges all extracted antibody sequences while removing redundancy, standardizing and cleaning the data.
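
    The merge step described above can be pictured with a minimal sketch. This is not the ABSD authors' script: the cleaning rule (uppercase, letters only) and the function names `clean_sequence` and `merge_sources` are assumptions made for illustration.

```python
from __future__ import annotations


def clean_sequence(raw: str) -> str:
    """Uppercase and keep letters only (hypothetical cleaning rule)."""
    return "".join(ch for ch in raw.upper() if ch.isalpha())


def merge_sources(sources: dict[str, list[str]]) -> dict[str, list[str]]:
    """Merge per-database sequence lists into one non-redundant mapping.

    Returns: cleaned sequence -> sorted list of databases that contained it.
    """
    merged: dict[str, set[str]] = {}
    for db_name, sequences in sources.items():
        for raw in sequences:
            seq = clean_sequence(raw)
            if not seq:
                continue  # skip records that are empty after cleaning
            merged.setdefault(seq, set()).add(db_name)
    return {seq: sorted(dbs) for seq, dbs in merged.items()}


if __name__ == "__main__":
    # Toy input; the sequences are invented fragments, not real records.
    demo = {"SAbDab": ["evqlves", "DIQMTQ"], "PDB": ["EVQLVES ", "qvq-lq"]}
    for seq, dbs in merge_sources(demo).items():
        print(seq, dbs)
```

    Keeping the list of source databases per sequence preserves provenance after redundancy removal, which is one plausible design choice among several.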

  4. MSDatasets and Antibody Databases.zip

    • figshare.com
    txt
    Updated Jul 24, 2025
    Cite
    Kunyi Li (2025). MSDatasets and Antibody Databases.zip [Dataset]. http://doi.org/10.6084/m9.figshare.29634140.v2
    Available download formats: txt
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kunyi Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MS Datasets: (1) 100 simulated spectra; (2) Waters spectra; (3) HB100 spectra. Antibody database: (1) a database of 800 antibody sequences; (2) a database of 500 decoy sequences from mouse.

  5. Antibody and Nanobody Design Dataset (ANDD)

    • zenodo.org
    zip
    Updated Sep 26, 2025
    Cite
    Yikai Wu (2025). Antibody and Nanobody Design Dataset (ANDD) [Dataset]. http://doi.org/10.5281/zenodo.16894086
    Available download formats: zip
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yikai Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Title: Antibody and Nanobody Design Dataset (ANDD): A Comprehensive Resource with Sequence, Structure, and Binding Affinity Data

    DOI: 10.5281/zenodo.16894086

    Resource Type: Dataset

    Publisher: Zenodo

    Publication Year: 2025

    License: Creative Commons Attribution 4.0 International (CC BY 4.0)

    Overview (Abstract):

    The Antibody and Nanobody Design Dataset (ANDD) is a unified, large-scale dataset created to overcome the limitations of data fragmentation and incompleteness in antibody and nanobody research. It integrates sequence, structure, antigen information, and binding affinity data from 15 diverse sources, including OAS, PDB, SAbDab, and others. ANDD comprises 48,800 antibody/nanobody sequences, structural data for 25,158 entries, antigen sequences for 12,617 entries, and a total of 9,569 binding affinity values for antibody/nanobody-antigen pairs. A key innovation is the augmentation of experimental affinity data with 5,218 high-quality predictions generated by the ANTIPASTI model. This makes ANDD the largest available dataset of its kind, providing a robust foundation for training and validating deep learning models in therapeutic antibody and nanobody design.

    Keywords: Dataset, Antibody Design, Nanobody Design, VHH, Deep Learning, Protein Engineering, Binding Affinity, Therapeutic Antibodies, Computational Biology

    Methods (Data Curation and Processing):

    The ANDD was constructed through a rigorous multi-step process:

    1. Data Collection: Data was aggregated from 15 primary sources, including both antibody/nanobody-specific databases (e.g., OAS, SAbDab, INDI, sdAb-DB) and general protein databases (e.g., PDB, UNIPROT, PDBbind).
    2. Integration and Standardization: Data from disparate sources was consolidated into a consistent format, addressing challenges of format inconsistency. Entries were manually validated to exclude non-relevant data (e.g., T-cell receptors).
    3. Affinity Data Augmentation: The ANTIPASTI deep learning model was used to predict and add binding affinity values for entries that had structural data but lacked experimental affinity measurements.
    4. Manual Curation: Web-based data and information from publicly available patents targeting key antigens (HER2, IL-6, CD45, SARS-CoV-2 RBD) were manually extracted to enhance completeness.
    5. Hierarchical Organization: Data is organized in a hierarchical structure, offering four progressively detailed levels: Sequence-only, Sequence+Structure, Sequence+Structure+Antigen, and Sequence+Structure+Antigen+Affinity.
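
    The four levels in step 5 can be read as nested tiers. The sketch below assigns an entry to its most detailed tier, assuming each level requires everything in the level before it; the `Entry` class and its field names are hypothetical and do not come from the dataset.

```python
from dataclasses import dataclass
from typing import Optional

# Level names as listed in the dataset description.
LEVELS = [
    "Sequence-only",
    "Sequence+Structure",
    "Sequence+Structure+Antigen",
    "Sequence+Structure+Antigen+Affinity",
]


@dataclass
class Entry:
    """Hypothetical in-memory record; field names are illustrative only."""
    sequence: str
    has_structure: bool = False
    antigen_seq: Optional[str] = None
    affinity_kd: Optional[float] = None


def level(entry: Entry) -> str:
    """Return the most detailed level whose requirements the entry meets."""
    if not entry.has_structure:
        return LEVELS[0]
    if not entry.antigen_seq:
        return LEVELS[1]
    if entry.affinity_kd is None:
        return LEVELS[2]
    return LEVELS[3]
```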

    Data Specifications and Format:

    The dataset is distributed in two parts:

    1. ANDD.csv: A comprehensive spreadsheet containing all annotated metadata for each entry.
    2. All_structures/ folder: A directory containing the corresponding PDB structure files for entries with structural data.

    The ANDD.csv file includes the following key fields (a full description is available in the Data Record section of the paper):

    • General Info: Source, Update_Date, PDB_ID, Experimental_Method, Ab_or_Nano, Source_Organism.
    • Chain Details: Entity IDs, Asym IDs, Database Accession Codes, and Macromolecule Names for Heavy (H) and Light (L) chains.
    • Antigen Details: Ag_Name, Ag_Seq, Ag_Source Organism, and relevant database identifiers.
    • Sequence Data: Full amino acid sequences for H/L chains and individual CDR regions (H1-H3, L1-L3).
    • Affinity Data: Experimentally measured or predicted Affinity_Kd(M), ∆Gbinding(kJ), and the Affinity_Method.
    • Mutation Data: Annotation of any amino acid mutations (Ab/Nano_mutation).
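
    As a rough illustration of working with these fields, the sketch below filters rows that carry a numeric affinity value using Python's standard csv module. The column names `PDB_ID`, `Ab_or_Nano` and `Affinity_Kd(M)` are taken from the list above, but their exact spelling in the real file, the `Ab`/`Nano` values, and the demo rows are assumptions; the demo rows are invented.

```python
import csv
import io


def rows_with_affinity(handle, kind=None):
    """Return (PDB_ID, Kd) pairs for rows with a parseable Kd value.

    `kind`, if given, must match the Ab_or_Nano column exactly.
    """
    selected = []
    for row in csv.DictReader(handle):
        kd_text = (row.get("Affinity_Kd(M)") or "").strip()
        try:
            kd = float(kd_text)
        except ValueError:
            continue  # empty or non-numeric affinity: skip the row
        if kind is not None and row.get("Ab_or_Nano") != kind:
            continue
        selected.append((row.get("PDB_ID", ""), kd))
    return selected


# Invented toy rows, only to show the expected shape of the input.
DEMO = """PDB_ID,Ab_or_Nano,Affinity_Kd(M)
1ABC,Ab,1.5e-9
2DEF,Nano,
3GHI,Nano,2.0e-8
"""

if __name__ == "__main__":
    print(rows_with_affinity(io.StringIO(DEMO), kind="Nano"))
```

    For the real file, the same function could be called on `open("ANDD.csv", newline="")`, provided the header matches.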

    Technical Validation:

    The quality of ANDD has been ensured through extensive validation:

    1. Manual Curation: A rigorous manual review process was conducted to check for accuracy and consistency between sequence, structure, and affinity data across randomly selected entries.
    2. Affinity Validation with AlphaBind: The experimental Kd values were validated by comparing them against enrichment ratios predicted by the AlphaBind model, showing a significant correlation (Pearson’s r = 0.750).
    3. Cross-Mapping Validation: The internal consistency between Kd and ∆Gbinding values within the dataset was confirmed, showing a perfect correlation (Pearson’s r = 1.000) as per thermodynamic principles.
    4. Proof-of-Concept Application: The dataset's utility was demonstrated by fine-tuning the Diffab generative model on a subset of ANDD. The fine-tuned model showed significant improvements in generating nanobodies with better predicted binding affinity, structural diversity, and developability metrics.
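
    The perfect correlation in item 3 is what one would expect if the ∆G values were derived from Kd through the standard relation ∆G = RT ln(Kd). The sketch below assumes exactly that, with a fixed temperature of 298.15 K, Kd in mol/L, and ∆G in kJ/mol; the dataset description does not state the temperature or the scale on which the correlation was computed, so both are assumptions here.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy in kJ/mol from Kd in mol/L (assumed temperature)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ * temperature_k * math.log(kd_molar)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    kds = [1e-6, 1e-8, 1e-10]  # illustrative values only
    dgs = [delta_g_from_kd(k) for k in kds]
    # On a log scale the relation is exactly linear at fixed temperature.
    print(pearson([math.log10(k) for k in kds], dgs))
```

    At a single fixed temperature ∆G is a linear function of log Kd, so a correlation of 1 on that scale checks the conversion rather than providing independent evidence.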

    Potential Uses:

    ANDD is designed to accelerate research in computational biology and drug discovery, including:

    • Training and benchmarking deep learning models for de novo antibody/nanobody sequence and structure generation.
    • Developing and validating predictive models for antibody-antigen binding affinity.
    • Studying structure-function relationships in antibody-antigen interactions.
    • Facilitating the design of optimized therapeutic antibodies and nanobodies with improved specificity and efficacy.

    Access and License:

    The ANDD dataset is publicly available for download under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Users are free to share and adapt the material for any purpose, even commercially, provided appropriate credit is given to the original authors and this data descriptor is cited.

  6. Therapeutic Structural Antibody Database

    • dknet.org
    • neuinfo.org
    • +2more
    Updated Apr 20, 2022
    + more versions
    Cite
    (2022). Therapeutic Structural Antibody Database [Dataset]. http://identifiers.org/RRID:SCR_022093
    Dataset updated
    Apr 20, 2022
    Description

    Tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organisation, and identifies any corresponding structures in the Structural Antibody Database with exact or near-exact variable domain sequence matches. Synchronized with SAbDab to update weekly, reflecting new Protein Data Bank entries and the availability of new sequence data published by WHO.

  7. Data from: Kabat Database of Sequences of Proteins of Immunological Interest...

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Jun 27, 2024
    + more versions
    Cite
    (2024). Kabat Database of Sequences of Proteins of Immunological Interest [Dataset]. http://identifiers.org/RRID:SCR_006465
    Dataset updated
    Jun 27, 2024
    Description

    The Kabat Database determines the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. The Kabat Database searching and analysis tools package is an ASP.NET web-based portal containing lookup tools, sequence matching tools, alignment tools, length distribution tools, positional correlation tools and much more. The searching and analysis tools are custom made for the aligned data sets contained in both the SQL Server and ASCII text flat file formats. The searching and analysis tools may be run on a single PC workstation or in a distributed environment. The analysis tools are written in ASP.NET and C# and are available in Visual Studio .NET 2003/2005/2008 formats. The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences at that time. Bence Jones proteins, mostly from human, were aligned, using the now-known Kabat numbering system, and a quantitative measure, variability, was calculated for every position. Three peaks, at positions 24-34, 50-56 and 89-97, were identified and proposed to form the complementarity determining regions (CDR) of light chains. Subsequently, antibody heavy chain amino acid sequences were also aligned using a different numbering system, since the locations of their CDRs (31-35B, 50-65 and 95-102) are different from those of the light chains. 
CDRL1 starts right after the first invariant Cys 23 of light chains, while CDRH1 is eight amino acid residues away from the first invariant Cys 22 of heavy chains. During the past 30 years, the Kabat database has grown to include nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules and other proteins of immunological interest. It has been used extensively by immunologists to derive useful structural and functional information from the primary sequences of these proteins.
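
    The description does not give the formula for the variability measure. A minimal sketch, assuming the commonly cited Wu-Kabat definition (number of different amino acids at a position divided by the frequency of the most common one), could look like this; the gap characters and function names are illustrative choices.

```python
from collections import Counter

GAP_CHARS = "-."  # characters treated as alignment gaps (assumption)


def variability(column):
    """Wu-Kabat variability for one alignment column.

    variability = (number of distinct residues)
                  / (frequency of the most common residue)
    """
    residues = [r for r in column if r not in GAP_CHARS]
    if not residues:
        return 0.0
    counts = Counter(residues)
    top_frequency = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / top_frequency


def variability_profile(aligned):
    """Per-position variability for equal-length aligned sequences."""
    if len({len(seq) for seq in aligned}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [variability(col) for col in zip(*aligned)]
```

    An invariant position scores 1.0, and peaks in the resulting profile are the kind of signal that was used to propose the CDR positions quoted above.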

  8. Serum Antibody Repertoire Profiling Using In Silico Antigen Screen

    • plos.figshare.com
    doc
    Updated Jun 1, 2023
    Cite
    Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov (2023). Serum Antibody Repertoire Profiling Using In Silico Antigen Screen [Dataset]. http://doi.org/10.1371/journal.pone.0067181
    Available download formats: doc
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Serum antibodies are a valuable source of information on the health state of an organism. The profiles of serum antibody reactivity can be generated by high-throughput sequencing of peptide-coding DNA from combinatorial random peptide phage display libraries selected for binding to serum antibodies. Here we demonstrate that the targets of immune response, which are recognized by serum antibodies directed against sequential epitopes, can be identified using the serum antibody repertoire profiles generated by high-throughput sequencing. We developed an algorithm to filter the results of the protein database BLAST search for selected peptides to distinguish real antigens recognized by serum antibodies from irrelevant proteins retrieved randomly. When we used this algorithm to analyze serum antibodies from mice immunized with a human protein, we were able to identify the protein used for immunizations among the top candidate antigens. When we analyzed a human serum sample from a metastatic melanoma patient, the recombinant protein corresponding to the top candidate from the list generated using the algorithm was recognized by antibodies from metastatic melanoma serum on a western blot, thus confirming that the method can identify autoantigens recognized by serum antibodies. We also demonstrated that our unbiased method of looking at the repertoire of serum antibodies reveals quantitative information on the epitope composition of the targets of immune response. A method for deciphering information contained in the serum antibody repertoire profiles may help to identify autoantibodies that can be used for diagnosing and monitoring autoimmune diseases or malignancies.

  9. Abysis Database

    • neuinfo.org
    • rrid.site
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Abysis Database [Dataset]. http://identifiers.org/RRID:SCR_000756
    Dataset updated
    Jan 29, 2022
    Description

    A database of antibody structure containing sequences from Kabat, IMGT and the Protein Data Bank (PDB), as well as structure data from the PDB. It provides search of the sequence data on various criteria and display of results in different formats. For data from the PDB, sequence searches can be combined with structural constraints. For example, one can ask for all the antibodies with a 10-residue Kabat CDR-L1 with a serine at H23 and an arginine within 10 Å of H36. The site also has software for structure analysis and other information on antibody structure.

  10. Data from: Antibody Sequence Determinants of Viral Antigen Specificity

    • data.niaid.nih.gov
    Updated Nov 22, 2024
    Cite
    Abu-Shmais AA; Vukovich MJ; Wasdin PT; Suresh YP; Marinov TM; Rush SA; Gillespie RA; Sankhala RS; Choe M; Joyce MG; Kanekiyo M; McLellan JS; Georgiev IS (2024). Antibody Sequence Determinants of Viral Antigen Specificity [Dataset]. https://data.niaid.nih.gov/resources?id=gse250159
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Vanderbilt University Medical Center
    Authors
    Abu-Shmais AA; Vukovich MJ; Wasdin PT; Suresh YP; Marinov TM; Rush SA; Gillespie RA; Sankhala RS; Choe M; Joyce MG; Kanekiyo M; McLellan JS; Georgiev IS
    Description

    Throughout life, humans experience repeated exposure to viral antigens through infection and vaccination, resulting in the generation of diverse, and largely unique, antigen specific antibody repertoires. A paramount feature of antibodies that enables their critical contributions in counteracting recurrent and novel pathogens, and consequently fostering their utility as valuable targets for therapeutic and vaccine development, is the exquisite specificity displayed against their target antigens. Yet, there is still limited understanding of the determinants of antibody-antigen specificity, particularly as a function of antibody sequence. In recent years, experimental characterization of antibody repertoires has led to novel insights into fundamental properties of antibody sequences, but has been largely decoupled from at-scale antigen specificity analysis. Here, using the LIBRA-seq technology, we generated a large dataset mapping antibody sequence to antigen specificity for thousands of B cells, by screening the repertoires of a set of healthy individuals against twenty viral antigens representing diverse pathogens of biomedical significance. Analysis uncovered virus specific patterns in variable gene usage, gene pairing, somatic hypermutation, as well as the presence of convergent antiviral signatures across multiple individuals, including the presence of public antibody clonotypes. Notably, our results showed that, for B cell receptors originating from different individuals but leveraging an identical combination of heavy and light chain variable genes, there is a specific CDRH3/CDRL3 identity threshold that defines whether these B cells may share the same antigen specificity. This finding provides a quantifiable measure of the relationship between antibody sequence and antigen specificity and further defines experimentally grounded criteria for defining public antibody clonality. 
Understanding the fundamental rules of antibody-antigen interactions can lead to transformative new approaches for the development of antibody therapeutics and vaccines against current and emerging viruses. Antigen specific B cells were sorted from human PBMCs using the standard LIBRA-seq pipeline (Shiakolas & Setliff et al., Cell 2019). Cells were distributed into microfluidic droplets containing unique cell-specific oligonucleotides using the 10X Genomics Chromium Controller. cDNA libraries from single cells were further amplified with BCR enrichment primers using the 10X Genomics VDJ protocol.
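
    The identity-threshold idea in this description can be sketched as follows. The abstract does not state the threshold value or how the heavy and light chain identities are combined, so the threshold is left as a required parameter and the rule that both CDR3s must pass is an assumption. The `Bcr` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bcr:
    """Hypothetical paired B cell receptor record."""
    vh_gene: str
    vl_gene: str
    cdrh3: str
    cdrl3: str


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ or are empty."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def may_share_specificity(x: Bcr, y: Bcr, threshold: float) -> bool:
    """True if both receptors use the same V gene pair and both CDR3
    identities reach the threshold (combination rule is an assumption)."""
    if (x.vh_gene, x.vl_gene) != (y.vh_gene, y.vl_gene):
        return False
    return (identity(x.cdrh3, y.cdrh3) >= threshold
            and identity(x.cdrl3, y.cdrl3) >= threshold)
```

    Requiring equal CDR3 lengths avoids the need for an alignment step; a real analysis might instead align sequences of different lengths.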

  11. Data from: Inverse folding for antibody sequence design using deep learning

    • zenodo.org
    bin, csv
    Updated Oct 31, 2023
    Cite
    Frédéric A. Dreyer; Daniel Cutting; Constantin Schneider; Henry Kenlay; Charlotte M. Deane (2023). Inverse folding for antibody sequence design using deep learning [Dataset]. http://doi.org/10.5281/zenodo.8164693
    Available download formats: bin, csv
    Dataset updated
    Oct 31, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Frédéric A. Dreyer; Daniel Cutting; Constantin Schneider; Henry Kenlay; Charlotte M. Deane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Model weights of the AbMPNN model (arXiv:2310.19513) presented at the 2023 ICML Workshop on Computational Biology, and csv files with the split between train, test and validation across the SAbDab and ImmuneBuilder datasets.

    This model is based on ProteinMPNN and can be run using the corresponding code: https://github.com/dauparas/ProteinMPNN.

  12. DataSheet_1_Complete variable domain sequences of monoclonal antibody light...

    • frontiersin.figshare.com
    txt
    Updated Jun 19, 2023
    Cite
    Allison Nau; Yun Shen; Vaishali Sanchorawala; Tatiana Prokaeva; Gareth J. Morgan (2023). DataSheet_1_Complete variable domain sequences of monoclonal antibody light chains identified from untargeted RNA sequencing data.fasta [Dataset]. http://doi.org/10.3389/fimmu.2023.1167235.s001
    Available download formats: txt
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Frontiers
    Authors
    Allison Nau; Yun Shen; Vaishali Sanchorawala; Tatiana Prokaeva; Gareth J. Morgan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Monoclonal antibody light chain proteins secreted by clonal plasma cells cause tissue damage due to amyloid deposition and other mechanisms. The unique protein sequence associated with each case contributes to the diversity of clinical features observed in patients. Extensive work has characterized many light chains associated with multiple myeloma, light chain amyloidosis and other disorders, which we have collected in the publicly accessible database, AL-Base. However, light chain sequence diversity makes it difficult to determine the contribution of specific amino acid changes to pathology. Sequences of light chains associated with multiple myeloma provide a useful comparison to study mechanisms of light chain aggregation, but relatively few monoclonal sequences have been determined. Therefore, we sought to identify complete light chain sequences from existing high throughput sequencing data.

    Methods: We developed a computational approach using the MiXCR suite of tools to extract complete rearranged IGVL-IGJL sequences from untargeted RNA sequencing data. This method was applied to whole-transcriptome RNA sequencing data from 766 newly diagnosed patients in the Multiple Myeloma Research Foundation CoMMpass study.

    Results: Monoclonal IGVL-IGJL sequences were defined as those where >50% of assigned IGK or IGL reads from each sample mapped to a unique sequence. Clonal light chain sequences were identified in 705/766 samples from the CoMMpass study. Of these, 685 sequences covered the complete IGVL-IGJL region. The identity of the assigned sequences is consistent with their associated clinical data and with partial sequences previously determined from the same cohort of samples. Sequences have been deposited in AL-Base.

    Discussion: Our method allows routine identification of clonal antibody sequences from RNA sequencing data collected for gene expression studies. The sequences identified represent, to our knowledge, the largest collection of multiple myeloma-associated light chains reported to date. This work substantially increases the number of monoclonal light chains known to be associated with non-amyloid plasma cell disorders and will facilitate studies of light chain pathology.
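
    The >50% rule for calling a monoclonal sequence can be written down in a few lines. This is a sketch of the stated criterion only, not of the authors' MiXCR-based pipeline; the input format (a mapping from sequence to assigned read count for one sample) and the function name are assumptions.

```python
def call_monoclonal(read_counts, min_fraction=0.5):
    """Return the dominant light chain sequence of a sample, or None.

    read_counts: mapping of IGVL-IGJL sequence -> number of assigned
    IGK/IGL reads in one sample (hypothetical input format).
    A sequence is called monoclonal if it accounts for MORE than
    `min_fraction` of all assigned reads.
    """
    total = sum(read_counts.values())
    if total == 0:
        return None
    top_seq, top_reads = max(read_counts.items(), key=lambda kv: kv[1])
    return top_seq if top_reads / total > min_fraction else None
```

    Because the comparison is strict, a sample whose top sequence holds exactly half of the reads is not called clonal.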

  13. More templates are available for all structural regions in the new database....

    • plos.figshare.com
    xls
    Updated Jun 11, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray (2023). More templates are available for all structural regions in the new database. [Dataset]. http://doi.org/10.1371/journal.pone.0234282.t002
    Available download formats: xls
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    More templates are available for all structural regions in the new database.

  14. Data from: Template-Assisted De Novo Sequencing of SARS-CoV‑2 and Influenza...

    • datasetcatalog.nlm.nih.gov
    • acs.figshare.com
    Updated Jun 2, 2022
    Cite
    Sautto, Giuseppe A.; Ippolito, Gregory C.; Chandrasekaran, Hamssika; Bensussan, Alena; Ross, Ted M.; Person, Maria D.; Gadush, Michelle V. (2022). Template-Assisted De Novo Sequencing of SARS-CoV‑2 and Influenza Monoclonal Antibodies by Mass Spectrometry [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000435936
    Dataset updated
    Jun 2, 2022
    Authors
    Sautto, Giuseppe A.; Ippolito, Gregory C.; Chandrasekaran, Hamssika; Bensussan, Alena; Ross, Ted M.; Person, Maria D.; Gadush, Michelle V.
    Description

    In this study, we used multiple enzyme digestions, coupled with higher-energy collisional dissociation (HCD) and electron-transfer/higher-energy collision dissociation (EThcD) fragmentation to develop a mass-spectrometric (MS) method for determining the complete protein sequence of monoclonal antibodies (mAbs). The method was refined on an mAb of a known sequence, a SARS-CoV-1 antireceptor binding domain (RBD) spike monoclonal antibody. The data were searched using Supernovo to generate a complete template-assisted de novo sequence for this and two SARS-CoV-2 mAbs of known sequences resulting in correct sequences for the variable regions and correct distinction of Ile and Leu residues. We then used the method on a set of 25 antihemagglutinin (HA) influenza antibodies of unknown sequences and determined high confidence sequences for >99% of the complementarity determining regions (CDRs). The heavy-chain and light-chain genes were cloned and transfected into cells for recombinant expression followed by affinity purification. The recombinant mAbs displayed binding curves matching the original mAbs with specificity to the HA influenza antigen. Our findings indicate that this methodology results in almost complete antibody sequence coverage with high confidence results for CDR regions on diverse mAb sequences.

  15. AB-SR (AntiBody Sequence Reconstructor) software: datasets for complete...

    • entrepot.recherche.data.gouv.fr
    text/markdown, xz
    Updated Jul 28, 2023
    Cite
    Nicolas MAILLET; Bertrand SAUNIER (2023). AB-SR (AntiBody Sequence Reconstructor) software: datasets for complete benchmarking [Dataset]. http://doi.org/10.57745/4RNESM
    Available download formats: xz (151168720), xz (123771256), xz (167698664), text/markdown (13509), xz (826456268), xz (177197812)
    Dataset updated
    Jul 28, 2023
    Dataset provided by
    Recherche Data Gouv
    Authors
    Nicolas MAILLET; Bertrand SAUNIER
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57745/4RNESM

    Description

    Files, folders, tabular data and some raw data used in the publication: AB-SR reconstructs polyclonal antibody Fv domains after bottom-up proteomic de-novo sequencing (N. Maillet & B. Saunier). The AB-SR software reconstructs the sequences of most pairs of heavy and light chain variable regions from (in silico) pools containing up to 500 immunoglobulins in just a few minutes. For each Figure, the data before and after running the AB-SR software are available (see README.md for detailed explanations). The data presented here are used to benchmark AB-SR. More precisely, in each experiment, IgGs from public databases are digested in silico using the RPG software. The resulting peptides are then fed to AB-SR, which reconstructs most of the initial IgGs.
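
    To make the in silico digestion step concrete, here is a minimal sketch using the simple textbook trypsin rule (cleave after K or R unless the next residue is P). It is not the RPG software and does not reproduce its enzyme definitions; the enzyme choice and the function name are assumptions for illustration.

```python
def digest_trypsin(sequence: str) -> list[str]:
    """Split a protein sequence with a simplified trypsin rule.

    Cleaves C-terminal to K or R, except when followed by P.
    No missed cleavages are modelled.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing peptide without K/R
    return peptides
```

    Concatenating the returned peptides in order gives back the input sequence, which is a convenient sanity check before feeding peptides to a reconstruction tool.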

  16. Monoclonal Antibodies

    • kaggle.com
    zip
    Updated Aug 22, 2020
    Cite
    Ryan Vandersmith (2020). Monoclonal Antibodies [Dataset]. https://www.kaggle.com/rvanasa/monoclonal-antibodies
    Available download formats: zip (51198335 bytes)
    Dataset updated
    Aug 22, 2020
    Authors
    Ryan Vandersmith
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    This dataset combines monoclonal antibody (mAb) information from a variety of sources into a more concise and convenient format.

    Here is a quick introduction to monoclonal antibodies in context with the COVID-19 pandemic.

    Sources

    • RCSB PDB - 3D protein models
    • SAbDab - pairs of antigens and antibodies from RCSB
    • Thera-SAbDab - therapeutic monoclonal antibodies
    • CoV-AbDab - COVID-19 related antibodies
    • ANARCI - CDR predictions
    • DSSP - secondary structure and solubility predictions

  17. antibody chain 1 (cds) Sequence Data

    • biocomplete.it
    text/x-fasta
    Updated Oct 27, 2025
    Cite
    (2025). antibody chain 1 (cds) Sequence Data [Dataset]. https://biocomplete.it/sequences/10803/sequence
    Available download formats: text/x-fasta
    Dataset updated
    Oct 27, 2025
    Measurement technique
    DNA sequencing
    Description

    DNA sequence and relationships for antibody chain 1 (cds)

  18. Table_2_RAPID: A Rep-Seq Dataset Analysis Platform With an Integrated...

    • datasetcatalog.nlm.nih.gov
    Updated Aug 13, 2021
    Cite
    Yu, Xueqing; Xu, Qingxian; Tang, Haipei; Zeng, Huikun; Chen, Yuan; Lan, Chunhong; Zhang, Yanxia; Wang, Minhui; Guan, Junjie; Zhu, Yan; Ma, Cuiyu; Wei, Lai; Zhang, Zhenhai; Xie, Wenxi; Chen, Sen; Yang, Wei; Zhang, Yan; Wang, Qilong; Zhang, Yanfang; Wang, Chengrui; Guo, Shixin; Chen, Tianjian; Yang, Xiujia; Ren, Jian (2021). Table_2_RAPID: A Rep-Seq Dataset Analysis Platform With an Integrated Antibody Database.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000844044
    Dataset updated
    Aug 13, 2021
    Authors
    Yu, Xueqing; Xu, Qingxian; Tang, Haipei; Zeng, Huikun; Chen, Yuan; Lan, Chunhong; Zhang, Yanxia; Wang, Minhui; Guan, Junjie; Zhu, Yan; Ma, Cuiyu; Wei, Lai; Zhang, Zhenhai; Xie, Wenxi; Chen, Sen; Yang, Wei; Zhang, Yan; Wang, Qilong; Zhang, Yanfang; Wang, Chengrui; Guo, Shixin; Chen, Tianjian; Yang, Xiujia; Ren, Jian
    Description

    The antibody repertoire is a critical component of the adaptive immune system and is believed to reflect an individual’s immune history and current immune status. Delineating the antibody repertoire has advanced our understanding of humoral immunity, facilitated antibody discovery, and shown great potential for improving the diagnosis and treatment of disease. However, no tool to date has effectively integrated big Rep-seq data and prior knowledge of functional antibodies to elucidate the remarkably diverse antibody repertoire. We developed a Rep-seq dataset Analysis Platform with an Integrated antibody Database (RAPID; https://rapid.zzhlab.org/), a free and web-based tool that allows researchers to process and analyse Rep-seq datasets. RAPID consolidates 521 WHO-recognized therapeutic antibodies, 88,059 antigen- or disease-specific antibodies, and 306 million clones extracted from 2,449 human IGH Rep-seq datasets generated from individuals with 29 different health conditions. RAPID also integrates a standardized Rep-seq dataset analysis pipeline to enable users to upload and analyse their datasets. In the process, users can also select a set of existing repertoires for comparison. RAPID automatically annotates clones based on integrated therapeutic and known antibodies, and users can easily query antibodies or repertoires based on sequence or optional keywords. With its powerful analysis functions and rich set of antibody and antibody repertoire information, RAPID will benefit researchers in adaptive immune studies.

  19. Neutralization Data and Aligned ENV Sequences for Predicting Antibody...

    • data.mendeley.com
    Updated Jun 20, 2016
    Cite
    Catalin Buiu (2016). Neutralization Data and Aligned ENV Sequences for Predicting Antibody Affinities using Artificial Neural Networks [Dataset]. http://doi.org/10.17632/bhcjwtwjh4.1
    Dataset updated
    Jun 20, 2016
    Authors
    Catalin Buiu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample file with neutralization data (IC50) for different antibodies and viral strains, adapted from J. Huang, G. Ofek, L. Laub, M. K. Louder, N. A. Doria-Rose, N. S. Longo, H. Imamichi, R. T. Bailer, B. Chakrabarti, S. K. Sharma, S. M. Alam, T. Wang, Y. Yang, B. Zhang, S. A. Migueles, R. Wyatt, B. F. Haynes, P. D. Kwong, J. R. Mascola, and M. Connors, “Broad and potent neutralization of HIV-1 by a gp41-specific human antibody,” Nature, vol. 491, no. 7424, pp. 406–12, Nov. 2012.

    Aligned ENV sequences downloaded from the HIV Sequence Database (www.hiv.lanl.gov/content/sequence/HIV/mainpage.html). There are 4907 sequences and the alignment length is 1369.
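
    A quick way to confirm the stated alignment dimensions after download is sketched below. It assumes the alignment is distributed as a FASTA file, which the description does not state. The helper names `read_fasta` and `alignment_shape` are hypothetical, and the toy records merely stand in for the real file.

```python
from io import StringIO


def read_fasta(handle) -> dict[str, str]:
    """Parse FASTA text into a mapping of record name to sequence."""
    records: dict[str, list[str]] = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_shape(records: dict[str, str]) -> tuple[int, int]:
    """Return (number of sequences, alignment length); fail on ragged rows."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError(f"sequences have differing lengths: {sorted(lengths)}")
    return len(records), (lengths.pop() if lengths else 0)


if __name__ == "__main__":
    # Toy alignment; for the real file, pass open("<path to alignment>") instead.
    toy = StringIO(">strain_a\nAC-G\n>strain_b\nA-TG\n")
    print(alignment_shape(read_fasta(toy)))
```

    For the dataset described above, the expected result would be 4907 sequences with an alignment length of 1369, provided the file matches the description.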

  20. Data and associated Matlab code for "Predicting Monoclonal Antibody Binding...

    • zenodo.org
    bin
    Updated Jun 23, 2024
    Cite
    Neal Woodbury (2024). Data and associated Matlab code for "Predicting Monoclonal Antibody Binding Sequences from a Sparse Sampling of All Possible Sequences" [Dataset]. http://doi.org/10.5281/zenodo.12510566
    Available download formats: bin
    Dataset updated
    Jun 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Neal Woodbury
    Description

    These are the data and Matlab code associated with "Predicting Monoclonal Antibody Binding Sequences from a Sparse Sampling of All Possible Sequences". Note that there are Arizona State University patents associated with the algorithms involved in this work.
