100+ datasets found
  1. d

    Structural Antibody Database

    • dknet.org
    • neuinfo.org
    • +2 more
    Updated Apr 20, 2022
    Cite
    (2022). Structural Antibody Database [Dataset]. http://identifiers.org/RRID:SCR_022096
    Dataset updated
    Apr 20, 2022
    Description

    A database containing all antibody structures available in the PDB, annotated and presented in a consistent fashion. Each structure is annotated with a number of properties, including experimental details, antibody nomenclature (e.g. heavy-light pairings), curated affinity data, and sequence annotations. You can use the database to inspect individual structures, create and download datasets for analysis, search for structures with sequences similar to your query, and monitor the known structural repertoire of antibodies.

  2. r

    Abysis Database

    • rrid.site
    • dknet.org
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). Abysis Database [Dataset]. http://identifiers.org/RRID:SCR_000756
    Dataset updated
    Jan 29, 2022
    Description

    A database of antibody structure containing sequences from Kabat, IMGT and the Protein Data Bank (PDB), as well as structure data from the PDB. It supports searching the sequence data on various criteria and displaying the results in different formats. For data from the PDB, sequence searches can be combined with structural constraints. For example, one can ask for all the antibodies with a 10-residue Kabat CDR-L1 with a serine at H23 and an arginine within 10 Å of H36. The site also offers software for structure analysis and other information on antibody structure.

  3. R

    Raw data from external antibody databases and scripts to homogenize and...

    • entrepot.recherche.data.gouv.fr
    application/x-gzip +1
    Updated Feb 4, 2025
    Cite
    Nicolas MAILLET; Simon MALESYS (2025). Raw data from external antibody databases and scripts to homogenize and standardize them used to build AntiBody Sequence Database (for reproducibility) [Dataset]. http://doi.org/10.57745/DDLHWU
    Available download formats: application/x-gzip (620431), application/x-gzip (163643), application/x-gzip (6833391387), text/markdown (12475), application/x-gzip (80726198), application/x-gzip (65497009)
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    Nicolas MAILLET; Simon MALESYS
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/3.1/customlicense?persistentId=doi:10.57745/DDLHWU

    Description

    Reproducibility data for the AntiBody Sequence Database (ABSD) article. This dataset contains the raw data (antibody sequences) extracted on June 20, 2024, from various databases, together with the scripts needed to reproduce our results. External databases used: ABDB, AbPDB, CoV-AbDab, Genbank, IMGT, PDB, SACS, SAbDab, TheraSAbDab, UniProt, KABAT. Script usage: each external database has a corresponding script that formats all antibody sequences extracted from it. A final script merges all extracted antibody sequences while removing redundancy and standardizing and cleaning the data.

  4. f

    MSDatasets and Antibody Databases.zip

    • figshare.com
    txt
    Updated Jul 24, 2025
    Cite
    Kunyi Li (2025). MSDatasets and Antibody Databases.zip [Dataset]. http://doi.org/10.6084/m9.figshare.29634140.v2
    Available download formats: txt
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    figshare
    Authors
    Kunyi Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MS datasets: (1) 100 simulated spectra; (2) Waters spectra; (3) HB100 spectra. Antibody databases: (1) a database of 800 antibody sequences; (2) a database of 500 decoy sequences from mouse.

  5. n

    Therapeutic Structural Antibody Database

    • neuinfo.org
    Updated Sep 10, 2024
    Cite
    (2024). Therapeutic Structural Antibody Database [Dataset]. http://identifiers.org/RRID:SCR_022093
    Dataset updated
    Sep 10, 2024
    Description

    Tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organisation (WHO), and identifies any corresponding structures in the Structural Antibody Database (SAbDab) with exact or near-exact variable domain sequence matches. Synchronized with SAbDab to update weekly, reflecting new Protein Data Bank entries and new sequence data published by the WHO.

  6. f

    Serum Antibody Repertoire Profiling Using In Silico Antigen Screen

    • plos.figshare.com
    doc
    Updated Jun 1, 2023
    Cite
    Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov (2023). Serum Antibody Repertoire Profiling Using In Silico Antigen Screen [Dataset]. http://doi.org/10.1371/journal.pone.0067181
    Available download formats: doc
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Xinyue Liu; Qiang Hu; Song Liu; Luke J. Tallo; Lisa Sadzewicz; Cassandra A. Schettine; Mikhail Nikiforov; Elena N. Klyushnenkova; Yurij Ionov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Serum antibodies are a valuable source of information on the health state of an organism. Profiles of serum antibody reactivity can be generated by high-throughput sequencing of peptide-coding DNA from combinatorial random peptide phage display libraries selected for binding to serum antibodies. Here we demonstrate that the targets of the immune response, which are recognized by serum antibodies directed against sequential epitopes, can be identified using the serum antibody repertoire profiles generated by high-throughput sequencing. We developed an algorithm to filter the results of a protein database BLAST search for selected peptides to distinguish real antigens recognized by serum antibodies from irrelevant proteins retrieved randomly. When we used this algorithm to analyze serum antibodies from mice immunized with a human protein, we were able to identify the protein used for immunization among the top candidate antigens. When we analyzed a human serum sample from a metastatic melanoma patient, the recombinant protein corresponding to the top candidate from the list generated by the algorithm was recognized by antibodies from metastatic melanoma serum on a western blot, confirming that the method can identify autoantigens recognized by serum antibodies. We also demonstrated that our unbiased method of looking at the repertoire of serum antibodies reveals quantitative information on the epitope composition of the targets of the immune response. A method for deciphering the information contained in serum antibody repertoire profiles may help to identify autoantibodies that can be used for diagnosing and monitoring autoimmune diseases or malignancies.

  7. r

    Data from: Kabat Database of Sequences of Proteins of Immunological Interest...

    • rrid.site
    • dknet.org
    • +2 more
    Updated Sep 16, 2025
    Cite
    (2025). Kabat Database of Sequences of Proteins of Immunological Interest [Dataset]. http://identifiers.org/RRID:SCR_006465
    Dataset updated
    Sep 16, 2025
    Description

    The Kabat Database determines the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. The Kabat Database searching and analysis tools package is an ASP.NET web-based portal containing lookup tools, sequence matching tools, alignment tools, length distribution tools, positional correlation tools and much more. The searching and analysis tools are custom made for the aligned data sets contained in both the SQL Server and ASCII text flat file formats. The searching and analysis tools may be run on a single PC workstation or in a distributed environment. The analysis tools are written in ASP.NET and C# and are available in Visual Studio .NET 2003/2005/2008 formats. The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences at that time. Bence Jones proteins, mostly from human, were aligned, using the now-known Kabat numbering system, and a quantitative measure, variability, was calculated for every position. Three peaks, at positions 24-34, 50-56 and 89-97, were identified and proposed to form the complementarity determining regions (CDR) of light chains. Subsequently, antibody heavy chain amino acid sequences were also aligned using a different numbering system, since the locations of their CDRs (31-35B, 50-65 and 95-102) are different from those of the light chains. 
CDRL1 starts right after the first invariant Cys 23 of light chains, while CDRH1 is eight amino acid residues away from the first invariant Cys 22 of heavy chains. During the past 30 years, the Kabat database has grown to include nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules and other proteins of immunological interest. It has been used extensively by immunologists to derive useful structural and functional information from the primary sequences of these proteins.
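    The per-position variability mentioned above can be illustrated with a short sketch. It assumes the commonly cited Wu-Kabat definition (number of distinct residues at a position divided by the frequency of the most common residue), which the description itself does not spell out. The function names and the toy alignment are hypothetical and are not taken from the Kabat tools.

```python
from collections import Counter


def wu_kabat_variability(column):
    """Variability of one alignment column.

    Assumed definition (Wu-Kabat): number of distinct residues divided by
    the relative frequency of the most common residue.
    """
    residues = [aa for aa in column if aa != "-"]  # ignore alignment gaps
    if not residues:
        return 0.0  # a column made only of gaps carries no information
    counts = Counter(residues)
    most_common_freq = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / most_common_freq


def variability_profile(aligned_sequences):
    """Variability for every position of equal-length aligned sequences."""
    return [wu_kabat_variability(col) for col in zip(*aligned_sequences)]


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["CAS", "CAT", "CGS", "CAY"]
profile = variability_profile(toy_alignment)
```

    In a real alignment, positions with high values in such a profile would correspond to the peaks that the description associates with the CDRs.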

  8. D

    Antibody Sequencing Services Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Antibody Sequencing Services Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/antibody-sequencing-services-market
    Available download formats: csv, pptx, pdf
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Antibody Sequencing Services Market Outlook




    The global antibody sequencing services market size was valued at approximately USD 450 million in 2023 and is projected to reach around USD 950 million by 2032, growing at a compound annual growth rate (CAGR) of 8.5% during the forecast period. The primary growth factor driving this market is the increasing demand for therapeutic and diagnostic antibodies, which are crucial in developing targeted therapies for various diseases, including cancer and autoimmune disorders.
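    The figures quoted above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes 2023 is the base year and that growth compounds annually over the nine years to 2032; the report does not state its exact compounding convention, and the function name is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Values in USD million, as quoted in the report summary.
START_2023 = 450.0
END_2032 = 950.0
YEARS = 2032 - 2023  # assumption: nine annual compounding periods

implied_rate = cagr(START_2023, END_2032, YEARS)
projected_at_stated_rate = START_2023 * (1 + 0.085) ** YEARS

print(f"implied CAGR: {implied_rate:.2%}")
print(f"2032 value at 8.5%: {projected_at_stated_rate:.0f} million USD")
```

    Under these assumptions the implied rate comes out at roughly 8.7%, and compounding at the stated 8.5% gives a little under USD 940 million, so the quoted numbers are consistent once the report's "approximately" and "around" are taken into account.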




    One of the significant growth factors for the antibody sequencing services market is the rising prevalence of chronic diseases and the subsequent demand for advanced therapeutic options. With an aging population and the global burden of diseases like cancer, autoimmune disorders, and infectious diseases on the rise, there is an increased need for effective treatments. Antibody-based therapies have proven to be highly effective in targeting specific disease markers, leading to their growing adoption. This, in turn, is driving the demand for antibody sequencing services, which are essential for the development and optimization of these therapies.




    Another critical factor contributing to the market's growth is the advancements in sequencing technologies. Over the past decade, there have been significant improvements in sequencing methods, leading to faster, more accurate, and cost-effective sequencing solutions. Techniques such as next-generation sequencing (NGS) and single-cell sequencing have revolutionized the field, allowing for high-throughput and detailed analysis of antibody sequences. These technological advancements have made it easier for researchers and companies to obtain high-quality sequencing data, thereby boosting the adoption of antibody sequencing services.




    Furthermore, the increasing focus on personalized medicine is also fueling the growth of the antibody sequencing services market. Personalized medicine aims to tailor treatments based on an individual's unique genetic makeup, leading to more effective and targeted therapies. Antibody sequencing plays a crucial role in this approach by enabling the identification of specific antibodies that can be used to design personalized treatments. As the healthcare industry continues to shift towards personalized medicine, the demand for antibody sequencing services is expected to grow significantly.



    In addition to sequencing, the Antibody Labeling Service is gaining traction as an essential component in the development of therapeutic and diagnostic antibodies. This service involves the attachment of specific labels to antibodies, which can be used in various applications such as imaging, flow cytometry, and immunoassays. The ability to label antibodies accurately and efficiently enhances their utility in research and clinical settings, allowing for more precise targeting and detection of disease markers. As the demand for personalized medicine and targeted therapies continues to grow, the need for reliable antibody labeling services is expected to increase, complementing the advancements in antibody sequencing technologies.




    From a regional perspective, North America holds the largest share in the antibody sequencing services market, followed by Europe and the Asia Pacific. The dominance of North America can be attributed to the presence of a well-established healthcare infrastructure, significant investments in research and development, and the presence of major pharmaceutical and biotechnology companies. Additionally, the region has a high prevalence of chronic diseases, further driving the demand for advanced therapeutic options. The Asia Pacific region is expected to witness the highest growth during the forecast period, owing to the increasing healthcare expenditure, growing focus on research activities, and the rising prevalence of chronic diseases in countries like China and India.



    Service Type Analysis




    The antibody sequencing services market can be segmented by service type into De Novo Sequencing, Database Sequencing, and Hybrid Sequencing. Among these, De Novo Sequencing accounts for a significant market share due to its capability to provide a complete sequence of antibodies without any prior knowledge of the sequence. This service is particularly crucial for discovering novel antibodies and understanding their structure and f

  9. N

    Data from: Antibody Sequence Determinants of Viral Antigen Specificity

    • data.niaid.nih.gov
    Updated Nov 22, 2024
    Cite
    Abu-Shmais AA; Vukovich MJ; Wasdin PT; Suresh YP; Marinov TM; Rush SA; Gillespie RA; Sankhala RS; Choe M; Joyce MG; Kanekiyo M; McLellan JS; Georgiev IS (2024). Antibody Sequence Determinants of Viral Antigen Specificity [Dataset]. https://data.niaid.nih.gov/resources?id=gse250159
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Vanderbilt University Medical Center
    Authors
    Abu-Shmais AA; Vukovich MJ; Wasdin PT; Suresh YP; Marinov TM; Rush SA; Gillespie RA; Sankhala RS; Choe M; Joyce MG; Kanekiyo M; McLellan JS; Georgiev IS
    Description

    Throughout life, humans experience repeated exposure to viral antigens through infection and vaccination, resulting in the generation of diverse, and largely unique, antigen specific antibody repertoires. A paramount feature of antibodies that enables their critical contributions in counteracting recurrent and novel pathogens, and consequently fostering their utility as valuable targets for therapeutic and vaccine development, is the exquisite specificity displayed against their target antigens. Yet, there is still limited understanding of the determinants of antibody-antigen specificity, particularly as a function of antibody sequence. In recent years, experimental characterization of antibody repertoires has led to novel insights into fundamental properties of antibody sequences, but has been largely decoupled from at-scale antigen specificity analysis. Here, using the LIBRA-seq technology, we generated a large dataset mapping antibody sequence to antigen specificity for thousands of B cells, by screening the repertoires of a set of healthy individuals against twenty viral antigens representing diverse pathogens of biomedical significance. Analysis uncovered virus specific patterns in variable gene usage, gene pairing, somatic hypermutation, as well as the presence of convergent antiviral signatures across multiple individuals, including the presence of public antibody clonotypes. Notably, our results showed that, for B cell receptors originating from different individuals but leveraging an identical combination of heavy and light chain variable genes, there is a specific CDRH3/CDRL3 identity threshold that defines whether these B cells may share the same antigen specificity. This finding provides a quantifiable measure of the relationship between antibody sequence and antigen specificity and further defines experimentally grounded criteria for defining public antibody clonality. 
Understanding the fundamental rules of antibody-antigen interactions can lead to transformative new approaches for the development of antibody therapeutics and vaccines against current and emerging viruses. Antigen specific B cells were sorted from human PBMCs using the standard LIBRA-seq pipeline (Shiakolas & Setliff et al., Cell 2019). Cells were distributed into microfluidic droplets containing unique cell-specific oligonucleotides using the 10X Genomics Chromium Controller. cDNA libraries from single cells were further amplified with BCR enrichment primers using the 10X Genomics VDJ protocol.
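    The kind of comparison described above (same heavy and light chain variable genes, then a CDRH3/CDRL3 identity cutoff) might be sketched as follows. The description does not give the threshold value, nor does it say whether it applies to each CDR3 separately, so `min_identity`, the per-CDR comparison, the record field names, and the example sequences are all assumptions made for illustration.

```python
def identity(a, b):
    """Fraction of matching positions; assumed to be 0.0 for unequal lengths."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def may_share_specificity(ab1, ab2, min_identity):
    """True if both antibodies use the same V genes and both CDR3s pass the cutoff.

    `min_identity` is a hypothetical parameter; the study's value is not quoted here.
    """
    same_genes = (ab1["v_heavy"] == ab2["v_heavy"]
                  and ab1["v_light"] == ab2["v_light"])
    return (same_genes
            and identity(ab1["cdrh3"], ab2["cdrh3"]) >= min_identity
            and identity(ab1["cdrl3"], ab2["cdrl3"]) >= min_identity)


# Hypothetical records from two different donors.
donor_a = {"v_heavy": "IGHV1-69", "v_light": "IGKV3-20",
           "cdrh3": "ARDYW", "cdrl3": "QQSYT"}
donor_b = {"v_heavy": "IGHV1-69", "v_light": "IGKV3-20",
           "cdrh3": "ARDFW", "cdrl3": "QQSYT"}

print(may_share_specificity(donor_a, donor_b, min_identity=0.7))
```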

  10. f

    More templates are available for all structural regions in the new database....

    • plos.figshare.com
    xls
    Updated Jun 11, 2023
    Cite
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray (2023). More templates are available for all structural regions in the new database. [Dataset]. http://doi.org/10.1371/journal.pone.0234282.t002
    Available download formats: xls
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    More templates are available for all structural regions in the new database.

  11. Data from: Improving antibody language models with native pairing

    • zenodo.org
    application/gzip, zip
    Updated Jun 12, 2025
    Cite
    Sarah Burbach; Bryan Briney (2025). Improving antibody language models with native pairing [Dataset]. http://doi.org/10.5281/zenodo.12745725
    Available download formats: application/gzip, zip
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sarah Burbach; Bryan Briney
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Motivation. Existing large language models designed to predict antibody structure and function have been trained exclusively with unpaired antibody sequences. This is a substantial drawback, as each antibody represents a unique pairing of heavy and light chains that both contribute to antigen recognition. The cost of generating large datasets of natively paired antibody sequences is orders of magnitude higher than the cost of unpaired sequences, and the paucity of available paired antibody sequence datasets precludes training a state-of-the-art language model using only paired training data. Here, we sought to determine whether and to what extent natively paired training data improves model performance.

    Results. Using a unique and recently reported dataset of approximately 1.6 × 10^6 natively paired human antibody sequences, we trained two baseline antibody language model (BALM) variants: BALM-paired and BALM-unpaired. We quantify the superiority of BALM-paired over BALM-unpaired, and we show that BALM-paired's improved performance can be attributed at least in part to its ability to learn cross-chain features that span natively paired heavy and light chains. Additionally, we fine-tuned the general protein language model ESM-2 using these paired antibody sequences and report that the fine-tuned model, but not base ESM-2, demonstrates a similar understanding of cross-chain features.

    Files. The following files are included in this repository:

    • BALM-paired.tar.gz: Model weights for the BALM-paired model.
    • BALM-shuffled.tar.gz: Model weights for the BALM-shuffled model.
    • BALM-unpaired.tar.gz: Model weights for the BALM-unpaired model.
    • ESM2-650M_paired-fine-tuned.tar.gz: Model weights for the 650M-parameter ESM-2 model after fine-tuning with natively paired antibody sequences.
    • jaffe-paired-dataset_airr-annotation.tar.gz: All natively paired antibody sequences from the Jaffe dataset were annotated with abstar and subsequently filtered to remove duplicates or unproductive sequences. The annotated sequences are provided in an AIRR-compliant format.
    • test-dataset_annotated.tar.gz: Two csv files, both with sequences annotated in an AIRR-compliant format. lc-coherence_test-unique_annotated.csv contains all sequences from the test dataset and fig3-20kembeddings_annotated.csv contains the 20k sequences from the test used for the Figure 2 UMAP embeddings. For both datasets, the sequences can be paired together based on their pair_id.
    • train-test-eval_paired.tar.gz: Datasets used to train, test, and evaluate the BALM-paired model. Compressed folder containing three files: train.txt, test.txt, and eval.txt. Each file has one input sequence per line. This dataset was also used to fine-tune the 650M-parameter ESM-2 variant.
    • train-test-eval_shuffled.tar.gz: Datasets used to train, test, and evaluate the BALM-shuffled model. Compressed folder containing three csv files, with two columns for the heavy and light chains.
    • train-test-eval_unpaired.tar.gz: Datasets used to train, test, and evaluate the BALM-unpaired model. Compressed folder containing three files: train.txt, test.txt, and eval.txt. Each file has one input sequence per line.
    • classification-datasets.tar.gz: Three classification datasets used to train classification models in Figure 5. The datasets are: flu-0_cov-1.csv, hd-0_cov-1.csv, and hd-0_flu-1_cov-2.csv. CoV antibody sequences were obtained from CoV-AbDab, Flu antibody sequences were obtained from Wang et al., and healthy donor antibody sequences were obtained from Hurtado et al.

    Code: All code used for model training, testing, and figure generation is available under the MIT license on GitHub. An archived version of the GitHub repository (from the time of manuscript publication) is included here as code-archive.zip.
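    The file list above notes that sequences in the annotated test datasets can be paired through their pair_id. A minimal sketch of that pairing step follows. Only pair_id comes from the description; the `locus` and `sequence_aa` column names are taken from the AIRR Rearrangement schema and are assumed, not confirmed, to be present in these files. The inline CSV is made-up example data.

```python
import csv
import io
from collections import defaultdict


def pair_chains(rows, id_field="pair_id", locus_field="locus",
                seq_field="sequence_aa"):
    """Group rows by pair ID and return {pair_id: (heavy_seq, light_seq)}.

    Rows whose locus is IGH are treated as heavy chains; anything else
    (IGK or IGL) is treated as a light chain. Incomplete pairs are dropped.
    """
    grouped = defaultdict(dict)
    for row in rows:
        chain = "heavy" if row[locus_field] == "IGH" else "light"
        grouped[row[id_field]][chain] = row[seq_field]
    return {
        pair_id: (chains["heavy"], chains["light"])
        for pair_id, chains in grouped.items()
        if "heavy" in chains and "light" in chains
    }


# Hypothetical stand-in for one of the annotated csv files.
example_csv = """pair_id,locus,sequence_aa
p1,IGH,EVQLVESGG
p1,IGK,DIQMTQSPS
p2,IGH,QVQLQESGP
"""

pairs = pair_chains(csv.DictReader(io.StringIO(example_csv)))
print(pairs)
```

    To run this against a real file, the `io.StringIO` wrapper would be replaced with an opened csv file, after checking the actual column names in its header.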

  12. Structural region to sequence mapping for RosettaAntibody.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 11, 2023
    Cite
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray (2023). Structural region to sequence mapping for RosettaAntibody. [Dataset]. http://doi.org/10.1371/journal.pone.0234282.t001
    Available download formats: xls
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jeliazko R. Jeliazkov; Rahel Frick; Jing Zhou; Jeffrey J. Gray
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Structural region to sequence mapping for RosettaAntibody.

  13. f

    DataSheet_1_Complete variable domain sequences of monoclonal antibody light...

    • frontiersin.figshare.com
    txt
    Updated Jun 19, 2023
    Cite
    Allison Nau; Yun Shen; Vaishali Sanchorawala; Tatiana Prokaeva; Gareth J. Morgan (2023). DataSheet_1_Complete variable domain sequences of monoclonal antibody light chains identified from untargeted RNA sequencing data.fasta [Dataset]. http://doi.org/10.3389/fimmu.2023.1167235.s001
    Available download formats: txt
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Frontiers
    Authors
    Allison Nau; Yun Shen; Vaishali Sanchorawala; Tatiana Prokaeva; Gareth J. Morgan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Monoclonal antibody light chain proteins secreted by clonal plasma cells cause tissue damage due to amyloid deposition and other mechanisms. The unique protein sequence associated with each case contributes to the diversity of clinical features observed in patients. Extensive work has characterized many light chains associated with multiple myeloma, light chain amyloidosis and other disorders, which we have collected in the publicly accessible database, AL-Base. However, light chain sequence diversity makes it difficult to determine the contribution of specific amino acid changes to pathology. Sequences of light chains associated with multiple myeloma provide a useful comparison to study mechanisms of light chain aggregation, but relatively few monoclonal sequences have been determined. Therefore, we sought to identify complete light chain sequences from existing high throughput sequencing data.
    Methods: We developed a computational approach using the MiXCR suite of tools to extract complete rearranged IGVL-IGJL sequences from untargeted RNA sequencing data. This method was applied to whole-transcriptome RNA sequencing data from 766 newly diagnosed patients in the Multiple Myeloma Research Foundation CoMMpass study.
    Results: Monoclonal IGVL-IGJL sequences were defined as those where >50% of assigned IGK or IGL reads from each sample mapped to a unique sequence. Clonal light chain sequences were identified in 705/766 samples from the CoMMpass study. Of these, 685 sequences covered the complete IGVL-IGJL region. The identity of the assigned sequences is consistent with their associated clinical data and with partial sequences previously determined from the same cohort of samples. Sequences have been deposited in AL-Base.
    Discussion: Our method allows routine identification of clonal antibody sequences from RNA sequencing data collected for gene expression studies. The sequences identified represent, to our knowledge, the largest collection of multiple myeloma-associated light chains reported to date. This work substantially increases the number of monoclonal light chains known to be associated with non-amyloid plasma cell disorders and will facilitate studies of light chain pathology.
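    The monoclonality criterion stated above (more than 50% of a sample's assigned IGK or IGL reads mapping to one sequence) can be expressed as a small check. This is a sketch only: the function name, the input layout (a mapping from sequence to read count) and the example counts are hypothetical, and it does not reproduce the MiXCR-based pipeline itself.

```python
def dominant_clone(read_counts, threshold=0.5):
    """Return the sequence holding more than `threshold` of all reads, else None.

    `read_counts` maps each candidate light chain sequence (or its ID) to the
    number of assigned IGK/IGL reads in one sample.
    """
    total = sum(read_counts.values())
    if total == 0:
        return None  # no assigned reads, so no clonal call is possible
    sequence, count = max(read_counts.items(), key=lambda item: item[1])
    return sequence if count / total > threshold else None


# Hypothetical per-sample read counts.
clonal_sample = {"seqA": 80, "seqB": 15, "seqC": 5}
polyclonal_sample = {"seqA": 50, "seqB": 50}

print(dominant_clone(clonal_sample))
print(dominant_clone(polyclonal_sample))
```

    The comparison is strict, so a sequence holding exactly half of the reads is not called monoclonal, which matches the ">50%" wording.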

  14. f

    Data from: Template-Assisted De Novo Sequencing of SARS-CoV‑2 and Influenza...

    • acs.figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Michelle V. Gadush; Giuseppe A. Sautto; Hamssika Chandrasekaran; Alena Bensussan; Ted M. Ross; Gregory C. Ippolito; Maria D. Person (2023). Template-Assisted De Novo Sequencing of SARS-CoV‑2 and Influenza Monoclonal Antibodies by Mass Spectrometry [Dataset]. http://doi.org/10.1021/acs.jproteome.1c00913.s002
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Michelle V. Gadush; Giuseppe A. Sautto; Hamssika Chandrasekaran; Alena Bensussan; Ted M. Ross; Gregory C. Ippolito; Maria D. Person
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    In this study, we used multiple enzyme digestions, coupled with higher-energy collisional dissociation (HCD) and electron-transfer/higher-energy collision dissociation (EThcD) fragmentation to develop a mass-spectrometric (MS) method for determining the complete protein sequence of monoclonal antibodies (mAbs). The method was refined on an mAb of a known sequence, a SARS-CoV-1 antireceptor binding domain (RBD) spike monoclonal antibody. The data were searched using Supernovo to generate a complete template-assisted de novo sequence for this and two SARS-CoV-2 mAbs of known sequences resulting in correct sequences for the variable regions and correct distinction of Ile and Leu residues. We then used the method on a set of 25 antihemagglutinin (HA) influenza antibodies of unknown sequences and determined high confidence sequences for 99% of the complementarity determining regions (CDRs). The heavy-chain and light-chain genes were cloned and transfected into cells for recombinant expression followed by affinity purification. The recombinant mAbs displayed binding curves matching the original mAbs with specificity to the HA influenza antigen. Our findings indicate that this methodology results in almost complete antibody sequence coverage with high confidence results for CDR regions on diverse mAb sequences.

  15. e

    AB-SR (AntiBody Sequence Reconstructor) software: datasets for complete...

    • b2find.eudat.eu
    Updated May 8, 2024
    Cite
    (2024). AB-SR (AntiBody Sequence Reconstructor) software: datasets for complete benchmarking - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/195b3749-c382-59a6-b1fc-76c3c16e8bdf
    Explore at:
    Dataset updated
    May 8, 2024
    Description

    Files, folders, tabular data and some raw data used in the publication "AB-SR reconstructs polyclonal antibody Fv domains after bottom-up proteomic de-novo sequencing" (N. Maillet & B. Saunier). The AB-SR software reconstructs the sequences of most pairs of heavy and light chain variable regions from (in silico) pools containing up to 500 immunoglobulins in just a few minutes. For each figure, the data before and after running AB-SR are available (see README.md for detailed explanations). The data presented here are used to benchmark AB-SR. More precisely, in each experiment, IgGs taken from public databases are digested in silico using the RPG software. The resulting peptides are then fed to AB-SR, which reconstructs most of the initial IgGs.
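
    The in silico digestion step described above can be illustrated with a minimal sketch. It is not the RPG implementation used for this dataset: it assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), and the function name `tryptic_digest` is hypothetical.

```python
import re


def tryptic_digest(sequence):
    """Split a protein sequence into peptides with a simplified trypsin rule.

    Assumption: cleavage occurs after K or R, except when the next residue
    is P. Real digestion tools model more exceptions and missed cleavages.
    """
    # Zero-width split points: preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # A cleavage site at the very end of the sequence yields an empty piece.
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    for peptide in tryptic_digest("AKRPGKTR"):
        print(peptide)
```

    Peptides produced this way are the kind of input a reconstruction tool would receive. The sketch always cuts at every eligible site, so it does not model partial digestion.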

  16. Detailed data for the collected MMP-targeting antibody sequences.

    • plos.figshare.com
    xlsx
    Updated Jun 3, 2023
    Cite
    Xinmeng Li; James A. Van Deventer; Soha Hassoun (2023). Detailed data for the collected MMP-targeting antibody sequences. [Dataset]. http://doi.org/10.1371/journal.pcbi.1007779.s004
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xinmeng Li; James A. Van Deventer; Soha Hassoun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    (a) original sequences, (b) extracted features, (c) representative MMP-targeting set sequence IDs after BLASTCLUST and corresponding sequences in the original set, (d) representative MMP-IGHV-targeting set heavy chain sequence IDs after BLASTCLUST and corresponding sequences in the original set. (XLSX)

  17. Data and associated Matlab code for "Predicting Monoclonal Antibody Binding...

    • zenodo.org
    bin
    Updated Dec 5, 2023
    Cite
    Neal Woodbury (2023). Data and associated Matlab code for "Predicting Monoclonal Antibody Binding Sequences from a Sparse Sampling of All Possible Sequences" [Dataset]. http://doi.org/10.5281/zenodo.10262899
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 5, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Neal Woodbury
    Description

    This is the data and Matlab code associated with "Predicting Monoclonal Antibody Binding Sequences from a Sparse Sampling of All Possible Sequences". Note that there are Arizona State University patents associated with the algorithms involved in this work.

  18. Data from: De Novo Sequencing of Antibody Light Chain Proteoforms from...

    • figshare.com
    • acs.figshare.com
    xlsx
    Updated Jun 4, 2023
    + more versions
    Cite
    Mathieu Dupré; Magalie Duchateau; Rebecca Sternke-Hoffmann; Amelie Boquoi; Christian Malosse; Roland Fenk; Rainer Haas; Alexander K. Buell; Martial Rey; Julia Chamot-Rooke (2023). De Novo Sequencing of Antibody Light Chain Proteoforms from Patients with Multiple Myeloma [Dataset]. http://doi.org/10.1021/acs.analchem.1c01955.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    ACS Publications
    Authors
    Mathieu Dupré; Magalie Duchateau; Rebecca Sternke-Hoffmann; Amelie Boquoi; Christian Malosse; Roland Fenk; Rainer Haas; Alexander K. Buell; Martial Rey; Julia Chamot-Rooke
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    In multiple myeloma diseases, monoclonal immunoglobulin light chains (LCs) are abundantly produced, with, as a consequence in some cases, the formation of deposits affecting various organs, such as the kidney, while in other cases remaining soluble up to concentrations of several g·L⁻¹ in plasma. The exact factors crucial for the solubility of LCs are poorly understood, but it can be hypothesized that their amino acid sequence plays an important role. Determining the precise sequences of patient-derived LCs is therefore highly desirable. We establish here a novel de novo sequencing workflow for patient-derived LCs, based on the combination of bottom-up and top-down proteomics without database search. PEAKS is used for the de novo sequencing of peptides that are further assembled into full-length LC sequences using ALPS. Top-down proteomics provides the molecular masses of proteoforms and allows the exact determination of the amino acid sequence including all posttranslational modifications. This pipeline is then used for the complete de novo sequencing of LCs extracted from the urine of 10 patients with multiple myeloma. We show that for the bottom-up part, digestions with trypsin and Nepenthes digestive fluid are sufficient to produce overlapping peptides able to generate the best sequence candidates. Top-down proteomics is absolutely required to achieve 100% final sequence coverage and characterize clinical samples containing several LCs. Our work highlights an unexpected range of modifications.

  19. OASis peptide database

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Aug 7, 2021
    Cite
    David Prihoda; Jad Maamary; Andrew Waight; Veronica Juan; Laurence Fayadat-Dilman; Daniel Svozil; Danny A. Bitton (2021). OASis peptide database [Dataset]. http://doi.org/10.5281/zenodo.5164685
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Aug 7, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Prihoda; Jad Maamary; Andrew Waight; Veronica Juan; Laurence Fayadat-Dilman; Daniel Svozil; Danny A. Bitton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    OASis human 9-mer peptide database, generated from 118 million human antibody sequences from the Observed Antibody Space database.
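
    As an illustration of what a 9-mer peptide is, the sketch below enumerates every overlapping window of length 9 in a sequence. How the database itself was built is not described here, so the sliding-window approach and the function name `overlapping_kmers` are assumptions made for the example.

```python
def overlapping_kmers(sequence, k=9):
    """Return every overlapping window of length k, in order of position.

    Assumption: peptides are taken with a step of one residue. A sequence
    shorter than k yields an empty list.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Arbitrary example string, not a real antibody sequence.
    for peptide in overlapping_kmers("ABCDEFGHIJK"):
        print(peptide)
```

    A sequence of length n contributes n - 8 such peptides when n is at least 9.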

    Attached is a gzipped SQLite database containing two tables: "peptides" and "subjects".
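
    A gzipped SQLite file of this kind can be unpacked and inspected with the Python standard library. The sketch below is one possible approach; the file names in the usage section are hypothetical placeholders, and only the table names "peptides" and "subjects" come from the description above.

```python
import gzip
import shutil
import sqlite3
from pathlib import Path


def decompress_sqlite(gz_path, out_path):
    """Decompress a gzipped SQLite file to out_path and return that path."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return Path(out_path)


def list_tables(db_path):
    """Return the names of the tables in a SQLite database, sorted."""
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Hypothetical file names; substitute the archive actually downloaded.
    db = decompress_sqlite("oasis_peptides.db.gz", "oasis_peptides.db")
    print(list_tables(db))
```

    Listing the tables first is a cheap way to confirm the download before querying, since the column layout of each table is not documented in this listing.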


  20. Data from: An amphipol-stabilized multi-pass transmembrane protein as an...

    • search.dataone.org
    Updated Jun 24, 2025
    Cite
    Jiping Yang; Tao Liu; Xianqing Mai; Shengyan Kong; Jingjing He; Zhenhua Wang; Jie Shen; Xiaohua He; Yongmei Xing; Hongwu Qian; Pei Tong (2025). An amphipol-stabilized multi-pass transmembrane protein as an immunogen to generate mouse memory B cells against native VMAT2 [Dataset]. http://doi.org/10.5061/dryad.pvmcvdnwv
    Explore at:
    Dataset updated
    Jun 24, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Jiping Yang; Tao Liu; Xianqing Mai; Shengyan Kong; Jingjing He; Zhenhua Wang; Jie Shen; Xiaohua He; Yongmei Xing; Hongwu Qian; Pei Tong
    Description

    Complex integral membrane proteins with multiple transmembrane-spanning regions form large protein families, including G-protein-coupled receptors (GPCRs), ion channels, and transporters. Due to their essential roles in sensing and processing signals, they are the primary targets of more than half of approved drugs, mostly small molecules. Although antibodies have succeeded in the pharmaceutical market, antibodies that modulate complex integral membrane proteins with favorable properties are rare. Two significant limitations in such antibody discovery are the preparation of correctly folded antigens and the generation of antibodies against the natural conformation. Here, we developed an amphipol-trapped antigen as an immunogen and induced efficient mouse memory B cell responses. We generated antibodies unbiasedly by culturing single memory B cells and characterized their specificities. We implemented our strategy to generate high-affinity antibodies against the native confo...
    Eighty-eight antibody sequences (heavy chain and light chains) discovered in this study using Sanger sequencing are included in this dataset. They were isolated from mouse memory B cells.

    Antibody_heavy_and_light_chain.rtf

    The 88 antibody heavy and light chains discovered in this study, isolated from mouse memory B cells. The file is in FASTA format.

Structural Antibody Database

RRID:SCR_022096, Structural Antibody Database (RRID:SCR_022096), SAbDab
