16 datasets found
  1. JPL Small Body Database Search Engine - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). JPL Small Body Database Search Engine - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/jpl-small-body-database-search-engine
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Use this search engine to generate custom tables of orbital and/or physical parameters for all asteroids and comets (or a specified sub-set) in our small-body database. If this is your first time here, you may find it helpful to read our tutorial. Otherwise, simply follow the steps in each section: 'Search Constraints', 'Output Fields', and finally 'Format Options'. If you want details for a single object, use the Small Body Browser instead.

  2. DataForSEO Google Keyword Database, historical and current

    • datarade.ai
    .json, .csv
    Updated Mar 14, 2023
    Cite
    DataForSEO (2023). DataForSEO Google Keyword Database, historical and current [Dataset]. https://datarade.ai/data-products/dataforseo-google-keyword-database-historical-and-current-dataforseo
    Available download formats: .json, .csv
    Dataset updated
    Mar 14, 2023
    Dataset authored and provided by
    DataForSEO
    Area covered
    Bahrain, Cyprus, Canada, Bolivia (Plurinational State of), Spain, Bangladesh, El Salvador, Uruguay, Singapore, Turkey
    Description

    Field descriptions are available in the documentation: current Keyword database: https://docs.dataforseo.com/v3/databases/google/keywords/?bash; Historical Keyword database: https://docs.dataforseo.com/v3/databases/google/history/keywords/?bash. You don’t have to download fresh data dumps in JSON or CSV – we can deliver data straight to your storage or database. We send terabytes of data to dozens of customers every month using Amazon S3, Google Cloud Storage, Microsoft Azure Blob, Elasticsearch, and Google BigQuery. Let us know if you’d like to get your data to any other storage or database.

  3. Canada Internet Usage: Search Engine Market Share: Mobile: Haosou

    • ceicdata.com
    Cite
    CEICdata.com, Canada Internet Usage: Search Engine Market Share: Mobile: Haosou [Dataset]. https://www.ceicdata.com/en/canada/internet-usage-search-engine-market-share/internet-usage-search-engine-market-share-mobile-haosou
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 22, 2024 - Jan 10, 2026
    Area covered
    Canada
    Description

    Canada Internet Usage: Search Engine Market Share: Mobile: Haosou data was reported at 0.000 % on 10 Jan 2026. This records a decrease from the previous figure of 0.020 % for 09 Jan 2026. The data is updated daily, with a median of 0.010 % from Sep 2024 to 10 Jan 2026 across 16 observations. The data reached an all-time high of 0.020 % on 09 Jan 2026 and a record low of 0.000 % on 10 Jan 2026. The series remains in active status in CEIC and is reported by Statcounter Global Stats. The data is categorized under Global Database’s Canada – Table CA.SC.IU: Internet Usage: Search Engine Market Share.

  4. DataForSEO Backlink Summary Database

    • datarade.ai
    .json, .csv
    Updated Aug 17, 2023
    Cite
    DataForSEO (2023). DataForSEO Backlink Summary Database [Dataset]. https://datarade.ai/data-products/dataforseo-backlink-summary-database-dataforseo
    Available download formats: .json, .csv
    Dataset updated
    Aug 17, 2023
    Dataset authored and provided by
    DataForSEO
    Area covered
    Tonga, Ukraine, Canada, Equatorial Guinea, Vanuatu, Mexico, Solomon Islands, Nicaragua, Yemen, Sri Lanka
    Description

    The fields description may be found here: https://docs.dataforseo.com/v3/databases/backlink_summary/?bash

    DataForSEO Backlink Summary Database encompasses millions of domains enriched with backlink data and other related metrics. You will get a comprehensive overview of a domain’s backlink profile, including the number of inbound links, referring domains and referring pages, new & lost backlinks and referring domains, domain rank, backlink spam score, and more.

    This database is available in both JSON and CSV formats.

  5. Peptide Sequence Database

    • dknet.org
    Updated Jan 29, 2022
    Cite
    (2022). Peptide Sequence Database [Dataset]. http://identifiers.org/RRID:SCR_005764
    Dataset updated
    Jan 29, 2022
    Description

    The Peptide Sequence Database contains putative peptide sequences from human, mouse, rat, and zebrafish. Compressed to eliminate redundancy, these are about 40-fold smaller than a brute force enumeration. Current and old releases are available for download. Each species' peptide sequence database comprises peptide sequence data from relevant species-specific UniGene and IPI clusters, plus all sequences from their constituent EST, mRNA and protein sequence databases, namely RefSeq proteins and mRNAs, UniProt's SwissProt and TrEMBL, GenBank mRNA, ESTs, and high-throughput cDNAs, HInv-DB, VEGA, EMBL, IPI protein sequences, plus the enumeration of all combinations of UniProt sequence variants, Met loss PTM, and signal peptide cleavages. The README file contains some information about the non-amino-acid symbols O (digest site corresponding to a protein N- or C-terminus) and J (no digest sequence join) used in these peptide sequence databases and information about how to configure various search engines to use them. Some search engines handle (very) long sequences badly and in some cases must be patched to use these peptide sequence databases. All search engines supported by the PepArML meta-search engine can (or can be patched to) successfully search these peptide sequence databases.

  6. PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface

    • acs.figshare.com
    xlsx
    Updated Jun 3, 2023
    Cite
    Julian Uszkoreit; Alexandra Maerkens; Yasset Perez-Riverol; Helmut E. Meyer; Katrin Marcus; Christian Stephan; Oliver Kohlbacher; Martin Eisenacher (2023). PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface [Dataset]. http://doi.org/10.1021/acs.jproteome.5b00121.s003
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    Julian Uszkoreit; Alexandra Maerkens; Yasset Perez-Riverol; Helmut E. Meyer; Katrin Marcus; Christian Stephan; Oliver Kohlbacher; Martin Eisenacher
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Protein inference connects the peptide spectrum matches (PSMs) obtained from database search engines back to proteins, which are typically at the heart of most proteomics studies. Different search engines yield different PSMs and thus different protein lists. Analysis of results from one or multiple search engines is often hampered by different data exchange formats and lack of convenient and intuitive user interfaces. We present PIA, a flexible software suite for combining PSMs from different search engine runs and turning these into consistent results. PIA can be integrated into proteomics data analysis workflows in several ways. A user-friendly graphical user interface can be run either locally or (e.g., for larger core facilities) from a central server. For automated data processing, stand-alone tools are available. PIA implements several established protein inference algorithms and can combine results from different search engines seamlessly. On several benchmark data sets, we show that PIA can identify a larger number of proteins at the same protein FDR when compared to that using inference based on a single search engine. PIA supports the majority of established search engines and data in the mzIdentML standard format. It is implemented in Java and freely available at https://github.com/mpc-bioinformatics/pia.

  7. Data from: Irreproducibility in searches of scientific literature: a...

    • zenodo.org
    csv, txt
    Updated Sep 23, 2022
    Cite
    Gabor Pozsgai; Gabor Lövei; Liette Vasseur; Geoff Gurr; Péter Batáry; Janos Korponai; Nick Littlewood; Jian Liu; Arnold Móra; John Obrycki; Olivia Reynolds; Jenni Stockan; Heather VanVolkenburg; Jie Zhang; Wenwu Zhou; Minsheng You (2022). Irreproducibility in searches of scientific literature: a comparative analysis [Dataset]. http://doi.org/10.5061/dryad.djh9w0w17
    Available download formats: txt, csv
    Dataset updated
    Sep 23, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gabor Pozsgai; Gabor Lövei; Liette Vasseur; Geoff Gurr; Péter Batáry; Janos Korponai; Nick Littlewood; Jian Liu; Arnold Móra; John Obrycki; Olivia Reynolds; Jenni Stockan; Heather VanVolkenburg; Jie Zhang; Wenwu Zhou; Minsheng You
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    1. Repeatability is the cornerstone of science and it is particularly important for systematic reviews. However, little is known about how researchers' choice of database and search platform influences the repeatability of systematic reviews. Here, we aim to unveil how the computer environment and the location where the search was initiated from influence hit results.

    2. We present a comparative analysis of time-synchronized searches at different institutional locations in the world, and evaluate the consistency of hits obtained within each of the search terms using different search platforms.

    3. We revealed a large variation among search platforms and showed that PubMed and Scopus returned consistent results to identical search strings from different locations. Google Scholar and Web of Science's Core Collection varied substantially both in the number of returned hits and in the list of individual articles depending on the search location and computing environment. Inconsistency in Web of Science results has most likely emerged from the different licensing packages at different institutions.

    4. To maintain scientific integrity and consistency, especially in systematic reviews, action is needed from both the scientific community and scientific search platforms to increase search consistency. Researchers are encouraged to report the search location and the databases used for systematic reviews, and database providers should make search algorithms transparent and revise access rules to titles behind paywalls. Additional options for increasing the repeatability and transparency of systematic reviews are storing both search metadata and hit results in open repositories and using Application Programming Interfaces (APIs) to retrieve standardized, machine-readable search metadata.

  8. Ultimate Arabic News Dataset

    • data.mendeley.com
    Updated May 9, 2022
    Cite
    Ahmed Hashim Al-Dulaimi (2022). Ultimate Arabic News Dataset [Dataset]. http://doi.org/10.17632/jz56k5wxz7.1
    Dataset updated
    May 9, 2022
    Authors
    Ahmed Hashim Al-Dulaimi
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Ultimate Arabic News Dataset is a collection of single-label modern Arabic texts that are used in news websites and press articles.

    Arabic news data was collected by web scraping techniques from many famous news sites such as Al-Arabiya, Al-Youm Al-Sabea (Youm7), the news published on the Google search engine and other various sources.

    • The collected data consists of two primary files:

    UltimateArabic: A file containing more than 193,000 original Arabic news texts, without pre-processing. The texts contain words, numbers, and symbols that can be removed using pre-processing to increase accuracy when using the dataset in various Arabic natural language processing tasks such as text classification.

    UltimateArabicPrePros: A file containing the data from the first file after pre-processing, which reduced it to about 188,000 text documents. Stop words, non-Arabic words, symbols, and numbers have been removed, so the file is ready for direct use in various Arabic natural language processing tasks such as text classification.

    • We add two samples of data collected by web scraping techniques:

    Sample_Youm7_Politic: An example of news in the "Politic" category collected from the Youm7 website.

    Sample_alarabiya_Sport: An example of news in the "Sport" category collected from the Al-Arabiya website.

    • The data is divided into 10 different categories: Culture, Diverse, Economy, Sport, Politic, Art, Society, Technology, Medical and Religion.
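
The pre-processing described for UltimateArabicPrePros (dropping stop words, non-Arabic words, symbols, and numbers) could be approximated as in the sketch below. This is only an illustration and not the dataset author's pipeline: the function name `preprocess`, the five-word stop list, and the choice of the basic Arabic letter range are all assumptions.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the list actually used
# for the dataset is not given in the description.
STOP_WORDS = {"في", "من", "على", "إلى", "عن"}

# Runs of basic Arabic letters (U+0621 to U+064A). Latin words, digits,
# punctuation, and Arabic diacritics fall outside this range, so they
# act as separators and are dropped.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def preprocess(text, stop_words=STOP_WORDS):
    """Keep only Arabic-letter tokens that are not stop words."""
    tokens = ARABIC_WORD.findall(text)
    return " ".join(t for t in tokens if t not in stop_words)


if __name__ == "__main__":
    print(preprocess("ذهب الولد إلى المدرسة 2022 today!"))
```

Because diacritics are treated as separators here, vocalized text would be split inside words; a fuller pipeline would strip diacritics before tokenizing.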
  9. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data.

    Purpose

    As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:

    • establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
    • compare how much data is in institutional vs. domain-specific vs. federal platforms
    • determine which repositories are recommended by top journals that require or recommend the publication of supporting data
    • ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

    Approach

    The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data.
    To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

    Search methods

    We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects. We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied.
    General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories. Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo. Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources and Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

    Evaluation

    We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

    Results

    A summary of the major findings from our data review:

    • Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
    • There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
    • Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

    See included README file for descriptions of each individual data file in this dataset.

    Resources in this dataset:

    • Resource Title: Journals. File Name: Journals.csv
    • Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
    • Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
    • Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
    • Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
    • Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
    • Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
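
The evaluation criteria in the description above (share of the collection at the 1% and 5% levels, and hit counts above 100 and 500) can be expressed as a small calculation. The sketch below is one possible reading of those criteria; the function name `evaluate_term`, the field names, and the sample numbers are assumptions made for illustration and do not come from the dataset.

```python
def evaluate_term(term_hits, total_datasets):
    """Flag one search term against the thresholds named in the description.

    term_hits: number of results returned for the search term.
    total_datasets: total number of datasets in the repository.
    """
    if total_datasets <= 0:
        raise ValueError("total_datasets must be positive")
    share = term_hits / total_datasets
    return {
        "share_pct": round(share * 100, 2),
        "at_least_1_pct": share >= 0.01,
        "at_least_5_pct": share >= 0.05,
        "over_100_hits": term_hits > 100,
        "over_500_hits": term_hits > 500,
    }


if __name__ == "__main__":
    # Made-up numbers, purely illustrative.
    print(evaluate_term(600, 10_000))
```

A repository would match the "large AND significant" pattern mentioned in the results when both `over_500_hits` and `at_least_5_pct` are true for at least one term.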

  10. Data from: Combining High-Resolution and Exact Calibration To Boost...

    • figshare.com
    zip
    Updated Oct 18, 2018
    Cite
    Andy Lin; J. Jeffry Howbert; William Stafford Noble (2018). Combining High-Resolution and Exact Calibration To Boost Statistical Power: A Well-Calibrated Score Function for High-Resolution MS2 Data [Dataset]. http://doi.org/10.1021/acs.jproteome.8b00206.s007
    Available download formats: zip
    Dataset updated
    Oct 18, 2018
    Dataset provided by
    ACS Publications
    Authors
    Andy Lin; J. Jeffry Howbert; William Stafford Noble
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    To achieve accurate assignment of peptide sequences to observed fragmentation spectra, a shotgun proteomics database search tool must make good use of the very high-resolution information produced by state-of-the-art mass spectrometers. However, making use of this information while also ensuring that the search engine’s scores are well calibrated, that is, that the score assigned to one spectrum can be meaningfully compared to the score assigned to a different spectrum, has proven to be challenging. Here we describe a database search score function, the “residue evidence” (res-ev) score, that achieves both of these goals simultaneously. We also demonstrate how to combine calibrated res-ev scores with calibrated XCorr scores to produce a “combined p value” score function. We provide a benchmark consisting of four mass spectrometry data sets, which we use to compare the combined p value to the score functions used by several existing search engines. Our results suggest that the combined p value achieves state-of-the-art performance, generally outperforming MS Amanda and Morpheus and performing comparably to MS-GF+. The res-ev and combined p-value score functions are freely available as part of the Tide search engine in the Crux mass spectrometry toolkit (http://crux.ms).

  11. Data from: Compid: A New Software Tool To Integrate and Compare MS/MS Based...

    • figshare.com
    application/cdfv2
    Updated Feb 24, 2016
    Cite
    Niina Lietzén; Lari Natri; Olli S. Nevalainen; Jussi Salmi; Tuula A. Nyman (2016). Compid: A New Software Tool To Integrate and Compare MS/MS Based Protein Identification Results from Mascot and Paragon [Dataset]. http://doi.org/10.1021/pr100824w.s002
    Available download formats: application/cdfv2
    Dataset updated
    Feb 24, 2016
    Dataset provided by
    ACS Publications
    Authors
    Niina Lietzén; Lari Natri; Olli S. Nevalainen; Jussi Salmi; Tuula A. Nyman
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Tandem mass spectrometry-based proteomics experiments produce large amounts of raw data, and different database search engines are needed to reliably identify all the proteins from this data. Here, we present Compid, an easy-to-use software tool that can be used to integrate and compare protein identification results from two search engines, Mascot and Paragon. Additionally, Compid enables extraction of information from large Mascot result files that cannot be opened via the Web interface and calculation of general statistical information about peptide and protein identifications in a data set. To demonstrate the usefulness of this tool, we used Compid to compare Mascot and Paragon database search results for a mitochondrial proteome sample of human keratinocytes. The reports generated by Compid can be exported and opened as Excel documents or as text files using configurable delimiters, allowing the analysis and further processing of Compid output with a multitude of programs. Compid is freely available and can be downloaded from http://users.utu.fi/lanatr/compid. It is released under an open source license (GPL), enabling modification of the source code. Its modular architecture allows for creation of supplementary software components, e.g., to enable support for additional input formats and report categories.

  12. ExCAPE-DB

    • solr.ideaconsult.net
    csv
    Updated Nov 29, 2016
    Cite
    H2020 ExCAPE (2016). ExCAPE-DB [Dataset]. https://solr.ideaconsult.net/search/excape/
    Available download formats: csv
    Dataset updated
    Nov 29, 2016
    Dataset authored and provided by
    H2020 ExCAPE
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ExcapeDB: An integrated large scale dataset facilitating Big Data analysis in chemogenomics

  13. Data from: Database search engines and target database features impinge upon...

    • ebi.ac.uk
    Cite
    Michele Mishto, Database search engines and target database features impinge upon the identification of post-translationally cis-spliced peptides in HLA class I immunopeptidomes [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD031709
    Authors
    Michele Mishto
    Variables measured
    Proteomics
    Description

    Unconventional epitopes presented by HLA class I complexes are emerging targets for T cell targeted immunotherapies. Their identification by mass spectrometry required the development of novel methods to cope with the large number of theoretical candidates. Methods to identify post-translationally spliced peptides led to a broad range of outcomes. We here investigated the impact of three common database search engines – i.e. Mascot, Mascot+Percolator and PEAKS DB – as the final identification step, as well as the features of the target database, on the ability to correctly identify non-spliced and cis-spliced peptides. We used ground truth datasets measured by mass spectrometry to benchmark methods’ performance and extended the analysis to HLA class I immunopeptidomes. PEAKS DB showed better precision and recall of cis-spliced peptides and a larger number of identified peptides in HLA class I immunopeptidomes than the other search engine strategies. The better performance of PEAKS DB appears to result from better discrimination between target and decoy hits and hence a more robust FDR estimation, and seems independent of the peptide and spectrum features investigated here.

    Head of the research group Molecular Immunology at King’s College London and the Francis Crick Institute, London (UK). Email: michele.mishto@kcl.ac.uk

  14. Data from: Database Creator for Mass Analysis of Peptides and Proteins,...

    • figshare.com
    txt
    Updated Aug 1, 2023
    Cite
    Pandi Boomathi Pandeswari; Arnold Emerson Isaac; Varatharajan Sabareesh (2023). Database Creator for Mass Analysis of Peptides and Proteins, DC-MAPP: A Standalone Tool for Simplifying Manual Analysis of Mass Spectral Data to Identify Peptide/Protein Sequences [Dataset]. http://doi.org/10.1021/jasms.3c00030.s005
    Available download formats: txt
    Dataset updated
    Aug 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Pandi Boomathi Pandeswari; Arnold Emerson Isaac; Varatharajan Sabareesh
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Proteomic studies typically involve the use of different types of software for annotating experimental tandem mass spectrometric data (MS/MS) and thereby simplifying the process of peptide and protein identification. For such annotations, these software tools calculate the m/z values of the peptide/protein precursor and fragment ions, for which a database of protein sequences must be provided as an input file. The calculated m/z values are stored as another database, which the user usually cannot view. Database Creator for Mass Analysis of Peptides and Proteins (DC-MAPP) is a novel standalone software tool that can create custom databases for “viewing” the calculated m/z values of precursor and fragment ions, prior to the database search. It contains three modules. Peptide/protein sequences of the user’s choice can be entered as input to the first module for creating a custom database. In the second module, m/z values are entered as queries and searched within the custom database to identify protein/peptide sequences. The third module is suited for peptide mass fingerprinting, which can be used to analyze both ESI and MALDI mass spectral data. The feature of “viewing” the custom database can be helpful not only for better understanding the search engine processes, but also for designing multiple reaction monitoring (MRM) methods. Post-translational modifications and protein isoforms can also be analyzed. Since DC-MAPP relies on the protein/peptide “sequences” for creating custom databases, it may not be applicable to searches involving spectral libraries. Python was used for implementation, and the graphical user interface was built with Page/Tcl, making this tool more user-friendly. It is freely available at https://vit.ac.in/DC-MAPP/.
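
As a rough illustration of the precursor m/z calculation that the description refers to, the sketch below computes the monoisotopic m/z of an unmodified peptide from its sequence. It is not DC-MAPP's code: the function name `peptide_mz` is hypothetical, and the masses are the standard monoisotopic residue, water, and proton masses rounded to five or six decimals.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one proton


def peptide_mz(sequence, charge=1):
    """Monoisotopic m/z of an unmodified peptide at the given positive charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(MONO_RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(peptide_mz("PEPTIDE", z), 4))
```

Post-translational modifications, which the description says the tool can analyze, would add a fixed mass offset per modified residue; that is left out here to keep the sketch short.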

  15. MSFragger-Glycan-Database-from-MSFragger-Glyco-paper.xlsx

    • figshare.com
    xlsx
    Updated Jan 31, 2021
    Cite
    Wen-Feng Zeng (2021). MSFragger-Glycan-Database-from-MSFragger-Glyco-paper.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.13669853.v1
    Available download formats: xlsx
    Dataset updated
    Jan 31, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wen-Feng Zeng
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    N- and O-glycan databases of MSFragger-Glyco

  16. Comparison of Novel Decoy Database Designs for Optimizing Protein...

    • acs.figshare.com
    zip
    Updated Jun 3, 2023
    Cite
    Luca Bianco; Jennifer A. Mead; Conrad Bessant (2023). Comparison of Novel Decoy Database Designs for Optimizing Protein Identification Searches Using ABRF sPRG2006 Standard MS/MS Data Sets [Dataset]. http://doi.org/10.1021/pr800792z.s002
    Available download formats: zip
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    Luca Bianco; Jennifer A. Mead; Conrad Bessant
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Decoy database searches are used to filter out false positive protein identifications derived from search engines, but there is no consensus about which decoy is “the best”. We evaluate nine different decoy designs using public data sets from samples of known composition. Statistically significant performance differences were found, but no single decoy stood out among the best performers. Ultimately, we recommend peptide level reverse decoys searched independently from the target.
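
    To make the idea of a peptide-level reverse decoy concrete, here is a minimal Python sketch. It rests on assumptions not stated in the description: a simplified tryptic digest (cleave after K or R unless followed by P) and full reversal of each peptide. The function names are hypothetical and the authors' actual procedure may differ.

```python
def tryptic_digest(protein):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # trailing peptide without K/R
    return peptides


def peptide_level_reverse_decoys(protein):
    """Reverse each tryptic peptide individually (assumed decoy design)."""
    return [peptide[::-1] for peptide in tryptic_digest(protein)]


if __name__ == "__main__":
    target = "AAKGGRPLLKMM"  # illustrative sequence only
    print(tryptic_digest(target))
    print(peptide_level_reverse_decoys(target))
```

    Searching the decoy list separately from the target list, as the description recommends, would mean running the search engine once per list rather than on a concatenated database.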

