100+ datasets found
  1. Most popular database management systems worldwide 2024

    • statista.com
    Updated Jun 15, 2024
    Cite
    Statista (2024). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/
    Dataset updated
    Jun 15, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of *******; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.

    Database Management Systems

    As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL programming languages has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.

  2. Most popular relational database management systems worldwide 2024

    • statista.com
    Updated Jun 30, 2025
    Cite
    Statista (2025). Most popular relational database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/1131568/worldwide-popularity-ranking-relational-database-management-systems/
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular relational database management system (RDBMS) worldwide was Oracle, with a ranking score of *******. Oracle was also the most popular DBMS overall. MySQL and Microsoft SQL Server rounded out the top three.

  3. Best Database Types for Data Analytics by Industry

    • blog.devart.com
    html
    Updated Mar 27, 2025
    Cite
    Devart (2025). Best Database Types for Data Analytics by Industry [Dataset]. https://blog.devart.com/best-database-for-data-analytics.html
    Available download formats: html
    Dataset updated
    Mar 27, 2025
    Dataset authored and provided by
    Devart
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Industry, Database Type, Common Databases
    Description

    A guide to choosing the most suitable database types for data analytics across different industries, including examples of common databases.

  4. Most popular open source database management systems worldwide 2024

    • statista.com
    Updated Jul 1, 2025
    + more versions
    Cite
    Statista (2025). Most popular open source database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/1131602/worldwide-popularity-ranking-database-management-systems-open-source/
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular open-source database management system (DBMS) in the world was MySQL, with a ranking score of ****. Oracle was the most popular commercial DBMS at that time, with a ranking score of ****.

  5. NOAA/WDS Paleoclimatology - DoD2k Database of Databases for Common Era...

    • catalog.data.gov
    • data.noaa.gov
    Updated Jul 1, 2025
    + more versions
    Cite
    (Point of Contact); NOAA World Data Service for Paleoclimatology (Point of Contact) (2025). NOAA/WDS Paleoclimatology - DoD2k Database of Databases for Common Era Paleoclimatology [Dataset]. https://catalog.data.gov/dataset/noaa-wds-paleoclimatology-dod2k-database-of-databases-for-common-era-paleoclimatology1
    Dataset updated
    Jul 1, 2025
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    This archived Paleoclimatology Study is available from the NOAA National Centers for Environmental Information (NCEI), under the World Data Service (WDS) for Paleoclimatology. The associated NCEI study type is Other Collections. The data include parameters of reconstructions (air temperature) with a geographic location of Global. The time period coverage is from 1949 to -50 in calendar years before present (BP). See metadata information for parameter and study location details. Please cite this study when using the data.

  6. Databases_DBMS_2024

    • kaggle.com
    zip
    Updated Mar 4, 2024
    Cite
    Ravi Varma Odugu (2024). Databases_DBMS_2024 [Dataset]. https://www.kaggle.com/datasets/ravivarmaodugu/databases-dbms-2024
    Available download formats: zip (11683 bytes)
    Dataset updated
    Mar 4, 2024
    Authors
    Ravi Varma Odugu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Databases_DBMS_2024 dataset provides information about leading databases with a worldwide footprint.

    The dataset contains records of 417 databases and has information about the DBMS type, multi-model capability, vendor, and vendor country.

    The dataset also contains data on DBMS score and rankings, from DB-engines.com.

    Kagglers can use the dataset to explore:

    • Composition of DBMS Types and Multi-model capability
    • Distribution of DBMS vendors and Vendor countries, etc.
    • Trends and patterns in DBMS rankings and scores
  7. Common Database on Designated Areas (CDDA) attribute data (access database)...

    • ckan.publishing.service.gov.uk
    Updated May 26, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Common Database on Designated Areas (CDDA) attribute data (access database) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/common-database-on-designated-areas-cdda-attribute-data-access-database
    Dataset updated
    May 26, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    The CDDA is a data bank for officially designated protected areas such as nature reserves, protected landscapes, National Parks, etc. in Europe. The CDDA is run by the European Environment Agency (EEA). This access database includes only data for National Designations, the main ones being Sites of Special Scientific Interest, National Nature Reserves, Local Nature Reserves, National Parks, Areas of Outstanding Natural Beauty and a variety of Marine Protected Areas. The data are updated annually in March. Further details are available from the EEA's EIONET portal http://rod.eionet.europa.eu/obligations/32. This provides data for all member states in the EU and also describes the data model with descriptions of each table and attribute. The two most important tables in the data schema are the sites table (one row of data for each site) and the designations table (one row for each type of designation). These two tables can be joined on the field DESIG_ABBR. Other tables in the schema are included mainly for the EEA's internal purposes. The annual submission of the CDDA
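
    The join described in this entry can be sketched as follows. This is a minimal illustration, not the EEA's schema: only the table names (sites, designations) and the join field DESIG_ABBR come from the description, while the SITE_NAME and DESIG_NAME columns and all rows are made up, and an in-memory SQLite database stands in for the Access file.

```python
import sqlite3


def join_sites_to_designations(conn):
    """Return (site name, designation name) pairs joined on DESIG_ABBR."""
    query = """
        SELECT s.SITE_NAME, d.DESIG_NAME
        FROM sites AS s
        JOIN designations AS d ON s.DESIG_ABBR = d.DESIG_ABBR
        ORDER BY s.SITE_NAME
    """
    return conn.execute(query).fetchall()


# Tiny in-memory stand-in for the Access database (hypothetical columns and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE designations (DESIG_ABBR TEXT PRIMARY KEY, DESIG_NAME TEXT)")
conn.execute("CREATE TABLE sites (SITE_NAME TEXT, DESIG_ABBR TEXT)")
conn.executemany(
    "INSERT INTO designations VALUES (?, ?)",
    [("NP", "National Park"), ("LNR", "Local Nature Reserve")],
)
conn.executemany(
    "INSERT INTO sites VALUES (?, ?)",
    [("Example Site A", "NP"), ("Example Site B", "LNR"), ("Example Site C", "NP")],
)

rows = join_sites_to_designations(conn)
for site_name, designation_name in rows:
    print(site_name, "|", designation_name)
```

    Because the designations table holds one row per designation type, each site row picks up exactly one designation name in this sketch.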

  8. Most commonly used database technologies among developers worldwide 2023

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Most commonly used database technologies among developers worldwide 2023 [Dataset]. https://www.statista.com/statistics/794187/united-states-developer-survey-most-wanted-used-database-technologies/
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 8, 2023 - May 19, 2023
    Area covered
    Worldwide
    Description

    In 2023, over ** percent of surveyed software developers worldwide reported using PostgreSQL, the highest share of any database technology. Other popular database tools among developers included MySQL and SQLite.

  9. Quality of child healthcare in European countries: common measures across...

    • data.niaid.nih.gov
    • data.europa.eu
    Updated Jun 3, 2021
    Cite
    Rocco Ilaria; Tamburis Oscar; Pecoraro Fabrizio; Luzi Daniela; Corso Barbara; Minicuci Nadia (2021). Quality of child healthcare in European countries: common measures across international databases and national agencies [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4572598
    Dataset updated
    Jun 3, 2021
    Dataset provided by
    National Research Council
    Università Federico II, Napoli
    Authors
    Rocco Ilaria; Tamburis Oscar; Pecoraro Fabrizio; Luzi Daniela; Corso Barbara; Minicuci Nadia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Europe
    Description

    This spreadsheet provides the list of indicators related to the assessment of the quality of child healthcare collected from two types of sources: open-access international databases and national experts. It is associated with the paper 'Quality of child healthcare in European countries: common measures across international databases and national agencies'.

  10. Selection of databases commonly used in our workflows.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Miguel Vazquez; Victor de la Torre; Alfonso Valencia (2023). Selection of databases commonly used in our workflows. [Dataset]. http://doi.org/10.1371/journal.pcbi.1002824.t002
    Available download formats: xls
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Miguel Vazquez; Victor de la Torre; Alfonso Valencia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Selection of databases commonly used in our workflows.

  11. Common Database on Designated Areas in the UK - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated May 26, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Common Database on Designated Areas in the UK - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/common-database-on-designated-areas-in-the-uk
    Dataset updated
    May 26, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    A spatial dataset of the UK's National designations submitted to the Common Database on Designated Areas (CDDA) in March 2016. This is the most up-to-date copy of the dataset and previous submissions have been archived. The CDDA is a data bank for officially designated protected areas such as nature reserves, protected landscapes, National Parks, etc. in Europe. The CDDA is run by the European Environment Agency. This spatial dataset includes only data for National Designations, the main ones being Sites of Special Scientific Interest, National Nature Reserves, Local Nature Reserves, National Parks, Areas of Outstanding Natural Beauty and a variety of Marine Protected Areas.

  12. 365 Data Science Web site statistics

    • kaggle.com
    zip
    Updated Aug 9, 2024
    Cite
    yasser messahli (2024). 365 Data Science Web site statistics [Dataset]. https://www.kaggle.com/yassermessahli/365-data-science-web-site-statistics
    Available download formats: zip (3895191 bytes)
    Dataset updated
    Aug 9, 2024
    Authors
    yasser messahli
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    365 Data Science Database

    365 Data Science is a website that provides online courses and resources for learning data science, machine learning, and data analysis.

    It is common for websites that offer online courses to have databases to store information about their courses, students, and progress. It is also possible that they use databases for storing and organizing the data used in their courses and examples.

    If you're looking for specific information about the database used by 365 Data Science, I recommend reaching out to them directly through their Website or support channels.

  13. Data from: Barriers to engagement.

    • plos.figshare.com
    xls
    Updated Oct 10, 2023
    Cite
    Alexander P. Noar; Hannah E. Jeffery; Hariharan Subbiah Ponniah; Usman Jaffer (2023). Barriers to engagement. [Dataset]. http://doi.org/10.1371/journal.pone.0292343.t003
    Available download formats: xls
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Alexander P. Noar; Hannah E. Jeffery; Hariharan Subbiah Ponniah; Usman Jaffer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Communities of practice (CoPs) are defined as "groups of people who share a concern, a set of problems, or a passion about a topic, and who deepen their knowledge and expertise by interacting on an ongoing basis". They are an effective form of knowledge management that have been successfully used in the business sector and increasingly so in healthcare. In May 2023 the electronic databases MEDLINE and EMBASE were systematically searched for primary research studies on CoPs published between 1st January 1950 and 31st December 2022. PRISMA guidelines were followed. The following search terms were used: community/communities of practice AND (healthcare OR medicine OR patient/s). The database search picked up 2009 studies for screening. Of these, 50 papers met the inclusion criteria. The most common aim of CoPs was to directly improve a clinical outcome, with 19 studies aiming to achieve this. In terms of outcomes, qualitative outcomes were the most common measure used in 21 studies. Only 11 of the studies with a quantitative element had the appropriate statistical methodology to report significance. Of the 9 studies that showed a statistically significant effect, 5 showed improvements in hospital-based provision of services such as discharge planning or rehabilitation services. 2 of the studies showed improvements in primary-care, such as management of hepatitis C, and 2 studies showed improvements in direct clinical outcomes, such as central line infections. CoPs in healthcare are aimed at improving clinical outcomes and have been shown to be effective. There is still progress to be made and a need for further studies with more rigorous methodologies, such as RCTs, to provide further support of the causality of CoPs on outcomes.

  14. NoSQL Database as a Service Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). NoSQL Database as a Service Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/nosql-database-as-a-service-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    NoSQL Database as a Service Market Outlook



    According to our latest research, the global NoSQL Database as a Service (DBaaS) market size reached USD 5.8 billion in 2024 and is projected to grow at a robust CAGR of 18.7% during the forecast period. By 2033, the market is forecasted to reach a substantial USD 32.2 billion, reflecting the accelerating adoption of scalable, flexible, and cloud-native database solutions across industries. This impressive growth is primarily driven by the mounting demand for real-time data processing, the proliferation of unstructured and semi-structured data, and the increasing digital transformation initiatives among enterprises globally.
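
    The figures quoted above can be checked against the standard compound-growth relation, end = start * (1 + rate) ** periods. The sketch below is only an illustration and not the publisher's method: the function names are hypothetical, and the source does not state the base year or the number of compounding periods behind its 18.7% CAGR. Solving for the period count from the three quoted figures gives roughly ten, which is worth keeping in mind when reading the 2024 to 2033 span.

```python
import math


def project(start_value, rate, periods):
    """Compound start_value at a constant annual rate for the given number of periods."""
    return start_value * (1.0 + rate) ** periods


def implied_periods(start_value, end_value, rate):
    """Number of compounding periods that links start_value to end_value at this rate."""
    return math.log(end_value / start_value) / math.log(1.0 + rate)


start_2024 = 5.8   # USD billion, as quoted in the description
end_2033 = 32.2    # USD billion, as quoted in the description
cagr = 0.187       # 18.7% per year, as quoted in the description

periods = implied_periods(start_2024, end_2033, cagr)
nine_year_value = project(start_2024, cagr, 9)

print(f"implied compounding periods: {periods:.2f}")
print(f"value after nine periods: {nine_year_value:.1f} USD billion")
```

    Comparing the nine-period projection with the quoted 2033 figure shows how sensitive such forecasts are to the assumed base year.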




    The rapid expansion of digital business models and the explosion of big data have been pivotal in fueling the growth of the NoSQL Database as a Service market. Organizations are increasingly shifting away from traditional relational databases due to their limitations in managing large volumes of unstructured data, which is common in modern applications such as IoT, social media, and big data analytics. NoSQL DBaaS offers superior scalability, high availability, and flexible schema design, enabling enterprises to deliver high-performance applications without the constraints of legacy database architectures. The cloud-based delivery model further enhances accessibility and reduces the total cost of ownership, making it a compelling choice for businesses looking to innovate and scale rapidly.




    Another significant growth factor is the surge in demand for real-time analytics and personalized customer experiences. Modern enterprises, especially in sectors like retail, BFSI, and healthcare, require instant insights from diverse data sources to make informed decisions and enhance user engagement. NoSQL DBaaS platforms are designed to handle massive data inflows, support low-latency operations, and integrate seamlessly with advanced analytics and AI/ML tools. This ability to process and analyze data in real time is crucial for applications such as fraud detection, recommendation engines, and predictive maintenance, further driving the adoption of NoSQL Database as a Service solutions.




    The evolving regulatory landscape and growing concerns around data security and compliance are also influencing the NoSQL DBaaS market. Service providers are investing heavily in robust security frameworks, encryption, and compliance certifications to address the stringent requirements of industries such as healthcare and finance. This focus on security, combined with the agility and scalability of cloud-native NoSQL databases, is encouraging even risk-averse organizations to migrate their mission-critical workloads to DBaaS platforms. As a result, the market is witnessing increased traction from both large enterprises and small and medium-sized businesses seeking to balance innovation with compliance.




    Regionally, North America continues to dominate the NoSQL Database as a Service market, accounting for the largest revenue share in 2024. The region's leadership is attributed to the early adoption of cloud technologies, a mature digital ecosystem, and the presence of major DBaaS providers. However, Asia Pacific is emerging as the fastest-growing region, driven by rapid digitalization, the expansion of e-commerce, and government-led smart city initiatives. Europe is also witnessing steady growth, supported by stringent data privacy regulations and increasing investments in cloud infrastructure. The market dynamics in Latin America and the Middle East & Africa are evolving, with growing awareness and adoption of cloud-based database solutions across various sectors.



    The concept of Database-as-a-Service (DBaaS) is revolutionizing how organizations manage and access their data. By offering database functionalities as a cloud service, DBaaS eliminates the need for physical hardware and complex installations, allowing businesses to focus on their core operations. This service model provides flexibility and scalability, enabling companies to adjust their database resources according to demand without significant upfront investments. As more enterprises embrace digital transformation, the demand for DBaaS is expected to grow, driven by its ability to streamline operations and reduce IT overhead.




  15. Data from: The California current predator diet database: synthesis of...

    • doi.pangaea.de
    • search.dataone.org
    zip
    Updated Aug 7, 2014
    + more versions
    Cite
    Spencer A Wood; L E Koehn; Amber I Szoboszlai; Julie A Thayer; L E Sydeman; T E Essington (2014). The California current predator diet database: synthesis of common forage species [Dataset]. http://doi.org/10.1594/PANGAEA.834750
    Available download formats: zip
    Dataset updated
    Aug 7, 2014
    Dataset provided by
    PANGAEA
    Authors
    Spencer A Wood; L E Koehn; Amber I Szoboszlai; Julie A Thayer; L E Sydeman; T E Essington
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Area covered
    California
    Description

    Characterization of the diets of upper-trophic predators is a key ingredient in management including the development of ecosystem-based fishery management plans, conservation efforts for top predators, and ecological and economic modeling of predator prey interactions. The California Current Predator Diet Database (CCPDD) synthesizes data from published records of predator food habits over the past century. The database includes diet information for 100+ upper-trophic level predator species, based on over 200 published citations from the California Current region of the Pacific Ocean, ranging from Baja, Mexico to Vancouver Island, Canada. We include diet data for all predators that consume forage species: seabirds, cetaceans, pinnipeds, bony and cartilaginous fishes, and a predatory invertebrate; data represent seven discrete geographic regions within the CCS (Canada, WA, OR, CA-n, CA-c, CA-s, Mexico). The database is organized around predator-prey links that represent an occurrence of a predator eating a prey or group of prey items. Here we present synthesized data for the occurrence of 32 forage species (see Table 2 in the affiliated paper) in the diet of pelagic predators (currently submitted to Ecological Informatics). Future versions of the shared-data will include diet information for all prey items consumed, not just the forage species of interest.

  16. Data from: Domestic and International Common Language Database (DICL)

    • catalog.data.gov
    Updated Apr 6, 2021
    + more versions
    Cite
    Office of Economics (2021). Domestic and International Common Language Database (DICL) [Dataset]. https://catalog.data.gov/dataset/domestic-and-international-common-language-database-dicl
    Dataset updated
    Apr 6, 2021
    Dataset provided by
    Office of Economics
    Description

    The database contains index measures of linguistic similarity both domestically and internationally. The domestic measures capture linguistic similarities present among populations within a single country while the international indexes capture language similarities between two different countries. The indexes reflect three aspects of language: common official languages, common native languages, and linguistic proximity across languages.

  17. Facilitators of engagement.

    • plos.figshare.com
    xls
    Updated Oct 10, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexander P. Noar; Hannah E. Jeffery; Hariharan Subbiah Ponniah; Usman Jaffer (2023). Facilitators of engagement. [Dataset]. http://doi.org/10.1371/journal.pone.0292343.t004
    Available download formats: xls
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Alexander P. Noar; Hannah E. Jeffery; Hariharan Subbiah Ponniah; Usman Jaffer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Communities of practice (CoPs) are defined as "groups of people who share a concern, a set of problems, or a passion about a topic, and who deepen their knowledge and expertise by interacting on an ongoing basis". They are an effective form of knowledge management that have been successfully used in the business sector and increasingly so in healthcare. In May 2023 the electronic databases MEDLINE and EMBASE were systematically searched for primary research studies on CoPs published between 1st January 1950 and 31st December 2022. PRISMA guidelines were followed. The following search terms were used: community/communities of practice AND (healthcare OR medicine OR patient/s). The database search picked up 2009 studies for screening. Of these, 50 papers met the inclusion criteria. The most common aim of CoPs was to directly improve a clinical outcome, with 19 studies aiming to achieve this. In terms of outcomes, qualitative outcomes were the most common measure used in 21 studies. Only 11 of the studies with a quantitative element had the appropriate statistical methodology to report significance. Of the 9 studies that showed a statistically significant effect, 5 showed improvements in hospital-based provision of services such as discharge planning or rehabilitation services. 2 of the studies showed improvements in primary-care, such as management of hepatitis C, and 2 studies showed improvements in direct clinical outcomes, such as central line infections. CoPs in healthcare are aimed at improving clinical outcomes and have been shown to be effective. There is still progress to be made and a need for further studies with more rigorous methodologies, such as RCTs, to provide further support of the causality of CoPs on outcomes.

  18. Data for common institutional ownership and innovation

    • scidb.cn
    Updated Aug 8, 2025
    Cite
    Xinyu Liu; Wenjun Cai (2025). Data for common institutional ownership and innovation [Dataset]. http://doi.org/10.57760/sciencedb.28803
    Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Aug 8, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Xinyu Liu; Wenjun Cai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our sample consists of annual data from firms listed on the A-share markets of the Shanghai and Shenzhen Stock Exchanges in China, covering the period from 2003 to 2022. We gather the necessary data on listed firms from two databases: the Chinese Innovation Research Database (CIRD) for firms’ innovation and the China Stock Market & Accounting Research Database (CSMAR) for common ownership. CIRD not only includes patent data filed or granted to different entities, distinguishing between three types of patents (invention, utility model, and design), but also provides key information such as the nature of applications (independent or joint), classification numbers, and patent statistics. The CSMAR database is positioned as a research-oriented precision database, modeled on the standards of authoritative databases such as CRSP and COMPUSTAT, and is intended for research and quantitative investment analysis. We match the innovation data to the financial data for each firm, exclude financial listed companies and ST and *ST listed companies, and delete samples with missing data. To avoid extreme value interference, we winsorize all continuous variables at the 1% level. With these filters, our final sample comprises 48,956 firm-year observations for 4,957 firms.
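
    The winsorizing step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the description says only "at the 1% level", so clipping both tails at the 1st and 99th percentiles, the linear-interpolation percentile rule, and the function names are all assumptions made here.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, with q in [0, 1]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def winsorize(values, tail=0.01):
    """Clip values below the `tail` percentile and above the (1 - tail) percentile."""
    ordered = sorted(values)
    low = percentile(ordered, tail)
    high = percentile(ordered, 1.0 - tail)
    # Values are clipped, not dropped, so the sample size is unchanged.
    return [min(max(v, low), high) for v in values]


# Made-up data: 1..100 plus one extreme outlier.
sample = list(range(1, 101)) + [10_000]
clipped = winsorize(sample)
print(min(clipped), max(clipped), len(clipped))
```

    Unlike trimming, this keeps every observation and only pulls the extreme values in to the cut-off points, which is why the observation count is unaffected by this step.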

  19. Data from: A consensus compound/bioactivity dataset for data-driven drug...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated May 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Laura Isigkeit; Apirat Chaikuad; Daniel Merk (2022). A consensus compound/bioactivity dataset for data-driven drug design and chemogenomics [Dataset]. http://doi.org/10.5281/zenodo.6320761
    Available download formats: zip
    Dataset updated
    May 13, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Laura Isigkeit; Apirat Chaikuad; Daniel Merk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information

    The diverse publicly available compound/bioactivity databases constitute a key resource for data-driven applications in chemogenomics and drug design. Analysis of their coverage of compound entries and biological targets revealed considerable differences, however, suggesting the benefit of a consensus dataset. Therefore, we have combined and curated information from five esteemed databases (ChEMBL, PubChem, BindingDB, IUPHAR/BPS and Probes&Drugs) to assemble a consensus compound/bioactivity dataset comprising 1144803 compounds with 10915362 bioactivities on 5613 targets (including defined macromolecular targets as well as cell-lines and phenotypic readouts). It also provides simplified information on assay types underlying the bioactivity data and on bioactivity confidence by comparing data from different sources. We have unified the source databases, brought them into a common format and combined them, making them easier to use generically in applications such as chemogenomics and data-driven drug design.

    The consensus dataset provides increased target coverage and contains more molecules than the source databases, which is also evident from a larger number of scaffolds. These features render the consensus dataset a valuable tool for machine learning and other data-driven applications in (de novo) drug design and bioactivity prediction. The increased chemical and bioactivity coverage of the consensus dataset may improve the robustness of such models compared to the single source databases. In addition, semi-automated structure and bioactivity annotation checks with flags for divergent data from different sources may help with data selection and further accurate curation.

    Structure and content of the dataset

    Dataset structure

    ChEMBL ID | PubChem ID | IUPHAR ID | Target | Activity type | Assay type | Unit | Mean C (0) | ... | Mean PC (0) | ... | Mean B (0) | ... | Mean I (0) | ... | Mean PD (0) | ... | Activity check annotation | Ligand names | Canonical SMILES C | ... | Structure check | Source

    The dataset was created using the Konstanz Information Miner (KNIME) (https://www.knime.com/) and was exported as a CSV-file and a compressed CSV-file.

    Except for the canonical SMILES columns, all columns are filled with the datatype ‘string’. The datatype for the canonical SMILES columns is the smiles-format. We recommend the File Reader node for using the dataset in KNIME. With the help of this node the data types of the columns can be adjusted exactly. In addition, only this node can read the compressed format.

    Column content:

    • ChEMBL ID, PubChem ID, IUPHAR ID: chemical identifier of the databases
    • Target: biological target of the molecule expressed as the HGNC gene symbol
    • Activity type: for example, pIC50
    • Assay type: Simplification/Classification of the assay into cell-free, cellular, functional and unspecified
    • Unit: unit of bioactivity measurement
    • Mean columns of the databases: mean of bioactivity values or activity comments denoted with the frequency of their occurrence in the database, e.g. Mean C = 7.5 *(15) -> the value for this compound-target pair occurs 15 times in ChEMBL database
    • Activity check annotation: a bioactivity check was performed by comparing values from the different sources and adding an activity check annotation to provide automated activity validation for additional confidence
      • no comment: bioactivity values are within one log unit;
      • check activity data: bioactivity values are not within one log unit;
      • only one data point: only one value was available, no comparison and no range calculated;
      • no activity value: no precise numeric activity value was available;
      • no log-value could be calculated: no negative decadic logarithm could be calculated, e.g., because the reported unit was not a compound concentration
    • Ligand names: all unique names contained in the five source databases are listed
    • Canonical SMILES columns: Molecular structure of the compound from each database
    • Structure check: To denote matching or differing compound structures in different source databases
      • match: molecule structures are the same between different sources;
      • no match: the structures differ;
      • 1 source: no structure comparison is possible, because the molecule comes from only one source database.
    • Source: the databases from which the data come
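
The activity check annotation described in the column list above can be sketched as follows. This is only an illustration under stated assumptions: each source database contributes at most one mean value in negative log units, "within one log unit" is read as the spread between the largest and smallest value being at most 1.0, and the "no log-value could be calculated" case is left out. The function name `activity_check` is hypothetical and does not come from the dataset's KNIME workflow.

```python
def activity_check(values):
    """Annotate one compound-target pair from its per-database mean values.

    `values` holds one entry per source database: a float in negative
    log units, or None when no precise numeric value is available.
    """
    numeric = [v for v in values if v is not None]
    if not numeric:
        return "no activity value"
    if len(numeric) == 1:
        return "only one data point"
    # Assumption: "within one log unit" means max - min <= 1.0.
    if max(numeric) - min(numeric) <= 1.0:
        return "no comment"
    return "check activity data"


if __name__ == "__main__":
    print(activity_check([7.5, 7.9, None]))   # two sources that agree
    print(activity_check([5.0, 7.2]))         # two sources that diverge
```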

  20. NanoClass-compatible BOLD CO1 databases

    • data.niaid.nih.gov
    Updated Dec 3, 2021
    Evelien Jongepier (2021). NanoClass-compatible BOLD CO1 databases [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5751456
    Dataset updated
    Dec 3, 2021
    Dataset provided by
    University of Amsterdam
    Authors
    Evelien Jongepier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BOLD CO1 databases reformatted to use in NanoClass (https://github.com/ejongepier/NanoClass; version 0.3.0-beta or higher) and QIIME2. Three separate databases are included for use in combination with primers mtD, LCO-HCO and CI. Databases include reference sequences and reference taxonomies for the use in NanoClass, as well as pre-trained classifiers for use in QIIME2. See usage instructions below.

    For questions, please contact e.jongepier@uva.nl.

    ==========================================

    WARNING

    Please note this version of a custom BOLD CO1 db comes with absolutely no warranties.

    When using this db in NanoClass, mind that it has only been tested with methods ["megablast","minimap","spingo"]. NanoClass cannot be run in combination with these BOLD CO1 databases using methods ["mothur","centrifuge","kraken"]. Compatibility with ["blast","dcmegablast","qiime","rdp"] is untested. Remove the tools you want to skip from NanoClass/config.yaml (see also the NanoClass documentation: https://ejongepier.github.io/NanoClass/).

    Never use this database in combination with the NanoClass snakemake -F parameter, or this BOLD CO1 database will be overwritten by the default 16S SILVA database.

    ==========================================

    DESCRIPTION

    BOLD CO1 database (last) downloaded on 20210420 and reformatted for use in QIIME2 and NanoClass. To clean up the BOLD CO1 db, these steps were taken (steps 7 to 11 were repeated for each of the 3 primers):

    1. remove identical duplicates [3597874]
    2. drop seqs with non-IUPAC characters [3597839]
    3. remove leading and trailing ambiguous bases [3597839]
    4. remove low quality reads
    5. remove reads with homopolymer runs
    6. filter by length
    7. extract fragments between primer sequences [mtD:112450; CI:121391; LCO-HCO:65307]
    8. dereplicate / cluster [mtD:55075; CI:46470; LCO-HCO:24835]
    9. remove uninformative taxonomic labels [mtD:55073; CI:46466; LCO-HCO:24832]
    10. reformat db for use in NanoClass
    11. train classifier based on fragments
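
The first three clean-up steps listed above can be illustrated with a short sketch. It is not the pipeline used to build these databases; the nucleotide alphabet, the reading of "identical duplicates" as records with the same identifier and sequence, and the function name `clean_records` are all assumptions made for the example.

```python
# Assumed alphabets: IUPAC nucleotide codes, and the subset treated as ambiguous.
IUPAC_CODES = set("ACGTURYSWKMBDHVN")
AMBIGUOUS_CODES = "RYSWKMBDHVN"


def clean_records(records):
    """Apply three illustrative clean-up steps to (identifier, sequence) pairs."""
    seen = set()
    cleaned = []
    for seq_id, seq in records:
        seq = seq.upper()
        # Step 1 (assumed meaning): skip records already seen with the same id and sequence.
        if (seq_id, seq) in seen:
            continue
        seen.add((seq_id, seq))
        # Step 2: drop sequences containing characters outside the IUPAC alphabet.
        if set(seq) - IUPAC_CODES:
            continue
        # Step 3: trim ambiguous bases from both ends only.
        seq = seq.strip(AMBIGUOUS_CODES)
        if seq:
            cleaned.append((seq_id, seq))
    return cleaned


if __name__ == "__main__":
    demo = [("a", "ACGT"), ("a", "ACGT"), ("b", "ACXT"), ("c", "NNACGTRN")]
    print(clean_records(demo))
```

Ambiguous bases inside a sequence are kept in this sketch, since the step list only mentions leading and trailing ones.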

    ==========================================

    HOW TO USE THESE DBS

    Use in NanoClass:

    Unzip the database and copy the reference taxonomy and (unzipped) reference sequences to the NanoClass/db/common directory, like so:

    $ cp mtD/bold-v20210421-taxonomy-mtD.tsv /path/to/NanoClass/db/common/ref-taxonomy.txt
    $ gzip -d -c mtD/bold-v20210421-frags-mtD.fa.gz > /path/to/NanoClass/db/common/ref-seqs.fna

    Something similar can be done for the other two primers (CI or LCO-HCO). Only these three primers are supported at this point.

    Next, create an (empty) ref-seqs.aln file just to prevent NanoClass from automatically downloading the default 16S SILVA database, which would overwrite the BOLD db you just copied into NanoClass/db/common.

    $ touch /path/to/NanoClass/db/common/ref-seqs.aln
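
The three shell steps above (copy the taxonomy, decompress the fragments, create an empty ref-seqs.aln) can also be scripted. The following is a minimal sketch, not part of NanoClass: the function name `install_bold_db` and its parameters are hypothetical, and the assumption that the CI and LCO-HCO files follow the same naming pattern as the mtD files has not been checked against the archive.

```python
import gzip
import shutil
from pathlib import Path


def install_bold_db(db_dir, nanoclass_dir, primer="mtD", version="v20210421"):
    """Copy one primer's BOLD reference files into NanoClass/db/common."""
    src = Path(db_dir) / primer
    common = Path(nanoclass_dir) / "db" / "common"
    # Creating the directory is a convenience; the shell commands assume it exists.
    common.mkdir(parents=True, exist_ok=True)

    # Equivalent of: cp <primer>/bold-...-taxonomy-<primer>.tsv .../ref-taxonomy.txt
    shutil.copyfile(src / f"bold-{version}-taxonomy-{primer}.tsv",
                    common / "ref-taxonomy.txt")

    # Equivalent of: gzip -d -c <primer>/bold-...-frags-<primer>.fa.gz > .../ref-seqs.fna
    with gzip.open(src / f"bold-{version}-frags-{primer}.fa.gz", "rb") as fin, \
            open(common / "ref-seqs.fna", "wb") as fout:
        shutil.copyfileobj(fin, fout)

    # Equivalent of: touch .../ref-seqs.aln (keeps NanoClass from fetching SILVA)
    (common / "ref-seqs.aln").touch()
    return common
```

A call such as `install_bold_db("/path/to/unzipped/db", "/path/to/NanoClass")` would mirror the mtD commands shown above.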

    Finally, you need to make a change to the NanoClass/Snakefile (i.e., change the first line into the second).

    optrules.extend(["plots/precision.pdf"] if len(config["methods"]) > 2 else [])
    optrules.extend(["plots/precision.pdf"] if len(config["methods"]) > 200 else [])

    This will disable the computation of precision plots by NanoClass as this is not supported in combination with the custom BOLD CO1 databases.

    Also mind that you need to change the nanofilt minlen and maxlen in the NanoClass/config.yaml to capture the appropriate fragment length for your primer. For the mtD primer I used minlen 600 and maxlen 900 for testing.

    Use in QIIME2:

    You can use the trained classifier directly in QIIME2, like so:

    $ qiime feature-classifier classify-sklearn \
      --i-classifier mtD/bold-v20210421-classifier-mtD.qza \
      --i-reads .qza \
      --o-classification .qza \
      --verbose

    Something similar can be done for the other two primers (CI or LCO-HCO). Only these three primers are supported at this point. The classifiers have only been tested with the sklearn algorithm.

Most popular database management systems worldwide 2024

41 scholarly articles cite this dataset (View in Google Scholar)