100+ datasets found

Most popular database management systems worldwide 2024
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/
Explore at:
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 2024
Area covered
Worldwide
Description
As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of *******; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry includes some of the largest companies in tech, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.

Database Management Systems

As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
Global Distributed Relational Database Market Size By Deployment Type, By...
verifiedmarketresearch.com
Updated Sep 2, 2024
Cite
VERIFIED MARKET RESEARCH (2024). Global Distributed Relational Database Market Size By Deployment Type, By Organization Size, By End User Industry, By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/distributed-relational-database-market/
Explore at:
Dataset updated
Sep 2, 2024
Dataset authored and provided by
VERIFIED MARKET RESEARCH
License
https://www.verifiedmarketresearch.com/privacy-policy/
Time period covered
2024 - 2031
Area covered
Global
Description
The Distributed Relational Database Market has grown at a moderate pace, with substantial growth rates over the last few years, and is estimated to grow significantly over the forecast period, 2024 to 2031.

Global Distributed Relational Database Market Drivers

The market drivers for the Distributed Relational Database Market can be influenced by various factors. These may include:

Growing Data Volume: Organizations require scalable and effective methods to handle and process massive amounts of data due to the exponential growth in data generation. Scalability and enhanced performance are two features that make distributed relational databases a good option for managing large amounts of data.

Cloud Adoption: The emergence of cloud computing has greatly impacted the market for distributed relational databases. With their scalable infrastructure and managed database services, cloud platforms encourage the use of distributed databases in cloud environments. Cloud providers also build distributed databases into their own services, increasing accessibility.

Global Distributed Relational Database Market Restraints

Several factors can act as restraints or challenges for the Distributed Relational Database Market. These may include:

Complexity in Management: Complex configurations and management are frequently associated with distributed relational databases. It can be difficult to ensure data consistency, manage distributed transactions, and deal with node failures; these tasks may call for specific knowledge and resources.

High Initial Costs: Implementing distributed relational databases can carry a hefty upfront cost, including infrastructure investments and licensing fees. These upfront expenses may prevent adoption by smaller businesses or those with tighter budgets.
Relational Database Software Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Relational Database Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/relational-database-software-market
Explore at:
Available download formats: pptx, pdf, csv
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Relational Database Software Market Outlook

In 2023, the global market size for relational database software is valued at approximately $61.5 billion, with an anticipated growth to $113.9 billion by 2032, reflecting a robust CAGR of 7.1%. This impressive growth is mainly driven by the increasing volume of data generated across industries and the need for efficient data management solutions. The expanding application of relational database software in various sectors such as BFSI, healthcare, and telecommunications is also a significant contributor to market growth. Furthermore, the transition from legacy systems to modern, scalable database solutions is propelling this market forward.
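
The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The sketch below is only an arithmetic check, not part of the report: it assumes nine compounding years (2023 to 2032), which the report does not state explicitly, and the function name `cagr` is chosen here for illustration.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the paragraph above (USD billions).
# Assumption: 2023 -> 2032 is treated as 9 compounding years.
rate = cagr(61.5, 113.9, years=9)
print(f"Implied CAGR: {rate:.2%}")
```

Under that nine-year assumption the implied rate lands close to the quoted 7.1%, so the two endpoints and the stated CAGR are mutually consistent.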

The proliferation of data from diverse sources, including IoT devices, social media, and enterprise applications, is one of the primary growth factors for the relational database software market. Organizations are increasingly adopting advanced database management systems to handle large volumes of structured and unstructured data efficiently. This necessity aligns with the growing trend of digital transformation, where data plays a crucial role in driving business insights and decision-making processes. Additionally, the rise of big data analytics and artificial intelligence necessitates robust database solutions that can manage and process vast amounts of data in real-time.

Another significant growth driver for this market is the increasing reliance on cloud-based solutions. Cloud computing offers scalable, flexible, and cost-effective database management options, making it an attractive choice for enterprises of all sizes. The adoption of cloud-based relational database software is accelerating as it reduces the need for physical infrastructure, lowers maintenance costs, and provides seamless access to data from any location. Moreover, cloud providers are continually enhancing their offerings with advanced features such as automated backups, disaster recovery, and high availability, further boosting the market demand.

The integration of relational database software with emerging technologies such as blockchain, machine learning, and internet of things (IoT) is also fueling market growth. These integrations enable enhanced data security, improved data analytics capabilities, and efficient data management, which are crucial for modern enterprises. For instance, blockchain technology can provide a secure and transparent way of handling transactions and records within a relational database, while machine learning algorithms can optimize queries and database performance. As these technologies evolve, their synergy with relational database software is expected to create new opportunities and drive further market expansion.

In addition to the growing significance of relational databases, Object-Oriented Databases Software is gaining traction as businesses seek more flexible and efficient ways to manage complex data structures. Unlike traditional relational databases that rely on tables and rows, object-oriented databases store data in objects, similar to how data is organized in object-oriented programming. This approach allows for a more intuitive mapping of real-world entities and relationships, making it particularly beneficial for applications that require complex data representations, such as computer-aided design (CAD), multimedia systems, and telecommunications. As industries continue to evolve and demand more sophisticated data management solutions, the adoption of object-oriented databases is expected to rise, complementing the existing relational database landscape.

Region-wise, North America holds a significant share of the relational database software market, driven by the presence of leading technology companies, high adoption of advanced IT solutions, and substantial investments in research and development. Europe follows closely, with strong growth observed in cloud-based solutions and regulatory frameworks favoring data security and privacy. The Asia Pacific region is projected to exhibit the highest growth rate, attributed to the rapid digitalization of economies, increasing IT expenditures, and expanding tech-savvy population. Conversely, Latin America and the Middle East & Africa regions are also experiencing growth, albeit at a slower pace, due to growing awareness and gradual adoption of database management solutions.

Deployment Mode Analysis

The deployment mode segment of the relational database software market can be bifur
Open Source Database Solution Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Sep 23, 2024
Cite
Dataintelo (2024). Open Source Database Solution Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-open-source-database-solution-market
Explore at:
Available download formats: pdf, pptx, csv
Dataset updated
Sep 23, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Open Source Database Solution Market Outlook

The global market size for open source database solutions is projected to exhibit remarkable growth, driven by a compound annual growth rate (CAGR) of 12.5% from 2024 to 2032. In 2023, the market is estimated to be valued at USD 11.2 billion and is expected to reach approximately USD 28.8 billion by 2032. The growth factors contributing significantly to this expansion include the increasing adoption of data-driven decision-making processes, cost-efficiency of open source solutions, and the proliferation of big data and IoT applications.

The growth of the open source database solution market is majorly attributed to the increasing reliance on data analytics across various industries. Enterprises are increasingly leveraging data to derive actionable insights, make informed decisions, and optimize operations. Open source database solutions offer a cost-effective alternative to proprietary databases, thereby enabling organizations of all sizes to harness the power of data without incurring prohibitive costs. Additionally, the flexibility and scalability of open source databases make them an attractive choice for enterprises looking to manage and analyze large volumes of data efficiently.

Another key growth factor is the burgeoning demand for cloud-based solutions. The cloud offers numerous advantages, including scalability, reduced infrastructure costs, and improved accessibility. Open source databases are well-suited for cloud deployments, enabling organizations to leverage the elasticity and computational power of cloud environments. As more businesses migrate to the cloud, the demand for open source database solutions is expected to surge. Moreover, the ongoing advancements in cloud technology, such as the introduction of serverless architectures and managed database services, further bolster the adoption of open source databases in the cloud.

The rise of the Internet of Things (IoT) and big data technologies is also driving the growth of the open source database solution market. IoT devices generate vast amounts of data that need to be stored, managed, and analyzed in real-time. Open source databases are capable of handling the high velocity, variety, and volume of IoT data, making them a preferred choice for IoT applications. Similarly, big data technologies, which require robust and scalable database solutions, are increasingly relying on open source databases to manage large datasets and perform complex analytics.

Regionally, North America is expected to dominate the open source database solution market, driven by the presence of major technology companies and early adopters of advanced technologies. The region's well-established IT infrastructure and the growing emphasis on data analytics further contribute to its leadership in the market. However, significant growth is also anticipated in the Asia Pacific region, fueled by the rapid digitization of economies, increasing investments in IT infrastructure, and the expanding base of tech-savvy enterprises. European markets are also poised for steady growth, supported by favorable regulatory frameworks and the rising adoption of open source technologies in various industries.

Database Type Analysis

The open source database solution market can be segmented by database type into SQL, NoSQL, and NewSQL databases. SQL databases, or traditional relational databases, remain a cornerstone in the market, known for their ability to handle structured data efficiently. These databases are particularly favored in applications requiring ACID (Atomicity, Consistency, Isolation, Durability) compliance, such as financial transactions and enterprise resource planning (ERP) systems. Despite the emergence of newer technologies, SQL databases continue to see widespread adoption due to their maturity, robustness, and the extensive ecosystem of tools and support available.

NoSQL databases, on the other hand, have gained significant traction in recent years, driven by the need to manage unstructured and semi-structured data. These databases offer superior scalability and flexibility, making them ideal for applications such as social media analytics, content management systems, and real-time web applications. NoSQL databases are designed to handle large volumes of data and high user loads, which makes them particularly suitable for big data applications. The diverse range of NoSQL databases, including document stores, key-value stores, column-family stores, and graph databases, provides organizations with the flexibility to choose the best-fit solution for their specific use cases.
Data from: The SCOC database – a large, open and global database with...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 4, 2022
Cite
Tanja Stratmann; Karline Soetaert; Chih-Lin Wei; Yu-Shih Lin; Dick van Oevelen (2022). The SCOC database – a large, open and global database with sediment community oxygen consumption rates [Dataset]. http://doi.org/10.5061/dryad.25nd083
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.25nd083
Dataset updated
Oct 4, 2022
Dataset provided by
Royal Netherlands Institute for Sea Research
Authors
Tanja Stratmann; Karline Soetaert; Chih-Lin Wei; Yu-Shih Lin; Dick van Oevelen
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Mediterranean Sea, Pacific Ocean, Southern Ocean, Atlantic Ocean, Red Sea, Black Sea, Arctic Ocean, Indian Ocean, Gulf of Mexico
Description
Sediment community oxygen consumption (SCOC) rates provide important information about biogeochemical processes in marine sediments and the activity of benthic microorganisms and fauna. Therefore, several databases of sediment community oxygen consumption data have been compiled since the mid-1990s. However, these earlier databases contained far fewer data records and were not freely available. Additionally, the databases were not transparent in their selection procedure, so other researchers could not assess the quality of the data. Here, we present the largest, best documented, and freely available database of SCOC data compiled to date. The database comprises 2,936 georeferenced SCOC records from 208 studies that were selected following the procedure for systematic reviews and meta-analyses. Each data record states whether the oxygen consumption was measured ex situ or in situ, as total oxygen uptake or diffusive oxygen uptake, and which measurement device was used. The database will be curated and updated annually to secure and maintain an up-to-date global database of SCOC data.
Distributed Database Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Oct 4, 2024
Cite
Dataintelo (2024). Distributed Database Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/distributed-database-market
Explore at:
Available download formats: pdf, csv, pptx
Dataset updated
Oct 4, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Distributed Database Market Outlook

The global distributed database market size was valued at USD 12.5 billion in 2023 and is projected to reach USD 28.6 billion by 2032, registering a compound annual growth rate (CAGR) of 9.6% during the forecast period. This growth is driven by the proliferation of big data, the expanding IoT ecosystem, and the increasing need for real-time data processing and analytics.

One of the significant growth factors for the distributed database market is the rising adoption of cloud-based services. Organizations are increasingly moving their operations to the cloud to leverage its scalability, flexibility, and cost-effectiveness. Cloud services enable businesses to manage and process vast amounts of data efficiently, which is essential for real-time analytics and decision-making. Additionally, cloud-based distributed databases offer enhanced disaster recovery capabilities, reducing the risk of data loss and ensuring business continuity.

Another factor propelling the growth of the distributed database market is the increasing need for real-time data processing and analytics. In today's fast-paced business environment, companies must analyze data in real-time to gain actionable insights and stay competitive. Distributed databases facilitate real-time data processing by distributing the workload across multiple servers, ensuring that data can be accessed and analyzed quickly and efficiently. This capability is particularly crucial for industries such as finance, healthcare, and retail, where timely decision-making can significantly impact business outcomes.

The growing adoption of Internet of Things (IoT) technology is also driving the demand for distributed databases. IoT devices generate massive amounts of data that need to be collected, stored, and analyzed in real-time. Distributed databases are well-suited for handling the high volume, velocity, and variety of IoT data, enabling businesses to gain valuable insights and improve operational efficiency. Additionally, the ability to process and analyze IoT data in real-time can help organizations enhance their products and services, optimize resource utilization, and improve customer experiences.

Regional outlook for the distributed database market shows significant growth potential across various regions. North America is expected to dominate the market due to the presence of major technology players and early adoption of advanced technologies. Europe is also anticipated to witness substantial growth, driven by the increasing adoption of cloud services and rising investments in big data analytics. Meanwhile, the Asia Pacific region is projected to experience the highest growth rate, fueled by the rapid digital transformation of businesses, growing IoT ecosystem, and increasing demand for real-time analytics solutions.

Database Type Analysis

The distributed database market is segmented by database type into relational, NoSQL, and NewSQL databases. Relational databases, which have been the backbone of enterprise data management for decades, continue to hold a significant market share. These databases are highly structured and use SQL queries for data manipulation, making them ideal for applications that require complex transactions and data integrity. The robustness and reliability of relational databases make them a popular choice for industries such as finance, healthcare, and retail, where data accuracy and consistency are paramount.

NoSQL databases have gained traction in recent years due to their ability to handle unstructured and semi-structured data. Unlike relational databases, NoSQL databases do not rely on a fixed schema, allowing for greater flexibility and scalability. This makes them well-suited for applications that deal with large volumes of diverse data types, such as social media platforms, IoT applications, and content management systems. The growing need for big data analytics and real-time data processing is driving the adoption of NoSQL databases, as they can efficiently manage and analyze vast amounts of data.

NewSQL databases are a relatively new entrant in the distributed database market, combining the best features of relational and NoSQL databases. They offer the scalability and flexibility of NoSQL databases while maintaining the ACID (Atomicity, Consistency, Isolation, Durability) properties of relational databases. This makes NewSQL databases ideal for applications that require high performance and data integrity. As businesses increasingly seek solutions that can handle both structured and unstructured data while
Database Market Size & Share Analysis - Industry Research Report - Growth...
mordorintelligence.com
pdf,excel,csv,ppt
Updated Jul 2, 2025
Cite
Mordor Intelligence (2025). Database Market Size & Share Analysis - Industry Research Report - Growth Trends, 2030 [Dataset]. https://www.mordorintelligence.com/industry-reports/database-market
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jul 2, 2025
Dataset provided by
Authors
Mordor Intelligence
License
https://www.mordorintelligence.com/privacy-policy
Time period covered
2020 - 2030
Area covered
Global
Description
The Database Market is Segmented by Database Type (Relational (RDBMS), NoSQL, and More), Deployment (Cloud, On-Premises), Service Model (Database-As-A-Service (DBaaS), License and Maintenance Software), Enterprise (SMEs, Large Enterprises), Workload Type (Transactional (OLTP), Analytical (OLAP), and More), End-User Vertical (BFSI, Retail, and More), and by Geography. The Market Forecasts are Provided in Terms of Value (USD).
Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter...
zenodo.org
explore.openaire.eu
bz2
Updated Mar 15, 2021
Cite
João Felipe; Leonardo; Vanessa; Juliana (2021). Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks [Dataset]. http://doi.org/10.5281/zenodo.2592524
Explore at:
Available download formats: bz2
Unique identifier
https://doi.org/10.5281/zenodo.2592524
Dataset updated
Mar 15, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
João Felipe; Leonardo; Vanessa; Juliana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices, and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub.

Paper: https://2019.msrconf.org/event/msr-2019-papers-a-large-scale-study-about-quality-and-reproducibility-of-jupyter-notebooks

This repository contains two files:

dump.tar.bz2

jupyter_reproducibility.tar.bz2

The dump.tar.bz2 file contains a PostgreSQL dump of the database, with all the data we extracted from the notebooks.

The jupyter_reproducibility.tar.bz2 file contains all the scripts we used to query and download Jupyter Notebooks, extract data from them, and analyze the data. It is organized as follows:

analyses: this folder has all the notebooks we use to analyze the data in the PostgreSQL database.

archaeology: this folder has all the scripts we use to query, download, and extract data from GitHub notebooks.

paper: empty. The notebook analyses/N12.To.Paper.ipynb moves data to it

In the remainder of this text, we give instructions for reproducing the analyses, by using the data provided in the dump, and for reproducing the collection, by collecting data from GitHub again.

Reproducing the Analysis

This section shows how to load the data in the database and run the analyses notebooks. In the analysis, we used the following environment:

Ubuntu 18.04.1 LTS
PostgreSQL 10.6
Conda 4.5.11
Python 3.7.2
PdfCrop 2012/11/02 v1.38

First, download dump.tar.bz2 and extract it:

tar -xjf dump.tar.bz2

It extracts the file db2019-03-13.dump. Create a database in PostgreSQL (we call it "jupyter"), and use psql to restore the dump:

psql jupyter < db2019-03-13.dump

It populates the database with the dump. Now, configure the connection string for sqlalchemy by setting the environment variable JUP_DB_CONNECTION:

export JUP_DB_CONNECTION="postgresql://user:password@hostname/jupyter";
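
Before starting the notebooks, it can help to confirm that the variable is set and well-formed. The following is a minimal sketch using only the Python standard library; it is not part of the published scripts, and the helper name `describe_connection` is hypothetical. It checks the URL shape only and does not open a database connection.

```python
import os
from urllib.parse import urlsplit


def describe_connection(url: str) -> dict:
    """Split a connection URL into its parts, leaving the password out."""
    parts = urlsplit(url)
    if not parts.scheme.startswith("postgresql"):
        raise ValueError(f"expected a postgresql:// URL, got scheme {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("connection URL needs both a host and a database name")
    return {"user": parts.username, "host": parts.hostname, "database": database}


if __name__ == "__main__":
    url = os.environ.get("JUP_DB_CONNECTION")
    if url is None:
        print("JUP_DB_CONNECTION is not set")
    else:
        print(describe_connection(url))
```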

Download and extract jupyter_reproducibility.tar.bz2:

tar -xjf jupyter_reproducibility.tar.bz2

Create a conda environment with Python 3.7:

conda create -n analyses python=3.7
conda activate analyses

Go to the analyses folder and install all the dependencies listed in requirements.txt:

cd jupyter_reproducibility/analyses
pip install -r requirements.txt

To reproduce the analyses, run jupyter in this folder:

jupyter notebook

Execute the notebooks in this order:

Index.ipynb

N0.Repository.ipynb

N1.Skip.Notebook.ipynb

N2.Notebook.ipynb

N3.Cell.ipynb

N4.Features.ipynb

N5.Modules.ipynb

N6.AST.ipynb

N7.Name.ipynb

N8.Execution.ipynb

N9.Cell.Execution.Order.ipynb

N10.Markdown.ipynb

N11.Repository.With.Notebook.Restriction.ipynb

N12.To.Paper.ipynb
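
Note that a plain alphabetical sort would place N10 through N12 before N2, so the order has to follow the numeric prefix. The sketch below shows one way to derive the order from the file names. It is an illustration rather than part of the published scripts: the helper name `notebook_order` is hypothetical, and the printed `jupyter nbconvert` command is an alternative, non-interactive way to run a notebook, not the interactive `jupyter notebook` workflow the authors describe.

```python
import re


def notebook_order(name: str) -> int:
    """Index.ipynb sorts first; N<k>.*.ipynb sorts by the integer k."""
    if name == "Index.ipynb":
        return -1
    match = re.match(r"N(\d+)\.", name)
    if match is None:
        raise ValueError(f"unexpected notebook name: {name}")
    return int(match.group(1))


# A few of the notebook names listed above, deliberately shuffled.
names = ["N10.Markdown.ipynb", "N2.Notebook.ipynb", "Index.ipynb", "N0.Repository.ipynb"]
ordered = sorted(names, key=notebook_order)

for name in ordered:
    # Command line for a non-interactive run; printed here, not executed.
    print("jupyter nbconvert --to notebook --execute --inplace", name)
```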

Reproducing or Expanding the Collection

The collection demands more steps to reproduce and takes much longer to run (months). It also involves running arbitrary code on your machine. Proceed with caution.

Requirements

This time, we have extra requirements:

All the analysis requirements
lbzip2 2.5
gcc 7.3.0
Github account
Gmail account

Environment

First, set the following environment variables:

export JUP_MACHINE="db"; # machine identifier
export JUP_BASE_DIR="/mnt/jupyter/github"; # place to store the repositories
export JUP_LOGS_DIR="/home/jupyter/logs"; # log files
export JUP_COMPRESSION="lbzip2"; # compression program
export JUP_VERBOSE="5"; # verbose level
export JUP_DB_CONNECTION="postgresql://user:password@hostname/jupyter"; # sqlalchemy connection
export JUP_GITHUB_USERNAME="github_username"; # your github username
export JUP_GITHUB_PASSWORD="github_password"; # your github password
export JUP_MAX_SIZE="8000.0"; # maximum size of the repositories directory (in GB)
export JUP_FIRST_DATE="2013-01-01"; # initial date to query github
export JUP_EMAIL_LOGIN="gmail@gmail.com"; # your gmail address
export JUP_EMAIL_TO="target@email.com"; # email that receives notifications
export JUP_OAUTH_FILE="~/oauth2_creds.json" # oauth2 authentication file
export JUP_NOTEBOOK_INTERVAL=""; # notebook id interval for this machine. Leave it blank
export JUP_REPOSITORY_INTERVAL=""; # repository id interval for this machine. Leave it blank
export JUP_WITH_EXECUTION="1"; # execute python notebooks
export JUP_WITH_DEPENDENCY="0"; # run notebooks with and without declared dependencies
export JUP_EXECUTION_MODE="-1"; # run following the execution order
export JUP_EXECUTION_DIR="/home/jupyter/execution"; # temporary directory for running notebooks
export JUP_ANACONDA_PATH="~/anaconda3"; # conda installation path
export JUP_MOUNT_BASE="/home/jupyter/mount_ghstudy.sh"; # bash script to mount base dir
export JUP_UMOUNT_BASE="/home/jupyter/umount_ghstudy.sh"; # bash script to umount base dir
export JUP_NOTEBOOK_TIMEOUT="300"; # timeout the extraction

# Frequency of log report
export JUP_ASTROID_FREQUENCY="5";
export JUP_IPYTHON_FREQUENCY="5";
export JUP_NOTEBOOKS_FREQUENCY="5";
export JUP_REQUIREMENT_FREQUENCY="5";
export JUP_CRAWLER_FREQUENCY="1";
export JUP_CLONE_FREQUENCY="1";
export JUP_COMPRESS_FREQUENCY="5";
export JUP_DB_IP="localhost"; # postgres database IP
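
With this many variables, a quick check that none were missed can save a failed run. The sketch below is an illustration, not part of the published scripts: the `REQUIRED` list is a subset picked here for the example (the two interval variables are left out because the instructions say to leave them blank), and the helper name `missing_variables` is hypothetical.

```python
import os

# Illustrative subset of the variables listed above; extend as needed.
REQUIRED = [
    "JUP_MACHINE",
    "JUP_BASE_DIR",
    "JUP_LOGS_DIR",
    "JUP_DB_CONNECTION",
    "JUP_GITHUB_USERNAME",
    "JUP_GITHUB_PASSWORD",
    "JUP_EMAIL_LOGIN",
    "JUP_EMAIL_TO",
    "JUP_OAUTH_FILE",
]


def missing_variables(env, required=REQUIRED):
    """Return the required names that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All listed variables are set")
```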

Then, configure the file ~/oauth2_creds.json, according to yagmail documentation: https://media.readthedocs.org/pdf/yagmail/latest/yagmail.pdf

Configure the mount_ghstudy.sh and umount_ghstudy.sh scripts. The first one should mount the folder that stores the directories. The second one should unmount it. You can leave the scripts blank, but this is not advisable, as the reproducibility study runs arbitrary code on your machine and you may lose your data.

Scripts

Download and extract jupyter_reproducibility.tar.bz2:

tar -xjf jupyter_reproducibility.tar.bz2

Install 5 conda environments and 5 anaconda environments, for each python version. In each of them, upgrade pip, install pipenv, and install the archaeology package (Note that it is a local package that has not been published to pypi. Make sure to use the -e option):

Conda 2.7

conda create -n raw27 python=2.7 -y
conda activate raw27
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Anaconda 2.7

conda create -n py27 python=2.7 anaconda -y
conda activate py27
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Conda 3.4

It requires a manual jupyter and pathlib2 installation due to some incompatibilities found on the default installation.

conda create -n raw34 python=3.4 -y
conda activate raw34
conda install jupyter -c conda-forge -y
conda uninstall jupyter -y
pip install --upgrade pip
pip install jupyter
pip install pipenv
pip install -e jupyter_reproducibility/archaeology
pip install pathlib2

Anaconda 3.4

conda create -n py34 python=3.4 anaconda -y
conda activate py34
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Conda 3.5

conda create -n raw35 python=3.5 -y
conda activate raw35
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Anaconda 3.5

It requires the manual installation of other anaconda packages.

conda create -n py35 python=3.5 anaconda -y
conda install -y appdirs atomicwrites keyring secretstorage libuuid navigator-updater prometheus_client pyasn1 pyasn1-modules spyder-kernels tqdm jeepney automat constantly anaconda-navigator
conda activate py35
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Conda 3.6

conda create -n raw36 python=3.6 -y
conda activate raw36
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Anaconda 3.6

conda create -n py36 python=3.6 anaconda -y
conda activate py36
conda install -y anaconda-navigator jupyterlab_server navigator-updater
pip install --upgrade pip
pip install pipenv
pip install -e jupyter_reproducibility/archaeology

Conda 3.7

Leading big data vendors in 2014-2017, by revenue
statista.com
Updated Jul 10, 2025
Statista (2025). Leading big data vendors in 2014-2017, by revenue [Dataset]. https://www.statista.com/statistics/254271/big-data-revenue-by-leading-vendors/
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
This statistic shows the revenues from the leading big data vendors from 2014 to 2017. In 2017, IBM generated around **** billion U.S. dollars worth of revenue through big data services, software and hardware.
Time Series Databases Software Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Dec 3, 2024
Dataintelo (2024). Time Series Databases Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-time-series-databases-software-market
Explore at:
Available download formats: pptx, csv, pdf
Dataset updated
Dec 3, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Time Series Databases Software Market Outlook

The global time series databases software market is experiencing significant expansion, with market size estimated at approximately USD 1.5 billion in 2023 and projected to reach USD 4.2 billion by 2032, registering a robust compound annual growth rate (CAGR) of 12.5% during the forecast period. This growth is driven by the increasing need for real-time analytics and the management of time-stamped data across various industry verticals. The proliferation of IoT devices and the growing importance of time-stamped data in decision-making processes are key factors contributing to this upward trajectory. As businesses seek to leverage these capabilities, the demand for efficient time series databases continues to rise.

One of the major growth factors driving the time series databases software market is the burgeoning IoT ecosystem. With millions of devices generating vast amounts of data every second, there is an unprecedented demand for systems that can efficiently process, store, and analyze time-stamped data. IoT applications, such as smart cities, connected vehicles, and industrial automation, rely heavily on real-time data insights to optimize operations and improve outcomes. Consequently, organizations are investing in advanced time series databases to harness the potential of IoT-driven data streams effectively. This trend is expected to accelerate as IoT adoption continues to grow across various sectors.

Another pivotal growth factor is the increasing emphasis on predictive analytics and machine learning across industries. Time series databases play a crucial role in these areas by enabling businesses to analyze historical data patterns and predict future trends. In sectors like finance, healthcare, and energy, the ability to forecast future events accurately can lead to improved decision-making and strategic planning. For instance, financial institutions utilize time series databases for stock market analysis, while healthcare providers use them for patient monitoring and prognosis. This growing reliance on predictive analytics is expected to fuel the demand for time series database solutions in the coming years.

The need for high-performance and scalable data architectures is also contributing to market growth. Traditional relational databases are often ill-equipped to handle the unique challenges posed by time-stamped data, such as high write and query loads and the need for efficient compression and data retention strategies. Time series databases are specifically designed to address these challenges, offering features such as efficient storage, fast retrieval, and seamless integration with analytics tools. As organizations grapple with increasingly large datasets, the adoption of time series databases is anticipated to rise, driven by the demand for scalable and cost-effective solutions.

Regionally, North America holds a significant share of the time series databases software market, driven by the presence of numerous tech-savvy industries and a strong focus on digital transformation. The Asia Pacific region is expected to witness the highest growth rate, fueled by rapid industrialization, the expansion of smart city initiatives, and increasing investments in IoT infrastructure. Europe also presents substantial growth prospects due to the growing adoption of advanced analytics solutions across various sectors. Meanwhile, Latin America and the Middle East & Africa are gradually embracing these technologies, albeit at a slower pace, as infrastructure and digital initiatives continue to develop. Each region's growth trajectory is influenced by local economic conditions, technology adoption rates, and regulatory frameworks.

Deployment Type Analysis

The analysis of deployment types in the time series databases software market reveals a dynamic landscape shaped by varying organizational needs and technological preferences. On-premises deployment remains a viable option for many businesses, particularly those in regulated industries where data security and control are paramount. Organizations in sectors such as finance and healthcare often prefer on-premises solutions to maintain stringent control over their data environments. These deployments offer the advantage of complete data custody and the flexibility to tailor configurations to specific organizational requirements. However, these benefits come with the trade-offs of higher upfront costs and the need for in-house technical expertise to manage and maintain the infrastructure effectively.

On the other hand, the cloud-based deployment model is witnessing
Fuels Database for Intact and Invaded Big Sagebrush Ecological Sites
catalog.data.gov
data.usgs.gov
+1more
Updated Jul 24, 2025
U.S. Geological Survey (2025). Fuels Database for Intact and Invaded Big Sagebrush Ecological Sites [Dataset]. https://catalog.data.gov/dataset/fuels-database-for-intact-and-invaded-big-sagebrush-ecological-sites
Dataset updated
Jul 24, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
The Fuels Guide and Database for Big Sagebrush Ecological Sites was developed as part of the Joint Fire Sciences Program project “Quantifying and predicting fuels and the effects of reduction treatments along successional and invasion gradients in sagebrush habitats” (Shinneman et al. 2015). The research was carried out by the U.S. Geological Survey (USGS) Forest and Rangeland Ecosystem Science Center and Boise State University researchers, in partnership with the U.S. Bureau of Land Management and the Idaho Army National Guard. Most of the research for the project focused on the Morley Nelson Snake River Birds of Prey National Conservation Area (hereafter the NCA) in southern Idaho. Sagebrush shrublands in the NCA, and throughout much of the Great Basin and Snake River Plain, are highly influenced by non-native plants that alter successional trajectories, suppress native species, and promote frequent wildfire. Fine-fuel loadings created by nonnative annual grasses and forbs can be highly variable through space and time, which can increase uncertainty when predicting fire risk and behavior. The overarching goal of the research project was to explore and develop different approaches to better quantify and predict these dynamic fuel loadings, as well as the effects of fuels manipulations in sagebrush habitats. The purpose of this database is to provide a tool that allows ready access to fuel loading data across a range of conditions, from relatively intact sagebrush-bunchgrass communities to degraded communities dominated by nonnative annual grasses and forbs. The Fuels Guide and Database (FGD) is a tool designed to assist land managers in estimating fuel loads within a specific stand of vegetation, under conditions ranging from sagebrush-dominated to nonnative, annual grass/forb-dominated communities. 
Users can query the database based on vegetation cover, vegetation height, and specific environmental variables (for example elevation, precipitation, temperature, soil surface texture, and ecological site) and return fuel loading data that match the query parameters. The FGD also allows users to view photos by point or plot and to individually exclude certain points or plots to help identify areas that best match the current conditions. Final results can be exported to Microsoft Excel spreadsheet or summarized in Microsoft Word reports that can be used to improve estimates of fuel loadings in the field. Fuels data were collected on the NCA, and therefore extrapolation of queried results should also only be applied to the NCA and similar regional environments. However, there is potential for additional cover data, vegetation height data, and fuels data to be added to the FGD. If you are interested in contributing data to the FGD please contact the USGS Forest and Rangeland Ecosystem Science Center (fresc_outreach@usgs.gov). With additional input from other users, the Fuels Guide and Database has the potential to be a powerful tool throughout the sagebrush shrublands to assist land managers in quickly estimating fuel loadings.
Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL
zenodo.org
bin, json, txt
Updated Aug 16, 2021
Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson (2021). Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL [Dataset]. http://doi.org/10.5281/zenodo.5205322
Explore at:
Available download formats: txt, json, bin
Unique identifier
https://doi.org/10.5281/zenodo.5205322
Dataset updated
Aug 16, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This folder contains the Spider-Realistic dataset used for evaluation in the paper "Structure-Grounded Pretraining for Text-to-SQL". The dataset is created based on the dev split of the Spider dataset (2020-06-07 version from https://yale-lily.github.io/spider). We manually modified the original questions to remove the explicit mention of column names while keeping the SQL queries unchanged to better evaluate the model's capability in aligning the NL utterance and the DB schema. For more details, please check our paper at https://arxiv.org/abs/2010.12773.

It contains the following files:

- spider-realistic.json
# The spider-realistic evaluation set
# Examples: 508
# Databases: 19
- dev.json
# The original dev split of Spider
# Examples: 1034
# Databases: 20
- tables.json
# The original DB schemas from Spider
# Databases: 166
- README.txt
- license

The Spider-Realistic dataset is created based on the dev split of the Spider dataset released by Yu, Tao, et al. "Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task." It is a subset of the original dataset with explicit mentions of the column names removed. The SQL queries and databases are kept unchanged.
For the format of each json file, please refer to the github page of Spider https://github.com/taoyds/spider.
For the database files please refer to the official Spider release https://yale-lily.github.io/spider.
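
As a quick sanity check on the files listed above, the counts of examples and databases can be recomputed after download. The sketch below is a minimal, unofficial example: it assumes each JSON file is a list of records carrying a "db_id" key, as in the Spider format described on the GitHub page, and the helper names summarize and summarize_file are hypothetical. The sample records are made up for illustration and are not taken from the dataset.

```python
import json


def summarize(examples):
    """Return (number of examples, number of distinct databases)."""
    db_ids = {example["db_id"] for example in examples}
    return len(examples), len(db_ids)


def summarize_file(path):
    """Load one of the JSON files and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize(json.load(handle))


# Made-up records for illustration only; they are not taken from the dataset.
SAMPLE = [
    {"db_id": "db_a", "question": "q1", "query": "SELECT 1"},
    {"db_id": "db_a", "question": "q2", "query": "SELECT 2"},
    {"db_id": "db_b", "question": "q3", "query": "SELECT 3"},
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

If the downloaded files match the listing, summarize_file("spider-realistic.json") should report 508 examples over 19 databases, and summarize_file("dev.json") 1034 examples over 20 databases.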

This dataset is distributed under the CC BY-SA 4.0 license.

If you use the dataset, please cite the following papers including the original Spider datasets, Finegan-Dollak et al., 2018 and the original datasets for Restaurants, GeoQuery, Scholar, Academic, IMDB, and Yelp.

@article{deng2020structure,
title={Structure-Grounded Pretraining for Text-to-SQL},
author={Deng, Xiang and Awadallah, Ahmed Hassan and Meek, Christopher and Polozov, Oleksandr and Sun, Huan and Richardson, Matthew},
journal={arXiv preprint arXiv:2010.12773},
year={2020}
}

@inproceedings{Yu&al.18c,
year = 2018,
title = {Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task},
booktitle = {EMNLP},
author = {Tao Yu and Rui Zhang and Kai Yang and Michihiro Yasunaga and Dongxu Wang and Zifan Li and James Ma and Irene Li and Qingning Yao and Shanelle Roman and Zilin Zhang and Dragomir Radev }
}

@InProceedings{P18-1033,
author = "Finegan-Dollak, Catherine
and Kummerfeld, Jonathan K.
and Zhang, Li
and Ramanathan, Karthik
and Sadasivam, Sesh
and Zhang, Rui
and Radev, Dragomir",
title = "Improving Text-to-SQL Evaluation Methodology",
booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
year = "2018",
publisher = "Association for Computational Linguistics",
pages = "351--360",
location = "Melbourne, Australia",
url = "http://aclweb.org/anthology/P18-1033"
}

@InProceedings{data-sql-imdb-yelp,
dataset = {IMDB and Yelp},
author = {Navid Yaghmazadeh and Yuepeng Wang and Isil Dillig and Thomas Dillig},
title = {SQLizer: Query Synthesis from Natural Language},
booktitle = {International Conference on Object-Oriented Programming, Systems, Languages, and Applications, ACM},
month = {October},
year = {2017},
pages = {63:1--63:26},
url = {http://doi.org/10.1145/3133887},
}

@article{data-academic,
dataset = {Academic},
author = {Fei Li and H. V. Jagadish},
title = {Constructing an Interactive Natural Language Interface for Relational Databases},
journal = {Proceedings of the VLDB Endowment},
volume = {8},
number = {1},
month = {September},
year = {2014},
pages = {73--84},
url = {http://dx.doi.org/10.14778/2735461.2735468},
}

@InProceedings{data-atis-geography-scholar,
dataset = {Scholar, and Updated ATIS and Geography},
author = {Srinivasan Iyer and Ioannis Konstas and Alvin Cheung and Jayant Krishnamurthy and Luke Zettlemoyer},
title = {Learning a Neural Semantic Parser from User Feedback},
booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
year = {2017},
pages = {963--973},
location = {Vancouver, Canada},
url = {http://www.aclweb.org/anthology/P17-1089},
}

@inproceedings{data-geography-original,
dataset = {Geography, original},
author = {John M. Zelle and Raymond J. Mooney},
title = {Learning to Parse Database Queries Using Inductive Logic Programming},
booktitle = {Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2},
year = {1996},
pages = {1050--1055},
location = {Portland, Oregon},
url = {http://dl.acm.org/citation.cfm?id=1864519.1864543},
}

@inproceedings{data-restaurants-logic,
author = {Lappoon R. Tang and Raymond J. Mooney},
title = {Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing},
booktitle = {2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora},
year = {2000},
pages = {133--141},
location = {Hong Kong, China},
url = {http://www.aclweb.org/anthology/W00-1317},
}

@inproceedings{data-restaurants-original,
author = {Ana-Maria Popescu and Oren Etzioni and Henry Kautz},
title = {Towards a Theory of Natural Language Interfaces to Databases},
booktitle = {Proceedings of the 8th International Conference on Intelligent User Interfaces},
year = {2003},
location = {Miami, Florida, USA},
pages = {149--157},
url = {http://doi.acm.org/10.1145/604045.604070},
}

@inproceedings{data-restaurants,
author = {Alessandra Giordani and Alessandro Moschitti},
title = {Automatic Generation and Reranking of SQL-derived Answers to NL Questions},
booktitle = {Proceedings of the Second International Conference on Trustworthy Eternal Systems via Evolving Software, Data and Knowledge},
year = {2012},
location = {Montpellier, France},
pages = {59--76},
url = {https://doi.org/10.1007/978-3-642-45260-4_5},
}
NEWSQL Database Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Sep 22, 2024
Dataintelo (2024). NEWSQL Database Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-newsql-database-market
Explore at:
Available download formats: pptx, pdf, csv
Dataset updated
Sep 22, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
NEWSQL Database Market Outlook

The NEWSQL Database Market is poised for significant expansion, with a global market size estimated at USD 2.1 billion in 2023 and projected to reach approximately USD 8.6 billion by 2032, at a robust CAGR of 16.8% during the forecast period. This surge is driven by the increasing demand for scalable and high-performance database solutions that can seamlessly handle both transactional and analytical workloads, effectively bridging the gap between traditional relational databases and modern NoSQL databases.

The primary growth factor for the NEWSQL Database Market is the rapid increase in data generation across various industries. With the advent of IoT, big data, and advanced analytics, enterprises are generating massive amounts of structured and unstructured data that need to be efficiently stored, processed, and analyzed. NEWSQL databases, known for their ability to provide scalable performance while maintaining ACID (Atomicity, Consistency, Isolation, Durability) properties, are increasingly becoming the preferred choice for organizations aiming to leverage their data for competitive advantage.

Another significant driver of growth is the rising adoption of cloud computing. As more businesses migrate their operations to the cloud, there is a growing need for database solutions that can operate efficiently in cloud environments. NEWSQL databases, with their cloud-native architecture, offer unparalleled flexibility, scalability, and cost-efficiency, thereby attracting a significant number of enterprises. Additionally, the increasing trend toward digital transformation across industries is propelling the demand for advanced database solutions that can support new-age applications and workloads.

Furthermore, the evolving regulatory landscape around data security and privacy is also contributing to the growth of the NEWSQL Database Market. With stringent regulations such as GDPR and CCPA in place, organizations are under immense pressure to ensure the security and integrity of their data. NEWSQL databases, with their advanced security features and compliance capabilities, are well-equipped to meet these regulatory requirements, thus driving their adoption across various sectors, including BFSI, healthcare, and government.

Regionally, North America is expected to dominate the NEWSQL Database Market, owing to the presence of major technology players and early adopters of advanced database technologies. However, the Asia Pacific region is anticipated to witness the highest growth rate during the forecast period, driven by rapid digitalization, increasing investments in IT infrastructure, and the proliferation of startups and SMEs in countries like China, India, and Japan. Europe, Latin America, and the Middle East & Africa are also expected to contribute significantly to the market growth, supported by increasing IT spending and the growing need for advanced data management solutions.

Type Analysis

The NEWSQL Database Market is segmented by type into Cloud-Based and On-Premises solutions. The Cloud-Based segment is anticipated to witness substantial growth due to the increasing adoption of cloud technologies by businesses globally. Cloud-based NEWSQL databases offer numerous advantages, including reduced infrastructure costs, enhanced scalability, and flexibility, making them an attractive option for enterprises of all sizes. Furthermore, the rising trend of remote working and the need for real-time data access are further propelling the demand for cloud-based database solutions.

On the other hand, the On-Premises segment continues to hold a significant share of the market, particularly among large enterprises and organizations with critical data security requirements. On-premises NEWSQL databases offer enhanced control over data and infrastructure, making them suitable for sectors such as BFSI, healthcare, and government, where data privacy and security are of paramount importance. Despite the growing popularity of cloud-based solutions, the on-premises segment is expected to maintain steady growth, driven by the need for robust, secure, and high-performance database solutions.

Moreover, hybrid models that combine both cloud and on-premises database solutions are gaining traction. These models offer the best of both worlds, providing the scalability and flexibility of cloud-based solutions while retaining the control and security of on-premises systems. This trend is particularly evident in industries with fluctuating workloads or those undergoing digital transformation, where a hybrid approach
Alaska Geochemical Database Version 3.0 (AGDB3) including best value data...
catalog.data.gov
data.usgs.gov
+2more
Updated Jul 6, 2024
U.S. Geological Survey (2024). Alaska Geochemical Database Version 3.0 (AGDB3) including best value data compilations for rock, sediment, soil, mineral, and concentrate sample media [Dataset]. https://catalog.data.gov/dataset/alaska-geochemical-database-version-3-0-agdb3-including-best-value-data-compilations-for-r
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Alaska
Description
The Alaska Geochemical Database Version 3.0 (AGDB3) contains new geochemical data compilations in which each geologic material sample has one best value determination for each analyzed species, greatly improving speed and efficiency of use. Like the Alaska Geochemical Database Version 2.0 before it, the AGDB3 was created and designed to compile and integrate geochemical data from Alaska to facilitate geologic mapping, petrologic studies, mineral resource assessments, definition of geochemical baseline values and statistics, element concentrations and associations, environmental impact assessments, and studies in public health associated with geology. This relational database, created from databases and published datasets of the U.S. Geological Survey (USGS), Atomic Energy Commission National Uranium Resource Evaluation (NURE), Alaska Division of Geological & Geophysical Surveys (DGGS), U.S. Bureau of Mines, and U.S. Bureau of Land Management serves as a data archive in support of Alaskan geologic and geochemical projects and contains data tables in several different formats describing historical and new quantitative and qualitative geochemical analyses. The analytical results were determined by 112 laboratory and field analytical methods on 396,343 rock, sediment, soil, mineral, heavy-mineral concentrate, and oxalic acid leachate samples. Most samples were collected by personnel of these agencies and analyzed in agency laboratories or, under contracts, in commercial analytical laboratories. These data represent analyses of samples collected as part of various agency programs and projects from 1938 through 2017. In addition, mineralogical data from 18,138 nonmagnetic heavy-mineral concentrate samples are included in this database. 
The AGDB3 includes historical geochemical data archived in the USGS National Geochemical Database (NGDB) and NURE National Uranium Resource Evaluation-Hydrogeochemical and Stream Sediment Reconnaissance databases, and in the DGGS Geochemistry database. Retrievals from these databases were used to generate most of the AGDB data set. These data were checked for accuracy regarding sample location, sample media type, and analytical methods used. In other words, the data of the AGDB3 supersede data in the AGDB and the AGDB2, but the background about the data in these two earlier versions is needed by users of the current AGDB3 to understand what has been done to amend, clean up, correct and format these data. Corrections were entered, resulting in a significantly improved Alaska geochemical dataset, the AGDB3. Data that were not previously in these databases because the data predate the earliest agency geochemical databases, or were once excluded for programmatic reasons, are included here in the AGDB3 and will be added to the NGDB and Alaska Geochemistry. The AGDB3 data provided here are the most accurate and complete to date and should be useful for a wide variety of geochemical studies. The AGDB3 data provided in the online version of the database may be updated or changed periodically.
Mongo DB/ Json datasets
kaggle.com
Updated Sep 3, 2023
Shrashti (2023). Mongo DB/ Json datasets [Dataset]. https://www.kaggle.com/datasets/shrashtisinghal/mongo-db-datsets
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 3, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shrashti
License
Open Database License (ODbL) v1.0, https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
Introducing the largest and most comprehensive collection of Mongo DB Dataset! This meticulously curated dataset brings together a wealth of information from various domains, including ecommerce, aviation, biology, zoology, literature, history, and more. Meticulously gathered from numerous reliable sources, this dataset has been expertly transformed into a unified format, making it an invaluable resource for researchers, data scientists, and enthusiasts alike. Each domain contributes its unique insights and knowledge, providing a diverse range of information for exploration and analysis. With its enriched content and extensive coverage, this Mongo DB Dataset opens up endless possibilities for uncovering hidden patterns, conducting groundbreaking research, and gaining profound insights across multiple disciplines.
Top 10 Best B2B Data Providers in 2025: High-Quality B2B Data & Vendors
prospectwallet.com
Updated Aug 11, 2025
Prospect Wallet: B2B Mailing & Email lists | Direct Mail Marketing (2025). Top 10 Best B2B Data Providers in 2025: High-Quality B2B Data & Vendors [Dataset]. https://www.prospectwallet.com/blog/top-10-b2b-database-providers/
Dataset updated
Aug 11, 2025
Dataset authored and provided by
Prospect Wallet: B2B Mailing & Email lists | Direct Mail Marketing
Description
If you’re in B2B sales or marketing, you know the deal: finding the right prospects is half the battle. Without solid, up-to-date data, you’re just shooting in the dark. That’s where B2B data providers come in—think of them as your secret weapon for building a killer pipeline. They give you verified contact info, insights into what your prospects are up to, and the tools to make your outreach hit the mark.

I’ve spent hours digging into the bes
Multi-Satellite Volcanic Sulfur Dioxide L4 Long-Term Global Database V4...
data.nasa.gov
Updated Apr 1, 2025
nasa.gov (2025). Multi-Satellite Volcanic Sulfur Dioxide L4 Long-Term Global Database V4 (MSVOLSO2L4) at GES DISC - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/multi-satellite-volcanic-sulfur-dioxide-l4-long-term-global-database-v4-msvolso2l4-at-ges--2534f
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Version 4 is the current version of the data set. Older versions are no longer available and have been superseded by Version 4.

These data are a part of MEaSUREs 2012 projects. The particular project, "Multi-Decadal Sulfur Dioxide Climatology from Satellite Instruments", is expected to produce SO2 Earth Science Data Record by means of combining measurements from backscatter Ultraviolet (BUV), thermal infrared (IR) and microwave (MLS) instruments on multiple satellites. The data represent best estimates of the volcanic and anthropogenic contribution to global atmospheric SO2 concentrations. Since SO2 is the major precursor of sulfate aerosol, which has climate and air quality impact, SO2 measurements will contribute to better understanding of the sulfate aerosol distributions and its atmospheric impact.

The released data file is a long-term database of volcanic SO2 emission derived from ultraviolet satellite measurements from October 31, 1978, to present.

Data are in a table format in simple ASCII format:

Column Descriptions:
Column 1 = Name of volcano.
Column 2 = Latitude of volcano.
Column 3 = Longitude of volcano.
Column 4 = Altitude of volcano (km).
Column 5 = Eruption year.
Column 6 = Eruption month of year.
Column 7 = Eruption day of month.
Column 8 = Eruption style: exp = explosive, eff = effusive.
Column 9 = Eruption volcanic explosivity index (nd = no data or undetermined).
Column 10 = Observed plume altitude (km) where known.
Column 11 = Estimated plume altitude (km) above vent: 10 km for explosive, 5 km for effusive.
Column 12 = Measured SO2 mass in kilotons (= 1000 metric tons).
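
The twelve-column layout described above maps naturally onto a small record type. The following is a minimal sketch only: it assumes the columns are separated by whitespace and that volcano names contain no spaces, neither of which is stated in the description, and the names Eruption and parse_row are hypothetical. Column 10 is kept as raw text because the description does not say how an unknown plume altitude is encoded. The sample row is made up and is not real data.

```python
from typing import NamedTuple, Optional


class Eruption(NamedTuple):
    volcano: str               # column 1
    latitude: float            # column 2
    longitude: float           # column 3
    altitude_km: float         # column 4
    year: int                  # column 5
    month: int                 # column 6
    day: int                   # column 7
    style: str                 # column 8: "exp" or "eff"
    vei: Optional[int]         # column 9: None when the file says "nd"
    observed_plume_km: str     # column 10: kept as text (encoding of "unknown" not documented)
    estimated_plume_km: float  # column 11
    so2_kt: float              # column 12


def parse_row(line):
    """Parse one whitespace-separated row (assumed layout) into an Eruption."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, got {len(fields)}")
    return Eruption(
        volcano=fields[0],
        latitude=float(fields[1]),
        longitude=float(fields[2]),
        altitude_km=float(fields[3]),
        year=int(fields[4]),
        month=int(fields[5]),
        day=int(fields[6]),
        style=fields[7],
        vei=None if fields[8] == "nd" else int(fields[8]),
        observed_plume_km=fields[9],
        estimated_plume_km=float(fields[10]),
        so2_kt=float(fields[11]),
    )


if __name__ == "__main__":
    # Made-up row for illustration only; not taken from the database.
    print(parse_row("Example_Volcano 10.0 -20.0 1.5 2000 1 15 exp nd 12 10 25.0"))
```

If the real file uses a different delimiter or multi-word volcano names, the split step would need to change accordingly.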
Dominant land cover type - Global Land Cover Share Database
data.apps.fao.org
Updated Dec 26, 2021
(2021). Dominant land cover type - Global Land Cover Share Database [Dataset]. https://data.apps.fao.org/map/catalog/us/search?keyword=FAO
Dataset updated
Dec 26, 2021
Description
The Global Land Cover-SHARE (GLC-SHARE) is a new global land cover database created by the FAO Land and Water Division in partnership with, and with contributions from, various partners and institutions. It provides a set of major thematic land cover layers resulting from a combination of "best available" high-resolution national, regional and/or sub-national land cover databases with the weighted average land cover information derived from large-scale available datasets. The database is produced at a resolution of 30 arc seconds (1 km). The approach is based on the Land Cover Classification System (LCCS) and SEEA (System of Environmental-Economic Accounting) legend systems for the harmonization of the various global, regional and national land cover legends. The major benefit of the GLC-SHARE product is its capacity to preserve the existing and available high-resolution land cover information at the regional and country level obtained from spatial and multi-temporal source data, integrating it with the best synthesis of global datasets. A preliminary validation campaign was performed using 1000 random points statistically distributed over each land cover class. The database is distributed in the following eleven layers, in raster format (GeoTIFF), whose pixel values represent the percentage of density coverage in each pixel of the land cover type. The dominant layer, representing the value of the dominant land cover type, is also available along with a legend in LYR ESRI format. Finally, information on each layer's source is retrievable in the sources layer, by joining the raster values with an Excel table.
01-Artificial Surfaces
02-CropLand
03-Grassland
04-Tree Covered Area
05-Shrubs Covered Area
06-Herbaceous vegetation, aquatic or regularly flooded
07-Mangroves
08-Sparse vegetation
09-BareSoil
10-Snow and glaciers
11-Waterbodies
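
To illustrate how the eleven percentage layers relate to a dominant land cover type, here is a minimal sketch for a single pixel. It assumes that "dominant" means the layer with the highest percentage in that pixel; the description does not spell out FAO's actual rule (including how ties are handled), and the function name dominant_class is hypothetical.

```python
# Layer codes and names as listed in the description.
LAYERS = {
    1: "Artificial Surfaces",
    2: "CropLand",
    3: "Grassland",
    4: "Tree Covered Area",
    5: "Shrubs Covered Area",
    6: "Herbaceous vegetation, aquatic or regularly flooded",
    7: "Mangroves",
    8: "Sparse vegetation",
    9: "BareSoil",
    10: "Snow and glaciers",
    11: "Waterbodies",
}


def dominant_class(percentages):
    """Return (code, name) of the layer with the highest percentage.

    `percentages` maps layer code -> percent cover for one pixel.
    Assumption: the dominant type is the arg-max; on a tie, the first
    layer encountered in the mapping wins.
    """
    code = max(percentages, key=percentages.get)
    return code, LAYERS[code]


if __name__ == "__main__":
    # Made-up pixel values for illustration only.
    print(dominant_class({2: 60.0, 3: 30.0, 11: 10.0}))
```

For whole rasters, the same arg-max would be applied across the eleven GeoTIFF bands with an array library rather than per-pixel dictionaries.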
Snow and glaciers - GLC-SHARE
data.amerigeoss.org
pdf, png, wms, zip
Updated Jun 11, 2024
Food and Agriculture Organization (2024). Snow and glaciers - GLC-SHARE [Dataset]. https://data.amerigeoss.org/dataset/903a5860-821f-40d8-b029-ead767f882c4
Explore at:
Available download formats: zip, pdf, png, wms
Dataset updated
Jun 11, 2024
Dataset provided by
Food and Agriculture Organization (http://fao.org/)
License
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0), https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Description
This dataset is a GeoTIFF raster whose pixel values represent the percentage density of snow and glaciers in each pixel. It is part of the global Land Cover-SHARE (GLC-SHARE) database created by the FAO Land and Water Division in partnership with, and with contributions from, various partners and institutions.

The snow and glaciers dataset includes any geographic area covered by snow or glaciers persistently for 10 months or more.

Supplemental Information:

GLC-SHARE provides a set of major thematic land cover layers produced by combining the "best available" high-resolution national, regional and/or sub-national land cover databases with the weighted average land cover information derived from large-scale available datasets. The database is produced at a resolution of 30 arc seconds (about 1 km). The approach uses the Land Cover Classification System (LCCS) and SEEA (System of Environmental-Economic Accounting) legend systems to harmonize the various global, regional and national land cover legends. The major benefit of the GLC-SHARE product is its capacity to preserve the existing high-resolution land cover information at the regional and country level, obtained from spatial and multi-temporal source data, while integrating it with the best synthesis of global datasets.
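The "30 arc second (1km)" figure can be checked with a short calculation. The sketch below is only an illustration, not part of the GLC-SHARE documentation: it assumes a spherical Earth with a mean radius of 6371 km, and the function name `cell_size_km` is hypothetical.

```python
import math

# Assumption for this sketch: spherical Earth with the mean radius below.
EARTH_RADIUS_KM = 6371.0


def cell_size_km(arc_seconds, latitude_deg=0.0):
    """Return the (east-west, north-south) size in km of one grid cell.

    The north-south size depends only on the angular resolution; the
    east-west size shrinks with the cosine of the latitude.
    """
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


ew, ns = cell_size_km(30.0)
print(f"Equator: {ew:.3f} km x {ns:.3f} km")

ew60, ns60 = cell_size_km(30.0, latitude_deg=60.0)
print(f"60 degrees latitude: {ew60:.3f} km x {ns60:.3f} km")
```

Under these assumptions a cell is a little under 1 km on each side at the equator and narrower east-west at higher latitudes, so "1 km" is a nominal figure.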

A preliminary validation campaign was performed using 1000 random points statistically distributed over each land cover class. The database is distributed in the following eleven layers, in raster format (GeoTIFF), whose pixel values represent the percentage of density coverage of the land cover type in each pixel. The dominant layer, representing the value of the dominant land cover type, is also available along with a legend in ESRI LYR format. Finally, information on each layer's source is retrievable in the sources layer, by joining the raster values with an Excel table.

01-Artificial Surfaces
02-CropLand
03-Grassland
04-Tree Covered Area
05-Shrubs Covered Area
06-Herbaceous vegetation, aquatic or regularly flooded
07-Mangroves
08-Sparse vegetation
09-BareSoil
10-Snow and glaciers
11-Waterbodies
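One way to read the relationship between the eleven percentage layers and the dominant layer is sketched below for a single pixel. This is an assumption, not FAO's documented procedure: it treats the dominant type as the class with the highest percentage and breaks ties in favor of the lowest layer code. The names `LAYER_NAMES` and `dominant_class` are hypothetical, and the pixel values are made up for illustration.

```python
# Layer codes and names as listed in the GLC-SHARE description.
LAYER_NAMES = {
    1: "Artificial Surfaces",
    2: "CropLand",
    3: "Grassland",
    4: "Tree Covered Area",
    5: "Shrubs Covered Area",
    6: "Herbaceous vegetation, aquatic or regularly flooded",
    7: "Mangroves",
    8: "Sparse vegetation",
    9: "BareSoil",
    10: "Snow and glaciers",
    11: "Waterbodies",
}


def dominant_class(percentages):
    """Return the layer code with the highest percentage cover in one pixel.

    `percentages` maps layer code -> percent cover. Ties go to the lowest
    code (an assumption made for this sketch only).
    """
    if not percentages:
        raise ValueError("no layer values for this pixel")
    return min(percentages, key=lambda code: (-percentages[code], code))


# Illustrative values only, not taken from the dataset.
pixel = {4: 55.0, 3: 30.0, 11: 15.0}
code = dominant_class(pixel)
print(code, LAYER_NAMES[code])
```

The same rule applied to every pixel of the eleven rasters would yield a single-band raster of class codes, which is how the dominant layer is interpreted here.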

Contact points:

Metadata Contact: FAO GIS Manager

Resource Contact: Land and Water Officer FAO-NRL

Data lineage:

The land cover database is validated only using the high resolution remote sensing imagery present in Google Earth.

Resource constraints:

Reproduction and dissemination of material contained in GLC-SHARE Beta-Release v1.0 for educational, research, personal or other noncommercial purposes are authorized without any prior written permission from the copyright holders, provided FAO is fully acknowledged. No part of GLC-SHARE Beta-Release v1.0 data may be downloaded, stored in a retrieval system or transmitted by any means for resale or other commercial purposes without written permission of the copyright holders. If any information or resources on this site are attributed to a site or source external to FAO, permission to use must be sought with FAO.

The designations employed and the presentation of material in this information product do not imply the expression of any opinion whatsoever on the part of the Food and Agriculture Organization of the United Nations (FAO) concerning the legal or development status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. FAO declines all responsibility for errors or deficiencies in the database or software or in the documentation accompanying it, for program maintenance and upgrading as well as for any damage that may arise from them. FAO also declines any responsibility for updating the data and assumes no responsibility for errors and omissions in the data provided. Users are, however, kindly asked to report any errors or deficiencies in this product to FAO.

Online resources:

Download: GLC-Share - Snow and glaciers

Download: GLC-Share - Sources

Download: GLC-Share report
CompanyData.com (BoldData) — Vietnam Largest B2B Company Database — 1.83+...
datarade.ai
Updated Apr 21, 2021
CompanyData.com (BoldData) (2021). CompanyData.com (BoldData) — Vietnam Largest B2B Company Database — 1.83+ Million Verified Companies [Dataset]. https://datarade.ai/data-products/list-of-1m-companies-in-vietnam-bolddata
Explore at:
Available download formats: .json, .csv, .xls, .txt
Dataset updated
Apr 21, 2021
Dataset authored and provided by
CompanyData.com (BoldData)
Area covered
Vietnam
Description
CompanyData.com, powered by BoldData, provides verified company information sourced directly from official trade registers. Our Vietnam database features 1,828,945 company records, offering a reliable and up-to-date foundation for your business needs.

Each Vietnamese company profile includes detailed firmographic data such as company name, registration number, legal form, industry classification, revenue, and employee count. Many records also contain contact details like emails and mobile numbers of decision-makers, helping you connect directly with the right businesses.

Our Vietnam data is trusted for a wide range of applications including compliance, KYC verification, lead generation, market research, sales and marketing campaigns, CRM enrichment, and AI training. Every record is curated for accuracy and relevance, ensuring your strategies are built on reliable information.

Choose the delivery method that suits your business best. We offer tailored company lists, complete national databases, real-time API access, and ready-to-use Excel or CSV files. Our enrichment services further enhance your existing data with fresh, verified information.

With access to more than 380 million verified companies worldwide, CompanyData.com helps businesses grow locally in Vietnam and scale globally with confidence. Let us power your data-driven decisions with precision, quality, and reach.

Statista (2025). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/

Most popular database management systems worldwide 2024

Explore at:

42 scholarly articles cite this dataset (View in Google Scholar)

