28 datasets found

Amount of data created, consumed, and stored 2010-2023, with forecasts to...
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Amount of data created, consumed, and stored 2010-2023, with forecasts to 2028 [Dataset]. https://www.statista.com/statistics/871513/worldwide-data-created/
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
May 2024
Area covered
Worldwide
Description
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.

Storage capacity also growing

Only a small percentage of this newly created data is kept, though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
The global AI Training Dataset Market size will be USD 2962.4 million in...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jun 14, 2025
Cite
Cognitive Market Research (2025). The global AI Training Dataset Market size will be USD 2962.4 million in 2025. [Dataset]. https://www.cognitivemarketresearch.com/ai-training-dataset-market-report
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jun 14, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global AI Training Dataset Market size will be USD 2962.4 million in 2025. It will expand at a compound annual growth rate (CAGR) of 28.60% from 2025 to 2033.
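
As a rough illustration of what the quoted growth rate implies, the sketch below compounds the 2025 base value forward to 2033. The function name `project_value` and the end-of-year compounding convention are assumptions made for this example; the resulting figure is plain arithmetic on the two numbers quoted above, not a value published in the report.

```python
def project_value(base: float, cagr: float, years: int) -> float:
    """Compound `base` forward by `years` annual periods at rate `cagr`."""
    return base * (1.0 + cagr) ** years


base_2025_musd = 2962.4  # USD million in 2025, as quoted above
cagr = 0.2860            # 28.60% per year, as quoted above
years = 2033 - 2025      # 8 compounding periods (assumed convention)

projected_2033_musd = project_value(base_2025_musd, cagr, years)
print(f"Implied 2033 market size: about USD {projected_2033_musd:,.0f} million")
```

The same function can be applied to each regional base value and regional CAGR to compare how the shares would shift by 2033 under those stated rates.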

North America held the largest share, more than 37% of global revenue, with a market size of USD 1096.09 million in 2025, and will grow at a compound annual growth rate (CAGR) of 26.4% from 2025 to 2033.

Europe accounted for over 29% of global revenue, with a market size of USD 859.10 million.

APAC held around 24% of global revenue, with a market size of USD 710.98 million in 2025, and will grow at a CAGR of 30.6% from 2025 to 2033.

South America held more than 3.8% of global revenue, with a market size of USD 112.57 million in 2025, and will grow at a CAGR of 27.6% from 2025 to 2033.

The Middle East held around 4% of global revenue, with an estimated market size of USD 118.50 million in 2025, and will grow at a CAGR of 27.9% from 2025 to 2033.

Africa held around 2.20% of global revenue, with an estimated market size of USD 65.17 million in 2025, and will grow at a CAGR of 28.3% from 2025 to 2033.

The Data Annotation category is the fastest-growing segment of the AI Training Dataset Market.

Market Dynamics of AI Training Dataset Market

Key Drivers for AI Training Dataset Market

Government-Led Open Data Initiatives Fueling AI Training Dataset Market Growth

In recent years, government-initiated open data efforts have strongly driven the development of the AI Training Dataset Market by offering affordable, high-quality datasets that are vital for training sound AI models. For instance, the U.S. government's drive for openness and innovation can be seen in portals such as Data.gov, which provides an enormous collection of datasets from many industries, including healthcare, finance, and transportation. Such datasets are basic building blocks for constructing AI applications and training models on real-world data. Similarly, data.gov.uk, run by the U.K. government, offers ample datasets to aid AI research and development, creating an environment that supports technological growth. By releasing such information into the public domain, governments not only enhance transparency but also encourage innovation in the AI industry, resulting in greater demand for training datasets and helping to drive the market's growth.

India's IndiaAI Datasets Platform Accelerates AI Training Dataset Market Growth

India's upcoming launch of the IndiaAI Datasets Platform in January 2025 is likely to substantially expand the AI Training Dataset Market. The project, which is part of the government's ₹10,000 crore IndiaAI Mission, will establish an open-source repository similar to platforms such as HuggingFace to enable developers to create, train, and deploy AI models. The platform will collect datasets from central and state governments and private sector organizations to provide a wide and rich data pool. By improving access to high-quality, non-personal data, the platform addresses an important need for datasets suitable for training AI models, thus driving innovation and development in the AI industry. This public initiative reflects India's determination to become a global AI hub, offering the infrastructure required to help startups, researchers, and businesses create cutting-edge AI solutions. The initiative not only simplifies data access but also creates a model for public-private partnerships in AI development.

Restraint Factor for the AI Training Dataset Market

Data Privacy Regulations Impeding AI Training Dataset Market Growth

Strict data privacy laws are emerging as a major constraint on the AI Training Dataset Market, as governments across the globe establish legislation to safeguard personal data. In the European Union, explicit consent for using personal data is required under the General Data Protection Regulation (GDPR), reducing the availability of datasets for training AI. Likewise, the data protection regulator in Brazil ordered Meta and others to stop the use of Brazilian personal data in training AI models due to dangers to individuals' funda...
gdelt-event-2025-v4
huggingface.co
Cite
Don Branson, gdelt-event-2025-v4 [Dataset]. https://huggingface.co/datasets/dwb2023/gdelt-event-2025-v4
Authors
Don Branson
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for dwb2023/gdelt-event-2025-v4

This dataset contains global event records from the GDELT (Global Database of Events, Language, and Tone) Project for May 1 - 11, capturing real-world events and their characteristics across the globe through news media coverage.

Dataset Details

Dataset Description

The GDELT Event Database is a comprehensive repository of human societal-scale behavior and beliefs across all countries of the world, connecting every… See the full description on the dataset page: https://huggingface.co/datasets/dwb2023/gdelt-event-2025-v4.
Earth, TX Age Group Population Dataset: A Complete Breakdown of Earth Age...
neilsberg.com
csv, json
Updated Feb 22, 2025
Cite
Neilsberg Research (2025). Earth, TX Age Group Population Dataset: A Complete Breakdown of Earth Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/451f6711-f122-11ef-8c1b-3860777c1fe6/
Available download formats: json, csv
Dataset updated
Feb 22, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Texas, Earth
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. Ages between 0 and 85 were divided into buckets of roughly 5 years each; all ages over 85 were aggregated into a single group. For further information regarding these estimates, please reach out to us via email at research@neilsberg.com.
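
The bucketing described above can be sketched as follows. This is only an illustration under stated assumptions: the ACS publishes tabulated estimates rather than individual ages, and the function names `age_group` and `share_of_total`, as well as the sample counts, are hypothetical and not part of the Neilsberg dataset.

```python
def age_group(age: int) -> str:
    """Map an age in whole years to one of the 18 age-group labels."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 5:
        return "Under 5 years"
    if age >= 85:
        return "85 years and over"
    lower = (age // 5) * 5  # start of the 5-year bucket containing `age`
    return f"{lower} to {lower + 4} years"


def share_of_total(counts: dict[str, int]) -> dict[str, float]:
    """Express each group's population as a percentage of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total population is zero")
    return {group: round(100.0 * n / total, 2) for group, n in counts.items()}


# Hypothetical counts for three groups, for illustration only.
sample_counts = {"Under 5 years": 30, "5 to 9 years": 50, "85 years and over": 20}
print(age_group(12))                  # label for a 12-year-old
print(share_of_total(sample_counts))  # percentages of the sample total
```

Seventeen 5-year buckets from 0 to 84 plus the single open-ended group give the 18 age groups the dataset reports.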
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Earth population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Earth. The dataset can be used to understand the population distribution of Earth by age. For example, using this dataset, we can identify the largest age group in Earth.

Key observations

The largest age group in Earth, TX was the 10 to 14 years group, with a population of 102 (10.89%), according to the ACS 2019-2023 5-Year Estimates. The smallest age group in Earth, TX was the 85 years and over group, with a population of 4 (0.43%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group under consideration.

Population: This column shows the population of the specific age group in Earth.

% of Total Population: This column displays the population of each age group as a percentage of Earth's total population. Please note that the percentages may not sum to exactly 100% due to rounding.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Earth Population by Age.
AI Training Dataset Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). AI Training Dataset Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-ai-training-dataset-market
Available download formats: csv, pptx, pdf
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
AI Training Dataset Market Outlook

The global AI training dataset market size was valued at approximately USD 1.2 billion in 2023 and is projected to reach USD 6.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 20.5% from 2024 to 2032. This substantial growth is driven by the increasing adoption of artificial intelligence across various industries, the necessity for large-scale and high-quality datasets to train AI models, and the ongoing advancements in AI and machine learning technologies.

One of the primary growth factors in the AI training dataset market is the exponential increase in data generation across multiple sectors. With the proliferation of internet usage, the expansion of IoT devices, and the digitalization of industries, there is an unprecedented volume of data being generated daily. This data is invaluable for training AI models, enabling them to learn and make more accurate predictions and decisions. Moreover, the need for diverse and comprehensive datasets to improve AI accuracy and reliability is further propelling market growth.

Another significant factor driving the market is the rising investment in AI and machine learning by both public and private sectors. Governments around the world are recognizing the potential of AI to transform economies and improve public services, leading to increased funding for AI research and development. Simultaneously, private enterprises are investing heavily in AI technologies to gain a competitive edge, enhance operational efficiency, and innovate new products and services. These investments necessitate high-quality training datasets, thereby boosting the market.

The proliferation of AI applications in various industries, such as healthcare, automotive, retail, and finance, is also a major contributor to the growth of the AI training dataset market. In healthcare, AI is being used for predictive analytics, personalized medicine, and diagnostic automation, all of which require extensive datasets for training. The automotive industry leverages AI for autonomous driving and vehicle safety systems, while the retail sector uses AI for personalized shopping experiences and inventory management. In finance, AI assists in fraud detection and risk management. The diverse applications across these sectors underline the critical need for robust AI training datasets.

As the demand for AI applications continues to grow, the role of Ai Data Resource Service becomes increasingly vital. These services provide the necessary infrastructure and tools to manage, curate, and distribute datasets efficiently. By leveraging Ai Data Resource Service, organizations can ensure that their AI models are trained on high-quality and relevant data, which is crucial for achieving accurate and reliable outcomes. The service acts as a bridge between raw data and AI applications, streamlining the process of data acquisition, annotation, and validation. This not only enhances the performance of AI systems but also accelerates the development cycle, enabling faster deployment of AI-driven solutions across various sectors.

Regionally, North America currently dominates the AI training dataset market due to the presence of major technology companies and extensive R&D activities in the region. However, Asia Pacific is expected to witness the highest growth rate during the forecast period, driven by rapid technological advancements, increasing investments in AI, and the growing adoption of AI technologies across various industries in countries like China, India, and Japan. Europe and Latin America are also anticipated to experience significant growth, supported by favorable government policies and the increasing use of AI in various sectors.

Data Type Analysis

The data type segment of the AI training dataset market encompasses text, image, audio, video, and others. Each data type plays a crucial role in training different types of AI models, and the demand for specific data types varies based on the application. Text data is extensively used in natural language processing (NLP) applications such as chatbots, sentiment analysis, and language translation. As the use of NLP is becoming more widespread, the demand for high-quality text datasets is continually rising. Companies are investing in curated text datasets that encompass diverse languages and dialects to improve the accuracy and efficiency of NLP models.

Image data is critical for computer vision application
Global 15 x 15 Minute Grids of the Downscaled GDP Based on the SRES B2...
catalog.data.gov
s.cnmilf.com
Updated Aug 23, 2025
Cite
SEDAC (2025). Global 15 x 15 Minute Grids of the Downscaled GDP Based on the SRES B2 Scenario, 1990 and 2025 [Dataset]. https://catalog.data.gov/dataset/global-15-x-15-minute-grids-of-the-downscaled-gdp-based-on-the-sres-b2-scenario-1990-and-2
Dataset updated
Aug 23, 2025
Dataset provided by
SEDAC
Description
The Global 15x15 Minute Grids of the Downscaled GDP Based on the Special Report on Emissions Scenarios (SRES) B2 Scenario, 1990 and 2025, are geospatial distributions of Gross Domestic Product (GDP) per Unit area (GDP densities). These global grids were generated using the Country-level GDP and Downscaled Projections Based on the SRES B2 Scenario, 1990-2100 data set, and CIESIN's Gridded Population of World, Version 2 (GPWv2) data set as the base map. First, the GDP per capita was developed at a country-level for 1990 and 2025. Then the gridded GDP was developed within each country by applying the GDP per capita to each grid cell of the GPW, under the assumption that the GDP per capita was uniform within a country. This data set is produced and distributed by the Columbia University Center for International Earth Science Information Network (CIESIN).
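
The downscaling step described above (applying a uniform national GDP per capita to each population grid cell) can be sketched in a few lines. The grid values, the per-capita figure, and the function name `grid_gdp` are hypothetical; the actual SEDAC processing, including the conversion to GDP per unit area, is not reproduced here.

```python
def grid_gdp(population_grid, gdp_per_capita):
    """Cell GDP = cell population x national GDP per capita (assumed uniform)."""
    return [[cell_pop * gdp_per_capita for cell_pop in row] for row in population_grid]


# Hypothetical 2 x 3 population grid for a single country (people per cell).
population = [
    [1000, 0, 250],
    [4000, 500, 50],
]
gdp_pc = 12_000.0  # hypothetical GDP per capita, USD per person

gdp = grid_gdp(population, gdp_pc)
national_total = sum(sum(row) for row in gdp)

for row in gdp:
    print(row)
print("National GDP recovered from the grid:", national_total)
```

Because the per-capita value is uniform within the country, summing the gridded GDP recovers the national total, and the spatial pattern of GDP simply mirrors the population grid.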
GDP by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Jun 29, 2011
Cite
TRADING ECONOMICS (2011). GDP by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gdp
Available download formats: csv, json, xml, excel
Dataset updated
Jun 29, 2011
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Top 100 SaaS Companies/Startups 2025
kaggle.com
Updated May 29, 2025
Cite
Shreyas Dasari (2025). Top 100 SaaS Companies/Startups 2025 [Dataset]. https://www.kaggle.com/datasets/shreyasdasari7/top-100-saas-companiesstartups
Available formats: Croissant, a format for machine-learning datasets (learn more at mlcommons.org/croissant)
Dataset updated
May 29, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shreyas Dasari
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides comprehensive, up-to-date information about the top 100 Software-as-a-Service (SaaS) companies globally as of 2025. It includes detailed financial metrics, company fundamentals, and operational data that are crucial for market research, competitive analysis, investment decisions, and academic studies.

Key Features

100 leading SaaS companies across various industries

11 comprehensive data points per company

Current 2025 data including latest valuations and ARR figures

Verified information from multiple reliable sources

Clean, analysis-ready format with consistent data structure

Use Cases

Market Research: Analyze SaaS industry trends and market dynamics

Investment Analysis: Evaluate growth patterns and valuation multiples

Competitive Intelligence: Benchmark companies within sectors

Academic Research: Study business models and growth strategies

Data Science Projects: Build predictive models for SaaS metrics

Business Strategy: Identify successful patterns in SaaS businesses

Industries Covered

Enterprise Software (CRM, ERP, HR)

Developer Tools & DevOps

Cybersecurity

Data Analytics & Business Intelligence

Marketing & Sales Technology

Financial Technology

Communication & Collaboration

E-commerce Platforms

Design & Creative Tools

Infrastructure & Cloud Services

Why This Dataset?

The SaaS industry has grown to over $300 billion globally, with companies achieving unprecedented valuations and growth rates. This dataset captures the current state of the industry leaders, providing insights into what makes successful SaaS companies tick.

Sources/Proof of Data: Data Sources

The data has been meticulously compiled from multiple authoritative sources:

Company Financial Reports (Q4 2024 - Q1 2025)

Official earnings releases and investor relations documents

SEC filings for public companies

Investment Databases

Crunchbase, PitchBook, and CB Insights for funding data

Venture capital and private equity announcements

Market Research Reports

Gartner, Forrester, and IDC industry analyses

SaaS Capital Index and valuation reports

Industry Publications

TechCrunch, Forbes, Wall Street Journal coverage

Company press releases and official announcements

Product Review Platforms

G2 Crowd ratings and reviews

Capterra and GetApp user feedback

Data Verification

Cross-referenced across multiple sources for accuracy

Updated with latest available information as of May 2025

Validated against official company statements where available
Amazon Prime Global Availability Data 2025
redstagfulfillment.com
html
Updated Jul 11, 2025
Cite
Red Stag Fulfillment (2025). Amazon Prime Global Availability Data 2025 [Dataset]. https://redstagfulfillment.com/how-many-countries-offer-amazon-prime/
Available download formats: html
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Red Stag Fulfillment
Time period covered
2005 - 2025
Area covered
Global - 27 countries across 5 continents
Variables measured
Launch dates, Monthly pricing, Regional benefits, Market penetration, Country availability
Description
Comprehensive dataset covering Amazon Prime availability across 27 countries, including launch dates, pricing, and regional benefit differences

Data from: WikiReddit: Tracing Information and Attention Flows Between...

zenodo.org

bin

Updated May 4, 2025

Cite

Patrick Gildersleve; Anna Beers; Viviane Ito; Agustin Orozco; Francesca Tripodi (2025). WikiReddit: Tracing Information and Attention Flows Between Online Platforms [Dataset]. http://doi.org/10.5281/zenodo.14653265

Available download formats: bin

Unique identifier

https://doi.org/10.5281/zenodo.14653265

Dataset updated

May 4, 2025

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Patrick Gildersleve; Anna Beers; Viviane Ito; Agustin Orozco; Francesca Tripodi

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Time period covered

Jan 15, 2025

Description

Preprint

Gildersleve, P., Beers, A., Ito, V., Orozco, A., & Tripodi, F. (2025). WikiReddit: Tracing Information and Attention Flows Between Online Platforms. arXiv [Cs.CY]. https://doi.org/10.48550/arXiv.2502.04942

Accepted at the International AAAI Conference on Web and Social Media (ICWSM) 2025

Abstract

The World Wide Web is a complex interconnected digital ecosystem, where information and attention flow between platforms and communities throughout the globe. These interactions co-construct how we understand the world, reflecting and shaping public discourse. Unfortunately, researchers often struggle to understand how information circulates and evolves across the web because platform-specific data is often siloed and restricted by linguistic barriers. To address this gap, we present a comprehensive, multilingual dataset capturing all Wikipedia links shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW subreddits. Each linked Wikipedia article is enriched with revision history, page view data, article ID, redirects, and Wikidata identifiers. Through a research agreement with Reddit, our dataset ensures user privacy while providing a query and ID mechanism that integrates with the Reddit and Wikipedia APIs. This enables extended analyses for researchers studying how information flows across platforms. For example, Reddit discussions use Wikipedia for deliberation and fact-checking which subsequently influences Wikipedia content, by driving traffic to articles or inspiring edits. By analyzing the relationship between information shared and discussed on these platforms, our dataset provides a foundation for examining the interplay between social media discourse and collaborative knowledge consumption and production.

Datasheet

Motivation

The motivations for this dataset stem from the challenges researchers face in studying the flow of information across the web. While the World Wide Web enables global communication and collaboration, data silos, linguistic barriers, and platform-specific restrictions hinder our ability to understand how information circulates, evolves, and impacts public discourse. Wikipedia and Reddit, as major hubs of knowledge sharing and discussion, offer an invaluable lens into these processes. However, without comprehensive data capturing their interactions, researchers are unable to fully examine how platforms co-construct knowledge. This dataset bridges this gap, providing the tools needed to study the interconnectedness of social media and collaborative knowledge systems.

Composition

WikiReddit is a comprehensive dataset capturing all Wikipedia mentions (including links) shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW (not safe for work) subreddits. The SQL database comprises 336K total posts, 10.2M comments, 1.95M unique links, and 1.26M unique articles spanning 59 languages on Reddit and 276 Wikipedia language subdomains. Each linked Wikipedia article is enriched with its revision history and page view data within a ±10-day window of its posting, as well as article ID, redirects, and Wikidata identifiers. Supplementary anonymous metadata from Reddit posts and comments further contextualizes the links, offering a robust resource for analysing cross-platform information flows, collective attention dynamics, and the role of Wikipedia in online discourse.

Collection Process

Data was collected from the Reddit4Researchers and Wikipedia APIs. No personally identifiable information is published in the dataset. Data from Reddit to Wikipedia is linked via the hyperlink and article titles appearing in Reddit posts.

Preprocessing/cleaning/labeling

Extensive processing with tools such as regex was applied to the Reddit post/comment text to extract the Wikipedia URLs. Redirects for Wikipedia URLs and article titles were found through the API and mapped to the collected data. Reddit IDs are hashed with SHA-256 for post/comment/user/subreddit anonymity.
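
A minimal sketch of the two steps named above, URL extraction with a regular expression and SHA-256 hashing of identifiers, might look like the following. The pattern, the function names, and the sample text are assumptions made for illustration; the authors' actual expressions are not given in this description, and it does not say whether a salt was added before hashing.

```python
import hashlib
import re

# Assumed pattern: language subdomain, optional mobile "m." prefix, then /wiki/<title>.
# It stops at whitespace and closing brackets, so titles containing ")" are cut short.
WIKI_URL = re.compile(
    r"https?://([a-z\-]+)\.(?:m\.)?wikipedia\.org/wiki/([^\s)\]>]+)",
    re.IGNORECASE,
)


def extract_wikipedia_links(text):
    """Return (language_subdomain, article_title) pairs found in `text`."""
    return [(m.group(1).lower(), m.group(2)) for m in WIKI_URL.finditer(text)]


def anonymize_id(raw_id):
    """Hash an identifier with SHA-256 (unsalted here, as an assumption)."""
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()


# Hypothetical comment text and post ID, for illustration only.
sample = "See https://en.wikipedia.org/wiki/Reddit and (https://de.m.wikipedia.org/wiki/Berlin) too"
print(extract_wikipedia_links(sample))
print(anonymize_id("t3_example"))
```

Hashing keeps identifiers consistent across tables (the same input always yields the same digest), which is what allows joins on anonymized IDs.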

Uses

We foresee several applications of this dataset and preview four here. First, Reddit linking data can be used to understand how attention is driven from one platform to another. Second, Reddit linking data can shed light on how Wikipedia's archive of knowledge is used in the larger social web. Third, our dataset could provide insights into how external attention is topically distributed across Wikipedia. Our dataset can help extend that analysis into the disparities in what types of external communities Wikipedia is used in, and how it is used. Fourth, relatedly, a topic analysis of our dataset could reveal how Wikipedia usage on Reddit contributes to societal benefits and harms. Our dataset could help examine if homogeneity within the Reddit and Wikipedia audiences shapes topic patterns and assess whether these relationships mitigate or amplify problematic engagement online.

Distribution

The dataset is publicly shared with a Creative Commons Attribution 4.0 International license. The article describing this dataset should be cited: https://doi.org/10.48550/arXiv.2502.04942

Maintenance

Patrick Gildersleve will maintain this dataset, and add further years of content as and when available.

SQL Database Schema

Table: `posts`

Column Name	Type	Description
`subreddit_id`	TEXT	The unique identifier for the subreddit.
`crosspost_parent_id`	TEXT	The ID of the original Reddit post if this post is a crosspost.
`post_id`	TEXT	Unique identifier for the Reddit post.
`created_at`	TIMESTAMP	The timestamp when the post was created.
`updated_at`	TIMESTAMP	The timestamp when the post was last updated.
`language_code`	TEXT	The language code of the post.
`score`	INTEGER	The score (upvotes minus downvotes) of the post.
`upvote_ratio`	REAL	The ratio of upvotes to total votes.
`gildings`	INTEGER	Number of awards (gildings) received by the post.
`num_comments`	INTEGER	Number of comments on the post.

Table: `comments`

Column Name	Type	Description
`subreddit_id`	TEXT	The unique identifier for the subreddit.
`post_id`	TEXT	The ID of the Reddit post the comment belongs to.
`parent_id`	TEXT	The ID of the parent comment (if a reply).
`comment_id`	TEXT	Unique identifier for the comment.
`created_at`	TIMESTAMP	The timestamp when the comment was created.
`last_modified_at`	TIMESTAMP	The timestamp when the comment was last modified.
`score`	INTEGER	The score (upvotes minus downvotes) of the comment.
`upvote_ratio`	REAL	The ratio of upvotes to total votes for the comment.
`gilded`	INTEGER	Number of awards (gildings) received by the comment.

Table: `postlinks`

Column Name	Type	Description
`post_id`	TEXT	Unique identifier for the Reddit post.
`end_processed_valid`	INTEGER	Whether the extracted URL from the post resolves to a valid URL.
`end_processed_url`	TEXT	The extracted URL from the Reddit post.
`final_valid`	INTEGER	Whether the final URL from the post resolves to a valid URL after redirections.
`final_status`	INTEGER	HTTP status code of the final URL.
`final_url`	TEXT	The final URL after redirections.
`redirected`	INTEGER	Indicator of whether the posted URL was redirected (1) or not (0).
`in_title`	INTEGER	Indicator of whether the link appears in the post title (1) or post body (0).
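
As a hedged example of how the `posts` and `postlinks` tables could be queried together, the sketch below builds a tiny in-memory SQLite database with a subset of the documented columns and a few made-up rows. The choice of SQLite, the sample rows, and the primary-key constraint are assumptions; the published database file may use a different engine or layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (
        post_id TEXT PRIMARY KEY,  -- assumed key; documented as a unique identifier
        subreddit_id TEXT,
        language_code TEXT,
        score INTEGER
    );
    CREATE TABLE postlinks (
        post_id TEXT,
        final_url TEXT,
        final_valid INTEGER,
        in_title INTEGER
    );
    """
)

# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [("p1", "s1", "en", 120), ("p2", "s1", "en", 3), ("p3", "s2", "de", 45)],
)
conn.executemany(
    "INSERT INTO postlinks VALUES (?, ?, ?, ?)",
    [
        ("p1", "https://en.wikipedia.org/wiki/Reddit", 1, 0),
        ("p2", "https://en.wikipedia.org/wiki/Wikipedia", 1, 1),
        ("p3", "https://de.wikipedia.org/wiki/Berlin", 0, 0),
    ],
)

# Count valid links and average post score per post language.
rows = conn.execute(
    """
    SELECT p.language_code, COUNT(*) AS n_links, AVG(p.score) AS mean_score
    FROM posts AS p
    JOIN postlinks AS l ON l.post_id = p.post_id
    WHERE l.final_valid = 1
    GROUP BY p.language_code
    ORDER BY n_links DESC
    """
).fetchall()
print(rows)
```

Filtering on `final_valid` before aggregating keeps links that failed to resolve after redirection out of the counts.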

Table: `commentlinks`

Column Name	Type	Description
`comment_id`	TEXT	Unique identifier for the Reddit comment.
`end_processed_valid`	INTEGER	Whether the extracted URL from the comment resolves to a valid URL.
`end_processed_url`	TEXT	The extracted URL from the comment.
`final_valid`	INTEGER	Whether the final URL from the comment resolves to a valid URL after redirections.
`final_status`	INTEGER	HTTP status code of the final

A Global Self-consistent, Hierarchical, High-resolution Geography Database...
catalog.data.gov
gimi9.com
Updated Aug 1, 2025
Cite
(Point of Contact) (2025). A Global Self-consistent, Hierarchical, High-resolution Geography Database from 2010-02-19 to 2017-06-15 (NCEI Accession 0304143) [Dataset]. https://catalog.data.gov/dataset/a-global-self-consistent-hierarchical-high-resolution-geography-database-from-2010-02-19-to-201
Dataset updated
Aug 1, 2025
Dataset provided by
(Point of Contact)
Description
A Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG) is a high-resolution geography data set amalgamated from three public-domain databases: World Vector Shorelines (WVS), CIA World Data Bank II (WDBII), and Atlas of the Cryosphere (AC). The WVS is the basis for shorelines except for Antarctica, while the WDBII is the basis for lakes, although there are instances where differences in coastline representations necessitated adding WDBII islands to GSHHG. The WDBII source also provides all political borders and rivers. The addition of AC since version 2.3.0 offers two choices for Antarctica coastlines: ice-front or grounding line. These are encoded as levels 5 and 6, respectively, and users of GSHHG can choose which set to use. GSHHG data have undergone extensive processing and should be free of internal inconsistencies such as erratic points and crossing segments. The shorelines are constructed entirely from hierarchically arranged closed polygons. A modified version of GSHHG is used by GMT, the Generic Mapping Tools. Starting with version 2.2.2, GSHHG has been released under the GNU Lesser General Public License. NCEI decommissioned the Global Self-consistent, Hierarchical, High-resolution Geography Database in May 2025, and there will be no further updates. Comments and questions may be sent to: ncei.info@noaa.gov.
Global AIS-based Apparent Fishing Effort Dataset
zenodo.org
csv, json, txt, zip
Updated Mar 11, 2025
Cite
Global Fishing Watch (2025). Global AIS-based Apparent Fishing Effort Dataset [Dataset]. http://doi.org/10.5281/zenodo.14982712
Available download formats: zip, json, txt, csv
Unique identifier
https://doi.org/10.5281/zenodo.14982712
Dataset updated
Mar 11, 2025
Dataset provided by
Global Fishing Watch
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Overview

This dataset contains version 3.0 (March 2025 release) of the Global Fishing Watch apparent fishing effort dataset. Data are available for 2012-2024 and are based on the positions of more than 190,000 unique automatic identification system (AIS) devices on fishing vessels, of which up to ~96,000 are active in a given year. Fishing vessels are identified via a machine learning model, vessel registry databases, and manual review by GFW and regional experts. Vessel time is measured in hours, calculated by assigning to each AIS position the amount of time elapsed since the previous AIS position of the vessel. The time is counted as apparent fishing hours if the GFW fishing detection model (a neural network machine learning model) determines that the vessel is engaged in fishing behavior during that AIS position.
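
The time-assignment rule described above can be sketched roughly as follows. This is an illustrative reading of the description, not GFW's implementation: the function name and input layout are hypothetical, the fishing flag stands in for the output of the detection model, and the first position of a track is assumed to receive no time because it has no predecessor.

```python
from datetime import datetime

def apparent_fishing_hours(positions):
    """Sum vessel hours and apparent fishing hours for one vessel.

    positions: list of (timestamp, is_fishing) tuples sorted by time.
    Each position is credited with the time elapsed since the previous one.
    Returns (total_hours, fishing_hours).
    """
    total = 0.0
    fishing = 0.0
    previous = None
    for timestamp, is_fishing in positions:
        if previous is not None:
            hours = (timestamp - previous).total_seconds() / 3600.0
            total += hours
            if is_fishing:  # hypothetical stand-in for the model's decision
                fishing += hours
        previous = timestamp
    return total, fishing

# Hypothetical track for a single vessel.
track = [
    (datetime(2024, 1, 1, 0, 0), False),
    (datetime(2024, 1, 1, 0, 30), True),
    (datetime(2024, 1, 1, 1, 30), True),
    (datetime(2024, 1, 1, 2, 0), False),
]
print(apparent_fishing_hours(track))
```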

Data are spatially binned into grid cells that measure 0.01 or 0.1 degrees on a side; the coordinates defining each cell are provided in decimal degrees (WGS84) and correspond to the lower-left corner. Data are available in the following formats:

Daily apparent fishing hours by flag state and gear type at 100th degree resolution

Monthly apparent fishing hours by flag state and gear type at 10th degree resolution

Daily apparent fishing hours by MMSI at 10th degree resolution

The fishing effort dataset is accompanied by a table of vessel information (e.g. gear type, flag state, dimensions).
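
Since each grid cell is identified by its lower-left corner, mapping a position to its cell amounts to flooring the coordinates at the chosen resolution. The sketch below shows one way to do that; the function name and the `cells_per_degree` parameter are hypothetical, and positions lying exactly on a cell boundary may be affected by floating-point rounding.

```python
import math

def cell_lower_left(lat, lon, cells_per_degree):
    """Return the lower-left corner (lat, lon) of the grid cell containing a point.

    cells_per_degree: 100 for 0.01-degree cells, 10 for 0.1-degree cells.
    """
    if cells_per_degree not in (10, 100):
        raise ValueError("cells_per_degree must be 10 or 100")
    # Floor (not round) so that negative coordinates also map to the lower-left corner.
    cell_lat = math.floor(lat * cells_per_degree) / cells_per_degree
    cell_lon = math.floor(lon * cells_per_degree) / cells_per_degree
    return cell_lat, cell_lon

print(cell_lower_left(-33.456, 151.237, 10))
print(cell_lower_left(-33.456, 151.237, 100))
```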

File structure

Fishing effort and vessel presence data are available as .csv files in daily formats. Files for each year are stored in separate .zip files. README.txt and schema.json files are provided for each dataset version and contain the table schema and additional information. There is also a README-known-issues-v3.txt file outlining some of the known issues with the version 3 release.

Files are named according to the following convention:

Daily file format:

[fleet/mmsi]-daily-csvs-[100/10]-v3-[year].zip

[fleet/mmsi]-daily-csvs-[100/10]-v3-[date].csv

Monthly file format:

fleet-monthly-csvs-10-v3-[year].zip

fleet-monthly-csvs-10-v3-[date].csv

Fishing vessel format: fishing-vessels-v3.csv

README file format: README-[fleet/mmsi/fishing-vessels/known-issues]-v3.txt

File identifiers:

[fleet/mmsi]: Data by fleet (flag and geartype) or by MMSI

[100/10]: 100th or 10th degree resolution

[year]: Year of data included in .zip file

[date]: Date of data included in .csv files. For monthly data, [date] corresponds to the first day of the month

Examples: fleet-daily-csvs-100-v3-2020.zip; mmsi-daily-csvs-10-v3-2020-01-10.csv; fishing-vessels-v3.csv; README-fleet-v3.txt; fleet-monthly-csvs-10-v3-2024.zip; fleet-monthly-csvs-10-v3-2024-08-01.csv
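
The naming convention above can be expressed as a pair of small helpers. This is a minimal sketch: the function names and the validation rules are assumptions derived from the patterns listed, not part of any GFW tooling.

```python
from datetime import date

def _check(kind, period, resolution):
    """Validate the identifiers against the patterns in the naming convention."""
    if kind not in ("fleet", "mmsi"):
        raise ValueError("kind must be 'fleet' or 'mmsi'")
    if period not in ("daily", "monthly"):
        raise ValueError("period must be 'daily' or 'monthly'")
    if resolution not in (100, 10):
        raise ValueError("resolution must be 100 or 10")
    # The convention lists monthly files only for fleet data at 10th degree.
    if period == "monthly" and (kind != "fleet" or resolution != 10):
        raise ValueError("monthly files are listed only as fleet data at resolution 10")

def archive_name(kind, period, resolution, year):
    """Name of the yearly .zip archive."""
    _check(kind, period, resolution)
    return f"{kind}-{period}-csvs-{resolution}-v3-{year}.zip"

def csv_name(kind, period, resolution, day):
    """Name of a single .csv file; monthly files use the first day of the month."""
    _check(kind, period, resolution)
    if period == "monthly":
        day = day.replace(day=1)
    return f"{kind}-{period}-csvs-{resolution}-v3-{day.isoformat()}.csv"

print(archive_name("fleet", "daily", 100, 2020))
print(csv_name("mmsi", "daily", 10, date(2020, 1, 10)))
```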

Key documentation

For an overview of how GFW turns raw AIS positions into estimates of fishing hours, see this page.

The models used to produce this dataset were developed as part of this publication: D.A. Kroodsma, J. Mayorga, T. Hochberg, N.A. Miller, K. Boerder, F. Ferretti, A. Wilson, B. Bergman, T.D. White, B.A. Block, P. Woods, B. Sullivan, C. Costello, and B. Worm. "Tracking the global footprint of fisheries." Science 361.6378 (2018). Model details are available in the Supplementary Materials.

The README-known-issues-v3.txt file describing this dataset's specific caveats can be downloaded from this page. We highly recommend that users read this file in full.

The README-mmsi-v3.txt file, the README-fleet-v3.txt file, and the README-fishing-vessels-v3.txt files are downloadable from this page and contain the data description for (respectively) the fishing hours by MMSI dataset, the fishing hours by fleet dataset, and the vessel information file. These readmes contain key explanations about the gear types and flag states assigned to vessels in the dataset.

The file name structure for the data files is described below on this page, and the file schema can be downloaded from this page.

A FAQ describing the updates in this version and the differences between this dataset and the data available from the GFW Map and APIs is available here.

Use Cases

The apparent fishing hours dataset is intended to allow users to analyze patterns of fishing across the world’s oceans at temporal scales as fine as daily and at spatial scales as fine as 0.1 or 0.01 degree cells. Fishing hours can be separated out by gear type, vessel flag and other characteristics of vessels such as tonnage.

Potential applications for this dataset are broad. We offer suggested use cases to illustrate its utility. The dataset can be integrated as a static layer in multi-layered analyses, allowing researchers to investigate relationships between fishing effort and other variables, including biodiversity, tracking, and environmental data, as defined by their research objectives.

A few example questions that these data could be used to answer:

What flag states have fishing activity in my area of interest?

Do hotspots of longline fishing overlap with known migration routes of sea turtles?

How does fishing time by trawlers change by month in my area of interest? Which seasons see the most trawling hours and which see the least?

Caveats

This global dataset estimates apparent fishing effort in hours. The dataset is based on publicly available information and statistical classifications, which may not fully capture the nuances of local fishing practices. While we manually review the dataset at a global scale and in a select set of smaller test regions to check for issues, given the scale of the dataset we are unable to manually review every fleet in every region. We recognize the potential for inaccuracies and encourage users to approach regional analyses with caution, utilizing their own regional expertise to validate findings. We welcome your feedback on any regional analysis at research@globalfishingwatch.org to enhance the dataset's accuracy.

Caveats relating to known sources of inaccuracy as well as interpretation pitfalls to avoid are described in the README-known-issues-v3.txt file available for download from this page. We highly recommend that users read this file in full. The issues described include:

Data from 2024 should be considered provisional, as vessel classifications may change as more data from 2025 becomes available.

MMSI is used in this dataset as the vessel identifier. While MMSI is intended to serve as the unique AIS identifier for an individual vessel, this does not always hold in practice.

The Maritime Identification Digits (MID), the first 3 digits of MMSI, are the only source of information on vessel flag state when the vessel does not appear on a registry. The MID may be entered incorrectly, obscuring information about an MMSI’s flag state.

AIS reception is not consistent across all areas and changes over time.
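
To make the MID caveat concrete, the sketch below extracts the first three digits of an MMSI. It assumes a nine-digit MMSI and uses a hypothetical function name and a made-up identifier; it cannot detect an MID that was entered incorrectly, which is exactly the limitation noted above.

```python
def maritime_identification_digits(mmsi):
    """Return the first three digits (MID) of a nine-digit MMSI as a string."""
    text = str(mmsi)
    if len(text) != 9 or not text.isdigit():
        raise ValueError("expected a nine-digit MMSI")
    return text[:3]

# Hypothetical MMSI used only for illustration.
print(maritime_identification_digits(366123456))
```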

Alternative ways to access

Query using SQL in the Global Fishing Watch public BigQuery dataset: global-fishing-watch.fishing_effort_v3

Download the entire dataset from the Global Fishing Watch Data Download Portal (https://globalfishingwatch.org/data-download/datasets/public-fishing-effort)
PRODUCER PRICES by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Jul 16, 2013
Cite
TRADING ECONOMICS (2013). PRODUCER PRICES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/producer-prices
Available download formats: xml, csv, json, excel
Dataset updated
Jul 16, 2013
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for PRODUCER PRICES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
GOLD RESERVES by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated May 26, 2017
Cite
TRADING ECONOMICS (2017). GOLD RESERVES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gold-reserves
Available download formats: excel, xml, csv, json
Dataset updated
May 26, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for GOLD RESERVES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
HOUSING STARTS by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Sep 28, 2013
Cite
TRADING ECONOMICS (2013). HOUSING STARTS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/housing-starts
Available download formats: csv, json, excel, xml
Dataset updated
Sep 28, 2013
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for HOUSING STARTS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Black Earth Town, Wisconsin Age Cohorts Dataset: Children, Working Adults,...
neilsberg.com
csv, json
Updated Feb 22, 2025
Cite
Neilsberg Research (2025). Black Earth Town, Wisconsin Age Cohorts Dataset: Children, Working Adults, and Seniors in Black Earth town - Population and Percentage Analysis // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/black-earth-town-wi-population-by-age/
Available download formats: csv, json
Dataset updated
Feb 22, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Black Earth, Wisconsin, Black Earth
Variables measured
Population Over 65 Years, Population Under 18 Years, Population Between 18 and 64 Years, Percent of Total Population for Age Groups
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age cohorts. We divided the age cohorts into three buckets: children (under the age of 18 years), working population (between 18 and 64 years), and senior population (over 65 years). For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Black Earth town population by age cohorts (Children: Under 18 years; Working population: 18-64 years; Senior population: 65 years or more). It lists the population in each age cohort group along with its percentage relative to the total population of Black Earth town. The dataset can be utilized to understand the population distribution across children, working population and senior population for dependency ratio, housing requirements, ageing, migration patterns etc.

Key observations

The largest age group was 18 to 64 years, with a population of 276 (66.67% of the total population). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Age cohorts:

Under 18 years

18 to 64 years

65 years and over

Variables / Data Columns

Age Group: This column displays the age cohort for the Black Earth town population analysis. Three groups are expected (Children, Working Population, and Senior Population).

Population: The population for the age cohort in Black Earth town is shown in the following column.

Percent of Total Population: The population as a percent of total population of the Black Earth town is shown in the following column.
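
The "Percent of Total Population" column is simple arithmetic over the cohort counts, sketched below. Only the 276 figure for the 18 to 64 cohort comes from the key observations above; the other two counts are hypothetical placeholders chosen so that the total is 414, the total implied by 276 being 66.67%. The function name is also an assumption.

```python
def percent_of_total(counts):
    """Return each cohort's share of the total population, in percent (2 decimals)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total population is zero")
    return {name: round(100.0 * value / total, 2) for name, value in counts.items()}

cohorts = {
    "Under 18 years": 80,      # hypothetical placeholder
    "18 to 64 years": 276,     # from the key observations
    "65 years and over": 58,   # hypothetical placeholder
}
print(percent_of_total(cohorts)["18 to 64 years"])
```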

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Black Earth town Population by Age. You can refer to the same here
EXISTING HOME SALES by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated May 27, 2017
Cite
TRADING ECONOMICS (2017). EXISTING HOME SALES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/existing-home-sales
Available download formats: json, csv, excel, xml
Dataset updated
May 27, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for EXISTING HOME SALES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
United States GDP
tradingeconomics.com
fa.tradingeconomics.com
+13more
csv, excel, json, xml
Updated Jun 15, 2025
Cite
TRADING ECONOMICS (2025). United States GDP [Dataset]. https://tradingeconomics.com/united-states/gdp
Available download formats: xml, excel, json, csv
Dataset updated
Jun 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1960 - Dec 31, 2024
Area covered
United States
Description
The Gross Domestic Product (GDP) in the United States was worth 29184.89 billion US dollars in 2024, according to official data from the World Bank. The GDP value of the United States represents 27.49 percent of the world economy. This dataset provides - United States GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Gold - Price Data
tradingeconomics.com
it.tradingeconomics.com
+13more
csv, excel, json, xml
Cite
TRADING ECONOMICS (2025). Gold - Price Data [Dataset]. https://tradingeconomics.com/commodity/gold
Available download formats: excel, csv, json, xml
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 3, 1968 - Aug 26, 2025
Area covered
World
Description
Gold rose to 3,377.18 USD/t.oz on August 26, 2025, up 0.30% from the previous day. Over the past month, Gold's price has risen 1.88%, and it is up 33.74% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks the benchmark market for this commodity. Gold - values, historical data, forecasts and news - updated in August 2025.
Blue Earth City Township, Minnesota Hispanic or Latino Population...
neilsberg.com
csv, json
Updated Feb 21, 2025
Cite
Neilsberg Research (2025). Blue Earth City Township, Minnesota Hispanic or Latino Population Distribution by Ancestries Dataset : Detailed Breakdown of Hispanic or Latino Origins // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b1e95a52-ef82-11ef-9e71-3860777c1fe6/
Available download formats: json, csv
Dataset updated
Feb 21, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Minnesota, Blue Earth City Township
Variables measured
Hispanic or Latino population with Cuban ancestry, Hispanic or Latino population with Mexican ancestry, Hispanic or Latino population with Puerto Rican ancestry, Hispanic or Latino population with Other Hispanic or Latino ancestry, Hispanic or Latino population with Cuban ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Mexican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Puerto Rican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Other Hispanic or Latino ancestry as Percent of Total Hispanic Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) Origin / Ancestry for Hispanic population and (b) respective population as a percentage of the total Hispanic population, we initially analyzed and categorized the data for each of the ancestries across the Hispanic or Latino population. It is ensured that the population estimates used in this dataset pertain exclusively to ancestries for the Hispanic or Latino population. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Blue Earth City township Hispanic or Latino population. It includes the distribution of the Hispanic or Latino population, of Blue Earth City township, by their ancestries, as identified by the Census Bureau. The dataset can be utilized to understand the origin of the Hispanic or Latino population of Blue Earth City township.

Key observations

Among the Hispanic population in Blue Earth City township, regardless of the race, the largest group is of Mexican origin, with a population of 26 (100% of the total Hispanic population).

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Origins for the Hispanic or Latino population include:

Mexican

Puerto Rican

Cuban

Other Hispanic or Latino

Variables / Data Columns

Origin: This column displays the origin for Hispanic or Latino population for the Blue Earth City township

Population: The population of the specific origin for Hispanic or Latino population in the Blue Earth City township is shown in this column.

% of Total Hispanic Population: This column displays the percentage distribution of each Hispanic origin as a proportion of Blue Earth City township total Hispanic or Latino population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Blue Earth City township Population by Race & Ethnicity. You can refer to the same here
