The global big data market is forecasted to grow to 103 billion U.S. dollars by 2027, more than double its expected market size in 2018. With a share of 45 percent, the software segment would become the largest big data market segment by 2027.
What is Big data?
Big data is a term that refers to the kind of data sets that are too large or too complex for traditional data processing applications. It is defined as having one or some of the following characteristics: high volume, high velocity or high variety. Fast-growing mobile data traffic, cloud computing traffic, as well as the rapid development of technologies such as artificial intelligence (AI) and the Internet of Things (IoT) all contribute to the increasing volume and complexity of data sets.
Big data analytics
Advanced analytics tools, such as predictive analytics and data mining, help to extract value from the data and generate new business insights. The global big data and business analytics market was valued at 169 billion U.S. dollars in 2018 and is expected to grow to 274 billion U.S. dollars in 2022. As of November 2018, 45 percent of professionals in the market research industry reportedly used big data analytics as a research method.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
There is a lack of publicly available datasets on financial services, especially in the emerging mobile money transactions domain. Financial datasets are important to many researchers, and in particular to us, as we perform research in the domain of fraud detection. Part of the problem is the intrinsically private nature of financial transactions, which leads to no publicly available datasets.
We present a synthetic dataset generated using the simulator called PaySim as an approach to such a problem. PaySim uses aggregated data from the private dataset to generate a synthetic dataset that resembles the normal operation of transactions and injects malicious behaviour to later evaluate the performance of fraud detection methods.
PaySim simulates mobile money transactions based on a sample of real transactions extracted from one month of financial logs from a mobile money service implemented in an African country. The original logs were provided by a multinational company, which is the provider of the mobile financial service and currently operates in more than 14 countries around the world.
This synthetic dataset is scaled down to 1/4 of the original dataset and was created just for Kaggle.
This is a sample row, followed by an explanation of the headers:
1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0
step - maps a unit of time in the real world. In this case 1 step is 1 hour of time. Total steps: 744 (31 days of simulation, since 744 = 31 × 24).
type - CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
amount - amount of the transaction in local currency.
nameOrig - customer who started the transaction
oldbalanceOrg - initial balance before the transaction
newbalanceOrig - new balance after the transaction
nameDest - customer who is the recipient of the transaction
oldbalanceDest - initial balance of the recipient before the transaction. Note that there is no information for customers whose names start with M (Merchants).
newbalanceDest - new balance of the recipient after the transaction. Note that there is no information for customers whose names start with M (Merchants).
isFraud - Marks the transactions made by the fraudulent agents inside the simulation. In this specific dataset, the fraudulent agents aim to profit by taking control of customers' accounts and trying to empty the funds by transferring them to another account and then cashing out of the system.
isFlaggedFraud - The business model aims to control massive transfers from one account to another and flags illegal attempts. An illegal attempt in this dataset is an attempt to transfer more than 200.000 in a single transaction.
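The column descriptions above can be illustrated with a minimal parsing sketch. It is not part of PaySim; the names `Transaction`, `parse_row`, `is_merchant`, `balances_consistent` and `exceeds_flag_threshold` are hypothetical helpers introduced here. It assumes the columns appear in the order listed above, that `200.000` means two hundred thousand units of local currency, and that the flagging rule applies only to rows of type TRANSFER.

```python
from dataclasses import dataclass

# Column order as listed in the header explanation (assumed to match the file).
FIELDS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]


@dataclass
class Transaction:
    step: int
    type: str
    amount: float
    nameOrig: str
    oldbalanceOrg: float
    newbalanceOrig: float
    nameDest: str
    oldbalanceDest: float
    newbalanceDest: float
    isFraud: bool
    isFlaggedFraud: bool


def parse_row(line: str) -> Transaction:
    """Parse one comma-separated row into a typed Transaction."""
    values = line.strip().split(",")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    raw = dict(zip(FIELDS, values))
    return Transaction(
        step=int(raw["step"]),
        type=raw["type"],
        amount=float(raw["amount"]),
        nameOrig=raw["nameOrig"],
        oldbalanceOrg=float(raw["oldbalanceOrg"]),
        newbalanceOrig=float(raw["newbalanceOrig"]),
        nameDest=raw["nameDest"],
        oldbalanceDest=float(raw["oldbalanceDest"]),
        newbalanceDest=float(raw["newbalanceDest"]),
        isFraud=raw["isFraud"] == "1",
        isFlaggedFraud=raw["isFlaggedFraud"] == "1",
    )


def is_merchant(name: str) -> bool:
    """Names starting with 'M' are merchants, which carry no balance information."""
    return name.startswith("M")


def balances_consistent(tx: Transaction) -> bool:
    """Check whether the originator's new balance equals old balance minus amount."""
    return round(tx.oldbalanceOrg - tx.amount, 2) == round(tx.newbalanceOrig, 2)


def exceeds_flag_threshold(tx: Transaction, threshold: float = 200_000.0) -> bool:
    """Assumed reading of the flagging rule: a single TRANSFER above the threshold."""
    return tx.type == "TRANSFER" and tx.amount > threshold


SAMPLE_ROW = "1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0"
tx = parse_row(SAMPLE_ROW)
print(tx.type, tx.amount, is_merchant(tx.nameDest), balances_consistent(tx))
```

For the sample row, the recipient is a merchant, which is consistent with both recipient balances being 0.0, and the originator's balance drops by exactly the payment amount.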
There are 5 similar files that contain the runs of 5 different scenarios. These files are explained in more detail in chapter 7 of my PhD thesis (available here: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12932).
We ran PaySim several times using random seeds for 744 steps, representing each hour of one month of real time, which matches the original logs. Each run took around 45 minutes on an Intel i7 processor with 16 GB of RAM. The final result of a run contains approximately 24 million financial records divided into the 5 transaction types: CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
This work is part of the research project "Scalable resource-efficient systems for big data analytics" funded by the Knowledge Foundation (grant: 20140032) in Sweden.
Please refer to this dataset using the following citations:
PaySim first paper of the simulator:
E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016
In 2024, global retail e-commerce sales reached an estimated ************ U.S. dollars. Projections indicate a ** percent growth in this figure over the coming years, with expectations to come close to ************** dollars by 2028.
World players
Among the key players on the world stage, the American marketplace giant Amazon holds the title of the largest e-commerce player globally, with a gross merchandise value of nearly *********** U.S. dollars in 2024. Amazon was also the most valuable retail brand globally, followed by mostly American competitors such as Walmart and the Home Depot.
Leading e-tailing regions
E-commerce is a dormant channel globally, but nowhere has it been as successful as in Asia. In 2024, the e-commerce revenue in that continent alone was measured at nearly ************ U.S. dollars, outperforming the Americas and Europe. That year, the up-and-coming e-commerce markets also centered around Asia. The Philippines and India stood out as the swiftest-growing e-commerce markets based on online sales, anticipating a growth rate surpassing ** percent.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Money Supply M0 in the United States decreased to 5648600 USD Million in May from 5732900 USD Million in April of 2025. This dataset provides - United States Money Supply M0 - actual values, historical data, forecast, chart, statistics, economic calendar and news.
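As a small worked example, the month-over-month change implied by the two M0 figures above can be computed directly. The helper name `percent_change` is hypothetical; the two values are the ones quoted in the description, in USD million.

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from `previous` to `current`, expressed in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Values quoted in the description above (USD million).
m0_april_2025 = 5_732_900
m0_may_2025 = 5_648_600

absolute_change = m0_may_2025 - m0_april_2025
relative_change = percent_change(m0_april_2025, m0_may_2025)

print(f"Change: {absolute_change} USD million ({relative_change:.2f}%)")
```

The same helper applies to any of the other indicator pairs in this listing that quote a current and a previous value.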
https://www.icpsr.umich.edu/web/ICPSR/studies/38050/terms
Launched on April 28, 2009, Kickstarter is a Public Benefit Corporation based in Brooklyn, New York. It is a global crowdfunding platform that helps to fund new creative projects and ideas through direct support from individuals (backers) from around the world who pledge money to bring these projects and ideas to life. Kickstarter supports many different kinds of projects: everything from films, games, and music to art, design, and technology. Funding on Kickstarter is based on the all-or-nothing model. Backers who pledge their support towards a particular project won't be charged unless the funding goal has been reached. Successfully funded projects reward their backers with one-of-a-kind experiences, e.g., limited editions, or copies of the creative work being produced.
This study includes three datasets: (1) Kickstarter Project (public-use file), (2) Backer Location file, and (3) Kickstarter Project (restricted-use file). The public-use Kickstarter Project dataset contains detailed information about all successful and unsuccessful Kickstarter projects (N=610,015) from 2009-2023, including the project category and subcategory, project location (city, state (for U.S.-based projects), and country), funding goal in original and U.S. currencies, amount pledged in dollars, and the number of backers for each project. The restricted file adds the project title, 150-character project description, and the URL for the project on the Kickstarter site. The Backer Location dataset includes information about backers' country and state and the total amount pledged for each geographic location.
Oracle’s cloud services and license support division is the company’s most profitable business segment, bringing in over ** billion U.S. dollars in its 2024 fiscal year. In that year, Oracle brought in annual revenue of close to ** billion U.S. dollars, its highest revenue figure to date.
Oracle Corporation
Oracle was founded by Larry Ellison in 1977 as a tech company primarily focused on relational databases. Today, Oracle ranks among the largest companies in the world in terms of market value and serves as the world’s most popular database management system provider. Oracle’s success is not only reflected in its booming sales figures, but also in its growing number of employees: between fiscal year 2008 and 2021, Oracle’s total employee number has grown substantially, increasing from around ****** to *******.
Database market
The global database market reached a size of ** billion U.S. dollars in 2020. Database Management Systems (DBMSs) provide a platform through which developers can organize, update, and control large databases, with products like Oracle, MySQL, and Microsoft SQL Server being the most widely used in the market.
I WILL NOT HELP YOU ONE BIT NOT A CHANCE YOU THINK I WILL HONOR YOU I JUST LAUGH IN THE FACE OF IT ALL YOU HAVE NO MONEY LEFT YOU HAVE NO BULLETS LEFT THE RESOURCES YOU HAVE LEFT ARE NOW LOCKED IN BITTER ESCROWS FOREVER YOU ONLY HAVE BITCOINS YOU ONLY HAVE DOGE COINS THEY ARE NOT REAL MONEY AT ALL UNLESS YOU KNOW HOW TO EXCHANGE THEM AGAIN I WILL JUST CONTINUE TO LAUGH WHO WILL SAVE YOU NOW WHO WILL MAKE PEACE AMONGST YOU? YOU ONLY HAVE ENOUGH FEUL LEFT TO CONTINUE USING MISSILE STRIKES I… See the full description on the dataset page: https://huggingface.co/datasets/MAAT-EL-DUAT/I-WILL-NOT-SAVE-YOUR-WORLD.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Money Supply M2 in the United States increased to 21942 USD Billion in May from 21862.40 USD Billion in April of 2025. This dataset provides - United States Money Supply M2 - actual values, historical data, forecast, chart, statistics, economic calendar and news.
By Throwback Thursday [source]
This dataset contains comprehensive information about the US recorded music industry in 2019 Week 10. It includes details on the various formats of recorded music, such as CDs, vinyl records, digital downloads, and more. The dataset also provides data on the respective years in which these records were made, allowing for accurate historical comparison and analysis.
Key metrics provided include the number of units sold for each format, as well as corresponding revenue generated from their sales. In addition to the raw revenue figures, this dataset offers an extra column that presents inflation-adjusted revenue values. These adjusted figures take into account changes in purchasing power over time and enable a fair comparison of different years' revenues.
Overall, this dataset offers insights into the US recorded music industry's performance in terms of format popularity and revenue throughout a specific week in 2019. Researchers, analysts, and music professionals can use it to explore trends within specific formats while considering both absolute revenue and inflation-adjusted figures.
Introduction:
Understanding the Columns:
a) Format: This column categorizes the format of the recorded music, such as CD, vinyl, digital download, etc.
b) Year: This column represents the year in which the data was recorded.
c) Units: The number of units sold for a particular format of recorded music.
d) Revenue: The revenue generated from sales for a specific format.
e) Revenue (Inflation Adjusted): The column that shows revenue adjusted for inflation.
Analyzing Formats: By exploring and analyzing the Format column in this dataset, you can gain insights into changing consumer preferences over time. You can identify which formats have gained popularity or declined over different years or periods.
Understanding Revenue Generation: To understand revenue patterns in relation to various formats and years, analyze both Revenue and Revenue (Inflation Adjusted) columns separately. Comparing these two columns will help you assess changes due to inflation accurately.
Exploring Units Sold: The column Units provides insight into how many units were sold for each format within a specific year or period. Analyzing this data helps understand consumer demand across various formats.
Calculating Inflation-Adjusted Revenue: Utilize the Revenue (Inflation Adjusted) column when analyzing long-term trends or comparisons across different periods without worrying about how inflation affects purchasing power over time.
Comparing Multiple Years or Periods: This dataset includes information specifically for 2019 Week 10. However, you can use this dataset in conjunction with other datasets covering different years to compare revenue, units sold, and format performance across multiple years.
Creating Visualizations: Visualizations such as line charts or bar graphs can help represent patterns and trends more comprehensively. Consider creating visualizations based on formats over multiple years or comparing revenue generated by different formats.
Deriving Insights: Make use of the information provided to identify trends, understand customer preferences, and make informed decisions related to marketing strategies or product offerings in the music industry.
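The analysis steps above can be sketched with the Python standard library alone. This is only an illustration: the helper `revenue_by_format` is hypothetical, the exact header spellings are assumed to match the column names listed above, and the rows in `PLACEHOLDER_CSV` are made-up placeholder values, not figures from the dataset.

```python
import csv
import io
from collections import defaultdict


def revenue_by_format(rows, column="Revenue (Inflation Adjusted)"):
    """Sum the chosen revenue column per music format."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Format"]] += float(row[column])
    return dict(totals)


# Made-up placeholder rows, used only to show the expected file shape.
PLACEHOLDER_CSV = """Format,Year,Units,Revenue,Revenue (Inflation Adjusted)
CD,2000,10,100.0,150.0
CD,2001,8,90.0,130.0
Vinyl,2000,1,5.0,7.5
"""

rows = list(csv.DictReader(io.StringIO(PLACEHOLDER_CSV)))

nominal = revenue_by_format(rows, column="Revenue")
adjusted = revenue_by_format(rows)

# Rank formats by inflation-adjusted revenue, highest first.
ranking = sorted(adjusted.items(), key=lambda item: item[1], reverse=True)
for music_format, total in ranking:
    print(music_format, nominal[music_format], total)
```

To run it against the real file, replace `io.StringIO(PLACEHOLDER_CSV)` with an open file handle and check the header names first.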
Conclusion:
- Analyzing the impact of different music formats on revenue: This dataset provides information on the revenue and units sold for different recorded music formats such as CDs, vinyl, and digital downloads. By analyzing this data, one can identify which format generates the highest revenue and understand how consumer preferences have shifted over time.
- Tracking changes in purchasing power over time: The dataset includes both revenue and inflation-adjusted revenue figures, allowing for a comparison of how purchasing power has changed over the years. This can be useful in understanding trends in consumer spending habits or evaluating the success of marketing campaigns.
- Assessing market performance by year: With data on both units sold and revenue by year, this dataset can be used to assess the overall performance of the US recorded music industry over time. By comparing different years, one can identify periods of growth or decline and gain insights into factors driving these changes, such as technological advancements or shifts in consumer behavior.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The DXY exchange rate rose to 97.2687 on June 27, 2025, up 0.13% from the previous session. Over the past month, the United States Dollar has weakened 2.61%, and it is down by 8.10% over the last 12 months. United States Dollar - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Exports in the United States increased to 289.37 USD Billion in April from 281.07 USD Billion in March of 2025. This dataset provides the latest reported value for - United States Exports - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Money Supply M2 in China increased to 325783.81 CNY Billion in May from 325173.93 CNY Billion in April of 2025. This dataset provides - China Money Supply M2 - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in Iran was worth 404.63 billion US dollars in 2023, according to official data from the World Bank. The GDP value of Iran represents 0.38 percent of the world economy. This dataset provides - Iran GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for MONEY SUPPLY M4 reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in Australia was worth 1728.06 billion US dollars in 2023, according to official data from the World Bank. The GDP value of Australia represents 1.64 percent of the world economy. This dataset provides - Australia GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in India was worth 3567.55 billion US dollars in 2023, according to official data from the World Bank. The GDP value of India represents 3.38 percent of the world economy. This dataset provides the latest reported value for - India GDP - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.