Envestnet®| Yodlee®'s Electronic Payment Data (Aggregate/Row) Panels consist of de-identified, near-real-time (T+1) USA credit/debit/ACH transaction-level data, offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of Envestnet®| Yodlee®'s financial technology platform.
Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.
We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.
Investors, corporate researchers, and corporates can use our data to answer some key business questions such as:
- How much are consumers spending with specific merchants/brands and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company’s market share in a region for similar customers?
- Is the company’s loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?
Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.
Use Case Categories (our data supports a wide range of use cases, and we look forward to working on new ones):
1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking
2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)
3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence
4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In recent years, we have been exploring computational models that classify bank accounts in order to combat illegal pyramid selling. The department of economic investigation provided us with a large volume of transaction data from real bank accounts. Each account contains many transaction records, each of which includes the two accounts involved in the transaction, a timestamp, the amount of money, the transaction direction, etc. We sampled the transaction records belonging to 10145 bank accounts to form our dataset for training the model. Of these, 9270 are normal accounts and 875 are accounts involved in an MLM organization. The normal accounts generated 6732730 transaction records, and the fraud records created by MLM members amount to 275804 rows. These MLM members were manually annotated as "illegal" by economic investigators. Before training the models, we filtered out noisy data, i.e. we deleted duplicate records, incomplete records, and records whose transaction amounts are no more than 50. As a result, 1371914 records were filtered out of the normal accounts' transaction records and 91341 records created by illegal accounts were deleted. In total, more than 5 million transaction records are used after denoising.
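The denoising step described above (dropping duplicates, incomplete records, and records with amounts of no more than 50) can be sketched roughly as follows. This is only an illustration: the field names (`payer`, `payee`, `timestamp`, `amount`, `direction`) and the sample records are hypothetical, since the actual schema of the dataset is not given in the description.

```python
# Minimal sketch of the three filters named in the description.
# Field names and sample values are assumptions, not the dataset's real schema.
REQUIRED_FIELDS = ("payer", "payee", "timestamp", "amount", "direction")
MIN_AMOUNT = 50  # records with amount <= 50 are treated as noise


def denoise(records):
    """Return records that are complete, above the threshold, and not duplicated."""
    seen = set()
    kept = []
    for record in records:
        # Incomplete record: any required field missing or empty.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Amount filter: keep only amounts strictly greater than 50.
        if record["amount"] <= MIN_AMOUNT:
            continue
        # Duplicate filter: identical values in every required field.
        key = tuple(record[field] for field in REQUIRED_FIELDS)
        if key in seen:
            continue
        seen.add(key)
        kept.append(record)
    return kept


sample_records = [
    {"payer": "A", "payee": "B", "timestamp": "2020-01-01T10:00:00", "amount": 120.0, "direction": "out"},
    {"payer": "A", "payee": "B", "timestamp": "2020-01-01T10:00:00", "amount": 120.0, "direction": "out"},
    {"payer": "A", "payee": "C", "timestamp": "2020-01-02T09:30:00", "amount": 50.0, "direction": "out"},
    {"payer": "D", "payee": None, "timestamp": "2020-01-03T14:00:00", "amount": 900.0, "direction": "in"},
]
clean_records = denoise(sample_records)
print(len(clean_records))
```

As a check on the reported counts: 6732730 - 1371914 = 5360816 normal records and 275804 - 91341 = 184463 illegal records remain, which gives 5545279 records in total and is consistent with "more than 5 million".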
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Store Transaction data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/iamprateek/store-transaction-data on 14 February 2022.
--- Dataset description provided by original source is as follows ---
Nielsen receives transaction level scanning data (POS Data) from its partner stores on a regular basis. Stores sharing POS data include bigger format store types such as supermarkets, hypermarkets as well as smaller traditional trade grocery stores (Kirana stores), medical stores etc. using a POS machine.
While in a bigger format store, all items for all transactions are scanned using a POS machine, smaller and more localized shops do not have a 100% compliance rate in terms of scanning and inputting information into the POS machine for all transactions.
A transaction involving a single packet of chips or a single piece of candy may not be scanned and recorded, either to spare the customer the inconvenience or because the store is crowded during rush hours.
Thus, the data received from such stores is often incomplete and lacks complete information of all transactions completed within a day.
Additionally, apart from incomplete transaction data within a day, it is observed that certain stores do not share data for all active days. Stores share data for anywhere from 2 to 28 days in a month. While it is possible to impute/extrapolate data for 2 days of a month using 28 days of actual historical data, the reverse is not recommended.
Nielsen encourages you to create a model which can help impute/extrapolate data to fill in the missing data gaps in the store level POS data currently received.
You are provided with a dataset that contains store-level data by brands and categories for select stores:
Hackathon_Ideal_Data - The file contains brand level data for 10 stores for the last 3 months. This can be referred to as the ideal data.
Hackathon_Working_Data - This contains data for selected stores which are missing and/or incomplete.
Hackathon_Mapping_File - This file is provided to help understand the column names in the data set.
Hackathon_Validation_Data - This file contains the data stores and product groups for which you have to predict the Total_VALUE.
Sample Submission - This file represents what needs to be uploaded as output by candidate in the same format. The sample data is provided in the file to help understand the columns and values required.
Nielsen Holdings plc (NYSE: NLSN) is a global measurement and data analytics company that provides the most complete and trusted view available of consumers and markets worldwide. Nielsen is divided into two business units. Nielsen Global Media, the arbiter of truth for media markets, provides media and advertising industries with unbiased and reliable metrics that create a shared understanding of the industry required for markets to function. Nielsen Global Connect provides consumer packaged goods manufacturers and retailers with accurate, actionable information and insights and a complete picture of the complex and changing marketplace that companies need to innovate and grow. Our approach marries proprietary Nielsen data with other data sources to help clients around the world understand what’s happening now, what’s happening next, and how to best act on this knowledge. An S&P 500 company, Nielsen has operations in over 100 countries, covering more than 90% of the world’s population.
Know more: https://www.nielsen.com/us/en/
Build an imputation and/or extrapolation model to fill the missing data gaps for select stores by analyzing the data and determining which factors/variables/features best predict store sales.
--- Original source retains full ownership of the source dataset ---
Consumer Edge is a leader in alternative consumer data for public and private investors and corporate clients. CE Vision Europe includes consumer transaction data on 6.7M+ credit cards, debit cards, direct debit accounts, and direct transfer accounts, including 5.3M+ active monthly users. Capturing online, offline, and 3rd-party consumer spending on public and private companies, data covers 5K+ merchants, 3K+ brands mapped to 600 global parent companies (500 publicly traded), and deep geographic breakouts with demographic breakouts coming soon for UK. Brick & mortar and ecommerce direct-to-consumer sales are recorded on transaction date and purchase data is available for most companies as early as 5 days post-swipe.
Consumer Edge’s consumer transaction datasets offer insights into industries across consumer and discretionary spend such as: • Apparel, Accessories, & Footwear • Automotive • Beauty • Commercial – Hardlines • Convenience / Drug / Diet • Department Stores • Discount / Club • Education • Electronics / Software • Financial Services • Full-Service Restaurants • Grocery • Ground Transportation • Health Products & Services • Home & Garden • Insurance • Leisure & Recreation • Limited-Service Restaurants • Luxury • Miscellaneous Services • Online Retail – Broadlines • Other Specialty Retail • Pet Products & Services • Sporting Goods, Hobby, Toy & Game • Telecom & Media • Travel
Private equity and venture capital firms can leverage insights from CE’s synthetic data to assess investment opportunities, while consumer insights teams and retailers can gain visibility into transaction data’s potential for competitive analysis, shopper behavior, and market intelligence.
CE Vision Benefits • Discover new competitors • Compare sales, average ticket & transactions across competition • Evaluate demographic and geographic drivers of growth • Assess customer loyalty • Explore granularity by geos • Benchmark market share vs. competition • Analyze business performance with advanced cross-cut queries
Corporate researchers and consumer insights teams use CE Vision for:
Corporate Strategy Use Cases • Ecommerce vs. brick & mortar trends • Real estate opportunities • Economic spending shifts
Marketing & Consumer Insights • Total addressable market view • Competitive threats & opportunities • Cross-shopping trends for new partnerships • Demo and geo growth drivers • Customer loyalty & retention
Investor Relations • Shareholder perspective on brand vs. competition • Real-time market intelligence • M&A opportunities
Most popular use cases for private equity and venture capital firms include: • Deal Sourcing • Live Diligences • Portfolio Monitoring
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The balance of payments is a record of a country's international transactions with the rest of the world. It is composed of the current account and the capital and financial account. The current account is itself subdivided into goods, services, income and current transfers; it registers the value of exports (credits) and imports (debits). The difference between these two values is the "balance".
https://creativecommons.org/publicdomain/zero/1.0/
E-commerce has become a new channel to support business development. Through e-commerce, businesses can establish a wider market presence by providing cheaper and more efficient distribution channels for their products or services. E-commerce has also changed the way people shop and consume products and services. Many people are turning to their computers or smart devices to order goods, which can easily be delivered to their homes.
This is a sales transaction data set of UK-based e-commerce (online retail) for one year. This London-based shop has been selling gifts and homewares for adults and children through the website since 2007. Their customers come from all over the world and usually make direct purchases for themselves. There are also small businesses that buy in bulk and sell to other customers through retail outlet channels.
The data set contains 500K rows and 8 columns. The following is the description of each column.
1. TransactionNo (categorical): a six-digit unique number that defines each transaction. The letter “C” in the code indicates a cancellation.
2. Date (numeric): the date when each transaction was generated.
3. ProductNo (categorical): a five or six-digit unique character used to identify a specific product.
4. Product (categorical): product/item name.
5. Price (numeric): the price of each product per unit in pound sterling (£).
6. Quantity (numeric): the quantity of each product per transaction. Negative values relate to cancelled transactions.
7. CustomerNo (categorical): a five-digit unique number that defines each customer.
8. Country (categorical): name of the country where the customer resides.
A small percentage of the orders in the data set are cancellations. Most of these cancellations were due to out-of-stock conditions on some products. In this situation, customers tend to cancel an order because they want all products delivered at once.
Information is a main asset of businesses nowadays. The success of a business in a competitive environment depends on its ability to acquire, store, and utilize information. Data is one of the main sources of information. Therefore, data analysis is an important activity for acquiring new and useful information. Analyze this dataset and try to answer the following questions.
1. How did sales trend over the months?
2. What are the most frequently purchased products?
3. How many products does the customer purchase in each transaction?
4. Which customer segments are the most profitable?
5. Based on your findings, what strategy could you recommend to the business to gain more profit?
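As a rough illustration of how the column description can be used, the sketch below separates cancelled lines from completed ones and totals the value of the completed lines. The column names follow the description; the three sample rows are invented for the example, and the assumption that the file is a comma-separated text file with a header row is not confirmed by the source.

```python
import csv
import io
from decimal import Decimal

# Invented sample rows using the eight column names from the description.
SAMPLE_CSV = """TransactionNo,Date,ProductNo,Product,Price,Quantity,CustomerNo,Country
100001,2019-01-05,20001,Example Mug,10.00,3,11111,United Kingdom
100002,2019-01-06,20002,Example Card,2.50,4,22222,France
C100003,2019-01-07,20001,Example Mug,10.00,-1,11111,United Kingdom
"""


def is_cancellation(transaction_no: str) -> bool:
    """The description says the letter "C" in the code marks a cancellation."""
    return "C" in transaction_no


def summarize(rows):
    """Return (value of completed lines, number of cancelled lines)."""
    completed_value = Decimal("0")
    cancelled_lines = 0
    for row in rows:
        if is_cancellation(row["TransactionNo"]):
            cancelled_lines += 1
            continue
        completed_value += Decimal(row["Price"]) * int(row["Quantity"])
    return completed_value, cancelled_lines


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
completed_value, cancelled_lines = summarize(rows)
print(completed_value, cancelled_lines)
```

`Decimal` is used for prices so that sums in pound sterling are not affected by binary floating-point rounding.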
ExactOne delivers unparalleled consumer transaction insights to help investors and corporate clients uncover market opportunities, analyze trends, and drive better decisions.
Dataset Highlights
- Source: Debit and credit card transactions from 600K+ active users and 2M accounts connected via Open Banking.
- Scale: Covers 250M+ annual transactions, mapped to 1,800+ merchants and 330+ tickers.
- Historical Depth: Over 6 years of transaction data.
- Flexibility: Analyse transactions by merchant/ticker, category/industry, or timeframe (daily, weekly, monthly, or quarterly).
ExactOne data offers visibility into key consumer industries, including: Airlines - Regional / Budget; Airlines - Cargo; Airlines - Full Service; Autos - OEMs; Communication Services - Cable & Satellite; Communication Services - Integrated Telecommunications; Communication Services - Wireless Telecom; Consumer - Services; Consumer - Health & Fitness; Consumer Staples - Household Supplies; Energy - Utilities; Energy - Integrated Oil & Gas; Financial Services - Insurance; Grocers - Traditional; Hotels - C-corp; Industrial - Misc; Industrial - Tools And Hardware; Internet - E-commerce; Internet - B2B Services; Internet - Ride Hailing & Delivery; Leisure - Online Gambling; Media - Digital Subscription; Real Estate - Brokerage; Restaurants - Quick Service; Restaurants - Fast Casual; Restaurants - Pubs; Restaurants - Specialty; Retail - Softlines; Retail - Mass Merchants; Retail - European Luxury; Retail - Specialty; Retail - Sports & Athletics; Retail - Footwear; Retail - Dept Stores; Retail - Luxury; Retail - Convenience Stores; Retail - Hardlines; Technology - Enterprise Software; Technology - Electronics & Appliances; Technology - Computer Hardware; Utilities - Water Utilities
Use Cases
For Private Equity & Venture Capital Firms: - Deal Sourcing: Identify high-growth opportunities. - Due Diligence: Leverage transaction data to evaluate investment potential. - Portfolio Monitoring: Track performance post-investment with real-time data.
For Consumer Insights & Strategy Teams: - Market Dynamics: Compare sales trends, average transaction size, and customer loyalty. - Competitive Analysis: Benchmark market share and identify emerging competitors. - E-commerce vs. Brick & Mortar Trends: Assess channel performance and strategic opportunities. - Demographic & Geographic Insights: Uncover growth drivers by demo and geo segments.
For Investor Relations Teams: - Shareholder Insights: Monitor brand performance relative to competitors. - Real-Time Intelligence: Analyse sales and market dynamics for public and private companies. - M&A Opportunities: Evaluate market share and growth potential for strategic investments.
Key Benefits of ExactOne - Understand Market Share: Benchmark against competitors and uncover emerging players. - Analyse Customer Loyalty: Evaluate repeat purchase behavior and retention rates. - Track Growth Trends: Identify key drivers of sales by geography, demographic, and channel. - Granular Insights: Drill into transaction-level data or aggregated summaries for in-depth analysis.
With ExactOne, investors and corporate leaders gain actionable, real-time insights into consumer behaviour and market dynamics, enabling smarter decisions and sustained growth.
Dataset replaced by: http://data.europa.eu/euodp/data/dataset/zhr959EjfKSPU6UBmQaITg
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Unique identifier: https://doi.org/10.5281/zenodo.4718440
Dataset updated: Dec 19, 2022
Dataset provided by: Zenodo
Authors: Can Özturan; Alper Şen; Baran Kılıç
License: Attribution 4.0 (CC BY 4.0)
License information was derived automatically
Description:
This dataset contains ether as well as popular ERC20 token transfer transactions extracted from the Ethereum Mainnet blockchain. Only send ether, contract function call, contract deployment transactions are present in the dataset. Miner reward (static block reward) and "uncle block inclusion reward" are added as transactions to the dataset. Transaction fee reward and "uncles reward" are not currently included in the dataset.
Details of the datasets are given below:
FILENAME FORMAT: The filenames have the following format: eth-tx- where For example file eth-tx-1000000-1099999.txt.bz2 contains transactions from block 1000000 to block 1099999 inclusive. The files are compressed with bzip2. They can be uncompressed using command bunzip2.
TRANSACTION FORMAT: Each line in a file corresponds to a transaction. The transaction has the following format: units. ERC20 tokens transfers (transfer and transferFrom function calls in ERC20 contract) are indicated by token symbol. For example GUSD is Gemini USD stable coin. The JSON file erc20tokens.json given below contains the details of ERC20 tokens. Failed transactions are prefixed with "F-".
BLOCK TIME FORMAT: The block time file has the following format:
erc20tokens.json FILE: This file contains the list of popular ERC20 token contracts whose transfer/transferFrom transactions appear in the data files.
ERC20 token list: USDT TRYb XAUt BNB LEO LINK HT HEDG MKR CRO VEN INO PAX INB SNX REP MOF ZRX SXP OKB XIN OMG SAI HOT DAI EURS HPT BUSD USDC SUSD HDG QCAD PLUS BTCB WBTC cWBTC renBTC sBTC imBTC pBTC
IMPORTANT NOTE: Public Ethereum Mainnet blockchain data is open and can be obtained by connecting as a node on the blockchain or by using block explorer web sites such as http://etherscan.io . The downloaders and users of this dataset accept full responsibility for using the data in a manner compliant with GDPR and any other regulations. We provide the data as is and we cannot be held responsible for anything.
NOTE: If you use this dataset, please do not forget to add the DOI number to the citation. If you use our dataset in your research, please also cite our paper: https://link.springer.com/article/10.1007/s10586-021-03511-0
@article{kilic2022parallel,
title={Parallel Analysis of Ethereum Blockchain Transaction Data using Cluster Computing},
journal={Cluster Computing},
author={K{\i}l{\i}{\c{c}}, Baran and {\"O}zturan, Can and Sen, Alper},
year={2022},
month={Jan}
}
This dataset contains ether as well as popular ERC20 token transfer transactions extracted from the Ethereum Mainnet blockchain.
Only send ether, contract function call, contract deployment transactions are present in the dataset. Miner reward transactions are not currently included in the dataset.
Details of the datasets are given below:
FILENAME FORMAT:
The filenames have the following format:
eth-tx-
where
For example file eth-tx-1000000-1099999.txt.bz2 contains transactions from
block 1000000 to block 1099999 inclusive.
The files are compressed with bzip2. They can be uncompressed using command bunzip2.
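The files can also be read without decompressing them to disk first. The sketch below is a minimal example, assuming only what the example filename above shows: the regular expression is inferred from `eth-tx-1000000-1099999.txt.bz2` and is not the official filename specification, and the helper names are hypothetical.

```python
import bz2
import re
from typing import Iterator, Tuple

# Pattern inferred from the single example filename in the description;
# treat it as an assumption rather than the documented format.
FILENAME_PATTERN = re.compile(r"^eth-tx-(\d+)-(\d+)\.txt\.bz2$")


def block_range(filename: str) -> Tuple[int, int]:
    """Return the (first_block, last_block) pair encoded in a data filename."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    return int(match.group(1)), int(match.group(2))


def iter_transactions(path: str) -> Iterator[str]:
    """Yield one raw transaction line at a time from a bzip2-compressed file."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")


print(block_range("eth-tx-1000000-1099999.txt.bz2"))
```

Streaming line by line with `bz2.open` avoids holding a whole file in memory; each yielded string is left unparsed here because the field layout is not reproduced in this description.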
TRANSACTION FORMAT:
Each line in a file corresponds to a transaction. The transaction has the following format:
units. ERC20 tokens transfers (transfer and transferFrom function calls in ERC20
contract) are indicated by token symbol. For example GUSD is Gemini USD stable
coin. The JSON file erc20tokens.json given below contains the details of ERC20 tokens.
decoder-error.txt FILE:
This file contains transactions (block no, tx no, tx hash) on each line that produced
error while decoding calldata. These transactions are not present in the data files.
erc20tokens.json FILE:
This file contains the list of popular ERC20 token contracts whose transfer/transferFrom
transactions appear in the data files.
-------------------------------------------------------------------------------------------
[
{
"address": "0xdac17f958d2ee523a2206206994597c13d831ec7",
"decdigits": 6,
"symbol": "USDT",
"name": "Tether-USD"
},
{
"address": "0xB8c77482e45F1F44dE1745F52C74426C631bDD52",
"decdigits": 18,
"symbol": "BNB",
"name": "Binance"
},
{
"address": "0x2af5d2ad76741191d15dfe7bf6ac92d4bd912ca3",
"decdigits": 18,
"symbol": "LEO",
"name": "Bitfinex-LEO"
},
{
"address": "0x514910771af9ca656af840dff83e8264ecf986ca",
"decdigits": 18,
"symbol": "LNK",
"name": "Chainlink"
},
{
"address": "0x6f259637dcd74c767781e37bc6133cd6a68aa161",
"decdigits": 18,
"symbol": "HT",
"name": "HuobiToken"
},
{
"address": "0xf1290473e210b2108a85237fbcd7b6eb42cc654f",
"decdigits": 18,
"symbol": "HEDG",
"name": "HedgeTrade"
},
{
"address": "0x9f8f72aa9304c8b593d555f12ef6589cc3a579a2",
"decdigits": 18,
"symbol": "MKR",
"name": "Maker"
},
{
"address": "0xa0b73e1ff0b80914ab6fe0444e65848c4c34450b",
"decdigits": 8,
"symbol": "CRO",
"name": "Crypto.com"
},
{
"address": "0xd850942ef8811f2a866692a623011bde52a462c1",
"decdigits": 18,
"symbol": "VEN",
"name": "VeChain"
},
{
"address": "0x0d8775f648430679a709e98d2b0cb6250d2887ef",
"decdigits": 18,
"symbol": "BAT",
"name": "Basic-Attention"
},
{
"address": "0xc9859fccc876e6b4b3c749c5d29ea04f48acb74f",
"decdigits": 0,
"symbol": "INO",
"name": "INO-Coin"
},
{
"address": "0x8e870d67f660d95d5be530380d0ec0bd388289e1",
"decdigits": 18,
"symbol": "PAX",
"name": "Paxos-Standard"
},
{
"address": "0x17aa18a4b64a55abed7fa543f2ba4e91f2dce482",
"decdigits": 18,
"symbol": "INB",
"name": "Insight-Chain"
},
{
"address": "0xc011a72400e58ecd99ee497cf89e3775d4bd732f",
"decdigits": 18,
"symbol": "SNX",
"name": "Synthetix-Network"
},
{
"address": "0x1985365e9f78359a9B6AD760e32412f4a445E862",
"decdigits": 18,
"symbol": "REP",
"name": "Reputation"
},
{
"address": "0x653430560be843c4a3d143d0110e896c2ab8ac0d",
"decdigits": 16,
"symbol": "MOF",
"name": "Molecular-Future"
},
{
"address": "0x0000000000085d4780B73119b644AE5ecd22b376",
"decdigits": 18,
"symbol": "TUSD",
"name": "True-USD"
},
{
"address": "0xe41d2489571d322189246dafa5ebde1f4699f498",
"decdigits": 18,
"symbol": "ZRX",
"name": "ZRX"
},
{
"address": "0x8ce9137d39326ad0cd6491fb5cc0cba0e089b6a9",
"decdigits": 18,
"symbol": "SXP",
"name": "Swipe"
},
{
"address": "0x75231f58b43240c9718dd58b4967c5114342a86c",
"decdigits": 18,
"symbol": "OKB",
"name": "Okex"
},
{
"address": "0xa974c709cfb4566686553a20790685a47aceaa33",
"decdigits": 18,
"symbol": "XIN",
"name": "Mixin"
},
{
"address": "0xd26114cd6EE289AccF82350c8d8487fedB8A0C07",
"decdigits": 18,
"symbol": "OMG",
"name": "OmiseGO"
},
{
"address": "0x89d24a6b4ccb1b6faa2625fe562bdd9a23260359",
"decdigits": 18,
"symbol": "SAI",
"name": "Sai Stablecoin v1.0"
},
{
"address": "0x6c6ee5e31d828de241282b9606c8e98ea48526e2",
"decdigits": 18,
"symbol": "HOT",
"name": "HoloToken"
},
{
"address": "0x6b175474e89094c44da98b954eedeac495271d0f",
"decdigits": 18,
"symbol": "DAI",
"name": "Dai Stablecoin"
},
{
"address": "0xdb25f211ab05b1c97d595516f45794528a807ad8",
"decdigits": 2,
"symbol": "EURS",
"name": "Statis-EURS"
},
{
"address": "0xa66daa57432024023db65477ba87d4e7f5f95213",
"decdigits": 18,
"symbol": "HPT",
"name": "HuobiPoolToken"
},
{
"address": "0x4fabb145d64652a948d72533023f6e7a623c7c53",
"decdigits": 18,
"symbol": "BUSD",
"name": "Binance-USD"
},
{
"address": "0x056fd409e1d7a124bd7017459dfea2f387b6d5cd",
"decdigits": 2,
"symbol": "GUSD",
"name": "Gemini-USD"
},
{
"address": "0x2c537e5624e4af88a7ae4060c022609376c8d0eb",
"decdigits": 6,
"symbol": "TRYB",
"name": "BiLira"
},
{
"address": "0x4922a015c4407f87432b179bb209e125432e4a2a",
"decdigits": 6,
"symbol": "XAUT",
"name": "Tether-Gold"
},
{
"address": "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48",
"decdigits": 6,
"symbol": "USDC",
"name": "USD-Coin"
},
{
"address": "0xa5b55e6448197db434b92a0595389562513336ff",
"decdigits": 16,
"symbol": "SUSD",
"name": "Santender"
},
{
"address": "0xffe8196bc259e8dedc544d935786aa4709ec3e64",
"decdigits": 18,
"symbol": "HDG",
"name": "HedgeTrade"
},
{
"address": "0x4a16baf414b8e637ed12019fad5dd705735db2e0",
"decdigits": 2,
"symbol": "QCAD",
"name": "QCAD"
}
]
-------------------------------------------------------------------------------------------
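One possible use of the token list is to look up the number of decimal digits for a token symbol and scale a raw integer amount into whole-token units. The sketch below embeds two entries copied from the list above. It assumes that `decdigits` is the token's number of decimal places and that amounts are given as raw integers; the description does not state whether the data files already apply this scaling, so check that before using it.

```python
import json
from decimal import Decimal

# Two entries copied from the erc20tokens.json listing above.
TOKENS_JSON = """
[
  {"address": "0xdac17f958d2ee523a2206206994597c13d831ec7",
   "decdigits": 6, "symbol": "USDT", "name": "Tether-USD"},
  {"address": "0x056fd409e1d7a124bd7017459dfea2f387b6d5cd",
   "decdigits": 2, "symbol": "GUSD", "name": "Gemini-USD"}
]
"""


def load_decimals(text: str) -> dict:
    """Map each token symbol to its decdigits value."""
    return {token["symbol"]: token["decdigits"] for token in json.loads(text)}


def to_token_units(raw_amount: int, symbol: str, decimals: dict) -> Decimal:
    """Scale a raw integer amount by 10**-decdigits (assumed meaning of decdigits)."""
    return Decimal(raw_amount).scaleb(-decimals[symbol])


decimals = load_decimals(TOKENS_JSON)
print(to_token_units(1500000, "USDT", decimals))
print(to_token_units(1234, "GUSD", decimals))
```

`Decimal` is used instead of `float` because tokens with 18 decimal digits exceed the precision of a 64-bit float.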
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Groceries dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/heeraldedhia/groceries-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy.
Association rules are widely used to analyze retail basket or transaction data. They are intended to identify strong rules in the transaction data using measures of interestingness.
The dataset has 38765 rows of purchase orders from grocery stores. These orders can be analysed, and association rules can be generated, using Market Basket Analysis with algorithms such as Apriori.
Apriori is an algorithm for frequent itemset mining and association rule learning over relational databases. It proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database. The frequent itemsets determined by Apriori can be used to determine association rules which highlight general trends in the database: this has applications in domains such as market basket analysis.
Assume there are 100 customers; 10 of them bought milk, 8 bought butter and 6 bought both. For the rule bought milk => bought butter:
support = P(Milk & Butter) = 6/100 = 0.06
confidence = support/P(Milk) = 0.06/0.10 = 0.6
lift = confidence/P(Butter) = 0.6/0.08 = 7.5
Note: this example is extremely small. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
Support: This says how popular an itemset is, as measured by the proportion of transactions in which an itemset appears.
Confidence: This says how likely item Y is purchased when item X is purchased, expressed as {X -> Y}. This is measured by the proportion of transactions with item X, in which item Y also appears.
Lift: This says how likely item Y is purchased when item X is purchased while controlling for how popular item Y is.
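The three measures defined above can be computed directly from counts. The sketch below uses the counts from the milk and butter example (100 customers, 10 bought milk, 8 bought butter, 6 bought both) and evaluates the rule in both directions, since confidence depends on the direction of the rule while support and lift do not. The function name `rule_metrics` is illustrative only.

```python
from fractions import Fraction


def rule_metrics(n_total: int, n_x: int, n_y: int, n_both: int):
    """Return (support, confidence, lift) for the rule X -> Y.

    support    = P(X and Y)
    confidence = P(X and Y) / P(X)
    lift       = confidence / P(Y)
    """
    support = Fraction(n_both, n_total)
    confidence = Fraction(n_both, n_x)
    lift = confidence / Fraction(n_y, n_total)
    return support, confidence, lift


# Counts from the example: X = milk, Y = butter, then the reverse rule.
milk_to_butter = rule_metrics(n_total=100, n_x=10, n_y=8, n_both=6)
butter_to_milk = rule_metrics(n_total=100, n_x=8, n_y=10, n_both=6)

for name, (support, confidence, lift) in (
    ("milk -> butter", milk_to_butter),
    ("butter -> milk", butter_to_milk),
):
    print(f"{name}: support={float(support):.2f} "
          f"confidence={float(confidence):.2f} lift={float(lift):.1f}")
```

Exact fractions are used so that the results are not affected by floating-point rounding. A lift well above 1, as here, indicates that the two items are bought together far more often than would be expected if the purchases were independent.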
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An example of TRTH intraday top-of-book transaction data for a single Johannesburg Stock Exchange (JSE) listed equity. The data is for teaching, learning and research projects sourced from the legacy Tick History v1 SOAP API interface from https://tickhistory.thomsonreuters.com/TickHistory in May 2016. Related raw data and similar data-structures can now be accessed using Tick History v2 and the REST API https://hosted.datascopeapi.reuters.com/RestApi/v1.
Configuration control: the test dataset contains 16 CSV files with names: "
Attributes: The data set is for the ticker: AGLJ.J from May 2010 until May 2016. The files include the following attributes: RIC, Local Date-Time, Event Type, Price at the Event, Volume at the Event, Best Bid Changes, Best Ask Changes, and Trade Event Sign: RIC, DateTimeL, Type, Price, Volume, L1 Bid, L1 Ask, Trade Sign. The Local Date-Time (DateTimeL) is a serial date number where 1 corresponds to Jan-1-0000, for example, 736333.382013 corresponds to 4-Jan-2016 09:10:05 (or 20160104T091005 in ISO 8601 format). The trade event sign (Trade Sign) indicates whether the transaction was buyer (or seller) initiated as +1 (-1) and was prepared using the method of Lee and Ready (2008).
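The DateTimeL convention described above (day 1 corresponds to Jan-1-0000, with the time of day as the fractional part) can be converted to a calendar timestamp as sketched below. The 366-day offset is an assumption that follows from Python's `datetime` ordinals starting at 1 for 1 January of year 1 while this convention starts one (leap) year earlier; it reproduces the example value given in the text.

```python
from datetime import datetime, timedelta

# Python ordinal 1 is 0001-01-01; the serial date number described above
# starts at Jan-1-0000, one 366-day year earlier (assumed offset).
SERIAL_TO_ORDINAL_OFFSET = 366


def serial_to_datetime(serial: float) -> datetime:
    """Convert a serial date number (days since Jan-1-0000, day 1) to datetime."""
    whole_days = int(serial)
    day_fraction = serial - whole_days
    date_part = datetime.fromordinal(whole_days - SERIAL_TO_ORDINAL_OFFSET)
    return date_part + timedelta(days=day_fraction)


example = serial_to_datetime(736333.382013)
# Truncate sub-second precision for display in the ISO 8601 basic format.
print(example.replace(microsecond=0).strftime("%Y%m%dT%H%M%S"))
```

Because the serial number is stored as a floating-point value with six decimal places, the recovered time is only accurate to a fraction of a second, so sub-second digits should not be relied on.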
Disclaimer: The data is not up-to-date, is incomplete, it has been pre-processed; as such it is not fit for any other purpose than teaching and learning, and algorithm testing. For complete, up-to-date, and error-free data please use the Tick History v2 interface directly.
Research Objectives: The data has been used to build empirical evidence in support of hierarchical causality and universality in financial markets by considering price impact on different time and averaging scales, feature selection on different scales as inputs into scale dependent machine learning applications, and for various aspects of agent-based model calibration and market ecology studies on different time and averaging scales.
Acknowledgements to: Diane Wilcox, Dieter Hendricks, Michael Harvey, Fayyaaz Loonat, Michael Gant, Nicholas Murphy and Donovan Platt.
The FHFA House Price Index (FHFA HPI®) is the nation’s only collection of public, freely available house price indexes that measure changes in single-family home values based on data from all 50 states and over 400 American cities that extend back to the mid-1970s. The FHFA HPI incorporates tens of millions of home sales and offers insights about house price fluctuations at the national, census division, state, metro area, county, ZIP code, and census tract levels. FHFA uses a fully transparent methodology based upon a weighted, repeat-sales statistical technique to analyze house price transaction data.
What does the FHFA HPI represent?
The FHFA HPI is a broad measure of the movement of single-family house prices. The FHFA HPI is a weighted, repeat-sales index, meaning that it measures average price changes in repeat sales or refinancings on the same properties. This information is obtained by reviewing repeat mortgage transactions on single-family properties whose mortgages have been purchased or securitized by Fannie Mae or Freddie Mac since January 1975. The FHFA HPI serves as a timely, accurate indicator of house price trends at various geographic levels. Because of the breadth of the sample, it provides more information than is available in other house price indexes. It also provides housing economists with an improved analytical tool that is useful for estimating changes in the rates of mortgage defaults, prepayments and housing affordability in specific geographic areas.
U.S. Federal Housing Finance Agency, All-Transactions House Price Index for Connecticut [CTSTHPI], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/CTSTHPI, August 2, 2023.
https://creativecommons.org/publicdomain/zero/1.0/
ANZ Banking data. It contains financial transaction data of ANZ Bank.
Financial Transaction data from ANZ bank program on Forage.
Data contains financial transactions and sample banking data.
Daily cryptocurrency data (transaction count, on-chain transaction volume, value of created coins, price, market cap, and exchange volume) in CSV format. The data sample stretches back to December 2013. Daily on-chain transaction volume is calculated as the sum of all transaction outputs belonging to the blocks mined on the given day. “Change” outputs are not included. Transaction count figure doesn’t include coinbase transactions. Zcash figures for on-chain volume and transaction count reflect data collected for transparent transactions only. In the last month, 10.5% (11/18/17) of ZEC transactions were shielded, and these are excluded from the analysis due to their private nature. Thus transaction volume figures in reality are higher than the estimate presented here, and NVT and exchange to transaction value lower. Data on shielded and transparent transactions can be found here and here. Decred data doesn’t include tickets and voting transactions. Monero transaction volume is impossible to calculate due to RingCT which hides transaction amounts.
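The aggregation rules above can be sketched in a few lines. This is only an illustration under stated assumptions, not the data provider's pipeline: the `Output`, `Transaction`, and `Block` classes and the `is_change` and `is_coinbase` flags are hypothetical, and how change outputs are actually identified is not described in the source. The sketch reads the description literally, so coinbase outputs still count toward volume while coinbase transactions are left out of the transaction count.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Output:
    value: float
    is_change: bool = False  # hypothetical flag; detection method not specified


@dataclass
class Transaction:
    outputs: list
    is_coinbase: bool = False  # hypothetical flag


@dataclass
class Block:
    mined_on: date
    transactions: list


def daily_onchain_metrics(blocks):
    """Return {day: (volume, tx_count)} for the blocks mined on each day."""
    volume = defaultdict(float)
    tx_count = defaultdict(int)
    for block in blocks:
        for tx in block.transactions:
            # Volume: sum of all outputs, excluding "change" outputs.
            volume[block.mined_on] += sum(
                out.value for out in tx.outputs if not out.is_change
            )
            # Count: coinbase transactions are not included.
            if not tx.is_coinbase:
                tx_count[block.mined_on] += 1
    return {day: (volume[day], tx_count[day]) for day in volume}


day = date(2017, 11, 18)
blocks = [
    Block(day, [
        Transaction([Output(12.5)], is_coinbase=True),
        Transaction([Output(3.0), Output(1.5, is_change=True)]),
    ])
]
print(daily_onchain_metrics(blocks))
```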
https://creativecommons.org/publicdomain/zero/1.0/
Greetings, fellow analysts!
(NOTE: This is a random dataset generated using Python. It bears no resemblance to any real entity in the corporate world. Any resemblance is a matter of coincidence.)
REC-SSEC Bank is a govt-aided bank operating in the Indian Peninsula. It has regional branches in more than 40 regions of the country. You have been provided with a massive Excel sheet containing the transaction details, the total transaction amount, the location, and the total transaction count.
The dataset is described as follows:
For example, the very first row can be read as: "On the first of January 2022, 1932 transactions summing up to INR 365554 from Bhuj were reported." NOTE: There are about 2750 transactions every single day. All of this has been given to you.
The bank wants you to answer the following questions:
BatchData's Deed Dataset - Real Estate Transaction Data + Property Transaction Data
Unlock a wealth of historical real estate insights with BatchData's Deed Dataset. This premium offering provides detailed real estate transaction data, including comprehensive property transaction records with over 15 critical data points. Whether you're analyzing market trends, assessing investment opportunities, or conducting in-depth property research, this dataset delivers the granular information you need.
Why Choose BatchData?
At BatchData, we are committed to delivering the most accurate and comprehensive datasets in the industry. Our Deed Dataset exemplifies our dedication to quality and precision:
Comprehensive Datasets: As a single-vendor provider, we offer an extensive array of data including property, homeowner, mortgage, listing, valuation, permit, demographic, foreclosure, and contact information. All this is available from one reliable source, streamlining your data acquisition process.
Technical Excellence: Our dataset comes with clear documentation, purpose-built APIs, and extensive developer resources. Our technical teams are supported by robust engineering resources to ensure seamless integration and utilization.
Tailor-Fit Pricing and Packaging: We understand that different businesses have different needs. That’s why we offer flexible pricing models and practical API metering. You only pay for the data you need, making our solutions scalable and aligned with your business objectives.
Unmatched Contact Information Accuracy: We lead the industry with superior right-party contact rates, ensuring you get multiple accurate contact points, including highly reliable phone numbers.
Choose BatchData for your real estate data needs and experience unparalleled accuracy and flexibility in data solutions.
Daily cryptocurrency data (transaction count, on-chain transaction volume, value of created coins, price, market cap, and exchange volume) in CSV format. The data sample stretches back to December 2013. Daily on-chain transaction volume is calculated as the sum of all transaction outputs belonging to the blocks mined on the given day. “Change” outputs are not included. Transaction count figure doesn’t include coinbase transactions. Zcash figures for on-chain volume and transaction count reflect data collected for transparent transactions only. In the last month, 10.5% (11/18/17) of ZEC transactions were shielded, and these are excluded from the analysis due to their private nature. Thus transaction volume figures in reality are higher than the estimate presented here, and NVT and exchange to transaction value lower. Data on shielded and transparent transactions can be found here and here. Decred data doesn’t include tickets and voting transactions. Monero transaction volume is impossible to calculate due to RingCT which hides transaction amounts.
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy.
Association Rules are widely used to analyze retail basket or transaction data. They are intended to identify strong rules in transaction data using measures of interestingness.
The dataset has 38765 rows of purchase orders from grocery stores. These orders can be analysed, and association rules can be generated, using Market Basket Analysis with algorithms such as the Apriori Algorithm.
Apriori is an algorithm for frequent itemset mining and association rule learning over relational databases. It proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database. The frequent itemsets determined by Apriori can be used to determine association rules which highlight general trends in the database: this has applications in domains such as market basket analysis.
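The level-wise procedure described above can be sketched as follows. This is a minimal, unoptimized illustration of the idea rather than a reference implementation: the function name `apriori`, the `min_support` parameter, and the toy baskets are assumptions made for the example, and support is recomputed by scanning every transaction.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Proportion of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent individual items.
    items = {item for t in transactions for item in t}
    current = {frozenset([item]) for item in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


baskets = [
    {"milk", "butter"},
    {"milk", "bread"},
    {"milk", "butter", "bread"},
    {"bread"},
]
for itemset, sup in sorted(apriori(baskets, 0.5).items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search tractable: an itemset cannot be frequent unless all of its subsets are, so most candidates are discarded before any support counting.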
Assume there are 100 customers: 10 of them bought milk, 8 bought butter, and 6 bought both. For the rule bought milk => bought butter: support = P(Milk & Butter) = 6/100 = 0.06; confidence = support/P(Milk) = 0.06/0.10 = 0.60; lift = confidence/P(Butter) = 0.60/0.08 = 7.5.
Note: this example is extremely small. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
Support: This says how popular an itemset is, as measured by the proportion of transactions in which an itemset appears.
Confidence: This says how likely item Y is purchased when item X is purchased, expressed as {X -> Y}. This is measured by the proportion of transactions with item X, in which item Y also appears.
Lift: This says how likely item Y is purchased when item X is purchased while controlling for how popular item Y is.
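The three measures defined above can be computed directly from transaction counts. The sketch below is illustrative only: the function name `rule_metrics` and its parameters are hypothetical, and it follows the definitions as stated, with confidence taken relative to transactions containing X and lift dividing confidence by the overall popularity of Y.

```python
def rule_metrics(n_total, n_x, n_y, n_both):
    """Support, confidence, and lift for the rule {X -> Y}, from raw counts."""
    support = n_both / n_total           # share of transactions with X and Y
    confidence = n_both / n_x            # share of X-transactions that contain Y
    lift = confidence / (n_y / n_total)  # confidence relative to Y's popularity
    return support, confidence, lift


# Counts from the milk and butter example: X = milk, Y = butter.
support, confidence, lift = rule_metrics(n_total=100, n_x=10, n_y=8, n_both=6)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests that X and Y appear together more often than would be expected if they were bought independently.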
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset consists of three files: a file with behaviour data (events.csv), a file with item properties (item_properties.csv), and a file that describes the category tree (category_tree.csv). The data has been collected from a real-world ecommerce website. It is raw data, i.e. without any content transformations; however, all values are hashed for confidentiality reasons. The purpose of publishing is to motivate research in the field of recommender systems with implicit feedback.
The behaviour data, i.e. events like clicks, add to carts, transactions, represent interactions that were collected over a period of 4.5 months. A visitor can make three types of events, namely “view”, “addtocart” or “transaction”. In total there are 2 756 101 events including 2 664 312 views, 69 332 add to carts and 22 457 transactions produced by 1 407 580 unique visitors. For about 90% of events corresponding properties can be found in the “item_properties.csv” file.
For example:
The file with item properties (item_properties.csv) includes 20 275 902 rows, i.e. different properties, describing 417 053 unique items. The file is split into 2 files due to file size limitations. Since a property of an item can vary in time (e.g., price changes over time), every row in the file has a corresponding timestamp. In other words, the file consists of concatenated snapshots for every week in the file with the behaviour data. However, if a property of an item is constant over the observed period, only a single snapshot value will be present in the file. For example, suppose we have three properties for a single item and 4 weekly snapshots, as below:
timestamp,itemid,property,value
1439694000000,1,100,1000
1439695000000,1,100,1000
1439696000000,1,100,1000
1439697000000,1,100,1000
1439694000000,1,200,1000
1439695000000,1,200,1100
1439696000000,1,200,1200
1439697000000,1,200,1300
1439694000000,1,300,1000
1439695000000,1,300,1000
1439696000000,1,300,1100
1439697000000,1,300,1100
After the snapshot merge it would look like:
1439694000000,1,100,1000
1439694000000,1,200,1000
1439695000000,1,200,1100
1439696000000,1,200,1200
1439697000000,1,200,1300
1439694000000,1,300,1000
1439696000000,1,300,1100
This is because property=100 is constant over time, property=200 has a different value in every snapshot, and property=300 changes once.
The item properties file contains a timestamp column because all properties are time dependent: they may change over time, e.g. price, category, etc. Initially, this file consisted of snapshots for every week in the events file and contained over 200 million rows. We have merged consecutive constant property values, so it has changed from snapshot form to change log form. Thus, constant values appear only once in the file. This has reduced the number of rows by a factor of about 10.
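The merge from snapshot form to change log form can be illustrated with a short sketch. This is not the publisher's actual procedure, only one way to reproduce the example: the function name `snapshots_to_changelog` is hypothetical, and rows are assumed to be `(timestamp, itemid, property, value)` tuples. A row is kept only when its value differs from the previous snapshot of the same item and property.

```python
def snapshots_to_changelog(rows):
    """Collapse weekly snapshots into a change log.

    rows: iterable of (timestamp, itemid, property, value) tuples.
    """
    last_value = {}
    changelog = []
    # Process each (itemid, property) series in timestamp order.
    for ts, itemid, prop, value in sorted(rows, key=lambda r: (r[1], r[2], r[0])):
        key = (itemid, prop)
        if key not in last_value or last_value[key] != value:
            changelog.append((ts, itemid, prop, value))
            last_value[key] = value
    return changelog


# The three-property, four-snapshot example from the description.
snapshots = [
    (1439694000000, 1, 100, 1000), (1439695000000, 1, 100, 1000),
    (1439696000000, 1, 100, 1000), (1439697000000, 1, 100, 1000),
    (1439694000000, 1, 200, 1000), (1439695000000, 1, 200, 1100),
    (1439696000000, 1, 200, 1200), (1439697000000, 1, 200, 1300),
    (1439694000000, 1, 300, 1000), (1439695000000, 1, 300, 1000),
    (1439696000000, 1, 300, 1100), (1439697000000, 1, 300, 1100),
]
changelog = snapshots_to_changelog(snapshots)
for row in changelog:
    print(",".join(str(field) for field in row))
```

To look up a property value at a given event time in the change log form, take the latest row for that item and property whose timestamp does not exceed the event timestamp.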
All values in the “item_properties.csv” file excluding "categoryid" and "available" properties were hashed. Value of the "categoryid" property contains item category identifier. Value of the "available" property contains availability of the item, i.e. 1 means the item was available, otherwise 0. All numerical values were marked with "n" char at the beginning, and have 3 digits precision after decimal point, e.g., "5" will become "n5.000", "-3.67584" will become "n-3.675". All words in text values were normalized (stemming procedure: https://en.wikipedia.org/wiki/Stemming) and hashed, numbers were processed as above, e.g. text "Hello world 2017!" will become "24214 44214 n2017.000"
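The numeric marking described above can be sketched as a pair of helpers. This is an assumption-laden illustration, not the publisher's code: the names `encode_number` and `decode_number` are hypothetical, and truncation toward zero (rather than rounding) is inferred solely from the example "-3.67584" becoming "n-3.675".

```python
from decimal import Decimal, ROUND_DOWN

THREE_PLACES = Decimal("0.001")


def encode_number(text: str) -> str:
    """Mark a numeric value with a leading 'n' and 3 decimal places.

    Assumption: extra digits are truncated toward zero, not rounded.
    """
    value = Decimal(text).quantize(THREE_PLACES, rounding=ROUND_DOWN)
    return f"n{value}"


def decode_number(token: str) -> Decimal:
    """Recover the numeric value from a token such as 'n5.000'."""
    if not token.startswith("n"):
        raise ValueError(f"not a numeric token: {token!r}")
    return Decimal(token[1:])


print(encode_number("5"), encode_number("-3.67584"), encode_number("2017"))
```

When reading a property value, tokens that start with "n" can be decoded this way, while the remaining tokens are hashed words and should be treated as opaque identifiers.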
The category tree file has 1669 rows. Every row in the file specifies a child categoryId and the corresponding parent. For example:
Retail Rocket (retailrocket.io) helps web shoppers make better shopping decisions by providing personalized real-time recommendations through multiple channels with over 100MM unique monthly users and 1000+ retail partners over the world.
Envestnet®| Yodlee®'s Electronic Payment Data (Aggregate/Row) Panels consist of de-identified, near-real time (T+1) USA credit/debit/ACH transaction level data – offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet®| Yodlee®'s financial technology platform.
Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.
We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.
Investors, corporate researchers, and corporates can use our data to answer some key business questions such as:
- How much are consumers spending with specific merchants/brands and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company’s market share in a region for similar customers?
- Is the company’s loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?
Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granular (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.
Use Cases Categories (our data supports a very wide range of use cases, and we look forward to working with new ones):
1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking
2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)
3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence
4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis