100+ datasets found
  1. Envestnet | Yodlee's De-Identified Retail Transaction Data | Row/Aggregate...

    • datarade.ai
    .sql, .txt
    + more versions
    Cite
    Envestnet | Yodlee, Envestnet | Yodlee's De-Identified Retail Transaction Data | Row/Aggregate Level | USA Consumer Data covering 3600+ corporations | 90M+ Accounts [Dataset]. https://datarade.ai/data-products/envestnet-yodlee-s-retail-transaction-data-row-aggregate-envestnet-yodlee
    Available download formats: .sql, .txt
    Dataset provided by
    Envestnet (http://envestnet.com/)
    Yodlee
    Authors
    Envestnet | Yodlee
    Area covered
    United States of America
    Description

    Envestnet® | Yodlee®'s Retail Transaction Data (Aggregate/Row) Panels consist of de-identified, near-real-time (T+1) USA credit/debit/ACH transaction-level data, offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users of the aggregation portion of the Envestnet® | Yodlee® financial technology platform.

    Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.

    We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.

    Investors, corporate researchers, and corporates can use our data to answer key business questions such as:
    - How much are consumers spending with specific merchants/brands, and how is that changing over time?
    - Is the share of consumer spend at a specific merchant increasing or decreasing?
    - How are consumers reacting to new products or services launched by merchants?
    - For loyal customers, how is the share of spend changing over time?
    - What is the company’s market share in a region for similar customers?
    - Is the company’s loyal user base increasing or decreasing?
    - Is the lifetime customer value increasing or decreasing?

    Additional use cases:
    - Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, and can view customer behavior within a geographic market at the state or city level.
    - Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
    - Study the effects of inflation via metrics such as increased total spend, ticket size, and number of transactions.
    - Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.

    Use case categories (our data supports countless use cases, and we look forward to working with new ones): 1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking

    2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)

    3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence

    4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis

  2. Data Broker Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 15, 2025
    Cite
    Data Insights Market (2025). Data Broker Report [Dataset]. https://www.datainsightsmarket.com/reports/data-broker-1456225
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jun 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global data broker market is experiencing robust growth, driven by the increasing demand for personalized services across various sectors. The market's expansion is fueled by the proliferation of data from diverse sources, the rise of big data analytics, and the escalating need for accurate consumer insights for targeted marketing and risk assessment. Companies are leveraging data broker services to enhance customer understanding, optimize marketing campaigns, improve fraud detection, and personalize user experiences. The growing adoption of cloud-based solutions and advanced analytics further accelerates market expansion. While data privacy regulations and concerns about data security pose challenges, the market continues to thrive due to the crucial role data brokers play in various business operations. We estimate the market size in 2025 at approximately $150 billion, based on observed growth in related sectors such as data analytics and marketing technology. A projected compound annual growth rate (CAGR) of 15% from 2025 to 2033 suggests the market will reach a significant size by the end of the forecast period, driven by ongoing technological advancements and increasing data availability.

    The competitive landscape is characterized by a mix of established players and emerging businesses. Major players such as Acxiom, Experian, Equifax, and TransUnion dominate the market, leveraging their extensive data networks and advanced analytical capabilities. However, smaller companies and innovative startups are challenging the incumbents by focusing on niche segments and developing specialized data aggregation and analytics tools. The market is fragmented, with companies competing on data quality, accuracy, compliance, and breadth of services. Strategic partnerships and acquisitions are expected to intensify as companies strive to expand their data portfolios and enhance their technological capabilities.

    The regional distribution is expected to reflect established economic patterns, with North America and Europe holding a significant market share initially, while growth in Asia-Pacific and other developing regions is anticipated to contribute significantly in the coming years.

  3. Global Healthcare Data Aggregation Services Market Revenue Forecasts...

    • statsndata.org
    excel, pdf
    Updated May 2025
    Cite
    Stats N Data (2025). Global Healthcare Data Aggregation Services Market Revenue Forecasts 2025-2032 [Dataset]. https://www.statsndata.org/report/healthcare-data-aggregation-services-market-274859
    Available download formats: excel, pdf
    Dataset updated
    May 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Healthcare Data Aggregation Services market has emerged as a crucial component in the evolving landscape of healthcare management, driven by the increasing volume of health-related data generated daily. Healthcare data aggregation involves the collection, integration, and analysis of disparate data sources to pr

  4. Replication Data for: Lost in Aggregation: Improving Event Analysis with...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Cook, Scott; Weidmann, Nils (2023). Replication Data for: Lost in Aggregation: Improving Event Analysis with Report-Level Data [Dataset]. http://doi.org/10.7910/DVN/OOIEAO
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Cook, Scott; Weidmann, Nils
    Description

    Most measures of social conflict processes are derived from primary and secondary source reports. In many cases, reports are used to create event-level data sets by aggregating information from multiple, and often conflicting, reports to single event observations. We argue this pre-aggregation is less innocuous than it seems, costing applied researchers opportunities for improved inference. First, researchers cannot evaluate the consequences of different methods of report aggregation. Second, aggregation discards report-level information (i.e., variation across reports) that is useful in addressing measurement error inherent in event data. Therefore, we advocate that data should be supplied and analyzed at the report level. We demonstrate the consequences of using aggregated event data as a predictor or outcome variable, and how analysis can be improved using report-level information directly. These gains are demonstrated with simulated-data experiments and in the analysis of real-world data, using the newly available Mass Mobilization in Autocracies Database (MMAD).

  5. Additional file 2 of A data driven learning approach for the assessment of...

    • springernature.figshare.com
    txt
    Updated Jun 4, 2023
    Cite
    Erik Tute; Nagarajan Ganapathy; Antje Wulff (2023). Additional file 2 of A data driven learning approach for the assessment of data quality [Dataset]. http://doi.org/10.6084/m9.figshare.16916706.v1
    Available download formats: txt
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Erik Tute; Nagarajan Ganapathy; Antje Wulff
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2. Data generation.

  6. FHFA Data: Uniform Appraisal Dataset Aggregate Statistics

    • datalumos.org
    • openicpsr.org
    Updated Feb 18, 2025
    + more versions
    Cite
    Federal Housing Finance Agency (2025). FHFA Data: Uniform Appraisal Dataset Aggregate Statistics [Dataset]. http://doi.org/10.3886/E219961V1
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    Federal Housing Finance Agency (https://www.fhfa.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2013 - 2024
    Area covered
    United States of America
    Description

    The Uniform Appraisal Dataset (UAD) Aggregate Statistics Data File and Dashboards are the nation’s first publicly available datasets of aggregate statistics on appraisal records, giving the public new access to a broad set of data points and trends found in appraisal reports. The UAD Aggregate Statistics for Enterprise Single-Family, Enterprise Condominium, and Federal Housing Administration (FHA) Single-Family appraisals may be grouped by neighborhood characteristics, property characteristics, and different geographic levels.

    Documentation:
    - Overview (10/28/2024)
    - Data Dictionary (10/28/2024)
    - Data File Version History and Suppression Rates (12/18/2024)
    - Dashboard Guide (2/3/2025)

    UAD Aggregate Statistics Dashboards: The Dashboards are the visual front end of the UAD Aggregate Statistics Data File, designed to provide easy access to customized maps and charts for all levels of users. Access the UAD Aggregate Statistics Dashboards here.

    UAD Aggregate Statistics Datasets. Notes: Some of the data files are relatively large and will not open correctly in certain software packages, such as Microsoft Excel. All the files can be opened and used in data analytics software such as SAS, Python, or R. All CSV files are zipped.
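    Since the notes above say the CSV files are zipped and too large for Excel, a minimal Python sketch for streaming one directly from the archive (the file and column names below are hypothetical, not taken from the data dictionary) might look like:

    ```python
    import csv
    import io
    import zipfile

    def read_zipped_csv(zip_path):
        """Read the first CSV found inside a zip archive into a list of dict rows.

        Streams the member without extracting it to disk; for files too large
        for memory, iterate over `reader` instead of materializing the list.
        """
        with zipfile.ZipFile(zip_path) as zf:
            # Pick the first member that looks like a CSV file.
            csv_name = next(n for n in zf.namelist() if n.lower().endswith(".csv"))
            with zf.open(csv_name) as raw:
                reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8"))
                return list(reader)
    ```

    For the genuinely large files, chunked iteration (or a tool like pandas or R's readr, as the notes suggest) avoids spreadsheet row limits entirely.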

  7. Data Entry Service Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 23, 2024
    Cite
    Dataintelo (2024). Data Entry Service Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-entry-service-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 23, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Entry Service Market Outlook



    The global data entry service market size is poised to experience significant growth, with the market expected to rise from USD 2.5 billion in 2023 to USD 4.8 billion by 2032, achieving a Compound Annual Growth Rate (CAGR) of 7.5% over the forecast period. This growth can be attributed to several factors including the increasing adoption of digital technologies, the rising demand for data accuracy and integrity, and the need for businesses to manage vast amounts of data efficiently.
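    The figures above (USD 2.5 billion in 2023 to USD 4.8 billion by 2032) can be checked against the stated 7.5% CAGR with the standard compound-growth formula; this is a generic sanity check, not anything specific to the report:

    ```python
    def cagr(start_value, end_value, years):
        """Compound annual growth rate implied by growing from start_value
        to end_value over `years` annual periods."""
        return (end_value / start_value) ** (1.0 / years) - 1.0

    # Report figures: USD 2.5B in 2023 -> USD 4.8B in 2032, i.e. 9 periods.
    implied = cagr(2.5, 4.8, 2032 - 2023)
    print(f"implied CAGR: {implied:.1%}")  # ~7.5%, matching the stated rate
    ```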



    One of the key growth factors driving the data entry service market is the rapid digital transformation across various industries. As businesses continue to digitize their operations, the volume of data generated has increased exponentially. This data needs to be accurately entered, processed, and managed to derive meaningful insights. The demand for data entry services has surged as companies seek to outsource these non-core activities, enabling them to focus on their primary business operations. Additionally, the widespread adoption of cloud-based solutions and big data analytics has further fueled the demand for efficient data management services.



    Another significant driver of market growth is the increasing need for data accuracy and integrity. Inaccurate or incomplete data can lead to poor decision-making, financial losses, and a decrease in operational efficiency. Organizations are increasingly recognizing the importance of maintaining high-quality data and are investing in data entry services to ensure that their databases are accurate, up-to-date, and reliable. This is particularly crucial for industries such as healthcare, BFSI, and retail, where precise data is essential for regulatory compliance, customer relationship management, and operational efficiency.



    The cost-effectiveness of outsourcing data entry services is also contributing to market growth. By outsourcing these tasks to specialized service providers, organizations can save on labor costs, reduce operational expenses, and improve productivity. Service providers often have access to advanced tools and technologies, as well as skilled professionals who can perform data entry tasks more efficiently and accurately. This not only leads to cost savings but also allows businesses to reallocate resources to more strategic activities, driving overall growth.



    From a regional perspective, the Asia Pacific region is expected to witness the highest growth in the data entry service market during the forecast period. This can be attributed to the region's strong IT infrastructure, the presence of numerous outsourcing service providers, and the growing adoption of digital technologies across various industries. North America and Europe are also significant markets, driven by the high demand for data management services in sectors such as healthcare, BFSI, and retail. The Middle East & Africa and Latin America are anticipated to experience steady growth, supported by increasing investments in digital infrastructure and the rising awareness of the benefits of data entry services.



    Service Type Analysis



    The data entry service market can be segmented into various service types, including online data entry, offline data entry, data processing, data conversion, data cleansing, and others. Each of these service types plays a crucial role in ensuring the accuracy, integrity, and usability of data. Online data entry services involve entering data directly into an online system or database, which is essential for real-time data management and accessibility. This service type is particularly popular in industries such as e-commerce, where timely and accurate data entry is critical for inventory management and customer service.



    Offline data entry services, on the other hand, involve entering data into offline systems or databases, which are later synchronized with online systems. This service type is often used in industries where internet connectivity may be unreliable or where data security is a primary concern. Offline data entry is also essential for processing historical data or data that is collected through physical forms and documents. The demand for offline data entry services is driven by the need for accurate and timely data entry in sectors such as manufacturing, government, and healthcare.



    Data processing services involve the manipulation, transformation, and analysis of raw data to produce meaningful information. This includes tasks such as data validation, data sorting, data aggregation, and data analysis. Data processing is a critical componen

  8. The Climate Change Twitter Dataset

    • data.mendeley.com
    Updated May 18, 2022
    + more versions
    Cite
    Dimitrios Effrosynidis (2022). The Climate Change Twitter Dataset [Dataset]. http://doi.org/10.17632/mw8yd7z9wc.1
    Dataset updated
    May 18, 2022
    Authors
    Dimitrios Effrosynidis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The most comprehensive dataset to date regarding climate change and human opinions via Twitter. It has the heftiest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, while accompanied by environmental disaster events information. These dimensions were produced by testing and evaluating a plethora of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.

    The following columns are in the dataset:

    ➡ created_at: The timestamp of the tweet.
    ➡ id: The unique id of the tweet.
    ➡ lng: The longitude where the tweet was written.
    ➡ lat: The latitude where the tweet was written.
    ➡ topic: Categorization of the tweet into one of ten topics: seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined.
    ➡ sentiment: A score on a continuous scale from -1 to 1, with values closer to 1 indicating positive sentiment, values closer to -1 indicating negative sentiment, and values close to 0 indicating no sentiment or neutrality.
    ➡ stance: Whether the tweet supports the belief in man-made climate change (believer), rejects it (denier), or neither supports nor rejects it (neutral).
    ➡ gender: Whether the user who made the tweet is male, female, or undefined.
    ➡ temperature_avg: The temperature deviation in Celsius, relative to the January 1951-December 1980 average, at the time and place the tweet was written.
    ➡ aggressiveness: Whether the tweet contains aggressive language or not.
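    As a hypothetical illustration of working with these columns (assuming the dataset has been exported to a CSV with exactly the column names listed above), filtering tweets by stance and sentiment could look like:

    ```python
    import csv

    def believer_negative_ids(csv_path):
        """Yield the ids of tweets whose stance is 'believer' and whose
        sentiment score (continuous, -1 to 1) is negative."""
        with open(csv_path, newline="", encoding="utf-8") as fh:
            for row in csv.DictReader(fh):
                if row["stance"] == "believer" and float(row["sentiment"]) < 0:
                    yield row["id"]
    ```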

    Since Twitter forbids making the text of tweets public, you need to retrieve it yourself through a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.

  9. Healthcare Data Analytics Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Healthcare Data Analytics Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-healthcare-data-analytics-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Healthcare Data Analytics Market Outlook



    The global healthcare data analytics market size was valued at approximately USD 24.5 billion in 2023 and is projected to reach around USD 95.5 billion by 2032, growing at a robust CAGR of 16.5% during the forecast period. The market's growth is largely driven by the increasing adoption of electronic health records (EHRs) and the rising demand for data-driven decision-making in healthcare.



    One of the primary growth factors for the healthcare data analytics market is the escalating volume of healthcare data generated from various sources such as clinical trials, patient records, and medical devices. The integration of big data analytics in healthcare facilitates enhanced patient outcomes and operational efficiency by enabling predictive analytics, personalized medicine, and real-time decision-making. Moreover, the adoption of advanced technologies such as artificial intelligence (AI) and machine learning (ML) further drives the need for sophisticated analytics tools to manage and interpret the vast amounts of data.



    Another significant driver of market growth is the increasing emphasis on value-based care and the need to reduce healthcare costs. Healthcare providers and payers are increasingly leveraging data analytics to identify cost-saving opportunities, optimize resource allocation, and improve care quality. Analytics tools help in identifying patterns and trends, thereby enabling healthcare organizations to adopt preventive measures and reduce the incidence of chronic diseases. Furthermore, government initiatives promoting the use of healthcare IT solutions and the implementation of stringent regulations for data management and security are contributing to market expansion.



    The growing popularity of telemedicine and remote patient monitoring also contributes to the expansion of the healthcare data analytics market. The COVID-19 pandemic has accelerated the adoption of telehealth services, leading to a surge in data generated from remote consultations and wearable devices. This data needs to be effectively analyzed to provide actionable insights, improve patient care, and streamline healthcare operations. Additionally, the rising focus on population health management and the need to address healthcare disparities are driving the adoption of analytics solutions to better understand and address the health needs of diverse populations.



    The emergence of Healthcare BI Platform solutions is transforming the landscape of healthcare data analytics. These platforms provide healthcare organizations with powerful tools to aggregate, analyze, and visualize data from multiple sources, enabling more informed decision-making. By integrating data from electronic health records, financial systems, and operational databases, Healthcare BI Platforms offer a comprehensive view of organizational performance. This holistic approach not only aids in improving patient outcomes but also enhances operational efficiency by identifying areas for cost reduction and resource optimization. As healthcare systems continue to evolve, the role of BI platforms in facilitating data-driven strategies becomes increasingly vital, supporting the shift towards value-based care and personalized medicine.



    Regionally, North America holds the largest share of the healthcare data analytics market due to the high adoption of advanced healthcare technologies and the presence of key market players. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, fueled by the increasing healthcare expenditure, growing awareness about the benefits of data analytics, and the rapid digital transformation of the healthcare sector in countries like China and India.



    Component Analysis



    The healthcare data analytics market is segmented by component into software, hardware, and services. The software segment holds the largest market share, driven by the growing demand for advanced analytics solutions that can handle large volumes of healthcare data. Software tools for data visualization, predictive analytics, and machine learning are increasingly being adopted to derive meaningful insights from complex datasets and improve clinical and operational outcomes. Organizations are investing heavily in upgrading their software infrastructure to keep pace with the evolving healthcare landscape.



    The hardware segment, although smaller in comparison to software, plays a crucial role i

  10. Carrier Aggregation Solutions Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 25, 2025
    Cite
    Data Insights Market (2025). Carrier Aggregation Solutions Report [Dataset]. https://www.datainsightsmarket.com/reports/carrier-aggregation-solutions-1951039
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jan 25, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Market Overview: The global carrier aggregation solutions market is expected to reach USD XX million by 2033, growing at a CAGR of XX% from 2025 to 2033. The increasing adoption of mobile broadband services, the need for improved network performance, and the rising demand for high-speed internet connectivity are driving market growth. Carrier aggregation combines multiple carrier signals into a wider bandwidth, enabling faster data transmission speeds and improved coverage.

    Market Segmentation and Regional Landscape: By application, the market is segmented into smartphone and tablet, enterprise, and consumer electronics. By type, it includes hardware, software, and services. Key players in the market include Cisco, Nokia, Huawei Technologies, ZTE, Qorvo, Artiza Networks, Anritsu, and Rohde & Schwarz. Regionally, North America and Asia Pacific dominate the market, with the latter expected to witness significant growth due to the increasing penetration of mobile broadband and government initiatives promoting 5G deployment.

    Carrier Aggregation (CA) is a mobile communication technology that combines multiple frequency bands into a single wider band to increase data rates and enhance network capacity. This report provides an in-depth analysis of the global carrier aggregation solutions market, including the key players, current trends, and future growth prospects.

  11. Additional file 4 of A data driven learning approach for the assessment of...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Erik Tute; Nagarajan Ganapathy; Antje Wulff (2023). Additional file 4 of A data driven learning approach for the assessment of data quality [Dataset]. http://doi.org/10.6084/m9.figshare.16916712.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Erik Tute; Nagarajan Ganapathy; Antje Wulff
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 4. Example MM to check missing BP.

  12. Step-downs analysis: aggregated data and analytical code

    • bridges.monash.edu
    • researchdata.edu.au
    txt
    Updated May 31, 2023
    Cite
    Tyler Lane; Luke Sheehan; Shannon Gray; Dianne Beck; Alex Collie (2023). Step-downs analysis: aggregated data and analytical code [Dataset]. http://doi.org/10.26180/5dba1e5b4277a
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Monash University
    Authors
    Tyler Lane; Luke Sheehan; Shannon Gray; Dianne Beck; Alex Collie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains analytical code and aggregate data for evaluating the impact of step-downs (reductions in workers' compensation payments after several months in the system) on scheme exit, plus the R project file. The cleaning file is included, but no case-level data.

    To use: Download all files into a single folder and open the project file. You should be able to run most RMarkdown files from there. The meta-analysis file depends on outputs from all other analytical files.

    Note on v5: This update reflects analytical changes made in response to peer review and the correction of an existing error. The analytical changes include an additional sensitivity analysis investigating effects on claims unaffected by step-downs, which would indicate confounding. The existing error was that exclusion criteria had not been applied correctly in some cases. Claims affected by step-downs must consider both maximum/minimum caps and compensation rates; otherwise, claims may see minimal effects from step-downs. For example, if the cap was $2000 per week with an initial rate of 95%, injured workers could only be included if they earned $2000 / 95% or less.
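    The inclusion rule in the example above (a claim is only meaningfully affected by step-downs when pre-injury earnings fall at or below cap / rate) can be sketched as a small check; the function name and structure are illustrative, not taken from the released analytical code:

    ```python
    def affected_by_stepdown(weekly_earnings, weekly_cap, initial_rate):
        """True if a claim's payment can actually change when the compensation
        rate steps down, per the rule described in the dataset notes.

        Above weekly_cap / initial_rate, the initial payment is already capped,
        so a rate reduction leaves it unchanged and the claim is excluded.
        """
        return weekly_earnings <= weekly_cap / initial_rate

    # Example from the description: $2000/week cap, 95% initial rate.
    threshold = 2000 / 0.95  # ~ $2105.26 pre-injury weekly earnings
    print(affected_by_stepdown(2000, 2000, 0.95))  # True: below the threshold
    print(affected_by_stepdown(2500, 2000, 0.95))  # False: payment is capped
    ```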

  13. Data from: Topological data analysis of biological aggregation models

    • search.dataone.org
    • datadryad.org
    Updated Apr 4, 2025
    Chad M. Topaz; Lori Ziegelmeier; Tom Halverson (2025). Topological data analysis of biological aggregation models [Dataset]. http://doi.org/10.5061/dryad.91j93
    Explore at:
    Dataset updated
    Apr 4, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Chad M. Topaz; Lori Ziegelmeier; Tom Halverson
    Time period covered
    May 19, 2015
    Description

    We apply tools from topological data analysis to two mathematical models inspired by biological aggregations such as bird flocks, fish schools, and insect swarms. Our data consists of numerical simulation output from the models of Vicsek and D'Orsogna. These models are dynamical systems describing the movement of agents who interact via alignment, attraction, and/or repulsion. Each simulation time frame is a point cloud in position-velocity space. We analyze the topological structure of these point clouds, interpreting the persistent homology by calculating the first few Betti numbers. These Betti numbers count connected components, topological circles, and trapped volumes present in the data. To interpret our results, we introduce a visualization that displays Betti numbers over simulation time and topological persistence scale. We compare our topological results to order parameters typically used to quantify the global behavior of aggregations, such as polarization and angular momentum.

  14. Global Data Broker Market Research Report: By Data Type (Public Records,...

    • wiseguyreports.com
    Updated Dec 31, 2024
    + more versions
    Wiseguy Research Consultants Pvt Ltd (2024). Global Data Broker Market Research Report: By Data Type (Public Records, Consumer Data, Commercial Data, Financial Data), By End User (Marketing Agencies, Insurance Companies, Financial Institutions, Retailers), By Service Type (Data Aggregation, Data Monetization, Data Analytics, Data Management), By Deployment Model (Cloud-Based, On-Premises) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/es/reports/data-broker-market
    Explore at:
    Dataset updated
    Dec 31, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 225.72 (USD Billion)
    MARKET SIZE 2024: 240.55 (USD Billion)
    MARKET SIZE 2032: 400.0 (USD Billion)
    SEGMENTS COVERED: Data Type, End User, Service Type, Deployment Model, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: Data privacy regulations, Increasing demand for analytics, Growth of digital marketing, Rising cybersecurity concerns, Emergence of AI technologies
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: TransUnion, Epsilon, Meredith Corporation, Nielsen, Dun and Bradstreet, Experian, Zillow, CoreLogic, Equifax, Infogroup, ID Analytics, Foursquare, Thomson Reuters, Oracle Data Cloud, Acxiom
    MARKET FORECAST PERIOD: 2025 - 2032
    KEY MARKET OPPORTUNITIES: Growing demand for personalized marketing, Expansion into emerging markets, Increased regulatory compliance demands, Advancements in data analytics technology, Rise of data-driven decision making
    COMPOUND ANNUAL GROWTH RATE (CAGR): 6.57% (2025 - 2032)
  15. Aggregate vs. disaggregate data analysis—a paradox in the estimation of a...

    • journaldata.zbw.eu
    • jda-test.zbw.eu
    txt
    Updated Dec 8, 2022
    Cheng Hsiao; Yan Shen; Hiroshi Fujiki (2022). Aggregate vs. disaggregate data analysis—a paradox in the estimation of a money demand function of Japan under the low interest rate policy (replication data) [Dataset]. http://doi.org/10.15456/jae.2022319.0709091259
    Explore at:
    Available download formats: txt (14432), txt (4429), txt (1154)
    Dataset updated
    Dec 8, 2022
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Cheng Hsiao; Yan Shen; Hiroshi Fujiki
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Japan
    Description

    We use Japanese aggregate and disaggregate money demand data to show that conflicting inferences can arise. The aggregate data appears to support the contention that there was no stable money demand function. The disaggregate data shows that there was a stable money demand function. Neither was there any indication of the presence of a liquidity trap. Possible sources of discrepancy are explored and the diametrically opposite results between the aggregate and disaggregate analysis are attributed to the neglected heterogeneity among micro units. We provide necessary and sufficient conditions for the existence of a cointegrating relation among aggregate variables when heterogeneous cointegration relations among micro units exist. We also conduct simulation analysis to show that when such conditions are violated, it is possible to observe stable micro relations, but unit root phenomena among macro variables. Moreover, the prediction of aggregate outcomes, using aggregate data, is less accurate than the prediction based on micro equations, and policy evaluation based on aggregate data ignoring heterogeneity in micro units can be grossly misleading.

  16. The Organization of Tropical Rainfall: Observed convective aggregation data...

    • catalogue.ceda.ac.uk
    • data-search.nerc.ac.uk
    Updated Feb 9, 2018
    Christopher Holloway (2018). The Organization of Tropical Rainfall: Observed convective aggregation data across the Tropics [Dataset]. https://catalogue.ceda.ac.uk/uuid/f3f8337c838c4602876d43f56d878515
    Explore at:
    Dataset updated
    Feb 9, 2018
    Dataset provided by
    Centre for Environmental Data Analysis: http://www.ceda.ac.uk/
    Authors
    Christopher Holloway
    License

    https://artefacts.ceda.ac.uk/licences/missing_licence.pdf

    Time period covered
    Jun 14, 2006 - Apr 17, 2011
    Area covered
    Description

    This dataset contains about 5 years of analysed observations regarding the degree of convective aggregation, or clumping, across the tropics - these are averaged onto a large-scale grid. There are also additional variables which represent environmental fields (e.g. sea surface temperature from satellite data, or humidity profiles averaged from reanalysis data) averaged onto the same large-scale grid. The main aggregation index is the Simple Convective Aggregation Index (SCAI) originally defined in Tobin et al. 2012, Journal of Climate. The data were created during the main years of CloudSat and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite data so that they could be compared with vertical cloud profiles from this satellite data, and the results of this analysis appear in Stein et al. 2017, Journal of Climate.

    Each file is one year of data (although the year may not be complete).

    Each variable is an array: var(nlon, nlat, [nlev], ntime). Longitude, latitude, pressure, and time are variables in each file. Units are attributes of each variable (except for non-dimensional ones). The missing value is 3.0E20 and is stored as an attribute of each variable.

    Time is in days since 19790101:00Z and is every 3 hours at 00Z, 03Z, ... The actual temporal frequency of the data is described for each variable below.

    The data are for each 10 deg x 10 deg lat/lon box, 30S-30N (at the outer edges of the box domain), with each box defined by its centre coordinates and with boxes overlapping each other by 5 deg in each direction.

    In general, each variable is a spatial average over each box, with the value set to missing if more than 15% of the box is missing data. Exceptions to this are given below. The most important exception is for the brightness temperature data, used in aggregation statistics, which is filled in using neighborhood averaging if no more than 5% of the pixels are missing, but otherwise is considered to be all missing data. The percentage of missing pixels is recorded in 'bt_miss_frac'.
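
    The coordinate and missing-value conventions above can be sketched as follows (the helper names are hypothetical, not part of the dataset):

    ```python
    from datetime import datetime, timedelta

    # 10 deg x 10 deg boxes spanning 30S-30N, overlapping by 5 deg:
    # box centres fall every 5 deg from 25S to 25N (analogous in longitude).
    lat_centres = list(range(-25, 30, 5))  # [-25, -20, ..., 25]

    MISSING_VALUE = 3.0e20

    def is_missing(value):
        """Flag values at or above the dataset's missing_value sentinel."""
        return value >= MISSING_VALUE

    def time_to_datetime(days_since_ref):
        """Convert the time coordinate (days since 19790101:00Z) to a datetime.
        Samples fall every 3 hours, i.e. every 0.125 days, at 00Z, 03Z, ..."""
        return datetime(1979, 1, 1) + timedelta(days=days_since_ref)
    ```

    For example, a time value of 0.125 corresponds to 1979-01-01 03:00Z, the second 3-hourly sample.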

  17. Data from: Repository Analytics and Metrics Portal (RAMP) 2021 data

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated May 23, 2023
    Jonathan Wheeler; Kenning Arlitsch (2023). Repository Analytics and Metrics Portal (RAMP) 2021 data [Dataset]. http://doi.org/10.5061/dryad.1rn8pk0tz
    Explore at:
    Available download formats: zip
    Dataset updated
    May 23, 2023
    Dataset provided by
    Montana State University
    University of New Mexico
    Authors
    Jonathan Wheeler; Kenning Arlitsch
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates use and performance data of institutional repositories. The data are a subset of data from RAMP (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2021. For a description of the data collection, processing, and output methods, please see the "methods" section below.

    The record will be revised periodically to make new data available through the remainder of 2021.

    Methods

    Data Collection

    RAMP data are downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data are downloaded in two sets per participating IR. The first set includes page level statistics about URLs pointing to IR pages and content files. The following fields are downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page-level data.
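
    As an illustration, the clickThrough field can be derived from the clicks and impressions fields; a minimal sketch (the row values and URL here are hypothetical):

    ```python
    def click_through_rate(clicks, impressions):
        """clickThrough as defined above: clicks divided by impressions."""
        if impressions == 0:
            return 0.0
        return clicks / impressions

    # A hypothetical page-level row using the fields listed above:
    row = {
        "url": "https://repository.example.edu/handle/123",  # hypothetical
        "impressions": 50,
        "clicks": 5,
        "position": 3.2,
        "date": "2021-01-15",
    }
    row["clickThrough"] = click_through_rate(row["clicks"], row["impressions"])
    ```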

    The second set includes similar information, but instead of being aggregated at the page level, the data are grouped based on the country from which the user submitted the corresponding search and the type of device used. The following fields are downloaded for each combination of country and device, with one row per country/device combination:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, the page level data described above are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of page level statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or actual content files. URLs that point to content files are flagged as "citable content." In addition to the fields downloaded from GSC described above, following this brief analysis one more field, citableContent, is added to the page level data which records whether each page/URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."
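
    A minimal sketch of this flagging step, assuming a simple file-extension check (the extension list is hypothetical; RAMP's actual detection logic is not specified here):

    ```python
    from urllib.parse import urlparse

    # Hypothetical extension list for illustration only.
    CONTENT_EXTENSIONS = {".pdf", ".csv", ".xlsx", ".zip", ".docx"}

    def citable_content(url):
        """Return "Yes" if the URL path ends in a known non-HTML content
        file extension, "No" for HTML wrapper pages."""
        path = urlparse(url).path.lower()
        return "Yes" if any(path.endswith(ext) for ext in CONTENT_EXTENSIONS) else "No"
    ```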

    The data aggregated by the search country of origin and device type do not include URLs. No additional processing is done on these data. Harvested data are passed directly into Elasticsearch.

    Processed data are then saved in a series of Elasticsearch indices. Currently, RAMP stores data in two indices per participating IR. One index includes the page level data, the second index includes the country of origin and device type data.

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository content, CCD represent click activity on IR content that may correspond to research use.

    CCD information is summary data calculated on the fly within the RAMP web application. As noted above, data provided by GSC include whether and how many times a URL was clicked by users. Within RAMP, a "click" is counted as a potential download, so a CCD is calculated as the sum of clicks on pages/URLs that are determined to point to citable content (as defined above).

    For any specified date range, the steps to calculate CCD are:

    Filter data to only include rows where "citableContent" is set to "Yes."
    Sum the value of the "clicks" field on these rows.
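
    These two steps can be sketched as follows (the sample rows are hypothetical):

    ```python
    def citable_content_downloads(rows):
        """CCD over a date-filtered set of rows: keep rows where
        citableContent is "Yes", then sum their clicks."""
        return sum(r["clicks"] for r in rows if r["citableContent"] == "Yes")

    rows = [
        {"citableContent": "Yes", "clicks": 4},
        {"citableContent": "No",  "clicks": 9},
        {"citableContent": "Yes", "clicks": 1},
    ]
    # citable_content_downloads(rows) -> 5
    ```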
    

    Output to CSV

    Published RAMP data are exported from the production Elasticsearch instance and converted to CSV format. The CSV data consist of one "row" for each page or URL from a specific IR which appeared in search result pages (SERP) within Google properties as described above. Also as noted above, daily data are downloaded for each IR in two sets which cannot be combined. One dataset includes the URLs of items that appear in SERP. The second dataset is aggregated by combination of the country from which a search was conducted and the device used.

    As a result, two CSV datasets are provided for each month of published data:

    page-clicks:

    The data in these CSV files correspond to the page-level data, and include the following fields:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    citableContent: Whether or not the URL points to a content file (ending with pdf, csv, etc.) rather than HTML wrapper pages. Possible values are Yes or No.
    index: The Elasticsearch index corresponding to page click data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data end with “page-clicks”. For example, the file named 2021-01_RAMP_all_page-clicks.csv contains page level click data for all RAMP participating IR for the month of January, 2021.

    country-device-info:

    The data in these CSV files correspond to the data aggregated by country from which a search was conducted and the device used. These include the following fields:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    index: The Elasticsearch index corresponding to country and device access information data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data end with “country-device-info”. For example, the file named 2021-01_RAMP_all_country-device-info.csv contains country and device data for all participating IR for the month of January, 2021.

    References

    Google, Inc. (2021). Search Console APIs. Retrieved from https://developers.google.com/webmaster-tools/search-console-api-original.

  18. ‘Strategic Measure_Street Segment Condition Data Aggregated’ analyzed by...

    • analyst-2.ai
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Strategic Measure_Street Segment Condition Data Aggregated’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-strategic-measure-street-segment-condition-data-aggregated-e577/latest
    Explore at:
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Strategic Measure_Street Segment Condition Data Aggregated’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/5f0b0d59-2dfd-4c94-83f3-9dd5e8538f1a on 27 January 2022.

    --- Dataset description provided by original source is as follows ---

    This table contains aggregated street condition data for each fiscal year beginning in FY2018. A contract vendor surveys street conditions for one-third to one-half of the City of Austin each year. The reported value each year represents the aggregate reported conditions for the three most recent years as of the year reported. Detailed condition data for the most recent fiscal year can be found in the dataset Strategic Measure_Street Segment Condition Data.

    View more details and insights related to this data set on the story page: https://data.austintexas.gov/stories/s/kara-xhcd.

    --- Original source retains full ownership of the source dataset ---

  19. Account Aggregators Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 4, 2024
    Dataintelo (2024). Account Aggregators Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/account-aggregators-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 4, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Account Aggregators Market Outlook



    The global account aggregators market size is projected to grow from USD 1.8 billion in 2023 to USD 6.4 billion by 2032, driven by a robust CAGR of 15.4%. The growing need for data-driven decision-making and efficient financial management systems are key factors propelling this market's growth. Organizations across various sectors are increasingly adopting account aggregation solutions to streamline access to financial data, thereby enhancing their ability to make informed business decisions.



    One of the primary factors driving the growth of the account aggregators market is the increasing digitalization of financial services. As more consumers and businesses transition to online banking and digital financial solutions, the need for secure and efficient data aggregation becomes paramount. Account aggregators facilitate this by enabling seamless access to financial data from multiple sources, improving transparency and financial management. In addition, the rising demand for personalized financial services is prompting financial institutions to leverage account aggregation to gain deeper insights into user behavior and preferences.



    Regulatory frameworks and government initiatives also play a significant role in the market's expansion. Various governments and regulatory bodies are mandating the adoption of open banking and data sharing protocols, which necessitate the use of account aggregation services. For instance, the European Union's PSD2 directive and India's Account Aggregator framework are designed to promote data portability and interoperability, thereby fostering a competitive and innovative financial ecosystem. These regulations not only ensure consumer data protection but also encourage the development of new financial products and services.



    Technological advancements such as artificial intelligence (AI) and machine learning (ML) are further enhancing the capabilities of account aggregators. These technologies enable more accurate data analysis and predictive analytics, allowing businesses to forecast trends and make proactive decisions. Additionally, the integration of blockchain technology is expected to enhance data security and transparency, addressing concerns related to data breaches and fraud. As these technologies continue to evolve, they are likely to drive increased adoption of account aggregation solutions across various sectors.



    From a regional perspective, North America is expected to dominate the account aggregators market, followed by Europe and Asia Pacific. The early adoption of advanced financial technologies and a highly developed financial infrastructure contribute to North America's leading market position. Europe is also witnessing significant growth due to stringent regulatory requirements and a strong emphasis on open banking initiatives. Meanwhile, Asia Pacific is emerging as a lucrative market, driven by rapid economic development, increasing internet penetration, and supportive government policies aimed at digital financial inclusion.



    Component Analysis



    The account aggregators market is segmented by components into software and services. The software segment is expected to hold a significant share of the market owing to the increasing adoption of advanced financial management solutions. Account aggregation software enables seamless integration and access to financial data from multiple accounts, providing users with a comprehensive view of their financial status. This segment is witnessing continuous innovation, with companies developing user-friendly interfaces and advanced analytics capabilities to meet the growing demand for personalized financial services.



    Services, on the other hand, encompass a range of offerings including consulting, integration, and maintenance services. As organizations adopt account aggregation software, the need for expert consulting and integration services becomes crucial to ensure smooth implementation and operation. Maintenance services are also essential to address any technical issues and ensure the software's optimal performance. The growing demand for these services is driving significant revenue growth in this segment, as businesses seek to maximize the benefits of their account aggregation solutions.



    Within the software segment, there is a growing trend towards cloud-based solutions. Cloud-based account aggregation software offers several advantages, including scalability, flexibility, and cost-effectiveness. These solutions enable businesses to access financial data from anywhere, at a

  20. Repository Analytics and Metrics Portal (RAMP) 2018 data

    • data.niaid.nih.gov
    • dataone.org
    • +1more
    zip
    Updated Jul 27, 2021
    Jonathan Wheeler; Kenning Arlitsch (2021). Repository Analytics and Metrics Portal (RAMP) 2018 data [Dataset]. http://doi.org/10.5061/dryad.ffbg79cvp
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 27, 2021
    Dataset provided by
    Montana State University
    University of New Mexico
    Authors
    Jonathan Wheeler; Kenning Arlitsch
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates use and performance data of institutional repositories. The data are a subset of data from RAMP (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2018. For a description of the data collection, processing, and output methods, please see the "methods" section below. Note that the RAMP data model changed in August 2018, and two sets of documentation are provided to describe data collection and processing before and after the change.

    Methods

    RAMP Data Documentation – January 1, 2017 through August 18, 2018

    Data Collection

    RAMP data were downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data from January 1, 2017 through August 18, 2018 were downloaded in one dataset per participating IR. The following fields were downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    country: The country from which the corresponding search originated.
    device: The device used for the search.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page-level data.

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, data are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or actual content files. URLs that point to content files are flagged as "citable content." In addition to the fields downloaded from GSC described above, following this brief analysis one more field, citableContent, is added to the data which records whether each URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."

    Processed data are then saved in a series of Elasticsearch indices. From January 1, 2017, through August 18, 2018, RAMP stored data in one index per participating IR.

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository content, CCD represent click activity on IR content that may correspond to research use.

    CCD information is summary data calculated on the fly within the RAMP web application. As noted above, data provided by GSC include whether and how many times a URL was clicked by users. Within RAMP, a "click" is counted as a potential download, so a CCD is calculated as the sum of clicks on pages/URLs that are determined to point to citable content (as defined above).

    For any specified date range, the steps to calculate CCD are:

    Filter data to only include rows where "citableContent" is set to "Yes."
    Sum the value of the "clicks" field on these rows.
    

    Output to CSV

    Published RAMP data are exported from the production Elasticsearch instance and converted to CSV format. The CSV data consist of one "row" for each page or URL from a specific IR which appeared in search result pages (SERP) within Google properties as described above.

    The data in these CSV files include the following fields:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    country: The country from which the corresponding search originated.
    device: The device used for the search.
    date: The date of the search.
    citableContent: Whether or not the URL points to a content file (ending with pdf, csv, etc.) rather than HTML wrapper pages. Possible values are Yes or No.
    index: The Elasticsearch index corresponding to page click data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the index field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data follow the format 2018-01_RAMP_all.csv. Using this example, the file 2018-01_RAMP_all.csv contains all data for all RAMP participating IR for January 2018.
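
    As an example of working with these exports, the sketch below sums clicks per participating repository using the repository_id field recommended above for filtering and aggregation. The field subset and repository IDs are illustrative; a real export contains all of the columns listed.

    ```python
    # Aggregate clicks per repository from a RAMP monthly CSV export.
    import csv
    from collections import Counter
    from io import StringIO

    def clicks_by_repository(csv_text: str) -> Counter:
        """Sum the clicks field per repository_id across all rows."""
        totals = Counter()
        for row in csv.DictReader(StringIO(csv_text)):
            totals[row["repository_id"]] += int(row["clicks"])
        return totals
    ```

    For a monthly file such as 2018-01_RAMP_all.csv, the same function can be applied to the file's contents read as text.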

    Data Collection from August 19, 2018 Onward

    RAMP data are downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data are downloaded in two sets per participating IR. The first set includes page level statistics about URLs pointing to IR pages and content files. The following fields are downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page level data.

    The second set includes similar information, but instead of being aggregated at the page level, the data are grouped by the country from which the user submitted the corresponding search and the type of device used. The following fields are downloaded for each combination of country and device, with one row per country/device combination:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times IR pages appeared within SERPs for searches from this country/device combination.
    clicks: The number of clicks on IR pages which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
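
    The two download sets correspond to two differently dimensioned queries against the Search Console API's searchanalytics.query endpoint (linked below). The request bodies might look like the following; the exact dimension lists RAMP uses are an assumption inferred from the fields above.

    ```python
    # Illustrative request bodies for the two per-IR Search Console
    # API queries: page-level and country/device-level statistics.
    def gsc_query_bodies(start_date: str, end_date: str):
        page_level = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": ["page", "date"],  # one row per URL per day
        }
        country_device = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": ["country", "device", "date"],  # one row per combination
        }
        return page_level, country_device
    ```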
    

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, the page level data described above are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL that points to a non-HTML content file (PDF, CSV, etc.). As part of the daily download of page level statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or to content files, and URLs that point to content files are flagged as "citable content." Following this analysis, one additional field, citableContent, is added to the page level data; it records whether each page/URL in the GSC data points to citable content, with possible values "Yes" and "No."

    The data aggregated by the search country of origin and device type do not include URLs. No additional processing is done on these data. Harvested data are passed directly into Elasticsearch.

    Processed data are then saved in a series of Elasticsearch indices. Currently, RAMP stores data in two indices per participating IR. One index includes the page level data, the second index includes the country of origin and device type data.
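
    The split into two indices per IR can be sketched as a simple routing rule: page-level records carry a url field, while country/device records do not. The index naming scheme here is illustrative, not RAMP's actual convention.

    ```python
    # Route a processed record to one of the two per-repository indices.
    def route_record(repository_id: str, record: dict) -> str:
        """Return the target index name: page-level records have a
        'url' field; country/device records do not (hypothetical naming)."""
        suffix = "page" if "url" in record else "country-device"
        return f"{repository_id}-{suffix}"
    ```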

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository
