An interactive dashboard that showcases the City of Austin Open Data Portal (data.austintexas.gov) web traffic and search-term performance metrics. *City of Austin Open Data Terms of Use https://data.austintexas.gov/stories/s/ranj‐cccq
https://www.verifiedmarketresearch.com/privacy-policy/
Web Analytics Market was valued at USD 6.16 Billion in 2024 and is projected to reach USD 13.6 Billion by 2032, growing at a CAGR of 18.58% from 2026 to 2032.
Web Analytics Market Drivers
Data-Driven Decision Making: Businesses increasingly rely on data-driven insights to optimize their online strategies. Web analytics provides valuable data on website traffic, user behavior, and conversion rates, enabling data-driven decision-making.
E-commerce Growth: The rapid growth of e-commerce has fueled the demand for web analytics tools to track online sales, customer behavior, and marketing campaign effectiveness.
Mobile Dominance: The increasing use of mobile devices for internet browsing has made mobile analytics a crucial aspect of web analytics. Businesses need to understand how users interact with their websites and apps on mobile devices.
Analytics tools can be complex to implement and use, requiring technical expertise.
Web Analytics Market Size 2025-2029
The web analytics market size is forecast to increase by USD 3.63 billion, at a CAGR of 15.4% between 2024 and 2029.
The market is experiencing significant growth, driven by the rising preference for online shopping and the increasing adoption of cloud-based solutions. The shift towards e-commerce is fueling the demand for advanced web analytics tools that enable businesses to gain insights into customer behavior and optimize their digital strategies. Furthermore, cloud deployment models offer flexibility, scalability, and cost savings, making them an attractive option for businesses of all sizes. However, the market also faces challenges associated with compliance to data privacy and regulations. With the increasing amount of data being generated and collected, ensuring data security and privacy is becoming a major concern for businesses.
Regulatory compliance, such as GDPR and CCPA, adds complexity to the implementation and management of web analytics solutions. Companies must navigate these challenges effectively to maintain customer trust and avoid potential legal issues. To capitalize on market opportunities and address these challenges, businesses should invest in robust web analytics solutions that prioritize data security and privacy while providing actionable insights to inform strategic decision-making and enhance customer experiences.
What will be the Size of the Web Analytics Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
The market continues to evolve, with dynamic market activities unfolding across various sectors. Entities such as reporting dashboards, schema markup, conversion optimization, session duration, organic traffic, attribution modeling, conversion rate optimization, call to action, content calendar, SEO audits, website performance optimization, link building, page load speed, user behavior tracking, and more, play integral roles in this ever-changing landscape. Data visualization tools like Google Analytics and Adobe Analytics provide valuable insights into user engagement metrics, helping businesses optimize their content strategy, website design, and technical SEO. Goal tracking and keyword research enable marketers to measure the return on investment of their efforts and refine their content marketing and social media marketing strategies.
Mobile optimization, form optimization, and landing page optimization are crucial aspects of website performance optimization, ensuring a seamless user experience across devices and improving customer acquisition cost. Search console and page speed insights offer valuable insights into website traffic analysis and help businesses address technical issues that may impact user behavior. Continuous optimization efforts, such as multivariate testing, data segmentation, and data filtering, allow businesses to fine-tune their customer journey mapping and cohort analysis. Search engine optimization, both on-page and off-page, remains a critical component of digital marketing, with backlink analysis and page authority playing key roles in improving domain authority and organic traffic.
The ongoing integration of user behavior tracking, click-through rate, and bounce rate into marketing strategies enables businesses to gain a deeper understanding of their audience and optimize their customer experience accordingly. As market dynamics continue to evolve, the integration of these tools and techniques into comprehensive digital marketing strategies will remain essential for businesses looking to stay competitive in the digital landscape.
How is this Web Analytics Industry segmented?
The web analytics industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Deployment: Cloud-based; On-premises
Application: Social media management; Targeting and behavioral analysis; Display advertising optimization; Multichannel campaign analysis; Online marketing
Component: Solutions; Services
Geography: North America (US, Canada); Europe (France, Germany, Italy, UK); APAC (China, India, Japan, South Korea); Rest of World (ROW)
By Deployment Insights
The cloud-based segment is estimated to witness significant growth during the forecast period.
In today's digital landscape, web analytics plays a pivotal role in driving business growth and optimizing online performance. Cloud-based deployment of web analytics is a game-changer, enabling on-demand access to computing resources for data analysis. This model streamlines business intelligence processes by collecting,
This asset is a filter (derived view of a dataset) based on the system dataset 'Site Analytics: Catalog Search Terms', which is automatically generated by the City of Austin Open Data Portal (data.austintexas.gov). It provides data on the words and phrases that site users enter in search bars that look through the data catalog for relevant information. Catalog searches using the Discovery API are not included.
Each row in the dataset indicates the number of catalog searches made using the search term from the specified user segment during the noted hour.
Data are segmented into the following user types:
• site member: users who have logged in and have been granted a role on the domain
• community user: users who have logged in but do not have a role on the domain
• anonymous: users who have not logged in to the domain
Data are updated by a system process at least once a day, if there is new data to record.
Data provided by: Tyler Technologies
Creation date of data source: January 31, 2020
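As a minimal sketch of how rows with this shape could be summarized, the Python below totals searches per term within each user segment. The field names (`search_term`, `user_segment`, `hour`, `count`) and the sample rows are assumptions made for illustration; they are not the portal's actual column names or data.

```python
from collections import Counter, defaultdict

# Hypothetical rows: one per (search term, user segment, hour), as described above.
rows = [
    {"search_term": "permits", "user_segment": "anonymous", "hour": "2024-01-01T09:00", "count": 4},
    {"search_term": "permits", "user_segment": "anonymous", "hour": "2024-01-01T10:00", "count": 2},
    {"search_term": "crime", "user_segment": "site member", "hour": "2024-01-01T09:00", "count": 1},
    {"search_term": "crime", "user_segment": "anonymous", "hour": "2024-01-01T09:00", "count": 3},
]

def searches_by_segment(rows):
    """Sum hourly search counts per term within each user segment."""
    totals = defaultdict(Counter)
    for row in rows:
        totals[row["user_segment"]][row["search_term"]] += row["count"]
    return totals

totals = searches_by_segment(rows)
# Most frequent term among anonymous users, as a (term, total) pair.
top_anonymous = totals["anonymous"].most_common(1)[0]
```

Summing across hours is a design choice here; keeping the hour in the grouping key would instead preserve the time series.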
https://www.mordorintelligence.com/privacy-policy
The Web Analytics Market In Retail And CPG report segments the industry into By Offering (Solution, Services), By Organization Size (SMEs, Large Enterprises), By Application (Search Engine Optimization And Ranking, Online Marketing & Marketing Automation, Customer Profiling And Feedback, Application Performance Management, Social Media Management, Others), and Geography (North America, Europe, Asia, and more).
https://www.marketreportanalytics.com/privacy-policy
The Web Analytics market in Retail and CPG is experiencing robust growth, projected to reach $1.22 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 18.19% from 2025 to 2033. This expansion is fueled by several key drivers. The increasing need for data-driven decision-making within retail and consumer packaged goods (CPG) companies is paramount. Businesses are leveraging web analytics to gain deeper insights into consumer behavior, optimize marketing campaigns, personalize customer experiences, and improve operational efficiency. The rising adoption of e-commerce and omnichannel strategies further accelerates market growth, demanding sophisticated analytics to track customer journeys across multiple touchpoints. Furthermore, advancements in artificial intelligence (AI) and machine learning (ML) are enhancing the capabilities of web analytics platforms, enabling more accurate predictions and proactive adjustments to business strategies. The market is segmented by offering (solutions and services), organization size (SMEs and large enterprises), and application (SEO/ranking, online marketing, customer profiling, application performance management, social media management, and others). Large enterprises currently dominate the market due to their greater resources and sophisticated analytics requirements, but the SME segment is expected to witness significant growth driven by the accessibility of cloud-based analytics solutions. Geographic distribution shows strong growth potential across regions, particularly in the Asia-Pacific region fueled by rapid e-commerce adoption and digital transformation initiatives. North America and Europe maintain substantial market shares due to early adoption and mature digital infrastructure. Competition in the market is intense, with major players like Google, IBM, Meta, Adobe, Microsoft, and Salesforce offering a wide range of analytics solutions and services. 
However, the market also accommodates smaller, specialized providers catering to niche needs. The future growth of the Web Analytics market in Retail and CPG will depend on factors like continued innovation in analytics technologies, the increasing complexity of customer data, the need for enhanced data security and privacy, and the evolving regulatory landscape around data usage. Companies that can effectively address these factors and deliver comprehensive, user-friendly, and insightful analytics platforms are poised to capture significant market share in the coming years. The focus will continue to shift toward predictive analytics, real-time dashboards, and integrated solutions that provide a holistic view of the customer journey. Recent developments include: April 2024 - IBM Consulting and Microsoft announced the opening of the IBM-Microsoft Experience Zone in Bangalore, India. The Experience Zone is designed as an exclusive venue where clients can delve into the potential of generative AI, hybrid cloud solutions, and other advanced Microsoft offerings. The goal is to expedite their business transformations and secure a competitive edge. January 2024 - Microsoft Corp. announced a suite of generative AI and data solutions tailored for retailers. These solutions cover every touchpoint of the retail shopper journey, from crafting personalized shopping experiences and empowering store associates to harnessing and consolidating retail data, ultimately aiding brands in better connecting with their target audiences. Microsoft's initiatives include introducing copilot templates on Azure OpenAI Service, enhancing retailers' ability to craft personalized shopping experiences, and streamlining store operations. Microsoft Fabric hosts advanced retail data solutions, while Microsoft Dynamics 365 Customer Insights boasts new copilot features. Microsoft also rolled out the Retail Media Creative Studio within the Microsoft Retail Media Platform.
These advancements collectively bolster Microsoft Cloud for Retail, providing retailers with diverse tools to integrate copilot experiences seamlessly across the entire shopper journey. Key drivers for this market are: Growing Demand for Online Shopping Trends, Rising Adoption of Analytics Tools to Understand Customer Preferences; Increasing Customer Centric Approach and Use of Recommendation Engines. Potential restraints include: Growing Demand for Online Shopping Trends, Rising Adoption of Analytics Tools to Understand Customer Preferences; Increasing Customer Centric Approach and Use of Recommendation Engines. Notable trends are: Search Engine Optimization and Ranking Sector Significantly Driving the Market Growth.
DataForSEO Labs API offers three powerful keyword research algorithms and historical keyword data:
• Related Keywords from the “searches related to” element of Google SERP.
• Keyword Suggestions that match the specified seed keyword with additional words before, after, or within the seed key phrase.
• Keyword Ideas that fall into the same category as specified seed keywords.
• Historical Search Volume with current cost-per-click, and competition values.
Based on in-market categories of Google Ads, you can get keyword ideas from the relevant Categories For Domain and discover relevant Keywords For Categories. You can also obtain Top Google Searches with AdWords and Bing Ads metrics, product categories, and Google SERP data.
You will find well-rounded ways to scout the competitors:
• Domain Whois Overview with ranking and traffic info from organic and paid search.
• Ranked Keywords that any domain or URL has positions for in SERP.
• SERP Competitors and the rankings they hold for the keywords you specify.
• Competitors Domain with a full overview of its rankings and traffic from organic and paid search.
• Domain Intersection keywords for which both specified domains rank within the same SERPs.
• Subdomains for the target domain you specify along with the ranking distribution across organic and paid search.
• Relevant Pages of the specified domain with rankings and traffic data.
• Domain Rank Overview with ranking and traffic data from organic and paid search.
• Historical Rank Overview with historical data on rankings and traffic of the specified domain from organic and paid search.
• Page Intersection keywords for which the specified pages rank within the same SERP.
All DataForSEO Labs API endpoints function in the Live mode. This means you will be provided with the results in response right after sending the necessary parameters with a POST request.
The limit is 2000 API calls per minute; however, you can contact our support team if your project requires higher rates.
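A client that sends many requests may want to stay under the stated limit on its own side. The sketch below is one way to do that with a sliding-window counter in Python. The `SlidingWindowLimiter` class is hypothetical and is not part of any DataForSEO client library; only the figure of 2000 calls per minute comes from the text above, and the server may count calls differently.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Client-side guard that allows at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=2000, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.calls = deque()    # timestamps of recent calls, oldest first

    def try_acquire(self):
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

# Demonstration with a tiny limit and a fake clock.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, period=60.0, clock=lambda: fake_now[0])
first = limiter.try_acquire()
second = limiter.try_acquire()
third = limiter.try_acquire()    # rejected: two calls already made in this window
fake_now[0] = 60.0
fourth = limiter.try_acquire()   # accepted: the earlier calls have aged out
```

A caller would check `try_acquire()` before each POST request and wait or queue the request when it returns `False`.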
We offer well-rounded API documentation, GUI for API usage control, comprehensive client libraries for different programming languages, free sandbox API testing, ad hoc integration, and deployment support.
We have a pay-as-you-go pricing model. You simply add funds to your account and use them to get data. The account balance doesn't expire.
https://www.thebusinessresearchcompany.com/privacy-policy
The global mobile apps and web analytics market is expected to reach $27.19 billion by 2029, growing at 16.3%. By solution, it is segmented into mobile app analytics tools, web analytics platforms, real-time data analytics solutions, and A/B testing tools.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains trace data describing user interactions with the Inter-university Consortium for Political and Social Research website (ICPSR). We gathered site usage data from Google Analytics. We focused our analysis on user sessions, which are groups of interactions with resources (e.g., website pages) and events initiated by users. ICPSR tracks a subset of user interactions (i.e., other than page views) through event triggers. We analyzed sequences of interactions with resources, including the ICPSR data catalog, variable index, data citations collected in the ICPSR Bibliography of Data-related Literature, and topical information about project archives. As part of our analysis, we calculated the total number of unique sessions and page views in the study period. Data in our study period fell between September 1, 2012, and 2016. ICPSR's website was updated and relaunched in September 2012 with new search functionality, including a Social Science Variables Database (SSVD) tool. ICPSR then reorganized its website and changed its analytics collection procedures in 2016, marking this as the cutoff date for our analysis. Data are relevant for two reasons. First, updates to the ICPSR website during the study period focused only on front-end design rather than the website's search functionality. Second, the core features of the website over the period we examined (e.g., faceted and variable search, standardized metadata, the use of controlled vocabularies, and restricted data applications) are shared with other major data archives, making it likely that the trends in user behavior we report are generalizable.
Baidu Search Index is a big data analytics tool developed by Baidu, the most popular search engine in China, to reflect changes in search popularity for specific keywords.
Based on an ecosystem partnership with Baidu Search Index, Datago has direct access to keyword search index data from Baidu Index’s database. BSIA-Investor selects A-share stock codes in different formats as keywords, aggregates the corresponding Baidu Index data, and provides insights into the online search interest of Chinese investors for over 5,000 A-share stocks. This data helps investors better understand the market sentiment of millions of Chinese investors toward A-shares, including:
Investor Interest Measurement: A direct reflection of how Chinese investors’ interest in the A-share market fluctuates.
Cross-Comparison of Listed Companies: Search index data offers strong comparability, enabling users to assess differences in market attention among various listed companies and identify high-interest stocks.
Trend Tracking & Market Insights: By monitoring changes in the search popularity of individual stocks, investors can capture market hotspots, gain timely insights into potential investment opportunities, and leverage data for informed decision-making.
Coverage: 5000+ A-share stocks
History: 2016-01-01
Frequency: Daily
The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates usage and performance data for institutional repositories. The data are a subset of data from RAMP (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2018. For a description of the data collection, processing, and output methods, please see the "methods" section below. Note that the RAMP data model changed in August 2018, and two sets of documentation are provided to describe data collection and processing before and after the change.
The Catalog Search Terms dataset captures the words and phrases input by users in search bars that look through the data catalog for relevant information. Data can also be categorized by user segments.
This dataset includes data on the words and phrases input by users in search bars that look through the data catalog for relevant information. Catalog searches using the Discovery API are not included.
Each row in the dataset indicates the number of catalog searches made using the search term from the specified user segment during the noted hour.
Data are segmented by user type.
Data are updated by a system process at least once a day.
Please see Site Analytics: Catalog Search Terms for more detail.
Unlock the Power of Behavioural Data with GDPR-Compliant Clickstream Insights.
Swash clickstream data offers a comprehensive and GDPR-compliant dataset sourced from users worldwide, encompassing both desktop and mobile browsing behaviour. Here's an in-depth look at what sets us apart and how our data can benefit your organisation.
User-Centric Approach: Unlike traditional data collection methods, we take a user-centric approach by rewarding users for the data they willingly provide. This unique methodology ensures transparent data collection practices, encourages user participation, and establishes trust between data providers and consumers.
Wide Coverage and Varied Categories: Our clickstream data covers diverse categories, including search, shopping, and URL visits. Whether you are interested in understanding user preferences in e-commerce, analysing search behaviour across different industries, or tracking website visits, our data provides a rich and multi-dimensional view of user activities.
GDPR Compliance and Privacy: We prioritise data privacy and strictly adhere to GDPR guidelines. Our data collection methods are fully compliant, ensuring the protection of user identities and personal information. You can confidently leverage our clickstream data without compromising privacy or facing regulatory challenges.
Market Intelligence and Consumer Behaviour: Gain deep insights into market intelligence and consumer behaviour using our clickstream data. Understand trends, preferences, and user behaviour patterns by analysing the comprehensive user-level, time-stamped raw or processed data feed. Uncover valuable information about user journeys, search funnels, and paths to purchase to enhance your marketing strategies and drive business growth.
High-Frequency Updates and Consistency: We provide high-frequency updates and consistent user participation, offering both historical data and ongoing daily delivery. This ensures you have access to up-to-date insights and a continuous data feed for comprehensive analysis. Our reliable and consistent data empowers you to make accurate and timely decisions.
Custom Reporting and Analysis: We understand that every organisation has unique requirements. That's why we offer customisable reporting options, allowing you to tailor the analysis and reporting of clickstream data to your specific needs. Whether you need detailed metrics, visualisations, or in-depth analytics, we provide the flexibility to meet your reporting requirements.
Data Quality and Credibility: We take data quality seriously. Our data sourcing practices are designed to ensure responsible and reliable data collection. We implement rigorous data cleaning, validation, and verification processes, guaranteeing the accuracy and reliability of our clickstream data. You can confidently rely on our data to drive your decision-making processes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Five files, one of which is a ZIP archive, containing data that support the findings of this study. PDF file "IA screenshots CSU Libraries search config" contains screenshots captured from the Internet Archive's Wayback Machine for all 24 CalState libraries' homepages for years 2017 - 2019. Excel file "CCIHE2018-PublicDataFile" contains Carnegie Classifications data from the Indiana University Center for Postsecondary Research for all of the CalState campuses from 2018. CSV file "2017-2019_RAW" contains the raw data exported from Ex Libris Primo Analytics (OBIEE) for all 24 CalState libraries for calendar years 2017 - 2019. CSV file "clean_data" contains the cleaned data from Primo Analytics which was used for all subsequent analysis such as charting and import into SPSS for statistical testing. ZIP archive file "NonparametricStatisticalTestsFromSPSS" contains 23 SPSS files [.spv format] reporting the results of testing conducted in SPSS. This archive includes things such as normality check, descriptives, and Kruskal-Wallis H-test results.
https://dataintelo.com/privacy-and-policy
The global search and content analytics market size was estimated to be USD 5.9 billion in 2023 and is projected to reach USD 16.5 billion by 2032, growing at a CAGR of 12.1% over the forecast period. This substantial growth is primarily driven by the increasing demand for data-driven insights across various industries that aim to enhance their decision-making processes. The expansion of digital content and the need for effective content management and optimization are significant factors contributing to this upward trajectory. With businesses striving to improve their online presence and engagement, the emphasis on advanced analytics tools continues to rise, thereby fuelling the market's expansion.
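The quoted growth rate can be checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below assumes 2023 is the base year and that growth compounds over the nine years to 2032; the report itself does not spell out the compounding period.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted above: USD 5.9 billion (2023) to USD 16.5 billion (2032).
# Assumption: nine compounding periods, 2023 through 2032.
rate = cagr(5.9, 16.5, 2032 - 2023)
print(f"{rate:.1%}")  # prints 12.1%
```

Under these assumptions the result agrees with the 12.1% stated in the report.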
One of the core growth factors propelling the search and content analytics market is the exponential growth of data generated through digital channels. Businesses today are inundated with vast quantities of unstructured data derived from social media, web pages, online forums, and other digital environments. The ability to transform this raw data into actionable insights is increasingly becoming a competitive necessity. Organizations are leveraging search and content analytics to navigate this complex data landscape, enabling them to understand consumer behavior, optimize marketing strategies, and improve content delivery. This growing reliance on data analytics to derive meaningful insights from voluminous data sets is a crucial driver of market growth.
Technological advancements in artificial intelligence (AI) and machine learning (ML) are further accelerating the adoption of search and content analytics tools. These technologies enhance the capabilities of analytics software, enabling it to process and analyze large data sets with greater speed and accuracy. AI-powered analytics solutions offer features like natural language processing for more precise sentiment analysis, predictive analytics for forecasting trends, and automated recommendations for content optimization. The integration of AI and ML in analytics solutions not only streamlines operations but also enhances the precision and reliability of the insights generated, thus boosting the market growth.
The increasing focus on personalized customer experiences is another significant factor driving the search and content analytics market. As businesses seek to offer more personalized interactions, the need for understanding customer preferences and behaviors becomes paramount. Search and content analytics tools facilitate deeper audience insights, allowing companies to tailor their content and marketing strategies accordingly. This trend is particularly prevalent in sectors like retail, e-commerce, and media, where customer engagement and satisfaction are crucial. By leveraging analytics solutions, companies can refine their content strategies to better align with consumer expectations, thereby enhancing customer loyalty and driving revenue growth.
Regionally, North America is expected to lead the market, driven by the presence of major technology companies and early adoption of advanced analytics solutions. The region's strong technological infrastructure, coupled with a high concentration of digital businesses, facilitates the widespread implementation of search and content analytics tools. Europe follows closely, with increasing investments in digital transformation initiatives driving market expansion. Meanwhile, the Asia Pacific region is anticipated to witness the highest growth rate, spurred by the rapid digitalization and growing e-commerce industry in countries like China and India. These regional dynamics illustrate the global reach and potential of the search and content analytics market.
The search and content analytics market is segmented into software and services components, each playing a crucial role in the ecosystem of data-driven insights. Software solutions form the backbone of analytics applications, offering platforms for data collection, processing, and analysis. These solutions are critical for businesses seeking to harness the power of big data to drive strategic decisions. The software segment is witnessing robust growth, fueled by continuous innovations and enhancements in analytics capabilities. Cloud-based analytics solutions, in particular, are gaining traction due to their scalability, cost-effectiveness, and ease of deployment. As businesses increasingly migrate towards cloud infrastructures, the demand for cloud-based analytics software is expected to soar.
Within the services component, a broad spectrum of
https://creativecommons.org/publicdomain/zero/1.0/
The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website. It includes the following kinds of information:
Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.
Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.
Transactional data: information about the transactions that occur on the Google Merchandise Store website.
Banner Photo by Edho Pratama from Unsplash.
What is the total number of transactions generated per device browser in July 2017?
The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?
What was the average number of product pageviews for users who made a purchase in July 2017?
What was the average number of product pageviews for users who did not make a purchase in July 2017?
What was the average total transactions per user that made a purchase in July 2017?
What is the average amount of money spent per session in July 2017?
What is the sequence of pages viewed?
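The bounce-rate question above can be illustrated with a small Python sketch. It works on made-up `(traffic source, pageviews)` tuples rather than the actual Google Analytics 360 tables, so the data layout and source names are assumptions; only the definition (share of visits with a single pageview) comes from the text.

```python
from collections import defaultdict

# Hypothetical sessions: (traffic source, pageviews in the visit).
sessions = [
    ("google", 1), ("google", 5), ("google", 1), ("google", 3),
    ("direct", 1), ("direct", 2),
]

def bounce_rate_by_source(sessions):
    """Percentage of visits with exactly one pageview, per traffic source."""
    visits = defaultdict(int)
    bounces = defaultdict(int)
    for source, pageviews in sessions:
        visits[source] += 1
        if pageviews == 1:
            bounces[source] += 1
    return {source: 100.0 * bounces[source] / visits[source] for source in visits}

rates = bounce_rate_by_source(sessions)
```

Here both sources come out at 50.0, since half of each source's visits have a single pageview.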
https://www.zionmarketresearch.com/privacy-policy
Global Text Analytics Market size valued at US$ 10.02 Billion in 2023, set to reach US$ 46.61 Billion by 2032 at a CAGR of about 18.62% from 2024 to 2032.
https://spdx.org/licenses/CC0-1.0.html
The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates usage and performance data for institutional repositories. The data are a subset of data from RAMP (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2021. For a description of the data collection, processing, and output methods, please see the "methods" section below.
The record will be revised periodically to make new data available through the remainder of 2021.
Methods
Data Collection
RAMP data are downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).
Data are downloaded in two sets per participating IR. The first set includes page level statistics about URLs pointing to IR pages and content files. The following fields are downloaded for each URL, with one row per URL:
url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
impressions: The number of times the URL appears within the SERP.
clicks: The number of clicks on a URL which took users to a page outside of the SERP.
clickThrough: Calculated as the number of clicks divided by the number of impressions.
position: The position of the URL within the SERP.
date: The date of the search.
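As a small illustration of the clickThrough calculation, the Python sketch below derives it from clicks and impressions for rows that use the field names listed above. The sample rows and URLs are invented, and returning 0.0 when there are no impressions is an assumption rather than documented RAMP behavior.

```python
def click_through(clicks, impressions):
    """Clicks divided by impressions; 0.0 when there were no impressions (assumed)."""
    return clicks / impressions if impressions else 0.0

# Hypothetical page-level rows using the field names listed above.
rows = [
    {"url": "https://repository.example.edu/item/1", "impressions": 200,
     "clicks": 10, "position": 4.2, "date": "2021-03-01"},
    {"url": "https://repository.example.edu/item/2.pdf", "impressions": 0,
     "clicks": 0, "position": 0.0, "date": "2021-03-01"},
]

for row in rows:
    row["clickThrough"] = click_through(row["clicks"], row["impressions"])
```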
Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page level data.
The second set includes similar information, but instead of being aggregated at the page level, the data are grouped based on the country from which the user submitted the corresponding search, and the type of device used. The following fields are downloaded for each combination of country and device, with one row per country/device combination:
country: The country from which the corresponding search originated.
device: The device used for the search.
impressions: The number of times the URL appears within the SERP.
clicks: The number of clicks on a URL which took users to a page outside of the SERP.
clickThrough: Calculated as the number of clicks divided by the number of impressions.
position: The position of the URL within the SERP.
date: The date of the search.
Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.
More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en
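As an illustration of the two downloads described above, the sketch below builds one Search Analytics query body per data set and maps a response row onto the field names used in this record. The request keys (startDate, endDate, dimensions, rowLimit) and the response keys (keys, clicks, impressions, ctr, position) follow Google's Search Console API documentation. The function names, the one-day date window, and the row limit are assumptions made for this example, not RAMP's documented configuration.

```python
from datetime import date


def build_gsc_requests(day: date, row_limit: int = 25000) -> dict:
    """Build two Search Analytics query bodies for a single day.

    Assumption: one request per day and per data set; RAMP's actual
    request parameters are not stated in this record.
    """
    iso_day = day.isoformat()
    common = {"startDate": iso_day, "endDate": iso_day, "rowLimit": row_limit}
    return {
        # Page level set: one row per URL.
        "page": {**common, "dimensions": ["page", "date"]},
        # Country/device set: one row per country/device combination.
        "country_device": {**common, "dimensions": ["country", "device", "date"]},
    }


def row_to_record(row: dict, dimensions: list) -> dict:
    """Map one API response row to the field names used in this record."""
    # The API returns dimension values in row["keys"], in request order.
    record = dict(zip(dimensions, row["keys"]))
    if "page" in record:
        record["url"] = record.pop("page")  # 'page' is published as 'url'
    record["clicks"] = row["clicks"]
    record["impressions"] = row["impressions"]
    record["clickThrough"] = row["ctr"]  # clicks divided by impressions
    record["position"] = row["position"]
    return record
```

Each body would be sent to the API's searchanalytics query method for a verified site URL; authentication and paging are omitted here.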
Data Processing
Upon download from GSC, the page level data described above are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of page level statistics from GSC, URLs are analyzed to determine whether they point to HTML pages or actual content files, and URLs that point to content files are flagged as "citable content." Following this analysis, one more field, citableContent, is added to the page level data in addition to the fields downloaded from GSC. It records whether each page/URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."
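The flagging step can be sketched as a check on the file extension in the URL path. This is a minimal sketch under stated assumptions: the record does not describe how RAMP analyzes URLs, and the extension list below is hypothetical (the record names only "PDF, CSV, etc.").

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allowlist of content file extensions, for illustration only.
CONTENT_EXTENSIONS = {".pdf", ".csv", ".doc", ".docx", ".xls", ".xlsx", ".zip"}


def citable_content(url: str) -> str:
    """Return "Yes" if the URL path ends in a content file extension, else "No"."""
    path = urlparse(url).path  # drops any query string or fragment
    extension = posixpath.splitext(path)[1].lower()
    return "Yes" if extension in CONTENT_EXTENSIONS else "No"
```

For example, citable_content("https://repo.example.edu/bitstream/1/2/thesis.PDF?sequence=1") returns "Yes", while an item landing page such as "https://repo.example.edu/handle/1/2" returns "No". Both URLs are made up.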
The data aggregated by the search country of origin and device type do not include URLs. No additional processing is done on these data. Harvested data are passed directly into Elasticsearch.
Processed data are then saved in a series of Elasticsearch indices. Currently, RAMP stores data in two indices per participating IR: one index includes the page level data, and the second includes the country of origin and device type data.
About Citable Content Downloads
Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository content, CCD represent click activity on IR content that may correspond to research use.
CCD information is summary data calculated on the fly within the RAMP web application. As noted above, data provided by GSC include whether and how many times a URL was clicked by users. Within RAMP, a "click" is counted as a potential download, so a CCD is calculated as the sum of clicks on pages/URLs that are determined to point to citable content (as defined above).
For any specified date range, the steps to calculate CCD are:
Filter data to only include rows where "citableContent" is set to "Yes."
Sum the value of the "clicks" field on these rows.
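The two steps above can be sketched against rows read from a page-clicks CSV file. The field names match those described in this record. The sample rows are made up for illustration, and the sketch assumes that values in the date field begin with an ISO 8601 date (YYYY-MM-DD), which this record does not confirm.

```python
import csv
import io
from datetime import date

# Made-up sample rows, for illustration only.
SAMPLE = """url,clicks,date,citableContent
https://repo.example.edu/a.pdf,3,2021-01-05,Yes
https://repo.example.edu/handle/1,7,2021-01-05,No
https://repo.example.edu/b.csv,2,2021-01-20,Yes
https://repo.example.edu/a.pdf,4,2021-02-01,Yes
"""


def citable_content_downloads(rows, start: date, end: date) -> int:
    """Sum clicks on citable content rows within an inclusive date range."""
    total = 0
    for row in rows:
        # Assumption: the date value starts with YYYY-MM-DD.
        day = date.fromisoformat(row["date"][:10])
        if start <= day <= end and row["citableContent"] == "Yes":
            total += int(float(row["clicks"]))
    return total


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
january_ccd = citable_content_downloads(rows, date(2021, 1, 1), date(2021, 1, 31))
print(january_ccd)  # 5 for the sample rows above
```

To run this against a published file, replace io.StringIO(SAMPLE) with an open file handle for one of the page-clicks CSV files.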
Output to CSV
Published RAMP data are exported from the production Elasticsearch instance and converted to CSV format. The CSV data consist of one "row" for each page or URL from a specific IR which appeared in search result pages (SERP) within Google properties as described above. Also as noted above, daily data are downloaded for each IR in two sets which cannot be combined. One dataset includes the URLs of items that appear in SERP. The second dataset is aggregated by combination of the country from which a search was conducted and the device used.
As a result, two CSV datasets are provided for each month of published data:
page-clicks:
The data in these CSV files correspond to the page-level data, and include the following fields:
url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
impressions: The number of times the URL appears within the SERP.
clicks: The number of clicks on a URL which took users to a page outside of the SERP.
clickThrough: Calculated as the number of clicks divided by the number of impressions.
position: The position of the URL within the SERP.
date: The date of the search.
citableContent: Whether or not the URL points to a content file (ending with pdf, csv, etc.) rather than HTML wrapper pages. Possible values are Yes or No.
index: The Elasticsearch index corresponding to page click data for a single IR.
repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
Filenames for files containing these data end with “page-clicks”. For example, the file named 2021-01_RAMP_all_page-clicks.csv contains page level click data for all RAMP participating IR for the month of January, 2021.
country-device-info:
The data in these CSV files correspond to the data aggregated by country from which a search was conducted and the device used. These include the following fields:
country: The country from which the corresponding search originated.
device: The device used for the search.
impressions: The number of times the URL appears within the SERP.
clicks: The number of clicks on a URL which took users to a page outside of the SERP.
clickThrough: Calculated as the number of clicks divided by the number of impressions.
position: The position of the URL within the SERP.
date: The date of the search.
index: The Elasticsearch index corresponding to country and device access information data for a single IR.
repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
Filenames for files containing these data end with “country-device-info”. For example, the file named 2021-01_RAMP_all_country-device-info.csv contains country and device data for all participating IR for the month of January, 2021.
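As recommended above, aggregation for individual repositories should use repository_id rather than index. The sketch below groups country-device-info rows by repository_id and device, then recomputes the click-through rate from the summed clicks and impressions rather than averaging the per-row clickThrough values, which would weight every row equally regardless of its impression count. The sample rows, repository identifiers, and index names are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; note that repo_a appears under two index names.
SAMPLE = """country,device,impressions,clicks,clickThrough,position,date,index,repository_id
usa,MOBILE,100,10,0.1,5.0,2021-01-05,idx_old,repo_a
gbr,MOBILE,300,10,0.0333,8.0,2021-01-05,idx_new,repo_a
usa,DESKTOP,50,5,0.1,3.0,2021-01-05,idx_b,repo_b
"""


def summarize_by_repository(rows) -> dict:
    """Total clicks and impressions per (repository_id, device) pair."""
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for row in rows:
        key = (row["repository_id"], row["device"])
        totals[key]["clicks"] += int(float(row["clicks"]))
        totals[key]["impressions"] += int(float(row["impressions"]))
    summary = {}
    for key, counts in totals.items():
        # Click-through is recomputed from the totals: clicks / impressions.
        rate = counts["clicks"] / counts["impressions"] if counts["impressions"] else 0.0
        summary[key] = {**counts, "clickThrough": rate}
    return summary


summary = summarize_by_repository(csv.DictReader(io.StringIO(SAMPLE)))
for key, values in sorted(summary.items()):
    print(key, values)
```

Because the grouping key is repository_id, the two rows stored under different index names are counted as one repository.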
References
Google, Inc. (2021). Search Console APIs. Retrieved from https://developers.google.com/webmaster-tools/search-console-api-original.