88 datasets found
  1. Data for study "Direct Answers in Google Search Results"

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 9, 2020
    Cite
    Strzelecki, Artur; Rutecka, Paulina (2020). Data for study "Direct Answers in Google Search Results" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3541091
    Explore at:
    Dataset updated
    Jun 9, 2020
    Dataset provided by
    University of Economics in Katowice
    Authors
    Strzelecki, Artur; Rutecka, Paulina
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The goal of this research is to examine direct answers in the Google web search engine. The dataset was collected using Senuto (https://www.senuto.com/), an online tool that extracts data on website visibility from the Google search engine.

    Dataset contains the following elements:

    keyword,

    number of monthly searches,

    featured domain,

    featured main domain,

    featured position,

    featured type,

    featured url,

    content,

    content length.

    The dataset with visibility structure contains 743,798 keywords that resulted in SERPs with a direct answer.

  2. Importance of big data search technologies in organizations worldwide 2019

    • statista.com
    Updated Jun 15, 2020
    Cite
    Statista (2020). Importance of big data search technologies in organizations worldwide 2019 [Dataset]. https://www.statista.com/statistics/1026471/worldwide-big-data-search/
    Explore at:
    Dataset updated
    Jun 15, 2020
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2019
    Area covered
    Worldwide
    Description

    This statistic shows the importance of big data search technologies in organizations worldwide as of 2019. Around ** percent of respondents stated that Elasticsearch was critical or very important for their organization as of 2019.

  3. Efficient Keyword-Based Search for Top-K Cells in Text Cube - Dataset - NASA...

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). Efficient Keyword-Based Search for Top-K Cells in Text Cube - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/efficient-keyword-based-search-for-top-k-cells-in-text-cube
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Previous studies on supporting free-form keyword queries over RDBMSs provide users with linked structures (e.g., a set of joined tuples) that are relevant to a given keyword query. Most of them focus on ranking individual tuples from one table or joins of multiple tables containing a set of keywords. In this paper, we study the problem of keyword search in a data cube with text-rich dimension(s) (a so-called text cube). The text cube is built on a multidimensional text database, where each row is associated with some text data (a document) and other structural dimensions (attributes). A cell in the text cube aggregates a set of documents with matching attribute values in a subset of dimensions. We define a keyword-based query language and an IR-style relevance model for scoring/ranking cells in the text cube. Given a keyword query, our goal is to find the top-k most relevant cells. We propose four approaches: inverted-index one-scan, document sorted-scan, bottom-up dynamic programming, and search-space ordering. The search-space ordering algorithm explores only a small portion of the text cube for finding the top-k answers, and enables early termination. Extensive experimental studies are conducted to verify the effectiveness and efficiency of the proposed approaches. Citation: B. Ding, B. Zhao, C. X. Lin, J. Han, C. Zhai, A. N. Srivastava, and N. C. Oza, “Efficient Keyword-Based Search for Top-K Cells in Text Cube,” IEEE Transactions on Knowledge and Data Engineering, 2011.
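
    To make the notion of a cell concrete, the following is a brute-force baseline sketch, not one of the four approaches named in the abstract. It enumerates every non-empty subset of dimensions, groups documents into cells, and ranks cells with a placeholder score (average number of query-term occurrences per document). The function name, the row layout, and the scoring function are all assumptions made for illustration; the paper's own relevance model is not reproduced here.

```python
import heapq
from collections import defaultdict
from itertools import combinations

def top_k_cells(rows, dims, query, k):
    """Brute-force top-k cells. rows is a list of (attributes dict, document text)."""
    terms = query.lower().split()
    cells = defaultdict(list)
    for attrs, text in rows:
        tokens = text.lower().split()
        # A document belongs to one cell per non-empty subset of dimensions.
        for size in range(1, len(dims) + 1):
            for subset in combinations(dims, size):
                cell = tuple((d, attrs[d]) for d in subset)
                cells[cell].append(tokens)

    def score(docs):
        # Placeholder relevance: mean count of query terms per document (assumption).
        return sum(sum(doc.count(t) for t in terms) for doc in docs) / len(docs)

    return heapq.nlargest(k, ((score(docs), cell) for cell, docs in cells.items()))

# Made-up rows with two structural dimensions and one text field.
rows = [
    ({"brand": "A", "year": "2010"}, "engine failure engine"),
    ({"brand": "A", "year": "2011"}, "smooth ride"),
    ({"brand": "B", "year": "2010"}, "engine noise"),
]
best = top_k_cells(rows, ["brand", "year"], "engine", 1)
```

    Because this baseline materializes every cell, its cost grows exponentially with the number of dimensions, which is the kind of blow-up that approaches with early termination aim to avoid.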

  4. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data. Purpose As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to establish where agricultural researchers in the United States-- land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies -- currently publish their data, including general research data repositories, domain-specific databases, and the top journals compare how much data is in institutional vs. domain-specific vs. federal platforms determine which repositories are recommended by top journals that require or recommend the publication of supporting data ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data Approach The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. 
To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered. Search methods We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” /“ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects. We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. 
General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories. Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo. Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals were compiled, in which USDA published in 2012 and 2016, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources and Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each the top 50 journals. 
Evaluation We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind. Results A summary of the major findings from our data review: Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors. There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection. Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation. See included README file for descriptions of each individual data file in this dataset.
Resources in this dataset:
Resource Title: Journals. File Name: Journals.csv
Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt

  5. AI Dataset Search Platform Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 21, 2025
    Cite
    Growth Market Reports (2025). AI Dataset Search Platform Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/ai-dataset-search-platform-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Aug 21, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    AI Dataset Search Platform Market Outlook



    According to our latest research, the global AI Dataset Search Platform market size is valued at USD 1.18 billion in 2024, with a robust year-over-year expansion driven by the escalating demand for high-quality datasets to fuel artificial intelligence and machine learning initiatives across industries. The market is expected to grow at a CAGR of 22.6% from 2025 to 2033, reaching an estimated USD 9.62 billion by 2033. This exponential growth is primarily attributed to the increasing recognition of data as a strategic asset, the proliferation of AI applications across sectors, and the need for efficient, scalable, and secure platforms to discover, curate, and manage diverse datasets.
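
    The growth figures quoted above can be sanity-checked with the standard compound-growth formula. The sketch below is a reader-side check, not the report's own method: the helper names are hypothetical, and compounding the 2024 value over 9 years (2024 to 2033) is an assumption, since the report does not state its base year or compounding convention.

```python
def project_value(start_value, rate, years):
    """Compound start_value at a constant annual rate for the given number of years."""
    return start_value * (1 + rate) ** years

def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted in the description (USD billion); 9 compounding years is an assumption.
projected_2033 = project_value(1.18, 0.226, 9)
implied_rate = implied_cagr(1.18, 9.62, 9)
print(f"projected 2033 value at the stated CAGR: {projected_2033:.2f}")
print(f"CAGR implied by the stated start and end values: {implied_rate:.1%}")
```

    If the two printed numbers do not match the quoted figures, that may simply reflect a different base year or rounding in the report rather than an error in either.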



    One of the primary growth factors propelling the AI Dataset Search Platform market is the exponential surge in AI adoption across both public and private sectors. Businesses and institutions are increasingly leveraging AI to gain competitive advantages, enhance operational efficiencies, and deliver personalized experiences. However, the effectiveness of AI models is fundamentally reliant on the quality and diversity of training datasets. As organizations strive to accelerate their AI initiatives, the need for platforms that can efficiently search, aggregate, and validate datasets from disparate sources has become paramount. This has led to a significant uptick in investments in AI dataset search platforms, as they enable faster data discovery, reduce development cycles, and ensure compliance with data governance standards.



    Another key driver for the market is the growing complexity and volume of data generated from emerging technologies such as IoT, edge computing, and connected devices. The sheer scale and heterogeneity of data sources necessitate advanced search platforms equipped with intelligent indexing, semantic search, and metadata management capabilities. These platforms not only facilitate the identification of relevant datasets but also support data annotation, labeling, and preprocessing, which are critical for building robust AI models. Furthermore, the integration of AI-powered search algorithms within these platforms enhances the accuracy and relevance of search results, thereby improving the overall efficiency of data scientists and AI practitioners.



    Additionally, regulatory pressures and the increasing emphasis on ethical AI have underscored the importance of transparent and auditable data sourcing. Organizations are compelled to demonstrate the provenance and integrity of the datasets used in their AI models to mitigate risks related to bias, privacy, and compliance. AI dataset search platforms address these challenges by providing traceability, version control, and access management features, ensuring that only authorized and compliant datasets are utilized. This not only reduces legal and reputational risks but also fosters trust among stakeholders, further accelerating market adoption.



    From a regional perspective, North America dominates the AI Dataset Search Platform market in 2024, accounting for over 38% of the global revenue. This leadership is driven by the presence of major technology providers, a mature AI ecosystem, and substantial investments in research and development. Europe follows closely, benefiting from stringent data privacy regulations and strong government support for AI innovation. The Asia Pacific region is experiencing the fastest growth, propelled by rapid digital transformation, expanding AI research communities, and increasing government initiatives to foster AI adoption. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as organizations in these regions gradually embrace AI-driven solutions.





    Component Analysis



    The AI Dataset Search Platform market by component is segmented into platforms and services, each playing a pivotal role in the ecosystem. The platform segment encompasses the core software infrastructure that enables users to search, index, curate, and manage datasets. This segmen

  6. Search strings used to generate citation counts for three data sets in WoS,...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 26, 2014
    Cite
    Belter, Christopher W. (2014). Search strings used to generate citation counts for three data sets in WoS, publishers' full text websites, and Google Scholar. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001239723
    Explore at:
    Dataset updated
    Mar 26, 2014
    Authors
    Belter, Christopher W.
    Description

    Search strings used to generate citation counts for three data sets in WoS, publishers' full text websites, and Google Scholar.

  7. Data from: Searching Data: A Review of Observational Data Retrieval...

    • narcis.nl
    • ssh.datastations.nl
    bibtex
    Updated Jul 21, 2017
    Cite
    Gregory, K. (Data Archiving and Networked Services) (2017). Searching Data: A Review of Observational Data Retrieval Practices [Dataset]. http://doi.org/10.17026/dans-zgu-qfpj
    Explore at:
    Available download formats: bibtex
    Dataset updated
    Jul 21, 2017
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Gregory, K. (Data Archiving and Networked Services)
    Description

    This study employed an extensive literature review to identify commonalities in the data retrieval practices of users of observational data. This dataset consists of a BibTeX file with the 146 bibliographic references examined in:

    Gregory, K., Groth, P., Cousijn, H., Scharnhorst, A., & Wyatt, S. (2017). Searching Data: A Review of Observational Data Retrieval Practices. arXiv:1707.06937 [cs.DL]

    The body of literature in the dataset was retrieved using different combinations of keyword searches, primarily in the Scopus database, across all fields. Keyword searches related to information retrieval (e.g. user behavior, information seeking, information retrieval) and data practices (e.g. research practices, community practices, data sharing, data reuse) were combined with keyword searches for research data. As the terms “data” and “search” are ubiquitous in academic literature, title searches also were employed and combined with the controlled vocabulary of the database to locate relevant information. Searches in Scopus included strings such as:

    KEY ( user AND information ) AND TITLE-ABS-KEY ("research data" OR ( scien* W/1 data ) OR ( data W/1 ( repositor* OR archive* ) ) )

    TITLE ( data W/0 ( search OR retriev* OR discover* OR access* OR sharing OR reus* ) )
    AND ( LIMIT-TO ( EXACTKEYWORD , "Information Retrieval" ) OR LIMIT-TO ( EXACTKEYWORD , "Data Retrieval" ) OR LIMIT-TO ( EXACTKEYWORD , "Data Reuse" ) )

    Bibliometric techniques such as citation chaining and related records were also applied. Pertinent journals and conference proceedings not indexed within Scopus (e.g. the International Journal of Digital Curation) were searched directly using similar keywords.

    The approximately 400 retrieved documents were examined by close reading to identify articles referring to observational data for inclusion in the final dataset.

    Acknowledgements: This work was funded by NWO Grant 652.001.002 (programme Creative Industrie - Thematisch Onderzoek (CI-TO), Re-SEARCH: Contextual Search for Scientific Research Data).

  8. Leading search engines in the United States 2015-2025, by market share

    • statista.com
    Updated Jan 15, 2015
    Cite
    Statista (2015). Leading search engines in the United States 2015-2025, by market share [Dataset]. https://www.statista.com/statistics/1385902/market-share-leading-search-engines-usa/
    Explore at:
    Dataset updated
    Jan 15, 2015
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2015 - Apr 2025
    Area covered
    United States
    Description

    In April 2025, Google accounted for ***** percent of the search market in the United States across all devices. Bing followed as the second leading search provider in the United States during the last examined month, with a share of around *** percent, among the engine's highest quotas registered in the country to date.

  9. Levels of comfort with data collection by search engines in the European...

    • statista.com
    Updated Jun 9, 2016
    Cite
    Statista (2016). Levels of comfort with data collection by search engines in the European Union 2016 [Dataset]. https://www.statista.com/statistics/602887/levels-of-comfort-with-data-collection-by-search-engines-in-the-european-union/
    Explore at:
    Dataset updated
    Jun 9, 2016
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Apr 9, 2016 - Apr 18, 2016
    Area covered
    European Union
    Description

    This statistic displays the findings of a survey on the distribution of comfort levels with online data collection by search engines in order to provide personalized online experiences in the European Union (EU-28) as of April 2016. During the survey period, five percent of respondents reported that they were very comfortable with search engines collecting their personal data in order to tailor a personalized web experience.

  10. Repository Analytics and Metrics Portal (RAMP) 2018 data

    • data.niaid.nih.gov
    • dataone.org
    • +1more
    zip
    Updated Jul 27, 2021
    Cite
    Jonathan Wheeler; Kenning Arlitsch (2021). Repository Analytics and Metrics Portal (RAMP) 2018 data [Dataset]. http://doi.org/10.5061/dryad.ffbg79cvp
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 27, 2021
    Dataset provided by
    Montana State University
    University of New Mexico
    Authors
    Jonathan Wheeler; Kenning Arlitsch
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates use and performance data of institutional repositories (IR). The data are a subset of data from RAMP (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2018. For a description of the data collection, processing, and output methods, please see the "Methods" section below. Note that the RAMP data model changed in August 2018, and two sets of documentation are provided to describe data collection and processing before and after the change.

    Methods

    RAMP Data Documentation – January 1, 2017 through August 18, 2018

    Data Collection

    RAMP data were downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data from January 1, 2017 through August 18, 2018 were downloaded in one dataset per participating IR. The following fields were downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    country: The country from which the corresponding search originated.
    device: The device used for the search.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page level data.

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, data are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or actual content files. URLs that point to content files are flagged as "citable content." In addition to the fields downloaded from GSC described above, following this brief analysis one more field, citableContent, is added to the data which records whether each URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."
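
    The flagging step can be sketched as a simple check on the URL path. This is a minimal illustration under stated assumptions, not RAMP's actual code: the function name is hypothetical, and treating a missing extension or an .html/.htm extension as an HTML page (and everything else as a content file) is an assumption about how "non-HTML content file" is decided.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: paths with no extension, or an HTML extension, are HTML pages.
HTML_SUFFIXES = {"", ".html", ".htm"}

def citable_content(url):
    """Return "Yes" if the URL path looks like a non-HTML content file, else "No"."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return "No" if suffix in HTML_SUFFIXES else "Yes"

# Made-up example URLs.
flag_pdf = citable_content("https://ir.example.edu/bitstream/1/2/thesis.pdf")
flag_page = citable_content("https://ir.example.edu/handle/1/2")
```

    Parsing the URL first keeps query strings and fragments from being mistaken for part of the file extension.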

    Processed data are then saved in a series of Elasticsearch indices. From January 1, 2017, through August 18, 2018, RAMP stored data in one index per participating IR.

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository content, CCD represent click activity on IR content that may correspond to research use.

    CCD information is summary data calculated on the fly within the RAMP web application. As noted above, data provided by GSC include whether and how many times a URL was clicked by users. Within RAMP, a "click" is counted as a potential download, so a CCD is calculated as the sum of clicks on pages/URLs that are determined to point to citable content (as defined above).

    For any specified date range, the steps to calculate CCD are:

    Filter data to only include rows where "citableContent" is set to "Yes."
    Sum the value of the "clicks" field on these rows.
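
    The two steps above can be sketched as follows. The field names "citableContent", "clicks", and "date" come from the field list in this description; the function name and the ISO date format (YYYY-MM-DD) are assumptions made for the example.

```python
from datetime import date

def citable_content_downloads(rows, start, end):
    """Sum clicks on rows flagged as citable content within [start, end]."""
    total = 0
    for row in rows:
        day = date.fromisoformat(row["date"])  # assumes ISO-formatted dates
        if start <= day <= end and row["citableContent"] == "Yes":
            total += int(row["clicks"])
    return total

# Made-up rows in the shape of the page level data.
rows = [
    {"date": "2018-01-05", "citableContent": "Yes", "clicks": "3"},
    {"date": "2018-01-06", "citableContent": "No", "clicks": "7"},
    {"date": "2018-02-01", "citableContent": "Yes", "clicks": "4"},
]
january_ccd = citable_content_downloads(rows, date(2018, 1, 1), date(2018, 1, 31))
```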
    

    Output to CSV

    Published RAMP data are exported from the production Elasticsearch instance and converted to CSV format. The CSV data consist of one "row" for each page or URL from a specific IR which appeared in search result pages (SERP) within Google properties as described above.

    The data in these CSV files include the following fields:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    country: The country from which the corresponding search originated.
    device: The device used for the search.
    date: The date of the search.
    citableContent: Whether or not the URL points to a content file (ending with pdf, csv, etc.) rather than HTML wrapper pages. Possible values are Yes or No.
    index: The Elasticsearch index corresponding to page click data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the index field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data follow the format 2018-01_RAMP_all.csv. Using this example, the file 2018-01_RAMP_all.csv contains all data for all RAMP participating IR for the month of January, 2018.
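
    As a sketch of how the monthly files might be consumed, the example below builds the twelve filenames for a year and aggregates clicks and impressions per repository_id, recomputing clickThrough from the totals. It assumes the CSV header row uses the field names exactly as listed above; the helper names and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

def monthly_filenames(year):
    """Filenames following the 2018-01_RAMP_all.csv pattern for every month of a year."""
    return [f"{year}-{month:02d}_RAMP_all.csv" for month in range(1, 13)]

def totals_by_repository(csv_file):
    """Aggregate clicks and impressions per repository_id from an open CSV file."""
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for row in csv.DictReader(csv_file):
        entry = totals[row["repository_id"]]
        # float() first, in case counts were exported as "3.0" (assumption).
        entry["clicks"] += int(float(row["clicks"]))
        entry["impressions"] += int(float(row["impressions"]))
    for entry in totals.values():
        # clickThrough = clicks / impressions, guarding against zero impressions.
        entry["clickThrough"] = (
            entry["clicks"] / entry["impressions"] if entry["impressions"] else 0.0
        )
    return dict(totals)

# Tiny made-up sample standing in for one monthly file.
sample = io.StringIO(
    "url,impressions,clicks,repository_id\n"
    "https://ir.example.edu/a.pdf,10,2,example_ir\n"
    "https://ir.example.edu/b,30,3,example_ir\n"
)
totals = totals_by_repository(sample)
```

    Grouping on repository_id rather than index follows the recommendation in the field description, since index names have not remained consistent over time.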

    Data Collection from August 19, 2018 Onward

    RAMP data are downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data are downloaded in two sets per participating IR. The first set includes page level statistics about URLs pointing to IR pages and content files. The following fields are downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page level data.

    The second set includes similar information, but instead of being aggregated at the page level, the data are grouped based on the country from which the user submitted the corresponding search and the type of device used. The following fields are downloaded for each combination of country and device, with one row per country/device combination:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, the page level data described above are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of page level statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or actual content files. URLs that point to content files are flagged as "citable content." In addition to the fields downloaded from GSC described above, following this brief analysis one more field, citableContent, is added to the page level data which records whether each page/URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."
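The description above does not say how URLs are analyzed, so the following is only a sketch of one plausible approach: checking the file extension of the URL path against a list of content-file extensions. The extension list and the function name are hypothetical, and the actual RAMP analysis may work differently.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical list of extensions treated as content files. The source only
# names PDF and CSV as examples; the rest are assumptions of this sketch.
CONTENT_EXTENSIONS = {".pdf", ".csv", ".doc", ".docx", ".xls", ".xlsx", ".zip"}


def citable_content(url):
    """Return "Yes" if the URL path ends in a content-file extension, else "No"."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return "Yes" if extension in CONTENT_EXTENSIONS else "No"
```

Under these assumptions, a URL ending in /paper.pdf is flagged "Yes", while an item landing page with no file extension is flagged "No". Query strings are ignored because only the path component is inspected.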

    The data aggregated by the search country of origin and device type do not include URLs. No additional processing is done on these data. Harvested data are passed directly into Elasticsearch.

    Processed data are then saved in a series of Elasticsearch indices. Currently, RAMP stores data in two indices per participating IR: one index includes the page level data, and the second includes the country of origin and device type data.

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository

  11. E

    Eritrea Google Search Trends: Computer & Electronics: Apple

    • ceicdata.com
    Updated Nov 30, 2025
    CEICdata.com (2025). Eritrea Google Search Trends: Computer & Electronics: Apple [Dataset]. https://www.ceicdata.com/en/eritrea/google-search-trends-by-categories/google-search-trends-computer--electronics-apple
    Dataset updated
    Nov 30, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 19, 2025 - Nov 30, 2025
    Area covered
    Eritrea
    Description

    Eritrea Google Search Trends: Computer & Electronics: Apple data was reported at 0.000 Score in 30 Nov 2025. This stayed constant from the previous number of 0.000 Score for 29 Nov 2025. Eritrea Google Search Trends: Computer & Electronics: Apple data is updated daily, averaging 0.000 Score from Dec 2021 (Median) to 30 Nov 2025, with 1461 observations. The data reached an all-time high of 100.000 Score in 17 Apr 2023 and a record low of 0.000 Score in 30 Nov 2025. Eritrea Google Search Trends: Computer & Electronics: Apple data remains active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s Eritrea – Table ER.Google.GT: Google Search Trends: by Categories.

  12. T

    Internet Service Providers, Web Search Portals, and Data Processing Services...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Apr 18, 2020
    TRADING ECONOMICS (2020). Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas [Dataset]. https://tradingeconomics.com/united-states/internet-service-providers-web-search-portals-and-data-processing-services-payroll-employment-in-texas-discontinued-fed-data.html
    Available download formats: json, csv, xml, excel
    Dataset updated
    Apr 18, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Texas
    Description

    Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas was 1.84566 Dec to Dec % Chg. in January of 2024, according to the United States Federal Reserve. Historically, Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas reached a record high of 15.43259 in January of 2022 and a record low of -21.87562 in January of 2006. Trading Economics provides the current actual value, an historical data chart and related indicators for Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas - last updated from the United States Federal Reserve on November of 2025.

  13. T

    Internet Service Providers, Web Search Portals, and Data Processing Services...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 1, 2020
    TRADING ECONOMICS (2020). Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas [Dataset]. https://tradingeconomics.com/united-states/internet-service-providers-web-search-portals-and-data-processing-services-payroll-employment-in-texas-percent-change-at-annual-rate-fed-data.html
    Available download formats: csv, excel, json, xml
    Dataset updated
    Mar 1, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Texas
    Description

    Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas was -6.20983 % Chg. at Annual Rate in August of 2025, according to the United States Federal Reserve. Historically, Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas reached a record high of 139.34517 in January of 1991 and a record low of -93.47831 in January of 2006. Trading Economics provides the current actual value, an historical data chart and related indicators for Internet Service Providers, Web Search Portals, and Data Processing Services Payroll Employment in Texas - last updated from the United States Federal Reserve on December of 2025.

  14. N

    Data from: The current landscape of author guidelines in chemistry through...

    • search.nfdi4chem.de
    • radar-service.eu
    • +2more
    tar
    Updated Dec 3, 2025
    Radar4Chem (2025). The current landscape of author guidelines in chemistry through the lens of research data sharing [Dataset]. http://doi.org/10.22000/702
    Available download formats: tar
    Dataset updated
    Dec 3, 2025
    Dataset provided by
    Radar4Chem
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data package contains the results of a large-scale analysis of author guidelines from several publishers and journals active in chemistry research, showing how well the publishing landscape supports different criteria.

  15. the relationship between GDPR and Personal Data

    • kaggle.com
    zip
    Updated Feb 7, 2021
    Dalmacyali1905 (2021). the relationship between GDPR and Personal Data [Dataset]. https://www.kaggle.com/dalmacyali1905/the-relationship-between-gdpr-and-personal-data
    Available download formats: zip (269460 bytes)
    Dataset updated
    Feb 7, 2021
    Authors
    Dalmacyali1905
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The Event: I wanted to analyze whether there is a relationship between the General Data Protection Regulation (GDPR) and Personal Data in terms of web searches in specific periods. To do so, I used Google Trends statistics. I compared worldwide web searches for the topics GDPR and Personal Data. I used worldwide searches because Google Trends otherwise allows only one country to be examined per data collection. I used Law & Government as the category for one independent variable.

    Content

    Given both statements for GDPR and Personal Data, one can understand that GDPR acknowledges the fundamental rights and freedoms of natural persons, and that it also acknowledges the freedom of personal data. From this we can infer that GDPR does not restrict the movement of personal data; it allows that movement while protecting natural persons’ rights. Since data protection, personal data, and the laws related to these two terms are unfamiliar to many people, it would be interesting to analyze the public’s searches on Google Trends. One may find a strong correlation when the research covers fresh topics. Also, shedding academic light on untouched fields is always expected of scholars.

    Inspiration

    GDPR was enacted on 27 April 2016. Since that date, the movement of personal data has been regulated. Controllers who process personal data need to follow GDPR’s rules. But a natural person might not know what she/he needs to do, and not everyone has the privilege of consulting a lawyer. So a simple Google search is the expected course of action for most people. Given that, I wanted to analyze whether there is a relationship between the GDPR and Personal Data topics in web searches. I proved that GDPR has affected Personal Data searches on Google Trends. My purpose for this project is to shed light on the effect of data protection laws on our lives. It is a new regulation. As data subjects, we should learn our fundamental rights.

  16. N

    Luteolin 400 MHz in DMSOd6 NMR data.cosy

    • search.nfdi4chem.de
    html
    Updated Feb 3, 2025
    + more versions
    nmrXiv (2025). Luteolin 400 MHz in DMSOd6 NMR data.cosy [Dataset]. https://search.nfdi4chem.de/dataset/luteolin-400-mhz-in-dmsod6-nmr-data-cosy
    Available download formats: html
    Dataset updated
    Feb 3, 2025
    Dataset provided by
    nmrXiv
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    NMR data of luteolin in DMSOd6. The dataset contains 1D 1H 13C as well as 2D COSY, HSQC, HMBC, all acquired at 400 MHz (Jeol 400 MHz spectrometer with Royal Probe) (2019-10-06)

    https://doi.org/10.7910/DVN/AQY4CX, Harvard Dataverse, V1

  17. B

    Belarus Internet Usage: Search Engine Market Share: Desktop: StartPagina...

    • ceicdata.com
    Updated Mar 9, 2025
    CEICdata.com (2025). Belarus Internet Usage: Search Engine Market Share: Desktop: StartPagina (Google) [Dataset]. https://www.ceicdata.com/en/belarus/internet-usage-search-engine-market-share/internet-usage-search-engine-market-share-desktop-startpagina-google
    Dataset updated
    Mar 9, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2025 - Mar 9, 2025
    Area covered
    Belarus
    Description

    Belarus Internet Usage: Search Engine Market Share: Desktop: StartPagina (Google) data was reported at 0.000 % in 09 Mar 2025. This records a decrease from the previous number of 0.030 % for 08 Mar 2025. Belarus Internet Usage: Search Engine Market Share: Desktop: StartPagina (Google) data is updated daily, averaging 0.070 % from Mar 2025 (Median) to 09 Mar 2025, with 9 observations. The data reached an all-time high of 0.070 % in 05 Mar 2025 and a record low of 0.000 % in 09 Mar 2025. Belarus Internet Usage: Search Engine Market Share: Desktop: StartPagina (Google) data remains active status in CEIC and is reported by Statcounter Global Stats. The data is categorized under Global Database’s Belarus – Table BY.SC.IU: Internet Usage: Search Engine Market Share.

  18. D

    Cognitive Search Tools Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Dec 3, 2024
    Dataintelo (2024). Cognitive Search Tools Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/cognitive-search-tools-market
    Available download formats: csv, pptx, pdf
    Dataset updated
    Dec 3, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Cognitive Search Tools Market Outlook



    The global cognitive search tools market size is anticipated to grow from an estimated USD 4.5 billion in 2023 to USD 12.3 billion by 2032, exhibiting a robust compound annual growth rate (CAGR) of 12.1% during the forecast period. The increasing reliance on artificial intelligence (AI) and machine learning (ML) technologies to enhance data search capabilities and provide more accurate and contextually relevant search results is a significant growth driver for this market. Organizations are increasingly adopting cognitive search tools to manage large volumes of unstructured data, which is further propelled by the growing digital transformation across various industries.



    One of the key growth factors for the cognitive search tools market is the exponential rise in data generation across the globe. As businesses and organizations collect vast amounts of data from various sources, the need for advanced search tools to extract meaningful insights from this sea of information becomes paramount. Cognitive search tools leverage AI and ML to understand and process both structured and unstructured data, allowing for more precise information retrieval. This capability is driving their adoption across diverse sectors, particularly in industries like healthcare, BFSI, and retail, where data-driven decision-making is crucial.



    Another important growth factor is the increasing demand for personalized customer experiences. With the advent of digital platforms and e-commerce, consumers expect highly tailored interactions and content. Cognitive search tools enable businesses to analyze user behavior and preferences, thus delivering personalized search results and recommendations. This not only enhances customer satisfaction but also drives engagement and revenues. As companies seek to differentiate themselves in a competitive market, the deployment of cognitive search tools becomes a strategic investment in achieving superior customer experience.



    The integration of cognitive search tools with existing enterprise systems and workflows is also contributing significantly to market growth. By seamlessly integrating with platforms like customer relationship management (CRM) and enterprise resource planning (ERP) systems, cognitive search tools enhance operational efficiency and productivity. They help in uncovering hidden patterns and trends within organizational data, leading to smarter business strategies and decision-making. Furthermore, the cloud deployment of these tools ensures scalability and cost-effectiveness, making them accessible to small and medium enterprises (SMEs) that are increasingly moving towards digital solutions.



    Regionally, North America holds a dominant position in the cognitive search tools market, driven by the early adoption of advanced technologies and substantial investments in AI research and development. The presence of major industry players and a tech-savvy consumer base further fuel market growth in this region. Meanwhile, the Asia Pacific region is expected to register the highest CAGR during the forecast period, propelled by rapid industrialization, digitalization efforts, and increasing investments in AI technology across countries like China, India, and Japan. Europe, with its strong emphasis on data privacy regulations, presents a unique landscape for market expansion, while Latin America and the Middle East & Africa are gradually catching up with increasing awareness and adoption of cognitive search technologies.



    Component Analysis



    The component segment of the cognitive search tools market is bifurcated into software and services. Software constitutes a significant portion of this segment, as it forms the backbone of cognitive search tools. These software solutions are designed to enhance data search capabilities by employing advanced technologies such as natural language processing (NLP), machine learning, and AI. The software component is continually evolving, with ongoing advancements in AI algorithms and architectures that lead to improved search accuracy and efficiency. As organizations seek to harness data for competitive advantage, the demand for sophisticated cognitive search software is expected to escalate.



    Services, on the other hand, play a crucial role in the deployment and functioning of cognitive search tools. These services encompass a range of activities, including consulting, integration, training, and support. As the adoption of cognitive search tools grows, so does the demand for specialized services that ensure successful implementation and optimal utilization. Consulting services help

  19. c

    Kepler K2 Data Search Catalog

    • s.cnmilf.com
    • catalog.data.gov
    Updated Sep 19, 2025
    + more versions
    MAST: Mikulski Archive at Space Telescope (2025). Kepler K2 Data Search Catalog [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/kepler-k2-data-search-catalog
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    MAST: Mikulski Archive at Space Telescope
    Description

    Launched in 2009, the Kepler Mission is surveying a region of our galaxy to determine what fraction of stars in our galaxy have planets and measure the size distribution of those exoplanets. Although Kepler completed its primary mission to determine the fraction of stars that have planets in 2013, it is continuing the search, using a more limited survey mode, under the new name K2. The K2 Data Search Service provides the main catalog for all K2 data.

  20. D

    Data from: Semantic Query Analysis from the Global Science Gateway

    • ssh.datastations.nl
    • datasearch.gesis.org
    bin, pdf, zip
    Updated Feb 8, 2018
    C. Carlesi; C. Carlesi (2018). Semantic Query Analysis from the Global Science Gateway [Dataset]. http://doi.org/10.17026/DANS-25M-FHE2
    Available download formats: pdf (14994765), zip (19837), bin (19672036), pdf (1349455), pdf (1431355)
    Dataset updated
    Feb 8, 2018
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    Authors
    C. Carlesi; C. Carlesi
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Nowadays web portals play an essential role in searching and retrieving information in the several fields of knowledge: they are ever more technologically advanced and designed for supporting the storage of a huge amount of information in natural language originating from the queries launched by users worldwide.

    A good example is given by the WorldWideScience search engine:

    The database is available at . It is based on a similar gateway, Science.gov, which is the major path to U.S. government science information, as it pulls together Web-based resources from various agencies. The information in the database is intended to be of high quality and authority, as well as the most current available from the participating countries in the Alliance, so users will find that the results will be more refined than those from a general search of Google. It covers the fields of medicine, agriculture, the environment, and energy, as well as basic sciences. Most of the information may be obtained free of charge (the database itself may be used free of charge) and is considered “open domain.” As of this writing, there are about 60 countries participating in WorldWideScience.org, providing access to 50+ databases and information portals. Not all content is in English. (Bronson, 2009)

    Given this scenario, we focused on building a corpus constituted by the query logs registered by the GreyGuide: Repository and Portal to Good Practices and Resources in Grey Literature and received by the WorldWideScience.org (The Global Science Gateway) portal: the aim is to retrieve information related to social media which as of today represent a considerable source of data more and more widely used for research ends.

    This project includes eight months of query logs registered between July 2017 and February 2018 for a total of 445,827 queries.
The analysis mainly concentrates on the semantics of the queries received from the portal clients: it is a process of information retrieval from a rich digital catalogue whose language is dynamic, is evolving and follows – as well as reflects – the cultural changes of our modern society.
