100+ datasets found

Global import data of Crawler
volza.com
csv
Updated Dec 10, 2025
Cite
Volza FZ LLC (2025). Global import data of Crawler [Dataset]. https://www.volza.com/p/crawler/import/
Explore at:
Available download formats: csv
Dataset updated
Dec 10, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
Description
81,368 global import shipment records of Crawler with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Web Crawler Tool Report
marketresearchforecast.com
doc, pdf, ppt
Updated Apr 26, 2025
Cite
Market Research Forecast (2025). Web Crawler Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/web-crawler-tool-542102
Explore at:
Available download formats: pdf, doc, ppt
Dataset updated
Apr 26, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global web crawler tool market is experiencing robust growth, driven by the increasing need for data extraction and analysis across diverse sectors. The market's expansion is fueled by the exponential growth of online data, the rise of big data analytics, and the increasing adoption of automation in business processes. Businesses leverage web crawlers for market research, competitive intelligence, price monitoring, and lead generation, leading to heightened demand. While cloud-based solutions dominate due to scalability and cost-effectiveness, on-premises deployments remain relevant for organizations prioritizing data security and control. The large enterprise segment currently leads in adoption, but SMEs are increasingly recognizing the value proposition of web crawling tools for improving business decisions and operations. Competition is intense, with established players like UiPath and Scrapy alongside a growing number of specialized solutions. Factors such as data privacy regulations and the complexity of managing web crawlers pose challenges to market growth, but ongoing innovation in areas such as AI-powered crawling and enhanced data processing capabilities are expected to mitigate these restraints. We estimate the market size in 2025 to be $1.5 billion, growing at a CAGR of 15% over the forecast period (2025-2033). The geographical distribution of the market reflects the global nature of internet usage, with North America and Europe currently holding the largest market share. However, the Asia-Pacific region is anticipated to witness significant growth driven by increasing internet penetration and digital transformation initiatives across countries like China and India. The ongoing development of more sophisticated and user-friendly web crawling tools, coupled with decreasing implementation costs, is projected to further stimulate market expansion. 
Future growth will depend heavily on the ability of vendors to adapt to evolving web technologies, address increasing data privacy concerns, and provide robust solutions that cater to the specific needs of various industry verticals. Further research and development into AI-driven crawling techniques will be pivotal in optimizing efficiency and accuracy, which in turn will encourage wider adoption.
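The arithmetic behind the estimate above can be sketched as follows. This is a minimal illustration, assuming the stated 15% CAGR compounds once per year from the 2025 base of $1.5 billion; the function name `project_market_size` is hypothetical and does not come from the report.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound base_value forward by `years` annual periods at rate `cagr`."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base_value * (1.0 + cagr) ** years


# Figures quoted in the description; annual compounding is an assumption.
BASE_2025_USD_BN = 1.5
CAGR = 0.15

projection = {
    year: project_market_size(BASE_2025_USD_BN, CAGR, year - 2025)
    for year in range(2025, 2034)
}

for year, value in projection.items():
    print(f"{year}: {value:.2f} USD billion")
```

The loop prints one projected value per year of the 2025-2033 forecast period, which makes it easy to compare the stated rate against any intermediate figure a report might quote.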
The Global Anti crawling Techniques Market is Growing at Compound Annual...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Dec 22, 2024
Cite
Cognitive Market Research (2024). The Global Anti crawling Techniques Market is Growing at Compound Annual Growth Rate of 6.00% from 2023 to 2030. [Dataset]. https://www.cognitivemarketresearch.com/anti-crawling-techniques-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Dec 22, 2024
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global anti-crawling techniques market size was USD XX million in 2023 and will expand at a compound annual growth rate (CAGR) of 6.00% from 2023 to 2030.

North America held the largest share, more than 40% of global revenue, and will grow at a compound annual growth rate (CAGR) of 4.2% from 2023 to 2030. Europe accounted for over 30% of the global market and is projected to expand at a CAGR of 4.5% from 2023 to 2030. Asia Pacific held more than 23% of global revenue and will grow at a CAGR of 8.0% from 2023 to 2030. South America held more than 5% of global revenue and will grow at a CAGR of 5.4% from 2023 to 2030. The Middle East and Africa held more than 2% of global revenue and will grow at a CAGR of 5.7% from 2023 to 2030. The market for anti-crawling techniques has grown dramatically as a result of the increasing number of data breaches and public awareness of the need to protect sensitive data. Demand for bot fingerprint databases remains comparatively high in the anti-crawling techniques market. The content protection category held the highest revenue share in 2023.

Increasing Demand for Protection and Security of Online Data to Provide Viable Market Output

The market for anti-crawling techniques is expanding due in large part to the growing requirement for online data security and protection. Due to an increase in digital activity, organizations are processing and storing enormous volumes of sensitive data online. Organizations are being forced to invest in strong anti-crawling techniques due to the growing threat of data breaches, illegal access, and web scraping occurrences. By protecting online data from harmful activity and guaranteeing its confidentiality and integrity, these technologies advance the industry. Moreover, the significance of protecting digital assets is increased by the widespread use of the Internet for e-commerce, financial transactions, and sensitive data transfers. Anti-crawling techniques are essential for reducing the hazards connected to online scraping, which is a tactic often used by hackers to obtain important data.

Increasing Incidence of Cyber Threats to Propel Market Growth

The growing prevalence of cyber risks, such as site scraping and data harvesting, is driving growth in the market for anti-crawling techniques. Organizations that rely heavily on digital platforms run a higher risk of having data extracted illicitly. To safeguard sensitive data and preserve the integrity of digital assets, organizations have been forced to invest in sophisticated anti-crawling techniques that strengthen online defenses. The market's growth also reflects growing awareness of cybersecurity issues and the need to put effective defenses in place against changing cyber threats. In addition, cybersecurity is constantly challenged by the spread of advanced, automated crawling programs. The ever-changing threat landscape forces enterprises to implement anti-crawling techniques, which use a variety of tools such as rate limiting, IP blocking, and CAPTCHAs to prevent fraudulent scraping efforts.
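Of the tools named above, rate limiting is the simplest to illustrate. The sketch below shows one common approach, a per-client token bucket; it is an assumption chosen for illustration, not a technique the report describes, and the class names `TokenBucket` and `PerClientLimiter` are hypothetical.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Token bucket: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and spend one token if the request is within the limit."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client identifier (for example, an IP address)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


# Example: at most 2 requests in a burst, refilled at 1 request per second.
limiter = PerClientLimiter(rate=1.0, capacity=2.0)
decisions = [limiter.allow("203.0.113.7", now=0.0) for _ in range(3)]
print(decisions)
```

A server would typically answer a rejected request with an HTTP 429 response. Keying the buckets by client is what lets a well-behaved crawler continue while an aggressive one is throttled.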

Market Restraints of the Anti crawling Techniques

Increasing Demand for Ethical Web Scraping to Restrict Market Growth

The growing demand for ethical web scraping presents a unique challenge to the anti-crawling techniques market. Ethical web scraping is the process of obtaining data from websites for lawful purposes, such as market research or data analysis, without breaching the terms of service. The restraint arises because anti-crawling techniques must distinguish between criminal and ethical scraping operations, striking a balance between protecting websites from misuse and permitting authorized data harvesting. This dynamic calls for more complex and adaptable anti-crawling techniques that can tell destructive scraping from ethical scraping.

Impact of COVID-19 on the Anti Crawling Techniques Market

The demand for online material has increased as a result of the COVID-19 pandemic, which has...
Web Crawler Tool Report
marketresearchforecast.com
doc, pdf, ppt
Updated Aug 25, 2025
Cite
Market Research Forecast (2025). Web Crawler Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/web-crawler-tool-542101
Explore at:
Available download formats: ppt, pdf, doc
Dataset updated
Aug 25, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Discover the booming Web Crawler Tool market! This analysis reveals key trends, drivers, and restraints, plus a detailed look at leading companies like Scrapy, Mozenda, and UiPath. Learn about market size projections, CAGR, and regional market share for informed decision-making.
NASA 3D Models: Crawler - Dataset - NASA Open Data Portal
data.nasa.gov
Updated Mar 31, 2025
Cite
nasa.gov (2025). NASA 3D Models: Crawler - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/nasa-3d-models-crawler
Explore at:
Dataset updated
Mar 31, 2025
Dataset provided by
NASA: http://nasa.gov/
Description
Originally designed to carry the towering Saturn V moon rocket from the Vehicle Assembly Building to the seaside launch site, the enormous transporters now carry the space shuttles to the launch pads for liftoff. Polygons: 146050 Vertices: 141658
Global import data of Hydraulic Crawler Crane
volza.com
csv
Updated Oct 31, 2025
Cite
Volza FZ LLC (2025). Global import data of Hydraulic Crawler Crane [Dataset]. https://www.volza.com/p/hydraulic-crawler-crane/import/
Explore at:
Available download formats: csv
Dataset updated
Oct 31, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
Description
133 global import shipment records of Hydraulic Crawler Crane with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Data crawler
kaggle.com
zip
Updated Nov 13, 2023
Cite
Long Vũ Hoàng (2023). Data crawler [Dataset]. https://www.kaggle.com/datasets/longvuhoang/data-crawler
Explore at:
Available download formats: zip (25920397 bytes)
Dataset updated
Nov 13, 2023
Authors
Long Vũ Hoàng
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Long Vũ Hoàng

Released under MIT

Contents
A corpus of web crawl data composed of 5 billion web pages.
data.wu.ac.at
Updated Oct 10, 2013
Cite
Global (2013). A corpus of web crawl data composed of 5 billion web pages. [Dataset]. https://data.wu.ac.at/schema/datahub_io/ZDVlZWJkNmItNThlNC00ZmE1LWE4MGQtNWUwODRjY2ZhZDk5
Explore at:
Available download formats: application/download (31232.0)
Dataset updated
Oct 10, 2013
Dataset provided by
Global
Description
A corpus of web crawl data composed of 5 billion web pages. This data set is freely available on Amazon S3 at s3://aws-publicdatasets/common-crawl/crawl-002/ and formatted in the ARC (.arc) file format.

Common Crawl is a non-profit organization that builds and maintains an open repository of web crawl data for the purpose of driving innovation in research, education and technology. This data set contains web crawl data from 5 billion web pages and is released under the Common Crawl Terms of Use.
GUI-Net-Crawler
huggingface.co
Updated Nov 3, 2025
Cite
Bofei Zhang (2025). GUI-Net-Crawler [Dataset]. https://huggingface.co/datasets/Bofeee5675/GUI-Net-Crawler
Explore at:
Dataset updated
Nov 3, 2025
Authors
Bofei Zhang
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
How to use this data?

After downloading this repo, use cat to reassemble the zip file: cat baidu_wiki_part_* > merge.zip

Then simply unzip the merged file: unzip merge.zip
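For readers without a POSIX shell, the cat and unzip steps above can be approximated in Python. This is a minimal sketch, assuming the part files sort correctly by name (the shell glob and Python's `sorted` can order some names differently); the helper names `merge_parts` and `extract_archive` are hypothetical.

```python
import glob
import shutil
import zipfile
from typing import List


def merge_parts(pattern: str, merged_path: str) -> List[str]:
    """Concatenate every file matching `pattern`, in name order, into merged_path."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(merged_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large parts stay off the heap
    return parts


def extract_archive(merged_path: str, dest_dir: str) -> List[str]:
    """Unzip merged_path into dest_dir and return the archive's member names."""
    with zipfile.ZipFile(merged_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Usage, run from the directory that holds the downloaded parts:
#   merge_parts("baidu_wiki_part_*", "merge.zip")
#   extract_archive("merge.zip", ".")
```

The merged file name must not match the glob pattern, otherwise a second run would append the old archive to itself.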

What is in this data? Image(Screenshot)

Raw images are in the images folder.
/wikihow$ ls data/images | head -5
1111-4.jpg
111-15.jpg
1-draw-7.png
20200613_130717.jpg
22-19.jpg

Index page

Index page is a collection of web urls. This is how we start to crawl these websites. wikihow$ cat… See the full description on the dataset page: https://huggingface.co/datasets/Bofeee5675/GUI-Net-Crawler.

Global Web Crawler Tool Market Research Report: By Application (Data Mining,...

wiseguyreports.com

Updated Sep 15, 2025

Cite

(2025). Global Web Crawler Tool Market Research Report: By Application (Data Mining, Search Engine Optimization, Price Comparison, Web Archiving), By Deployment Type (On-Premises, Cloud-Based), By End Use (BFSI, E-commerce, Media and Entertainment, Healthcare, Education), By Size of Organization (Small Enterprises, Medium Enterprises, Large Enterprises) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/web-crawler-tool-market

Explore at:

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	2.87(USD Billion)
MARKET SIZE 2025	3.15(USD Billion)
MARKET SIZE 2035	8.0(USD Billion)
SEGMENTS COVERED	Application, Deployment Type, End Use, Size of Organization, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Increasing data volume, Rising demand for automation, Advancements in AI technologies, Growing e-commerce sector, Emphasis on data analysis
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Octoparse, IBM, Bing, Moz, Oracle, Ahrefs, Diffbot, WebHarvy, DataMiner, Import.io, Microsoft, ParseHub, Scrapy, Amazon, Google, Yahoo
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Increased demand for data analytics, Growing emphasis on SEO strategies, Rising usage of AI technology, Expansion in e-commerce sector, Enhanced cloud-based solutions.
COMPOUND ANNUAL GROWTH RATE (CAGR)	9.8% (2025 - 2035)
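
The growth rate in the table can be cross-checked against its own market size figures. The sketch below is an illustration only, assuming annual compounding over the ten years from 2025 to 2035; the function name `implied_cagr` is hypothetical.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the annual growth rate that turns start_value into end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes in USD billion, taken from the table above.
MARKET_SIZE_2025 = 3.15
MARKET_SIZE_2035 = 8.0

cagr_2025_2035 = implied_cagr(MARKET_SIZE_2025, MARKET_SIZE_2035, 2035 - 2025)
print(f"Implied CAGR 2025-2035: {cagr_2025_2035:.1%}")
```

The implied rate comes out close to the 9.8% stated in the table, so the 2025 and 2035 figures are consistent with the quoted CAGR.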

Subsea Crawler Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Subsea Crawler Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-subsea-crawler-market
Explore at:
Available download formats: csv, pptx, pdf
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Subsea Crawler Market Outlook

The global subsea crawler market size is projected to grow from USD 1.2 billion in 2023 to USD 2.5 billion by 2032, at a CAGR of 8.5% during the forecast period. This significant growth is primarily driven by the rising demand for subsea exploration and underwater construction activities, particularly in the oil & gas and marine research sectors. The increasing focus on deep-sea mining and the growing need for underwater maintenance of infrastructure are also key factors contributing to the market's expansion.

One of the primary growth factors for the subsea crawler market is the robust demand from the oil and gas industry. As global energy needs continue to rise, the industry is compelled to explore new offshore reserves, which necessitates the use of advanced underwater robotics like subsea crawlers. These crawlers are crucial for tasks such as pipeline inspection, maintenance, and repair in challenging underwater environments. The increasing number of offshore drilling activities, combined with the aging underwater infrastructure, is expected to keep the demand for subsea crawlers high over the coming years.

Another significant growth driver is the advancement in subsea technologies. Innovations in robotics, artificial intelligence, and sensor technologies have greatly enhanced the capabilities of subsea crawlers, making them more efficient and reliable. These technological advancements have expanded the application range of subsea crawlers beyond traditional sectors. For instance, their use in marine research has grown as researchers seek to explore the deep sea and its ecosystems. The enhanced maneuverability, data collection, and operational efficiency of modern subsea crawlers make them invaluable tools for scientific exploration.

The growing interest in underwater construction and infrastructure development is also fueling market growth. Subsea crawlers play a critical role in the construction and maintenance of underwater facilities such as dams, bridges, and tunnels. Their ability to perform complex tasks in harsh underwater conditions makes them indispensable for ensuring the structural integrity and longevity of such projects. Additionally, the use of subsea crawlers in military and defense applications, such as mine detection and underwater reconnaissance, is further bolstering the market.

The regional outlook for the subsea crawler market is promising, with significant growth expected across various regions. North America and Europe are anticipated to lead the market due to their advanced technological infrastructure and substantial investments in offshore oil and gas activities. The Asia Pacific region is also expected to witness significant growth, driven by increasing offshore exploration activities and investments in underwater infrastructure. Latin America and the Middle East & Africa are emerging markets with growing potential, particularly in the oil and gas sector.

The integration of Subsea Mining Equipment and Transportation systems is becoming increasingly crucial as the demand for deep-sea resources grows. These systems are designed to efficiently transport extracted materials from the ocean floor to the surface, ensuring minimal environmental impact and operational efficiency. The development of specialized transportation equipment is essential for handling the unique challenges posed by the subsea environment, such as high pressure and corrosive conditions. As the industry advances, innovations in transportation technology are expected to enhance the overall efficiency and sustainability of subsea mining operations. This, in turn, will drive further investment and interest in deep-sea resource exploration.

Type Analysis

The subsea crawler market can be segmented by type into hydraulic subsea crawlers and electric subsea crawlers. Hydraulic subsea crawlers are traditionally more popular due to their robustness and ability to operate in challenging environments. These crawlers are known for their high power output and durability, making them ideal for heavy-duty tasks such as underwater construction and oil & gas operations. Their ability to handle substantial loads and perform complex maneuvers under high pressure conditions makes them a preferred choice for many industries.

On the other hand, electric subsea crawlers are gaining traction due to their enhanced precision and operational efficiency. These c
NIF Registry Automated Crawl Data
neuinfo.org
rrid.site
Updated Aug 29, 2012
Cite
(2012). NIF Registry Automated Crawl Data [Dataset]. http://identifiers.org/RRID:SCR_012862
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_012862
Dataset updated
Aug 29, 2012
Description
An automatic pipeline based on an algorithm that identifies new resources in publications every month to improve the efficiency of NIF curators. The pipeline can also find the last time a resource's webpage was updated and whether its URL is still valid, which helps curators know which resources need attention. Additionally, the pipeline identifies publications that reference existing NIF Registry resources, as this is also of interest. These mentions are available through the Data Federation version of the NIF Registry (http://neuinfo.org/nif/nifgwt.html?query=nlx_144509). The RDF is based on an algorithm that scores how related a resource is to neuroscience (hits of neuroscience-related terms). Each potential resource is assigned a score, and the resources are then ranked and a list is generated.
Global import data of Crawler Excavator
volza.com
csv
Updated Nov 21, 2025
Cite
Volza FZ LLC (2025). Global import data of Crawler Excavator [Dataset]. https://www.volza.com/p/crawler-excavator/import/
Explore at:
Available download formats: csv
Dataset updated
Nov 21, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
Description
68,026 global import shipment records of Crawler Excavator with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Job Posts Data Crawling Project (Vietnam)
kaggle.com
zip
Updated Dec 31, 2023
Cite
Văn Duy Cao (2023). Job Posts Data Crawling Project (Vietnam) [Dataset]. https://www.kaggle.com/datasets/vnduycao/job-posts-data-crawling-project-vietnam
Explore at:
Available download formats: zip (53707 bytes)
Dataset updated
Dec 31, 2023
Authors
Văn Duy Cao
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Vietnam
Description
This is a semi-cleaned dataset containing information from job posts related to the data science field. The data was scraped from four websites in December 2023. The LangChain framework was used to support the data extraction task, for example to extract the soft skills and tools mentioned in a job post's description.

Here is the data schema for this data set

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F14229286%2Fcd5c6bc8700ad49f34a48b61981625c4%2Fimage%20(2).png?generation=1703998231851462&alt=media

31/12/2023: The data set's description is not finished.
The CommonCrawl Corpus
marketplace.sshopencloud.eu
Updated Apr 24, 2020
Cite
(2020). The CommonCrawl Corpus [Dataset]. https://marketplace.sshopencloud.eu/dataset/93FNrL
Explore at:
Dataset updated
Apr 24, 2020
Description
The Common Crawl corpus contains petabytes of data collected over 8 years of web crawling. The corpus contains raw web page data, metadata extracts and text extracts. Common Crawl data is stored on Amazon Web Services’ Public Data Sets and on multiple academic cloud platforms across the world.
China CN: Steel: Import: Large Section: Alloy Crawler
ceicdata.com
Updated Feb 15, 2025
Cite
CEICdata.com (2025). China CN: Steel: Import: Large Section: Alloy Crawler [Dataset]. https://www.ceicdata.com/en/china/steel-import-quantity-monthly/cn-steel-import-large-section-alloy-crawler
Explore at:
Dataset updated
Feb 15, 2025
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Nov 1, 2023 - Oct 1, 2024
Area covered
China
Variables measured
Merchandise Trade
Description
China Steel: Import: Large Section: Alloy Crawler data was reported at 0.124 Ton th in Mar 2025. This records a decrease from the previous number of 0.583 Ton th for Feb 2025. China Steel: Import: Large Section: Alloy Crawler data is updated monthly, averaging 0.201 Ton th from Jan 2010 (Median) to Mar 2025, with 180 observations. The data reached an all-time high of 3.821 Ton th in Dec 2019 and a record low of 0.000 Ton th in Dec 2016. China Steel: Import: Large Section: Alloy Crawler data remains active status in CEIC and is reported by General Administration of Customs. The data is categorized under China Premium Database’s Metal and Steel Sector – Table CN.WAG: Steel Import: Quantity: Monthly.

HTTP Client Hint Data Set

kaggle.com

zip

Updated May 27, 2024

Cite

H-BRS - Data and Application Security Group (2024). HTTP Client Hint Data Set [Dataset]. https://www.kaggle.com/datasets/dasgroup/http-client-hints-dataset

Explore at:

Available download formats: zip (1144843980 bytes)

Dataset updated

May 27, 2024

Authors

H-BRS - Data and Application Security Group

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Login Pages HTTP Client Hints Dataset

HTTP client hint crawling data of all login pages of the 8M Tranco list websites.

This data set contains the crawled Accept-CH HTTP header values on all Tranco-list-related login pages from August 2022 to December 2023. You can use the data set to reproduce our study results regarding the client hint usage on the Web.

We crawled the data from three different continents (North America: Johnstown, Ohio, USA; Europe: Frankfurt and Biere, Germany; Asia: Singapore) and two different Internet Service Providers (ISP), which were Amazon Web Services (AWS) and Deutsche Telekom (DT).

Overview

You can find the crawling data inside the crawl_data_redacted folder of this repository. It is subdivided into our four different crawling regions, which are also the subfolders:

eu_otc: Crawling data from Biere, Germany (Europe), using the DT ISP.
eu_aws: Crawling data from Frankfurt, Germany (Europe), using the AWS ISP.
ap_aws: Crawling data from Singapore (Asia), using the AWS ISP.
us_aws: Crawling data from Johnstown, Ohio, USA (North America), using the AWS ISP.

Each folder includes the following files:

crawl_data_login_urls_only.csv: Contains the responses from all crawled login URLs
crawl_data_clustered_third_party_urls_only.csv: Contains the responses from requests to third party URLs that were initiated by the login URLs
crawl_data_trackerlist_urls_only.csv: Contains the responses from requests to third-party URLs that were identified as trackers and initiated by the login URLs.

General

Each data set file contains the following columns:

Column	Data Type	Description	Example
date	Timestamp	Point in time when the URL was crawled	2023-03-03 14:45:25.525
login_url	String	Uniform Resource Locator (URL) of the login URL that should be crawled	https://www.example.com/login.html
login_url_hostname	String	Hostname belonging to the crawled login URL	www.example.com
url	String	The actual URL that was crawled. In case it differs from `login_url`, it indicates a third party request.	https://www.example.com/index.html
url_hostname	String	Hostname belonging to the URL	www.example.com
Accept-CH Values (many columns)	Integer	The column name shows the corresponding value that was present in the `Accept-CH` HTTP Header (e.g., `sec-ch-ua-platform`). Its value shows whether this value was present (`1`) or not (`0`)	1 - 0
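
The column layout above lends itself to a simple tally of how often each client hint was requested. The following is a minimal sketch, assuming the files are comma-separated with a header row and that every column outside the five metadata columns is a 0/1 Accept-CH indicator; the sample rows and the function name `accept_ch_usage` are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Tuple

# Columns documented above that are not Accept-CH indicators.
METADATA_COLUMNS = {"date", "login_url", "login_url_hostname", "url", "url_hostname"}


def accept_ch_usage(rows: Iterable[Dict[str, str]]) -> Tuple[int, Counter]:
    """Count, per Accept-CH column, how many rows have the value 1."""
    counts: Counter = Counter()
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            if column in METADATA_COLUMNS or not isinstance(value, str):
                continue
            if value.strip() == "1":
                counts[column] += 1
    return total, counts


# Hypothetical sample in the documented layout (not real crawl data).
SAMPLE_CSV = """date,login_url,login_url_hostname,url,url_hostname,sec-ch-ua-platform,sec-ch-ua-model
2023-03-03 14:45:25.525,https://www.example.com/login.html,www.example.com,https://www.example.com/login.html,www.example.com,1,0
2023-03-03 14:46:01.000,https://www.example.org/login,www.example.org,https://www.example.org/login,www.example.org,1,1
"""

total, counts = accept_ch_usage(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for hint, n in counts.most_common():
    print(f"{hint}: {n}/{total} responses")
```

To run this against a real file, pass `csv.DictReader(open(path, newline=""))` for one of the CSV files listed earlier; the delimiter may need adjusting if the files are not comma-separated.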

Data Creation

We used the Tranco List from June 21st, 2022 and visited all 8M hostnames of this list with a crawler bot to identify their login pages. We then crawled the login pages on a monthly basis and recorded the Accept-CH HTTP header sent by each website. For technical reasons, we had crawling gaps of one (October 2022) and two months (October/November 2023). However, the impact should be minimal (see Publication).

Publication

You can find more details on our conducted study in the following journal article:

A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web
Stephan Wiefling, Marian Hönscheid, and Luigi Lo Iacono.
19th International Conference on Availability, Reliability and Security (ARES '24), Vienna, Austria

Bibtex

...

Weibo Topic Data of the M5.7 Yibin Earthquake
figshare.com
zip
Updated Nov 28, 2023
Cite
Jinsi Liu (2023). Weibo Topic Data of the M5.7 Yibin Earthquake [Dataset]. http://doi.org/10.6084/m9.figshare.19074476.v2
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.19074476.v2
Dataset updated
Nov 28, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Jinsi Liu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Yibin
Description
Disclaimer: This data is only for academic research and is not intended for commercial or other purposes. If you want to cite this data, please apply to the author for approval first. Using a topic crawler on Weibo, I crawled the topic data of the magnitude 5.7 earthquake in Yibin, China, to visualize online public opinion about the earthquake and to further study its dissemination and influencing factors. I declare that the data is only for academic research and not for commercial use; thank you for your understanding!
Data from: AEROARMS - Image Dataset for the Crawler Indirect Detection...
data.niaid.nih.gov
data.europa.eu
Updated Jan 24, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Javier Laplaza; Albert Pumarola; Juan Andrade; Alberto Sanfeliu (2020). AEROARMS - Image Dataset for the Crawler Indirect Detection through its Cage [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2636665
Explore at:
Dataset updated
Jan 24, 2020
Dataset provided by
Institut de Robòtica i Informàtica Industrial, CSIC-UPC
Authors
Javier Laplaza; Albert Pumarola; Juan Andrade; Alberto Sanfeliu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset containing images and ground-truth position of the crawler's cage used in the AEROARMS project experiments.
Europe Anti crawling Techniques Market is Growing at Compound Annual Growth...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Mar 7, 2024
Cite
Cognitive Market Research (2024). Europe Anti crawling Techniques Market is Growing at Compound Annual Growth Rate of 4.5% from 2023 to 2030. [Dataset]. https://www.cognitivemarketresearch.com/regional-analysis/europe-anti-crawling-techniques-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Mar 7, 2024
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Europe, Region
Description
Europe Anti crawling Techniques accounted for a share of over 30% of the global market size of USD XX million in 2023 and is projected to expand at a compound annual growth rate (CAGR) of 4.5% from 2023 to 2030.
