81 datasets found
  1. Job Offers Web Scraping Search

    • kaggle.com
    zip
    Updated Feb 11, 2023
    Cite
    The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search
    Explore at:
    Available download formats: zip (5322 bytes)
    Dataset updated
    Feb 11, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Job Offers Web Scraping Search

    Targeted Results to Find the Optimal Work Solution

    By [source]

    About this dataset

    This dataset collects job offers obtained through web scraping and filtered by specific keywords, locations and times. It lets users search for roles that match their personal situation, skill set and preferences for location and schedule. The columns give the job title, employer name, location, time frame and schedule of each offer, so you can make an informed choice about your next career opportunity.

    How to use the dataset

    This dataset is a resource for finding job offers by keyword, location and time. Here are some tips on how to use it:

    • Start by identifying what type of job offer you want to find. A keyword search, for example over the Nom_Oferta (offer name) column, narrows the results to postings that contain the word or phrase you are looking for.

    • Next, consider where the job is located. The Ubicació (location) column tells you where each posting is from, so check that it suits your needs.

    • Then consider when the position is available. The time frame information indicates when each posting was made and whether the role is full-time, part-time, casual or temporary, so check that it meets your requirements before applying.

    • If hours per week or further schedule details are important criteria, look at the Horari and Temps_Oferta columns.

    • Once keyword, location and time frame are covered, look at the Empresa (company name) and Nom_Oferta (post name) columns to see who would be employing you.

    Taken together, these columns should give a motivated job seeker what they need to shortlist suitable offers.
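    The filtering steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the dataset author's workflow: the column names come from the column list for web_scraping_information_offers.csv, while the sample rows and the `filter_offers` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the dataset description.
SAMPLE_CSV = """Nom_Oferta,Empresa,Ubicació,Temps_Oferta,Horari
Data Analyst,Example Corp,Barcelona,Fa 2 dies,Jornada completa
Cambrer/a,Bar Exemple,Girona,Fa 5 dies,Jornada parcial
Data Engineer,Sample Labs,Barcelona,Fa 1 dia,Jornada completa
"""


def filter_offers(rows, keyword=None, location=None, schedule=None):
    """Keep rows whose offer name, location and schedule contain the given
    substrings (case-insensitive). A criterion set to None is ignored."""

    def contains(value, needle):
        return needle is None or needle.lower() in (value or "").lower()

    return [
        row
        for row in rows
        if contains(row.get("Nom_Oferta"), keyword)
        and contains(row.get("Ubicació"), location)
        and contains(row.get("Horari"), schedule)
    ]


# Read the in-memory sample the same way the real CSV file would be read.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
matches = filter_offers(rows, keyword="data", location="Barcelona")
for row in matches:
    print(row["Nom_Oferta"], "|", row["Empresa"], "|", row["Horari"])
```

    To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened file handle, for example `open("web_scraping_information_offers.csv", encoding="utf-8", newline="")`. The file's encoding is an assumption.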

    Research Ideas

    • Machine learning can be used to group job offers and so identify similarities and differences between them. This could let users target their search for a work solution more specifically.
    • The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better-informed decisions about their career options and goals.
    • It may also provide insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: web_scraping_information_offers.csv

    | Column name  | Description                         |
    |:-------------|:------------------------------------|
    | Nom_Oferta   | Name of the job offer. (String)     |
    | Empresa      | Company offering the job. (String)  |
    | Ubicació     | Location of the job offer. (String) |
    | Temps_Oferta | Time of the job offer. (String)     |
    | Horari       | Schedule of the job offer. (String) |


  2. Global Data Scraping Tools Market Business Opportunities 2025-2032

    • statsndata.org
    excel, pdf
    Updated Oct 2025
    Cite
    Stats N Data (2025). Global Data Scraping Tools Market Business Opportunities 2025-2032 [Dataset]. https://www.statsndata.org/report/data-scraping-tools-market-51340
    Explore at:
    Available download formats: pdf, excel
    Dataset updated
    Oct 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Data Scraping Tools market has seen remarkable expansion and transformation in recent years, driven by the ever-increasing need for data insights across various industries. As organizations strive to harness the power of big data, the demand for effective data extraction tools has surged. These tools provide com

  3. DATAANT | Custom Data Extraction | Web Scraping Data | Dataset, API | Data...

    • datarade.ai
    Cite
    Dataant, DATAANT | Custom Data Extraction | Web Scraping Data | Dataset, API | Data Parsing and Processing | Worldwide [Dataset]. https://datarade.ai/data-products/dataant-custom-data-extraction-web-scraping-data-datase-dataant
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    Dataant
    Area covered
    Israel, Bulgaria, Uruguay, Andorra, Lithuania, Algeria, Morocco, Vanuatu, Niger, Yemen
    Description

    DATAANT provides the ability to extract data from any website using its web scraping service.

    Receive raw HTML data by triggering the API or request a custom dataset from any website.

    Use the received data for:

    • data analysis
    • data enrichment
    • data intelligence
    • data comparison

    The only two parameters needed to start a data extraction project:

    • data source (website URL)
    • attributes set for extraction

    All the data can be delivered using the following:

    • One-time delivery
    • Scheduled updates delivery
    • DB access
    • API

    All projects are highly customizable, so our team of data specialists can provide any data enrichment.

  4. Books to Scrape Dataset

    • kaggle.com
    • zenodo.org
    zip
    Updated Oct 1, 2025
    Cite
    Shahporan Priyom (2025). Books to Scrape Dataset [Dataset]. https://www.kaggle.com/datasets/shahporanpriyom/books-to-scrape-dataset
    Explore at:
    Available download formats: zip (24232 bytes)
    Dataset updated
    Oct 1, 2025
    Authors
    Shahporan Priyom
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was prepared as a beginner's guide to web scraping and data collection. The data is collected from Books to Scrape, a website designed for beginners to learn web scraping. A companion demonstrating how the data was scraped is given here
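    As a rough idea of the kind of scrape such a beginner's guide covers, the sketch below extracts titles and prices from an inline HTML snippet using only the standard library. It is not the dataset author's companion code. The snippet is a hand-written sample modeled on the Books to Scrape catalogue markup (the `product_pod` and `price_color` class names are assumptions about that markup), and `BookParser` and `parse_books` are hypothetical names.

```python
from html.parser import HTMLParser

# Hand-written sample modeled on the catalogue markup; not fetched from the site.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-one/index.html" title="A Light in the Attic">A Light in ...</a></h3>
  <p class="price_color">£51.77</p>
</article>
<article class="product_pod">
  <h3><a href="book-two/index.html" title="Tipping the Velvet">Tipping the Velvet</a></h3>
  <p class="price_color">£53.74</p>
</article>
"""


class BookParser(HTMLParser):
    """Collect one {'title', 'price'} dict per book block."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._in_h3 = False
        self._in_price = False
        self._title = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # The full title is in the link's title attribute.
            self._title = attrs.get("title")
        elif tag == "p" and attrs.get("class") == "price_color":
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        elif tag == "p":
            self._in_price = False

    def handle_data(self, data):
        # Pair the price text with the most recently seen title.
        if self._in_price and self._title is not None:
            price = float(data.strip().lstrip("£"))
            self.books.append({"title": self._title, "price": price})
            self._title = None


def parse_books(html):
    parser = BookParser()
    parser.feed(html)
    return parser.books


books = parse_books(SAMPLE_HTML)
for book in books:
    print(book["title"], book["price"])
```

    Parsing a fixed string keeps the example runnable offline. Fetching live pages would add a request step and should respect the site's terms.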

  5. Global Web Data | Web Scraping Data | Job Postings Data | Source: Company...

    • datarade.ai
    .json
    + more versions
    Cite
    PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads
    Explore at:
    Available download formats: .json
    Dataset authored and provided by
    PredictLeads
    Area covered
    Bosnia and Herzegovina, French Guiana, El Salvador, Kuwait, Guadeloupe, Comoros, Virgin Islands (British), Northern Mariana Islands, Kosovo, Bonaire
    Description

    PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

    Key Features:

    ✅ 232M+ Job Postings Tracked – Data sourced from 92 million company websites worldwide.
    ✅ 7.1M+ Active Job Openings – Updated in real-time to reflect hiring demand.
    ✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
    ✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
    ✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
    ✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

    Primary Attributes:

    • id (string, UUID) – Unique identifier for the job posting.
    • type (string, constant: "job_opening") – Object type.
    • title (string) – Job title.
    • description (string) – Full job description, extracted from the job listing.
    • url (string, URL) – Direct link to the job posting.
    • first_seen_at – Timestamp when the job was first detected.
    • last_seen_at – Timestamp when the job was last detected.
    • last_processed_at – Timestamp when the job data was last processed.

    Job Metadata:

    • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
    • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
    • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
    • status (string) – Job status (e.g., "open", "closed").
    • language (string) – Language of the job posting.
    • location (string) – Full location details as listed in the job description.

    Location Data (location_data) (array of objects)

    • city (string, nullable) – City where the job is located.
    • state (string, nullable) – State or region of the job location.
    • zip_code (string, nullable) – Postal/ZIP code.
    • country (string, nullable) – Country where the job is located.
    • region (string, nullable) – Broader geographical region.
    • continent (string, nullable) – Continent name.
    • fuzzy_match (boolean) – Indicates whether the location was inferred.

    Salary Data (salary_data)

    • salary (string) – Salary range extracted from the job listing.
    • salary_low (float, nullable) – Minimum salary in original currency.
    • salary_high (float, nullable) – Maximum salary in original currency.
    • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
    • salary_low_usd (float, nullable) – Converted minimum salary in USD.
    • salary_high_usd (float, nullable) – Converted maximum salary in USD.
    • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

    Occupational Data (onet_data) (object, nullable)

    • code (string, nullable) – O*NET occupation code.
    • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
    • occupation_name (string, nullable) – Official O*NET occupation title.

    Additional Attributes:

    • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").

    📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

    PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset
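    To make the attribute list above concrete, here is a minimal sketch of reading one record in Python. The field names follow the list. The sample values, the flat nesting of `location_data` and `salary_data` directly under the record, and the `JobSummary` and `summarize` names are assumptions for illustration, so check the linked documentation for the actual payload shape.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record: field names follow the attribute list, values are made up.
SAMPLE_RECORD = json.loads("""
{
  "id": "00000000-0000-0000-0000-000000000000",
  "type": "job_opening",
  "title": "Backend Engineer",
  "url": "https://example.com/jobs/1",
  "contract_types": ["full time"],
  "seniority": "non_manager",
  "status": "open",
  "location_data": [
    {"city": "Ljubljana", "country": "Slovenia", "fuzzy_match": false}
  ],
  "salary_data": {
    "salary_low_usd": 60000.0,
    "salary_high_usd": null,
    "salary_time_unit": "year"
  },
  "tags": ["Python"]
}
""")


@dataclass
class JobSummary:
    title: str
    status: str
    countries: List[str]
    salary_mid_usd: Optional[float]


def summarize(record):
    """Reduce a job record to a few fields, tolerating nullable attributes."""
    salary = record.get("salary_data") or {}
    bounds = [
        value
        for value in (salary.get("salary_low_usd"), salary.get("salary_high_usd"))
        if value is not None
    ]
    # Midpoint of whichever bounds are present; None when neither is given.
    mid = sum(bounds) / len(bounds) if bounds else None
    countries = [
        loc["country"]
        for loc in (record.get("location_data") or [])
        if loc.get("country")
    ]
    return JobSummary(record["title"], record.get("status", ""), countries, mid)


summary = summarize(SAMPLE_RECORD)
print(summary)
```

    Because most location and salary fields are listed as nullable, the sketch treats every one of them as optional instead of assuming a value is present.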

  6. Global Web Scraping Software Market Risk Analysis 2025-2032

    • statsndata.org
    excel, pdf
    Updated Oct 2025
    Cite
    Stats N Data (2025). Global Web Scraping Software Market Risk Analysis 2025-2032 [Dataset]. https://www.statsndata.org/report/web-scraping-software-market-7543
    Explore at:
    Available download formats: excel, pdf
    Dataset updated
    Oct 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Web Scraping Software market has rapidly evolved, becoming an indispensable tool for businesses across various sectors, including e-commerce, finance, and marketing. This software facilitates the automated extraction of data from websites, enabling organizations to collect valuable insights that inform decision-

  7. Web scraping and API projects from "Online Data Collection and Management"...

    • data.niaid.nih.gov
    Updated Jun 14, 2022
    Cite
    Datta, Hannes (2022). Web scraping and API projects from "Online Data Collection and Management" (Spring 2022) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6641810
    Explore at:
    Dataset updated
    Jun 14, 2022
    Dataset provided by
    Tilburg University
    Authors
    Datta, Hannes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As part of "Online Data Collection and Management" (taught at Tilburg University, Spring 2022), students collected publicly available datasets for use in academic research projects. With this repository, I am sharing (a) the documentation of these data sets, and (b) the associated source code that led to the collection of the data. The repository also contains the collected datasets.

    The data consists of the following projects:

    Autoscout (electric cars vs gasoline cars in the Dutch market)

    Mediamarkt (e-commerce)

    Steam API

    Twitch (chat capture)

    Zalando (e-commerce)

    Course website: https://odcm.hannesdatta.com. Archived at https://doi.org/10.5281/zenodo.6641811 (check for more recent versions if available).

  8. Research indices using web scraped price data - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated May 23, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Research indices using web scraped price data - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/research-indices-using-web-scraped-price-data
    Explore at:
    Dataset updated
    May 23, 2016
    Dataset provided by
    CKAN: https://ckan.org/
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Web scraping is a tool for extracting information from the underlying HTML code of websites. ONS has been conducting research into these technologies and, since May 2014, has been scraping prices from the websites of three retailers. Last year, ONS released two updates that constructed experimental price indices from the data. In this release, we provide updates to the experimental indices, and an analysis of the different methods used to clean and classify the data.
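    For readers unfamiliar with how scraped prices become an index, the sketch below computes one common elementary formula, the Jevons index (the geometric mean of price relatives). This summary does not say which formulas the release uses, so the choice of Jevons is an assumption made purely for illustration, and the prices and the `jevons_index` name are hypothetical.

```python
import math


def jevons_index(base_prices, current_prices):
    """Geometric mean of current/base price ratios, scaled so the base period is 100.

    Only items priced in both periods are used.
    """
    matched = [item for item in base_prices if item in current_prices]
    if not matched:
        raise ValueError("no items in common between the two periods")
    log_relatives = [
        math.log(current_prices[item] / base_prices[item]) for item in matched
    ]
    return 100 * math.exp(sum(log_relatives) / len(log_relatives))


# Hypothetical scraped prices for two periods.
base = {"milk 1l": 1.00, "bread 800g": 2.00}
current = {"milk 1l": 1.10, "bread 800g": 2.20, "new product": 5.00}

print(round(jevons_index(base, current), 2))
```

    Items that appear in only one period, such as "new product" here, are dropped. This is one simple way to handle product churn; cleaning and classifying the data, as the release describes, is a separate step.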

  9. Data Scraping Software Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Sep 1, 2025
    Cite
    Growth Market Reports (2025). Data Scraping Software Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-scraping-software-market
    Explore at:
    Available download formats: csv, pptx, pdf
    Dataset updated
    Sep 1, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Scraping Software Market Outlook



    According to our latest research, the global Data Scraping Software market size reached USD 2.1 billion in 2024, registering a robust growth trajectory with a CAGR of 14.2% from 2025 to 2033. This dynamic market is projected to attain a valuation of USD 6.1 billion by 2033, driven by the escalating need for automated data extraction solutions across diverse sectors. The primary growth factor propelling the data scraping software market is the exponential rise in digital data volumes and the increasing reliance on data-driven decision-making by enterprises worldwide.
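    The figures in this paragraph can be checked with compound-growth arithmetic. The sketch below is a quick consistency check, not the report's methodology. The base value and growth rate are taken from the paragraph, while the number of compounding years is an assumption, since the summary does not state it.

```python
def project(value, rate, periods):
    """Compound `value` at annual growth `rate` for `periods` years."""
    return value * (1 + rate) ** periods


def implied_cagr(start, end, periods):
    """Annual growth rate that takes `start` to `end` in `periods` years."""
    return (end / start) ** (1 / periods) - 1


base_2024 = 2.1    # USD billion, from the paragraph above
target_2033 = 6.1  # USD billion, from the paragraph above
cagr = 0.142       # 14.2%, from the paragraph above

# The number of compounding years is an assumption: try both readings.
for periods in (8, 9):
    projected = project(base_2024, cagr, periods)
    implied = implied_cagr(base_2024, target_2033, periods)
    print(periods, round(projected, 2), round(implied * 100, 1))
```

    Under these assumptions, eight compounding periods land close to the quoted USD 6.1 billion, while nine periods (2024 to 2033) give roughly USD 6.9 billion, so the quoted figures appear to assume eight years of growth.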



    One of the most significant growth drivers in the data scraping software market is the surge in demand for actionable business intelligence. Organizations across industries are leveraging data scraping tools to collect, aggregate, and analyze vast datasets from multiple online sources in real-time. This enables businesses to gain critical insights into consumer behavior, competitor strategies, and emerging market trends. The proliferation of e-commerce, digital marketing, and online financial services has further intensified the need for advanced data scraping solutions that can efficiently handle large-scale, unstructured data. The integration of artificial intelligence and machine learning capabilities into data scraping software is also enhancing accuracy, speed, and the ability to extract complex data patterns, thereby fueling market expansion.



    Another key growth factor is the increasing adoption of data scraping software by small and medium-sized enterprises (SMEs). As digital transformation becomes a strategic imperative, SMEs are seeking cost-effective and scalable tools to stay competitive in rapidly evolving markets. Data scraping software offers these businesses the ability to automate repetitive data collection tasks, reduce operational costs, and accelerate time-to-market for new products and services. Additionally, the growing popularity of cloud-based deployment models is making advanced data scraping solutions more accessible, flexible, and easy to integrate with existing IT infrastructure. This democratization of data extraction technology is expected to further amplify market growth, particularly in emerging economies where digital adoption is on the rise.



    The regulatory landscape and data privacy concerns are also shaping the evolution of the data scraping software market. With the introduction of stringent data protection regulations such as GDPR in Europe and CCPA in California, organizations must ensure compliance while extracting data from public and private sources. Leading vendors are responding by incorporating robust security features, consent management tools, and compliance frameworks into their software offerings. This focus on ethical data scraping and regulatory adherence is not only mitigating legal risks but also building trust among end-users, thereby contributing to sustained market growth.



    In the realm of data extraction, Casing Scrapers play a crucial role, particularly in industries where precision and efficiency are paramount. These tools are designed to clean and prepare wellbore casings, ensuring that data scraping operations can proceed without obstruction. By maintaining the integrity of the casing, Casing Scrapers help prevent data loss and ensure that the extraction process is smooth and uninterrupted. This is especially important in sectors such as oil and gas, where the accuracy of data can significantly impact operational decisions and safety measures. As the demand for reliable data continues to grow, the integration of Casing Scrapers into data scraping processes is becoming increasingly vital, offering enhanced reliability and performance.



    From a regional perspective, North America continues to dominate the data scraping software market, accounting for the largest revenue share in 2024. The region's leadership is attributed to the high concentration of technology-driven enterprises, advanced IT infrastructure, and early adoption of digital solutions. However, Asia Pacific is emerging as the fastest-growing market, propelled by rapid digitalization, expanding e-commerce ecosystems, and increasing investments in data analytics across countries such as China, India, and Japan. Europe also holds a significant market share, driven by robust regulatory frameworks and growing demand for data-driven business intelligence in sect

  10. ARABIC NEWS DATASET - RESULTS FROM WEB SCRAPING

    • kaggle.com
    zip
    Updated Apr 15, 2024
    Cite
    Elaaatif (2024). ARABIC NEWS DATASET - RESULTS FROM WEB SCRAPING [Dataset]. https://www.kaggle.com/datasets/latif8/arabic-news-dataset-results-from-web-scraping
    Explore at:
    Available download formats: zip (10472746 bytes)
    Dataset updated
    Apr 15, 2024
    Authors
    Elaaatif
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset, obtained by web scraping, contains news articles from several prominent sources: Al Jazeera, BBC News Arabic, Fatabyyano, Verify-Sy and matsda2sh. The articles cover topics ranging from global politics and current affairs to health, culture, and technology. The dataset offers a snapshot of contemporary news coverage, allowing for in-depth analysis and exploration of different perspectives. With article titles, categories, publication dates, and content, researchers and analysts can study Arabic media trends, public discourse, and societal issues.

  11. NYC STEW-MAP Staten Island organizations' website hyperlink webscrape

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 21, 2022
    Cite
    U.S. EPA Office of Research and Development (ORD) (2022). NYC STEW-MAP Staten Island organizations' website hyperlink webscrape [Dataset]. https://catalog.data.gov/dataset/nyc-stew-map-staten-island-organizations-website-hyperlink-webscrape
    Explore at:
    Dataset updated
    Nov 21, 2022
    Dataset provided by
    United States Environmental Protection Agency: http://www.epa.gov/
    Area covered
    New York, Staten Island
    Description

    The data represent web-scraping of hyperlinks from a selection of environmental stewardship organizations that were identified in the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017). There are two data sets: 1) the original scrape containing all hyperlinks within the websites and associated attribute values (see "README" file); 2) a cleaned and reduced dataset formatted for network analysis.

    For dataset 1: Organizations were selected from the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017), a publicly available, spatial data set about environmental stewardship organizations working in New York City, USA (N = 719). To create a smaller and more manageable sample to analyze, all organizations that intersected (i.e., worked entirely within or overlapped) the NYC borough of Staten Island were selected for a geographically bounded sample. Only organizations with working websites that the web scraper could access were retained for the study (n = 78). The websites were scraped between 09 and 17 June 2020 to a maximum search depth of ten using the snaWeb package (version 1.0.1, Stockton 2020) in the R computational language environment (R Core Team 2020).

    For dataset 2: The complete scrape results were cleaned, reduced, and formatted as a standard edge-array (node1, node2, edge attribute) for network analysis. See "README" file for further details.

    References:

    • R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Version 4.0.3.
    • Stockton, T. (2020). snaWeb Package: An R package for finding and building social networks for a website, version 1.0.1.
    • USDA Forest Service. (2017). Stewardship Mapping and Assessment Project (STEW-MAP). New York City Data Set. Available online at https://www.nrs.fs.fed.us/STEW-MAP/data/.

    This dataset is associated with the following publication: Sayles, J., R. Furey, and M. Ten Brink. How deep to dig: effects of web-scraping search depth on hyperlink network analysis of environmental stewardship organizations. Applied Network Science. Springer Nature, New York, NY, 7: 36, (2022).
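    The idea of scraping hyperlinks to a maximum search depth and storing the result as an edge-array (node1, node2, edge attribute) can be sketched as follows. The study used the snaWeb package in R. This Python version is an independent illustration and not that package's algorithm. The link map stands in for live web pages, the depth stored as the edge attribute is an assumption, and all names are hypothetical.

```python
from collections import deque

# Hypothetical link map standing in for live pages: page -> hyperlinks found on it.
LINKS = {
    "org-a.example": ["org-b.example", "org-c.example"],
    "org-b.example": ["org-c.example", "org-d.example"],
    "org-c.example": [],
    "org-d.example": ["org-e.example"],
    "org-e.example": [],
}


def crawl_edges(seed, links, max_depth):
    """Breadth-first walk from `seed`, returning (node1, node2, depth) edges.

    Pages reached at `max_depth` are recorded but their own links are not followed.
    """
    edges = []
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target in links.get(page, []):
            edges.append((page, target, depth + 1))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


for depth in (1, 2, 10):
    print(depth, len(crawl_edges("org-a.example", LINKS, depth)))
```

    Varying `max_depth` changes how many edges are found, which is the kind of effect the associated publication examines.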

  12. ChatGPT examples in the hydrological sciences

    • hydroshare.org
    • beta.hydroshare.org
    • +1more
    zip
    Updated Oct 9, 2023
    Cite
    Dylan Irvine (2023). ChatGPT examples in the hydrological sciences [Dataset]. http://doi.org/10.4211/hs.fc0552275ea14c7082218c42ebd63da6
    Explore at:
    Available download formats: zip (1.3 MB)
    Dataset updated
    Oct 9, 2023
    Dataset provided by
    HydroShare
    Authors
    Dylan Irvine
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    WGS 84 EPSG:4326
    Description

    ChatGPT has forever changed the way that many industries operate. Much of the focus on Artificial Intelligence (AI) tools has been on their ability to generate text. However, it is likely that their ability to generate computer code and scripts will also have a major impact. We demonstrate the use of ChatGPT to generate Python scripts to perform hydrological analyses and highlight the opportunities, limitations and risks that AI poses in the hydrological sciences.

    Here, we provide four worked examples of the use of ChatGPT to generate scripts to conduct hydrological analyses. We also provide a full list of the libraries available to the ChatGPT Advanced Data Analysis plugin (only available in the paid version). These files relate to a manuscript that is to be submitted to Hydrological Processes. The authors of the manuscript are Dylan J. Irvine, Landon J.S. Halloran and Philip Brunner.

    If you find these examples useful and/or use them, we would appreciate it if you could cite the associated publication in Hydrological Processes. Details to be made available upon final publication.

  13. Article: web scraping in data science

    • kaggle.com
    zip
    Updated Nov 5, 2022
    Cite
    Rania Tarek Fleifel (2022). Article: web scraping in data science [Dataset]. https://www.kaggle.com/datasets/raniatarekfleifel/article-web-scraping-in-data-science
    Explore at:
    Available download formats: zip (1245697 bytes)
    Dataset updated
    Nov 5, 2022
    Authors
    Rania Tarek Fleifel
    Description

    Dataset

    This dataset was created by Rania Tarek Fleifel


  14. Data from: Rakuten dataset

    • resodate.org
    • service.tib.eu
    Updated Jan 3, 2025
    Cite
    Ziqi Zhang; Xingyi Song (2025). Rakuten dataset [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvcmFrdXRlbi1kYXRhc2V0
    Explore at:
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    Leibniz Data Manager
    Authors
    Ziqi Zhang; Xingyi Song
    Description

    The dataset is a collection of product offers crawled from the web, annotated with schema.org vocabulary.

  15. Global Web Scraping Services Market Key Players and Market Share 2025-2032

    • statsndata.org
    excel, pdf
    Updated Nov 2025
    Cite
    Stats N Data (2025). Global Web Scraping Services Market Key Players and Market Share 2025-2032 [Dataset]. https://www.statsndata.org/report/web-scraping-services-market-136281
    Explore at:
    Available download formats: pdf, excel
    Dataset updated
    Nov 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Web Scraping Services market has rapidly evolved into a crucial component of data-driven decision-making for businesses across various industries. Web scraping, the automated process of extracting large volumes of data from websites, empowers organizations to gather insights, monitor competition, and analyze mar

  16. Practica 1 Web Scraping Oil Price Data

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Nov 9, 2021
    Cite
    Javier Alonso; Arnau Herrera (2021). Practica 1 Web Scraping Oil Price Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5655518
    Explore at:
    Dataset updated
    Nov 9, 2021
    Dataset provided by
    Author
    Authors
    Javier Alonso; Arnau Herrera
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset about the prices of oil and its products, with and without taxes, across the years.

  17. cvm-web-scraping

    • huggingface.co
    Updated Sep 29, 2024
    Cite
    Italo Xesteres (2024). cvm-web-scraping [Dataset]. https://huggingface.co/datasets/italoxesteres/cvm-web-scraping
    Explore at:
    Dataset updated
    Sep 29, 2024
    Authors
    Italo Xesteres
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    italoxesteres/cvm-web-scraping dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. Dataset - CORE-MD Post-Market Surveillance Tool

    • datasets.ai
    • data.europa.eu
    Updated Mar 22, 2024
    Cite
    EU Open Research Repository (2024). Dataset - CORE-MD Post-Market Surveillance Tool [Dataset]. https://datasets.ai/datasets/oai-zenodo-org-10864069
    Explore at:
    Dataset updated
    Mar 22, 2024
    Dataset authored and provided by
    EU Open Research Repository
    Description

    | Field | Description |
    |:------|:------------|
    | fsn_id | The unique identifier assigned to each safety notice. Note that this is not real data. |
    | Country | Country from which the safety notice was retrieved. |
    | Manufacturer | Manufacturer's name. |
    | Device | Device's name. |
    | Model | Model of the device. |
    | Type | Type of the device, which could be 'MD' (Medical Devices), 'IVD' (In Vitro Diagnostic Devices), or 'AIMD' (Active Implantable Medical Devices). |
    | Action | Action taken. |
    | Date | When the safety notice was published on the official websites. |
    | Url | Link to the original website where the safety notice was published. |
    | EMDN | Assigned European Medical Device Nomenclature (EMDN) codes according to the developed methodological framework. If empty, the algorithm did not succeed. |
    | matched | Whether the developed methodological framework successfully assigned the most appropriate EMDN codes, with possible values being 'yes' or 'no'. |
    | algorithm | If the developed methodological framework was able to assign the EMDN codes, this field specifies whether a direct linkage ('reference'), an entity similarity-based search ('algorithm'), or the nomenclature mapping tool ('mapping') was used. |
    | Reason | The reason for which the safety notice was issued. |

  19. Global Web Scraping Tools Market Scenario Forecasting 2025-2032

    • statsndata.org
    excel, pdf
    Updated Sep 2025
    Cite
    Stats N Data (2025). Global Web Scraping Tools Market Scenario Forecasting 2025-2032 [Dataset]. https://www.statsndata.org/report/web-scraping-tools-market-49163
    Available download formats: excel, pdf
    Dataset updated
    Sep 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The web scraping tools market has witnessed significant growth in recent years, driven by the increasing need for data-driven decision-making across various industries. Web scraping, the process of extracting information from websites, provides businesses with valuable insights and competitive advantages. Companies

  20. READ-IT Project Web Scraping Campaign

    • ordo.open.ac.uk
    Updated Apr 7, 2021
    Cite
    Alessio Antonini (2021). READ-IT Project Web Scraping Campaign [Dataset]. http://doi.org/10.21954/ou.rd.14376896.v1
    Dataset updated
    Apr 7, 2021
    Dataset provided by
    The Open University
    Authors
    Alessio Antonini
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Datasets collected through a web scraping campaign targeting reading groups, community reviews and social media comments on Goodreads, Webtoons and Wattpad. These datasets are used to test and validate the ontology design pattern "Profiles, Groups & Communities" (https://github.com/modellingDH/profile-group-community-odp) and to support research case studies on new media and pop genres within the context of the READ-IT project.

Cite
The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search

Job Offers Web Scraping Search

Targeted Results to Find the Optimal Work Solution

Available download formats: zip (5322 bytes)
Dataset updated
Feb 11, 2023
Authors
The Devastator
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

By [source]

About this dataset

This dataset collects job offers obtained by web scraping and filtered by specific keywords, locations and times. It gives users precise search capabilities for finding the working arrangement that suits them best. With the information collected, users can explore options that match their personal situation, skill set and preferences in terms of location and schedule. The columns provide job titles, employer names, locations, time frames and other necessary parameters, so you can make an informed choice for your next career opportunity.

How to use the dataset

This dataset is a great resource for those looking to find an optimal work solution based on keywords, location and time parameters. With this information, users can quickly and easily search through job offers that best fit their needs. Here are some tips on how to use this dataset to its fullest potential:

  • Start by identifying what type of job offer you want to find. The keyword column will help you narrow down your search by allowing you to search for job postings that contain the word or phrase you are looking for.

  • Next, consider where the job is located. The Ubicació (location) column tells you where each posting is from, so make sure it is somewhere that suits your needs.

  • Finally, consider when the position is available. The time frame column indicates when each posting was made, and whether it is a full-time, part-time, or casual/temporary role, so check that it meets your requirements before applying.

  • Additionally, if details such as hours per week or further schedule information are important criteria, the Horari and Temps_Oferta columns provide them. Once all three criteria (keywords, location and time frame) have been checked, look at the Empresa (company name) and Nom_Oferta (offer name) columns to get an idea of who would be employing you should you land the job.

    Together, these pieces of data should give any motivated individual what they need to seek out an optimal work solution. Keep hunting, and good luck!
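
The search steps above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset author's method: the column names come from the column table for web_scraping_information_offers.csv, but the sample rows, the helper name filter_offers and the substring-matching rule are assumptions made for the example. The real file may format its values differently.

```python
import csv
import io

# Hypothetical sample rows. The header matches the dataset's documented
# columns; the values are invented for illustration only.
SAMPLE_CSV = """Nom_Oferta,Empresa,Ubicació,Temps_Oferta,Horari
Data Analyst,Acme SL,Barcelona,Fa 2 dies,Jornada completa
Web Developer,Example SA,Girona,Fa 5 dies,Jornada parcial
Junior Data Engineer,Acme SL,Barcelona,Fa 1 dia,Jornada parcial
"""


def filter_offers(rows, keyword=None, location=None, schedule=None):
    """Return the rows that match every criterion that was supplied.

    Matching is a case-insensitive substring test (an assumption; the
    dataset does not prescribe a matching rule).
    """
    def matches(value, wanted):
        return wanted is None or wanted.lower() in value.lower()

    return [
        row for row in rows
        if matches(row["Nom_Oferta"], keyword)
        and matches(row["Ubicació"], location)
        and matches(row["Horari"], schedule)
    ]


# In practice, replace io.StringIO(SAMPLE_CSV) with an open file handle.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = filter_offers(rows, keyword="data", location="Barcelona")
for row in hits:
    print(row["Nom_Oferta"], "|", row["Empresa"], "|", row["Horari"])
```

Passing only some of the arguments narrows the search by those criteria alone, which mirrors the order of the tips: keyword first, then location, then schedule.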

Research Ideas

  • Machine learning can be used to group job offers, making it easier to identify similarities and differences between them. This could allow users to target their search for a work solution more precisely.
  • The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better-informed decisions about their career options and goals.
  • It may also provide insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.
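
As a rough sketch of the comparison idea, offers can be counted per area or per schedule type. The rows and the helper name count_by below are hypothetical and exist only for the example; only the column names are taken from the dataset's column table.

```python
from collections import Counter

# Hypothetical offers, invented for illustration; only the keys correspond
# to columns documented for the dataset.
offers = [
    {"Empresa": "Acme SL", "Ubicació": "Barcelona", "Horari": "Jornada completa"},
    {"Empresa": "Example SA", "Ubicació": "Girona", "Horari": "Jornada parcial"},
    {"Empresa": "Acme SL", "Ubicació": "Barcelona", "Horari": "Jornada parcial"},
]


def count_by(offers, column):
    """Count how many offers share each distinct value of the given column."""
    return Counter(offer[column].strip() for offer in offers)


by_location = count_by(offers, "Ubicació")
by_schedule = count_by(offers, "Horari")

# List areas from most to fewest offers.
for location, total in by_location.most_common():
    print(location, total)
```

Simple counts like these are only a starting point. Grouping offers by similarity, as the first idea suggests, would need a text representation of the offer names and a clustering method, neither of which the dataset description specifies.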

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: web_scraping_information_offers.csv

| Column name  | Description                         |
|:-------------|:------------------------------------|
| Nom_Oferta   | Name of the job offer. (String)     |
| Empresa      | Company offering the job. (String)  |
| Ubicació     | Location of the job offer. (String) |
| Temps_Oferta | Time of the job offer. (String)     |
| Horari       | Schedule of the job offer. (String) |
