100+ datasets found
  1. Global Web Data | Web Scraping Data | Job Postings Data | Source: Company...

    • datarade.ai
    .json
    Cite
    PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads
    Available download formats: .json
    Dataset authored and provided by
    PredictLeads
    Area covered
    El Salvador, French Guiana, Virgin Islands (British), Northern Mariana Islands, Bosnia and Herzegovina, Kosovo, Comoros, Kuwait, Guadeloupe, Bonaire
    Description

    PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

    Key Features:

    ✅ 232M+ Job Postings Tracked – Data sourced from 92 million company websites worldwide.
    ✅ 7.1M+ Active Job Openings – Updated in real-time to reflect hiring demand.
    ✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
    ✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
    ✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
    ✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

    Primary Attributes:

    • id (string, UUID) – Unique identifier for the job posting.
    • type (string, constant: "job_opening") – Object type.
    • title (string) – Job title.
    • description (string) – Full job description, extracted from the job listing.
    • url (string, URL) – Direct link to the job posting.
    • first_seen_at – Timestamp when the job was first detected.
    • last_seen_at – Timestamp when the job was last detected.
    • last_processed_at – Timestamp when the job data was last processed.

    Job Metadata:

    • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
    • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
    • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
    • status (string) – Job status (e.g., "open", "closed").
    • language (string) – Language of the job posting.
    • location (string) – Full location details as listed in the job description.
    • location_data (array of objects) – Location data, with the following fields per object:
        • city (string, nullable) – City where the job is located.
        • state (string, nullable) – State or region of the job location.
        • zip_code (string, nullable) – Postal/ZIP code.
        • country (string, nullable) – Country where the job is located.
        • region (string, nullable) – Broader geographical region.
        • continent (string, nullable) – Continent name.
        • fuzzy_match (boolean) – Indicates whether the location was inferred.

    Salary Data (salary_data)

    • salary (string) – Salary range extracted from the job listing.
    • salary_low (float, nullable) – Minimum salary in original currency.
    • salary_high (float, nullable) – Maximum salary in original currency.
    • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
    • salary_low_usd (float, nullable) – Converted minimum salary in USD.
    • salary_high_usd (float, nullable) – Converted maximum salary in USD.
    • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

    Occupational Data (onet_data) (object, nullable)

    • code (string, nullable) – O*NET occupation code.
    • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
    • occupation_name (string, nullable) – Official O*NET occupation title.

    Additional Attributes:

    • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").
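
    The attribute list above marks many fields as nullable, so code that consumes these records has to tolerate missing values. The following is a minimal Python sketch, assuming each record is a JSON object in which salary_data is a nested object and location_data is an array of objects, as the field list suggests. The exact envelope is not shown in this listing, and the sample record and helper names (usd_salary_midpoint, job_countries, sample_job) are hypothetical.

```python
from typing import Any, Dict, List, Optional


def usd_salary_midpoint(job: Dict[str, Any]) -> Optional[float]:
    """Return the midpoint of the USD salary range, or None if no USD bound is present."""
    salary = job.get("salary_data") or {}  # salary_data itself may be null
    bounds = [salary.get("salary_low_usd"), salary.get("salary_high_usd")]
    known = [float(b) for b in bounds if b is not None]
    if not known:
        return None
    # With only one bound known, the "midpoint" is that bound.
    return sum(known) / len(known)


def job_countries(job: Dict[str, Any]) -> List[str]:
    """Collect the distinct non-null country values from location_data."""
    locations = job.get("location_data") or []
    return sorted({loc["country"] for loc in locations if loc.get("country")})


# Hypothetical record, invented for illustration only (not real PredictLeads data).
sample_job = {
    "id": "00000000-0000-0000-0000-000000000000",
    "type": "job_opening",
    "title": "Data Engineer",
    "status": "open",
    "contract_types": ["full time"],
    "tags": ["Python"],
    "location_data": [
        {"city": "Austin", "country": "United States", "fuzzy_match": False},
        {"city": None, "country": None, "fuzzy_match": True},
    ],
    "salary_data": {
        "salary_low_usd": 100000.0,
        "salary_high_usd": 120000.0,
        "salary_time_unit": "year",
    },
}

print(usd_salary_midpoint(sample_job))
print(job_countries(sample_job))
```

    The sketch does not normalize salary_time_unit; comparing hourly and yearly figures would need an explicit conversion assumption.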

    📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

    PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset

  2. Web Designer Express | Graphics Multimedia & Web Design | Technology Data

    • datastore.forage.ai
    Updated Sep 22, 2024
    Cite
    (2024). Web Designer Express | Graphics Multimedia & Web Design | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 22, 2024
    Description

    Web Designer Express is a reputable Miami-based company that has been in business for 20 years. With a team of experienced web designers and developers, they offer a wide range of services, including web design, e-commerce development, web development, and more. Their portfolio showcases over 10,000 websites designed, with a focus on creating custom, unique solutions for each client. With a presence in Miami, Florida, they cater to businesses and individuals seeking to establish a strong online presence. As a company, Web Designer Express is dedicated to building long-lasting relationships with their clients, providing personalized service, and exceeding expectations.

  3. Business Software Alliance | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 22, 2024
    Cite
    (2024). Business Software Alliance | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 22, 2024
    Description

    Business Software Alliance is a trade association that represents the world's leading software companies, including Autodesk, IBM, and Symantec. The organization's members are committed to promoting the use of legitimate software and ensuring the integrity of their intellectual property.

    As a result, the data housed on BSA's website is rich in information related to the software industry, including software licensing, anti-piracy efforts, and digital piracy statistics. The data includes information on software usage, software development, and the impact of piracy on the technology industry. With its focus on promoting legitimate software use, the data on BSA's website provides valuable insights into the global software industry.

  4. Website Analytics

    • data.nola.gov
    • gimi9.com
    • +4 more
    csv, xlsx, xml
    Updated Feb 2, 2017
    Cite
    Information Technology and Innovation Web Team (2017). Website Analytics [Dataset]. https://data.nola.gov/City-Administration/Website-Analytics/62d3-pst8
    Available download formats: xlsx, csv, xml
    Dataset updated
    Feb 2, 2017
    Dataset authored and provided by
    Information Technology and Innovation Web Team
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This data about nola.gov provides a window into how people are interacting with the City of New Orleans online. The data comes from a unified Google Analytics account for New Orleans. We do not track individuals and we anonymize the IP addresses of all visitors.

  5. fineweb-edu

    • huggingface.co
    Updated Jan 3, 2025
    Cite
    FineData (2025). fineweb-edu [Dataset]. http://doi.org/10.57967/hf/2497
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    FineData
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    📚 FineWeb-Edu

    1.3 trillion tokens of the finest educational data the 🌐 web has to offer

    Paper: https://arxiv.org/abs/2406.17557

    What is it?

    📚 FineWeb-Edu dataset consists of 1.3T tokens and 5.4T tokens (FineWeb-Edu-score-2) of educational web pages filtered from 🍷 FineWeb dataset. This is the 1.3 trillion version. To enhance FineWeb's quality, we developed an educational quality classifier using annotations generated by LLama3-70B-Instruct. We then… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu.

  6. Website Statistics

    • data.wu.ac.at
    • lcc.portaljs.com
    • +2 more
    csv, pdf
    Updated Jun 11, 2018
    Cite
    Lincolnshire County Council (2018). Website Statistics [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/M2ZkZDBjOTUtMzNhYi00YWRjLWI1OWMtZmUzMzA5NjM0ZTdk
    Available download formats: csv, pdf
    Dataset updated
    Jun 11, 2018
    Dataset provided by
    Lincolnshire County Council (http://www.lincolnshire.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This Website Statistics dataset has four resources showing usage of the Lincolnshire Open Data website. Web analytics terms used in each resource are defined in their accompanying Metadata file.

    • Website Usage Statistics: This document shows a statistical summary of usage of the Lincolnshire Open Data site for the latest calendar year.

    • Website Statistics Summary: This dataset shows a website statistics summary for the Lincolnshire Open Data site for the latest calendar year.

    • Webpage Statistics: This dataset shows statistics for individual Webpages on the Lincolnshire Open Data site by calendar year.

    • Dataset Statistics: This dataset shows cumulative totals for Datasets on the Lincolnshire Open Data site that have also been published on the national Open Data site Data.Gov.UK - see the Source link.

      Note: Website and Webpage statistics (the first three resources above) show only UK users, and exclude API calls (automated requests for datasets). The Dataset Statistics are confined to users with javascript enabled, which excludes web crawlers and API calls.

    These Website Statistics resources are updated annually in January by the Lincolnshire County Council Business Intelligence team. For any enquiries about the information contact opendata@lincolnshire.gov.uk.

  7. Data Web Fingerprinting

    • kaggle.com
    zip
    Updated Mar 3, 2023
    Cite
    Anna Maria Mandalari (2023). Data Web Fingerprinting [Dataset]. https://www.kaggle.com/datasets/annamariamandalari/data-web-fingerprinting
    Available download formats: zip (73655761 bytes)
    Dataset updated
    Mar 3, 2023
    Authors
    Anna Maria Mandalari
    Description

    Dataset

    This dataset was created by Anna Maria Mandalari

    Contents

  8. DATAANT | Custom Data Extraction | Web Scraping Data | Dataset, API | Data...

    • datarade.ai
    Cite
    Dataant, DATAANT | Custom Data Extraction | Web Scraping Data | Dataset, API | Data Parsing and Processing | Worldwide [Dataset]. https://datarade.ai/data-products/dataant-custom-data-extraction-web-scraping-data-datase-dataant
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    Dataant
    Area covered
    Bulgaria, Morocco, Algeria, Israel, Niger, Andorra, Lithuania, Uruguay, Yemen, Vanuatu
    Description

    DATAANT provides the ability to extract data from any website using its web scraping service.

    Receive raw HTML data by triggering the API or request a custom dataset from any website.

    Use the received data for:
    • data analysis
    • data enrichment
    • data intelligence
    • data comparison

    Only two parameters are needed to start a data extraction project:
    • data source (website URL)
    • set of attributes to extract

    All data can be delivered via:
    • one-time delivery
    • scheduled update delivery
    • DB access
    • API

    All projects are highly customizable, and our team of data specialists can provide any data enrichment.

  9. Data from: GIS Web Services

    • catalog.data.gov
    • data.brla.gov
    • +1 more
    Updated Sep 15, 2023
    Cite
    data.brla.gov (2023). GIS Web Services [Dataset]. https://catalog.data.gov/dataset/gis-web-services
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    data.brla.gov
    Description

    A listing of web services published from the authoritative East Baton Rouge Parish Geographic Information System (EBRGIS) data repository. Services are offered in Esri REST, and the Open Geospatial Consortium (OGC) Web Mapping Service (WMS) or Web Feature Service (WFS) formats.

  10. Data from: web scrapping

    • kaggle.com
    zip
    Updated Apr 12, 2020
    Cite
    Suman Das (2020). web scrapping [Dataset]. https://www.kaggle.com/datasets/sumandas000/web-scrapping
    Available download formats: zip (219065 bytes)
    Dataset updated
    Apr 12, 2020
    Authors
    Suman Das
    Description

    Context

    Web scraping is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser.

    Content

    All the details of the ambitionbox website, starting from the title of the company to its rating and the concept of web scraping.

    Acknowledgements

    All the data of ambitionbox and guidance from my teacher.

    Inspiration

    Can you scrape another website?

  11. WP-Script | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 22, 2024
    Cite
    (2024). WP-Script | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 22, 2024
    Description

    WP-Script is a company that provides WordPress themes and plugins for creating adult sites. They offer a range of products, including seven customizable adult WordPress themes and thirteen powerful adult WordPress plugins. Their products are designed to be easy to use and can help entrepreneurs create professional-looking adult sites with minimal technical expertise.

    With WP-Script, you can start your adult site in six easy steps. They also offer a 14-day money-back guarantee, giving you the opportunity to test their products risk-free. Additionally, they provide premium support to help you resolve any issues you may encounter. Their customers love their products, citing excellent themes, easy installation, and good customer support.

  12. National Neighborhood Data Archive (NaNDA): Broadband Internet Availability,...

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Nov 14, 2022
    Cite
    Li, Mao; Gomez-Lopez, Iris; Khan, Anam; Clarke, Philippa; Chenoweth, Megan (2022). National Neighborhood Data Archive (NaNDA): Broadband Internet Availability, Speed, and Adoption by Census Tract and ZIP Code Tabulation Area, United States, 2014-2020 [Dataset]. http://doi.org/10.3886/ICPSR38567.v1
    Available download formats: r, sas, spss, ascii, delimited, stata
    Dataset updated
    Nov 14, 2022
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Li, Mao; Gomez-Lopez, Iris; Khan, Anam; Clarke, Philippa; Chenoweth, Megan
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/38567/terms

    Time period covered
    2014 - 2020
    Area covered
    United States
    Description

    This study contains two data files. Data file one (Broadband Internet Availability, Speed, and Adoption by Census Tract) contains measures of broadband internet availability, speed, and adoption per United States census tract in 2014 through 2020. The data is derived from internet service providers' Form 477 reports to the Federal Communications Commission. Data file two (Broadband Internet Availability and Speed by ZIP Code Tabulation Area) contains measures of broadband internet access and usage per United States ZIP code tabulation area (ZCTA) in 2014 through 2020. The data is derived primarily from internet service providers' Form 477 reports to the Federal Communications Commission.

  13. NASA Global Web-Enabled Landsat Data Annual Global 30 m V031 - Dataset -...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). NASA Global Web-Enabled Landsat Data Annual Global 30 m V031 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/nasa-global-web-enabled-landsat-data-annual-global-30-m-v031-37b9c
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The NASA Making Earth System Data Records for Use in Research Environments (MEaSUREs) Global Web-Enabled Landsat Data Annual (GWELDYR) Version 3.1 data product provides Landsat data at 30 meter (m) resolution for terrestrial non-Antarctica locations over annual reporting periods for the 1985, 1990, and 2000 epochs. GWELD data products are generated from all available Landsat 4 and 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data in the U.S. Geological Survey (USGS) Landsat archive. The GWELD suite of products provide consistent data to derive land cover as well as geophysical and biophysical information for regional assessment of land surface dynamics.

    The GWELD products include Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) for the reflective wavelength bands and top of atmosphere (TOA) brightness temperature for the thermal bands. The products are defined in the Sinusoidal coordinate system to promote continuity of NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) land tile grid.

    Provided in the GWELDYR product are layers for surface reflectance bands 1 through 5 and 7, TOA brightness temperature for thermal bands, Normalized Difference Vegetation Index (NDVI), day of year, ancillary angle, and data quality information. A low-resolution red, green, blue (RGB) browse image of bands 5, 4, 3 is also available for each granule.

    Known Issues

    GWELDYR known issues can be found in Section 4 of the Algorithm Theoretical Basis Document (ATBD).

    Improvements/Changes from Previous Version

    Version 3.1 products use Landsat Collection 1 products as input and have improved per-pixel cloud mask, new quality data, improved calibration information, and improved product metadata that enable view and solar geometry calculations.

  14. Coordinated Data Analysis System (CDAWeb CDAS) RESTful Web services API at...

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). Coordinated Data Analysis System (CDAWeb CDAS) RESTful Web services API at the Space Physics Data Facility (SPDF) - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/coordinated-data-analysis-system-cdaweb-cdas-restful-web-services-api-at-the-space-physics
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    A RESTful web service for querying data and metadata components from data sets, including instruments, observatories, and inventory. This interface calls the services of the SPDF CDAWeb data browsing system. The Space Physics Data Facility (SPDF) is the archive of non-solar data for the Heliospheric Science Division (HSD) at NASA's Goddard Space Flight Center.

  15. Books to Scrape Dataset

    • kaggle.com
    • zenodo.org
    zip
    Updated Oct 1, 2025
    Cite
    Shahporan Priyom (2025). Books to Scrape Dataset [Dataset]. https://www.kaggle.com/datasets/shahporanpriyom/books-to-scrape-dataset
    Available download formats: zip (24232 bytes)
    Dataset updated
    Oct 1, 2025
    Authors
    Shahporan Priyom
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was prepared as a beginner's guide to web scraping and data collection. The data is collected from Books to Scrape, a website designed for beginners to learn web scraping. A companion demonstrating how the data was scraped is given here

  16. Web Data | Web Scraping Data | Technographic Data | Source: Job Openings,...

    • datarade.ai
    .json
    Cite
    PredictLeads, Web Data | Web Scraping Data | Technographic Data | Source: Job Openings, HTML and JavaScripts | 1B+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-technographic-da-predictleads
    Available download formats: .json
    Dataset authored and provided by
    PredictLeads
    Area covered
    Sri Lanka, Nepal, Marshall Islands, Micronesia (Federated States of), Kiribati, Cook Islands, New Caledonia, Grenada, South Africa, Japan
    Description

    PredictLeads Technographic Data is a powerful tool for B2B organizations, providing detailed technographic and firmographic insights extracted through sophisticated web scraping techniques. Unlike traditional datasets, it identifies emerging technologies in job postings, revealing real-time technology adoption trends across industries. These insights fuel technical decision-making, B2B data cleansing, account profiling, and 360-degree customer analysis.

    Use Cases:

    ✅ Technical Account Profiling – Analyze a company's technology stack and hiring trends for better-targeted sales and marketing.
    ✅ B2B Data Cleansing – Enhance CRM and data enrichment efforts with up-to-date, verified technographic insights.
    ✅ Technology Trend Analysis – Identify high-growth industries and emerging tech adoption patterns.
    ✅ Competitive Intelligence – Assess competitor tech stacks and innovation roadmaps based on hiring activity.
    ✅ 360-Degree Customer View – Integrate firmographic and technographic data for a complete B2B customer profile.

    Key API Attributes:

    • id (string, UUID) – Unique identifier for the technology detection.
    • first_seen_at (ISO 8601 date-time) – Date when the technology was first detected.
    • last_seen_at (ISO 8601 date-time) – Last observed instance of the technology in use.
    • technology (object) – Details about the detected technology:
        • name (string) – Technology name (e.g., "AWS Lambda", "Kubernetes").
    • company (object) – Data about the company using the technology:
        • domain (string) – Company website domain.
        • company_name (string) – Full company name.
    • seen_on_job_openings (array, nullable) – List of job postings mentioning the technology, indicating hiring demand.
    • seen_on_subpages (array) – URLs of web pages where the technology was detected, providing additional context.
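
    As an illustration of how the attributes above might be consumed, here is a minimal Python sketch that groups detected technologies by company domain and counts the job postings behind each detection. It assumes each detection is a JSON object with nested technology and company objects, as the attribute list suggests. The sample records and helper names (technologies_by_domain, hiring_mentions, sample_detections) are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def technologies_by_domain(detections: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each company domain to the sorted list of technology names detected for it."""
    grouped = defaultdict(set)
    for detection in detections:
        domain = (detection.get("company") or {}).get("domain")
        name = (detection.get("technology") or {}).get("name")
        if domain and name:
            grouped[domain].add(name)
    return {domain: sorted(names) for domain, names in grouped.items()}


def hiring_mentions(detection: Dict[str, Any]) -> int:
    """Count job postings that mention the technology; the field is nullable."""
    return len(detection.get("seen_on_job_openings") or [])


# Hypothetical records, invented for illustration only.
sample_detections = [
    {
        "technology": {"name": "Kubernetes"},
        "company": {"domain": "example.com", "company_name": "Example Inc."},
        "seen_on_job_openings": ["job-1", "job-2"],
        "seen_on_subpages": [],
    },
    {
        "technology": {"name": "AWS Lambda"},
        "company": {"domain": "example.com", "company_name": "Example Inc."},
        "seen_on_job_openings": None,
        "seen_on_subpages": ["https://example.com/careers"],
    },
]

print(technologies_by_domain(sample_detections))
print([hiring_mentions(d) for d in sample_detections])
```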

    📌 PredictLeads Technographic Data is the go-to solution for B2B professionals looking to optimize technical sales strategies, refine account targeting, and gain a competitive edge in technology-driven markets.

    PredictLeads Docs: https://docs.predictleads.com/v3/guide/technology_detections_dataset

  17. Coffee_Data_CoffeeReview

    • kaggle.com
    zip
    Updated Oct 12, 2023
    Cite
    Hanif Al Irsyad (2023). Coffee_Data_CoffeeReview [Dataset]. https://www.kaggle.com/datasets/hanifalirsyad/coffee-scrap-coffeereview
    Available download formats: zip (1649986 bytes)
    Dataset updated
    Oct 12, 2023
    Authors
    Hanif Al Irsyad
    Description

    Data scraped using Python. In this dataset, I took data from the website www.coffeereview.com.

    This dataset has not been preprocessed, so we still need to perform exploratory data analysis.

    This data will be used to model a recommendation system or predictive analysis based on the columns ['aroma', 'sour', 'body', 'flavor', 'aftertaste'].

  18. Responses to "Semantic Web: Perspectives" Questionnaire

    • zenodo.org
    • data-staging.niaid.nih.gov
    • +1 more
    bin, pdf, svg, tsv +1
    Updated Jul 22, 2024
    Cite
    Aidan Hogan (2024). Responses to "Semantic Web: Perspectives" Questionnaire [Dataset]. http://doi.org/10.5281/zenodo.3229401
    Available download formats: svg, tsv, bin, txt, pdf
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Aidan Hogan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset provides material relating to a questionnaire entitled "Semantic Web: Perspectives". This questionnaire was addressed to the W3C Semantic Web mailing list (semantic-web@w3.org) and was open to responses from May 12th to May 25th, 2019. A total of 113 responses were collected in this time. The following files are provided:

    • public-comments.txt: provides the public comments of respondents in plain text;
    • questionnaire-form.pdf: illustrates the design of the questionnaire, including questions, types of responses permitted, etc.;
    • questionnaire-responses.tsv: lists the individual responses (without private comments) as a tab-separated values file;
    • success-keywords.xlsx: provides a spreadsheet mapping success story responses to a list of keywords, further providing statistics on these keywords;
    • wordcloud-bw.svg: provides a word-cloud of success-story keywords in black & white;
    • wordcloud-colour.svg: provides a word-cloud of success-story keywords in colour.

    The word-clouds were produced using Jason Davies' online service, copying and pasting the keywords from the success-keywords.xlsx spreadsheet (e.g., Column A, Sheet Statistics) into the text field; the following settings were selected: Orientations from 0° to 0°, Spiral: Rectangular; Scale: n; Number of words: 400; One word per line: ticked; Font: Patua One (must be installed locally beforehand). The resulting SVG files were later modified in a text editor to add a link to the font used, to tighten the bounding box, and to produce a black & white version.

    We thank the respondents for providing their input.

  19. Internet Data Center Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Internet Data Center Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/internet-data-center-market-global-industry-analysis
    Available download formats: csv, pdf, pptx
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Internet Data Center Market Outlook

    According to our latest research, the global Internet Data Center market size stood at USD 68.3 billion in 2024, registering a robust growth trajectory. The market is forecasted to reach USD 165.7 billion by 2033, expanding at a healthy CAGR of 10.4% during the 2025-2033 period. The key growth factor driving this surge is the exponential rise in data generation, cloud computing adoption, and the proliferation of digital transformation initiatives across industries worldwide. As organizations increasingly prioritize business continuity, security, and scalability, the demand for advanced data center infrastructure is at an all-time high, shaping the future of the Internet Data Center market.
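
    The quoted figures can be cross-checked with the standard compound annual growth rate formula. The sketch below is a rough consistency check only, assuming nine compounding years between the 2024 base value and the 2033 forecast; the report does not state its own calculation method, and the function names are illustrative.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value, and a year count."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, annual_rate: float, years: int) -> float:
    """Value reached after compounding annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


# Figures quoted in the description, in USD billions.
size_2024 = 68.3
size_2033 = 165.7
years = 2033 - 2024  # assumption: nine compounding periods

rate = implied_cagr(size_2024, size_2033, years)
print(f"Implied CAGR: {rate:.2%}")
print(f"2033 size at 10.4% CAGR: {project(size_2024, 0.104, years):.1f}")
```

    Under that assumption the implied rate comes out slightly below the quoted 10.4%, and compounding at 10.4% lands near USD 166 billion, so the stated figures are consistent to within rounding.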

    One of the primary drivers fueling the growth of the Internet Data Center market is the rapid expansion of digital services and applications, which has led to an unprecedented surge in global data traffic. The proliferation of Internet of Things (IoT) devices, video streaming, e-commerce, and social media platforms has necessitated the deployment of high-capacity, low-latency data centers capable of handling massive workloads. Enterprises and service providers are investing heavily in data center modernization, focusing on energy efficiency, automation, and robust connectivity to support these evolving digital ecosystems. The growing emphasis on hybrid and multi-cloud strategies further amplifies the need for flexible and scalable data center solutions, propelling market growth.

    Another significant growth factor is the increasing adoption of artificial intelligence (AI), machine learning, and big data analytics across various sectors, including healthcare, finance, and retail. These technologies require substantial computational power and storage capabilities, driving demand for advanced data center infrastructure. Modern data centers are being designed to support high-density computing, GPU acceleration, and edge computing, enabling real-time data processing and analytics at scale. Additionally, the shift toward software-defined data centers (SDDC) and virtualization is transforming traditional data center architectures, enabling greater agility, cost-efficiency, and operational resilience. This evolution is further supported by advancements in network technologies such as 5G, which facilitate faster data transmission and improved user experiences.

    Sustainability and energy efficiency have emerged as crucial considerations in the Internet Data Center market, as organizations and governments worldwide prioritize environmental responsibility. Data centers are significant consumers of electricity, prompting the adoption of green technologies, renewable energy sources, and innovative cooling solutions to minimize carbon footprints. Regulatory mandates and industry standards are driving investments in energy-efficient hardware, intelligent power management, and sustainable building practices. Leading market players are increasingly focusing on achieving carbon neutrality and leveraging circular economy principles, which not only reduce operational costs but also enhance brand reputation and stakeholder trust. This sustainable approach is expected to shape investment decisions and technological advancements in the coming years.



    As the demand for data processing and storage continues to grow, the concept of a Hyperscale Data Center has emerged as a pivotal solution to meet these needs. Hyperscale data centers are designed to efficiently scale up resources, accommodating the vast amounts of data generated by modern digital activities. These facilities are characterized by their ability to support thousands of servers and millions of virtual machines, ensuring seamless performance and reliability. The architecture of hyperscale data centers focuses on maximizing energy efficiency and optimizing cooling systems, making them a sustainable choice for large-scale operations. As businesses increasingly rely on cloud services and big data analytics, the role of hyperscale data centers becomes ever more critical in providing the necessary infrastructure to support these advanced technologies.




    Regionally, the Asia Pacific market is witnessing remarkable growth, outpacing other regions due to rapid digitalization, government initiatives, and increasing internet penetration. Countries such as China, India, and Singapo

  20. Web Mining for Collaborative Food Delivery

    • kaggle.com
    zip
    Updated Aug 26, 2023
    Cite
    Jocelyn Dumlao (2023). Web Mining for Collaborative Food Delivery [Dataset]. https://www.kaggle.com/datasets/jocelyndumlao/web-mining-for-collaborative-food-delivery
    Explore at:
    Available download formats: zip (396903 bytes)
    Dataset updated
    Aug 26, 2023
    Authors
    Jocelyn Dumlao
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is the main data set built for the work titled "A Web Mining Approach to Collaborative Consumption of Food Delivery Services", the official institutional research project of Professor Juan C. Correa at Fundación Universitaria Konrad Lorenz.

    Categories

    Urban Transportation, Consumer, e-Commerce Retail

    Acknowledgements & Source

    Professor Juan C. Correa at Fundación Universitaria Konrad Lorenz

    Please don't forget to upvote if you find this useful.

Cite
PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads

Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records

Explore at:
Available download formats: .json
Dataset authored and provided by
PredictLeads
Area covered
El Salvador, French Guiana, Virgin Islands (British), Northern Mariana Islands, Bosnia and Herzegovina, Kosovo, Comoros, Kuwait, Guadeloupe, Bonaire
Description

PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

Key Features:

✅ 232M+ Job Postings Tracked – Data sourced from 92 million company websites worldwide.
✅ 7.1M+ Active Job Openings – Updated in real-time to reflect hiring demand.
✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

Primary Attributes:

  • id (string, UUID) – Unique identifier for the job posting.
  • type (string, constant: "job_opening") – Object type.
  • title (string) – Job title.
  • description (string) – Full job description, extracted from the job listing.
  • url (string, URL) – Direct link to the job posting.
  • first_seen_at – Timestamp when the job was first detected.
  • last_seen_at – Timestamp when the job was last detected.
  • last_processed_at – Timestamp when the job data was last processed.
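
As a rough illustration of the primary attributes listed above, the sketch below loads one record into a small Python dataclass. The `JobOpening` class, the helper names, and the sample values are hypothetical. The listing does not state the timestamp format, so ISO 8601 strings are assumed here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobOpening:
    """Hypothetical container for the primary attributes of one record."""
    id: str
    type: str
    title: str
    description: str
    url: str
    first_seen_at: Optional[datetime]
    last_seen_at: Optional[datetime]
    last_processed_at: Optional[datetime]


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    # Assumption: timestamps are ISO 8601 strings. A trailing "Z" is rewritten
    # so that datetime.fromisoformat accepts it on older Python versions.
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_job_opening(record: dict) -> JobOpening:
    return JobOpening(
        id=record["id"],
        type=record["type"],
        title=record["title"],
        description=record.get("description", ""),
        url=record["url"],
        first_seen_at=parse_timestamp(record.get("first_seen_at")),
        last_seen_at=parse_timestamp(record.get("last_seen_at")),
        last_processed_at=parse_timestamp(record.get("last_processed_at")),
    )


# Made-up sample record, for illustration only.
sample = {
    "id": "00000000-0000-0000-0000-000000000001",
    "type": "job_opening",
    "title": "Data Engineer",
    "description": "Build and maintain data pipelines.",
    "url": "https://example.com/careers/data-engineer",
    "first_seen_at": "2024-01-01T08:00:00Z",
    "last_seen_at": "2024-01-15T08:00:00Z",
    "last_processed_at": "2024-01-15T09:30:00Z",
}

job = parse_job_opening(sample)
# Number of whole days between first and last detection of the posting.
days_observed = (job.last_seen_at - job.first_seen_at).days
```

The gap between `first_seen_at` and `last_seen_at` gives a simple estimate of how long a posting stayed visible on the employer's site.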

Job Metadata:

  • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
  • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
  • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
  • status (string) – Job status (e.g., "open", "closed").
  • language (string) – Language of the job posting.
  • location (string) – Full location details as listed in the job description.

Location Data (location_data) (array of objects)

  • city (string, nullable) – City where the job is located.
  • state (string, nullable) – State or region of the job location.
  • zip_code (string, nullable) – Postal/ZIP code.
  • country (string, nullable) – Country where the job is located.
  • region (string, nullable) – Broader geographical region.
  • continent (string, nullable) – Continent name.
  • fuzzy_match (boolean) – Indicates whether the location was inferred.

Salary Data (salary_data)

  • salary (string) – Salary range extracted from the job listing.
  • salary_low (float, nullable) – Minimum salary in original currency.
  • salary_high (float, nullable) – Maximum salary in original currency.
  • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
  • salary_low_usd (float, nullable) – Converted minimum salary in USD.
  • salary_high_usd (float, nullable) – Converted maximum salary in USD.
  • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").
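
Because salaries may be quoted per year, month, or hour, comparing postings usually requires a common time unit. The sketch below is one possible way to annualize the USD fields. The function name, the sample values, and the conversion factors (12 months, 2080 working hours per year) are assumptions, not part of the dataset definition.

```python
from typing import Optional, Tuple

# Assumed conversion factors; the listing does not define them.
PERIODS_PER_YEAR = {"year": 1, "month": 12, "hour": 2080}


def annual_usd_range(salary_data: Optional[dict]) -> Optional[Tuple[float, float]]:
    """Return (low, high) in USD per year, or None when it cannot be computed."""
    if not salary_data:
        return None
    factor = PERIODS_PER_YEAR.get(salary_data.get("salary_time_unit"))
    low = salary_data.get("salary_low_usd")
    high = salary_data.get("salary_high_usd")
    if factor is None or (low is None and high is None):
        return None
    # The bounds are nullable: fall back to the other bound when one is missing.
    low = high if low is None else low
    high = low if high is None else high
    return (low * factor, high * factor)


# Made-up sample salary_data object, for illustration only.
hourly = {
    "salary": "$40 - $50 per hour",
    "salary_low": 40.0,
    "salary_high": 50.0,
    "salary_currency": "USD",
    "salary_low_usd": 40.0,
    "salary_high_usd": 50.0,
    "salary_time_unit": "hour",
}

annual = annual_usd_range(hourly)
```

Returning `None` for unknown time units keeps unexpected values out of downstream averages instead of guessing a factor.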

Occupational Data (onet_data) (object, nullable)

  • code (string, nullable) – O*NET occupation code.
  • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
  • occupation_name (string, nullable) – Official O*NET occupation title.

Additional Attributes:

  • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").
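
Several of the fields above are nullable, so any filtering code has to tolerate missing values. The following sketch selects open postings by skill tag and reads the occupational family where present. The helper names and the sample records are hypothetical, and case-insensitive tag matching is a design assumption.

```python
from typing import Iterable, Optional


def matches(record: dict, required_tags: Iterable[str], status: str = "open") -> bool:
    """True when the record has the given status and carries every required tag."""
    tags = record.get("tags") or []  # tags is nullable
    tag_set = {t.lower() for t in tags}
    return record.get("status") == status and all(
        t.lower() in tag_set for t in required_tags
    )


def onet_family(record: dict) -> Optional[str]:
    """Return the occupational family, or None when onet_data is missing."""
    onet = record.get("onet_data") or {}  # onet_data is nullable
    return onet.get("family")


# Made-up sample records, for illustration only.
records = [
    {
        "title": "Backend Engineer",
        "status": "open",
        "tags": ["Python", "PostgreSQL"],
        "onet_data": {
            "code": "15-1252.00",
            "family": "Computer and Mathematical",
            "occupation_name": "Software Developers",
        },
    },
    {"title": "Frontend Engineer", "status": "open", "tags": ["JavaScript"], "onet_data": None},
    {"title": "Data Analyst", "status": "closed", "tags": ["Python"], "onet_data": None},
    {"title": "Office Manager", "status": "open", "tags": None, "onet_data": None},
]

open_python = [r["title"] for r in records if matches(r, ["Python"])]
families = [onet_family(r) for r in records]
```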

📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset
