100+ datasets found
  1. d

    Open Data Website Traffic

    • catalog.data.gov
    • data.lacity.org
    • +1more
    Updated Jun 21, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.lacity.org (2025). Open Data Website Traffic [Dataset]. https://catalog.data.gov/dataset/open-data-website-traffic
    Explore at:
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    data.lacity.org
    Description

    Daily utilization metrics for data.lacity.org and geohub.lacity.org. Updated monthly

  2. Company Datasets for Business Profiling

    • datarade.ai
    Updated Feb 23, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
    Explore at:
    .json, .xml, .csv, .xlsAvailable download formats
    Dataset updated
    Feb 23, 2017
    Dataset authored and provided by
    Oxylabs
    Area covered
    Canada, Isle of Man, Northern Mariana Islands, Taiwan, British Indian Ocean Territory, Andorra, Bangladesh, Nepal, Moldova (Republic of), Tunisia
    Description

    Company Datasets for valuable business insights!

    Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

    These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

    • Owler: Gain valuable business insights and competitive intelligence. -AngelList: Receive fresh startup data transformed into actionable insights. -CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies. -Craft.co: Make data-informed business decisions with Craft.co's company datasets. -Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

    We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

    • Company name;
    • Size;
    • Founding date;
    • Location;
    • Industry;
    • Revenue;
    • Employee count;
    • Competitors.

    You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

    Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

    With Oxylabs Datasets, you can count on:

    • Fresh and accurate data collected and parsed by our expert web scraping team.
    • Time and resource savings, allowing you to focus on data analysis and achieving your business goals.
    • A customized approach tailored to your specific business needs.
    • Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!

  3. d

    Website Analytics

    • catalog.data.gov
    • data.brla.gov
    • +2more
    Updated Jul 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.brla.gov (2025). Website Analytics [Dataset]. https://catalog.data.gov/dataset/website-analytics-89ba5
    Explore at:
    Dataset updated
    Jul 12, 2025
    Dataset provided by
    data.brla.gov
    Description

    Web traffic statistics for the several City-Parish websites, brla.gov, city.brla.gov, Red Stick Ready, GIS, Open Data etc. Information provided by Google Analytics.

  4. g

    Website Traffic Dataset

    • gts.ai
    json
    Updated Aug 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    GTS (2024). Website Traffic Dataset [Dataset]. https://gts.ai/dataset-download/website-traffic-dataset/
    Explore at:
    jsonAvailable download formats
    Dataset updated
    Aug 23, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore our detailed website traffic dataset featuring key metrics like page views, session duration, bounce rate, traffic source, and conversion rates.

  5. P

    WEB-FORUM-52 Dataset

    • paperswithcode.com
    Updated Feb 16, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Albert Weichselbraun; Adrian M. P. Brasoveanu; Roger Waldvogel; Fabian Odoni (2021). WEB-FORUM-52 Dataset [Dataset]. https://paperswithcode.com/dataset/web-forum-52
    Explore at:
    Dataset updated
    Feb 16, 2021
    Authors
    Albert Weichselbraun; Adrian M. P. Brasoveanu; Roger Waldvogel; Fabian Odoni
    Description

    The WEB-FORUM-52 gold standard comprises (i) 13 web forums from the health domain, (ii) 15 forums obtained from a Wikipedia list of popular forums (https://en.wikipedia.org/wiki/List_of_Internet_forums), (iii) 13 forums mentioned on a list of popular German Web forums (https://www.beliebte-foren.de), (iv) nine forums obtained from WPressBlog (https://www.wpressblog.com/free-forum-posting-sites-list/) and (v) two additional forums. For most forums two web pages (from different threads) were used and stored together with gold standard annotations that have been manually created by domain experts and describe the post text, post date, post user and direct URL to the post.

  6. u

    PDMX

    • cseweb.ucsd.edu
    json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, PDMX [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing, including over 250k musical scores in MusicXML format. PDMX is the largest publicly available, copyright-free MusicXML dataset in existence. PDMX includes genre, tag, description, and popularity metadata for every file.

  7. i

    Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and...

    • ieee-dataport.org
    Updated Oct 21, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mohamad Amar Irsyad Mohd Aminuddin (2024). Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages [Dataset]. https://ieee-dataport.org/documents/website-fingerprinting-dataset-browsing-network-traffic-desktop-and-mobile-webpages
    Explore at:
    Dataset updated
    Oct 21, 2024
    Authors
    Mohamad Amar Irsyad Mohd Aminuddin
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of Tor cell file extracted from browsing simulation using Tor Browser. The simulations cover both desktop and mobile webpages. The data collection process was using WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the neccessary configuration to perform the simulation as detailed in the tool repository.The webpage URL is selected by using the first 100 website based on: https://dataforseo.com/free-seo-stats/top-1000-websites.Each webpage URL is visited 90 times for each deskop and mobile browsing mode.

  8. Datasets for figures and tables

    • catalog.data.gov
    • datasets.ai
    Updated Nov 12, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Datasets for figures and tables [Dataset]. https://catalog.data.gov/dataset/datasets-for-figures-and-tables
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    Software Model simulations were conducted using WRF version 3.8.1 (available at https://github.com/NCAR/WRFV3) and CMAQ version 5.2.1 (available at https://github.com/USEPA/CMAQ). The meteorological and concentration fields created using these models are too large to archive on ScienceHub, approximately 1 TB, and are archived on EPA’s high performance computing archival system (ASM) at /asm/MOD3APP/pcc/02.NOAH.v.CLM.v.PX/. Figures Figures 1 – 6 and Figure 8: Created using the NCAR Command Language (NCL) scripts (https://www.ncl.ucar.edu/get_started.shtml). NCLD code can be downloaded from the NCAR website (https://www.ncl.ucar.edu/Download/) at no cost. The data used for these figures are archived on EPA’s ASM system and are available upon request. Figures 7, 8b-c, 8e-f, 8h-i, and 9 were created using the AMET utility developed by U.S. EPA/ORD. AMET can be freely downloaded and used at https://github.com/USEPA/AMET. The modeled data paired in space and time provided in this archive can be used to recreate these figures. The data contained in the compressed zip files are organized in comma delimited files with descriptive headers or space delimited files that match tabular data in the manuscript. The data dictionary provides additional information about the files and their contents. This dataset is associated with the following publication: Campbell, P., J. Bash, and T. Spero. Updates to the Noah Land Surface Model in WRF‐CMAQ to Improve Simulated Meteorology, Air Quality, and Deposition. Journal of Advances in Modeling Earth Systems. John Wiley & Sons, Inc., Hoboken, NJ, USA, 11(1): 231-256, (2019).

  9. D

    Dataset Alerts - Open and Monitoring

    • datasf.org
    • data.sfgov.org
    • +1more
    application/rdfxml +5
    Updated Jun 20, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Dataset Alerts - Open and Monitoring [Dataset]. https://datasf.org/opendata/
    Explore at:
    json, application/rssxml, csv, tsv, xml, application/rdfxmlAvailable download formats
    Dataset updated
    Jun 20, 2025
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A log of dataset alerts open, monitored or resolved on the open data portal. Alerts can include issues as well as deprecation or discontinuation notices.

  10. Machine Learning Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). Machine Learning Dataset [Dataset]. https://brightdata.com/products/datasets/machine-learning
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Utilize our machine learning datasets to develop and validate your models. Our datasets are designed to support a variety of machine learning applications, from image recognition to natural language processing and recommendation systems. You can access a comprehensive dataset or tailor a subset to fit your specific requirements, using data from a combination of various sources and websites, including custom ones. Popular use cases include model training and validation, where the dataset can be used to ensure robust performance across different applications. Additionally, the dataset helps in algorithm benchmarking by providing extensive data to test and compare various machine learning algorithms, identifying the most effective ones for tasks such as fraud detection, sentiment analysis, and predictive maintenance. Furthermore, it supports feature engineering by allowing you to uncover significant data attributes, enhancing the predictive accuracy of your machine learning models for applications like customer segmentation, personalized marketing, and financial forecasting.

  11. d

    Free Food & Meal Sites

    • catalog.data.gov
    Updated Mar 31, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    City of Philadelphia (2025). Free Food & Meal Sites [Dataset]. https://catalog.data.gov/dataset/free-food-meal-sites
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    City of Philadelphia
    Description

    This dataset and app provide the locations of sites where the public can access free food, nutrition services, and public benefits.

  12. w

    Dataset of free cash flow and website of public companies for Fiserv

    • workwithdata.com
    Updated Nov 27, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2024). Dataset of free cash flow and website of public companies for Fiserv [Dataset]. https://www.workwithdata.com/datasets/public-companies?col=company%2Cfree_cash_flow%2Cwebsite&f=1&fcol0=company&fop0=%3D&fval0=Fiserv
    Explore at:
    Dataset updated
    Nov 27, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about companies. It has 1 row and is filtered where the company is Fiserv. It features 3 columns: website, and free cash flow.

  13. d

    NYC Free Tax Prep Sites

    • catalog.data.gov
    • data.cityofnewyork.us
    Updated Sep 2, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.cityofnewyork.us (2023). NYC Free Tax Prep Sites [Dataset]. https://catalog.data.gov/dataset/nyc-free-tax-prep-sites
    Explore at:
    Dataset updated
    Sep 2, 2023
    Dataset provided by
    data.cityofnewyork.us
    Area covered
    New York
    Description

    This dataset provides a machine-readable format for the data that populates the "NYC Free Tax Preparation Site Finder" map hosted on DCA's website. The dataset includes the name and address of the service provider, its hours of operation, services available, and required geo-spacial data elements used by the map. DCA's Office of Financial Empowerment (OFE) DCA coordinates the City’s Annual Tax Season Initiative which offers free tax preparation services to qualifying New Yorkers. NYC Free Tax Prep sites are displayed on a map at nyc.gov/taxprep (https://www1.nyc.gov/assets/dca/TaxMap/index.html) The map is updated whenever a new site is added or an existing site changes its hours of operation or services provided. For more information about Free Tax Preparation Sties visit the DCA website (https://www1.nyc.gov/site/dca/consumers/file-your-taxes-faqs.page).

  14. Coursera Courses Uncleaned Dataset to Practice

    • kaggle.com
    Updated May 2, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Janak Pariyar (2024). Coursera Courses Uncleaned Dataset to Practice [Dataset]. https://www.kaggle.com/datasets/janakpariyar/coursera-courses-uncleaned-dataset-to-practice/data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 2, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Janak Pariyar
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The data set is web scraped from the Coursera website. The data is static. It consists of 7 columns with various unstructured data, which might help you on your learning curve of Data Science and Data Analytics . Feel free to play around . Happy Digging :)

  15. w

    Dataset of free cash flow and website of public companies for Netflix

    • workwithdata.com
    Updated Nov 27, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2024). Dataset of free cash flow and website of public companies for Netflix [Dataset]. https://www.workwithdata.com/datasets/public-companies?col=company%2Cfree_cash_flow%2Cwebsite&f=1&fcol0=company&fop0=%3D&fval0=Netflix
    Explore at:
    Dataset updated
    Nov 27, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about companies. It has 1 row and is filtered where the company is Netflix. It features 3 columns: website, and free cash flow.

  16. i

    A Dataset on Online Learning-based Web Behavior from Different Countries...

    • ieee-dataport.org
    Updated Apr 27, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Saumick Pradhan (2022). A Dataset on Online Learning-based Web Behavior from Different Countries Before and After COVID-19 [Dataset]. https://ieee-dataport.org/open-access/dataset-online-learning-based-web-behavior-different-countries-and-after-covid-19
    Explore at:
    Dataset updated
    Apr 27, 2022
    Authors
    Saumick Pradhan
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    2022

  17. h

    1k_Website_Screenshots_and_Metadata

    • huggingface.co
    Updated Apr 13, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Silatus (2023). 1k_Website_Screenshots_and_Metadata [Dataset]. https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 13, 2023
    Dataset authored and provided by
    Silatus
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for 1000 Website Screenshots with Metadata

      Dataset Summary
    

    Silatus is sharing, for free, a segment of a dataset that we are using to train a generative AI model for text-to-mockup conversions. This dataset was collected in December 2022 and early January 2023, so it contains very recent data from 1,000 of the world's most popular websites. You can get our larger 10,000 website dataset for free at: https://silatus.com/datasets This dataset includes: High-res… See the full description on the dataset page: https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata.

  18. Top rated TV series extracted from TMDB website

    • kaggle.com
    Updated Jul 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aditi Sable (2024). Top rated TV series extracted from TMDB website [Dataset]. https://www.kaggle.com/datasets/aditis4ble/top-rated-tv-series-extracted-from-tmdb-website/code
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Aditi Sable
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Aditi Sable

    Released under CC0: Public Domain

    Contents

  19. u

    Pinterest Fashion Compatibility

    • cseweb.ucsd.edu
    • beta.data.urbandatacentre.ca
    json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, Pinterest Fashion Compatibility [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.

    Metadata includes

    • product IDs

    • bounding boxes

    Basic Statistics:

    • Scenes: 47,739

    • Products: 38,111

    • Scene-Product Pairs: 93,274

  20. i

    Labeled Image Datasets for AI & Computer Vision

    • images.cv
    Updated Apr 26, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Images.cv (2024). Labeled Image Datasets for AI & Computer Vision [Dataset]. https://images.cv/
    Explore at:
    Dataset updated
    Apr 26, 2024
    Dataset provided by
    Images.cv
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore and download labeled image datasets for AI, ML, and computer vision. Find datasets for object detection, image classification, and image segmentation.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
data.lacity.org (2025). Open Data Website Traffic [Dataset]. https://catalog.data.gov/dataset/open-data-website-traffic

Open Data Website Traffic

Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.lacity.org
Description

Daily utilization metrics for data.lacity.org and geohub.lacity.org. Updated monthly

Search
Clear search
Close search
Google apps
Main menu