100+ datasets found
  1. Website Statistics

    • data.wu.ac.at
    • data.europa.eu
    csv, pdf
    Updated Jun 11, 2018
    Cite
    Lincolnshire County Council (2018). Website Statistics [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/M2ZkZDBjOTUtMzNhYi00YWRjLWI1OWMtZmUzMzA5NjM0ZTdk
    Available download formats: csv, pdf
    Dataset updated
    Jun 11, 2018
    Dataset provided by
    Lincolnshire County Council (http://www.lincolnshire.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This Website Statistics dataset has four resources showing usage of the Lincolnshire Open Data website. Web analytics terms used in each resource are defined in their accompanying Metadata file.

    • Website Usage Statistics: This document shows a statistical summary of usage of the Lincolnshire Open Data site for the latest calendar year.

    • Website Statistics Summary: This dataset shows a website statistics summary for the Lincolnshire Open Data site for the latest calendar year.

    • Webpage Statistics: This dataset shows statistics for individual Webpages on the Lincolnshire Open Data site by calendar year.

    • Dataset Statistics: This dataset shows cumulative totals for Datasets on the Lincolnshire Open Data site that have also been published on the national Open Data site Data.Gov.UK - see the Source link.

      Note: Website and Webpage statistics (the first three resources above) show only UK users, and exclude API calls (automated requests for datasets). The Dataset Statistics are confined to users with JavaScript enabled, which excludes web crawlers and API calls.

    These Website Statistics resources are updated annually in January by the Lincolnshire County Council Business Intelligence team. For any enquiries about the information contact opendata@lincolnshire.gov.uk.

  2. 50 States Comparison

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Sep 1, 2023
    Cite
    data.iowa.gov (2023). 50 States Comparison [Dataset]. https://catalog.data.gov/dataset/50-states-comparison
    Dataset updated
    Sep 1, 2023
    Dataset provided by
    data.iowa.gov
    Area covered
    United States
    Description

    This online application gives manufacturers the ability to compare Iowa to other states on a number of different topics including: business climate, education, operating costs, quality of life and workforce.

  3. [Crypto] CoinGecko vs CoinMarketCap Data

    • kaggle.com
    Updated May 11, 2020
    Cite
    Sherpa (2020). [Crypto] CoinGecko vs CoinMarketCap Data [Dataset]. https://www.kaggle.com/thesherpafromalabama/coingecko-vs-coinmarketcap-data/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 11, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sherpa
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Use the CMC_CG_Combo dataset, unless you want to recollect and DIY!

    Context

    On a quest to compare different cryptoexchanges, I came up with the idea to compare metrics across multiple platforms (at the moment just two). CoinGecko and CoinMarketCap are two of the biggest websites for monitoring both exchanges and cryptoprojects. In response to over-inflated volumes faked by crypto exchanges, both websites came up with independent metrics for assessing the worth of a given exchange.

    Content

    Collected on May 10, 2020

    CoinGecko's data is a bit more holistic, containing metrics across a multitude of areas (you can read more in the original blog post). The data from CoinGecko consists of the following:

    - Exchange Name
    - Trust Score (on a scale of N/A-10)
    - Type (centralized/decentralized)
    - AML (risk: How well prepared are they to handle financial crime?)
    - API Coverage (Blanket Measure that includes: (1) Tickers Data (2) Historical Trades Data (3) Order Book Data (4) Candlestick/OHLC (5) WebSocket API (6) API Trading (7) Public Documentation)
    - API Last Updated (When was the API last updated?)
    - Bid Ask Spread (Average buy/sell spread across all pairs)
    - Candlestick (Available/Not)
    - Combined Orderbook Percentile (See above link)
    - Estimated_Reserves (estimated holdings of major crypto)
    - Grade_Score (Overall API score)
    - Historical Data (available/not)
    - Jurisdiction Risk (risk: risk of Terrorist activity/bribery/corruption?)
    - KYC Procedures (risk: Know Your Customer?)
    - License and Authorization (risk: has exchange sought regulatory approval?)
    - Liquidity (don't confuse with "CMC Liquidity". THIS column is a combo of (1) Web traffic & Reported Volume (2) Order book spread (3) Trading Activity (4) Trust Score on Trading Pairs)
    - Negative News (risk: any bad news?)
    - Normalized Trading Volume (Trading Volume normalized to web traffic)
    - Normalized Volume Percentile (see above blog link)
    - Orderbook (available/not)
    - Public Documentation (got well documented API available to everyone?)
    - Regulatory Compliance (risk rating from compliance perspective)
    - Regulatory last updated (last time regulatory metrics were updated)
    - Reported Trading Volume (volume as listed by the exchange)
    - Reported Normalized Trading Volume (Ratio of normalized to reported volume [0-1])
    - Sanctions (risk: risk of sanctions?)
    - Scale (based on: (1) Normalized Trading Volume Percentile (2) Normalized Order Book Depth Percentile)
    - Senior Public Figure (risk: does exchange have transparent public relations? etc)
    - Tickers (tick tick tick...)
    - Trading via API (can data be traded through the API?)
    - Websocket (got websockets?)

    - Green Pairs (Percentage of trading pairs deemed to have good liquidity)
    - Yellow Pairs (Percentage of trading pairs deemed to have fair liquidity)
    - Red Pairs (Percentage of trading pairs deemed to have poor liquidity)
    - Unknown Pairs (Percentage of trading pairs that do not have sufficient order book data)

    ~

    Again, CoinMarketCap only has one metric (recently updated, it scales from 1 to 1000, with 1000 being very liquid and 1 not). You can go check the article out for yourself. In the dataset, this is the "CMC Liquidity" column, not to be confused with the "Liquidity" column, which refers to the CoinGecko metric!
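
    Because the "Liquidity" and "CMC Liquidity" columns sit on different scales, one way to compare them is by how they order the exchanges rather than by raw value. The sketch below is only an illustration: the exchange names and all numbers are made up, and it assumes both columns are numeric once loaded.

```python
import pandas as pd

# Made-up rows for illustration; these are not values from the dataset.
scores = pd.DataFrame(
    {
        "Exchange Name": ["ExA", "ExB", "ExC", "ExD"],
        "Liquidity": [8, 6, 9, 3],              # CoinGecko metric (hypothetical values)
        "CMC Liquidity": [650, 700, 900, 100],  # CoinMarketCap metric, scaled 1-1000
    }
)

# The two metrics use different scales, so compare orderings instead of raw values.
gecko_rank = scores["Liquidity"].rank()
cmc_rank = scores["CMC Liquidity"].rank()

# The Pearson correlation of the ranks is Spearman's rank correlation.
rho = gecko_rank.corr(cmc_rank)
print(f"Spearman rho: {rho:.2f}")
```

    A value near 1 would suggest the two sites rank exchanges similarly; a low value would suggest their metrics capture different things.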

    Acknowledgements

    Thanks to coingecko and cmc for making their data scrapable :)

    [CMC, you should try to give us a little more access to the figures that define your metric. Thanks!]


  4. Data from: Nursing Home Compare

    • catalog.data.gov
    • datahub.va.gov
    • +2more
    Updated Aug 2, 2025
    Cite
    Department of Veterans Affairs (2025). Nursing Home Compare [Dataset]. https://catalog.data.gov/dataset/nursing-home-compare-ed7b0
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    United States Department of Veterans Affairs (http://va.gov/)
    Description

    Nursing Home Compare has detailed information about every Medicare and Medicaid nursing home in the country. A nursing home is a place for people who can’t be cared for at home and need 24-hour nursing care. These are the official datasets used on the Medicare.gov Nursing Home Compare Website provided by the Centers for Medicare & Medicaid Services. These data allow you to compare the quality of care at every Medicare and Medicaid-certified nursing home in the country, including over 15,000 nationwide.

  5. Kaggle Wikipedia Web Traffic Daily Dataset (without Missing Values)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 1, 2021
    + more versions
    Cite
    Webb, Geoff (2021). Kaggle Wikipedia Web Traffic Daily Dataset (without Missing Values) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3892918
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Hyndman, Rob
    Godahewa, Rakshitha
    Bergmeir, Christoph
    Webb, Geoff
    Montero-Manso, Pablo
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset was used in the Kaggle Wikipedia Web Traffic forecasting competition. It contains 145063 daily time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2017-09-10.

    The original dataset contains missing values. They have been simply replaced by zeros.
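
    The zero-filling step described above can be sketched with pandas. The small frame below is made-up data, not rows from the dataset, and the page names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical daily hit counts for two pages; NaN marks a missing observation.
dates = pd.date_range("2015-07-01", periods=4, freq="D")
traffic = pd.DataFrame(
    {
        "page_a": [12.0, np.nan, 15.0, 9.0],
        "page_b": [np.nan, 3.0, np.nan, 7.0],
    },
    index=dates,
)

# Replace every missing value with zero, as the description states.
filled = traffic.fillna(0)

missing_before = int(traffic.isna().sum().sum())
missing_after = int(filled.isna().sum().sum())
print(filled)
print(f"missing before: {missing_before}, after: {missing_after}")
```

    Note that after this step a zero can mean either "no hits" or "not recorded", which may matter when fitting forecasting models.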

  6. Site compare scripts and output

    • data.amerigeoss.org
    • datasets.ai
    • +2more
    zip
    Updated Aug 17, 2022
    + more versions
    Cite
    United States (2022). Site compare scripts and output [Dataset]. http://doi.org/10.23719/1500018
    Available download formats: zip
    Dataset updated
    Aug 17, 2022
    Dataset provided by
    United States
    License

    https://pasteur.epa.gov/license/sciencehub-license.html

    Description

    Monthly site compare scripts and output used to generate the model/ob plots and statistics in the manuscript. The AQS hourly site compare output files are not included as they were too large to store on ScienceHub. The files contain paired model/ob values for the various air quality networks.

    This dataset is associated with the following publication: Appel, W., S. Napelenok, K. Foley, H. Pye, C. Hogrefe, D. Luecken, J. Bash, S. Roselle, J. Pleim, H. Foroutan, B. Hutzell, G. Pouliot, G. Sarwar, K. Fahey, B. Gantt, D. Kang, R. Mathur, D. Schwede, T. Spero, D. Wong, J. Young, and N. Heath. Description and evaluation of the Community Multiscale Air Quality (CMAQ) modeling system version 5.1. Geoscientific Model Development. Copernicus Publications, Katlenburg-Lindau, GERMANY, 10: 1703-1732, (2017).

  7. Website Analytics

    • catalog.data.gov
    • data.nola.gov
    • +4more
    Updated Jun 28, 2025
    Cite
    data.nola.gov (2025). Website Analytics [Dataset]. https://catalog.data.gov/dataset/website-analytics
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.nola.gov
    Description

    This data about nola.gov provides a window into how people are interacting with the City of New Orleans online. The data comes from a unified Google Analytics account for New Orleans. We do not track individuals and we anonymize the IP addresses of all visitors.

  8. Amazon Web Services Public Data Sets

    • neuinfo.org
    • dknet.org
    • +1more
    Updated Jan 29, 2022
    + more versions
    Cite
    (2022). Amazon Web Services Public Data Sets [Dataset]. http://identifiers.org/RRID:SCR_006318
    Dataset updated
    Jan 29, 2022
    Description

    A multidisciplinary repository of public data sets such as the Human Genome and US Census data that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community. Anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. If you have a public domain or non-proprietary data set that you think is useful and interesting to the AWS community, please submit a request and the AWS team will review your submission and get back to you. Typically the data sets in the repository are between 1 GB and 1 TB in size (based on the Amazon EBS volume limit), but the AWS team can work with you to host larger data sets as well. You must have the right to make the data freely available.

  9. Web Analytics Dataset

    • kaggle.com
    Updated Sep 4, 2023
    Cite
    Oluwapelumi Ojo (2023). Web Analytics Dataset [Dataset]. https://www.kaggle.com/datasets/oluwapelumiojo/web-analytics-dataset/code
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Sep 4, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Oluwapelumi Ojo
    Description

    This dataset contains information related to web marketing analytics. It includes metrics such as sessions, session duration, bounces, time on page, and unique page views, which give insight into web performance.
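
    Metrics like these are commonly combined into ratios such as bounce rate (bounces divided by sessions). The sketch below shows that arithmetic on made-up rows; the column names are assumptions and may not match the actual file.

```python
import pandas as pd

# Hypothetical rows; the real column names in the Kaggle file may differ.
df = pd.DataFrame(
    {
        "source": ["organic", "paid", "referral"],
        "sessions": [200, 50, 80],
        "bounces": [90, 30, 20],
        "session_duration_s": [24000, 3000, 12000],  # total seconds across sessions
    }
)

# Bounce rate: share of sessions recorded as bounces.
df["bounce_rate"] = df["bounces"] / df["sessions"]

# Average session duration in seconds.
df["avg_session_s"] = df["session_duration_s"] / df["sessions"]

print(df[["source", "bounce_rate", "avg_session_s"]])
```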

  10. Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and...

    • ieee-dataport.org
    Updated Oct 21, 2024
    Cite
    Mohamad Amar Irsyad Mohd Aminuddin (2024). Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages [Dataset]. https://ieee-dataport.org/documents/website-fingerprinting-dataset-browsing-network-traffic-desktop-and-mobile-webpages
    Dataset updated
    Oct 21, 2024
    Authors
    Mohamad Amar Irsyad Mohd Aminuddin
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This is a dataset of Tor cell files extracted from browsing simulations using Tor Browser. The simulations cover both desktop and mobile webpages. The data were collected with the WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the necessary configuration to perform the simulation is detailed in the tool repository. The webpage URLs are selected from the first 100 websites listed at https://dataforseo.com/free-seo-stats/top-1000-websites. Each webpage URL is visited 90 times in each of the desktop and mobile browsing modes.
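
    The collection design above implies a simple visit count: 100 sites, two browsing modes, 90 visits each. The sketch below enumerates that plan with placeholder site identifiers (the names are hypothetical); it says nothing about how many visits actually succeeded or ended up in the published files.

```python
from itertools import product

N_SITES = 100          # first 100 websites from the ranking list
VISITS_PER_MODE = 90   # visits per URL in each browsing mode
MODES = ("desktop", "mobile")

# Placeholder identifiers; the real URLs come from the ranking list.
sites = [f"site_{i:03d}" for i in range(N_SITES)]

# One tuple per planned visit: (site, mode, visit index).
plan = list(product(sites, MODES, range(VISITS_PER_MODE)))
total_visits = len(plan)
print(total_visits)
```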

  11. Influencer Marketing ROI Dataset

    • kaggle.com
    Updated Jun 9, 2025
    Cite
    Ojas Singh (2025). Influencer Marketing ROI Dataset [Dataset]. https://www.kaggle.com/datasets/tfisthis/influencer-marketing-roi-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jun 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Ojas Singh
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset tracks influencer marketing campaigns across major social media platforms, providing a robust foundation for analyzing campaign effectiveness, engagement, reach, and sales outcomes. Each record represents a unique campaign and includes details such as the campaign’s platform (Instagram, YouTube, TikTok, Twitter), influencer category (e.g., Fashion, Tech, Fitness), campaign type (Product Launch, Brand Awareness, Giveaway, etc.), start and end dates, total user engagements, estimated reach, product sales, and campaign duration. The dataset structure supports diverse analyses, including ROI calculation, campaign benchmarking, and influencer performance comparison.

    Columns:
    - campaign_id: Unique identifier for each campaign
    - platform: Social media platform where the campaign ran
    - influencer_category: Niche or industry focus of the influencer
    - campaign_type: Objective or style of the campaign
    - start_date, end_date: Campaign time frame
    - engagements: Total user interactions (likes, comments, shares, etc.)
    - estimated_reach: Estimated number of unique users exposed to the campaign
    - product_sales: Number of products sold as a result of the campaign
    - campaign_duration_days: Duration of the campaign in days

    Getting Started with the Data

    1. Load and Inspect the Dataset

    import pandas as pd
    
    df = pd.read_csv('influencer_marketing_roi_dataset.csv', parse_dates=['start_date', 'end_date'])
    print(df.head())
    df.info()  # info() prints its own summary and returns None
    

    2. Basic Exploration

    # Overview of campaign types and platforms
    print(df['campaign_type'].value_counts())
    print(df['platform'].value_counts())
    
    # Summary statistics
    print(df[['engagements', 'estimated_reach', 'product_sales']].describe())
    

    3. Engagement and Sales Analysis

    # Average engagements and sales by platform
    platform_stats = df.groupby('platform')[['engagements', 'product_sales']].mean()
    print(platform_stats)
    
    # Top influencer categories by product sales
    top_categories = df.groupby('influencer_category')['product_sales'].sum().sort_values(ascending=False)
    print(top_categories)
    

    4. ROI Calculation Example

    # Assume a hypothetical campaign cost for demonstration: a fixed base plus a reach-dependent component
    df['campaign_cost'] = 500 + df['estimated_reach'] * 0.01 # Example formula
    
    # Calculate ROI: (Revenue - Cost) / Cost
    # Assume each product sold yields $40 revenue
    df['revenue'] = df['product_sales'] * 40
    df['roi'] = (df['revenue'] - df['campaign_cost']) / df['campaign_cost']
    
    # View campaigns with highest ROI
    top_roi = df.sort_values('roi', ascending=False).head(10)
    print(top_roi[['campaign_id', 'platform', 'roi']])
    

    5. Visualizing Campaign Performance

    import matplotlib.pyplot as plt
    import seaborn as sns
    
    # Engagements vs. Product Sales scatter plot
    plt.figure(figsize=(8,6))
    sns.scatterplot(data=df, x='engagements', y='product_sales', hue='platform', alpha=0.6)
    plt.title('Engagements vs. Product Sales by Platform')
    plt.xlabel('Engagements')
    plt.ylabel('Product Sales')
    plt.legend()
    plt.show()
    
    # Average ROI by Influencer Category
    category_roi = df.groupby('influencer_category')['roi'].mean().sort_values()
    category_roi.plot(kind='barh', color='teal')
    plt.title('Average ROI by Influencer Category')
    plt.xlabel('Average ROI')
    plt.show()
    

    6. Time-Based Analysis

    # Campaigns over time
    df['month'] = df['start_date'].dt.to_period('M')
    monthly_sales = df.groupby('month')['product_sales'].sum()
    monthly_sales.plot(figsize=(10,4), marker='o', title='Monthly Product Sales from Influencer Campaigns')
    plt.ylabel('Product Sales')
    plt.show()
    

    Use Cases

    • ROI Analysis: Quantify the return on investment for influencer campaigns across platforms and categories.
    • Campaign Benchmarking: Compare campaign performance by type, influencer niche, or platform.
    • Trend Analysis: Track engagement, reach, and sales trends over time.
    • Influencer Selection: Identify high-performing influencer categories and campaign types for future partnerships.
  12. Data from: Internet users

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Apr 6, 2021
    Cite
    Office for National Statistics (2021). Internet users [Dataset]. https://www.ons.gov.uk/businessindustryandtrade/itandinternetindustry/datasets/internetusers
    Available download formats: xlsx
    Dataset updated
    Apr 6, 2021
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Internet use in the UK annual estimates by age, sex, disability, ethnic group, economic activity and geographical location, including confidence intervals.

  13. .compare Domain List (CSV) | DomainMetaData

    • domainmetadata.com
    csv
    Updated Aug 22, 2025
    Cite
    arndt.ai GmbH (2025). .compare Domain List (CSV) | DomainMetaData [Dataset]. https://domainmetadata.com/compare-domain-list
    Available download formats: csv
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    arndt.ai GmbH
    License

    https://domainmetadata.com/terms

    Variables measured
    domain name
    Measurement technique
    DNS zone file monitoring, web crawling, machine learning prediction
    Description

    Download new, active & historic .compare domains — updated multiple times daily

  14. Data from: Web Academy

    • publicschoolreview.com
    json, xml
    Cite
    Public School Review, Web Academy [Dataset]. https://www.publicschoolreview.com/web-academy-profile
    Available download formats: xml, json
    Dataset authored and provided by
    Public School Review
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 2005 - Dec 31, 2025
    Description

    The Historical Dataset of Web Academy is provided by PublicSchoolReview and contains statistics on the following metrics:

    • Total Students Trends Over Years (2005-2007)
    • Distribution of Students By Grade Trends
    • American Indian Student Percentage Comparison Over Years (2006-2007)
    • Asian Student Percentage Comparison Over Years (2005-2007)
    • Hispanic Student Percentage Comparison Over Years (2005-2007)
    • Black Student Percentage Comparison Over Years (2005-2007)
    • White Student Percentage Comparison Over Years (2005-2007)
    • Diversity Score Comparison Over Years (2005-2007)
    • Free Lunch Eligibility Comparison Over Years (2006-2007)

  15. Statistics Bureau Web Service Interface (WFS) Dataset Collection 2020 -...

    • store.smartdatahub.io
    Updated Nov 11, 2024
    + more versions
    Cite
    (2024). Statistics Bureau Web Service Interface (WFS) Dataset Collection 2020 - Datasets - This service has been deprecated - please visit https://www.smartdatahub.io/ to access data. See the About page for details. // [Dataset]. https://store.smartdatahub.io/dataset/fi_tilastokeskus_tilastointialueet_seutukunta4500k_2020
    Dataset updated
    Nov 11, 2024
    Description

    This dataset collection comprises a series of related data tables sourced from the website of 'Tilastokeskus' (Statistics Finland), based in Finland. The tables within this collection contain data retrieved from the Statistics Finland's service interface (WFS). The content of the tables is organized in a structured format with rows and columns, showcasing a correlation between different sets of data. The collection, while primarily intended for statistical analysis, can be utilized in a variety of ways, depending on the specific needs of the user. This dataset is licensed under CC BY 4.0 (Creative Commons Attribution 4.0, https://creativecommons.org/licenses/by/4.0/deed.fi).

  16. Job Offers Web Scraping Search

    • kaggle.com
    Updated Feb 11, 2023
    Cite
    The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Job Offers Web Scraping Search

    Targeted Results to Find the Optimal Work Solution


    About this dataset

    This dataset collects job offers obtained by web scraping and filtered according to specific keywords, locations and times. This data gives users rich and precise search capabilities to uncover the best working solution for them. With the information collected, users can explore options that match their personal situation, skillset and preferences in terms of location and schedule. The columns provide detailed information on job titles, employer names, locations, time frames and other necessary parameters, so you can make a smart choice for your next career opportunity.


    How to use the dataset

    This dataset is a great resource for those looking to find an optimal work solution based on keywords, location and time parameters. With this information, users can quickly and easily search through job offers that best fit their needs. Here are some tips on how to use this dataset to its fullest potential:

    • Start by identifying what type of job offer you want to find. The keyword column will help you narrow down your search by allowing you to search for job postings that contain the word or phrase you are looking for.

    • Next, consider where the job is located – the Location column tells you where in the world each posting is from so make sure it’s somewhere that suits your needs!

    • Finally, consider when the position is available. The Time frame column indicates when each posting was made, and whether the role is full-time, part-time, or casual/temporary, so check that it meets your requirements before applying!

    • Additionally, if hours per week or further schedule details are important criteria, the Horari and Temps_Oferta columns provide that information. Once all three criteria (keywords, location and time frame) have been ticked off, look at the Empresa (Company Name) and Nom_Oferta (Post Name) columns to get an idea of who will be employing you should you land the gig!

      All these pieces of data put together should give any motivated individual all they need to seek out an optimal work solution. Keep hunting, and good luck!

    Research Ideas

    • Machine learning can be used to group job offers in order to facilitate the identification of similarities and differences between them. This could allow users to specifically target their search for a work solution.
    • The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better informed decisions in terms of their career options and goals.
    • It may also provide an insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

    Columns

    File: web_scraping_information_offers.csv

    | Column name  | Description                         |
    |:-------------|:------------------------------------|
    | Nom_Oferta   | Name of the job offer. (String)     |
    | Empresa      | Company offering the job. (String)  |
    | Ubicació     | Location of the job offer. (String) |
    | Temps_Oferta | Time of the job offer. (String)     |
    | Horari       | Schedule of the job offer. (String) |
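
    The keyword-and-location search described in the usage tips can be sketched with pandas using the column names from the table above. The rows below are made up for illustration, and the `search` helper is hypothetical; in practice the frame would presumably come from reading web_scraping_information_offers.csv.

```python
import pandas as pd

# Made-up rows using the documented column names; not taken from the dataset.
offers = pd.DataFrame(
    {
        "Nom_Oferta": ["Data Analyst", "Cambrer/a", "Junior Data Engineer"],
        "Empresa": ["Acme SL", "Bar Example", "Acme SL"],
        "Ubicació": ["Barcelona", "Girona", "Barcelona"],
        "Temps_Oferta": ["Fa 2 dies", "Fa 1 setmana", "Fa 3 dies"],
        "Horari": ["Jornada completa", "Mitja jornada", "Jornada completa"],
    }
)


def search(df: pd.DataFrame, keyword: str, location: str) -> pd.DataFrame:
    """Return offers whose title contains `keyword` and whose location equals `location`."""
    # Case-insensitive literal substring match on the offer title.
    title_hit = df["Nom_Oferta"].str.contains(keyword, case=False, regex=False)
    # Case-insensitive exact match on the location.
    place_hit = df["Ubicació"].str.lower() == location.lower()
    return df[title_hit & place_hit]


matches = search(offers, "data", "barcelona")
print(matches[["Nom_Oferta", "Empresa", "Horari"]])
```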


  17. Cinema Context web statistics 2011-2017.xlsx

    • figshare.com
    xlsx
    Updated Nov 8, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Julia Noordegraaf (2018). Cinema Context web statistics 2011-2017.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.5972146.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Nov 8, 2018
    Dataset provided by
    figshare
    Authors
    Julia Noordegraaf
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This file contains data extracted from DPC, “Statistics for: www.cinemacontext.nl” (https://dpc.uba.uva.nl/awstats/awstats.pl?config=www.cinemacontext.nl). The statistics for the Cinema Context website are collected by the Digital Production Center (DPC) of the University Library Amsterdam (UBA), the organization that hosts and maintains the database and web interface. DPC collects the web statistics with the program Advanced Web Statistics (AWStats, version 7.0). The extracted data in this spreadsheet support the analysis of the use of Cinema Context for the article ‘Writing Cinema Histories with Digital Databases. The Case of Cinema Context’, authored by Julia Noordegraaf, Kathleen Lotze and Jaap Boter. Tijdschrift voor Mediageschiedenis vol. 21, no. 2 (2018), 106-126. http://www.tijdschriftmediageschiedenis.nl/index.php/tmg/article/view/369.

  18. Amount of data created, consumed, and stored 2010-2023, with forecasts to...

    • statista.com
    Updated Jun 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Amount of data created, consumed, and stored 2010-2023, with forecasts to 2028 [Dataset]. https://www.statista.com/statistics/871513/worldwide-data-created/
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 2024
    Area covered
    Worldwide
    Description

    The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.

    Storage capacity also growing

    Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
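
    The compound annual growth rate mentioned above follows the standard formula (end / start)^(1 / years) - 1. A minimal sketch is shown here; because the source masks the actual zettabyte figures, the inputs are placeholder values chosen only to illustrate the arithmetic, and the `cagr` helper name is an assumption.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Placeholder figures only; the source masks the real zettabyte values.
rate = cagr(start_value=100.0, end_value=200.0, years=5)
print(f"{rate:.2%}")
```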

  19. Africa - Population and Internet users statistics

    • kaggle.com
    Updated Dec 17, 2020
    Cite
    Ishmeet singh (2020). Africa - Population and Internet users statistics [Dataset]. https://www.kaggle.com/datasets/ishmeet/africa-population-and-internet-users-statistics
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Kaggle
    Authors
    Ishmeet singh
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    Africa
    Description

    Context

    Africa - Population and Internet users statistics

    Content

    What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

    Acknowledgements

    Source: https://data.humdata.org/dataset/africa-population-and-internet-users-statistics. Last updated at https://data.humdata.org/organization/openafrica: 2019-09-11

  20. Choose Maryland: Compare Metros - Demographics

    • splitgraph.com
    • opendata.maryland.gov
    • +4 more
    Updated Jul 9, 2024
    Cite
    Maryland Department of Commerce (2024). Choose Maryland: Compare Metros - Demographics [Dataset]. https://www.splitgraph.com/opendata-maryland-gov/choose-maryland-compare-metros-demographics-h2qn-scd8
    Explore at:
    Available download formats: application/vnd.splitgraph.image, application/openapi+json, json
    Dataset updated
    Jul 9, 2024
    Dataset authored and provided by
    Maryland Department of Commerce
    Area covered
    Maryland
    Description

    Population and income profile - totals, median household.

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications. For example:
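
    A minimal Python sketch of what such a request might look like. The endpoint URL and the table name are assumptions not stated on this page (only the repository path comes from the dataset URL above), so both should be checked against the Splitgraph documentation before use.

```python
import json
import urllib.request

# Assumption: a SQL-over-HTTP endpoint of this form; verify in the Splitgraph docs.
SPLITGRAPH_SQL_ENDPOINT = "https://data.splitgraph.com/sql/query/ddn"

# Repository path as it appears in the dataset URL on this page.
REPOSITORY = "opendata-maryland-gov/choose-maryland-compare-metros-demographics-h2qn-scd8"

# Hypothetical table name, used for illustration only.
TABLE = "choose_maryland_compare_metros_demographics"

SQL = f'SELECT * FROM "{REPOSITORY}"."{TABLE}" LIMIT 10'


def build_query_request(sql, endpoint=SPLITGRAPH_SQL_ENDPOINT):
    """Build (but do not send) a POST request carrying the SQL as a JSON body."""
    payload = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, endpoint=SPLITGRAPH_SQL_ENDPOINT):
    """Send the query and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_query_request(sql, endpoint), timeout=30) as resp:
        return json.load(resp)
```

    Calling `run_query(SQL)` would send the request; building the request separately keeps the payload easy to inspect before anything goes over the network.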

    See the Splitgraph documentation for more information.
