79 datasets found

Number of internet users worldwide 2014-2029
statista.com
Updated Apr 11, 2025
Cite
Statista Research Department (2025). Number of internet users worldwide 2014-2029 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
Dataset updated
Apr 11, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
World
Description
The global number of internet users was forecast to increase continuously between 2024 and 2029 by a total of 1.3 billion users (+23.66 percent). After the fifteenth consecutive year of growth, the number of users is estimated to reach 7 billion, a new peak, in 2029. Notably, the number of internet users has increased continuously over the past years. Depicted is the estimated number of individuals in the country or region at hand who use the internet. As the data source clarifies, connection quality and usage frequency are distinct aspects not taken into account here. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of internet users in regions such as the Americas and Asia.
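As a rough consistency check on the figures in this description, the stated increase and growth rate imply a 2024 baseline. The sketch below assumes only the rounded values quoted above (1.3 billion users and 23.66 percent), so its results are approximate and are not Statista's own figures.

```python
# Rounded inputs taken from the description; results are approximate.
increase_billion = 1.3   # forecast increase between 2024 and 2029
growth_rate = 0.2366     # +23.66 percent over the same period

# increase = base * growth_rate, so base = increase / growth_rate
base_2024 = increase_billion / growth_rate
projected_2029 = base_2024 + increase_billion

print(f"Implied 2024 users: {base_2024:.2f} billion")
print(f"Implied 2029 users: {projected_2029:.2f} billion")
```

The implied 2029 value of roughly 6.8 billion is consistent with the "7 billion" quoted in the description once rounding is taken into account.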
Network Traffic Dataset
kaggle.com
Updated Oct 31, 2023
Cite
Ravikumar Gattu (2023). Network Traffic Dataset [Dataset]. https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 31, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ravikumar Gattu
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The data presented here was obtained on a Kali machine at the University of Cincinnati, Cincinnati, Ohio, by carrying out packet captures for 1 hour during the evening of Oct 9th, 2023 using Wireshark. The dataset consists of 394137 instances stored in a CSV (Comma Separated Values) file. This large dataset could be used for different machine learning applications, for instance classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection, and anomaly detection.

The dataset can be used for a variety of machine learning tasks, such as network intrusion detection, traffic classification, and anomaly detection.

Content:

This network traffic dataset consists of 7 features. Each instance contains the source and destination IP addresses. The majority of the properties are numeric in nature; however, there are also nominal and date types due to the Timestamp.

The network traffic flow statistics (No. Time Source Destination Protocol Length Info) were obtained using Wireshark (https://www.wireshark.org/).

Dataset Columns:

No: Number of the instance
Timestamp: Timestamp of the instance of network traffic
Source IP: IP address of the source
Destination IP: IP address of the destination
Protocol: Protocol used by the instance
Length: Length of the instance
Info: Information about the traffic instance
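A minimal sketch of how such a capture file might be summarised, assuming the CSV uses Wireshark's default export header (`No.`, `Time`, `Source`, `Destination`, `Protocol`, `Length`, `Info`). The actual header names in the Kaggle file may differ, and the sample rows below are hypothetical, so the column name is left as a parameter.

```python
import csv
import io
from collections import Counter

def protocol_counts(csv_file, column="Protocol"):
    """Count rows per protocol in a Wireshark-style CSV export."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)

# Hypothetical sample rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    '"No.","Time","Source","Destination","Protocol","Length","Info"\n'
    '"1","0.000000","10.0.0.5","10.0.0.1","DNS","74","Standard query"\n'
    '"2","0.000312","10.0.0.1","10.0.0.5","DNS","90","Standard query response"\n'
    '"3","0.001100","10.0.0.5","93.184.216.34","TCP","66","SYN"\n'
)
counts = protocol_counts(sample)
print(counts.most_common())
```

To run it on the real file, pass an open file handle instead of the in-memory sample and adjust `column` to whatever the header actually calls the protocol field.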

Acknowledgements:

I would like to thank the University of Cincinnati for providing the infrastructure used to generate this network traffic dataset.

Ravikumar Gattu, Susmitha Choppadandi

Inspiration: This dataset goes beyond the majority of network traffic classification datasets, which only identify the type of application (WWW, DNS, ICMP, ARP, RARP) that an IP flow contains. Instead, it can be used to build machine learning models that identify specific applications (such as TikTok, Wikipedia, Instagram, YouTube, websites, blogs, etc.) from IP flow statistics (there are currently 25 applications in total).

Dataset License: CC0: Public Domain

Dataset Usages: This dataset can be used for different machine learning applications in the field of cybersecurity, such as classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection, and anomaly detection.

ML techniques that benefit from this dataset:

This dataset is highly useful because it consists of 394137 instances of network traffic data obtained by using the 25 applications on public, private, and enterprise networks. The dataset also contains important features that can be used for most applications of machine learning in cybersecurity. A few of the potential machine learning applications that could benefit from this dataset are:

1. Network Performance Monitoring: This large network traffic dataset can be used to analyse network traffic and identify patterns in the network. This helps in designing network security algorithms that minimise network problems.

2. Anomaly Detection: A large network traffic dataset can be used to train machine learning models to find irregularities in the traffic, which could help identify cyber attacks.

3. Network Intrusion Detection: This large dataset could be used to train machine learning algorithms and design models for detecting traffic issues, malicious traffic, network attacks, and DoS attacks.
Attitudes towards the internet in China 2025
statista.com
Updated Apr 11, 2025
Cite
Umair Bashir (2025). Attitudes towards the internet in China 2025 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
Dataset updated
Apr 11, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Umair Bashir
Description
When asked about "Attitudes towards the internet", most Chinese respondents pick "It is important to me to have mobile internet access in any place" as an answer. 50 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.
Custom dataset from any website on the Internet
datarade.ai
Updated Sep 21, 2022
Cite
ScrapeLabs (2022). Custom dataset from any website on the Internet [Dataset]. https://datarade.ai/data-products/custom-dataset-from-any-website-on-the-internet-scrapelabs
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Sep 21, 2022
Dataset authored and provided by
ScrapeLabs
Area covered
Jordan, Kazakhstan, Bulgaria, India, Turks and Caicos Islands, Tunisia, Guinea-Bissau, Argentina, Aruba, Lebanon
Description
We'll extract any data from any website on the Internet. You don't have to worry about buying and maintaining complex and expensive software, or hiring developers.

Some common use cases our customers use the data for: • Data Analysis • Market Research • Price Monitoring • Sales Leads • Competitor Analysis • Recruitment

We can get data from websites with pagination or scroll, with captchas, and even from behind logins. Text, images, videos, documents.

Receive data in any format you need: Excel, CSV, JSON, or any other.
Web Graphs
kaggle.com
zip
Updated Nov 11, 2021
Cite
Subhajit Sahu (2021). Web Graphs [Dataset]. https://www.kaggle.com/wolfram77/graphs-web
Explore at:
Available download formats: zip (52848952 bytes)
Dataset updated
Nov 11, 2021
Authors
Subhajit Sahu
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The dynamic face-to-face interaction networks represent the interactions that happen during discussions between a group of participants playing the Resistance game. This dataset contains networks extracted from 62 games. Each game is played by 5-8 participants and lasts between 45 and 60 minutes. We extract dynamically evolving networks from the free-form discussions using the ICAF algorithm. The extracted networks are used to characterize and detect group deceptive behavior using the DeceptionRank algorithm.

The networks are weighted, directed and temporal. Each node represents a participant. At each 1/3 second, a directed edge from node u to v is weighted by the probability of participant u looking at participant v or the laptop. Additionally, we also provide a binary version where an edge from u to v indicates participant u looks at participant v (or the laptop).
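To illustrate the relationship between the weighted and binary versions, the sketch below derives a binary edge set from one time step of look-at probabilities. The description does not say how the binary version was actually produced; keeping only the most probable target per participant is an assumption made here for illustration, and the node names and probabilities are hypothetical.

```python
def binarize(weights):
    """Keep, for each source node, only its highest-probability edge.

    `weights` maps (source, target) pairs to look-at probabilities for a
    single time step. Returns a set of (source, target) edges.
    This argmax rule is an assumption, not the dataset authors' method.
    """
    best = {}
    for (u, v), p in weights.items():
        if u not in best or p > best[u][1]:
            best[u] = (v, p)
    return {(u, v) for u, (v, _) in best.items()}

# Hypothetical probabilities for one 1/3-second step; the laptop is a target too.
step = {
    ("P1", "P2"): 0.7, ("P1", "laptop"): 0.3,
    ("P2", "P1"): 0.2, ("P2", "P3"): 0.5, ("P2", "laptop"): 0.3,
}
edges = binarize(step)
print(sorted(edges))
```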

Stanford Network Analysis Platform (SNAP) is a general purpose, high performance system for analysis and manipulation of large networks. Graphs consist of nodes and directed/undirected/multiple edges between the graph nodes. Networks are graphs with data on nodes and/or edges of the network.

The core SNAP library is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. Besides scalability to large graphs, an additional strength of SNAP is that nodes, edges and attributes in a graph or a network can be changed dynamically during the computation.

SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in Nov, 2009. SNAP uses a general purpose STL (Standard Template Library)-like library GLib developed at Jozef Stefan Institute. SNAP and GLib are being actively developed and used in numerous academic and industrial projects.

http://snap.stanford.edu/data/index.html#face2face
Long-Term Agricultural Research (LTAR) network - Meteorological Collection
catalog.data.gov
Updated Apr 21, 2025
Cite
Agricultural Research Service (2025). Long-Term Agricultural Research (LTAR) network - Meteorological Collection [Dataset]. https://catalog.data.gov/dataset/long-term-agricultural-research-ltar-network-meteorological-collection-7d719
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Service (https://www.ars.usda.gov/)
Description
The LTAR network maintains stations for standard meteorological measurements including, generally, air temperature and humidity, shortwave (solar) irradiance, longwave (thermal) radiation, wind speed and direction, barometric pressure, and precipitation. Many sites also have extensive comparable legacy datasets. The LTAR scientific community decided that these needed to be made available to the public using a single web source in a consistent manner. To that purpose, each site sent data on a regular schedule, as frequently as hourly, to the National Agricultural Library, which has developed a web service to provide the data to the public in tabular or graphical form. This archive of the LTAR legacy database exports contains meteorological data through April 30, 2021. For current meteorological data, visit the GeoEvent Meteorology Resources page, which provides tools and dashboards to view and access data from the 18 LTAR sites across the United States.

Resources in this dataset:

Resource Title: Meteorological data.
File Name: ltar_archive_DB.zip
Resource Description: This is an export of the meteorological data collected by LTAR sites and ingested by the NAL LTAR application. This export consists of an SQL schema definition file for creating database tables and the data itself. The data is provided in two formats: SQL insert statements (.sql) and CSV files (.csv). Please use the format most convenient for you. Note that the SQL insert statements take much longer to run since each row is an individual insert.

Description of zip files

The ltararchive*.zip files contain database exports. The schema is a .sql file; the data is exported as both SQL inserts and CSV for convenience. There is a README in markdown and PDF in the zips.

Contains the database export of the schema and data for the site, site_station, and met tables as SQL insert statements.
ltar_archive_db_sql_export_20201231.zip --> has data until 2020-12-31
ltar_archive_db_sql_export_20210430.zip --> has data until 2021-04-30

Contains the database export of the schema and data for the site, site_station, and met tables as CSV.
ltar_archive_db_csv_export_20201231.zip --> has data until 2020-12-31
ltar_archive_db_csv_export_20210430.zip --> has data until 2021-04-30

Contains the raw CSV files that were sent to NAL from the LTAR sites/stations.
ltar_rawcsv_archive.zip --> has data until 2021-04-30
FCC477 Virginia Broadband Dataset
data.virginia.gov
zip
Updated Feb 3, 2024
Cite
Other (2024). FCC477 Virginia Broadband Dataset [Dataset]. https://data.virginia.gov/dataset/fcc477-virginia-broadband-dataset
Explore at:
Available download formats: zip (251278001)
Dataset updated
Feb 3, 2024
Dataset authored and provided by
Other
Description
This data represents the Virginia State Broadband Data area of broadband availability. Area of broadband availability refers to those individual US Census Blocks where each facilities-based provider of broadband service claims to provide broadband services. For each area, the provider name and the technology provided there are represented. In addition, advertised and typical upload and download speeds are often reported at these levels. If a provider offers availability to any location within a census block, the entire block is deemed available under this effort. For this purpose, "broadband service" is the provision, on either a commercial or non-commercial basis, of data transmission technology that provides two-way data transmission to and from the Internet with advertised speeds of at least 4 megabits per second (Mbps) downstream and 0.5 Mbps upstream to end users. For this purpose, an "end user" of broadband service is a residential or business party, institution or State or local government entity that may use broadband service for its own purposes. An entity is a "facilities-based" provider of broadband service connections to end user locations if any of the following conditions are met: (1) it owns the portion of the physical facility that terminates at the end user location; (2) it obtains unbundled network elements (UNEs), special access lines, or other leased facilities that terminate at the end user location and provisions/equips them as broadband; or (3) it provisions/equips a broadband wireless channel to the end user location over licensed or unlicensed spectrum.

VABB_CABLE: Cable Wireline Coverage (June 2019)
VABB_DSL_COPPER: DSL/Copper Coverage (June 2019)
VABB_FIBER: Fiber Optic Coverage (June 2019)
VABB_FIXED: Fixed Wireless Coverage (June 2019)
VABB_MOBILE: Mobile Wireless Coverage (Dec. 2018)
VABB_LTE: 4G/LTE Wireless Coverage (Dec. 2018)
VABB_satellite: Satellite Coverage (Dec. 2018)
VABB_VATI: Virginia Telecommunication Initiative (VATI) Funding
VABB_TRRC: Tobacco Region Revitalization Commission (TRRC) Funding
VABB_UNDERSERVED: Underserved Areas greater than 10 Mbps download and 1 Mbps upload and less than 25 Mbps download and 3 Mbps upload
VABB_Unserved: Unserved Areas below or equal to 10 Mbps download and 1 Mbps upload
VABB_Lacking: No Residential Broadband (25/3) reported (June 2019)
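The unserved and underserved thresholds in the layer list can be expressed as a small classifier. This is only a sketch: the layer descriptions do not say how mixed cases are handled (for example, fast download with slow upload), so the rule below, where the weaker direction decides the tier, is an assumption, as are the function names.

```python
def tier(value, low, high):
    """Return 0 if value <= low, 1 if low < value < high, else 2."""
    if value <= low:
        return 0
    if value < high:
        return 1
    return 2

def classify(download_mbps, upload_mbps):
    """Classify a location using the 10/1 and 25/3 Mbps thresholds.

    Assumption: the weaker of the two directions determines the class.
    """
    labels = ["unserved", "underserved", "served"]
    level = min(tier(download_mbps, 10, 25), tier(upload_mbps, 1, 3))
    return labels[level]

print(classify(10, 1))    # at or below 10/1
print(classify(20, 2))    # between 10/1 and 25/3
print(classify(25, 3))    # meets 25/3
```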
Adverse effects of using the Internet and social networking websites or apps...
open.canada.ca
www150.statcan.gc.ca
csv, html, xml
Updated Jan 17, 2023
Cite
Statistics Canada (2023). Adverse effects of using the Internet and social networking websites or apps by gender and age group, inactive [Dataset]. https://open.canada.ca/data/en/dataset/80c88ac9-8ea1-4ff7-856e-560f7683d660
Explore at:
Available download formats: html, xml, csv
Dataset updated
Jan 17, 2023
Dataset provided by
Statistics Canada
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
Percentage of Internet users who have experienced selected personal effects in their life because of the Internet and the use of social networking websites or apps, during the past 12 months.
Tudor Networks of Power - correspondence network dataset
repository.cam.ac.uk
application/gzip, txt
Updated Oct 4, 2023
Cite
Ahnert, Ruth; Ahnert, Sebastian; Cree, Jose; Fikkers, Lotte (2023). Tudor Networks of Power - correspondence network dataset [Dataset]. http://doi.org/10.17863/CAM.99562.2
Explore at:
Available download formats: txt (2449 bytes), application/gzip (2172391 bytes)
Unique identifier
https://doi.org/10.17863/CAM.99562.2
Dataset updated
Oct 4, 2023
Dataset provided by
University of Cambridge
Apollo
Authors
Ahnert, Ruth; Ahnert, Sebastian; Cree, Jose; Fikkers, Lotte
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
Tudor Networks of Power - Correspondence Network Dataset

Ruth Ahnert, Sebastian E. Ahnert, Jose Cree, and Lotte Fikkers

© 2023. This work is licensed under a CC BY-NC-SA 4.0 license. If using this dataset, please cite:

R. Ahnert, S. E. Ahnert, "Tudor Networks of Power", Oxford University Press, 2023.

R. Ahnert, S. E. Ahnert, J. Cree, & L. Fikkers, "Tudor Networks of Power - correspondence network dataset". Apollo - University of Cambridge Repository (2023). https://doi.org/10.17863/CAM.99562

The data is released under a Creative Commons BY-NC-SA 4.0 license, which:

- requires attribution

- permits distribution, remixing, adaptation, or building upon this data as long as the modified material is licensed under identical terms

- only permits non-commercial uses of the work

This data contains a temporal, directed edgelist representing (to the best of our knowledge) all items of correspondence in the Tudor State Papers (1509-1603), which are the official government records of the Tudor period in England. The data covers State Papers Domestic and Foreign.

The dataset was created by first extracting the relevant XML metadata of the State Papers Online resource developed by Gale Cengage. We would like to acknowledge the help and support that Gale Cengage provided for our research. The XML metadata closely corresponds to the State Papers Calendars of the 19th century. These contain many ambiguities regarding the identities of people and places, resulting in an extensive effort on our part to disambiguate and de-duplicate person identities and places of writing. The details of this process can be found in our book (see citation above).

The dataset contains:

'letter_edgelist.tsv' - Directed temporal edge list of letters

'people_labels.tsv' - Key for the person IDs used in letter_edgelist.tsv

'place_labels.tsv' - Key for the place IDs used in letter_edgelist.tsv

'people_metadata.tsv' - Additional metadata and URIs for a subset of people

'places_metadata.tsv' - Geolocations and metadata for a large subset of places
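A minimal sketch of how a directed edge list like 'letter_edgelist.tsv' could be summarised. The column layout is not given in the description, so the sketch assumes tab-separated rows with the sender ID in the first column and no header row; the sample rows are hypothetical and the function name is illustrative.

```python
import csv
import io
from collections import Counter

def out_degrees(edge_file, sender_col=0):
    """Count letters sent per person ID from a tab-separated edge list.

    Assumes no header row and the sender ID in column `sender_col`.
    """
    reader = csv.reader(edge_file, delimiter="\t")
    return Counter(row[sender_col] for row in reader if row)

# Hypothetical rows (sender ID, recipient ID, date); not real data.
sample = io.StringIO(
    "101\t202\t15300412\n"
    "101\t303\t15300501\n"
    "202\t101\t15300620\n"
)
degrees = out_degrees(sample)
print(degrees.most_common())
```

Person IDs counted this way could then be resolved to names through 'people_labels.tsv', once its actual column layout is confirmed against the files.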

Both the code and more extensive datasets that give context to the data curation process, the network analysis methods, and quantitative results in the book can be found at https://github.com/tudor-networks-of-power/code.

Number of global social network users 2017-2028

statista.com
es.statista.com

Cite

Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

Who uses social media?
Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media's global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America had the highest average time spent per day on social media.

What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.

Phishing Websites Detection
kaggle.com
zip
Updated May 28, 2020
Cite
J Akshaya (2020). Phishing Websites Detection [Dataset]. https://www.kaggle.com/akshaya1508/phishing-websites-detection
Explore at:
Available download formats: zip (80950 bytes)
Dataset updated
May 28, 2020
Authors
J Akshaya
Description
Context

Phishing is a form of identity theft that occurs when a malicious website impersonates a legitimate one in order to acquire sensitive information such as passwords, account details, or credit card numbers. People generally tend to fall prey to this very easily, owing to the craftsmanship of the attackers, which makes people believe that they are on a legitimate website. There is a need to identify potential phishing websites and differentiate them from legitimate ones. This dataset identifies the prominent features of phishing websites; 10 such features have been identified.

Content

Generally, the open source datasets available on the internet do not come with the code and the logic, which gives rise to certain problems, i.e.:

Limited Data: The ML algorithms can only be tested with the existing phishing URLs, and no new phishing URLs can be checked for validity.

Outdated URLs: The datasets available on the internet were uploaded a long time ago, while new kinds of phishing URLs arise every second.

Outdated Features: The datasets available on the internet were uploaded a long time ago, while new methodologies keep arising in phishing techniques.

No Access to Backend: There is no stepwise guide describing how each feature has been derived.

In contrast, we are trying to overcome all the above-mentioned problems.

1. Real Time Data: Before applying a machine learning algorithm, we can run the script and fetch real-time URLs from Phishtank (for phishing URLs) and from moz (for legitimate URLs).

2. Scalable Data: We can also specify the number of URLs we want to feed the model, and the web scraper will fetch that amount of data from the websites. Presently we are using 1401 URLs in this project, i.e. 901 phishing URLs and 500 legitimate URLs.

3. New Features: We have tried to implement the prominent new features present in current phishing URLs, and since we own the code, new features can also be added.

4. Source code on GitHub: The source code is published on GitHub for public use and can be used for further improvements. This way there will be transparency about the logic, and more creators can add their meaningful additions to the code.
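To give a sense of what URL-derived features can look like, here is a hedged sketch using only Python's standard library. The description does not list the dataset's 10 features, so the ones below are generic lexical checks chosen for illustration, not the authors' feature set; the function name and example URLs are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse

def url_features(url):
    """Extract a few simple lexical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)   # raises ValueError for ordinary host names
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "host_is_ip": host_is_ip,
        "has_at_symbol": "@" in url,
        "host_dots": host.count("."),
        "uses_https": parsed.scheme == "https",
    }

# Hypothetical example URLs, for illustration only.
suspicious = url_features("http://192.168.0.1/login")
ordinary = url_features("https://www.example.com/")
print(suspicious)
print(ordinary)
```

Feature dictionaries like these could be stacked into rows of a table and labelled as phishing or legitimate before training a classifier.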

Link to the source code

https://github.com/akshaya1508/detection_of_phishing_websites.git

Inspiration

The idea to develop the dataset and the code for this dataset has been inspired by various other creators who have worked on the similar lines.
Places
catalog.data.gov
Updated Jul 17, 2025
Cite
United States Census Bureau (USCB) (Point of Contact) (2025). Places [Dataset]. https://catalog.data.gov/dataset/places2
Dataset updated
Jul 17, 2025
Dataset provided by
United States Census Bureau (http://census.gov/)
Description
The Places dataset was published on August 31, 2022 from the United States Census Bureau (USCB) and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). This resource is a member of a series. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The TIGER/Line shapefiles include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. 
CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. The boundaries of most incorporated places in this shapefile are as of January 1, 2022, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The boundaries of all CDPs were delineated as part of the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census, but some CDPs were added or updated through the 2022 BAS as well. A data dictionary, or other source of attribute information, is accessible at https://doi.org/10.21949/1529072
Available Wireless Sensor Network and Internet of Things testbed facilities:...
data.europa.eu
unknown
Updated Oct 7, 2022
Cite
Zenodo (2022). Available Wireless Sensor Network and Internet of Things testbed facilities: dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7157221?locale=cs
Explore at:
Available download formats: unknown (2365963)
Dataset updated
Oct 7, 2022
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In this data set, we present data collected for the purpose of carrying out a systematic review of the available Wireless Sensor Network and Internet of Things testbed facilities. The data was collected through multiple stages and in each stage the pre-defined criteria were applied. We provide a dataset describing the hardware and software aspects of Wireless Sensor Network and Internet of Things testbed facilities available in the market and scientific community. The data were gathered through an extensive systematic review process of scientific articles published between the years 2011 and 2021. The review aims to obtain good quality data for people who are actively researching the Internet of Things facilities or anyone who is interested in that field.
TIGER/Line Shapefile, 2021, State, Louisiana, Places
catalog.data.gov
datasets.ai
Updated Nov 1, 2022
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Publisher) (2022). TIGER/Line Shapefile, 2021, State, Louisiana, Places [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2021-state-louisiana-places
Dataset updated
Nov 1, 2022
Dataset provided by
United States Census Bureau (http://census.gov/)
United States Department of Commerce (http://commerce.gov/)
Area covered
Louisiana
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The TIGER/Line shapefiles include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. 
The boundaries of most incorporated places in this shapefile are as of January 1, 2021, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The boundaries of all CDPs were delineated as part of the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census.
Job Offers Web Scraping Search
kaggle.com
Updated Feb 11, 2023
Cite
The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 11, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Job Offers Web Scraping Search

Targeted Results to Find the Optimal Work Solution

By [source]

About this dataset

This dataset collects job offers obtained by web scraping and filtered by specific keywords, locations and times. It gives users precise search capabilities for finding the working arrangement that suits them best. With the information collected, users can explore options that match their personal situation, skill set and preferences in terms of location and schedule. The columns provide detailed information on job titles, employer names, locations and time frames, as well as other parameters needed to make an informed choice about your next career opportunity.

How to use the dataset

This dataset is a great resource for those looking to find an optimal work solution based on keywords, location and time parameters. With this information, users can quickly and easily search through job offers that best fit their needs. Here are some tips on how to use this dataset to its fullest potential:

Start by identifying what type of job offer you want to find. The keyword column will help you narrow down your search by letting you look for job postings that contain the word or phrase you have in mind.

Next, consider where the job is located. The Location column tells you where in the world each posting is from, so make sure it is somewhere that suits your needs.

Finally, consider when the position is available. The Time frame column indicates when each posting was made and whether it is a full-time or part-time role, or a casual/temporary position from day one, so check that it meets your requirements before applying.

Additionally, if details such as hours per week or further schedule information are important criteria, the Horari and Temps_Oferta columns provide them. Once all three criteria (keywords, location and time frame) have been checked, look at the Empresa (Company Name) and Nom_Oferta (Post Name) columns to get an idea of who would be employing you should you land the gig.

Together, these pieces of data should give any motivated individual what they need to seek out an optimal work solution. Keep hunting, and good luck!
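
The filtering steps above can be sketched in Python with only the standard library. This is a minimal illustration, not the dataset author's method: the column names (Nom_Oferta, Empresa, Ubicació, Temps_Oferta, Horari) follow the dataset's column listing, while the sample rows and the `filter_offers` helper are hypothetical and stand in for the real file.

```python
import csv
import io

# Hypothetical sample rows (assumption); the real data would be read from
# web_scraping_information_offers.csv instead.
SAMPLE_CSV = """Nom_Oferta,Empresa,Ubicació,Temps_Oferta,Horari
Data Analyst,Acme SL,Barcelona,Fa 2 dies,Jornada completa
Cambrer/a,Bar Exemple,Girona,Fa 5 dies,Jornada parcial
Junior Data Engineer,Dades SA,Barcelona,Fa 1 dia,Jornada completa
"""


def filter_offers(rows, keyword=None, location=None, schedule=None):
    """Keep rows that match every criterion given (case-insensitive substring match)."""
    def matches(value, needle):
        # A criterion left as None is treated as "no restriction".
        return needle is None or needle.lower() in value.lower()

    return [
        row for row in rows
        if matches(row["Nom_Oferta"], keyword)
        and matches(row["Ubicació"], location)
        and matches(row["Horari"], schedule)
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = filter_offers(rows, keyword="data", location="Barcelona")
for offer in hits:
    print(offer["Nom_Oferta"], "|", offer["Empresa"])
```

To run this against the downloaded file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened file handle; the file's encoding is not stated in the description, so it would need to be checked first.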

Research Ideas

Machine learning can be used to group job offers, making it easier to identify similarities and differences between them. This could allow users to target their search for a work solution more specifically.

The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better-informed decisions about their career options and goals.

It may also provide insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.
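
As one possible starting point for comparing offers across areas, the sketch below counts offers per location. It assumes the Ubicació column from the dataset's column listing; the sample rows and the `offers_per_location` helper are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows (assumption), shaped like the dataset's columns.
SAMPLE_CSV = """Nom_Oferta,Empresa,Ubicació,Temps_Oferta,Horari
Data Analyst,Acme SL,Barcelona,Fa 2 dies,Jornada completa
Cambrer/a,Bar Exemple,Girona,Fa 5 dies,Jornada parcial
Junior Data Engineer,Dades SA,Barcelona,Fa 1 dia,Jornada completa
"""


def offers_per_location(rows):
    """Count how many job offers are listed for each location."""
    return Counter(row["Ubicació"].strip() for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = offers_per_location(rows)
for location, n in counts.most_common():
    print(f"{location}: {n}")
```

The same pattern applies to Empresa or Horari if the comparison of interest is by employer or by schedule rather than by area.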

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: web_scraping_information_offers.csv

| Column name  | Description                         |
|:-------------|:------------------------------------|
| Nom_Oferta   | Name of the job offer. (String)     |
| Empresa      | Company offering the job. (String)  |
| Ubicació     | Location of the job offer. (String) |
| Temps_Oferta | Time of the job offer. (String)     |
| Horari       | Schedule of the job offer. (String) |

HEATH Habitat Network - Dataset - Open Data NI
admin.opendatani.gov.uk
Updated Nov 25, 2024
+ more versions
Cite
(2024). HEATH Habitat Network - Dataset - Open Data NI [Dataset]. https://admin.opendatani.gov.uk/dataset/heath-habitat-network
Explore at:
Dataset updated
Nov 25, 2024
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
In 2020, with generous funding from the National Lottery Heritage Fund, Ulster Wildlife, National Trust NI, RSPB NI and Woodland Trust NI came together to start building capacity to deliver Nature Recovery Networks in Northern Ireland. As part of the project, habitat network maps were produced for all terrestrial and intertidal priority habitats, based on the Natural England (Edwards et al., 2020) methodology. The habitat networks comprise vector datasets that map areas of land into different network categories, based on how favourable the land is for restoration to, or creation of, the priority habitat, and on how effective actions in each area would be at enhancing connectivity of the priority habitat, based on proximity to existing habitat patches. A description of these network categories is provided in Table 1 in the methodology report, available at https://www.ulsterwildlife.org/sites/default/files/2022-10/EnvSys%20NI%20NRN%20mapping%20report.pdf. The habitat network maps do not represent a fully comprehensive depiction of land cover, nor do they provide specific land management options; they therefore do not replace the need for on-site ecological surveys/appraisals. The maps are intended to function as a decision-support tool alongside other pieces of information, both from on-site surveys and from other data sources.
About Norwegian Agriculture
kaggle.com
Updated Jul 26, 2024
Cite
Olena Bugaiova (2024). About Norwegian Agriculture [Dataset]. http://doi.org/10.34740/kaggle/dsv/9037685
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/9037685
Dataset updated
Jul 26, 2024
Dataset provided by
Kaggle
Authors
Olena Bugaiova
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Area covered
Norway
Description
Context

The cleaned text data can be used to adapt an LLM to the domain of Norwegian agriculture in the Norwegian language. In addition, it can be valuable for various NLP tasks, such as region classification, or for analytical tasks, such as exploring common agricultural practices in Norway.

Content

This dataset focuses on agronomic management practices and production in Norway. It consists of 2292 articles in Norwegian. All data is derived from three Norwegian agricultural-related websites and includes data from the largest advisory service for the agricultural sector, Norsk landbruksrådgivning (Norwegian Agricultural Extension Service, NLR), the most prominent agricultural research institute in Norway, Norsk Institutt for Bioøkonomi (Norwegian Institute for Bioeconomy, NIBIO), and the most comprehensive web page dedicated to plant protection in agriculture, Plantevernleksikonet.

Inspiration

The emergence of LLMs marked a significant step forward, providing a single solution for generating human-like text. However, training an LLM requires substantial amounts of text data, which is not readily available for most natural languages, including Norwegian. Agriculture as an industry has also not seen much penetration of AI. What if we could provide location-specific insights to a farmer?

Acknowledgements

The data from NLR can be expanded in the future, gathering more text data.
The internet and everyday rights in Russia - Dataset - B2FIND
b2find.eudat.eu
Updated Jul 17, 2010
Cite
(2010). The internet and everyday rights in Russia - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/68c7603b-0a69-5e8d-972d-e18b6030d627
Explore at:
Dataset updated
Jul 17, 2010
Area covered
Russia
Description
This two-year project analyses whether the internet can champion the causes of citizens in non-democratic states. While there is much speculation that the internet can provide critical social capital when there is a democratic deficit, there is relatively little empirical work on the interplay between online and off-line social protest and action. This project will study the role of the internet in political life in Russia through an analysis of how people seek to fulfil their 'everyday' human rights in gaining access to social services such as pensions and health care. The study uses five central elements to examine the role of the internet in these efforts: content, community, catalyst, control and co-optation. The project will analyse internet content against a background of key factors, including the nature and behaviour of online users (community); how internet activity is sparked by real-world events such as protests or funding cuts (catalysts); how the government attempts to regulate the internet (control); and, more pessimistically, how political elites may attempt to hijack the influence of populist bloggers or websites once they have become influential (co-optation).
Terrestrial Ecosystem Research Network Monitoring Sites - ARC
data.gov.au
researchdata.edu.au
+1 more
zip
Updated Apr 13, 2022
+ more versions
Cite
Bioregional Assessment Program (2022). Terrestrial Ecosystem Research Network Monitoring Sites - ARC [Dataset]. https://data.gov.au/data/dataset/c826109d-add0-45d4-9a77-39f672b48980
Explore at:
Available download formats: zip
Dataset updated
Apr 13, 2022
Dataset authored and provided by
Bioregional Assessment Program
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Abstract

This dataset and its metadata statement were supplied to the Bioregional Assessment Programme by a third party and are presented here as originally supplied.

Location of Terrestrial Ecosystem Research Network Monitoring Sites. TERN aims to connect and enable ecosystem scientists to collect, contribute, store, share and integrate data across disciplines.

Purpose

To locate Terrestrial Ecosystem Research Network Monitoring Sites.

Dataset History

TERN AusPlots sites downloaded from the AEKOS data portal 15/11/13. Citation: TERN AusPlots (2015). Terrestrial Ecosystem Research Network AusPlots - Ausplots Rangelands Survey Program (biodiversity mapping supplement/subset). Adelaide, South Australia. AEKOS - TERN Ecoinformatics. DOI: 10.4227/05/54C1B45A4CF2F, see http://portal.aekos.org.au/dataset/172373

Dataset Citation

SA Department of Environment, Water and Natural Resources (2015) Terrestrial Ecosystem Research Network Monitoring Sites - ARC. Bioregional Assessment Source Dataset. Viewed 26 May 2016, http://data.bioregionalassessments.gov.au/dataset/c826109d-add0-45d4-9a77-39f672b48980.
Attitudes towards the internet in Mexico 2025
statista.com
Updated Apr 11, 2025
Cite
Umair Bashir (2025). Attitudes towards the internet in Mexico 2025 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Umair Bashir
Description
When asked about "Attitudes towards the internet", most Mexican respondents pick "It is important to me to have mobile internet access in any place" as an answer. 56 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.

Statista Research Department (2025). Number of internet users worldwide 2014-2029 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/

Number of internet users worldwide 2014-2029

Explore at:

287 scholarly articles cite this dataset (View in Google Scholar)

