79 datasets found
  1. Number of internet users worldwide 2014-2029

    • statista.com
    Updated Apr 11, 2025
    Cite
    Statista Research Department (2025). Number of internet users worldwide 2014-2029 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    World
    Description

    The global number of internet users was forecast to increase continuously between 2024 and 2029 by a total of 1.3 billion users (+23.66 percent). After the fifteenth consecutive year of growth, the number of users is estimated to reach 7 billion and therefore a new peak in 2029. Notably, the number of internet users has increased continuously over the past years.

    Depicted is the estimated number of individuals in the country or region at hand that use the internet. As the data source clarifies, connection quality and usage frequency are distinct aspects not taken into account here.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of internet users in regions such as the Americas and Asia.

  2. Network Traffic Dataset

    • kaggle.com
    Updated Oct 31, 2023
    Cite
    Ravikumar Gattu (2023). Network Traffic Dataset [Dataset]. https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ravikumar Gattu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The data presented here were obtained on a Kali machine at the University of Cincinnati, Cincinnati, Ohio, by carrying out packet captures with Wireshark for 1 hour during the evening of Oct 9th, 2023. The dataset consists of 394137 instances stored in a CSV (Comma Separated Values) file. This large dataset could be used for different machine learning applications, for instance classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection and anomaly detection.

    The dataset can be used for a variety of machine learning tasks, such as network intrusion detection, traffic classification, and anomaly detection.

    Content :

    This network traffic dataset consists of 7 features. Each instance contains the source and destination IP addresses. The majority of the properties are numeric; however, there are also nominal and date types because of the Timestamp.

    The network traffic flow statistics (No. Time Source Destination Protocol Length Info) were obtained using Wireshark (https://www.wireshark.org/).

    Dataset Columns:

    • No: Number of the instance
    • Timestamp: Timestamp of the network traffic instance
    • Source IP: IP address of the source
    • Destination IP: IP address of the destination
    • Protocol: Protocol used by the instance
    • Length: Length of the instance
    • Info: Information about the traffic instance
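
A minimal sketch of how a CSV export with these seven columns might be summarised per protocol. It assumes the header names follow Wireshark's default CSV export (`No.`, `Time`, `Source`, `Destination`, `Protocol`, `Length`, `Info`); the header names in the actual Kaggle file may differ, and the sample rows and the helper name `summarise_by_protocol` are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows in the assumed Wireshark CSV layout (not real data).
SAMPLE_CSV = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000000","192.0.2.10","198.51.100.7","TCP","66","Example segment"
"2","0.000120","198.51.100.7","192.0.2.10","TCP","54","Example ack"
"3","0.004310","192.0.2.10","203.0.113.5","DNS","74","Standard query"
'''

def summarise_by_protocol(csv_file):
    """Return {protocol: (packet count, total bytes)} for a Wireshark-style CSV."""
    packets = Counter()
    total_bytes = Counter()
    for row in csv.DictReader(csv_file):
        protocol = row["Protocol"]
        packets[protocol] += 1
        total_bytes[protocol] += int(row["Length"])
    return {p: (packets[p], total_bytes[p]) for p in packets}

summary = summarise_by_protocol(io.StringIO(SAMPLE_CSV))
for protocol, (count, size) in sorted(summary.items()):
    print(f"{protocol}: {count} packets, {size} bytes")
```

To run this against the real file, pass an open file handle instead of the in-memory sample and adjust the column names to match its header row.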

    Acknowledgements :

    I would like to thank the University of Cincinnati for providing the infrastructure for generating the network traffic dataset.

    Ravikumar Gattu , Susmitha Choppadandi

    Inspiration: This dataset goes beyond the majority of network traffic classification datasets, which only identify the type of application (WWW, DNS, ICMP, ARP, RARP) that an IP flow contains. Instead, it is intended for building machine learning models that can identify specific applications (like TikTok, Wikipedia, Instagram, YouTube, websites, blogs etc.) from IP flow statistics (there are currently 25 applications in total).

    Dataset License: CC0: Public Domain

    Dataset Usages: This dataset can be used for different machine learning applications in the field of cybersecurity, such as classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection and anomaly detection.

    ML techniques that benefit from this dataset:

    This dataset is useful because it consists of 394137 instances of network traffic data obtained by using the 25 applications on public, private and enterprise networks. The dataset also contains features that can be used for most applications of machine learning in cybersecurity. A few of the potential machine learning applications that could benefit from this dataset are:

    1. Network Performance Monitoring: This large network traffic dataset can be used to analyse network traffic and identify patterns in the network. This helps in designing network security algorithms that minimise network problems.

    2. Anomaly Detection: A large network traffic dataset can be used to train machine learning models to find irregularities in the traffic, which could help identify cyber attacks.

    3. Network Intrusion Detection: This large dataset could be used to train machine learning algorithms and design models for detecting traffic issues, malicious traffic, network attacks and DoS attacks.

  3. Attitudes towards the internet in China 2025

    • statista.com
    Updated Apr 11, 2025
    Cite
    Umair Bashir (2025). Attitudes towards the internet in China 2025 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Umair Bashir
    Description

    When asked about "Attitudes towards the internet", most Chinese respondents pick "It is important to me to have mobile internet access in any place" as an answer. 50 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.

  4. Custom dataset from any website on the Internet

    • datarade.ai
    Updated Sep 21, 2022
    Cite
    ScrapeLabs (2022). Custom dataset from any website on the Internet [Dataset]. https://datarade.ai/data-products/custom-dataset-from-any-website-on-the-internet-scrapelabs
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    ScrapeLabs
    Area covered
    Jordan, Kazakhstan, Bulgaria, India, Turks and Caicos Islands, Tunisia, Guinea-Bissau, Argentina, Aruba, Lebanon
    Description

    We'll extract any data from any website on the Internet. You don't have to worry about buying and maintaining complex and expensive software, or hiring developers.

    Some common use cases our customers use the data for:
    • Data Analysis
    • Market Research
    • Price Monitoring
    • Sales Leads
    • Competitor Analysis
    • Recruitment

    We can get data from websites with pagination or scroll, with captchas, and even from behind logins. Text, images, videos, documents.

    Receive data in any format you need: Excel, CSV, JSON, or any other.

  5. Web Graphs

    • kaggle.com
    zip
    Updated Nov 11, 2021
    Cite
    Subhajit Sahu (2021). Web Graphs [Dataset]. https://www.kaggle.com/wolfram77/graphs-web
    Available download formats: zip (52848952 bytes)
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dynamic face-to-face interaction networks represent the interactions that happen during discussions between a group of participants playing the Resistance game. This dataset contains networks extracted from 62 games. Each game is played by 5-8 participants and lasts between 45 and 60 minutes. We extract dynamically evolving networks from the free-form discussions using the ICAF algorithm. The extracted networks are used to characterize and detect group deceptive behavior using the DeceptionRank algorithm.

    The networks are weighted, directed and temporal. Each node represents a participant. At each 1/3 second, a directed edge from node u to v is weighted by the probability of participant u looking at participant v or the laptop. Additionally, we also provide a binary version where an edge from u to v indicates participant u looks at participant v (or the laptop).

    Stanford Network Analysis Platform (SNAP) is a general purpose, high performance system for analysis and manipulation of large networks. Graphs consist of nodes and directed/undirected/multiple edges between the graph nodes. Networks are graphs with data on nodes and/or edges of the network.

    The core SNAP library is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. Besides scalability to large graphs, an additional strength of SNAP is that nodes, edges and attributes in a graph or a network can be changed dynamically during the computation.

    SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in Nov, 2009. SNAP uses a general purpose STL (Standard Template Library)-like library GLib developed at Jozef Stefan Institute. SNAP and GLib are being actively developed and used in numerous academic and industrial projects.

    http://snap.stanford.edu/data/index.html#face2face

  6. Long-Term Agricultural Research (LTAR) network - Meteorological Collection

    • catalog.data.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +1more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Long-Term Agricultural Research (LTAR) network - Meteorological Collection [Dataset]. https://catalog.data.gov/dataset/long-term-agricultural-research-ltar-network-meteorological-collection-7d719
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The LTAR network maintains stations for standard meteorological measurements including, generally, air temperature and humidity, shortwave (solar) irradiance, longwave (thermal) radiation, wind speed and direction, barometric pressure, and precipitation. Many sites also have extensive comparable legacy datasets. The LTAR scientific community decided that these needed to be made available to the public using a single web source in a consistent manner. To that purpose, each site sent data on a regular schedule, as frequently as hourly, to the National Agricultural Library, which has developed a web service to provide the data to the public in tabular or graphical form. This archive of the LTAR legacy database exports contains meteorological data through April 30, 2021. For current meteorological data, visit the GeoEvent Meteorology Resources page, which provides tools and dashboards to view and access data from the 18 LTAR sites across the United States.

    Resources in this dataset:

    Resource Title: Meteorological data.
    File Name: ltar_archive_DB.zip
    Resource Description: This is an export of the meteorological data collected by LTAR sites and ingested by the NAL LTAR application. This export consists of an SQL schema definition file for creating database tables and the data itself. The data is provided in two formats: SQL insert statements (.sql) and CSV files (.csv). Please use the format most convenient for you. Note that the SQL insert statements take much longer to run since each row is an individual insert.

    Description of zip files

    The ltararchive*.zip files contain database exports. The schema is a .sql file; the data is exported as both SQL inserts and CSV for convenience. There is a README in markdown and PDF in the zips.

    Contains the database export of the schema and data for the site, site_station, and met tables as SQL insert statements:
    • ltar_archive_db_sql_export_20201231.zip --> has data until 2020-12-31
    • ltar_archive_db_sql_export_20210430.zip --> has data until 2021-04-30

    Contains the database export of the schema and data for the site, site_station, and met tables as CSV:
    • ltar_archive_db_csv_export_20201231.zip --> has data until 2020-12-31
    • ltar_archive_db_csv_export_20210430.zip --> has data until 2021-04-30

    Contains the raw CSV files that were sent to NAL from the LTAR sites/stations:
    • ltar_rawcsv_archive.zip --> has data until 2021-04-30

  7. FCC477 Virginia Broadband Dataset

    • data.virginia.gov
    zip
    Updated Feb 3, 2024
    Cite
    Other (2024). FCC477 Virginia Broadband Dataset [Dataset]. https://data.virginia.gov/dataset/fcc477-virginia-broadband-dataset
    Available download formats: zip (251278001)
    Dataset authored and provided by
    Other
    Description

    This data represents the Virginia State Broadband Data area of broadband availability. Area of broadband availability refers to those individual US Census Blocks where each facilities-based provider of broadband service claims to provide broadband services. For each area, the provider name and the technology it provides are represented. In addition, advertised and typical upload and download speeds are often reported at these levels. If a provider offers availability to any location within a census block, the entire block is deemed available under this effort. For this purpose, "broadband service" is the provision, on either a commercial or non-commercial basis, of data transmission technology that provides two-way data transmission to and from the Internet with advertised speeds of at least 4 megabits per second (Mbps) downstream and 0.5 Mbps upstream to end users. For this purpose, an "end user" of broadband service is a residential or business party, institution or State or local government entity that may use broadband service for its own purposes. An entity is a "facilities-based" provider of broadband service connections to end user locations if any of the following conditions are met: (1) it owns the portion of the physical facility that terminates at the end user location; (2) it obtains unbundled network elements (UNEs), special access lines, or other leased facilities that terminate at the end user location and provisions/equips them as broadband; or (3) it provisions/equips a broadband wireless channel to the end user location over licensed or unlicensed spectrum.

    • VABB_CABLE: Cable Wireline Coverage (June 2019)
    • VABB_DSL_COPPER: DSL/Copper Coverage (June 2019)
    • VABB_FIBER: Fiber Optic Coverage (June 2019)
    • VABB_FIXED: Fixed Wireless Coverage (June 2019)
    • VABB_MOBILE: Mobile Wireless Coverage (Dec. 2018)
    • VABB_LTE: 4G/LTE Wireless Coverage (Dec. 2018)
    • VABB_satellite: Satellite Coverage (Dec. 2018)
    • VABB_VATI: Virginia Telecommunication Initiative (VATI) Funding
    • VABB_TRRC: Tobacco Region Revitalization Commission (TRRC) Funding
    • VABB_UNDERSERVED: Underserved Areas greater than 10 Mbps download and 1 Mbps upload and less than 25 Mbps download and 3 Mbps upload
    • VABB_Unserved: Unserved Areas below or equal to 10 Mbps download and 1 Mbps upload
    • VABB_Lacking: No Residential Broadband (25/3) reported (June 2019)
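
The speed tiers named in the layer list can be expressed as a small classifier. This is only a sketch: the function name `speed_tier`, the tier labels, and the handling of mixed cases (for example a fast download paired with a slow upload) are assumptions, since the description does not say how such blocks are assigned.

```python
def speed_tier(download_mbps: float, upload_mbps: float) -> str:
    """Classify advertised speeds into the tiers named in the layer list.

    Assumption: a block reaches a tier only if BOTH its download and its
    upload speed clear that tier's thresholds; the source does not say so.
    """
    if download_mbps >= 25 and upload_mbps >= 3:
        return "served"        # 25/3 or better
    if download_mbps > 10 and upload_mbps > 1:
        return "underserved"   # above 10/1 but below 25/3
    return "unserved"          # at or below 10/1


for speeds in [(10, 1), (20, 2), (25, 3), (100, 0.5)]:
    print(speeds, "->", speed_tier(*speeds))
```

Under the stated assumption, a block advertising 100 Mbps down but only 0.5 Mbps up is treated as unserved; a different reading of the thresholds would change that result.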

  8. Adverse effects of using the Internet and social networking websites or apps...

    • open.canada.ca
    • www150.statcan.gc.ca
    • +1more
    csv, html, xml
    Updated Jan 17, 2023
    Cite
    Statistics Canada (2023). Adverse effects of using the Internet and social networking websites or apps by gender and age group, inactive [Dataset]. https://open.canada.ca/data/en/dataset/80c88ac9-8ea1-4ff7-856e-560f7683d660
    Available download formats: html, xml, csv
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    Percentage of Internet users who have experienced selected personal effects in their life because of the Internet and the use of social networking websites or apps, during the past 12 months.

  9. Tudor Networks of Power - correspondence network dataset

    • repository.cam.ac.uk
    application/gzip, txt
    Updated Oct 4, 2023
    Cite
    Ahnert, Ruth; Ahnert, Sebastian; Cree, Jose; Fikkers, Lotte (2023). Tudor Networks of Power - correspondence network dataset [Dataset]. http://doi.org/10.17863/CAM.99562.2
    Available download formats: txt (2449 bytes), application/gzip (2172391 bytes)
    Dataset provided by
    University of Cambridge
    Apollo
    Authors
    Ahnert, Ruth; Ahnert, Sebastian; Cree, Jose; Fikkers, Lotte
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Tudor Networks of Power - Correspondence Network Dataset

    Ruth Ahnert, Sebastian E. Ahnert, Jose Cree, and Lotte Fikkers

    © 2023. This work is licensed under a CC BY-NC-SA 4.0 license. If using this dataset, please cite:

    • R. Ahnert, S. E. Ahnert, "Tudor Networks of Power", Oxford University Press, 2023.

    • R. Ahnert, S. E. Ahnert, J. Cree, & L. Fikkers, "Tudor Networks of Power - correspondence network dataset". Apollo - University of Cambridge Repository (2023). https://doi.org/10.17863/CAM.99562

    The data is released under a Creative Commons BY-NC-SA 4.0 license, which:
    - requires attribution
    - permits distribution, remixing, adaptation, or building upon this data as long as the modified material is licensed under identical terms
    - only permits non-commercial uses of the work

    This data contains a temporal, directed edgelist representing (to the best of our knowledge) all items of correspondence in the Tudor State Papers (1509-1603), which are the official government records of the Tudor period in England. The data covers State Papers Domestic and Foreign.

    The dataset was created by first extracting the relevant XML metadata of the State Papers Online resource developed by Gale Cengage. We would like to acknowledge the help and support that Gale Cengage provided for our research. The XML metadata closely corresponds to the State Papers Calendars of the 19th century. These contain many ambiguities regarding the identities of people and places, resulting in an extensive effort on our part to disambiguate and de-duplicate person identities and places of writing. The details of this process can be found in our book (see citation above).

    The dataset contains:

    • 'letter_edgelist.tsv' - Directed temporal edge list of letters
    • 'people_labels.tsv' - Key for the person IDs used in letter_edgelist.tsv
    • 'place_labels.tsv' - Key for the place IDs used in letter_edgelist.tsv
    • 'people_metadata.tsv' - Additional metadata and URIs for a subset of people
    • 'places_metadata.tsv' - Geolocations and metadata for a large subset of places

    Both the code and more extensive datasets that give context to the data curation process, the network analysis methods, and quantitative results in the book can be found at https://github.com/tudor-networks-of-power/code.
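
A minimal sketch of working with a directed temporal edge list like the one described above. The column layout (sender ID, recipient ID, ISO date), the sample rows, and the helper name `letters_sent` are all hypothetical; the real layout of `letter_edgelist.tsv` should be checked against the files in the archive.

```python
import csv
import io
from collections import Counter

# Hypothetical rows: sender_id <TAB> recipient_id <TAB> ISO date (not real data).
SAMPLE_TSV = "1\t2\t1530-05-01\n1\t3\t1531-02-14\n2\t1\t1531-03-02\n"

def letters_sent(tsv_file, start_year, end_year):
    """Count letters sent per person ID within [start_year, end_year]."""
    sent = Counter()
    for sender, _recipient, date in csv.reader(tsv_file, delimiter="\t"):
        year = int(date[:4])
        if start_year <= year <= end_year:
            sent[sender] += 1
    return sent

counts = letters_sent(io.StringIO(SAMPLE_TSV), 1531, 1603)
print(counts.most_common())
```

Counting received letters works the same way with the recipient column, and the person IDs could then be resolved through a label file such as `people_labels.tsv`.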

  10. Number of global social network users 2017-2028

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  11. Phishing Websites Detection

    • kaggle.com
    zip
    Updated May 28, 2020
    Cite
    J Akshaya (2020). Phishing Websites Detection [Dataset]. https://www.kaggle.com/akshaya1508/phishing-websites-detection
    Available download formats: zip (80950 bytes)
    Authors
    J Akshaya
    Description

    Context

    Phishing is a form of identity theft that occurs when a malicious website impersonates a legitimate one in order to acquire sensitive information such as passwords, account details, or credit card numbers. People generally tend to fall prey to this very easily. Kudos to the commendable craftsmanship of the attackers, which makes people believe that it is a legitimate website. There is a need to identify potential phishing websites and differentiate them from legitimate ones. This dataset identifies the prominent features of phishing websites; 10 such features have been identified.

    Content

    Generally, the open source datasets available on the internet do not come with the code and the logic, which raises certain problems:

    1. Limited Data: The ML algorithms can only be tested with the existing phishing URLs, and no new phishing URLs can be checked for validity.
    2. Outdated URLs: The datasets available on the internet were uploaded a long time ago, while new kinds of phishing URLs arise every second.
    3. Outdated Features: The datasets available on the internet were uploaded a long time ago, while new methodologies keep arising in phishing techniques.
    4. No Access to Backend: There is no stepwise guide describing how the features were derived.

    In contrast, we try to overcome all of the above-mentioned problems.

    1. Real Time Data: Before applying a machine learning algorithm, we can run the script and fetch real-time URLs from Phishtank (for phishing URLs) and from moz (for legitimate URLs).
    2. Scalable Data: We can also specify the number of URLs we want to feed the model, and the web scraper will fetch that amount of data from the websites. Presently we are using 1401 URLs in this project, i.e. 901 phishing URLs and 500 legitimate URLs.
    3. New Features: We have tried to implement the prominent new features present in current phishing URLs, and since we own the code, new features can also be added.
    4. Source code on GitHub: The source code is published on GitHub for public use and can be used for further improvements. This way there will be transparency in the logic, and more creators can add their meaningful additions to the code.

    Link to the source code

    https://github.com/akshaya1508/detection_of_phishing_websites.git

    Inspiration

    The idea to develop the dataset and its code was inspired by various other creators who have worked along similar lines.

  12. Places

    • catalog.data.gov
    Updated Jul 17, 2025
    Cite
    United States Census Bureau (USCB) (Point of Contact) (2025). Places [Dataset]. https://catalog.data.gov/dataset/places2
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Description

    The Places dataset was published on August 31, 2022 from the United States Census Bureau (USCB) and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). This resource is a member of a series. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The TIGER/Line shapefiles include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. 
CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. The boundaries of most incorporated places in this shapefile are as of January 1, 2022, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The boundaries of all CDPs were delineated as part of the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census, but some CDPs were added or updated through the 2022 BAS as well. A data dictionary, or other source of attribute information, is accessible at https://doi.org/10.21949/1529072

  13. Available Wireless Sensor Network and Internet of Things testbed facilities:...

    • data.europa.eu
    unknown
    Updated Oct 7, 2022
    Cite
    Zenodo (2022). Available Wireless Sensor Network and Internet of Things testbed facilities: dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7157221?locale=cs
    Available download formats: unknown (2365963)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this data set, we present data collected for the purpose of carrying out a systematic review of the available Wireless Sensor Network and Internet of Things testbed facilities. The data was collected through multiple stages and in each stage the pre-defined criteria were applied. We provide a dataset describing the hardware and software aspects of Wireless Sensor Network and Internet of Things testbed facilities available in the market and scientific community. The data were gathered through an extensive systematic review process of scientific articles published between the years 2011 and 2021. The review aims to obtain good quality data for people who are actively researching the Internet of Things facilities or anyone who is interested in that field.

  14. TIGER/Line Shapefile, 2021, State, Louisiana, Places

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Nov 1, 2022
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Publisher) (2022). TIGER/Line Shapefile, 2021, State, Louisiana, Places [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2021-state-louisiana-places
    Explore at:
    Dataset updated
    Nov 1, 2022
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    United States Department of Commerce (http://commerce.gov/)
    Area covered
    Louisiana
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or the shapefiles can be combined to cover the entire nation. The TIGER/Line shapefiles include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people, as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population.
The boundaries of most incorporated places in this shapefile are as of January 1, 2021, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The boundaries of all CDPs were delineated as part of the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census.

  15. Job Offers Web Scraping Search

    • kaggle.com
    Updated Feb 11, 2023
    Cite
    The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search
    Explore at:
    Dataset updated
    Feb 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Targeted Results to Find the Optimal Work Solution

    About this dataset

    This dataset collects job offers obtained by web scraping and filtered according to specific keywords, locations and times. The data gives users rich and precise search capabilities to uncover the best working solution for them. With the information collected, users can explore options that match their personal situation, skill set and preferences in terms of location and schedule. The columns provide detailed information on job titles, employer names, locations, time frames and other necessary parameters so you can make a smart choice for your next career opportunity.

    How to use the dataset

    This dataset is a great resource for those looking to find an optimal work solution based on keywords, location and time parameters. With this information, users can quickly and easily search through job offers that best fit their needs. Here are some tips on how to use this dataset to its fullest potential:

    • Start by identifying what type of job offer you want to find. The keyword column will help you narrow down your search by allowing you to search for job postings that contain the word or phrase you are looking for.

    • Next, consider where the job is located – the Location column tells you where in the world each posting is from so make sure it’s somewhere that suits your needs!

    • Finally, consider when the position is available. The Time frame column indicates when each posting was made and whether the role is full-time, part-time, or casual/temporary, so check that it meets your requirements before applying!

    • Additionally, if hours per week or further schedule details are important criteria, the Horari and Temps Oferta columns provide that information. Once all three criteria (keywords, location and time frame) have been ticked off, take a look at the Empresa (Company Name) and Nom_Oferta (Post Name) columns to get an idea of who will be employing you should you land the gig!

      All these pieces of data put together should give any motivated individual all they need to seek out an optimal work solution. Keep hunting, and good luck!

    Research Ideas

    • Machine learning can be used to group job offers in order to identify similarities and differences between them. This could allow users to target their search for a work solution more specifically.
    • The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better-informed decisions about their career options and goals.
    • It may also provide insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: web_scraping_information_offers.csv

    | Column name  | Description                         |
    |:-------------|:------------------------------------|
    | Nom_Oferta   | Name of the job offer. (String)     |
    | Empresa      | Company offering the job. (String)  |
    | Ubicació     | Location of the job offer. (String) |
    | Temps_Oferta | Time of the job offer. (String)     |
    | Horari       | Schedule of the job offer. (String) |
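
    The search tips above can be sketched with Python's standard csv module. This is a minimal illustration, not the dataset author's method: the helper name filter_offers and the sample rows are hypothetical, and it assumes that the offer title (Nom_Oferta) is the field searched for keywords and that Ubicació holds the location.

```python
import csv
import io


def filter_offers(rows, keyword=None, location=None):
    """Keep rows whose Nom_Oferta contains `keyword` and whose Ubicació
    contains `location` (both case-insensitive; None means no filter)."""
    matches = []
    for row in rows:
        title = (row.get("Nom_Oferta") or "").lower()
        place = (row.get("Ubicació") or "").lower()
        if keyword and keyword.lower() not in title:
            continue
        if location and location.lower() not in place:
            continue
        matches.append(row)
    return matches


# Hypothetical sample rows standing in for web_scraping_information_offers.csv;
# only the column names come from the dataset description.
SAMPLE_CSV = """Nom_Oferta,Empresa,Ubicació,Temps_Oferta,Horari
Data Analyst,Acme SL,Barcelona,Fa 2 dies,Jornada completa
Cambrer,Bar Example,Girona,Fa 5 dies,Jornada parcial
Junior Data Engineer,Example Corp,Barcelona,Fa 1 dia,Jornada completa
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = filter_offers(rows, keyword="data", location="barcelona")
for offer in hits:
    print(offer["Nom_Oferta"], "|", offer["Empresa"], "|", offer["Horari"])
```

    To run this against the real file, the in-memory sample could be replaced with open("web_scraping_information_offers.csv", newline="", encoding="utf-8"), assuming the file is UTF-8 encoded and comma-separated.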

  16. HEATH Habitat Network - Dataset - Open Data NI

    • admin.opendatani.gov.uk
    Updated Nov 25, 2024
    + more versions
    Cite
    (2024). HEATH Habitat Network - Dataset - Open Data NI [Dataset]. https://admin.opendatani.gov.uk/dataset/heath-habitat-network
    Explore at:
    Dataset updated
    Nov 25, 2024
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    In 2020, with generous funding from the National Lottery Heritage Fund, Ulster Wildlife, National Trust NI, RSPB NI and Woodland Trust NI came together to start building capacity to deliver Nature Recovery Networks in Northern Ireland. As part of the project, habitat network maps were produced for all terrestrial and intertidal priority habitats, based on the Natural England (Edwards et al., 2020) methodology. The habitat networks comprise vector datasets that map areas of land into different network categories, based on how favourable the land is for restoration to, or creation of, the priority habitat, and how effective actions in each area would be at enhancing connectivity of the priority habitat, based on proximity to existing habitat patches. A description of these network categories is provided in Table 1 in the methodology report, available at https://www.ulsterwildlife.org/sites/default/files/2022-10/EnvSys%20NI%20NRN%20mapping%20report.pdf. The habitat network maps do not represent a fully comprehensive depiction of land cover, nor do they provide specific land management options, and therefore they do not replace the need for on-site ecological surveys/appraisals. The maps are intended to function as a decision-support tool alongside other pieces of information, both from on-site surveys and data from other sources.

  17. About Norwegian Agriculture

    • kaggle.com
    Updated Jul 26, 2024
    Cite
    Olena Bugaiova (2024). About Norwegian Agriculture [Dataset]. http://doi.org/10.34740/kaggle/dsv/9037685
    Explore at:
    Dataset updated
    Jul 26, 2024
    Dataset provided by
    Kaggle
    Authors
    Olena Bugaiova
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Norway
    Description

    Context

    The cleaned text data can be used to adapt an LLM to the domain of Norwegian agriculture in the Norwegian language. In addition, it can be valuable for various NLP tasks, such as region classification, or analytical tasks, such as exploring common agricultural practices in Norway.

    Content

    This dataset focuses on agronomic management practices and production in Norway. It consists of 2292 articles in Norwegian. All data is derived from three Norwegian agricultural-related websites and includes data from the largest advisory service for the agricultural sector, Norsk landbruksrådgivning (Norwegian Agricultural Extension Service, NLR), the most prominent agricultural research institute in Norway, Norsk Institutt for Bioøkonomi (Norwegian Institute for Bioeconomy, NIBIO), and the most comprehensive web page dedicated to plant protection in agriculture, Plantevernleksikonet.

    Inspiration

    The emergence of LLMs marked a significant step forward, providing a single solution for generating human-like text. However, training an LLM requires substantial amounts of text data, which is not readily available for most natural languages, including Norwegian. Agriculture as an industry has also not seen much penetration of AI. What if we could provide location-specific insights to a farmer?

    Acknowledgements

    The data from NLR can be expanded in the future, gathering more text data.

  18. The internet and everyday rights in Russia - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Jul 17, 2010
    Cite
    (2010). The internet and everyday rights in Russia - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/68c7603b-0a69-5e8d-972d-e18b6030d627
    Explore at:
    Dataset updated
    Jul 17, 2010
    Area covered
    Russia
    Description

    This two-year project analyses whether the internet can champion the causes of citizens in non-democratic states. While there is much speculation that the internet can provide critical social capital when there is a democratic deficit, there is relatively little empirical work on the interplay between online and off-line social protest and action. This project will study the role of the internet in political life in Russia through an analysis of how people seek to fulfil their 'everyday' human rights in gaining access to social services such as pensions and health care. The study uses five central elements to examine the role of the internet in these efforts: content, community, catalyst, control and co-optation. The project will analyse internet content against a background of key factors, including the nature and behaviour of online users (community); how internet activity is sparked by real-world events such as protests or funding cuts (catalysts); how the government attempts to regulate the internet (control); and, more pessimistically, how political elites may attempt to hijack the influence of populist bloggers or websites once they have become influential (co-optation).

  19. Terrestrial Ecosystem Research Network Monitoring Sites - ARC

    • data.gov.au
    • researchdata.edu.au
    • +1more
    zip
    Updated Apr 13, 2022
    + more versions
    Cite
    Bioregional Assessment Program (2022). Terrestrial Ecosystem Research Network Monitoring Sites - ARC [Dataset]. https://data.gov.au/data/dataset/c826109d-add0-45d4-9a77-39f672b48980
    Explore at:
    zipAvailable download formats
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset and its metadata statement were supplied to the Bioregional Assessment Programme by a third party and are presented here as originally supplied.

    Location of Terrestrial Ecosystem Research Network Monitoring Sites. TERN aims to connect and enable ecosystem scientists to collect, contribute, store, share and integrate data across disciplines.

    Purpose

    To locate Terrestrial Ecosystem Research Network Monitoring Sites.

    Dataset History

    TERN AusPlots sites downloaded from the AEKOS data portal 15/11/13. Citation: TERN AusPlots (2015) Terrestrial Ecosystem Research Network AusPlots - Ausplots Rangelands Survey Program (biodiversity mapping supplement/subset). Adelaide, South Australia. AEKOS - TERN Ecoinformatics. DOI: 10.4227/05/54C1B45A4CF2F, see http://portal.aekos.org.au/dataset/172373

    Dataset Citation

    SA Department of Environment, Water and Natural Resources (2015) Terrestrial Ecosystem Research Network Monitoring Sites - ARC. Bioregional Assessment Source Dataset. Viewed 26 May 2016, http://data.bioregionalassessments.gov.au/dataset/c826109d-add0-45d4-9a77-39f672b48980.

  20. Attitudes towards the internet in Mexico 2025

    • statista.com
    Updated Apr 11, 2025
    Cite
    Umair Bashir (2025). Attitudes towards the internet in Mexico 2025 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Umair Bashir
    Description

    When asked about "Attitudes towards the internet", most Mexican respondents pick "It is important to me to have mobile internet access in any place" as an answer. 56 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.

Cite
Statista Research Department (2025). Number of internet users worldwide 2014-2029 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/

Number of internet users worldwide 2014-2029

Explore at:
287 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Apr 11, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
World
Description

The global number of internet users was forecast to increase continuously between 2024 and 2029 by a total of 1.3 billion users (+23.66 percent). After the fifteenth consecutive year of growth, the number of users is estimated to reach 7 billion and therefore a new peak in 2029. Notably, the number of internet users has increased continuously over the past years. Depicted is the estimated number of individuals in the country or region at hand who use the internet. As the data source clarifies, connection quality and usage frequency are distinct aspects, not taken into account here. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of internet users in regions like the Americas and Asia.