The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.
Storage capacity also growing
Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Internet use in the UK annual estimates by age, sex, disability, ethnic group, economic activity and geographical location, including confidence intervals.
http://opendatacommons.org/licenses/dbcl/1.0/
Africa - Population and Internet users statistics
Source: https://data.humdata.org/dataset/africa-population-and-internet-users-statistics Last updated at https://data.humdata.org/organization/openafrica : 2019-09-11
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The World Wide Web is a complex interconnected digital ecosystem, where information and attention flow between platforms and communities throughout the globe. These interactions co-construct how we understand the world, reflecting and shaping public discourse. Unfortunately, researchers often struggle to understand how information circulates and evolves across the web because platform-specific data is often siloed and restricted by linguistic barriers. To address this gap, we present a comprehensive, multilingual dataset capturing all Wikipedia links shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW subreddits. Each linked Wikipedia article is enriched with revision history, page view data, article ID, redirects, and Wikidata identifiers. Through a research agreement with Reddit, our dataset ensures user privacy while providing a query and ID mechanism that integrates with the Reddit and Wikipedia APIs. This enables extended analyses for researchers studying how information flows across platforms. For example, Reddit discussions use Wikipedia for deliberation and fact-checking which subsequently influences Wikipedia content, by driving traffic to articles or inspiring edits. By analyzing the relationship between information shared and discussed on these platforms, our dataset provides a foundation for examining the interplay between social media discourse and collaborative knowledge consumption and production.
The motivations for this dataset stem from the challenges researchers face in studying the flow of information across the web. While the World Wide Web enables global communication and collaboration, data silos, linguistic barriers, and platform-specific restrictions hinder our ability to understand how information circulates, evolves, and impacts public discourse. Wikipedia and Reddit, as major hubs of knowledge sharing and discussion, offer an invaluable lens into these processes. However, without comprehensive data capturing their interactions, researchers are unable to fully examine how platforms co-construct knowledge. This dataset bridges this gap, providing the tools needed to study the interconnectedness of social media and collaborative knowledge systems.
WikiReddit is a comprehensive dataset capturing all Wikipedia mentions (including links) shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW (not safe for work) subreddits. The SQL database comprises 336K total posts, 10.2M comments, 1.95M unique links, and 1.26M unique articles spanning 59 languages on Reddit and 276 Wikipedia language subdomains. Each linked Wikipedia article is enriched with its revision history and page view data within a ±10-day window of its posting, as well as article ID, redirects, and Wikidata identifiers. Supplementary anonymous metadata from Reddit posts and comments further contextualizes the links, offering a robust resource for analysing cross-platform information flows, collective attention dynamics, and the role of Wikipedia in online discourse.
Data was collected from the Reddit4Researchers and Wikipedia APIs. No personally identifiable information is published in the dataset. Reddit data is linked to Wikipedia data via the hyperlinks and article titles appearing in Reddit posts.
Extensive processing with tools such as regex was applied to the Reddit post/comment text to extract the Wikipedia URLs. Redirects for Wikipedia URLs and article titles were found through the API and mapped to the collected data. Reddit IDs are hashed with SHA-256 for post/comment/user/subreddit anonymity.
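The extraction and anonymisation steps described above can be illustrated with a minimal sketch. The regular expression and the function names below are assumptions made for illustration; the authors' actual patterns are not given in this description, and it is not stated whether the hashes are salted.

```python
import hashlib
import re

# Hypothetical pattern (not the authors' regex): matches links to any
# Wikipedia language subdomain, including the mobile ".m" variant.
# Limitation: titles containing ")" or "]" are cut at that character.
WIKI_URL = re.compile(
    r"https?://([a-z\-]+)(?:\.m)?\.wikipedia\.org/wiki/([^\s)\]>]+)",
    re.IGNORECASE,
)


def extract_wikipedia_links(text):
    """Return (language_subdomain, article_title) pairs found in text."""
    return [(m.group(1).lower(), m.group(2)) for m in WIKI_URL.finditer(text)]


def anonymize_id(raw_id):
    """Hash an identifier with unsalted SHA-256 (salting is unknown here)."""
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()


sample = "See https://en.wikipedia.org/wiki/Reddit and (https://de.m.wikipedia.org/wiki/Berlin)"
links = extract_wikipedia_links(sample)
```

Because the hash is deterministic, the same Reddit ID always maps to the same anonymised value, which is what allows posts, comments, and links to be joined without exposing the original identifiers.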
We foresee several applications of this dataset and preview four here. First, Reddit linking data can be used to understand how attention is driven from one platform to another. Second, Reddit linking data can shed light on how Wikipedia's archive of knowledge is used in the larger social web. Third, our dataset could provide insights into how external attention is topically distributed across Wikipedia. Our dataset can help extend that analysis into the disparities in what types of external communities Wikipedia is used in, and how it is used. Fourth, relatedly, a topic analysis of our dataset could reveal how Wikipedia usage on Reddit contributes to societal benefits and harms. Our dataset could help examine if homogeneity within the Reddit and Wikipedia audiences shapes topic patterns and assess whether these relationships mitigate or amplify problematic engagement online.
The dataset is publicly shared with a Creative Commons Attribution 4.0 International license. The article describing this dataset should be cited: https://doi.org/10.48550/arXiv.2502.04942
Patrick Gildersleve will maintain this dataset, and add further years of content as and when available.
posts
Column Name | Type | Description |
---|---|---|
subreddit_id | TEXT | The unique identifier for the subreddit. |
crosspost_parent_id | TEXT | The ID of the original Reddit post if this post is a crosspost. |
post_id | TEXT | Unique identifier for the Reddit post. |
created_at | TIMESTAMP | The timestamp when the post was created. |
updated_at | TIMESTAMP | The timestamp when the post was last updated. |
language_code | TEXT | The language code of the post. |
score | INTEGER | The score (upvotes minus downvotes) of the post. |
upvote_ratio | REAL | The ratio of upvotes to total votes. |
gildings | INTEGER | Number of awards (gildings) received by the post. |
num_comments | INTEGER | Number of comments on the post. |
comments
Column Name | Type | Description |
---|---|---|
subreddit_id | TEXT | The unique identifier for the subreddit. |
post_id | TEXT | The ID of the Reddit post the comment belongs to. |
parent_id | TEXT | The ID of the parent comment (if a reply). |
comment_id | TEXT | Unique identifier for the comment. |
created_at | TIMESTAMP | The timestamp when the comment was created. |
last_modified_at | TIMESTAMP | The timestamp when the comment was last modified. |
score | INTEGER | The score (upvotes minus downvotes) of the comment. |
upvote_ratio | REAL | The ratio of upvotes to total votes for the comment. |
gilded | INTEGER | Number of awards (gildings) received by the comment. |
postlinks
Column Name | Type | Description |
---|---|---|
post_id | TEXT | Unique identifier for the Reddit post. |
end_processed_valid | INTEGER | Whether the extracted URL from the post resolves to a valid URL. |
end_processed_url | TEXT | The extracted URL from the Reddit post. |
final_valid | INTEGER | Whether the final URL from the post resolves to a valid URL after redirections. |
final_status | INTEGER | HTTP status code of the final URL. |
final_url | TEXT | The final URL after redirections. |
redirected | INTEGER | Indicator of whether the posted URL was redirected (1) or not (0). |
in_title | INTEGER | Indicator of whether the link appears in the post title (1) or post body (0). |
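As a rough sketch of how the posts and postlinks tables described above might be queried together, the example below builds a tiny in-memory SQLite database using a subset of the documented columns. The rows are invented toy data, the function name is hypothetical, and the real database may differ in engine, indexes, and constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only a subset of the documented columns is recreated here.
conn.executescript(
    """
    CREATE TABLE posts (
        post_id TEXT, subreddit_id TEXT, created_at TIMESTAMP,
        score INTEGER, num_comments INTEGER
    );
    CREATE TABLE postlinks (
        post_id TEXT, final_url TEXT, final_valid INTEGER,
        redirected INTEGER, in_title INTEGER
    );
    """
)
# Toy rows for illustration only; they are not taken from the dataset.
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "s1", "2021-01-01 00:00:00", 10, 2),
        ("p2", "s1", "2021-01-02 00:00:00", 5, 0),
        ("p3", "s2", "2021-01-03 00:00:00", 7, 1),
    ],
)
conn.executemany(
    "INSERT INTO postlinks VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "https://en.wikipedia.org/wiki/A", 1, 0, 0),
        ("p2", "https://en.wikipedia.org/wiki/A", 1, 1, 1),
        ("p3", "https://en.wikipedia.org/wiki/B", 0, 0, 0),
    ],
)


def top_linked_urls(connection, limit=10):
    """Count posts and sum scores per valid final URL, most linked first."""
    query = """
        SELECT l.final_url, COUNT(*) AS n_posts, SUM(p.score) AS total_score
        FROM postlinks AS l
        JOIN posts AS p ON p.post_id = l.post_id
        WHERE l.final_valid = 1
        GROUP BY l.final_url
        ORDER BY n_posts DESC, total_score DESC
        LIMIT ?
    """
    return connection.execute(query, (limit,)).fetchall()


result = top_linked_urls(conn)
```

Filtering on final_valid keeps only links that still resolved after redirection, which is usually the right population when counting attention per article.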
commentlinks
Column Name | Type | Description |
---|---|---|
comment_id | TEXT | Unique identifier for the Reddit comment. |
end_processed_valid | INTEGER | Whether the extracted URL from the comment resolves to a valid URL. |
end_processed_url | TEXT | The extracted URL from the comment. |
final_valid | INTEGER | Whether the final URL from the comment resolves to a valid URL after redirections. |
final_status | INTEGER | HTTP status code of the final |
When asked about "Attitudes towards the internet", most Mexican respondents pick "It is important to me to have mobile internet access in any place" as an answer. 56 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.
https://www.caida.org/about/legal/aua/
Packet headers (up to and including the transport layer) for the Anonymized Internet Traces 2016 Dataset. Derived from OC192 traces on the Equinix San Jose and Chicago monitors.
As part of the What Works Cities criteria for achieving Certification, we need to meet the industry standard of at least 75% of our households having subscriptions or access to high-speed broadband services.
Part of the American Community Survey (ACS) asks what level of internet access residents have. We use the 5-Year Estimates for a greater level of precision, according to the "Distinguishing features of ACS 1-year, 1-year supplemental, 3-year, and 5-year estimates" table.
We query attributes of the DP02 (Selected Social Characteristics in the United States) group of questions for the years available.
This dataset has been narrowed down to Cary township using the following geography codes supported for the ACS dataset: state: 37; county: 183; county subdivision: 90536.
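A query of this kind could look roughly like the sketch below. It assumes the public Census Data API URL pattern for ACS 5-year data profiles; the function names are hypothetical, the year is a placeholder, and no request is actually sent. The 75% check simply restates the certification threshold mentioned above.

```python
from urllib.parse import urlencode

# Assumed base URL of the public Census Data API.
CENSUS_BASE = "https://api.census.gov/data"


def acs5_profile_url(year, group="DP02", state="37", county="183",
                     county_subdivision="90536"):
    """Build a request URL for one ACS 5-year data profile group."""
    params = {
        "get": f"group({group})",
        "for": f"county subdivision:{county_subdivision}",
        "in": f"state:{state} county:{county}",
    }
    return f"{CENSUS_BASE}/{year}/acs/acs5/profile?{urlencode(params)}"


def meets_broadband_standard(households_with_broadband, total_households,
                             threshold=0.75):
    """Return True when the broadband share reaches the threshold."""
    if total_households <= 0:
        raise ValueError("total_households must be positive")
    return households_with_broadband / total_households >= threshold


url = acs5_profile_url(2021)  # 2021 is only an example year
```

Keeping the geography codes as default arguments makes it easy to reuse the same helper for another county subdivision.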
This data set contains internet traffic data captured by an Internet Service Provider (ISP) using Mikrotik SDN Controller and packet sniffer tools. The data set includes traffic from over 2000 customers who use Fibre to the Home (FTTH) and GPON internet connections. The data was collected over a period of several months and contains all traffic in its original format with headers and packets.
The data set contains information on inbound and outbound traffic, including web browsing, email, file transfers, and more. The data set can be used for research in areas such as network security, traffic analysis, and machine learning.
**Data Collection Method:** The data was captured using Mikrotik SDN Controller and packet sniffer tools. These tools capture traffic data by monitoring network traffic in real-time. The data set contains all traffic data in its original format, including headers and packets.
**Data Set Content:** The data set is provided in a CSV format and includes the following fields:
MAC Protocol Examples
802.2 - 802.2 Frames (0x0004)
arp - Address Resolution Protocol (0x0806)
homeplug-av - HomePlug AV MME (0x88E1)
ip - Internet Protocol version 4 (0x0800)
ipv6 - Internet Protocol Version 6 (0x86DD)
ipx - Internetwork Packet Exchange (0x8137)
lldp - Link Layer Discovery Protocol (0x88CC)
loop-protect - Loop Protect Protocol (0x9003)
mpls-multicast - MPLS multicast (0x8848)
mpls-unicast - MPLS unicast (0x8847)
packing-compr - Encapsulated packets with compressed IP packing (0x9001)
packing-simple - Encapsulated packets with simple IP packing (0x9000)
pppoe - PPPoE Session Stage (0x8864)
pppoe-discovery - PPPoE Discovery Stage (0x8863)
rarp - Reverse Address Resolution Protocol (0x8035)
service-vlan - Provider Bridging (IEEE 802.1ad) & Shortest Path Bridging IEEE 802.1aq (0x88A8)
vlan - VLAN-tagged frame (IEEE 802.1Q) and Shortest Path Bridging IEEE 802.1aq with NNI compatibility (0x8100)
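For analysis, the protocol names and hexadecimal codes listed above can be held in a lookup table. The sketch below copies the values from the list; the helper name is hypothetical, and how the CSV actually encodes this field is not stated in the description.

```python
# Protocol name -> EtherType code, copied from the list above.
ETHERTYPES = {
    "802.2": 0x0004,
    "arp": 0x0806,
    "homeplug-av": 0x88E1,
    "ip": 0x0800,
    "ipv6": 0x86DD,
    "ipx": 0x8137,
    "lldp": 0x88CC,
    "loop-protect": 0x9003,
    "mpls-multicast": 0x8848,
    "mpls-unicast": 0x8847,
    "packing-compr": 0x9001,
    "packing-simple": 0x9000,
    "pppoe": 0x8864,
    "pppoe-discovery": 0x8863,
    "rarp": 0x8035,
    "service-vlan": 0x88A8,
    "vlan": 0x8100,
}

# Reverse mapping for decoding numeric codes back to names.
NAMES_BY_CODE = {code: name for name, code in ETHERTYPES.items()}


def protocol_name(code):
    """Return the protocol name for a code, or a hex label if unknown."""
    return NAMES_BY_CODE.get(code, f"unknown(0x{code:04X})")
```

For example, `protocol_name(0x0800)` gives `"ip"`, while a code that is not in the list falls back to a labelled hexadecimal string rather than raising an error.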
**Data Usage:** The data set can be used for research in areas such as network security, traffic analysis, and machine learning. Researchers can use the data to develop new algorithms for detecting and preventing cyber attacks, analyzing internet traffic patterns, and more.
**Data Availability:** If you are interested in using this data set for research purposes, please contact us at asfandyar250@gmail.com for more information and references. The data set is available for download on Kaggle and can be accessed by researchers who have obtained permission from the ISP.
We hope this data set will be useful for researchers in the field of network security and traffic analysis. If you have any questions or need further information, please do not hesitate to contact us.
You can use Wireshark or other software to view the files.
Internal listing of current employees and authorized users who can access SSA applications.
Automatically describing images using natural sentences is an essential task to visually impaired people's inclusion on the Internet. Although there are many datasets in the literature, most of them contain only English captions, whereas datasets with captions described in other languages are scarce.
PraCegoVer arose on the Internet, stimulating users from social media to publish images, tag #PraCegoVer and add a short description of their content. Inspired by this movement, we have proposed the #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese with freely annotated images.
Dataset Structure
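The dataset consists of the file dataset.json and a set of compressed archive parts (images.tar.gz.part*) containing the images. The file dataset.json contains a list of JSON objects with the attributes: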
user: anonymized user that made the post;
filename: image file name;
raw_caption: raw caption;
caption: clean caption;
date: post date.
Each instance in dataset.json is associated with exactly one image in the images directory whose filename is pointed by the attribute filename. Also, we provide a sample with five instances, so the users can download the sample to get an overview of the dataset before downloading it completely.
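A minimal sketch of reading this structure is shown below. The attribute names come from the list above; the function names, the directory name "images", and the sample record are assumptions for illustration only.

```python
import json
from pathlib import Path

# Attributes documented for each object in dataset.json.
EXPECTED_FIELDS = {"user", "filename", "raw_caption", "caption", "date"}


def parse_instances(json_text):
    """Parse the JSON list and check each object has the documented fields."""
    instances = json.loads(json_text)
    for instance in instances:
        missing = EXPECTED_FIELDS - instance.keys()
        if missing:
            raise ValueError(f"instance is missing fields: {sorted(missing)}")
    return instances


def image_path(instance, images_dir="images"):
    """Return the path of the single image associated with an instance."""
    return Path(images_dir) / instance["filename"]


# Invented record, used only to show the expected shape.
sample_text = json.dumps([
    {"user": "u1", "filename": "a.jpg", "raw_caption": "x #PraCegoVer",
     "caption": "x", "date": "2020-01-01"},
])
instances = parse_instances(sample_text)
```

For the real file, the text would typically be read with `Path("dataset.json").read_text(encoding="utf-8")` before being passed to `parse_instances`.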
Download Instructions
If you just want to have an overview of the dataset structure, you can download sample.tar.gz. But, if you want to use the dataset, or any of its subsets (63k and 173k), you must download all the files and run the following commands to uncompress and join the files:
cat images.tar.gz.part* > images.tar.gz
tar -xzvf images.tar.gz
Alternatively, you can download the entire dataset from the terminal using the python script download_dataset.py available in PraCegoVer repository. In this case, first, you have to download the script and create an access token here. Then, you can run the following command to download and uncompress the image files:
python download_dataset.py --access_token=
When asked about "Attitudes towards the internet", most Australian respondents pick "It is important to me to have mobile internet access in any place" as an answer. 55 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The National Broadband Data represents coverage information across Canada for existing broadband service providers with their associated technology types. The coverage information is aggregated and deployed over a grid of hexagons, which cover areas of roughly 25 square km each. Broadband Internet service availability is provided for download/upload speed markers (5/1, 10/2, 25/5 and 50/10 Mbps) where more than 75% of total dwellings covered within the hexagon have access to broadband service offerings meeting these markers. In order to improve the granularity of the broadband data, ISED and the CRTC are providing aggregated and anonymous broadband services data based on the pseudo-household statistical model, hence achieving higher precision in depicting the broadband Internet service availability. This information is available below under the "NBD PHH Speeds" resource. For more information on the pseudo-household statistical model, refer to the Pseudo-Household Demographic Distribution dataset. A representation of broadband services per 250m road segments is now available for download under the “NBD Roads” resource. To generate this dataset, the NBD PHH Speeds information was projected over the nearest road arc from Statistics Canada’s Road Network File, and those roads were spliced in approximately 250m segments. NEW: The data has been augmented to include new presentation layers as published on the National Broadband Map.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This feature layer includes all OPM collected data at the town level.
The Connecticut Broadband Availability and Adoption Maps were created to help citizens and policymakers understand the strengths and weaknesses of broadband infrastructure in the state. Data is aggregated to the block, tract, and town (county subdivision) levels and includes counts of locations classified as unserved, underserved, and served as well as whether they meet the state goal of 1000Mbps/100Mbps. This application splits its visualizations into block, tract, and town layers for both unserved locations and progress to the state goal.
This map uses OPM collected availability and adoption data.
As of 2023, OPM collected availability data was submitted by internet service providers pursuant to PA 21-159 and processed by the GIS Office in the Office of Policy and Management, cleaned, and matched to the CostQuest location fabric.
Metadata:
All feature layers, maps, and datasets, including OPM's internal broadband availability data, follow the same basic schema, with additional fields added in some cases for convenience.
Fields named no service, unserved, underserved, served, and GigC are counts of locations where a particular level of broadband service is provided. No service locations are those where there is no reported service at all. Unserved locations are locations where there is a provider offering wireline service, but not at or above 25 Mbps download and 3 Mbps upload. Underserved locations are locations where at least one provider offers wireline service of 25 Mbps download and 3 Mbps upload, but there is no provider offering wireline service of 100 Mbps download and 20 Mbps upload. Served locations are locations where there is wireline service of at least 100 Mbps download and 20 Mbps upload. GigC denotes the count of locations that have service at 1000 Mbps download and 100 Mbps upload. Accordingly, total locations is equal to the sum of no service, unserved, underserved, served, and "GigC" locations. Availability also includes fields for average download and upload speeds. These are calculated at the relevant level of census geography based on the maximum for all locations.
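The availability tiers defined above can be expressed as a small classification function. This is a sketch under two assumptions that the description implies but does not state outright: each location is judged on its best available wireline offer, and the tiers are mutually exclusive (so a GigC location is not also counted as served). The function names are hypothetical.

```python
from collections import Counter


def classify_location(max_down_mbps, max_up_mbps):
    """Assign one availability tier from the best offered wireline speeds.

    Pass None for both speeds when no service is reported at all.
    """
    if max_down_mbps is None or max_up_mbps is None:
        return "no service"
    if max_down_mbps >= 1000 and max_up_mbps >= 100:
        return "GigC"
    if max_down_mbps >= 100 and max_up_mbps >= 20:
        return "served"
    if max_down_mbps >= 25 and max_up_mbps >= 3:
        return "underserved"
    return "unserved"


def count_levels(locations):
    """Count tiers over (download, upload) pairs, e.g. for one town."""
    return Counter(classify_location(down, up) for down, up in locations)


# Invented speeds for three locations, for illustration only.
counts = count_levels([(None, None), (10, 1), (300, 30)])
```

Under the mutually exclusive reading, the counts for the five tiers add up to the total number of locations, matching the relationship stated above.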
The final field included in all availability data is the provider list.
OPM collected adoption data:
OPM collected adoption data uses many of the same naming conventions as the availability data, but there are some notable differences.
Fields named unserved_Sub, underserved_Sub, served_Sub, and GigC_Sub are counts of subscriptions at a particular level of broadband service. Unserved subscriptions are subscriptions that do not meet the standard of 25 Mbps download and 3 Mbps upload. Underserved subscriptions are subscriptions with speeds of at least 25 Mbps download and 3 Mbps upload, but not meeting 100 Mbps download and 20 Mbps upload. Served subscriptions are subscriptions where speeds are between 100 Mbps download and 20 Mbps upload and 1000 Mbps download and 100 Mbps upload. GigC denotes the count of locations that have a subscription at 1000 Mbps download and 100 Mbps upload or higher. For subscription data these locations are NOT included in the "served" field as this does not directly apply to FCC use of the terms.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Smart homes contain programmable electronic devices (mostly IoT) that enable home automation. People who live in smart homes benefit from interconnected devices by controlling them either remotely or manually/autonomously. However, high interconnectivity comes with an increased attack surface, making the smart home an attractive target for adversaries. NCC Group and the Global Cyber Alliance recorded over 12,000 attacks to log into smart home devices maliciously. Recent statistics show that over 200 million smart homes can be subjected to these attacks. Conventional security systems are either focused on network traffic (e.g., firewalls) or physical environment (e.g., CCTV or basic motion sensors), but not both. A key challenge in developing cyber-physical security systems is the lack of datasets and test beds. For cyber-physical datasets to be meaningful, they need to be collected in real smart home environments. Due to the inherited difficulties and challenges (e.g. effort, costs, test-bed availability), such cyber-physical smart home datasets are quite rare. This paper aims to fill this gap by contributing a dataset we collected in a real smart home with annotated labels. This paper explains the process we followed to collect the data and how we organised them to facilitate wider use within research communities. A related article can be found at https://doi.org/10.3389/friot.2023.1275080
These data consist of measures of Internet use estimated using small area estimation (SAE). The small area estimation is based on census Output Areas (OAs) using the 2013 Oxford Internet Survey (OxIS) and the 2011 British census. There is an estimate for each OA in Great Britain. By combining the 2013 OxIS survey data with the comprehensive small area coverage of the 2011 British census we can use the strengths of one to offset the gaps in the other. Specifically, we follow a two-step process. First, we use the information that is reliably available in OxIS to create a model that estimates the proportion of Internet users in OAs. Second, we use the parameters from this model combined with census data to estimate the proportion of Internet users in each OA in Britain. Once these estimates are available, we aggregate the estimates up to higher levels of geography. In this way we can estimate Internet use in Glasgow, Manchester and Cardiff as well as other small areas in Britain. This procedure is referred to as indirect, model-based or synthetic estimation. In recent years such SAE techniques have been widely used throughout Europe and North America. See the project website for more details.
The objective of the Geography of Digital Inequality project was to explore the geographical contours of Internet use and penetration in Britain. Specifically, the project assembled from existing datasets a new dataset which contains Internet information at fine-grained geographic levels, census output areas (OAs). From OAs we were able to aggregate to higher geographic levels such as counties, Welsh and Scottish Councils, metropolitan areas, or others. Through this unique dataset we explored digital divides and the geography of the Internet, a capability possessed by no other dataset. Specifically, we explored the extent of use versus non-use of the Internet. Two datasets were used to assemble this dataset.
First, the 2013 Oxford Internet Survey (OxIS) is a random sample of 2,657 people aged 14+ from the British population (England, Scotland & Wales). Interviews were conducted face-to-face by an independent survey research company. The response rate for 2013 was 51%. The data collection was a two-stage sample. A random sample of census output areas (OAs) was selected and respondents were randomly sampled within each selected OA. For details, see "Data collection technical report.pdf" which has been uploaded. We use six variables from OxIS: Internet use, region, age, lifestage, gender and education. The questionnaire for OxIS contains about 300 variables and it is available from the OxIS website, see the URL in the "related resources" section. Second, the 2011 British Census. For information on how the census was conducted, see the census website. The URL for the 2011 census is given below in "related resources".
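The two-step logic described above (estimate from the survey, then apply to census counts) can be illustrated with a deliberately simplified sketch. It replaces the project's fitted model with plain per-group survey rates, which is a swapped-in simplification and not the authors' method; the group labels, numbers, and function names are invented for illustration.

```python
from collections import defaultdict


def cell_rates(survey):
    """Step 1: share of Internet users per demographic cell in the survey.

    survey is an iterable of (cell, uses_internet) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [users, respondents]
    for cell, uses_internet in survey:
        totals[cell][0] += int(uses_internet)
        totals[cell][1] += 1
    return {cell: users / n for cell, (users, n) in totals.items()}


def estimate_area(census_counts, rates):
    """Step 2: apply the survey rates to one area's census cell counts.

    Raises KeyError if the census has a cell the survey never observed.
    """
    population = sum(census_counts.values())
    expected_users = sum(n * rates[cell] for cell, n in census_counts.items())
    return expected_users / population


# Toy inputs: two cells, four respondents, one output area of 100 people.
toy_survey = [("young", True), ("young", True), ("old", True), ("old", False)]
rates = cell_rates(toy_survey)
oa_estimate = estimate_area({"young": 30, "old": 70}, rates)
```

Aggregating to a higher geography works the same way: sum the cell counts of the constituent OAs and call `estimate_area` on the combined counts.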
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this data set, we present data collected for the purpose of carrying out a systematic review of the available Wireless Sensor Network and Internet of Things testbed facilities. The data was collected through multiple stages and in each stage the pre-defined criteria were applied. We provide a dataset describing the hardware and software aspects of Wireless Sensor Network and Internet of Things testbed facilities available in the market and scientific community. The data were gathered through an extensive systematic review process of scientific articles published between the years 2011 and 2021. The review aims to obtain good quality data for people who are actively researching the Internet of Things facilities or anyone who is interested in that field.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please cite the following paper when using this dataset: N. Thakur, “MonkeyPox2022Tweets: The first public Twitter dataset on the 2022 MonkeyPox outbreak,” Preprints, 2022, DOI: 10.20944/preprints202206.0172.v2
Abstract
The world is currently facing an outbreak of the monkeypox virus, and confirmed cases have been reported from 28 countries. Following a recent “emergency meeting”, the World Health Organization just declared monkeypox a global health emergency. As a result, people from all over the world are using social media platforms, such as Twitter, for information seeking and sharing related to the outbreak, as well as for familiarizing themselves with the guidelines and protocols that are being recommended by various policy-making bodies to reduce the spread of the virus. This is resulting in the generation of tremendous amounts of Big Data related to such paradigms of social media behavior. Mining this Big Data and compiling it in the form of a dataset can serve a wide range of use-cases and applications such as analysis of public opinions, interests, views, perspectives, attitudes, and sentiment towards this outbreak. Therefore, this work presents MonkeyPox2022Tweets, an open-access dataset of Tweets related to the 2022 monkeypox outbreak that were posted on Twitter since the first detected case of this outbreak on May 7, 2022. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management.
Data Description
The dataset consists of a total of 255,363 Tweet IDs of the same number of tweets about monkeypox that were posted on Twitter from 7th May 2022 to 23rd July 2022 (the most recent date at the time of dataset upload). The Tweet IDs are presented in 6 different .txt files based on the timelines of the associated tweets. The following provides the details of these dataset files.
• Filename: TweetIDs_Part1.txt (No. of Tweet IDs: 13926, Date Range of the Tweet IDs: May 7, 2022 to May 21, 2022)
• Filename: TweetIDs_Part2.txt (No. of Tweet IDs: 17705, Date Range of the Tweet IDs: May 21, 2022 to May 27, 2022)
• Filename: TweetIDs_Part3.txt (No. of Tweet IDs: 17585, Date Range of the Tweet IDs: May 27, 2022 to June 5, 2022)
• Filename: TweetIDs_Part4.txt (No. of Tweet IDs: 19718, Date Range of the Tweet IDs: June 5, 2022 to June 11, 2022)
• Filename: TweetIDs_Part5.txt (No. of Tweet IDs: 47718, Date Range of the Tweet IDs: June 12, 2022 to June 30, 2022)
• Filename: TweetIDs_Part6.txt (No. of Tweet IDs: 138711, Date Range of the Tweet IDs: July 1, 2022 to July 23, 2022)
The dataset contains only Tweet IDs in compliance with the terms and conditions mentioned in the privacy policy, developer agreement, and guidelines for content redistribution of Twitter. The Tweet IDs need to be hydrated to be used.
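Before hydration, the ID files typically need to be read and split into batches. The sketch below assumes one Tweet ID per line in each .txt file and uses a batch size of 100 purely as an illustrative default; the right size depends on the hydration tool used, which this description does not name. The per-file counts are copied from the list above and sum to the stated total.

```python
# Per-file Tweet ID counts as listed in the dataset description.
FILE_COUNTS = {
    "TweetIDs_Part1.txt": 13926,
    "TweetIDs_Part2.txt": 17705,
    "TweetIDs_Part3.txt": 17585,
    "TweetIDs_Part4.txt": 19718,
    "TweetIDs_Part5.txt": 47718,
    "TweetIDs_Part6.txt": 138711,
}


def parse_ids(text):
    """Return non-empty, stripped lines; assumes one Tweet ID per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def batches(ids, size=100):
    """Yield consecutive lists of at most `size` IDs (size is an assumption)."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


total_ids = sum(FILE_COUNTS.values())
example_batches = list(batches(parse_ids("1\n2\n\n3\n"), size=2))
```

Checking that the number of IDs read from each file matches `FILE_COUNTS` is a cheap way to detect a truncated download before spending API quota on hydration.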
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ICA243 - Percentage of Internet users who purchased Travel/Culture related services online in the previous 3 months. Published by Central Statistics Office. Available under the license Creative Commons Attribution 4.0 (CC-BY-4.0).
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The Internet access indicator measures the prevalence of different Internet technology options available in Champaign County, Illinois, and the U.S., at two different speeds: 4/1 Mbps and 25/3 Mbps.
Seven types of connection options are evaluated: ADSL, cable, fiber, fixed wireless, satellite, "other" technology, and "any" technology, which includes the previous six options.
Satellite internet, at both speeds, is the most widely available in all three areas. One hundred percent of Champaign County residents have access to satellite internet at both speeds. Cable internet is also widely available across all three areas, and over 90 percent of Champaign County residents have access to cable internet. Fiber internet is the least widely available type of technology, aside from "other" technology. However, fiber internet is now available to almost 38 percent of Champaign County residents as of December 2020, an increase from approximately 25 percent in June 2020.
The ability of Champaign County residents to access the Internet has become key in many facets of life, especially during the COVID-19 pandemic. Internet access provides economic, educational, and social opportunities; having or not having Internet access has become not only a technological issue, but an equity issue.
This data was retrieved from the Federal Communications Commission’s Fixed Broadband Deployment Area Comparison, and dates from December 2020.
Source: Federal Communications Commission. (2020). Fixed Broadband Deployment. Area Comparison. https://broadbandmap.fcc.gov/#/. (Accessed 3 June 2022).
Data collected from interviews with employers, professionals, self-employed individuals, and individual workers who have been assisted by JAN