18 datasets found
  1. Data from: ERA5 hourly data on single levels from 1940 to present

    • cds.climate.copernicus.eu
    grib
    Updated Oct 2, 2025
    Cite
    ECMWF (2025). ERA5 hourly data on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.adbb2d47
    Dataset authored and provided by
    ECMWF
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1940 - Sep 26, 2025
    Description

    ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis.

    Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to ingest improved versions of the original observations, all of which benefits the quality of the reanalysis product.

    ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread.

    ERA5 is updated daily with a latency of about 5 days. If serious flaws are detected in this early release (called ERA5T), the data could differ from the final release 2 to 3 months later. If this occurs, users are notified.

    The data set presented here is a regridded subset of the full ERA5 data set on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 hourly data on single levels from 1940 to present".
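
    As a rough illustration of what the regridded resolutions quoted above imply, the sketch below counts the points on a regular global lat-lon grid for a given spacing. It assumes latitudes run from -90 to 90 inclusive and longitudes cover 0 to 360 without repeating the wrap-around meridian. The function name `grid_shape` is hypothetical and is not part of any CDS tooling.

```python
def grid_shape(step_deg: float) -> tuple[int, int]:
    """Return (n_lat, n_lon) for a regular global lat-lon grid.

    Assumption: latitudes include both poles (-90 and 90), while
    longitudes span [0, 360) so the wrap-around meridian is not repeated.
    """
    n_lat = round(180 / step_deg) + 1
    n_lon = round(360 / step_deg)
    return n_lat, n_lon


if __name__ == "__main__":
    # Grid spacings mentioned in the description, in degrees.
    for step in (0.25, 0.5, 1.0):
        n_lat, n_lon = grid_shape(step)
        print(f"{step} deg: {n_lat} x {n_lon} = {n_lat * n_lon} points per field")
```

    Under these assumptions, each halving of the spacing roughly quadruples the number of points per field, which is worth keeping in mind when sizing hourly downloads.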

  2. Network traffic datasets with novel extended IP flow called NetTiSA flow

    • data.niaid.nih.gov
    Updated Apr 18, 2024
    Cite
    Karel Hynek (2024). Network traffic datasets with novel extended IP flow called NetTiSA flow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8301042
    Dataset provided by
    Karel Hynek
    Josef Koumar
    Jaroslav Pešek
    Tomáš Čejka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network traffic datasets with novel extended IP flow called NetTiSA flow

    Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking https://doi.org/10.1016/j.comnet.2023.110147

    Please cite the usage of our datasets as:

    Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286

    @article{KOUMAR2024110147,
    title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification},
    journal = {Computer Networks},
    volume = {240},
    pages = {110147},
    year = {2024},
    issn = {1389-1286},
    doi = {https://doi.org/10.1016/j.comnet.2023.110147},
    url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923},
    author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka}
    }

    This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.

    NetTiSA flow feature vector

    The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by computation. The first group of features is based on classical bidirectional flow information: the number of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group of features can be computed from the previous groups (i.e., on the flow collector) and improves the classification performance without any impact on the telemetry bandwidth.

    Flow features

    The flow features are:

    • Packets is the number of packets in the direction from the source to the destination IP address.
    • Packets in reverse order is the number of packets in the direction from the destination to the source IP address.
    • Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.
    • Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.

    Statistical and Time-based features

    These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. This second feature set contains the following features:

    • Mean is the mean of the payload lengths of packets
    • Min is the minimal value from payload lengths of all packets in a flow
    • Max is the maximum value from payload lengths of all packets in a flow
    • Standard deviation is a measure of the variation of payload lengths from the mean payload length
    • Root mean square is the measure of the magnitude of payload lengths of packets
    • Average dispersion is the average absolute difference between each payload length of the packet and the mean value
    • Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution
    • Mean of relative times is the mean of the relative times, a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, ..., t_n - t_1\}\)
    • Mean of time differences is the mean of the time differences, a sequence defined as \(dt = \{ t_j - t_i | j = i + 1, i \in \{1, 2, \dots, n - 1\} \}\)
    • Min from time differences is the minimal value from all time differences, i.e., min space between packets.
    • Max from time differences is the maximum value from all time differences, i.e., max space between packets.
    • Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation:
      \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{\{dt_{n-1}\}} - dt_i \right| }{ \frac{1}{2} \left(max\left(\{dt_{n-1}\}\right) - min\left(\{dt_{n-1}\}\right) \right) }\)
    • Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by the equation:
      \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\)

    where \(s_n\) is the number of switches.
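
    To make the definitions above concrete, here is a minimal batch sketch of several of these features in plain Python. It is not the ipfixprobe implementation: the function name `nettisa_stats`, the dictionary keys, and the use of population (divide-by-n) moments are all assumptions made for illustration, and a real probe would compute these values stream-wise. Kurtosis and the switching ratio are left out because the description does not pin down the kurtosis convention or exactly how the switches \(s_n\) are counted.

```python
import math


def nettisa_stats(payload_lengths, timestamps):
    """Compute a subset of the statistical and time-based features.

    Illustrative only: population (divide-by-n) moments are an assumption,
    and this batch version ignores the streaming constraint of a real probe.
    """
    n = len(payload_lengths)
    if n < 2 or n != len(timestamps):
        raise ValueError("need at least two packets with matching timestamps")

    mean = sum(payload_lengths) / n
    variance = sum((x - mean) ** 2 for x in payload_lengths) / n
    rms = math.sqrt(sum(x * x for x in payload_lengths) / n)
    avg_dispersion = sum(abs(x - mean) for x in payload_lengths) / n

    # Relative times st (offsets from the first packet) and the
    # consecutive time differences dt, as defined in the list above.
    st = [t - timestamps[0] for t in timestamps]
    dt = [timestamps[i + 1] - timestamps[i] for i in range(n - 1)]
    mean_dt = sum(dt) / len(dt)

    # Time distribution: mean absolute deviation of dt divided by half
    # of its range. A zero range (evenly spaced packets) is mapped to 0.0,
    # which is a choice made here to avoid dividing by zero.
    half_range = 0.5 * (max(dt) - min(dt))
    mean_abs_dev = sum(abs(mean_dt - d) for d in dt) / len(dt)
    tdist = mean_abs_dev / half_range if half_range else 0.0

    return {
        "mean": mean,
        "min": min(payload_lengths),
        "max": max(payload_lengths),
        "std": math.sqrt(variance),
        "rms": rms,
        "avg_dispersion": avg_dispersion,
        "mean_relative_time": sum(st) / n,
        "mean_time_diff": mean_dt,
        "min_time_diff": min(dt),
        "max_time_diff": max(dt),
        "tdist": tdist,
    }
```

    For example, `nettisa_stats([100, 200, 100, 200], [0.0, 1.0, 3.0, 6.0])` describes a four-packet flow whose time differences are 1, 2 and 3 seconds.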
    

    Features computed at the collector

    The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size and their computation does not put additional load on resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:

    • Max minus min is the difference between minimum and maximum payload lengths
    • Percent deviation is the dispersion of the average absolute difference to the mean value
    • Variance is the spread measure of the data from its mean
    • Burstiness is the degree of peakedness in the central part of the distribution
    • Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement
    • Directions describe a percentage ratio of packet direction computed as \(\frac{d_1}{ d_1 + d_0}\), where \(d_1\) is a number of packets in a direction from source to destination IP address and \(d_0\) the opposite direction. Both \(d_1\) and \(d_0\) are inside the classical bidirectional flow.
    • Duration is the duration of the flow
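
    The point of this third group is that a collector can derive it from values the probe already exported. The sketch below illustrates that idea under stated assumptions: the function name `collector_features` and the key names are hypothetical, percent deviation is taken as average dispersion divided by the mean, and the coefficient of variation as standard deviation divided by the mean. Burstiness and duration are omitted because the description gives no formula for them.

```python
def collector_features(mean, min_len, max_len, std, avg_dispersion,
                       packets, packets_rev):
    """Derive collector-side features from already exported flow values.

    Assumptions (not confirmed by the dataset description):
    - percent deviation = average dispersion / mean, kept as a fraction
    - coefficient of variation = standard deviation / mean
    """
    total_packets = packets + packets_rev
    return {
        "max_minus_min": max_len - min_len,
        "percent_deviation": avg_dispersion / mean if mean else 0.0,
        "variance": std ** 2,
        "coefficient_of_variation": std / mean if mean else 0.0,
        # d_1 / (d_1 + d_0): share of packets sent from source to destination.
        "directions": packets / total_packets if total_packets else 0.0,
    }
```

    Because every input is already part of the exported flow record, none of these values adds to the telemetry sent by the probe.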

    The NetTiSA flow is implemented in the IP flow exporter ipfixprobe.

    Description of dataset files

    The following table describes each dataset file:

    File name | Detection problem | Citation of the original raw dataset
    botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
    doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
    doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
    dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
    edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
    edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
    https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
    ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
    ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
    unsw_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
    unsw_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
    iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
    ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
    ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets.

  3. Network traffic datasets with novel extended IP flow called NetTiSA flow

    • zenodo.org
    csv
    Updated Apr 18, 2024
    Cite
    Josef Koumar; Karel Hynek; Jaroslav Pešek; Tomáš Čejka (2024). Network traffic datasets with novel extended IP flow called NetTiSA flow [Dataset]. http://doi.org/10.5281/zenodo.8301043
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Josef Koumar; Karel Hynek; Jaroslav Pešek; Tomáš Čejka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network traffic datasets with novel extended IP flow called NetTiSA flow

    Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking https://doi.org/10.1016/j.comnet.2023.110147

    Please cite the usage of our datasets as:

    Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286

    @article{KOUMAR2024110147,
    title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification},
    journal = {Computer Networks},
    volume = {240},
    pages = {110147},
    year = {2024},
    issn = {1389-1286},
    doi = {https://doi.org/10.1016/j.comnet.2023.110147},
    url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923},
    author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka}
    }
    

    This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.

    NetTiSA flow feature vector


    The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by computation. The first group of features is based on classical bidirectional flow information: the number of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group of features can be computed from the previous groups (i.e., on the flow collector) and improves the classification performance without any impact on the telemetry bandwidth.

    Flow features

    The flow features are:

    • Packets is the number of packets in the direction from the source to the destination IP address.
    • Packets in reverse order is the number of packets in the direction from the destination to the source IP address.
    • Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.
    • Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.

    Statistical and Time-based features

    These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. This second feature set contains the following features:

    • Mean represents mean of the payload lengths of packets
    • Min is the minimal value from payload lengths of all packets in a flow
    • Max is the maximum value from payload lengths of all packets in a flow
    • Standard deviation is a measure of the variation of payload lengths from the mean payload length
    • Root mean square is the measure of the magnitude of payload lengths of packets
    • Average dispersion is the average absolute difference between each payload length of the packet and the mean value
    • Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution
    • Mean of relative times is the mean of the relative times which is a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, ..., t_n - t_1\} \)
    • Mean of time differences is the mean of the time differences which is a sequence defined as \(dt = \{ t_j - t_i | j = i + 1, i \in \{1, 2, \dots, n - 1\} \}.\)
    • Min from time differences is the minimal value from all time differences, i.e., min space between packets.
    • Max from time differences is the maximum value from all time differences, i.e., max space between packets.
    • Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation:
      \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{\{dt_{n-1}\}} - dt_i \right| }{ \frac{1}{2} \left(max\left(\{dt_{n-1}\}\right) - min\left(\{dt_{n-1}\}\right) \right) }\)
    • Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by equation:
      \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\)

    where \(s_n\) is the number of switches.

    Features computed at the collector
    The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size and their computation does not put additional load on resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:

    • Max minus min is the difference between minimum and maximum payload lengths
    • Percent deviation is the dispersion of the average absolute difference to the mean value
    • Variance is the spread measure of the data from its mean
    • Burstiness is the degree of peakedness in the central part of the distribution
    • Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement
    • Directions describe a percentage ratio of packet direction computed as \(\frac{d_1}{ d_1 + d_0}\), where \(d_1\) is a number of packets in a direction from source to destination IP address and \(d_0\) the opposite direction. Both \(d_1\) and \(d_0\) are inside the classical bidirectional flow.
    • Duration is the duration of the flow

    The NetTiSA flow is implemented in the IP flow exporter ipfixprobe.

    Description of dataset files

    The following table describes each dataset file:

    File name | Detection problem | Citation of the original raw dataset
    botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
    doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
    doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
    dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
    edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
    edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications:

  4. Annual Count of Extreme Summer Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated Feb 7, 2023
    Cite
    Met Office (2023). Annual Count of Extreme Summer Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/TheMetOffice::annual-count-of-extreme-summer-days-projections-12km/about
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 0.0.]

    What does the data show?

    The Annual Count of Extreme Summer Days is the number of days per year where the maximum daily temperature is above 35°C. It measures how many times the threshold is exceeded (not by how much) in a year. Note, the term ‘extreme summer days’ is used to refer to the threshold, and temperatures above 35°C outside the summer months also contribute to the annual count. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded, as there will be many factors such as natural variability and local scale processes that the climate model is unable to represent.

    The Annual Count of Extreme Summer Days is calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C, 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of extreme summer days to previous values.

    What are the possible societal impacts?

    The Annual Count of Extreme Summer Days indicates increased health risks, transport disruption and damage to infrastructure from high temperatures. It is based on exceeding a maximum daily temperature of 35°C. Impacts include:

    • Increased heat related illnesses, hospital admissions or death affecting not just the vulnerable.
    • Transport disruption due to overheating of road and railway infrastructure.

    Other metrics such as the Annual Count of Summer Days (days above 25°C), Annual Count of Hot Summer Days (days above 30°C) and the Annual Count of Tropical Nights (where the minimum temperature does not fall below 20°C) also indicate impacts from high temperatures; however, they use different temperature thresholds.

    What is a global warming level?

    The Annual Count of Extreme Summer Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming.

    The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Count of Extreme Summer Days, an average is taken across the 21 year period. Therefore, the Annual Count of Extreme Summer Days shows the number of extreme summer days that could occur each year, for each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate, as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

    What are the naming conventions and how do I explore the data?

    This data contains a field for each global warming level and two baselines. They are named ‘ESD’ (where ESD means Extreme Summer Days), the warming level or baseline, and ‘upper’, ‘median’ or ‘lower’ as per the description below. For example, ‘Extreme Summer Days 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. ‘Extreme Summer Days 2.5 median’ is ‘ExtremeSummerDays_25_median’.

    To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578

    Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘ESD 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?

    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

    For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Count of Extreme Summer Days was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location. The ‘lower’ fields are the second lowest ranked ensemble member. The ‘upper’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.

    This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and upper fields, the greater the uncertainty.

    ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

    Useful links

    • This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the 'State of the UK Climate' report.
    • Further information on the UK Climate Projections (UKCP).
    • Further information on understanding climate data within the Met Office Climate Data Portal.
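
    Three of the conventions described above are mechanical enough to sketch in code: the field naming, the 21 year window around the year a warming level is first reached, and the choice of ‘lower’, ‘median’ and ‘upper’ from ranked ensemble members. The function names below are hypothetical and this is not Met Office code. The field-name pattern is extrapolated from the single example given in the description, and taking the median of an even-sized ensemble as the mean of the two central values is an assumption, since the description only says "central value".

```python
import statistics


def field_name(level: str, stat: str) -> str:
    """Build a field name such as 'ExtremeSummerDays_25_median'.

    Assumption: every warming level follows the one documented example,
    i.e. the decimal point is simply dropped from the level.
    """
    return f"ExtremeSummerDays_{level.replace('.', '')}_{stat}"


def warming_window(first_year_reached: int) -> tuple[int, int]:
    """Return the inclusive 21 year period: 10 years either side of the
    first year at which the global warming level is reached."""
    return first_year_reached - 10, first_year_reached + 10


def lower_median_upper(member_values):
    """Summarise one location across ensemble members.

    'lower' is the second lowest and 'upper' the second highest ranked
    member, as described above. The median of an even-sized ensemble is
    taken as the mean of the two central values (an assumption).
    """
    ranked = sorted(member_values)
    if len(ranked) < 3:
        raise ValueError("need at least three ensemble members")
    return ranked[1], statistics.median(ranked), ranked[-2]
```

    With 12 members, dropping the single lowest and highest values in this way means the lower-to-upper range spans the 2nd to 11th ranked members at each location.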

  5. Prediction performance on the US dataset.

    • plos.figshare.com
    xls
    Updated Sep 15, 2025
    Cite
    Satoki Fujita; Tatsuya Akutsu (2025). Prediction performance on the US dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0331611.t002
    Dataset provided by
    PLOS ONE
    Authors
    Satoki Fujita; Tatsuya Akutsu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Forecasting the future number of confirmed cases in each region is a critical challenge in controlling the spread of infectious diseases. Accurate predictions enable the proactive development of optimal containment strategies. Recently, deep learning-based models have increasingly leveraged graph structures to capture the spatial dynamics of epidemic spread. While intuitive, this approach often increases model complexity, and the resulting performance gains may not justify the added burden. In some cases, it may even lead to overfitting. Moreover, infectious disease data is typically noisy, making it difficult to extract infectious disease-specific dynamics from data without guidance based on epidemiological domain knowledge. To address these issues, we propose a simple yet effective hybrid model for multi-region epidemic forecasting, termed Physics-Informed Spatial IDentity neural network (PISID). This model integrates a spatio-temporal identity (STID)-based neural network module, which encodes spatio-temporal information without relying on graph structures, with an SIR module grounded in classical epidemiological dynamics. Regional characteristics are incorporated via a spatial embedding matrix, and epidemiological parameters are inferred through a fully connected neural network. These parameters are then used to govern the dynamics of the SIR model for forecasting purposes. Experiments on real-world datasets demonstrate that the proposed PISID model achieves stable and superior predictive performance compared to baseline models, with approximately 27K parameters and an average training time of 0.45 seconds per epoch. Additionally, ablation studies validate the effectiveness of the neural network’s encoding architecture, and analysis of the decoded epidemiological parameters highlights the model’s interpretability. Overall, PISID contributes to reliable epidemic forecasting by integrating data-driven learning with epidemiological domain knowledge.
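
    For readers unfamiliar with the SIR component mentioned above, the following is a minimal sketch of the classical SIR dynamics advanced with a forward-Euler step. It is not the PISID implementation: in PISID the epidemiological parameters are inferred by a neural network, whereas here the infection rate `beta` and recovery rate `gamma` are fixed constants chosen purely for illustration, and the function names are hypothetical.

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """Advance the classical SIR model by one forward-Euler step.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    n = s + i + r
    if n == 0:
        return s, i, r  # empty population: nothing to update
    new_infections = beta * s * i / n * dt
    new_recoveries = gamma * i * dt
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def simulate(s0, i0, r0, beta, gamma, steps):
    """Roll the model forward and return the list of (S, I, R) states."""
    state = (float(s0), float(i0), float(r0))
    trajectory = [state]
    for _ in range(steps):
        state = sir_step(*state, beta, gamma)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: 1000 people, 10 initially infected.
    for day, (s, i, r) in enumerate(simulate(990, 10, 0, beta=0.3, gamma=0.1, steps=5)):
        print(f"day {day}: S={s:.2f} I={i:.2f} R={r:.2f}")
```

    Each step only moves people between compartments, so the total population stays constant, which gives a simple sanity check for any implementation of these dynamics.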

  6. Prediction performance on the Japan dataset.

    • plos.figshare.com
    xls
    Updated Sep 15, 2025
    Cite
    Satoki Fujita; Tatsuya Akutsu (2025). Prediction performance on the Japan dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0331611.t001
    Available download formats: xls
    Dataset updated
    Sep 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Satoki Fujita; Tatsuya Akutsu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Japan
    Description

    Forecasting the future number of confirmed cases in each region is a critical challenge in controlling the spread of infectious diseases. Accurate predictions enable the proactive development of optimal containment strategies. Recently, deep learning-based models have increasingly leveraged graph structures to capture the spatial dynamics of epidemic spread. While intuitive, this approach often increases model complexity, and the resulting performance gains may not justify the added burden. In some cases, it may even lead to overfitting. Moreover, infectious disease data is typically noisy, making it difficult to extract infectious disease-specific dynamics from data without guidance based on epidemiological domain knowledge. To address these issues, we propose a simple yet effective hybrid model for multi-region epidemic forecasting, termed Physics-Informed Spatial IDentity neural network (PISID). This model integrates a spatio-temporal identity (STID)-based neural network module, which encodes spatio-temporal information without relying on graph structures, with an SIR module grounded in classical epidemiological dynamics. Regional characteristics are incorporated via a spatial embedding matrix, and epidemiological parameters are inferred through a fully connected neural network. These parameters are then used to govern the dynamics of the SIR model for forecasting purposes. Experiments on real-world datasets demonstrate that the proposed PISID model achieves stable and superior predictive performance compared to baseline models, with approximately 27K parameters and an average training time of 0.45 seconds per epoch. Additionally, ablation studies validate the effectiveness of the neural network’s encoding architecture, and analysis of the decoded epidemiological parameters highlights the model’s interpretability. Overall, PISID contributes to reliable epidemic forecasting by integrating data-driven learning with epidemiological domain knowledge.
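
    For readers unfamiliar with the SIR module mentioned above, the following is a minimal sketch of one forward-Euler step of the classical SIR compartment equations. The function name `sir_step`, the time step `dt`, and the example values of `beta` and `gamma` are illustrative assumptions. This is not the authors' PISID code, in which the epidemiological parameters are inferred by a neural network.

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """Advance the classical SIR compartments by one forward-Euler step.

    s, i, r : susceptible, infected and recovered counts
    beta    : transmission rate (assumed constant here)
    gamma   : recovery rate (assumed constant here)
    """
    n = s + i + r                           # total population is conserved
    new_infections = beta * s * i / n * dt  # flow from S to I
    new_recoveries = gamma * i * dt         # flow from I to R
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


# Hypothetical example: 1,000 people, 10 of them initially infected.
s1, i1, r1 = sir_step(990.0, 10.0, 0.0, beta=0.3, gamma=0.1)
```

    In a hybrid model of the kind described, `beta` and `gamma` would vary by region and time instead of being fixed constants.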

  7. Dataset related to the publication "Anomalous radiative transfer in...

    • data.niaid.nih.gov
    Updated Jun 27, 2024
    Cite
    Sassaroli, Angelo (2024). Dataset related to the publication "Anomalous radiative transfer in heterogeneous media" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11278461
    Dataset updated
    Jun 27, 2024
    Dataset provided by
    Sassaroli, Angelo
    Paolucci, Michela
    Fini, Lorenzo
    Pini, Ernesto
    Pattelli, Lorenzo
    Cavalieri, Stefano
    Martelli, Fabrizio
    Tommasi, Federico
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains Monte Carlo results for anomalous light transport in a spherical geometry with homogeneous (Hom), heterogeneous layered (Lay), and homogeneous layered (VHom) configurations.

    Part of these data is described in the publication:

    F. Tommasi et al. "Anomalous radiative transfer in heterogeneous media" Advanced Theory and Simulations (2024) https://doi.org/10.1002/adts.202400182

    Data is organized in two types of csv files:

    filename.csv containing information on the average path length and total path length distribution

    filename_Fluence.csv containing information on the fluence rate and radiance at each spherical layer boundary

    Filenames contain information on the simulation parameters used:

    ISO isotropic scattering

    K0_X value of the k coefficient of the Generalized Pareto Distribution used in the Monte Carlo simulation (X = 0.3 or 0.7)

    1e7 number of total trajectories considered

    DS, noDS simulation performed using the proposed set of rules for anomalous transport in bounded media (DS) or the standard set of rules (noDS)

    Hom homogeneous sphere configuration

    NLay number of layers used in the layered configurations

    mism, no_mism configuration with (mism) or without (no_mism) refractive index mismatch with the environment or between layers

    Vhom_layN layered homogeneous sphere configuration with N layers

    mus_const, mus_step simulations performed using a constant scattering coefficient (mus_const) or different values across different layers (mus_step)
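
    The naming convention above can be decoded programmatically. The sketch below is one possible reading of it, checked only against the filenames listed further down in this description. The function name `parse_simulation_filename` and the dictionary keys are hypothetical, and tokens that do not appear in those filenames are not handled.

```python
import re


def parse_simulation_filename(path):
    """Decode the simulation parameters encoded in one dataset filename."""
    stem = path.rsplit("/", 1)[-1]          # drop the folder (Hom/, Lay/, VHom/)
    if stem.endswith(".csv"):
        stem = stem[:-4]

    info = {"fluence_file": stem.endswith("_Fluence")}
    if info["fluence_file"]:
        stem = stem[: -len("_Fluence")]

    info["isotropic"] = stem.startswith("ISO_")
    info["rules"] = ("noDS" if stem.endswith("_noDS")
                     else "DS" if stem.endswith("_DS") else None)

    # "k0_3" encodes k = 0.3 and "k0_7" encodes k = 0.7.
    k = re.search(r"_k0_(\d+)_", stem)
    info["k"] = float("0." + k.group(1)) if k else None

    # "1e7" is the total number of trajectories.
    n = re.search(r"_(\d+e\d+)_", stem)
    info["trajectories"] = int(float(n.group(1))) if n else None

    # Test "no_mism" first because it also contains "_mism".
    if "no_mism" in stem:
        info["mismatch"] = False
    elif "_mism" in stem:
        info["mismatch"] = True
    else:
        info["mismatch"] = None             # token absent from the name

    vhom = re.search(r"Vhom_lay(\d+)", stem)
    lay = re.search(r"_(\d+)lay_", stem, re.IGNORECASE)
    if vhom:
        info["configuration"], info["layers"] = "VHom", int(vhom.group(1))
    elif lay:
        info["configuration"], info["layers"] = "Lay", int(lay.group(1))
    elif "_Hom_" in stem:
        info["configuration"], info["layers"] = "Hom", None
    else:
        info["configuration"], info["layers"] = None, None

    mus = re.search(r"mus_(const|step)", stem)
    info["mus"] = mus.group(1) if mus else None
    return info


example = parse_simulation_filename("Lay/ISO_10lay_mism_k0_3_1e7_noDS_Fluence.csv")
```

    For the example above, the parser reports a fluence file for a 10-layer configuration with refractive index mismatch, k = 0.3, 1e7 trajectories and the standard (noDS) rules.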

    The header of each output file contains general information on the simulation settings as detailed below:

    The number of layers

    The radius of each layer

    The refractive index of each layer and of the external region

    The critical angle for entering and exiting the sphere

    Reduced scattering coefficient of each layer

    Scattering function

    Type of illumination source

    Absorption coefficient of the sphere

    Simulations for different scattering strengths are classified based on the optical thickness Taud of the sphere, corresponding to the product of the sphere diameter (10 mm) and the reduced scattering coefficient. Since we have results only for isotropic scattering, Taud is also the product of the diameter and the scattering coefficient.

    For a layered sphere, Taud is given by the sum of the products of the scattering coefficient and the radial thickness of each layer.

    The symbol Taua is also used to denote the product of the sphere diameter and the absorption coefficient. Since we have considered only non-absorbing media, this value is always zero.

    Each simulation is carried out for different values of the reduced scattering coefficient in the layers, keeping their reciprocal ratios fixed, resulting in the following values of Taud: 1E-3, 2E-3, 5E-3, 1E-2, 2E-2, 5E-2, 1E-1, 2E-1, 5E-1, 1, 2, 5, 10, 20, 50, 100, which, in the case of the homogeneous sphere, correspond to reduced scattering coefficients between 1E-4 and 10 mm-1.

    The standard error (SE) is also provided, based on the results of 100 independent simulations with 1E5 trajectories each, for a total of 1E7 trajectories.
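
    As a quick check of the numbers above, the following sketch converts the listed Taud values into reduced scattering coefficients for the homogeneous sphere, and shows one common way to obtain a standard error from independent runs. The function names are hypothetical, and the SE formula (sample standard deviation divided by the square root of the number of runs) is an assumption, since the description does not state how the SE was computed.

```python
import math
import statistics

DIAMETER_MM = 10.0  # sphere diameter stated in the description
TAUD_VALUES = [1e-3, 2e-3, 5e-3, 1e-2, 2e-2, 5e-2, 1e-1, 2e-1, 5e-1,
               1, 2, 5, 10, 20, 50, 100]


def reduced_scattering_coefficient(taud, diameter_mm=DIAMETER_MM):
    """Homogeneous sphere: Taud = diameter * mus', so mus' = Taud / diameter."""
    return taud / diameter_mm


def standard_error(run_results):
    """Assumed SE of the mean over independent runs: stdev / sqrt(n)."""
    return statistics.stdev(run_results) / math.sqrt(len(run_results))


# Reduced scattering coefficients in mm^-1, one per Taud value.
mus_values = [reduced_scattering_coefficient(t) for t in TAUD_VALUES]
```

    The smallest and largest entries of `mus_values` reproduce the range quoted for the homogeneous sphere.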

    Structure of the pathlength files (filename.csv)

    List of content for the calculated quantities shown:

    Ksc_Max: maximum number of scattering events in the medium

    de_max(mm): maximum pathlength followed by photons

    Number of Photons lost (always zero, all trajectories are collected)

    Total CW Exiting Radiation and Standard Error (SE)

    Mean Pathlength for Total Exiting Radiation and SE

    Solutions of the RTE for the non-absorbing sphere with Lambertian illumination given in terms of the mean pathlength in the sphere and partial pathlength in each layer

    Partial Mean Pathlengths in each layer for the non-absorbing case

    Standard Error on Partial Mean Pathlengths in each layer

    Discrepancy and SE for the Pathlengths for the non-absorbing case

    Relative SE is shown for the Partial Mean Pathlengths in the non-absorbing case

    Total Spread Function (mm-1) versus the pathlength of emerging photons l (mm) for each Taud value

    Structure of the fluence files (filename_Fluence.csv)

    List of content for the calculated quantities shown:

    Mean pathlength and partial pathlength

    CW fluence at each layer boundary

    SE: standard error for the CW fluence

    Discrepancy and SE of the CW fluence with respect to the expected value from the invariance properties. This is reported for each layer interface

    Relative Error of the CW fluence

    CW FLUX_P at each layer boundary

    Distribution of CW radiance at each layer boundary for all Taud values. The distribution expected from the invariance property is also listed in the columns adjacent to the Monte Carlo results. Radiance data is reported as a function of the angle at each layer boundary

    Additional empty fields in the files refer to output quantities that can be optionally calculated during the simulations. These options were not selected for this study.

    For convenience, we list below which files were used to prepare each Figure (some files are used multiple times for different Figures):

    Figure 2

    Hom/ISO_Hom_mism_k0_3_1e7_DS.csv
    Hom/ISO_Hom_mism_k0_3_1e7_noDS.csv
    Hom/ISO_Hom_mism_k0_7_1e7_DS.csv
    Hom/ISO_Hom_mism_k0_7_1e7_noDS.csv
    Lay/ISO_4Lay_mus_const_k0_3_1e7_DS.csv
    Lay/ISO_4Lay_mus_const_k0_3_1e7_noDS.csv
    Lay/ISO_4Lay_mus_const_k0_7_1e7_DS.csv
    Lay/ISO_4Lay_mus_const_k0_7_1e7_noDS.csv
    Lay/ISO_4Lay_mus_step_k0_3_1e7_DS.csv
    Lay/ISO_4Lay_mus_step_k0_3_1e7_noDS.csv
    Lay/ISO_4Lay_mus_step_k0_7_1e7_DS.csv
    Lay/ISO_4Lay_mus_step_k0_7_1e7_noDS.csv

    Figure 3

    Lay/ISO_10lay_mism_k0_3_1e7_DS_Fluence.csv
    Lay/ISO_10lay_mism_k0_3_1e7_noDS_Fluence.csv
    Lay/ISO_10lay_mism_k0_7_1e7_DS_Fluence.csv
    Lay/ISO_10lay_mism_k0_7_1e7_noDS_Fluence.csv

    Figure 4

    Hom/ISO_Hom_no_mism_k0_3_1e7_DS.csv
    Hom/ISO_Hom_no_mism_k0_3_1e7_noDS.csv
    Hom/ISO_Hom_no_mism_k0_7_1e7_DS.csv
    Hom/ISO_Hom_no_mism_k0_7_1e7_noDS.csv
    Hom/ISO_Hom_mism_k0_3_1e7_DS.csv
    Hom/ISO_Hom_mism_k0_3_1e7_noDS.csv
    Hom/ISO_Hom_mism_k0_7_1e7_DS.csv
    Hom/ISO_Hom_mism_k0_7_1e7_noDS.csv

    Figure 5

    Hom/ISO_Hom_mism_k0_7_1e7_DS.csv
    VHom/ISO_Vhom_lay2_mism_k0_7_1e7_DS.csv
    VHom/ISO_Vhom_lay4_mism_k0_7_1e7_DS.csv
    VHom/ISO_Vhom_lay8_mism_k0_7_1e7_DS.csv
    VHom/ISO_Vhom_lay16_mism_k0_7_1e7_DS.csv

  8. Number of internet users worldwide 2014-2029

    • statista.com
    Updated Apr 11, 2025
    Cite
    Statista Research Department (2025). Number of internet users worldwide 2014-2029 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    World
    Description

    The global number of internet users was forecast to increase continuously between 2024 and 2029 by a total of 1.3 billion users (+23.66 percent). After the fifteenth consecutive year of increase, the number of users is estimated to reach 7 billion, a new peak, in 2029. Notably, the number of internet users has increased continuously over the past years.

    Depicted is the estimated number of individuals in the country or region at hand that use the internet. As the data source clarifies, connection quality and usage frequency are distinct aspects that are not taken into account here.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of internet users in regions like the Americas and Asia.

  9. Annual Cooling Degree Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated May 17, 2023
    Cite
    Met Office (2023). Annual Cooling Degree Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/annual-cooling-degree-days-projections-12km
    Dataset updated
    May 17, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 1.2.]

    What does the data show?

    A Cooling Degree Day (CDD) is a day in which the average temperature is above 22°C. It is the number of degrees above this threshold that counts as a Cooling Degree Day. For example, if the average temperature for a specific day is 22.5°C, this would contribute 0.5 Cooling Degree Days to the annual sum; alternatively, an average temperature of 27°C would contribute 5 Cooling Degree Days. Given the data shows the annual sum of Cooling Degree Days, this value can be above 365 in some parts of the UK.

    Annual Cooling Degree Days is calculated for two baseline (historical) periods 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming) and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C, 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of CDD to previous values.

    What are the possible societal impacts?

    Cooling Degree Days indicate the energy demand for cooling due to hot days. A higher number of CDD means an increase in power consumption for cooling and air conditioning; therefore, this index is useful for predicting future changes in energy demand for cooling.

    In practice, this varies greatly throughout the UK, depending on personal thermal comfort levels and building designs, so these results should be considered as rough estimates of overall demand changes on a large scale.

    What is a global warming level?

    Annual Cooling Degree Days are calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming.

    The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Cooling Degree Days, an average is taken across the 21 year period. Therefore, the Annual Cooling Degree Days show the number of cooling degree days that could occur each year, for each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

    What are the naming conventions and how do I explore the data?

    This data contains a field for each global warming level and two baselines. They are named ‘CDD’ (Cooling Degree Days), the warming level or baseline, and 'upper' 'median' or 'lower' as per the description below. E.g. 'CDD 2.5 median' is the median value for the 2.5°C projection. Decimal points are included in field aliases but not field names e.g. 'CDD 2.5 median' is 'CDD_25_median'.

    To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578

    Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘CDD 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?

    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

    For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, Annual Cooling Degree Days were calculated for each ensemble member and they were then ranked in order from lowest to highest for each location. The ‘lower’ fields are the second lowest ranked ensemble member. The ‘upper’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.

    This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and upper fields, the greater the uncertainty.

    ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

    Useful links

    This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the 'State of the UK Climate' report.

    Further information on the UK Climate Projections (UKCP).

    Further information on understanding climate data within the Met Office Climate Data Portal.
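
    The Cooling Degree Day arithmetic and the ensemble ranking described above can be sketched as follows. The function names and sample values are hypothetical, and taking the ‘median’ of 12 members as the mean of the two central values is an assumption; the description only calls it the central value of the ensemble.

```python
import statistics

CDD_THRESHOLD_C = 22.0  # threshold stated in the description


def annual_cooling_degree_days(daily_mean_temps_c, threshold_c=CDD_THRESHOLD_C):
    """Sum the degrees above the threshold over every day of one year."""
    return sum(max(0.0, t - threshold_c) for t in daily_mean_temps_c)


def lower_median_upper(member_values):
    """Rank ensemble members: second lowest, central value, second highest."""
    ranked = sorted(member_values)
    return ranked[1], statistics.median(ranked), ranked[-2]


# Days at 22.5 °C and 27 °C contribute 0.5 and 5; a day at 18 °C contributes 0.
cdd = annual_cooling_degree_days([22.5, 27.0, 18.0])

# Twelve made-up ensemble values for one grid cell.
lower, median, upper = lower_median_upper([3, 9, 1, 12, 7, 5, 2, 10, 4, 11, 6, 8])
```

    With the inputs shown, `cdd` matches the two worked contributions in the text (0.5 + 5), and the ranking drops the single lowest and single highest members.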

  10. Number of global social network users 2017-2028

    • statista.com
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  11. Annual Count of Hot Summer Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated Feb 7, 2023
    Cite
    Met Office (2023). Annual Count of Hot Summer Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/items/1a89ff97e169482291ed49ff29ce1120
    Dataset updated
    Feb 7, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 0.2.]

    What does the data show?

    The Annual Count of Hot Summer Days is the number of days per year where the maximum daily temperature is above 30°C. It measures how many times the threshold is exceeded (not by how much) in a year. Note, the term ‘hot summer days’ is used to refer to the threshold, and temperatures above 30°C outside the summer months also contribute to the annual count. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded as there will be many factors such as natural variability and local scale processes that the climate model is unable to represent.

    The Annual Count of Hot Summer Days is calculated for two baseline (historical) periods 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming) and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C, 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of hot summer days to previous values.

    What are the possible societal impacts?

    The Annual Count of Hot Summer Days indicates increased health risks, transport disruption and damage to infrastructure from high temperatures. It is based on exceeding a maximum daily temperature of 30°C. Impacts include:

    Increased heat related illnesses, hospital admissions or death.

    Transport disruption due to overheating of railway infrastructure. Overhead power lines also become less efficient.

    Other metrics such as the Annual Count of Summer Days (days above 25°C), Annual Count of Extreme Summer Days (days above 35°C) and the Annual Count of Tropical Nights (where the minimum temperature does not fall below 20°C) also indicate impacts from high temperatures; however, they use different temperature thresholds.

    What is a global warming level?

    The Annual Count of Hot Summer Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming.

    The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Count of Hot Summer Days, an average is taken across the 21 year period. Therefore, the Annual Count of Hot Summer Days shows the number of hot summer days that could occur each year, for each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

    What are the naming conventions and how do I explore the data?

    This data contains a field for each global warming level and two baselines. They are named ‘HSD’ (where HSD means Hot Summer Days), the warming level or baseline, and ‘upper’ ‘median’ or ‘lower’ as per the description below. E.g. ‘Hot Summer Days 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names e.g. ‘Hot Summer Days 2.5 median’ is ‘HotSummerDays_25_median’.

    To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578

    Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘HSD 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?

    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

    For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Count of Hot Summer Days was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location. The ‘lower’ fields are the second lowest ranked ensemble member. The ‘upper’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.

    This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and upper fields, the greater the uncertainty.

    ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

    Useful links

    This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the 'State of the UK Climate' report.

    Further information on the UK Climate Projections (UKCP).

    Further information on understanding climate data within the Met Office Climate Data Portal.
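
    The threshold count and the 21 year averaging window described above can be illustrated with a short sketch. All function names, the year 2040 and the sample counts are hypothetical; the code only mirrors the stated rules (count days strictly above 30°C, then average over the 10 years either side of the year the warming level is first reached).

```python
HOT_DAY_THRESHOLD_C = 30.0  # threshold stated in the description


def count_hot_summer_days(daily_max_temps_c, threshold_c=HOT_DAY_THRESHOLD_C):
    """Count days whose maximum temperature is above the threshold."""
    return sum(1 for t in daily_max_temps_c if t > threshold_c)


def window_years(first_year_reached, half_width=10):
    """The 21 years centred on the year the warming level is first reached."""
    return list(range(first_year_reached - half_width,
                      first_year_reached + half_width + 1))


def window_mean(annual_counts_by_year, first_year_reached):
    """Average the annual counts over the 21 year window."""
    years = window_years(first_year_reached)
    return sum(annual_counts_by_year[y] for y in years) / len(years)


# Only the 30.1 °C and 35.0 °C days exceed the threshold.
hot_days = count_hot_summer_days([29.9, 30.0, 30.1, 35.0])

# Made-up annual counts for an ensemble member reaching the level in 2040.
counts = {year: 2 for year in range(2030, 2051)}
mean_count = window_mean(counts, 2040)
```

    Because each ensemble member reaches a given warming level in a different year, `first_year_reached` would differ from member to member.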

  12. Annual Count of Frost Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated Feb 7, 2023
    Cite
    Met Office (2023). Annual Count of Frost Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/TheMetOffice::annual-count-of-frost-days-projections-12km/explore
    Dataset updated
    Feb 7, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 1.2.]

    What does the data show?

    The Annual Count of Frost Days is the number of days per year where the minimum daily temperature is below 0°C. It measures how many times the threshold is exceeded (not by how much) in a year. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded, as there will be many factors, such as natural variability and local scale processes, that the climate model is unable to represent.

    The Annual Count of Frost Days is calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C, 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of frost days to previous values.

    What are the possible societal impacts?

    The Annual Count of Frost Days indicates increased cold weather disruption due to a higher than normal chance of ice and snow. It is based on the minimum daily temperature being below 0°C. Impacts include damage to crops, transport disruption and increased energy demand.

    The Annual Count of Icing Days is a similar metric measuring impacts from cold temperatures; it indicates more severe cold weather impacts.

    What is a global warming level?

    The Annual Count of Frost Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming.

    The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Count of Frost Days, an average is taken across the 21 year period. Therefore, the Annual Count of Frost Days shows the number of frost days that could occur each year, for each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate, as it will depend on future greenhouse gas emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

    What are the naming conventions and how do I explore the data?

    This data contains a field for each global warming level and two baselines. They are named ‘Frost Days’, the warming level or baseline, and ‘upper’, ‘median’ or ‘lower’ as per the description below. E.g. ‘Frost Days 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. ‘Frost Days 2.5 median’ is ‘FrostDays_25_median’.

    To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578

    Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘Frost Days 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?

    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

    For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Count of Frost Days was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location. The ‘lower’ fields are the second lowest ranked ensemble member. The ‘upper’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.

    This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and upper fields, the greater the uncertainty.

    ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

    Useful links

    This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the 'State of the UK Climate' report.
    Further information on the UK Climate Projections (UKCP).
    Further information on understanding climate data within the Met Office Climate Data Portal.
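The steps described in this entry (counting frost days, taking a 21 year window around the first year a warming level is reached, ranking 12 ensemble members, and building field names) can be sketched roughly as follows. This is an illustrative sketch only, not the Met Office's code: all function names are hypothetical, the 'median' of 12 members is assumed to be the mean of the two central values, and the field-name rule is only confirmed by the source for the ‘FrostDays_25_median’ example.

```python
import numpy as np


def annual_frost_days(tmin_daily_c):
    """Count the days whose minimum temperature is below 0 degrees C."""
    return int(np.sum(np.asarray(tmin_daily_c, dtype=float) < 0.0))


def window_years(first_year_reached):
    """21 year window: 10 years either side of the first year the level is reached."""
    return list(range(first_year_reached - 10, first_year_reached + 11))


def ensemble_summary(member_values):
    """Rank ensemble values for one location and pick lower/median/upper.

    lower = second lowest, upper = second highest (as described in the entry).
    median = np.median, which for 12 members averages the two central values
    (an assumption; the entry only says "central value").
    """
    ranked = np.sort(np.asarray(member_values, dtype=float))
    return {
        "lower": float(ranked[1]),
        "median": float(np.median(ranked)),
        "upper": float(ranked[-2]),
    }


def field_name(level, stat):
    """Drop the decimal point from the level, e.g. (2.5, 'median') -> 'FrostDays_25_median'.

    Hypothetical helper: names for other levels and baselines are not confirmed here.
    """
    return f"FrostDays_{str(level).replace('.', '')}_{stat}"
```

With 12 made-up member values 1 to 12, `ensemble_summary` would pick 2 as 'lower' and 11 as 'upper', so the reported spread deliberately excludes the single most extreme member at each end.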

  13. Facebook users worldwide 2017-2027

    • statista.com
    Cite
    Stacy Jo Dixon, Facebook users worldwide 2017-2027 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
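As a quick consistency check on the figures quoted in this description, the implied 2023 base can be back-calculated from the absolute and relative increase. This is a sketch that assumes the +14.36 percent is measured relative to the 2023 user count, which the description does not state explicitly; the variable names are illustrative.

```python
# Figures quoted in the description.
increase_million = 391.0   # forecast increase, 2023 to 2027
growth_percent = 14.36     # assumed to be relative to the 2023 base

# Back out the implied 2023 base and the resulting 2027 total.
base_2023_million = increase_million / (growth_percent / 100.0)
forecast_2027_million = base_2023_million + increase_million

print(f"Implied 2023 base: {base_2023_million:.0f} million")
print(f"Implied 2027 total: {forecast_2027_million / 1000.0:.1f} billion")
```

Under that assumption the implied 2027 total rounds to the 3.1 billion quoted above.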

  14. Statistics Bureau, Private Households with Rented Rooms: Members and Size of...

    • geocommons.com
    Updated Jul 1, 2008
    Cite
    Burkey (2008). Statistics Bureau, Private Households with Rented Rooms: Members and Size of Household, Japan, 2005 [Dataset]. http://geocommons.com/search.html
    Dataset updated
    Jul 1, 2008
    Dataset provided by
    Statistics Bureau, Ministry of Internal Affairs and Communications
    Burkey
    Description

    This dataset displays data from the 2005 Census of Japan. It displays data on Private Households throughout prefectures in Japan. This dataset specifically deals with number of Private Households with Rented Rooms, Number of Private Households with Rented Rooms Members, Average number of Members per Private Households with Rented Rooms, Area of Floor Space per Household of Private Households with Rented Rooms, and Area of Floor Space per Person of Private Households with Rented Rooms. This data comes from Japan's Ministry of Internal Affairs and Communication's Statistics Bureau.

  15. Dataset for 3rd variation of SEIRSEI model.

    • plos.figshare.com
    xls
    Updated Apr 18, 2024
    Cite
    Kottakkaran Sooppy Nisar; Muhammad Wajahat Anjum; Muhammad Asif Zahoor Raja; Muhammad Shoaib (2024). Dataset for 3rd variation of SEIRSEI model. [Dataset]. http://doi.org/10.1371/journal.pone.0298451.t001
    Available download formats: xls
    Dataset updated
    Apr 18, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Kottakkaran Sooppy Nisar; Muhammad Wajahat Anjum; Muhammad Asif Zahoor Raja; Muhammad Shoaib
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The paper presents an innovative computational framework for predictive solutions for simulating the spread of malaria. The structure incorporates sophisticated computing methods to improve the reliability of predicting malaria outbreaks. The study strives to provide a strong and effective tool for forecasting the propagation of malaria via the use of an AI-based recurrent neural network (RNN). The model is classified into two groups, consisting of humans and mosquitoes. To develop the model, the traditional Ross-Macdonald model is expanded upon, allowing for a more comprehensive analysis of the intricate dynamics at play. To gain a deeper understanding of the extended Ross model, we employ RNN, treating it as an initial value problem involving a system of first-order ordinary differential equations, each representing one of the seven profiles. This method enables us to obtain valuable insights and elucidate the complexities inherent in the propagation of malaria. Mosquitoes and humans constitute the two cohorts encompassed within the exposition of the mathematical dynamical model. Human dynamics are comprised of individuals who are susceptible, exposed, infectious, and in recovery. The mosquito population, on the other hand, is divided into three categories: susceptible, exposed, and infected. For RNN, we used the input of 0 to 300 days with an interval length of 3 days. The evaluation of the precision and accuracy of the methodology is conducted by superimposing the estimated solution onto the numerical solution. In addition, the outcomes obtained from the RNN are examined, including regression analysis, assessment of error autocorrelation, examination of time series response plots, mean square error, error histogram, and absolute error. A reduced mean square error signifies that the model’s estimates are more accurate. The result is consistent with acquiring an approximate absolute error close to zero, revealing the efficacy of the suggested strategy. 
    This research presents a novel approach to solving the malaria propagation model using recurrent neural networks. Additionally, it examines the behavior of various profiles under varying initial conditions of the malaria propagation model, which consists of a system of ordinary differential equations.
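The input grid and the error measures mentioned in this description can be illustrated with a small sketch. It is not the authors' code: the reference curve and the perturbed estimate below are made-up placeholders standing in for the numerical solution and the RNN output, and the function names are hypothetical.

```python
import numpy as np

# Input grid described in the abstract: 0 to 300 days with an interval of 3 days.
t_days = np.arange(0, 301, 3)


def mean_square_error(estimate, reference):
    """Mean of the squared differences between estimate and reference."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(diff ** 2))


def absolute_error(estimate, reference):
    """Pointwise absolute difference between estimate and reference."""
    return np.abs(np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float))


# Placeholder curves (assumptions for illustration only).
reference = np.exp(-t_days / 100.0)
estimate = reference + 1e-3 * np.sin(t_days / 10.0)

print(len(t_days), "grid points")
print("MSE:", mean_square_error(estimate, reference))
print("max abs error:", absolute_error(estimate, reference).max())
```

A smaller mean square error and an absolute error close to zero are the criteria the description uses to judge how well the estimate matches the numerical solution.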

  16. TikTok global quarterly downloads 2018-2024

    • statista.com
    Cite
    Statista Research Department, TikTok global quarterly downloads 2018-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    In the fourth quarter of 2024, TikTok generated around 186 million downloads from users worldwide. First launched in China by ByteDance as Douyin, the short-video format was popularized by TikTok and took over the global social media environment in 2020. In the first quarter of 2020, TikTok downloads peaked at over 313.5 million worldwide, up by 62.3 percent compared to the first quarter of 2019.

    TikTok interactions: is there a magic formula for content success?

    In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions were primarily composed of likes, with fewer than 20 comments per piece of content on average in 2024.
    The platform has been actively monitoring the issue of fake interactions, and it removed around 236 million fake likes during the first quarter of 2024. Though there is no secret formula for maximizing these metrics, recommended video length can possibly contribute to the success of content on TikTok.
    As of the first quarter of 2024, it was recommended that tiny TikTok accounts with up to 500 followers post videos that are around 2.6 minutes long. By contrast, the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.

    What’s trending on TikTok Shop?

    Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
    TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the famous short video platform, respectively.
    
  17. Average daily time spent on social media worldwide 2012-2024

    • statista.com
    Cite
    Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How much time do people spend on social media?

    As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.

    Global social media usage

    Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively.
    People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

    Global impact of social media

    Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general.
    During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics and heightened everyday distractions.
    
  18. World Health Organization, Wild Animals Tested Negative for Rabies by...

    • geocommons.com
    Updated May 27, 2008
    Cite
    data (2008). World Health Organization, Wild Animals Tested Negative for Rabies by Country, World, 1988-2006 [Dataset]. http://geocommons.com/search.html
    Dataset updated
    May 27, 2008
    Dataset provided by
    data
    World Health Organization
    Description

    This dataset illustrates the number of wild animals with negative results for rabies, by country, from 1988 to 2006. A value of -1 means that no data was available. Source: World Health Organization URL: http://www.who.int/globalatlas/dataQuery/default.asp Date Accessed: October 24, 2007


Cite
ECMWF (2025). ERA5 hourly data on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.adbb2d47

Data from: ERA5 hourly data on single levels from 1940 to present

Available download formats: grib
Dataset updated
Oct 2, 2025
Dataset authored and provided by
ECMWF
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Time period covered
Jan 1, 1940 - Sep 26, 2025
Description

ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread. ERA5 is updated daily with a latency of about 5 days. In case that serious flaws are detected in this early release (called ERA5T), this data could be different from the final release 2 to 3 months later. 
If this occurs, users are notified. The dataset presented here is a regridded subset of the full ERA5 dataset on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 hourly data on single levels from 1940 to present".
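To give a sense of what the regridded resolutions mean in practice, the number of grid points of a global regular lat-lon grid can be estimated from the spacing. This is a rough sketch with a hypothetical helper name; it assumes the latitude axis includes both poles (90 to -90 degrees) and that the longitude axis does not repeat the 0/360 degree meridian.

```python
def regular_grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a global regular lat-lon grid.

    Assumes latitudes run from 90 to -90 inclusive and longitudes
    cover 0 up to (but not including) 360 degrees.
    """
    n_lat = int(round(180.0 / resolution_deg)) + 1
    n_lon = int(round(360.0 / resolution_deg))
    return n_lat, n_lon


# Resolutions quoted in the description.
for label, res in [("reanalysis", 0.25), ("uncertainty estimate", 0.5), ("ocean-wave uncertainty", 1.0)]:
    n_lat, n_lon = regular_grid_shape(res)
    print(f"{label}: {n_lat} x {n_lon} = {n_lat * n_lon} points per field")
```

Under these assumptions each hourly 0.25 degree field holds roughly a million values, which is worth keeping in mind when sizing download requests.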
