This dataset contains ERA5.1 surface level analysis parameter data for the period 2000-2006 from the 10 member ensemble runs. ERA5.1 is the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis re-run for 2000-2006 to address the cold bias in the lower stratosphere seen in ERA5 (see technical memorandum 859 in the linked documentation section for further details). Ensemble means and spreads are calculated from this 10 member ensemble, which is run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation in order to provide an uncertainty estimate for it. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system and translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: it is calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble mean and ensemble spread data. The main ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month, following on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data, ERA5t, is also available up to 5 days behind the present. A limited selection of data from these runs is also available via CEDA, whilst full access is available via the Copernicus Data Store.
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
This dataset contains ERA5 initial release (ERA5t) surface level analysis parameter data ensemble means (see linked dataset for spreads). ERA5t is the initial release of the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis, available up to 5 days behind the present date. CEDA will maintain a 6 month rolling archive of these data with overlap to the verified ERA5 data - see linked datasets on this record. The ensemble means and spreads are calculated from the ERA5t 10 member ensemble, which is run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation in order to provide an uncertainty estimate for it. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system and translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record. See linked datasets for ensemble member and spread data.
Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: it is calculated by dividing by 10 rather than 9 (N-1).
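As a minimal illustration of the difference (a sketch assuming a NumPy array holding the 10 member values for a single grid point and time step; the numbers are purely illustrative):

```python
import numpy as np

# Ten ensemble member values (including the control) for one grid point
# and time step; the values here are purely illustrative.
members = np.array([271.3, 271.8, 270.9, 271.5, 271.1,
                    271.6, 271.2, 271.4, 271.0, 271.7])

ensemble_mean = members.mean()

# Ensemble spread as described above: the population standard deviation,
# i.e. squared deviations divided by N = 10.
spread = members.std(ddof=0)

# Sample standard deviation (dividing by N - 1 = 9), which is NOT what
# the ensemble spread files contain.
sample_std = members.std(ddof=1)

print(ensemble_mean, spread, sample_std)
```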
The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month, following on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data will be subsequently reviewed and, if required, amended before the full ERA5 release. CEDA holds a 6 month rolling copy of the latest ERA5t data. See related datasets linked to from this record.
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/cc-by/cc-by_f24dc630aa52ab8c52a0ac85c03bc35e0abc850b4d7453bdc083535b41d5a5c3.pdf
ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called the analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread. ERA5 is updated daily with a latency of about 5 days. If serious flaws are detected in this early release (called ERA5T), the data could differ from the final release 2 to 3 months later; users are notified if this occurs. The dataset presented here is a regridded subset of the full ERA5 dataset on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 hourly data on single levels from 1940 to present".
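As a rough sketch of how the regridded single-level data can be requested programmatically, the example below uses the cdsapi Python client; the chosen variable, dates and output file name are placeholders rather than part of this record:

```python
import cdsapi

# Connect to the Climate Data Store (requires a free CDS account and a
# ~/.cdsapirc file containing the API key).
client = cdsapi.Client()

# Request one day of hourly 2 m temperature from the regridded
# "ERA5 hourly data on single levels" dataset described above.
client.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": "2m_temperature",
        "year": "2020",
        "month": "01",
        "day": "01",
        "time": [f"{h:02d}:00" for h in range(24)],
        "format": "netcdf",
    },
    "era5_t2m_20200101.nc",  # placeholder output file name
)
```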
Browse KC HRW-Chicago SRW Wheat Intercommodity Spread Synthetic Futures (CKW) market data. Get instant pricing estimates and make batch downloads of binary, CSV, and JSON flat files.
The CME Group Market Data Platform (MDP) 3.0 disseminates event-based bid, ask, trade, and statistical data for CME Group markets and also provides recovery and support services for market data processing. MDP 3.0 includes the introduction of Simple Binary Encoding (SBE) and Event Driven Messaging to the CME Group Market Data Platform. Simple Binary Encoding (SBE) is based on simple primitive encoding, and is optimized for low bandwidth, low latency, and direct data access. Since March 2017, MDP 3.0 has changed from providing aggregated depth at every price level (like CME's legacy FAST feed) to providing full granularity of every order event for every instrument's direct book. MDP 3.0 is the sole data feed for all instruments traded on CME Globex, including futures, options, spreads and combinations. Note: We classify exchange-traded spreads between futures outrights as futures, and option combinations as options.
Origin: Directly captured at Aurora DC3 with an FPGA-based network card and hardware timestamping. Synchronized to UTC with PTP
Supported data encodings: DBN, CSV, JSON Learn more
Supported market data schemas: MBO, MBP-1, MBP-10, TBBO, Trades, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, Definition, Statistics Learn more
Resolution: Immediate publication, nanosecond-resolution timestamps
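As a loose usage sketch, one of the downloaded OHLCV-1m CSV flat files could be loaded and resampled with pandas; the file name and the column names (ts_event, open, high, low, close, volume) are assumptions about the layout, not a documented specification:

```python
import pandas as pd

# Hypothetical OHLCV-1m flat file for this instrument; the column names
# below are assumed for illustration, not taken from the feed documentation.
df = pd.read_csv("ckw_ohlcv-1m.csv")

# Assume nanosecond-resolution event timestamps stored as integer
# nanoseconds since the Unix epoch.
df["ts_event"] = pd.to_datetime(df["ts_event"], unit="ns", utc=True)

# Resample the 1-minute bars to hourly bars as a simple usage example.
hourly = (
    df.set_index("ts_event")
      .resample("1h")
      .agg({"open": "first", "high": "max", "low": "min",
            "close": "last", "volume": "sum"})
)
print(hourly.head())
```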
https://creativecommons.org/publicdomain/zero/1.0/
An outbreak of the Zika virus, an infection transmitted mostly by the Aedes species mosquito (Ae. aegypti and Ae. albopictus), has been sweeping across the Americas and the Pacific since mid-2015. Although first isolated in 1947 in Uganda, a lack of previous research has challenged the scientific community to quickly understand its devastating effects as the epidemic continues to spread.
All Countries & Territories with Active Zika Virus Transmission
[Map image: http://www.cdc.gov/zika/images/zikamain_071416_880.jpg]
This dataset shares publicly available data related to the ongoing Zika epidemic. It is being provided as a resource to the scientific community engaged in the public health response. The data provided here is not official and should be considered provisional and non-exhaustive. The data in reports may change over time, reflecting delays in reporting or changes in classifications. And while accurate representation of the reported data is the objective in the machine readable files shared here, that accuracy is not guaranteed. Before using any of these data, it is advisable to review the original reports and sources, which are provided whenever possible along with further information on the CDC Zika epidemic GitHub repo.
The dataset includes the following fields:
report_date - The report date is the date that the report was published. The date should be specified in standard ISO format (YYYY-MM-DD).
location - A location is specified for each observation following the names given in the country place name database. This may be any place with a 'location_type' as listed below, e.g. city, state, country, etc. It should be specified at up to three hierarchical levels in the following format: [country]-[state/province]-[county/municipality/city], always beginning with the country name. If the data is for a particular city, e.g. Salvador, it should be specified as: Brazil-Bahia-Salvador.
location_type - A location code is included indicating: city, district, municipality, county, state, province, or country. If there is need for an additional 'location_type', open an Issue to create a new 'location_type'.
data_field - The data field is a short description of what data is represented in the row and is related to a specific definition defined by the report from which it comes.
data_field_code - This code is defined in the country data guide. It includes a two letter country code (ISO-3166 alpha-2), followed by a 4-digit number corresponding to a specific report type and data type.
time_period - Optional. If the data pertains to a specific period of time, for example an epidemiological week, that number should be indicated here and the type of time period in the 'time_period_type', otherwise it should be NA.
time_period_type - Required only if 'time_period' is specified. Types will also be specified in the country data guide. Otherwise should be NA.
value - The observation indicated for the specific 'report_date', 'location', 'data_field' and when appropriate, 'time_period'.
unit - The unit of measurement for the 'data_field'. This should conform to the 'data_field' unit options as described in the country-specific data guide.
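Putting the fields together, a single machine-readable observation might look like the following hypothetical record (the values and the data_field code are illustrative only, not taken from an actual report):

```python
# One hypothetical row of the machine-readable Zika files, expressed as
# a Python dict keyed by the field names defined above.
example_row = {
    "report_date": "2016-07-14",                 # ISO date the report was published
    "location": "Brazil-Bahia-Salvador",         # country-state-municipality
    "location_type": "municipality",
    "data_field": "zika_confirmed_laboratory",   # illustrative field name
    "data_field_code": "BR0011",                 # two-letter country code + 4 digits
    "time_period": "NA",                         # no specific epidemiological week
    "time_period_type": "NA",
    "value": 45,                                 # hypothetical count
    "unit": "cases",
}
```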
If you find the data useful, please support data sharing by referencing this dataset and the original data source. If you're interested in contributing to the Zika project from GitHub, you can read more here. The source for the Zika virus structure is available here.
This dataset contains ERA5.1 surface level analysis parameter data ensemble means over the period 2000-2006. ERA5.1 is the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis re-run for 2000-2006 to address the cold bias in the lower stratosphere seen in ERA5 (see technical memorandum 859 in the linked documentation section for further details). The ensemble means are calculated from the ERA5.1 10 member ensemble, which is run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation in order to provide an uncertainty estimate for it. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system and translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record. See linked datasets for ensemble member and spread data. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: it is calculated by dividing by 10 rather than 9 (N-1). The main ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month, following on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data, ERA5t, is also available up to 5 days behind the present. A limited selection of data from these runs is also available via CEDA, whilst full access is available via the Copernicus Data Store.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains all data used during the evaluation of trace meaning preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.
For more information, see the project repository at https://github.com/Trace-Share.
Selected Attack Traces
The following list contains trace datasets used for evaluation. Each attack was chosen to have not only a different meaning but also different statistical properties.
dos_http_flood — the capture of GET and POST requests sent to one server by one attacker (HTTP traffic);
ftp_bruteforce — short and unsuccessful attempt to guess a user’s password for FTP service (FTP traffic);
ponyloader_botnet — Pony Loader botnet used to steal credentials from 3 target devices reporting to a single IP with a large number of intermediate addresses (DNS and HTTP traffic);
scan — the capture of the nmap tool scanning a given subnet using ICMP echo and TCP SYN requests (consists of ARP, ICMP, and TCP traffic);
wannacry_ransomware — the capture of WannaCry ransomware spreading in a domain with three workstations, a domain controller, and a file-sharing server (SMB and SMBv2 traffic).
Background Traffic Data
The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, to reduce less significant traffic.
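A minimal sketch of this kind of filtering, using Scapy to keep only packets addressed to the victim range 172.31.69.0/24 (the input and output file names are placeholders):

```python
import ipaddress

from scapy.all import IP, rdpcap, wrpcap

victim_net = ipaddress.ip_network("172.31.69.0/24")

# Read the background capture and keep only packets whose destination
# address falls inside the victim subnet.
packets = rdpcap("thursday-01-03-2018.pcap")  # placeholder file name
filtered = [
    pkt for pkt in packets
    if IP in pkt and ipaddress.ip_address(pkt[IP].dst) in victim_net
]

wrpcap("background_victims_only.pcap", filtered)
print(f"kept {len(filtered)} of {len(packets)} packets")
```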
Evaluation Results and Dataset Structure
Traces variants (traces.zip)
./traces-original/ — trace PCAP files and crawled details in YAML format;
./traces-normalized/ — normalized PCAP files and details in YAML format;
./traces-adjusted/ — adjusted PCAP files using various timestamp generation settings, combination configuration in YAML format, and labels provided by ID2T in XML format.
Extracted alerts (alerts.zip)
./alerts-original/ — extracted Suricata alerts, Suricata log, and full Suricata output for all original trace files;
./alerts-normalized/ — extracted Suricata alerts, Suricata log, and full Suricata output for all normalized trace files;
./alerts-adjusted/ — extracted Suricata alerts, Suricata log, and full Suricata output for all adjusted trace files.
Evaluation results
*.csv files in the root directory — extracted alert signatures and their counts for each trace variant.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The common vampire bat (Desmodus rotundus) is a hematophagous bat species found across the North, Central, and South American continents. Desmodus rotundus is one of only three mammal species with an exclusively sanguivorous (i.e., blood) diet. The species has a large distributional range and preys on a large range of vertebrate species. Desmodus rotundus is a known reservoir for the rabies virus and contributes to the continued spread of this pathogen across Latin America. Nevertheless, little is known about the historical distribution of D. rotundus across its range. Historical occurrence data are critical for the assessment of past and current distributions for this species, and are necessary for a plethora of other ecological, biogeographic, and epidemiological studies. This is a dataset of D. rotundus historical occurrences including >37,000 locality reports across the Americas, intended to facilitate spatiotemporal studies of the species. Data and metadata definitions: the following table provides standardized definitions of each occurrence metadata field based on the Darwin Core Archive format. Each piece of metadata for each occurrence is organized and recorded under the listed column headers. Validation Code: this file contains code for usage and cleaning of the Desmodus rotundus record database. This code was also used for the Technical Validation process of the final D. rotundus dataset.
This dataset contains ERA5 surface level analysis parameter data ensemble means (see linked dataset for spreads). ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see linked documentation for further details. The ensemble means and spreads are calculated from the ERA5 10 member ensemble, which is run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation in order to provide an uncertainty estimate for it. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system and translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: it is calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble member and ensemble spread data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month, following on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data will be subsequently reviewed ahead of being released by ECMWF as quality assured data within 3 months. CEDA holds a 6 month rolling copy of the latest ERA5t data. See related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, and so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ERA5 is the fifth generation ECMWF reanalysis, providing hourly estimates of a large number of atmospheric, land and oceanic climate variables. The data cover the Earth on a 31 km grid and resolve the atmosphere using 137 levels from the surface up to a height of 80 km. ERA5 includes information about uncertainties for all variables at reduced spatial and temporal resolutions.
The COVID Tracking Project was a volunteer organization launched from The Atlantic and dedicated to collecting and publishing the data required to understand the COVID-19 outbreak in the United States. Our dataset was in use by national and local news organizations across the United States and by research projects and agencies worldwide. In the US, health data infrastructure has always been siloed. Fifty-six states and territories maintain pipelines to collect infectious disease data, each built differently and subject to different limitations. The unique constraints of these uncoordinated state data systems, combined with an absence of federal guidance on how states should report public data, created a big problem when it came to assembling a national picture of COVID-19’s spread in the US: Each state has made different choices about which metrics to report and how to define them—or has even had its hand forced by technical limitations. Those decisions have affected both The COVID Trac...
Survey-based Harmonized Indicators (SHIP) files are harmonized data files from household surveys that are conducted by countries in Africa. To ensure the quality and transparency of the data, it is critical to document the procedures for compiling consumption aggregates and other indicators so that the results can be duplicated with ease. This process enables consistency and continuity that make temporal and cross-country comparisons more reliable.
Four harmonized data files are prepared for each survey to generate a set of harmonized variables that have the same variable names. Invariably, in each survey, questions are asked in a slightly different way, which poses challenges for the consistent definition of harmonized variables. The harmonized household survey data present the best available variables with harmonized definitions, but not identical variables. The four harmonized data files are:
a) Individual level file (labor force indicators in a separate file): This file has information on basic characteristics of individuals such as age and sex, literacy, education, health, anthropometry and child survival. b) Labor force file: This file has information on the labor force, including employment/unemployment, earnings, sectors of employment, etc. c) Household level file: This file has information on household expenditure, household head characteristics (age and sex, level of education, employment), housing amenities, assets, and access to infrastructure and services. d) Household expenditure file: This file has consumption/expenditure aggregates by consumption groups according to the UN Classification of Individual Consumption According to Purpose (COICOP).
National
The survey covered all de jure household members (usual residents).
Sample survey data [ssd]
A multi-stage sampling technique was used in selecting the GLSS sample. Initially, 4565 households were selected for GLSS3, spread around the country in 407 small clusters; in general, 15 households were taken in an urban cluster and 10 households in a rural cluster. The actual achieved sample was 4552 households. Because of the sample design used, and the very high response rate achieved, the sample can be considered as self-weighting, though in the case of expenditure data weighting of the expenditure values is required.
Face-to-face [f2f]
This dataset contains ensemble spreads for the ERA5 surface level analysis parameter data ensemble means (see linked dataset). ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see linked documentation for further details. The ensemble means and spreads are calculated from the ERA5 10 member ensemble, which is run at a reduced resolution compared with the single high resolution (hourly output at 31 km grid spacing) 'HRES' realisation in order to provide an uncertainty estimate for it. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system and translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record. Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation: it is calculated by dividing by 10 rather than 9 (N-1). See linked datasets for ensemble member and ensemble mean data. The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month, following on from the ERA-15, ERA-40 and ERA-Interim reanalysis projects. An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data will be subsequently reviewed ahead of being released by ECMWF as quality assured data within 3 months. CEDA holds a 6 month rolling copy of the latest ERA5t data. See related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, and so new runs to address this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
https://www.usa.gov/government-works
After May 3, 2024, this dataset and webpage will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, and hospital capacity and occupancy data, to HHS through CDC’s National Healthcare Safety Network. Data voluntarily reported to NHSN after May 1, 2024, will be available starting May 10, 2024, at COVID Data Tracker Hospitalizations.
The following dataset provides facility-level data for hospital utilization aggregated on a weekly basis (Sunday to Saturday). These are derived from reports with facility-level granularity across two main sources: (1) HHS TeleTracking, and (2) reporting provided directly to HHS Protect by state/territorial health departments on behalf of their healthcare facilities.
The hospital population includes all hospitals registered with Centers for Medicare & Medicaid Services (CMS) as of June 1, 2020. It includes non-CMS hospitals that have reported since July 15, 2020. It does not include psychiatric, rehabilitation, Indian Health Service (IHS) facilities, U.S. Department of Veterans Affairs (VA) facilities, Defense Health Agency (DHA) facilities, and religious non-medical facilities.
For a given entry, the term “collection_week” signifies the start of the period that is aggregated. For example, a “collection_week” of 2020-11-15 means the average/sum/coverage of the elements captured from that given facility starting and including Sunday, November 15, 2020, and ending and including reports for Saturday, November 21, 2020.
Reported elements include a suffix of either “_coverage”, “_sum”, or “_avg”.
The file will be updated weekly. No statistical analysis is applied to impute non-response. For averages, calculations are based on the number of values collected for a given hospital in that collection week. Suppression is applied to the file for sums and averages less than four (4). In these cases, the field will be replaced with “-999,999”.
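Because the suppression sentinel can easily be mistaken for a valid count, a hedged loading sketch (the file name and the "_sum" column name are placeholders) might convert it to missing values up front:

```python
import pandas as pd

# Placeholder file name for the weekly facility-level extract.
df = pd.read_csv(
    "covid19_hospital_utilization_facility.csv",
    na_values=[-999999],  # suppressed sums/averages (values less than 4)
)

# collection_week marks the Sunday that starts each aggregation period.
df["collection_week"] = pd.to_datetime(df["collection_week"])

# Example: weekly total of a placeholder "_sum" column, with suppressed
# facility values treated as missing rather than counted as -999,999.
sum_col = "total_adult_patients_hospitalized_confirmed_covid_7_day_sum"  # assumed name
weekly = df.groupby("collection_week")[sum_col].sum()
print(weekly.tail())
```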
A story page was created to display both corrected and raw datasets and can be accessed at this link: https://healthdata.gov/stories/s/nhgk-5gpv
This data is preliminary and subject to change as more data become available. Data is available starting on July 31, 2020.
Sometimes, reports for a given facility will be provided to both HHS TeleTracking and HHS Protect. When this occurs, to ensure that there are not duplicate reports, deduplication is applied according to prioritization rules within HHS Protect.
For influenza fields listed in the file, the current HHS guidance marks these fields as optional. As a result, coverage of these elements varies.
For recent updates to the dataset, scroll to the bottom of the dataset description.
On May 3, 2021, the following fields have been added to this data set.
On May 8, 2021, this data set was converted to a corrected data set. The corrections applied to this data set are to smooth out data anomalies caused by keyed-in data errors. To help determine which records have had corrections made to them, an additional Boolean field called is_corrected has been added.
On May 13, 2021, vaccination fields were changed from sum fields to max or min fields. This reflects the maximum or minimum number reported for that metric in a given week.
On June 7, 2021, vaccination fields were changed from max or min fields to Wednesday-reported only. This reflects that the number for that metric is only reported on Wednesdays in a given week.
On September 20, 2021, the following has been updated: the use of an analytic dataset as a source.
On January 19, 2022, the following fields have been added to this dataset:
On April 28, 2022, the following pediatric fields have been added to this dataset:
On October 24, 2022, the data includes more analytical calculations in efforts to provide a cleaner dataset. For a raw version of this dataset, please follow this link: https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/uqq2-txqb
Due to changes in reporting requirements, after June 19, 2023, a collection week is defined as starting on a Sunday and ending on the next Saturday.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an updated and extended record of the Global Fire Atlas introduced by Andela et al. (2019). The input data (burned area and land cover products) have been updated to MODIS Collection 6.1 (the previous version was based on the Collection 6.0 burned area and Collection 5.1 land cover products, respectively). The time series is extended to cover the period 2002 to February 2024.
The method employed to create the dataset precisely follows the approach described by Andela et al. (2019).
The input burned area product is MCD64A1 Collection 6.1. It is described by Giglio et al. (2018) and available at: https://lpdaac.usgs.gov/products/mcd64a1v061/.
The input land cover product is MCD12Q1 Collection 6.1. It is described by Sulla-Menashe et al. (2019) and available at: https://lpdaac.usgs.gov/products/mcd12q1v061/.
Note that while the methods have remained the same compared to Andela et al. (2019), we do observe small differences between the Global Fire Atlas products originating from differences between the MCD64A1 collection 6.1 burned area data used here and the collection 6 data used in the original product. In addition, we observe more substantial differences in the dominant land cover class associated with each fire due to the differences between the MCD12Q1 collection 6.1 data used here and collection 5.1 data used in the original product.
The original dataset included time series from 2003 to 2016, including the full fire season for each year. For each MODIS tile, the fire season is defined as the twelve months centred on the month with peak burned area (see Andela et al., 2019). Here we extended the time series to include the fire season of 2002, and extended it until February 2024. Therefore, both the 2023 and 2024 files will contain incomplete records. For example, for a MODIS tile with peak burned area in December, the 2023 fire season would be defined as the period from July 2023 to June 2024, with the current record ending in February 2024. For the purpose of time-series analysis, we note that the 2002 product may have been affected by outages of Terra-MODIS (most notably, June 15 2001 - July 3 2001 and March 19 2002 - March 28 2002), which affect the burn date estimates and the Global Fire Atlas product. Following the launch of Aqua-MODIS in May 2002, burn date estimates are more reliable, as they are estimated from both MODIS sensors onboard Terra and Aqua.
Table 1: Overview of the Global Fire Atlas data layers. The shapefiles of ignition locations (point) and fire perimeters (polygon) contain attribute tables with summary information for each individual fire, while the underlying 500 m gridded layers reflect the day-to-day behavior of the individual fires. In addition, we provide aggregated monthly summary layers at a 0.25° resolution for regional and global analyses.
File name | Content |
SHP_ignitions.zip | Shapefiles of ignition locations with attribute tables (see Table 2) |
SHP_perimeters.zip | Shapefiles of final fire perimeters with attribute tables (see Table 2) |
GeoTIFF_direction.zip | 500 m resolution daily gridded data on direction of spread (8 classes) |
GeoTIFF_day_of_burn.zip | 500 m resolution daily gridded data on day of burn (day of year; 1-366) |
GeoTIFF_speed.zip | 500 m resolution daily gridded data on speed (km/day) |
GeoTIFF_fire_line.zip | 500 m resolution daily gridded data on the fire line (day of year; 1-366) |
GeoTIFF_monthly_summaries.zip | Aggregated 0.25° resolution monthly summary layers. These files include the sum of ignitions, average size (km2), average duration (days), average daily fire line (km), average daily fire expansion (km2/day), average speed (km/day), and dominant direction of spread (8 classes). |
Table 2: Overview of the Global Fire Atlas shapefile attribute tables. The shapefiles of ignition locations (point) and fire perimeters (polygon) contain attribute tables with summary information for each individual fire.
Attribute | Explanation / units |
lat, lon | Coordinates of ignition location (°) |
size | Fire size (km2) |
perimeter | Fire perimeter (km) |
start_date, start_DOY | Start date (yyyy-mm-dd), start day of year (1-366) |
end_date, end_DOY | End date (yyyy-mm-dd), end day of year (1-366) |
duration | Duration (days) |
fire_line | Average length of daily fire line (km) |
spread | Average daily fire growth (km2/day) |
speed | Average speed (km/day) |
direction, direc_frac | Dominant direction of spread (N, NE, E, SE, S, SW, W, NW) and associated fraction |
MODIS_tile | MODIS tile id |
landcover, landc_frac | MCD12Q1 dominant land cover class and fraction (UMD classification), provided for 2002-2023 |
GFED_regio | GFED region (van der Werf et al., 2017; available at https://www.globalfiredata.org/) |
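As a usage sketch, the perimeter shapefiles and their attribute tables can be read directly with GeoPandas; the extracted file name below follows the naming convention described in the next section and is only an example:

```python
import geopandas as gpd

# Read one fire season of final fire perimeters (extracted from
# SHP_perimeters.zip; the file name is the example quoted below and is
# used here as a placeholder).
perimeters = gpd.read_file("GFA_v20240409_perimeters_2003.shp")

# Attribute columns follow Table 2: size (km2), duration (days),
# speed (km/day), dominant direction of spread, etc.
large_fires = perimeters[perimeters["size"] > 100]  # fires larger than 100 km2
print(large_fires[["size", "duration", "speed", "direction"]].describe())
```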
File Naming Convention:
GFA_v{time-stamp}_{data-type}_{fire_season}.{file_type}
{time-stamp} = Date that code was run.
{data-type} = “ignitions” or “perimeters” for vector files; “day_of_burn”, “direction”, “fire_line”, or “speed” for raster files.
{fire_season} = the locally-defined fire season in which the fire was ignited (see more below).
{file_type} = ".shp" for vector files; ".tif" for raster files.
Fire Season Convention:
Please note that the year string in filenames refers to the locally-defined fire season in which the fire ignited, not the calendar year. Hence the file GFA_v20240409_perimeters_2003.shp can include fires from the 2003 fire season that ignited in the calendar years 2002 or 2004. This is particularly relevant in the Southern extratropics and the northern hemisphere subtropics, where the fire seasons often span the new year. The local definition of the fire season is based on climatological peak in burned area as described by Andela et al. (2019).
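A small sketch that unpacks this naming convention (plain string handling that simply mirrors the pattern above; the example file name is the one quoted in the text):

```python
import re

# GFA_v{time-stamp}_{data-type}_{fire_season}.{file_type}
GFA_PATTERN = re.compile(
    r"GFA_v(?P<timestamp>\d{8})_"
    r"(?P<data_type>ignitions|perimeters|day_of_burn|direction|fire_line|speed)_"
    r"(?P<fire_season>\d{4})\.(?P<file_type>shp|tif)$"
)

def parse_gfa_name(filename: str) -> dict:
    """Split a Global Fire Atlas file name into its components."""
    match = GFA_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"not a Global Fire Atlas file name: {filename}")
    return match.groupdict()

# Example from the text: perimeters shapefile for the 2003 fire season.
print(parse_gfa_name("GFA_v20240409_perimeters_2003.shp"))
# {'timestamp': '20240409', 'data_type': 'perimeters',
#  'fire_season': '2003', 'file_type': 'shp'}
```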
Projections:
Vector data are provided in geographic (latitude-longitude) coordinates on the WGS84 datum.
Raster data are provided on the MODIS sinusoidal projection used in NASA tiled products.
We have used a publicly available dataset, COVID-19 Tweets Dataset, consisting of an extensive collection of 1,091,515,074 tweet IDs, and continuously expanding. The dataset was compiled by tracking over 90 distinct keywords and hashtags commonly associated with discussions about the COVID-19 pandemic. From this massive dataset, we focused on a specific time frame, encompassing data from August 05, 2020, to August 26, 2020, to meet our research objectives. As this dataset contains only tweet IDs, we have used the Twitter developer API to retrieve the corresponding tweets from Twitter. This retrieval process involved searching for tweet IDs and extracting the associated tweet texts, and it was implemented using the Twython library. In total, we successfully collected 21,890 tweets during this data extraction phase.
Following guidelines set by the CDC and WHO, we categorized tweets into five distinct classes for classification: health risks, prevention, symptoms, transmission, and treatment. Specifically, individuals aged over sixty, or those with pre-existing health conditions such as heart disease, lung problems, weakened immune systems, or diabetes, are at higher risk of severe COVID-19 complications. Therefore, tweets categorized as ‘health risks’ pertain to the elevated risks associated with COVID-19 due to age or specific health conditions. ‘Prevention’ related tweets encompass discussions on preventive and precautionary measures regarding the COVID-19 pandemic. Tweets discussing common COVID-19 symptoms, including cough, congestion, breathing issues, fever, body aches, and more, are classified as ‘symptoms’ related tweets. Conversations pertaining to the spread of COVID-19 between individuals, between animals and humans, and contact with virus-contaminated objects or surfaces are categorized as ‘transmission’ related tweets. Lastly, tweets indicating vaccine development and drugs used for COVID-19 treatment fall under the ‘treatment’ related category.
We determined specific keywords for each of the five classes (health risks, prevention, symptoms, transmission, and treatment) based on the definitions provided by the CDC and WHO on their official websites. These definitions, along with their associated keywords, are detailed in Table 1. For instance, the CDC and WHO indicate that individuals over the age of sixty with conditions like heart disease, lung problems, weak immune systems, or diabetes face a higher risk of severe COVID-19 complications. In accordance with this definition, we selected relevant keywords such as “lung disease”, “heart disease”, “diabetes”, “weak immunity”, and others to identify tweets related to health risks within the larger tweet dataset. This approach was consistently applied to define keywords for the remaining four classes. Subsequently, we filtered the initial dataset of 21,890 tweets to extract tweets relevant to our predefined classes, resulting in a total of 6,667 tweets based on the selected keywords.
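A simplified sketch of this keyword-based filtering step (the keyword lists are abbreviated, partly illustrative stand-ins for the full Table 1 definitions; tweets matching no class are discarded):

```python
# Abbreviated keyword lists per class; the full lists come from the
# CDC/WHO-based definitions summarised in Table 1, and several entries
# here are illustrative rather than quoted from that table.
CLASS_KEYWORDS = {
    "health risks": ["lung disease", "heart disease", "diabetes", "weak immunity"],
    "prevention":   ["mask", "social distancing", "hand washing"],
    "symptoms":     ["cough", "fever", "congestion", "body ache"],
    "transmission": ["spread", "contact", "surface", "animal to human"],
    "treatment":    ["vaccine", "drug", "treatment"],
}

def assign_classes(tweet_text: str) -> list[str]:
    """Return every class whose keywords appear in the tweet text."""
    text = tweet_text.lower()
    return [label for label, keywords in CLASS_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]

# Keep only tweets that match at least one of the five classes.
tweets = ["High fever and a dry cough since Monday...",
          "Stocks rallied today."]
filtered = [(t, assign_classes(t)) for t in tweets if assign_classes(t)]
print(filtered)
```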
To ensure the accuracy of our dataset, two separate annotators individually assigned the 6,667 tweets to the five classes. A third annotator, a natural language expert, meticulously cross-checked the dataset and provided necessary corrections. Subsequently, the two annotators resolved any discrepancies through mutual agreement, resulting in the final annotated dataset. Our dataset comprises a total of 6,667 data points categorized into five classes: 978, 2046, 1402, 802, and 1439 tweets annotated as ‘health risk’, ‘prevention’, ‘symptoms’, ‘transmission’, and ‘treatment’, respectively.
Browse Natural Gas (Henry Hub) Last-day Financial 1 Month Spread Option (G4X) market data. Get instant pricing estimates and make batch downloads of binary, CSV, and JSON flat files.
The CME Group Market Data Platform (MDP) 3.0 disseminates event-based bid, ask, trade, and statistical data for CME Group markets and also provides recovery and support services for market data processing. MDP 3.0 includes the introduction of Simple Binary Encoding (SBE) and Event Driven Messaging to the CME Group Market Data Platform. Simple Binary Encoding (SBE) is based on simple primitive encoding, and is optimized for low bandwidth, low latency, and direct data access. Since March 2017, MDP 3.0 has changed from providing aggregated depth at every price level (like CME's legacy FAST feed) to providing full granularity of every order event for every instrument's direct book. MDP 3.0 is the sole data feed for all instruments traded on CME Globex, including futures, options, spreads and combinations. Note: We classify exchange-traded spreads between futures outrights as futures, and option combinations as options.
Origin: Directly captured at Aurora DC3 with an FPGA-based network card and hardware timestamping. Synchronized to UTC with PTP
Supported data encodings: DBN, CSV, JSON Learn more
Supported market data schemas: MBO, MBP-1, MBP-10, TBBO, Trades, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, Definition, Statistics Learn more
Resolution: Immediate publication, nanosecond-resolution timestamps
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Ionizing radiation is being used more frequently in medicine, and this use has been linked to recognized biological effects such as cancer and mortality. Radiology services are becoming more widely available in Ethiopian health facilities, but there is no compiled record of workers' radiation doses. Assessing the magnitude and identifying the associated factors of occupational radiation exposure dose among radiology personnel therefore helps to design strategies for radiation protection. Objective: The study was designed to assess the occupational radiation exposure dose and associated factors among radiology personnel in eastern Amhara, northeast Ethiopia, 2021. Methods: A cross-sectional study was conducted from March 25 to April 30, 2021, in 57 health institutions among 198 radiology personnel. The study comprised all eligible radiology personnel. The data were collected using an electronic (Google Forms) self-administered questionnaire and document review. The data were entered into an Excel spreadsheet and then exported to Stata 14 software. A linear regression model was used to analyse the data after checking its assumptions. Variables with a p-value < 0.25 were entered into a multiple linear regression analysis, and those with a p-value < 0.05 were judged significant. VIF was used to check for multi-collinearity. The coefficient of determination was used to check the model fitness. Results: The mean (± SD) annual shallow and deep dose equivalents of radiology personnel were 1.20 (± 0.75) and 1.02 (± 0.70) mSv, respectively. Body mass index (β = 0.104, 95% CI: 0.07, 0.14), practice of timing (β = -0.43, 95% CI: -0.73, -0.13), working experience (β = -0.04, 95% CI: -0.048, -0.032), and practice of distancing (β = -0.26, 95% CI: -0.49, -0.17) were found to be statistically significant factors of annual deep dose equivalent. In addition, body mass index (β = 0.113, 95% CI: 0.08, 0.15), practice of timing (β = -0.62, 95% CI: -0.93, -0.31) and working experience (β = -0.044, 95% CI: -0.053, -0.036) had statistically significant associations with annual shallow dose equivalent. Conclusion: The annual dose equivalents were two times higher than the global average annual per caput effective dose due to medical exposure. Body mass index, practice of timing, working experience, and practice of distancing were factors of occupational radiation exposure dose. Strategies focusing on increasing the skill, experience, and lifestyle of radiology personnel would be supremely important means to reduce occupational radiation exposure dose.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These datasets are part of research on open data in the municipality of Delft. The research focuses on the motivation perspectives of citizens to engage in democratic processes by using open data. By understanding these perspectives, the municipality can adapt its open data policy to the characteristics of (potential) users and the wishes of the citizens. To identify these motivation perspectives, Q-methodology was used. A survey was used that asks participants to rank a number of statements. The survey was distributed among citizens of Delft. In total, 22 people participated in the survey. The gathered data were used to conduct a factor analysis and identify motivation perspectives among citizens of Delft. These datasets contain the gathered Q-sorts and the conducted analyses.
Browse Japanese Yen Futures (6J) market data. Get instant pricing estimates and make batch downloads of binary, CSV, and JSON flat files.
The CME Group Market Data Platform (MDP) 3.0 disseminates event-based bid, ask, trade, and statistical data for CME Group markets and also provides recovery and support services for market data processing. MDP 3.0 includes the introduction of Simple Binary Encoding (SBE) and Event Driven Messaging to the CME Group Market Data Platform. Simple Binary Encoding (SBE) is based on simple primitive encoding, and is optimized for low bandwidth, low latency, and direct data access. Since March 2017, MDP 3.0 has changed from providing aggregated depth at every price level (like CME's legacy FAST feed) to providing full granularity of every order event for every instrument's direct book. MDP 3.0 is the sole data feed for all instruments traded on CME Globex, including futures, options, spreads and combinations. Note: We classify exchange-traded spreads between futures outrights as futures, and option combinations as options.
Origin: Directly captured at Aurora DC3 with an FPGA-based network card and hardware timestamping. Synchronized to UTC with PTP
Supported data encodings: DBN, CSV, JSON Learn more
Supported market data schemas: MBO, MBP-1, MBP-10, TBBO, Trades, OHLCV-1s, OHLCV-1m, OHLCV-1h, OHLCV-1d, Definition, Statistics Learn more
Resolution: Immediate publication, nanosecond-resolution timestamps