https://spdx.org/licenses/CC0-1.0.html
Objective
Daily COVID-19 data reported by the World Health Organization (WHO) may provide the basis for political ad hoc decisions, including travel restrictions. Data reported by countries, however, are heterogeneous, and metrics to evaluate their quality are scarce. In this work, we analyzed COVID-19 case counts provided by WHO and developed tools to evaluate country-specific reporting behaviors.
Methods
In this retrospective cross-sectional study, COVID-19 data reported daily to WHO from 3rd January 2020 until 14th June 2021 were analyzed. We proposed the concepts of binary reporting rate and relative reporting behavior and performed descriptive analyses for all countries with these metrics. We developed a score to evaluate the consistency of incidence and binary reporting rates. Further, we performed spectral clustering of the binary reporting rate and relative reporting behavior to identify salient patterns in these metrics.
Results
Our final analysis included 222 countries and regions. Reporting scores varied between -0.17, indicating discrepancies between incidence and binary reporting rate, and 1.0, suggesting high consistency of these two metrics. The median reporting score across all countries was 0.71 (IQR 0.55 to 0.87). Descriptive analyses of the binary reporting rate and relative reporting behavior showed constant reporting with a slight “weekend effect” for most countries, while spectral clustering demonstrated that some countries had even more complex reporting patterns.
Conclusion
The majority of countries reported COVID-19 cases when they had cases to report. The identification of a slight “weekend effect” suggests that COVID-19 case counts reported in the middle of the week may represent the best data basis for political ad hoc decisions. A few countries, however, showed unusual or highly irregular reporting that might require more careful interpretation. Our scoring system and cluster analyses might be applied by epidemiologists advising policymakers to consider country-specific reporting behaviors in political ad hoc decisions.
Methods
Data collection: COVID-19 data was downloaded from WHO. Using a public repository, we added the countries' full names to the WHO data set, using the two-letter abbreviation for each country to merge both data sets. The provided COVID-19 data covers January 2020 until June 2021. We uploaded the final data set used for the analyses of this paper.
Data processing: We processed data using a Jupyter Notebook with a Python kernel and publicly available external libraries. This upload contains the required Jupyter Notebook (reporting_behavior.ipynb) with all analyses and some additional work, a README, and the conda environment yml (env.yml).
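To make the binary reporting rate concept described above concrete, here is a minimal pandas sketch. The file and column names (who_covid19_daily.csv, Date_reported, Country, New_cases) are illustrative assumptions, not the authors' actual code from reporting_behavior.ipynb:

```python
import pandas as pd

# Hypothetical file and column names; the WHO daily CSV uses similar
# fields, but check reporting_behavior.ipynb for the real pipeline.
df = pd.read_csv("who_covid19_daily.csv", parse_dates=["Date_reported"])

# Assuming one row per country per day, with missing reports as NaN:
# a day counts as "reported" if any case value was submitted that day.
reported = df.assign(reported=df["New_cases"].notna())

# Binary reporting rate: fraction of days in the study period on which
# a country reported at all.
binary_rate = (
    reported.groupby("Country")["reported"]
    .mean()
    .rename("binary_reporting_rate")
)
print(binary_rate.sort_values().head())
```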
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets:
"Figure1a.csv": scattering intensity of hydrated proteins in Wide-Angle X-ray Scattering for different fluences (in units of photons/second/area). "Figure1a_inset.csv": scattering intensity of hydrated proteins in Small-Angle X-ray Scattering for different fluences (in units of photons/second/area). "Figure1b.csv": Intensity autocorrelation functions g2 at momentum transfer Q = 0.08 1/nm for different fluences (in units of photons/second/area). "Figure1b_inset.csv": decay rate (in second) as a function of the momentum transfer Q (in 1/nm) for different fluences (in units of photons/second/area). "Figure1c.csv": decay rate (in second) for variable fluence (in photons/second/um^2) at the momentum transfer Q = 0.08 1/nm. "Figure1d.csv": renormalised intensity autocorrelation functions g2 at momentum transfer Q = 0.08 1/nm for variable fluence (in photons/second/um^2), where the time axis is normalised to the corresponding fluence F by calculating t/(1 + a · F·τ0), where τ0 is the equilibrium time constant extracted by extrapolation to F=0 (from data in "Figure1c.csv)" "Figure2a.csv": The Wide-Angle X-ray Scattering scattering intensity at different temperatures T=180-290 K "Figure2b.csv": The Small-Angle X-ray Scattering scattering intensity at different temperatures T=180-290 K "Figure2c.csv": Intensity autocorrelation functions g2 for different temperatures (T=180-290 K) at momentum transfer Q = 0.1 1/nm. "Figure2d-2e.csv": time constants (in second) and the Kohlrausch-Williams-Watts (KWW) exponent extracted from the fits of data in "Figure2c.csv" as a function of temperature (in K) "Figure3b.csv": The normalised variance Chi_T at different temperatures (T=180-290 K) extracted from the two-time correlation functions. "Figure3c.csv": The maximum of the normalised variance Chi_0 as a function of temperature (in K).
Additionally, a Jupyter notebook "open-data.ipynb" is included, which shows how to load and plot the data from the CSV files in Python.
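For a quick look before opening open-data.ipynb, a minimal sketch of loading and plotting one of the files; the column layout (x-axis in the first column, one intensity column per fluence) is an assumption, so defer to the notebook:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed layout: first column is the x-axis (e.g. Q), remaining
# columns are intensities for the different fluences.
df = pd.read_csv("Figure1a.csv")
x = df.iloc[:, 0]
for col in df.columns[1:]:
    plt.plot(x, df[col], label=col)
plt.xlabel("Q (1/nm)")
plt.ylabel("scattering intensity")
plt.legend()
plt.show()
```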
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an Australian extract of Speedtest Open Data available at Amazon AWS (link below - opendata.aws). The AWS data licence is "CC BY-NC-SA 4.0", so use of this data must be non-commercial (NC) and reuse must be share-alike (SA) (add the same licence). This restricts the standard CC-BY Figshare licence.
The world Speedtest open data was downloaded (>400Mb, 7M lines of data). An extract for Australia's location (lat, long) revealed 88,000 lines of data (attached as csv). A Jupyter notebook of the extract process is attached. A link to a Twitter thread of outputs is provided, along with a link to a data tutorial (GitHub) including a Jupyter Notebook to analyse world Speedtest data, selecting one US state.
Data shows (Q2):
- 3.1M speedtests
- 762,000 devices
- 88,000 grid locations (600m * 600m), each summarised as a point
- average speed 33.7Mbps (down), 12.4Mbps (up)
- max speed 724Mbps
- data is for 600m * 600m grids, showing average speed up/down, number of tests, and number of users (IP). Added centroid, and now lat/long. See the attached tweet image of centroids.
Versions:
v15/16 - Add histogram comparing Q1-21 vs Q2-20. Inc ipynb (incHistQ121, v.1.3-Q121) to calc.
v14 - Add AUS Speedtest Q1 2021 geojson (79k lines, avg d/l 45.4Mbps).
v13 - Added three-colour MELB map (less than 20Mbps, over 90Mbps, 20-90Mbps).
v12 - Added AUS - Syd - Mel line chart Q320.
v11 - Add line chart comparing Q2, Q3, Q4 plus Melb - results virtually indistinguishable. Add line chart to compare Syd - Melb Q3; also virtually indistinguishable. Add histogram comparing Syd - Melb Q3. Add new Jupyter notebook with graph calcs (nbn-AUS-v1.3). Some ERRATA documented in the notebook: an issue with resorting the table and graphing only part of it; not an issue if all lines of the table are graphed.
v10 - Load AURIN sample pics. Speedtest data loaded to the AURIN geo-analytic platform; requires edu.au login.
v9 - Add comparative Q2, Q3, Q4 histogram pic.
v8 - Added Q4 data geojson. Add Q3, Q4 histogram pic.
v7 - Rename to include Q2, Q3 in title.
v6 - Add Q3 20 data. Rename geojson AUS data as Q2. Add comparative histogram. Calc in International.ipynb.
v5 - Add Jupyter Notebook inc histograms. Hist is a count of geo-locations by average download speed (unweighted by tests).
v4 - Added Melb choropleth (png 50Mpix) inc legend. (To do - add Melb.geojson). Posted link to AURIN description of Speedtest data.
v3 - Add super-fast data (>100Mbps), less than 1% of data - 697 lines. Includes png of superfast.plot(). Link below to Google Maps version of super-fast data points, also a Google map of the first 100 data points as sample data. Geojson format for loading into GeoPandas, per the Jupyter Notebook. New version of Jupyter Notebook, v.1.1.
v2 - Add centroids image.
v1 - Initial data load.
Future work:
- combine Speedtest data with NBN technology-by-location data (nationalmap.gov.au); https://www.data.gov.au/dataset/national-broadband-network-connections-by-technology-type
- combine Speedtest data with SEIFA data - socioeconomic categories - to discuss with AURIN
- further international comparisons
- discussed collaboration with Assoc Prof Tooran Alizadeh, USyd
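As an illustration of working with the attached geojson in GeoPandas, a minimal sketch; the file name and the Ookla open-data column name avg_d_kbps are assumptions, so check the attached notebooks for the actual analysis:

```python
import geopandas as gpd
import matplotlib.pyplot as plt

# File name is illustrative; column names assume the Ookla open-data
# schema (avg_d_kbps, avg_u_kbps, tests, devices).
gdf = gpd.read_file("aus_speedtest_q2.geojson")

# Convert kbps to Mbps and plot the (unweighted) distribution of
# average download speed per 600m grid location.
gdf["down_mbps"] = gdf["avg_d_kbps"] / 1000
gdf["down_mbps"].hist(bins=50)
plt.xlabel("average download speed (Mbps)")
plt.ylabel("number of grid locations")
plt.show()
```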
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This page contains the accompanying deep learning models, dataset, and code for the MargNet paper, "Photometric identification of compact galaxies, stars and quasars using multiple neural networks".
Deep Learning Models:
MargNet is a deep learning-based classifier for identifying stars, quasars and compact galaxies using photometric parameters and images from the Sloan Digital Sky Survey. MargNet consists of a combination of Convolutional Neural Network (CNN) and Artificial Neural Network (ANN) architectures. The deep learning Keras model for each experiment was saved as an h5 file after training. All saved models (organised by different experiments, as described in the paper) are available in SavedModels.zip.
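Loading one of the saved models might look like the following minimal sketch; the file path is illustrative, since the actual names inside SavedModels.zip are organised by experiment as described in the paper:

```python
from tensorflow import keras

# Path is a placeholder for one of the experiment-specific h5 files
# extracted from SavedModels.zip.
model = keras.models.load_model("SavedModels/experiment1/margnet.h5")
model.summary()
```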
Dataset:
Our dataset consists of 240,000 compact objects and an additional 150,000 faint objects, comprising equal numbers of stars, galaxies and quasars. This data is available as NumPy arrays and CSV files, as described below:
SDSS ObjID of each object (objlist.npy)
SDSS 5-band images of each object cropped to 32*32 pixels (X.npy)
The set of 24 photometric features for each object (dnnx.npy)
The classification label for each object (y.npy)
SDSS spreadsheet containing all the features from dnnx, labels from y, ObjIDs from objlist, and a few additional SDSS-specific parameters (photofeatures.csv)
The complete dataset (organised by different experiments, as described in the paper) is available in Dataset.zip. (Note: objlist, X, dnnx and y are in the same order. So, objlist[0], X[0], dnnx[0] and y[0] correspond to the same object.)
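A minimal sketch of loading the NumPy arrays and checking their shared ordering; paths assume an extracted Dataset.zip:

```python
import numpy as np

# The four arrays are index-aligned: element i of each file describes
# the same object (as noted above).
objlist = np.load("objlist.npy")   # SDSS ObjIDs
X = np.load("X.npy")               # 5-band 32*32 image cutouts
dnnx = np.load("dnnx.npy")         # 24 photometric features
y = np.load("y.npy")               # classification labels

assert len(objlist) == len(X) == len(dnnx) == len(y)
print(objlist[0], X[0].shape, dnnx[0].shape, y[0])
```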
Code:
All our code was written in Python in the form of Jupyter Notebooks. A copy of our code has also been made available on GitHub, but not all files could be included there due to the storage limit, so a complete copy of the repository has also been mirrored here on Zenodo as MargNet_RepositoryMirror.tgz.
Atmospheric water vapor pressure is an essential meteorological control on land surface and hydrologic processes. It is not as frequently observed as other meteorological conditions, but is often inferred through the August–Roche–Magnus formula (sketched after this list) by simply assuming dew point and daily minimum temperatures are equivalent, or by empirically correlating the two temperatures using an aridity correction. The performance of both methods varies considerably across regions and time periods; obtaining consistently accurate estimates across space and time remains a great challenge. We applied an interpretable Long Short-Term Memory (iLSTM) network conditioned on static, location-specific attributes to estimate daily vapor pressure for 83 FLUXNET sites in the United States and Canada. This data package includes all raw data for the 83 FLUXNET sites, input data for model training/validation/testing, trained models and results, and Python code for the manuscript "Improving the Estimation of the Atmospheric Water Vapor Pressure Using an Interpretable Long Short-term Memory Network". Specifically, it consists of five parts:
- "1_Daymet_data_83sites.zip" includes raw data downloaded from Daymet for the 83 sites used in the paper according to their longitude and latitude, of which vapor pressure is used. It also includes a pre-processed CSV file combining all data from the 83 sites, which is specifically used for the paper.
- "2_Fluxnet2015_data_83sites.zip" includes raw half-hourly data for the 83 sites downloaded from the FLUXNET2015 data portal, pre-processed daily data for the 83 sites, a CSV file of combined pre-processed daily data for the 83 sites, and a CSV file with the information (site ID, site name, latitude, longitude, data availability period) of the 83 sites.
- "3_MODIS_LAI_data_83sites_raw.zip" includes raw leaf area index (LAI) data downloaded from the AppEEARS data portal.
- "4_Scripts.zip" includes all scripts related to model training and post-processing of a trained model, and a Jupyter notebook showing an example of model post-processing. Two typos in the files "run2get_args.py" and "postprocess.py" were corrected on March 27, 2024 to avoid confusion.
- "Trained_models_and_results.zip" includes three folders and three ".npy" files; each folder corresponds to the ".npy" file with the same title. Each of the three folders includes all trained models associated with one iLSTM model configuration (35 models per configuration; details are described in the paper). Each ".npy" file includes the post-processed results of the corresponding 35 models under one iLSTM model configuration.
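For reference, a minimal sketch of the August–Roche–Magnus baseline mentioned above, assuming the commonly used Alduchov–Eskridge coefficients and the dew point ≈ daily minimum temperature substitution; the manuscript may use a different variant:

```python
import numpy as np

def magnus_vapor_pressure(t_min_c):
    """Estimate vapor pressure (hPa) via the August-Roche-Magnus
    formula, assuming dew point equals the daily minimum temperature.

    Coefficients are the widely used Alduchov-Eskridge values; treat
    them as an assumption, not necessarily the paper's exact variant.
    """
    return 6.1094 * np.exp(17.625 * t_min_c / (t_min_c + 243.04))

# Example: a day with a 10 degC minimum temperature
print(magnus_vapor_pressure(10.0))  # ~12.3 hPa
```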
Load, wind and solar, prices in hourly resolution. This data package contains different kinds of time series data relevant for power system modelling, namely electricity prices, electricity consumption (load), and wind and solar power generation and capacities. The data is aggregated either by country, control area, or bidding zone. Geographical coverage includes the EU and some neighbouring countries. All variables are provided in hourly resolution. Where original data is available in higher resolution (half-hourly or quarter-hourly), it is provided in separate files. This package version only contains data provided by TSOs and power exchanges via ENTSO-E Transparency, covering the period 2015 to mid-2020. See previous versions for historical data from a broader range of sources. All data processing is conducted in Python/pandas and has been documented in the Jupyter notebooks linked below.
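A minimal sketch of loading one of the hourly files with pandas; the file name and column name follow the package's singleindex CSV convention but should be treated as assumptions, with the linked notebooks as the authoritative reference:

```python
import pandas as pd

# File and column names are assumptions based on the package's
# singleindex CSV convention; verify against the actual download.
ts = pd.read_csv(
    "time_series_60min_singleindex.csv",
    index_col="utc_timestamp",
    parse_dates=["utc_timestamp"],
)

# Example: daily mean of one hourly load column (name is illustrative).
daily_load = ts["DE_load_actual_entsoe_transparency"].resample("D").mean()
print(daily_load.head())
```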