100+ datasets found

1000 Empirical Time series
figshare.com
researchdata.edu.au
png
Updated May 30, 2023
Cite
Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.5436136.v10
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Ben Fulcher
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.

The results of the computation are in the hctsa file, HCTSA_Empirical1000.mat for use in Matlab using v1.06 of hctsa.

The same data is also provided in .csv format for the hctsa_datamatrix.csv (results of feature computation), with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv), and the data of individual time series (each line a time series, for time series described in hctsa_timeseries-info.csv) is in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa.

The input file, INP_Empirical1000.mat, is for use with hctsa, and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');

Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package.

See links in references for more comprehensive documentation for performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.
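For readers working outside Matlab, the following is a minimal Python sketch of reading a file laid out like hctsa_timeseries-data.csv, where each line holds one time series. It assumes comma-separated numeric values with no header row and allows series of different lengths; these layout details are assumptions, not something the description confirms. The helper name read_time_series_rows is hypothetical.

```python
import csv
import io


def read_time_series_rows(handle):
    """Parse one time series per line; series may differ in length.

    Assumes comma-separated numeric cells and no header row (an assumption
    about the file layout, not a documented guarantee).
    """
    series = []
    for row in csv.reader(handle):
        values = [float(cell) for cell in row if cell.strip() != ""]
        if values:  # skip blank lines
            series.append(values)
    return series


# Tiny in-memory stand-in for the real file (values are made up).
demo = io.StringIO("0.1,0.4,0.9\n1.0,2.0\n")
rows = read_time_series_rows(demo)
print(len(rows), [len(r) for r in rows])
```

With the downloaded file, the same function could be called on `open("hctsa_timeseries-data.csv", newline="")`, provided the layout assumptions hold.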
Time series
data.open-power-system-data.org
csv, sqlite, xlsx
Updated Oct 6, 2020
Cite
Jonathan Muehlenpfordt (2020). Time series [Dataset]. http://doi.org/10.25832/time_series/2020-10-06
Explore at:
Available download formats: csv, sqlite, xlsx
Unique identifier
https://doi.org/10.25832/time_series/2020-10-06
Dataset updated
Oct 6, 2020
Dataset provided by
Open Power System Data
Authors
Jonathan Muehlenpfordt
Time period covered
Jan 1, 2015 - Oct 1, 2020
Variables measured
utc_timestamp, DE_wind_profile, DE_solar_profile, DE_wind_capacity, DK_wind_capacity, SE_wind_capacity, CH_solar_capacity, DE_solar_capacity, DK_solar_capacity, AT_price_day_ahead, and 290 more
Description
Load, wind and solar, prices in hourly resolution. This data package contains different kinds of timeseries data relevant for power system modelling, namely electricity prices, electricity consumption (load) as well as wind and solar power generation and capacities. The data is aggregated either by country, control area or bidding zone. Geographical coverage includes the EU and some neighbouring countries. All variables are provided in hourly resolution. Where original data is available in higher resolution (half-hourly or quarter-hourly), it is provided in separate files. This package version only contains data provided by TSOs and power exchanges via ENTSO-E Transparency, covering the period 2015-mid 2020. See previous versions for historical data from a broader range of sources. All data processing is conducted in Python/pandas and has been documented in the Jupyter notebooks linked below.
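The description notes that higher-resolution (half-hourly or quarter-hourly) data is shipped separately from the hourly files. A minimal sketch of aggregating such readings to hourly means is shown below. The utc_timestamp column name comes from the variable list above; the timestamp format, the demo_value column, and the hourly_mean helper are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean(records, value_key, time_key="utc_timestamp"):
    """Average sub-hourly readings into hourly buckets, skipping empty cells.

    Assumes timestamps look like 2015-01-01T00:15:00Z (an assumed format).
    """
    buckets = defaultdict(list)
    for rec in records:
        raw = rec.get(value_key, "")
        if raw in ("", None):
            continue  # missing observation
        stamp = datetime.strptime(rec[time_key], "%Y-%m-%dT%H:%M:%SZ")
        buckets[stamp.replace(minute=0, second=0)].append(float(raw))
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}


# Made-up quarter-hourly readings; "demo_value" is a hypothetical column.
sample = [
    {"utc_timestamp": "2015-01-01T00:00:00Z", "demo_value": "1.0"},
    {"utc_timestamp": "2015-01-01T00:15:00Z", "demo_value": "2.0"},
    {"utc_timestamp": "2015-01-01T00:30:00Z", "demo_value": "3.0"},
    {"utc_timestamp": "2015-01-01T00:45:00Z", "demo_value": ""},
]
print(hourly_mean(sample, "demo_value"))
```

Rows read with `csv.DictReader` have the same dictionary shape as the sample, so they could be passed in directly.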
Controlled Anomalies Time Series (CATS) Dataset
zenodo.org
bin
Updated Jul 12, 2024
Cite
Patrick Fleith (2024). Controlled Anomalies Time Series (CATS) Dataset [Dataset]. http://doi.org/10.5281/zenodo.7646897
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7646897
Dataset updated
Jul 12, 2024
Dataset provided by
Solenix Engineering GmbH
Authors
Patrick Fleith
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.

The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:

Multivariate (17 variables) including sensor readings and control signals. It simulates the operational behaviour of an arbitrary complex system including:

4 Deliberate Actuations / Control Commands sent by a simulated operator / controller, for instance, commands of an operator to turn ON/OFF some equipment.

3 Environmental Stimuli / External Forces acting on the system and affecting its behaviour, for instance, the wind affecting the orientation of a large ground antenna.

10 Telemetry Readings representing the observable states of the complex system by means of sensors, for instance, a position, a temperature, a pressure, a voltage, current, humidity, velocity, acceleration, etc.

5 million timestamps. Sensor readings are sampled at 1 Hz.

1 million nominal observations (the first 1 million datapoints). This is suitable to start learning the "normal" behaviour.

4 million observations that include both nominal and anomalous segments. This is suitable to evaluate both semi-supervised approaches (novelty detection) as well as unsupervised approaches (outlier detection).

200 anomalous segments. One anomalous segment may contain several successive anomalous observations / timestamps. Only the last 4 million observations contain anomalous segments.

Different types of anomalies to understand what anomaly types can be detected by different approaches.

Fine control over ground truth. As this is a simulated system with deliberate anomaly injection, the start and end time of the anomalous behaviour is known very precisely. In contrast to real world datasets, there is no risk that the ground truth contains mislabelled segments which is often the case for real data.

Obvious anomalies. The simulated anomalies have been designed to be "easy" for human eyes to detect (i.e., there are very large spikes or oscillations), and hence also detectable by most algorithms. This makes the synthetic dataset useful for screening tasks (i.e., to eliminate algorithms that are not capable of detecting those obvious anomalies). However, during our initial experiments, the dataset turned out to be challenging enough even for state-of-the-art anomaly detection approaches, making it suitable also for regular benchmark studies.

Context provided. Some variables can only be considered anomalous in relation to other behaviours. A typical example consists of a light and switch pair. The light being either on or off is nominal, the same goes for the switch, but having the switch on and the light off shall be considered anomalous. In the CATS dataset, users can choose (or not) to use the available context, and external stimuli, to test the usefulness of the context for detecting anomalies in this simulation.

Pure signal ideal for robustness-to-noise analysis. The simulated signals are provided without noise: while this may seem unrealistic at first, it is an advantage since users of the dataset can decide to add on top of the provided series any type of noise and choose an amplitude. This makes it well suited to test how sensitive and robust detection algorithms are against various levels of noise.

No missing data. You can drop whatever data you want to assess the impact of missing values on your detector with respect to a clean baseline.
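The properties above suggest a simple preparation workflow: train on the first 1 million nominal observations, evaluate on the remainder, and optionally degrade the clean signals with noise or dropped values. The following is a minimal sketch of those three steps on plain Python lists. The function names, the Gaussian noise model, and the uniform random dropping are illustrative assumptions; the dataset does not prescribe any of them.

```python
import random

# Per the description, the first 1 million datapoints are nominal.
NOMINAL_COUNT = 1_000_000


def split_nominal(rows, nominal_count=NOMINAL_COUNT):
    """Return (nominal training rows, mixed evaluation rows)."""
    return rows[:nominal_count], rows[nominal_count:]


def add_gaussian_noise(values, sigma, seed=0):
    """Add zero-mean Gaussian noise of standard deviation sigma (assumed noise model)."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


def drop_at_random(values, fraction, seed=0):
    """Replace roughly `fraction` of the values with None to mimic missing data."""
    rng = random.Random(seed)
    return [None if rng.random() < fraction else v for v in values]


# Toy stand-in for one telemetry channel (values are made up).
channel = [float(i % 5) for i in range(20)]
train, evaluation = split_nominal(channel, nominal_count=4)
noisy = add_gaussian_noise(evaluation, sigma=0.1)
sparse = drop_at_random(evaluation, fraction=0.25)
print(len(train), len(evaluation), sum(v is None for v in sparse))
```

Fixing the seed keeps the degraded copies reproducible, which helps when comparing detectors against the same clean baseline.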

[1] Example Benchmark of Anomaly Detection in Time Series: “Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive Evaluation. PVLDB, 15(9): 1779 - 1797, 2022. doi:10.14778/3538598.3538602”

About Solenix

Solenix is an international company providing software engineering, consulting services and software products for the space market. Solenix is a dynamic company that brings innovative technologies and concepts to the aerospace market, keeping up to date with technical advancements and actively promoting spin-in and spin-out technology activities. We combine modern solutions which complement conventional practices. We aspire to achieve maximum customer satisfaction by fostering collaboration, constructivism, and flexibility.
Hourly Sensor Data for Time Series Forecasting
kaggle.com
Updated Jul 4, 2024
Cite
SudhanvaHG (2024). Hourly Sensor Data for Time Series Forecasting [Dataset]. https://www.kaggle.com/datasets/sudhanvahg/hourly-sensor-data-for-forecasting
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
SudhanvaHG
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains hourly sensor data collected over a period of time. The primary objective is to forecast future sensor values using various time series forecasting methods, such as SARIMA, Prophet, and machine learning models. The dataset includes an ID column, a Datetime column and a Count column, where the Count represents the sensor reading at each timestamp.
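Before fitting SARIMA, Prophet, or a machine learning model, it can help to have a trivial baseline to beat. The sketch below implements a seasonal naive forecast, which is a swapped-in baseline and not one of the methods named in the description: each future hour is predicted by the value observed at the same hour in the most recent day. The 24-hour season and the function name are assumptions.

```python
def seasonal_naive_forecast(counts, horizon, season=24):
    """Repeat the last full season of observations `horizon` steps ahead.

    `counts` is the hourly Count column in time order; season=24 assumes a
    daily cycle, which may or may not fit this sensor.
    """
    if len(counts) < season:
        raise ValueError("need at least one full season of history")
    last_season = counts[-season:]
    return [last_season[i % season] for i in range(horizon)]


# Two made-up days of hourly readings.
history = list(range(48))
print(seasonal_naive_forecast(history, horizon=3))
```

Any candidate model that cannot outperform this baseline on held-out data is probably not capturing more than the daily cycle.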
Timeseries-PILE
huggingface.co
Updated May 11, 2024
Cite
Auton Lab (2024). Timeseries-PILE [Dataset]. https://huggingface.co/datasets/AutonLab/Timeseries-PILE
Explore at:
Dataset updated
May 11, 2024
Dataset authored and provided by
Auton Lab
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Time Series PILE

The Time-series Pile is a large collection of publicly available data from diverse domains, ranging from healthcare to engineering and finance. It comprises over 5 public time-series databases from several diverse domains, for time series foundation model pre-training and evaluation.

Time Series PILE Description

We compiled a large collection of publicly available datasets from diverse domains into the Time Series Pile. It has 13 unique domains of data… See the full description on the dataset page: https://huggingface.co/datasets/AutonLab/Timeseries-PILE.
COVID-19 Time Series Data
data.world
csv, zip
Updated Mar 18, 2025
Cite
Shad Reynolds (2025). COVID-19 Time Series Data [Dataset]. https://data.world/shad/covid-19-time-series-data
Explore at:
Available download formats: csv, zip
Dataset updated
Mar 18, 2025
Authors
Shad Reynolds
Time period covered
Jan 22, 2020 - Mar 9, 2023
Area covered
Description

This data is synced hourly from https://github.com/CSSEGISandData/COVID-19. All credit is to them.

Latest Confirmed Cases

@(https://data.world/shad/covid-analysis/workspace/query?datasetid=covid-19-time-series-data&queryid=e066701e-fa8d-4c9f-97f8-aab3a6f219a8)

I have also added confirmed_pivot.csv which gives a slightly more workable view of the data. Extra columns/day makes things difficult.

@(https://data.world/shad/covid-analysis/workspace/file?datasetid=covid-19-time-series-data&filename=confirmed_pivot)
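The remark about extra columns per day suggests the source tables are wide, with one column for each date. A minimal sketch of reshaping such a table into long form (one row per location and date) follows. The column names in the demo, the date format, and the wide_to_long helper are assumptions; the actual layout of confirmed_pivot.csv is not described here and may differ.

```python
import csv
import io


def wide_to_long(handle, id_columns, value_name="confirmed"):
    """Turn one-column-per-date rows into one row per (ids, date) pair."""
    long_rows = []
    for row in csv.DictReader(handle):
        keys = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col in id_columns or value == "":
                continue  # skip identifier columns and empty cells
            long_rows.append({**keys, "date": col, value_name: int(value)})
    return long_rows


# Made-up wide table; header names are assumed for illustration.
wide = io.StringIO("Country/Region,1/22/20,1/23/20\nA,0,1\nB,2,3\n")
for record in wide_to_long(wide, id_columns=["Country/Region"]):
    print(record)
```

The long form is usually easier to filter, join, and plot than a table that grows by one column every day.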

Data from: Nonparametric Anomaly Detection on Time Series of Graphs
tandf.figshare.com
zip
Updated May 31, 2023
Cite
Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.13180181.v3
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite sample properties of change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly, two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.

Weather Long-term Time Series Forecasting

kaggle.com

Updated Nov 3, 2024

Cite

Alistair King (2024). Weather Long-term Time Series Forecasting [Dataset]. https://www.kaggle.com/datasets/alistairking/weather-long-term-time-series-forecasting

Explore at:

Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)

Dataset updated

Nov 3, 2024

Dataset provided by

Kaggle

Authors

Alistair King

License

MIT License, https://opensource.org/licenses/MIT
License information was derived automatically

Description

Weather Long-term Time Series Forecasting (2020)

Dataset Description

Weather is recorded every 10 minutes throughout the entire year of 2020, comprising 20 meteorological indicators measured at a Max Planck Institute weather station. The dataset provides comprehensive atmospheric measurements including air temperature, humidity, wind patterns, radiation, and precipitation. With over 52,560 data points per variable (365 days × 24 hours × 6 measurements per hour), this high-frequency sampling offers detailed insights into weather patterns and atmospheric conditions. The measurements include both basic weather parameters and derived quantities such as vapor pressure deficit and potential temperature, making it suitable for both meteorological research and practical applications. You can find some initial analysis using this dataset here: "Weather Long-term Time Series Forecasting Analysis".

File Structure

The dataset is provided in a CSV format with the following columns:

Column Name	Description
`date`	Date and time of the observation.
`p`	Atmospheric pressure in millibars (mbar).
`T`	Air temperature in degrees Celsius (°C).
`Tpot`	Potential temperature in Kelvin (K), representing the temperature an air parcel would have if moved to a standard pressure level.
`Tdew`	Dew point temperature in degrees Celsius (°C), indicating the temperature at which air becomes saturated with moisture.
`rh`	Relative humidity as a percentage (%), showing the amount of moisture in the air relative to the maximum it can hold at that temperature.
`VPmax`	Maximum vapor pressure in millibars (mbar), representing the maximum pressure exerted by water vapor at the given temperature.
`VPact`	Actual vapor pressure in millibars (mbar), indicating the current water vapor pressure in the air.
`VPdef`	Vapor pressure deficit in millibars (mbar), measuring the difference between maximum and actual vapor pressure, used to gauge drying potential.
`sh`	Specific humidity in grams per kilogram (g/kg), showing the mass of water vapor per kilogram of air.
`H2OC`	Concentration of water vapor in millimoles per mole (mmol/mol) of dry air.
`rho`	Air density in grams per cubic meter (g/m³), reflecting the mass of air per unit volume.
`wv`	Wind speed in meters per second (m/s), measuring the horizontal motion of air.
`max. wv`	Maximum wind speed in meters per second (m/s), indicating the highest recorded wind speed over the period.
`wd`	Wind direction in degrees (°), representing the direction from which the wind is blowing.
`rain`	Total rainfall in millimeters (mm), showing the amount of precipitation over the observation period.
`raining`	Duration of rainfall in seconds (s), recording the time for which rain occurred during the observation period.
`SWDR`	Short-wave downward radiation in watts per square meter (W/m²), measuring incoming solar radiation.
`PAR`	Photosynthetically active radiation in micromoles per square meter per second (µmol/m²/s), indicating the amount of light available for photosynthesis.
`max. PAR`	Maximum photosynthetically active radiation recorded in the observation period in µmol/m²/s.
`Tlog`	Temperature logged in degrees Celsius (°C), potentially from a secondary sensor or logger.
`OT`	Likely refers to an "operational timestamp" or an offset in time, but may need clarification depending on the dataset's context.
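Two relationships stated above can be checked with simple arithmetic: the sample count implied by 10-minute sampling, and the vapor pressure deficit as the difference between maximum and actual vapor pressure. The sketch below works through both. The function names are hypothetical, and the numeric inputs in the deficit example are made up.

```python
SAMPLES_PER_HOUR = 6  # one observation every 10 minutes


def expected_samples(days):
    """Number of 10-minute observations in `days` full days."""
    return days * 24 * SAMPLES_PER_HOUR


def vapor_pressure_deficit(vp_max_mbar, vp_act_mbar):
    """VPdef = VPmax - VPact, all in millibars, per the column descriptions."""
    return vp_max_mbar - vp_act_mbar


print(expected_samples(365))  # 52560, the figure quoted in the description
print(expected_samples(366))  # 52704, since 2020 was a leap year
print(vapor_pressure_deficit(12.5, 9.0))  # made-up inputs
```

The leap-year count is consistent with the description's wording of "over 52,560" data points per variable.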

Potential Use Cases

This high-resolution meteorological dataset enables applications across multiple domains. For weather forecasting, the frequent measurements support development of prediction models, while climate researchers can study microclimate variations and seasonal patterns. In agriculture, temperature and vapor pressure deficit data aids crop modeling and irrigation planning. The wind and radiation measurements benefit renewable energy planning, while the comprehensive atmospheric data supports environmental monitoring. The dataset's detailed nature makes it particularly suitable for machine learning applications and educational purposes in meteorology and data science.

Credits

This data was provided by the Max Planck Institute, and acc...

Population Estimates Time Series Data
dtechtive.com
find.data.gov.scot
Updated Mar 27, 2011
Cite
National Records of Scotland (2011). Population Estimates Time Series Data [Dataset]. https://dtechtive.com/datasets/3616
Explore at:
Dataset updated
Mar 27, 2011
Dataset provided by
National Records of Scotland
License
Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Area covered
Scotland
Description
Over time, statistical outputs (and time series data) may be subject to revisions or corrections. Revisions are generally planned, and are the result of either improvements in statistical methods or the availability of additional data. For example, the annual mid-year population estimates are revised after a census to take account of the additional information gained from the census results. Details of planned revisions are held within the Metadata alongside each publication. Corrections are unplanned and occur when errors in either the statistical data or methodology are found after release of the data. The latest correction to these datasets was in September 2018; for more information, please see the revisions and corrections page. This time series section provides access to the latest time series data, taking into account any revisions or corrections over the years. Note: Tables are mainly offered for the purposes of extracting figures. Due to the size of some of the sheets, they are not recommended for printing.
Example Stata syntax and data construction for negative binomial time series...
data.mendeley.com
Updated Nov 2, 2022
Cite
Sarah Price (2022). Example Stata syntax and data construction for negative binomial time series regression [Dataset]. http://doi.org/10.17632/3mj526hgzx.2
Explore at:
Unique identifier
https://doi.org/10.17632/3mj526hgzx.2
Dataset updated
Nov 2, 2022
Authors
Sarah Price
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).

The variables contained therein are defined as follows:

case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).

patid: a unique patient identifier.

time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.

ncons: number of consultations per month.

period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

burden: binary variable denoting membership of one of two multimorbidity burden groups.

We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the Maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).

Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
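To make the long (panel) layout concrete, here is a minimal Python sketch that builds rows with the patid, case, burden, time_period, and ncons variables defined above, one row per patient per month. It is not a translation of dummy_dataset_create.do. The uniform random draws are placeholders rather than a negative binomial model, and the period0 to period9 variables are left out because their exact coding is not stated here.

```python
import random


def build_dummy_panel(n_patients, n_periods=10, seed=0):
    """Create one row per patient per time period (long format).

    case and burden are fixed per patient; ncons varies by month. All values
    are random placeholders, not drawn from the paper's model.
    """
    rng = random.Random(seed)
    rows = []
    for patid in range(1, n_patients + 1):
        case = rng.randint(0, 1)
        burden = rng.randint(0, 1)
        for time_period in range(n_periods):
            rows.append({
                "patid": patid,
                "case": case,
                "burden": burden,
                "time_period": time_period,
                "ncons": rng.randint(0, 5),  # placeholder consultation count
            })
    return rows


panel = build_dummy_panel(n_patients=3)
print(len(panel), panel[0])
```

Each patient contributes n_periods rows, so the panel has n_patients × n_periods rows in total.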
Four ways to quantify synchrony between time series data
osf.io
Updated Dec 8, 2020
Cite
Jin Hyun Cheong (2020). Four ways to quantify synchrony between time series data [Dataset]. http://doi.org/10.17605/OSF.IO/BA3NY
Explore at:
Unique identifier
https://doi.org/10.17605/OSF.IO/BA3NY
Dataset updated
Dec 8, 2020
Dataset provided by
Center for Open Science (https://cos.io/)
Authors
Jin Hyun Cheong
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This project provides a sample dataset with detailed code on how to quantify synchrony between time series data using a Pearson correlation, time-lagged cross correlations, Dynamic Time Warping, and instantaneous phase synchrony. Rendered tutorial is available at http://jinhyuncheong.com/jekyll/update/2019/05/16/Four_ways_to_qunatify_synchrony.html
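As a rough illustration of three of the four measures named above, the following sketch implements Pearson correlation, time-lagged cross-correlation, and a basic Dynamic Time Warping distance in plain Python. It is not the project's own code. Instantaneous phase synchrony is omitted because it needs a Hilbert transform, and the function names and test signals are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag]; a positive lag means y trails x."""
    if lag >= 0:
        return pearson(x[:len(x) - lag], y[lag:])
    return pearson(x[-lag:], y[:len(y) + lag])


def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] with the highest correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: lagged_correlation(x, y, k))


def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) DTW with absolute-difference cost."""
    inf = float("inf")
    cost = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(x)][len(y)]


# Made-up signals: y is x delayed by three samples.
x = [math.sin(0.5 * t) for t in range(40)]
y = [math.sin(0.5 * (t - 3)) for t in range(40)]
print(best_lag(x, y, max_lag=5), dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Pearson gives a single global score, the lagged version reveals which signal leads, and DTW tolerates local stretching in time.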
Bayesian Modeling of Time Series Data (BayModTS)
darus.uni-stuttgart.de
Updated Jun 4, 2024
Cite
Sebastian Höpfl (2024). Bayesian Modeling of Time Series Data (BayModTS) [Dataset]. http://doi.org/10.18419/DARUS-3876
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.18419/DARUS-3876
Dataset updated
Jun 4, 2024
Dataset provided by
DaRUS
Authors
Sebastian Höpfl
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset funded by
DFG
Description
BayModTS is a FAIR workflow for processing highly variable and sparse data. The code and results of the examples in the BayModTS paper are stored in this repository. A maintained version of BayModTS that can be applied to your personal applications can be found on GitHub.
A novel stock forecasting model based on High-order-fuzzy-fluctuation Trends...
plos.figshare.com
docx
Updated Jun 2, 2023
Cite
Hongjun Guan; Zongli Dai; Aiwu Zhao; Jie He (2023). A novel stock forecasting model based on High-order-fuzzy-fluctuation Trends and Back Propagation Neural Network [Dataset]. http://doi.org/10.1371/journal.pone.0192366
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0192366
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS ONE
Authors
Hongjun Guan; Zongli Dai; Aiwu Zhao; Jie He
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this paper, we propose a hybrid method to forecast stock prices, called the High-order-fuzzy-fluctuation-Trends-based Back Propagation (HTBP) Neural Network model. First, we compare each value of the historical training data with the previous day's value to obtain a fluctuation trend time series (FTTS). On this basis, the FTTS is blurred into a fuzzy time series (FFTS) based on the fluctuation of the increasing, equality, decreasing amplitude and direction. Since the relationship between FFTS and future wave trends is nonlinear, the HTBP neural network algorithm is used to find the mapping rules in the form of self-learning. Finally, the results of the algorithm output are used to predict future fluctuations. The proposed model provides some innovative features: (1) It combines fuzzy set theory and a neural network algorithm to avoid the overfitting problems that exist in traditional models. (2) The BP neural network algorithm can intelligently explore the internal rules of the actual existence of sequential data, without the need to analyze the influence factors of specific rules and the path of action. (3) The hybrid model can reasonably remove noise from the internal rules by proper fuzzy treatment. This paper takes the TAIEX data set of the Taiwan stock exchange as an example, and compares and analyzes the prediction performance of the model. The experimental results show that this method can predict the stock market in a very simple way. At the same time, we use this method to predict the Shanghai stock exchange composite index, and further verify the effectiveness and universality of the method.
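The first two steps of the method (differencing against the previous day, then fuzzifying the fluctuations) can be sketched as follows. This is a deliberately simplified three-level version with a single hand-picked threshold; the paper's actual fuzzy sets and amplitude grading are not reproduced, and the function names, labels, and threshold are assumptions.

```python
def fluctuation_series(prices):
    """Day-over-day differences: the fluctuation trend time series."""
    return [today - yesterday for yesterday, today in zip(prices, prices[1:])]


def fuzzify(fluctuations, threshold):
    """Map each fluctuation to 'up', 'equal', or 'down'.

    Changes within +/- threshold count as 'equal'. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    labels = []
    for delta in fluctuations:
        if delta > threshold:
            labels.append("up")
        elif delta < -threshold:
            labels.append("down")
        else:
            labels.append("equal")
    return labels


# Made-up closing prices.
closes = [10, 12, 12, 9]
deltas = fluctuation_series(closes)
print(deltas, fuzzify(deltas, threshold=1))
```

Sliding windows over the resulting label sequence could then serve as inputs to a neural network, which is the role the description assigns to the HTBP model.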
LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for...
data.niaid.nih.gov
zenodo.org
Updated Jul 18, 2024
Cite
Herrnegger, Mathew (2024). LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe – files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4525244
Explore at:
Dataset updated
Jul 18, 2024
Dataset provided by
Herrnegger, Mathew
Schulz, Karsten
Klingler, Christoph
Kratzert, Frederik
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Central Europe
Description
Version 1.0 - This version is the final revised one.

This is the LamaH-CE dataset accompanying the paper: Klingler et al., LamaH-CE | LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe, published at Earth System Science Data (ESSD), 2021 (https://doi.org/10.5194/essd-13-4529-2021).

LamaH-CE contains a collection of runoff and meteorological time series as well as various (catchment) attributes for 859 gauged basins. The hydrometeorological time series are provided with daily and hourly time resolution including quality flags. All meteorological and the majority of runoff time series cover a span of over 35 years, which enables long-term analyses with high temporal resolution. LamaH is in its basics quite similar to the well-known CAMELS datasets for the contiguous United States (https://doi.org/10.5194/hess-21-5293-2017), Chile (https://doi.org/10.5194/hess-22-5817-2018), Brazil (https://doi.org/10.5194/essd-12-2075-2020), Great Britain (https://doi.org/10.5194/essd-12-2459-2020) and Australia (https://doi.org/10.5194/essd-13-3847-2021), but new features like additional basin delineations (intermediate catchments) and attributes make it possible to consider the hydrological network and river topology in further applications.

We provide two different files to download: 1) Hydrometeorological time series with daily and hourly resolution, which requires about 70 GB of free disk space when decompressed. 2) Hydrometeorological time series only with daily resolution, which requires 5 GB. Beyond the temporal resolution of the time series, there are no differences.

Note: It is recommended to read the supplementary info file before using the dataset. For example, it clarifies the time conventions and that NAs are indicated by the number -999 in the runoff time series.
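Because missing runoff values are encoded as -999, they need to be masked before any statistics are computed, or they will badly skew the results. A minimal sketch of that step follows. The function names are hypothetical, and NaN is just one possible choice of missing-value marker.

```python
import math

MISSING = -999  # sentinel for NA in the runoff time series, per the note above


def mask_missing(values, sentinel=MISSING):
    """Replace the sentinel with NaN so it cannot leak into statistics."""
    return [math.nan if v == sentinel else float(v) for v in values]


def mean_ignoring_missing(values):
    """Mean over the non-NaN entries; NaN if nothing is left."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Made-up runoff readings with one gap.
runoff = [1.0, -999, 3.0]
cleaned = mask_missing(runoff)
print(cleaned, mean_ignoring_missing(cleaned))
```

Without masking, the naive mean of the sample above would be about -331.7 instead of 2.0.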

Disclaimer: We have created LamaH with care and checked the outputs for plausibility. By downloading the dataset, you agree that neither we nor the providers of the source datasets used (e.g. runoff time series) can be held liable for the data provided. The runoff time series of the German federal states Bavaria and Baden-Württemberg are retrospectively checked and updated by the hydrographic services. Therefore, it might be appropriate to obtain more up-to-date runoff data from Bavaria (https://www.gkd.bayern.de/en/rivers/discharge/tables) and Baden-Württemberg (https://udo.lubw.baden-wuerttemberg.de/public/p/pegel_messwerte_leer). Runoff data from the Czech Republic may not be used to set up operational warning systems (https://www.chmi.cz/files/portal/docs/hydro/denni_data/Podminky_uziti.pdf).

License: This work is licensed with CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/). This means that you may freely use and modify the data (even for commercial purposes). But you have to give appropriate credit (associated ESSD paper, version of dataset and all sources which are declared in the folder "Info"), indicate if and what changes were made and distribute your work under the same public license as the original.

Additional references: We ask kindly for compliance in citing the following references when using LamaH, as an agreement to cite was usually a condition of sharing the data: BAFU (2020), CHMI (2020), GKD (2020), HZB (2020), LUBW (2020), BMLFUW (2013), Broxton et al. (2014), CORINE (2012), EEA (2019), ESDB (2004), Farr et al. (2007), Friedl and Sulla-Menashe (2019), Gleeson et al. (2014), HAO (2007), Hartmann and Moosdorf (2012), Hiederer (2013a, b), Linke et al. (2019), Muñoz Sabater et al. (2021), Muñoz Sabater (2019a), Myneni et al. (2015), Pelletier et al. (2016), Toth et al. (2017), Trabucco and Zomer (2019), and Vermote (2015). These references are listed in detail in the accompanying paper.

Supplements: We have created additional files after publication (therefore not peer-reviewed): 1) Shapefiles for reservoirs (points) and cross-basin water transfers (lines) including several attributes as well as tables with information about the accumulated storage volume and effective catchment area (considering artificial in- and outflows) for every runoff gauge. 2) Water quality data (e.g. dissolved oxygen, water temperature, conductivity, NO3-N), which are matched to the gauges. The data for water quality may not be used for commercial purposes. If you are interested, just send us an email with your name, affiliation and the intended purpose for the requested files to the address listed below. If you find any errors in the dataset, feel free to send us an email to: christoph.klingler@boku.ac.at
Data from: Predicting spatial-temporal patterns of diet quality and large...
agdatacommons.nal.usda.gov
docx
Updated Feb 16, 2024
Cite
Sean Kearney; Lauren M. Porensky; David J. Augustine; Justin D. Derner; Feng Gao (2024). Data from: Predicting spatial-temporal patterns of diet quality and large herbivore performance using satellite time series [Dataset]. http://doi.org/10.15482/USDA.ADC/1522609
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.15482/USDA.ADC/1522609
Dataset updated
Feb 16, 2024
Dataset provided by
Ag Data Commons
Authors
Sean Kearney; Lauren M. Porensky; David J. Augustine; Justin D. Derner; Feng Gao
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Analysis-ready tabular data from "Predicting spatial-temporal patterns of diet quality and large herbivore performance using satellite time series" in Ecological Applications, Kearney et al., 2021. Data is tabular data only, summarized to the pasture scale. Weight gain data for individual cattle and the STARFM-derived Landsat-MODIS fusion imagery can be made available upon request.
Resources in this dataset:
Resource Title: Metadata - CSV column names, units and descriptions.
File Name: Kearney_et_al_ECOLAPPL_Patterns of herbivore - metada.docx
Resource Description: Column names, units and descriptions for all CSV files in this dataset.
Resource Title: Fecal quality data.
File Name: Kearney_etal2021_Patterns_of_herbivore_Data_FQ_cln.csv
Resource Description: Field-sampled fecal quality (CP = crude protein; DOM = digestible organic matter) data and phenology-related APAR metrics derived from 30 m daily Landsat-MODIS fusion satellite imagery. All data are paddock-scale averages and the paddock is the spatial scale of replication and week is the temporal scale of replication. Fecal samples were collected by USDA-ARS staff from 3-5 animals per paddock (10% - 25% of animals in each herd) weekly during each grazing season from 2014 to 2019 across 10 different paddocks at the Central Plains Experimental Range (CPER) near Nunn, CO. Samples were analyzed at the Grazingland Animal Nutrition Lab (GANlab, https://cnrit.tamu.edu/index.php/ganlab/) using near infrared spectroscopy (see Lyons & Stuth, 1992; Lyons, Stuth, & Angerer, 1995). Not every herd was sampled every week or every year, resulting in a total of 199 samples. Samples represent all available data at the CPER during the study period and were collected for different research and adaptive management objectives, but following the basic protocol described above. APAR metrics were derived from the paddock-scale APAR daily time series (all paddock pixels averaged daily to create a single paddock-scale time series).
All APAR metrics are calculated for the week that corresponds to the week that fecal quality samples were collected in the field. See Section 2.2.4 of the corresponding manuscript for a complete description of the APAR metrics.
Resource Title: Monthly ADG.
File Name: Kearney_etal2021_Patterns_of_herbivore_Data_ADG_monthly_cln.csv
Resource Description: Monthly average daily gain (ADG) of cattle weights at the paddock scale and the three satellite-derived metrics used to build the regression model to predict ADG: crude protein (CP), digestible organic matter (DOM) and aboveground net herbaceous production (ANHP). The data table also includes stocking rate (animal units per hectare), used as an interaction term in the ADG regression model, and all associated data to derive each of these variables (e.g., sampling start and end dates, 30 m daily Landsat-MODIS fusion satellite imagery-derived APAR metrics, cattle weights, etc.). We calculated paddock-scale average daily gain (ADG, kg hd-1 day-1) from 2000-2019 for yearlings weighed approximately every 28 days during the grazing season across 6 different paddocks with stocking densities of 0.08 – 0.27 animal units (AU) ha-1, where one AU is equivalent to a 454 kg animal. It is worth noting that AUs change as a function of both the number of cattle within a paddock and the size of individual animals, the latter of which changes within a single grazing season. This becomes important to consider when using sub-seasonal weight data for fast-growing yearlings. For paddock-scale ADG, we first calculated ADG for each individual yearling as the difference between the weights obtained at the end and beginning of each period, divided by the number of days in each period, and then averaged for all individuals in the paddock. We excluded data from 2013 due to data collection inconsistencies.
We note that most of the monthly weight data (97%) is from 3 paddocks where cattle were weighed every year, whereas in the other 3 paddocks, monthly weights were only measured during 2017-2019. Apart from the 2013 data, which were not comparable to data from other years, the data represents all available weight gain data for CPER to maximize spatial-temporal coverage and avoid potential bias from subjective decisions to subset the data. Data may have been collected for different projects at different times, but was collected in a consistent way. This resulted in 269 paddock-scale estimates of monthly ADG, with robust temporal, but limited spatial, coverage. CP and DOM were estimated from a random forest model trained from the five APAR metrics: rAPAR, dAPAR, tPeak, iAPAR and iAPAR-dry (see manuscript Section 2.3 for description). APAR metrics were derived from the paddock-scale APAR daily time series (all paddock pixels averaged daily to create a single paddock-scale time series). All APAR metrics are calculated as the average of the approximately 28-day period that corresponds to the ADG calculation. See Section 2.2.4 of the manuscript for a complete description of the APAR metrics. ANHP was estimated from a linear regression model developed by Gaffney et al. (2018) to calculate net aboveground herbaceous productivity (ANHP; kg ha-1) from iAPAR. We averaged the coefficients of 4 spatial models (2013-2016) developed by Gaffney et al. (2018), resulting in the following equation: ANHP = -26.47 + 2.07(iAPAR). We first calculated ANHP for each day of the grazing season at the paddock scale, and then took the average ANHP for the 28-day period.
REFERENCES: Gaffney, R., Porensky, L. M., Gao, F., Irisarri, J. G., Durante, M., Derner, J. D., & Augustine, D. J. (2018). Using APAR to predict aboveground plant productivity in semi-arid rangelands: Spatial and temporal relationships differ. Remote Sensing, 10(9). doi: 10.3390/rs10091474
Resource Title: Season-long ADG.
File Name: Kearney_etal2021_Patterns_of_herbivore_Data_ADG_seasonal_cln.csv
Resource Description: Season-long observed and model-predicted average daily gain (ADG) of cattle weights at the paddock scale. Also includes two variables used to analyze patterns in model residuals: percent sand content and season-long aboveground net herbaceous production (ANHP). We calculated observed paddock-scale ADG for the entire grazing season from 2010-2019 (excluding 2013 due to data collection inconsistencies) by averaging seasonal ADG of each yearling, determined as the difference between the end and starting weights divided by the number of days in the grazing season. This dataset was available for 40 paddocks spanning a range of soil types, plant communities, and topographic positions. Data may have been collected for different projects at different times, but was collected in a consistent way. We note that there was spatial overlap among a small number of paddock boundaries across different years, since some fence lines were moved in 2012 and 2014. Model-predicted paddock-scale ADG was derived using the monthly ADG regression model described in Sections 2.3.3 and 2.3.4 of the associated manuscript. In short, we predicted season-long cattle weight gains by first predicting daily weight gain for each day of the grazing season from the monthly regression model using a 28-day moving average of model inputs (CP, DOM and ANHP). We calculated the final ADG for the entire grazing season as the average predicted ADG, starting 28 days into the growing season. Percent sand content was obtained as the paddock-scale average of POLARIS sand content in the upper 0-30 cm. ANHP was calculated on the last day of the grazing season using a linear regression model developed by Gaffney et al.
(2018) to calculate net aboveground herbaceous productivity (ANHP; kg ha-1) from satellite-derived integrated absorbed photosynthetically active radiation (iAPAR) (see Section 3.1.2 of the associated manuscript). We averaged the coefficients of 4 spatial models (2013-2016) developed by Gaffney et al. (2018), resulting in the following equation: ANHP = -26.47 + 2.07(iAPAR).
REFERENCES: Gaffney, R., Porensky, L. M., Gao, F., Irisarri, J. G., Durante, M., Derner, J. D., & Augustine, D. J. (2018). Using APAR to predict aboveground plant productivity in semi-arid rangelands: Spatial and temporal relationships differ. Remote Sensing, 10(9). doi: 10.3390/rs10091474
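The ADG and ANHP calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names and the example weights are hypothetical. Only the rule "weight difference divided by days, then averaged over animals" and the equation ANHP = -26.47 + 2.07(iAPAR) come from the description.

```python
from statistics import mean


def individual_adg(start_kg, end_kg, days):
    """ADG (kg per head per day) for one animal over one weighing period."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (end_kg - start_kg) / days


def paddock_adg(animals, days):
    """Paddock-scale ADG: the mean of the per-animal ADG values.

    `animals` is a list of (start_kg, end_kg) pairs (hypothetical layout).
    """
    return mean([individual_adg(start, end, days) for start, end in animals])


def anhp_from_iapar(iapar):
    """ANHP (kg ha-1) from iAPAR, using the averaged coefficients quoted above."""
    return -26.47 + 2.07 * iapar


def period_mean_anhp(daily_iapar):
    """Compute ANHP for each day, then average over the period."""
    return mean([anhp_from_iapar(value) for value in daily_iapar])


if __name__ == "__main__":
    # Made-up weights for two yearlings over a 28-day period.
    print(paddock_adg([(300.0, 328.0), (310.0, 324.0)], 28))
    # Made-up daily iAPAR values.
    print(period_mean_anhp([100.0, 200.0]))
```

Because the ANHP equation is linear, averaging the daily ANHP values gives the same result as applying the equation to the mean iAPAR for the period.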
18S Monterey Bay Time Series: an eDNA data set from Monterey Bay,...
gbif.org
portal.obis.org
+2 more
Updated Jul 10, 2025
Cite
Francisco Chavez; Kathleen Pitz (2025). 18S Monterey Bay Time Series: an eDNA data set from Monterey Bay, California, including years 2006, 2013 - 2016 [Dataset]. http://doi.org/10.15468/84ntea
Explore at:
Unique identifier
https://doi.org/10.15468/84ntea
Dataset updated
Jul 10, 2025
Dataset provided by
Global Biodiversity Information Facility https://www.gbif.org/
NOAA Integrated Ocean Observing System
Authors
Francisco Chavez; Kathleen Pitz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Monterey Bay, Monterey County
Description
These data are from marine filtered seawater samples collected at a nearshore station in Monterey Bay, CA. They have undergone metabarcoding for the 18S V9 region. A selection of samples from this plate were included in the publication "Environmental DNA reveals seasonal shifts and potential interactions in a marine community" (Djurhuus et al., 2020). Samples were collected by CTD rosette and filtered by a peristaltic pump system.
Illumina MiSeq metabarcoding data was processed in the following steps: (1) primer sequences were removed through atropos (Didion et al., 2017), (2) reads were denoised, ASV sequences inferred, paired reads merged and chimeras removed through Dada2 (Callahan et al., 2016), (3) taxonomic ranks were assigned through blastn searches to NCBI GenBank's non-redundant nucleotide database (nt) with hits filtered by lowest common ancestor algorithm within MEGAN6 (Huson et al., 2016). Furthermore, post-MEGAN6 filtering was performed to ensure only contigs with a hit of ≥97% sequence identity were annotated to the species level and only contigs with a hit of ≥95% sequence identity were annotated to the genus level. Annotations were elevated to the next highest taxonomic level for contigs that failed these conditions.
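The post-MEGAN6 identity filter described above can be illustrated with a short sketch. It is not the pipeline's actual code: the rank list, the function names, and the choice of "family" as the level used when the genus threshold fails are assumptions made for illustration. Only the 97% and 95% thresholds and the "elevate to the next highest level" rule come from the description.

```python
# Ranks ordered from most to least specific (an assumed, simplified ladder).
RANKS = ["species", "genus", "family", "order", "class", "phylum", "kingdom"]


def lowest_allowed_rank(percent_identity):
    """Most specific rank permitted by the best hit's sequence identity."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    if percent_identity >= 97.0:
        return "species"
    if percent_identity >= 95.0:
        return "genus"
    # Assumption: failing the genus threshold elevates the annotation to family.
    return "family"


def filter_annotation(assigned_rank, percent_identity):
    """Elevate an assigned rank when the identity threshold is not met.

    An annotation that is already less specific than the permitted rank
    is returned unchanged.
    """
    assigned = RANKS.index(assigned_rank)
    allowed = RANKS.index(lowest_allowed_rank(percent_identity))
    return RANKS[max(assigned, allowed)]


if __name__ == "__main__":
    for rank, identity in [("species", 98.2), ("species", 96.0), ("genus", 94.9)]:
        print(rank, identity, "->", filter_annotation(rank, identity))
```

The filter only ever moves an annotation toward a less specific rank. A high identity never makes an annotation more specific than the one the lowest common ancestor step assigned.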
Data are presented in two comma-separated values files: occurrence.csv, and DNADerivedData.csv. The former contains the taxonomic identification of each ASV observed and its number of reads, in addition to relevant metadata including the location the water sample was taken, references for the identification procedure, and links to archived sequences. The latter contains the DNA sequence of each ASV observed, in addition to relevant metadata including primer information and links to detailed field and laboratory methods. This data set was transformed from its native format into a table structure using Darwin Core and DNA Derived Data Extension term names as column names.
References:
Djurhuus, A, Closek, CJ, Kelly, RP et al. (2020). Environmental DNA reveals seasonal shifts and potential interactions in a marine community. Nat Commun 11, 254. https://doi.org/10.1038/s41467-019-14105-1
Didion JP, Martin M, Collins FS. (2017) Atropos: specific, sensitive, and speedy trimming of sequencing reads. PeerJ 5:e3720 https://doi.org/10.7717/peerj.3720
Callahan, B., McMurdie, P., Rosen, M. et al. (2016) DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13, 581–583. https://doi.org/10.1038/nmeth.3869
Huson DH, Beier S, Flade I, Górska A, El-Hadidi M, Mitra S, Ruscheweyh HJ, Tappu R. (2016) MEGAN community edition-interactive exploration and analysis of large-scale microbiome sequencing data. PLoS computational biology. Jun 21;12(6):e1004957.
Population estimates time series dataset
ons.gov.uk
cy.ons.gov.uk
csv, xlsx
Updated Oct 8, 2024
Cite
Office for National Statistics (2024). Population estimates time series dataset [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/populationestimatestimeseriesdataset
Explore at:
Available download formats: csv, xlsx
Dataset updated
Oct 8, 2024
Dataset provided by
Office for National Statistics http://www.ons.gov.uk/
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
The mid-year estimates refer to the population on 30 June of the reference year and are produced in line with the standard United Nations (UN) definition for population estimates. They are the official set of population estimates for the UK and its constituent countries, the regions and counties of England, and local authorities and their equivalents.
Data from: ANES 2016 Time Series Study
icpsr.umich.edu
ascii, delimited, r +3
Updated Sep 19, 2017
+ more versions
Cite
Inter-university Consortium for Political and Social Research [distributor] (2017). ANES 2016 Time Series Study [Dataset]. http://doi.org/10.3886/ICPSR36824.v2
Explore at:
Available download formats: r, delimited, stata, sas, ascii, spss
Unique identifier
https://doi.org/10.3886/ICPSR36824.v2
Dataset updated
Sep 19, 2017
Dataset provided by
Inter-university Consortium for Political and Social Research https://www.icpsr.umich.edu/web/pages/
License
https://www.icpsr.umich.edu/web/ICPSR/studies/36824/terms
Time period covered
Sep 2016 - Jan 2017
Area covered
United States
Description
This study is part of the American National Election Study (ANES), a time-series collection of national surveys fielded continuously since 1948. The American National Election Studies are designed to present data on Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life. As with all Time Series studies conducted during years of presidential elections, respondents were interviewed during the two months preceding the November election (Pre-election interview), and then re-interviewed during the two months following the election (Post-election interview). Like its predecessors, the 2016 ANES was divided between questions necessary for tracking long-term trends and questions necessary to understand the particular political moment of 2016. The study maintains and extends the ANES time-series 'core' by collecting data on Americans' basic political beliefs, allegiances, and behaviors, which are so critical to a general understanding of politics that they are monitored at every election, no matter the nature of the specific campaign or the broader setting. This 2016 ANES study features a dual-mode design with both traditional face-to-face interviewing (n=1,181) and surveys conducted on the Internet (n=3,090), and a total sample size of 4,271. In addition to content on electoral participation, voting behavior, and public opinion, the 2016 ANES Time Series Study contains questions about areas such as media exposure, cognitive style, and values and predispositions. Several items first measured on the 2012 ANES study were again asked, including "Big Five" personality traits using the Ten Item Personality Inventory (TIPI), and skin tone observations made by interviewers in the face-to-face study. For the first time, ANES has collected supplemental data directly from respondents' Facebook accounts. 
The post-election interview also included Module 5 from the Comparative Study of Electoral Systems (CSES), exploring themes in populism, perceptions of elites, corruption, and attitudes towards representative democracy. Face-to-face interviews were conducted by trained interviewers using computer assisted personal interviewing (CAPI) software on laptop computers. During a portion of the face-to-face interview, the respondent answered certain sensitive questions on the laptop computer directly, without the interviewer's participation (known as computer assisted self-interviewing (CASI)). Internet questionnaires could be completed anywhere the respondent had access to the Internet, on a computer or on a mobile device. Respondents were only eligible to complete the survey in the mode for which they were sampled. Demographic variables include respondent age, education level, political affiliation, race/ethnicity, marital status, and family composition.
Time Series Databases Software market size will be $993.24 Million by 2028!
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jul 27, 2023
Cite
Cognitive Market Research (2023). Time Series Databases Software market size will be $993.24 Million by 2028! [Dataset]. https://www.cognitivemarketresearch.com/time-series-databases-software-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jul 27, 2023
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
As per Cognitive Market Research's latest published report, the Global Time Series Databases Software market size will be $993.24 Million by 2028. Time Series Databases Software Industry's Compound Annual Growth Rate will be 18.36% from 2023 to 2030.

Factors Affecting Time Series Databases Software market growth

Rise in automation in industry

Industrial sensors are a key part of factory automation and Industry 4.0. Motion, environmental, and vibration sensors are used to monitor the health of equipment, with applications including linear or angular positioning, tilt sensing, leveling, and shock or fall detection. A sensor is a device that detects changes in electrical, physical, or other quantities and produces an output that indicates the change.

In simple terms, industrial automation sensors are input devices that provide an output (signal) with respect to a specific physical quantity (input). In industrial automation, sensors play a vital part in making products intelligent and highly automated. They make it possible to detect, analyze, measure, and process a variety of changes, such as changes in position, length, height, appearance, and displacement, that occur at industrial manufacturing sites. These sensors also play a pivotal role in predicting and preventing many potential incidents, and so meet the requirements of many sensing applications. Such sensors generally produce time series, because readings are taken at equal intervals of time.

The increasing use of sensors to monitor industrial activities and production factories is fueling the growth of the time series database software market. Manufacturing in the pharmaceutical industry also requires close monitoring, which increases demand for sensors and time series databases and, in turn, for time series database software.

Market Dynamics of Time Series Databases Software Market

Key Drivers of Time Series Databases Software Market

Increasing Adoption of IoT Devices : The rise of IoT devices is producing vast amounts of time-stamped data. Time Series Databases (TSDBs) are specifically engineered to manage this data effectively, facilitating real-time monitoring, analytics, and forecasting—rendering them crucial for sectors such as manufacturing, energy, and smart cities.

Rising Demand for Real-Time Analytics : Companies are progressively emphasizing real-time data processing to enable quicker, data-informed decisions. TSDBs accommodate rapid data ingestion and querying, allowing for real-time analysis across various sectors including finance, IT infrastructure, and logistics, significantly enhancing their market adoption.

Growth of Cloud Infrastructure : As cloud computing becomes ubiquitous, cloud-native TSDB solutions are gaining popularity. These platforms provide scalability, ease of deployment, and lower operational expenses. The need for adaptable and on-demand database solutions fosters the expansion of TSDBs within contemporary IT environments.

Key Restraints in Time Series Databases Software Market

High Implementation and Maintenance Costs : The deployment and upkeep of Time Series Database (TSDB) systems can necessitate a considerable financial commitment, particularly for small to medium-sized businesses. The costs encompass infrastructure establishment, the hiring of skilled personnel, and the integration with current systems, which may discourage market adoption in environments sensitive to costs.

Complexity in Data Management : Managing large volumes of time-stamped data demands a robust system architecture. As the amount of data increases, difficulties in indexing, querying, and efficient storage can adversely affect performance and user experience, thereby restricting usability for organizations that lack strong technical support.

Competition from Traditional Databases : In spite of their benefits, TSDBs encounter competition from advanced traditional databases such as relational and NoSQL systems. Many of these databases now offer time-series functionalities, leading organizations to be reluctant to invest in new TSDB software when existing solutions can be enhanced.

Key Trends of Time Series Databases Software Market

Integration with AI and Machine Learning Tools : TSDBs are progressively being integrated with AI/ML platfo...
Data from: Scalable Methods for Multiple Time Series Comparison in Second...
tandf.figshare.com
pdf
Updated Aug 5, 2024
Cite
Lei Jin; Bo Li (2024). Scalable Methods for Multiple Time Series Comparison in Second Order Dynamics [Dataset]. http://doi.org/10.6084/m9.figshare.26496134.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.26496134.v1
Dataset updated
Aug 5, 2024
Dataset provided by
Taylor & Francis
Authors
Lei Jin; Bo Li
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Statistical comparison of multiple time series in their underlying frequency patterns has many real applications. However, existing methods are only applicable to a small number of mutually independent time series, and empirical results for dependent time series are only limited to comparing two time series. We propose scalable methods based on a new algorithm that enables us to compare the spectral density of a large number of time series. The new algorithm helps us efficiently obtain all pairwise feature differences in frequency patterns between M time series, which plays an essential role in our methods. When all M time series are independent of each other, we derive the joint asymptotic distribution of their pairwise feature differences. The asymptotic dependence structure between the feature differences motivates our proposed test for multiple mutually independent time series. We then adapt this test to the case of multiple dependent time series by partially accounting for the underlying dependence structure. Additionally, we introduce a global test to further enhance the approach. To examine the finite sample performance of our proposed methods, we conduct simulation studies. The new approaches demonstrate the ability to compare a large number of time series, whether independent or dependent, while exhibiting competitive power. Finally, we apply our methods to compare multiple mechanical vibrational time series.

