100+ datasets found
  1. 1000 Empirical Time series

    • figshare.com
    • researchdata.edu.au
    png
    Updated May 30, 2023
    Cite
    Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
    Available download formats: png
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben Fulcher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A diverse selection of 1000 empirical time series, along with results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.

    The results of the computation are in the hctsa file HCTSA_Empirical1000.mat, for use in Matlab with v1.06 of hctsa. The same data are also provided in .csv format: hctsa_datamatrix.csv contains the results of feature computation, with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and the corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv); the data of the individual time series (each line a time series, as described in hctsa_timeseries-info.csv) are in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa.

    The input file, INP_Empirical1000.mat, is for use with hctsa, and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');

    Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed by the user using TS_PlotTimeSeries from the hctsa package.

    See links in references for more comprehensive documentation on performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.
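    A minimal Python sketch of loading the .csv exports described above with pandas (the file names are those listed in the description; whether each file carries a header row is an assumption to check against the actual export):

```python
# Load the hctsa CSV exports named in the description above.
# Assumption: hctsa_datamatrix.csv has no header row, while the two
# info files do; adjust `header=` if the actual export differs.
import pandas as pd

data_matrix = pd.read_csv("hctsa_datamatrix.csv", header=None)  # feature values
ts_info = pd.read_csv("hctsa_timeseries-info.csv")              # one row per time series
features = pd.read_csv("hctsa_features.csv")                    # one row per feature

print(data_matrix.shape)  # expected: (1000, number_of_features)
```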

  2. Time series

    • data.open-power-system-data.org
    csv, sqlite, xlsx
    Updated Oct 6, 2020
    Cite
    Jonathan Muehlenpfordt (2020). Time series [Dataset]. http://doi.org/10.25832/time_series/2020-10-06
    Available download formats: csv, sqlite, xlsx
    Dataset updated
    Oct 6, 2020
    Dataset provided by
    Open Power System Data
    Authors
    Jonathan Muehlenpfordt
    Time period covered
    Jan 1, 2015 - Oct 1, 2020
    Variables measured
    utc_timestamp, DE_wind_profile, DE_solar_profile, DE_wind_capacity, DK_wind_capacity, SE_wind_capacity, CH_solar_capacity, DE_solar_capacity, DK_solar_capacity, AT_price_day_ahead, and 290 more
    Description

    Load, wind and solar, prices in hourly resolution. This data package contains different kinds of time series data relevant for power system modelling, namely electricity prices, electricity consumption (load) as well as wind and solar power generation and capacities. The data is aggregated either by country, control area or bidding zone. Geographical coverage includes the EU and some neighbouring countries. All variables are provided in hourly resolution. Where original data is available in higher resolution (half-hourly or quarter-hourly), it is provided in separate files. This package version only contains data provided by TSOs and power exchanges via ENTSO-E Transparency, covering the period 2015-mid 2020. See previous versions for historical data from a broader range of sources. All data processing is conducted in Python/pandas and has been documented in the accompanying Jupyter notebooks.
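    A minimal pandas sketch of working with the hourly package (the file name follows OPSD naming conventions but is an assumption here; utc_timestamp and DE_wind_capacity are taken from the variable list above):

```python
# Read the hourly OPSD time series package and downsample one column.
# The CSV file name is an assumption based on the package's conventions.
import pandas as pd

df = pd.read_csv(
    "time_series_60min_singleindex.csv",
    index_col="utc_timestamp",
    parse_dates=["utc_timestamp"],
)

# daily mean of an hourly series, using a column from the variable list above
daily = df["DE_wind_capacity"].resample("D").mean()
print(daily.head())
```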

  3. Monthly Modal Time Series

    • catalog.data.gov
    • data.transportation.gov
    • +3 more
    Updated Jul 8, 2025
    Cite
    Federal Transit Administration (2025). Monthly Modal Time Series [Dataset]. https://catalog.data.gov/dataset/monthly-modal-time-series
    Dataset updated
    Jul 8, 2025
    Dataset provided by
    Federal Transit Administration
    Description

    Modal Service data and Safety & Security (S&S) public transit time series data, delineated by transit agency/mode/year/month. Includes all Full Reporters (transit agencies operating modes with more than 30 vehicles in maximum service) reporting to the National Transit Database (NTD). This dataset will be updated monthly. The monthly ridership data is released one month after the month in which the service is provided. Records with null monthly service data reflect late reporting. The S&S statistics provided include both Major and Non-Major Events where applicable. Events occurring in the past three months are excluded from the corresponding monthly ridership rows in this dataset while they undergo validation. This dataset is the only NTD publication in which all Major and Non-Major S&S data are presented without any adjustment for historical continuity.

  4. Controlled Anomalies Time Series (CATS) Dataset

    • zenodo.org
    bin
    Updated Jul 12, 2024
    Cite
    Patrick Fleith; Patrick Fleith (2024). Controlled Anomalies Time Series (CATS) Dataset [Dataset]. http://doi.org/10.5281/zenodo.7646897
    Available download formats: bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Solenix Engineering GmbH
    Authors
    Patrick Fleith; Patrick Fleith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.

    The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:

    • Multivariate (17 variables) including sensor readings and control signals. It simulates the operational behaviour of an arbitrary complex system, including:
      • 4 Deliberate Actuations / Control Commands sent by a simulated operator / controller, for instance, commands of an operator to turn ON/OFF some equipment.
      • 3 Environmental Stimuli / External Forces acting on the system and affecting its behaviour, for instance, the wind affecting the orientation of a large ground antenna.
      • 10 Telemetry Readings representing the observable states of the complex system by means of sensors, for instance, a position, a temperature, a pressure, a voltage, current, humidity, velocity, acceleration, etc.
    • 5 million timestamps. Sensor readings are taken at a 1 Hz sampling frequency.
      • 1 million nominal observations (the first 1 million datapoints), suitable for starting to learn the "normal" behaviour.
      • 4 million observations that include both nominal and anomalous segments, suitable for evaluating both semi-supervised approaches (novelty detection) and unsupervised approaches (outlier detection).
    • 200 anomalous segments. One anomalous segment may contain several successive anomalous observations / timestamps. Only the last 4 million observations contain anomalous segments.
    • Different types of anomalies to understand what anomaly types can be detected by different approaches.
    • Fine control over ground truth. As this is a simulated system with deliberate anomaly injection, the start and end times of the anomalous behaviour are known very precisely. In contrast to real-world datasets, there is no risk that the ground truth contains mislabelled segments, which is often the case for real data.
    • Obvious anomalies. The simulated anomalies have been designed to be "easy" to detect for human eyes (i.e., there are very large spikes or oscillations), hence also detectable for most algorithms. This makes the synthetic dataset useful for screening tasks (i.e., to eliminate algorithms that are not capable of detecting these obvious anomalies). However, during our initial experiments, the dataset turned out to be challenging enough even for state-of-the-art anomaly detection approaches, making it suitable also for regular benchmark studies.
    • Context provided. Some variables can only be considered anomalous in relation to other behaviours. A typical example consists of a light and switch pair: the light being either on or off is nominal, and the same goes for the switch, but having the switch on and the light off shall be considered anomalous. In the CATS dataset, users can choose whether to use the available context, and external stimuli, to test the usefulness of the context for detecting anomalies in this simulation.
    • Pure signal, ideal for robustness-to-noise analysis. The simulated signals are provided without noise: while this may seem unrealistic at first, it is an advantage, since users of the dataset can add any type of noise at any amplitude on top of the provided series. This makes the dataset well suited to testing how sensitive and robust detection algorithms are against various levels of noise.
    • No missing data. Users can drop whatever data they want to assess the impact of missing values on a detector with respect to a clean baseline (see the sketch after the reference below).

    [1] Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive Evaluation. PVLDB, 15(9): 1779-1797, 2022. doi:10.14778/3538598.3538602
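    A minimal numpy sketch of the two experiments suggested above, adding noise to the clean signals and punching artificial gaps; the `signals` array is a hypothetical stand-in, since the dataset's actual loading code and column layout are not shown here:

```python
# Robustness experiments on the noise-free, gap-free CATS signals.
# `signals` is a hypothetical stand-in for the real (n_samples, 17) telemetry.
import numpy as np

rng = np.random.default_rng(42)
signals = rng.standard_normal((1_000, 17))  # placeholder for the real data

# 1) robustness to noise: pick a noise type and amplitude yourself
noisy = signals + rng.normal(scale=0.1, size=signals.shape)

# 2) impact of missing data: blank out a random 5% of values
mask = rng.random(signals.shape) < 0.05
with_gaps = noisy.copy()
with_gaps[mask] = np.nan
```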

    About Solenix

    Solenix is an international company providing software engineering, consulting services and software products for the space market. Solenix is a dynamic company that brings innovative technologies and concepts to the aerospace market, keeping up to date with technical advancements and actively promoting spin-in and spin-out technology activities. We combine modern solutions which complement conventional practices. We aspire to achieve maximum customer satisfaction by fostering collaboration, constructivism, and flexibility.

  5. COVID-19 Time Series Data

    • data.world
    • kaggle.com
    csv, zip
    Updated Mar 18, 2025
    Cite
    Shad Reynolds (2025). COVID-19 Time Series Data [Dataset]. https://data.world/shad/covid-19-time-series-data
    Available download formats: csv, zip
    Dataset updated
    Mar 18, 2025
    Authors
    Shad Reynolds
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Description

    This data is synced hourly from https://github.com/CSSEGISandData/COVID-19. All credit is to them.

    Latest Confirmed Cases: https://data.world/shad/covid-analysis/workspace/query?datasetid=covid-19-time-series-data&queryid=e066701e-fa8d-4c9f-97f8-aab3a6f219a8

    I have also added confirmed_pivot.csv, which gives a slightly more workable view of the data; the extra column per day in the original layout makes things difficult.

    https://data.world/shad/covid-analysis/workspace/file?datasetid=covid-19-time-series-data&filename=confirmed_pivot

  6. Time Series Intelligence Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 20, 2025
    Cite
    Data Insights Market (2025). Time Series Intelligence Software Report [Dataset]. https://www.datainsightsmarket.com/reports/time-series-intelligence-software-1960202
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jan 20, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Time Series Intelligence Software market is projected to grow from USD XXX million in 2023 to USD XXX million by 2033, at a CAGR of XX% during the forecast period. Factors driving the growth of this market include the increasing adoption of IoT devices, the growing need for real-time data analysis, and the need for improved forecasting and predictive capabilities. Furthermore, the rising trend of digital transformation and the increasing need to gain insights from large datasets are contributing to the growth of this market. The market for Time Series Intelligence Software is fragmented, with a number of major players. The key players in this market include Google, SAP, Microsoft Azure, Trendalyze, Anodot, Seeq, SensorMesh, Warp 10, AxiBase, Shapelets, TrendMiner, and Datapred. These players offer a variety of solutions to meet the needs of different industries. For example, Google Cloud's Vertex AI Time Series Insight tool provides end-to-end capabilities for understanding and predicting time series data. Amazon Web Services (AWS) offers Amazon Forecast, which provides time series forecasting capabilities. These solutions are being increasingly adopted by businesses to gain insights from their data and make better decisions.

  7. Population Estimates Time Series Data

    • dtechtive.com
    • find.data.gov.scot
    Updated Mar 27, 2011
    Cite
    National Records of Scotland (2011). Population Estimates Time Series Data [Dataset]. https://dtechtive.com/datasets/3616
    Dataset updated
    Mar 27, 2011
    Dataset provided by
    National Records of Scotland
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    Scotland
    Description

    Over time, statistical outputs (and time series data) may be subject to revisions or corrections. Revisions are generally planned, and are the result of either improvements in statistical methods or the availability of additional data. For example, the annual mid-year population estimates are revised after a census to take account of the additional information gained from the census results. Details of planned revisions are held within the metadata alongside each publication. Corrections are unplanned and occur when errors in either the statistical data or methodology are found after release of the data. The latest correction to these datasets was in September 2018; for more information, please see the revisions and corrections page. This time series section provides access to the latest time series data, taking into account any revisions or corrections over the years. Note: tables are mainly offered for the purposes of extracting figures. Due to the size of some of the sheets, they are not recommended for printing.

  8. Santa Fe Time Series Competition Data Set B

    • physionet.org
    • search.datacite.org
    Updated Jan 6, 2000
    Cite
    (2000). Santa Fe Time Series Competition Data Set B [Dataset]. http://doi.org/10.13026/C20W2T
    Dataset updated
    Jan 6, 2000
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    This is a multivariate data set recorded from a patient in the sleep laboratory of the Beth Israel Hospital (now the Beth Israel Deaconess Medical Center) in Boston, Massachusetts. This data set was extracted from record slp60 of the MIT-BIH Polysomnographic Database, and it was submitted to the Santa Fe Time Series Competition in 1991 by our group. The data are presented in text form and have been split into two sequential parts. Each line contains simultaneous samples of three parameters; the interval between samples in successive lines is 0.5 seconds. The first column is the heart rate, the second is the chest volume (respiration force), and the third is the blood oxygen concentration (measured by ear oximetry). The sampling frequency for each measurement is 2 Hz (i.e., the time interval between measurements in successive rows is 0.5 seconds).
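    A minimal numpy sketch for parsing one of the two text files (the file name here is a placeholder; the column order and 0.5 s spacing follow the description above):

```python
# Parse a Santa Fe data set B text file: three simultaneous samples per line.
import numpy as np

data = np.loadtxt("b1.txt")       # placeholder name; shape (n_samples, 3)
heart_rate, chest_volume, blood_o2 = data.T
t = 0.5 * np.arange(len(data))    # 2 Hz sampling -> 0.5 s between rows
```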

  9. Data from: Nonparametric Anomaly Detection on Time Series of Graphs

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben (2023). Nonparametric Anomaly Detection on Time Series of Graphs [Dataset]. http://doi.org/10.6084/m9.figshare.13180181.v3
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Dorcas Ofori-Boateng; Yulia R. Gel; Ivor Cribben
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying change points and/or anomalies in dynamic network structures has become increasingly popular across various domains, from neuroscience to telecommunication to finance. One particular objective of anomaly detection from a neuroscience perspective is the reconstruction of the dynamic manner of brain region interactions. However, most statistical methods for detecting anomalies have the following unrealistic limitation for brain studies and beyond: that is, network snapshots at different time points are assumed to be independent. To circumvent this limitation, we propose a distribution-free framework for anomaly detection in dynamic networks. First, we present each network snapshot of the data as a linear object and find its respective univariate characterization via local and global network topological summaries. Second, we adopt a change point detection method for (weakly) dependent time series based on efficient scores, and enhance the finite-sample properties of the change point method by approximating the asymptotic distribution of the test statistic using the sieve bootstrap. We apply our method to simulated and to real data, particularly two functional magnetic resonance imaging (fMRI) datasets and the Enron communication graph. We find that our new method delivers impressively accurate and realistic results in terms of identifying the locations of true change points compared to the results reported by competing approaches. The new method promises to offer a deeper insight into the large-scale characterizations and functional dynamics of the brain and, more generally, into the intrinsic structure of complex dynamic networks. Supplemental materials for this article are available online.
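    For illustration only (this is not the authors' sieve-bootstrap procedure), a sketch of the first step, reducing each snapshot to a univariate topological summary, followed by a naive CUSUM scan of the resulting series:

```python
# Reduce a sequence of graph snapshots to one summary value each, then
# locate the largest shift with a simple CUSUM statistic (illustrative only).
import networkx as nx
import numpy as np

def summarize(snapshots):
    """One global topological summary per snapshot."""
    return np.array([nx.average_clustering(g) for g in snapshots])

def cusum_change_point(x):
    """Index maximizing |cumulative sum of deviations from the mean|."""
    dev = np.cumsum(x - x.mean())
    return int(np.argmax(np.abs(dev)))

# toy dynamic network that densifies after t = 50
snapshots = [nx.erdos_renyi_graph(60, 0.05 if t < 50 else 0.15, seed=t)
             for t in range(100)]
print("estimated change point:", cusum_change_point(summarize(snapshots)))
```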

  10. Four ways to quantify synchrony between time series data

    • osf.io
    Updated Dec 8, 2020
    Cite
    Jin Hyun Cheong (2020). Four ways to quantify synchrony between time series data [Dataset]. http://doi.org/10.17605/OSF.IO/BA3NY
    Dataset updated
    Dec 8, 2020
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Jin Hyun Cheong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project provides a sample dataset with detailed code showing how to quantify synchrony between time series data using Pearson correlation, time-lagged cross-correlation, dynamic time warping, and instantaneous phase synchrony. A rendered tutorial is available at http://jinhyuncheong.com/jekyll/update/2019/05/16/Four_ways_to_qunatify_synchrony.html
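    A minimal sketch of two of the four measures on synthetic data (the column names are made up; the project's own notebook covers all four methods):

```python
# Pearson correlation and time-lagged cross-correlation between two series.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"s1": rng.standard_normal(500)})
df["s2"] = df["s1"].shift(3).fillna(0) + 0.5 * rng.standard_normal(500)

r = df["s1"].corr(df["s2"])  # overall Pearson correlation

# time-lagged cross-correlation: correlate s1 with shifted copies of s2
lags = range(-10, 11)
tlcc = [df["s1"].corr(df["s2"].shift(k)) for k in lags]
peak_lag, peak_r = max(zip(lags, tlcc), key=lambda p: p[1])
print(f"pearson r = {r:.2f}; peak at lag {peak_lag} (r = {peak_r:.2f})")
```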

  11. S&P500 Volatility Prediction Time Series Data

    • kaggle.com
    Updated Aug 6, 2023
    Cite
    Mathis Jander (2023). S&P500 Volatility Prediction Time Series Data [Dataset]. http://doi.org/10.34740/kaggle/dsv/6257153
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Aug 6, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mathis Jander
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    This dataset was created for an academic paper on multivariate time series prediction of S&P 500 30-day volatility.

    To this end, S&P 500-related, financial, and macroeconomic time series data were compiled from different sources. The individual sources can be viewed in the Methodology section of the paper.
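    A minimal sketch of how a 30-day volatility target can be derived from daily closing prices (file and column names are assumptions, not this dataset's actual schema):

```python
# 30-day realized volatility from daily closes, annualized.
import numpy as np
import pandas as pd

prices = pd.read_csv("sp500.csv", index_col="date", parse_dates=True)  # hypothetical file
log_returns = np.log(prices["close"]).diff()
vol_30d = log_returns.rolling(30).std() * np.sqrt(252)
```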

  12. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example chart, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart: that cell represents the slope for the temporal range 2004–2014.

    This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel.

    Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    [Figure: TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).]
TSMx R script:

# import packages
library(dplyr)
library(readr)
library(ggplot2)
library(tibble)
library(tidyr)
library(forcats)
library(Kendall)

options(warn = -1) # disable warnings

# read data (.csv file with "Year" and "Value" columns)
data <- read_csv("EVI.csv")

# prepare row/column names for output matrices
years <- data %>% pull("Year")
r.names <- years[-length(years)]
c.names <- years[-1]
years <- years[-length(years)]

# initialize output matrices
sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

# function to return remaining years given a start year
getRemain <- function(start.year) {
  years <- data %>% pull("Year")
  start.ind <- which(data[["Year"]] == start.year) + 1
  remain <- years[start.ind:length(years)]
  return(remain)
}

# function to subset data for a start/end year combination
splitData <- function(end.year, start.year) {
  keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
  batch <- data[keep,]
  return(batch)
}

# function to fit linear regression and return slope direction
fitReg <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(sign(slope))
}

# function to fit linear regression and return slope magnitude
fitRegv2 <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(slope)
}

# function to implement Mann-Kendall (MK) trend test and return significance
# the test is implemented only for n >= 8
getMann <- function(batch) {
  if (nrow(batch) >= 8) {
    mk <- MannKendall(batch[['Value']])
    pval <- mk[['sl']]
  } else {
    pval <- NA
  }
  return(pval)
}

# function to return slope direction for all combinations given a start year
getSign <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  signs <- lapply(combs, fitReg)
  return(signs)
}

# function to return MK significance for all combinations given a start year
getPval <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  pvals <- lapply(combs, getMann)
  return(pvals)
}

# function to return slope magnitude for all combinations given a start year
getMagn <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  magns <- lapply(combs, fitRegv2)
  return(magns)
}

# retrieve slope direction, MK significance, and slope magnitude
signs <- lapply(years, getSign)
pvals <- lapply(years, getPval)
magns <- lapply(years, getMagn)

# fill-in output matrices
dimension <- nrow(sign.matrix)
for (i in 1:dimension) {
  sign.matrix[i, i:dimension] <- unlist(signs[i])
  pval.matrix[i, i:dimension] <- unlist(pvals[i])
  slope.matrix[i, i:dimension] <- unlist(magns[i])
}
sign.matrix <-...

  13. A novel stock forecasting model based on High-order-fuzzy-fluctuation Trends and Back Propagation Neural Network

    • plos.figshare.com
    docx
    Updated Jun 2, 2023
    Cite
    Hongjun Guan; Zongli Dai; Aiwu Zhao; Jie He (2023). A novel stock forecasting model based on High-order-fuzzy-fluctuation Trends and Back Propagation Neural Network [Dataset]. http://doi.org/10.1371/journal.pone.0192366
    Available download formats: docx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Hongjun Guan; Zongli Dai; Aiwu Zhao; Jie He
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this paper, we propose a hybrid method to forecast stock prices, called the High-order-fuzzy-fluctuation-Trends-based Back Propagation (HTBP) Neural Network model. First, we compare each value of the historical training data with the previous day's value to obtain a fluctuation trend time series (FTTS). On this basis, the FTTS is fuzzified into a fuzzy-fluctuation time series (FFTS) according to the amplitude and direction of each fluctuation (increase, equality, or decrease). Since the relationship between the FFTS and future wave trends is nonlinear, the HTBP neural network algorithm is used to find the mapping rules through self-learning. Finally, the outputs of the algorithm are used to predict future fluctuations. The proposed model provides some innovative features: (1) it combines fuzzy set theory and a neural network algorithm to avoid the overfitting problems that exist in traditional models; (2) the BP neural network algorithm can learn the internal rules of sequential data directly, without the need to analyze the influence factors of specific rules or their paths of action; (3) the hybrid model can reasonably remove noise from the internal rules by proper fuzzy treatment. This paper takes the TAIEX data set of the Taiwan stock exchange as an example, and compares and analyzes the prediction performance of the model. The experimental results show that this method can predict the stock market in a very simple way. At the same time, we use this method to predict the Shanghai stock exchange composite index, further verifying the effectiveness and universality of the method.
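    A hedged numpy sketch of the first step as described above, deriving a fluctuation trend series from day-over-day differences and fuzzifying it into decrease/equality/increase states (the equality threshold is an assumption; the paper's exact fuzzification rules and the BP network are not reproduced):

```python
# Fluctuation trend time series (FTTS) and a coarse fuzzification.
import numpy as np

prices = np.array([100.0, 101.2, 101.1, 101.1, 99.8, 100.5])  # toy data
ftts = np.diff(prices)                 # day-over-day fluctuations

eps = 0.05                             # "equality" band, assumed
ffts = np.where(np.abs(ftts) <= eps, 0, np.sign(ftts)).astype(int)
print(ffts)                            # -1 = decrease, 0 = equality, 1 = increase
```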

  14. Example Stata syntax and data construction for negative binomial time series regression

    • data.mendeley.com
    Updated Nov 2, 2022
    Cite
    Sarah Price (2022). Example Stata syntax and data construction for negative binomial time series regression [Dataset]. http://doi.org/10.17632/3mj526hgzx.2
    Dataset updated
    Nov 2, 2022
    Authors
    Sarah Price
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).

    The variables contained therein are defined as follows:

    case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).

    patid: a unique patient identifier.

    time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.

    ncons: number of consultations per month.

    period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

    burden: binary variable denoting membership of one of two multimorbidity burden groups.

    We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the Maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).

    Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
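    The do-files themselves are in Stata; as a rough Python analogue (an assumption, not the authors' code), the same kind of negative binomial regression of monthly consultation counts can be fit with statsmodels using the variables defined above:

```python
# Negative binomial GLM of consultation counts on time period and burden group.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_stata("dummy_dataset.dta")  # the sample dataset described above

model = smf.glm(
    "ncons ~ C(time_period) * C(burden)",
    data=df,
    family=sm.families.NegativeBinomial(),
).fit()
print(model.summary())
```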

  15. LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe – files

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 18, 2024
    Cite
    Klingler, Christoph (2024). LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe – files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4525244
    Dataset updated
    Jul 18, 2024
    Dataset provided by
    Kratzert, Frederik
    Klingler, Christoph
    Schulz, Karsten
    Herrnegger, Mathew
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Central Europe
    Description

    Version 1.0 - This version is the final revised one.

    This is the LamaH-CE dataset accompanying the paper: Klingler et al., LamaH-CE | LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe, published at Earth System Science Data (ESSD), 2021 (https://doi.org/10.5194/essd-13-4529-2021).

    LamaH-CE contains a collection of runoff and meteorological time series as well as various (catchment) attributes for 859 gauged basins. The hydrometeorological time series are provided at daily and hourly time resolution, including quality flags. All meteorological and the majority of runoff time series cover a span of over 35 years, which enables long-term analyses with high temporal resolution. LamaH is in its basics quite similar to the well-known CAMELS datasets for the contiguous United States (https://doi.org/10.5194/hess-21-5293-2017), Chile (https://doi.org/10.5194/hess-22-5817-2018), Brazil (https://doi.org/10.5194/essd-12-2075-2020), Great Britain (https://doi.org/10.5194/essd-12-2459-2020) and Australia (https://doi.org/10.5194/essd-13-3847-2021), but new features like additional basin delineations (intermediate catchments) and attributes allow the hydrological network and river topology to be considered in further applications.

    We provide two different files to download: 1) hydrometeorological time series with daily and hourly resolution, which requires about 70 GB of free disk space when decompressed, and 2) hydrometeorological time series with daily resolution only, which requires 5 GB. Beyond the temporal resolution of the time series, there are no differences.

    Note: It is recommended to read the supplementary info file before using the dataset. For example, it clarifies the time conventions and that NAs are indicated by the number -999 in the runoff time series.
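    A minimal pandas sketch of that NA convention (the file path and delimiter are assumptions; consult the supplementary info file for the real layout):

```python
# Treat -999 as missing when reading a LamaH runoff series.
import pandas as pd

runoff = pd.read_csv(
    "D_gauges/2_timeseries/daily/ID_1.csv",  # hypothetical path within the dataset
    sep=";",                                  # assumed delimiter
    na_values=[-999],
)
```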

    Disclaimer: We have created LamaH with care and checked the outputs for plausibility. By downloading the dataset, you agree that neither we nor the providers of the source datasets used (e.g. runoff time series) can be held liable for the data provided. The runoff time series of the German federal states Bavaria and Baden-Württemberg are retrospectively checked and updated by the hydrographic services. Therefore, it might be appropriate to obtain more up-to-date runoff data from Bavaria (https://www.gkd.bayern.de/en/rivers/discharge/tables) and Baden-Württemberg (https://udo.lubw.baden-wuerttemberg.de/public/p/pegel_messwerte_leer). Runoff data from the Czech Republic may not be used to set up operational warning systems (https://www.chmi.cz/files/portal/docs/hydro/denni_data/Podminky_uziti.pdf).

    License: This work is licensed with CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/). This means that you may freely use and modify the data (even for commercial purposes). But you have to give appropriate credit (associated ESSD paper, version of dataset and all sources which are declared in the folder "Info"), indicate if and what changes were made and distribute your work under the same public license as the original.

    Additional references: We ask kindly for compliance in citing the following references when using LamaH, as an agreement to cite was usually a condition of sharing the data: BAFU (2020), CHMI (2020), GKD (2020), HZB (2020), LUBW (2020), BMLFUW (2013), Broxton et al. (2014), CORINE (2012), EEA (2019), ESDB (2004), Farr et al. (2007), Friedl and Sulla-Menashe (2019), Gleeson et al. (2014), HAO (2007), Hartmann and Moosdorf (2012), Hiederer (2013a, b), Linke et al. (2019), Muñoz Sabater et al. (2021), Muñoz Sabater (2019a), Myneni et al. (2015), Pelletier et al. (2016), Toth et al. (2017), Trabucco and Zomer (2019), and Vermote (2015). These references are listed in detail in the accompanying paper.

    Supplements: We have created additional files after publication (therefore not peer-reviewed): 1) shapefiles for reservoirs (points) and cross-basin water transfers (lines), including several attributes, as well as tables with information about the accumulated storage volume and effective catchment area (considering artificial in- and outflows) for every runoff gauge; 2) water quality data (e.g. dissolved oxygen, water temperature, conductivity, NO3-N) matched to the gauges. The data for water quality may not be used for commercial purposes. If you are interested, just send us an email with your name, affiliation and the intended purpose for the requested files to the address listed below. If you find any errors in the dataset, feel free to email us at: christoph.klingler@boku.ac.at

  16. Population estimates time series dataset

    • ons.gov.uk
    • cy.ons.gov.uk
    csv, xlsx
    Updated Oct 8, 2024
    Cite
    Office for National Statistics (2024). Population estimates time series dataset [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/populationestimatestimeseriesdataset
    Available download formats: csv, xlsx
    Dataset updated
    Oct 8, 2024
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    The mid-year estimates refer to the population on 30 June of the reference year and are produced in line with the standard United Nations (UN) definition for population estimates. They are the official set of population estimates for the UK and its constituent countries, the regions and counties of England, and local authorities and their equivalents.

  17. Store Sales - T.S Forecasting...Merged Dataset

    • kaggle.com
    Updated Dec 15, 2021
    Cite
    Shramana Bhattacharya (2021). Store Sales - T.S Forecasting...Merged Dataset [Dataset]. https://www.kaggle.com/shramanabhattacharya/store-sales-ts-forecastingmerged-dataset/code
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Dec 15, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shramana Bhattacharya
    Description

    This dataset is a merged dataset created from the data provided in the competition "Store Sales - Time Series Forecasting". The other datasets that were provided there apart from train and test (for example holidays_events, oil, stores, etc.) could not be used directly in the final prediction. According to my understanding, EDA on the merged dataset will give a clearer picture of the other factors that might also affect the final prediction of grocery sales. Therefore, I created this merged dataset and posted it here for further analysis.

    Data Description: Data Field Information (this is a copy of the description as provided in the actual dataset)

    Train.csv:
    • id: store id
    • date: date of the sale
    • store_nbr: identifies the store at which the products are sold.
    • family: identifies the type of product sold.
    • sales: gives the total sales for a product family at a particular store at a given date. Fractional values are possible since products can be sold in fractional units (1.5 kg of cheese, for instance, as opposed to 1 bag of chips).
    • onpromotion: gives the total number of items in a product family that were being promoted at a store on a given date.
    • Store metadata, including city, state, type, and cluster (a grouping of similar stores).
    • Holidays and Events, with metadata. NOTE: Pay special attention to the transferred column. A holiday that is transferred officially falls on that calendar day but was moved to another date by the government. A transferred day is more like a normal day than a holiday. To find the day that it was celebrated, look for the corresponding row where the type is Transfer. For example, the holiday Independencia de Guayaquil was transferred from 2012-10-09 to 2012-10-12, which means it was celebrated on 2012-10-12. Days that are type Bridge are extra days that are added to a holiday (e.g., to extend the break across a long weekend). These are frequently made up by the type Work Day, which is a day not normally scheduled for work (e.g., Saturday) that is meant to pay back the Bridge. Additional holidays are days added to a regular calendar holiday, for example, as typically happens around Christmas (making Christmas Eve a holiday).
    • dcoilwtico: daily oil price. Includes values during both the train and test data timeframes. (Ecuador is an oil-dependent country and its economic health is highly vulnerable to shocks in oil prices.)

    Note: There is a transaction column in the training dataset which displays the sales transactions on that particular date.

    Test.csv - the test data, having the same features as the training data. You will predict the target sales for the dates in this file. The dates in the test data are for the 15 days after the last date in the training data. Note: there is no transaction column in the test dataset as there is in the training dataset. Therefore, while building the model, you might exclude this column and use it only for EDA.

    submission.csv - A sample submission file in the correct format.
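    A minimal pandas sketch of how such a merged view can be rebuilt from the competition files (file and column names follow the description above and are assumptions):

```python
# Merge the competition's side tables onto train.csv.
import pandas as pd

train = pd.read_csv("train.csv", parse_dates=["date"])
stores = pd.read_csv("stores.csv")                            # city, state, type, cluster
oil = pd.read_csv("oil.csv", parse_dates=["date"])            # dcoilwtico
holidays = pd.read_csv("holidays_events.csv", parse_dates=["date"])

merged = (
    train
    .merge(stores, on="store_nbr", how="left")
    .merge(oil, on="date", how="left")
    .merge(holidays, on="date", how="left")  # note: multiple holidays on one
)                                            # date will duplicate rows
print(merged.columns.tolist())
```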

  18. Sample time series analysis for point locations through the Climate Assessments and Scenario Planning (CLASP) project, Derived from GHCN and BCCA data

    • catalog.data.gov
    • datasets.ai
    Updated Jun 15, 2024
    Cite
    Climate Adaptation Science Centers (2024). Sample time series analysis for point locations through the Climate Assessments and Scenario Planning (CLASP) project, Derived from GHCN and BCCA data [Dataset]. https://catalog.data.gov/dataset/sample-time-series-analysis-for-point-locations-through-the-climate-assessments-and-scenar
    Dataset updated
    Jun 15, 2024
    Dataset provided by
    Climate Adaptation Science Centers
    Description

    Long-term historical (derived from GHCN) and future simulated (derived from BCCA) time series analyses for several meteorological variables are provided to several clients within the Northeast Climate Adaptation Science Center (NE CASC) footprint as background on the state of changes in their local climate. Variables include average annual and seasonal temperature and precipitation, extreme temperature and precipitation, wind, and snow depth. Precipitation includes both rain and snow.

  19. Time Series Databases Software market size will be $993.24 Million by 2028!

    • cognitivemarketresearch.com
    pdf, excel, csv, ppt
    Updated Jul 27, 2023
    Cite
    Cognitive Market Research (2023). Time Series Databases Software market size will be $993.24 Million by 2028! [Dataset]. https://www.cognitivemarketresearch.com/time-series-databases-software-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Jul 27, 2023
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    As per Cognitive Market Research's latest published report, the Global Time Series Databases Software market size will be $993.24 Million by 2028. The Time Series Databases Software industry's compound annual growth rate will be 18.36% from 2023 to 2030.

    Factors Affecting Time Series Databases Software Market Growth

    Rise in Automation in Industry

    Industrial sensors are a key part of factory automation and Industry 4.0. Motion, environmental, and vibration sensors are used to monitor the health of equipment, from linear or angular positioning, tilt sensing, and leveling to shock or fall detection. A sensor is a device that detects changes in electrical, physical, or other quantities and produces an output that confirms the change in the measured quantity.

    In simple terms, industrial automation sensors are input devices that provide an output (signal) with respect to a specific physical quantity (input). In industrial automation, sensors play a vital part in making products intelligent and highly automated. They permit one to detect, analyze, measure, and process a variety of changes, such as alterations in position, length, height, surface, and displacement, that occur at industrial manufacturing sites. These sensors also play a pivotal role in predicting and preventing numerous potential events, thus catering to the requirements of many sensing applications. Such sensors generally produce time series data, as readings are taken at equal intervals of time.

    The increasing use of sensors to monitor industrial activity in production facilities is fueling the growth of the time series database software market. Manufacturing in the pharmaceutical industry also requires close monitoring, which further increases demand for sensors and, with them, for time series database software.

    Market Dynamics of the Time Series Databases Software Market

    Key Drivers of the Time Series Databases Software Market

    Increasing Adoption of IoT Devices: The rise of IoT devices is producing vast amounts of time-stamped data. Time Series Databases (TSDBs) are specifically engineered to manage this data effectively, facilitating real-time monitoring, analytics, and forecasting, rendering them crucial for sectors such as manufacturing, energy, and smart cities.

    Rising Demand for Real-Time Analytics: Companies are progressively emphasizing real-time data processing to enable quicker, data-informed decisions. TSDBs accommodate rapid data ingestion and querying, allowing for real-time analysis across various sectors including finance, IT infrastructure, and logistics, significantly enhancing their market adoption.

    Growth of Cloud Infrastructure: As cloud computing becomes ubiquitous, cloud-native TSDB solutions are gaining popularity. These platforms provide scalability, ease of deployment, and lower operational expenses. The need for adaptable and on-demand database solutions fosters the expansion of TSDBs within contemporary IT environments.

    Key Restraints in the Time Series Databases Software Market

    High Implementation and Maintenance Costs: The deployment and upkeep of Time Series Database (TSDB) systems can necessitate a considerable financial commitment, particularly for small to medium-sized businesses. The costs encompass infrastructure establishment, the hiring of skilled personnel, and the integration with current systems, which may discourage market adoption in cost-sensitive environments.

    Complexity in Data Management: Managing large volumes of time-stamped data demands a robust system architecture. As the amount of data increases, difficulties in indexing, querying, and efficient storage can adversely affect performance and user experience, thereby restricting usability for organizations that lack strong technical support.

    Competition from Traditional Databases: In spite of their benefits, TSDBs encounter competition from advanced traditional databases such as relational and NoSQL systems. Many of these databases now offer time-series functionalities, leading organizations to be reluctant to invest in new TSDB software when existing solutions can be enhanced.

    Key Trends of the Time Series Databases Software Market

    Integration with AI and Machine Learning Tools: TSDBs are progressively being integrated with AI/ML platfo...

  20. Data from: ANES 2016 Time Series Study

    • icpsr.umich.edu
    ascii, delimited, r (+3 more)
    Updated Sep 19, 2017
    Cite
    Inter-university Consortium for Political and Social Research [distributor] (2017). ANES 2016 Time Series Study [Dataset]. http://doi.org/10.3886/ICPSR36824.v2
    Available download formats: r, delimited, stata, sas, ascii, spss
    Dataset updated
    Sep 19, 2017
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/36824/terms

    Time period covered
    Sep 2016 - Jan 2017
    Area covered
    United States
    Description

    This study is part of the American National Election Study (ANES), a time-series collection of national surveys fielded continuously since 1948. The American National Election Studies are designed to present data on Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life.

    As with all Time Series studies conducted during years of presidential elections, respondents were interviewed during the two months preceding the November election (pre-election interview), and then re-interviewed during the two months following the election (post-election interview). Like its predecessors, the 2016 ANES was divided between questions necessary for tracking long-term trends and questions necessary to understand the particular political moment of 2016. The study maintains and extends the ANES time-series 'core' by collecting data on Americans' basic political beliefs, allegiances, and behaviors, which are so critical to a general understanding of politics that they are monitored at every election, no matter the nature of the specific campaign or the broader setting.

    This 2016 ANES study features a dual-mode design with both traditional face-to-face interviewing (n=1,181) and surveys conducted on the Internet (n=3,090), and a total sample size of 4,271. In addition to content on electoral participation, voting behavior, and public opinion, the 2016 ANES Time Series Study contains questions about areas such as media exposure, cognitive style, and values and predispositions. Several items first measured on the 2012 ANES study were again asked, including "Big Five" personality traits using the Ten Item Personality Inventory (TIPI), and skin tone observations made by interviewers in the face-to-face study. For the first time, ANES collected supplemental data directly from respondents' Facebook accounts. The post-election interview also included Module 5 from the Comparative Study of Electoral Systems (CSES), exploring themes in populism, perceptions of elites, corruption, and attitudes towards representative democracy.

    Face-to-face interviews were conducted by trained interviewers using computer-assisted personal interviewing (CAPI) software on laptop computers. During a portion of the face-to-face interview, the respondent answered certain sensitive questions on the laptop computer directly, without the interviewer's participation (known as computer-assisted self-interviewing, or CASI). Internet questionnaires could be completed anywhere the respondent had access to the Internet, on a computer or on a mobile device. Respondents were only eligible to complete the survey in the mode for which they were sampled.

    Demographic variables include respondent age, education level, political affiliation, race/ethnicity, marital status, and family composition.
