100+ datasets found
  1. U.S. Geological Survey Oceanographic Time Series Data Collection

    • catalog.data.gov
    • data.usgs.gov
    • +4 more
    Updated Oct 30, 2025
    Cite
    U.S. Geological Survey (2025). U.S. Geological Survey Oceanographic Time Series Data Collection [Dataset]. https://catalog.data.gov/dataset/u-s-geological-survey-oceanographic-time-series-data-collection
    Dataset updated
    Oct 30, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The oceanographic time series data collected by U.S. Geological Survey scientists and collaborators are served in an online database at http://stellwagen.er.usgs.gov/index.html. These data were collected as part of research experiments investigating circulation and sediment transport in the coastal ocean. The experiments (projects, research programs) are typically one month to several years long and have been carried out since 1975. New experiments will be conducted, and the data from them will be added to the collection. As of 2016, all but one of the experiments were conducted in waters abutting the U.S. coast; the exception was conducted in the Adriatic Sea. Measurements acquired vary by site and experiment; they usually include current velocity, wave statistics, water temperature, salinity, pressure, turbidity, and light transmission from one or more depths over a time period. The measurements are concentrated near the sea floor but may also include data from the water column. The user interface provides an interactive map, a tabular summary of the experiments, and a separate page for each experiment. Each experiment page has documentation and maps that provide details of what data were collected at each site. Links to related publications with additional information about the research are also provided. The data are stored in Network Common Data Format (netCDF) files using the Equatorial Pacific Information Collection (EPIC) conventions defined by the National Oceanic and Atmospheric Administration (NOAA) Pacific Marine Environmental Laboratory. NetCDF is a general, self-documenting, machine-independent, open source data format created and supported by the University Corporation for Atmospheric Research (UCAR). EPIC is an early set of standards designed to allow researchers from different organizations to share oceanographic data. The files may be downloaded or accessed online using the Open-source Project for a Network Data Access Protocol (OPeNDAP). 
The OPeNDAP framework allows users to access data from anywhere on the Internet using a variety of Web services including Thematic Realtime Environmental Distributed Data Services (THREDDS). A subset of the data compliant with the Climate and Forecast convention (CF, currently version 1.6) is also available.

  2. Time series data for forecasting

    • kaggle.com
    zip
    Updated Apr 17, 2024
    Cite
    Vikram Baliga (2024). Time series data for forecasting [Dataset]. https://www.kaggle.com/datasets/vikkubaliga/monash-oiklab-weather
    Available download formats: zip (104274828 bytes)
    Dataset updated
    Apr 17, 2024
    Authors
    Vikram Baliga
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Vikram Baliga

    Released under CC BY-SA 4.0


  3. 1000 Empirical Time series

    • figshare.com
    • bridges.monash.edu
    • +1 more
    png
    Updated May 30, 2023
    Cite
    Ben Fulcher (2023). 1000 Empirical Time series [Dataset]. http://doi.org/10.6084/m9.figshare.5436136.v10
    Available download formats: png
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben Fulcher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A diverse selection of 1000 empirical time series, along with the results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney.

    The results of the computation are in the hctsa file HCTSA_Empirical1000.mat, for use in Matlab with v1.06 of hctsa. The same data are also provided in .csv format: hctsa_datamatrix.csv (results of feature computation), with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and the corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv); the data of the individual time series (one time series per line, as described in hctsa_timeseries-info.csv) are in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa.

    The input file, INP_Empirical1000.mat, is for use with hctsa and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');

    Some visualizations of the dataset are in CarpetPlot.png (first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be performed using TS_PlotTimeSeries from the hctsa package. See links in references for more comprehensive documentation on performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.

  4. Dates

    • figshare.com
    • search.datacite.org
    txt
    Updated Jan 20, 2016
    Cite
    Sandra Villamizar (2016). Dates [Dataset]. http://doi.org/10.6084/m9.figshare.1483488.v1
    Available download formats: txt
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sandra Villamizar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A table of dates for a period of interest, usually a month, expressed in two different formats: mm/dd/yyyy and mm-dd-yyyy. Start date: 12/01/2014. End date: 12/31/2014.
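    Such a table is easy to regenerate with Python's standard library; a minimal sketch (the function name date_table is illustrative; the December 2014 range is taken from the description above):

```python
from datetime import date, timedelta

def date_table(start, end):
    """List every date from start to end (inclusive) in both
    mm/dd/yyyy and mm-dd-yyyy formats."""
    rows = []
    d = start
    while d <= end:
        rows.append((d.strftime("%m/%d/%Y"), d.strftime("%m-%d-%Y")))
        d += timedelta(days=1)
    return rows

table = date_table(date(2014, 12, 1), date(2014, 12, 31))
print(table[0])    # ('12/01/2014', '12-01-2014')
print(len(table))  # 31
```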

  5. Rainfall Dataset for Simple Time Series Analysis

    • kaggle.com
    zip
    Updated Apr 20, 2024
    Cite
    Sujith K Mandala (2024). Rainfall Dataset for Simple Time Series Analysis [Dataset]. https://www.kaggle.com/datasets/sujithmandala/rainfall-dataset-for-simple-time-series-analysis
    Available download formats: zip (684 bytes)
    Dataset updated
    Apr 20, 2024
    Authors
    Sujith K Mandala
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains daily rainfall measurements (in millimeters) for 2022, spanning January 1, 2022, to July 3, 2022, for a total of 184 days. The dataset can be used for various machine learning tasks, such as time series forecasting, pattern recognition, or anomaly detection related to rainfall patterns.

    Column Descriptors:

    • date (date): The date of the rainfall measurement, in the format YYYY-MM-DD. Example: 2022-01-01.
    • rainfall (float): The amount of rainfall recorded on the corresponding date, measured in millimeters (mm). Example: 12.5. Range: 0.0 mm (no rainfall) to 22.4 mm (the maximum recorded value in the dataset). Missing values: none.
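    A minimal sketch of reading a file in this layout with the standard library (the two sample rows are hypothetical, following the column descriptions above):

```python
import csv
import io

# Hypothetical stand-in for the dataset's CSV (columns: date, rainfall in mm).
sample = io.StringIO("date,rainfall\n2022-01-01,12.5\n2022-01-02,0.0\n")

rows = [(r["date"], float(r["rainfall"])) for r in csv.DictReader(sample)]
wettest_day = max(rows, key=lambda r: r[1])
print(wettest_day)  # ('2022-01-01', 12.5)
```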

  6. Wikipedia time-series graph

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv
    Updated Apr 24, 2025
    Cite
    Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre (2025). Wikipedia time-series graph [Dataset]. http://doi.org/10.5281/zenodo.886484
    Available download formats: bin, csv
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benzi Kirell; Miz Volodymyr; Ricaud Benjamin; Vandergheynst Pierre
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Wikipedia temporal graph.

    The dataset is based on two Wikipedia SQL dumps: (1) English language articles and (2) user visit counts per page per hour (aka pagecounts). The original datasets are publicly available on the Wikimedia website.

    Static graph structure is extracted from English-language Wikipedia articles. Redirects are removed. Before building the Wikipedia graph we introduce thresholds on the minimum number of visits per hour and the maximum in-degree. We remove pages that have fewer than 500 visits per hour at least once during the specified period. In addition, we remove nodes (pages) with in-degree higher than 8 000 to build a more meaningful initial graph. After cleaning, the graph contains 116 016 nodes (out of 4 856 639 pages in total) and 6 573 475 edges. The graph can be imported in two ways: (1) using edges.csv and vertices.csv, or (2) using the enwiki-20150403-graph.gt file, which can be opened with the open-source Python library graph-tool.

    Time-series data contains users' visit counts from 02:00, 23 September 2014 until 23:00, 30 April 2015. The total number of hours is 5278. The data is stored in two formats: CSV and H5. CSV file contains data in the following format [page_id :: count_views :: layer], where layer represents an hour. In H5 file, each layer corresponds to an hour as well.
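    A sketch of grouping the CSV rows into per-page hourly series, assuming comma-separated page_id, count_views, layer columns (the delimiter and the sample rows are assumptions; check them against the actual file):

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the [page_id :: count_views :: layer] layout,
# assuming a comma delimiter; layer is the hour index.
sample = io.StringIO("1042,731,0\n1042,804,1\n2077,512,0\n")

views = defaultdict(list)  # page_id -> [(hour, visit count), ...]
for page_id, count, layer in csv.reader(sample):
    views[page_id].append((int(layer), int(count)))

print(views["1042"])  # [(0, 731), (1, 804)]
```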

  7. NCAR S-Pol radar time series data

    • data.ucar.edu
    • ckanprod.data-commons.k8s.ucar.edu
    archive
    Updated Oct 7, 2025
    Cite
    NCAR/EOL S-Pol Team (2025). NCAR S-Pol radar time series data [Dataset]. http://doi.org/10.5065/D69P3016
    Available download formats: archive
    Dataset updated
    Oct 7, 2025
    Authors
    NCAR/EOL S-Pol Team
    Time period covered
    Feb 10, 2014 - Mar 1, 2014
    Area covered
    Description

    S-Pol radar full time series data in IWRF format collected continuously during the LATTE (Lower Atmospheric Thermodynamics & Turbulence Experiment) project. Each file covers about 15 minutes of S-Pol operation. See the FRONT S-Pol Data Availability 2014-2015 document linked below to check on data availability.

  8. NCAR S-Pol radar time series data

    • data.ucar.edu
    • ckanprod.data-commons.k8s.ucar.edu
    Updated Oct 7, 2025
    Cite
    NCAR/EOL S-Pol Team (2025). NCAR S-Pol radar time series data [Dataset]. http://doi.org/10.5065/D6VD6WNC
    Dataset updated
    Oct 7, 2025
    Authors
    NCAR/EOL S-Pol Team
    Time period covered
    Mar 9, 2015 - Jul 16, 2015
    Area covered
    Description

    S-Pol Radar full time series data collected during the Plains Elevated Convection at Night (PECAN) campaign from 9 March 2015 to 16 July 2015. This is a "realtime" PECAN data set. The files are a mix of hourly files ("SPOL_scan") and episodic files (SPOL_vert and SPOL_sunscan). The files are in Integrated Weather Radar Facility (IWRF) format and are available as tar archives.

  9. Timeseries(5min)

    • kaggle.com
    zip
    Updated Nov 10, 2022
    Cite
    Ali Mirza (2022). Timeseries(5min) [Dataset]. https://www.kaggle.com/datasets/qaim313/timeseries5min
    Available download formats: zip (1656 bytes)
    Dataset updated
    Nov 10, 2022
    Authors
    Ali Mirza
    Description

    The data was extracted from the Alpha Vantage API through RapidAPI, loaded into a DataFrame with pandas, and then saved in CSV format.

    You can extract the data with the code below.

      import requests
      import pandas as pd

      url = "https://alpha-vantage.p.rapidapi.com/query"
      querystring = {"interval": "5min", "function": "TIME_SERIES_INTRADAY",
                     "symbol": "MSFT", "datatype": "json",
                     "output_size": "compact"}
      headers = {"X-RapidAPI-Key": "yourrapidapikey",
                 "X-RapidAPI-Host": "alpha-vantage.p.rapidapi.com"}

      def get_data():
          response = requests.get(url, headers=headers, params=querystring)
          # Transpose so that each row is one 5-minute timestamp.
          df = pd.DataFrame(response.json()["Time Series (5min)"]).T
          df.to_csv("TimeSeries(5mins).csv")
          return df

      get_data()

    Please refer to this link for more information: https://rapidapi.com/alphavantage/api/alpha-vantage.

  10. Pseudo Periodic Synthetic Time Series

    • kaggle.com
    zip
    Updated Aug 25, 2020
    Cite
    Overfitted (2020). Pseudo Periodic Synthetic Time Series [Dataset]. https://www.kaggle.com/vipulgote4/pseudo-periodic-synthetic-time-series
    Available download formats: zip (10970583 bytes)
    Dataset updated
    Aug 25, 2020
    Authors
    Overfitted
    Description

    Data type:

    The data is a synthetic univariate time series.

    Abstract

    This data set is designed for testing indexing schemes in time series databases. The data appears highly periodic, but never exactly repeats itself. This feature is designed to challenge the indexing tasks.

    Context

    Data Characteristics

    This data set is designed for testing indexing schemes in time series databases. It is a much larger dataset than has been used in any published study that we are currently aware of: it contains one million data points. The data has been split into 10 sections to facilitate testing (see below). We recommend building the index with 9 of the 100,000-datapoint sections and randomly extracting a query shape from the 10th section. (Some previously published work appears to have used queries that were also used to build the indexing structure, which produces optimistic results.) The data are interesting because they have structure at different resolutions. Each of the 10 sections was generated by an independent invocation of a generating function (given as an equation image on the Kaggle dataset page), where rand(x) produces a random integer between zero and x. The data appears highly periodic, but never exactly repeats itself. This feature is designed to challenge the indexing structure.


    Data Format

    The data is stored in one ASCII file. There are 10 columns, 100,000 rows. All data points are in the range -0.5 to +0.5. Rows are separated by carriage returns, columns by spaces.
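    A sketch of loading the file with the standard library, assuming each of the 10 space-separated columns holds one of the 10 sections (the two sample rows are made up):

```python
import io

# Hypothetical two-row stand-in for the ASCII file: 10 space-separated columns.
sample = io.StringIO(
    "0.12 -0.30 0.05 0.44 -0.21 0.33 -0.07 0.18 -0.49 0.02\n"
    "0.10 -0.28 0.07 0.41 -0.25 0.35 -0.05 0.20 -0.47 0.01\n"
)

rows = [[float(x) for x in line.split()] for line in sample]
sections = list(zip(*rows))  # sections[i] = the i-th series, read column-wise

assert all(-0.5 <= v <= 0.5 for row in rows for v in row)
print(len(sections))  # 10
```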

    Acknowledgements

    Freely available for research use.


  11. Internet subscriptions time series data

    • kaggle.com
    zip
    Updated Aug 9, 2022
    Cite
    Ahmet Zamanis (2022). Internet subscriptions time series data [Dataset]. https://www.kaggle.com/datasets/ahmetzamanis/internet-subscriptions-time-series-data
    Available download formats: zip (40615 bytes)
    Dataset updated
    Aug 9, 2022
    Authors
    Ahmet Zamanis
    License

    World Bank terms of use for datasets: https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets

    Description

    A dataset of broadband subscriptions and GDP per capita statistics, for 217 countries, between the years 2000-2020. The data is in long format, suitable for time series analysis.

    The variables in the dataset are:
    • year: Year of the observation, between 2000-2020.
    • country: Country of the observation, 217 in total.
    • broadband_subs: Number of broadband subscriptions per 100 people.
    • GDPPC: GDP per capita, in 2022 US$.
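    Long format means one row per (year, country) pair. A sketch of reshaping it for analysis with pandas (the four sample rows and their values are invented):

```python
import pandas as pd

# Hypothetical rows in the dataset's long layout.
long_df = pd.DataFrame({
    "year": [2019, 2020, 2019, 2020],
    "country": ["Norway", "Norway", "Chile", "Chile"],
    "broadband_subs": [41.2, 43.5, 17.1, 19.8],
    "GDPPC": [75420.0, 67330.0, 14740.0, 13230.0],
})

# Long -> wide: one row per year, one column per country.
wide = long_df.pivot(index="year", columns="country", values="broadband_subs")
print(wide.loc[2020, "Norway"])  # 43.5
```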

    There are some missing values (NAs) for broadband_subs and GDPPC, especially in the earlier years.

    The data source is World Bank Open Data. The original data was retrieved as two separate datasets in wide format, and converted into long format, in May 2022.

  12. Time-series water level and water quality data to accompany Scientific Investigations Report 2018-5040

    • catalog.data.gov
    • data.usgs.gov
    • +2 more
    Updated Nov 20, 2025
    Cite
    U.S. Geological Survey (2025). Time-series water level and water quality data to accompany Scientific Investigations Report 2018-5040 [Dataset]. https://catalog.data.gov/dataset/time-series-water-level-and-water-quality-data-to-accompany-scientific-investigations-2018
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This Data Release serves as a repository for a set of time-series data used in Scientific Investigations Report 2018-5040. The data represent continuous measurements of specific conductance, water temperature, and/or water level (stage), recorded by a variety of types of data loggers during three multi-day interference tests conducted on the Virgin River at Pah Tempe Springs during November 2013, February 2014, and November 2014. The data presented are the raw data downloaded from the data loggers and are organized according to the date of the test and the type and name of the observation site. The Data Release contains 3 items:
    1. An explanatory table, "PahTempe_table1.xlsx", which indicates which parameters were collected and on what instrument at each site during a given test
    2. The data, "PahTempe_data.zip"; this zipped file contains the raw data logger files in comma-separated values (CSV) format, organized into folders according to the date of the interference pumping test
    3. The metadata document, "PahTempe_metadata.xml"
    Because these data were collected during multi-day interference pumping tests, they do not represent natural hydrologic conditions in the river, springs, or shallow groundwater system. Users of this data are advised to refer to the larger work citation for proper use and interpretation of the data.

  13. 2 - Discharge time series data

    • hydroshare.org
    • beta.hydroshare.org
    • +1 more
    zip
    Updated Dec 19, 2020
    Cite
    Christine D Leclerc (2020). 2 - Discharge time series data [Dataset]. http://doi.org/10.4211/hs.1182147d58724a2a84dc3a382636d35e
    Available download formats: zip (4.2 MB)
    Dataset updated
    Dec 19, 2020
    Dataset provided by
    HydroShare
    Authors
    Christine D Leclerc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 3, 1951 - Jun 28, 2020
    Area covered
    Description

    This directory includes discharge time series data (q) for 14 headwater stream networks, produced in standard format and common units of mm/day for straightforward hydrograph inter-comparison.
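    Expressing discharge in mm/day normalises volumetric flow by catchment area, which is what makes hydrographs from different networks directly comparable. A sketch of the conversion (the 86.4 km² area is an illustrative value, not one of the study catchments):

```python
def discharge_to_mm_per_day(q_m3s, catchment_area_km2):
    """Convert discharge (m^3/s) to a depth-equivalent rate (mm/day):
    mm/day = q * 86400 s/day / area_m2 * 1000 mm/m."""
    area_m2 = catchment_area_km2 * 1e6
    return q_m3s * 86400 / area_m2 * 1000

# 1 m^3/s over an 86.4 km^2 catchment is exactly 1 mm/day.
print(discharge_to_mm_per_day(1.0, 86.4))  # 1.0
```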

  14. ERA5 Land hourly time-series data from 1950 to present

    • cds.climate.copernicus.eu
    csv, netcdf
    Updated Nov 23, 2025
    Cite
    ECMWF (2025). ERA5 Land hourly time-series data from 1950 to present [Dataset]. https://cds.climate.copernicus.eu/datasets/reanalysis-era5-land-timeseries
    Available download formats: csv, netcdf
    Dataset updated
    Nov 23, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1950 - Dec 31, 2026
    Description

    ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to the Fifth Generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis (ERA5). Produced by replaying only the land component of the ECMWF ERA5 climate reanalysis, it benefits from the same physical data-assimilation framework but runs offline at higher spatial detail (9 km grid) to deliver richer land-surface information. Reanalysis merges numerical model output with global observations into a globally complete, physically consistent climate record; this "data assimilation" approach mirrors operational weather forecasting but is optimised for historical completeness rather than forecast timeliness. Reanalysis datasets extend back several decades by sacrificing forecast deadlines, allowing additional time to gather observations and retrospectively ingest improved data, thereby enhancing data quality in earlier periods. ERA5-Land uses atmospheric fields from ERA5 (air temperature, humidity, pressure) as "forcing" inputs to drive its land-surface model, preventing the rapid drift from reality that unconstrained simulations would suffer. Although observations do not enter the land model directly, they shape the atmospheric forcing through assimilation, giving ERA5-Land an indirect observational anchor. To reconcile ERA5's coarser grid with ERA5-Land's finer 9 km grid, a lapse-rate correction adjusts input temperatures, humidity, and pressures for altitude differences. Like all numerical simulations, ERA5-Land carries uncertainty that generally grows backward in time, as fewer observations were available to constrain the forcing. Users can combine ERA5-Land fields with the uncertainty estimates of the equivalent ERA5 variables to assess confidence bounds.

    The temporal resolution (hourly) and spatial detail (9 km) of ERA5-Land make it invaluable for land-surface applications such as flood and drought forecasting, agricultural monitoring, and hydrological studies. The dataset presented here is a regridded subset of the full ERA5-Land archive, stored in an Analysis-Ready, Cloud-Optimised (ARCO) format specifically designed for retrieving long time series for individual points. When a user's requested location does not exactly match a grid point, the nearest grid point is automatically selected. This optimised data source ensures rapid response times.
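    The lapse-rate idea can be illustrated with a constant environmental lapse rate; this is a sketch only (the -6.5 K/km figure is the standard-atmosphere value, not necessarily the correction ERA5-Land applies internally):

```python
def lapse_rate_correction(t_coarse_k, z_coarse_m, z_fine_m, lapse_rate=-0.0065):
    """Adjust a coarse-grid temperature (K) to a finer grid cell's elevation
    using a constant lapse rate in K per metre."""
    return t_coarse_k + lapse_rate * (z_fine_m - z_coarse_m)

# A 9 km cell sitting 500 m above the coarse cell comes out ~3.25 K cooler.
corrected = lapse_rate_correction(288.15, 100.0, 600.0)
print(round(corrected, 2))  # 284.9
```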

  15. Data from: Cross-National Time Series, 1815-1973

    • archive.ciser.cornell.edu
    • icpsr.umich.edu
    Updated Jan 5, 2020
    Cite
    Arthur Banks (2020). Cross-National Time Series, 1815-1973 [Dataset]. http://doi.org/10.6077/y09q-rh18
    Dataset updated
    Jan 5, 2020
    Authors
    Arthur Banks
    Variables measured
    GeographicUnit
    Description

    This study is a longitudinal national data series for 167 nations. The present dataset represents an expansion both of temporal coverage and of substantive variable categories from the earlier CROSS POLITY TIME SERIES (ICPSR 5002) by the Center for Comparative Political Research, State University of New York (Binghamton). General areas included among the variables now available are demographic, social, political, and economic topics. Cases in the data collection represent nation-year observations. (Source: downloaded from ICPSR 7/13/10)

    Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR07412.v1. We highly recommend using the ICPSR version as they have made this dataset available in multiple data formats.

  16. Pre-Processed Power Grid Frequency Time Series

    • zenodo.org
    bin, zip
    Updated Jul 15, 2021
    Cite
    Johannes Kruse; Benjamin Schäfer; Dirk Witthaut (2021). Pre-Processed Power Grid Frequency Time Series [Dataset]. http://doi.org/10.5281/zenodo.3744121
    Available download formats: zip, bin
    Dataset updated
    Jul 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Johannes Kruse; Benjamin Schäfer; Dirk Witthaut
    Description

    Overview
    This repository contains ready-to-use frequency time series as well as the corresponding pre-processing scripts in Python. The data covers three synchronous areas of the European power grid:

    • Continental Europe
    • Great Britain
    • Nordic

    This work is part of the paper "Predictability of Power Grid Frequency" [1]. Please cite this paper when using the data and the code. For detailed documentation of the pre-processing procedure, we refer to the supplementary material of the paper.

    Data sources
    We downloaded the frequency recordings from publicly available repositories of three different Transmission System Operators (TSOs).

    • Continental Europe [2]: We downloaded the data from the German TSO TransnetBW GmbH, which retains the copyright on the data but allows it to be re-published upon request [3].
    • Great Britain [4]: The download was supported by National Grid ESO Open Data, which belongs to the British TSO National Grid. They publish the frequency recordings under the NGESO Open License [5].
    • Nordic [6]: We obtained the data from the Finnish TSO Fingrid, which provides the data under the open license CC-BY 4.0 [7].

    Content of the repository

    A) Scripts

    1. In the "Download_scripts" folder you will find three scripts to automatically download frequency data from the TSOs' websites.
    2. In "convert_data_format.py" we save the data with corrected timestamp formats. Missing data is marked as NaN (processing step (1) in the supplementary material of [1]).
    3. In "clean_corrupted_data.py" we load the converted data and identify corrupted recordings. We mark them as NaN and clean some of the resulting data holes (processing step (2) in the supplementary material of [1]).

    The Python scripts run with Python 3.7 and with the packages listed in "requirements.txt".

    B) Data_converted and Data_cleansed
    The folder "Data_converted" contains the output of "convert_data_format.py" and "Data_cleansed" contains the output of "clean_corrupted_data.py".

    • File type: The files are zipped csv-files, where each file comprises one year.
    • Data format: The files contain two columns. The first one represents the time stamps in the format Year-Month-Day Hour-Minute-Second, which is given as naive local time. The second column contains the frequency values in Hz.
    • NaN representation: We mark corrupted and missing data as "NaN" in the csv-files.

    Use cases
    We point out that this repository can be used in two different ways:

    • Use pre-processed data: You can directly use the converted or the cleansed data. Note however that both data sets include segments of NaN-values due to missing and corrupted recordings. Only a very small part of the NaN-values were eliminated in the cleansed data to not manipulate the data too much. If your application cannot deal with NaNs, you could build upon the following commands to select the longest interval of valid data from the cleansed data:
    from helper_functions import *
    import numpy as np
    import pandas as pd

    cleansed_data = pd.read_csv('/Path_to_cleansed_data/data.zip',
                index_col=0, header=None, squeeze=True,
                parse_dates=[0])
    # Identify contiguous runs of valid (non-NaN) samples and keep the longest.
    valid_bounds, valid_sizes = true_intervals(~cleansed_data.isnull())
    start, end = valid_bounds[np.argmax(valid_sizes)]
    data_without_nan = cleansed_data.iloc[start:end]
    • Produce your own cleansed data: Depending on your application, you might want to cleanse the data in a custom way. You can easily add your custom cleansing procedure in "clean_corrupted_data.py" and then produce cleansed data from the raw data in "Data_converted".

    License
    We release the code in the folder "Scripts" under the MIT license [8]. In the case of National Grid and Fingrid, we further release the pre-processed data in the folders "Data_converted" and "Data_cleansed" under the CC-BY 4.0 license [7]. TransnetBW originally did not publish their data under an open license. We have explicitly received permission from TransnetBW to publish the pre-processed version; however, we cannot publish it under an open license because the original TransnetBW data lacks one.

  17. ERA5 hourly time-series data on single levels from 1940 to present

    • cds.climate.copernicus.eu
    netcdf
    Updated Apr 8, 2025
    Cite
    ECMWF (2025). ERA5 hourly time-series data on single levels from 1940 to present [Dataset]. https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-timeseries
    Available download formats: netcdf
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ERA5 is the fifth generation ECMWF reanalysis of the global climate and weather over the past eight decades, with data available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres: every so many hours (12 hours at ECMWF), a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called the analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution so that a dataset spanning back several decades can be provided. Reanalysis is not constrained to issue timely forecasts, so there is more time to collect observations and, when going further back in time, to ingest improved versions of the original observations, all of which benefits the quality of the reanalysis product.

    ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave, and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals; the ensemble mean and spread have been pre-computed for convenience. These uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time, and they also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated as well, though monthly means are not available for the ensemble mean and spread.

    ERA5 is updated daily with a latency of about 5 days. If serious flaws are detected in this early release (called ERA5T), the data may differ from the final release issued 2 to 3 months later; users are notified when this occurs.

    The dataset presented here is a regridded subset of the full native-resolution ERA5 dataset, stored in a format designed for retrieving long time series for a single point. When the requested location does not exactly match a grid point, the nearest grid point is used instead. This source of ERA5 data is used by the ERA-Explorer to achieve the response times required for an interactive web application. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data at native resolution is provided in these guidelines.
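A minimal sketch of that nearest-grid-point lookup, assuming a regular latitude/longitude grid; the 0.25-degree spacing is an illustrative assumption, not the documented grid of this product:

```python
# Snap a requested (lat, lon) to the nearest point of a regular grid.
# The 0.25-degree spacing below is an assumed example value.

def nearest_grid_point(lat, lon, spacing=0.25):
    """Return the grid point nearest to the requested coordinates."""
    # Round each coordinate to the nearest multiple of the grid spacing.
    glat = round(lat / spacing) * spacing
    glon = round(lon / spacing) * spacing
    return glat, glon

# A request for (51.98, 4.37) is served from the grid point (52.0, 4.25).
point = nearest_grid_point(51.98, 4.37)
```

Because the grid is regular, the lookup is pure arithmetic and needs no search over grid coordinates.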

  18. EPR 9°50'N hydrothermal vent temperature data compilation, 1991-2025 (raw...

    • marine-geo.org
    Updated Aug 14, 2025
    + more versions
    Cite
    MGDS > Marine Geoscience Data System (2025). EPR 9°50'N hydrothermal vent temperature data compilation, 1991-2025 (raw time-series, ASCII format) [Dataset]. http://doi.org/10.60521/332402
    Explore at:
    Dataset updated
    Aug 14, 2025
    Dataset authored and provided by
    MGDS > Marine Geoscience Data System
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0), https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Raw ASCII time-series data (cleaned of corrupted records). This data set contains continuous time-series temperature measurements from several hydrothermal vents (Biovent, Mvent, Bio9 vents, Pvent, Lvent) located along the East Pacific Rise (EPR) near 9°50'N. The compilation contains legacy data along with data from cruises AT42-06, AT42-21, RR2102, AT50-07, AT50-21, AT50-33, and AT50-36. The data files are in ASCII format and were collected with temperature probes and autonomous temperature loggers. The data compilation was funded through awards OCE-1834797, OCE-1949485, OCE-1949938, OCE-1948936, ANR-24-CE56-6841 (Project OMENS), ERC-10117070619 (Project SeaSALT).

  19. Various Daily 2.5-degree Tropospheric Analysis Time Series in GRIB Format,...

    • data.ucar.edu
    • rda-web-prod.ucar.edu
    • +4more
    grib
    Updated Oct 9, 2025
    Cite
    Fleet Numerical Meteorology and Oceanography Center, U.S. Navy, U. S. Department of Defense; National Centers for Environmental Prediction, National Weather Service, NOAA, U.S. Department of Commerce; Research Data Archive, Computational and Information Systems Laboratory, National Center for Atmospheric Research, University Corporation for Atmospheric Research (2025). Various Daily 2.5-degree Tropospheric Analysis Time Series in GRIB Format, 1946-1999 [Dataset]. http://doi.org/10.5065/WZ4V-WB07
    Explore at:
    Available download formats: grib
    Dataset updated
    Oct 9, 2025
    Dataset provided by
    National Science Foundation, http://www.nsf.gov/
    Authors
    Fleet Numerical Meteorology and Oceanography Center, U.S. Navy, U. S. Department of Defense; National Centers for Environmental Prediction, National Weather Service, NOAA, U.S. Department of Commerce; Research Data Archive, Computational and Information Systems Laboratory, National Center for Atmospheric Research, University Corporation for Atmospheric Research
    Time period covered
    Jan 1946 - Dec 1999
    Description

    This dataset contains various tropospheric analysis time series created from the grids of other DSS datasets and converted to GRIB (for grids that were not already in that format).

    Currently, the only available time series are of 500-mb geopotential height.

  20. Sport Activity Dataset - MTS-5

    • kaggle.com
    zip
    Updated Jul 13, 2023
    Cite
    Jarno Matarmaa (2023). Sport Activity Dataset - MTS-5 [Dataset]. https://www.kaggle.com/datasets/jarnomatarmaa/sportdata-mts-5
    Explore at:
    Available download formats: zip (498699 bytes)
    Dataset updated
    Jul 13, 2023
    Authors
    Jarno Matarmaa
    License

    https://ec.europa.eu/info/legal-notice_en

    Description

    The dataset consists of data in five categories: walking, running, biking, skiing, and roller skiing. The sport activities were recorded by an individual active (non-competitive) athlete. The data are pre-processed, standardized, and split into four parts (each dimension in its own file):

    * HR-DATA_std_1140x69 (heart rate signals)
    * SPD-DATA_std_1140x69 (speed signals)
    * ALT-DATA_std_1140x69 (altitude signals)
    * META-DATA_1140x4 (labels and details)

    NOTE: Signal order between the separate files must not be confused when processing the data. Ordering is critical: the first row of each signal file comes from the same activity, whose label is the first row of the metadata file, and so on. The files should therefore be read together and combined into a single table, ideally using a nested data structure, as in the picture below:

    You may check the related TSC projects on GitHub:

    - Sport Activity Classification Using Classical Machine Learning and Time Series Methods (https://github.com/JABE22/MasterProject)
    - Symbolic Representation of Multivariate Time Series Signals in Sport Activity Classification - Kaggle Project

    Figure: Nested data structure for multivariate time series classifiers (https://mediauploads.data.world/e1ccd4d36522e04c0061d12d05a87407bec80716f6fe7301991eaaccd577baa8_mts_data.png)
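The row alignment described above can be sketched as follows; the function name and the tiny in-memory rows are hypothetical stand-ins, since the real files hold 1140 rows of 69 samples each:

```python
# Combine the four row-aligned files into one nested record per activity.
# The i-th row of the HR, SPD, and ALT files and the i-th row of the
# metadata file all come from the same activity, so they are zipped together.

def combine(hr_rows, spd_rows, alt_rows, meta_rows):
    """Build one nested record per activity, preserving row order."""
    assert len(hr_rows) == len(spd_rows) == len(alt_rows) == len(meta_rows)
    return [
        {"meta": meta, "signals": {"hr": hr, "spd": spd, "alt": alt}}
        for hr, spd, alt, meta in zip(hr_rows, spd_rows, alt_rows, meta_rows)
    ]

# Two synthetic activities, three samples each (real rows have 69 samples).
hr = [[120, 121, 122], [95, 96, 97]]
spd = [[3.1, 3.2, 3.3], [6.0, 6.1, 6.2]]
alt = [[10, 11, 12], [50, 51, 52]]
meta = [{"label": "walking"}, {"label": "running"}]

data = combine(hr, spd, alt, meta)
```

Keeping the three signal dimensions nested under a shared metadata record makes it impossible to shuffle one file independently of the others.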

    The following picture shows five signal samples for each dimension (heart rate, speed, altitude) in standardized feature values. Each figure contains signals from five different random activities (of the same or different categories); for example, signal index 1 in each of the three figures comes from the same activity. The figures merely illustrate what kinds of signals the dataset contains and have no particular meaning.

    Figure: Signals from sport activities (Heart Rate, Speed, and Altitude) (https://mediauploads.data.world/162b7086448d8dbd202d282014bcf12bd95bd3174b41c770aa1044bab22ad655_signal_samples.png)

    Dataset size and construction procedure

    The original number of sport activities is 228. From each activity, 5 consecutive 69-second segments were taken starting at index 100 (seconds), as expressed by the formula below:

    Figure: Data segmentation and augmentation formula (https://mediauploads.data.world/68ce83092ec65f6fbaee90e5de6e12df40498e08fa6725c111f1205835c1a842_segment_equation.png)

    where D = the original filtered data, N = the number of activities, s = the segment start index, l = the segment length, and n = the number of segments taken from a single original sequence D_i, resulting in the new set of equal-length segments D_seg. In this particular case the equation takes the form:

    Figure: Data segmentation and augmentation formula with values (https://mediauploads.data.world/63dd87bf3d0010923ad05a8286224526e241b17bbbce790133030d8e73f3d3a7_data_segmentation_formula.png)

    Thus, the dataset has dimensions of 1140 x 69 x 3.
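Under the parameters stated above (start index 100, 5 non-overlapping segments of 69 seconds per activity), the segmentation can be sketched like this; the function name and the synthetic activity are illustrative, not part of the dataset's own tooling:

```python
# Cut n_segments consecutive, non-overlapping windows of equal length
# from one activity sequence, starting at a fixed offset.

def segment(activity, start=100, length=69, n_segments=5):
    """Return n_segments windows of `length` samples, starting at `start`."""
    return [
        activity[start + k * length : start + (k + 1) * length]
        for k in range(n_segments)
    ]

# One synthetic activity long enough to hold 100 + 5 * 69 = 445 samples.
activity = list(range(500))
segments = segment(activity)
```

Applied to all 228 original activities, this yields 228 * 5 = 1140 segments of 69 samples each, matching the stated 1140 x 69 shape of every signal file.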

    Additional information

    The data were recorded without any expectation that they would be used in research; they therefore represent a real-world data source well and provide an excellent testbed for algorithms on real data.

    Recording devices

    The data were recorded using two types of Garmin devices: a Forerunner 920XT and a vivosport. The vivosport is an activity tracker that measures heart rate at the wrist with an optical sensor, whereas the 920XT requires an external chest-strap sensor (heart rate + inertial) worn during exercise. Otherwise the devices are not essentially different: both use GPS to measure speed and a barometric altimeter to measure elevation changes.

    Device manuals - Garmin FR-920XT - Garmin Vivosport

    Person profile

    Age: 30-31, Weight: 82 kg, Height: 181 cm, Active athlete (non-competitive)

Cite
U.S. Geological Survey (2025). U.S. Geological Survey Oceanographic Time Series Data Collection [Dataset]. https://catalog.data.gov/dataset/u-s-geological-survey-oceanographic-time-series-data-collection

U.S. Geological Survey Oceanographic Time Series Data Collection

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Oct 30, 2025
Dataset provided by
United States Geological Survey, http://www.usgs.gov/
Description

The OPeNDAP framework allows users to access data from anywhere on the Internet using a variety of Web services including Thematic Realtime Environmental Distributed Data Services (THREDDS). A subset of the data compliant with the Climate and Forecast convention (CF, currently version 1.6) is also available.
