MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Time Series PILE
The Time-series Pile is a large collection of publicly available data from diverse domains, ranging from healthcare to engineering and finance. It comprises more than 5 public time-series databases from several diverse domains, intended for time series foundation model pre-training and evaluation.
Time Series PILE Description
We compiled a large collection of publicly available datasets from diverse domains into the Time Series Pile. It has 13 unique domains of data… See the full description on the dataset page: https://huggingface.co/datasets/AutonLab/Timeseries-PILE.
Modal Service data and Safety & Security (S&S) public transit time series data delineated by transit/agency/mode/year/month. Includes all Full Reporters--transit agencies operating modes with more than 30 vehicles in maximum service--to the National Transit Database (NTD). This dataset will be updated monthly. The monthly ridership data is released one month after the month in which the service is provided. Records with null monthly service data reflect late reporting. The S&S statistics provided include both Major and Non-Major Events where applicable. Events occurring in the past three months are excluded from the corresponding monthly ridership rows in this dataset while they undergo validation. This dataset is the only NTD publication in which all Major and Non-Major S&S data are presented without any adjustment for historical continuity.
All datasets contain univariate time series and are available in a new format that we call .tsf, modeled on the sktime .ts format.
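As a rough illustration of what reading such a file might involve, the sketch below parses a small .tsf-style string. The layout it assumes (header lines starting with `@`, a `@data` marker, then one series per line with colon-separated fields whose last field holds comma-separated values) and the helper name `parse_tsf` are assumptions made for illustration, not a specification of the format.

```python
def parse_tsf(text):
    """Parse a small .tsf-style string into (metadata, list of series).

    Assumed layout (illustrative only, not a format specification):
      - lines starting with '#' are comments
      - '@attribute <name> <type>' declares a per-series field
      - other '@key value' lines are file-level metadata
      - after '@data', each line is 'field1:field2:...:v1,v2,v3'
      - '?' marks a missing observation
    """
    metadata, attributes, series = {}, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not in_data:
            if line.lower() == "@data":
                in_data = True
            elif line.lower().startswith("@attribute"):
                attributes.append(line.split()[1])
            elif line.startswith("@"):
                key, _, value = line[1:].partition(" ")
                metadata[key] = value.strip()
            continue
        # Everything before the last colon is a field; the rest is the series.
        *fields, values = line.split(":")
        record = dict(zip(attributes, fields))
        record["values"] = [None if v == "?" else float(v) for v in values.split(",")]
        series.append(record)
    return metadata, series


# Hypothetical toy file, written inline so the example is self-contained.
SAMPLE = """\
# toy example
@relation demo
@attribute series_name string
@frequency monthly
@missing true
@data
T1:1.0,2.5,?,4.0
T2:10,20,30
"""

metadata, series = parse_tsf(SAMPLE)
print(metadata)
print(series)
```

A real reader would also need to handle timestamp fields and type conversion for each declared attribute, which this sketch leaves out.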
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
LOTSA Data
The Large-scale Open Time Series Archive (LOTSA) is a collection of open time series datasets for time series forecasting. It was collected for the purpose of pre-training Large Time Series Models. See the paper and codebase for more information.
Citation
If you're using LOTSA data in your research or applications, please cite it using this BibTeX: @article{woo2024unified, title={Unified Training of Universal Time Series Forecasting Transformers}… See the full description on the dataset page: https://huggingface.co/datasets/Salesforce/lotsa_data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on time series analysis.
Introduction
Time series are a special class of dataset, where a response variable is tracked over time. The frequency of measurement and the timespan of the dataset can vary widely. At its simplest, a time series model includes an explanatory time component and a response variable. Mixed models can include additional explanatory variables (check out the nlme and lme4 R packages). We will be covering a few simple applications of time series analysis in these lessons.
Opportunities
Analysis of time series presents several opportunities. In aquatic sciences, some of the most common questions we can answer with time series modeling are:
Can we forecast conditions in the future?
Challenges
Time series datasets come with several caveats, which need to be addressed in order to effectively model the system. A few common challenges that arise (and can occur together within a single dataset) are:
Autocorrelation: Data points are not independent from one another (i.e., the measurement at a given time point is dependent on previous time point(s)).
Data gaps: Data are not collected at regular intervals, necessitating interpolation between measurements. There are often gaps between monitoring periods. For many time series analyses, we need equally spaced points.
Seasonality: Cyclic patterns in variables occur at regular intervals, impeding clear interpretation of a monotonic (unidirectional) trend. For example, we can assume that summer temperatures are higher.
Heteroscedasticity: The variance of the time series is not constant over time.
Covariance: The covariance of the time series is not constant over time. Many of these models assume that the variance and covariance are constant over time, so heteroscedasticity violates that assumption.
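Two of these challenges can be made concrete with a short sketch. The code below is a minimal illustration in Python rather than R, and the helper names `autocorrelation` and `interpolate_gaps` are hypothetical. It estimates the sample autocorrelation at a given lag and fills interior gaps by linear interpolation so that the series becomes equally spaced.

```python
def autocorrelation(values, lag=1):
    """Sample autocorrelation of a gap-free series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return num / denom


def interpolate_gaps(values):
    """Linearly interpolate None entries that lie between two observations.

    Leading or trailing gaps are left as None, since there is no second
    observation to interpolate towards.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Made-up measurements with gaps (None) at otherwise regular time steps.
observed = [1.0, None, None, 4.0, None, 6.0]
filled = interpolate_gaps(observed)
print(filled)
print(autocorrelation(filled, lag=1))
```

An autocorrelation far from zero signals that neighbouring points are not independent. Note that interpolation itself tends to inflate autocorrelation, so heavily gap-filled series should be interpreted with care.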
Learning Objectives
After successfully completing this notebook, you will be able to:
Choose appropriate time series analyses for trend detection and forecasting
Discuss the influence of seasonality on time series analysis
Interpret and communicate results of time series analyses
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Monthly movements in output for the services industries: distribution, hotels and restaurants; transport, storage and communication; business services and finance; and government and other services.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The sp500stock_data_description.csv file provides detailed information on the existence of four modalities (text, image, time series, and table) for 4,213 S&P 500 stocks. The hs300stock_data_description.csv file provides detailed information on the existence of four modalities (text, image, time series, and table) for 858 HS 300 stocks.
If you find our research helpful, please cite our paper:
@article{xu2025finmultitime, title={FinMultiTime: A Four-Modal Bilingual Dataset for… See the full description on the dataset page: https://huggingface.co/datasets/Wenyan0110/Multimodal-Dataset-Image_Text_Table_TimeSeries-for-Financial-Time-Series-Forecasting.
The U.S. Census Bureau's economic indicator surveys provide monthly and quarterly data that are timely, reliable, and offer comprehensive measures of the U.S. economy. These surveys produce a variety of statistics covering construction, housing, international trade, retail trade, wholesale trade, services and manufacturing. The survey data provide measures of economic activity that allow analysis of economic performance and inform business investment and policy decisions. Other data included, which are not considered principal economic indicators, are the Quarterly Summary of State & Local Taxes, Quarterly Survey of Public Pensions, and the Manufactured Homes Survey. For information on the reliability and use of the data, including important notes on estimation and sampling variance, seasonal adjustment, measures of sampling variability, and other information pertinent to the economic indicators, visit the individual programs' webpages - http://www.census.gov/cgi-bin/briefroom/BriefRm.
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/INSPIRE_Directive_Article13_1a
The European Space Agency, in collaboration with BlackBridge, collected two time series datasets with a five-day revisit at high resolution:
- February to June 2013 over 14 selected sites around the world
- April to September 2015 over 10 selected sites around the world
The RapidEye Earth Imaging System provides data at 5 m spatial resolution (multispectral L3A orthorectified). The products are radiometrically and sensor corrected similar to the 1B Basic product, but have geometric corrections applied to the data during orthorectification using DEMs and GCPs. The product accuracy depends on the quality of the ground control and DEMs used. The imagery is delivered in GeoTIFF format with a pixel spacing of 5 metres.
The dataset is composed of data over:
- 14 selected sites in 2013: Argentina, Belgium, Chesapeake Bay, China, Congo, Egypt, Ethiopia, Gabon, Jordan, Korea, Morocco, Paraguay, South Africa and Ukraine.
- 10 selected sites in 2015: Limburgerhof, Railroad Valley, Libya4, Algeria4, Figueres, Libya1, Mauritania1, Barrax, Esrin, Uyuni Salt Lake.
Spatial coverage: Check the spatial coverage of the collection on a map available on the Third Party Missions Dissemination Service.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset serves as supplementary material to the fully reproducible paper entitled "Comparison of stochastic and machine learning methods for multi-step ahead forecasting of hydrological processes". We provide the R codes and their outcomes. We also provide the reports entitled "Definitions of the stochastic processes", "Definitions of the forecast quality metrics" and "Selected figures for the qualitative comparison of the forecasting methods". The former version of this dataset is available in the provided link.
The dataset contains m = 320 points equally spaced in time from [0, 2] and the total number of random features in SRMD is set to N = 50m = 16000.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Seasonally adjusted and non-seasonally adjusted quarterly time series of UK public sector employment, containing the latest estimates.
The U.S. Census Bureau's economic indicator surveys provide monthly and quarterly data that are timely, reliable, and offer comprehensive measures of the U.S. economy. These surveys produce a variety of statistics covering construction, housing, international trade, retail trade, wholesale trade, services and manufacturing. The survey data provide measures of economic activity that allow analysis of economic performance and inform business investment and policy decisions. Other data included, which are not considered principal economic indicators, are the Quarterly Summary of State & Local Taxes, Quarterly Survey of Public Pensions, and the Manufactured Homes Survey. For information on the reliability and use of the data, including important notes on estimation and sampling variance, seasonal adjustment, measures of sampling variability, and other information pertinent to the economic indicators, visit the individual programs' webpages - http://www.census.gov/cgi-bin/briefroom/BriefRm.
https://www.datainsightsmarket.com/privacy-policy
The global Open Source Time Series Database market size was valued at USD XX million in 2025 and is projected to expand at a CAGR of XX% during the forecast period, reaching USD XX million by 2033. The growth of the market is primarily driven by the increasing adoption of Internet of Things (IoT) devices, which generate vast amounts of time-series data. Additionally, the growing demand for real-time analytics and the need for efficient and scalable data storage solutions are further fueling market growth. The market is segmented based on application, type, and region. By application, the IoT industry is expected to hold the largest market share during the forecast period. By type, the cloud-based segment is projected to dominate the market. Regionally, North America is anticipated to remain the largest market, followed by Europe and Asia Pacific. Major market players include InfluxData, Timescale, Prometheus, OpenTSDB, VictoriaMetrics, and QuestDB, among others.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to the Fifth Generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis (ERA5). Produced by replaying only the land component of the ECMWF ERA5 climate reanalysis, it benefits from the same physical data-assimilation framework but runs offline at higher spatial detail (9 km grid) to deliver richer land-surface information. Reanalysis merges numerical model output with global observations into a globally complete, physically consistent climate record; this “data assimilation” approach mirrors operational weather forecasting but is optimised for historical completeness rather than forecast timeliness. Reanalysis datasets extend back several decades by sacrificing forecast deadlines, allowing additional time to gather observations and retrospectively ingest improved data, thereby enhancing data quality in earlier periods. ERA5-Land uses atmospheric fields from ERA5—air temperature, humidity, pressure—as “forcing” inputs to drive its land-surface model, preventing rapid drift from reality that unconstrained simulations would suffer. Although observations do not enter the land model directly, they shape the atmospheric forcing through assimilation, giving ERA5-Land an indirect observational anchor. To reconcile ERA5’s coarser grid with ERA5-Land’s finer 9 km grid, a lapse-rate correction adjusts input temperatures, humidity, and pressures for altitude differences. Like all numerical simulations, ERA5-Land carries uncertainty that generally grows backward in time as fewer observations were available to constrain the forcing. Users can combine ERA5-Land fields with the uncertainty estimates from equivalent ERA5 variables to assess confidence bounds. 
The temporal resolution (hourly) and spatial detail (9 km) of ERA5-Land make it invaluable for land-surface applications such as flood and drought forecasting, agricultural monitoring, and hydrological studies. The dataset presented here is a regridded subset of the full ERA5-Land archive, stored in an Analysis-Ready, Cloud-Optimised (ARCO) format specifically designed for retrieving long time-series for individual points. When a user’s requested location does not exactly match a grid point, the nearest grid point is automatically selected. This optimised data source ensures rapid response times.
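The nearest-grid-point selection described above can be sketched as follows. This is an illustration only: the grid origin, the 0.1 degree spacing, and the helper name `nearest_grid_value` are assumptions, since the text does not state the layout of the regridded subset.

```python
def nearest_grid_value(coord, origin, spacing):
    """Return (index, coordinate) of the grid point closest to `coord`.

    Assumes a regular grid whose points sit at origin + k * spacing.
    Exact ties fall to Python's round(), which rounds half to even.
    """
    index = round((coord - origin) / spacing)
    return index, origin + index * spacing


# Hypothetical grid: latitudes start at -90, longitudes at 0, 0.1 degree apart.
requested_lat, requested_lon = 52.37, 4.89
lat_index, grid_lat = nearest_grid_value(requested_lat, origin=-90.0, spacing=0.1)
lon_index, grid_lon = nearest_grid_value(requested_lon, origin=0.0, spacing=0.1)
print(lat_index, round(grid_lat, 4))
print(lon_index, round(grid_lon, 4))
```

With a regular grid, the snapped location can differ from the requested one by at most half the grid spacing in each direction.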
Utilisation of this data is subject to the European Space Agency's Earth Observation Terms and Conditions.
This is Dataset Version 3 - Updates may be done following feedback from the machine learning community.
This dataset contains 327 time series corresponding to the temporal values of 327 telemetry parameters over the life of the real GOCE satellite (from March 2009 to October 2013). It consists of both the raw data and machine-learning-ready resampled data:
- The raw values (calibrated values of each parameter) as {param}_raw.parquet files (irregular).
- Resampled and popular statistics computed over 10-minute windows for each parameter as {param}_stats_10min.parquet files.
- Resampled and popular statistics computed over 6-hour windows for each parameter as {param}_stats_6h.parquet files.
- metadata.csv: list of all parameters with description, subsystem, first and last timestamp where a value is recorded, fraction of NaN in the calculated statistics, and the longest data gap.
- mass_properties.csv: provides information relative to the satellite mass (for example the remaining fuel on board).
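To give a feel for how irregular raw samples might be turned into fixed-window statistics like those in the resampled files, here is a minimal sketch. The choice of statistics (count, mean, min, max), the function name `window_stats`, and the sample values are assumptions; the text does not say which statistics the files contain or how they were computed.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def window_stats(samples, window_seconds=600):
    """Group (timestamp, value) samples into fixed windows and summarise each.

    Timestamps must be timezone-aware. Windows are aligned to the Unix
    epoch, so 600-second windows start at hh:00, hh:10, hh:20, and so on.
    Windows without any sample are simply absent from the result.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        start = int(ts.timestamp()) // window_seconds * window_seconds
        buckets[start].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): {
            "count": len(vals),
            "mean": mean(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Made-up, irregularly spaced telemetry samples.
utc = timezone.utc
raw = [
    (datetime(2009, 3, 17, 12, 0, 5, tzinfo=utc), 1.0),
    (datetime(2009, 3, 17, 12, 3, 40, tzinfo=utc), 3.0),
    (datetime(2009, 3, 17, 12, 14, 10, tzinfo=utc), 5.0),
]
stats = window_stats(raw)
for start, summary in stats.items():
    print(start.isoformat(), summary)
```

Passing `window_seconds=21600` would give 6-hour windows under the same assumptions.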
The Gravity Field and Steady-State Ocean Circulation Explorer (GOCE; pronounced ‘go-chay’) was a scientific mission satellite from the European Space Agency (ESA).
GOCE's primary mission objective was to provide an accurate and detailed global model of Earth's gravity field and geoid. For this purpose, it was equipped with a state-of-the-art gravity gradiometer and a precise tracking system.
The satellite's main payload was the Electrostatic Gravity Gradiometer (EGG), which measured the gravity field of Earth. Other payload items were an onboard GPS receiver used as a Satellite-to-Satellite Tracking Instrument (SSTI) and a compensation system for all non-gravitational forces acting on the spacecraft. The satellite was also equipped with a laser retroreflector to enable tracking by ground-based satellite laser ranging stations.
The satellite's unique arrow shape and fins helped keep GOCE stable as it flew through the thermosphere at a comparatively low altitude of 255 kilometres (158 mi). Additionally, an ion propulsion system continuously compensated for the variable deceleration due to air drag without the vibration of a conventional chemically powered rocket engine, thus limiting the errors in gravity gradient measurements caused by non-gravitational forces and restoring the path of the craft as closely as possible to a purely inertial trajectory.
Due to the orbit and satellite configuration, the solar panels experienced extreme temperature variations. The design therefore had to include materials that could tolerate temperatures as high as 160 degC and as low as -170 degC.
Due to its stringent temperature stability requirements (for the gradiometer sensor heads, in the range of milli-Kelvin) the gradiometer was thermally decoupled from the satellite and had its own dedicated thermal-control system.
Flight operations were conducted from the European Space Operations Centre, based in Darmstadt, Germany.
It was launched on 17 March 2009 and reached the end of its mission on 21 October 2013 because it ran out of propellant. As planned, the satellite began dropping out of orbit and made an uncontrolled re-entry on 11 November 2013.
GOCE used a Sun-synchronous orbit with an inclination of 96.7 degrees, a mean altitude of approximately 263 km, an orbital period of 90 minutes, and a mean local solar time at ascending node of 18:00.
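As a quick consistency check on these orbital figures, the period of a circular orbit follows from Kepler's third law, T = 2π·sqrt(a³/μ). The sketch below assumes a circular orbit around a spherical Earth with a mean radius of 6371 km and the standard gravitational parameter μ = 398600.4418 km³/s²; neither constant comes from the dataset description.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # Earth's standard gravitational parameter (assumed)
EARTH_RADIUS_KM = 6371.0       # mean Earth radius, spherical-Earth assumption


def circular_orbit_period_minutes(altitude_km):
    """Orbital period of a circular orbit at the given altitude, in minutes."""
    semi_major_axis_km = EARTH_RADIUS_KM + altitude_km
    period_seconds = 2 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return period_seconds / 60


period = circular_orbit_period_minutes(263)
print(f"{period:.1f} minutes")
```

Under these assumptions the result lands just under 90 minutes, in line with the stated orbital period.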
The Global Population Density Grid Time Series Estimates provide a back-cast time series of population density grids based on the year 2000 population grid from SEDAC's Global Rural-Urban Mapping Project, Version 1 (GRUMPv1) data set. The grids were created by using rates of population change between decades from the coarser resolution History Database of the Global Environment (HYDE) database to back-cast the GRUMPv1 population density grids. Mismatches between the spatial extent of the HYDE calculated rates and GRUMPv1 population data were resolved via infilling rate cells based on a focal mean of values. Finally, the grids were adjusted so that the population totals for each country equaled the UN World Population Prospects (2008 Revision) estimates for that country for the respective year (1970, 1980, 1990, and 2000). These data do not represent census observations for the years prior to 2000, and therefore can at best be thought of as estimations of the populations in given locations. The population grids are consistent internally within the time series, but are not recommended for use in creating longer time series with any other population grids, including GRUMPv1, Gridded Population of the World, Version 4 (GPWv4), or non-SEDAC developed population grids. These population grids served as an input to SEDAC's Global Estimated Net Migration Grids by Decade: 1970-2000 data set.
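The two steps described above, back-casting with rates of change and then adjusting to country totals, can be sketched on a toy example. The formula used here (dividing by one plus the decadal growth rate, then scaling all cells uniformly to the target total), the function names, and all numbers are assumptions for illustration; the actual SEDAC procedure may differ in detail.

```python
def backcast(cells, decadal_rates):
    """Estimate the previous decade's values from current values.

    Assumes each rate is the fractional change over the decade, so that
    current = previous * (1 + rate).
    """
    return [value / (1 + rate) for value, rate in zip(cells, decadal_rates)]


def adjust_to_total(cells, target_total):
    """Scale all cells uniformly so that they sum to `target_total`."""
    factor = target_total / sum(cells)
    return [value * factor for value in cells]


# Made-up country with three grid cells and made-up growth rates.
population_2000 = [100.0, 300.0, 600.0]
rates_1990_to_2000 = [0.25, 0.5, 0.2]
hypothetical_un_total_1990 = 741.0

raw_1990 = backcast(population_2000, rates_1990_to_2000)
population_1990 = adjust_to_total(raw_1990, hypothetical_un_total_1990)
print([round(v, 2) for v in raw_1990])
print([round(v, 2) for v in population_1990])
```

Uniform scaling preserves the relative distribution of population between cells while forcing the country total to match the external estimate.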
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The mid-year estimates refer to the population on 30 June of the reference year and are produced in line with the standard United Nations (UN) definition for population estimates. They are the official set of population estimates for the UK and its constituent countries, the regions and counties of England, and local authorities and their equivalents.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Consumer trends time series dataset up to Quarter 2 (April to June) 2025.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Movements in the volume of production for the UK production industries: manufacturing, mining and quarrying, energy supply, and water and waste management. Figures are seasonally adjusted.