SCALABLE TIME SERIES CHANGE DETECTION FOR BIOMASS MONITORING USING GAUSSIAN PROCESS
VARUN CHANDOLA* AND RANGA RAJU VATSAVAI*
Abstract. Biomass monitoring, specifically, detecting changes in the biomass or vegetation of a geographical region, is vital for studying the carbon cycle of the system and has significant implications in the context of understanding climate change and its impacts. Recently, several time series change detection methods have been proposed to identify land cover changes in temporal profiles (time series) of vegetation collected using remote sensing instruments. In this paper, we adapt Gaussian process regression to detect changes in such time series in an online fashion. While Gaussian processes (GPs) have been widely used as a kernel-based learning method for regression and classification, their applicability to massive spatio-temporal data sets, such as remote sensing data, has been limited owing to the high computational costs involved. In our previous work we proposed an efficient Toeplitz matrix based solution for scalable GP parameter estimation. In this paper we apply these solutions to a GP based change detection algorithm. The proposed change detection algorithm requires a memory footprint which is linear in the length of the input time series and runs in time which is quadratic in the length of the input time series. Experimental results show that both serial and parallel implementations of our proposed method achieve significant speedups over the serial implementation. Finally, we demonstrate the effectiveness of the proposed change detection method in identifying changes in Normalized Difference Vegetation Index (NDVI) data.
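The linear memory footprint mentioned in the abstract follows from a structural property: when observations are evenly spaced and the covariance function is stationary, the GP covariance matrix is Toeplitz, so its first row determines every entry. The sketch below illustrates only that property. The squared-exponential kernel, its parameter values, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def sq_exp_kernel(t1, t2, length_scale=2.0, signal_var=1.0):
    """Squared-exponential covariance (an assumed, illustrative kernel)."""
    return signal_var * math.exp(-((t1 - t2) ** 2) / (2.0 * length_scale ** 2))

def first_row(n, step=1.0):
    """Covariances between time 0 and each of n evenly spaced times: O(n) storage."""
    return [sq_exp_kernel(0.0, k * step) for k in range(n)]

def toeplitz_entry(row, i, j):
    """Recover K[i][j] from the first row alone, using K[i][j] = row[|i - j|]."""
    return row[abs(i - j)]

n = 6
row = first_row(n)

# Dense n x n matrix built directly from the kernel, only for comparison.
dense = [[sq_exp_kernel(float(i), float(j)) for j in range(n)] for i in range(n)]

# Largest disagreement between the dense matrix and the first-row lookup.
max_diff = max(
    abs(dense[i][j] - toeplitz_entry(row, i, j))
    for i in range(n)
    for j in range(n)
)
print(max_diff)
```

Storing `row` instead of `dense` takes n values rather than n squared. Toeplitz solvers such as the Levinson recursion exploit the same structure to solve the associated linear systems in quadratic time.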
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on time series analysis.
Introduction
Time series are a special class of dataset, where a response variable is tracked over time. The frequency of measurement and the timespan of the dataset can vary widely. At its simplest, a time series model includes an explanatory time component and a response variable. Mixed models can include additional explanatory variables (check out the nlme and lme4 R packages). We will be covering a few simple applications of time series analysis in these lessons.
Opportunities
Analysis of time series presents several opportunities. In aquatic sciences, some of the most common questions we can answer with time series modeling are:
Can we forecast conditions in the future?
Challenges
Time series datasets come with several caveats, which need to be addressed in order to effectively model the system. A few common challenges that arise (and can occur together within a single dataset) are:
Autocorrelation: Data points are not independent from one another (i.e., the measurement at a given time point is dependent on previous time point(s)).
Data gaps: Data are not collected at regular intervals, necessitating interpolation between measurements. There are often gaps between monitoring periods. For many time series analyses, we need equally spaced points.
Seasonality: Cyclic patterns in variables occur at regular intervals, impeding clear interpretation of a monotonic (unidirectional) trend. For example, we can assume that summer temperatures are higher than winter temperatures.
Heteroscedasticity: The variance of the time series is not constant over time.
Covariance: The covariance of the time series is not constant over time. Many time series models assume that the variance and covariance are similar over time; a non-constant variance is the heteroscedasticity described above.
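Two of the challenges above, data gaps and autocorrelation, can be illustrated with a short sketch. The lesson itself points to R packages; the example below uses plain Python only for illustration. The observation values are made up, the function names are hypothetical, and linear interpolation is just one simple choice of gap-filling method.

```python
import math

def interpolate_regular(times, values, step):
    """Linearly interpolate irregular observations onto an evenly spaced grid."""
    n_points = int(math.floor((times[-1] - times[0]) / step + 1e-9)) + 1
    grid, filled = [], []
    i = 0
    for k in range(n_points):
        t = times[0] + k * step
        # Advance to the observed segment [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        weight = (t - times[i]) / (times[i + 1] - times[i])
        grid.append(t)
        filled.append(values[i] + weight * (values[i + 1] - values[i]))
    return grid, filled

def lag1_autocorrelation(x):
    """Sample autocorrelation between each point and the point that follows it."""
    mean = sum(x) / len(x)
    num = sum((x[k] - mean) * (x[k + 1] - mean) for k in range(len(x) - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Made-up daily observations with a gap between day 2 and day 5.
obs_times = [0.0, 1.0, 2.0, 5.0, 6.0]
obs_values = [10.0, 12.0, 14.0, 20.0, 22.0]

grid, filled = interpolate_regular(obs_times, obs_values, step=1.0)
r1 = lag1_autocorrelation(filled)
print(grid, filled, r1)
```

A lag-1 autocorrelation well above zero is a sign that consecutive points are not independent, which is the situation described under "Autocorrelation".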
Learning Objectives
After successfully completing this notebook, you will be able to:
Choose appropriate time series analyses for trend detection and forecasting
Discuss the influence of seasonality on time series analysis
Interpret and communicate results of time series analyses
Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical monitoring, and financial systems. Domain experts are often interested in searching for interesting multivariate patterns from these MTS databases, which can contain up to several gigabytes of data. Surprisingly, research on MTS search is very limited. Most existing work only supports queries with the same length of data, or queries on a fixed set of variables. In this paper, we propose an efficient and flexible subsequence search framework for massive MTS databases that, for the first time, enables querying on any subset of variables with arbitrary time delays between them. We propose two provably correct algorithms to solve this problem: (1) an R-tree Based Search (RBS), which uses Minimum Bounding Rectangles (MBR) to organize the subsequences, and (2) a List Based Search (LBS) algorithm, which uses sorted lists for indexing. We demonstrate the performance of these algorithms using two large MTS databases from the aviation domain, each containing several million observations. Both tests show that our algorithms have very high prune rates (>95%), thus needing actual disk access for less than 5% of the observations. To the best of our knowledge, this is the first flexible MTS search algorithm capable of subsequence search on any subset of variables. Moreover, MTS subsequence search has never been attempted on datasets of the size we have used in this paper.
https://www.verifiedmarketresearch.com/privacy-policy/
Time Series Analysis Software Market size was valued at USD 1.8 Billion in 2024 and is projected to reach USD 4.7 Billion by 2031, growing at a CAGR of 10.5% during the forecast period 2024-2031.
Global Time Series Analysis Software Market Drivers
Growing Data Volumes: The exponential growth in data generated across various industries necessitates advanced tools for analyzing time series data. Businesses need to extract actionable insights from large datasets to make informed decisions, driving the demand for time series analysis software.
Increasing Adoption of IoT and Connected Devices: The proliferation of Internet of Things (IoT) devices generates continuous streams of time-stamped data. Analyzing this data in real-time helps businesses optimize operations, predict maintenance needs, and enhance overall efficiency, fueling the demand for time series analysis tools.
Advancements in Machine Learning and AI: Integration of machine learning and artificial intelligence (AI) with time series analysis enhances predictive capabilities and automates the analysis process. These advancements enable more accurate forecasting and anomaly detection, attracting businesses to adopt sophisticated analysis software.
Need for Predictive Analytics: Businesses are increasingly focusing on predictive analytics to anticipate future trends and behaviors. Time series analysis is crucial for forecasting demand, financial performance, stock prices, and other metrics, driving the market growth.
Industry 4.0 and Automation: The push towards Industry 4.0 involves automating industrial processes and integrating smart technologies. Time series analysis software is essential for monitoring and optimizing manufacturing processes, predictive maintenance, and supply chain management in this context.
Financial Sector Growth: The financial industry extensively uses time series analysis for modeling stock prices, risk management, and economic forecasting. The growing complexity of financial markets and the need for real-time data analysis bolster the demand for specialized software.
Healthcare and Biomedical Applications: Time series analysis is increasingly used in healthcare for monitoring patient vitals, managing medical devices, and analyzing epidemiological data. The focus on personalized medicine and remote patient monitoring drives the adoption of these tools.
Climate and Environmental Monitoring: Governments and organizations use time series analysis to monitor climate change, weather patterns, and environmental data. The need for accurate predictions and real-time monitoring in environmental science boosts the market.
Regulatory Compliance and Risk Management: Industries such as finance, healthcare, and energy face stringent regulatory requirements. Time series analysis software helps in compliance by providing detailed monitoring and reporting capabilities, reducing risks associated with regulatory breaches.
Emergence of Big Data and Cloud Computing: The adoption of big data technologies and cloud computing facilitates the storage and analysis of large volumes of time series data. Cloud-based time series analysis software offers scalability, flexibility, and cost-efficiency, making it accessible to a broader range of businesses.
https://www.cognitivemarketresearch.com/privacy-policy
As per Cognitive Market Research's latest published report, the Global Time Series Databases Software market size will be $993.24 Million by 2028. The Time Series Databases Software industry's Compound Annual Growth Rate will be 18.36% from 2023 to 2030.
Factors Affecting Time Series Databases Software market growth
Rise in automation in industry
Industrial sensors are a key part of factory automation and Industry 4.0. Motion, environmental, and vibration sensors are used to monitor the health of equipment, covering linear or angular positioning, tilt sensing, leveling, shock, and fall detection. A sensor is a device that detects changes in electrical, physical, or other quantities and produces an output that indicates the change.
In simple terms, industrial automation sensors are input devices that provide an output (signal) with respect to a specific physical quantity (input). In industrial automation, sensors play a vital part in making products intelligent and highly automated. They make it possible to detect, analyze, measure, and process a variety of changes, such as changes in position, length, height, surface, and displacement, that occur at industrial manufacturing sites. These sensors also play a pivotal role in predicting and preventing numerous potential events, thus catering to the requirements of many sensing applications. Such sensors generally produce time series, because readings are taken at equal intervals of time.
The increasing use of sensors to monitor industrial activities and production in factories is fueling the growth of the time series database software market. In addition, manufacturing in the pharmaceutical industry requires close monitoring, which increases demand for sensors and time series databases and in turn fuels demand in the time series database software market.
Increasing Demand of Data Driven Decision Making Fuels the Market Growth
Restraints for Time Series Databases Software Market
Network Security.
Opportunities for Time Series Databases Software Market
IoT and time series database software.
Factors Affecting the Time Series Databases Software Market
Time-series data is a sequence of data points collected over time intervals, giving us the ability to track changes over time. Time-series data can track changes over milliseconds, days, or even years. Time-series databases are designed to store data that changes with time. This can be any kind of data collected over time, such as metrics collected from systems; the output of any trending system is an example of time-series data. Time Series Databases (TSDB) are designed to store and analyze event data, time series, or time-stamped data, often streamed from IoT devices, and enable graphing, monitoring, and analyzing changes over time.
A company may adopt a time series database if it needs to monitor data in real time or if it is running applications that continuously produce data. Some examples of applications that produce time series data include network or application performance monitoring (APM) software tools, sensor data from IoT devices, financial market data, and a number of security applications, among many others. Time series databases are optimized for storing this data so that it can be easily pulled and analyzed. Time series data is often used when running predictive analytics or machine learning algorithms, enabling users to understand historical data to help predict future outcomes. Some big data processing and distribution software may provide time series storage functionality. In some fields, time series may be called profiles, curves, traces, or trends. Several early time series databases (also referred to as data historians) are associated with industrial applications, where they efficiently stored measured values from sensory equipment, but they are now used in support of a much wider range of applications.
This dataset contains oceanographic and surface meteorological data in netCDF formatted files, which follow the Climate and Forecast metadata convention (CF) and the Attribute Convention for Data Discovery (ACDD). USF CMS - Coastal Ocean Monitoring and Prediction System collected the data from their in-situ moored station named 42023 in the Gulf of Mexico. Southeast Coastal Ocean Observing Regional Association (SECOORA), which assembles data from USF CMS - Coastal Ocean Monitoring and Prediction System and other sub-regional coastal and ocean observing systems of the Southeast United States, submitted the data to NCEI as part of the Integrated Ocean Observing System Data Assembly Centers (IOOS DACs) Data Stewardship Program. NCEI updates this dataset when new files are available.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is for the paper "First onset of unrest captured at Socompa: A Recent Geodetic Survey at Central Andean volcanoes in Northern Chile" which is published in GRL: https://doi.org/10.1029/2022GL102480.
InSAR Data:
The folder of InSAR_149A.rar stores the InSAR time series analysis dataset on ascending track 149.
The 'Imagedate' folder stores empty *.rslc files that indicate the date of each SLC.
Data_Asc.mat stores the main InSAR time series data, which includes the UTC time of the acquisition (for accurate time calculation), the length of perpendicular baselines (unit is meter), the number of days counting from the first epoch, the unwrapped time series data (ifg), the unwrapped time series data with GACOS correction (ifg_aps), the look angles (la, unit is rad), and lat&lon.
parms.mat stores the parameters used during the data processing by StaMPS.
semi_fit.mat stores the results of the semi-variogram fitting of each interferogram on time series. It provides two versions for the original dataset (semi) and the GACOS-corrected dataset (semi_aps). This file is mainly used to weight the data during the time series fitting.
runTSA.m, the main function to run the InSAR time series fitting. See more details in the Code part.
The folder of InSAR_156D.rar stores the same content as the InSAR_149A.rar but for descending track 156.
Code:
This folder contains the codes of the InSAR time series fitting for this dataset, and the GBIS software.
TSA_findref.m, this function is used to search the reference point of the InSAR data.
TSA_EQ_fit.m is the main function that performs the InSAR time series fitting.
rb_pixel_fit.m is the robust way to fit the linear model.
TSA_EQ_pixel.m is the function used to plot the results.
To perform the InSAR time series fitting, you need to put these four functions under your Matlab path, and then run the runTSA.m function in the data folder.
The GBIS folder stores the updated version of the GBIS software, which allows you to perform the pCDM, CDM, and pECM. The core functions of these models are provided by Dr. Mehdi Nikkhoo, and you could find them here: https://www.volcanodeformation.com/software
GBIS_Modelling_Results:
This folder stores the data of InSAR and GPS joint inversion for Socompa Uplift.
The folder Socompa stores the modelling results using the models of Okada(D), pECM(E), Mogi(M), pCDM(N), and Yang(Y), respectively.
GPS_data.txt stores the cumulative displacements and the uncertainties of the SOCM station in three directions.
Socompa.inp is the configuration file for GBIS running.
Vol_asc.mat and Vol_asc_ds.mat store the original and the downsampled ascending data, while Vol_dsc.mat and Vol_dsc_ds.mat store those of the descending data.
Many thanks for using our dataset and please let me know if you have any further questions!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset provided in this repository represents the raw data of the lateral vibration collected through on-site tests on an 18-story timber-concrete hybrid building. The dataset consists of the time-series velocity data obtained by microtremor measurement and human-powered excitation tests.
Servo velocity sensors and a data acquisition system composed of an A/D converter, a DC amplifier and a laptop were used to measure the velocity data. The velocity in the longitudinal and the transverse directions of the building were recorded with a sampling rate of 200 Hz in all the tests.
Each Microsoft Excel Spreadsheet (.xlsx) file contains a 'Data' sheet and a 'Plot' sheet. The time-series velocity data are listed in the Data sheet, while the velocity-time curves are shown in the Plot sheet. In the Data sheet, the first column represents the time. The following columns represent the velocity of each sensor. The units for time and velocity are seconds (s) and cm/s, respectively.
This data set is a comma-separated values (CSV) file containing continuous hourly water quality observations of the Indian River in Sitka National Historical Park for monitoring years 2010-2021. Core parameters collected are temperature, dissolved oxygen, pH, and conductivity, obtained from multiparameter sondes during the ice-free season. Using the Aquarius Time-Series application, data have been quality controlled, graded against formal criteria specified in the protocol, drift corrected where appropriate, and certified for publication. The data set (CSV) and associated metadata are zipped into a site-specific archive (ZIP file), identified as the FQ_Q deliverable in the SEAN water quality protocol package FQ-2022.1, SOP 8.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books and is filtered where the book is Advanced environmental monitoring with remote sensing time series data and R. It has 7 columns such as book, author, ISBN, BNB id, and language. The data is ordered by publication date.
https://www.verifiedmarketresearch.com/privacy-policy/
Time Series Databases Software Market size was valued at USD 359.37 Million in 2024 and is projected to reach USD 773.71 Million by 2031, growing at a CAGR of 10.06% from 2024 to 2031.
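The growth rate quoted above can be checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / periods) - 1. The sketch below is only an arithmetic illustration: the function name is hypothetical, and the number of compounding periods is an assumption, because the report does not state its base year.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over a given number of yearly periods."""
    return (end_value / start_value) ** (1.0 / periods) - 1.0

START_USD_MILLION = 359.37  # value quoted for 2024
END_USD_MILLION = 773.71    # value projected for 2031

# Seven periods: counting the years from 2024 to 2031.
rate_7 = cagr(START_USD_MILLION, END_USD_MILLION, 7)

# Eight periods: assumes a 2023 base year, which the source does not state.
rate_8 = cagr(START_USD_MILLION, END_USD_MILLION, 8)

print(f"7 periods: {rate_7:.2%}, 8 periods: {rate_8:.2%}")
```

Under the eight-period assumption the formula gives approximately the quoted 10.06%; with seven periods it gives a higher rate, so the period convention matters when comparing such figures.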
Time Series Databases Software Market Drivers
Growing Data Volume: The exponential growth of data generated by various sources, including IoT devices, financial transactions, and digital services, necessitates efficient management and analysis of time-stamped data. Time series databases are optimized for handling large volumes of time-stamped data, driving their adoption.
Rise of IoT and Connected Devices: The proliferation of IoT devices in industries such as manufacturing, healthcare, and smart cities generates massive amounts of time-series data. Time series databases are crucial for storing, querying, and analyzing this continuous stream of data efficiently.
Increasing Importance of Real-Time Analytics: Businesses require real-time insights to make informed decisions and maintain competitive advantage. Time series databases support real-time analytics by efficiently processing and analyzing time-stamped data, which is critical for applications like monitoring, forecasting, and anomaly detection.
This archival package contains time series measurements of temperature and salinity at the GAK1 site at the mouth of Resurrection Bay near Seward, AK from December 1999 through October 2002. Instrument packages were deployed at 6 depth levels.
Abstract copyright UK Data Service and data collection copyright owner.
The General Household Survey (GHS) ran from 1971 to 2011 (the UKDS holds data from 1972-2011). It was a continuous annual national survey of people living in private households, conducted by the Office for National Statistics (ONS). The main aim of the survey was to collect data on a range of core topics, covering household, family and individual information. This information was used by government departments and other organisations for planning, policy and monitoring purposes, and to present a picture of households, families and people in Great Britain. In 2008, the GHS became a module of the Integrated Household Survey (IHS). In recognition, the survey was renamed the General Lifestyle Survey (GLF). The GLF closed in January 2012. The 2011 GLF is therefore the last in the series. A limited number of questions previously run on the GLF were subsequently included in the Opinions and Lifestyle Survey (OPN).
Secure Access GHS/GLF
The UKDS holds standard access End User Licence (EUL) data for 1972-2006. A Secure Access version is available, covering the years 2000-2011 - see SN 6716 General Lifestyle Survey, 2000-2011: Secure Access.
History
The GHS was conducted annually until 2011, except for breaks in 1997-1998 when the survey was reviewed, and 1999-2000 when the survey was redeveloped. Further information may be found in the ONS document An overview of 40 years of data (General Lifestyle Survey Overview - a report on the 2011 General Lifestyle Survey) (PDF). Details of changes each year may be found in the individual study documentation.
EU-SILC
In 2005, the European Union (EU) made a legal obligation (EU-SILC) for member states to collect additional statistics on income and living conditions. In addition, the EU-SILC data cover poverty and social exclusion. These statistics are used to help plan and monitor European social policy by comparing poverty indicators and changes over time across the EU. The EU-SILC requirement was integrated into the GHS/GLF in 2005. After the closure of the GLF, EU-SILC was collected via the Family Resources Survey (FRS) until the UK left the EU in 2020.
Reformatted GHS data 1973-1982 - Surrey SPSS Files
SPSS files were created by the University of Surrey for all GHS years from 1973 to 1982 inclusive. The early files were restructured and the case changed from the household to the individual with all of the household information duplicated for each individual. The Surrey SPSS files contain all the original variables as well as some extra derived variables (a few variables were omitted from the data files for 1973-76). In 1973 only, the section on leisure was not included in the Surrey SPSS files. This has subsequently been made available, however, and is now held in a separate study, General Household Survey, 1973: Leisure Questions (SN 3982). Records for the original GHS 1973-1982 ASCII files have been removed from the UK Data Archive catalogue, but the data are still preserved and available upon request.
The main GHS consisted of a household questionnaire, completed by the Household Reference Person (HRP), and an individual questionnaire, completed by all adults aged 16 and over resident in the household. A number of different trailers each year covering extra topics were included in later (post-review) surveys in the series from 2000.
A validation assessment of Land Cover Monitoring, Assessment, and Projection Collection 1.1 annual land cover products (1985-2019) for the Conterminous United States was conducted with an independently collected reference dataset. Reference data land cover attributes were assigned by trained interpreters for each year of the time series (1984–2018) to a reference sample of 24,971 Landsat resolution (30m x 30m) pixels. These pixels were randomly selected from a sample frame of all pixels in the ARD grid system which fell within the map area (Dwyer et al., 2018). Interpretation used the TimeSync reference data collection tool, which visualizes Landsat images and Landsat data values for all usable images in the time series (1984-2018) (Cohen et al., 2010). Interpreters also referred to air photos and high resolution images available in Google Earth as well as several ancillary data layers. The interpreted land cover attributes were crosswalked to the LCMAP annual land cover classes: Developed, Cropland, Grass/Shrub, Tree Cover, Wetland, Water, Snow/Ice and Barren. Validation analysis directly compared reference labels with annual LCMAP land cover map attributes by cross tabulation. The results of that assessment are reported here as confusion matrices for land cover agreement and land cover change agreement. The standard errors have been calculated using the post stratified estimator (Card, 1982). Land cover class proportions were also estimated from the reference data for each year, 1985-2018, using the post stratified estimator. A cluster sampling formulation (Stehman 1997) was used to calculate standard sampling error for summary tables reporting results for multiple years of data comparison.
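The cross-tabulation step described above can be illustrated with a small sketch that builds a confusion matrix from paired reference and map labels and computes a simple overall agreement. The labels below are made-up toy values, the function names are hypothetical, and the sketch uses plain counts rather than the post stratified estimator applied in the assessment.

```python
from collections import Counter

# LCMAP annual land cover classes as listed in the description.
CLASSES = ["Developed", "Cropland", "Grass/Shrub", "Tree Cover",
           "Wetland", "Water", "Snow/Ice", "Barren"]

def cross_tabulate(reference, mapped):
    """Count (reference label, map label) pairs; rows are reference classes."""
    counts = Counter(zip(reference, mapped))
    return [[counts[(ref, m)] for m in CLASSES] for ref in CLASSES]

def overall_agreement(matrix):
    """Fraction of samples on the diagonal, where reference and map agree."""
    total = sum(sum(row) for row in matrix)
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return agree / total

# Toy labels for six hypothetical pixels (not real LCMAP data).
reference = ["Water", "Cropland", "Tree Cover", "Tree Cover", "Developed", "Wetland"]
mapped = ["Water", "Cropland", "Tree Cover", "Grass/Shrub", "Developed", "Water"]

matrix = cross_tabulate(reference, mapped)
agreement = overall_agreement(matrix)
print(agreement)
```

Off-diagonal cells show which classes are confused with each other; in the toy data, one Tree Cover reference pixel is mapped as Grass/Shrub.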
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects and is filtered where the books include Advanced environmental monitoring with remote sensing time series data and R. It has 9 columns such as book subject, earliest publication date, latest publication date, avg publication date, and number of authors. The data is ordered by earliest publication date.
https://www.marketresearchintellect.com/privacy-policy
The market size of the Time Series Databases Software Market is categorized based on Application (Relational Databases, NoSQL Databases, Specialized Time Series Databases) and Product (Time-Based Data Storage, Analytics, Monitoring Systems, IoT Applications) and geographical regions (North America, Europe, Asia-Pacific, South America, and Middle-East and Africa).
The provided report presents market size and predictions for the value of Time Series Databases Software Market, measured in USD million, across the mentioned segments.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
You are not authorized to view this dataset. You may email the responsible party OEAW to request access.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset was acquired through measurements on a condition monitoring demonstrator consisting of an AC induction motor powered by 230 V, 50 Hz single-phase AC, using a motor capacitor to start and operate the motor. The motor has a removable fan housing, and its original fan is replaced by different 3D printed fans, both similar to the original fan and with modifications such as missing fan blades. The motor is connected to an air compressor using a metal shaft with a 3D printed attachment that allows a grub screw to be fastened in order to create an unbalance. The aluminum profiles holding the motor and compressor are uncoupled from each other and from the main frame using polymer springs. An LSM9DS1 sensor is mounted on the motor profile and used to measure 3D accelerations at a 400 Hz sampling rate. The sensor is connected to an ESP32 microcontroller reading the measurements at <= 3 ms sample time. The condition monitoring demonstrator allows a multitude of operating conditions to be configured, of which we select the eight conditions described below for this paper's dataset.
ID | Label | Description |
---|---|---|
1 | off | System is activated, but motor is turned off. |
2 | on | Motor is running, powered with 50 Hz AC. |
3 | cap | Motor capacitor is deactivated while motor is running. |
4 | out | Compressor outlet valve is manually constricted. |
5 | unb | A grub screw is inserted on one side of the shaft to create an unbalance. |
6 | c25 | Minor clogging of fan housing by attaching cover with 25 % reduced passage. |
7 | c75 | Major clogging of fan housing by attaching cover with 75 % reduced passage. |
8 | vnt | Replacing the fan with defective fan that is missing 3 fan blades. |
Each condition is labeled with an ID and an abbreviated label, and a short description is given. We also recommend viewing the video documentation of the machine conditions at https://t1p.de/ai4i2021video
For each condition, 10 seconds of structure-borne sound data is collected using the accelerometer. Accelerometer data showed a jitter, with sample times between 2 and 3 ms. The data was then harmonized to regular time intervals at a sampling rate of 300 Hz using cubic spline interpolation. Time series data is available in the folder 'Time Series Data'. There, both raw (_raw.csv) and harmonized data (_hrm.csv) are available. Acceleration values are represented in mg (=10^(-3) g).
Air-borne sound data is recorded using a microphone. The recordings contain background noise to a small extent, although much less than could be expected in many real industrial settings. Microphone data was collected at 48000 Hz and 16 bit resolution and is stored as (*_audio.wav) files in the 'Time Series Data' folder.
To train a condition monitoring classifier, we recommend using the frequency features in the folder 'Frequency Features'. There, a short-time Fourier transform using a 200 ms rectangular window is performed on both structure-borne and air-borne sound data. To acquire more observations, windows overlap by 80 %. Structure-borne sound data is transformed to 10, 15, ..., 120 Hz and air-borne sound data to 25, 50, ..., 2500 Hz frequency amplitude values.
This results in 250 observations per condition, each with 3 x 23 = 69 structure-borne, and 100 air-borne frequency features. The resulting feature dataset of 8 x 250 = 2000 observations is labeled with corresponding IDs and labels, contains the time-stamp at which the STFT window started and the 169 frequency features. The table’s heading denotes the respective acceleration direction and frequency (e.g. xAcc0085Hz, zAcc0015Hz) or the air-borne sound (e.g. snd0075Hz, snd1225Hz).
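The feature counts above can be reproduced with a few lines of arithmetic. The sketch below rebuilds the two frequency grids and generates column names in the pattern of the quoted examples (xAcc0085Hz, snd0075Hz). The "y" axis prefix, the column order, and all variable names are assumptions for illustration; the dataset's actual header may differ.

```python
AXES = ["x", "y", "z"]  # "y" is assumed by analogy with the quoted x and z examples

def frequency_grid(start_hz, stop_hz, step_hz):
    """Inclusive list of integer frequencies from start to stop."""
    return list(range(start_hz, stop_hz + 1, step_hz))

accel_freqs = frequency_grid(10, 120, 5)    # structure-borne: 10, 15, ..., 120 Hz
sound_freqs = frequency_grid(25, 2500, 25)  # air-borne: 25, 50, ..., 2500 Hz

# Column names following the quoted pattern, with four-digit zero-padded frequencies.
accel_names = [f"{axis}Acc{f:04d}Hz" for axis in AXES for f in accel_freqs]
sound_names = [f"snd{f:04d}Hz" for f in sound_freqs]
feature_names = accel_names + sound_names

# A 200 ms window at the harmonized 300 Hz rate spans 60 samples, and the
# spacing between DFT bins is the reciprocal of the window length.
window_s = 0.2
samples_per_window = round(window_s * 300)
bin_spacing_hz = 1.0 / window_s

print(len(accel_freqs), len(accel_names), len(sound_names), len(feature_names))
print(samples_per_window, bin_spacing_hz)
```

The 5 Hz bin spacing of a 200 ms window matches the 5 Hz step of the structure-borne frequency grid.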
Stephan Matzka, HTW Berlin, stephan.matzka@htw-berlin.de
This dataset is part of a publication, please cite. S. Matzka, J. Pilz and A. Franke, "Structure-borne and Air-borne Sound Data for Condition Monitoring Applications," 2021 4th International Conference on Artificial Intelligence for Industries (AI4I), 2021, pp. 1-4, doi: 10.1109/AI4I51902.2021.00009
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
A validation assessment of Land Cover Monitoring, Assessment, and Projection Collection 1.0 annual land cover products (2000–2019) for Hawaii was conducted with an independently collected reference dataset. Reference data land cover attributes were assigned by trained interpreters for each year of the time series (2000–2019) to a reference sample of 600 Landsat resolution (30m x 30m) pixels. The LCMAP and reference dataset labels for each pixel location are displayed here for each year, 2000–2019.
This resource contains the supporting data and code files for the analyses presented in "Toward automating post processing of aquatic sensor data," a manuscript submitted to the journal Environmental Modelling and Software. This paper describes PyHydroQC, a Python package developed to identify and correct anomalous values in time series data collected by in situ aquatic sensors. For more information on PyHydroQC, see the code repository (https://github.com/AmberSJones/PyHydroQC) and the documentation (https://ambersjones.github.io/PyHydroQC/). The package may be installed from the Python Package Index (more info: https://packaging.python.org/tutorials/installing-packages/).
Included in this resource are input data, Python scripts to run the package on the input data (anomaly detection and correction), results from running the algorithm, and Python scripts for generating the figures in the manuscript. The organization and structure of the files are described in detail in the readme file. The input data were collected as part of the Logan River Observatory (LRO). The data in this resource represent a subset of data available for the LRO and were compiled by querying the LRO’s operational database. All available data for the LRO can be sourced at http://lrodata.usu.edu/tsa/ or on HydroShare: https://www.hydroshare.org/search/?q=logan%20river%20observatory.
There are two sets of scripts in this resource: 1.) Scripts that reproduce plots for the paper using saved results, and 2.) Code used to generate the complete results for the series in the case study. While all figures can be reproduced, there are challenges to running the code for the complete results (it is computationally intensive, different results will be generated due to the stochastic nature of the models, and the code was developed with an early version of the package), which is why the saved results are included in this resource. For a simple example of running PyHydroQC functions for anomaly detection and correction on a subset of data, see this resource: https://www.hydroshare.org/resource/92f393cbd06b47c398bdd2bbb86887ac/.