2023-02-14
Paper 1 (Unsupervised Deep Learning for IoT Time Series)
liuUnsupervisedDeepLearning2023 link DOI
Non-stationary
For non-stationary time series, researchers can convert them into stationary series through methods such as detrending, reducing their adverse impact on DL models.
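A minimal sketch of two common detrending routes (differencing and removing a fitted linear trend); the function names and the synthetic series are illustrative, not from the paper.

```python
import numpy as np

def detrend_difference(x: np.ndarray, order: int = 1) -> np.ndarray:
    """First-order (or higher) differencing removes trend components, which
    often brings a non-stationary series closer to stationarity."""
    for _ in range(order):
        x = np.diff(x)
    return x

def detrend_linear(x: np.ndarray) -> np.ndarray:
    """Subtract a least-squares linear trend fitted to the series."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, deg=1)
    return x - (slope * t + intercept)

# Example: a noisy upward-trending series becomes roughly zero-mean after detrending.
series = np.linspace(0.0, 10.0, 200) + np.random.normal(scale=0.5, size=200)
stationary = detrend_linear(series)
```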
Multimodal source fusion
- Time series encoding (for example time series to image encoding)
- Analysis of multivariate time series
- treat each dimension as an independent univariate time series
- use multivariate Gaussian distributions
- use two-dimensional matrices
- GNNs, encoding the asymmetric relationships between pairs of sensors as directed edges in a graph; the graph's adjacency matrix is obtained by similarity measurement and embedded by a Graph Attention Network (GAT) (a minimal adjacency-construction sketch follows this list)
- spatio-temporal autoencoders: 2D convolutional autoencoders (2D-CAEs) encode the correlations between different sensors, and ConvLSTMs then capture the dynamic patterns of the system
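A minimal sketch of the similarity-based adjacency construction mentioned above, assuming cosine similarity between sensor channels and a top-k neighbour rule; the GAT embedding step is omitted and all shapes are illustrative.

```python
import numpy as np

def similarity_adjacency(X: np.ndarray, k: int = 3) -> np.ndarray:
    """Build a directed adjacency matrix for a multivariate series X of shape
    (num_sensors, num_timesteps): cosine similarity between sensor channels,
    keeping only each sensor's top-k most similar neighbours (asymmetric edges)."""
    unit = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T                        # (num_sensors, num_sensors) similarity
    np.fill_diagonal(sim, -np.inf)             # exclude self-loops
    adj = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        top_k = np.argsort(sim[i])[-k:]        # indices of the k most similar sensors
        adj[i, top_k] = 1.0                    # directed edge i -> neighbour
    return adj

# Example: 5 sensors, 100 time steps
A = similarity_adjacency(np.random.randn(5, 100), k=2)
```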
Data heterogeneity
- certain types of neural networks are robust to data heterogeneity (example: Residual Neural Network)
Preprocessing: Denoising
- mathematical transformations: FFT, CWT (an FFT-based denoising sketch follows this list)
- deep learning: Denoising Autoencoder (DAE).
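A minimal sketch of the FFT route to denoising (the CWT and DAE options are omitted); the cutoff ratio below is illustrative.

```python
import numpy as np

def fft_lowpass_denoise(x: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Suppress high-frequency noise: transform to the frequency domain with the
    FFT, zero out all but the lowest `keep_ratio` fraction of frequencies, invert."""
    spectrum = np.fft.rfft(x)
    cutoff = max(1, int(len(spectrum) * keep_ratio))
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# Example: a noisy sine wave
t = np.linspace(0, 4 * np.pi, 512)
noisy = np.sin(t) + 0.3 * np.random.randn(len(t))
clean = fft_lowpass_denoise(noisy, keep_ratio=0.05)
```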
Preprocessing: Insufficient Data
- Data augmentation for training data
- Data augmentation: time domain methods, frequency domain methods, and time-frequency domain methods
- Data augmentation methods: speed variation, rotation, time warping, missing-value simulation, adding noise in the time domain, and adding noise in the frequency domain (sketches of a few of these follow this list).
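Minimal sketches of a few of the listed time-domain augmentations (adding noise, speed variation, missing-value simulation); parameter values are illustrative, and the rotation and frequency-domain methods are omitted.

```python
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.03) -> np.ndarray:
    """Add Gaussian noise in the time domain."""
    return x + np.random.normal(scale=sigma, size=x.shape)

def speed_variation(x: np.ndarray, factor: float = 1.2) -> np.ndarray:
    """Stretch or compress the series in time by resampling it."""
    old_idx = np.arange(len(x))
    new_idx = np.linspace(0, len(x) - 1, int(len(x) / factor))
    return np.interp(new_idx, old_idx, x)

def simulate_missing(x: np.ndarray, missing_ratio: float = 0.1) -> np.ndarray:
    """Randomly drop values (set to NaN) to simulate missing data."""
    x = x.copy()
    x[np.random.rand(len(x)) < missing_ratio] = np.nan
    return x

# Example: augment one window of a univariate series
window = np.sin(np.linspace(0, 6.28, 128))
augmented = [jitter(window), speed_variation(window), simulate_missing(window)]
```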
Deep Learning Method Notes
- CNN is suitable for processing data with a grid-like structure
- time series converted into sequences of system feature maps, then use CNN
- time series converted into saliency maps by the Spectral Residual (SR) algorithm, then use CNN
- CNN is applied to the hidden states of the LSTM to extract the local spatial information.
- Convolutional LSTM (ConvLSTM) replaces the fully-connected layers in LSTM with convolutional layers to capture the spatio-temporal correlation.
- Quasi-RNN model alternates convolutional layers and simple recurrent layers to allow parallel processing.
- Dilated RNN uses dilated recurrent skip connections to reduce model parameters and improve training efficiency.
- Temporal Convolutional Network (TCN) uses causal convolutions to fit sequential data and dilated convolutions plus residual modules to memorize past states (a dilated causal convolution sketch follows this list)
- Graph Neural Networks (GNN). Modeling graph-structured data, suitable for complex topological relationships between data sources. GNNs are effective for large-scale multi-relational data modeling, suitable for modeling high-dimensional time series. GNNs: Recurrent GNNs, Convolutional GNNs, Graph Autoencoders, and Spatial Temporal GNNs
- Autoencoder (AE): Autoencoders are primarily designed to encode an input into a latent representation and then reconstruct it
- Generative Adversarial Network (GAN): GANs are machine learning frameworks consisting of two neural networks that compete with each other: one (the generator G) is trained to generate fake data, and the other (the discriminator D) is trained to discern the fake data from the real one. GANs can implicitly model the high-dimensional distribution of data.
- The Transformer leverages the attention mechanism to process sequence data. Unlike RNNs, which process sequences step by step, the Transformer allows the model to access any part of the history regardless of distance, making it significantly more parallelizable and potentially better suited to capturing long-term dependencies.
- The canonical Transformer follows an encoder-decoder structure using stacked self-attention and point-wise, fully connected layers, and it can be utilized for time series modeling.
- Attention Mechanism: a component of the neural network architecture responsible for capturing the correlations between different parts of the data. It helps the model automatically identify the crucial parts of the input and assign them larger weights. In time series modeling, it is assumed that previous time steps have different correlations with the current state, so models with attention mechanisms can adaptively select appropriate previous time steps and aggregate their information into a refined output (a minimal scaled dot-product attention sketch follows this list)
- Transfer Learning (TL): to overcome the problem of insufficient labeled data, TL utilizes labeled data from different but related tasks to facilitate learning of the target task. In other words, the knowledge learned from a related task is transferred to the target task
- Self-Supervised Learning (SSL): to overcome the lack of labeled data, SSL leverages the input data itself as supervision. The general process consists of two steps: first, train the model on a pretext task using a large amount of unlabeled data; second, fine-tune the pre-trained model for the target task.
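A minimal numpy sketch of the dilated causal convolution at the core of a TCN (residual modules omitted); kernel values and the dilation schedule are illustrative.

```python
import numpy as np

def dilated_causal_conv1d(x: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Causal convolution: the output at time t depends only on inputs at t and
    earlier. Dilation inserts gaps between kernel taps, enlarging the receptive
    field without adding parameters."""
    k = len(kernel)
    pad = (k - 1) * dilation
    x_padded = np.concatenate([np.zeros(pad), x])   # left-pad so no future values leak in
    out = np.zeros_like(x)
    for t in range(len(x)):
        taps = x_padded[t + pad - np.arange(k) * dilation]  # x[t], x[t-d], x[t-2d], ...
        out[t] = np.dot(taps, kernel)
    return out

# Example: stacking dilations 1, 2, 4 grows the receptive field exponentially
kernel = np.array([0.5, 0.3, 0.2])
h = np.random.randn(64)
for d in (1, 2, 4):
    h = dilated_causal_conv1d(h, kernel, dilation=d)
```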
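A minimal numpy sketch of scaled dot-product attention, the basic operation behind the Transformer and attention-mechanism bullets above; multi-head projections are omitted and shapes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The softmax weights let
    each time step adaptively aggregate information from the most relevant steps,
    regardless of how far back they are."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # pairwise relevance of time steps
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ V

# Example: self-attention over 10 time steps with 16-dimensional embeddings (Q = K = V)
x = np.random.randn(10, 16)
context = scaled_dot_product_attention(x, x, x)
```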
Generative Methods in Time Series
- Generative models are developed to achieve better representation capabilities.
- Generative models assume that the available data is generated by some unknown distribution and try to estimate this distribution.
- Models: autoencoders and GANs
- Autoencoders learn the distribution of the input data by minimizing the reconstruction loss that measures the distance between the output of the decoder and the input data.
- traditional autoencoders are trained only to minimize the reconstruction errors, which may lead the model to copy the input without learning helpful information. Denoising autoencoders (DAEs) are proposed to solve this problem. A DAE takes noisy data as input but is forced to reconstruct the clean version of the input.
- Variational autoencoders (VAEs) attempt to model the underlying probability distribution of the data. The latent space of a VAE is constrained to follow a specified distribution, so that a random vector sampled from the latent distribution can generate meaningful content similar to the real data. The latent constraint also improves the generalization ability of VAEs (a sketch of the VAE objective follows this list)
- Autoencoders can be constructed based on different neural networks. RNNs can enhance the sequence modeling capability of autoencoders. CNNs can enhance the ability of autoencoders to extract complex features.
- GANs can model the distribution of high-dimensional time series data
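A minimal numpy sketch of the VAE objective described above (reconstruction term plus a KL term that constrains the latent distribution) together with the reparameterization trick; the encoder and decoder networks are omitted and all shapes are illustrative.

```python
import numpy as np

def reparameterize(mu: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """Sample z = mu + sigma * eps, keeping the sampling step differentiable."""
    eps = np.random.randn(*mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def vae_loss(x: np.ndarray, x_recon: np.ndarray, mu: np.ndarray, log_var: np.ndarray) -> float:
    """Reconstruction error plus the KL divergence that pushes the latent
    distribution N(mu, sigma^2) toward the standard normal prior N(0, I)."""
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=-1))
    kl = -0.5 * np.mean(np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=-1))
    return recon + kl

# Example shapes: a batch of 32 windows of length 100, with an 8-dimensional latent space
mu, log_var = np.zeros((32, 8)), np.zeros((32, 8))
z = reparameterize(mu, log_var)
x = np.random.randn(32, 100)
loss = vae_loss(x, x + 0.1 * np.random.randn(32, 100), mu, log_var)
```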
Paper 2 (Deep Learning for Time Series Classification and Extrinsic Regression)
foumaniDeepLearningTime2023 link DOI
TSER
- Time series extrinsic regression (TSER): predicting a numerical value from a time series rather than a categorical label. Most deep learning TSC architectures can be directly applied to TSER with trivial modification.
- Algorithm:
- DTW, NN-DTW, Amerced DTW (SOTA)
- HIVE-COTE, HIVE-COTEv2.0 (ensemble of STC, DrCIF, Temporal Dictionary Ensemble (TDE) and Arsenal (ROCKET ensemble))
- Deep learning-based TSC:
- Generative: aims to find an appropriate time series representation before training a classifier. Generative methods are more complex due to the additional unsupervised training step and are typically less efficient than discriminative methods, which directly map raw time series to a class probability distribution
- Stacked denoising auto-encoders (SDAE)
- Deep Belief Network (DBN)
- Echo State Network (ESN)
- discriminative
- End-to-end
- MLP
- CNN-based:
- Adapted CNNs for time series: t-LeNet, Multi-Channel Deep Convolutional Neural Network (MC-DCNN), Fully Convolutional Networks (FCN), Residual Networks (ResNet), Dilated Convolutional Neural Networks (DCNNs), Disjoint-CNN
- Imaging time series: transforming the series into images before applying a CNN, e.g., the Gramian Angular Field (GAF), which represents the series in polar coordinates and has summation (GASF) and difference (GADF) variants; the Markov Transition Field (MTF), which encodes matrix entries with the transition probability of a data point from one time step to another; Recurrence Plots (RP); and bilinear interpolation (a minimal GAF sketch follows this list)
- Multi-Scale Operation: apply a multi-scale convolutional kernel to the input series or apply regular convolutions on the input series at different scales (e.g., InceptionTime)
- RNN-based. Using RNNs, the input series are classified based on their dynamic behavior. A sequence-to-sequence architecture is used in which each sub-series of the input is classified first; the argmax function is then applied to the entire output, and the neuron with the highest activation determines the classification result.
- RNN
- LSTM
- GRU
- Hybrid: MLSTM-FCN, Time series attentional prototype Network (TapNet), and Semi-supervised Spatio-Temporal (SMATE)
- Attention-based: attention models can capture long-range dependencies
- Self-attention
- Feature-based
- Graph neural networks (GNNs) are very application-specific and are not applicable to most general time series
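A minimal numpy sketch of the Gramian Angular Field encodings (GASF/GADF) referenced in the imaging bullet above; the min-max rescaling convention is an assumption.

```python
import numpy as np

def gramian_angular_field(x: np.ndarray, method: str = "summation") -> np.ndarray:
    """Encode a 1-D series as an image: rescale to [-1, 1], map each value to a
    polar-coordinate angle phi = arccos(x), then take cos(phi_i + phi_j) (GASF)
    or sin(phi_i - phi_j) (GADF) for every pair of time steps."""
    x = np.asarray(x, dtype=float)
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))
    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])   # GASF
    return np.sin(phi[:, None] - phi[None, :])       # GADF

# Example: a length-64 series becomes a 64x64 image a CNN can consume
image = gramian_angular_field(np.sin(np.linspace(0, 6.28, 64)), method="summation")
```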
Ideas
- Unsupervised DL for anomaly detection and fault clustering