2023-02-14
Paper 1 (Unsupervised Deep Learning for IoT Time Series)
liuUnsupervisedDeepLearning2023 link DOI
Non-stationary
For non-stationary time series, researchers can convert them into stationary series through methods such as detrending, reducing their adverse impact on DL models.
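A minimal sketch of two common detrending routes (differencing and removing a fitted linear trend); the function names and the synthetic series are illustrative, not from the paper.

```python
import numpy as np

def detrend_difference(x: np.ndarray, order: int = 1) -> np.ndarray:
    """First-order (or higher) differencing removes trend components, which
    often brings a non-stationary series closer to stationarity."""
    for _ in range(order):
        x = np.diff(x)
    return x

def detrend_linear(x: np.ndarray) -> np.ndarray:
    """Subtract a least-squares linear trend fitted to the series."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, deg=1)
    return x - (slope * t + intercept)

# Example: a noisy upward-trending series becomes roughly zero-mean after detrending.
series = np.linspace(0.0, 10.0, 200) + np.random.normal(scale=0.5, size=200)
stationary = detrend_linear(series)
```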
Multimodal source fusion
- Time series encoding (for example time series to image encoding)
- Analysis of multivariate time series
- treat each dimension as an independent univariate time series
- use multivariate Gaussian distributions
- use two-dimensional matrices
- GNNs, encoding the asymmetric relationships between pairs of sensors as directed edges in a graph; the graph's adjacency matrix is obtained by similarity measurement and embedded by a Graph Attention Network (GAT) (a minimal adjacency-construction sketch follows this list)
- spatio-temporal autoencoders: 2D convolutional autoencoders (2D-CAEs) encode the correlations between different sensors, and ConvLSTMs then capture the dynamic patterns of the system
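A minimal sketch of the similarity-based adjacency construction mentioned above, assuming cosine similarity between sensor channels and a top-k neighbour rule; the GAT embedding step is omitted and all shapes are illustrative.

```python
import numpy as np

def similarity_adjacency(X: np.ndarray, k: int = 3) -> np.ndarray:
    """Build a directed adjacency matrix for a multivariate series X of shape
    (num_sensors, num_timesteps): cosine similarity between sensor channels,
    keeping only each sensor's top-k most similar neighbours (asymmetric edges)."""
    unit = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T                        # (num_sensors, num_sensors) similarity
    np.fill_diagonal(sim, -np.inf)             # exclude self-loops
    adj = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        top_k = np.argsort(sim[i])[-k:]        # indices of the k most similar sensors
        adj[i, top_k] = 1.0                    # directed edge i -> neighbour
    return adj

# Example: 5 sensors, 100 time steps
A = similarity_adjacency(np.random.randn(5, 100), k=2)
```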
Data heterogeneity
- certain types of neural networks are robust to data heterogeneity (example: Residual Neural Network)
Preprocessing: Denoising
- mathematical transformations: FFT, CWT (an FFT-based denoising sketch follows this list)
- deep learning: Denoising Autoencoder (DAE).
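A minimal sketch of the FFT route to denoising (the CWT and DAE options are omitted); the cutoff ratio below is illustrative.

```python
import numpy as np

def fft_lowpass_denoise(x: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Suppress high-frequency noise: transform to the frequency domain with the
    FFT, zero out all but the lowest `keep_ratio` fraction of frequencies, invert."""
    spectrum = np.fft.rfft(x)
    cutoff = max(1, int(len(spectrum) * keep_ratio))
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# Example: a noisy sine wave
t = np.linspace(0, 4 * np.pi, 512)
noisy = np.sin(t) + 0.3 * np.random.randn(len(t))
clean = fft_lowpass_denoise(noisy, keep_ratio=0.05)
```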
Preprocessing: Insufficient Data
- Data augmentation for training data
- Data augmentation: time domain methods, frequency domain methods, and time-frequency domain methods
- Data augmentation methods: speed variation, rotation, time warping, missing-value simulation, adding noise in the time domain, and adding noise in the frequency domain (sketches of a few of these follow this list).
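Minimal sketches of a few of the listed time-domain augmentations (adding noise, speed variation, missing-value simulation); parameter values are illustrative, and the rotation and frequency-domain methods are omitted.

```python
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.03) -> np.ndarray:
    """Add Gaussian noise in the time domain."""
    return x + np.random.normal(scale=sigma, size=x.shape)

def speed_variation(x: np.ndarray, factor: float = 1.2) -> np.ndarray:
    """Stretch or compress the series in time by resampling it."""
    old_idx = np.arange(len(x))
    new_idx = np.linspace(0, len(x) - 1, int(len(x) / factor))
    return np.interp(new_idx, old_idx, x)

def simulate_missing(x: np.ndarray, missing_ratio: float = 0.1) -> np.ndarray:
    """Randomly drop values (set to NaN) to simulate missing data."""
    x = x.copy()
    x[np.random.rand(len(x)) < missing_ratio] = np.nan
    return x

# Example: augment one window of a univariate series
window = np.sin(np.linspace(0, 6.28, 128))
augmented = [jitter(window), speed_variation(window), simulate_missing(window)]
```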
Deep Learning Method Notes
- CNN is suitable for processing data with a grid-like structure
- time series converted into sequences of system feature maps, then use CNN
- time series converted into saliency maps by the Spectral Residual (SR) algorithm, then use CNN
- CNN is applied to the hidden states of the LSTM to extract the local spatial information.
- Convolutional LSTM (ConvLSTM) replaces the fully-connected layers in LSTM with convolutional layers to capture the spatio-temporal correlation.
- Quasi-RNN model alternates convolutional layers and simple recurrent layers to allow parallel processing.
- Dilated RNN uses dilated recurrent skip connections to reduce model parameters and improve training efficiency.
- Temporal Convolutional Network (TCN) uses causal convolutions to fit sequential data and dilated convolutions plus residual modules to memorize past states (a dilated causal convolution sketch follows this list)
- Graph Neural Networks (GNN). Modeling graph-structured data, suitable for complex topological relationships between data sources. GNNs are effective for large-scale multi-relational data modeling, suitable for modeling high-dimensional time series. GNNs: Recurrent GNNs, Convolutional GNNs, Graph Autoencoders, and Spatial Temporal GNNs
- Autoencoder (AE): Autoencoders are primarily designed to encode an input into a latent representation and then reconstruct it
- Generative Adversarial Network (GAN): GANs are machine learning frameworks consisting of two neural networks that compete with each other: one (the generator G) is trained to generate fake data, and the other (the discriminator D) is trained to discern the fake data from the real one. GANs can implicitly model the high-dimensional distribution of data.
- The Transformer leverages the attention mechanism to process sequence data. Unlike RNNs, which process sequences step by step, the Transformer allows the model to access any part of the history regardless of distance, making it significantly more parallelizable and potentially better suited to capturing long-term dependencies.
- The canonical Transformer follows an encoder-decoder structure using stacked self-attention and point-wise, fully connected layers, and it can be utilized for time series modeling.
- Attention Mechanism: a component of the neural network architecture responsible for capturing the correlations between different parts of the data. It helps the model automatically identify the crucial parts of the input and assign them larger weights. In time series modeling, it is assumed that previous time steps have different correlations with the current state, so models with attention mechanisms can adaptively select appropriate previous time steps and aggregate their information into a refined output (a minimal scaled dot-product attention sketch follows this list)
- Transfer Learning (TL): to overcome the problem of insufficient labeled data, TL utilizes labeled data from different but related tasks to facilitate learning of the target task. In other words, the knowledge learned from a related task is transferred to the target task
- Self-Supervised Learning (SSL): to overcome the lack of labeled data, SSL leverages the input data itself as supervision. The general process consists of two steps: first, train the model on a pretext task using a large amount of unlabeled data; second, fine-tune the pre-trained model for the target task.
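A minimal numpy sketch of the dilated causal convolution at the core of a TCN (residual modules omitted); kernel values and the dilation schedule are illustrative.

```python
import numpy as np

def dilated_causal_conv1d(x: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Causal convolution: the output at time t depends only on inputs at t and
    earlier. Dilation inserts gaps between kernel taps, enlarging the receptive
    field without adding parameters."""
    k = len(kernel)
    pad = (k - 1) * dilation
    x_padded = np.concatenate([np.zeros(pad), x])   # left-pad so no future values leak in
    out = np.zeros_like(x)
    for t in range(len(x)):
        taps = x_padded[t + pad - np.arange(k) * dilation]  # x[t], x[t-d], x[t-2d], ...
        out[t] = np.dot(taps, kernel)
    return out

# Example: stacking dilations 1, 2, 4 grows the receptive field exponentially
kernel = np.array([0.5, 0.3, 0.2])
h = np.random.randn(64)
for d in (1, 2, 4):
    h = dilated_causal_conv1d(h, kernel, dilation=d)
```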
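A minimal numpy sketch of scaled dot-product attention, the basic operation behind the Transformer and attention-mechanism bullets above; multi-head projections are omitted and shapes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The softmax weights let
    each time step adaptively aggregate information from the most relevant steps,
    regardless of how far back they are."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # pairwise relevance of time steps
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ V

# Example: self-attention over 10 time steps with 16-dimensional embeddings (Q = K = V)
x = np.random.randn(10, 16)
context = scaled_dot_product_attention(x, x, x)
```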
Generative Methods in Time Series
- Generative models are developed to achieve better representation capabilities.
- Generative models assume that the available data is generated by some unknown distribution and try to estimate this distribution.
- Models: autoencoders and GANs
- Autoencoders learn the distribution of the input data by minimizing the reconstruction loss that measures the distance between the output of the decoder and the input data.
- traditional autoencoders are trained only to minimize the reconstruction errors, which may lead the model to copy the input without learning helpful information. Denoising autoencoders (DAEs) are proposed to solve this problem. A DAE takes noisy data as input but is forced to reconstruct the clean version of the input.
- Variational autoencoders (VAEs) attempt to model the underlying probability distribution of the data. The latent space of a VAE is constrained to follow a specified distribution, so that a random vector sampled from the latent distribution can generate meaningful content similar to the real data. The latent constraint also improves the generalization ability of VAEs (a sketch of the VAE objective follows this list)
- Autoencoders can be constructed based on different neural networks. RNNs can enhance the sequence modeling capability of autoencoders. CNNs can enhance the ability of autoencoders to extract complex features.
- GANs can model the distribution of high-dimensional time series data
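A minimal numpy sketch of the VAE objective described above (reconstruction term plus a KL term that constrains the latent distribution) together with the reparameterization trick; the encoder and decoder networks are omitted and all shapes are illustrative.

```python
import numpy as np

def reparameterize(mu: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """Sample z = mu + sigma * eps, keeping the sampling step differentiable."""
    eps = np.random.randn(*mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def vae_loss(x: np.ndarray, x_recon: np.ndarray, mu: np.ndarray, log_var: np.ndarray) -> float:
    """Reconstruction error plus the KL divergence that pushes the latent
    distribution N(mu, sigma^2) toward the standard normal prior N(0, I)."""
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=-1))
    kl = -0.5 * np.mean(np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=-1))
    return recon + kl

# Example shapes: a batch of 32 windows of length 100, with an 8-dimensional latent space
mu, log_var = np.zeros((32, 8)), np.zeros((32, 8))
z = reparameterize(mu, log_var)
x = np.random.randn(32, 100)
loss = vae_loss(x, x + 0.1 * np.random.randn(32, 100), mu, log_var)
```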
Paper 2 (Deep Learning for Time Series Classification and Extrinsic Regression)
foumaniDeepLearningTime2023 link DOI
TSER
- Time series extrinsic regression (TSER): predicting a numerical value from a time series rather than a categorical label. Most deep learning TSC architectures can be directly applied to TSER with trivial modification.
- Algorithm:
- DTW, NN-DTW, Amerced DTW (SOTA)
- HIVE-COTE, HIVE-COTEv2.0 (ensemble of STC, DrCIF, Temporal Dictionary Ensemble (TDE) and Arsenal (ROCKET ensemble))
- Deep learning-based TSC:
- Generative: aims to find an appropriate time series representation before training a classifier. Generative methods are more complex due to the additional unsupervised training step and are typically less efficient than discriminative methods, which directly map raw time series to a class probability distribution
- Stacked denoising auto-encoders (SDAE)
- Deep Belief Network (DBN)
- Echo State Network (ESN)
- discriminative
- End-to-end
- MLP
- CNN-based:
- Adapted CNNs for time series: t-LeNet, Multi-Channel Deep Convolutional Neural Network (MC-DCNN), Fully Convolutional Networks (FCN), Residual Networks (ResNet), Dilated Convolutional Neural Networks (DCNNs), Disjoint-CNN
- Imaging time series: transforming the series into images before applying a CNN, e.g., the Gramian Angular Field (GAF), which represents the series in polar coordinates and has summation (GASF) and difference (GADF) variants; the Markov Transition Field (MTF), which encodes matrix entries with the transition probability of a data point from one time step to another; Recurrence Plots (RP); and bilinear interpolation (a minimal GAF sketch follows this list)
- Multi-Scale Operation: apply a multi-scale convolutional kernel to the input series or apply regular convolutions on the input series at different scales (e.g., InceptionTime)
- RNN-based. Using RNNs, the input series are classified based on their dynamic behavior. A sequence-to-sequence architecture is used in which each sub-series of the input is classified first; the argmax function is then applied to the entire output, and the neuron with the highest activation determines the classification result.
- RNN
- LSTM
- GRU
- Hybrid: MLSTM-FCN, Time series attentional prototype Network (TapNet), and Semi-supervised Spatio-Temporal (SMATE)
- Attention-based: attention models can capture long-range dependencies
- Self-attention
- Feature-based
- Graph neural networks (GNNs) are very application-specific and are not applicable to most general time series
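A minimal numpy sketch of the Gramian Angular Field encodings (GASF/GADF) referenced in the imaging bullet above; the min-max rescaling convention is an assumption.

```python
import numpy as np

def gramian_angular_field(x: np.ndarray, method: str = "summation") -> np.ndarray:
    """Encode a 1-D series as an image: rescale to [-1, 1], map each value to a
    polar-coordinate angle phi = arccos(x), then take cos(phi_i + phi_j) (GASF)
    or sin(phi_i - phi_j) (GADF) for every pair of time steps."""
    x = np.asarray(x, dtype=float)
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))
    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])   # GASF
    return np.sin(phi[:, None] - phi[None, :])       # GADF

# Example: a length-64 series becomes a 64x64 image a CNN can consume
image = gramian_angular_field(np.sin(np.linspace(0, 6.28, 64)), method="summation")
```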
Ideas
- Unsupervised DL for anomaly detection and fault clustering