About ML models
An ML model is an algorithm based on machine learning methods tasked with analyzing the telemetry of the monitored asset and detecting anomalies.
An ML model is created for a specific monitored asset while taking into account the specifications of the asset and the characteristics of telemetry data. The general structure of the algorithm (architecture) is formed during creation of the ML model. Then the ML model is trained based on historical telemetry data and is thereby adjusted to the behavior of a specific object.
An ML model consists of one or more elements, each of which analyzes telemetry data separately to detect anomalies. Normally, the more complex the industrial processes of the monitored asset are, the more elements the ML model will contain. An ML model can include the following elements operating in parallel:
- Predictive elements
- Elements based on a diagnostic rule
- Elliptic envelope-based elements
Predictive elements and elements based on elliptic envelopes need to be trained on a dataset. A predictive element learning process may consist of one or several epochs. An epoch is a cycle during which an element is trained on the entire training dataset. The number of training epochs is specified in the element training settings. Elements based on a diagnostic rule do not need to be trained, so they are considered to be pretrained.
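The notion of an epoch can be illustrated with a minimal sketch. It is not the Kaspersky MLAD training code: the `train` function, the one-input linear model, and the `learning_rate` parameter are assumptions made purely for illustration. The only point it shows is that one epoch equals one full pass over the training dataset and that the number of epochs is a training setting.

```python
def train(dataset, epochs, learning_rate=0.05):
    """Fit y ~ w * x + b by stochastic gradient descent (illustrative only)."""
    w, b = 0.0, 0.0
    loss_per_epoch = []
    for _ in range(epochs):
        total = 0.0
        # One epoch: the element sees every sample of the training dataset once.
        for x, y in dataset:
            error = (w * x + b) - y
            w -= learning_rate * error * x
            b -= learning_rate * error
            total += error ** 2
        loss_per_epoch.append(total / len(dataset))
    return w, b, loss_per_epoch

# Hypothetical training dataset generated from y = 2x + 1.
dataset = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(20)]
w, b, losses = train(dataset, epochs=200)
```

In this sketch the mean squared error recorded for the last epoch is lower than for the first, which is the usual reason for training over several epochs rather than one.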
The process of using an ML model to analyze telemetry data and detect anomalies is known as inference. In Kaspersky MLAD, ML model inference can be performed on historical data (historical inference) and on telemetry data received in real time (streaming inference). If historical inference is started for multiple ML models, Kaspersky MLAD runs the inference of these ML models in the order in which they were placed in the startup queue. The duration of historical inference depends on the time interval of the data analyzed by the ML model. If streaming inference is started for multiple ML models, Kaspersky MLAD runs the inference of these ML models simultaneously. Historical inference and streaming inference run in parallel and independently of each other. During inference, the ML model registers incidents, which can be viewed in the Incidents section.
In addition to incidents, an ML model inference process also generates artifacts. An artifact is a time series of numerical data. An ML model can generate the following artifacts:
- Artifacts associated with tags. An ML model element generates these artifacts for each of its output tags. These artifacts are generated only by the predictive elements of the ML model and represent a predicted tag value and prediction error.
- Artifacts of ML model elements. Each ML model element generates this type of artifact as its primary output. The mathematical nature of an artifact is determined by the analytical algorithms employed by the element. Regardless of the element type, the artifact is interpreted uniformly as the degree to which the behavior of the monitored asset deviates from the expected (normal) behavior. Every artifact has a critical threshold. If this threshold is reached, an incident is recorded.
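The relationship between an element artifact, its critical threshold, and incidents can be sketched as follows. This is an illustration under stated assumptions, not the product's logic: the `Incident` class and the `detect_incidents` function are hypothetical, and the sketch assumes that one incident is recorded each time the artifact rises to the threshold, rather than one per sample.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    start: int    # timestamp at which the artifact first reached the threshold
    peak: float   # highest artifact value while at or above the threshold

def detect_incidents(artifact, threshold):
    """artifact: list of (timestamp, value) pairs ordered by time."""
    incidents = []
    current = None
    for timestamp, value in artifact:
        if value >= threshold:
            if current is None:
                # The threshold has just been reached: open a new incident.
                current = Incident(start=timestamp, peak=value)
                incidents.append(current)
            else:
                current.peak = max(current.peak, value)
        else:
            # The artifact is back below the threshold: close the incident.
            current = None
    return incidents

# Hypothetical artifact: a time series of numerical data.
artifact = [(0, 0.1), (1, 0.3), (2, 0.9), (3, 1.2), (4, 0.4), (5, 1.0), (6, 0.2)]
incidents = detect_incidents(artifact, threshold=0.8)
```

With this sample series the sketch yields two incidents, one starting at timestamp 2 and one at timestamp 5.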
Any user can view generated artifacts under Monitoring and History.
ML models can be created by Kaspersky specialists or by a certified integrator as part of the Kaspersky MLAD Model-building and Deployment Service. To use such ML models, you must import them to Kaspersky MLAD. You can also create ML models independently and add the necessary elements to them using the model builder.
About predictive ML model elements
Predictive ML model elements predict the behavior of an object from data on its recent behavior. Predictive ML model elements include neural network elements and linear regression-based elements.
Kaspersky MLAD model builder supports the following architectures for ML model predictive elements:
- Dense. Neural network element of an ML model with a fully connected architecture. When creating an ML model element, you must specify the multipliers for calculating the number of neurons on inner layers and the activation functions on them.
- TCN. Neural network element of an ML model with a hierarchical temporal convolutional architecture. When creating an ML model element, you must specify the size and number of filters, the dilations (extensions) on layers, the activation functions on them, and the number of layers in the residual block.
- CNN. Neural network element of an ML model with a convolutional architecture. When creating an ML model element, you must specify the number of neurons on the layers of the encoder, the size and number of filters on layers, and the size of the max pooling window (MaxPooling).
- RNN. Neural network element of an ML model with a recurrent architecture. When creating an ML model element, you must specify the number of GRU neurons on layers and the number of time-distributed neurons on the layers of the decoder.
- Transformer. Neural network element of an ML model with a transformer architecture. When creating an ML model element, you must specify the number of attention heads and the number of transformer encoders.
- Linear regression. Element of an ML model based on linear regression.
A predictive element of an ML model generates the following artifacts as a result of inference:
- Predicted tag values. These are displayed in the central part of the Monitoring and History sections on individual graphic areas of the selected preset.
- Individual prediction errors. These are the differences between the predicted and actual values for each tag. They are displayed in the central part of the Monitoring and History sections on individual graphic areas of the selected preset.
- Total prediction error (cumulative prediction error). This is the total discrepancy between the predicted and actual values. The cumulative prediction error and its threshold are displayed in the central part of the Monitoring and History sections, in the graphic area that follows the graphic areas of the selected preset, and on the ML model element artifact graph located at the bottom of the sections.
If the cumulative prediction error exceeds the cumulative prediction error threshold, the predictive element of the ML model considers this a deviation in the behavior of the monitored asset and registers an incident.
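The sketch below shows how individual prediction errors can be combined into a cumulative prediction error and compared with a threshold. The aggregation formula is an assumption: the text does not say how Kaspersky MLAD computes the total discrepancy, so the root mean square of the individual errors is used here as one plausible choice. The tag names, values, and `THRESHOLD` are hypothetical.

```python
import math

def individual_errors(predicted, actual):
    """Difference between predicted and actual value for each tag."""
    return {tag: predicted[tag] - actual[tag] for tag in predicted}

def cumulative_error(errors):
    """Assumed aggregation: root mean square of the individual errors."""
    return math.sqrt(sum(e * e for e in errors.values()) / len(errors))

THRESHOLD = 1.0  # hypothetical cumulative prediction error threshold

predicted = {"tag_A": 10.0, "tag_B": 5.0}
normal = {"tag_A": 10.3, "tag_B": 4.6}      # close to the prediction
anomalous = {"tag_A": 13.0, "tag_B": 1.0}   # far from the prediction

normal_error = cumulative_error(individual_errors(predicted, normal))
anomalous_error = cumulative_error(individual_errors(predicted, anomalous))

# An incident is registered only when the threshold is exceeded.
register_incident = anomalous_error > THRESHOLD
```

Here the first observation stays below the threshold, while the second exceeds it and would lead to an incident.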
About elements of an ML model based on a diagnostic rule
Diagnostic rules describe previously known behavioral traits of the monitored asset that are considered anomalies. Diagnostic rules must be formalized and calculated based on available telemetry data for the object.
Examples of diagnostic rules:
- The level of tag A has changed abruptly (the Step change tag behavior criterion).
- Over the past 12 hours, tag B has trended upward, tag C has trended downward, and tag D has not shown any clear dynamics.
- The value of tag X fell below 2800 after it previously rose higher than 2900.
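Two of the example rules above can be formalized in a few lines. This is a minimal sketch, not the diagnostic rule syntax of Kaspersky MLAD: the function names, the `min_jump` parameter, and the sample values are hypothetical, and the step change is approximated as a jump between two consecutive samples.

```python
def fell_after_rise(values, high=2900, low=2800):
    """Indices where the tag drops below `low` after having exceeded `high`."""
    armed = False   # becomes True once the value has risen above `high`
    hits = []
    for i, value in enumerate(values):
        if value > high:
            armed = True
        elif value < low and armed:
            hits.append(i)
            armed = False   # wait for the next rise before firing again
    return hits

def step_change(values, min_jump):
    """Indices where the value jumps by at least `min_jump` from the previous sample."""
    return [i for i in range(1, len(values))
            if abs(values[i] - values[i - 1]) >= min_jump]

tag_x = [2750, 2850, 2950, 2920, 2790, 2780, 2850, 2790]
tag_a = [10, 10, 11, 25, 25, 24]

x_hits = fell_after_rise(tag_x)          # [4]
a_hits = step_change(tag_a, min_jump=10) # [3]
```

Because the rule is stated explicitly, nothing is learned from data, which is consistent with such elements being considered pretrained.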
About elliptic envelope-based ML model elements
Elliptic envelopes are used to detect abnormal states of a monitored asset.
Unlike a predictive element, an elliptic envelope does not attempt to determine how the behavior of the ML model's input tags affects the behavior of its output tags. An elliptic envelope uses the assumption that the set of tags included in the ML model describes the state of the monitored asset at any given moment, and the observable states have a normal distribution (also known as a Gaussian distribution) in the phase space.
During training, the elliptic envelope adjusts the parameters of this normal distribution while considering that the training sample may contain a certain percentage of anomalous states. During the training of an ML model, an elliptical region is formed in the phase space. States that fall within this region are classified as normal, while all other states are categorized as outliers (anomalies). The farther a state is from the boundaries of the ellipse, the more anomalous it is. The tag whose value as part of the anomalous state contributed the most to the deviation from the ellipse is considered the top tag.
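The idea can be sketched for two tags using only the Python standard library. Several simplifications are assumed: the mean and covariance are plain sample estimates (they do not account for anomalous states in the training sample, as a robust estimator would), the ellipse boundary is taken as the 99% quantile of the chi-square distribution with two degrees of freedom, and the top tag is chosen by the largest standardized deviation. The tag names and data are hypothetical, and none of this is claimed to match the Kaspersky MLAD implementation.

```python
import math
import random

def fit(states):
    """Estimate the mean and covariance of two-tag states (non-robust)."""
    n = len(states)
    mean = [sum(s[i] for s in states) / n for i in range(2)]
    def cov(i, j):
        return sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in states) / (n - 1)
    return mean, (cov(0, 0), cov(0, 1), cov(1, 1))

def distance_sq(state, mean, cov):
    """Squared Mahalanobis distance of a state from the center of the ellipse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = state[0] - mean[0], state[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def top_tag(state, mean, cov, names):
    """Assumed criterion: the tag with the largest standardized deviation."""
    sxx, _, syy = cov
    scores = [abs(state[0] - mean[0]) / math.sqrt(sxx),
              abs(state[1] - mean[1]) / math.sqrt(syy)]
    return names[scores.index(max(scores))]

rng = random.Random(0)
training = [(rng.gauss(50.0, 2.0), rng.gauss(5.0, 0.5)) for _ in range(500)]
mean, cov = fit(training)

# For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
boundary = -2.0 * math.log(1.0 - 0.99)

names = ["temperature", "pressure"]
normal_state = (50.0, 5.0)
anomalous_state = (50.0, 9.0)

is_anomaly = distance_sq(anomalous_state, mean, cov) > boundary
culprit = top_tag(anomalous_state, mean, cov, names)
```

In this example the state with a pressure of 9.0 falls far outside the ellipse, and pressure is reported as the top tag because the temperature value is close to its mean.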
An elliptic envelope is simpler to construct than a predictive element, learns more quickly, and requires fewer resources for inference. However, an elliptic envelope only demonstrates good performance when applied to stationary equipment operating modes that do not involve multiple operating ranges or abrupt changes in tag values.