AUC (ROC AUC)
A Receiver Operating Characteristic Curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
The “Area Under the Curve” (AUC) classification metric, computed from the ROC curve, represents how effectively the model answers the question “Does the current object belong to the corresponding class?”.
The value is always between 0 and 1:
- The higher the value, the better;
- The lower the value, the worse; however, in this case the model works in “reverse” mode (1 - value can be used instead);
- ~0.5 is the worst (the model predicts no better than random)
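As a sketch of the idea (the function name and pure-Python approach are illustrative, not a platform API), AUC can be computed directly from its ranking interpretation: the probability that a randomly chosen positive sample is scored above a randomly chosen negative one.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive sample
    is ranked above a random negative sample (ties count 0.5)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# One misranked pair out of four gives AUC = 0.75
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A model scoring every positive above every negative gets AUC = 1; random scores give roughly 0.5.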
Binary Classification
Binary Classification is used to predict one of two possible outcomes or classes (e.g. ‘yes’ or ‘no’, ‘black’ or ‘white’, 0 or 1). If all the values of your target variable are represented by only two unique values, this is a binary classification task type.
C
Classification
Classification (Classification Task) is the prediction of a target variable represented as a range of discrete classes. Binary classification tasks have a target variable with two possible classes; multiclass classification tasks have a target variable with three or more classes.
Coefficient of Determination
Coefficient of Determination is the proportion of variance in the dependent variable that is predictable from the independent variables. A perfect model scores 1, while a model that always predicts the mean of the target scores 0. If the Coefficient of Determination is 0.95, then 95% of the variance in the data is explained by the trained model.
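A minimal sketch of the calculation (illustrative pure Python, not any particular library's implementation): one minus the ratio of residual variance to total variance.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot
```

Perfect predictions give 1.0; predicting the mean of the target for every row gives 0.0.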
Confusion Matrix
Confusion Matrix, also known as an error matrix, is a specific table layout that helps to visualize the performance of an algorithm. Each row of the Matrix represents the samples in a predicted class while each column represents the samples in an actual class (or vice versa). The Matrix makes it easy to see if the system is confusing two classes.
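For example, a small confusion matrix can be built in plain Python (the orientation below is actual-by-predicted, one of the two conventions mentioned above; the helper name is illustrative):

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows = actual class, columns = predicted class."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels]
            for actual in labels]

m = confusion_matrix(["cat", "cat", "dog", "dog"],
                     ["cat", "dog", "dog", "dog"],
                     labels=["cat", "dog"])
# m[0][1] counts cats mistaken for dogs
```

Off-diagonal cells show exactly which pairs of classes the model confuses.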
D
Dataset
Dataset is a volume of data (statistics) in a tabular format.
F
Feature
Feature (independent variable, predictor) is represented by a column in a dataset that characterizes the target variable.
Feature Extraction
Feature Extraction is an extraction of additional information (creation of the new variables) from existing data.
Feature Importance Matrix (FIM)
Feature Importance Matrix (FIM) is a chart that displays the impact of features on the model, with 1 being the most impactful.
Frequency functions:
- FFT first peak power
- FFT first peak frequency
- FFT second peak power
- FFT second peak frequency
- FFT third peak power
- FFT third peak frequency
These functions use the fast Fourier transform to calculate the frequency and power of spectral peaks. If Frequency features are selected, the window size should be a power of 2.
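A minimal pure-Python sketch of how such peak features could be derived (a direct DFT stands in for a real FFT here, and the function name is my own; production code would use an optimized FFT library):

```python
import cmath
import math

def fft_peak_features(signal, n_peaks=3):
    """Magnitude spectrum via a direct DFT, then the strongest
    bins as (frequency bin, power) pairs."""
    n = len(signal)  # real FFT libraries expect n to be a power of 2
    mags = []
    for k in range(1, n // 2):  # one-sided spectrum, skip DC (k = 0)
        coef = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(signal))
        mags.append((k, abs(coef)))
    mags.sort(key=lambda kv: kv[1], reverse=True)  # highest power first
    return mags[:n_peaks]

# A pure tone at bin 2 should produce its first peak at frequency bin 2
wave = [math.cos(2 * math.pi * 2 * i / 16) for i in range(16)]
peaks = fft_peak_features(wave)
```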
F1 Score
The F1 Score is the harmonic mean of precision and recall for each class; it reaches its best value at 1 and its worst at 0. Use this metric when you want a good balance between Precision and Recall.
G
Gini
The Gini coefficient applies to binary classification and requires a classifier that can in some way rank examples according to the likelihood of being in a positive class. A Gini value of 0% means that the characteristic cannot distinguish between classes.
The Gini coefficient makes sense for the whole collection of predictions and not individual data points. The Gini coefficient only tells if perfect segregation (based on probability predictions) is possible or not and nothing about the probability threshold. In short, there is no relation between probability threshold and Gini. The Gini coefficient provides an accurate model predictive power measure for imbalanced class problems.
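Because Gini measures ranking quality, it can be derived from AUC as Gini = 2 * AUC - 1; a hypothetical pure-Python sketch (names are illustrative):

```python
def gini_coefficient(y_true, y_score):
    """Gini from the ranking definition: Gini = 2 * AUC - 1."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    # Fraction of positive/negative pairs ranked correctly (ties count 0.5)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1
```

Perfect segregation gives Gini = 1; random scoring gives Gini near 0, matching a Gini value of 0% meaning no discriminating power.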
H
Holdout Validation Dataset
Holdout Validation Dataset is an independent portion of data that won’t be used in model training, but on which metrics will be calculated.
K
Kurtosis
Kurtosis is a statistical measure of the combined weight of a distribution's tails relative to the center of the distribution.
L
Lift
Lift is a measure of the performance of a targeting model at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model.
LogLoss
Logarithmic Loss is a measure of prediction confidence. LogLoss represents the difference between the actual class and the predicted probability of belonging to that class. For example, if the model correctly predicts a 0.90 probability of being in class 1, it is fairly confident, but there is still 0.10 uncertainty in this prediction; LogLoss penalizes this uncertainty.
The lower the LogLoss score, the better the model's predictive power. LogLoss takes into account not the rounded-off predicted class but the probability of the prediction corresponding to a certain class.
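A binary LogLoss can be sketched in pure Python as follows (the function name and the clipping epsilon are illustrative choices, not a specific library's API):

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary LogLoss: the more confident a wrong prediction,
    the larger the penalty."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

A correct prediction at probability 0.90 still costs -log(0.9), about 0.105; pushing the probability to 0.99 shrinks the loss.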
M
Machine Learning
Machine Learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead.
Macro Average Precision
Precision is the fraction of relevant samples among the retrieved samples. The precision score reaches its best value at 1 and its worst at 0. Precision is intuitively the ability of the classifier not to label as positive a sample that is negative. Use this metric when it is most important to find only relevant samples, even if you may skip some of the relevant ones. Macro Average Precision does not take class imbalance into account; it is best applicable when, for example, the three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
Macro Average Recall
Recall is the fraction of relevant samples that have been retrieved over the total amount of relevant samples. The recall score reaches its best value at 1 and its worst at 0. Recall is intuitively the ability of the classifier to find all the positive samples. Use this metric when it is most important to find as many relevant samples as possible, even if you may also mark some wrong examples as relevant. Macro Average Recall does not take class imbalance into account; it is best applicable when, for example, the three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
Macro Average F1 Score
The Macro Average F1 Score is the unweighted arithmetic mean of the per-class F1 scores (each per-class F1 being the harmonic mean of that class's precision and recall); it reaches its best value at 1 and its worst at 0. Use this metric when you want a good balance between Precision and Recall. Macro Average F1 Score does not take class imbalance into account; it is best applicable when, for example, the three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
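A pure-Python sketch of the macro averaging idea (illustrative helper, not a library API): compute F1 per class, then average without class weights.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores: every class
    counts equally regardless of how many samples it has."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```

Replacing the plain mean with a mean weighted by class sample counts would give the Weighted Average F1 Score instead.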
Max
Max is a statistical function that calculates the maximum of the column (feature).
Mean
Mean is a statistical function that calculates the arithmetic mean of the column (feature).
Mean Crossings
Mean Crossings is a statistical function that calculates the number of times the selected column crosses the mean.
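A minimal sketch of the computation (illustrative pure Python): center the signal on its mean and count the sign changes between consecutive samples.

```python
def mean_crossings(values):
    """Count sign changes of (value - mean) between consecutive samples."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    return sum(1 for a, b in zip(centered, centered[1:])
               if (a < 0) != (b < 0))

# [1, 3, 1, 3] has mean 2 and crosses it three times
```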
Mean Absolute Error (MAE)
Mean Absolute Error is the average absolute difference between the observed and predicted values. The direction of the error (positive or negative) does not matter, because each error is taken as an absolute value.
Use this metric to minimize the average error.
There are two other metrics, dependent on Mean Absolute Error:
- Max AE is the maximum absolute difference between the actual value (true) and the predicted value;
- Min AE is the minimum absolute difference between the actual value (true) and the predicted value.
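All three metrics can be sketched together in pure Python (the function name is illustrative):

```python
def absolute_errors(y_true, y_pred):
    """MAE plus its companion metrics Max AE and Min AE."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mae = sum(errors) / len(errors)
    return mae, max(errors), min(errors)

mae, max_ae, min_ae = absolute_errors([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
```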
Mean Squared Error (MSE)
MSE is a measure of the quality of an estimator, it is always non-negative, and values closer to zero are better. MSE measures the average of the squares of the errors - that is, the average squared difference between the estimated values and true values. The squaring is necessary to remove any negative signs, it also gives more weight to larger differences, so bigger errors are penalized higher.
Metadata
Metadata is the information about the data.
Metric
Metric is a functional value that describes model quality.
Min
Min is a statistical function that calculates the minimum of the column (feature).
Model
Model is a mathematical representation of dependencies between the features (independent variables) and the target variable.
Model Growth Chart
Model Growth Chart presents iterations of growth and model construction on a single graph, allowing users to select and download the optimal model for their specific needs based on both model quality and model size.
Model Quality Diagram
Model Quality Diagram is a graphical representation of model quality in relation to metric indicator values that are scaled in the range [0-1], where 1 is the ideal quality of the model, and 0 is the minimum quality of the model.
Multi Classification
Multi Classification is used to predict one value out of a limited number (greater than 2) of possible outcomes (e.g. ‘red’ or ‘green’ or ‘yellow’; ‘high’ or ‘medium’ or ‘low’; 1 or 2 or 3 or 4 or 5, etc.). If all the values of your target variable are represented by a discrete (fixed) number of unique values/classes (>2), this is a multiclass classification task type.
N
Negative Mean Crossings
Negative Mean Crossings computes the number of times the selected input crosses the mean with a negative slope.
Neural Network
Neural Network is a series of algorithms that endeavors to recognize underlying relationships in a dataset through a process that is close to the way the human brain operates.
P
Positive Mean Crossings
Positive Mean Crossings computes the number of times the selected input crosses the mean with a positive slope.
Precision
Precision is the fraction of relevant samples among the retrieved samples. The precision score reaches its best value at 1 and worst score at 0. Precision is intuitively the ability of the classifier not to label as positive a sample that is negative. Use this metric when the priority is to find only relevant samples without a mistake, even if you may end up skipping some of the relevant samples.
R
Recall
Recall is the fraction of relevant samples that have been retrieved over the total amount of relevant samples. The recall score reaches its best value at 1 and worst score at 0. The recall is the ability of the classifier to find all the positive samples. Use this metric when the priority is to find as many relevant samples as possible even if you may also mark some of the wrong examples as relevant.
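Both Precision and Recall follow directly from the true/false positive and false negative counts; a pure-Python sketch (illustrative helper name):

```python
def precision_recall(y_true, y_pred):
    """Binary precision and recall from TP, FP, and FN counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

With 2 true positives, 1 false positive, and 1 false negative, both metrics come out to 2/3.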
Regression
Regression (regression task type) is predicting a continuous value (for example predicting the prices of a house given the house features like location, size, number of bedrooms, etc).
Root Mean Square
Root Mean Square is the root of the arithmetic mean of the squares of a set of numbers.
Root Mean Squared Error (RMSE)
Root Mean Squared Error or RMSE represents the error between observed and predicted values (the square root of the average squared error over all observations). The lower the RMSE, the better the model's predictive power. RMSE is always non-negative, and a value of 0 would indicate a perfect fit to the data. Use it when you want a model that avoids large individual errors on any prediction.
Root Mean Squared Logarithmic Error (RMSLE)
Root Mean Squared Logarithmic Error or RMSLE can be used when you don't want to penalize huge differences between predictions and actuals when both values are large numbers. The lower the RMSLE, the better the model's predictive power. RMSLE can also be used when you want to penalize underestimates more than overestimates.
Root Mean Squared Percentage Error (RMSPE)
Root Mean Squared Percentage Error represents the percentage error between the observed and predicted values: the square root of the mean of the squared relative errors, where each relative error is the difference between the actual and predicted value divided by the actual value. The lower the RMSPE, the better the model's predictive power.
RMSPE is always non-negative, and a value of 0 (almost never achieved in practice) would indicate a perfect fit to the data. In general, a lower RMSPE is better than a higher one.
Rows with 0 values in the target variable are filtered out and are not used in the validation pipeline as division by 0 is not possible.
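The three squared-error metrics above can be sketched in pure Python (illustrative helpers; `log1p` is one common way to keep the logarithmic variant defined near zero):

```python
import math

def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2
                         for t, p in zip(y_true, y_pred)) / len(y_true))

def rmsle(y_true, y_pred):
    """RMSE on log1p-transformed values: large absolute gaps
    between large numbers are penalized less."""
    return rmse([math.log1p(t) for t in y_true],
                [math.log1p(p) for p in y_pred])

def rmspe(y_true, y_pred):
    """RMSE of relative errors; rows with a zero actual value
    must be filtered out beforehand to avoid division by zero."""
    return math.sqrt(sum(((t - p) / t) ** 2
                         for t, p in zip(y_true, y_pred)) / len(y_true))
```

For example, predicting 110 against an actual value of 100 gives an RMSPE of 0.1, i.e. a 10% error.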
S
Skewness
Skewness statistical function measures the asymmetry of the distribution of a variable.
Solution
Solution is an object in Neuton in which all model parameters are specified. All workflow actions are executed inside the solution.
Splitting
Splitting is the process of separating a dataset into two parts: one for training and one for validation.
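A minimal random split can be sketched as follows (the function name, fraction, and seed are illustrative assumptions, not a platform default):

```python
import random

def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Randomly split rows into a training part and a validation part."""
    shuffled = rows[:]  # copy so the caller's order is preserved
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

train, val = train_validation_split(list(range(10)))
```

Fixing the seed makes the split reproducible, so metrics from different training runs stay comparable.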
T
Target Variable
Target Variable is a variable the model is learning to predict. The target variable may be represented as a range of discrete classes or as continuous real numbers.
TinyML
TinyML is a field of study in Machine Learning and Embedded Systems that explores the types of models you can run on small, low-powered devices like microcontrollers. It enables low-latency, low-power, and low-bandwidth model inference on edge devices.
Total Footprint
Total Footprint is the amount of space in FLASH memory and SRAM that the model uses for inference.
Training
Training is the process of learning to uncover relationships between the features of a particular dataset and the target variable.
Training Dataset
Training Dataset is the input dataset (or its part) that the machine learning algorithm uses to “learn” to uncover relationships between its features and the target variable.
V
Validation
Validation is the quality assessment process for a model which has been trained and built to predict a particular target variable.
Validation Dataset
Validation Dataset is another subset of the input data used to predict the target variable with the trained model, and measure the error between the known target values in the validation dataset and the predictions.
Validation Metrics
Validation Metric is a functional value that describes model quality applying to the holdout validation dataset or cross-validation process.
Variance
Variance is a statistical function that calculates the variance of the column (feature): the average of the squared deviations from the mean.
W
Weighted Average Precision
Weighted Average Precision accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.
Weighted Average Recall
Weighted Average Recall accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.
Weighted Average F1 Score
Weighted Average F1 Score accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.