Model Performance Calibration

Confusion Matrix

This metric can be used for models that perform binary (1 or 0) or multi-class (n classes) classification.

Confusion matrix for binary classification
Confusion matrix for multi-class classification
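The binary confusion matrix can be built by hand. Below is a minimal sketch using made-up example labels (the `y_true`/`y_pred` values are hypothetical), counting true positives, false positives, false negatives, and true negatives:

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels: 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical labels for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
```

The same four counts are what libraries such as scikit-learn's `confusion_matrix` return, just arranged as a 2x2 array.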


Precision, Recall and F1-Score

What is Precision?



F1 Score
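All three metrics come from the confusion-matrix counts: precision = TP / (TP + FP) ("of everything predicted positive, how much was actually positive?"), recall = TP / (TP + FN) ("of all actual positives, how many did we find?"), and F1 is their harmonic mean. A minimal sketch, with hypothetical counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)          # fraction of predicted positives that are correct
    recall = tp / (tp + fn)             # fraction of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# Hypothetical counts: 3 true positives, 1 false positive, 1 false negative
print(precision_recall_f1(3, 1, 1))  # (0.75, 0.75, 0.75)
```

The harmonic mean punishes imbalance: F1 stays low unless both precision and recall are reasonably high.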


Log Loss

This is a loss function.

  • This metric uses the predicted probability scores, not the hard class labels
  • The value lies between 0 and infinity
  • The best value is 0, because we want all losses to be small
  • This metric is hard to interpret when its value is greater than zero
  • The metric penalizes even small deviations in the predicted probabilities
  • In other words, it is the average negative log(probability assigned to the actual class label)
Logloss example

Receiver Operating Characteristic (ROC) ==> Area Under the Curve (AUC)

This metric is used only for binary classification.

Table with tau = 0.93
Table with tau = 0.90
  • AUC doesn’t care about the predicted scores (ŷi) themselves, only about their ordering. From the tables above, we see that the sorted order of yi didn’t change even after the scores from Model 2 were tabulated, so AUC(Model 1) will be equal to AUC(Model 2)
AUC less than 0.5
  • An AUC below 0.5 means the model ranks worse than random guessing; simply flipping its predictions would give an AUC above 0.5
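The ordering-only property of AUC can be checked with the rank interpretation: AUC equals the probability that a randomly chosen positive scores higher than a randomly chosen negative. A minimal sketch (the score values are hypothetical); Model 2's scores are a monotone transform of Model 1's, so the ordering, and hence the AUC, is unchanged:

```python
def auc(y_true, scores):
    """AUC as the probability a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))

y = [1, 0, 1, 0]
model1 = [0.9, 0.4, 0.35, 0.2]
model2 = [s * s for s in model1]  # monotone transform: same ordering
print(auc(y, model1), auc(y, model2))  # 0.75 0.75
```

Because only the ranking matters, rescaling or squashing the scores never changes AUC, exactly as the note above describes.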
R-Square

  • 1 is the best value
  • A value of 0 for R-square implies that the model performs the same as the simple mean model
  • If the R-square value is negative, then the model’s performance is worse than the mean model
R-square calculation
R-square is not outlier friendly
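The calculation behind the bullets above is R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations from the mean. A minimal sketch with made-up values:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y, p) and (y - p) ** 2 for y, p in zip(y_true, y_pred)) if False else \
             sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y = [1, 2, 3, 4]  # hypothetical targets
print(r_squared(y, [1.1, 1.9, 3.2, 3.8]))     # close predictions -> 0.98
print(r_squared(y, [2.5, 2.5, 2.5, 2.5]))     # mean model -> 0.0
print(r_squared(y, [4, 3, 2, 1]))             # worse than the mean -> negative
```

The squared residuals also explain the outlier sensitivity noted above: a single large residual is squared, so one outlier can dominate SS_res and drag R² down sharply.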
