Lecture

Evaluating Model Performance (accuracy, R²)

Model evaluation measures how closely a trained model's predictions match the true values.

The metric you choose depends on the type of problem:

  • Classification: accuracy, precision, recall, F1-score
  • Regression: R² (coefficient of determination), MSE, MAE

Scikit-learn provides built-in functions to calculate these metrics.
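As a quick orientation, the sketch below calls the scikit-learn functions behind each metric listed above. The label and target arrays are hypothetical values made up for illustration, not results from any model in this lecture.

```python
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    r2_score, mean_squared_error, mean_absolute_error,
)

# Hypothetical binary labels (1 = positive class), invented for this sketch
y_true_cls = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred_cls = [1, 0, 0, 1, 0, 1, 1, 0]

cls_metrics = {
    "accuracy": accuracy_score(y_true_cls, y_pred_cls),
    "precision": precision_score(y_true_cls, y_pred_cls),
    "recall": recall_score(y_true_cls, y_pred_cls),
    "f1": f1_score(y_true_cls, y_pred_cls),
}

# Hypothetical continuous targets, also invented for this sketch
y_true_reg = [3.0, 5.0, 7.0, 9.0]
y_pred_reg = [2.5, 5.0, 7.5, 9.0]

reg_metrics = {
    "r2": r2_score(y_true_reg, y_pred_reg),
    "mse": mean_squared_error(y_true_reg, y_pred_reg),
    "mae": mean_absolute_error(y_true_reg, y_pred_reg),
}

for name, value in {**cls_metrics, **reg_metrics}.items():
    print(f"{name}: {value:.3f}")
```

Every function takes the true values first and the predictions second, so the same calling pattern carries over from one metric to the next.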


Classification Example: Accuracy Score

The following example shows how to use accuracy score to evaluate a classification model.

Accuracy Score Example
# In a browser-based notebook (JupyterLite/Pyodide), install scikit-learn first.
# Skip these two lines in a regular Python environment.
import piplite
await piplite.install('scikit-learn')

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load data
X, y = load_iris(return_X_y=True)

# Split into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = knn.predict(X_test)

# Evaluate
acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.2f}")

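Accuracy is the fraction of predictions that are correct.

It works well for balanced datasets but can be misleading when the classes are imbalanced. For example, if 95% of samples belong to one class, a model that always predicts that class reaches 95% accuracy without ever identifying the minority class.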
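To make the imbalance caveat concrete, here is a minimal sketch with a hypothetical label set of 95 negatives and 5 positives. The "model" is simply an array of zeros standing in for a classifier that always predicts the majority class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 95 negatives (0) and 5 positives (1)
y_true = np.array([0] * 95 + [1] * 5)

# Stand-in for a classifier that ignores its input and always predicts 0
y_pred_majority = np.zeros_like(y_true)

acc_majority = accuracy_score(y_true, y_pred_majority)
rec_majority = recall_score(y_true, y_pred_majority)  # recall for class 1

print(f"Accuracy: {acc_majority:.2f}")
print(f"Recall (positive class): {rec_majority:.2f}")
```

Accuracy looks high because the majority class dominates the count, while recall shows that none of the positive samples were found.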


Regression Example: R² Score

The following example shows how to use R² score to evaluate a regression model.

R² Score Example
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split  # needed if run on its own

# Synthetic regression data: y = 4 + 3x + Gaussian noise
rng = np.random.RandomState(0)
X_reg = 2 * rng.rand(50, 1)
y_reg = 4 + 3 * X_reg.ravel() + rng.randn(50)

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X_reg, y_reg, test_size=0.2, random_state=42)

# Train
reg = LinearRegression()
reg.fit(X_train, y_train)

# Predict
y_pred = reg.predict(X_test)

# Evaluate
r2 = r2_score(y_test, y_pred)
print(f"R² Score: {r2:.3f}")

R² measures how much of the variance in the target is explained by the model.

The possible values of R² are:

  • 1.0: perfect prediction
  • 0: no better than always predicting the mean of the target
  • Negative: worse than always predicting the mean of the target
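The three cases above follow from the standard definition R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the target from its mean. The sketch below computes this by hand on a small hypothetical target array and compares it with `r2_score`; the helper name `r2_manual` is introduced here for illustration only.

```python
import numpy as np
from sklearn.metrics import r2_score

def r2_manual(y_true, y_pred):
    """Compute R² as 1 - SS_res / SS_tot (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical targets, invented for this sketch
y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect_pred = y_true.copy()                      # exact predictions
mean_pred = np.full_like(y_true, y_true.mean())   # always predict the mean
reversed_pred = y_true[::-1]                      # systematically wrong

r2_perfect = r2_manual(y_true, perfect_pred)
r2_mean = r2_manual(y_true, mean_pred)
r2_worse = r2_manual(y_true, reversed_pred)

print(r2_perfect, r2_mean, r2_worse)
print(r2_score(y_true, reversed_pred))  # should agree with r2_worse
```

Predicting the mean makes SS_res equal to SS_tot, which is why that baseline sits at exactly 0 and anything with larger errors goes negative.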

Key Takeaways

  • Use classification metrics for categorical outputs, and regression metrics for continuous outputs.
  • Always evaluate on a test set that the model hasn't seen during training.
  • Consider multiple metrics for a more complete evaluation, especially with imbalanced data.
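The point about unseen test data can be illustrated with a short sketch that scores the same classifier on the data it was trained on and on held-out data. It reuses the Iris split from the accuracy example, but sets `n_neighbors=1`, which is an assumption made for this illustration: with a single neighbor, each training point is its own nearest neighbor, so the training score tends to look perfect regardless of how well the model generalizes.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_neighbors=1 is chosen here only to exaggerate memorization of the training set
knn1 = KNeighborsClassifier(n_neighbors=1)
knn1.fit(X_train, y_train)

train_acc = accuracy_score(y_train, knn1.predict(X_train))  # seen data
test_acc = accuracy_score(y_test, knn1.predict(X_test))     # unseen data

print(f"Training accuracy: {train_acc:.2f}")
print(f"Test accuracy:     {test_acc:.2f}")
```

Only the test score estimates how the model will behave on new data; the training score mostly reflects how well the model memorized what it was shown.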

Quiz

What metric is used to evaluate regression models, measuring how much variance in the target is explained by the model?

The metric used to evaluate regression models is ____.

  • R²
  • Accuracy
  • Precision
  • F1-score
