Evaluating Regression Models
Regression models are machine learning models used to predict continuous numerical values. The most commonly used metrics for evaluating their performance are:
Mean Squared Error (MSE)
: The average of the squared differences between predicted and actual values (the closer to 0, the better)
Coefficient of Determination (R²)
: Measures how well the model explains the variance in the target variable (the closer to 1.0, the better)
Formula for Mean Squared Error
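Mean Squared Error (MSE) is calculated as the average of the squared differences between predicted and actual values:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

Here $n$ is the number of samples, $y_i$ is the actual value of sample $i$, and $\hat{y}_i$ is the model's prediction for it.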
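As a minimal sketch, the MSE definition can be worked through by hand on a few numbers. The lists `y_true` and `y_pred` below are made-up illustrative values, not data from this lesson, and the calculation uses plain Python rather than any library routine.

```python
# Illustrative values (assumed for this sketch, not from the lesson)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

# Squared difference between each actual and predicted value
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]

# MSE is the average of those squared differences
mse = sum(squared_errors) / len(squared_errors)

print(f"MSE: {mse:.3f}")  # MSE: 0.375
```

In practice, `mean_squared_error` from `sklearn.metrics` should give the same number when called with the same two sequences.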
Formula for Coefficient of Determination (R²)
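The Coefficient of Determination (R²) represents how well the model explains the variance of the target variable:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$

Here $y_i$ is the actual value of sample $i$, $\hat{y}_i$ is the model's prediction for it, and $\bar{y}$ is the mean of the actual values. The numerator is the model's squared error, and the denominator is the squared error of always predicting the mean.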
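The R² calculation can be sketched step by step in plain Python. The values in `y_true` and `y_pred` are assumed for illustration only, and the names `ss_res` and `ss_tot` are labels chosen here for the two sums of squares.

```python
# Illustrative values (assumed for this sketch, not from the lesson)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

# Mean of the actual values
y_mean = sum(y_true) / len(y_true)

# Residual sum of squares: error left after using the model's predictions
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Total sum of squares: error of always predicting the mean
ss_tot = sum((t - y_mean) ** 2 for t in y_true)

r2 = 1 - ss_res / ss_tot
print(f"R² score: {r2:.3f}")  # R² score: 0.882
```

The model's error (1.5) is small compared with the error of the mean-only baseline (12.6875), so R² lands close to 1.0.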
Regression Example: R² Score
The following example demonstrates how to evaluate a regression model using the R² score:
R² Score Example
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Generate synthetic regression data
rng = np.random.RandomState(0)
X_reg = 2 * rng.rand(50, 1)
y_reg = 4 + 3 * X_reg.ravel() + rng.randn(50)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

# Train the model
reg = LinearRegression()
reg.fit(X_train, y_train)

# Make predictions
y_pred = reg.predict(X_test)

# Evaluate the model
r2 = r2_score(y_test, y_pred)
print(f"R² score: {r2:.3f}")
```
Possible values of R² are:
- 1.0: Perfect prediction
- 0: No improvement over predicting the mean
- Negative: Worse than predicting the mean
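The three cases can be reproduced with a small sketch. The helper `r2` below is a hypothetical function written for this illustration (it applies R² = 1 - SS_res / SS_tot directly), and the target values are made up.

```python
def r2(y_true, y_pred):
    """Hypothetical helper: R² = 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Illustrative target values (assumed for this sketch); their mean is 2.5
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, [1.0, 2.0, 3.0, 4.0])       # predictions match exactly
mean_only = r2(y_true, [2.5, 2.5, 2.5, 2.5])     # always predict the mean
reversed_fit = r2(y_true, [4.0, 3.0, 2.0, 1.0])  # trend predicted backwards

print(f"Perfect prediction: {perfect:.1f}")       # 1.0
print(f"Mean-only baseline: {mean_only:.1f}")     # 0.0
print(f"Reversed trend:     {reversed_fit:.1f}")  # -3.0
```

The last case shows that R² has no lower bound: predictions that are further from the targets than the mean push the score below zero.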
Key Takeaways
- Use classification metrics for categorical outputs and regression metrics for continuous outputs.
- Regression models are typically evaluated using Mean Squared Error (MSE) and Coefficient of Determination (R²).
Quiz
What is the primary advantage of using a confusion matrix in evaluating a classification model?
- It provides the accuracy of the model.
- It predicts the future performance of the model.
- It reveals where the model is making mistakes for each class.
- It generates new datasets for training.