What is R-squared in regression
R-squared (the coefficient of determination) is a regression metric that measures the proportion of variance in the dependent variable explained by the independent variables. In other words, it quantifies how well a regression model's predictions approximate the actual data points. A value of 1 indicates perfect prediction and 0 means the model does no better than always predicting the mean. For least-squares linear regression with an intercept, evaluated on its training data, R-squared lies between 0 and 1, but it can be negative when the predictions are worse than the mean (for example, on held-out data). In PyTorch, you can compute R-squared by comparing predicted and true values using tensor operations.
How it works
R-squared measures the fraction of the total variance in the target variable that the regression model explains. Imagine you have data points scattered around a horizontal line representing the mean. A good model's predictions will cluster closer to the actual points, reducing the residual errors. R-squared is calculated as 1 minus the ratio of residual sum of squares (unexplained variance) to total sum of squares (total variance). Values closer to 1 mean the model fits the data well.
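The calculation described above can be written compactly as a formula. The symbols here are notation chosen for illustration and do not come from the original text: y_i is the i-th true value, \hat{y}_i is the corresponding prediction, \bar{y} is the mean of the true values, and n is the number of data points.

```latex
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}},
\qquad
\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2,
\qquad
\mathrm{TSS} = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2
```

If every prediction equals \bar{y}, then RSS equals TSS and the formula gives 0. If the predictions match the true values exactly, RSS is 0 and the formula gives 1.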
Concrete example
This example shows how to compute R-squared in PyTorch for a simple regression prediction:
import torch
# True target values
y_true = torch.tensor([3.0, -0.5, 2.0, 7.0])
# Predicted values from a regression model
y_pred = torch.tensor([2.5, 0.0, 2.0, 8.0])
# Calculate residual sum of squares (RSS)
rss = torch.sum((y_true - y_pred) ** 2)
# Calculate total sum of squares (TSS)
tss = torch.sum((y_true - torch.mean(y_true)) ** 2)
# Compute R-squared
r2 = 1 - rss / tss
print(f"R-squared: {r2.item():.4f}")
# Output: R-squared: 0.9486
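The same computation can be wrapped in a small reusable function. This is a minimal sketch, not a built-in PyTorch API: the name `r2_score` and the choice to raise an error for a constant target are assumptions made for illustration. It also shows two boundary cases: predicting the mean gives 0, and predictions worse than the mean give a negative value.

```python
import torch


def r2_score(y_true: torch.Tensor, y_pred: torch.Tensor) -> float:
    """Return 1 - RSS/TSS for 1-D tensors (hypothetical helper, not a PyTorch API)."""
    rss = torch.sum((y_true - y_pred) ** 2)
    tss = torch.sum((y_true - torch.mean(y_true)) ** 2)
    if tss.item() == 0.0:
        # Assumption: treat a constant target as an error, since the ratio is undefined.
        raise ValueError("R-squared is undefined when y_true is constant")
    return (1 - rss / tss).item()


y_true = torch.tensor([3.0, -0.5, 2.0, 7.0])

# Predicting the mean everywhere makes RSS equal TSS, so R-squared is 0.
mean_pred = torch.full_like(y_true, y_true.mean().item())
r2_mean = r2_score(y_true, mean_pred)

# Predictions that are worse than the mean produce a negative R-squared.
bad_pred = torch.tensor([7.0, 2.0, -0.5, 3.0])
r2_bad = r2_score(y_true, bad_pred)

print(f"mean predictor: {r2_mean:.4f}")
print(f"bad predictor:  {r2_bad:.4f}")
```

Raising on a constant target is one possible design choice; returning NaN would be another reasonable option, depending on how the metric is consumed.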
When to use it
Use R-squared to evaluate regression models when you want a normalized measure of how well your model explains the variance in the target variable. It is best suited for linear regression and continuous targets. Avoid relying solely on R-squared when your data has outliers, non-linear relationships, or when comparing models with different numbers of predictors (use adjusted R-squared or other metrics instead).
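For the case of comparing models with different numbers of predictors, adjusted R-squared applies a penalty for each added predictor. The sketch below uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The function name `adjusted_r2` and the sample numbers are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Illustrative numbers (not from real data): two models with the same
# plain R-squared of 0.90 on 50 samples, but different predictor counts.
small_model = adjusted_r2(0.90, n_samples=50, n_predictors=3)
large_model = adjusted_r2(0.90, n_samples=50, n_predictors=10)

print(f"3 predictors:  {small_model:.4f}")
print(f"10 predictors: {large_model:.4f}")
```

With equal plain R-squared, the model with more predictors gets the lower adjusted score, which is why the adjusted form is preferred for this kind of comparison.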
Key Takeaways
- R-squared quantifies the proportion of variance explained by a regression model.
- Calculate R-squared as 1 minus the ratio of residual to total variance.
- Use R-squared to assess model fit for continuous regression tasks in PyTorch.
- Beware of R-squared limitations with non-linear data or outliers.