How to use feature importance in Scikit-learn
Quick answer
Use the feature_importances_ attribute available in tree-based models such as RandomForestClassifier or GradientBoostingClassifier in scikit-learn to get the importance score of each feature. These scores quantify each feature's contribution to the model's predictions and can be accessed after the model has been fitted.
Prerequisites
Python 3.8+
pip install "scikit-learn>=1.2"
Setup
Install scikit-learn if not already installed and import necessary modules.
pip install "scikit-learn>=1.2"
Step by step
This example trains a RandomForestClassifier on the Iris dataset and prints feature importances.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
# Load data
iris = load_iris()
X, y = iris.data, iris.target
feature_names = iris.feature_names
# Train model
model = RandomForestClassifier(random_state=42)
model.fit(X, y)
# Get feature importances
importances = model.feature_importances_
# Display feature importances
for name, importance in zip(feature_names, importances):
    print(f"{name}: {importance:.4f}")
Output
sepal length (cm): 0.1117
sepal width (cm): 0.0277
petal length (cm): 0.4413
petal width (cm): 0.4193
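A common follow-up, not shown in the example above, is to rank the features from most to least important. A minimal sketch using numpy's argsort on the same fitted model:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(random_state=42).fit(iris.data, iris.target)

# Indices of features sorted by importance, highest first
order = np.argsort(model.feature_importances_)[::-1]
for i in order:
    print(f"{iris.feature_names[i]}: {model.feature_importances_[i]:.4f}")
```

Because the impurity-based importances of a random forest sum to 1, the sorted list reads directly as a share of the model's total attributed importance.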
Common variations
The feature_importances_ attribute is also available on other tree-based models such as GradientBoostingClassifier and ExtraTreesClassifier. For linear models, inspect the coefficients via coef_ instead. Alternatively, permutation importance from sklearn.inspection provides a model-agnostic measure that works with any fitted estimator.
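For the linear-model case, here is a minimal sketch using LogisticRegression on the same Iris data. Averaging the absolute coefficients across classes is one common convention for multiclass models, not a scikit-learn API, so treat the resulting scores as a rough relevance signal:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

iris = load_iris()
# Standardize features so coefficient magnitudes are comparable
X_scaled = StandardScaler().fit_transform(iris.data)
clf = LogisticRegression(max_iter=1000).fit(X_scaled, iris.target)

# For multiclass problems coef_ has shape (n_classes, n_features);
# average the absolute values to get one score per feature
scores = np.abs(clf.coef_).mean(axis=0)
for name, score in zip(iris.feature_names, scores):
    print(f"{name}: {score:.4f}")
```

Standardizing first matters: without it, coefficient magnitudes reflect the scale of each feature as much as its relevance.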
from sklearn.inspection import permutation_importance
# Permutation importance example
result = permutation_importance(model, X, y, n_repeats=10, random_state=42)
for name, importance in zip(feature_names, result.importances_mean):
    print(f"{name}: {importance:.4f}")
Output
sepal length (cm): 0.0900
sepal width (cm): 0.0200
petal length (cm): 0.4300
petal width (cm): 0.4100
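The example above permutes the training data; computing permutation importance on a held-out split is generally a better estimate of which features the model relies on for generalization. A sketch of that variation, which also reports the spread across repeats via result.importances_std:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=42
)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Score drop when each feature is shuffled in the held-out test set
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=42
)
for name, mean, std in zip(
    iris.feature_names, result.importances_mean, result.importances_std
):
    print(f"{name}: {mean:.4f} +/- {std:.4f}")
```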
Troubleshooting
- If feature_importances_ is missing, ensure your model supports it (tree-based models only).
- For linear models, check coef_ instead.
- Permutation importance requires a fitted model and matching data; a shape mismatch between the model and the input arrays raises an error.
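The first two points can be handled defensively with hasattr. A small sketch of a helper (importance_scores is a hypothetical name for this article, not a scikit-learn API) that reads whichever attribute the fitted model exposes:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def importance_scores(model):
    """Return per-feature scores from whichever attribute the model exposes."""
    if hasattr(model, "feature_importances_"):  # tree-based models
        return model.feature_importances_
    if hasattr(model, "coef_"):  # linear models
        coef = np.abs(model.coef_)
        # Multiclass coef_ is (n_classes, n_features); average across classes
        return coef.mean(axis=0) if coef.ndim > 1 else coef
    raise AttributeError(
        "Model exposes neither feature_importances_ nor coef_"
    )

iris = load_iris()
tree = RandomForestClassifier(random_state=42).fit(iris.data, iris.target)
linear = LogisticRegression(max_iter=1000).fit(iris.data, iris.target)
print(importance_scores(tree))
print(importance_scores(linear))
```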
Key Takeaways
- Use the feature_importances_ attribute of tree-based models to get feature importance scores.
- Permutation importance from sklearn.inspection works with any fitted model and provides model-agnostic insights.
- Linear models expose coef_ instead of feature_importances_ for feature relevance.
- Always fit your model before accessing importance attributes to avoid errors.
- Feature importance helps interpret model decisions and improve feature selection.
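On the last point, scikit-learn's SelectFromModel can turn importance scores directly into feature selection. A minimal sketch, assuming the random forest from the main example and the default "mean importance" threshold:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

iris = load_iris()
model = RandomForestClassifier(random_state=42).fit(iris.data, iris.target)

# Keep only features whose importance exceeds the mean importance
selector = SelectFromModel(model, prefit=True, threshold="mean")
X_reduced = selector.transform(iris.data)
kept = [
    name for name, keep in zip(iris.feature_names, selector.get_support()) if keep
]
print(f"Kept {X_reduced.shape[1]} of {iris.data.shape[1]} features: {kept}")
```

With the importances shown earlier, this would retain the two petal measurements and drop the sepal ones, though the exact selection depends on the threshold you choose.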