
How to use GradientBoostingClassifier in Scikit-learn

Quick answer
Use GradientBoostingClassifier from sklearn.ensemble to build a gradient boosting model for classification. Fit the model with fit(X_train, y_train) and predict with predict(X_test).

PREREQUISITES

  • Python 3.8+
  • pip install "scikit-learn>=1.2" (quote the requirement so the shell does not treat > as a redirect)

Setup

Install Scikit-learn with pip if it is not already installed. The examples below import GradientBoostingClassifier along with the dataset, splitting, and metric utilities they need.

bash
pip install scikit-learn

Step by step

This example trains a GradientBoostingClassifier on the Iris dataset and evaluates its accuracy on a held-out 20% test split.

python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize model
model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)

# Train model
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
output
Test accuracy: 1.00

Common variations

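You can customize GradientBoostingClassifier by adjusting parameters like n_estimators, learning_rate, and max_depth. After fitting, staged_predict yields the model's predictions at each boosting stage, so you can see how performance changes as trees are added. For large datasets, consider HistGradientBoostingClassifier, a histogram-based implementation that usually trains much faster; note that it takes max_iter where GradientBoostingClassifier takes n_estimators.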

python
from sklearn.ensemble import HistGradientBoostingClassifier

# Alternative faster gradient boosting
hist_model = HistGradientBoostingClassifier(max_iter=100, learning_rate=0.1, max_depth=3, random_state=42)
hist_model.fit(X_train, y_train)
print(f"HistGradientBoostingClassifier accuracy: {hist_model.score(X_test, y_test):.2f}")
output
HistGradientBoostingClassifier accuracy: 1.00
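
The staged_predict method mentioned above can be sketched as follows. This is a minimal illustration, assuming the same Iris split as the main example; the names staged_accuracy and best_stage are introduced here for illustration only. It scores the test set after every boosting stage and reports the first stage that reaches the best accuracy.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same data and split as the main example
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# staged_predict is a generator: one array of predictions per boosting stage
staged_accuracy = [
    accuracy_score(y_test, y_stage) for y_stage in model.staged_predict(X_test)
]

# Index of the first stage with the highest test accuracy (1-based)
best_stage = max(range(len(staged_accuracy)), key=staged_accuracy.__getitem__) + 1

print(f"Stages evaluated: {len(staged_accuracy)}")
print(f"Best accuracy {staged_accuracy[best_stage - 1]:.2f} first reached at stage {best_stage}")
```

If the best accuracy is reached well before the final stage, a smaller n_estimators may be enough. Picking the stage count from test-set scores is only for inspection here; a separate validation split would be the safer choice for actual model selection.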

Troubleshooting

  • If you see ImportError, ensure Scikit-learn is installed in the active Python environment and is up to date.
  • If accuracy is low, tune hyperparameters like n_estimators and learning_rate.
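  • For imbalanced classes, note that GradientBoostingClassifier has no class_weight parameter. Pass per-sample weights through fit(X, y, sample_weight=...), use the class_weight parameter of HistGradientBoostingClassifier, or apply resampling techniques.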
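
The hyperparameter tuning suggested above could look like the sketch below, which uses GridSearchCV from sklearn.model_selection. The grid values and cv=3 are illustrative assumptions, not recommended settings; widen or narrow them to fit your data and time budget.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid: 2 x 2 x 2 = 8 candidate settings
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# 3-fold cross-validation on the training split only
search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Cross-validated accuracy: {search.best_score_:.2f}")
# The search refits the best setting on all training data by default
print(f"Test accuracy: {search.score(X_test, y_test):.2f}")
```

Because the search only sees the training split, the test split remains an untouched estimate of generalization.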

Key Takeaways

  • Use GradientBoostingClassifier from sklearn.ensemble for gradient-boosted classification.
  • Tune n_estimators, learning_rate, and max_depth for best performance.
  • Use HistGradientBoostingClassifier for faster training on large datasets.
  • Always split data into train/test sets to evaluate model generalization.
  • Check for imbalanced data and apply appropriate techniques if needed.
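
One way to handle imbalanced data with GradientBoostingClassifier is to pass balanced sample weights to fit. The sketch below assumes a synthetic 90/10 dataset from make_classification, chosen only for illustration, and uses compute_sample_weight from sklearn.utils.class_weight to give the minority class proportionally larger weights.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_sample_weight

# Synthetic binary dataset with roughly 90% / 10% class balance (illustrative)
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42
)

# Stratify so both splits keep the same class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# "balanced" weights each sample inversely to its class frequency
sample_weight = compute_sample_weight(class_weight="balanced", y=y_train)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train, sample_weight=sample_weight)

# Balanced accuracy averages recall over classes, so the majority class cannot dominate
score = balanced_accuracy_score(y_test, model.predict(X_test))
print(f"Balanced accuracy: {score:.2f}")
```

Plain accuracy can look high on imbalanced data even when the minority class is mostly missed, which is why this sketch reports balanced accuracy.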
Verified 2026-04 · GradientBoostingClassifier, HistGradientBoostingClassifier