What is supervised learning
Supervised learning is a machine learning approach where models are trained on labeled datasets, meaning each input has a corresponding correct output. The model learns to map inputs to outputs to predict labels on new, unseen data.Supervised learning is a machine learning method that trains models on labeled data to predict outcomes for new inputs.How it works
Supervised learning works by feeding a model a dataset where each example has an input and a known output (label). The model adjusts its internal parameters to minimize the difference between its predictions and the true labels. Think of it like a student learning math by practicing problems with answer keys: the student tries answers, checks against the key, and improves over time.
Concrete example
Here is a simple Python example using scikit-learn to train a supervised learning model (a decision tree) to classify iris flowers based on features:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
# Load labeled iris dataset
iris = load_iris()
X = iris.data # features
Y = iris.target # labels
# Split data into train and test sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
# Initialize and train model
model = DecisionTreeClassifier()
model.fit(X_train, Y_train)
# Predict on test data
predictions = model.predict(X_test)
# Evaluate accuracy
accuracy = accuracy_score(Y_test, predictions)
print(f"Accuracy: {accuracy:.2f}") Accuracy: 0.98
When to use it
Use supervised learning when you have a dataset with clear input-output pairs and want to predict labels or continuous values. Common use cases include image classification, spam detection, and regression tasks like price prediction. Avoid supervised learning if you lack labeled data or want to discover hidden patterns without explicit labels (use unsupervised learning instead).
Key terms
| Term | Definition |
|---|---|
| Supervised learning | Training models on labeled input-output pairs to predict outputs. |
| Label | The known correct output associated with an input example. |
| Model | An algorithm that learns patterns from data to make predictions. |
| Training data | The dataset used to teach the model, containing inputs and labels. |
| Prediction | The output generated by the model for new inputs. |
Key Takeaways
- Supervised learning requires labeled data to train models that predict outputs.
- It is ideal for classification and regression tasks with clear input-output mappings.
- Models improve by minimizing errors between predictions and true labels during training.