How to compute mean and std in numpy
Quick answer
Use numpy.mean() to compute the average and numpy.std() to compute the standard deviation of an array. Both functions accept an axis parameter to calculate the statistic along a specific dimension.
Prerequisites
Python 3.8+
pip install "numpy>=1.23"
Setup
Install numpy with pip if it is not already installed, then import it in your Python script to access its statistical functions.
pip install numpy
Step by step
Call np.mean() and np.std() without an axis argument to reduce over every element of the array. Pass axis=0 to get one result per column, or axis=1 to get one result per row.
import numpy as np
# Create a sample numpy array
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Compute mean of all elements
mean_all = np.mean(arr)
# Compute std of all elements
std_all = np.std(arr)
# Compute the mean of each column (axis=0 reduces across rows)
mean_axis0 = np.mean(arr, axis=0)
# Compute the std of each row (axis=1 reduces across columns)
std_axis1 = np.std(arr, axis=1)
print(f"Mean (all): {mean_all}")
print(f"Std (all): {std_all}")
print(f"Mean (axis=0): {mean_axis0}")
print(f"Std (axis=1): {std_axis1}")
Output
Mean (all): 3.5
Std (all): 1.707825127659933
Mean (axis=0): [2.5 3.5 4.5]
Std (axis=1): [0.81649658 0.81649658]
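To make these results less of a black box, the sketch below recomputes the population standard deviation from its definition, sqrt(mean((x - mean)^2)), and checks it against np.std(). It also shows the shapes that each axis value produces. The keepdims=True argument is not covered in the walkthrough above; it is included here as one possible way to subtract per-row means by broadcasting. Variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Population standard deviation from its definition:
# the square root of the mean squared deviation from the mean.
manual_std = np.sqrt(np.mean((arr - np.mean(arr)) ** 2))
assert np.isclose(manual_std, np.std(arr))

# axis=0 collapses the rows, leaving one value per column.
# axis=1 collapses the columns, leaving one value per row.
print(np.mean(arr, axis=0).shape)  # (3,)
print(np.mean(arr, axis=1).shape)  # (2,)

# keepdims=True keeps the reduced axis with length 1, so the result
# broadcasts against the original array, e.g. to center each row.
row_means = np.mean(arr, axis=1, keepdims=True)  # shape (2, 1)
centered = arr - row_means
print(centered)
```

After centering, every row of centered has a mean of zero, which is a quick way to confirm the axis was chosen correctly.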
Common variations
Use numpy.average() with a weights argument to compute a weighted mean. Pass ddof=1 to numpy.std() for the sample standard deviation, which divides by N - 1 instead of N and is the usual choice when the data is a sample rather than the whole population.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 4, 5])
# Weighted average
weighted_mean = np.average(arr, weights=weights)
# Sample standard deviation (ddof=1)
sample_std = np.std(arr, ddof=1)
print(f"Weighted mean: {weighted_mean}")
print(f"Sample std (ddof=1): {sample_std}")
Output
Weighted mean: 3.6666666666666665
Sample std (ddof=1): 1.5811388300841898
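As a rough check on what these two variations compute, the following sketch rebuilds both by hand: the weighted mean as sum(w * x) / sum(w), and the standard deviation with divisors N and N - 1. The names manual_weighted, population_std, and sample_std are illustrative and not part of numpy.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 4, 5])

# Weighted mean by hand: sum(w * x) / sum(w).
manual_weighted = np.sum(weights * arr) / np.sum(weights)
assert np.isclose(manual_weighted, np.average(arr, weights=weights))

# ddof ("delta degrees of freedom") changes the divisor from N to N - ddof.
n = arr.size
squared_dev = (arr - arr.mean()) ** 2
population_std = np.sqrt(squared_dev.sum() / n)    # matches np.std(arr)
sample_std = np.sqrt(squared_dev.sum() / (n - 1))  # matches np.std(arr, ddof=1)

assert np.isclose(population_std, np.std(arr))
assert np.isclose(sample_std, np.std(arr, ddof=1))

print(f"Population std: {population_std:.4f}")  # 1.4142
print(f"Sample std:     {sample_std:.4f}")      # 1.5811
```

The gap between the two divisors shrinks as N grows, so the choice of ddof matters most for small arrays like this one.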
Troubleshooting
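If a result looks wrong, inspect the input first. Check arr.dtype to confirm the array is numeric and arr.size to confirm it is not empty: on an empty array, np.mean() and np.std() return nan with a RuntimeWarning instead of raising. Also confirm that axis is within the array's dimensions (from -arr.ndim to arr.ndim - 1), because an out-of-range axis raises an AxisError.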
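One way to apply these checks is to wrap them in a small helper. The function below, checked_mean_std, is a hypothetical name and not part of numpy. It is a minimal sketch that assumes axis is either None or a single integer, and it chooses to raise on empty input rather than return nan.

```python
import numpy as np


def checked_mean_std(values, axis=None, ddof=0):
    """Hypothetical helper: validate the input, then return (mean, std)."""
    arr = np.asarray(values)

    # Reject strings, objects, and other non-numeric dtypes up front.
    # Note: this also rejects boolean arrays, which np.mean() would accept.
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError(f"expected a numeric array, got dtype {arr.dtype}")

    # np.mean() on an empty array returns nan with a RuntimeWarning;
    # this sketch raises instead so the problem is not silently ignored.
    if arr.size == 0:
        raise ValueError("input array is empty")

    # Valid integer axes run from -arr.ndim to arr.ndim - 1.
    if axis is not None and not -arr.ndim <= axis < arr.ndim:
        raise ValueError(f"axis {axis} is out of bounds for a {arr.ndim}-D array")

    return np.mean(arr, axis=axis), np.std(arr, axis=axis, ddof=ddof)


mean, std = checked_mean_std([[1, 2, 3], [4, 5, 6]], axis=0)
print(mean)  # [2.5 3.5 4.5]
print(std)   # [1.5 1.5 1.5]
```

Raising early trades a little convenience for clearer error messages. If nan results are acceptable in your pipeline, the empty-array check can be dropped.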
Key Takeaways
- Use numpy.mean() and numpy.std() for fast computation of mean and standard deviation.
- Specify the axis parameter to compute statistics along specific dimensions of arrays.
- Use ddof=1 in numpy.std() for sample standard deviation instead of population std.
- Weighted averages can be computed with numpy.average() by passing weights.
- Always validate input arrays to be numeric and non-empty to avoid runtime errors.