13 Statistics Solutions
title: "Statistics" date: 2026-05-24T13:51:18Z
Statistics
__author__ = "kyubyong. kbpark.linguist@gmail.com"
import numpy as np
np.__version__
'1.11.3'
Order statistics
Q1. Return the minimum value of x along the second axis.
x = np.arange(4).reshape((2, 2))
print("x=\n", x)
print("ans=\n", np.amin(x, axis=1))
x=
[[0 1]
[2 3]]
ans=
[0 2]
Q2. Return the maximum value of x along the second axis, keeping the reduced axis in the result as a dimension of size one.
x = np.arange(4).reshape((2, 2))
print("x=\n", x)
print("ans=\n", np.amax(x, axis=1, keepdims=True))
x=
[[0 1]
[2 3]]
ans=
[[1]
[3]]
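One reason to keep the reduced axis is broadcasting: the result still lines up with x row by row. A minimal sketch, assuming Python 3 division; the names `row_max` and `scaled` are only illustrative and not part of the original exercise:

```python
import numpy as np

x = np.arange(4).reshape((2, 2))

# With keepdims=True the row maxima have shape (2, 1), so they broadcast
# against x one row at a time.
row_max = np.amax(x, axis=1, keepdims=True)

# Divide every row by its own maximum (illustrative use of the kept axis).
scaled = x / row_max

print(row_max.shape)
print(scaled)
```

Without `keepdims=True` the maxima would have shape (2,) and would broadcast along the last axis instead, which divides columns rather than rows.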
Q3. Calculate the difference between the maximum and the minimum of x along the second axis.
x = np.arange(10).reshape((2, 5))
print("x=\n", x)

out1 = np.ptp(x, axis=1)
out2 = np.amax(x, axis=1) - np.amin(x, axis=1)
assert np.allclose(out1, out2)
print("ans=\n", out1)
x=
[[0 1 2 3 4]
[5 6 7 8 9]]
ans=
[4 4]
Q4. Compute the 75th percentile of x along the second axis.
x = np.arange(1, 11).reshape((2, 5))
print("x=\n", x)

print("ans=\n", np.percentile(x, 75, axis=1))
x=
[[ 1 2 3 4 5]
[ 6 7 8 9 10]]
ans=
[ 4. 9.]
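By default np.percentile interpolates linearly between neighbouring order statistics. The sketch below reproduces that rule by hand for the same x; the helper `percentile_linear` is a hypothetical name introduced here for illustration, not a NumPy function:

```python
import numpy as np

x = np.arange(1, 11).reshape((2, 5))

def percentile_linear(row, q):
    """q-th percentile of a 1-D array via linear interpolation (illustrative helper)."""
    s = np.sort(row)
    pos = (len(s) - 1) * q / 100.0   # fractional position in the sorted data
    lo = int(np.floor(pos))          # index of the order statistic just below
    hi = min(lo + 1, len(s) - 1)     # index of the order statistic just above
    frac = pos - lo                  # how far to move from s[lo] toward s[hi]
    return s[lo] + (s[hi] - s[lo]) * frac

manual = np.array([percentile_linear(row, 75) for row in x])
assert np.allclose(manual, np.percentile(x, 75, axis=1))
print(manual)
```

For a row of five values, the 75th percentile sits at position 3.0 in the sorted row, so it coincides with the fourth value and no interpolation is needed.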
Averages and variances
Q5. Compute the median of flattened x.
x = np.arange(1, 10).reshape((3, 3))
print("x=\n", x)

print("ans=\n", np.median(x))
x=
[[1 2 3]
[4 5 6]
[7 8 9]]
ans=
5.0
Q6. Compute the weighted average of x.
x = np.arange(5)
weights = np.arange(1, 6)

out1 = np.average(x, weights=weights)
out2 = (x*(weights/weights.sum())).sum()
assert np.allclose(out1, out2)
print(out1)
2.66666666667
Q7. Compute the mean, standard deviation, and variance of x.
x = np.arange(5)
print("x=\n",x)

out1 = np.mean(x)
out2 = np.average(x)
assert np.allclose(out1, out2)
print("mean=\n", out1)

out3 = np.std(x)
out4 = np.sqrt(np.mean((x - np.mean(x)) ** 2 ))
assert np.allclose(out3, out4)
print("std=\n", out3)

out5 = np.var(x)
out6 = np.mean((x - np.mean(x)) ** 2 )
assert np.allclose(out5, out6)
print("variance=\n", out5)
x=
[0 1 2 3 4]
mean=
2.0
std=
1.41421356237
variance=
2.0
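np.std and np.var divide by N by default, which gives the population values shown above. When x is treated as a sample, the `ddof=1` argument switches the divisor to N - 1. A short sketch for comparison; the variable names are illustrative:

```python
import numpy as np

x = np.arange(5)

pop_var = np.var(x)             # divides by N (the default, ddof=0)
sample_var = np.var(x, ddof=1)  # divides by N - 1

# The same sample variance written out by hand.
manual = ((x - x.mean()) ** 2).sum() / (x.size - 1)
assert np.isclose(sample_var, manual)

print(pop_var, sample_var)
```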
Correlating
Q8. Compute the covariance matrix of x and y.
x = np.array([0, 1, 2])
y = np.array([2, 1, 0])

print("ans=\n", np.cov(x, y))
ans=
[[ 1. -1.]
[-1. 1.]]
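The matrix can be rebuilt by hand, which makes each entry easier to read: the diagonal holds the variances of x and y, and the off-diagonal entries hold their covariance. This sketch assumes np.cov's default normalisation by N - 1; `data`, `centered`, and `manual` are illustrative names:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([2, 1, 0])

# Stack the variables as rows, then subtract each row's mean.
data = np.stack([x, y])
centered = data - data.mean(axis=1, keepdims=True)

# Sum of products of deviations, divided by N - 1 (np.cov's default).
manual = centered @ centered.T / (x.size - 1)

assert np.allclose(manual, np.cov(x, y))
print(manual)
```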
Q9. In the above covariance matrix, what does the -1 mean?
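The negative sign means x and y vary in opposite directions: as x increases, y decreases. Because both variances here equal 1, the covariance of -1 is also the correlation coefficient, so the negative linear relationship is perfect.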
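The size of a covariance depends on the units of the data, while the correlation coefficient does not. A small sketch of that difference; `y_scaled` is a hypothetical rescaling of y introduced only for this comparison:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([2, 1, 0])
y_scaled = 10 * y  # same relationship, different units (illustrative)

cov_xy = np.cov(x, y)[0, 1]
cov_scaled = np.cov(x, y_scaled)[0, 1]

corr_xy = np.corrcoef(x, y)[0, 1]
corr_scaled = np.corrcoef(x, y_scaled)[0, 1]

print(cov_xy, cov_scaled)    # covariance grows with the scale of y
print(corr_xy, corr_scaled)  # correlation is unchanged
```

To judge the strength of a linear relationship, read the correlation coefficient rather than the raw covariance.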
Q10. Compute the Pearson product-moment correlation coefficients of x and y.
x = np.array([0, 1, 3])
y = np.array([2, 4, 5])

print("ans=\n", np.corrcoef(x, y))
ans=
[[ 1. 0.92857143]
[ 0.92857143 1. ]]
Q11. Compute the cross-correlation of x and y.
x = np.array([0, 1, 3])
y = np.array([2, 4, 5])

print("ans=\n", np.correlate(x, y))
ans=
[19]
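The single value comes from np.correlate's default `mode='valid'`, which for equal-length inputs keeps only the zero-lag term, that is, the dot product of x and y. The sketch below also requests every lag with `mode='full'` and checks it against a convolution with y reversed, which is equivalent for real-valued inputs:

```python
import numpy as np

x = np.array([0, 1, 3])
y = np.array([2, 4, 5])

zero_lag = np.correlate(x, y)               # default mode='valid'
all_lags = np.correlate(x, y, mode="full")  # one value per relative shift

# For real inputs, correlation equals convolution with the second input reversed.
assert np.array_equal(all_lags, np.convolve(x, y[::-1]))
assert zero_lag[0] == np.dot(x, y)

print(all_lags)
```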
Histograms
Q12. Compute the histogram of x against the bins.
x = np.array([0.5, 0.7, 1.0, 1.2, 1.3, 2.1])
bins = np.array([0, 1, 2, 3])
print("ans=\n", np.histogram(x, bins))

import matplotlib.pyplot as plt
%matplotlib inline
plt.hist(x, bins=bins)
plt.show()
ans=
(array([2, 3, 1], dtype=int64), array([0, 1, 2, 3]))
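Each bin in np.histogram is half-open, [left, right), except the last one, which also includes its right edge. That is why 1.0 is counted in the second bin. A sketch that recounts the same data with boolean masks; `counts` is an illustrative name:

```python
import numpy as np

x = np.array([0.5, 0.7, 1.0, 1.2, 1.3, 2.1])
bins = np.array([0, 1, 2, 3])

counts = []
for i in range(len(bins) - 1):
    left, right = bins[i], bins[i + 1]
    is_last = i == len(bins) - 2
    # Every bin is [left, right); only the last bin is closed: [left, right].
    upper = (x <= right) if is_last else (x < right)
    counts.append(int(((x >= left) & upper).sum()))

hist, _ = np.histogram(x, bins)
assert counts == hist.tolist()
print(counts)
```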
Q13. Compute the 2d histogram of x and y.
xedges = [0, 1, 2, 3]
yedges = [0, 1, 2, 3, 4]
x = np.array([0, 0.1, 0.2, 1., 1.1, 2., 2.1])
y = np.array([0, 0.1, 0.2, 1., 1.1, 2., 3.3])
H, xedges, yedges = np.histogram2d(x, y, bins=(xedges, yedges))
print("ans=\n", H)

plt.scatter(x, y)
plt.grid()
ans=
[[ 3. 0. 0. 0.]
[ 0. 2. 0. 0.]
[ 0. 0. 1. 1.]]
Q14. Count the number of occurrences of each value from 0 through 7 in x.
x = np.array([0, 1, 1, 3, 2, 1, 7])
print("ans=\n", np.bincount(x))
ans=
[1 3 1 1 0 0 0 1]
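The result has eight entries only because the largest value in x is 7: np.bincount sizes its output as max(x) + 1. When a fixed number of bins is needed, the `minlength` argument pads the result with zeros. A short sketch, with np.unique shown as an alternative that skips the empty bins:

```python
import numpy as np

x = np.array([0, 1, 1, 3, 2, 1, 7])

# Force at least 10 bins, regardless of the largest value in x.
padded = np.bincount(x, minlength=10)

# np.unique reports only the values that actually occur.
values, counts = np.unique(x, return_counts=True)

print(padded)
print(values, counts)
```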
Q15. Return the indices of the bins to which each value in x belongs.
x = np.array([0.2, 6.4, 3.0, 1.6])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])

print("ans=\n", np.digitize(x, bins))
ans=
[1 4 3 2]
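An index i here means bins[i-1] <= x < bins[i]. For increasing bins this matches np.searchsorted with `side='right'`, and the `right=True` option moves values that fall exactly on an edge into the lower bin. A sketch of both points; `on_edge` is a hypothetical input chosen to sit on a bin edge:

```python
import numpy as np

x = np.array([0.2, 6.4, 3.0, 1.6])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])

idx = np.digitize(x, bins)
# Same indices via a binary search on the bin edges.
assert np.array_equal(idx, np.searchsorted(bins, x, side="right"))

# A value exactly on an edge (illustrative): 2.5 is one of the bin edges.
on_edge = np.array([2.5])
default_idx = np.digitize(on_edge, bins)           # bins[i-1] <= x < bins[i]
right_idx = np.digitize(on_edge, bins, right=True)  # bins[i-1] < x <= bins[i]

print(idx, default_idx, right_idx)
```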