13 Statistics Solutions


title: "Statistics" date: 2026-05-24T13:51:18Z

Statistics

__author__ = "kyubyong. kbpark.linguist@gmail.com"
import numpy as np
np.__version__
'1.11.3'

Order statistics

Q1. Return the minimum value of x along the second axis.

x = np.arange(4).reshape((2, 2))
print("x=\n", x)
print("ans=\n", np.amin(x, 1))
x=
 [[0 1]
 [2 3]]
ans=
 [0 2]

Q2. Return the maximum value of x along the second axis. Keep the reduced axis in the result as a dimension of size one.

x = np.arange(4).reshape((2, 2))
print("x=\n", x)
print("ans=\n", np.amax(x, 1, keepdims=True))
x=
 [[0 1]
 [2 3]]
ans=
 [[1]
 [3]]
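
One reason to keep the reduced axis is that the result then broadcasts against the original array. A minimal sketch, assuming we want to scale each row by its own maximum; the array values and the names `row_max` and `scaled` are illustrative and not part of the original exercise:

```python
import numpy as np

x = np.arange(1, 5).reshape((2, 2))       # [[1 2], [3 4]]
row_max = np.amax(x, 1, keepdims=True)    # shape (2, 1) instead of (2,)
scaled = x / row_max                      # each row divided by its own maximum
print(scaled)
```

Without `keepdims=True` the maxima would have shape `(2,)` and would broadcast across columns rather than rows.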

Q3. Calculate the difference between the maximum and the minimum of x along the second axis.

x = np.arange(10).reshape((2, 5))
print("x=\n", x)

out1 = np.ptp(x, 1)
out2 = np.amax(x, 1) - np.amin(x, 1)
assert np.allclose(out1, out2)
print("ans=\n", out1)
x=
 [[0 1 2 3 4]
 [5 6 7 8 9]]
ans=
 [4 4]

Q4. Compute the 75th percentile of x along the second axis.

x = np.arange(1, 11).reshape((2, 5))
print("x=\n", x)

print("ans=\n", np.percentile(x, 75, 1))
x=
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
ans=
 [ 4.  9.]
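
By default `np.percentile` interpolates linearly between neighbouring sorted values. The sketch below reproduces that by hand for each row; `linear_percentile` is a hypothetical helper written for illustration, not a NumPy function:

```python
import numpy as np

def linear_percentile(row, q):
    """Hypothetical helper: q-th percentile of a 1-D array by linear interpolation."""
    s = np.sort(row)
    pos = q / 100 * (len(s) - 1)       # fractional index into the sorted data
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

x = np.arange(1, 11).reshape((2, 5))
manual = np.array([linear_percentile(row, 75) for row in x])
assert np.allclose(manual, np.percentile(x, 75, 1))
print(manual)
```

For a row of five values the 75th percentile falls at fractional index 0.75 * 4 = 3, which is exactly the fourth sorted value, so no interpolation is needed here.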

Averages and variances

Q5. Compute the median of flattened x.

x = np.arange(1, 10).reshape((3, 3))
print("x=\n", x)

print("ans=\n", np.median(x))
x=
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
ans=
 5.0

Q6. Compute the weighted average of x.

x = np.arange(5)
weights = np.arange(1, 6)

out1 = np.average(x, weights=weights)
out2 = (x*(weights/weights.sum())).sum()
assert np.allclose(out1, out2)
print(out1)
2.66666666667

Q7. Compute the mean, standard deviation, and variance of x.

x = np.arange(5)
print("x=\n",x)

out1 = np.mean(x)
out2 = np.average(x)
assert np.allclose(out1, out2)
print("mean=\n", out1)

out3 = np.std(x)
out4 = np.sqrt(np.mean((x - np.mean(x)) ** 2 ))
assert np.allclose(out3, out4)
print("std=\n", out3)

out5 = np.var(x)
out6 = np.mean((x - np.mean(x)) ** 2 )
assert np.allclose(out5, out6)
print("variance=\n", out5)
x=
 [0 1 2 3 4]
mean=
 2.0
std=
 1.41421356237
variance=
 2.0
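
`np.std` and `np.var` divide by N by default (the population form). A short sketch of the sample form, which divides by N - 1 through the `ddof` parameter; the names `pop_var` and `sample_var` are illustrative:

```python
import numpy as np

x = np.arange(5)
pop_var = np.var(x)               # divides by N
sample_var = np.var(x, ddof=1)    # divides by N - 1
n = x.size
assert np.isclose(sample_var, pop_var * n / (n - 1))
print(pop_var, sample_var)
```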

Correlating

Q8. Compute the covariance matrix of x and y.

x = np.array([0, 1, 2])
y = np.array([2, 1, 0])

print("ans=\n", np.cov(x, y))
ans=
 [[ 1. -1.]
 [-1.  1.]]

Q9. In the above covariance matrix, what does the -1 mean?

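The off-diagonal entry -1 is the covariance of x and y. Its negative sign means y decreases as x increases. Because both variances (the diagonal entries) equal 1, the correlation coefficient is -1 / (1 * 1) = -1, so x and y are perfectly negatively correlated.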
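
A covariance only indicates perfect correlation once it is divided by the two standard deviations. A minimal check, assuming the same x and y as in Q8; the names `c` and `r` are illustrative:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([2, 1, 0])
c = np.cov(x, y)
# Normalise the covariance by the product of the standard deviations.
r = c[0, 1] / np.sqrt(c[0, 0] * c[1, 1])
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
print(r)
```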

Q10. Compute Pearson product-moment correlation coefficients of x and y.

x = np.array([0, 1, 3])
y = np.array([2, 4, 5])

print("ans=\n", np.corrcoef(x, y))
ans=
 [[ 1.          0.92857143]
 [ 0.92857143  1.        ]]

Q11. Compute cross-correlation of x and y.

x = np.array([0, 1, 3])
y = np.array([2, 4, 5])

print("ans=\n", np.correlate(x, y))
ans=
 [19]
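
The single value appears because `np.correlate` defaults to `mode="valid"`, which for equal-length inputs keeps only the zero-lag term, that is, the dot product. A small sketch comparing it with `mode="full"`; the names `valid` and `full` are illustrative:

```python
import numpy as np

x = np.array([0, 1, 3])
y = np.array([2, 4, 5])
valid = np.correlate(x, y)                # default mode="valid": zero lag only
full = np.correlate(x, y, mode="full")    # every lag with any overlap
assert valid[0] == np.dot(x, y)
assert full[len(y) - 1] == valid[0]       # zero lag sits in the middle
print(full)
```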

Histograms

Q12. Compute the histogram of x against the bins.

x = np.array([0.5, 0.7, 1.0, 1.2, 1.3, 2.1])
bins = np.array([0, 1, 2, 3])
print("ans=\n", np.histogram(x, bins))

import matplotlib.pyplot as plt
# IPython/Jupyter magic; drop this line when running as a plain script
%matplotlib inline
plt.hist(x, bins=bins)
plt.show()
ans=
 (array([2, 3, 1], dtype=int64), array([0, 1, 2, 3]))
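
Note that 1.0 is counted in the second bin: every bin is half-open, `[left, right)`, except the last, which also includes its right edge. A small sketch with values placed exactly on bin edges; `edge_values` is an illustrative name:

```python
import numpy as np

bins = np.array([0, 1, 2, 3])
edge_values = np.array([1.0, 3.0])    # both sit exactly on a bin edge
counts, _ = np.histogram(edge_values, bins)
# 1.0 falls in [1, 2); 3.0 falls in the closed last bin [2, 3].
print(counts)
```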


Q13. Compute the 2d histogram of x and y.

xedges = [0, 1, 2, 3]
yedges = [0, 1, 2, 3, 4]
x = np.array([0, 0.1, 0.2, 1., 1.1, 2., 2.1])
y = np.array([0, 0.1, 0.2, 1., 1.1, 2., 3.3])
H, xedges, yedges = np.histogram2d(x, y, bins=(xedges, yedges))
print("ans=\n", H)

plt.scatter(x, y)
plt.grid()
ans=
 [[ 3.  0.  0.  0.]
 [ 0.  2.  0.  0.]
 [ 0.  0.  1.  1.]]


Q14. Count the number of occurrences of each integer from 0 through 7 in x.

x = np.array([0, 1, 1, 3, 2, 1, 7])
print("ans=\n", np.bincount(x))
ans=
 [1 3 1 1 0 0 0 1]
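
`np.bincount` reports a count for every integer up to the maximum, zeros included. A sketch of two related options, assuming the same x: `np.unique` with `return_counts=True` lists only the values that occur, and `minlength` pads the result to a fixed size (the value 10 is an arbitrary choice for illustration):

```python
import numpy as np

x = np.array([0, 1, 1, 3, 2, 1, 7])
values, counts = np.unique(x, return_counts=True)   # only values present in x
padded = np.bincount(x, minlength=10)               # at least 10 slots
assert np.array_equal(np.bincount(x)[values], counts)
print(values, counts)
print(padded)
```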

Q15. Return the indices of the bins to which each value in x belongs.

x = np.array([0.2, 6.4, 3.0, 1.6])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])

print("ans=\n", np.digitize(x, bins))
ans=
 [1 4 3 2]
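
With increasing bins and the default `right=False`, each returned index `i` satisfies `bins[i-1] <= x < bins[i]`. The sketch below checks that property and the equivalence with `np.searchsorted`, assuming the same x and bins; `idx` is an illustrative name:

```python
import numpy as np

x = np.array([0.2, 6.4, 3.0, 1.6])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
idx = np.digitize(x, bins)
# Every value lies in the half-open interval named by its index.
assert np.all((bins[idx - 1] <= x) & (x < bins[idx]))
# For increasing bins this matches a right-sided sorted search.
assert np.array_equal(idx, np.searchsorted(bins, x, side="right"))
print(idx)
```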