# NumPy Statistical Functions | Statistical Operations in NumPy

Statistics involves gathering data, analyzing it, and drawing conclusions based on the information collected.

NumPy provides us with various statistical functions that can perform statistical data analysis.

## Common NumPy Statistical Functions

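Here are some of the statistical functions provided by NumPy:

• `np.median()` - returns the median of the array elements
• `np.mean()` - returns the arithmetic mean of the array elements
• `np.std()` - returns the standard deviation of the array elements
• `np.percentile()` - returns the nth percentile of the array elements
• `np.min()` - returns the minimum element of the array
• `np.max()` - returns the maximum element of the array

Next, we will see examples using these functions.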

## Find Median Using NumPy

The median of a NumPy array is the middle value of its elements once they are sorted.

In other words, it is the value that separates the higher half from the lower half of the data.

Suppose we have the following list of numbers:

``1, 5, 7, 8, 9, 12, 14``

Then the median is simply the middle number, which in this case is 8.

It is important to note that if the number of elements is

• Odd, the median is the middle element.
• Even, the median is the average of the two middle elements.
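The two rules above can be sketched in plain Python. This is only an illustration: `manual_median` is a hypothetical helper name, not a NumPy function, and the second list is just a small even-length sample.

```python
def manual_median(values):
    """Hypothetical helper: median by sorting and inspecting the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd count: the single middle element
        return ordered[mid]
    # even count: average of the two middle elements
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = manual_median([1, 5, 7, 8, 9, 12, 14])
even_result = manual_median([1, 2, 3, 4, 5, 7])

print(odd_result)   # 8
print(even_result)  # 3.5
```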

Now, we will learn how to calculate the median using NumPy for arrays with odd and even numbers of elements.

## Example 1: Compute Median for Odd Number of Elements

``````
import numpy as np

# create a 1D array with 5 elements
array1 = np.array([1, 2, 3, 4, 5])

# calculate the median
median = np.median(array1)

print(median)

# Output: 3.0
``````


In the above example, the array named `array1` contains an odd number of elements (5 elements).

So, `np.median(array1)` returns the median of `array1` as 3.0, which is the middle value of the sorted array.

## Example 2: Compute Median for Even Number of Elements

``````
import numpy as np

# create a 1D array with 6 elements
array1 = np.array([1, 2, 3, 4, 5, 7])

# calculate the median
median = np.median(array1)
print(median)

# Output: 3.5
``````


Here, since `array1` has an even number of elements (6 elements), the median is calculated as the average of the two middle elements (3 and 4), i.e. 3.5.

## Median of NumPy 2D Array

The median is not limited to 1D arrays. We can also calculate the median of a 2D array.

In a 2D array, the median can be calculated along the horizontal or the vertical axis individually, or across the entire array.

When computing the median of a 2D array, we use the `axis` parameter inside `np.median()` to specify the axis along which to compute the median.

If we specify,

• `axis = 0`, the median is calculated along the vertical axis (one median per column)
• `axis = 1`, the median is calculated along the horizontal axis (one median per row)

If we don't use the `axis` parameter, the median is computed over the entire array.

### Example: Compute the median of a 2D array

``````
import numpy as np

# create a 2D array
array1 = np.array([[2, 4, 6],
                   [8, 10, 12],
                   [14, 16, 18]])

# compute median along horizontal axis
result1 = np.median(array1, axis=1)

print("Median along horizontal axis :", result1)

# compute median along vertical axis
result2 = np.median(array1, axis=0)

print("Median along vertical axis:", result2)

# compute median of entire array
result3 = np.median(array1)

print("Median of entire array:", result3)
``````


Output

``````
Median along horizontal axis : [ 4. 10. 16.]
Median along vertical axis: [ 8. 10. 12.]
Median of entire array: 10.0
``````

In this example, we have created a 2D array named array1.

We then computed the median along the horizontal and vertical axis individually and then computed the median of the entire array.

• `np.median(array1, axis=1)` - median along horizontal axis, which gives `[4. 10. 16.]`
• `np.median(array1, axis=0)` - median along vertical axis, which gives `[8. 10. 12.]`
• `np.median(array1)` - median over the entire array, which gives `10.0`

To calculate the median over the entire 2D array, NumPy works on the flattened array `[2, 4, 6, 8, 10, 12, 14, 16, 18]` (already in sorted order here) and takes its middle value, which in our case is 10.
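As a quick check of the flattening description, the sketch below compares the median of the explicitly flattened array with the median of the 2D array. The variable names `flat`, `median_flat`, and `median_full` are chosen here for illustration only.

```python
import numpy as np

array1 = np.array([[2, 4, 6],
                   [8, 10, 12],
                   [14, 16, 18]])

# flatten the 2D array into a 1D array of 9 elements
flat = array1.flatten()

# median of the flattened copy vs. median of the original 2D array
median_flat = np.median(flat)
median_full = np.median(array1)

print(flat)         # [ 2  4  6  8 10 12 14 16 18]
print(median_flat)  # 10.0
print(median_full)  # 10.0
```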

## Compute Mean Using NumPy

The mean value of a NumPy array is the average value of all the elements in the array.

It is calculated by adding all elements in the array and then dividing the result by the total number of elements in the array.

We use the `np.mean()` function to calculate the mean value. For example,

``````
import numpy as np

# create a numpy array
marks = np.array([76, 78, 81, 66, 85])

# compute the mean of marks
mean_marks = np.mean(marks)

print(mean_marks)

# Output: 77.2
``````


In this example, the mean value is 77.2, which is calculated by adding the elements (76, 78, 81, 66, 85) and dividing the result by 5 (total number of array elements).
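The same arithmetic can be written out by hand as a sanity check. This is a minimal sketch; `manual_mean` is just an illustrative variable name.

```python
import numpy as np

marks = np.array([76, 78, 81, 66, 85])

# add all the elements, then divide by the number of elements
total = marks.sum()
count = marks.size
manual_mean = total / count

print(total)        # 386
print(count)        # 5
print(manual_mean)  # 77.2
```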

## Example 3: Mean of NumPy N-d Array

``````
import numpy as np

# create a 2D array
array1 = np.array([[1, 3],
                   [5, 7]])

# calculate the mean of the entire array
result1 = np.mean(array1)
print("Entire Array:", result1)  # 4.0

# calculate the mean along vertical axis (axis=0)
result2 = np.mean(array1, axis=0)
print("Along Vertical Axis:", result2)  # [3. 5.]

# calculate the mean along horizontal axis (axis=1)
result3 = np.mean(array1, axis=1)
print("Along Horizontal Axis :", result3)  # [2. 6.]
``````


Output

``````
Entire Array: 4.0
Along Vertical Axis: [3. 5.]
Along Horizontal Axis : [2. 6.]
``````

Here, first we have created the 2D array named array1. We then calculated the mean using `np.mean()`.

• `np.mean(array1)` - calculates the mean over the entire array
• `np.mean(array1, axis=0)` - calculates the mean along vertical axis
• `np.mean(array1, axis=1)` - calculates the mean along horizontal axis
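To make the axis behaviour concrete, the following sketch recomputes the per-column and per-row means with explicit slicing. The names `col_means` and `row_means` are illustrative; in practice the `axis` argument does this work.

```python
import numpy as np

array1 = np.array([[1, 3],
                   [5, 7]])

# axis=0 (vertical): one mean per column
col_means = [float(array1[:, j].mean()) for j in range(array1.shape[1])]

# axis=1 (horizontal): one mean per row
row_means = [float(array1[i, :].mean()) for i in range(array1.shape[0])]

print(col_means)  # [3.0, 5.0]
print(row_means)  # [2.0, 6.0]
```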

## Standard Deviation of NumPy Array

The standard deviation is a measure of the spread of the data in the array. It gives us the degree to which the data points in an array deviate from the mean.

• A smaller standard deviation indicates that the data points are closer to the mean.
• A larger standard deviation indicates that the data points are more spread out.

In NumPy, we use the `np.std()` function to calculate the standard deviation of an array.

### Example: Compute the Standard Deviation in NumPy

``````
import numpy as np

# create a numpy array
marks = np.array([76, 78, 81, 66, 85])

# compute the standard deviation of marks
std_marks = np.std(marks)
print(std_marks)

# Output: approximately 6.3687
``````

In the above example, we have used the `np.std()` function to calculate the standard deviation of the `marks` array.

Here, the result (approximately `6.3687`) is the standard deviation of `marks`. It tells us how much the values in the `marks` array deviate from the mean value of the array.
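For readers who want to see where the number comes from, here is a minimal sketch of the calculation done step by step. By default `np.std()` computes the population standard deviation (it divides by the number of elements); passing `ddof=1` gives the sample standard deviation instead. The intermediate variable names are illustrative.

```python
import math

import numpy as np

marks = np.array([76, 78, 81, 66, 85])

# step 1: mean of the data
mean_marks = marks.mean()  # 77.2

# step 2: squared deviation of each value from the mean
squared_deviations = (marks - mean_marks) ** 2

# step 3: square root of the average squared deviation
population_std = math.sqrt(squared_deviations.mean())

# sample standard deviation divides by (n - 1) instead of n
sample_std = np.std(marks, ddof=1)

print(population_std)
print(sample_std)
```

The first printed value should match `np.std(marks)`, and the sample standard deviation is slightly larger because it divides by a smaller number.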

## Standard Deviation of NumPy 2D Array

In a 2D array, standard deviation can be calculated either along the horizontal or the vertical axis individually, or across the entire array.

Similar to mean and median, when computing the standard deviation of a 2D array, we use the `axis` parameter inside `np.std()` to specify the axis along which to compute the standard deviation.

### Example: Compute the Standard Deviation of a 2D Array

``````
import numpy as np

# create a 2D array
array1 = np.array([[2, 5, 9],
                   [3, 8, 11],
                   [4, 6, 7]])

# compute standard deviation along horizontal axis
result1 = np.std(array1, axis=1)
print("Standard deviation along horizontal axis:", result1)

# compute standard deviation along vertical axis
result2 = np.std(array1, axis=0)
print("Standard deviation along vertical axis:", result2)

# compute standard deviation of entire array
result3 = np.std(array1)
print("Standard deviation of entire array:", result3)
``````


Output

``````
Standard deviation along horizontal axis: [2.86744176 3.29983165 1.24721913]
Standard deviation along vertical axis: [0.81649658 1.24721913 1.63299316]
Standard deviation of entire array: 2.7666443551086073
``````

Here, we have created a 2D array named array1.

We then computed the standard deviation along horizontal and vertical axis individually and then computed the standard deviation of the entire array.

## Compute Percentile of NumPy Array

In NumPy, we use the `percentile()` function to compute the nth percentile of a given array.

Let's see an example.

``````
import numpy as np

# create an array
array1 = np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

# compute the 25th percentile of the array
result1 = np.percentile(array1, 25)
print("25th percentile:", result1)

# compute the 75th percentile of the array
result2 = np.percentile(array1, 75)
print("75th percentile:", result2)
``````


Output

``````
25th percentile: 5.5
75th percentile: 14.5
``````

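Here,

• roughly 25% of the values in array1 lie at or below 5.5.
• roughly 75% of the values in array1 lie at or below 14.5.

Neither 5.5 nor 14.5 appears in array1. By default, `np.percentile()` interpolates linearly between the two nearest values.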
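The sketch below shows one way to reproduce these results by hand, assuming NumPy's default linear interpolation. `linear_percentile` is a hypothetical helper written for this illustration, not part of NumPy, and it expects already-sorted input.

```python
import numpy as np

array1 = np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19])


def linear_percentile(sorted_values, q):
    """Hypothetical helper: q-th percentile of sorted data by linear interpolation."""
    # fractional index into the sorted data
    position = (q / 100) * (len(sorted_values) - 1)
    lower = int(np.floor(position))
    upper = int(np.ceil(position))
    fraction = position - lower
    # move `fraction` of the way from the lower neighbour to the upper one
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


p25 = linear_percentile(array1, 25)  # position 2.25, between 5 and 7
p75 = linear_percentile(array1, 75)  # position 6.75, between 13 and 15
p50 = np.percentile(array1, 50)      # the 50th percentile is the median

print(p25)  # 5.5
print(p75)  # 14.5
print(p50)  # 10.0
```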

## Find Minimum and Maximum Value of NumPy Array

We use the `min()` and `max()` functions in NumPy to find the minimum and maximum values in a given array.

Let's see an example.

``````
import numpy as np

# create an array
array1 = np.array([2, 6, 9, 15, 17, 22, 65, 1, 62])

# find the minimum value of the array
min_val = np.min(array1)

# find the maximum value of the array
max_val = np.max(array1)

# print the results
print("Minimum value:", min_val)
print("Maximum value:", max_val)
``````


Output

``````
Minimum value: 1
Maximum value: 65
``````

As we can see, `min()` and `max()` return the minimum and maximum values of array1, which are 1 and 65 respectively.
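Like the other functions in this tutorial, `np.min()` and `np.max()` also accept an `axis` argument. The sketch below reshapes the same nine values into a 3x3 array (`array2` is a name introduced here for illustration) to show the per-column and per-row results.

```python
import numpy as np

# the same nine values arranged as a 2D array
array2 = np.array([[2, 6, 9],
                   [15, 17, 22],
                   [65, 1, 62]])

# minimum of each column (vertical axis)
col_min = np.min(array2, axis=0)

# maximum of each row (horizontal axis)
row_max = np.max(array2, axis=1)

print(col_min)  # [2 1 9]
print(row_max)  # [ 9 22 65]
```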

Note: To learn more about `min()` and `max()`, visit NumPy min() and NumPy max().
