Calculation of Mean Intersection over Union

Date

2022/11/11 05:56

What is Mean Intersection over Union (mIoU)?

•

mIoU is one of the representative performance metrics for semantic segmentation.

•

It is calculated by computing the intersection over union for each segmented class and then taking the average across all classes.

•

IoU is defined as follows:

\mathrm{IoU} = \frac{TP}{TP + FP + FN}

•

Here, TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.

•

In the figure, the green area represents the Ground Truth and the red area represents the Prediction; the regions are labeled TP, FP, and FN.
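As a minimal sketch of the definition above, the IoU of a single class can be computed directly from two boolean masks. The helper name `binary_iou` and the 3×3 masks are illustrative assumptions, not part of the original post.

```python
import torch

def binary_iou(gt_mask: torch.Tensor, pred_mask: torch.Tensor) -> float:
    """IoU of one class from boolean masks (hypothetical helper for illustration)."""
    tp = (gt_mask & pred_mask).sum().item()    # in the class and predicted as the class
    fp = (~gt_mask & pred_mask).sum().item()   # predicted as the class, but not in it
    fn = (gt_mask & ~pred_mask).sum().item()   # in the class, but missed by the prediction
    denom = tp + fp + fn
    # If the class appears in neither mask, the IoU is undefined.
    return tp / denom if denom > 0 else float("nan")

# Made-up masks: the prediction is the ground truth shifted one column to the right.
gt_mask = torch.tensor([[1, 1, 0],
                        [1, 1, 0],
                        [0, 0, 0]], dtype=torch.bool)
pred_mask = torch.tensor([[0, 1, 1],
                          [0, 1, 1],
                          [0, 0, 0]], dtype=torch.bool)

iou = binary_iou(gt_mask, pred_mask)
print(iou)  # TP = 2, FP = 2, FN = 2, so IoU = 2/6
```

The rest of the post obtains the same three quantities for every class at once through a confusion matrix.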

Calculating mIoU

Ground Truth and Prediction

•

Evaluating segmentation generally involves two label maps, the ground truth (GT) and the prediction, as shown in the figure below.

•

Both GT and Prediction store an integer class index for each pixel.

•

In the example below, the number of classes C is 5, with class indices ranging from 0 to 4.

•

If implemented as a PyTorch Tensor, it would look like this:

import torch

gt = torch.tensor([
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
])

pred = torch.tensor([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 2, 2, 2],
    [0, 1, 2, 3, 3],
    [0, 1, 2, 3, 4],
])

Creating the Category Matrix

•

A Category Matrix is computed from GT and Prediction; it maps each (GT, Prediction) pair to a single integer between 0 and C^2 - 1. (For convenience, it is referred to as Category below.)

•

The formula to obtain the Category is as follows:

\mathrm{category} = \mathrm{gt} \times C + \mathrm{prediction}

•

With this formula, Category values from 0 to 4 correspond to Prediction values 0 to 4 when GT is 0.

•

Similarly, values from 5 to 9 correspond to Prediction values 0 to 4 when GT is 1.

•

The code is as follows:

num_classes = 5  # C: total number of classes

# Encode each (GT, Prediction) pair as one integer in [0, C^2 - 1].
category = gt * num_classes + pred
print(category)

•

The result of the computation is:

tensor([[ 0,  5, 10, 15, 20],
        [ 0,  6, 11, 16, 21],
        [ 0,  6, 12, 17, 22],
        [ 0,  6, 12, 18, 23],
        [ 0,  6, 12, 18, 24]])
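To make the encoding concrete, the following sketch round-trips a few Category values in plain Python. The helper names `encode` and `decode` are assumptions introduced here for illustration; the only thing taken from the post is the formula category = gt × C + prediction.

```python
num_classes = 5  # C in the formula

def encode(gt: int, pred: int, c: int = num_classes) -> int:
    """Map a (GT, Prediction) pair to one integer in [0, c*c - 1]."""
    return gt * c + pred

def decode(category: int, c: int = num_classes) -> tuple:
    """Recover (GT, Prediction): the quotient is GT, the remainder is Prediction."""
    return divmod(category, c)

# Values at the edges of each GT block.
for value in (0, 4, 5, 9, 10, 24):
    gt_class, pred_class = decode(value)
    print(f"category {value:2d} -> GT {gt_class}, Prediction {pred_class}")
```

Because Prediction is always smaller than C, each pair maps to a unique integer, which is why counting Category values is equivalent to counting (GT, Prediction) pairs.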

Creating the Confusion Matrix

•

The Confusion Matrix can be obtained by counting how many times each value occurs in the Category.

•

For example, an element with a value of 10 in the Category indicates that GT is 2 and Prediction is 0, since 10 = 2 × 5 + 0.

•

Therefore, by counting the elements with a value of 10 in the entire Category, we obtain the number of pixels where GT was 2 and Prediction was 0.

•

The counts are computed with the bincount function, which is available in both torch and numpy.

•

Since bincount takes a 1D input, we flatten the Category first.

•

Also, we set minlength to C^2 so that the output always has C^2 bins, even if some classes do not appear in GT or Prediction.

•

Finally, we reshape the result to C × C to obtain the 2D Confusion Matrix.

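# bincount expects a 1D tensor of non-negative integers.
category_1d = category.flatten()
# minlength guarantees C^2 bins even when some (GT, Prediction) pairs never occur.
cm = torch.bincount(category_1d, minlength=num_classes ** 2)
# Rows index GT, columns index Prediction.
cm = cm.reshape(num_classes, num_classes)
print(cm)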

tensor([[5, 0, 0, 0, 0],
        [1, 4, 0, 0, 0],
        [1, 1, 3, 0, 0],
        [1, 1, 1, 2, 0],
        [1, 1, 1, 1, 1]])

•

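In the Confusion Matrix, rows correspond to GT classes and columns to Prediction classes, as laid out in the table:

•

The value at (0, 0) in the Confusion Matrix is the number of pixels where both GT and Prediction were 0; likewise, the value at (1, 0) is the number of pixels where GT was 1 and Prediction was 0.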
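The role of `minlength` is easiest to see when some classes are missing. The sketch below uses NumPy's `bincount` on a small made-up example in which only classes 0 and 1 occur out of 5; the arrays are assumptions chosen for illustration.

```python
import numpy as np

num_classes = 5

# Made-up 2x2 label maps in which classes 2, 3, and 4 never occur.
gt = np.array([[0, 0],
               [1, 1]])
pred = np.array([[0, 1],
                 [1, 1]])

category = (gt * num_classes + pred).ravel()  # values 0, 1, 6, 6

# Without minlength, the output only reaches the largest observed value,
# so it has 7 bins and cannot be reshaped to 5 x 5.
short = np.bincount(category)
print(short.shape)

# With minlength=C^2, there are always 25 bins and the reshape is safe.
cm = np.bincount(category, minlength=num_classes ** 2)
cm = cm.reshape(num_classes, num_classes)
print(cm)
```

Rows and columns for the absent classes are simply all zeros, which matters later when the IoU of such a class is computed.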

Intersection and Union

•

Intersection and Union can be easily calculated from the Confusion Matrix.

•

For example, for class 0, the union is the sum of the row with GT 0 plus the sum of the column with Prediction 0, minus the intersection, which would otherwise be counted twice.

•

The intersection is the diagonal element, i.e. the number of pixels where both GT and Prediction are 0.

•

The yellow solid line in the figure represents the Union, and the yellow colored area represents the Intersection.

•

The IoU of class 0 is 5/9 = 0.5556.

•

By following the same procedure for the other classes, the IoUs for each class are 0.5556, 0.5000, 0.4286, 0.3333, and 0.2000.

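ious = []

for i in range(num_classes):
    inter = cm[i, i]  # pixels where GT and Prediction are both class i
    # Column sum (predicted as i) + row sum (GT is i), minus the double-counted intersection.
    union = cm[:, i].sum() + cm[i, :].sum() - inter
    ious.append(inter / union)

print(ious)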

[tensor(0.5556), tensor(0.5000), tensor(0.4286), tensor(0.3333), tensor(0.2000)]
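The per-class loop can also be written without explicit iteration. The following is one possible vectorized sketch, not the post's own implementation; it hard-codes the 5×5 Confusion Matrix from the example so that it runs on its own.

```python
import torch

# Confusion Matrix from the example: rows are GT, columns are Prediction.
cm = torch.tensor([[5, 0, 0, 0, 0],
                   [1, 4, 0, 0, 0],
                   [1, 1, 3, 0, 0],
                   [1, 1, 1, 2, 0],
                   [1, 1, 1, 1, 1]])

inter = torch.diagonal(cm)                      # TP per class
# Column sums count pixels predicted as each class (TP + FP);
# row sums count pixels whose GT is each class (TP + FN).
union = cm.sum(dim=0) + cm.sum(dim=1) - inter   # TP + FP + FN per class
ious = inter.float() / union.float()

print(union)
print(ious)
```

Here the union evaluates to 9, 8, 7, 6, and 5 for classes 0 to 4, which reproduces the per-class IoUs listed above.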

mIoU

•

Finally, mIoU is the mean of the per-class IoUs.

•

A class that appears in neither GT nor Prediction has an IoU of 0/0, which evaluates to NaN, so we use nanmean instead of mean to skip such classes.

mIoU = torch.nanmean(torch.tensor(ious))
print(mIoU)

tensor(0.4035)
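Putting the steps together, the whole procedure fits in one small function. This is a hedged sketch: the function name `mean_iou` and the 2×2 test inputs are assumptions for illustration, and it assumes `gt` and `pred` are integer tensors of the same shape with values in [0, num_classes - 1].

```python
import torch

def mean_iou(gt: torch.Tensor, pred: torch.Tensor, num_classes: int):
    """Return (mIoU, per-class IoUs); hypothetical helper combining the steps above."""
    # Encode each (GT, Prediction) pair and count occurrences.
    category = (gt * num_classes + pred).flatten()
    cm = torch.bincount(category, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)   # rows: GT, columns: Prediction

    inter = torch.diagonal(cm).float()
    union = cm.sum(dim=0).float() + cm.sum(dim=1).float() - inter
    ious = inter / union                        # 0/0 gives NaN for absent classes
    return torch.nanmean(ious), ious

# Made-up example with 3 classes where class 2 never appears.
gt = torch.tensor([[0, 0],
                   [1, 1]])
pred = torch.tensor([[0, 1],
                     [1, 1]])

miou, ious = mean_iou(gt, pred, num_classes=3)
print(ious)   # the last entry is NaN because class 2 is absent
print(miou)   # mean over the two classes that do appear
```

With a plain mean the NaN would propagate and the result would be NaN; nanmean averages only 1/2 and 2/3 here. Note that a class predicted somewhere but absent from GT still gets an IoU of 0 rather than NaN, so it does lower the mean.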

References

https://gaussian37.github.io/vision-segmentation-miou/