Abstract

This paper presents a detailed summary and analysis of Shi and Malik’s paper, Normalized Cuts and Image Segmentation. Each section summarizes and analyzes the corresponding portion of the original paper.

Wertheimer’s Perceptual Grouping Theory

In the introduction, Shi and Malik note that their research builds on Wertheimer’s Perceptual Grouping Theory. This theory holds that regions of an image should be grouped by viewing the image at a higher level, in terms of objects and patterns, rather than pixel by pixel.

Choosing the Right Subset for Partitions

They note that there are two key considerations in choosing the right subsets when partitioning:

  • Bayesian View of Subsets: You can look at a picture and classify it using low-level cues (brightness, color, texture, etc.) and mid/high-level knowledge (symmetry, object models, etc.)
  • Hierarchical Partitions: Use low-level cues to form a hierarchical partition, and use mid/high-level knowledge to verify it or to select new areas for the hierarchy. Essentially, go from the big picture downwards.

New Approach to Segmentation

In higher-dimensional problems, greedy and gradient descent approaches usually fail. The new approach treats grouping as graph partitioning: the vertices are split into disjoint sets so that similarity is high within a set and low between sets. The partitions are then evaluated with a loss known as the normalized cut.
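
To make "similarity within a set" concrete, here is a minimal sketch of how a similarity (weight) matrix could be built for a tiny one-dimensional strip of pixels. The Gaussian form of the similarity, the function name `affinity_matrix`, and the parameters `sigma_i`, `sigma_x`, and `radius` are assumptions chosen for illustration; this summary does not specify them.

```python
import numpy as np

def affinity_matrix(intensity, sigma_i=0.1, sigma_x=2.0, radius=3):
    """Pairwise similarity for a 1-D strip of pixels (illustrative assumption)."""
    intensity = np.asarray(intensity, dtype=float)
    pos = np.arange(len(intensity), dtype=float)

    # Squared differences in brightness and in position for every pixel pair.
    d_int = (intensity[:, None] - intensity[None, :]) ** 2
    d_pos = (pos[:, None] - pos[None, :]) ** 2

    # Similar brightness and close position both raise the weight.
    W = np.exp(-d_int / sigma_i**2) * np.exp(-d_pos / sigma_x**2)

    # Pixels farther apart than `radius` are treated as unconnected.
    W[np.sqrt(d_pos) > radius] = 0.0
    # No self-loops (a simplification made for this sketch).
    np.fill_diagonal(W, 0.0)
    return W

# Three dark pixels followed by three bright pixels.
pixels = [0.10, 0.12, 0.11, 0.90, 0.88, 0.91]
W = affinity_matrix(pixels)
print(np.round(W, 3))
```

In this toy strip, weights between pixels of the same brightness group are large, while weights across the dark/bright boundary are close to zero, which is the structure a good partition should exploit.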

Grouping as Graph Partitioning

**Normalized Cut & Association:** Previously, the plain cut was used as the loss function; however, it often favored cutting off small sets of isolated nodes. To prevent this, the normalized cut was developed, which measures the cut as a fraction of the total edge connections from each group to all nodes in the graph. A similar method was used to normalize association. The goal of the algorithm is to minimize the disassociation between the groups while maximizing the association within the groups.
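
The difference between the two criteria can be checked on a small hand-made graph. In the original paper's notation, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), and the normalized association satisfies Nassoc(A, B) = 2 - Ncut(A, B). The sketch below is only an illustration: the six-node graph, its weights, and the helper names `cut`, `assoc`, `ncut`, and `nassoc` are made up for this example.

```python
import numpy as np

def cut(W, in_a):
    """Total weight of edges running between A and its complement."""
    in_a = np.asarray(in_a, dtype=bool)
    return W[np.ix_(in_a, ~in_a)].sum()

def assoc(W, in_a):
    """Total weight of connections from nodes in A to all nodes in the graph."""
    in_a = np.asarray(in_a, dtype=bool)
    return W[in_a, :].sum()

def ncut(W, in_a):
    in_a = np.asarray(in_a, dtype=bool)
    c = cut(W, in_a)
    return c / assoc(W, in_a) + c / assoc(W, ~in_a)

def nassoc(W, in_a):
    in_a = np.asarray(in_a, dtype=bool)
    within_a = W[np.ix_(in_a, in_a)].sum()
    within_b = W[np.ix_(~in_a, ~in_a)].sum()
    return within_a / assoc(W, in_a) + within_b / assoc(W, ~in_a)

# Toy graph: a tight triangle {0, 1, 2}, a bridge 2-3, and a chain 3-4-5
# in which node 5 hangs on by a single weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (2, 3, 0.4), (3, 4, 1.0), (4, 5, 0.3)]:
    W[i, j] = W[j, i] = w

isolate = [False, False, False, False, False, True]   # A = {5}
balanced = [True, True, True, False, False, False]    # A = {0, 1, 2}

for name, part in [("isolate node 5", isolate), ("balanced split", balanced)]:
    print(f"{name}: cut={cut(W, part):.3f}  ncut={ncut(W, part):.3f}")
```

On this graph the plain cut is smaller for the partition that isolates node 5, while the normalized cut is smaller for the balanced split, because cutting off a single weakly connected node makes the cut a large fraction of that node's total connections.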

**Computing the Optimal Partition:** Please note that, for the sake of simplicity, most of the math has been omitted from this summary. Minimizing the normalized cut exactly is NP-complete, so the problem is relaxed to allow real values, and an approximate solution is found efficiently by solving a generalized eigenvalue problem. The eigenvalue problem is as follows:

(D - W)y = λDy

The parts of the problem:

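  • D: An N x N diagonal matrix (where N = |V|) whose i-th diagonal entry is d(i), the sum of w(i, j) over all j
  • W: An N x N symmetric matrix such that W(i, j) = w(i, j)
  • y: y = (1 + x) - b(1 - x), where x is an indicator vector (x(i) = 1 if node i is in A and -1 otherwise) and b = k/(1 - k), with k the fraction of the total d(i) that belongs to nodes in A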

Check the original paper for the full details of each part of the eigenvalue system.
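
As a rough illustration of how this system can be solved in practice, the sketch below converts it to the equivalent symmetric problem D^(-1/2) (D - W) D^(-1/2) z = λz with y = D^(-1/2) z, and splits the graph using the eigenvector with the second smallest eigenvalue. The function name `ncut_bipartition`, the toy graph, and the choice to threshold y at zero are assumptions made for this example; the original paper discusses several ways to choose the splitting point.

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way split from the second smallest generalized eigenvector.

    Assumes W is symmetric and every node has a positive total weight.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)

    # Symmetric form of (D - W) y = lambda D y.
    L_sym = d_inv_sqrt[:, None] * (np.diag(d) - W) * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; index 0 is the trivial
    # solution (lambda = 0, constant y), so index 1 is the one of interest.
    eigvals, eigvecs = np.linalg.eigh(L_sym)
    y = d_inv_sqrt * eigvecs[:, 1]

    # Thresholding at zero is one simple choice of splitting point.
    return y > 0, eigvals[1]

# Toy graph: triangle {0, 1, 2}, bridge 2-3, chain 3-4-5 with a weak 4-5 edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (2, 3, 0.4), (3, 4, 1.0), (4, 5, 0.3)]:
    W[i, j] = W[j, i] = w

labels, lam = ncut_bipartition(W)
print(labels, lam)
```

The sign of an eigenvector is arbitrary, so which group is labeled `True` can differ between runs or platforms; only the grouping itself is meaningful.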

