Image segmentation

What is Image Segmentation?

Image segmentation is the process of partitioning an image into distinct regions. This can be a fairly simple task, such as separating the pixels of a black eagle from the pixels of the daylight sky behind it. Segmenting camouflaged soldiers in a jungle is a significantly more challenging task.

Thresholding and the binary image

Image thresholding is one of the most widely used segmentation techniques because of its simplicity. The basic approach is to select an intensity as the threshold value: any pixel whose intensity is above the threshold is assigned to region A, and any pixel at or below it is assigned to region B. The result is a binary image, in which every pixel carries one of two labels.
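The rule above can be sketched in a few lines of Python with NumPy. The function name threshold_image, the sample pixel values, and the threshold of 128 are illustrative assumptions, not part of the original text.

```python
import numpy as np

def threshold_image(image, t):
    """Return a boolean mask: True where the pixel intensity exceeds t (region A)."""
    image = np.asarray(image)
    return image > t

# Hypothetical 3x3 grayscale image with bright "object" pixels on a dark background.
image = np.array([[10, 200, 30],
                  [220, 15, 240],
                  [5, 210, 25]])

mask = threshold_image(image, 128)  # True = region A, False = region B
```

The mask is the binary image: it can be used directly to index the original array, for example image[mask] to collect the region A pixels.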

The key parameter in thresholding is the choice of the threshold, and several methods for choosing it exist. The simplest is to use the mean or median pixel value, the rationale being that if the object pixels are brighter than the background, they should also be brighter than the average. In a noiseless image with uniform background and object values, the mean or median works well as the threshold. In general, however, this will not be the case.

A more sophisticated approach is to create a histogram of the image pixel intensities and use the valley point between the two peaks as the threshold. The histogram approach assumes that the background pixels and the object pixels each have some average value, and that the actual pixel values vary around these averages. However, this is computationally less simple than we would like, and many image histograms do not have a clearly defined valley point.

Ideally we are looking for a method of choosing the threshold that is simple, does not require much prior knowledge of the image, and works well for noisy images. One such approach is an iterative method, as follows:

  1. An initial threshold (T) is chosen. This can be done randomly or according to any other method desired.
  2. The image is segmented into object and background pixels as described above, creating two sets:
        1. G1 = {f(m,n) : f(m,n) > T} (object pixels)
        2. G2 = {f(m,n) : f(m,n) ≤ T} (background pixels), where f(m,n) is the value of the pixel located in the mth column, nth row, so that pixels equal to T are also assigned to a set
  3. The average of each set is computed.
        1. m1 = average value of G1
        2. m2 = average value of G2
  4. A new threshold is created that is the average of m1 and m2:
        1. T' = (m1 + m2)/2
  5. Return to step 2, now using the new threshold computed in step 4. Keep repeating until the new threshold matches the one before it (i.e. until convergence has been reached).
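The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name iterative_threshold, the parameters tol and max_iter, the choice of the image mean as the default initial threshold, and the decision to place pixels equal to T in the background set are all assumptions made for this example.

```python
import numpy as np

def iterative_threshold(image, initial=None, tol=0.5, max_iter=100):
    """Estimate a threshold by repeatedly averaging the two group means."""
    pixels = np.asarray(image, dtype=float).ravel()
    # Step 1: initial threshold (assumed default: the mean intensity).
    t = pixels.mean() if initial is None else float(initial)
    for _ in range(max_iter):
        # Step 2: split into object (G1) and background (G2) pixels.
        g1 = pixels[pixels > t]
        g2 = pixels[pixels <= t]
        if g1.size == 0 or g2.size == 0:
            break  # one set is empty, so both means cannot be computed
        # Steps 3 and 4: new threshold is the midpoint of the two means.
        t_new = (g1.mean() + g2.mean()) / 2.0
        # Step 5: stop once the threshold no longer changes appreciably.
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t

# Hypothetical image: six background pixels at 50, three object pixels at 200.
image = np.array([[50, 50, 200],
                  [50, 200, 50],
                  [200, 50, 50]])
t_final = iterative_threshold(image)
```

For this sample image the threshold settles midway between the two group means. Using a tolerance rather than exact equality is a design choice that avoids endless tiny updates on floating-point data.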

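Another approach is to calculate the new threshold in step 4 using the weighted average of m1 and m2: T' = (|G1|*m1 + |G2|*m2)/(|G1| + |G2|), where |Gn| is the number of pixels in Gn. Note, however, that when G1 and G2 together contain every pixel of the image, this expression equals the mean intensity of the whole image regardless of T, so the iteration reduces to the mean threshold discussed earlier and stops after a single step.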
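A quick numerical check of the weighted formula, assuming every pixel is assigned to either G1 or G2 (pixels equal to T are placed in G2 here). The sample values and variable names are hypothetical. Under that assumption the count-weighted average of the two group means is simply the mean of all pixels, whatever value T has.

```python
import numpy as np

# Hypothetical flattened image: six pixels at 50, three at 200.
pixels = np.array([50, 50, 200, 50, 200, 50, 200, 50, 50], dtype=float)

def weighted_update(pixels, t):
    """Count-weighted average of the two group means for threshold t."""
    g1 = pixels[pixels > t]
    g2 = pixels[pixels <= t]
    # Each product size * mean is the sum of that group's pixel values.
    total = g1.sum() + g2.sum()
    return total / (g1.size + g2.size)

weighted_low = weighted_update(pixels, 60.0)
weighted_high = weighted_update(pixels, 150.0)
overall_mean = pixels.mean()
```

Both calls return the same value as overall_mean, which is why this variant behaves like the plain mean threshold.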

Performance evaluation, the Receiver Operating Characteristic (ROC) curve

Connected and non-Connected Components

Split and Merge Algorithm