Image segmentation is a problem of partitioning an image into a number of regions with similar appearances. Here we focus on textured images.
Texture appearances cannot be captured by individual pixel features (intensity, color). It needs features extracted from local neighborhoods. A simple way is to use the histogram of an image window around a pixel. Such a local histogram characterizes the appearance of image window. We will use histograms with equally divided bins and the bin number is fixed.
When using local histograms for segmentation, we need to address the boundary localization problem. The figure below shows a synthesized image contains two different regions. The local histogram of pixel A is computed from the square window. Since the window crosses two regions, the local histogram is very different from the histograms of both two regions. Although A is within the right region, the histogram of the square window is not guaranteed to be closer to the histogram of right region, regardless of the particular choice of distance metric.
Instead of measuring the distance, we use a different strategy. The local histogram at A is actually the weighted sum of the left region’s histogram and the right region’s histogram. The weights correspond to the area coverage within the square window. That is, the weight tells which region pixel A belongs to. This can be written as
HA is the histogram of the square window. H1 and H2 are the histograms of the left and right regions. w1 and w2 are the weights. If H1 and H2 are known, the weights can be estimated. To get H1 and H2, we can select a pixel within each region and compute its local histogram, which should be close to the histogram of the region. Such histograms will be referred to as representative features.
We compute the local histogram of each pixel location in an image. There is a smart technique called integral histograms, which allows us to quickly compute local histograms regardless of the window size. Then, we extend the analysis from features near boundaries to all the features in the image — we regard the feature of each pixel as the linear combination of all the representative features weighted by the corresponding area coverage. When a window is completely within one region, the weight for the representative feature of that region should be one, while the other weights zero. The formulation becomes
are the local histogram of each pixel location. Each HA to Hn Hr is a representative feature corresponding to one region. Each W is the weight vector for a pixel location. The last term represents noise. As we can see, the feature matrix can be factored into two matrices. Let’s w rite the above equation into a simple matrix form.
Given an image, we compute the feature matrix Y. As mentioned earlier, we can manually put a seed in each region to compute representative features. Then, t he weights can be computed through least square estimation
The weights immediately give the segment label of each pixel. A pixel is assigned to the region that has the largest weight. Here we obtain segmentation through simple matrix operations. This actually provides a very efficient semi-automatic segmentation algorithm (require seed placement for representative features). The following figure gives the segmentation result, where the boundary is accurately localized.
Histograms based on pixel intensities have limited power to characterize textures. This can be greatly improved by first applying filters (Laplacian of Gaussian filters, Gabor filters, etc.) to images . The filters cause certain spatial patterns to respond strongly, which better differentiate texture appearances. We compute histograms of multiple filter responses and concatenate the histograms. Such features are called spectral histograms. Note that pixel intensity can be thought of as the response of the intensity filter.
The analysis above for local histograms also applies for local spectral histograms . Therefore, the method can be used with local spectral histograms. For certain region boundaries, some filters may give strong responses that are not consistent with responses inside regions. However, because the purpose of filtering in local spectral histograms is to capture elementary patterns, the chosen filters are generally small compared with integration scales. Such inconsistent filter responses happen at a small portion of pixels within a local window and thus have a minimal impact on resulting histograms.
The middle image in the following figure is the segmentation result of the leftmost image. Here the intensity filter and two LoG filters are used to compute spectral histograms. Three seeds are selected in each region to get representative features, which are fed into the least square estimation equation above. The resulting boundaries are highly accurate. An alternative method is to use the same representative features and classify each pixel based on the distance between its feature and three representative features. The rightmost image shows a result using Chi-square distance (other distance measures give similar results), where pixels near boundaries are misclassified due to the aforementioned localization problem.
Fully Automatic Segmentation
For fully automatic segmentation, both representative features and combination weights are unknowns, and we aim to estimate these two matrices by factoring the matrix Y. The factorization algorithm can be summarized as follows.
- Apply singular value decomposition (SVD) to obtain factored matrices with low rank.
- The number of segment equals to the effective rank of the feature matrix, which can be estimated from the singular values.
- SVD gives the subspace all features reside in, which greatly reduces feature dimensions and leads to the estimation of representative features.
- A nonnegative matrix factorization technique is used to ensure nonnegativity constraints.
The algorithm is very efficient and performs reliably. Please refer to  for details.
Below are example results on natural images.
The method naturally imposes a smoothing effect on the boundaries. Take the figure below as an example. According to the coverage of the two regions within the window, the method segments the black pixel into the darker region, as shown in the middle image. With the window size sufficiently large, we obtain the segmentation result shown in left image, where the boundary is close to a straight line. The larger the local window used to compute histograms, the smoother the resulting boundaries.
For many image segmentation methods, smoothing is often explicitly included as an objective. In our algorithm, the smoothing effect is naturally encoded.
Three main types of filters used in our experiments are the intensity filter (the filter response is just pixel intensity values), Laplacian of Gaussian filters, and Gabor filters. Other types of filters can also be used. For most natural images, a small number of filters work sufficiently well. To segment images where texture is complex and the difference among regions is small, adding more filters can give more accurate results.
A geospatial application: neighborhood mapping
Neighborhoods with different socioeconomic characteristics exhibit different texture appearance in aerial view images. In the figure below, the proposed method effectively separates settlement areas into regions with large buildings, buildings in vegetated areas, dense slum-like buildings, etc.
 J. Yuan, D. L. Wang, and A. M. Cheriyadat. Factorization-based texture segmentation. IEEE Transactions on Image Processing, 2015.