Blob-Detection in Image Processing

Comparing the blob detection algorithms

Rafael Madrigal
3 min read · Sep 14, 2022

In image processing jargon, a blob is a bright object on a dark background or a dark object on a bright background. To detect blobs, we need to find the boundary where the intensity shifts abruptly from dark to bright (or bright to dark). Because such sudden shifts show up as large derivatives, we can rely on our differential calculus skills to solve the problem.
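To make the calculus intuition concrete, here is a minimal sketch on a made-up 1D intensity profile (the array and its values are assumptions for illustration, not data from this article). The first difference is zero everywhere except at the two boundaries of the bright segment.

```python
import numpy as np

# Hypothetical 1D intensity profile: dark (0) with one bright segment (1).
profile = np.zeros(50)
profile[20:30] = 1.0

# Discrete first derivative: diff[i] = profile[i + 1] - profile[i].
diff = np.diff(profile)

rising_edge = int(np.argmax(diff))   # dark -> bright jump
falling_edge = int(np.argmin(diff))  # bright -> dark jump
print(rising_edge, falling_edge)  # 19 29
```

The jump from dark to bright sits between samples 19 and 20, and the jump back sits between samples 29 and 30. The 2D detectors apply the same idea with second derivatives of a smoothed image.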

This is the logic behind the three most popular blob detection algorithms in scikit-image. These are:

  • Laplacian of Gaussian (LoG): takes the Laplacian of a Gaussian-smoothed image
  • Difference of Gaussian (DoG): takes the difference of two Gaussian-smoothed images
  • Determinant of Hessian (DoH): finds the maxima in the determinant of the Hessian matrix of the image
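As a rough illustration of what each operator computes, the sketch below builds the three response maps by hand with scipy.ndimage on a synthetic image. The image, the sigma value, and the 1.6 blur ratio are assumptions for illustration, and this is not how scikit-image implements its detectors internally.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

# Hypothetical test image: one bright Gaussian blob centered at (40, 40).
rows, cols = np.mgrid[0:81, 0:81]
image = np.exp(-((rows - 40) ** 2 + (cols - 40) ** 2) / (2 * 4.0 ** 2))

sigma = 4.0

# Laplacian of Gaussian, negated so that bright blobs give positive peaks.
log_response = -gaussian_laplace(image, sigma=sigma)

# Difference of Gaussian: narrow blur minus wide blur (ratio 1.6 is an assumption).
dog_response = gaussian_filter(image, sigma) - gaussian_filter(image, 1.6 * sigma)

# Determinant of Hessian, built from Gaussian second derivatives.
h_rr = gaussian_filter(image, sigma, order=(2, 0))
h_cc = gaussian_filter(image, sigma, order=(0, 2))
h_rc = gaussian_filter(image, sigma, order=(1, 1))
doh_response = h_rr * h_cc - h_rc ** 2


def peak(response):
    """Return the (row, col) of the strongest response."""
    idx = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in idx)


print(peak(log_response), peak(dog_response), peak(doh_response))
```

On this image, all three maps should peak at the blob center, which is why each of them can serve as a blob detector.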

These terms may sound complicated, but luckily each one can be run with a single function call in scikit-image.

from skimage.data import hubble_deep_field
from skimage.feature import blob_dog, blob_log, blob_doh
from skimage.color import rgb2gray
import math

im = hubble_deep_field()[:500, :500]
im_gry = rgb2gray(im)

# Laplacian of Gaussian: each row of the result is (row, col, sigma)
blobs_log = blob_log(im_gry, max_sigma=30, num_sigma=10, threshold=.1)
blobs_log[:, 2] = blobs_log[:, 2] * math.sqrt(2)  # sigma -> approximate radius

# Difference of Gaussian: blob_dog has no num_sigma argument
blobs_dog = blob_dog(im_gry, max_sigma=30, threshold=.1)
blobs_dog[:, 2] = blobs_dog[:, 2] * math.sqrt(2)  # sigma -> approximate radius

# Determinant of Hessian: the sigma column already approximates the radius
blobs_doh = blob_doh(im_gry, max_sigma=30, threshold=0.01)

There are three main hyperparameters across these functions: max_sigma, num_sigma, and threshold (blob_dog takes sigma_ratio in place of num_sigma).

  • max_sigma: the maximum standard deviation of the Gaussian kernel. Raise this value to detect larger blobs.
  • threshold: the absolute lower bound for scale-space maxima. Lower this value to detect less intense blobs.
  • num_sigma: the number of intermediate standard deviations to consider between min_sigma and max_sigma.
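To see the threshold at work, here is a small sketch on a synthetic image with one bright and one faint blob. The image, the helper gaussian_blob, and the specific sigma and threshold values are assumptions chosen for illustration.

```python
import numpy as np
from skimage.feature import blob_log

# Hypothetical image: a bright blob (amplitude 1.0) and a faint one (0.2).
rows, cols = np.mgrid[0:100, 0:100]


def gaussian_blob(center_row, center_col, amplitude, sigma=4.0):
    """Return a 2D Gaussian bump on the 100x100 grid defined above."""
    dist_sq = (rows - center_row) ** 2 + (cols - center_col) ** 2
    return amplitude * np.exp(-dist_sq / (2 * sigma ** 2))


image = gaussian_blob(30, 30, 1.0) + gaussian_blob(70, 70, 0.2)

# Same scale range for both calls; only the threshold changes.
strict = blob_log(image, min_sigma=2, max_sigma=8, num_sigma=7, threshold=0.3)
loose = blob_log(image, min_sigma=2, max_sigma=8, num_sigma=7, threshold=0.05)

print(len(strict), len(loose))
```

With the higher threshold, only the bright blob should survive; lowering it lets the faint blob through as well.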

Using Connected Components

An alternative way to detect blobs is to use connected components. Here, we identify a blob as a set of connected 1s in a binary image. This approach depends heavily on the threshold used to binarize the image and on the morphological operations applied afterward, since even a tiny speck of 1s in the image is counted as a connected component.
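The sensitivity to specks is easy to reproduce. In the sketch below, the binary image and the min_area cutoff are assumptions for illustration; filtering by area stands in for whatever cleaning step suits your image.

```python
import numpy as np
from skimage.measure import label, regionprops

# Hypothetical binary image: two real blobs plus a single-pixel speck.
binary = np.zeros((20, 20), dtype=bool)
binary[2:7, 2:7] = True      # 25-pixel blob
binary[10:16, 10:18] = True  # 48-pixel blob
binary[18, 1] = True         # 1-pixel speck (noise)

labels, n_raw = label(binary, return_num=True)

# Keep only components with at least min_area pixels (an assumed cutoff).
min_area = 10
keep = [region.label for region in regionprops(labels) if region.area >= min_area]
cleaned = np.isin(labels, keep)
_, n_clean = label(cleaned, return_num=True)

print(n_raw, n_clean)  # 3 2
```

Without cleaning, the speck is counted as a third blob; after the area filter, only the two real blobs remain.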

For this, we use the label and regionprops functions in skimage.measure:

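from skimage.measure import label, regionprops
from skimage.color import rgb2gray, rgba2rgb
from skimage.io import imread

# skimage.color has no rgba2gray, so drop the alpha channel first
# (this assumes candies.png is an RGBA image)
im = rgb2gray(rgba2rgb(imread('candies.png')))
im_bw = im < 0.5  # binarize: True where the image is dark
# morphological_ops is a placeholder for a cleaning step (e.g. opening and
# closing) that is not defined in this article
im_clean = morphological_ops(im_bw)
im_label = label(im_clean)  # give each connected component an integer label
props = regionprops(im_label)  # one entry per labeled region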

props[0].area
### Output: 4453, the area in pixels of the first labeled region (label == 1; label 0 is the background)
Process of Image Labelling for Feature Extraction: Convert to Grayscale > Binarize > Label Image. Each blob is a set of connected pixels or elements in the array — image generated by the Author.

One benefit of regionprops is that it gives us geometric information about each detected blob, such as its centroid, area, perimeter, and bounding box, which blob_dog, blob_doh, and blob_log do not provide. This is extremely helpful once we decide to generate features for a machine learning task.
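As a sketch of that feature-generation step, regionprops_table in skimage.measure returns the requested properties as a dictionary of arrays. The rectangular test image below is an assumption for illustration.

```python
import numpy as np
from skimage.measure import label, regionprops_table

# Hypothetical binary image with a single rectangular blob.
binary = np.zeros((12, 12), dtype=bool)
binary[2:7, 3:9] = True  # rows 2-6, cols 3-8 -> 30 pixels

features = regionprops_table(
    label(binary),
    properties=("label", "area", "centroid", "bbox", "perimeter"),
)

# Each key maps to an array with one value per blob.
for name, values in features.items():
    print(name, values)
```

Multi-valued properties are split into columns such as centroid-0 and centroid-1, so the dictionary can be passed straight to a table structure like a pandas DataFrame if you use one.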

Wrapping Up

In this article, we showed two approaches to blob detection: (1) differential-based and (2) connected components. Differential-based algorithms are useful for counting and marking blobs, while connected components are better when we want to measure properties of the blobs we identified. However, connected components depend heavily on the cleanliness of the image and on how well we performed the thresholding and morphological operations.

Next, we look at the different image segmentation algorithms that we can use.
