visit
This blog post on Canny Edge Detection is the second in the series about OpenCV and Computer Vision. Other parts: Part 1.
Some of OpenCV's most commonly used edge-detection algorithms are Canny, Sobel, Scharr, and Laplacian. Let us quickly and simply go through how one of those algorithms works. We'll look into Canny as it is one of the most commonly used, and then we'll try to implement it from scratch, although a pretty simplified version.
If you need to know anything about noise in edge detection algorithms, know this - it is highly, highly problematic. Noise is considered bad because it introduces false edges and reduces the accuracy of existing ones. In order to reduce noise, we smooth the image by applying a blurring algorithm. A blurring algorithm uses kernels. The kernel is a small square matrix (3x3, 5x5, or 7x7) that defines a set of coefficients/weights used in computer vision algorithms. Basically, this kernel is mathematically utilized on each and every pixel in the image.
G(x) = (1 / sqrt(2 * pi * sigma^2)) * exp(-((x - mu)^2) / (2 * sigma^2))
Gradients are helpful as they give us more information about the changes in intensity in different directions by analyzing the rate of change. The most commonly used algorithm to calculate gradients is the Sobel operator, which is a spatial filter that convolves the image with a small kernel in order to approximate the gradients.
What does it mean to convolve the kernel? With convolving, we apply a kernel to an image. The kernel is moved through the entire image, and on each location, a weighted sum of pixel values is computed. We actually used the same process when we reduced noise using the Gaussian filter. We'll show how this works mathematically in the implementation section. The new image generated represents the convoluted result.
Ok, let us start with the Gaussian filter denoising. We already know what we want. We want to convolve the image with a 2D kernel. Let us first create the kernel:
def generate_gaussian_kernel(kernel_size, sigma):
kernel = np.fromfunction(
lambda x, y: (1 / (2 * np.pi * sigma**2))
* np.exp(
-((x - (kernel_size - 1) / 2) ** 2 + (y - (kernel_size - 1) / 2) ** 2)
/ (2 * sigma**2)
),
(kernel_size, kernel_size),
)
kernel /= np.sum(kernel)
return kernel
gauss_kernel = generate_gaussian_kernel(3, 1)
print(gauss_kernel)
blurred_image = cv2.filter2D(img, -1, gauss_kernel)
<class 'numpy.ndarray'>
[[0.07511361 0.1238414 0.07511361]
[0.1238414 0.20417996 0.1238414 ]
[0.07511361 0.1238414 0.07511361]]
The first thing that might be interesting is numpy.fromfunction
, that constructs an array by executing a function over each element. The producing element is an array from the numerical Python library called – numpy.ndarray
. lambda
is the anonymous function (without the function name) that calculates the kernel.
We will play with the kernel size and sigma to produce the best product when we merge all steps together, but now for the sake of learning, we will create a 3×3 bell-curve-shaped kernel with a sigma deviation of 1 (meaning no deviation). Let us take a step back and ask a really important question. How do we actually convolve the image? By calculating the weighted average at a certain pixel. Check this:
[[168 167 169]
[165 163 168]
[170 165 167]]
kernel_{0, 0} * matrix_{0, 0} + kernel_{0, 1} * matrix_{0, 1} + kernel_{0, 2} * matrix_{0, 2} + ... + kernel_{w, h} * matrix_{w, h}
print(np.sum(slice * gauss_kernel))
166.2624376159791
Blurred
[[167 167 167]
[167 166 167]
[166 166 166]]
The value of the middle pixel is 166, which we actually calculated manually! Amazing! Of course, we won’t do this by hand for every pixel. Thankfully the filter2D
method will do everything that we did manually (and more!). It will actually apply the filter to the image and produce a new one.
gauss_kernel = generate_gaussian_kernel(3, 1.2)
blurred_image = cv2.filter2D(img, -1, gauss_kernel)
The next step is to calculate a gradient. We will convolve the blurred image with two Sobel kernels for the horizontal and vertical direction, and we will produce two gradient values. Then we’ll calculate two new values. gradient_magnitude
is a measure of how much the intensity changes around a pixel and gradient_direction
represents the angle of the gradient at each pixel. We’ll use both of those variables in the algorithm as the magnitude is used to identify edge points, and direction is used to refine the edges itself. We are using asType(np.uint8))
to convert the magnitude and direction values to 8-bit integer values.
We do this because after calculating the Euclidean norm and the arctan function, we most probably have floating-point data. Additionally, the gradient_direction
variable is, by default, returned in radians. That is why we will need to convert it to degrees when we use it.
def gradient_calculation(input_image):
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
sobel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
gradient_x = cv2.filter2D(input_image, -1, sobel_x)
gradient_y = cv2.filter2D(input_image, -1, sobel_y)
gradient_magnitude = np.hypot(gradient_x, gradient_y)
gradient_direction = np.rad2deg(np.arctan2(gradient_y, gradient_x))
gradient_magnitude = gradient_magnitude.astype('uint8')
gradient_direction[gradient_direction < 0] += 180
return gradient_magnitude, gradient_direction
Let us continue with the non-maximum suppression. Now we have information about the strength and direction of the intensity change at each pixel. Using non-maximum suppression, we’ll try to thin the edges by preserving only the maximum values. We’ll go through each pixel in the gradient_magnitude
image and check the gradient direction. Based on the direction, we’ll identify the neighboring pixels and compare the values with the neighboring values.
def non_maximum_suppression(magnitude, direction):
size = magnitude.shape
suppressed = np.zeros(size)
for i in range(1, size[0] - 1):
for j in range(1, size[1] - 1):
if (0 <= direction[i, j] < 22.5) or (157.5 <= direction[i, j] <= 180):
value_to_compare = max(magnitude[i, j - 1], magnitude[i, j + 1])
elif (22.5 <= direction[i, j] < 67.5):
value_to_compare = max(magnitude[i - 1, j - 1], magnitude[i + 1, j + 1])
elif (67.5 <= direction[i, j] < 112.5):
value_to_compare = max(magnitude[i - 1, j], magnitude[i + 1, j])
else:
value_to_compare = max(magnitude[i + 1, j - 1], magnitude[i - 1, j + 1])
if magnitude[i, j] >= value_to_compare:
suppressed[i, j] = magnitude[i, j]
suppressed = np.multiply(suppressed, 255.0 / suppressed.max())
return suppressed
Let us apply the double threshold and edge tracking. After finishing the non-maximum suppression, we have an edge map with edge candidates. We’ll apply the double threshold by setting two threshold values (low and high) in order to distinguish strong and weak edges. We’ll mark pixels in the edge map that are greater than the high values as strong edges and the ones between low and high as weak edges.
def double_threshold_and_edge_tracking(image, low_threshold, high_threshold):
weak = 50
strong = 255
size = image.shape
result = np.zeros(size)
weak_x, weak_y = np.where((image > low_threshold) & (image <= high_threshold))
strong_x, strong_y = np.where(image >= high_threshold)
result[strong_x, strong_y] = strong
result[weak_x, weak_y] = weak
dx = np.array((-1, -1, 0, 1, 1, 1, 0, -1))
dy = np.array((0, 1, 1, 1, 0, -1, -1, -1))
size = image.shape
while len(strong_x):
x = strong_x[0]
y = strong_y[0]
strong_x = np.delete(strong_x, 0)
strong_y = np.delete(strong_y, 0)
for direction in range(len(dx)):
new_x = x + dx[direction]
new_y = y + dy[direction]
if((new_x >= 0 & new_x < size[0] & new_y >= 0 & new_y < size[1]) and (result[new_x, new_y] == weak)):
result[new_x, new_y] = strong
np.append(strong_x, new_x)
np.append(strong_y, new_y)
result[result != strong] = 0
return result
NOTE: Keep in mind if you want to replicate the code we have written until now to change the config variables depending on the picture you provide to the algorithm (when generating the Gaussian kernel and/or thresholding the edge map.
You know, there is a better alternative to all of the work above. We could just use the cv2.Canny()
method that comes with the OpenCV library, which is highly superior to the custom edge detection algorithm we created above. We can provide lots of different arguments, such as the image (of course), the output edge map, threshold values for the hysteresis procedure, aperture size for the Sobel algorithm, and the boolean value for the gradient algorithm. Below, the Canny algorithm produced the following edge detection result only with the image and the two threshold values.
canny_algorithm = cv2.Canny(img,100,200)
Also published .