Python Image Processing Tutorial (Using OpenCV)

In this tutorial, you will learn how you can process images in Python using the OpenCV library.

OpenCV is a free, open-source library for real-time image processing. It can process images, videos, and even live streams, but as a first step this tutorial covers images only. Before getting started, let’s install OpenCV.


 

 

Install OpenCV

To install OpenCV on your system, run the following pip command:

pip install opencv-python

Install OpenCV

Now OpenCV is installed successfully and we are ready. Let’s have some fun with some images!

 

Rotate an Image

First of all, import the cv2 module.

import cv2

To read the image, use the imread() method of the cv2 module: pass the path to the image as the argument and store the result in a variable, as below:

img = cv2.imread("pyimg.jpg")

The image is now stored in img as a matrix of pixel values arranged in rows and columns.

If you check the type of img with type(img), the result is numpy.ndarray.

It’s a NumPy array! That’s why image processing with OpenCV is so easy: you are working with a NumPy array all the time.

To display the image, you can use the imshow() method of cv2.


The waitKey() function takes a delay in milliseconds as its argument. Passing zero makes it wait indefinitely, so the window stays open until you press a key.

Display image using OpenCV

To rotate this image, you need the width and the height of the image because you will use them in the rotation process as you will see later.

height, width = img.shape[0:2]

For a color image, the shape attribute returns a tuple of (height, width, channels), so img.shape[0:2] gives the height and width of the image matrix. If you print img.shape[0:2], you will have the following output:

Get image attributes

Okay, now we have our image matrix and we want to get the rotation matrix. To get the rotation matrix, we use the getRotationMatrix2D() method of cv2. The syntax of getRotationMatrix2D() is:

cv2.getRotationMatrix2D(center, angle, scale)

Here center is the center point of rotation, angle is the rotation angle in degrees (positive values rotate counterclockwise), and scale is an isotropic scale factor; a value below 1 shrinks the image so that it still fits in the output after rotation.

To get the rotation matrix of our image, the code will be:

rotationMatrix = cv2.getRotationMatrix2D((width/2, height/2), 90, .5)

The next step is to rotate our image with the help of the rotation matrix.

To rotate the image, use the cv2 method named warpAffine(), which takes the original image, the rotation matrix, and the size of the output image as a (width, height) tuple.

rotatedImage = cv2.warpAffine(img, rotationMatrix, (width, height))

The rotated image is stored in the rotatedImage matrix. To show the image, use imshow() as below:


After running the above lines of code, you will have the following output:

Show rotated image

 


Crop an Image

First, we need to import the cv2 module and read the image and extract the width and height of the image:


Now get the starting and ending indexes of the rows and columns. These define the size of the newly created image. For example, starting at row 10 and ending at row 15 gives the height of the cropped image.

Similarly, starting at column 10 and ending at column 15 gives its width.

You can get the starting point by specifying the percentage value of the total height and the total width. Similarly, to get the ending point of the cropped image, specify the percentage values as below:


Now map these values to the original image. Note that you have to cast the starting and ending values to integers because when mapping, the indexes are always integers.

croppedImage = img[startRow:endRow, startCol:endCol]
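As a rough sketch of the whole cropping step, the example below uses a synthetic 200x300 image and keeps the middle half in each direction. The 25% and 75% values are hypothetical percentages chosen for illustration, not necessarily the ones used for the screenshot in this tutorial.

```python
import numpy as np

# Synthetic stand-in for the image returned by cv2.imread().
img = np.zeros((200, 300, 3), dtype=np.uint8)
height, width = img.shape[0:2]

# Hypothetical percentages: start at 25% and end at 75% of each dimension.
# int() is needed because slice indexes must be integers.
startRow = int(height * 0.25)
startCol = int(width * 0.25)
endRow = int(height * 0.75)
endCol = int(width * 0.75)

croppedImage = img[startRow:endRow, startCol:endCol]
print(croppedImage.shape)  # (100, 150, 3)
```

Because this is plain NumPy slicing, croppedImage is a view of img; call croppedImage.copy() if you need to modify the crop without touching the original.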

Here we specified the range from starting to ending of rows and columns.

Now display the original and cropped image in the output:


The result will be as follows:

Crop image


Resize an Image

To resize an image, you can use the resize() method of OpenCV. You can either specify scale factors along the x and y axes (fx and fy) or give the target size of the image as a number of columns and rows.

Import and read the image:


Now using the resize method with axis values:


The result will be as follows:

Resize image using resize method

Now using the row and column values to resize the image:


We say we want 550 columns (the width) and 350 rows (the height).

The result will be:

Resize image using rows and columns

 

Adjust Image Contrast

The OpenCV Python module has no dedicated function for adjusting image contrast, but the official OpenCV documentation suggests an equation that adjusts brightness and contrast at the same time.

new_img = a * original_img + b

Here a is alpha, which controls the contrast of the image. If a is greater than 1, the contrast is higher.

If a is between 0 and 1, the contrast is lower. If a is 1, the contrast is unchanged.

b stands for beta, which controls the brightness. The values of b vary from -127 to +127.

To implement this equation in Python OpenCV, you can use the addWeighted() method. We use the addWeighted() method because it keeps the output in the range 0 to 255 for a 24-bit color image.

The syntax of addWeighted() method is as follows:

cv2.addWeighted(source_img1, alpha1, source_img2, alpha2, beta)

This syntax blends two images: the first source image (source_img1) with a weight of alpha1 and the second source image (source_img2) with a weight of alpha2, and then adds beta to the result.

If you only want to apply contrast in one image, you can add a second image source as zeros using NumPy.

Let’s work on a simple example. Import the following modules:


Read the original image:

img = cv2.imread("pyimg.jpg")

Now apply the contrast. Since there is no other image, we will use np.zeros to create an array of the same shape and data type as the original image, filled with zeros.


In the above code, the brightness is set to 0 as we only want to apply contrast.

The comparison of the original and contrast image is as follows:

Adjust image contrast

 


Make an image blurry

Gaussian Blur

To make an image blurry, you can use the GaussianBlur() method of OpenCV.

GaussianBlur() uses a Gaussian kernel. The height and width of the kernel must be positive, odd numbers.

Then you have to specify the standard deviations in the X and Y directions, sigmaX and sigmaY respectively. If only sigmaX is specified, sigmaY is taken to be the same.

Consider the following example:


In the above snippet, the image is passed to GaussianBlur() along with the height and width of the kernel and the standard deviations in the X and Y directions.

The comparison of the original and blurry image is as follows:

Gaussian Blur

 

Median Blur

In median blurring, the median of all the pixels inside the kernel area is calculated, and the central pixel is replaced with this median value. Median blurring is used when there is salt-and-pepper noise in the image.

To apply median blurring, you can use the medianBlur() method of OpenCV.

Consider the following example, where the image has salt-and-pepper noise:


This will apply 50% noise in the image along with median blur. Now show the images:


The result will be like the following:

Median Blur

Another comparison of the original image and after blurring:

Make image blurry

 

Detect Edges

To detect the edges in an image, you can use the Canny() method of cv2 which implements the Canny edge detector. The Canny edge detector is also known as the optimal detector.

The syntax of Canny() is as follows:

cv2.Canny(image, minVal, maxVal)

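Here minVal and maxVal are the lower and upper thresholds on the intensity gradient. Gradients above maxVal are treated as edges, gradients below minVal are discarded, and values in between are kept only if they connect to a strong edge.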

Consider the following code:


The output will be the following:

Detect Edges on car

Here is the result of the above code on another image:

Detect Edges

 

Convert image to grayscale (Black & White)

The easy way to convert an image to grayscale is to load it in grayscale mode by passing 0 (cv2.IMREAD_GRAYSCALE) as the second argument of imread():

img = cv2.imread("pyimg.jpg", 0)

Another method converts an image that is already loaded in color. Use cvtColor() with the COLOR_BGR2GRAY conversion code of the cv2 module, as demonstrated in the example below:

Import the cv2 module:

import cv2

Read the image:

img = cv2.imread("pyimg.jpg")

Use the cvtColor() method of the cv2 module, which takes the original image and the COLOR_BGR2GRAY conversion code as arguments. Store the resultant image in a variable:

gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Display the original and grayscale images:


The output will be as follows:

Grayed image

 


Centroid (Center of blob) detection

To find the center of a blob in an image, the first step is to convert the original image into grayscale. We can use the cvtColor() method of cv2 as we did before.

This is demonstrated in the following code:


We read the image and convert it to a grayscale image. The new image is stored in gray_img.

Now we have to calculate the moments of the image. Use the moments() method of cv2. In the moments() method, the grayscale image will be passed as below:

moment = cv2.moments(gray_img)

Then we need to calculate the x and y coordinates of the centroid from the moments that we got above, dividing the first-order moments m10 and m01 by the zeroth-order moment m00:


Finally, we have the centroid. To highlight this position, we can use the circle() method, which draws a circle of the given radius at the given coordinates.

The circle() method takes the image, the (x, y) coordinates of the circle center, the radius, the color, and the line thickness.

cv2.circle(img, (X, Y), 15, (205, 114, 101), 1)

The circle is created on the image.


The original image is:

Original image

After detecting the center, our image will be as follows:

Centroid detection

 

Apply a mask for a colored image

Image masking means applying another image as a mask on the original image, or changing selected pixel values in the image.

To apply a mask on the image, we will use the HoughCircles() method of the OpenCV module. The HoughCircles() method detects the circles in an image. After detecting the circles, we can simply apply a mask on these circles.

The HoughCircles() method takes the image (in grayscale), the detection method (the Hough Gradient, which uses the gradient information in the edges of the circle), and parameters related to the following circle equation:

$(x - x_{\text{center}})^2 + (y - y_{\text{center}})^2 = r^2$

In this equation, $(x_{\text{center}}, y_{\text{center}})$ is the center of the circle and $r$ is the radius of the circle.

Our original image is:

Image before mask

After detecting circles in the image, the result will be:

Detecting circles in image

Okay, so we have the circles in the image and we can apply the mask. Consider the following code:


Detecting the circles in the image using the HoughCircles() code from OpenCV: Hough Circle Transform:


To create the mask, use np.full, which returns a NumPy array of the given shape filled with a given value:


The next step is to combine the image and the masking array we created using the bitwise_or operator as follows:

final_img = cv2.bitwise_or(img1, img1, mask=masking)

Display the resultant image:

Final image

 

Extracting text from Image (OCR)

To extract text from an image, you can use Google Tesseract-OCR, which has to be downloaded and installed separately.

Then you should install the pytesseract module which is a Python wrapper for Tesseract-OCR.

pip install pytesseract

Install pytesseract

The image from which we will extract the text is as follows:

Image with text

Now let’s convert the text in this image to a string of characters and display the text as a string on output:

Import the pytesseract module:

import pytesseract

Set the path of the Tesseract-OCR executable file:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

Now use the image_to_string method to convert the image into a string:

print(pytesseract.image_to_string('pytext.png'))

The output will be as follows:

Extracted text from Image

Works like a charm!

 


Detect and correct text skew

In this section, we will correct the text skew.

The original image is as follows:

Skewed text image

Import the modules cv2, NumPy and read the image:


Convert the image into a grayscale image:

gray_img=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Invert the grayscale image using bitwise_not:

gray_img=cv2.bitwise_not(gray_img)

Collect the coordinates of all pixels greater than zero by combining np.where() with the column_stack() method of NumPy:

coordinates = np.column_stack(np.where(gray_img > 0))

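Now we have to calculate the skew angle. We will use the minAreaRect() method of cv2, which returns the angle of the minimum-area bounding rectangle. In the OpenCV versions this tutorial was written for, that angle lies in the range from -90 to 0 degrees (0 not included); newer OpenCV releases use a different angle convention, so check the value range on your version.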

ang=cv2.minAreaRect(coordinates)[-1]

The rotation angle of the text region is stored in the ang variable. Now we add a condition on the angle: if it is smaller than -45, we add 90 degrees to it; otherwise we negate it to make it positive.
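One common way to write this condition is sketched below as a small helper. The function name correct_skew_angle is hypothetical, and the branch for angles below -45 follows the widely used variant that negates the sum after adding 90; that negation is an assumption, since the text only states that 90 degrees are added. The sketch also assumes the older angle range of -90 to 0.

```python
def correct_skew_angle(ang):
    """Map an angle reported in the range [-90, 0) to the rotation
    (in degrees) that straightens the text."""
    if ang < -45:
        # The rectangle is reported relative to its other side.
        return -(90 + ang)
    # Otherwise just flip the sign to make the angle positive.
    return -ang


print(correct_skew_angle(-80))  # -10
print(correct_skew_angle(-10))  # 10
```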


Calculate the center of the text region:


Now that we have the text skew angle, we apply getRotationMatrix2D() to get the rotation matrix and then use the warpAffine() method to rotate the image by that angle (explained earlier).


Display the rotated image:


Correct text image

 

Color Detection

Let’s detect the green color from an image:

Import the modules cv2 for images and NumPy for image arrays:


Read the image and convert it into HSV using cvtColor():


Display the image:

cv2.imshow("HSV Image", hsv_img)

Colored image

Now create a NumPy array for the lower green values and the upper green values:


Use the inRange() method of cv2 to check if the given image array elements lie between array values of upper and lower boundaries:

masking = cv2.inRange(hsv_img, lower_green, upper_green)

This will detect the green color.

Finally, display the original and resultant images:

cv2.imshow("Original Image", img)

Color detected


Green color detected

 

Reduce Noise

To reduce noise from an image, OpenCV provides the following methods:

  1. fastNlMeansDenoising(): Removes noise from a grayscale image
  2. fastNlMeansDenoisingColored(): Removes noise from a colored image
  3. fastNlMeansDenoisingMulti(): Removes noise from grayscale image frames (a grayscale video)
  4. fastNlMeansDenoisingColoredMulti(): Same as 3 but works with colored frames

Let’s use fastNlMeansDenoisingColored() in our example:

Import the cv2 module and read the image:


Apply the denoising function, which takes the following arguments in order: the original image (src); the destination (dst), passed as None here because we store the returned result; the filter strength (h); the filter strength for the color components (hColor), usually equal to the filter strength or 10; the template patch size in pixels used to compute weights, which should be odd (7 is recommended); and the search window size in pixels used to compute the weighted average for a given pixel (21 is recommended).

result = cv2.fastNlMeansDenoisingColored(img,None,20,10,7,21)

Display original and denoised image:


The output will be:

Reduce noise

 

Get image contour

Contours are curves that join the continuous points along a boundary in an image. Contours are used to detect objects.

The original image whose contours we are going to find is given below:

Image with shapes

Consider the following code where we used the findContours() method to find the contours in the image:

Import cv2 module:

import cv2

Read the image and convert it to a grayscale image:


Find the threshold:

retval, thresh = cv2.threshold(gray_img, 127, 255, 0)

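Use findContours(), which takes the image (we pass the thresholded image here), the retrieval mode, and the contour approximation method. The two-value unpacking below matches OpenCV 4.x; OpenCV 3.x returns three values. See the official findContours() documentation.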

img_contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Draw the contours on the image using drawContours() method:

cv2.drawContours(img, img_contours, -1, (0, 255, 0))

Display the image:


The result will be:

Get image contour

 

Remove Background from an image

To remove the background from an image, we will find the contours to detect the edges of the main object, create a mask with np.zeros for the background, and then combine the mask and the image using the bitwise_and operator.

Consider the example below:

Import the modules (NumPy and cv2):


Read the image and convert the image into a grayscale image:


Find the threshold:

_, thresh = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

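In the threshold() method, the last argument defines the style of the threshold. Because THRESH_OTSU is set, the threshold value is chosen automatically and the 127 passed above is ignored. See the official OpenCV threshold documentation.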

Find the image contours:

img_contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]

Sort the contours:


Generate the mask using np.zeros:

mask = np.zeros(img.shape[:2], np.uint8)

Draw contours:

cv2.drawContours(mask, [i],-1, 255, -1)

Apply the bitwise_and operator:

new_img = cv2.bitwise_and(img, img, mask=mask)

Display the original image:

cv2.imshow("Original Image", img)

Image with background

Display the resultant image:


Remove background

 

Image processing is fun when using OpenCV, as you saw. I hope you find the tutorial useful. Keep coming back.

Thank you.

Mokhtar Ebrahim
I have been working as a Linux system administrator since 2010. I'm responsible for maintaining, securing, and troubleshooting Linux servers for multiple clients around the world. I love writing shell and Python scripts to automate my work.

4 thoughts on “Python Image Processing Tutorial (Using OpenCV)

  1. Hi loved the content.
    Can you tell me if I want a specific text to be cleared from an image or to place a mask over that area how can I achieve this?
    Any guide would be great.
    Thanks!
    Regards

    1. Thanks a lot!
      You can use pytesseract to extract text as described in the article, then you can apply any mask you want on the area that contains any occurrence of the extracted text.
      Hope that helps!
