For this section, if you haven’t yet set up your Python environment, please follow the previous tutorial. To see the full list of tutorials, see the main AI Tutorial Page.
At the top of your script (Image_Process.py), type the following pieces of code:
The first piece of code on the left (lines 1-3) imports the packages that you need. “cv2” refers to OpenCV, while “numpy” refers to NumPy. Typing “import … as …” instead of a plain “import …” doesn’t change what is imported, but it lets you use a shorthand name for a package when you refer to it in code. In this case, we will use the common shorthand for the NumPy library, “np”. Any time you would like to call something from a library or package in Python, you simply type the name that you have imported, then a period, then the function, class, or element you wish to access from that library. Such notation can be seen in all of lines 7-10, where you call the OpenCV library by typing “cv2.getGaussianKernel(20,5)” or “cv2.waitKey(0)”.
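As a small illustration of the dot notation described above, the sketch below imports NumPy under its usual alias and calls two of its functions through that alias. The numbers in the array are arbitrary values chosen for this example.

```python
import numpy as np  # "np" is now shorthand for the NumPy package

# Alias, then a period, then the function you want from the package
values = np.array([1, 2, 3, 4])
total = np.sum(values)      # same function as numpy.sum
average = np.mean(values)   # same function as numpy.mean

print(total)
print(average)
```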
The next block of code on the left defines a function. This is achieved using the notation “def name_of_function(parameters):”. Defining a function is useful when it performs an action that you may want to repeat many times. Even when you only need the action once, a function makes your code look nicer and more readable. In this case, we are defining a function to blur an image. Therefore, I’ve named the function blur_image, with a single input (im), representing the image that we wish to blur.
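To see the “def” notation on its own, here is a minimal sketch of a function that is unrelated to blurring. The name count_pixels and the image sizes are made up for this example.

```python
def count_pixels(width, height):
    """Return the number of pixels in an image of the given size (hypothetical helper)."""
    return width * height

# Once defined, the same function can be reused as many times as needed
small = count_pixels(640, 480)
large = count_pixels(1920, 1080)

print(small)
print(large)
```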
Line 7 defines a kernel (which can also just be considered an image filter) using a function that is built into OpenCV. This type of filter is known as a Gaussian; it is used in many fields, and in image processing it is often used to blur images. The two parameters entered into this function (20, 5) are the size of the kernel and the standard deviation (sigma) of the Gaussian, which together set the extent and intensity of the blur and so affect the final image output (examples later). Line 8 applies this filter to the entire image, using the image and the kernel as inputs. Line 9 shows the modified image (2nd parameter in the function) in a window with the given name (1st parameter), and Line 10 ensures that the program doesn’t continue until you press a key on your keyboard. Notice that all of these lines call functions that are built into OpenCV. This is one of the most powerful things about Python: there are already many existing functions that do exactly what you need them to. The trick is in knowing how and when to use them.
Below this function, type the code on the right. It first declares the filename of the picture you are hoping to use. In this case, I have used a picture that my wife took of me in Scotland, at the Eilean Donan castle, and so I have named it ‘EileanDonan.jpg’. The quotation marks indicate that this variable is a string, which means that the computer reads it as a sequence of characters (anything that you can type on a keyboard) that does not hold any numerical value. Notice that quotation marks are also present in line 9 on the left side. For example, if you were to type: sum = 2+2; print(sum), then the computer would, correctly, print 4. On the other hand, if you typed: sum = ‘2+2’; print(sum), then the computer would print 2+2, as it reads each of those individual components as nothing more than a character. Likewise, if I were to type filename = EileanDonan.jpg without quotation marks, then the computer would raise an error: instead of treating this combination of letters as text, it would search for a variable named EileanDonan, which it wouldn’t be able to find. In your case, you should find a photo that you want to edit, move it to the same folder as Image_Process.py, and type in the name of the file in quotation marks, as demonstrated above.
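The difference between a number, a string, and a missing pair of quotation marks can be tried out directly. In this sketch the variable is called total rather than sum, only to avoid hiding Python’s built-in sum function; the try/except block is there so the script keeps running after the error.

```python
total = 2 + 2        # a number: Python evaluates the arithmetic
print(total)         # prints 4

text = '2+2'         # a string: kept as the characters 2, +, 2
print(text)          # prints 2+2

filename = 'EileanDonan.jpg'   # quotation marks make this a string
print(type(filename))

# Without the quotation marks, Python looks for a variable called EileanDonan
try:
    bad_filename = EileanDonan.jpg
except NameError as err:
    error_message = str(err)
    print(error_message)
```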
The next line on the right block (my_im = cv2.imread) reads the information from the image file and converts it into a format that can be handled by the program. In this case, the format is essentially a 3-dimensional array, which has the shape [height x width x 3]. The height and width depend on the particular image that you load, and represent the number of pixels in the image in the vertical and horizontal directions, respectively. You can find these numbers by typing print(my_im.shape) after loading the image with this line of code. The 3 refers to the 3 components of color (blue, green, and red) that, when put together, make up the complex colors that we are able to see in the image. The values in the array represent how much of each color is present for a given pixel. To explore this concept a little more, you can use this tool (not created by me). The next two lines, similar to lines 9 and 10 on the left side, show the image and wait for the user to press a key. Now, run the code by pressing the play button in one of the three places marked with the red boxes (note that the menu bar appears after right clicking your filename at the top of the screen):
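To get a feel for the [height x width x 3] layout without needing an image file, the sketch below builds a tiny array by hand with NumPy as a stand-in for what cv2.imread returns. The 4 x 6 size and the pixel colors are arbitrary choices for this example.

```python
import numpy as np

# Stand-in for a loaded image: 4 pixels tall, 6 pixels wide, 3 color channels
height, width = 4, 6
my_im = np.zeros((height, width, 3), dtype=np.uint8)

print(my_im.shape)   # (4, 6, 3)

# OpenCV orders the channels blue, green, red
my_im[0, 0] = [255, 0, 0]   # top-left pixel: pure blue
my_im[0, 1] = [0, 0, 255]   # next pixel to the right: pure red

# Each pixel is a group of three values, one per color component
blue, green, red = my_im[0, 0]
print(blue, green, red)
```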
Your code should show you your original image, then a blurred image. Below are my results: the initial image (top left), a blurred image using (5, 5) as input parameters to the Gaussian function (top right), a blurred image using (20, 5) as input (bottom left), and one using (20, 20) as input (bottom right). Notice how the bottom left picture is much more blurred than the top right because it used a larger kernel to do the blurring, even though the intensity of the blur was the same. Meanwhile, the bottom right picture uses the same size kernel as the bottom left but a much higher intensity, which results in some perceived double vision, most noticeable in the stripes and the glasses.
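One way to compare the three parameter pairs above is to look at the weights a Gaussian kernel assigns to its pixels. The sketch below uses the textbook Gaussian formula in NumPy; gaussian_weights is a hypothetical helper, not an OpenCV function, and OpenCV’s own kernel may differ in detail. For each pair it reports how much an edge pixel counts relative to the most heavily weighted pixel.

```python
import numpy as np

def gaussian_weights(size, sigma):
    """Normalized 1-D Gaussian weights (hypothetical helper, not an OpenCV function)."""
    positions = np.arange(size) - (size - 1) / 2    # distances from the kernel center
    weights = np.exp(-positions**2 / (2 * sigma**2))
    return weights / weights.sum()                   # weights add up to 1

ratios = {}
for size, sigma in [(5, 5), (20, 5), (20, 20)]:
    w = gaussian_weights(size, sigma)
    # A ratio near 1 means edge pixels count almost as much as central ones
    ratios[(size, sigma)] = w[0] / w.max()
    print(size, sigma, round(ratios[(size, sigma)], 2))
```

With (20, 5) the weights fall off clearly toward the edges, while with (20, 20) they are almost flat across the whole kernel, so distant pixels contribute nearly as much as nearby ones. That may be one reason the (20, 20) result shows ghosted edges rather than a smooth blur.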