
Basic Image Processing: Image Blur

For this section, if you haven’t yet set up your Python environment, please follow the previous tutorial. To see the full list of tutorials, see the main AI Tutorial Page.

At the top of your script (Image_Process.py), type the following pieces of code:
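The original post shows this code as a screenshot; below is a minimal sketch of what it likely contains. The exact layout and line numbering, the window name, and the use of an outer product to turn the 1-D Gaussian kernel into a 2-D filter are my assumptions.

```python
import cv2            # OpenCV
import numpy as np    # NumPy, imported under the shorthand "np"


def blur_image(im):
    # Build a Gaussian kernel: 20 is the kernel size, 5 the standard deviation
    kernel = cv2.getGaussianKernel(20, 5)
    # Apply the kernel to the entire image (the outer product makes the 1-D kernel 2-D)
    blurred = cv2.filter2D(im, -1, np.outer(kernel, kernel))
    # Name a display window ('Blurred') and show the modified image in it
    cv2.imshow('Blurred', blurred)
    # Pause until a key is pressed
    cv2.waitKey(0)
    return blurred
```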

The first piece of code (lines 1-3) imports the packages that you need. “cv2” refers to OpenCV, while numpy refers to NumPy. Typing “import … as …” doesn’t change the behavior, but it lets you use a shorthand name for a package when you refer to it in code. In this case, we will use the common shorthand for the NumPy library, “np”. Any time you would like to call something from a library or package in Python, you simply type the name that you have imported, then a period, then the function, class, or element you wish to access from that library. Such notation can be seen in all of lines 7-10, where you call the OpenCV library by typing “cv2.getGaussianKernel(20,5)” or “cv2.waitKey(0)”.

The next block of code defines a function. This is achieved using the notation “def name_of_function(parameters):”. Defining a function is useful when it performs an action that you may want to carry out many times; otherwise, it simply makes your code cleaner and more readable. In this case, we are defining a function to blur an image. Therefore, I’ve named the function blur_image, with a single input (im), representing the image that we wish to blur.

Line 7 defines a kernel (which can also just be considered an image filter) that is built into OpenCV. This type of filter is known as a Gaussian, which is commonly used in many fields, but in image processing it is often used to blur images. The two parameters entered into this function (20, 5) are the size of the kernel and the intensity (standard deviation) of the blur, which will affect the final image output (examples later). Line 8 applies this filter to the entire image, using the image and the kernel as inputs. Line 9 names the display window (1st parameter) and shows the modified image (2nd parameter), and line 10 ensures that the program doesn’t continue until you press a key on your keyboard. Notice that all of these lines are calls to functions that are built into OpenCV. This is one of the most powerful things about Python: there are already many existing functions that do exactly what you need them to. The trick is in knowing how and when to use them.

Below this function, type the main block of the script, which is sketched after this paragraph. It first declares the filename of the picture you are hoping to use. In this case, I have used a picture that my wife took of me in Scotland, at the Eilean Donan castle, and so I have named it ‘EileanDonan.jpg’. The quotation marks indicate that this variable is a string, which means that the computer reads it as a sequence of characters (anything that you can type on a keyboard) rather than as a numerical value. Notice that quotation marks are also present in line 9 of the function above. For example, if you were to type: sum = 2+2; print(sum), then the computer would, correctly, print 4. On the other hand, if you typed: sum = ‘2+2’; print(sum), then the computer would print 2+2, as it reads each of those individual components as nothing more than a character. Likewise, if I were to type filename = EileanDonan.jpg, then the computer would throw an error: it would not understand that I wanted it to use this combination of letters, and would instead search for a variable with that name, which it wouldn’t be able to find. In your case, you should find a photo that you want to edit, move it to the same folder as Image_Process.py, and type in the name of the file in quotation marks, as demonstrated above.
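As with the function, the original post shows this block as a screenshot. Here is a minimal sketch of it; the window name and the final call to blur_image are assumptions based on the results described below, and the filename should be replaced with your own image.

```python
filename = 'EileanDonan.jpg'     # replace with the name of your own image file
my_im = cv2.imread(filename)     # read the image into a NumPy array
cv2.imshow('Original', my_im)    # show the original image
cv2.waitKey(0)                   # wait for a key press
blur_image(my_im)                # blur the image using the function defined above
```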

The next line of this block (my_im = cv2.imread(filename)) reads the information from the image file and converts it into a format that can be handled by the program. In this case, the format is essentially a 3-dimensional array with the shape [height x width x 3]. The height and width depend on the particular image that you load, and represent the number of pixels in the vertical and horizontal directions. You can find these numbers by typing print(my_im.shape) after loading the image with this line of code. The 3 refers to the 3 components of color (blue, green, and red) that, when put together, make up the complex colors that we are able to see in the image, and the values in the array represent how much of each color is present for a given pixel. To explore this concept a little more, you can use this tool (not created by me). The next two lines, like lines 9 and 10 of the function, show the image and wait for the user to press a key. Now, run the code by pressing the play button in one of the three places marked with the red boxes (note that the menu bar appears after right-clicking your filename at the top of the screen):

Your code should show you your original image, then a blurred image. Below are my results for the initial image, a blurred image using (5, 5) as input parameters to the Gaussian function (top right), then a blurred image using (20, 5) as input (bottom left), then one using (20, 20) as input (bottom right). Notice how the bottom left picture is much more blurred than the top right because it used a larger kernel to do the blurring, even though the intensity of the blur was the same. Meanwhile, the bottom right picture uses the same size kernel as the bottom left, but a much higher intensity, which results in some perceived double vision, as can be seen most noticeably with the stripes and the glasses.


AI Tutorial: Python (PyCharm) Setup

I’ve primarily written this section for people who are completely new to Python and programming. If you are comfortable with setting up a Python environment, you can skip to Part 2.

For everyone else, first download and install the newest versions of Python and PyCharm. This should be straightforward, but if you have any issues, please consult their websites. I should note that PyCharm is just one of many IDEs that you could use to write and run your Python code, but it is the one that I first became accustomed to when I was learning Python, so I’ve chosen it for this tutorial as well.

Once they are both installed, open PyCharm and start a new project. Name it whatever you would like, as long as you can remember it and find it later. I will name mine “Tutorials”. Click File > New, then select Python File. In the “New Python file” box, type the name of the file (in this case, Image_Process). This will create a new file in your project folder; for me, that is C:\Users\Daniel\PycharmProjects\Tutorials\Image_Process.py.

Now, we need to set up our Python environment. Go to File > Settings and you should see the following window, though yours may have fewer packages listed. First, ensure that your Project Interpreter (shown with a blue box) is correct. This interpreter indicates the version of Python you are using, as well as the packages that you have installed for a particular environment. I have just chosen the Python 3.8 system interpreter, which includes all of the packages installed locally on the system. Another popular option is to create a separate virtual environment for each project, which ensures that projects requiring different versions of a package, or different versions of Python, remain separate and do not interfere with one another. To do this, click the “Settings” wheel on the right side of the blue box, select “Add…”, then the “VirtualEnv Environment” option. However, that is beyond the scope of this tutorial.

Next, we need to actually install the necessary packages within the Project Interpreter. Any time you get a ModuleNotFoundError when running your code, it indicates that you have not installed a package that you are attempting to import, so this is where you will need to come in order to install those packages into your current environment. Even if you have installed a package elsewhere, if it isn’t present in the environment you are currently using, it will not be available to your program. To install a package, click the “+” button indicated by the red box, and search for the package. For the Image_Process tutorial, you’ll need to install “opencv-python” and “numpy” (which may be installed automatically along with opencv-python). OpenCV is one of the most popular libraries used for image processing; originally written in C/C++, it has been adapted to be usable in Python. NumPy, on the other hand, is a package for handling numbers and numerical structures such as arrays and matrices.
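Once the packages are installed, a quick way to confirm that they are visible to your interpreter is to import them and print their versions. This small check is just a suggestion, not part of the tutorial code:

```python
import cv2
import numpy as np

# If no ModuleNotFoundError is raised, both packages are installed in this environment
print(cv2.__version__, np.__version__)
```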

Once you have the necessary packages installed, we are ready to start coding.


A Global Look Back: Phase 1 of Covid-19

Daniel Freer, May 18, 2020

In what appears to be a lull for Covid-19, we are now able to look back on this crisis (or at least the first part of it) with 20/20 vision for the first time, to see what went wrong with various healthcare systems and how these systems can be improved in the future. In this essay, I’m going to assume that the numbers reported by most major media outlets and international organizations are approximately correct. I have also been closely following the numbers on Worldometers.info, though I have recently heard (from CNN, so take it with a grain of salt) that the source behind this site is mysterious and unknown. But that is beside the point.

I’m going to attempt to use two metrics to describe a country’s response to the pandemic: 1) infections per capita; and 2) death rate. Infections per capita (e.g. 58 Covid-19 cases for every 1 million people) generally reflects how well the country did at stopping the virus from spreading. Meanwhile, death rate (deaths divided by the number of infected) generally reflects how well equipped hospitals are to handle a large number of patients. Death from Covid-19 (at least from my understanding) has largely occurred due to overwhelmed and under-equipped hospitals that couldn’t keep up with an increased and intensifying workload.

Let’s look at some numbers (according to Worldometers.info on May 18, 2020):

Country | Total cases (active cases) | Infections per 1M | Death rate
United States | 1,527,664 (1,090,297) | 4,619 | 5.9%
China | 82,954 (82) | 58 | 5.2%
South Korea | 11,065 (898) | 216 | 2.3%
Japan | 16,285 (4,388) | 129 | 4.7%
UK | 243,695 (N/A) | 3,592 | 14.2%
Spain | 277,719 (54,124) | 5,940 | 10.0%
Italy | 225,435 (68,315) | 3,728 | 14.2%
Germany | 176,651 (14,002) | 2,109 | 4.6%
France | 179,569 (90,248) | 2,752 | 15.7%
Cases, infections per capita, and death rate of some key countries during the Covid-19 pandemic

So, from these numbers we can see that the United States has a similar death rate to China, while South Korea’s appears to be the best of all countries with a significant number of cases. Germany’s is the lowest in Europe, though in general Western European death rates are comparatively high. In terms of how far the infection spread, however, the United States was on par with Europe, while Asian countries were clearly much more effective in stopping the spread of the virus within their own communities.

Why is this? Because these Asian governments were able to quickly discover who had the virus and efficiently disseminate information to their people: a clear message about health and safety (wear masks, wash your hands, avoid contact with others). And the people listened to this message and responsibly followed instructions.

Any increase in infections per capita or in death rate can therefore reflect a country’s failure to do these things. The first point (1), discovering who has the virus, should be the job of the healthcare system. The second (2), the dissemination of clear information to people about how to protect themselves, is the job of the government and the media. The third (3) concerns the people and the culture: whether they listen to authority figures, and whether they act responsibly.

1. Discovering who has the virus

A strange and unpredictable disease was first officially reported in Wuhan, China at the end of December. It is known, however, that the virus was circulating before this, both in China and in other parts of the world. But China was the first to identify it as a threat and report its findings. This may have been because the virus did start in Wuhan, but based on the existing evidence, we cannot be 100% sure that this is true, as patient zero has not been found, and likely never will be.

However, imagine for a moment that the United States had no contact with China whatsoever in December and January, and we were completely unaware that there was a disease there. Imagine that a disease started spreading in New York, with the first cases being reported in the middle of March. Imagine that we couldn’t find the direct source of the virus, but attempted to figure it out as several thousand new cases were reported per day. This was the situation in Wuhan, China, which has a similar population to New York City. However, even on this level playing field, if you consider the differences in case numbers between the two states/provinces, the Chinese response was much better, with about 68,000 total cases and 4,500 deaths in Hubei province compared to 360,000 cases and 28,000 deaths in New York. But in reality, it wasn’t a level playing field. New York had a huge advantage in that it was told about the virus several months before its spike in reported cases, and still failed to stop its spread.

So what is the cause of this failure in the US and Europe? My theory: it is because not enough people went to the doctor.

In China, the first people to notice that a unique and dangerous virus was spreading were doctors. They reported it to their superiors, who told them not to say anything publicly until they had more information. Then, when they had more information, it was disclosed publicly. However, doctors can only notice such a trend if a high percentage of the population actually receives medical attention when they need it. This was not the case in either Europe or the US.

In Europe, I attribute the high number of infections per capita to their overwhelmed healthcare systems, as can be seen from their high death rates (except in Germany). Because their hospitals didn’t have enough space for the extreme influx of patients, many people who were sick were forced to stay home, and many who needed to see a doctor were turned away. This caused more interaction between, for example, roommates or family members, which created additional avenues for the virus to spread. While hospitals in Wuhan were similarly overwhelmed, new hospitals were quickly built to handle the overflow, and doctors from other parts of the country relocated to help out. Such a response did not occur in Europe, and would likely be impossible there.

In the US, however, the healthcare system was not as overwhelmed as Western European ones, as can be surmised from the relatively low death rate. But this means that the United States’ high number of infections per capita requires a different explanation. My theory relates to the fact that the US largely has a reactive rather than a proactive healthcare system. Because healthcare in the United States is so expensive, people (especially those who are uninsured) are discouraged from seeing a doctor regularly, or even when a problem does arise. They simply aren’t able to afford it, and their body usually gets better anyway. This leads to increased spread of the virus for the same reason as in Europe: more sick people are around other people rather than in a hospital. In this case, it would also take doctors longer to notice the appearance of a new virus, as the majority of people with the virus may never contact a doctor to get it checked.

2. Dissemination of information

The next important task in handling an epidemic is to get relevant and helpful information out to the people, allowing them to spread the message further and protect themselves. While I know China’s government has received a lot of criticism on this front, the fact remains that they stopped the spread of the virus much more efficiently than both European countries and the United States. Other Asian countries were also very successful in warning the public about the dangers of this disease, which largely stopped its spread before it became a massive problem.

The United States’ messaging on coronavirus has been utterly terrible. First, the initial claim by the CDC (and many other American and European media sources) that masks are not effective, and then a complete reversal of this opinion several months later. The constant fear-mongering of the media and the blaming of the “other side” rather than attempting to actually solve problems. And of course, the President of the United States, who first said that the virus was nothing to worry about, who actively tried to prevent “the numbers” from increasing to avoid a stock market fall, who spouts random accusations at enemies in order to distract, rather than attempting to actually explain what’s happening to the American people. In fact, he has actively spread misinformation on numerous occasions, though we can’t be certain if this is intentional, or just idiocy.

As I haven’t been in Europe at all for this pandemic, I can’t comment too much on their dissemination of information. However, the UK’s initial decision to try to achieve herd immunity and then their quick reversal gave whiplash to the public and likely led to more spread of the disease. And there were also reports in the early stages of the pandemic that some European governments or institutions were actively discouraging the use of masks, then again changed course once they realized that masks could, in fact, prevent further spread of the disease. Both of these were crucial messaging mistakes which worsened the situation for their own people.

3. Culture

East Asians tend to be more cautious about their health, at least in some ways. I am not Asian, but my wife is Chinese, so I have gotten a bit of a glimpse into this thought process during this crisis. For all of the Americans complaining about having to wear a mask: When many Chinese students flew home from the UK (or elsewhere) to China, they wore hazmat suits. They didn’t eat from the time their first flight took off to the time their last flight landed, which was often more than 24 hours, sometimes even more than 40. These measures, while they might be a bit overkill, were almost certainly effective in some way in preventing spread of the virus.

Americans, on the other hand, don’t like being told what to do and have an incredible amount of self-confidence, even when they are wrong. This has led to significant healthcare issues in the United States even before Covid-19 was ever seen, from diabetes to tiger bites (shout out to Saff). Because of this part of American culture, we have seen hundreds of protests around the country demanding that government lockdowns be lifted. These gatherings have reportedly spread the virus further than it otherwise would have spread, and have also been fairly effective in getting businesses to open earlier than they probably should have. We will see whether our indulgent culture can add an element of responsibility during this reopening, or whether the disease will have a second wave even before the first wave has finished.

I don’t want to change American culture. I am American, so I understand it and might even feed into it a bit more than I should. But at the same time, I want Americans, and all people, to be as safe and healthy as possible, as I believe that this leads to more long-term happiness, which leads to prosperity. And sometimes you have to sacrifice your immediate happiness in order to make it easier on yourself later on.

—————————————————————————————–

So in summary, these are the main differences I see between initial responses around the world, and what they could focus on to improve their response to a similar future crisis:

Asian countries have successfully revamped their healthcare systems as a response to SARS in 2003 in order to effectively handle epidemics of this nature. While they are not beyond criticism, we should look at their response to this crisis in an overall positive light, and should look to more closely emulate it should a future epidemic or pandemic occur.  

Western European healthcare systems do not have enough space or resources at their hospitals for the number of people that live there if a crisis occurs. As a result, they should push to build more hospitals, or increase the capacity of their existing hospitals, and additionally should try to incentivize students to pursue healthcare-related professions.

The American healthcare system is too expensive, discouraging frequent doctors’ visits, but we do generally have enough medical resources for our people. Our government and our media have not provided a consistent or even coherent message at times, leading to confusion and misinformation. Lastly, our self-confident and leader-wary culture has only worsened and prolonged this crisis, and feeds into the divided media. So here, the way forward is to focus on building trust between our people and our healthcare and government institutions. One important aspect of this is reducing healthcare costs so that Americans won’t feel like their doctors are robbing them blind, and might be more inclined to go to the doctor if they become ill. The other important aspect is to make our politics less divisive and more inclusive of different people and ideologies, though this also does not seem particularly likely.

I want to emphasize that what I’ve written here are mostly my theories and opinions, and are also only considering 2 metrics and my own knowledge of healthcare systems in order to summarize the initial response of an entire country or region. The true picture is certainly more complex, and the answers aren’t easy. However, I hope that this will give some global perspective about the crisis, and will help us all move forward into a new age for humanity once Covid-19 is no longer a headline.


AI Tutorial: Static Noise Removal

Hello all! Today, I’m going to share and explain some code which I have been using to clean up the audio in videos using Python. This technique is very simple, especially with the tools that are available, but I’ll take some time to explain what is really happening in the code, and how this type of processing and other more complicated methods can be applied to artificial intelligence agents like Alexa. Hopefully it will give you an appreciation of what goes into audio editing, even though this is just a very simple example. Follow this link to access the code directly.

First, install and import all of the necessary packages, which are listed below. These include some standard packages such as scipy, matplotlib, and numpy, but also two packages I had never used before, called moviepy and LibROSA. I encourage you to read up on these packages, as they provide good functionality for manipulating video files (moviepy) and audio files (LibROSA).
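The import list was originally shown as a screenshot; the following is a minimal reconstruction based on the packages named above. The scipy.io.wavfile import is my assumption for writing the cleaned audio later on; the original code may have used a different writer.

```python
import numpy as np
import matplotlib.pyplot as plt
import librosa                                             # audio loading and STFT
from scipy.io import wavfile                               # assumed here for writing the cleaned .wav file
from moviepy.editor import VideoFileClip, AudioFileClip    # video/audio separation and recombination
```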

The first piece of code we will write uses a function VideoFileClip from moviepy.editor to read in an mp4 file and separate it into video and audio clips. This function is mostly just making use of FFMPEG (a different library) to perform this separation, but doing this action through MoviePy is much easier from my perspective, with no major downside. After separating the audio from the video file, we write a new audio (.wav) file, which we will load and use in LibROSA later.
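Since the original snippet isn’t reproduced here, this is a sketch of that first step; the function name and file paths are placeholders.

```python
def extract_audio(video_path='my_video.mp4', audio_path='my_audio.wav'):
    """Separate an mp4 into video and audio, and save the audio as a .wav file."""
    clip = VideoFileClip(video_path)          # MoviePy wraps FFMPEG to read the video
    clip.audio.write_audiofile(audio_path)    # write the audio track to its own file
    return clip
```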

The next step is to load the audio file and convert it into a format that can be mathematically manipulated. To do this, use LibROSA’s load command to read in the file that we just wrote using moviepy. Plot the data to make sure you are getting real and sensible values before continuing. It should look something like the blue graph below:
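A sketch of the loading and plotting step, reusing the placeholder file name from above:

```python
# sr=None keeps the file's native sample rate (44100 Hz for my video)
data, sr = librosa.load('my_audio.wav', sr=None)

# Quick sanity check of the waveform
plt.plot(data)
plt.xlabel('Sample')
plt.ylabel('Amplitude')
plt.show()
```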

Here is where the data processing begins. Create a new function like the one below:
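The function itself appeared as a screenshot in the original post. The sketch below reconstructs it from the variable names given in the explanation that follows (ss, angle, b, mns, sa, sa0); the function name, the output path, and the use of scipy’s wavfile writer are assumptions.

```python
def remove_static_noise(data, sr, noise_len=8192, out_path='my_audio_clean.wav'):
    """Subtract the average static-noise spectrum from every window of the signal."""
    # Short Time Fourier Transform of the full signal (complex-valued)
    stft_full = librosa.stft(data)
    ss = np.abs(stft_full)        # magnitude of each frequency in each window
    angle = np.angle(stft_full)   # phase of each frequency in each window
    b = np.exp(1.0j * angle)      # unit-magnitude phase term, used to restore the phase later

    # Noise profile: average magnitude spectrum of the first noise_len samples (~0.2 s at 44100 Hz)
    ns = np.abs(librosa.stft(data[:noise_len]))
    mns = np.mean(ns, axis=1)

    # Subtract the noise profile from every window, then restore the original phase
    sa = ss - mns.reshape((mns.shape[0], 1))
    sa0 = sa * b

    # Inverse STFT back to the time domain and write the cleaned audio to a new file
    y = librosa.istft(sa0)
    wavfile.write(out_path, sr, y)
    return y
```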

The first step is to compute the Fourier transform of the signal. This is achieved by using the LibROSA stft (Short Time Fourier Transform) command, which splits our audio data into overlapping windows, converts each window to the frequency domain, and returns the values to us. These values are complex (meaning they contain both real and imaginary parts), and so must be separated into their magnitude and angle (phase), which we obtain using the NumPy abs and angle functions.

The frequency (Fourier) domain may be a bit of a complicated topic for those who haven’t studied it, but I will try to summarize it briefly here. Essentially, this domain is based on the idea that if you can record and plot the magnitude of any signal over time or space (for example, temperature over time, or in this case recorded sound over time), then this signal can also be represented in an alternate way: as a mapping of the magnitudes of different frequencies. A slow or gradual change corresponds to a low frequency, while a quick change corresponds to a high frequency. When you convert a signal to the frequency domain, you evaluate how much of the signal corresponds to each single frequency within a wide range (for example, from 0 to 1024 hertz, where hertz (Hz) indicates the number of changes per second). In order to represent a complex signal, you need to combine several, maybe hundreds or thousands, of these individual frequency components. The components combine through simple addition, complementing each other in some instances and canceling in others, depending on each frequency, magnitude, and phase (angle). The phase is used to shift each frequency to the left or right, ensuring that the components combine in exactly the right way in time when the signal is reconstructed. As this is done in the frequency domain, the shift occurs through multiplication with an exponential function (e^(jΘ)), where Θ is the angle computed above.
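As a tiny illustration of this idea (not part of the noise-removal code), a pure 5 Hz sine wave sampled at 1000 Hz shows up as a single spike at 5 Hz when transformed to the frequency domain:

```python
# One second of a 5 Hz sine wave sampled at 1000 Hz
t = np.linspace(0, 1, 1000, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t)

# The magnitude of its Fourier transform peaks at 5 Hz
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / 1000)
print(freqs[np.argmax(spectrum)])   # prints 5.0
```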

The Fourier domain, while it has been used for image processing and in many other mathematical and physical problems, is particularly suited to audio processing, as the main distinction between different sounds is the frequency at which they occur. High pitches have mostly high frequencies, while low pitches have mostly low frequencies. Frequencies that consistently match up after a given number of cycles sound harmonious, while frequencies that are fighting against each other are dissonant. Sounds that don’t have any discernable pitch sound that way because they are the combination of hundreds or thousands of pitches, so no single dominant frequency can be heard. And different combinations are also easily discernable by humans. For example, if you close your eyes and consider the two sounds “p” and “t”, you will likely easily be able to tell them apart. But how could you mathematically define the difference between them? Something like this is only achievable in the frequency domain, and is the basis for all audio processing, including language, music, and many other applications.

So from our code, we have now determined the magnitude (ss) and phase (angle) of each time window of our recorded audio signal. The question now is what we would like to remove. In this case, we are considering a relatively constant static sound that occurs throughout the video. For my videos, if I don’t have a microphone close to the source, there is often static noise that distracts from the most important audio in the video. If we assume that this static noise is constant, then we can also assume that it is present in the first fraction of a second of the video, while other more useful sounds are not. Therefore, if we can characterize this static noise in the frequency domain by analyzing just the first second or less of the video, we can remove these characteristics from the rest of the video.

To do this, we again use the LibROSA stft function, but rather than taking the Fourier transform of the entire video, we only consider the first several thousand datapoints. In this code, I have used 8192 datapoints, but this number could be changed depending on when your desired sound begins or ends in your video. For me, the audio sample rate was 44100 Hz, so 8192 datapoints corresponds to about the first 1/5th of a second of the video. This was enough for my video, but if it doesn’t work for yours, you could consider changing this to include more or fewer datapoints. After computing the STFT of this segment, the average magnitude of each frequency is computed. This frequency profile should be approximately the frequency profile of the static noise. Therefore, we simply subtract the magnitude of the frequencies in the first 0.2 seconds from the magnitude of the frequencies of each window computed previously (sa = ss - mns.reshape((mns.shape[0], 1))). Finally, the modified windows of the original audio are shifted back to their proper phase using sa0 = sa*b, where b is the exponential function defined above. Lastly, the inverse Fourier transform is computed, and the new audio is written to a new file.

The final part of the code is optional, as I discovered that rewriting the video in this way significantly reduced the video quality. As a result, I just combined the original video together with this new audio by using Windows’ built-in video editing tool, and this may be the best option for many people who are reading this. However, to utilize the tools that we have learned today and come out with a complete video, you can replicate the following:

You can load in the new audio file using moviepy’s AudioFileClip, redefine the audio in the main clip (which you can load in the first function we wrote today) through simple assignment, and then rewrite the combined video file.
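A sketch of that final step, reusing the placeholder file names from earlier; as noted above, re-encoding the video this way may reduce its quality.

```python
clip = VideoFileClip('my_video.mp4')              # reload the original video
new_audio = AudioFileClip('my_audio_clean.wav')   # load the cleaned audio written above
clip.audio = new_audio                            # replace the audio track by simple assignment
clip.write_videofile('my_video_clean.mp4')        # write the combined video to a new file
```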

Now, if you compare the original audio file to the cleaned one, you should be able to notice a significantly smaller amount of noise in the latter. You can similarly compare the original video file to the one with cleaned audio, and should find the same thing.

This concludes my tutorial on static noise removal from video files. I hope that you have learned something that you can apply to your own work, and I hope that you enjoyed my code and explanations.

To hear the cleaned audio, you can follow this link to the Youtube video, and can compare it to this song, whose audio has not been cleaned.