X-Ray classification with CNN

MIKE ARMISTEAD
2 min read · Sep 5, 2020


Using a convolutional neural network (CNN), I built a classification model that looks at chest x-rays and determines whether they show pneumonia or healthy lungs.

What is a convolutional neural network and how does it work? First, your data needs to be in a matrix. The CNN slides a smaller matrix (a filter, or kernel) across the matrix of data so that it learns from small segments at a time. It uses what it learns to create a new matrix of data, and from there you can have the model go through the new matrix to learn more, or you can set up a function that classifies the data. How does this work with images? Images are made of pixels, and each pixel has red, green, and blue values that can be stored as matrices so the CNN can go through them and learn.
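To make the sliding step concrete, here is a minimal sketch in plain Python. The 4x4 image and the 2x2 kernel are made-up values for illustration; in a real CNN the kernel values are learned during training rather than picked by hand, and a deep learning library does this work far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the new matrix."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel element-wise with the patch under it, then sum.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale "image": dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The new matrix is largest exactly where the edge sits, which is the kind of local pattern a later layer or the final classifier can build on.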

My dataset consisted of 5,863 chest x-ray images. The x-rays are of pediatric patients between the ages of 1 and 5, and every image was reviewed by 3 experts to confirm whether it shows healthy lungs or pneumonia. The dataset was imbalanced: 62.5% of the images show pneumonia and 37.5% show healthy lungs.
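As a rough illustration of what that imbalance means, the sketch below computes two things from the percentages above: the accuracy a model would get by always guessing "pneumonia", and one common way of weighting classes inversely to their frequency. The post does not say whether class weights were used, so treat the weighting as an assumption rather than part of the actual pipeline.

```python
def inverse_frequency_weights(class_fractions):
    """Weight each class by 1 / (number of classes * class fraction)."""
    n_classes = len(class_fractions)
    return {
        label: 1.0 / (n_classes * fraction)
        for label, fraction in class_fractions.items()
    }


# Class fractions taken from the dataset description above.
class_fractions = {"pneumonia": 0.625, "healthy": 0.375}

# A model that always predicts the majority class scores this accuracy
# on data with the same class split.
baseline_accuracy = max(class_fractions.values())

weights = inverse_frequency_weights(class_fractions)
print(baseline_accuracy)
print({label: round(w, 2) for label, w in weights.items()})
```

Any model worth keeping should beat the 62.5% baseline, and the rarer healthy class gets the larger weight.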

The first thing I needed to do was find out the sizes of the images and change them all to the same size, because the CNN model cannot work with images of different sizes. The average image size was 1327x970, but since I was working on a low-powered computer I resized them all to 128x128.
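The resizing step can be sketched with nearest-neighbour sampling in plain Python. This is only an illustration of the idea: the post does not say which tool or interpolation method was used, the stand-in image is synthetic, and reading 1327x970 as width by height is an assumption. In practice an image library would normally do this.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2D list of pixel values using nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    resized = []
    for r in range(new_height):
        # Map each output row and column back to the closest source pixel.
        src_r = r * old_height // new_height
        row = [image[src_r][c * old_width // new_width] for c in range(new_width)]
        resized.append(row)
    return resized


# Synthetic stand-in for one x-ray at the average size (970 rows x 1327 columns).
original = [[(r + c) % 256 for c in range(1327)] for r in range(970)]

small = resize_nearest(original, 128, 128)
print(len(small), len(small[0]))  # 128 128
```

Note that forcing a non-square image into 128x128 squashes it slightly, and that going from about 1.3 million pixels to 16,384 throws away fine detail in exchange for speed.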

Once my data was cleaned and set up, it was time to build the CNN models. I created 3 different models. The first was a basic model with 4 pooling layers, 3 dense layers, and one dropout layer. The second used data augmentation to manipulate the images so that the model could learn from images that look different. Finally, I combined data augmentation with transfer learning to create my final model.
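To get a feel for what 4 pooling layers do to a 128x128 input, the sketch below tracks the width of the image through the network. It assumes 2x2 pooling with stride 2 and convolutions padded so they do not shrink the image; the post does not state these settings, so the numbers are illustrative.

```python
def sizes_through_pooling(input_size, n_pool_layers, pool=2):
    """Return the side length of the image after each pooling layer."""
    sizes = [input_size]
    for _ in range(n_pool_layers):
        # Each pooling layer keeps one value per pool x pool window.
        sizes.append(sizes[-1] // pool)
    return sizes


sizes = sizes_through_pooling(128, 4)
print(sizes)  # [128, 64, 32, 16, 8]
```

Under these assumptions the dense layers see an 8x8 grid per filter instead of the full 128x128 image, which keeps the model small enough for a modest machine.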

I used 4 evaluation metrics: accuracy, precision, recall, and F1 score. To calculate them we need the counts of correctly predicted sick images, correctly predicted healthy images, False Positives, and False Negatives. A False Positive is a healthy image that the model predicted as sick, and a False Negative is a sick image that the model predicted as healthy. I would rather have the model wrongly flag a healthy patient as sick than the other way around.
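The four metrics follow directly from those four counts. Here is a minimal sketch using the standard definitions, with "sick" as the positive class; the counts in the example are made up and are not the results from my models.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: sick predicted sick, fp: healthy predicted sick,
    tn: healthy predicted healthy, fn: sick predicted healthy.
    Assumes at least one predicted-sick and one actually-sick image.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)  # of the images flagged sick, how many are sick
    recall = tp / (tp + fn)     # of the sick images, how many were caught
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for 100 test images.
metrics = classification_metrics(tp=45, fp=15, tn=35, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Recall is the metric tied to False Negatives, so it is the one to watch when missing a sick patient is the costlier mistake.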

Basic model: Accuracy 78.37%, Recall 0.43, Precision 0.99, F1 0.60

Basic model with data augmentation: Accuracy 87.18%, Recall 0.91, Precision 0.78, F1 0.84

Transfer learning with data augmentation: Accuracy 90.71%, Recall 0.78, Precision 0.97, F1 0.86

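Based on accuracy and F1 score, the transfer learning model with data augmentation is the best model overall, with few False Positives (Precision 0.97) and a moderate number of False Negatives (Recall 0.78). The basic model has the highest Precision (0.99), so it produces very few False Positives, but its Recall of 0.43 means it misses more than half of the pneumonia cases. Since I would rather err toward False Positives, the basic model with data augmentation (Recall 0.91) is the stronger runner-up.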
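Because recall is the share of sick images that were caught, 1 minus recall is the share that were missed. The short sketch below applies that to the recall values reported above; the dictionary keys are just labels chosen for this example.

```python
# Recall values as reported for the three models.
reported_recall = {
    "basic": 0.43,
    "basic + augmentation": 0.91,
    "transfer learning + augmentation": 0.78,
}

# Miss rate (false negative rate) = 1 - recall.
miss_rate = {name: round(1 - recall, 2) for name, recall in reported_recall.items()}

for name, rate in miss_rate.items():
    print(f"{name}: misses {rate:.0%} of pneumonia cases")
```

Seen this way, the basic model misses 57% of pneumonia cases, the augmented basic model 9%, and the transfer learning model 22%.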

