Food Image Classification Using Deep Learning on the Food-101 Dataset
Hi, I found this interesting dataset, Food-101: https://www.vision.ee.ethz.ch/datasets_extra/food-101/
I will be using the fast.ai library to train a deep learning model on this dataset to classify the various dishes.
Getting the Dataset:
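First, the imports. I am assuming fastai v1 throughout; its star import brings in untar_data, URLs and the rest of the vision API used below:
from fastai.vision import *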
path = untar_data(URLs.FOOD)
The untar_data function requires a URL.
To check the URL for a particular dataset, use the below:
URLs.FOOD
We can check the contents of the dataset using path.ls()
Below is the output:
[PosixPath('/root/.fastai/data/food-101/train.json'),
PosixPath('/root/.fastai/data/food-101/labels.txt'),
PosixPath('/root/.fastai/data/food-101/h5'),
PosixPath('/root/.fastai/data/food-101/train.txt'),
PosixPath('/root/.fastai/data/food-101/test.json'),
PosixPath('/root/.fastai/data/food-101/images'),
PosixPath('/root/.fastai/data/food-101/classes.txt'),
PosixPath('/root/.fastai/data/food-101/test.txt')]
As we can see, there is a directory named 'images' which contains all the images.
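We can peek inside it to confirm the images are organised into one subfolder per dish (assuming the standard Food-101 layout; ls() is fastai's helper for listing a directory):
(path/'images').ls()[:5]   # shows a few of the 101 class subfolders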
get_transforms builds the set of transformations we apply to the images (passed as ds_tfms in fastai's data API): random crops and zooms for training, and a centre crop for validation; do_flip=False disables the default horizontal flips. We also resize the images to 224x224, because our pretrained model, ResNet-34, was trained on images of this size. A one-step equivalent using ds_tfms is sketched after the list below.
Below is the implementation:
tfms = get_transforms(do_flip=False)
data = (ImageList.from_folder(path=path)
        .split_by_rand_pct()
        .label_from_folder()
        .transform(tfms, size=224)
        .databunch())
ImageList.from_folder - read the images from the given path
.split_by_rand_pct() - split the dataset into train and validation sets (20% validation by default)
.label_from_folder() - label each image from the name of its parent folder
.transform() - apply the transformations
.databunch() - convert everything to a DataBunch
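For reference, here is a sketch of the equivalent one-step construction using the ds_tfms parameter mentioned earlier. It relies on fastai v1's ImageDataBunch.from_folder factory and should behave the same way as the data block version above:
data = ImageDataBunch.from_folder(
    path,                                   # same dataset root as above
    valid_pct=0.2,                          # random 20% validation split
    ds_tfms=get_transforms(do_flip=False),  # same transforms as before
    size=224,                               # match the pretrained input size
)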
Let’s visualize the data that we just created:
data.show_batch(rows=3, figsize=(7,6))
We also need to normalize the data so the pixel intensities of the three RGB channels match what the pretrained model expects. Since ResNet-34 was pretrained on ImageNet, we shift and scale each channel using ImageNet's channel means and standard deviations. This can be done using the below:
data.normalize(imagenet_stats)
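For reference, imagenet_stats is simply the per-channel (mean, std) pair computed on ImageNet; you can inspect it yourself:
imagenet_stats   # approximately ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])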
To check the total number of categories: data.c
To check the category names: data.classes
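As a quick sanity check, Food-101 should give exactly 101 categories, and the class names are simply the folder names under images/:
assert data.c == 101   # one category per dish in Food-101
data.classes[:3]       # the first few class names, in alphabetical order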
Now let's train a model using transfer learning.
learn = cnn_learner(data, base_arch=models.resnet34, metrics=accuracy)
cnn_learner is a function in the fastai.vision.learner module. We pass it the data to train on, the pretrained ResNet-34 as the base architecture, and accuracy as the metric for judging the model's performance.
Now we will train with fit_one_cycle, which uses the one-cycle policy (a learning-rate schedule that ramps the rate up and then back down over training). The argument is the number of epochs, i.e. the number of times the model will look at every training example. With each epoch the model gets better at recognising the training examples, so one should watch out for overfitting. Here I will train for 4 epochs.
learn.fit_one_cycle(4)
Training produces a set of weights, which we save with the command below:
learn.save('stage1')
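If the notebook restarts, the saved weights can be restored later with the matching load call:
learn.load('stage1')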
Now I will use the ClassificationInterpretation class to generate a confusion matrix and plot the misclassified images. I pass the learner object to its from_learner constructor:
inter = ClassificationInterpretation.from_learner(learn)
First I will plot the images with the highest losses:
inter.plot_top_losses(9, figsize=(15, 11))
Now I will plot the confusion matrix:
inter.plot_confusion_matrix(figsize=(20, 20), dpi=100)
The confusion matrix shows how many times the model predicted each food as its actual category versus some wrong category; correct predictions sit on the diagonal, so everything off the diagonal is a mistake.
You can check the class pairs the model confuses most often by using the below (min_val=2 keeps only pairs that were confused at least twice):
inter.most_confused(min_val=2)
The output is a long list of (actual class, predicted class, count) tuples, sorted by count.
Now I will run the learning rate finder so I can choose a good learning rate, then plot the loss against the learning rate:
learn.lr_find()
learn.recorder.plot()
plt.title("Loss Vs Learning Rate")
The learning rate controls how fast we update the parameters of our model. Here we can see that after 1e-4 the loss keeps on increasing, so we will select our range as 1e-6 to 1e-4.
So I will train my model again, but this time I will specify the learning rate range instead of letting the library use its default:
learn.fit_one_cycle(4, max_lr=slice(1e-6, 1e-4))
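A note on the slice argument: with max_lr=slice(lo, hi), fastai applies discriminative learning rates, giving the earliest layer groups a learning rate of lo, the final group hi, and spreading values in between.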
And as expected, the accuracy of the model improved a bit, from 0.67 to 0.69. It is not a huge jump, but it shows that this technique can squeeze a little more performance out of the model. I will keep looking for ways to improve it and will post an update. Thank you for reading my blog.