Food Image Classification Using Deep Learning on the Food-101 Dataset
Hi, I found this interesting dataset, Food-101: https://www.vision.ee.ethz.ch/datasets_extra/food-101/
I will be using the fast.ai library to train a deep learning model on this dataset to classify the various dishes.
Getting the Dataset:
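First, the imports. I am assuming fastai v1 throughout; its star import brings in untar_data, URLs and the rest of the vision API used below:
from fastai.vision import *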
path = untar_data(URLs.FOOD)
The untar_data function requires a URL.
To check the URL for a particular dataset, use the below:
URLs.FOOD
We can check the contents of the dataset using path.ls()
Below is the output:
[PosixPath('/root/.fastai/data/food-101/train.json'),
PosixPath('/root/.fastai/data/food-101/labels.txt'),
PosixPath('/root/.fastai/data/food-101/h5'),
PosixPath('/root/.fastai/data/food-101/train.txt'),
PosixPath('/root/.fastai/data/food-101/test.json'),
PosixPath('/root/.fastai/data/food-101/images'),
PosixPath('/root/.fastai/data/food-101/classes.txt'),
PosixPath('/root/.fastai/data/food-101/test.txt')]
As we can see, there is a directory named 'images' which contains all the images.
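We can peek inside it to confirm the images are organised into one subfolder per dish (assuming the standard Food-101 layout; ls() is fastai's helper for listing a directory):
(path/'images').ls()[:5]   # shows a few of the 101 class subfolders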
get_transforms builds the set of transformations we apply to the images (passed as ds_tfms in fastai's data API): random crops and zooms for training, and a centre crop for validation; do_flip=False disables the default horizontal flips. We also resize the images to 224x224, because our pretrained model, ResNet-34, was trained on images of this size. A one-step equivalent using ds_tfms is sketched after the list below.
Below is the implementation:
tfms = get_transforms(do_flip=False)
data = (ImageList.from_folder(path=path)
        .split_by_rand_pct()
        .label_from_folder()
        .transform(tfms, size=224)
        .databunch())
ImageList.from_folder - read the images from the given path
.split_by_rand_pct() - split the dataset into train and validation sets (20% validation by default)
.label_from_folder() - label each image from the name of its parent folder
.transform() - apply the transformations
.databunch() - convert everything to a DataBunch
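For reference, here is a sketch of the equivalent one-step construction using the ds_tfms parameter mentioned earlier. It relies on fastai v1's ImageDataBunch.from_folder factory and should behave the same way as the data block version above:
data = ImageDataBunch.from_folder(
    path,                                   # same dataset root as above
    valid_pct=0.2,                          # random 20% validation split
    ds_tfms=get_transforms(do_flip=False),  # same transforms as before
    size=224,                               # match the pretrained input size
)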
Let’s visualize the data that we just created:
data.show_batch(rows=3, figsize=(7,6))
We also need to normalize the data so the pixel intensities of the three RGB channels match what the pretrained model expects. Since ResNet-34 was pretrained on ImageNet, we shift and scale each channel using ImageNet's channel means and standard deviations. This can be done using the below:
data.normalize(imagenet_stats)
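For reference, imagenet_stats is simply the per-channel (mean, std) pair computed on ImageNet; you can inspect it yourself:
imagenet_stats   # approximately ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])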
To check the total number of categories: data.c
To check the category names: data.classes
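As a quick sanity check, Food-101 should give exactly 101 categories, and the class names are simply the folder names under images/:
assert data.c == 101   # one category per dish in Food-101
data.classes[:3]       # the first few class names, in alphabetical order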
Now let's train a model using transfer learning.
learn = cnn_learner(data, base_arch=models.resnet34, metrics=accuracy)
cnn_learner is a function in the fastai.vision.learner module. We pass it the data to train on, the pretrained ResNet-34 as the base architecture, and accuracy as the metric for judging the model's performance.
Now we will train with fit_one_cycle, which uses the one-cycle policy (a learning-rate schedule that ramps the rate up and then back down over training). The argument is the number of epochs, i.e. the number of times the model will look at every training example. With each epoch the model gets better at recognising the training examples, so one should watch out for overfitting. Here I will train for 4 epochs.
learn.fit_one_cycle(4)
Training produces a set of weights, which we save with the command below:
learn.save('stage1')
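If the notebook restarts, the saved weights can be restored later with the matching load call:
learn.load('stage1')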
Now I will use the ClassificationInterpretation class to generate a confusion matrix and plot the misclassified images. I pass the learner object to its from_learner constructor:
inter = ClassificationInterpretation.from_learner(learn)
First I will plot the images with the highest losses:
inter.plot_top_losses(9, figsize=(15, 11))
Now I will plot the confusion matrix:
inter.plot_confusion_matrix(figsize=(20, 20), dpi=100)
The confusion matrix shows how many times the model predicted each food as its actual category versus some wrong category; correct predictions sit on the diagonal, so everything off the diagonal is a mistake.
You can check the class pairs the model confuses most often by using the below (min_val=2 keeps only pairs that were confused at least twice):
inter.most_confused(min_val=2)
The output is a long list of (actual class, predicted class, count) tuples, sorted by count.
Now I will run the learning rate finder so I can choose a good learning rate, then plot the loss against the learning rate:
learn.lr_find()
learn.recorder.plot()
plt.title("Loss Vs Learning Rate")
The learning rate controls how fast we update the parameters of our model. Here we can see that after 1e-4 the loss keeps on increasing, so we will select our range as 1e-6 to 1e-4.
So I will train my model again, but this time I will specify the learning rate range instead of letting the library use its default:
learn.fit_one_cycle(4, max_lr=slice(1e-6, 1e-4))
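A note on the slice argument: with max_lr=slice(lo, hi), fastai applies discriminative learning rates, giving the earliest layer groups a learning rate of lo, the final group hi, and spreading values in between.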
And as expected, the accuracy of the model improved a bit, from 0.67 to 0.69. It is not a huge jump, but it shows that this technique can squeeze a little more performance out of the model. I will keep looking for ways to improve it and will post an update. Thank you for reading my blog.