Training a deep learning model to recognize our cat.

Over the last few years, AI has gotten a huge upgrade and become more accessible to everyone. It can categorize images, understand speech, generate content and even write code. It’s totally understandable if this fast-paced change feels a bit scary or overwhelming, especially when you think about job security or the ethical consequences of AI. As everything in software engineering keeps shifting, understanding how this technology works, whether it ends up working for you or against you, is becoming increasingly important.

I have been coding for a while and my vision of programming has evolved. I think of code as a tool to solve business problems and like other tools, it evolves. Machine learning is another powerful tool in our arsenal, one that we should use and understand. I don’t know what the future looks like, but the best we can do is to continuously sharpen our skills and take advantage of the opportunities that come our way.

This post is about my journey into deep learning. If you’re already quite advanced in the topic, you might not learn anything new. However, if you’re just starting out, I hope you’ll find it interesting and that it inspires you as much as it inspired me.

Getting started with deep learning

In the last week or so, I have started to study more about machine learning and especially deep learning. I might be fashionably late to the party, but I’m really excited about what I’ve been learning so far. I wanted to share that with you.

I’ve been studying the fastai course from Jeremy Howard. The course takes a very practical approach to deep learning. It gets you to build things quickly and gradually understand the concepts behind them, busting myths and misconceptions along the way.

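From day one, the fastai course gets you to build and train models using transfer learning, which means starting from a model that was already trained on a large dataset and adapting it to your own task. Quite quickly, you get a model with really good results from little data.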
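As a rough illustration of that workflow, here is a minimal sketch of transfer learning with fastai. It is not the notebook behind this post: the folder layout (one sub-folder per class under a hypothetical `data/cats` directory), the `resnet18` architecture, the image size and the number of epochs are all assumptions. The fastai import lives inside the function so the snippet can be loaded without the library installed.

```python
from pathlib import Path


def train_cat_classifier(data_dir="data/cats", epochs=3):
    """Fine-tune a pretrained CNN on images sorted into one folder per class.

    Assumed (hypothetical) layout: data/cats/dali/*.jpg, data/cats/other/*.jpg
    """
    # Imported lazily: fastai is only needed when training actually runs.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, error_rate, resnet18, vision_learner,
    )

    # Labels come from the parent folder name; 20% of images are held out.
    dls = ImageDataLoaders.from_folder(
        Path(data_dir), valid_pct=0.2, seed=42, item_tfms=Resize(224)
    )

    # Start from pretrained ImageNet weights instead of a random initialization.
    learn = vision_learner(dls, resnet18, metrics=error_rate)

    # Train the new head first, then unfreeze and train the whole network.
    learn.fine_tune(epochs)
    return learn
```

Calling `train_cat_classifier()` would need fastai installed and images in the assumed folders. Because most of what the network knows comes from the pretrained weights, a few epochs on a small dataset can already go a long way.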

The fastai course is incredibly well-structured, and I’m still going through it. Here’s the thing: I’m still wrapping my head around a lot of these concepts, so I’m not going to deep-dive into details in this post. Instead, I’m excited to share the hands-on experiences and the bits and pieces I’ve picked up along the way.

Today, I want you to try a convolutional neural network (CNN) that I’ve trained to recognize our cat, Dali. It is currently hosted on my personal Hugging Face space but you can try it right from this post. I haven’t reached 100% accuracy on the validation set but it’s quite close. With a bit more data and time, I should be able to get there.

A convolutional neural network is a class of deep neural network that is really good at image classification. Under the hood, it uses convolutions: small filters that slide across the image to detect features such as edges.
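To make the edge detection idea concrete, here is a small sketch in plain Python. The tiny image and the hand-written vertical-edge filter are made up for illustration; in a real CNN the filter values are learned during training. Like most deep learning libraries, the sketch slides the filter without flipping it (strictly speaking, a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A 4x6 image: dark (0) on the left, bright (1) on the right.
IMAGE = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Responds when pixels get brighter from left to right.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

for row in convolve2d(IMAGE, VERTICAL_EDGE):
    print(row)  # each row prints [0, 3, 3, 0]
```

The output is zero over the flat regions and positive only where the filter overlaps the dark-to-bright boundary, which is how the filter "sees" the edge.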

Dali classifier model

Drop an image of any cat and see if the model correctly identifies whether it’s Dali or not. Hugging Face throttles the model, so it might take a while to get a prediction. Locally, it’s way faster.

For your convenience, I’ve included some pictures of Dali as you probably don’t have any on hand. For other cats, you can use Unsplash.

All images provided as examples are from the test set, not the training set. You don’t want to evaluate a model on the same data it was trained on, because the model might have memorized those images instead of learning to generalize. This is known as overfitting.
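Here is a minimal sketch of how images can be kept apart for training and evaluation. The file names, the 20% hold-out fraction and the seed are assumptions for illustration, not the actual split used for the Dali model.

```python
import random


def split_dataset(items, valid_pct=0.2, seed=42):
    """Shuffle items reproducibly and hold out a fraction for evaluation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    # The first n_valid items are held out; the rest are used for training.
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical file names standing in for real photos.
filenames = [f"cat_{i:03d}.jpg" for i in range(10)]
train_files, valid_files = split_dataset(filenames)

print(len(train_files), "training images,", len(valid_files), "held-out images")
```

Fixing the seed keeps the same images held out across runs, so an image never moves from the evaluation side to the training side between experiments.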

Changelog

- 2024-04-06: Re-trained model based on community feedback.
- 2024-04-01: Initial publication.

I’ve iterated on the model a few times to get to this point. Figuring out the proper data augmentation and learning rate helped quite a bit. I hope you’ll get good results.
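To give a feel for why the learning rate matters, here is a toy sketch that has nothing to do with the actual Dali model: plain gradient descent on a one-parameter function whose minimum is at 3. The function, the step count and the three learning rates are assumptions chosen for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3) ** 2 and return the final value of w."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)  # derivative of (w - 3) ** 2
        w -= learning_rate * gradient
    return w


for lr in (0.001, 0.1, 1.1):
    final_w = gradient_descent(lr)
    print(f"lr={lr}: w={final_w:.4g}, distance from minimum={abs(final_w - 3):.4g}")
```

With these settings, the smallest rate barely moves away from the starting point, the middle one lands very close to 3, and the largest one overshoots further on every step and diverges. Real models are far more complex, but the same trade-off applies.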

Reach out to me if you get false positives. I’m always eager to improve the model and fine-tune it again. I’ve open-sourced the Jupyter notebook I used to train the model, so you can train a similar one.

It’s available on my GitHub repo.

Conclusion

Training an image classification model has never been as easy as it is today. I was surprised myself! In fact, you can train other models that understand language, tabular data and more, and get good results without a huge amount of data or very expensive hardware.

The key is to start from a pre-trained model.

I plan on sharing more and more as I experiment with these technologies. I hope you’ll find it interesting and that it will inspire you to try it out yourself ✌️

How about you? Have you tried training a deep learning model before? What was your experience like? Reach out and let me know! I’m always eager to learn from others.

📨 reach out
