1 May 2020, 08:18

Getting Started with Computer Vision (Tutorial)

Computer vision tasks such as image recognition have many useful applications, and seem somewhat magical when you think about them. But many smart folks have worked very hard indeed to make these fantastical techniques accessible. The goal of this post is to show you that this is a skill that you can learn, whatever your current level, and to present some pathways for you to pursue it if you’d like to learn how to solve these kinds of challenges for yourself.

Do I need a fancy computer?

No! Deep learning is very hard without a GPU, but you can access one for free using Google Colab or one of the equivalents. The starter notebook is shared as a Colab link, so you should be able to run it from any computer with an internet connection, using Google’s computing power. Pretty neat :)

What’s the best way to learn?

Different approaches work for different people. Some enjoy a bottom-up approach, learning the fundamentals and slowly working from the underlying mathematics up to a working model. For others, a top-down approach works better. This involves getting your hands on the tools, trying things out and then slowly filling in the gaps where needed. If you’re in the second category, you might prefer to start by trying out the starter notebook or browsing some other past submissions. There are also courses like course.fast.ai that cater to this way of learning. If you’re more formally inclined, we’ve included some resources to go deeper, and will try to keep adding to the list as more suggestions are shared by the community.

So grab the starter notebook and dive in, or browse some of the other resources shared here:

Starter Notebook

The starter notebook runs through an example submission, with extra explanation and a few exercises for you to try yourself.

Submissions shared by the community (masks challenge)

  • #1 Crew_Esi - Fine-tuning a pre-trained model with PyTorch. Check out the NoteBook folder for the bulk of the code.
  • #3 Team CIA
  • Aninda_bitm - Using fastai2 to train densenet201, with test-time augmentation and combining 5 models’ predictions as part of the cross-validation. Some great techniques.
  • Steph_en_m - Building a CNN from scratch. Not as high-scoring as some other approaches, but educational.
  • Nauyan - A very well-organised solution, worth studying as an example of how to share code and present a submission. Built with Keras, training an EfficientNet model.
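Two of the techniques mentioned above, test-time augmentation and combining several models’ predictions, are easier to picture with a small example. The following is only a rough sketch in plain NumPy, not the code from any of the submissions: `make_dummy_model` is a hypothetical stand-in for a trained classifier (a random linear layer plus softmax), the image sizes and class count are made up, and a horizontal flip is assumed as the only augmentation.

```python
import numpy as np


def make_dummy_model(seed, num_features, num_classes=3):
    """Hypothetical stand-in for a trained classifier: random linear layer + softmax."""
    weights = np.random.default_rng(seed).normal(size=(num_features, num_classes))

    def predict(images):
        # Flatten each image and turn the scores into class probabilities.
        logits = images.reshape(len(images), -1) @ weights
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)

    return predict


def predict_with_tta(predict_fn, images):
    """Average predictions over the original batch and its horizontal flip.

    Assumes images are laid out as (batch, height, width, channels).
    """
    views = [images, images[:, :, ::-1, :]]
    return np.mean([predict_fn(view) for view in views], axis=0)


def ensemble_predict(predict_fns, images):
    """Average the test-time-augmented predictions of several models."""
    return np.mean([predict_with_tta(fn, images) for fn in predict_fns], axis=0)


rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))  # 4 fake RGB images of 8x8 pixels
# Five dummy models, standing in for e.g. one model per cross-validation fold.
models = [make_dummy_model(seed, num_features=8 * 8 * 3) for seed in range(5)]
probs = ensemble_predict(models, images)
predicted_classes = probs.argmax(axis=1)
```

With real models you would swap the dummy `predict` functions for your trained networks and probably use a richer set of augmentations. The idea stays the same: averaging over several views and several models tends to smooth out individual mistakes.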

Submissions shared by the community (other challenges)

  • Starter Code for Potholes challenge
  • Wheat Rust entry using fastai, including the use of mixup, by steveoni
  • We know there are more! Share yours with us so that we can add them here :)
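If you are curious about the mixup technique mentioned in the Wheat Rust entry above, here is a minimal sketch of the basic idea in plain NumPy. It is not the fastai implementation or the code from that entry: `mixup_batch` is a hypothetical helper, the value `alpha=0.4` is an assumed setting, and a single mixing weight is drawn per batch for simplicity.

```python
import numpy as np


def mixup_batch(images, labels_one_hot, alpha=0.4, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    The mixing weight lam is drawn from a Beta(alpha, alpha) distribution and
    is applied to both the images and their one-hot labels.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    partner = rng.permutation(len(images))  # which example each one is mixed with
    mixed_images = lam * images + (1 - lam) * images[partner]
    mixed_labels = lam * labels_one_hot + (1 - lam) * labels_one_hot[partner]
    return mixed_images, mixed_labels, lam


rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))      # 4 fake RGB images of 8x8 pixels
labels = np.eye(3)[[0, 1, 2, 1]]       # one-hot labels for 3 classes
mixed_images, mixed_labels, lam = mixup_batch(images, labels, rng=rng)
```

The model is then trained on `mixed_images` against the soft targets in `mixed_labels`, which acts as a form of regularisation.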


Courses

  • Fastai course - an incredible resource. Lesson 1 takes you through image classification and gets to a point where you can start playing around with these challenges. The rest of the lessons fill in the gaps and cover a whole bunch of other deep learning goodness. Great example of the top-down approach.
  • The classic Stanford course taught by Andrew Ng. Well known, and widely used as an example of a more bottom-up approach that has been very helpful to many in the field.
  • Not really a course, but this is a series of PyTorch tutorials that cover some common CV tasks.

Building networks from scratch

  • Francois Chollet (the creator of Keras) just shared two notebooks on the library: Intro for Engineers (practical, applications) and Intro for Researchers (experimenting with custom architectures).
  • Zindian Blessing Magabane shared a blog post about creating an entry for the potholes challenge with a custom network built with Keras.
  • The TensorFlow docs have a very long, detailed section on building an image classification network.

Other tutorials and blogs:

Let us know which of your favourite resources we missed!