For nearly a year, I have been using TensorFlow and thinking about what I can do with it. Today I am glad to announce that I have developed a computer vision model trained on real-world images. It is a classification model for automobiles that can distinguish four kinds of cars. It was trained on a small number of images on an ordinary laptop, a MacBook Air, so you can reproduce it without any extra hardware. The technology behind it is called "deep learning". Let us start the project and go deeper.
1. What should we classify by using images?
This is the first thing to consider when developing a computer vision model. It depends on the purpose of your business. If you are in the healthcare industry, it may be signs of disease in the human body. If you are in manufacturing, it may be images of malfunctioning parts in plants. If you are in the agriculture industry, it may be the condition of farmland, so that poor conditions can be detected. In this project, I would like to use my computer vision model for urban transportation in the near future. I live in Kuala Lumpur, Malaysia, which suffers from huge traffic jams every day. Other cities in ASEAN have the same problem. So we need to identify, predict and optimize car traffic in urban areas. As the first step, I would like to have a computer automatically classify cars in images into four classes.
2. How can we obtain images for training?
This is always the biggest problem when developing a computer vision model with deep learning. To make a model accurate, a massive number of images usually has to be prepared, which is difficult or impossible unless you work at a big company or laboratory. But do not worry. There is a good solution to this problem, called a "pre-trained model". This is a model that has already been trained on a huge number of images, so all we have to do is adapt it to our specific purpose or business use. Pre-trained models are available as open source software. We use ResNet50, which is one of the best pre-trained models in computer vision. With this model, we do not need to prepare a huge volume of images. I prepared 400 images for training and 80 images for validation (100 and 20 images per class, respectively). Then we can start developing our computer vision model!
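To make the idea of adapting a pre-trained model more concrete, here is a minimal sketch in Keras. It is not the code used in this project. The folder names, image size, batch size, optimizer, and the choice to freeze the whole ResNet50 base and train only one new output layer are all assumptions made for illustration.

```python
import tensorflow as tf

NUM_CLASSES = 4            # four kinds of cars, as in the post
IMAGE_SIZE = (224, 224)    # assumption: ResNet50's default input size
BATCH_SIZE = 16            # assumption: small enough for a laptop


def load_split(directory):
    """Load one split; expects one sub-folder of images per class."""
    dataset = tf.keras.utils.image_dataset_from_directory(
        directory, image_size=IMAGE_SIZE, batch_size=BATCH_SIZE)
    # Apply the pixel preprocessing that ResNet50 expects.
    preprocess = tf.keras.applications.resnet50.preprocess_input
    return dataset.map(lambda images, labels: (preprocess(images), labels))


def build_classifier(weights="imagenet"):
    """Frozen ResNet50 base plus one new softmax layer for our classes."""
    base = tf.keras.applications.ResNet50(
        weights=weights,          # "imagenet" loads the pre-trained weights
        include_top=False,        # drop the original 1000-class output layer
        input_shape=IMAGE_SIZE + (3,),
        pooling="avg")
    base.trainable = False        # assumption: keep the pre-trained weights fixed

    inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


# Hypothetical usage; "data/train" and "data/validation" are made-up paths.
# train = load_split("data/train")            # 100 images per class
# validation = load_split("data/validation")  # 20 images per class
# model = build_classifier()
# history = model.fit(train, validation_data=validation, epochs=20)
```

Because the base is frozen, only the small new output layer is trained, which is one reason a few hundred images can be enough to get started.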
3. How can we keep models accurate when classifying the images?
If the model frequently gives wrong classification results, it is useless. I would like to keep accuracy over 90% so that we can rely on the results from our model. Reaching that level usually requires more training. In this training, there are 20 epochs, which take around 120 minutes to complete on my 13-inch MacBook Air. You can see the progress of the training here. This is done with TensorFlow and Keras, as they are our main libraries for deep learning. At the 19th epoch, the highest accuracy (91.25%) is achieved (in the red box). So the model is reasonably accurate!
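The numbers above can be checked with a little arithmetic. The sketch below assumes that the 91.25% figure is measured on the 80 validation images, which the post does not state explicitly. The `best_epoch` helper is hypothetical; it could be applied to a list of per-epoch accuracies such as `history.history["val_accuracy"]` from Keras.

```python
VALIDATION_IMAGES = 80     # 20 images per class, 4 classes
EPOCHS = 20
TOTAL_MINUTES = 120        # approximate training time from the post
BEST_ACCURACY = 0.9125     # highest accuracy, reached at epoch 19

# Assumption: accuracy is measured on the validation images.
correct = round(BEST_ACCURACY * VALIDATION_IMAGES)
minutes_per_epoch = TOTAL_MINUTES / EPOCHS
# With only 80 images, each one moves accuracy by this many percentage points.
points_per_image = 100 / VALIDATION_IMAGES


def best_epoch(accuracies):
    """Return (1-based epoch number, accuracy) for the highest accuracy."""
    index = max(range(len(accuracies)), key=accuracies.__getitem__)
    return index + 1, accuracies[index]


print(correct, "of", VALIDATION_IMAGES, "images correct")   # 73 of 80 images correct
print(minutes_per_epoch, "minutes per epoch")               # 6.0 minutes per epoch
print(points_per_image, "points per image")                 # 1.25 points per image
```

Under that assumption, 91.25% corresponds to 73 of 80 images, so a single image changes the result by 1.25 points. A larger validation set would give a steadier estimate.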
This project shows that our model, trained on only a few images, can keep accuracy over 90%. Whether higher accuracy can be achieved depends on the training images, but 90% is a good starting point, and with more images we can aim for 99% accuracy in the future. If you are interested in classifying something, you can start developing your own model, as only 100 images per class are needed for training. You can collect them yourself and run your model on your own computer. If you need the code I use, you can see it here. Do you like it? Let us start now!
Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.