Recently, Facebook, Pinterest and Instagram have become very popular. Users generate and share a huge number of pictures on them, ranging from human faces to landscapes. To enhance these services, image recognition technology has been developing at an astonishing rate. With this technology, computers can understand what the objects in images are. Today, I would like to re-create a simple image recognition program just by following tutorials on the web.
Image recognition can be done with state-of-the-art “deep learning”. This is one of the latest iterations of computer programming. It sounds so complicated that business personnel may not want to try it themselves. However, because tools for deep learning are provided as open source and good tutorials are available on the web, business people can program simple image recognition themselves even if they have no expertise in computer science. Let me tell you about my experience.
1. Choose a programming framework
There are several frameworks for deep learning. I chose “Torch”, which is provided by Facebook Artificial Intelligence Research and became open source at the beginning of this year. I think it is easy for beginners to learn.
2. Find good tutorials for the theory
To understand the theory behind image recognition, I found excellent tutorials and lectures provided by the Computer Science Department of the University of Oxford (see reference 1). They are a good reference for understanding what deep learning is and how it is applied. Even though the theory is not always required for programming, I recommend watching the lectures before programming in order to grasp the broad picture of image recognition.
3. Program image recognition and see what the computer says
The program itself is provided by the tutorial (see reference 2). The tutorial uses an image dataset with the following classes: ‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’. So the computer should classify each image into one of the 10 classes above. I just copied and pasted the programs provided in the tutorial. It took less than 10 minutes. I ran the program and obtained the results. Then I chose three of the results to see what the computer says. The name of the object above each image is the correct answer. The computer provides its answer as a probability for each class. Therefore the sum of the 10 numbers below each image is close to “1”.
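To illustrate why the 10 numbers add up to about 1, here is a minimal Python sketch of a softmax step, which is one common way to turn a network's raw scores into probabilities. This is not the tutorial's Torch code: the function name `softmax` and the raw scores are assumptions made up for illustration, and only the class names come from the dataset described above.

```python
import math

# The 10 classes named in the tutorial's dataset.
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores, one per class (not real tutorial output).
scores = [0.1, 0.3, 0.9, 1.2, 0.4, 1.0, 2.3, 0.2, -0.5, 0.0]
probs = softmax(scores)

for name, p in zip(CLASSES, probs):
    print(f"{name:10s} {p:.4f}")
print("sum =", round(sum(probs), 6))
```

Whatever scores go in, the outputs are all positive and sum to 1, so they can be read as the computer's confidence in each class.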
In this result, the correct answer is “frog”. In the computer's answer, frog has the highest probability, 0.4749…. So the computer made a good guess!
In this result, the correct answer is “cat”. In the computer's answer, cat has the highest probability, 0.3508…. So the computer made a good guess!
In this result, the correct answer is “automobile”. In the computer's answer, automobile has the highest probability, 0.3622…. So the computer made a good guess!
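The way I read each result above can be sketched in a few lines of Python: take the class with the highest probability as the computer's guess and compare it with the correct answer. The helper name `top_prediction` and the probability values are hypothetical and chosen only for illustration; they are not the actual numbers from my run.

```python
# The 10 classes named in the tutorial's dataset.
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]


def top_prediction(probs, classes=CLASSES):
    """Return the class with the highest probability and that probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return classes[best], probs[best]


# Hypothetical probabilities for one image (made up, not real output).
probs = [0.02, 0.03, 0.10, 0.12, 0.08, 0.09, 0.47, 0.05, 0.02, 0.02]
correct_answer = "frog"

label, p = top_prediction(probs)
is_good_guess = (label == correct_answer)
print(f"guess: {label} ({p:.2f}), correct: {is_good_guess}")
```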
Although this program is not perfect in terms of accuracy across the whole set of test results, it is a reasonable way to learn how to program image recognition.
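For reference, accuracy over the test results is simply the share of images where the computer's guess matches the correct answer. The sketch below shows that calculation in Python; the function name `accuracy` and the two short lists are hypothetical examples, not figures from the tutorial.

```python
def accuracy(predicted, actual):
    """Fraction of images whose predicted class equals the correct class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one result")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical guesses and correct answers for five test images.
predicted = ["frog", "cat", "automobile", "dog", "ship"]
actual = ["frog", "cat", "automobile", "cat", "airplane"]

score = accuracy(predicted, actual)
print(f"accuracy: {score:.0%}")
```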
You may not be a computer scientist. However, it is worth programming this image recognition yourself, because doing so helps you understand how it works with state-of-the-art deep learning. Once you have done it, you no longer need to think of image recognition as a “black box”. That is beneficial for you in the age of the digital economy.
Yes, Torch and the tutorials are free. No fee is required. Why not try it as a hobby?
1. Machine Learning: 2014-2015, Nando de Freitas, the Computer Science Department of University of Oxford https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/
2. Deep Learning with Torch – A 60-minute blitz
Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy. The information in this article is rendered as at publication date and may change without notice, and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.
Data from third-party sources may have been used in the preparation of this material and I, the author of the article, have not independently verified or validated such data. I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.