Predictive analytics 101-1: Introduction

What can you learn in this course?

It would be wonderful if we could predict the price of wine through data analysis. In this course I would like to explain methods for predicting the price of products or who is likely to default. Basic skills are also taught from the ground up, so that we can implement simple statistical models for the calculations. Once you are familiar with these methods, you can apply them to a variety of business cases, because they are simple yet powerful enough to solve real problems. The R language is used to implement the models and calculations. This course is for business managers who do not specialize in data analysis. Beginners in data analysis are also welcome, as no prior experience or knowledge of data analysis is needed. I hope you enjoy this course!


Target, features and statistical model

This is the key chart for our goal: "Predictions". Let me explain it so that you get the big picture.


What are you interested in predicting? The revenue of your business? The number of customers? The satisfaction rate based on client feedback? The future price of wine? You can name anything you want. We call it the “Target”. So the first step in any prediction is to define the target, so that the prediction helps you make better business decisions.



Secondly, let us find things related to your target. We call them “features”. For example, if you are a salesperson interested in who is likely to buy your products, features could be the attributes of each customer, such as age, sex and occupation; the behavior of each customer, such as how many times he or she comes to the shop per month and when he or she last bought a product; what he or she clicked in the web shop; and so on. If you are interested in the price of wine, features may be temperature, amount of rain, the locations of farms, and so on. These are just simple examples. In reality, the number of features may be 100, 1,000 or more. It depends on the data you have. Usually, the more relevant data you have, the more accurate your predictions are. This is why data is so important for making predictions.


Statistical models

Finally, all features are fed into the statistical model, which produces the target as its output. The computer performs this calculation automatically, so you need not worry about the calculation itself. If you can predict the price of wine, you might make good investments in wine. If you can predict who is likely to buy your products, you can send coupons or tickets to the “highly likely to buy” customers in order to increase your sales. That is good, isn't it?
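To make the flow from features to target a little more concrete, here is a minimal sketch in R. It is only an illustration under stated assumptions: the data frame wine, its columns temperature, rain and price, and every number in it are made up for this example, and a linear regression fitted with lm() simply stands in for "the statistical model". The models actually used in the course may differ.

```r
# Hypothetical toy data: every number below is made up for illustration only.
wine <- data.frame(
  temperature = c(15.0, 16.5, 17.2, 16.0, 17.8, 15.5, 16.8, 17.5),  # feature: average temperature
  rain        = c(600, 480, 420, 550, 380, 620, 450, 400),          # feature: amount of rain
  price       = c(20, 31, 38, 26, 45, 18, 33, 41)                   # target: price of wine
)

# Statistical model (an assumption for this sketch): a linear regression
# that relates the target to the two features.
model <- lm(price ~ temperature + rain, data = wine)

# Features of a new, unseen case go in; a predicted target comes out.
new_vintage <- data.frame(temperature = 17.0, rain = 430)
predicted_price <- predict(model, newdata = new_vintage)
print(predicted_price)
```

The three parts map directly onto the explanation above: the price column is the target, the temperature and rain columns are the features, and lm() plays the role of the statistical model.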


The set of three things above is called a "module". Target, features and statistical models will be explained step by step in this course. I recommend that you memorize this module, because it is used many times from now on.





R is a statistical computing language used widely by professionals all over the world. Anyone can use R free of charge, even though it has incredibly rich functionality! Please watch this short movie. If you want to see R programming, please look at this platform.



If you set up R on your PC, you will be able to understand predictive analytics more practically and effectively.

Please go to the R-project site and download R onto your PC. The video below shows how to download R.



See you in the next lecture!


R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 URL


August 31, 2015: The course is released


Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy. The information in this article is rendered as at publication date and may change without notice, and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of the article, have not independently verified or validated such data. I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.