**Predictive analytics 101-5**: Logistic regression with R

## WHAT CAN YOU LEARN IN THIS LECTURE

In this lecture, I would like to focus on how to obtain parameters, or weights. Once the parameters are obtained, we can predict the value of the target and use it to make better business decisions. Here, the target is the probability of default for each customer. The statistical computing language R is used to calculate the values of the parameters. Let us start now!

## PARAMETERS SHOULD BE OBTAINED TO PREDICT THE PROBABILITY OF DEFAULT, BUT HOW?

1. Recall what parameters are

In the last lecture, I explained parameters. Parameters are the weights of the corresponding features. Please look at this chart again: θ is a parameter, x is a feature and y is the target. In this lecture, the target is the probability of default (PD).
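To make the roles of θ, x and y concrete, here is a minimal sketch of how a logistic regression model turns parameters and features into a probability. It assumes the standard logistic form; the values of `theta` and `x` are hypothetical and chosen only for illustration.

```r
# Logistic (sigmoid) function: maps any real number to a value between 0 and 1.
logistic <- function(z) 1 / (1 + exp(-z))

# Hypothetical parameters and feature values (not taken from the lecture's data).
theta <- c(-2.0, 3.0)  # theta[1] is the intercept, theta[2] is the weight of the feature
x     <- c(1, 0.5)     # the leading 1 pairs with the intercept

# Weighted sum of the features, then squashed into a probability.
pd <- logistic(sum(theta * x))
print(pd)
```

The larger the weighted sum, the closer the predicted probability of default gets to 1.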

2. Collecting data

As I explained in the last lecture, data is very important in practice. Here is the data used in this analysis. It is the same data that I explained in the last lecture. Once the data is collected, I input it into R so that the computer can learn from it. I name it "datapd".
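As a rough sketch of this step, the data can be entered into R as a data frame. Only the name `datapd` comes from the lecture; the column names and all values below are hypothetical stand-ins, not the lecture's actual data.

```r
# Hypothetical stand-in data: one row per customer.
# "default" is 1 if the customer defaulted and 0 otherwise.
datapd <- data.frame(
  customer   = paste0("customer_", 1:10),
  debt_ratio = c(0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.55, 0.60, 0.80),
  default    = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)
)

# Check the structure and the first few rows.
str(datapd)
print(head(datapd))
```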

3. Learning data in the model

Let us make the computer learn from the data so that the parameters can be obtained. This time, I use the logistic regression model, which is one of the most widely used statistical models. To do that, I write only one line in R!
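The original line is not reproduced here, so the following is only a sketch of what such a one-line fit typically looks like with R's built-in `glm()` function. The data and the column names `default` and `debt_ratio` are hypothetical.

```r
# Hypothetical stand-in for the lecture's data; the column names are assumptions.
datapd <- data.frame(
  default    = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  debt_ratio = c(0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.55, 0.60, 0.80)
)

# The one line that fits the logistic regression model.
# family = binomial selects logistic regression (logit link by default).
fit <- glm(default ~ debt_ratio, data = datapd, family = binomial)
print(fit)
```

The formula `default ~ debt_ratio` reads as "explain `default` by `debt_ratio`"; more features can be added on the right-hand side with `+`.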


4. Obtain the parameters

To predict the probability of default, the parameters must be obtained first. There is no need to worry. R calculates them automatically! The result is as follows. This is the output from R. "Coefficients" in the output means parameters.
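As an illustration of where the coefficients appear, the sketch below fits a model to hypothetical data and extracts the parameters with `coef()` and `summary()`. The data and column names are assumptions, so the numbers will differ from the lecture's output.

```r
# Hypothetical stand-in data and model fit (column names are assumptions).
datapd <- data.frame(
  default    = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  debt_ratio = c(0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.55, 0.60, 0.80)
)
fit <- glm(default ~ debt_ratio, data = datapd, family = binomial)

# The estimated parameters: an intercept plus one weight per feature.
theta <- coef(fit)
print(theta)

# The coefficient table also shows standard errors, z values and p-values.
print(summary(fit)$coefficients)
```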

## PREDICT PROBABILITY OF DEFAULT

Finally, we can predict the probability of default.

The numbers above are the probabilities of default predicted by R automatically. Let us compare them with the data.

Can you see the probability of default for "Steeve"? It is "0.00" in the table above. This is in line with the prediction by the computer, which is 4.536137e-11, because 4.536137e-11 is close to 0. The value for each customer is in line with the corresponding prediction by the computer. Could you confirm this by yourself?
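A comparison like this one can be sketched in R with `predict()`. The data and column names below are hypothetical, so the predictions will not reproduce the values quoted above; the point is only to show the mechanics.

```r
# Hypothetical stand-in data and model fit (column names are assumptions).
datapd <- data.frame(
  default    = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  debt_ratio = c(0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.55, 0.60, 0.80)
)
fit <- glm(default ~ debt_ratio, data = datapd, family = binomial)

# type = "response" returns probabilities rather than log-odds.
pd_hat <- predict(fit, type = "response")

# Put the observed outcome next to the predicted probability for each customer.
comparison <- data.frame(
  observed  = datapd$default,
  predicted = round(pd_hat, 2)
)
print(comparison)
```

Without `type = "response"`, `predict()` returns values on the log-odds scale, which are not probabilities.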

If you are interested in the details of the R scripts, you can refer to this awesome site!

I hope you enjoyed predicting the probability of default in this lecture. See you again!

October 18, 2015

Notice: TOSHI STATS SDN. BHD.