January, 2018

Introduction

  • Day 1 - Getting started
  • Day 2 - Functions & Spark
  • Day 3 - Tidyverse
  • Day 4 - Plotly
  • Day 5 - Shiny Introduction
  • Day 6 - Reactivity
  • Day 7 - Modules
  • Day 8 - Shiny Project

Day 1 - Getting started

Day 1 - Agenda

  • Why R?
  • Installing R
  • Cloning git repo
  • Installing Packages
  • Finding Help

Why R?

Installing R & RStudio

  • Go to your internet browser and download R
  • Also, download RStudio
  • Run the R exe and wait for it to finish installing
  • Once done, go ahead and install RStudio

Installing Git & TortoiseGit

  • Go to your internet browser and download Git
  • Also, download TortoiseGit
  • Install Git and then TortoiseGit

Set up Git proxy

Right-click > TortoiseGit > Settings > Network. Then enter your username and password.

Clone git repo

Installing Packages

  • To install a package, run install.packages("package_name")
  • Let's try it by installing the most important package we will use later in the course: tidyverse
  • Tidyverse is a set of packages that makes data science easy
  • install.packages("tidyverse")
  • You can also click the Install button on the Packages tab, located in the bottom-right pane
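
The install step above can be wrapped so that it only runs when a package is missing. This is a minimal sketch: `install_if_missing` is a hypothetical helper name that is not part of R or the course material, and it assumes a CRAN mirror is reachable (possibly through the proxy).

```r
# Hypothetical helper (not part of base R): install a package only if it
# is not already available in the current library paths.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  install.packages(pkg)       # downloads from the configured CRAN mirror
  invisible(TRUE)             # signals that an install was attempted
}
```

Calling `install_if_missing("tidyverse")` and then `library(tidyverse)` installs the package on the first run and simply loads it on later runs.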

Finding Help

  • To find help on a function or package that you have already installed, go to the Help tab on the bottom right and search for a package name or function in the search box
  • Alternatively, you can run ??function_name, e.g. ??tidyverse
  • There are also many 'cheatsheets' available that give you the need-to-know information on some of the most popular packages. Some can be found under Help > Cheatsheets
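
The help commands can also be typed directly in the console. The sketch below uses `mean` and the `stats` package purely as example topics; they were chosen for illustration and are not taken from the slides.

```r
?mean                    # open the help page for one function, same as help("mean")
help("mean")             # explicit form, convenient inside scripts
??tidyverse              # search all installed documentation, same as help.search("tidyverse")
help(package = "stats")  # list everything a package documents
example("mean")          # run the examples shown at the bottom of a help page
```

A single `?` needs the exact name of a function from a loaded package, while `??` searches titles and keywords across every installed package, so it is the better choice when you do not know the exact name.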

Exercise 1

Use the Base R cheatsheet to generate loan data. Each account has:

  • id - which is used to identify a specific account (just make it account-1 for now)
  • segment - which is randomly generated from (A,B,C)
  • start_balance - which is randomly generated between (5000,6000)
  • interest_rate - an annual rate randomly generated between (5%,20%), converted to a monthly rate
  • n - the term of the loan, fixed at 60 months
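
One possible way to generate these attributes with base R is sketched below. It rests on a few assumptions that the exercise does not state: the seed value, uniform draws for the balance and the rate, and a simple division by 12 to convert the annual rate to a monthly one.

```r
set.seed(42)  # assumption: fixed seed so the random draws are reproducible

id            <- "account-1"
segment       <- sample(c("A", "B", "C"), size = 1)  # one segment picked at random
start_balance <- runif(1, min = 5000, max = 6000)    # assumption: uniform draw
annual_rate   <- runif(1, min = 0.05, max = 0.20)    # assumption: uniform draw
interest_rate <- annual_rate / 12                    # assumption: simple monthly conversion
n             <- 60                                  # loan term in months
```

A compounding conversion, `(1 + annual_rate)^(1 / 12) - 1`, would be an equally reasonable reading of "converted to a monthly rate".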

Exercise 1

  • calculate the monthly installment, where \(r\) is the monthly interest rate, as:
    \(pmt=start\_balance\times \frac{r\times(1+r)^n}{(1+r)^n-1}\)
  • calculate the contractual outstanding balance as:
    \(balance\_contractual_0=start\_balance\times(1+r) - pmt\)
    \(balance\_contractual_t=balance\_contractual_{t-1}\times(1+r) - pmt\)
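
The two formulas can be sketched in R as follows. The input values are made up for illustration, and because R vectors are 1-indexed, element 1 of `balance_contractual` holds the balance the formula labels with index 0.

```r
# Illustrative inputs (assumptions, not values from the exercise data)
start_balance <- 5500
r <- 0.12 / 12   # monthly interest rate
n <- 60          # term in months

# Monthly installment from the annuity formula
pmt <- start_balance * (r * (1 + r)^n) / ((1 + r)^n - 1)

# Contractual balance: accrue one month of interest, then subtract the installment
balance_contractual <- numeric(n)
balance_contractual[1] <- start_balance * (1 + r) - pmt
for (t in 2:n) {
  balance_contractual[t] <- balance_contractual[t - 1] * (1 + r) - pmt
}
```

A quick sanity check is that the last element is zero up to rounding error, because the installment is defined so that the loan is fully repaid after `n` payments.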

Exercise 1

  • calculate a probability of default term structure as:
    PD<-dlnorm(seq(0.05,3,by = 0.05), meanlog = 0, sdlog = 1, log = FALSE)/6

Loop over \(t \in \{1,\dots,n\}\) and create vectors for:

  • a flag indicating whether the person made the payment: \(made\_payment_t=I(1-PD_t \ge U)\), where \(I\) is the indicator function and \(U\) is a random draw from a uniform distribution on (0,1)

  • actual outstanding balance as:
    \(balance\_actual_0=start\_balance\times(1+r) - pmt\times made\_payment_0\)
    \(balance\_actual_t=balance\_actual_{t-1}\times(1+r) - pmt\times made\_payment_t\)
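
A minimal sketch of this loop is shown below. It assumes that a fresh uniform draw is taken for every month and uses made-up loan inputs. As R vectors are 1-indexed, element 1 corresponds to the first period in the formulas.

```r
set.seed(1)  # assumption: fixed seed for reproducibility

# Illustrative inputs (assumptions, not values from the exercise data)
start_balance <- 5500
r <- 0.12 / 12
n <- 60
pmt <- start_balance * (r * (1 + r)^n) / ((1 + r)^n - 1)

# Probability of default term structure, one value per month
PD <- dlnorm(seq(0.05, 3, by = 0.05), meanlog = 0, sdlog = 1, log = FALSE) / 6

made_payment   <- integer(n)
balance_actual <- numeric(n)
previous       <- start_balance

for (t in seq_len(n)) {
  U <- runif(1)                                # assumption: independent draw each month
  made_payment[t] <- as.integer(1 - PD[t] >= U) # 1 = paid, 0 = missed
  balance_actual[t] <- previous * (1 + r) - pmt * made_payment[t]
  previous <- balance_actual[t]
}
```

The same flags can be produced without a loop as `as.integer(1 - PD >= runif(n))`. Only the balance needs the loop, because each value depends on the previous one.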

Exercise 1

Create a dataset containing the columns:

  • id
  • segment
  • start_balance
  • interest_rate
  • n
  • age as a vector from 1 to n
  • pmt
  • balance_contractual
  • made_payment
  • balance_actual
  • arrears_amt as balance_contractual - balance_actual
  • cd_bucket as -ceiling(arrears_amt/pmt)
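
Putting the pieces together, one way to assemble the dataset with a base R `data.frame` is sketched below. The seed, the uniform draws, the division by 12 and the object name `loans` are assumptions made for illustration, so the output will not match the example file exactly.

```r
set.seed(2018)  # assumption: fixed seed for reproducibility

# Account attributes
n             <- 60
id            <- "account-1"
segment       <- sample(c("A", "B", "C"), size = 1)
start_balance <- runif(1, min = 5000, max = 6000)
interest_rate <- runif(1, min = 0.05, max = 0.20) / 12  # assumption: annual rate / 12
r             <- interest_rate

# Installment, default term structure and payment flags
pmt <- start_balance * (r * (1 + r)^n) / ((1 + r)^n - 1)
PD  <- dlnorm(seq(0.05, 3, by = 0.05), meanlog = 0, sdlog = 1, log = FALSE) / 6
made_payment <- as.integer(1 - PD >= runif(n))

# Contractual and actual balances, month by month
balance_contractual <- numeric(n)
balance_actual      <- numeric(n)
prev_c <- start_balance
prev_a <- start_balance
for (t in seq_len(n)) {
  balance_contractual[t] <- prev_c * (1 + r) - pmt
  balance_actual[t]      <- prev_a * (1 + r) - pmt * made_payment[t]
  prev_c <- balance_contractual[t]
  prev_a <- balance_actual[t]
}

# Arrears are zero or negative here, so the minus sign makes cd_bucket non-negative
arrears_amt <- balance_contractual - balance_actual
cd_bucket   <- -ceiling(arrears_amt / pmt)

# Scalars are recycled to n rows, giving one row per month of the loan
loans <- data.frame(
  id, segment, start_balance, interest_rate, n,
  age = seq_len(n),
  pmt, balance_contractual, made_payment, balance_actual,
  arrears_amt, cd_bucket,
  stringsAsFactors = FALSE
)
```

Running `head(loans)` shows the first few months. Generating several accounts would mean wrapping this in a function of `id` and binding the results with `rbind`.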

Example solution

See Data > transition_data.csv for example output.