ECOM151 Big Data Applications for Finance
- easygpaser
- Jun 10, 2022
- 2 min read
1. Elements of Statistical Learning.
(a) (5 points) Consider the general formulation of a machine learning model, Y = f(X) + ε. We know that the mean squared error can be decomposed into a “reducible” and an “irreducible” error term. Under which circumstances is the “reducible” error zero?
(b) (5 points) Can the “irreducible” error term be set to zero? Why?
(c) (5 points) Highlight the pros and cons of parametric vs non-parametric machine learning methods and provide some examples of both classes of methods.
(d) (5 points) The bias-variance trade-off can be understood through the lens of the Mean Squared Error. Why is that so?
Total for Question 1: 20
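Parts (a), (b) and (d) all turn on the same pointwise decomposition of the mean squared error. A sketch in LaTeX, assuming the model Y = f(X) + ε with E[ε] = 0 and Var(ε) = σ², evaluated at a fixed test point x₀:

```latex
\mathbb{E}\big[(Y - \hat{f}(x_0))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2}_{\text{squared bias}}
  + \underbrace{\operatorname{Var}\big(\hat{f}(x_0)\big)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The first two terms together form the reducible error: it is zero only when the estimate f̂ recovers f exactly at x₀. The last term, σ², is the variance of the noise ε and cannot be removed by any choice of f̂, which is the point of parts (a) and (b); the tension between shrinking the bias term and inflating the variance term is the trade-off asked about in part (d).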
2. Penalised regressions.
(a) (5 points) Consider the typical linear regression model y = β0 + β1x1 + · · · + βKxK + ε.
We usually estimate the parameters of this regression via least squares.
What do we mean by “risk of overfitting” when the number of predictors K increases?
(b) (5 points) What do we mean by “regularization” within the linear regression context outlined above? Describe the Ridge and the Lasso penalized regression models.
(c) (5 points) Both the ridge and the lasso shrink the regression parameters towards zero. However, there are some important differences between the two methods. What are these differences?
(d) (5 points) What do we mean by the fact that the ridge and the lasso regressions are not “scale invariant”? What could be the problem for estimation and how do we fix it?
Total for Question 2: 20
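The contrast asked about in parts (c) and (d) can be seen directly in a minimal scikit-learn sketch on synthetic data (all variable names, penalty strengths, and coefficient values below are illustrative, not from the exam): the lasso sets irrelevant coefficients exactly to zero while the ridge only shrinks them, and predictors are standardized first because neither penalty is scale invariant.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, k = 100, 10
X = rng.normal(size=(n, k))
# only the first two predictors matter; the remaining eight are pure noise
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=n)

# standardize predictors first: the penalty term treats all coefficients
# symmetrically, so coefficients on differently scaled predictors would
# otherwise be penalized unevenly
X_std = StandardScaler().fit_transform(X)

ridge = Ridge(alpha=10.0).fit(X_std, y)   # L2 penalty: shrinks, never zeroes
lasso = Lasso(alpha=0.5).fit(X_std, y)    # L1 penalty: performs variable selection

print("ridge coefficients set to zero:", int(np.sum(ridge.coef_ == 0)))
print("lasso coefficients set to zero:", int(np.sum(lasso.coef_ == 0)))
```

Because the lasso's soft-thresholding zeroes any coefficient whose least-squares estimate is small relative to the penalty, the eight noise predictors drop out entirely, which is the variable-selection property that distinguishes it from the ridge.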

3. Decision tree methods.
(a) (5 points) What is the main difference between “regression” and “classification” trees?
(b) (5 points) We mentioned in class two alternative loss functions for splitting in classification trees. What are these loss functions?
(c) (5 points) Can we apply regularization techniques when estimating a regression tree? If yes, which feature of the tree should we penalize?
(d) (5 points) Explain what the “Bagging” procedure is, its logic, and why it can help improve forecasts from tree methods.
Total for Question 3: 20
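The logic behind part (d) can be checked empirically with a minimal scikit-learn sketch on synthetic data (the data-generating process and all parameter values are illustrative assumptions): a single unpruned tree is a low-bias but high-variance forecaster, and averaging many trees fit on bootstrap resamples reduces that variance.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0, 6, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=400)  # smooth signal + noise
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

# a fully grown tree overfits the noise in the training sample
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# bagging: fit 200 trees, each on a bootstrap resample, and average
# their predictions (BaggingRegressor's default base learner is a tree)
bag = BaggingRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

mse_tree = mean_squared_error(y_te, tree.predict(X_te))
mse_bag = mean_squared_error(y_te, bag.predict(X_te))
print(f"single tree test MSE: {mse_tree:.3f}")
print(f"bagged trees test MSE: {mse_bag:.3f}")
```

Averaging leaves the (low) bias of the deep trees roughly unchanged while shrinking the variance of the forecast, so the bagged ensemble's test MSE comes out below the single tree's.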
4. Non-linear methods and classification.
(a) (5 points) Suppose we want to forecast individual wages based on three alternative predictors, namely age, year and education. Taken individually, their relationships with wage are as in Figure (1):

If we want to consider all three predictors in a single non-linear regression model, which methodology should we implement?
(b) (5 points) What is a Generalised Additive Model (GAM)? What is the main difference with respect to a linear regression model? Why are GAMs called “additive”?
(c) (5 points) Suppose we want to forecast the probability of default for a series of households based on their income and bank account balance. The data are reported in Figure (2):

Based on these data, what kind of relationship should we expect between the probability of default of an individual and income and account balance? Which predictor likely matters the most?
(d) (5 points) Take the ROC curves obtained from a logistic regression, a classification tree and a random forest, as shown in Figure (3):

Which method seems to give the best forecasting accuracy?
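The comparison in part (d) can be reproduced in a minimal scikit-learn sketch. The default data here are synthetic and the coefficients of the data-generating process are illustrative assumptions (chosen so that, as in part (c), the account balance drives default far more than income); each classifier is scored by the area under its ROC curve.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 2000
income = rng.normal(50, 15, n)       # hypothetical predictor: income (thousands)
balance = rng.normal(1000, 400, n)   # hypothetical predictor: account balance
# default probability driven mostly by balance (assumed coefficients)
logit = -4 + 0.004 * balance - 0.01 * income
p = 1 / (1 + np.exp(-logit))
y = rng.binomial(1, p)
X = np.column_stack([income, balance])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
# AUC = area under the ROC curve, computed from predicted default probabilities
aucs = {name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()}
print(aucs)
```

The method whose ROC curve lies closest to the top-left corner, equivalently the one with the largest AUC, is the one with the best forecasting accuracy; on data that are genuinely logistic like these, the logistic regression is hard to beat, while on more non-linear data the random forest typically pulls ahead of the single tree.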