Part 1 - Python Programming
Multiple Linear Regression Analysis
1) Download the file ‘bmi’ from the homework folder on the course Blackboard.
Read in the file and display the first 5 observations.
2) Draw the histograms and pairwise scatter plots of all numeric columns using seaborn pairplot.
3) Check the pairwise correlation coefficients between variables. Interpret the result and state which variables have significant linear relationships with ‘BodyFatSiriEqu’.
4) Use ‘BodyFatSiriEqu’ as output variable and all other columns as input variables to
develop a regression model. Write the regression model equation.
5) Identify all significant coefficients of predictors at α=0.05 level using t-tests.
6) Interpret the meaning of the coefficients. Compare the regression model with the pairwise correlation coefficients from (3) and interpret any differences.
7) Remove the insignificant predictors identified in (5) and develop a regression model with
the remaining significant input variables. Write the regression model equation. Compare
and interpret the regression models from (4) and (7).
8) Test the overall model using the F-test from the regression output in (7). Interpret the result of the test.
9) State the goodness-of-fit of the regression model.
10) Draw the residual plots against each significant predictor and against the output variable.
Interpret the results of the plots.
11) Describe any problems with the regression model above and explain how to improve it.