{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: A.2 - Multi-collinearity vs Model Predictions**\n", "\n", "# Description\n", "\n", "The goal of this exercise is to see how multi-collinearity can affect the predictions of a model.\n", "\n", "To do this, fit a multi-linear regression on the given dataset and compare its coefficients with those from simple linear regressions on the individual predictors.\n", "\n", "# Roadmap\n", "- Read the dataset 'colinearity.csv' as a dataframe\n", "- For each predictor variable, create a simple linear regression model with the same response variable\n", "- Compute the coefficient of each model and store it in a list\n", "- Fit all predictors together using a separate multi-linear regression object\n", "- Calculate the coefficients of this model\n", "- Compare the coefficients of the multi-linear regression model with those of the simple linear regression models\n", "\n", "**DISCUSSION:** Why do you think the coefficients change, and what does it mean? 
\n", "\n", "# Hints\n", "\n", "LinearRegression() : Returns a linear regression object from the sklearn library\n", "\n", "LinearRegression().coef_ : Attribute that holds the fitted coefficient(s) of the linear regression object\n", "\n", "LinearRegression().fit() : Fits the linear model to the given data\n", "\n", "df.values.reshape() : Returns an ndarray with the values in the specified shape\n", "\n", "Note: This exercise is **auto-graded and you can make multiple attempts.**" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Import libraries\n", "import pandas as pd\n", "import numpy as np\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "from sklearn.linear_model import LinearRegression\n", "from pprint import pprint\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Read the file named \"colinearity.csv\"\n", "\n", "df = pd.read_csv(\"colinearity.csv\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | x1 | x2 | x3 | x4 | y |\n", "
|---|---|---|---|---|---|\n", "
| 0 | -1.109823 | -1.172554 | -0.897949 | -6.572526 | -158.193913 |\n", "
| 1 | 0.288381 | 0.360526 | 2.298690 | 3.884887 | 198.312926 |\n", "
| 2 | -1.059194 | 0.833067 | 0.285517 | -1.225931 | 12.152087 |\n", "
| 3 | 0.226017 | 1.979367 | 0.744038 | 5.380823 | 190.281938 |\n", "
| 4 | 0.664165 | -1.373739 | 0.317570 | -0.437413 | -72.681681 |\n", "