{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Title :\n",
"Exercise: Simple Multi-linear Regression\n",
"\n",
"## Description :\n",
"The aim of this exercise is to understand how to use multi-linear regression. We will observe how the MSE of each model changes as the set of predictors changes.\n",
"\n",
"\n",
"\n",
"## Data Description:\n",
"\n",
"## Instructions:\n",
"\n",
"- Read the file `Advertising.csv` as a dataframe.\n",
"- Form a model for each combination of predictors. For example, if you have 2 predictors, A and B, you will end up with 3 models: one with only A, one with only B, and one with both A and B.\n",
"- Split the data into train and test sets.\n",
"- Compute the MSE of each model.\n",
"- Print each predictor combination together with its MSE value.\n",
"\n",
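"A minimal sketch of the steps above, not the graded solution. The helper name `mse_per_subset`, the 80/20 split, the seed, and the synthetic stand-in data are assumptions made for illustration; pass the dataframe read from the CSV file to apply it to the real data.\n",
"\n",
"```python\n",
"from itertools import combinations\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.metrics import mean_squared_error\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"\n",
"def mse_per_subset(df, predictors, response, seed=0):\n",
"    # Hypothetical helper: test MSE for every non-empty subset of predictors\n",
"    results = {}\n",
"    for k in range(1, len(predictors) + 1):\n",
"        for subset in combinations(predictors, k):\n",
"            X = df[list(subset)]\n",
"            y = df[response]\n",
"            # Same seed and row count give the same split for every subset\n",
"            X_train, X_test, y_train, y_test = train_test_split(\n",
"                X, y, train_size=0.8, random_state=seed)\n",
"            model = LinearRegression().fit(X_train, y_train)\n",
"            results[subset] = mean_squared_error(y_test, model.predict(X_test))\n",
"    return results\n",
"\n",
"\n",
"# Synthetic stand-in data (assumed for the demo, not the Advertising values)\n",
"rng = np.random.default_rng(0)\n",
"demo = pd.DataFrame(rng.uniform(0, 100, size=(200, 3)),\n",
"                    columns=['TV', 'Radio', 'Newspaper'])\n",
"demo['Sales'] = 0.05 * demo['TV'] + 0.2 * demo['Radio'] + rng.normal(0, 1, 200)\n",
"\n",
"results = mse_per_subset(demo, ['TV', 'Radio', 'Newspaper'], 'Sales')\n",
"for subset, mse in results.items():\n",
"    print(subset, round(mse, 3))\n",
"```\n",
"\n",
"Fixing `random_state` keeps the train and test rows identical across models, so differences in MSE come from the predictors rather than from the split.\n",
"\n",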
"## Hints: \n",
"\n",
"`pd.read_csv(filename)` : Returns a pandas dataframe containing the data and labels from the file.\n",
"\n",
"`sklearn.preprocessing.normalize()` : Scales input vectors individually to unit norm (vector length).\n",
"\n",
"`sklearn.model_selection.train_test_split()` : Splits the data into random train and test subsets.\n",
"\n",
"`sklearn.linear_model.LinearRegression` : Fits a linear model.\n",
"\n",
"`sklearn.linear_model.LinearRegression.fit()` : Fits the linear model to the training data.\n",
"\n",
"`sklearn.linear_model.LinearRegression.predict()` : Predicts using the linear model.\n",
"\n",
"`sklearn.metrics.mean_squared_error()` : Computes the mean squared error regression loss.\n",
"\n",
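"As a small illustration of the `normalize` hint, the sketch below shows that it rescales each row to unit length by default, and each column only when `axis=0` is passed. The array values are made up for the demo.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.preprocessing import normalize\n",
"\n",
"X = np.array([[3.0, 4.0], [6.0, 8.0]])\n",
"\n",
"# Defaults are norm='l2' and axis=1: every row is scaled to length 1\n",
"row_unit = normalize(X)\n",
"\n",
"# axis=0 scales every column (feature) to length 1 instead\n",
"col_unit = normalize(X, axis=0)\n",
"\n",
"print(row_unit)\n",
"print(col_unit)\n",
"```\n",
"\n",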
"**Note:** This exercise is auto-graded, and you may attempt it multiple times."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Import necessary libraries\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"from sklearn import preprocessing\n",
"from prettytable import PrettyTable\n",
"from sklearn.metrics import mean_squared_error\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.model_selection import train_test_split\n",
"%matplotlib inline\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading the dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Read the file \"Advertising.csv\"\n",
"df = pd.read_csv(\"Advertising.csv\")\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>TV</th><th>Radio</th><th>Newspaper</th><th>Sales</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>230.1</td><td>37.8</td><td>69.2</td><td>22.1</td></tr>\n",
"    <tr><th>1</th><td>44.5</td><td>39.3</td><td>45.1</td><td>10.4</td></tr>\n",
"    <tr><th>2</th><td>17.2</td><td>45.9</td><td>69.3</td><td>9.3</td></tr>\n",
"    <tr><th>3</th><td>151.5</td><td>41.3</td><td>58.5</td><td>18.5</td></tr>\n",
"    <tr><th>4</th><td>180.8</td><td>10.8</td><td>58.4</td><td>12.9</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"