{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: B.1 - Simple Multi-linear Regression**\n", "\n", "# Description\n", "The aim of this exercise is to understand how to use multi-linear regression. Here we will observe how the MSE of each model changes as the predictors change.\n", "\n", "# Instructions:\n", "- Read the file Advertising.csv as a dataframe.\n", "- For each combination of predictors, form a model. For example, if you have 2 predictors, A and B, you will end up with 3 models: one with only A, one with only B, and one with both A and B.\n", "- Split the data into train and test sets.\n", "- Compute the MSE of each model.\n", "- Print the Predictor - MSE value pair.\n", "\n", "\n", "# Hints:\n", "\n", "pd.read_csv(filename) : Returns a pandas dataframe containing the data and labels from the file\n", "\n", "sklearn.preprocessing.normalize() : Scales input vectors individually to unit norm (vector length).\n", "\n", "np.interp() : Returns one-dimensional linear interpolation\n", "\n", "sklearn.model_selection.train_test_split() : Splits the data into random train and test subsets\n", "\n", "sklearn.linear_model.LinearRegression() : LinearRegression fits a linear model\n", "\n", "LinearRegression.fit() : Fits the linear model to the training data\n", "\n", "LinearRegression.predict() : Predict using the linear model.\n", "\n", "\n", "Note: This exercise is **auto-graded and you can make multiple attempts.**" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Import necessary libraries\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.metrics import mean_squared_error\n", "from sklearn.model_selection import train_test_split\n", "from sklearn import preprocessing\n", "from prettytable import PrettyTable" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "### Reading the dataset" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "#Read the file \"Advertising.csv\"\n", "df = pd.read_csv(\"Advertising.csv\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"|   | TV | Radio | Newspaper | Sales |\n",
"|---|---|---|---|---|\n",
"| 0 | 230.1 | 37.8 | 69.2 | 22.1 |\n",
"| 1 | 44.5 | 39.3 | 45.1 | 10.4 |\n",
"| 2 | 17.2 | 45.9 | 69.3 | 9.3 |\n",
"| 3 | 151.5 | 41.3 | 58.5 | 18.5 |\n",
"| 4 | 180.8 | 10.8 | 58.4 | 12.9 |\n", "