Key Word(s): kNN regression, k-Nearest Neighbors, Linear Regression, MSE, R-squared
Instructions:
Read the Advertisement data and view the top rows of the dataframe to get an understanding of the data and columns
Select the first 7 observations and the columns TV and sales.
Create a scatter plot of TV budget vs sales like in the lecture.
Hints:
pd.read_csv(filename) : Returns a pandas dataframe containing the data and column labels from the file.
df.iloc[] : Returns a subset of the dataframe that is contained in the row range passed as the argument.
np.linspace() : Returns evenly spaced numbers over a specified interval.
df.head() : Returns the first 5 rows of the dataframe with the column names
plt.scatter() : A scatter plot of y vs. x with varying marker size and/or colour
plt.xlabel() : This is used to specify the text to be displayed as the label for the x-axis
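The data-handling hints above can be tried out on a small made-up table before touching the exercise file. The sketch below is only an illustration: the CSV text, the column names `x`, `y`, `z` and the variable names are all invented here, and `io.StringIO` stands in for a file on disk.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents, invented for illustration (not the Advertising data)
csv_text = """x,y,z
1,10,100
2,20,200
3,30,300
4,40,400
5,50,500
6,60,600
"""

# pd.read_csv accepts a path or a file-like object; StringIO stands in for a file here
toy = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default
first_five = toy.head()

# iloc selects by integer position: rows 0 to 2 (end excluded), columns 0 and 1
subset = toy.iloc[0:3, [0, 1]]

# linspace includes both endpoints by default: 5 evenly spaced values from 0 to 1
grid = np.linspace(0, 1, 5)

print(first_five)
print(subset)
print(grid)
```

Note that `iloc` works on positions, not labels, so the same slice picks the same rows regardless of what the index or column names are.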
Note: This exercise is auto-graded and you can try multiple attempts.
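# Import the libraries used in this exercise
import pandas as pd
import matplotlib.pyplot as plt
# Data set used in this exercise
data_filename = 'Advertising.csv'
# Read the Advertising.csv file using the pandas library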
df = pd.read_csv(___)
# Create a new dataframe by selecting the first 7 rows of the current dataframe
df_new = df.___
# Print your new dataframe to see if you have selected 7 rows correctly
print(___)
Plotting the graph
# Use a scatter plot for TV vs Sales
plt.___
# Add axis labels for clarity (x : TV budget, y : Sales)
plt.___
plt.___
Post-exercise question
Instead of plotting just seven points, experiment with plotting all the points.
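To get a feel for how much a seven-point subset can hide, here is a minimal sketch that draws a subset and the full data side by side. It uses synthetic numbers, not the Advertising data; the column names `budget` and `response`, the noise model and the output filename are assumptions made for illustration. It also uses the axes-level methods (`ax.scatter`, `ax.set_xlabel`) rather than the `plt.scatter` and `plt.xlabel` calls from the hints, only because two panels are drawn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data, invented for illustration (not the Advertising data)
rng = np.random.default_rng(0)
budget = np.linspace(0, 300, 50)
toy = pd.DataFrame({
    "budget": budget,
    "response": 5 + 0.05 * budget + rng.normal(0, 1, size=budget.size),
})

# First 7 rows, selected by position
toy_small = toy.iloc[:7]

# Two panels: the subset on the left, every row on the right
fig, (ax_small, ax_all) = plt.subplots(1, 2, figsize=(10, 4))

ax_small.scatter(toy_small["budget"], toy_small["response"])
ax_small.set_title("First 7 rows")

ax_all.scatter(toy["budget"], toy["response"])
ax_all.set_title("All rows")

# Label both panels the same way so they are easy to compare
for ax in (ax_small, ax_all):
    ax.set_xlabel("budget")
    ax.set_ylabel("response")

fig.tight_layout()
fig.savefig("subset_vs_all.png")
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped and `fig.savefig(...)` swapped for `plt.show()`.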