Title

Principal Components Analysis

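Description:

This exercise demonstrates the effect of feature scaling on Principal Components Analysis (PCA).

After completing this exercise you should see two scatter plots of the first two principal components: one fitted on the raw data ("PCA - No scaling") and one fitted on the standardized data ("PCA - with scaled data").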

Hints:

Principal Components Analysis

Standard scaler

Refer to lecture notebook.

Fill in the blanks only; do not change any other code.

In [1]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
%matplotlib inline
In [2]:
df = pd.read_csv('data2.csv')
display(df.describe())
df.head()
               a1          a2          a3          a4
count  200.000000  200.000000  200.000000  200.000000
mean    -0.857044    4.212110    0.031775   -0.021885
std    421.494274  420.739376    4.074328    4.183966
min   -605.324722 -646.589819   -6.809775   -6.249051
25%   -422.060527 -399.810929   -3.883741   -4.010144
50%     46.593553   40.746538   -0.333865   -0.071820
75%    405.738027  403.029485    3.981151    4.128475
max    618.733299  629.307897    6.430227    6.691714
Out[2]:
           a1          a2        a3        a4
0  343.952435  619.881035  3.926444  5.074012
1  376.982251  531.241298  2.831349  3.972653
2  555.870831  373.485494  3.365252  3.966670
3  407.050839  454.319406  3.971158  2.483932
4  412.928774  358.566005  4.670696  4.790385
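The summary above shows that a1 and a2 have standard deviations near 420, while a3 and a4 are near 4. The sketch below illustrates why that gap matters when PCA is run without scaling. It is a minimal illustration under stated assumptions: the data is synthetic (generated with NumPy, not read from data2.csv), and `explained_variance_ratio` is a hypothetical helper that eigen-decomposes the covariance matrix directly instead of calling scikit-learn.

```python
import numpy as np

# Synthetic stand-in for the exercise data (an assumption, not data2.csv):
# two columns with std ~420 and two columns with std ~4.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(0, 420, n),
    rng.normal(0, 420, n),
    rng.normal(0, 4, n),
    rng.normal(0, 4, n),
])

def explained_variance_ratio(data):
    """Hypothetical helper: share of total variance along each principal axis."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse to get largest first.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    return eigvals / eigvals.sum()

ratio_raw = explained_variance_ratio(X)
print(ratio_raw.round(4))
```

In this synthetic setting the first two components account for almost all of the variance, simply because the large-scale columns dominate the covariance matrix. The small-scale columns contribute next to nothing, whatever structure they may contain.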
In [0]:
### edTest(test_pca_noscaling) ###
# Fit PCA on the raw (unscaled) data and plot the first two principal components
fitted_pca = PCA().fit(____)
pca_result = fitted_pca.transform(____)
plt.scatter(pca_result[:,0],pca_result[:,1])
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA - No scaling");
In [0]:
### edTest(test_pca_scaled) ###
# Standardize the data, then fit PCA and plot the first two principal components
scaled_df = StandardScaler().____
fitted_pca = PCA().fit(____)
pca_result = fitted_pca.transform(____)
plt.scatter(pca_result[:,0],pca_result[:,1])
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA - with scaled data");
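To see what standardization changes, the sketch below compares the first principal direction before and after scaling. It is an illustration under stated assumptions, not the exercise solution: the data is synthetic, the standardization is written out by hand as z = (x - mean) / std using the population standard deviation (which, as far as I know, is what `StandardScaler` computes), and `pc1_loadings` is a hypothetical helper.

```python
import numpy as np

# Synthetic two-column data (an assumption): one large-scale, one small-scale feature.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 420, 200), rng.normal(0, 4, 200)])

# Standardize each column by hand: subtract the mean, divide by the population std (ddof=0).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

def pc1_loadings(data):
    """Hypothetical helper: absolute loadings of the first principal direction."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last column of eigvecs
    # is the direction of largest variance.
    _, eigvecs = np.linalg.eigh(cov)
    return np.abs(eigvecs[:, -1])

raw_loadings = pc1_loadings(X)
scaled_loadings = pc1_loadings(Z)
print("PC1 loadings, raw data:   ", raw_loadings.round(3))
print("PC1 loadings, scaled data:", scaled_loadings.round(3))
```

On the raw data the first component points almost entirely along the large-scale column. After standardization both columns have unit variance, so neither dominates purely because of its units.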