Title

Principal Components Analysis

Description:

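This exercise demonstrates the effect of feature scaling on Principal Components Analysis (PCA).

After completing this exercise, you should see two scatter plots of the first two principal components: one computed without scaling and one computed on scaled data.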

Hints:

Principal Components Analysis

Standard scaler

Refer to lecture notebook.

Fill in the blanks only; do not change any other code.

In [1]:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
%matplotlib inline

In [2]:

df = pd.read_csv('data2.csv')
display(df.describe())
df.head()

	a1	a2	a3	a4
count	200.000000	200.000000	200.000000	200.000000
mean	-0.857044	4.212110	0.031775	-0.021885
std	421.494274	420.739376	4.074328	4.183966
min	-605.324722	-646.589819	-6.809775	-6.249051
25%	-422.060527	-399.810929	-3.883741	-4.010144
50%	46.593553	40.746538	-0.333865	-0.071820
75%	405.738027	403.029485	3.981151	4.128475
max	618.733299	629.307897	6.430227	6.691714

Out[2]:

	a1	a2	a3	a4
0	343.952435	619.881035	3.926444	5.074012
1	376.982251	531.241298	2.831349	3.972653
2	555.870831	373.485494	3.365252	3.966670
3	407.050839	454.319406	3.971158	2.483932
4	412.928774	358.566005	4.670696	4.790385
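
The summary statistics above already suggest why scaling matters here: a1 and a2 have standard deviations of roughly 420, while a3 and a4 are close to 4. Since the total variance that PCA decomposes is the sum of the per-column variances, a rough sketch of each column's share can be computed directly from the printed `std` row. The numbers below are copied from the `describe()` output; the variable names are only illustrative.

```python
import numpy as np

# Standard deviations copied from the df.describe() output above
std = np.array([421.494274, 420.739376, 4.074328, 4.183966])
columns = ["a1", "a2", "a3", "a4"]

# Variance is the square of the standard deviation
variance = std ** 2

# Fraction of the total (unscaled) variance contributed by each column
share = variance / variance.sum()

for name, s in zip(columns, share):
    print(f"{name}: {s:.6f}")
```

With these values, a1 and a2 together account for almost all of the variance, so the leading components of an unscaled PCA will be driven mainly by those two columns.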

In [0]:

### edTest(test_pca_noscaling) ###
# Fit and plot the first 2 principal components (no scaling)
fitted_pca = PCA().fit(____)
pca_result = fitted_pca.transform(____)
plt.scatter(pca_result[:,0],pca_result[:,1])
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA - No scaling");

In [0]:

### edTest(test_pca_scaled) ###
# Scale the data and plot the first 2 principal components
scaled_df = StandardScaler().____
fitted_pca = PCA().fit(____)
pca_result = fitted_pca.transform(____)
plt.scatter(pca_result[:,0],pca_result[:,1])
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA - with scaled data");
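
To see the effect of scaling independently of the plots, the following is a minimal NumPy-only sketch on hypothetical synthetic data (not `data2.csv`). It assumes two independent features on very different scales and compares the fraction of variance along each principal axis before and after standardizing. The helper `variance_ratios` is introduced here for illustration and is not part of the exercise; it computes the eigenvalues of the covariance matrix, which is the quantity PCA ranks components by.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; synthetic data for illustration only
n = 500

# Two independent features on very different scales (assumed values)
big = rng.normal(0.0, 400.0, size=n)
small = rng.normal(0.0, 4.0, size=n)
X = np.column_stack([big, small])


def variance_ratios(data):
    """Fraction of total variance along each principal axis, largest first."""
    cov = np.cov(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()


# Standardize each column to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

raw_ratios = variance_ratios(X)
std_ratios = variance_ratios(X_std)

print("unscaled:    ", raw_ratios)
print("standardized:", std_ratios)
```

In this sketch the unscaled data puts nearly all of the variance on the first axis, simply because one feature has a much larger scale. After standardizing, the two axes carry roughly equal variance, which reflects the fact that the synthetic features are independent.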
