{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title\n", "\n", "**Exercise: Visualization**\n", "\n", "# Description\n", "\n", "For this exercise, we will continue to work with the Boston housing prices dataset that comes with `sklearn` (as we did in Lecture 13). Details about the dataset and its columns are available here.\n", "\n", "In this exercise, I want you all to get creative and experiment! Instead of rigidly plotting exactly what we ask, I want you to think of what ***you*** would be interested in plotting and exploring. Specifically, for this exercise, **you have the utmost freedom to plot anything that you'd like from this data**. You're expected to produce **two plots**, both of which should adhere to the principles learned in lecture (e.g., clear and easy to digest, effective, simple, and not misleading). Please feel inspired to challenge yourself by making a type of plot you've never made before -- perhaps never even seen before! **Further, you are not confined to using matplotlib; you can use any Python visualization library you want.**\n", "\n", "We load the data into a Pandas DataFrame for you, in case you find this helpful. Feel free to ignore this DataFrame if you'd rather just work directly with the data. It's totally up to you!\n", "\n", "**Resource:** for tons of great coding examples, visit the matplotlib website." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# CS109A Introduction to Data Science \n", "\n", "## Lecture 14, Exercise: Visualization\n", "\n", "\n", "**Harvard University**
\n", "**Fall 2020**
\n", "**Instructors**: Pavlos Protopapas, Kevin Rader, and Chris Tanner\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "# NOTE: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,\n", "# so this import needs scikit-learn < 1.2\n", "from sklearn.datasets import load_boston" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " CRIM ZN INDUS CHAS NOX RM \\\n", "count 506.000000 506.000000 506.000000 506.000000 506.000000 506.000000 \n", "mean 3.613524 11.363636 11.136779 0.069170 0.554695 6.284634 \n", "std 8.601545 23.322453 6.860353 0.253994 0.115878 0.702617 \n", "min 0.006320 0.000000 0.460000 0.000000 0.385000 3.561000 \n", "25% 0.082045 0.000000 5.190000 0.000000 0.449000 5.885500 \n", "50% 0.256510 0.000000 9.690000 0.000000 0.538000 6.208500 \n", "75% 3.677083 12.500000 18.100000 0.000000 0.624000 6.623500 \n", "max 88.976200 100.000000 27.740000 1.000000 0.871000 8.780000 \n", "\n", " AGE DIS RAD TAX PTRATIO B \\\n", "count 506.000000 506.000000 506.000000 506.000000 506.000000 506.000000 \n", "mean 68.574901 3.795043 9.549407 408.237154 18.455534 356.674032 \n", "std 28.148861 2.105710 8.707259 168.537116 2.164946 91.294864 \n", "min 2.900000 1.129600 1.000000 187.000000 12.600000 0.320000 \n", "25% 45.025000 2.100175 4.000000 279.000000 17.400000 375.377500 \n", "50% 77.500000 3.207450 5.000000 330.000000 19.050000 391.440000 \n", "75% 94.075000 5.188425 24.000000 666.000000 20.200000 396.225000 \n", "max 100.000000 12.126500 24.000000 711.000000 22.000000 396.900000 \n", "\n", " LSTAT \n", "count 506.000000 \n", "mean 12.653063 \n", "std 7.141062 \n", "min 1.730000 \n", "25% 6.950000 \n", "50% 11.360000 \n", "75% 16.955000 \n", "max 37.970000 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load the boston housing dataset\n", "boston = load_boston()\n", "boston_pd = pd.DataFrame(boston.data)\n", "boston_pd.columns = boston.feature_names\n", "boston_pd.describe()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# our canonical example\n", "plt.figure(figsize=(5, 4))\n", "plt.hist(boston.target)\n", "plt.title('Boston Housing Prices')\n", "plt.xlabel('Price ($1000s)')\n", "plt.ylabel('# of Houses')\n", "plt.show()\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# YOUR FIRST PLOT" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# YOUR SECOND PLOT" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }