CS-109A Introduction to Data Science

Lab 13: Making websites!

Harvard University
Fall 2019
Instructors: Pavlos Protopapas, Kevin Rader, and Chris Tanner
Lab Instructors: Chris Tanner and Eleni Kaxiras
Authors: Chris Tanner and Rahul Dave

In [1]:
## RUN THIS CELL TO PROPERLY HIGHLIGHT THE EXERCISES
import requests
from IPython.display import HTML
styles = requests.get("https://raw.githubusercontent.com/Harvard-IACS/2018-CS109A/master/content/styles/cs109.css").text
HTML(styles)

Learning Goals

The goal of this lab is for students to:

  • Understand why websites are important for this project
  • Understand the basics of how websites work
  • Understand how to make a basic website
  • Connect the design, visualization, and communication topics discussed earlier to websites

Background

Q1: Why is it useful to make a website for your project, instead of just writing a report (e.g., PDF or Word document file)?

History

From the mid-1990s to the mid-2000s, before Internet-connected mobile phones became ubiquitous, the general public often had trouble telling the Internet apart from the World Wide Web (WWW). This was understandable because the WWW was tightly tied to the way most people accessed the Internet: using a phone line to dial into an Internet Service Provider, then using a browser to "surf" the web. The only common non-web content was chat messaging services (e.g., AOL's AIM, MSN Messenger, ICQ) and computer games. Yet it was truly the WWW that helped give rise to the explosion of the Internet and its incredible uses.

Q2: For fun, try to guess when the Internet was created (i.e., when the first connection occurred and data was transmitted between two remote computers). What about the first web browser?

The Internet began as a research project funded by the US Government (ARPA), and in its earliest form it was called ARPANET. The first success of this project was marked by the transmission of data between computers at UCLA and SRI (Stanford Research Institute) in __. Many people were involved, but Dr. Leonard Kleinrock, Dr. Vint Cerf, and Dr. Robert Kahn are considered the creators of the Internet. Fortunately, they are all still alive. Vint Cerf's title at Google is famously "Chief Internet Evangelist."

The WWW was created by Tim Berners-Lee in __. He also created the first web browser in the following year, which paved the way for improvements (graphical versions). You can read more about it here. There has always been serious competition among browsers over which would become the most popular. In the 90s, ISPs' built-in browsers were the default, while Netscape and Microsoft's Internet Explorer (IE) competed for dominance. In 1998, the US Government sued Microsoft for illegally maintaining a monopoly, and the bundling of its Internet Explorer browser with Windows was a cornerstone of the lawsuit. You can read more about it here.

Website Basics

A website is obviously public-facing -- anyone who knows the web address and has permission to visit a website can do so. Yet, how is this made possible?

All web pages are just files. Specifically, each web page is a particular computer file, typically written in a format called HTML (HyperText Markup Language), that resides on a computer somewhere in the world. All of its included objects (e.g., images, sounds, dynamic scripts that allow for interactive experiences, etc.) are separate files. The web page defines how to use, structure, and organize these contained objects. This is just like how a document on your local computer works -- you can have a Word document that contains images or whatever you like. Hopefully that document (i.e., web page) is nicely formatted, has a valid structure without errors, and correctly uses objects (e.g., images) that truly exist.
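To make the "a page is a file that points at other files" idea concrete, here is a minimal sketch using only Python's standard library. The page content, the file names (style.css, figures/eda_plot.png, interactive.js, model.html), and the helper names are all made up for illustration; none of them come from the course materials.

```python
from html.parser import HTMLParser

# A hypothetical, minimal web page. The HTML file holds the text and the
# structure; the stylesheet, image, script, and linked page are separate files.
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>My CS109A Project</title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <h1>Results</h1>
    <img src="figures/eda_plot.png" alt="EDA plot">
    <script src="interactive.js"></script>
    <a href="model.html">Our model</a>
  </body>
</html>
"""


class ReferenceCollector(HTMLParser):
    """Collect every file a page refers to through a src or href attribute."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.references.append((tag, value))


def list_references(html_text):
    """Return (tag, target) pairs in the order they appear in the page."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    return collector.references


for tag, target in list_references(PAGE):
    print(f"<{tag}> refers to the separate file {target}")
```

If any of those referenced files were missing on the server, the page itself would still load, but the image would appear broken or the script would not run, which is the "objects that truly exist" concern mentioned above.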

Once a computer has a collection of web pages, which optionally interact and link to one another, how does the general public access them? Don't visitors need some way to find and reach such a machine? Yes, that is what DNS (Domain Name System) and HTTP (Hypertext Transfer Protocol) are for. Their intricate details are beyond the scope of this course, but in short:

  • DNS allows one to register a domain name (e.g., www.harvard.edu) and designate which specific computer in the world to contact (for web page content) when someone points their browser at that address. It's like a forwarding address. It's useful that we can use human-readable names as addresses instead of hard-to-remember numeric IP addresses (e.g., 182.91.121.89) anytime we want to access a site.
  • HTTP defines the mechanism/protocol for how our computers talk to one another to transfer the files in a reliable, robust manner.
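The two steps above can be sketched on a single machine with Python's standard library, so no real website is contacted. This is only an illustration under stated assumptions: the name "localhost" is resolved by the operating system's resolver (usually from a local hosts file rather than a real DNS server), and HelloHandler and fetch_from_local_server are hypothetical names invented for this example.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with one tiny HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>Hello, CS109A!</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging for this demo


def fetch_from_local_server():
    # Step 1 (name lookup): turn a human-readable name into an IP address.
    ip_address = socket.gethostbyname("localhost")

    # Start a throwaway web server on that address; port 0 means
    # "let the operating system pick any free port".
    server = HTTPServer((ip_address, 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    try:
        # Step 2 (HTTP): ask the machine at that address for the page "/".
        connection = http.client.HTTPConnection(ip_address, port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status = response.status
        body = response.read().decode("utf-8")
        connection.close()
        return ip_address, status, body
    finally:
        server.shutdown()
        server.server_close()


ip_address, status, body = fetch_from_local_server()
print("localhost resolved to", ip_address)
print("HTTP status:", status)
print(body)
```

A browser does the same two things when you type a real address: it looks up the name to get an IP address, then sends an HTTP request to that machine and renders whatever file comes back.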

You can read more here.

How to make a basic website

There are many free options, whereby one does not need to purchase a domain name like www.yourname.com; instead, companies offer you space on their computers for free, and your address will look something like www.theircompany.com/yoursitenamehere/.

For the course project, you are free to use any website service you like. I recommend trying one of the following:

Google Sites allows you to use your Google account (no need to make a new account). It offers:

  • very simple creation process, but with potentially limited options
  • sleek themes, although not many

Wix is a very popular option. It offers:

  • very simple creation process, potentially more options
  • tons of themes

Squarespace is another option:

  • beautifully designed themes, super easy to use
  • is not free

Because Squarespace is not free, I do not suggest using it for this course. However, in the future, for your own purposes, I think it's a great option.

Jupyter also provides a way to convert your notebooks to web pages, and GitHub can host them. The downside is that this approach is more difficult, as you may have to write HTML by hand to make any edits. If you are interested in this approach, please visit Rahul's example.
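As a rough sketch of the conversion step, the snippet below wraps the `jupyter nbconvert --to html` command in a small Python helper. It assumes Jupyter and nbconvert are installed and on your PATH; the notebook name `project.ipynb`, the output folder `site`, and both function names are hypothetical and chosen only for illustration. It does not cover publishing the result on GitHub.

```python
import shutil
import subprocess
from pathlib import Path


def build_nbconvert_command(notebook_path, output_dir="site"):
    """Build the command line that converts one notebook to an HTML page."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",                # produce a standalone .html file
        f"--output-dir={output_dir}",  # folder that will hold the web page
        str(notebook_path),
    ]


def convert_notebook(notebook_path, output_dir="site"):
    """Run the conversion and return the path of the generated HTML file."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("The 'jupyter' command was not found on this machine.")
    subprocess.run(build_nbconvert_command(notebook_path, output_dir), check=True)
    return Path(output_dir) / (Path(notebook_path).stem + ".html")


# Show the command without running it; call convert_notebook("project.ipynb")
# yourself once you have a notebook with that (hypothetical) name.
print(" ".join(build_nbconvert_command("project.ipynb")))
```

The generated HTML file is an ordinary web page, so it can be opened locally in a browser to check how it looks before it is uploaded anywhere.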

In lab, let's walk through an example of using Google Sites.

Design Choices

Recall everything we learned about visualization and communication. For example, who is your audience? What do you expect them to know ahead of time? Does your design enable a clear understanding of your content, or is it distracting?

Think about your navigation style, font choices, font sizes, colors, length and location of text, etc.

Please follow all instructions for your project. For example, be sure to include a problem description, EDA, your model, results, and discussion (what worked well, what didn't, what you would like to improve, etc).
