Key Word(s): Containers, Docker, Singularity



Lecture 15: Beyond Virtual Environments

Thursday, October 24th 2019

Last Time

  • Virtual environments
  • virtualenv, conda
  • Deploying packages

Today

  • Beyond virtual environments
  • Virtual machines
  • Containers
  • Docker and Singularity

Virtual Environment Recap

  • Virtual environments are great for working in small development teams.
  • They provide a convenient, lightweight way to achieve code portability.
  • A fellow developer need only clone the repo, create a virtual env, and install from the environment (or requirements) file.
  • From that point on, the developer will have the same set of dependencies as the other developers on the project.
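
As a rough illustration of the "create a virtual env" step, the sketch below uses Python's standard-library venv module. The directory name demo-env is a made-up placeholder, and with_pip=False is chosen only to keep the sketch quick; a real project would enable pip and then install from its requirements file.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / "demo-env"  # hypothetical directory name
    # with_pip=False skips bootstrapping pip, which keeps the sketch fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(tmp)
    # pyvenv.cfg records which base interpreter the environment points at.
    config_text = (env_dir / "pyvenv.cfg").read_text()
    has_config = "home" in config_text
    print(config_text)
```

The printed pyvenv.cfg shows that the environment is only a thin layer over an interpreter that is already installed on the machine.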

Some Possible Limitations

  • Virtual environments do not provide complete isolation.
    • They just use symlinks and changes to PATH.
    • They still depend on your operating system.
      • Several of you ran into this issue last time.
    • They still need the global packages and dependencies of your operating system.
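
One way to see these limits from inside Python is sketched below. The function names are made up for illustration. The prefix comparison detects environments made with venv (and recent versions of virtualenv); conda environments are not detected this way. The OS details come from the host and are the same inside and outside the environment.

```python
import platform
import sys


def in_virtual_env():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the underlying installation.
    return sys.prefix != sys.base_prefix


def host_summary():
    """Collect OS details that a virtual environment cannot change."""
    return {
        "os": platform.system(),
        "kernel_release": platform.release(),
        "machine": platform.machine(),
    }


print("inside a virtual env:", in_virtual_env())
print("host details:", host_summary())
```

Running this with and without an activated environment should change only the first line of output, not the host details.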

How can we get almost complete isolation?

Virtual Machines

  • Virtual machines are at the other end of the spectrum.
  • A virtual machine (VM) is a file that acts like a separate computer system.
  • You can install a completely different operating system on this virtual machine.
    • e.g. If you're running macOS, you can create a VM and install Windows on it.
  • VMs have their own virtual hardware: CPUs, memory, hard drives, etc.
  • The VM is sandboxed.
    • This means that the software inside the VM generally can't affect the host computer.
  • As a result, VMs are a safe place to test potentially malicious software (e.g. malware) and to run OSs that differ from the one on the local machine.

Some Comments on Virtual Machines

  • The OS running on your computer is called the "host".
    • OSs in a VM are called "guests".
    • Multiple VMs can be run simultaneously.
  • A hypervisor creates and manages the VMs.
    • On servers, the hypervisor often runs directly on the hardware.
    • On desktops, it usually runs as an application on top of the host OS.

More Comments on VMs

  • There is overhead associated with VMs.
    • They may not be as fast as the host system.
    • They may not have the same graphics capabilities.
  • They can take some time to start up (order of minutes).
  • They can help lower costs and can be more efficient.
    • No need to spend money on physical hardware and cooling systems, for example.
  • Best Virtual Machines of 2019

Containers

  • Virtual environments help to streamline the development and use of code.
    • Given an OS and hardware, we can get the exact code environment set up.
  • Virtual machines give almost full autonomy: virtual OS and hardware!
    • Run a computer within a computer.
    • Very secure.
    • Cost efficient: no need to buy a new machine and infrastructure.
    • Can be slow for some needs.
  • Is there a middle ground?

Containers

  • Containers only virtualize the OS (not the hardware).
  • They share the OS kernel of the hosting system.
  • Containers give the impression of a separate OS.
    • However, since they're sharing the OS kernel, they are much cheaper than a VM.
  • For example, on a Mac you can create a container that runs an image of a Linux OS.
  • The container still runs on top of macOS (in practice, Docker on a Mac supplies the Linux kernel through a lightweight VM behind the scenes).
  • But within the container it's like you are running Linux.

Benefits of Containers

  • Containers are lightweight (their memory footprint is roughly an order of magnitude smaller than a VM's).
  • Their startup time is on the order of seconds (vs. minutes for VMs).
  • They provide pseudo-isolation.
    • This means they're still pretty secure, but not as secure as a VM.
  • Because they're so lightweight, you can have many containers running at once on your system.

Docker

  • Containers are extremely popular, and their use is still growing.
  • One of the first widely used container platforms was provided by Docker.
    • ~2013
  • Docker containers can be used to run websites and web applications.
    • Multiple containers can be managed (orchestrated) with a system called Kubernetes.
  • Docker is used extensively across industry and research.
    • Works great for local and private resources.
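
As a minimal sketch of what starting a container looks like, the Python snippet below builds a docker run command and, if asked, executes it. The helper names are made up for illustration, the image python:3.8-slim is an assumed example image from Docker Hub, and the snippet only assembles and prints the command unless run_in_container is called on a machine with Docker installed.

```python
import shutil
import subprocess


def docker_run_command(image, command):
    """Build the argument list for a one-off container that is removed on exit."""
    # --rm tells Docker to delete the container once the command finishes.
    return ["docker", "run", "--rm", image] + list(command)


def run_in_container(cmd):
    """Run cmd if a docker CLI is on the PATH; return its output, else None."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout or result.stderr


# Assumed example image; any image that ships a Python interpreter would do.
cmd = docker_run_command(
    "python:3.8-slim",
    ["python", "-c", "import platform; print(platform.platform())"],
)
print(" ".join(cmd))
```

Calling run_in_container(cmd) on a machine with Docker should report a Linux platform from inside the container, even when the host is a Mac.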

Singularity

  • Recently, Singularity has emerged as a container provider with technology for high performance computing.
  • Integrates with HPC tools such as the SLURM scheduler and MPI.
  • Efficient on parallel file systems.
  • Integrates with other container systems...including Docker!
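
The Docker integration can be sketched as follows, assuming a Singularity 3.x command-line tool. The helper names, the image file name ubuntu.sif, and the choice of the ubuntu:18.04 Docker image are all illustrative assumptions; the snippet only assembles and prints the commands and does not run them.

```python
def singularity_pull_command(image_file, docker_image):
    """Build a command that converts a Docker Hub image into a Singularity image file."""
    # The docker:// prefix tells Singularity to fetch the image from a Docker registry.
    return ["singularity", "pull", image_file, "docker://" + docker_image]


def singularity_exec_command(image_file, command):
    """Build a command that runs a program inside an existing Singularity image."""
    return ["singularity", "exec", image_file] + list(command)


# Hypothetical file and image names, chosen only for illustration.
pull_cmd = singularity_pull_command("ubuntu.sif", "ubuntu:18.04")
exec_cmd = singularity_exec_command("ubuntu.sif", ["cat", "/etc/os-release"])

print(" ".join(pull_cmd))
print(" ".join(exec_cmd))
```

On a cluster, commands like these would typically be placed in a job script that the scheduler runs.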

Lecture Exercise