Virtualization All the Way Down; Nesting Dolls; Static Linking

Let’s imagine my Mac running VMware Fusion. Inside that, I’m running Ubuntu. In there, I’ve got a Docker container running a CentOS 6.4 base image. Once there, I’ve used virtualenv to create a Python environment with my favorite version of Flask. (Or maybe I chose Java, then Tomcat, running multiple WARs!) (Or maybe AWS, then Docker, then node.)

"Flask in Docker in VMware on Mac"
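To make the nesting concrete, here is a rough sketch of how a Python process might report the layers it can see from the inside. The function name `detect_layers` is made up for illustration, and the `/.dockerenv` check is a common heuristic rather than a guarantee. The hypervisor layer is left out, since it is generally not visible from the standard library alone.

```python
import os
import platform
import sys


def detect_layers():
    """Return (layer, description) pairs, outermost first.

    Only layers visible from inside the process are reported.
    """
    layers = [("os", f"{platform.system()} {platform.release()}")]

    # Heuristic (assumption): Docker usually creates /.dockerenv in a container.
    if os.path.exists("/.dockerenv"):
        layers.append(("container", "Docker (/.dockerenv present)"))

    # Inside a venv/virtualenv, sys.prefix differs from the base prefix.
    # Very old virtualenv releases may not set base_prefix, hence the fallback.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    if sys.prefix != base_prefix:
        layers.append(("virtualenv", sys.prefix))

    layers.append(("runtime", f"Python {platform.python_version()}"))
    return layers


if __name__ == "__main__":
    # Indent each layer one step further to mimic the nesting dolls.
    for depth, (layer, description) in enumerate(detect_layers()):
        print("  " * depth + f"{layer}: {description}")
```

Run inside the Flask virtualenv from the example, this would likely print an OS line, a container line, a virtualenv line, and the Python version, each indented one level deeper than the last.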

Look at all the ways we can distribute and run software these days:

| Layer | Examples | Build Target Set |
|---|---|---|
| OS Image | PXE | Hardware Architecture (x86_64) |
| VM Image | VMware, AMI, OpenStack, Vagrant | Virtualization Platform |
| OS Packages | RPM, deb packages | OS Version |
| OS-level containers | Docker, LXC, Solaris Zones | Containerization Platform |
| Language runtime configuration | CLASSPATH for Java, virtualenv for Python, rvm for Ruby | Language runtime version |
| Language VM manipulation | Java Application Servers (e.g., Tomcat) hosting multiple applications; OSGi | Container Platform |
| Static Linking | gcc -static; Go binaries | OS-architecture pair (e.g., linux_arm) |
| Browser | Web Apps | Browser Version and Platform |
| PaaS | Google App Engine, Heroku | PaaS Platform |

The Build Target Set column suggests what you have to vary as you build. For example, if you’re building OS Packages, then you need to build one package per supported OS Version. If you’re building VMs, you have to build one VM per supported hypervisor. If you’re building web sites, you have to test with every browser version.
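One way to picture the Build Target Set is as the axes of a build matrix: every combination of values is one artifact you have to produce (or one configuration you have to test). The sketch below is only illustrative. The layer names, axis names, and target values in `BUILD_TARGETS` are hypothetical examples, not a list of what anyone actually supports.

```python
from itertools import product

# Hypothetical target sets, loosely following the table of layers.
BUILD_TARGETS = {
    "os_package": {"os_version": ["centos-6.4", "ubuntu-12.04"]},
    "vm_image": {"hypervisor": ["vmware", "ami", "openstack"]},
    "static_binary": {"os": ["linux", "darwin"], "arch": ["amd64", "arm"]},
}


def build_matrix(layer):
    """Return one dict per artifact that must be built for the given layer."""
    axes = BUILD_TARGETS[layer]
    names = list(axes)
    # The Cartesian product of all axes enumerates every required build.
    combos = product(*(axes[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    for layer in BUILD_TARGETS:
        matrix = build_matrix(layer)
        print(f"{layer}: {len(matrix)} build(s)")
        for target in matrix:
            print("  ", target)
```

Layers with a single axis grow linearly with the number of supported targets, while a layer like static linking, which varies over an OS-architecture pair, grows with the product of the two.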

A few observations:

All this, of course, reminds us of static linking. (Static linking may be a third rail of computer programming. See Rob Pike, for example.) Shipping an AMI or a VM with all of your bits baked in is isomorphic to shipping a statically linked binary. (The isomorphism is hammered home in this paper about MirageOS, which smooshes the app and the OS into a single thing.)

When we ship software, we depend on some platform (be it x86, the JVM, Python, the browser), some local state (the shared libraries that happen to be installed, which is a source of potential trouble), and the bits we actually package up (be it on a website or a CD). We’re now moving the boundaries around aggressively and in different ways. Most of the mechanisms in the table above are trying to squish into nothing the potentially odious middle state, but it’s notable how different a JVM-based approach (somewhat OS independent) is from a Docker-based approach (the OS is part of the distribution).
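The "local state" part of that dependency is easy to observe. As a small sketch, the snippet below separates what a process gets from its platform from what merely happens to be installed, using `ctypes.util.find_library` to look up shared libraries by name. The helper name `describe_dependencies` and the library names passed to it are illustrative assumptions, and the results will differ from machine to machine, which is rather the point.

```python
import ctypes.util
import platform


def describe_dependencies(shared_libraries):
    """Split what we depend on into platform facts and local state."""
    platform_info = {
        "machine": platform.machine(),
        "os": platform.system(),
        "python": platform.python_version(),
    }
    # find_library returns a name or path if the library is found, else None.
    local_state = {
        name: ctypes.util.find_library(name) for name in shared_libraries
    }
    return {"platform": platform_info, "local_state": local_state}


if __name__ == "__main__":
    report = describe_dependencies(["z", "ssl", "definitely_not_installed_xyz"])
    for section, values in report.items():
        print(section)
        for key, value in values.items():
            print(f"  {key}: {value}")
```

A statically linked binary or a fully baked image would make the `local_state` section irrelevant, because those libraries travel with the bits you ship instead of being looked up on the host.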

Thanks to @henryr for commenting on an earlier draft and pointing me towards MirageOS.