For development environments with Docker, a fundamental requirement is the synchronization of host folders with a Docker container.
With it, a developer can employ the full power of the host to work on code locally, and instantly see these changes reflected inside the Docker container.
While Docker Compose makes it trivial to define synchronized folders with its volumes option, development performance suffers when the Docker daemon is not running on the host itself, as is the case on OS X.
This post gives an overview of the performance impact of host volumes when running Docker on a non-Linux host. It assumes a basic understanding of the capabilities of Docker and Compose. A preceding post introduces the fundamentals of Docker if you are not yet familiar with the Docker stack.
Recap: Docker Volumes
Imagine you employ Docker for a local, orchestrated array of services around a Rails application. This array would certainly include a database server, and could include an authentication service backed by a Directory Service.
If you develop that application locally, you need to push the changes into the container. Technically, you could build the container running the Rails stack periodically. However, manually re-building the container is time-consuming and impedes rapid development.
A common approach is to mount the isolated application folder inside the docker container. For the Linux kernel, a number of filesystem variants exist that provide this layered functionality: creating, sharing, and re-using volumes on the host itself. These variants are commonly referred to as Union File Systems.
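As a rough illustration of such a mount, the sketch below assembles a `docker run` call that bind-mounts a host folder into a container. The image name `my-rails-app` and the container path `/app` are placeholders chosen for this example, not names from a real project. The command is printed rather than executed so that it can be inspected first.

```shell
#!/bin/sh
# Sketch: print a `docker run` call that mounts a host folder into a container.
# "my-rails-app" and "/app" are hypothetical placeholders.
bind_mount_cmd() {
  host_dir=$1       # folder on the host holding the application code
  container_dir=$2  # where that folder should appear inside the container
  image=$3          # image that runs the application
  # -v HOST:CONTAINER is Docker's bind-mount syntax; --rm removes the
  # container again once it exits.
  printf 'docker run --rm -v %s:%s %s\n' "$host_dir" "$container_dir" "$image"
}

bind_mount_cmd "$PWD" /app my-rails-app
```

Paths containing spaces would need additional quoting before the printed line is pasted into a shell.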
On many systems, some of the kernel features required by Docker are not natively available. Instead, the docker daemon itself is run within a Linux-based Virtual Machine, which the docker client connects to.
On Mac OS X, probably the most popular choice is boot2docker, which is a command-line wrapper around creation and management of a VirtualBox VM set up for docker.
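The split between client and daemon can be made concrete with the `DOCKER_HOST` environment variable, which tells the docker client where the daemon listens. The sketch below only builds that address; the port `2376` and the sample IP are assumptions for illustration, and a real boot2docker setup may need further TLS-related settings that are not shown here.

```shell
#!/bin/sh
# Sketch: derive the address the docker client should talk to.
# The default port 2376 and the sample IP below are assumptions.
docker_host_url() {
  vm_ip=$1           # address of the VM that runs the docker daemon
  port=${2:-2376}    # daemon port, overridable by the second argument
  printf 'tcp://%s:%s\n' "$vm_ip" "$port"
}

# With boot2docker installed, the VM address would come from `boot2docker ip`:
#   export DOCKER_HOST="$(docker_host_url "$(boot2docker ip)")"
docker_host_url 192.168.59.103
```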
Using boot2docker, volumes can be used just as on a Linux host, but they are exposed as a shared folder to the docker daemon on the guest beforehand.
VirtualBox Shared Folders (vboxsf) are reliable but, unfortunately, terribly slow, even on fast underlying storage.
There are ongoing discussions about, and ongoing development of, workarounds for the performance impact of volume sharing. The following suggestions are mainly based on recent blog articles and boot2docker issue #64. They are listed in the order in which I discovered them, not in order of preference.
The following workarounds target OS X; however, the general approach of some of them may be available for Windows, too.
1. Patching the boot2docker VM to add NFS support.
A common fix is to avoid volume synchronization with vboxsf and instead mount the host volume into the VirtualBox VM using NFS.
This requires manually adding /Users to the NFS export configuration to make it available from within boot2docker.
The gist of this workaround is to add the following line to the NFS configuration at
/Users -mapall=$(whoami) $(boot2docker ip)
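The export line above can be assembled with a small helper, sketched here under the assumption that the user name and the VM address are passed in explicitly. The sample values `alice` and `192.168.59.103` are placeholders. The helper only prints the line; where and how it is written to the NFS configuration is left to the guide referenced in this section.

```shell
#!/bin/sh
# Sketch: assemble the NFS export line for /Users.
# The sample user name and IP address are placeholders.
nfs_export_line() {
  user=$1    # host user all NFS access is mapped to (normally `whoami`)
  vm_ip=$2   # address of the boot2docker VM (normally `boot2docker ip`)
  printf '/Users -mapall=%s %s\n' "$user" "$vm_ip"
}

nfs_export_line alice 192.168.59.103
# With boot2docker installed:
#   nfs_export_line "$(whoami)" "$(boot2docker ip)"
```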
An excellent blog post on how Blackfire leverages Docker covers the specifics, common errors, and tweaks for the NFS method. The required steps are quite extensive, so I refer you to that article for further details.
2. Employing a custom boot2docker box with Vagrant.
Vagrant is a tool for creating and managing virtual machines, as well as provisioning them with software targeted directly at development.
Some people confuse the functionality of Vagrant with Docker. There is an excellent conversation over at StackOverflow with statements from the authors of both Vagrant and Docker. Vagrant is somewhat comparable in functionality to boot2docker, as it can define and boot a box to run Docker inside it. However, Vagrant also has a number of built-in file synchronization engines, such as automatic NFS configuration and rsync, so we can use a Vagrant-provisioned Docker VM as an alternative.
In fact, Mitchell Hashimoto (author of Vagrant) himself published a boot2docker clone. However, its included Docker daemon is outdated. A fork exists that is actively maintained and works well under OS X as of this writing. While these boxes also use VirtualBox, others exist for different virtualization stacks (e.g., for Parallels Desktop).
Vagrant with NFS / rsync is what I currently use in my own projects. It works well as a replacement for boot2docker with one exception: You need to manually define the ports to be forwarded between the VM running the docker daemon and the OS X host.
If you want to delve into Vagrant with Docker: Vagrant features a Docker Provisioner and, more recently, a Docker Provider, which allows you to define and orchestrate containers similar to docker-compose.
3. Data containers with a synchronization engine
I recently stumbled upon docker-unison. This is still a new project, but it looks like a promising alternative to the above workarounds, especially compared to manually exposing the filesystem with NFS, which may cause a number of permission errors inside Docker.
However, this approach seems to break at times with strange permission errors, whose origin is unknown.
4. Hodor: A host-side script to automate unison synchronization
Hodor is a tool that automatically adds unison synchronization to containers, comparable to what you can achieve manually with a unison data container.
However, Hodor adds yet another syntax to define containers in a .hodorfile, and I have yet to discover its advantages over a unison container.
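For reference, a plain two-way unison sync between two folders, independent of Hodor or docker-unison, might look roughly like the sketch below. The two paths are hypothetical placeholders, and the command is printed rather than executed. How either tool actually invokes unison is not covered here.

```shell
#!/bin/sh
# Sketch: print a two-way unison sync command between two folders.
# Both paths used below are hypothetical placeholders.
unison_cmd() {
  local_dir=$1    # folder on the host
  replica_dir=$2  # second replica, e.g. a folder shared with the container
  # -batch asks no questions, -auto accepts non-conflicting changes,
  # -repeat 2 re-runs the synchronization every two seconds.
  printf 'unison %s %s -batch -auto -repeat 2\n' "$local_dir" "$replica_dir"
}

unison_cmd ./app /mnt/app
```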
5. dinghy: A wrapper for Docker on OS X
dinghy is a full replacement for boot2docker that maintains a Vagrant box with NFS volume synchronization enabled by default.
It contains a number of sensible defaults concerning NFS security. As it builds upon Vagrant, it supports all three major VM stacks on OS X: VMware Fusion, Parallels Desktop, and VirtualBox.
6. boot2docker with rsync
Yevgeniy, who has tried most of the above options, suggests yet another approach: use a file watcher and rsync the changes to the boot2docker guest VM.
He provides a detailed, step-by-step guide on his own blog and has distilled his experience with the various workarounds into the command-line tool docker-osx-dev.
If you're new to the workarounds, you should definitely try his tool first, as it successfully hides most of the manual interactions some of the above workarounds require.
I have yet to employ this method in a new project, but I have also noticed that rsync provides the best performance (e.g., with Vagrant mounts). However, in some cases you need a two-way sync instead. For example, Node.js modules installed for the frontend are overwritten by rsync with the archive flag set.
Edit: Yevgeniy pointed out that you can employ the exclude option of rsync to avoid deleting specific locations, such as node_modules, from the container.
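With plain rsync, such an exclusion could be sketched as follows. The source and target paths are placeholders, and `DRY_RUN` is a hypothetical switch of this sketch (not an rsync or docker-osx-dev feature) that prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: one-way sync of a project folder that leaves node_modules alone.
# DRY_RUN is a hypothetical switch of this sketch; the paths are placeholders.
sync_project() {
  src=$1   # host folder; a trailing slash means "the contents of"
  dest=$2  # target folder, e.g. a path on the VM
  # --archive keeps permissions and timestamps, --delete removes files that
  # vanished on the host, --exclude keeps rsync from touching node_modules.
  set -- rsync --archive --delete --exclude=node_modules "$src" "$dest"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"   # only show what would be executed
  else
    "$@"        # run rsync for real
  fi
}

# Example: DRY_RUN=1 sync_project ./app/ /tmp/app-copy/
```

Because the excluded folder is skipped on both sides, `--delete` does not remove a `node_modules` directory that exists only in the target.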
In docker-osx-dev, use option -e <path> to exclude paths during synchronization, or specify a file of paths to ignore with
Docker volume synchronization on non-Linux hosts is a huge pain, performance-wise. A number of workarounds exist, none of which is a single solution I'd recommend wholeheartedly for every use case.
Given the time spent on these workarounds for Docker volumes, one might as well install the dependencies and run the application locally, and only connect to dockerized services (e.g., LDAP, PostgreSQL, …).