16 September 2014

Boot2docker (finally) includes VirtualBox guest additions

The first time you give Docker a try, you get excited about how easily the target environment can be reproduced and how fast you can start a "container" compared to virtual machines.

Then you start considering using Docker for your own development (as it might be hard to convince the production guys to adopt a tool that was released as 1.0 just a few months ago). As you use a Windows or Apple computer, you will rely on boot2docker. That sounds good ... until you start building non-trivial applications.

Running 'docker build' from your computer will send the whole local directory to the Docker VM so the Dockerfile can access project resources. This can take some time, but it works. If you use the "code, build, restart" approach, things will be mostly OK, as you just change the commands you use.

But if your software stack only requires updating source code, for example when you develop an Angular frontend or rely on a modern stack with hot-reload capabilities, things become crazy: developing with Docker makes you far slower than local development.

The "live-reload" approach is possible with Docker, but it requires a direct connection from your IDE to the folder used by your application container. The Docker volume option can mount a directory from the host into a container, but you're not working on the host: the host is your boot2docker VM.
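
As a rough sketch of the volume option mentioned above, the helper below wraps `docker run -v`. The function name, the paths and the image name are made up for illustration; only the `-v host_dir:container_dir` syntax comes from Docker itself. With boot2docker, the left-hand path is resolved inside the VM, not on your Windows or OS X machine, which is exactly the problem.

```shell
# Hypothetical helper: run a container with a source directory mounted in it.
# $1 = directory on the Docker host (with boot2docker: a path inside the VM)
# $2 = mount point inside the container
# $3 = image to run (any image name; not taken from the article)
run_with_sources() {
  docker run --rm -v "$1:$2" "$3"
}

# Example call, with assumed paths and image name:
# run_with_sources "$HOME/myapp" /usr/src/app my-dev-image
```

Live-reload only works if the directory given as the first argument is the one your IDE actually writes to, which is why a folder shared between your computer and the VM matters.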

The request to include VirtualBox guest additions in boot2docker, so that the VM can share your $HOME directory and let containers access your local files through a volume, is a long-running issue on the boot2docker issue tracker. There are lots of reasons why it had been rejected:

  1. boot2docker wasn't designed as a development helper, but as a minimal OS to run Docker containers, so it's not tied to VirtualBox.
  2. A better option would be for the Windows/OS X docker client's --volume option to support remote mounts, using some client-daemon communication as transport. But that's not trivial :)

Some have experimented with a Samba server running in the boot2docker VM, so you can access it from your development environment as a (actually local) remote directory. I haven't tested it, but this just looks like a hack, and it is a huge entry barrier for newbies.

This morning I was surprised to see this pull request being merged:

You can read the discussion there for more details, but the main point is:
"boot2docker" currently is essentially the "Docker daemon for Mac OS / Windows"

That's a major point, and it's nice to see the boot2docker team consider the way the project is actually used in the Docker community. The Docker ecosystem is full of talented developers, with impressive skills and knowledge of low-level system stuff and virtualization constraints. But Docker's purpose is to make this simple and offer a common interface to all underlying technologies. Docker's main benefit is that a developer - even a junior Java EE developer - can use it to run his application on a Windows workstation and then reproduce this stupid "UTF-8 Invalid Byte Sequences" bug.

This ended up with Docker CTO Solomon Hykes escalating this issue as a major Docker-wide one. There has been an active debate, but we eventually got it fixed. Thank you guys for taking care of your community!

14 September 2014

Is Docker ready for production?

This article is a response to IS DOCKER READY FOR PRODUCTION

"You embed a distro in a distro (or multiple distros in a distro)"

Sure, that's the reason dedicated Docker Linux distros have been created, like Boot2docker or CoreOS. Those are designed to be lightweight (~100 MB) and focus on what production systems need: stability and maintenance. They don't even provide a package manager.

"your container will most likely weight more than 1GB" not really. The base image might weigh a few hundred MB, but you'll only download it once, and most companies will anyway try to standardize the distros they use for applications, not just let developer creativity give esoteric distributions a try.

The initial setup of a Docker host requires downloading the base image and the application image layers. The application download is in most cases pretty quick, but the base image(s) might take some time. For this reason, a Docker host should be provisioned with the base images most commonly used in the company pre-installed. On AWS, it is a common performance improvement to (re)create an AMI, rather than just relying on a base AMI and running a configuration manager to fully set up the box (at least, this is how we do it at CloudBees).

"Dreaming of a statically build binary"

Building an image from scratch is a bit crazy, though an interesting exercise, but I'd recommend relying on Busybox if you want to reduce image size. See David's java8 Dockerfile for an example. It is a bit hack-ish, as Busybox wget doesn't have HTTPS support to download JDK 8 from the Oracle web site, but it results in a minimalist image anyway.

The complexity of building statically linked binaries depends on the target environment. For Ruby this seems to be painful, with lots of dependencies, resulting in a 450 MB install. I guess the Dockerfile installed some build tools, compiled, then deleted the build tools, but not within the same RUN command, so the layers of the Docker image still contain files that were actually deleted in the union filesystem.

That's just an assumption. The image could be flattened by running:
docker export containerId | docker import - name:latest
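
To illustrate the layering problem described above, here is a minimal sketch of a Dockerfile where the build tools are installed, used and removed within a single RUN instruction, so that no intermediate layer keeps them. The base image, the package name and the `make` step are assumptions chosen for the example, not what the original Ruby image actually did.

```shell
# Hypothetical helper that writes an illustrative Dockerfile.
# $1 = directory in which to write the Dockerfile
write_single_run_dockerfile() {
  cat > "$1/Dockerfile" <<'EOF'
# Base image and build commands are assumptions made for this sketch
FROM debian:wheezy
COPY . /usr/src/app
# Install the build tools, compile, and remove the tools in ONE instruction:
# each RUN creates a layer, so a later "delete" would not shrink earlier layers.
RUN apt-get update \
 && apt-get install -y build-essential \
 && make -C /usr/src/app install \
 && apt-get purge -y build-essential \
 && apt-get autoremove -y \
 && rm -rf /var/lib/apt/lists/*
EOF
}

# Example call: write_single_run_dockerfile . && docker build -t myapp .
```

The trade-off is readability and caching: one long RUN is harder to read and is rebuilt as a whole whenever any part of it changes.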

For such a setup, one should either use David's approach and create a one-liner RUN command, or look for a way to build the binaries (in one Dockerfile) and include only the actual result in another one (see feature request 7992).


"There’s no easy way logging with Docker" - there is one actually: just dump to stdout. The docker logs command can be used to retrieve the logs, and so can the daemon API, so you can then plug in various management tools. Read for example this typical usage scenario for libswarm: http://blog.docker.com/2014/07/libswarm-demo-logging/.

For people who prefer the syslog approach, this doesn't break the "1 container, 1 process" philosophy ... until you try to package syslogd in your container. Read http://jpetazzo.github.io/2014/08/24/syslog-docker/ for a description of using Docker with syslog, the latter being yet another containerized service. I don't get the argument about container isolation: application and syslog communicate using a Unix socket, so what's wrong with letting them talk?
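
Coming back to the "just dump to stdout" approach, here is a minimal sketch of what it can look like in a container's entrypoint script. The `log` function and the messages are invented for the example; the only Docker-specific part is that whatever the process writes to stdout or stderr is what `docker logs` returns.

```shell
# Hypothetical logging helper for an entrypoint script:
# prints one timestamped line per message on stdout.
log() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
}

log "application starting"
log "listening on port 8080"

# From the Docker host, the same lines can then be read with:
#   docker logs <container>
#   docker logs -f <container>    # follow the output
```

Nothing in the application needs to know about Docker here, which is the whole point: the log transport is decided outside the container.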

"admin nightmare"? Yes, you need your sysadmins to understand and manage Docker containers and how to orchestrate them. Did someone tell you Docker would replace all of them? With a classic deployment setup they already had to manage apps communicating with various services and resources, so that's not a new challenge.

"Network management"

Docker networking is actually complicated, and at first the documentation doesn't help you feel confident with it, but it is useful for going further once you have understood it. Sorry guys for having detailed documentation :) http://blog.thestateofme.com/2014/09/12/docker-networking/ has very explicit diagrams to explain Docker networking, followed by cryptic iptables configuration samples ...

Virtual networking has never been a simple topic, and the Docker way only covers the simpler use cases. Weave or Pipework can cover most complex scenarios. From discussing OpenStack capabilities with some network engineers, this is definitely a topic that requires some advanced skills. Anyway, most human beings will only need the --link option for their Docker application, and that's pretty cool.
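
For the simple case, a sketch of the --link option could look like the following. The container names and image names are placeholders invented for the example; `--name` and `--link name:alias` are the actual docker run options.

```shell
# Hypothetical two-container setup; all names are assumptions.
start_linked_app() {
  # Start the backing service first, under a name other containers can refer to
  docker run -d --name db my-database-image
  # --link <name>:<alias> exposes the "db" container to the web container
  docker run -d --name web --link db:db my-web-image
}

# Example call: start_linked_app
```

Inside the web container, Docker then provides the address of the linked container through environment variables and a hosts entry for the alias, so the application can simply connect to "db".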

"Provisioning is not perfect at all"

I agree Dockerfiles are level 0 of software management. But nobody said your whole process needs to rely on a Dockerfile. It's 100% valid to use a classic build system and then just package the application binaries with a Dockerfile to produce a deployable application. It's better if all elements can be managed with a Dockerfile, but you can combine a few of them.

People who are used to the power of Puppet/Chef/Ansible will just create a base image with those tools set up and inherit from it for every application, with the Dockerfile just importing the cookbooks and running chef-solo. This is a nice way to migrate existing infrastructure to Docker. As a result, the Docker image is created with the power of the Chef DSL, but Chef only runs once during image creation, and the image is then immutable.
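
As an illustration of that approach, here is a sketch of such an application Dockerfile. Everything specific in it is an assumption: the base image name, the directory layout and the solo.rb / node.json file names are invented, and only the `chef-solo -c <config> -j <attributes>` invocation is standard Chef.

```shell
# Hypothetical helper that writes an illustrative Chef-based Dockerfile.
# $1 = directory in which to write the Dockerfile
write_chef_dockerfile() {
  cat > "$1/Dockerfile" <<'EOF'
# Base image assumed to already have Chef installed (made-up name)
FROM mycompany/chef-base
# Import the cookbooks and the chef-solo configuration into the image
COPY cookbooks /chef/cookbooks
COPY solo.rb node.json /chef/
# Chef runs once, at build time; the resulting image is then immutable
RUN chef-solo -c /chef/solo.rb -j /chef/node.json
EOF
}

# Example call: write_chef_dockerfile . && docker build -t myapp .
```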

Packer is an alternative, and I guess we will see more emerge in the Docker ecosystem to offer a higher level of abstraction and more flexibility for building Docker images. You can also build a Docker image with just the plain old docker commit command, and integrate it with your build tools, as long as you automate it some way. The Dockerfile is just the common denominator that allows DockerHub to build any image from sources and distribute it to any developer.

"Process monitoring? Don’t even think about it"

Containers require a new generation of monitoring agents. cAdvisor is one of them. For sure, migrating your existing monitoring system so that it embraces Docker containers is not trivial. For Nagios integration, there are a few nagios-docker plugins being developed. I haven't experimented with any of them so I can't tell how mature they are, but the metrics are available from the Docker daemon and the cgroups API. Read http://jpetazzo.github.io/2013/10/08/docker-containers-metrics/ to see how to use them.
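
To give an idea of what reading those cgroup metrics can look like, here is a minimal sketch that reads a container's memory usage from the cgroup pseudo-files. The function name is invented, and the default path is an assumption: the location of the memory cgroup hierarchy depends on the distribution and on the Docker version, so it is left as a parameter.

```shell
# Hypothetical helper: print the memory currently used by a container, in bytes.
# $1 = full (long) container ID
# $2 = root of the memory cgroup hierarchy for containers; the default below
#      is an assumption and may differ on your system
container_memory_bytes() {
  cgroup_root=${2:-/sys/fs/cgroup/memory/docker}
  cat "$cgroup_root/$1/memory.usage_in_bytes"
}

# Example call, with a placeholder ID:
# container_memory_bytes <long-container-id>
```

The long container ID can be listed with `docker ps --no-trunc`. A monitoring plugin would poll this kind of file and push the value to your existing system.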

Right, this will require some effort to migrate your existing setup. Nobody ever promised Docker would fit your existing tools without any effort. libswarm focuses especially on this issue: it provides a neutral integration API, so you can plug your custom orchestration / audit / monitoring / whatever tools into a Docker environment. Right, this is just a prototype at this time.

"Porting your application to Docker increases complexity. Really."

Applying the "1 container, 1 process" rule from the very beginning is hard. Consider it an opportunity to rethink your architecture. First use Docker containers as "lightweight virtual machines" and drop your application with its dozen processes and daemons in a Dockerfile. Docker's benefits (the ability to run a production-equivalent system locally) and the available orchestration tools (start with Fig, then consider alternatives for larger or more complex setups) will let you refactor your application and related Docker containers into smaller, focused services that collaborate.

Right, I don't know anything about Botify's infrastructure and the related constraints. Reading Frédéric's blog, my feeling was that they experimented with Docker (just for 2 weeks?) as a major transition, trying to apply it far too strictly, and were then hurt by actual constraints. I'd be very happy to have the opportunity to meet him (he's French like me, so it seems feasible) to discuss in detail the issues they had.

With a customer of mine (as part of my freelance activity) we are considering a baby-step migration to Docker, so that we can learn about Docker and the actual constraints of using it on a production infrastructure, as well as discover its benefits for development and continuous delivery. This is not a trivial transition, and for sure there are lots of points we will only discover with real-life usage.