A Docker Dev Environment in 24 Hours! (Part 2 of 2)

Last week, I posted Part I of our “Docker Dev Environment in 24 Hours” series. Today, in Part II, I’ll cover how we built our environment with the goal of helping you get some ideas or even recreate the process on your own. You can get all the files mentioned here in our public GitHub repo: https://github.com/relateiq/docker_public.

Orchestration

The main question we’ve gotten since we built this environment is, “How do you pull off the orchestration?” Our developers now only need to type a single command to start their entire development environment on their Mac OS X laptops, so we’ll explain how we did it. The complete orchestration and automation of our Docker dev environment required only two very simple bash scripts to start, stop, and update everything.

At a high level, a single command called “devenv” starts the entire environment: it is responsible for starting up the virtual machine and then executing the devenv-inner script. The devenv-inner script handles the orchestration of each of the containers, bringing each one up with the correct settings, network addresses, and port-forwarding. Let’s dive into each script.

Devenv.sh (What the Developer Sees)

Responsible for starting up the VM environment:

SCRIPT_HOME="$( cd "$( dirname "$0" )" && pwd )"
cd $SCRIPT_HOME/..

case "$1" in
       ssh)
               vagrant ssh
               ;;
       up)
               vagrant up
               ;;
       update)
               git pull
               vagrant ssh -c "sudo /vagrant/bin/devenv-inner.sh update"
               ;;
       *)
               vagrant ssh -c "sudo /vagrant/bin/devenv-inner.sh $1"
               ;;
esac

This script controls the virtual machine environment running on the host (Mac OS X). As you can see, it is very simple: it just abstracts the Vagrant commands that manage the virtual machine. At the top, we set an environment variable to the path of the Vagrant file so developers don’t need to remember the directory where they cloned the repo. Then a simple case statement dispatches each of the Vagrant commands.
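The one-liner at the top of the script is worth a closer look. Here is a minimal, standalone sketch of that idiom, separate from devenv.sh itself:

```shell
#!/bin/bash
# Sketch of the SCRIPT_HOME idiom from devenv.sh: resolve the absolute
# directory containing this script, independent of the caller's current
# working directory, so relative paths inside the script always work.
SCRIPT_HOME="$( cd "$( dirname "$0" )" && pwd )"
echo "script lives in: $SCRIPT_HOME"
```

Because `cd` runs inside a subshell created by `$( ... )`, the caller's working directory is never actually changed; only the resolved absolute path is captured.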

A new developer now just needs to type “devenv up” to download the virtual machine image and start it up. Once that is complete, they can run “devenv start,” which kicks off the “devenv-inner.sh start” command. That’s it. We also have an update command: running “devenv update” pulls the latest environment code from git and then updates each of the Docker images within the virtual machine. This is useful since we’re constantly adjusting our setup and adding new features. For instance, we plan to add a UI (like Shipyard) to give visibility into the environment.

Why not let developers just run the Vagrant commands themselves? We wanted to future-proof this script a little in case we swap out Vagrant for another solution (if Docker gets native Mac OS X support, for instance) or move the environment to the cloud. The abstraction matters because developers won’t have to learn new commands when the underlying tooling changes. As you can see, this was a pretty simple script.

Let’s move on to where most of the magic happens.

Devenv-inner.sh (The Magic!)

Responsible for starting up the docker containers:

start(){
	mkdir -p $APPS/zookeeper/data
	mkdir -p $APPS/zookeeper/logs
	sudo docker rm zookeeper > /dev/null 2>&1
	ZOOKEEPER=$(docker run \
		-d \
		-p 2181:2181 \
		-v $APPS/zookeeper/logs:/logs \
		-name zookeeper \
		server:4444/zookeeper)
	echo "Started ZOOKEEPER in container $ZOOKEEPER"

... - SEE REPO FOR THE REST OF THE CODE - ...

	mkdir -p $APPS/kafka/data
	mkdir -p $APPS/kafka/logs
	sudo docker rm kafka > /dev/null 2>&1
	KAFKA=$(docker run \
		-d \
		-p 9092:9092 \
		-v $APPS/kafka/data:/data \
		-v $APPS/kafka/logs:/logs \
		-name kafka \
		-link zookeeper:zookeeper \
		server:4444/kafka)
	echo "Started KAFKA in container $KAFKA"
}

The devenv-inner.sh script does most of the work. It is responsible for bringing up each container, setting up the volume mapping, applying any network settings, linking each container (new in 0.6.5), and pulling image updates.

Start

The start command follows a common pattern. If you look at the entire script, you’ll see the following block repeated, with small variations, for each service.

mkdir -p $APPS/kafka/data
mkdir -p $APPS/kafka/logs
sudo docker rm kafka > /dev/null 2>&1
KAFKA=$(docker run \
	-d \
	-p 9092:9092 \
	-v $APPS/kafka/data:/data \
	-v $APPS/kafka/logs:/logs \
	-name kafka \
	-link zookeeper:zookeeper \
	server:4444/kafka)
echo "Started KAFKA in container $KAFKA"

Let’s walk through the lines and what they mean:

1-2.) First, mkdir -p $APPS/kafka/data (and the matching logs directory) creates the directories on the docker host (not within the container) so logs and data persist across container restarts. We have to create the directories up front because docker run will error at start if a mapped host directory doesn’t exist.
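These two steps can be exercised in isolation. In this sketch, $APPS is replaced with a temporary directory purely for illustration:

```shell
# mkdir -p creates any missing parent directories and exits successfully
# even when the path already exists, so running it on every start is safe.
APPS="$(mktemp -d)"              # stand-in for the script's data root
mkdir -p "$APPS/kafka/data"
mkdir -p "$APPS/kafka/data"      # second run is a harmless no-op
echo "created: $APPS/kafka/data"
```

That idempotence is why the start function can run the same mkdir lines on every invocation without checking whether a previous start already created them.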

3.) We run docker rm to remove the container from the system so we can reuse the name. This is a consequence of Docker’s new naming feature: containers stay around on the filesystem after they shut down, so the name remains registered and will cause a conflict if you do not remove the old container first.
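The `> /dev/null 2>&1` tail is what makes this removal safe to run unconditionally. The same pattern can be shown in plain shell, with a throwaway file standing in for the named container:

```shell
# Attempt the removal, silence any "doesn't exist" error on both output
# streams, and carry on; the second attempt succeeds once a stale entry
# actually exists. `marker` is a stand-in for the registered container name.
marker="$(mktemp -d)/kafka"
rm "$marker" > /dev/null 2>&1 || true   # nothing to remove yet; error discarded
touch "$marker"                          # simulate a leftover stopped container
rm "$marker" > /dev/null 2>&1 || true   # stale entry removed, name free again
echo "name free for reuse"
```

The `|| true` here is belt-and-braces for scripts run under `set -e`; devenv-inner.sh relies on the error simply being silenced.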

4.) Next comes the docker run command, which has a lot going on. First, we capture its output in a variable ($KAFKA) in case it’s needed later; since the container runs detached, docker run prints the instance ID of the started container.

5.) The -d flag runs the container detached, as a daemon.

6.) The -p flag sets up the port redirection. Kafka’s default port is 9092; the first 9092 is the docker host’s port and the second is the container’s. If you specified something like 4444:9092, you could run multiple Kafka containers on the same host: each container would still listen on 9092 inside, but you control the port to which the docker host redirects.
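To make the host/container split concrete, here is a sketch that only prints the docker run commands it would issue; the container names and the 4444 host port are illustrative, not taken from the repo:

```shell
# Vary only the host side of -p to run several Kafka containers on one
# docker host; each container still listens on 9092 internally.
for host_port in 9092 4444; do
  echo "docker run -d -p $host_port:9092 -name kafka-$host_port server:4444/kafka"
done
```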

7.) The -v flag sets up persistent volumes on the docker host. This is crucial for letting the container store its logs and data for future analysis, as well as for loading test data. It also gives an engineer a place to wipe out all of the data if needed.

8.) The -name flag uses the new naming functionality within Docker. It lets you give the instance a friendly name and, more importantly, lets you link the container by name to another container.

9.) The -link zookeeper:zookeeper option enables container linking. By default, containers are fully isolated, but as of Docker 0.6.5 you can link multiple containers together, creating a network bridge between them. Essentially, linking lets containers communicate with each other and share environment variables. In our case, this allows Kafka to find Zookeeper’s IP address, so Kafka knows where to store its queuing information within Zookeeper. You can also use linking to create clusters that replicate data if needed.
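Inside the linked container, Docker exposes the alias through environment variables. The names below follow Docker’s ALIAS_PORT_&lt;port&gt;_&lt;proto&gt; convention and the values are stand-ins for illustration; they are assumptions about this setup, not output captured from it:

```shell
# Roughly what the kafka container would see after -link zookeeper:zookeeper.
# Both values below are illustrative, not real output from this environment.
ZOOKEEPER_PORT_2181_TCP_ADDR="172.17.0.2"   # assumed zookeeper container IP
ZOOKEEPER_PORT_2181_TCP_PORT="2181"
echo "kafka connects to $ZOOKEEPER_PORT_2181_TCP_ADDR:$ZOOKEEPER_PORT_2181_TCP_PORT"
```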

10.) The last line of the docker run command, server:4444/kafka, tells Docker which image to run. In our case, the image named “kafka” comes from a private registry located at host server, port 4444.

Update

update(){
	apt-get update
	apt-get install -y lxc-docker

	docker pull server:4444/zookeeper
	docker pull server:4444/redis
	docker pull server:4444/cassandra
	docker pull server:4444/elasticsearch
	docker pull server:4444/mongo
	docker pull server:4444/kafka
	docker pull ehazlett/shipyard
}

If our operations team pushes updates or new images, this function first updates the apt packages (including lxc-docker itself) to the latest versions. It then pulls each image from our private registry, fetching anything that is missing locally or has been updated. It will also pull from the public registry when needed, as with shipyard here.
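Since every pull follows the same shape, the list could equally be driven by a loop. This sketch just echoes the commands rather than running them, and REGISTRY stands in for the private server:4444 host:

```shell
# Echo the pull commands the update() function would run, one per image.
REGISTRY="server:4444"
pulled=0
for image in zookeeper redis cassandra elasticsearch mongo kafka; do
  echo "docker pull $REGISTRY/$image"
  pulled=$((pulled + 1))
done
echo "queued $pulled private-registry pulls"
```

A loop like this keeps the image list in one place, which matters as the environment grows beyond six services.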

Future Docker Work

I hope you enjoyed this article! So far, we’ve found Docker to be super easy to work with and, more importantly, a lot of fun. We have plans to start taking the work we’ve done thus far and push it beyond its original intent. Soon, we’ll start using it in production, so if you’re interested in more use cases for Docker, check back for future posts.

Please comment if you have any feedback or questions. And if you are interested in jobs at RelateIQ, please contact us here.

 


John Fiedler
