I’ve being reading about, poking, prodding and playing with CoreOS recently and thought it would good to document how to build a very basic clustered CoreOS setup in your lab. In this first post I’ll describe the CoreOS building blocks and show how to deploy an instance into your vSphere environment.
What the heck is this newfangled CoreOS thingy?
CoreOS is a relatively new Linux distribution. I know, there a gazillion Linux distros out there. We need another Linux distro like a hole in head. So what’s so special about CoreOS? Well, for lots of good reasons it’s becoming the darling of cloud distributions at the moment. Just like the current buzz about the PaaS platform Docker, CoreOS is making waves as the basis of many cloudy infrastructures (due in to no small part to the fact that CoreOS runs Docker apps exceptionally well). It’s oh so de rigueur, and I know the VMware community can’t get enough of playing with the latest-and-greatest lab tools.
This isn’t your father’s oldsmobile
This new Linux platform is a stripped down distribution based on Google’s ChromeOS toolchain. It can run as a VM (or bare metal) on your own hardware, and is also available via most of the popular cloud providers. It’s initial disk footprint is around 400MB, uses less 200MB of RAM, and boots extremely quickly. It’s basically a Linux kernel using systemd (the same init system already adopted by Fedora and OpenSUSE, committed for Red Hat’s and Debian’s next releases, and begrudgingly Ubuntu’s replacement for their existing Upstart system).
CoreOS uses a dual active/passive boot image and updates by downloading a complete image into the passive partition, and activating the new version on the next reboot (Yes, just like we’re used to with ESXi! – except this downloads itself). That means there’s no concept of a package manager doing dependancy resolution, just a full image being dumped down. This makes patching easy, quick and provides a get-out-of-jail free rollback option if there’s a problem.
Applications run within Docker containers, so all the good stuff you’ve heard about Docker is already included, ready to go. And the coolest bit of all this is the inherent clustering architecture. Services and applications can seamlessly and dynamically balance themselves across multiple CoreOS nodes.
The Secret Sauce. Well, not so secret as it’s Open Source. Before I launch into the lab setup, here’s a very quick rundown of the most interesting components that make CoreOS special.
etcd is a daemon that runs on every CoreOS instance proving a distributed key-value store. It keeps configuration data replicated across a CoreOS cluster, and handles things like the election process of members, allows new nodes, services, applications to register themselves in the cluster, and deals with failures appropriately. If your familiar with ZooKeeper or doozerd then you’ll know where this sits. In a vSphere world it’s analogous to the HA service that runs on each ESXi host.
Okay, this isn’t unique to CoreOS, but it’s so fundamental to they way CoreOS is designed I thought a 40,000ft description was important. Docker containers are based on LXC (Linux Containers) using the kernel’s cgroups to isolate resources. Docker’s purpose is to isolate applications, so in that respect you could compare Docker applications to VMware ThinApps in that they are completely self-contained apps that provide isolated sandboxes allowing you to run multiple versions alongside each other. But LXC is more akin to a very thin super-efficient type 2 hypervisor, as it can provide an entire Linux userspace to each app with namespace isolation. So it’s more appropriate to think of Docker containers as very lightweight VMs.
Docker’s virtualized containers can start almost instantaneously, no waiting for another OS to boot. Cluster serveral CoreOS instances together, where the docker apps can run on any node, store configuration information in a distributed service like etcd, and you start to see where this becomes interesting. Imagine an environment where your applications automatically load balance and failover to redundant, scaleable nodes, without application-specific awareness – it’s like what we’ve been doing with VMs across vSphere clusters, but practically removing the Guest OS layer and giving the applications the same mobility as VMs.
Fleet is a daemon that takes the individual systemd service run on each CoreOS machine and clusters it across all the nodes. It basically provides a distributed init process manager and it’s what gives the docker applications the coordinated redundancy and failover services.
It’s a fast-moving project with new ways of doing things so it’s taken a bit of trial-and-error testing to get things working. Certain components aren’t as well documented as they could be. I expect that much of what I write here will also be superseded quickly so if you’re reading this a few months after I publish it then you might want to check around for easier way to do things.
Time to get started
Download the latest CoreOS release with the following link: http://beta.release.core-os.net/amd64-usr/current/coreos_production_vmware_insecure.zip
At the time of writing it was only a tiny 176MB sized download. It’s worth noting in the URL that I’m downloading this from the beta channel. In a future post I’ll explain the update mechanism and how to switch to a different release channel.
I’m going to be installing this on an ESXi host. The instructions here: https://coreos.com/docs/running-coreos/platforms/vmware/ explain how you can download and use the OVF Tool from VMware to convert their Fusion/Workstation VM for ESXi use (I’m sure you could also use VMware Converter if you had a copy to hand). But I’m just going to do this by hand using the VMDK file included in the download.
I created a new VM as you would normally do in your vSphere Web Client and selected the following options through the wizard:
VM name: coreos1
In a future post I’ll explain how to cluster multiple instances together, so it makes sense to append this first VM with a number.
Guest OS: Linux, Other 2.6.x Linux (64-bit)
Compatibility: I selected vSphere 5.0 and above (HW version 8), but you should be able to pick whatever is appropriate for your environment.
Hardware: Here I removed the existing hard disk and floppy drive and set the memory to 512MB (you could drop the RAM lower, but I wanted some headroom for additional apps I’ll be running).
Note: For this tutorial, you’ll need to initially put the VM on a subnet that has DHCP services enabled.
Next, I copied the “coreos_production_vmware_insecure.vmdk” file from the zipped download up to my ESXi host’s datastore.
Now we have to manually convert the VMDK file. SSH into the ESXi host and drop into the newly created coreos1 VM’s directory and run the following command:
vmkfstools -i coreos_production_vmware_insecure_image.vmdk coreos1.vmdk -d thin -a lsilogic
Go back to the newly created VM’s settings to attach the “Existing Hard Drive” and BOOM, hit the power button.
When the boot process first gets to the login screen it might not have initialized the IPv4 address yet. Wait a few seconds and hit Enter in the VM’s console until an IP address is displayed.
Where’s my password?
To log into your newly birthed CoreOS VM, you need to go back to the downloaded zip file and retrieve the “insecure_ssh_key” file. Use the following command to use the key to remotely login in:
ssh -i <path_to_key>/insecure_ssh_key core@<your_coreos1_ip>
As this is just my home lab environment I won’t replace the default certificate, but obviously this is something you want to fix if you’re using this in a production setting.
The End of the Beginning
So we’ve finished the first step by deploying a single CoreOS VM. By itself it doesn’t do anything particularly exciting, but looking forward to some forthcoming articles you can start to see where things start to become more useful and have an insight into why CoreOS is becoming a tour de force all of a sudden.
A handful of the areas I’ll like to cover next are:
- digging into the update mechanism and changing release channels
- using the power of Docker to install applications
- systemd and the mysterious (and largely undocumented) cloud-config
- etcd’s distributed cluster management service
- clustering Docker applications using Fleet