Solaris Zones in the Real World
At one of the clients I’m assigned to at the moment, we’re moving our development environment to Solaris 10 on Sun x4100 servers. We have two physical machines, one for our CruiseControl environments and one for all our testing. To make good use of the resources we have (dual-core CPUs, lots of RAM), I’ve been carving them into zones. I’ve tinkered with zones in Solaris 10 ever since the first beta build that featured them, but it was always for little things and never anything serious. Consequently I thought they were quick and painless. Note the use of the word “thought”. Don’t get me wrong: they are the (almost) perfect solution for what we need. It’s just that if you’re planning on doing anything serious with them, here’s a list of gotchas you need to take into consideration.
The first problem I bumped into was stability. Once I had configured and booted the zones I wanted, I discovered that ssh-keygen would segfault and dump core when run from within a zone. Normal ssh and scp commands would occasionally segfault as well. After some very light scratching around, I decided to patch the systems to see if that made the problem go away. It did not, so I replaced the bundled SSH with the Sun Freeware OpenSSH package. This is when I found the next problem: patching.
For understandable reasons, Sun have restricted access to patches for Solaris 10. The days of just pulling down the latest recommended patch cluster to sort out your machines have gone. Sun now recommend you use Update Manager, a Java GUI app that registers your machine with Sun and lists the patches you can download. It all sounds reasonable in theory, but the first few times I tried it, it kept blowing up with a com.sun.cns.authentication.CMDExecutionException. It turns out Update Manager is broken on machines with zones, and you need to manually download and apply a patch to fix it. Another thing to remember when patching is to make sure that every zone you have configured has been properly initialized and has been through system identification at first boot.
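As a pre-patching sanity check, something like the following sketch can flag zones that deserve a look first. It parses the colon-separated output of `zoneadm list -cp`, assuming the first four fields are zone ID, name, state and zone path. The zone names and paths in the sample are made up for illustration, and a "running" state does not by itself prove that system identification has completed, so treat the result as a list of zones to check by hand rather than a verdict.

```python
# Illustrative capture of `zoneadm list -cp` (zone names and paths are hypothetical).
SAMPLE_ZONEADM_OUTPUT = """
0:global:running:/
1:cruise:running:/zones/cruise
-:test1:installed:/zones/test1
-:test2:configured:/zones/test2
"""


def parse_zoneadm_list(output):
    """Turn `zoneadm list -cp` text into a list of dicts.

    Assumes the first four colon-separated fields are id, name, state, path.
    Any extra trailing fields are ignored.
    """
    zones = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # not a line we understand, skip it
        zones.append({
            "id": fields[0],
            "name": fields[1],
            "state": fields[2],
            "path": fields[3],
        })
    return zones


def zones_needing_attention(zones):
    """Return the names of zones that are not in the 'running' state."""
    return [z["name"] for z in zones if z["state"] != "running"]
```

On a real machine you would feed it the captured output of `zoneadm list -cp` from the global zone instead of the sample string, then boot and log in to each flagged zone before applying patches.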
And then we have the niggles category. There is no lsof package for amd64 Solaris 10 yet, but you can script most of what you need using pfiles and fuser. DTrace only works in the global zone. The global zone has all the rights it needs to trace what’s going on, but if you’re running more than two or three zones, finding the specific process you want to debug can be a pain.
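One way to take some of the pain out of the process hunt is to group the global zone's process list by zone. The sketch below parses the output of `ps -eo zone,pid,args` run in the global zone. The sample text, zone names and command lines are invented for illustration, and the parser assumes only that each data line starts with a zone name followed by a numeric PID.

```python
from collections import defaultdict

# Illustrative capture of `ps -eo zone,pid,args` (all values are hypothetical).
SAMPLE_PS_OUTPUT = """
    ZONE   PID COMMAND
  global     1 /sbin/init
  cruise  2301 /usr/lib/ssh/sshd
   test1  2410 /usr/bin/java -jar app.jar
  cruise  2388 /usr/bin/java -jar cruisecontrol-launcher.jar
"""


def group_by_zone(ps_output):
    """Group processes by zone name.

    Returns a dict mapping zone name to a list of (pid, command line) tuples,
    in the order the processes appear in the input.
    """
    zones = defaultdict(list)
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # blank or malformed line
        zone, pid = parts[0], parts[1]
        if not pid.isdigit():
            continue  # header line such as "ZONE PID COMMAND"
        args = parts[2].strip() if len(parts) > 2 else ""
        zones[zone].append((int(pid), args))
    return dict(zones)
```

With the PIDs for a given zone in hand, you can pick the one you care about and attach to it from the global zone.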
With all this pain, is it worth it for development and testing? Most definitely! Zones allow you to have all the production-like environments you need for testing, or even just for developers to spike ideas in. Clients are happy because they don’t need to buy as much hardware. Testers are happy because if they need a new environment they can have it in a matter of hours instead of days or weeks. Sysadmins are happy because they don’t have to keep finding rack space for more machines.