Author Archive

Quick Chef Tip

11 August 2010

I’m busy wiring together a new server configuration environment using Windows Deployment Services (don’t ask), Cobbler and Chef. So far things have been going quite well – until I bumped into the following error trying to get a new client to register with the Chef server:

HTTP Request Returned 401 Unauthorized: Failed to authenticate!

A quick sift through Google results didn’t turn up anything usable. A quick sniff of the packets going over the wire, though, showed that it was authenticating using a signed certificate. Normally when you sign HTTP requests like that you add some kind of timed expiry. Could the problem be clock related?

Sure enough, a quick check on the new client and the server showed that there was just over an hour’s difference between them. Getting the time on the client and the server in sync got the client registered!
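If you hit the same error, it’s worth comparing clocks before digging any deeper. A minimal check and fix, assuming ntpdate is installed and you’re happy to sync against the public NTP pool:

# Run on both the Chef client and the Chef server to compare clocks
date -u

# Bring the client back in line with an NTP server
ntpdate pool.ntp.org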

Categories: DevOps, Linux, Unix

Time for a change of scenery…

23 April 2010

After more than 5 years at ThoughtWorks I’ve decided that it’s time for a change of scenery. As much as I enjoyed the challenge of consulting, meeting new people and seeing new places, I enjoy spending time with my family more. It was a tough decision to make, but I think it’s the right one for me now. I will miss many people at TW, but on the plus side I’m again working with some great people whom I missed when they left TW…

As I’m going to be doing DevOps type stuff all day every day at DRW now, I’m hoping I’ll have more time to document and share the things I discover.

Categories: Stuff

The End of Buildix

16 December 2009

I’ve been putting off this post for a few months now, but I think the time has finally come to admit what I’m sure people who care have guessed for a while – active development on Buildix has stopped, and will probably not resume. The site will stay up for the foreseeable future, nothing will vanish, but nothing new will be added either.

We started the project because at the time, setting up a new Continuous Integration server was quite an arduous task. The only real option out there for a Java project was CruiseControl, and it could take a new developer days to get their first build through the system. Thankfully though this is no longer the case.

Since then the whole CI landscape has changed. Just having a single “build server” is now more the exception than the rule. It’s all about build farms these days, using tools like Cruise, Hudson and TeamCity. They integrate nicely with a variety of SCMs and story tracking tools. Setting up a build environment with these tools is really easy now. I’d like to think that Buildix at least had something to do with helping people see how easy it could be to get a CI environment up and running, and I know that at least in the case of Cruise this is true because I’ve been part of that team.

So – thank you to all of you who used Buildix and liked it and provided feedback. Thank you also to the current big players in the CI field who put effort into making sure that looking after your CI environment no longer needs to be a full time job for someone.

Report back from DevOpsDays 2009

24 November 2009

A few months ago I got an email from Patrick Debois, who I’d met at CITCON Europe, asking if I’d be interested in speaking at the first conference aimed at System Administrators practising/interested in/sceptical about Agile. One of the key beliefs of those of us already doing this is that Agile practices are generally too narrowly focussed in their implementation. At the moment it’s primarily the Development organization that drives adoption, but to get the most benefit the Development and Operations groups within an organization need to work together.

With this in mind it was decided to call the conference DevOpsDays. Videos of the talks will be online in the archives section soon, so I’ve decided to write down my thoughts about what went well and not so well – I am a fan of retrospectives.
Read more…

Self Identifying Software

22 September 2009

How often has someone come up to you and asked you what build of your software is currently deployed in a specific environment?

How many times have you come across a .jar or .dll file and wondered what version it is? Especially when using Open Source Software?

The most frightening one for me is when I’ve looked at a cluster of production servers and noticed that the .war file for the application deployed there was a different size on one of the nodes. Which one was the correct one to deploy? Luckily this happened to me a long time ago, but I know that people out there are still having this problem today.

The solution is what I call “Self Identifying Software”. Every build of your software needs to have something that tells you what version it is and how to get back to the source code that created it. Having a build label or release number visible in your application is a good start, but it does not make your software Self Identifying. Product companies have been doing this forever. The problem is that for that number to be useful (particularly when you’re trying to access the source code to reproduce and fix a bug) you then need to refer to a build system or release notes to find out where the source code came from (if you’re lucky). It often also does not apply to development builds. To be truly Self Identifying, you need to make sure that every build (including builds developers create on their workstations) also includes enough information from the SCM system that anyone who has access to the source code can go right back to the exact source code that produced that binary. For example, if you use Subversion as your SCM then this will be a URL and a revision number.
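As a rough illustration (the file name, property names and paths here are made up for the example, not a prescription), a build step can capture the Subversion details into a properties file that gets packaged inside the artifact:

# Hypothetical build step: record exactly which source produced this binary
SVN_URL=`svn info | grep '^URL:' | awk '{print $2}'`
SVN_REV=`svn info | grep '^Revision:' | awk '{print $2}'`

mkdir -p build/resources
cat > build/resources/version.properties <<EOF
scm.url=$SVN_URL
scm.revision=$SVN_REV
build.time=`date -u +%Y-%m-%dT%H:%M:%SZ`
EOF

The application can then log that file at startup or expose it on a status page, and a developer’s workstation build gets stamped in exactly the same way as a CI build.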

This is not exactly a new concept; it’s something I (and others) have been doing for a number of years now. The reason I’ve decided to write about it now, though, is that recently I was showing a new guy around one of the projects I’m working on at the moment, and when I showed him how to determine which version of the app was deployed he was delighted.
Read more…

Real Clouds Don’t Have Logos

28 April 2009

I’ve been doing even more reading than normal on the subject of Clouds lately, as quite a few of us within ThoughtWorks who are going to be speaking on the subject next month are comparing notes. So when I clicked the link for “The Wrong Cloud” I was not prepared for a delightfully entertaining paper that would contain my quote of the day:

“Today’s so-called cloud isn’t really a cloud at all. It’s a bunch of corporate dirigibles painted to look like clouds. You can tell they’re fake because they all have logos on them. Real clouds don’t have logos.”

As much as I enjoyed reading their paper, I must say that I disagree with a lot of what the guys at Maya are saying. Yes, there is a large dose of spin and hand-wavey magic going on with the current leading fashion trend (that bit is totally true). Yes, it is very easy to tightly couple your application to a cloud vendor. The thing is, though, that it’s not that different from the tie-in you get when selecting what language you develop your application in, which third party libraries you use or even what operating system(s) to target. The only real difference I can think of is that if for some reason the cloud vendor you’re backing stops running, your app goes down – unlike all those mission critical OS/2 applications that are still running out there…

I’m pretty sure that you’d have fair warning before the plug was pulled though, especially if you’re still paying them money every month for their services.

The real questions you need to ask before doing anything on any cloud service are:

  1. What is the problem I’m trying to solve?
  2. Do any of the cloudy offerings actually help me solve that problem?
  3. What is the cost difference between deploying this app in the cloud vs our own infrastructure (assuming you have any of your own)?
  4. What is the point at which that will change? Is there a usage point where it would be cheaper for me to move off the cloud?
  5. If I choose platform X, how hard will it be to move to platform Y?

Cloud is not a magic silver bullet – such things don’t exist. As with any technology choice you make, you need to select the most cost effective one for the problem you have, and try your hardest to ignore the FUD.

Podcast on Continuous Integration available

27 April 2009

Last year at JAOO I had the chance to speak to Markus from Software Engineering Radio about the talk I gave there on Continuous Integration. It’s finally available now over here. The slides that go along with the talk are available from the JAOO site.

Talking about Java in the Google Cloud in London

23 April 2009

Ola Bini and I are going to be talking about Java in the Google Cloud at Skills Matter in London on 11 May.

Registration and more info at http://skillsmatter.com/podcast/ajax-ria/java-in-the-google-cloud

EC2 AMI Creation Tips Part 2: Work with Images, not Volumes

8 April 2009

It’s been a long time since my first post on EC2 AMI Creation Tips. At the time the primary images people were using were the RedHat based ones supplied by Amazon, but I was trying to do something Ubuntu based. Since then a whole host of other well-prepared images have become available. I was even lucky enough to be invited to create an AMI for Sun’s launch of OpenSolaris on EC2, but am not allowed to say much more about it…

Recently though I’ve been speaking to more and more people who are trying to take an existing AMI and customise it for their own use. They do this by booting the AMI they want to base theirs on, doing the customization, then bundling up that volume. Generally they do pretty well, but there are three common themes that crop up and often cause pain: transient runtime configuration being bundled up, the time (and to a lesser extent effort) it takes to bundle the new image in the first place, and making further changes to the image down the line.

Thankfully there is a single, simple solution to these three problems – bundle from an image, not a running volume, and keep that image (or a set of images) along with some nice helper scripts on an EBS volume. That’s the theory, but as always there’s something in the real world that stops it being easy. By default, only the owner of an image can download and unpack it directly from S3, and the images are encrypted with the owner’s EC2 private key. For this process to work, you’ll need to at least bootstrap yourself initially by going through the well known and well documented process of bundling a running system. After that though it’s easy. Really. Promise…

Let’s have a closer look at the problems we’re trying to solve, though, before I go into how we fix them.

Transient Runtime Pollution

The most frustrating of these is the udev system flagging the source machine’s MAC address for eth0 and so making their custom image unusable because the network interface does not come up. There are still distributions out there which try to be “helpful” by remembering which physical device, such as a network card, maps to which logical device name, such as eth0. This is not in itself a bad thing. This is one feature I was crying out for 10 years ago when I started using Linux on bigger iron. I would dread adding another network card to a server because I would normally end up having to re-label the external interfaces. The thing is though that you’re now creating an image that could be running anywhere and you don’t have physical or even console access to it.
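On most distributions of that vintage the cached mapping lives in a udev rules file, so clearing it out just before you bundle is usually enough. The path below is the common Debian/Ubuntu location – it varies by distribution, so treat it as an example rather than gospel:

# Forget the source machine's MAC-to-eth0 mapping so the image boots cleanly elsewhere
rm -f /etc/udev/rules.d/70-persistent-net.rules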

Other examples of things that break are helper scripts. Because we’re now on an operating system image that is meant to be able to run anywhere, there are certain things you want to run only once, the very first time the system boots. Once they’ve run, these scripts either create a lock file, clear their own executable bits or even delete themselves. If you’re trying to re-bundle an image you’ve already booted, you need to make sure you back out these changes.
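These run-once helpers usually follow a pattern along these lines (a purely illustrative sketch – the lock file path and the work it guards are made up):

#!/bin/sh
# Illustrative first-boot script: does its one-off work exactly once
[ -f /var/lock/firstboot.done ] && exit 0

# ... one-off setup goes here (generate host keys, fetch credentials, etc.) ...

touch /var/lock/firstboot.done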

Doing your customization in an image that has never actually been booted helps you keep all these things pristine.

Time and Effort of Creation

This one is actually quite straightforward. When you’re bundling a running volume, what happens under the hood is:

  1. A new sparse file for the image is created
  2. A new filesystem is created on the new image file
  3. This new filesystem is mounted somewhere
  4. The contents of your running volume are copied into the new image file
  5. The new filesystem is then unmounted
  6. The image file is compressed and encrypted
  7. The compressed and encrypted file is then split into chunks
  8. Your manifest is created

Some of these steps are very I/O intensive. When you’re working with an image though, steps 1 to 5 don’t happen (well, steps 3 and 5 are needed for you to make changes) so you’ll be doing almost 50% less I/O. This means that bundling a new image will take about half the time. If you work with your image on an EBS volume it’ll be even faster, as they have better performance characteristics than the standard instance stores.

Bundling and uploading images are not simple commands though. You need to specify things like your AWS access key and provide your EC2 encryption key. There are options for which kernels and ramdisks to use. There’s lots of typing, which means lots of room for human error. The way to get around this is to have small shell scripts with all these options in them. Now they are simple commands…
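Something as small as this already removes most of the typing. It’s only a sketch – the bucket, account ID, keys and image path are all placeholders, and it assumes the EC2_PRIVATE_KEY and EC2_CERT environment variables set up in step 1 below:

#!/bin/sh
# bundle-image.sh - hypothetical wrapper so the options only need to be right once
IMAGE=$1                        # path to the image file to bundle
BUCKET=your-bucket              # S3 bucket to upload to
AWS_ID=your-aws-account-number  # AWS account number, not the Access Key ID
S3_ACCESS=your-s3-access-key
S3_SECRET=your-s3-secret-key
DEST=/tmp/bundle

mkdir -p $DEST

# Bundle the image file (not a running volume)
ec2-bundle-image -i $IMAGE -d $DEST -r i386 \
  -k $EC2_PRIVATE_KEY -c $EC2_CERT -u $AWS_ID

# Push the resulting bundle to S3
ec2-upload-bundle -b $BUCKET -m $DEST/`basename $IMAGE`.manifest.xml \
  -a $S3_ACCESS -s $S3_SECRET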

Maintenance

Once you’ve got your new AMI looking the way you want and doing the things you need, chances are that a few weeks after you’ve started using it you’ll find a security fix or package update you’d like to apply. Often this ends up with people starting the whole process from scratch again: boot up a new instance of the AMI you want to update, update it, type in all those commands and remember the options you used to bundle the volume and upload the new one. If you kept your scripts and image on an EBS volume, you could simply attach it to a running instance and make the fixes there using the same scripts you used last time. How’s that for repeatability?

“So, just how do I work with an image then?” I hear you ask. Here’s a basic outline to get you started.

1. Set up your environment

These steps assume that you have the EC2 AMI and API tools installed locally, and that you’re running the commands on an EC2 instance. If you don’t have them, please look at EC2 AMI Tools and EC2 API Tools.

You also need some environment variables configured to make life easier:

export EC2_PRIVATE_KEY=/path/to/your/EC2/private/key
export EC2_CERT=/path/to/your/EC2/cert

2. Create your EBS Volume

The hardest thing will be working out how big you need to make it. Absolute worst case will be 20GB per image, but in reality 10GB should be plenty. Remember though that an EBS volume can only be attached to instances in the availability zone it was created in, so this command creates one in the same zone you are in.

ec2-create-volume -s 10 -z `curl http://169.254.169.254/2008-09-01/meta-data/placement/availability-zone`

3. Prepare the Volume

First, attach the volume to a running EC2 instance. Make sure the instance is at least the same architecture (i386 or x86_64) as the image you’re working on.

ec2-attach-volume vol-<your vol id> -i `curl http://169.254.169.254/2008-09-01/meta-data/instance-id` -d /dev/sdp

An EBS volume is a raw bit bucket. You need to partition it (if you’re into that kind of thing) and create the filesystem on it. Partitions don’t really make sense here though, so just create a nice shiny filesystem on it once it’s attached to the instance.

mke2fs -j /dev/sdp

In this example I’m creating an ext3 filesystem, but you can use any filesystem that’s supported by the host machine. Please make sure though that the block device you specify (in this example /dev/sdp) matches the device you told EC2 to attach your EBS volume to.

mkdir /ebs

mount /dev/sdp /ebs

This mounts your new filesystem on a directory called /ebs

mkdir /ebs/mnt

mkdir /ebs/download

mkdir /ebs/upload

mkdir /ebs/.ec2

This creates some handy directories to help you along with the process. This is what they do:

  • mnt: Will be used as the mount point to access your image
  • download: This is where you’ll download your initial bundle to
  • upload: When you bundle an image, put it here ready to be uploaded
  • .ec2: This will contain your AWS access keys and your EC2 PEM files as follows:
    • s3.secret: S3 Secret Key
    • s3.access: S3 Access Key
    • ec2-pk.pem: EC2 Private Key
    • ec2-cert.pem: EC2 Certificate
    • id: EC2 user ID (Note: AWS account number, NOT Access Key ID)

ec2-download-bundle -b your-bucket -a `cat /ebs/.ec2/s3.access` -s `cat /ebs/.ec2/s3.secret` -k /ebs/.ec2/ec2-pk.pem -d /ebs/download -p your-image-name

This command pulls down the bundle you want to customise from S3. As I said before though, this will only work if you have sufficient rights on S3 to download the image, and you have the EC2 private key that was used to bundle and encrypt it in the first place.

ec2-unbundle -k /ebs/.ec2/ec2-pk.pem -s /ebs/download -d /ebs -m /ebs/download/your-image-name.manifest.xml

This command uncompresses and decrypts the image file from the downloaded bundle. It takes a while…

4. Customise

Now you’re ready to work. All you need are two shell scripts to go in your /ebs directory. work.sh mounts up your image (if it’s not mounted already) and chroots you in, and you’re now up and running – customise to your heart’s content. When you’re done, make sure you’ve logged out of all your work.sh sessions (yes, you can run more than one) and then run bundle-and-upload.sh.
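A minimal work.sh might look something like this. It’s a sketch rather than a finished script – the image name matches the placeholder used earlier, and there’s no error handling:

#!/bin/sh
# work.sh - loop-mount the image (if needed) and drop into a chroot
IMAGE=/ebs/your-image-name
MNT=/ebs/mnt

# Mount the image if it is not mounted already
if ! grep -q " $MNT " /proc/mounts; then
  mount -o loop $IMAGE $MNT
fi

chroot $MNT /bin/bash

bundle-and-upload.sh would just wrap ec2-bundle-image and ec2-upload-bundle with the keys and IDs stored in /ebs/.ec2, along the lines of the wrapper sketched earlier, writing the new bundle into /ebs/upload.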

Once the new bundle has been uploaded, just shut down your host machine. When you want to work on it again later, just boot up a new AMI, attach your volume, mount it up and you’re back at step 4 already.

Have fun…

Above the Clouds – This Sounds Familiar…

24 February 2009

I found a link to Above the Clouds, a paper on Cloud Computing recently published by a quartet of UC Berkeley RAD Lab professors. I’ve been quite disappointed with publications on the subject of the latest buzzword taking the world by storm right now, so I was not expecting much when I first clicked on the link. The thing is, as I started reading through the Executive Summary it all sounded very familiar. The outline they give in the summary follows the same structure as a talk I gave in November last year at the ThoughtWorks London office for the London Java Community.

The only criticism I have is that they don’t put enough emphasis on one of my key reasons why it’s suddenly taken off. Cloud computing is not a new idea – it’s an extension of the Utility Computing that John McCarthy talked about in 1961. Although they only make a passing remark about it in section 3, I think one of the most important reasons it’s taken off is that the services Amazon provide were the first that were not a “solution looking for a problem”. Earlier offerings by the likes of Sun, HP and Intel all created a solution that they tried to sell to clients. The problem was that there were remarkably few problems that their solutions solved. Amazon simply exposed services that they were already using internally. That’s not to say the other reasons they give are not valid – I totally agree with them. I just think they missed a good point.

One of the topics I only glossed over is covered quite well in section 6 – Cloud Computing Economics. They provide some interesting example cost calculations. Although the numbers are obviously US-centric, they do provide a nice way for a company to approach the old “build vs buy” comparison.

In summary, I highly recommend this paper for anyone who wants to get their head around what this Cloud stuff is all about and what they need to do to prepare for it.