I’ve just release Ecks into the wild, a Python library for accessing SNMP data from a server without having to deal with the pain of knowing about what a MIB or OID is. SNMP stands for Simple Network Management Protocol, but for most people it is anything but simple. It’s pretty straight forward once you understand what’s going on, but most people are daunted by the learning curve.
What results from this resistance is that when your average developer decides he wants to monitor CPU usage or disk space on his machine he or she ends up doing it in the most obtrusive way possible – SSH. While I’m a big fan of small shell scripts, this is one place they do not belong. Let me give you an example:
I set up a new server here in London for one of our Chicago teams. Being a conscientious team, the first thing they did was wire in some monitoring that wrote for their servers. It checks things like disk space, memory usage, CPU load and the state of various processes that they care about. They need pretty fine grained checking intervals, so they check these every minute. The easiest way they know how to do this though is to SSH in to their machines and run df, free, netstat,etc and scrape the output. Every minute. Which on this nice shiny server consumed almost 20% of the CPU right off the bat. Educating them on the use of SSH ControlMaster helped, but it’s still doing a lot of work on the machine.
This was the last straw that lead to the creation of Ecks. People will always follow the path of least resistance, so if you want people to do the right thing, you need to make it the easiest thing to do. SNMP has all this information available, modern snmpd implementations are stable, have a tiny footprint and are more secure than providing SSH access to your machine.
The hardest part of all though is what to name this little library. When discussing the problem with Julian Simpson (the @builddoctor), he pointed out that MIB always reminded him of the Men in Black. Reading the Wikipedia article on the original comic book series had some interesting snippets:
The Men in Black are a secret organization that monitors and suppresses paranormal activity on Earth…
Replace “Earth” with “a computer” and you’re starting to get somewhere. Then I noticed this gem:
An agent named Ecks went rogue after learning the truth behind the MiB: they seek to manipulate and reshape the world in their own image by keeping the supernatural hidden.
Many people think that the complexity of the MIB keeps SNMP data hidden. And so the name was chosen…
When I got back from DevOpsDays in Hamburg this year I felt the need to explain my journey to the “DevOps” world and my view on where it’s headed. I started writing a State of the Nation paper to lay it all out. A couple of weeks in I got a message from Matthias Marschall asking if I’d like to do a guest post as part of their DevOps series. I agreed, and after a lot of effort (and help from a couple of great editors) you can now read it here.
It’s by far the longest article I’ve ever written, and I was amazed at how ideas that had been floating around in my head for a while crystalized through the processes of writing them down. I found I got so passionate talking to people about what was in it that I’ve decided to make a talk out of it, the first iteration of which will be in Chicago on Tuesday (see previous post).
Julian Simpson, aka The Build Doctor, has been working away at a nice web based Build Status Monitor called XFD for a while now. One of my complaints for years has been that there’s no nice build status tool that’s easy to use, but I think he’s on to something.
It’s entered in the Ultimate Wallboard Challenge, and you can vote for it here.
A few years ago Martin Fowler introduced me and a few other ThoughtWorkers who were involved in the Continuous Integration and Deployment space to an editor he knew and told us we needed to write a book about what we were doing. The key thing we were focusing on was making sure that quality software could be released to production in a reliable and repeatable way. Software has absolutely no value until it’s running in production. If you can’t get it into production quickly and easily then you’re just wasting time and money.
There were a few false starts, a few changes of crew, but eventually Jez Humble and Dave Farley stuck it out all the way to the end and Continuous Delivery was published this year. No matter what you do in your organisation, if you’re filled with dread at the thought of a new software release then you need to buy it and read it and do what it says. Now. This article will still be here after you’ve ordered it. Although I didn’t have enough time to be a big contributor, Jez would drag me in front of a white board when ever our paths crossed to discuss things and kept on sending me drafts of key chapters that I have specific interest in and so I am proud to have helped in at least some small way.
One of the things he did do was include a mention to a project that Tom Sulston and I started to make configuration management easier. It’s called ESCape, and I’ve written and talked about it a few times in a few different places. As more and more people are reading the book though I’m getting more questions about what the status of the project is and what our plans are going forward.
At the moment the project is in hibernation. We’ve not made any changes for over a year now, and I don’t think I’ll be working on it in its current incarnation any time soon. That does not mean it’s not based on a good idea though! It’s more a problem of implementation.
At its heart, ESCape is supposed to be a simple way to manage a hierarchal key/value store. I like the way the UI works, Dan North even has a wonderful acronym for it that I can’t for the life of me remember right now. The real problem with the current design is how we’re storing the data. Trying to wedge that kind of data into a relation database always felt dirty and I decided to stop before any real damage was done.
What I’d like to do though is to take the existing UI and functionality and use something like Neo4J or CouchDB to store the data. Conversations I’ve had with Jim Webber and Ian Robinson about it were one of the reasons I didn’t start immediately on a replacement as at the time Jim was making plans to write the REST interface into Neo4J. As an early release of it is now available I guess I’ve run out of excuses…
I’ve been putting off this post for a few months now, but I think the time has finally come to admit what I’m sure people who care have guessed for a while – active development on Buildix has stopped, and will probably not resume. The site will stay up for the foreseeable future, nothing will vanish, but nothing new will be added either.
We started the project because at the time, setting up a new Continuous Integration server was quite an arduous task. The only real option out there for a Java project was CruiseControl, and it could take a new developer days to get their first build through the system. Thankfully though this is no longer the case.
Since then the whole CI landscape has changed. Just having a single “build server” is now more the exception than the rule. It’s all about build farms these days using tools like Cruise, Hudson and TeamCity. They integrate nicely with a variety of SCM’s and story tracking tools. Setting up a build environment with these tools is really easy now. I’d like to think that Buildix at least had something to do with helping people to see how easy it could be to get a CI environment up and running, and I know that at least in the case of Cruise this is true because I’ve been part of that team.
So – thank you to all of you who used Buildix and liked it and provided feedback. Thank you also to the current big players in the CI field who put effort into making sure that looking after your CI environment no longer needs to be a full time job for someone.
A few months ago I got an email from Patrick Debois who I’d met at CITCON Europe asking if I’d be interested in speaking at the first conference aimed at System Administrators practising/interested in/sceptical about Agile. One of the key beliefs of those of us doing this already is that Agile practices are generally too narrowly focussed in their implementation. At the moment it’s primarily the Development organization who drive its adoption, but to get the most benefit Development and Operations groups within an organization need to work together.
With this in mind it was decided to call the conference DevOpsDays. Videos of the talks will be online in the archives section soon, so I’ve decided to write down my thoughts about what went well and not so well – I am a fan of retrospectives.
How often has someone come up to you and asked you what build of your software is currently deployed in a specific environment?
How many times have you come across a .jar or .dll file and wondered what version it is? Especially when using Open Source Software?
The most frightening one for me is when I’ve looked at a cluster of production servers and noticed that the .war file for the application deployed on it is a different size on one of the nodes. Which one was the correct one to deploy? Luckily this happened to me a long time ago, but I know that people out there are still having this problem today.
The solution is what I call “Self Identifying Software”. Every build of your software needs to have something that tells you what version it is and how to get back to the source code that created it. Having a build label or release number visible in your application is a good start, but it does not make your software Self Identifying. Product companies have been doing this for ever. The problem is that for that number to be useful (particularly when you’re trying to access the source code to reproduce and fix a bug) you then need to refer to a build system or release notes to find out where the source code came from (if you’re lucky). It often also does not apply to development builds. To be truly Self Identifying you need to make sure that every build (including builds developers create on their workstations) also includes enough information from the SCM system so that anyone who has access to the source code can go right back to the exact source code that produced that binary. For example, if you use Subversion as your SCM then this will be a URL and a revision number.
This is not exactly a new concept, it’s something I (and others) have been doing for a number of years now. The reason I’ve decided to write about it now though is that recently I was showing a new guy around one of the projects I’m working on at the moment, and when I showed him how to determine which version of the app was deployed he was delighted.
I’ve been doing even more reading than normal lately on the subject of Clouds lately as quite a few of us within ThoughtWorks who are going to be speaking on the subject next month are comparing notes. It follows that when I clicked the link for “The Wrong Cloud” I was not actually prepared for a delightfully entertaining paper that would contain my quote of the day:
“Today’s so-called cloud isn’t really a cloud at all. It’s a bunch of corporate dirigibles painted to look like clouds. You can tell they’re fake because they all have logos on them. Real clouds don’t have logos.”
As much as I enjoyed reading their paper, I must say that I disagree with a lot of what the guys at Maya are saying. Yes, there is a large dose of spin and hand wavey magic going on with the current leading fashion trend (that bit is totally true). Yes, it is very easy to tightly couple your application to a cloud vendor. The thing is though that it’s not that different from the tie in you get when selecting what language you use to develop your application in, which third party libraries you use or even what operating system(s) to target. The only real difference I can think of is that if for some reason the cloud vendor you’re backing stops running your app goes down, unlike all those mission critical OS/2 applications that are still running out there…
I’m pretty sure that you’d have fair warning before the plug was pulled though, especially if you’re still paying them money every month for their services.
The real questions you need to ask before doing anything on any cloud service are:
- What is the problem I’m trying to solve?
- Do any of the cloudy offerings actually help me solve that problem?
- What is the cost difference between deploying this app in the cloud vs our own infrastructure (assuming you have any of your own)?
- What is the point at which that will change? Is there a usage point where it would be cheaper for me to move off the cloud?
- If I chose platform X, how hard will it be to move to platform Y?
Cloud is not a magic silver bullet – such things don’t exist. As with any technology choice you make, you need to select the most cost effective one for the problem you have, and try your hardest to ignore the FUD.
Last year at JAOO I had the chance to speak to Markus from Software Engineering Radio about the talk I gave there on Continuous Integration. It’s finally available now over here. The slides that go along with the talk are available from the JAOO site.
I found a link to Above the Clouds, a paper on Cloud Computing recently published by a quartet of UC Berkeley RAD Lab professors. I’ve been quite disappointed with publications on the subject of the latest buzzword taking the world by storm right now, so I was not expecting much when I first clicked on the link. The thing is, as I started reading through the Executive Summary it all sounded very familiar. The outline the give in the summary follows the same outline as a talk I gave in November last year at the ThoughtWorks London office for the London Java Community.
The only criticism I have is that they don’t put enough emphasis on one of my key reasons for why it’s suddenly taken off. Cloud computing is not a new idea – it’s an extension of the Utility Computing that John McCarthy talked about in 1961. Although they only make a passing remark in section 3, I think one of the most important reasons it’s taken off is that the services Amazon provide were the first that were not a “solution looking for a problem”. Earlier offerings by the likes of Sun, HP and Intel all created a solution that they tried to sell to clients. The problem was that there were remarkably few problems that their solutions solved. Amazon simply exposed services that they were using internally already. That’s not to say the other reasons they give are not valid, I totally agree with them. I think they just missed a good point.
One of the topics I only glanced over is covered cover quite well in section 6 – Cloud Computing Economics. They provide some interesting example cost calculations. Although the numbers are obviously US centric, they do provide a nice way for a company to approach making the old “build vs buy” comparison.
In summary, I highly recommend this paper for anyone who wants to get the head around what this Cloud stuff is all about and what they need to do to prepare for it.