Some great Internet reading

The other day a colleague of mine asked me about finding software books on the Internet – he was talking about some of the books I had listed in my previous post. Now, that post referred mainly to books that are published and distributed on paper as physical books (some of them are available for sale in other formats like PDF, Mobi and Kindle as well).

However, there are a lot of books on the Internet and even more lists of books. I think we sometimes forget that, besides being a directory of books, the Internet is itself a huge repository of amazing content. So caught up are we nowadays in the real-time fire-hose of social networks and status updates that we have started overlooking some of the really good articles out there. So, in the interests of providing everyone (and myself) with some links to leverage, I thought I’d write a post about some of the great content available out there from the software programming perspective …

The best way to get a good list of books about software is to go where software geeks congregate and search. Almost invariably, someone will have asked or talked about the best software books and sparked off the list mania :-) Go to a few websites like this and look for the names that pop up again and again. Stack Overflow is a great example of this – it is a forum for software-related questions and though, of late, they have started discouraging open-ended and subjective questions, there are some really great lists of books out there. A few of my favorite lists are –

Of course, like all on-line lists, these are updated from time to time, so you need to keep going back to get the latest versions.

The next places I go to are websites that are book repositories and directories. One of the biggest out there is the Wikibooks project – they have a great listing of open books and an entire section is devoted to Computing. There are other websites that specialize in technical books – a couple I go to are –

Another meta-list I go to is my own :-). I leverage on-line bookmarking (I use Delicious) extensively  – I tag my entries profusely and you can slice and dice my list across quite a few dimensions – feel free to do this and pick your favorite lists :-)

An example of an extremely influential article is the Agile Manifesto, which provided so much momentum to the agile methodologies and to the establishment of agile as an alternative development methodology in use today. One person whose articles have influenced a whole generation is Richard Stallman – his articles are available here. I am also a fan of the writing of Eric S. Raymond, keeper of the Jargon File – most of his articles have turned into books –

Along the same lines of ground-breaking articles is the 1972 Turing Award Lecture by E. W. Dijkstra, one of the true giants in the field of programming. There is a PDF version here.

There are some great on-line tutorials on various topics –

There are a lot more – I haven’t listed them explicitly here, so feel free to post comments on what you think I should have added.

Some other influential blogs and articles include –

A couple of the links in the list above are to blogs rather than individual articles – this is because I believe those blogs to have a lot of influential articles :-)

Obviously this is an incomplete list – I have written only of the content that I know of today. I hope to be able to add to it and update it with your help :-)  Please post your suggestions and links to improve it :-)

Update : Via Renju – CLR via C# and Patterns of Enterprise Application Architecture

Here is a list by Martin Fowler of books he participated in creating.

I would also like to add my friend Praseed’s posts – if you are starting out on C/C++ in GNU/Linux these can give you a good boost in the right direction :-)


How to write beautiful code

Beautiful code is elegant and simple – it is concise but clear. There is a balance in the code – a rhythm in the definition and structure of the conditionals and the loops. The intent of each function shines through the code, and there is a pattern in the creation and interaction of the classes and their methods that combines the code into a coherent and beautiful unit. There are no wasted variables or endless conditionals – it is a pleasure to read, not just because of the ease of reading but because of the way in which it communicates the ideas and intent of the programmer.

Well, now that I have waxed lyrical about what good code is, the next logical step would be to figure out how to write such code. Beautiful code starts with good understanding – in order to write beautiful code, the first step is to understand the problem you are trying to solve. The next step is to have a clear idea of the solution and the approach you are going to take. These two things are themselves entire subjects in software development – so I shall, for the purposes of this post, assume that you have a clear idea of the problem you are solving and the approach you are going to take to solve it :-)

Even with these conditions met (understanding the problem and identifying the solution), sitting down in front of a blank page and writing out excellent, bug-free code is almost impossible IMHO. The best programs out there are the result of an iterative process of coding and re-coding repeatedly – almost obsessively. Writing a program is like building a clay sculpture – you start off with a lump of clay, then you broadly shape it and then keep removing and adding bits and pieces till you get your sculpture. Sometimes you have to remove a big piece and add another instead, and sometimes you simply throw everything away and start over.

Writing beautiful code is hard – a seemingly simple algorithm like Quicksort is the result of years of effort to come up with a concise and elegant implementation (in fact Quicksort has several implementations). Even a simple piece of code like the quintessential “Hello World” program can be written in so many ways (in fact it is maintained as a separate GNU project).
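To make the Quicksort point a little more concrete, here is a minimal sketch of one of its many possible implementations, written in Python. It is an illustration rather than a reference version: it builds new lists instead of partitioning in place, and using the middle element as the pivot is simply an assumption I made for readability.

```python
def quicksort(items):
    """Return a new sorted list; the input is left untouched."""
    # A list of zero or one elements is already sorted.
    if len(items) <= 1:
        return list(items)

    # Assumption for this sketch: use the middle element as the pivot.
    pivot = items[len(items) // 2]

    # Split into three groups relative to the pivot.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]

    # Sort the outer groups recursively and stitch everything together.
    return quicksort(smaller) + equal + quicksort(larger)
```

An in-place version with a hand-written partition step would use less memory but takes noticeably more care to get right, which is rather the point of the paragraph above.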

So when do you stop iterating? There are some factors to consider in making this decision – usually, if you are working on commercial software, this decision is not in the developer’s hands. The almighty deadline determines the ‘done-ness’ of your code – indeed, this seems to be the psychological impetus for a lot of us. I have seen a lot of places and projects where people find it hard to work without a deadline looming, like the sword of Damocles, above their heads. Indeed, I suspect there is something psychologically appealing about having this decision taken out of our hands.

Again, for the purposes of this blog post, let us assume you have control over when you decide your code is done and you decide to release it only after you feel it is good enough. The question then becomes how do you know that what you have is beautiful code?

The first requirement of good code is that it should work. If your code does not solve the problem it was intended to solve, you need to go back to the drawing board, my friend – this is a necessary pre-requisite but not a sufficient condition for beautiful code.

One way to identify beautiful code is to read about programming – programming methods, philosophies, etc. I have a book list of good software books you can start with (you can look at my post on internet reading for more book lists and articles). In order to be a good sculptor, you need to know what beautiful sculptures look like – so you look at pictures of great sculptures – in fact this is usually a part of the curriculum for art programs. Similarly, sculptors look at examples of bad sculpture in an effort to recognize what to avoid. So, in order to identify beautiful code, look at examples of beautiful code – code written by great programmers and code written for great projects – as well as bad code, ugly code.

Unlike sculpture – where you would have to travel to Italy to get a look at David or the Pietà – you have a lot of good code to read available at your fingertips – just open up the internet and look around :-) There are plenty of open source projects that share their code-base – start with GNU, SourceForge and Google Code (check out this post on the world’s oldest software repositories). If that’s not enough, take a look at the examples in the book ‘Code Complete’ (other books you can look at are ‘Beautiful Code’, ‘Beautiful Data’, ‘Clean Code’ and ‘Coders at Work’). Identify the patterns followed in beautiful code and the patterns you see in ugly code.

Another very important thing to recognize is when you have stopped coding to a requirement. Good code is spare – it provides a solution to the problem at hand – no more, no less. A sign of good code is when you go through it and feel there is nothing you can remove from it. To paraphrase one of my favorite quotes (by Antoine de Saint-Exupéry) – “A programmer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.”

And finally, the only way to make beautiful code is to write lots of code and publish it. All programs have constraints – some are technical, others logistical and yet others philosophical – and good code is an elegant balance between these constraints.

So, budding programmer – Good Luck and Happy Coding !! I leave you with the following philosophy from the Python programming community –

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

PS: I refer to coding and programming interchangeably in this post. What I mean by these terms is the act of writing a software program.

PPS: You can get the “Zen of Python” poem by typing “import this” at the interactive Python console.
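If you would rather have the poem as a string than printed on the console, something along these lines should work. It is a small sketch that relies only on the fact that importing the standard `this` module prints the text; the helper name `zen_of_python` is my own invention.

```python
import contextlib
import io
import sys


def zen_of_python():
    """Return the text that `import this` prints."""
    # Drop any cached copy so the module body (and its print) runs again.
    sys.modules.pop("this", None)

    buffer = io.StringIO()
    # Capture whatever the import writes to standard output.
    with contextlib.redirect_stdout(buffer):
        import this  # noqa: F401  (imported only for its side effect)
    return buffer.getvalue()
```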

Update 1 : I found this link to a Ruby conference keynote where the speaker talks about beautiful code – Ruby is another language that has writing elegant code as one of its goals.

Update 2: Here is a great post on why you should spend the time and effort to write beautiful code :-)

Update 3: Here is a website from John Graham-Cumming where one can submit and read beautiful code.

Update 4: A great article on beautiful code by Brian Kernighan

Commented code is a bad code smell

This is a rant about commented code.

Have you ever seen projects with many thousands of lines of code that have a significant portion of the code commented out? I am not talking about descriptive or explanatory comments about what the code does – I mean where the code block itself is commented out …

Leaving the commented code in an application lowers the readability of code in the application. Today, source control systems are ubiquitous in software projects – so why do people still comment code and leave it rather than deleting it completely? It simply serves no purpose other than increasing code bloat.

We all do it of course – usually while debugging or trying to understand code. To discover whether a piece of code is used in the application, it is sometimes easiest to simply comment out the code and let the compiler tell you where it is called from.

But once we finish, we should either delete the commented-out code or un-comment it. Commenting out code is, at best, a stop-gap measure and IMHO a rank code smell indicating that you should probably create a branch for your experimentation.

It is quite tedious and frustrating to have to pick out the blocks of actual code from between the often vast stretches of commented-out code. Let us, as professional developers, develop the habit of never leaving any such code cruft in anything that we check in to a project !
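If you want a quick way to see how much of this cruft a file contains, a rough heuristic is easy to prototype. The sketch below is for Python source only and is an assumption-laden illustration, not a real linter: it pulls out each comment with the standard `tokenize` module and flags it when the comment text parses as a simple statement (an assignment, import, return, raise, del or a function call). The names `looks_like_code` and `find_commented_code` are made up for this example, and multi-line commented blocks such as a lone `# if x:` will slip through.

```python
import ast
import io
import tokenize

# Statement types that plain English comments rarely parse as.
CODE_LIKE = (ast.Assign, ast.AugAssign, ast.Import, ast.ImportFrom,
             ast.Return, ast.Raise, ast.Delete)


def looks_like_code(text):
    """Heuristic: does this comment text parse as a simple statement?"""
    try:
        tree = ast.parse(text)
    except SyntaxError:
        return False  # ordinary prose usually fails to parse
    for node in tree.body:
        if isinstance(node, CODE_LIKE):
            return True
        # A bare function call such as `save(record)` also counts.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            return True
    return False


def find_commented_code(source):
    """Return (line_number, text) pairs for comments that look like code."""
    hits = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            if text and looks_like_code(text):
                hits.append((tok.start[0], text))
    return hits
```

Run over a source tree before check-in, a report like this at least tells you where to start deleting.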


A podcast review – Software Engineering Radio

I am a fan of podcasting. Video blogging and screen casts are fine, but in terms of convenience and bang for the megabyte nothing beats a podcast :-). I have been listening to podcasts for a few years now, usually when walking home from work or doing some chores. In fact I wrote an earlier post about them along with some of my favorites (at the time).

Of late I have listened a lot to a podcast called Software Engineering Radio. Actually, I have been listening to this one for a long time now and it has gone from strength to strength. It was started by a German software consultant, Markus Voelter, and is based in Europe (the details are here). There is now a team of people both creating episodes and supporting the website. A new episode is published every 10 days and the feed for it is available at the website.

The podcast is about software engineering and is for professional developers (it says so in the title :-)). IMHO it is the best podcast for professional developers I have come across. Markus and his team of volunteers do a great job – the interviews are professionally done and the audio quality is great. The post-production is also very well done – in fact they have put up videos about their recording and post-production process. All the recordings are released under a Creative Commons 2.5 license and I think the majority of the episodes are gems worth downloading and keeping.

The coolest thing about Software Engineering Radio and what ultimately makes it such a great resource are the subjects and the people they interview.

The subjects are for the most part about software engineering practice – but they are things that are really useful to one’s growth as a software professional. Episodes range from discussions about software development methodology to software languages and tools to computer science research. In fact there was recently a very interesting episode on the difference between Software Engineering and Computer Science.

The people interviewed range from professional software developers and consultants to researchers and academics. Some are giants in their field and others I haven’t heard of (admittedly that’s a large percentage of them – maybe because I hadn’t heard of their field in the first place ;-)).

Another thing I would like to mention is the format of the episodes. They are interviews, and the team at SE Radio go to great lengths to prepare and ask just the right questions to illuminate the topic completely and extract pearls of wisdom from the people they interview. I think this format is better than a speech or a presentation because of the interaction between the interviewer and the interviewed.

There is sometimes a problem with the English accents – most of the team are from mainland Europe and have European accents – but it is never so bad that one cannot understand what is being said. In fact, one of the things I like about the show is the variety of accents of people from different parts of Europe, the UK and America. It makes the show more interesting to me as I try to guess the nationalities of the interviewer and interviewed from their accents ;-)


What are Adobe Flash, Flex and AIR?

Ever since the Internet started becoming a platform for business, people have been working on ways to overcome the limitations of HTML. One way to do this was to get the user to download and install extensions to the basic browser platform that had the capability to run the code used to create regular user interfaces. Several companies introduced different types of extension – some of the notable efforts in this space are Microsoft with ActiveX controls, Sun with Java Applets and Adobe with Flash. In this post I am going to talk a bit about Adobe’s products in this space.

Over the past couple of years Adobe has been on a platform expansion binge and has significantly added to its capabilities as a web development platform alternative to Microsoft’s .NET and Sun’s Java. So let’s start at the beginning – first came the Flash player (which joined Adobe’s line-up through its acquisition of Macromedia), essentially an animation and video platform with some development features (Adobe Flash uses a language called ActionScript).

As the concept of Rich Internet Applications (as these extensions came to be called) started getting more traction and mind share with the advent of AJAX based technologies – Adobe introduced the AIR platform and the Adobe Flex framework.

The AIR (Adobe Integrated Runtime) platform is a cross-platform runtime on which one can deploy applications built using Adobe Flash, HTML, AJAX or Adobe Flex on the desktop. Thus AIR is a means by which the developer of an RIA can extend their presence on to the computer’s desktop and become independent of the browser.

The Adobe Flex Framework is a bit like Microsoft XAML or Mozilla XUL in that it uses an XML-based language (called MXML) to describe your user interface. You can use it in conjunction with ActionScript to create applications. Flex is an Adobe framework that leverages the Adobe AIR and Adobe Flash run-times. While the framework itself is open source (it’s not free though), the underlying run-times that it targets are proprietary (though I believe they now support the WebKit HTML engine as well, which is an open source run-time). Both AIR and the Flex Framework are freely available; Adobe sells an Eclipse-based tool called Flex Builder that lets you easily build powerful applications using the Flex Framework.

A predecessor to all these frameworks that you might want to consider is OpenLaszlo. It is an open source framework I came across a couple of years back that was already supporting Flash, and I believe they now support a wide variety of platforms.

Another interesting open source project in this space is Curl, which is an MIT project.

The competition has not been standing still, however. Microsoft has unveiled Microsoft Silverlight and its open source (Mono) counterpart Moonlight, Mozilla has the Prism project and Sun has JavaFX. I don’t know much about Prism or JavaFX but I haven’t seen much traction around either of them.

Finally the HTML 5 specification got off the ground. It has started addressing some of the fundamental issues with the HTML specification and introduced elements for media richer than text, like audio and video (along the same lines as image). The web-browser space has also started embracing the specification and working to include support for these elements – AFAIK Opera, Firefox’s Gecko engine and Google Chrome’s WebKit engine are neck and neck in their effort to support the HTML 5 specification, with IE 8 following them.


Visual Studio .NET 2003 and Windows 7 can get along – Seriously…

I know, I know – VS.NET 2003 is not supported on Windows Vista and above.  In fact Microsoft in their all-knowing wisdom went ahead and supported Visual Studio 6.0 on Vista, but not VS.NET 2003 !!

I don’t know if this speaks to the crappy nature of the VS.NET 2003 product or to weird resource constraints or some other theory (insert favorite MS conspiracy theory here). The list of supported versions of the Visual Studio product line is, AFAIK, as follows –

  • Visual Studio 6.0 – Supported on Windows 7
  • Visual Studio 2002 – Not supported on Windows 7
  • Visual Studio 2003 – Not supported on Windows 7
  • Visual Studio 2005 SP 1 – Supported on Windows 7
  • Visual Studio 2008 – Supported on Windows 7

So VS.NET 2003 and Windows 7 are officially not on speaking terms – but the fact remains that I need VS.NET 2003 for my job. Just because MS does not support it does not mean I can avoid using it, so despite all the warnings we went ahead and installed the darn thing and you know what – it installs – and runs (after a fashion)! The annoying dialog declaring that the program is not supported is easily dismissed – there is a check box allowing you to block it from appearing again. Debugging needs you to run in administrator mode – you can set this by right-clicking the shortcut, choosing properties and then the compatibility sub-tab – there is a check-box to always run the program as administrator.

The next challenge was VS.NET 2003 refused to load my web projects so I went about looking for a way to set this up…  I opened up the IIS manager and boy let me tell you the IIS manager for IIS 7 is totally different from the IIS 5.0 and IIS 6.0 managers. You can configure everything down to the individual web.config files with this thing. Luckily running ASP.NET 1.1 on IIS7 is supported even though VS.NET 2003 is not so MS provided some helpful articles on IIS.NET – I found this article whose instructions I followed. There were some further gotchas that I encountered that you might be interested in knowing about –

  • I installed VS.NET 2003 SP1 to overcome a compilation problem. Apparently SP1 addresses these issues that occur when you have solutions with a large number of projects.
  • A weird thing is that sometimes the client scripts that ship with ASP.NET 1.1 do not get properly installed – you need to run the “aspnet_regiis” utility with the “-c” option from the command line to ensure they are properly installed.
  • I needed to install “Directory Browsing” from “Control Panel -> Programs -> Turn Windows Features On and Off ” and then switch on directory browsing for the main web-root (theoretically you can over-ride the web-root setting in the individual web project using the web.config files but ASP.NET 1.1 project web.config files are not supported apparently). You need directory browsing to be available in order to add the web-project to your solution from VS.NET 2003
  • You need to be part of the debugger users group on your computer in order to be able to do F5 debugging of your web project.
  • If you want search across a project or a solution to work, you need to tweak the compatibility settings. You can do this by right-clicking the shortcut that launches VS.NET 2003, selecting properties, choosing the “Compatibility” sub-tab and then checking “Disable Desktop Compositing” as well as “Disable Visual Themes” (I got this tip from an answer to a question I posted on Stack Overflow).
  • If you partition your hard drive, make sure you allocate at least double what you used to allocate for the system (C:) drive on your previous version of Windows – not only is Windows 7 bigger, it needs more RAM and consequently your pagefile is bigger as well. Not to mention that you will inevitably install VS.NET 2008 as well as VS.NET 2003 (after all, that is the future, right?) and all the other goodies you had ;-)

Once you have done all this, things are more or less OK – performance is not much better or worse than on XP, but that’s probably more down to VS.NET 2003 than Windows 7. So despite all the warnings and recommendations to use XP in a virtual machine (with 2 GB of RAM at my disposal – yeah right!!) here I am running VS.NET 2003 on Windows 7. Overall, I think Windows 7 is a cool OS but the experience for me is marred by the reality of having to coax it to work with VS.NET 2003. In my firm, moving everything (and there is a LOT) to a newer version of .NET is quite understandably a low priority given the economic climate – besides, by the time we discuss and negotiate and decide to move, MS has already come out with a newer version of everything ;-).

Well HTH  :-)

Take care y’all and be good ;-)

Update: Another blog entry you might be interested in is here.


A Linux for every trade…

While puttering around the internet the other day I came upon this software, APTonCD, and while going through it a set of light bulbs went off in my head and I actually had an idea ! I was so excited by this that I thought I’d blog about it :-)

I participate in the Free Software Users Group, Thiruvananthapuram (Trivandrum) and one of its main activities is making Linux available to the general populace. As part of this activity a local company, Zyxware Technologies (these are really cool guys BTW), put together a vending machine (Freedom Toaster) which burns various Linux distributions on CD/DVD media. This proved to be a very successful project (broadband is still limited and expensive in India so downloading distributions is quite difficult).

One of my challenges as a developer has been around setting up the development environment. All developers have a certain toolkit that they are comfortable with. Depending on their level of sophistication this can be anything from a simple text editor to a full-blown IDE… The challenge for me was finding and setting up tools on the Linux platform equivalent to the ones I used in Windows.

It occurred to me that the challenge I detailed above would be a common one for anyone who uses a set of software tools for their trade. People in professions other than programming who are switching to Linux from another OS platform like Windows would also find it useful to get a set of Linux equivalents for the tools they use in Windows. In fact there are lists out there that detail Linux equivalents to Windows tools.

So here is my idea – we could leverage APTonCD to create meta-packages that people could simply install over the base Linux distribution. These packages would be prepared separately from the base Debian install (APT is the Debian package manager) and applied after the distribution is installed.

This is not a new idea – in fact here in Kerala we already have a custom Debian distribution that is targeted at schools – IT@School. The twist here is that while IT@School is a custom Debian distribution, what I have in mind is more in terms of meta-packages that can be installed over a Debian-based distribution like Ubuntu. We can leverage the Freedom Toaster to distribute these packages to people. The packages would be created by professionals in a trade for other professionals who want to use Linux but are not sure how to get all the tools of their trade on it. These people can simply install the OS distribution and then the meta-package for their trade and voila – they can get to work :-)
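To give a feel for how little a meta-package actually contains, here is a rough sketch in Python that renders the control stanza for one. A meta-package is essentially an empty package whose Depends field pulls in the real tools. The field names are the standard Debian control fields, but everything else is an assumption for illustration: the package name `trade-graphic-design`, the maintainer and the choice of tools are made up, and a real package would still need to be built with the usual Debian tooling.

```python
def render_control(name, version, depends, maintainer, summary):
    """Render a minimal Debian control stanza for a meta-package."""
    lines = [
        f"Package: {name}",
        f"Version: {version}",
        "Section: metapackages",
        "Priority: optional",
        "Architecture: all",           # no compiled code, so not tied to a CPU
        f"Maintainer: {maintainer}",
        f"Depends: {', '.join(depends)}",  # the tools of the trade
        f"Description: {summary}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: a starter kit for graphic designers.
control_text = render_control(
    name="trade-graphic-design",
    version="0.1",
    depends=["gimp", "inkscape", "scribus"],
    maintainer="Example Maintainer <maintainer@example.org>",
    summary="Starter tool set for graphic designers (illustrative)",
)
```

The interesting work is not in the file itself but in the list of dependencies, which is exactly the part a professional in the trade is best placed to supply.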

Of course there are still challenges – Linux often has several tools for a particular task and there may be version conflicts as well. I have also not accounted for the learning curve in getting used to these tools. Nonetheless, I think this would at least give a head-start to professionals wanting to use Linux as a platform for their trade and drive adoption of Linux.

The ultimate aim IMHO is not to make everyone in the world a super-duper Linux hacker but to make people productive in Linux :-)

Prototype It!

So I saw this post today – it’s by Paul Buchheit, a former Googler (he is one of the founders of FriendFeed) and the lead developer of one of my all-time favorite software applications, Gmail (it was his 20% project at Google) – and it’s about the concept of communicating with code.

Paul writes (in his post) on the concept of using prototypes to communicate ideas and concepts. He talks about his work with Gmail and how he threw together a prototype in order to show the idea of targeted ads in Gmail – targeted ads were not a priority until the prototype showed how useful and interesting they could be. The post ends with a similar exercise that he has done using the FriendFeed API (it’s pretty cool – check it out :-)). The reason I read this post and decided to blog about it is that it talks about developer communication – a topic that I wrote about in another post.

Communicating ideas through prototypes is a great idea – I have always noticed that people get more excited about something they can play with and try out. In fact this is old news in other industries – the auto industry, for example, spends millions to make concept cars to introduce new ideas to the public and solicit feedback. Architects likewise build scale models of their ideas to present to clients. So why don’t we adopt these ideas? After all, we are always talking about “software architecture” and “software construction” and other civil engineering analogies when we talk of software development ;-)

My experience is that when the term “prototype” comes up in software development projects most people are thinking of mock-ups. This is especially true in web-development shops, where there is a separate team of graphic designers creating HTML and image mock-ups of the user interface while a separate team of developers gets to “build” the application from the pictures :-) I think this is very limiting.

Prototypes should not be limited to the user experience or to presenting and communicating new ideas. I like to make prototypes of my technical solutions to software problems. For example, if you are trying out this great new idea you had on caching data, write a prototype application – the absolute simplest application you can use to exercise your idea. This will provide you with feedback on whether your idea is valid as well as show up gotchas or limitations in your design. The pragmatic programmers, in their seminal book The Pragmatic Programmer, referred to these applications as “Tracer bullets”.
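To show how small such a prototype can be, here is a sketch of the caching example in Python. It is only an illustration under my own assumptions: the class name `TinyCache`, the time-to-live idea and the hit/miss counters are all invented for this example. The clock is injectable so that the expiry behaviour can be exercised without actually waiting.

```python
import time


class TinyCache:
    """A throwaway prototype: cache loader results for a limited time."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # function that fetches the real value
        self._ttl = ttl_seconds    # how long an entry stays fresh
        self._clock = clock        # injectable so tests can fake time
        self._entries = {}         # key -> (stored_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        # Serve from the cache only while the entry is still fresh.
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Otherwise fall back to the slow path and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value
```

Even something this small surfaces design questions quickly: what should happen when the loader fails, and how big is the cache allowed to grow? Those are precisely the gotchas a prototype is supposed to flush out.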

Sometimes the concepts or ideas themselves are large and need a lot of programming to even build the prototype (tracer bullet). In these cases I still believe one must prototype the concept – so the question remains – how do you do this?

Well, one way would be to take the approach Paul took in building the initial Gmail prototype – modify some existing code. I find sitting in front of a blank file makes the task ahead seem even bigger than it is – so I start with a piece of code to modify, even if it is something as simple as a “Hello World” application. Another good place to look is in the open source forums (though this might be a problem if you are working in a company that does not allow open source software) – usually there is some variation on your concept that you could work with there :-). A third option is to accumulate code, links and other resources over time that you can use to jump-start your coding.

So – Happy Prototyping :-)

Coding, Puzzles and Mathematics

Coders like to solve problems – the high one gets when one finally ‘cracks’ a problem is as powerful as any drug. They are also alpha males – they don’t like to lose… :-) This powerful combination leads to the traditional image of a pale, pasty, overweight geek in glasses sitting over a computer screen in the wee hours of the night, mumbling to themselves and ingesting caffeine by the pound.

I’m a bit like that (OK – I am not pale and I like to work in the morning but the rest is similar ;-)) except the problem I might be solving may not even have any real world application. It might simply be a solution to a puzzle !!

This post is about coders and their love of puzzles… Most of us have heard of and even experienced the classic puzzle question in interviews for development positions. There are numerous books on this topic (the classic “How Would You Move Mount Fuji? Microsoft’s Cult of the Puzzle” is a great read) – both about the interviewing techniques as well as methods to solve the puzzles. I don’t want to get into a discussion about how good or bad the use of this technique is in interviews (I personally don’t favor them – in the wrong hands they can be horrible – like this one ;-) ).

What I like to do is examine the puzzles themselves. Puzzles can be of several varieties and one can spend hours and hours over them – indeed some people make it a full-time hobby. There are several puzzle competitions – one of the most challenging being the MIT Puzzle Hunt (the Wikipedia entry is very informative). You also have organizations leveraging this interest by posting challenges, like the FBI.

However, the puzzles that interest me are the ones that have a basis in pure logic. I tend to regard puzzles with an eye to the elegance of the solution. A good logical puzzle with an elegantly simple solution is a thing of beauty – I like to go through the reasoning and try and look for extensions or extrapolations. I look at the assumptions and limitations and try to understand the thought process behind the puzzle.

They are a good way to learn and to get ideas on new approaches. They expand your mind and, finally but most importantly, they are FUN. I often use them as a fun way to learn new computer languages – good candidates are mathematical puzzles :-)

Here are some nice links – check them out and happy solving :-)

  1. Maths Challenges – Check out the links section where there are some programming challenges.
  2. Project Euler – Awesome site with some really cool puzzles.
  3. Delphi For Fun – The programs are in Delphi, which is a version of Pascal, but it has some nice algorithms.
  4. FBI Ciphers – Ciphers are a fun type of puzzle that kind of head off into cryptanalysis.
  5. Shine’s Take on IT – My friend Shine is reading “How would you move Mount Fuji?” and is blogging about his read.
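
To give a flavour of the kind of mathematical puzzle that works well for trying out a language, here is a minimal Python sketch of the well-known first Project Euler problem (add up all the multiples of 3 or 5 below 1000). The function names are my own invention. The first version is the obvious loop; the second is the sort of extension I like to look for, replacing the loop with the arithmetic series formula.

```python
def sum_multiples_brute(limit):
    """Add up every number below limit that is divisible by 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def sum_multiples_formula(limit):
    """Same answer without a loop, using the arithmetic series formula."""

    def series(k):
        # Sum of k, 2k, 3k, ... over all multiples of k below limit.
        count = (limit - 1) // k
        return k * count * (count + 1) // 2

    # Multiples of 15 are counted twice (once for 3, once for 5),
    # so subtract them once.
    return series(3) + series(5) - series(15)


if __name__ == "__main__":
    print(sum_multiples_brute(1000))
    print(sum_multiples_formula(1000))
```

Both functions should agree for any limit, which is itself a nice little check to write when learning how a new language handles loops and integer division.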

Continuous Integration – more than just automated builds

This post is derived from an article by Martin Fowler on Continuous Integration. It is my take on Continuous Integration and its use.

‘Continuous Integration’ is a software development practice where members of a team integrate their work frequently; usually each person integrates at least daily – leading to multiple integrations per day. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.

Software development is full of best practices that are talked about at great length and detail but rarely actually done. Continuous integration is one such best practice that has been around under various guises for a long time. Anywhere you have heard or discussed the virtue of having a common build that is done in a regular manner – you are hearing about continuous integration. In fact Microsoft has been doing automated daily builds for years.

The term ‘Continuous Integration (CI)’ originates in the Agile discipline known as ‘Extreme Programming (XP)’, where it is one of the twelve original practices. I believe CI has enough merit that it should be implemented even in projects that are not following XP or other agile development practices.

So how does one go about achieving CI within a software development process?
There are two critical aspects to this effort: evangelizing, educating and mentoring the team in CI practices, and creating a repeatable process that allows the team to integrate their code quickly, preferably in an automated fashion.

One of the hardest things to express about CI is how it makes a fundamental shift to the whole development pattern; one that isn’t easy to see if you’ve never worked in an environment that practices it. For many people team development just comes with certain problems that are part of the territory. CI reduces or eliminates these problems, in exchange for a certain amount of discipline. The fundamental benefit of CI is that it removes sessions where people spend time hunting bugs where one person’s work has stepped on someone else’s work without either person realizing what happened. These bugs are hard to find because the problem isn’t in one person’s area, it is in the interaction between two pieces of work. This problem is exacerbated by time. Often integration bugs can be inserted weeks or months before they first manifest themselves. As a result they take a lot of finding. With CI the vast majority of such bugs manifest themselves the same day they were introduced. Furthermore it’s immediately obvious where at least half of the interaction lies. This greatly reduces the scope of the search for the bug. And if you can’t find the bug, you can avoid putting the offending code into the product, so the worst that happens is that you don’t add the feature that also adds the bug. (Of course you may want the feature more than you hate the bug, but at least this way you have the choice.)
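
A small, made-up illustration of such an interaction bug, as a hedged Python sketch: the two functions and their names are hypothetical, each one passes its own author’s checks, and only a check that exercises them together shows the mismatch.

```python
def cart_total(prices_in_cents):
    """Person A's work: now returns the total in cents (it used to be dollars)."""
    return sum(prices_in_cents)


def format_total(amount_in_dollars):
    """Person B's work: still assumes it is handed an amount in dollars."""
    return f"${amount_in_dollars:.2f}"


def integration_check():
    """Exercise both pieces together, as the tests of an integrated build would."""
    shown = format_total(cart_total([250, 199]))
    return shown == "$4.49"


if __name__ == "__main__":
    # Each piece looks fine in isolation ...
    print(cart_total([250, 199]) == 449)
    print(format_total(4.49) == "$4.49")
    # ... but the combination is wrong, and an integrated test run reports it.
    print(integration_check())
```

Neither author’s area contains the bug; it lives in the hand-off between them. If both changes are integrated and tested the same day, the failing check points straight at that day’s commits.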

Though CI is a practice and so does not need any particular tooling to deploy, it is generally found that automation is a key factor in making CI work. In order to automate the CI process, it is generally recommended that the following be done –

  1. Keep a single place where all the source code lives and where anyone can obtain the current sources from (and previous versions)
  2. Automate the build process so that anyone can use a single command to build the system from the sources
  3. Automate the testing so that you can run a good suite of tests on the system at any time with a single command. (The build does a self test)
  4. Make sure anyone can get a current executable which you are confident is the best executable so far. (The build creates a deployable version of the software)

All of this takes a certain amount of discipline and consistency in the approach. It is difficult to introduce it in a project but once introduced it does not take that much effort to keep it up. So let us examine the points given above in turn.

  1. Keep a single place where all the source code lives and where anyone can obtain the current sources from (and previous versions)
    This is a basic requirement in software development. Regardless of the size of the team or the complexity of the project, all the source code must be available in a single, easily accessible location. The versioning requirement means that one must invest in a proper source control system. All the artifacts needed to run the application should, ideally, be located in the source control repository. This would allow the build process to obtain all the resources needed to build the project from a single location.
  2. Automate the build process so that anyone can use a single command to build the system from the sources
    Automated builds are essential to create a repeatable process that can be run on demand. This prevents human error in typically complex systems where, in addition to building the source code, one must set configuration values, etc. in order to produce a deployable application. For complicated projects, building the source code itself can be a complicated process involving multiple projects and dependencies. Most major software platforms have build tools that help make these tasks simple.
  3. Automate the testing so that you can run a good suite of tests on the system at any time with a single command. (Make your build self-testing)
    Traditionally building involves compiling, linking, etc – all the stuff required to make your program run. However simply because your program loads does not mean that it is working correctly. One very effective way to catch bugs is to run a test suite against your program. These test suites should be built so that they can run automatically against the latest build of the software.
  4. Make sure anyone can get a current executable which you are confident is the best executable so far.
    This can be achieved by getting everyone to commit their code to the build often (every day) and ensuring that these commits trigger an automated build every time. This helps ensure that anyone can get the latest workable version of the code at all times – one that passes all the tests. It lets one be more confident that the code one is developing will work when merged back in. It also allows testing to be done on the latest version of the code, ensuring that the bugs found are not due to version conflicts or integration issues. It helps testers focus on the changes to the code and identify issues faster as a result. Overall, the confidence in the quality of the code is improved.
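
The four points above can be sketched as a single-command pipeline. This is only a minimal Python illustration under stated assumptions: the step names and the placeholder commands are hypothetical stand-ins for whatever your platform’s checkout, build, test and packaging tools actually are, and it is not the interface of any particular CI tool.

```python
import subprocess
import sys

# Hypothetical placeholder steps; replace each command with your project's real one.
# Each placeholder simply runs the current Python interpreter and prints a message.
STEPS = [
    ("checkout", [sys.executable, "-c", "print('fetch sources from the repository')"]),
    ("build", [sys.executable, "-c", "print('compile and link')"]),
    ("test", [sys.executable, "-c", "print('run the automated test suite')"]),
    ("package", [sys.executable, "-c", "print('produce the deployable build')"]),
]


def run_pipeline(steps):
    """Run steps in order; stop at the first failure and return that step's name."""
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            return name  # the build is broken at this step
    return None  # every step passed, so this build is the current best


if __name__ == "__main__":
    failed = run_pipeline(STEPS)
    if failed is None:
        print("build OK")
    else:
        print(f"build FAILED at step: {failed}")
```

Stopping at the first failing step is the point of the design: a build that does not compile or does not pass its tests never reaches the packaging step, so whatever has been packaged is always a version that passed everything before it.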

While CI is a great process, it is, like any process, only as good as its implementation. There is an abundance of tools out there that do CI or have it as part of their feature set. I will list some that I have dabbled in or heard nice things about –

  • CruiseControl.NET – This is an open source tool from ThoughtWorks.
  • CruiseControl – This is the original Java version of CruiseControl.NET.
  • CI Factory – Based on CruiseControl.NET but with a lot of the other agile tools also integrated, as well as templates on how to set up your repository, etc. :-)
  • Hudson – A Java tool; I have seen nice things written about its simplicity.

A good book to read is Continuous Integration: Improving Software Quality and Reducing Risk

Update: A Stack Overflow user was asking about Continuous Integration tools today and I pointed him here – I found another useful link while going through all the answers.

ThoughtWorks have created a feature matrix for CI tools that seems to have all of the major tools covered – link