On Distributed Development

So, I have been doing distributed development at Bang the Table for the past year. It’s a company with a very strong work-from-home culture. We even had a virtual birthday party for one of our directors over Skype video chat.
Birthday party at Bang the Table

Telecommuting, working remotely, working from home: these are various terms for an increasingly popular mode of working. In this model, people use the internet’s ability to communicate instantly over vast distances, together with modern collaborative software tools, to work together effectively without actually being in the same physical room, office or even city.

There are some pretty powerful advantages to such an arrangement:
  1. There is no commute, one can stay close to one’s family (a huge factor if you have children) and working hours are flexible. It is a more relaxed way to work. I have been working in such an environment for about a year now, and with a new baby, the lack of a commute and the freedom to pop downstairs and help out when my wife needs an extra pair of hands have been awesome!
  2. Another major advantage (at least for the company) is the reduced cost of office infrastructure. In places where it is difficult to get an office building that is reasonably close to where your employees live, this can be a major boon. Also, not forcing your employees to commute is a major plus for both employee morale and the environment.
  3. One has the opportunity to hire and work with people from a larger pool rather than being limited to a single geographical location, like a city, a state or even a country. This can be a very powerful advantage, since finding all the right people you need in a single location can be a major problem.
  4. An “advantage” that is often touted is that geographical separation allows for better efficiency since, theoretically, one could have employees coming online and starting their workday as others end theirs, making use of the entire 24 hours on the clock for work. I believe that, at least for software development, this is a fallacy, and that attempts to exploit this so-called “advantage” are a big reason why certain companies fail with this model of working – but more on this later…

In the software development field we have been working in such a distributed model for a long while now, and I have seen this model evolve, influenced both by advances in the technology and tools available for collaboration and by the evolution of software development methodology itself. I’d like to take a few lines to write about this evolution and to distinguish between what I consider to be the “distributed development” model and what I call “outsourced development”.

In the outsourced development model, a software project is broken up into separate, well-defined phases, each phase having definite outcomes with concrete artifacts (usually in the form of documents). This means that (in theory) one can easily move parts of a project between various teams, who then rely on the artifacts produced in the previous phase to execute the next phase of the project. Other than a short period between phases, when one team hands the project over to the other, there is very little actual interaction between teams. This form of software development follows the waterfall methodology, and while it is a good fit for outsourcing, it is in practice prone to problems that, for the most part, arise from a lack of communication between the various stakeholders of the project and the members of the team(s).

What I call “distributed development” (there is a more generic definition on Wikipedia that is not limited to software development) is a refinement of the outsourced development approach that uses advances in internet connectivity and better communication tools to minimise the disconnect introduced by working remotely in software teams. Distributed development places emphasis on communication and interaction between all the members of the virtual team. The idea is to reduce the feeling that we are working in different physical spaces and that everyone is off working on some discrete task on their own. This kind of thinking is in line with the agile philosophy, which emphasizes individuals and interactions over processes and tools.

At Bang the Table, we get pretty close to what I consider to be an ideal distributed development model. I’ve come up with a few guidelines based on my experiences working here and on what I learned listening to a Hanselminutes podcast about working remotely:
  1. You can use many tools and technologies to do your job, but regardless of what you use, patience and trust are the biggest success factors in this kind of working. I initially put this point lower down in the list but on consideration decided that it is the most important tip. Without trust the whole thing is an exercise in futility. One needs the trust to rely on the skills of each developer to do his or her bit, and the patience to listen and accept that problems and delays might happen. The best way to build trust is to have open and rich channels of communication and to meet and interact with everyone in person.
  2. Reliable and fast internet is an absolute must, and it’s not that easy to get – at least where I come from. Ask whether the person you are working with has a backup power supply and a good internet connection. How much backup power is needed depends a lot on where the person is working from. In a big city, a UPS that provides an hour of backup should usually be fine, unless there is a chronic power problem. In smaller towns, they might need something more substantial, like an inverter or a portable generator. Modern laptops, with their longer battery life and portability, are a major help: they allow one to ignore smaller outages and, coupled with an appropriate wireless or cellular data connection, can even let you simply shift your base of operations to another place with power and internet.
  3. This guideline, I think, is going to be a bit controversial, and it is the reason I think the 24-hour workday idea I mentioned earlier is not really a good one. If you have a team that spans countries and (in my case) continents, it is very important to have good overlap between your timezones. In my opinion at least 4 hours of overlap is needed, since this allows you to actually collaborate rather than be limited to a short meeting at the beginning or end of your day. After all, developing software in a team is, in my experience, a collaborative activity requiring lots of discussion (and argument), and having your stakeholder at hand while you work through a problem is very useful. No amount of documentation and pretty pictures can substitute for someone being there, clarifying and providing context for you.
  4. You need to have an open channel of communication during the hours of overlap while you are working. This can be as simple as an open IRC (Internet Relay Chat) channel. We use Skype at work, and I think the ability to call or video chat on demand makes it far superior to IRC. Another option to explore is something like a Google+ Hangout while you are working; obviously this means that both of you need reliable high-bandwidth internet. There are paid options as well, like Microsoft Lync, which may be a more integrated solution if you have a Microsoft-based IT infrastructure with Microsoft Exchange (I got this from the podcast and haven’t used it myself, since we don’t use MS Exchange). There are a variety of communicators and tools out there to try out; have a look at this link, where Scott provides a deep dive into the communication tools available for remote workers.
  5. Schedule regular meetings and get-togethers in the real world. If this means that someone needs to travel somewhere from time to time, consider that the cost of doing distributed development. I would consider this an investment in the team’s productivity and efficiency. When everyone has met in person and worked together on issues, you build up mutual trust and respect. In my year of working at Bang the Table, I think we have had get-togethers at least once every quarter. Sometimes we just got together and spent a few days simply working together with no other agenda; sometimes it was a conference or a training.
  6. Distributed development tends to work well with a relatively flat and simple organizational structure. The idea is that everyone should feel equally invested in and responsible for the software project. The challenge, of course, is that this particular setup does not scale very well and can be problematic for large organizations.
  7. A good online project management tool is a must to ensure you are heading in the right direction. We are constantly trying out new tools for this part of our development. I have found JIRA to be good for support and task-based work, while I like Trello for more open-ended new development.
  8. At work we use a Rackspace server as a development server and for testing. This is the machine on which we stage the commits from our local machines and on which our testing is done. Having a machine to deploy your code on forces you to test it in the right environment and brings deployment considerations into your design and development schedule.
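
The timezone guideline above (at least 4 hours of overlap) is easy to check before a team is put together. Here is a minimal sketch in Python. The 9-to-5 working day, the fixed UTC offsets and the function names are all assumptions made for illustration, and it only compares working days that fall on the same calendar date.

```python
from datetime import date, datetime, time, timedelta, timezone


def working_window(day, utc_offset_hours, start=time(9, 0), end=time(17, 0)):
    """Return (start, end) of a local working day as timezone-aware datetimes.

    utc_offset_hours is a fixed offset (an assumption: no daylight saving).
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    return (datetime.combine(day, start, tzinfo=tz),
            datetime.combine(day, end, tzinfo=tz))


def overlap_hours(window_a, window_b):
    """Hours during which both windows are open (0.0 if they never meet)."""
    latest_start = max(window_a[0], window_b[0])
    earliest_end = min(window_a[1], window_b[1])
    return max((earliest_end - latest_start).total_seconds() / 3600, 0.0)


if __name__ == "__main__":
    day = date(2024, 1, 15)              # any example date
    dev_a = working_window(day, 5.5)     # hypothetical developer at UTC+5:30
    dev_b = working_window(day, 10)      # hypothetical developer at UTC+10:00
    hours = overlap_hours(dev_a, dev_b)
    print(f"Overlap: {hours} hours; meets the 4 hour guideline: {hours >= 4}")
```

With these example offsets the two working days share only three and a half hours, so one of the two would have to shift their day a little to reach the 4 hours suggested above.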

Working from home has been a fascinating experience for me, and I think I have learnt a lot about the vagaries of distributed development. In fact, I have been experimenting with a more distributed approach to pair programming with a couple of friends of mine.

We used Vim in a GNU Screen session to set up a development environment that we could share between us, and while one person typed we used Skype to talk to each other. We followed the classic test-driven mechanic: one person writes a test, the next person makes it pass, and the cycle repeats. The experience was remarkably powerful, since while the other person was driving the editor I could see the thought process going on as the code was being typed in. We built a small Python program, and I came away from the experience with a real appreciation of what can be done with some of the simple (and powerful) Unix tools already out there.
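
To give a flavour of that test-driven back-and-forth, here is a minimal sketch of one round. It is not the program we actually wrote; `word_count` and its tests are hypothetical examples made up for illustration.

```python
import unittest


# Step 2 (second person): the simplest implementation that makes the tests pass.
def word_count(text):
    """Count whitespace-separated words in text."""
    return len(text.split())


# Step 1 (first person): the tests, written before word_count existed.
class WordCountTest(unittest.TestCase):
    def test_counts_words_separated_by_spaces(self):
        self.assertEqual(word_count("pair programming works"), 3)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)
```

Saved to a file, this can be run with `python -m unittest <file>`. The next round starts with the second person writing a new failing test and handing the keyboard back.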

While listening to this podcast on the Changelog, I came across another tool, Tmux, which is similar to Screen and is used a lot on the Mac. I recently found a very nice screencast showing how to use Tmux, which you can check out. I also came across this awesome website called pair.io that spins up a server for you to use in your pair programming sessions.

The loneliness of the software tester

Some time back my favorite podcast, Radiolab, released a short called “The Loneliness of the Goalkeeper”. The short was a re-broadcast of an English show on football (soccer) and the role of the goalkeeper in the game.

In soccer the goalkeeper plays a truly unique role: the opposite of the role of the remaining 10 players in the team and indeed, one could argue, of the aim of a game of football. The goalkeeper has to defend the net and prevent a goal being scored.

Of course, the other players in his team also help him in defending the net, but there are a few differences that make the goalkeeper stand apart. The primary one, of course, is that he is allowed to use any part of his body to handle the ball during play in a restricted area in front of the net – the penalty area. If the ball enters the net – even if it is through a mistaken strike, an ill-conceived pass or a fumble from him or his own team – the goal counts against the team…

This means that a goalkeeper rarely ventures outside the penalty area – he is the last line of defense to prevent the ball from getting into the net. This is the loneliness of the goalkeeper – he is ultimately responsible for any goal and must protect the net from all comers – friend or foe…

The narrator of the piece goes on to explore the mental make-up of goalkeepers – the strange, almost contrary nature that is needed to be a great goalkeeper. There is a subtle psychological difference in mindset, brought on by the role. A goalkeeper does not have a single moment of triumph; there are only moments of disaster… A goalkeeper’s mistakes are obvious and public, while his successes are not. After all, a goalkeeper who manages to prevent the opposition from scoring for the entire game has simply done his job, while a striker who scores the final goal that takes his team to victory or saves them from defeat is the hero!

As I listened to the podcast, I was struck by the similarities between the role of the goalkeeper and the role of a software tester. Unlike the rest of the team, the goalkeeper’s motive is to prevent a goal being scored – similarly the aim of a software tester is to find defects in code that is created by the team. In a way the software testing role requires an almost opposite mindset to that of a software developer, who wants to build a working product.

It is preached everywhere that a good programmer tests his code. Indeed, nowadays it is a given that software should be written test first, and the agile philosophy embraces this as a core tenet. I think the concept of TDD is brilliant and that it promotes good coding habits. But I think that in all the hype there is one point that has for a long time not been emphasized: writing code in the test-driven style does not mean that your code will be defect free.

I have seen a lot of people claim that they write their code test first, as if that somehow means that they don’t need to test it. This assumption then leads to an impression that all one needs to do to get defect-free code is to embrace the test-driven development style. However, what people don’t consider is that when a developer sits down to write code, he does so in a certain context. Code does not exist in a vacuum; it needs to interact with other code, and in many cases the code itself might be a small section of a module of a feature in a platform.

The tests referred to in TDD are simply the ones that validate the requirements for the code that you are writing at that point in time. These are unit tests and do not test the software product; only the code being written is tested. Writing tests for even that narrow scope requires a mental shift. Programmers who are used to simply writing code (and I count myself as one of this group) find this mental shift a pretty big hurdle. It is really hard to force yourself to sit down and think of the requirement, and then write both the code and the tests needed to validate it (especially if you have a deadline or, worse, a manager breathing down your neck). And after you do all that, you find yourself having to mock the environment around your code and tests to get them to run consistently (or at all). Sometimes it is simply not feasible, cost-effective or even necessary to write tests that cover every error that your code might meet. To tie this back to the football analogy, I would say that software developers can at most be defenders of the goal. They are restricted both by their role and by their mindset from being able to truly subject software to the test.
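
To make the point about mocking concrete, here is a small sketch using Python’s standard `unittest.mock`. The function `greeting_for` and the `user_store` object with its `get_name` method are hypothetical names invented for this example.

```python
from unittest.mock import Mock


def greeting_for(user_id, user_store):
    """Build a greeting; user_store is any object with a get_name(user_id) method."""
    name = user_store.get_name(user_id)
    return f"Hello, {name}!" if name else "Hello, guest!"


def test_greeting_uses_stored_name():
    store = Mock()                              # stands in for the real user store
    store.get_name.return_value = "Asha"
    assert greeting_for(42, store) == "Hello, Asha!"
    store.get_name.assert_called_once_with(42)


def test_greeting_falls_back_to_guest():
    store = Mock()
    store.get_name.return_value = None          # we *assume* unknown users give None
    assert greeting_for(7, store) == "Hello, guest!"
```

Both tests pass, but they only prove that `greeting_for` works against the behaviour we told the mock to have. Whether the real store returns `None`, raises an error or times out for an unknown user is exactly the kind of question these unit tests cannot answer.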

In contrast, a software tester approaches things from a totally different perspective. The role of the software tester is to find defects – it is his goal and his responsibility is to the quality of the software product as a whole. Testers approach software from a product perspective and not a code perspective. They look at the entire product, its various layers and the seams in the code, where the different features come together – and then look for defects. They write code (in fact I think good software testers need to write as much code as developers) to probe the product from various angles. Software testers need to test the various modules, individually and in combination and using different types of input.

Testing is a matter of diligence. Going through all the combinatorial options for all the inputs and APIs available in a typical software product requires intense mental focus and a very tidy and organized mind. It is this mental makeup that makes great software testers. It is not easy to find people with such a unique mindset – in fact I was reading the other day of a startup company that hires autistic people for software testing. Traits that make great software testers – intense focus, comfort with repetition, memory for detail – also happen to be characteristics of autism.
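
Part of that combinatorial legwork can be handed to the machine. The sketch below uses `itertools.product` from the Python standard library to run a function against every combination of a few input values. The `clamp` function and the chosen input values are hypothetical stand-ins for a real function under test.

```python
from itertools import product


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


# A few representative values per input, including boundaries.
VALUES = [-10, 0, 5, 10, 99]
LOWS = [0, 5]
HIGHS = [5, 10]


def failing_cases():
    """Try every combination of inputs and collect the ones that break the rule."""
    failures = []
    for value, low, high in product(VALUES, LOWS, HIGHS):
        result = clamp(value, low, high)
        if not (low <= result <= high):
            failures.append((value, low, high, result))
    return failures


if __name__ == "__main__":
    print(f"{len(failing_cases())} failing combinations")
```

Three short lists already give 5 × 2 × 2 = 20 combinations. A real product has far more inputs, which is why choosing which values and combinations matter still takes a tester’s judgement.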

The role of a software tester is a lonely one. It is not something that will endear you to developers; after all, you are pointing out defects in the code they wrote. If the software product fails, the responsibility for failing to find the defect falls on the software tester. If the software tester does his job well and all the defects in a software product are fixed, he has merely done his job. Indeed, if he does his job too well and there are no defects to fix, people will start wondering if there is a need for the role at all…

Like the goalkeeper, the software tester has a lonely job but a vital one: he is the last line of defense, and the responsibility for the overall quality of the software product lies with him. Software developers may have written the code for a killer feature, but it is the software tester who ensures that it enhances the software product instead of breaking it…

How to read code – a primer

I like programming – it’s what I do and I am blessed in that I get to spend most of my waking hours developing software. Like a lot of programmers I obsess over how good my code is and how I can get better at it.

Over the years I have been reading a lot of articles and books on software development. There has been a lot of ink spent (both physical and virtual) on ways to improve your “programming foo” and become a super ninja programmer! There are some common pearls in all this ink, and one of them is the advice to read code. This advice is usually a one-liner couched in the midst of a bunch of other recommendations, usually along the lines of: find some great open source software, or any piece of software that you admire, open up the source code (or print it out) and read it. While this is, on the whole, great advice, there are some problems with actually putting it into practice. In this post I endeavor to give some practical suggestions on reading code, but first let us enumerate the problems.

  • The usual impression conveyed (in the posts that advise one to read code) is that the dispenser of the advice is a programming guru who can literally sit back in their chair with a page of code and read it like a novel. Well, I am sure there are some superb programmers out there who enjoy looking at pages of cryptic English-like statements over a cup of coffee and can hold entire class hierarchies and architectures in their heads. This post is not meant for them; it is for poor slobs like me who find staring at reams of code a boring, frustrating and ultimately pointless exercise. Of course, it can be argued that one can learn simply by reading a single class or even a single function out of the entire project code but, IMO, except for the simplest problems, most software is interdependent. It is often impossible to appreciate the design decisions and the rationale behind a particular function or class layout without knowing the rest of the system…
  • The next problem is getting code to read (actually, before that you need to be able to identify code worth reading – check out this post for details on that). There is a lot of great software out there, both open source and freely available, and licensed or proprietary. There are huge open source directories like Sourceforge and Google Code, and huge pieces of software like Open Office and Linux. If you are working in a software development company, you can probably get access to the proprietary code in your source control repository. A third common avenue is the programs distributed along with books on software development or as part of educational resources (Minix being the canonical example). Indeed we are spoiled for choice, and identifying good candidates for our purpose from this universe of software is a hard but essential task.
  • Another problem is the language in which the program is written. Reading someone else’s code is tough enough as it is; adding the burden of familiarizing yourself with the quirks and syntax of a new language at the same time is, IMO, a recipe for disaster and immense frustration. You need to find code written in a language that you are familiar with. This particular problem is not relevant if you are going through code distributed as part of a book or as an educational resource, since you would have the book or your mentor to explain things and set out the context. If, despite this forewarning, you are planning to read code written in a language other than the one you are used to (without the benefit of a book or a mentor), I would advise that you at least learn enough of the language to create your own programs in it (“Hello World” does not count :-)).
  • The bit about context brings me to the next problem: figuring out what the code is doing is a lot harder if you are not familiar with the software itself. For example, it is far more difficult to go through the Linux code and figure out the concept of runlevels if you don’t use Linux daily and see the Linux boot sequence. Using the software gives one a context with which to read the code. This context includes the common terminology used, the functionality and features of the software, even the quirks and bugs that you experience.

I have realized that for me ‘reading code’ does not really describe the activities that I undertake – a better phrase for what I do is ‘code comprehension’. It is quite difficult for me to sit back with a laptop screen (or a printout) full of code and simply read through it. I need a lot more than simply a piece of code – I like to be able to look at documentation, play with the software, step through the code and even write tests for it before I really appreciate it. This is a significant investment of my time and effort, so I have to be very picky about the software I want to “read” (comprehend).

  • The first filter I place on the code directory when looking for code is the language filter. For me this means C#, VB.NET, Python or JavaScript (while I am familiar with C++, Ruby and F# as well, I do not consider myself at a level where I can understand other people’s code in them). Next, I look for software that I have used. This gives me a bit of a leg up, since I know what the code is meant to do, what it cannot do and (if I am familiar enough) its limitations. Good candidates are open source software that you use in your day job (e.g. I use CruiseControl.NET, NAnt and NUnit, which are open source tools written in C#).
  • I happen to work in a software product company (a Microsoft shop), so one of the candidates for my reading list is the code in my company’s source repository. If you happen to work in a software company, you can look at other projects, and even at older versions of the software you are working on. In addition to providing insight into the code, this gives you a pretty good idea of what was tried before and since. There are a few caveats though:
    • First, if you don’t have direct access to other projects, you need to ask permission – some companies are very touchy about their “intellectual property”.
    • Second, the quality of the software may not be as high as you think since, in general, proprietary code does not get the kind of scrutiny open source code does. A warning sign to look out for is a lack of regular code reviews: if the software is not code reviewed, the odds are that it is not of good quality.
    • Third (this point is inspired by feedback provided by my friend Praseed), if the code in your company is business software (HR, Finance, ERP, etc.) there is a lot of business context that needs to be understood first. Also, since most of this code tends to be factored by business functionality, it generally seems less modular than utility code or APIs.
  • Look for well-documented projects (this applies to open source as well as proprietary code). By this I mean that the documentation should highlight the overall design and the rationale for the way the code is. Simply having auto-generated Javadoc-type documents cannot be considered documentation :-). One useful avenue to explore is software created as an educational resource (like Minix). Since the aim is to teach through the software, such projects are usually quite clearly documented and have plenty of material explaining the design rationale behind the code.

So, you have identified the software and downloaded the source code and documentation. Let’s get down to it and start spelunking ;-)

  • Go through the design documentation and try to get a feel for the way the code has been built. Good software projects follow certain architectural patterns, and these dictate the code organization. Once you get a handle on this, understanding the code becomes a whole lot easier. If you can create a class diagram of the code, you can get a good idea of the layout.
  • The next thing to do is to compile it and run it. This can be straightforward or tough depending on the process followed in the project and its documentation.
  • Now it’s time to fire up your favorite IDE and go exploring. A good place to start your code exploration is to trace a piece of the project’s functionality that you are familiar with. This lets you go through the various layers and sub-systems and get a handle on how they interconnect. For example, when I was exploring NUnit, I started by writing a test and looking at the classes I needed to do that.
  • Try to identify the design patterns used in the code. If you do not know what design patterns are, then you need to stop reading this post right now and read this book. Familiarize yourself with design patterns – they are a great way to recognize and understand the design of well-written code. This makes it easier to keep the design in your head while reading the code. It also helps you identify nuances and customizations made by the programmers more easily.
  • Try to write tests for the code to fully understand it. This is a really useful way to understand the dependencies between different parts of the code. When you try to write a test for the code, you first need to satisfy (mock) all its dependencies. Next you need to understand the possible entry points as well as the exit values for the code. This improves your understanding of the code and gets you to the next level.
  • Finally, try to refactor the code. In this step you move from simply understanding the code to becoming familiar enough to be able to modify it. As the sophistication of your refactoring increases, so too does your understanding. At this point you can, if needed, contribute your own code to the project :-)
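
For the class diagram step above, you do not always need a diagramming tool. If the code happens to be in Python, a few lines are enough to print a rough class tree. This is only a sketch under that assumption; `Shape`, `Circle`, `Polygon` and `Square` are made-up classes standing in for the project you are reading.

```python
def class_tree(root, depth=0):
    """Return indented lines showing root and all classes derived from it."""
    lines = ["  " * depth + root.__name__]
    for sub in root.__subclasses__():
        lines.extend(class_tree(sub, depth + 1))
    return lines


# Stand-in classes; in practice you would import the project's own base class.
class Shape: pass
class Circle(Shape): pass
class Polygon(Shape): pass
class Square(Polygon): pass


if __name__ == "__main__":
    print("\n".join(class_tree(Shape)))
```

The output is an indented outline with `Shape` at the top, which is often enough of a class diagram to get your bearings. Note that `__subclasses__()` only knows about classes that have already been imported, so import the modules you care about first.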

“Code reading”, IMO, is more than just reading. It is a distinct set of activities that together help one understand code. It might seem more intimidating than simply “reading code”, but it is well worth the effort.

Happy “code reading” :-)

Update: I came across this post by Joel Spolsky where he quotes Seth Gordon as saying code reading “Is just like reading the Talmud”… Yup, code reading is definitely not easy.

Ganesha – the original lateral thinker

There is an ancient tale from Hindu mythology that illustrates lateral thinking (also known as “out of the box” thinking) that I would like to share:

One day Lord Siva and His consort Parvati were sitting atop their abode on Mt. Kailash with their sons Ganesha and Karthikeyan when the sage Narada dropped by for a visit. Narada had with him a special mango of knowledge, to offer to Siva. After accepting the mango from Narada,  Siva and Parvati decided to have a contest between their sons.

Ganesha

Karthikeyan

The first one to circumnavigate the world three times would get the mango of knowledge. Without further ado Karthikeyan jumped on his peacock and started off. Ganesha, on the other hand, was busy eating his favorite ladoos and decided to finish them first. Karthikeyan had completed two rounds by the time Ganesha finally got ready to compete :-)

Ganesha simply approached Siva and Parvati and deliberately walked around them –  He circled them once, twice and three times and then claimed the mango.

When Siva and Parvati asked him how he could claim the mango when he had not circled the world even once, Ganesha replied, “You both are my world”. Delighted by the answer, Siva and Parvati gave Ganesha the mango, which he immediately gobbled up with relish.

Two of the important traits of good software developers are “enlightened laziness” and “out of the box” thinking. This tale is an example of both. Confronted by the immense task of circumnavigating the world, Ganesha, by simply thinking a little and restating the problem, comprehensively defeated his brother Karthikeyan.

So, my eager friends – the ones who are champing at the bit after the initial presentation of a project, eager to rush into coding it – please spend some time contemplating your problem. Another homily you might want to consider is “Think twice, code once”. You might just save yourself a LOT of time and effort!! :-)

Juggling code – the coding zone and burnouts..

Software programming is a very mentally intensive activity. In any non-trivial software system the coder has to juggle a large number of mental models. Like a juggler, a coder has to mentally juggle not only the actual code that he/she is writing, but also the details of the code that it is related to, the details of the data being manipulated, the possible errors to be handled, the reliability and performance of the code, its security characteristics, the requirement that is being implemented and its design and usability, etc. (depending on the code there may be more to think of or, if you are lucky, less :-)). Unlike a juggler, who generally juggles things of similar size, a coder mentally juggles problems whose complexity varies by several orders of magnitude (1 to 10^9).

Given all this, it takes time for coders to become truly productive when they sit down and start working on something. Once you get into what I call the coding zone, you find the ideas flowing through you seamlessly. Coders in the zone lose their sense of time and place; the problems and solutions are clear and you find beautiful code coming from your keyboard. Coding when you are in the zone is immensely satisfying. It’s like the zone that sportspeople talk about when they are breaking records: it seems like they are unstoppable and every movement is a beautiful ballet…

This is also why almost all good coders HATE BEING INTERRUPTED!! Whether it is a simple phone call or a well-meaning colleague coming over to tap you on the shoulder and ask a question, the effect is the same as though the coder had been invited to a long meeting. It takes time and effort to get back to being productive after the interruption.

There are several other things that contribute to this problem -

  1. “Open Office” plans where you are compelled to hear your neighbors’ conversations.
  2. Having one phone for several people in your area, so you can’t disconnect it and have to answer it on the off chance the call is for you.
  3. Conversations over information that could be sent by email, IM, SMS or any of the multitude of asynchronous forms of communication available today.

I have seen several ways to combat this  -

  1. Some people wear head-phones to block the ambient noise and subtly indicate to people they are working on something and interruptions are not encouraged (YMMV – I have seen people ignore the subtle indication and come over anyway).
  2. Some people deal with all their email and IM at scheduled intervals – this way everyone gets their reply and people learn to come with the questions at those times.
  3. If you are lucky enough to have a cabin then disconnecting the phone and leaving a message on the door is often effective.
  4. Some companies even plan their meetings to happen only on certain days so everyday disruption is minimized.
  5. A common inclination is to work at times when no-one else is around to bother you. This is a reason why coders are night owls :-)

Another effect of software programming is burn-out… This is the opposite of being in the coding zone, but it seems to be a consequence of being in one… As I mentioned at the beginning of this post, software programming is a very mentally intensive activity. Coders frequently feel mentally burned out after intensive coding sessions. This happens more quickly on projects which you don’t find interesting or enjoyable. Sometimes you can continue only for a day, other times for a month, but invariably burn-out happens. The key is to recognize it for what it is and deal with it.

Indeed, when I previously mentioned that coders lose all sense of time when in the zone, I did not mean that they should spend all their time coding. I am not applauding coders who brag about sitting for 36 hours at a computer churning out code. Those who spend 16 hours a day at the terminal and spend their nights dreaming about code are, invariably, the ones whose work the rest of the team has to spend the rest of the month fixing. As in all things, there is a balance that needs to be maintained. Spending long stretches of time in intense concentration is tiring, and it is important to give things a rest. It is usually great to take some time off doing something else, like mountain biking, mixed martial arts, flying a plane or playing an instrument (these are pastimes of some of my friends :-)). Some people like physical activity; others like mental activities such as video games or chess. The important thing is to have a balance. Sometimes, when you are grappling with a hard problem, it is useful to stop thinking about it consciously and let your subconscious chew over it.

When you are no longer in the zone and are spinning your wheels, a break is the most productive thing you can do.

The Zen of Programming is being able to get into the zone and more importantly to recognize when you are no longer in it and take that break ! :-)

Update: I came across this article the other day that got me thinking about burn-out. I mentioned before that burn-out happens more quickly when doing something you don’t find interesting or enjoyable – this applies in spades when you are doing something you feel is morally wrong or that goes against your conscience. Guilt is a catalyst that will accelerate both the speed and the intensity of your burn-out.

IMHO if given a choice it is much more satisfying to do something you believe in at a lower salary than something you don’t at a higher one.

Learning Software Programming Takes Time

This comic created by the Abstruse Goose is one of my all time favorites.

It embodies my frustration at people who think that all that becoming a software programmer involves is learning a little computer syntax, reading a few books and typing out a lot of code.

The books that claim you can “Teach Yourself” X in Y “days/hours” etc. are written by charlatans out to make a quick buck (I think sometimes it is the publishers that foist such titles on the authors in the hope of selling a few more copies). All they do is frustrate the people who are genuinely trying to learn software programming and provide PHB types with ammunition – after all, if programming can be learned in X time then any schmuck who can read and knows a bit of typing should be able to become a programmer, at least in 2X the time.

Interestingly, no one writes a book on “Teaching Yourself” to build a bridge or a skyscraper in X days, even though building construction is a popular metaphor in software engineering. The presence of these books is IMO a symptom of the ignorance that people have about what it takes to write solid software.

As this post points out – it can take up to ten years for an artist, researcher or a sportsman to be considered a master in his or her field. Software programming is no different.

Some great Internet reading

The other day a colleague of mine asked me about finding software books on the Internet – he was talking about some of the books I had listed in my previous post. Now, that post referred mainly to books that are published and distributed on paper as physical books (some of them are available for sale in other formats like PDF, Mobi and Kindle as well).

However, there are a lot of books on the Internet and even more lists of books. I think we sometimes forget that, besides being a directory of books, the Internet is itself a huge repository of amazing content. So caught up are we nowadays in the real-time fire-hose of social networks and status updates that we have started overlooking some of the really good articles out there. So, in the interests of providing everyone (and myself) some links to leverage, I thought I’d write a post about some of the great content available out there from the software programming perspective…

The best way to get a good list of books about software is to go where software geeks congregate and search. Almost invariably, someone will have asked or talked about the best software books and sparked off the list mania :-) Go to a few websites like this and look for the names that pop up again and again. Stack Overflow is a great example of this – it is a forum for software related questions and though, of late, they have started discouraging open-ended and subjective questions, there are some really great lists of books out there. A few of my favorite lists are -

Of course, like all on-line lists these are updated from time to time, so you need to keep going back to get the latest versions.

The next places I go to are websites that are book repositories and directories. One of the biggest out there is the Wikibooks project – they have a great listing of open books and an entire section is devoted to Computing. There are other websites that specialize in technical books – a couple I go to are -

Another meta-list I go to is my own :-). I leverage on-line bookmarking (I use Delicious) extensively  – I tag my entries profusely and you can slice and dice my list across quite a few dimensions – feel free to do this and pick your favorite lists :-)

An example of an extremely influential article is the Agile Manifesto, which provided so much momentum to the agile methodologies and to the establishment of agile as an alternative development methodology in use today. One person whose articles have influenced a whole generation is Richard Stallman – his articles are available here. I am also a fan of the writing of Eric S. Raymond, keeper of the Jargon File – most of his articles have turned into books -

Along the same lines of ground-breaking articles is the 1972 Turing Award Lecture by E. W. Dijkstra, one of the true giants in the field of programming. There is a PDF version here.

There are some great on-line tutorials on various topics -

There are a lot more – I haven’t listed them explicitly here, so feel free to post comments on what you think I should have added.

Some other influential blogs and articles include -

A couple of the links in the list above are to blogs rather than individual articles – this is because I believe those blogs to have a lot of influential articles :-)

Obviously this is an incomplete list – I have written only of the content that I know of today. I hope to be able to add to it and update it with your help :-)  Please post your suggestions and links to improve it :-)

Update : Via Renju – CLR via C# and Patterns of Enterprise Application Architecture

Here is a list by Martin Fowler of books he participated in creating.

I would also like to add my friend Praseed’s posts – if you are starting out on C/C++ in GNU/Linux these can give you a good boost in the right direction :-)

How to write beautiful code

Beautiful code is elegant and simple – it is concise but clear. There is a balance in the code – a rhythm in the definition and structure of the conditionals and the loops. The intent of each function shines through the code, and there is a pattern in the creation and interaction of the classes and their methods that combines the code into a coherent and beautiful unit. Beautiful code is concise, there are no wasted variables or endless conditionals – it is a pleasure to read not just because of the ease of reading but because of the way in which it communicates the ideas and intent of the programmer.

Well, now that I have waxed lyrical about what is good code, the next logical step would be to figure out how to write such code. Beautiful code starts with good understanding – in order to write beautiful code the first step is to understand the problem you are trying to solve. The next step is to have a clear idea of the solution and the approach you are going to take. These two things are themselves entire subjects in software development – so I shall, for the purposes of this post, assume that you have a clear idea of the problem you are solving and the approach you are going to take to solve it :-)

Even with these conditions met (understanding the problem and identifying the solution), sitting down in front of a blank page and writing out excellent bug-free code is almost impossible IMHO. The best programs out there are the result of an iterative process of coding and re-coding repeatedly – almost obsessively. Writing a program is like building a clay sculpture – you start off with a lump of clay, then you broadly shape it and then keep removing and adding bits and pieces till you get your sculpture – sometimes you have to remove a big piece and add another instead and sometimes you simply throw everything away and start over.

Writing beautiful code is hard – a seemingly simple algorithm like Quicksort is the result of years of effort to come up with a concise and elegant implementation (in fact Quicksort has several implementations). Even a simple piece of code like the quintessential “Hello World” program can be written in so many ways (in fact it is maintained as a separate GNU project).
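To make the point about several implementations concrete, here is one compact way Quicksort is often written in Python. This is only a sketch of a list-building variant (it is not the classic in-place version), and the function name is my own choice for illustration.

```python
def quicksort(items):
    """Return a new sorted list (a simple, not in-place, Quicksort variant)."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]  # middle element as the pivot
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Sort the two outer partitions recursively and stitch the result together.
    return quicksort(smaller) + equal + quicksort(larger)


print(quicksort([5, 3, 8, 1, 3]))
```

This version trades memory for readability; an in-place partitioning version would look quite different, which is rather the point.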

So when do you stop iterating? There are some factors to consider in making this decision – usually if you are working on commercial software this decision is not in the developer’s hands. The almighty deadline determines the ‘done-ness’ of your code – indeed, this seems to be a psychological impetus for a lot of us. I have seen a lot of places and projects where people find it hard to work without a deadline looming, like the sword of Damocles, above their heads. Indeed I suspect there is something psychologically appealing about having this decision taken out of our hands.

Again, for the purposes of this blog post, let us assume you have control over when you decide your code is done and you decide to release it only after you feel it is good enough. The question then becomes how do you know that what you have is beautiful code?

The first requirement of good code is that it should work. If your code does not solve the problem it was intended to solve, you need to go back to the drawing board my friend – this is a necessary pre-requisite but it is not a sufficient condition for beautiful code.

One way to identify beautiful code is to read about programming – programming methods, philosophies, etc. I have a book list of good software books you can start with (you can look at my post on internet reading for more book lists and articles). In order to be a good sculptor, you need to know what beautiful sculptures look like – so you look at pictures of great sculptures – in fact this is usually a part of the curriculum for art programs. Similarly, sculptors look at examples of bad sculpture in an effort to recognize what to avoid. So, in order to identify beautiful code, look at examples of beautiful code – code written by great programmers and code written for great projects – as well as bad code, ugly code.

Unlike sculpture – where you would have to travel to Italy to get a look at David or The Pieta – you have a lot of good code to read available at your fingertips – just open up the internet and look around :-) There are plenty of open source projects that share their code-base – start with GNU, Sourceforge, and Google code (check out this post on the world’s oldest software repositories). If that’s not enough, take a look at the examples in the ‘Code Complete’ book (other books you can look at are ‘Beautiful Code’, ‘Beautiful Data’, ‘Clean Code’ and ‘Coders at Work’). Identify the patterns followed in beautiful code and the patterns you see in ugly code.

Another very important thing to recognize is when you have stopped coding to a requirement. Good code is spare – it provides a solution to the problem at hand – no more – no less. A sign of good code is when you go through it and feel there is nothing you can remove from it – to paraphrase one of my favorite quotes (by Antoine de Saint-Exupéry) – “A programmer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.”

And finally, the only way to make beautiful code is to write lots of code and publish it. All programs have constraints – some are technical and others logistical and yet others philosophical – good code is an elegant balance between these constraints.

So budding programmer – Good Luck and Happy Coding!! I leave you with the following philosophy from the Python programming community -

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

PS: I refer to coding and programming interchangeably in this post. What I mean by these terms is the act of writing a software program.

PPS: You can get the “Zen of Python” poem by typing “import this” at the interactive python console.
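If you would rather get at the poem from a script than from the interactive console, something like the following should work. It is a small sketch that relies on the fact that the `this` module prints the poem when it is first imported; the variable names are mine.

```python
import contextlib
import io
import sys

# Forget any earlier import so the module runs (and prints) again.
sys.modules.pop("this", None)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import this  # importing the module is what prints the poem

zen_lines = buffer.getvalue().splitlines()
# Skip the title and the blank line after it; keep the aphorisms.
aphorisms = [line for line in zen_lines[2:] if line]

print(zen_lines[0])
print(aphorisms[0])
```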

Update 1: I found this link to a Ruby conference keynote where the speaker talks about beautiful code – Ruby is another language that has writing elegant code as one of its goals.

Update 2: Here is a great post on why you should spend the time and effort to write beautiful code :-)

Update 3: Here is a website from John Graham-Cumming where one can submit and read beautiful code.

Update 4: A great article on beautiful code by Brian Kernighan

A podcast review – Software Engineering Radio

I am a fan of podcasting. Video blogging and screen casts are fine, but in terms of convenience and bang for the megabyte, nothing beats a podcast :-). I have been listening to podcasts for a few years now. I am usually listening when walking home from work or doing some chores. In fact I wrote an earlier post about them along with some of my favorites (at the time). Of late I have listened a lot to a podcast called Software Engineering Radio. Actually, I have been listening to this one for a long time now and it has gone from strength to strength. It was started by a German software consultant – Markus Voelter – and is based in Europe (the details are here). There is now a team of people both creating episodes and supporting the website. A new episode is published every 10 days and the feed for it is available at the website.

The podcast is about software engineering and is for professional developers (it says so in the title :-)). IMHO it is the best podcast for professional developers I have come across. Markus and his team of volunteers do a great job – the interviews are professionally done and the audio quality is great. The post-production is also very well done – in fact they have put up videos about their recording and post-production process. All the recordings are released under a Creative Commons 2.5 license and I think the majority of the episodes are gems worth downloading and keeping.

The coolest thing about Software Engineering Radio and what ultimately makes it such a great resource are the subjects and the people they interview.

The subjects are for the most part about software engineering practice – but they are things that are really useful to one’s growth as a software professional. Episodes range from discussions about software development methodology to software languages and tools to computer science research. In fact there was recently a very interesting episode on the Difference between Software Engineering and Computer Science.

The people interviewed range from professional software developers and consultants to researchers and academics. Some are giants in their field and others I haven’t heard of (admittedly that’s a large percentage of them – maybe because I hadn’t heard of their field in the first place ;-)).

Another thing I would like to mention is the format of the episodes. They are interviews, and the team at SE Radio go to great lengths to prepare and ask just the right questions to illuminate the topic completely and extract pearls of wisdom from the people they interview. I think this format is better than a speech or a presentation because of the interaction between the interviewer and the interviewed.

There is sometimes a problem with the English accent – most of the team are from mainland Europe and have European accents, but it is never so bad that one cannot understand what is being said. In fact one of the things I like about the show are the various accents of people from different parts of Europe, UK and America. It makes the show more interesting to me as I try to guess at the nationalities of the interviewer and interviewed from their accents ;-)

Continuous Integration – more than just automated builds

This post is derived from an article by Martin Fowler on Continuous Integration. It is my take on Continuous Integration and its use.

‘Continuous Integration’ is a software development practice where members of a team integrate their work frequently; usually each person integrates at least daily – leading to multiple integrations per day. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.

Software development is full of best practices that are talked about at great length and detail but rarely actually done. Continuous integration is one such best practice that has been around under various guises for a long time. Anywhere you have heard or discussed the virtue of having a common build that is done in a regular manner – you are hearing about continuous integration. In fact Microsoft has been doing automated daily builds for years.

The term ‘Continuous Integration (CI)’ is one that originates in the Agile Programming discipline known as ‘Extreme Programming (XP)’ as one of its twelve original practices. I believe CI has enough merit that it should be implemented even in projects that are not following XP or other agile development practices.

So how does one go about achieving CI within your software development process?
There are two critical aspects to this effort – evangelizing, educating and mentoring the team in CI practices and creating a repeatable process that allows the team to integrate their code quickly, preferably in an automated fashion.

One of the hardest things to express about CI is how it makes a fundamental shift to the whole development pattern; one that isn’t easy to see if you’ve never worked in an environment that practices it. For many people team development just comes with certain problems that are part of the territory. CI reduces or eliminates these problems, in exchange for a certain amount of discipline. The fundamental benefit of CI is that it removes sessions where people spend time hunting bugs where one person’s work has stepped on someone else’s work without either person realizing what happened. These bugs are hard to find because the problem isn’t in one person’s area, it is in the interaction between two pieces of work. This problem is exacerbated by time. Often integration bugs can be inserted weeks or months before they first manifest themselves. As a result they take a lot of finding. With CI the vast majority of such bugs manifest themselves the same day they were introduced. Furthermore it’s immediately obvious where at least half of the interaction lies. This greatly reduces the scope of the search for the bug. And if you can’t find the bug, you can avoid putting the offending code into the product, so the worst that happens is that you don’t add the feature that also adds the bug. (Of course you may want the feature more than you hate the bug, but at least this way you have the choice.)

Though CI is a practice and so does not need any particular tooling to deploy, it is generally found that automation is a key factor in making CI work. In order to automate the CI process, the following is generally recommended –

  1. Keep a single place where all the source code lives and where anyone can obtain the current sources from (and previous versions)
  2. Automate the build process so that anyone can use a single command to build the system from the sources
  3. Automate the testing so that you can run a good suite of tests on the system at any time with a single command. (The build does a self test)
  4. Make sure anyone can get a current executable which you are confident is the best executable so far. (The build creates a deployable version of the software)

All of this takes a certain amount of discipline and consistency in the approach. It is difficult to introduce it in a project but once introduced it does not take that much effort to keep it up. So let us examine the points given above in turn.

  1. Keep a single place where all the source code lives and where anyone can obtain the current sources from (and previous versions)
    This is a basic requirement in software development. Regardless of the size of the team or the complexity of the project, all the source code must be available in a single easily accessible location. The versioning requirement means that one must invest in a proper source control system. All the artifacts needed to run the application should, ideally, be located in the source control repository. This would allow the build process to obtain all the resources needed to build the project from a single location.
  2. Automate the build process so that anyone can use a single command to build the system from the sources
    Automated builds are essential to create a repeatable process that can be run on demand. This prevents human error in typically complex systems where, in addition to building the source code, one must set configuration values, etc. in order to produce a deployable application. For complicated projects, building the source code itself can be a complicated process involving multiple projects and dependencies. Most major software platforms have build tools that help make these tasks simple.
  3. Automate the testing so that you can run a good suite of tests on the system at any time with a single command – make your build self-testing.
    Traditionally building involves compiling, linking, etc – all the stuff required to make your program run. However simply because your program loads does not mean that it is working correctly. One very effective way to catch bugs is to run a test suite against your program. These test suites should be built so that they can run automatically against the latest build of the software.
  4. Make sure anyone can get a current executable which you are confident is the best executable so far.
    This can be achieved by getting everyone to commit their code to the build often (every day) and ensuring that these commits trigger an automated build every time. This helps ensure that anyone can get the latest workable version of the code at all times – one that passes all the tests. It allows one to be more confident that the code one is developing will work when merged back in. It also allows testing to be done on the latest version of the code, ensuring that bugs being tested are not due to version conflicts or integration issues. It helps testers focus on the changes to the code and identify issues faster as a result. Overall the confidence in the quality of the code is improved.
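The four points above boil down to "one command that fetches, builds and tests, and tells you whether the result is good". A minimal sketch of such a driver in Python might look like the following. The step names and the commands are placeholders I made up so that the example runs anywhere; a real project would call its own source control, build and test tools here, or use one of the CI servers listed below.

```python
import subprocess
import sys

# Hypothetical pipeline: each step is a label plus the command to run.
# These commands are stand-ins that only print a message.
STEPS = [
    ("checkout", [sys.executable, "-c", "print('fetching latest sources')"]),
    ("build", [sys.executable, "-c", "print('building the system')"]),
    ("test", [sys.executable, "-c", "print('running the test suite')"]),
]


def run_pipeline(steps):
    """Run the steps in order and stop at the first failure.

    Returns (True, None) if every step succeeded, otherwise
    (False, name_of_the_failed_step).
    """
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False, name
    return True, None


ok, failed_step = run_pipeline(STEPS)
print("build is good" if ok else f"build broken at step: {failed_step}")
```

Stopping at the first failing step is deliberate: a build that does not pass its tests should never be handed out as the current good executable.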

While CI is a great process, it is, like any process, only as good as its implementation. There is an abundance of tools out there that do CI or have it as part of their feature set. I will list some that I have dabbled in and heard nice things about -

  • CruiseControl.NET This is an open source tool from ThoughtWorks.
  • CruiseControl This is the original Java version of CruiseControl.NET
  • CI Factory Based on CruiseControl.NET but with a lot of the other agile tools also integrated, as well as templates on how to set up your repository, etc :-)
  • Hudson Java tool. I have seen nice things written about its simplicity.

A good book to read is Continuous Integration: Improving Software Quality and Reducing Risk

Update: A Stack Overflow user was asking about Continuous Integration tools today and I pointed him here – I found another useful link while going through all the answers.

ThoughtWorks have created a feature matrix for CI tools that seems to have all of the major tools covered – link