ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Tools for Geographically Distributed Software Development

by Ryan Bagueros
05/17/2007

It used to be that software development happened inside a building, usually decked out with office-blue carpeting and fluorescent lighting. Important discussions took place in impromptu face-to-face meetings among engineers constellated around a water cooler or a box of donuts. Spending any time outside the office during work hours meant falling behind, because so much of the project communication was exclusively in person. Today, development teams are increasingly geographically distributed. Some projects last their entire lifespan without any of the developers ever meeting each other. Geographically distributed development (GDD) offers a number of rewards in efficiency and cost, but there are also challenges in creating successful GDD teams. In this article, I'll review some of the technology and tools we use to get significant gains out of GDD, within the context of the challenges it poses.

Our software teams are made up of people living primarily in San Francisco, California and Sao Paulo, Brazil, but we've had projects that included developers in Germany, the Netherlands, Argentina, and Venezuela. The motivations that compel organizations to try GDD can include the cost savings of outsourcing and the ability to achieve faster development cycles by experimenting with programming methodologies like XP. Or perhaps the project is free software and it just so happens that the volunteers are spread out across the world. In our case, we all met while working on and managing open source software projects, and our processes evolved to meet the demands imposed by a group of people living far away from each other. Only later did we decide to leverage the cost and speed advantages in projects for paying clients.

The biggest risk a GDD project faces is to its ability to communicate effectively internally. Bosses want their developers in the office every day because that is a proven technique for building a tight, well-communicating team, which is a critical part of any project's success. GDD not only lacks face-to-face relationships; there are also language barriers that make communication even more difficult. However, technology can help geographically dispersed teams achieve a level of communication as sophisticated as what develops when everyone sees each other every day.

Two professors from Columbia University have studied this facet of GDD, and their answer is CHIME (Columbia Hypermedia Immersion Environment), a 3D virtual world that represents the software project. Developers "walk around" interacting with other developers' avatars and with project artifacts (the list of milestones, for example). The goal is to give the engineers a virtual office that promotes the same type of interpersonal relationships that make in-person communication so effective. In our experience, however, the same effect can be achieved with some of the most basic internet tools. The cornerstone of GDD for us is the Internet Relay Chat protocol. We've found that IRC is superior to instant messaging for long-distance development because it captures the same effect that CHIME is going for. IRC gives developers a space to have real-time conversations with whoever is online and interested in the topic. Other developers don't need to be there to get something out of the conversation, as they can read it in the backlog when they return. It is possible to incorporate IRC into a work style such that it's easy to maintain a constant presence in the channel without exerting much effort to be there. Developers can jump in with an answer to a question that only they know and then jump right back into what they were doing. If the development team has successfully integrated IRC into its workflow, it can even provide advantages over the traditional office, because in IRC the water cooler conversation includes more people and is logged for posterity.
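The backlog idea can be illustrated with a minimal sketch, assuming the channel traffic is captured as raw IRC protocol lines of the form ":nick!user@host PRIVMSG #channel :text". The names ChatLine, parse_privmsg, and format_backlog are hypothetical and chosen only for this example; they are not part of any IRC client or bot we use.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# A channel message as relayed by an IRC server looks like:
#   :nick!user@host PRIVMSG #channel :text of the message
PRIVMSG = re.compile(
    r"^:(?P<nick>[^!\s]+)!\S+ PRIVMSG (?P<target>\S+) :(?P<text>.*)$"
)


class ChatLine(NamedTuple):
    """One parsed channel message (hypothetical helper type)."""
    nick: str
    channel: str
    text: str


def parse_privmsg(raw: str) -> Optional[ChatLine]:
    """Return a ChatLine for a PRIVMSG line, or None for anything else."""
    match = PRIVMSG.match(raw.rstrip("\r\n"))
    if match is None:
        return None
    return ChatLine(match["nick"], match["target"], match["text"])


def format_backlog(raw_lines: Iterable[str], channel: str) -> List[str]:
    """Keep only messages sent to `channel`, formatted for later reading."""
    backlog = []
    for raw in raw_lines:
        line = parse_privmsg(raw)
        if line is not None and line.channel == channel:
            backlog.append(f"<{line.nick}> {line.text}")
    return backlog
```

A developer coming back online could then read format_backlog(captured_lines, "#dev") to catch up, where "#dev" stands in for whatever the team's channel is called.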

The communications format of IRC also provides the basis for confronting the language barrier. The socially acceptable pauses in the IRC back-and-forth give developers the chance to rely on a suite of tools to help them learn a new language, and there is no better way to learn a language than by jumping into real-world conversations with native speakers. A developer can build an effective suite from free services on the internet. Generally, it should include an online dictionary for translating words as well as a tool to conjugate verbs and provide other grammar-oriented translation functions. AltaVista's Babelfish and the Google Translation Tools, for instance, are absolutely terrible at translating individual words and, therefore, entire paragraphs: if you put in the word "bear," there is no way to tell AltaVista or Google whether you mean the forest animal or the verb. But those types of tools are useful for conjugating verbs or building prepositional phrases. So an effective mix of a quality online dictionary and the translation tools is needed. It is very easy to get comfortable with these tools and become quick at referencing them without taking an abnormally long time to respond within the context of IRC. Several members of our team have become bilingual or multilingual just by using IRC and free internet language tools. Granted, language instructors will note that this approach focuses almost entirely on reading and writing and spends almost no time on speaking and listening, but written fluency makes spoken fluency much more attainable. And, more importantly for us, written communication suffices to get the developers working together.

Creating a virtual office in IRC (or something similar) is only the foundation of a GDD team, but it is one of the most important pieces. It is safe to say that over the past years, the office I went to every day in IRC was often more "real" than the physical offices I would go into at dot-com companies in the San Francisco Bay Area. And after six or seven years of working like this, I met my co-workers for the first time last year at a Developers Summit we organized in Sao Paulo, Brazil.

Once the framework for day-to-day collaboration is established, the next challenge to face is temporal. The time zone difference between the members of a GDD team can give the team an edge, or it can be an obstacle that confuses the workflow. An excellent tool that will be constantly useful is timeanddate.com. This web site gives quick access to time zone information and also has a Meeting Planner application that makes it easy to visualize time zone differences while scheduling a meeting. There are also a number of other tools on the site related to date and time, which will be helpful as the GDD team adjusts to their coworkers' days and nights.
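The arithmetic behind a meeting planner is simple enough to sketch. The example below is only an illustration under stated assumptions: the fixed offsets (UTC-7 for San Francisco, UTC-3 for Sao Paulo), the nine-to-six workday, and the function name shared_window are all hypothetical, and fixed offsets ignore daylight saving changes.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumed fixed offsets for illustration only; they ignore daylight saving.
# Passing zoneinfo.ZoneInfo objects instead would follow the real rules.
SAN_FRANCISCO = timezone(timedelta(hours=-7), "SF")
SAO_PAULO = timezone(timedelta(hours=-3), "SP")


def shared_window(day, tz_a, tz_b, start_hour=9, end_hour=18):
    """Return (start, end) in UTC of the overlap between two local
    workdays on `day`, or None if the workdays do not overlap."""

    def workday_utc(tz):
        # Build the local workday, then convert both ends to UTC.
        start = datetime.combine(day, time(start_hour), tzinfo=tz)
        end = datetime.combine(day, time(end_hour), tzinfo=tz)
        return start.astimezone(timezone.utc), end.astimezone(timezone.utc)

    a_start, a_end = workday_utc(tz_a)
    b_start, b_end = workday_utc(tz_b)
    # The overlap runs from the later start to the earlier end.
    start, end = max(a_start, b_start), min(a_end, b_end)
    return (start, end) if start < end else None


window = shared_window(date(2007, 5, 17), SAN_FRANCISCO, SAO_PAULO)
if window is not None:
    start, end = window
    for tz in (SAN_FRANCISCO, SAO_PAULO):
        print(tz, start.astimezone(tz).time(), "to", end.astimezone(tz).time())
```

With these assumed offsets, the two workdays overlap for five hours, which is the window in which a real-time meeting is painless for both cities.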

Once the GDD team is set up to deal with linguistic, temporal, and distance-related challenges, the next challenge is to find a suite of tools that will facilitate the development workflow. Again, many of these tools are core internet collaboration technologies. They include:

  1. Mailing lists: A tried-and-true complement to real-time interaction. Mailing lists allow for a different type of discussion which is (in theory) more reasoned and thought-out than real-time exchanges. We use Mailman.
  2. Version control: Most developers are familiar with CVS and Subversion, two traditional client-server versioning systems that keep track of source code that many people are working on simultaneously. A new family of versioning systems is emerging, though, which favors a distributed repository model over the client-server type. What distributed version control boils down to is smarter mechanisms for merging code that give the developer more freedom (for example, with a distributed versioning system you can check code into the repository without being connected to the internet). Examples of distributed versioning systems are Bazaar and Mercurial.
  3. Bug tracking: The specific bug tracking system used often sets the pace for the phases of the software development cycle, to the extent that it is convenient to use the bug statuses in the system. This component hasn't changed much over the years: bugs are entered, a developer confirms or rejects the bug, and the bug status changes as the developer works on the fix, checks it into the CVS repository, and then pushes the code to production. We still use Bugzilla for this.
  4. Document management: Collaborative document management is a key part of long-distance working relationships. The advantages of a wiki are well known by now: it permits document versioning and rollbacks, it is easy to see who made what change, and it's very simple for anyone to jump in and participate. There are some "Web 2.0" offerings that improve on the group editing functionality of wikis. An example that we've used is writewith.com. For each document you can establish a workflow, there is an easy-to-use history and notes section, and there is a nice desktop-like interface for editing the current revision. However, writewith.com is not a wiki, so the way we've integrated these types of tools is to create the initial version of a document on the more advanced tool, like writewith. Once we've worked through the entire process and arrived at a final draft, we move it to the wiki, which is good for organizing all of our documents.
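The bug lifecycle described in item 3 can be pictured as a small state machine. This is a minimal sketch with hypothetical status names that follow the wording above; they are not Bugzilla's actual statuses, and the advance function is invented for the example.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical statuses mirroring the lifecycle described in the text."""
    ENTERED = "entered"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"
    IN_PROGRESS = "in progress"
    CHECKED_IN = "checked in"
    IN_PRODUCTION = "in production"


# Which status changes are allowed from each status.
ALLOWED = {
    Status.ENTERED: {Status.CONFIRMED, Status.REJECTED},
    Status.CONFIRMED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.CHECKED_IN},
    Status.CHECKED_IN: {Status.IN_PRODUCTION},
    Status.REJECTED: set(),       # terminal
    Status.IN_PRODUCTION: set(),  # terminal
}


def advance(current: Status, new: Status) -> Status:
    """Return `new` if the transition is allowed, else raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(
            f"cannot move bug from {current.value!r} to {new.value!r}"
        )
    return new


# Walk one bug through the whole cycle.
status = Status.ENTERED
for step in (Status.CONFIRMED, Status.IN_PROGRESS,
             Status.CHECKED_IN, Status.IN_PRODUCTION):
    status = advance(status, step)
```

Seen this way, it is clear why the tracker tends to set the pace of the development cycle: each status change marks a handoff between people who may be in different time zones.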

The problem with using all these different tools is that they're scattered around different places on the internet and require multiple logins, and any confusion about which tool to use is a risk to be avoided. We don't have this problem because we've evolved to a stable point in our development process, and each tool we use is there because it worked out best for our workflow. It isn't hard for new developers to catch up with our suite of tools.

It is possible to upgrade your different tools to accept a single login, using new software packages such as aMember. aMember is actually designed for creating paid membership areas of a web site, but that functionality is also useful for providing a single point of login for many different web applications. We are currently in the process of building an aMember-like meta-login for the various groupware tools we use for GDD.
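One common way to build a single point of login is for a central service to hand out a signed token that every tool can verify with a shared secret. The sketch below illustrates that general idea only; it is not how aMember works and not a description of our own meta-login. The function names and the token format are assumptions, and a real deployment would also need token expiry and encrypted transport.

```python
import hashlib
import hmac
from typing import Optional


def issue_token(username: str, secret: bytes) -> str:
    """Central login service: sign the username with the shared secret."""
    mac = hmac.new(secret, username.encode("utf-8"), hashlib.sha256)
    return f"{username}:{mac.hexdigest()}"


def verify_token(token: str, secret: bytes) -> Optional[str]:
    """Any tool (wiki, bug tracker, ...): return the username if the
    signature matches, otherwise None."""
    username, sep, signature = token.rpartition(":")
    if not sep or not username:
        return None
    expected = hmac.new(
        secret, username.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None
```

Each tool only needs the shared secret and the verify step, so adding another application to the suite does not mean adding another password for the developers.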

For teams that are just starting out with GDD, there are commercial services that provide an integrated platform of all these tools. One example is CollabNet's SourceCast, billed as everything you need for geographically-distant software development. Essentially, it provides all the tools discussed above in one place for each software project. While it is certainly possible to construct a GDD environment using free software tools, there is some value in having it all organized in one central location. It is unclear how SourceCast approaches the need for real-time communication and the "virtual workplace."

Similar offerings closer to the hearts of free software advocates are the GNU Project's Savannah and VA Software's SourceForge. SourceCast, SourceForge, and Savannah offer mailing lists, CVS repositories, issue trackers, file stores, and document sharing in one central location for software projects.

An interesting feature now being deployed in offshore development applications is the VNC "watch window." Intended to address the doubts that stateside engineering managers have about whether offshore engineers are being productive, these applications force the developer to "clock in" on his development machine, after which everything that happens on his desktop is viewable in a VNC window from the administrator's or manager's control panel. While this technology could be useful, it will probably do more harm than good to the relationship the manager is forming with the engineer.

The final application to discuss here for GDD is technology for advanced conferencing. While IRC and e-mail are sufficient for most day-to-day tasks, it is useful now and again to have a face-to-face meeting where the team members can see and hear each other. The new generation of internet conferencing is making all of this possible. We use Skype for audio conferencing, as well as cross-platform audio and video conferencing when we need to have a more sophisticated discussion.

So, although there are commercial services available for outsourcing and long-distance development projects, most of them are packaged versions of technologies that open source and free software projects have used for years. With some careful setup, any group of developers can compete with the budgets used by multinational software development firms to distribute their application engineering and quickly realize significant gains in efficiency and speed. In our case, keeping our international team of developers running smoothly lets us bid on jobs on three different continents and in three different languages and outsource when appropriate for cost savings. We also get a lot of personal fulfillment out of bridging cultural barriers as part of doing our jobs every day.

Ryan Bagueros builds internet technologies in San Francisco, California and Sao Paulo, Brazil with his company, northxsouth.com



Copyright © 2009 O'Reilly Media, Inc.