IBM Turns to Open Source Development

By David Worthington, BetaNews
June 13, 2005, 2:47 PM

INTERVIEW Is open source changing the way that software is made? It is at IBM. BetaNews sat down with Doug Heintzman, IBM Software Group's VP of Strategy and Technology, to discuss the adoption of a hybrid development model called Community Source that combines the best elements of the open source model with decades of IBM programming practice.

This componentization, says IBM, will liberate the creativity of programmers, drive efficiency, and bring products to the marketplace at a faster rate than was previously possible - avoiding a top-down approach that IBM says could make Microsoft's Longhorn obsolete upon arrival.

BetaNews: Let's begin with an introduction to Community Source for our readers.

Doug Heintzman: The quick gist of it is really quite simple. We run a very large software development company and we have laboratories around the world. And due to a number of technological factors as well as some efficiency enablers like the Internet, we have decided to move to a new development methodology.

We are systematically decomposing our technologies into a number of components, with a lot more reuse than was previously possible. Because of that strategic decision we need to move to a development system that allows us much greater transparency, and a much greater awareness and cross pollination of expertise, ideas and requirements between all of the various different laboratories.

Increasingly our products are assembled from components instead of architected from the top down. We basically leveraged our rather extensive experience with the open source communities and we have borrowed many of their philosophies, strategies, tools and a lot of their culture to transform IBM's internal development practices to support global component development and promote collaboration and reuse of technology.

That's the high level synopsis.

BetaNews: What inspired IBM to begin the Community Source program?

Doug Heintzman: The community source approach at IBM is a means to an end. We believe in an Internet connected world with the business requirements that the on-demand era of information technology is suggesting. There is going to be an important shift and to deliver technology that addresses that shift in both customer requirements as well as our technological capacity, we have decided to systematically componentize and modularize our software. That is allowing us to get to the market much more quickly to address these requirements in a much more time and cost effective manner.

So with that recognition that this is the way we are going to develop software going forward it is clear to us that the way we traditionally develop applications is sub-optimal to achieve this goal. It's a very ambitious goal; no one has tried to do it on the scale and scope that we are doing it.

We very much believe that the software industry is moving through the same kind of componentization transition that many other industries ranging from the automotive industry to the disk drive industry and chip industry have all gone through. And the companies that emerge from this transition and have successfully broken their products down into sub-assemblies to reusable components will have tremendous advantage in the marketplace. So that's the driving motivator. Community Source is a way to get there.

BN: Would you consider it to be managed code or something different? To elaborate, Microsoft is attempting to push its developers to use modules of managed code within Visual Studio to create more reliable Windows applications more efficiently.

Heintzman: I think that managed code is a way of referring to a structured component and we've got a number of names for them ourselves. We believe a few things: First of all, we have a systematic architectural view of our entire product portfolio. And we have a group of people dedicated to identifying what groups of technologies should be componentized and providing the heavy architectural framework to the development team so that they can do that componentization more effectively.

From that sense there is a managed code approach in that there is a group of people whose full time job it is to worry about what ought to be componentized, how that componentization should be done, and what standards need to be adhered to.

Now that being said, there is a second part of this, and this is really borrowing from the culture of the open source community. There is a very important role in a software company like IBM for top down managed code architecture and all that kind of good stuff. But there's also a tremendous amount of potential innovation that is locked up in the heads of the front line programmers and we try to liberate that creativity and the innovative potential of all of those people.

We look at the open source communities out there and we are witnessing this kind of fascinating bottom-up grassroots innovation where great people have an idea and collaborate with other people and get together to make those ideas into something real. That's a very exciting phenomenon. So certainly we have a structured approach to community source, but we also have an eye to promoting this bottom-up, collaborative, creative process. This is part of borrowing the culture from the open source community.

BN: Speaking of scale and scope, when will Community Source be fully implemented companywide?

Heintzman: It's already been implemented companywide. We have a hundred projects that are currently up using the Community Source. We have over 2,000 users that are currently registered and using the Community Source tools and processes. There are nine or so new projects that have been registered just in February, which is the last time I saw the numbers on the progress we are making.

We actually have 34 or so projects that we call "in active production." They are not just kind of hobbies; these are active development projects that run directly into our product line.

BN: Is Community Source phased in throughout every product group, or is it targeted more towards improving core IBM software technologies that are more modular in nature?

Heintzman: Good question. There are two ways to answer that. Certainly a lot of our componentization efforts, and subsequently the tooling to support those componentization efforts, are focused on the core enabling. But the real key here is that we are looking at a lot of the new emerging applications especially in some of the development areas where we are seeing the first real projects. We don't want to be overly disruptive; we want to phase these things in responsibly.

A lot of projects in Rational are coming online using these kinds of techniques. And of course the Rational people have exposure to the Eclipse open source world. We are learning a lot from that experience. So it's not as systematic; it's the first area to really go at it. The first area that the open source community went out and really attacked was the development phase, so in many ways I suppose it makes some sense for that to be the place where we actually start ourselves. But increasingly, just like the open source community has gone into other areas, that is also true of the community source efforts inside of IBM.

BN: You mentioned grassroots, Doug. Is it conceivable that IBM developers could be inspired to contribute to projects they are not assigned to in their own spare time?

Heintzman: We would be very happy for programmers to program in their spare time, assuming of course, they have an appropriate degree of life/work balance and all of that. You actually raise a very interesting question. One of the most important reasons we are doing this is because of transparency issues.

It's because the code, the design specs, the documentation are really out in the open that any laboratory working on any product properly registered with the proper authority to do so can go in and make comments. We may have interesting techniques that we have developed in one group that they recognize have not been implemented in the product that they are consuming.

We want them to go out and say, "You know what, guys, we figured out how to do this six months ago; try it this way." Not coaching them, but helping them along. At the end of the day we want the whole engine to be running as efficiently as possible, and if that means different people working on projects that may need help and need work then so be it.

BN: Moving down another path in this discussion - what is the backend to Community Source? Is it similar to the SourceForge community Web site?

Heintzman: It is similar; in fact we use a lot of the same tools that a lot of these projects use. We have tools that store the actual source logic using repositories very much like SourceForge, documentation and specs, news and bulletins, patches and fixes, and educational material – just the way you would see it in SourceForge.

We have a bunch of other technologies and community tools that we surround it with. In many ways, because our developers aren't just working on one particular project, we have recently implemented a developers' intranet that provides a centralized umbrella for best practices and new tools that are available.

There's a broader community set that in many ways is a superset of what you would see in something just like SourceForge. It's a more comprehensive approach, but we have the resources and the requirement to do that. We also want high degrees of mobility of our programming staff.

That's a degree of discipline that you really don't see in the open source world per se. It's not that we are just copying the open source world, we are borrowing its culture and we are applying our own twist on it to be sure that it runs the way that we need it to run.

BN: So IBM feels it is taking the best of the open source culture. How will IBM keep its source code secure and guarantee the code's safety and integrity inside of this system?

Heintzman: Interesting question. You know that question cuts two ways. There is considerable merit in open source: when you've got many eyeballs, you can drive out bugs and security holes and flaws and fix them more efficiently. I think that that's true. I think that it is also true that commercial software companies like IBM have very rigid quality control best practices, reviews and scans - disciplines that they have developed over decades.

We are sharing with the open source community to try to help them mature. So we find ourselves borrowing, as you say, the best of this new culture and merging it with a lot of the lessons, tools, techniques and practices that we have developed over many, many years to come up with this productive hybrid. That's what Community Source is really all about.

BN: Could Community Source assist in breaking down language barriers between IBM's locations throughout the world since programming code is a universal language? IBM is a company with many laboratories outside of the United States so there must be linguistic challenges.

Heintzman: Absolutely. We've got a whole bunch of laboratories on every corner of the planet. I've never really thought about it. Certainly in all of the various different programming reviews that we have looked at, linguistic diversity has never been one of our prohibitors that I am aware of.

Will this help? Sure. Is it "the" reason to do it? I would say it's probably a "nice to have."

BN: So is linguistic diversity now somewhat easier to have than in the past?

Heintzman: I wouldn't quite phrase it that way. The frank reality is that in the world of coding there really is kind of one language. Programmers work with certain programming tools and we build our reference materials and our documentations in reference languages and then we massively translate things. Our laboratories have been working from common linguistic code bases for ages. It's too complex to try to integrate all of that componentry from different laboratories unless you have some kind of normalization.

BN: In a press release IBM said, "Community Source has made it easier for IBM developers to create and trade Lego-like components of code that can be plucked from a library, quickly assembled and re-used later."

Considering how many ongoing projects IBM has, how will these 'legos' be organized? We touched on this briefly when we discussed the full time IBM employees that are dedicated to making sure that these modules of code are used appropriately.

Heintzman: We've got a division whose role is to work with the core architects for all the software groups to get the big, big, big picture. They drill down into what the big platform picture looks like, and they identify targets that would lend themselves to componentization and high degrees of reuse.

This is a journey for us. We are starting with the modules that will give us the best payback and we will move to other secondary modules. They sit down and kind of look from an architectural standpoint and they define lists. Then they go to the laboratories that have responsibilities in those areas and they sit down and say, "This is what it means to be defined as a component. This is how we are going to normalize things to allow for high degrees of reuse."

That's an interesting dialog because when you take on the responsibility of owning a component, you are actually taking on a higher degree of financial burden. It is more expensive to develop to the high degree of specification which components require and to subsequently support them and provide support to all of the groups that consume them.

Be that as it may, we have made a decision that says, "This is our strategic direction," and each of the laboratories understands that they have responsibilities to live up to, and yes they may be spending more money on one particular component, but they will also get benefits because they are now consuming components that they don't have to fund themselves.

BN: IBM states that development is 30 percent faster using an open source model. I am just curious: do you have any empirical data to back that up, and how did you come to that conclusion?

Heintzman: That's a good question, and unfortunately I am not the most skilled person to answer the specifics of the methodology behind those kinds of numbers.

At the high level there are all different kinds of code and there are all kinds of ways of measuring the productivity of programming. Say you are building a product and you need an administration console. The fact that there is one in the library that you can plug in is almost infinitely productive relative to hiring a bunch of coders and going out and building the thing. So how much extra productivity do you get from that now? A lot.

If you use these Community Source approaches, the more quickly and more cost effectively you get to the right answer. You could get that console built by taking those programmers and locking them in a room and then off they go, but will they produce a piece of code at the end of the day that really is highly productive for reuse across the company?

There are all sorts of ways of measuring it and 30 percent is kind of our rough averaging of how quickly the projects that we have brought into production have succeeded versus the percentage of them that have been deemed failures, and how many developers it required to code a particular function versus how we have done it before. We actually do have specific metrics we measure for everything and if you take a step back and put your thumb out we are getting about a 30 percent improvement.

BN: It struck me that IBM said, "at companies like Microsoft (just think of Longhorn, taking years to deliver at the risk of obsolescence upon delivery)." That's a pretty bold statement. How can IBM say that considering IBM itself has for a very long time used the same hierarchical approach?

Heintzman: I think there are actually two issues that we are commingling here, so let's make them a little distinct. So one of the issues is highly integrated code that was built originally for a specific purpose. These code bases are very expensive to modernize, to add functionality to, and to develop and support.

Certainly the Microsoft Windows platform, and we've witnessed the continual delays of the platform, is an incredibly complex engineering problem they have. They've got millions and millions of lines of code and they are all spaghetti'ed together and as soon as they go and change or fix something it can ripple all the way through the code.

We had exactly the same challenges in our Lotus Notes platform. It is a very mature, very complex highly functional platform, but it is very expensive to develop it and to maintain it. And so what we've decided to do is work to rip the Notes code out and replace it with components to kind of systematically move towards componentization.

Our hope is -- and our early experience with it is actually very encouraging -- that we will be able to bring products to the marketplace very, very quickly. If you look at the rate of innovation, feature addition and quality enhancement in something like our portal platform, they are versioning at unbelievable rates. It went from almost nothing to an industry leading incredibly comprehensive platform very quickly.

You couldn't do that with the same rate of innovation using the highly structured models of the code bases. So componentization gives you some extra flexibility there.

The second question: Now, if you buy the first argument that says that integrated platforms become huge and go into multi-million lines of code, the engineering problem becomes incredibly difficult to scale. The question is what kind of programming culture do you want to put in place using this componentization approach.

BN: So has Microsoft put itself at a disadvantage?

Heintzman: Microsoft has a unique challenge. In their position in the marketplace and dealing with the compatibility issues, they have an enormously complex engineering problem that with every release just gets worse and worse and worse. At some point that system really breaks down and you get diminishing returns on your investment.

We kind of took a review and projected out where our various different platforms were going years ago and we projected that they were going to hit a knee in the curve that made them incredibly unproductive because of the engineering complexity. So we decided on this new development strategy and this new philosophy, and we turned to Community Source as one of the enabling technologies to allow us to obtain engineering efficiencies.

BN: One last question, Doug: What is the future of the program?

Heintzman: You don't do these things overnight. There are a lot of good things in the open source development process, the community development and the culture. There are also some deficiencies with that process, and having some architectural discipline to make sure that you have the most efficient deployment of resources makes very good sense.

We are really trying to harvest the programming culture and practices that we have honed, developed and optimized over many, many years, along with some of these new techniques that we have learned through our exposure to the open source communities and have tailored to our purposes internally.

We will merge these things, and certain projects that have been done in highly traditional ways will adopt some of the practices. New projects that grew up on these new processes will also adapt to many of the traditional disciplines that we have matured over the years. We hope we end up with the appropriate balance of the best of both of these worlds.