Open Source Software / Free Software (OSS/FS) References

This short paper gives links to important pages related to Open Source Software / Free Software (OSS/FS). This idea / movement / community has risen to prominence, and I've found it very interesting. The following are the links I've found most helpful to understanding it.

Definitions/Names

You can find definitions for the terms open source software (OSS, defined in the Open Source Definition) and free software (defined by the Free Software Foundation (FSF)). In practice, nearly all software meeting one definition also meets the other. In summary, OSS/FS programs are programs whose licenses permit users the freedom to run the program for any purpose, to study and modify the program, and to freely redistribute copies of the original or modified program.

The motives (or at least the emphasis) of the people who use the term ``open source'' are sometimes different than those who use the term ``Free Software.'' The term ``open source software'' (a term championed by Eric Raymond) is often used by people who wish to stress aspects such as high reliability and flexibility of the resulting program as the primary motivation for developing such software. In contrast, the term ``Free Software'' (used in this way) stresses freedom from control by another (the standard explanation is ``think free speech, not free beer''). The FSF has a page written by its founder, Richard Stallman, on why the FSF prefers the term ``free software'' instead of ``open source software''. Eben Moglen has some commentary about these two terms as well. Merrill Lynch executive Robert Lefkowitz found what may be a better way to describe the phrase free software: "We like to think of it as 'free as in market.'" A speech by Tony Stanco describes some of the issues of free software; a good quote from it is that "[in cyberspace] software is the functional equivalent to law in real space, because it controls people, just like law does... [it is] much more obedient and therefore dangerous in the wrong hands." In contrast, Eric Raymond's Open Source Initiative declares that the term ``open source'' is ``a marketing program for free software'' (and recommends using the term ``open source'' instead).

In a similar manner, the most widely used OSS/FS operating system is referred to by two names: ``GNU/Linux'' and simply ``Linux.'' ``GNU'' is pronounced ``guh-new'', and ``Linux'' rhymes with ``cynics.'' Technically, the name ``Linux'' is just the name of one system component (the ``kernel''), but often ``Linux'' is used to mean the entire system. Richard Stallman has written an article on why he believes GNU/Linux should be the preferred term when discussing the entire system, as well as a FAQ. The advantage of the term ``GNU/Linux'' is that it properly gives credit to the organization most responsible for its development - the FSF's GNU project. Not only did the FSF spearhead the initial work, but as measured by lines of code the GNU project contributes far more code than the Linux kernel does. An advantage of the term ``Linux'' is that it's much easier to say. It's also worth noting that many other organizations besides GNU helped develop GNU/Linux, and calling the result ``GNU/Linux'' doesn't give them credit. I try to use the term ``GNU/Linux'' here and in my related paper Why Open Source Software / Free Software? Look at the Numbers!, simply to be consistent (I have to pick a name!). In person, and in some other articles, I use either one; while names are important, I'm more interested in clear communication.

The Free Software Foundation has developed a set of terms for categories of software that you may find helpful.

The word "free" has many different meanings, and these different meanings often make it harder to understand OSS/FS. The term "Free software" (as used in OSS/FS literature) is based on the word "freedom" (the word "libre" is used in some other languages). However, "free" can also mean "no cost", and sometimes "no cost" products come with a "catch" that in fact is the opposite of freedom. A LinuxToday posting found a simple way to express these different meanings of the word free, which I'll slightly paraphrase here:

Free can mean several different things. They are not all the same.

Indeed, the notion of being "free from control by another" is a concept that many OSS/FS advocates or sympathizers emphasize. For example, David Berlind's ZDNet article ``Who gave Microsoft control of your IT costs? You did'' states, "Let this be a lesson: The minute you get you or your company hooked on a proprietary technology, you put the vendor of that technology in control of a lot of things that I'm certain you'd prefer to control..." This is publicly known to be Microsoft's approach; as reported by Cnet (July 2, 1998), Bill Gates said that "about 3 million computers get sold every year in China, but people don't pay for the software... Someday they will, though. As long as they are going to steal it, we want them to steal ours. They'll get sort of addicted, and then we'll somehow figure out how to collect sometime in the next decade." Colin Stefani noted on August 1, 2002, that the license for Windows update "SP3" now requires that users give automatic consent to send to Microsoft not only their product identification number, but also the names and version numbers of every software package and the IDs of every hardware device. Many people fear this kind of loss of privacy. At one time, it was widely understood that while commercial off-the-shelf (COTS) products were useful, you needed to stick with standards so that you had the freedom to switch to competing products when necessary. Some people have had to re-learn that lesson quite painfully.

Don't confuse ``open source software'' or ``free software'' with ``non-commercial'' software -- there are many examples of commercial open source / free software, and OSS/FS must be usable for commercial purposes. Antonyms of OSS/FS are ``closed'' and ``proprietary'' software.

OSS/FS Software Licenses

Essentially all of today's software is licensed; to be OSS/FS, the software license has to follow certain rules. As noted above, although there are two different definitions (for open source software and free software), in practice a license nearly always either meets both definitions or meets neither.

The Software Release Practice HOWTO discusses briefly why license choices are so important to OSS/FS projects:

The license you choose defines the social contract you wish to set up among your co-developers and users ...
Who counts as an author can be very complicated, especially for software that has been worked on by many hands. This is why licenses are important. By setting out the terms under which material can be used, they grant rights to the users that protect them from arbitrary actions by the copyright holders.
In proprietary software, the license terms are designed to protect the copyright. They're a way of granting a few rights to users while reserving as much legal territory as possible for the owner (the copyright holder). The copyright holder is very important, and the license logic so restrictive that the exact technicalities of the license terms are usually unimportant.
In open-source software, the situation is usually the exact opposite; the copyright exists to protect the license. The only rights the copyright holder always keeps are to enforce the license. Otherwise, only a few rights are reserved and most choices pass to the user. In particular, the copyright holder cannot change the terms on a copy you already have. Therefore, in open-source software the copyright holder is almost irrelevant -- but the license terms are very important.

There are dozens of OSS/FS licenses, but nearly all OSS/FS software uses one of the four major licenses: the GNU General Public License (GPL), the GNU Lesser (or Library) General Public License (LGPL), the MIT (aka X11) license, and the BSD-new license. Indeed, the Open Source Initiative refers to these four licenses as the classic open source licenses. The GPL and LGPL are termed ``copylefting'' licenses, that is, these licenses are designed to prevent the code from becoming proprietary. See Perens' paper for more information comparing these licenses. The GPL allows anyone to use the program and modify it, but it prevents code from becoming proprietary once distributed, and it also forbids proprietary programs from "linking" to it. The MIT and BSD-new licenses let anyone do almost anything with the code except sue the authors. One minor complication: there are actually two "BSD" licenses, sometimes called "BSD-old" and "BSD-new"; new programs should use BSD-new instead of BSD-old. The LGPL is a compromise between the GPL and MIT/BSD-new approaches and is primarily intended for code libraries; like the GPL, LGPL-licensed software cannot be changed and made proprietary, but the LGPL does permit proprietary programs to link to the library.

The most popular OSS/FS license by far is the GPL. For example, Freshmeat.net reported on April 4, 2002 that 71.85% of the 25,286 software branches (packages) it tracked are GPL-licensed (the next two most popular were LGPL, 4.47%, and the BSD licenses, 4.17%). Sourceforge.net reported on April 4, 2002 that the GPL accounted for 73% of the 23,651 ``open source'' projects it hosted (next most popular were the LGPL, 10%, and the BSD licenses, 7%). In my paper More than a Gigabuck: Estimating GNU/Linux's Size, I found that Red Hat Linux, one of the most popular GNU/Linux distributions, had over 30 million physical source lines of code in version 7.1, and that 50.36% of the lines of code were licensed solely under the GPL (the next most common were the MIT license, 8.28%, and the LGPL, 7.64%). If you consider the lines that are dual licensed (licensed under both the GPL and another license, allowing users and developers to pick the license to use), the total lines of code under the GPL accounts for 55.3% of the total. My paper on GPL compatibility discusses these figures further.
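To make the Red Hat Linux 7.1 percentages more concrete, here is a minimal Python sketch that converts them into approximate line counts and derives the dual-licensed share. The total of exactly 30 million lines is an assumed round figure (the text says only "over 30 million"), and the names `TOTAL_SLOC` and `pct_of_total` are made up for this illustration.

```python
# Figures quoted in the text for Red Hat Linux 7.1.
# TOTAL_SLOC is an ASSUMED round number: the text says "over 30 million".
TOTAL_SLOC = 30_000_000
GPL_ONLY_PCT = 50.36        # lines licensed solely under the GPL
GPL_INCL_DUAL_PCT = 55.3    # GPL-only lines plus dual-licensed lines


def pct_of_total(pct: float, total: int = TOTAL_SLOC) -> int:
    """Convert a percentage of the code base into an approximate line count."""
    return round(total * pct / 100)


# Share of lines that are dual licensed with the GPL as one of the options.
dual_pct = round(GPL_INCL_DUAL_PCT - GPL_ONLY_PCT, 2)

print(f"GPL-only lines:      about {pct_of_total(GPL_ONLY_PCT):,}")
print(f"Dual-licensed share: about {dual_pct}%")
print(f"Dual-licensed lines: about {pct_of_total(dual_pct):,}")
```

Because the total is a lower bound, the line counts are lower bounds too; the percentage difference does not depend on that assumption.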

The popularity of the GPL is easy to explain; indeed, for many OSS/FS projects the GPL is a good license. An OSS/FS program using a non-copylefting license like the MIT or BSD-new license can be taken by a large company, extended in incompatible ways, and made proprietary. Over time the proprietary version may have so many features needed by customers (or be incompatible with the original) that all users, including the original developers, have to buy from and become dependent on that other company for what was originally the original developers' work. This technique is sometimes called "embrace, extend, extinguish". Whether or not this is a problem depends, of course, on the goals of the program developers. A program licensed under the GPL or LGPL, which are copylefting licenses, has a much lower risk of this occurring (again, program developers may not perceive this as a risk). Many people writing libraries want proprietary programs to be able to call them, so for them the LGPL is a popular choice. Note that a GPL or LGPL program can be used for commercial gain - you just can't distribute binaries without distributing their source code. Of course, if it's your desire that people be able to modify the code and create a proprietary version of it, then a non-copylefting license should be used instead (in that case, I'd suggest using the MIT license, which is simpler and clearer than the BSD licenses). Some projects are "dual licensed", that is, they are available under multiple licenses.

There are legions of articles on licenses, some quite heated. Evan Leibovitch wrote three columns about them at ZDNet (Fatal flaw in BSD?, Is the GPL really "user hostile"?, and When you have Right on your side (or Left, as the case may be)), including an extremely thoughtful piece by Jason Earl on why some BSD advocates are hostile to the GPL. A posting by Morris McGee briefly contrasts the GPL and BSD approaches, arguing that both are needed for OSS/FS development. Sadly, some of the discussion degenerates into name-calling, with Microsoft calling the GPL a "virus"; a counter-claim is that the GPL is a "vaccine against proprietary vendor lock-in". However, it is true that licenses are important, because they set the rules for future users and developers, and any license will restrict what one group or another can do.

Most OSS/FS developers shouldn't create their own licenses; creating a good license requires a good lawyer, and the probability of unintentional incompatibility is great. Even large organizations are usually poorly served by creating their own license, since doing so will greatly reduce the amount of possible code reuse and the number of developers willing to aid the project. In summary, if you want to develop OSS/FS software, consider the GPL for applications, the LGPL for libraries (if you want proprietary applications to call it), and the MIT license if you want your code incorporated into others' proprietary code. The BSD-new license is a reasonable alternative to the MIT license. In particular, it's unwise to create an OSS/FS project using a license incompatible with the GPL, because such a license bars code sharing with a vast amount of OSS/FS software; see my article for more information on why OSS/FS developers should select a GPL-compatible license. The LGPL, MIT, and BSD-new licenses are compatible with the GPL.
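The rule of thumb in the summary above can be written down as a small decision function. This is only a sketch of that guidance, not legal advice: the names `Kind` and `suggest_license` are invented for the illustration, and defaulting a library to the GPL when proprietary linking is not wanted is an inference from the text rather than something it states outright.

```python
from enum import Enum


class Kind(Enum):
    """What is being released (hypothetical categories for this sketch)."""
    APPLICATION = "application"
    LIBRARY = "library"


def suggest_license(kind: Kind,
                    allow_proprietary_linking: bool = False,
                    allow_proprietary_derivatives: bool = False) -> str:
    """Return the license family the summary above would point to."""
    # If others should be free to fold the code into proprietary products,
    # a non-copylefting license is needed; the text prefers MIT for this.
    if allow_proprietary_derivatives:
        return "MIT"
    # A library that proprietary programs may call, while the library
    # itself stays copylefted, is the case the LGPL is aimed at.
    if kind is Kind.LIBRARY and allow_proprietary_linking:
        return "LGPL"
    # ASSUMPTION: everything else, including libraries that should not be
    # linked from proprietary programs, falls back to the GPL.
    return "GPL"


print(suggest_license(Kind.APPLICATION))
print(suggest_license(Kind.LIBRARY, allow_proprietary_linking=True))
print(suggest_license(Kind.LIBRARY, allow_proprietary_derivatives=True))
```

All three possible results are GPL-compatible, which matches the advice to avoid licenses that bar code sharing with GPL-licensed software.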

Descriptions/History

Here are some especially useful descriptions of open source/free software, including philosophical approaches, how it's used in practice, history, and so on:

Several of these documents are available as a single collection titled The Open Source Reader; I suggest looking at this if you want to print a collection of works directly from the web.

Why use OSS/FS?

There are many good reasons to use OSS/FS, and there's actually quantitative data justifying some of its claims (such as higher reliability). In fact, there's too much data - I needed to separate that out into a separate page. See my Why OSS/FS page for more information and quantitative evidence for OSS/FS. A simple qualitative argument (by a company developing OSS/FS) is Michael A. Olson's ``A business case for open source''.

Major Projects

See my Generally Recognized as Mature (GRAM) OSS/FS list for a list of important OSS/FS programs that are generally recognized as mature.

Major OSS/FS projects include:

  1. Linux kernel,
  2. Apache (web server),
  3. Samba (supports interoperability with Windows clients by acting as a Windows file and print server),
  4. GNOME (a desktop environment),
  5. KDE (also a desktop environment),
  6. The GIMP (bitmapped image editor),
  7. MySQL (database emphasizing speed),
  8. PostgreSQL (database emphasizing functionality),
  9. PHP (hypertext preprocessor used for web development),
  10. Mailman (mailing list manager),
  11. XFree86 (graphics infrastructure which implements the X window system),
  12. bind (domain naming service, a critical Internet infrastructure service),
  13. GNU Compiler Collection (GCC, a suite of compilation tools for C, C++, and several other languages),
  14. Perl (programming/scripting language),
  15. Python (another programming/scripting language),
  16. Mozilla (web browser and email client),
  17. OpenOffice.org (office suite, including word processor, spreadsheet, and presentation software),
  18. the open source BSD Operating systems (FreeBSD (general purpose), OpenBSD (security-focused), NetBSD (portability-focused)).

A great deal of documentation is available at the Linux Documentation Project (LDP).

A number of up-and-coming projects are at an alpha or beta level. Some projects that have the potential to be very important, have running code, and are working toward more functionality or stability include the following: Wine (a program to allow Windows programs to run on Unix-like systems), AbiWord (a word processor), Gnumeric (spreadsheet), KOffice (office suite), and GnuCash (money management).

Web projects around the world often use LAMP, an abbreviation for Linux, Apache, MySQL (sometimes replaced with PostgreSQL), and PHP/Perl/Python. More complex web projects may use major libraries or frameworks, such as PHP-Nuke (based on PHP) and Zope (based on Python).

There isn't really one place to find ``all about GNU/Linux''; you could do worse than looking at the linux.org information.

For information on other packages, you could go to places which track OSS/FS projects like Freshmeat (which lists new software available for GNU/Linux and other systems), the FSF list of free software, and the BerliOS SourceWell.

GNU/Linux Distributions

Few people will want to do all the packaging work for entire operating systems based on GNU/Linux. Thus, ``Linux distributors'' sprang up to do that work and sell support, extra services, and so on. Major distributions of GNU/Linux include: Red Hat (#1 in the U.S. by most measures), Debian (#1 non-commercial distribution), SuSE (a major force in Europe), and Mandrake; there are many others.

It's tricky to figure out the market share of GNU/Linux distributions. An IDC study of copies of GNU/Linux sold in 1999 (Red Hat holds huge Linux lead, rivals growing) found that Red Hat shipped 48% and SuSE sold 15%. However, while the market grew 89%, Red Hat's share grew only 69%, suggesting that Red Hat will have more competition ahead.
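The growth comparison can be turned into a rough share estimate. The sketch below assumes that "grew only 69%" refers to Red Hat's shipments growing 69% year over year while the whole market grew 89%; that reading is an assumption, and `implied_prior_share` is a name invented for this illustration.

```python
def implied_prior_share(current_share: float,
                        vendor_growth: float,
                        market_growth: float) -> float:
    """Estimate last year's share from this year's share and growth rates.

    current = vendor_now / market_now, and each "now" is last year's value
    times (1 + growth), so prior = current * (1 + market) / (1 + vendor).
    """
    return current_share * (1 + market_growth) / (1 + vendor_growth)


# Figures from the text: 48% share in 1999, 69% vendor growth, 89% market
# growth. Treating the 69% as shipment growth is an ASSUMPTION.
prior = implied_prior_share(0.48, vendor_growth=0.69, market_growth=0.89)
print(f"Implied share a year earlier: {prior:.1%}")
```

Under that reading, Red Hat's share would have slipped from roughly 54% to 48% in a year, which is the sense in which slower-than-market growth points to more competition.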

The fundamental flaw in these numbers is that they count only sales. Debian is not usually "sold" in the traditional manner. A copy of a Linux distribution can in many cases be downloaded for free (see LinuxISO.org for one source), and/or installed on as many computers as you wish. Thus, this measure is useful to show that Red Hat is widely used, but it's less useful in showing true market shares.

One good source for information on distributions is Distrowatch; for a quick introduction, see its page The Linux Distribution Game.

Community/News

Open source / free software is a community and culture, not just an idea. Thus, you can get news and cultural information from some of the following:

Miscellaneous Related Sites

You can view this page at http://www.dwheeler.com/oss_fs_refs.html. Feel free to see my home page at http://www.dwheeler.com. Note that this is a personal article and is not endorsed by my employer.