Extending the free software paradigm to DIY Biology

Some time ago I wrote an article about Jim Kent, an American biologist who used free and open software to race Craig Venter to the finishing line in sequencing the human genome. That was very big, cutting edge science with a global audience and reach. We live in an age when big science is done, overwhelmingly, in big businesses, universities, research labs and government laboratories. In Eric Raymond's paradigm it is the culture of the Cathedral.

Hierarchical, big, controlled and funded by taxpayers, venture capital or shareholders. The time of the amateur dilettante scientist seems to be over. It takes the huge, collective organisation of private individuals to challenge this monopoly. GNU/Linux has managed to make a significant challenge, but what of open science: not just the actual use of free software as practised by CERN, but utilizing the whole philosophy of organizing scientific endeavour on the principles of open source? Some amateur biohackers think they have the answer.

The era of the gentleman scientist is over, or is it?

The seventeenth century probably marked the beginning of the end for the gentleman scientist. After the end of the English Civil War and the Restoration, the Royal Society started the first really serious business of centrally organised scientific inquiry. Up to that point it was still possible for an educated Englishman like the poet John Milton to be scientifically literate, tour Europe, meet with Galileo and discuss astronomy. But it was the beginning of the end. Progressively, science became bigger and tied into the economic destiny of a nation. As it discovered more, new branches emerged and subjects split increasingly into specialisms--and sub-specialisms too. Today, there are more scientists alive than in the whole of human history. The corpus of human knowledge has never been greater or its accumulation faster. For the interested amateur lay person, keeping up is virtually impossible. As for actually doing any real science, that seems beyond the realms of fantasy. I don't think you're going to find many Large Hadron Colliders lurking in suburban garden sheds.

It is early days but perhaps it is only a matter of time before some biological Linus Torvalds sends an e-mail to say that he or she is working on a project but that it won't be big or professional--and the rest will be history. Many biohackers have clearly been watching the growth of free software communities facilitated by the internet. In an interview with Seed Magazine, Mac Cowell, a co-founder of DIYbio, saw the obvious parallel:

I was disappointed with the huge barrier of entry for average people, or for anyone who wants to get involved but is not already in a PhD program. The open-source computer-programming movement became ubiquitous, and computers became a platform that enabled a huge amateur or hobbyist culture of people to push the field further. Many people got organized and started working on projects collaboratively. So why can't we do that with biology? Why does all biology happen in academic or industrial labs? What's the barrier to entry for doing something interesting in biology? It's a four- to seven-year PhD program. There must be another opportunity.

So, what's wrong with traditional peer review?

He's right. Peer-reviewed science has become a synonym for getting your papers accepted and published by the ruling orthodoxies. Just try to get a scientific paper published challenging any aspect of the global warming industry and see how far you get. And publication is the key to academic tenure and career progression. With free software and the principles of open source, no such constraint exists. How many good ideas and projects in the sciences have been stillborn because they were either not fashionable or offended the vested orthodoxies of the day, and what progress has been stymied by a kind of scientific vendor lock-out?

Peer review is one of the core principles of all branches of science but it can be and has been abused, especially in the field of climate change where some of the abuses constitute an absolute scandal and corruption of science itself. Peer review can cause bottlenecks too. Valuable data which could be read, used and tested by others sits for a period of months or a year or more awaiting acceptance and publication in some learned journal. Prior to the ubiquity of the internet that bottleneck virtually went with the territory, but the web has been as effective as a plumber in unblocking it and opening up possibilities for amateur biohackers. We will never know how many potentially great ideas or concepts never saw the light of day because of office politics or vested financial interests. You need only think of Microsoft to know what I'm talking about.

The concept of the GPL trumps all aces. Be you ever so high the GPL is above you

You will argue that free software projects too are littered with the same human fallibilities. That's true. Even the great Debian project suffers, but the difference from proprietary projects is that if irreconcilable disagreements arise, projects can and do fork and people move on and make fresh starts. Knowledge is not suppressed or lost provided someone has the willpower and skill to carry it forward. The concept of the GPL trumps all aces. Be you ever so high, the GPL is above you.

They're in the garage inspection pit with a pipette and a gene sequencer

In the spirit and terminology of the free software community they march under the banner of OpenWetWare. Inevitably, these citizen scientists are sometimes called biohackers. I'll try to avoid using that term as it has negative and misleading connotations. These guys are the whitehats who want to democratise science (I don't mean dumbing down here; many, if not most, will almost certainly be products of third-level education), not hack into a biolab database and steal critical data. They want to create their own data and, just like Steve Jobs did, they're doing it in the garage. Literally.

One citizen biologist is trying to modify jellyfish genes and add them to yoghurt to detect the presence of melamine (which was implicated in contaminated baby milk in China). She intends, if successful, to release the design into the public domain. Another citizen scientist in the United States has a lab no bigger than one square metre in which she conducted synthetic biology experiments to engineer a microbe capable of performing simple logical operations, which presages the idea of biological computing. Clever, but she was beaten to the first prize by another enterprising bio-citizen who designed a bacterium to enable rice plants to process nitrogen more efficiently, in a competition run by the science fiction website io9. Where do these so-called garage geneticists get their technology and raw material? Many pick up basic equipment on eBay. As for materials, they have something akin to GNU/Linux software repositories on which to call. MIT has a registry of standard biological parts called BioBricks. The remit could almost have been written by a Stallman clone:

The Registry is based on the principle of "get some, give some". Registry users benefit from using the parts and information available from the Registry in designing their engineered biological systems. In exchange, the expectation is that Registry users will, in turn, contribute back information and data on existing parts and new parts that they make to grow and improve this community resource.

That's Freedoms number two and three in anyone's language. Not surprising really, for while BioBricks has no formal affiliation with the FSF, two of its members do. The FSF approves many licences, including the GPLv3, but as yet the BioBricks Foundation (BBF) has not opted for a species of viral licence, though it has taken Science Commons, an offshoot of the Creative Commons, as the basis for developing a legal framework for its activities. You'll see a link to it on the CC website, listed as one of its programmes. That said, the thinking behind all this is not quite as ideologically driven as it is for the FSF. The ideology is there but, as with free software, there is also a desire to accelerate change and invite non-corporate expertise across the distributive medium of the web, which is changing not just the world of software but the whole way we all do our business, research and leisure activities. You cannot say it often enough: the internet has changed the rules of the game. Forever. The genie's out of the bottle and no amount of commercial lock-in or state coercion can put it back in.

The BBF may not feel the need to slay any biological equivalents of Microsoft, but the effect of what they advocate and practise is to implicitly undermine the outdated methodology of Microsoft's software development model, based as it is on proprietary code and moving to a model of rented software and software as a service (SaaS). One of the best and most interesting takes I have read on these issues to date is a PhD thesis on open source biotechnology by Janet Hope, which is available as an online PDF. It is particularly good on discussing the biological equivalents of the GPL, copyleft and what constitutes source code and binaries (as it is perhaps a little disingenuous to call molecular biology software or DNA code, unless you agree with Richard Dawkins that human beings are wetware and evolution is the software that does the debugging). Highly recommended. Also highly recommended is an introduction to Science Commons by John Wilbanks and James Boyle, which is particularly good at eliciting the problems with "proprietary" science and how they can inhibit the rate of progress, discovery and application--in the same way that proprietary software militates against better coding and distributive free software models like GNU/Linux.

We will never know what might have been

It is impossible to estimate what has been missed because of rigid hierarchies and career trajectories. It was ever thus. In the early days of the Royal Institution, the personal dominance of Sir Humphry Davy temporarily eclipsed the brilliance of Michael Faraday, who was looked down upon as a mere apprentice bookbinder, yet whose work on electromagnetism is one of the great achievements of science. Davy is remembered for the miner's safety lamp. Great if you were a miner, but Faraday's work touched the whole of mankind and informed every aspect of our technical civilization.

There is a story, almost certainly an early urban myth, that the Prime Minister of the day visited Faraday in his laboratory and asked him what use his discovery was. Faraday replied that he was not entirely certain but that he was sure that the government would tax it. They did, eventually. The moral of that story is that you just never can tell where your research will lead in terms of improving life or creating jobs and generating revenue. Faraday lacked the professional mathematical skills to take his work any further, but the Scottish physicist James Clerk Maxwell took up the baton and his field equations predicted radio waves. The rest, as they say, is history, but how much more quickly would matters have progressed if hierarchies were less formal and less abused by settled personal and professional prejudices?

The price of everything, the value of nothing

The leaders of modern research bodies could learn a lot from that story and from the methodology and spirit that informs the free software community. Here in the United Kingdom the Medical Research Council and the Natural Environment Research Council, responsible for huge budgets, are headed up by business people, not scientists. Last year they released new guidelines for research grants which effectively turned them into corporate research departments issuing grants only where applicants demonstrate a viable business model and a subsequent timetabled revenue flow. Pure research was shown the door. Faraday would turn in his grave.

They seem oblivious to the fact that much of modern technology is based on earlier science motivated by people's insatiable curiosity. Thank God J. J. Thomson discovered the electron when he did (1897) and was not subjected to a board of accountants and business executives demanding he give a presentation on the profits to be made from his curiosity. What Thomson and others began led, ultimately, to radios, televisions, lasers and... computers. They exploit quantum tunneling, which is a direct result of quantum theory. And thank God no one demanded that Richard Stallman or Linus Torvalds submit viable business models (won't be big or professional, Mr. Torvalds? Couldn't possibly fund that). Stallman could only do what he did because he quit MIT to pursue his own ideas about software freedom--and look where we are now. It seems that those who have science budgets by the throat, like Microsoft, know the price of everything and the value of nothing. In their blinkered short-termism they miss out on an infinity of possibilities.

The internet has changed all that in ways we are still discovering and the free software community is its apogee. It was a stone dropped in a pond. The ripples have spread and, even allowing for hyped-up buzzwords and transient fashions, other disciplines have seen the potential for doing things differently and better. Even if the free software paradigm never completely dislodges entrenched proprietary business models, it has ignited an interest and enthusiasm for doing things the open source way. Business will probably never fully adopt that model. Its only raison d'être is profit. Nothing wrong with that as such, except when the desire for profit intentionally suppresses access to and free promotion of knowledge for others outside the commercial cathedral. The most extraordinary and extreme example of this I came across was a man who was sequencing his own daughter's DNA in a bid to locate the cause of her illness, a quest documented in Wired Magazine.

The semantic web may be the citizen scientist's best friend

We know that free software is a boon to relatively small, poor countries which simply cannot afford the Microsoft tax or are unable to resist its FUD or boot boy tactics. By the same token, it is a hardy perennial of books and documentaries that the selfsame countries struggle with drug costs and biotechnology. Combined with projects like Sugar and the OLPC, free and open biology (and other scientific disciplines) may just constitute a way to get some serious science done on the relative cheap without technical, political and financial interference from the usual vested interests. The semantic web, if realised, may have a big role to play here too.

Its origins give cause for hope: Tim Berners-Lee, head of W3C, mooted the idea (an extension of the world wide web, defining the semantics of information so that it can be understood by people and machines) in Scientific American in 2001. It has its critics and the idea is yet to come to fruition (if it ever does) but biologists have seen the potential. Nextbio is an interactive life sciences search engine which searches through and correlates experiments, literature and clinical data to help researchers make new discoveries. This is done through a conventional search engine interface. You just type in a term or terms as you would in a conventional search engine like Google. As the Nextbio website itself says, it is a platform for researchers to "search, discover and share knowledge locked within public and proprietary data".

Nextbio is ultimately a commercial undertaking but there is a free basic web application which is ideal for citizen scientists and does not appear to compromise any open source principles. It even allows registered users to personalise their Nextbio experience to share, communicate and collaborate with the Nextbio community. (You'll need to register to get beyond the initial search page.) It may be free but you seem to get a lot for your money. Hell, Nextbio basic is even available for the iPhone and there is a semantic plugin called Piggy Bank for Firefox. It would be interesting to know if one could use that plugin when on the Nextbio page to do a mashup. (For a guide to how to use it try this PDF.)

Bio terror or bio error?

Garage genetics sounds like a great headline, but there are very real concerns about health and safety here. Releasing a software virus into the wild is one thing but a real one would be an entirely different matter. Regulations vary from country to country but they are strict, and with good reason, as there are issues of public safety. Nevertheless, expect a mountain of Microsoft-style FUD from large commercial corporations protecting their profits under the banner of concern for public safety. Déjà vu, anyone? The promoters of citizen science are aware of this and are busy building a legal framework to cover the bases and avoid that midnight knock at the door by the health and safety police. The state will of course argue that there must be controls to ensure public safety and that it alone should be the ultimate arbiter of that custodial process.

Sounds reasonable, until you recall that these guardians of public safety are the same ones that invented biological and nuclear weapons (and invaded countries looking for them too). Very safe indeed. It doesn't take much to cause public hysteria and panic: witness everything from the European witch craze of the sixteenth century to SARS, H1N1 swine flu and CO2. It doesn't take much for the mob, armed with flaming torches and pitchforks, to lay siege to the castle walls (an early instance of democratic accountability?). Given the record of even democratic governments in protecting confidential personal files, and their manipulation of freedom of information protocols, I see little reason to repose any more faith in politicians than in the disparate ranks of so-called biohackers. It has always been a central part of proprietary FUD that free software is a security hazard. Number of Linux viruses in the wild? None. Number of Windows viruses? Too many to count.

Biological viruses are not the same as software viruses, though, and citizen scientists are well aware that safeguards and ethical codes will be necessary. They may not be "democratically accountable" like politicians or state officials, but the record of governments and state organs is a litany of evasion of that very thing, an evasiveness propped up by access to coercion, spin, propaganda and force. No citizen scientist has that sort of clout; few if any would want it.


Every human enterprise is fraught with risk and imperfection, including free software and open science. They are imperfect because human beings are imperfect, but the fact remains that politics and religion have barely altered the terms of the human condition. Neither the humanities, religion nor politics are answering the big questions. They are not even capable of asking the right questions. Strip out science and we're back to square one. A visitor from a distant planet would not be able to tell the difference between AD 2009 and 2009 BC in terms of our moral development. Only science seems to be capable of tracing a trajectory of real difference. It evolves, and new forms evolve all the time. If the free software, open source paradigm can liberate some science from the ivory towers then it might just prove that the laws of evolution are applicable to new, open methodologies. That's good enough for me.


Verbatim copying and distribution of this entire article are permitted worldwide, without royalty, in any medium, provided this notice is preserved.