Oh dear. After the debacle with Microsoft Poland's apparently racist photoshopping, Microsoft China went and got the company into hot water for allegedly "stealing" code. Yes, you read that right: Microsoft and the wholesale "theft" of code from another website. Of course it's not "theft", it's copyright infringement, but tomayto/tomarto. Microsoft confessed, blaming a vendor they had worked with. No surprise really, but the damage to their name may already have been done. There's more to discuss here than Microsoft's already tarnished reputation though. The issue raises some important points in favour of free software and shows why more, if not all, code should benefit from free licensing.
Inspired web coding
Richard Stallman started the FSF and the GNU project while lamenting the loss of the culture of learning from shared code that he had experienced. Unlike desktop and server software development, the web has long been a hotbed of learning by observing. Since its earliest days, web programmers have looked at the HTML, CSS and client-side scripts behind various websites. Mostly this will have been done for inspiration, though I have little doubt that code has sometimes been lifted wholesale from other sites. Anyone who used the web in the mid to late 90s will remember the ubiquitous JavaScript rippling-water effect and the moment JavaScript-based hover images first appeared. Similarly, the propagation of CSS probably owes as much to the accessibility of the files as it does to the structure of the language. I would imagine that many a web developer has increased their knowledge "in the field" by simply browsing sites and viewing the source code. This is a well-known and oft-exploited side effect of mark-up and client-side scripting languages: the source is sent in plain text and can thus be read by anybody viewing the resulting webpages.
In this instance it appears Microsoft China viewed the JavaScript source code behind the Canadian Plurk micro-blogging service. Plurk (who claim to be "Asia's leading micro blogging service") give examples of their code alongside the code behind Microsoft's own micro-blogging service ("Juku"), and by my reading it does seem that Microsoft China used code that is remarkably similar, if not identical, to the Plurk code. But whether Microsoft have "stolen" this code or not is not what interested me about this story -- although I will admit there is a certain delicious irony about it. I was interested in what it has to say about code licensing on the web.
Most webpages produce HTML that is covered by standard copyright. No licence is given or indicated. Thus you, the user, are given no indication of how you may use the code, if at all, in which case you must assume that you cannot copy it. However, there's nothing to stop you looking at the code, working out how something was done and then reproducing a similar effect on your own site. As said above (and partly because of the nature of HTML and mark-up coding), this situation has traditionally led to unspoken code sharing between websites. That is straightforward for code within the page source itself, but scripts that are referenced from the main page are slightly different and require more effort to be "inspired" by. To view them you need to examine the HTML source and grab the URI of the script. Then you can fetch that file and examine it. As I said, I imagine there is many a web developer who has built up their skills by examining others' scripts. What usually does not happen, though, is somebody copying the look and feel of a site (phishers excepted), as nobody wants their site to be exactly like another one. In this case it would seem that Microsoft China left the look and feel the same, and it was probably this which brought the matter to Plurk's attention.
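The step described above, reading a page's HTML source to find the URIs of the scripts it references, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample page, the `example.org` addresses and the names `ScriptSrcCollector` and `find_script_urls` are all hypothetical, and only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.script_urls = []

    def handle_starttag(self, tag, attrs):
        # Inline scripts have no src attribute; they are already visible
        # in the page source, so only referenced scripts are recorded.
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative URIs against the address of the page itself.
            self.script_urls.append(urljoin(self.base_url, src))


def find_script_urls(html_text, base_url):
    """Return the absolute URIs of all scripts referenced by html_text."""
    collector = ScriptSrcCollector(base_url)
    collector.feed(html_text)
    return collector.script_urls


# Hypothetical page source, standing in for what "view source" would show.
SAMPLE_PAGE = """
<html><head>
<script src="/js/timeline.js"></script>
<script src="https://cdn.example.org/lib.js"></script>
<script>var inline = true;</script>
</head><body></body></html>
"""

if __name__ == "__main__":
    for url in find_script_urls(SAMPLE_PAGE, "https://example.org/index.html"):
        print(url)
```

Each URI printed can then be opened in a browser or downloaded like any other file. Nothing here is hidden or clever, which is rather the point: client-side code is always this easy to reach.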
The problem is that this culture has created a grey area in which the original developer has not included any terms of use or re-use in their scripts. Often you will see just the code and that's it, not even a copyright notice. Examining such code, you have no idea whether it is original work, inspired or just plain ripped from another site. The law would say that copyright alone (with or without a notice) means you can look but not touch, but there's a reason why nobody (that I am aware of) has been prosecuted for "stealing" HTML code. The practical question is this: if there is a good chance the code you are looking at is itself a combination of original work, mash-up, inspiration and even the odd ripping, does the owner of the site have the right to restrict you from doing the same? With a mark-up language there are also fewer ways to achieve the same result, so it's entirely likely that two developers can arrive at almost identical code independently of each other. The truth is that ambiguous licensing and authorship lead to confusion, replication and, in extreme cases, accusations of subverted code.
But all of these things only matter if web coders are precious about their code. If the Plurk code had clearly indicated it was covered by, say, the GPL or AGPL, then anybody looking at it would have been able to re-use the code as they saw fit, as long as they credited Plurk and kept it under the same licence. Plurk get the credit, web coders learn their trade and the users get better sites. Please note that I am talking about client-side scripts and mark-up code here, not server-based PHP scripts and the like. That said, I believe that it is the service, not the code, that users value more, and releasing the code under a free licence (such as the AGPL) will not change the popularity of the site based upon it.
Avoiding the mess in the first place
What I am trying, and probably failing, to say is that this debacle would have been avoided had the Plurk code been free and clearly indicated as such. If code is freely accessible, people will use it. Perhaps this is a bad analogy, but I recall Nike made a brand of training shoe that you could customise by having them put whatever text you wanted on it. Somebody asked them to put "slave-trade" on and they refused. If you open things up to the people, you will have a hard job controlling how they use them. In the case of a shoe you might not want to take that risk, but with web code there's less risk to you personally if somebody uses your code and/or techniques on their site. It's a different matter if they copy your logo, images and so on in an attempt to replicate your site (for whatever reason), but really, if your service isn't up to it, people will leave. Yes, the marketing budget of somebody like Microsoft gives them an unfair edge, but stopping them using your code won't stop them producing a competing site. When it comes to client-side web code, it will always be accessible and thus you will always have a hard job controlling how (or whether) people use it. Rather than get precious about the code, web developers should free it up.
The free software and Creative Commons movements have long shown that people are quite happy to attribute their sources, and really the days when the source code itself (as opposed to the effort and skill of writing it) was a source of income are numbered. I can think of no reason why a developer would be precious about their code other than that they think the source code itself is of value and that by allowing others to "abuse" it they are "losing out". That argument works if somebody is stealing your pencils, but if you are worried about people "stealing" your code, don't leave it out in the open. Again I will state that code cannot be stolen. It's a breach of copyright, not theft. A much better solution is to leave it in the open with a note saying "If you use this, please tell others where you got it and don't stop them from sharing it either".
StatusNet show how it should work
As a direct response to the Microsoft story breaking in the news channels, StatusNet (formerly Laconi.ca), the people behind the popular Identi.ca free software micro-blogging software and service (which I have mentioned before), posted a blog entry inviting Microsoft to take their code. In that post StatusNet founder Evan Prodromou said:
"We wanted those developers, as well as their clients at Microsoft, to know that we're more than happy to let them copy our code. StatusNet is available for download to everyone, with all source (server and client side) available for free.
You can get your Juku service running again quickly using StatusNet. (We even have a Chinese translation!) Please feel free to contact us; we'll be happy to help you out."
I'm not defending Microsoft. There's hypocrisy in ripping off code when you bleat on about piracy so much -- particularly in China. Even if they do choose to blame someone else, the buck must stop with them. But the problem will not go away because one company -- even the biggest one -- is shamed into admitting they breached copyright. This particular problem will only go away when the shared learning and advancement on the web are taken to their natural conclusion.
Free the code and the rest will follow.