The Web is Yourspace

The web is Yourspace: a commentary on how the web was created, and how open standards were instrumental in its success.

Common Communication: The Web By Design

Tim Berners-Lee created the web as a way to make communication around CERN easier, though it soon became apparent that it could be a lot more when coupled with two other technologies that grew out of research funded by the US DoD:

“I just had to take the hypertext idea and connect it to the TCP and DNS ideas and — ta-da! — the World Wide Web.”

—Tim Berners-Lee

There are some acronyms in there that need explaining. First, a protocol is like a language designed for computers to “talk” to each other. TCP (Transmission Control Protocol) is special because it lets computers make sure that they are getting the data they requested, or that the data they are sending is being received. Imagine signing for a mail-order package: the “TCP packets” are the letters, and the signature is the acknowledgement the protocol asks for. A packet that is never signed for can be re-sent, and packets can even be separated and sent around other networks as needed.
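The signing-for-a-letter idea can be sketched in a few lines of Python. This is only a toy: the function name send_reliably and the drops set are invented for illustration, and real TCP is far more involved (it numbers bytes rather than letters and keeps several packets in flight at once). It simply re-sends each letter until somebody signs for it.

```python
def send_reliably(packets, drops, max_tries=5):
    """Toy stop-and-wait delivery: re-send each packet until it is signed for.

    `drops` is a set of (packet_number, attempt) pairs that our pretend
    network loses. It is an assumption made up for this sketch so that
    the outcome is predictable.
    """
    delivered = []
    log = []
    for number, payload in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            if (number, attempt) in drops:
                # No signature came back, so the sender tries again.
                log.append(f"packet {number} attempt {attempt}: lost, no signature")
                continue
            delivered.append(payload)
            log.append(f"packet {number} attempt {attempt}: signed for")
            break
        else:
            # Every attempt was lost: give up instead of waiting forever.
            raise ConnectionError(f"packet {number} was never acknowledged")
    return delivered, log


if __name__ == "__main__":
    letters, history = send_reliably(["Dear", "Mr", "Jones"], drops={(1, 1)})
    print(letters)
    for line in history:
        print(line)
```

Here the second letter goes missing on its first trip, is sent again, and everything still arrives complete and in order.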

This was quite a big idea in communications, as it meant that even if a particular network went down, the packets could be “switched” along a different path. It’s a misconception that these ideas were developed solely to ensure the US military could communicate reliably in the event of a nuclear war, but they were certainly used extensively in the making of the DoD’s ARPANET, the “mother” of today’s Internet. This combination of “packet switching” to bypass network problems and “signing off” of data packets made for robust communications.

DNS, or the “Domain Name System”, was a solution to the problem of finding a computer on a network in the first place. More specifically, it allows a human to find a computer on a network; computers find each other quite easily by IP address, which in our letters analogy is like addressing a letter to a number. Humans aren’t very good at carrying numbers around in their heads, so the Domain Name System was created to match up IP addresses to text addresses: instead of sending our “letter” to 192.0.2.10 you can just type in http://jones.example.com.
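The name-to-number matching can be pictured as a lookup in an address book. The Python sketch below is a stand-in rather than real DNS: the TOY_ZONE table, the resolve function and the address 192.0.2.10 (taken from a range set aside for documentation) are all assumptions made up for illustration. On a real machine the operating system does the asking, for instance through Python’s socket.gethostbyname.

```python
import ipaddress

# A pretend address book. Real DNS spreads this table across many
# servers around the world; this entry is hypothetical.
TOY_ZONE = {
    "jones.example.com": "192.0.2.10",
}


def resolve(name, zone=TOY_ZONE):
    """Return the IP address filed under a human-friendly name."""
    # Domain names ignore upper/lower case and may end in a dot.
    key = name.lower().rstrip(".")
    address = zone.get(key)
    if address is None:
        raise LookupError(f"no address known for {name!r}")
    # ip_address() also checks that the stored number is well formed.
    return ipaddress.ip_address(address)


if __name__ == "__main__":
    print(resolve("jones.example.com"))
```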

Conveniently, that brings us to that complicated-looking word “hypertext”. The “http” you type is another protocol, called Hypertext Transfer Protocol. Let’s examine the name: it’s a protocol (which, as we know, is like a “language” for computers to communicate with) that transfers “hypertext”. The protocol is actually similar to our own language, using verbs such as “GET”, “POST” and “DELETE”, among others. Going back to the analogy, you could say it’s a bit like a postman, but a postman you can control. If you type in http://jones.example.com, your browser uses the “GET” part of the protocol, because you want to “GET” that hypertext. If you fill in a form and click “Submit”, the browser issues a “POST” to post your data. Your browser does all this for you, but you could actually issue these commands yourself.
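To give a feel for what issuing these commands yourself might look like, here is a minimal Python sketch that writes out the text a browser sends. The helper name build_request and the form field are made up for illustration, and a real browser adds many more header lines than this.

```python
from urllib.parse import urlencode, urlsplit


def build_request(method, url, form=None):
    """Write out a bare-bones HTTP/1.1 request as plain text."""
    parts = urlsplit(url)
    path = parts.path or "/"          # no path typed means "the front page"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"{method} {path} HTTP/1.1",  # the verb, what we want, the protocol
        f"Host: {parts.netloc}",      # which site we are talking to
    ]
    body = ""
    if form is not None:
        # Form fields travel in the body of the request, after a blank line.
        body = urlencode(form)
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body)}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(build_request("GET", "http://jones.example.com"))
    print(build_request("POST", "http://jones.example.com/form", {"name": "Jones"}))
```

Sent over a connection to the right computer, text like this is all it takes to ask for a page or to post a form.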

But what about hypertext? This too is a simple idea, but an idea that had been brewing as far back as 1945. To understand hypertext, first think of footnotes in books. At a certain part of the text you would get a little number, indicating that if you looked at the footnote then you could find something else that was related to that word or phrase. Hypertext encompasses this idea by making the word or phrase itself a gateway to further information. A sterling example of hypertext is Wikipedia: you could spend days following hyperlinks in the hypertext to ever more information.

Around 1990, when Berners-Lee was doing his stuff, I should imagine there was lots of talk about the uses of this new and exciting technology. Unencumbered by things like expensive long-distance phone calls, or slow and often unreliable postal mail, people would be able to communicate like never before. Berners-Lee kept this in mind when designing HTML, the markup language that gives structure to hypertext. It’s a simple, user-friendly system of “tags”, where each piece of content is enclosed by an opening and a closing tag that say what it is. For example, a paragraph would be indicated like so: <p>paragraph text</p>. The web was designed to allow people with no special computing knowledge to add things to it.
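Because the tags are so regular, a program can pick them apart with very little effort. The sketch below uses the HTML parser that ships with Python; the class name TagCollector and the function tagged_text are invented for this example, and it only copes with simple, tidy markup.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Note which tag encloses each piece of text."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # tags opened but not yet closed
        self.found = []       # (tag, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost tag if it matches, as tidy markup would.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.found.append((self.open_tags[-1], text))


def tagged_text(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


if __name__ == "__main__":
    print(tagged_text("<h1>Yourspace</h1><p>paragraph text</p>"))
```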

An Important Decision

About as important as actually inventing the web was the decision to make it open. This means that you don’t have to pay royalties or own a commercial piece of software to access it. A lot of people have made an incredible amount of money off the web, but its creator isn’t one of them. To further protect your freedom, Berners-Lee founded an organization called the World Wide Web Consortium (W3C). This organization creates and publishes open standards for creating and accessing the web.

Open standards are important to combat short-sighted business practices. It goes without saying that software companies want you to buy and use their software, and one way of making you do this is to create a special language that only their products understand. This is called a proprietary format. Once enough people use the software, that proprietary format is called the “industry standard”, and people take it for granted that you have to buy the software, otherwise you won’t be able to digitally interact with all the other people who own it. Already your freedom has been severely curtailed: if you want to see that file then you have to pay out for the software. Some companies provide you with a free piece of software to view files, but what if you wanted to edit them? Or incorporate them into a larger project? You don’t have this freedom unless you’re prepared to pay out.

It is your freedom to pay for software, but it is also your freedom to decide how much you want to spend. However, this freedom is again curtailed when buying the vast majority of software that uses proprietary formats. The reason lies in competition: because one company holds all the cards, there is none. A little upstart company can build a better product with more features, but nobody would buy it, because it would be unable, or only barely able, to read and use the larger company’s proprietary format. Rather than information being power, it seems the flow of information is power.

Without open standards on the web you’d be “locked into” the information that you’re allowed to see. Instead of HTML there could have been a proprietary format that could only be accessed by certain pieces of software. Those would be the new barriers to global communication: the barrier between those who can afford to “pay up” and those who can’t.

Thankfully this never happened. You don’t need any special authoring tools to publish content to the web. You also know that people won’t need any special software to view your content. However, remember that many other formats you see on the web and offline, the ones you need plugins or particular programs to view, are proprietary. These include Windows Media .wmv files, Apple QuickTime .mov files, RealNetworks .ram or .rm files, Adobe Flash content, copy-protected Apple iTunes purchases, Microsoft .doc, .xls and .ppt files, and many more.

You usually need to buy software to create these files, and then you are locked into them. For example, if you wanted to buy something other than an iPod, you would have to burn every single copy-protected iTunes track to a CD and then encode it in a different format. This is cumbersome, but remember that other such formats don’t even have workarounds like this. The only real answer is to use and promote open formats.