Put a laptop side by side with a mainframe from 40 years ago and you will be amazed by the evolution. Not only are hardware resources many orders of magnitude larger; the software is also immensely more powerful and sophisticated. Operating systems appeared in the 1960s to manage mainframe computers. They handled tasks common to a variety of user programs: low-level device handling, scheduling the processor, allocating and naming storage, and enforcing security. An operating system's main role is to transform the raw machine into a set of high-level abstractions, which applications can use and share. For example, instead of worrying about which bytes to read from or write to a disk, applications access named files.
It may seem that we are close to having reached the last word in operating systems, and that studying them will soon be an exercise in history. But a revolution is ongoing whose importance and magnitude are enormous, and despite its scale it is mostly invisible. Hidden from our eyes, a set of enormous, planetary-scale operating systems is slowly evolving. We are still quite far from that goal; to make a historical analogy, my estimate is that, with respect to the planetary OS, we are still in the pre-Unix era (1971). But if you look carefully, you can see a few more pieces of the puzzle falling into place every year.
So, what is the computer that is to be managed by the planetary OS? In its current incarnation it is the datacenter, a collection of tens of thousands of processors and disks, but soon it will be a collection of datacenters linked together into a seamless whole. The goal of the planetary OS is to make millions of processors and disks behave as a single machine, as easy and natural to use as a desktop. Using a simple interface you will invoke the power of tens of thousands of processors spread all over the world. You will sift through petabytes of data with a simple mouse click. You will manage millions of data streams from a single console.
You can read about some of the pieces of this enormous puzzle. Here are some sample links: the Google File System, Amazon's S3, and VMWare VMFS offer a single filesystem namespace, spanning reliably and transparently up to tens of thousands of machines. Google's Map-Reduce and Microsoft's Dryad allow anyone to execute a program on thousands of machines, sifting through hundreds of terabytes of data in a matter of minutes. Amazon's EC2 and VMWare's DRS offer transparent compute resource virtualization (extending the operating-system concept of a process to the cluster level). Microsoft's Autopilot generalizes the BIOS and software updates to the cluster level, automatically deploying and monitoring software. Google's BigTable transforms a cluster into a huge database. User management is described in Google's paper, but it is a ubiquitous piece in many other on-line services, from eBay to Yahoo IM. Google's Chubby provides reliable cluster-level coordination and locking. Akamai's network handles vast volumes of traffic for millions of clients. And, of course, Web-based platforms are the new user-level APIs, which programmers can use to craft mash-ups (the extension of traditional applications).
And this list could continue.
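To give a flavor of why a piece like Map-Reduce matters, consider its programming model: the programmer supplies just two functions, a mapper and a reducer, and the framework handles distribution across thousands of machines. Here is a minimal single-process sketch of that model (the function names and driver are my own illustration, not the framework's actual API):

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Run a MapReduce computation sequentially, in one process.

    A real framework partitions the map and reduce work across
    thousands of machines, but the programming model it exposes
    is exactly this pair of functions.
    """
    # Map phase: each input record yields (key, value) pairs.
    intermediate = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            intermediate[key].append(value)
    # The shuffle is implicit in the grouping above; the reduce
    # phase then merges all values that share a key.
    return {key: reducer(key, values)
            for key, values in intermediate.items()}

# The classic word-count example: the mapper emits (word, 1) for
# every word; the reducer sums the counts for each word.
def word_mapper(line):
    for word in line.split():
        yield word, 1

def word_reducer(word, counts):
    return sum(counts)

counts = map_reduce(["the quick fox", "the lazy dog"],
                    word_mapper, word_reducer)
print(counts)  # {'the': 2, 'quick': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```

The point is the division of labor: everything hard about scale (partitioning the input, scheduling, moving intermediate data, surviving machine failures) lives in the framework, exactly the kind of shared service an operating system is supposed to provide.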
Currently most of these pieces are still disjoint, not yet tied together into a coherent whole, and most of them do not scale up to planetary size (they are confined to the cluster or datacenter level). But the trend is unmistakable: there is an ongoing race to build a unified, simple, single system that will manage, transparently and with little human effort (not even a desktop operating system runs entirely without supervision, so we cannot expect full autonomy at this scale), a gigantic computer spanning the whole globe.
It is certainly a very exciting time!