The Google Supercomputer

One node in the discussion of the internet as a new platform is the meme of Google specifically as that platform. The notion gained legs with the announcement of Orkut, the social software site affiliated with Google, and later touched a nerve with the announcement of Gmail, Google’s online email service. The discussion was kicked off in April 2004 by Rich Skrenta, CEO of Topix.net, in a well-cited post, The Secret Source of Google’s Power, whose comments section has now outgrown the original post

. . . expanded by Jason Kottke in his post, GooOS, the Google Operating System

. . . referenced by Jon Udell in his Strategic Developer column,

. . . and summed up by Tim O’Reilly:

Gmail is fascinating to me as a watershed event in the evolution of the internet. In a brilliant Copernican stroke, gmail turns everything on its head, rejecting the personal computer as the center of the computing universe, instead recognizing that applications revolve around the network as the planets revolve around the Sun. But Google and gmail go even further, showing that once internet apps truly get to scale, they’ll make the network itself disappear into the universal virtual computer, the internet as operating system.

Jon Udell extends his musings in a later column:

The gigabyte slice of the Google file system available to Gmail beta testers will, in many cases, surpass the testers’ own corporate disk quotas for email.

Put that way, one can begin to see a world where the Google index is the broader file system that points to “things out there,” and where our email, web pages, and social networks are all inputs into that file system. Jon goes on to describe a world where Google owns the operating system, and what a unified file system that continually indexes everything on your local PC could do:

Bayesian categorization: My SpamBayes-enhanced e-mail program learns continuously about what I do and don’t find interesting, and helps me organize messages accordingly. A systemwide agent that’s always building categorized views of all your content would be a great way to burn idle CPU cycles.

Context reassembly: When writing a report, you’re likely to refer to a spreadsheet, visit some Web pages, and engage in an IM chat. Using its indexed and searchable event stream, the system would restore this context when you later read or edited the document. Think browser history on steroids.

Screen pops: When you receive an e-mail, IM, or phone call, the history of your interaction with that person would pop up on your screen. The message itself could be used to automatically refine the query.

I guess I’m ok with this, so long as it’s not trying to sell me ads based on its findings!
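For the curious, the first of Jon's ideas, Bayesian categorization, can be sketched in a few lines of Python. This is only a toy: a plain naive Bayes classifier with add-one smoothing stands in for whatever SpamBayes really does under the hood, and the class name, the labels ("interesting" and "boring"), and the sample messages are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace splitting is good enough for a demo.
        return text.lower().split()

    def train(self, text, label):
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods per word.
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: the agent "learns" from what the user keeps or ignores.
agent = NaiveBayesCategorizer()
agent.train("quarterly budget spreadsheet attached for review", "interesting")
agent.train("meeting notes and project schedule", "interesting")
agent.train("win a free prize click now", "boring")
agent.train("cheap pills free offer click here", "boring")

print(agent.classify("free prize offer"))
print(agent.classify("project budget review"))
```

The "learns continuously" part is just calling `train` every time the user files or deletes something, which is why an always-on agent like this would be a plausible use for idle CPU cycles.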
