Total Recall: How the E-Memory Revolution Will Change Everything - C. Gordon Bell, Jim Gemmell (2009)
Chapter 2. MYLIFEBITS
My own quest for Total Recall began in 1998 while I was working as a researcher at Microsoft. I didn’t start out thinking of Total Recall. As usual, I was being pragmatic and looking for things to make my own life better. A colleague, Raj Reddy, asked me if he could digitize the books I had written and put them on the Web as part of his Million Books Project.
“Sure,” I told him, “Microsoft has a lot of lawyers. They should be able to get me out of any trouble that comes from copyrights.”
Seeing those books become digital felt good, and encouraged me to try some more scanning. I did the scanning myself just to see how to do it and whether it was interesting and useful. I scanned a pile of correspondence, patents, and around a hundred articles. I became even more upbeat, and set my sights on scanning all of my papers and notebooks. I saw it would declutter my office, and allow me to work from home or anywhere I happened to be. I could be an efficient, paperless teleworker.
Then I thought: Why stop there? With all those cheap terabytes of storage coming down the pipe, why not just keep everything? Not only books and papers and e-mails, but slide presentations, product brochures, health records, interviews, photos, songs, movies—all the information of my life.
It wasn’t as if no one had thought of this before. Bill Gates wrote in his 1995 book, The Road Ahead, “Someday we’ll be able to record everything we see and hear.” Clearly, that someday will come because we’ll be able to make practical use of all those e-memories. That someday is going to be in the middle of someone’s life. Why not mine? Why not now? But how? How could one person speed the arrival of the era of Total Recall? I became intrigued with the idea of keeping everything.
There is a strong social prejudice against this very simple idea. Keeping everything is like the eighth deadly sin. You’ll become a packrat, a hoarder, obsessed with your past. You shouldn’t look back. You need to clear out your attic and throw stuff away. And in a nondigital world, that kind of thinking made some sense. But in a digital world, with time and cost barriers melting away before our eyes, things have changed. Keeping everything doesn’t mean you have to spend all your time looking after masses of paper and stuff. Don’t throw it away, digitize it.
I especially wanted to rid myself of my filing cabinets and the countless banker boxes holding my old papers. Making them e-memories gave me the pleasure of getting rid of them—without really getting rid of them.
The idea is simple to state: “Record everything, keep everything.” But actually putting it into practice turned out to be a major project. Even though cheap terabytes were still some years off, I felt it was important to start immediately. When the day of the cheap terabyte arrived, I hoped to be able to give people insight into the logistics, costs, benefits, feasibility, and desirability of recording everything. And just what “everything” in your life might mean.
Building my own e-memory became a three-pronged effort. First, I had to make digital copies of everything from my past. Second, I had to start recording and storing everything I saw, heard, and did from that point forward. Then, third, I had to figure out how to organize the information in my digital corpus. This last prong was crucial. Just saving files willy-nilly into an e-memory is easy, just as throwing receipts into a drawer is easy; but come tax time, or if you ever need to find a specific subset of those receipts, you’ll rue your lack of filing discipline. So the big task would be to figure out what kind of software would be needed to make such a massive and miscellaneous collection of information useful.
By January 2001 my sixteen-gigabyte e-memory contained more than five thousand photographs and about one hundred thousand pages of paper: letters, memos, bills, receipts, financial statements, legal documents, ticket stubs, business cards, greeting cards, brochures, meeting agendas, symposium programs, diplomas, warranties, manuals, purchase orders, circuit diagrams, employee evaluations, annual reports, newspaper clippings, article printouts, stock certificates, report cards, childhood drawings, birth certificates.
I’d hung on to those hundreds of pounds of yellowing paper not because I wanted to help found a thriving community of silverfish in my home, but because I knew that someday, for some reason, I would certainly need to refind at least one old item. The vast majority of them would never see the light of day again, but I had no way to predict which one I’d need back. I couldn’t possibly know in advance which check I might need to end a payment dispute or whether I would face a tax audit for a certain year. So I’d felt trapped into keeping all of them. It took me more than a decade to throw away circuit-theory class notes from MIT, even though it was clear I wouldn’t be designing any of those kinds of circuits.
Scanning and digitizing that much paper turned out to be a very big job, so in April 1999, I hired Vicki Rozycki as my personal assistant. Over the next two years we would scan, a handful at a time, what we then thought was everything. Then, for several years after that, we continued unearthing more to scan. It took up a large amount of her time. The hard part was finding stuff and getting it ready for the scanner. (Nowadays there are commercial services that will do this sort of thing for you much faster and cheaper, using automated bulk scanners.)
I never knew quite how much I’d resented the need to stockpile so much paper until I saw it dwindle away like dirty old winter snow in the spring thaw. Folder by folder, box by box, week after week, it disappeared. The clutter and hassle of keeping paper files had been like the half-noticed droning of an electric motor that suddenly goes silent, leaving me in a startled state of peace. Not to make light of tragedy, but this passage from a recent novel, Extremely Loud and Incredibly Close by Jonathan Safran Foer, struck a chord with me:
[It] was the paper that kept the [World Trade Center] towers burning. All of those notepads, and Xeroxes, and printed e-mails, and photographs of kids, and books, and dollar bills in wallets, and documents in files . . . all of them were fuel. Maybe if we lived in a paperless society, which lots of scientists say we’ll probably live in one day soon, Dad would still be alive.
I also made digital records of all my physical memorabilia. The scanner worked fine for smallish, flattish items such as medals, coins, and pins, but for larger and more fully three-dimensional objects I had to use a digital camera. I took down all my paintings and made high-quality photographs of them. One of my favorites, titled Meeting on Gauguin’s Beach, began as a sketch rendered in 1988 by a computer program called Aaron. Then the artist who developed Aaron, Harold Cohen, hand-painted a version for me in vivid oil paints on a five-by-seven-foot canvas. Now I completed the cycle and sent it back to cyberspace.
I took pictures of all my eagles. I love eagles and have amassed a collection of eagle sculptures, picture books, postcards, knick knacks, and hand puppets. I took pictures of mugs. I have a collection of special coffee mugs. A few of them have eagles, but all of them have some connection to my past. I call some of them my “one-hundred-thousand-dollar mugs” because they are all I have to show for a one-hundred-thousand-dollar investment in a start-up company. I photographed all of those along with their companion T-shirts. The most rewarding part was putting them all in a box and delivering the whole collection to the Computer History Museum—they were someone else’s clutter now!
If any of these treasures are ever lost in an earthquake or a fire, I’ll have nice remembrances of them. And if my heirs don’t want to hang on to the cluttered remnants of my life after I’m gone, they’ll always have these images and my notes about them if they want to see what was important to me.
Of course, my collection of two hundred CDs also needed to be ripped onto my computer. I also had several drawers and shelves full of home movies, videotaped lectures, and voice recordings on audiocassette that were collecting dust and needed to be digitized. A service converted the 8mm movies to VHS tapes, which were later converted to CDs.
The second prong of getting my life digitized was to start scanning and recording everything I did from that point forward. In 2002 I decided to go paperless and to scan and not store or file any paper documents. I had already resolved in 1995 not to take any more paper newspapers. (Later, I was at a dinner with New York Times publisher Arthur Sulzberger Jr., and his description of their investments in new printing plants convinced me I was right in believing that paper newspapers were on the wane.)
For new paper it was easy. When billing statements or important notices came in, I scanned them in less time than it used to take me to physically file them away. And thankfully, the amount of paper I’ve needed to scan has shrunk each year since I began. If you piled up all the paper mail I used to receive annually, it would measure about thirteen feet high. Twelve of those feet were utter junk that could be safely thrown away—generic credit card offers, random bulk catalogs, “you may already be a winner” mailings, and the like. (Note: My goal is to record everything I actually read, not what others send me. It’s my choice, not theirs, that counts.) Only the remaining foot of paper was worth scanning, and even that has grown less as I’ve switched to digital bills and statements.
Nowadays it’s much easier to avoid paper. All the technical magazines and news sources I read are “born digital,” as are nearly all books. For legal reasons I keep a few items in paper, such as stock certificates, but not many. At that time, I started signing everything I could digitally, avoiding the creation, transmission, and especially storage of any paper.
As scanned documents and pictures piled up into my surrogate memory, I faced the challenge of figuring out how best to organize it. I began with what I had to work with: the folder hierarchy that every computer user is familiar with (every folder contains a list of files and subfolders, and every subfolder contains its own list of files and sub-subfolders, and so on). I filed my documents in folders according to a set of categories that made sense to me. From the design point of view this was not perfect, but I had to get started somehow.
In this earliest stab at organizing my scanned data, I split my e-memory into two top-level divisions: items related to current events in my life, and an archive of older, inactive information. Under those two main folders, I had dozens of subfolders for categories including books, medical records, the Computer History Museum (which I’d helped start), trips, underwater photos, food, and so on. Under “Animals” you could find a picture of alligators, various images of San Francisco’s wild parrots, and an astonishing set of images showing a snake swallowing a kangaroo whole. I had my “Eagles” folder, of all things eagle-related, and a “Fun” folder, which included a picture of the adult me swinging from a rope.
To help find things again easily, I gave each item a long, detailed file name. For example, the file name of a technical article would include the title, where it was published, the date, keywords, and other pertinent details.
But even with all my documents and pictures stashed away in a well-thought-out classification hierarchy of file folders, it was hard to find particular items quickly, if at all, because it required remembering where it was put. It was just like a library organized by subject without a card catalog. Poring through multiple folders for the right name or thumbnail icon took too much time. Without better labels, even my photos were not much use. When I looked at some, I couldn’t recall what they were about. It was painfully clear that the problem would get far, far worse once I started adding hundreds of daily pictures and hours of daily audio to the jumble.
My friend and boss, Jim Gray, teased me about it. When you burn data onto most compact discs, the operation is permanent, and this is known as “Write Once, Read Many” or WORM. Jim mocked me as the inventor of WORN: “Write Once, Read Never.”
“It’s all just a bunch of bits unless it’s annotated,” I grumbled.
I began to realize the magnitude of what was lacking. This was not a project to store my life bits; it was about how to get them back!
Scanned documents are image files, not text files, and as such, they’re invisible to keyword searches. But with thousands upon thousands of documents in my e-memory, keyword searching would be the only way to re-locate an old file that I could only recollect one or two fragments of, such as a name, a dollar amount, or a dateline. So I ran all the scanned documents through optical character recognition (OCR) software, which is able to recognize written letters and numbers in an image and reconstruct them in a text file. What I ended up with were thousands upon thousands of text files that were neatly interleaved among the scanned files.
Now I just needed desktop search software, that is, software that would allow me to search through my thousands of files for some desired text, just like you search for Web pages now using Yahoo or Google. But at this time operating systems were still several years away from offering desktop search. Desktop search was in its infancy, and every such product I tried was pretty “bleeding edge.”
I tried to get Microsoft to take the lead in desktop search, starting with the acquisition of a leading start-up, but was unable to convince the right people. I would have to wait for others to revolutionize search technology. In the meantime, if I wanted to continue my little lifelogging experiment, I would have to cobble together my own solution. In October 2001, Jim Gemmell and Roger Lueder, who had been working with me on other projects at Microsoft, decided that this would make a great research project for them to get involved in. We started out like we do with any new research project, by combing through the published literature to see what others had learned.
I dug up an old paper that I recalled as being relevant, and was surprised at just how relevant it was. In fact, it specified a system almost made to order for us. That’s pretty amazing, when you consider that it had been written more than fifty years earlier.
In 1945, when electronic computers were actually multistory buildings, the director of the federal Office of Scientific Research and Development, Dr. Vannevar Bush, published an essay in the Atlantic Monthly titled “As We May Think,” which outlined a radical new vision of how people might one day keep their own libraries of personal media. He proposed the memex:
A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.
It consists of a desk, and while it can presumably be operated from a distance, it is primarily the piece of furniture at which he works. On the top are slanting translucent screens, on which material can be projected for convenient reading. There is a keyboard, and sets of buttons and levers. Otherwise it looks like an ordinary desk.
Most of the memex contents are purchased on microfilm ready for insertion. Books of all sorts, pictures, current periodicals, newspapers, are thus obtained and dropped into place. Business correspondence takes the same path. And there is provision for direct entry. On the top of the memex is a transparent platen. On this are placed longhand notes, photographs, memoranda, all sorts of things. When one is in place, the depression of a lever causes it to be photographed. . . .
. . . As he ponders over his notes in the evening, he again talks his comments into the record. . . . He can add marginal notes and comments . . . and it could even be arranged so that he can do this by a stylus scheme. . . .
Another way to get material into the memex was with a wearable camera:
The camera hound of the future wears on his forehead a lump a little larger than a walnut. It takes pictures. . . . The lens is of universal focus. . . . There is a built-in photocell on the walnut . . . which automatically adjusts exposure for a wide range of illumination. . . . It produces its result in full color. It may well be stereoscopic, and record with two spaced glass eyes. . . .
The cord which trips its shutter may reach down a man’s sleeve within easy reach of his fingers. A quick squeeze, and the picture is taken. On a pair of ordinary glasses is a square of fine lines near the top of one lens, where it is out of the way of ordinary vision. When an object appears in that square, it is lined up for its picture. As the scientist of the future moves about the laboratory or the field, every time he looks at something worthy of the record, he trips the shutter and in it goes, without even an audible click. . . .
I love Bush’s description of the memex. The image he conjures is like something straight out of a Jules Verne novel. I envision a luxurious mahogany desk festooned with brass push-buttons, levers, and translucent screens. I can just hear the muffled clickity clicking of mechanical registers crunching numbers deep inside the casing. But even though most of Bush’s hardware suggestions are now obsolete, the antiquated trappings belie the sheer brilliance of his prescience. Bush’s desk with storage, screens, keyboard, stylus, and platen is the equivalent of today’s desktop PC with a microphone, multiple monitors, and a scanner. Add in a tablet PC and you gain pen-based input. And sub-walnut-size cameras are now affordable and plentiful. Just about all new cell phones and laptops come with one built in, and they can also be bought and worn on their own.
Bush was writing with scientists in mind. “There is a growing mountain of research,” he lamented. “We are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember.”
But he also realized that quantity was not the core problem. “The difficulty,” he wrote, “seems to be not so much that we publish unduly . . . but rather that publication has been extended far beyond our present ability to make real use of the record . . . [It] must be continuously extended, it must be stored, and above all it must be consulted.”
Bush wanted to free his fellow scientists from the drudgery of searching and cross-referencing their books, journals, and notes so that they could focus more on the creative side of their work.
“Creative thought and essentially repetitive thought are very different things,” said Bush. “For the latter there are, and may be, powerful mechanical aids.”
Bush did not want any scientist to worry about running out of space in his or her storage unit, which would imply having to discard items that might later prove useful. Memex was to have infinite storage. “If the user inserted five thousand pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely,” he wrote.
Memex would allow the scientist to annotate any item in the collection by speech or writing. Bush also wanted to support the way our minds work in associating one idea with another. He contrasted existing data storage with biological memory:
When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and reenter on a new path.
The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.
Bush hoped that “[selection] by association, rather than indexing, may yet be mechanized.” To this end, he proposed that “trails” be created, connecting one document with the next in a sequence that could be followed again later. Trails could be given names, were something that you could share with your friends, and all the links were two-way. (The familiar hyperlinks on the World Wide Web are only partially realized trails. They are one-way, and are not grouped or named.)
Bush’s memex was inspirational. The time was ripe to realize his dream—and to extend it far beyond the realm of scientific research into the lives of everyone.
A REAL MEMEX
We named our research project MyLifeBits, and adopted memex as its minimal requirement. Our goals were twofold:
1. To create software for lifelogging, and the subsequent recall and usage of one’s e-memories. We wanted software to record a diverse array of information about one’s life and activities, from a variety of sources and devices, and to do so as easily, as unobtrusively, and as automatically as possible. The software would have to give people powerful tools for searching, organizing, annotating, and pattern-mining their ultimately huge e-memories.
2. To identify the benefits, drawbacks, technical issues, sticking points, and usability of Total Recall in real life. We wanted to try it out (as much as we could) and see what it was like.
Since 2001 I have been serving as the primary test subject, but Jim Gemmell is also an avowed user, while Roger and Vicki have tried out numerous aspects of it in real life. A number of universities also have used our software and have experimented with it.
MyLifeBits is not a commercial product; it is a research project. In fact, MyLifeBits software is not a single application. It is a prototypical suite of applications, and a storage system that blends files with a database. You won’t see Microsoft eventually ship MyLifeBits version 1.0. Instead, you will gradually see more and more of the kinds of things done in MyLifeBits also being done in operating systems and in applications.
Our aim was to preview and to help lay the groundwork for the Total Recall systems that are coming soon—very soon—and that by 2020 will be as commonplace as Web browsers and cell phones are today. A few early steps in the Total Recall revolution have already hit the market. These include Evernote, reQall, OneNote, Google’s Web history, and support for desktop search in operating systems. But as this book is being written these products remain small discrete solutions within a much larger puzzle.
HOW TO ORGANIZE AN E-MEMORY
Back in 2001 I could see we still had a lot of basic things to figure out about how to store and organize my data. We had just my sixteen gigabytes of documents and photographs loaded in my imperfect classification hierarchy of folders, and we had no good way to search them, sort them, annotate them, link them together into Bush’s “trails,” or analyze them for patterns and trends.
The files-and-folders method of organizing data is a fundamental feature of all modern operating systems such as Windows, Macintosh, and Linux. File-and-folder hierarchies, even when stored digitally, suffer from the same basic limitation as libraries once did: Each book can exist only in one place, filed under one category. But an item might properly belong to several categories, or hundreds. A Brief History of Time is a physics book, but it’s also a book by Stephen Hawking, it was a best-seller, it talked about black holes, and it was published in 1988. You could easily come up with dozens of other attributes that would be perfectly legitimate criteria for tracking down A Brief History of Time and for sorting and grouping it with other books (and for that matter, for sorting and grouping it with other media of any kind: with lecture recordings, with songs, with articles, with pictures, with old news footage).
The MyLifeBits project ran into the problem like this: Logically, my Eagles folder should have been stored in both my Fun folder and my Animals folder, but in practice I had to make an arbitrary choice. And often, if I wanted to find some half-remembered piece of data again, I’d have to hunt for it in various folders, asking myself: If I were me, where would I file it right now?
Librarians have been familiar with this restriction for centuries. A copy of a book can only be on one shelf in just one section, often determined by the Dewey decimal system of book topics. So they created paper card catalogs, where a card was a surrogate for a book. Now the book—or at least its surrogate card—could be filed in more than one place. Dewey might have it placed in the physics section, but it could also be in the title card catalog, filed alphabetically by its title, as well as being filed alphabetically by author in the author card catalog. For your convenience, the Dewey subject index would be duplicated in the card catalog also, allowing you to flip through cards with your fingers rather than hiking down the aisles.
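The card catalog idea translates directly into code. What follows is only a minimal sketch, not anything taken from MyLifeBits: each record is stored once, and any number of "catalogs" point back to it by id. The function name `build_catalog` and the second book record are invented for illustration; only the facts about A Brief History of Time come from the text.

```python
from collections import defaultdict

# Hypothetical catalog records. The second record is an invented placeholder.
books = [
    {"id": 1, "title": "A Brief History of Time", "author": "Stephen Hawking",
     "subject": "physics", "year": 1988},
    {"id": 2, "title": "Example Field Guide", "author": "A. N. Author",
     "subject": "birds", "year": 1990},
]

def build_catalog(records, attribute):
    """Return one 'card catalog': attribute value -> list of record ids."""
    catalog = defaultdict(list)
    for record in records:
        catalog[record[attribute]].append(record["id"])
    return dict(catalog)

# Each book is stored once, but we can build as many catalogs as we like.
catalogs = {attr: build_catalog(books, attr)
            for attr in ("title", "author", "subject", "year")}

print(catalogs["author"]["Stephen Hawking"])  # ids of books by this author
print(catalogs["year"][1988])                 # ids of books published in 1988
```

The cards are surrogates: adding a new way to look things up means building another dictionary, not moving or copying the books.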
So here I was, with a system that was worse than a library with paper card catalogs. I was like a librarian who was not allowed to have a card catalog. Jim Gray, who is widely celebrated as a pioneer, even a founding father, of database design, shook his head over me.
“You need to use a database, Gordon. When are you going to listen to me?” he would ask.
I was resistant. “We don’t need no stinkin’ databases,” I’d reply.
My resistance stemmed from my first experiences with databases back in the 1970s. Back then databases were still new and much hyped—they were also space-hogging and hard to use. I knew they’d been improved in the time since, but I’d heard enough horror stories over the years to keep my prejudices well nourished. Also, I wasn’t clear on exactly what I wanted out of even a well-behaved one. But, it turned out Jim Gray was right—as usual.
A database is a program for storing and retrieving large collections of interrelated information. Modern databases let you very quickly retrieve all the records with a given attribute. You can rapidly sort, sift, and combine information in just about any way you can imagine. There was once a slight technical distinction to be made between how a database could index and look up records and full-text retrieval of documents, but by now databases have subsumed full-text search; they are happy to store documents and perform Google-like retrieval.
In his memex paper, Bush had expressed hope that the search algorithms of the future would be better than simple index-lookup on some attribute like author or date. He held up the human brain’s associative memory as the ideal. In an associative network, items are linked together by contingency in time and space, by similarity, by context, and by usefulness. There are often numerous paths to each item.
Bush was right that trails and associative linking were critical components of an effective e-memory machine. But his dismissal of indexing was one of his rare failures of imagination. In his day, indexing meant alphabetical lookup in a predefined, noncomprehensive list of topics or keywords, as in a library card catalog. With the speed of modern computers, it has become possible to index every single word and phrase in every document and to search all of them in an instant. When indices are so comprehensive, and lookup by the index instantaneous, then indexing is actually the mechanism by which associative memory becomes possible.
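To make comprehensive indexing concrete, here is a small sketch of an inverted index, the general technique behind full-text search. It is an illustration under simplifying assumptions, not the MyLifeBits implementation: it indexes single words only (no phrases), and the file names and document texts are invented stand-ins for OCR output.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical OCR output, keyed by file name.
documents = {
    "receipt_001.txt": "Invoice for scanner repair, total 120 dollars",
    "letter_017.txt": "Dear Gordon, the scanner arrived on Tuesday",
    "memo_042.txt": "Meeting agenda for the museum board",
}

index = build_index(documents)
print(sorted(search(index, "scanner")))
print(sorted(search(index, "scanner invoice")))
```

Because every word is a key, a single remembered fragment such as a name or an amount is enough to pull up the candidates, which is what lets an index behave like association.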
The MyLifeBits research project revealed that any system that aspires to be sold as an e-memory machine in the age of Total Recall will have to use a database storage engine, including full-text indexing. Only a database will allow you to create two-way links between items (including annotations) and to regroup and recategorize items and collections of items in an open-ended fashion. Only full-text indexing will give you keyword access to all of your e-memory.
With MyLifeBits we could find all items that share a certain property, such as having the same creation date, or having been edited during a particular meeting, or having been viewed within a certain span of hours after a particular phone call.
To make a database-style system work, we needed to include what is called metadata, or “data about data.” Metadata is essentially digital annotation about a file or other software object. Metadata may be embedded inside a file, or it may be “attached” to it from the outside. Conceptually, it’s a bit like a sticky label on a manila folder that characterizes its contents.
Your computer’s operating system keeps a little metadata on each file for bookkeeping, such as its creation date, the date last modified, the size of the file, and who is allowed to access it. Certain file types support additional metadata. For instance, the Microsoft Word document I am typing in right now lets me enter author, title, subject, keywords, category, status, and comments. Pictures in JPEG format can record things like the date taken, location, camera model, focal length, f-stop, and exposure time. Nearly all music formats include artist, album, composer, genre, and length.
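For readers who want to see this bookkeeping metadata, the sketch below reads it with Python's standard `os.stat` call. The helper name `bookkeeping_metadata` and the throwaway file are illustrative only. Creation date is left out on purpose, because how it is reported differs between operating systems.

```python
import os
import tempfile
from datetime import datetime

def bookkeeping_metadata(path):
    """Collect basic metadata the operating system keeps for a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
        "permission_bits": oct(info.st_mode & 0o777),
    }

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello, e-memory")
    sample_path = handle.name

metadata = bookkeeping_metadata(sample_path)
print(metadata)
os.remove(sample_path)  # clean up the throwaway file
```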
Some of this metadata gets filled in automatically. Digital cameras fill in the JPEG fields when they take a picture. CD-ripping programs look up the album information on the Web and populate the metadata for each song. In contrast, the metadata for Microsoft Office documents must be manually entered. Your name, which you were prompted for when you installed Office, is prepopulated as the author in new documents, but everything else is blank, and tends to stay that way. Many of these manual-entry attributes will remain blank until we have systems more like MyLifeBits so that there is an actual payback for doing the work of entry.
One kind of metadata attribute that is getting a lot of attention these days is called a tag. A tag is simply a single word or short phrase. Tags aren’t much different than the keywords attribute I have for this Word document. But they are creating a stir because there are some great photo applications, such as Flickr, that make tags easy to create and very useful for finding things again. You can add any number of tags to a file. For example, I have a photo of myself at age ten, with my four-year-old sister Sharon and my pony Snippy, who liked to bite. I kept that photo in a scrapbook for most of my life. It existed on one page only, and I was the only person who could find it quickly. But by putting the image into a database, I can tag it in any way I might find useful: Gordon, Sharon, Snippy, 1944, black-and-white, Missouri. Thereafter, I can use the tags for searching and sorting.
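Tagging in a database can be sketched with two small tables, one for items and one for item-tag pairs. This is an assumed schema for illustration, using Python's built-in `sqlite3` module; the table names, the helper functions, and the second photo are hypothetical, and only the pony photo's tags come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tags (
        item_id INTEGER NOT NULL REFERENCES items(id),
        tag     TEXT NOT NULL,
        PRIMARY KEY (item_id, tag)
    );
""")

def add_item(name, tags):
    """Store an item and attach any number of tags to it."""
    item_id = conn.execute("INSERT INTO items (name) VALUES (?)", (name,)).lastrowid
    conn.executemany("INSERT INTO tags (item_id, tag) VALUES (?, ?)",
                     [(item_id, tag.lower()) for tag in set(tags)])
    return item_id

def find_by_tags(*tags):
    """Return names of items that carry every one of the given tags."""
    wanted = sorted({tag.lower() for tag in tags})
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    rows = conn.execute(
        f"""SELECT items.name FROM items
            JOIN tags ON tags.item_id = items.id
            WHERE tags.tag IN ({placeholders})
            GROUP BY items.id
            HAVING COUNT(DISTINCT tags.tag) = ?
            ORDER BY items.name""",
        wanted + [len(wanted)],
    ).fetchall()
    return [name for (name,) in rows]

# Tags for the pony photo come from the text; the second item is invented.
add_item("pony_photo.jpg",
         ["Gordon", "Sharon", "Snippy", "1944", "black-and-white", "Missouri"])
add_item("class_picture.jpg", ["Gordon", "1944", "school"])

print(find_by_tags("Gordon", "1944"))
print(find_by_tags("Snippy"))
```

Unlike a scrapbook page, the photo now turns up under every tag it carries, and combining tags narrows the search.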
We also needed to be able to forge links—Bush’s “trails”—between any items or collections of items in my database. For instance, I wanted to be able to link some photographs to an entry in my calendar, to indicate that they were photos of that event. Or, if I record some audio of me talking about a photo, I want to be able to link the audio to the photo, so it is clear that it is a comment about the photo.
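One way to picture such links is as a store that records each link in both directions, so it can be followed from either end. The class below is a hypothetical sketch, not the MyLifeBits design; the item names and relation labels (`taken_at`, `comment_on`) are invented.

```python
from collections import defaultdict

class LinkStore:
    """Keeps typed links between item ids so they can be followed both ways."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # source -> {(relation, target)}
        self._incoming = defaultdict(set)  # target -> {(relation, source)}

    def link(self, source, relation, target):
        """Record the link once in each direction."""
        self._outgoing[source].add((relation, target))
        self._incoming[target].add((relation, source))

    def links_from(self, item):
        return sorted(self._outgoing.get(item, set()))

    def links_to(self, item):
        return sorted(self._incoming.get(item, set()))

store = LinkStore()
store.link("photo_12.jpg", "taken_at", "calendar:birthday_party")
store.link("photo_13.jpg", "taken_at", "calendar:birthday_party")
store.link("audio_07.wav", "comment_on", "photo_12.jpg")

print(store.links_to("calendar:birthday_party"))  # photos of the event
print(store.links_to("photo_12.jpg"))             # spoken comment about the photo
print(store.links_from("photo_12.jpg"))           # the event the photo belongs to
```

Starting from the calendar entry you reach the photos, and starting from a photo you reach both its event and the audio comment, which is the two-way property that one-way Web hyperlinks lack.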
So Jim Gray and some other colleagues convinced us to take the plunge, and we created a database to hold all our files and other information. It was great. We could still view my data using my original folder-based organization scheme if we wanted, but that became just one of myriad ways we could view it. We could group items into what we called “collections,” and each item could belong to more than one collection.
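One way to picture collections and links is as tables in a relational database. The schema below is an assumption made for illustration using SQLite, not the actual MyLifeBits design; the table names, item titles, and relation labels are all invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, kind TEXT, title TEXT);
    CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Many-to-many: one item may sit in any number of collections.
    CREATE TABLE membership (
        collection_id INTEGER REFERENCES collection(id),
        item_id INTEGER REFERENCES item(id),
        PRIMARY KEY (collection_id, item_id)
    );
    -- A link ties any two items together, such as a photo to a calendar entry.
    CREATE TABLE link (
        from_id INTEGER REFERENCES item(id),
        to_id INTEGER REFERENCES item(id),
        relation TEXT
    );
""")

# Made-up sample data.
db.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    (1, "photo", "Beach at sunset"),
    (2, "event", "Family vacation"),
    (3, "audio", "Spoken comment about the beach photo"),
])
db.executemany("INSERT INTO collection VALUES (?, ?)", [(1, "Vacations"), (2, "Favorites")])
db.executemany("INSERT INTO membership VALUES (?, ?)", [(1, 1), (2, 1), (1, 2)])
db.executemany("INSERT INTO link VALUES (?, ?, ?)", [(1, 2, "photo of"), (3, 1, "comment about")])


def collections_of(item_id):
    """Names of every collection the item belongs to."""
    rows = db.execute(
        "SELECT c.name FROM collection c JOIN membership m ON m.collection_id = c.id "
        "WHERE m.item_id = ? ORDER BY c.name", (item_id,))
    return [name for (name,) in rows]


def linked_to(item_id):
    """Items that point at the given item, with the relation that ties them."""
    rows = db.execute(
        "SELECT i.title, l.relation FROM link l JOIN item i ON i.id = l.from_id "
        "WHERE l.to_id = ? ORDER BY i.id", (item_id,))
    return rows.fetchall()


print(collections_of(1))  # ['Favorites', 'Vacations']
print(linked_to(1))       # [('Spoken comment about the beach photo', 'comment about')]
```

The membership table is what lets a single photo appear in both collections without being copied.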
We worked hard on annotations and metadata, making them automatic whenever possible, and otherwise trying to make it quick and easy at any moment to add information. For instance, I could select a bunch of items and then type a comment about them at any time. If I didn’t feel like typing, I could click a button and just say the comment. Silence would automatically be stripped out of the beginning and end of the comment so I could be relaxed about hitting the start and stop buttons. I could also add ratings to any item. I could comment on and rate Web pages from inside the browser. I could rate and comment on anything that came up on my screensaver.
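Stripping silence from the ends of a spoken comment can be approximated with a simple loudness threshold. This is a sketch under assumptions: the audio is taken to be a list of samples between -1.0 and 1.0, and the threshold of 0.02 is an arbitrary illustrative value, not one taken from MyLifeBits.

```python
def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below the threshold.

    Quiet stretches in the middle of the recording are kept untouched.
    """
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


recording = [0.0, 0.01, 0.5, 0.0, -0.4, 0.01, 0.0]
print(trim_silence(recording))  # [0.5, 0.0, -0.4]
```

A real recorder would more likely measure loudness over short windows rather than single samples, so that one stray click does not count as speech.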
Here’s an example of what we can do with MyLifeBits, courtesy of a database design with good metadata and complete indexing. Say I’m trying to remember the name of a biotech entrepreneur I read about several years ago. I can’t remember his name or his company or anything else specific enough for a standard search. What I do remember is that I read about him on the Web, the article involved biotech, it was between two and four years ago, I was at the office, and it was during a fairly long phone call with Jim Gemmell—say, ten minutes or longer. Those are pretty vague parameters, but they’re enough for MyLifeBits to winnow the selection down to just a handful of archived Web pages. I quickly find the name I need.
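The search just described amounts to a join between two logs: Web pages visited and phone calls made. The sketch below is only an illustration using SQLite. The table layout, the page titles, the dates, and the fixed value of `NOW` are all invented, and a plain substring match stands in for real full-text indexing.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

NOW = datetime(2009, 1, 15, tzinfo=timezone.utc)  # fixed so the example is repeatable


def ts(year, month, day, hour=0, minute=0):
    """Seconds since the epoch for a UTC date and time."""
    return int(datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp())


db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE web_visit (title TEXT, body TEXT, visited_at INTEGER, place TEXT);
    CREATE TABLE phone_call (contact TEXT, started_at INTEGER, minutes INTEGER);
""")
# Made-up log entries; only the first satisfies every remembered detail.
db.executemany("INSERT INTO web_visit VALUES (?, ?, ?, ?)", [
    ("Startup profile", "a biotech founder", ts(2006, 3, 10, 14, 5), "office"),
    ("Lab news", "biotech funding", ts(2006, 5, 2, 10, 0), "home"),
    ("Gene story", "biotech briefing", ts(2006, 7, 20, 16, 0), "office"),
    ("Recent piece", "biotech update", ts(2008, 9, 1, 9, 5), "office"),
    ("Match report", "football scores", ts(2006, 3, 10, 14, 10), "office"),
])
db.executemany("INSERT INTO phone_call VALUES (?, ?, ?)", [
    ("Jim Gemmell", ts(2006, 3, 10, 14, 0), 25),
    ("Jim Gemmell", ts(2006, 5, 2, 9, 55), 30),
    ("Jim Gemmell", ts(2006, 7, 20, 15, 58), 4),
    ("Jim Gemmell", ts(2008, 9, 1, 9, 0), 40),
])

QUERY = """
    SELECT DISTINCT v.title
    FROM web_visit v
    JOIN phone_call c
      ON v.visited_at BETWEEN c.started_at AND c.started_at + c.minutes * 60
    WHERE v.body LIKE '%biotech%'
      AND v.place = 'office'
      AND v.visited_at BETWEEN :earliest AND :latest
      AND c.contact = 'Jim Gemmell'
      AND c.minutes >= 10
"""
window = {
    "earliest": int((NOW - timedelta(days=4 * 365)).timestamp()),
    "latest": int((NOW - timedelta(days=2 * 365)).timestamp()),
}
matches = [title for (title,) in db.execute(QUERY, window)]
print(matches)  # ['Startup profile']
```

No single condition is selective, but taken together they leave only one page.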
In my old file-and-folder system, I had metadata, such as date of publication, name of person, et cetera, embedded in long file names. But in the database, we could have an actual publication-date field that I could use for sorting and searching.
Unfortunately, we didn’t have the manpower to make all the existing programs out there work smoothly with our database, so we ended up having our database keep an eye on a regular file-and-folder system and stay synchronized with it. This gave the folder system more prominence than we would have liked, and made the overall system more fragile, but that was the reality of creating prototypes with limited resources. Thankfully, it was just a nuisance, not a serious roadblock. Our software still looked pretty much the same, and we still could learn from real-life experience with this kind of storage, no matter what was under the hood. We were up and running.
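Keeping a database in step with an ordinary folder tree can be sketched as a periodic scan that compares what is on disk with what the index last saw. This is an assumed approach shown for illustration, not a description of how MyLifeBits did it; the function name `sync_index` and the use of modification times to detect changes are both assumptions.

```python
import os


def sync_index(index, root):
    """Bring `index` (relative path -> modification time) in line with the files under `root`.

    Returns three sorted lists of relative paths: added, updated, removed.
    """
    seen = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            seen[os.path.relpath(path, root)] = os.stat(path).st_mtime_ns

    added = sorted(p for p in seen if p not in index)
    removed = sorted(p for p in index if p not in seen)
    updated = sorted(p for p in seen if p in index and index[p] != seen[p])

    # Replace the old picture of the folder tree with the new one.
    index.clear()
    index.update(seen)
    return added, updated, removed
```

Calling `sync_index(index, folder)` repeatedly reports only what changed since the previous call. The fragility mentioned above shows up here too: a file renamed between scans looks like one removal plus one unrelated addition.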
CAPTURE EVERYTHING, DISCARD NOTHING
While we were thinking through the memory organization problems, I continued capturing and saving more and more of my life bits. The project mantra had become: Capture everything, discard nothing.
We made it a goal to make capture as automatic as possible; otherwise I knew I just wouldn’t capture enough. We enhanced my Web browser to record a copy of every Web page I visited—not just the URL that points to it, but a copy of everything on the page. The advantage of this is that it solves the problem of “link rot,” the process in which hyperlinks gradually become invalid, one by one. Link rot happens for several reasons. Web sites sometimes restructure their content, change hosts, get bought up, or go extinct. Other Web sites make content free and viewable when it’s fresh but disable the links after a few weeks. Another problem arises with sites that are continuously edited, such as political position papers and Wikipedia entries. Creating a copy of every page I visit in the exact form it had at the time circumvents all these problems. Furthermore, it is often easier to find a page from my collection of seen pages, rather than search for it out of the entire Web.
This page-logging can be turned off so that I can visit sites without having them go in my e-memory. However, with all the storage at my disposal, there’s not much point. I literally can’t surf the Web fast enough to incur a significant storage cost.
I also started recording all my instant messaging and saving all my e-mails, minus the spam (just like my paper, I want to keep what I actually read, not what marketers force into my in-box).
We set up hardware to record telephone calls in my office. If you call me, you will first hear a voice say, “Recording.” This notifies you that the call is being recorded, as is required by California law (not to mention common courtesy). I can settle any dispute about what was said on a conference call by instantly retrieving the audio file. My alibi in court, if I ever need one, will be ironclad to the extent I can prove that I didn’t fabricate it.
We started tracking all kinds of things: the number of mouse and keyboard clicks, every time a document was opened, every window shown on my PC screen, and the history of my music playback. We logged every search. I bought a GPS and started loading my location history into MyLifeBits.
We even experimented with recording radio and television shows. Digital video recorders (or DVRs) such as TiVo were just coming out, and we wondered what it would be like to keep everything when it came to TV. We built our own DVR and set it up with nearly two terabytes of storage—more than twenty times the capacity of the early DVRs. If you think your TV program guide is big, try wading through more than a thousand shows, all of which are actually interesting to you. And radio was a totally different experience. We recorded lots of National Public Radio shows, including Prairie Home Companion, Car Talk, and news. We played the audio back on a Pocket PC, so it was like a cross between TiVo and podcasting. Jim Gemmell learned that he fast-forwarded through all but fifteen minutes of a typical news hour.
But I quickly lost interest in TV and radio because such shows would soon be archived and available on demand. Having your own copy is not so special if you can just have it streamed to you through the ether anytime you please. It’s still worthwhile to have your lifelog make a record of what you watched and when, but not to copy the program itself.
By October 2003, I still wasn’t wearing the walnut-sized camera strapped to my forehead that Bush had predicted. But Lyndsay Williams, a colleague from the Microsoft Research Laboratory in Cambridge, England, had come up with something even more interesting. She called it the SenseCam. About the size of a cigarette pack, the SenseCam is a fisheye camera that hangs from a cord around your neck and takes pictures automatically. When it detects a change in light level, it presumes you’ve passed through a door or otherwise changed your setting, and snaps a picture. When its passive infrared sensor detects the appearance of a warm body, it snaps a picture of whoever just came into view. An accelerometer lets the SenseCam know when to delay taking a picture to avoid motion blur. And of course, you can point the SenseCam and take photos at will rather than waiting for it to take the initiative.
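The triggering behavior described above can be modeled as a small decision loop over sensor readings. This is a toy model and not the SenseCam firmware: the `Reading` fields, the units, and both thresholds are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    light: float      # ambient light level, arbitrary units
    warm_body: bool   # True when the infrared sensor newly detects a warm body
    motion: float     # accelerometer magnitude, arbitrary units


LIGHT_CHANGE = 0.3  # assumed: a 30 percent jump or drop counts as a change of setting
MAX_MOTION = 1.5    # assumed: above this the camera is moving too much for a sharp photo


def capture_frames(readings):
    """Return the positions in `readings` at which a picture would be taken."""
    shots = []
    pending = False        # a trigger has fired but the photo has not been taken yet
    previous_light = None
    for i, r in enumerate(readings):
        changed = (
            previous_light is not None
            and abs(r.light - previous_light) > LIGHT_CHANGE * max(previous_light, 1e-9)
        )
        previous_light = r.light
        if changed or r.warm_body:
            pending = True
        # Delay the shot until the wearer is steady enough to avoid motion blur.
        if pending and r.motion <= MAX_MOTION:
            shots.append(i)
            pending = False
    return shots


walk = [
    Reading(100, False, 0.2),
    Reading(100, False, 0.2),
    Reading(40, False, 3.0),   # walked through a door, but still moving quickly
    Reading(40, False, 0.5),   # steady again: the delayed photo is taken here
    Reading(42, True, 0.3),    # someone came into view
    Reading(42, False, 0.3),
]
print(capture_frames(walk))  # [3, 4]
```

In this model the light change at position 2 does not produce a photo until position 3, when the motion has settled.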
Lyndsay once confided that one reason she developed the SenseCam was to find her misplaced eyeglasses. By scanning SenseCam images, she can find the last place she put them down.
One of my favorite examples of how the SenseCam enhances life comes from Cathal Gurrin, a lecturer at Dublin City University in Ireland. Cathal set out to perform a year-long experiment, wearing the SenseCam during all his waking hours. When the year was over, many people expected him to be glad to stop. In fact, he wouldn’t give the SenseCam back. Cathal began wearing the SenseCam daily in June 2006 and, as I write, has worn the SenseCam for almost three years, acquiring over three million photos. Gurrin has a collection of his favorite photos rotating on a digital photo album on his desk which he shows off with the enthusiasm of a new parent with baby pictures. “Look,” he says, “here’s a picture of the first moment I met my girlfriend—not that I knew she’d become my girlfriend at the time.”
A fun thing to do is to play back all the SenseCam images from a day or a week in rapid succession, which takes just a few minutes. Talk about your life flashing before your eyes! It’s an amazing feeling to see your life on fast-forward like that.
I enjoyed taking the SenseCam on walkabouts with my GPS. I could later reconstruct my travels on an animated map, with pictures taken along the way to tell the story. The best series I did was an eight-hour trip along the Great Ocean Road in Australia and through a treetop walk in a rain forest.
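Placing photos on a map comes down to matching each photo's timestamp with the nearest point in the GPS log. The sketch below assumes both devices share a clock and that the track is non-empty and sorted by time; the function name `locate_photos` and the coordinates are invented for the example.

```python
from bisect import bisect_left


def locate_photos(track, photo_times):
    """Pair each photo time with the position of the nearest GPS fix.

    `track` is a non-empty list of (timestamp, latitude, longitude) sorted by timestamp.
    """
    times = [t for t, _lat, _lon in track]
    located = []
    for shot in photo_times:
        i = bisect_left(times, shot)
        # Only the fixes just before and just after the photo can be nearest.
        candidates = track[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda fix: abs(fix[0] - shot))
        located.append((shot, nearest[1], nearest[2]))
    return located


# Made-up track: one fix per minute, timestamps in seconds.
track = [(0, 10.0, 20.0), (60, 10.1, 20.1), (120, 10.2, 20.2)]
print(locate_photos(track, [5, 50, 100, 200]))
```

Sorting the located photos by time then gives the sequence of stops that an animated map would play back.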
My SenseCam has captured many special moments, especially at parties, lunches, and conference exhibits. I have a sequence of when I was admitted to the hospital for heart bypass surgery in July 2007. My partner, Sheridan, wore the camera as I was wheeled into the operating room.
THE DAY OF CARPE
By 2004, we were so excited about where MyLifeBits was taking us, and saw so much potential, that we wanted to encourage others to get involved. Jim Gemmell launched a workshop at ACM Multimedia 2004, a professional conference for computer scientists. The theme of the workshop, which was held annually for three years, was CARPE: Continuous Archival and Retrieval of Personal Experiences.
In 2005 we invited universities to submit proposals for research projects. We received eighty submissions and selected fourteen of them to receive money, SenseCams, and our software. I was thrilled at the wonderful results from academia, touching into many areas and ideas we would never have thought of, from helping disabled students to logging therapy sessions for stroke victims.
As of this writing I have 261 gigabytes of information saved on my main computer and about 100 gigabytes accessible in my cloud. I add about one gigabyte a month. This doesn’t include continuous audio and video, but that’s on the horizon.
The MyLifeBits software is far from perfect. The hardware right now is clunky enough that I don’t use it all the time (I hate dealing with heaps of batteries and chargers!). But between MyLifeBits and the work of our colleagues in the research community, we believe we have a proof of concept. We’ve built and experienced enough to confidently endorse Total Recall.
We will be taking a tour of how Total Recall has affected my life so far and how it will affect your life, in ways direct and indirect, large and small, as e-memory becomes standard furniture in our daily lives. Before we get to the effects Total Recall can have on work, health, learning, and our personal relationships, we need to take a deeper look at what science can tell us about the meeting of e-memory and bio-memory, that stuff that resides in our heads.