Initial Thought on Archiving Social Media

My head is buzzing from the one-day Archiving Social Media workshop organized by the Center for History and New Media at George Mason University and our close neighbor, the University of Mary Washington. The workshop wrapped up only a few hours ago, but I’m already feeling a need to synthesize some thoughts about archives, social media, and the humanities. And I know I won’t have time in the next day or two to do this, so I’m taking a moment to synthesize a single thought.

And it is this: we need a politics and poetics of the digital archive. We need a politics and poetics of the social media archive.

Much work has been done on the poetics of traditional archives—Carolyn Steedman’s Dust comes to mind—and there’s emerging political work on social media archives. But there is no deliberate attempt by humanists to understand and articulate the poetics of the social media archive.

And this is exactly what humanists should be doing. Matthew Kirschenbaum asked today, incisively, what can humanists bring to discussions about social media and archives. My answer is this: we don’t need to design new tools, create new implementation plans, or debate practical use issues. We need to think through social media archives and think through the poetics of these archives. We need to discern and articulate the social meaning of social archives. That’s what humanists can do.

Maps and Timelines

Over a period of a few days last week I posted a series of updates onto Twitter that, taken together, added up to less than twenty words. I dragged out across fourteen tweets what could easily fit within one. And instead of text alone, I relied on a combination of words and images. I’m calling this elongated, distributed form of social media artisanal tweeting. Maybe you could call it slow tweeting. I think some of my readers simply called it frustrating or even worthless.

If you missed the original sequence of updates as they unfolded online, you can approximate the experience in this thinly annotated chronological trail.

I’m not yet ready to discuss the layers of meaning I was attempting to evoke, but I am ready to piece the whole thing together—which, as befits my theme, actually destroys much of the original meaning. Nonetheless, here it is:

The Archive or the Trace: Cultural Permanence and the Fugitive Text

We in the humanities are in love with the archive.

My readers already know that I am obsessed with archiving otherwise ephemeral social media. I’ve got multiple redundant systems for preserving my Twitter activity. I rely on the Firefox plugins Scrapbook and Zotero to capture any online document that poses even the slightest flight risk. I routinely back up emails that date back to 1996. Even my recent grumbles about the Modern Language Association’s new citation guidelines were born of an almost frantic need to preserve our digital cultural heritage.

I don’t think I am alone in this will to archive, what Jacques Derrida called archive fever. Derrida spoke about the “compulsive, repetitive, and nostalgic desire for the archive” way back in 1994, long before the question of digital impermanence became an issue for historians and librarians. And the issue is more pressing than ever.

Consider the case of a Hari Kunzru short story that Paul Benzon described in an MLA presentation last month. As Julie Meloni recently recounted, Kunzru had published “A Story Full of Fail” online. Then, deciding instead to find a print home for his piece, Kunzru removed the story from the web. Julie notes that there’s no Wayback Machine version of it, nor is the document in a Google cache. The story has disappeared from the digital world. It’s gone.

Yet I imagine some Kunzru fans are clamoring for the story, and might actually be upset that the rightful copyright holder (i.e. Kunzru) has removed it from their easy digital grasp. The web has trained us to want everything and to want it now. We have been conditioned to expect that if we can’t possess the legitimate object itself, we’ll be able to torrent it, download it, or stream it through any number of digital channels.

We are archivists, all of us.

But must everything be permanent?

Must we insist that every cultural object be subjected to the archive?

What about the fine art of disappearance? Whether for aesthetic reasons, marketing tactics, or sheer perversity, there’s a long history of producing cultural artifacts that consume themselves, fade into ruin, or simply disappear. It might be a limited-issue LP, the short run of a Fiestaware color, or a collectible Cabbage Patch Kid. And these are just examples from mass culture.

Must everything be permanent?

In the literary world perhaps the most well-known example is William Gibson’s Agrippa (A Book of the Dead), a 300-line poem published on a 3.5″ floppy in 1992 that was supposed to erase itself after one use. Of course, as Matthew Kirschenbaum has masterfully demonstrated, Gibson’s attempt at textual disintegration failed for a number of reasons. (Indeed, Matt’s research has convinced me that Kunzru’s story hasn’t entirely disappeared from the digital world either. It’s somewhere, on some backup tape or hard drive or series of screen shots, and it would take only a few clicks for it to escape back into everyday circulation.)

I have written before about the fugitive as the dominant symbolic figure of the 21st century, precisely because fugitivity is nearly impossible anymore. The same is now true of texts. Fugitive texts, or rather, the fantasy of fugitive texts, will become a dominant trope in literature, film, art, and videogames, precisely because every text is archived permanently some place, and usually, in many places.

We already see fantasies of fugitive texts everywhere, both high and low: House of Leaves, The Raw Shark Texts, Cathy’s Book, The Da Vinci Code, and so on. But what we need are not just stories about fugitive texts. We need actual texts that are actual fugitives, fading away before our eyes, slipping away in the dark, texts we apprehend only in glimpses and glances. Texts that remind us what it means to disappear completely forever.

The fugitive text stands in defiant opposition to the archive. The fugitive text exists only as (forgive me as I invoke Derrida once more) a trace, a lingering presence that confirms the absence of a presence. I am reminded of the novelist Bill Gray’s lumbering manuscript in DeLillo’s Mao II. Perpetually under revision, an object sought after by his editor and readers alike, Gray’s unfinished novel is a fugitive text.

Mao II is an extended meditation on textual availability and figurative and literal disappearance, but it’s in DeLillo’s handwritten notes for the novel — found ironically enough in the Don DeLillo Papers archive at the University of Texas at Austin — that DeLillo most succinctly expresses what’s at stake:

Reclusive Writer: In the world of glut + bloat, the withheld work of art becomes the only meaningful object. (Spiral Notebook, Don DeLillo Papers, Box 38, Folder 1)

Bill Gray’s ultimate fate suggests that DeLillo himself questions Gray’s strategy of withdrawal and withholding. Yet, DeLillo nonetheless sees value in a work of art that challenges the always-available logic of the marketplace — and of that place where cultural objects go, if not to die, then at least to exist on a kind of extended cultural life support, the archive.

Years ago Bruce Sterling began the Dead Media Project, and I now propose a similar effort, the Fugitive Text Collective. Unlike the Dead Media Project, however, we don’t seek to capture fleeting texts before they disappear. This is not a project of preservation. There shall be no archives allowed. The collective are observers, nothing more, logging sightings of impermanent texts. We record the metadata but not the data. We celebrate the trace, and bid farewell to texts that by accident or design fade, decay, or simply cease to be.

Let the archive be loved. But fugitive texts will become legend.

The Modern Language Association Wishes Away Digital Différance

This is the first academic semester in which students have been using the revised 7th edition of the MLA Handbook (you know, that painfully organized book that prescribes the proper citation method for material like “an article in a microform collection of articles”).

From the moment I got my copy of the handbook in May 2009, I have been skeptical of some of the “features” of the new guidelines, and I began voicing my concerns on Twitter:

But not only does the MLA seem unprepared for the new texts we in the humanities study, the association actually took a step backward when it comes to locating, citing, and cataloging digital resources. According to the new rules, URLs are gone, no longer “needed” in citations. How could one not see that these new guidelines were remarkably misguided?

To the many incredulous readers on Twitter who were likewise confused by the MLA’s insistence that URLs no longer matter, I responded, “I guess they think Google is a fine replacement.” Sure, e-journal articles can have cumbersome web addresses, three lines long, but as I argued at the time, “If there’s a persistent URL, cite it.”

Now, after reading a batch of undergraduate final papers that used the MLA’s new citation guidelines, I have to say that I hate them even more than I thought I would. Although “hate” isn’t quite the right word, because that verb implies a subjective reaction. In truth, objectively speaking, the new MLA system fails.

The MLA apparently believes that all texts are the same

In a strange move for a group of people who devote their lives to studying the unique properties of printed words and images, the Modern Language Association apparently believes that all texts are the same. That it doesn’t matter what digital archive or website a specific document came from. All that is necessary is to declare “Web” in the citation, and everyone will know exactly which version of which document you’re talking about, not to mention any relevant paratextual material surrounding the document, such as banner ads, comments, pingbacks, and so on.

The MLA turns out to be extremely shortsighted in its efforts to think “digitally.” The outwardly same document (same title, same author) may in fact be very different depending upon its source. Anyone working with text archives (think back to the days of FAQs on Gopher) knows that there can be multiple variations of the same “document.” (And I won’t even mention old timey archives like the Short Title Catalogue, where the same 15th century title may in fact reflect several different versions.)

The MLA’s new guidelines efface these nuances, suggesting that the contexts of an archive are irrelevant. It’s the Ghost of New Criticism, a war of words upon history, “simplification” in the name of historiographic homicide.

On Hacking and Unpacking My (Zotero) Library

Many of my readers in the humanities already know about Zotero, the free open-source citation manager that works within Firefox and scares the hell out of Endnote’s makers. If you are a student or professor and haven’t tried Zotero, then you are missing out on an essential tool. I use it daily, both for my research and in my teaching. [Full disclosure: I am not an entirely impartial evangelist for Zotero, as its developers are colleagues at George Mason University, in the incomparable Center for History and New Media.]

The latest version of Zotero allows you to “publish” your library, so that anybody can see your collection of sources (and your notes about those sources, if you choose). In my case, I’ve not only published my library on the zotero.org site, I’ve updated the main sidebar on this very blog with a news feed of my “Recently Zoteroed” books and articles. As I gather and annotate sources for my teaching and research, the newest additions will always appear here, with links back to the full bibliographic information in the online version of my library.

How did I do this?

Why did I do this?

What follows is an attempt to answer these two questions. Before I address the how-to, though, I’ll explain the why-to: why I’m making the sources I use for my teaching and research public in the first place.

Sharing my Library in theory

Like many scholars in the humanities (I imagine), I initially had qualms about sharing my library online — checking that little box in my Zotero privacy settings that would “make all items in your library viewable by anyone.” Emphasizing the gravity of the decision, zotero.org adds this warning: “Be very sure you want to do this.”

I do want to do this, I do, I do.

But why? We are accustomed, in the humanities, to being very secretive about our research. Oh sure, we go to conferences and share not-yet-published work. But these conference papers, even if they’re finished the morning of the presentation with penciled-in edits, are still addressed to an audience, meant to be shared. Now imagine publishing your research notes and only the notes, shorn of context or rhetoric or (especially or) the sense of a conclusion we like to build into our papers. Imagine sharing only your Works Cited. Or, imagine sharing the loosest, most chaotic collection of sources, expanded way beyond the shallows of Works Cited, past the nebulous Works Consulted, deep into the fathomless Works Out There.

A paranoid academic (and most of us are paranoid) might worry that by sharing our pre-publication sources, whether they’re primary or secondary sources, we are exposing our research before its time. My sense is that we like to keep our collection of sources private as long as possible, holding them close to our chest as if we were gamblers in the great poker game of academia. And in this game, our colleagues are not colleagues, but opponents sitting across the table from us, bluffing perhaps, or maybe holding a royal flush. Proprietary software like Endnote, which by default encloses research libraries within a walled garden, reinforces this notion, that the engine of scholarship is competition rather than collaboration.

Or, to switch metaphors, sharing our sources in advance of the final product is like sharing the blueprints to a house we haven’t yet built — a house we may not even have the money to build, and meanwhile you just know there’s somebody out there, more clever or less scrupulous or just damn faster, who can take those blueprints and erect an edifice that should have been ours while we’re still at town hall getting zoning permits. We’ve all had that experience of reading a journal article or — damn it! — a mother effing blog in which the author tackles clearly, succinctly and without pause some deep research concern that we’ve been pondering for years, waiting for it to blossom into a Beautiful Idea in our writing before going public with it. And POOF! somebody else says it first, and says it better.

Keeping our sources private is the talisman against such deadly blows to our research, akin to some superstitious taboo against revealing first names. We academics are true believers in occult knowledge.

To put it in the starkest terms possible: before I published my library I was concerned that someone might take a look at my sources and somehow reverse engineer my research.

Are we in the humanities really that ridiculous and self-important? Let’s face it, I’m an English professor. It’s not as if I’m working on the Manhattan Project. My teaching and research add only infinitesimal increments to the storehouse of human knowledge. I don’t mean to belittle what scholars in the humanities do à la Mark Bauerlein. On the contrary, I think that what we do, striving to understand human experience in a chaotic world, is so crucial that we need to share what we learn, every step along the way. Only then do all the lonely hours we spend tracing sources, reading, and writing make sense.

Looked at prosaically, public Zotero libraries may be the equivalent of a give-a-penny, take-a-penny bowl at a local store. This convenience alone would be useful, but the creators of Zotero are much more inspired than that. They know that sharing a library is crowdsourcing a library. The more people who know what we’re researching before we’re done with the research, the better. Better for the researchers, better for the research. Collaboration begins at the source, literally. And as more researchers share their libraries, we’re going to achieve what the visionaries in the Center for History and New Media call the Zotero Commons, a collective, networked repository of shareable, annotatable material that will facilitate collaboration and the discovery of hidden connections across disciplines, fields, genres, and periods.

And that is why I’m sharing my library.

Sharing my Library in Practice

Now, how am I sharing it? I’ve taken what seems to be an unnecessarily complicated route in order to incorporate my library into my blog. There is an easy way to do what I’ve done: Zotero has native RSS feeds for users’ collections, and all you need is to subscribe to that feed using a widget on your blog. In my case I could have used the default WordPress RSS sidebar widget. But I didn’t. I wound up working with both Dapper and Yahoo Pipes, and here’s why.

I didn’t like how the RSS feed built into zotero.org included everything I added, including duplicate citations, snapshots that I later categorized as something else, and PDFs unattached to metadata (even if I retrieved that metadata later). In short, the default RSS stream looked messy in WordPress (but it looks great in Google Reader). [UPDATE: Patrick Murray-John’s awesome Zotero WordPress plugin solves these problems and makes the Pipes solution below unnecessary—though still cool.]

The online mash-up tool Yahoo Pipes is perfect for combining and filtering RSS feeds and that’s what I wanted to use. I can’t program my way out of a paper bag, but Pipes is simple enough that even I can use it. So why did I also use Dapper, another online tool that lets you do fun things with RSS feeds? Because Pipes for some reason would not accept the Zotero RSS feed as valid. I haven’t been able to confirm this, but I’m guessing it has something to do with Zotero’s API using secure HTTPS rather than plain HTTP. Or maybe it’s because the Zotero feed is actually XML rather than RSS. Again, I’m not a programmer and I’m just fumbling my way around this hack. In any case I ran my Zotero feed through the Dapp Factory, which did accept it.
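One way to test the second guess is to look at the root element of whatever the feed URL returns. What follows is only a minimal sketch in Python, not anything Pipes or Zotero provides: the function name `feed_format` and the two sample strings are made up for illustration, and the samples are hand-written stand-ins rather than real Zotero output. It relies on the fact that an RSS 2.0 document has an `rss` root element, while an Atom document has a `feed` root in the Atom namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (RFC 4287).
ATOM_NS = "http://www.w3.org/2005/Atom"


def feed_format(xml_text):
    """Return 'rss', 'atom', or 'unknown' based on the root element."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        return "rss"
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "atom"
    return "unknown"


# Hand-written samples for illustration only (not real Zotero output).
RSS_SAMPLE = '<rss version="2.0"><channel><title>t</title></channel></rss>'
ATOM_SAMPLE = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'

print(feed_format(RSS_SAMPLE))
print(feed_format(ATOM_SAMPLE))
```

A check like this would not explain why Pipes rejected the feed, but it would at least rule the format question in or out.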

Next I dumped the Dapper feed into Yahoo Pipes, using several of Pipes’s operators to filter duplicates and attachment file names that were cluttering the RSS feed. Here is a map of my Pipe.

Using Yahoo Pipes to filter a Zotero library

It’s quite simple, and with some experimentation I may improve my hack (for example, I’m toying with Feedburner as a substitute for Dapper, which may preserve more of the original XML, giving Pipes more raw data to manipulate and mash). But even right now in its kludged form, the result is exactly what I set out to do.
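For anyone who would rather script this than wire it up visually, the same two filters (drop duplicates, drop bare attachment file names) can be roughly approximated in a few lines of Python. This is a sketch under assumptions, not a transcription of my Pipe: the function `filter_items`, the pattern `ATTACHMENT_PATTERN`, and the sample feed are all hypothetical, and it assumes that a title ending in a file extension is an orphaned attachment and that two items with the same title (ignoring case) are duplicates.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a title that ends in a file extension is a bare attachment.
ATTACHMENT_PATTERN = re.compile(r"\.(pdf|html?|png|jpe?g)$", re.IGNORECASE)


def filter_items(rss_text):
    """Return (title, link) pairs, skipping attachments and repeated titles."""
    root = ET.fromstring(rss_text)
    seen = set()
    kept = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        # Skip untitled items and anything that looks like a file name.
        if not title or ATTACHMENT_PATTERN.search(title):
            continue
        # Treat titles that differ only by case as duplicates; keep the first.
        key = title.casefold()
        if key in seen:
            continue
        seen.add(key)
        kept.append((title, link))
    return kept


# Made-up sample feed for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Dust</title><link>http://example.org/1</link></item>
<item><title>dust</title><link>http://example.org/2</link></item>
<item><title>scan0042.pdf</title><link>http://example.org/3</link></item>
<item><title>Mao II</title><link>http://example.org/4</link></item>
</channel></rss>"""

for title, link in filter_items(SAMPLE_FEED):
    print(title, link)
```

Matching on titles alone is crude. A real library will have legitimately distinct items that share a title, so a sturdier version would compare something like the item link or identifier instead.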

In addition to its simplicity, one of the advantages of Yahoo Pipes is the variety of output formats available. For my blog’s sidebar I have Pipes generate an RSS feed, but I could just as easily create an interactive Flash “badge” with it:

I find the possibilities of a portable, embeddable version of my Zotero library extremely evocative. It’s a kind of artifact from the future that our methodological and pedagogical approaches haven’t caught up with yet. Here is where the theory and practice of a collaborative library have yet to meet — and I want to end my manifesto/guide with a simple appeal: let’s begin thinking about the untapped power of this intersection and what we can do with it, for ourselves, our students, and our scholarship.