A Textual Ecosystem
Faced with an unprecedented amount of digital text, writing needs to redefine itself in order to adapt to the new environment of textual abundance. What do I mean by textual abundance? A recent study showed that in 2008, the average American consumed 100,000 words of information in a single day (1). (By comparison, Leo Tolstoy’s War and Peace was only 460,000 words long.) This doesn’t mean we read 100,000 words a day; it means that 100,000 words cross our eyes and ears in a single 24-hour period.
Of course, one can never know what all those words mean or if they have any use whatsoever, but for writers and artists — who often specialize in seeing value in things that most people overlook — this glut of language signifies a dramatic shift in their relationship to words. Since the dawn of media, we’ve had more on our plates than we could ever consume, but something has radically changed: never before has language had so much materiality — fluidity, plasticity, malleability — begging to be actively managed by the writer. Before digital language, words were almost always found imprisoned on a page. How different today when digitized language can be poured into any conceivable container: text typed into a Microsoft Word document can be parsed into a database, visually morphed in Photoshop, animated in Flash, pumped into online text-mangling engines, spammed to thousands of email addresses and imported into a sound editing program and spit out as music; the possibilities are endless.
If we think of words as both carriers of semantic meaning and as material objects, it becomes clear that we need a way to manage it all, an ecosystem that can encompass language in its myriad forms. I’d like to propose such a system, taking as inspiration James Joyce’s famous meditation on the universal properties of water in Ulysses. When Joyce writes about the different forms that water can take, it reminds me of the different forms that digital language can take. When he describes the way water puddles and collects in “its variety of forms in loughs and bays and gulfs,” I am reminded of the process whereby data rains down from the network in small pieces when I use a BitTorrent client, pooling in my download folder. When my download is complete, the data finds its “solidity in glaciers, icebergs, icefloes,” as a movie or music file. When Joyce speaks of water’s mutability from its liquid state into “vapour, mist, cloud, rain, sleet, snow, hail,” I am reminded of what happens when I join a network of torrents and I begin “seeding” and uploading to the data cloud, the file simultaneously constructing and deconstructing itself. The utopian rhetoric surrounding data flows (“information wants to be free,” for example) is echoed by Joyce when he notes water’s democratic properties, how it is always “seeking its own level.” He acknowledges water’s double economic status in both “its climatic and commercial significance,” just as we know that data is bought and sold, as well as given away. When Joyce speaks of water’s “weight and volume and density,” I’m thrown back to the way in which words are used as quantifiers of information and activity, entities to be weighed and sorted.
When he writes about the potential for water’s drama and catastrophe, “its violence in seaquakes, waterspouts, artesian wells, eruptions, torrents, eddies, freshets, spates, groundswells, watersheds, waterpartings, geysers, cataracts, whirlpools, maelstroms, inundations, deluges, cloudbursts,” I think of electrical spikes that wipe out hard drives, wildly spreading viruses, or what happens when I bring a strong magnet too close to my laptop, disastrously scrambling my data in every direction. Joyce speaks of water in a way that recalls how data flows through our networks, with “its vehicular ramifications in continental lakecontained streams and confluent oceanflowing rivers with their tributaries and transoceanic currents: gulfstream, north and south equatorial courses,” while also speaking of its upsides, “its properties for cleansing, quenching thirst and fire, nourishing vegetation: its infallibility as paradigm and paragon.”
While writers have traditionally taken great pains to ensure that their texts “flow,” in the context of our Joyce-inspired language / data ecosystem, this takes on a whole new meaning, as writers are the custodians of this ecology. Having moved from the traditional position of being solely generative entities to information managers with organizational capacities, writers are embodying tasks once thought to belong only to programmers, database minders, and librarians, thus eradicating the distinction between archivists, writers, producers and consumers.
Using methods similar to Jonathan Lethem’s The Ecstasy of Influence: A Plagiarism, Joyce composed this passage by patchwriting an encyclopedia entry on water. By doing so, he actively demonstrates the fluidity of language, moving language from one place to another. Joyce presages uncreative writing by the act of sorting words, weighing which are “signal” and which are “noise,” what’s worth keeping and what’s worth leaving. Identifying, or weighing, language in its various states of “data” and “information” is crucial to the health of the ecosystem:
“Data in the 21st century is largely ephemeral, because it is so easily produced: a machine creates it, uses it for a few seconds and overwrites it as new data arrives. Some data is never examined at all, such as scientific experiments that collect so much raw data that scientists never look at most of it. Only a fraction ever gets stored on a medium such as a hard drive, tape or sheet of paper. Yet even ephemeral data often has ‘descendents’, new data based on the old. Think of data as oil and information as gasoline: a tanker of crude oil is not useful until it arrives, its cargo unladed and refined into gasoline that is distributed to service stations. Data is not information until it becomes available to potential consumers of that information. On the other hand, data, like crude oil, contains potential value.” (2)
How can we discard something that might in another configuration be extremely valuable? As a result, we’ve become hoarders of data, hoping that at some point we’ll have a “use” for it. Look at what’s on your hard drive in reserve (pooled, as Joyce would say) as compared to what you actually use. On my laptop, I have hundreds of fully indexable PDFs of e-books. Do I use them? Not in any regular way. I store them for future use. Like those PDFs, all of the data that’s stored on my hard drive is part of my local textual ecosystem. My computer indexes what’s on my hard drive and makes it easier for me to search for what I need by keyword. The local ecosystem is pretty stable; when new textual material is generated, my computer indexes it as data as soon as it’s created. On the other hand, my computer doesn’t index information: if I’m looking for a specific scene in a movie on my drive, my computer will not be able to find it unless I have, say, a script of the film on my system. Even though digitized films are made of language, my computer’s search function only, in Joycean terms, skims the surface of the water, recognizing only one state of language. What happens on my local ecosystem is prescribed, limited to its routine, striving to function harmoniously. I have software to protect against any viruses that might destabilize or contaminate it, allowing my computer to run as it’s supposed to.
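The keyword indexing described above can be pictured with a minimal sketch. This is an illustration only, not how any particular operating system's search actually works: the file names, the `build_index` and `search` functions, and the sample contents are all hypothetical. It shows why a film script is findable by keyword while the film itself, holding no text the indexer can read, is not.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of file names whose text contains it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search(index: dict[str, set[str]], keyword: str) -> list[str]:
    """Return the file names containing the keyword, in alphabetical order."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical local ecosystem: two text files and a movie whose
# language is locked in a state the indexer cannot read.
documents = {
    "ulysses_notes.txt": "Water is always seeking its own level.",
    "film_script.txt": "INT. KITCHEN. She pours a glass of water.",
    "film.mp4": "",  # no extractable text, so nothing to index
}
index = build_index(documents)
water_hits = search(index, "water")
kitchen_hits = search(index, "kitchen")
```

A search for "kitchen" turns up the script but never the movie file, which is the surface-skimming the paragraph describes.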
Now let’s say I take a text document and upload a copy of it to a publicly accessible server where it can be downloaded, while keeping a copy on my PC. I have the identical text in two places, operating in two distinct ecosystems, like twins, one who spends their life close to home, and the other who adventures out into the world: each textual life is marked accordingly. The text document on my PC sits untouched in a folder, remaining unchanged, while the text in play on the network is subject to untold changes: it can be cracked, password protected, stripped of its textual character, converted into plain text, remixed, written into, translated, deleted, eradicated, converted to sound, image or video, and so forth. If a version of that text were somehow to find its way back to me, it might very well be more unrecognizable than my nursery rhyme.
The editing process that occurs between two people exchanging a word processing document via email is an example of a microclimate where the variables are extremely limited and controlled. The tracked editorial changes are extra-linguistic and purposeful. Opening up the variables a little more, think of what happens when an mp3 is passed around from one user to another, each slightly remixing it, defying any definitive version. In these ecologies, final versions do not exist. Unlike the result of a printed book or pressed LP, there is no endgame; rather, flux is inherent to the digital.
The text cycle is primarily additive, spawning new texts continuously. If a hosting directory is made public, language is siphoned off like water from a well, replicating it infinitely. There is no need to assume that, notwithstanding any of the above-mentioned catastrophes, a textual drought will occur. The morass of language does not deplete; rather, it creates a wider, rhizomatic ecology, leading to a continuous and infinite variety of textual occurrences and interactions across both the network and the local environment.
The uncreative writer constantly cruises the web for new language, the cursor sucking up words from untold pages like a stealth encounter. Those words, sticky with residual junky code and formatting, are transferred back into the local environment and scrubbed with TextSoap, which restores them to their virginal states by removing extra spaces, repairing broken paragraphs, deleting email forwarding marks, straightening curly quotation marks, even extracting text from the morass of HTML. With one click of a button, these soiled texts are cleaned and ready to be redeployed for future use.
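For the curious, the scrubbing steps listed above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not TextSoap's actual implementation: the `scrub` function and its helper class are hypothetical names, forwarding marks are assumed to be leading `>` characters, and paragraphs are assumed to be separated by blank lines.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and discards the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# Curly quotation marks mapped to their straight equivalents.
_STRAIGHT_QUOTES = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})


def scrub(raw: str) -> str:
    """Clean text lifted from the web (hypothetical helper, not TextSoap)."""
    # 1. Extract text from the morass of HTML (entities are decoded too).
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    text = "".join(parser.chunks)
    # 2. Straighten curly quotation marks.
    text = text.translate(_STRAIGHT_QUOTES)
    # 3. Delete email forwarding marks, assumed here to be leading '>' runs.
    text = re.sub(r"^(?:>[ \t]?)+", "", text, flags=re.MULTILINE)
    # 4. Repair broken paragraphs: blank lines mark paragraph boundaries,
    #    and single line breaks inside a paragraph are joined.
    paragraphs = re.split(r"\n\s*\n", text)
    # 5. Remove extra spaces within each paragraph.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


soiled = "<p>&gt; “Hello”   world,\n&gt; it’s  me.</p>\n\n<p>Second   paragraph.</p>"
clean = scrub(soiled)
```

Each step treats the words as material to be handled rather than read, which is the point: the text comes out ready to be poured into its next container.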
(1), (2) Roger E. Bohn and James E. Short, “How Much Information? 2009: Report on American Consumers”, Global Information Industry Center, University of California, San Diego, December 9, 2009.