Place and No Place: Reflections on Panorama, Glitch, and Photospheres in an Aesthetic Imaginary Shared by Humans and Machines


Scott Rettberg, University of Bergen

Notre Dame Review Online, Winter/Spring 2018

This is an extended version of the essay of the same title published in the print edition of the Notre Dame Review.

This essay addresses the relationship between aesthetic choices made by humans and those determined by software, by glitch, and by chance in new kinds of digital images produced on mobile phones. I consider how the variable image, particularly in an age of ubiquitous and mobile computing, both brings us closer to the specificity of location in the physical world and, through its transcoded and malleable nature, defamiliarizes and metaphorizes place. I will focus in particular on a set of images made using an iPhone: “glitch panoramas” and “photospheres.” Most importantly, I question how images produced through a combination of human and machine vision within the contemporary transitional media environment function as an aesthetic imaginary that is created by and shared between humans and technology.

The photograph and the unreal 

During the history of photography (a relatively short one in comparison to other art forms) we have seen significant transitions in our perception of the indexicality of the photograph. When photographic images were first introduced, there was less trust of the photographic image as a true representation of reality than there was once the technology became more commonplace. The initial impulse was to understand the image as something other than what one would see with one’s own eyes. Consider the situation of early photographic portraits, such as those of soldiers departing for the battlefields of the American Civil War. These images required long sitting sessions and long exposures. The subject of the portrait would need to sit still in an unnatural position for minutes (not seconds or fractions of a second or microseconds). Because of the long exposure, these images would result in an artifact that, while bearing a strong resemblance to the subject, would also capture facial tics and slight movements recorded during the session as blurs of light, obfuscating the image of the person. The images would then often be further touched up, painted by hand to make them appear more lifelike. The resulting images were not truly indexical or “lifelike” so much as memento mori painted with light.

Private Edward A. Cary of Company I, 44th Virginia Infantry Regiment, in uniform and his sister, Emma J. Garland née Cary. 1861-62. Charles R. Rees, photographer.

The image produced was highly dependent on the material substrates used in the image capture and development process. Daguerreotype, calotype negatives, salt printing, autochrome, Kodachrome—every successive analog image technology relied on photochemical processes that had specific physical effects on the type of image that resulted. The era of analog photography produced images within a set of constraints defined by both the camera and the printing technologies used. The photographic image gradually became less “painterly” and more “realistic” from the 19th century to the late 20th, though the specific properties of color, sharpness, material surface, etc. continued to be variable in this material sense, depending on the camera, film, photographic paper, etc., up until the end of the period in which analog photography was dominant. During the transitional period of photography in the 20th century, trust in the photographic image grew stronger, and the image to some degree acquired the quality of “indexicality.”

Charles Peirce distinguished between iconicity and indexicality. He described the photograph as indexical in the sense that:

Photographs, especially instantaneous photographs …  are in certain respects exactly like the objects they represent. But this resemblance is due to the photographs having been produced under such circumstances that they were physically forced to correspond point by point to nature. (11)

Because of this perceived indexical quality, photographs were eventually taken to be a more trustworthy representation of reality than that made by a sketch artist or a human memory of the particular details of an event. Photographs were introduced and accepted as evidence at trials, and published in newspapers not only as illustrations but as factual evidence that the events described in narrative texts actually took place. While no storyteller could ever be completely trusted, audiences would look at a series of published photographs and say “you can’t fake that.” The photographic images were then accorded great respect, as a form of testimony more reliable than the human.

One popular phenomenon during the early days of photography, from the late 19th century into the 20th, was “spirit photography.” Photographers noticed that when a double exposure accidentally took place, one of the images would appear to be “ghosted” onto the other image. William H. Mumler, purported to be the first photographer to discover this phenomenon, took advantage of the technological artifact to develop a fraudulent career as a medium. His images (such as an image which purports to show Mary Todd Lincoln with the ghost of her deceased husband) became widely popular, even though there is no evidence that there was widespread belief in the indexicality of these spirit images.

Picture of the ghost of Abraham Lincoln with Mary Todd Lincoln (circa 1869). William H. Mumler.

Spirit photography had well-known advocates, such as Arthur Conan Doyle, and some books were published in support of the belief that spirit photographs captured a fluid substance, “ectoplasm,” leached by the ghosts into the image, but the most likely explanation for the popularity of spirit photographs is not that people believed them indexical but that they recognized their own affected desire, their “wish for them to be so.” The image of Lincoln’s ghost appearing with his widow was, for instance, not the result of happenstance—Mary Todd Lincoln specifically sought out Mumler and came to his studio with the express desire of encountering her husband in the spirit world (Kaplan 93). While there is a strong appeal to the documentary image, as it answers a need for a shared understanding of objective reality, there has also always been a pull towards the non-indexical photographic image, the image that presents us with what our eyes desire but cannot see: the unreal image produced by the apparatus of documentary image recording.

Trick photography and physical retouching of images were used not only for aesthetic ends or for necromancy but also for political ends. The famous image of Joseph Stalin known as “the Commissar Vanishes” is but one of many examples that demonstrate both the power of the belief in the indexical nature of the image and the actual frailty of that indexicality[1].  Removing a person from an image was for Stalin part of a process of removing that person from shared memory and therefore from common reality. It is important to note that were it not for the increasing trust in the indexicality of the photographic image, this gesture of erasure would not have the power that it did.

Unaltered and censored images of Stalin, Nikolai Yezhov (censored), and Molotov at the shore of the Moscow-Volga canal (1937, 1940).

There was nothing inherent in the technologies of analog photography that prevented manipulation of the photographic image, but as a general rule, when we encountered an analog photograph, our first assumption was that it at least began as an indexical image with a strong physical relationship to the object it depicted. The retouching of images such as those of models in fashion magazines has long been commonplace, but well-done manipulation of photographs or cinematic images was expensive and time-consuming. You may have encountered and processed the cover of Vogue as a manipulated image, but if a friend handed you a pile of photographs from her summer vacation, you would not assume that those images had been faked or manipulated. For the most part we encountered the photograph as an indexical representation of a moment in reality. In the digital era, the general assumption that photographs are indexical is falling away.

The transcoded, malleable, networked image

As Lev Manovich highlights, with the dawn of digital photography, our understandings of the indexical relationship between the photographic (or cinematic) image and its generic function necessarily change, albeit in a complex way. In The Language of New Media, Manovich asks:

...what happens to cinema's indexical identity if it is now possible to generate photorealistic scenes entirely in a computer using 3-D computer animation; to modify individual frames or whole scenes with the help of a digital paint program; to cut, bend, stretch and stitch digitized film images into something which has perfect photographic credibility, although it was never actually filmed? (295)

Because the photograph, audio recordings, texts, video and all other media processed by the computer are transcoded variable media, they are much more easily modified both by humans and by algorithmic processes. The net effect of digitization for cinema may actually be an end to the concept of cinema as indexical media technology and, in a way, a return to prior art practices. Manovich argues that “the manual construction of images in digital cinema represents a return to nineteenth century pre-cinematic practices, when images were hand-painted and hand-animated” and that “cinema can no longer be clearly distinguished from animation. It is no longer an indexical media technology but, rather, a sub-genre of painting” (“Digital Cinema” 3). As we discover whenever we consider genre in digital media, clear distinctions between drawn image, painted image, photographic image, and generated image collapse in an environment where all images are processable and malleable.

Even as we note this collapse of genres, we must however also acknowledge that the photographic image has never been more ubiquitous than it is today. If we cannot blindly trust the veracity of any given image, even as we reside in a “post-truth society” filled with fake news and ideologically contingent media, the photographic image is also today more than ever before an instrument of control. We have never been so thoroughly surveilled or such willing participants in our surveillance. Our images, our bodies and our faces, are not only read and processed by other human beings, but by various technological systems. With every image we post to a social network, every selfie we take, every Snapchat story we share, we contribute to a massively interconnected surveillance engine, diffusely accessed and controlled. The state is one participant in this continuous surveillance, certainly, but so are we as social networkers. We post and we say to our friends, networks, and agents we are not even aware of: “Monitor me! Notice and record my activities!” One might even argue that collective surveillance has become our primary mode of social interaction on the network. We watch our friends and we watch our friends watching us. There is, of course, a difference between being monitored by a hidden security camera and posting a selfie to Facebook from a hike to the Grand Canyon or a drunken escapade in the campus quad, in that there is some degree of agency in our social network activity. We can control aspects of our surveillance by sharing our perception or at least our desired perceptions of our own experience. But that control is always limited. The fact that it is useful to me to share images, that I find it rewarding in some sense, a fulfilling activity that increases my sense of well-being and connection to distant family and friends, does not obviate the fact that other human and non-human actors are putting those same images to other uses than those I intend. At the same time as I am sharing pictures of my daughter’s birthday party with family and friends, I may also be helping Facebook to identify trends in shopping patterns, or helping Google to train its face-recognition algorithms, or helping the NSA to keep tabs on me just in case I should ever fall out of line. The transcoded image is no longer only seen by humans, nor is it even only “seen” at all—it is instead yet another form of data, another entry in a perpetually updated interconnected database of databases, continuously harvested and reprocessed by agents beyond our horizon of knowing.

There is a justifiable sense of paranoia to our interactions with a global network that not only provides us with ways of sharing and manipulating our data that would have largely been inconceivable even a decade ago but also uses that data in ways that are not transparent to us as interactors. We do not know what is happening to our data beyond what the platforms we send our data to feed back to us. But make no mistake, the platforms are giving a lot back to us—they are giving us things we did not even know we wanted until the platforms started giving them to us. I remember when video-chatting services like Skype were science fiction, when the closest thing we knew to an iPhone was the tricorder on Star Trek. The tricorder[2]? We have that now! It’s a smartphone! Even something as simple and now commonplace as a collaboratively written document, a Google doc, a text that lives in the purportedly transcendental social data space of “the cloud” would have seemed wildly futuristic only a couple of decades ago.

And now I can click a button and have my face biometrically mapped with dog or alien or clown features and instantaneously videocast on Snapchat. That’s wonderful, it’s astounding! There is a supercomputer in the palm of my hand, and it enables me to send my class a video of myself discussing media theory while my face moves with the bizarre visage of a basset hound! It’s a brave new world, my fellow puppies!

Glitch and the New Aesthetic


Faceswapping while vaping: 21st Century spirit photography on Snapchat.

The above image of “faceswapping while vaping” (21st century spirit photography?) is one of thousands reposted on James Bridle’s Tumblr feed The New Aesthetic. Bridle began gathering images on this Tumblr feed in 2011, and during 2012 the collection of images became a kind of rallying point for critical consideration of images produced not only as a result of the variability and malleability of digital media, but also of the effects of machine vision on the production of new types of aesthetic artifacts. Bridle describes the project somewhat vaguely on his “About” page for the Tumblr feed:

Since May 2011 I have been collecting material which points towards new ways of seeing the world, an echo of the society, technology, politics and people that co-produce them.

The New Aesthetic is not a movement, it is not a thing which can be done. It is a series of artefacts of the heterogeneous network, which recognises differences, the gaps in our distant but overlapping realities.

The most compelling images in this collection, and the fulcrum for the critical attention garnered by the conception of the New Aesthetic, are those which most clearly illustrate the conception of “overlapping realities” in the sense that as humans we have aesthetic responses to images that are produced as a result of machines observing and processing the world. The overlapping realities concerned are those of human intelligence and aesthetic sensibility with those of artificial intelligence and what might be understood as algorithmic and sometimes accidental aesthetics. Computational processes result in images that may serve an intended function of the system, or may be a tertiary result of the system, or may be produced as a result of a flaw in the system, a glitch.

The term “glitch” apparently originates in the German and Yiddish word glitschen, to slip. It was adapted during the 1940s and 50s as a word for an error by the radio and television broadcast industries. In 1959 Sponsor magazine described glitch as “slang for the 'momentary jiggle' that occurs at the editing point if the sync pulses don't match exactly in the splice” (Zimmer). During the 1960s it was adopted during the Mercury space program as a term for “a spike or change in voltage in an electrical circuit” and by extension to any noticeable electronic problem. In recent years, glitch has in particular been used to discuss artifacts (typically visual artifacts) produced by errors in computer software or hardware. An artistic hacking aesthetic has even developed around  the purposeful production of glitch effects, for example by deleting or changing lines of the hexadecimal code of JPEG images, or by physically cutting some of the wires in VGA cables to change the images carried through them.

After witnessing a panel on the New Aesthetic at the 2012 South by Southwest Festival, Bruce Sterling wrote a long critical piece for Wired on what he saw and heard. He noted that the New Aesthetic was illustrative of a particular kind of cultural moment, that “this is one of those moments when the art world sidles over toward a visual technology and tries to get all metaphysical.” Sterling is critical of the New Aesthetic approach, not because he thinks the idea invalid, but because he perceives the collection of objects curated by Bridle as too diffusely heterogeneous, not so much a defined aesthetic program as a mish-mash of images emerging from contemporary digital technology:

a heap of eye-catching curiosities don’t constitute a compelling worldview. Look at all of them: Information visualization. Satellite views. Parametric architecture. Surveillance cameras. Digital image processing. Data-mashed video frames. Glitches and corruption artifacts. Voxelated 3D pixels in real-world geometries. Dazzle camou. Augments. Render ghosts. And, last and least, nostalgic retro 8bit graphics from the 1980s.

Sterling points out that there is a problem in the fact that “these cats don’t herd together.” He sees the New Aesthetic as largely being a “design-fiction” that lyricizes “machine vision” without ever precisely defining what machine vision might entail—pixelated camouflage, surveillance cameras, and 8bit video game graphics are after all very different types of things. Sterling suggests that “a sincere New Aesthetic would be a valiant, comprehensive effort to truly and sincerely engage with machine-generated imagery—not as a freak-show, a metaphor or a stimulus to the imagination—but *as it exists.*” Although the images are produced by technology, Sterling sees this “ain't-it-cool” attitude towards these images as inauthentic and in some sense naive. The core point from Sterling’s essay worth considering more thoroughly is this:

Our human, aesthetic reaction to the imagery generated by our machines is our own human problem. We are the responsible parties there. We can program robots and digital devices to generate images and spew images at our eyeballs. We can’t legitimately ask them to tell us how to react to that.

We might reframe Mark Amerika’s anthropomorphic assertion in his performances and exhibitions of the “Museum of Glitch Aesthetics” that “Glitch is the soul in the machine.” Glitch is not the soul in the machine, but the soul that we see in the machine.

The aesthetic of the New Aesthetic is not determined by any given system, but is something that we construct in response to and with the new inputs that are fed to us. So one problem, challenge, and opportunity presented to contemporary new media artists is how to encounter the artifacts of machine-produced images on an aesthetic basis. As a writer, I’m thinking about how these types of images, and the computational processes they entail, might provide us with new poetic opportunities, new materials and environments for digital narrative.

The images I’ll discuss, which I have made in collaboration with software running on my iPhone (and posted to the amorphous cloud) over the past few years, are for the most part the same type of Wunderkammer cabinet-of-curated-curiosity items that characterize the New Aesthetic. But they might begin to bring me closer to an understanding of how the supercomputers in our pockets can serve as unconscious collaborators and generators of new material and environments for digital narrative.

I focus here on two types of images I have been producing habitually for the past several years, which I have not completely figured out how I will use in electronic literature projects: horizontal panoramic photos and 360° panoramas—also known as photospheres. The process involved in producing each of these types of artifacts with a smartphone differs, and in each case involves strange bodily interaction with the device, as well as complex algorithmic manipulation of the image by software, aspects of which are entirely beyond the control of the photographer. So there are complex and strange feedback loops involved in the production of these images. They are also remarkable for the high incidence of visual glitches, half-captured images, artifacts, etc. present in the output images. When making panoramas many photographers seek a kind of perfection in the image—for example, to capture a mimetic representation of a serene sunset over a mountainscape—which might bring the viewer closer to a sense of “being there” than a conventional photograph. I find these glitched, flawed, imperfect images, however, to be much more compelling than “perfect” mimetic panoramas in the sense that they provide representations of the present moment of an aesthetic imaginary that is shared between humans and machines.

In his “Manifesto for a Theory on the New Aesthetic” Curt Cloninger addresses the strangeness of images produced as a result of collaboration between humans and algorithms:

New Aesthetic images are uncanny (unheimlich, un-homelike). If NA images were totally familiar, we would read them as family photos. (They are our new family photos). We recognise ourselves in NA images, but also something other than ourselves: or rather, still ourselves—but ourselves complicated, enmeshed, othered.

While there is a certain satisfaction to be had in capturing a sublime landscape in precise photographic detail, these uncanny images offer something else, inviting a different kind of aesthetic fascination that has a lot to do with the sense of “othering” that Cloninger mentions. For me the appeal is not that they perfectly capture and allow me to see again and share an experience that I actually had, but instead that they allow me to see and share an experience of a time and place that I never had, even though I was present in that time and place and was an agent in the production of the resulting image.

The embodied, techno-temporal situation of iPhone panoramas and Google photospheres

Although even a simple snapshot taken with any contemporary digital camera involves a feedback loop between a human operator and a complex algorithmic process, most experiences of digital photography tend towards mimesis of a deceptively simple variety. I see something that I want to capture. I take out a camera or a phone. I touch a button. The device focuses for me, post-processes the image for me, and there I have it, a high resolution capture of an experience that I had, ready to share and transmit. In other words, I pretty much know what to expect from the experience because what I attempt to capture is something that I have seen and chosen to capture. The reason that these panoramic images delight and challenge me is that even as I capture them, I do not know what to expect of them until a stitching algorithm finishes assembling them. The process of taking these types of images is similar to running a poetry generator—while I may have a sense of the variables I have provided the program and perhaps even the algorithmic process that will manipulate those variables, I don’t know what poems the generator will produce when I run the program. Vito Campanelli suggests that “...the crucial element of the New Aesthetic should be identified precisely as the sublimity of the images produced by the innovative forms of collaboration between humans and machines enabled by digital media” (260). These images result from precisely this form of collaboration, although in this case neither the human nor the machine actor has more agency than the other in the production of the image, and in fact there is a third vector here in the given circumstances of the environment during the time the image is taken. So I would describe the production of these images as emerging from a triad of human, processing (or perhaps in N. Katherine Hayles’s terms, cognizing) machine, and spatio-temporal environment. This last aspect of the image is importantly aleatory. While there is always an element of chance involved in the production of any given photographic image, this aspect is heightened in the capture of both normal panoramic images and 360° photospheres because of the fact that these images are captured on an extended time scale.

Christian Ulrik Andersen and Søren Bro Pold describe the New Aesthetic as “a description of computational practices that are often caused by misuse and failure, where we see ‘an eruption of the digital into the physical’ (Sterling 2012) and ‘a grain of computation’ (Jones 2011)” (272). I would point to three different sources of the aberrations that occur in panoramas and 360° photosphere images taken with the iPhone or other contemporary smartphones:

  1. The spatio-temporal situation of the image capture;
  2. The accidental or purposeful movements of the human photographer;
  3. Bugs or limitations in the hardware and software of the smartphone, local software, or cloud-based application used to create the image.

Mark B.N. Hansen’s Feed-Forward: On the Future of Twenty-first Century Media centers on the idea that contemporary computational technologies have the capability to process certain types of information faster than the human sensory apparatus can apprehend them. Hansen writes:

If these media systems help us—embodied, minded, and enworlded macroscale beings that we are—to access and act on the microtemporalities of experience, they do so precisely and only because they bypass consciousness and embodiment, which is really to say because they bypass the limitations of consciousness and embodiment. (46)

Hansen uses the term “feed-forward” rather than “feedback” because in most of his examples, computational systems are not actually responding to and reprocessing information provided consciously by a human actor, but are themselves using sensors to see the world in a way that a human could not process it to begin with, processing that information, and then providing it packaged to human consciousness for further action. An advanced example of this might be an airport security system that scans a crowd for faces of persons of interest, or for postures indicative of suspicious behavior, and then alerts security guards to the location of those specific people. While the guard may have just seen faces in a crowd, the system regards each individual face, each individual body, as a collection of data to be sensed, processed, and scanned against a database of known profiles. It then feeds that information forward to human actors, bypassing their own sensory apparatus. The panoramas and 360 images I discuss here provide more rudimentary but no less valid examples of a feed-forward phenomenon: they provide a mechanism for gathering and processing images with a phone that I could not otherwise see. Although I am an agent in choosing a moment and a location, and involved in the embodied experience of gathering the data, the system senses and processes an image that I could not see without it. It is not a matter of “taking” a shot of something I have seen. It is a matter of interacting with and providing visual information to a system that will see the world in a way that I could not within the limitations of my own sensory apparatus.

Tuscany Panorama, 13.07.2015. Full size image:  https://goo.gl/photos/mKWJkpbUNN5zZkhp8

Let’s look at a few examples of different types of panoramic images and consider their effects. First consider two examples of what might be described as “normal” panoramas. The first is a picture of my daughter taken at sunset in Tuscany during a summer holiday. In my view this is representative of the usual effect we seek in typical panoramas. By offering an extended horizontal frame, the photo can be said to achieve a better sense of the landscape, and of the environment in which it is experienced, than a typical framing of the same scene would offer. And while the girl is at the center of the image and in a sense focalizes the image, we are not as focused on her as we would be with a normal crop. The human is part of the environment of the landscape, but not its main element. This image is clearly intended to capture a different sense of immersion within a landscape than a normally dimensioned photograph, but one that remains tied to a conventional sense of place.

Amsterdam City Center Panorama, 23.07.2013. Full size image: https://goo.gl/photos/ERS88bUDEx75vZ8UA

The second image, taken in the center of Amsterdam, also looks fairly conventional, but for me demonstrates another effect of panoramas and their relationship to the device used to produce them. While the image is a cityscape, our attention is drawn more to the human activity in the image as an aspect of the landscape than for example to the architecture or the fountain at the center of the image. As opposed to a panorama of an open landscape, the dimensions of the city square are warped and flattened. There is however a different social dimension to the photograph than there would be if it were a normally framed photo. When we look closely we see humans enmeshed in the dramas of their individual lives: the middle-aged couple on the left of the image—who seem to be the only people conscious of the photography taking place—are enjoying an intimate caress. At least five other people in the scene are also simultaneously photographing the same cityscape from different angles. Outside of a coffeeshop on the right side of the image, a group of youths are lying in the grass, perhaps having just sampled the wares. The fountain at the center of the image offers us spouting water frozen in time. The only artifacts in the image that indicate it was actually stitched through an algorithmic process are the overlapping Gs in the booking.com logo in the background and the slightly distorted face of a man in a blue shirt in the foreground. In the panorama we see human activity within the environment that we would not normally see merely by being there. When I am within the scene as a participant-observer, even as I take out my phone to photograph it, I am typically not looking at particular people or consciously registering their activities. I simply take out my phone, hold it vertically, press a button and sweep from left to right, trying to keep the image horizontally steady as I record it—a process that is represented in the interface as keeping an arrow moving on a line. The phenomenological experience of taking one of these panoramas is in a way more like driving a car and trying to stay between the lines than it is like seeing. I don’t actually see the people in the image until the software processes it and I look at it afterwards.

In “Taking a Scroll: Text, Image and the Construction of Meaning in a Digital Panorama,” Roderick Coover describes the conventional panorama as a “collection of moments seamlessly combined; it is not one moment.” Panoramas made with conventional digital photography involve taking a number of different shots across a panoramic field and then post-processing and stitching them together using software. A distinction between this process and that of panoramas made with the iPhone camera app or other similar applications is that for the photographer, the process of making the image is much more seamless and takes place over a different time scale. Shooting a panorama with a smartphone is a matter of holding the phone, pressing a button and sweeping your arm across a visual field. It takes perhaps 5-10 seconds, while the photography involved in conventional panoramas might take one or two minutes. The more significant difference is that the image is processed and stitched nearly instantaneously by the software internal to the phone itself. The software takes a stack of images as the user sweeps the camera, using the camera’s motion sensors to align images in relation to each other, overlapping and blurring the images together. Rather than spending painstaking hours aligning a panorama with desktop software, the human user perceives the process as only slightly different from taking a normal photograph. You shoot it, and then you’re done. Nevertheless, a great deal can happen in the physical world in 5-10 seconds, much more than occurs within the fraction of a second it takes for a conventional camera to open and close a shutter. So whenever the panoramic images are of scenes where people or objects are moving through space, aberrations and artifacts will appear that reflect that temporal dimension.
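To give a rough sense of what stitching software of this kind is doing, the sketch below uses OpenCV’s general-purpose image stitcher to merge a handful of overlapping frames. This is a minimal illustration of the general technique (matching features across overlapping frames, warping them into alignment, and blending the seams), not the iPhone’s actual pipeline; the frame file names are placeholders.

```python
# A minimal sketch of panorama stitching with OpenCV's high-level Stitcher.
# Illustrates the general technique only, not Apple's or Google's pipeline.
import cv2

# Hypothetical frames captured as the camera sweeps from left to right.
frame_files = ["sweep_01.jpg", "sweep_02.jpg", "sweep_03.jpg", "sweep_04.jpg"]
frames = [cv2.imread(f) for f in frame_files]

# The stitcher finds matching features in overlapping frames, estimates how
# each frame must be warped, and blends the seams into one wide image.
stitcher = cv2.Stitcher_create()  # defaults to panorama mode
status, panorama = stitcher.stitch(frames)

if status == 0:  # cv2.Stitcher_OK
    cv2.imwrite("panorama.jpg", panorama)
else:
    # Anything moving between frames, or an unsteady sweep, can make matching
    # fail or produce exactly the kinds of seams and ghosts described above.
    print("Stitching failed with status", status)
```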

Amsterdam Bicycle Panorama 1, 23.07.2013. Full size image: https://photos.app.goo.gl/BWd47NGf40jm6kSy1

Amsterdam Bicycle Panorama 2: Look out for the bicycle, 22.07.2013. Full size image: https://goo.gl/photos/9uKkN3e2B2eQJLUJA

Amsterdam Bicycle Panorama 3: Curiosities, 22.07.2013. Full size image: https://goo.gl/photos/FKcV5appmeC6nDJY6

The above shots of bicyclists in Amsterdam manifest a glitch in the iPhone panorama system. While individual people usually move slowly enough that they might appear normal in a panorama image, and automobiles move quickly enough that they might appear only as a small band within the image, bicycles move at a speed in-between, so that a moving bicycle might appear a number of times in different states within the image. The effect is that we see the bicycles and their riders differently than we would either through our own sensory experience or through a normal photograph. The image is capturing movement and temporal change, not as a moving image, but as a series of changes of state within a two-dimensional image.

Amsterdam Bicycle Panorama 4: Disembodied leg, 23.07.2013. Full size image: https://goo.gl/photos/Bua2UvE8EZDW84ww6

Multiple bicyclists passed by a café at various speeds, and this affected the way that each was captured within the image. A slow-moving bicyclist wearing a white shirt appears in a dozen slices that track his motion through a single rotation of the wheels, another wearing a purple shirt appears whole in the distance but only as a flash of purple in the near foreground, while one woman passed so quickly that she only appears in the image as a disembodied leg.


Marcel Duchamp’s Nude Descending a Staircase No. 2. (1912).

These glitch panoramas call to mind the work of Cubists and Futurists, such as Marcel Duchamp’s Nude Descending a Staircase No. 2, or Eadweard Muybridge’s late 19th-century photographic motion studies. The glitch panoramas are similar in portraying motion through 2D still images, but different in that these motion studies manifest not as a result of artistic intention or even as an intended feature of a computational system, but through the output of a system arbitrarily confronting motion through space and time. This feature of capturing motion within a 2D image is a glitch in the original sense of the term, the result of an anomaly that occurs when layers are spliced together.

Google Street View and Google Earth images as curated digital art

In recent years, Google Maps, Google Street View, and Google Earth have effectively changed the way that Google’s many users understand, navigate and encounter the physical world. If I am planning a journey, I use Google Maps to chart the best route, and after I have rented the car, I use the same software to guide me to my destination. If I am renting a house, I will use Google Street View to examine the property and navigate the neighborhood, as if I were walking through it. And if I am talking with my children about something I have seen in another part of the world, only rarely will I pull out a paper map. Instead I use Google Earth to fly there and show them the place.

In his essay “New Aesthetic in the Perspective of Social Photography,” Vito Campanelli discusses Jon Rafman’s The Nine Eyes of Google Street View. Rafman’s artistic practice is to roam Street View looking for unusual images, screen capture them, and save them to an archive. Campanelli describes the New Aestheticist as “a new figure in between those of the artist and curator, characterized by the capacity to aggregate aesthetic materials” whose function is to “derive value from an image produced by machinic entities and to ‘ascribe’ an aesthetic to it” (264). Rafman says that he opposes Google Street View’s presentation of the world as “observed by the detached gaze of an indifferent Being” and notes that Google cameras “witness but do not act in history.” He says that his task is that of “restoring the human gaze within Street Views” (“IMG MGT”). Rafman’s work is an intervention. Some of the images he has collected, initially gathered by Google’s Street View cars, reveal shocking moments of human vulnerability: one woman dragging another through the street by the hair near an apartment complex, a man holding a woman and threatening her, a hooded youth about to launch a molotov cocktail. Others are moments of strange beauty: two identical twins gazing up at a bridge from the waterside, farm workers in a field of roses. The artist’s work here is to search within this robotic photography archive to humanize the world as depicted by Google Street View through the selection not of places but of moments that reveal their temporal nature in a world shaped by human activity.

Images from Jon Rafman’s The Nine Eyes of Google Street View.

Rafman’s artistic/curatorial practice is similar to that of Clement Valla’s project Postcards from Google Earth—although Valla is looking for a different type of image. Valla has gathered a collection of images which include impossible deformities in the representation of the world as shown by Google Earth. In “The Universal Texture,” an essay Valla published on Rhizome about the project, he argues that these misrepresentations of the world are nevertheless not errors, but “the absolute logical result of the system,” as they demonstrate the truthful outcome of the algorithmic process through which Google Earth processes a tapestry of satellite imagery and data. Google Earth pulls not from one absolutely coherent set of satellite images or maps, but from a continuously updating set of diverse data sources. Google’s patented “Universal Texture” process maps and blends the visual information from these various sources as textures onto a 3D model of the whole planet. So the strange images that Valla collects are anomalies, but not errors. They reveal that in spite of the illusion that we are flying over a photographic representation of the planet as we navigate the space of Google Earth, we are actually encountering images that have been assembled, pastiched and “filled in” by a computational system. Although we encounter these images as if they were photographs, Valla asserts that what we see when we navigate Google Earth is “essentially a database disguised as a photographic representation.”


An image from Clement Valla’s Postcards from Google Earth project.

Projects such as Rafman’s and Valla’s lead us to consider a new role for the artist in the New Aesthetic. Patrick Lichty points out that there are important questions for artists working with digital media about “agency and autonomy, and how much control the New Aestheticist gets in the execution of their process.” While traditional artists think of technologies as tools and work with media as material (in the sense that the painter has the brush, the paint, and the canvas), in the case of these projects the traditional role of the artist is taken by the technological apparatus itself. And yet, of course, Rafman’s and Valla’s projects have been exhibited in galleries and museums as works of art. The artist has not made the images; the artist has seen and gathered the images produced by the technological system, and recontextualized them in a human aesthetic context.

Human computation and human aesthetic sensibilities in Google Street View photospheres

The corpus of images provided by Google Street View has two components: the first consists of the images produced by Google’s massive fleet of Street View cars, trekkers, trolleys, boats and other vehicles, each with a nine-camera-and-laser rig, roaming the world and capturing views of the world in 360 and in 3D. The other component of the corpus consists of images contributed by people like me, who have the Google Street View app on their phones. Since 2010, Google has invited users to contribute panoramic images, and since 2013, Google has integrated 360 image capture software into Street View. So in addition to the images continuously gathered by Google’s fleet, Street View now includes a massively crowdsourced component of images produced by volunteer photographers using the Street View app or other devices to capture and contribute panoramic imagery. Users who contribute these panoramas with the app pause at a given location and take about 20 pictures as they are directed by a series of orange dots on the screen, moving around horizontally and up and down to capture many facets of the same scene. These images are then automatically stitched together by software into a 360 format which presents the collected images as a single sphere.

iPhone screenshot demonstrating the Google Streetview 360 camera interface. The photographer rotates in a circle and tilts the phone up and down, following orange dots to gather the imagery required to produce the sphere.
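Google does not publish the internals of the Street View app, but the standard way to store a full sphere as a single flat file is an equirectangular projection, in which every viewing direction maps to one pixel. The sketch below shows that mapping in its simplest form; the function name and image dimensions are illustrative, not taken from the app.

```python
# Sketch: mapping a viewing direction to a pixel in an equirectangular image,
# the standard flat format for photospheres. Illustrative only.

def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) in an equirectangular panorama.

    yaw_deg:   rotation around the vertical axis, -180..180, 0 = image centre
    pitch_deg: elevation, -90 (straight down) .. 90 (straight up)
    """
    x = (yaw_deg + 180.0) / 360.0 * width    # longitude spans the full width
    y = (90.0 - pitch_deg) / 180.0 * height  # latitude spans the full height
    return x, y

# Each orange dot the app asks the photographer to hit corresponds roughly to
# one such direction; the twenty or so captured frames are warped so that
# together they cover every pixel of the final sphere.
print(direction_to_pixel(0, 0, 4096, 2048))    # centre of the flattened image
print(direction_to_pixel(90, 45, 4096, 2048))  # up and to the right
```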

Jill Walker Rettberg describes her experience of creating photospheres using the Street View software as one of becoming automated: “I found myself adjusting my motions to follow the dots as precisely as I could, replicating a flawed human version of the perfect motions of the Google Street view robots I was copying.” She also conveys her sense that the software, through its stitching algorithm, attempts to remove the human photographer from the image: “The photographer is invisible. The human must become like a machine to make a sphere,” she writes. “We have become sensors for the machines.”

Contributed Google Street View 360 images represent an interface between human and machine vision and are an excellent example of a type of process known as “human computation.” Human computation has been defined as “a paradigm for utilizing human processing power to solve problems that computers cannot solve.” Quinn and Bederson (2011) further describe a consensus that human computation involves problems that fit the general paradigm of computation, and as such might someday be solvable by computers, and in which the human participation is directed by the computational system or process[3]. Both the Google Street View car fleet and contributed panoramas involve human computation—the cars are piloted by people (for now). But whereas the cars are highly automated (the drivers simply follow a prescribed route to map the territory while the nine-camera rig gathers visual data), the individually contributed panoramas involve a great deal more human aesthetic interaction. The individual photographer chooses a particular location out of a specific, personal, human interest. While in some cases this interest may be commercial (if the photographers wish for example to get an image of their place of business situated on Google Maps), for the most part we can assume that the photographers chose places, and moments, they consider important, evocative, or beautiful. So they are human-situated in a way that the images gathered more systematically by the Street View cars are not.

360 image of cherry blossoms in Bergen at Lille Lungegårdsvannet, 11.05.2016. See in 360: http://bit.ly/2p9jGFZ

Since 2014, I have produced several hundred of these 360 panorama images and contributed about 150 of them to Google Street View. Compared to capturing ordinary photographs, the process of making these is labor-intensive in a way that is markedly physical—it can take about five minutes and many contortions of the body to produce a single sphere. While the images produced are almost always intriguing, they are never “perfect” or even as close to perfect as the images produced by the Google Street View cars—or by any of the number of 360 cameras which are currently emerging on the market. Smartphone 360 panoramas need to be understood as transitional media. They will be surpassed by better cameras with better software that capture images more mimetically and provide a better sense of depth and a more seamless sense of visual immersion. But in many ways, these flawed, human-centered images are more aesthetically satisfying than the images produced by the planned and systematized Google Street View mapping of any given city. Because these images are produced in a radically imperfect way over what is in photographic terms a very long period of time, and stitched together from twenty or so different images, they embed within them a temporal dimension. This becomes particularly notable when there are images of people in the photos, for example walking down a street. When I create spheres of cityscapes I often end up with a 360 image that contains within it multiple images of the same person. I find these images particularly fascinating: although the image is 2D (albeit a strange sort of 2D which in its spherical presentation has a kind of 3D effect) and is static, the presence of these bodies or parts of bodies or multiple iterations of the same body has the same effect as the bicycle panoramas discussed earlier: presenting motion and human movement through time.

360 panorama of Roderick Coover at Jardin de Tuileries, Paris, 27.11.2014. See in 360: https://goo.gl/photos/z75EsaT9XjP4Yggo9

360 panorama of woman with pink umbrella, Byparken, Bergen, 31.10.2014. See in 360: https://goo.gl/photos/pMqKCcMvB3bg1Yiw8

While the software attempts to stitch the body of the photographer out of the photograph, the net effect of the photosphere is in fact literally anthropocentric: the human perspective is in the direct center of the image. When flattened, these images have a balanced and symmetrical appearance, because the human point of view is always in the center of any given scene, and while the software is good at blocking out images of the photographer’s feet on the ground, in most daylight situations the shadow of the photographer appears. So, too, do other people. While the “perfect” or glitch-free image of a given place would likely not have people in it, photospheres taken by individuals, for example in cities, most often do—albeit people who are ghosted, partial, fractured and replicated by the passage of time and the stitching algorithm. Google blurs images of people from the official photos taken by its service, but those uploaded by contributors are actually copyrighted by the contributors themselves, and the people in the images remain present as people with eyes and faces unless the image has been modified by the photographer. The human is not automatically obfuscated. The New Aesthetic admixture of human input, algorithmic manipulation, and chance results in unintentionally cubist, futurist, surrealist depictions of human beings going about their business in the lifeworld of the city.

Temple Street Night Market, Hong Kong, 17.12.2016. See in 360: https://goo.gl/photos/jyxmL4QW5tgvn5cC6

360 panoramas within the Google media ecology

When I am travelling or find myself in a particularly visually striking place, I’ll often decide to take a photograph with the phone that is always in my pocket. I find myself looking at places I might photograph differently than I used to. I look at the scene in front of me and consider whether it would be better captured as a conventional photograph or as a horizontal panorama. Then I look behind me, and at the ground, and at the sky, and consider whether it is instead worth the investment of time to capture the scene as a sphere.

I will use many of these images in my own art projects, but I have also uploaded the majority of them to Google Street View. Although any images you capture with the Street View camera are saved to the camera roll and are not published to Google Street View by default, the program’s path-of-least-resistance is to share the images with Google immediately after they are rendered. You are reminded that no one else sees the photospheres that remain in the “private” section of the app until you publish them to Google.

Screenshot of Google Street View App.

The photospheres actually have a number of distinct medial forms depending on what software and device they are viewed with. If they are opened with a standard photo viewer (or in a document, as in this essay) they will display as a flattened rectangular image. Within the Google Street View app and the version of Street View connected to Google Maps, they render as spherical environments which you can navigate by rotating the image. If the same images are uploaded to Facebook, they can be navigated by moving the phone, as if the phone were a window you were looking through into the 360 space. The spheres can also be rendered in a form suitable for the Google Cardboard head-mounted display device. In this case, the sphere is rendered as two images, one for each eye. The user navigates the image by moving her head and turning around, to see the spherical image from all angles within the Cardboard viewer. While the images are not truly stereoscopic in the same way as images that are rendered to show depth, they do have the effect of immersing the viewer within the image. In Ingrid Hoelzl and Rémi Marie’s terms, the image here is not so much object as “soft image”—image data that manifests differently depending on the platform in which it is read.
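What a Cardboard-style viewer does for each eye can be approximated as cutting a flat, window-like view out of the flattened sphere for the current head orientation. The sketch below does this with NumPy and Pillow; it is a geometric approximation under stated assumptions (a single equirectangular source image, a hypothetical file name, an arbitrary field of view), not the code of any actual viewer.

```python
# Sketch: extracting a flat, window-like view from an equirectangular
# photosphere for a given head orientation, roughly what a Cardboard-style
# viewer does for each eye as the head turns. Illustrative only.
import numpy as np
from PIL import Image

def rectilinear_view(pano, yaw_deg, pitch_deg, fov_deg=90, out_size=512):
    h, w, _ = pano.shape
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)  # focal length in pixels

    # Build a ray through each output pixel (the virtual camera looks down +z).
    u, v = np.meshgrid(np.arange(out_size) - out_size / 2,
                       np.arange(out_size) - out_size / 2)
    x, y, z = u, v, np.full_like(u, f, dtype=float)

    # Rotate the rays by the viewer's pitch (around x), then yaw (around y).
    pitch, yaw = np.radians(pitch_deg), np.radians(yaw_deg)
    y, z = y * np.cos(pitch) - z * np.sin(pitch), y * np.sin(pitch) + z * np.cos(pitch)
    x, z = x * np.cos(yaw) + z * np.sin(yaw), -x * np.sin(yaw) + z * np.cos(yaw)

    # Convert each ray to longitude/latitude, then to panorama pixel coordinates.
    lon = np.arctan2(x, z)                      # -pi .. pi
    lat = np.arctan2(y, np.sqrt(x**2 + z**2))   # -pi/2 .. pi/2
    px = ((lon / np.pi + 1) / 2 * (w - 1)).astype(int)
    py = ((lat / (np.pi / 2) + 1) / 2 * (h - 1)).astype(int)
    return pano[py, px]

pano = np.array(Image.open("photosphere.jpg"))           # hypothetical file
view = rectilinear_view(pano, yaw_deg=30, pitch_deg=10)  # look right, slightly up
Image.fromarray(view).save("view.jpg")
```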

View of a photosphere rendered in Cardboard view.

The situation of viewing photospheres with a Cardboard-style (View-Master) head-mounted display. The user opens a viewer application and then places her phone into the viewer, rotating head and body to view the immersive image.

Because I find photospheres strange and complex as digital objects, and beautiful in their strangeness, I have made a lot of them, and about 150 of them have been “approved” by Google for presentation on Google Street View. Once they have been uploaded to the Google ecosystem, their function as “soft image” changes again. From Google Maps, users can click on a stick figure icon to see Street View images, including contributed photospheres.

Google map with Street View images. Google-gathered Street View images, which follow the roads driven by their fleet, appear as lines. Human-gathered photospheres appear as circles, often in areas inaccessible by car.

These human-generated images provide a pool of imagery that is much more diverse and aesthetically compelling than the Street View images themselves, which seem more intended to map the territory than to see the place.

Google, perhaps matched only by Facebook in the race to be the planet’s consummate purveyor of user-generated content, is quite skilled at motivating users to contribute materials without paying for their use. The Street View app, which includes the camera, galleries of contributed images to explore, and phone and Cardboard interface viewers, also connects to the user’s Google profile. Shortly after I uploaded my first sphere, I started getting weekly emails from Google, thanking me for my contributions, encouraging me to contribute more, and, probably most importantly from a motivational standpoint, providing me with statistics showing how many views each of my images had garnered in the Google ecosystem. These statistics are also displayed in the user profile of the app itself. And the numbers are actually quite staggering. As of today, the 204 photospheres I have uploaded to Street View have been viewed 3.5 million times. That’s a lot of views. I have been producing digital narratives and artworks, poems, novels, films, for about 20 years. A number of them have been moderately successful in the fields in which they operate as cultural objects, but millions? One photograph I posted of the Centre Pompidou has been viewed 368,100 times. I’m pretty sure that means that this image is the most-viewed thing I have made in my life. That is wonderful but also arbitrary and strange and in some ways demoralizing. More people have looked at this image than have read anything I’ve ever written.

Jeu de Paume at Place de la Concorde, Paris, 14.12.2014. Full size image: https://goo.gl/maps/K5zoBchtJcT2

Of course these images, which I consider aesthetically when I produce or view them, become other things when they become embedded within Google Maps. 287,300 views of a photosphere of the Jeu de Paume on Google Maps is not the same as the same number of viewers encountering a photo physically exhibited inside the Jeu de Paume (which is, after all, a photography museum).

In Google Maps, the photosphere is still an image, but it also becomes a mode of orientation, a virtual point of interest, a curiosity in a very, very large cabinet of them. The images I upload also become “operational” (Hoelzl and Marie 2015, 109) in the Google ecosystem in other ways. As soon as an image is uploaded, Google asks me to associate it with a place suggested from its database, based on named places near the geolocation of the image.

If I cannot find a specific place to associate the image with, Google will suggest that I add the missing place to Google Maps. In this we can see another way in which the image becomes operative for Google. Google uses the image to engage its users in its continuous process of locative epistemology, a somewhat ambitious project with a goal of knowing, seeing, and naming every public space in the world.

Google is using me much more than I am using Google. I see the problem with that. And yet I need to confess that I don’t really mind. I don’t mind helping Google, or that I have become its human agent. I actually look forward to those emails from Google telling me that my photos have been viewed another 39,000 times or so this week. That’s just enough, really, to satisfy and motivate me to continue making spheres. Google’s software enables me to make these images that I “own” and enjoy making, and can use for my own purposes, but I’ll never profit from them in the same way or to the same extent that Google does and will in the future. It is an ingenious system of human computation in the service of Google’s capitalist enterprise. Everybody who participates in some way wins, but Google always wins far more than its users.

360 panoramas as poetic environments

I’m interested in how photospheres might function not only within Google, and not only as art objects, but also as environments for digital narrative or poetry. I want to offer a few examples here of tentative experiments in using spheres as environments for digital writing.

In 2014, digital poet Jason Nelson organized a group of digital writers to experiment with “Story Spheres,” a beta test of online software made by an Australian design studio, Grumpy Sailor, and Google Labs. The platform was developed for creating narrative and audio experiences within photosphere images. I worked on the project while I was a visiting professor with the Labex Arts at Paris 8. The developers never really finished the beta (and so I never completely finished the project), but some of the Story Spheres I produced are still accessible. It’s not clear whether Story Spheres will continue to develop beyond the beta. There does not seem to have been much development on the platform since 2015, and there has never been a major public release of the platform, though it seems that Google has cherry-picked some of the features of the platform, such as the ability to link photospheres or to add an audio track, in more limited ways, into Street View and Cardboard Camera.

The major functionalities of Story Spheres not present in Street View are the ability to add audio tracks to the images, and to link between them. These two features add significant new dimensions of multimedia and narrative to the spheres, transforming them into potential storytelling environments.

“Hanging Sphere Garden” in the Centre Pompidou https://www.storyspheres.com/scene/xAFEATe3

Structurally, the Story Spheres environment is essentially the same as that of hypertext fiction. Each sphere image functions as a hypertext node. The image provides an immersive environment. Poetry or narrative can come in either through modifying the image or through audio. A single sphere can include multiple audio tracks which are situated directionally within the image; they can be activated automatically, with the track the viewer is facing heard loudest, or activated in the same way as links, by clicking in the computer application or by focusing on a hotspot in the Cardboard application. In “The Music of Our Spheres” beta project, I experimented with several models of introducing text, narrative, and poetics into the space of the spheres. In the “Hanging Sphere Garden” I took a sphere inside the Centre Pompidou. I embedded two sound tracks, one of a choir singing in a Paris church, and another of a satellite signal I downloaded from NASA’s site. The two sounds are located on either side of the virtual space, so that the mix of the two changes as the viewer rotates within the space. I wrote two poems and then modified the image so that the poems appear on opposite-facing walls within the gallery space.

In “Tuileries Pond” I experimented with using a number of different audio tracks in the same sphere. One of the tracks (ambient sounds from the garden recorded when I photographed the image) loops continuously, while the other tracks are launched by the user’s interaction. The audio tracks included both sounds recorded in Paris (such as an organ grinder a few miles away, or the subway station directly beneath the park) and spoken-word lines of a poem that responded to specific details within the image, pulling them into a kind of diffuse and mysterious narrative.

“Tuileries Pond”: viewers of the piece can trigger individual audio tracks or follow links to other Story Spheres. https://www.storyspheres.com/scene/59EEdZNE

In “Giant,” a sphere photographed in one of the gardens of Versailles, I wrote text into the image based on the myth of Enceladus, about a giant rising from the earth, the theme of the part of the garden where the photo was taken. The myth is considered in light of the persona of Louis XIV, the Sun King, who ordered the construction of this scene in the physical space of the garden. Lines of text are situated in different parts of the image, while one audio track (the rumbling sound of a volcanic eruption) loops in the background.

“Giant” in the gardens of Versailles https://www.storyspheres.com/scene/5ZGDvP5R

The Story Spheres platform seemed (and may well still be) promising, but it also demonstrated a number of limitations. The platform is entirely server-based, so I could only upload content and manipulate it on the Web. I could not download the work, work with it offline, or save a local copy on my own system. This problem became particularly acute when the Story Spheres server crashed about halfway through my work on the project and some of my work was permanently lost. There were other limitations as well—the size of the images is restricted so that they download relatively quickly, which means the images are relatively low-resolution. The icons for links and audio samples cannot be changed by the individual creator, so that aspect of the visual design is limited. Finally there is the matter of how the works can be distributed. Links to the images can be shared, but the works neither flow easily into a network service (as the Street View photospheres do) nor can they be packaged by the individual creator into something that functions outside of the Story Spheres site.

Although I basically gave up on the Story Spheres platform after losing some of my work in the server crash, I searched for other solutions. One could of course code most of these interactions by hand—there are Web standards for displaying spheres, and links and audio can be embedded with JavaScript—but the advantage of Story Spheres was that no coding was required in order to produce a project. Thankfully I ran across the program Pano2VR—software which has many of the same features as Story Spheres, but which runs on the user’s own computer rather than on a server. Pano2VR allows for linking and embedding of directional sound, and can output to HTML5 or Cardboard. I have a couple of projects in progress using this software. One advantage of this platform is that authors own and control the output, so modifications such as creating a new “skin” of icons or embedding JavaScript to introduce new interactive elements are possible. Of course, on the basis of my experience, no artwork I create using this platform will reach an audience as large as that of the images I contribute to Google Maps. The distribution networks of electronic literature are much smaller than those of the Google empire.
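As a point of reference for the hand-coded route mentioned above, the following is a minimal, hypothetical sketch using the open-source A-Frame library (my choice for illustration only; none of the projects described here were built this way, and the file names and scene contents are placeholders). It shows the three ingredients that Story Spheres packaged together: an equirectangular photosphere as the surrounding environment, a positional audio source situated on one side of the sphere, and a clickable hotspot that links to a second sphere.

<!DOCTYPE html>
<html>
  <head>
    <!-- A-Frame (https://aframe.io) provides the WebVR/WebXR scene graph -->
    <script src="https://aframe.io/releases/1.5.0/aframe.min.js"></script>
  </head>
  <body>
    <a-scene>
      <!-- Equirectangular 360 image used as the surrounding sky (placeholder file name) -->
      <a-sky id="sky" src="pompidou-sphere.jpg"></a-sky>

      <!-- Positional audio: heard loudest when the viewer faces this side of the sphere.
           Browsers may block autoplay until the user interacts with the page. -->
      <a-entity sound="src: url(choir.mp3); autoplay: true; loop: true; positional: true"
                position="-5 1.6 0"></a-entity>

      <!-- Hotspot: clicking it (or gazing at it via the cursor) links to another sphere -->
      <a-sphere id="hotspot" radius="0.3" color="#FFC65D" position="0 1.6 -4"></a-sphere>

      <!-- Camera with a gaze/click cursor so hotspots receive click events -->
      <a-camera><a-cursor></a-cursor></a-camera>
    </a-scene>

    <script>
      // When the hotspot is activated, swap in the linked sphere image.
      document.querySelector('#hotspot').addEventListener('click', function () {
        document.querySelector('#sky').setAttribute('src', 'tuileries-sphere.jpg');
      });
    </script>
  </body>
</html>

Even this small sketch makes the appeal of Story Spheres and Pano2VR obvious: everything above has to be typed and maintained by hand, while those platforms offer the same linking and audio placement without requiring any coding at all.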

360 video and VR as documentary media

Virtual reality has been around in various implementations since the 1990s, but it is only in recent years that consumer applications of VR have become affordable and widespread. From rudimentary headsets such as Google Cardboard or Samsung Gear, which are essentially shells of cardboard or plastic made for the insertion of smartphones, to more advanced dedicated devices such as the Oculus Rift or HTC Vive, there is a larger audience for VR now than there has ever been before. While it is unlikely that VR will live up to the marketing hype now accompanying it, there is no question that there will very soon be a large installed user base for VR applications. The technology and entertainment industries are currently testing out what sorts of content will appeal to users. Although substantial resources are being poured into game development for VR, one of the surprises early in this wave of activity has been how much interest has developed in 360 VR as a documentary medium.

The New York Times is at the forefront of VR content development, and features a “daily 360” in its digital edition. In November 2015, the paper distributed one million Google Cardboard headsets to print subscribers with one Sunday edition. The main type of content it shares in the digital edition is short-form documentary, usually 1-3 minutes long, often quite roughly cut. The main advantage of this form of journalism over standard video documentary is that it puts the viewer in situ. Scenes where our focus would typically be on the action in front of us in any case (for example a political speech, a sports event, or a theatrical performance) don’t gain a great deal from 360 video. We could turn our heads around and look behind us, but why would we? Other types of situations (a bustling fish market, a firefight in a war zone, a carnival) that more naturally invite omnidirectional attention are better suited to these formats. Perhaps this is why games are not yet the dominant VR content type. Most contemporary computer games are structured in levels on a rail-like structure—our orientation is set forward, toward the next challenge in front of us.

A number of recent documentary works point to the potential uses of 360 experiences in documentary narrative. 6x9: A Virtual Experience of Solitary Confinement, produced by The Guardian, places the VR headset user in virtual solitary confinement. The cell is a 3D-modeled environment. Facts about solitary confinement and quotations from prisoners who have experienced long-term solitary confinement appear on the walls, and affective simulations of hallucination, isolation, and fear take place. In this case, the 360 experience is used to induce a sense of claustrophobia in the viewer that, at least in theory, creates a sense of empathy with the people who are subjected to this particularly cruel form of modern incarceration.

Screenshot from 6x9: a virtual reality experience of solitary confinement.

In a TED talk, digital artist and VR producer Chris Milk has referred to VR as “the ultimate empathy machine.” While this claim is certainly debatable (Harriet Beecher Stowe’s Uncle Tom’s Cabin served as a pretty good empathy machine, in the form of a book), it is clearly the case that the 360 experience of a stereoscopic head-mounted display produces a distinctly different embodiment, and a different affect, than watching a film in a movie theater. In 2015, Milk worked with Gabo Arora and the UN Human Rights Commission on two VR film projects: Clouds Over Sidra, a 360 documentary made in the Za’atari Refugee Camp in Jordan, home to over 80,000 Syrian refugees, and Waves of Grace, another 360 documentary set in West Point, Liberia, that follows the experience of Decontee Davis, an Ebola survivor who uses her immunity to help others affected by the disease. The claims that these films produce ultimate empathy and understanding of the circumstances of others are somewhat overstated—spending a few minutes in a virtual reality solitary confinement cell or taking a virtual tour of a Syrian refugee camp cannot truly produce a deep understanding of the experience of the solitary prisoner or the refugee. But they do appear to have positive effects. Gabo Arora reported that VR stations manned by fundraisers who showed people the films generated double the donations for UNICEF compared with standard appeals.

For the most part, the first generation of HMD VR narratives have either been “flat” video experiences, in the sense that the images do not have stereoscopic depth, or have been set in 3D modeled environments produced using software platforms such as Blender or Unity. In this sense, the majority of 360 documentary projects have not been “true” VR: they surround the viewer, but are not experienced in stereoscopic 3D. This will likely change soon—professional 3D 360 camera setups are available now, although they are quite costly, and good ones are expected to be on the market at more accessible price points by 2018.

The term “virtual reality” is itself problematic, as it refers to a wide range of technologies and media environments. If a 360 film is not “true” VR, nor is a 3D 360 environment. They are both embodied audiovisual media environments that function much differently than an experience of “actual reality.” And in some sense, any sort of narrative representation is always already “virtual reality”—whether it functions purely through symbolic language or through immersive interactive audiovisual media.

It is a remarkable moment in the development of VR as an expressive medium in the sense that there is not yet any general agreement about what kind of genre it is, much less a coherent aesthetic tradition in which to process narrative or artistic experiences produced for it. In an article titled “Not a Film and Not an Empathy Machine,” Janet Murray argues that VR should not be developed and understood using the language of cinema, and that we should not make the mistake of thinking that “empathy” is automatically generated through the technology itself. I agree with Murray’s second point—a technology can change our embodied relationship to content, and can appeal to specific registers of our sensory apparatus, but it cannot automatically produce a sense of empathy. Murray, however, argues that VR should not be understood as a film we watch but as “a virtual space to be visited and navigated through,” and further claims that makers should leave out “anything that can be heard or seen that is not diegetically part of the virtual space that is the actual focus of your design,” including elements such as edits, voice-overs, text overlays, or background music. This strikes me as a position that privileges environment over narrative and leans towards an understanding of VR as mimetic environmental simulation, which could limit it as a form of artistic expression. If we come to VR with a set of aesthetic assumptions, for example that its ultimate achievement would be the holodeck of Star Trek: The Next Generation and that any artistic, narrative, or documentary experience designed for a virtual environment should represent an iterative step towards a seamless mimetic immersion, then we will fail to explore the full range of narrative potentiality within these environments. There is no reason such an experience should not have a soundtrack, or a voiceover, or breaks within its diegetic layers. Just as literary modernism and postmodernism challenged assumptions of linearity and diegetic unity within fiction, I hope that VR producers will provide audiences with a diverse range of approaches to narrative and artistic exploration of virtual environments in a truly experimentalist mode, including works that are more discordant than fluid, that embrace the transitional, the broken and glitched, works that are closer to poetry than they are to reality.

An aesthetic imaginary shared between machine and human actors

What is doing the imagining when Google imagines the world? What does the world that Google imagines consist of? When we navigate Google Maps or Google Earth or Google Street View or even when we use Google’s search engine, we access a collective imaginary that is continuously modified and updated both algorithmically and by the contributions and actions of millions of users. Google comprises the most significant corpus of language, text, image, and video ever assembled by a single entity. Every time we enter a search term into Google’s engine, we are feeding and retraining various aspects of a system that has greater mastery over human language (in the broadest sense) than any prior linguist or linguistic system. Google is both verb and noun; collective and singular. Unless its users take specific actions to prevent it from doing so, Google will provide each and every user with a specific profile, which in turn shapes specific filters, specific preferences, a specific identity, which inform Google’s understanding of us and in turn our field of potential interactions with it. Google imagines a world that is shaped by our collective input to Google, and Google in turn imagines each of its users as a collection of data.

And how do we imagine Google? In one of the stanzas of his “Pentameters: Toward the Dissolution of Certain Vectoralist Relations,” poet John Cayley lays out one of the fundamental problems in our contemporary relationship with Google, social networks such as Facebook, and similar entities:

Although the objects of our culture have each
Their specific materials, now these may be mediated
By the insubstantial substance of machines
That symbolize—or seem to, in potential—
Every thing. The digital appears
To us historically unprecedented, thus:
It presents itself as servant and as Golem,
Non-vital but commensurate, un-alive
And yet all-capable: of service, of facility:
A limitless archive of affordances,
And so it ceases to be some thing or substance
Amongst others; it becomes the currency
Of all we are: essential infrastructure,
Determinative of practice and of thought.
Despite this, it still seems made by us, and lesser,
A servant still, and so we treat the digital
As if it remained in service, though it sustains—
Or seems to—all that we desire to be.
We will not live without it, yet we believe
That we still choose to purchase and to use
A relation that is optional, elective, and we
Manage it as such.

Cayley argues that our relationship to Google is not a reciprocal one, that the relation is out of balance, and that we mindlessly sign off on terms of service as if Google were primarily servicing us when in fact we are primarily servicing Google. The digital “becomes the currency / Of all we are,” which is to say that we in turn become the currency of the digital. At the same time as Google becomes our “essential infrastructure,” we are simultaneously its essential infrastructure. Cayley warns that the more accustomed we become to Google being “determinative of practice and of thought,” the less the Golem becomes our servant and the more we become its willing slaves.

And yet at this point I don’t think that even the poet could imagine living in the Western world without Google. We are enmeshed within it and within other network systems. Google is not the enemy in conventional terms—Google is the Google that is, a generative engine of growth and innovation, even if ultimately driven by increasing its economic value to its shareholders. The enemy, in my view, is rather the power of the impulse to trust these systems and the corporations that own them to define “all that we desire to be,” and, in effect, to do our imagining for us. In reclaiming human aesthetic perspectives within an aesthetic imaginary that is increasingly enabled and determined by technology, we might also reclaim a space for human imagination, the ghost in the machine.

In the woods on Ulriken (detail), Bergen, 06.05.2016. Full size image: https://photos.app.goo.gl/8i5AWdphN2aOwGpp1

References

6x9: A Virtual Experience of Solitary Confinement. The Guardian, 2015, https://www.theguardian.com/world/ng-interactive/2016/apr/27/6x9-a-virtual-experience-of-solitary-confinement 

Amerika, Mark. Museum of Glitch Aesthetics. 2013, http://www.glitchmuseum.com 

Andersen, Christian Ulrik, and Søren Bro Pold. “Aesthetics of the Banal—’New Aesthetics’ in an Era of Diverted Digital Revolutions.” Postdigital Aesthetics: Art, Computation, and Design, edited by David Berry and Michael Dieter, Palgrave Macmillan, 2015.

Arora, Gabo. “Getting Real with Virtual Reality.” Media Evolution / The Conference. 2016. https://www.youtube.com/watch?v=g8KAeiyKakA

Arora, Gabo and Chris Milk. Clouds over Sidra. Unicef, 2015. https://www.unicefusa.org/stories/clouds-over-sidra-award-winning-virtual-reality-experience/29675

Bridle, James. “About.” The New Aesthetic, 2011, http://new-aesthetic.tumblr.com/about 

Campanelli, Vito. “New Aesthetic in the Perspective of Social Photography.” Postdigital Aesthetics: Art, Computation, and Design, edited by David Berry and Michael Dieter, Palgrave Macmillan, 2015, pp. 259–70.

Cayley, John. “Pentameters: Toward the Dissolution of Certain Vectoralist Relations.” AModern, Oct. 2013, http://amodern.net/article/pentameters-toward-the-dissolution-of-certain-vectoralist-relations 

Cloninger, Curt. “Manifesto for a Theory of the New Aesthetic.” Mute, vol. 3, no. 4, Oct. 2012, http://www.metamute.org/editorial/articles/manifesto-theory-%E2%80%98new-aesthetic%E2%80%99 

Coover, Roderick. “Taking a Scroll: Image and the Construction of Meaning in a Digital Panorama.” Hyperrhiz, no. 6, 2009, http://hyperrhiz.io/hyperrhiz06/artist-statements/taking-a-scroll-text-image-and-the-construction-of-meaning-in-a-digital-panorama.html 

Hansen, Mark. Feed-Forward: On the Future of Twenty-First Century Media. U of Chicago P, 2015.

Hoelzl, Ingrid, and Rémi Marie. Softimage: Towards a New Theory of the Digital Image. Intellect, 2015.

Kaplan, Louis. The Strange Case of William Mumler, Spirit Photographer. U of Minnesota P, 2008.

King, David. The Commissar Vanishes: The Falsification of Photographs and Art in Stalin’s Russia. Harry N. Abrams, 2014.

Lichty, Patrick. “New Aesthetics: CyberAesthetics and Degrees of Autonomy.” Furtherfield, 2013, http://www.furtherfield.org/features/articles/newaestheticscyberaestheticsanddegreesautonomy 

Manovich, Lev. The Language of New Media. Revised ed., The MIT Press, 2002.

---. “What Is Digital Cinema?” 1995, http://manovich.net/content/04-projects/009-what-is-digital-cinema/07_article_1995.pdf 

Milk, Chris. “How Virtual Reality Can Create the Ultimate Empathy Machine.” TED Talk, 2015, https://www.ted.com/talks/chris_milk_how_virtual_reality_can_create_the_ultimate_empathy_machine?language=en 

Murray, Janet H. “Not a Film and Not an Empathy Machine.” Immerse, no. 1, Oct. 2016, https://immerse.news/not-a-film-and-not-an-empathy-machine-48b63b0eda93 

Peirce, Charles. “Logic as Semiotic: The Theory of Signs.” Semiotics: An Introductory Anthology, edited by Robert Innis, Indiana UP, 1985, pp. 1–24.

Rafman, Jon. “IMG MGMT: The Nine Eyes of Google Street View.” Art F City, Aug. 2009, http://artfcity.com/2009/08/12/img-mgmt-the-nine-eyes-of-google-street-view/ 

---. The Nine Eyes of Google Street View. http://9-eyes.com/ 

Rettberg, Jill Walker. “How Do Algorithms See? Machine Vision in Camera Apps.” Visual Technologies, Place, and Space, Bergen, Norway. 22 April 2016.

Rettberg, Scott. “Human Computation in Electronic Literature.” Handbook of Human Computation, edited by Pietro Michelucci, Springer-Verlag New York, 2013, pp. 187–203.

Sterling, Bruce. “An Essay on the New Aesthetic.” Wired, Apr. 2012, https://www.wired.com/2012/04/an-essay-on-the-new-aesthetic/ 

Valla, Clement. Postcards from Google Earth. 2012, http://www.postcards-from-google-earth.com/ 

---. “The Universal Texture.” Rhizome, 31 July 2012, http://rhizome.org/editorial/2012/jul/31/universal-texture/ 

“Virtual Reality and Vulnerable Communities.” Sustainable Development Goals Action Campaign, https://sdgactioncampaign.org/virtualreality/ 

Zimmer, Ben. “The Hidden History of Glitch.” Visual Thesaurus, 4 Nov. 2013, https://www.visualthesaurus.com/cm/wordroutes/the-hidden-history-of-glitch/ 

Images (in order of appearance)

Private Edward A. Cary of Company I, 44th Virginia Infantry Regiment, in uniform and his sister, Emma J. Garland née Cary. 1861-62. Charles R. Rees, photographer. https://www.loc.gov/item/2012645976/

 

Picture of the ghost of Abraham Lincoln with Mary Todd Lincoln (circa 1869). William H. Mumler. https://commons.wikimedia.org/wiki/File:Mumler_(Lincoln).jpg

Portrait of Scott Rettberg as a basset hound. Still from Snapchat video, 2016.

Image of faceswapping while vaping. (2016). Dorrit Shank. Animated gif from video source: https://youtu.be/ZSBLxOKx0fQ

Stalin, (Nikolai Yezhov, censored) and Molotov at the shore of the Moscow-Volga canal. (1937, 1940). Source: https://commons.wikimedia.org/wiki/File:The_commissar_vanishes_-_90%25_quality_JPEG_compression.jpg and https://commons.wikimedia.org/wiki/File:Voroshilov,_Molotov,_Stalin,_with_Nikolai_Yezhov.jpg

Tuscany panorama, 13.07.2015. Scott Rettberg. https://goo.gl/photos/mKWJkpbUNN5zZkhp8

 

Amsterdam city center panorama, 23.07.2013. Scott Rettberg. https://goo.gl/photos/ERS88bUDEx75vZ8UA

 

Amsterdam bicycle panorama 1: Bicyclist crossing a canal, 23.07.2013. Scott Rettberg. https://goo.gl/photos/dB4yLeHekuKC81Kw6

 

Amsterdam bicycle panorama 2: Look out for the bicycle, 23.07.2013. Scott Rettberg. https://goo.gl/photos/9uKkN3e2B2eQJLUJA

 

Amsterdam bicycle panorama 3: Curiosities, 22.07.2013. Scott Rettberg. https://goo.gl/photos/FKcV5appmeC6nDJY6

 

Amsterdam bicycle panorama 4: Disembodied leg, 23.07.2013. Scott Rettberg. https://goo.gl/photos/Bua2UvE8EZDW84ww6

 

Nude Descending a Staircase No. 2.  Marcel Duchamp. 1912. https://en.wikipedia.org/wiki/Nude_Descending_a_Staircase,_No._2#/media/File:Duchamp_-_Nude_Descending_a_Staircase.jpg

 

Google Street View photo of one woman pulling another woman’s hair, from Jon Rafman’s 9 eyes project: http://9-eyes.com/

 

Google Street View photo of two identical twins gazing up at a bridge, from Jon Rafman’s 9 eyes project: http://9-eyes.com/

 

Image from Clement Valla’s Postcards from Google project: http://www.postcards-from-google-earth.com/peter-guice/

 

iPhone screenshot demonstrating the Google Streetview 360 camera interface. 23.04.2017.

360 image of cherry blossoms in Bergen at Lille Lungegårdsvannet, 11.05.2016. Scott Rettberg.  http://bit.ly/2ppRNdM

360 panorama of woman with pink umbrella, Byparken, Bergen, 31.10.2014. Scott Rettberg. https://goo.gl/photos/pMqKCcMvB3bg1Yiw8

360 panorama of Temple Street Night Market, Hong Kong, 17.12.2016. Scott Rettberg. https://goo.gl/photos/jyxmL4QW5tgvn5cC6

Screenshot of Google Street View App. 23.04.2017.

View of a photosphere rendered in Cardboard view in Street View Application. 23.04.2017.

Jill Walker Rettberg demonstrating a cardboard-style (Viewmaster) head-mounted display. 23.04.2017. Scott Rettberg.

Screenshot of Google map with Street View images. 23.04.2017.

360 panorama of Jeu de Paume at Place de la Concorde, Paris, 14.12.2014. Scott Rettberg. https://goo.gl/maps/K5zoBchtJcT2

Story sphere "Hanging Sphere Garden” in the Centre Pompidou. Scott Rettberg. 12.2014. https://www.storyspheres.com/scene/xAFEATe3

Story sphere “Tuileries Pond.” Scott Rettberg. 12.2014. https://www.storyspheres.com/scene/59EEdZNE

Story sphere “Giant” in the gardens of Versailles. Scott Rettberg. 12.2014. https://www.storyspheres.com/scene/5ZGDvP5R

Screenshot from 6x9: a virtual reality experience of solitary confinement.

In the woods on Ulriken (detail), Bergen, 06.05.2016. Scott Rettberg. Full size image: https://photos.app.goo.gl/8i5AWdphN2aOwGpp1


[1] David King’s The Commissar Vanishes: The Falsification of Photographs and Art in Stalin’s Russia provides an extensive examination of Stalin’s propaganda use of trick photography.

[2] The fictitious tricorder on Star Trek was named as such because it fulfilled three functions simultaneously: Sensing, Computing and Recording.

[3] See Rettberg 2013 for a discussion of uses and critiques of human computation in electronic literature and digital art.