Creative Criticism: Sound and Listening in Sensory Documentary

Photo by Zhao Liang.

The Visual and New Media Review section of the Cultural Anthropology website is pleased to announce a new series, Creative Criticism, which will make room for the praxis of reviewing new and audiovisual media. In this series, contributors will seek to capture how moving images, stills, and other artworks strike their senses and thought processes, while also reflecting on the techniques and rituals of reviewing from an anthropological perspective. Hannah Paveck’s review of the 2016 Sheffield Doc/Fest is the first post in this new series; stay tuned for more.

The U.K. premiere of In Pursuit of Silence (2016) opened with an invocation to listen. Sheffield composer Stephen Chase arrived on stage, guitar in hand. After tuning the guitar, he fell silent, moving only to turn over the sheet music. The silence amplified the ambient sounds of the cinema. Without words or music to guide or orient, we listened as the hum of the projector intermingled with the sound of the audience, breathing audibly and shifting uncomfortably in our seats. As film sound theorist Michel Chion (2009, 151) describes, “Any silence makes us feel exposed, as if we were laying bare our own listening, but also as if we were in the presence of a giant ear, tuned to our own slightest noises.” In the face of silence, we listen to ourselves listen. As the force of this listening intensified, a member of the audience interrupted the scene: “What about the film?”

In staging the experience of listening to silence, Chase’s performance of John Cage’s experimental composition 4’33” (1952) signaled and framed our engagement with Patrick Shen’s documentary feature. Premiering at Sheffield Doc/Fest, an internationally renowned U.K. documentary festival now in its twenty-third year, In Pursuit of Silence concerns the devaluation of silence in contemporary culture. From vows of silence to rituals of meditation, anechoic chambers to noise pollution, the film translates its eponymous quest for silence into a cinematic journey between diverse settings and soundscapes. Although sound and listening are a thematic rather than a formal concern in In Pursuit of Silence, attention to both ran as a thread through this year’s film program. This essay examines how a selection of these films, which were screened at the Sheffield Doc/Fest in June 2016, foreground the constitutive role of sound. From A. J. Schnack’s short documentary Speaking is Difficult (2015) to the winner of the Storytelling and Innovation Award, Notes on Blindness (2016), “sound is conceived not as an adjunct or accompaniment to image, but as a complex, often intense auditory surround within which the imagery unfolds” (MacDonald 2013, 315). Eschewing traditional modes of voice-over narration, dialogue, and on-location sound, these films draw on formal experimentation and recent sensory approaches in nonfiction filmmaking to challenge the conventions of documentary sound.

The Politics of Listening in Speaking is Difficult (2015)

A commitment to sound and aural modes of perception underlies A. J. Schnack’s Speaking is Difficult. The short documentary film is Schnack’s first for Field of Vision, a film-based visual journalism platform he cofounded in 2015. Instigated by that year’s shooting in Roseburg, Oregon, Speaking is Difficult presents a close examination of the recent pattern of mass shootings in the United States. Assembled from the footage of twenty cinematographers, the film depicts the locations of previous shootings across the country, from San Bernardino, California, to Tucson, Arizona. In each location, audio recordings of police and 911 calls overlay a series of static shots of urban landscapes. Through its formal repetition and layered temporalities, Speaking is Difficult underscores the violence that connects these disparate events and landscapes. As the film unfolds backwards in time, from 2015 to 2011, the disjunction between sound (recorded at the time of the shooting) and image (shot in 2015) becomes all the more palpable.

The film’s rhythmic editing and disjunctive sound/image relations direct the spectator’s perceptual focus toward listening. The highly structured soundtrack, composed of panicked voices of 911 callers, police radio, and emergency commands from dispatch centers, guides the rhythm and sequence of images. While the images remain largely uniform, the intensity of the audio recordings builds across the film, inviting a heightened aural engagement. Speaking is Difficult harnesses the indexical quality of these recordings to flesh out the affective and experiential texture of each mass shooting. Capturing the sound of voices, sirens, and movement in close-up and in real time, the recordings give the spectator an embodied sense of proximity to the tragedies as they unfold. In the context of contemporary debates around bearing witness through visual media, the deliberate downplaying of the image in favor of a focus on sound seems salient. By harnessing sound’s affective power, Schnack’s Speaking is Difficult not only exposes the frequency and relentless repetition of mass shootings in the United States, but foregrounds the urgency of political intervention.

Ambient Extremity: Sounds of Industry in Consumed (2015) and Behemoth (2015)

From the mines of Inner Mongolia to the factories of Shenzhen, the port of Qingdao to high-frequency shipping lanes, Richard John Seymour’s short film Consumed (2015) traces the trajectory of supply chains, exposing the global systems and flows that undergird the production, circulation, and consumption of everyday objects. Alternating between wide-angle shots of industrial sites and landscapes, and close-ups of repetitive gestures of human labor and machinic automation, the film focuses on how these global systems and flows (re)figure the relations between humans, nature, and technology. Consumed foregrounds the material reality of these relations through its textured soundscape and amplified sound effects. Ambient industrial noises and their rhythmic repetition combine with the film’s reverberant score. Amid this noise of industry, we hear the voice of Chen Li Ming describing his experience as a factory worker in Shenzhen. However, instead of structuring the narrative and guiding audition, this voice-over narration comprises just one layer of the film’s intricate sound composition. Shifting away from the conventions of voice-over and towards the role of sound effects, Consumed foregrounds the sensory dimensions of film sound to critique both the human and environmental costs of global supply chains.

In Consumed, sound effects—ambient sounds of industry—threaten to overpower. Magnified through repetition and elevated volume, these industrial noises recall film scholar Lisa Coulthard’s (2013) notion of ambient extremity: a film’s formal experimentation with acoustic proximity, frequency, and volume in an effort to maximize the visceral impact of a film’s sound effects. Ambient extremity in Consumed intensifies the bodily response of the spectator, exposing our vulnerability to sound’s material and vibrational forces. This visceral impact evokes the violence of industry at the centre of the film’s critique.

Zhao Liang’s Behemoth (2015) mobilizes this formal strategy of ambient extremity to similar ends. Blurring the distinction between art film and ethnographic documentary, Behemoth examines the human and environmental costs of China’s coal industry. Like Consumed, the film opens with a static, wide-angle shot of the mines of Inner Mongolia. A series of explosions interrupt the quiet ambience of the degraded landscape. The camera zooms in on the billowing smoke, the force of explosion scattering shards of rock and debris. As matter contaminates the frame, threatening to puncture the image, we hear the guttural rumbling of Tuvan throat singing. From its opening sequence, Behemoth draws attention to the sensory materiality of its sounds and images, creating aural and visual contrasts that place the spectator in a heightened attentive state.

Eschewing dialogue and narration in favour of a “compositional logic” (Kara 2013, 590), the film unfolds through an emphasis on sensory detail, contrasting sequences, and the poetic interweaving of fragments from Dante’s Divine Comedy. Guided by a Virgil-like figure, a coal miner holding a mirror on his back, Behemoth moves between a focus on the harsh conditions faced by coal miners and the industry-ravaged landscape of Inner Mongolia. Sound in Behemoth plays a constitutive role in encouraging sensorial engagement with “the rhythms, textures, and patterns of documentary reality” (Kara 2013, 588). From detonations in the coal mines and clangs of metal in the iron works, to the coal miners’ truncated breathing and scraping of calloused hands, contrasts in ambient extremity enable the spectator to feel the human and environmental costs of the coal industry. Director Zhao Liang underscores the affective power of the film’s amplification of sound effects: “I wanted the sounds to go beyond its physical, ‘natural’ characteristics and affect the audience.” In both Consumed and Behemoth, ambient extremity becomes an ethical rather than simply a formal gesture: it is an attunement to material suffering (human and nonhuman).

(Dis)Orientation: Listening to Notes on Blindness (2016)

The relation between form and ethics is at the centre of Peter Middleton and James Spinney’s Notes on Blindness (2016). Blending documentary and dramatic elements, the film chronicles the experience of theology professor John Hull as he gradually loses his sight. Between 1983 and 1986, Hull recorded a series of audio diaries, interweaving personal reflections with philosophical meditations on the phenomenological experience of blindness. Reflecting this experience through formal experimentation, Notes on Blindness limits the visual and amplifies the aural to engage the spectator in Hull’s perceptual world. The film’s densely textured soundtrack combines Hull’s audiotapes with additional family recordings; conversational interviews with Hull and his wife Marilyn; diegetic recordings; and foley work. On screen, actors lip-synch to the voices of Hull and his family, animating these recordings within the multisensory dimensions of the cinematic medium. As the film unfolds, ambient sounds begin to intensify, echoing Hull’s increasing attunement to the aural contours of his environment.

This enhanced soundtrack contrasts with the film’s visual language. Shot in tight framing with limited depth of field, Notes on Blindness limits the spectator’s traditional patterns of visual perception. Through its lack of establishing shots, blurred focus, and fragmented, close-up images, the film’s visual language destabilizes the spectator’s spatial geography. This sense of perceptual disorientation encourages a heightened listening, directing the spectator towards the sensory detail and textural elements of sound that communicate the “feel of the environments depicted on-screen” (Donaldson 2014, 118) and beyond our gaze. In communicating a sense of space, the film’s amplification of ambient sounds orients the spectator into Hull’s world. By engaging the spectator within the perceptual world of blindness evoked in John Hull’s audio diaries, the film highlights the constitutive role of sound within a sensory approach to documentary filmmaking. As Lisa Coulthard (2013, 118) underscores, “sensory cinema communicates sensations of physical, bodily life through an intensification of its sounds. This last point is crucial: we hear not only with our ears, but our entire bodies.” Notes on Blindness illuminates the potential of documentary sound not only to intensify the embodied engagement of the spectator, but to attune the spectator to the embodied experience of the other. Through formal strategies of sound/image relations, ambient extremity, and perceptual (dis)orientation, these films gesture towards the ethical role of listening beyond dialogue.


Chion, Michel. 2009. Film, A Sound Art. Translated by Claudia Gorbman. New York: Columbia University Press.

Coulthard, Lisa. 2013. “Dirty Sound: Haptic Noise in New Extremism.” In The Oxford Handbook of Sound and Image in Digital Media, edited by Carol Vernallis, Amy Herzog, and John Richardson, 115–26. New York: Oxford University Press.

Donaldson, Lucy Fife. 2014. Texture in Film. New York: Palgrave Macmillan.

Kara, Selmin. 2013. “The Sonic Summons: Meditations on Nature and Anempathetic Sound in Digital Documentaries.” In The Oxford Handbook of Sound and Image in Digital Media, edited by Carol Vernallis, Amy Herzog, and John Richardson, 582–97. New York: Oxford University Press.

MacDonald, Scott. 2013. American Ethnographic Film and Personal Documentary: The Cambridge Turn. Berkeley: University of California Press.