The Journal of Multimodal Rhetorics

ISSN: 2472-7318

Parenting in the time of COVID-19: The Cacophony

Catherine C. Braun

Table of Contents

Keywords: parenting, accessibility, distance learning, sonic chaos, multimodal composing

Categories: Parenting as (Im)possibility in Impossible Circumstances; Visual, Sonic, Tactile, Interactive Texts as Self- and Collective Care; Writing the Process of Writing

When the call for this special issue was announced and I decided to submit, I chose to make a video to evoke my experience rather than trying to translate it into written words alone. One, because my brain was having trouble languaging with words during the pandemic. Two, because my experience was one of sensory overload, and I felt the multimodal affordances of video would be able to communicate that visceral experience more immediately and effectively than words alone. Three, because I’m a proponent of multimodal scholarship and want to strengthen the field’s commitment to non-alphabetic scholarly production.

To evoke the multiple layered demands of my carework and university work during the pandemic, I layered multiple audio tracks with a visual track. The multiple tracks were deliberately and strategically layered to create a feeling of sensory overload. In particular, audio tracks were overlapped and their levels adjusted so that the “primary” voice or sound, or the one in the foreground at any given moment, had only a tenuous hold on the foreground and was constantly in danger of slipping into the background hum of the various voices that filled my living space during the pandemic: my teaching and parenting and inner voices, my children’s teachers’ voices, voices in virtual meetings and presentations, my husband’s voice, my daughters’ voices, the voices coming from school iPad apps, the music coming from school iPad apps, the alert noises coming from iPads and Outlook and Zoom, and so on. Cacophony.

As I was preparing to submit the piece for consideration, I began to grapple with the challenge of captioning. I didn’t attempt automatic captioning because I imagined the nature of the layered audio would cause any automatic captioning tool to generate an unreadable text. Instead, I experimented with creating different transcripts with more or less detail.

At first, I thought I could use brackets with descriptions (e.g., [teacher’s voice layered with crying child’s voice]) and only caption the foregrounded words. However, that solution did not sit well with me because of the aforementioned tenuous nature of the foregrounded voice/sound. Depending on how a reader listened to and focused on the piece, the foregrounded voice could change, and different “listenings” might result in different words being heard. I didn’t want that fluidity to be lost in fixed captions. At one point, to address this issue of multiple interpretations or multiple “listenings,” I thought I could create a “score” of audio tracks that could be viewed and read separately from the video, but I didn’t like the idea of that separation because the visual track also relayed meanings that would be lost or altered by that separation.

Ultimately, I decided that the difficulty with captions for this project is that the act of captioning assumes that meaning is contained in the specific words spoken. However, meaning in my composition was not so much in the specific words in and of themselves, but rather in the interaction between the various audio tracks and the accompanying images, in the ways the various speakers’ voices were overlaid, and the disorienting sensory experience created by their layering. How could I make that experience of the text accessible to deaf and hard of hearing folx?

In the end, to create the best possible rendition of the original text for a deaf or hard of hearing audience, I leaned into the theme of “cacophony,” or sonic chaos, and set about trying to create a sort of visual and spatial cacophony through color-coded, stacked, and moving representations of the sonic elements of the composition. I am deeply grateful to my colleague Scott Lloyd DeWitt for helping me think through this process and for looking at multiple drafts.

Does the visual/spatial cacophony I have created make the original text more accessible? In a way, it does, because the content of the dialogue and of the surrounding sounds is transcribed. However, it also probably doesn’t, because of the chaotic nature of its implementation. But that chaos is part of the point, and this approach captures my compositional goals in a way that is accessible to those with hearing impairments, which is not true of the original text without the additional alphabetic-text layer.

My goal with this text is thus two-fold:

1.  to help readers/viewers experience what parenting as an academic during a pandemic was like for me;

2.  to engage with the conversation about accessibility in multimodal compositions, challenging the field to consider the ways that captions interact with other compositional elements and contribute to layers of meaning and to an aesthetic experience.


Original, Non-captioned Video


Captioned Video


Sparse Transcript

[Siri beep]
[Siri beep]

Little kid: “That’s what our iPad’s heart [Zoom chime] looks like”

[Zoom chime]
[Zoom chime and multiple computer sounds gradually crescendo-ing into static]

Kid and mom trying to figure out phonics while a teacher talks about letter formation in the background.
Kid: “I don’t knooooooow!”
Mom: “Yeah, I don’t know either.”

Mom: “The more you get done today, the less you’ll have to do tomorrow.”
Teacher talking about letter formation continues in the background.
Parents discuss tech issues with turning in child’s homework.
Online course video playing in the background.
Login sounds in the background.

Dad: “Hey guys, if you don’t get your schoolwork done, I’m gonna take away a few minutes of your iPad time later.”
Kid sighs. “Fine!”
Teacher talks about writing assignment in the background.
Kid: “But I’m not going to say that I’m actually going to like it.”
Mom: “You don’t have to like it. You just have to get it done.”

Layered sounds: One teacher talking, one kid reading, one parent talking, and one kid crying.

Phone buzz

“Well good afternoon everyone.” Layered talking from several meetings. Kid crying in the background. Kid on merry-go-round: “Get me off of this thing!” Kid writing: “My thing is running out.” “Parenting in the time of Covid-19.” “I’m going to share my screen.” “Oh my gosh, I can’t even write anymore!”

Little kid: “The fox is almost endangered.” Kid continues to talk about foxes while teacher explains writing assignment. Fox squeals.


[Zoom chime]



Catherine C. Braun (she/her) is an Associate Professor of English at The Ohio State University at Marion. Her scholarly and teaching interests include digital literacy studies, technical communication, rhetoric/composition, and film.
