We often engage with new forms through the metaphors implicit in the technology of our immediate past
As an artist working with new technologies, I often find myself faced with the challenge of building a relationship to systems typically designed for entirely different audiences, usually with different ideological and aesthetic concerns built into their design. In many cases, the friction between these differing conceptions has been a source of inspiration: intentionally misusing audio-to-MIDI plugins, appropriating commercial production techniques, and procedurally searching stock libraries have all informed my creative practice across several recent projects. Many times, I've found that exposing the artifacts hiding at the edges of these technologies has offered a way to explore the social and artistic values built into these tools and the communities that create them. In a developing field like virtual reality, however, we find that these assumptions are still in flux, though they tend to borrow heavily from the formats that precede them. In this article, we attempt to unpack some of the assumptions built into contemporary VR development tools and offer new metaphors for creative work in virtual reality.
Down the Conceptual Rabbit Hole
The previous year brought a torrent of conversation on the nature of content for virtual reality. However, there seems to have been little effort to conceptualize the tools used to create these experiences, or to ask how their associated methodologies may have an ideological impact greater than the simple logistics of workflow. Although well worn, Marshall McLuhan’s writings on looking into the future through the rear-view mirror provide an excellent starting point for unpacking the assumptions built into the current crop of VR development tools. According to McLuhan, we often engage with new forms through the metaphors implicit in the technology of our immediate past: the car becomes a horseless carriage, MIDI is a digital form of sheet music, and the Moog synthesizer takes shape as a sci-fi keyboard organ. While each of these examples represents a clear step forwards in most obvious respects, they tend to preclude interesting alternatives by gently steering us towards 'preferred' use cases.
Taking a generalized survey of contemporary VR, we see a proclivity for 'shooters', puzzles and task-based games. While these choices are likely driven by market demands and the interests of those involved in the development space, it's worth considering how these creative choices may have been guided by the forms inherited from previous mediums, as well as the assumptions built into the tools themselves. In most conventional game engines, 'objects' are placed in an environment awaiting activation by the 'player'. Interaction comes in the form of binary events (press the button, shoot the gun), the unfolding of AI logics (a monster goes from passive to aggressive 'states' upon being triggered by proximity to the user), or decision tree style interactions. Granted, these frameworks may be used to create an almost infinite variety of situations, though they tend to quietly but forcefully guide our hand through the implementation of their underlying conceptual assumptions. Coming to the VR space as an electronic composer, I often found myself at odds with the metaphors built into these tools.
'Game Object' vs 'Sound Object'
I would like to suggest a new approach to the creation of virtual media drawing from the methodologies of electronic composition, namely the patching of an analog modular synthesizer. As a composer, I was initially drawn to virtual reality as a new medium for extending elements of my own artistic practice: sound and media as an 'object' with spatial presence, associative qualities, shape and morphology. VR appeared to be an ideal space for exploring the gestural manipulation of these forms, and after initial research, even the terms and language of the software seemed to support these ideas. The typical workflow in a game engine such as Unity allows you to quickly create a 'game object' and assign it properties (e.g. let's make a cube, turn it pastel pink, have it spin slowly around the y-axis and emit a low-pitched FM bell sound). After building this 'scene', a user may confront this virtual object as a digital sculpture akin to the foreboding structures of minimalist art. Building from there, we can create multiple virtual objects, each with their own individual characteristics and behavior. Taken together, this aggregate of materials could be viewed as something similar to a Calder mobile or an Earle Brown composition, in which simple fixed elements are recombined in infinite variation.
However, pushing this line of thought quickly brings out fault lines: Say we want the rotation angle of one object to change the rotation speed of a second object, and the scale of yet a third object. Then let's say we want the proximity of a user to the first object to increase the rate of the entire system, and each time the third object reaches a certain size, a random offset is applied to the transform height of all objects. Then let's also say that these changes in spatial position are associated with the changing timbre of various sounds emanating from each object in sonic space.
This seems like a potentially interesting creative situation, but a nightmare to implement and explore. While it's worth noting that any intermediate developer could create this behavior with C# scripting, the path they might take to develop and explore this compositional system would be dramatically different in line code. Returning to the metaphor of a modular synthesizer, these types of interactions are fundamental to the nature of 'patching', and encourage a type of playful improvisation in the creation of interactive structures that feels fundamentally different from assigning parameters and variables in line code and object hierarchies. Ultimately, this becomes an issue of framing the 'game object' as something to be acted upon in a top-down environment versus a 'sound/media object' existing in real-time communication with a larger organic system.
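To make the routing described above concrete, here is a minimal sketch in Python rather than Unity C#, purely as an illustration. Every name in it (SceneObject, scale_range, step) is hypothetical, the numeric ranges are arbitrary, and the timbre mapping is left out. It is not the implementation used in any of the works discussed here; it only shows how each connection reads as a 'cable' from one parameter to another.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SceneObject:
    """Hypothetical stand-in for a game object: only the parameters we modulate."""
    angle: float = 0.0   # rotation around the y-axis, in degrees
    speed: float = 0.0   # rotation speed, in degrees per second
    scale: float = 1.0
    height: float = 0.0


def scale_range(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from one range to another, like a scaling module."""
    t = (value - in_lo) / (in_hi - in_lo)
    return out_lo + t * (out_hi - out_lo)


def step(objs, user_distance, dt, rng, scale_threshold=1.9):
    """Advance the whole 'patch' by dt seconds; returns True if the trigger fired."""
    a, b, c = objs
    # Cable 1: the user's proximity to the first object sets the system rate.
    rate = scale_range(min(user_distance, 10.0), 0.0, 10.0, 4.0, 1.0)
    a.speed = 30.0 * rate
    a.angle = (a.angle + a.speed * dt) % 360.0
    # Cables 2 and 3: the first object's angle drives the second object's
    # rotation speed and the third object's scale.
    phase = math.sin(math.radians(a.angle))
    b.speed = scale_range(phase, -1.0, 1.0, 10.0, 90.0) * rate
    b.angle = (b.angle + b.speed * dt) % 360.0
    was_below = c.scale < scale_threshold
    c.scale = scale_range(phase, -1.0, 1.0, 0.5, 2.0)
    # Cable 4: each time the third object grows past the threshold,
    # a random offset is applied to the height of every object.
    triggered = was_below and c.scale >= scale_threshold
    if triggered:
        for obj in objs:
            obj.height += rng.uniform(-0.5, 0.5)
    return triggered


objs = [SceneObject(), SceneObject(), SceneObject()]
rng = random.Random(0)
for _ in range(600):  # ten seconds at 60 frames per second
    step(objs, user_distance=2.0, dt=1 / 60, rng=rng)
```

Even in this toy form, changing a mapping means editing and re-running code, whereas on a modular synthesizer the same change is a cable moved while the system keeps running.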
Theory and Practice
Fortunately, contemporary VR is still a young field, and offers us ample opportunity to re-conceptualize our approaches before they become calcified as standard practice. Many of the ideas above came about in relationship to an early virtual work of mine titled All That is Solid Melts Into Air. While conducting research for the piece, I found myself drawn to the idea of applying ideas drawn from electronic composition to the manipulation of virtual materials (an LFO controlling the spatial location of an object and its associated sound, an envelope generator stretching the size of a 3D model, a random generator or sequencer driven by a player's movement, etc). Initial attempts to implement these ideas using Unity and C# were limited for many of the reasons listed above, prompting a turn towards Max/MSP and the OSC protocol. Using the excellent UniOSC C# assets as a bridge, I was able to easily leverage techniques and patches from my compositional practice in VR via the udpsend and udpreceive objects in Max. For example, creating an LFO by scaling a cycle~ object and sending it to the transform coordinates of an object in Unity via OSC quickly became second nature, and allowed a workflow where ideas may be sketched and improvised with in real time, uniting ideation and execution. Utilizing an OSC bridge also allows for the use of Ableton as a source of OSC control (via Max for Live) and as an environment to develop through-composed gestures using the linear workflow of a DAW. Alternatively, information from Unity may be sent back to Max or Ableton over an additional OSC channel as a second layer of control. If we conceptualize this 'bridging' approach as a bi-directional modulation path, we have a strong foundation to leverage techniques from generative and interactive composition. Unfortunately, applying these ideas to spatialized audio in realtime becomes convoluted fairly quickly.
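For readers without Max at hand, the LFO-over-OSC idea can be approximated in a few lines of Python using only the standard library. This is a sketch under stated assumptions, not the patch used in the piece: the address /cube/position, port 9000, and the LFO range are hypothetical, and the receiving side (UniOSC or any other OSC listener in Unity) would need to be configured to match. The message layout follows the OSC 1.0 encoding of padded strings and big-endian 32-bit floats.

```python
import math
import socket
import struct


def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *values):
    """Build an OSC 1.0 message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + osc_string(type_tags) + payload


def lfo(t, freq_hz, lo, hi):
    """Sine LFO scaled from [-1, 1] into [lo, hi], much like scaling a cycle~ output."""
    return lo + (math.sin(2.0 * math.pi * freq_hz * t) + 1.0) * 0.5 * (hi - lo)


def send_lfo_frames(host="127.0.0.1", port=9000, frames=60, frame_rate=60.0):
    """Send LFO-driven y positions to a hypothetical /cube/position address over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for n in range(frames):
            y = lfo(n / frame_rate, freq_hz=0.25, lo=0.0, hi=2.0)
            # Assumed payload: x, y, z position of the object as three floats.
            sock.sendto(osc_message("/cube/position", 0.0, y, 0.0), (host, port))
    finally:
        sock.close()
```

Calling send_lfo_frames() pushes one second of position data as fast as the loop runs; a real control source would pace the messages with a timer or clock, which Max's scheduler handles for you.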
Virtual reality offers an exciting frame to explore spatial composition, often in ways that may be impossible or unfeasible in the physical world, and Unity's sound engine and related plug-ins do an incredible job of streamlining the implementation of localized sound. Unfortunately, due to the underlying frameworks of these tools ('do the thing, hear the sound'), they are currently ill-suited to the generation or manipulation of audio in realtime. A notable exception is the 'Teleport' script and AU plug-in included in the Unity Audio SDK, which allows the streaming of audio from a DAW directly into a localized 3D object, similar to routing sound between applications with Soundflower or AudioJack. These tools, however, are not well known and not supported to the same degree as more conventional approaches. In my own projects thus far, I've ended up bypassing Unity's audio engine entirely, relying instead on a set of Max patches receiving object and user location via OSC and spatializing incoming audio via ambisonic 'virtual speakers', then encoding the result binaurally. This approach allows modulation sources and compositional input to be mapped to the creation of spatial sound and virtual movement simultaneously in an organic way. However, because Max/MSP is a standalone, separate program that runs alongside Unity rather than within it, we are faced with an unstable system that limits the 'portability' of the finished piece and creates a needlessly complex workflow. Luckily, a renewed interest in procedural audio has brought realtime audio generation to the gaming sphere through embedded synthesizers like the Wwise middleware's Synth One, the built-in subtractive synthesizer in the Unreal engine, and an unexpected but promising focus on Pure Data as an embeddable source of sound generation and modulation data. These developments allow us the opportunity to explore new approaches to creative work while still leveraging the audio spatialization and other strengths of Unity.
Pure Data and LibPD
Pure Data and Max/MSP share a great deal of DNA; an experienced user of one can make the switch to the other very quickly. However, the open-source nature of Pure Data makes it a very different beast. In an interview with Varun Nair, LibPD creator Peter Brinkmann says:
“…what is Pd exactly? Is it the existing code or is it a specification that happens to be implemented by the existing code and regardless of what the answer to it is right now, the question is what the answer should be. My belief is that we should really have a general definition of Pd and then we should have implementations of it – more than one.”
It's easy to see the resonance between this conception of Pure Data (as a general specification, rather than a fixed existing set of code) and the ‘modular’ rather than ‘event-based’ approach to creation in virtual reality. Additionally, patcher environments such as Pure Data and Max/MSP root their interface metaphor in the patchable synthesiser. Sound-generating, sound-modifying, and control signals are generated by and/or routed through ‘objects’ (more properly considered functions in a programming sense), connected by virtual ‘cables’ carrying audio or control data. The objects of a patcher-UI are analogous to the modules of a patchable synthesiser: the intuitive or improvisational facet of working with such a system lies in the creative connection between known, imperative modules. A Pure Data patch is, for all intents and purposes, an interactive data-flow graph of a signal-processing, control, or sound-generating routine.
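As a rough illustration of the data-flow idea, the following Python sketch wires a few 'objects' together with references standing in for cables. It is a toy, pull-based model with hypothetical class names (Osc, Constant, Multiply); Pure Data itself schedules its graph and processes audio in blocks, so this should be read as an analogy rather than a description of how Pd works internally.

```python
import math


class Osc:
    """Sine oscillator 'object': no inlets, one signal outlet."""

    def __init__(self, freq_hz):
        self.freq_hz = freq_hz

    def sample(self, t):
        return math.sin(2.0 * math.pi * self.freq_hz * t)


class Constant:
    """Outputs a fixed value, like a number box feeding an inlet."""

    def __init__(self, value):
        self.value = value

    def sample(self, t):
        return self.value


class Multiply:
    """Multiplies the signals arriving from two upstream objects."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def sample(self, t):
        return self.left.sample(t) * self.right.sample(t)


def render(node, num_samples, sample_rate=44100):
    """Pull num_samples values through the graph, starting at time zero."""
    return [node.sample(n / sample_rate) for n in range(num_samples)]


# The 'patch': a 220 Hz tone whose amplitude is shaped by a slow 2 Hz oscillator.
tone = Multiply(Osc(220.0), Multiply(Osc(2.0), Constant(0.5)))
block = render(tone, num_samples=441)
```

Re-patching here means passing a different upstream object to a constructor, which is the textual equivalent of moving a cable from one outlet to another.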
LibPD is an embeddable library encapsulating the core (or ‘vanilla’) Pure Data functions; one which can be used nearly anywhere, with constantly expanding options. PC, OSX, iOS, Android, and Unity 3D are already well-supported, with experimental (at this time) implementations for Java/Web-based applications. It is Brinkmann’s extended concept of what Pure Data is that allows for this level of flexibility and embed-ability. LibPD is available on GitHub, and a companion book, Making Musical Apps, is available at libpd.cc. Though much of the documentation is geared toward integrating PD with iOS and Android applications, it still provides valuable guidance for those wanting to compose with the Pure Data environment, then leverage those patches as part of the audio engine for an application.
This approach has been extended to Unity with the libpd4unity library by Patrick Sebastian. LibPD for Unity comes bundled with several working examples and the necessary C# scripts for sending and receiving data between objects in a Unity scene and objects in Pure Data. This bidirectional communication—scene data able to manipulate sound and the resultant sound able to manipulate objects in our Unity scene—is the key to realising the compositional methodology set out above.
If we assume the value of the ‘modular’ rather than ‘event-based’ model of virtual-reality composition for extending our scope beyond existing metaphors of interaction, the value of LibPD as the sound-generating source for a VR scene in Unity becomes clear. Because the controlling metaphor of Pure Data itself is that of the modular synthesiser, we have a unity of compositional metaphor across domains (conceptual, technical, etc.) which opens up creative spaces beyond the model and metaphor native to sound in Unity itself.
New Futures and Possible Directions
Ultimately, the thoughts contained in this article only begin to touch on the multiplicity of new forms and concepts taking shape in virtual reality development. WebVR has become an open space for new formats, often unencumbered by metaphors inherited from previous platforms in the way most commercial game engines have been. Even in Unity, the continuing efforts of developers like Keijiro Takahashi have made new approaches visible through open-source toolkits like Klak (OSC and MIDI support) and the OP-Z (hardware synth and sequencer for Unity), an upcoming collaboration with design firm Teenage Engineering. As always, though, the tools and techniques used to create are secondary to the ideas they are in service of. While new metaphors beget new methodologies, they do so best in communication with other voices. The author(s) would encourage you to continue this conversation by adding to the resource list linked below, or to get in touch with new ideas or anything we've missed!