8. Data Transparency In Music Visualization
The Impediments To Satisfying Learner Curiosity
Can you name the single biggest logistical bottleneck in current web-based approaches to 3D music visualization?
How about simple data transparency? This, the tangible benefit of what is known as data binding at the interface level, gives direct access to the data underlying on-screen musical objects: note names, frequencies, octaves, MIDI numbers, color codings, and in some cases generated or sampled tones. Why is it so elusive in WebGL solutions?
Behaviors taken for granted in established music editing applications (tooltips on mouseover, click-playing a note, drag-and-drop to a staff line, raising an object menu) are often unthinkable in 3D, WebGL-based approaches. Yet this level of data access is routine in many 2D data visualization libraries targeting, say, SVG+CSS.
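To make "data binding" concrete: everything a note tooltip needs (name, octave, frequency) can be derived from a single bound MIDI number. A minimal sketch, assuming 12TET and A4 = 440 Hz; function names are illustrative, not from any particular library:

```javascript
// Pitch-class names in sharp spelling, indexed by MIDI number mod 12.
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

function midiToNoteName(midi) {
  return NOTE_NAMES[midi % 12];
}

function midiToOctave(midi) {
  // MIDI 60 is middle C, conventionally written C4.
  return Math.floor(midi / 12) - 1;
}

function midiToFrequency(midi) {
  // Equal temperament: each semitone is a factor of 2^(1/12); A4 (MIDI 69) = 440 Hz.
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// The payload a tooltip (or click-to-play handler) would read off a bound note object:
function noteData(midi) {
  return {
    midi,
    name: midiToNoteName(midi) + midiToOctave(midi),
    frequency: Math.round(midiToFrequency(midi) * 100) / 100,
  };
}
```

With the MIDI number bound to the on-screen object, every other property a mouse-over needs is a pure derivation; nothing has to be baked into the graphics themselves.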
Why is this so important? The future of music teaching is immersive, with notation driving possibly multiple, dependent animations during playback.
So think: one source, multiple freely selectable models, data bound to (and precisely hardware-synchronized across) many screen objects. Everything in lockstep. Everything interrogable.
For all WebGL’s multi-dimensional visual allure, data access is unequivocally the core feature: supporting not just a host of AI integrations, but, at graphical interface level, governing everything from dimensioning to musical properties and the meaning given to individual symbols. For any single item of data, then: potentially multiple forms of expression and multiple points of access.
Without data: no fingering positions, no user interrogation (mouse-over tooltips) of pitch, note name, octave or interval type.
No easily implemented dependent animations, no automatic matching of notation with instrument models, theory tools or physical simulations.
So where does this data come from? The diagram to the left shows the various potential sources of data in a typical animated musical model.
The animation might be an instrument model, a theory tool, a physics simulation or any of a host of other visualizations. The result is truly data-driven, wholly interrogable, and in every aspect shareable with remote users.
Perhaps the least obvious of these data sources are user interactions, such as placement of a capo, a mouse-over driven ‘interval’ tooltip, the saving of a particular configuration, or a model-external (i.e. global) environmental configuration change (to, say, note colorations).
Unsure what is meant by ‘data-driven’? A simple analogy might be a plant seed. Left to itself, it will grow to assume a shape characteristic of (‘transparent’ to) that species.
By manipulating its growing conditions, we can encourage it to reveal specific qualities such as its ability to climb, spread, survive arid conditions, cope with shade and so on.
Each form assumed is an expression of the underlying seed type, but also of its manipulation. Similarly, a single data set can express itself in many ways, allowing us to better understand it. We may visualize data as (say) a hierarchy or tree, events on a timeline, a scatterplot, bubble chart, chord diagram or any of a host of other forms. Each tends to focus on a different quality or property of the data.
A distinct 3D challenge is how we interact with shapes generated from this data. Having no equivalent to CSS class and id selectors, WebGL-based visualizations have no easy means of differentiation, identification or manipulation of graphical objects. Effectively, though visually reactive, visualizations are ‘data-inarticulate’ and tedious to reconfigure.
Moreover, these visualizations lack economies of code reuse and tight, data-to-graphical bindings. In sum, they have none of the DOM affinity we are so used to.
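The missing selector machinery can, of course, be approximated at application level: keep a registry alongside the scene graph, mapping ids and class-like tags to scene objects and their bound data. A hypothetical sketch (three.js users might hang the same metadata on `mesh.userData` instead; all names below are illustrative):

```javascript
// An application-level stand-in for CSS id/class selection over WebGL objects.
class SceneRegistry {
  constructor() {
    this.entries = [];
  }
  // Register a scene object with an id, class-like tags, and bound data.
  add(object, { id, classes = [], data = {} }) {
    this.entries.push({ object, id, classes: new Set(classes), data });
  }
  // Equivalent of document.getElementById.
  byId(id) {
    const entry = this.entries.find(e => e.id === id);
    return entry ? entry.object : null;
  }
  // Equivalent of getElementsByClassName.
  byClass(cls) {
    return this.entries.filter(e => e.classes.has(cls)).map(e => e.object);
  }
  // Data lookup for a picked object, e.g. to populate a tooltip.
  dataFor(object) {
    const entry = this.entries.find(e => e.object === object);
    return entry ? entry.data : null;
  }
}

// Usage: the mesh here is a stand-in for a real WebGL object.
const registry = new SceneRegistry();
const noteMesh = {};
registry.add(noteMesh, { id: "note-60", classes: ["note", "c-major"], data: { midi: 60, name: "C4" } });
```

This recovers differentiation and identification, but only by hand; the point of the surrounding argument is that 2D data-driven libraries give it for free.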
Navigable Or Interrogable?
Granted, many WebGL products are fully navigable.
Nevertheless, currently, either you have viewpoint control, or access to the data underlying the graphics — but not both.
On top of this, the quality of (predominantly audio-derived) data is often questionable.
Stunningly, MusicXML, the W3C format for music exchange, is currently not at all integrated with 3D WebGL-based modeling environments. No wonder music visualization is struggling to take off as a discipline.
As a result, music visualizations -though initially arresting- have a poor shelf life, and are all too easily dismissed as eye candy.
In much current music visualization, shapes are modeled in 3D, that is to say volumetrically, the limits defined by a mesh and manipulated as a whole with various degrees of freedom: forward/back, up/down, left/right, yaw, pitch and roll.
Both in music instrument and theory tool modeling, we are generally more concerned with some form of lattice structure, comprising nodes (notes) and connectors (intervals) aligned or distributed in some plane. A key- or fingerboard. A circle or helix of fifths, a just intonation spiral, or one of many Tonnetze structures. Here, direct access to the contributing elements is essential.
Moreover, we do not necessarily need 3D navigability: for many situations, panning and zooming of 2D planes are more than adequate, and for the remainder rotation around a vertical axis likely more than enough.
While lattices and meshes can of course be portrayed in navigable 3D space, again, interaction is often limited, manipulation again only possible on the modeled whole. We really need to ask ourselves if 3D:
- adds anything to interrogable information content.
- gets in the way of other, more important interactions.
- impedes on-demand exchange of source data.
Seen from an application shelf-life or user perspective, these are central considerations.
Let’s remind ourselves which facets of musical expertise (see illustration) an instrumental learner is striving towards. Overall goal? Musical tension.
Any learning environment should support these effortlessly and transparently.
If our main preoccupation is navigating the learning environment, we are likely barking up the wrong tree.
It may be helpful to identify which visualization mechanisms achieve which learning ends.
This turns out to be quite an informative (if subjective) exercise.
Multiple (dependent) models and on-demand data have the widest impact, representing perhaps the most significant advance over legacy approaches to music visualization.
WebGL’s six-degrees-of-freedom navigation, on the other hand, may serve immersion and autonomy, but is deficient in all other areas.
Could we map between these and visualization environments? Let’s attempt a visual breakdown of the technology space.
Here we see base data usage and visualization types linked via some dominant visualization libraries or pre-processing workflow environments.
WebGL visualizations are normally associated with audio-derived data, with impact on data quality and accessibility.
Music industry data sets tend towards revenue overviews, genre maps, business relationships or workflows.
MusicXML-derived data is not associated with 3D WebGL environments because to date there is no means of (interchangeably) integrating it, and even were it so, much would be inaccessible. Equally, impoverished audio-derived data is a mismatch with powerful data-driven environments.
The diagram suggests, then, that to arrive at rich, immersive on-demand data transparency in the WebGL space, we need to sacrifice a visual dimension (from 3- to 2D), and with it, 3D navigability.
Most current music visualizations are tasked with a curiously secondary role: ‘enhancing the consumer experience’.
Some aspire to artistic heights, others add a visual dimension to an audio feed, some offer a visual structuring of contextual banalities (genre, label, release dates, contributors). Yet others reek of 3D graphics familiarization for its own sake: cosmetic, and short on practical use.
Music learning is another matter: here the practical potential is vast, the role unequivocally a primary one. It is difficult, though, to abstract data from an audio signal with any depth or accuracy. Poor data? Impoverished visualizations.
Accurate, timely and dependable data can dramatically enrich the information reaching our visual cortex — and so our musical understanding.
So is there a better source? As it turns out, several. Some (figuratively speaking) ‘vertical’ or ‘in-stack’, some ‘horizontal’ — as in ‘interworking’ / ‘peer-to-peer’ and ‘environmental’.
What may at first glance seem to be courting information overload is sharply reduced through actual user needs. Indeed, it takes astonishingly little to get started.
Users can be expected to reach for the familiar, personalizing menus with a few own instrument and other tool preferences. These are easily managed and easily propagated to other users in interworking sessions.
Most such data arrives in simple text formats. Some are presented to the user directly as classification trees, others as data graphs, but all find their ultimate expression in intuitive and exchangeable visualizations of one form or another. The mechanisms for integrating different types of data source are identical.
MusicXML: though not (yet) perfect, this provides on the whole a wide range of reliable, human-readable and actionable data.
Shortfalls? Nothing that can’t be overcome: a focus on 12TET largely to the exclusion of other world music systems, and ‘hard-coded’ MusicXML-to-instrument fingering data. (More on this further below).
Data examples? Tempo, key, part, voice, measure, modal (scale) type, time signatures, note names, types and durations, pitch modifiers (‘alter’), note tail orientation, octave, accidentals, dynamics.
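A taste of how that per-note data sits in MusicXML. The element names (`step`, `alter`, `octave`, `duration`, `type`) are genuine MusicXML; the regex extraction below is only a sketch over a hand-written fragment, and a real application should use a proper XML parser:

```javascript
// A minimal hand-written MusicXML <note> fragment: F#4, a quarter note.
const noteXml = `
  <note>
    <pitch>
      <step>F</step>
      <alter>1</alter>
      <octave>4</octave>
    </pitch>
    <duration>2</duration>
    <type>quarter</type>
  </note>`;

// Naive single-tag text extraction -- sketch only, not a real XML parser.
function tagText(xml, tag) {
  const m = xml.match(new RegExp(`<${tag}>([^<]*)</${tag}>`));
  return m ? m[1] : null;
}

function parseNote(xml) {
  const alter = tagText(xml, "alter");
  return {
    step: tagText(xml, "step"),                 // letter name
    alter: alter === null ? 0 : Number(alter),  // +1 sharp, -1 flat
    octave: Number(tagText(xml, "octave")),
    duration: Number(tagText(xml, "duration")), // in score divisions
    type: tagText(xml, "type"),                 // notated value
  };
}
```

One such record per note is already enough to drive a fingering display, a tooltip, or a dependent animation.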
Instrument Models: fully configurable, interactive and score-driven models of real-life instruments, with entirely dynamic fingering display: tooltips and mouse-overs providing (for example) note names, fingerings, note frequencies, MIDI or other indexing information.
There are two basic types of instrument model: those featuring one-to-one note-to-fingering mappings (such as guitar or harp), and those catering to many-to-one mappings (whistle or clarinet).
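The two mapping shapes can be captured as data structures. A hedged sketch (tunings, hole patterns and names are illustrative): a guitar-like model maps each note to a single fingering action, while a whistle-like model maps each note to a combination of finger states:

```javascript
// One-to-one: each sounded note corresponds to one fingering action.
// Guitar-like, one chosen position per note; alternative positions
// would live in separate, swappable maps.
const guitarFingering = new Map([
  ["E2", { string: 6, fret: 0 }],
  ["A2", { string: 5, fret: 0 }],
  ["C3", { string: 5, fret: 3 }],
]);

// Many-to-one: several fingers jointly produce one note.
// Whistle-like, note -> array of hole states (1 = covered, 0 = open);
// patterns loosely follow a D whistle.
const whistleFingering = new Map([
  ["D5", [1, 1, 1, 1, 1, 1]],
  ["E5", [1, 1, 1, 1, 1, 0]],
  ["F#5", [1, 1, 1, 1, 0, 0]],
]);

// Either shape answers the same question for the renderer:
function fingeringFor(map, note) {
  return map.get(note) ?? null;
}
```

Because both reduce to a note-keyed lookup, the rendering side can stay generic; only the per-instrument map changes.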
Instrument Fingerings: potentially the most disruptive topic in instrument visualization, due to their ‘hard-coded’ specification in music exchange files. Hence an inability to reflect individual players’ preferences, alternative tunings, alternative instrument layouts (number of channels or courses) and -given notation compatibility- other world music systems.
More or less all popular exchange formats are impacted, including the de-facto (W3C) standard, MusicXML.
Fingering strategies could (see illustration to the left) potentially be mentor- or teacher-driven (as part of a music exchange file constellation), algorithmic, or based on ML (machine learning) and AI (artificial intelligence).
Many-to-one finger-to-note mappings are more complex to manage than those associated with one-to-one mapping instruments, but, decoupled as musical properties, still reasonably straightforward. For one-to-one mapping instruments, there are strong arguments (instrumental freedom and personal preference) in favor of decoupling fingerings from exchange files.
The relationship between a piece of music and its fingerings is 1:many. Blockchain technologies suggest a possible route out of the ensuing fingering-to-exchange-file mapping chaos. Once an asset is listed on the blockchain, it cannot be altered or counterfeited. Unless the owner verifies a change, ownership is immutable.
Classification Trees: if something can be modeled, it can often be classified.
Classifications can be applied to music systems, instruments, theory tools, genres, typefaces, notation types (trans-notation), musical properties and more.
Classification provides structure to data repositories, meaning to URLs, and content to menus.
Indeed, classification trees make sense even in the context of ad-hoc ‘graph’ database concepts. More on that (given crowd-funding success) in a later post.
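The "structure to repositories, meaning to URLs, content to menus" claim can be made concrete: one tree, walked twice. A sketch with illustrative tree content and function names:

```javascript
// A tiny classification tree: instruments grouped by family.
const tree = {
  name: "instruments",
  children: [
    { name: "strings", children: [{ name: "guitar" }, { name: "nyckelharpa" }] },
    { name: "winds", children: [{ name: "whistle" }] },
  ],
};

// Walk 1: emit one URL path per leaf (meaning to URLs).
function leafUrls(node, prefix = "") {
  const path = `${prefix}/${node.name}`;
  if (!node.children) return [path];
  return node.children.flatMap(child => leafUrls(child, path));
}

// Walk 2: mirror the same tree as a nested menu (content to menus).
function toMenu(node) {
  return {
    label: node.name,
    items: (node.children || []).map(toMenu),
  };
}
```

Because both views derive from the one tree, adding an instrument to the classification updates repository paths and menus in lockstep.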
Social Network: friends, teachers, learners, possibly identified from established social media sites.
Theory Visualization Tools: The greater the level of abstraction, the more generally applicable the insights. Music theory is by nature a reduction down to the essential. Sometimes valid across multiple genres, sometimes allowing us to guess at ‘between-the-lines’ behaviors, these allow us to place other music systems, instruments and conventions into a wider cultural perspective.
These tools comprise dynamic, score-driven 2- and 3D visual models of a wide variety of note (node) and interval (connector) abstractions. As with instruments, models can be developed from a generic base.
These take many forms: conventional scales (effectively scatterplots), various tetrahedral and prismatic models, harmonic spirals and helixes, plus lattices (‘Tonnetze’). This list is probably far from exhaustive.
Some visual models can be used interchangeably, highlighting different musical qualities. One problem, many facets.
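As an illustration of how cheaply such node-and-connector models fall out of the data: the circle of fifths is just the twelve pitch classes visited in steps of seven semitones. A minimal sketch (sharp-only spellings for brevity):

```javascript
const PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Visit all twelve pitch classes by repeatedly stepping a perfect fifth.
function circleOfFifths(start = 0) {
  const nodes = [];
  let pc = start;
  for (let i = 0; i < 12; i++) {
    nodes.push(PITCH_CLASSES[pc]);
    pc = (pc + 7) % 12; // up a perfect fifth, modulo the octave
  }
  return nodes;
}
```

The same generator, with a different interval step or a different embedding (spiral, helix, lattice), yields most of the theory-tool geometries listed above; only the layout of the nodes changes, not the data.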
Physical Models: reflecting material behaviors (such as airborne wave-forms, their derivatives and interactions), specific instrument forms (for example flute vs clarinet), and the music systems on which their configurations are based.
And hey, also score-driven. :-)
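The flute-vs-clarinet contrast mentioned above reduces, in idealized form, to one formula each: a flute behaves roughly as a pipe open at both ends (fundamental v/2L, all harmonics), a clarinet as a pipe closed at one end (fundamental v/4L, odd harmonics only). Real instruments need end corrections and bore detail; this is the textbook approximation only:

```javascript
const SPEED_OF_SOUND = 343; // m/s in air at roughly 20 degrees C

// Open-open pipe (flute-like): fundamental = v / 2L.
function openPipeFundamental(lengthM) {
  return SPEED_OF_SOUND / (2 * lengthM);
}

// Open-closed pipe (clarinet-like): fundamental = v / 4L.
function closedPipeFundamental(lengthM) {
  return SPEED_OF_SOUND / (4 * lengthM);
}
```

For equal lengths the closed pipe sounds an octave lower, which is why a clarinet plays so low for its size; a score-driven physical model needs exactly this kind of per-instrument configuration.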
Shared Resources: teachers’ and learners’ entire environments, their preferences and music. User convenience and comfort, propagated with a single click.
Score Analysis Tools: among the most instantly recognizable are arc diagrams, which identify motif patterns in a work of music and are potentially of help in nesting and prioritizing notational ‘building blocks’ for structured practice.
Many commonplace data visualizations are suited to adaptation to related purposes, the same data often finding a wide variety of visual expression.
Other Feeds: some that come to mind are dance notations, a directory of instrument craft- or hand-builders, psychophysics, construction materials and historical information.
Distractions (advertising) have on the other hand no place in a learning environment, and the data required of users is minimal. Any commercial services (teachers, instrument builders, notation) can be discreetly associated with a particular instrument model via a shopping cart icon, and users trusted to find them if needed.
Measured against broad music-cultural diversity, current online music standards and platform capabilities are astonishingly impoverished. The following diagrams hint at the scope for improvement:
- MusicXML (the W3C standard format for music exchange) is limited to 12TET and a tiny subset of just intoned music systems.
- The range of supported in-browser notation types is very limited. Though SVG fonts find increasing application, most are bound to 12TET, and their potential for direct data bindings is wholly unexploited.
- Due to MusicXML’s hard-coded fingerings, instrument models are limited to standard tunings for a tiny range of oversubscribed western instruments. Guitar, keyboards, basta.
- Theory tools have yet to find any application in direct association with source notation.
- No attempt has been made to bind in related disciplines, such as psychophysics, IoT (Internet of Things), or usage statistics.
Though often drawn towards exotic instruments, learners are challenged to find online, interactive tutors, and outstanding teachers and motivated learners are often widely geographically separated.
World musical diversity is, then, astonishingly poorly served on the internet. Whether kora, nyckelharpa, sitar, hang, bağlama/saz, duduk (or many of the thousands of other instruments in use worldwide), experienced instruction still involves a lot of travel.
Needed is a dramatic extension of current capabilities. Anything else is driving cultural collapse.
We have highlighted the focus in purely musical visualization on nodes (notes) and connectors (intervals), argued that -in the name of data transparency- 3D navigation is secondary to interactivity, and stressed the pressing need for ‘on-demand’ exchange of source data. These have profound impact on modeling environment choice. Let’s finish by closing the conceptual loop.
Particularly intriguing amongst emerging WebGL-based approaches is Stardust, whose API and focus on solving data bindings bear similarities to those of d3.js, but which leverages the GPU — all while claiming to remain platform agnostic.
Where D3 provides the better support for fine-grained control and styling on a small number of items, Stardust is profiled as being good at bulk-rendering-and-animating large numbers of GPU-rendered graphical elements (known as ‘marks’) via parameters. Instead of mapping data to the DOM, Stardust maps data to an array of these marks.
Indeed, in this context, Stardust’s creators suggest a mix of the two in applications: D3 being used to render (for example) a scatterplot’s axes and handle interactions such as range selections, but Stardust used to render and animate the points.
Stardust complements rather than replaces D3.js, and developers with D3 expertise are expected to adapt easily to it. In sum, for those few cases where structural animation is a must, Stardust looks pretty damn good.
Well, that’s it for now. If you have any interest in music visualization, you’ve seen at least as many opportunities overlooked as visited.
Everything seen here is, though, part of an open source, non-profit, digital commons vision for the musical future. Implementation is under way, but desperately short of funds. Feedback, clapping and sharing with abandon would be much appreciated.