This graphic presents a demonstration of piano rolls for separate instruments, which our J-DISC tools should be able to identify in future development stages.
MIR tools – current capability in the field or of J-DISC MIR
1. A highly regarded recording engineer has stated that he did not actually do many jazz recordings credited to him by a canonical jazz label. But he did not specify which ones he did or did not record. Using a given recording studio ambiance quality as a ground truth, we could search all recordings on that label during the pertinent time period and determine which ones he actually did. Further research could then compare A&R or engineering quality at the other sessions, and ask what the significant differences might be. Still further but feasible work would be to characterize the typical recording ambience at a given level of sound technology—and thus to be able to pinpoint, within a few years, the date of a given recording as a classifier for research purposes.
2. Find every recorded track, from a sample of 20, in which a given instrument takes a featured solo improvisation. If the instrumentalist were known, it would be possible to identify where to find that artist’s personal statement, and avoid listening to tracks where s/he was only accompanying others.
3. This task follows from #2, but for a larger sample: Find every performance in which an artist takes a featured solo improvisation, and is not simply an accompanist, across the entire recorded output in which that artist nominally takes part, even if it runs to thousands of songs (something traditional discographers, as human listeners, would never be able to work through).
MIR tools – projected for next phase of MIR work
4. A library patron working with a large data set of audio files can use J-DISC MIR tools to obtain information on various attributes of their data, such as information on pitch, rhythm, timbre, and performance attributes.
5. Find all bass and drum solo breaks with Paul Chambers as bassist, within a specified range of dates, or with a particular producer.
6. Taking #5 further, analyze all Paul Chambers bass solos from a specified date range, and plot his performances over the compass of the instrument. Determine if there are relevant differences in his performances, and whether they correlate with any other data points. One result might be that he is more likely to play more material in the highest registers when tempo is over 180 bpm.
7. Find every instance of a carefully defined and modeled expressive trait, such as a glissando or bend, used by modern jazz singers active in New York in the 1950s and ’60s who sought to imitate the technical prowess of instruments, and generate graphics showing the typical musical “shapes.” Develop the resulting visualizations to instruct fledgling performers in their use.
8. From a representative sample of jazz recordings, compare the use of an innovative melodic trope, the fourth triad (for example, d-g-c, e-a-d, or all transpositions), before and after 1961, when John Coltrane introduced it and a system of harmonic progression based on it. It is expected that it would be used far more after Coltrane’s recorded oeuvre introduced it, but with a lag of several years as other musicians absorbed and developed the system. The result would be to provide direct evidence of Coltrane’s influence, and to suggest ways to study the dissemination of musical ideas through recordings.
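As a rough illustration of how the “fourth triad” query in #8 might be posed once note-level transcriptions exist, the sketch below scans a melody for three consecutive notes whose pitch classes form stacked perfect fourths, in any transposition and any order. It assumes the melody is already available as MIDI note numbers, and the function names are hypothetical, not part of any existing J-DISC tool.

```python
def is_fourth_triad(notes):
    """True if three notes reduce to stacked perfect fourths (e.g. D-G-C)."""
    pitch_classes = {n % 12 for n in notes}
    if len(pitch_classes) != 3:
        return False
    # A perfect fourth is 5 semitones; try each pitch class as the bottom note.
    return any(
        pitch_classes == {root, (root + 5) % 12, (root + 10) % 12}
        for root in pitch_classes
    )


def find_fourth_triads(melody):
    """Return start indices of every 3-note window that is a fourth triad."""
    return [
        i for i in range(len(melody) - 2)
        if is_fourth_triad(melody[i:i + 3])
    ]


# Toy melody (MIDI numbers): D4 G4 C5 E4 F4 E4 A4 D5
melody = [62, 67, 72, 64, 65, 64, 69, 74]
hits = find_fourth_triads(melody)  # windows starting at D-G-C and E-A-D
```

Counting such hits per recording year, before and after 1961, would be one simple way to frame the comparison. A real study would also need to decide whether non-adjacent notes or chordal voicings count.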
9. Find all jazz recordings made within a 6-month period. Using J-DISC text search, indicate stylistic diversity and patterns in personnel, instrumentation, and repertoire (including the repertoire titles themselves). Use J-DISC MIR tools to support investigation of rhythm accompaniment styles or basic harmonic format. As a next step, compare this sample with an appropriate 6-month sample earlier, or later; or compare these trends with respect to jazz recordings in a different geographic location at the same point in history.
10. Access all of Charlie Parker’s recordings of How High the Moon/Ornithology, then run queries identifying elements of his approach to the song such as standard material, experiments and recombinations of standard material, or “singularities” (unique and original creative moments).
11. Assume that it is possible to profile a given artist’s solo style by their timbre, attack, or repeated melodic material. With a single query, a researcher or fan would be able to bring back all instances available on the Internet in which two musicians played with each other, based on their sonic profile alone and not dependent on any text annotation of the individual resources.
12. For musical repertoires that do not use a musical score as a central performance document (such as jazz and improvised music, much electronic music, and some non-Western repertoires), MIR tools can provide valuable information about formal structure and data for comparative musical analysis. For instance, a researcher could examine the frequency of motivic patterns across several performers to determine what material might be widely used in common, and what material is unique to a particular performer or situation. Taking the concept in reverse, it would be possible to produce graphic “listening scores” of performances, which might illuminate features of the music not easily caught in the moment of local listening.
13. For musical repertoires that do use musical scores as a central performance document, such as classical repertoire, MIR tools could provide interesting data for analysis on how the score (a standardized set of musical performance instructions) is actually brought to life as a musical performance (since performers almost inevitably add their own expressive features in realizing a notated score, in all forms of music). Work has already been done in comparative mapping of tempi by different artists in performance. MIR tools would open up further possibilities in this type of research.
14. Find, within the full available corpus of a trumpet player’s recordings, the exact repeated elements among the melodic fragments or riffs he or she produces in their solo improvisations, and thus highlight in turn which elements were created spontaneously within the improvisational process.
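To make the idea in #14 a little more concrete, here is a minimal sketch of one way repeated melodic fragments could be found in a transcribed solo. It assumes the solo is available as a list of MIDI note numbers and compares interval patterns rather than absolute pitches, so that a riff repeated in another key still matches. The function name and parameters are hypothetical, and rhythm is ignored entirely.

```python
from collections import defaultdict


def repeated_fragments(pitches, length=4, min_count=2):
    """Map each repeated interval pattern to the note indices where it starts.

    `length` is the number of notes per fragment, so each pattern
    holds `length - 1` intervals (in semitones).
    """
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    n = length - 1
    positions = defaultdict(list)
    for i in range(len(intervals) - n + 1):
        positions[tuple(intervals[i:i + n])].append(i)
    return {
        pattern: starts
        for pattern, starts in positions.items()
        if len(starts) >= min_count
    }


# Toy solo: the figure C-D-E-F returns later, transposed, as G-A-B-C.
solo = [60, 62, 64, 65, 70, 67, 69, 71, 72, 58]
riffs = repeated_fragments(solo, length=4)
```

Fragments that never recur would then be candidates for the “spontaneous” material, although in practice a riff may be varied slightly each time, so exact matching like this is only a starting point.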
15. Using the full available corpus of two saxophonists’ recordings, in cases where one was said to be derivative of the other, determine what actual melodic material they both used, identify portions unique to each one, and thereby gather evidence to support evaluations of the artistic originality of the “junior” saxophonist.
16. Numerous private recordings from the loft jazz scene in New York during the early 1970s have been passed down to the artists’ surviving family members or younger participants. No one knows the exact venue and personnel of most of the tapes. Though it is clear there is overlap among the individual recorded sessions, no single party has all of them. After digitization, the solo personnel and combinations could be identified based on ground truth samples of those artists’ commercial recordings. In addition, the duplicates could be identified and differing venues coded by the ambient sound, ranked and selected according to quality, and the exact location and date could be correlated with personnel pending further historical research. When this process was complete, the heirs would be prepared to issue the material commercially in a single package or branded series of issues.
17. Using discographic metadata to identify every recording of a single, widely recorded big band arrangement, or item of standard New Orleans Ragtime repertoire, each version could be searched and marginal variations in the performances identified. With the ability to pinpoint these variations, researchers could ask whether the arrangers or bandleaders were more concerned with adding new solo improvisations or revising the written portions, or explore many other details of performance such as tempo, inflection, and intonation.
18. Using identification and analysis of song forms by their bar lengths, research within a large set of given recordings can uncover patterns in stylistic innovation and ask how and when innovations, carefully defined, might be adopted. For example, it is assumed that early modern jazz composers of the 1940s adhered to the standard 32-bar song forms divided into 8-bar units favored by Tin Pan Alley and Swing composers as a basis for improvisations. Progressive jazz that emerged slightly afterward was marked by bar lengths and song formats that included odd numbers of bars or unconventional combinations of bar lengths. After choosing a large corpus of jazz recordings made within the relevant time span, researchers could explore how widespread the change to uneven bar lengths was after the first recorded innovations were disseminated, and how quickly or slowly the change was adopted throughout the whole profession.
19. Currently, many music scores are slowly being digitized as score images. The next phase should include the actual encoding of the data from the scores (using such tools as the Music Encoding Initiative, MEI). This could provide a researcher with a large data set of both music notation (via the encoded score data) and audio performances (via the MIR tools analysis of the audio files). A researcher could cross-reference melodic motives in Mozart piano sonatas, for example, with historical patterns in the live performance of those motives. It might then be explored whether there are certain melodic figures uniformly treated with rubato.
I have listed below some examples of research questions that the current live, text-only version of J-DISC can help answer.
Yes, current discographic sources in print or online can help yield answers to these questions. The same answers could be found, however, using J-DISC in one or two queries, and would be automatically counted.
I invite you, then, simply to read these sample text queries, and then to imagine what would be possible if all the recordings they refer to were themselves available, accessible, and amenable to classifiers and/or visualizations . . .
Use case examples
In “Advanced Search” in J-DISC, find:
I came across two articles by Matthew Butterfield, a professor of music theory at Franklin & Marshall College, that should be interesting for us in surveying analyses of jazz, and in resorting to observations or measurements using waveforms as evidence.
One of the articles is titled “Why Do Jazz Musicians Swing Their Eighth Notes?” and, while the question can hardly ever be answered definitively, it is a good one to be asking in analyzing jazz rhythm. Here is the citation:
Butterfield, Matthew W. “Why Do Jazz Musicians Swing Their Eighth Notes?” Music Theory Spectrum – The Journal of the Society for Music Theory 33.1 (Spring 2011): 3-26, 107.
Butterfield makes some convincing points on a subject, jazz rhythm, that often seems to produce vague generalities and mysticism in the literature on jazz. Carefully examining ratios between different parts of soloists’ and drummers’ patterns of eighth notes, or eighth notes and quarter notes (which he calls the “Beat/Upbeat Ratio”), he goes on to show how soloists use minute variations in the ratios to embellish their phrases. That in turn suggests that they do so in distinctive ways, and that a particular artist’s expressive effects in this realm might be profiled, a possibility we should discuss further.
But my chief concern in this post is where and how Butterfield derives his evidence. He investigates audio waveforms for the nuances of timing he seeks to observe. Here is a footnote from the above article, p. 166, on his method for observing changes in the Beat/Upbeat ratio:
“BUR values in all musical examples included in this study were calculated by the author. The digital sound-editing program Audacity was used on a Macintosh computer to identify the onset of each note from a visual and aural analysis of its waveform. From these figures, IOIs between successive notes were defined and then employed to calculate the BUR values. There is inevitably some degree of uncertainty in identifying the attack point of each note, as noise and other onset ambiguities can render an exact determination impossible. By employing a consistent set of criteria to resolve ambiguities, I am confident that my figures are accurate to within ±.5 milliseconds, which translates into a BUR value accuracy of ±.05 at the tempos shown in Examples 4 and 5.”
Butterfield’s claims depend on his ability to identify onsets of given parts of the beat in a soloist’s performance. How can he be sure he is that accurate? Is it because he is simply looking at a single point in musical time, aided by listening to the recording itself at that point, without aggregating it statistically (which we must do to characterize whole files)? What are his “criteria for resolving ambiguities”? Is his manner of determining an onset useful to us in some fashion?
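To see why onset accuracy matters so much here, it helps to write the calculation down. The sketch below assumes the usual reading of the Beat/Upbeat Ratio: the inter-onset interval (IOI) of the eighth note on the beat divided by the IOI of the upbeat eighth that follows. The function names are hypothetical, onsets are taken in milliseconds, and the error model (only the upbeat onset is misplaced) is a simplifying assumption of mine. It is not an attempt to reproduce Butterfield’s own figures.

```python
def bur(beat_onset, upbeat_onset, next_beat_onset):
    """Beat/Upbeat Ratio from three successive onsets (same time unit)."""
    beat_ioi = upbeat_onset - beat_onset
    upbeat_ioi = next_beat_onset - upbeat_onset
    if beat_ioi <= 0 or upbeat_ioi <= 0:
        raise ValueError("onsets must be strictly increasing")
    return beat_ioi / upbeat_ioi


def bur_bounds(beat_onset, upbeat_onset, next_beat_onset, error):
    """BUR if the upbeat onset alone were marked `error` early or late."""
    low = bur(beat_onset, upbeat_onset - error, next_beat_onset)
    high = bur(beat_onset, upbeat_onset + error, next_beat_onset)
    return low, high


# A 300 ms beat split 200 ms + 100 ms gives a "triplet swing" BUR of 2.0.
ratio = bur(0, 200, 300)
low, high = bur_bounds(0, 200, 300, error=5)
```

Because the upbeat onset sits in both the numerator and the denominator, a small timing error moves the ratio in both at once, and the effect grows as the tempo rises and the IOIs shrink. That is one reason the stated tolerance deserves a close look.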
I have similar questions in another equally interesting article by this author. This one is open access:
Butterfield, Matthew. “Participatory Discrepancies and the Perception of Beats in Jazz,” Music Perception: An Interdisciplinary Journal, Vol. 27, No. 3 (February 2010): 157-176. URL: http://www.jstor.org/stable/10.1525/mp.2010.27.3.157
This piece tests the contribution of “participatory discrepancies,” or subtle rhythmic shifts ahead of or behind the beat, to the effect of swing in jazz. The idea has weight in ethnomusicology, based on the widely known work of Charles Keil, who sees the inherent rhythmic asynchronicity in music as an attraction to it, an invitation to become part of and shape the interaction. (See Keil, Charles, “Participatory Discrepancies and the Power of Music,” Cultural Anthropology, 2, pp. 275-283.) Discrepancies between instruments have also been a central concern of ours, even if our agenda was to ask how they might affect beat-tracking software and identifying individual artists (especially drummers), rather than delving into how they fuel the fundamental human experience of playing or listening to music.
Butterfield’s main concern is whether listeners, even ones with limited musical training, can perceive these discrepancies, and he describes some elaborate tests with real subjects. His answer seems to be “not much.” His conclusion may be valid within the terms of his own research design, but it is a straw man: a misguided attempt to directly and empirically test what for Keil was more of a philosophical argument and musical manifesto. Average listeners may not be able to accurately tell whether a bass is behind a drummer, but the practice of playing ahead or behind itself is beyond question, in more kinds of music than jazz (and listeners may sense it even if they cannot articulate or analyze what they hear). Arguing this point more carefully would take us too far afield. (Butterfield himself makes some very valid points toward the end of this article, once he leaves testing with human subjects behind and proceeds from his own observations on the performance discrepancies themselves, which deserve another post here.)
What is most important here is that, on page 163, Butterfield once again refers to his own observations using waveforms in a rendering of a jazz recording in Audacity. In this case, he is interested in identifying lags between the accompanying bass and drums, and determining who might be ahead, so that he can test whether listeners perceive the discrepancies.
Here Butterfield claims not only to be able to tell where an onset is, but to distinguish different instruments within a waveform:
“Cymbal strikes in particular tend to be well defined and easy to spot—they appear ‘furry.’ Bass onsets, by contrast, are characterized by a substantial burst in wave amplitude.”
See the diagram on p. 163 to visualize this “furriness” and “burst.” Butterfield then seems to veer toward, or call out for, a beat-tracking methodology in the next passage:
“[Bass onsets] are not as clear as cymbal strikes, however, and this required formulation of a consistent procedure to define them. To this end, determination of each bass onset would begin with an onset hypothesis, placing it tentatively at the peak of the first wave whose amplitude departed significantly from the prevailing shape preceding it. A careful aural analysis of the beat ensued, working backwards and forwards from that point and adjusting it in accordance with aural evidence for an earlier or later onset until it could be determined with confidence to within ±5 ms. Any beat where the bass onset could not be determined with confidence to within this interval was omitted from analysis.”
He seems to acknowledge the need to adjust moment-to-moment rhythmic events to a virtual or average beat, but then proceeds to do so intuitively or ad hoc, it seems to me.
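For comparison, the first step of the procedure Butterfield describes, the “onset hypothesis,” is the part that could most easily be made explicit and repeatable. The sketch below is my own loose reading of it, not his method: it treats the mean absolute amplitude of the preceding samples as the “prevailing shape,” flags the first sample that exceeds it by a chosen factor, and then climbs to the local peak. The window size and factor are arbitrary assumptions, and the aural adjustment that follows in his procedure is not modeled at all.

```python
def onset_hypothesis(samples, window=50, factor=3.0):
    """Index of the first local peak that departs from the preceding level.

    `samples` is a list of amplitudes; returns None if nothing stands out.
    """
    for i in range(window, len(samples)):
        # "Prevailing shape": mean absolute amplitude of the preceding window.
        baseline = sum(abs(s) for s in samples[i - window:i]) / window
        if abs(samples[i]) > factor * max(baseline, 1e-9):
            # Climb forward to the peak of this burst.
            j = i
            while j + 1 < len(samples) and abs(samples[j + 1]) >= abs(samples[j]):
                j += 1
            return j
    return None


# Toy signal: a quiet floor, then a burst that peaks at index 102.
signal = [0.01] * 100 + [0.2, 0.5, 0.9, 0.6, 0.3] + [0.1] * 20
tentative_onset = onset_hypothesis(signal)
```

Writing the step down this way at least exposes the choices (how long a window, how large a departure) that remain implicit in a by-eye, by-ear procedure.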
So there are some common points of approach between Butterfield’s work and ours. If these methods of gathering evidence are in question, what would it say in the end about his otherwise compelling arguments about rhythmic dynamics in jazz? Are there quantitative terms or observations about jazz practices that could be useful for us in profiling certain artists, or analyzing whole performances, or major structural parts?
Just as a point of interest, here’s an example of a human-generated, heavily annotated transcription of a jazz performance:
Sancticity, Scofield solo analysis
transcribed by Bert Ligon
It’s clear that in the near future we’ll be able to generate a machine transcription that more or less matches this one in terms of notation. But it’s also clear that “which note, when” is just the very tip of what it might mean to analyze a performance.
Much of our discussion lately has been about biases of various sorts in MIR tools and how to avoid/fix them. Here’s a paper that touches on many of the topics we’ve been thinking about:
George Tzanetakis, Ajay Kapur, W. Andrew Schloss, Matthew Wright
Plus many papers on similar themes at:
We are now hosting a general discussion list for people and machines interested in exploring the application of MIR (Music Information Retrieval) techniques to jazz:
As Tad mentioned in previous posts, rhythmic variation is one of the key distinguishing features of jazz. We’ve been specifically interested in analyzing how drummers can create a different feel by playing “ahead of” or “behind” the beat (whatever that might mean). In this post, I’ll cook up a visualization method to give a detailed view of rhythm and tempo variation.
DAn commented on my last post agreeing that it was circular to ask whether a drummer is “ahead” or “behind” the beat, when it is precisely the drummer we may be relying on to determine the beat. That led me to want to snip the loop of this circularity in this post. Let’s leave aside the idea of a drummer’s relationship to some abstract temporal frame of reference and focus on what drums alone are doing.
Can we simply separate the drummer’s actual statement of the beat and examine that on its own? Or perhaps use this beat as the reference point for other musically meaningful events?
For a reliable, highly conventional statement of an actual tempo within many jazz performances, we would take the quarter note part of the cymbal “ride”. The familiar figure is:
dang dang-da dang dang-da-dang dang-da-dang
Which would be notated this way:
The quarter notes in 4/4 time would be the “dang” part. It is true that the second and fourth dang in each bar do not sustain for a full quarter note. They are paired with a “da.” But each is struck on a quarter-note beat in 4/4, creating a steady tempo.
To do this, we could look only at the exact onset of the cymbal sound (given that we had the ability to separate this feature, a big, big “if,” I realize).
If these real, not abstract, quarter notes could be extracted, perhaps we could:
This could help profile or identify percussionists, or how artists play together in specific ensembles. In addition, perhaps these drum beats could serve as criteria for creating musically meaningful segments that could then be analyzed.
After working on beat tracking for whole performances, our conversation turned to how to think about the chief rhythmic instrument in a conventional jazz group. The drummer states the tempo for the group. He also varies it in subtle but important ways, and plays contrasting figures against it. This holds for early jazz and fusion jazz and everything in between. It is part of the rhythmic expression and dynamism that is central to jazz (and many other types of music). This level of rhythmic variation is also a challenge for MIR: it may confuse the effort to establish an overall metric for performance tempos, and also generate complex patterns that could be difficult to identify reliably.
We proposed that if we could compare different drummers with different approaches to timing and tempo, we could begin to sharpen our understanding of these musical features as they relate to MIR. We would select drummers who may tend to play “ahead” or else “behind” the beat. I mentioned Tony Williams and Elvin Jones, respectively, as having these contrasting approaches to timekeeping and rhythm. Orientation toward the beat is an important point of discussion among musicians negotiating how to play together. It is also a factor in our current discussion of how an individual local event or section of a performance deviated from the average tempo in our beat-tracking work. Investigating them should help refine the beat tracking tools we have. It can help profile artists or ensembles by rhythmic style. And it might even help in the more strictly musicological aim of uncovering the techniques that serve expression, and providing demonstrable evidence for, or against, interpretative and evaluative musical terms in jazz.
We agreed, however, that we have to define more closely what we mean by a musical event or performance being ahead or behind the beat, and whether there are other phenomena involved outside of strict matters of tempo that need to be distinguished and defined separately as part of the analysis. What follows makes a tentative step toward such definitions and remarks on the difficulties in reaching them.
I assume that, as part of Brian’s work on feature extraction, we will be able to distinguish and measure what the drummer, or his cymbal in particular, is playing with respect to the tempo. Drummers play very regular patterns as part of their role of keeping time. Even if they vary their figures, they state a pulse as part of their standard “ride” pattern. I assume also that if deviations from the average beat can be measured, the deviations of a certain instrument, like the cymbal, can be as well.
Based on that groundwork, what does it mean to play something ahead of the beat, whether in one single instance, or to do so as a marker of “style”? The word implies positive but subjective qualities like having energy, pushing the group forward (precisely in the face of the potential chaos or entropy in group improvisation). But in conventional usage it is poorly defined. “Ahead,” yes, but of what?
I already implied that it could mean playing ahead of an average beat for the performance. But then there may or may not be issues of measurement error when the tempo itself may be changing, or when it is precisely the drums whose statement of the tempo we are relying on to determine the average tempo. Ahead of other performers? It is plausible that a percussive event could take place before some other musical one by a different instrument. But to say that at some given time one instrument plays “ahead” of another implies reference to some established tempo against which both can be measured, leading us back to the question of how tempo is established in the first place. Our beat-trackers do this fairly effectively for short term variations from the average tempo. But do we have a way to “view” or reference that grid for the short samples we will be starting with? Perhaps we can find a valid way to establish tempo that is tailored to such a short experimental segment.
Given that we can establish such references to an underlying beat, a reservation I (and DAn) still have about speaking of a given event or percussive stroke as being “ahead” is that it may not actually be ahead of anything. There are presumably volumes written about what an “accent” is and how it “should” be played. But it seems to me possible that accenting some note may create the impression of being ahead of some other instrument or tempo when it is not actually so, just as shading in black and white creates the illusion of color. To strike something a little harder in one note in a series may seem to give it more “energy” (as it does to the overall performance). It may actually have more energy, in the sense that the change is measurable in amplitude and perhaps frequency of overtones. Yet on the other hand to strike something harder may actually require or occur with a relatively earlier attack, confusing the issue further. In my work on Lucky Thompson’s style of accents, I assumed that accenting a note gave at least the impression of its being slightly ahead of other notes in the series, and of the overall rhythmic pulse, and then simply noted where I thought the accents were in a transcribed solo (Shull, “When Backward Comes Out Ahead,” Annual Review of Jazz Studies 2003).
These issues in determining what is ahead of the beat may have their mirror image in what figures or events might be “behind” the beat. Is it bad to be “behind” something? Not in jazz or African-American music in general. Being “laid back” implies being soulful as opposed to aggressive; it means allowing space for the soloist to develop his or her ideas; and it suggests a sense of “holding back” or suspense that is an ingredient of storytelling.
But once again, we have to ask: behind what? The question of the frame of reference holds in this case as well as that of figures or performances that are ahead. Can a drummer be consistently behind the beat and keep from losing the tempo or momentum altogether? Or is it possible that, like the subjective feeling of being ahead, being behind is another kind of artistic illusionism?
I say this because when I listen to drummers who at face value seem to have this quality, I do not necessarily hear that they are striking the instrument, say a cymbal, any later than might be expected from the overall tempo, the other instruments, or what other drummers are doing. They may strike the cymbal more or less on the beat, but they allow it to sustain longer. This may be an effect of the choice of cymbal or stick, and of many other factors. The point is that a statement of pulse that feels “relaxed” or “behind” is really “broader” rather than actually “later.” In fact, I believe this sostenuto (sustained) effect creates the impression of ease in the work of many singers, whether jazz or not, who thus get a supple feeling without seeming to lose or float entirely outside the beat. I certainly hear it in Elvin Jones, as opposed to Tony Williams, both of whom I have been listening to in trying to articulate these concepts.
If the timbre or sonic qualities of a timekeeping beat by these two influential drummers can create these necessary illusions, so can their more sophisticated rhythmic combinations or figures. Identifying and comparing such figures may be far in the future for us, so I will briefly suggest how more complex patterns may come into play. Williams often states quarter note quintuplets in his accompanying “comments” on the soloist’s ideas, while keeping or implying the beat. That is not implying a faster rate; it is actually moving faster. In contrast, Jones tends to use quarter note triplets, or more properly displaced grouping of eighth note triplets, which seem to slow down the rate and energy of the steady stream of eighth notes generated by and expected of the soloist (at least in bebop and progressive jazz). The two drummers’ overall approaches to stating the tempo, in other words, are reinforced by their rhythmic figures and the way they subdivide the time signature.
In addition to questions of timing and relation to an underlying tempo, we have a great deal of work ahead of us on the tonal qualities of percussion instruments in jazz. As with any instrument, we confront not just timbre, but modulation of timbre, even within a given single abstract “note.” Perhaps even something as granular as a single “attack” might have to be segmented or studied further at a micro level before we can proceed further on these questions of timing and duration. This will be tough to define and measure, but I mention it because it might help profile different artists’ characteristic timbres or “sounds.” With regard to percussion instruments, it does seem that prospects are good for being able to separate them from other instruments, based on Brian’s latest work.
Rather than trying to tackle the complex phenomena that might comprise a single attack, it might be simpler to identify the precise onset of a given percussion event in a certain recording, perhaps for a single part of a drum kit (for which Brian has developed codebooks, I believe) or for closely related parts of a drum set. We would also want to be able to ask what the duration of this event is after its onset. To begin, this might be chosen from a small segment of a single performance.
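One simple way to put a number on “duration after onset” is sketched below. It assumes we already have an amplitude envelope for an isolated cymbal stroke (a big assumption, as noted) and measures how long the envelope takes to fall below a fraction of its peak. The function name, the frame rate, and the 10% threshold are all hypothetical choices for illustration.

```python
def sustain_duration(envelope, onset_index, frame_rate, fraction=0.1):
    """Seconds from onset until the envelope first drops below fraction * peak.

    Assumes `envelope` holds a single event from `onset_index` onward.
    Returns None if the envelope never decays that far.
    """
    tail = envelope[onset_index:]
    if not tail:
        raise ValueError("onset_index is past the end of the envelope")
    peak_offset = max(range(len(tail)), key=tail.__getitem__)
    threshold = fraction * tail[peak_offset]
    for offset in range(peak_offset, len(tail)):
        if tail[offset] < threshold:
            return offset / frame_rate
    return None


# Two toy strokes with the same onset (index 1) but different decay.
dry_stroke = [0.0, 1.0, 0.5, 0.05, 0.0]
washy_stroke = [0.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.05]
dry = sustain_duration(dry_stroke, 1, frame_rate=100)
washy = sustain_duration(washy_stroke, 1, frame_rate=100)
```

Two strokes like these would be indistinguishable to a tool that looks only at onsets, yet the second one is “broader.” That is the distinction I am trying to draw between sounding behind and actually being later.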
As a next step, we could ask how these timings, or patterns of them, relate to an average beat, and compare that to other events within or across performances. Then I believe we could confidently state that event X (onset or final decay) happened at time (A) within a tempo continuum. Of course we would need to define and implement the average beat or tempo in a satisfactory way in order to reliably track what individual instruments are doing in relation to it.
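As a starting point for relating individual strokes to an average beat, here is a minimal sketch. It assumes we have a list of consecutive quarter-note cymbal onsets with none missing, over a segment short enough that the tempo is roughly steady. It fits an evenly spaced grid to the onsets by ordinary least squares and reports how far each stroke falls from its grid point. The function name is hypothetical.

```python
def beat_deviations(onsets):
    """Fit onset_k ~ start + k * period; return (period, start, deviations).

    Deviations share the unit of `onsets`. Negative means the stroke
    came before its grid point, positive means after.
    """
    n = len(onsets)
    if n < 2:
        raise ValueError("need at least two onsets")
    mean_k = (n - 1) / 2
    mean_t = sum(onsets) / n
    var_k = sum((k - mean_k) ** 2 for k in range(n))
    cov = sum((k - mean_k) * (t - mean_t) for k, t in enumerate(onsets))
    period = cov / var_k
    start = mean_t - period * mean_k
    deviations = [t - (start + period * k) for k, t in enumerate(onsets)]
    return period, start, deviations


# Five ride-cymbal quarter notes in milliseconds, slightly uneven.
period, start, devs = beat_deviations([0, 510, 1000, 1490, 2000])
```

Note that the grid here is derived from the very onsets being measured, so the deviations always average out to zero. This is the circularity discussed above in numerical form: a drummer cannot be consistently “ahead” of a beat defined by their own strokes, only ahead of a reference that comes from somewhere else.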
If these concepts were adequately defined, we might then examine one drummer or compare two of them. We could also apply what we learned to non-percussion instruments: Improvising soloists, too, have characteristic orientations toward “the beat.” If these tools were available, even far in the future, we could not only profile a given drummer, but get a more direct look at how they interact rhythmically with other musicians, or at how polyrhythmic practices create their expressive effects, with reference to hard evidence at a micro level.