I dislike blogs. Usually they are a lot of blather. Maybe I can count on myself to stick to a point, but blogs have the problem of being unstructured: bits of information scattered here and there on no organizing principle except the order in which the writer thought of them. They appeal to the checkout-counter magazine mentality. Chaos and ephemera. Blogs are bad.
On the other hand, I've often wanted to let people know what is on my mind, what technical point I'm thinking about currently. To the outside world, I appear such as I do, pushing a lawnmower eternally. Internally, I'm pondering whatever is the current problem in the Odracam project. For example, I may be imagining myself at the center of a sphere with sonic vectors pointing at me, and I'm trying to push their elevations up and down by discrete amounts so as to achieve a balance in their projections onto the z-axis. As you lie in your hammocks, sipping mojitos in the shade, watching me mow, I'd like you to know that. So, here is my explosion of high-class bunk. It will be about what part of the Odracam project is on my mind at the moment.
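To give a flavor of that sphere puzzle, here is a toy sketch in Python (hypothetical names, nothing from the actual Odracam code): a greedy search that nudges elevations in discrete steps until the vectors' z-axis projections roughly cancel.

```python
import math

def balance_elevations(elevs_deg, step=5.0):
    """Nudge elevations (in degrees) up or down in discrete steps so the
    unit vectors' projections onto the z-axis sum as near zero as the
    grid allows.  Greedy: take the single nudge that most reduces the
    imbalance; stop when no nudge helps."""
    def z_sum(es):
        return abs(sum(math.sin(math.radians(e)) for e in es))

    elevs = list(elevs_deg)
    while True:
        best, best_move = z_sum(elevs), None
        for i in range(len(elevs)):
            for d in (step, -step):
                if -90.0 <= elevs[i] + d <= 90.0:   # keep elevations physical
                    trial = elevs[:i] + [elevs[i] + d] + elevs[i + 1:]
                    if z_sum(trial) < best:
                        best, best_move = z_sum(trial), (i, d)
        if best_move is None:
            return elevs
        elevs[best_move[0]] += best_move[1]
```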
The big picture is: I've got version 1.0 of the Írim orchestra working. I want to proceed to the big harmony-counterpoint programs, the largest, most complex, and most innovative parts of the project. They are almost completely planned. I need to set up Python, learn that jazz, and realize them. But, there is something I want to fix first...
I'd like to have version 2.0 of the orchestra in place before I start writing composing programs. There are two big improvements I'd like to make:
- I want to automate expressive timing.
- I want to re-write the routines that deal with sound in space (localization, spatialization, etc.)
A bug in Csound prevents certain timing information from being passed from the score to the orchestra, making my true legato system incompatible with varying tempo. When I reported the bug, the Csound developers solved the problem by changing a few words in the manual, glossing over the inadequacy of Csound's score preprocessing routine. I'll save the details for another post. For a long time, I mulled over solutions that would let me have varying tempi. Finally, I saw a way of rewriting Csound's timing system, or, more accurately, of inserting my own timing system on top of theirs. That opened up further possibilities: a system for declaring meter, groove, and expressive tempo curves in one place in the score; greatly simplified timing and loudness macros, including articulations like staccato and overlap legato; and other features like soundtrack synchronization.
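To illustrate the kind of timing layer I mean, here is a sketch in Python (my own illustration with hypothetical names, not Csound's score preprocessor): a tempo map of (beat, bpm) breakpoints, linearly interpolated, converting the score's beats into the real seconds the orchestra needs.

```python
import math

def beat_to_seconds(beat, tempo_map):
    """Seconds from beat 0 to `beat`, integrating 60/bpm over a tempo
    map of (beat, bpm) breakpoints, linearly interpolated between."""
    seconds = 0.0
    for (b1, t1), (b2, t2) in zip(tempo_map, tempo_map[1:]):
        if beat <= b1:
            break
        end = min(beat, b2)
        t_end = t1 + (t2 - t1) * (end - b1) / (b2 - b1)  # bpm at `end`
        if t_end == t1:                       # constant tempo on this stretch
            seconds += 60.0 * (end - b1) / t1
        else:                                 # exact integral of 60 / linear bpm
            slope = (t2 - t1) / (b2 - b1)
            seconds += (60.0 / slope) * math.log(t_end / t1)
        if beat <= b2:
            return seconds
    b_last, t_last = tempo_map[-1]            # hold the final tempo afterward
    return seconds + 60.0 * (max(beat, b_last) - b_last) / t_last

# e.g. accelerating from 60 to 120 bpm over eight beats:
# beat_to_seconds(8, [(0, 60), (8, 120)])  ->  8 * ln(2), about 5.55 seconds
```

It was very exciting, but there is one other thing I need to fix first...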
I'm not happy with the sound quality of the orchestra. It sounds too diffuse to me. For spatialization, I built physical models of rooms, thinking that a room will sound good if it sounds real. It won't, necessarily. Reading about David Griesinger's work, I became convinced that phase scrambling is to be avoided, so I want to overhaul all the sound-generating instruments and the formant system to make sure I'm using minimum-phase or linear-phase filters throughout the whole chain.
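One concrete piece of the phase question, for the curious: a symmetric FIR filter is exactly linear-phase, so every frequency is delayed equally and the waveform's phase relations survive. Here is a sketch using SciPy (my illustration only; the orchestra's actual filters are Csound opcodes).

```python
import numpy as np
from scipy.signal import firwin, fftconvolve

fs = 48000
taps = firwin(511, [300, 3000], pass_zero=False, fs=fs)  # linear-phase bandpass
assert np.allclose(taps, taps[::-1])  # symmetric taps are what make it linear-phase

x = np.random.randn(fs)      # one second of noise as a stand-in signal
y = fftconvolve(x, taps)     # every component delayed equally: (511 - 1) / 2 samples
```

But, there is something I want to fix first...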
I'm happy with large parts of the existing sound-in-space system. I have systems for binaural output (for headphones) and transaural output (for standard loudspeaker arrays, using Ambisonics). The binaural and Ambisonic technologies are wonderful, and I'm glad to use them. I also have in place a system whereby composers can give simple, short text commands to describe the positions and motions of sound sources and listeners in the virtual space. You can place them or move them by vector, in all three dimensions at once or one at a time, in any of three coördinate systems (Cartesian, spherical, or three orientations of cylindrical), even mixing systems at the same time, with a choice of origins (center-of-room, center-of-floor, or listener point-of-view), in units of meters or percentage of the distance to the wall. That's all one might want. Oh, and you can make the listener's head wag or rotate (for Leslie effects, I'm thinking). I'm happy with that.
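For the technically curious, the conversion behind those placement commands looks in essence like this sketch (hypothetical names; the real command syntax is different): spherical or cylindrical coördinates, with a selectable origin, reduced to the Cartesian meters the renderer ultimately needs.

```python
import math

def to_cartesian(coords, system="spherical", origin=(0.0, 0.0, 0.0)):
    """Reduce a position to Cartesian meters relative to a chosen origin
    (say, center-of-room vs. center-of-floor).  coords is (x, y, z),
    (azimuth deg, elevation deg, radius), or (azimuth deg, radius, height)."""
    if system == "cartesian":
        x, y, z = coords
    elif system == "spherical":
        az, el, r = coords
        az, el = math.radians(az), math.radians(el)
        x = r * math.cos(el) * math.cos(az)
        y = r * math.cos(el) * math.sin(az)
        z = r * math.sin(el)
    elif system == "cylindrical":  # the z-axis orientation; the other two permute axes
        az, r, h = coords
        az = math.radians(az)
        x, y, z = r * math.cos(az), r * math.sin(az), h
    else:
        raise ValueError(f"unknown system: {system}")
    ox, oy, oz = origin
    return (x + ox, y + oy, z + oz)

# a source 3 m out at azimuth 30 deg, elevation 10 deg, from a 1.5 m-high room center:
# to_cartesian((30, 10, 3), "spherical", origin=(0.0, 0.0, 1.5))
```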
The essence of the Odracam project is to give the ear what it wants. Beauty is far more important than realism or anything else, but I had not been following that dictum in this department. My real interest is in high-level music theory ideas, not in low-level stuff like synthesis and virtual performers. I've taken pains in these matters only because I know people only engage with the surface level of things—if the presentation isn't pretty, they won't appreciate the deeper structures. So, when it came time to handle sound-in-space, I suppose I glossed over it a bit and reached for a simple solution. But, what exactly is beauty in the auditory spatial senses? What does the ear want in this matter?
Answering this pulled me into a study of concert hall acoustics. I spent months reading books and journal articles. It was more interesting than I had expected. I pieced together a picture of spatial perception that includes many different senses with names like clarity, spatial impression, reverberance, strength, envelopment, atmosphere, presence, openness, and so on, each related somehow to the type, direction, timing, level, and spectrum of reflections reaching the listener in a reverberant space, and all of this beyond the ten or so familiar localization cues. I began to see that these small senses fall into three categories:
- localization (how we know where a sound originates),
- positionization (how we know where we are in the space), and
- spatialization (how we know what kind of space we are in).
In all of these, the ear derives pleasure from certain patterns of reflection. I saw ways of maximizing the pleasure, and gradually came to see where the tradeoffs should be and how artistic parameters should be arranged to control the experience.
I'm still working out the details. I have figured out how to describe angles in perceptual units and how to calculate the timing, levels, and directions of the late and early reflections, except for the remaining detail of the early reflections' elevations. I have a strategy that I think will maximize the relevant pleasant sensations (mainly envelopment, positionization, spatial impression, and openness). I need to express it as a detailed algorithm. Then I'll move on to spectral controls of reflections, distance cues and how they relate to perceptual distance units, and finally clearing up some finer points of localization. Then I'll write it all up into the appendix of the manual that explains how the sound-in-space system works. Then I'll write up the user's manual chapter about how to use it. Then I'll write the code that does what the manual says it will do. Then it will be part of Írim orchestra 2.0. Then I move on to carefully removing any phase scrambling from the signal path. Then the big expressive timing system and a few little fixes before I write the user's manual and take 2.0 on a shakedown run. Then I can begin the harmony program that I'm really dreaming of.
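Since the early reflections are where the current work sits, here is the textbook starting point for their timing and levels: a first-order image-source sketch for a shoebox room (standard acoustics and hypothetical names, not my actual strategy).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def first_order_reflections(src, lst, room, refl=0.9):
    """src, lst: (x, y, z) positions in meters; room: (Lx, Ly, Lz) dimensions.
    Returns (delay_seconds, relative_level) for the six first-order wall
    reflections: mirror the source across each wall, then the delay is the
    path length over the speed of sound, and the level falls as 1/distance
    times the wall's reflection coefficient."""
    out = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(src)
            image[axis] = 2.0 * wall - src[axis]   # mirror across the wall plane
            d = math.dist(image, lst)
            out.append((d / SPEED_OF_SOUND, refl / d))
    return sorted(out)

# e.g. a 10 x 7 x 3 m room, source and listener both 1.5 m up:
# first_order_reflections((2, 3.5, 1.5), (6, 3.5, 1.5), (10, 7, 3))
```

Each image's direction as seen from the listener gives that reflection's incoming angle, elevation included; choosing those elevations well is exactly the open detail mentioned above.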
But, this is where I am. I'd intended to make these blog posts much more technical, with details about what I'm working on right now (like the elevation of early reflections), but this first one has necessarily been an overview.
I was going to wait until the harmony-counterpoint programs were tested before starting a website, but a dear old friend of mine encouraged me to put one up now, before I'm ready. I saw the wisdom in that. I've been doing this work for a good thirty years, mostly in isolation. It will be a help just to have people know what is on my mind. So, back to work... but first, I must learn Joomla.