The Holodeck On Your Head: A Virtual Media Studio

I was asked to give the Closing Keynote for the 2013 Audio Engineering Conference this week at Javits in NYC. The convention drew over 18,000 attendees. For my topic I selected “The Future of Audio Production 2020-2050”. Added: CNET reported on the talk. I spoke about the demise of “physical” post-production. The lecture was accompanied by around forty proprietary research graphs, not included here.
I showed (with extensive data) that by 2050, perhaps even as early as 2030, most media post-production will be performed in virtuality, where every functional piece of equipment (every knob, fader, switch, and patch point) will be visible and controllable entirely in virtual space. This paradigm will encompass film editing, sound and music editing, game production, mixing, mastering, and just about any type of aural-visual post-production and delivery.
By 2040, we’ll have mostly abandoned the mouse. Physical touchscreens will be largely obsolete. There will be far fewer physical media objects, such as external audio monitors, keyboards, trackballs, and personal desktop video monitors. Save for a quiet room, a comfortable chair, and innocuous motion trackers, the physical “production studio” will largely be a thing of the past. Certainly, a number of “legacy hardware rooms” will still exist, but they will be dying curiosities.
Bottom line: we have moved from a desktop culture to a hand-held culture, and now we are moving from a hand-held culture to a head-worn culture. Physicality will be replaced with increasingly sophisticated head-worn immersion devices. Most of these basic changes will be well in place by 2035. And by 2050, head-worn audio and visual fully spherical realism will be nearly indistinguishable from real space. Audio will be mixed for a true three-dimensional sound space (in fact, we are doing this now). Visual production will require three axes of reality (also happening today).
During this transition, perhaps the only remaining piece of CEH (clunky external hardware) will be subwoofers, which cannot be emulated with a head-worn device. By 2025, today’s emerging object-oriented 3D audio environments (Atmos, Neo, Auro, etc.) will be commodity delivery formats. By 2025-2030, head motion tracking and hand gestural tracking will also be inexpensive, matured commodities. A single desktop computer in 2050 will be equivalent to roughly 10 billion human brains working in parallel, so media processing power will no longer be a bottleneck. The 2050 Internet will be hosting roughly 10,000,000,000,000,000,000,000 bits of data per second (10 sextillion bits/s).
Production and post-production studios of 2030-2040 will give us our familiar working tools: mixing consoles, outboard equipment, patch bays, audio and visual monitors … or their real space DAW equivalents. The difference is that all of this “equipment” will live in virtual space. When we don our VR headgear, anything we require for media production is there “in front of us” with lifelike realism. It’s a Holodeck on your head. Head-worn reality. Matured gestural control (2030-2035) allows us to reach out and control anything in the production chain. Efficiency will be improved with scalable depth-of-field. Haptic touch (emulated physical feedback) will add an extra layer of realism (2030-2040), but it’s probably not necessary for media production emulation.
Anything in the virtual room can be changed with one voice or gestural command. Don’t like the sound of that Neve 8086 console? Install the Beatles EMI Abbey Road console. A ten-second operation. But why stop there? Let’s dream bigger. Call up a complete AI symphony orchestra that fills your immersive vision stage. Call up a great concert hall (let’s try the Concertgebouw. Hmm, that’s a little too swimmy. Let’s try Boston Symphony Hall). Add a 200-voice choir. Add Yo-Yo Ma soloing with his carbon fiber cello.
You’re there in front, conducting and refining the orchestra with gestural and voice commands, making refinements to the score and performance, until it becomes exactly as you want it. We achieve a complete Virtual Audio Workstation, or more precisely a Virtual Media Workstation, which can be tailored to fit any creative production goal. The future of audio, music, film making, game design, TV, industrial apps (any creative media construction, from inception to post-production) becomes truly boundless, limited only by our imagination.
Personally, I dream about being able to think music directly into a recording system: a non-invasive brain-machine interface. It turns out that this dream is moving from science fiction to reality. And if we assume a two-year doubling period for cortex sensing resolution, by the early 22nd century our non-invasive brain interfaces will be about 20 orders of magnitude more powerful than today. But will that give us the ability to think music and visual art directly into our computers? Or does it simply blur the line between our brains and our computers, so that the entire paradigm of augmented thinking and collective knowledge is radically shifted? At that point … when we have billions of devices globally networked, and each device is trillions of times smarter than the combined intelligence of all humanity … what will our species become? What will our collective thought processes look like?
Personally, I think these kinds of paradigm-shifting social questions are coming sooner than we may realize. And I think there’s both great promise and great risk with the technologies that are emerging. Or as my wife reminds me before my lectures, teach them that the heart is always more important than our technology. Or as Bryan Stevenson said, “we will not be judged by our technology, intellect, or reason.
Ultimately, the character of a society will be judged not by how they treat the powerful, but by how they treat the poor.” Nevertheless, somewhere in the future, we will create human-to-machine interfaces that respond and adapt directly to our personal imagery and creative ideas; so that one day just about anything we can imagine will become our art. Millennia Media,  


Not long ago, a talented local family asked me to produce an EP (a four-song record) of their music. I rarely produce outside work, but made an exception for this wonderful family: six sisters who sing, write, and dance. So today I was curious and called their mom. They moved to L.A., recorded a new EP, have signed a deal with Universal, have over 110 million YouTube views (UPDATE: over 300 million), have over half a million subscribers, and are often the #1 trending topic on Twitter. Their first full-length record releases next year. Much of the music will be original material written by the sisters. If their new EP is any indication, I think they are going to hit big with the 10-18 market. UPDATE: After its first day of release, their new EP has climbed to #4 on the iTunes pop

Remembering Bruce Jackson

UPDATE 9 Feb: An Australian memorial service will be held for Bruce in the Sydney Opera House on 25 Feb at 10AM. A U.S. memorial service is being planned. I will post the date and location when available. There’s a chance it will be held in the spring.
1 Feb: There has been a rumor floating around the audio industry nets today, and sadly the rumor was just confirmed true. My dear friend Bruce Jackson was killed in his private plane when it crashed in the California desert on Saturday. Bruce loved to fly his fast little Mooney single-engine aircraft. I understand that his wife, Terri, was informed last night. I last saw Bruce about four months ago. I’ve included some photos from a visit Bruce and family made to Placerville a few years ago. He took Dan and Cynthia up for a ride in the Mooney and let young Daniel be the pilot.
Besides being a genuinely sweet soul, Bruce will be remembered as one of the greatest audio engineering talents of the last four decades. After building his audio business into the largest sound company in Australia, Bruce started his U.S. career as Elvis Presley’s live sound engineer (oh, the stories he told) and was Elvis’s private jet pilot. Bruce would go on to engineer for (among countless others) Bruce Springsteen, Fleetwood Mac, Johnny Cash, and Barbra Streisand (including the legendary Millennium Concert, the highest grossing one-night performance in live music history). Bruce founded Apogee Electronics, and later Lake Technology, which was acquired by Dolby Labs. Bruce was selected as the audio designer / director for no fewer than three recent Olympics (Australia, China, Canada). The leading U.S. live sound magazine of the day did a cover story hailing Bruce as “Live Sound Engineer of the Century.”
I’ll never forget the Streisand Millennium concert (01-01-00). After the show, I was hanging out with Bruce at the front-of-house mixer (a big Midas, for you audio geeks).
Bruce had something like 100 channels of my mic amplifiers on stage (full orchestra), and Babs’s vocal path included our NSEQ parametric EQ and TCL compressor. The house (15,000-seat MGM auditorium) was now empty, but Barbra is coming back out to do “pickups” for the TV special. So Bruce needs to run back stage and asks me to babysit the console. He says, “if Barbra comes back out, un-mute her microphone.” Not long after Bruce leaves me, Babs comes walking out on stage. So I un-mute her channel and . . . . . . . SSSCCCCREEEEEEETTTCHHHHHH !!!!!!!!! Massive piercing feedback pumped into 20,000 watts of a Clair Brothers Line Array (apparently, the acoustic signature of the empty room was less absorbent and more prone to feedback, even though her audio gain hadn’t changed). It took me about five seconds to get over the shock and pull down her fader. About the only thing I remember after that was Streisand yelling something at me, and then a few seconds later Bruce sprinting back to the console!
Bruce leaves behind a great many people who will miss him terribly, including his beautiful family.
Bruce and my better half Cynthia on a walk around our neighborhood.
Cynthia getting ready for a flight around the Sierras with Bruce and Daniel.
After going through these photos, I remembered that Daniel (8?) was running around the airplane and bumped his head pretty good on the wing. I snapped this photo right after that happened. Ouch!
Dan’s first flight. Bruce let him pilot the aircraft.
Cynthia, Dan, Bruce’s daughter Brianna, his wife Terri, and Bruce. On an evening stroll in historic downtown Placerville.
Terri, Brianna, and Daniel in the hood.
Tail Number N50BJ – so long, buddy.


I’m particularly happy about a new product just announced at the 2010 San Francisco AES (Audio Engineering Society) Conference. We call it the AD-596. It’s an 8-channel analog-to-digital converter of exceptionally high sonic performance. I’ve been using the AD-596 for critical listening tests in my lab and can confirm that it outperforms other well-known ADC designs at 2 or even 3 times its price point. Besides setting a dramatic new sonic value point in professional audio conversion, the AD-596 is also the world’s smallest 8-channel ADC, requiring just a single ‘500’ rack space (5.25″ x 1.75″). Up to 80 channels of this ultra-transparent converter can now fit in a single 3U 19″ rack. API also confirmed that our AD-596 is the first known digital-audio product for the 500-style rack.
Some of the internal features include over-engineered AES transformers designed and built exclusively by Millennia, exceptional clocking circuitry of vanishingly low jitter performance in both internal and external modes, a 90% efficient isolated switching power supply, ultra-quiet radiated and conducted performance for use with adjacent high-gain analog 500-rack preamplifiers, and premium components used throughout.

POW-r Algorithms

Twelve years ago, some friends and I got together with the intent of developing the most musically neutral and dynamically accurate audio bit length reduction algorithms. As we completed the code, many of the audio industry’s golden-eared engineers and producers reviewed our work favorably. Soon after, it became the world’s #1 software for audio bit length reduction. The software is called POW-r, which is an acronym for “psychoacoustically optimized word-length reduction.”
Most professional audio recording today uses DAWs, PC-based “digital audio workstations.” Digitized audio is stored in software bit chunks called “words.” Most DAWs today default to 24-bit word lengths (although internal processing may be twice that or more). Each bit represents roughly a 6dB change in “audio voltage.” More bits equal higher acoustic dynamic range. A higher dynamic range equates to more realistic sound reproduction. The common CD stores digital audio in 16-bit word lengths. And this is the problem: when transferring native 24-bit audio from the DAW onto a 16-bit CD, we lose 8 bits, or 48dB!
What does 48dB sound like? It’s roughly the difference between normal conversation (65dB) and a live rock concert (115dB), or between a softly played piano (75dB) and a forte symphony orchestra (120dB). How do you get the full impact of a 24-bit studio recording (potentially 144dB*) onto a CD which can only represent 96dB?
Enter the unique software algorithms called POW-r. Our code was created in the real world of symphony orchestras, of which I have engineered hundreds of recordings. We tested numerous iterations of the software in real-world acoustics, carefully comparing musical results until we found optimal subjective performance. Today, POW-r remains the world’s #1 word-length-reduction solution, both for CD and MP3 bit preparation.
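For readers who want to check the bit-depth arithmetic, here is a minimal Python sketch. It computes the theoretical dynamic range of a word length (about 6dB per bit) and then shows the general idea of word-length reduction using plain TPDF dither, a generic textbook technique. To be clear, this is not the POW-r algorithm, which adds psychoacoustic optimization not described here. The function names `dynamic_range_db` and `reduce_word_length` are made up for this illustration.

```python
import math
import random


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an n-bit word: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)


def reduce_word_length(sample_24: int, rng: random.Random) -> int:
    """Requantize one signed 24-bit sample to 16 bits with TPDF dither.

    Illustrative assumption: generic dither only, no noise shaping.
    """
    step = 1 << 8  # one 16-bit step spans 256 steps of the 24-bit scale
    # The sum of two uniform values is triangular noise within +/- one 16-bit step.
    dither = rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)
    # Add the dither, then round to the nearest 16-bit step.
    quantized = math.floor((sample_24 + dither) / step + 0.5)
    # Clamp to the signed 16-bit range.
    return max(-32768, min(32767, quantized))


if __name__ == "__main__":
    lost = dynamic_range_db(24) - dynamic_range_db(16)
    print(round(lost, 2))  # 48.16, the "48dB" lost when dropping 8 bits

    rng = random.Random(0)
    # A 24-bit value sitting halfway between two 16-bit steps.
    outputs = [reduce_word_length(128, rng) for _ in range(10000)]
    print(sum(outputs) / len(outputs))  # averages near 0.5
```

The second demo hints at why dither matters: a 24-bit value that falls between two 16-bit steps is not simply thrown away. Averaged over time, the dithered output still carries that low-level detail, at the cost of a little added noise.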
Most of the top DAW companies license POW-r (Apple Logic, Avid ProTools, Cakewalk Sonar, Magix Samplitude and Sequoia, Ableton Live, Pyramix, and many others). It’s been estimated that POW-r is now used on over 400 million CDs and downloads annually. (* in practice, studio recordings rarely achieve 144dB dynamic range, and home playback systems can rarely offer much more than 110dB, if that. What’s worse, most music today is played back into ear buds, with a dynamic range rarely exceeding 90dB, and that assumes a very quiet environment and high quality playback