Here's the pitch: With regular ray tracing, rays of light are traced backward from a pixel of the camera, to an object and eventually to a light source (or lack thereof). If you can do that with light, why can't it be done with sound?
Over ten years ago I was having breakfast with a friend and I sketched out the idea on a napkin. This kind of math is definitely not my strong point. But instead of a camera, have a microphone. Instead of tracing rays, trace vibrations, and instead of light sources, there's air and friction; or so the napkin said.
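To make the napkin idea a little more concrete, here is a minimal sketch of following sound from a source to a microphone. It swaps in the image-source trick (mirroring the source across a flat floor) in place of real wave tracing, and everything in it is an assumption for illustration: the `arrivals` function, the 2D layout with the floor at y = 0, the `reflectivity` value, and a speed of sound of 343 m/s.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def arrivals(source, mic, reflectivity=0.8):
    """Return [(delay_seconds, gain), ...] for two paths to the microphone.

    Path 1 is the direct line from source to mic.
    Path 2 bounces once off a floor at y = 0, found by mirroring the
    source below the floor (image-source trick, not full wave tracing).
    Assumes the source and mic are at different positions.
    """
    sx, sy = source
    image = (sx, -sy)  # the source as "seen" in the floor
    paths = [
        (math.dist(source, mic), 1.0),          # direct path, no loss
        (math.dist(image, mic), reflectivity),  # one bounce, some loss
    ]
    # Delay grows with distance; amplitude falls off as 1 / distance.
    return [(d / SPEED_OF_SOUND, g / d) for d, g in paths]


for delay, gain in arrivals(source=(0.0, 1.0), mic=(4.0, 1.0)):
    print(f"arrives after {delay * 1000:.2f} ms with gain {gain:.3f}")
```

The bounced copy arrives slightly later and quieter than the direct one, which is the seed of an echo. A real version would need many bounces, 3D geometry, and frequency-dependent materials.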
Instruments aren't the goal, but a simple model to test would be a grade-school-caliber recorder. Once the materials are modeled, blown air would resonate down the tube of the recorder, generating sound.
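As a rough sanity check for such a model, the resonances of an idealized tube open at both ends are f_n = n * c / (2 * L). The sketch below assumes that textbook formula, a speed of sound of 343 m/s, and a hypothetical `open_pipe_harmonics` helper; it ignores finger holes, end corrections, and the mouthpiece, so a real recorder will come out somewhat different.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def open_pipe_harmonics(length_m, count=3):
    """Resonant frequencies (Hz) of an ideal pipe open at both ends.

    Uses f_n = n * c / (2 * L). This is an idealization: no finger
    holes, no end correction, no losses.
    """
    return [n * SPEED_OF_SOUND / (2 * length_m) for n in range(1, count + 1)]


# A 0.3 m tube is an assumed, roughly recorder-sized length.
for n, freq in enumerate(open_pipe_harmonics(0.3), start=1):
    print(f"harmonic {n}: {freq:.1f} Hz")
```

If a wave-traced recorder of a given length produced a fundamental far from this estimate, that would be a hint that something in the simulation is off.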
As different qualities of air and different materials are modeled, artists could build wireframes of different instruments. The effects of room acoustics could be modeled too.
This would culminate in a completely rendered orchestra performing a wave-traced rendition of Beethoven's Symphony No. 5.
As I mentioned earlier, instruments aren't the goal: Voices are.
After modeling complex instruments, the next step is to model and animate the speech pathway from lungs to lips. The speech animated model (or "Sam" for short) would start with simple vowels like the [a:] sound in "raw", or the [æ] sound in "bad". (Wikipedia's page on the IPA has some good introductory information.) Eventually Sam would be animated to fluidly pronounce most phonemes. Linguistics is also not my strong point.
I'd imagine that Sam would sound very lifeless and flat. So we would capture data from sensors attached to voice model actors. Voice capture artists would smooth out the data and bring Sam to life. Motion capture is also not my strong point.
With only a handful of voice model actors, spoken phrases could be captured for an entire movie or video game! The captured data could be applied to dozens or hundreds of different-sounding models by adjusting Sam's variables. Accents would be easy to add by changing the way Sam enunciates words. Playing with Sam's breathing rate would simulate a person who is out of breath or scared. Sam's voice could be lowered or raised, or even modeled after a real person.
Eventually the captured data could be used to animate a 3D model too!
This would be perfect for games like Oblivion or GTA, which could benefit from an almost endless supply of different voices.
Let me know if you ever do anything with this, or find something that does. Especially if you're Bethesda!
3 comments:
Look for the paper "A Beam Tracing Approach to Acoustic Modeling for Interactive Virtual Environments".
Simulating the actual creation of speech from the lungs on through the mouth and to the air is very exciting, but the first step is not the simulation of sound creation; it is the accurate simulation of sound propagation through space. In games, sounds should originate from distinct positions in 3D space, just as in reality. Because realistic wavetracing of audio should require much less computation than realistic raytracing of graphics, there is no reason for 3D-modeled audio in games to be nonexistent instead of near maturity (FYI: EAX only applies effects that mimic 3D positioning).
With several-hundred-core video cards and CUDA, we should be able to use wavetracing to realistically model the variables of sound position, direction, original amplitude, speed, and transmission/reflection/diffraction medium. This would produce the effects of perceived volume and position, frequency attenuation, reverberation, and Doppler shift, among others. Imagine a game which uses wavetracing to model the input from players' microphones:
Three people are playing, each in a different location. In the game, two of the players are standing side by side, and the other is a football field's length away. Using their headset microphones, the two players can talk quietly without being heard by the third player. In order to be heard by the pair, the third player would have to raise his voice considerably. If the third player were carrying a small radio and walking towards the pair, the pair would hear a steady increase in higher frequencies and overall volume, with no sudden 'jumps' in volume, and no distance at which a sound simply stops abruptly (sound familiar?).
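Two of the effects in that scenario are easy to estimate by hand. The sketch below is only a back-of-the-envelope illustration, assuming a point source in open air (inverse-distance law), a stationary listener, and a speed of sound of 343 m/s. The function names and the 91 m and 1.5 m/s figures are made up for the example, and the extra loss of high frequencies over distance is not modeled.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def level_change_db(near_m, far_m):
    """Change in sound level (dB) when moving from near_m to far_m.

    Assumes a point source in a free field, where amplitude falls off
    as 1 / distance. Negative means quieter.
    """
    return -20.0 * math.log10(far_m / near_m)


def doppler_frequency(source_hz, approach_speed):
    """Frequency heard by a stationary listener as the source approaches.

    approach_speed is in m/s and must be below the speed of sound;
    a negative value means the source is moving away.
    """
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - approach_speed)


# A voice heard at 1 m versus roughly a football field away (about 91 m).
print(f"level change: {level_change_db(1.0, 91.0):.1f} dB")

# A 440 Hz tone from a radio carried at walking pace (about 1.5 m/s).
print(f"heard pitch: {doppler_frequency(440.0, 1.5):.2f} Hz")
```

Going from 1 m to 100 m under these assumptions costs 40 dB, which is why the distant player would have to shout. The pitch shift at walking speed is tiny, so the smooth rise in volume would dominate what the pair notices.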
Very interesting!
But I think using ray tracing for sound is not the best way to simulate the properties of sound. You can use ray tracing to describe light because light has ray behaviour (if you interpret it as a particle) and wave behaviour at the same time. In fact, even for light, ray tracing has many limits! Sound has only wave properties, so I think you can't describe it this way as well as you can light!