MIT researchers recreate sound by analyzing video taken of objects


In a potential new boon for the law enforcement and surveillance communities, MIT researchers have discovered a remarkable way of extracting audio from video footage of objects inside a closed room.

Sound travels in waves that cause subtle vibrations in the objects they pass through, vibrations so slight that they are invisible to the naked eye. Using an algorithm that analyzes these minute vibrations, which show up in video as changes of only fractions of a pixel, the MIT researchers were able to reconstruct sounds as varied as music and human speech.
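The idea can be illustrated with a toy simulation. The sketch below (a simplified stand-in for the MIT algorithm, not their actual code) renders a one-dimensional "frame" per video sample in which a bright blob shifts sub-pixel in step with a 440 Hz tone, then recovers that motion with an intensity-weighted centroid; the recovered trace tracks the original tone almost perfectly:

```python
import numpy as np

def make_frame(shift, width=64, sigma=3.0):
    # A 1-D "frame": a bright blob whose center moves sub-pixel with the sound.
    x = np.arange(width)
    return np.exp(-((x - width / 2 - shift) ** 2) / (2 * sigma ** 2))

def recover_motion(frames):
    # Estimate each frame's blob position via an intensity-weighted centroid,
    # a simple stand-in for the sub-pixel motion analysis in the MIT work.
    x = np.arange(frames.shape[1])
    centers = (frames * x).sum(axis=1) / frames.sum(axis=1)
    return centers - centers.mean()

fps = 2000                                    # within the range the paper used
t = np.arange(fps) / fps                      # one second of "video"
audio = 0.05 * np.sin(2 * np.pi * 440 * t)    # 440 Hz tone, ~1/20-pixel motion

frames = np.stack([make_frame(a) for a in audio])
recovered = recover_motion(frames)

corr = np.corrcoef(audio, recovered)[0, 1]
print(f"correlation with original tone: {corr:.3f}")
```

Even with motion of a twentieth of a pixel, the centroid estimate recovers the waveform, which is the core insight: averaging over many pixels makes sub-pixel displacements measurable.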

In their experiments, the researchers used high-speed cameras shooting between 2,000 and 6,000 frames per second to film objects like the surface of a glass of water, aluminum foil, houseplants, potato chip bags, and headphone earbuds. For example, human speech was reconstructed from the vibrations detected in a bag of potato chips 15 feet away, behind soundproof glass. In another experiment, the objects were filmed using a standard 60-frames-per-second smartphone camera. Although the sound extracted was of lesser quality, it was still recognizable.
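The drop in quality at ordinary frame rates has a simple explanation: with a naive per-frame analysis, the video's frame rate is the audio sampling rate, so only frequencies up to half the frame rate (the Nyquist limit) can be recovered. A quick calculation shows why a 60 fps smartphone captures far less of the speech band than a high-speed camera:

```python
# Nyquist limit: a naive per-frame analysis recovers frequencies
# only up to half the video frame rate.
for fps in (60, 2000, 6000):
    print(f"{fps:>4} fps -> frequencies up to {fps // 2} Hz")
```

At 60 fps only the lowest 30 Hz survive under this naive scheme, which is why the smartphone result is recognizable but degraded.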


Research focused on this technology is also exploring the possibility of determining the material and structural makeup of an object by observing how it reacts to sound.

It can only be hoped that this technology will be used with restraint instead of being one more nail in the coffin of privacy.



About the Author

Brandon Bailey is a late bloomer, specifically a Saussurea Obvallata. Someday you may see him at a local botanical display, or perhaps just withering on the vine. Brandon has had a lifelong fascination with science, history, travel, and the lost arts. He can be found writing in East Los Angeles, California, or exploring the city’s many hidden treasures. Brandon is also a self-taught pianist and a connoisseur of music in all its forms.
