IBM will release a new corporate search engine, the “DB2 Information Integrator” (code-named Masala), tomorrow, report CNet and eWeek.

The information integrator is able to do this because it can search rapidly across multiple databases, including relational and non-relational databases, and across structured and unstructured data such as text files, Word documents, Adobe Acrobat files, and video or audio files, according to Jones.

“To gather this information up today, they might have to use multiple searches,” Jones said. DB2 Information Integrator can replace all of these searches with a single search that gathers all of the types of information to answer a single question, he said.
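
To make the "single search" idea concrete, here is a minimal Python sketch of a federated search: one query is fanned out to several sources and the hits are merged into one list. Everything in it is hypothetical (the `Hit` type, the source names, the sample data, and the `federated_search` function), and it says nothing about how DB2 Information Integrator works internally. It only illustrates the concept Jones describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str  # which backend produced the hit
    title: str   # human-readable label for the matching item


# Hypothetical stand-ins for a relational table, a folder of text files,
# and the closed captions of some videos.
SOURCES: Dict[str, List[str]] = {
    "orders_db": ["Invoice 1041 for Acme", "Invoice 1042 for Globex"],
    "text_files": ["Meeting notes: Acme renewal", "Travel policy"],
    "video_captions": ["Quarterly call mentions Acme contract"],
}


def make_searcher(name: str, items: List[str]) -> Callable[[str], List[Hit]]:
    """Build a case-insensitive substring search over one source."""
    def search(query: str) -> List[Hit]:
        q = query.lower()
        return [Hit(name, item) for item in items if q in item.lower()]
    return search


def federated_search(query: str,
                     searchers: List[Callable[[str], List[Hit]]]) -> List[Hit]:
    """Send the same query to every source and merge the results."""
    hits: List[Hit] = []
    for search in searchers:
        hits.extend(search(query))
    return hits


searchers = [make_searcher(name, items) for name, items in SOURCES.items()]
results = federated_search("acme", searchers)
for hit in results:
    print(f"{hit.source}: {hit.title}")
```

The caller asks one question and never needs to know how many backends sit behind it. A real product would also have to rank results across sources and translate the query into each backend's own language, which this sketch skips.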

That sounded like a pretty tall order to me. I’ve heard of connectors that can search the closed-caption text of a video, but audio has no such meta-data. A scan of the IBM website for Masala doesn’t help either. I then happened upon this IBM Research site that shows early attempts to automatically categorize images in MPEG-7 video files. Once the images are categorized, you can query the meta-data attached to each one.
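
As a rough illustration of what "querying the meta-data" could look like once frames carry labels, here is a small Python sketch. The `FRAME_TAGS` dictionary, the frame names, the labels, and the confidence scores are all made up for the example. It is a plain in-memory stand-in, not the MPEG-7 description format and not the IBM Research system.

```python
from typing import Dict, List, Tuple

# Hypothetical classifier output: frame id -> list of (label, confidence).
FRAME_TAGS: Dict[str, List[Tuple[str, float]]] = {
    "clip1/frame_0010": [("person", 0.91), ("indoor", 0.80)],
    "clip1/frame_0250": [("animal", 0.42), ("outdoor", 0.77)],
    "clip2/frame_0033": [("animal", 0.88)],
}


def find_frames(label: str, min_confidence: float = 0.0) -> List[str]:
    """Return the frames tagged with `label` at or above `min_confidence`."""
    return sorted(
        frame
        for frame, tags in FRAME_TAGS.items()
        if any(tag == label and score >= min_confidence for tag, score in tags)
    )


print(find_frames("animal"))
print(find_frames("animal", min_confidence=0.8))
```

The search itself is trivial once the labels exist. The hard part is the automatic labeling, and a confidence threshold like the one assumed here is one simple way to filter out shaky tags.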

Some further work is needed to iron out the kinks. Look closely at the example and you can see that both Janis Joplin and Peter Jennings were tagged as “animals”.