Index We Trust

In the spring of 1996, when I was writing my first book and the Web was young, I posed a question in WIRED magazine: “How are we going to find stuff?” Particularly, how are we going to be able to search multimedia formats like Java applets, PDF files, and Shockwave movies? (Flash movies hadn’t made their appearance yet.)

Within a couple of years of my brief article, the better search engines had incorporated PDF parsing engines. For most other formats, the answer was to put text for search engines to index into meta tags in the header of the enclosing HTML document. But, depending on who you talk to, meta tags are obsolete, and their content is no longer indexed by Google or any of the remaining major search engines.

I don’t know what the answer is. I’ve always thought the various people selling their “services” to “tune” your web site and get it ranked higher on search engines were frauds, precisely because much smarter people at companies like Google worked to make their systems more efficient and less prone to “gaming”.

But the question remains (and came up in a DIRECT-L thread titled “dcr, swf and google” initiated by Quixadá): just how can we easily get text from our DCR and SWF movies into the search engines’ memory banks?