Google Webmaster Tools has published a page on how Google handles Flash and other rich media files.
Regarding Flash, they say: “Google can now discover and index text content in SWF files of all kinds, including self-contained Flash websites and Flash gadgets such as buttons or menus. This includes all textual content visible to the user. Google supports common JavaScript techniques. In addition, we can now find and follow URLs embedded in Flash files. We’ll crawl and index this content in the same way that we crawl and index other content on your site–you don’t need to take any special action.”
And what about Silverlight? Well: “Google can crawl and index the text content of Flash files, but we still have problems accessing the content of other rich media formats such as Silverlight … In other words, even if we can crawl your content and it is in our index, it might be missing some text, content, or links.”
Enough said!

17 thoughts

  1. To be fair, Google had to undertake the effort to index Flash because of its ubiquity. And while Adobe hasn’t been entirely proactive on the matter, they haven’t turned a blind eye to it.
    What’s the incentive for Google to spend the same effort on Silverlight?

  2. Maybe Google can index static content in Flash files. A lot of Flash and Flex content is dynamically fetched from external files and/or databases. And Google still can’t handle those. 🙁

  3. OK Ben, I can follow that reasoning, but how can Google know which records to fetch from the database? Does it also search for click handlers and execute each and every one of them to make sure every part of your SWF is covered? E.g. assume I have a header-detail component. Does it execute the detail fetch for each item in the datagrid?

  4. Steve, yes, it can do just that, responding to every prompt for a click and more. Search for the video of the MAX 2008 Day 2 keynote, where the engineer who created Ichabod demonstrates and explains the functionality.
    — Ben

  5. Ben,
    No reports I’ve seen indicate that Google actually indexes the dynamic content. They have "Ichabod" and use it, but figuring out how to present the dynamic content is still their main problem. It’s hard to automatically figure out what is relevant from maybe 1000 dynamic responses…

  6. @Sven
    Note that they are saying "we CAN index this external content" (my emphasis). They can’t possibly index loads of dynamic content without knowing how to present it.
    Think about it – how would YOU display hundreds of dynamic results?

  7. The example in Jeff’s link is a good one.
    The SWF there loads its data from a static XML file that clearly tells Google it was created for CoffeeCup Image Gallery. This is a commercial application, and someone at Google must have created a rule saying that only certain nodes in the XML should be displayed for that kind of gallery.
    There’s nothing dynamic in that example, so if @Sven is betting his business on Google indexing dynamic content, I think he’s out of luck. If that CMS produces a SWF that can show hundreds of pages, Google simply cannot make a reasonable search result page for it. The only reason it works here is that it’s a statically linked XML file for which someone has made rules about what to present as relevant.
