Google: We Can Index Flash, Silverlight Is A Problem

Google Webmaster Tools has published a page explaining how Google handles Flash and other rich media files.
Regarding Flash, they say: “Google can now discover and index text content in SWF files of all kinds, including self-contained Flash websites and Flash gadgets such as buttons or menus. This includes all textual content visible to the user. Google supports common JavaScript techniques. In addition, we can now find and follow URLs embedded in Flash files. We’ll crawl and index this content in the same way that we crawl and index other content on your site–you don’t need to take any special action.”
And what about Silverlight? Well: “Google can crawl and index the text content of Flash files, but we still have problems accessing the content of other rich media formats such as Silverlight … In other words, even if we can crawl your content and it is in our index, it might be missing some text, content, or links.”
Enough said!

17 responses to “Google: We Can Index Flash, Silverlight Is A Problem”

  1. Henry Ho

    Search for the keywords "cf9 bug report" in Google.

  2. Paul

    To be fair, Google had to undertake the effort to index Flash because of its ubiquity. And while Adobe hasn’t been entirely proactive on the matter, they haven’t turned a blind eye to it.
    What’s the incentive for Google to spend the same effort on Silverlight?

  3. Matt Williams

    It will be interesting to see if Bing and/or Yahoo are able to crawl Silverlight before Google…

  4. Steven Peeters

    Maybe Google can index static content in Flash files. A lot of Flash and Flex content is dynamically fetched from external files and/or databases. And Google still can’t handle those. 🙁

  5. Ben Forta

    That is incorrect. Google has a headless Flash Player that they use in indexing Flash that actually executes the SWF, and it can indeed make calls for backend data.
    See, and search online for "Adobe Flash Ichabod".
    — Ben

  6. Seth

    Where can I get a headless flash player?

  7. Samir Kerimov

    How about Bing? I think Bing has problems with both Flash and Silverlight.

  8. Steven Peeters

    OK Ben, I can follow that reasoning, but how can Google know which records to fetch from the database? Does it also search for click handlers and execute each and every one of them to make sure every part of your SWF is covered? E.g. assume I have a header-detail component. Does it execute the detail fetch for each item in the datagrid?

  9. Ben Forta

    Steve, yes, it can do just that, responding to every prompt for click and more. Do a search for the video of MAX 2008 Day 2 keynote where the engineer who created Ichabod demonstrates and explains the functionality.
    — Ben

  10. judah

    It wasn’t too long ago that Flash *wasn’t* being indexed…

  11. Jensa

    No reports I’ve seen indicate that Google actually indexes the dynamic content. They have "ichabod" and use it, but figuring out how to present the dynamic content is still their main problem. It’s hard to automatically figure out what is relevant from maybe 1000 dynamic responses…

  12. Sven Miller

    Actually, Google indexes external/linked XML content in Flash, see
    My company Yooba does Flash CMS and it is very important to us that this works as described.

  13. Jensa

    Note that they are saying "we CAN index this external content" (my emphasis). They can’t possibly index loads of dynamic content without knowing how to present it.
    Think about it – how would YOU display hundreds of dynamic result

  14. Jeff Fall

    Yeah, the information on that link above is out of date. Google does now index dynamic content from Flash files.

  15. Jensa

    The example mentioned in Jeff’s link there is a good example.
    The SWF at loads in data from a static XML file that clearly tells Google that it is created for CoffeeCup Image Gallery. This is a commercial application ( and someone at Google must have created some rule saying that we only want to display this and this node in XML for that kind of galleries.
    There’s nothing dynamic in that example, so if @Sven is betting his business on Google indexing dynamic content, I think he’s out of luck. If that CMS produces a SWF that can show hundreds of pages, Google simply cannot make a reasonable search result page for it. The only reason it works here is that it’s a statically linked XML file where someone has made some rules for what to present as relevant.

  16. Brian

    Google says they can read the text of the SWF, but the most basic of examples shows that they cannot.
    I would love to see search engines read contents of SWF files, but the technology isn’t there yet.
