June 30th, 2008 was a day that Flash developers had long been waiting for: Google and Adobe finally announced that Flash .swf files could be crawled by Google! In fact, the extensive news release from the Adobe Developer Center also stated that Yahoo would be incorporating similar technology in short order. When I read this news and the subsequent articles from the web marketing community, it became very clear that this update is a great step but far from the fix that some Flash developers are likely to pitch to their clients. As a result, I wanted to add my voice to the buzz on this topic and share my thoughts on how to optimize a site using Flash in light of the current updates.


What is Flash?

Okay, let’s get down to basics. To establish what Flash is all about, I am going to fall back on Wikipedia for a concise description:

“Adobe Flash (previously called Shockwave Flash and Macromedia Flash) is a set of multimedia technologies developed and distributed by Adobe Systems and earlier by Macromedia. Since its introduction in 1996, Flash technology has become a popular method for adding animation and interactivity to web pages; Flash is commonly used to create animation, advertisements, and various web page components, to integrate video into web pages, and, more recently, to develop rich Internet applications.” (Source: Wikipedia)

BEFORE: Search Engines Could Not Crawl Flash
Up until recently the textual content found in .swf Flash files was, for all intents and purposes, just as unreadable for search engine spiders as the text in images; only HTML text on a page could be read and indexed by search engine spiders because they could not yet (and still cannot) conduct on-the-fly optical character recognition.

To put it another way, I think of the HTML that spiders can read as a letter written in ballpoint pen: run your finger over it and, much like braille, you can feel the contour of the writing. Something unreadable like Flash or an image on a page, on the other hand, is like a 4×6 picture of a road sign: run your fingers along it and you won’t feel anything, and by the same token a search engine spider cannot read the text on that road sign.

NOW: Search Engines Can Crawl Text in Flash
On June 30th, 2008 Google announced that, for the first time, it could accurately spider the textual content hidden within Flash files found on the Internet. This major announcement was made possible by a partnership between Adobe, Google and Yahoo in which Adobe provided its proprietary Flash Player technology to the search engines so they could integrate it into their systems and successfully ‘read’ the content within Flash files. This technology has vast implications for Google’s and soon Yahoo’s indexes because, at least in Google’s case, it allows the search engine to index the content within over 70.4 million Flash (SWF) files. That is a vast amount of content that was previously inaccessible to the search engines, and the ability to access it could add a lot of value for search engine users.

For example, an inspiring and eloquent Flash site like Forests Forever could be indexed which would expose more viewers to a website that provides a wonderful introduction to the world’s forests. Of course that is just one Flash site of many that will add value to search engines when indexed; it just happens to be one of my personal favorites.

Search Engine Optimization Now Possible with Flash
The implementation of Flash crawling technology means that the text within Flash can now be indexed and links can be followed. Here are some examples of the basic optimization that is now possible within Flash:

  • Optimizing page content for specific keyphrase(s) to ensure a visiting search engine bot will correctly perceive the page’s topic.
  • Using keywords within internal links to pass link juice from page to page; this is only applicable to sites where the Flash pages are broken out onto separate URLs.
  • Providing emphasis (bolding) to particular words may help to emphasize keyphrase(s); but I am reaching here… it is unknown if this new technology provides text-importance recognition.

The Limitations of Flash Search Engine Optimization
Now that you have some idea of what can be optimized for search engines, here are a few pitfalls that still limit the search engine friendliness of Flash:

  1. Single URL Flash Websites: Many websites I encounter still incorporate all of the website in a single Flash file; in other words, as a user navigates the site, different pages appear but the URL stays the same. In such an instance the search engines will index the content and potentially drive traffic to the site, but because Google cannot link to content within a Flash file, all users will be sent to the beginning of the file. That type of indirect search result is likely to infuriate many searchers who have come to expect immediate results.

    Here is a quote from Google’s comment area on this topic: “We’ve heard requests for deep linking (linking to specific content inside file) not just for Flash results, but also for other large documents and presentations. In the case of Flash, the ability to deep link will require additional functionality in Flash with which we integrate.”

    That last line is interesting because it leaves room for interpretation. Do they mean Adobe will have to add the “additional functionality” to Flash, or that Google needs to beef up its indexing technology to take advantage of the existing Flash functionality? Perhaps some Flash gurus out there could weigh in on this one. It is definitely an ambiguous way for Google to answer the question.

    If you need a workaround to deep-link into a single SWF file, Adobe notes a solution: “you can create multiple HTML files that provide different variables to the SWF and start your application at the correct subsection. By creating multiple entry points, you can get the benefits of a site that is indexed as a suite of pages but still only need to manage one copy of your application.”

  2. Text in Images is Not Indexed: Many Flash websites inexplicably incorporate a great deal of textual content within images, and currently search engines cannot index text in images; I expect that will remain true for at least another year or two. As a result, a Flash website that includes a vast amount of text within graphics will not see a noticeable benefit from this enhanced crawling technology.
  3. Resource-File Based Content Not Indexed: I noted this in Google’s comment area from their support team: “At this time, content loaded dynamically from resource files is not indexed. We’ve noted this feature request from several webmasters — look for this in a near future update.”
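
The multiple-entry-point workaround that Adobe describes under item 1 could be sketched roughly as follows. This is only an illustration in Python, not Adobe’s own tooling: the SWF filename, the section names and the "section" variable are all hypothetical, and the SWF itself would need to read that variable at start-up and jump to the matching subsection.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical SWF filename and section list, for illustration only.
SWF_FILE = "site.swf"
SECTIONS = {
    "about": "About Us",
    "services": "Our Services",
    "contact": "Contact",
}

# Static markup: one HTML page per section, all pointing at the same SWF.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><title>{title}</title></head>
<body>
<object type="application/x-shockwave-flash" data="{swf}" width="800" height="600">
  <param name="movie" value="{swf}">
  <param name="FlashVars" value="{flashvars}">
  <p>{title}</p>
</object>
</body>
</html>
"""


def build_entry_page(section, title, swf=SWF_FILE):
    """Return the HTML for one entry point that starts the SWF at `section`."""
    # "section" is an assumed variable name; the SWF must be written to read it.
    flashvars = urlencode({"section": section})
    return PAGE_TEMPLATE.format(
        title=escape(title),
        swf=escape(swf, quote=True),
        flashvars=escape(flashvars, quote=True),
    )


def build_all_pages(sections):
    """Map each output filename (for example 'about.html') to its HTML."""
    return {
        f"{name}.html": build_entry_page(name, title)
        for name, title in sections.items()
    }
```

Each generated page gets its own URL and title while there is still only one copy of the application to maintain; writing the returned strings to disk is left out to keep the sketch short.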

In addition, Google’s news release announced the following limitations of its Flash indexing, which Google expects to surmount soon (quoted from the Google blog):

  1. “Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed.”
  2. “We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.”
  3. “While we are able to index Flash in almost all of the languages found on the web, currently there are difficulties with Flash content written in bidirectional languages. Until this is fixed, we will be unable to index Hebrew language or Arabic language content from Flash files.”
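
To get a rough feel for the first limitation, you could check whether a page references its SWF in static markup or only through JavaScript. The sketch below is my own approximation in Python using the standard library’s HTML parser; it is not how Googlebot works, and the function and class names are made up for this example. It simply lists the .swf files referenced by object, embed and param tags, and ignores anything written out by a script.

```python
from html.parser import HTMLParser


class SwfReferenceFinder(HTMLParser):
    """Collect .swf URLs that appear directly in static HTML markup."""

    def __init__(self):
        super().__init__()
        self.swf_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        candidates = []
        if tag == "object":
            candidates.append(attrs.get("data"))
        elif tag == "embed":
            candidates.append(attrs.get("src"))
        elif tag == "param" and (attrs.get("name") or "").lower() == "movie":
            candidates.append(attrs.get("value"))
        for url in candidates:
            # Ignore any query string when checking the file extension.
            if url and url.lower().split("?")[0].endswith(".swf"):
                if url not in self.swf_urls:
                    self.swf_urls.append(url)


def find_static_swfs(html_text):
    """Return the SWF URLs a parser can see without running any JavaScript."""
    finder = SwfReferenceFinder()
    finder.feed(html_text)
    finder.close()
    return finder.swf_urls
```

If the function returns an empty list for a page that clearly shows Flash in a browser, the SWF is probably being loaded by JavaScript, which is the situation Google warns may leave the file unindexed.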

Verdict: SEO for Flash is Still in Diapers
It is wonderful news that Flash is becoming more search engine friendly, and there is no question that the addition of previously unattainable Flash content to search engine indexes will prove valuable. But the fact of the matter is that at this moment I wouldn’t dream of telling a client that Flash can be a competitive medium for search engine optimization. There are simply too many roadblocks that still need to be addressed before a Flash website has any hope of competing with an HTML website on search engine optimization alone. I do, however, see a couple of exceptions to the rule:

  1. At a certain point a threshold can be met where significant incoming links can push even the most un-search engine friendly website to the top rankings. As a result, it is highly likely that some Flash websites with a decent incoming link support structure will see vast improvements in rankings when their content is finally considered thanks to this new crawling technology.
  2. In less competitive arenas (obscure keyphrases or keyphrases with little competition) the basic search engine optimization capabilities opened to Flash may very well be all that is needed to attain top search engine rankings.

In conclusion, I would like to pass on extreme kudos to Adobe, Google and Yahoo for working this new technology into their systems. With all of the new multimedia formats coming online, it has always seemed quite silly to me that Flash, having been around for years, was still not fully indexable. Thankfully Flash can now be crawled, and the day when it could potentially compete for competitive rankings is on the distant horizon.

by Ross Dunn, CEO, StepForth Web Marketing Inc.