This is intended to be a quick tutorial explaining how you should lay out your HTML pages to ensure that your site works properly for all visitors and search engines. This tutorial assumes that you already have a general knowledge of HTML and CSS.
For many people, most of this will be old hat, but this post is intended to recap four necessary steps for any webmaster trying to get a Web site indexed by the various search engines.
Create a Site Map
First and foremost, you must have an XML sitemap. Almost all search engines (certainly the big three) use a sitemap to become aware of all of the pages on your site. If your site has a good link structure, a sitemap might not be strictly necessary, but it doesn't take much effort to put one together, and the benefits can be substantial.
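At its simplest, a sitemap is just an XML file listing the URLs you want crawled, each with an optional last-modified date, change frequency, and priority. Here's a minimal example (the URL and dates are placeholders — substitute your own pages):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-11-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```

Save this as `sitemap.xml` at the root of your site; you can then point the search engines at it through their webmaster tools.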
Many content management systems (CMS) will automatically generate a sitemap for you, but if yours doesn't, you have a few other options. If your site has an auto-generated RSS feed, that can be a good place to start: most search engines will treat an RSS feed the same way they treat an XML sitemap, so it can make a big difference.
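For comparison, here is the skeleton of an RSS 2.0 feed of the kind most blog software generates automatically — the crawler reads each `<item>`'s `<link>` much like a sitemap `<loc>` entry (titles, URLs, and dates below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <link>http://www.example.com/</link>
    <description>Placeholder feed description</description>
    <item>
      <title>A recent post</title>
      <link>http://www.example.com/a-recent-post</link>
      <pubDate>Sat, 01 Nov 2008 12:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```

One caveat: a feed usually only lists your most recent items, so it works best as a supplement to, rather than a replacement for, a full sitemap.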
If you don’t have an auto-generated sitemap or RSS feed, there are quite a few tools you can use to create one, but you’ll have to make sure you update it each time your site’s architecture changes. A site called “XML-Sitemaps” will crawl your site and create a sitemap of up to 500 pages for free.
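If you'd rather roll your own, generating the file is straightforward. The sketch below builds a sitemap from a hand-maintained list of pages using Python's standard library — the URLs and dates are hypothetical placeholders, and in practice you'd pull the list from your database or filesystem:

```python
# Minimal sketch: build a sitemap <urlset> document from a list of pages.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Placeholder pages -- replace with your site's real URLs.
pages = [
    ("http://www.example.com/", "2008-11-01"),
    ("http://www.example.com/about", "2008-10-15"),
]
print(build_sitemap(pages))
```

The advantage of a script like this is that you can regenerate the sitemap automatically whenever your site changes, instead of remembering to re-run a third-party crawler.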
Kyle James, founder of doteduguru, analytics guru and consultant for HubSpot, has published two really good posts over the last few weeks about search engine optimization (SEO). The two articles deal with on-page SEO (optimizing the pages themselves for search engines) and off-page SEO (optimizing other parts of your site to direct people to the pages you want to rank).
Earlier this month CenterNetworks was converted from Drupal to WordPress, and during the conversion several of the CN sites were hit with an exploit. It appears that one of the CN sites may actually have been hit earlier and I never noticed; only when CN itself got hit did I realize this other site had been compromised too.
This other site apparently lost most of its “Google Juice,” which resulted in a major reduction in organic search traffic. Here’s a graph of the before, during and after.
At the lowest point, nearly 70% of Google-referral traffic to the site in question was lost. As you can see from the chart above, slowly the Google Juice has been restored and we are back to normal traffic today. Phew, at least now I can get the investors off my back.
What did I learn from this experience? Google indexes sites very quickly, but it seems to take about two weeks for the Google crawlers to update an entire site. From what I can tell, there’s no real way to tell Google that a site was infected and is now clean of bad links. There is a re-inclusion request form, but I’ve never received any feedback when I’ve submitted it in the past, so I have no idea whether it actually works. More importantly, the experience made me realize just how much Google controls how this site does monetarily each and every day.