An Interesting Tidbit About W3C’s Validators

I found out something interesting about the W3C validator services yesterday while working on some code. I wanted a way to validate my pages dynamically, so that the validity notice would only appear on pages that actually validate.

While validating one of my pages, I decided to check the response headers returned by the W3C validator engine, just on a whim, to see if there was anything in there that might tell me whether or not my page was valid.

What do you know? Not only is the result spelled out in very plain English, but the headers also tell you how many errors and warnings the engine found if the page wasn’t valid.

The header array returned by the validator after checking a valid page looks like:

Date: Sat, 06 Sep 2008 13:05:51 GMT
Server: Apache/2.2.6 (Debian)
Content-Language: en
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Valid
X-W3C-Validator-Errors: 0
X-W3C-Validator-Warnings: 0
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
200 OK

This is the response header array returned by a page with errors:

Date: Sat, 06 Sep 2008 13:15:59 GMT
Server: Apache/2.2.6 (Debian)
Content-Language: en
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Invalid
X-W3C-Validator-Errors: 25
X-W3C-Validator-Warnings: 0
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
200 OK

I found a bit of information about these headers in the W3C documentation. Check it out for yourself.
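To make that concrete, here is a minimal Python sketch of reading those three headers. The header names come straight from the responses shown above; everything else is an assumption for illustration: the `ValidationResult` class, the function names, and the `VALIDATOR_URL` endpoint with its `uri` query parameter, which you should check against the W3C documentation before relying on it.

```python
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and query parameter; confirm against the W3C docs.
VALIDATOR_URL = "http://validator.w3.org/check"


@dataclass
class ValidationResult:
    """Hypothetical container for the three X-W3C-Validator-* values."""
    status: str
    errors: int
    warnings: int

    @property
    def is_valid(self) -> bool:
        return self.status.lower() == "valid"


def parse_validator_headers(headers: Mapping[str, str]) -> ValidationResult:
    """Pull the validator's verdict out of a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return ValidationResult(
        status=lowered.get("x-w3c-validator-status", "Unknown").strip(),
        errors=int(lowered.get("x-w3c-validator-errors", "0")),
        warnings=int(lowered.get("x-w3c-validator-warnings", "0")),
    )


def validate_url(page_url: str, timeout: float = 30.0) -> ValidationResult:
    """Ask the validator to check page_url and read only the headers."""
    query = urlencode({"uri": page_url})
    with urlopen(f"{VALIDATOR_URL}?{query}", timeout=timeout) as response:
        return parse_validator_headers(dict(response.headers.items()))
```

Feeding `parse_validator_headers` the second set of headers above would give a result with `status` set to `"Invalid"`, `errors` set to 25 and `is_valid` set to `False`. Keeping the parsing separate from the network call means you can test it without touching the validator at all.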

In my case, I found that having the page validate itself every time it loaded caused an endless loop, since each validation request makes the validator load the page again (not to mention that it would slow everything down and put extra load on both my server and the W3C’s). So I wrote a script that checks a page when it’s created or edited. I also wrote a simple script that loops through every page on my site (25 at a time, so it doesn’t time out) and validates each one. I then store the validation information in my database alongside the content of each page.
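The batch approach might look something like the sketch below. This is not my actual script, just an illustration under stated assumptions: the `pages` table and its columns are hypothetical, SQLite stands in for whatever database you use, and `check` is any function you supply that takes a URL and returns a `(status, errors, warnings)` tuple built from the validator's headers.

```python
import sqlite3
from typing import Callable, Tuple

BATCH_SIZE = 25  # small enough that one run doesn't time out

# A checker takes a page URL and returns (status, errors, warnings).
Checker = Callable[[str], Tuple[str, int, int]]


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical pages table used in this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT NOT NULL,"
        " status TEXT,"        # NULL means 'not validated yet'
        " errors INTEGER,"
        " warnings INTEGER)"
    )


def validate_batch(conn: sqlite3.Connection, check: Checker,
                   batch_size: int = BATCH_SIZE) -> int:
    """Validate up to batch_size unchecked pages and store the results.

    Returns the number of pages processed, so the caller can keep
    calling until it gets 0 back.
    """
    rows = conn.execute(
        "SELECT id, url FROM pages WHERE status IS NULL"
        " ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for page_id, url in rows:
        status, errors, warnings = check(url)
        conn.execute(
            "UPDATE pages SET status = ?, errors = ?, warnings = ?"
            " WHERE id = ?",
            (status, errors, warnings, page_id),
        )
    conn.commit()
    return len(rows)
```

When a page is edited, resetting its `status` to `NULL` would put it back in the queue for the next run. Passing the checker in as a parameter also lets you try the loop with a fake checker before pointing it at the real validator.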

Now, when you load a page on the site I’m developing, the “This page is valid” text only shows up if the page has been validated.

These headers are returned by all of the W3C’s validation services, so you can use them when checking anything the W3C can validate ((X)HTML, CSS, RSS and more).