What Is HTML5?

There’s been a lot of talk about HTML5 and all of the new elements it introduces. Forms will be built and used completely differently, the structure of documents will be much more semantic, and new features will be available to website and application developers.

But what are these semantic elements? Are they really anything new? Will they change the structure of your document at all? For the most part, the simple answer is “no”. The new elements mostly just make your documents easier to parse and understand (for machines and for people using assistive technology). Very few of the new elements are really all that new; they’re just the same old elements with new names for new purposes.

Let’s take a look at some of the new elements. To start, we have the following:

  • header
  • footer
  • aside
  • section
  • article
  • nav
  • hgroup
  • output*

Basically, each of the elements mentioned above is nothing more than a specialized <div> tag. With the exception of the <output> tag, which is rendered inline like a <span>, they are all block elements by default; they don’t really serve any different purpose, nor do they have any different attributes than a normal <div>. However, they will be invaluable in providing semantic data for the document.
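As a rough sketch of how these elements might stand in for the generic <div> containers of an older page, consider the skeleton below. The site name, link targets, and text are placeholders made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Previously something like <div id="header"> -->
  <header>
    <h1>Example Site</h1>
    <!-- Previously something like <div id="nav"> -->
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/about">About</a></li>
      </ul>
    </nav>
  </header>

  <!-- Previously something like <div id="content"> -->
  <section>
    <h2>Latest posts</h2>
    <article>
      <h3>What Is HTML5?</h3>
      <p>Post text goes here.</p>
    </article>
  </section>

  <!-- Previously something like <div id="sidebar"> -->
  <aside>
    <p>Related links go here.</p>
  </aside>

  <!-- Previously something like <div id="footer"> -->
  <footer>
    <p>Footer text goes here.</p>
  </footer>
</body>
</html>
```

Nothing about the layout changes here; the only difference is that the element names now describe what each container is for.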

The <output> tag is a little different, in that it actually does have some new attributes (for, form and name), and has a very specific purpose. It is basically the equivalent of the empty <div> many application developers used as a container for information returned from an AJAX request; but, when used properly, it can display a result in HTML5 documents without any server interaction (a calculation between two inputs, etc.).
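Here is a minimal sketch of that kind of calculation, assuming a browser that supports <output>. The field names a, b and total are hypothetical; the inline oninput handler simply adds the two inputs whenever either one changes.

```html
<!-- The handler runs in the browser on every input event; no server request is made. -->
<form oninput="total.value = Number(a.value) + Number(b.value)">
  <input type="number" name="a" id="a" value="2"> +
  <input type="number" name="b" id="b" value="3"> =
  <!-- The for attribute lists the ids of the inputs that feed the result. -->
  <output name="total" for="a b">5</output>
</form>
```

The text inside <output> is only the initial value; the script replaces it as soon as the user edits either field.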

Next, we have another list of elements:

  • details and summary
  • figure and figcaption

Essentially, these two pairs of tags are nothing more than fancy fieldset and legend tags. However, as far as I know, browsers will not render them quite the same way fieldsets and legends have been traditionally rendered.

The details and summary tags create a block of additional information that the reader can expand or collapse on demand (copyright information, etc.); the summary tag provides the visible label for that block. You must use the summary tag as the first child of the details element; otherwise your document will not validate properly.
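A small sketch of the pairing, with placeholder text, might look like this. Note that the summary comes first inside the details element.

```html
<details>
  <!-- The summary must be the first child of details. -->
  <summary>Copyright information</summary>
  <p>All content on this site is the property of Example Author.</p>
</details>
```

Adding the boolean open attribute to the <details> tag should make the block start out expanded in browsers that support the element.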

The figure and figcaption tags have a similar restriction: the figcaption element is optional, but if you use it, it must be either the first or the last child of the figure element. The concept remains the same, though. The figcaption is intended to act as a “legend” for the figure container.
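For illustration, a figure with its caption could be marked up along these lines. The image file name and caption text are made up.

```html
<figure>
  <img src="outline-diagram.png" alt="Diagram of a nested document outline">
  <!-- The caption acts as the "legend" for everything inside the figure. -->
  <figcaption>A document outline with two nested sections.</figcaption>
</figure>
```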

So, then, what is truly different about HTML5?

To begin with, we do have a few new elements that are very different from anything we’ve seen in previous iterations of HTML.

  • audio, video and source
  • canvas
  • ruby, rp and rt (for East Asian characters)
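To give a feel for these, here is a brief sketch of each. The media file names and the canvas id are placeholders I made up; the drawing calls are standard 2D canvas methods.

```html
<!-- The browser plays the first source whose format it supports. -->
<video controls width="320">
  <source src="clip.mp4" type="video/mp4">
  <source src="clip.webm" type="video/webm">
  Your browser does not support the video element.
</video>

<!-- A canvas is a blank drawing surface; script does the actual drawing. -->
<canvas id="demo" width="200" height="100">Your browser does not support canvas.</canvas>
<script>
  var ctx = document.getElementById("demo").getContext("2d");
  ctx.fillStyle = "steelblue";
  ctx.fillRect(10, 10, 120, 60); // x, y, width, height
</script>

<!-- rt holds the pronunciation; rp supplies fallback parentheses for older browsers. -->
<ruby>漢<rp>(</rp><rt>kan</rt><rp>)</rp>字<rp>(</rp><rt>ji</rt><rp>)</rp></ruby>
```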

HTML5 also includes some additional tags that are meant specifically for semantic purposes and, for the most part, make little difference in the appearance of the document (though browsers typically highlight <mark> text and draw <meter> as a small gauge). These are similar to the classic <b>, <i>, <strong>, <em>, <del>, etc. tags, but offer even more detailed information about the text. Many of them actually have their own attributes that help machines determine exactly what the text means. Some of these are:

  • mark
  • time
  • meter
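A short sketch of the three elements in use follows. The date and the numbers are placeholder values chosen only for the example.

```html
<!-- mark flags text as relevant in the current context, such as a search hit. -->
<p>Results for “canvas”: the <mark>canvas</mark> element draws graphics.</p>

<!-- The datetime attribute gives machines an unambiguous version of the date. -->
<p>Posted on <time datetime="2010-06-15">June 15, 2010</time>.</p>

<!-- min, max and value tell the browser where the measurement falls in its range. -->
<p>Disk usage: <meter min="0" max="100" value="72">72%</meter></p>
```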

Document Outlines

In addition, the outlines of HTML5 documents will eventually be interpreted differently than those of previous versions of HTML. Previously, the outline was implied from the levels of the headings within the document (something marked with an <h1> at the top is a top-level item, an <h2> marks the beginning of a second-level item, etc.).

However, HTML5 documents are expected to have more explicit outline definitions. The <section> and <article> tags are intended to separate the items within the outline. In fact, every new sectioning element (<section>, <article>, <aside> and <nav>) is supposed to start with its own heading, which can be an <h1>; the <header> and <footer> tags do not create outline items of their own. If a <section> or <article> is nested within another <section> or <article>, the outer element will be considered a top-level item in the outline, while the inner item will be a second-level item in the outline.
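As a sketch of that nesting, the markup below uses an <h1> in every section and relies on the nesting alone to set the level. The comments show the outline as the HTML5 draft describes it; whether a particular browser or assistive tool actually applies that outline is something you would need to verify.

```html
<article>
  <h1>What Is HTML5?</h1>           <!-- top-level item -->

  <section>
    <h1>Document Outlines</h1>      <!-- second-level item -->
    <p>Section text goes here.</p>
  </section>

  <section>
    <h1>Further Reading</h1>        <!-- second-level item -->
    <p>Section text goes here.</p>
  </section>
</article>
```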

If you plan to use any headings that should be left out of the outline (subtitles, for example), you should use the <hgroup> tag to bind them together. So, if you have an article with the title of “What is HTML5?” and a subtitle of “A short guide to the new HTML5 elements”, you would use the following code to keep the subtitle from showing up as the second level of the outline:

<hgroup>
<h1>What is HTML5?</h1>
<h2>A short guide to the new HTML5 elements</h2>
</hgroup>

Further Reading

If you are interested in reading more about HTML5, you can check out the current list of elements allowed in HTML5 over at the W3C (though, please keep in mind that the list is somewhat in flux, and is actually being discussed in two completely separate working groups).

If you’re interested in actually learning how to use the new elements, you can take the SitePoint “HTML5 Live” course with John Allsopp. Be aware that this course is not intended for complete beginners. It expects students to already have a good HTML foundation, and really only explores how the new elements differ from existing HTML elements. Also, at least when I took it as part of the first group of students, it was put together as a live course, so John does make some mistakes and misstatements throughout. If you’re expecting a perfectly edited, top-of-the-line, professional course, you should look for local training in your area. However, for less than $10, I don’t see any reason not to at least try the course. I thoroughly enjoyed it, and got a lot of fantastic information out of it.