What is HTML validation?

HTML validation is a check to see if a website’s source code complies with certain standards. Errors are usually highlighted to help the owner make corrections before testing again. It is not the same as content validation.

What is valid HTML?

Valid HTML adheres to all of the conventions and specifications of Hypertext Markup Language.

Hypertext Markup Language, or HTML, is the underlying code that forms the basis of webpages. To ensure that all existing browsers, and even those that do not yet exist, know how to deal with instructions in an HTML document, there are official specifications: the Hypertext Markup Language specification published by the Internet Engineering Task Force and the W3C recommendation for the fifth revision, HTML5.

The specification sets out, for example, how an <a> element should be handled, what can be done with it, and which attributes are permitted (such as target="") or required for the element to work as a link (such as href="").
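To make the idea of attribute rules more concrete, here is a minimal sketch in Python that reports <a> elements without an href and attributes outside a small allowed set. The set ALLOWED_A_ATTRS and the names AnchorChecker and check_anchors are assumptions made for this illustration. The set is a hand-picked subset, not the full list from the specification, and the check is far simpler than a real validator.

```python
from html.parser import HTMLParser

# Hand-picked subset of attributes for <a>; an assumption for illustration,
# not the complete list from the HTML specification.
ALLOWED_A_ATTRS = {"href", "target", "rel", "download", "hreflang", "type",
                   "id", "class", "title", "lang", "style"}


class AnchorChecker(HTMLParser):
    """Collects messages about <a> start tags while parsing."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        line, _col = self.getpos()  # position of the current start tag
        names = [name for name, _value in attrs]
        if "href" not in names:
            self.messages.append(f"line {line}: <a> has no href attribute")
        for name in names:
            # data-* and aria-* attributes are let through unchecked.
            if name not in ALLOWED_A_ATTRS and not name.startswith(("data-", "aria-")):
                self.messages.append(f"line {line}: unexpected attribute '{name}' on <a>")


def check_anchors(html_text):
    """Return a list of messages for the <a> elements in html_text."""
    checker = AnchorChecker()
    checker.feed(html_text)
    checker.close()
    return checker.messages
```

Calling check_anchors() on a page's source returns an empty list when every link passes this toy check, and one message per problem otherwise.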

Features that were allowed in earlier versions of HTML can also be removed in newer versions.
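Elements such as <font>, <center> and <marquee> are well-known examples: they were common on older pages and are obsolete in HTML5. The following Python sketch looks for a few of them. The list OBSOLETE_ELEMENTS is deliberately partial, and the names ObsoleteFinder and find_obsolete are made up for this example.

```python
from html.parser import HTMLParser

# A partial list of elements that are obsolete in HTML5.
OBSOLETE_ELEMENTS = {"acronym", "applet", "big", "center", "font",
                     "frame", "frameset", "marquee", "strike", "tt"}


class ObsoleteFinder(HTMLParser):
    """Records every start tag whose element is in OBSOLETE_ELEMENTS."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in OBSOLETE_ELEMENTS:
            line, _col = self.getpos()
            self.found.append((line, tag))


def find_obsolete(html_text):
    """Return (line, element) pairs for obsolete elements in html_text."""
    finder = ObsoleteFinder()
    finder.feed(html_text)
    finder.close()
    return finder.found
```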

The specification defines the grammar for this markup language to ensure that a website owner does not have to create three (slightly) different versions of their website for Safari, Chrome and Firefox.

How does HTML validation work?

The World Wide Web Consortium, W3C, provides a free tool at https://validator.w3.org/ that can be used to check a single HTML page to see if the latest HTML conventions are being followed on the page.

The validator downloads the HTML source code of the page and goes through it line by line. The result is a list of warnings and error messages:

[Image: HTML checker]
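A similar report can also be requested by a script. The sketch below assumes the JSON output of the W3C's Nu Html Checker at https://validator.w3.org/nu/, which accepts a doc parameter with the page address and out=json. The function names and the User-Agent string are invented for this example, and the exact fields in the report should be confirmed against the checker's own documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

CHECKER = "https://validator.w3.org/nu/"


def build_check_url(page_url):
    """Return the checker URL that asks for a JSON report on page_url."""
    query = urllib.parse.urlencode({"doc": page_url, "out": "json"})
    return f"{CHECKER}?{query}"


def summarise(report):
    """Turn a JSON report into (kind, line, text) tuples.

    The sub-type (for example "warning") is used when present,
    otherwise the message type (for example "error").
    """
    rows = []
    for msg in report.get("messages", []):
        kind = msg.get("subType") or msg.get("type", "unknown")
        rows.append((kind, msg.get("lastLine"), msg.get("message", "")))
    return rows


def check_page(page_url):
    """Fetch the report over the network and summarise it."""
    request = urllib.request.Request(
        build_check_url(page_url),
        headers={"User-Agent": "html-check-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return summarise(json.load(response))
```

Only check_page() touches the network. The other two functions can be tried out offline with a hand-written report.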

How important is HTML validation?

The most important question is, of course, “How important is it that my HTML source code is valid?”

Fortunately, the answer to this is as follows: “it is not all that important that the source code runs through the validator without errors”.

This is because many websites are not built by hand; software, ranging from desktop applications like FrontPage to content management systems like WordPress, is often used instead.

A visual editor is often used to build the page while the program creates the HTML source code in the background, and it is not a simple task for these programs to always create valid source code.

How does Google deal with non-validating websites?

Google noticed early on that many HTML documents on the web do not validate and sometimes even use elements or attributes that do not exist. Therefore, Google does not penalise pages with invalid HTML.

Google even tries to understand web pages that use incorrect markup, because content of the highest quality can also be found on pages with invalid HTML source code.
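How a tolerant parser can still get at the content of an invalid page can be shown with Python's built-in html.parser module. This is only an illustration of lenient parsing in general and says nothing about how Google actually processes pages. The names TextCollector and visible_text are invented for the example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gathers the text between tags, whatever the tags are."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(html_text):
    """Return the text content of html_text, joined with single spaces."""
    collector = TextCollector()
    collector.feed(html_text)
    collector.close()
    return " ".join(collector.chunks)


# Unclosed <p> and <b> elements plus the obsolete <blink> element.
broken = "<blink><p>Great <b>content<p>still readable</blink>"
print(visible_text(broken))  # prints "Great content still readable"
```

The markup would produce several validator errors, yet the text itself is recovered without any trouble.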

Are there advantages to validating HTML?

Matt Cutts mentions some of the advantages of validating HTML in a video, and the W3C has written a separate document about it, “Why Validate?”.

The benefits are as follows:

  • Validation as a debugging tool – Not all platforms and browsers handle source code, and the HTML errors in it, in the same way. So when something doesn’t work, looking at validation errors can provide a good idea of what’s going wrong and what should be looked at more closely.
  • Validation as future-proof quality assurance – Just because a page is displayed correctly by most browsers today does not mean that the next browser version will continue to support all the quirks of the past. If your own site is built cleanly according to the recommendations, it can be assumed that new versions will have no problems with it.
  • Validation to simplify maintenance – If I adhere to fixed conventions, then my HTML files can also be easily checked, updated and adapted by others. This matters most in larger departments in which several employees share the programming of the website.
  • Validation as an aid to teaching good programming practices – Adherence to established conventions can help beginners, in particular, to better understand the HTML source code they create and also to grasp higher-level concepts more easily.
  • Validation as a sign of professionalism – The W3C article also points out that clean (and thus valid) code is a sign of a programmer who prioritises quality.

Conclusion

Whilst there are good reasons to write clean and valid HTML, there are equally good reasons why validity is not a ranking factor. As a website owner, I don’t have to worry too much about it.

However, meeting the requirements for clean, valid HTML across a website still brings many benefits.

Steve Paine
24.08.2021