This section will go over HTML tags that semantically mark up parts of the text. These are the kind of HTML tags that most people are already familiar with. Most of these tags (but not all) are inline; text outside the tags will flow around their contents, without spacing or line breaks. There are a ton of these tags, and I tried to separate them according to use case.
- <a>: Anchor. This tag defines a hyperlink to another document, or to another part of the same document. The contents of the tag, when clicked on, will take the user to the target location of the hyperlink.
- href: A hyperlink reference. This can be a URL, either fully qualified or relative. It can also be just a fragment identifier: the ID of another tag on the page, preceded by a pound sign. In this case, the location would be that specific tag in the document. You could use fragments to handle footnotes, for example.
If you omit the href attribute and the user clicks on the hyperlink, the browser will have nowhere to go, so it does nothing.
- target: The target of that link. The values of the target attribute can be:
- _self: The same browser window as the current HTML page. This is the default.
- _blank: A new window or tab in the browser.
- _parent: The window of the parent document, if the HTML page is embedded in an <iframe> element. (I will talk about the <iframe> element in the section on embedded content.)
- _top: The topmost browser window. Unless the HTML page is in an <iframe>, it is the same as _self.
- A matching name attribute of an <iframe> element. If you use this target, the contents of the <iframe> will be reloaded using the href value of the <a> tag.
- rel: The relationship between this document and the linked document. Unlike on the <link> tag, this attribute is not required, and in fact has few use cases. One of those use cases, however, is important: it can tell web crawlers not to follow this hyperlink. (This is usually specified on links in comments, to prevent SEO spam.) To do this, use the nofollow value.
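To make the anchor attributes a little more concrete, here is a small sketch. Every URL, id, and frame name in it is made up for illustration; only the attribute names and keyword values come from the descriptions above.

```html
<!-- Illustrative fragment; all URLs, ids, and names are assumptions. -->
<h2 id="contact">Contact</h2>

<!-- Fragment identifier: jumps to the element whose id is "contact". -->
<a href="#contact">Jump to the contact section</a>

<!-- Absolute URL, opened in a new window or tab. -->
<a href="https://example.com/" target="_blank">Example site</a>

<!-- Asks web crawlers not to follow a user-submitted link. -->
<a href="https://example.com/promo" rel="nofollow">A commenter's link</a>

<!-- Loads the linked page into the iframe named "preview". -->
<iframe name="preview" title="Preview pane"></iframe>
<a href="page2.html" target="preview">Show page 2 in the preview pane</a>
```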
- <abbr>: Abbreviation. The expanded form is given in the title attribute. Most browsers display this in a tooltip.
You do not need to use this tag every time you write an abbreviation, unless you have a specific reason to (e.g. to style all abbreviations using CSS). But it is common practice to use this tag, and its title attribute, when readers first encounter the abbreviation. Because of this, it is often used with the <dfn> tag (see below). It is common practice to nest the <abbr> tag inside the <dfn> tag.
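As a rough sketch of that nesting, the first mention of an abbreviation might be marked up like this. The sentence itself is invented for the example.

```html
<!-- Illustrative fragment; the wording of the paragraph is an assumption. -->
<p>
  <dfn><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is the language
  used to describe how an HTML document is presented.
  Later mentions of CSS can go without the extra markup.
</p>
```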
The <address> tag is a block-level element, so the address will be displayed on its own lines, apart from the surrounding text.
Similar to <bdo> (see below). The difference is that you cannot specify a text direction with <bdi>, so it is used where the text direction is unknown. At the time of writing, it is only supported by modern versions of Chrome and Firefox.
I recommend that you use another element instead, and specify its dir attribute as auto. If the text direction is known, you should use the <bdo> element, which is supported by all browsers.
- dir: The text direction. This attribute is required. Its value must be either rtl (right-to-left) or ltr (left-to-right); unlike other tags, it cannot accept a value of auto.
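Here is a minimal sketch of the two cases described above: a known direction with <bdo>, and an unknown direction handled with dir set to auto on an ordinary element. The sample text and the choice of <span> are assumptions made for the example.

```html
<!-- Known direction: <bdo> with its required dir attribute.
     The letters are laid out right-to-left. -->
<p>Overridden: <bdo dir="rtl">abc</bdo></p>

<!-- Unknown direction (e.g. a user-supplied name): an ordinary element
     with dir="auto" lets the browser work out the direction itself. -->
<p>Posted by <span dir="auto">إيان</span>: 3 comments</p>
```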
- <dfn>: A term being defined. The paragraph or section that contains the <dfn> element should contain the definition of this term.
This tag should be used for terms that are defined inline, with the rest of the document. If you are making a list of defined terms (like a glossary), then you should use a definition list instead. See the section on structured text for details.
The W3C HTML5 standard redefines this tag to mean “a paragraph-level thematic break, e.g. a scene change in a story, or a transition to another topic within a section of a reference book.” It recommends that you not use the <hr> tag if a header or <section> tag can serve the same purpose.
The <mark> tag represents “a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.” For example, you could use it to highlight a part of a quotation that you want to focus on. Or if a user performs a search, server-side software could use this tag to mark up search results in the page. By default, supporting browsers render the text with a light yellow background.
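Both uses might look something like the sketch below. The quotation and the search term are made up for the example.

```html
<!-- Illustrative fragment; the quoted sentence and search term are assumptions. -->

<!-- Drawing attention to one part of a quotation. -->
<blockquote>
  <p>We tested four builds, and <mark>only the third one passed</mark>.</p>
</blockquote>

<!-- A server marking up a match for the search term "semantic". -->
<p>These tags add <mark>semantic</mark> meaning to parts of the text.</p>
```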
This is not an empty tag. A closing tag is required for valid XHTML, and leaving it off has been discouraged since HTML 4.01 (at least). Designers should always provide the closing </p> tag.
However, back when HTML was being standardized, it inherited SGML’s concept of an “implicitly closed” tag. This is a tag that may have its closing tag omitted, if it is followed by certain other tags. The paragraph tag is an implicitly closed tag, and HTML (even HTML5) inherits this behavior.
This means that you cannot nest most block-level tags inside the <p> tag. If you do this, browsers will implicitly add the closing </p> tag. For example, this:
…will be implicitly converted to this:
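The two forms are easiest to compare side by side. As a sketch (this markup is my own assumption, not necessarily the example the text originally showed), here is a <div> nested inside a paragraph, followed by the structure an HTML5 parser builds from it:

```html
<!-- What the author writes: -->
<p>Some text. <div>A block.</div> More text.</p>

<!-- What the browser builds. The <div> start tag closes the open
     paragraph, and the stray </p> at the end becomes an empty paragraph. -->
<p>Some text. </p>
<div>A block.</div>
 More text.
<p></p>
```

Note that "More text." ends up outside of any paragraph, which is usually not what the author intended.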
Most HTML tags are inline tags, and those can be nested without issue. The ones that cannot be nested are:
This tag can be used to render lots of different kinds of content, such as ASCII art or post office addresses. It is also used quite often to display programming code. See the <code> tag for details.
The <small> tag was never formally deprecated in the HTML 4.01 standard, though the standard discouraged its use. The HTML5 standard redefined it to mean “side comments such as small print.” I would still avoid using it if possible.
But there are many cases where no other tag would be appropriate, so it is still widely used. For example, if you’re displaying an artist’s discography, you could use <span class="album-format"> to define the text representing an album format (CD, LP, digital download, or whatever). It is also one of the ways that the W3C recommends marking up “subheadings” (subtitles, alternative titles, or taglines) in a headline.
If you want to define block-level text, rather than inline text, use a <div> tag instead.
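The discography example could be sketched roughly as follows. The album titles, and every class name other than album-format, are placeholders invented for the example.

```html
<!-- Illustrative fragment; titles and the "discography" class are assumptions. -->
<div class="discography">
  <ul>
    <li>First Album <span class="album-format">LP</span></li>
    <li>Second Album <span class="album-format">CD</span></li>
    <li>Third Album <span class="album-format">digital download</span></li>
  </ul>
</div>
```

The <span> tags carry no meaning of their own here; they only give CSS (or a script) something to select.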
<em>), or it may be text that is typographically emphasized in the page (like a warning). By default, it is rendered in bold type by the browser.
In printed text, footnote numbers are often printed as superscript, but this is not the consensus for HTML footnotes (probably because they’re harder to click on). Instead, put the footnote number inside square brackets. To link the footnote number (and brackets) to the footnote’s location on the page, use the <a> tag.
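A bracketed footnote link might look something like this sketch. The ids fn1 and ref1, and the text itself, are placeholders I chose for the example; the link back to the text is an optional extra.

```html
<!-- Illustrative fragment; the ids and wording are assumptions. -->
<p>
  HTML inherited implicitly closed tags from SGML.<a href="#fn1" id="ref1">[1]</a>
</p>

<p id="fn1">
  [1] The footnote text goes here. <a href="#ref1">Back to the text</a>
</p>
```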
These tags should be used when you are quoting another source in your document.
The convention for abbreviating quotes is to put your own alterations (including ellipses) in square brackets. For example, let’s say the full quote was “When Fred came home after working at the car wash, he said he had a great day, before eating dinner.” You could shorten the quote in this way: “When Fred came home […] he said he had a great day[.]” If you emphasize text that is not emphasized in the original, make sure you indicate this as well, saying something like “(emphasis added).” Make sure your abbreviations don’t fundamentally alter the meaning of the quote. This is not part of the HTML standard; it is just good practice.
- <blockquote>: Quotation block. This tag should be used to display a lengthy block of quoted text. It is a block element, so it will be displayed with empty lines before and after the content. Browsers usually display the contents with indented margins on the left and right.
The W3C suggests that you put source citations in the <cite> tag, possibly nested in a <footer> tag (if you’re using HTML5). These can be put inside or immediately following the <blockquote> tag, but it probably makes more sense to put them inside.
- <cite>: Citation. The HTML 4.01 standard defines this as “a citation or a reference to other sources,” while the HTML5 standard defines this as “a reference to a creative work.” In practice, this means either the title of the quoted work, or the author of the quotation (whether written or spoken). It should not be used to mark up the actual quotation; use the <q> or <blockquote> tag for that.
Because many sources are available online, it is fairly common to use an <a> tag to link to the cited source. Convention dictates that the <a> tag be nested inside the <cite> tag.
This tag should not be confused with the cite attribute, which will not be displayed in the browser. (See below.)
- <q>: Quotation. This is used to mark up a short quotation; it will appear inline with the surrounding text. For longer quotations, use the <blockquote> tag, which will display the quotation as a separate block of text. Browsers will add quotation marks around the contents of the <q> tag, so you don’t need to include them.
- cite: The source of the citation. The value should be the URL of the resource that the quotation is taken from; this can be a relative URL, but more likely will be an absolute URL to a page on a different website. It is meant to be read by machines, and will not be displayed in a web browser.
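Putting the quotation tags together, a block quote and an inline quote might be marked up like this sketch. The URL and the title of the cited work are placeholders, not real sources.

```html
<!-- Illustrative fragment; the URL and the work's title are assumptions. -->
<blockquote cite="https://example.com/freds-day">
  <p>When Fred came home [&hellip;] he said he had a great day[.]</p>
  <footer>
    <cite><a href="https://example.com/freds-day">Fred&rsquo;s Day</a></cite>
  </footer>
</blockquote>

<!-- The browser adds the quotation marks around the <q> contents. -->
<p>
  The story says Fred <q cite="https://example.com/freds-day">had a great day</q>.
</p>
```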
These tags serve the special purpose of displaying computer code in an HTML document. In most cases, the <code> tag will be all you need; the others are not widely used.
There are two things to keep in mind about these tags. The first is that any HTML inside them is not escaped; if the code you’re presenting is HTML code, you will need to use HTML entity references for the angle brackets. The second is that they are inline tags, and do not retain formatting. So if you’re displaying a block of code, you will need to use a <pre> tag as well. The W3C standard suggests that you put the code-related tags inside the <pre> tag.
If your code should be treated like a “figure” (as it is in many textbooks), you should consider wrapping everything in a <figure> tag, and give it a caption using the <figcaption> tag.
There is no standardized way to specify what programming language the code is written in. The W3C suggests you use the class attribute, and a value string having a language- prefix followed by the name of the programming language. No browser has ever done anything with this information, but some client-side syntax highlighters use this technique.
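Here is a minimal sketch that combines those suggestions: escaped HTML source inside <code>, wrapped in <pre> to keep its formatting, and presented as a captioned figure. The caption text and the snippet being displayed are invented for the example.

```html
<!-- Illustrative fragment; the caption and the displayed snippet are assumptions. -->
<figure>
  <pre><code class="language-html">&lt;p&gt;
  Hello, world!
&lt;/p&gt;</code></pre>
  <figcaption>Listing 1: A paragraph, shown as escaped HTML source.</figcaption>
</figure>
```

The angle brackets of the displayed code are written as &lt; and &gt; so the browser shows them instead of parsing them.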
- <kbd>: What the user would type at a keyboard; that is, the command-line input to a program.
- <var>: A variable. This could be a variable in a programming language, but it could also be used to mark up a mathematical variable.
- <samp>: Sample output; that is, the command-line output from a program.
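The three tags in this list might be used along these lines. The command, its output, and the formula are only examples I picked.

```html
<!-- Illustrative fragment; the command, output, and formula are assumptions. -->
<p>
  Type <kbd>pwd</kbd> at the prompt, and the shell prints something like
  <samp>/home/user</samp>.
</p>

<p>
  The area of a rectangle is <var>w</var> &times; <var>h</var>.
</p>
```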
Insertions and deletions
These tags deal with text that has been deleted from the document, and replaced with other text.
- <del>: Deleted text. By default, it is rendered as strikethrough (crossed out) text.
- <ins>: Inserted text. By default, most browsers render it as underlined text.
Insertion and Deletion Attributes:
- cite: Specifies the URL of a document that explains the change. This won’t be rendered by browsers; it’s mainly for machine use (e.g. to gather statistics about a document).
- datetime: Specifies the time and date when this edit took place. Like the cite attribute, this will not be rendered by browsers. The datetime string should have the format YYYY-MM-DD hh:mm:ssTZD (year, month, day; hour, minute, second; time zone designator). If you specify the date, the time is optional; the time zone is optional in any case. You can also use a “T” character instead of a space to separate the date and time.
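A single edit, with both attributes filled in, could be sketched like this. The URL, the dates, and the sentence being edited are all made up.

```html
<!-- Illustrative fragment; the URL, timestamps, and text are assumptions. -->
<p>
  The meeting is on
  <del cite="changes.html#rev2" datetime="2015-03-02T09:30:00-05:00">Tuesday</del>
  <ins cite="changes.html#rev2" datetime="2015-03-02T09:30:00-05:00">Thursday</ins>.
</p>

<!-- The time and time zone are optional, so a date alone also works. -->
<p>Tickets cost <del datetime="2015-03-02">$10</del> <ins datetime="2015-03-02">$12</ins>.</p>
```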
Ruby annotations (sometimes spelled Rubi) are annotations that are usually used as pronunciation guides. They are most common in East Asian languages (Chinese, Japanese, Korean, or Vietnamese), where a more complex script (like Japanese Kanji) is broken down into multiple syllables using a phonetic script (like Furigana). The annotations are usually displayed above the main characters, but they may be displayed on the side if the text runs top-to-bottom.
Ruby annotations are not widely used outside of East Asian countries. At the time of writing, they are not supported by Firefox, or by Android browsers earlier than Android 3.0. Surprisingly, they have been supported by Internet Explorer since version 5.5 at least.
It goes without saying that you should not confuse Ruby annotations with the Ruby programming language. Both originated in East Asia, but that’s about all they have in common.
- <ruby>: This is the root element (the “container” tag) for text that has Ruby annotations. The <rb>, <rt>, and <rp> tags should be nested inside it.
- <rb>: Ruby base. This is the text to be annotated.
- <rt>: Ruby text. This tag’s contents contain the actual Ruby annotations.
- <rp>: Ruby parentheses. This tag exists for browsers that do not support Ruby annotations.
If a reader is using one of those browsers, then the annotations should be displayed inside parentheses. But if their browser does support Ruby annotations, you don’t want the parentheses to show. This tag is the solution to that problem. You wrap the opening and closing parenthesis characters in <rp> tags. Browsers that support Ruby annotations will recognize the <rp> tags, and hide the parentheses; other browsers will ignore the unrecognized <rp> tags, and show the parentheses as usual.
Here is some Ruby code showing you how to pronounce my name:
<ruby> <rb>Karl Giesing</rb> <rp>(</rp><rt>KAH rl GEE zing</rt><rp>)</rp> </ruby>
There are a lot of text markup tags that are deprecated. Rather than mix them in with the other tags, I decided to put them in their own section. A few of them were deprecated because their behavior overlapped other tags; but the majority were deprecated because the tags were presentational, and not semantic.
The W3C revived some of these tags in the HTML5 standard, redefining them in the process. This was probably done because they are still in common use, even though they shouldn’t be. I would personally treat those tags as still being deprecated, and avoid using them.
- <acronym>: Acronym. Use the <abbr> tag instead.
- <b>: Bold text. If you want to mark up important text, use the <strong> tag instead; it renders as bold type by default. Otherwise, use CSS.
The <b> tag was discouraged in HTML 4.01 (though not formally deprecated), and the HTML5 standard gave it a new meaning: “a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood.” (Yeah, OK then.) The W3C’s examples include “key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.”
- <big>: Big text; that is, text rendered in a bigger font size. Use CSS instead.
- <font>: This tag was used to display text in a different style. Use an appropriate semantic tag (or <span> if necessary), and style it using CSS.
- <i>: Italic. The <em> tag is rendered as italic text by default, so you should use that tag instead.
The <i> tag was discouraged in HTML 4.01 (though not formally deprecated), and the HTML5 standard gave it a new meaning: “a span of text in an alternate voice or mood.” The W3C’s examples are “a taxonomic designation, a technical term, an idiomatic phrase from another language, transliteration, a thought, or a ship name in Western texts.”
- <s>: Strikethrough. If you want to mark up deleted text, use the <del> tag instead.
In HTML 4.01, this was a synonym for the <strike> tag (and thus deprecated). It was revived in the HTML5 standard, and redefined to mean “contents that are no longer accurate or no longer relevant.” Note that the <strike> tag was not revived.
- <strike>: Strikethrough. If you want to mark up deleted text, use the <del> tag instead. (But also see the <s> tag.)
- <tt>: Teletype. This was rendered in a fixed-width font, and you should use CSS for this instead. In most cases, other tags would be more appropriate, such as the code-related tags.
- <u>: Underline. There is no non-deprecated equivalent. Most things that should be underlined are emphasized or important, so they should use either the <em> or <strong> tags. For edge cases, you could use a <span> tag with a specialized class.
The <u> tag was deprecated in HTML 4.01, but revived in the HTML5 standard. It was redefined to mean “a span of text with an unarticulated, though explicitly rendered, non-textual annotation.” This could include text that is spelled wrong, or a proper name in Chinese script.
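To show what the "use a semantic tag or CSS instead" advice might look like in practice, here is a rough sketch. The class names are hypothetical, chosen only for this example, and the style rules are one possible choice among many.

```html
<!-- Illustrative fragment; all class names and wording are assumptions. -->
<style>
  .callout    { font-size: 1.25em; }            /* instead of <big> or <font> */
  .keystroke  { font-family: monospace; }       /* instead of <tt> */
  .underlined { text-decoration: underline; }   /* instead of <u> */
</style>

<!-- Semantic tags where the text has a meaning: -->
<p><strong>Warning:</strong> this action <em>cannot</em> be undone.</p>
<p>The price is <del>$10</del> <ins>$12</ins>.</p>

<!-- <span> plus a class where the styling is purely presentational: -->
<p class="callout">Sale ends Friday.</p>
<p>Press <span class="keystroke">Ctrl+S</span> to save.</p>
<p>Sign on the <span class="underlined">dotted line</span>.</p>
```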