The formal statement of what a code point is supposed to represent is the description in column 5. If you have a standards-compliant browser configuration, it will also display the actual characters in columns 6, 7 and (we hope) 8, but in the event of any discrepancy, it's column 5 that you should believe. The table only covers the code points from 160 (decimal) upwards, because
Authors are also recommended to refer to my report on browsers, which shows the extent to which some representative browsers support the mechanisms described here, and offers advice to authors on how to code their HTML for best results.
The character code test table, linked to the next section
of the present document, contains only one of these
entity names per character, and conforms to the "Proposed Entities" list
in the HTML2.0 Specification (RFC1866), and the "HTML 3.2 Reference
Specification" (W3C Recommendation 14-Jan-1997), which in turn are
consistent with the larger lists in the ISOnum and ISOdia
entity sets used in SGML.
In the ISO entity sets,
uml and die are
alternative names for the same glyph, intended to be
used according to context: the HTML specifications only
use the uml variant, and by now the current
browsers do support that, but some also support die,
and some older browsers only supported die.
(Look them up in a good dictionary if you want to know
the difference - "dieresis" if it's a US dictionary.)
As neither of them is of much use in HTML, this is usually
nothing to worry about. In any case, all browsers (except for some
pre-HTML2.0 browsers, which can now be ignored) support the
numerical character references, which neatly side-steps the problem.
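As a small illustration of why the numerical form side-steps the naming question, the following Python sketch decodes one named and two numerical references and checks that they land on the same code point. It uses the standard library's `html.unescape`, which follows HTML5 rules rather than those of any browser discussed here, so treat it as a sanity check and not as a browser test.

```python
import html

# One named reference and two numerical forms (decimal and hex)
# for the same Latin-1 character.
references = ["&uml;", "&#168;", "&#xA8;"]

decoded = [html.unescape(ref) for ref in references]

# Each form is expected to resolve to the single code point U+00A8.
for ref, char in zip(references, decoded):
    print(f"{ref:8} -> U+{ord(char):04X}")
```

The numerical forms need no table of names at all, which is why they work even where a browser's list of entity names is incomplete.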
The soft hyphen &shy;, and its equivalents &#173;
and the 8-bit character, should be treated specially, as explained in
the main briefing and in RFC2070. Its rendering in the test tables is
not significant, as those tables are not using it in the proper fashion.
There should be a specific test to ensure that the browser is handling
it correctly (i.e. suppressing it if it is not contiguous with a
linebreak, and rendering a hyphen if it is).
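The rule in parentheses can be sketched in a few lines of Python. This is only a model of the intended behaviour, under the assumption that the text has already been broken into lines by some other means; a real browser also has to decide where to break, which is not attempted here. The function name `render_lines` is made up for the illustration.

```python
SOFT_HYPHEN = "\u00ad"  # code point 173

def render_lines(lines):
    """Render already-wrapped lines of text.

    A soft hyphen immediately before a linebreak is shown as a visible
    hyphen; every other soft hyphen is suppressed.
    """
    rendered = []
    for index, line in enumerate(lines):
        # Only lines other than the last are followed by a linebreak.
        followed_by_break = index < len(lines) - 1
        at_break = followed_by_break and line.endswith(SOFT_HYPHEN)
        visible = line.replace(SOFT_HYPHEN, "")
        rendered.append(visible + "-" if at_break else visible)
    return rendered

wrapped = render_lines(["hy\u00adphen\u00ad", "ation"])
unwrapped = render_lines(["hy\u00adphen\u00adation"])
print(wrapped)
print(unwrapped)
```

In the first call the second soft hyphen sits at the line break and so becomes a hyphen; in the second call nothing is broken, so both are suppressed.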
Description                   entity name          test case
                              or numerical ref.
-----------                   -----------          ---------
Umlaut mark or diaeresis      uml                  ¨
                              die                  ¨
macron (overbar)              macron               &macron;
                              macr                 ¯
                              hibar                &hibar;
degree                        degree               &degree;
                              deg                  °
cedilla                       Cedilla              ¸
                              cedil                ¸

The following are not part of the ISO-8859-1 repertoire:
(numerical character references larger than 255 are from Unicode)

trade mark (TM)               trade                ™
  ditto as numerical:         #8482                ™
endash (old version)          endash               &endash;
endash (current version)      ndash                –
  ditto numerical:            #8211                –
emdash (old version)          emdash               &emdash;
emdash (current version)      mdash                —
  ditto numerical:            #8212                —
aleph symbol (from HTML4.0)   alefsym              ℵ
  ditto numerical:            #8501                ℵ
  (in HTML+, this was:        aleph                ℵ )

"Non-white" space (shown in brackets for clarity):

                              ensp                 [ ]
                              #8194                [ ]
                              emsp                 [ ]
                              #8195                [ ]
some folks seem to think
these are                     enspace              [&enspace;]
                              emspace              [&emspace;]
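The numerical references in the table above can be cross-checked mechanically. The sketch below uses Python's `html.entities.name2codepoint`, which holds the HTML 4 entity definitions; it is offered as an independent check on the numbers, not as part of the original test material. The names `expected`, `mismatches` and `unknown` are just local variables for the illustration.

```python
from html.entities import name2codepoint

# Entity names and the Unicode code points given for them in the table.
expected = {
    "trade": 8482,
    "ndash": 8211,
    "mdash": 8212,
    "alefsym": 8501,
    "ensp": 8194,
    "emsp": 8195,
}

# Any name whose HTML 4 code point differs from the table ends up here.
mismatches = {
    name: name2codepoint.get(name)
    for name, code in expected.items()
    if name2codepoint.get(name) != code
}

# The old or mistaken names from the table; none is an HTML 4 entity.
obsolete = ["endash", "emdash", "enspace", "emspace", "hibar"]
unknown = [name for name in obsolete if name not in name2codepoint]

print("mismatches:", mismatches)
print("not defined in HTML 4:", unknown)
```

An empty `mismatches` dictionary means the table and the HTML 4 definitions agree on every code point listed.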
If the browser is behaving as desired, then columns 6, 7 and (where the browser supports it) column 8 should all be displaying the glyph appropriate to the description in column 5.
If your browser supports at least the basic elements of the HTML3 <TABLE> construct, you can view the TABLE format test document; any browser should be able to view the pre-formatted test document.
Technical note: the tables were created by executing a REXX script.
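The REXX script itself is not reproduced here. Purely as an illustration of the idea, a rough Python analogue might generate one row per code point as follows; the function name `table_rows`, the choice of columns and the use of Unicode character names as descriptions are all assumptions of this sketch, not a description of the actual script or table layout.

```python
import unicodedata
from html.entities import codepoint2name

def table_rows(first=160, last=255):
    """Build (code, description, numerical ref, entity ref) tuples."""
    rows = []
    for code in range(first, last + 1):
        # HTML 4 entity name for this code point, if one is defined.
        name = codepoint2name.get(code, "")
        entity = f"&{name};" if name else ""
        description = unicodedata.name(chr(code))
        rows.append((code, description, f"&#{code};", entity))
    return rows

rows = table_rows()
for code, description, numeric, entity in rows[:3]:
    print(f"{code:3}  {description:30}  {numeric:8}  {entity}")
```

Writing the references out as text, and leaving the decoding to the browser under test, is the whole point of such a table.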
The layout is the same as before.

||     no space at all
| |    a single ordinary space
| |    a single nbsp
|   |  three nbsp
| |    numerical ref.
|   |  three of those
| |    a single ordinary space again