SGML and HTML Markups Compared

stoa.org

Two Ways to Mark a Text

Last changed: April 11, 2011

The HTML and SGML versions of Philoctetes both contain the same text, but present it differently. In the HTML version, the emphasis is on appearance, while in the SGML the emphasis is on what the text means.

Start with the cast list. In the HTML version, it has a human-readable header line, then a list of names and descriptions. The names are marked to be boldfaced: that's how the reader knows they're names.

<P><B>CHARACTERS OF THE PLAY</B> <BR>

<P>
<B>ODÝSSEUS</B>, son of Laertes.<BR>
<B>NEOPTÓLEMUS, </B>young son of Achilles.<BR>
<B>PHILOCTÉTES</B>, son of Poeas.<BR>
<B>SPY</B>, a sailor of Odysseus, later disguised as a merchant
<BR>
<B>HÉRACLES<BR>
<B>CHORUS</B></B>, sailors of Neoptolemus.<BR>
A sailor.
<P>

In the SGML, on the other hand, the cast list is labelled <castlist>. Each named character has a <role>, and most characters have a role description, <roledesc>. The roles also have identifiers, used to mark the characters' speeches.
<castlist>
<castitem><role id=Odysseus>Odýsseus 
          <roledesc>son of Laertes</roledesc></castitem>
<castitem><role id=Neoptolemus>Neoptólemus 
          <roledesc>young son of Achilles</roledesc></castitem>
<castitem><role id=Philoctetes>Philoctétes 
          <roledesc>son of Poeas.</roledesc></castitem>
<castitem><role id=Spy>Spy 
          <roledesc>a sailor of Odysseus, later disguised as a merchant</roledesc></castitem>
<castitem><role id=Heracles>Héracles</castitem>
<castitem><role id=chorus>Chorus 
          <roledesc>sailors of Neoptolemus.</roledesc></castitem>
<castitem><roledesc>A sailor.</roledesc></castitem>
</castlist>

Nothing in the SGML text specifies how the list is to be displayed. A separate display program can interpret the cast list and lay it out as appropriate:
  • name in bold caps, comma, description, as in the HTML;
  • name in mixed case, comma, description in italics;
  • name and description in parallel columns of a table;
  • just the names, without descriptions, as is often done in Greek editions;
or any other way the user requires - all from the same marked-up text.
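
As a rough illustration of such a display program, here is a minimal Python sketch. It is an assumption-laden stand-in, not the Stoa's software: it uses Python's standard `html.parser` (which happens to tolerate unquoted attributes and omitted end tags) in place of a real SGML parser, and the names `CastListReader`, `read_cast`, `bold_caps`, and `names_only` are invented for this example. The cast list is abridged from the one above.

```python
from html.parser import HTMLParser

# Abridged cast list in the SGML style shown above (sample data only).
CAST = """<castlist>
<castitem><role id=Odysseus>Odýsseus
          <roledesc>son of Laertes</roledesc></castitem>
<castitem><role id=Heracles>Héracles</castitem>
<castitem><roledesc>A sailor.</roledesc></castitem>
</castlist>"""


class CastListReader(HTMLParser):
    """Collects each <castitem> as a dict with id, role and roledesc."""

    def __init__(self):
        super().__init__()
        self.items = []
        self.field = None  # which text field is currently being filled

    def handle_starttag(self, tag, attrs):
        if tag == "castitem":
            self.items.append({"id": None, "role": "", "roledesc": ""})
            self.field = None
        elif tag in ("role", "roledesc") and self.items:
            self.field = tag
            if tag == "role":
                self.items[-1]["id"] = dict(attrs).get("id")

    def handle_endtag(self, tag):
        # </role> is omitted in the sample, so any end tag closes the field.
        self.field = None

    def handle_data(self, data):
        if self.field:
            self.items[-1][self.field] += data


def read_cast(sgml):
    reader = CastListReader()
    reader.feed(sgml)
    reader.close()
    for item in reader.items:
        item["role"] = item["role"].strip()
        item["roledesc"] = item["roledesc"].strip()
    return reader.items


def bold_caps(items):
    """Name in bold caps, comma, description, as in the HTML version."""
    lines = []
    for item in items:
        name = f"<B>{item['role'].upper()}</B>" if item["role"] else ""
        parts = [p for p in (name, item["roledesc"]) if p]
        lines.append(", ".join(parts) + "<BR>")
    return lines


def names_only(items):
    """Just the names, without descriptions."""
    return [item["role"] for item in items if item["role"]]


cast = read_cast(CAST)
print("\n".join(bold_caps(cast)))
print(names_only(cast))
```

Both layouts come from the same parsed list; adding a table or italic layout would mean writing another small function, not touching the marked-up text.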

The play begins with the entrance of Odysseus and Neoptolemus. In the HTML, it looks like this:

<CENTER>[<I>Scene: the island of Lemnos, in front of <B>PHILOCTETES</B>'
cave.</I>]<BR>
</CENTER>
<P>
<CENTER>[<I>Enter</I> <B>ODYSSEUS</B> and <B>NEOPTOLEMUS</B> <I>followed
by the sailor who will return later as Odysseus' </I><B>SPY.</B>]
<BR>
</CENTER>
<P>
<B>ODYSSEUS</B>. This is the shore of the sea-encircled isle<BR>
of Lemnos, uninhabited and forlorn.<BR>

The SGML:
<div1 type=prologue>
<stage type=setting>
Scene: the island of Lemnos, in front of Philoctetes' cave.
</stage>
<stage type=entrance>
Enter Odysseus and Neoptolemus followed by the sailor who will return later as Odysseus' Spy.
</stage>
<div2 type=unspec>
<sp who=Odysseus>
<l>This is the shore of the sea-encircled isle
<l>of Lemnos, uninhabited and forlorn.

Note how the stage directions are marked <stage> - and even indicate whether they set the scene or call for the entrance of a character. There are also type=exit directions where appropriate. The speech is marked <sp> and its speaker is indicated, who=Odysseus. It's up to the display program to decide whether speakers' names are shown in bold caps before the speech, as in the HTML, centered above the speech, or shown some other way. A program could even pull out only Odysseus's speeches, to make a single actor's part or to analyze his language.
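
Pulling out one actor's part might look something like the following Python sketch. It assumes, for illustration only, that the standard `html.parser` module is an acceptable stand-in for an SGML parser on this simple fragment; `SpeechReader`, `read_speeches`, and `part_for` are hypothetical names, and the sample text is abridged from passages quoted in this document.

```python
from html.parser import HTMLParser

# Abridged sample: a stage direction and a few speeches.
SCENE = """<div1 type=prologue>
<stage type=entrance>
Enter Odysseus and Neoptolemus.
</stage>
<div2 type=unspec>
<sp who=Odysseus>
<l>This is the shore of the sea-encircled isle
<l>of Lemnos, uninhabited and forlorn.
<sp who=Neoptolemus><l> What are you asking but that I should lie?
<sp who=Odysseus><l> I say, snare Philoctetes by deception.
"""


class SpeechReader(HTMLParser):
    """Collects (speaker, [lines]) pairs from <sp who=...> and <l> tags."""

    def __init__(self):
        super().__init__()
        self.speeches = []
        self.in_line = False  # True only while reading the text of an <l>

    def handle_starttag(self, tag, attrs):
        self.in_line = False
        if tag == "sp":
            self.speeches.append((dict(attrs).get("who"), []))
        elif tag == "l" and self.speeches:
            self.speeches[-1][1].append("")
            self.in_line = True

    def handle_endtag(self, tag):
        self.in_line = False

    def handle_data(self, data):
        # Stage directions are skipped: they are not inside an <l>.
        if self.in_line:
            self.speeches[-1][1][-1] += data


def read_speeches(sgml):
    reader = SpeechReader()
    reader.feed(sgml)
    reader.close()
    return [(who, [line.strip() for line in lines])
            for who, lines in reader.speeches]


def part_for(speeches, speaker):
    """All verse lines spoken by one character, in order."""
    return [line for who, lines in speeches if who == speaker
            for line in lines]


speeches = read_speeches(SCENE)
for line in part_for(speeches, "Odysseus"):
    print(line)
```

The selection depends only on the `who` attribute, so it does not matter how the speaker's name would be displayed.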

The sections of the text are marked: the prologue, the episodes, the choral odes. These are all considered first-level divisions, <div1>. Within these are second-level divisions, <div2>, for the strophes and antistrophes, the stichomythia, and other features the markup editor may wish to tag. To facilitate markup, every line must belong to a <div2> subdivision, even if it's only an "unspecified" (type=unspec) division; this is because there cannot be speeches or stage directions in a <div1> after the last <div2>. Naturally this "default" division could be inserted automatically by editing tools.
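
An editing tool could check this rule mechanically. The Python sketch below is one hedged guess at how: it flags any `<sp>` or `<stage>` that starts inside a `<div1>` after a `<div2>` has been closed and before another one opens. The class and function names are invented, the sample fragment is constructed for the example, and `html.parser` again stands in for a real SGML parser working from the DTD.

```python
from html.parser import HTMLParser

# Constructed sample: the second speech follows </div2> with no new <div2>.
STRAY = """<div1 type=episode>
<div2 type=stichomythia>
<sp who=Odysseus><l> Only if by deception - as I said.
</div2>
<sp who=Odysseus><l> Then stay here by the cave and wait for him.
</div1>"""

# The same fragment with the "default" division inserted.
FIXED = STRAY.replace("</div2>\n<sp", "</div2>\n<div2 type=unspec>\n<sp")


class Div2Checker(HTMLParser):
    """Records speeches and stage directions that follow a closed <div2>."""

    def __init__(self):
        super().__init__()
        self.seen_div2 = False  # has this <div1> contained a <div2> yet?
        self.in_div2 = False    # is a <div2> currently open?
        self.stray = []         # (tag, line number) for each violation

    def handle_starttag(self, tag, attrs):
        if tag == "div1":
            self.seen_div2 = self.in_div2 = False
        elif tag == "div2":
            self.seen_div2 = self.in_div2 = True
        elif tag in ("sp", "stage") and self.seen_div2 and not self.in_div2:
            self.stray.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag == "div2":
            self.in_div2 = False


def find_stray(sgml):
    checker = Div2Checker()
    checker.feed(sgml)
    checker.close()
    return checker.stray


print(find_stray(STRAY))
print(find_stray(FIXED))
```

Speeches and stage directions before the first `<div2>` are deliberately not flagged, since the prologue example above places its stage directions there.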

The HTML does not allow machine-readable marking of these divisions, and does not help the markup editor produce consistent human-readable markings either. Look at how the two commoi are marked:
First:

<B>NEOPTOLEMUS</B>. I think that sleep will come upon this man
<BR>
soon now: his head already is bent back; <BR>
the sweat is pouring over his whole body; <BR>
a thin black stream of blood has broken from<BR>
his wounded foot. Friends, let us leave him here <BR>
in peace, and hope that sleep may come upon him.<BR>
<BR>
<I>[COMMOS]<BR>
<I>[Strophe]<BR>
</I></I><B>CHORUS</B>. Sleep who art stranger to sorrow and suffering,<BR>
come to him gently, gently,<BR>
and grant him, lord, thy blessing now;<BR>
let him behold this light<BR>
which now spreads over his lustrous eyes:<BR>
come, Healer, I pray.<BR>
Child, you must make your decision now<BR>
what to do and how to think,<BR>
seeing how these matters stand.<BR>
Why should we be slow to act?<BR>
The moment is judge over every deed,<BR>
and often allows unexpected achievements.<BR>
<B>NEOPTOLEMUS</B>. I know he cannot hear me but I see that we
must fail<BR>
if we should take the bow alone without the man and sail:<BR>
the crown is his and he it was the god meant for our prize -<BR>
it is a shameful thing to boast of futile deeds and lies.<BR>

Second:
<CENTER>[<B>NEOPTOLEMUS</B> <I>follows</I> <B>ODYSSEUS</B> <I>out</I>.]</CENTER>
<P>
[<I>Commos</I>]
<P>
[<I>Strophe A</I>]
<P>
<B>PHILOCTETES</B>. O my hollow cavern of stone, <BR>
now hot, now icy cold, was I <BR>
never again to leave you then, <BR>
unhappy that I am, but die <BR>
with no one near but you?<BR>
Ah ah ah ah!<BR>

The [commos] marker is upper case in the first passage and mixed case in the second. The inconsistency is unattractive to a human reader, and a program trying to parse the HTML would have to treat it as a special case.

In SGML, both are marked the same way: <div1 type=commos>. Here is the first one:

<sp who=Neoptolemus><l> I think that sleep will come upon this man
<l>soon now: his head already is bent back;
<l>the sweat is pouring over his whole body;
<l>a thin black stream of blood has broken from
<l>his wounded foot. Friends, let us leave him here
<l>in peace, and hope that sleep may come upon him.
</div1>

<div1 type=commos><note resp=markup>this is an ordinary stasimon, with a
few lines of hexameters in the middle, but the translator calls it a commos</note>
<div2 type=strophe>
<sp who=chorus><l> Sleep who art stranger to sorrow and suffering,
<l>come to him gently, gently,
<l>and grant him, lord, thy blessing now;
<l>let him behold this light
<l>which now spreads over his lustrous eyes:
<l>come, Healer, I pray.
<l>Child, you must make your decision now
<l>what to do and how to think,
<l>seeing how these matters stand.
<l>Why should we be slow to act?
<l>The moment is judge over every deed,
<l>and often allows unexpected achievements.
</div2>
<div2 type=mesode>
<sp who=Neoptolemus><l> I know he cannot hear me but I see that we must fail
<l>if we should take the bow alone without the man and sail:
<l>the crown is his and he it was the god meant for our prize -
<l>it is a shameful thing to boast of futile deeds and lies.
</div2>
And the second one:
<stage type=exit>Neoptolemus follows Odysseus out.</stage>
</div1>

<div1 type=commos>
<div2 type=strophe n=1>
<sp who=Philoctetes><l> O my hollow cavern of stone,
<l>now hot, now icy cold, was I
<l>never again to leave you then,
<l>unhappy that I am, but die
<l>with no one near but you?
<l>Ah ah ah ah!

It's up to the display program, as always, to decide how to show the word "commos" wherever it occurs. Because the SGML parser and editing tools work together to make the markup consistent, the display program can easily produce consistent output.

Content-based markup also allows marking of features that do not necessarily appear in the display. For example, a scholar might wish to analyze the use of stichomythia in this play. To do this, she must first find the stichomythia in the text. A human reader can do this easily enough, but can a machine? Yes, if the markup editor has marked this feature. There's no obvious way to mark stichomythia (or similar features) with display-oriented HTML, but with content marking it's easy. Here's an example.

First, the undistinguished HTML:

<B>ODYSSEUS. </B>Son of a valiant sire, I once was young;<BR>
my tongue, like yours, was slow; my hand was active.<BR>
But now, by long experience, I see<BR>
the tongue, not deeds, is ruler in all things.<BR>
<B>NEOPTOLEMUS</B>. What are you asking but that I should lie?<BR>
<B>ODYSSEUS. </B>I say, snare Philoctetes by deception.<BR>
<B>NEOPTOLEMUS. </B>But why deceive him rather than persuade him?<BR>
<B>ODYSSEUS. </B>He will not listen - nor be caught by force.<BR>
<B>NEOPTOLEMUS</B>. What dreadful strength could make a man so bold?<BR>
<B>ODYSSEUS</B>. Arrows which bring inevitable death.<BR>
<B>NEOPTOLEMUS</B>. Then do we not so much as dare approach him?<BR>
<B>ODYSSEUS. </B>Only if by deception - as I said.<BR>
<B>NEOPTOLEMUS</B>. Do you not think that telling lies is shameful?<BR>
<B>ODYSSEUS</B>. No - not, at least, if lies lead on to safety.<BR>
<B>NEOPTOLEMUS. </B>How can a man face speaking words like these?<BR>
<B>ODYSSEUS</B>. None should recoil when what he does brings profit.<BR>
<B>NEOPTOLEMUS</B>. How will I profit if he comes to Troy?<BR>
<B>ODYSSEUS</B>. Troy will be captured only by his bow.<BR>
<B>NEOPTOLEMUS</B>. Then will I not sack Troy, as it was promised?<BR>
<B>ODYSSEUS</B>. Not without it, nor it apart from you.<BR>
<B>NEOPTOLEMUS</B>. It must be taken then, if that is true.<BR>
<B>ODYSSEUS</B>. When you accomplish this, two gifts are yours.<BR>
<B>NEOPTOLEMUS</B>. What? Tell me, and I will no longer scruple.<BR>
<B>ODYSSEUS</B>. To be proclaimed at once both wise and good.<BR>
<B>NEOPTOLEMUS</B>. I'll do it then - and lay all shame aside.<BR>
<B>ODYSSEUS</B>. Do you remember all that I have told you?<BR>
<B>NEOPTOLEMUS</B>. Be sure of it - for now I have consented.<BR>
<B>ODYSSEUS</B>. Then stay here by the cave and wait for him.<BR>

Now the SGML:
<sp who=Odysseus><l> Son of a valiant sire, I once was young;
<l>my tongue, like yours, was slow; my hand was active.
<l>But now, by long experience, I see
<l>the tongue, not deeds, is ruler in all things.
<div2 type=stichomythia>
<sp who=Neoptolemus><l> What are you asking but that I should lie?
<sp who=Odysseus><l> I say, snare Philoctetes by deception.
<sp who=Neoptolemus><l> But why deceive him rather than persuade him?
<sp who=Odysseus><l> He will not listen - nor be caught by force.
<sp who=Neoptolemus><l> What dreadful strength could make a man so bold?
<sp who=Odysseus><l> Arrows which bring inevitable death.
<sp who=Neoptolemus><l> Then do we not so much as dare approach him?
<sp who=Odysseus><l> Only if by deception - as I said.
<sp who=Neoptolemus><l> Do you not think that telling lies is shameful?
<sp who=Odysseus><l> No - not, at least, if lies lead on to safety.
<sp who=Neoptolemus><l> How can a man face speaking words like these?
<sp who=Odysseus><l> None should recoil when what he does brings profit.
<sp who=Neoptolemus><l> How will I profit if he comes to Troy?
<sp who=Odysseus><l> Troy will be captured only by his bow.
<sp who=Neoptolemus><l> Then will I not sack Troy, as it was promised?
<sp who=Odysseus><l> Not without it, nor it apart from you.
<sp who=Neoptolemus><l> It must be taken then, if that is true.
<sp who=Odysseus><l> When you accomplish this, two gifts are yours.
<sp who=Neoptolemus><l> What? Tell me, and I will no longer scruple.
<sp who=Odysseus><l> To be proclaimed at once both wise and good.
<sp who=Neoptolemus><l> I'll do it then - and lay all shame aside.
<sp who=Odysseus><l> Do you remember all that I have told you?
<sp who=Neoptolemus><l> Be sure of it - for now I have consented.
<div2 type=unspec>
<sp who=Odysseus><l> Then stay here by the cave and wait for him.

Note how the markup follows the structure of the play. A display program could choose to ignore the <div2 type=stichomythia> tags, could print stichomythia in a different color, or perhaps might print only the stichomythia, for the hypothetical scholar mentioned above.
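
For the hypothetical scholar, printing only the stichomythia could be sketched as below. This is an illustrative assumption, not an existing tool: `DivisionReader` and `lines_in` are made-up names, the sample is abridged from the passage above, and Python's `html.parser` is used only because it copes with this fragment's unquoted attributes and omitted end tags.

```python
from html.parser import HTMLParser

# Abridged from the passage above.
PASSAGE = """<div2 type=unspec>
<sp who=Odysseus><l> Son of a valiant sire, I once was young;
<div2 type=stichomythia>
<sp who=Neoptolemus><l> What are you asking but that I should lie?
<sp who=Odysseus><l> I say, snare Philoctetes by deception.
<div2 type=unspec>
<sp who=Odysseus><l> Then stay here by the cave and wait for him.
"""


class DivisionReader(HTMLParser):
    """Collects (speaker, line) pairs from <div2> sections of one type."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.div_type = None  # type of the <div2> currently open
        self.speaker = None
        self.in_line = False
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.in_line = False
        attrs = dict(attrs)
        if tag == "div2":
            # A new <div2> implicitly ends the previous one.
            self.div_type = attrs.get("type")
        elif tag == "sp":
            self.speaker = attrs.get("who")
        elif tag == "l" and self.div_type == self.wanted:
            self.lines.append([self.speaker, ""])
            self.in_line = True

    def handle_endtag(self, tag):
        self.in_line = False
        if tag == "div2":
            self.div_type = None

    def handle_data(self, data):
        if self.in_line:
            self.lines[-1][1] += data


def lines_in(sgml, div_type):
    reader = DivisionReader(div_type)
    reader.feed(sgml)
    reader.close()
    return [(who, text.strip()) for who, text in reader.lines]


for who, text in lines_in(PASSAGE, "stichomythia"):
    print(f"{who}: {text}")
```

Passing a different type, such as `strophe`, would select a different kind of division without any change to the marked-up text.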

Similarly, the SGML markup labels episodes, stasima, and any other divisions the play has. In an Old Comedy, for example, the markup editor might mark the parabasis, its epirrhemes, and the agon.

Of course, the markup scheme is only as good as the editor who follows it. If the markup editor chooses to set out a verse translation in prose, or (following ancient practice) does not indicate who is speaking the speeches, no computer program can fill in the omitted information. But if the markup editor wants to capture more information about a text than just how it looks on the page, the SGML scheme provides a standard way to do so.

For more technical information on the markup rules used by the Stoa and the Perseus Project, see this document; follow this link to return to the Introduction to Structured Markup.
 



Please send your comments concerning The Stoa: A Consortium for Electronic Publication in the Humanities to Ross Scaife (scaife@stoa.org). This document was published on: 22 December 1999