What are the tokens in the following section of an HTML file?
<h1>Important Heading</h1> <p> <span style="color:blue">Many words</span> </p>
<h1> Important Heading </h1> <p> <span style="color:blue"> Many words </span> </p>
Depending on the application, the tokens might be different from those above. Say that our application is an HTML text formatter (perhaps part of a web browser). For this example application the tokens are:
Tags: everything between < and >, including spaces.
Words: the characters outside of tags, including punctuation.
A word token can be delimited by a character from a tag or by white space. For example, in the following:
<h1>Important Heading</h1>
The word token "Heading" is delimited on the right by <.
Pretend that a scanner is building the token "Heading" character by character. When has it reached the end of the token?
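One way to explore this question is to write a small scanner that looks at one character at a time. The following is only a sketch in Python, not a definitive implementation. The function name tokenize is hypothetical, and the sketch assumes that a tag runs from < through the next > and that a word is any other run of characters containing no white space.

```python
def tokenize(html):
    """Split HTML text into tag tokens and word tokens (hypothetical helper)."""
    tokens = []
    i = 0
    n = len(html)
    while i < n:
        ch = html[i]
        if ch.isspace():
            # White space only separates tokens; it is never part of a word.
            i += 1
        elif ch == "<":
            # Assumption: a tag runs from '<' through the next '>',
            # including any spaces inside it.
            end = html.find(">", i)
            if end == -1:
                end = n - 1  # unterminated tag: take the rest of the input
            tokens.append(html[i:end + 1])
            i = end + 1
        else:
            start = i
            # A word ends at white space, at '<', or at the end of the input.
            while i < n and not html[i].isspace() and html[i] != "<":
                i += 1
            tokens.append(html[start:i])
    return tokens


print(tokenize("<h1>Important Heading</h1>"))
# prints ['<h1>', 'Important', 'Heading', '</h1>']
```

Notice that the inner loop cannot tell that "Heading" is finished by looking at the letter g. It finds out only when it examines the character after the word, which here is the < that starts the closing tag.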