XML is a standard for the creation of tagging languages. It sets out a collection of rules that govern how a parser is to behave. An XML parser that follows these rules can parse any document tagged in an XML-compatible language. This means you can make up your own language without having to write any code to parse it. You can concentrate on writing code that processes the information in useful ways. . . .
The advantage of XML is that it allows you to define your own data structures. When you use any particular language defined in XML, you are no longer enjoying the advantages of XML itself; you are enjoying the advantages of the particular markup language you have chosen.
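As a minimal sketch of this point, the following Python snippet feeds a made-up vocabulary to the generic parser in the standard library (xml.etree.ElementTree). The element names recipe and ingredient and the qty attribute are invented for illustration and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: these tag and attribute names are hypothetical.
DOC = """<recipe name="toast">
  <ingredient qty="2">bread slices</ingredient>
  <ingredient qty="1">butter pat</ingredient>
</recipe>"""

def list_ingredients(xml_text):
    """Parse the text with a generic XML parser and return (qty, item) pairs."""
    root = ET.fromstring(xml_text)
    return [(int(e.get("qty")), e.text) for e in root.findall("ingredient")]

ingredients = list_ingredients(DOC)
print(ingredients)
```

No parsing code was written for the new tags; only the processing step (collecting quantities and items) is specific to this vocabulary.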
XML was developed by an SGML Editorial Review Board (ERB) formed under the auspices of the World Wide Web Consortium (W3C) in 1996 and chaired by Jon Bosak of Sun Microsystems, with the very active participation of an SGML Working Group also organized by the W3C. Dan Connolly served as the ERB's contact with the W3C.
The design goals for XML are:
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
<!ENTITY me "Dmitry Kirsanov, St. Petersburg, Russia">
This document was created by &me; on October 4, 2000
<!DOCTYPE HTML SYSTEM "http://www.foo.com/myfiles/html3x.dtd" [ <!-- your DTD goes here --> ]>
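The entity and DOCTYPE fragments above can be combined into one small document. The sketch below declares the entity in the internal DTD subset and lets Python's standard parser expand the reference. The root element name note is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# The root element name "note" is hypothetical; the entity text follows
# the declaration shown above.
DOC = """<!DOCTYPE note [
  <!ENTITY me "Dmitry Kirsanov, St. Petersburg, Russia">
]>
<note>This document was created by &me; on October 4, 2000</note>"""

# The parser replaces &me; with the declared replacement text.
expanded = ET.fromstring(DOC).text
print(expanded)
```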
|HTML||XML|
|Defines Page Layout||Defines Page Content|
|Concerned with appearance of Web Pages||Concerned with meaning of Web objects|
|Data for display only||Abstract data concepts given form and structure|
|Analogous to Spreadsheet||Analogous to DataBase|
|Uses fixed set of Tags (theoretically)||Uses customized set of author-defined Tags|
|Tag attributes specify appearance of objects||Tag attributes specify behavior of objects|
|Appearance of elements can be modified by CSSs||Appearance of elements can be modified by CSSs|
|Tag functionality is fixed||Tag functionality is defined by DTD and modifiable|
|Supported by All Browsers||Supported only by IE 4.0, 5.0|
|Requires bloated browser to parse bad HTML code||Requires lean & mean parser for strict syntax|
<SENTENCE>
  <SUBJECT TYPE="COMPLEX">
    <ARTICLE TYPE="INDEFINITE">A</ARTICLE>
    <ADJECTIVE> quick</ADJECTIVE>
    <ADJECTIVE> brown</ADJECTIVE>
    <NOUN>fox</NOUN>
  </SUBJECT>
  <VERB TYPE="INTRANSITIVE">jumps</VERB>
  <PREPOSITION>over</PREPOSITION>
  <ARTICLE TYPE="INDEFINITE">a</ARTICLE>
  <OBJECT>
    <ADJECTIVE>lazy</ADJECTIVE>
    <NOUN> dog</NOUN>
  </OBJECT>.
</SENTENCE>
The Root Document Element Definition from play.dtd
<!ELEMENT play (title, fm, personae, scndescr, playsubt, induct?,prologue?,act+,epilogue?)>
Note: the "?" means "optional item" (zero or one occurrence), and the "+" means one or more occurrences.
The play.dtd also shows the relationship between tags:
<!ELEMENT speech (speaker+, (line|stagedir|subhead)+)>
<!ELEMENT speaker (#PCDATA)>
<!ELEMENT line (#PCDATA|stagedir)*>
<!ELEMENT stagedir (#PCDATA)>
<!ELEMENT subhead (#PCDATA)>
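This is a SPEECH element using the DTD definitions above (XML names are case-sensitive, so a validating parser would require the document and the DTD to agree on upper or lower case):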
<SPEECH>
  <SPEAKER>PROSPERO</SPEAKER>
  <LINE><STAGEDIR>Aside</STAGEDIR> The Duke of Milan</LINE>
  <LINE>And his more braver daughter could control thee,</LINE>
  <LINE>If now 'twere fit to do't. At the first sight</LINE>
  <LINE>They have changed eyes. Delicate Ariel,</LINE>
  <LINE>I'll set thee free for this.</LINE>
  <STAGEDIR>To FERDINAND</STAGEDIR>
  <LINE>A word, good sir;</LINE>
  <LINE>I fear you have done yourself some wrong; a word.</LINE>
</SPEECH>
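To illustrate what the speech content model requires, the sketch below hand-translates it into a regular expression over the names of the child elements. This is only an illustration of the idea: the translation is an assumption of this example, Python's standard library does not validate against DTDs, and a real validating parser reads the DTD itself. Names are lower-cased before matching because the sample document uses upper-case tags.

```python
import re
import xml.etree.ElementTree as ET

# (speaker+, (line|stagedir|subhead)+) rewritten by hand as a regex over a
# space-terminated list of child names. This mapping is an assumption.
SPEECH_MODEL = re.compile(r"(speaker )+((line|stagedir|subhead) )+")

def matches_speech_model(xml_text):
    """Return True if the root's children fit the speech content model."""
    root = ET.fromstring(xml_text)
    # Iterating over an element yields its direct children in document order.
    names = "".join(child.tag.lower() + " " for child in root)
    return SPEECH_MODEL.fullmatch(names) is not None

good = "<SPEECH><SPEAKER>PROSPERO</SPEAKER><LINE>A word, good sir;</LINE></SPEECH>"
bad = "<SPEECH><LINE>A word, good sir;</LINE></SPEECH>"  # speaker is missing

result_good = matches_speech_model(good)
result_bad = matches_speech_model(bad)
print(result_good, result_bad)  # True False
```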
The XML Code:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  . . .
</CATALOG>
Note: Internet Explorer 5.0 will display this, but Netscape 4.7 will not.
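Beyond display in a browser, a catalog like this can be processed by any program with an XML parser. The following sketch, which is not part of the original example, reads a two-entry version of the catalog with Python's standard library and extracts the titles and the total price. The elided entries are left out and some fields are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Shortened copy of the catalog; bytes are used so the parser honours the
# encoding named in the XML declaration.
CATALOG = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<CATALOG>
  <CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST>
      <PRICE>10.90</PRICE><YEAR>1985</YEAR></CD>
  <CD><TITLE>Hide your heart</TITLE><ARTIST>Bonnie Tyler</ARTIST>
      <PRICE>9.90</PRICE><YEAR>1988</YEAR></CD>
</CATALOG>"""

root = ET.fromstring(CATALOG)
titles = [cd.findtext("TITLE") for cd in root.findall("CD")]
# Decimal avoids binary floating-point rounding when adding prices.
total = sum(Decimal(cd.findtext("PRICE")) for cd in root.findall("CD"))
print(titles, total)
```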
The representation of mathematical expressions has been one of the weakest features of HTML. Our approach so far (from Computer Graphics by M. Firebaugh) has been to compose the equations with MS-Word, SNAP the image, and save it as a transparent GIF. The results may look nice (see Chapter 10), but they are static images, with no access to the symbols for editing, and no mathematical meaning or content.
Now there is an XML-based markup language called MathML.
In order to meet the diverse needs of the scientific community, MathML has been designed with the following ultimate goals in mind.
<math>
  <mrow>
    <mfrac linethickness='0.2 cm'>
      <mn>1</mn>
      <mrow> <mi>y</mi> <mo>+</mo> <mn>3</mn> </mrow>
    </mfrac>
    <mfrac linethickness='0.3 ex'>
      <mn>1</mn>
      <mrow> <mi>y</mi> <mo>+</mo> <mn>3</mn> </mrow>
    </mfrac>
  </mrow>
</math>
<math>
  <mstyle background='#88cc88'>
    <mfenced>
      <mrow> <mi>a</mi> <mo>+</mo> <mi>b</mi> </mrow>
    </mfenced>
  </mstyle>
  <mo>=</mo>
  <mstyle fontweight='bold'>
    <mfenced>
      <mfrac> <mi>α</mi> <mi>β</mi> </mfrac>
    </mfenced>
  </mstyle>
  <mo>=</mo>
  <mstyle fontfamily='Helvetica'>
    <mfenced open='['>
      <mrow> <mi>a</mi> <mo>+</mo> <mi>b</mi> </mrow>
    </mfenced>
  </mstyle>
</math>
Let's link to the HTML description.
Neither browser supports it.
XHTML 1.0 is the first step toward a modular and extensible web based on XML (Extensible Markup Language). It provides the bridge for web designers to enter the web of the future, while still being able to maintain compatibility with today's HTML 4 browsers. It is the reformulation of HTML 4 as an application of XML. It looks very much like HTML 4, with a few notable exceptions, so if you're familiar with HTML 4, XHTML will be easy to learn and use. XHTML 1.0 was released on January 26, 2000, as a Recommendation by the W3C.
Portability: By the year 2002 as much as 75% of Internet access could be carried out on non-PC platforms such as palm computers, televisions, fridges, automobiles, telephones, etc. In most cases these devices will not have the computing power of a desktop computer, and will not be designed to accommodate ill-formed HTML as do current browsers (bloated with code to handle sloppy or proprietary HTML).
An excellent introduction to XHTML is given in Introduction to XHTML, with eXamples
The XHTML Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Quick Example</title>
  </head>
  <body>
    <h1>Quick Example</h1>
    <a href="http://validator.w3.org/check/referer">
      <img src="http://validator.w3.org/images/vxhtml10" height="31" width="88" border="0" hspace="16" align="left" alt="Valid XHTML 1.0!" /></a>
    <p>Note that the layout (with tabs and alignment) is purely for readability - XHTML doesn't require it.</p>
  </body>
</html>
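Because XHTML is an application of XML, a generic XML parser can read it and will reject markup that is not well-formed. The sketch below shows this with Python's standard library on a shortened page. The DOCTYPE is left out to keep the example small, and the helper name is_well_formed is an invention of this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Shortened version of the page above, without the DOCTYPE declaration.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Quick Example</title></head>
  <body><h1>Quick Example</h1><p>Well-formed markup.</p></body>
</html>"""

def is_well_formed(text):
    """Return True if a generic XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Element names are qualified by the XHTML namespace declared on <html>.
title = ET.fromstring(PAGE).findtext(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
sloppy_ok = is_well_formed("<p>unclosed <br></p>")  # HTML-style empty tag
print(title, is_well_formed(PAGE), sloppy_ok)
```

The second check fails because the br element is never closed, which HTML 4 browsers tolerate but an XML parser does not.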
The differences between HTML and XHTML are beautifully summarized in Introduction to XHTML: Differences with HTML
SMIL, pronounced "smile", has the following two design goals:
1) Define an XML-based language that allows authors to write interactive multimedia presentations. Using SMIL 2.0, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen.
2) Allow reusing of SMIL syntax and semantics in other XML-based languages, in particular those that need to represent timing and synchronization. For example, SMIL 2.0 components are used for integrating timing into XHTML [XHTML10] and into SVG [SVG].
The specification for SMIL is published by the W3C.
Note: In the examples below, the additional syntax related to layout and other issues specific to individual document types is omitted for simplicity.
All the children of a par begin by default when the par begins. For example:
<par>
  <img id="i1" dur="5s" src="img.jpg" />
  <img id="i2" dur="10s" src="img2.jpg" />
  <img id="i3" begin="2s" dur="5s" src="img3.jpg" />
</par>
Elements "i1" and "i2" both begin immediately when the par begins, which is the default begin time. The active duration of "i1" ends at 5 seconds into the par. The active duration of "i2" ends at 10 seconds into the par. The last element "i3" begins at 2 seconds since it has an explicit begin offset, and has a duration of 5 seconds which means its active duration ends 7 seconds after the par begins.
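The arithmetic in this example can be written out as a small sketch. It assumes no repeat or end attributes on the children, and it assumes the par itself ends when its last child ends, which is the default endsync behaviour in SMIL 2.0. The function name active_end is made up for this illustration.

```python
def active_end(begin=0.0, dur=0.0):
    """End of a child's active duration, in seconds after the par begins.

    Assumes no repeat and no end attribute on the child."""
    return begin + dur

# begin defaults to 0 because children of a par start when the par starts.
ends = {
    "i1": active_end(dur=5),
    "i2": active_end(dur=10),
    "i3": active_end(begin=2, dur=5),
}
# Assumption: the par ends with its last child (endsync="last").
par_end = max(ends.values())
print(ends, par_end)
```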
There is an important difference between the semantics of end and dur. The dur attribute, in conjunction with the begin time, specifies the simple duration for an element. This is the duration that is repeated when the element also has a repeat specified. The attribute end on the other hand overrides the active duration of the element. If the element does not have repeat specified, the active duration is the same as the simple duration. However, if the element has repeat specified, then the end will override the repeat, but will not affect the simple duration. For example:
<seq repeat="10" end="stopBtn.click">
  <img src="img1.jpg" dur="2s" />
  <img src="img2.jpg" dur="2s" />
  <img src="img3.jpg" dur="2s" />
</seq>
The sequence will play for 6 seconds on each repeat iteration. It will play through 10 times, unless the user clicks on a "stopBtn" element before 60 seconds have elapsed.
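The relationship between the simple duration, the repeat count and the end event can be sketched as follows. The function and its stop_click_at parameter are hypothetical names for this illustration; a real player reacts to the click event rather than receiving its time in advance.

```python
def seq_active_duration(child_durs, repeat, stop_click_at=None):
    """Active duration in seconds of a repeated seq, cut short by an end event.

    stop_click_at is a hypothetical parameter: the time of the stopBtn click
    in seconds, or None if the user never clicks."""
    simple = sum(child_durs)      # one pass through the sequence
    repeated = simple * repeat    # all iterations played in full
    if stop_click_at is not None and stop_click_at < repeated:
        return stop_click_at      # end overrides the repeat
    return repeated

full_run = seq_active_duration([2, 2, 2], 10)
stopped = seq_active_duration([2, 2, 2], 10, stop_click_at=15)
print(full_run, stopped)  # 60 15
```

In both cases the simple duration stays at 6 seconds; only the active duration changes.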
The time manipulations are based upon a model of cascading time. That is, each element defines its active and simple time as transformations of the parent simple time. This recurses from the root time container to each "leaf" in the time graph. If a time container has a time manipulation defined, this will be reflected in all children of the time container, since they define their time in terms of the parent time container. In the following example a sequence time container is defined to run twice as fast as normal (i.e. twice as fast as its respective time container).
<seq speed="2.0">
  <video src="movie1.mpg" dur="10s" />
  <video src="movie2.mpg" dur="10s" />
  <img src="img1.jpg" begin="2s" dur="10s">
    <animateMotion from="-100,0" to="0,0" dur="10s" />
  </img>
  <video src="movie4.mpg" dur="10s" />
</seq>
The entire contents of the sequence will be observed to play (i.e., to progress) twice as fast. Each video child will be observed to play at twice the normal rate, and so will only last for 5 seconds. The image child will be observed to delay for 1 second (half of the specified begin offset). The animation child of the image will also "inherit" the speed manipulation from the sequence time container, and so will run the motion twice as fast as normal, leaving the image in the final position after only 5 seconds. The simple duration and the active duration of the sequence will be 21 seconds (42 seconds divided by 2).
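The figures quoted above follow from dividing every unscaled time by the speed. A minimal sketch of that arithmetic, with each child written as a (begin offset, duration) pair in seconds; the helper names are made up for this illustration:

```python
def observed(seconds, speed):
    """Time as observed from outside a container running at the given speed."""
    return seconds / speed

def seq_simple_duration(children, speed):
    """Simple duration of a seq whose children are (begin, dur) pairs."""
    unscaled = sum(begin + dur for begin, dur in children)
    return observed(unscaled, speed)

# movie1, movie2, img1 (begin offset of 2s), movie4
children = [(0, 10), (0, 10), (2, 10), (0, 10)]

video_length = observed(10, 2.0)   # each video lasts 5 seconds
image_delay = observed(2, 2.0)     # the image waits 1 second
total = seq_simple_duration(children, 2.0)
print(video_length, image_delay, total)  # 5.0 1.0 21.0
```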
The accelerate and decelerate attributes define a simple acceleration and deceleration of element time, within the simple duration. The values are expressed as a proportion of the simple duration (i.e. between 0 and 1), and are defined such that the length of the simple duration is not changed by the use of these attributes. The normal play speed within the simple duration is increased to compensate for the periods of acceleration and deceleration (this is how the simple duration is preserved). The modified speed is termed the run rate. As the simple duration progresses (i.e., plays back), acceleration causes the rate of progress to increase from a rate of 0 up to the run rate. Progress continues at the run rate until the deceleration phase, when progress slows from the run rate down to a rate of 0. The SMIL code:
<animation dur="10s" accelerate="0.3" decelerate="0.3" .../>
produces the trajectory:
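The shape of such a trajectory can be approximated numerically. The sketch below assumes the rate ramps linearly during the acceleration and deceleration phases; under that assumption, preserving the simple duration requires a run rate of 1 / (1 - a/2 - d/2), where a and d are the accelerate and decelerate values. The function names are made up for this illustration.

```python
def run_rate(accelerate, decelerate):
    """Run rate that keeps the simple duration unchanged (linear ramps assumed)."""
    return 1.0 / (1.0 - accelerate / 2.0 - decelerate / 2.0)

def progress(t, dur, accelerate, decelerate):
    """Element time reached after t seconds of a simple duration of dur seconds."""
    r = run_rate(accelerate, decelerate)
    dacc = accelerate * dur   # length of the acceleration phase
    ddec = decelerate * dur   # length of the deceleration phase
    if t < dacc:
        # Rate rises linearly from 0 to r, so progress grows quadratically.
        return r * t * t / (2.0 * dacc)
    if t <= dur - ddec:
        # Constant run rate; dacc/2 accounts for the slow start.
        return r * (t - dacc / 2.0)
    tdec = t - (dur - ddec)   # time spent in the deceleration phase
    return r * (dur - dacc / 2.0 - ddec + tdec - tdec * tdec / (2.0 * ddec))

rate = run_rate(0.3, 0.3)
samples = [round(progress(t, 10, 0.3, 0.3), 3) for t in (0, 3, 5, 7, 10)]
print(round(rate, 4), samples)
```

For the values in the SMIL code above, the run rate is 1/0.7, about 1.43 times normal speed, and the element still reaches the end of its 10-second simple duration after exactly 10 seconds.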
As a simple example, the following defines an animation of an SVG rectangle shape. The rectangle will change from being tall and thin to being short and wide.
<rect ...>
  <animate attributeName="width" from="10px" to="100px" begin="0s" dur="10s" />
  <animate attributeName="height" from="100px" to="10px" begin="0s" dur="10s" />
</rect>
The rectangle begins with a width of 10 pixels and increases to a width of 100 pixels over the course of 10 seconds. Over the same ten seconds, the height of the rectangle changes from 100 pixels to 10 pixels.
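Intermediate values of such an animation can be worked out by interpolation. The sketch below assumes simple linear interpolation between the from and to values; the function name animate_value is made up for this illustration, and times outside the animation interval are rejected rather than modelled.

```python
def animate_value(from_value, to_value, begin, dur, t):
    """Linearly interpolated attribute value at time t (seconds).

    Assumes linear interpolation and only handles begin <= t <= begin + dur."""
    if not begin <= t <= begin + dur:
        raise ValueError("t is outside the animation interval")
    fraction = (t - begin) / dur
    return from_value + (to_value - from_value) * fraction

# Halfway through, the rectangle is momentarily square.
width_mid = animate_value(10, 100, 0, 10, 5)
height_mid = animate_value(100, 10, 0, 10, 5)
print(width_mid, height_mid)  # 55.0 55.0
```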
Consider a simple still image slideshow of four images, each displayed for 5 seconds. Using SMIL Timing, this slideshow might look like the following:
...
<seq>
  <img src="butterfly.jpg" dur="5s" ... />
  <img src="eagle.jpg" dur="5s" ... />
  <img src="wolf.jpg" dur="5s" ... />
  <img src="seal.jpg" dur="5s" ... />
</seq>
...
Currently when this presentation plays, we see a straight "cut" from one image to another, as shown in this animated image. However, what we would like to see are three left-to-right wipes in between the four images. We can get these by modifying the code as follows:
...
<transition id="wipe1" type="barWipe" subtype="leftToRight" dur="1s"/>
...
<seq>
  <img src="butterfly.jpg" dur="5s" fill="transition" ... />
  <img src="eagle.jpg" dur="5s" fill="transition" transIn="wipe1" ... />
  <img src="wolf.jpg" dur="5s" fill="transition" transIn="wipe1" ... />
  <img src="seal.jpg" dur="5s" transIn="wipe1" ... />
</seq>
Now the presentation plays as follows, as illustrated by this animated image.
[Table of character sets (columns include # of Characters; entries include ISO 10646 BMP and Character Set (UCS))]
In 1992, the Network Working Group published RFC 1341, which defined MIME (Multipurpose Internet Mail Extensions) and extended the ability of Internet e-mail to handle various non-text file formats.
Examples of MIME Types and corresponding Extensions include:
|video/mpeg||mpeg, mpg, mpe|
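The mapping from file extensions to MIME types is available in many programming environments. As a small sketch, Python's standard mimetypes module can be asked for the type of a file name; the file names below are hypothetical.

```python
import mimetypes

# A MimeTypes instance built from the module's default extension table.
db = mimetypes.MimeTypes()

# guess_type returns a (type, encoding) pair; only the type is kept here.
guesses = {
    name: db.guess_type(name)[0]
    for name in ("clip.mpeg", "clip.mpg", "clip.mpe")
}
print(guesses)
```

All three extensions in the table row above resolve to the same MIME type.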
HTML 4.0 introduced the LANG attribute:
Ce paragraphe est en Français
<P LANG="fr">Ce paragraphe est en Français</P>
<P LANG="en"> The English language always uses quotes <Q>like this</Q>, French has <Q LANG="fr">comme ça</Q> and German prefers <Q LANG="de">wie hier</Q>.</P>
The English language always uses quotes like this, French has comme ça and German prefers wie hier.
No quotation marks appear in the rendered output, so this part of HTML 4.0 does not seem to be implemented.
Web Design in a Nutshell, Jennifer Niederst, O'Reilly & Associates, Sebastopol, CA (1999)
Just XML, John E. Simpson, Prentice Hall PTR, Upper Saddle River, NJ (1999)