CSci 220 - Lecture 27

Web Page Design

© Morris Firebaugh


I. Introduction

The Emergence of XML


A. Meta-languages & XML

What is the relationship between SGML, HTML, and XML?

All right, then, what is a meta-language?

Visions inspiring XML:

The grand vision of XML is the creation of a worldwide collection of data objects that are fully addressable and fully open to borrowing, reuse, and repackaging by anybody on the net--in short, everything that the copyright laws have strived for centuries to prevent.

XML is a standard for the creation of tagging languages. It sets out a collection of rules that govern how a parser is to behave. An XML parser that follows these rules can parse any document tagged in an XML compatible language. This means you can make up your own language and not have to write any code to parse it. You can concentrate on writing code that processes the information in useful ways. . . .
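As a small illustration of the point above, the sketch below invents a tiny tagging language and hands it to Python's standard xml.etree.ElementTree parser. The element names recipe and ingredient, the qty attribute, and the shopping_list function are all made up for this example. No parsing code is written; only the code that uses the data.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: these tag and attribute names are defined
# nowhere except in this example.
RECIPE = """
<recipe name="pancakes">
  <ingredient qty="2">eggs</ingredient>
  <ingredient qty="1">cup of flour</ingredient>
</recipe>
"""

def shopping_list(xml_text):
    """Return (quantity, item) pairs from a <recipe> document."""
    root = ET.fromstring(xml_text)  # the generic parser does all the parsing
    return [(int(item.get("qty")), item.text)
            for item in root.iter("ingredient")]

print(shopping_list(RECIPE))
```

The same parser would accept any other well-formed vocabulary; only shopping_list knows what the tags mean.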

The advantage of XML is that it allows you to define your own data structures. When you use any particular language defined in XML, you are no longer enjoying the advantages of XML; you are enjoying the advantages of the particular markup language you have chosen.

What XML is:

What XML is NOT:

A very good introduction to XML has been written by the W3 Consortium with commentary by Tim Bray

B.  Design Goals of XML

XML was developed by an SGML Editorial Review Board formed under the auspices of the World Wide Web Consortium (W3C) in 1996 and chaired by Jon Bosak of Sun Microsystems, with the very active participation of an SGML Working Group also organized by the W3C. Dan Connolly served as the ERB's contact with the W3C.

The design goals for XML are:

"XML is primarily intended to meet the requirements of large-scale Web content providers for industry-specific markup, vendor-neutral data exchange, media-independent publishing, one-on-one marketing, workflow management in collaborative authoring environments, and the processing of Web documents by intelligent clients. It is also expected to find use in certain metadata applications. XML is fully internationalized for both European and Asian languages, with all conforming processors required to support the Unicode character set in both its UTF-8 and UTF-16 encodings. The language is designed for the quickest possible client-side processing consistent with its primary purpose as an electronic publishing and data interchange format." [971208 W3C press release]"XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form the character data in the document, and some of which form markup. Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure. A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application. This specification describes the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application." [adapted from the Proposal]
from The SGML/XML Web Page by OASIS
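The division of labor between "XML processor" and "application" in the second quotation can be sketched with Python's standard xml.sax module. Here the SAX parser plays the processor, and a handler class plays the application. The class name OutlineApp, the function outline_of, and the sample document are assumptions made for this illustration.

```python
import xml.sax

class OutlineApp(xml.sax.ContentHandler):
    """The 'application': it never parses, it only receives events."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def startElement(self, name, attrs):
        # Called by the processor each time it reads a start tag.
        self.outline.append("  " * self.depth + name)
        self.depth += 1

    def endElement(self, name):
        # Called by the processor each time it reads an end tag.
        self.depth -= 1

def outline_of(xml_bytes):
    """Let the processor read the document and report to the application."""
    app = OutlineApp()
    xml.sax.parseString(xml_bytes, app)
    return app.outline

DOC = b"<play><title>The Tempest</title><act><scene/></act></play>"
print("\n".join(outline_of(DOC)))
```

The application sees only the logical structure (element names and nesting); how the characters were read and tokenized stays inside the processor.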

C.  The Syntax of XML

The Syntax of XML is Defined in XML Part 1. Syntax

XML supports two levels of syntax:

  1. Well-formed XML documents [lower level]
  2. Valid XML documents specified by Document Type Definitions (DTD) [higher level]
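The lower level can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree, which is a non-validating parser: it reports well-formedness errors but does not check a document against a DTD, so the higher level would need a validating parser instead. The function name is_well_formed is an assumption for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<NOUN>dog</NOUN>"))     # start and end tags match
print(is_well_formed("<NOUN>dog</noun>"))     # end tag differs in case
print(is_well_formed("<a><b>text</a></b>"))   # elements overlap
```

Only the first document is well-formed: tag names must match exactly, including case, and elements must nest without overlapping.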

An XML document is a systematic set of containers called elements

General Syntax for well-formed documents

Purpose of Document Type Definitions (DTDs)

Advantages of Document Type Definitions (DTDs)

 <!ENTITY  me  "Dmitry Kirsanov,  St. Petersburg, Russia">


This document was created by &me; on April 21, 1999
   <!--   your DTD goes here  -->
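To see the entity above being expanded, the sketch below places the declaration in an internal DTD subset and parses the document with Python's standard xml.etree.ElementTree. The note root element and the expand function are assumptions added for this example; the entity and the sentence come from the lines above.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares the entity; the document body
# then refers to it as &me;.
DOC = """<!DOCTYPE note [
  <!ENTITY me "Dmitry Kirsanov, St. Petersburg, Russia">
]>
<note>This document was created by &me; on April 21, 1999</note>"""

def expand(xml_text):
    """Parse the document and return the root element's text."""
    return ET.fromstring(xml_text).text

print(expand(DOC))
```

The parser substitutes the replacement text wherever &me; appears, so the author's details are written once and reused throughout the document.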

D. Examples of XML

Example 1: A Well-formed XML Document

     <NOUN> dog</NOUN>

Example 2: A DTD Document

The Root Document Element Definition from play.dtd

<!ELEMENT play (title, fm, personae, scndescr, playsubt, induct?,prologue?,act+,epilogue?)>

Note: the "?" means the item is optional, and the "+" means one or more occurrences.
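The occurrence indicators behave much like their regular-expression counterparts. The sketch below is not a DTD validator; it is a simplified illustration that handles only a flat, comma-separated content model like the one for play. The names model_to_regex, children_fit, PATTERN, and REQUIRED are assumptions for this example.

```python
import re
import xml.etree.ElementTree as ET

# The content model for <play>, copied from the play.dtd declaration.
PLAY_MODEL = ("title, fm, personae, scndescr, playsubt, "
              "induct?, prologue?, act+, epilogue?")

def model_to_regex(model):
    """Turn a flat, comma-separated content model into a regex
    over a space-terminated sequence of child element names."""
    parts = []
    for item in model.split(","):
        item = item.strip()
        name = item.rstrip("?+*")
        indicator = item[len(name):]   # "", "?", "+", or "*"
        parts.append("(?:" + re.escape(name) + " )" + indicator)
    return re.compile("".join(parts))

def children_fit(element, pattern):
    """Check the element's direct children against the pattern."""
    names = "".join(child.tag + " " for child in element)
    return pattern.fullmatch(names) is not None

PATTERN = model_to_regex(PLAY_MODEL)
REQUIRED = "<title/><fm/><personae/><scndescr/><playsubt/>"

two_acts = ET.fromstring("<play>" + REQUIRED + "<act/><act/></play>")
no_acts = ET.fromstring("<play>" + REQUIRED + "</play>")
print(children_fit(two_acts, PATTERN))
print(children_fit(no_acts, PATTERN))
```

A play with two acts fits the model, because "+" allows repetition; a play with no act does not, because "+" requires at least one. The optional items may be left out without affecting the result.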

The play.dtd also shows the relationship between tags:

<!ELEMENT  speech   (speaker+, (line|stagedir|subhead)+)>
<!ELEMENT  speaker  (#PCDATA)>
<!ELEMENT  line     (#PCDATA | stagedir)*>
<!ELEMENT  stagedir (#PCDATA)>
<!ELEMENT  subhead  (#PCDATA)>

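These are the LINE elements of a SPEECH element marked up with the DTD definitions above (XML names are case-sensitive, so the case of element names in a document must match its DTD):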

<LINE><STAGEDIR>Aside</STAGEDIR>  The Duke of Milan  </LINE>
<LINE>And his more braver daughter could control thee,  </LINE>
<LINE>If now 'twere fit to do't.  At the first sight</LINE>
<LINE>They have changed eyes.  Delicate Ariel,  </LINE>
<LINE>I'll set thee free for this.  </LINE>
<LINE>A  word, good sir;</LINE>
<LINE>I fear you have done yourself some wrong;  a word. </LINE>
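Because a LINE may mix character data with STAGEDIR elements, a program can separate the spoken words from the stage directions. The sketch below does this with Python's standard xml.etree.ElementTree. The enclosing SPEECH wrapper is added so the fragment has a single root (the SPEAKER element is omitted because the excerpt above does not show it), and the function names spoken_text and stage_directions are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Two LINE elements from the excerpt, wrapped in a SPEECH root
# for the purposes of this example.
SPEECH = """<SPEECH>
<LINE><STAGEDIR>Aside</STAGEDIR>  The Duke of Milan  </LINE>
<LINE>And his more braver daughter could control thee,  </LINE>
</SPEECH>"""

def spoken_text(line):
    """Return the words of a LINE, skipping any STAGEDIR content."""
    pieces = [line.text or ""]
    for child in line:
        if child.tag != "STAGEDIR":
            pieces.append("".join(child.itertext()))
        pieces.append(child.tail or "")   # text that follows the child
    return " ".join("".join(pieces).split())

def stage_directions(line):
    """Return the text of every STAGEDIR inside a LINE."""
    return [d.text for d in line.iter("STAGEDIR")]

lines = ET.fromstring(SPEECH).findall("LINE")
for line in lines:
    print(stage_directions(line), spoken_text(line))
```

The markup is what makes this separation possible: a plain-text script would give a program no reliable way to tell "Aside" from the words actually spoken.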

Another excellent resource on XML Files is the XML Magazine

E.  Linking Capabilities of XML

One of the greatest strengths of XML is its more general and abstract Linking Capabilities

Prospects for XML

Problems of XML


F.  Laboratory Assignment # 14

Updated 4/23/2001