CSci 220 - Lecture 29

Web Page Design

© Morris Firebaugh

     

Introduction

XML Applications

 


A. Examples of XML Application Areas

One of the best information pages on XML is given by OASIS.

Here are some of the first application areas in which XML is being put to work:

MatML

The MatML project, focused on the distribution of materials property data, is coordinated by the US National Institute of Standards and Technology (NIST). The MatML effort "is addressing the problems of interpretation and interoperability through the development of an Extensible Markup Language (XML) [Materials Property Data Markup Language] for materials data that will permit the storage, transmission, and processing of materials property data distributed via the World Wide Web. A MatML Working Group has been established and represents a cross section of the materials community with members from private industry, government laboratories, universities, standards organizations, and professional societies. The Working Group uses an online forum for discussing issues such as the scope of and specifications for MatML and has recently produced a working draft of the document type definition (DTD) for the markup language. The MatML DTD contains structures for transferring information concerning the material and its properties, terms which may help with the interpretation of the transferred data, and graphs. The DTD is the XML semantic and syntactic formalism that software will need to parse, interpret, and use the data contained in MatML documents."


Chemical Markup Language (CML)

[CR: 19990724]

The Chemical Markup Language is documented as "an application of XML" and was
demonstrated at WWW6 with the Jumbo Java-based browser for XML documents.

XML-CML A centre for Chemical Markup Language (CML) resources
CML Frequently Asked Questions
JUMBO XML/CML Browser
PDB2CML - A public domain PDB to CML converter
CML DTD - By P. Murray-Rust and H. Rzepa. Documented in Journal of Chemical
Information and Computer Sciences, 1999.


BIOpolymer Markup Language (BIOML)

[CR: 20000629]

The BIOpolymer Markup Language (BIOML) is "a new XML language, designed to be used for
the annotation of biopolymer sequence information. BIOML allows the full specification of all
experimental information known about molecular entities composed of biopolymers, for example,
proteins and genes. There is currently no general method of annotating biopolymer sequences,
in their biological context. The goal of BIOML is to provide an extensible framework for this
annotation and to provide a common vehicle for exchanging this information between scientists
using the World Wide Web. A BIOML document will describe a physical object, e.g., a
particular protein, in such a way that all known experimental information about that object can
be associated with the object in a logical and meaningful way. The advantage of using a Markup
Language for this task is that the information is necessarily nested at different levels of
complexity and it fits in very well with the tree-leaf structure inherent in XML. Additionally,
although the primary purpose of BIOML is the transfer of information between machines, the
additional style information available when using an XML-based approach will simplify the task
of displaying that information on various types of browsing and display software." BIOML was
designed and written by Ron Beavis, with help from David Fenyö (ProteoMetrics, LLC) and
Brian Chait (Rockefeller University). David States (Washington University) has assisted in the
editing of the DTD language definition.

BIOML Home Page


Weather Observation Markup Format (OMF)

[CR: 19980914]

The Weather Observation Markup Format [or: Weather Observation Definition Format] is an
application of XML used to encode weather observation reports. The goal of the OMF system
is to annotate and augment standard weather reports with derived, computed quantities, and
to re-cast the essential information in a markup format that is easier to interpret, yet
completely accurate. The data formats typically used in weather reports ("FM 15-X Ext.
METAR, FM 16-X Ext. SPECI, FM 51-X Ext. TAF, etc. [constituting] KAWN, WMO feeds . . .")
are both incomplete and suboptimal for some processing objectives. According to a summary
from one of OMF's designers, the OMF application thus "uses XML for annotating weather
observation reports, forecasts and advisories as issued by Weather Meteorological
Organization (WMO), the National Weather Center and Air Force Global Weather Center.
Currently, METAR/SPECI observational reports, Terminal Aerodrome Forecasts (TAFs) and
SIGMET significant weather aircraft advisories are being analyzed and marked up. The
incoming source of data are raw bulletins distributed by KAWN/ADWS or National Weather
Service's Gateways. The bulletins are parsed, reports are decoded and stored into a database,
which can then be queried. The results of the queries are XML-formatted into OMF
documents. It is always possible to reconstruct original reports by stripping away the XML
markup." The designers are also working on adding other types of reports - Upper Air reports,
regional SIGMETs, AIRMETS, Bathythermographs, PIREPS, etc. The markup system is in
actual use "to distribute the most current annotated weather observations, forecasts and
advisories; the Navy's Joint Metoc Viewer is one application that can ingest OMF documents
and display the corresponding data."

Links:

Weather Observation Markup Format. A Description.


Signed Document Markup Language (SDML)

[CR: 19980621]

[June 21, 1998] As of June 19, 1998, the W3C has acknowledged receipt of a NOTE submission
from the Financial Services Technology Consortium (FSTC) for the Signed Document Markup
Language (SDML), Specification 2.0. References: NOTE-SDML-19980619, W3C Note
19-June-1998. Author: Jeff Kravitz (IBM). The topic of the proposal is within scope for the
W3C Digital Signature Initiative. The goal of SDML language, as part of the Electronic Check
Project, is to "a) tag the individual text items making up a document, b) group the text items
into document parts which can have business meaning and can be signed individually or
together, c) allow document parts to be added and deleted without invalidating previous
signatures, and d) allow signing, co-signing, endorsing, co-endorsing, and witnessing
operations on documents and document parts." Abstract: "The SDML 2.0 specification
describes a generic method for digitally signing a document, one or more sections of a
document and/or multiple documents together. The signed documents may be web pages,
e-mail messages or any text based documents. SDML requires the use of public key
cryptography and hash algorithms and will support all of the common methodologies in use
today and those which will be developed in the future. The authors of SDML have defined its
structure, in part, through the use of the Standard Generalized Markup Language (SGML).
SDML is a generalization of the Financial Services Markup Language (FSML), developed by
the Financial Services Technology Consortium. FSML defines the specific document parts
needed for electronic checks, e.g., the tags needed to identify check specific data items, the
semantics of the data items, and processing requirements for electronic checks." Note that the
SGML Document Type Definition is presented in Appendix A; among the 'Issues and
Directions' still under discussion: "Convergence with the new XML standard. XML, a new
SGML-based standard from the W3C is a subset of SGML, as is SDML, but was developed in
order to allow for more flexible documents on the World Wide Web. XML is essentially a
subset of SGML. A number of issues related to the differences between XML and SGML, as
well as some problems caused by the differences in goals for the two languages, will need to
be resolved if SDML were to migrate to become XML-compatible." The W3C reviewers
"expect that the next phase of DSig would want to move to an XML-compliant syntax, and
fully address internationalization requirements."

Links:

http://www.w3.org/TR/1998/NOTE-SDML-19980619/


Mathematical Markup Language (MathML)

[CR: 19990520]

The Mathematical Markup Language (W3C Working Draft) is a specification which "defines the
Mathematical Markup Language, or MathML. MathML is an XML application for describing
mathematical notation and capturing both its structure and content. The goal of MathML is to
enable mathematics to be served, received, and processed on the Web, just as HTML has
enabled this functionality for text. The document begins with background information on
mathematical notation, the problems it poses, and the philosophy underlying the solutions
MathML proposes. MathML can be used to encode both mathematical notation and
mathematical content. About 25 of the MathML tags describe abstract notational structures,
while another 75 provide a way of unambiguously specifying the intended meaning of an
expression. Additional chapters discuss how the MathML content and presentation elements
interact, and how MathML renderers might be implemented and should interact with browsers.
Finally, the document addresses the issue of MathML entities (extended characters) and their
relation to fonts."

[February 24, 1998] MathML (Mathematical Markup Language) specification issued by W3C as
a Proposed Recommendation. Editors: Patrick Ion and Robert Miner. Reference:
PR-math-19980224. Abstract: "MathML is a low-level syntax for representing structured data
such as mathematics in machine-to-machine communication over the Web, providing a
much-needed solution for including mathematical expressions over the Web. In developing
MathML, the goal was to define an XML-compliant markup language that describes the content
and presentation of mathematical expressions. This was achieved with MathML. As an effective
way to include mathematical expressions in Web documents, MathML gives control over the
presentation and the meaning of such expressions. It does this by providing two sets of markup
tags: one set presents the notation of mathematical data in markup format, and the other set
relays the semantic meaning of mathematical expressions, enabling complex mathematical and
scientific notation to be encoded in an explicit way. As an XML application, MathML capitalizes
on XML features and benefits from the wide support of XML. Unlike HTML which was
intended as a markup language for use by people, MathML is intended to be used by machines,
facilitating the searching and indexing of mathematical and scientific information. Software tools
that work with MathML render MathML into formatted equations, enabling users to edit
mathematical equations much as one might edit HTML text. Several early versions of such
MathML tools already exist, and a number of others, both freely available software and
commercial products, are under development." See the press release.
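The "two sets of markup tags" can be made concrete with a small sketch. This example is not part of the quoted abstract: the expression x^2 + 1 is chosen only for illustration, the element names (mrow, msup, mi, mn, mo for presentation; apply, plus, power, ci, cn for content) are taken from the MathML specification, and the namespace declaration is omitted for brevity. Python's standard library is used only to confirm that both fragments are well-formed and to list the tags each one uses.

```python
import xml.etree.ElementTree as ET

# Presentation markup: how x^2 + 1 should look on the page.
presentation = ET.fromstring(
    "<math><mrow>"
    "<msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn>"
    "</mrow></math>"
)

# Content markup: what x^2 + 1 means (plus applied to a power and a number).
content = ET.fromstring(
    "<math><apply><plus/>"
    "<apply><power/><ci>x</ci><cn>2</cn></apply><cn>1</cn>"
    "</apply></math>"
)


def tag_names(tree):
    """Return the sorted set of element names used in a tree."""
    return sorted({element.tag for element in tree.iter()})


print(tag_names(presentation))
print(tag_names(content))
```

Apart from the shared math wrapper, the two fragments use disjoint tag sets, which is the split between notation and meaning that the abstract describes.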

Links:

[April 07, 1998] Announcement for MathML as a W3C Recommendation
Mathematical Markup Language (MathML) specification as Recommendation -
REC-MathML-19980407, W3C Recommendation 07-April-1998.
[June 21, 1998] Signed Document Markup Language (SDML), Specification 2.0. [local archive copy]


B.  Learning XML

An excellent short slide show is given by Howard Strauss.

Consider "Learning XML in 11.5 Minutes" by I. c. reese

Syntax for well-formed XML documents:

  1. Declare your XML version.
  2. Begin and end opening tags with < and >, and closing tags with </ and >.
  3. Ensure child markup nests completely within parent markup.
  4. Start empty markup with < and end it with />.
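
The rules above can be checked mechanically by any XML parser. The sketch below is a minimal illustration and not part of the original lecture: it uses Python's standard xml.etree.ElementTree module, and the helper name is_well_formed and the NOTE/BREAK sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# All four rules satisfied: version declared, tags delimited correctly,
# children nested inside the parent, empty element written as <BREAK/>.
good = '<?xml version="1.0"?><NOTE><TO>Class</TO><BREAK/></NOTE>'

# Rule 3 violated: <B> is opened inside <A> but closed outside it.
bad_nesting = '<?xml version="1.0"?><NOTE><A><B></A></B></NOTE>'

# Rule 4 violated: an HTML-style empty tag that is never closed.
bad_empty = '<?xml version="1.0"?><NOTE><BREAK></NOTE>'

for sample in (good, bad_nesting, bad_empty):
    print(is_well_formed(sample))
```

A parser that only checks well-formedness, as here, does not consult a DTD; whether the tags are the "right" ones is a separate validation step.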

Recall the three parts of an XML document:

  1. Prolog
  2. Root
  3. Epilog

Consider the Prolog

Sample Prolog

<?xml  version="1.0"  encoding="UTF-8"  ?>

Note:


Consider the Root

Sample Root

<SPEECH> 
<SPEAKER> PROSPERO</SPEAKER> 
<LINE><STAGEDIR>Aside</STAGEDIR> The Duke of Milan </LINE> 
<LINE>And his more braver daughter could control thee, </LINE> 
... 
</SPEECH>
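
To see how the prolog and root fit together, the sketch below joins the sample prolog to a copy of the sample root and parses the result with Python's standard library. It is an illustration only: the lines elided by "..." in the sample are simply left out, and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# Prolog (the XML declaration) followed by the root element.
# Passed as bytes so the parser reads the declared encoding itself.
document = b"""<?xml version="1.0" encoding="UTF-8"?>
<SPEECH>
<SPEAKER> PROSPERO</SPEAKER>
<LINE><STAGEDIR>Aside</STAGEDIR> The Duke of Milan </LINE>
<LINE>And his more braver daughter could control thee, </LINE>
</SPEECH>"""

root = ET.fromstring(document)

print(root.tag)                            # name of the root element
print([child.tag for child in root])       # its direct children, in order
print(root.find("SPEAKER").text.strip())   # text inside <SPEAKER>
```

Everything in the document body hangs off the single root element, which is why the parser returns just one object for the whole speech.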

Consider the Epilog


More XML Syntax

Comments

The syntax of XML comments is the same as that of HTML comments.

Sample Comment

<!--  This is an XML comment.  You can put anything in it, including what looks like undefined <Quasi-XML tags> and they will never be scanned by the parser.  -->
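
A short sketch, using Python's standard parser, shows the comment behavior described above. The DOC and ITEM element names and the helper parses are invented for this illustration. It also shows one restriction worth knowing: the XML specification does not allow a double hyphen inside a comment, so "anything" has that one exception.

```python
import xml.etree.ElementTree as ET

# The comment contains something that looks like a tag; the parser skips it.
with_comment = "<DOC><!-- ignored: <Quasi-XML tags> --><ITEM>kept</ITEM></DOC>"
root = ET.fromstring(with_comment)
print([child.tag for child in root])  # only the real element is reported


def parses(text: str) -> bool:
    """Return True if text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A double hyphen in the middle of a comment is rejected by the parser.
print(parses("<DOC><!-- a -- b --></DOC>"))
```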

Note:


C.  Adobe GoLive and XML

Adobe GoLive recognizes XML, the simplified dialect of
SGML (Standard Generalized Markup Language) for the
structured presentation of information on the World Wide
Web. Adobe GoLive reads XML, writes it back to file
without any changes, and lets you inspect and edit existing
XML declarations and tags. Because of the nature of XML,
a visual editor is not available.

XML is a subset of SGML for defining custom markup
languages. XML documents use Document Type
Definitions (DTDs) that define the custom tags available for
use in the document. Adobe GoLive does not validate (that
is, syntax check) XML files. For more information on XML,
visit the Web site at www.w3.org/XML/ or www.xml.com/.

With Adobe GoLive, the user can open existing XML
documents by choosing File > Open, by double-clicking
the XML file, or by dragging and dropping the file onto the
Adobe GoLive application. The Outline Editor is
recommended for viewing and editing XML tags.

Adobe GoLive Layout view shows the XML document in a
collapsed format in which the XML structure is represented
by box symbols. Click the triangle on the left side of a box
to expand or collapse the element or show or hide nested
elements.


XML Document in Layout View: A. Boxes represent XML structure. B.
Triangle control expands or collapses element views.

You can also switch to the source view and edit the XML
markup as text.

Adobe GoLive lets XML developers edit the text of XML
documents. Other than the changes you make, Adobe
GoLive will write the balance of the document without
modifying the code, although it may modify white space.
You can view the new Adobe GoLive structure by looking
in your Adobe GoLive Modules folder in the Web Settings
folder. You can open and view the various files in Outline,
Layout, and Source views. Successful editing of these files
requires expert-level knowledge of XML.


The XML Tab

The XML tab lists Adobe GoLive and imported Document
Type Definition (DTD) files that contain XML tags and
definitions. You cannot edit the entries provided with
Adobe GoLive, but you can view the defaults or import
your own DTD files.

To import a DTD file:

Right-click (Windows) or Control-click (Mac OS) in the XML
tab and choose Import DTD from the context menu.


Another tool for creating XML and DTD documents is XML Spy.


Welcome to XML Spy!

The first true IDE for XML
XML Spy is the first true Integrated Development Environment for the eXtensible Markup Language that includes all major aspects of XML in one powerful and easy-to-use product:

XML Spy is centered around a professional validating XML editor that provides five advanced views on your documents: an Enhanced Grid View for structured editing, a Database/Table view that shows repeated elements in a tabular fashion, a Text View with syntax-coloring for low-level work, a graphical XML Schema design view, and an integrated Browser View that supports both CSS and XSL style-sheets.


D. Browsers and XML


On Display: XML Web Pages with Mozilla

by Simon St. Laurent
March 29, 2000

Direct display of XML in a web browser is finally becoming a reality. This article is the first of a series in which we will examine XML support in the Mozilla, Opera, and Internet Explorer browsers.

Although Cascading Style Sheets Level 2 provides a solid set of tools for presenting XML documents in web browsers, web developers have been waiting a very long time for an implementation that lets them really use their CSS skills with XML. Internet Explorer 5.0 took some credible first steps toward XML+CSS (see Tim Bray's review of Windows IE5 for details), but the latest work from Mozilla goes beyond first steps to a usable set of tools. The solid XML+CSS core and the underlying DOM support suggests that Mozilla will be a useful platform for building applications, not just web pages. Add to that a dash of XLink support, and it looks like Mozilla may be leading the pack.


Recall Example 1: A Well-formed XML Document

 <SENTENCE>
   <SUBJECT  TYPE="COMPLEX">
     <ARTICLE  TYPE="INDEFINITE">A</ARTICLE>
     <ADJECTIVE> quick</ADJECTIVE>
     <ADJECTIVE> brown</ADJECTIVE>
     <NOUN>fox</NOUN>
   </SUBJECT>
   <VERB  TYPE="INTRANSITIVE">jumps</VERB>
   <PREPOSITION>over</PREPOSITION>
   <ARTICLE TYPE="INDEFINITE">a</ARTICLE>
   <OBJECT>
     <ADJECTIVE>lazy</ADJECTIVE>
     <NOUN> dog</NOUN>
   </OBJECT>.
 </SENTENCE>
   

This produces, in Internet Explorer:

but only the raw XML code in Netscape Communicator 4.7
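
Well-formedness can also be checked without a browser. The sketch below is an illustration using Python's standard library, not part of the original example. It parses a copy of Example 1, assuming the root element is closed with </SENTENCE> and that no stray "<" follows "over", then reads one attribute and reassembles the sentence from the text nodes.

```python
import xml.etree.ElementTree as ET

# Example 1 with the root element closed and no stray "<" characters.
sentence = """<SENTENCE>
  <SUBJECT TYPE="COMPLEX">
    <ARTICLE TYPE="INDEFINITE">A</ARTICLE>
    <ADJECTIVE> quick</ADJECTIVE>
    <ADJECTIVE> brown</ADJECTIVE>
    <NOUN>fox</NOUN>
  </SUBJECT>
  <VERB TYPE="INTRANSITIVE">jumps</VERB>
  <PREPOSITION>over</PREPOSITION>
  <ARTICLE TYPE="INDEFINITE">a</ARTICLE>
  <OBJECT>
    <ADJECTIVE>lazy</ADJECTIVE>
    <NOUN> dog</NOUN>
  </OBJECT>.
</SENTENCE>"""

root = ET.fromstring(sentence)

# Attributes are read with get() on the element that carries them.
print(root.find("SUBJECT").get("TYPE"))

# Joining all text nodes and collapsing whitespace recovers the sentence.
words = " ".join("".join(root.itertext()).split())
print(words)
```

If the closing </SENTENCE> tag is left off, or the extra "<" is put back, ET.fromstring raises ParseError instead of returning a tree.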


Example 2: A DTD Document

bioml.dtd


E.  Final Observations on XML

Prospects for XML

Problems of XML

 



F.  Homework Assignment # 15

 


Updated 5/1/01