CSci 220 - Lecture 28

Web Page Design

© M. Firebaugh

     


Introduction

 


A. The Common Gateway Interface (CGI)

Recall Push and Pull on the Internet

The key to Push is the Common Gateway Interface (CGI). CGI is a standard for interfacing external applications with information servers, such as Web (HTTP) servers.

Features of the CGI

The server passes information about each request to the script through environment variables, for example:

QUERY_STRING -- the part of the URL that follows the "?"
PATH_INFO -- the extra path information supplied after the script name
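
As a quick sketch (the script name echo.cgi is hypothetical), a Perl CGI script reads these values from its environment:

    #!/usr/bin/perl
    # Sketch: echo the CGI environment variables named above.
    # For a request such as  /cgi-bin/echo.cgi/extra/path?color=blue
    # the server sets PATH_INFO to "/extra/path" and QUERY_STRING to "color=blue".
    print "Content-type: text/plain\n\n";
    print "QUERY_STRING = $ENV{'QUERY_STRING'}\n";
    print "PATH_INFO    = $ENV{'PATH_INFO'}\n";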

  • You must also include a header in your CGI output identifying the MIME type of the document you're returning to the client, followed by a blank line
  • Common MIME types include:
    text/html
    text/plain
    
    

    Example:

    Content-type: text/html

    <HTML>
    <HEAD>
    <TITLE>output of HTML from CGI script</TITLE>
    </HEAD>
    <BODY>
    <H1>Sample output</H1>
    What do you think of <STRONG>this?</STRONG>
    </BODY>
    </HTML>

  • A good source of information is the NCSA Primer on CGI
  • The Content-type header above is just one of the headers a CGI script can return; the server, in turn, supplies the script with a host of environment variables such as QUERY_STRING and PATH_INFO.
  • The downside of CGI programming
  • FinalExample.html

      


    B.  File Structures for Implementing CGI

    The standard language for writing CGI scripts is Perl.
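
    As a minimal sketch, a Perl script that produces the sample output shown in Section A might look like this (the script itself would live in your /cgi-bin/ directory):

        #!/usr/bin/perl
        # Minimal sketch: emit the Content-type header, a blank line, and
        # then the HTML document shown in Section A.
        print "Content-type: text/html\n\n";
        print "<HTML>\n<HEAD>\n";
        print "<TITLE>output of HTML from CGI script</TITLE>\n";
        print "</HEAD>\n<BODY>\n";
        print "<H1>Sample output</H1>\n";
        print "What do you think of <STRONG>this?</STRONG>\n";
        print "</BODY>\n</HTML>\n";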

     

    There are some fundamental problems involved with the implementation of CGI scripts.

    The standard practice is to place a directory (folder), /cgi-bin/, immediately below your HTML files.
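
    A typical layout might look like this (the names public_html and index.html are hypothetical; the other names are taken from elsewhere in this lecture):

        public_html/               <-- your HTML files
            index.html
            FinalExample.html
            cgi-bin/               <-- your CGI scripts
                redirect.cgi
                mycount.cgi
                PrntDate.cgi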

         


    C.  WWW Archives as a Source for Scripts

    The WWW contains a vast resource of CGI scripts.  Here is an index:

    Index to CGI Scripts

      

    Example 1:  A counter for hits on your page

    http://technotrade.com/cgi/redirect.html
     

    The Perl script for doing this, together with its data files, consists of:

        1. redirect.cgi (the script itself)
        2. a text data file for counts
        3. a text data file for hit dates, times, and domains

    Example:

              <A HREF="http://www.cgi-local.com">Click here</A>

    will now become

              <A HREF="http://uwp.edu/cgi/firebaugh/redirect.cgi?http://www.cgi-local.com">Click here</A>

    (replace the domain and the location of the script with your own)
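
    In outline, a redirect-style counter script does roughly the following (a sketch only; the file names counts.txt and hits.log are hypothetical, and the actual redirect.cgi from the archive will differ in detail):

        #!/usr/bin/perl
        # Sketch of a redirect-style hit counter:
        #   1. increment the count stored in a text data file
        #   2. append the date, time, and visitor domain to a log file
        #   3. redirect the browser to the URL given after the "?"
        my $target = $ENV{'QUERY_STRING'};            # the URL after the "?"

        open(my $cnt, '+<', 'counts.txt') or die "counts.txt: $!";
        my $count = <$cnt> || 0;
        seek($cnt, 0, 0);
        print $cnt $count + 1, "\n";
        close($cnt);

        open(my $log, '>>', 'hits.log') or die "hits.log: $!";
        my $visitor = $ENV{'REMOTE_HOST'} || $ENV{'REMOTE_ADDR'};
        print $log scalar(localtime), " $visitor\n";
        close($log);

        # The Location header tells the browser to fetch the target URL.
        print "Location: $target\n\n";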
     
    Click here to call mycount.cgi

    Click here to call PrntDate.cgi



    D. Using Remote CGI Scripts

    An excellent source of free CGI scripts is available at

    http://www.cgi-free.com/

    Example 1:  How long till I retire?

    Countdown to the date of my choice! 
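
    In sketch form (the target date below is made up), a countdown script needs only a little date arithmetic:

        #!/usr/bin/perl
        # Sketch of a countdown CGI: report how many days remain until a
        # chosen date (July 1, 2031 here is purely illustrative).
        use Time::Local;

        my $target = timelocal(0, 0, 0, 1, 6, 2031);   # sec, min, hour, day, month (0-based), year
        my $days   = int(($target - time()) / 86400);  # 86400 seconds per day
        print "Content-type: text/html\n\n";
        print "<H1>Only $days days to go!</H1>\n";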
       



    E.  Cookies and MOMspiders

    HTTP is a "stateless" protocol: the server answers each request on its own and keeps no memory of earlier requests from the same visitor.

    Example: How does Amazon.com know my name?

      http://www.Amazon.com
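
    The answer is cookies.  A cookie is a small piece of data that the server asks the browser to store (with a Set-Cookie header) and that the browser sends back (in the Cookie header) on every later request to that site, so a script can recognize a returning visitor.  A minimal sketch in Perl (the cookie name customer and its value are made up for illustration; Amazon's actual mechanism is more elaborate):

        #!/usr/bin/perl
        # Sketch: use a cookie to remember a visitor between requests.
        # The browser returns any cookies for this site in HTTP_COOKIE,
        # e.g. "customer=Morris; session=12345".
        my %cookie = map { split /=/, $_, 2 } split /;\s*/, $ENV{'HTTP_COOKIE'} || '';

        if ($cookie{'customer'}) {
            # Returning visitor: the cookie came back with this request.
            print "Content-type: text/html\n\n";
            print "<H1>Welcome back, $cookie{'customer'}!</H1>\n";
        } else {
            # First visit: ask the browser to store a cookie for next time.
            print "Set-Cookie: customer=Morris\n";
            print "Content-type: text/html\n\n";
            print "<H1>Welcome, new customer!</H1>\n";
        }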
     

    Sites and Links get corrupted, changed, and deleted

    Most documents made available on the World-Wide Web can be considered part of an infostructure -- an information resource database with a specifically designed structure. Infostructures often contain a wide variety of information sources, in the form of interlinked documents at distributed sites, which are maintained by a number of different document owners (usually, but not necessarily, the original document authors). Individual documents may also be shared by multiple infostructures. Since it is rarely static, the content of an infostructure is likely to change over time and may vary from the intended structure. Documents may be moved or deleted, referenced information may change, and hypertext links may be broken.

    As it grows, an infostructure becomes complex and difficult to maintain. Such maintenance currently relies upon the error logs of each server (often never relayed to the document owners), the complaints of users (often not seen by the actual document maintainers), and periodic manual traversals by each owner of all the webs for which they are responsible. Since thorough manual traversal of a web can be time-consuming and boring, maintenance is rarely or inconsistently performed and the infostructure eventually becomes corrupted. What is needed is an automated means for traversing a web of documents and checking for changes which may require the attention of the human maintainers (owners) of that web.

    The Multi-Owner Maintenance spider (MOMspider) has been developed to at least partially solve this maintenance problem. MOMspider can periodically traverse a list of webs (by owner, site, or document tree), check each web for any changes which may require its owner's attention, and build a special index document that lists out the attributes and connections of the web in a form that can itself be traversed as a hypertext document. This paper describes the design of MOMspider and how it was influenced by the nature of distributed hypertext maintenance and requirements for the good behavior of any web-traversing robot. It also includes discussion of the efficiency requirements for maintaining world-wide webs and proposed changes to HTML and HTTP to support distributed maintenance. The paper concludes with a short description of MOMspider's future and pointers to its freeware distribution site.
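
    MOMspider itself is a full-featured tool, but the core idea -- automatically visiting each document in a list and flagging the ones that no longer respond -- can be sketched in a few lines of Perl (links.txt, a file of one URL per line, is hypothetical):

        #!/usr/bin/perl
        # Minimal sketch of automated link maintenance: check each URL in a
        # list and report the ones that appear to be broken.
        use LWP::Simple;

        open(my $list, '<', 'links.txt') or die "links.txt: $!";
        while (my $url = <$list>) {
            chomp $url;
            next unless $url;
            # head() sends an HTTP HEAD request; it returns an empty list on failure.
            my @info = head($url);
            print @info ? "OK      $url\n" : "BROKEN  $url\n";
        }
        close($list);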

     

     




    Updated 4/24/2001