| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | \section{Standard Module \sectcode{cgi}} | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \stmodindex{cgi} | 
					
						
							|  |  |  | \indexii{WWW}{server} | 
					
						
							|  |  |  | \indexii{CGI}{protocol} | 
					
						
							|  |  |  | \indexii{HTTP}{protocol} | 
					
						
							|  |  |  | \indexii{MIME}{headers} | 
					
						
							|  |  |  | \index{URL} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1995-02-28 17:14:32 +00:00
										 |  |  | \renewcommand{\indexsubitem}{(in module cgi)} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | Support module for CGI (Common Gateway Interface) scripts. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | This module defines a number of utilities for use by CGI scripts | 
					
						
							|  |  |  | written in Python. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Introduction} | 
					
						
							|  |  |  | \nodename{Introduction to the CGI module} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | A CGI script is invoked by an HTTP server, usually to process user | 
					
						
							|  |  |  | input submitted through an HTML \code{<FORM>} or \code{<ISINPUT>} element. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Most often, CGI scripts live in the server's special \code{cgi-bin} | 
					
						
							|  |  |  | directory.  The HTTP server places all sorts of information about the | 
					
						
							|  |  |  | request (such as the client's hostname, the requested URL, the query | 
					
						
							|  |  |  | string, and lots of other goodies) in the script's shell environment, | 
					
						
							|  |  |  | executes the script, and sends the script's output back to the client. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The script's input is connected to the client too, and sometimes the | 
					
						
							|  |  |  | form data is read this way; at other times the form data is passed via | 
					
						
							|  |  |  | the ``query string'' part of the URL.  This module (\code{cgi.py}) is intended | 
					
						
							|  |  |  | to take care of the different cases and provide a simpler interface to | 
					
						
							|  |  |  | the Python script.  It also provides a number of utilities that help | 
					
						
							|  |  |  | in debugging scripts, and the latest addition is support for file | 
					
						
							|  |  |  | uploads from a form (if your browser supports it -- Grail 0.3 and | 
					
						
							|  |  |  | Netscape 2.0 do). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The output of a CGI script should consist of two sections, separated | 
					
						
							|  |  |  | by a blank line.  The first section contains a number of headers, | 
					
						
							|  |  |  | telling the client what kind of data is following.  Python code to | 
					
						
							|  |  |  | generate a minimal header section looks like this: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     print "Content-type: text/html"     # HTML is following | 
					
						
							|  |  |  |     print                               # blank line, end of headers | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The second section is usually HTML, which allows the client software | 
					
						
							|  |  |  | to display nicely formatted text with header, in-line images, etc. | 
					
						
							|  |  |  | Here's Python code that prints a simple piece of HTML: | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     print "<TITLE>CGI script output</TITLE>" | 
					
						
							|  |  |  |     print "<H1>This is my first CGI script</H1>" | 
					
						
							|  |  |  |     print "Hello, world!" | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | (It may not be fully legal HTML according to the letter of the | 
					
						
							|  |  |  | standard, but any browser will understand it.) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Using the cgi module} | 
					
						
							|  |  |  | \nodename{Using the cgi module} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Begin by writing \code{import cgi}.  Don't use \code{from cgi import *} -- the | 
					
						
							|  |  |  | module defines all sorts of names for its own use or for backward  | 
					
						
							|  |  |  | compatibility that you don't want in your namespace. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | It's best to use the \code{FieldStorage} class.  The other classes define in this  | 
					
						
							|  |  |  | module are provided mostly for backward compatibility.  Instantiate it  | 
					
						
							|  |  |  | exactly once, without arguments.  This reads the form contents from  | 
					
						
							|  |  |  | standard input or the environment (depending on the value of various  | 
					
						
							|  |  |  | environment variables set according to the CGI standard).  Since it may  | 
					
						
							|  |  |  | consume standard input, it should be instantiated only once. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The \code{FieldStorage} instance can be accessed as if it were a Python  | 
					
						
							|  |  |  | dictionary.  For instance, the following code (which assumes that the  | 
					
						
							|  |  |  | \code{Content-type} header and blank line have already been printed) checks that  | 
					
						
							|  |  |  | the fields \code{name} and \code{addr} are both set to a non-empty string: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     form = cgi.FieldStorage() | 
					
						
							|  |  |  |     form_ok = 0 | 
					
						
							|  |  |  |     if form.has_key("name") and form.has_key("addr"): | 
					
						
							|  |  |  |         if form["name"].value != "" and form["addr"].value != "": | 
					
						
							|  |  |  |             form_ok = 1 | 
					
						
							|  |  |  |     if not form_ok: | 
					
						
							|  |  |  |         print "<H1>Error</H1>" | 
					
						
							|  |  |  |         print "Please fill in the name and addr fields." | 
					
						
							|  |  |  |         return | 
					
						
							|  |  |  |     ...further form processing here... | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Here the fields, accessed through \code{form[key]}, are themselves instances | 
					
						
							|  |  |  | of \code{FieldStorage} (or \code{MiniFieldStorage}, depending on the form encoding). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | If the submitted form data contains more than one field with the same | 
					
						
							|  |  |  | name, the object retrieved by \code{form[key]} is not a \code{(Mini)FieldStorage} | 
					
						
							|  |  |  | instance but a list of such instances.  If you expect this possibility | 
					
						
							|  |  |  | (i.e., when your HTML form comtains multiple fields with the same | 
					
						
							|  |  |  | name), use the \code{type()} function to determine whether you have a single | 
					
						
							|  |  |  | instance or a list of instances.  For example, here's code that | 
					
						
							|  |  |  | concatenates any number of username fields, separated by commas: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     username = form["username"] | 
					
						
							|  |  |  |     if type(username) is type([]): | 
					
						
							|  |  |  |         # Multiple username fields specified | 
					
						
							|  |  |  |         usernames = "" | 
					
						
							|  |  |  |         for item in username: | 
					
						
							|  |  |  |             if usernames: | 
					
						
							|  |  |  |                 # Next item -- insert comma | 
					
						
							|  |  |  |                 usernames = usernames + "," + item.value | 
					
						
							|  |  |  |             else: | 
					
						
							|  |  |  |                 # First item -- don't insert comma | 
					
						
							|  |  |  |                 usernames = item.value | 
					
						
							|  |  |  |     else: | 
					
						
							|  |  |  |         # Single username field specified | 
					
						
							|  |  |  |         usernames = username.value | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | If a field represents an uploaded file, the value attribute reads the  | 
					
						
							|  |  |  | entire file in memory as a string.  This may not be what you want.  You can  | 
					
						
							|  |  |  | test for an uploaded file by testing either the filename attribute or the  | 
					
						
							|  |  |  | file attribute.  You can then read the data at leasure from the file  | 
					
						
							|  |  |  | attribute: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     fileitem = form["userfile"] | 
					
						
							|  |  |  |     if fileitem.file: | 
					
						
							|  |  |  |         # It's an uploaded file; count lines | 
					
						
							|  |  |  |         linecount = 0 | 
					
						
							|  |  |  |         while 1: | 
					
						
							|  |  |  |             line = fileitem.file.readline() | 
					
						
							|  |  |  |             if not line: break | 
					
						
							|  |  |  |             linecount = linecount + 1 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The file upload draft standard entertains the possibility of uploading | 
					
						
							|  |  |  | multiple files from one field (using a recursive \code{multipart/*} | 
					
						
							|  |  |  | encoding).  When this occurs, the item will be a dictionary-like | 
					
						
							|  |  |  | FieldStorage item.  This can be determined by testing its type | 
					
						
							|  |  |  | attribute, which should have the value \code{multipart/form-data} (or | 
					
						
							|  |  |  | perhaps another string beginning with \code{multipart/}  It this case, it | 
					
						
							|  |  |  | can be iterated over recursively just like the top-level form object. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | When a form is submitted in the ``old'' format (as the query string or as a  | 
					
						
							|  |  |  | single data part of type \code{application/x-www-form-urlencoded}), the items  | 
					
						
							|  |  |  | will actually be instances of the class \code{MiniFieldStorage}.  In this case, | 
					
						
							|  |  |  | the list, file and filename attributes are always \code{None}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Old classes} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | These classes, present in earlier versions of the \code{cgi} module, are still  | 
					
						
							| 
									
										
										
										
											1996-10-24 14:47:44 +00:00
										 |  |  | supported for backward compatibility.  New applications should use the | 
					
						
							|  |  |  | FieldStorage class. | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | \code{SvFormContentDict}: single value form content as dictionary; assumes each  | 
					
						
							|  |  |  | field name occurs in the form only once. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \code{FormContentDict}: multiple value form content as dictionary (the form | 
					
						
							|  |  |  | items are lists of values).  Useful if your form contains multiple | 
					
						
							|  |  |  | fields with the same name. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Other classes (\code{FormContent}, \code{InterpFormContentDict}) are present for | 
					
						
							|  |  |  | backwards compatibility with really old applications only.  If you still  | 
					
						
							|  |  |  | use these and would be inconvenienced when they disappeared from a next  | 
					
						
							|  |  |  | version of this module, drop me a note. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Functions} | 
					
						
							| 
									
										
										
										
											1996-12-13 22:04:31 +00:00
										 |  |  | \nodename{Functions in cgi module} | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | These are useful if you want more control, or if you want to employ | 
					
						
							|  |  |  | some of the algorithms implemented in this module in other | 
					
						
							|  |  |  | circumstances. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{funcdesc}{parse}{fp}: Parse a query in the environment or from a file (default \code{sys.stdin}). | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{parse_qs}{qs}: parse a query string given as a string argument (data of type  | 
					
						
							|  |  |  | \code{application/x-www-form-urlencoded}). | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{parse_multipart}{fp\, pdict}: parse input of type \code{multipart/form-data} (for  | 
					
						
							|  |  |  | file uploads).  Arguments are \code{fp} for the input file and  | 
					
						
							|  |  |  |     \code{pdict} for the dictionary containing other parameters of \code{content-type} header | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     Returns a dictionary just like \code{parse_qs()}: keys are the field names, each  | 
					
						
							|  |  |  |     value is a list of values for that field.  This is easy to use but not  | 
					
						
							|  |  |  |     much good if you are expecting megabytes to be uploaded -- in that case,  | 
					
						
							|  |  |  |     use the \code{FieldStorage} class instead which is much more flexible.  Note  | 
					
						
							|  |  |  |     that \code{content-type} is the raw, unparsed contents of the \code{content-type}  | 
					
						
							|  |  |  |     header. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     Note that this does not parse nested multipart parts -- use \code{FieldStorage} for  | 
					
						
							|  |  |  |     that. | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{parse_header}{string}: parse a header like \code{Content-type} into a main | 
					
						
							|  |  |  | content-type and a dictionary of parameters. | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{test}{}: robust test CGI script, usable as main program. | 
					
						
							|  |  |  |     Writes minimal HTTP headers and formats all information provided to | 
					
						
							|  |  |  |     the script in HTML form. | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{print_environ}{}: format the shell environment in HTML. | 
					
						
							|  |  |  | \end{funcdesc} | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{print_form}{form}: format a form in HTML. | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{print_directory}{}: format the current directory in HTML. | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{funcdesc}{print_environ_usage}{}: print a list of useful (used by CGI) environment variables in | 
					
						
							|  |  |  | HTML. | 
					
						
							|  |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{funcdesc}{escape}{}: convert the characters ``\code{\&}'', ``\code{<}'' and ``\code{>}'' to HTML-safe | 
					
						
							|  |  |  | sequences.  Use this if you need to display text that might contain | 
					
						
							|  |  |  | such characters in HTML.  To translate URLs for inclusion in the HREF | 
					
						
							|  |  |  | attribute of an \code{<A>} tag, use \code{urllib.quote()}. | 
					
						
							|  |  |  | \end{funcdesc} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Caring about security} | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | There's one important rule: if you invoke an external program (e.g. | 
					
						
							|  |  |  | via the \code{os.system()} or \code{os.popen()} functions), make very sure you don't | 
					
						
							|  |  |  | pass arbitrary strings received from the client to the shell.  This is | 
					
						
							|  |  |  | a well-known security hole whereby clever hackers anywhere on the web | 
					
						
							|  |  |  | can exploit a gullible CGI script to invoke arbitrary shell commands. | 
					
						
							|  |  |  | Even parts of the URL or field names cannot be trusted, since the | 
					
						
							|  |  |  | request doesn't have to come from your form! | 
					
						
							| 
									
										
										
										
											1995-02-27 17:53:25 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | To be on the safe side, if you must pass a string gotten from a form | 
					
						
							|  |  |  | to a shell command, you should make sure the string contains only | 
					
						
							|  |  |  | alphanumeric characters, dashes, underscores, and periods. | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \subsection{Installing your CGI script on a Unix system} | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | Read the documentation for your HTTP server and check with your local | 
					
						
							|  |  |  | system administrator to find the directory where CGI scripts should be | 
					
						
							|  |  |  | installed; usually this is in a directory \code{cgi-bin} in the server tree. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Make sure that your script is readable and executable by ``others''; the | 
					
						
							|  |  |  | Unix file mode should be 755 (use \code{chmod 755 filename}).  Make sure | 
					
						
							|  |  |  | that the first line of the script contains \code{\#!} starting in column 1 | 
					
						
							|  |  |  | followed by the pathname of the Python interpreter, for instance: | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     #!/usr/local/bin/python | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | Make sure the Python interpreter exists and is executable by ``others''. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Make sure that any files your script needs to read or write are | 
					
						
							|  |  |  | readable or writable, respectively, by ``others'' -- their mode should | 
					
						
							|  |  |  | be 644 for readable and 666 for writable.  This is because, for | 
					
						
							|  |  |  | security reasons, the HTTP server executes your script as user | 
					
						
							|  |  |  | ``nobody'', without any special privileges.  It can only read (write, | 
					
						
							|  |  |  | execute) files that everybody can read (write, execute).  The current | 
					
						
							|  |  |  | directory at execution time is also different (it is usually the | 
					
						
							|  |  |  | server's cgi-bin directory) and the set of environment variables is | 
					
						
							|  |  |  | also different from what you get at login.  in particular, don't count | 
					
						
							|  |  |  | on the shell's search path for executables (\code{\$PATH}) or the Python | 
					
						
							|  |  |  | module search path (\code{\$PYTHONPATH}) to be set to anything interesting. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | If you need to load modules from a directory which is not on Python's | 
					
						
							|  |  |  | default module search path, you can change the path in your script, | 
					
						
							|  |  |  | before importing other modules, e.g.: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     import sys | 
					
						
							|  |  |  |     sys.path.insert(0, "/usr/home/joe/lib/python") | 
					
						
							|  |  |  |     sys.path.insert(0, "/usr/local/lib/python") | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | (This way, the directory inserted last will be searched first!) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Instructions for non-Unix systems will vary; check your HTTP server's | 
					
						
							|  |  |  | documentation (it will usually have a section on CGI scripts). | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \subsection{Testing your CGI script} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Unfortunately, a CGI script will generally not run when you try it | 
					
						
							|  |  |  | from the command line, and a script that works perfectly from the | 
					
						
							|  |  |  | command line may fail mysteriously when run from the server.  There's | 
					
						
							|  |  |  | one reason why you should still test your script from the command | 
					
						
							|  |  |  | line: if it contains a syntax error, the python interpreter won't | 
					
						
							|  |  |  | execute it at all, and the HTTP server will most likely send a cryptic | 
					
						
							|  |  |  | error to the client. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Assuming your script has no syntax errors, yet it does not work, you | 
					
						
							|  |  |  | have no choice but to read the next section: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Debugging CGI scripts} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | First of all, check for trivial installation errors -- reading the | 
					
						
							|  |  |  | section above on installing your CGI script carefully can save you a | 
					
						
							|  |  |  | lot of time.  If you wonder whether you have understood the | 
					
						
							|  |  |  | installation procedure correctly, try installing a copy of this module | 
					
						
							|  |  |  | file (\code{cgi.py}) as a CGI script.  When invoked as a script, the file | 
					
						
							|  |  |  | will dump its environment and the contents of the form in HTML form. | 
					
						
							|  |  |  | Give it the right mode etc, and send it a request.  If it's installed | 
					
						
							|  |  |  | in the standard \code{cgi-bin} directory, it should be possible to send it a | 
					
						
							|  |  |  | request by entering a URL into your browser of the form: | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | If this gives an error of type 404, the server cannot find the script | 
					
						
							|  |  |  | -- perhaps you need to install it in a different directory.  If it | 
					
						
							|  |  |  | gives another error (e.g.  500), there's an installation problem that | 
					
						
							|  |  |  | you should fix before trying to go any further.  If you get a nicely | 
					
						
							|  |  |  | formatted listing of the environment and form content (in this | 
					
						
							|  |  |  | example, the fields should be listed as ``addr'' with value ``At Home'' | 
					
						
							|  |  |  | and ``name'' with value ``Joe Blow''), the \code{cgi.py} script has been | 
					
						
							|  |  |  | installed correctly.  If you follow the same procedure for your own | 
					
						
							|  |  |  | script, you should now be able to debug it. | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | The next step could be to call the \code{cgi} module's test() function from | 
					
						
							|  |  |  | your script: replace its main code with the single statement | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     cgi.test() | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | This should produce the same results as those gotten from installing | 
					
						
							|  |  |  | the \code{cgi.py} file itself. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | When an ordinary Python script raises an unhandled exception | 
					
						
							|  |  |  | (e.g. because of a typo in a module name, a file that can't be opened, | 
					
						
							|  |  |  | etc.), the Python interpreter prints a nice traceback and exits. | 
					
						
							|  |  |  | While the Python interpreter will still do this when your CGI script | 
					
						
							|  |  |  | raises an exception, most likely the traceback will end up in one of | 
					
						
							|  |  |  | the HTTP server's log file, or be discarded altogether. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Fortunately, once you have managed to get your script to execute | 
					
						
							|  |  |  | *some* code, it is easy to catch exceptions and cause a traceback to | 
					
						
							|  |  |  | be printed.  The \code{test()} function below in this module is an example. | 
					
						
							|  |  |  | Here are the rules: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{enumerate} | 
					
						
							|  |  |  | 	\item Import the traceback module (before entering the | 
					
						
							|  |  |  | 	   try-except!) | 
					
						
							|  |  |  | 	 | 
					
						
							|  |  |  | 	\item Make sure you finish printing the headers and the blank | 
					
						
							|  |  |  | 	   line early | 
					
						
							|  |  |  | 	 | 
					
						
							|  |  |  | 	\item Assign \code{sys.stderr} to \code{sys.stdout} | 
					
						
							|  |  |  | 	 | 
					
						
							|  |  |  | 	\item Wrap all remaining code in a try-except statement | 
					
						
							|  |  |  | 	 | 
					
						
							|  |  |  | 	\item In the except clause, call \code{traceback.print_exc()} | 
					
						
							|  |  |  | \end{enumerate} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | For example: | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     import sys | 
					
						
							|  |  |  |     import traceback | 
					
						
							|  |  |  |     print "Content-type: text/html" | 
					
						
							|  |  |  |     print | 
					
						
							|  |  |  |     sys.stderr = sys.stdout | 
					
						
							|  |  |  |     try: | 
					
						
							|  |  |  |         ...your code here... | 
					
						
							|  |  |  |     except: | 
					
						
							|  |  |  |         print "\n\n<PRE>" | 
					
						
							|  |  |  |         traceback.print_exc() | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \end{verbatim} | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | Notes: The assignment to \code{sys.stderr} is needed because the traceback | 
					
						
							|  |  |  | prints to \code{sys.stderr}.  The \code{print "$\backslash$n$\backslash$n<PRE>"} statement is necessary to | 
					
						
							|  |  |  | disable the word wrapping in HTML. | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | If you suspect that there may be a problem in importing the traceback | 
					
						
							|  |  |  | module, you can use an even more robust approach (which only uses | 
					
						
							|  |  |  | built-in modules): | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											1997-02-28 16:37:49 +00:00
										 |  |  |     import sys | 
					
						
							|  |  |  |     sys.stderr = sys.stdout | 
					
						
							|  |  |  |     print "Content-type: text/plain" | 
					
						
							|  |  |  |     print | 
					
						
							|  |  |  |     ...your code here... | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | This relies on the Python interpreter to print the traceback.  The | 
					
						
							|  |  |  | content type of the output is set to plain text, which disables all | 
					
						
							|  |  |  | HTML processing.  If your script works, the raw HTML will be displayed | 
					
						
							|  |  |  | by your client.  If it raises an exception, most likely after the | 
					
						
							|  |  |  | first two lines have been printed, a traceback will be displayed. | 
					
						
							|  |  |  | Because no HTML interpretation is going on, the traceback will | 
					
						
							|  |  |  | readable. | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \subsection{Common problems and solutions} | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | 
 | 
					
						
							|  |  |  | \begin{itemize} | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
										 |  |  | \item Most HTTP servers buffer the output from CGI scripts until the | 
					
						
							|  |  |  | script is completed.  This means that it is not possible to display a | 
					
						
							|  |  |  | progress report on the client's display while the script is running. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item Check the installation instructions above. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item Check the HTTP server's log files.  (\code{tail -f logfile} in a separate | 
					
						
							|  |  |  | window may be useful!) | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item Always check a script for syntax errors first, by doing something | 
					
						
							|  |  |  | like \code{python script.py}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item When using any of the debugging techniques, don't forget to add | 
					
						
							|  |  |  | \code{import sys} to the top of the script. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item When invoking external programs, make sure they can be found. | 
					
						
							|  |  |  | Usually, this means using absolute path names -- \code{\$PATH} is usually not | 
					
						
							|  |  |  | set to a very useful value in a CGI script. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item When reading or writing external files, make sure they can be read | 
					
						
							|  |  |  | or written by every user on the system. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \item Don't try to give a CGI script a set-uid mode.  This doesn't work on | 
					
						
							|  |  |  | most systems, and is a security liability as well. | 
					
						
							| 
									
										
										
										
											1995-03-17 16:07:09 +00:00
										 |  |  | \end{itemize} | 
					
						
							|  |  |  | 
 |