\section{\module{cgi} ---
         Common Gateway Interface support.}
\declaremodule{standard}{cgi}

\modulesynopsis{Common Gateway Interface support, used to interpret
forms in server-side scripts.}

\indexii{WWW}{server}
\indexii{CGI}{protocol}
\indexii{HTTP}{protocol}
\indexii{MIME}{headers}
\index{URL}


Support module for CGI (Common Gateway Interface) scripts.%
\index{Common Gateway Interface}

This module defines a number of utilities for use by CGI scripts
written in Python.
					
						
\subsection{Introduction}
\nodename{cgi-intro}

A CGI script is invoked by an HTTP server, usually to process user
input submitted through an HTML \code{<FORM>} or \code{<ISINDEX>} element.
					
						
Most often, CGI scripts live in the server's special \file{cgi-bin}
directory.  The HTTP server places information about the request
(such as the client's hostname, the requested URL, the query string,
and many other details) in the script's shell environment,
executes the script, and sends the script's output back to the client.
					
						
The script's input is connected to the client too, and sometimes the
form data is read this way; at other times the form data is passed via
the ``query string'' part of the URL.  This module is intended
to take care of the different cases and provide a simpler interface to
the Python script.  It also provides a number of utilities that help
in debugging scripts, and the latest addition is support for file
uploads from a form (if your browser supports it; Grail 0.3 and
Netscape 2.0 do).
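
The two ways of receiving form data can be told apart from the
environment.  The following is a minimal sketch, not the module's
actual implementation.  It is written in Python 3 syntax, unlike the
older examples in this section, and it assumes the standard CGI
variables \code{REQUEST_METHOD}, \code{QUERY_STRING} and
\code{CONTENT_LENGTH}; the helper name \code{raw_form_data} is
hypothetical:

```python
import io


def raw_form_data(environ, stdin):
    """Return the undecoded form data for one CGI request.

    environ is a mapping such as os.environ; stdin is a binary
    file-like object such as sys.stdin.buffer.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the server passes the request body on standard input
        # and announces its size in CONTENT_LENGTH.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length).decode("ascii")
    # Otherwise the data is the query string part of the URL.
    return environ.get("QUERY_STRING", "")


# Simulated requests; a real script would pass os.environ and
# sys.stdin.buffer instead of these stand-ins.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Guido"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "9"}
from_get = raw_form_data(get_env, io.BytesIO(b""))
from_post = raw_form_data(post_env, io.BytesIO(b"addr=home&extra"))
```

Reading exactly \code{CONTENT_LENGTH} bytes, rather than reading until
end of file, is a deliberate choice in this sketch: the server is not
required to close the script's input after the request body.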
					
						
The output of a CGI script should consist of two sections, separated
by a blank line.  The first section contains a number of headers,
telling the client what kind of data is following.  Python code to
generate a minimal header section looks like this:

\begin{verbatim}
print "Content-type: text/html"     # HTML is following
print                               # blank line, end of headers
\end{verbatim}
					
						
The second section is usually HTML, which allows the client software
to display nicely formatted text with headings, in-line images, etc.
Here's Python code that prints a simple piece of HTML:

\begin{verbatim}
print "<TITLE>CGI script output</TITLE>"
print "<H1>This is my first CGI script</H1>"
print "Hello, world!"
\end{verbatim}
					
						
(It may not be fully legal HTML according to the letter of the
standard, but any browser will understand it.)
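
The two sections can also be assembled into one string before anything
is written, which makes the header/blank line/body layout easy to
check.  This is only a sketch in Python 3 syntax; the function name
\code{build_response} is hypothetical and not part of the \module{cgi}
module:

```python
def build_response(title, body_html):
    """Return the complete output of a CGI script as one string."""
    headers = ["Content-type: text/html"]
    html = [
        "<TITLE>%s</TITLE>" % title,
        "<H1>%s</H1>" % title,
        body_html,
    ]
    # Header section, one blank line, then the HTML section.
    return "\n".join(headers) + "\n\n" + "\n".join(html) + "\n"


response = build_response("CGI script output", "Hello, world!")
```

A script would then write \code{response} to standard output, for
example with \code{sys.stdout.write(response)}.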
					
						
\subsection{Using the cgi module}
\nodename{Using the cgi module}

Begin by writing \samp{import cgi}.  Do not use \samp{from cgi import
*}: the module defines all sorts of names for its own use or for
backward compatibility that you don't want in your namespace.
					
						
It's best to use the \class{FieldStorage} class; the other classes
defined in this module are provided mostly for backward compatibility.
Instantiate \class{FieldStorage} without arguments.  This reads the form
contents from standard input or the environment (depending on the
value of various environment variables set according to the CGI
standard).  Since this may consume standard input, the class should be
instantiated only once.
					
						
The \class{FieldStorage} instance can be accessed as if it were a Python
dictionary.  For instance, the following code (which assumes that the
\code{content-type} header and blank line have already been printed)
checks that the fields \code{name} and \code{addr} are both set to a
non-empty string:

\begin{verbatim}
form = cgi.FieldStorage()
form_ok = 0
if form.has_key("name") and form.has_key("addr"):
    if form["name"].value != "" and form["addr"].value != "":
        form_ok = 1
if not form_ok:
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
    return    # assumes this code is inside a function
# ...further form processing here...
\end{verbatim}
					
						
Here the fields, accessed through \samp{form[\var{key}]}, are
themselves instances of \class{FieldStorage} (or
\class{MiniFieldStorage}, depending on the form encoding).
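
When more than two fields are required, the same check can be written
as a loop.  The sketch below uses Python 3 syntax and works on any
mapping whose items have a \code{value} attribute; the \code{Field}
class is a hypothetical stand-in for the field objects described above,
and \code{missing_fields} is not part of the \module{cgi} module:

```python
class Field:
    """Hypothetical stand-in for a form field with a value attribute."""

    def __init__(self, value):
        self.value = value


def missing_fields(form, required):
    """Return the names in required that are absent or empty in form."""
    missing = []
    for name in required:
        if name not in form or form[name].value == "":
            missing.append(name)
    return missing


# A plain dictionary plays the role of the form object here.
form = {"name": Field("Guido"), "addr": Field("")}
missing = missing_fields(form, ["name", "addr", "phone"])
```

Returning the list of missing names, rather than a single flag, lets
the error page tell the user which fields to fill in.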
					
						
If the submitted form data contains more than one field with the same
name, the object retrieved by \samp{form[\var{key}]} is not a
\class{FieldStorage} or \class{MiniFieldStorage}
instance but a list of such instances.  If you expect this possibility
(i.e., when your HTML form contains multiple fields with the same
name), use the \function{type()} function to determine whether you
have a single instance or a list of instances.  For example, here's
code that concatenates any number of username fields, separated by
commas:

\begin{verbatim}
username = form["username"]
if type(username) is type([]):
    # Multiple username fields specified
    usernames = ""
    for item in username:
        if usernames:
            # Next item -- insert comma
            usernames = usernames + "," + item.value
        else:
            # First item -- don't insert comma
            usernames = item.value
else:
    # Single username field specified
    usernames = username.value
\end{verbatim}
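
The single-instance and list cases can also be folded into one helper
that always returns a list, after which the values can be joined in one
step.  This is a sketch in Python 3 syntax under the same description
as above; \code{Field} is a hypothetical stand-in for a field object
and \code{field_values} is not part of the \module{cgi} module:

```python
class Field:
    """Hypothetical stand-in for a form field with a value attribute."""

    def __init__(self, value):
        self.value = value


def field_values(item):
    """Return the values of a form item as a list.

    item is either one field object or a list of field objects.
    """
    if isinstance(item, list):
        return [field.value for field in item]
    return [item.value]


single = Field("guido")
several = [Field("guido"), Field("barry"), Field("fred")]
joined_single = ",".join(field_values(single))
joined_several = ",".join(field_values(several))
```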
					
						
If a field represents an uploaded file, the \member{value} attribute
reads the entire file in memory as a string.  This may not be what you
want.  You can test for an uploaded file by testing either the
\member{filename} attribute or the \member{file} attribute.  You can
then read the data at leisure from the \member{file} attribute:

\begin{verbatim}
fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1
\end{verbatim}
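
The same idea applies when an upload has to be saved rather than
counted: reading fixed-size chunks keeps memory use bounded regardless
of the file's size.  The following is a sketch in Python 3 syntax; it
assumes only that the upload is a file-like object opened in binary
mode, and the name \code{copy_upload} and the chunk size are
illustrative choices:

```python
import io


def copy_upload(source, target, chunk_size=8192):
    """Copy source to target in chunks; return the number of bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read means end of file.
            break
        target.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for the uploaded file and the destination file.
upload = io.BytesIO(b"first line\nsecond line\n")
saved = io.BytesIO()
copied = copy_upload(upload, saved, chunk_size=8)
```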
					
						
The file upload draft standard entertains the possibility of uploading
multiple files from one field (using a recursive
\mimetype{multipart/*} encoding).  When this occurs, the item will be
a dictionary-like \class{FieldStorage} item.  This can be determined
by testing its \member{type} attribute, which should be
\mimetype{multipart/form-data} (or perhaps another MIME type matching
\mimetype{multipart/*}).  In this case, it can be iterated over
recursively just like the top-level form object.
					
						
When a form is submitted in the ``old'' format (as the query string or
as a single data part of type
\mimetype{application/x-www-form-urlencoded}), the items will actually
be instances of the class \class{MiniFieldStorage}.  In this case, the
\member{list}, \member{file} and \member{filename} attributes are
always \code{None}.
					
						
\subsection{Old classes}

These classes, present in earlier versions of the \module{cgi} module,
are still supported for backward compatibility.  New applications
should use the \class{FieldStorage} class.

\class{SvFormContentDict} stores single-value form content as a
dictionary; it assumes each field name occurs in the form only once.

\class{FormContentDict} stores multiple-value form content as a
dictionary (the form items are lists of values).  This is useful if
your form contains multiple fields with the same name.

Other classes (\class{FormContent}, \class{InterpFormContentDict}) are
present for backward compatibility with really old applications only.
If you still use these and would be inconvenienced if they
disappeared from a future version of this module, drop me a note.
					
						
\subsection{Functions}
\nodename{Functions in cgi module}

These are useful if you want more control, or if you want to employ
some of the algorithms implemented in this module in other
circumstances.
					
						
\begin{funcdesc}{parse}{fp}
Parse a query in the environment or from a file (default
\code{sys.stdin}).
\end{funcdesc}
					
						
\begin{funcdesc}{parse_qs}{qs}
Parse a query string given as a string argument (data of type
\mimetype{application/x-www-form-urlencoded}).
\end{funcdesc}
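
To illustrate the kind of result to expect, the sketch below parses a
small query string.  It is written for Python 3 and assumes the
standard library function \code{urllib.parse.parse_qs} as a stand-in,
since recent Python releases no longer ship the \module{cgi} module;
the query string itself is made up for the example:

```python
from urllib.parse import parse_qs

# One field given once, one field repeated, one value with encoded characters.
query = "name=Guido&lang=Python&lang=C&note=hello+world%21"
fields = parse_qs(query)

# Every value is a list, even for fields that occur only once.
first_name = fields["name"][0]
languages = fields["lang"]
```

Because every value is a list, code that expects exactly one value has
to index it explicitly, as done for \code{first_name} above.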
					
						
\begin{funcdesc}{parse_multipart}{fp, pdict}
Parse input of type \mimetype{multipart/form-data} (for
file uploads).  Arguments are \var{fp} for the input file and
\var{pdict} for the dictionary containing other parameters of
the \code{content-type} header.

Returns a dictionary just like \function{parse_qs()}: keys are the
field names, each value is a list of values for that field.  This is
easy to use but not much good if you are expecting megabytes to be
uploaded; in that case, use the \class{FieldStorage} class instead,
which is much more flexible.  Note that \code{content-type} is the
raw, unparsed contents of the \code{content-type} header.

Note that this does not parse nested multipart parts; use
\class{FieldStorage} for that.
\end{funcdesc}
					
						
\begin{funcdesc}{parse_header}{string}
Parse a header like \code{content-type} into a main
content-type and a dictionary of parameters.
\end{funcdesc}
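
The split into a main value and a parameter dictionary can be sketched
as follows.  This is a simplified illustration in Python 3 syntax, not
the module's code: the name \code{split_header} is hypothetical, and
the sketch does not handle semicolons inside quoted parameter values:

```python
def split_header(line):
    """Split a header value into (main value, parameter dictionary)."""
    parts = [part.strip() for part in line.split(";")]
    main_value = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" not in part:
            # Skip anything that is not a name=value parameter.
            continue
        name, value = part.split("=", 1)
        value = value.strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        params[name.strip().lower()] = value
    return main_value, params


main_value, params = split_header(
    'multipart/form-data; boundary="xyz"; charset=us-ascii')
```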
					
						
\begin{funcdesc}{test}{}
Robust test CGI script, usable as main program.
Writes minimal HTTP headers and formats all information provided to
the script in HTML form.
\end{funcdesc}

\begin{funcdesc}{print_environ}{}
Format the shell environment in HTML.
\end{funcdesc}

\begin{funcdesc}{print_form}{form}
Format a form in HTML.
\end{funcdesc}

\begin{funcdesc}{print_directory}{}
Format the current directory in HTML.
\end{funcdesc}

\begin{funcdesc}{print_environ_usage}{}
Print a list of useful (used by CGI) environment variables in
HTML.
\end{funcdesc}
					
						
\begin{funcdesc}{escape}{s\optional{, quote}}
Convert the characters
\character{\&}, \character{<} and \character{>} in string \var{s} to
HTML-safe sequences.  Use this if you need to display text that might
contain such characters in HTML.  If the optional flag \var{quote} is
true, the double quote character (\character{"}) is also translated;
this helps for inclusion in an HTML attribute value, e.g. in \code{<A
HREF="...">}.
\end{funcdesc}
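
The conversion described above amounts to a few string replacements.
The sketch below, in Python 3 syntax, is an illustration of that
behaviour rather than the module's own code; the name
\code{escape_html} is hypothetical:

```python
def escape_html(s, quote=False):
    """Replace &, < and > (and optionally ") by HTML entities."""
    # The ampersand must be replaced first, otherwise the ampersands
    # introduced by the other replacements would be escaped again.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


text = escape_html('a < b & "c" > d')
attr = escape_html('a < b & "c" > d', quote=True)
```

The first result is suitable for running text; the second, with the
double quotes translated as well, is suitable inside a double-quoted
attribute value.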
					
						
\subsection{Caring about security}

There's one important rule: if you invoke an external program (e.g.
via the \function{os.system()} or \function{os.popen()} functions),
make very sure you don't pass arbitrary strings received from the
client to the shell.  This is a well-known security hole whereby
clever hackers anywhere on the web can exploit a gullible CGI script
to invoke arbitrary shell commands.  Even parts of the URL or field
names cannot be trusted, since the request doesn't have to come from
your form!

To be on the safe side, if you must pass a string obtained from a form
to a shell command, you should make sure the string contains only
alphanumeric characters, dashes, underscores, and periods.
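
The character rule above can be enforced with a whitelist check before
any string reaches a shell command.  This is a minimal sketch in
Python 3 syntax; the name \code{is_safe_argument} is hypothetical, and
``alphanumeric'' is taken here to mean ASCII letters and digits, which
is an assumption:

```python
import re

# ASCII letters, digits, underscore, period and dash; nothing else.
_SAFE_ARGUMENT = re.compile(r"[A-Za-z0-9_.-]+")


def is_safe_argument(value):
    """Return True if value consists only of whitelisted characters."""
    return _SAFE_ARGUMENT.fullmatch(value) is not None


accepted = is_safe_argument("report-1998_v2.txt")
rejected = is_safe_argument("x; rm -rf /")
```

The check rejects the empty string as well.  A value that begins with a
dash passes the check but may still be read as an option by the program
being invoked, so such values deserve separate handling.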
					
						
\subsection{Installing your CGI script on a Unix system}

Read the documentation for your HTTP server and check with your local
system administrator to find the directory where CGI scripts should be
installed; usually this is in a directory \file{cgi-bin} in the server tree.

Make sure that your script is readable and executable by ``others''; the
\UNIX{} file mode should be \code{0755} octal (use \samp{chmod 0755
\var{filename}}).  Make sure that the first line of the script contains
\code{\#!} starting in column 1 followed by the pathname of the Python
interpreter, for instance:

\begin{verbatim}
#!/usr/local/bin/python
\end{verbatim}
					
						
							| 
									
										
										
										
											1998-03-12 06:52:05 +00:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											1996-07-30 18:22:07 +00:00
Make sure the Python interpreter exists and is executable by ``others''.

Make sure that any files your script needs to read or write are
readable or writable, respectively, by ``others''; their mode
should be \code{0644} for readable and \code{0666} for writable.  This
is because, for security reasons, the HTTP server executes your script
as user ``nobody'', without any special privileges.  It can only read
(write, execute) files that everybody can read (write, execute).  The
current directory at execution time is also different (it is usually
the server's \file{cgi-bin} directory) and the set of environment variables
is also different from what you get at login.  In particular, don't
count on the shell's search path for executables (\envvar{PATH}) or
the Python module search path (\envvar{PYTHONPATH}) to be set to
anything interesting.

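The permission requirements above can also be checked from Python.  The
following is a minimal sketch, assuming a \UNIX{} system and a modern
Python~3 interpreter; the helper name \code{others\_can} is hypothetical
and is not part of the \module{cgi} module:

```python
import os
import stat


def others_can(path, read=False, write=False, execute=False):
    """Return True if the "others" permission bits on path allow
    every kind of access that was requested."""
    mode = os.stat(path).st_mode
    wanted = 0
    if read:
        wanted |= stat.S_IROTH      # world-readable bit (0004)
    if write:
        wanted |= stat.S_IWOTH      # world-writable bit (0002)
    if execute:
        wanted |= stat.S_IXOTH      # world-executable bit (0001)
    return (mode & wanted) == wanted
```

For a script installed with mode \code{0755},
\code{others\_can(path, read=True, execute=True)} should be true.  The
check only inspects the mode bits of the file itself; it does not verify
that the directories leading to it are searchable.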
					
						
If you need to load modules from a directory which is not on Python's
default module search path, you can change the path in your script,
before importing other modules, e.g.:

\begin{verbatim}
import sys
sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")
\end{verbatim}

(Because each call inserts at the front of the list, the directory
inserted last will be searched first!)

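The ordering effect can be seen on an ordinary list, without touching
\code{sys.path} itself.  This is only an illustrative sketch for a
Python~3 interpreter; the helper \code{prepend\_all} and the
\file{/default} entry are made up for the example:

```python
def prepend_all(search_path, directories):
    """Return a copy of search_path after inserting each directory
    at index 0, one after another."""
    result = list(search_path)
    for directory in directories:
        result.insert(0, directory)
    return result


path = prepend_all(
    ["/default"],
    ["/usr/home/joe/lib/python", "/usr/local/lib/python"],
)
print(path)
# ['/usr/local/lib/python', '/usr/home/joe/lib/python', '/default']
```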
					
						
Instructions for non-\UNIX{} systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


\subsection{Testing your CGI script}

Unfortunately, a CGI script will generally not run when you try it
from the command line, and a script that works perfectly from the
command line may fail mysteriously when run from the server.  There's
one reason why you should still test your script from the command
line: if it contains a syntax error, the Python interpreter won't
execute it at all, and the HTTP server will most likely send a cryptic
error to the client.

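A syntax check does not require running the script at all.  As a rough
sketch, assuming a Python~3 interpreter, the built-in
\function{compile()} function can be used to parse the source without
executing it; the helper name \code{syntax\_error\_in} is hypothetical:

```python
def syntax_error_in(source, filename="<cgi script>"):
    """Return None if source compiles, otherwise a short description
    of the first syntax error.  The code is parsed, never executed."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return "%s, line %s: %s" % (filename, exc.lineno, exc.msg)
    return None
```

Calling it on the text of a script returns \code{None} when the script
parses cleanly.  Errors that only occur at run time, such as a failing
import, are not detected this way.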
					
						
Assuming your script has no syntax errors, yet it does not work, you
have no choice but to read the next section.


\subsection{Debugging CGI scripts}

First of all, check for trivial installation errors; carefully reading
the section above on installing your CGI script can save you a
lot of time.  If you wonder whether you have understood the
installation procedure correctly, try installing a copy of this module
file (\file{cgi.py}) as a CGI script.  When invoked as a script, the file
will dump its environment and the contents of the form as HTML.
Give it the right mode etc., and send it a request.  If it's installed
in the standard \file{cgi-bin} directory, it should be possible to send it a
request by entering a URL into your browser of the form:

\begin{verbatim}
http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
\end{verbatim}

If this gives an error of type 404, the server cannot find the script;
perhaps you need to install it in a different directory.  If it
gives another error (e.g.\ 500), there's an installation problem that
you should fix before trying to go any further.  If you get a nicely
formatted listing of the environment and form content (in this
example, the fields should be listed as ``addr'' with value ``At Home''
and ``name'' with value ``Joe Blow''), the \file{cgi.py} script has been
installed correctly.  If you follow the same procedure for your own
script, you should now be able to debug it.

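To see why the example URL yields those two fields, the query string
can be decoded by hand.  The sketch below assumes a Python~3
interpreter and uses the standard \module{urllib.parse} module rather
than \module{cgi}; it only illustrates the decoding and is not what
\file{cgi.py} itself executes:

```python
from urllib.parse import parse_qs, urlsplit

url = "http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home"

# Everything after the "?" is the query string; "+" decodes to a space.
fields = parse_qs(urlsplit(url).query)

for key in sorted(fields):
    print(key, "=", fields[key][0])
# addr = At Home
# name = Joe Blow
```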
					
						
The next step could be to call the \module{cgi} module's
\function{test()} function from your script: replace its main code
with the single statement

\begin{verbatim}
cgi.test()
\end{verbatim}

This should produce the same results as those obtained by installing
the \file{cgi.py} file itself.

When an ordinary Python script raises an unhandled exception
(e.g.\ because of a typo in a module name, a file that can't be opened,
etc.), the Python interpreter prints a nice traceback and exits.
While the Python interpreter will still do this when your CGI script
raises an exception, most likely the traceback will end up in one of
the HTTP server's log files, or be discarded altogether.

Fortunately, once you have managed to get your script to execute
\emph{some} code, it is easy to catch exceptions and cause a traceback
to be printed.  The \function{test()} function below in this module is
an example.  Here are the rules:

\begin{enumerate}
\item Import the \module{traceback} module before entering the \keyword{try}
   ... \keyword{except} statement

\item Assign \code{sys.stderr} to be \code{sys.stdout}

\item Make sure you finish printing the headers and the blank line
   early

\item Wrap all remaining code in a \keyword{try} ... \keyword{except}
   statement

\item In the \keyword{except} clause, call \function{traceback.print_exc()}
\end{enumerate}

For example:

\begin{verbatim}
import sys
import traceback
print "Content-type: text/html"
print
sys.stderr = sys.stdout
try:
    ...your code here...
except:
    print "\n\n<PRE>"
    traceback.print_exc()
\end{verbatim}

Notes: The assignment to \code{sys.stderr} is needed because the
traceback prints to \code{sys.stderr}.
The \code{print "{\e}n{\e}n<PRE>"} statement is necessary to
disable the word wrapping in HTML.

If you suspect that there may be a problem in importing the
\module{traceback} module, you can use an even more robust approach
(which only uses built-in modules):

\begin{verbatim}
import sys
sys.stderr = sys.stdout
print "Content-type: text/plain"
print
...your code here...
\end{verbatim}

This relies on the Python interpreter to print the traceback.  The
content type of the output is set to plain text, which disables all
HTML processing.  If your script works, the raw HTML will be displayed
by your client.  If it raises an exception, most likely after the
first two lines have been printed, a traceback will be displayed.
Because no HTML interpretation is going on, the traceback will be
readable.


\subsection{Common problems and solutions}

\begin{itemize}
\item Most HTTP servers buffer the output from CGI scripts until the
script is completed.  This means that it is not possible to display a
progress report on the client's display while the script is running.

\item Check the installation instructions above.

\item Check the HTTP server's log files.  (\samp{tail -f logfile} in a
separate window may be useful!)

\item Always check a script for syntax errors first, by doing something
like \samp{python script.py}.

\item When using any of the debugging techniques, don't forget to add
\samp{import sys} to the top of the script.

\item When invoking external programs, make sure they can be found.
This usually means using absolute path names, since \envvar{PATH} is
rarely set to a useful value in a CGI script.

\item When reading or writing external files, make sure they can be read
or written by every user on the system.

\item Don't try to give a CGI script a set-uid mode.  This doesn't work on
most systems, and is a security liability as well.
\end{itemize}

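The advice about external programs in the list above can be enforced in
code.  The following is a minimal sketch, assuming a Python~3
interpreter with the standard \module{subprocess} module; the helper
name \code{run\_absolute} is hypothetical:

```python
import os
import subprocess


def run_absolute(program, arguments):
    """Run program, which must be given as an absolute path, and
    return its standard output decoded as text."""
    if not os.path.isabs(program):
        # A bare name would be looked up via PATH, which a CGI
        # script should not rely on.
        raise ValueError("refusing to rely on PATH for %r" % program)
    completed = subprocess.run(
        [program] + list(arguments),
        stdout=subprocess.PIPE,
        check=True,             # raise if the program exits non-zero
    )
    return completed.stdout.decode()
```

A call such as \code{run\_absolute("/bin/date", [])} would then succeed
only if that exact file exists and is executable by the server's user;
the path is an example and may differ on your system.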
					
						