mirror of
				https://github.com/python/cpython.git
				synced 2025-10-30 21:21:22 +00:00 
			
		
		
		
			svn+ssh://svn.python.org/python/branches/py3k ........ r85274 | georg.brandl | 2010-10-06 12:26:05 +0200 (Mi, 06 Okt 2010) | 1 line Fix errors found by "make suspicious". ........
		
			
				
	
	
		
| :tocdepth: 2
 | |
| 
 | |
| =========================
 | |
| Library and Extension FAQ
 | |
| =========================
 | |
| 
 | |
| .. contents::
 | |
| 
 | |
| General Library Questions
 | |
| =========================
 | |
| 
 | |
| How do I find a module or application to perform task X?
 | |
| --------------------------------------------------------
 | |
| 
 | |
| Check :ref:`the Library Reference <library-index>` to see if there's a relevant
 | |
| standard library module.  (Eventually you'll learn what's in the standard
 | |
| library and will be able to skip this step.)
 | |
| 
 | |
| For third-party packages, search the `Python Package Index
 | |
| <http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
 | |
| another Web search engine.  Searching for "Python" plus a keyword or two for
 | |
| your topic of interest will usually find something helpful.
 | |
| 
 | |
| 
 | |
| Where is the math.py (socket.py, regex.py, etc.) source file?
 | |
| -------------------------------------------------------------
 | |
| 
 | |
| If you can't find a source file for a module it may be a builtin or dynamically
 | |
| loaded module implemented in C, C++ or other compiled language.  In this case
 | |
| you may not have the source file or it may be something like mathmodule.c,
 | |
| somewhere in a C source directory (not on the Python Path).
 | |
| 
 | |
| There are (at least) three kinds of modules in Python:
 | |
| 
 | |
| 1) modules written in Python (.py);
 | |
| 2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
 | |
| 3) modules written in C and linked with the interpreter; to get a list of these,
 | |
|    type::
 | |
| 
 | |
|       import sys
 | |
|       print(sys.builtin_module_names)
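To find out which of these kinds a particular module is, one option is to look at its import spec. The following is a minimal sketch, assuming a Python version that provides :func:`importlib.util.find_spec` (3.4 or later); ``module_kind`` is a hypothetical helper name, not part of the standard library.

```python
import importlib.util
import sys


def module_kind(name):
    """Return a rough description of how the top-level module *name* is implemented."""
    if name in sys.builtin_module_names:
        return "built-in (linked into the interpreter)"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin.endswith(".py"):
        return "Python source: " + origin
    # Anything else is typically a compiled extension (.so, .pyd) or a frozen module.
    return "compiled or frozen: " + origin


for name in ("sys", "json"):
    print(name, "->", module_kind(name))
```

The exact paths printed depend on the installation, so treat the output as informational only.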
 | |
| 
 | |
| 
 | |
| How do I make a Python script executable on Unix?
 | |
| -------------------------------------------------
 | |
| 
 | |
| You need to do two things: the script file's mode must be executable and the
 | |
| first line must begin with ``#!`` followed by the path of the Python
 | |
| interpreter.
 | |
| 
 | |
| The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
 | |
| scriptfile``.
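The same mode change can also be made from Python. This is a small sketch that roughly corresponds to ``chmod a+x``; ``make_executable`` is a hypothetical helper name, and the sketch assumes a Unix-like file system.

```python
import os
import stat


def make_executable(path):
    """Add the execute bits for user, group and others to *path*."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```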
 | |
| 
 | |
| The second can be done in a number of ways.  The most straightforward way is to
 | |
| write ::
 | |
| 
 | |
|   #!/usr/local/bin/python
 | |
| 
 | |
| as the very first line of your file, using the pathname for where the Python
 | |
| interpreter is installed on your platform.
 | |
| 
 | |
| If you would like the script to be independent of where the Python interpreter
 | |
| lives, you can use the "env" program.  Almost all Unix variants support the
 | |
| following, assuming the Python interpreter is in a directory on the user's
 | |
| $PATH::
 | |
| 
 | |
|   #!/usr/bin/env python
 | |
| 
 | |
| *Don't* do this for CGI scripts.  The $PATH variable for CGI scripts is often
 | |
| very minimal, so you need to use the actual absolute pathname of the
 | |
| interpreter.
 | |
| 
 | |
| Occasionally, a user's environment is so full that the /usr/bin/env program
 | |
| fails; or there's no env program at all.  In that case, you can try the
 | |
| following hack (due to Alex Rezinsky)::
 | |
| 
 | |
|    #! /bin/sh
 | |
|    """:"
 | |
|    exec python $0 ${1+"$@"}
 | |
|    """
 | |
| 
 | |
| The minor disadvantage is that this defines the script's ``__doc__`` string.
 | |
| However, you can fix that by adding ::
 | |
| 
 | |
|    __doc__ = """...Whatever..."""
 | |
| 
 | |
| 
 | |
| 
 | |
| Is there a curses/termcap package for Python?
 | |
| ---------------------------------------------
 | |
| 
 | |
| .. XXX curses *is* built by default, isn't it?
 | |
| 
 | |
| For Unix variants: The standard Python source distribution comes with a curses
 | |
| module in the ``Modules/`` subdirectory, though it's not compiled by default
 | |
| (note that this is not available in the Windows distribution -- there is no
 | |
| curses module for Windows).
 | |
| 
 | |
| The curses module supports basic curses features as well as many additional
 | |
| functions from ncurses and SYSV curses such as colour, alternative character set
 | |
| support, pads, and mouse support. This means the module isn't compatible with
 | |
| operating systems that only have BSD curses, but there don't seem to be any
 | |
| currently maintained OSes that fall into this category.
 | |
| 
 | |
| For Windows: use `the consolelib module
 | |
| <http://effbot.org/zone/console-index.htm>`_.
 | |
| 
 | |
| 
 | |
| Is there an equivalent to C's onexit() in Python?
 | |
| -------------------------------------------------
 | |
| 
 | |
| The :mod:`atexit` module provides a :func:`~atexit.register` function that is similar to C's
 | |
| onexit.
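As a brief illustration, the sketch below registers a cleanup function together with an argument. The function name ``goodbye`` and its argument are made up for the example.

```python
import atexit


def goodbye(name):
    print("Cleaning up for", name)


# register() returns the function it was given, so it can also be used as a
# decorator.  The call is made when the interpreter exits normally.
registered = atexit.register(goodbye, "example")
```

Registered functions are not run if the process is killed by an unhandled signal or exits through :func:`os._exit`.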
 | |
| 
 | |
| 
 | |
| Why don't my signal handlers work?
 | |
| ----------------------------------
 | |
| 
 | |
| The most common problem is that the signal handler is declared with the wrong
 | |
| argument list.  It is called as ::
 | |
| 
 | |
|    handler(signum, frame)
 | |
| 
 | |
| so it should be declared with two arguments::
 | |
| 
 | |
|    def handler(signum, frame):
 | |
|        ...
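A complete round trip might look like the following sketch. It assumes Python 3.8 or later for :func:`signal.raise_signal` and that the code runs in the main thread, since handlers can only be installed there.

```python
import signal

received = []


def handler(signum, frame):
    # Record the signal number; *frame* is the stack frame that was interrupted.
    received.append(signum)


previous = signal.signal(signal.SIGINT, handler)  # install the handler
signal.raise_signal(signal.SIGINT)                # deliver SIGINT to this process
signal.signal(signal.SIGINT, previous)            # restore the earlier handler
print(received)
```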
 | |
| 
 | |
| 
 | |
| Common tasks
 | |
| ============
 | |
| 
 | |
| How do I test a Python program or component?
 | |
| --------------------------------------------
 | |
| 
 | |
| Python comes with two testing frameworks.  The :mod:`doctest` module finds
 | |
| examples in the docstrings for a module and runs them, comparing the output with
 | |
| the expected output given in the docstring.
 | |
| 
 | |
| The :mod:`unittest` module is a fancier testing framework modelled on Java and
 | |
| Smalltalk testing frameworks.
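The two frameworks can be combined in one file. In the sketch below, ``average`` and ``AverageTests`` are invented for the example; the docstring example is checked by :mod:`doctest` and the test case is run by :mod:`unittest`.

```python
import doctest
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


class AverageTests(unittest.TestCase):
    def test_mean(self):
        self.assertEqual(average([10, 20]), 15.0)

    def test_empty_sequence(self):
        with self.assertRaises(ZeroDivisionError):
            average([])


# Run the docstring example (prints only on failure), then the test case.
doctest.run_docstring_examples(average, globals())
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
result = unittest.TextTestRunner().run(suite)
```

In a real project the last three lines are usually replaced by running ``python -m unittest`` or ``python -m doctest`` from the command line.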
 | |
| 
 | |
| For testing, it helps to write the program so that it may be easily tested by
 | |
| using good modular design.  Your program should have almost all functionality
 | |
| encapsulated in either functions or class methods -- and this sometimes has the
 | |
| surprising and delightful effect of making the program run faster (because local
 | |
| variable accesses are faster than global accesses).  Furthermore the program
 | |
| should avoid depending on mutating global variables, since this makes testing
 | |
| much more difficult to do.
 | |
| 
 | |
| The "global main logic" of your program may be as simple as ::
 | |
| 
 | |
|    if __name__ == "__main__":
 | |
|        main_logic()
 | |
| 
 | |
| at the bottom of the main module of your program.
 | |
| 
 | |
| Once your program is organized as a tractable collection of functions and class
 | |
| behaviours you should write test functions that exercise the behaviours.  A test
 | |
| suite can be associated with each module which automates a sequence of tests.
 | |
| This sounds like a lot of work, but since Python is so terse and flexible it's
 | |
| surprisingly easy.  You can make coding much more pleasant and fun by writing
 | |
| your test functions in parallel with the "production code", since this makes it
 | |
| easy to find bugs and even design flaws earlier.
 | |
| 
 | |
| "Support modules" that are not intended to be the main module of a program may
 | |
| include a self-test of the module. ::
 | |
| 
 | |
|    if __name__ == "__main__":
 | |
|        self_test()
 | |
| 
 | |
| Even programs that interact with complex external interfaces may be tested when
 | |
| the external interfaces are unavailable by using "fake" interfaces implemented
 | |
| in Python.
 | |
| 
 | |
| 
 | |
| How do I create documentation from doc strings?
 | |
| -----------------------------------------------
 | |
| 
 | |
| The :mod:`pydoc` module can create HTML from the doc strings in your Python
 | |
| source code.  An alternative for creating API documentation purely from
 | |
| docstrings is `epydoc <http://epydoc.sf.net/>`_.  `Sphinx
 | |
| <http://sphinx.pocoo.org>`_ can also include docstring content.
 | |
| 
 | |
| 
 | |
| How do I get a single keypress at a time?
 | |
| -----------------------------------------
 | |
| 
 | |
| For Unix variants: There are several solutions.  It's straightforward to do this
 | |
| using curses, but curses is a fairly large module to learn.
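One possible approach without curses is sketched below. It assumes a POSIX terminal and the :mod:`termios` and :mod:`tty` modules; ``read_key`` is a hypothetical helper name. It blocks until a byte arrives rather than polling.

```python
import os
import sys
import termios
import tty


def read_key(fd=None):
    """Block until one byte is available on the terminal *fd* and return it."""
    if fd is None:
        fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)
    try:
        # cbreak mode: no line buffering and no echo; TCSANOW keeps pending input.
        tty.setcbreak(fd, termios.TCSANOW)
        return os.read(fd, 1)
    finally:
        # Always restore the original terminal settings.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

The function returns a single byte, so keys that produce multi-byte sequences (arrow keys, non-ASCII characters) need several calls.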
 | |
| 
 | |
| .. XXX this doesn't work out of the box, some IO expert needs to check why
 | |
| 
 | |
|    Here's a solution without curses::
 | |
| 
 | |
|    import termios, fcntl, sys, os
 | |
|    fd = sys.stdin.fileno()
 | |
| 
 | |
|    oldterm = termios.tcgetattr(fd)
 | |
|    newattr = termios.tcgetattr(fd)
 | |
|    newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
 | |
|    termios.tcsetattr(fd, termios.TCSANOW, newattr)
 | |
| 
 | |
|    oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
 | |
|    fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)
 | |
| 
 | |
|    try:
 | |
|        while True:
 | |
|            try:
 | |
|                c = sys.stdin.read(1)
 | |
|                print("Got character", repr(c))
 | |
|            except IOError:
 | |
|                pass
 | |
|    finally:
 | |
|        termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
 | |
|        fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)
 | |
| 
 | |
|    You need the :mod:`termios` and :mod:`fcntl` modules for any of this to
 | |
|    work, and I've only tried it on Linux, though it should work elsewhere.  In
 | |
|    this code, characters are read and printed one at a time.
 | |
| 
 | |
|    :func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
 | |
|    mode.  :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
 | |
|    and modify them for non-blocking mode.  Since reading stdin when it is empty
 | |
|    results in an :exc:`IOError`, this error is caught and ignored.
 | |
| 
 | |
| 
 | |
| Threads
 | |
| =======
 | |
| 
 | |
| How do I program using threads?
 | |
| -------------------------------
 | |
| 
 | |
| Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
 | |
| The :mod:`threading` module builds convenient abstractions on top of the
 | |
| low-level primitives provided by the :mod:`_thread` module.
 | |
| 
 | |
| Aahz has a set of slides from his threading tutorial that are helpful; see
 | |
| http://www.pythoncraft.com/OSCON2001/.
 | |
| 
 | |
| 
 | |
| None of my threads seem to run: why?
 | |
| ------------------------------------
 | |
| 
 | |
| As soon as the main thread exits, all threads are killed.  Your main thread is
 | |
| running too quickly, giving the threads no time to do any work.
 | |
| 
 | |
| A simple fix is to add a sleep to the end of the program that's long enough for
 | |
| all the threads to finish::
 | |
| 
 | |
|    import threading, time
 | |
| 
 | |
|    def thread_task(name, n):
 | |
|        for i in range(n): print(name, i)
 | |
| 
 | |
|    for i in range(10):
 | |
|        T = threading.Thread(target=thread_task, args=(str(i), i))
 | |
|        T.start()
 | |
| 
 | |
|    time.sleep(10)  # <---------------------------!
 | |
| 
 | |
| But now (on many platforms) the threads don't run in parallel, but appear to run
 | |
| sequentially, one at a time!  The reason is that the OS thread scheduler doesn't
 | |
| start a new thread until the previous thread is blocked.
 | |
| 
 | |
| A simple fix is to add a tiny sleep to the start of the run function::
 | |
| 
 | |
|    def thread_task(name, n):
 | |
|        time.sleep(0.001)  # <--------------------!
 | |
|        for i in range(n): print(name, i)
 | |
| 
 | |
|    for i in range(10):
 | |
|        T = threading.Thread(target=thread_task, args=(str(i), i))
 | |
|        T.start()
 | |
| 
 | |
|    time.sleep(10)
 | |
| 
 | |
| Instead of trying to guess how long a :func:`time.sleep` delay needs to be,
 | |
| it's better to use some kind of semaphore mechanism.  One idea is to use the
 | |
| :mod:`queue` module to create a queue object, let each thread append a token to
 | |
| the queue when it finishes, and let the main thread read as many tokens from the
 | |
| queue as there are threads.
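The token idea can be sketched as follows. The ``done`` queue and the extra ``done`` parameter are additions for this example and not part of the snippets above.

```python
import queue
import threading


def thread_task(name, n, done):
    for i in range(n):
        print(name, i)
    done.put(name)  # token: tells the main thread this worker has finished


done = queue.Queue()
threads = [threading.Thread(target=thread_task, args=(str(i), i, done))
           for i in range(10)]
for t in threads:
    t.start()

# Block until one token per thread has arrived; no sleep() guessing needed.
finished = [done.get() for _ in threads]
print("all threads finished:", sorted(finished))
```

Calling ``join()`` on each thread object is another way to wait for completion when no result needs to be passed back.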
 | |
| 
 | |
| 
 | |
| How do I parcel out work among a bunch of worker threads?
 | |
| ---------------------------------------------------------
 | |
| 
 | |
| Use the :mod:`queue` module to create a queue containing a list of jobs.  The
 | |
| :class:`~queue.Queue` class maintains a list of objects with ``.put(obj)`` to
 | |
| add an item to the queue and ``.get()`` to return an item.  The class will take
 | |
| care of the locking necessary to ensure that each job is handed out exactly
 | |
| once.
 | |
| 
 | |
| Here's a trivial example::
 | |
| 
 | |
|    import threading, queue, time
 | |
| 
 | |
|    # The worker thread gets jobs off the queue.  When the queue is empty, it
 | |
|    # assumes there will be no more work and exits.
 | |
|    # (Realistically workers will run until terminated.)
 | |
|    def worker():
 | |
|        print('Running worker')
 | |
|        time.sleep(0.1)
 | |
|        while True:
 | |
|            try:
 | |
|                arg = q.get(block=False)
 | |
|            except queue.Empty:
 | |
|                print('Worker', threading.current_thread(), end=' ')
 | |
|                print('queue empty')
 | |
|                break
 | |
|            else:
 | |
|                print('Worker', threading.current_thread(), end=' ')
 | |
|                print('running with argument', arg)
 | |
|                time.sleep(0.5)
 | |
| 
 | |
|    # Create queue
 | |
|    q = queue.Queue()
 | |
| 
 | |
|    # Start a pool of 5 workers
 | |
|    for i in range(5):
 | |
|        t = threading.Thread(target=worker, name='worker %i' % (i+1))
 | |
|        t.start()
 | |
| 
 | |
|    # Begin adding work to the queue
 | |
|    for i in range(50):
 | |
|        q.put(i)
 | |
| 
 | |
|    # Give threads time to run
 | |
|    print('Main thread sleeping')
 | |
|    time.sleep(5)
 | |
| 
 | |
| When run, this will produce the following output::
 | |
| 
 | |
|    Running worker
 | |
|    Running worker
 | |
|    Running worker
 | |
|    Running worker
 | |
|    Running worker
 | |
|    Main thread sleeping
 | |
|    Worker <Thread(worker 1, started 130283832797456)> running with argument 0
 | |
|    Worker <Thread(worker 2, started 130283824404752)> running with argument 1
 | |
|    Worker <Thread(worker 3, started 130283816012048)> running with argument 2
 | |
|    Worker <Thread(worker 4, started 130283807619344)> running with argument 3
 | |
|    Worker <Thread(worker 5, started 130283799226640)> running with argument 4
 | |
|    Worker <Thread(worker 1, started 130283832797456)> running with argument 5
 | |
|    ...
 | |
| 
 | |
| Consult the module's documentation for more details; the ``Queue`` class
 | |
| provides a featureful interface.
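As one illustration of that interface, the sketch below replaces the final sleep with ``task_done()`` and ``join()``. The ``STOP`` sentinel is a convention chosen for this example, not a feature of the module.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit (a convention, not a queue feature)


def worker(jobs, results):
    while True:
        job = jobs.get()          # blocks until a job is available
        try:
            if job is STOP:
                return
            results.put(job * job)
        finally:
            jobs.task_done()      # lets jobs.join() know this item is handled


jobs, results = queue.Queue(), queue.Queue()
pool = [threading.Thread(target=worker, args=(jobs, results)) for _ in range(5)]
for t in pool:
    t.start()
for i in range(50):
    jobs.put(i)
for _ in pool:
    jobs.put(STOP)                # one sentinel per worker
jobs.join()                       # wait until every queued item was processed
for t in pool:
    t.join()
squares = sorted(results.get() for _ in range(50))
```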
 | |
| 
 | |
| 
 | |
| What kinds of global value mutation are thread-safe?
 | |
| ----------------------------------------------------
 | |
| 
 | |
| A global interpreter lock (GIL) is used internally to ensure that only one
 | |
| thread runs in the Python VM at a time.  In general, Python offers to switch
 | |
| among threads only between bytecode instructions; how frequently it switches can
 | |
| be set via :func:`sys.setswitchinterval`.  Each bytecode instruction, and
 | |
| therefore all the C implementation code reached from each instruction, is
 | |
| atomic from the point of view of a Python program.
 | |
| 
 | |
| In theory, this means an exact accounting requires an exact understanding of the
 | |
| PVM bytecode implementation.  In practice, it means that operations on shared
 | |
| variables of builtin data types (ints, lists, dicts, etc) that "look atomic"
 | |
| really are.
 | |
| 
 | |
| For example, the following operations are all atomic (L, L1, L2 are lists, D,
 | |
| D1, D2 are dicts, x, y are objects, i, j are ints)::
 | |
| 
 | |
|    L.append(x)
 | |
|    L1.extend(L2)
 | |
|    x = L[i]
 | |
|    x = L.pop()
 | |
|    L1[i:j] = L2
 | |
|    L.sort()
 | |
|    x = y
 | |
|    x.field = y
 | |
|    D[x] = y
 | |
|    D1.update(D2)
 | |
|    D.keys()
 | |
| 
 | |
| These aren't::
 | |
| 
 | |
|    i = i+1
 | |
|    L.append(L[-1])
 | |
|    L[i] = L[j]
 | |
|    D[x] = D[x] + 1
 | |
| 
 | |
| Operations that replace other objects may invoke those other objects'
 | |
| :meth:`__del__` method when their reference count reaches zero, and that can
 | |
| affect things.  This is especially true for the mass updates to dictionaries and
 | |
| lists.  When in doubt, use a mutex!
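A mutex in this sense is a :class:`threading.Lock`. The sketch below protects a shared counter; the names ``counter``, ``counter_lock`` and ``add_many`` are invented for the example.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    global counter
    for _ in range(times):
        # "counter += 1" is a read, an add and a write; the lock makes the
        # three steps appear as one to other threads.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```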
 | |
| 
 | |
| 
 | |
| Can't we get rid of the Global Interpreter Lock?
 | |
| ------------------------------------------------
 | |
| 
 | |
| .. XXX mention multiprocessing
 | |
| .. XXX link to dbeazley's talk about GIL?
 | |
| 
 | |
| The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
 | |
| deployment on high-end multiprocessor server machines, because a multi-threaded
 | |
| Python program effectively only uses one CPU, due to the insistence that
 | |
| (almost) all Python code can only run while the GIL is held.
 | |
| 
 | |
| Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
 | |
| patch set (the "free threading" patches) that removed the GIL and replaced it
 | |
| with fine-grained locking.  Unfortunately, even on Windows (where locks are very
 | |
| efficient) this ran ordinary Python code about twice as slow as the interpreter
 | |
| using the GIL.  On Linux the performance loss was even worse because pthread
 | |
| locks aren't as efficient.
 | |
| 
 | |
| Since then, the idea of getting rid of the GIL has occasionally come up but
 | |
| nobody has found a way to deal with the expected slowdown, and users who don't
 | |
| use threads would not be happy if their code ran at half the speed.  Greg's
 | |
| free threading patch set has not been kept up-to-date for later Python versions.
 | |
| 
 | |
| This doesn't mean that you can't make good use of Python on multi-CPU machines!
 | |
| You just have to be creative with dividing the work up between multiple
 | |
| *processes* rather than multiple *threads*.  Judicious use of C extensions will
 | |
| also help; if you use a C extension to perform a time-consuming task, the
 | |
| extension can release the GIL while the thread of execution is in the C code and
 | |
| allow other threads to get some work done.
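One way to divide work between processes is the :mod:`multiprocessing` module. The following is only a sketch: ``count_primes`` and the chunk sizes are made up for the example, and the ``__main__`` guard is there because some platforms start workers by launching a fresh interpreter that re-imports the main module.

```python
import multiprocessing


def count_primes(bounds):
    """Count the primes in the half-open range [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Split the range into chunks; each chunk is handled by a separate process.
    chunks = [(i, i + 25000) for i in range(0, 100000, 25000)]
    with multiprocessing.Pool(processes=4) as pool:
        total = sum(pool.map(count_primes, chunks))
    print(total)
```

Arguments and results are pickled and sent between processes, so this pays off mainly when each chunk does a substantial amount of computation.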
 | |
| 
 | |
| It has been suggested that the GIL should be a per-interpreter-state lock rather
 | |
| than truly global; interpreters then wouldn't be able to share objects.
 | |
| Unfortunately, this isn't likely to happen either.  It would be a tremendous
 | |
| amount of work, because many object implementations currently have global state.
 | |
| For example, small integers and short strings are cached; these caches would
 | |
| have to be moved to the interpreter state.  Other object types have their own
 | |
| free list; these free lists would have to be moved to the interpreter state.
 | |
| And so on.
 | |
| 
 | |
| And I doubt that it can even be done in finite time, because the same problem
 | |
| exists for 3rd party extensions.  It is likely that 3rd party extensions are
 | |
| being written at a faster rate than you can convert them to store all their
 | |
| global state in the interpreter state.
 | |
| 
 | |
| And finally, once you have multiple interpreters not sharing any state, what
 | |
| have you gained over running each interpreter in a separate process?
 | |
| 
 | |
| 
 | |
| Input and Output
 | |
| ================
 | |
| 
 | |
| How do I delete a file? (And other file questions...)
 | |
| -----------------------------------------------------
 | |
| 
 | |
| Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
 | |
| the :mod:`os` module.  The two functions are identical; :func:`~os.unlink` is simply
 | |
| the name of the Unix system call for this function.
 | |
| 
 | |
| To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
 | |
| ``os.makedirs(path)`` will create any intermediate directories in ``path`` that
 | |
| don't exist. ``os.removedirs(path)`` will remove intermediate directories as
 | |
| long as they're empty; if you want to delete an entire directory tree and its
 | |
| contents, use :func:`shutil.rmtree`.
 | |
| 
 | |
| To rename a file, use ``os.rename(old_path, new_path)``.
 | |
| 
 | |
| To truncate a file, open it using ``f = open(filename, "rb+")``, and use
 | |
| ``f.truncate(offset)``; offset defaults to the current seek position.  There's
 | |
| also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
 | |
| ``fd`` is the file descriptor (a small integer).
 | |
| 
 | |
| The :mod:`shutil` module also contains a number of functions to work on files
 | |
| including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
 | |
| :func:`~shutil.rmtree`.
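The calls above can be tried safely inside a scratch directory. This sketch uses :func:`tempfile.mkdtemp` for that purpose; all file and directory names are arbitrary.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                      # scratch directory for the demo
nested = os.path.join(base, "a", "b")
os.makedirs(nested)                            # creates "a" and "a/b"

old_path = os.path.join(nested, "old.txt")
new_path = os.path.join(nested, "new.txt")
with open(old_path, "wb") as f:
    f.write(b"0123456789")

os.rename(old_path, new_path)                  # rename the file
with open(new_path, "rb+") as f:
    f.truncate(4)                              # keep only the first 4 bytes
size_after_truncate = os.path.getsize(new_path)

copy_path = os.path.join(base, "copy.txt")
shutil.copyfile(new_path, copy_path)           # copy contents only
os.remove(copy_path)                           # delete a single file
shutil.rmtree(base)                            # delete the whole tree
```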
 | |
| 
 | |
| 
 | |
| How do I copy a file?
 | |
| ---------------------
 | |
| 
 | |
| The :mod:`shutil` module contains a :func:`~shutil.copyfile` function.  Note
 | |
| that on MacOS 9 it doesn't copy the resource fork and Finder info.
 | |
| 
 | |
| 
 | |
| How do I read (or write) binary data?
 | |
| -------------------------------------
 | |
| 
 | |
| To read or write complex binary data formats, it's best to use the :mod:`struct`
 | |
| module.  It allows you to take a :class:`bytes` object containing binary data
 | |
| (usually numbers) and convert it to Python objects, and vice versa.
 | |
| 
 | |
| For example, the following code reads two 2-byte integers and one 4-byte integer
 | |
| in big-endian format from a file::
 | |
| 
 | |
|    import struct
 | |
| 
 | |
|    with open(filename, "rb") as f:
 | |
|        s = f.read(8)
 | |
|        x, y, z = struct.unpack(">hhl", s)
 | |
| 
 | |
| The '>' in the format string forces big-endian data; the letter 'h' reads one
 | |
| "short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
 | |
| string.
 | |
| 
 | |
| For data that is more regular (e.g. a homogeneous list of ints or floats),
 | |
| you can also use the :mod:`array` module.
 | |
| 
 | |
|    .. note::
 | |
|       To read and write binary data, it is mandatory to open the file in
 | |
|       binary mode (here, passing ``"rb"`` to :func:`open`).  If you use
 | |
|       ``"r"`` instead (the default), the file will be opened in text mode
 | |
|       and ``f.read()`` will return :class:`str` objects rather than
 | |
|       :class:`bytes` objects.
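The reverse direction works the same way. The sketch below packs values with the format string from the example above and round-trips a homogeneous sequence through :mod:`array`; the sample values are arbitrary.

```python
import array
import struct

# Pack two 2-byte integers and one 4-byte integer, big-endian.
packed = struct.pack(">hhl", 1, -2, 300000)
x, y, z = struct.unpack(">hhl", packed)

# A homogeneous sequence of C doubles converts to and from bytes in one call.
samples = array.array("d", [0.5, 1.5, 2.5])
raw = samples.tobytes()
restored = array.array("d")
restored.frombytes(raw)
```

Note that :mod:`array` uses the machine's native byte order, so data exchanged between different machines may need ``byteswap()``.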
 | |
| 
 | |
| 
 | |
| I can't seem to use os.read() on a pipe created with os.popen(); why?
 | |
| ---------------------------------------------------------------------
 | |
| 
 | |
| :func:`os.read` is a low-level function which takes a file descriptor, a small
 | |
| integer representing the opened file.  :func:`os.popen` creates a high-level
 | |
| file object, the same type returned by the builtin :func:`open` function.  Thus,
 | |
| to read n bytes from a pipe p created with :func:`os.popen`, you need to use
 | |
| ``p.read(n)``.
 | |
| 
 | |
| 
 | |
| .. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.
 | |
| 
 | |
|    How do I run a subprocess with pipes connected to both input and output?
 | |
|    ------------------------------------------------------------------------
 | |
| 
 | |
|    Use the :mod:`popen2` module.  For example::
 | |
| 
 | |
|       import popen2
 | |
|       fromchild, tochild = popen2.popen2("command")
 | |
|       tochild.write("input\n")
 | |
|       tochild.flush()
 | |
|       output = fromchild.readline()
 | |
| 
 | |
|    Warning: in general it is unwise to do this because you can easily cause a
 | |
|    deadlock where your process is blocked waiting for output from the child
 | |
|    while the child is blocked waiting for input from you.  This can be caused
 | |
|    because the parent expects the child to output more text than it does, or it
 | |
|    can be caused by data being stuck in stdio buffers due to lack of flushing.
 | |
|    The Python parent can of course explicitly flush the data it sends to the
 | |
|    child before it reads any output, but if the child is a naive C program it
 | |
|    may have been written to never explicitly flush its output, even if it is
 | |
|    interactive, since flushing is normally automatic.
 | |
| 
 | |
|    Note that a deadlock is also possible if you use :func:`popen3` to read
 | |
|    stdout and stderr. If one of the two is too large for the internal buffer
 | |
|    (increasing the buffer size does not help) and you ``read()`` the other one
 | |
|    first, there is a deadlock, too.
 | |
| 
 | |
|    Note on a bug in popen2: unless your program calls ``wait()`` or
 | |
|    ``waitpid()``, finished child processes are never removed, and eventually
 | |
|    calls to popen2 will fail because of a limit on the number of child
 | |
|    processes.  Calling :func:`os.waitpid` with the :data:`os.WNOHANG` option can
 | |
|    prevent this; a good place to insert such a call would be before calling
 | |
|    ``popen2`` again.
 | |
| 
 | |
|    In many cases, all you really need is to run some data through a command and
 | |
|    get the result back.  Unless the amount of data is very large, the easiest
 | |
|    way to do this is to write it to a temporary file and run the command with
 | |
|    that temporary file as input.  The standard module :mod:`tempfile` exports a
 | |
|    ``mktemp()`` function to generate unique temporary file names. ::
 | |
| 
 | |
|       import tempfile
 | |
|       import os
 | |
| 
 | |
|       class Popen3:
 | |
|           """
 | |
|           This is a deadlock-safe version of popen that returns
 | |
|           an object with errorlevel, out (a string) and err (a string).
 | |
|           (capturestderr may not work under windows.)
 | |
|           Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
 | |
|           """
 | |
|           def __init__(self,command,input=None,capturestderr=None):
 | |
|               outfile=tempfile.mktemp()
 | |
|               command="( %s ) > %s" % (command,outfile)
 | |
|               if input:
 | |
|                   infile=tempfile.mktemp()
 | |
|                   open(infile,"w").write(input)
 | |
|                   command=command+" <"+infile
 | |
|               if capturestderr:
 | |
|                   errfile=tempfile.mktemp()
 | |
|                   command=command+" 2>"+errfile
 | |
|               self.errorlevel=os.system(command) >> 8
 | |
|               self.out=open(outfile,"r").read()
 | |
|               os.remove(outfile)
 | |
|               if input:
 | |
|                   os.remove(infile)
 | |
|               if capturestderr:
 | |
|                   self.err=open(errfile,"r").read()
 | |
|                   os.remove(errfile)
 | |
| 
 | |
|    Note that many interactive programs (e.g. vi) don't work well with pipes
 | |
|    substituted for standard input and output.  You will have to use pseudo ttys
 | |
|    ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
 | |
|    "expect" library.  A Python extension that interfaces to expect is called
 | |
|    "expy" and available from http://expectpy.sourceforge.net.  A pure Python
 | |
|    solution that works like expect is `pexpect
 | |
|    <http://pypi.python.org/pypi/pexpect/>`_.
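For the related task of running a command with pipes connected to both its input and its output, a minimal sketch using the :mod:`subprocess` module follows. The child command here is just another Python interpreter chosen so the example is self-contained; any program that reads standard input and writes standard output could take its place.

```python
import subprocess
import sys

# The child here is another Python interpreter that upper-cases its input;
# any command that reads stdin and writes stdout would do.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# communicate() feeds stdin and drains stdout/stderr concurrently, which
# avoids the deadlock that hand-written read/write loops can run into.
proc = subprocess.Popen(child, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate(b"input\n")
print(proc.returncode, out)
```

``communicate()`` also waits for the child to exit, so finished child processes do not accumulate.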
 | |
| 
 | |
| 
 | |
| How do I access the serial (RS232) port?
 | |
| ----------------------------------------
 | |
| 
 | |
| For Win32, POSIX (Linux, BSD, etc.), Jython:
 | |
| 
 | |
|    http://pyserial.sourceforge.net
 | |
| 
 | |
| For Unix, see a Usenet post by Mitch Chapman:
 | |
| 
 | |
|    http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com
 | |
| 
 | |
| 
 | |
| Why doesn't closing sys.stdout (stdin, stderr) really close it?
 | |
| ---------------------------------------------------------------
 | |
| 
 | |
| Python :term:`file objects <file object>` are a high-level layer of
 | |
| abstraction on low-level C file descriptors.
 | |
| 
 | |
| For most file objects you create in Python via the built-in :func:`open`
 | |
| function, ``f.close()`` marks the Python file object as being closed from
 | |
| Python's point of view, and also arranges to close the underlying C file
 | |
| descriptor.  This also happens automatically in ``f``'s destructor, when
 | |
| ``f`` becomes garbage.
 | |
| 
 | |
| But stdin, stdout and stderr are treated specially by Python, because of the
 | |
| special status also given to them by C.  Running ``sys.stdout.close()`` marks
 | |
| the Python-level file object as being closed, but does *not* close the
 | |
| associated C file descriptor.
 | |
| 
 | |
| To close the underlying C file descriptor for one of these three, you should
 | |
| first be sure that's what you really want to do (e.g., you may confuse
 | |
| extension modules trying to do I/O).  If it is, use :func:`os.close`::
 | |
| 
 | |
|    os.close(sys.stdin.fileno())
 | |
|    os.close(sys.stdout.fileno())
 | |
|    os.close(sys.stderr.fileno())
 | |
| 
 | |
| Or you can use the numeric constants 0, 1 and 2, respectively.
 | |
| 
 | |
| 
 | |
| Network/Internet Programming
 | |
| ============================
 | |
| 
 | |
| What WWW tools are there for Python?
 | |
| ------------------------------------
 | |
| 
 | |
| See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
 | |
| Reference Manual.  Python has many modules that will help you build server-side
 | |
| and client-side web systems.
 | |
| 
 | |
| .. XXX check if wiki page is still up to date
 | |
| 
 | |
| A summary of available frameworks is maintained by Paul Boddie at
 | |
| http://wiki.python.org/moin/WebProgramming .
 | |
| 
 | |
| Cameron Laird maintains a useful set of pages about Python web technologies at
 | |
| http://phaseit.net/claird/comp.lang.python/web_python.
 | |
| 
 | |
| 
 | |
| How can I mimic CGI form submission (METHOD=POST)?
 | |
| --------------------------------------------------
 | |
| 
 | |
| I would like to retrieve web pages that are the result of POSTing a form. Is
 | |
| there existing code that would let me do this easily?
 | |
| 
 | |
| Yes. Here's a simple example that uses urllib.request::
 | |
| 
 | |
|    #!/usr/local/bin/python
 | |
| 
 | |
|    import urllib.request
 | |
| 
 | |
|    ### build the query string
 | |
|    qs = "First=Josephine&MI=Q&Last=Public"
 | |
| 
 | |
|    ### connect and send the server a path
 | |
|    req = urllib.request.urlopen('http://www.some-server.out-there'
 | |
|                                 '/cgi-bin/some-cgi-script', data=qs.encode('ascii'))
 | |
|    msg, hdrs = req.read(), req.info()
 | |
| 
 | |
| Note that in general for percent-encoded POST operations, query strings must be
 | |
| quoted using :func:`urllib.parse.urlencode`.  For example to send name="Guy Steele,
 | |
| Jr."::
 | |
| 
 | |
|    >>> import urllib.parse
 | |
|    >>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
 | |
|    'name=Guy+Steele%2C+Jr.'
 | |
| 
 | |
| .. seealso:: :ref:`urllib-howto` for extensive examples.
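Putting the two steps together, the following sketch builds the POST body from a dictionary and attaches it to a request object. The URL is the placeholder used above and nothing is sent over the network here.

```python
import urllib.parse
import urllib.request

fields = {"First": "Josephine", "MI": "Q", "Last": "Public"}
# urlencode() quotes the values; the result must be encoded to bytes for POST.
data = urllib.parse.urlencode(fields).encode("ascii")

# Placeholder URL from the example above; no request is sent until urlopen().
req = urllib.request.Request(
    "http://www.some-server.out-there/cgi-bin/some-cgi-script", data=data)
print(req.get_method(), data)
```

Passing ``req`` to :func:`urllib.request.urlopen` would perform the actual POST.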
 | |
| 
 | |
| 
 | |
| What module should I use to help with generating HTML?
 | |
| ------------------------------------------------------
 | |
| 
 | |
| .. XXX add modern template languages
 | |
| 
 | |
| There are many different modules available:
 | |
| 
 | |
| * HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
 | |
|   tags. It's used when you are writing in Python and wish to synthesize HTML
 | |
|   pages for generating a web or for CGI forms, etc.
 | |
| 
 | |
| * DocumentTemplate and Zope Page Templates are two different systems that are
 | |
|   part of Zope.
 | |
| 
 | |
| * Quixote's PTL uses Python syntax to assemble strings of text.
 | |
| 
 | |
| Consult the `Web Programming wiki pages
 | |
| <http://wiki.python.org/moin/WebProgramming>`_ for more links.
 | |
| 
 | |
| 
 | |
| How do I send mail from a Python script?
 | |
| ----------------------------------------
 | |
| 
 | |
| Use the standard library module :mod:`smtplib`.
 | |
| 
 | |
| Here's a very simple interactive mail sender that uses it.  This method will
 | |
| work on any host that supports an SMTP listener. ::
 | |
| 
 | |
|    import sys, smtplib
 | |
| 
 | |
|    fromaddr = input("From: ")
 | |
|    toaddrs  = input("To: ").split(',')
 | |
|    print("Enter message, end with ^D:")
 | |
|    msg = ''
 | |
|    while True:
 | |
|        line = sys.stdin.readline()
 | |
|        if not line:
 | |
|            break
 | |
|        msg += line
 | |
| 
 | |
|    # The actual mail send
 | |
|    server = smtplib.SMTP('localhost')
 | |
|    server.sendmail(fromaddr, toaddrs, msg)
 | |
|    server.quit()
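When the message needs headers such as ``Subject``, the :mod:`email` package can assemble it before :mod:`smtplib` sends it.  This is a sketch under assumptions: the helper names ``build_message`` and ``send`` are hypothetical, the addresses are placeholders, and an SMTP listener is assumed to be running on ``localhost``.

```python
import smtplib
from email.message import EmailMessage


def build_message(fromaddr, toaddrs, subject, body):
    """Assemble a plain-text message (hypothetical helper)."""
    msg = EmailMessage()
    msg['From'] = fromaddr
    msg['To'] = ', '.join(toaddrs)
    msg['Subject'] = subject
    msg.set_content(body)
    return msg


def send(msg, host='localhost'):
    """Hand *msg* to the SMTP listener on *host* (assumed to exist)."""
    # The with statement closes the connection even if sending fails.
    with smtplib.SMTP(host) as server:
        server.send_message(msg)


msg = build_message('sender@example.com',
                    ['receiver@example.com', 'other@example.com'],
                    'test', 'Some text\n')
# send(msg)  # uncomment on a host that runs an SMTP listener
```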
 | |
| 
 | |
| A Unix-only alternative uses sendmail.  The location of the sendmail program
 | |
| varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
 | |
| ``/usr/sbin/sendmail``.  The sendmail manual page will help you out.  Here's
 | |
| some sample code::
 | |
| 
 | |
|    SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
 | |
|    import os
 | |
|    p = os.popen("%s -t -i" % SENDMAIL, "w")
 | |
|    p.write("To: receiver@example.com\n")
 | |
|    p.write("Subject: test\n")
 | |
|    p.write("\n")  # blank line separating headers from body
 | |
|    p.write("Some text\n")
 | |
|    p.write("some more text\n")
 | |
|    sts = p.close()
 | |
|    if sts is not None:  # close() returns None when the exit status is 0
 | |
|        print("Sendmail exit status", sts)
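The same pipe-to-a-program approach can be written with :mod:`subprocess`, which reports the exit status as a plain integer.  The sketch below is illustrative only: ``pipe_to_command`` is a hypothetical helper, and a small Python child process stands in for sendmail so that the example runs on systems where sendmail is not installed.

```python
import subprocess
import sys


def pipe_to_command(argv, text):
    """Write *text* to the standard input of *argv*; return its exit status."""
    completed = subprocess.run(argv, input=text, universal_newlines=True)
    return completed.returncode


message = ("To: receiver@example.com\n"
           "Subject: test\n"
           "\n"  # blank line separating headers from body
           "Some text\n")

# Stand-in command: a Python child process that only consumes its input.
# With sendmail installed, the list would instead be
# ["/usr/sbin/sendmail", "-t", "-i"], as in the example above.
argv = [sys.executable, "-c", "import sys; sys.stdin.read()"]
status = pipe_to_command(argv, message)
if status != 0:
    print("exit status", status)
```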
 | |
| 
 | |
| 
 | |
| How do I avoid blocking in the connect() method of a socket?
 | |
| ------------------------------------------------------------
 | |
| 
 | |
| The :mod:`select` module is commonly used to help with asynchronous I/O on sockets.
 | |
| 
 | |
| To prevent the TCP connect from blocking, you can set the socket to non-blocking
 | |
| mode.  Then when you do the ``connect()``, you will either connect immediately
 | |
| (unlikely) or get an exception that contains the error number as ``.errno``.
 | |
| ``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
 | |
| finished yet.  Different OSes will return different values, so you're going to
 | |
| have to check what's returned on your system.
 | |
| 
 | |
| You can use the ``connect_ex()`` method to avoid creating an exception.  It will
 | |
| just return the errno value.  To poll, you can call ``connect_ex()`` again later
 | |
| -- ``0`` or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
 | |
| socket to :func:`select.select` to check if it's writable.
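The steps described above can be sketched as follows.  This is a minimal illustration for Unix-like systems rather than a robust implementation: the function name ``connect_nonblocking`` and the five-second timeout are assumptions, and the set of "still in progress" error codes may need adjusting for your platform.

```python
import errno
import os
import select
import socket


def connect_nonblocking(address, timeout=5.0):
    """Start a non-blocking TCP connect and wait for it with select()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # connect_ex() returns an errno value instead of raising an exception.
    err = sock.connect_ex(address)
    # 0 means connected at once; these values mean "not finished yet".
    pending = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
    if err != 0 and err not in pending:
        sock.close()
        raise OSError(err, os.strerror(err))
    # The socket becomes writable once the connect attempt has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not finish in time")
    # SO_ERROR tells whether the attempt succeeded (0) or failed.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


# Demonstration against a listening socket on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
server.listen(1)
server_address = server.getsockname()
client = connect_nonblocking(server_address)
connected_to = client.getpeername()
client.close()
server.close()
```

Checking ``SO_ERROR`` after ``select`` reports the socket as writable distinguishes a completed connection from a failed one, since both make the socket writable on most Unix systems.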
 | |
| 
 | |
| 
 | |
| Databases
 | |
| =========
 | |
| 
 | |
| Are there any interfaces to database packages in Python?
 | |
| --------------------------------------------------------
 | |
| 
 | |
| Yes.
 | |
| 
 | |
| Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
 | |
| <dbm.gnu>` are included with standard Python.  There is also the
 | |
| :mod:`sqlite3` module, which provides a lightweight disk-based relational
 | |
| database.
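As a brief illustration of the :mod:`sqlite3` module, the sketch below creates a throwaway in-memory database; the table name ``langs`` and its contents are made up for the example.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file name to store it on disk.
conn = sqlite3.connect(":memory:")
try:
    conn.execute("CREATE TABLE langs (name TEXT, year INTEGER)")
    # "?" placeholders let the module quote the values safely.
    conn.executemany("INSERT INTO langs VALUES (?, ?)",
                     [("Python", 1991), ("C", 1972)])
    conn.commit()
    rows = conn.execute(
        "SELECT name FROM langs WHERE year < ? ORDER BY name", (1990,)
    ).fetchall()
finally:
    conn.close()
print(rows)
```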
 | |
| 
 | |
| Support for most relational databases is available.  See the
 | |
| `DatabaseProgramming wiki page
 | |
| <http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
 | |
| 
 | |
| 
 | |
| How do you implement persistent objects in Python?
 | |
| --------------------------------------------------
 | |
| 
 | |
| The :mod:`pickle` library module solves this in a very general way (though you
 | |
| still can't store things like open files, sockets or windows), and the
 | |
| :mod:`shelve` library module uses pickle and (g)dbm to create persistent
 | |
| mappings containing arbitrary Python objects.
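A short sketch of both modules follows.  The data and the file name ``store`` are arbitrary, and a temporary directory is used so that the example leaves nothing behind.

```python
import os
import pickle
import shelve
import tempfile

shared = [1, 2, 3]
record = {"first": shared, "second": shared}

# pickle round-trips the whole object graph, including the shared list.
copy = pickle.loads(pickle.dumps(record))

# shelve stores pickled values under string keys in a dbm file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "store")
    with shelve.open(path) as db:
        db["record"] = record
    # Reopen the shelf to show that the value survived being closed.
    with shelve.open(path) as db:
        restored = db["record"]
```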
 | |
| 
 | |
| A more awkward way of doing things is to use pickle's little sister, marshal.
 | |
| The :mod:`marshal` module provides very fast ways to store noncircular basic
 | |
| Python types to files and strings, and back again.  Although marshal does not do
 | |
| fancy things like store instances or handle shared references properly, it does
 | |
| run extremely fast.  For example, loading a half megabyte of data may take less
 | |
| than a third of a second.  This often beats doing something more complex and
 | |
| general such as using gdbm with pickle/shelve.
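The limits of marshal can be seen in a small sketch: basic built-in types round-trip, while an instance of a user-defined class is rejected.  The ``Point`` class is hypothetical and exists only for this example.

```python
import marshal


class Point:
    """Hypothetical user-defined class; marshal cannot serialize it."""

    def __init__(self, x, y):
        self.x, self.y = x, y


# Dictionaries, lists, tuples, strings and numbers are supported.
basic = {"name": "origin", "coords": [0.0, 0.0], "tags": ("a", "b")}
restored = marshal.loads(marshal.dumps(basic))

# Instances are not supported; marshal raises ValueError for them.
try:
    marshal.dumps(Point(1, 2))
    instance_ok = True
except ValueError:
    instance_ok = False
```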
 | |
| 
 | |
| 
 | |
| If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
 | |
| ------------------------------------------------------------------------------------------
 | |
| 
 | |
| .. XXX move this FAQ entry elsewhere?
 | |
| 
 | |
| .. note::
 | |
| 
 | |
|    The bsddb module is now available as a standalone package `pybsddb
 | |
|    <http://www.jcea.es/programacion/pybsddb.htm>`_.
 | |
| 
 | |
| Databases opened for write access with the bsddb module (and often by the anydbm
 | |
| module, since it will preferentially use bsddb) must explicitly be closed using
 | |
| the ``.close()`` method of the database.  The underlying library caches database
 | |
| contents which need to be converted to on-disk form and written.
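The explicit-close pattern looks like the sketch below.  Because bsddb is not part of the standard library, :mod:`dbm.dumb` is used here purely as a stand-in; that substitution is an assumption of the example, and the same ``try``/``finally`` structure applies to any database object with a ``close()`` method.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example")

    db = dbm.dumb.open(path, "c")  # "c": create the database if needed
    try:
        db[b"key"] = b"value"
    finally:
        # Runs even if the write raises, so cached data reaches the disk.
        db.close()

    db = dbm.dumb.open(path, "r")
    try:
        value = db[b"key"]
    finally:
        db.close()
```

A ``finally`` clause cannot help if the interpreter itself is killed, so long-running programs may also want to close and reopen the database at safe points.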
 | |
| 
 | |
| If you have initialized a new bsddb database but not written anything to it
 | |
| before the program crashes, you will often wind up with a zero-length file and
 | |
| encounter an exception the next time the file is opened.
 | |
| 
 | |
| 
 | |
| I tried to open a Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
 | |
| ------------------------------------------------------------------------------------------------------------------------------
 | |
| 
 | |
| .. XXX move this FAQ entry elsewhere?
 | |
| 
 | |
| .. note::
 | |
| 
 | |
|    The bsddb module is now available as a standalone package `pybsddb
 | |
|    <http://www.jcea.es/programacion/pybsddb.htm>`_.
 | |
| 
 | |
| Don't panic! Your data is probably intact. The most frequent cause for the error
 | |
| is that you tried to open an earlier Berkeley DB file with a later version of
 | |
| the Berkeley DB library.
 | |
| 
 | |
| Many Linux systems now have all three versions of Berkeley DB available.  If you
 | |
| are migrating from version 1 to a newer version use db_dump185 to dump a plain
 | |
| text version of the database.  If you are migrating from version 2 to version 3
 | |
| use db2_dump to create a plain text version of the database.  In either case,
 | |
| use db_load to create a new native database for the latest version installed on
 | |
| your computer.  If you have version 3 of Berkeley DB installed, you should be
 | |
| able to use db2_load to create a native version 2 database.
 | |
| 
 | |
| You should move away from Berkeley DB version 1 files because the hash file code
 | |
| contains known bugs that can corrupt your data.
 | |
| 
 | |
| 
 | |
| Mathematics and Numerics
 | |
| ========================
 | |
| 
 | |
| How do I generate random numbers in Python?
 | |
| -------------------------------------------
 | |
| 
 | |
| The standard module :mod:`random` implements a random number generator.  Usage
 | |
| is simple::
 | |
| 
 | |
|    import random
 | |
|    random.random()
 | |
| 
 | |
| This returns a random floating point number in the range [0, 1).
 | |
| 
 | |
| There are also many other specialized generators in this module, such as:
 | |
| 
 | |
| * ``randrange(a, b)`` chooses an integer in the range [a, b).
 | |
| * ``uniform(a, b)`` chooses a floating point number in the range [a, b).
 | |
| * ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.
 | |
| 
 | |
| Some higher-level functions operate on sequences directly, such as:
 | |
| 
 | |
| * ``choice(S)`` chooses a random element from a given sequence
 | |
| * ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly
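The functions listed above can be tried out as follows.  The argument values are arbitrary, and no outputs are shown because they differ from run to run.

```python
import random

x = random.random()                 # float in [0, 1)
n = random.randrange(1, 7)          # integer from 1 to 6, like a die roll
u = random.uniform(2.5, 10.0)       # float between 2.5 and 10.0
g = random.normalvariate(0.0, 1.0)  # sample with mean 0 and sdev 1

suits = ["clubs", "diamonds", "hearts", "spades"]
pick = random.choice(suits)         # one element of the sequence

deck = list(range(10))
random.shuffle(deck)                # reorders the list in place
```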
 | |
| 
 | |
| There's also a ``Random`` class you can instantiate to create multiple
 | |
| independent random number generators.
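A small sketch of independent generators: each ``Random`` instance keeps its own state, so two instances given the same seed produce the same sequence regardless of what the module-level functions do in between.  The seed values are arbitrary.

```python
import random

gen_a = random.Random(12345)
gen_b = random.Random(12345)   # same seed as gen_a
gen_c = random.Random(54321)   # different seed

seq_a = [gen_a.random() for _ in range(3)]
random.random()                # uses the module's own hidden generator
seq_b = [gen_b.random() for _ in range(3)]
seq_c = [gen_c.random() for _ in range(3)]
```

Here ``seq_a`` and ``seq_b`` are identical, while ``seq_c`` follows a different sequence.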
 |