mirror of
https://github.com/python/cpython.git
synced 2026-04-15 00:00:57 +00:00
372 lines
14 KiB
ReStructuredText
372 lines
14 KiB
ReStructuredText
.. _profile:
|
|
|
|
****************************************
|
|
:mod:`!profile` --- Pure Python profiler
|
|
****************************************
|
|
|
|
.. module:: profile
|
|
:synopsis: Pure Python profiler (deprecated).
|
|
:deprecated:
|
|
|
|
**Source code:** :source:`Lib/profile.py`
|
|
|
|
--------------
|
|
|
|
.. deprecated-removed:: 3.15 3.17
|
|
|
|
The :mod:`!profile` module is deprecated and will be removed in Python 3.17.
|
|
Use :mod:`profiling.tracing` instead.
|
|
|
|
The :mod:`!profile` module provides a pure Python implementation of a
|
|
deterministic profiler. While useful for understanding profiler internals or
|
|
extending profiler behavior through subclassing, its pure Python implementation
|
|
introduces significant overhead compared to the C-based :mod:`profiling.tracing`
|
|
module.
|
|
|
|
For most profiling tasks, use:
|
|
|
|
- :mod:`profiling.sampling` for production debugging with zero overhead
|
|
- :mod:`profiling.tracing` for development and testing
|
|
|
|
|
|
Migration
|
|
=========
|
|
|
|
Migrating from :mod:`!profile` to :mod:`profiling.tracing` is straightforward.
|
|
The APIs are compatible::
|
|
|
|
# Old (deprecated)
|
|
import profile
|
|
profile.run('my_function()')
|
|
|
|
# New (recommended)
|
|
import profiling.tracing
|
|
profiling.tracing.run('my_function()')
|
|
|
|
For most code, replacing ``import profile`` with ``import profiling.tracing``
|
|
(and using ``profiling.tracing`` instead of ``profile`` throughout) provides
|
|
a straightforward migration path.
|
|
|
|
.. note::
|
|
|
|
The ``cProfile`` module remains available as a backward-compatible alias
|
|
to :mod:`profiling.tracing`. Existing code using ``import cProfile`` will
|
|
continue to work without modification.
|
|
|
|
|
|
:mod:`!profile` and :mod:`!profiling.tracing` module reference
|
|
==============================================================
|
|
|
|
Both the :mod:`!profile` and :mod:`profiling.tracing` modules provide the
|
|
following functions:
|
|
|
|
.. function:: run(command, filename=None, sort=-1)
|
|
|
|
This function takes a single argument that can be passed to the :func:`exec`
|
|
function, and an optional file name. In all cases this routine executes::
|
|
|
|
exec(command, __main__.__dict__, __main__.__dict__)
|
|
|
|
and gathers profiling statistics from the execution. If no file name is
|
|
present, then this function automatically creates a :class:`~pstats.Stats`
|
|
instance and prints a simple profiling report. If the sort value is specified,
|
|
it is passed to this :class:`~pstats.Stats` instance to control how the
|
|
results are sorted.
|
|
|
|
.. function:: runctx(command, globals, locals, filename=None, sort=-1)
|
|
|
|
This function is similar to :func:`run`, with added arguments to supply the
|
|
globals and locals mappings for the *command* string. This routine
|
|
executes::
|
|
|
|
exec(command, globals, locals)
|
|
|
|
and gathers profiling statistics as in the :func:`run` function above.
|
|
|
|
.. class:: Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True)
|
|
|
|
This class is normally only used if more precise control over profiling is
|
|
needed than what the :func:`profiling.tracing.run` function provides.
|
|
|
|
A custom timer can be supplied for measuring how long code takes to run via
|
|
the *timer* argument. This must be a function that returns a single number
|
|
representing the current time. If the number is an integer, the *timeunit*
|
|
specifies a multiplier that specifies the duration of each unit of time. For
|
|
example, if the timer returns times measured in thousands of seconds, the
|
|
time unit would be ``.001``.
|
|
|
|
Directly using the :class:`Profile` class allows formatting profile results
|
|
without writing the profile data to a file::
|
|
|
|
import profiling.tracing
|
|
import pstats
|
|
import io
|
|
from pstats import SortKey
|
|
|
|
pr = profiling.tracing.Profile()
|
|
pr.enable()
|
|
# ... do something ...
|
|
pr.disable()
|
|
s = io.StringIO()
|
|
sortby = SortKey.CUMULATIVE
|
|
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
|
|
ps.print_stats()
|
|
print(s.getvalue())
|
|
|
|
The :class:`Profile` class can also be used as a context manager (supported
|
|
only in :mod:`profiling.tracing`, not in the deprecated :mod:`!profile`
|
|
module; see :ref:`typecontextmanager`)::
|
|
|
|
import profiling.tracing
|
|
|
|
with profiling.tracing.Profile() as pr:
|
|
# ... do something ...
|
|
|
|
pr.print_stats()
|
|
|
|
.. versionchanged:: 3.8
|
|
Added context manager support.
|
|
|
|
.. method:: enable()
|
|
|
|
Start collecting profiling data. Only in :mod:`profiling.tracing`.
|
|
|
|
.. method:: disable()
|
|
|
|
Stop collecting profiling data. Only in :mod:`profiling.tracing`.
|
|
|
|
.. method:: create_stats()
|
|
|
|
Stop collecting profiling data and record the results internally
|
|
as the current profile.
|
|
|
|
.. method:: print_stats(sort=-1)
|
|
|
|
Create a :class:`~pstats.Stats` object based on the current
|
|
profile and print the results to stdout.
|
|
|
|
The *sort* parameter specifies the sorting order of the displayed
|
|
statistics. It accepts a single key or a tuple of keys to enable
|
|
multi-level sorting, as in :meth:`pstats.Stats.sort_stats`.
|
|
|
|
.. versionadded:: 3.13
|
|
:meth:`~Profile.print_stats` now accepts a tuple of keys.
|
|
|
|
.. method:: dump_stats(filename)
|
|
|
|
Write the results of the current profile to *filename*.
|
|
|
|
.. method:: run(cmd)
|
|
|
|
Profile the cmd via :func:`exec`.
|
|
|
|
.. method:: runctx(cmd, globals, locals)
|
|
|
|
Profile the cmd via :func:`exec` with the specified global and
|
|
local environment.
|
|
|
|
.. method:: runcall(func, /, *args, **kwargs)
|
|
|
|
Profile ``func(*args, **kwargs)``
|
|
|
|
Note that profiling will only work if the called command/function actually
|
|
returns. If the interpreter is terminated (e.g. via a :func:`sys.exit` call
|
|
during the called command/function execution) no profiling results will be
|
|
printed.
|
|
|
|
|
|
Differences from :mod:`!profiling.tracing`
|
|
==========================================
|
|
|
|
The :mod:`!profile` module differs from :mod:`profiling.tracing` in several
|
|
ways:
|
|
|
|
**Higher overhead.** The pure Python implementation is significantly slower
|
|
than the C implementation, making it unsuitable for profiling long-running
|
|
programs or performance-sensitive code.
|
|
|
|
**Calibration support.** The :mod:`!profile` module supports calibration to
|
|
compensate for profiling overhead. This is not needed in :mod:`profiling.tracing`
|
|
because the C implementation has negligible overhead.
|
|
|
|
**Custom timers.** Both modules support custom timers, but :mod:`!profile`
|
|
accepts timer functions that return tuples (like :func:`os.times`), while
|
|
:mod:`profiling.tracing` requires a function returning a single number.
|
|
|
|
**Subclassing.** The pure Python implementation is easier to subclass and
|
|
extend for custom profiling behavior.
|
|
|
|
|
|
.. _deterministic-profiling:
|
|
|
|
What is deterministic profiling?
|
|
================================
|
|
|
|
:dfn:`Deterministic profiling` is meant to reflect the fact that all *function
|
|
call*, *function return*, and *exception* events are monitored, and precise
|
|
timings are made for the intervals between these events (during which time the
|
|
user's code is executing). In contrast, :dfn:`statistical profiling` (which is
|
|
provided by the :mod:`profiling.sampling` module) periodically samples the
|
|
effective instruction pointer, and deduces where time is being spent. The
|
|
latter technique traditionally involves less overhead (as the code does not
|
|
need to be instrumented), but provides only relative indications of where time
|
|
is being spent.
|
|
|
|
In Python, since there is an interpreter active during execution, the presence
|
|
of instrumented code is not required in order to do deterministic profiling.
|
|
Python automatically provides a :dfn:`hook` (optional callback) for each event.
|
|
In addition, the interpreted nature of Python tends to add so much overhead to
|
|
execution, that deterministic profiling tends to only add small processing
|
|
overhead in typical applications. The result is that deterministic profiling is
|
|
not that expensive, yet provides extensive run time statistics about the
|
|
execution of a Python program.
|
|
|
|
Call count statistics can be used to identify bugs in code (surprising counts),
|
|
and to identify possible inline-expansion points (high call counts). Internal
|
|
time statistics can be used to identify "hot loops" that should be carefully
|
|
optimized. Cumulative time statistics should be used to identify high level
|
|
errors in the selection of algorithms. Note that the unusual handling of
|
|
cumulative times in this profiler allows statistics for recursive
|
|
implementations of algorithms to be directly compared to iterative
|
|
implementations.
|
|
|
|
|
|
.. _profile-limitations:
|
|
|
|
Limitations
|
|
===========
|
|
|
|
One limitation has to do with accuracy of timing information. There is a
|
|
fundamental problem with deterministic profilers involving accuracy. The most
|
|
obvious restriction is that the underlying "clock" is only ticking at a rate
|
|
(typically) of about .001 seconds. Hence no measurements will be more accurate
|
|
than the underlying clock. If enough measurements are taken, then the "error"
|
|
will tend to average out. Unfortunately, removing this first error induces a
|
|
second source of error.
|
|
|
|
The second problem is that it "takes a while" from when an event is dispatched
|
|
until the profiler's call to get the time actually *gets* the state of the
|
|
clock. Similarly, there is a certain lag when exiting the profiler event
|
|
handler from the time that the clock's value was obtained (and then squirreled
|
|
away), until the user's code is once again executing. As a result, functions
|
|
that are called many times, or call many functions, will typically accumulate
|
|
this error. The error that accumulates in this fashion is typically less than
|
|
the accuracy of the clock (less than one clock tick), but it *can* accumulate
|
|
and become very significant.
|
|
|
|
The problem is more important with the deprecated :mod:`!profile` module than
|
|
with the lower-overhead :mod:`profiling.tracing`. For this reason,
|
|
:mod:`!profile` provides a means of calibrating itself for a given platform so
|
|
that this error can be probabilistically (on the average) removed. After the
|
|
profiler is calibrated, it will be more accurate (in a least square sense), but
|
|
it will sometimes produce negative numbers (when call counts are exceptionally
|
|
low, and the gods of probability work against you :-). ) Do *not* be alarmed
|
|
by negative numbers in the profile. They should *only* appear if you have
|
|
calibrated your profiler, and the results are actually better than without
|
|
calibration.
|
|
|
|
|
|
.. _profile-calibration:
|
|
|
|
Calibration
|
|
===========
|
|
|
|
The profiler of the :mod:`!profile` module subtracts a constant from each event
|
|
handling time to compensate for the overhead of calling the time function, and
|
|
socking away the results. By default, the constant is 0. The following
|
|
procedure can be used to obtain a better constant for a given platform (see
|
|
:ref:`profile-limitations`). ::
|
|
|
|
import profile
|
|
pr = profile.Profile()
|
|
for i in range(5):
|
|
print(pr.calibrate(10000))
|
|
|
|
The method executes the number of Python calls given by the argument, directly
|
|
and again under the profiler, measuring the time for both. It then computes the
|
|
hidden overhead per profiler event, and returns that as a float. For example,
|
|
on a 1.8Ghz Intel Core i5 running macOS, and using Python's time.process_time() as
|
|
the timer, the magical number is about 4.04e-6.
|
|
|
|
The object of this exercise is to get a fairly consistent result. If your
|
|
computer is *very* fast, or your timer function has poor resolution, you might
|
|
have to pass 100000, or even 1000000, to get consistent results.
|
|
|
|
When you have a consistent answer, there are three ways you can use it::
|
|
|
|
import profile
|
|
|
|
# 1. Apply computed bias to all Profile instances created hereafter.
|
|
profile.Profile.bias = your_computed_bias
|
|
|
|
# 2. Apply computed bias to a specific Profile instance.
|
|
pr = profile.Profile()
|
|
pr.bias = your_computed_bias
|
|
|
|
# 3. Specify computed bias in instance constructor.
|
|
pr = profile.Profile(bias=your_computed_bias)
|
|
|
|
If you have a choice, you are better off choosing a smaller constant, and then
|
|
your results will "less often" show up as negative in profile statistics.
|
|
|
|
|
|
.. _profile-timers:
|
|
|
|
Using a custom timer
|
|
====================
|
|
|
|
If you want to change how current time is determined (for example, to force use
|
|
of wall-clock time or elapsed process time), pass the timing function you want
|
|
to the :class:`Profile` class constructor::
|
|
|
|
pr = profile.Profile(your_time_func)
|
|
|
|
The resulting profiler will then call ``your_time_func``. Depending on whether
|
|
you are using :class:`profile.Profile` or :class:`profiling.tracing.Profile`,
|
|
``your_time_func``'s return value will be interpreted differently:
|
|
|
|
:class:`profile.Profile`
|
|
``your_time_func`` should return a single number, or a list of numbers whose
|
|
sum is the current time (like what :func:`os.times` returns). If the
|
|
function returns a single time number, or the list of returned numbers has
|
|
length 2, then you will get an especially fast version of the dispatch
|
|
routine.
|
|
|
|
Be warned that you should calibrate the profiler class for the timer function
|
|
that you choose (see :ref:`profile-calibration`). For most machines, a timer
|
|
that returns a lone integer value will provide the best results in terms of
|
|
low overhead during profiling. (:func:`os.times` is *pretty* bad, as it
|
|
returns a tuple of floating-point values). If you want to substitute a
|
|
better timer in the cleanest fashion, derive a class and hardwire a
|
|
replacement dispatch method that best handles your timer call, along with the
|
|
appropriate calibration constant.
|
|
|
|
:class:`profiling.tracing.Profile`
|
|
``your_time_func`` should return a single number. If it returns integers,
|
|
you can also invoke the class constructor with a second argument specifying
|
|
the real duration of one unit of time. For example, if
|
|
``your_integer_time_func`` returns times measured in thousands of seconds,
|
|
you would construct the :class:`Profile` instance as follows::
|
|
|
|
pr = profiling.tracing.Profile(your_integer_time_func, 0.001)
|
|
|
|
As the :class:`profiling.tracing.Profile` class cannot be calibrated, custom
|
|
timer functions should be used with care and should be as fast as possible.
|
|
For the best results with a custom timer, it might be necessary to hard-code
|
|
it in the C source of the internal :mod:`!_lsprof` module.
|
|
|
|
Python 3.3 adds several new functions in :mod:`time` that can be used to make
|
|
precise measurements of process or wall-clock time. For example, see
|
|
:func:`time.perf_counter`.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:mod:`profiling`
|
|
Overview of Python profiling tools.
|
|
|
|
:mod:`profiling.tracing`
|
|
Recommended replacement for this module.
|
|
|
|
:mod:`pstats`
|
|
Statistical analysis and formatting for profile data.
|