author | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
---|---|---|
committer | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
commit | 458120dd40db6b4df55a4e96b650e16798ef06a0 (patch) | |
tree | 8f82685be24fef97e715c6f5ca4c68d34d5074ee /sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex | |
parent | 3a742c699f6806c1145aea5149bf15de15a0afd7 (diff) |
add hg and python
Diffstat (limited to 'sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex')
-rw-r--r-- | sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex | 1757 |
1 files changed, 1757 insertions, 0 deletions
diff --git a/sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex b/sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex new file mode 100644 index 000000000..6b146946a --- /dev/null +++ b/sys/src/cmd/python/Doc/whatsnew/whatsnew24.tex @@ -0,0 +1,1757 @@ +\documentclass{howto} +\usepackage{distutils} +% $Id: whatsnew24.tex 50936 2006-07-29 15:42:46Z andrew.kuchling $ + +% Don't write extensive text for new sections; I'll do that. +% Feel free to add commented-out reminders of things that need +% to be covered. --amk + +\title{What's New in Python 2.4} +\release{1.02} +\author{A.M.\ Kuchling} +\authoraddress{ + \strong{Python Software Foundation}\\ + Email: \email{amk@amk.ca} +} + +\begin{document} +\maketitle +\tableofcontents + +This article explains the new features in Python 2.4.1, released on +March~30, 2005. + +Python 2.4 is a medium-sized release. It doesn't introduce as many +changes as the radical Python 2.2, but introduces more features than +the conservative 2.3 release. The most significant new language +features are function decorators and generator expressions; most other +changes are to the standard library. + +According to the CVS change logs, there were 481 patches applied and +502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to +be underestimates. + +This article doesn't attempt to provide a complete specification of +every single new feature, but instead provides a brief introduction to +each feature. For full details, you should refer to the documentation +for Python 2.4, such as the \citetitle[../lib/lib.html]{Python Library +Reference} and the \citetitle[../ref/ref.html]{Python Reference +Manual}. Often you will be referred to the PEP for a particular new +feature for explanations of the implementation and design rationale. + + +%====================================================================== +\section{PEP 218: Built-In Set Objects} + +Python 2.3 introduced the \module{sets} module. 
C implementations of +set data types have now been added to the Python core as two new +built-in types, \function{set(\var{iterable})} and +\function{frozenset(\var{iterable})}. They provide high speed +operations for membership testing, for eliminating duplicates from +sequences, and for mathematical operations like unions, intersections, +differences, and symmetric differences. + +\begin{verbatim} +>>> a = set('abracadabra') # form a set from a string +>>> 'z' in a # fast membership testing +False +>>> a # unique letters in a +set(['a', 'r', 'b', 'c', 'd']) +>>> ''.join(a) # convert back into a string +'arbcd' + +>>> b = set('alacazam') # form a second set +>>> a - b # letters in a but not in b +set(['r', 'd', 'b']) +>>> a | b # letters in either a or b +set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l']) +>>> a & b # letters in both a and b +set(['a', 'c']) +>>> a ^ b # letters in a or b but not both +set(['r', 'd', 'b', 'm', 'z', 'l']) + +>>> a.add('z') # add a new element +>>> a.update('wxy') # add multiple new elements +>>> a +set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z']) +>>> a.remove('x') # take one element out +>>> a +set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z']) +\end{verbatim} + +The \function{frozenset} type is an immutable version of \function{set}. +Since it is immutable and hashable, it may be used as a dictionary key or +as a member of another set. + +The \module{sets} module remains in the standard library, and may be +useful if you wish to subclass the \class{Set} or \class{ImmutableSet} +classes. There are currently no plans to deprecate the module. 
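The hashability point above can be sketched briefly. The permission names and the \code{labels} dictionary are invented for this illustration and are not part of any library; the code avoids version-specific syntax, so it should run unchanged on later Python versions as well.

```python
# Illustrative only: 'allowed', 'labels' and 'groups' are made-up names.
allowed = frozenset(['read', 'write'])

labels = {allowed: 'editor'}                   # a frozenset used as a dictionary key
groups = set([allowed, frozenset(['read'])])   # ...and as a member of another set

# Lookup depends on the elements only, not on the order they were listed in.
lookup = labels[frozenset(['write', 'read'])]

# A mutable set is not hashable, so it is rejected as a key.
try:
    {set(['read']): 'viewer'}
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False
```

Here \code{lookup} ends up as \code{'editor'}, and the attempt to use a plain \function{set} as a key fails with \exception{TypeError}.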
+ +\begin{seealso} +\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by +Greg Wilson and ultimately implemented by Raymond Hettinger.} +\end{seealso} + + +%====================================================================== +\section{PEP 237: Unifying Long Integers and Integers} + +The lengthy transition process for this PEP, begun in Python 2.2, +takes another step forward in Python 2.4. In 2.3, certain integer +operations that would behave differently after int/long unification +triggered \exception{FutureWarning} warnings and returned values +limited to 32 or 64 bits (depending on your platform). In 2.4, these +expressions no longer produce a warning and instead produce a +different result that's usually a long integer. + +The problematic expressions are primarily left shifts and lengthy +hexadecimal and octal constants. For example, +\code{2 \textless{}\textless{} 32} results +in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python +2.4, this expression now returns the correct answer, 8589934592. + +\begin{seealso} +\seepep{237}{Unifying Long Integers and Integers}{Original PEP +written by Moshe Zadka and GvR. The changes for 2.4 were implemented by +Kalle Svensson.} +\end{seealso} + + +%====================================================================== +\section{PEP 289: Generator Expressions} + +The iterator feature introduced in Python 2.2 and the +\module{itertools} module make it easier to write programs that loop +through large data sets without having the entire data set in memory +at one time. List comprehensions don't fit into this picture very +well because they produce a Python list object containing all of the +items. This unavoidably pulls all of the objects into memory, which +can be a problem if your data set is very large. 
When trying to write +a functionally-styled program, it would be natural to write something +like: + +\begin{verbatim} +links = [link for link in get_all_links() if not link.followed] +for link in links: + ... +\end{verbatim} + +instead of + +\begin{verbatim} +for link in get_all_links(): + if link.followed: + continue + ... +\end{verbatim} + +The first form is more concise and perhaps more readable, but if +you're dealing with a large number of link objects you'd have to write +the second form to avoid having all link objects in memory at the same +time. + +Generator expressions work similarly to list comprehensions but don't +materialize the entire list; instead they create a generator that will +return elements one by one. The above example could be written as: + +\begin{verbatim} +links = (link for link in get_all_links() if not link.followed) +for link in links: + ... +\end{verbatim} + +Generator expressions always have to be written inside parentheses, as +in the above example. The parentheses signalling a function call also +count, so if you want to create an iterator that will be immediately +passed to a function you could write: + +\begin{verbatim} +print sum(obj.count for obj in list_all_objects()) +\end{verbatim} + +Generator expressions differ from list comprehensions in various small +ways. Most notably, the loop variable (\var{obj} in the above +example) is not accessible outside of the generator expression. List +comprehensions leave the variable assigned to its last value; future +versions of Python will change this, making list comprehensions match +generator expressions in this respect. 
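To make the "one by one" behaviour concrete, here is a minimal sketch. The \code{numbers()} helper and the \code{produced} list are hypothetical names used only to record when each item is actually generated. The sketch uses the built-in \function{next()}, which was added after Python 2.4; on 2.4 itself the equivalent call would be \code{squares.next()}.

```python
# Hypothetical helper: records each value at the moment it is produced.
produced = []

def numbers(limit):
    for n in range(limit):
        produced.append(n)
        yield n

# Creating the generator expression does not run the loop.
squares = (n * n for n in numbers(5))
before = list(produced)        # nothing has been produced yet

first = next(squares)          # pulls exactly one item through the pipeline
after_one = list(produced)

total = first + sum(squares)   # consuming the rest runs the remaining iterations
```

Only when \function{sum()} consumes the generator are the remaining items produced; at no point does a list of all the squares exist in memory.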
+ +\begin{seealso} +\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and +implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.} +\end{seealso} + + +%====================================================================== +\section{PEP 292: Simpler String Substitutions} + +Some new classes in the standard library provide an alternative +mechanism for substituting variables into strings; this style of +substitution may be better for applications where untrained +users need to edit templates. + +The usual way of substituting variables by name is the \code{\%} +operator: + +\begin{verbatim} +>>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'} +'2: The Best of Times' +\end{verbatim} + +When writing the template string, it can be easy to forget the +\samp{i} or \samp{s} after the closing parenthesis. This isn't a big +problem if the template is in a Python module, because you run the +code, get an ``Unsupported format character'' \exception{ValueError}, +and fix the problem. However, consider an application such as Mailman +where template strings or translations are being edited by users who +aren't aware of the Python language. The format string's syntax is +complicated to explain to such users, and if they make a mistake, it's +difficult to provide helpful feedback to them. + +PEP 292 adds a \class{Template} class to the \module{string} module +that uses \samp{\$} to indicate a substitution: + +\begin{verbatim} +>>> import string +>>> t = string.Template('$page: $title') +>>> t.substitute({'page':2, 'title': 'The Best of Times'}) +'2: The Best of Times' +\end{verbatim} + +% $ Terminate $-mode for Emacs + +If a key is missing from the dictionary, the \method{substitute} method +will raise a \exception{KeyError}. 
There's also a \method{safe_substitute} +method that ignores missing keys: + +\begin{verbatim} +>>> t = string.Template('$page: $title') +>>> t.safe_substitute({'page':3}) +'3: $title' +\end{verbatim} + +% $ Terminate math-mode for Emacs + + +\begin{seealso} +\seepep{292}{Simpler String Substitutions}{Written and implemented +by Barry Warsaw.} +\end{seealso} + + +%====================================================================== +\section{PEP 318: Decorators for Functions and Methods} + +Python 2.2 extended Python's object model by adding static methods and +class methods, but it didn't extend Python's syntax to provide any new +way of defining static or class methods. Instead, you had to write a +\keyword{def} statement in the usual way, and pass the resulting +method to a \function{staticmethod()} or \function{classmethod()} +function that would wrap up the function as a method of the new type. +Your code would look like this: + +\begin{verbatim} +class C: + def meth (cls): + ... + + meth = classmethod(meth) # Rebind name to wrapped-up class method +\end{verbatim} + +If the method was very long, it would be easy to miss or forget the +\function{classmethod()} invocation after the function body. + +The intention was always to add some syntax to make such definitions +more readable, but at the time of 2.2's release a good syntax was not +obvious. Today a good syntax \emph{still} isn't obvious but users are +asking for easier access to the feature; a new syntactic feature has +been added to meet this need. + +The new feature is called ``function decorators''. The name comes +from the idea that \function{classmethod}, \function{staticmethod}, +and friends are storing additional information on a function object; +they're \emph{decorating} functions with more details. + +The notation borrows from Java and uses the \character{@} character as an +indicator. 
Using the new syntax, the example above would be written: + +\begin{verbatim} +class C: + + @classmethod + def meth (cls): + ... + +\end{verbatim} + +The \code{@classmethod} is shorthand for the +\code{meth=classmethod(meth)} assignment. More generally, if you have +the following: + +\begin{verbatim} +@A +@B +@C +def f (): + ... +\end{verbatim} + +It's equivalent to the following pre-decorator code: + +\begin{verbatim} +def f(): ... +f = A(B(C(f))) +\end{verbatim} + +Decorators must come on the line before a function definition, one decorator +per line, and can't be on the same line as the def statement, meaning that +\code{@A def f(): ...} is illegal. You can only decorate function +definitions, either at the module level or inside a class; you can't +decorate class definitions. + +A decorator is just a function that takes the function to be decorated as an +argument and returns either the same function or some new object. The +return value of the decorator need not be callable (though it typically is), +unless further decorators will be applied to the result. It's easy to write +your own decorators. The following simple example just sets an attribute on +the function object: + +\begin{verbatim} +>>> def deco(func): +... func.attr = 'decorated' +... return func +... +>>> @deco +... def f(): pass +... +>>> f +<function f at 0x402ef0d4> +>>> f.attr +'decorated' +>>> +\end{verbatim} + +As a slightly more realistic example, the following decorator checks +that the supplied argument is an integer: + +\begin{verbatim} +def require_int (func): + def wrapper (arg): + assert isinstance(arg, int) + return func(arg) + + return wrapper + +@require_int +def p1 (arg): + print arg + +@require_int +def p2(arg): + print arg*2 +\end{verbatim} + +An example in \pep{318} contains a fancier version of this idea that +lets you both specify the required type and check the returned type. + +Decorator functions can take arguments. 
If arguments are supplied, +your decorator function is called with only those arguments and must +return a new decorator function; this function must take a single +function and return a function, as previously described. In other +words, \code{@A @B @C(args)} becomes: + +\begin{verbatim} +def f(): ... +_deco = C(args) +f = A(B(_deco(f))) +\end{verbatim} + +Getting this right can be slightly brain-bending, but it's not too +difficult. + +A small related change makes the \member{func_name} attribute of +functions writable. This attribute is used to display function names +in tracebacks, so decorators should change the name of any new +function that's constructed and returned. + +\begin{seealso} +\seepep{318}{Decorators for Functions, Methods and Classes}{Written +by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people +wrote patches implementing function decorators, but the one that was +actually checked in was patch \#979728, written by Mark Russell.} + +\seeurl{http://www.python.org/moin/PythonDecoratorLibrary} +{This Wiki page contains several examples of decorators.} + +\end{seealso} + + +%====================================================================== +\section{PEP 322: Reverse Iteration} + +A new built-in function, \function{reversed(\var{seq})}, takes a sequence +and returns an iterator that loops over the elements of the sequence +in reverse order. + +\begin{verbatim} +>>> for i in reversed(xrange(1,4)): +... print i +... +3 +2 +1 +\end{verbatim} + +Compared to extended slicing, such as \code{range(1,4)[::-1]}, +\function{reversed()} is easier to read, runs faster, and uses +substantially less memory. + +Note that \function{reversed()} only accepts sequences, not arbitrary +iterators. If you want to reverse an iterator, first convert it to +a list with \function{list()}. + +\begin{verbatim} +>>> input = open('/etc/passwd', 'r') +>>> for line in reversed(list(input)): +... print line +... 
+root:*:0:0:System Administrator:/var/root:/bin/tcsh + ... +\end{verbatim} + +\begin{seealso} +\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.} + +\end{seealso} + + +%====================================================================== +\section{PEP 324: New subprocess Module} + +The standard library provides a number of ways to execute a +subprocess, offering different features and different levels of +complexity. \function{os.system(\var{command})} is easy to use, but +slow (it runs a shell process which executes the command) and +dangerous (you have to be careful about escaping the shell's +metacharacters). The \module{popen2} module offers classes that can +capture standard output and standard error from the subprocess, but +the naming is confusing. The \module{subprocess} module cleans +this up, providing a unified interface that offers all the features +you might need. + +Instead of \module{popen2}'s collection of classes, +\module{subprocess} contains a single class called \class{Popen} +whose constructor supports a number of different keyword arguments. + +\begin{verbatim} +class Popen(args, bufsize=0, executable=None, + stdin=None, stdout=None, stderr=None, + preexec_fn=None, close_fds=False, shell=False, + cwd=None, env=None, universal_newlines=False, + startupinfo=None, creationflags=0): +\end{verbatim} + +\var{args} is commonly a sequence of strings that will be the +arguments to the program executed as the subprocess. (If the +\var{shell} argument is true, \var{args} can be a string which will +then be passed on to the shell for interpretation, just as +\function{os.system()} does.) + +\var{stdin}, \var{stdout}, and \var{stderr} specify what the +subprocess's input, output, and error streams will be. You can +provide a file object or a file descriptor, or you can use the +constant \code{subprocess.PIPE} to create a pipe between the +subprocess and the parent. 
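As a rough sketch of the pipe arrangement just described, the following starts a child process with pipes for its standard input and output. The child command is an assumption made for the example: it simply re-runs the current interpreter (\code{sys.executable}) with a one-line program that upper-cases its input. The \method{communicate()} method of \class{Popen} is used to send the data and collect the output.

```python
import subprocess
import sys

# Assumed child program for this example: upper-case whatever arrives on stdin.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdin=subprocess.PIPE,    # parent writes to the child
                        stdout=subprocess.PIPE,   # parent reads from the child
                        universal_newlines=True)  # exchange text, not raw bytes

# Send the input, close the pipe, and wait for the child to finish.
out, err = proc.communicate("hello\n")
status = proc.returncode
```

Because \var{stderr} was not redirected, \code{err} is \code{None} and the child's error stream is inherited from the parent.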
+ +The constructor has a number of handy options: + +\begin{itemize} + \item \var{close_fds} requests that all file descriptors be closed + before running the subprocess. + + \item \var{cwd} specifies the working directory in which the + subprocess will be executed (defaulting to whatever the parent's + working directory is). + + \item \var{env} is a dictionary specifying environment variables. + + \item \var{preexec_fn} is a function that gets called before the + child is started. + + \item \var{universal_newlines} opens the child's input and output + using Python's universal newline feature. + +\end{itemize} + +Once you've created the \class{Popen} instance, +you can call its \method{wait()} method to pause until the subprocess +has exited, \method{poll()} to check if it's exited without pausing, +or \method{communicate(\var{data})} to send the string \var{data} to +the subprocess's standard input. \method{communicate(\var{data})} +then reads any data that the subprocess has sent to its standard output +or standard error, returning a tuple \code{(\var{stdout_data}, +\var{stderr_data})}. + +\function{call()} is a shortcut that passes its arguments along to the +\class{Popen} constructor, waits for the command to complete, and +returns the status code of the subprocess. It can serve as a safer +analog to \function{os.system()}: + +\begin{verbatim} +sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb']) +if sts == 0: + # Success + ... +else: + # dpkg returned an error + ... +\end{verbatim} + +The command is invoked without use of the shell. If you really do want to +use the shell, you can add \code{shell=True} as a keyword argument and provide +a string instead of a sequence: + +\begin{verbatim} +sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True) +\end{verbatim} + +The PEP takes various examples of shell and Python code and shows how +they'd be translated into Python code that uses \module{subprocess}. 
+Reading this section of the PEP is highly recommended. + +\begin{seealso} +\seepep{324}{subprocess - New process module}{Written and implemented by Peter {\AA}strand, with assistance from Fredrik Lundh and others.} +\end{seealso} + + +%====================================================================== +\section{PEP 327: Decimal Data Type} + +Python has always supported floating-point (FP) numbers, based on the +underlying C \ctype{double} type, as a data type. However, while most +programming languages provide a floating-point type, many people (even +programmers) are unaware that floating-point numbers don't represent +certain decimal fractions accurately. The new \class{Decimal} type +can represent these fractions accurately, up to a user-specified +precision limit. + + +\subsection{Why is Decimal needed?} + +The limitations arise from the representation used for floating-point numbers. +FP numbers are made up of three components: + +\begin{itemize} +\item The sign, which is positive or negative. +\item The mantissa, which is a single-digit binary number +followed by a fractional part. For example, \code{1.01} in base-2 notation +is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation. +\item The exponent, which tells where the decimal point is located in the number represented. +\end{itemize} + +For example, the number 1.25 has positive sign, a mantissa value of +1.01 (in binary), and an exponent of 0 (the decimal point doesn't need +to be shifted). The number 5 has the same sign and mantissa, but the +exponent is 2 because the mantissa is multiplied by 4 (2 to the power +of the exponent 2); 1.25 * 4 equals 5. + +Modern systems usually provide floating-point support that conforms to +a standard called IEEE 754. C's \ctype{double} type is usually +implemented as a 64-bit IEEE 754 number, which uses 52 bits of space +for the mantissa. This means that numbers can only be specified to 52 +bits of precision. 
If you're trying to represent numbers whose +expansion repeats endlessly, the expansion is cut off after 52 bits. +Unfortunately, most software needs to produce output in base 10, and +common fractions in base 10 are often repeating decimals in binary. +For example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 = +1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE +754 has to chop off that infinitely repeated decimal after 52 digits, +so the representation is slightly inaccurate. + +Sometimes you can see this inaccuracy when the number is printed: +\begin{verbatim} +>>> 1.1 +1.1000000000000001 +\end{verbatim} + +The inaccuracy isn't always visible when you print the number because +the FP-to-decimal-string conversion is provided by the C library, and +most C libraries try to produce sensible output. Even if it's not +displayed, however, the inaccuracy is still there and subsequent +operations can magnify the error. + +For many applications this doesn't matter. If I'm plotting points and +displaying them on my monitor, the difference between 1.1 and +1.1000000000000001 is too small to be visible. Reports often limit +output to a certain number of decimal places, and if you round the +number to two or three or even eight decimal places, the error is +never apparent. However, for applications where it does matter, +it's a lot of work to implement your own custom arithmetic routines. + +Hence, the \class{Decimal} type was created. + +\subsection{The \class{Decimal} type} + +A new module, \module{decimal}, was added to Python's standard +library. It contains two classes, \class{Decimal} and +\class{Context}. \class{Decimal} instances represent numbers, and +\class{Context} instances are used to wrap up various settings such as +the precision and default rounding mode. + +\class{Decimal} instances are immutable, like regular Python integers +and FP numbers; once it's been created, you can't change the value an +instance represents. 
\class{Decimal} instances can be created from +integers or strings: + +\begin{verbatim} +>>> import decimal +>>> decimal.Decimal(1972) +Decimal("1972") +>>> decimal.Decimal("1.1") +Decimal("1.1") +\end{verbatim} + +You can also provide tuples containing the sign, the mantissa represented +as a tuple of decimal digits, and the exponent: + +\begin{verbatim} +>>> decimal.Decimal((1, (1, 4, 7, 5), -2)) +Decimal("-14.75") +\end{verbatim} + +Cautionary note: the sign bit is a Boolean value, so 0 is positive and +1 is negative. + +Converting from floating-point numbers poses a bit of a problem: +should the FP number representing 1.1 turn into the decimal number for +exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? +The decision was to dodge the issue and leave such a conversion out of +the API. Instead, you should convert the floating-point number into a +string using the desired precision and pass the string to the +\class{Decimal} constructor: + +\begin{verbatim} +>>> f = 1.1 +>>> decimal.Decimal(str(f)) +Decimal("1.1") +>>> decimal.Decimal('%.12f' % f) +Decimal("1.100000000000") +\end{verbatim} + +Once you have \class{Decimal} instances, you can perform the usual +mathematical operations on them. One limitation: exponentiation +requires an integer exponent: + +\begin{verbatim} +>>> a = decimal.Decimal('35.72') +>>> b = decimal.Decimal('1.73') +>>> a+b +Decimal("37.45") +>>> a-b +Decimal("33.99") +>>> a*b +Decimal("61.7956") +>>> a/b +Decimal("20.64739884393063583815028902") +>>> a ** 2 +Decimal("1275.9184") +>>> a**b +Traceback (most recent call last): + ... +decimal.InvalidOperation: x ** (non-integer) +\end{verbatim} + +You can combine \class{Decimal} instances with integers, but not with +floating-point numbers: + +\begin{verbatim} +>>> a + 4 +Decimal("39.72") +>>> a + 4.5 +Traceback (most recent call last): + ... +TypeError: You can interact Decimal only with int, long or Decimal data types. 
+>>> +\end{verbatim} + +\class{Decimal} numbers can be used with the \module{math} and +\module{cmath} modules, but note that they'll be immediately converted to +floating-point numbers before the operation is performed, resulting in +a possible loss of precision and accuracy. You'll also get back a +regular floating-point number and not a \class{Decimal}. + +\begin{verbatim} +>>> import math, cmath +>>> d = decimal.Decimal('123456789012.345') +>>> math.sqrt(d) +351364.18288201344 +>>> cmath.sqrt(-d) +351364.18288201344j +\end{verbatim} + +\class{Decimal} instances have a \method{sqrt()} method that +returns a \class{Decimal}, but if you need other things such as +trigonometric functions you'll have to implement them. + +\begin{verbatim} +>>> d.sqrt() +Decimal("351364.1828820134592177245001") +\end{verbatim} + + +\subsection{The \class{Context} type} + +Instances of the \class{Context} class encapsulate several settings for +decimal operations: + +\begin{itemize} + \item \member{prec} is the precision, the number of significant digits. + \item \member{rounding} specifies the rounding mode. The \module{decimal} + module has constants for the various possibilities: + \constant{ROUND_DOWN}, \constant{ROUND_CEILING}, + \constant{ROUND_HALF_EVEN}, and various others. + \item \member{traps} is a dictionary specifying what happens on +encountering certain error conditions: either an exception is raised or +a value is returned. Some examples of error conditions are +division by zero, loss of precision, and overflow. +\end{itemize} + +There's a thread-local default context available by calling +\function{getcontext()}; you can change the properties of this context +to alter the default precision, rounding, or trap handling. 
The +following example shows the effect of changing the precision of the default +context: + +\begin{verbatim} +>>> decimal.getcontext().prec +28 +>>> decimal.Decimal(1) / decimal.Decimal(7) +Decimal("0.1428571428571428571428571429") +>>> decimal.getcontext().prec = 9 +>>> decimal.Decimal(1) / decimal.Decimal(7) +Decimal("0.142857143") +\end{verbatim} + +The default action for error conditions is selectable; the module can +either return a special value such as infinity or not-a-number, or +exceptions can be raised: + +\begin{verbatim} +>>> decimal.Decimal(1) / decimal.Decimal(0) +Traceback (most recent call last): + ... +decimal.DivisionByZero: x / 0 +>>> decimal.getcontext().traps[decimal.DivisionByZero] = False +>>> decimal.Decimal(1) / decimal.Decimal(0) +Decimal("Infinity") +>>> +\end{verbatim} + +The \class{Context} instance also has various methods for formatting +numbers such as \method{to_eng_string()} and \method{to_sci_string()}. + +For more information, see the documentation for the \module{decimal} +module, which includes a quick-start tutorial and a reference. + +\begin{seealso} +\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented + by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.} + +\seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html} +{A more detailed overview of the IEEE-754 representation.} + +\seeurl{http://www.lahey.com/float.htm} +{The article uses Fortran code to illustrate many of the problems +that floating-point inaccuracy can cause.} + +\seeurl{http://www2.hursley.ibm.com/decimal/} +{A description of a decimal-based representation. This representation +is being proposed as a standard, and underlies the new Python decimal +type. 
Much of this material was written by Mike Cowlishaw, designer of the +Rexx language.} + +\end{seealso} + + +%====================================================================== +\section{PEP 328: Multi-line Imports} + +One language change is a small syntactic tweak aimed at making it +easier to import many names from a module. In a +\code{from \var{module} import \var{names}} statement, +\var{names} is a sequence of names separated by commas. If the sequence is +very long, you can either write multiple imports from the same module, +or you can use backslashes to escape the line endings like this: + +\begin{verbatim} +from SimpleXMLRPCServer import SimpleXMLRPCServer,\ + SimpleXMLRPCRequestHandler,\ + CGIXMLRPCRequestHandler,\ + resolve_dotted_attribute +\end{verbatim} + +The syntactic change in Python 2.4 simply allows putting the names +within parentheses. Python ignores newlines within a parenthesized +expression, so the backslashes are no longer needed: + +\begin{verbatim} +from SimpleXMLRPCServer import (SimpleXMLRPCServer, + SimpleXMLRPCRequestHandler, + CGIXMLRPCRequestHandler, + resolve_dotted_attribute) +\end{verbatim} + +The PEP also proposes that all \keyword{import} statements be absolute +imports, with a leading \samp{.} character to indicate a relative +import. This part of the PEP was not implemented for Python 2.4, +but was completed for Python 2.5. + +\begin{seealso} +\seepep{328}{Imports: Multi-Line and Absolute/Relative} + {Written by Aahz. Multi-line imports were implemented by + Dima Dorfman.} +\end{seealso} + + +%====================================================================== +\section{PEP 331: Locale-Independent Float/String Conversions} + +The \module{locale} module lets Python software select various +conversions and display conventions that are localized to a particular +country or language. 
However, the module was careful to not change +the numeric locale because various functions in Python's +implementation required that the numeric locale remain set to the +\code{'C'} locale. Often this was because the code was using the C library's +\cfunction{atof()} function. + +Not setting the numeric locale caused trouble for extensions that used +third-party C libraries, however, because they wouldn't have the +correct locale set. The motivating example was GTK+, whose user +interface widgets weren't displaying numbers in the current locale. + +The solution described in the PEP is to add three new functions to the +Python API that perform ASCII-only conversions, ignoring the locale +setting: + +\begin{itemize} + \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})} +and \cfunction{PyOS_ascii_atof(\var{str}, \var{ptr})} +both convert a string to a C \ctype{double}. + \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string. +\end{itemize} + +The code for these functions came from the GLib library +(\url{http://developer.gnome.org/arch/gtk/glib.html}), whose +developers kindly relicensed the relevant functions and donated them +to the Python Software Foundation. The \module{locale} module +can now change the numeric locale, letting extensions such as GTK+ +produce the correct results. + +\begin{seealso} +\seepep{331}{Locale-Independent Float/String Conversions} +{Written by Christian R. Reis, and implemented by Gustavo Carneiro.} +\end{seealso} + +%====================================================================== +\section{Other Language Changes} + +Here are all of the changes that Python 2.4 makes to the core Python +language. + +\begin{itemize} + +\item Decorators for functions and methods were added (\pep{318}). + +\item Built-in \function{set} and \function{frozenset} types were +added (\pep{218}). 
Other new built-ins include the \function{reversed(\var{seq})} function (\pep{322}).

\item Generator expressions were added (\pep{289}).

\item Certain numeric expressions no longer return values restricted to 32 or 64 bits (\pep{237}).

\item You can now put parentheses around the list of names in a
\code{from \var{module} import \var{names}} statement (\pep{328}).

\item The \method{dict.update()} method now accepts the same
argument forms as the \class{dict} constructor. This includes any
mapping, any iterable of key/value pairs, and keyword arguments.
(Contributed by Raymond Hettinger.)

\item The string methods \method{ljust()}, \method{rjust()}, and
\method{center()} now take an optional argument for specifying a
fill character other than a space.
(Contributed by Raymond Hettinger.)

\item Strings also gained an \method{rsplit()} method that
works like the \method{split()} method but splits from the end of
the string.
(Contributed by Sean Reifschneider.)

\begin{verbatim}
>>> 'www.python.org'.split('.', 1)
['www', 'python.org']
>>> 'www.python.org'.rsplit('.', 1)
['www.python', 'org']
\end{verbatim}

\item Three keyword parameters, \var{cmp}, \var{key}, and
\var{reverse}, were added to the \method{sort()} method of lists.
These parameters make some common usages of \method{sort()} simpler.
All of these parameters are optional.

For the \var{cmp} parameter, the value should be a comparison function
that takes two parameters and returns -1, 0, or +1 depending on how
the parameters compare. This function will then be used to sort the
list. Previously this was the only parameter that could be provided
to \method{sort()}.

\var{key} should be a single-parameter function that takes a list
element and returns a comparison key for the element. The list is
then sorted using the comparison keys.
The following example sorts a
list case-insensitively:

\begin{verbatim}
>>> L = ['A', 'b', 'c', 'D']
>>> L.sort()                 # Case-sensitive sort
>>> L
['A', 'D', 'b', 'c']
>>> # Using 'key' parameter to sort list
>>> L.sort(key=lambda x: x.lower())
>>> L
['A', 'b', 'c', 'D']
>>> # Old-fashioned way
>>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

The last example, which uses the \var{cmp} parameter, is the old way
to perform a case-insensitive sort. It works but is slower than using
a \var{key} parameter. Using \var{key} calls the \method{lower()} method
once for each element in the list while using \var{cmp} will call it
twice for each comparison, so using \var{key} saves on invocations of
the \method{lower()} method.

For simple key functions and comparison functions, it is often
possible to avoid a \keyword{lambda} expression by using an unbound
method instead. For example, the above case-insensitive sort is best
written as:

\begin{verbatim}
>>> L.sort(key=str.lower)
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

Finally, the \var{reverse} parameter takes a Boolean value. If the
value is true, the list will be sorted into reverse order.
Instead of \code{L.sort() ; L.reverse()}, you can now write
\code{L.sort(reverse=True)}.

The results of sorting are now guaranteed to be stable. This means
that two entries with equal keys will be returned in the same order as
they were input. For example, you can sort a list of people by name,
and then sort the list by age, resulting in a list sorted by age where
people with the same age are in name-sorted order.

(All changes to \method{sort()} contributed by Raymond Hettinger.)

\item There is a new built-in function
\function{sorted(\var{iterable})} that works like the in-place
\method{list.sort()} method but can be used in
expressions.
The differences are:
  \begin{itemize}
  \item the input may be any iterable;
  \item a newly formed copy is sorted, leaving the original intact; and
  \item the expression returns the new sorted copy.
  \end{itemize}

\begin{verbatim}
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in sorted(L)]       # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]
>>> sorted('Monty Python')          # any iterable may be an input
[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item Integer operations will no longer trigger an \exception{OverflowWarning}.
The \exception{OverflowWarning} warning will disappear in Python 2.5.

\item The interpreter gained a new switch, \programopt{-m}, that
takes a name, searches for the corresponding module on \code{sys.path},
and runs the module as a script. For example,
you can now run the Python profiler with \code{python -m profile}.
(Contributed by Nick Coghlan.)

\item The \function{eval(\var{expr}, \var{globals}, \var{locals})}
and \function{execfile(\var{filename}, \var{globals}, \var{locals})}
functions and the \keyword{exec} statement now accept any mapping type
for the \var{locals} parameter. Previously this had to be a regular
Python dictionary. (Contributed by Raymond Hettinger.)

\item The \function{zip()} built-in function and \function{itertools.izip()}
  now return an empty list (an empty iterator, in the case of
  \function{izip()}) if called with no arguments.
  Previously they raised a \exception{TypeError}
  exception. This makes them more
  suitable for use with variable length argument lists:

\begin{verbatim}
>>> def transpose(array):
...    return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
\end{verbatim}
(Contributed by Raymond Hettinger.)

\item Encountering a failure while importing a module no longer leaves
a partially-initialized module object in \code{sys.modules}. The
incomplete module object left behind would fool further imports of the
same module into succeeding, leading to confusing errors.
(Fixed by Tim Peters.)

\item \constant{None} is now a constant; code that binds a new value to
the name \samp{None} is now a syntax error.
(Contributed by Raymond Hettinger.)

\end{itemize}


%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item The inner loops for list and tuple slicing
 were optimized and now run about one-third faster. The inner loops
 for dictionaries were also optimized, resulting in performance boosts for
 \method{keys()}, \method{values()}, \method{items()},
 \method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}.
 (Contributed by Raymond Hettinger.)

\item The machinery for growing and shrinking lists was optimized for
 speed and for space efficiency. Appending and popping from lists now
 runs faster due to more efficient code paths and less frequent use of
 the underlying system \cfunction{realloc()}. List comprehensions
 also benefit. \method{list.extend()} was also optimized and no
 longer converts its argument into a temporary list before extending
 the base list. (Contributed by Raymond Hettinger.)

\item \function{list()}, \function{tuple()}, \function{map()},
  \function{filter()}, and \function{zip()} now run several times
  faster with non-sequence arguments that supply a \method{__len__()}
  method. (Contributed by Raymond Hettinger.)

\item The methods \method{list.__getitem__()},
  \method{dict.__getitem__()}, and \method{dict.__contains__()}
  are now implemented as \class{method_descriptor} objects rather
  than \class{wrapper_descriptor} objects. This form of
  access doubles their performance and makes them more suitable for
  use as arguments to functionals:
  \samp{map(mydict.__getitem__, keylist)}.
  (Contributed by Raymond Hettinger.)

\item Added a new opcode, \code{LIST_APPEND}, that simplifies
  the generated bytecode for list comprehensions and speeds them up
  by about a third. (Contributed by Raymond Hettinger.)

\item The peephole bytecode optimizer has been improved to
produce shorter, faster bytecode; remarkably, the resulting bytecode is
more readable. (Enhanced by Raymond Hettinger.)

\item String concatenations in statements of the form \code{s = s + "abc"}
and \code{s += "abc"} are now performed more efficiently in
certain circumstances. This optimization won't be present in other
Python implementations such as Jython, so you shouldn't rely on it;
using the \method{join()} method of strings is still recommended when
you want to efficiently glue a large number of strings together.
(Contributed by Armin Rigo.)

\end{itemize}

% pystone is almost useless for comparing different versions of Python;
% instead, it excels at predicting relative Python performance on
% different machines.
% So, this section would be more informative if it used other tools
% such as pybench and parrotbench. For a more application oriented
% benchmark, try comparing the timings of test_decimal.py under 2.3
% and 2.4.

The net result of the 2.4 optimizations is that Python 2.4 runs the
pystone benchmark around 5\% faster than Python 2.3 and 35\% faster
than Python 2.2. (pystone is not a particularly good benchmark, but
it's the most commonly used measurement of Python's performance. Your
own applications may show greater or smaller benefits from Python~2.4.)
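
The advice on string concatenation can be checked on your own
interpreter. The following is a minimal sketch rather than a rigorous
benchmark: the helper names \code{concat_with_plus} and
\code{concat_with_join} are made up for illustration, and passing a
callable to \class{timeit.Timer} assumes Python 2.6 or later (on 2.4
the statement would have to be supplied as a string).

```python
import timeit

def concat_with_plus(pieces):
    # Build the result with repeated +=.  CPython may perform this in
    # place, but other implementations are not required to.
    s = ''
    for p in pieces:
        s += p
    return s

def concat_with_join(pieces):
    # The portable idiom: a single join over all of the pieces.
    return ''.join(pieces)

pieces = ['abc'] * 10000

# Both approaches must produce the same string.
assert concat_with_plus(pieces) == concat_with_join(pieces)

for func in (concat_with_plus, concat_with_join):
    timer = timeit.Timer(lambda: func(pieces))
    # Take the best of three runs of 20 calls each.
    best = min(timer.repeat(repeat=3, number=20))
    print('%s: %.4f s' % (func.__name__, best))
```

Absolute timings depend on the machine and interpreter; only the
relative ordering is of interest, and on CPython the gap may be small
precisely because of the optimization described in the list above.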


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received a number of enhancements and
bug fixes. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}

\item The \module{asyncore} module's \function{loop()} function now
 has a \var{count} parameter that lets you perform a limited number
 of passes through the polling loop. The default is still to loop
 forever.

\item The \module{base64} module now has more complete RFC 3548 support
 for Base64, Base32, and Base16 encoding and decoding, including
 optional case folding and optional alternative alphabets.
 (Contributed by Barry Warsaw.)

\item The \module{bisect} module now has an underlying C implementation
 for improved performance.
 (Contributed by Dmitry Vasiliev.)

\item The CJKCodecs collection of East Asian codecs, maintained
by Hye-Shik Chang, was integrated into 2.4.
The new encodings are:

\begin{itemize}
 \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz
 \item Chinese (ROC): big5, cp950
 \item Japanese: cp932, euc-jis-2004, euc-jp,
euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2,
 iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004,
 shift-jis, shift-jisx0213, shift-jis-2004
 \item Korean: cp949, euc-kr, johab, iso-2022-kr
\end{itemize}

\item Some other new encodings were added: HP Roman8,
ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620.

\item The UTF-8 and UTF-16 codecs now cope better with receiving partial input.
Previously the \class{StreamReader} class would try to read more data,
making it impossible to resume decoding from the stream.
The
\method{read()} method will now return as much data as it can and future
calls will resume decoding where previous ones left off.
(Implemented by Walter D\"orwald.)

\item There is a new \module{collections} module for
 various specialized collection datatypes.
 Currently it contains just one type, \class{deque},
 a double-ended queue that supports efficiently adding and removing
 elements from either end:

\begin{verbatim}
>>> from collections import deque
>>> d = deque('ghi')        # make a new deque with three items
>>> d.append('j')           # add a new entry to the right side
>>> d.appendleft('f')       # add a new entry to the left side
>>> d                       # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop()                 # return and remove the rightmost item
'j'
>>> d.popleft()             # return and remove the leftmost item
'f'
>>> list(d)                 # list the contents of the deque
['g', 'h', 'i']
>>> 'h' in d                # search the deque
True
\end{verbatim}

Several modules, such as the \module{Queue} and \module{threading}
modules, now take advantage of \class{collections.deque} for improved
performance. (Contributed by Raymond Hettinger.)

\item The \module{ConfigParser} classes have been enhanced slightly.
 The \method{read()} method now returns a list of the files that
 were successfully parsed, and the \method{set()} method raises
 \exception{TypeError} if passed a \var{value} argument that isn't a
 string. (Contributed by John Belmonte and David Goodger.)

\item The \module{curses} module now supports the ncurses extension
 \function{use_default_colors()}. On platforms where the terminal
 supports transparency, this makes it possible to use a transparent
 background. (Contributed by J\"org Lehmann.)

\item The \module{difflib} module now includes an \class{HtmlDiff} class
that creates an HTML table showing a side by side comparison
of two versions of a text. (Contributed by Dan Gass.)

\item The \module{email} package was updated to version 3.0,
which drops various deprecated APIs and removes support for Python
versions earlier than 2.3. The 3.0 version of the package uses a new
incremental parser for MIME messages, available in the
\module{email.FeedParser} module. The new parser doesn't require
reading the entire message into memory, and doesn't throw exceptions
if a message is malformed; instead it records any problems in the
\member{defects} attribute of the message. (Developed by Anthony
Baxter, Barry Warsaw, Thomas Wouters, and others.)

\item The \module{heapq} module has been converted to C. The resulting
 tenfold improvement in speed makes the module suitable for handling
 high volumes of data. In addition, the module has two new functions
 \function{nlargest()} and \function{nsmallest()} that use heaps to
 find the N largest or smallest values in a dataset without the
 expense of a full sort. (Contributed by Raymond Hettinger.)

\item The \module{httplib} module now contains constants for HTTP
status codes defined in various HTTP-related RFC documents. Constants
have names such as \constant{OK}, \constant{CREATED},
\constant{CONTINUE}, and \constant{MOVED_PERMANENTLY}; use pydoc to
get a full list. (Contributed by Andrew Eland.)

\item The \module{imaplib} module now supports IMAP's THREAD command
(contributed by Yves Dionne) and new \method{deleteacl()} and
\method{myrights()} methods (contributed by Arnaud Mazin).

\item The \module{itertools} module gained a
 \function{groupby(\var{iterable}\optional{, \var{func}})} function.
 \var{iterable} is something that can be iterated over to return a
 stream of elements, and the optional \var{func} parameter is a
 function that takes an element and returns a key value; if omitted,
 the key is simply the element itself.
\function{groupby()} then
 groups the elements into subsequences which have matching values of
 the key, and returns a series of 2-tuples containing the key value
 and an iterator over the subsequence.

Here's an example to make this clearer. The key function (the
\var{func} argument) simply
returns whether a number is even or odd, so the result of
\function{groupby()} is to return consecutive runs of odd or even
numbers.

\begin{verbatim}
>>> import itertools
>>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...    print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
>>>
\end{verbatim}

\function{groupby()} is typically used with sorted input. The logic
for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter
which makes it handy for eliminating, counting, or identifying
duplicate elements:

\begin{verbatim}
>>> word = 'abracadabra'
>>> letters = sorted(word)   # Turn string into a sorted list of letters
>>> letters
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
>>> for k, g in itertools.groupby(letters):
...    print k, list(g)
...
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c']
d ['d']
r ['r', 'r']
>>> # List unique letters
>>> [k for k, g in itertools.groupby(letters)]
['a', 'b', 'c', 'd', 'r']
>>> # Count letter occurrences
>>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
\end{verbatim}

(Contributed by Hye-Shik Chang.)

\item \module{itertools} also gained a function named
\function{tee(\var{iterator}, \var{N})} that returns \var{N} independent
iterators that replicate \var{iterator}. If \var{N} is omitted, the
default is 2.

\begin{verbatim}
>>> L = [1,2,3]
>>> i1, i2 = itertools.tee(L)
>>> i1,i2
(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
>>> list(i1)               # Run the first iterator to exhaustion
[1, 2, 3]
>>> list(i2)               # Run the second iterator to exhaustion
[1, 2, 3]
\end{verbatim}

Note that \function{tee()} has to keep copies of the values returned
by the iterator; in the worst case, it may need to keep all of them.
This should therefore be used carefully if the leading iterator
can run far ahead of the trailing iterator in a long stream of inputs.
If the separation is large, then you might as well use
\function{list()} instead. When the iterators track closely with one
another, \function{tee()} is ideal. Possible applications include
bookmarking, windowing, or lookahead iterators.
(Contributed by Raymond Hettinger.)

\item A number of functions were added to the \module{locale}
module, such as \function{bind_textdomain_codeset()} to specify a
particular encoding and a family of \function{l*gettext()} functions
that return messages in the chosen encoding.
(Contributed by Gustavo Niemeyer.)

\item Some keyword arguments were added to the \module{logging}
package's \function{basicConfig()} function to simplify log
configuration. The default behavior is to log messages to standard
error, but various keyword arguments can be specified to log to a
particular file, change the logging format, or set the logging level.
For example:

\begin{verbatim}
import logging
logging.basicConfig(filename='/var/log/application.log',
    level=0,  # Log all messages
    format='%(levelname)s:%(process)d:%(thread)d:%(message)s')
\end{verbatim}

Other additions to the \module{logging} package include a
\method{log(\var{level}, \var{msg})} convenience method, as well as a
\class{TimedRotatingFileHandler} class that rotates its log files at a
timed interval.
The module already had \class{RotatingFileHandler},
which rotated logs once the file exceeded a certain size. Both
classes derive from a new \class{BaseRotatingHandler} class that can
be used to implement other rotating handlers.

(Changes implemented by Vinay Sajip.)

\item The \module{marshal} module now shares interned strings on unpacking a
data structure. This may shrink the size of certain pickle strings,
but the primary effect is to make \file{.pyc} files significantly smaller.
(Contributed by Martin von~L\"owis.)

\item The \module{nntplib} module's \class{NNTP} class gained
\method{description()} and \method{descriptions()} methods to retrieve
newsgroup descriptions for a single group or for a range of groups.
(Contributed by J\"urgen A. Erhard.)

\item Two new functions were added to the \module{operator} module,
\function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}.
Both functions return callables that take a single argument and return
the corresponding attribute or item; these callables make excellent
data extractors when used with \function{map()} or
\function{sorted()}. For example:

\begin{verbatim}
>>> import operator
>>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
>>> map(operator.itemgetter(0), L)
['c', 'd', 'a', 'b']
>>> map(operator.itemgetter(1), L)
[2, 1, 4, 3]
>>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item
[('d', 1), ('c', 2), ('b', 3), ('a', 4)]
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \module{optparse} module was updated in various ways. The
module now passes its messages through \function{gettext.gettext()},
making it possible to internationalize Optik's help and error
messages. Help messages for options can now include the string
\code{'\%default'}, which will be replaced by the option's default
value. (Contributed by Greg Ward.)

\item The long-term plan is to deprecate the \module{rfc822} module
in some future Python release in favor of the \module{email} package.
To this end, the \function{email.Utils.formatdate()} function has been
changed to make it usable as a replacement for
\function{rfc822.formatdate()}. You may want to write new e-mail
processing code with this in mind. (Change implemented by Anthony
Baxter.)

\item A new \function{urandom(\var{n})} function was added to the
\module{os} module, returning a string containing \var{n} bytes of
random data. This function provides access to platform-specific
sources of randomness such as \file{/dev/urandom} on Linux or the
Windows CryptoAPI. (Contributed by Trevor Perrin.)

\item Another new function: \function{os.path.lexists(\var{path})}
returns true if the file specified by \var{path} exists, whether or
not it's a symbolic link. This differs from the existing
\function{os.path.exists(\var{path})} function, which returns false if
\var{path} is a symlink that points to a destination that doesn't exist.
(Contributed by Beni Cherniavsky.)

\item A new \function{getsid()} function was added to the
\module{posix} module that underlies the \module{os} module.
(Contributed by J. Raynor.)

\item The \module{poplib} module now supports POP over SSL. (Contributed by
Hector Urtubia.)

\item The \module{profile} module can now profile C extension functions.
(Contributed by Nick Bastin.)

\item The \module{random} module has a new method called
 \method{getrandbits(\var{N})} that returns a long integer \var{N}
 bits in length. The existing \method{randrange()} method now uses
 \method{getrandbits()} where appropriate, making generation of
 arbitrarily large random numbers more efficient. (Contributed by
 Raymond Hettinger.)

\item The regular expression language accepted by the \module{re} module
 was extended with simple conditional expressions, written as
 \regexp{(?(\var{group})\var{A}|\var{B})}.
\var{group} is either a
 numeric group ID or a group name defined with \regexp{(?P<group>...)}
 earlier in the expression. If the specified group matched, the
 regular expression pattern \var{A} will be tested against the string; if
 the group didn't match, the pattern \var{B} will be used instead.
 (Contributed by Gustavo Niemeyer.)

\item The \module{re} module is also no longer recursive, thanks to a
massive amount of work by Gustavo Niemeyer. In a recursive regular
expression engine, certain patterns result in a large amount of C
stack space being consumed, and it was possible to overflow the stack.
For example, if you matched a 30000-byte string of \samp{a} characters
against the expression \regexp{(a|b)+}, one stack frame was consumed
per character. Python 2.3 tried to check for stack overflow and raise
a \exception{RuntimeError} exception, but certain patterns could
sidestep the checking and if you were unlucky Python could segfault.
Python 2.4's regular expression engine can match this pattern without
problems.

\item The \module{signal} module now performs tighter error-checking
on the parameters to the \function{signal.signal()} function. For
example, you can't set a handler on the \constant{SIGKILL} signal;
previous versions of Python would quietly accept this, but 2.4 will
raise a \exception{RuntimeError} exception.

\item Two new functions were added to the \module{socket} module.
\function{socketpair()} returns a pair of connected sockets and
\function{getservbyport(\var{port})} looks up the service name for a
given port number. (Contributed by Dave Cole and Barry Warsaw.)

\item The \function{sys.exitfunc()} function has been deprecated. Code
should be using the existing \module{atexit} module, which correctly
handles calling multiple exit functions. Eventually
\function{sys.exitfunc()} will become a purely internal interface,
accessed only by \module{atexit}.

\item The \module{tarfile} module now generates GNU-format tar files
by default. (Contributed by Lars Gustaebel.)

\item The \module{threading} module now has an elegantly simple way to support
thread-local data. The module contains a \class{local} class whose
attribute values are local to different threads.

\begin{verbatim}
import threading

data = threading.local()
data.number = 42
data.url = ('www.python.org', 80)
\end{verbatim}

Other threads can assign and retrieve their own values for the
\member{number} and \member{url} attributes. You can subclass
\class{local} to initialize attributes or to add methods.
(Contributed by Jim Fulton.)

\item The \module{timeit} module now automatically disables periodic
 garbage collection during the timing loop. This change makes
 consecutive timings more comparable. (Contributed by Raymond Hettinger.)

\item The \module{weakref} module now supports a wider variety of objects
 including Python functions, class instances, sets, frozensets, deques,
 arrays, files, sockets, and regular expression pattern objects.
 (Contributed by Raymond Hettinger.)

\item The \module{xmlrpclib} module now supports a multi-call extension for
transmitting multiple XML-RPC calls in a single HTTP operation.
(Contributed by Brian Quinlan.)

\item The \module{mpz}, \module{rotor}, and \module{xreadlines} modules have
been removed.

\end{itemize}


%======================================================================
% whole new modules get described in subsections here

%=====================
\subsection{cookielib}

The \module{cookielib} library supports client-side handling for HTTP
cookies, mirroring the \module{Cookie} module's server-side cookie
support. Cookies are stored in cookie jars; the library transparently
stores cookies offered by the web server in the cookie jar, and
fetches the cookie from the jar when connecting to the server.
As in
web browsers, policy objects control whether cookies are accepted or
not.

In order to store cookies across sessions, two implementations of
cookie jars are provided: one that stores cookies in the Netscape
format so applications can use the Mozilla or Lynx cookie files, and
one that stores cookies in the same format as the Perl libwww library.

\module{urllib2} has been changed to interact with \module{cookielib}:
\class{HTTPCookieProcessor} manages a cookie jar that is used when
accessing URLs.

This module was contributed by John J. Lee.


% ==================
\subsection{doctest}

The \module{doctest} module underwent considerable refactoring thanks
to Edward Loper and Tim Peters. Testing can still be as simple as
running \function{doctest.testmod()}, but the refactorings allow
customizing the module's operation in various ways.

The new \class{DocTestFinder} class extracts the tests from a given
object's docstrings:

\begin{verbatim}
import doctest

def f (x, y):
    """>>> f(2,2)
4
>>> f(3,2)
6
    """
    return x*y

finder = doctest.DocTestFinder()

# Get list of DocTest instances
tests = finder.find(f)
\end{verbatim}

The new \class{DocTestRunner} class then runs individual tests and can
produce a summary of the results:

\begin{verbatim}
runner = doctest.DocTestRunner()
for t in tests:
    tried, failed = runner.run(t)

runner.summarize(verbose=1)
\end{verbatim}

The above example produces the following output:

\begin{verbatim}
1 items passed all tests:
   2 tests in f
2 tests in 1 items.
2 passed and 0 failed.
Test passed.
\end{verbatim}

\class{DocTestRunner} uses an instance of the \class{OutputChecker}
class to compare the expected output with the actual output. This
class takes a number of different flags that customize its behaviour;
ambitious users can also write a completely new subclass of
\class{OutputChecker}.

The default output checker provides a number of handy features.
For example, with the \constant{doctest.ELLIPSIS} option flag,
an ellipsis (\samp{...}) in the expected output matches any substring,
making it easier to accommodate outputs that vary in minor ways:

\begin{verbatim}
def o (n):
    """>>> o(1)
<__main__.C instance at 0x...>
>>>
"""
\end{verbatim}

Another special string, \samp{<BLANKLINE>}, matches a blank line:

\begin{verbatim}
def p (n):
    """>>> p(1)
<BLANKLINE>
>>>
"""
\end{verbatim}

Another new capability is producing a diff-style display of the output
by specifying the \constant{doctest.REPORT_UDIFF} (unified diffs),
\constant{doctest.REPORT_CDIFF} (context diffs), or
\constant{doctest.REPORT_NDIFF} (delta-style) option flags. For example:

\begin{verbatim}
def g (n):
    """>>> g(4)
here
is
a
lengthy
>>>"""
    L = 'here is a rather lengthy list of words'.split()
    for word in L[:n]:
        print word
\end{verbatim}

Running the above function's tests with
\constant{doctest.REPORT_UDIFF} specified, you get the following output:

\begin{verbatim}
**********************************************************************
File "t.py", line 15, in g
Failed example:
    g(4)
Differences (unified diff with -expected +actual):
    @@ -2,3 +2,3 @@
     is
     a
    -lengthy
    +rather
**********************************************************************
\end{verbatim}


% ======================================================================
\section{Build and C API Changes}

Some of the changes to Python's build process and to the C API are:

\begin{itemize}

  \item Three new convenience macros were added for common return
  values from extension functions: \csimplemacro{Py_RETURN_NONE},
  \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.
  (Contributed by Brett Cannon.)

  \item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})},
  decreases the reference count of \var{obj} and sets \var{obj} to the
  null pointer. (Contributed by Jim Fulton.)

  \item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1},
  \var{obj2}, ..., \var{objN})}, constructs tuples from a variable
  length argument list of Python objects. (Contributed by Raymond Hettinger.)

  \item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})},
  implements fast dictionary lookups without masking exceptions raised
  during the look-up process. (Contributed by Raymond Hettinger.)

  \item The \csimplemacro{Py_IS_NAN(\var{X})} macro returns 1 if
  its float or double argument \var{X} is a NaN.
  (Contributed by Tim Peters.)

  \item C code can avoid unnecessary locking by using the new
  \cfunction{PyEval_ThreadsInitialized()} function to tell
  if any thread operations have been performed. If this function
  returns false, no lock operations are needed.
  (Contributed by Nick Coghlan.)

  \item A new function, \cfunction{PyArg_VaParseTupleAndKeywords()},
  is the same as \cfunction{PyArg_ParseTupleAndKeywords()} but takes a
  \ctype{va_list} instead of a number of arguments.
  (Contributed by Greg Chapman.)

  \item A new method flag, \constant{METH_COEXISTS}, allows a function
  defined in slots to co-exist with a \ctype{PyCFunction} having the
  same name. This can halve the access time for a method such as
  \method{set.__contains__()}. (Contributed by Raymond Hettinger.)

  \item Python can now be built with additional profiling for the
  interpreter itself, intended as an aid to people developing the
  Python core. Providing \longprogramopt{--enable-profiling} to the
  \program{configure} script will let you profile the interpreter with
  \program{gprof}, and providing the \longprogramopt{--with-tsc}
  switch enables profiling using the Pentium's Time-Stamp-Counter
  register. Note that the \longprogramopt{--with-tsc} switch is slightly
  misnamed, because the profiling feature also works on the PowerPC
  platform, though that processor architecture doesn't call that
  register ``the TSC register''.
+  (Contributed by Jeremy Hylton.)
+
+  \item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}.
+
+\end{itemize}
+
+
+%======================================================================
+\subsection{Port-Specific Changes}
+
+\begin{itemize}
+
+\item The Windows port now builds under MSVC++ 7.1 as well as version 6.
+  (Contributed by Martin von~L\"owis.)
+
+\end{itemize}
+
+
+
+%======================================================================
+\section{Porting to Python 2.4}
+
+This section lists previously described changes that may require
+changes to your code:
+
+\begin{itemize}
+
+\item Left shifts and hexadecimal/octal constants that are too
+  large no longer trigger a \exception{FutureWarning} and return
+  a value limited to 32 or 64 bits; instead they return a long integer.
+
+\item Integer operations will no longer trigger an \exception{OverflowWarning}.
+The \exception{OverflowWarning} warning will disappear in Python 2.5.
+
+\item When called with no arguments, the \function{zip()} built-in
+  function now returns an empty list and \function{itertools.izip()}
+  returns an empty iterator, instead of raising a \exception{TypeError}
+  exception.
+
+\item You can no longer compare \class{date} instances with
+  \class{datetime} instances from the \module{datetime} module. Two
+  instances of different classes will now always be unequal, and
+  relative comparisons (\code{<}, \code{>}) will raise a \exception{TypeError}.
+
+\item \function{dircache.listdir()} now passes exceptions to the caller
+  instead of returning empty lists.
+
+\item \function{LexicalHandler.startDTD()} used to receive the public and
+  system IDs in the wrong order. This has been corrected; applications
+  relying on the wrong order need to be fixed.
+
+\item \function{fcntl.ioctl()} now warns if the \var{mutate}
+  argument is omitted and relevant.
+
+\item The \module{tarfile} module now generates GNU-format tar files
+by default.
+
+\item Encountering a failure while importing a module no longer leaves
+a partially initialized module object in \code{sys.modules}.
+
+\item \constant{None} is now a constant; binding a new value to
+the name \samp{None} is now a syntax error.
+
+\item The \function{signal.signal()} function now raises a
+\exception{RuntimeError} exception for certain illegal values;
+previously these errors would pass silently. For example, you can no
+longer set a handler on the \constant{SIGKILL} signal.
+
+\end{itemize}
+
+
+%======================================================================
+\section{Acknowledgements \label{acks}}
+
+The author would like to thank the following people for offering
+suggestions, corrections and assistance with various drafts of this
+article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger,
+Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider,
+Sadruddin Rejeb.
+
+\end{document}