diff options
author | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
---|---|---|
committer | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
commit | 458120dd40db6b4df55a4e96b650e16798ef06a0 (patch) | |
tree | 8f82685be24fef97e715c6f5ca4c68d34d5074ee /sys/src/cmd/python/Doc/lib/libsets.tex | |
parent | 3a742c699f6806c1145aea5149bf15de15a0afd7 (diff) |
add hg and python
Diffstat (limited to 'sys/src/cmd/python/Doc/lib/libsets.tex')
-rw-r--r-- | sys/src/cmd/python/Doc/lib/libsets.tex | 264 |
1 files changed, 264 insertions, 0 deletions
diff --git a/sys/src/cmd/python/Doc/lib/libsets.tex b/sys/src/cmd/python/Doc/lib/libsets.tex new file mode 100644 index 000000000..22bf34bfb --- /dev/null +++ b/sys/src/cmd/python/Doc/lib/libsets.tex @@ -0,0 +1,264 @@ +\section{\module{sets} --- + Unordered collections of unique elements} + +\declaremodule{standard}{sets} +\modulesynopsis{Implementation of sets of unique elements.} +\moduleauthor{Greg V. Wilson}{gvwilson@nevex.com} +\moduleauthor{Alex Martelli}{aleax@aleax.it} +\moduleauthor{Guido van Rossum}{guido@python.org} +\sectionauthor{Raymond D. Hettinger}{python@rcn.com} + +\versionadded{2.3} + +The \module{sets} module provides classes for constructing and manipulating +unordered collections of unique elements. Common uses include membership +testing, removing duplicates from a sequence, and computing standard math +operations on sets such as intersection, union, difference, and symmetric +difference. + +Like other collections, sets support \code{\var{x} in \var{set}}, +\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an +unordered collection, sets do not record element position or order of +insertion. Accordingly, sets do not support indexing, slicing, or +other sequence-like behavior. + +Most set applications use the \class{Set} class which provides every set +method except for \method{__hash__()}. For advanced applications requiring +a hash method, the \class{ImmutableSet} class adds a \method{__hash__()} +method but omits methods which alter the contents of the set. Both +\class{Set} and \class{ImmutableSet} derive from \class{BaseSet}, an +abstract class useful for determining whether something is a set: +\code{isinstance(\var{obj}, BaseSet)}. + +The set classes are implemented using dictionaries. Accordingly, the +requirements for set elements are the same as those for dictionary keys; +namely, that the element defines both \method{__eq__} and \method{__hash__}. +As a result, sets +cannot contain mutable elements such as lists or dictionaries. +However, they can contain immutable collections such as tuples or +instances of \class{ImmutableSet}. For convenience in implementing +sets of sets, inner sets are automatically converted to immutable +form, for example, \code{Set([Set(['dog'])])} is transformed to +\code{Set([ImmutableSet(['dog'])])}. + +\begin{classdesc}{Set}{\optional{iterable}} +Constructs a new empty \class{Set} object. If the optional \var{iterable} +parameter is supplied, updates the set with elements obtained from iteration. +All of the elements in \var{iterable} should be immutable or be transformable +to an immutable using the protocol described in +section~\ref{immutable-transforms}. +\end{classdesc} + +\begin{classdesc}{ImmutableSet}{\optional{iterable}} +Constructs a new empty \class{ImmutableSet} object. If the optional +\var{iterable} parameter is supplied, updates the set with elements obtained +from iteration. All of the elements in \var{iterable} should be immutable or +be transformable to an immutable using the protocol described in +section~\ref{immutable-transforms}. + +Because \class{ImmutableSet} objects provide a \method{__hash__()} method, +they can be used as set elements or as dictionary keys. \class{ImmutableSet} +objects do not have methods for adding or removing elements, so all of the +elements must be known when the constructor is called. +\end{classdesc} + + +\subsection{Set Objects \label{set-objects}} + +Instances of \class{Set} and \class{ImmutableSet} both provide +the following operations: + +\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result} + \lineiii{len(\var{s})}{}{cardinality of set \var{s}} + + \hline + \lineiii{\var{x} in \var{s}}{} + {test \var{x} for membership in \var{s}} + \lineiii{\var{x} not in \var{s}}{} + {test \var{x} for non-membership in \var{s}} + \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}} + {test whether every element in \var{s} is in \var{t}} + \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}} + {test whether every element in \var{t} is in \var{s}} + + \hline + \lineiii{\var{s}.union(\var{t})}{\var{s} \textbar{} \var{t}} + {new set with elements from both \var{s} and \var{t}} + \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}} + {new set with elements common to \var{s} and \var{t}} + \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}} + {new set with elements in \var{s} but not in \var{t}} + \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}} + {new set with elements in either \var{s} or \var{t} but not both} + \lineiii{\var{s}.copy()}{} + {new set with a shallow copy of \var{s}} +\end{tableiii} + +Note, the non-operator versions of \method{union()}, +\method{intersection()}, \method{difference()}, and +\method{symmetric_difference()} will accept any iterable as an argument. +In contrast, their operator based counterparts require their arguments to +be sets. This precludes error-prone constructions like +\code{Set('abc') \&\ 'cbs'} in favor of the more readable +\code{Set('abc').intersection('cbs')}. +\versionchanged[Formerly all arguments were required to be sets]{2.3.1} + +In addition, both \class{Set} and \class{ImmutableSet} +support set to set comparisons. Two sets are equal if and only if +every element of each set is contained in the other (each is a subset +of the other). +A set is less than another set if and only if the first set is a proper +subset of the second set (is a subset, but is not equal). +A set is greater than another set if and only if the first set is a proper +superset of the second set (is a superset, but is not equal). + +The subset and equality comparisons do not generalize to a complete +ordering function. For example, any two disjoint sets are not equal and +are not subsets of each other, so \emph{all} of the following return +\code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or +\code{\var{a}>\var{b}}. +Accordingly, sets do not implement the \method{__cmp__} method. + +Since sets only define partial ordering (subset relationships), the output +of the \method{list.sort()} method is undefined for lists of sets. + +The following table lists operations available in \class{ImmutableSet} +but not found in \class{Set}: + +\begin{tableii}{c|l}{code}{Operation}{Result} + \lineii{hash(\var{s})}{returns a hash value for \var{s}} +\end{tableii} + +The following table lists operations available in \class{Set} +but not found in \class{ImmutableSet}: + +\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result} + \lineiii{\var{s}.update(\var{t})} + {\var{s} \textbar= \var{t}} + {return set \var{s} with elements added from \var{t}} + \lineiii{\var{s}.intersection_update(\var{t})} + {\var{s} \&= \var{t}} + {return set \var{s} keeping only elements also found in \var{t}} + \lineiii{\var{s}.difference_update(\var{t})} + {\var{s} -= \var{t}} + {return set \var{s} after removing elements found in \var{t}} + \lineiii{\var{s}.symmetric_difference_update(\var{t})} + {\var{s} \textasciicircum= \var{t}} + {return set \var{s} with elements from \var{s} or \var{t} + but not both} + + \hline + \lineiii{\var{s}.add(\var{x})}{} + {add element \var{x} to set \var{s}} + \lineiii{\var{s}.remove(\var{x})}{} + {remove \var{x} from set \var{s}; raises \exception{KeyError} + if not present} + \lineiii{\var{s}.discard(\var{x})}{} + {removes \var{x} from set \var{s} if present} + \lineiii{\var{s}.pop()}{} + {remove and return an arbitrary element from \var{s}; raises + \exception{KeyError} if empty} + \lineiii{\var{s}.clear()}{} + {remove all elements from set \var{s}} +\end{tableiii} + +Note, the non-operator versions of \method{update()}, +\method{intersection_update()}, \method{difference_update()}, and +\method{symmetric_difference_update()} will accept any iterable as +an argument. +\versionchanged[Formerly all arguments were required to be sets]{2.3.1} + +Also note, the module also includes a \method{union_update()} method +which is an alias for \method{update()}. The method is included for +backwards compatibility. Programmers should prefer the +\method{update()} method because it is supported by the builtin +\class{set()} and \class{frozenset()} types. + +\subsection{Example \label{set-example}} + +\begin{verbatim} +>>> from sets import Set +>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice']) +>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice']) +>>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack']) +>>> employees = engineers | programmers | managers # union +>>> engineering_management = engineers & managers # intersection +>>> fulltime_management = managers - engineers - programmers # difference +>>> engineers.add('Marvin') # add element +>>> print engineers +Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) +>>> employees.issuperset(engineers) # superset test +False +>>> employees.union_update(engineers) # update from another set +>>> employees.issuperset(engineers) +True +>>> for group in [engineers, programmers, managers, employees]: +... group.discard('Susan') # unconditionally remove element +... print group +... +Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) +Set(['Janice', 'Jack', 'Sam']) +Set(['Jane', 'Zack', 'Jack']) +Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack']) +\end{verbatim} + + +\subsection{Protocol for automatic conversion to immutable + \label{immutable-transforms}} + +Sets can only contain immutable elements. For convenience, mutable +\class{Set} objects are automatically copied to an \class{ImmutableSet} +before being added as a set element. + +The mechanism is to always add a hashable element, or if it is not +hashable, the element is checked to see if it has an +\method{__as_immutable__()} method which returns an immutable equivalent. + +Since \class{Set} objects have a \method{__as_immutable__()} method +returning an instance of \class{ImmutableSet}, it is possible to +construct sets of sets. + +A similar mechanism is needed by the \method{__contains__()} and +\method{remove()} methods which need to hash an element to check +for membership in a set. Those methods check an element for hashability +and, if not, check for a \method{__as_temporarily_immutable__()} method +which returns the element wrapped by a class that provides temporary +methods for \method{__hash__()}, \method{__eq__()}, and \method{__ne__()}. + +The alternate mechanism spares the need to build a separate copy of +the original mutable object. + +\class{Set} objects implement the \method{__as_temporarily_immutable__()} +method which returns the \class{Set} object wrapped by a new class +\class{_TemporarilyImmutableSet}. + +The two mechanisms for adding hashability are normally invisible to the +user; however, a conflict can arise in a multi-threaded environment +where one thread is updating a set while another has temporarily wrapped it +in \class{_TemporarilyImmutableSet}. In other words, sets of mutable sets +are not thread-safe. + + +\subsection{Comparison to the built-in \class{set} types + \label{comparison-to-builtin-set}} + +The built-in \class{set} and \class{frozenset} types were designed based +on lessons learned from the \module{sets} module. The key differences are: + +\begin{itemize} +\item \class{Set} and \class{ImmutableSet} were renamed to \class{set} and + \class{frozenset}. +\item There is no equivalent to \class{BaseSet}. Instead, use + \code{isinstance(x, (set, frozenset))}. +\item The hash algorithm for the built-ins performs significantly better + (fewer collisions) for most datasets. +\item The built-in versions have more space efficient pickles. +\item The built-in versions do not have a \method{union_update()} method. + Instead, use the \method{update()} method which is equivalent. +\item The built-in versions do not have a \method{_repr(sorted=True)} method. + Instead, use the built-in \function{repr()} and \function{sorted()} + functions: \code{repr(sorted(s))}. +\item The built-in version does not have a protocol for automatic conversion + to immutable. Many found this feature to be confusing and no one + in the community reported having found real uses for it. +\end{itemize} |