author Ori Bernstein <ori@eigenstate.org> 2021-06-14 00:00:37 +0000
committer Ori Bernstein <ori@eigenstate.org> 2021-06-14 00:00:37 +0000
commit a73a964e51247ed169d322c725a3a18859f109a3
tree 3f752d117274d444bda44e85609aeac1acf313f3 /sys/src/cmd/python/Doc/lib/libdifflib.tex
parent e64efe273fcb921a61bf27d33b230c4e64fcd425
python, hg: tow outside the environment.
they've served us well, and can ride off into the sunset.
Diffstat (limited to 'sys/src/cmd/python/Doc/lib/libdifflib.tex')
-rw-r--r-- sys/src/cmd/python/Doc/lib/libdifflib.tex 704
1 files changed, 0 insertions, 704 deletions
diff --git a/sys/src/cmd/python/Doc/lib/libdifflib.tex b/sys/src/cmd/python/Doc/lib/libdifflib.tex
deleted file mode 100644
index acb5ed1c3..000000000
--- a/sys/src/cmd/python/Doc/lib/libdifflib.tex
+++ /dev/null
@@ -1,704 +0,0 @@
-\section{\module{difflib} ---
- Helpers for computing deltas}
-
-\declaremodule{standard}{difflib}
-\modulesynopsis{Helpers for computing differences between objects.}
-\moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
-\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
-% LaTeXification by Fred L. Drake, Jr. <fdrake@acm.org>.
-
-\versionadded{2.1}
-
-
-\begin{classdesc*}{SequenceMatcher}
- This is a flexible class for comparing pairs of sequences of any
- type, so long as the sequence elements are hashable. The basic
- algorithm predates, and is a little fancier than, an algorithm
- published in the late 1980's by Ratcliff and Obershelp under the
- hyperbolic name ``gestalt pattern matching.'' The idea is to find
- the longest contiguous matching subsequence that contains no
- ``junk'' elements (the Ratcliff and Obershelp algorithm doesn't
- address junk). The same idea is then applied recursively to the
- pieces of the sequences to the left and to the right of the matching
- subsequence. This does not yield minimal edit sequences, but does
- tend to yield matches that ``look right'' to people.
-
- \strong{Timing:} The basic Ratcliff-Obershelp algorithm is cubic
- time in the worst case and quadratic time in the expected case.
- \class{SequenceMatcher} is quadratic time for the worst case and has
- expected-case behavior dependent in a complicated way on how many
- elements the sequences have in common; best case time is linear.
-\end{classdesc*}
-
-\begin{classdesc*}{Differ}
- This is a class for comparing sequences of lines of text, and
- producing human-readable differences or deltas. Differ uses
- \class{SequenceMatcher} both to compare sequences of lines, and to
- compare sequences of characters within similar (near-matching)
- lines.
-
- Each line of a \class{Differ} delta begins with a two-letter code:
-
-\begin{tableii}{l|l}{code}{Code}{Meaning}
- \lineii{'- '}{line unique to sequence 1}
- \lineii{'+ '}{line unique to sequence 2}
- \lineii{'  '}{line common to both sequences}
- \lineii{'? '}{line not present in either input sequence}
-\end{tableii}
-
- Lines beginning with `\code{?~}' attempt to guide the eye to
- intraline differences, and were not present in either input
- sequence. These lines can be confusing if the sequences contain tab
- characters.
-\end{classdesc*}
-
-\begin{classdesc*}{HtmlDiff}
-
- This class can be used to create an HTML table (or a complete HTML file
- containing the table) showing a side by side, line by line comparison
- of text with inter-line and intra-line change highlights. The table can
- be generated in either full or contextual difference mode.
-
- The constructor for this class is:
-
- \begin{funcdesc}{__init__}{\optional{tabsize}\optional{,
- wrapcolumn}\optional{, linejunk}\optional{, charjunk}}
-
- Initializes instance of \class{HtmlDiff}.
-
- \var{tabsize} is an optional keyword argument to specify tab stop spacing
- and defaults to \code{8}.
-
- \var{wrapcolumn} is an optional keyword argument specifying the column at
- which lines are broken and wrapped; it defaults to \code{None}, in which
- case lines are not wrapped.
-
- \var{linejunk} and \var{charjunk} are optional keyword arguments passed
- into \code{ndiff()} (used by \class{HtmlDiff} to generate the
- side by side HTML differences). See \code{ndiff()} documentation for
- argument default values and descriptions.
-
- \end{funcdesc}
-
- The following methods are public:
-
- \begin{funcdesc}{make_file}{fromlines, tolines
- \optional{, fromdesc}\optional{, todesc}\optional{, context}\optional{,
- numlines}}
- Compares \var{fromlines} and \var{tolines} (lists of strings) and returns
- a string which is a complete HTML file containing a table showing line by
- line differences with inter-line and intra-line changes highlighted.
-
- \var{fromdesc} and \var{todesc} are optional keyword arguments to specify
- from/to file column header strings (both default to an empty string).
-
- \var{context} and \var{numlines} are both optional keyword arguments.
- Set \var{context} to \code{True} when contextual differences are to be
- shown, else the default is \code{False} to show the full files.
- \var{numlines} defaults to \code{5}. When \var{context} is \code{True}
- \var{numlines} controls the number of context lines which surround the
- difference highlights. When \var{context} is \code{False} \var{numlines}
- controls the number of lines which are shown before a difference
- highlight when using the "next" hyperlinks (setting to zero would cause
- the "next" hyperlinks to place the next difference highlight at the top of
- the browser without any leading context).
- \end{funcdesc}
-
- \begin{funcdesc}{make_table}{fromlines, tolines
- \optional{, fromdesc}\optional{, todesc}\optional{, context}\optional{,
- numlines}}
- Compares \var{fromlines} and \var{tolines} (lists of strings) and returns
- a string which is a complete HTML table showing line by line differences
- with inter-line and intra-line changes highlighted.
-
- The arguments for this method are the same as those for the
- \method{make_file()} method.
- \end{funcdesc}
-
- \file{Tools/scripts/diff.py} is a command-line front-end to this class
- and contains a good example of its use.
-
- \versionadded{2.4}
-\end{classdesc*}
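The two public methods of \class{HtmlDiff} can be exercised with a short script. This is only a minimal sketch: it assumes a Python 3 interpreter, and the input lines and the descriptions `old.txt` and `new.txt` are made up for illustration.

```python
from difflib import HtmlDiff

# Hypothetical inputs: two short lists of newline-terminated lines.
old = ['alpha\n', 'beta\n', 'gamma\n']
new = ['alpha\n', 'beta!\n', 'gamma\n']

differ = HtmlDiff(tabsize=4, wrapcolumn=60)

# make_table() returns only the <table> markup, for embedding in a page.
table = differ.make_table(old, new, fromdesc='old.txt', todesc='new.txt')

# make_file() returns a complete HTML document; context=True with
# numlines=1 keeps one line of context around each change.
page = differ.make_file(old, new, fromdesc='old.txt', todesc='new.txt',
                        context=True, numlines=1)

print(len(table), len(page))
```

The table form is the one to pick when the surrounding page already supplies its own markup and styles.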
-
-\begin{funcdesc}{context_diff}{a, b\optional{, fromfile}\optional{,
- tofile}\optional{, fromfiledate}\optional{, tofiledate}\optional{,
- n}\optional{, lineterm}}
- Compare \var{a} and \var{b} (lists of strings); return a
- delta (a generator generating the delta lines) in context diff
- format.
-
- Context diffs are a compact way of showing just the lines that have
- changed plus a few lines of context. The changes are shown in a
- before/after style. The number of context lines is set by \var{n}
- which defaults to three.
-
- By default, the diff control lines (those with \code{***} or \code{---})
- are created with a trailing newline. This is helpful so that inputs created
- from \function{file.readlines()} result in diffs that are suitable for use
- with \function{file.writelines()} since both the inputs and outputs have
- trailing newlines.
-
- For inputs that do not have trailing newlines, set the \var{lineterm}
- argument to \code{""} so that the output will be uniformly newline free.
-
- The context diff format normally has a header for filenames and
- modification times. Any or all of these may be specified using strings for
- \var{fromfile}, \var{tofile}, \var{fromfiledate}, and \var{tofiledate}.
- The modification times are normally expressed in the format returned by
- \function{time.ctime()}. If not specified, the strings default to blanks.
-
- \file{Tools/scripts/diff.py} is a command-line front-end for this
- function.
-
- \versionadded{2.3}
-\end{funcdesc}
-
-\begin{funcdesc}{get_close_matches}{word, possibilities\optional{,
- n}\optional{, cutoff}}
- Return a list of the best ``good enough'' matches. \var{word} is a
- sequence for which close matches are desired (typically a string),
- and \var{possibilities} is a list of sequences against which to
- match \var{word} (typically a list of strings).
-
- Optional argument \var{n} (default \code{3}) is the maximum number
- of close matches to return; \var{n} must be greater than \code{0}.
-
- Optional argument \var{cutoff} (default \code{0.6}) is a float in
- the range [0, 1]. Possibilities that don't score at least that
- similar to \var{word} are ignored.
-
- The best (no more than \var{n}) matches among the possibilities are
- returned in a list, sorted by similarity score, most similar first.
-
-\begin{verbatim}
->>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
-['apple', 'ape']
->>> import keyword
->>> get_close_matches('wheel', keyword.kwlist)
-['while']
->>> get_close_matches('apple', keyword.kwlist)
-[]
->>> get_close_matches('accept', keyword.kwlist)
-['except']
-\end{verbatim}
-\end{funcdesc}
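To see how \var{n} and \var{cutoff} narrow the result, the sketch below reuses the word list from the session above. It assumes a Python 3 interpreter (so \code{print} is a function), and the cutoff of 0.78 is an arbitrary value picked for illustration.

```python
from difflib import get_close_matches

words = ['ape', 'apple', 'peach', 'puppy']

# Defaults: n=3, cutoff=0.6 (the same call as in the session above).
default_matches = get_close_matches('appel', words)

# n=1 keeps only the single best-scoring candidate.
best_only = get_close_matches('appel', words, n=1)

# A stricter, illustrative cutoff drops the weaker candidate 'ape'.
strict = get_close_matches('appel', words, cutoff=0.78)

print(default_matches)  # ['apple', 'ape']
print(best_only)        # ['apple']
print(strict)           # ['apple']
```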
-
-\begin{funcdesc}{ndiff}{a, b\optional{, linejunk}\optional{, charjunk}}
- Compare \var{a} and \var{b} (lists of strings); return a
- \class{Differ}-style delta (a generator generating the delta lines).
-
- Optional keyword parameters \var{linejunk} and \var{charjunk} are
- for filter functions (or \code{None}):
-
- \var{linejunk}: A function that accepts a single string
- argument, and returns true if the string is junk, or false if not.
- The default is \code{None}, starting with Python 2.3. Before then,
- the default was the module-level function
- \function{IS_LINE_JUNK()}, which filters out lines without visible
- characters, except for at most one pound character (\character{\#}).
- As of Python 2.3, the underlying \class{SequenceMatcher} class
- does a dynamic analysis of which lines are so frequent as to
- constitute noise, and this usually works better than the pre-2.3
- default.
-
- \var{charjunk}: A function that accepts a character (a string of
- length 1), and returns true if the character is junk, or false if not.
- The default is module-level function \function{IS_CHARACTER_JUNK()},
- which filters out whitespace characters (a blank or tab; note: bad
- idea to include newline in this!).
-
- \file{Tools/scripts/ndiff.py} is a command-line front-end to this
- function.
-
-\begin{verbatim}
->>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
-...              'ore\ntree\nemu\n'.splitlines(1))
->>> print ''.join(diff),
-- one
-?  ^
-+ ore
-?  ^
-- two
-- three
-?  -
-+ tree
-+ emu
-\end{funcdesc}
-
-\begin{funcdesc}{restore}{sequence, which}
- Return one of the two sequences that generated a delta.
-
- Given a \var{sequence} produced by \method{Differ.compare()} or
- \function{ndiff()}, extract lines originating from file 1 or 2
- (parameter \var{which}), stripping off line prefixes.
-
- Example:
-
-\begin{verbatim}
->>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
-... 'ore\ntree\nemu\n'.splitlines(1))
->>> diff = list(diff) # materialize the generated delta into a list
->>> print ''.join(restore(diff, 1)),
-one
-two
-three
->>> print ''.join(restore(diff, 2)),
-ore
-tree
-emu
-\end{verbatim}
-
-\end{funcdesc}
-
-\begin{funcdesc}{unified_diff}{a, b\optional{, fromfile}\optional{,
- tofile}\optional{, fromfiledate}\optional{, tofiledate}\optional{,
- n}\optional{, lineterm}}
- Compare \var{a} and \var{b} (lists of strings); return a
- delta (a generator generating the delta lines) in unified diff
- format.
-
- Unified diffs are a compact way of showing just the lines that have
- changed plus a few lines of context. The changes are shown in a
- an inline style (instead of separate before/after blocks). The number
- of context lines is set by \var{n} which defaults to three.
-
- By default, the diff control lines (those with \code{---}, \code{+++},
- or \code{@@}) are created with a trailing newline. This is helpful so
- that inputs created from \function{file.readlines()} result in diffs
- that are suitable for use with \function{file.writelines()} since both
- the inputs and outputs have trailing newlines.
-
- For inputs that do not have trailing newlines, set the \var{lineterm}
- argument to \code{""} so that the output will be uniformly newline free.
-
- The unified diff format normally has a header for filenames and
- modification times. Any or all of these may be specified using strings for
- \var{fromfile}, \var{tofile}, \var{fromfiledate}, and \var{tofiledate}.
- The modification times are normally expressed in the format returned by
- \function{time.ctime()}. If not specified, the strings default to blanks.
-
- \file{Tools/scripts/diff.py} is a command-line front-end for this
- function.
-
- \versionadded{2.3}
-\end{funcdesc}
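The effect of \var{lineterm} is easiest to see on inputs that carry no trailing newlines. This is a minimal sketch, assuming a Python 3 interpreter; the file names `before.txt` and `after.txt` are placeholders, not real files.

```python
from difflib import unified_diff

# Hypothetical inputs without trailing newlines.
old = ['one', 'two', 'three']
new = ['ore', 'tree', 'emu']

# lineterm='' keeps the control lines (---, +++, @@) newline-free too,
# so every element of the delta can be joined with '\n' uniformly.
delta = list(unified_diff(old, new,
                          fromfile='before.txt', tofile='after.txt',
                          lineterm=''))

print('\n'.join(delta))
```

With the default \var{lineterm}, the control lines would end in a newline while the content lines would not, and the joined output would contain stray blank lines.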
-
-\begin{funcdesc}{IS_LINE_JUNK}{line}
- Return true for ignorable lines. The line \var{line} is ignorable
- if \var{line} is blank or contains a single \character{\#},
- otherwise it is not ignorable. Used as a default for parameter
- \var{linejunk} in \function{ndiff()} before Python 2.3.
-\end{funcdesc}
-
-
-\begin{funcdesc}{IS_CHARACTER_JUNK}{ch}
- Return true for ignorable characters. The character \var{ch} is
- ignorable if \var{ch} is a space or tab, otherwise it is not
- ignorable. Used as a default for parameter \var{charjunk} in
- \function{ndiff()}.
-\end{funcdesc}
-
-
-\begin{seealso}
- \seetitle[http://www.ddj.com/documents/s=1103/ddj8807c/]
- {Pattern Matching: The Gestalt Approach}{Discussion of a
- similar algorithm by John W. Ratcliff and D. E. Metzener.
- This was published in
- \citetitle[http://www.ddj.com/]{Dr. Dobb's Journal} in
- July, 1988.}
-\end{seealso}
-
-
-\subsection{SequenceMatcher Objects \label{sequence-matcher}}
-
-The \class{SequenceMatcher} class has this constructor:
-
-\begin{classdesc}{SequenceMatcher}{\optional{isjunk\optional{,
- a\optional{, b}}}}
- Optional argument \var{isjunk} must be \code{None} (the default) or
- a one-argument function that takes a sequence element and returns
- true if and only if the element is ``junk'' and should be ignored.
- Passing \code{None} for \var{isjunk} is equivalent to passing
- \code{lambda x: 0}; in other words, no elements are ignored. For
- example, pass:
-
-\begin{verbatim}
-lambda x: x in " \t"
-\end{verbatim}
-
- if you're comparing lines as sequences of characters, and don't want
- to synch up on blanks or hard tabs.
-
- The optional arguments \var{a} and \var{b} are sequences to be
- compared; both default to empty strings. The elements of both
- sequences must be hashable.
-\end{classdesc}
-
-
-\class{SequenceMatcher} objects have the following methods:
-
-\begin{methoddesc}{set_seqs}{a, b}
- Set the two sequences to be compared.
-\end{methoddesc}
-
-\class{SequenceMatcher} computes and caches detailed information about
-the second sequence, so if you want to compare one sequence against
-many sequences, use \method{set_seq2()} to set the commonly used
-sequence once and call \method{set_seq1()} repeatedly, once for each
-of the other sequences.
-
-\begin{methoddesc}{set_seq1}{a}
- Set the first sequence to be compared. The second sequence to be
- compared is not changed.
-\end{methoddesc}
-
-\begin{methoddesc}{set_seq2}{b}
- Set the second sequence to be compared. The first sequence to be
- compared is not changed.
-\end{methoddesc}
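The caching advice above suggests the following pattern for comparing one sequence against many. It is a sketch only, assuming a Python 3 interpreter; the target string and the candidate list are invented for the example.

```python
from difflib import SequenceMatcher

# Hypothetical data: one fixed target and several candidates.
target = "currentThread"
candidates = ["currentThread", "current_thread", "parentThread", "counter"]

matcher = SequenceMatcher()
matcher.set_seq2(target)            # analysed once; the result is cached

scores = []
for candidate in candidates:
    matcher.set_seq1(candidate)     # only the first sequence changes
    scores.append((matcher.ratio(), candidate))

best_score, best_candidate = max(scores)
print(best_candidate, best_score)
```

Swapping the roles (calling \method{set_seq2()} inside the loop) should give the same scores in this example, but the second sequence would then be re-analysed on every pass.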
-
-\begin{methoddesc}{find_longest_match}{alo, ahi, blo, bhi}
- Find longest matching block in \code{\var{a}[\var{alo}:\var{ahi}]}
- and \code{\var{b}[\var{blo}:\var{bhi}]}.
-
- If \var{isjunk} was omitted or \code{None},
- \method{find_longest_match()} returns \code{(\var{i}, \var{j},
- \var{k})} such that \code{\var{a}[\var{i}:\var{i}+\var{k}]} is equal
- to \code{\var{b}[\var{j}:\var{j}+\var{k}]}, where
- \code{\var{alo} <= \var{i} <= \var{i}+\var{k} <= \var{ahi}} and
- \code{\var{blo} <= \var{j} <= \var{j}+\var{k} <= \var{bhi}}.
- For all \code{(\var{i'}, \var{j'}, \var{k'})} meeting those
- conditions, the additional conditions
- \code{\var{k} >= \var{k'}},
- \code{\var{i} <= \var{i'}},
- and if \code{\var{i} == \var{i'}}, \code{\var{j} <= \var{j'}}
- are also met.
- In other words, of all maximal matching blocks, return one that
- starts earliest in \var{a}, and of all those maximal matching blocks
- that start earliest in \var{a}, return the one that starts earliest
- in \var{b}.
-
-\begin{verbatim}
->>> s = SequenceMatcher(None, " abcd", "abcd abcd")
->>> s.find_longest_match(0, 5, 0, 9)
-(0, 4, 5)
-\end{verbatim}
-
- If \var{isjunk} was provided, first the longest matching block is
- determined as above, but with the additional restriction that no
- junk element appears in the block. Then that block is extended as
- far as possible by matching (only) junk elements on both sides.
- So the resulting block never matches on junk except as identical
- junk happens to be adjacent to an interesting match.
-
- Here's the same example as before, but considering blanks to be junk.
- That prevents \code{' abcd'} from matching the \code{' abcd'} at the
- tail end of the second sequence directly. Instead only the
- \code{'abcd'} can match, and matches the leftmost \code{'abcd'} in
- the second sequence:
-
-\begin{verbatim}
->>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
->>> s.find_longest_match(0, 5, 0, 9)
-(1, 0, 4)
-\end{verbatim}
-
- If no blocks match, this returns \code{(\var{alo}, \var{blo}, 0)}.
-\end{methoddesc}
-
-\begin{methoddesc}{get_matching_blocks}{}
- Return list of triples describing matching subsequences.
- Each triple is of the form \code{(\var{i}, \var{j}, \var{n})}, and
- means that \code{\var{a}[\var{i}:\var{i}+\var{n}] ==
- \var{b}[\var{j}:\var{j}+\var{n}]}. The triples are monotonically
- increasing in \var{i} and \var{j}.
-
- The last triple is a dummy, and has the value \code{(len(\var{a}),
- len(\var{b}), 0)}. It is the only triple with \code{\var{n} == 0}.
- % Explain why a dummy is used!
-
- If
- \code{(\var{i}, \var{j}, \var{n})} and
- \code{(\var{i'}, \var{j'}, \var{n'})} are adjacent triples in the list,
- and the second is not the last triple in the list, then
- \code{\var{i}+\var{n} != \var{i'}} or
- \code{\var{j}+\var{n} != \var{j'}}; in other words, adjacent triples
- always describe non-adjacent equal blocks.
- \versionchanged[The guarantee that adjacent triples always describe
- non-adjacent blocks was implemented]{2.5}
-
-\begin{verbatim}
->>> s = SequenceMatcher(None, "abxcd", "abcd")
->>> s.get_matching_blocks()
-[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
-\end{verbatim}
-\end{methoddesc}
-
-\begin{methoddesc}{get_opcodes}{}
- Return list of 5-tuples describing how to turn \var{a} into \var{b}.
- Each tuple is of the form \code{(\var{tag}, \var{i1}, \var{i2},
- \var{j1}, \var{j2})}. The first tuple has \code{\var{i1} ==
- \var{j1} == 0}, and remaining tuples have \var{i1} equal to the
- \var{i2} from the preceding tuple, and, likewise, \var{j1} equal to
- the previous \var{j2}.
-
- The \var{tag} values are strings, with these meanings:
-
-\begin{tableii}{l|l}{code}{Value}{Meaning}
- \lineii{'replace'}{\code{\var{a}[\var{i1}:\var{i2}]} should be
- replaced by \code{\var{b}[\var{j1}:\var{j2}]}.}
- \lineii{'delete'}{\code{\var{a}[\var{i1}:\var{i2}]} should be
- deleted. Note that \code{\var{j1} == \var{j2}} in
- this case.}
- \lineii{'insert'}{\code{\var{b}[\var{j1}:\var{j2}]} should be
- inserted at \code{\var{a}[\var{i1}:\var{i1}]}.
- Note that \code{\var{i1} == \var{i2}} in this
- case.}
- \lineii{'equal'}{\code{\var{a}[\var{i1}:\var{i2}] ==
- \var{b}[\var{j1}:\var{j2}]} (the sub-sequences are
- equal).}
-\end{tableii}
-
-For example:
-
-\begin{verbatim}
->>> a = "qabxcd"
->>> b = "abycdf"
->>> s = SequenceMatcher(None, a, b)
->>> for tag, i1, i2, j1, j2 in s.get_opcodes():
-... print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
-... (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
- delete a[0:1] (q) b[0:0] ()
-  equal a[1:3] (ab) b[0:2] (ab)
-replace a[3:4] (x) b[2:3] (y)
-  equal a[4:6] (cd) b[3:5] (cd)
- insert a[6:6] () b[5:6] (f)
-\end{verbatim}
-\end{methoddesc}
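One way to check one's reading of the tag table is to replay the opcodes and rebuild \var{b} from \var{a}. The sketch below assumes a Python 3 interpreter and reuses the strings from the example above; the names `pieces` and `rebuilt` are illustrative.

```python
from difflib import SequenceMatcher

a = "qabxcd"
b = "abycdf"
matcher = SequenceMatcher(None, a, b)

pieces = []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag == 'equal':
        pieces.append(a[i1:i2])     # shared run: copy it from a
    elif tag in ('replace', 'insert'):
        pieces.append(b[j1:j2])     # new material comes from b
    # 'delete': a[i1:i2] is dropped, so nothing is appended

rebuilt = ''.join(pieces)
print(rebuilt == b)                 # True
```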
-
-\begin{methoddesc}{get_grouped_opcodes}{\optional{n}}
- Return a generator of groups with up to \var{n} lines of context.
-
- Starting with the groups returned by \method{get_opcodes()},
- this method splits out smaller change clusters and eliminates
- intervening ranges which have no changes.
-
- The groups are returned in the same format as \method{get_opcodes()}.
- \versionadded{2.3}
-\end{methoddesc}
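The grouping is easier to follow on a concrete input. The sketch below assumes a Python 3 interpreter; the twenty-line input and the choice of \code{n=1} are made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical input: twenty distinct lines with two isolated edits.
a = ['line %d' % i for i in range(20)]
b = list(a)
b[2] = 'changed near the top'
b[17] = 'changed near the bottom'

matcher = SequenceMatcher(None, a, b)

# n=1 keeps at most one line of context on each side of a change, so
# the long unchanged run between the two edits splits the opcodes
# into two groups.
groups = list(matcher.get_grouped_opcodes(n=1))

for group in groups:
    for tag, i1, i2, j1, j2 in group:
        print('%7s a[%d:%d] b[%d:%d]' % (tag, i1, i2, j1, j2))
    print('--')
```

Each group corresponds roughly to one hunk of a context or unified diff.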
-
-\begin{methoddesc}{ratio}{}
- Return a measure of the sequences' similarity as a float in the
- range [0, 1].
-
- Where T is the total number of elements in both sequences, and M is
- the number of matches, this is 2.0*M / T. Note that this is
- \code{1.0} if the sequences are identical, and \code{0.0} if they
- have nothing in common.
-
- This is expensive to compute if \method{get_matching_blocks()} or
- \method{get_opcodes()} hasn't already been called, in which case you
- may want to try \method{quick_ratio()} or
- \method{real_quick_ratio()} first to get an upper bound.
-\end{methoddesc}
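The formula 2.0*M / T can be checked by hand against \method{get_matching_blocks()}. This is a small sketch, assuming a Python 3 interpreter; the variable names are illustrative.

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
matcher = SequenceMatcher(None, a, b)

# M: total size of all matching blocks (the trailing dummy adds 0).
matches = sum(size for _, _, size in matcher.get_matching_blocks())
# T: total number of elements in both sequences.
total = len(a) + len(b)

manual = 2.0 * matches / total
print(manual, matcher.ratio())      # 0.75 0.75
```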
-
-\begin{methoddesc}{quick_ratio}{}
- Return an upper bound on \method{ratio()} relatively quickly.
-
- This isn't defined beyond that it is an upper bound on
- \method{ratio()}, and is faster to compute.
-\end{methoddesc}
-
-\begin{methoddesc}{real_quick_ratio}{}
- Return an upper bound on \method{ratio()} very quickly.
-
- This isn't defined beyond that it is an upper bound on
- \method{ratio()}, and is faster to compute than either
- \method{ratio()} or \method{quick_ratio()}.
-\end{methoddesc}
-
-The three methods that return the ratio of matching to total characters
-can give different results due to differing levels of approximation,
-although \method{quick_ratio()} and \method{real_quick_ratio()} are always
-at least as large as \method{ratio()}:
-
-\begin{verbatim}
->>> s = SequenceMatcher(None, "abcd", "bcde")
->>> s.ratio()
-0.75
->>> s.quick_ratio()
-0.75
->>> s.real_quick_ratio()
-1.0
-\end{verbatim}
-
-
-\subsection{SequenceMatcher Examples \label{sequencematcher-examples}}
-
-
-This example compares two strings, considering blanks to be ``junk:''
-
-\begin{verbatim}
->>> s = SequenceMatcher(lambda x: x == " ",
-... "private Thread currentThread;",
-... "private volatile Thread currentThread;")
-\end{verbatim}
-
-\method{ratio()} returns a float in [0, 1], measuring the similarity
-of the sequences. As a rule of thumb, a \method{ratio()} value over
-0.6 means the sequences are close matches:
-
-\begin{verbatim}
->>> print round(s.ratio(), 3)
-0.866
-\end{verbatim}
-
-If you're only interested in where the sequences match,
-\method{get_matching_blocks()} is handy:
-
-\begin{verbatim}
->>> for block in s.get_matching_blocks():
-... print "a[%d] and b[%d] match for %d elements" % block
-a[0] and b[0] match for 8 elements
-a[8] and b[17] match for 21 elements
-a[29] and b[38] match for 0 elements
-\end{verbatim}
-
-Note that the last tuple returned by \method{get_matching_blocks()} is
-always a dummy, \code{(len(\var{a}), len(\var{b}), 0)}, and this is
-the only case in which the last tuple element (number of elements
-matched) is \code{0}.
-
-If you want to know how to change the first sequence into the second,
-use \method{get_opcodes()}:
-
-\begin{verbatim}
->>> for opcode in s.get_opcodes():
-...     print "%6s a[%d:%d] b[%d:%d]" % opcode
- equal a[0:8] b[0:8]
-insert a[8:8] b[8:17]
- equal a[8:29] b[17:38]
-\end{verbatim}
-
-See also the function \function{get_close_matches()} in this module,
-which shows how simple code building on \class{SequenceMatcher} can be
-used to do useful work.
-
-
-\subsection{Differ Objects \label{differ-objects}}
-
-Note that \class{Differ}-generated deltas make no claim to be
-\strong{minimal} diffs. To the contrary, minimal diffs are often
-counter-intuitive, because they synch up anywhere possible, sometimes
-on accidental matches 100 pages apart. Restricting synch points to
-contiguous matches preserves some notion of locality, at the
-occasional cost of producing a longer diff.
-
-The \class{Differ} class has this constructor:
-
-\begin{classdesc}{Differ}{\optional{linejunk\optional{, charjunk}}}
- Optional keyword parameters \var{linejunk} and \var{charjunk} are
- for filter functions (or \code{None}):
-
- \var{linejunk}: A function that accepts a single string
- argument, and returns true if the string is junk. The default is
- \code{None}, meaning that no line is considered junk.
-
- \var{charjunk}: A function that accepts a single character argument
- (a string of length 1), and returns true if the character is junk.
- The default is \code{None}, meaning that no character is
- considered junk.
-\end{classdesc}
-
-\class{Differ} objects are used (deltas generated) via a single
-method:
-
-\begin{methoddesc}{compare}{a, b}
- Compare two sequences of lines, and generate the delta (a sequence
- of lines).
-
- Each sequence must contain individual single-line strings ending
- with newlines. Such sequences can be obtained from the
- \method{readlines()} method of file-like objects. The delta generated
- also consists of newline-terminated strings, ready to be printed as-is
- via the \method{writelines()} method of a file-like object.
-\end{methoddesc}
-
-
-\subsection{Differ Example \label{differ-examples}}
-
-This example compares two texts. First we set up the texts, sequences
-of individual single-line strings ending with newlines (such sequences
-can also be obtained from the \method{readlines()} method of file-like
-objects):
-
-\begin{verbatim}
->>> text1 = '''  1. Beautiful is better than ugly.
-...   2. Explicit is better than implicit.
-...   3. Simple is better than complex.
-...   4. Complex is better than complicated.
-... '''.splitlines(1)
->>> len(text1)
-4
->>> text1[0][-1]
-'\n'
->>> text2 = '''  1. Beautiful is better than ugly.
-...   3.   Simple is better than complex.
-...   4. Complicated is better than complex.
-...   5. Flat is better than nested.
-... '''.splitlines(1)
-\end{verbatim}
-
-Next we instantiate a Differ object:
-
-\begin{verbatim}
->>> d = Differ()
-\end{verbatim}
-
-Note that when instantiating a \class{Differ} object we may pass
-functions to filter out line and character ``junk.'' See the
-\method{Differ()} constructor for details.
-
-Finally, we compare the two:
-
-\begin{verbatim}
->>> result = list(d.compare(text1, text2))
-\end{verbatim}
-
-\code{result} is a list of strings, so let's pretty-print it:
-
-\begin{verbatim}
->>> from pprint import pprint
->>> pprint(result)
-['    1. Beautiful is better than ugly.\n',
- '-   2. Explicit is better than implicit.\n',
- '-   3. Simple is better than complex.\n',
- '+   3.   Simple is better than complex.\n',
- '?     ++                                \n',
- '-   4. Complex is better than complicated.\n',
- '?            ^                     ---- ^  \n',
- '+   4. Complicated is better than complex.\n',
- '?           ++++ ^                      ^  \n',
- '+   5. Flat is better than nested.\n']
-\end{verbatim}
-
-As a single multi-line string it looks like this:
-
-\begin{verbatim}
->>> import sys
->>> sys.stdout.writelines(result)
-    1. Beautiful is better than ugly.
--   2. Explicit is better than implicit.
--   3. Simple is better than complex.
-+   3.   Simple is better than complex.
-?     ++
--   4. Complex is better than complicated.
-?            ^                     ---- ^
-+   4. Complicated is better than complex.
-?           ++++ ^                      ^
-+   5. Flat is better than nested.
-\end{verbatim}