diff options
author | Ori Bernstein <ori@eigenstate.org> | 2021-06-14 00:00:37 +0000 |
---|---|---|
committer | Ori Bernstein <ori@eigenstate.org> | 2021-06-14 00:00:37 +0000 |
commit | a73a964e51247ed169d322c725a3a18859f109a3 (patch) | |
tree | 3f752d117274d444bda44e85609aeac1acf313f3 /sys/src/cmd/python/Doc/lib/libdifflib.tex | |
parent | e64efe273fcb921a61bf27d33b230c4e64fcd425 (diff) |
python, hg: tow outside the environment.
they've served us well, and can ride off into the sunset.
Diffstat (limited to 'sys/src/cmd/python/Doc/lib/libdifflib.tex')
-rw-r--r-- | sys/src/cmd/python/Doc/lib/libdifflib.tex | 704 |
1 files changed, 0 insertions, 704 deletions
diff --git a/sys/src/cmd/python/Doc/lib/libdifflib.tex b/sys/src/cmd/python/Doc/lib/libdifflib.tex deleted file mode 100644 index acb5ed1c3..000000000 --- a/sys/src/cmd/python/Doc/lib/libdifflib.tex +++ /dev/null @@ -1,704 +0,0 @@ -\section{\module{difflib} --- - Helpers for computing deltas} - -\declaremodule{standard}{difflib} -\modulesynopsis{Helpers for computing differences between objects.} -\moduleauthor{Tim Peters}{tim_one@users.sourceforge.net} -\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net} -% LaTeXification by Fred L. Drake, Jr. <fdrake@acm.org>. - -\versionadded{2.1} - - -\begin{classdesc*}{SequenceMatcher} - This is a flexible class for comparing pairs of sequences of any - type, so long as the sequence elements are hashable. The basic - algorithm predates, and is a little fancier than, an algorithm - published in the late 1980's by Ratcliff and Obershelp under the - hyperbolic name ``gestalt pattern matching.'' The idea is to find - the longest contiguous matching subsequence that contains no - ``junk'' elements (the Ratcliff and Obershelp algorithm doesn't - address junk). The same idea is then applied recursively to the - pieces of the sequences to the left and to the right of the matching - subsequence. This does not yield minimal edit sequences, but does - tend to yield matches that ``look right'' to people. - - \strong{Timing:} The basic Ratcliff-Obershelp algorithm is cubic - time in the worst case and quadratic time in the expected case. - \class{SequenceMatcher} is quadratic time for the worst case and has - expected-case behavior dependent in a complicated way on how many - elements the sequences have in common; best case time is linear. -\end{classdesc*} - -\begin{classdesc*}{Differ} - This is a class for comparing sequences of lines of text, and - producing human-readable differences or deltas. Differ uses - \class{SequenceMatcher} both to compare sequences of lines, and to - compare sequences of characters within similar (near-matching) - lines. - - Each line of a \class{Differ} delta begins with a two-letter code: - -\begin{tableii}{l|l}{code}{Code}{Meaning} - \lineii{'- '}{line unique to sequence 1} - \lineii{'+ '}{line unique to sequence 2} - \lineii{' '}{line common to both sequences} - \lineii{'? '}{line not present in either input sequence} -\end{tableii} - - Lines beginning with `\code{?~}' attempt to guide the eye to - intraline differences, and were not present in either input - sequence. These lines can be confusing if the sequences contain tab - characters. -\end{classdesc*} - -\begin{classdesc*}{HtmlDiff} - - This class can be used to create an HTML table (or a complete HTML file - containing the table) showing a side by side, line by line comparison - of text with inter-line and intra-line change highlights. The table can - be generated in either full or contextual difference mode. - - The constructor for this class is: - - \begin{funcdesc}{__init__}{\optional{tabsize}\optional{, - wrapcolumn}\optional{, linejunk}\optional{, charjunk}} - - Initializes instance of \class{HtmlDiff}. - - \var{tabsize} is an optional keyword argument to specify tab stop spacing - and defaults to \code{8}. - - \var{wrapcolumn} is an optional keyword to specify column number where - lines are broken and wrapped, defaults to \code{None} where lines are not - wrapped. - - \var{linejunk} and \var{charjunk} are optional keyword arguments passed - into \code{ndiff()} (used by \class{HtmlDiff} to generate the - side by side HTML differences). See \code{ndiff()} documentation for - argument default values and descriptions. - - \end{funcdesc} - - The following methods are public: - - \begin{funcdesc}{make_file}{fromlines, tolines - \optional{, fromdesc}\optional{, todesc}\optional{, context}\optional{, - numlines}} - Compares \var{fromlines} and \var{tolines} (lists of strings) and returns - a string which is a complete HTML file containing a table showing line by - line differences with inter-line and intra-line changes highlighted. - - \var{fromdesc} and \var{todesc} are optional keyword arguments to specify - from/to file column header strings (both default to an empty string). - - \var{context} and \var{numlines} are both optional keyword arguments. - Set \var{context} to \code{True} when contextual differences are to be - shown, else the default is \code{False} to show the full files. - \var{numlines} defaults to \code{5}. When \var{context} is \code{True} - \var{numlines} controls the number of context lines which surround the - difference highlights. When \var{context} is \code{False} \var{numlines} - controls the number of lines which are shown before a difference - highlight when using the "next" hyperlinks (setting to zero would cause - the "next" hyperlinks to place the next difference highlight at the top of - the browser without any leading context). - \end{funcdesc} - - \begin{funcdesc}{make_table}{fromlines, tolines - \optional{, fromdesc}\optional{, todesc}\optional{, context}\optional{, - numlines}} - Compares \var{fromlines} and \var{tolines} (lists of strings) and returns - a string which is a complete HTML table showing line by line differences - with inter-line and intra-line changes highlighted. - - The arguments for this method are the same as those for the - \method{make_file()} method. - \end{funcdesc} - - \file{Tools/scripts/diff.py} is a command-line front-end to this class - and contains a good example of its use. - - \versionadded{2.4} -\end{classdesc*} - -\begin{funcdesc}{context_diff}{a, b\optional{, fromfile}\optional{, - tofile}\optional{, fromfiledate}\optional{, tofiledate}\optional{, - n}\optional{, lineterm}} - Compare \var{a} and \var{b} (lists of strings); return a - delta (a generator generating the delta lines) in context diff - format. - - Context diffs are a compact way of showing just the lines that have - changed plus a few lines of context. The changes are shown in a - before/after style. The number of context lines is set by \var{n} - which defaults to three. - - By default, the diff control lines (those with \code{***} or \code{---}) - are created with a trailing newline. This is helpful so that inputs created - from \function{file.readlines()} result in diffs that are suitable for use - with \function{file.writelines()} since both the inputs and outputs have - trailing newlines. - - For inputs that do not have trailing newlines, set the \var{lineterm} - argument to \code{""} so that the output will be uniformly newline free. - - The context diff format normally has a header for filenames and - modification times. Any or all of these may be specified using strings for - \var{fromfile}, \var{tofile}, \var{fromfiledate}, and \var{tofiledate}. - The modification times are normally expressed in the format returned by - \function{time.ctime()}. If not specified, the strings default to blanks. - - \file{Tools/scripts/diff.py} is a command-line front-end for this - function. - - \versionadded{2.3} -\end{funcdesc} - -\begin{funcdesc}{get_close_matches}{word, possibilities\optional{, - n}\optional{, cutoff}} - Return a list of the best ``good enough'' matches. \var{word} is a - sequence for which close matches are desired (typically a string), - and \var{possibilities} is a list of sequences against which to - match \var{word} (typically a list of strings). - - Optional argument \var{n} (default \code{3}) is the maximum number - of close matches to return; \var{n} must be greater than \code{0}. - - Optional argument \var{cutoff} (default \code{0.6}) is a float in - the range [0, 1]. Possibilities that don't score at least that - similar to \var{word} are ignored. - - The best (no more than \var{n}) matches among the possibilities are - returned in a list, sorted by similarity score, most similar first. - -\begin{verbatim} ->>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy']) -['apple', 'ape'] ->>> import keyword ->>> get_close_matches('wheel', keyword.kwlist) -['while'] ->>> get_close_matches('apple', keyword.kwlist) -[] ->>> get_close_matches('accept', keyword.kwlist) -['except'] -\end{verbatim} -\end{funcdesc} - -\begin{funcdesc}{ndiff}{a, b\optional{, linejunk}\optional{, charjunk}} - Compare \var{a} and \var{b} (lists of strings); return a - \class{Differ}-style delta (a generator generating the delta lines). - - Optional keyword parameters \var{linejunk} and \var{charjunk} are - for filter functions (or \code{None}): - - \var{linejunk}: A function that accepts a single string - argument, and returns true if the string is junk, or false if not. - The default is (\code{None}), starting with Python 2.3. Before then, - the default was the module-level function - \function{IS_LINE_JUNK()}, which filters out lines without visible - characters, except for at most one pound character (\character{\#}). - As of Python 2.3, the underlying \class{SequenceMatcher} class - does a dynamic analysis of which lines are so frequent as to - constitute noise, and this usually works better than the pre-2.3 - default. - - \var{charjunk}: A function that accepts a character (a string of - length 1), and returns if the character is junk, or false if not. - The default is module-level function \function{IS_CHARACTER_JUNK()}, - which filters out whitespace characters (a blank or tab; note: bad - idea to include newline in this!). - - \file{Tools/scripts/ndiff.py} is a command-line front-end to this - function. - -\begin{verbatim} ->>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1), -... 'ore\ntree\nemu\n'.splitlines(1)) ->>> print ''.join(diff), -- one -? ^ -+ ore -? ^ -- two -- three -? - -+ tree -+ emu -\end{verbatim} -\end{funcdesc} - -\begin{funcdesc}{restore}{sequence, which} - Return one of the two sequences that generated a delta. - - Given a \var{sequence} produced by \method{Differ.compare()} or - \function{ndiff()}, extract lines originating from file 1 or 2 - (parameter \var{which}), stripping off line prefixes. - - Example: - -\begin{verbatim} ->>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1), -... 'ore\ntree\nemu\n'.splitlines(1)) ->>> diff = list(diff) # materialize the generated delta into a list ->>> print ''.join(restore(diff, 1)), -one -two -three ->>> print ''.join(restore(diff, 2)), -ore -tree -emu -\end{verbatim} - -\end{funcdesc} - -\begin{funcdesc}{unified_diff}{a, b\optional{, fromfile}\optional{, - tofile}\optional{, fromfiledate}\optional{, tofiledate}\optional{, - n}\optional{, lineterm}} - Compare \var{a} and \var{b} (lists of strings); return a - delta (a generator generating the delta lines) in unified diff - format. - - Unified diffs are a compact way of showing just the lines that have - changed plus a few lines of context. The changes are shown in a - inline style (instead of separate before/after blocks). The number - of context lines is set by \var{n} which defaults to three. - - By default, the diff control lines (those with \code{---}, \code{+++}, - or \code{@@}) are created with a trailing newline. This is helpful so - that inputs created from \function{file.readlines()} result in diffs - that are suitable for use with \function{file.writelines()} since both - the inputs and outputs have trailing newlines. - - For inputs that do not have trailing newlines, set the \var{lineterm} - argument to \code{""} so that the output will be uniformly newline free. - - The context diff format normally has a header for filenames and - modification times. Any or all of these may be specified using strings for - \var{fromfile}, \var{tofile}, \var{fromfiledate}, and \var{tofiledate}. - The modification times are normally expressed in the format returned by - \function{time.ctime()}. If not specified, the strings default to blanks. - - \file{Tools/scripts/diff.py} is a command-line front-end for this - function. - - \versionadded{2.3} -\end{funcdesc} - -\begin{funcdesc}{IS_LINE_JUNK}{line} - Return true for ignorable lines. The line \var{line} is ignorable - if \var{line} is blank or contains a single \character{\#}, - otherwise it is not ignorable. Used as a default for parameter - \var{linejunk} in \function{ndiff()} before Python 2.3. -\end{funcdesc} - - -\begin{funcdesc}{IS_CHARACTER_JUNK}{ch} - Return true for ignorable characters. The character \var{ch} is - ignorable if \var{ch} is a space or tab, otherwise it is not - ignorable. Used as a default for parameter \var{charjunk} in - \function{ndiff()}. -\end{funcdesc} - - -\begin{seealso} - \seetitle[http://www.ddj.com/documents/s=1103/ddj8807c/] - {Pattern Matching: The Gestalt Approach}{Discussion of a - similar algorithm by John W. Ratcliff and D. E. Metzener. - This was published in - \citetitle[http://www.ddj.com/]{Dr. Dobb's Journal} in - July, 1988.} -\end{seealso} - - -\subsection{SequenceMatcher Objects \label{sequence-matcher}} - -The \class{SequenceMatcher} class has this constructor: - -\begin{classdesc}{SequenceMatcher}{\optional{isjunk\optional{, - a\optional{, b}}}} - Optional argument \var{isjunk} must be \code{None} (the default) or - a one-argument function that takes a sequence element and returns - true if and only if the element is ``junk'' and should be ignored. - Passing \code{None} for \var{isjunk} is equivalent to passing - \code{lambda x: 0}; in other words, no elements are ignored. For - example, pass: - -\begin{verbatim} -lambda x: x in " \t" -\end{verbatim} - - if you're comparing lines as sequences of characters, and don't want - to synch up on blanks or hard tabs. - - The optional arguments \var{a} and \var{b} are sequences to be - compared; both default to empty strings. The elements of both - sequences must be hashable. -\end{classdesc} - - -\class{SequenceMatcher} objects have the following methods: - -\begin{methoddesc}{set_seqs}{a, b} - Set the two sequences to be compared. -\end{methoddesc} - -\class{SequenceMatcher} computes and caches detailed information about -the second sequence, so if you want to compare one sequence against -many sequences, use \method{set_seq2()} to set the commonly used -sequence once and call \method{set_seq1()} repeatedly, once for each -of the other sequences. - -\begin{methoddesc}{set_seq1}{a} - Set the first sequence to be compared. The second sequence to be - compared is not changed. -\end{methoddesc} - -\begin{methoddesc}{set_seq2}{b} - Set the second sequence to be compared. The first sequence to be - compared is not changed. -\end{methoddesc} - -\begin{methoddesc}{find_longest_match}{alo, ahi, blo, bhi} - Find longest matching block in \code{\var{a}[\var{alo}:\var{ahi}]} - and \code{\var{b}[\var{blo}:\var{bhi}]}. - - If \var{isjunk} was omitted or \code{None}, - \method{get_longest_match()} returns \code{(\var{i}, \var{j}, - \var{k})} such that \code{\var{a}[\var{i}:\var{i}+\var{k}]} is equal - to \code{\var{b}[\var{j}:\var{j}+\var{k}]}, where - \code{\var{alo} <= \var{i} <= \var{i}+\var{k} <= \var{ahi}} and - \code{\var{blo} <= \var{j} <= \var{j}+\var{k} <= \var{bhi}}. - For all \code{(\var{i'}, \var{j'}, \var{k'})} meeting those - conditions, the additional conditions - \code{\var{k} >= \var{k'}}, - \code{\var{i} <= \var{i'}}, - and if \code{\var{i} == \var{i'}}, \code{\var{j} <= \var{j'}} - are also met. - In other words, of all maximal matching blocks, return one that - starts earliest in \var{a}, and of all those maximal matching blocks - that start earliest in \var{a}, return the one that starts earliest - in \var{b}. - -\begin{verbatim} ->>> s = SequenceMatcher(None, " abcd", "abcd abcd") ->>> s.find_longest_match(0, 5, 0, 9) -(0, 4, 5) -\end{verbatim} - - If \var{isjunk} was provided, first the longest matching block is - determined as above, but with the additional restriction that no - junk element appears in the block. Then that block is extended as - far as possible by matching (only) junk elements on both sides. - So the resulting block never matches on junk except as identical - junk happens to be adjacent to an interesting match. - - Here's the same example as before, but considering blanks to be junk. - That prevents \code{' abcd'} from matching the \code{' abcd'} at the - tail end of the second sequence directly. Instead only the - \code{'abcd'} can match, and matches the leftmost \code{'abcd'} in - the second sequence: - -\begin{verbatim} ->>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd") ->>> s.find_longest_match(0, 5, 0, 9) -(1, 0, 4) -\end{verbatim} - - If no blocks match, this returns \code{(\var{alo}, \var{blo}, 0)}. -\end{methoddesc} - -\begin{methoddesc}{get_matching_blocks}{} - Return list of triples describing matching subsequences. - Each triple is of the form \code{(\var{i}, \var{j}, \var{n})}, and - means that \code{\var{a}[\var{i}:\var{i}+\var{n}] == - \var{b}[\var{j}:\var{j}+\var{n}]}. The triples are monotonically - increasing in \var{i} and \var{j}. - - The last triple is a dummy, and has the value \code{(len(\var{a}), - len(\var{b}), 0)}. It is the only triple with \code{\var{n} == 0}. - % Explain why a dummy is used! - - If - \code{(\var{i}, \var{j}, \var{n})} and - \code{(\var{i'}, \var{j'}, \var{n'})} are adjacent triples in the list, - and the second is not the last triple in the list, then - \code{\var{i}+\var{n} != \var{i'}} or - \code{\var{j}+\var{n} != \var{j'}}; in other words, adjacent triples - always describe non-adjacent equal blocks. - \versionchanged[The guarantee that adjacent triples always describe - non-adjacent blocks was implemented]{2.5} - -\begin{verbatim} ->>> s = SequenceMatcher(None, "abxcd", "abcd") ->>> s.get_matching_blocks() -[(0, 0, 2), (3, 2, 2), (5, 4, 0)] -\end{verbatim} -\end{methoddesc} - -\begin{methoddesc}{get_opcodes}{} - Return list of 5-tuples describing how to turn \var{a} into \var{b}. - Each tuple is of the form \code{(\var{tag}, \var{i1}, \var{i2}, - \var{j1}, \var{j2})}. The first tuple has \code{\var{i1} == - \var{j1} == 0}, and remaining tuples have \var{i1} equal to the - \var{i2} from the preceding tuple, and, likewise, \var{j1} equal to - the previous \var{j2}. - - The \var{tag} values are strings, with these meanings: - -\begin{tableii}{l|l}{code}{Value}{Meaning} - \lineii{'replace'}{\code{\var{a}[\var{i1}:\var{i2}]} should be - replaced by \code{\var{b}[\var{j1}:\var{j2}]}.} - \lineii{'delete'}{\code{\var{a}[\var{i1}:\var{i2}]} should be - deleted. Note that \code{\var{j1} == \var{j2}} in - this case.} - \lineii{'insert'}{\code{\var{b}[\var{j1}:\var{j2}]} should be - inserted at \code{\var{a}[\var{i1}:\var{i1}]}. - Note that \code{\var{i1} == \var{i2}} in this - case.} - \lineii{'equal'}{\code{\var{a}[\var{i1}:\var{i2}] == - \var{b}[\var{j1}:\var{j2}]} (the sub-sequences are - equal).} -\end{tableii} - -For example: - -\begin{verbatim} ->>> a = "qabxcd" ->>> b = "abycdf" ->>> s = SequenceMatcher(None, a, b) ->>> for tag, i1, i2, j1, j2 in s.get_opcodes(): -... print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % -... (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2])) - delete a[0:1] (q) b[0:0] () - equal a[1:3] (ab) b[0:2] (ab) -replace a[3:4] (x) b[2:3] (y) - equal a[4:6] (cd) b[3:5] (cd) - insert a[6:6] () b[5:6] (f) -\end{verbatim} -\end{methoddesc} - -\begin{methoddesc}{get_grouped_opcodes}{\optional{n}} - Return a generator of groups with up to \var{n} lines of context. - - Starting with the groups returned by \method{get_opcodes()}, - this method splits out smaller change clusters and eliminates - intervening ranges which have no changes. - - The groups are returned in the same format as \method{get_opcodes()}. - \versionadded{2.3} -\end{methoddesc} - -\begin{methoddesc}{ratio}{} - Return a measure of the sequences' similarity as a float in the - range [0, 1]. - - Where T is the total number of elements in both sequences, and M is - the number of matches, this is 2.0*M / T. Note that this is - \code{1.0} if the sequences are identical, and \code{0.0} if they - have nothing in common. - - This is expensive to compute if \method{get_matching_blocks()} or - \method{get_opcodes()} hasn't already been called, in which case you - may want to try \method{quick_ratio()} or - \method{real_quick_ratio()} first to get an upper bound. -\end{methoddesc} - -\begin{methoddesc}{quick_ratio}{} - Return an upper bound on \method{ratio()} relatively quickly. - - This isn't defined beyond that it is an upper bound on - \method{ratio()}, and is faster to compute. -\end{methoddesc} - -\begin{methoddesc}{real_quick_ratio}{} - Return an upper bound on \method{ratio()} very quickly. - - This isn't defined beyond that it is an upper bound on - \method{ratio()}, and is faster to compute than either - \method{ratio()} or \method{quick_ratio()}. -\end{methoddesc} - -The three methods that return the ratio of matching to total characters -can give different results due to differing levels of approximation, -although \method{quick_ratio()} and \method{real_quick_ratio()} are always -at least as large as \method{ratio()}: - -\begin{verbatim} ->>> s = SequenceMatcher(None, "abcd", "bcde") ->>> s.ratio() -0.75 ->>> s.quick_ratio() -0.75 ->>> s.real_quick_ratio() -1.0 -\end{verbatim} - - -\subsection{SequenceMatcher Examples \label{sequencematcher-examples}} - - -This example compares two strings, considering blanks to be ``junk:'' - -\begin{verbatim} ->>> s = SequenceMatcher(lambda x: x == " ", -... "private Thread currentThread;", -... "private volatile Thread currentThread;") -\end{verbatim} - -\method{ratio()} returns a float in [0, 1], measuring the similarity -of the sequences. As a rule of thumb, a \method{ratio()} value over -0.6 means the sequences are close matches: - -\begin{verbatim} ->>> print round(s.ratio(), 3) -0.866 -\end{verbatim} - -If you're only interested in where the sequences match, -\method{get_matching_blocks()} is handy: - -\begin{verbatim} ->>> for block in s.get_matching_blocks(): -... print "a[%d] and b[%d] match for %d elements" % block -a[0] and b[0] match for 8 elements -a[8] and b[17] match for 6 elements -a[14] and b[23] match for 15 elements -a[29] and b[38] match for 0 elements -\end{verbatim} - -Note that the last tuple returned by \method{get_matching_blocks()} is -always a dummy, \code{(len(\var{a}), len(\var{b}), 0)}, and this is -the only case in which the last tuple element (number of elements -matched) is \code{0}. - -If you want to know how to change the first sequence into the second, -use \method{get_opcodes()}: - -\begin{verbatim} ->>> for opcode in s.get_opcodes(): -... print "%6s a[%d:%d] b[%d:%d]" % opcode - equal a[0:8] b[0:8] -insert a[8:8] b[8:17] - equal a[8:14] b[17:23] - equal a[14:29] b[23:38] -\end{verbatim} - -See also the function \function{get_close_matches()} in this module, -which shows how simple code building on \class{SequenceMatcher} can be -used to do useful work. - - -\subsection{Differ Objects \label{differ-objects}} - -Note that \class{Differ}-generated deltas make no claim to be -\strong{minimal} diffs. To the contrary, minimal diffs are often -counter-intuitive, because they synch up anywhere possible, sometimes -accidental matches 100 pages apart. Restricting synch points to -contiguous matches preserves some notion of locality, at the -occasional cost of producing a longer diff. - -The \class{Differ} class has this constructor: - -\begin{classdesc}{Differ}{\optional{linejunk\optional{, charjunk}}} - Optional keyword parameters \var{linejunk} and \var{charjunk} are - for filter functions (or \code{None}): - - \var{linejunk}: A function that accepts a single string - argument, and returns true if the string is junk. The default is - \code{None}, meaning that no line is considered junk. - - \var{charjunk}: A function that accepts a single character argument - (a string of length 1), and returns true if the character is junk. - The default is \code{None}, meaning that no character is - considered junk. -\end{classdesc} - -\class{Differ} objects are used (deltas generated) via a single -method: - -\begin{methoddesc}{compare}{a, b} - Compare two sequences of lines, and generate the delta (a sequence - of lines). - - Each sequence must contain individual single-line strings ending - with newlines. Such sequences can be obtained from the - \method{readlines()} method of file-like objects. The delta generated - also consists of newline-terminated strings, ready to be printed as-is - via the \method{writelines()} method of a file-like object. -\end{methoddesc} - - -\subsection{Differ Example \label{differ-examples}} - -This example compares two texts. First we set up the texts, sequences -of individual single-line strings ending with newlines (such sequences -can also be obtained from the \method{readlines()} method of file-like -objects): - -\begin{verbatim} ->>> text1 = ''' 1. Beautiful is better than ugly. -... 2. Explicit is better than implicit. -... 3. Simple is better than complex. -... 4. Complex is better than complicated. -... '''.splitlines(1) ->>> len(text1) -4 ->>> text1[0][-1] -'\n' ->>> text2 = ''' 1. Beautiful is better than ugly. -... 3. Simple is better than complex. -... 4. Complicated is better than complex. -... 5. Flat is better than nested. -... '''.splitlines(1) -\end{verbatim} - -Next we instantiate a Differ object: - -\begin{verbatim} ->>> d = Differ() -\end{verbatim} - -Note that when instantiating a \class{Differ} object we may pass -functions to filter out line and character ``junk.'' See the -\method{Differ()} constructor for details. - -Finally, we compare the two: - -\begin{verbatim} ->>> result = list(d.compare(text1, text2)) -\end{verbatim} - -\code{result} is a list of strings, so let's pretty-print it: - -\begin{verbatim} ->>> from pprint import pprint ->>> pprint(result) -[' 1. Beautiful is better than ugly.\n', - '- 2. Explicit is better than implicit.\n', - '- 3. Simple is better than complex.\n', - '+ 3. Simple is better than complex.\n', - '? ++ \n', - '- 4. Complex is better than complicated.\n', - '? ^ ---- ^ \n', - '+ 4. Complicated is better than complex.\n', - '? ++++ ^ ^ \n', - '+ 5. Flat is better than nested.\n'] -\end{verbatim} - -As a single multi-line string it looks like this: - -\begin{verbatim} ->>> import sys ->>> sys.stdout.writelines(result) - 1. Beautiful is better than ugly. -- 2. Explicit is better than implicit. -- 3. Simple is better than complex. -+ 3. Simple is better than complex. -? ++ -- 4. Complex is better than complicated. -? ^ ---- ^ -+ 4. Complicated is better than complex. -? ++++ ^ ^ -+ 5. Flat is better than nested. -\end{verbatim} |