field | value | date
---|---|---
author | cinap_lenrek <cinap_lenrek@centraldogma> | 2011-07-19 05:12:01 +0200
committer | cinap_lenrek <cinap_lenrek@centraldogma> | 2011-07-19 05:12:01 +0200
commit | b6eee91029e9b7ed76d872d18aa88dc4d85a7e56 |
tree | b187989a64eedab41bc32ade5400325389bcecba /sys/doc/sam/sam.html |
parent | 3b8c921bfa982bcdf287bb34f7a6f1b96c4b5ec8 |
parent | 8c4c1f39f4e369d7c590c9d119f1150a2215e56d |
merge
Diffstat (limited to 'sys/doc/sam/sam.html')
-rw-r--r-- | sys/doc/sam/sam.html | 3705 |
1 files changed, 3705 insertions, 0 deletions
diff --git a/sys/doc/sam/sam.html b/sys/doc/sam/sam.html new file mode 100644 index 000000000..e7bda43b6 --- /dev/null +++ b/sys/doc/sam/sam.html @@ -0,0 +1,3705 @@ +<?xml version="1.0" encoding="utf-8"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" +"http://www.w3.org/TR/html4/loose.dtd"> +<html> +<head> +<meta http-equiv=Content-Type content="text/html; charset=utf8"> +<title>The Text Editor sam</title> +</meta> +</head> +<body> +<p style="margin-top: 0; margin-bottom: 0.50in"></p> +<p style="margin-top: 0; margin-bottom: 0.21in"></p> + +<p style="line-height: 1.4em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: center;"> +<span style="font-size: 12pt"><b>The Text Editor </b></span><span style="font-size: 12pt"><tt>sam</tt></span><span style="font-size: 12pt"><b></b></span></p> +<p style="margin-top: 0; margin-bottom: 0.21in"></p> + +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.4em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: center;"> +<span style="font-size: 10pt"><i>Rob Pike</i></span></p> +<p style="line-height: 1.4em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: center;"> +<span style="font-size: 10pt"><i>rob@plan9.bell-labs.com</i></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="margin-top: 0; margin-bottom: 0.33in"></p> +<p style="line-height: 1.4em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: center;"> +<span style="font-size: 10pt"><i>ABSTRACT</i></span></p> +<p style="margin-top: 0; margin-bottom: 0.19in"></p> +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.50in; text-indent: 0.50in; margin-right: 1.50in; 
margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is an interactive multi-file text editor intended for +bitmap displays. +A textual command language +supplements the mouse-driven, cut-and-paste interface +to make complex or +repetitive editing tasks easy to specify. +The language is characterized by the composition of regular expressions +to describe the structure of the text being modified. +The treatment of files as a database, with changes logged +as atomic transactions, guides the implementation and +makes a general ‘undo’ mechanism straightforward. +</span><span style="font-size: 10pt"></span><span style="font-size: 10pt"></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.50in; text-indent: 0.35in; margin-right: 1.50in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is implemented as two processes connected by a low-bandwidth stream, +one process handling the display and the other the editing +algorithms. Therefore it can run with the display process +in a bitmap terminal and the editor on a local host, +with both processes on a bitmap-equipped host, or with +the display process in the terminal and the editor in a +remote host. +By suppressing the display process, +it can even run without a bitmap terminal. +</span><span style="font-size: 10pt"></span><span style="font-size: 10pt"></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.50in; text-indent: 0.35in; margin-right: 1.50in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">This paper is reprinted from Software—Practice and Experience, +Vol 17, number 11, pp. 813-845, November 1987. 
+The paper has not been updated for the Plan 9 manuals. Although +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +has not changed much since the paper was written, the system around it certainly has. +Nonetheless, the description here still stands as the best introduction to the editor. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.50in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Introduction +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is an interactive text editor that combines cut-and-paste interactive editing with +an unusual command language based on the composition of regular expressions. +It is written as two programs: one, the ‘host part,’ runs on a UNIX system +and implements the command language and provides file access; the other, the +‘terminal part,’ runs asynchronously +on a machine with a mouse and bitmap display +and supports the display and interactive editing. +The host part may even be run in isolation on an ordinary terminal +to edit text using the command +language, much like a traditional line editor, +without assistance from a mouse or display. 
+Most often, +the terminal part runs on a Blit<sup></sup></span><sup><span style="font-size: 6pt">1</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> terminal +(actually on a Teletype DMD 5620, the production version of the Blit), whose +host connection is an ordinary 9600 bps RS232 link; +on the SUN computer the host and display processes run on a single machine, +connected by a pipe. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +edits uninterpreted +ASCII text. +It has no facilities for multiple fonts, graphics or tables, +unlike MacWrite,<sup></sup></span><sup><span style="font-size: 6pt">2</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> Bravo,<sup></sup></span><sup><span style="font-size: 6pt">3</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> Tioga<sup></sup></span><sup><span style="font-size: 6pt">4</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +or Lara.<sup></sup></span><sup><span style="font-size: 6pt">5</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +Also unlike them, it has a rich command language. 
+(Throughout this paper, the phrase +</span><span style="font-size: 10pt"><i>command language +</i></span><span style="font-size: 10pt">refers to +textual commands; commands activated from the mouse form the +</span><span style="font-size: 10pt"><i>mouse</i></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><i>language.</i></span><span style="font-size: 10pt">) +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +developed as an editor for use by programmers, and tries to join +the styles of the UNIX text editor +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"><sup></sup></span><sup><span style="font-size: 6pt">6,7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +with that of interactive cut-and-paste editors by +providing a comfortable mouse-driven interface +to a program with a solid command language driven by regular expressions. +The command language developed more than the mouse language, and +acquired a notation for describing the structure of files +more richly than as a sequence of lines, +using a dataflow-like syntax for specifying changes. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The interactive style was influenced by +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">1</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +an early cut-and-paste editor for the Blit, and by +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">8</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +the Blit window system. +</span><span style="font-size: 10pt"><tt>Mux</tt></span><span style="font-size: 10pt"> +merges the original Blit window system, +</span><span style="font-size: 10pt"><tt>mpx</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">1</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +with cut-and-paste editing, forming something like a +multiplexed version of +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +that edits the output of (and input to) command sessions rather than files. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The first part of this paper describes the command language, then the mouse +language, and explains how they interact. +That is followed by a description of the implementation, +first of the host part, then of the terminal part. 
+A principle that influenced the design of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is that it should have no explicit limits, such as upper limits on +file size or line length. +A secondary consideration is that it be efficient. +To honor these two goals together requires a method for efficiently +manipulating +huge strings (files) without breaking them into lines, +perhaps while making thousands of changes +under control of the command language. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt">’s +method is to +treat the file as a transaction database, implementing changes as atomic +updates. These updates may be unwound easily to ‘undo’ changes. +Efficiency is achieved through a collection of caches that minimizes +disc traffic and data motion, both within the two parts of the program +and between them. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The terminal part of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is fairly straightforward. +More interesting is how the two halves of the editor stay +synchronized when either half may initiate a change. +This is achieved through a data structure that organizes the +communications and is maintained in parallel by both halves. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The last part of the paper chronicles the writing of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +and discusses the lessons that were learned through its development and use. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The paper is long, but is composed largely of two papers of reasonable length: +a description of the user interface of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +and a discussion of its implementation. +They are combined because the implementation is strongly influenced by +the user interface, and vice versa. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>The Interface +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is a text editor for multiple files. +File names may be provided when it is invoked: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>sam file1 file2 ...</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">and there are commands +to add new files and discard unneeded ones. 
+Files are not read until necessary +to complete some command. +Editing operations apply to an internal copy +made when the file is read; the UNIX file associated with the copy +is changed only by an explicit command. +To simplify the discussion, the internal copy is here called a +</span><span style="font-size: 10pt"><i>file</i></span><span style="font-size: 10pt">, +while the disc-resident original is called a +</span><span style="font-size: 10pt"><i>disc file. +</i></span><span style="font-size: 10pt"></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is usually connected to a bitmap display that presents a cut-and-paste +editor driven by the mouse. +In this mode, the command language is still available: +text typed in a special window, called the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><i>window,</i></span><span style="font-size: 10pt"> +is interpreted +as commands to be executed in the current file. +Cut-and-paste editing may be used in any window — even in the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window to construct commands. +The other mode of operation, invoked by starting +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +with the option +</span><span style="font-size: 10pt"><tt>-d</tt></span><span style="font-size: 10pt"> +(for ‘no download’), +does not use the mouse or bitmap display, but still permits +editing using the textual command language, even on an ordinary terminal, +interactively or from a script. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The following sections describe first the command language (under +</span><span style="font-size: 10pt"><tt>sam -d</tt> +and in the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window), and then the mouse interface. +These two languages are nearly independent, but connect through the +</span><span style="font-size: 10pt"><i>current</i></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><i>text,</i></span><span style="font-size: 10pt"> +described below. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>The Command Language +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A file consists of its contents, which are an array of characters +(that is, a string); the +</span><span style="font-size: 10pt"><i>name</i></span><span style="font-size: 10pt"> +of the associated disc file; the +</span><span style="font-size: 10pt"><i>modified bit +</i></span><span style="font-size: 10pt">that states whether the contents match those of +the disc file; +and a substring of the contents, called the +</span><span style="font-size: 10pt"><i>current text +</i></span><span style="font-size: 10pt">or +</span><span style="font-size: 10pt"><i>dot</i></span><span style="font-size: 10pt"> +(see Figures 1 and 2). 
+If the current text is a null string, dot falls between characters. +The +</span><span style="font-size: 10pt"><i>value</i></span><span style="font-size: 10pt"> +of dot is the location of the current text; the +</span><span style="font-size: 10pt"><i>contents</i></span><span style="font-size: 10pt"> +of dot are the characters it contains. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +imparts to the text no two-dimensional interpretation such as columns +or fields; text is always one-dimensional. +Even the idea of a ‘line’ of text as understood by most UNIX programs +— a sequence of characters terminated by a newline character — +is only weakly supported. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><i>current file +</i></span><span style="font-size: 10pt">is the file to which editing commands refer. +The current text is therefore dot in the current file. +If a command doesn’t explicitly name a particular file or piece of text, +the command is assumed to apply to the current text. +For the moment, ignore the presence of multiple files and consider +editing a single file. +</span><span style="font-size: 10pt"></span></p><center><img src="fig1.gif" /></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 1. A typical +</i></span><span style="font-size: 8pt"><tt>sam</tt></span><span style="font-size: 8pt"><i> +screen, with the editing menu presented. +The +</i></span><span style="font-size: 8pt"><tt>sam</tt></span><span style="font-size: 8pt"><i> +(command language) window is in the middle, with file windows above and below. 
+(The user interface makes it easy to create these abutting windows.) +The partially obscured window is a third file window. +The uppermost window is that to which typing and mouse operations apply, +as indicated by its heavy border. +Each window has its current text highlighted in reverse video. +The +</i></span><span style="font-size: 8pt"><tt>sam</tt></span><span style="font-size: 8pt"><i> +window’s current text is the null string on the last visible line, +indicated by a vertical bar. +See also Figure 2. +</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Commands have one-letter names. +Except for non-editing commands such as writing +the file to disc, most commands make some change +to the text in dot and leave dot set to the text resulting from the change. +For example, the delete command, +</span><span style="font-size: 10pt"><tt>d</tt></span><span style="font-size: 10pt">, +deletes the text in dot, replacing it by the null string and setting dot +to the result. +The change command, +</span><span style="font-size: 10pt"><tt>c</tt></span><span style="font-size: 10pt">, +replaces dot by text delimited by an arbitrary punctuation character, +conventionally +a slash. 
Thus, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>c/Peter/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces the text in dot by the string +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">. +Similarly, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>a/Peter/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(append) adds the string after dot, and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>i/Peter/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(insert) inserts before dot. 
+All three leave dot set to the new text, +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Newlines are part of the syntax of commands: +the newline character lexically terminates a command. +Within the inserted text, however, newlines are never implicit. +But since it is often convenient to insert multiple lines of text, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +has a special +syntax for that case: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>a</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>some lines of text</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>to be inserted in the file,</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>terminated by a period</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>on a line by itself</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; 
margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>.</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">In the one-line syntax, a newline character may be specified by a C-like +escape, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>c/\n/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces dot by a single newline character. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +also has a substitute command, +</span><span style="font-size: 10pt"><tt>s</tt></span><span style="font-size: 10pt">: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>s/</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>/</tt></span><span style="font-size: 9pt"><i>replacement</i></span><span style="font-size: 9pt"><tt>/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">substitutes the replacement text for the first match, in dot, +of the regular expression. 
+Thus, if dot is the string +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">, +the command +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>s/t/st/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">changes it to +</span><span style="font-size: 10pt"><tt>Pester</tt></span><span style="font-size: 10pt">. +In general, +</span><span style="font-size: 10pt"><tt>s</tt></span><span style="font-size: 10pt"> +is unnecessary, but it was inherited from +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +and it has some convenient variations. 
+For instance, the replacement text may include the matched text, +specified by +</span><span style="font-size: 10pt"><tt>&</tt></span><span style="font-size: 10pt">: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>s/Peter/Oh, &, &, &, &!/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">There are also three commands that apply programs +to text: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>< </tt></span><span style="font-size: 9pt"><i>UNIX program</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces dot by the output of the UNIX program. +Similarly, the +</span><span style="font-size: 10pt"><tt>></tt></span><span style="font-size: 10pt"> +command +runs the program with dot as its standard input, and +</span><span style="font-size: 10pt"><tt>|</tt></span><span style="font-size: 10pt"> +does both. 
For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>| sort</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces dot by the result of applying the standard sorting utility to it. +Again, newlines have no special significance for these +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +commands. +The text acted upon and resulting from these commands is not necessarily +bounded by newlines, although for connection with UNIX programs, +newlines may be necessary to obey conventions. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">One more command: +</span><span style="font-size: 10pt"><tt>p</tt></span><span style="font-size: 10pt"> +prints the contents of dot. +Table I summarizes +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +commands. 
+</span><span style="font-size: 10pt"></span></p><center><img src="sam0.png"></center> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The value of dot may be changed by +specifying an +</span><span style="font-size: 10pt"><i>address</i></span><span style="font-size: 10pt"> +for the command. +The simplest address is a line number: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>3</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">refers to the third line of the file, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>3d</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">deletes the third line of the file, and implicitly renumbers +the lines so the old line 4 is now numbered 3. 
+(This is one of the few places where +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +deals with lines directly.) +Line +</span><span style="font-size: 10pt"><tt>0</tt></span><span style="font-size: 10pt"> +is the null string at the beginning of the file. +If a command consists of only an address, a +</span><span style="font-size: 10pt"><tt>p</tt></span><span style="font-size: 10pt"> +command is assumed, so typing an unadorned +</span><span style="font-size: 10pt"><tt>3</tt></span><span style="font-size: 10pt"> +prints line 3 on the terminal. +There are a couple of other basic addresses: +a period addresses dot itself; and +a dollar sign +(</span><span style="font-size: 10pt"><tt>$</tt></span><span style="font-size: 10pt">) +addresses the null string at the end of the file. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">An address is always a single substring of the file. +Thus, the address +</span><span style="font-size: 10pt"><tt>3</tt></span><span style="font-size: 10pt"> +addresses the characters +after the second newline of +the file through the third newline of the file. 
+A +</span><span style="font-size: 10pt"><i>compound address +</i></span><span style="font-size: 10pt">is constructed by the comma operator +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><i>address1</i></span><span style="font-size: 9pt"><tt>,</tt></span><span style="font-size: 9pt"><i>address2</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">and addresses the substring of the file from the beginning of +</span><span style="font-size: 10pt"><i>address1</i></span><span style="font-size: 10pt"> +to the end of +</span><span style="font-size: 10pt"><i>address2</i></span><span style="font-size: 10pt">. +For example, the command +</span><span style="font-size: 10pt"><tt>3,5p</tt></span><span style="font-size: 10pt"> +prints the third through fifth lines of the file and +</span><span style="font-size: 10pt"><tt>.,$d</tt></span><span style="font-size: 10pt"> +deletes the text from the beginning of dot to the end of the file. 
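</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">The way line numbers and the comma operator reduce to a single substring can be sketched outside
</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">.
The Python fragment below is an illustration only and makes no claim about how
</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">
is implemented: the file is modeled as one string, an address as a pair of character offsets, and the helper names
</span><span style="font-size: 10pt"><tt>line_address</tt></span><span style="font-size: 10pt">
and
</span><span style="font-size: 10pt"><tt>compound</tt></span><span style="font-size: 10pt">
are hypothetical.
</span></p>
<pre>
```python
def line_address(text, n):
    """Return (start, end) offsets of line n; line 0 is the null string at offset 0."""
    if n == 0:
        return (0, 0)
    start = 0
    for _ in range(n - 1):
        # Skip past the next newline to reach the start of the following line.
        # str.index raises ValueError if the file has too few lines.
        start = text.index("\n", start) + 1
    nl = text.find("\n", start)
    # The addressed line includes its terminating newline when there is one.
    end = len(text) if nl == -1 else nl + 1
    return (start, end)


def compound(a1, a2):
    """address1,address2: from the beginning of address1 to the end of address2."""
    return (a1[0], a2[1])


text = "one\ntwo\nthree\nfour\nfive\nsix\n"
start, end = compound(line_address(text, 3), line_address(text, 5))
print(text[start:end], end="")
```
</pre>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">The last statement prints lines three through five of the sample text, the same substring that
</span><span style="font-size: 10pt"><tt>3,5p</tt></span><span style="font-size: 10pt">
would print for that file.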
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">These addresses are all absolute positions in the file, but +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +also has relative addresses, indicated by +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>-</tt></span><span style="font-size: 10pt">. +For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>$-3</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">is the third line before the end of the file and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>.+1</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">is the line after dot. 
+If no address appears to the left of the +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>-</tt></span><span style="font-size: 10pt">, +dot is assumed; +if nothing appears to the right, +</span><span style="font-size: 10pt"><tt>1</tt></span><span style="font-size: 10pt"> +is assumed. +Therefore, +</span><span style="font-size: 10pt"><tt>.+1</tt></span><span style="font-size: 10pt"> +may be abbreviated to just a plus sign. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +operator acts relative to the end of its first argument, while the +</span><span style="font-size: 10pt"><tt>-</tt></span><span style="font-size: 10pt"> +operator acts relative to the beginning. Thus +</span><span style="font-size: 10pt"><tt>.+1</tt></span><span style="font-size: 10pt"> +addresses the first line after dot, +</span><span style="font-size: 10pt"><tt>.-</tt></span><span style="font-size: 10pt"> +addresses the first line before dot, and +</span><span style="font-size: 10pt"><tt>+-</tt></span><span style="font-size: 10pt"> +refers to the line containing the end of dot. (Dot may span multiple lines, and +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +selects the line after the end of dot, then +</span><span style="font-size: 10pt"><tt>-</tt></span><span style="font-size: 10pt"> +backs up one line.) 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The final type of address is a regular expression, which addresses the +text matched by the expression. The expression is enclosed in slashes, as in +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>/</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The expressions are the same as those in the UNIX program +</span><span style="font-size: 10pt"><tt>egrep</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">6,7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +and include closures, alternations, and so on. +They find the +</span><span style="font-size: 10pt"><i>leftmost longest +</i></span><span style="font-size: 10pt">string that matches the expression, that is, +the first match after the point where the search is started, +and if more than one match begins at the same spot, the longest such match. 
+(I assume familiarity with the syntax for regular expressions in UNIX programs.<sup></sup></span><sup><span style="font-size: 6pt">9</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt">) +For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>/x/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">matches the next +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +character in the file, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>/xx*/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">matches the next run of one or more +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">’s, +and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>/x|Peter/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> 
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">matches the next +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">. +For compatibility with other UNIX programs, the ‘any character’ operator, +a period, +does not match a newline, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>/.*/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">matches the text from dot to the end of the line, but excludes the newline +and so will not match across +the line boundary. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Regular expressions are always relative addresses. +The direction is forwards by default, +so +</span><span style="font-size: 10pt"><tt>/Peter/</tt></span><span style="font-size: 10pt"> +is really an abbreviation for +</span><span style="font-size: 10pt"><tt>+/Peter/</tt></span><span style="font-size: 10pt">. 
+The search can be reversed with a minus sign, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt></tt></span><span style="font-size: 9pt"><tt>-/Peter/</tt></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">finds the first +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt"> +before dot. +Regular expressions may be used with other address forms, so +</span><span style="font-size: 10pt"><tt>0+/Peter/</tt></span><span style="font-size: 10pt"> +finds the first +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt"> +in the file and +</span><span style="font-size: 10pt"><tt>$-/Peter/</tt></span><span style="font-size: 10pt"> +finds the last. +Table II summarizes +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +addresses. 
+</span><span style="font-size: 10pt"></span></p><center><img src="sam1.png"></center> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The language discussed so far will not seem novel +to people who use UNIX text editors +such as +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>vi</tt></span><span style="font-size: 10pt">.<sup></sup></span><sup><span style="font-size: 6pt">9</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +Moreover, the kinds of editing operations these commands allow, with the exception +of regular expressions and line numbers, +are clearly more conveniently handled by a mouse-based interface. +Indeed, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +mouse language (discussed at length below) is the means by which +simple changes are usually made. +For large or repetitive changes, however, a textual language +outperforms a manual interface. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Imagine that, instead of deleting just one occurrence of the string +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">, +we wanted to eliminate every +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">. +What’s needed is an iterator that runs a command for each occurrence of some +text. 
+</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt">’s +iterator is called +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">, +for extract: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>x/</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>/ </tt></span><span style="font-size: 9pt"><i>command</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">finds all matches in dot of the specified expression, and for each +such match, sets dot to the text matched and runs the command. +So to delete all the +</span><span style="font-size: 10pt"><tt>Peters:</tt></span><span style="font-size: 10pt"> +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>0,$ x/Peter/ d</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(Blanks in these examples are to improve readability; +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +neither requires nor interprets them.) 
+This searches the entire file +(</span><span style="font-size: 10pt"><tt>0,$</tt></span><span style="font-size: 10pt">) +for occurrences of the string +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">, +and runs the +</span><span style="font-size: 10pt"><tt>d</tt></span><span style="font-size: 10pt"> +command with dot set to each such occurrence. +(By contrast, the comparable +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +command would delete all +</span><span style="font-size: 10pt"><i>lines</i></span><span style="font-size: 10pt"> +containing +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">; +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +deletes only the +</span><span style="font-size: 10pt"><tt>Peters</tt></span><span style="font-size: 10pt">.) +The address +</span><span style="font-size: 10pt"><tt>0,$</tt></span><span style="font-size: 10pt"> +is commonly used, and may be abbreviated to just a comma. +As another example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/Peter/ p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">prints a list of +</span><span style="font-size: 10pt"><tt>Peters,</tt></span><span style="font-size: 10pt"> +one for each appearance in the file, with no intervening text (not even newlines +to separate the instances). 
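</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">As a rough analogy only, the two
</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">
examples can be sketched in Python.
Python’s
</span><span style="font-size: 10pt"><tt>re</tt></span><span style="font-size: 10pt">
module does not follow the leftmost-longest rule in general, which makes no difference for a literal pattern such as
</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">;
the function name
</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">
and its callback argument are assumptions of this sketch, and the sample text is invented.
</span></p>
<pre>
```python
import re


def x(pattern, text, command):
    """For each match of pattern in text, replace it with command(matched text)."""
    out = []
    pos = 0
    for m in re.finditer(pattern, text):
        out.append(text[pos:m.start()])  # text between matches is kept unchanged
        out.append(command(m.group()))   # run the command with dot set to the match
        pos = m.end()
    out.append(text[pos:])
    return "".join(out)


text = "Peter Piper picked Peter's peppers.\n"

# Analogue of ", x/Peter/ d": delete each occurrence, not each line.
deleted = x("Peter", text, lambda dot: "")
print(deleted, end="")

# Analogue of ", x/Peter/ p": the matches are printed with nothing between them.
printed = "".join(re.findall("Peter", text))
print(printed)
```
</pre>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">The first result keeps the rest of the line intact; the second is the two matches run together with no separator.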
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Of course, the text extracted by +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +may be selected by a regular expression, +which complicates deciding what set of matches is chosen — +matches may overlap. This is resolved by generating the matches +starting from the beginning of dot using the leftmost-longest rule, +and searching for each match starting from the end of the previous one. +Regular expressions may also match null strings, but a null match +adjacent to a non-null match is never selected; at least one character +must intervene. +For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, c/AAA/</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>x/B*/ c/-/</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">produces as output +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 
1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>-A-A-A-</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">because the pattern +</span><span style="font-size: 10pt"><tt>B*</tt></span><span style="font-size: 10pt"> +matches the null strings separating the +</span><span style="font-size: 10pt"><tt>A</tt></span><span style="font-size: 10pt">’s. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command has a complement, +</span><span style="font-size: 10pt"><tt>y</tt></span><span style="font-size: 10pt">, +with similar syntax, that executes the command with dot set to the text +</span><span style="font-size: 10pt"><i>between</i></span><span style="font-size: 10pt"> +the matches of the expression. 
+For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, c/AAA/</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>y/A/ c/-/</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">produces the same result as the example above. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>y</tt></span><span style="font-size: 10pt"> +commands are looping constructs, and +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +has a pair of conditional commands to go with them. 
+They have similar syntax: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>g/</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>/ </tt></span><span style="font-size: 9pt"><i>command</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(guard) +runs the command exactly once if dot contains a match of the expression. +This is different from +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">, +which runs the command for +</span><span style="font-size: 10pt"><i>each</i></span><span style="font-size: 10pt"> +match: +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +loops; +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +merely tests, without changing the value of dot. 
+Thus, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/Peter/ d</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">deletes all occurrences of +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">, +but +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, g/Peter/ d</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">deletes the whole file (reduces it to a null string) if +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt"> +occurs anywhere in the text. +The complementary conditional is +</span><span style="font-size: 10pt"><tt>v</tt></span><span style="font-size: 10pt">, +which runs the command if there is +</span><span style="font-size: 10pt"><i>no</i></span><span style="font-size: 10pt"> +match of the expression. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">These control-structure-like commands may be composed to construct more +involved operations. For example, to print those lines of text that +contain the string +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/.*\n/ g/Peter/ p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +breaks the file into lines, the +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +selects those lines containing +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt">, +and the +</span><span style="font-size: 10pt"><tt>p</tt></span><span style="font-size: 10pt"> +prints them. +This command gives an address for the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command (the whole file), but because +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +does not have an explicit address, it applies to the value of +dot produced by the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command, that is, to each line. 
+All commands in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +except for the command to write a file to disc use dot for the +default address. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Composition may be continued indefinitely. +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/.*\n/ g/Peter/ v/SaltPeter/ p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">prints those lines containing +</span><span style="font-size: 10pt"><tt>Peter</tt></span><span style="font-size: 10pt"> +but +</span><span style="font-size: 10pt"><i>not</i></span><span style="font-size: 10pt"> +those containing +</span><span style="font-size: 10pt"><tt>SaltPeter</tt></span><span style="font-size: 10pt">. 
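</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">The composed command can be read as a loop enclosing two nested tests.
The Python sketch below follows that reading under the assumption that each command is modeled as a function of dot returning the text it prints; the functions
</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">,
</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt">,
</span><span style="font-size: 10pt"><tt>v</tt></span><span style="font-size: 10pt">
and
</span><span style="font-size: 10pt"><tt>p</tt></span><span style="font-size: 10pt">
are illustrative stand-ins, not
</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s
implementation, and the sample text is invented.
</span></p>
<pre>
```python
import re


def x(pattern, command):
    """Loop: run command once for each match of pattern in dot."""
    def run(dot):
        return [r for m in re.finditer(pattern, dot) for r in command(m.group())]
    return run


def g(pattern, command):
    """Guard: run command once if dot contains a match; dot is unchanged."""
    return lambda dot: command(dot) if re.search(pattern, dot) else []


def v(pattern, command):
    """Inverse guard: run command once if dot contains no match."""
    return lambda dot: [] if re.search(pattern, dot) else command(dot)


def p(dot):
    """Print: yield the contents of dot."""
    return [dot]


text = "Peter Piper\nSaltPeter mine\nno match here\nfor Peter\n"

# Analogue of ", x/.*\n/ g/Peter/ v/SaltPeter/ p"
program = x(r".*\n", g("Peter", v("SaltPeter", p)))
result = "".join(program(text))
print(result, end="")
```
</pre>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">Because each function hands dot to the command nested inside it, an inner command with no explicit address receives the dot set by the enclosing command, as in the description of
</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt">
above.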
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Structural Regular Expressions +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Unlike other UNIX text editors, +including the non-interactive ones such as +</span><span style="font-size: 10pt"><tt>sed</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>awk</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is good for manipulating files with multi-line ‘records.’ +An example is an on-line phone book composed of records, +separated by blank lines, of the form +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>Herbert Tic</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>44 Turnip Ave., Endive, NJ</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>201-5555642</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.15in"></p> + +<p style="line-height: 1.1em; margin-left: 1.28in; 
text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>Norbert Twinge</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>16 Potato St., Cabbagetown, NJ</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>201-5553145</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.15in"></p> + +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>...</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The format may be encoded as a regular expression: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>(.+\n)+</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">that is, a sequence of one or more non-blank lines. +The command to print Mr. 
Tic’s entire record is then +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/(.+\n)+/ g/^Herbert Tic$/ p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">and that to extract just the phone number is +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/(.+\n)+/ g/^Herbert Tic$/ x/^[0-9]*-[0-9]*\n/ p</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The latter command breaks the file into records, +chooses Mr. Tic’s record, +extracts the phone number from the record, +and finally prints the number. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A more involved problem is that of +renaming a particular variable, say +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt">, +to +</span><span style="font-size: 10pt"><tt>num</tt></span><span style="font-size: 10pt"> +in a C program. 
+The obvious first attempt, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/n/ c/num/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">is badly flawed: it changes not only the variable +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt"> +but any letter +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt"> +that appears. +We need to extract all the variables, and select those that match +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt"> +and only +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt">: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The pattern +</span><span style="font-size: 10pt"><tt>[A-Za-z_][A-Za-z_0-9]*</tt></span><span style="font-size: 10pt"> +matches C identifiers. 
+Next +</span><span style="font-size: 10pt"><tt>g/n/</tt></span><span style="font-size: 10pt"> +selects those containing an +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt">. +Then +</span><span style="font-size: 10pt"><tt>v/../</tt></span><span style="font-size: 10pt"> +rejects those containing two (or more) characters, and finally +</span><span style="font-size: 10pt"><tt>c/num/</tt></span><span style="font-size: 10pt"> +changes the remainder (identifiers +</span><span style="font-size: 10pt"><tt>n</tt></span><span style="font-size: 10pt">) +to +</span><span style="font-size: 10pt"><tt>num</tt></span><span style="font-size: 10pt">. +This version clearly works much better, but there may still be problems. +For example, in C character and string constants, the sequence +</span><span style="font-size: 10pt"><tt>\n</tt></span><span style="font-size: 10pt"> +is interpreted as a newline character, and we don’t want to change it to +</span><span style="font-size: 10pt"><tt>\num.</tt></span><span style="font-size: 10pt"> +This problem can be forestalled with a +</span><span style="font-size: 10pt"><tt>y</tt></span><span style="font-size: 10pt"> +command: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>, y/\\n/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(the second +</span><span style="font-size: 10pt"><tt>\</tt></span><span style="font-size: 10pt"> +is necessary because of lexical conventions in regular expressions), +or we could even 
reject character constants and strings outright: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>,y/'[^']*'/ y/"[^"]*"/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>y</tt></span><span style="font-size: 10pt"> +commands in this version exclude from consideration all character constants +and strings. +The only remaining problem is to deal with the possible occurrence of +</span><span style="font-size: 10pt"><tt>\'</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>\"</tt></span><span style="font-size: 10pt"> +within these sequences, but it’s easy to see how to resolve this difficulty. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The point of these composed commands is successive refinement. +A simple version of the command is tried, and if it’s not good enough, +it can be honed by adding a clause or two. +(Mistakes can be undone; see below. +Also, the mouse language makes it unnecessary to retype the command each time.) 
+The resulting chains of commands are somewhat reminiscent of +shell pipelines.<sup></sup></span><sup><span style="font-size: 6pt">7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +Unlike pipelines, though, which pass along modified +</span><span style="font-size: 10pt"><i>data</i></span><span style="font-size: 10pt">, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +commands pass a +</span><span style="font-size: 10pt"><i>view</i></span><span style="font-size: 10pt"> +of the data. +The text at each step of the command is the same, but which pieces +are selected is refined step by step until the correct piece is +available to the final step of the command line, which ultimately makes the change. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">In other UNIX programs, regular expressions are used only for selection, +as in the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +command, never for extraction as in the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>y</tt></span><span style="font-size: 10pt"> +command. +For example, patterns in +</span><span style="font-size: 10pt"><tt>awk</tt></span><span style="font-size: 10pt"><sup></sup></span><sup><span style="font-size: 6pt">7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +are used to select lines to be operated on, but cannot be used +to describe the format of the input text, or to handle newline-free text. 
+The use of regular expressions to describe the structure of a piece +of text rather than its contents, as in the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command, +has been given a name: +</span><span style="font-size: 10pt"><i>structural regular expressions. +</i></span><span style="font-size: 10pt">When they are composed, as in the above example, +they are pleasantly expressive. +Their use is discussed at greater length elsewhere.<sup></sup></span><sup><span style="font-size: 6pt">10</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Multiple files +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +has a few other commands, mostly relating to input and output. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>e discfilename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces the contents and name of the current file with those of the named +disc file; +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>w discfilename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">writes the contents to the named disc file; and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>r discfilename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">replaces dot with the contents of the named disc file. +All these commands use the current file’s name if none is specified. 
+Finally, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>f discfilename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">changes the name associated with the file and displays the result: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>'-. discfilename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">This output is called the file’s +</span><span style="font-size: 10pt"><i>menu line, +</i></span><span style="font-size: 10pt">because it is the contents of the file’s line in the button 3 menu (described +in the +next section). +The first three characters are a concise notation for the state of the file. +The apostrophe signifies that the file is modified. +The minus sign indicates the number of windows +open on the file (see the next section): +</span><span style="font-size: 10pt"><tt>-</tt></span><span style="font-size: 10pt"> +means none, +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +means one, and +</span><span style="font-size: 10pt"><tt>*</tt></span><span style="font-size: 10pt"> +means more than one. 
+Finally, the period indicates that this is the current file. +These characters are useful for controlling the +</span><span style="font-size: 10pt"><tt>X</tt></span><span style="font-size: 10pt"> +command, described shortly. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +may be started with a set of disc files (such as all the source for +a program) by invoking it with a list of file names as arguments, and +more may be added or deleted on demand. +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>B discfile1 discfile2 ...</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">adds the named files to +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +list, and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>D discfile1 discfile2 ...</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: 
justify;"> +<span style="font-size: 10pt">removes them from +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +memory (without effect on associated disc files). +Both these commands have a syntax for using the shell<sup></sup></span><sup><span style="font-size: 6pt">7</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +(the UNIX command interpreter) to generate the lists: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>B <echo *.c</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">will add all C source files, and +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>B <grep -l variable *.c</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">will add all C source files referencing a particular variable +(the UNIX command +</span><span style="font-size: 10pt"><tt>grep -l</tt></span><span style="font-size: 10pt"> +lists all files in its arguments that contain matches of +the specified regular expression). 
+Finally, +</span><span style="font-size: 10pt"><tt>D</tt></span><span style="font-size: 10pt"> +without arguments deletes the current file. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">There are two ways to change which file is current: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>b filename</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">makes the named file current. +The +</span><span style="font-size: 10pt"><tt>B</tt></span><span style="font-size: 10pt"> +command +does the same, but also adds any new files to +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +list. +(In practice, of course, the current file +is usually chosen by mouse actions, not by textual commands.) 
+The other way is to use a form of address that refers to files: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>"</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>" </tt></span><span style="font-size: 9pt"><i>address</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">refers to the address evaluated in the file whose menu line +matches the expression (there must be exactly one match). +For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>"peter.c" 3</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">refers to the third line of the file whose name matches +</span><span style="font-size: 10pt"><tt>peter.c</tt></span><span style="font-size: 10pt">. 
+This is most useful in the move +(</span><span style="font-size: 10pt"><tt>m</tt></span><span style="font-size: 10pt">) +and copy +(</span><span style="font-size: 10pt"><tt>t</tt></span><span style="font-size: 10pt">) +commands: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>0,$ t "peter.c" 0</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">makes a copy of the current file at the beginning of +</span><span style="font-size: 10pt"><tt>peter.c</tt></span><span style="font-size: 10pt">. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>X</tt></span><span style="font-size: 10pt"> +command +is a looping construct, like +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt">, +that refers to files instead of strings: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>X/</tt></span><span style="font-size: 9pt"><i>expression</i></span><span style="font-size: 9pt"><tt>/ </tt></span><span style="font-size: 9pt"><i>command</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 
0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">runs the command in all +files whose menu lines match the expression. The best example is +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>X/'/ w</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">which writes to disc all modified files. +</span><span style="font-size: 10pt"><tt>Y</tt></span><span style="font-size: 10pt"> +is the complement of +</span><span style="font-size: 10pt"><tt>X</tt></span><span style="font-size: 10pt">: +it runs the command on all files whose menu lines don’t match the expression: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>Y/\.c/ D</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">deletes all files that don’t have +</span><span style="font-size: 10pt"><tt>.c</tt></span><span style="font-size: 10pt"> +in their names, that is, it keeps all C source files and deletes the rest. 
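</span></p>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">The way these two commands select files can be sketched in a few lines of Python. This is an illustration only: the menu lines are invented, their exact column layout is an assumption, the function names <tt>files_matching</tt> and <tt>files_not_matching</tt> are hypothetical, and the functions merely report which menu lines would be selected instead of writing or deleting anything.
</span></p>
<pre>
```python
import re

# Invented menu lines: three state characters, a space, then the file name.
# The precise layout is an assumption made for this sketch.
menu_lines = [
    "'-. peter.c",    # modified, no windows, current file
    " +  main.c",     # unmodified, one window
    "'*  notes.txt",  # modified, more than one window
]

def files_matching(menu, pattern):
    """Stand-in for the selection made by X: menu lines that match."""
    return [line for line in menu if re.search(pattern, line)]

def files_not_matching(menu, pattern):
    """Stand-in for the selection made by Y: menu lines that do not match."""
    return [line for line in menu if not re.search(pattern, line)]

# Selection made by  X/'/ w : the modified files, which would be written.
modified = files_matching(menu_lines, r"'")

# Selection made by  Y/\.c/ D : files without .c in the menu line.
non_c = files_not_matching(menu_lines, r"\.c")

print(modified)
print(non_c)
```
</pre>
<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
<span style="font-size: 10pt">Because the expression is matched against the whole menu line, the state characters and the file name are equally available to the pattern.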
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Braces allow commands to be grouped, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> </tt></span><span style="font-size: 9pt"><i>command1</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> </tt></span><span style="font-size: 9pt"><i>command2</i></span><span style="font-size: 9pt"><tt></tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">is syntactically a single command that runs two commands. 
+Thus, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>X/\.c/ ,g/variable/ {</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> f</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> , x/.*\n/ g/variable/ p</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">finds all occurrences of +</span><span style="font-size: 10pt"><tt>variable</tt></span><span style="font-size: 10pt"> +in C source files, and prints +out the file names and lines of each match. +The precise semantics of compound operations is discussed in the implementation +sections below. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Finally, +the undo command, +</span><span style="font-size: 10pt"><tt>u</tt></span><span style="font-size: 10pt">, +undoes the last command, +no matter how many files were affected. 
+Multiple undo operations move further back in time, so +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>u</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>u</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(which may be abbreviated +</span><span style="font-size: 10pt"><tt>u2</tt></span><span style="font-size: 10pt">) +undoes the last two commands. An undo may not be undone, however, nor +may any command that adds or deletes files. 
+Everything else is undoable, though, including for example +</span><span style="font-size: 10pt"><tt>e</tt></span><span style="font-size: 10pt"> +commands: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>e filename</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>u</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">restores the state of the file completely, including its name, dot, +and modified bit. Because of the undo, potentially dangerous commands +are not guarded by confirmations. Only +</span><span style="font-size: 10pt"><tt>D</tt></span><span style="font-size: 10pt">, +which destroys the information necessary to restore itself, is protected. +It will not delete a modified file, but a second +</span><span style="font-size: 10pt"><tt>D</tt></span><span style="font-size: 10pt"> +of the same file will succeed regardless. +The +</span><span style="font-size: 10pt"><tt>q</tt></span><span style="font-size: 10pt"> +command, which exits +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">, +is similarly guarded. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Mouse Interface +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is most commonly run +connected to a bitmap display and mouse for interactive editing. +The only difference in the command language +between regular, mouse-driven +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>sam -d</tt></span><span style="font-size: 10pt"> +is that if an address +is provided without a command, +</span><span style="font-size: 10pt"><tt>sam -d</tt></span><span style="font-size: 10pt"> +will print the text referenced by the address, but +regular +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +will highlight it on the screen; in fact, +dot is always highlighted (see Figure 2). +</span><span style="font-size: 10pt"></span></p><center><img src="fig3.gif" /></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 2. A +</i></span><span style="font-size: 8pt"><tt>sam</tt></span><span style="font-size: 8pt"><i> +window. The scroll bar down the left +represents the file, with the bubble showing the fraction +visible in the window. +The scroll bar may be manipulated by the mouse for convenient browsing. +The current text, +which is highlighted, need not fit on a line. Here it consists of one partial +line, one complete line, and a final partial line. 
+</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Each file may have zero or more windows open on the display. +At any time, only one window in all of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is the +</span><span style="font-size: 10pt"><i>current window, +</i></span><span style="font-size: 10pt">that is, the window to which typing and mouse actions refer; +this may be the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window (that in which commands may be typed) +or one of the file windows. +When a file has multiple windows, the image of the file in each window +is always kept up to date. +The current file is the last file affected by a command, +so if the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window is current, +the current window is not a window on the current file. +However, each window on a file has its own value of dot, +and when switching between windows on a single file, +the file’s value of dot is changed to that of the window. +Thus, flipping between windows behaves in the obvious, convenient way. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The mouse on the Blit has three buttons, numbered left to right. 
+Button 3 has a list of commands to manipulate windows, +followed by a list of ‘menu lines’ exactly as printed by the +</span><span style="font-size: 10pt"><tt>f</tt></span><span style="font-size: 10pt"> +command, one per file (not one per window). +These menu lines are sorted by file name. +If the list is long, the Blit menu software will make it more manageable +by generating a scrolling menu instead of an unwieldy long list. +Using the menu to select a file from the list makes that file the current +file, and the most recently current window in that file the current window. +But if that file is already current, selecting it in the menu cycles through +the windows on the file; this simple trick avoids a special menu to +choose windows on a file. +If there is no window open on the file, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +changes the mouse cursor to prompt the user to create one. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The commands on the button 3 menu are straightforward (see Figure 3), and +are like the commands to manipulate windows in +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">8</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +the Blit’s window system. +</span><span style="font-size: 10pt"><tt>New</tt></span><span style="font-size: 10pt"> +makes a new file, and gives it one empty window, whose size is determined +by a rectangle swept by the mouse. +</span><span style="font-size: 10pt"><tt>Zerox</tt></span><span style="font-size: 10pt"> +prompts for a window to be selected, and +makes a clone of that window; this is how multiple windows are created on one file. 
+</span><span style="font-size: 10pt"><tt>Reshape</tt></span><span style="font-size: 10pt"> +changes the size of the indicated window, and +</span><span style="font-size: 10pt"><tt>close</tt></span><span style="font-size: 10pt"> +deletes it. If that is the last window open on the file, +</span><span style="font-size: 10pt"><tt>close</tt></span><span style="font-size: 10pt"> +first does a +</span><span style="font-size: 10pt"><tt>D</tt></span><span style="font-size: 10pt"> +command on the file. +</span><span style="font-size: 10pt"><tt>Write</tt></span><span style="font-size: 10pt"> +is identical to a +</span><span style="font-size: 10pt"><tt>w</tt></span><span style="font-size: 10pt"> +command on the file; it is in the menu purely for convenience. +Finally, +</span><span style="font-size: 10pt"><tt>~~sam~~</tt></span><span style="font-size: 10pt"> +is a menu item that appears between the commands and the file names. +Selecting it makes the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window the current window, +causing subsequent typing to be interpreted as commands. +</span><span style="font-size: 10pt"></span></p><center><img src="fig2.gif" /></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 3. The menu on button 3. +The black rectangle on the left is a scroll bar; the menu is limited to +the length shown to prevent its becoming unwieldy. +Above the +</i></span><span style="font-size: 8pt"><tt>~~sam~~</tt></span><span style="font-size: 8pt"><i> +line is a list of commands; +beneath it is a list of files, presented exactly as with the +</i></span><span style="font-size: 8pt"><tt>f</tt></span><span style="font-size: 8pt"><i> +command. 
+</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">When +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +requests that a window be swept, in response to +</span><span style="font-size: 10pt"><tt>new</tt></span><span style="font-size: 10pt">, +</span><span style="font-size: 10pt"><tt>zerox</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>reshape</tt></span><span style="font-size: 10pt">, +it changes the mouse cursor from the usual arrow to a box with +a small arrow. +In this state, the mouse may be used to indicate an arbitrary rectangle by +pressing button 3 at one corner and releasing it at the opposite corner. +More conveniently, +button 3 may simply be clicked, +whereupon +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +creates the maximal rectangle that contains the cursor +and abuts the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window. +By placing the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window in the middle of the screen, the user can define two regions (one above, +one below) in which stacked fully-overlapping +windows can be created with minimal fuss (see Figure 1). +This simple user interface trick makes window creation noticeably easier. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The cut-and-paste editor is essentially the same as that in Smalltalk-80.<sup></sup></span><sup><span style="font-size: 6pt">11</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +The text in dot is always highlighted on the screen. +When a character is typed it replaces dot, and sets dot to the null +string after the character. Thus, ordinary typing inserts text. +Button 1 is used for selection: +pressing the button, moving the mouse, and lifting the button +selects (sets dot to) the text between the points where the +button was pressed and released. +Pressing and releasing at the same point selects a null string; this +is called clicking. Clicking twice quickly, or +</span><span style="font-size: 10pt"><i>double clicking, +</i></span><span style="font-size: 10pt">selects larger objects; +for example, double clicking in a word selects the word, +double clicking just inside an opening bracket selects the text +contained in the brackets (handling nested brackets correctly), +and similarly for +parentheses, quotes, and so on. +The double-clicking rules reflect a bias toward +programmers. +If +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +were intended more for word processing, double-clicks would probably +select linguistic structures such as sentences. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">If button 1 is pressed outside the current window, it makes the indicated +window current. +This is the easiest way to switch between windows and files. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Pressing button 2 brings up a menu of editing functions (see Figure 4). +These mostly apply to the selected text: +</span><span style="font-size: 10pt"><tt>cut</tt></span><span style="font-size: 10pt"> +deletes the selected text, and remembers it in a hidden buffer called the +</span><span style="font-size: 10pt"><i>snarf buffer, +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>paste</tt></span><span style="font-size: 10pt"> +replaces the selected text by the contents of the snarf buffer, +</span><span style="font-size: 10pt"><tt>snarf</tt></span><span style="font-size: 10pt"> +just copies the selected text to the snarf buffer, +</span><span style="font-size: 10pt"><tt>look</tt></span><span style="font-size: 10pt"> +searches forward for the next literal occurrence of the selected text, and +</span><span style="font-size: 10pt"><tt>&lt;mux&gt;</tt></span><span style="font-size: 10pt"> +exchanges snarf buffers with the window system in which +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is running. +Finally, the last regular expression used appears as a menu entry +to search +forward for the next occurrence of a match for the expression. +</span><span style="font-size: 10pt"></span></p><center><img src="fig4.gif" /></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 4. The menu on button 2. +The bottom entry tracks the most recently used regular expression, which may +be literal text. 
+</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The relationship between the command language and the mouse language is +entirely due to the equality of dot and the selected text chosen +with button 1 on the mouse. +For example, to make a set of changes in a C subroutine, dot can be +set by double clicking on the left brace that begins the subroutine, +which sets dot for the command language. +An address-free command then typed in the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window will apply only to the text between the opening and closing +braces of the function. +The idea is to select what you want, and then say what you want +to do with it, whether invoked by a menu selection or by a typed command. +And of course, the value of dot is highlighted on +the display after the command completes. +This relationship between mouse interface and command language +is clumsy to explain, but comfortable, even natural, in practice. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>The Implementation +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The next few sections describe how +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is put together, first the host part, +then the inter-component communication, +then the terminal part. +After explaining how the command language is implemented, +the discussion follows (roughly) the path of a character +from the temporary file on disc to the screen. +The presentation centers on the data structures, +because that is how the program was designed and because +the algorithms are easy to provide, given the right data +structures. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Parsing and execution +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The command language is interpreted by parsing each command with a +table-driven recursive +descent parser, and when a complete command is assembled, invoking a top-down +executor. +Most editors instead employ a simple character-at-a-time +lexical scanner. +Use of a parser makes it +easy and unambiguous to detect when a command is complete, +which has two advantages. 
+First, escape conventions such as backslashes to quote +multiple-line commands are unnecessary; if the command isn’t finished, +the parser keeps reading. For example, a multiple-line append driven by an +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command is straightforward: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>x/.*\n/ g/Peter/ a</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>one line about Peter</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>another line about Peter</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>.</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Other UNIX editors would require a backslash after all but the last line. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The other advantage is specific to the two-process structure of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +The host process must decide when a command is completed so the +command interpreter can be called. This problem is easily resolved +by having the lexical analyzer read the single stream of events from the +terminal, directly executing all typing and mouse commands, +but passing to the parser characters typed to the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +command window. +This scheme is slightly complicated by the availability of cut-and-paste +editing in the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window, but that difficulty is resolved by applying the rules +used in +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">: +when a newline is typed to the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window, all text between the newline and the previously typed newline +is made available to the parser. +This permits arbitrary editing to be done to a command before +typing newline and thereby requesting execution. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The parser is driven by a table because the syntax of addresses +and commands is regular enough +to be encoded compactly. 
There are few special cases, such as the +replacement text in a substitution, so the syntax of almost all commands +can be encoded with a few flags. +These include whether the command allows an address (for example, +</span><span style="font-size: 10pt"><tt>e</tt></span><span style="font-size: 10pt"> +does not), whether it takes a regular expression (as in +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>s</tt></span><span style="font-size: 10pt">), +whether it takes replacement text (as in +</span><span style="font-size: 10pt"><tt>c</tt></span><span style="font-size: 10pt"> +or +</span><span style="font-size: 10pt"><tt>i</tt></span><span style="font-size: 10pt">), +which may be multi-line, and so on. +The internal syntax of regular expressions is handled by a separate +parser; a regular expression is a leaf of the command parse tree. +Regular expressions are discussed fully in the next section. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The parser table also has information about defaults, so the interpreter +is always called with a complete tree. 
For example, the parser fills in +the implicit +</span><span style="font-size: 10pt"><tt>0</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>$</tt></span><span style="font-size: 10pt"> +in the abbreviated address +</span><span style="font-size: 10pt"><tt>,</tt></span><span style="font-size: 10pt"> +(comma), +inserts a +</span><span style="font-size: 10pt"><tt>+</tt></span><span style="font-size: 10pt"> +to the left of an unadorned regular expression in an address, +and provides the usual default address +</span><span style="font-size: 10pt"><tt>.</tt></span><span style="font-size: 10pt"> +(dot) for commands that expect an address but are not given one. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Once a complete command is parsed, the evaluation is easy. +The address is evaluated left-to-right starting from the value of dot, +with a mostly ordinary expression evaluator. 
+Addresses, like many of the data structures in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">, +are held in a C structure and passed around by value: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>typedef long Posn; /* Position in a file */</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>typedef struct Range{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> Posn p1, p2;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}Range;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>typedef struct Address{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> Range r;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> File *f;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 
9pt"><tt>}Address;</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">An address is encoded as a substring (character positions +</span><span style="font-size: 10pt"><tt>p1</tt></span><span style="font-size: 10pt"> +to +</span><span style="font-size: 10pt"><tt>p2</tt></span><span style="font-size: 10pt">) +in a file +</span><span style="font-size: 10pt"><tt>f</tt></span><span style="font-size: 10pt">. +(The data type +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +is described in detail below.) +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The address interpreter is an +</span><span style="font-size: 10pt"><tt>Address</tt></span><span style="font-size: 10pt">-valued +function that traverses the parse tree describing an address (the +parse tree for the address has type +</span><span style="font-size: 10pt"><tt>Addrtree</tt></span><span style="font-size: 10pt">): +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>Address</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>address(ap, a, sign)</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; 
text-align: justify;"> +<span style="font-size: 9pt"><tt> Addrtree *ap;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> Address a;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> int sign;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> Address a2;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> do</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> switch(ap->type){</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> case '.':</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a=a.f->dot;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> break;</tt></span></p> +<p
style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> case '$':</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a.r.p1=a.r.p2=a.f->nbytes;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> break;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> case '"': </tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a=matchfile(a, ap->aregexp)->dot; </tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> break;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> case ',':</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a2=address(ap->right, a, 0);</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a=address(ap->left, a, 0);</tt></span></p> +<p
style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> if(a.f!=a2.f || a2.r.p2&lt;a.r.p1)</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> error(Eorder);</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a.r.p2=a2.r.p2;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> return a;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> /* and so on */</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> }</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> while((ap=ap->right)!=0);</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> return a;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + 
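+<p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The call to <tt>error(Eorder)</tt> in the routine above does not return to its caller.
+As a rough sketch of how such a routine can be built on the C library's <tt>setjmp</tt> and <tt>longjmp</tt>, consider the following.
+The names <tt>sam_error</tt>, <tt>checkrange</tt>, <tt>run</tt>, and <tt>lastmsg</tt>, and the error code <tt>Esearch</tt>, are invented for illustration,
+and the sketch leaves out any retraction of partially made changes.
+</span></p>

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

/* Hypothetical error codes; only Eorder appears in the code above. */
enum Err { Enone, Eorder, Esearch };

static jmp_buf toplevel;          /* saved context of the top level */
static const char *lastmsg = "";  /* message chosen by sam_error() */

/* Stand-in for error(): pick a terse message, then jump; never returns. */
void
sam_error(enum Err e)
{
	switch(e){
	case Eorder:  lastmsg = "?addresses out of order"; break;
	case Esearch: lastmsg = "?search"; break;
	default:      lastmsg = "?"; break;
	}
	longjmp(toplevel, 1);
}

/* Stand-in for a command: fails if the range is inverted. */
int
checkrange(long p1, long p2)
{
	if(p2 < p1)
		sam_error(Eorder);
	return 1;
}

/* Top level: returns 1 if the command completed, 0 if it was aborted. */
int
run(long p1, long p2)
{
	lastmsg = "";
	if(setjmp(toplevel) != 0)
		return 0;	/* arrived here via longjmp from sam_error() */
	return checkrange(p1, p2);
}
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">Here <tt>run(0, 5)</tt> completes normally, while <tt>run(5, 0)</tt> is abandoned part way through and control resumes at the top level with the message set.
+</span></p>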
+<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Throughout, errors are handled by a non-local +</span><span style="font-size: 10pt"><tt>goto</tt></span><span style="font-size: 10pt"> +(a +</span><span style="font-size: 10pt"><tt>setjmp/longjmp</tt></span><span style="font-size: 10pt"> +in C terminology) +hidden in a routine called +</span><span style="font-size: 10pt"><tt>error</tt></span><span style="font-size: 10pt"> +that immediately aborts the execution, retracts any +partially made changes (see the section below on ‘undoing’), and +returns to the top level of the parser. +The argument to +</span><span style="font-size: 10pt"><tt>error</tt></span><span style="font-size: 10pt"> +is an enumeration type that +is translated to a terse but possibly helpful +message such as ‘?addresses out of order.’ +Very common messages are kept short; for example the message for +a failed regular expression search is ‘?search.’ +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Character addresses such as +</span><span style="font-size: 10pt"><tt>#3</tt></span><span style="font-size: 10pt"> +are trivial to implement, as the +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +data structure is accessible by character number. +However, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +keeps no information about the position of newlines — it is too +expensive to track dynamically — so line addresses are computed by reading +the file, counting newlines. 
Except in very large files, this has proven +acceptable: file access is fast enough to make the technique practical, +and lines are not central to the structure of the command language. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The command interpreter, called +</span><span style="font-size: 10pt"><tt>cmdexec</tt></span><span style="font-size: 10pt">, +is also straightforward. The parse table includes a +function to call to interpret a particular command. That function +receives as arguments +the calculated address +for the command +and the command tree (of type +</span><span style="font-size: 10pt"><tt>Cmdtree</tt></span><span style="font-size: 10pt">), +which may contain information such as the subtree for compound commands. +Here, for example, is the function for the +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>v</tt></span><span style="font-size: 10pt"> +commands: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>int</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>g_cmd(a, cp)</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> Address a;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; 
text-align: justify;"> +<span style="font-size: 9pt"><tt> Cmdtree *cp;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> compile(cp->regexp);</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> if(execute(a.f, a.r.p1, a.r.p2)!=(cp->cmdchar=='v')){</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> a.f->dot=a;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> return cmdexec(a, cp->subcmd);</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> }</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> return TRUE; /* cause execution to continue */</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p 
style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">(</span><span style="font-size: 10pt"><tt>Compile</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>execute</tt></span><span style="font-size: 10pt"> +are part of the regular expression code, described in the next section.) +Because the parser and the +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +data structure do most of the work, most commands +are similarly brief. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Regular expressions +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The regular expression code in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is an interpreted, rather than compiled on-the-fly, implementation of Thompson’s +non-deterministic finite automaton algorithm.<sup></sup></span><sup><span style="font-size: 6pt">12</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +The syntax and semantics of the expressions are as in the UNIX program +</span><span style="font-size: 10pt"><tt>egrep</tt></span><span style="font-size: 10pt">, +including alternation, closures, character classes, and so on. 
+The only changes in the notation are two additions: +</span><span style="font-size: 10pt"><tt>\n</tt></span><span style="font-size: 10pt"> +is translated to, and matches, a newline character, and +</span><span style="font-size: 10pt"><tt>@</tt></span><span style="font-size: 10pt"> +matches any character. In +</span><span style="font-size: 10pt"><tt>egrep</tt></span><span style="font-size: 10pt">, +the character +</span><span style="font-size: 10pt"><tt>.</tt></span><span style="font-size: 10pt"> +matches any character except newline, and in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +the same rule seemed safest, to prevent idioms like +</span><span style="font-size: 10pt"><tt>.*</tt></span><span style="font-size: 10pt"> +from spanning newlines. +</span><span style="font-size: 10pt"><tt>Egrep</tt></span><span style="font-size: 10pt"> +expressions are arguably too complicated for an interactive editor — +certainly it would make sense if all the special characters were two-character +sequences, so that most of the punctuation characters wouldn’t have +peculiar meanings — but for an interesting command language, full +regular expressions are necessary, and +</span><span style="font-size: 10pt"><tt>egrep</tt></span><span style="font-size: 10pt"> +defines the full regular expression syntax for UNIX programs. +Also, it seemed superfluous to define a new syntax, since various UNIX programs +(</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">, +</span><span style="font-size: 10pt"><tt>egrep</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>vi</tt></span><span style="font-size: 10pt">) +define too many already. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The expressions are compiled by a routine, +</span><span style="font-size: 10pt"><tt>compile</tt></span><span style="font-size: 10pt">, +that generates the description of the non-deterministic finite state machine. +A second routine, +</span><span style="font-size: 10pt"><tt>execute</tt></span><span style="font-size: 10pt">, +interprets the machine to generate the leftmost-longest match of the +expression in a substring of the file. +The algorithm is described elsewhere.<sup></sup></span><sup><span style="font-size: 6pt">12,13</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>Execute</tt></span><span style="font-size: 10pt"> +reports +whether a match was found, and sets a global variable, +of type +</span><span style="font-size: 10pt"><tt>Range</tt></span><span style="font-size: 10pt">, +to the substring matched. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A trick is required to evaluate the expression in reverse, such as when +searching backwards for an expression. 
+For example, +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>-/P.*r/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">looks backwards through the file for a match of the expression. +The expression, however, is defined for a forward search. +The solution is to construct a machine identical to the machine +for a forward search except for a reversal of all the concatenation +operators (the other operators are symmetric under direction reversal), +to exchange the meaning of the operators +</span><span style="font-size: 10pt"><tt>^</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>$</tt></span><span style="font-size: 10pt">, +and then to read the file backwards, looking for the +usual earliest longest match. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Execute</tt></span><span style="font-size: 10pt"> +generates only one match each time it is called. 
+To interpret looping constructs such as the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +must therefore synchronize between +calls of +</span><span style="font-size: 10pt"><tt>execute</tt></span><span style="font-size: 10pt"> +to avoid +problems with null matches. +For example, even given the leftmost-longest rule, +the expression +</span><span style="font-size: 10pt"><tt>a*</tt></span><span style="font-size: 10pt"> +matches three times in the string +</span><span style="font-size: 10pt"><tt>ab</tt></span><span style="font-size: 10pt"> +(the character +</span><span style="font-size: 10pt"><tt>a</tt></span><span style="font-size: 10pt">, +the null string between the +</span><span style="font-size: 10pt"><tt>a</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>b</tt></span><span style="font-size: 10pt">, +and the final null string). +After returning a match for the +</span><span style="font-size: 10pt"><tt>a</tt></span><span style="font-size: 10pt">, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +must not match the null string before the +</span><span style="font-size: 10pt"><tt>b</tt></span><span style="font-size: 10pt">. +The algorithm starts +</span><span style="font-size: 10pt"><tt>execute</tt></span><span style="font-size: 10pt"> +at the end of its previous match, and +if the match it returns +is null and abuts the previous match, rejects the match and advances +the initial position one character. 
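+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The synchronization just described can be sketched as a loop.
+This is only an illustration: <tt>execute_astar</tt> is a hypothetical stand-in for <tt>execute</tt> that handles the single expression <tt>a*</tt>,
+and <tt>allmatches</tt> and its parameters are likewise invented here.
+</span></p>

```c
#include <assert.h>
#include <string.h>

typedef struct Range { long p1, p2; } Range;

/*
 * Toy stand-in for execute(): finds the leftmost-longest match of the
 * fixed expression a* in s[start..n]. Because a* matches the empty
 * string, the leftmost match always begins at start itself.
 */
int
execute_astar(const char *s, long n, long start, Range *m)
{
	long p;

	if(start > n)
		return 0;
	for(p = start; p < n && s[p] == 'a'; p++)
		;
	m->p1 = start;
	m->p2 = p;
	return 1;
}

/*
 * Collect the successive matches a looping command would visit.
 * Each search starts at the end of the previous match; a null match
 * that abuts the previous match is rejected and the start advances
 * one character. Returns the number of matches stored in out.
 */
int
allmatches(const char *s, Range *out, int max)
{
	long n = (long)strlen(s);
	long start = 0;
	long prevend = -1;	/* no previous match yet */
	int nm = 0;
	Range m;

	while(nm < max && execute_astar(s, n, start, &m)){
		if(m.p1 == m.p2 && m.p1 == prevend){
			start++;	/* reject null match abutting the previous one */
			continue;
		}
		out[nm++] = m;
		prevend = m.p2;
		start = m.p2;
	}
	return nm;
}
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">Applied to the string <tt>ab</tt>, this loop reports the <tt>a</tt> and the final null string, and skips the null string before the <tt>b</tt>.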
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Memory allocation +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The C language has no memory allocation primitives, although a standard +library routine, +</span><span style="font-size: 10pt"><tt>malloc</tt></span><span style="font-size: 10pt">, +provides adequate service for simple programs. +For specific uses, however, +it can be better to write a custom allocator. +The allocator (or rather, pair of allocators) described here +work in both the terminal and host parts of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +They are designed for efficient manipulation of strings, +which are allocated and freed frequently and vary in length from essentially +zero to 32 Kbytes (very large strings are written to disc). +More important, strings may be large and change size often, +so to minimize memory usage it is helpful to reclaim and to coalesce the +unused portions of strings when they are truncated. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Objects to be allocated in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +are of two flavors: +the first is C +</span><span style="font-size: 10pt"><tt>structs</tt></span><span style="font-size: 10pt">, +which are small and often addressed by pointer variables; +the second is variable-sized arrays of characters +or integers whose +base pointer is always used to access them. +The memory allocator in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is therefore in two parts: +first, a traditional first-fit allocator that provides fixed storage for +</span><span style="font-size: 10pt"><tt>structs</tt></span><span style="font-size: 10pt">; +and second, a garbage-compacting allocator that reduces storage +overhead for variable-sized objects, at the cost of some bookkeeping. +The two types of objects are allocated from adjoining arenas, with +the garbage-compacting allocator controlling the arena with higher addresses. +Separating into two arenas simplifies compaction and prevents fragmentation due +to immovable objects. +The access rules for garbage-compactable objects +(discussed in the next paragraph) allow them to be relocated, so when +the first-fit arena needs space, it moves the garbage-compacted arena +to higher addresses to make room. Storage is therefore created only +at successively higher addresses, either when more garbage-compacted +space is needed or when the first-fit arena pushes up the other arena. 
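+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">A minimal sketch of the relocating half of such an allocator follows.
+It assumes that every object registers one cell holding its address, so that the compactor can update the cell when it moves the object.
+The names <tt>gcalloc</tt>, <tt>gcfree</tt>, and <tt>compact</tt>, the fixed-size arena, and the object table are inventions for illustration and are not taken from <tt>sam</tt>;
+the first-fit arena is not modelled at all.
+</span></p>

```c
#include <assert.h>
#include <string.h>

enum { ARENASIZE = 256, MAXOBJ = 16 };

/* One object: where its bytes are, and the cell that holds its address. */
typedef struct Obj {
	char **cell;	/* the sole repository of the object's address */
	long off;	/* offset of the object within the arena */
	long len;	/* size in bytes */
	int live;	/* zero once the object has been freed */
} Obj;

static char arena[ARENASIZE];
static Obj objs[MAXOBJ];	/* kept in order of increasing offset */
static int nobj;
static long top;		/* first unused byte of the arena */

/* Allocate len bytes and record cell as the owner; returns 0 on failure. */
int
gcalloc(char **cell, long len)
{
	if(nobj == MAXOBJ || top + len > ARENASIZE)
		return 0;
	objs[nobj].cell = cell;
	objs[nobj].off = top;
	objs[nobj].len = len;
	objs[nobj].live = 1;
	nobj++;
	*cell = arena + top;
	top += len;
	return 1;
}

/* Mark the object owned by cell as free; compact() reclaims its space. */
void
gcfree(char **cell)
{
	int i;

	for(i = 0; i < nobj; i++)
		if(objs[i].live && objs[i].cell == cell){
			objs[i].live = 0;
			*cell = 0;
		}
}

/* Slide live objects toward low addresses and update their owner cells. */
void
compact(void)
{
	int i, j = 0;
	long to = 0;

	for(i = 0; i < nobj; i++){
		if(!objs[i].live)
			continue;
		memmove(arena + to, arena + objs[i].off, objs[i].len);
		objs[i].off = to;
		*objs[i].cell = arena + to;	/* the only pointer that must change */
		to += objs[i].len;
		objs[j++] = objs[i];
	}
	nobj = j;
	top = to;
}
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">If the middle of three objects is freed, <tt>compact</tt> slides the third down to fill the hole and rewrites the caller's pointer through the registered cell, which is why that pointer must never be copied elsewhere.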
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Objects that may be compacted declare to the +allocator a cell that is guaranteed to be the sole repository of the +address of the object whenever a compaction can occur. +The compactor can then update the address when the object is moved. +For example, the implementation of type +</span><span style="font-size: 10pt"><tt>List</tt></span><span style="font-size: 10pt"> +(really a variable-length array) +is: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>typedef struct List{</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> int nused;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt> long *ptr;</tt></span></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>}List;</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>ptr</tt></span><span style="font-size: 10pt"> +cell must always be used directly, 
and never copied. When a +</span><span style="font-size: 10pt"><tt>List</tt></span><span style="font-size: 10pt"> +is to be created the +</span><span style="font-size: 10pt"><tt>List</tt></span><span style="font-size: 10pt"> +structure is allocated in the ordinary first-fit arena +and its +</span><span style="font-size: 10pt"><tt>ptr</tt></span><span style="font-size: 10pt"> +is allocated in the garbage-compacted arena. +A similar data type for strings, called +</span><span style="font-size: 10pt"><tt>String</tt></span><span style="font-size: 10pt">, +stores variable-length character arrays of up to 32767 elements. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A related matter of programming style: +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +frequently passes structures by value, which +simplifies the code. +Traditionally, C programs have +passed structures by reference, but implicit allocation on +the stack is easier to use. +Structure passing is a relatively new feature of C +(it is not in the +standard reference manual for C<sup></sup></span><sup><span style="font-size: 6pt">14</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt">), and is poorly supported in most +commercial C compilers. +It’s convenient and expressive, though, +and simplifies memory management by +avoiding the allocator altogether +and eliminating pointer aliases. 
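+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">As a small illustration of the style, an address can be passed to and returned from a function as a value.
+The field names follow the code shown earlier; the layout of the types and the function <tt>endof</tt> are assumptions made for this sketch.
+</span></p>

```c
#include <assert.h>

typedef struct Range { long p1, p2; } Range;
typedef struct File File;	/* opaque in this sketch */
typedef struct Address { Range r; File *f; } Address;

/* Return the empty address at the end of a; the caller's copy is untouched. */
Address
endof(Address a)
{
	a.r.p1 = a.r.p2;	/* modifies only the local copy */
	return a;		/* returned by value: no allocation, no alias */
}
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The caller's address is unchanged by the call, and no storage is allocated or freed.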
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Data structures for manipulating files +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Experience with +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +showed that the requirements +of the file data structure were few, but strict. +First, files need to be read and written quickly; +adding a fresh file must be painless. +Second, the implementation must place no arbitrary upper limit on +the number or sizes of files. (It should be practical to edit many files, +and files up to megabytes in length should be handled gracefully.) +This implies that files be stored on disc, not in main memory. +(Aficionados of virtual memory may argue otherwise, but the +implementation of virtual +memory in our system is not something to depend on +for good performance.) +Third, changes to files need be made by only two primitives: +deletion and insertion. +These are inverses of each other, +which simplifies the implementation of the undo operation. +Finally, +it must be easy and efficient to access the file, either +forwards or backwards, a byte at a time. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +data type is constructed from three simpler data structures that hold arrays +of characters. 
+Each of these types has an insertion and deletion operator, and the +insertion and deletion operators of the +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +type itself are constructed from them. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The simplest type is the +</span><span style="font-size: 10pt"><tt>String</tt></span><span style="font-size: 10pt">, +which is used to hold strings in main memory. +The code that manages +</span><span style="font-size: 10pt"><tt>Strings</tt></span><span style="font-size: 10pt"> +guarantees that they will never be longer +than some moderate size, and in practice they are rarely larger than 8 Kbytes. +</span><span style="font-size: 10pt"><tt>Strings</tt></span><span style="font-size: 10pt"> +have two purposes: they hold short strings like file names with little overhead, +and because they are deliberately small, they are efficient to modify. +They are therefore used as the data structure for in-memory caches. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The disc copy of the file is managed by a data structure called a +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt">, +which corresponds to a temporary file. A +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +has no storage in main memory other than bookkeeping information; +the actual data being held is all on the disc. 
+To reduce the number of open files needed, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +opens a dozen temporary UNIX files and multiplexes the +</span><span style="font-size: 10pt"><tt>Discs</tt></span><span style="font-size: 10pt"> +upon them. +This permits many files to +be edited; the entire +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +source (48 files) may be edited comfortably with a single +instance of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +Allocating one temporary file per +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +would strain the operating system’s limit on the number of open files. +Also, spreading the traffic among temporary files keeps the files shorter, +and shorter files are more efficiently implemented by the UNIX +I/O subsystem. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +is an array of fixed-length blocks, each of which contains +between 1 and 4096 characters of active data. +(The block size of our UNIX file system is 4096 bytes.) +The block addresses within the temporary file and the length of each +block are stored in a +</span><span style="font-size: 10pt"><tt>List</tt></span><span style="font-size: 10pt">. +When changes are made the live part of blocks may change size. +Blocks are created and coalesced when necessary to try to keep the sizes +between 2048 and 4096 bytes. 
+An actively changing part of the +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +therefore typically has about a kilobyte of slop that can be +inserted or deleted +without changing more than one block or affecting the block order. +When an insertion would overflow a block, the block is split, a new one +is allocated to receive the overflow, and the memory-resident list of blocks +is rearranged to reflect the insertion of the new block. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Obviously, going to the disc for every modification to the file is +prohibitively expensive. +The data type +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt"> +consists of a +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +to hold the data and a +</span><span style="font-size: 10pt"><tt>String</tt></span><span style="font-size: 10pt"> +that acts as a cache. +This is the first of a series of caches throughout the data structures in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +The caches not only improve performance; they also provide a way to organize +the flow of data, particularly in the communication between the host +and terminal. +This idea is developed below, in the section on communications. 
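+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The block bookkeeping for a <tt>Disc</tt> described above can be sketched in terms of block lengths alone.
+The names <tt>len</tt>, <tt>nblk</tt>, <tt>discinsert</tt>, and <tt>MAXBLOCKS</tt> are hypothetical,
+and dividing the bytes evenly on a split is an assumption made to keep both blocks between 2048 and 4096 bytes;
+the text says only that a new block receives the overflow.
+</span></p>

```c
#include <assert.h>

enum { BLOCKSIZE = 4096, MAXBLOCKS = 64 };

/* Memory-resident bookkeeping only; the data itself would live on disc. */
typedef struct Disc {
	int len[MAXBLOCKS];	/* active bytes in each block, in file order */
	int nblk;		/* number of blocks in use */
} Disc;

/*
 * Account for inserting n bytes (0 < n <= BLOCKSIZE) into block b.
 * Returns 1 on success, 0 if the block list is full.
 */
int
discinsert(Disc *d, int b, int n)
{
	int total = d->len[b] + n;
	int i;

	if(total <= BLOCKSIZE){
		d->len[b] = total;	/* absorbed by the slop in the block */
		return 1;
	}
	if(d->nblk == MAXBLOCKS)
		return 0;
	/* split: open a slot after b, leaving the order of the rest intact */
	for(i = d->nblk; i > b+1; i--)
		d->len[i] = d->len[i-1];
	d->nblk++;
	/* total is between 4097 and 8192, so each half is 2048..4096 */
	d->len[b] = total / 2;
	d->len[b+1] = total - total / 2;
	return 1;
}
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">An insertion that fits in the slop changes a single length; one that overflows adds a block after the affected one and leaves the order of the others intact.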
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">To reduce disc traffic, changes to a +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt"> +are mediated by a variable-length string, in memory, that acts as a cache. +When an insertion or deletion is made to a +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">, +if the change can be accommodated by the cache, it is done there. +If the cache becomes bigger than a block because of an insertion, +some of it is written to the +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +and deleted from the cache. +If the change does not intersect the cache, the cache is flushed. +The cache is only loaded at the new position if the change is smaller than a block; +otherwise, it is sent directly to the +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt">. +This is because +large changes are typically sequential, +whereupon the next change is unlikely to overlap the current one. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">A +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +comprises a +</span><span style="font-size: 10pt"><tt>String</tt></span><span style="font-size: 10pt"> +to hold the file name and some ancillary data such as dot and the modified bit. +The most important components, though, are a pair of +</span><span style="font-size: 10pt"><tt>Buffers</tt></span><span style="font-size: 10pt">, +one called the transcript and the other the contents. 
+Their use is described in the next section. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The overall structure is shown in Figure 5. +Although it may seem that the data is touched many times on its +way from the +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt">, +it is read (by one UNIX system call) directly into the cache of the +associated +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">; +no extra copy is done. +Similarly, when flushing the cache, the text is written +directly from the cache to disc. +Most operations act directly on the text in the cache. +A principle applied throughout +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is that the fewer times the data is copied, the faster the program will run +(see also the paper by Waite<sup></sup></span><sup><span style="font-size: 6pt">15</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt">). +</span><span style="font-size: 10pt"></span></p><center><img src="sam2.png"></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 5. File data structures. +The temporary files are stored in the standard repository for such files +on the host system. 
+</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The contents of a +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +are accessed by a routine that +copies to a buffer a substring of a file starting at a specified offset. +To read a byte at a time, a +per-</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +array is loaded starting from a specified initial position, +and bytes may then be read from the array. +The implementation is done by a macro similar to the C standard I/O +</span><span style="font-size: 10pt"><tt>getc</tt></span><span style="font-size: 10pt"> +macro.<sup></sup></span><sup><span style="font-size: 6pt">14</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +Because the reading may be done at any address, a minor change to the +macro allows the file to be read backwards. +This array is read-only; there is no +</span><span style="font-size: 10pt"><tt>putc</tt></span><span style="font-size: 10pt">. 
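The getc-style access described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not sam's code: the class name `FileReader`, the method names `load`, `seek`, `getc` and `bgetc`, and the window size are all assumptions, and a Python `bytes` object stands in for the contents `Buffer`.

```python
class FileReader:
    """Read-only, getc-style access to a file through a small window array."""

    def __init__(self, contents: bytes, window: int = 8):
        self.contents = contents  # stands in for the File's contents Buffer
        self.window = window      # size of the per-File array (assumed value)
        self.base = 0             # file offset of array[0]
        self.array = b""          # slice of the file currently loaded
        self.pos = 0              # file offset of the next byte to read

    def load(self, start: int) -> None:
        """Copy a substring of the file, starting at `start`, into the array."""
        self.base = max(0, start)
        self.array = self.contents[self.base:self.base + self.window]

    def seek(self, pos: int) -> None:
        """Set the initial position and load the array there."""
        self.pos = max(0, min(pos, len(self.contents)))
        self.load(self.pos)

    def _loaded(self, pos: int) -> bool:
        return self.base <= pos < self.base + len(self.array)

    def getc(self) -> int:
        """Return the byte at pos and advance; -1 at end of file."""
        if self.pos >= len(self.contents):
            return -1
        if not self._loaded(self.pos):
            self.load(self.pos)
        c = self.array[self.pos - self.base]
        self.pos += 1
        return c

    def bgetc(self) -> int:
        """Return the byte before pos and step backwards; -1 at start of file."""
        if self.pos <= 0:
            return -1
        self.pos -= 1
        if not self._loaded(self.pos):
            # Load a window that ends at pos, so later backward reads hit it.
            self.load(self.pos - self.window + 1)
        return self.array[self.pos - self.base]
```

There is deliberately no `putc`: the array is only a read window, and all modifications go through the change mechanism.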
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Doing and undoing +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +has an unusual method for managing changes to files. +The command language makes it easy to specify multiple variable-length changes +to a file millions of bytes long, and such changes +must be made efficiently if the editor is to be practical. +The usual techniques for inserting and deleting strings +are inadequate under these conditions. +The +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +data structures are designed for efficient random access to long strings, +but care must be taken to avoid super-linear behavior when making +many changes simultaneously. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +uses a two-pass algorithm for making changes, and treats each file as a database +against which transactions are registered. +Changes are not made directly to the contents. 
+Instead, when a command is started, a ‘mark’ containing +a sequence number is placed in the transcript +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">, +and each change made to the file, either an insertion or deletion +or a change to the file name, +is appended to the end of the transcript. +When the command is complete, the transcript is rewound to the +mark and applied to the contents. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">One reason for separating evaluation from +application in this way is to simplify tracking the addresses of changes +made in the middle of a long sequence. +The two-pass algorithm also allows all changes to apply to the +</span><span style="font-size: 10pt"><i>original</i></span><span style="font-size: 10pt"> +data: no change can affect another change made in the same command. +This is particularly important when evaluating an +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command because it prevents regular expression matches +from stumbling over changes made earlier in the execution. +Also, the two-pass +algorithm is cleaner than the way other UNIX editors allow changes to +affect each other; +for example, +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s +idioms to do things like delete every other line +depend critically on the implementation. +Instead, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +simple model, in which all changes in a command occur effectively +simultaneously, is easy to explain and to understand. 
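As a rough illustration of the two-pass idea, the following sketch first records every change against the original text and only then applies them. It is not sam's implementation: the function names `evaluate` and `apply_transcript` and the tuple layout `(start, end, replacement)` are assumptions made for the example, and a Python list stands in for the transcript `Buffer`.

```python
import re


def evaluate(text, pattern, replacement):
    """Pass 1: record changes against the ORIGINAL text; nothing is modified."""
    transcript = []
    for m in re.finditer(pattern, text):
        # Each record means: replace text[start:end] with `replacement`.
        transcript.append((m.start(), m.end(), replacement))
    return transcript


def apply_transcript(text, transcript):
    """Pass 2: apply the recorded changes in one left-to-right sweep."""
    pieces = []
    last = 0  # end of the previous change, in original coordinates
    for start, end, new in transcript:
        if start < last:
            raise ValueError("changes must be at increasing positions")
        pieces.append(text[last:start])  # untouched text before this change
        pieces.append(new)               # the replacement itself
        last = end
    pieces.append(text[last:])           # untouched tail
    return "".join(pieces)


transcript = evaluate("banana", "a", "aa")
print(apply_transcript("banana", transcript))  # prints baanaanaa
```

Because matching happens entirely in the first pass, the pattern never sees text inserted by an earlier change, which is the property the paragraph describes for the `x` command.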
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The records in the transcript are of the form ‘‘delete substring from +locations +123 to 456’’ and ‘‘insert 11 characters ‘hello there’ at location 789.’’ +(It is an error if the changes are not at monotonically greater +positions through the file.) +While the update is occurring, these numbers must be +offset by earlier changes, but that is straightforward and +local to the update routine; +moreover, all the numbers have been computed +before the first is examined. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Treating the file as a transaction system has another advantage: +undo is trivial. +All it takes is to invert the transcript after it has been +implemented, converting insertions +into deletions and vice versa, and saving them in a holding +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">. +The ‘do’ transcript can then be deleted from +the transcript +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt"> +and replaced by the ‘undo’ transcript. +If an undo is requested, the transcript is rewound and the undo transcript +executed. +Because the transcript +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt"> +is not truncated after each command, it accumulates +successive changes. +A sequence of undo commands +can therefore back up the file arbitrarily, +which is more helpful than the more commonly implemented self-inverse form of undo. 
+(</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +provides no way to undo an undo, but if it were desired, +it would be easy to provide by re-interpreting the ‘do’ transcript.) +Each mark in the transcript contains a sequence number and the offset into +the transcript of the previous mark, to aid in unwinding the transcript. +Marks also contain the value of dot and the modified bit so these can be +restored easily. +Undoing multiple files is easy; it merely demands undoing all files whose +latest change has the same sequence number as the current file. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Another benefit of having a transcript is that errors encountered in the middle +of a complicated command need not leave the files in an intermediate state. +By rewinding the transcript to the mark beginning the command, +the partial command can be trivially undone. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">When the update algorithm was first implemented, it was unacceptably slow, +so a cache was added to coalesce nearby changes, +replacing multiple small changes by a single larger one. +This reduced the number +of insertions into the transaction +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">, +and made a dramatic improvement in performance, +but made it impossible +to handle changes in non-monotonic order in the file; the caching method +only works if changes don’t overlap. 
+Before the cache was added, the transaction could in principle be sorted +if the changes were out of order, although +this was never done. +The current status is therefore acceptable performance with a minor +restriction on global changes, which is sometimes, but rarely, an annoyance. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The update algorithm obviously paws the data more than simpler +algorithms, but it is not prohibitively expensive; +the caches help. +(The principle of avoiding copying the data is still honored here, +although not as piously: +the data is moved from contents’ cache to +the transcript’s all at once and through only one internal buffer.) +Performance figures confirm the efficiency. +To read from a dead start a hundred kilobyte file on a VAX-11/750 +takes 1.4 seconds of user time, 2.5 seconds of system time, +and 5 seconds of real time. +Reading the same file in +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +takes 6.0 seconds of user time, 1.7 seconds of system time, +and 8 seconds of real time. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +uses about half the CPU time. +A more interesting example is the one stated above: +inserting a character between every pair of characters in the file. 
+The +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +command is +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>,y/@/ a/x/</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">and takes 3 CPU seconds per kilobyte of input file, of which +about a third is spent in the regular expression code. +This translates to about 500 changes per second. +</span><span style="font-size: 10pt"><tt>Ed</tt></span><span style="font-size: 10pt"> +takes 1.5 seconds per kilobyte to make a similar change (ignoring newlines), +but cannot undo it. +The same example in +</span><span style="font-size: 10pt"><tt>ex</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">9</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +a variant of +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +done at the University of California at Berkeley, +which allows one level of undoing, again takes 3 seconds. +In summary, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +performance is comparable to that of other UNIX editors, although it solves +a harder problem. 
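The undo mechanism described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not sam's code: the name `update` and the record layout `(start, end, replacement)` are invented for the example, and marks, sequence numbers, dot and the modified bit are omitted. The update routine applies a transcript and, as it goes, builds the inverted transcript that would restore the previous text.

```python
def update(text, transcript):
    """Apply `transcript` to `text`; return the new text and the inverse transcript.

    Each record (start, end, new) replaces text[start:end] with `new`.
    Positions refer to the text as it was before this update, and records
    must be at increasing positions.
    """
    pieces, undo = [], []
    last = 0   # end of the previous change, in old coordinates
    delta = 0  # net growth caused by earlier changes
    for start, end, new in transcript:
        if start < last:
            raise ValueError("changes must be at increasing positions")
        pieces.append(text[last:start])
        pieces.append(new)
        # Inverse record, in coordinates of the NEW text: remove what was
        # inserted and put back what was deleted.
        undo.append((start + delta, start + delta + len(new), text[start:end]))
        delta += len(new) - (end - start)
        last = end
    pieces.append(text[last:])
    return "".join(pieces), undo


history = []  # accumulated undo transcripts, most recent last
text = "hello there world"
text, inverse = update(text, [(0, 5, ""), (11, 11, ", big")])
history.append(inverse)
print(repr(text))             # prints ' there, big world'
text, _ = update(text, history.pop())
print(repr(text))             # prints 'hello there world'
```

Since undoing is just another call to the same update routine, keeping a list of inverse transcripts gives multi-level undo without extra machinery, in the spirit of the accumulated transcript described above.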
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Communications +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The discussion so far has described the implementation of the host part of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">; +the next few sections explain how a machine with mouse and bitmap display +can be engaged to improve interaction. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is not the first editor to be written as two processes,<sup></sup></span><sup><span style="font-size: 6pt">16</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +but its implementation +has some unusual aspects. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">There are several ways +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +host and terminal parts may be connected. +The first and simplest is to forgo the terminal part and use the host +part’s command language to edit text on an ordinary terminal. +This mode is invoked by starting +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +with the +</span><span style="font-size: 10pt"><tt>-d</tt></span><span style="font-size: 10pt"> +option. 
+With no options, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +runs separate host and terminal programs, +communicating with a message protocol over the physical +connection that joins them. +Typically, the connection is an RS-232 link between a Blit +(the prototypical display for +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">) +and a host running +the Ninth Edition of the UNIX operating system.<sup></sup></span><sup><span style="font-size: 6pt">8</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +(This is the version of the system used in the Computing Sciences Research +Center at AT&amp;T Bell Laboratories [now Lucent Technologies, Bell Labs], where I work. Its relevant +aspects are discussed in the Blit paper.<sup></sup></span><sup><span style="font-size: 6pt">1</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt">) +The implementation of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +for the SUN computer runs both processes on the same machine and +connects them by a pipe. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The low bandwidth of an RS-232 link +necessitated the split between +the two programs. +The division is a mixed blessing: +a program in two parts is much harder to write and to debug +than a self-contained one, +but the split makes several unusual configurations possible. +The terminal may be physically separated from the host, allowing the conveniences +of a mouse and bitmap display to be taken home while leaving the files at work. 
+It is also possible to run the host part on a remote machine: +</span></p><p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.1em; margin-left: 1.28in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 9pt"><tt>sam -r host</tt></span></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> + +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">connects to the terminal in the usual way, and then makes a call +across the network to establish the host part of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +on the named machine. +Finally, it cross-connects the I/O to join the two parts. +This allows +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +to be run on machines that do not support bitmap displays; +for example, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is the editor of choice on our Cray X-MP/24. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>-r</tt></span><span style="font-size: 10pt"> +involves +</span><span style="font-size: 10pt"><i>three</i></span><span style="font-size: 10pt"> +machines: the remote host, the terminal, and the local host. +The local host’s job is simple but vital: it passes the data +between the remote host and terminal. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The host and terminal exchange messages asynchronously +(rather than, say, as remote procedure calls) but there is no +error detection or correction +because, whatever the configuration, the connection is reliable. +Because the terminal handles mundane interaction tasks such as +popping up menus and interpreting the responses, the messages are about +data, not actions. +For example, the host knows nothing about what is displayed on the screen, +and when the user types a character, the message sent to the host says +‘‘insert a one-byte string at location 123 in file 7,’’ not ‘‘a character +was typed at the current position in the current file.’’ +In other words, the messages look very much like the transaction records +in the transcripts. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Either the host or terminal part of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +may initiate a change to a file. +The command language operates on the host, while typing and some +mouse operations are executed directly in the terminal to optimize response. +Changes initiated by the host program must be transmitted to the terminal, +and +vice versa. +(A token is exchanged to determine which end is in control, +which means that characters typed while a time-consuming command runs +must be buffered and do not appear until the command is complete.) 
+To maintain consistent information, +the host and terminal track changes through a per-file +data structure that records what portions of the file +the terminal has received. +The data structure, called a +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +(a weak pun: it’s a file with holes) +is held and updated by both the host and terminal. +A +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +is a list of +</span><span style="font-size: 10pt"><tt>Strings</tt></span><span style="font-size: 10pt"> +holding those parts of the file known to the terminal, +separated by counts of the number of bytes in the interstices. +Of course, the host doesn’t keep a separate copy of the data (it only needs +the lengths of the various pieces), +but the structure is the same on both ends. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +in the terminal doubles as a cache. +Since the terminal keeps the text for portions of the file it has displayed, +it need not request data from the host when revisiting old parts of the file +or redrawing obscured windows, which speeds things up considerably +over low-speed links. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">It’s trivial for the terminal to maintain its +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt">, +because all changes made on the terminal apply to parts of the file +already loaded there. 
+Changes made by the host are compared against the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +during the update sequence after each command. +Small changes to pieces of the file loaded in the terminal +are sent in their entirety. +Larger changes, and changes that fall entirely in the holes, +are transmitted as messages without literal data: +only the lengths of the deleted and inserted strings are transmitted. +When a command is completed, the terminal examines its visible +windows to see if any holes in their +</span><span style="font-size: 10pt"><tt>Rasps</tt></span><span style="font-size: 10pt"> +intersect the visible portion of the file. +It then requests the missing data from the host, +along with up to 512 bytes of surrounding data, to minimize +the number of messages when visiting a new portion of the file. +This technique provides a kind of two-level lazy evaluation for the terminal. +The first level sends a minimum of information about +parts of the file not being edited interactively; +the second level waits until a change is displayed before +transmitting the new data. +Of course, +performance is also helped by having the terminal respond immediately to typing +and simple mouse requests. +Except for small changes to active pieces of the file, which are +transmitted to the terminal without negotiation, +the terminal is wholly responsible for deciding what is displayed; +the host uses the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +only to tell the terminal what might be relevant. 
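The `Rasp` and the terminal's request for missing data can be sketched as follows. This is an illustration only, not sam's data structure: a Python list whose elements are either `bytes` (text known to the terminal) or `int` (the byte count of a hole) stands in for the `Rasp`, the function names `missing`, `request_range` and `fill` are invented, and applying the 512-byte allowance on each side of the visible range is an assumption about how the surrounding data is chosen.

```python
def piece_len(piece):
    """Length in bytes of a rasp element: a known string or a hole count."""
    return piece if isinstance(piece, int) else len(piece)


def missing(rasp, lo, hi):
    """Return (start, end) ranges of holes that intersect the range [lo, hi)."""
    holes = []
    pos = 0
    for piece in rasp:
        n = piece_len(piece)
        if isinstance(piece, int) and pos < hi and pos + n > lo:
            holes.append((pos, pos + n))
        pos += n
    return holes


def request_range(hole, lo, hi, slop=512):
    """Clip a hole to the visible range widened by up to `slop` bytes per side."""
    start, end = hole
    return max(start, lo - slop), min(end, hi + slop)


def fill(rasp, start, data):
    """Return a new rasp with `data` stored at `start`, which must lie in a hole."""
    out = []
    pos = 0
    end = start + len(data)
    for piece in rasp:
        n = piece_len(piece)
        if isinstance(piece, int) and pos <= start and end <= pos + n:
            if start > pos:
                out.append(start - pos)    # hole remaining before the data
            out.append(data)
            if pos + n > end:
                out.append(pos + n - end)  # hole remaining after the data
        else:
            out.append(piece)
        pos += n
    return out


rasp = [b"hello", 5000, b"world"]
hole = missing(rasp, 2000, 2100)[0]
lo, hi = request_range(hole, 2000, 2100)
rasp = fill(rasp, lo, b"x" * (hi - lo))   # pretend the host sent this text
print(missing(rasp, 2000, 2100))          # prints []
```

The host side would hold the same list shape with only lengths in place of the strings; this sketch does not merge adjacent known pieces, which a fuller version would probably do.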
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">When a change is initiated by the host, +the messages to the terminal describing the change +are generated by the routine that applies the transcript of the changes +to the contents of the +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt">. +Since changes are undone by the same update routine, +undoing requires +no extra code in the communications; +the usual messages describing changes to the file are sufficient +to back up the screen image. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +is a particularly good example of the way caches are used in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +First, it facilitates access to the active portion of the text by placing +the busy text in main memory. +In so doing, it provides efficient access +to a large data structure that does not fit in memory. +Since the form of data is to be imposed by the user, not by the program, +and because characters will frequently be scanned sequentially, +files are stored as flat objects. +Caches help keep performance good and linear when working with such +data. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Second, the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +and several of the other caches have some +</span><span style="font-size: 10pt"><i>read-ahead;</i></span><span style="font-size: 10pt"> +that is, the cache is loaded with more information than is needed for +the job immediately at hand. +When manipulating linear structures, the accesses are usually sequential, +and read-ahead can significantly reduce the average time to access the +next element of the object. +Sequential access is a common mode for people as well as programs; +consider scrolling through a document while looking for something. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Finally, like any good data structure, +the cache guides the algorithm, or at least the implementation. +The +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +was actually invented to control the communications between the host and +terminal parts, but I realized very early that it was also a form of +cache. Other caches were more explicitly intended to serve a double +purpose: for example, the caches in +</span><span style="font-size: 10pt"><tt>Files</tt></span><span style="font-size: 10pt"> +that coalesce updates not only reduce traffic to the +transcript and contents +</span><span style="font-size: 10pt"><tt>Buffers</tt></span><span style="font-size: 10pt">, +they also clump screen updates so that complicated changes to the +screen are achieved in +just a few messages to the terminal. 
+This saved me considerable work: I did not need to write special +code to optimize the message traffic to the +terminal. +Caches pay off in surprising ways. +Also, they tend to be independent, so their performance improvements +are multiplicative. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Data structures in the terminal +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The terminal’s job is to display and to maintain a consistent image of +pieces of the files being edited. +Because the text is always in memory, the data structures are +considerably simpler than those in the host part. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +typically has far more windows than does +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">, +the window system within which its Blit implementation runs. +</span><span style="font-size: 10pt"><tt>Mux</tt></span><span style="font-size: 10pt"> +has a fairly small number of asynchronously updated windows; +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +needs a large number of synchronously updated windows that are +usually static and often fully obscured. 
+The different tradeoffs guided +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +away from the memory-intensive implementation of windows, called +</span><span style="font-size: 10pt"><tt>Layers</tt></span><span style="font-size: 10pt">,<sup></sup></span><sup><span style="font-size: 6pt">17</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +used in +</span><span style="font-size: 10pt"><tt>mux.</tt></span><span style="font-size: 10pt"> +Rather than depending on a complete bitmap image of the display for each window, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +regenerates the image from its in-memory text +(stored in the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt">) +when necessary, although it will use such an image if it is available. +Like +</span><span style="font-size: 10pt"><tt>Layers</tt></span><span style="font-size: 10pt">, +though, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +uses the screen bitmap as active storage in which to update the image using +</span><span style="font-size: 10pt"><tt>bitblt</tt></span><span style="font-size: 10pt">.<sup></sup></span><sup><span style="font-size: 6pt">18,19</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +The resulting organization, pictured in Figure 6, +has a global array of windows, called +</span><span style="font-size: 10pt"><tt>Flayers</tt></span><span style="font-size: 10pt">, +each of which holds an image of a piece of text held in a data structure +called a +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt">, +which in turn represents +a rectangular window full of text displayed in some +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt">. 
+Each +</span><span style="font-size: 10pt"><tt>Flayer</tt></span><span style="font-size: 10pt"> +appears in a global list that orders them all front-to-back +on the display, and simultaneously as an element of a per-file array +that holds all the open windows for that file. +The complement in the terminal of the +</span><span style="font-size: 10pt"><tt>File</tt></span><span style="font-size: 10pt"> +on the host is called a +</span><span style="font-size: 10pt"><tt>Text</tt></span><span style="font-size: 10pt">; +each connects its +</span><span style="font-size: 10pt"><tt>Flayers</tt></span><span style="font-size: 10pt"> +to the associated +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt">. +</span><span style="font-size: 10pt"></span></p><center><img src="sam3.png"></center> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 6. Data structures in the terminal. +</i></span><span style="font-size: 8pt"><tt>Flayers</tt></span><span style="font-size: 8pt"><i> +are also linked together into a front-to-back list. +</i></span><span style="font-size: 8pt"><tt>Boxes</tt></span><span style="font-size: 8pt"><i> +are discussed in the next section. +</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +for a +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +contains the image of the text. 
+For a fully visible window, the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +will be the screen (or at least the +</span><span style="font-size: 10pt"><tt>Layer</tt></span><span style="font-size: 10pt"> +in which +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is being run), +while for partially obscured windows the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +will be off-screen. +If the window is fully obscured, the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +will be null. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +is a kind of cache. +When making changes to the display, most of the original image will +look the same in the final image, and the update algorithms exploit this. +The +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +software updates the image in the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +incrementally; the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +is not just an image, it is a data structure.<sup></sup></span><sup><span style="font-size: 6pt">18,19</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +The job of the software that updates the display is therefore +to use as much as possible of the existing image (converting the +text from ASCII characters to pixels is expensive) in a sort of two-dimensional +string insertion algorithm. +The details of this process are described in the next section. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +software has no code to support overlapping windows; +its job is to keep a single +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +up to date. +It falls to the +</span><span style="font-size: 10pt"><tt>Flayer</tt></span><span style="font-size: 10pt"> +software to multiplex the various +</span><span style="font-size: 10pt"><tt>Bitmaps</tt></span><span style="font-size: 10pt"> +onto the screen. +The problem of maintaining overlapping +</span><span style="font-size: 10pt"><tt>Flayers</tt></span><span style="font-size: 10pt"> +is easier than for +</span><span style="font-size: 10pt"><tt>Layers</tt></span><span style="font-size: 10pt"><sup></sup></span><sup><span style="font-size: 6pt">17</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> +because changes are made synchronously and because the contents of the window +can be reconstructed from the data stored in the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt">; +the +</span><span style="font-size: 10pt"><tt>Layers</tt></span><span style="font-size: 10pt"> +software +makes no such assumptions. +In +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">, +the window being changed is almost always fully visible, because the current +window is always fully visible, by construction. +However, when multi-file changes are being made, or when +more than one window is open on a file, +it may be necessary to update partially obscured windows. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">There are three cases: the window is +fully visible, invisible (fully obscured), or partially visible. +If fully visible, the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +is part of the screen, so when the +</span><span style="font-size: 10pt"><tt>Flayer</tt></span><span style="font-size: 10pt"> +update routine calls the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +update routine, the screen will be updated directly. +If the window is invisible, +there is no associated +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt">, +and all that is necessary is to update the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +data structure, not the image. +If the window is partially visible, the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +routine is called to update the image in the off-screen +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt">, +which may require regenerating it from the text of the window. +The +</span><span style="font-size: 10pt"><tt>Flayer</tt></span><span style="font-size: 10pt"> +code then clips this +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +against the +</span><span style="font-size: 10pt"><tt>Bitmaps</tt></span><span style="font-size: 10pt"> +of all +</span><span style="font-size: 10pt"><tt>Frames</tt></span><span style="font-size: 10pt"> +in front of the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +being modified, and the remainder is copied to the display. 
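+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The three cases, and the clipping done for a partially visible window, can be sketched roughly as follows.
+This is an illustration in Python rather than the C of the terminal code, and it models windows as bare rectangles with no pixels;
+the names <tt>Rect</tt>, <tt>intersect</tt>, <tt>subtract</tt>, <tt>visible_parts</tt> and <tt>classify</tt> are assumptions made for the example and do not come from the source.
+</span></p>

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    """Half-open rectangle covering [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    def empty(self) -> bool:
        return self.x0 >= self.x1 or self.y0 >= self.y1


def intersect(a: Rect, b: Rect) -> Rect:
    """Overlap of a and b; the result may be empty."""
    return Rect(max(a.x0, b.x0), max(a.y0, b.y0),
                min(a.x1, b.x1), min(a.y1, b.y1))


def subtract(a: Rect, b: Rect) -> List[Rect]:
    """Return the pieces of a not covered by b (at most four rectangles)."""
    c = intersect(a, b)
    if c.empty():
        return [a]
    pieces = [
        Rect(a.x0, a.y0, a.x1, c.y0),  # strip above the overlap
        Rect(a.x0, c.y1, a.x1, a.y1),  # strip below the overlap
        Rect(a.x0, c.y0, c.x0, c.y1),  # strip left of the overlap
        Rect(c.x1, c.y0, a.x1, c.y1),  # strip right of the overlap
    ]
    return [p for p in pieces if not p.empty()]


def visible_parts(window: Rect, in_front: List[Rect]) -> List[Rect]:
    """Clip window against every rectangle that lies in front of it."""
    parts = [window]
    for f in in_front:
        parts = [piece for p in parts for piece in subtract(p, f)]
    return parts


def classify(window: Rect, in_front: List[Rect]) -> str:
    """Name which of the three cases applies (hypothetical helper)."""
    parts = visible_parts(window, in_front)
    if not parts:
        return "invisible"          # only the text structure needs updating
    if parts == [window]:
        return "fully visible"      # the update can go straight to the screen
    return "partially visible"      # update off-screen, then copy parts
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">In this model, a window at (0,0)-(10,10) partly covered by one at (5,5)-(15,15) leaves two visible rectangles;
+these stand for the remainder that would be copied to the display after the off-screen image is updated.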
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">This is much faster than recreating the image off-screen +for every change, or clipping all the changes made to the image +during its update. +Unfortunately, these caches can also consume prohibitive amounts of +memory, so they are freed fairly liberally — after every change to the +front-to-back order of the +</span><span style="font-size: 10pt"><tt>Flayers</tt></span><span style="font-size: 10pt">. +The result is that +the off-screen +</span><span style="font-size: 10pt"><tt>Bitmaps</tt></span><span style="font-size: 10pt"> +exist only while multi-window changes are occurring, +which is the only time the performance improvement they provide is needed. +Also, the user interface causes fully-obscured windows to be the +easiest to make — +creating a canonically sized and placed window requires only a button click +— which reduces the need for caching still further. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Screen update +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Only two low-level primitives are needed for incremental update: +</span><span style="font-size: 10pt"><tt>bitblt</tt></span><span style="font-size: 10pt">, +which copies rectangles of pixels, and +</span><span style="font-size: 10pt"><tt>string</tt></span><span style="font-size: 10pt"> +(which in turn calls +</span><span style="font-size: 10pt"><tt>bitblt</tt></span><span style="font-size: 10pt">), +which draws a null-terminated character string in a +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt">. +A +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +contains a list of +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt">, +each of which defines a horizontal strip of text in the window +(see Figure 7). +A +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +has a character string +</span><span style="font-size: 10pt"><tt>str</tt></span><span style="font-size: 10pt">, +and a +</span><span style="font-size: 10pt"><tt>Rectangle</tt></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>rect</tt></span><span style="font-size: 10pt"> +that defines the location of the strip in the window. 
+(The text in +</span><span style="font-size: 10pt"><tt>str</tt></span><span style="font-size: 10pt"> +is stored in the +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +separately from the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +associated with the window’s file, so +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +are self-contained.) +The invariant is that +the image of the +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +can be reproduced by calling +</span><span style="font-size: 10pt"><tt>string</tt></span><span style="font-size: 10pt"> +with argument +</span><span style="font-size: 10pt"><tt>str</tt></span><span style="font-size: 10pt"> +to draw the string in +</span><span style="font-size: 10pt"><tt>rect</tt></span><span style="font-size: 10pt">, +and the resulting picture fits perfectly within +</span><span style="font-size: 10pt"><tt>rect</tt></span><span style="font-size: 10pt">. +In other words, the +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +define the tiling of the window. +The tiling may be complicated by long lines of text, which +are folded onto the next line. +Some editors use horizontal scrolling to avoid this complication, +but to be comfortable this technique requires that lines not be +</span><span style="font-size: 10pt"><i>too</i></span><span style="font-size: 10pt"> +long; +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +has no such restriction. +Also, and perhaps more importantly, UNIX programs and terminals traditionally fold +long lines to make their contents fully visible. 
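+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">As a rough model of the tiling and its invariant, consider the following Python sketch.
+It assumes a fixed-width font in which every character occupies one cell, and it handles only plain text, folding long lines at the window width.
+The names <tt>Box</tt> (as a Python class), <tt>tile</tt> and <tt>fits</tt> are hypothetical and are not taken from the source.
+</span></p>

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """A horizontal strip of text: its characters and where they are drawn."""
    text: str                        # plays the role of str in the paper
    rect: Tuple[int, int, int, int]  # (x0, y0, x1, y1), in character cells


def tile(text: str, width: int) -> List[Box]:
    """Fold plain text into Boxes that tile a window `width` cells wide.

    Assumes every character is one cell wide and every line one cell tall;
    a real font would need per-character widths.
    """
    if width < 1:
        raise ValueError("width must be positive")
    boxes = []
    line = 0
    for start in range(0, len(text), width):
        chunk = text[start:start + width]
        # The strip begins at the left edge and is exactly as wide as its text.
        boxes.append(Box(chunk, (0, line, len(chunk), line + 1)))
        line += 1
    return boxes


def fits(box: Box) -> bool:
    """Check the invariant: drawing text at rect fills rect exactly."""
    x0, y0, x1, y1 = box.rect
    return x1 - x0 == len(box.text) and y1 - y0 == 1
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">For example, <tt>tile("abcdefghij", 4)</tt> yields three strips holding <tt>abcd</tt>, <tt>efgh</tt> and <tt>ij</tt>, one per folded line, and <tt>fits</tt> holds for each of them.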
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Two special kinds of +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +contain a single +character: either a newline or a tab. +Newlines and tabs are white space. +A newline +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +always extends to the right edge of the window, +forcing the following +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +to the next line. +The width of a tab depends on where it is located: +it forces the next +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +to begin at a tab location. +Tabs also +have a minimum width equivalent to a blank (blanks are +drawn by +</span><span style="font-size: 10pt"><tt>string</tt></span><span style="font-size: 10pt"> +and are not treated specially); newlines have a minimum width of zero. +</span><span style="font-size: 10pt"></span></p><center><img src="sam4.png"></center> +</center> +<p style="margin-top: 0; margin-bottom: 0.08in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 8pt"><i>Figure 7. A line of text showing its +</i></span><span style="font-size: 8pt"><tt>Boxes</tt></span><span style="font-size: 8pt"><i>. +The first two blank +</i></span><span style="font-size: 8pt"><tt>Boxes</tt></span><span style="font-size: 8pt"><i> +contain tabs; the last contains a newline. +Spaces are handled as ordinary characters. 
+</i></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="margin-top: 0; margin-bottom: 0.02in"></p> + +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The update algorithms always use the +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt"> +image of the text (either the display or cache +</span><span style="font-size: 10pt"><tt>Bitmap</tt></span><span style="font-size: 10pt">); +they never examine the characters within a +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +except when the +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +needs to be split in two. +Before a change, the window consists of a tiling of +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt">; +after the change the window is tiled differently. +The update algorithms rearrange the tiles in place, without +backup storage. +The algorithms are not strictly optimal — for example, they can +clear a pixel that is later going to be written upon — +but they never move a tile that doesn’t need to be moved, +and they move each tile at most once. +</span><span style="font-size: 10pt"><tt>Frinsert</tt></span><span style="font-size: 10pt"> +on a Blit can absorb over a thousand characters a second if the strings +being inserted are a few tens of characters long. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Consider +</span><span style="font-size: 10pt"><tt>frdelete</tt></span><span style="font-size: 10pt">. 
+Its job is to delete a substring from a +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +and restore the image of the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt">. +The image of a substring has a peculiar shape (see Figure 2) comprising +possibly a partial line, +zero or more full lines, +and possibly a final partial line. +For reference, call this the +</span><span style="font-size: 10pt"><i>Z-shape. +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Frdelete</tt></span><span style="font-size: 10pt"> +begins by splitting, if necessary, the +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +containing the ends of +the substring so the substring begins and ends on +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +boundaries. +Because the substring is being deleted, its image is not needed, +so the Z-shape is then cleared. +Then, tiles (that is, the images of +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt">) +are copied, using +</span><span style="font-size: 10pt"><tt>bitblt</tt></span><span style="font-size: 10pt">, +from immediately after the Z-shape to +the beginning of the Z-shape, +resulting in a new Z-shape. +(</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +whose contents would span two lines in the new position must first be split.) 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Copying the remainder of the +</span><span style="font-size: 10pt"><tt>Frame</tt></span><span style="font-size: 10pt"> +tile by tile +this way will clearly accomplish the deletion, but eventually, +typically when the copying algorithm encounters a tab or newline, +the old and new +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +coordinates of the tile +to be copied are the same. +This correspondence implies +that the Z-shape has its beginning and ending edges aligned +vertically, and a sequence of at most two +</span><span style="font-size: 10pt"><tt>bitblts</tt></span><span style="font-size: 10pt"> +can be used to copy the remaining tiles. +The next step is to clear out the resulting empty space at the bottom +of the window; +the number of lines to be cleared is the number of complete lines in the +Z-shape closed by the final +</span><span style="font-size: 10pt"><tt>bitblts.</tt></span><span style="font-size: 10pt"> +The final step is to merge horizontally adjacent +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +of plain text. +The complete source to +</span><span style="font-size: 10pt"><tt>frdelete</tt></span><span style="font-size: 10pt"> +is less than 100 lines of C. 
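+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p>
+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">The bookkeeping side of the deletion (split at both ends, remove, merge) can be sketched in Python.
+This is only a model under stated assumptions: a Box is reduced to its string, tabs and newlines are single-character Boxes, line folding is ignored, and none of the pixel work (clearing the Z-shape, moving tiles with <tt>bitblt</tt>) is represented.
+The names <tt>split_at</tt>, <tt>merge</tt> and <tt>delete_range</tt> are hypothetical and are not the routines of the real source.
+</span></p>

```python
from typing import List

SPECIAL = ("\t", "\n")  # tabs and newlines each get a Box of their own


def split_at(boxes: List[str], pos: int) -> int:
    """Ensure a Box boundary at character offset pos; return its Box index."""
    offset = 0
    for i, text in enumerate(boxes):
        if offset == pos:
            return i                      # already on a boundary
        if len(text) > pos - offset:      # pos falls inside this Box
            cut = pos - offset
            boxes[i:i + 1] = [text[:cut], text[cut:]]
            return i + 1
        offset += len(text)
    return len(boxes)                     # pos is at (or past) the end


def merge(boxes: List[str]) -> List[str]:
    """Join adjacent plain-text Boxes; special Boxes stay separate."""
    out: List[str] = []
    for text in boxes:
        if out and text not in SPECIAL and out[-1] not in SPECIAL:
            out[-1] += text
        else:
            out.append(text)
    return out


def delete_range(boxes: List[str], p0: int, p1: int) -> List[str]:
    """Delete characters [p0, p1) from a list of Boxes (no pixel work)."""
    boxes = list(boxes)                   # leave the caller's list untouched
    first = split_at(boxes, p0)           # deletion now starts on a boundary
    last = split_at(boxes, p1)            # and ends on one
    del boxes[first:last]
    return merge(boxes)
```

+<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;">
+<span style="font-size: 10pt">In this model, deleting offsets 3 through 7 from the Boxes <tt>hello</tt>, tab, <tt>world</tt>, newline splits the first and third Boxes, removes the middle three pieces, and merges what is left into <tt>helrld</tt> followed by the newline Box.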
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Frinsert</tt></span><span style="font-size: 10pt"> +is more complicated because it must do four passes: +one to construct the +</span><span style="font-size: 10pt"><tt>Box</tt></span><span style="font-size: 10pt"> +list for the inserted string, +one to reconnoitre, +one to copy (in opposite order to +</span><span style="font-size: 10pt"><tt>frdelete</tt></span><span style="font-size: 10pt">) +the +</span><span style="font-size: 10pt"><tt>Boxes</tt></span><span style="font-size: 10pt"> +to make the hole for the new text, +and finally one to copy the new text into place. +Overall, though, +</span><span style="font-size: 10pt"><tt>frinsert</tt></span><span style="font-size: 10pt"> +has a similar flavor to +</span><span style="font-size: 10pt"><tt>frdelete</tt></span><span style="font-size: 10pt">, +and needn’t be described further. +</span><span style="font-size: 10pt"><tt>Frinsert</tt></span><span style="font-size: 10pt"> +and its subsidiary routines comprise 211 lines of C. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The terminal source code is 3024 lines of C, +and the host source is 5797 lines. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Discussion +</b></span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>History +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">The immediate ancestor of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +was the original text editor for the Blit, called +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +inherited +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">’s +two-process structure and mouse language almost unchanged, but +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +suffered from several drawbacks that were addressed in the design of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">. +The most important of these was the lack of a command language. +Although +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +was easy to use for simple editing, it provided no direct help with +large or repetitive editing tasks. Instead, it provided a command to pass +selected text through a shell pipeline, +but this was no more satisfactory than could be expected of a stopgap measure. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Jim</tt></span><span style="font-size: 10pt"> +was written primarily as a vehicle for experimenting with a mouse-based +interface to text, and the experiment was successful. +</span><span style="font-size: 10pt"><tt>Jim</tt></span><span style="font-size: 10pt"> +had some spin-offs: +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">, +the second window system for the Blit, is essentially a multiplexed +version of the terminal part of +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">; +and the debugger +</span><span style="font-size: 10pt"><tt>pi</tt></span><span style="font-size: 10pt">’s +user interface<sup></sup></span><sup><span style="font-size: 6pt">20</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> was closely modeled on +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">’s. +But after a couple of years, +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +had become difficult to maintain and limiting to use, +and its replacement was overdue. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">I began the design of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +by asking +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +customers what they wanted. 
+This was probably a mistake; the answers were essentially a list of features +to be found in other editors, which did not provide any of the +guiding principles I was seeking. +For instance, one common request was for a ‘‘global substitute,’’ +but no one suggested how to provide it within a cut-and-paste editor. +I was looking for a scheme that would +support such specialized features comfortably in the context of some +general command language. +Ideas were not forthcoming, though, particularly given my insistence +on removing all limits on file sizes, line lengths and so on. +Even worse, I recognized that, since the mouse could easily +indicate a region of the screen that was not an integral number of lines, +the command language would best forget about newlines altogether, +and that meant the command language had to treat the file as a single +string, not an array of lines. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Eventually, I decided that thinking was not getting me very far and it was +time to try building. +I knew that the terminal part could be built easily — +that part of +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +behaved acceptably well — and that most of the hard work was going +to be in the host part: the file interface, command interpreter and so on. +Moreover, I had some ideas about how the architecture of +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +could be improved without destroying its basic structure, which I liked +in principle but which hadn’t worked out as well as I had hoped. 
+So I began by designing the file data structure, +starting with the way +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +worked — comparable to a single structure merging +</span><span style="font-size: 10pt"><tt>Disc</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>Buffer</tt></span><span style="font-size: 10pt">, +which I split to make the cache more general +— and thinking about how global substitute could be implemented. +The answer was clearly that it had to be done in two passes, +and the transcript-oriented implementation fell out naturally. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +was written bottom-up, +starting from the data structures and algorithms for manipulating text, +through the command language and up to the code for maintaining +the display. +In retrospect, it turned out well, but this implementation method is +not recommended in general. +There were several times when I had a large body of interesting code +assembled and no clue how to proceed with it. +The command language, in particular, took almost a year to figure out, +but can be implemented (given what was there at the beginning of that year) +in a day or two. Similarly, inventing the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +data structure delayed the +connection of the host and terminal pieces by another few months. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +took about two years to write, although only about four months were +spent actually working on it. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Part of the design process was unusual: +the subset of the protocol that maintains the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +was simulated, debugged +and verified by an automatic protocol analyzer,<sup></sup></span><sup><span style="font-size: 6pt">21</span><span style="font-size: 10pt"></span></sup><span style="font-size: 10pt"> and was bug-free +from the start. +The rest of the protocol, concerned mostly +with keeping menus up to date, +was unfortunately too unwieldy for such analysis, +and was debugged by more traditional methods, primarily +by logging in a file all messages in and out of the host. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Reflections +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is essentially the only interactive editor used by the sixty or so members of +the computing science research center in which I work. +The same could not be said of +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">; +the lack of a command language kept some people from adopting it. 
+The union of a user interface as comfortable as +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt">’s +with a command language as powerful as +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s† +</span></p><p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.50in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">is essential to +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +success. +When +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +was first made available to the +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +community, +almost everyone switched to it within two or three days. +In the months that followed, even people who had never adopted +</span><span style="font-size: 10pt"><tt>jim</tt></span><span style="font-size: 10pt"> +started using +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +exclusively. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">To be honest, +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +still gets occasional use, but usually when +something quick needs to be done and the overhead of +downloading the terminal part of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +isn’t worth the trouble. 
+Also, as a ‘line’ editor, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +</span><span style="font-size: 10pt"><tt>-d</tt></span><span style="font-size: 10pt"> +is a bit odd; +when using a good old ASCII terminal, it’s comforting to have +a true line editor. +But it is fair to say that +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +command language has displaced +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s +for most of the complicated editing that has kept line editors +(that is, command-driven editors) with us. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt">’s +command language is even fancier than +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s, +and most +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +customers don’t come near to using all its capabilities. +Does it need to be so sophisticated? +I think the answer is yes, for two reasons. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">First, the +</span><span style="font-size: 10pt"><i>model</i></span><span style="font-size: 10pt"> +for +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +command language is really relatively simple, and certainly simpler than that of +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">. 
+For instance, there is only one kind of textual loop in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +— the +</span><span style="font-size: 10pt"><tt>x</tt></span><span style="font-size: 10pt"> +command — +while +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +has three (the +</span><span style="font-size: 10pt"><tt>g</tt></span><span style="font-size: 10pt"> +command, the global flag on substitutions, and the implicit loop over +lines in multi-line substitutions). +Also, +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s +substitute command is necessary to make changes within lines, but in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +the +</span><span style="font-size: 10pt"><tt>s</tt></span><span style="font-size: 10pt"> +command is more of a familiar convenience than a necessity; +</span><span style="font-size: 10pt"><tt>c</tt></span><span style="font-size: 10pt"> +and +</span><span style="font-size: 10pt"><tt>t</tt></span><span style="font-size: 10pt"> +can do all the work. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Second, +given a community that expects an editor to be about as powerful as +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">, +it’s hard to see how +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +could really be much simpler and still satisfy that expectation. +People want to do ‘‘global substitutes,’’ and most are content +to have the recipe for that and a few other fancy changes. 
+The sophistication of the command language is really just a veneer +over a design that makes it possible to do global substitutes +in a screen editor. +Some people will always want something more, however, and it’s gratifying to +be able to provide it. +The real power of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +command language comes from composability of the operators, which is by +nature orthogonal to the underlying model. +In other words, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is not itself complex, but it makes complex things possible. +If you don’t want to do anything complex, you can ignore the +complexity altogether, and many people do so. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Sometimes I am asked the opposite question: why didn’t I just make +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +a real programmable editor, with macros and variables and so on? +The main reason is a matter of taste: I like the editor +to be the same every time I use it. +There is one technical reason, though: +programmability in editors is largely a workaround for insufficient +interactivity. +Programmable editors are used to make particular, usually short-term, +things easy to do, such as by providing shorthands for common actions. +If things are generally easy to do in the first place, +shorthands are not as helpful. +</span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +makes common editing operations very easy, and the solutions to +complex editing problems seem commensurate with the problems themselves. 
+Also, the ability to edit the +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +window makes it easy to repeat commands — it only takes a mouse button click +to execute a command again. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Pros and cons +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +has several other good points, +and its share of problems. +Among the good things is the idea of +structural regular expressions, +whose usefulness has only begun to be explored. +They were arrived at serendipitously when I attempted to distill the essence of +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s +way of doing global substitution and recognized that the looping command in +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +was implicitly imposing a structure (an array of lines) on the file. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Another of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +good things is its undo capability. +I had never before used an editor with a true undo, +but I would never go back now. 
+Undo +</span><span style="font-size: 10pt"><i>must</i></span><span style="font-size: 10pt"> +be done well, but if it is, it can be relied on. +For example, +it’s safe to experiment if you’re not sure how to write some intricate command, +because if you make a mistake, it can be fixed simply and reliably. +I learned two things about undo from writing +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">: +first, it’s easy to provide if you design it in from the beginning, and +second, it’s necessary, particularly if the system has some subtle +properties that may be unfamiliar or error-prone for users. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt">’s +lack of internal limits and sizes is a virtue. +Because it avoids all fixed-size tables and data structures, +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is able to make global changes to files that some of our other +tools cannot even read. +Moreover, the design keeps the performance linear when doing such +operations, although I must admit +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +does get slow when editing a huge file. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Now, the problems. +Externally, the most obvious is that it is poorly integrated into the +surrounding window system. 
+By design, the user interface in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +feels almost identical to that of +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">, +but a thick wall separates text in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +from the programs running in +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">. +For instance, the ‘snarf buffer’ in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +must be maintained separately from that in +</span><span style="font-size: 10pt"><tt>mux</tt></span><span style="font-size: 10pt">. +This is regrettable, but probably necessary given the unusual configuration +of the system, with a programmable terminal on the far end of an RS-232 link. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"></span><span style="font-size: 10pt"><tt>Sam</tt></span><span style="font-size: 10pt"> +is reliable; otherwise, people wouldn’t use it. +But it was written over such a long time, and has so many new (to me) +ideas in it, that I would like to see it done over again to clean +up the code and remove many of the lingering problems in the implementation. +The worst part is in the interconnection of the host and terminal parts, +which might even be able to go away in a redesign for a more +conventional window system. +The program must be split in two to use the terminal effectively, +but the low bandwidth of the connection forces the separation to +occur in an inconvenient part of the design if performance is to be acceptable. 
+A simple remote procedure call +protocol driven by the host, emitting only graphics +commands, would be easy to write but wouldn’t have nearly the +necessary responsiveness. On the other hand, if the terminal were in control +and requested much simpler file services from the host, regular expression +searches would require that the terminal read the entire file over its RS-232 +link, which would be unreasonably slow. +A compromise in which either end can take control is necessary. +In retrospect, the communications protocol should have been +designed and verified formally, although I do not know of any tool +that can adequately relate the protocol to +its implementation. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Not all of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +users are comfortable with its command language, and few are adept. +Some (venerable) people use a sort of +‘‘</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +subset’’ of +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +command language, +and even ask why +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +command language is not exactly +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s. +(The reason, of course, is that +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +model for text does not include newlines, which are central to +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">. +Making the text an array of lines to the command language would +be too much of a break from the seamless model provided by the mouse. 
+Some editors, such as +</span><span style="font-size: 10pt"><tt>vi</tt></span><span style="font-size: 10pt">, +are willing to make this break, though.) +The difficulty is that +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt">’s +syntax is so close to +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt">’s +that people believe it +</span><span style="font-size: 10pt"><i>should</i></span><span style="font-size: 10pt"> +be the same. +I thought, with some justification in hindsight, +that making +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +similar to +</span><span style="font-size: 10pt"><tt>ed</tt></span><span style="font-size: 10pt"> +would make it easier to learn and to accept. +But I may have overstepped and raised the users’ +expectations too much. +It’s hard to decide which way to resolve this problem. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.35in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Finally, there is a tradeoff in +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +that was decided by the environment in which it runs: +</span><span style="font-size: 10pt"><tt>sam</tt></span><span style="font-size: 10pt"> +is a multi-file editor, although in a different system there might instead be +multiple single-file editors. +The decision was made primarily because starting a new program in a Blit is +time-consuming. +If the choice could be made freely, however, I would +still choose the multi-file architecture, because it allows +groups of files to be handled as a unit; +the usefulness of the multi-file commands is incontrovertible. +It is delightful to have the source to an entire program +available at your fingertips. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>Acknowledgements +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">Tom Cargill suggested the idea behind the +</span><span style="font-size: 10pt"><tt>Rasp</tt></span><span style="font-size: 10pt"> +data structure. +Norman Wilson and Ken Thompson influenced the command language. +This paper was improved by comments from +Al Aho, +Jon Bentley, +Chris Fraser, +Gerard Holzmann, +Brian Kernighan, +Ted Kowalski, +Doug McIlroy +and +Dennis Ritchie. +</span></p><p style="margin-top: 0; margin-bottom: 0.17in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"><b>REFERENCES +</b></span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 1. R. Pike, +‘The Blit: a multiplexed graphics terminal,’ +</span><span style="font-size: 10pt"><i>AT&T Bell Labs. Tech. J., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>63</b></span><span style="font-size: 10pt">, +(8), +1607-1631 (1984). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 2. L. 
Johnson, +</span><span style="font-size: 10pt"><i>MacWrite,</i></span><span style="font-size: 10pt"> +Apple Computer Inc., Cupertino, Calif. 1983. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 3. B. Lampson, +‘Bravo Manual,’ +in +</span><span style="font-size: 10pt"><i>Alto User’s Handbook, +</i></span><span style="font-size: 10pt">pp. 31-62, +Xerox Palo Alto Research Center, +Palo Alto, Calif. +1979. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 4. W. Teitelman, +‘A tour through Cedar,’ +</span><span style="font-size: 10pt"><i>IEEE Software, +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>1</b></span><span style="font-size: 10pt">, +(2), 44-73 (1984). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 5. J. Gutknecht, +‘Concepts of the text editor Lara,’ +</span><span style="font-size: 10pt"><i>Comm. ACM, +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>28</b></span><span style="font-size: 10pt">, +(9), +942-960 (1985). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 6. 
Bell Telephone Laboratories, +</span><span style="font-size: 10pt"><i>UNIX Programmer’s Manual, +</i></span><span style="font-size: 10pt">Holt, Rinehart and Winston, New York 1983. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 7. B. W. Kernighan and R. Pike, +</span><span style="font-size: 10pt"><i>The Unix Programming Environment, +</i></span><span style="font-size: 10pt">Prentice-Hall, Englewood Cliffs, New Jersey 1984. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 8. </span><span style="font-size: 10pt"><i>Unix Time-Sharing System Programmer’s Manual, Research Version, Ninth Edition, +Volume 1, +</i></span><span style="font-size: 10pt">AT&T Bell Laboratories, Murray Hill, New Jersey 1986. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt"> 9. </span><span style="font-size: 10pt"><i>Unix Time-Sharing System Programmer’s Manual, 4.1 Berkeley Software Distribution, +Volumes 1 and 2C, +</i></span><span style="font-size: 10pt">University of California, Berkeley, Calif. 1981. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">10. R. Pike, +‘Structural Regular Expressions,’ +</span><span style="font-size: 10pt"><i>Proc. 
EUUG Spring Conf., Helsinki 1987, +</i></span><span style="font-size: 10pt">Eur. Unix User’s Group, Buntingford, Herts, UK 1987. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">11. A. Goldberg, +</span><span style="font-size: 10pt"><i>Smalltalk-80 – The Interactive Programming Environment, +</i></span><span style="font-size: 10pt">Addison-Wesley, Reading, Mass. 1984. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">12. K. Thompson, +‘Regular expression search algorithm,’ +</span><span style="font-size: 10pt"><i>Comm. ACM, +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>11</b></span><span style="font-size: 10pt">, +(6), +419-422 (1968). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">13. A. V. Aho, J. E. Hopcroft and J. D. Ullman, +</span><span style="font-size: 10pt"><i>The Design and Analysis of Computer Algorithms, +</i></span><span style="font-size: 10pt">Addison-Wesley, Reading, Mass. 1974. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">14. B. W. Kernighan and D. M. Ritchie, +</span><span style="font-size: 10pt"><i>The C Programming Language, +</i></span><span style="font-size: 10pt">Prentice-Hall, Englewood Cliffs, New Jersey 1978. 
+</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">15. W. M. Waite, +‘The cost of lexical analysis,’ +</span><span style="font-size: 10pt"><i>Softw. Pract. Exp., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>16</b></span><span style="font-size: 10pt">, +(5), +473-488 (1986). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">16. C. W. Fraser, +‘A generalized text editor,’ +</span><span style="font-size: 10pt"><i>Comm. ACM, +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>23</b></span><span style="font-size: 10pt">, +(3), +154-158 (1980). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">17. R. Pike, +‘Graphics in overlapping bitmap layers,’ +</span><span style="font-size: 10pt"><i>ACM Trans. on Graph., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>2</b></span><span style="font-size: 10pt">, +(2), +135-160 (1983). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">18. L. J. Guibas and J. Stolfi, +‘A language for bitmap manipulation,’ +</span><span style="font-size: 10pt"><i>ACM Trans. 
on Graph., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>1</b></span><span style="font-size: 10pt">, +(3), +191-214 (1982). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">19. R. Pike, B. Locanthi and J. Reiser, +‘Hardware/software trade-offs for bitmap graphics on the Blit,’ +</span><span style="font-size: 10pt"><i>Softw. Pract. Exp., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>15</b></span><span style="font-size: 10pt">, +(2), +131-151 (1985). +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">20. T. A. Cargill, +‘The feel of Pi,’ +</span><span style="font-size: 10pt"><i>Winter USENIX Conference Proceedings, +Denver 1986, +</i></span><span style="font-size: 10pt">62-71, +USENIX Assoc., El Cerrito, CA. +</span></p><p style="margin-top: 0; margin-bottom: 0.05in"></p> +<p style="line-height: 1.2em; margin-left: 1.00in; text-indent: 0.00in; margin-right: 1.00in; margin-top: 0; margin-bottom: 0; text-align: justify;"> +<span style="font-size: 10pt">21. G. J. Holzmann, +‘Tracing protocols,’ +</span><span style="font-size: 10pt"><i>AT&T Tech. J., +</i></span><span style="font-size: 10pt"></span><span style="font-size: 10pt"><b>64</b></span><span style="font-size: 10pt">, +(10), +2413-2434 (1985). +</span></p><p style="margin-top: 0; margin-bottom: 0.50in"></p> +</body> +</html>