author google <google@daverabbitz.ath.cx> 2012-09-20 22:39:48 +1200
committer google <google@daverabbitz.ath.cx> 2012-09-20 22:39:48 +1200
commit fa7fb8b66b9ff50029532d09315f03896f2ac4c4
tree 262cfe8966f3c07a5ad6ed61fda028f6fa928a36 /sys/src/cmd/atazz/atazz.ms
parent 913afc39c9b4750e630c7f4ff3161a37602b006b
Add Erik Quanstrom's atazz
(needed to disable power management/head unload on 2.5" drive)
Diffstat (limited to 'sys/src/cmd/atazz/atazz.ms')
-rw-r--r-- sys/src/cmd/atazz/atazz.ms 573
1 files changed, 573 insertions, 0 deletions
diff --git a/sys/src/cmd/atazz/atazz.ms b/sys/src/cmd/atazz/atazz.ms
new file mode 100644
index 000000000..7048327c5
--- /dev/null
+++ b/sys/src/cmd/atazz/atazz.ms
@@ -0,0 +1,573 @@
+.FP lucidasans
+.TL
+ATA au Naturel
+.AU
+Erik Quanstrom
+quanstro@quanstro.net
+.AB
+The Plan 9
+.I sd (3)
+interface allows raw commands to be sent. Traditionally,
+only SCSI CDBs could be sent in this manner. For devices
+that respond to ATA/ATAPI commands, a small set of SCSI CDBs
+have been translated into an ATA equivalent. This approach
+works very well. However, there are ATA commands such as
+SMART which do not have direct translations. I describe how
+ATA/ATAPI commands were supported without disturbing
+existing functionality.
+.AE
+.SH
+Introduction
+.PP
+In writing new
+.I sd (3)
+drivers for Plan 9, it has been necessary to copy the laundry
+list of special commands that were needed with previous
+drivers. The set of commands supported by each device
+driver varies, and they are typically executed by writing a
+magic string into the driver's
+.CW ctl
+file. This requires code duplicated for each driver, and
+covers few commands. Coverage depends on the driver. It is
+not possible for the control interface to return output,
+making some commands impossible to implement. While a
+workaround has been to change the contents of the control
+file, this solution is extremely unwieldy even for simple
+commands such as
+.CW "IDENTIFY DEVICE" .
+.SH
+Considerations
+.PP
+Currently, all
+.I sd
+devices respond to a small subset of SCSI commands
+through the raw interface and the normal read/write interface uses
+SCSI command blocks. SCSI devices, of course, respond natively
+while ATA devices emulate these commands with the help of
+.I sd .
+This means that
+.I scuzz (8)
+can get surprisingly far with ATA devices, and ATAPI
+.I \fP(\fPsic. )
+devices
+work quite well. Although a new implementation might not
+use this approach, replacing the interface did not appear
+cost effective and would maximize incompatibilities
+while this interface is still experimental. The raw interface therefore needs
+a method of signaling an ATA command rather than a SCSI CDB.
+.PP
+An unattractive wart of the ATA command set is that there are seven
+protocols and two command sizes. While each command has a
+specific size (either 28-bit LBA or 48-bit LLBA) and is
+associated with a particular protocol (PIO, DMA, PACKET,
+etc.), this information is available only by table lookup.
+While this information may not always be necessary for simple
+SATA-based controllers, for the IDE controllers, it is required.
+PIO commands are required and use a different set of registers
+than DMA commands. Queued DMA commands and ATAPI
+commands are submitted differently still. Finally,
+the data direction is implied by the command. Having these
+three extra pieces of information in addition to the command
+seems necessary.
+.PP
+A final bit of extra-command information that may be useful
+is a timeout. While
+.I alarm (2)
+timeouts work with many drivers, it would be an added
+convenience to be able to specify a timeout along with the
+command. This seems a good idea in principle, since some
+ATA commands should return within milli- or microseconds,
+while others may take hours to complete. On the other hand, the
+existing SCSI interface does not support it and changing its
+kernel-to-user space format would be quite invasive. Timeouts
+were left for a later date.
+.SH
+Protocol and Data Format
+.PP
+The existing protocol for SCSI commands suits ATA as well.
+We simply write the command block to the raw device. Then
+we either write or read the data. Finally the status block
+is read. What remains is choosing a data format for ATA
+commands.
+.PP
+The T10 Committee has defined a SCSI-to-ATA translation
+scheme called SAT[4]. This provides a standard set of
+translations between common SCSI commands and ATA commands.
+It specifies the ATA protocol and some other sideband
+information. It is particularly useful for common commands
+such as
+.CW "READ\ (12)"
+or
+.CW "READ CAPACITY\ (10)" .
+Unfortunately, our purpose is to address the uncommon commands.
+For those, special commands
+.CW "ATA PASSTHROUGH\ (12)"
+and
+.CW "(16)"
+exist. However, several commands we are interested in,
+such as those that set transfer modes, are not allowed by the
+standard. This is not a major obstacle. We could simply
+ignore the standard, but that would defeat the general
+reason for using an established standard: interoperability.
+Finally, it should be mentioned that SAT format adds yet
+another intermediate format of variable size which would
+require translation to a usable format for all the existing
+Plan 9 drivers. If we're not hewing to a standard, we should
+build or choose for convenience.
+.PP
+ATA-8 and ACS-2 also specify an abstract register layout.
+The size of the command block varies based on the “size”
+(either 28 or 48 bits) of the command and only context
+differentiates a command from a response. The SATA
+specification defines host-to-drive communications. The
+formats of transactions are called Frame Information
+Structures (FISes). Typically drivers fill out the command
+FISes directly and have direct access to the Device-to-Host
+Register (D2H) FISes that return the resulting ATA register
+settings. The command FISes are also called Host-to-Device
+(H2D) Register FISes. Using this structure has several advantages. It
+is directly usable by many of the existing SATA drivers.
+All SATA commands are the same size and are tagged as
+commands. Normal responses are also all of the same size
+and are tagged as responses. Unfortunately, the ATA
+protocol is not specified. Nevertheless, SATA FISes seem to
+handle most of our needs and are quite convenient; they can
+be used directly by two of the three current SATA drivers.
+.SH
+Implementation
+.PP
+Raw ATA commands are formatted as an ATA escape byte, an
+encoded ATA protocol
+.CW proto
+and the FIS. Typically this would be an
+H2D FIS, but this is not a requirement. The escape byte
+.R 0xff ,
+which is not and, according to the current specification,
+will never be a valid SCSI command, was chosen. The
+protocol encoding
+.CW proto
+and other FIS construction details are specified in
+.CW "/sys/include/fis.h" .
+The
+.CW proto
+encodes the ATA protocol, the command “size” and data
+direction. The “atazz” command format is pictured in \*(Fn.
+.F1
+.PS
+scale=10
+w=8
+h=2
+define hdr |
+[
+ box $1 ht h wid w
+] |
+define fis |
+[
+ box $1 ht h wid w
+] |
+
+F: [
+A: hdr("0xff")
+ hdr("proto")
+ hdr("0x27")
+ hdr("flags")
+
+B: hdr("cmd") at A+(0, -h)
+ hdr("feat")
+ hdr("lba0")
+ hdr("lba8")
+
+C: hdr("lba16") at B+(0, -h)
+ hdr("dev")
+ hdr("lba24")
+ hdr("lba32")
+
+D: hdr("lba40") at C+(0, -h)
+ hdr("feat8")
+ hdr("cnt")
+ hdr("cnt8")
+
+E: hdr("rsvd") at D+(0, -h)
+ hdr("ctl")
+]
+G: [
+ fis("sdXX/raw")
+] at F.se +(w*2, -h)
+arrow from F.e to G.w
+H: [
+ fis("data")
+] at G.sw +(-w*2, -h)
+HH: [
+ fis("sdXX/raw")
+] at H.se +(w*2, -h)
+
+arrow from H.e to HH.w
+
+Q: [
+ fis("sdXX/raw")
+] at HH +(0, -2*h)
+
+I: [
+K: hdr("0xff")
+ hdr("proto")
+ hdr("0x34")
+ hdr("port");
+
+L: hdr("stat") at K+(0, -h)
+ hdr("err")
+ hdr("lba0")
+ hdr("lba8")
+
+M: hdr("lba16") at L+(0, -h)
+ hdr("dev")
+ hdr("lba24")
+ hdr("lba32")
+
+O: hdr("lba40") at M+(0, -h)
+ hdr("feat8")
+ hdr("cnt")
+ hdr("cnt8")
+
+P: hdr("rsvd") at O+(0, -h)
+ hdr("ctl")
+] at Q.sw +(-w*3.5, -3*h)
+arrow from Q.w to I.e
+
+.PE
+.F2
+.F3
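The command layout in the figure can be sketched in C. This is a minimal illustration, not code from libfis: the helper name `mkatacmd` and the constant names are hypothetical. Only the byte order, the 0xff escape byte and the 0x27 H2D FIS type are taken from the figure and text. The `proto` and `flags` values are passed through uninterpreted, since their encoding lives in /sys/include/fis.h and is not reproduced here, and only the FIS bytes drawn in the figure are emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	Ataesc	= 0xff,		/* escape byte: never a valid SCSI opcode (from the text) */
	H2dtype	= 0x27,		/* FIS type of an H2D register FIS (from the figure) */
	Cmdlen	= 2 + 16,	/* escape + proto + the 16 FIS bytes drawn in the figure */
};

/*
 * Hypothetical helper: lay out an atazz command in figure order.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t
mkatacmd(uint8_t *buf, size_t len, uint8_t proto, uint8_t flags,
	uint8_t cmd, uint16_t feat, uint64_t lba, uint8_t dev, uint16_t cnt)
{
	uint8_t *p;

	assert(buf != NULL);
	if(len < Cmdlen)
		return 0;
	memset(buf, 0, Cmdlen);
	p = buf;
	*p++ = Ataesc;
	*p++ = proto;			/* opaque here; see fis.h for the encoding */
	*p++ = H2dtype;
	*p++ = flags;
	*p++ = cmd;
	*p++ = (uint8_t)feat;		/* feat */
	*p++ = (uint8_t)lba;		/* lba0 */
	*p++ = (uint8_t)(lba >> 8);	/* lba8 */
	*p++ = (uint8_t)(lba >> 16);	/* lba16 */
	*p++ = dev;
	*p++ = (uint8_t)(lba >> 24);	/* lba24 */
	*p++ = (uint8_t)(lba >> 32);	/* lba32 */
	*p++ = (uint8_t)(lba >> 40);	/* lba40 */
	*p++ = (uint8_t)(feat >> 8);	/* feat8 */
	*p++ = (uint8_t)cnt;		/* cnt */
	*p++ = (uint8_t)(cnt >> 8);	/* cnt8 */
	*p++ = 0;			/* rsvd */
	*p++ = 0;			/* ctl */
	return (size_t)(p - buf);
}
```

The resulting buffer is what a client would write to the raw file as the first step of the protocol. The data transfer and the status read then follow as for SCSI.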
+.PP
+Raw ATA replies are formatted as a one-byte
+.R sd
+status code followed by the reply FIS.
+The usual read/write register substitutions are
+applied; ioport replaces flags, status replaces cmd, error
+replaces feature.
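As a rough sketch of the reply format just described, the following C fragment splits a reply into the sd status byte and the D2H register fields. The type `Atareply` and the function `parsereply` are hypothetical names for illustration and are not part of libfis. The field offsets follow the figure, and the 16-byte FIS length is an assumption taken from the length that `rfis` prints in the transcript.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	D2htype		= 0x34,		/* FIS type of a D2H register FIS (from the figure) */
	Replylen	= 1 + 16,	/* sd status byte + 16 FIS bytes (assumed length) */
};

/* Hypothetical container for the decoded reply; not a libfis type. */
typedef struct Atareply Atareply;
struct Atareply {
	uint8_t		sdstatus;	/* one-byte sd status code */
	uint8_t		port;		/* replaces flags */
	uint8_t		status;		/* replaces cmd */
	uint8_t		error;		/* replaces feature */
	uint8_t		dev;
	uint64_t	lba;		/* lba0..lba40 reassembled */
	uint16_t	cnt;
};

/* Returns 0 on success, -1 if the buffer is short or not a D2H FIS. */
int
parsereply(const uint8_t *buf, size_t len, Atareply *r)
{
	const uint8_t *f;

	assert(buf != NULL && r != NULL);
	if(len < Replylen || buf[1] != D2htype)
		return -1;
	f = buf + 1;			/* skip the sd status byte */
	r->sdstatus = buf[0];
	r->port = f[1];
	r->status = f[2];
	r->error = f[3];
	r->dev = f[7];
	r->lba = (uint64_t)f[4] | (uint64_t)f[5]<<8 | (uint64_t)f[6]<<16
		| (uint64_t)f[8]<<24 | (uint64_t)f[9]<<32 | (uint64_t)f[10]<<40;
	r->cnt = (uint16_t)(f[12] | f[13]<<8);
	return 0;
}
```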
+.PP
+Important commands such as
+.CW "SMART RETURN STATUS"
+return no data. In this case, the protocol is run as usual.
+The client performs a 0-byte read to fulfill the data transfer
+step. The result is carried in the D2H FIS returned as the status.
+The vendor ATA command
+.R 0xf0
+is used to return the device signature FIS as there is no
+universal in-band way to do this without side effects.
+When talking only to ATA drives, it is possible to first
+issue an
+.CW "IDENTIFY PACKET DEVICE"
+and then an
+.CW "IDENTIFY DEVICE"
+command, inferring the device type from the successful
+command. However, it would not be possible to enumerate the
+devices behind a port multiplier using this technique.
+.SH
+Kernel changes and Libfis
+.PP
+Very few changes were made to devsd to accommodate ATA
+commands. The
+.CW SDreq
+structure adds
+.CW proto
+and
+.CW ataproto
+fields. To avoid disturbing existing SCSI functionality and
+to allow drivers to support SCSI and ATA commands in
+parallel, an additional
+.CW ataio
+callback was added to
+.CW SDifc
+with the same signature as
+the existing
+.CW rio
+callback. About twenty lines of code were
+added to
+.CW port/devsd.c
+to recognize raw ATA commands and call the
+driver's
+.CW ataio
+function.
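The dispatch described above can be pictured with a small stand-alone C sketch. It is not the code added to port/devsd.c: the types `Req` and `Ifc` are simplified, hypothetical stand-ins for `SDreq` and `SDifc`, the two stub callbacks exist only so the example is complete, and the return values are invented for illustration. The one detail taken from the text is that a command block beginning with the escape byte 0xff goes to the `ataio` callback and everything else goes to `rio`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	Ataesc	= 0xff,	/* escape byte marking a raw ATA command */
	Viascsi	= 1,	/* illustrative return values only */
	Viaata	= 2,
	Noata	= -1,
};

/* Simplified stand-ins for SDreq and SDifc; the real structures differ. */
typedef struct Req Req;
typedef struct Ifc Ifc;

struct Req {
	const uint8_t	*cmd;
	size_t		clen;
};

struct Ifc {
	int	(*rio)(Req*);	/* existing SCSI path */
	int	(*ataio)(Req*);	/* raw ATA path; NULL if the driver lacks it */
};

/* Stub callbacks standing in for a driver's functions. */
int scsistub(Req *r) { (void)r; return Viascsi; }
int atastub(Req *r) { (void)r; return Viaata; }

/* Route a raw request by looking at the first byte of the command block. */
int
rawio(Ifc *ifc, Req *r)
{
	assert(ifc != NULL && ifc->rio != NULL && r != NULL);
	if(r->clen > 0 && r->cmd[0] == Ataesc){
		if(ifc->ataio == NULL)
			return Noata;	/* driver has no raw ATA support */
		return ifc->ataio(r);
	}
	return ifc->rio(r);
}
```

Because the two callbacks share a signature, a driver that speaks both command sets only has to fill in both pointers, and the SCSI path stays untouched.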
+.PP
+To assist in generating the FISes to communicate with devices,
+.CW libfis
+was written. It contains functions to identify and
+enumerate the important features of a drive, to format
+H2D FISes, and finally, functions for
+.I sd
+and
+.I sd
+-devices to build D2H FISes to
+capture the device signature.
+.PP
+All ATA device drivers for the 386 architecture have been
+modified to accept raw ATA commands. Due to consolidation
+of FIS handling, the AHCI driver lost
+175 lines of code, additional non-atazz-related functionality
+notwithstanding. The IDE driver remained exactly the same
+size. Quite a bit more code could be removed if the driver
+were reorganized. The mv50xx driver gained 153 lines of
+code. Development versions of the Marvell Orion driver
+lost over 500 lines, while
+.CW libfis
+itself is only about that many lines.
+.PP
+Since FIS formats were used to convey
+commands from user space,
+.CW libfis
+has been equally useful for user space applications. This is
+because the
+.I atazz
+interface can be thought of as an idealized HBA. Conversely,
+the hardware driver does not need to know anything about
+the command it is issuing beyond the ATA protocol.
+.SH
+Atazz
+.PP
+As an example and debugging tool, the
+.I atazz (8)
+command was written.
+.I Atazz
+is an analog to
+.I scuzz (8);
+they can be thought of as a driver for a virtual interface provided
+by
+.I sd
+combined with a disk console.
+ATA commands are spelled out verbosely as in ACS-2. Arbitrary ATA
+commands may be submitted, but the controller or driver may
+not support all of them. Here is a sample transcript:
+.P1
+az> probe
+/dev/sda0 976773168; 512 50000f001b206489
+/dev/sdC1 0; 0 0
+/dev/sdD0 1023120; 512 0
+/dev/sdE0 976773168; 512 50014ee2014f5b5a
+/dev/sdF7 976773168; 512 5000cca214c3a6d3
+az> open /dev/sdF0
+az> smart enable operations
+az> smart return status
+normal
+az> rfis
+00
+34405000004fc2a00000000000000000
+.P2
+.PP
+In the example, the
+.CW probe
+command is a special command that uses
+.CW #S/sdctl
+to enumerate the controllers in the system.
+For each controller, the
+.CW sd
+vendor command
+.CW 0xf0
+.CW \fP(\fPGET
+.CW SIGNATURE )
+is issued. If this command is successful, the
+number of sectors, sector size and WWN are gathered
+and listed. The
+.CW /dev/sdC1
+device reports 0 sectors and 0 sector size because it is
+a DVD-RW with no media. The
+.CW open
+command is another special command that issues the
+same commands a SATA driver would issue to gather
+the information about the drive. The final two commands
+enable SMART
+and return the SMART status. The SMART status is
+returned in a D2H FIS. This result is parsed and
+printed as either “normal,” or “threshold exceeded”
+(the drive predicts imminent failure).
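The parsing step can be sketched as follows. This is an illustration, not atazz source: `smartstatus` and the enum names are hypothetical. The signature bytes are the ones the ATA command set defines for SMART RETURN STATUS (4f/c2 in lba8/lba16 for normal, f4/2c for threshold exceeded). They are not spelled out in this paper, though the `rfis` output in the transcript does carry 4f c2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	Smartunk	= -1,	/* not a recognizable SMART RETURN STATUS reply */
	Smartok		= 0,	/* "normal" */
	Smartbad	= 1,	/* "threshold exceeded" */
};

/*
 * Hypothetical decoder: fis points at the D2H register FIS
 * (type byte first), without the leading sd status byte.
 */
int
smartstatus(const uint8_t *fis, size_t len)
{
	assert(fis != NULL);
	if(len < 7 || fis[0] != 0x34)
		return Smartunk;
	if(fis[5] == 0x4f && fis[6] == 0xc2)	/* lba8, lba16 */
		return Smartok;
	if(fis[5] == 0xf4 && fis[6] == 0x2c)
		return Smartbad;
	return Smartunk;
}
```

Fed the FIS from the transcript (34 40 50 00 00 4f c2 a0 ...), this returns `Smartok`, matching the "normal" line printed by atazz.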
+.PP
+As a further real-world example, a drive from my file server
+failed after a power outage. The simple diagnostic
+.CW "SMART RETURN STATUS"
+returned an uninformative “threshold exceeded.”
+We can run some more in-depth tests. In this case we
+will need to make up for the fact that
+.I atazz
+does not know every option to every command. We
+will set the
+.CW lba0
+register by hand:
+.P1
+az> smart lba0 1 execute off-line immediate # short data collection
+az> smart read data
+col status: 00 never started
+exe status: 89 failed: shipping damage, 90% left
+time left: 10507s
+shrt poll: 176m
+ext poll: 19m
+az>
+.P2
+.PP
+Here we see that the drive claims that it was damaged in
+shipping and the damage occurred in the first 10% of the
+drive. Since we know the drive had been working before
+the power outage, and the original symptom was excessive
+UREs (Unrecoverable Read Errors) followed by write
+failures, and finally a threshold exceeded condition, it is
+reasonable to assume that the head may have crashed.
+.SH
+Stand Alone Applications
+.PP
+There are several obvious stand-alone applications for
+this functionality: a drive firmware upgrade utility,
+a drive scrubber that bypasses the drive cache and a
+SMART monitor.
+.PP
+Since SCSI also supports a basic SMART-like
+interface through the
+.CW "SEND DIAGNOSTIC"
+and
+.CW "RECEIVE DIAGNOSTIC RESULTS"
+commands,
+.I disk/smart (8)
+gives a chance to test both raw ATA and SCSI
+commands in the same application.
+.PP
+.I Disk/smart
+uses the usual techniques for gathering a list of
+devices or uses the devices given. Then it issues a raw ATA request for
+the device signature. If that fails, it is assumed
+that the drive is SCSI, and a raw SCSI request is issued.
+In both cases,
+.I disk/smart
+is able to reliably determine if SMART is supported
+and can be enabled.
+.PP
+If successful, each device is probed every 5 minutes
+and failures are logged. A one shot mode is also
+available:
+.P1
+chula# disk/smart -atv
+sda0: normal
+sda1: normal
+sda2: normal
+sda3: threshold exceeded
+sdE1: normal
+sdF7: normal
+.P2
+.PP
+Drives
+.CW sda0
+and
+.CW sda1
+are SCSI
+and the remainder are ATA. Note that other drives
+on the same controller are ATA.
+Recalling that
+.CW sdC0
+was previously listed, we can check to see why no
+results were reported by
+.CW sdC0 :
+.P1
+chula# for(i in a3 C0)
+ echo identify device |
+ atazz /dev/sd$i >[2]/dev/null |
+ grep '^flags'
+flags lba llba smart power nop sct
+flags lba
+.P2
+So we see that
+.CW sdC0
+simply does not support the SMART feature set.
+.SH
+Further Work
+.PP
+While the raw ATA interface has been used extensively
+from user space and has allowed the removal of quirky
+functionality, device setup has not yet been addressed.
+For example, both the Orion and AHCI drivers have
+an initialization routine similar to the following:
+.P1
+int
+newdrive(Drive *d)
+{
+ setfissig(d, getsig(d));
+ if(identify(d) != 0)
+ return SDeio;
+ setpowermode(d);
+ if(settxmode(d, d->udma) != 0)
+ return SDeio;
+ return SDok;
+}
+.P2
+However, in preparing this document, it was discovered
+that one sets the power mode before setting the
+transfer mode and the other does the opposite. It is
+not clear that this particular difference is a problem,
+but over time, such differences will be the source of bugs.
+Neither the IDE nor the Marvell 50xx drivers sets the
+power mode at all. Worse,
+none is capable of properly addressing drives with
+features such as PUIS (Power Up In Standby) enabled.
+To address this problem, all four of the ATA drivers would
+need to be changed.
+.PP
+Rather than maintaining a number of mutually out-of-date
+drivers, it would be advantageous to build an ATA analog
+of
+.CW pc/sdscsi.c
+using the raw ATA interface to submit ATA commands.
+There are some difficulties that make such a change a bit
+more than trivial. Since the current model for hot-pluggable
+devices is not compatible with the top-down
+approach currently taken by
+.I sd ,
+this would need to be addressed. It does not seem that
+this would be difficult. Interface resets after failed commands
+should also be addressed.
+.SH
+Source
+.PP
+The current source, including all the pc drivers and applications,
+is available
+in the following
+.I contrib (1)
+packages on
+.I sources :
+.br
+.CW "quanstro/fis" ,
+.br
+.CW "quanstro/sd" ,
+.br
+.CW "quanstro/atazz" ,
+and
+.br
+.CW "quanstro/smart" .
+.PP
+The following manual pages are included:
+.br
+.I fis (2),
+.I sd (3),
+.I sdahci (3),
+.I sdaoe (3),
+.I sdloop (3),
+.I sdorion (3),
+.I atazz (8),
+and
+.I smart (8).
+.SH
+Abbreviated References
+.nr PI 5n
+.IP [1]
+.I sd (3),
+published online at
+.br
+.CW "http://plan9.bell-labs.com/magic/man2html/3/sd" .
+.IP [2]
+.I scuzz (8),
+published online at
+.br
+.CW "http://plan9.bell-labs.com/magic/man2html/8/scuzz" .
+.IP [3]
+T13
+.I "ATA/ATAPI Command Set\ \-\ 2" ,
+revision 1, January 21, 2009,
+formerly published online at
+.CW "http://www.t13.org" .
+.IP [4]
+T10
+.I "SCSI/ATA Translation\ \-\ 2 (SAT\-2)" ,
+revision 7, February 18, 2007,
+formerly published online at
+.CW "http://www.t10.org" .