martin carpenter


2012/05/05, updated 2012/12/15



tags: solaris utmpx(4) wtmpx(4) last(1)


Periodically I discover a Solaris 10 server with a corrupted /var/adm/wtmpx file, the accounting file that records login times and reboots. I don't believe this to be malicious (e.g. a hacker clumsily covering their tracks, although you should be aware of that possibility); a subtle bug in the depths of the login subsystem seems more likely. Unfortunately, most system tools don't report the problem: they simply stop processing when they read corrupted data, even though new records continue to be appended to the corrupted file. In particular, last(1) emits no error message when reading such a file, and the only symptom you might notice is truncated output. Another way to spot the problem is to check whether the size of /var/adm/wtmpx is evenly divisible by the record length (372 bytes for current releases of Solaris 10); if it is not, the file is corrupt.
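The size test described above can be sketched in a few lines of Python. This is only an illustration, not part of ckwtmpx: the function names are made up for the example, and the 372-byte record length is the figure quoted above, so adjust it if your release differs.

```python
import os

RECORD_LEN = 372  # record length quoted above for current Solaris 10 releases


def trailing_bytes(size, record_len=RECORD_LEN):
    """Return the number of bytes left over after the last whole record."""
    return size % record_len


def looks_misaligned(path="/var/adm/wtmpx", record_len=RECORD_LEN):
    """Return True if the file size is not a whole number of records."""
    return trailing_bytes(os.path.getsize(path), record_len) != 0
```

A non-zero remainder means the file is damaged, but a zero remainder proves nothing: corruption that happens to be a multiple of the record length passes this test, which is why a record-by-record check is still needed.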

I haven't yet noticed any pattern to the corruption: sometimes it's just a handful of zero bytes, other times there is an identifiable ASCII source IP address, and other times it's just junk.

Consequently I wrote ckwtmpx to check wtmpx files for corruption and, optionally, to repair them.

Usage and the repair algorithm are described in the manual page below.

User Commands                                          ckwtmpx(1)

     ckwtmpx - check Solaris wtmpx files for corruption, and per-
     form optional repairs.

     ckwtmpx [-d] [-o output_file] [-e error_file] [-t time_travel]
     ckwtmpx -h
     ckwtmpx -v

     It sometimes happens, either malevolently or otherwise, that
     Solaris'   binary   format  accounting  file  /var/adm/wtmpx
     becomes corrupted. The only normal symptom of this  is  that
     standard tools such as last stop processing the file as soon
     as the corrupt data is encountered (last produces neither an
     error message nor a non-zero return code).

     ckwtmpx attempts to read a wtmpx file  from  standard  input
     one  record  at  a  time.  Valid  records  are copied to the
     (optional) output file (-o), and bytes  that  are  discarded
     are copied to the (optional) error file (-e).

     When an invalid record is encountered, ckwtmpx moves forward
     through the standard input one character at a time until the
     start of a valid record is found. Skipped bytes are  written
     to  the  error  file as they are discarded. Errors and debug
     information are sent to stderr.
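The resynchronisation step described above can be sketched in Python as follows. This is a minimal illustration and not the ckwtmpx source: it works on an in-memory byte string rather than streaming standard input, the name scan and the is_valid callback are hypothetical, and treating a short trailing fragment as discarded bytes is an assumption.

```python
RECORD_LEN = 372  # record length quoted for current Solaris 10 releases


def scan(data, is_valid, record_len=RECORD_LEN):
    """Split data into (good, skipped).

    good holds the concatenated valid records; skipped holds every
    byte that was stepped over while hunting for the next valid record.
    """
    good = bytearray()
    skipped = bytearray()
    pos = 0
    while pos + record_len <= len(data):
        candidate = data[pos:pos + record_len]
        if is_valid(candidate):
            good += candidate
            pos += record_len        # jump to the next record boundary
        else:
            skipped.append(data[pos])
            pos += 1                 # slide forward one byte and retry
    skipped += data[pos:]            # assumption: a short tail is discarded
    return bytes(good), bytes(skipped)
```

Here good plays the role of the -o output file and skipped that of the -e error file; an empty skipped corresponds to a clean input.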

     A valid record fulfills the following criteria:

         Epoch time (ut_tv) is greater than 0 (was written  after
         1 Jan 1970).

         Epoch time (ut_tv) is before now (was not written in the
         future).

         The wtmpx record type (ut_type) is valid.

         Either this is the first valid record found or it is not
         more  than  70  seconds younger than the previous record
         found. (Some systems may buffer output to wtmpx,
         resulting in occasional temporal misordering of records.)

     See <utmpx.h> and <utmp.h> for more details  on  the  binary
     record  format, in particular struct futmpx in <utmpx.h> for
     details of the record serialization.
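A hedged Python sketch of the validity criteria listed above follows. Several details are assumptions rather than facts taken from ckwtmpx: the field offsets (ut_type as a 16-bit integer at byte 72, the seconds part of ut_tv as a 32-bit integer at byte 80), big-endian byte order as on SPARC, and 10 as the highest valid ut_type; check all of them against struct futmpx in <utmpx.h> on your system. The 70-second rule is read here as "the timestamp may step back at most 70 seconds from the previous valid record", which is one interpretation of the wording above.

```python
import struct
import time

RECORD_LEN = 372    # record length quoted for current Solaris 10 releases
TYPE_OFFSET = 72    # assumed offset of ut_type (16-bit integer)
TIME_OFFSET = 80    # assumed offset of the seconds field of ut_tv (32-bit)
MAX_TYPE = 10       # assumed highest valid ut_type value
MAX_BACKSTEP = 70   # seconds of tolerated misordering, from the criteria


def parse(record, byteorder=">"):
    """Extract (ut_type, tv_sec); '>' assumes big-endian (SPARC) data."""
    (ut_type,) = struct.unpack_from(byteorder + "h", record, TYPE_OFFSET)
    (tv_sec,) = struct.unpack_from(byteorder + "i", record, TIME_OFFSET)
    return ut_type, tv_sec


def is_valid(record, prev_sec=None, now=None):
    """Apply the four criteria; prev_sec is the previous valid timestamp."""
    if len(record) != RECORD_LEN:
        return False
    now = time.time() if now is None else now
    ut_type, tv_sec = parse(record)
    if tv_sec <= 0 or tv_sec > now:        # after the epoch, not in the future
        return False
    if not 0 <= ut_type <= MAX_TYPE:       # known record type
        return False
    if prev_sec is not None and prev_sec - tv_sec > MAX_BACKSTEP:
        return False                       # stepped too far back in time
    return True
```

A record consisting entirely of zero bytes fails the first criterion, which matches the "handful of zero bytes" kind of corruption. On x86 systems the byte order would presumably be little-endian ("<").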

SunOS 5.10          Last change: 14 Apr 2010                    1

     Flags -d, -e and -o may be combined as required, but note
     that

         ckwtmpx -o /var/adm/wtmpx </var/adm/wtmpx

     will almost certainly cause pain. Use a temporary file.

     The following options are supported:

     -d                       Enable debug output to stderr.

     -h                       Print usage to stdout and exit.

     -e error_file            Writes  skipped  bytes   from   the
                              corrected wtmpx file to error_file.

     -o output_file           Writes the corrected wtmpx file  to
                              output_file.

     -v                       Print version to stdout and exit.

     Returns 0 if the wtmpx file is okay, 1 if it is corrupt  and
     2 on fatal errors (syntax, file permissions, ...) so ckwtmpx
     can be run with  no  arguments  for  testing  file  validity
     without spurious output:

         if ! ckwtmpx