Lost Files

Volume Number:		1
Issue Number:		8
Column Tag:		Forth Forum

"Recovering Lost Files"

By Jörg Langowski, Chemical Engineer, Fed. Rep. of Germany, MacTutor Editorial Board

Recovering lost Files - File Tags

Before we start the main topic of this month's column, let me make an addition to my article in the May issue (Vol1 #5). I have so far made no distinction between using MacForth 1.1 and 2.0 for any of the program examples given here. And it is true, the main difference between the two Forth versions that appears on the surface is that Forth 2.0 contains the word ADD.CONTROL, which you have to define for yourself in Forth 1.1 (see V1 #4).

One thing, however, does change between the two versions. The tokens of predefined words are not the same anymore. Therefore, the example that I gave in the May article (for which I happened to use Forth 2.0) does not work with Forth 1.1. In 1.1, the tokens for the two 'no-name' words are:

2C1A instead of 2C7C for the 'wrapping tab' function, and

2BF6 instead of 2C58 for the wraparound function.

This month's example will make use of the latter function, so watch which Forth version you are using when you try it out.

Forth on the Mac can be compared to a set of very powerful, very sharp tools. You can create beautiful things with it; you can also cut yourself pretty badly. Cutting yourself with Forth can show up in a variety of disguises: the Bomb (and you may even be able to Resume!), interesting geometric patterns on the screen, requiring you to push Reset, strange machine gun type noises from the speaker, or all of the above. Of course, the MacForth editor automatically backs up screens and protects you from many disasters, but I left out something from the list: a disk drive running wild, erasing some of your last half day's hard work or even the directory.

The people who designed the Mac have provided a very clever 'safety net' for those types of crashes. The operating system not only maintains a directory of the disk which associates to each file the blocks it is written to, it keeps additional information with each physical block: which file it belongs to, what its position in this file is, what the attributes of that file are and when it was written. Therefore, in principle, you can reconstruct any file on the disk, even without the directory being intact, just by reading all the blocks on the disk and looking where they belong.

The information that is kept with the blocks is not much; otherwise too much disk space would be lost. So you won't be able to keep track of file names this way, nor of file types or creators.

An update to Inside Macintosh

The extra bytes that go with each sector are called tags. There are 12 tag bytes per block, and as far as I could determine they contain the following data:

bytes  0-3   : file number (long word)
byte   4     : file attributes (byte)
byte   5     : seems to be unused
bytes  6-7   : sequence number of this block within the file
bytes  8-11  : date block was written

This table deviates a little from the one that was given in the File Manager section of Inside Macintosh (at least in the most current update that was available to me in late April, when I wrote this). Inside Macintosh puts the attributes byte at position 5 in the tags block, and says that bit 0 of byte 4 is set =1 when the file is locked, =0 otherwise. My experiments, using the example program (see below), could not confirm this. Byte 4 contains all the file attributes, including whether the fork is resource or data (bit 1 =1 or =0, resp.).
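For readers who want to play with this layout away from the Mac, the table above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_tag` and the sample bytes are made up, the big-endian byte order is that of the 68000, and the sketch assumes that the date field counts seconds from January 1, 1904, like other Macintosh time stamps.

```python
import struct
from datetime import datetime, timedelta

# Assumption: the tag date counts seconds from this point, like other Mac dates.
MAC_EPOCH = datetime(1904, 1, 1)

def parse_tag(tag: bytes) -> dict:
    """Split a 12-byte block tag into the fields of the table above."""
    if len(tag) != 12:
        raise ValueError("a block tag is exactly 12 bytes")
    # ">" = big-endian: long word, byte, byte, word, long word (12 bytes total)
    file_number, attributes, _unused, sequence, date = struct.unpack(">IBBHI", tag)
    return {
        "file_number": file_number,
        "attributes": attributes,
        "is_resource_fork": bool(attributes & 0x02),  # bit 1, as described in the text
        "sequence": sequence,
        "written": MAC_EPOCH + timedelta(seconds=date),
    }

# Made-up sample tag: file 17, resource fork, block 3 of the file, date = 60 s.
example = bytes([0, 0, 0, 0x11, 0x02, 0, 0, 3, 0, 0, 0, 60])
print(parse_tag(example))
```

The unused byte 5 is read and then ignored, so the sketch stays neutral about which of the two layouts is the right one.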

The File Tags information is not written to the file buffer that you pass to the OS routines in the file control block. Rather, after each read operation, the tags are stripped from the main part of the block and placed into a special position in memory, which is referred to by the system global TagData (=$2FA). You will find the 12 tag bytes starting at TagData+2; I was not able to figure out what is contained in TagData and TagData+1. Inside Macintosh claims that the last 4 tag bytes (the block modification date) are not put into memory here, but that the logical block number appears at TagData+10 to TagData+11. I could not confirm this, either; any time you read a block from disk, you will find the modification date of that block at TagData+10.

Reading and writing tags

In which way do the tags influence file handling by the operating system? Not at all, as long as the directory is intact. You can not only read the tags but also modify them (and even give them nonsense values) by changing the memory block that contains them and writing the 512-byte buffer back to disk. This, as far as I was able to find out, does not change the behavior of the disk in any way. If you write a block back in place with only its tags modified (saying, e.g., that it belongs to a different or non-existent file), the file containing that block will still be usable.

I tried this on a copy of a MacWrite disk, changing a couple of block tags in the MacWrite as well as in the System and Finder files. The disk would still boot nicely afterwards, and also after booting with Option-Command pressed (thus reconstructing the Desktop file), nothing peculiar was seen. Therefore it seems that the block tags are really nothing but an additional safety measure to help you (or some utility routine) reconstruct a crashed disk.

This is basically what Inside Macintosh says, too. And it is also said that there would be a utility (to come) that will automatically repair a destroyed disk from the tag information. Alas, no such thing has ever crossed this editor's desk and it still seems to be one of the numerous vaporware items for the Mac. (If any of you who read this column know better, please correct me and also tell me where to get such a routine). So far, the little Forth program at the end will have to do.

Recovering lost files

This month's example program is an update of the disk editor published in V1#5. Since I made a couple of changes to different parts of the program, it is printed in full again. For one thing, the block length was changed from 1024 to 512 bytes, since tags refer to logical blocks, not allocation blocks. The new features that have been added are words to extract tag information from the system globals area and print it formatted; furthermore, you can dump the whole disk block by block, showing the tags of the starting block of each new file. Since files are usually stored on the disk in contiguous blocks, the program recognizes the start of a new file by the change of the file number in the tag.

The strategy for reconstructing a crashed directory, then, will be the following: First, select the drive that contains your sick disk. Then do a 'Dump all Tags' and you will get a pretty good idea of the file structure on that disk. Note that the resource fork and the data fork of a file will have the same file number but different attributes bytes; in the data fork, bit 1 equals 0, while it is set to 1 in the resource fork.
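The scanning idea can be sketched in Python as follows. It assumes that the tags of all blocks have already been collected into a list of (file number, attributes) pairs in disk order; the function name `find_fork_starts` and the sample data are invented for the illustration. Unlike the listing, which compares the file number only, this sketch also reports the point where the fork bit changes within one file.

```python
def find_fork_starts(tags):
    """Return (block, file_number, fork) wherever a new file or fork begins.

    tags: (file_number, attributes) pairs, one per block, in disk order.
    """
    starts = []
    previous = None
    for block, (file_number, attributes) in enumerate(tags):
        # Bit 1 of the attributes byte: 1 = resource fork, 0 = data fork.
        fork = "resource" if attributes & 0x02 else "data"
        if (file_number, fork) != previous:
            starts.append((block, file_number, fork))
            previous = (file_number, fork)
    return starts

# Made-up tags for a six-block "disk".
sample = [(1, 0x00), (1, 0x00), (1, 0x02), (2, 0x00), (2, 0x00), (3, 0x02)]
for block, file_number, fork in find_fork_starts(sample):
    print(f"block {block:3d}: file {file_number}, {fork} fork")
```

A fragmented file would show up as several runs with the same file number, which is one more reason to inspect the dump by eye before trusting any block range.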

The last item in the DiskEdit menu is RecoverFile. This will create an output file of type TEXT (you may want to use the other drive for it, prefixing the file name by the volume name). It then asks for the starting and ending block of the file to recover, assuming that you already know (from the dump) where your file is located on the bad disk. The first block is read and it is decided whether the file to write is a regular file or a resource file. Then the other blocks are read, one by one, and written to the recovery file.
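The same procedure, reduced to its logic, might look like this in Python. It is a sketch under stated assumptions, not a translation of the listing: the disk is represented by a hypothetical in-memory list of (attributes byte, block data) pairs, and `recover_blocks` is an invented name. As in the description above, only the first block of the range decides which fork is being recovered.

```python
BLOCK_SIZE = 512  # logical block size used by the disk editor

def recover_blocks(blocks, first, last):
    """Concatenate blocks first..last (inclusive) and name the fork.

    blocks: list of (attributes, data) pairs, one per disk block
            (a hypothetical stand-in for reading the real disk).
    """
    if not 0 <= first <= last < len(blocks):
        raise ValueError("block range outside the disk")
    first_attributes, _ = blocks[first]
    # Bit 1 of the first block's attributes byte selects the fork.
    fork = "resource" if first_attributes & 0x02 else "data"
    recovered = b"".join(data for _, data in blocks[first:last + 1])
    return fork, recovered

# Made-up three-block disk: one data block followed by two resource blocks.
disk = [
    (0x00, b"A" * BLOCK_SIZE),
    (0x02, b"B" * BLOCK_SIZE),
    (0x02, b"C" * BLOCK_SIZE),
]
fork, data = recover_blocks(disk, 1, 2)
print(fork, len(data))
```

Because whole blocks are copied, the recovered file can be longer than the original; the tags do not record where the data ends within the last block.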

This procedure will leave you with a 'plain vanilla' document as a recovered file. You won't be able to tell the file name, whether it was an application or really a document, nor what its signature or creator is. That is too bad, but it can't be helped; at least you got your file back. Good guessing helps a lot here. You can also resort to a number of utilities to help you guess. If you happen to have a copy of ResEdit, looking at the resources of a (suspected) application will in most cases show you what application it was (from dialog boxes, strings etc.). Or, if you do not have ResEdit, you can simply try to launch the application by double-clicking the recovered file's icon with the Command and Option keys pressed. The bomb will soon tell you whether you were wrong or right.

A copy of MacForth 1.1 that was recovered in the way explained above worked without problems, a minor annoyance being that you could start it with Command-Option-double click only. This can be changed, too: Using a utility such as SetFile (available on many public domain disks, as far as I know), you can change the Type and Creator words of your file and set the Bundle bit. For MacForth, the file type will be APPL, and the creator is M4TH. With that little trick, your file has been recovered completely; it can be launched in the usual way if it is an application, and will show the correct icon. If you don't have SetFile or a similar thing, you may as well use the disk editor published here, together with the information about directory structure given in V1#5.

A Forth based object-oriented language - NEON

There is a new language coming out for the Mac: NEON by Kriya Systems from Chicago. It is too early to give a full review of this system, since at this time I have only a pre-release copy without complete documentation, but what I've seen so far looks rather impressive.

The people at Kriya have been unsatisfied with Forth for a variety of reasons, the main three points being the lack of standardized data structures like strings and arrays, lack of a 'local variable' feature (other than the stack), and bad readability due to proliferation of stack operations, i.e., lack of named formal parameters in procedure calls.

You may argue about those points; the fact is that Forth code remains rather cryptic to non-Forth programmers in many cases, and since one of the basic concepts of Forth is extensibility, it makes sense to define your own 'dialect' of Forth that does not have those deficiencies (real or not).

The creators of NEON went one step further and built object-orientation into their dialect of Forth. That is, to each class of objects (i.e. data structures) you can define a list of methods by which to manipulate them. For instance, putting a value into an array is a completely different operation from putting it into a list, and in standard Forth you will have to define different words for both. In an object oriented language, you can simply 'tell' the object to put the value into a certain place, and the object will do whatever is appropriate. (True, CREATE DOES> has something of that flavor to it; for instance you can define arrays that do automatic range checking or that remember how many dimensions they have etc. Still, Forth itself is a long way from being an object oriented language.)

Let me just give you a quick example of the definition of classes and objects. This is what a class definition in NEON looks like:

:CLASS className <SUPER superClass
 n <INDEXED

( local variable definitions)

:M Method1: ( Forth code...)
;M

:M Method2:  ( Forth code...)
;M

... etc. for other methods ...

;CLASS ( end definition)

The <SUPER tells what superclass this new class belongs to (like a handle being a special kind of pointer, an unsigned integer a special kind of integer etc.); the <INDEXED tells, for indexed variables, the length of one element.

Word definitions in NEON may contain formal parameters and local variables:

:Word { fpar1 fpar2 \ var1 var2 var3 -- }

That way you can refer to stack parameters by name and have some local variables available that other routines do not know of.

Another feature that has been added is forward referencing, which also comes in handy at times. But that's all that I want to tell you about NEON so far. I do promise to write a more detailed description with more examples soon, when I have the final release with the manual.

International compatibility -- one more remark

Just let me add a last quick point to what I said about the keyboard problems in the last issue. Apple seems to have realized this problem; there is a program available now, Localizer, which will automatically configure your system for any of a variety of countries, so you can 'localize' a disk with e.g. a German System file on it to a US keyboard. Which is just what I needed.

Listing 1: Disk Editor, Version 2

( Disk Editor Rev. 2, © 1985 MacTutor by
 J. Langowski )
: disk.editor ;
18 field +fcb.name  22 field +fcb.drive  
24 field +fcb.vrefnum 32 field +fcb.buf  
36 field +fcb.request 40 field +fcb.actual
44 field +fcb.posmode
46 field +fcb.position
12 constant dsk.menu 512 constant blk.size
variable vol.fcb  variable vol.fnumber
variable hex.asc  variable drive#
variable tag.fold 9999 tag.fold !
create this.fcb 50 allot  
create vol.buffer blk.size allot
hex
   a002 os.trap read  a003 os.trap write
decimal

: open.vol this.fcb dup vol.fcb !
   dup +fcb.vrefnum -5 swap w!
   +fcb.drive drive# @ swap w! ;
: input 0 0 >in ! query 
           32 word convert drop ;
: input$  0 >in ! query 32 word ;
: text.normal 12 textsize 15 line.height
    plain textstyle ;
: text.tiny 9 textsize 9 line.height
    condensed textstyle ;
hex
: need.chars 2bf6 execute ;  
( Ver. 1.1; set to 2c58 for 2.0 )
decimal

: dump.fcb ." Header    :" 
  3 0 do dup i 4* + @ . ."  " loop cr
  ." completion:" dup 12 + @ . cr
  ." ioresult  :" dup 16 + w@ . cr
  ." filename  :" dup +fcb.name @ . cr
  ." drive     :" dup +fcb.drive w@ . cr
  ." refnum    :" dup +fcb.vrefnum w@ . cr
  ." buffer    :" dup +fcb.buf @ . cr
  ." request   :" dup +fcb.request @ . cr
  ." actual    :" dup +fcb.actual @ . cr
  ." posmode   :" dup +fcb.posmode w@ . cr
 ." offset    :" dup +fcb.position @ . cr ;

hex  2FA 2+ constant tag.data  
tag.data constant tag.fnumber
tag.data 4+ constant tag.attr  
tag.data 5+ constant tag.lock
tag.data 6+ constant tag.sequ  
tag.data 8+ constant tag.date
: s2date ?days -1 fmt.date$ type ;
: s2time ?seconds fmt.time$ type ;
                               decimal
: dump.tags text.normal decimal
   ." file number : " tag.fnumber @ 6 .r cr
   ." attributes  : $" tag.attr c@ 
                         hex 2 .r decimal
   ." , sequence #: " tag.sequ w@ 4 .r cr
   ." date written: " tag.date @ dup s2date
                         space s2time cr ;
: setup.fcb ( buffer\block#\fcb -- fcb )
  dup +fcb.posmode 1 swap w!
  dup +fcb.position rot blk.size * swap ! 
           ( byte pos on disk)
  dup +fcb.buf rot swap !  
           ( buffer address)
  dup +fcb.request blk.size swap ! 
           ( # of bytes to transfer) ;
: read.pb ( buffer\block#\fcb -- )
                        setup.fcb read ;
: read.disk ( block# -- ) 
       dup 30 need.chars ." #" . space
       vol.buffer swap vol.fcb @ read.pb ;
: write.pb ( buffer\block#\fcb -- )
                        setup.fcb write ;
: write.disk ( block# -- ) 
       dup 30 need.chars ." #" . space
       vol.buffer swap vol.fcb @ write.pb ;
: dump.32 ( start address -- )
  32 0 do dup i + c@ hex.asc @ if
    dup 16 < if ." 0" then . else
    dup 32 < if ." ." drop else emit then
              then loop ;
: dump.buffer ( buffer address -- )
  text.tiny cr
  blk.size 32 / 0 do dup i 32 * dup 16 < if
 ." 00" else dup 256 <  if ." 0" then then
  dup . + dump.32 drop cr loop ;
: read.block text.normal
  ." Read block #: " input dup 0<
  if error" Negative Block #" then
  cr read.disk io-result @ ?dup 
    if cr ." OS error " . cr abort
    else ." block read" cr dump.tags then ;
: write.block text.normal
  ." Write block #: " input dup 0<
  if error" Negative Block #" then
  cr write.disk io-result @ ?dup
    if cr ." OS error " . cr abort
    else ." block written" cr then ;
: dump.block 
     hex vol.buffer dump.buffer decimal ;
: patch.block text.normal
  ." change byte #:" hex input decimal
     dup blk.size 1- > 
     if ." too large" cr abort then
     vol.buffer + ." to:" hex input decimal
     swap c! ;
: dump.all.tags text.tiny 11 line.height
  800 0 do i read.disk 
      tag.fnumber @ tag.fold @ = not
     if cr ." new file starts, block " 
        i 3 .r space decimal tag.fnumber @
        dup tag.fold ! 4 .r space hex ." $"
        tag.attr c@ 2 .r 2 spaces  decimal
     then   loop text.normal cr ;
: set.hex 1 hex.asc !
    6 -1 dsk.menu item.check     
    7  0 dsk.menu item.check ;
: set.ascii 0 hex.asc !
    6  0 dsk.menu item.check     
    7 -1 dsk.menu item.check ;
: drive.1  1 drive# !   open.vol
    12 -1 dsk.menu item.check   
    13  0 dsk.menu item.check ;
: drive.2  2 drive# !   open.vol
    12  0 dsk.menu item.check   
    13 -1 dsk.menu item.check ;
create fname$ 60 allot
  : recover.file text.normal decimal cr
   ." Write output to: " input$ 
   dup c@ 1+ fname$ swap cmove
   fname$ 5 assign  5 create.file
  ." Recover blocks#: " input 
  ."  thru " input cr 1+ swap dup read.disk
  tag.attr c@ 2 and if 
     ." recovering resource file, opening output" cr
     5 open.rsrc  else 
     ." recovering regular file, opening output" cr
     5 open    then
 do i read.disk vol.buffer blk.size 5
    write.text ?file.error
 loop     5 close 5 remove   cr
 ." File recovered, double-check before using." cr ;

: disk.menu     
0 " DiskEdit" dsk.menu new.menu
  " Read;Write;Dump;Change;-(;Hex ;Ascii;-(;Show Tags"
       dsk.menu append.items
  " Dump All Tags;-(;Drive 1;Drive 2;-(;RecoverFile"
  dsk.menu append.items     draw.menu.bar
  dsk.menu menu.selection: 0 hilite.menu
  case  1 of read.block endof   
        2 of write.block endof
        3 of dump.block endof   
        4 of patch.block endof
        6 of set.hex    endof   
        7 of set.ascii   endof
        9 of dump.tags  endof  
       10 of dump.all.tags endof
       12 of drive.1    endof  
       13 of drive.2     endof
       15 of recover.file endof endcase
  events on do.events abort ;
disk.menu set.hex drive.1