Monday, February 12, 2018

Z Interpreter Source for CoCo Recovered

In November 2017, The CoCo Crew Podcast interviewed Brian Moriarty. Professor Moriarty (a.k.a. the Professor) not only wrote several well-known games, both at Infocom and elsewhere, but also wrote and maintained some of Infocom's Z interpreter software for a variety of microcomputers. That includes the Z interpreter software for the Tandy Color Computer. During the interview, the Professor hinted that he might have preserved the source code for that software in some form.

One of our listeners, Carlos Camacho, contacted the Professor in pursuit of that software. This eventually resulted in an email message to the CoCo mailing list indicating that Carlos was in possession of Infocom source code. As of this writing, that source code is now available at the TRS-80 Color Computer Archive website.

(NOTE: What follows should be taken as pursuit of an academic/educational interest and an investigation into one aspect of the history of Infocom and the Tandy Color Computer. The intellectual property of Infocom belongs to Infocom's legal heirs, and I am not in any way advocating otherwise.)

Preparations to Build

Let's begin by creating a fresh code repository in git and extracting the Infocom source into it:

mkdir coco_infocom
cd coco_infocom/
git init .
unzip ../Infocom\ Adventure\ Games\ Interpreter\ Source\ Code\ \(Infocom\).zip
chmod 644 *
git add *
git commit -m 'Initial commit of pristine CoCo ZIP source from Infocom'


At this point, we are mostly ready to build the code with lwasm or some other assembler. For fun, let's go ahead and try to build the boot track for use with the DOS command:

lwasm -9 -l -f raw -o boot.trk boot.src

That didn't work! Instead we see a bunch of output like this:

boot.src(105) : ERROR : Bad opcode
boot.src:00105 ERRM:    DB    $0D


What's going on? Well...unfortunately, many assemblers take some liberties with the naming of pseudo-ops. The assembler used by Infocom does exactly that, and at least lwasm is unsure what to do with a number of those pseudo-ops as they are used in the pristine code from Infocom. In the error shown above, lwasm needs to see ".DB $0D" (or certain other variants) instead of "DB $0D".
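As a minimal sketch of that one rewrite (and not the contents of any actual patch), a sed substitution can prefix DB with a dot when it sits in the opcode field. The function name fix_db is hypothetical, and the pattern assumes that labels end in a colon, as ERRM: does in the error above. Other pseudo-ops would need their own handling.

```shell
# fix_db: hypothetical helper that rewrites "DB" in the opcode field as ".DB".
# Assumption: an optional label ending in ':' is followed by whitespace, as in
# the line lwasm rejected. Reads the named files (or stdin), writes to stdout.
fix_db() {
    sed -E 's/^([A-Za-z_][A-Za-z0-9_]*:)?([[:space:]]+)DB([[:space:]])/\1\2.DB\3/' "$@"
}

# Example (the output file name is made up for illustration):
#   fix_db boot.src > boot-lwasm.src
```

Lines where DB appears anywhere other than the opcode field are left untouched, which keeps the substitution from mangling operands or comments.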

Fortunately, some simple transliterations in the source code are sufficient to make lwasm happy. Rather than describe them all here, I have provided a patch that can be applied to your local git tree:

wget http://www.tuxdriver.com/download/coco_infocom_patches/lwasm-compat.patch
git am lwasm-compat.patch
lwasm -9 -l -f raw -o boot.trk boot.src

You will now see a 597-byte file named boot.trk. The binary data in that file matches what you will find on the boot track of the Infocom diskette images in the Color Computer Archive with the Version C interpreter.
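For anyone who wants to check that match themselves, here is a rough sketch using dd and cmp. The function name matches_at and the image file name are hypothetical. The offset in the example is also an assumption: it presumes a headerless 35-track image with 18 sectors of 256 bytes per track, and that the DOS command boots from track 34, so the boot track would start at byte 34 x 18 x 256 = 156672.

```shell
# matches_at IMAGE OFFSET FILE
# Succeeds (exit status 0) if the bytes of FILE appear in IMAGE starting at
# byte OFFSET (zero-based). Hypothetical helper, not part of the Infocom tree.
matches_at() {
    image=$1
    offset=$2
    file=$3
    # Arithmetic expansion strips any padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
    # Pull the same number of bytes out of the image and compare them.
    dd if="$image" bs=1 skip="$offset" count="$size" 2>/dev/null | cmp -s - "$file"
}

# Example (image name and offset are assumptions, see above):
#   matches_at zork1.dsk 156672 boot.trk && echo "boot track matches"
```

If the image uses a different layout, only the offset needs to change.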

Make Things Easier

Many people do not like typing command lines to use a computer for anything. Even those who don't mind typing a few commands can tire of typing long command lines over and over again. As you can see above, the lwasm command line can be a bit complicated. Serious developers tend to use some sort of build script or a tool like make to control their software builds. To that end, I have provided a patch that adds a Makefile to the repository as well:

wget http://www.tuxdriver.com/download/coco_infocom_patches/add-makefile.patch
git am add-makefile.patch
make

You will now also see a 5824-byte file named cocozip.img. The binary data in that file matches what you will find on the first several tracks of the Infocom diskette images in the Color Computer Archive with the Version C interpreter.

On to Version D

One of the interesting things that I noted in an earlier blog entry is that there are different versions of the Infocom Z interpreter represented among the various Infocom diskette images in the Color Computer Archive. The newly recovered Infocom source matches Version C of the interpreter. But what about the (apparently) later Version D?

The ability to build the Version C source and compare the results to the various binaries made a reasonable task out of isolating the differences between the Version C and Version D binaries. I patched the code with the binary differences, then I disassembled those differences to reveal reasonable interpretations of what the original source must have been doing. This was a non-trivial effort, but it is already done! I am making those patches available too:

wget http://www.tuxdriver.com/download/coco_infocom_patches/remove-tandy-bit.patch
git am remove-tandy-bit.patch
wget http://www.tuxdriver.com/download/coco_infocom_patches/update-to-version-D.patch
git am update-to-version-D.patch

Why two different patches? Well, the first patch simply removes the part of the interpreter that sets the "Tandy bit". This little piece of censorship/marketing from Tandy is an interesting historical footnote. Some may choose to apply this single patch but otherwise use the version C interpreter for whatever reason suits them. Others may choose to use version D but otherwise omit that single patch in order to preserve the experience Tandy would have preferred to give them. That change is given its own identity to enable such choices or for further investigation.

The bulk of the version D update is contained in the second patch. With both patches applied, the resulting binaries from the build will match the corresponding bits from those Infocom diskette images in the Color Computer Archive that have the Version D interpreter.
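The byte-level comparison between a Version C build and a Version D binary, described earlier in this section, can be approximated with cmp. This is only a sketch of one way to do it, not necessarily the process I followed; the function name diff_offsets and the Version D file name in the example are hypothetical.

```shell
# diff_offsets FILE_A FILE_B
# Prints the zero-based offset, in hex, of every byte that differs between
# the two files. Hypothetical helper for illustration.
diff_offsets() {
    # cmp -l lists each differing byte as: 1-based decimal position, then the
    # two byte values in octal. cmp exits non-zero when the files differ, so
    # that status is ignored here and only the listing is used.
    { cmp -l "$1" "$2" || true; } | awk '{ printf "%04X\n", $1 - 1 }'
}

# Example (the second file name is made up; it stands for the interpreter
# tracks extracted from a Version D diskette image):
#   diff_offsets cocozip.img version-d.img
```

Runs of adjacent offsets in the output point at the regions worth disassembling.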


Questions Raised or Answered?

For the record, the Professor seems to have been unaware of the existence of version D of the interpreter. This raises the question of whether version D might have been the work of some lone hacker or whatever. The fact that most of the version D changes seem to be simple features and bug fixes suggests that this is the sort of mundane work that we might expect from a typical software house as it continues to operate its business. That coupled with the several copies of version D in the repository being identical leads me to believe that version D was simply a maintenance update from Infocom.

However, the removal of the "Tandy bit" in version D seems noteworthy. Why would Infocom simply remove this piece of the Z interpreter? Was there some change in the corporate relationship between Infocom and Tandy that made Infocom want to revert this piece of Tandy-specific functionality? Or perhaps that unknown Infocom maintenance programmer was offended by this single bit (literally) of censorship, and simply removed it to soothe his own conscience? We may never know.

Having access to the source for this historic piece of software has been amazing. Interesting code structures and techniques and tantalizing comments abound. Beyond that, I have identified a number of places to make code changes for certain fixes and enhancements, like support for lowercase characters, different text screen dimensions, Drivewire support, etc. I plan to have more patches before long, but CoCoFEST! is coming...you'll just have to stay tuned!

3 comments:

  1. Good stuff. I had looked through the spec but it didn’t look like a weekend project to cleanroom an interpreter, by any means. This seems like a much nicer option.

  2. Awesome post, John! I posted about it on ViTNO. People will love to learn about that! (http://www.vintageisthenewold.com/infocoms-z-interpreter-source-code-for-the-trs-coco-recovered/)

  3. Please share your patches for the different text screen dimensions, namely what CoCoVGA can support. Awesome detective work.
