Sunday, February 17, 2019

Chapter 3. Syntax Entries - The Biggest Mystery of them All

3.1 Introduction

Probably the most innovative part of Infocom games was their ability to understand commands written in conversational English. The different types of grammar information were discussed in the “Learning ZIL” document. It showed the structure of syntax entries in ZIL, but did not mention their layout in the Z-code files. Infodump and ZILF do provide some extra information about the syntax structure. There are 3 additional grammar-related data blocks in the game that are not mentioned in the header: the prepositions, the syntax pointer table, and the syntax entries.

3.2 Prepositions

There is a separate table of prepositions to speed up the syntax matching process.
  • 1 byte for number of prepositions
  • One 2-word entry per preposition: the address of the preposition in the vocabulary and the preposition number
The prepositions are numbered downward from $FF. The address of this table is stored in a global variable. EZIP and XZIP use a compact form of the preposition table that stores the preposition ID number in a byte instead of a word.
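To make the layout concrete, here is a minimal Python sketch that reads the word-sized form of the table exactly as described above (a count byte followed by two-word entries). The function name, the sample addresses, and the assumption that the preposition number occupies the low byte of its word are mine and are not taken from a real game file.

```python
import struct


def parse_preposition_table(memory: bytes, addr: int) -> dict[int, int]:
    """Return {preposition number: vocabulary address} for the table at addr.

    Assumes the layout described in the text: 1 count byte, then one
    entry of two big-endian words per preposition.
    """
    count = memory[addr]
    table = {}
    pos = addr + 1
    for _ in range(count):
        vocab_addr, prep_number = struct.unpack_from(">HH", memory, pos)
        table[prep_number] = vocab_addr
        pos += 4
    return table


# Hypothetical table with two prepositions numbered $FF and $FE.
demo = bytes([2]) + struct.pack(">HH", 0x1A2B, 0x00FF) + struct.pack(">HH", 0x1A40, 0x00FE)
prepositions = parse_preposition_table(demo, 0)
print({hex(number): hex(address) for number, address in prepositions.items()})
```

The compact EZIP/XZIP form would need a different entry size, so this sketch should not be applied to those files unchanged.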

3.3 Syntax Entry Pointer Table

The syntax entries are probably the most confusing part of Infocom games thanks to the lack of documentation. They provide the syntax structure for a particular action. To find the matching syntax entry, the verb number is needed. Since verbs can have synonyms, different verbs such as “GET” and “TAKE” can share the same verb number and therefore use the same syntax entries. The syntax entry pointer table lists the address where the group of syntax entries for each verb number is located. This table is just a block of addresses: the first address belongs to verb number $FF, and subsequent addresses correspond to successively smaller verb numbers.
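The lookup described above amounts to a simple index calculation. The following Python sketch assumes each address is a 2-byte big-endian word (as words are in Z-code files). The function name and the three sample addresses are invented for illustration.

```python
def syntax_group_address(memory: bytes, table_addr: int, verb_number: int) -> int:
    """Return the address of the syntax entry group for verb_number.

    Verb $FF uses the first word of the table, $FE the second, and so on.
    """
    index = 0xFF - verb_number
    offset = table_addr + 2 * index
    return int.from_bytes(memory[offset:offset + 2], "big")


# Hypothetical pointer table covering verbs $FF, $FE, and $FD.
demo_table = bytes.fromhex("400040114032")
print(hex(syntax_group_address(demo_table, 0, 0xFD)))  # 0x4032
```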

3.4 Syntax Entries

PARSER then looks through each of the syntax entries for the matched verb number and returns the entry that best matches the given prepositions (if any) and noun clause types. For example, the group of syntax entries for verb number $F3 (or GET) has 7 different syntax entries. The syntaxes “GET object”, “GET object from object”, and “GET on object” would correspond to 3 different entries. To store this grammatical information, each group of entries starts with a byte indicating how many entries exist for that verb. It is followed by one 8-byte entry for each acceptable grammatical combination:
  Byte 0   Number of object clauses
  Byte 1   Prep number for direct object
  Byte 2   Prep number for indirect object
  Byte 3   GWIMBIT number for direct object
  Byte 4   GWIMBIT number for indirect object
  Byte 5   LOC byte for direct object
  Byte 6   LOC byte for indirect object
  Byte 7   Action Number
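As a rough illustration of the 8-byte layout, the Python sketch below reads a count byte followed by that many entries and names each field. The class name, field names, and the two sample entries are made up for this example. Only the byte order comes from the layout above.

```python
from typing import NamedTuple


class SyntaxEntry(NamedTuple):
    num_objects: int
    prep_direct: int
    prep_indirect: int
    gwim_direct: int
    gwim_indirect: int
    loc_direct: int
    loc_indirect: int
    action: int


def parse_syntax_group(memory: bytes, addr: int) -> list[SyntaxEntry]:
    """Read the count byte at addr, then that many 8-byte syntax entries."""
    count = memory[addr]
    entries = []
    for i in range(count):
        start = addr + 1 + 8 * i
        entries.append(SyntaxEntry(*memory[start:start + 8]))
    return entries


# Hypothetical group holding two entries (the byte values are invented).
demo_group = bytes([2]) + bytes.fromhex("0100000000000012") + bytes.fromhex("0200fe0005208013")
for entry in parse_syntax_group(demo_group, 0):
    print(entry)
```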

3.5 Get What I Want (GWIM) Feature

The GWIMBIT number is used in the FIND feature mentioned in section 9.5 of “Learning ZIL” to find unspecified but necessary objects in a command. PARSER will attempt to find an object in the current location with a set attribute flag corresponding to the GWIMBIT number. If only one object is found to match, it will assume the user meant that object and use it in the given command. If no object or more than one object matches, PARSER will ask for clarification (called orphaning). The player can then give a clarifying answer without retyping the entire previous command or type a completely new command.
For example, the syntax entry for “IGNITE OBJ WITH OBJ” has the GWIMBIT number for the indirect object set to the FLAME bit. If that entry is the best syntax match for the command “IGNITE TORCH”, the indirect object is still missing. PARSER will try to find an object with the FLAME bit set to use as the indirect object. If a single object in the current location has the FLAME bit set (like a lantern), PARSER will assume the indirect object is the lantern. The command is then treated as “IGNITE TORCH WITH LANTERN” and the indirect object is set to LANTERN.
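The decision rule can be sketched in a few lines of Python. This is only a model of the behavior described above, not the actual PARSER routine: the function name, the dictionary representation of a room, and the TAKEBIT flag are assumptions, and the real search is also limited by the LOC byte.

```python
from typing import Optional


def gwim(objects_here: dict[str, set[str]], gwimbit: str) -> Optional[str]:
    """Return the only object carrying gwimbit, or None if PARSER must ask."""
    matches = [name for name, flags in objects_here.items() if gwimbit in flags]
    return matches[0] if len(matches) == 1 else None


# One FLAME object in the room: it is assumed to be the missing indirect object.
room = {"TORCH": {"TAKEBIT"}, "LANTERN": {"TAKEBIT", "FLAME"}}
print(gwim(room, "FLAME"))  # LANTERN

# Two FLAME objects: the choice is ambiguous, so the command is orphaned.
crowded_room = {"LANTERN": {"FLAME"}, "CANDLE": {"FLAME"}}
print(gwim(crowded_room, "FLAME"))  # None
```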

3.6 Location Restriction of Objects

The LOC byte is probably the most mysterious value in the syntax entry. The highest 7 bits indicate how PARSER searches for and checks the requested objects. For example, PARSER will not complete an action with an object on the ground if its syntax entry requires that the object be held or carried. While “Learning ZIL” lists 9 possible properties, the source code for Mini-Zork indicates only 7 are used through version 3:
Table: LOC Bits
  Bit  Value  Flag       Meaning
  Location-related flags:
  7    $80    HELD       At top level and not inside another container
  6    $40    CARRIED    Not at top level, contained inside another object
  5    $20    ON-GROUND  At top level of a room and not inside another container
  4    $10    IN-ROOM    Not at top level, contained inside another object on the ground
  Possession-related flags:
  3    $08    TAKE       Will automatically TAKE object in the current location if necessary before using it
  2    $04    MANY       Multiple objects are allowed for a particular action
  1    $02    HAVE       Must already be in the user’s possession
  0    $01    Not used
When the GWIM feature searches for an unspecified object, it needs to know how far to search. This is indicated by the HELD, CARRIED, IN-ROOM, and ON-GROUND flags, which guide how the SEARCH-LIST function finds objects in the given location (a room or the user). More details will be given in section 13.2.
“Learning ZIL” does mention EVERYWHERE and ADJACENT options, but there is no evidence that they were ever used in version 1-5 games, as confirmed by the internal Infocom documents on ZIL. They could have been used in the graphical YZIP-based games.
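A LOC byte can be turned back into flag names with a small lookup. The Python sketch below uses the bit values from the LOC Bits table in this section. The function and dictionary names are my own.

```python
# Bit values as listed in the LOC Bits table of this section.
LOC_FLAGS = {
    0x80: "HELD",
    0x40: "CARRIED",
    0x20: "ON-GROUND",
    0x10: "IN-ROOM",
    0x08: "TAKE",
    0x04: "MANY",
    0x02: "HAVE",
}


def decode_loc(loc_byte: int) -> list[str]:
    """List the flag names set in loc_byte, highest bit first."""
    return [name for bit, name in LOC_FLAGS.items() if loc_byte & bit]


print(decode_loc(0x64))  # ['CARRIED', 'ON-GROUND', 'MANY']
print(decode_loc(0x34))  # ['ON-GROUND', 'IN-ROOM', 'MANY']
```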

3.7 Pre-actions and Actions

The action number is used to look up the routine address in the ACTION and PRE-ACTION tables. These tables simply list the packed routine addresses in sequence without any separate reference table, so the action number serves as the index. All syntaxes that refer to a similar action (INSERT DOWN object, DROP object, and SPILL object IN object) use the same action number. INSERT object ON object and INSERT object UNDER object use two different action numbers because the game processes these actions differently. The action number is also used to look up the address of the pre-action routine if one exists (if not, $0000 is stored). A pre-action routine can check the objects, variables, or game status before a particular action routine is called. The same pre-action routine can be used with different action routines.
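A lookup in both tables might look like the following Python sketch. It assumes each table is a plain array of 2-byte big-endian packed addresses indexed from action number 0, and that a packed address is multiplied by 2 to get a byte address (the version 1-3 rule; versions 4 and 5 multiply by 4). The function names and table contents are hypothetical.

```python
from typing import Optional


def word_at(memory: bytes, addr: int) -> int:
    return int.from_bytes(memory[addr:addr + 2], "big")


def lookup_routines(memory: bytes, actions_addr: int, preactions_addr: int,
                    action_number: int, scale: int = 2) -> tuple[Optional[int], int]:
    """Return (pre-action byte address or None, action byte address)."""
    packed_action = word_at(memory, actions_addr + 2 * action_number)
    packed_pre = word_at(memory, preactions_addr + 2 * action_number)
    pre_address = packed_pre * scale if packed_pre else None  # $0000 means no pre-action
    return pre_address, packed_action * scale


# Hypothetical tables: ACTION at offset 0, PRE-ACTION at offset 6, three actions each.
demo_memory = bytes.fromhex("100012001300" "000009000000")
print(lookup_routines(demo_memory, 0, 6, 0))  # (None, 8192)
print(lookup_routines(demo_memory, 0, 6, 1))  # (4608, 9216)
```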

3.8 An Example

Multiple verbs can share the same verb number, such as CARRY, GET, and TAKE in this example. The verb number then corresponds to a group of syntax entries. Here only 3 are shown:
    [02 00 f0 11 00 64 00 39] "carry OBJ from OBJ"
    [01 f9 00 00 00 00 00 31] "carry out OBJ"
    [01 00 00 11 00 34 00 39] "carry OBJ"
In the first example, the $02 indicates that two noun clauses are required for that syntax. The next two bytes indicate the required prepositions for the noun clauses. The $00 indicates no preposition before the direct object clause. The $F0 refers to the preposition for the indirect object clause, FROM in this case. The GWIMBIT number $11 is the attribute to check on objects if the direct object is missing. There is no GWIMBIT number for the indirect object. The direct object LOC byte $64 indicates the direct object should be CARRIED or ON-GROUND, and that multiple objects are allowed in the direct object clause.
In the second example, the OUT preposition ($F9) is needed before the direct object.
In the third example, the direct object LOC byte $34 indicates the direct object needs to be ON-GROUND or IN-ROOM, again with multiple objects allowed. The last value ($39 or $31) is the action number, which indicates the specific routine to execute for that command. Since the first and third examples are very similar, the same routine is used for both types of commands.
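Using these three entries, the selection step from section 3.4 can be approximated in Python. This is a deliberately simplified sketch that only accepts an exact match on the number of noun clauses and both prepositions. The real PARSER also accepts incomplete commands and fills in missing objects through GWIM. The function name is hypothetical, and the preposition numbers $F9 (OUT) and $F0 (FROM) are the ones quoted above.

```python
# The three 8-byte entries listed above for CARRY / GET / TAKE.
ENTRIES = [
    bytes.fromhex("0200f01100640039"),  # carry OBJ from OBJ
    bytes.fromhex("01f9000000000031"),  # carry out OBJ
    bytes.fromhex("0100001100340039"),  # carry OBJ
]


def find_syntax(entries, num_objects, prep_direct=0, prep_indirect=0):
    """Return the first entry whose bytes 0-2 match the parsed command exactly."""
    for entry in entries:
        if (entry[0], entry[1], entry[2]) == (num_objects, prep_direct, prep_indirect):
            return entry
    return None


match = find_syntax(ENTRIES, 1, prep_direct=0xF9)  # e.g. "CARRY OUT object"
print(hex(match[7]))  # 0x31

match = find_syntax(ENTRIES, 2, prep_indirect=0xF0)  # e.g. "CARRY object FROM object"
print(hex(match[7]))  # 0x39
```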

3.9 Update: Compact Syntaxes to Save Space

A variable-sized syntax entry format was used only with Sherlock to save space. There are 3 entry sizes, based on the number of noun clauses. The format is described below. The preposition number, after subtracting $C0 (192), is stored in the lower 6 bits of bytes 0 and 3.
  No objects (2 bytes)
    Byte 0   # of objects (high 2 bits) / Prep ID (low 6 bits)
    Byte 1   Action Number
  Only direct object (4 bytes)
    Byte 0   # of objects (high 2 bits) / Prep ID (low 6 bits)
    Byte 1   GWIMBIT byte
    Byte 2   LOC byte
    Byte 3   Action Number
  Direct and indirect objects (7 bytes)
    Byte 0   # of objects (high 2 bits) / Prep ID (low 6 bits)
    Byte 1   GWIMBIT byte
    Byte 2   LOC byte
    Byte 3   Prep ID (low 6 bits)
    Byte 4   GWIMBIT byte
    Byte 5   LOC byte
    Byte 6   Action Number
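A decoder for this compact format could be sketched in Python as follows. It assumes that a stored preposition value of 0 means "no preposition" and that any other value is restored by adding $C0. That assumption, the function names, and the sample bytes (the “carry OBJ from OBJ” entry from section 3.8 re-encoded by hand, not taken from Sherlock) are mine.

```python
def expand_prep(low6: int) -> int:
    """Restore a preposition number from its 6-bit form (assumes 0 = none)."""
    return low6 + 0xC0 if low6 else 0


def decode_compact(memory: bytes, addr: int) -> tuple[dict, int]:
    """Decode one variable-sized entry; return (fields, address of next entry)."""
    first = memory[addr]
    num_objects = first >> 6
    entry = {"num_objects": num_objects, "prep_direct": expand_prep(first & 0x3F)}
    pos = addr + 1
    if num_objects >= 1:
        entry["gwim_direct"], entry["loc_direct"] = memory[pos], memory[pos + 1]
        pos += 2
    if num_objects == 2:
        entry["prep_indirect"] = expand_prep(memory[pos] & 0x3F)
        entry["gwim_indirect"], entry["loc_indirect"] = memory[pos + 1], memory[pos + 2]
        pos += 3
    entry["action"] = memory[pos]
    return entry, pos + 1


# Hand-encoded 7-byte form of [02 00 f0 11 00 64 00 39].
demo_entry = bytes.fromhex("80116430000039")
fields, next_addr = decode_compact(demo_entry, 0)
print(fields, next_addr)
```

Because the entry length depends on the object count, entries have to be walked one after another using the returned next address rather than indexed directly.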
