Saturday, February 23, 2019

Chapter 7. PARSER: How Now Brown Cow (Part 3) - Updates (Actors, Adverbs, and Numbers)

7.8 Update: Speaking to Actors with Quotes

The first modification to PARSER began with Zork 2 and relates to double quotes in commands. A double quote is treated as an end-of-command token, like THEN or a period. It also toggles QUOTE-FLAG, which tracks whether a closing double quote is still expected: the first double quote seen sets the flag, and a second clears it.
In Zork 2, the double quote comes into play in several special situations. When the player says something out loud, such as SAY “ABRACADABRA”, the game will typically ignore the command unless the player is saying special words for a spell or a riddle. In that case, the game pulls out the word after the first double quote and uses it to trigger other routines.
PARSER will also handle speaking to the other actors using quotes. Actors are characters in the game that can be asked questions or perform actions. They are created using objects and have the PERSONBIT set. Using interrupts, the game can have the actors move or perform other actions on their own. To have an actor perform an action like:

TELL WIZARD “TURN OFF THE LAMP”

PARSER first divides up the command into two separate commands with the double quote as the terminating token for both commands. So the previous command becomes:

TELL WIZARD“ and
TURN OFF THE LAMP”

To have actors perform actions, PERFORM (discussed later) will first call the TELL/ASK routine, which ensures the direct object is a person (PERSONBIT) and then sets WINNER to the actor. PARSER will then process the second command with the actor as the WINNER. The individual action routines handle these special situations where actors perform the commands by checking the value of WINNER. Later, PARSER will ensure that WINNER is set back to the player on its next execution by detecting a clear QUOTE-FLAG while WINNER is not set to the player.
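The split on double quotes can be illustrated with a minimal Python sketch. It assumes the command has already been broken into a list of word strings. The function name and the list representation are hypothetical and are not taken from the game code.

```python
def split_on_quotes(words):
    """Split a list of words into commands at each double quote.

    Returns (commands, quote_flag). quote_flag mimics QUOTE-FLAG: it is
    set by the first double quote and cleared by the second.
    """
    commands = []
    current = []
    quote_flag = False
    for word in words:
        if word == '"':
            quote_flag = not quote_flag   # toggle on every double quote
            if current:                   # the quote ends the current command
                commands.append(current)
                current = []
        else:
            current.append(word)
    if current:
        commands.append(current)
    return commands, quote_flag


commands, flag = split_on_quotes(
    ["TELL", "WIZARD", '"', "TURN", "OFF", "THE", "LAMP", '"']
)
print(commands)  # two commands: TELL WIZARD, then TURN OFF THE LAMP
print(flag)      # the flag is clear again after the closing quote
```

In the real game, the second command would then be processed with WINNER set to the actor. That step is left out of this sketch.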
Deadline expanded the methods for interacting with actors by allowing commands in these formats:

  • TELL/ASK actor TO command
  • Actor THEN command
PARSER would change the TO or THEN tokens to a double quote token which is then treated as an end-of-command token just like it was back in Zork 2. The verb number in ITBL would automatically be set to the verb number for TELL if the second format is used. HGTG later added a check to ensure the token after TO was a verb which indicated the start of another command.
The player in Deadline could also refer to actors with their name followed by a comma:

Actor, command

PARSER (through CLAUSE) treats the comma as an end-of-command token between two commands. The comma is then changed to “then” and processed as in the other formats.
Deadline also allowed the PLAYER to ask the actors directly about an object or ask them to give you an object using:

  • ASK actor ABOUT object
  • ASK actor FOR object
These are treated like any other command and depend on the object attributes.
Finally, Deadline also introduced a new routine which checks if the requested actor is in the current location and if the subsequent command is appropriate (action is WHAT, FIND, TELL, or SHOW) before allowing the commands to be processed. This check would be added to subsequent games.


7.9 Update: Titles and Adverbs (briefly)

Deadline introduced the recognition of titles used to address actors, like MRS. or MR., which are automatically skipped along with any following period. The game also recognized 5 specific adverbs (CAREFULLY, QUIETLY, SLOWLY, QUICKLY, and BRIEFLY) and saved the adverb’s Vocabulary address in ADVERB. It would be used later by specific action routines such as WATCH, GO, or READ. The adverbs were considered a special token and not a specific part of speech. Bureaucracy is the only game with adverbs, but they are not needed to complete the game.

7.10 Update: Numbers, a new object

Deadline introduced numbers and time values to Infocom games. Since these values do not match any vocabulary word, their tokens are given a $0000 Vocabulary address. NUMBER? will then change the $0000 to the Vocabulary address of an “intnum” or “number” token. A special global variable contains the numerical value. For time values, the time in minutes is saved:
  1. Loop through all the characters of an unknown token.
  2. If the character is a digit, then take the current sum, multiply it by 10, and add the value of this digit.
  3. If a “:” is found, then the previous digits are likely an hour value. Save this into a separate TIM variable and reset the total sum. Continue looping through the remaining digits, which will be the minutes value.
  4. Once all the digits are read (and there are no extra non-digit characters), the sum (value of the digits) is saved into P-NUMBER and this unknown token is given the Vocabulary address for “intnum”.
  5. If the sum is greater than 10000, then the routine will return FALSE.
  6. If the number was a time value, then the sum is actually the minutes value. The routine will take the hour value, multiply it by 60, and then add the sum. If the hour value is greater than 23, then the routine will return FALSE. There is no restriction on the size of the minutes value though. This new sum (time in minutes) is also saved in P-NUMBER.
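
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original routine: the function name is hypothetical, the result is returned instead of being stored in P-NUMBER, and rejecting a second “:” is an assumption because the text does not say how that case is handled.

```python
MAX_VALUE = 10000  # sums greater than this make the routine fail (step 5)

def parse_number(token):
    """Return the value of a number token, or the time in minutes for
    an H:MM token. Returns None where the routine would return FALSE."""
    total = 0
    hour = None  # plays the role of the separate TIM variable
    for ch in token:
        if ch in "0123456789":
            total = total * 10 + int(ch)       # step 2
        elif ch == ":" and hour is None:
            hour = total                       # step 3: digits so far are the hour
            total = 0
        else:
            return None                        # extra non-digit character
                                               # (a second ":" is rejected by assumption)
    if total > MAX_VALUE:
        return None                            # step 5
    if hour is not None:
        if hour > 23:
            return None                        # step 6: invalid hour
        total = hour * 60 + total              # minutes are not range-checked
    return total


print(parse_number("123"))    # 123
print(parse_number("7:30"))   # 450
print(parse_number("14:80"))  # 920, since the minutes value is unrestricted
print(parse_number("24:00"))  # None
```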

Routines can then search for the “intnum” token and use the value in the corresponding global variable. Because of this design, only one number or time value can be used in each command. Games using time values can impose certain restrictions on which hours are valid. For example, Deadline considers any time between 1:00 and 7:59 to be PM, so 7:00 is converted to 19:00 (or 7pm) for the game. Hours from 0 to 23 are considered valid in all games. Most games convert the hours and minutes into all minutes. Only Sherlock checks the value of the minutes, so a time value of 14:80 could be valid in most games that use time, since it is simply converted to minutes.
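
As a small illustration of the Deadline rule, the following Python sketch shifts hours 1 through 7 to PM before converting to minutes. The function name is hypothetical, and applying the shift before the conversion to minutes is an assumption about the order of operations.

```python
def deadline_minutes(hour, minute):
    """Convert an hour and minute to minutes, treating 1:00-7:59 as PM
    (assumed order: shift the hour first, then convert to minutes)."""
    if 1 <= hour <= 7:
        hour += 12
    return hour * 60 + minute


print(deadline_minutes(7, 0))  # 1140, which is 19:00
print(deadline_minutes(8, 0))  # 480, left as 8:00
```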
More number formats would be recognized in later games. They are listed below:
Added Format | Game Introduced | Notes
$xxx | Cutthroats | No cents portion is allowed. The value is saved in a different global variable than P-NUMBER. The “intnum” object has its property 11 set to “amount of money”.
xxx-yyyy | Suspect | The phone number is stored as two separate numbers in separate global variables.
xxx,xxx | LGOP | A comma-separated number is converted to a single 4-6 digit number.
xx(B-E), $xxx.yy, #xxxx | Bureaucracy | A number followed by a letter B-E indicates a seat row and letter (4 times the row number plus the seat letter converted to a 0-3 value). The seat value and a money value (converted to cents) are saved in their own special global variables. The object type returned is “intnum”, except for a money value, where “money” is used. If a # starts a number, it is ignored.
HH:MM AM/PM | Sherlock | The hour is converted to the corresponding 24 hour value. The entire time is saved in the game’s special time table format.
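
Three of the conversions in the table can be sketched in Python. The function names are hypothetical, the mapping of B-E to 0-3 in alphabetical order is an assumption, and the real routines store their results in global variables instead of returning them.

```python
def seat_value(text):
    """Bureaucracy seat such as '12C': 4 * row + letter value (B-E -> 0-3, assumed order)."""
    row, letter = text[:-1], text[-1].upper()
    if not row.isdigit() or letter not in "BCDE":
        return None
    return 4 * int(row) + "BCDE".index(letter)

def money_cents(text):
    """Bureaucracy money such as '$3.50', converted to cents."""
    dollars, _, cents = text.lstrip("$").partition(".")
    if not dollars.isdigit() or (cents and not (cents.isdigit() and len(cents) == 2)):
        return None
    return int(dollars) * 100 + (int(cents) if cents else 0)

def comma_number(text):
    """LGOP comma-separated number such as '12,345', converted to one number."""
    return int(text.replace(",", ""))


print(seat_value("12C"))       # 4 * 12 + 1 = 49
print(money_cents("$3.50"))    # 350
print(comma_number("12,345"))  # 12345
```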

Chapter 7. PARSER: How Now Brown Cow (Part 2) - Scanning Tokens

7.6 Start of PARSER: Where is the command?

PARSER can understand single commands or multiple commands separated by “.” or THEN by parsing them individually. It first decides whether the next command should come from the previously given input by checking if P-CONT is set. This variable contains the location, within the previous input, of the first token of the next command. It is set when more tokens remain after a complete command has been parsed. If P-CONT is clear, PARSER asks for new input from the user by printing a “>” character and calling the READ opcode.

7.7 Now, Traverse the Tokens…

Because of the structure of accepted commands, PARSER can search for a command in an efficient manner. It will walk through the tokens and look for ones with specific parts of speech. If a noun clause is found, PARSER will call another routine, CLAUSE, to find the start and end tokens for the direct object clause. If no error is returned by CLAUSE, PARSER will continue checking tokens and look for another noun clause (indirect object clause) or end-of-command token. The order for checking tokens is:
  1. Invalid Token

All valid tokens have the address of their associated entry in the vocabulary. Any invalid or unmatched tokens are given an address of $00. If one is found, an unknown-word error message is printed and FALSE is returned.
  2. End-of-Command Token

An end-of-command token (THEN or “.”) will stop the loop and jump to the post-loop processing. If there are more tokens, PARSER will save the position of the next token, which it will use as the start of the next command.
  3. Direction Token

GO is the only verb with its own specific check in PARSER since it is the most common command given. A direction token has an associated direction value, which is the property number for a room’s exits in that direction. There are 4 special scenarios where this direction value is saved and the loop is stopped:

    1. This is a 1 token command; just the direction is given.
    2. This is a 2 token command and the verb GO was already given (as in “GO EAST”).
    3. There are more tokens after the direction, and the next token is an end-of-command token.
    4. There are more tokens after the direction, and the next token is a conjunction token (AND or “,”). If so, the conjunction token is changed to “then” to indicate a new command. So a series of direction commands separated by commas or “and” become separate commands.
PRSA is set to GO (if not already done). PRSO is later set to the direction value.
  4. Verb Token

If a verb token is found, PARSER checks if a verb has already been found. A command cannot have two verbs. If no verb has been found yet, this verb’s verb number and the address of a verb table holding the 4 byte token data are stored in words 0 and 1 of ITBL. If a verb has already been found, PARSER will see if the word could also be a different valid part of speech.
  5. Preposition, Quantity, Adjective, and Noun Tokens

Any of the above tokens indicates a noun clause is starting. The number of noun clauses variable is incremented. A separate routine (CLAUSE) will then find the end of the noun clause and store the start and end addresses in ITBL. CLAUSE then returns the start address of any remaining tokens PARSER should process. There are several exceptions where CLAUSE is not executed:

    1. If the matched adjective or noun is followed by OF, PARSER will ignore the adjective and noun and use the token after OF for the start of the noun clause. This new token will be matched on a subsequent loop.
    2. If there are no more tokens, or an end-of-command token comes next after the matched preposition, and fewer than 2 noun clauses have been found, the preposition address and value information will be saved into ITBL (Words 2 and 3).
    3. If there are already 2 noun clauses, a "Too many noun clauses??" error is given. PARSER will then return FALSE.
  6. Special Token

Any remaining special tokens that do not affect the syntax or objects requested will be ignored. This includes tokens like IS, YES, A, or THE.
  7. Improper/extra tokens - Syntax Error

All other situations are a syntax error. Therefore, PARSER will display a can’t-use-the-word error and return FALSE.
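
The four direction scenarios described in item 3 above can be sketched in Python. This is a simplified illustration: it works on a list of uppercase word strings instead of LEXV entries, and the function and constant names are hypothetical.

```python
END_OF_COMMAND = {"THEN", "."}
CONJUNCTIONS = {"AND", ","}

def direction_ends_command(words, i, verb=None):
    """Return True if the direction at words[i] should stop the token loop.

    words is the list of tokens being scanned and verb is the verb seen
    so far (or None). In scenario 4 the conjunction after the direction
    is rewritten to THEN in place.
    """
    if len(words) == 1:
        return True                      # scenario 1: direction only
    if len(words) == 2 and verb == "GO":
        return True                      # scenario 2: GO + direction
    if i + 1 < len(words):
        following = words[i + 1]
        if following in END_OF_COMMAND:
            return True                  # scenario 3
        if following in CONJUNCTIONS:
            words[i + 1] = "THEN"        # scenario 4: start a new command
            return True
    return False


words = ["NORTH", ",", "EAST"]
print(direction_ends_command(words, 0))  # True
print(words)                             # the comma has become THEN
```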
The first version of PARSER could understand multiple commands if they were separated by THEN or “.”. Verbs without objects joined by AND or “,”, like

JUMP AND LOOK

were not understood. However, commands with objects separated by AND or “,”, like

OPEN MAILBOX AND GET LEAFLET

are accepted as CLAUSE would realize the AND separates two commands.

Tuesday, February 19, 2019

Chapter 7. PARSER: How Now Brown Cow (Part 1) - Data Structures

  • Arguments: None
  • Result: TRUE if command is valid, FALSE if command is not valid

7.1 Introduction

Probably the most intriguing feature of Infocom games has always been the parser. The minimal online documentation touches only on the features of PARSER but never on the mechanism behind it. Since Zork 1, the general approach to parsing has remained relatively unchanged across successive Infocom games. Improvements were made, but they mainly expanded the syntaxes that the game would understand. Many games have special commands or ways to interact with the PLAYER that result in modifications to PARSER. Finally, the first EZIP game, AMFV, added new parser commands like OOPS.

7.2  Characters vs. Tokens

The ZIP language's READ routine takes a sequence of characters and stores it in the input buffer (INBUF). The first byte in that buffer has the number of characters in the input. The input ends with a zero byte and does not include the terminating character, such as a carriage return. READ then matches words in the input buffer to those in the game’s vocabulary and creates a separate buffer of values corresponding to the matched words (called tokenizing). Words are separated by a space or by designated separator characters stored in the vocabulary. For each word, a 4 byte block is created from three pieces of information. First, the Z-string of the word is created and matched against the words in the vocabulary. All ZIP 1 to 3 games were limited to the first six characters of a word. ZIP 4 and 5 could use up to nine characters. If a match is found, the address of that word in the vocabulary table is saved in the first two bytes of the block. If no match is found, 0 is used. The third byte in the block is the length of the token. The last byte is the offset of the token from the start of the input buffer. The token buffer (LEXV) starts with two bytes: the first is the maximum number of tokens allowed, and the second is the actual number of tokens in the buffer. The rest of the token buffer consists of the 4 byte token data blocks that represent the words in the input.
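
The tokenizing step can be sketched in Python. This is a rough illustration under several assumptions: the vocabulary is modeled as a dictionary from truncated words to addresses instead of encoded Z-strings, the separator set is a guess at typical values, separators are treated as tokens of their own, and offsets assume the text starts at byte 1 of the input buffer. The function name and the address used for “.” are hypothetical.

```python
SEPARATORS = {".", ",", '"'}  # assumed separator characters from the vocabulary
WORD_LIMIT = 6                # ZIP 1-3 match only the first six characters

def tokenize(text, vocabulary, text_start=1):
    """Return one (vocab_address, length, offset) tuple per token,
    mirroring the three pieces of information in each 4 byte block."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == " ":
            i += 1
            continue
        start = i
        if text[i] in SEPARATORS:
            i += 1                                   # a separator is its own token
        else:
            while i < len(text) and text[i] != " " and text[i] not in SEPARATORS:
                i += 1
        word = text[start:i]
        address = vocabulary.get(word[:WORD_LIMIT].lower(), 0)  # 0 = no match
        tokens.append((address, len(word), start + text_start))
    return tokens


vocabulary = {"inflat": 0x4386, "search": 0x4967, ".": 0x4000}
for token in tokenize("inflate boat. search", vocabulary):
    print(token)
```

Here “inflate” matches the entry for “inflat” because only six characters are compared, while “boat” is not in this toy vocabulary and receives address 0.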

7.3 PARSER Variables and Grammatical Structure Definitions

According to “The Parser’s Role” section of “Learning ZIL”, PARSER takes the input and tries to identify the action number for PRSA and the object numbers for PRSO (parser direct object) and PRSI (parser indirect object). This is not quite correct. PARSER does set PRSA to the action referred to by the verb in the input. However, it does not set PRSO or PRSI. The only exception is when PRSA is GO, in which case PRSO is set to the exit direction. The routine actually fills two tables (P-PRSO and P-PRSI) with all the direct and indirect objects requested in the command.
The basic grammar structure for a command is:

verb + prep + noun clause + prep + noun clause + end-of-command +
verb + prep + noun clause + prep + noun clause + end-of-command ...
Only the verb is required. All other parts are optional. A noun clause is a noun or set of nouns connected by conjunctions (AND or commas). These nouns can be modified by adjectives, quantifiers (ALL, A, or ONE), or other special tokens (OF, BUT, or EXCEPT), but not by prepositions. The entire noun clause is referred to as a direct or indirect object clause. Prepositions are not included in the noun clause.
For example:

DROP THE YELLOW BALL AND CROWBAR
INSERT A DOLLAR INTO THE RED SLOT 
TAKE ALL EXCEPT THE CANDLES
THE YELLOW BALL AND CROWBAR, A DOLLAR, THE RED SLOT, and ALL EXCEPT THE CANDLES are the noun clauses.
Individual commands can be connected together with end-of-command tokens (THEN, AND, commas, or periods) that indicate where a command stops. So:

DROP THE YELLOW BALL AND TAKE THE CROWBAR
DROP THE YELLOW BALL. TAKE THE CROWBAR

are equivalent to

DROP THE YELLOW BALL THEN TAKE THE CROWBAR
Commas can separate multiple objects in a single noun clause or indicate the start of a new command. So:

DROP THE BALL, A RAKE AND SHOVEL
DROP THE BALL, TAKE THE SHOVEL

are processed differently based upon the type of the word after the comma. More details are given below.

7.4 PARSER Table (ITBL)

The main goal of PARSER is to extract the action (PRSA) and valid objects in the direct and indirect object clauses from the given command. To assist other routines in extracting this information, PARSER will store specific information related to the verb, prepositions, and location of noun clauses in a 10 word table, ITBL:
Word | Contents
Word 0 | Verb Number (VERB)
Word 1 | Verb Table Address (VERBN)
Word 2 | Prep Number (PREP1)
Word 3 | Addr of Prep (PREP1N)
Word 4 | Prep Number (PREP2)
Word 5 | Addr of Prep (PREP2N)
Word 6 | Start Addr of Direct Clause (NC1)
Word 7 | End Addr of Direct Clause (NC1L)
Word 8 | Start Addr of Indirect Clause (NC2)
Word 9 | End Addr of Indirect Clause (NC2L)
The verb number is a unique value for similar meaning verbs in the vocabulary. It is not the same as the action number. A verb will have the same verb number no matter the context of its use but it could have a different action number.  For example, LOOK has the verb number $E9 in all syntaxes, but the action number for LOOK FOR is $2D, LOOK IN is $3F, and LOOK is $D0 corresponding to different types of actions. The verb table contains the same information about the verb as in the token buffer: verb’s address (in Vocabulary), length, and location in the input buffer (INBUF). The start and end addresses of a clause refer to locations in the token buffer (LEXV).  Of note, this end address actually points to the token AFTER the last included token in the particular clause.
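
The ITBL layout and the "end points one past the last token" convention can be sketched in Python. The names follow the labels in the table above, but modeling ITBL as a Python list and using list indexes instead of LEXV addresses are assumptions made for illustration. The helper function is hypothetical.

```python
from enum import IntEnum

class ITBL(IntEnum):
    """Word offsets into the 10 word parser table."""
    VERB = 0
    VERBN = 1
    PREP1 = 2
    PREP1N = 3
    PREP2 = 4
    PREP2N = 5
    NC1 = 6
    NC1L = 7
    NC2 = 8
    NC2L = 9

def direct_clause(itbl, tokens):
    """Return the tokens of the direct object clause.

    Because the stored end points at the token AFTER the clause,
    it can be used directly as an exclusive slice bound.
    """
    return tokens[itbl[ITBL.NC1]:itbl[ITBL.NC1L]]


tokens = ["DROP", "THE", "YELLOW", "BALL"]
itbl = [0] * len(ITBL)
itbl[ITBL.NC1] = 1    # clause starts at THE
itbl[ITBL.NC1L] = 4   # one past BALL
print(direct_clause(itbl, tokens))
```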

7.5 Checking Word Types with WT?

  • Arguments (Address, word type to match, word type to return)
  • Return ID value or FALSE if no match
WT? is one of the most important routines in Infocom games. It checks whether the Vocabulary entry at the given address has the given word type. This is the primary word type as described in Section 2.6. If the primary word type does not match the given word type, WT? returns FALSE. If there is a match, the result depends on the third argument. If no third argument is given, the routine returns TRUE. If the third argument matches the secondary word type (as described in Section 2.6), the secondary ID is returned. Otherwise, the primary ID is returned, regardless of whether it is valid for that word type.
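Group | Bit(s) | Value | Word type
Primary word type | Bit 7 | $80 | Noun
Primary word type | Bit 6 | $40 | Verb
Primary word type | Bit 5 | $20 | Adjective
Primary word type | Bit 4 | $10 | Direction
Primary word type | Bit 3 | $08 | Preposition
Primary word type | Bit 2 | $04 | Special
Secondary word type | Bits 1-0 | $00 | Noun
Secondary word type | Bits 1-0 | $01 | Verb
Secondary word type | Bits 1-0 | $02 | Adjective
Secondary word type | Bits 1-0 | $03 | Direction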
Using:

$4386:3A 6B C4 D9 62 B5 B4

the first 4 bytes are the Z-string for “inflat”. $62 indicates it is a verb and an adjective. So,

CALL WT?($4386, $40 or $20) will return TRUE.
CALL WT?($4386, $40 or $20, $02) will return the secondary ID of $B5.
CALL WT?($4386, $40 or $20, $00 or $01 or $03) will return the primary ID of $B4.
CALL WT?($4386, $10) will return FALSE as there is no match.
Infocom games interestingly do not put the special ID values for directions, verbs, or adjectives as the primary ID value when those tokens only have one word type. For example, the Vocabulary entry for “search” is:

$4967:61 46 DD 0D 41 E0 00

with the word type as $41 (primary type is verb, secondary type is verb). The primary ID value is $00 though. To get the verb number ($E0), you have to access the secondary ID by using $01 as the third argument:

CALL WT?($4967, $40, $01)

Since a third argument is needed anyway to get an ID value, the designers apparently chose to just store it as the secondary ID value.
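
The behavior of WT? described in this section can be sketched in Python using the two Vocabulary entries quoted above. This is an illustration of the described logic and not a transcription of the original routine. The function name and the use of a bytes object for the 7 byte entry are assumptions.

```python
# The two 7 byte Vocabulary entries quoted in the text.
INFLAT = bytes([0x3A, 0x6B, 0xC4, 0xD9, 0x62, 0xB5, 0xB4])
SEARCH = bytes([0x61, 0x46, 0xDD, 0x0D, 0x41, 0xE0, 0x00])

TYPE_BYTE = 4       # word type flags
SECONDARY_ID = 5    # ID that belongs to the secondary word type
PRIMARY_ID = 6      # the other ID value

def wt(entry, word_type, secondary=None):
    """Sketch of WT?: test a word type bit and optionally return an ID."""
    flags = entry[TYPE_BYTE]
    if not flags & word_type:
        return False                      # primary word type does not match
    if secondary is None:
        return True                       # no third argument given
    if flags & 0x03 == secondary:
        return entry[SECONDARY_ID]        # third argument matches bits 1-0
    return entry[PRIMARY_ID]


print(wt(INFLAT, 0x40))             # True
print(hex(wt(INFLAT, 0x20, 0x02)))  # 0xb5
print(hex(wt(INFLAT, 0x40, 0x01)))  # 0xb4
print(wt(INFLAT, 0x10))             # False
print(hex(wt(SEARCH, 0x40, 0x01)))  # 0xe0
```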

Almost every game uses the same WT? routine. Some games did not even use a separate routine but just hard-coded the check where needed. LGOP added an additional word type check for nouns ($80). If it was found, the routine would exit quickly (no value is returned for nouns) and bypass checking for secondary word types. Sherlock uses the newer compressed Vocabulary entry format. Since secondary word type checking happened mainly with prepositions, its WT? only allows prepositions to be checked when a secondary word type argument is given. This is done by searching the Preposition table. If a match is found, the preposition value is calculated from the token’s position in the Preposition table. Internal Infocom notes mention a special WT? for The Lurking Horror where 3 ID values were stored for each token, but no evidence of this routine can be found in the 3 known game releases.

Monday, February 18, 2019

Chapter 6. MAIN-LOOP - Heart of the Game

  • Arguments: None
  • Return: None

6.1. Introduction

MAIN-LOOP is the heart of all Infocom games and keeps the game structure orderly. It repeatedly requests parsed commands and loops indefinitely. MAIN-LOOP was not modified much in newer games. Many of the changes were made to simplify programming game-specific details and restrictions. These changes essentially provided more checks on the player input and better responses. Only significant changes to MAIN-LOOP will be described later.

6.2. The Details

MAIN-LOOP will call PARSER to ask for and process a user’s command. If PARSER cannot properly parse the command, MAIN-LOOP will continue to call PARSER to process new commands. If it has successfully processed a command, PARSER will set PRSA (parser action) to the requested action number (the 8th byte in a syntax entry) and fill the PRSO (parser direct object) and PRSI (parser indirect object) tables (P-PRSO and P-PRSI) with all the direct and indirect objects requested. This is different from what is described in “Learning ZIL”. MAIN-LOOP then loops and acts upon all the objects:
  1. Check the number of objects in the direct and indirect clauses.
  2. If the direct object clause has no objects, then see if the action is GO.
    1. If so, then call PERFORM with GO and the direction in PRSO.
    2. If no objects are needed for the requested action, then call PERFORM on PRSA with no objects.
    3. If at least 1 object clause is needed for the requested action, then print an error message. Display a specific error message if the command is an invalid response to an orphaned command.
  3. One clause is designated the multiple-object clause, and the other clause contributes a single constant object, the first one in that clause.
  4. Call PERFORM with the requested action once for each object in the multiple-object clause, while only the first object of the other clause is used.
    1. If M-END is returned, then halt the processing of multiple objects. Erase any remaining commands.
    2. If M-END is not returned, then continue looping through the multiple objects.
  5. Increase the number of turns by 1 (even if multiple objects are processed).
  6. Call CLOCKER to check interrupts even if the given command was not valid. This was later changed in Deadline and other future games to only calling CLOCKER if PARSER was successful.

6.3. Details of Multiples of Multiples

The MAIN-LOOP handles commands with multiple objects for a given action. It will loop through these objects and execute the same action for each object. However, there is some confusion as to how it determines preference if two sets of objects are given. The examples below show how MAIN-LOOP iterates through multiple objects.

Multiple Direct Objects: IGNITE CANDLE AND PAPER WITH TORCH
  • IGNITE CANDLE WITH TORCH
  • IGNITE PAPER WITH TORCH
Multiple Indirect Objects: CUT TREE WITH AXE AND SWORD
  • CUT TREE WITH AXE
  • CUT TREE WITH SWORD
Multiple Direct and Indirect Objects: IGNITE CANDLE AND PAPER WITH TORCH AND FIRE
  • IGNITE CANDLE WITH TORCH
  • IGNITE PAPER WITH TORCH
So any additional indirect objects are ignored when there are both multiple direct and indirect objects. MAIN-LOOP will always iterate through the direct object clause if it has the same or more objects than the indirect clause. The indirect object remains constant (the first one in the clause) for all iterations. The only exception is for only 1 direct object and multiple indirect objects. MAIN-LOOP will then iterate through the indirect objects while keeping the direct object constant.
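
The iteration rule described above can be sketched in Python. The function name is hypothetical, and the objects are plain strings here instead of entries in the P-PRSO and P-PRSI tables.

```python
def object_pairs(direct, indirect):
    """Return the (direct, indirect) pairs PERFORM would be called with."""
    if len(direct) == 1 and len(indirect) > 1:
        # The only exception: iterate the indirect objects.
        return [(direct[0], obj) for obj in indirect]
    # Otherwise iterate the direct objects and keep the first indirect one.
    constant = indirect[0] if indirect else None
    return [(obj, constant) for obj in direct]


print(object_pairs(["CANDLE", "PAPER"], ["TORCH"]))
print(object_pairs(["TREE"], ["AXE", "SWORD"]))
print(object_pairs(["CANDLE", "PAPER"], ["TORCH", "FIRE"]))  # FIRE is ignored
```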

6.4 Update: Managing Global Variables

Only minor improvements were made to the handling of the PRSA, PRSO, and PRSI variables. Updating the L- versions of these variables, which are used by the AGAIN command, was moved into MAIN-LOOP starting with Zork 2. Later games would exclude updating these variables if certain commands were used. Zork 3 added the option of checking the LIT variable for commands that require no objects. If it was clear, then an “It’s too dark to see” error would be given. Zork 2 (R28) also moved the updating of the IT-OBJECT and its location variable into MAIN-LOOP instead of PERFORM. LGOP and Plundered Hearts also added a specific check on the visibility of the IT-OBJECT in MAIN-LOOP.

6.5 Update: How many NOT-HERE-OBJECTs?

To generate better responses when some objects are missing from a command, MAIN-LOOP (since Infidel) started to count how many requested objects in a multiple object command were not present. This number was kept in the global P-NOT-HERE variable and used to provide a more specific error message for missing objects. For example, if P-NOT-HERE was greater than 1, the error message would use “objects” instead of “object”. One final coding relic is the P-MULT flag. It is cleared and set in MAIN-LOOP, but has been used only in Infidel’s NOT-HERE-OBJECT ACTION routine. All subsequent ZIP 3 games since Infidel still set this flag, but it is never used. It is also present in various developmental versions but not used. Its true function remains unclear.

6.6 Update: Checking For Invalid Exceptions

Starting with Deadline, MAIN-LOOP would check for specific invalid situations where an action should not be done on a specific object. Deadline ensured that none of the objects referred to in a command was the WINNER. Planetfall had its own special check on objects used with “PICK UP” by making sure the PRSO was on/in PRSI. If not, it would skip over that PRSO.
Wishbringer (R69) would be the first game to check for these exceptions in a separate routine instead of in MAIN-LOOP. It would be called whenever the GETFLAG mode in the game is set to ALL. MAIN-LOOP will iterate through each object in a multiple object clause and check whether any invalid exception exists before sending it off to PERFORM. If an invalid exception exists, PERFORM will be skipped. This CheckException routine usually looks for actions that require the object to be held or local, such as DROP or INSERT. If such a command is being used, the routine will check, for example, whether the proper attributes are set (such as TAKEN), whether the object is held by the WINNER, or whether it has not already been inserted. Every game since Wishbringer has such a routine.