[z-machine] Tokenizing

Mose Wingert beska_miltar@hotmail.com
Sat, 19 Mar 2005 12:06:25 -0500


>> For instance, the dictionary section of the Z Machine spec says that 
>> dictionary words can start with non-alpha characters...they give 
>> "#record" as an example...

> As far as I know, yes, you can have capital letters in a dictionary word. 
> Nobody *does* it, and I expect the Inform compiler refuses to generate...

Ok...that makes sense.  Thanks for the quick response!

Second (and hopefully, last) issue with this...

Say the word is "#record", and we're in a v1 or v2 game.  Then the 
zcharacters to encode this could be {3, 23, 23, 10, 8, 20} in 
decimal...which matches "#reco", rather than "#recor"...though we match 
all 6 zcharacters in the dictionary, one is "wasted" switching alphabets.  Is 
that right?  So far I feel like things are fine...
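
To make the arithmetic above concrete, here is a minimal Python sketch of encoding with a single (non-locking) shift before each A2 character. The function name encode_single_shift and its parameters are made up for illustration. It assumes the A2 row as laid out for v2 and later, where '#' is zcharacter 23; v1's A2 row differs slightly, so the numbers may shift there. It also assumes a 6-zcharacter dictionary word padded with 5s.

```python
A0 = "abcdefghijklmnopqrstuvwxyz"          # zchars 6..31 in alphabet A0
A2_TAIL = "0123456789.,!?_#'\"/\\-:()"     # zchars 8..31 in alphabet A2 (v2+ layout)

def encode_single_shift(word, shift_a2=3, length=6):
    """Encode a word as zchars, putting a one-character shift before
    every A2 character (3 is the v1/v2 single shift from A0 to A2)."""
    zchars = []
    for ch in word:
        if ch in A0:
            zchars.append(A0.index(ch) + 6)
        elif ch in A2_TAIL:
            zchars.append(shift_a2)              # costs one zchar each time
            zchars.append(A2_TAIL.index(ch) + 8)
        else:
            raise ValueError("character not handled in this sketch: %r" % ch)
    zchars = zchars[:length]                     # dictionary words are truncated
    zchars += [5] * (length - len(zchars))       # short words are padded with 5s
    return zchars

print(encode_single_shift("#record"))  # [3, 23, 23, 10, 8, 20]
```

Only five text characters fit, because the shift takes up one of the six slots.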

But what if our word is something like "#####"?  In a v1 or v2 game we could 
encode this using shift-lock characters as {5, 23, 23, 23, 23, 23}, which 
would match against the dictionary "#####"...or we could encode it like we 
would be forced to in a v3+ game, without shift-lock characters...{3, 23, 3, 
23, 3, 23}, which would match against the dictionary "###".  My guess is 
that tokenizing should always use shift rather than shift-lock 
characters...but I don't see that this is enforced by the spec...
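
The ambiguity can be seen by decoding both sequences under v1/v2 shift rules. The following is only a sketch: decode_v1v2 is a hypothetical helper, it uses the v2 A2 row (so '#' is 23), it handles zcharacter 0 as a space, and it ignores zcharacter 1 and the A2 escape entirely.

```python
ALPHABETS = [
    "abcdefghijklmnopqrstuvwxyz",               # A0
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",               # A1
    " \n" + "0123456789.,!?_#'\"/\\-:()",       # A2; index 0 stands in for the escape
]

def decode_v1v2(zchars):
    """Decode zchars using v1/v2 rules: 2 and 3 shift for one character,
    4 and 5 shift and lock. Alphabets rotate A0 -> A1 -> A2 -> A0."""
    locked = 0       # alphabet that stays in force
    current = 0      # alphabet for the next character only
    out = []
    for z in zchars:
        if z in (2, 3):                          # temporary shift
            current = (locked + (1 if z == 2 else 2)) % 3
        elif z in (4, 5):                        # shift-lock
            locked = (locked + (1 if z == 4 else 2)) % 3
            current = locked
        elif z == 0:
            out.append(" ")
            current = locked
        elif z >= 6:
            out.append(ALPHABETS[current][z - 6])
            current = locked                     # a temporary shift expires here
    return "".join(out)

print(decode_v1v2([5, 23, 23, 23, 23, 23]))  # #####
print(decode_v1v2([3, 23, 3, 23, 3, 23]))    # ###
```

Both sequences are valid v1/v2 text, yet they are different 6-zcharacter keys, so a lookup only succeeds if the tokenizer picks the same convention the dictionary was built with.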

Obviously, if I want to match dictionary words that contain characters from 
A2, I have to use the same shift sequence that was used for creating 
the dictionary.  Is it always "shift" rather than "shift-lock" for v1-v2?

Mose