[z-machine] Tokenizing
Mose Wingert
beska_miltar@hotmail.com
Sat, 19 Mar 2005 12:06:25 -0500
>> For instance, the dictionary section of the Z Machine spec says that
>> dictionary words can start with non-alpha characters...they give
>> "#record" as an example...
> As far as I know, yes, you can have capital letters in a dictionary word.
> Nobody *does* it, and I expect the Inform compiler refuses to generate...
Ok...that makes sense. Thanks for the quick response!
Second (and hopefully, last) issue with this...
Say the word is "#record", and we're in a v1 or v2 game. Then the
Z-characters to encode this could be {3, 23, 23, 10, 8, 20} in
decimal...which matches "#reco", rather than "#recor"...though we match
all 6 Z-characters in the dictionary, one is "wasted" switching alphabets. Is
that right? So far I feel like things are fine...
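
A minimal sketch of that encoding, to make the "wasted" slot concrete. It
assumes the v2 alphabet table (the v1 A2 row differs slightly, so the values
there should be checked against the spec), a 6 Z-character dictionary
resolution, and padding with 5s. The name encode_single_shift is made up for
illustration, and A1 and ZSCII escapes are left out.

```python
# Hypothetical helper: encode a word for dictionary lookup using only
# single-shift Z-characters (no shift-lock). Assumes the v2 alphabet rows.
A0 = "abcdefghijklmnopqrstuvwxyz"
# A2 row: Z-chars 6 and 7 are escape and newline (placeholders here).
A2 = "\x00\n0123456789.,!?_#'\"/\\-:()"

SHIFT_A2 = 3      # v1-v2 single shift from A0 to A2 (v3+ uses 5 instead)
DICT_ZCHARS = 6   # dictionary resolution in v1-v3
PAD = 5           # padding Z-character for short words


def encode_single_shift(word):
    zchars = []
    for ch in word:
        if ch in A0:
            zchars.append(6 + A0.index(ch))
        elif ch in A2[2:]:
            # Each A2 character costs two slots: the shift, then the character.
            zchars.extend([SHIFT_A2, 6 + A2.index(ch)])
        else:
            raise ValueError(f"character {ch!r} not handled in this sketch")
    zchars = zchars[:DICT_ZCHARS]                 # truncate to the resolution
    zchars += [PAD] * (DICT_ZCHARS - len(zchars))  # pad short words
    return zchars


print(encode_single_shift("#record"))
```

With "#record" the shift takes the first slot, so only "#reco" fits in the
remaining five.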
But what if our word is something like "#####"? In a v1 or v2 game we could
encode this using shift-lock characters as {5, 23, 23, 23, 23, 23}, which
would match against the dictionary "#####"...or we could encode it like we
would be forced to in a v3+ game, without shift-lock characters...{3, 23, 3,
23, 3, 23}, which would match against the dictionary "###". My guess is
that tokenizing should always use shift rather than shift-lock
characters...but I don't see that this is enforced by the spec...
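
To compare the two encodings side by side, here is a rough decoder sketch
under the v1-v2 rules as I read them: 2 and 3 shift for one character only,
4 and 5 shift and lock. It assumes the v2 alphabet rows, and decode_v2 is a
made-up name. Abbreviations, newline and ZSCII escapes are not handled.

```python
# Hypothetical decoder for v1-v2 style Z-character sequences.
ALPHABETS = [
    "abcdefghijklmnopqrstuvwxyz",           # A0
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",           # A1
    "\x00\n0123456789.,!?_#'\"/\\-:()",     # A2 (6 = escape, 7 = newline)
]


def decode_v2(zchars):
    locked = 0    # alphabet in force after the last shift-lock
    current = 0   # alphabet used for the next character only
    out = []
    for z in zchars:
        if z in (2, 3):
            # Single shift: affects the next character only.
            current = (locked + (1 if z == 2 else 2)) % 3
        elif z in (4, 5):
            # Shift-lock: stays in force until another lock.
            locked = (locked + (1 if z == 4 else 2)) % 3
            current = locked
        elif z >= 6:
            out.append(ALPHABETS[current][z - 6])
            current = locked
        elif z == 0:
            out.append(" ")
            current = locked
        else:
            raise ValueError("Z-char 1 is not handled in this sketch")
    return "".join(out)


print(decode_v2([5, 23, 23, 23, 23, 23]))  # shift-lock encoding
print(decode_v2([3, 23, 3, 23, 3, 23]))    # single-shift encoding
```

The two sequences decode to different strings, so a tokenizer and a
dictionary builder that pick different strategies will not agree on the
encoded bytes for the same word.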
Obviously, if I want to match dictionary words that contain characters from
A2, I have to use the same shift sequence that was used for creating
the dictionary. Is it always "shift" rather than "shift-lock" for v1-v2?
Mose