In-Database Parsing

The LambdaMOO multiuser, network-accessible, text-based virtual environment server, commonly called a MUD server, contains a simple command parser designed specifically for use in virtual environment (VE) situations. This made non-VE interfaces, such as those for programming and administration tasks, difficult to program and caused ambiguous command parsing.

With the release of version 1.7.7 of the LambdaMOO server, an easy and straightforward method of extending the command parser was included, via $do_command():

  -- Added the first rudimentary support for in-DB command parsing.  Each user
     command is broken up into words, a list of which is passed as the arguments
     in a call to #0:do_command(), if it exists, with `argstr' initialized to the
     raw command line typed by the user.  If #0:do_command() does not exist, or
     if that verb-call completes normally (i.e., without suspending or aborting)
     and returns a false value, then the built-in command parser is invoked as
     usual to handle the command.  Otherwise, it is assumed that the DB code
     handled the command completely and no further action is taken by the server
     for that command.
This opened up new doors in how the commands that users entered were parsed, and in what kinds of interfaces could be used to specify how to do the parsing and what actions to perform. Previously, the only way the database could get hold of the string that the user had entered was after the server had tried to do something with it. This mechanism, commonly termed the :huh stack, allowed a rudimentary form of multiple inheritance and code sharing for command-line actions. Unfortunately, the commonly distributed and used :huh stack was programmed to emulate server-style parsing, creating a bottleneck in both design and use.
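The control flow in the release note can be sketched in a few lines. This is an illustration in Python rather than MOO code, and every name in it (handle_command, split_words, the do_command and builtin_parser callables) is hypothetical. It also ignores the suspend/abort case and splits words on whitespace only.

```python
def split_words(line):
    """Break a raw command line into words (simplified: whitespace only)."""
    return line.split()


def handle_command(line, builtin_parser, do_command=None):
    """Route one command line the way the release note describes.

    do_command stands in for #0:do_command(). It receives the word list
    and the raw line (argstr). A true return value means the database
    handled the command; a false one hands it to the built-in parser.
    """
    if do_command is not None and do_command(split_words(line), line):
        return "handled in database"
    builtin_parser(line)
    return "handled by built-in parser"
```

A hook that only claims commands starting with "@" would leave ordinary VE commands such as "take bag" to the built-in parser.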

Server-style Command Parsing

The server has a built-in command parser. It assumes it is operating on a virtual environment object model, and works quite well for such cases.

There are three forms a command can take under server-style parsing:

  verb
  verb direct-object
  verb direct-object preposition indirect-object

verb can be any word, but for ease of use in VE situations it makes sense for this word to be a verb as a part of speech. Doing so makes it much easier for newcomers to learn the commands.

direct-object and indirect-object come from the set this, any, or none. Commands are bound to objects in the MOO, and specifying this tells the server to match the string entered against the name and aliases of the object that the command is defined on. any matches any string, including the empty string, and none matches only the empty string.

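preposition can be none, any, or one of the following sets (the alternatives within a set are treated as equivalent):

  with/using
  at/to
  in front of
  in/inside/into
  on top of/on/onto/upon
  out of/from inside/from
  over
  through
  under/underneath/beneath
  behind
  beside
  for/about
  is
  as
  off/off of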

This allows commands such as:

  command entered                  how it is specified
  take bag                         bag:take this none none
  cut steak with knife             steak:cut this with any
  spread butter on top of toast    toast:spread any on top of this
  tell frank about dinner          frank:tell this about any

The objects that define these commands (bag, steak, toast, and frank) each have a verb whose argument specification matches the command as it was entered.
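The matching rules for this, any, and none can be sketched as follows. This is a simplified Python illustration, not the server's code: it knows only three prepositions, compares object names exactly (no aliases or partial matches), and handles only none or a specific preposition in the specification. The function names are hypothetical.

```python
PREPOSITIONS = ["on top of", "with", "about"]  # small subset for this sketch


def parse(line):
    """Split a line into (verb, dobjstr, prep, iobjstr) at the first preposition."""
    words = line.split()
    verb, rest = words[0], words[1:]
    for i in range(len(rest)):
        for prep in PREPOSITIONS:
            p = prep.split()
            if rest[i:i + len(p)] == p:
                return verb, " ".join(rest[:i]), prep, " ".join(rest[i + len(p):])
    return verb, " ".join(rest), None, ""


def arg_matches(spec, text, obj_names):
    """Apply one direct-object or indirect-object specifier to a string."""
    if spec == "this":
        return text in obj_names      # must name the object defining the verb
    if spec == "none":
        return text == ""             # only the empty string
    return True                       # "any" accepts every string


def verb_matches(spec, obj_names, command):
    """spec is (name, dobj, prep, iobj), e.g. ("cut", "this", "with", "any")."""
    verb, dobjstr, prep, iobjstr = command
    name, dobj_spec, prep_spec, iobj_spec = spec
    prep_ok = (prep is None) if prep_spec == "none" else (prep == prep_spec)
    return (verb == name and prep_ok
            and arg_matches(dobj_spec, dobjstr, obj_names)
            and arg_matches(iobj_spec, iobjstr, obj_names))
```

With these definitions, verb_matches(("cut", "this", "with", "any"), ["steak"], parse("cut steak with knife")) is true, while the same specification rejects "cut bread with knife" because the direct object does not name the steak.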

Problems with Server-style Command Parsing

Using server-style parsing when programming for non-VE interfaces forces the programmer into a lot of overhead and duplicate work. For example, in the above command cut steak with knife, the code for steak:cut would have to check that the indirect-object is a knife, since you don't want to allow the steak to be cut with just anything, like a beach ball. In that case, an appropriate message telling the user that they cannot do that would make more sense.
cut steak with traffic cone
I see no "traffic cone" here.
cut steak with headphones
The headphones do not seem sharp enough to cut the steak.
Unfortunately, the programmer must write the code in steak:cut to recognize the indirect-object as something they can cut with. This can be a problem if the thing they are referring to is not within the scope that the server matches object names against.

This means that two steps may be necessary in the verb's code to determine if what is being done can really be done (according to the problem domain of the verb).

  1. Verify that the indirect-object was actually matched by the server, and if not, call other related matching routines (on LambdaCore based MOOs, this would be $string_utils:match_object or player:my_match_object). This could also fail, in which case you'd like to notify the user that there is nothing within their scope that matches the indirect-object string they entered (again, on LambdaCore based MOOs, this would be $command_utils:object_match_failed).
  2. Check that the object the action is being performed with (per our example) is actually usable within the problem domain of the verb. This usually means tracing the ancestry of the object or looking for the existence of certain verbs or properties on the object. If this fails, then we have to print an informative message as to why it couldn't be done.
This makes for highly customizable command interfaces, but leads to a great amount of duplicate code.
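The two steps look roughly like this in every verb that takes an object argument. The sketch is Python rather than MOO code, the Thing class and its sharp flag are invented for the example, and the messages are copied from the transcript above without any attempt at singular/plural agreement.

```python
class Thing:
    """Hypothetical stand-in for a MOO object in the user's scope."""

    def __init__(self, name, sharp=False):
        self.name = name
        self.sharp = sharp


def cut_steak(iobjstr, scope):
    # Step 1: match the string the user typed to an object in scope.
    tool = next((obj for obj in scope if obj.name == iobjstr), None)
    if tool is None:
        return f'I see no "{iobjstr}" here.'
    # Step 2: check that the matched object is usable for this verb.
    if not tool.sharp:
        return f"The {tool.name} do not seem sharp enough to cut the steak."
    return f"You cut the steak with the {tool.name}."
```

Every other verb that takes a tool, a container, or a recipient ends up repeating the same match-then-validate pattern with slightly different messages.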

Another problem with server-style parsing is that you are locked into the three-argument model of direct-object, preposition, and indirect-object. This makes commands that do not lend themselves well to VE settings difficult to program.

@recreate object as $thing named newobject
In this case, we will have to split the indirect-object portion of the command that we receive on named, and then match one to an object reference. Plus, we have to handle all possible errors during this parse/matching process ourselves.
let Munchkin in here for 3 days
Here, the server runs into a problem with trying to parse. Does it split on in or for, and then, how does it match the resultant strings to objects which could contain a verb named let?
Notice also that these two commands have more than three arguments. This means that at least one of the verb's arguments needs to be any, and that we have to parse that string and match its parts ourselves.
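For the @recreate example, the server would hand the verb everything after as in a single indirect-object string, and the verb has to take it apart by hand. A minimal Python sketch of that extra step follows; the function name and the usage message are hypothetical.

```python
def split_recreate_iobj(iobjstr):
    """Split '<parent> named <name>' into its two parts by hand."""
    parent, sep, name = iobjstr.partition(" named ")
    parent, name = parent.strip(), name.strip()
    if not sep or not parent or not name:
        # The verb has to report its own parse errors.
        raise ValueError("Usage: @recreate <object> as <parent> named <name>")
    return parent, name
```

Calling split_recreate_iobj("$thing named newobject") yields the two strings "$thing" and "newobject"; the first still has to be matched to an object reference afterwards.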

Objectives

In creating a command parser in the database, we set out to accomplish the following objectives:
  1. Increasing the number of arguments a command could take.
    The limit of three arguments that the server imposes is not enough for most applications. Originally, our design had a limit of at most nine arguments per command, but the current implementation allows for any number of arguments.
  2. Greater robustness in object matching.
    Matching only on the user's location and contents and other objects within the immediate scope does not allow enough flexibility in the kind of matching that can be performed. We decided that, where possible, object matching would be deferred to the user, and that methods on the user would perform matching for us, thereby opening up a whole new realm of custom matching stacks.
  3. Expandability.
    Having the command parser hardwired into the server software makes upgrades and bug fixes hard to do and difficult to maintain. Ideally, the parser should be implemented in the database as much as possible and allow easy integration of new features.
  4. Reducing the amount of duplicate code.
    We noticed that much of the code that handled commands was spent parsing the arguments and matching objects, and that the part of the code that actually did the work was small by comparison. By abstracting the argument parsing and matching process away from the code that actually performs the command, a standard in argument matching and error handling/reporting can be established. This has the potential to provide a much more consistent interface from both the end user's and the programmer's point of view.
  5. Something else I can't remember (where are my notes?)
    I don't remember what this point was, but I'm damn sure it was a good one. In fact, if I remember correctly, I had very strong convictions for it, as did everyone else. I'm sure I'll remember it one day. The problem is, the harder I think about it, the less likely I am to remember it. And of course, I'll be driving down the road one day, and it will just come back to me, but as usual, I won't have a pencil, or anything to write with, or anything to write on, and the idea will be lost forever.

Solutions

There are a few routes that can be taken in designing new features or a new parser. The two I will explore here are allowing the database to call the server parser and not using the server parser at all.

Extending Server Parsing

In order to get maximum flexibility when parsing, it makes sense to extend the server parser to defer some of its work to the database. One of the ideas for doing this is to have the server start parsing the entered string and try calling all verbs that match. The verb is then able to examine the arguments closely and determine if the entered string actually makes sense in terms of what this verb is going to handle. If it does, then it performs the action. If not, the verb notifies the server that it cannot handle the entered command, and the server will then continue searching for another verb to try.
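The try-each-verb idea amounts to a simple loop. In this Python sketch the candidate verbs are plain functions that return True when they handled the command and False when they decline; all names are hypothetical and nothing here reflects actual server code.

```python
def cut_with_knife(command):
    """Accept only the command this verb was written for."""
    return command == "cut steak with knife"


def cut_anything(command):
    """A more general fallback verb."""
    return command.startswith("cut ")


def dispatch(command, candidates):
    """Call each matching verb in turn until one accepts the command."""
    for verb in candidates:
        if verb(command):
            return verb.__name__   # this verb handled the command
    return None                    # every verb declined
```

The order of the candidate list matters: a general verb placed first would swallow commands that a more specific verb later in the list was meant to handle.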

This has the advantage that each verb is written to handle only the tasks it was meant to handle. Since the parsing of the command happens in-line, the code that parses the entered string is closely related to the code that performs the task. The main disadvantage is the great amount of duplicate code that must be copied or written to handle the parsing and matching of the arguments in each and every method. Also, this relies heavily on the operation of the server parser and does not lend itself well to future expansion.

Another method, and the one we chose to explore, is to implement a new parser completely in the database. Thanks to the existence of the $do_command() hook, which is called before the server has a chance to parse the entered command, a different style of parsing can be implemented which overrides the server's parsing.

Writing a command parser into the database in the MOO language has proven to have a number of advantages. We are not constrained by any of the server's parsing limits: any number of arguments can be supported, we can match objects however we want, the code is dynamically changeable so that in the future new arguments can be added or the entire parsing process changed, and the amount of duplicate code is kept to a minimum.

The main disadvantage of doing all parsing in the MOO language is the limits that the MOO server places on running tasks. Parsing tasks can take a relatively long time to execute, possibly exhausting the allowed resources (the number of operations that can be performed, or ticks, and the number of seconds allowed to run). Of course, we can suspend the parser if necessary, but then we run into the problem of commands possibly being executed out of order.

We chose the route that would offer the greatest amount of flexibility and expandability and decided to implement a complete command parser directly in the MOO language.

An Implementation

The in-database parser we wrote spans five objects and is over 2300 lines of code. On our development MOO, E_MOO, under regular use and with the virtual environment in a typical state (i.e., between 5 and 20 objects in the user's scope), a full match cycle takes about 10 thousand ticks on average. Most commands are found and matched in under 8500 ticks. The actual speed of the parse process depends on the number of objects in the user's scope, the number of commands on each of those objects, and the number of arguments that end up being matched. Your mileage may vary.

We felt it was necessary to make a distinction between the parts of the verb that are used in matching and the parts of the verb that are executed when a match succeeds. As such, we separated the arguments, permissions, and other non-code data from the code. We call the verb name, the arguments, the permissions, et al. the command, and restrict the name method to mean only the code portion. We like to think that the command and the method put together comprise a verb. We left the code portion up to the server to handle, and only worry about the command part when trying to match an entered string. This also allows us to plug in a new command handling front end without having to worry about converting the code to work with it.

We've also taken to distinguishing between command methods and utility methods. Command methods are called as the result of a command match by the parser. Utility methods, which by default have the verb arguments this none this, are used for behind-the-scenes work, just as this none this verbs are used in normal database operation.

Commands are defined internally as a LIST. In order to create a high level of abstraction, $cmds was created. $cmds defines a series of accessor methods that treat command manipulation the same way the server treats verb manipulation. $cmds:add_command and $cmds:set_command_args, for example, work exactly like their verb counterparts, but operate on commands.

The command structure is composed of the following elements:

Users/Programmers should not try to manipulate these structures directly, but instead should use the methods on $cmds to add, delete, and set arguments/permissions of commands.

The nitty-gritty of the parser's operation. $cmds, $arguments, $parser, $do_command(), $call_method()
Short-circuiting.
The parse loop: search for commands, match static arguments, match variable arguments, call method

Problems and Misstarts


ThwartedEfforts -- Archwizard of E_MOO -- abakun@scinc.com
Dec94, Mar95, Jun95, Oct95, Dec95, Jan96, Apr96; Last update Jul96
Copyright © 1994, 1995, 1996 by Andrew Bakun