

The Categoriser

Interpreting an interaction form rests on the assumption that its fields can be categorised and that the set of possible categories is not only finite but also manageably small. Such categories could be, for example, "name", "address", "city", "e-mail", and so on. The purpose of this module is to assign a category to each field. It approaches this task in two steps.

In the first step, we make use of the information the Parser extracted. From this we can make a first guess as to which category a field belongs to when it is considered in isolation from the others. Consider, for example, a text field that does not display the characters the user enters; we can assume with fairly high probability that this field expects a password. We can also count on a programmer's sense of style to give the fields descriptive names. In other words, a field expecting a name might indeed be called "name". All these assumptions hold only to a certain degree.

This is a very delicate step, as fields on different pages resemble each other only to a small and varying extent, even if they serve the same purpose. For example, Figure figure_auto_snapshots shows three different fields for keyword searches. Not only does the wording of the labels vary, their position is not fixed either. As for their form, a label could be a picture, plain text, a button, or any other element that the designer of the web page considers expressive enough to make the content of a field clear. All this complicates the assignment of an initial probability to the isolated fields.
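The first step can be illustrated with a minimal sketch. It is not the module's actual implementation: the `Field` record, the `KEYWORDS` table, and the weight given to the password input type are all hypothetical choices made for this example. The sketch only shows how evidence from a field's type, name, and label could be combined into a normalised first guess.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical record of what the Parser extracted for one field."""
    name: str          # the field's name attribute
    input_type: str    # e.g. "text" or "password"
    label: str = ""    # nearby label text, if any was found


# Assumed keyword lists; a real system would need far richer evidence.
KEYWORDS = {
    "name": ("name",),
    "e-mail": ("mail",),
    "city": ("city", "town"),
    "password": ("password", "passwd", "pwd"),
    "keyword": ("search", "keyword", "query", "find"),
}


def first_guess(field: Field) -> dict[str, float]:
    """Return a normalised score per category for one isolated field."""
    text = f"{field.name} {field.label}".lower()
    scores: dict[str, float] = {}
    for category, words in KEYWORDS.items():
        hits = sum(1 for word in words if word in text)
        if hits:
            scores[category] = float(hits)
    # A masked text field is strong evidence for a password (assumed weight).
    if field.input_type == "password":
        scores["password"] = scores.get("password", 0.0) + 3.0
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total else {}
```

For a masked field with an uninformative name, `first_guess(Field("pw", "password"))` yields `{"password": 1.0}`, while a field that offers no usable evidence yields an empty dictionary and has to wait for the second step.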

In the second step, we consult the data already in the Database. This rests on the assumption that user dialogs resemble one another. Take, for instance, order forms, whose number is increasing as e-commerce spreads. Every company requires a basic set of information to be able to deliver: the name, the address, and perhaps the type of shipment and the type of payment. (If an order form is recognised as such, it is, of course, advisable to employ a special procedure; otherwise you might either wonder about the charges to your account before receiving loads of books and goodies, or the web service itself is flooded with fake orders.) Other frequently used combinations are the login mask, consisting of a login name and a password, and the simple interface for searching a site for a keyword.

After probable categories have been assigned to the isolated fields in the first step, the form has to be compared to the already classified forms in the Database. Once the most similar one has been found, the two forms can be expected to be of the same type. This comparison considers the whole form, not only the separate fields. Since the two forms serve the same purpose, they presumably consist of fields belonging to the same categories. The new form can therefore adapt the categories of its fields to the pattern provided by the already categorised Database entry that was calculated to have the same meaning. Here again, the initial interpretation plays a role: it is used to find the most suitable field for each category.
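The comparison and transfer described above might be sketched as follows. The text does not say how similarity is calculated, so the Jaccard overlap used here is an assumption, as are the function names and the data layout: `guesses` maps each field name to its first-step scores, and `database` maps a form identifier to the list of categories of an already classified form. The sketch further assumes a non-empty Database and at most one field per category in each stored form.

```python
def similarity(guesses, known_categories):
    """Jaccard overlap between the best first guesses and a stored form."""
    guessed = {max(g, key=g.get) for g in guesses.values() if g}
    known = set(known_categories)
    union = guessed | known
    return len(guessed & known) / len(union) if union else 0.0


def transfer(guesses, database):
    """Find the most similar stored form and adopt its categories."""
    best = max(database, key=lambda form: similarity(guesses, database[form]))
    # Rank every (field, category) pair by the initial score, strongest first,
    # so that the first interpretation decides which field suits a category.
    pairs = sorted(
        ((guesses[f].get(c, 0.0), f, c) for f in guesses for c in database[best]),
        key=lambda pair: -pair[0],
    )
    assignment, used = {}, set()
    for _score, field, category in pairs:
        if field not in assignment and category not in used:
            assignment[field] = category
            used.add(category)
    return best, assignment
```

With `database = {"login": ["name", "password"], "search": ["keyword"]}` and `guesses = {"f1": {}, "pw": {"password": 1.0}}`, the login form is the closest match, and the field `f1`, for which the first step found no evidence, receives the remaining category "name".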

Hidden fields have to be handled in a special way. On the one hand, such a field could contain information that matters only to the server the query is directed at. Hidden fields of this type cannot be categorised; even a human might have difficulty interpreting their meaning. On the other hand, a hidden field might indeed have its own category. This can be the case on a specialised web site that uses a server able to handle more general queries. For instance, a web site dedicated to a single city could use a service that is available for many cities. The name of the city would then be encoded in a hidden field. Another case that requires assigning categories to hidden fields is that of forms on pages that were themselves generated dynamically. A more complex interaction might propagate some of the information the user has already given in a previous dialog, so that these values do not have to be entered again. For example, the dialog could ask for the user's name on one page and for the address on the next. Of course, the server needs to know both the name and the address at the same time, but this fact is concealed from the user by means of hidden fields.
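For the propagation case, one conceivable heuristic is to compare the values of hidden fields with the values the user gave in an earlier dialog. The following sketch is only an illustration of that idea and is not taken from the module; the function name and both arguments are hypothetical. `hidden` maps hidden field names to their values, and `earlier_answers` maps categories to the values already entered.

```python
def categorise_hidden(hidden, earlier_answers):
    """Map each hidden field to a category, or to None if it stays opaque."""
    # Invert the earlier answers so that a value leads back to its category.
    by_value = {value: category for category, value in earlier_answers.items()}
    return {name: by_value.get(value) for name, value in hidden.items()}
```

A hidden field that carries a previously entered name would thus inherit the category "name", whereas a field holding a server-specific token matches nothing and remains uncategorised.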

Finding out what the fields are actually for, as done in this module, is the basis for assigning values to them. This is done in the subsequent module.


Andreas Aschenbrenner