

Value Selection

After the fields of a newly encountered interaction form have been categorised, values have to be assigned to them. These values can be selected from those already stored in the Database. Again, the distance between the new form and the existing entries in the Database is taken into account: for fields of the same category, values are preferably transferred from the more similar forms.

As described in Section 5.2.5, the possible types of a field have to be considered. When transferring values to a field, it has to be taken into account that only a limited number of values may be assignable to it. Likewise, when adopting values from a field, these values have to be weighted appropriately.

Consider a check-box of the category "street", which asks whether or not the user lives in a particular street. Its only possible values are `1' and `0'. Hence, we cannot transfer to it the value of a field of a different type, as that value might contain the name of a street explicitly. Conversely, simply adopting one of those two values for a text-field of the same category is certainly not advisable.
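The selection rule described above can be illustrated with a minimal sketch. It is not the prototype's actual code: the names `StoredField` and `rank_values`, the weight 1/(1 + distance), and the strict same-type filter are all assumptions made for illustration. In particular, the text only requires that values from other field types be weighted appropriately; excluding them entirely is the simplest such weighting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredField:
    """A field value stored in the Database (hypothetical layout)."""
    category: str         # e.g. "street"
    field_type: str       # e.g. "text" or "checkbox"
    value: str
    form_distance: float  # distance of the stored form to the new form


def rank_values(category, field_type, stored, allowed=None):
    """Return candidate values for a new field, best candidate first.

    Only values of the same category and the same field type are used
    (an assumed simplification). If the new field accepts only a fixed
    set of values, pass that set as `allowed`.
    """
    scores = {}
    for f in stored:
        if f.category != category or f.field_type != field_type:
            continue
        if allowed is not None and f.value not in allowed:
            continue
        # Assumed weighting: values from more similar forms count more.
        weight = 1.0 / (1.0 + f.form_distance)
        scores[f.value] = scores.get(f.value, 0.0) + weight
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda v: (-scores[v], v))


database = [
    StoredField("street", "text", "Main Street 5", 0.2),
    StoredField("street", "text", "Ringstrasse 1", 0.9),
    StoredField("street", "checkbox", "1", 0.1),
    StoredField("city", "text", "Vienna", 0.2),
]

text_candidates = rank_values("street", "text", database)
checkbox_candidates = rank_values("street", "checkbox", database,
                                  allowed={"0", "1"})
```

With this sample data, the text-field receives only street names, ordered by the similarity of the form they came from, while the check-box receives only `1' or `0'.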

Preliminary results produced by the prototype are promising: 13 of the 20 training examples produced answers as good as hoped for. Interestingly, 5 of the remaining 7 samples are keyword searches that could not find any matching documents. The other 2 cases are domain registrations, which expect highly specific queries. Even though the categoriser recognised them as such, the correct query could not be compiled, because one requires the domain name to follow a pattern like domain-name.at, whereas the other expects domain-name only and does not allow a dot in the string. Both cases are likely to be covered once the Database is extended. Combined with repeating requests for service using variations of the query and extracting the best result, as described in Section 5.2.6, even better results can be expected.

It is important to note that the 7 samples which produced unsatisfactory results were not exactly the same as the 7 forms that were not categorised completely correctly in the previous step. This is due to the restrictions on possible values mentioned above, and also to relaxed expectations of the server. This is only natural: not every server will, for example, check whether a street name actually exists in the given city, and so it would not notice if the value in the query were in fact a URL rather than a street name. On the other hand, some values that suit a correctly assigned category are nevertheless inappropriate for the specific service. Making the categories more fine-grained and adding more values to the Database will alleviate this problem. A value that produces a good result in one query does not necessarily produce an equally good result in a different request for service; therefore, various values always have to be tried in order to extract the best possible result.
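Trying several values and keeping the best result could look roughly like the following sketch. The function `best_query` and the callables `submit` and `score` are hypothetical placeholders: how a request is actually sent and how a response is judged is not specified here, so both are passed in by the caller. The fake server below merely mimics the domain-registration case in which no dot is allowed in the name.

```python
from itertools import product


def best_query(candidates, submit, score):
    """Try every combination of candidate values and keep the best one.

    candidates: dict mapping field name -> list of candidate values
    submit:     callable taking a query dict, returning a response
    score:      callable taking a response, returning a number
    Returns a (score, query) tuple for the best-scoring query.
    """
    names = list(candidates)
    best = None
    for combo in product(*(candidates[n] for n in names)):
        query = dict(zip(names, combo))
        result_score = score(submit(query))
        if best is None or result_score > best[0]:
            best = (result_score, query)
    return best


# Stand-in for a real service: rejects any domain containing a dot.
def fake_submit(query):
    return "error" if "." in query["domain"] else "ok"


def fake_score(response):
    return 1.0 if response == "ok" else 0.0


result = best_query({"domain": ["example.at", "example"]},
                    fake_submit, fake_score)
```

Note that the number of requests grows with the product of the candidate lists, so in practice the lists would have to be kept short, for instance by taking only the highest-ranked values per field.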

One problem that should be addressed in the future is the highly specific searches for keywords. An approach that tailors the query to the specific web page appears to be more effective. To achieve this, words that appear to be important on the page containing the form could be used as keywords. Such words could be contained in its title or be emphasised in some other way. The effort put into this will pay off, as a fairly high percentage of interactions are, in fact, keyword searches.
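As a rough illustration of this proposal, the following sketch collects words from the title and from emphasised elements of a page, using Python's standard `html.parser` module. The set of tags treated as emphasis, the minimum word length, and the names `KeywordCollector` and `page_keywords` are assumptions chosen for the example, not part of the prototype.

```python
from html.parser import HTMLParser

# Assumed set of tags whose content counts as "important".
EMPHASIS_TAGS = {"title", "h1", "h2", "h3", "b", "strong", "em", "i"}


class KeywordCollector(HTMLParser):
    """Collects words found inside the title or emphasised elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # number of currently open emphasis tags
        self.words = []  # keywords in order of first appearance

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in EMPHASIS_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            return
        for word in data.split():
            word = word.strip(".,;:!?()\"'").lower()
            # Assumed filter: skip very short words and duplicates.
            if len(word) > 3 and word not in self.words:
                self.words.append(word)


def page_keywords(html):
    """Return candidate keywords for the page the form is on."""
    collector = KeywordCollector()
    collector.feed(html)
    collector.close()
    return collector.words


page = ("<html><head><title>Vienna Hotel Search</title></head>"
        "<body><p>Find a <b>cheap room</b> for the night.</p>"
        "</body></html>")
keywords = page_keywords(page)
```

For the sample page, the words from the title and the bold phrase are returned, while the unemphasised running text is ignored. A more careful version would also drop stop words and weight the title more heavily than inline emphasis.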


Andreas Aschenbrenner