Next: Results Up: Categorisation Previous: Comparison of New Forms   Contents


Nearest Neighbour

At this stage the new form is interpreted as a whole. Previously, the fields were categorised in isolation from each other; here they are viewed in the context of the other fields of the form. This task is tackled by making use of the implicit knowledge in the Database.

It is assumed that there is a limited number of types of interaction: all search forms on a web site are similar, and all on-line opinion polls resemble each other in some way. By comparing the new form to the existing ones in the Database as described in Section 5.3.2, a measure of distance to each of them can be calculated. With these tools the Nearest Neighbour Algorithm can be applied: the k entries in the Database that are closest, i.e. most similar, to the new form are taken. The key-vector of one of those k samples is then transferred to the new form.
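The selection of the k closest entries can be sketched as follows. This is only an illustration under stated assumptions: the layout of a Database entry (a dictionary with the keys "form" and "key_vector"), the function name k_nearest, and the default k=3 are hypothetical, and the distance function is passed in as a parameter because the actual measure is the one defined in Section 5.3.2. The toy_distance shown here is a stand-in and not that measure.

```python
import heapq


def toy_distance(form_a, form_b):
    # Placeholder only (assumption): the real measure is the one
    # described in Section 5.3.2. Here forms are plain lists of fields
    # and the distance is the difference in the number of fields.
    return abs(len(form_a) - len(form_b))


def k_nearest(new_form, database, distance, k=3):
    """Return the k entries closest to new_form.

    The result is a list of (distance, entry) pairs, ordered from the
    closest entry to the farthest. Ties are broken by position in the
    database (an assumption; the text does not specify tie handling).
    """
    scored = (
        (distance(new_form, entry["form"]), index)
        for index, entry in enumerate(database)
    )
    nearest = heapq.nsmallest(k, scored)
    return [(dist, database[index]) for dist, index in nearest]
```

Which of the k returned samples supplies the key-vector is left to the caller, since the text does not fix that choice at this point.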

The matrix of the new form must then be adapted to the interpretation. Since a completed interpretation is not ambiguous, the transferred key-vector has only integer values. Therefore, for each category that has a value of 1.0 in the key-vector, the field with the highest likelihood for that category is assigned that category. If the key-vector has a value of 2.0 for a category, two fields have to be found, and so on. All remaining fields are assigned the undefined category, which is not part of the key-vector.
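The transfer of the key-vector onto the fields can be sketched as below. The data layout is an assumption made for illustration: the matrix is taken to be a dictionary mapping each field to its per-category likelihoods, and the key-vector a dictionary mapping each category to its count. The text does not say what happens when one field has the highest likelihood for two categories; this sketch simply processes the categories in key-vector order and skips fields that are already assigned, which is one possible choice and not necessarily the one used in the system.

```python
UNDEFINED = "undefined"  # hypothetical label for the undefined category


def assign_categories(matrix, key_vector):
    """Assign one category to every field of the new form.

    matrix:     {field: {category: likelihood}}
    key_vector: {category: count}, with integer-valued counts
    """
    assignment = {}
    for category, count in key_vector.items():
        # Fields that have not yet received a category, best match first.
        free = [field for field in matrix if field not in assignment]
        free.sort(key=lambda f: matrix[f].get(category, 0.0), reverse=True)
        # A count of 1.0 takes the single best field, 2.0 the best two, etc.
        for field in free[:int(count)]:
            assignment[field] = category
    # Every field left over gets the undefined category.
    for field in matrix:
        assignment.setdefault(field, UNDEFINED)
    return assignment
```

For example, with four fields and a key-vector of {"name": 1.0, "email": 2.0}, one field is assigned "name", two are assigned "email", and the fourth falls into the undefined category.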


Andreas Aschenbrenner