Next: Conclusion Up: SOMLib: A Distributed Digital Previous: SOMLib - The Architecture

   
SOMLib at Work

From the user's point of view, SOMLib supports both intuitive library browsing interfaces and query processing tools. The topographic mapping provided by a SOM places similar documents close to each other on neighboring nodes. SOMLib is thus a good analogy to what we are used to finding in conventional libraries, where books are sorted by topical area. Furthermore, the implicit hierarchy, which is in fact rather a web of sub-libraries interconnected by reference, allows quick and intuitive access to relevant information: the user manually selects one library section after another, choosing the relevant sub-library at each step. To stay with the conventional analogy, a user interested in neural networks starts from a high-level map, where he finds (by query or simply by looking at the labels of the nodes) a node covering that topic, with several related nodes close by. Opening this node (section), he finds either a list of documents covering that topic, which are available at different libraries (i.e. static referencing), or a list of sections of other libraries containing documents on that topic. These links can be followed until the sections of the first order libraries are reached, where access to the desired documents is provided (node referencing).
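The two kinds of reference described above can be pictured with a small data structure. The following is only a minimal sketch: the class `Section`, its fields, and the function `reachable_documents` are hypothetical names chosen for illustration and are not part of SOMLib. Because the sub-libraries form a web rather than a strict tree, the traversal keeps track of visited sections.

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    """A map node as seen by the user (hypothetical structure)."""
    label: str
    documents: list = field(default_factory=list)  # static references
    sections: list = field(default_factory=list)   # node references

def reachable_documents(section, seen=None):
    """Follow node references and collect every document reached."""
    seen = set() if seen is None else seen
    if id(section) in seen:          # sections may reference each other
        return []
    seen.add(id(section))
    docs = list(section.documents)
    for sub in section.sections:
        docs.extend(reachable_documents(sub, seen))
    return docs

# A department library referenced from a university-level map.
dept = Section("neural networks", documents=["som_intro.txt"])
uni = Section("machine learning", documents=["survey.txt"], sections=[dept])
found = reachable_documents(uni)
```

Manual browsing corresponds to following one entry of `sections` at a time instead of all of them.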

A query to the library system can consist of a sample text, which is parsed using the vector structure of the library system. The resulting query vector is then presented to the library to determine the winning node. In a first order SOM, the documents mapped onto the winning node are returned as query results. The same applies to higher order maps that implement only static referencing. If both referencing schemes are implemented, a higher order map returns the statically referenced documents as first-hit query results, together with the next-order nodes from the node-referencing scheme, i.e. the nodes of the lower order maps that are mapped onto the winning node. In an interactive search the user can then browse through the library 'hierarchy' either manually or by passing the query vector on to the desired lower order maps, after converting it to match the vector structure of each lower order library. For automatic retrieval, all references are followed to retrieve the documents that are mapped onto winning nodes in the lowest order SOMs.
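Query processing on a first order map can be sketched as follows. This is an illustration under stated assumptions, not the SOMLib implementation: the vector structure is assumed to be a list of terms, the parser is assumed to count term occurrences, and the winning node is assumed to be the one with the smallest Euclidean distance to the query vector. The names `text_to_vector` and `winning_node` and the toy data are hypothetical.

```python
import math
from collections import Counter

def text_to_vector(text, vocabulary):
    """Parse a sample text into a vector over the given terms (assumed: raw counts)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]

def winning_node(query, weights):
    """Return the node whose weight vector is closest to the query (assumed: Euclidean)."""
    return min(weights, key=lambda node: math.dist(query, weights[node]))

# Toy first order map: two nodes over a three-term vector structure.
vocabulary = ["neural", "network", "database"]
weights = {(0, 0): [1.0, 1.0, 0.0], (0, 1): [0.0, 0.0, 1.0]}
documents = {(0, 0): ["som_intro.txt"], (0, 1): ["sql_basics.txt"]}

query = text_to_vector("A neural network maps inputs", vocabulary)
hits = documents[winning_node(query, weights)]
```

In a higher order map the same lookup would return the statically referenced documents together with the referenced lower order nodes, to which the converted query vector can then be passed.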

From the library administrator's point of view there are two different situations to be considered. First, there are the first level SOMs, which are trained with the feature vectors created by parsing (a subset of) the existing documents. The resulting maps are relatively small since they only need to represent the documents actually present in the library. New documents can be added to the map by parsing them using the previously extracted vector structure and mapping the resulting feature vectors. As long as the general scope of the library does not change extensively, new documents can be added without destroying the topology-preserving mapping. As new topics emerge, the small first level libraries need to be retrained. The 'old' SOMLib map can either be retained to serve the higher order maps that reference it, or the nodes of the old map can be mapped onto the corresponding nodes in the new SOMLib map. In the latter case, the weight vector of each old node is modified to match the new vector structure and presented to the new map, and the winning node determines where the old node is mapped. If a first order map tends to grow too big, one can choose to split the underlying documents into groups to create separate first order SOMLibs, which are then combined in a higher order map.
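Mapping the nodes of an old map onto a retrained map can be sketched as below. The text does not specify how weight vectors are modified to match the new vector structure, so this sketch assumes that both structures are term lists, that components are matched by term, that terms new to the structure receive a weight of zero, and that terms no longer present are dropped. The function names and data are hypothetical.

```python
import math

def convert(vector, old_terms, new_terms):
    """Re-express a vector in the new structure (assumed: zero for unseen terms)."""
    by_term = dict(zip(old_terms, vector))
    return [by_term.get(term, 0.0) for term in new_terms]

def remap_nodes(old_weights, old_terms, new_weights, new_terms):
    """Map each old node onto the winning node of the new map."""
    mapping = {}
    for old_node, weight in old_weights.items():
        converted = convert(weight, old_terms, new_terms)
        mapping[old_node] = min(
            new_weights, key=lambda n: math.dist(converted, new_weights[n])
        )
    return mapping

old_terms = ["neural", "database"]
new_terms = ["database", "neural", "web"]   # retrained map covers a new topic
old_weights = {"A": [0.9, 0.1], "B": [0.1, 0.8]}
new_weights = {"X": [0.0, 1.0, 0.0], "Y": [1.0, 0.0, 0.0], "Z": [0.0, 0.0, 1.0]}

mapping = remap_nodes(old_weights, old_terms, new_weights, new_terms)
```

Higher order maps that referenced node `"A"` of the old map could then be redirected to `mapping["A"]` in the new one.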

Secondly, there are higher order maps to be administered. These are based on several lower order maps, whose structure vectors are merged to create a new vector structure. The weight vectors of the lower order maps, modified to match this new structure, are then used to create the higher level SOMLib map. In many cases a natural hierarchy will evolve along institutional lines: for example, several university departments will have their own SOMLibs as first order maps, which are then integrated into a single second order map at university level, which in turn may be combined at a national level, and so on. Others may choose to combine first or higher order SOMLibs of institutions covering a certain topic of interest, with the possibility of mutual referencing, to create their personal library system.
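Preparing the input for a higher order map can be sketched as follows. The merge rule is an assumption, since the text only states that the structure vectors are merged: here the merged structure is the ordered union of the term lists, and each lower order weight vector is padded with zeros for terms it does not contain. Training the higher order SOM itself is omitted. All names and values are hypothetical.

```python
def merge_structures(*term_lists):
    """Ordered union of the term lists of several lower order maps (assumed rule)."""
    merged = []
    for terms in term_lists:
        for term in terms:
            if term not in merged:
                merged.append(term)
    return merged

def training_inputs(libraries):
    """Convert every lower order node into an input for the higher order map."""
    merged = merge_structures(*(terms for terms, _ in libraries.values()))
    inputs = []
    for name, (terms, weights) in libraries.items():
        for node, weight in weights.items():
            by_term = dict(zip(terms, weight))
            vector = [by_term.get(term, 0.0) for term in merged]
            inputs.append(((name, node), vector))  # keep the node reference
    return merged, inputs

# Two departmental first order maps, each with one node for brevity.
libraries = {
    "cs": (["neural", "database"], {0: [0.9, 0.1]}),
    "math": (["algebra", "neural"], {0: [0.7, 0.2]}),
}
merged, inputs = training_inputs(libraries)
```

Each input keeps the pair of library name and node, so that the node onto which it is mapped in the higher order map can reference the lower order node.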


Andreas RAUBER
1998-06-02