Hierarchical feature maps

The key idea of hierarchical feature maps as proposed in [8] is to use a hierarchical setup of multiple layers, where each layer consists of a number of independent self-organizing maps. One self-organizing map is used at the first layer of the hierarchy. For every unit in this map, a self-organizing map is added to the next layer of the hierarchy. This principle is repeated for the third and any further layers of the hierarchical feature map. In Figure 2 we provide an example of a hierarchical feature map with three layers. The first-layer map consists of $2 \times 2$ units, thus we find four independent self-organizing maps on the second layer. Since each map on the second layer again consists of $2 \times 2$ units, there are 16 maps on the third layer.


  
Figure 2: Architecture of a three-layer hierarchical feature map
\begin{figure}\begin{center}
\leavevmode
\epsfxsize=40mm %
\epsffile{hfmarch.eps}
\end{center}\end{figure}
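The layered setup described above can be sketched as a simple recursive data structure: each node holds one map of $rows \times cols$ units and, except on the last layer, one child map per unit. This is a minimal illustrative sketch; the class name and parameters are not taken from [8].

```python
class HierarchicalFeatureMap:
    """Sketch of the hierarchy: one SOM per node, one child map per unit."""

    def __init__(self, layers, rows=2, cols=2):
        self.rows, self.cols = rows, cols   # one SOM of rows x cols units
        self.children = []                  # child maps, one per unit
        if layers > 1:
            self.children = [HierarchicalFeatureMap(layers - 1, rows, cols)
                             for _ in range(rows * cols)]

    def count_maps(self):
        """Total number of independent SOMs in the hierarchy."""
        return 1 + sum(c.count_maps() for c in self.children)


# Three layers of 2x2 maps, as in Figure 2: 1 + 4 + 16 = 21 maps in total.
hfm = HierarchicalFeatureMap(layers=3)
print(hfm.count_maps())  # 21
```

Counting the maps reproduces the numbers from Figure 2: one map on the first layer, four on the second, and sixteen on the third.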

The training process of hierarchical feature maps starts with the self-organizing map on the first layer. This map is trained according to the standard training process of self-organizing maps as described above. When this first self-organizing map is stable, i.e. when only minor further adaptations of the weight vectors are recorded, training proceeds with the maps of the second layer. Here, each map is trained with only that portion of the input data that is mapped onto the respective unit in the higher-layer map. In this way, the amount of training data for a particular self-organizing map is reduced on the way down the hierarchy. Additionally, the vectors representing the input patterns may be shortened on the transition from one layer to the next. This shortening is possible because some input vector components can be expected to be equal among those input data that are mapped onto the same unit. These equal components may be omitted when training the next-layer maps without loss of information, because they are already represented by the higher-layer unit.
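One level of this process can be sketched as follows: train a map, partition the data by best-matching unit, and drop the components that are constant within each partition before passing it to the next layer. This is a simplified illustration (the SOM update keeps only the winner adaptation and omits the neighborhood function for brevity); all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def train_som(data, n_units, iters=200, lr=0.5):
    """SOM training reduced to its essentials: adapt only the winning
    unit with a decreasing learning rate (no neighborhood, for brevity)."""
    weights = data[rng.choice(len(data), n_units, replace=False)].copy()
    for t in range(iters):
        x = data[rng.integers(len(data))]
        w = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        weights[w] += lr * (1 - t / iters) * (x - weights[w])
    return weights


def partition(data, weights):
    """Split the input data by best-matching unit of the trained map."""
    winners = np.argmin(((data[:, None] - weights[None]) ** 2).sum(-1), axis=1)
    return [data[winners == u] for u in range(len(weights))]


def shorten(subset, tol=1e-9):
    """Drop components that are (numerically) equal across the subset;
    they are already represented by the higher-layer unit."""
    keep = subset.std(axis=0) > tol
    return subset[:, keep]


# One step down the hierarchy: train the top map, split the data,
# shorten the vectors, and train one second-layer map per unit.
data = rng.normal(size=(200, 5))
top = train_som(data, n_units=4)
subsets = [shorten(s) for s in partition(data, top)]
children = [train_som(s, n_units=4) for s in subsets if len(s) >= 4]
```

Recursing with `children` and the corresponding subsets yields the third layer; each lower-layer map sees both fewer training vectors and, potentially, shorter ones.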

Andreas RAUBER
1998-09-10