Fourth Workshop on Data Analysis (WDA2003)

June 13-15, 2003
Ruzomberok, Slovak Republic



Majekova Chata mountain hotel, Ruzomberok
Following the great success of the previous three Austrian-Slovak Student Workshops on Data Analysis (WDA) in Kosice (2000, 2002) and Budapest (2001), this year's workshop took place in Ruzomberok, Slovak Republic, from June 13 to 15, 2003. It attracted students from four universities and featured a fascinating program covering various aspects of data analysis, particularly text and audio mining.

The opening talk was given by Stanislav Krajci from the University of Kosice, who presented two models for the fuzzification of conceptual lattices to identify meaningful clusters in data. Starting from a crisp version of the conceptual analysis algorithm, he presented two fuzzifications, leading to a fuzzy modification of Rice & Siff's clustering algorithm. The main program of the workshop began with a session on text classification and ontologies, followed by a session on the unsupervised analysis of text documents using various extensions to the self-organizing map (SOM) algorithm. The third session was devoted to the analysis of audio data.

The workshop also featured a special break-out session exploring the limits of data analysis in general, and in particular the limits of text mining and the World Wide Web, based on two literary works: The Library of Babel by Jorge Luis Borges, and On the Impossibility of Drawing a Map of the Empire on a Scale of 1 to 1, a short story by Umberto Eco, Professor of Semiotics at the University of Bologna. An excursion to the UNESCO World Heritage Site of Vlkolinec rounded off the workshop.

We would like to thank all participants for their active participation during the meeting. Special thanks also go to the Austrian-Slovak Exchange Program (ÖAD), which supported this workshop under project number 41s6.

Some images documenting this great event are provided below the final program. I am sure all participants enjoyed this event and, apart from having lots of fun, also learned a lot and came back with many fascinating new ideas.


  • Stanislav Krajci: Two Fuzzifications of a Conceptual Lattice (invited lecture)
  • Session 1
    • David Celjuska: Engineering of Ontology from Text and Associated Rules
    • Martin Blicha: Pre-processing of Text Documents in Slovak Language with Linguistic Approach
    • Martin Sarnovsky: Document Classification using BOW library
  • Session 2
    • Peter Butka: Clustering of Textual Documents: Modifications of GHSOM Algorithm
    • Katja Schmidt: SOMLib - Information Retrieval in Digital Libraries
    • Roland Schaffer: Analysis of the Behavior of GHSOM v1.5 Trained With Mathematical Documents
  • Session 3
    • Milan Hudec: Synthesizer for the Blind
    • Florian Fankhauser: Comparison of MARSYAS and SOMeJB for Genre-oriented Clustering of Music
    • Thomas Lidy: Marsyas and Rhythm Patterns: Evaluation of Two Music Genre Classification Systems

