at.tuwien.ifs.somtoolbox.data
Class SOMLibClassInformation

java.lang.Object
  extended by at.tuwien.ifs.somtoolbox.data.SOMLibClassInformation
Direct Known Subclasses:
ESOMClassInformation

public class SOMLibClassInformation
extends java.lang.Object

This class provides information about class labels for the InputData input vectors.

The file format consists of a header and the content as follows:

$TYPE string, mandatory. Fixed to class_information.
$NUM_CLASSES integer, mandatory: gives the number of classes.
$CLASS_NAMES mandatory: a space-separated list of class names; the count has to be the same as in $NUM_CLASSES.
$XDIM integer, mandatory: number of units in x-direction. Fixed to 2.
$YDIM integer, mandatory: dimensionality of the class information vector; equals the number of input vectors (InputData.numVectors()).
labelName_n classIndex_n: $YDIM mappings from the input vector label name to the class label index [0...($NUM_CLASSES-1)].

See also an example file from the Iris data set.
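
The header format described above can be illustrated with a small, self-contained sketch. It is not the toolbox's own reader: the class HeaderFormatSketch, the sample label names and the assumption that a data line separates label and index by whitespace are all hypothetical, chosen only to show how the header fields relate to the data lines.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the SOMToolbox API.
class HeaderFormatSketch {
    // Illustrative content only; the label and class names are made up.
    static final String SAMPLE = String.join("\n",
            "$TYPE class_information",
            "$NUM_CLASSES 2",
            "$CLASS_NAMES setosa versicolor",
            "$XDIM 2",
            "$YDIM 3",
            "flower1 0",
            "flower2 1",
            "flower3 0");

    /** Maps each input vector label to its class name and checks the header counts. */
    static Map<String, String> parse(String contents) {
        String[] classNames = null;
        int numClasses = -1;
        int yDim = -1;
        Map<String, String> labelToClass = new LinkedHashMap<>();
        for (String rawLine : contents.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Assumption: fields on a line are separated by whitespace.
            String[] parts = line.split("\\s+");
            if (line.startsWith("$")) {
                switch (parts[0]) {
                    case "$NUM_CLASSES":
                        numClasses = Integer.parseInt(parts[1]);
                        break;
                    case "$CLASS_NAMES":
                        classNames = Arrays.copyOfRange(parts, 1, parts.length);
                        break;
                    case "$YDIM":
                        yDim = Integer.parseInt(parts[1]);
                        break;
                    default:
                        break; // $TYPE and $XDIM carry fixed values and are ignored here
                }
            } else {
                if (classNames == null) {
                    throw new IllegalArgumentException("data line before $CLASS_NAMES");
                }
                // An index outside [0, $NUM_CLASSES-1] fails with an array index error.
                int classIndex = Integer.parseInt(parts[1]);
                labelToClass.put(parts[0], classNames[classIndex]);
            }
        }
        if (classNames == null || classNames.length != numClasses) {
            throw new IllegalArgumentException("$CLASS_NAMES count differs from $NUM_CLASSES");
        }
        if (labelToClass.size() != yDim) {
            throw new IllegalArgumentException("number of mappings differs from $YDIM");
        }
        return labelToClass;
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```

The two final checks mirror the constraints stated in the format description: the number of class names must match $NUM_CLASSES, and the number of mappings must match $YDIM.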

Alternatively, the file format can be simpler and contain no file header. In that case, the file is only a list of lines, each with two tab-separated Strings in the form labelName className.
The number of classes and the indices of those classes are computed automatically.
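
A minimal sketch of how class names and indices might be derived from the header-less, tab-separated form. The class TabSeparatedSketch is hypothetical, and assigning indices in order of first appearance is an assumption; the actual class may order the names differently (for example, sorted).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SOMToolbox API.
class TabSeparatedSketch {
    /** Distinct class names, in order of first appearance (the ordering is an assumption). */
    static List<String> classNames(String contents) {
        List<String> names = new ArrayList<>();
        for (String line : contents.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String className = line.split("\t")[1].trim();
            if (!names.contains(className)) {
                names.add(className);
            }
        }
        return names;
    }

    /** Maps each label name to the index of its class within classNames(contents). */
    static Map<String, Integer> classIndices(String contents) {
        List<String> names = classNames(contents);
        Map<String, Integer> indices = new LinkedHashMap<>();
        for (String line : contents.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t");
            indices.put(parts[0].trim(), names.indexOf(parts[1].trim()));
        }
        return indices;
    }

    public static void main(String[] args) {
        String contents = "a\tred\nb\tblue\nc\tred";
        System.out.println(classNames(contents));
        System.out.println(classIndices(contents));
    }
}
```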

Finally, the simplest form of the file has lines with just the class label; each class is then assigned to the input datum whose index is the line number.
The number of classes and the indices of those classes are computed automatically.
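
For the simplest form, the position of a line stands in for the input index. The sketch below is one way this could work, not the toolbox's implementation: SimpleFormatSketch is a hypothetical name, and both the 0-based line numbering and the first-appearance ordering of class indices are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the SOMToolbox API.
class SimpleFormatSketch {
    /**
     * Returns one class index per line; element i belongs to the input with index i.
     * Class indices are assigned in order of first appearance (an assumption).
     */
    static int[] classIndexPerLine(String contents) {
        String[] lines = contents.split("\n");
        List<String> names = new ArrayList<>();
        int[] dataClasses = new int[lines.length];
        for (int i = 0; i < lines.length; i++) {
            String label = lines[i].trim();
            if (!names.contains(label)) {
                names.add(label);
            }
            dataClasses[i] = names.indexOf(label);
        }
        return dataClasses;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(classIndexPerLine("red\nblue\nred")));
    }
}
```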

Version:
$Id: SOMLibClassInformation.java 4241 2011-12-02 19:50:52Z mayer $
Author:
Michael Dittenbach, Thomas Lidy, Rudolf Mayer, Jakob Frank

Nested Class Summary
static interface SOMLibClassInformation.ClassColorChangeListener
           
 
Field Summary
protected  java.lang.String classInformationFileName
          The file name to read from.
private  int[] classMemberCount
          The number of inputs in each class.
private  java.lang.String[] classNames
          The names of the classes.
private  java.util.ArrayList<java.lang.String> classNamesTemp
           
private  java.util.ArrayList<SOMLibClassInformation.ClassColorChangeListener> colorChangeListeners
           
private  int[] dataClasses
          A mapping input index => class index, for fast lookup.
private  java.util.HashMap<java.lang.String,java.lang.Integer> dataHash
          Mapping class name => class index, for fast lookup.
private  java.util.HashMap<java.lang.String,java.lang.String> dataHashTemp
           
private  java.lang.String[] dataNames
           
private  java.util.ArrayList<java.lang.String> dataNamesTemp
           
private static java.util.logging.Logger logger
           
private  int numClasses
          The number of classes.
protected  int numData
          The number of input vectors.
private  java.util.ArrayList<java.awt.Color> paintList
           
 
Constructor Summary
SOMLibClassInformation()
          Constructor intended to be used e.g. when generating data, or when reading a file with the SOMPAKInputData.
SOMLibClassInformation(java.util.Map<java.lang.String,java.lang.String> classAssignment)
           
SOMLibClassInformation(java.lang.String classInformationFileName)
          Creates a new class information object by trying to read the given file in both versions: with a file header (readSOMLibClassInformationFile()) and as a tab-separated file (readTabSepClassInformationFile()).
SOMLibClassInformation(java.lang.String[] classNames, java.lang.String[][] dataName)
          Constructor intended to be used when generating data.
 
Method Summary
 void addClassColorChangeListener(SOMLibClassInformation.ClassColorChangeListener l)
           
 void addItem(java.lang.String label, java.lang.String classname)
           
 java.lang.String[] classNames()
          Returns all the distinct class names.
 int[] computeClassDistribution(java.lang.String[] labelNames)
          Computes the percentages of class membership for the given label names.
 java.awt.Color getClassColor(int index)
          Get the colour for the given class index.
 java.awt.Color[] getClassColors()
          Get all class colours.
 int getClassIndex(java.lang.String vectorLabel)
          Gets the class index for a given input vector label.
 int getClassIndexForInput(java.lang.String vectorName)
           
 java.lang.String getClassName(int index)
          Gets the class label name for a given input vector index.
 java.lang.String getClassName(java.lang.String vectorName)
          Gets the class name of the given vector name.
 java.lang.String[] getClassNames()
          Returns the names of the classes.
 java.lang.String[] getDataNames()
           
 java.lang.String[] getDataNamesInClass(java.lang.String className)
           
 java.lang.String[][] getDataNamesPerClass()
          Returns an array of data names for each class.
 int getNumberOfClassMembers(int classIndex)
          Gets the number of input vectors in the given class.
 java.util.ArrayList<java.awt.Color> getPaintList()
          Get the class colours as list.
 double getPercentageOfClassMembers(int classIndex)
           
 boolean hasClassAssignmentForName(java.lang.String vectorName)
           
private  void initPaintList()
          Initialise a standard paint list.
 boolean loadClassColours(java.io.File file)
          Load colours from an external (non-classinfo) file.
static void main(java.lang.String[] args)
          Method for stand-alone execution to convert a file to the SOMLibClassInformation format.
 int numClasses()
          Gets the number of classes, as read from $NUM_CLASSES, or computed.
static SOMLibClassInformation parse(java.lang.String contents)
          Parses the given contents, which must adhere to the SOMLib format, to a SOMLibClassInformation object.
Unlike the main constructor SOMLibClassInformation(String), which reads from a file, this method receives the contents directly in the given parameter.
 void processItems(boolean sortClassNames)
          Processes any items that were added by the addItem(String, String) method.
private  void readSimple()
           
protected  void readSOMLibClassInformationFile()
          Reads a class information file containing a header and class indices.
private  void readSOMLibClassInformationFile(java.io.BufferedReader br)
           
private  void readTabSepClassInformationFile()
          Reads a class information file containing no header, and tab-separated String entries for the input vector and class labels.
private  void readTabSepClassInformationFile(java.io.BufferedReader br)
           
 void removeClassColorChangeListener(SOMLibClassInformation.ClassColorChangeListener l)
           
 void removeNotPresentElements(SOMLibSparseInputData inputData)
           
 void setClassColor(int index, java.awt.Color color)
          Set the colour for the given class index.
private  void throwClassInfoReadingError(java.lang.String classInformationFileName, java.io.IOException e)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

private static final java.util.logging.Logger logger

classInformationFileName

protected java.lang.String classInformationFileName
The file name to read from.


numClasses

private int numClasses
The number of classes. Either read from the file header, or computed from the number of distinct class names in the tab-separated file.


classNames

private java.lang.String[] classNames
The names of the classes. Either read from the file header, or computed from the distinct class names in the tab-separated file.


classMemberCount

private int[] classMemberCount
The number of inputs in each class.


numData

protected int numData
The number of input vectors. Either read from the file header, or computed from the number of data lines in the tab-separated file.


dataNames

private java.lang.String[] dataNames

dataClasses

private int[] dataClasses
A mapping input index => class index, for fast lookup.


dataHash

private java.util.HashMap<java.lang.String,java.lang.Integer> dataHash
Mapping class name => class index, for fast lookup.


classNamesTemp

private java.util.ArrayList<java.lang.String> classNamesTemp

dataNamesTemp

private java.util.ArrayList<java.lang.String> dataNamesTemp

dataHashTemp

private java.util.HashMap<java.lang.String,java.lang.String> dataHashTemp

paintList

private java.util.ArrayList<java.awt.Color> paintList

colorChangeListeners

private java.util.ArrayList<SOMLibClassInformation.ClassColorChangeListener> colorChangeListeners
Constructor Detail

SOMLibClassInformation

public SOMLibClassInformation()
Constructor intended to be used e.g. when generating data, or when reading a file with the SOMPAKInputData.


SOMLibClassInformation

public SOMLibClassInformation(java.util.Map<java.lang.String,java.lang.String> classAssignment)

SOMLibClassInformation

public SOMLibClassInformation(java.lang.String[] classNames,
                              java.lang.String[][] dataName)
Constructor intended to be used when generating data.


SOMLibClassInformation

public SOMLibClassInformation(java.lang.String classInformationFileName)
                       throws SOMToolboxException
Creates a new class information object by trying to read the given file in both versions: with a file header (readSOMLibClassInformationFile()) and as a tab-separated file (readTabSepClassInformationFile()).

Parameters:
classInformationFileName - The file to read from
Throws:
SOMToolboxException - if there is any error in the file format
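
The constructor is documented as trying more than one file version. As a rough illustration of telling the versions apart, the sketch below inspects the first non-empty line. This is an assumption for illustration only: FormatDetectionSketch and its Format enum are hypothetical, and the actual class may instead attempt one reader and fall back to the other when it fails (the private reader's signature lists ClassInfoHeaderNotFoundException, which hints at that approach).

```java
// Hypothetical helper, not part of the SOMToolbox API.
class FormatDetectionSketch {
    enum Format { HEADER, TAB_SEPARATED, SIMPLE }

    /** Guesses the file version from the first non-empty line of the contents. */
    static Format detect(String contents) {
        for (String rawLine : contents.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("$")) {
                return Format.HEADER; // header lines such as $TYPE start with '$'
            }
            // A tab inside the line suggests the labelName/className pair form.
            return line.contains("\t") ? Format.TAB_SEPARATED : Format.SIMPLE;
        }
        throw new IllegalArgumentException("empty class information content");
    }

    public static void main(String[] args) {
        System.out.println(detect("$TYPE class_information\n$NUM_CLASSES 2"));
        System.out.println(detect("a\tred\nb\tblue"));
        System.out.println(detect("red\nblue"));
    }
}
```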
Method Detail

getClassNames

public java.lang.String[] getClassNames()
Returns the names of the classes.


getDataNamesPerClass

public java.lang.String[][] getDataNamesPerClass()
Returns an array of data names for each class.


getDataNames

public java.lang.String[] getDataNames()

throwClassInfoReadingError

private void throwClassInfoReadingError(java.lang.String classInformationFileName,
                                        java.io.IOException e)
                                 throws SOMLibFileFormatException
Throws:
SOMLibFileFormatException

readTabSepClassInformationFile

private void readTabSepClassInformationFile()
                                     throws SOMToolboxException,
                                            java.io.IOException
Reads a class information file containing no header, and tab-separated String entries for the input vector and class labels.

Throws:
SOMToolboxException - if there is any error in the file format
java.io.IOException

readTabSepClassInformationFile

private void readTabSepClassInformationFile(java.io.BufferedReader br)
                                     throws java.io.IOException,
                                            SOMLibFileFormatException
Throws:
java.io.IOException
SOMLibFileFormatException

readSimple

private void readSimple()
                 throws SOMToolboxException,
                        java.io.IOException
Throws:
SOMToolboxException
java.io.IOException

processItems

public void processItems(boolean sortClassNames)
Processes any items that were added by the addItem(String, String) method. This method should be called before accessing the other getter methods of this class.

Parameters:
sortClassNames - indicates whether the class names should be sorted

addItem

public void addItem(java.lang.String label,
                    java.lang.String classname)

readSOMLibClassInformationFile

protected void readSOMLibClassInformationFile()
                                       throws java.io.IOException,
                                              SOMToolboxException
Reads a class information file containing a header and class indices.

Throws:
java.io.IOException
SOMToolboxException

readSOMLibClassInformationFile

private void readSOMLibClassInformationFile(java.io.BufferedReader br)
                                     throws java.io.IOException,
                                            SOMLibFileFormatException,
                                            ClassInfoHeaderNotFoundException
Throws:
java.io.IOException
SOMLibFileFormatException
ClassInfoHeaderNotFoundException

numClasses

public int numClasses()
Gets the number of classes, as read from $NUM_CLASSES, or computed.

Returns:
the number of classes.

classNames

public java.lang.String[] classNames()
Returns all the distinct class names.

Returns:
the class names.

getClassIndex

public int getClassIndex(java.lang.String vectorLabel)
Gets the class index for a given input vector label.

Parameters:
vectorLabel - the name of the vector
Returns:
the index of that label.

getClassName

public java.lang.String getClassName(int index)
Gets the class label name for a given input vector index.

Parameters:
index - index of the input vector.
Returns:
the name of the class.

getClassName

public java.lang.String getClassName(java.lang.String vectorName)
                              throws SOMLibFileFormatException
Gets the class name of the given vector name.

Parameters:
vectorName - the label/name of the input vector.
Returns:
the name of the class.
Throws:
SOMLibFileFormatException - If there is no class information available for the given vector name/label

getClassIndexForInput

public int getClassIndexForInput(java.lang.String vectorName)
                          throws SOMLibFileFormatException
Throws:
SOMLibFileFormatException

hasClassAssignmentForName

public boolean hasClassAssignmentForName(java.lang.String vectorName)

getNumberOfClassMembers

public int getNumberOfClassMembers(int classIndex)
Gets the number of input vectors in the given class.

Parameters:
classIndex - the index of the class.
Returns:
the total number of inputs in that class.

getPercentageOfClassMembers

public double getPercentageOfClassMembers(int classIndex)

getDataNamesInClass

public java.lang.String[] getDataNamesInClass(java.lang.String className)

computeClassDistribution

public int[] computeClassDistribution(java.lang.String[] labelNames)
Computes the percentages of class membership for the given label names.


initPaintList

private void initPaintList()
Initialise a standard paint list.


getPaintList

public java.util.ArrayList<java.awt.Color> getPaintList()
Get the class colours as list.


getClassColors

public java.awt.Color[] getClassColors()
Get all class colours.


getClassColor

public java.awt.Color getClassColor(int index)
Get the colour for the given class index.


setClassColor

public void setClassColor(int index,
                          java.awt.Color color)
Set the colour for the given class index.


addClassColorChangeListener

public void addClassColorChangeListener(SOMLibClassInformation.ClassColorChangeListener l)

removeClassColorChangeListener

public void removeClassColorChangeListener(SOMLibClassInformation.ClassColorChangeListener l)

loadClassColours

public boolean loadClassColours(java.io.File file)
Load colours from an external (non-classinfo) file.


removeNotPresentElements

public void removeNotPresentElements(SOMLibSparseInputData inputData)

main

public static void main(java.lang.String[] args)
                 throws SOMToolboxException,
                        java.io.IOException
Method for stand-alone execution to convert a file to the SOMLibClassInformation format.

Throws:
SOMToolboxException
java.io.IOException

parse

public static SOMLibClassInformation parse(java.lang.String contents)
                                    throws SOMToolboxException
Parses the given contents, which must adhere to the SOMLib format, to a SOMLibClassInformation object.
Unlike the main constructor SOMLibClassInformation(String), which reads from a file, this method receives the contents directly in the given parameter.

Throws:
SOMToolboxException