Flowchart recognition task 2012

Input | Output | Details | Training data | Test data | Contact

The topics in this third task are patent images representing flow-charts. Participants in this task will be asked to extract the information in these images and return it in a predefined textual format.

The text file will contain, as a rule, structural information. The point is to obtain as much as the information present in the image, in such a way as to be able to process it further for the purposes of patent search. As a rule, you will not be asked to make inferences about the nature of the nodes or edges. The nodes are simply to be represented by the geometrical figure: rectangle, circle, parallelogram, etc. (see below list). The edges are also just plain or dotted, but there is one special type of edge: wiggly which denotes the link between an actual node and its label, as seen in the images below.

Input

Tiff images containing flowcharts.



 back to top

Output

A text file describing the flowchart.
The text file is a sequence of lines, each line prefixed with a mark to identify the information on the line, as follows:

MT - META : refers to meta information about the flow chart.
MT Title : title of the chart, in double quotes - optional
MT NO : number of nodes in the flow chart - required
MT DE : number of directed edges in the flow chart - required
MT UE : number of unidirected edges in the flow chart - required

NO - NODE: the line starting with NO describes a node in the flow chart. Each such line must contain:
NO : identifier of the line describing a flow chart's node. Required
id : integer : identifier of the node in the flow chart. Required
node-type : a keyword describing the type (shape) of the node. Possible values: oval, rectangle, double-rectangle, parallelogram, diamond, circle, point, no-box, cylinder, unknown. Required
text : a string between double quotes : the text appearing in the flow chart's node. Optional

DE - directed edge
UE - unidirected edge
: the line starting with DE or UE describes an edge in the flow chart.
Each such line must contain:
DE | UE : identifier of the line describing a flow chart's edge. Required
start-node : integer : the identifier of the flow chart node where the edge has its starting point, Must occur on a line with the 'NO' identifier. Required
end-node : integer : the identifier of the flow chart node where the edge has its ending point. must occur on a line with the 'NO' identifier. Required
type : a keyword describing the edge. May take one of the following values: plain, dotted, wiggly, unknown.Required
text : a string between double quotes : the text attached to the edge. Optional

CO - COMMENT : denotes a line that contains comments.

Example of output file:

MT Title "Fig.7"
MT NO 10
MT DE 5
MT UE 4
CO ====== Now comes the list of nodes ======
CO === identifier  type  text   ============
NO 1 oval "BEGIN"
NO 2 rectangle "RECEIVE AND DIGITIZE IMAGE"
NO 3 rectangle "DISPLAY IMAGE AND SELECT CHART"
NO 4 rectangle "DESIGNATE APPROXIMATE POSITIONS"
NO 5 rectangle "RECOGNIZE GRAPHICAL OBJECT AND OUTPUT"
NO 6 oval "BEGIN"
NO 7 no-box "80"
NO 8 no-box "82"
NO 9 no-box "84"
NO 10 no-box "86"
CO ========== Here come the edges ===========
CO === start-node end-node type text ========
DE 1 2 plain ""
DE 2 3 plain ""
UE 2 7 wiggly ""
DE 3 4 plain ""
UE 3 8 wiggly ""
DE 4 5 plain ""
UE 4 9 wiggly ""
DE 5 6 plain ""
UE 5 10 wiggly ""
CO ======== THIS IS IT =======================

 back to top

Details

Evaluation

The main evaluation measure will be the graph distance metric based on the mcs, most common subgraph (see Bunke and Shearer, 1998 and Wallis et. al., 2001).

The distance between the topic flowchart Ft and the submitted flowchart Fs is computed as:

            
where |.| denotes the size of the flowchart/graph. In our case, the size of the flowchart is the number of edges plus the number of nodes.

The distance between the topic and the submitted flowcharts will be computed at three levels:

  • basic: only the flowchart structure is taken into consideration, i.e. nodes and edges without type information and without text labels.
  • intermediate: use the flowchart structure together with the node and edge types.
  • complete: flowchart structure, node and edge types, text labels

Wiggly edges

The so-called wiggly edges are the only part of the result which involves a bit of semantics. Wiggly edges connect the nodes of the flowchart with their labels. After intense discussions, we have decided that rather than have you discard them, they should be returned. After all, if you can discard them, it means you have already identified them as being of the wiggly type.

The semantic part comes into play here, because wiggly edges are not always wiggly. In the vast majority of cases, they are, but in some cases they are simple straight lines. In such cases you still have to return them as wiggly, because they are not actually part of the flowchart, but they are annotations. One way to determine that they are annotations is to observe that one of their nodes does not have a frame, or is just a number or short out-of-vocabulary term. This is unfortunately not a rule that will always apply.

Node types

There are 10 types of nodes available.

  • OVAL : can be either an oval in the proper sense, or a rectangle with rounded corners
  • RECTANGLE : is a simple rectangle
  • DOUBLE-RECTANGLE : is a rectangle with its lateral edges doubled
  • PARALLELOGRAM : is a convex quadrilateral with two pairs of parallel sides and angles not equal to 90 degrees (a skewed rectangle)
  • DIAMOND : is a convex quadrilateral with non-parallel sides, OR a rectangle with cut corners (transforming it into a hexago or octogon)
  • CIRCLE : is a simple circle (as opposed to the oval, it has only one diameter)
  • POINT : is a junction of edges which does not have a node. Often used to depict merging flows. Also, a point node will be present if the edge does not have a visual destination or source node.
  • CYLINDER : a depiction of a cylinder (a rectangle with an oval on top or at the bottom), often used to depict data storage.
  • NO-BOX : free text connected to an edge. Often used to depict labels, but sometimes (rarely) other nodes will have no box as well.
  • UNKNOWN : everything else.

As you may observe, the difference between a rectangle, oval and diamond will often be determined by its corners. Many ovals are actually rectangles with rounded corners. Some diamonds will be hexagons (rectangles with cut corners)

Other issues and notes

Text outside nodes
Sometimes, nodes will have some text outside of their boxes. We are not taking into account such text. It will be difficult in general to recognize whether the text refers to the node or to the edge (the later should in principle be returned). This is why the evaluation will not place strong emphasis on edge labels. Again, the structure is the important element here.
Labels not connected with edges
Sometimes, there will be no wiggly edges between labels and their nodes. We consider such cases as the one above (text outside nodes) and therefore ignore such labels
Non-ASCII text
Some nodes contain mathematical formulas or other non-ASCII text. The qrels do their best at representing them with plain-text, but again, this is less important for this task.
Lines inside nodes
Some nodes will have lines dividing them into two distinct rectangles or sections. We take into account the outer form of the node (i.e. disregard internal graphics)
Point nodes
Point nodes are not actually present in the flowchart, but are needed to properly represent the graph. Often, a flow will merge into another flow, and this will be represented as an arrow pointing to another line. The intersection will be a point node. This has the effect that the line pointed to will be broken down into two edges (one on each side of the point node), which may be either directed or not.

Training data

Here you have a set of 50 images containing flowcharts, and here you have the corresponding text files.

 back to top

Test data

The set of test topics contains 100 black and white images. You can download it here.

 back to top

Contact

For questions and anything else regarding this task, please contact Mihai Lupu (lupu at ifs.tuwien.ac.at) or Florina Piroi (piroi at ifs.tuwien.ac.at)

 back to top