Additional documents for the IWoR 2016 publication

Paper

IWoR 2016 paper

Presentation

IWoR 2016 presentation

Data from the 11 student groups (rcft files)

RCFT files of the students

Conceptual structure (AOC-poset) extracted from interfaces and classes of Java 1.8 sequences

Some students introduced existing Java interfaces or abstract classes in their formal contexts.
Here again, this may seem strange, because the initial principle is to extract interfaces from concrete classes. Nevertheless, it is useful to observe where the existing Java interfaces and abstract classes appear in the conceptual structure.
This phenomenon can be observed in the following figure, which extends Fig. 3 of the paper (AOC-poset built on concrete sequence classes). The concept extents show the position of existing interfaces and abstract classes. For example, Concept_interf_10 in the following figure (equivalent to Concept_Coll_9 in Fig. 3) does not reveal a new interface, but shows that the existing Java List interface is a data type rediscovered by the approach. On the contrary, Concept_interf_11 in the following figure (equivalent to Concept_Coll_10 in Fig. 3), which introduces peek, can be interpreted as a new data type (the type of stack-like collections whose top object can be consulted).

[Figure: AOC-poset sequences (classes and interfaces)]

Text of the assignment given to 2nd-year Master's students in Computer Science – University of Montpellier – 2015 (translated)

Application of Formal Concept Analysis for extracting a Java interface hierarchy (Global refactoring)

In this work, you will assess the use of Formal Concept Analysis (FCA) for extracting a Java interface hierarchy.
You will build an interface hierarchy describing the types of a program. The method is inspired by the paper:
R. Godin and H. Mili. Building and Maintaining Analysis-Level Class Hierarchies Using Galois Lattices. Special issue of SIGPLAN Notices – Proceedings of ACM OOPSLA'93, 28(10):394-410, 1993.
In that paper, similar work was carried out on the Smalltalk collection hierarchy. The paper can be found online by searching for its title.

This approach is of interest for the following reasons:

  • Highlighting interfaces clearly separates specification from implementation, and gives a view of the type organization that is less influenced by implementation concerns than the class hierarchy.
  • Interfaces are abstract types that support writing abstract classes and reusing code. From these abstract classes, concrete classes can be derived and source code can be shared. Interfaces may also be used as parameter types in methods, giving these methods a more general scope.
  • In a language such as Java, the interface hierarchy can use multiple inheritance (without introducing conflicts), contrary to the class hierarchy. A better conceptual classification can thus be built, giving opportunities for more general methods in classes that manipulate objects conforming to the interfaces.
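As a small illustration of the last two points, the following Java sketch declares an interface that extends two others, and a method whose parameter is typed by that interface. The names Peekable, Sized, PeekableSized, SimpleStack and describe are hypothetical and exist only for this example; they are not taken from the Java library or from the paper.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical interfaces, named only for this illustration.
interface Peekable<E> { E peek(); }
interface Sized { int size(); }

// An interface may extend several interfaces at once (multiple inheritance of types).
interface PeekableSized<E> extends Peekable<E>, Sized { }

// A concrete class implementing the combined type by delegating to an ArrayDeque.
class SimpleStack<E> implements PeekableSized<E> {
    private final Deque<E> items = new ArrayDeque<>();
    void push(E e) { items.push(e); }
    @Override public E peek() { return items.peek(); }
    @Override public int size() { return items.size(); }
}

class InterfaceDemo {
    // Typed by the interface: accepts any implementation, not only SimpleStack.
    static <E> String describe(PeekableSized<E> c) {
        return c.size() + " element(s), top = " + c.peek();
    }

    public static void main(String[] args) {
        SimpleStack<String> s = new SimpleStack<>();
        s.push("a");
        s.push("b");
        System.out.println(describe(s)); // prints: 2 element(s), top = b
    }
}
```

Because describe depends only on the interface, any other class that implements PeekableSized can be passed to it without changing the method.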

The study will be carried out on collection types (interfaces), computed from the existing Java collection classes. The extracted collection interfaces will be compared with the existing Java interfaces, abstract classes and concrete classes.

You will work in groups of 2 to 4 people on the following tasks, which will be shared within the group:

  • Understanding the paper:
    Marianne Huchard, Hervé Leblanc: Computing Interfaces in Java. ASE 2000: 317-320
    available at: http://www.lirmm.fr/content/download/10162/142571/file/shortAse2000.pdf
    This paper is a little old; it may include elements that are no longer correct for the Java language.
    It does not describe experimental results; this is where you will contribute.
  • Understanding the existing Java collection hierarchy (classes and interfaces)
  • Extracting the data and creating the data file needed by the target tool (RCAexplore), by static analysis (see the courses of A.D. Seriai) or introspection (reflect package, see the courses of M. Huchard)
  • Understanding the RCAexplore tool, applying it to your data, and checking the results
  • Understanding and analyzing the result, by comparing the calculated interface hierarchy with the existing interfaces
  • Writing a synthesis document

Step 1 – Understanding the initial collection hierarchy (classes and interfaces)

Study the collection hierarchy. Build a graphical representation which will help you to understand and discuss your results.

Several sources may help you:

Expected result: Existing collection hierarchy with comments. Indicate how you built this representation. Provide three files that clearly show which elements you consider: concrete classes, abstract classes, interfaces.

Step 2 – Extract data

The data will come from the concrete classes of the collection hierarchy.
You can test other variants, but you have to explain why, and you have to discuss the differences that you observe.

For each class, you should extract:

  • public final static attributes with their name and value (declared locally or inherited)
  • signatures of public methods (declared locally or inherited)
  • you can also test a variant including the constructors (declared locally or inherited)

For each method signature (public and non-static), record:

  • its name
  • its return type
  • its parameter type list
  • its exception list
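The items above can be collected by introspection. The following is a minimal sketch, assuming the standard java.lang.reflect API; the class name SignatureExtractor and the textual layout of each signature are illustrative choices, not a format required by RCAexplore. Skipping compiler-generated bridge methods is also an assumption of this sketch that you may want to revisit.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;

class SignatureExtractor {
    // Returns one string per public, non-static method (declared locally or inherited).
    static SortedSet<String> signatures(Class<?> c) {
        SortedSet<String> result = new TreeSet<>();
        // getMethods() returns public methods, including inherited ones.
        for (Method m : c.getMethods()) {
            if (Modifier.isStatic(m.getModifiers())) continue;
            // Assumption of this sketch: ignore compiler-generated methods.
            if (m.isBridge() || m.isSynthetic()) continue;
            String params = Arrays.stream(m.getParameterTypes())
                    .map(Class::getTypeName)
                    .collect(Collectors.joining(","));
            String exceptions = Arrays.stream(m.getExceptionTypes())
                    .map(Class::getTypeName)
                    .sorted()
                    .collect(Collectors.joining(","));
            // Erased types are used here; generic parameters are not kept.
            result.add(m.getReturnType().getTypeName() + " " + m.getName()
                    + "(" + params + ")"
                    + (exceptions.isEmpty() ? "" : " throws " + exceptions));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the extracted signatures of one concrete collection class.
        signatures(java.util.ArrayList.class).forEach(System.out::println);
    }
}
```

Each returned string can serve as one attribute of the formal context, with the class as the object. Dropping the return type, parameter list or exception list from the string yields the coarser variants discussed in this step.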

You will need to know, and possibly use, a hierarchy of the types that appear as parameter types or return types, in order to apply the part of the course on hierarchical characteristics in FCA where needed. This may apply, for example, when the return types of some methods are specializations of one another.
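One possible reading of this point, given here only as a hedged sketch, is to add, for a method returning a type T, one attribute per supertype of T, so that a class returning a more specific type also owns the attribute built on the more general type. The class TypeHierarchy, the method names and the attribute layout "name():type" are hypothetical; the encoding presented in the course may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class TypeHierarchy {
    // Collects t together with all its superclasses and superinterfaces (transitively).
    static Set<Class<?>> withSupertypes(Class<?> t) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        collect(t, seen);
        return seen;
    }

    private static void collect(Class<?> t, Set<Class<?>> seen) {
        if (t == null || !seen.add(t)) return; // stop at the root or on an already visited type
        collect(t.getSuperclass(), seen);
        for (Class<?> i : t.getInterfaces()) collect(i, seen);
    }

    // Expands one (method name, return type) pair into one attribute per supertype
    // of the return type. The "name():type" layout is an illustrative choice.
    static Set<String> scaledAttributes(String methodName, Class<?> returnType) {
        Set<String> attrs = new LinkedHashSet<>();
        for (Class<?> t : withSupertypes(returnType)) {
            attrs.add(methodName + "():" + t.getName());
        }
        return attrs;
    }

    public static void main(String[] args) {
        // ListIterator specializes Iterator, so both attributes are produced.
        System.out.println(scaledAttributes("listIterator", java.util.ListIterator.class));
    }
}
```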

You have to study whether it is worthwhile to extract the generic parameters when classes are generic, or when methods have their own generic types.

Study what to do with interfaces such as Serializable, which are not in the collection hierarchy but appear as super-interfaces. Propose a solution.

Expected result: Documented program (or process using a static analysis tool) which extracts the data. Several files in the RCAexplore format, corresponding to several variants if you find it interesting to have several (using only concrete classes or not, using a type hierarchy or not, using only method names or also return types, parameter type lists and exceptions). Explain and discuss your choices.

Step 3 – Understanding the RCAexplore tool

You will find the RCAexplore tool (created by X. Dolques), together with its documentation, at:
http://dolques.free.fr/rcaexplore/

The online documentation shows how to launch the tool and use it through its graphical interface; the main commands are given hereafter.
In these command lines, the tool is named rcaexplore.jar.
We will only use part of the tool's functionality, because we will not perform relational analysis.

To practice, you can use the course example, available here:
http://www.lirmm.fr/users/utilisateurs-lirmm/marianne-huchard/enseignement/hmin306

To build a conceptual structure (concept lattice, AOC-poset, Iceberg lattice):
java -jar rcaexplore.jar explogui <file.rcft> <output directory>

This launches the exploratory RCA with the rcft file and a graphical interface.

The exploration may produce the following files:

  • result.xml contains an XML view of the structures. It can be analyzed afterwards
  • The .dot files contain the structures and can be opened with Graphviz
  • The trace.csv file contains the configuration options
  • latticebuilder.sh generates .pdf files (or other formats) from the .dot files, as an alternative way to visualize the conceptual structures

With the tool, you can create (via « choose construction algorithm ») concept lattices (menu item « fca »), AOC-posets (menu item « ares »), sub-structures restricted to object-concepts (menu item « ocposet ») or to attribute-concepts (menu item « acposet »), or Iceberg lattices (menu item « icebergXX »). « icebergXX » corresponds to the concept lattice in which only the bottom concept and the concepts whose extent size is greater than or equal to XX% of the total number of objects are kept.
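To make these structures concrete, here is a minimal sketch of the underlying definitions on a toy context. It is not the algorithm used by RCAexplore: the class TinyFca and its methods are hypothetical, and the toy context only records whether three Java classes offer the methods add, get, size and peek (method names only). The AOC-poset keeps the concepts that introduce at least one object or one attribute; the iceberg test applies the XX% threshold on extent size (the bottom concept, which the tool always keeps, is not handled here).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TinyFca {
    // Formal context: each object is mapped to the set of attributes it owns.
    final Map<String, Set<String>> ctx;

    TinyFca(Map<String, Set<String>> ctx) { this.ctx = ctx; }

    // Attributes shared by all given objects (all attributes if the set is empty).
    Set<String> intent(Set<String> objects) {
        Set<String> res = new TreeSet<>();
        ctx.values().forEach(res::addAll);
        for (String o : objects) res.retainAll(ctx.get(o));
        return res;
    }

    // Objects owning all given attributes.
    Set<String> extent(Set<String> attrs) {
        Set<String> res = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : ctx.entrySet()) {
            if (e.getValue().containsAll(attrs)) res.add(e.getKey());
        }
        return res;
    }

    // Extents of the AOC-poset concepts: object-concepts and attribute-concepts.
    Set<Set<String>> aocExtents() {
        Set<Set<String>> res = new LinkedHashSet<>();
        for (String o : ctx.keySet()) {
            res.add(extent(intent(Collections.singleton(o))));   // concept introducing o
        }
        for (String a : intent(Collections.<String>emptySet())) {
            res.add(extent(Collections.singleton(a)));           // concept introducing a
        }
        return res;
    }

    // Threshold test of an iceberg: extent size >= percent % of the number of objects.
    boolean keptInIceberg(Set<String> extent, int percent) {
        return extent.size() * 100 >= percent * ctx.size();
    }

    // Toy context used only for illustration.
    static Map<String, Set<String>> toyContext() {
        Map<String, Set<String>> c = new TreeMap<>();
        c.put("ArrayList", new TreeSet<>(Arrays.asList("add", "get", "size")));
        c.put("LinkedList", new TreeSet<>(Arrays.asList("add", "get", "size", "peek")));
        c.put("ArrayDeque", new TreeSet<>(Arrays.asList("add", "size", "peek")));
        return c;
    }

    public static void main(String[] args) {
        TinyFca fca = new TinyFca(toyContext());
        for (Set<String> ext : fca.aocExtents()) {
            System.out.println(ext + " -> " + fca.intent(ext)
                    + (fca.keptInIceberg(ext, 50) ? " [kept at 50%]" : ""));
        }
    }
}
```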

Once the algorithm is chosen, you can launch the construction (« auto »).
The tool offers to visualize the results directly (rather than going through the .dot files).
The built structures are in <output directory>.
The files stepi.dot and stepi-j.dot present the structures built at each RCA step. 
You only need to look at step 0.

Expected result: Give the files and present the obtained structures.

Step 4 – Understanding and analyzing the results

For each selected dataset:

  • Compare the calculated interface hierarchy with the existing classes and interfaces.
  • In the AOC-poset, analyze the results systematically. Indicate which concepts you find interesting and why.
  • Try to name the discovered concepts using their position or content. Being able to find a name easily is a good indicator of the interest of a concept.
  • Conversely, analyze the interfaces and abstract classes of the Java collection hierarchy, and indicate whether they correspond to a concept of the AOC-poset or of the concept lattice.
  • Study and discuss the additional concepts that you find in the concept lattice (and not in the AOC-poset).
  • Which patterns or implication rules do you discover in your data? Explain how FCA helps in finding them.
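For the last point, an implication "premise → conclusion" holds in a formal context when every object owning all premise attributes also owns all conclusion attributes. The following sketch checks this definition directly on a toy context; the class Implications and its methods are hypothetical names used only for illustration, and the toy data records method names only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class Implications {
    // True when every object that has all premise attributes also has all conclusion attributes.
    static boolean holds(Map<String, Set<String>> ctx, Set<String> premise, Set<String> conclusion) {
        for (Set<String> attrs : ctx.values()) {
            if (attrs.containsAll(premise) && !attrs.containsAll(conclusion)) {
                return false; // this object is a counterexample
            }
        }
        return true;
    }

    // Small helper to build attribute sets.
    static Set<String> set(String... xs) {
        return new HashSet<>(Arrays.asList(xs));
    }

    public static void main(String[] args) {
        // Toy context used only for illustration.
        Map<String, Set<String>> ctx = new LinkedHashMap<>();
        ctx.put("ArrayList", set("add", "get", "size"));
        ctx.put("LinkedList", set("add", "get", "size", "peek"));
        ctx.put("ArrayDeque", set("add", "size", "peek"));

        System.out.println(holds(ctx, set("get"), set("add")));  // true
        System.out.println(holds(ctx, set("peek"), set("get"))); // false: ArrayDeque has peek but not get
    }
}
```

In the conceptual structure, such an implication between two single attributes can be read directly: it holds when the concept introducing the premise attribute is below (or equal to) the concept introducing the conclusion attribute.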

Expected result: Document detailing your results.

Last updated on 2 September 2016