Java代写:COMP612 Refactoring Recommender

使用多种方法,重构所给的代码。

refactoring

Project Background

Developers have to constantly improve the internal structural quality of software systems to satisfy evolving requirements for several reasons. For this purpose, refactoring has been one of the most common activities applied by developers during software maintenance and evolution. Examples of common refactoring types include

  1. Restructuring or moving class members, such as Extract Method, Move Method and Pull Up Method, and
  2. Creating new elements, such as Extract Superclass and Extract Interface

Regardless of the refactoring type, developers need to know when and how they should refactor the source code. They need to know what code elements (packages, classes, methods, etc.) need to be refactored and which refactoring operations should be applied to them. To this end, developers can monitor indicators of poor structural quality in source code to identify opportunities for refactoring, such as code smells.

A code smell (or smell for short) is a bad structure (pattern of code) that suggests there might be a deeper problem in the system. More specifically, the presence of a smell in the source code indicates that a better way of writing the code exists or that more design perhaps should go into it. Consequently, they are a good indicator of refactoring opportunities. Developers can use the presence of a smell as a guide for when and how to refactor the code.

The goal of the group project is to implement a tool that identifies code smells (Complex Class and Feature Envy) in a software system and proposes the refactoring operations to remove them.

Refactoring Recommender: A Heuristic-based Tool

There are different approaches to implement a refactoring recommender tool. The one that your team will need to implement is a heuristic-based tool. It recommends refactoring operations to remove a code smell based on the combination of different heuristics. A heuristic is a methodology to solve a problem by trial-and-error taking into account “educated guesses” before making a decision. The benefit of using heuristic algorithms is because they provide a ‘good enough’ solution (i.e., a suboptimal solution) in a reasonable amount of time; it is not the best solution but it is an approximation.

For the group project, you will need to implement at least two heuristics, which together should propose refactoring operations to remove instances of Complex Class and Feature Envy. The following sections present the heuristics in the context of every smell.

Algorithm 1: Refactoring Complex Class via Extract Method

Complex Class occurs when at least one method has a high complexity. A feasible heuristic to remove complex classes work on reducing the complexity of methods that have a high cyclomatic complexity (CC). For example, it can reduce the method’s complexity by grouping code fragments that together contribute to its complexity, and move them to a new method in the same class scope. The Extract Method is a suitable refactoring type to reduce the cyclomatic complexity and thus removing the complex class smell.

The Algorithm 1 illustrates the steps to propose extract methods that can remove a complex class. The algorithm relies on an heuristic to find the code fragments for extraction.

The first algorithm is composed of four parts. In the first one, it retrieves all the classes in the system and finds those that correspond to a complex class smell (lines 1 - 2). In the second part, it relies on the cyclomatic complexity to identify the methods that are contributing to the complexity in every complex class in the system (lines 3 - 6).

The third part is the core of the algorithm: it should find the code fragment that contributes to the method’s complexity and, consequently, should be extracted. To find these fragments, the algorithm relies on a heuristic method that finds these “extractable” code fragments (line 7).

Heuristic 1 returns a list of code fragments that can be extracted (see details in Heuristics). Notice that your team will need to decide how to couple this heuristic into algorithm 1. Will the algorithm try all suggestions for extraction until you find the best one? Or will it stop as soon as the smell is removed? There is no wrong strategy to couple the heuristic into the algorithm as long as the algorithm returns the refactoring operations to remove the smell. However, there are some strategies that can make the heuristic more accurate.

After finding the code fragment (or fragments) for extraction, the algorithm simply recommends the extract method operations. For this purpose, it collects information about the extraction to suggest the refactoring: the name of the new method that will receive the extracted fragment, its parameters and return type, and the fragment of the original code (lines 8 - 12).

Your team can choose the strategy to suggest a name, the parameters and returning type for the newly-extracted method. For example, you can implement any strategy to suggest a name as simple as suggesting “method 1” or relying on another heuristic or NLP (Natural Language Processing) to find a name based on the extracted fragment.

The algorithm prints the refactoring details to extract the new method (line 13). Thus, the process can repeat with a new method or with the same one in case it still has a high complexity.

  • Input: Source code of a Java project
  • Output: Details about the refactoring operation to remove the complex class. For example, the name of the new methods, list of parameters, starting line and ending line of the original code that should be extracted. Any details necessary to manually apply the refactoring operations.

Algorithm 1.1 (second implementation)

The previous algorithm relies on the cyclomatic complexity as an indicator of the Complex Class. There is another and better implementation that your team can choose. It relies on the detection of Brain method smell. Brain Methods are methods that centralize the intelligence of a class, leading to the class’s high complexity. Michele Lanza and Radu Marinescu proposed the following detection strategy for Brain Method:

This detection strategy uses four metrics:

  • LOC - Lines of Code - to measure the number of lines of code of a method
  • CYCLO - McCabe’s Cyclomatic Number - to measure the complexity of a method
  • MAXNESTING - Maximum Nesting Level - to measure the maximum nesting depth of a method
  • NOAV - Number of Accessed Variables - to measure the number of accessed variables

This detection strategy uses two types of thresholds:

  • MAXNESTING and NOAV use Generally-Accepted Meaning Thresholds. Several is defined between 2 and 5. Many is the short term memory capacity, so it’s 7 or 8.
  • LOC and CYCLO use a Statistics-Based Threshold. For these types of thresholds, a large number of projects needs to be analyzed. The authors of Object-Oriented Metrics in Practice analyzed 45 Java projects and extracted Low, Average, High and Very High thresholds for some basic metrics. The High threshold for LOC for a class is 130. The High threshold for CYCLO for a method is 3.1.

Algorithm 1.1 shows the steps to propose extract methods that can remove a complex class by removing brain methods. Notice the algorithm is almost the same as Algorithm 1, however it relies on a more restricted and strong detection strategy. While Algorithm 1 uses cyclomatic complexity (CC > 10), Algorithm 1.1 uses other metrics in addition to cyclomatic complexity, such as lines of code (LOC), and maximum nesting depth of a method.

Algorithm 2: Refactoring Feature Envy via Extract Method and Move Method

Feature envy indicates a method that is more interested in the data of other classes than the one it is actually in. It occurs when a method has more calls to another particular class than the one it belongs to. This might be a sign that the infected method was misplaced and that it should be moved to another class. Therefore, a common pattern of refactoring operations to remove Feature Envy is composed of Extract Method and Move Method or just a Move Method in case the whole method needs to be moved.

The Algorithm 2 illustrates the extract method and move method refactorings to remove a feature envy. The algorithm is composed of four parts: (i) identification of method lines that are more interested in different classes; (ii) extraction of these lines into a new method; (iii) identification of the class that suits better the newly-created method; and (iv) provide the suggestion of a Move Method refactoring to move the new method to the identified target class.

First, the algorithm retrieves all the classes in the system that contain a feature envy (lines 1 - 2). For the group project, you can run the smell detector and return all the methods (and their respective classes) that contain feature envy (line 2). Next, the heuristic iterates over all methods with the smell (line 3) while the method still has the smell (line 5).

In the second part, the algorithm extracts the supposed “envy lines” into a new method. For this purpose, Algorithm 2 also relies on a heuristic (see details in Heuristics) to identify possible line fragments for extraction (line 6).

By having these possible fragments, the algorithm can search in the candidate of target classes (line 7) the one that is the destination (target) of the extracted fragment. Finding the target class also relies on a heuristic (see details in Heuristics). After finding the target classes, the algorithm can simulate both refactoring operations: extract method (line 9) and move method (line 10). This simulation is necessary because this pair of refactorings might not be enough to remove the code smell - the old method can still have code fragments that envious other classes.

The algorithm verifies if the smell still exists in the current class (line 11). If so, it updates the current method with the simulation of the extraction (line 12). Otherwise, the algorithm stops extractions in the current method (line 14). Finally, the algorithm prints the refactoring operations to remove the feature envy (line 17).

  • Input: Source code of a Java project
  • Output: Details about the refactoring operation to remove the complex class. For example, the name of the new methods, list of parameters, starting line and ending line of the original code that should be extracted, target class or classes that should receive the method. Any details necessary to manually apply the refactoring operations.

Heuristics

The algorithms to suggest refactoring operations to remove Complex Class and Feature Envie smells rely on at least two heuristics. One heuristic to identify the lines of code that collaborate for providing functionality and another one to find the target to move a method. Other heuristics may need to be applied, for example,

  • A. An heuristic to suggest the name of the extracted method;
  • B. An heuristic to find the parameters for the extracted method and its return type
  • C. An heuristic to fix the method chain when moving a method across multiple targets (Customer RentalTapeMovie )

Your team decides how to implement and couple these heuristics with Algorithms 1 and 2. Feel free to alter the algorithms and all heuristics in any way that will help you to remove the smells. The two main heuristics are explained below.

Heuristic 1: Extracting Method Opportunities

There are several approaches to identify code fragments that collaborate for providing functionality to a method. The extraction of some code fragments can reduce the size of the initial method, and consequently, reduce its complexity and increase its cohesion. For the group project, your team should implement or adapt the SEMI (SRP-based Extract Method Identification) approach.

The SEMI approach was proposed by Charalampidou et al.6. Their heuristic aims to identify functionally coherent lines, i.e., lines of code that cooperate in providing specific functionality to the system, and consequently can be extracted together. For this purpose, the heuristic is divided into two parts:

Part 1: Identification of Candidate Extract Method Opportunities

The first part of the heuristic is based on computing the cohesion among the statements (i.e., code lines). It is the same concept behind the cohesion among methods of a class: two methods are coherent if they access the same attribute. In this heuristic, two statements are coherent if they access the same variables.

For every variable in the method’s body, the heuristic creates a matrix with all the variables (also method calls per statement) accessible by the statement. This matrix is used to create statement clusters. A cluster is a group of all the successive statements that access at least one common variable or call the same method. The clusters are created based on the successive distance between statements. The distance starts with 1 (step = 1) and increments until it covers the whole method body (step = method size). Next, it merges two clusters if there is an overlap between their statements. After all possible extract method opportunities have been identified, it removes the duplicate and the invalid clusters. Notice that it identifies all possible sets of successive statements that are coherent to each other (regardless of their size)7.

Part 2: Extract Method Opportunity Ranking

The first part of the heuristic returns a list of all statement clusters that are candidates for extraction. The next part is to find which statement cluster in the list should be the heuristic’s output, i.e., the code fragment that will be suggested for extraction first in Algorithm 1 and 2. For this purpose, the heuristic ranks these clusters according to their level of cohesion.

As part of the ranking process, it also groups statement clusters that are heavily overlapping and are of similar size. The assumption is that they probably offer the same functionality, so they can be grouped as the primary cluster and the alternative cluster(s). At the end of this part, the heuristic will return the ranked list of the statement clusters, i.e., the list of code fragments suggested for extraction:

  • Group 1:
    • Primary cluster (i.e., code fragment): A
    • Alternative cluster: B, C, D, E
  • Group 2:
    • Primary cluster: H
    • Alternative clusters: F, G, I
  • Group N:
    • Primary cluster: Y
    • Alternative clusters: W, X, Z

Notice that Algorithm 1 and 2 expect a suggestion of a code fragment for extraction, while Heuristic 1 returns a list of suggestions (ranked and grouped code fragments). For the group project, the primary clusters in every group are the most interesting ones since they are the most cohesive code fragments. The alternative cluster can also be tested for extraction.

Heuristic 2: Finding the Target class

Your team is in charge of implementing the heuristic to find the class that the envy method is more interested in. There are different strategies to find such a target class, all of them try to find the class to which the (newly-extracted) method relates the most.

You can take as your advantage two facts:

  • A. The moving method envies the target class, i.e., it makes more method calls to or access the target class.
  • B. Lines of code in an extracted fragment are cohesive and can be moved together to a different class.

A possible heuristic can first

  1. Sort the target classes based on the method calls or access to the target class.
  2. Simulate moving the method into the target class
    • a. Calculate the LCOM for the target and source classes before moving the method
    • b. Simulate moving the method
    • c. Recalculate the LCOM for the target and source classes after moving the method
    • d. Check if the LCOM value is reduced for all involved classes. We expect an improvement in the classes’ cohesion when solving the smell.
  3. Repeat step 2 with other target classes until finding the class that will receive the method.

Smell Detector

Both Heuristics 1 and 2 require information about the lines to extract and candidates of target classes. Your team can get this information with the SmellDetector project. For example, suppose we run the smell detector in the RefactoringExample project, which has a feature envy associated with the (Customer) statement method. The smell detector would return three instances associated with the classes Rental, Tape and Movie as illustrated below.

Notice that the output indicates the method that contains the smell (customer), where the method starts and ends (12 - 56), the possible candidates of target classes (Rental, Tape and Movie) and their respective method access (9, 3 and 3, respectively). Feel free to modify the SmellDetector project to collect any other information necessary to implement the algorithms and heuristics.

Implementation, Demo and Evaluation

You will need to adapt the SmellDetector project to implement these heuristics. Feel free to change/adapt the algorithms and their respective heuristics in any way you want. They are just suggestions of heuristics to guide your implementation. Notice that you do not need to apply the refactoring operations. However, you still need to manipulate the AST, for example, to find the shared variables necessary for Heuristic 1.

Your team will need to schedule a ~1h30min demonstration with the instructor in week 14 and 15 (04-17 - 04-28). During the demo, the instructor will:

  1. Clone the team’s repository8 in his machine;
  2. Run the application with two source code examples
    • a. ComplexClass project: An example of Complex Class provided by the instructor during the demo. Your tool should suggest the extract method to reduce the method’s complexity.
    • b. RefactoringExample project. Your tool should suggest a similar set of refactoring operations to remove the feature envy from Customer’s statement method by moving the extracted fragments to Movie.
  3. The team will explain and answer questions about the implementation.

The algorithms and heuristics will be evaluated based on their accuracy and robustness. Your team grade will be based on the implementation of the heuristics/algorithms and the demo compared to the remaining teams.

Good luck