Getting the Brownfield clean, but not green – Part II

Initial situation

In my previous blog post, I introduced an architecture which could be the backbone of a historically grown application. Certain design issues, called smells, were introduced which could prevent a new developer in the project from its main task, which is adding certain extra functionality to the existing application. Certain smells, that contradict an evolvable architecture, need to be eliminated. This process of improving the existing code before applying new functionality is called Refactoring, which is the subject of today’s blog post.
A good definition for an entity in this blog can be found in Domain Driven Design by Eric Evans. In short, an entity can be identified by its key and is an independent object that usually has a real life representation, as this is the case for movable objects or storage bins and even activities as they are a business document.
A historically grown application which tracks movements of objects through storage bins could consist of the following three entities:

  • Movable object
  • Storage bin
  • Movement activity

The main task of this example application is to keep track of the current storage bin of the moveable objects and their movement activities in the past. So here is the simplified class diagram:

Sample Legacy Application

Tell, don’t ask

ZCL_MOVEMENT_ACTIVITY expects a movable object as well as a destination to work properly. This data is passed as a flat structure via the constructor and the method +SET_DESTINATION(IS_DESTINATION : Z_STORAGE_BIN) . The intention is probably to provide some data sources in order to allow ZCL_MOVEMENT_ACTIVITY make some assumptions about the data it got.
This is not a good practice.

Providing structures as method input is a common anti-pattern in ABAP development. The habit certainly comes from programming with function modules, where your only opportunity to hand over complex data structures was to provide flat or nested structures or even internal tables.
In ABAP OO, you don’t need to do it. Think of structures differently: Objects are better structures since the can include basically the same data that can be included also by structures, with the difference of adding behavior to them. This behavior could also be restricted, which means that certain fields in the structure may not be accessible. This behavior could be validation which is processed before a certain field in the structure is populated.
The appropriate class in the example would be ZCL_STORAGE_BIN. As you can see, ZCL_STORAGE_BIN defines the attribute MS_STORAGE_BIN. In addition you can later add certain instance methods like IS_BLOCKED( ) or CHECK_DEST4MOVEABLE_OBJ(IO_MOVEABLE_OBJCT: ZCL_MOVEABLE_OBJECT) in order to support quantity checks or checks, if the storage bin may serve as a destination for an object’s movement. The method’s signature wouldn’t even need to change.
So instead of interpreting the data you got in your method, you are now telling the passed object to return some data to you which is needed for processing. Since the only data which is accessible for you can be accessed by the object’s methods (besides public attributes, never ever do that!), data encapsulation is far better in comparison to use simple structures, which cannot transfer any logic at all. A well written class is a pleasure for the developers who need to work with it. However, a badly written class may still expose to much information to it’s callers, but this is another smell.
Solution: The best approach would be to completely replace the signature of +SET_DESTINATION(IS_DESTINATION : Z_STORAGE_BIN)
With
+SET_DESTINATION(IO_DESTINATION: ZCL_STORAGE_BIN)
Inside the method, call appropriate Get-Methods of the passed object, or the common Is-Methods like IS_BLOCKED( ).
If this is not possible, introduce another, new method which is subject to future developments. The old method is marked as OBSOLETE in the comments or method’s description or both. Please do not forget to also include a hint of where the new method is located since one of the most annoying issues for a developer is to use an obviously deprecated method, which lacks the link to its newer version.
If the structure Z_STORAGE_BIN contains the storage bin key, you could consider another solution. The old method +SET_DESTINATION(IS_DESTINATION : Z_STORAGE_BIN) will remain as it is. But internally, you call a factory or repository to retrieve the storage bin’s object representation by its key attributes.

Storage Bin Repository

The class diagram currently gives you no hint about instance creation control patterns. You could consider to implement the repository as a Singleton, but this would usually prevent IoC Containers to create the repository for you as the contructor is usually made protected in these patterns. IoC Containers may bring you huge advantages when it comes to unit tests but this topic deserves its own section.

Don’t repeat yourself

Any logic hat has a natural relevance to the entity where it is related to should need to be implemented closely to the specific class which implements this entity.
This also applies to a status check on the movable object, wether it is blocked for movement activities or not. Don’t try to make this decision outside of the class ZCL_MOVEABLE_OBJECT, based on a STATUS-field in Z_MOVEABLE_OBJECT which may be retrieved by IV_MOVEABLE OBJECT in the constructor of ZCL_MOVEMENT_ACTIVITY.
Even if the status attribute only contains two possible values, “B” for “Blocked” and “R” for “Ready”, don’t do that. Playing with fire usually leads to burned fingers.
Instead, try to get an instance of type ZCL_MOVEABLE_OBJECT as fast as possible. This could happen by a method parameter or, again, a repository lookup. Implement the interpretation upon the status where it belongs to (ZCL_MOVEABLE_OBJECT).
To come back to the topic of this section: Interpreting the object’s status outside of the object’s main class usually leads to other code fragments where this interpretation is done again as the main class lacks appropriate methods. This way, you are starting to repeat yourself.

Single responsibility principle

Take a look at the three classes again. Is there anything they have in common?
You might recognize the SAVE() Method at the end of each methods list.
So what are the classes are intended to do?

  • ZCL_MOVEABLE_OBJECT: carry moveable objects data and read it from DB and store it somewhere
  • ZCL_MOVEMENT_ACTIVITY: perform movements and store these movements somewhere
  • ZCL_STORAGE_BIN: carry storage bin data and store it somewhere

Found the smell?

Entity objects should not take care about how they are stored somewhere, for example in the database.
This is not because we like to switch from one database to another in the near future (despite the fact that ABAP offers you very good Open SQL support).
Instead, coupling entities like ZCL_MOVEABLE_OBJECT to the way they are stored usually leads to another issue: Caching requirements will likely be also implemented in ZCL_MOVEABLE_OBJECT.
This is where ZCL_MOVEABLE_OBJECT starts to grow and to serve more than just one purpose.
It is better to leave this task to repositories, as mentioned before. Repositories control the way, how entity objects are created or stored to the database. Hence, our SAVE-implementations should move to their appropriate repositories.

Storage Bin Repository with SAVE-method

Testability

Repositories manage the access to entities. Usually, they are seldom referenced by entities, since the classes that implement these entities should expect other entity instances wherever possible. Sometimes you cannot get an instance directly, for example if the method’s signature needs to stay the same and expects only the ID of the referenced entity from its callers.
Take a look at +SET_DESTINATION(IS_DESTINATION : Z_STORAGE_BIN) of ZCL_MOVEMENT_ACTIVITY again. In case, the key of the storage bin is already included in the structure Z_STORAGE_BIN and you have a repository for this entity, try to ask the repository for an instance inside the method.
However, the repository needs to be registered in the activity instance before. This could happen either via constructor injection or via a RESOLVE_DEPENDENCIES( ) – method which can be called in the constructor.
This method may acquire the instance of the storage bin repository by either calling the constructor, a factory or a static method of ZCL_STORAGE_BIN_REPOSITORY which implements the singleton pattern. If you share only one instance of ZCL_STORAGE_BIN_REPOSITORY in your application you may also utilize caching of objects on instance level.
In order to unit test ZCL_MOVEMENT_ACTIVITY you will need to have sample storage bins as well as sample movable objects. Creating these storage bins and moveable objects will likely cause you to setup database table entries for each test which is quite a big effort.
Instead, it is usually a far better idea, to describe repositories by specific interfaces.
Storage Bin Repository with interface

This way you can access the repository by its interface, not caring about any implementation details in your client’s classes. The implementation behind this interface can be easily mocked either by using a mocking framework or manually.
On the other side, IoC Containers, may serve as service locators in a RESOLVE_DEPENDENCIES( )-method and replace the productive implementation of a repository by a mocked repository in a unit test. The mocked repository will eventually always returns the same instance of a storage bin, no matter what is really stored in the database.
Warning: using IoC Containers as pure service locators is quite dangerous as you would start to call the container all over the place. Instead it is better idea to declare every dependency via dependency injection (CONSTRUCTOR- or SETTER-based) in the entity. This dependency will be satisfied by the entity’s repository, which in turn, declares its own dependencies, for example other repositories. The goal of IoC Containers is to create the backend objects like repositories and satisfy their dependencies automatically for you. In a clean design there is no need to create entities with the help of IoC Containers, because there are already repositories for this task, but this would certainly be a topic for a future blog post.

Result

In the end, if all client calls can be adjusted, the resulting UML diagram could look like this. Of course such a result would be the optimal and most optimistic scenario, but in most of the cases you can come close to this result by utilizing RESOLVE_DEPENDENCY( ) methods as described in the previous sections, without changing the old API at all. This solution wouldn’t be clean, but much better than the current design and certainly more robust.
The goal is not, to change the design of a working application without specific purpose. But in case enhancements would have to be done, it is often better to improve the design where you see obstacles in implementing the desired functionality in the first step.
Afterwards, the enhancement is implemented even faster and more robust, with less testing effort and side effects. New developers understand more easily the task of each class, at the cost of a higher learning curve at the beginning in order to get the whole picture of entity classes, repositories and cross-cutting concerns like IoC Containers or mocking frameworks.
Eventually, unit testing of certain components becomes easier. These unit tests contribute to both software quality as well as even a bit of documentation as you are testing an object in unit tests the way it is intended to be used (respectively not used).
Sample Legacy Application after Refactoring

Advertisements

Getting the Brownfield clean, but not green – Part I

Initial situation

Imagine a legacy application which has grown over the past few years. It has evolved for generations, maintained by several generations of developers. While the first generation of developers had a clear vision of what the application should do and what the design should be, this vision was lost somewhere in between. The application cannot be understood easily and specific documentation would be needed to understand implementation details. As developers often run out of time, this documentation is not existent at all.
As a new developer in an existing application, you often face this situation. If the application’s structure is not clear you will have a hard time to figure out what the core concepts are and how to make enhancements to the existing code.
A historically grown application which tracks movements of objects through storage bins could consist of the following three entities:

  • Movable object
  • Storage bin
  • Movement activity

The main task of this example application is to keep track of the current storage bin of the moveable objects and their movement activities in the past. So here is the simplified class diagram:
Sample Legacy Application
This class diagram looks quite handy at the beginning. In the next chapters I’m going to show what went wrong and what makes future enhancement unnecessarily complicated.

Tell, don’t ask

ZCL_MOVEMENT_ACTIVITY expects a movable object as well as a destination to work properly. This data is passed as an ID of the movable object via the constructor (which eventually leads to a flat structure Z_MOVEABLE_OBJECT) and the method +SET_DESTINATION(IS_DESTINATION : Z_STORAGE_BIN) . The intention is probably to provide some data sources in order to allow ZCL_MOVEMENT_ACTIVITY make some assumptions about the data it got.
This is not a good practice.
Example: If there was an object status which indicates of the object is blocked for any movements, the activity class needs to read the status from the structure Z_MOVEABLE_OBJECT. Also, it needs to make a decision based on this status, if the object is blocked for movements or not.
If there was another class, which would need to list all blocked moveable objects, it would need to perform this interpretation on the status field again und you would probably need to copy the status interpretation logic into some other method. This also violates the “Don’t repeat yourself” principle.
This way, you make assumptions on foreign objects. Instead of telling, you ask for something and interpret the object’s data somewhere outside of the object.

Don’t repeat yourself

As mentioned before, don’t repeat yourself. This also applies to the logic which needs to evaluate the structure Z_STORAGE_BIN. Again, you are getting inveigled to make some assumptions about another object. Whenever you want to perform the assumption and an interpretation about that again, the fastest way to do it is to copy it, which is a clear smell.

Example: If the storage bin is completely reserved with other movements, is something that needs to be retrieved from the data passed in Z_MOVEABLE_OBJECT, either directly or indirectly.
If there is no central interpretation logic for this, the logic will need to be implemented wherever it is needed – you are starting to repeat yourself.

Single responsibility principle

Take a look at the three classes again. Is there anything they have in common?
You might recognize the SAVE() Method at the end of each methods list.
So what are the classes are intended to do?
ZCL_MOVEABLE_OBJECT: carry moveable objects data and read it from DB and store it somewhere
ZCL_MOVEMENT_ACTIVITY: perform movements and store these movements somewhere
ZCL_STORAGE_BIN: carry storage bin data and store it somewhere

Found the smell?
Everytime you need to use an “and” in your class’ description, there could be smell. In fact, in all of these three classes, there is more than one responsibility for each of them: Besides their core competency, they are obviously also caring about the persistence of their data, in most of the cases this is where some update modules are called or open SQL statements are called directly.

Inappropriate Intimacy

Take a look at ZCL_MOVEMENT_ACTIVITY. What would need to be done if the class was interested in the current storage bin of the moveable object passed to it’s constructor via it’s ID? In case this information was included in the structure Z_MOVEABLE_OBJECT, which can be loaded using the method LOAD_DATA in ZCL_MOVEABLE_OBJECT, we would be fine.
But more often, these data structures directly inherit from database tables, and often, the bin status report would have its own database table.

This means, as a lack of appropriate access functionality in the current class diagram, the developer would certainly read the data directly from the database table. The developer would make assumptions about the implementation of the method +SET_CURRENT_STORAGE_BIN(IS_STORAGE_BIN : Z_STORAGE_BIN) from outside of the core class ZCL_MOVEABLE_OBJECT. This is a clear sign of inappropriate intimacy. ZCL_MOVEMENT_ACTIVITY would be dependent of implementation details of ZCL_MOVEABLE_OBJECT.
Often this smell regularly leads to another smell, “Don’t repeat yourself”, since this retrieval logic is often just copied wherever needed.

Testability

At some point of time you may decide to unit test ZCL_MOVEMENT_ACTIVITY. You figure out, that it expects an object ID in the constructor which likely leads to a call to ZCL_MOVEABLE_OBJECT=>LOAD_DATA(…) in order to retrieve the data which is needed to get the moveable object.
If ZCL_MOVEABLE_OBJECT=>LOAD_DATA(…) is not called, there needs to be database SELECT for the moveable object which is even worse.
In any case, just to pass a moveable object to the test, you need to prepare the database in the SETUP routine and clean it up in the TEARDOWN routine. This is hard work compared to the benefit you expect from a unit test.
Uli once told me, a good architecture is quite often an architecture which can be easily tested and I think this guy is right.

Conclusion

You have seen three classes and at least five different smells. The list of smells is not a complete list, there could be more, but this would go beyond the scope of this blog post. In real life applications you easily have several hundreds of classes. Every smell decreases reusability of your application’s components, makes it more and more inflexible and in the end very hard to maintain. This is where bugs may arise quite easy.
So what should we do? Throwing the application away is not an option. There will be no green field where we can start from scratch. Our options are refactoring and introducing unit tests which also serve as documentation for the behavior of specific functionalities to some extent. We need to get the brownfield clean, but not green. But this would be the subject of my next blog post.