Hexagonal Architecture

Vlad Mettler   2014/06/29

Intro

Designing a new system from scratch allows for some creative thinking and a process of discovery. Opening one's mind to ideas considered quite extreme or outlandish can lead to insights that fundamentally change the thinking.
Through a combination of a difficult problem, stimulating company and some google-fu, the following thinking happened early in 2014 to Vlad Mettler, James Gardner and a few other good people.

The Problems

Classic Multi-Tier

We have used this approach for years. We have been taught that it is The True Way of organizing software. In general, the approach consists of dividing the system into a number of layers, roughly forming the following stack:

PRESENTATION
PROCESSING
STORAGE

It certainly is an improvement over a lack of architecture or a monolithic approach. Some level of separation of concerns is achieved. It becomes easier to organize work around specialized areas. Multiple teams can cooperate on the same project without stepping on each other's toes. This does not come free though, as some thinking has to be put into the glue combining the different layers.

Tight coupling and MVC implementations

Many frameworks have been built based on the Model-View-Controller model; most of these frameworks claim, or at some point in time claimed, to be loosely coupled. On closer inspection it turns out that in most cases the three layers are strongly coupled, leak into each other and have disappointingly low cohesion.
A good example is Django, which allows business logic to leak into its templates/views, models and even the controller gluing the model and view together. The model defines the entities and contains most of the actual logic for them. A change to the model requires a chain of changes to the views and other models.[5]
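
A minimal sketch of the kind of leakage described above, using a hypothetical Invoice model (the class, fields and VAT rule are invented for illustration): a pricing rule lives directly on the persistence class, and in a typical Django project the same rule tends to be re-implemented in templates and views.

    from decimal import Decimal

    from django.db import models

    class Invoice(models.Model):
        net_amount = models.DecimalField(max_digits=10, decimal_places=2)
        is_export = models.BooleanField(default=False)

        def total(self):
            # A business rule entangled with the storage layer: export
            # invoices are zero-rated, domestic ones carry 20% VAT.
            # A template showing the VAT breakdown and a view validating
            # orders typically repeat this rule, so a single change to
            # the rule ripples through model, view and template.
            if self.is_export:
                return self.net_amount
            return self.net_amount * Decimal("1.20")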

Lack of business logic separation

Very often it is difficult to separate the part of a program that defines ‘what is done’ from the part responsible for ‘how a thing is done’. Each program is written in order to fulfill more or less precise requirements. This is the reason-to-exist for software; its nature; its core. Without this part of the code, the application loses its usefulness. The logic of a program is what holds the rules, entities and workflows important for the execution of the requirements.
On the other hand we have all the code that fulfills a helper role: database storage, network access, user interface, etc.
Most frameworks mix the logic and non-logic code together. We end up losing the clear picture of what our software is doing. We lose the clear modularization and start asking questions about ‘what to test’ and ‘what does this code do’.

Testing

Proper testing procedures require clearly defined requirements and a strong understanding of how the parts of the tested system work together. Very often testing is treated as an afterthought. Contemporary development methodologies are interpreted from a point of view of ‘delivery at all cost’. This leads to unreliable software, often delivered on time, but requiring lengthy bug-fixing and change request processes. This, coupled with omitting the customer's presence and input, will condemn any project to a slow death.

Lack of clear boundaries

It is notoriously difficult for new programming adepts to identify the boundaries they should be using when testing projects and components. The fundamentally misunderstood notion of a ‘Unit’ as a ‘Class’ or ‘Function’ leads to an admirable but futile exercise which results in testing the implementation instead of the intention. The usefulness of tests is sacrificed on the altar of the God of Coverage, for the sake of metrics presented by mid-management to the upper echelons of the company. Good intentions lead to a nightmare of an ossified codebase, impervious to any refactoring, feature change or extension.

Deep nesting of test doubles

Any testing exercise will require the introduction of test doubles at some level. Any non-deterministic factor has to be replaced with a controllable object in order to test anticipated edge cases. Test doubles are also necessary in order to speed up test execution. This is particularly important in the case of the unit tests written during Test Driven Development. We want to avoid any communication with the external world, which introduces latency and triggers processing on the other end.
Sometimes it is simply impossible to make some calls. Consider the case of a 3rd party REST API. It would not be advisable to hit that API from our tests, as this may incur costs and/or overload the service provider.
In a layered architecture we end up creating stacks of mocked modules containing mocked modules containing mocked methods, and more turtles all the way down. This is not only difficult to understand and visualize but also to realize. Clearly that model is broken, as has been rightfully criticized over and over again [4].
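
A small sketch of that nesting, built with Python's unittest.mock; the OrderService, OrderRepository and PaymentGateway names are hypothetical. Each layer is replaced by a double that hands back further doubles, and the test ends up asserting against the plumbing of the doubles rather than any behaviour.

    from unittest import mock

    # Each layer of the stack becomes a test double containing more doubles.
    gateway = mock.MagicMock(name="PaymentGateway")
    gateway.charge.return_value = {"status": "ok"}

    repository = mock.MagicMock(name="OrderRepository")
    repository.save.return_value = 42

    service = mock.MagicMock(name="OrderService")
    service.gateway = gateway
    service.repository = repository
    service.place_order.return_value = 42

    # The 'test' exercises and then inspects the doubles themselves --
    # the implementation is being tested, not the intention.
    service.place_order({"sku": "ABC-1", "quantity": 1})
    service.place_order.assert_called_once()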

Unit scoping

Lack of clear boundaries and problems with test doubles[6] are often accompanied by difficulties in scoping the size of the part of the system being tested. How wide and how deep does the testing exercise have to hit the system?

Fixture process

In order to test the system we must set it into a desired state. This process is called ‘a fixture’ and is very often realized by means of large files containing data to be pushed into various storage systems. It is these files that are misguidedly called ‘the fixtures’. They are notoriously difficult to support, amend and understand. Good luck to anyone trying to understand what the state of a system is after 15,000 lines of SQL have been magically executed.

Proliferation of test vectors

Unit testing, integration testing, regression testing, acceptance testing, etc. The differences between these processes are often minute, but theoreticians spend a long time discussing them and pinpointing what they consider important from a dogmatic point of view. It would be interesting to see whether some of these processes can be merged, removed or replaced with something a bit easier to understand and execute.

Work sharing/ Team work

The multi-tier pattern helps quite a bit in the organization of work, as it divides the system into a number of layers which can then be used to organize teams. Interestingly, organizations very often end up producing a software architecture that closely matches the structure of the organization itself[7]. This may or may not be a desirable thing.
It would be preferable to disconnect the process of architectural design from the organization and the organization’s capabilities (such as its communication channels, available hardware, personnel, management structure, internal politics, etc.). The design has to be dictated by the requirements.

Branching and versioning

10 teams working on the same codebase will very quickly find themselves in dire need of coordinating their coding efforts. Many tools simplify this process and many patterns help in organizing the use of those tools; Git and Gitflow are a good example and are extremely useful. We must consider the use of these tools and correct versioning patterns when thinking about the architecture and the work surrounding it.

The Thinking

What do we strive for

Testability should not be relegated to a QA process. Testing a piece of software should be central to any programming exercise. The test process should ideally start before the coding and be transparent enough for non-technical team members or stakeholders to understand.
The programming language or paradigms are of secondary importance. It is desirable to be able to choose the language best suited to the particular problem being solved. The ideal architecture should allow combining multiple languages into an extendable system.
The architecture should follow simple repeatable patterns. Ideally we should be able to repeat the same patterns not only when extending the system in a parallel way but also recursively. Isolated components constructed out of isolated components.
The architecture should be flexible enough to be able to describe small and big systems, isolated and running ‘in process’, localized and distributed, in the cloud and on dedicated hardware.
Any construct following the architecture should be easy to monitor and extend.
It would be beneficial if the architecture allowed for the “tell, don’t ask” paradigm. The component requesting a change of the system’s state should know as little as possible about the component executing the change.
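
A minimal sketch of the ‘tell, don’t ask’ style, using a hypothetical Account class: the caller tells the component what to do and the component owns the rule, instead of the caller querying state and deciding on its behalf.

    class Account:
        def __init__(self, balance):
            self._balance = balance

        def withdraw(self, amount):
            # The rule lives with the data; callers never inspect _balance.
            if amount > self._balance:
                raise ValueError("insufficient funds")
            self._balance -= amount

    account = Account(100)
    account.withdraw(30)  # tell the component what we want done
    # ...rather than: if account.balance >= 30: account.balance -= 30  # ask, then decide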

R&D

A new way of approaching the architecture is necessary in order to solve some of the aforementioned problems. One interesting idea is to turn the layered approach into an inside-out approach, where everything that does not pertain to the business logic is outsourced to an external, separated and largely unknown collaborator. This approach was suggested initially by Alistair Cockburn as ‘Hexagonal architecture’[1]. Very similar ideas underlie Jeffrey Palermo’s ‘Onion Architecture’[2].
All the elements of a system built as a series of ‘Hexagons’ or ‘Onions’ are first-class citizens. There is no hierarchy.
Once we have accepted the idea of a lack of hierarchy of components, we can take it a bit further and start deploying the components as a series of separated services. This is not dissimilar to the way the UNIX operating system allows a number of fairly small programs to be joined into a series of actors; each actor in the chain is responsible for only one task, accepts input (standard in) and spits out the result (standard out), which can then be channeled to another actor. This concept is known as ‘Microservices’.[3]

Standing on the shoulders of giants

The solution proposed here has been heavily influenced by a number of publications, both printed and electronic. The authors have merely collected, filtered and reassembled ideas formulated by others. We are ‘standing on the shoulders of giants’, who include but are not limited to:
1. Kent Beck
2. Martin Fowler
3. Alistair Cockburn
4. Robert Martin
5. Steve Freeman
6. Nat Pryce

The Solution

Architectural tenets

  • Components use interfaces to communicate
  • Components are isolated – they have very clear boundaries
  • Components are decoupled – change in one component does not necessitate a change in another component
  • Components are stateless – components need all information to be provided explicitly, without the use of shared contexts
  • Testing is important – components must make it trivial to write a set of test suites guaranteeing correct operation
  • Testing is automated – all testing is executed by machines; the only human-driven testing is exploration

The Hexagon

The components are called hexagons. Each component is built of two main parts (the core and the ports) and needs adapters to communicate with other components and/or services.

The core

The core contains all the logic important from the business point of view. The core is isolated from the outside world. The only way to gain access to the core’s functionality, and for the core to access any other hexagon or service, is by making calls through the interfaces. From the core’s point of view these interfaces constitute the source and destination of the communication; the real sources and targets are hidden and implemented as adapters. This important property can be leveraged during testing, by providing fake adapters.
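
A minimal sketch of a core, assuming hypothetical names (GreetingCore, GreetingStore): the core holds only business logic and talks exclusively to an interface, behind which the real storage, or a fake used in tests, is plugged in as an adapter.

    import abc

    class GreetingStore(abc.ABC):
        """The core's view of 'something that can remember greetings'."""
        @abc.abstractmethod
        def save(self, name, message): ...

    class GreetingCore:
        """All business logic lives here; no database, HTTP or framework imports."""
        def __init__(self, store: GreetingStore):
            self._store = store

        def greet(self, name):
            message = "Hello, %s" % name
            self._store.save(name, message)
            return message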

The port

Separation of the program’s logic from the outside world is achieved by introducing an interface shell around the logic’s core. The port is a formalization of that interface. The ports are logically collected into facets defined by the different types of conversations the core is having with the outside world.
Ports can be divided into two main types:

  • an incoming port exposes the core’s functionality
  • an outgoing port defines the core’s view of the outside world

Traffic moves both ways through the ports. The side initiating the communication determines whether we treat a port as incoming or outgoing.
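
A sketch of the two directions for the same hypothetical greeting hexagon: the incoming port declares what the core offers, the outgoing port declares what the core expects from the outside world.

    import abc

    class GreetingService(abc.ABC):
        """Incoming port: how the outside world drives the hexagon."""
        @abc.abstractmethod
        def greet(self, name): ...

    class GreetingStore(abc.ABC):
        """Outgoing port: the hexagon's view of whatever stores greetings."""
        @abc.abstractmethod
        def save(self, name, message): ...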

The adapter

An adapter is plugged into a port but logically does not constitute a part of the hexagon itself. Its role is to ensure that the communication flowing through the port is understood by both sides.
The adapter's main role is to ensure that the port (interface) exposed by the hexagon can hold a conversation with whatever interface the adapter is plugged into on the other side. The adapter is a means of assuring that two components communicating with each other remain decoupled. Adapters are switchable, provided the replacement adapter implements the original interface.
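
A sketch of two interchangeable adapters for the hypothetical outgoing GreetingStore port: an in-memory one (handy in tests) and one wrapping an SQL connection. Swapping one for the other requires no change to the core, since both implement the same interface.

    import abc

    class GreetingStore(abc.ABC):
        @abc.abstractmethod
        def save(self, name, message): ...

    class InMemoryGreetingStore(GreetingStore):
        """Adapter used in tests or prototypes; keeps greetings in a list."""
        def __init__(self):
            self.rows = []

        def save(self, name, message):
            self.rows.append((name, message))

    class SqlGreetingStore(GreetingStore):
        """Adapter wrapping a DB-API connection (e.g. sqlite3)."""
        def __init__(self, connection):
            self._connection = connection

        def save(self, name, message):
            self._connection.execute(
                "INSERT INTO greetings (name, message) VALUES (?, ?)",
                (name, message),
            )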

The service

This is the name applied to a component used in the hexagonal architecture which is not a hexagon. Databases, file systems and REST interfaces are good examples of services found in a hexagonal ecosystem. All hexagons must be separated from services using the same mechanisms that would be used to isolate two hexagons from each other.

Testing and programming

The existence of hexagons with their interfaces provides us with a very clean definition of boundaries for testing, which solves one of the most notorious problems encountered while testing software.

Two different types of testing are useful in the development of software based on hexagonal architecture: hexagon tests and communication tests.

Hexagon tests

These are the ‘unit tests’ of the hexagonal architecture. The hexagon core is tested by a series of calls to its incoming ports, with all the outgoing ports replaced by test doubles. These test doubles provide the isolation of the unit from the environment it will live in.
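
A sketch of a hexagon test for the hypothetical greeting hexagon: the core is driven through its incoming calls while the outgoing port is satisfied by a fake, so nothing outside the hexagon is touched.

    import unittest

    class FakeGreetingStore:
        """Test double standing in for the outgoing port."""
        def __init__(self):
            self.rows = []

        def save(self, name, message):
            self.rows.append((name, message))

    class GreetingCore:
        def __init__(self, store):
            self._store = store

        def greet(self, name):
            message = "Hello, %s" % name
            self._store.save(name, message)
            return message

    class GreetingCoreTest(unittest.TestCase):
        def test_greeting_is_built_and_stored(self):
            store = FakeGreetingStore()
            core = GreetingCore(store)
            self.assertEqual("Hello, Ada", core.greet("Ada"))
            self.assertEqual([("Ada", "Hello, Ada")], store.rows)

    if __name__ == "__main__":
        unittest.main()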

Communication tests

Communication tests assure that the adapter placed between two ports, or between a port and a service such as a database, operates according to specified parameters. This type of test is a replacement for the ‘integration test’. The scope of this test can be defined as ‘port to port’: cores are ignored and instead calls to the outgoing ports are executed and the calls into the other side’s incoming ports are inspected.
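
A sketch of a communication test for the hypothetical SqlGreetingStore adapter: the core is ignored and the adapter is exercised through the port against a real, in-memory SQLite database, checking that the calls arrive on the other side as expected.

    import sqlite3
    import unittest

    class SqlGreetingStore:
        def __init__(self, connection):
            self._connection = connection

        def save(self, name, message):
            self._connection.execute(
                "INSERT INTO greetings (name, message) VALUES (?, ?)",
                (name, message),
            )

    class SqlGreetingStoreTest(unittest.TestCase):
        def test_adapter_persists_through_the_port(self):
            connection = sqlite3.connect(":memory:")
            connection.execute("CREATE TABLE greetings (name TEXT, message TEXT)")
            SqlGreetingStore(connection).save("Ada", "Hello, Ada")
            rows = connection.execute("SELECT name, message FROM greetings").fetchall()
            self.assertEqual([("Ada", "Hello, Ada")], rows)

    if __name__ == "__main__":
        unittest.main()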

Note how the ports are shared between both testing approaches. This makes it evident how prominent a role carefully constructed interfaces play in hexagonal architecture.

Acceptance and regression tests

Usually, a number of hexagons will be wired together by adapters in order to provide some functionality to the business. That business functionality is the real reason for the software to exist. Testing of the platform should be based on criteria defined by the stakeholders and preferably automated to some degree. These tests are executed on systems providing functional copies of the production systems. This way of working is known as ‘behaviour driven development’.
Before any development starts, criteria for success are defined and expressed in a way that allows automation. Development then consists of two parallel flows:

  • turning the acceptance criteria into a set of automated tests
  • development of code fulfilling these criteria

The end result is a set of tests which can be used for automated regression testing of the system as a whole.
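
A sketch of one acceptance criterion turned into an automated test; the criterion is written in the stakeholders' language and the test drives a wired-up copy of the (hypothetical) greeting system through its real adapters.

    import unittest

    class InMemoryGreetingStore:
        def __init__(self):
            self.rows = []

        def save(self, name, message):
            self.rows.append((name, message))

    class GreetingCore:
        def __init__(self, store):
            self._store = store

        def greet(self, name):
            message = "Hello, %s" % name
            self._store.save(name, message)
            return message

    class RegistrationAcceptanceTest(unittest.TestCase):
        # Criterion: "a visitor who registers receives a personal greeting".
        def test_a_visitor_who_registers_receives_a_greeting(self):
            system = GreetingCore(InMemoryGreetingStore())  # hexagons wired with adapters
            greeting = system.greet("Ada")
            self.assertIn("Ada", greeting)

    if __name__ == "__main__":
        unittest.main()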

Microservices

Microservices are the run-time environment for hexagons; they are merely an engine in which the hexagons run. Each hexagon can be exposed as a service and accessed by other services via REST or RPC. The notion of a microservice is not yet perfectly solidified, and there are disagreements about the size of service considered a perfect fit for a microservice. Discussions based on the number of lines needed to implement a service are deeply flawed, as the implementation is only of secondary importance. The cohesion of the service's interface should be the only criterion used to define the size of a unit. We should define units of software around business needs, not around pre-existing hardware solutions or software concepts.
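
A minimal sketch, assuming Flask is available, of exposing the hypothetical greeting hexagon as a microservice: a thin HTTP adapter drives the core's incoming port, and other services talk to it over REST.

    from flask import Flask, jsonify, request

    class GreetingCore:
        def greet(self, name):
            return "Hello, %s" % name

    app = Flask(__name__)
    core = GreetingCore()  # in a real system the core would be wired with adapters

    @app.route("/greetings", methods=["POST"])
    def create_greeting():
        # The HTTP layer is just an adapter on the incoming port.
        payload = request.get_json()
        return jsonify({"greeting": core.greet(payload["name"])})

    if __name__ == "__main__":
        app.run(port=5000)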

References

[1] See http://alistair.cockburn.us/Hexagonal+architecture
[2] See http://jeffreypalermo.com/blog/the-onion-architecture-part-1/
[3] See http://martinfowler.com/articles/microservices.html
[4] Such critique has been expressed by David Heinemeier Hansson at http://david.heinemeierhansson.com/2014/test-induced-design-damage.html . The article is especially interesting because the author identifies some very real problems but arrives at some, quite frankly, misguided conclusions.
[5] The authors are not claiming that Django should never be used, but merely suggest that Django is probably not a universal tool to apply to all software projects.
[7] See http://en.wikipedia.org/wiki/Conway%27s_law