I think that caring for
dependencies is the best thing you can do to make your program maintainable and
to fight complexity. Here I’d like to provide some tips for identifying
dependency and structural patterns at a glance, such as layered or entangled
code, high cohesion and low coupling across components, hungry callers and popular callees.
For most engineers, talking about dependencies means talking about
a graph made of boxes and arrows.
However, a boxes-and-arrows graph is not the most appropriate way to deal
with dependencies. Dependencies can also be represented through what is called a Design Structure Matrix (DSM), also known as a Dependency Structure Matrix. In the snapshot
below, the same information is represented by a DSM and by a boxes-and-arrows graph.
- Matrix header elements correspond to graph boxes.
- Non-empty matrix cells correspond to graph arrows.
Thus, the coupling from PaintDotNet
to PdnLib is represented by a non-empty
cell in the matrix and by an arrow in the graph.
Moreover, the matrix is symmetric across its diagonal and is composed of
blue and green cells. A blue cell means that an element from the horizontal header uses an element
from the vertical header, while a green
cell means that an element from the vertical
header uses an element from the horizontal
header. Thus, if the matrix headers contain the same set of elements (which is
the case here), each dependency is represented both by a blue cell and by a green cell. These 2 cells are
symmetric across the diagonal.
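To make the blue/green convention concrete, here is a minimal Python sketch that builds such a matrix from a list of dependencies. Only the PaintDotNet to PdnLib dependency comes from the snapshot; the Effects element, the other edges and the build_dsm function are made up for illustration, and this is not meant to describe how NDepend computes its DSM.

```python
def build_dsm(elements, edges):
    """Map each (row, column) pair to 'blue', 'green' or 'black'.

    Rows stand for the vertical header, columns for the horizontal header.
    Each edge is a (user, used) pair.
    """
    uses = set(edges)
    dsm = {}
    for row in elements:
        for col in elements:
            if row == col:
                continue  # the diagonal stays empty
            col_uses_row = (col, row) in uses
            row_uses_col = (row, col) in uses
            if col_uses_row and row_uses_col:
                dsm[(row, col)] = "black"  # mutual direct usage
            elif col_uses_row:
                dsm[(row, col)] = "blue"   # horizontal element uses vertical element
            elif row_uses_col:
                dsm[(row, col)] = "green"  # vertical element uses horizontal element
    return dsm


# Hypothetical example data (only PaintDotNet -> PdnLib is from the text).
elements = ["PaintDotNet", "Effects", "PdnLib"]
edges = [
    ("PaintDotNet", "PdnLib"),
    ("PaintDotNet", "Effects"),
    ("Effects", "PdnLib"),
]

dsm = build_dsm(elements, edges)
for row in elements:
    print(row.ljust(12), [dsm.get((row, col), "") for col in elements])
```

Each dependency shows up twice in the result: once as a blue cell and once as a green cell mirrored across the diagonal.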
One pattern that a DSM makes obvious is a layered structure (i.e. an acyclic
structure). When the matrix is triangular, with all blue cells in the
lower-left triangle and all green cells in the upper-right triangle, it shows
that the structure is perfectly layered (i.e. it doesn’t contain any dependency cycle).
On the right part of the snapshot, the same layered structure is
represented with a graph. All arrows point in the same left-to-right direction. The problem with a graph is that its
layout doesn’t scale. Here, we can barely see the big picture of the structure.
If the number of boxes were doubled, the graph would be completely
unreadable. The DSM representation, on the other hand, wouldn’t be affected; we
say that a DSM scales better than a graph.
By the way, interestingly enough, most graph layout algorithms rely on the
graph being acyclic. To compute the layout of a graph with cycles, these
algorithms temporarily discard some dependencies so that they can work on a layered graph,
and then add the discarded dependencies back in the last step of the computation.
If a structure contains a cycle, the cycle is displayed as a red square
on the DSM. We can see that inside the red square, green and blue cells are
mixed across the diagonal. There are also some black cells that represent
mutual direct usage (i.e. A uses B and B uses A).
NDepend’s DSM comes with an Indirect Dependency option. An indirect dependency
between A and B means that A is using something, that is using something, that
is using something … that is using B. Below is the same DSM with a cycle,
but in indirect mode. We can see that the red square is filled with only black
cells. It simply means that, given any two elements A and B in the cycle, A and B are
indirectly and mutually dependent.
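Indirect usage is the transitive closure of direct usage, which can be sketched in a few lines of Python. The elements A, B, C and D and the indirect_uses function are hypothetical; the sketch only illustrates why every cell inside a cycle turns black in indirect mode.

```python
def indirect_uses(direct):
    """Return, for each element, everything it uses directly or indirectly.

    `direct` maps an element to the set of elements it uses directly.
    """
    reach = {}
    for start in direct:
        seen = set()
        stack = list(direct[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(direct.get(node, ()))
        reach[start] = seen
    return reach


# Hypothetical structure: A -> B -> C -> A is a cycle, D only uses A.
direct = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"A"}}
reach = indirect_uses(direct)

cycle = ["A", "B", "C"]
# Inside the cycle, every pair of elements is mutually (indirectly) dependent.
all_mutual = all(
    y in reach[x] and x in reach[y]
    for x in cycle
    for y in cycle
    if x != y
)
print(all_mutual)
```

D reaches every element of the cycle, but nothing in the cycle reaches D, so D stays outside the red square.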
Here is the
same structure represented with a graph. The red arrow shows that several
elements are mutually dependent. But the graph is of no help in highlighting
all the elements involved in the cycle.
Notice that in NDepend, we provide a button to highlight cycles in the
DSM (if any). If the structure is layered, this button triangularizes
the matrix and keeps non-empty cells as close as possible to the diagonal.
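Triangularizing a layered structure amounts to finding a topological order of its elements. Here is a minimal sketch using Python’s standard graphlib module (Python 3.9+). The UI, Core and Util elements and the layered_order function are hypothetical, and the sketch makes no attempt to keep cells close to the diagonal; it is not NDepend’s algorithm.

```python
from graphlib import CycleError, TopologicalSorter


def layered_order(direct):
    """Return elements ordered high-level first, or None if there is a cycle.

    `direct` maps an element to the set of elements it uses directly.
    """
    try:
        # static_order() yields used elements before their users.
        low_level_first = list(TopologicalSorter(direct).static_order())
    except CycleError:
        return None
    return low_level_first[::-1]


# Hypothetical layered structure.
layered = {"UI": {"Core", "Util"}, "Core": {"Util"}, "Util": set()}
order = layered_order(layered)
position = {name: index for index, name in enumerate(order)}

# With this header order every user comes before what it uses, so all blue
# cells fall in the lower-left triangle and all green cells in the upper-right.
is_triangular = all(
    position[user] < position[used]
    for user, used_set in layered.items()
    for used in used_set
)
print(order, is_triangular)

# Hypothetical entangled structure: no such order exists.
entangled = {"A": {"B"}, "B": {"A"}}
print(layered_order(entangled))
```

When a cycle is present, no header order can make the matrix triangular, which is why the cycle has to be shown as a square instead.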
High Cohesion / Low-Coupling
The idea of high cohesion (inside a component) and low coupling (between
components) is popular. But if you cannot visualize dependencies, this idea remains
abstract. DSMs are good at showing high cohesion. In the DSM below, an
obvious square aggregate is displayed around the diagonal. It means that the
elements involved in the square have a high cohesion: they are strongly dependent
on each other, although we can see that they are layered since there are no
cycles. They are certainly candidates to be grouped into a concrete artifact (such
as a namespace or an assembly).
On the other hand, the fact that most cells around the square are empty
indicates low coupling between the elements of the square and the other elements.
In the DSM below, we can see 2 components with high cohesion
(the upper and lower squares) and a pretty low coupling between them.
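One rough way to put numbers on what the squares show is to compare the density of dependencies inside a group with the density between groups. The sketch below is only an illustration under that assumption: the element names, the groups and the density function are invented, and NDepend may measure cohesion and coupling differently.

```python
from itertools import combinations, product


def density(pairs, uses):
    """Fraction of the given element pairs linked by a dependency (either way)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if (a, b) in uses or (b, a) in uses)
    return linked / len(pairs)


# Hypothetical (user, used) dependencies: two tight groups, one link between them.
uses = {
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
    ("a3", "b1"),
}
group_a = ["a1", "a2", "a3"]
group_b = ["b1", "b2", "b3"]

cohesion_a = density(combinations(group_a, 2), uses)  # pairs inside group a
cohesion_b = density(combinations(group_b, 2), uses)  # pairs inside group b
coupling = density(product(group_a, group_b), uses)   # pairs across the groups

print(cohesion_a, cohesion_b, round(coupling, 2))
```

A high density inside each group combined with a low density across groups is the numeric counterpart of two filled squares surrounded by mostly empty cells.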
While refactoring, such an indicator can be pretty useful to know
whether there are opportunities to split coarse components into several more fine-grained components. I once wrote a
post on the topic: Hints on how to
componentize existing code.
A hungry caller is a code element that directly uses a lot of other
code elements. Having hungry callers is typically a bad thing: it
pinpoints an element that almost certainly has too many responsibilities. However, a
few high-level hungry callers that connect several parts of an application are
unavoidable. We are talking here of the Mediator pattern.
The idea is to encapsulate the connections
between elements in one place (the mediator), so that elements that are independent from
each other can interact.
A hungry caller is represented by columns with many blue cells and by
rows with many green cells. The DSM below shows that NDepend.UI.KernelImpl is a hungry caller.
On the other hand, a popular callee is a code element that is used by many other code
elements. Popular callees are also unavoidable (think of the String class, for example), but a popular
callee is not a bad thing. It just means that in every code base, there are some
central concepts represented by popular classes.
A popular callee is represented by columns with many green cells and by
rows with many blue cells. The DSM below shows that NDepend.UI.KernelInterface is popular.
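Counting blue cells per column and green cells per column is the same as counting outgoing and incoming dependencies per element. Here is a minimal sketch of that count. The two NDepend.UI names come from the text above, but the Panel names and every edge in the list are invented for illustration.

```python
from collections import Counter

# Hypothetical (user, used) pairs; each pair must appear only once.
edges = [
    ("NDepend.UI.KernelImpl", "PanelA"),
    ("NDepend.UI.KernelImpl", "PanelB"),
    ("NDepend.UI.KernelImpl", "PanelC"),
    ("NDepend.UI.KernelImpl", "NDepend.UI.KernelInterface"),
    ("PanelA", "NDepend.UI.KernelInterface"),
    ("PanelB", "NDepend.UI.KernelInterface"),
    ("PanelC", "NDepend.UI.KernelInterface"),
]

# How many elements each element uses (blue cells in its column).
fan_out = Counter(user for user, _ in edges)
# How many elements use each element (green cells in its column).
fan_in = Counter(used for _, used in edges)

hungriest = fan_out.most_common(1)[0]
most_popular = fan_in.most_common(1)[0]
print("hungry caller:", hungriest)
print("popular callee:", most_popular)
```

Sorting elements by these two counts is a quick way to shortlist candidates before looking at the matrix itself.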
When one keeps a code structure perfectly layered, popular
components are naturally kept at a low level. Indeed, a popular component cannot, de facto, use many things: because
popular components are low-level, they cannot use something at a higher level. This
would create a dependency from low level to high level, and this would break the
acyclic property of the structure.
In other words, keeping a structure layered creates pressure on popular low-level elements.
This pressure prevents the code base structure from becoming entangled. I wrote a
post on this cool idea: Evolutionary
Design and Acyclic componentization.
So far we have only seen square, symmetric DSMs, where the set of elements in the vertical and horizontal headers is the same. However, NDepend can also deal with
rectangular matrices. For example, you can see the coupling between 2 components
by right-clicking a cell and selecting the menu Open this dependency.
If the opened cell was black, as in the snapshot above (i.e. if A and B are mutually dependent), then
the resulting rectangular matrix will contain both green and blue cells (and
possibly black cells as well), as in the snapshot below.
In this situation, you’ll often notice an imbalance between green and blue cells
(3 blue cells for 1 green cell here). This is because even if 2 code elements are
mutually dependent, there often exists a natural level order between them. For
example, consider the System.Threading
namespace and the System.String
class. They are mutually dependent; they both rely on each other. But the
matrix shows that Threading is much more dependent on String than the opposite
(there are many more blue cells than green cells). This confirms the intuition
that Threading is higher-level than String.
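The same comparison can be sketched in code: count the usages in each direction between two mutually dependent components and treat the one that uses the other more often as the higher-level one. The components, their members, the edges and the level_hint function below are all hypothetical; they only echo the 3-to-1 ratio of the snapshot.

```python
def level_hint(name_a, members_a, name_b, members_b, edges):
    """Guess which of two components is higher-level from usage counts.

    Returns (higher_level_name_or_None, a_uses_b_count, b_uses_a_count).
    """
    a_to_b = sum(1 for user, used in edges if user in members_a and used in members_b)
    b_to_a = sum(1 for user, used in edges if user in members_b and used in members_a)
    if a_to_b == b_to_a:
        return None, a_to_b, b_to_a  # no natural order is visible
    higher = name_a if a_to_b > b_to_a else name_b
    return higher, a_to_b, b_to_a


# Hypothetical members and (user, used) dependencies between two components.
upper_members = {"u1", "u2", "u3"}
lower_members = {"l1", "l2"}
edges = [
    ("u1", "l1"),
    ("u2", "l1"),
    ("u3", "l2"),
    ("l2", "u1"),  # the one dependency going "the wrong way"
]

hint = level_hint("Upper", upper_members, "Lower", lower_members, edges)
print(hint)
```

The minority direction (one usage here) is usually the cheaper one to remove when breaking the cycle between the two components.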
And this intuition
is something that you can use to your advantage. Often a code base seems
completely entangled at first glance. But what I have seen many times is that
removing cycles between components is not such a big burden. This is because even
if you are not enforcing layering automatically, developers’ intuition tells them
that low-level things (such as DB access code) shouldn’t use high-level things
(such as UI code). Of course, each code base contains hacks and mistakes, but my
point is that they are hopefully more the exception than the rule. There are
often far fewer mistakes than good things in a code base. I wrote a post on the
topic: Re-factoring, Re-Structuring and
the cost of Levelizing.
Here is what the data object pattern looks like. Data objects are classes that
contain only getters/setters and their respective fields.
I think that managing dependencies is the best thing to care for in order to reach maintainable
code and lower complexity. Being able to quickly identify patterns through a
DSM is a powerful tool that, I hope, will help you in your daily work.