Can we avoid tooling to prevent spaghetti code?


Recently, Jeremy D. Miller wrote on Twitter:

I still think NDepend is more important in strange codebases. I know where the coupling pain points are in my own code.

I read Jeremy’s blog and think it is an excellent resource for improving developer skills. Still, I was very concerned to read: I know where the coupling pain points are in my own code. I put a lot of energy into evangelizing the efficiency of good componentization to fight against spaghetti code. Anyone who has worked on a real-world large project (a project with many developers, turnover, lack of education and years of legacy) knows that dealing with wrong coupling is a major problem. Actually, I think it is THE major problem for large organizations. I tried to summarize this situation and its solution in my post: Getting rid of spaghetti code in the real-world: a Case Study

We all agree that software development is an incredibly difficult engineering task, and I get skeptical when I hear overly confident claims like I know where the coupling pain points are in my own code. The reasons are twofold:

  • First, without the help of appropriate tooling, nobody can master the structure of a program.

  • Second, from what I have seen, an unmanageable monster code structure is the primary technical reason for project failure.

Out of curiosity, I analyzed StructureMap (an OSS project developed by Jeremy) with NDepend.

Below is a graph of dependencies between namespaces: the code structure is entangled, with no care for componentization nor any identifiable layer:

Here is the same entangled structure, viewed this time with the dependency matrix, together with a particular cycle of length 5:
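A cycle like the one the matrix exhibits can be found mechanically with a depth-first search over the namespace dependency graph. The sketch below is only illustrative (it is not NDepend’s actual algorithm, and the namespace names and edges are invented):

```python
# Illustrative only: a DFS that returns one dependency cycle, if any.
# Namespace names and edges below are invented for the example.
def find_cycle(deps):
    """Return one cycle as a list of nodes (first == last), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in deps}
    stack = []  # current DFS path

    def visit(n):
        color[n] = GRAY
        stack.append(n)
        for m in deps.get(n, ()):
            if color[m] == GRAY:  # back edge: m is on the current path
                return stack[stack.index(m):] + [m]
            if color[m] == WHITE:
                cycle = visit(m)
                if cycle:
                    return cycle
        stack.pop()
        color[n] = BLACK
        return None

    for n in deps:
        if color[n] == WHITE:
            cycle = visit(n)
            if cycle:
                return cycle
    return None

# A hypothetical namespace graph containing a cycle of length 5.
deps = {
    "App.UI":          ["App.Core"],
    "App.Core":        ["App.Graph"],
    "App.Graph":       ["App.Model"],
    "App.Model":       ["App.Persistence"],
    "App.Persistence": ["App.Diagnostics"],
    "App.Diagnostics": ["App.Core"],  # closes the 5-namespace cycle
}
print(find_cycle(deps))
# → ['App.Core', 'App.Graph', 'App.Model', 'App.Persistence', 'App.Diagnostics', 'App.Core']
```

Breaking such a cycle usually means inverting one of these edges, for example by moving the shared types or an abstraction down into a lower layer.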

 

 

Don’t get me wrong, I don’t put the blame on StructureMap nor on its author. I present concrete and verifiable facts, not opinion. I even consider the lack of componentization acceptable, taking into account the fact that StructureMap is a small project developed by a single person (around 3,100 lines of code, LoC). My concern is that, Jeremy being a leader in the .NET community, if a programmer takes the words (tooling is for someone else) I know where the coupling pain points are in my own code for granted, he will end up with a similar spaghetti code structure, but this time on a large-scale professional project.

 

 

Actually, when starting NDepend 5 years ago, I didn’t put much care into componentization and layering. Inevitably, I ended up dealing with spaghetti. At some point in 2006, the project was on its way to becoming something professional, and at the same time the dependency matrix feature was released. The first thing to do was to eat my own dog food by layering the code. For those interested, the whole story is detailed in this article: Control component dependencies to gain clean architecture (section Getting rid of dependency cycles: a case-study).

 

The dependency matrix below shows the current NDepend code base structure. To read the matrix: a blue cell means that the namespace represented by the column directly uses the namespace represented by the row. The structure is made of 66 components mapped to 66 layered namespaces: the matrix is triangular, meaning that high-level and low-level layers are well identified or, in other words, that there are no dependency cycles. The blue triangle in the middle is the consequence of the UI code cohesion. The blue ribbon at the bottom is the consequence of the low-level internal framework, which itself relies on some helpers.
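The link between a triangular matrix and layering can be sketched in a few lines of code: if the dependency graph is acyclic, each namespace can be assigned a layer index (1 + the highest layer it uses), and ordering namespaces by that index makes every dependency point toward an earlier row. The namespace names below are invented, not the actual NDepend namespaces:

```python
# Invented, acyclic namespace graph: each entry lists what it uses.
deps = {
    "App.UI":        ["App.Engine", "App.Helpers"],
    "App.Engine":    ["App.Framework", "App.Helpers"],
    "App.Framework": ["App.Helpers"],
    "App.Helpers":   [],
}

def layer(ns):
    """Layer 0 = uses nothing; otherwise 1 + the highest layer used."""
    used = deps[ns]
    return 0 if not used else 1 + max(layer(u) for u in used)

# Sorting by layer yields the row/column order of a triangular matrix.
for ns in sorted(deps, key=layer):
    print(layer(ns), ns)
# → 0 App.Helpers
#   1 App.Framework
#   2 App.Engine
#   3 App.UI
```

If the graph had a cycle, `layer` would recurse forever: the very existence of a well-defined layer index is what “no dependency cycles” buys you.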

 

 

 

 

With 78K LoC, NDepend is 25 times bigger than StructureMap. This is illustrated by the metric view below. Every day it is a blessing for the whole team to work on cleanly structured code. Also, every new design decision is driven by respect for this clear, simple and elegant structure.

 

 

 

With 78K LoC, NDepend is still a medium-sized code base. It is itself small compared to a really large code base such as the whole .NET Framework v3.5, with its millions of LoC. The point here is to show how big a code base can become, to give an idea of the damage that spaghetti code can do.

 

 

 

An entire post in response to a Twitter sentence is maybe a bit disproportionate. But I feel concerned when someone trusted says that he doesn’t need tooling to care for dependencies/layering/componentization. To me, it is as shocking as hearing that code correctness can be achieved without automated testing and contracts. I have seen too many projects fail because of the spaghetti paradigm, and I am sure you have seen it too (I just hope you don’t deal with it every day [:)]). The return on investment for structuring the code is just too high not to do it; put another way, the pain you get from not structuring the code is just too high to leave it unstructured.

 

 



I had an email discussion with Jeremy about this post:

Jeremy: By I know where my coupling pain points are I meant that I know
exactly which classes are dangerous to touch and which ones aren’t that
risky (…) At no point in the post do you make any kind of concrete proof that the particular namespace coupling made any kind of real impact on development.  I watch CC#’s religiously (it’s crept up in some places by natural accretion), but the namespace coupling?  I’d judge that to be relatively unimportant in terms of real impact.

Me: The problem comes from which artifact you use to represent components. Microsoft implicitly advocates for assemblies to represent components, which IMHO is a very poor choice.
By implicitly I mean mainly:

  • the internal visibility level, whose scope is the assembly
  • the one-to-one association between VS project and assembly
  • the way projects get referenced by each other in VS

I qualify it as a poor choice because an assembly is a physical concept (a file) while a component is a logical concept. I wrote a lot about the problems provoked by physical components (VS slowdowns, C#/VB.NET compilation time, CLR loading time, deployment issues with plenty of files to deal with, a unit of re-use spread across many files…)

Advices on partitioning code through .NET Assemblies
Control
component dependencies to gain clean architecture

(section .NET Components)

Jeremy has also advocated against using assemblies to master coupling, in his post: Separate Assemblies != Loose Coupling

If you don’t choose the assembly to represent a component, the natural candidate artifact is then the namespace. And if so, namespace cycles are bad because component cycles are bad. If components A and B are mutually dependent, you get one super-component made of A and B, because A and B cannot be tested, reused, refactored etc. (all these things that characterize a component) independently from each other.
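This super-component argument can be made concrete with a short sketch: two components belong to the same indivisible unit exactly when each can reach the other through dependencies, i.e. when they sit in the same strongly connected component. The component names below are invented:

```python
# Illustrative sketch: mutually dependent components collapse into one unit.
def reaches(src, dst, deps, seen=None):
    """True if dst is reachable from src by following dependencies."""
    seen = set() if seen is None else seen
    if src == dst:
        return True
    seen.add(src)
    return any(reaches(m, dst, deps, seen) for m in deps[src] if m not in seen)

def super_component(start, deps):
    """All components mutually reachable with start (its strongly connected component)."""
    return {n for n in deps if reaches(start, n, deps) and reaches(n, start, deps)}

# A and B depend on each other; C is a clean, independent component.
deps = {"A": ["B"], "B": ["A", "C"], "C": []}
print(sorted(super_component("A", deps)))  # → ['A', 'B']
print(sorted(super_component("C", deps)))  # → ['C']
```

Neither A nor B can be compiled against, tested or reused without dragging the other along; only C remains an independent unit.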

A third choice for modeling components, apart from assemblies and namespaces, is a group of types, as the .NET Base Class Library team does. They used namespaces to organize the framework’s set of public types, which is not a bad thing from a user’s point of view. They are now stuck grouping their types (the lowest level contains the primitive CLR types int, bool…, then comes a component with more evolved primitive types such as string, StringBuilder and arrays…, then comes threading and other low-level stuff… etc.). Types are then grouped independently from their namespaces (and maybe from their assemblies, this I don’t know). The problem here is that this third artifact (the group of types) is made concrete by some proprietary tools and has no real existence in the source code itself. It is hard for a programmer to get a concrete picture of the componentization, and this certainly increases the complexity of refactoring any portion of code.

 

 

This entry was posted in Uncategorized.
  • http://z-bo.tumblr.com John "Z-Bo" Zabroski

    Hi Patrick,

    I understand your plea, but…

    Here are my notes:

    You criticize Microsoft’s implicit position that assemblies are components, but don’t make the connection between this and the consequences of this action. It actually extends all the way down into the “.NET Data Model” that was intended to replace all those problem domain specific data models like RDO.

    Here is my favorite example: DependencyProperty/DependencyObject in .NET 3.0. In particular, there are two pairs of these objects, one for WPF and one for WCF. The WPF one is just plain awful, and it is a miracle Microsoft completed WPF given this architectural mistake. To highlight the design flaw, DependencyProperty has knowledge of WPF subsystems, despite the fact it is intended as a helper for all of WPF. This knowledge of WPF is the principal reason why WCF needs its own notion of DependencyProperty and therefore DependencyObject. In turn, this affects integration *everywhere* in .NET 3.0!

    Easily the hugest architectural mistake Microsoft has made with .NET, ever.

    I honestly don’t know exactly why Microsoft did this. I’m told by insiders that hanging WPF subsystem flags like AffectsMeasure and AffectsRender was intended to improve “discoverability”. I found this explanation kind of surprising. My guess is the true reason has become “onions lost in the varnish”. The only justified reason for this is performance, though, not discoverability. In particular, the CLR has an overhead for cross-assembly security context checks and I’m not sure if the CLR is bright enough to “lift” these checks using static analysis so that they are only performed once.

    However, I wouldn’t overblow namespaces and assemblies. Adding accessibility modifiers for namespaces does not solve any problems whatsoever. A more sensible approach is to stick with the concept of namespaces as simply not having any meaning in the CLR. What needs to definitely be done if it’s not already is to drastically eliminate any performance overhead for cross-assembly security context checks, by lifting checks so they are performed only once per xref.

  • Brum

    Way to go Patrick. Keep that light shining.

    You and me both know why MS itself is investing heavily in static dependency tooling for VS2010, which in turn will help to make this awareness an increasingly more intrinsic part of our trade.

    We can only hope they at some point address the issues of making namespaces first-class citizens of the code world (i.e. private/public/protected namespace MyNamespace {…} etc.) and the “Copy Local” madness. We both know how these things relate to the subject at hand.

    Oh yes. And let’s hope Rico Mariani (http://channel9.msdn.com/shows/Going+Deep/Rico-Mariani-Visual-Studio-Today-Tomorrow-and-Beyond/) gets it right so we can have all of our big code base in one VS solution as soon as possible and run the tests inside that same VS instance too. This would certainly make it easier to do the refactoring actually needed to get all the way to the world of levelized.

    Actually I am tempted to turn the table and ask Jeremy D.Miller:
    - After Patricks exposé, did you learn something new about the dependencies of StructureMap?
    - Knowing about the dependency cycles, do you have good arguments for not dealing with them, other than that you know they are ‘pain points’, ‘dangerous to touch’, ‘high risk’ etc.?

    Just to make it clear: I share Patrick’s (and in fact mostly everyone in this thread’s) view that StructureMap is a brilliant and worthwhile piece of framework (in fact we use it and plan to use it more in our work to control dependencies; now if that isn’t ironic I don’t know what is ;)).

    The first question is of course to the point of the help you can get from tooling when it comes to seeing dependencies. If even a guy who claims he has full control of his circulars admits to learn something from a static dependency tool like NDepend I think P is vindicated. Warning! The opposite is of course not true! (due to the size of the project as P has pointed out)

    The second is more related to the stop criterias (when is it good enough? are there legitimate reasons for leaving some circulars be? what are these reasons? can they be enumerated? etc.).

    Regards
    Brum – Sworn In Spaghetti Slayer

  • http://www.NDepend.com Patrick Smacchia

    Neil,
    In the explanations, A,B,C are playing equivalent roles, you can still say:
    if B depends on C, which depends on A, which depends on B, you see that B is in the exact same entangled position as A (and C also).

  • Neil Blackburn

    Hi Patrick,

    Great post again. I wanted to fully understand why dependency cycles are harmful, and this is probably a dumb question. You say if A depends on B that depends on C that depends on A, A can’t be developed and tested independently. But isn’t that the case for B which has a dependency on C?

    Thanks.

  • http://codebetter.com/members/Patrick-Smacchia/default.aspx Patrick Smacchia

    >And my point is that some cycles are not “evil”

    Then we fundamentally disagree on this point. Cycles mean the absence of components, and the absence of components means the death of the project in the mid-term. I wrote a lot about this, so let me copy/paste some reasoning from this article:
    http://www.theserverside.net/tt/articles/showarticle.tss?id=ControllingDependencies

    Why are dependency cycles harmful ?

    Dependency cycles between components lead to what is commonly called spaghetti code or tangled code. If component A depends on B that depends on C that depends on A, the component A can’t be developed and tested independently of B and C. A, B and C form an indivisible unit, a kind of super-component. This super-component has a higher cost than the sum of the cost over A, B and C because of the diseconomy of scale phenomenon (well documented in Software Estimation: Demystifying the Black Art by Steve McConnell). Basically, this holds that the cost of developing an indivisible piece of code increases exponentially. This suggests that developing and maintaining 1,000 LOC (Lines Of Code) will likely cost three or four times more than developing and maintaining 500 LOC, unless it can be split in two independent lumps of 500 LOC each. Hence the comparison with spaghetti that describes tangled code that can’t be maintained. In order to rationalize architecture, one must ensure that there are no dependency cycles between components, but also check that the size of each component is acceptable (500 to 1000 LOC).

  • Bob

    > As the title of the post suggests, the point here is to explain by example that without tooling you cannot avoid cycles/spaghetti/entangled code.

    And my point is that some cycles are not “evil”, and that while showing a pretty picture with arrows pointing everywhere makes it look like a codebase has fatal flaws, that may not actually be the case. Again, show the dependency matrix, then show some actual code.

  • http://www.NDepend.com Patrick Smacchia

    >Displaying a graph with red nodes and lines and describing it as “entangled” is not fact, but opinion.

    I am not a native English speaker. Will, if you have a better adjective than ‘entangled’ to qualify a graph with numerous circular dependencies, please submit it. I did not invent these cycles; they are here, they are facts.

    >Show us how using the tool can improve the code/design.

    I mentioned that the NDepend code base was also entangled in the early days because of the lack of tooling. It has now been fully layered for almost 3 years, thanks to special tooling (actually NDepend analyzing itself). I provided a link to an article that explains this evolution.

    As the title of the post suggests, the point here is to explain by example that without tooling you cannot avoid cycles/spaghetti/entangled code.

  • http://geekswithblogs.net/WillSmith Will Smith

    It seems to me that if the purpose of your post is to demonstrate the advantage of NDepend as a tool, then you should show some sort of before and after of the CODE. Use a small sample project, display the dependency matrix and the code before, then after. Show us how much more elegant and readable the code is. Show us how using the tool can improve the code/design.

    Regardless of your stating:

    “Don’t take me wrong, I don’t put the blame on StructureMap nor its author. I present concrete and verifiable facts, not opinion…”,

    displaying the dependency matrix of StructureMap and then of NDepend as a comparison can easily be construed as a criticism of SM. Displaying a graph with red nodes and lines and describing it as “entangled” is not fact, but opinion.

    That said, I can see the value in using a tool like NDepend to review the complexity of your code. However, blindly following the suggestions to eliminate all of the circular dependencies could easily add more complexity to a design. We have SOLID principles, not laws.

  • http://www.NDepend.com Patrick Smacchia

    >It’s simply too much trouble to get that out of the way for the reward you’ll get back IMHO.

    Frans, saying that is like saying you don’t believe in layering and componentization, or am I wrong?

    >it would be better if you clearly showed what the impact would be if this coupling would stay around

    The following post is a testimonial of how painful a big entangled monolithic assembly was, and why it was necessary to split it up. Basically, the problem was that this monster assembly forced all its users to use 100% of the underlying code, even when only 10% was needed. The team came back to me very recently after layering the whole code and splitting the monster, and they seemed extremely happy with it:

    http://codebetter.com/blogs/patricksmacchia/archive/2008/09/23/getting-rid-of-spaghetti-code-in-the-real-world.aspx

    >I really don’t see the disadvantage of this, only if the different namespaces (which share the same root namespace and are in the same assembly) are split up in different assemblies, but when does that happen? They aren’t grouped together for nothing.

    The only constant in software development is that things change unexpectedly. When does that happen? When do we need to split up a big entangled monolithic piece of code that has grown unexpectedly? I would answer: very often actually; the post referenced above shows this very clearly.

    Also, concerning the advantages of a good componentization, the following post explains how seamless it was to do a global and apparently very intrusive refactoring just because the code was properly layered from the ground up:
    http://codebetter.com/blogs/patricksmacchia/archive/2009/02/22/evolutionary-design-and-acyclic-componentization.aspx

  • http://www.NDepend.com Patrick Smacchia

    Bob, I took great care not to judge the SM code base. I know that SM is a useful piece of OSS software developed by a brilliant mind. I explicitly wrote my intentions:

    >Don’t take me wrong, I don’t put the blame on StructureMap nor its author…

    Re-read the blog post and you’ll see that the one and only thing I exposed from the SM code base is the 2 screenshots, which are pure facts about the SM code base itself.

    That said, I demonstrated that the NDepend code base is better componentized, and I explained that this is because we use tooling for that; this is the point I wanted to demonstrate in this post: Can we avoid tooling to prevent spaghetti code?

  • http://weblogs.asp.net/fbouma Frans Bouma

    “By I know where my coupling pain points are I meant that I know exactly which classes are dangerous to touch and which ones aren’t that risky (…) At no point in the post do you make any kind of concrete proof that the particular namespace coupling made any kind of real impact on development. I watch CC#’s religiously (it’s crept up in some places by natural accretion), but the namespace coupling? I’d judge that to be relatively unimportant in terms of real impact.”

    I agree with this. In my current major project (you know what ;)) I rewrote a lot of code, and it turned out to be organized very differently, yet I do know I have a couple of cycles in the namespace coupling. It’s simply too much trouble to get that out of the way for the reward you’ll get back IMHO. I’m definitely not going to add interfaces all over the place just to get rid of the coupling while there’s 1 implementation of that interface in the whole big code base: the impact on namespace coupling is IMHO too low for that work.

    I.o.w.: it would be better if you clearly showed what the impact would be if this coupling would stay around. I really don’t see the disadvantage of this, only if the different namespaces (which share the same root namespace and are in the same assembly) are split up in different assemblies, but when does that happen? They aren’t grouped together for nothing.

  • Bob

    Fair enough, I just found it lame that you “exposed” the SM codebase the way you did. I would imagine you could take code from almost any developer, regardless of skill, and have NDepend render an image that makes it look as if the developer learned to program by reading the back of a Cracker Jack box. SM is a great piece of software that solves the needs of many applications and does so in an elegant way. Your post was misleading.

  • http://www.NDepend.com Patrick Smacchia

    Bob ‘Anonymous’, I dedicate several hours every week to expose some ideas that, I estimate, can make the life of blog readers easier.

    I am the lead developer of the NDepend tool, and consequently most of the ideas explained revolve around this tool’s domain. On CodeBetter, I think that Glenn Block with MEF and Matthew Podwysocki with F# have the exact same approach. I consider their posts as invaluable insider notes; who else than a co-creator could better explain the intentions surrounding these technologies?

    I try hard to provide real added value for each blog post, with some efficient ideas that are too rarely debated in the .NET community. If for some engineering reasons you disagree with my posts content, you are highly welcome to open a thread by answering my posts.

  • Bob

    Another sales pitch.