A typical effect of setting CopyLocal = true

If you read me in the past, you certainly know that I have a problem about the Visual Studio default option that set CopyLocal = true in project references assemblies. I mean this option:

NancyCopyLocal

I’ve already explained the problem in my blog, and also in White-Books available on the NDepend website. (see white-book Partitioning code base through .NET assemblies and Visual Studio projects page 4 CopyLocal = true is evil)

CopyLocalEvil

For example, in 2009 by fixing the problem on the NUnit code base, I’ve been able in less than an hour, to shrink the compilation time from 32 seconds to 12 seconds.

So what is the consequence of Nancy framework relying on CopyLocal = true: A huge number of cloned version on Nancy.DLL for each VS project referencing it. After a recompilation, I can count 56 occurences of Nancy.dll that weights 842 KB each, it means more than 46MB wasted. Apparently it doesn’t affect so much the VS compilation duration, I guess the VS team made VS 2012 more intelligent in the sense that if it has already parsed a reference assembly for a project, it doesn’t parse it for others, which is a good thing.

NancyDuplication

I tried to post something on the Nancy user group, without success, hence this post. To fix this issue, it is as simple as:

  • making sure that each project compile in a ..\bin\Debug and ..\bin\Release folder
  • from VS projects, reference the DLLs in the ..\bin\Debug directory, and set CopyLocal = false for each reference

Additional, I found ..\bin to be a good folder to host test assemblies. This way they can reference the application assemblies that live in ..\bin\Debug and the indirection can be handled with a App.config file like:

On Nancy code structure

One very good point is that Nancy.dll is a single DLL that only depends on .NET Fx assemblies so it is optimally packaged.

NancyRefs

On the other hand, this large DLL made of more than 7.000 lines of code and almost 300 classes, doesn’t abide by any kind of layered architecture. The consequence is that each namespace depends on all other namespaces.

NancyGraph

As a consequences classes are not layered at all, there is no high level or low level classes.

Let’s have a look at the dependency matrix below. It is made of the 287 Nancy.dll classes, in mode indirect dependency. A blue cells means that the class in column depends (directly or indirectly) on the class in row, a green cell means the opposite, and a black cell means that the 2 classes are dependents on each others. A perfectly layered set of classes (which is not wished, but what should be approached) would be a lower triangular matrix blue, and a higher triangular matrix green. Here the matrix has blue and green cells mixed equally in both lower and higher triangulars, which is another way to figured out that classes are not layered.

NancyMatrix

It means that it must be difficult to unit-test a set of cohesive classes (i.e a component) in isolation from the others (even by using mocks that don’t help in case or re-entrant dependency). It also means that maintenance and evolution must be impaired by this absence of code structure, because touching a single class can virtually have an unexpected effect anywhere in the code base.

Posted in Acyclic componentization, Code Dependency, code structure, Code visualization, CopyLocal syndrome, Dependency Cycle, Dependency Graph, Dependency Matrix, graph, Graph of Dependencies, Lines of Code, Maintainability, Partitioning, Visual Studio, VS, VStudio | 22 Comments

The joy of being a programmer

I am programming since I am 10 and I am now 38. Today I measure how much good programming bring to my life, directly and indirectly. I’d like to give credit to aspects I love in my job. Hopefully some young people will read this and will consider maybe doing one of the most wonderful job on earth.

Getting in the flow: According to wikipediaflow is the mental state of operation in which a person performing an activity is fully immersed in a feeling of energized focus, full involvement, and enjoyment in the process of the activity. In essence, flow is characterized by complete absorption in what one does. Focus, immersion, being concentrated, involvement …  being everyday in the flow by coding hours and hours, contribute a lot to a solid positive state of mind, it is a bit like meditation.  These are moments where one can completely forget about minor everyday pesky stuff, but also forget for a while more serious problem in life everyone has to face. Being in the flow is the condition to solve challenging problems and to create beautiful pieces of engineering. Being in the flow can lead to addiction but it is not addiction. This is essential to control when to check-in in the flow and when check-out, making sure to not be disturbed meantime.

Being creative: Being a software engineer is one of the most mainstream way of being paid for being creative. Often, writing software is deemed to be an artistic activity. A programmer has to be humble, because this is a kind of art not understood by the mass. But being humble is a chance to become wiser and increase self-confidence. Also, knowing you are going to be creative for a while, is an excellent motivation to overcome the first step effort to jump in the flow. But the truth is that for every passionate programmer, there is a background thread in the mind in charge of creativity (often running at sleep-time), that makes it so that in the morning the envy to create what you have in mind is too strong.

Becoming an expert: It is common to hear that a programmer must know numerous technologies, that its skill is to learn how to learn new technologies. On this I disagree because what makes me really happy is to master completely a technology and exercise daily my expertise. I used to master all the tricky details of C++ and COM and it was fun. Before that I used to master some assembly level programming and it was fun, I was not even getting paid for that. In 2002/2003 and then  2005 I wrote two editions of a 1.000 pages book on .NET and C# and writing it has been one of the most blessed moment in my career. Since then I capitalize on this knowledge every single hour of coding, letting me focus my thoughts on problems to solve, and not on all the non-trivial things a complex platform like .NET is actually performing under my feet. Of course I am constantly discovering new details about surrounding technologies, like functional programming paradigms through the prism of functional paradigm introduced in .NET languages. But I know what is my core knowledge, both in terms of technologies and in term of program design skills. And as long as I won’t be forced to change my core skills, relying on my expertise to express my creativity and making a living on top of it is a source of personal achievement.

Meet inspiring people sharing the same passion: I imagine meeting peers is a source of happiness for every expert in something. This is also why investing in a solid programming particular set of skills is a positive thing. Not only respect from others programmers arise, but it lets have great exchange with smart people as passionate as you. The importance of flow, underlined above, also comes with the disadvantage of being often alone with your thoughts. Most programmers enjoy working alone anyway, but for those who need a bit of more social activity, having an expertise in something is also a great way to become partly a teacher (in professional or academic spheres), partly an architect and contribute to important decisions, partly a team-leader and be responsible for project progressing, partly a consultant, and be able to share your knowledge in a pleasant social environment. I put the word partly in italic because if your social activity make it so that you are not writing code meant to be run in production anymore, you shouldn’t consider you as a programmer anymore and you’ll loose a great deal of the points I am mentioning here. If you need social interactions all day long, programming is not for you. This is also why (sadly) there are so few women in software, because evolution designed them to be much more social beings than men.

Being involved in something that make sense: Here also my position might be a bit different of what is widely accepted. I agree that for juniors, it is important to multiply the opportunities to work in several different teams and companies, to get an idea of what they like and what they don’t, to be influenced by several inspiring mentors. But once you become seniors, working on the same application in the long term, where you feel well programming in,  polishing it days after days, seeing its evolution across years, maintaining it in a clean state by adopting modern paradigms like automatic tests, DbC or relentless refactoring, having your word to say about strategic decisions, personally I found this being a great source of daily happiness and a great motivation to involve myself! In addition, working hard to achieving important milestones regularly, is an excellent way to give a sense to your professional career, which is (much) more the exception than the rule.

Work wherever, whenever you like: A 2KG laptop with the proper tools set installed, a few hours of electricity every 24 hours, this is all what a programmer needs to do his job well. A descent internet connection is often appreciated, and there are today only few points on earth where internet is not available at all somehow. Programming might be probably the less demanding working activity in terms of time and space requirement. Getting in a flow in a 12 hours plane flying across the planet, scheduling half a year to live and work in a paradise tropical island, avoiding traffic jam by staying at home for work (in your pijama), delaying programming in the night or early morning to take care of the kids and their education, all this is not only possible but pretty common actually. Most serious software companies let some of their skilled engineers work wherever they prefer. Did you know that many of the great minds behind Visual Studio didn’t actually moved with their family to Seattle, but still live in their preferred location, sometime far away from the US?

Make a decent living doing something you like: Last but not least, everywhere, skilled programmers get paid above the average salary in their countries, and if we take the example of a developing country like India, good programmers get a much much higher income than the average. On top of that, a skilled programmer have pretty close to zero chances to remain unemployed for a long time. This also means that if you don’t like your current position, it is easy to find another better suited one. This situation is made possible because less than two decades ago, the modern civilization realized that IT is the condition for its development. It is a fact that many of the richest persons in the world were originally programmers and every motivated programmer has the potential to create its own ISV business. A programmer can also decide to make more money by coding for the financial industry, or even bet on a startup and potentially makes millions in a few years.

All the next big things will consume even more IT, this includes human genome analysis for the mass, the entire medicine that will be deeply impacted by that, more prevalent portable devices, more sophisticated entertainments, augmented reality, robots and automated machines to do surgery, to assist the increasing growing number of old people, artificial intelligence in the long term, and certainly everything nobody hasn’t imagined yet. After all 20 years ago, nobody anticipated the impact of Google. 10 years ago nobody anticipated the impact of Facebook and smart phones.

Ok enough getting in the flow of writing my passion for programming, I have some code to write before the weekend :)

Posted in Code, Programming | 5 Comments

Don’t assign a field from many methods

For the next NDepend version, amongst plenty of cool stuff, a new default code rule will be added. It has been named Don’t assign a field from many methods. It falls into the category of Purity – Immutability – Side-Effects rules, one of my preferred set of rules since often hard-to-find and to-fix bugs come from wrong state managements.

Screenshot2

As its name suggests this rule matches fields assigned from multiple methods. Bugs due to corrupted state are often the consequence of fields anarchically assigned.

On many code bases tested, I found the threshold of 4 methods writer to be the best balanced number between too many false positives, and a decent amount of suspicious matches to look after at code-reviewing time. Of course this threshold is easily customizable.

Several parameters can be taken account.

  • Is the field static? in which case it can be pretty serious if many methods assign it.
  • Does the field type is a value or a reference type? In the case of a reference type, such situation might reveal a potential NullReferenceException scenario luring.
  • Is the field assigned outside from its class. Since another rule prevents fields to be visible from outside its class, this case has been eliminated.
  • How many assigner methods of the parent class are visible outside the class? How many  methods calling directly or indirectly an assigner method are visible outside the class?
  • Can a particular class instance be accessed from multiple threads? in which case the situation sounds pretty alarming.

Not all cases are taken account since static analysis have limits and for code review warning purposes the rule algorithm must be kept easily understandable. Here is the rule code and comments that contain insight about how to generally fix such issue.

I’d be curious to read your suggestions and comments on that.

And here is what it looks like when we run the rules on the NUnit code base:

NDepend CQLinq Rules

Posted in C#, high cohesion, Immutability, Maintainability, measurement | Leave a comment

A Program to explore Code Diff

Recently I answered the question Generating a diff report using NDepend during build on stackoverflow. As explained in my answer, the easy way to go is to follow the documentation on Reporting Code Diff. But for the user that wants something smarter and more programatic, I propose a program based on NDepend.API that:

  • explore two folders containing older and newer versions of assemblies to compare
  • create on-the-fly two NDepend projets and analyze them
  • compare the two older/newer code base snapshots, which lead to the creation of a ICompareContext object
  • use this compare-context object to explore code diff and changes

The power offered by a compare-context object is barely exposed here. We only reports 3 lists of assemblies added/removed/changed, which can already be a useful information. But much more can be done with a compare-context object in hands. I wrote recently about how Code Query LINQ can be used to write rules to detect code regression issue. This post explains how code-diff can be used to detect regression, including API Breaking Changes, loss in code quality, recent code not well covered by tests… All CQLinq queries/rules exposed are based on NDepend.CodeModel API. And thus, they can all be integrated and combined into a program that starts like the one above, and then uses a compare-context object to explore code diff.

A subtle points is that when one writes a CQLinq rule involving code diff, one doesn’t deal with a compare-context object. This is an astute we use to make CQLinq rules/queries as fluent, concise and elegant as possible. Indeed, a CQLinq pre-compiling phase takes care to transform calls to extension methods, into calls involving the compare-context. Hence the following CQLinq rule that reports methods that were complex, and that became even more complex….

…is actually transformed into this expression, where the usage of the compare-context is made obvious.

Thus the logic of this query could be easily integrated into our program this way:

In the real-world, for companies that have build servers and that care for the quality of code delivered, this ability to write custom programs that can be included into the build process chain to focus on the quality of new or refactored code can certainly be a nice option.

But one can also see value in writing a custom-tooling program to be ran on the developer desktop. Such program could also use the special method TryCompareSourceWith() that can open a source file diff tool on the right location of two versions of the same source file (the source file diff tool is the one configured in NDepend > Tools > Options > Source Files Compare Tool). For example the open-source PowerTool Code Review Methods Changed uses this TryCompareSourceWith() method to show diff in code during a code review session. Certainly a more sophisticated program could help reviewing code diff, for example with the ability to assign some users to approve some diff, and persist in a DB such approval or disapproval actions.

Enjoy!

Posted in Change summary, Changes, Code, Code Diff, CQLinq, LINQ | Leave a comment

Ruling Code Quality Regression

A prominent characteristic of the software industry is that products are constantly evolving. All modern development methodologies prone that a product should evolve through small iterations. Internally, development teams are using Continuous Integration servers that shrink increment length to a minimum.

In continuously evolving code bases, chances of regressions are high. A regression occurs when a new bug is introduced inadvertently, because of a change in code. The weapon against regressions lies in automatized and repeatable tests. This is why it is advised to developers to cover their code through unit-tests: not only to make sure that written code works as intended, but also to check as often as possible that in the face of future evolutions, the code won’t be broken inadvertently.

While it is advised to write performance testing to avoid performance regressions, others non-functional requirements, such as code quality, are not checked continuously. As a direct consequence, code quality declines, nobody really cares, and this affect significantly the code maintainability. There are tooling to assess quality, there are tooling that tells the global code quality evolution, but so far there are no tooling that exposes in details code quality regression.

This is a problem I’ve been mulling on during the last years. Through the last release of NDepend, we’ve introduced a multi-purposes code ruling capability through Code Query Linq (CQLinq). A CQLinq query is a C# code query, that is querying the NDepend.API code model.

A wide range of code aspects can be queried, including facilities to query code quality, and the possibility to query the diff between two versions of a code base.  A Code Quality Regression rule, typically relies on both code quality and diff areas on NDepend.API CodeModel.

Dissecting a Code Quality Regression rule

For example below is a default CQLinq code rule that matches methods that were both already complex, and became even more complex, in the sense of the popular Cyclomatic Complexity code metric, presented directly by the interface IMethod. If two snapshots of the code base are currently compared by NDepend, the extension method OlderVersion() returns for a IMethod object, the corresponding IMethod object in the older snapshot of the code base. Of course, for a method that has been added, OlderVersion() returns null, this is why in the query we first filter methods where IsPresentInBothBuilds() and even where CodeWasChanged(), since the complexity of the method won’t change if the method remains untouched.

This query can be edited live in Visual Studio and its result is immediately displayed (here we analyse 2 different versions of the code base of NUnit).

CCIncrease

For any method, a right-click option let’s compare older and newer versions through a source diff tool (I like especially the VS 2012 diff integrated capabilities). And even better, to avoid formatting and comments irrelevant diff, and focus only on code change, a right-click option let’s compare older and newer version decompiled through RedGates .NET Reflector.

DiffSmall

Several Code Quality Regression default Code Rules

A group of default code rules, named Code Quality Regression is proposed. They all relies on the OlderVersion() diff bridge extension method, and then focus on a code quality aspect, like the ratio of code covered by tests, various other code metrics like number of line of code, or also the fact that an immutable type became mutable (which can typically break client code).

Rules

And since CQLinq is really just the C# LINQ syntax you already know, it is easy to define a particular code aspect that shouldn’t be broken, and write a custom regression rule on it.

API Breaking Changes Code Rules

Amongst the various code aspects we, developers, don’t like to see changed, is the visibility of publicly visible code elements. For dev shops responsible for publishing an API, this issue is know as API Breaking Changes and there are some default CQLinq rules to detect that.

APIBreakingChanges

The API Breaking Changes rules source code abide by the same idea. Here we are seeking for types in the older version of the code base, that are publicly visible, and that:

  • have been removed in the newer version of the code base (while their parent assembly is still present)
  • or where their visibility is not publicly visible anymore.

APIBreakingChangesRule

Some facilities to show the result to the user are proposed. For example, when chasing public interfaces broken (i.e with methods added or removed) the query select clause can be refined, to let the user list the culprit added/removed methods:

APIBreakingChangesRule2

Finally

Typically, Code Quality tools are daunting because for any legacy real-world code base, they list thousands of issues . It is neither realist nor productive, to freeze the development for weeks or months to fix all reported flaws.

The ability to focus only on new quality regressions, that appeared recently during the current iteration, comes with the advantage to restraint the number of issues to fix to an acceptable range. This also favors the relevancy of issues to fix, since it is more productive to fix issues on recent code before it is released to production, than to fix issues on stable code, untouched for a long time.

Posted in API usage, CC, Change summary, Code, code base snapshot comparison, Code Diff, Code Query, Code Rule, code structure, CQLinq, Full Coverage, Immutability, Lines of Code, LINQ, Maintainability, NDepend, software metric, Software Quality Measurement | 1 Comment