
Introduction to Refactoring

Evolution.  It is inevitable.  Software succumbs to
evolution, like everything else.  Almost all software goes through
a process of revisions and changes between the time it is born as a wee
little prototype and its inevitable death.  Object-oriented
programming is, IMO, by far the easiest code design to deal with when
you have to make changes during the life of a software product. 
However, just because you are using OOP doesn’t mean you have an
optimal design.  Last week I talked about analyzing code metrics, which play a big part in determining how good, usable and maintainable your design is.

This is where refactoring comes in.  Refactoring can be defined
as a process where developers examine existing code and improve its
design by modifying it, without changing what the code actually does. 
While there is no specific model for the process of refactoring, there
are certainly some common areas where most design problems show up, and
that is where refactoring takes place.

Refactoring is a very in-depth subject, and I’m only going to skim
the top of it for you beginners out there.  I’m not going to get
into test driven development or patterns of software design, but simply
explain four common tasks that fall under refactoring.

Identify methods that can be moved.

This means looking for methods that live in the wrong
class.  Methods should perform a task relevant to the class in
which they belong.  If a method is making frequent calls to
another class, consider moving that method to the other class. 
Look at the following example:

Namespace Refactoring

    Public Class Person

        Private _dateOfBirth As Date
        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

    End Class

    Public Class VoterRegistration

        ' Used to determine if the person is old enough to vote
        Public Function CalculateAge(ByVal voter As Person) As Int32
            Dim years As Int32 = DateTime.Now.Year - voter.DateOfBirth.Year

            If DateTime.Now.Month < voter.DateOfBirth.Month OrElse (DateTime.Now.Month = voter.DateOfBirth.Month AndAlso _
                DateTime.Now.Day < voter.DateOfBirth.Day) Then
                years = years - 1
            End If

            Return years

        End Function

    End Class

End Namespace

Hopefully it is clear that “CalculateAge” is in the wrong class.  It takes a single argument of type “Person” and acts entirely upon that argument.  This is a clear case where a method should be moved, here from the “VoterRegistration” class to the “Person” class, so that the “Person” class looks like the code below:

    Public Class Person

        Private _dateOfBirth As Date
        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

        Public Function Age() As Int32
            Dim years As Int32 = DateTime.Now.Year - _dateOfBirth.Year

            If DateTime.Now.Month < _dateOfBirth.Month OrElse (DateTime.Now.Month = _dateOfBirth.Month AndAlso _
                DateTime.Now.Day < _dateOfBirth.Day) Then
                years = years - 1
            End If

            Return years

        End Function

    End Class
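
Once the age calculation lives on “Person”, the “VoterRegistration” class only has to keep the one rule it actually owns.  Here is a minimal sketch of what that might look like.  It is not from the original program: the “IsOldEnoughToVote” method and the “MinimumVotingAge” constant (set to 18 here) are assumptions I made up for illustration.

```vbnet
Imports System

Namespace Refactoring

    Public Class Person

        Private _dateOfBirth As Date
        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

        ' Same calculation as before, now owned by Person
        Public Function Age() As Int32
            Dim today As Date = DateTime.Now
            Dim years As Int32 = today.Year - _dateOfBirth.Year

            If today.Month < _dateOfBirth.Month OrElse (today.Month = _dateOfBirth.Month AndAlso _
                today.Day < _dateOfBirth.Day) Then
                years = years - 1
            End If

            Return years
        End Function

    End Class

    Public Class VoterRegistration

        ' Hypothetical threshold, chosen for illustration only
        Private Const MinimumVotingAge As Int32 = 18

        ' Hypothetical method: asks the Person for its age instead of computing it
        Public Function IsOldEnoughToVote(ByVal voter As Person) As Boolean
            Return voter.Age() >= MinimumVotingAge
        End Function

    End Class

End Namespace
```

The point of the sketch is the direction of the call: “VoterRegistration” asks the person for its age and no longer reaches into “DateOfBirth” itself.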

 

Identify new methods.

A common mistake among developers is to create methods that are too
complex and try to accomplish too much by themselves.  This leads to methods
that are hard to build upon, maintain and debug.  Cyclomatic complexity is
a common code metric for determining whether a method is too
complex.  Dividing methods into smaller, more easily managed
pieces simplifies the design and improves clarity. 
Repetitive code is another clear indicator of where to create new
methods.

Take a look at the following code:

A method that is reusing code within itself

 

Imports System.Data

Namespace Refactoring

    Public Class Foo

        Public Function GetData() As DataTable

            Dim dt As New DataTable

            Dim dcFirstName As New DataColumn
            dcFirstName.DataType = Type.GetType("System.String")
            dcFirstName.AllowDBNull = False
            dcFirstName.Caption = "FirstName"
            dcFirstName.ColumnName = "FirstName"
            dcFirstName.DefaultValue = Nothing
            dt.Columns.Add(dcFirstName)

            Dim dcLastName As New DataColumn
            dcLastName.DataType = Type.GetType("System.String")
            dcLastName.AllowDBNull = False
            dcLastName.Caption = "LastName"
            dcLastName.ColumnName = "LastName"
            dcLastName.DefaultValue = Nothing
            dt.Columns.Add(dcLastName)

            Dim dcDateOfBirth As New DataColumn
            dcDateOfBirth.DataType = Type.GetType("System.DateTime")
            dcDateOfBirth.AllowDBNull = False
            dcDateOfBirth.Caption = "DateOfBirth"
            dcDateOfBirth.ColumnName = "DateOfBirth"
            dcDateOfBirth.DefaultValue = Nothing
            dt.Columns.Add(dcDateOfBirth)

            Return dt

        End Function

    End Class

End Namespace

Obviously, it’s staring you right in the face that the same block of code is repeated three times here, with only the column name and type changing. This is where you create a new method.

Refactored to create a new method

Imports System.Data

Namespace Refactoring

    Public Class Foo

        Public Function GetData() As DataTable

            Dim dt As New DataTable

            dt.Columns.Add(BuildColumn("FirstName", Type.GetType("System.String")))
            dt.Columns.Add(BuildColumn("LastName", Type.GetType("System.String")))
            dt.Columns.Add(BuildColumn("DateOfBirth", Type.GetType("System.DateTime")))

            Return dt

        End Function

        ' Builds one non-nullable column; only the name and type vary per column
        Private Function BuildColumn(ByVal columnName As String, ByVal columnType As Type) As DataColumn
            Dim dc As DataColumn = New DataColumn
            dc.DataType = columnType
            dc.AllowDBNull = False
            dc.Caption = columnName
            dc.ColumnName = columnName
            dc.DefaultValue = Nothing
            Return dc
        End Function

    End Class

End Namespace

 

Identify inheritance.

Many times, a “Select Case” or “Switch” statement that branches on
the type of something is a strong indicator that the code should be
refactored into an inheritance design.  Look at the following code:

Part of a carnival ride program.

 

Namespace Refactoring

    Public Class Foo

        Private baseTokenAmount As Int32 = 1

        Public Function GetNumberOfRequiredTokens(ByVal typeOfPerson As PersonType) As Int32
            Select Case typeOfPerson
                Case PersonType.Infant
                    Throw New Exception("Too young to ride")
                Case PersonType.Child
                    Return baseTokenAmount
                Case PersonType.Adolescent
                    Return baseTokenAmount * 2
                Case PersonType.Adult
                    Return baseTokenAmount * 3
                Case PersonType.Senior
                    Throw New Exception("Too old to ride")
                Case Else
                    ' Without this branch the function would fall through and silently return 0
                    Throw New ArgumentOutOfRangeException("typeOfPerson")
            End Select

        End Function

        Public Enum PersonType
            Infant
            Child
            Adolescent
            Adult
            Senior
        End Enum

    End Class

End Namespace

The code is pretty clean and simple, but it is hard to build
on and to add other person types to the system.  If you have code
like this everywhere, you’d have to go into a lot of different places
in the program and change every “Select Case” to account for a new
“PersonType” value.  This is an example of code that should be
refactored into inheritance.  The result would be the following code:

Conditional refactored to inheritance and polymorphism.

Public Class Foo

    Public Shared Function GetNumberOfRequiredTokens(ByVal person As IPerson) As Int32
        Return person.GetNumberOfRequiredTokens()
    End Function

End Class

Public Interface IPerson
    Function GetNumberOfRequiredTokens() As Int32
End Interface

Public MustInherit Class Person : Implements IPerson
    Private baseTokenAmount As Int32 = 1
    Protected ReadOnly Property Tokens() As Int32
        Get
            Return baseTokenAmount
        End Get
    End Property

    Public MustOverride Function GetNumberOfRequiredTokens() As Int32 Implements IPerson.GetNumberOfRequiredTokens
End Class

Public Class Child : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens()
    End Function
End Class

Public Class Adult : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens * 3
    End Function
End Class

Public Class Infant : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Throw New TooYoungException
    End Function
End Class

Public Class TooYoungException : Inherits Exception

End Class

Now, to add a new person type, we just add a new class for that person type that inherits from the “Person” class.  We don’t have to change any existing code; we only add new code.
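
To make that concrete, here is a sketch of adding the two person types from the original “Select Case” that the refactored code above leaves out.  The “Adolescent” and “Senior” rules (twice the base amount, and not allowed to ride) come from the original conditional; the “TooOldException” class is my own assumption, named to mirror “TooYoungException”.  The interface, base class and “Foo” are repeated unchanged only so the sketch compiles on its own.

```vbnet
Imports System

' Existing code, repeated unchanged so the sketch stands alone
Public Interface IPerson
    Function GetNumberOfRequiredTokens() As Int32
End Interface

Public MustInherit Class Person : Implements IPerson
    Private baseTokenAmount As Int32 = 1
    Protected ReadOnly Property Tokens() As Int32
        Get
            Return baseTokenAmount
        End Get
    End Property

    Public MustOverride Function GetNumberOfRequiredTokens() As Int32 Implements IPerson.GetNumberOfRequiredTokens
End Class

Public Class Foo
    Public Shared Function GetNumberOfRequiredTokens(ByVal person As IPerson) As Int32
        Return person.GetNumberOfRequiredTokens()
    End Function
End Class

' New code: an adolescent pays twice the base amount
Public Class Adolescent : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens * 2
    End Function
End Class

' New code: a senior is not allowed to ride
Public Class Senior : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Throw New TooOldException
    End Function
End Class

' Hypothetical exception type, mirroring TooYoungException
Public Class TooOldException : Inherits Exception
End Class
```

Calling Foo.GetNumberOfRequiredTokens(New Adolescent()) returns 2, and passing a Senior raises TooOldException.  Neither “Foo” nor the “Person” base class had to be touched.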

 

Fix your variable names.

What do you think td stands for?  You
can probably come up with dozens of different things it could possibly
be.  What if I told you it stands for todaysDate? 
This is a big problem in code: meaningless variable names. 
In Visual Studio, sure, Intellisense tells me it’s a date, but other than
that, what the heck is it for?  Give your variables
meaningful names, and not names where you leave out all the vowels
either.  tdysDt is not very helpful
either.  You are going to save somebody a lot of time and headache
in the future if you use meaningful names.  I’ve even seen people
go back to their own code and have to decipher what the heck td stood for by searching back through the code.
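
Here is a small before-and-after sketch of the difference.  Everything in it is hypothetical (the module, both functions and the “overdue” rule are made up for illustration); the two functions do exactly the same thing, and only the names change.

```vbnet
Imports System

Public Module NamingExample

    ' Hard to read: what are td, dd and d, and what does Chk check?
    Public Function Chk(ByVal td As Date, ByVal dd As Date) As Boolean
        Dim d As Int32 = td.Subtract(dd).Days
        Return d > 0
    End Function

    ' Same logic, with names that say what the values mean
    Public Function IsOverdue(ByVal todaysDate As Date, ByVal dueDate As Date) As Boolean
        Dim daysPastDue As Int32 = todaysDate.Subtract(dueDate).Days
        Return daysPastDue > 0
    End Function

End Module
```

With the second version, nobody has to search back through the code to find out what the arguments are for.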

 

I have explained four common design issues to look for when
beginning your code refactoring.  Once you have done it for a little while,
it becomes easier and you’ll be able to write code that avoids these pitfalls,
rather than having to fix them afterwards by refactoring.


2 Responses to Introduction to Refactoring

  1. Bob, glad to help.

    If you want some more information on inheritance and polymorphism, check out this article:

    http://codebetter.com/blogs/raymond.lewallen/articles/59908.aspx

    And I’m working on another refactoring post :)

  2. Bob says:

    Raymond,

    While I have been coding with .NET since it went GA, I really wasn’t taking full advantage of OOP. Mostly because the samples are always useless.

    I think identifying methods that need to be moved is something I’ve always been pretty good at, and I have also always looked for code in separate methods that could be made into its own method.

    Refactoring to create a new method just slapped a brick up against my skull tho. Not sure why, but THANK YOU for the slap. I just had some code handed to me that was 50 lines. Now it’s a nice, easy-to-read 10 lines.

    Also, the Inheritance and Polymorphism thing has always been a bit vague for me. They are both a lot less vague for me now, both in terms of How, but more importantly Why to use them.

    OK, Now, stop sitting around and write “Refactoring, Part 2”!

    And thanks for the article!
