Channels ▼

Arnon Rotem-Gal-Oz

Dr. Dobb's Bloggers

Code Readability: Documentation vs. Refactoring

June 18, 2008

Technical documentation of the code/project  seems like a worthy goal. After all,  if we aren't living in a vacuum someone will need to understand our code/design and maintain it, use it to develop new stuffm, etc.


On the other hand writing documentation can get tedious, boring, hard to maintain, and whatnot -- so you run across all sorts of ways to deal with it.

Consider, for example, the following:
  • On a recent article in IBM's DeveloperWorks Paul Duvall suggests you use automation to generate various documentation artifacts like UML diagrams based on the source code (using ANT tasks and UMLGraph), ERDs (with SchemaSpy) etc.
  • Another example, is a .Net tool called GhostDoc. This tool automagically generates C# XML documentation comments ("///") e.g. (all comments below are ghostdoc work):
<span class="rem">/// <summary></span><br /><span class="rem">/// Appends the HTML text.</span><br /><span class="rem">/// </summary></span><br /><span class="rem">/// <param name="htmlProvider">The HTML provider.</param></span><br /><span class="kwrd">public</span> <span class="kwrd">void</span> AppendHtmlText( IHtmlProvider htmlProvider )
<span class="rem">/// <summary></span><br /><span class="rem">/// Adds the specified item.</span><br /><span class="rem">/// </summary></span><br /><span class="rem">/// <param name="item">The item.</param> </span><br /><span class="kwrd">public</span> <span class="kwrd">void</span> Add( <span class="kwrd">string</span> item )<br />
<span class="rem">/// <summary></span><br /><span class="rem">/// Determines the size of the page buffer.</span><br /><span class="rem">/// </summary></span><br /><span class="rem">/// <param name="initialPageBufferSize">Initial size of the page buffer.</param> </span><br /><span class="rem">/// <returns></returns> </span><br /><span class="kwrd">public</span> <span class="kwrd">int</span> DeterminePageBufferSize( <span class="kwrd">int</span> initialPageBufferSize )<br /><br />[from <a href="http://dotnetslackers.com/articles/vs_addin/Introduction_ghostdoc.aspx">Introduction to GhostDoc</a>]<br />

 What I think is that while both of these efforts can help satisfy a customer-specific requirement for "comprehansive documentation"* they have very little value in making anyone understand anything about your code. UML diagrams can only help if they are created at a higher level of abstraction than the code (which means they'd be hand-crafted) and if GhostDoc can understand your code enough to create anything useful, it means that your method and parameter names are self-descriptive anyway.

In a previous post I mentioned that I prefer to rely on tests, short methods and meaningful names for readability. I'll talk about tests in another post. For this installment let's look at the other two. I think it would be better demonstrated by an example.


Consider the following horror of a method:


       public void HandleWithFrame(FrameProp Frame)
        {

         int FreeProcessNum = 0;
         int FreeProcessId = 0;

         if (Frame != null)
         {
             rwl.EnterWriteLock();

             if (m_WaitingFrame.ContainsKey(Frame.m_SessionId))
                 m_WaitingFrame[Frame.m_SessionId].m_Frame = Frame;
             else
                 m_WaitingFrame.Add(Frame.m_SessionId, new WaitingFrame(Frame));

             IncrementPriority();

             rwl.ExitWriteLock();
         }

           rwl.EnterUpgradeableReadLock();
           foreach (var keyValuePair in m_ProcessMap)
          {
            if (keyValuePair.Value.m_Busy == false)
            {
               FreeProcessId = keyValuePair.Key;
               ++FreeProcessNum;

               if (FreeProcessNum > 1)
                     break;
            }

          }

           if (FreeProcessNum == 0)
           {
               rwl.ExitUpgradeableReadLock();
               return;
           }
       

           if (FreeProcessNum >= 2  )
           {

               rwl.EnterWriteLock();
            
                m_ProcessMap[FreeProcessId].SendFrame2CV(Frame);
              
                m_WaitingFrame[Frame.m_SessionId].m_NumProcess += 1;
                m_WaitingFrame[Frame.m_SessionId].m_Priority = 0;
                m_WaitingFrame[Frame.m_SessionId].m_Frame = null;

                m_ProcessMap[FreeProcessId].m_Busy = true;
              
 
              rwl.ExitWriteLock();
              rwl.ExitUpgradeableReadLock();
              return ;
          }


        


          WaitingFrame MaxPriority = new WaitingFrame();
          MaxPriority.m_NumProcess = 1000;
          MaxPriority.m_Priority = -1;

       
            foreach (var Item in m_WaitingFrame)
            {
                 if (Item.Value.m_NumProcess < MaxPriority.m_NumProcess)
                      if (Item.Value.m_Priority > MaxPriority.m_Priority)
                          MaxPriority.m_Frame = Item.Value.m_Frame;
            }
       
         rwl.EnterWriteLock();

       //  Console.WriteLine("ProcessId={0} Assign", FreeProcessId);
          m_ProcessMap[FreeProcessId].SendFrame2CV(MaxPriority.m_Frame);
          m_ProcessMap[FreeProcessId].m_Busy = true;

        
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_NumProcess += 1;
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_Priority = 0;
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_Frame = null;
        
        
        rwl.ExitWriteLock();
        rwl.ExitUpgradeableReadLock();
        return ;

       }


This class needs a lot of explanations if you want to understand what exactly going on here. So you can set yourself up to writing a lot of comments and trying to figure things out -- or assuming this class was fully tested (which it wasn't, but that's another story)  try to refactor it until we get something meaningful (I will omit the added tests though for brevity).

Step 1: First If

It seems that when we have a first frame we want to keep it aside, so let's extract method all the if code to EnqueueFrame

           if (Frame != null)
               EnqueueFrame(Frame);

Okay, so now we look at EnqueueFrame. The code we see here talks with the  m_WaitingFrame private member (which is a Dictionary of <Guid, WaitingFrame>();. The first thing we'll do is to rename it to FramesQueue. Now the more interesting thing is that the code here has to do with managing this FramesQueue and isn't directly related to the containing class.


We can either subclass the Dictionary class or we can add an extention method to Dictionary<Guid,WaitingFrame> to handle this for us. We'll do that, and then refactor the If again:

           if (Frame != null)
           {
               rwl.EnterWriteLock();
               FramesQueue.Enqueue(Frame);
               rwl.ExitWriteLock();
           }

and Enqueue looks like (in a separate interanal static class):

        public static void Enqueue(this Dictionary<Guid, WaitingFrame> queue, FrameProp Frame)
        {

            if (queue.ContainsKey(Frame.m_SessionId))
                queue[Frame.m_SessionId].m_Frame = Frame;
            else
                queue.Add(Frame.m_SessionId, new WaitingFrame(Frame));

            queue.Prioritize();

        }


The advantage of what we've achieved thus far is both better OO design (separation of concerns) and enhanced readability by using intention revealing names and notation.

Step 2: The foreach loop


Again we'll start with Extract Method, we can now remove the definition of FreeProcessNum from the beginging and we get:

var FreeProcessNum = GetFreeProcessNum(ref FreeProcessId);

but this code is not really clear. For one thing, we have to rename FreeProcessNum to FreeProcessesCount to make it more legible. and we have an ugly and hard to follow ref variable. It is probably better to apply the Single Responsibility principle and  seperate this into two distinct methods,  so we'd get:
var FreeProcessesCount = ProcessesList.CountFree();
var FreeProcessId = ProcessesList.GetNextFreeId(); // we don't really need/want the ID but we'll fix that later

(as in the previous example we add extension methods to ProcessesList to make the code more intention revealing and for better seperation of concerns).


All we want to do in CountFree is count how many proccesses are not marked as busy so we can rewrite:

var FreeProcessNum = 0;
foreach (var keyValuePair in ProcessesList)
{
        if (keyValuePair.Value.m_Busy == false)
        {
               FreeProcessId = keyValuePair.Key;
               ++FreeProcessNum;

               if (FreeProcessNum > 1)
                       break;
         }

 }
  return FreeProcessNum;

into
        public static int CountFree(this Dictionary<int, ProcessStatus> processesList)
        {
            return processesList.Count(item => item.Value.m_Busy == false);
        }

Thank you MS for adding Linq and Lambda expressions :). The same can be done for GetNextFreeId
 

Step 3: The Two Ifs and the Rest


Taking a deep look at the code we can see that the rest of the method tries to find a free processor and if there are enough processors send the frame, otherwise it should send the top prioritized. We can also spot a bug here that two different threads can get the same Processor and then try to send a message to it one after the other. Another potential bug comes from the way the maximal priority is found. There's an assumption there that the max priority would be 1000. While it isn't likely to happen it is still a hard coded assumption.


Anyway, if we continue and apply the same principles that got us here (Single Responsibility Principle, Don't Repeat Yourself, Intention revealing methods, coherence and opening classes to add specific functionaliy) we get the original method to look like the following:

       public void ProcessFrame(FrameProp nextFrame) //was HandleWithFrame
        {
           rwl.EnterWriteLock();
           try
           {
               if (nextFrame != null)
                   FramesQueue.Enqueue(nextFrame);

               TryDispatchTopFrame();
           }
           finally
           {
               rwl.ExitWriteLock();
           }

       }

Compare this with the original method....


Also note that it isn't that the functionality disappeared -- it is just neatly distributed and grouped in short-related methods in related classes e.g.

    internal static class FramesQueueExtnesions
    {
        public static void Enqueue(this Dictionary<Guid, WaitingFrame> queue, FrameProp Frame)
        {

            if (queue.ContainsKey(Frame.m_SessionId))
                queue[Frame.m_SessionId].m_Frame = Frame;
            else
                queue.Add(Frame.m_SessionId, new WaitingFrame(Frame));

            queue.UpdatePriorities();

        }

        public static void ResetSlot(this Dictionary<Guid, WaitingFrame> queue,Guid slotId)
        {
            queue[slotId].m_NumProcess += 1;
            queue[slotId].m_Priority = 0;
            queue[slotId].m_Frame = null;
        }


        public static void UpdatePriorities(this Dictionary<Guid, WaitingFrame> queue)
        {

            foreach (var Item in queue)
            {
                if (Item.Value.m_Frame != null)
                    Item.Value.m_Priority += 1;
            }


        }
        public static FrameProp FindTopPrioritized(this Dictionary<Guid,WaitingFrame> queue)
        {
            var maxPriority=  queue.Max(item => item.Value.m_Priority);
            return queue.First(item => item.Value.m_Priority == maxPriority).Value.m_Frame;
        }
    }

You should note that this is not the end of the refactoring (e.g. we should still handle the WaitingFrame, FrameQueue and the ProcessesList which we are called here) we just took a look at a single method.

While there might still be a need for an occasional explanatory remark , this little exercise demonstrates that we can gain a lot in the way of clarity by refacroting code and keeping up a few simple principles. Oh yeas... and what we got at the end of the process is not just readable code, but also a more maintainable, better designed code that can move forward and evolve further as the system changes.



* I don't underestimate the value of generating full documentation when there's such a requirement from a customer. I would prefer to convince a customer that having such a Write-Only document is a complete waste of time  and trees but sometimes you can't help it. Generating documents in these situations can be a life-saver.

 

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video