Don't Pollute Your Team Project With Bad Comments

Take the responsibility of keeping your code base clean — especially for your team projects

Yong Cui

Level Up Coding

· ~7 min read · June 11, 2020 (Updated: December 14, 2021)

Advance Your Programming Skills

When we learn to code, many seasoned programmers tell us that we should document our code well. Many of us follow this advice by writing comments alongside any code that we write. Sometimes, we do find some comments handy when we revisit our code months or years later. After all, you know your own commenting style, and your own comments can remind you, more or less, of what you did.

However, there is a catch: you had to comment your code precisely when you first wrote it. How many times have you found that some of your comments were confusing or inconsistent with the code? As new programmers receive better education and training, I've seen a trend of projects adopting stricter team rules for writing comments, which is a good thing overall.

However, this causes a real problem at the other extreme: superfluous, dogmatic comments that occupy every corner of the code base. If you've ever worked on team projects, have you noticed that the code is rarely under-documented? Instead, it's often over-documented.

We should never underestimate the necessity of good comments in our code. For instance, some comments are legal notices for intellectual property protection. Some comments clearly explain why certain operations were chosen over other, seemingly more common ones. Other comments provide a high-level overview of the module at the top of the file.
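As a minimal sketch of a comment that explains why one operation was chosen over a more common one, consider the snippet below. The function name `total_weight` and the scenario are assumptions made up for illustration; only `math.fsum` is a real standard-library call.

```python
import math


def total_weight(weights):
    # Hypothetical example: we use math.fsum instead of the built-in sum()
    # on purpose. fsum tracks intermediate partial sums, so floating-point
    # rounding errors do not build up when adding many small values.
    return math.fsum(weights)
```

The comment earns its place because it records a decision that the code alone cannot convey: a reader who wonders why `sum()` wasn't used gets the answer right away.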

The above are examples of necessary and informative comments that we should appreciate and strive to write in our projects. On the other hand, we should avoid polluting our own or shared code bases with unnecessary and misleading comments, which I collectively refer to as bad comments. This article discusses some categories of bad comments and offers suggestions for getting rid of them.

Distracting and Noisy Comments

The most significant form of bad comment is the redundant one, which distracts us by creating needless obstacles that prevent us from reading the code continuously.

class Coordinate:
    # The latitude of the coordinate
    latitude = 0
    # The longitude of the coordinate
    longitude = 0
# the chosen user
chosen_user = ...
# the current user
current_user = ...

These aren't totally made-up examples. I've seen similar ones on more than a few occasions. Do you find these comments distracting? Simply compare them with the cleaned-up version. I doubt that removing these redundant comments affects your understanding of the code. Quite the opposite: I believe that the following version is much more readable.

class Coordinate:
    latitude = 0
    longitude = 0
chosen_user = ...
current_user = ...

Tip # 1. Remove noisy comments, which only serve as obstacles that harm code readability.

Duplicate Information

There is another form of redundant comment. It shows up a lot when you haven't written a good function and you try to use comments to explain your code. Let's see a trivial example.

# Calculate the total amount that a customer needs to pay
# sub is the initial subtotal
# r is the sales tax rate, by default it's 8%
# c is the coupon, which will take x dollars from the price
# Important: to use the coupon, we need to validate it
def total(sub, r=0.08, c=None):
    if not c:
        return sub*(1+r)
    else:
        if validate(c):
            return (sub-c)*(1+r)
        else:
            raise ValueError("Invalid or Expired Coupon")

The function in the above code snippet simply calculates the total amount of money that a customer needs to pay, taking into account the sales tax and the coupon used. There are several lines of comments above the function. However, most of them are redundant. Compare it to the following.

# Calculate the total amount that a customer needs to pay
def calculate_total_amount(subtotal, tax_rate=0.08, coupon=None):
    tax_adjusted_factor = 1+tax_rate  
    if not coupon:
        return subtotal*tax_adjusted_factor
    else:
        if check_coupon_validity(coupon):
            return (subtotal-coupon)*tax_adjusted_factor
        else:
            raise ValueError("Invalid or Expired Coupon")

There are two things to highlight by comparing the updated version with the previous one.

We got rid of the duplicate information that simply describes the same operations the function performs. Such comments violate the DRY principle (i.e., Don't Repeat Yourself). Some people may think that DRY applies only to the code you write, which isn't true. It applies to comments, too. In this case, whenever you need to change your function, you need to change your comments as well. For example, what if your local government changes the sales tax rate?
We changed the names of the function and its related variables. These names are more sensible and thus clearly reflect what data they hold or what jobs they do. We didn't substantially change the function's operations, but with the updated names, you'll find it much easier to read. With the previous version, you may have to look back and struggle to figure out what these variables are.

Tip # 2. Fix your code and remove duplicate comments that violate the DRY principle.
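One way to handle the tax-rate question raised above is to keep the rate in a single named constant, so that a rate change touches one line of code and no comments. The following is only a sketch: the constant `DEFAULT_SALES_TAX_RATE` and the helper `apply_sales_tax` are hypothetical names that don't appear in the original example.

```python
# Hypothetical constant: the one place to update if the tax rate changes.
DEFAULT_SALES_TAX_RATE = 0.08


def apply_sales_tax(amount, tax_rate=DEFAULT_SALES_TAX_RATE):
    # The names carry the meaning, so no comment restating the rate is needed.
    return amount * (1 + tax_rate)
```

Because no comment mentions "8%", nothing can drift out of sync with the code when the value changes.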

One quick note is that if your functions are to be public APIs for external use, you may want to implement more official documentation (e.g., Python's docstrings). The discussion of redundant comments doesn't apply to these legitimate comments, as I pointed out earlier.
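For a public function, a docstring might look like the sketch below. It assumes, as in the earlier snippet, that the coupon is a dollar amount. The body of `check_coupon_validity` is a made-up stand-in so that the example runs, not a real validation rule.

```python
def check_coupon_validity(coupon):
    # Hypothetical stand-in: treat any positive dollar amount as valid.
    return coupon > 0


def calculate_total_amount(subtotal, tax_rate=0.08, coupon=None):
    """Return the total amount that a customer needs to pay.

    Args:
        subtotal: Pre-tax price of the order, in dollars.
        tax_rate: Sales tax rate as a fraction (0.08 means 8%).
        coupon: Optional dollar amount to subtract before tax.

    Returns:
        The subtotal, less any coupon, with sales tax applied.

    Raises:
        ValueError: If the coupon is invalid or expired.
    """
    tax_adjusted_factor = 1 + tax_rate
    if coupon is None:
        return subtotal * tax_adjusted_factor
    if not check_coupon_validity(coupon):
        raise ValueError("Invalid or Expired Coupon")
    return (subtotal - coupon) * tax_adjusted_factor
```

Unlike the inline comments removed earlier, the docstring serves callers who never read the function body, and tools such as `help()` can display it.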

Commented-out Code

Commented-out code is another heavy pollutant that you can find in your code base. Its existence isn't necessarily a bad sign, because it suggests that the author thought carefully about the problem before finally settling on the best solution and commenting out the old code.

But commented-out code is just as noisy as the distracting comments that we discussed at the beginning. It harms the code's readability. We have to stop for a moment and, out of curiosity or for whatever reason, read the commented-out code. Doesn't that waste our time and cause confusion?

What's worse about commented-out code is the uncertainty about whether we can delete these lines. Is it possible that the author will come back and reactivate them? Or is the updated code the final version, so that we can safely delete the commented-out lines? Either way, this uncertainty is counterproductive.

We all know that version control tools such as Git, along with hosting platforms such as GitHub, have become very handy, allowing us to track changes in greater detail than ever before. So there is little reason to leave commented-out code in our code base. Whenever you finish a block of code that contains commented-out lines, you should remove those lines as soon as possible. You may excuse yourself by saying that you'll do it later when you come back. But do you know that, most of the time, later means never?

Tip # 3. Don't leave commented-out code in your code base.

Ambiguous Comments

Ambiguous comments are those that contain unclear, inconsistent, misleading, or even wrong information. They can do the most damage to the code base. We all agree that the purpose of our commenting is to help us understand the code. However, when there are mismatches between comments and code, which one can you believe?

Certainly, we should put more weight on the code, because the code is the flesh and blood of our software. But is it possible that the comments describe the desired functionality and the code isn't yet doing the job? Consider the following trivial example.

# This function checks the validity of the coupon.
# returns 0 when we find no problems
def check_coupon_validity(coupon):
    # search the database
    if valid_coupon:
        return 1
    else:
        return 0

In the above code, the comments say that when we find no problems with the coupon, we should expect the function to return 0. However, the code appears to tell a different story. When the coupon is found to be valid, the function returns 1. Which one is correct, the comment or the code? We can't know the answer unless we read other related functions to investigate how the returned value is used. In the end, we may, at best, infer which of the two is correct. Otherwise, we have to talk to someone who knows the data before we can be certain.

Thus, we should be extremely cautious when we write any comments, and we need to make sure that the comments precisely describe the pertinent code without any ambiguity. In addition, to be professional, whenever we see unclear or incorrect comments in our code base, we should take a step further and resolve the inconsistency. If a bad comment stays in the code base, it can cause more confusion in the future, when we are further removed from the source of the problem. So address it as early as possible.

Tip # 4. Don't leave ambiguous comments in your code base. See any? Address them.
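One possible way to remove this kind of ambiguity is to return a boolean from a function whose name reads like a yes/no question, so that no comment has to explain what 0 or 1 means. This is only a sketch under stated assumptions: the name `is_coupon_valid`, the set `ACTIVE_COUPON_CODES`, and the idea of coupons as string codes are all hypothetical, and the set stands in for the database lookup.

```python
# Hypothetical in-memory stand-in for the coupon database.
ACTIVE_COUPON_CODES = {"WELCOME10", "SPRING5"}


def is_coupon_valid(coupon_code):
    """Return True if the coupon code is currently active, else False."""
    return coupon_code in ACTIVE_COUPON_CODES
```

A call site such as `if is_coupon_valid(code):` then reads unambiguously, and there is no separate comment about return values that could contradict the code.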

Conclusions

Comments are a necessary evil in any code base. When we have a minimal number of good comments, our code is easy to read and maintain. However, when our code is polluted with bad comments, especially unclear and misleading ones, our productivity drops significantly, because we have to spend considerable time resolving the inconsistencies. Thus, we should keep an eye on every comment we write in our code base and consider whether it falls into one of the categories of bad comments discussed above. If all teammates take on this responsibility, our code base will stay strong and clean.
