Writing clean code is crucial to becoming a better software developer. Not only will this practice make your own life easier, it will also help others who read your code and build your reputation as a thorough developer. This article is a high-level run-through of tips and techniques you can use to write clean, understandable code, so that you can maintain a project over a long period without hassle.
To understand clean code, we first need an idea of what bad code is. From there we can identify the problems and work towards solutions.
There is a famous cartoon about defining bad code, first drawn by Thomas Holwerda and later reproduced in Robert C. Martin’s book, in which code quality is measured in WTFs/minute!
Suppose we write a piece of code and our teammates read or review it. If they immediately understand what the code is meant to do, that’s good code. But sometimes we can’t even understand something we wrote ourselves a few months ago, because we ignored understandability practices and wrote bad code!
We developers often say that naming things is the toughest job in writing code, and it is! A variable, function, or class should be self-explanatory to whoever reads the code, which makes naming very important.
Look at this sample code:
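A minimal sketch of such code, with hypothetical data (the `age` and `isActive` fields are assumptions chosen purely for illustration):

```javascript
// Hypothetical data. The names below are deliberately vague.
const list1 = [
  { name: "Ada", age: 36, isActive: true },
  { name: "Linus", age: 17, isActive: true },
  { name: "Grace", age: 45, isActive: false },
];

// What does list2 hold, and what is an "item"? The names give no hint.
const list2 = list1.filter((item) => item.isActive && item.age >= 18);

console.log(list2.map((item) => item.name));
```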
Here, we can see two variables, `list1` and `list2`, but their names tell us nothing about what values they hold or what types they are. Here comes the first tip: use intention-revealing names, which also stops disinformation from spreading through the code. Note also that the arrow function inside the filter takes an argument named `item`. This is not very clear, as we can’t tell what kind of thing this item is!
Let’s refactor it and see how we can improve it:
Distinguishing function names by what they do is very important. Suppose we have a function that returns all the users, and now we need two more functions: one that returns a specific user by id and one that returns the list of users who are admins.
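One possible refactor, sketched with hypothetical data: the list names say what they hold, and the callback argument is called `user` instead of `item`. The fields and the `MINIMUM_ADULT_AGE` constant are assumptions for illustration only.

```javascript
// Hypothetical data; the variable name now says what the list contains.
const users = [
  { name: "Ada", age: 36, isActive: true },
  { name: "Linus", age: 17, isActive: true },
  { name: "Grace", age: 45, isActive: false },
];

// A named constant explains the otherwise "magic" number 18.
const MINIMUM_ADULT_AGE = 18;

// The result name describes the filter, and `user` describes each element.
const activeAdultUsers = users.filter(
  (user) => user.isActive && user.age >= MINIMUM_ADULT_AGE
);

console.log(activeAdultUsers.map((user) => user.name));
```

Nothing about the behavior changed; only the names did, yet the filter can now be read almost like a sentence.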
I have seen so many codebases where I found function names like this:
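The sketch below is a hypothetical illustration of that pattern, not code from a real project; the numbered names and the in-memory `users` array are assumptions.

```javascript
// Hypothetical in-memory data standing in for a real data source.
const users = [
  { id: 1, name: "Ada", role: "admin" },
  { id: 2, name: "Linus", role: "member" },
  { id: 3, name: "Grace", role: "admin" },
];

// Returns all users.
function getUsers() {
  return users;
}

// Returns... one user? The name does not say.
function getUsers1(id) {
  return users.find((user) => user.id === id);
}

// Returns... only admins? You have to read the body to find out.
function getUsers2() {
  return users.filter((user) => user.role === "admin");
}
```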
Only after reading the body of such a function can you say what it is meant to do. Function names like these create ambiguity and leave other developers in the dark: not knowing which functions already exist, they end up writing a new one that does the same thing as `getUsers2`.
Let’s refactor this.
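A sketch of how the three functions might be named instead, assuming a hypothetical in-memory `users` array in place of a real data source:

```javascript
// Hypothetical in-memory data standing in for a real data source.
const users = [
  { id: 1, name: "Ada", role: "admin" },
  { id: 2, name: "Linus", role: "member" },
  { id: 3, name: "Grace", role: "admin" },
];

// Returns every user.
function getUsers() {
  return users;
}

// Returns the single user with the given id, or undefined if none matches.
function getUserById(id) {
  return users.find((user) => user.id === id);
}

// Returns only the users whose role is "admin".
function getAdminUsers() {
  return users.filter((user) => user.role === "admin");
}
```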
You can see that in all of the function names we have added the action, i.e. _get_, as a prefix. Using a verb as the prefix of a function name is a convention we should follow because it says what action the function performs.
There are times when super-coders form a team and use super acronyms in every other thing that they write. When we go through their code we can see some short names and sometimes we can’t even pronounce the names they have written down.
Let’s look into this piece of beautifully created but incomprehensible code:
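A hypothetical sketch of such a class; the exact shape is an assumption, built around the short names `h`, `w`, `a_m`, and `a_ft`:

```javascript
class Box {
  constructor(h, w) {
    this.h = h;
    this.w = w;
    this.a_m = h * w;
    // 10.7639 is the number of square feet in one square meter,
    // but nothing in the names tells the reader that.
    this.a_ft = this.a_m * 10.7639;
  }
}

const box = new Box(2, 3);
console.log(box.a_m, box.a_ft);
```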
Here, we have a class named `Box` (remember that a class name should be a noun, never a verb). In this `Box` class, we have the variables `h`, `w`, `a_m`, and `a_ft`. Do you understand what these mean and what they store? If the answer is yes, it is most likely because this is a small, simple code block, but trust me, if you see something like this in a big project you will feel completely confused.
Let’s refactor the above code and try to understand how we’re doing it:
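One way the refactored class could look. This is a sketch, and names such as `heightInMeters` and `SQUARE_FEET_PER_SQUARE_METER` are assumptions chosen for illustration:

```javascript
// Conversion factor: one square meter is roughly 10.7639 square feet.
const SQUARE_FEET_PER_SQUARE_METER = 10.7639;

class Box {
  constructor(heightInMeters, widthInMeters) {
    this.heightInMeters = heightInMeters;
    this.widthInMeters = widthInMeters;
  }

  // Area of the box face, in square meters.
  getAreaInSquareMeters() {
    return this.heightInMeters * this.widthInMeters;
  }

  // The same area converted to square feet.
  getAreaInSquareFeet() {
    return this.getAreaInSquareMeters() * SQUARE_FEET_PER_SQUARE_METER;
  }
}

const box = new Box(2, 3);
console.log(box.getAreaInSquareMeters(), box.getAreaInSquareFeet());
```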
We have explicitly written down the variable and function names and also added their units. Adding units in the variable names is very important when you have a program that deals with multiple units. It also saves a lot of time while debugging and adding new features.
When writing names, we can follow different conventions for choosing case styles e.g. Snake Case, Pascal Case, Kebab Case, Camel Case. Some languages have their own conventions for using a case style, but we should follow a single case-style throughout the codebase.
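For reference, here is what the four case styles look like side by side. The identifiers are hypothetical, and in real code you would pick one style per kind of name rather than mix them.

```javascript
// snake_case: words joined by underscores (the usual style in Python).
const user_name = "ada";

// camelCase: the usual style for variables and functions in JavaScript.
const userName = "ada";

// PascalCase: the usual style for class names.
class UserAccount {}

// kebab-case: contains hyphens, so it is not a valid JavaScript identifier.
// It typically shows up in file names, URLs, and CSS class names instead.
const cssClassName = "user-card";
```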
Functions are a key component of any programming language, and we write a lot of them in any project. They are sometimes called the building blocks of the business logic inside the codebase. We might come across a long function with a body of hundreds of lines, where it is difficult to catch the gist of what it does. To resolve this issue we follow a simple rule: write small functions that do only one thing.
Sometimes we write functions that fetch multiple data points from the database, filter them on different attributes, sort them, run additional transformations on the data, generate HTML from a template, and so on. When this occurs, we can split the work into several small functions. This also creates a good separation of concerns among the functions.
Let’s look into this example function:
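A condensed, hypothetical sketch of such a function. The `db` and `jobQueue` objects are in-memory stand-ins (assumptions) for a real database and job queue, so that the example runs on its own.

```javascript
const BATCH_SIZE = 2;

// Hypothetical in-memory stand-ins for a database and a job queue.
const db = {
  users: [
    { id: 1, type: "merchant" },
    { id: 2, type: "customer" },
    { id: 3, type: "merchant" },
  ],
  messages: [
    { id: 10, userId: 1, status: "pending" },
    { id: 11, userId: 2, status: "pending" },
    { id: 12, userId: 3, status: "pending" },
    { id: 13, userId: 1, status: "sent" },
    { id: 14, userId: 3, status: "pending" },
  ],
};
const jobQueue = [];

// Everything happens in one place: lookup, filtering, batching,
// queueing, and status updates.
function sendPendingMessages() {
  const merchantIds = [];
  for (const user of db.users) {
    if (user.type === "merchant") merchantIds.push(user.id);
  }
  const pending = [];
  for (const message of db.messages) {
    if (message.status === "pending" && merchantIds.includes(message.userId)) {
      pending.push(message);
    }
  }
  for (let i = 0; i < pending.length; i += BATCH_SIZE) {
    const batch = pending.slice(i, i + BATCH_SIZE);
    jobQueue.push(batch.map((message) => message.id));
    for (const message of batch) {
      message.status = "sent";
    }
  }
}

sendPendingMessages();
```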
Reading the function, you might lose track of what is happening inside it. Here, we fetch all the pending messages from the database for users who are merchants, create batches based on `BATCH_SIZE`, push these batches to the job queue, and finally update the status of the sent messages.
It works, but it looks really cluttered, right? Let’s break this long function into multiple small pieces where each of them performs a single task.
This is how the refactored function will look:
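A sketch of one possible decomposition. The helper names, and the in-memory `db` and `jobQueue` stand-ins for a real database and job queue, are assumptions for illustration.

```javascript
const BATCH_SIZE = 2;

// Hypothetical in-memory stand-ins for a database and a job queue.
const db = {
  users: [
    { id: 1, type: "merchant" },
    { id: 2, type: "customer" },
    { id: 3, type: "merchant" },
  ],
  messages: [
    { id: 10, userId: 1, status: "pending" },
    { id: 11, userId: 2, status: "pending" },
    { id: 12, userId: 3, status: "pending" },
    { id: 13, userId: 1, status: "sent" },
    { id: 14, userId: 3, status: "pending" },
  ],
};
const jobQueue = [];

function getMerchantIds() {
  return db.users.filter((user) => user.type === "merchant").map((user) => user.id);
}

function getPendingMessagesForUsers(userIds) {
  return db.messages.filter(
    (message) => message.status === "pending" && userIds.includes(message.userId)
  );
}

function splitIntoBatches(items, batchSize) {
  const batches = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

function enqueueBatch(batch) {
  jobQueue.push(batch.map((message) => message.id));
}

function markAsSent(messages) {
  for (const message of messages) {
    message.status = "sent";
  }
}

// The top-level function now reads like a summary of the steps.
function sendPendingMerchantMessages() {
  const merchantIds = getMerchantIds();
  const pendingMessages = getPendingMessagesForUsers(merchantIds);
  const batches = splitIntoBatches(pendingMessages, BATCH_SIZE);
  for (const batch of batches) {
    enqueueBatch(batch);
  }
  markAsSent(pendingMessages);
}

sendPendingMerchantMessages();
```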
From the look of it, we can identify what these separate functions are doing, making it easy for developers to narrow down the scope while they refactor or debug around this function.
Suppose a function does multiple things and we cannot separate them because they are closely related. Such functions need proper indentation and blocking: grouping related statements into visually distinct blocks shows which block does what, and gives some of the separation we talked about above.
Function arguments are another important part of clean code. A helpful convention is to use no more than two arguments in a function; if you need more, pass an object or a dictionary instead. That way, anyone can tell from the keys of the object which data points they are sending to the function.
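A small sketch of the difference, using a hypothetical `createUser` function (its fields and defaults are assumptions):

```javascript
// Positional arguments: at the call site, which value is which?
function createUserPositional(name, email, role, isActive) {
  return { name, email, role, isActive };
}
createUserPositional("Ada", "ada@example.com", "admin", true);

// A single object argument: the keys document the call,
// and optional fields can have defaults.
function createUser({ name, email, role = "member", isActive = true }) {
  return { name, email, role, isActive };
}

const user = createUser({
  name: "Ada",
  email: "ada@example.com",
  role: "admin",
});
```

With the object form, the order of the fields no longer matters, and adding a new optional field does not break existing callers.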
We also follow another rule called DRY (Don’t Repeat Yourself) while writing functions. The mental model is simple: whenever we are about to write a block of code that already exists somewhere in the codebase, we should extract that block into a function and use it in both places.
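As a minimal sketch, assume price formatting was previously copied into two rendering functions. Extracting a hypothetical `formatPrice` helper leaves a single place to change:

```javascript
// The shared logic lives in exactly one function.
function formatPrice(amountInCents) {
  return `$${(amountInCents / 100).toFixed(2)}`;
}

function renderCartLine(item) {
  return `${item.name}: ${formatPrice(item.priceInCents)}`;
}

function renderInvoiceTotal(items) {
  const totalInCents = items.reduce((sum, item) => sum + item.priceInCents, 0);
  return `Total: ${formatPrice(totalInCents)}`;
}

const cart = [
  { name: "Keyboard", priceInCents: 1999 },
  { name: "Cable", priceInCents: 501 },
];
console.log(renderCartLine(cart[0]));
console.log(renderInvoiceTotal(cart));
```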
Comments are like butter on bread: without it, the bread is drier and not as enjoyable to eat. If we put relevant comments in each module or function we write, then whoever jumps into that part of the code can understand the intention behind it. Furthermore, if a change is required, they can make it more confidently because they understand what is going on inside!
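A short sketch of a comment that records intent rather than restating the code. The lockout policy here is a made-up assumption:

```javascript
const MAX_LOGIN_ATTEMPTS = 5;

function isAccountLocked(failedAttempts) {
  // Lock on the fifth failure itself, not after it. The comparison is
  // intentionally >= rather than >, so please keep it when refactoring.
  return failedAttempts >= MAX_LOGIN_ATTEMPTS;
}
```

The code already says what happens; the comment explains why, which is the part a later reader cannot recover from the code alone.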
There are many different types of comments for different occasions, let’s check those out:
Other than these, we sometimes write comments to generate documentation for our projects automatically. Generating documentation from the code is very common in the open-source ecosystem, as it helps contributors and users understand what the code does just by reading the documentation.
Whatever development platform we use, it has guidelines for code formatting that we should follow to keep the codebase consistent. This matters most in a big team, where multiple tracks may be working on the same repository: having code conventions applied in the repo helps keep the code clean.
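In JavaScript, one common format for such comments is JSDoc. The sketch below documents a hypothetical helper with the `@param` and `@returns` tags, which documentation generators and many editors can read:

```javascript
/**
 * Splits a list into consecutive batches of at most `batchSize` items.
 *
 * @param {Array} items - The items to split.
 * @param {number} batchSize - Maximum number of items per batch; must be positive.
 * @returns {Array<Array>} The batches, in the original order.
 */
function splitIntoBatches(items, batchSize) {
  const batches = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}
```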
Linting, a form of static code analysis, is very helpful because it automatically checks the code against the rules we set and alerts us if we break any of them. For JavaScript we use ESLint; Prettier is also a popular code formatting tool in this ecosystem. In Python, we use autopep8 or black as a formatter.
I believe that linting and formatting should be done automatically, because we are human and make mistakes. There is always a chance of pushing unformatted code with linter errors to the remote repository. Git hooks can be very helpful in avoiding this, as they allow us to run linting and formatting automatically before committing or pushing any code.
Finally, writing clean and understandable code depends on the mindset of the developer. If we simply follow the boy scout rule, i.e. “Leave your code better than you found it”, we eventually end up with a codebase that is much better than before and that people love to work on. Making an effort today to improve your ability to write clean code will certainly pay off in the future. I hope that this article can play a helpful role in your software development journey.
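As a rough sketch of what "the rules that we set" can look like, here is a small ESLint configuration in the flat-config shape (an array of objects with `files` and `rules`). The three rules are real ESLint core rules, but choosing them, and their severities, is an assumption for illustration.

```javascript
// Sketch of the contents of an eslint.config.js file.
const eslintConfig = [
  {
    // Apply these rules to every JavaScript file in the project.
    files: ["**/*.js"],
    rules: {
      eqeqeq: "error",            // require === and !== instead of == and !=
      "no-unused-vars": "warn",   // flag variables that are declared but never used
      "prefer-const": "error",    // require const when a variable is never reassigned
    },
  },
];

// In a real project this array would be exported from eslint.config.js,
// for example with `export default eslintConfig;`.
```

With a file like this in place, running `npx eslint .` checks the project against these rules.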