TL;DR After introducing what complexity is, how it can be measured, and how to monitor it in Ruby – I’ll get to the difficult part: how to reduce it in Ruby. The post is concluded with some thoughts and recap.
Intro
Even Vader does not know what are those buttons for.
Developing a feature requires some attention on code quality – for the sake of reliability, efficiency, security, and maintainability. If you have been following us, you know that we are not satisfied with code that just works – and nor should you be. One of the ways to achieve good quality is keeping the complexity in check. After introducing what complexity is, how it can be measured, and how to monitor it in Ruby – I’ll get to the difficult part: how to reduce it in Ruby.
Disclaimer: As complexity is not a simple topic and deserves to be illustrated – this post is a bit longer than usual – I hope you’ll bear with me. Bookmark it, you can always read it in pieces.
Easy, isn’t it? Eek!
Complexity is a term covering several issues – including time and space complexity of an algorithm. In this case, we’re talking about code complexity i.e. how complex it is to understand and modify your code. For a machine, it makes no difference. For another human being, the case is entirely different. Complex code is hard to read – hard to modify – and hard to debug (yep, we’re writing code for humans). It impacts the maintainability and reliability of the software directly .
As code complexity has many facets, it is not so easy to summarize it with a single measure. Several measures have been developed, each one allowing to grab one aspect of complexity.As an introduction, we’ll be covering the Cyclomatic Complexity metric and the ABC metric.
Cyclomatic Complexity
The most known metric is the cyclomatic complexity indicator. It measures how many paths of execution are possible through the code. If there is no branch (neither loop nor exception rescue), then there is only one path (i.e. the code is purely sequential) and the complexity equals to 1.
def hello_world puts "Hello world" end
Each time you add a branch or a loop, you create new paths – and complexity increases. For instance, adding one if-clause results in two paths, and a complexity equaling 2.
def hello(name) if name puts "Hello #{name}" end end
Its evaluation is easy, especially when considering only one possible exit point, i.e. one return:
NB + 1
where NB is the number of branch and loop keywords. If you consider multiple exit points, it is:
NB - NR + 2
where NR is the number of returns. For instance, if we’re rewriting the previous example with a guard clause, we have now a complexity equals to 1 as we have 2 exit points.
def hello(name) return unless name puts "Hello #{name}" end
By nesting more and more branches, we’ll get the Arrow Anti-Pattern with a high cyclomatic complexity.
Really sizing up its meaning needs some experience. However, according to the community and according to some studies, 11 seems to be a good sanity threshold at a module/class level. This is not just a number, it has been observed that defects and cyclomatic complexity are correlated!
The ABC Metric
ABC is not per se a complexity metric. It is a measure of number statements and structures. Still, it is accepted that functional size has an impact on the complexity: a method with a lot of instructions, assignments, calls, branches, and other structures will be more complex and harder to understand – due to its size. ABC is computed as a function distance of the number of: assignments (A), branches (B), and conditions (C):
|ABC| = sqrt( A * A + B * B + C * C )
Each language having its peculiarities, the counting of A, B, and C must be adjusted accordingly. Basically, it consists in deciding to which category A, B, and C a keyword and other language structure belongs – and how to count it. It’s a good complementary metric to cyclomatic complexity as it takes into account more than just branches.
For instance, if we count unless for 1 as branch, assignment for 1, and function call for 1 as branch, the following code has a ABC metric of sqrt(1² + 2² + 0²) = ~2.2:
def hello(name) return unless name greetings = "Hello #{name}" puts greetings end
You-hoo! Gi’me a ruler!
“Give them a tool, the use of which will lead to new ways of thinking.” Richard Buckminster Fuller
Metrics without measuring tools – rulers – are useless. In Ruby, Flog or Excellent are among the best known and most actively used tools.
Flog
Flog is a kind of extended ABC metrics tool for Ruby. The C stands for call - not condition – in this case – as Flog puts a particular accent on it. For instance, a call to metaprogramming methods has a higher weight than to a normal method.
As with every metric, you need some experience to really weigh a Flog score for a single method. The range [20, 60] is considered as a gray zone, needing some personal contemplation: it could be due to the business domain – or it’s a true candidate for refactoring.
Excellent
Excellent is not a single “ruler”, it’s a set of checks and metrics that you can easily apply together on a codebase. As for complexity metrics, it includes ABC metric, an approximated cyclomatic complexity, and a Flog-like score.
Aye! It’s high, but how to reduce it?
A metric (tool) is a way to check and evaluate how critical the situation is. It will get your attention to some piece of code and push you to ask yourself: do I need to refactor? Using tools like Flog and Excellent, this is only the easy part.
A score – while interesting – does not give any clue how to fix your code. One cannot say: when you have a score A you need to do this, and with a score B you have to do that. You need to investigate the situation to understand what’s wrong and how you can refactor your code.
To address these doubts it might be useful following common guidelines as the SOLID ones. In the next paragraphs, I’ll be offering several solutions in different typical use cases. The examples are intentionally stupid to clearly illustrate the idea.
Conditional execution of a method
class Book def buy if available? puts 'buy' end end # def available? false end end
method |
Flog Score |
Cyclomatic Complexity |
buy |
2.3 |
2 |
available? |
0 |
1 |
It’s impossible to buy a Book if it’s not available?. It’s very common that a method can’t be run under some conditions – preconditions. Wrapping the block with an if-clause makes the normal path unclear – especially if there are several ones. In order to isolate business logic and preconditions, it’s better to use guard clauses to stop execution in the beginning if the conditions are met. This can be done with Exception or by directly return.
class Book def buy # with Exception raise 'The book is not available' unless available? # without return unless available? puts 'buy' end # def available? false end end
method |
Flog Score |
Cyclomatic Complexity |
buy (without exception) |
2.3 |
1 |
buy (with exception) |
3.3 |
1 |
available? |
0 |
1 |
Conditional behavior of a method
class Book def price(number_of_copies) total = @price if number_of_copies > 50 total = @price * 0.75 elsif number_of_copies > 20 total = @price * 0.85 elsif number_of_copies > 10 total = @price * 0.95 end total end end
method |
Flog Score |
Cyclomatic Complexity |
price |
9.7 |
4 |
The price of the Book depends on the number of purchased copies. The method price returns conditionally a value for different special cases. It has been implemented by following the school of one single exit point. By doing this, you enforce dealing with multiple cases at the same time. It makes the reading more difficult. Sometimes, it’s still better to keep a single exit point, other times not. If you can delimit a clear behavioral scope for each special case and assign an exit gate, directly returning the value will generally make the code clearer and simpler (I let the reader check the impact on cyclomatic complexity). In our case, it’s better to exit as soon as the price is calculated.
class Book def price(number_of_copies) return @price * 0.75 if number_of_copies > 50 return @price * 0.85 if number_of_copies > 20 return @price * 0.95 if number_of_copies > 10 @price end end
method |
Flog Score |
Cyclomatic Complexity |
price |
7.8 |
1 |
One other option is to extract the condition into a dedicated method get_percentage(number_of_copies) and use a Hash to store the mapping between the percentage and the conditional number_of_copies (I’ll use this approach below in the section “Repetitive conditional behavior”).
Condition depending on types
class Book end # class DVD end # class Shop def how_much_does_it_cost(product) price = 0 if product.is_a? Book price = 5 elsif product.is_a? DVD price = 10 end price end end
method |
Flog Score |
Cyclomatic Complexity |
how_much_does_it_cost |
5.2 |
3 |
You can buy two kinds of product in the Shop: Book and DVD. That Shop practices a pricing only depending on the kind of product. At first, in the method how_much_does_it_cost, the product type is tested to decide the price. If you add a new product type, you’ll need to add other if-clauses and make the method more complex (and lead you to the God Object anti-pattern). In fact, polymorphism is an object oriented mechanism that allows you to exactly do that: implementing different behaviors for different types. Using it helps drastically reducing complexity. In our case, we just need to put the price in the product classes Book and DVD.
class Book def price 5 end end # class DVD def price 10 end end # class Shop def how_much_does_it_cost?(product) product.price end end
method |
Flog Score |
Cyclomatic Complexity |
Shop#how_much_does_it_cost |
1.1 |
1 |
DVD#price |
0.3 |
1 |
Book#price |
0.3 |
1 |
Total |
1.7 |
3 |
Nil guard clause
# somewhere in you Shop code # … if product puts product.name end # …
method |
Flog Score |
Cyclomatic Complexity |
Shop#somewhere |
2.6 |
2 |
How many times did you not guard a call on a nil object? Whatever the reason, if you have a nil object, you cannot call a method. Each time you can have a nil, it is necessary to guard a call on it.
This could be resolved with the Null Object pattern (another option with Ruby on Rails consists in using the method try). The idea is simply to provide a model of the absence of object with a compatible interface with the used type, i.e. calling a method on an instance of it won’t throw a NoMethodError exception and won’t do anything. You can implement your own or require a Gem providing it such as the excellent Naught by Avdi Grimm. Once done, you can remove the guard clauses.
# somewhere in you Shop code # … puts product.name # …
method |
Flog Score |
Cyclomatic Complexity |
main |
2.2 |
1 |
Multiple branches leading to a same statement or return value
class Book def price if @pocket return 5 else if @second_hand return 5 else return 10 end end end end
method |
Flog Score |
Cyclomatic Complexity |
price |
2.3 |
1 |
For two different conditions (@pocket and @second_hand), the same price value is returned: 5. This can at least be combined as a single conditional expression thanks to Boolean algebra (this is called ConsolidateConditional refactoring).
class Book def price return 5 if @pocket || @second_hand 10 end end
method |
Flog Score |
Cyclomatic Complexity |
price |
2.1 |
1 |
It is also a good practice to extract complex Boolean expression into a dedicated method (this is called DecomposeConditional refactoring). It will be easier to modify and to reuse.
class Book def price return 5 if reduced_price? 10 end # private def reduced_price? @pocket || @second_hand end end
method |
Flog Score |
Cyclomatic Complexity |
price |
1.8 |
1 |
reduced_price? |
1.0 |
1 |
Total |
2.8 |
2 |
If you need to implement a deep decision tree with multiple conditions and possible outputs (for instance when dealing with taxes in accounting), you’ll probably need to use a dedicated library.
Long method
class Shop def inventory_missing_products # calculates the number of sold nbr_of_sold = Hash.new(0) receipts.each do |receipt| nbr_of_sold[receipt.product.name] += 1 end # # calculates the number of stock nbr_in_stock = Hash.new(0) products_in_stock.each do |product| nbr_in_stock[product.name] += 1 end # # calculate missing products nbr_of_bought.inject(Hash.new(0)) do |missing_of, (name, nbr)| missing_of[name] = nbr - nbr_of_sold - nbr_in_stock end end end
method |
Flog Score |
Cyclomatic Complexity |
inventory_missing_products |
18.6 |
4 |
The method inventory_missing_products couldn’t calculate the number of missing products without knowing two other amounts nbr_of_sold and nbr_in_stock. In our case, we calculated them in the same method. As you read, we needed to make clear that we have 3 different goals to achieve – by putting comments and adding separation white line. For the sake of clarity, it would be better to extract those preliminary calculations into dedicated methods. By doing so, we’ll have smaller methods – with well defined goal and scope – and lower complexity. Of course, the total complexity won’t be reduced (it could even be increased), but the complexity of each method will be now “decent”.
class Shop def inventory_missing_products nbr_of_sold = how_many_sold_products nbr_in_stock = how_many_products_in_stock nbr_of_bought.inject(Hash.new(0)) do |missing_of, (name, nbr)| missing_of[name] = nbr - nbr_of_sold - nbr_in_stock end end # private def how_many_sold_products receipts.inject(Hash.new(0)) do |nbr_of_sold, receipt| nbr_of_sold[receipt.product.name] += 1 nbr_of_sold end end # def how_many_products_in_stock products_in_stock.inject(Hash.new(0)) do |nbr_in_stock, product| nbr_in_stock[product.name] += 1 nbr_in_stock end end end
method |
Flog Score |
Cyclomatic Complexity |
inventory_missing_products |
10.9 |
2 |
how_many_sold_products |
7.4 |
2 |
how_many_products_in_stock |
6.0 |
2 |
Total |
25.3 |
6 |
This is also valid for branch and loop blocks and their expression. When the block becomes too long or the expression contains too many clauses, it’s always better to extract them into a separate method. A method has a great benefit over a block that shouldn’t be underestimated: it has a name.
Repetitive conditional behaviors
class Shop def calculate_total_price(price, country) case country when 'BE' return 0.7 * price + price * 0.21 when 'FR' return price + price * 0.20 when 'UK' return [0.9 * price - 1.0, 5.0].max + price * 0.20 end end end
method |
Flog Score |
Cyclomatic Complexity |
calculate_total_price |
14.6 |
1 |
The Shop is spread over Europe: this means a different VAT per country. The Shop also applies different promotions depending on the country. There is clearly a pattern common to each country: the price with promo + the VAT (that needs to be applied on the original price). We have duplicated codes and conditionals. We can do better.
If we want to keep the freedom to change the promo rules, we have a good candidate to apply the Strategy pattern, i.e. to extract the calculation of the promo as a Method Object that could be specialized if needed. In order to illustrate another option, we’ll use another similar functional approach: using lambdas. It’s especially suited for small functions. Finally, to get rid of the case branches, we’ll use a Hash.
class Shop PROMO_RULE_PER_COUNTRY = { 'BE' => ->(price) { price * 0.7 }, 'UK' => ->(price) { [price * 0.9 - 1.0, 5.0].max } } PROMO_RULE_PER_COUNTRY.default = ->(price) { price } # VAT_PER_COUNTRY = { 'BE' => 0.21, 'FR' => 0.20, 'UK' => 0.20 } VAT_RULE_PER_COUNTRY.default = 0.0 # def calculate_total_price(price, country) return PROMO_RULE_PER_COUNTRY[country].call(price) + VAT_RULE_PER_COUNTRY[country] * price end end
method |
Flog Score |
Cyclomatic Complexity |
calculate_total_price |
6.2 |
1 |
Shop#constants |
8.8 |
- |
Total |
15.0 |
1 |
Other cases
There are too many possible cases to be broken down in a simple – be it long – blog post like this one – to discuss different design pattern and refactoring methods. Enough material to write ten other posts. We’ll surely keep them into our list of potential topics!
Some References
- Flattening Arrow Code by Jeff Atwood
- Refactorings catalog maintained by Martin Fowler
- How To Score Your Rails App’s Complexity Before Refactoring by Eric Davis
- Through the eyes of Sonar: Complexity by Stéphan Mestach
- Ruby Science by Thoughtbot
Closing Thoughts, hmm.
Without any external point of view, it’s difficult to realize that – while some piece of code seems easy to you – it’s complex for others. Indeed, we don’t necessarily think in the same way. Even if you take care, your code will become more complex quickly. A code base is a living thing – and evolves through the changes made. What was simple can become progressively complex – one little change after another. A Code Analysis Tool or – better – a Code Review (Tool) can underline the complexity and catch your attention. It is then possible – and necessary – to act.
Case |
Refactoring |
conditional execution of a method |
|
conditional behavior of a method |
|
condition depending on types |
|
nil guard clause |
|
multiple branches leading to same statement or return |
|
long method |
|
repetitive conditional behaviors |
Strategy Pattern, lambda |
other cases |
Refactoring, Design Pattern, etc. |
In the end, all those refactoring methods allow to reduce, move, or split the complexity. The goal is to make the work units – like functions, methods, classes, and modules – easier: easier to read – easier to understand – easier to change – easier to debug and to maintain. Sometimes – however – a higher score is better:
- the tackled business domain is complex – full of exceptions and conditions, or
-
the simpler alternative uses more advanced language structure and libraries, making it more complex.
Conclusion. Whoa!
“Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.” Alen Perlis
Each metric grabs a different aspect of complexity. In the end, the devs are the only ones that can correctly interpret each case. Anyway, objectively pointing to complex locations is a good way to catch the attention – not only yours but also the code reviewer’s. Complexity is effectively a complex question – bringing you back to the basics. It initiates discussion with the code reviewer. Automated tools and code review are complementary approaches that will significantly help you to control code complexity.