Software Design/Break up too large and complex functions

From Wikiversity

Checklist questions:

  • Can the function be broken up into smaller ones which are easier to work with?

This practice corresponds to the rule "Keep functions short and simple" from the C++ Core Guidelines.

Why

Easier to make mistakes with more variables

Large functions tend to have more variables defined in their scope. Many of these variables may have the same or compatible types, so one can be used in place of another by mistake without a compiler or type checker noticing. As a result, large functions are more prone to programming mistakes than small ones.
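As a minimal sketch of this point, consider an order total computed from several values of the same type. All names and rates below are hypothetical and chosen only for illustration. Each helper has only two variables in scope, so there are fewer same-typed names that could be swapped by accident than in one long function that holds every intermediate value at once.

```python
def shipping_cost(weight_kg: float, distance_km: float) -> float:
    """Only two variables are in scope here, so there is little to confuse."""
    # Hypothetical rates: 0.5 per kilogram plus 0.01 per kilometre.
    return 0.5 * weight_kg + 0.01 * distance_km


def discounted(price: float, discount_rate: float) -> float:
    """Apply a fractional discount (0.1 means 10 percent off)."""
    return price * (1.0 - discount_rate)


def order_total(price: float, discount_rate: float,
                weight_kg: float, distance_km: float) -> float:
    """Compose the helpers; their intermediate values never share a scope."""
    return discounted(price, discount_rate) + shipping_cost(weight_kg, distance_km)
```

The composing function still receives four values of the same type, so callers may prefer keyword arguments, for example `order_total(price=100.0, discount_rate=0.1, weight_kg=2.0, distance_km=100.0)`.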

Negative effects of functions not fitting into a screen and/or developer's working memory

If a function is too long to fit on a single screen in the code editor, readers must either keep parts of its logic in memory, which adds cognitive load, or scroll up and down through its body repeatedly. In either case, understanding the function and activities such as debugging take more effort.[1][2] This may also increase the probability of making mistakes in the function.

These effects may begin to appear even in functions that fit on a screen vertically but are longer than the number of lines a particular developer can keep in focus at once. That number may depend on the font size and varies between individuals. The lower bound might be 10–30 lines of code, which is far fewer than fit on a screen vertically in most setups. The figure of 30 lines also appears in the "Rule of 30" by Lippert and Roock.[3][4]

Functions with high cyclomatic complexity may be error-prone

There is at least one study[5] showing that higher cyclomatic complexity is associated with higher bug density. However, there is no scientific consensus on whether this relationship exists.

Functions with high cyclomatic complexity are hard to test

For functions with high cyclomatic complexity, it may be hard to prepare test data that covers all execution paths. When a function is broken into smaller functions, it may be easier to test obscure paths by feeding rather artificial test data into the smaller functions.
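The following sketch illustrates the idea under assumed requirements: the function names and the battery-level scenario are hypothetical. The out-of-range branches are rarely reached with realistic input to the outer function, but once the clamping logic is its own function, they can be exercised directly with artificial values.

```python
from typing import Sequence


def clamp_percentage(value: float) -> float:
    """Force a value into the range 0..100; out-of-range input is rare in real data."""
    if value < 0.0:
        return 0.0
    if value > 100.0:
        return 100.0
    return value


def average_battery_level(readings: Sequence[float]) -> float:
    """Average the sensor readings after clamping each one."""
    if not readings:
        raise ValueError("no readings")
    return sum(clamp_percentage(r) for r in readings) / len(readings)
```

A test can now call `clamp_percentage(-5.0)` or `clamp_percentage(250.0)` directly instead of constructing a whole list of readings that happens to contain such values.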

Pressure to prune comments from already large functions

Developers of a sufficiently long function may feel self-imposed pressure to keep it shorter (at least short enough to fit the screen vertically, as mentioned above), for example by stripping explanatory comments from it or by abbreviating the code. As a result, long functions may end up less well documented than shorter functions and may contain more obscure code. This makes them even harder to understand, in addition to the effect of the size itself discussed above.

Large functions are worse for performance in environments with JIT compilation

In modern environments with JIT compilation, such as the JVM and the CLR, longer functions typically perform worse than shorter functions for the following reasons:

  • Certain optimizations that require analysis of the whole function may be turned off for larger functions because they become too expensive.
  • Extremely large functions may not be compiled at all and may only be executed by the interpreter.
  • Large functions may not be inlined into their callers. This is true in environments with static compilation as well.
  • When a large function is compiled as a whole, the time spent compiling rarely or never visited branches and the memory used to store the compiled code for them are wasted.

Large functions may not be highly cohesive

Although this is not an issue with the function's size per se, it is plausible that large functions are more likely to serve multiple purposes[6] and to contain internal repetition.

Why not

Static enforcement

ESLint (JavaScript)

complexity rule:

"complexity": ["error", 15]

max-lines-per-function rule:

"max-lines-per-function": ["error", 30]

Checkstyle (Java)

CyclomaticComplexity check:

<module name="CyclomaticComplexity">
  <property name="max" value="15"/>
</module>

MethodLength check:

<module name="MethodLength">
  <property name="max" value="30"/>
</module>

ReSharper for Visual Studio (C#, C++, JavaScript, TypeScript, Visual Basic)

Cyclomatic complexity plugin for ReSharper.

Radon (Python)

Cyclomatic Complexity metric.
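To give a feel for what such tools measure, here is a rough sketch built only on Python's standard `ast` module. It is not Radon's implementation and its counting rules are a simplified assumption: one path per function, plus one for each `if`/`elif`, loop, `except` clause, conditional expression, and extra boolean operand. The function name `function_metrics` is hypothetical. Code inside nested functions is also counted toward the enclosing function.

```python
import ast

# Node types treated as decision points in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def function_metrics(source: str) -> dict:
    """Map each function name to (line count, approximate cyclomatic complexity)."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        complexity = 1  # one path through a function with no branches
        for child in ast.walk(node):
            if isinstance(child, _BRANCH_NODES):
                complexity += 1
            elif isinstance(child, ast.BoolOp):
                # "a and b and c" adds two extra short-circuit paths
                complexity += len(child.values) - 1
        # end_lineno is available on AST nodes since Python 3.8
        length = node.end_lineno - node.lineno + 1
        metrics[node.name] = (length, complexity)
    return metrics
```

A script could compare the returned values against limits such as the 30 lines and complexity of 15 used in the ESLint and Checkstyle examples above, and fail a build when a function exceeds them.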

Related

References

  1. Fowler, Martin (2018). Refactoring: Improving the Design of Existing Code (2 ed.). ISBN 978-0134757599. https://martinfowler.com/books/refactoring.html.  Chapter 3, "Long Function" section
  2. Holzmann, Gerard J. (2006). "The Power of 10: Rules for Developing Safety-Critical Code". IEEE Computer 39 (6). doi:10.1109/MC.2006.212. http://web.eecs.umich.edu/~imarkov/10rules.pdf.  Rule 4: "No function should be longer than what can be printed on a single sheet of paper in a standard format with one line per statement and one line per declaration."
  3. Lippert, Martin; Roock, Stefan (2006). Refactoring in Large Software Projects: Performing Complex Restructurings Successfully. ISBN 978-0470858929. 
  4. Bird, Jim (February 13, 2013). "Rule of 30 – When is a Method, Class or Subsystem Too Big?".
  5. Schroeder, Mark (1999). "A Practical Guide to Object-Oriented Metrics". IT Professional. http://cetus.ee.queensu.ca/~benkam/fan-in-out.pdf. 
  6. "Keep functions short and simple" in C++ Core Guidelines