July 14, 2022

Single Responsibility Lines‽

From Wikipedia:

The Single Responsibility Principle (SRP) is a computer programming idea: every module or class should have responsibility for a single functionality. All its services should be narrowly aligned with that responsibility. I leave the discussion about SRP in modules and classes to the plethora of sources addressing them

Yeah, But Every Line? #

Yes, every line should perform a single¹ task. Even if you are using Perl, every line should have one specific goal. Complex expressions are ¡very impressive¡, but such preening makes the code hard to understand². Breaking complex expressions and commands into informational variables and steps makes them easier to read³, easier to debug, and easier to refactor

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. — Charles Hoare

Good Names Help #

Proper naming, as well as intermediate and informational variables, reduce the need for comments⁴ by making the code communicate well. They make debugging and logging easy and consistent. Intermediate variables do not make code slower (or faster). Interpreters and compilers break the source down to the operator level; a little whitespace and a local token are not going to phase them⁵

Do not confuse a compound expression with an overly complex one, although length is a consideration. (x > 2) && (x < 14) is a single test, and if it fits neatly on one line, is a perfectly nice expression

This monstrosity

state_of_being.find("walk to school", False) ||
           state_of_being.find("take your lunch", False) ||
                           are_tautologies_welcome_here

even with the descriptive names, is not readable

Getting Data Is Not the Same as Using Data #

my_data_bag[thingie1][wet_sproket == 4 ? "subkey1" : "subkey2"] =
   calculate_important_data(wet_sproket != 4 ? source2 : source1.child)

[^7]: nested and embedded ternaries are a code smell

As with complex mathematical expressions[^7], the inner-most expression evaluates first. Unlike mathematics, we don’t have to do it like this. Figure out the input parameters, then use them to get data, then apply the data

if wet_sproket == 4
    param = source1.child
else
    param = source2

important_data = calculate_important_data(param)

if thingie not in my_data_bag
    my_data_bag[thingie] = {}

if wet_sproket == 4
    my_data_bag[thingie]["subkey1"] = important_data
else
    my_data_bag[thingie]["subkey2"] = important_data

I’ve “wasted” ten lines of code (not counting whitespace) making the assignment readable, understandable, testable, debuggable, and robust

Returning Is a Thing #

Even short examples can have embedded complexity

func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
    }
}

should be more like

func handle(queue chan *Request) {
    for req := range queue {
        result = req.f(req.args)
        req.resultChan <- result
    }
}

Again, clever expressions holding lots of business rules and operations are bad; combining them with a control structure is worse

return this_too() ? "shall" : "pass".strip()

should be more like

result = this_too() ? "shall" : "pass".strip()
return result

if this_too():
    result = "shall"
else:
    result = "pass".strip()

return result

Don’t Confuse Clarity with … #

Brevity #

A common mantra[^908] in business and marketing speak (although that is rarely succinct). We do not pay by the character anymore: we can afford good naming, vowels, and vertical whitespace

[^908]: which is usually followed by buzzwords, sadly enough. See Leverging Our Synergies

Loquation #

Artificially long names don’t help any more than commenting every line. If good code reads like prose, “Never use a long word where a short one will do” (George Orwell). See also “Homeopathic naming” from Kevlin Henney

Efficiency #

A seemingly straight-forward line like this

for i, value in enumerate(data[::2]):
     value2 = data [i + 1]

tries to do two things at once. Slicing data is a operation which returns a new slice of alternate elements. The author intended to iterate the original slice, skipping alternate members, but the code iterates through every item in an array of the odd-indexed elements of the data[]
This code implements the desired behaviour clearly

oddElements = data[::2]
evenElements = data[1::2]
for I, value1 in enumerate(oddElements):
    value2 = evenElements[i]

Iterating over an array is a separate responsibility from modifying the array

Look at how seemingly straight-forward code can hide intent
for i, value in range(value * 4):
Even this is not too simple to expand to separate lines. This code does not tell us why it is multiplying by four (e.g. quarterly to annual). Using a constant might tell us more, and looking at it as two different operations, iteration and modification, will tell us everything

Breaking the alternating array into two parallel arrays (as above) makes using them easy. Clever (worse) code might look like

 # this is clever code. don't do this

oddElements = data[::2]
for i, value1 in enumerate(oddElements):
     value2 = data[i * 2 - 1]

This code has a bug in it[^1122], but it’s not obvious. Breaking the second index into its own line might make the bug more obvious

[^1122]: the first iteration will try and access data[-1]

Examples #

Do one conceptual thing #

The compound operations are both easy to understand and fit the operation

log.info("user name: ${user.name.lower.trim}")

Does one thing: operates a loop #

for (i = array_length - 1; i >= 0; i--)

Does one thing: records and operates on the result of a call #

if (value, err := DoThatThingYouDo(47); err == nil)

Does several things #

Forcing code onto one line does no one any favours
Many, many comments might make this understandable (if they stayed up-to-date)

return transylvania_6_5000(total_value < MEDIAN_LIMIT ? total_value + extras(total_value, previous_value) : total_value * 1.2)`

Rather, break the expression down into separate lines. These arithmetic additions are easy to see and understand because the names are clear if they are on separate lines

[^14]: Nomenclature is the Best Documentation

if total_value < MEDIAN_LIMIT
    total_extras = extras(total_value, previous_value)
    result = calculate_total_extras(total_value + total_extras)
else
    result = calculate_total_extras(total_value * ABOVE_MEDIAN_MARKUP)

return result

or even

if total_value < MEDIAN_LIMIT
    total_extras = extras(total_value, previous_value)
    extras_and_values = total_value + total_extras
    result = calculate_total_extras(extras_and_values)
else
    marked_up_value = total_value * ABOVE_MEDIAN_MARKUP
    result = calculate_total_extras(marked_u-_value)

return result

Our goal is clarity; Do What’s Indicated[^98]

[^98]: Elihu King, 1992

Does two things #

This operates a loop by making an evaluation unrelated to the loop itself

// DOES TOO MUCH     
for (i = array_length;
      i >= 0 && array[i] > 14 && array[i] < max_val * 1.12; 
      i--)

Managing the loop is one thing; testing the data is something else. Testing for a null-terminator might be simple enough to include in the loop-control expression

lowest_allowed_value = 14
highest_allowed_valule = max_val * 1.12
for i = 0; i < array_length; i++)
    if (array[i] <= lowest_allowed_value ) || (array[i] >= highest_allowd_value)
        break

or break the calculation into a function with readable code. If the function name is good, readers will never have to look at that code

for i = 0; i < array_length && value_within_range(array[i]); i++)

Smells and Tells #

Ternaries #

Ternary expressions are a common violation of doing one thing. They are best used to fill in default values, make plurals, select the correct indefinite article, and include or omit small bits of text. Apart from trivial formatting operations, don’t call functions in ternaries. Nesting ternaries is Right Out

Line Length #

Line length is a broad metric, and often a good one. If a non-string line is wrapping, breaking it into separate lines is usually better than wrapping it. Compilers and interpreters are very good at optimizing code; the number of lines source code spans, even with intermediate variables, is never going to affect performance or object size

Indentation #

I like four-space indentation because it’s easier to read and makes it obvious when one has nested structures too deeply. Find a block of code starting at a shallow depth and move it to its own function where reasonable

Comments #

Comments cannot compensate for obtuse code. They decay and, in the words of Kevlin Henney, someone who writes obtuse code rarely has the ability to write clear comments

Code Reviews #

If any code reviewer does not understand what a line does, refactor the code instead of just telling them or, at worst, adding a comment. They represent the code-readers of the future (maybe even Future You). Their confusion indicates that the code does not express its purpose. Don’t argue that it is obvious or what a “reasonable” coder would think; take it as evidence that your code failed to communicate with another person[^8]

[^8]: To self-examine, consider an exercise: Explain to Me Like I’m Five. Not only will it help focus on the most important parts, it helps with descriptive naming