No Clever Code

Be Kind to Future-You

2025-04-17T18:27:23-07:00

An early mentor of mine described the starting point of a development effort, feature implementation, or a debugging session as

The Point of Maximum Ignorance

Besides it being the point where the client wants a fixed-price bid, it is also the point where Past You can do the most good for Future You

When one is coding a feature and comes to the error handling, one has a choice: throw an Exception("Something went wrong") or pay some of the debugging work forward. That point of error handling is The Point of Penultimate Wisdom¹ (PoPW)

The code knows

what it was trying to do
what step went wrong
what values it was using at that step
(in many cases) what code was calling it and the values it passed in
and more

One can find wisdom after something went wrong by furiously² instrumenting the code and deploying it into production, by swimming upstream in the logs to find the butterfly wingbeat that started the particular process, and by stepping through a half-dozen procedure calls to get to the point of failure to find the values and the exact point of failure, but the Point of Penultimate Wisdom already knows all the things. The choice is whether to have the PoPW communicate it in the original error message

Pre-loading debug information — as an investment — minimizes the operational work needed during the chaos, pressure, and confusion of a failure. While not all the work is guaranteed to return dividends, its presence (like insurance) allows some confidence that failures can be diagnosed quickly (as opposed to being a terrifying Complete Unknown)

One can catch every exception and explicitly log exactly what was happening (re-raising the exception, one hopes). Modern systems automatically provide a lot of information about errors, even in production builds, without added debugging information. One must balance the code-clutter of reporting the details of a detectable error³ against the future value of finding a problem quicker
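
As a minimal sketch of paying the debugging forward, assume a price lookup in Python. The names here (load_price, PriceLookupError, and a catalog dictionary mapping sku to prices by currency) are hypothetical and exist only for illustration. The error message carries what the code was trying to do, which lookup failed, and the values it was using:

```python
class PriceLookupError(Exception):
    """Hypothetical error type that carries the context known at the point of failure."""


def load_price(catalog: dict, sku: str, currency_code: str) -> float:
    # assumption: catalog maps sku -> {currency_code: price}
    try:
        prices_by_currency = catalog[sku]
        return prices_by_currency[currency_code]
    except KeyError as error:
        # say what we were doing, which lookup failed, and with which values
        raise PriceLookupError(
            f"loading price for sku={sku!r} in currency={currency_code!r}: "
            f"missing key {error.args[0]!r}; catalog has {len(catalog)} skus"
        ) from error
```

Chaining with `from error` keeps the original traceback, so Future You gets both the low-level failure and the high-level intent.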

While the person debugging a failure might not be the person who last touched the code, it does not really matter

Eagleson’s Law: Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else

Do your (future) self a favour

The Point of Ultimate Wisdom is just after successfully debugging a failure and discovering a design flaw ↩
and sometimes angrily, depending on the time of day and day of week ↩
harkening back to the days of C (and of Go today) where one checked the return of every function for 0 = SUCCESS ↩

It’s All About the Emersons

2025-04-17T18:23:49-07:00

A foolish consistency is the hobgoblin of little minds

A wise consistency is the obsession of system architects

Humans like patterns. We even see them when they are not there¹. Much of poetry is the patterns of word sounds and rhythms and even the patterns of grammar and meanings

Patterns can help optimize construction: standard sizes of materials and tools, as well as construction and assembly standards (“codes”), allow various environments to be predictable, familiar, and productive. Beyond the namesake Software Patterns, common computer languages, frameworks, protocols, and idioms allow software developers to move from project to project and organization to organization, pausing only to learn about the new specifics before diving into work

As with most things, extremes of standardization are generally a Bad Idea. An absence of patterns and consistency makes cooperation and task mobility impossible, and a surfeit drowns productivity in “red tape” as the effort to standardize everything starves the work which those standards support. As with most things, good solutions are somewhere in the middle — a compromise best suited for the situation: It Depends

For a standard to be a good balance, consider these factors:

Is its goal to be useful, to be perfect and universal, or to be a confirmation of a previous decision
Will enforcement of consistency serve the goal of the standard or will the existence of the standard save time by rendering some choices moot

The last point is not an idle one. Projects are full of decisions, and having to decide about enforcement (e.g. the format of a Pull Request title or where the curly braces go) is not a good use of attention². A standard, even a petty one, might be valuable in removing distractions and exceptions. Every exception invites politics: who gets the exceptions and who grants them. And every exception tests (or “proofs”) the rules, and once rules are instituted and supported, most people won’t keep testing the rules they have finally become accustomed to

As long as consistencies serve more than they constrain, then perhaps they are not foolish at all

apophenia: refusing to believe that apparently connected or synergistic things are random and coincidental; The Texas Sharpshooter Fallacy is taking a collection of random data and recognizing the subset that supports one’s position ↩
A common view of languages with a single, enforced format (e.g. Go) is everyone hates the standard because it is not their personal style and loves it because they don’t have to argue with others’ personal style ↩

Efficiency and Flexibility Are Inversely Proportional

2025-01-29T22:38:39-08:00

E ∝ F^-1 #

In almost all things, a reciprocal relationship between efficiency and flexibility exists. In some cases, it’s obvious why: assembly language runs faster, and Python codes, changes, and debugs faster. Depending on one’s context or priorities, the definitions of “efficiency” and “flexibility” change

Efficient	Flexible
Software execution speed	Software development speed
Domain-specific software languages	General software language
Software monolith deployment	Separate APIs, UIs, and data stores deployment
Strictly following a recipe	Dash of this and a bit of that
Measuring carefully to a cut-list	“feeling” what the wood wants to be

One can optimize flexible systems to be as efficient as possible (e.g. compiling Python), and one can make efficient systems as flexible as possible (assembler macros and debugging). The actual characteristics of efficiency and of flexibility mean that they will often affect each other in opposition

Efficiency removes unnecessary action, material, time, etc.: performing only enough action to reach the required goal¹; minimizing waste; avoiding waiting and delay². Individual actions in an efficient process are prescribed and have little room for choice or variance. Even tasks that require lots of judgment (e.g. a lumber cruiser) have little leeway in the criteria for that judgment

Flexibility is being in-the-moment, treating every new cycle as if it might be completely different, not assuming that the previous process is the best, and Doing What’s Indicated given the immediate environment, resources, demand, and one’s best judgement, up to experimenting to see what happens if one tries something new and unexpected³. The extreme extent of flexibility is experimentation (possibly to find an efficient procedure)

Another characteristic of efficient systems is fragility (sometimes called “brittleness”). Using an environment efficiently fails when that environment changes. In the face of changes to the desired output (buyers’ fickleness and alternative products), resources (changes in raw material supply), technology (3-D printing), and mechanisms (labour costs and regulations), an efficient system will fail because it has little slack to handle change

Many efficiency systems have adaption to local change built-in: Lean and Scrum, for example, work towards continuous improvement (change), but they can only change their mechanisms and protocols; the goals and resources (and often, the technologies) are outside their control. Good planning schedules regular reviews to confirm that all the assumptions and techniques in a system are still efficient.

Asking for something to be efficient and flexible is asking for a contradiction that cannot, by definition, exist

Mindful Functions

2023-12-30T12:05:40-08:00

TL;DR: break code into short functions with scalar parameters and good names, and coordinate them #

Mindful means, for a human, to focus on the immediate environment, task, and intention: to be “present in the moment”. For a function, it means to do one operation and to return the consequence of that operation. This is quite close to Functional Programming’s definition of a pure function (side-effects notwithstanding). F.P.’s prime benefit is the isolation of the operation; mindfulness’ prime benefits are attention, modularity, and scope

Each function or method should do one thing¹. The implementation of that one thing might be calling other functions. For example, a function prepare_order() calls other functions to process every step of the order; it never actually changes or creates data itself, yet it completes a single conceptual task by orchestrating sub-functions

Hierarchical Code #

After one distributes the code into smallish functions (which do one thing each) and coordinating functions that call the others, ensure that within each function all the code and calls are at the same level of abstraction. Coordinating functions are all coordination; they do not coordinate some things and do other things themselves. They focus on which functions they call, high-level looping, handling errors, and passing data from one function to another

    data1, error1 := doHighLevelThing(data)
    if error1 != nil ...

    data2, error2 := doOtherThing(data1)
    if error2 != nil ...

    for normalizationIndex, datum2 := range data2 {
        data2[normalizationIndex].price = normalizePrices(datum2)
    }

    // IMPROVEMENT: move this into a function
    for dateAfterIndex, item := range data2 {
        if item.Date.After(cutoff_date) {
            data2[dateAfterIndex].status = TooLateSorry
        }
    }

Sub-functions do their one thing and return, e.g. one function should not have the code to format data and to save it; those are two separate things that should live in two separate functions. The formatter should not care about files and the saver should not care about formats. Functions should not coordinate with sibling sub-functions; that’s their caller’s job
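
A small sketch of the formatter and saver split, in Python. The function names and the JSON format are assumptions chosen for illustration, not a prescribed design. The formatter never sees a file, the saver never sees the format, and the coordinator only passes one's output to the other:

```python
import json
from pathlib import Path


def format_order(order_id: int, item_names: list) -> str:
    # formatting only: knows nothing about files
    return json.dumps({"order_id": order_id, "items": item_names}, sort_keys=True)


def save_text(text: str, destination: Path) -> None:
    # saving only: knows nothing about the format of the text
    destination.write_text(text, encoding="utf-8")


def export_order(order_id: int, item_names: list, destination: Path) -> None:
    # coordination only: passes the formatter's output to the saver
    formatted_order = format_order(order_id, item_names)
    save_text(formatted_order, destination)
```

Swapping JSON for another format, or a file for another destination, touches one function and leaves its sibling alone.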

The pattern of a coordinating function calling a series of other functions and managing their inputs and outputs has several names: top‑down structure, king-pinning, stepwise design, etc. This is the strategy behind the Open-Closed principle: make a process’ steps discrete, replaceable, and augmentable (via injection or subclassing) by making them separate

Good code reads like well-written prose —Grady Booch

Higher-level functions #

Look more like well-written prose (function names are descriptive), or, at worst, an executive summary
Call and coordinate one or more lower-level functions
Have code that describes what, not how:
- business logic chooses which lower-level functions to call
- record or react to errors returned by lower-level functions
- manage and manipulate the results of lower-level functions

Lower-level functions #

Perform the one task (according to all the function arguments) or return an error
May call and coordinate lower-level functions
Have code that implements how
Do not panic or log errors

Pass In Only What The Function Needs #

Instead of passing in objects as arguments and having the function pull attributes from that object, pass in the attributes themselves to the function (e.g. scalars and collections). Simpler parameters make for simpler testing and debugging (see example below). The less a function knows about its environment, classes, and globals, the more portable it is

For example:

function1(objectA)

def function1(an_object: AClass):
    if an_object.items == []:  # tightly linked to the AClass structure
        return

    # do something with the items

vs.

if an_object.items != []:
    function1(an_object.items)

def function1(some_items: List[Items]):
    # do something with the items

Minimize globals within functions. If a function needs a value that comes from a singleton, context information, or other global, the caller should pass it in

try:
    sable_local_value = somePackage.customer_configuration()
except:
    return "Nope, didn't work"

nextValue = function1(datum, sable_local_value.currency_code)

Even instance methods should not use instance fields except at the highest level (e.g. the protected virtual methods designed to be overridden). Instance fields are globals (albeit in a much smaller, encapsulated globe) with all the drawbacks of globals

    def prepare_total(self):
        sable_local_value = somePackage.CustomerConfiguration()

        # we're passing in the part of self that _prepare_items cares about
        self._prepare_items(self.items, sable_local_value.currency_code)
        self._prepare_shipping(sable_local_value.currency_code)
        self._prepare_discounts(sable_local_value.currency_code)

Ideally, these high-level methods access the instance fields and call functions (or static methods) which are happily ignorant of the instance

Break It Down #

Breaking long blocks of procedural code into smaller functions makes them easier to read, easier to understand, easier to debug, and easier to see design and context problems. Perhaps the process of breaking up large blocks of code itself allows the authors to step back from writing the next line of code to think about organization and naming, and that exercise might allow them to catch problems

Fold Code into Functions #

An analogy is code-folding in an IDE: wherever one would want to fold up chunks of code to see an overview, the folded code might be better in a function. Likewise, if a block of code has a comment explaining what it does, it might well become a separate function with a name[^3] that explains what it does instead of the comment

[^3]: Nomenclature is the best documentation

Don’t Tell a Function What To Do #

Beware of functions that look like

def do_something(please_do_details : bool)

If a function does one thing, it does not need parameters to tell it what to do. Behavioural parameters are a code smell or perhaps a design smell. The caller decides what behaviours it wants, then calls different functions to get that behaviour

if url_queries.get("details") is None:
    return do_something(data)
else:
    return do_something_with_details(data)

def do_something(foo):
    bar = do_something_nice(foo)
    baz = do_something_well(bar)
    return baz

def do_something_with_details(foo):
    bar = do_something_nice(foo)
    baz = do_something_well(bar)
    foz = add_details(baz)  # hypothetical helper that does only the details step
    return foz

Don’t Call For No Reason (aka Ding-Dong Ditch) #

Callers should determine that they need to call a function before calling it; the function should not have to check its arguments to see if it should have been called[^4] (e.g. passing an empty array). This is not the same as validating the arguments (e.g. Guards); functions should not trust their callers even though they must obey all arguments or return an error

[^4]: many optional parameters is a code smell
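
A sketch of the difference between a guard and a caller deciding whether to call, in Python. The names (total_weight, shipping_label) are hypothetical. The function validates what it receives; the caller decides whether the call is needed at all:

```python
def total_weight(item_weights: list) -> float:
    # guard: validate the arguments; do not trust the caller
    if any(weight < 0 for weight in item_weights):
        raise ValueError(f"negative weight in {item_weights!r}")
    return float(sum(item_weights))


def shipping_label(item_weights: list) -> str:
    # the caller determines whether the call is needed at all
    if not item_weights:
        return "nothing to ship"
    return f"ship {total_weight(item_weights)} kg"
```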

Bringing an Outline to Life #

If one lays out a list of comments in a high-level function as an outline or roadmap for future development, turn that list into a series of function calls with names at least as descriptive as the outline, and remember to remove the comments. A function with a good name does not need a comment, especially in code that calls it

And Do This Other Thing #

Good function names describe what they do. Reading calling code is like a narration: we do this thing (by calling this function) and then that thing (by calling that function). If a function name has the conjunction “And” in it (e.g. ReadAndFormatData()), then it is doing two things that might be better done in two separate functions. An “Or” or “But” in a name implies that the function makes a decision that its caller (or a wrapping function) should make rather than passing a behavioural parameter
For example, BackupDataButNotOnTuesday() might have the test for Tuesday and call a function that does the backup. A better name is RegularBackup() as that is the conceptual thing we want to do. Within RegularBackup(), separate the test for Tuesday and the call to the function that does backups
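
The RegularBackup() idea might look something like this in Python. It is a sketch under assumptions: backup_data() is a stand-in that only returns the names it would have backed up, and the day is passed in rather than read from the clock so the decision is easy to test:

```python
import datetime

TUESDAY = 1  # datetime.date.weekday() counts Monday as 0


def is_backup_day(day: datetime.date) -> bool:
    # the decision lives in one small function
    return day.weekday() != TUESDAY


def backup_data(source_names: list) -> list:
    # stand-in for the real backup: returns what it would have backed up
    return [f"{name}.bak" for name in source_names]


def regular_backup(day: datetime.date, source_names: list) -> list:
    # coordinates the test for Tuesday and the call that does the backup
    if not is_backup_day(day):
        return []
    return backup_data(source_names)
```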

Righteous Naming

2023-12-26T11:06:20-08:00

If names are not correct, language will not be in accordance with the truth of things — Confucius (attributed)

Good names take effort[^1]. Bad names, eventually, take more effort. Good names help code look like (awkward) prose[^3] instead of mathematical formulas. Good names help readers focus on what the code is getting done rather than “WtF is this code doing‽”. Good names reduce the need for comments or the readers to drill into the functions’ code[^4]

Think of a big pile of tools, materials, and supplies. To find anything, one has to dig through, on average, half the pile. It’s like one big function or script

Think of the same stuff in drawers and boxes with missing or vague labels. One looks through boxes unnecessarily because the contents are not obvious from the boxes’ labels (or one does not understand or does not trust the labels)

Think of the same stuff in labeled boxes grouped by some cognitive theme: material type; power source; scale of operation; etc. Not only can one know what is in a box without looking inside, one knows which shelf or wall holds the sort of boxes one wants. This is what well-named functions provide

Finally, using smaller drawers and boxes reduces the amount of stuff one looks through within the drawer or box, and smaller functions might be useful in more places because they fit in smaller spaces

Name variables, fields, attributes, accessors, mutators, packages, and files so they describe what they are; name functions and methods so they describe what they accomplish, not how they accomplish it

    filtered_items = remove_backordered_items(items)
    items_after_discount = discount_bulk_orders(filtered_items)
    items_by_ship_date = group_by_ship_date(items_after_discount)
    return items_by_ship_date

[^1]: There are only two hard things in Computer Science: cache invalidation and naming things. — Phil Karlton

[^3]: Good code reads like well-written prose —Grady Booch

[^4]: as long as the function is mindful and does one thing

Also published as Rightous Naming on Medium

We Can Afford To Buy Vowels

2023-12-05T21:39:27-08:00

TL;DR Programs must be written for people to read, and only incidentally for machines to execute —Harold Abelson

In the old days <eyeroll/>, we had 26 variables: the letters of the alphabet. The next leap in programming, we could add one digit after the letter. That was not good enough for development then, and it surely is not now

With no restriction on the length of names, the vestigial traditions of opaque naming cannot hide behind brevity or idiom, and certainly not consistency¹. The only measure for names is their utility in communication

A line of code like r.Get(true) is opaque. While IDEs can help decode what the r and the true mean, using descriptive names allows the code to speak for itself: orderReader.Get(CacheDataOk)
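
The same contrast, sketched in Python. OrderReader, CachePolicy, and its members are hypothetical names standing in for the r and the true above; the reader's behaviour is an assumption made only so the example runs:

```python
from enum import Enum


class CachePolicy(Enum):
    # descriptive names replace the bare True/False
    CACHED_DATA_OK = "cached data ok"
    FRESH_DATA_ONLY = "fresh data only"


class OrderReader:
    def __init__(self, stored_orders: dict):
        self._stored_orders = stored_orders
        self._cached_orders = {}

    def get(self, order_id: int, cache_policy: CachePolicy) -> str:
        may_use_cache = cache_policy is CachePolicy.CACHED_DATA_OK
        if may_use_cache and order_id in self._cached_orders:
            return self._cached_orders[order_id]
        order = self._stored_orders[order_id]
        self._cached_orders[order_id] = order
        return order
```

A call such as order_reader.get(42, CachePolicy.CACHED_DATA_OK) explains itself at the call site, with no IDE hover required.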

Multi-letter abbreviations harken back to Hungarian notation: abjad² type names prefixed or suffixed to a (presumably) descriptive name (e.g. submitBtn). Or they presume screens are fixed at 80 x 40 characters (e.g. DTE for “Date”). While names might be too long³, unabbreviated names are easier and more accurate for humans to read; code readers should be free to focus on the why of the code, not the what does this mean‽ of the code

Good code reads like well-written prose —Grady Booch

emerson ↩
abjad: a writing style omitting vowels (e.g. Semitic) ↩
my coworkers won’t believe I said this ↩

A Function's Interface Is Its Bond

2023-12-05T21:16:05-08:00

TL;DR: A function must respect all arguments or return an error

The Arguments #

The only realistic way a function can fail its interface is in how it treats its arguments, because mis-typed arguments or return values are usually caught by a framework

All The Arguments #

Be they required or optional arguments that are present, functions must respect all the arguments; they cannot ignore some because they are inconvenient or contradictory

“I’d like a number four with no cheese” #

One cannot stop paying attention just because we know what a #4 is. Functions and especially REST endpoints often accept resource ids or primary key values, which implicitly identify specific resources. Tempting as this easy lookup is, the function must apply all the arguments even if they are additional criteria. With a key value in hand, the function does not need additional criteria, but it cannot ignore them

A function could optimize¹ by having a dedicated code path retrieving data via the key and testing that data against any additional criteria while having another code path to query by criteria without a key value, but that is unlikely to be a good decision. Datastores usually² are perfectly capable of making this sort of optimization

No One Would Ask For That #

The function cannot make assumptions about what the caller wants; it must provide what the caller actually asked for, even if that seems nonsensical

While a USA address’ state is implicit in its zip code³, if a function accepts both state and zip code, it must ensure that its results meet both requirements: if the supplied state does not match the implicit state from the zip code, then it must return no addresses
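
A sketch of respecting both criteria, in Python. The address records (dictionaries with state and zip_code keys) and the function name are assumptions for illustration. Every supplied criterion is applied, so a contradictory request returns no addresses:

```python
from typing import Optional


def find_addresses(addresses: list, state: Optional[str] = None,
                   zip_code: Optional[str] = None) -> list:
    # apply every criterion the caller supplied; never drop one as redundant
    matches = addresses
    if state is not None:
        matches = [address for address in matches if address["state"] == state]
    if zip_code is not None:
        matches = [address for address in matches if address["zip_code"] == zip_code]
    return matches
```

Asking for state "OR" with a Washington zip code yields an empty list, which is exactly what the caller asked for.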

And Nothing But the Arguments #

If a function abides by all the arguments it receives, then receiving an unknown argument must be an error condition. For compiled languages, spurious arguments are impossible. For other languages, they might be allowed. Take the time to detect and report them
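
For a language that allows spurious arguments, detecting them can be short. This Python sketch assumes a query-string interface with two hypothetical parameter names (state and zip_code); parse_qs is from the standard library:

```python
from urllib.parse import parse_qs

ALLOWED_PARAMETERS = {"state", "zip_code"}  # hypothetical parameter names


def parse_address_query(query_string: str) -> dict:
    # parse_qs maps each name to the list of values supplied for it
    parameters = parse_qs(query_string, keep_blank_values=True)

    unknown_names = sorted(set(parameters) - ALLOWED_PARAMETERS)
    if unknown_names:
        # report the spurious arguments instead of silently ignoring them
        raise ValueError(f"unknown parameters: {', '.join(unknown_names)}")

    # keep the last value given for each name
    return {name: values[-1] for name, values in parameters.items()}
```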

HTTP is a line format: a text message where the contents have semantic meaning depending on what part of the message one looks at. One can often translate the HTTP method (verb) and path into a function call and required parameter (with the query-string acting as additional, usually optional, parameters). While HTTP frameworks can often treat extra and missing parameters as an error (404 or 422), doing so is usually optional

“…; premature optimization is the root of all evil (or at least most of it) in programming.” - Donald Knuth ↩
always? ↩
a business rule of the USPS ↩

Walk, Don't Race

2023-08-28T21:42:39-07:00

TL;DR #

Cha-Cha-Changes #

Breaking changes happen, and the breaking occurs between the layers (e.g. UI and API, API and database). Rather than trying to release new layers simultaneously (especially with edge distribution), commit the time and effort to a four or five-step pattern that ensures the layers work perfectly at all times, losing no data

1. Create the Change You Want To Be #

Create the goal: the new schema; the new endpoint; the new protocol. Test it, benchmark it, and be happy with it. Copy and translate the old system’s data to the new system (handle the subsequent data changes in step 3)
For example, instead of changing the data type of a column, create a new column of the desired type

2. Embrace the New While Respecting the Old #

The Problem With Data #

It’s always the data

The transition depends on how the old and new systems share data

Type A: if the clients are making read-only calls (e.g. calculations), then both systems run independently; updated clients choose to call the new system in step 4
Type B: If the clients modify data, then clients must update both datastores (or the datastores must synchronize with each other — e.g. two-way database synchronization) or both the original and the replacement columns. Wait until Step 3 to ensure that the old and new data sets match before depending on the new datastore

3. What Is Old Is New Again #

For Type B, once all of the clients update the new dataset/column, synchronize the data from old to new to catch any missing data (e.g. data updated since step 1)

4. When Choosing Between Two Evils, Choose the Lesser One #

All the clients stop depending on the old datastore, columns, APIs, etc.

5. I Hope No One Is Using This … #

Once all the clients stop using the old datastore, remove it

Make it invisible with a Proxy #

If one cannot influence one’s clients to participate in the hijinks described above, create a proxy or facade with the old system’s interface that will do the double-updates to the new system and, when ready, use the new system for getting data

This proxy can remain as a translation layer or be deprecated as clients lose the option of the old interface
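
A rough sketch of such a proxy, in Python. Everything here is an assumption for illustration: two dictionaries stand in for the old and new datastores, the old one holds text, and the new one holds integers (echoing the changed column type from step 1):

```python
class LegacyStoreProxy:
    """Hypothetical facade: keeps the old interface while updating both stores."""

    def __init__(self, old_store: dict, new_store: dict):
        self._old_store = old_store
        self._new_store = new_store
        self._read_from_new = False

    def save(self, key: str, value_text: str) -> None:
        # double-update: the old store keeps text, the new store keeps an integer
        self._old_store[key] = value_text
        self._new_store[key] = int(value_text)

    def load(self, key: str) -> str:
        # callers always receive the old interface's text form
        if self._read_from_new:
            return str(self._new_store[key])
        return self._old_store[key]

    def read_from_new_store(self) -> None:
        # flip once the old and new data sets are known to match
        self._read_from_new = True
```

Clients keep calling save() and load() as before; the switch to the new store happens behind the facade.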

A Simple Caching Pattern in Go

2023-03-17T13:49:47-07:00

One might use it like this:

var myCache = NewOurCache(0, calculateTimestamp)
// ...
var myValue OurCacheType = myCache.Open(dateText, timeText)

OurCacheType is an alias for the actual data cached. It can be any type including a structure or pointer. composeKey() is an example of how to consistently create a string key from whatever identifies the data. These parameters usually match the parameters of the createAValue function (if used)

Use the cache by calling Read() and, if it cannot find the value, Write() to store it. Alternatively, use something like the Open() function, which returns the cached value if it is in the cache; otherwise it generates the value (using the createAValue function), caches it, and returns it[^1]

[^1]: My naming convention is:
find("key") returns the value and a boolean (like aMap["key"])
get("key") excepts/panics if it cannot return the value
open("key") returns an existing instance or creates the value

If a cache is disabled, Read() always returns found == false (even if it has the desired key already in data) and Write() does nothing. Open() will call the generation function every time

import (
    "sync"
    "sync/atomic"
)

// OurCacheType is not necessarily an int; whatever the cache holds, e.g. structures
type OurCacheType = int

type ourCacheData map[string]OurCacheType

// the function that does the work
type getItemFunc func(partOfKey, otherPartOfKey string) OurCacheType

type OurCache struct {
    capacity          int
    data              ourCacheData
    createAValue      getItemFunc
    hitCount          int32
    missCount         int32
    overCapacityCount int32
    semaphore         *sync.RWMutex
    enabled           bool
}

type ourCacheDatum struct {
    key   string
    value OurCacheType
}

// NewOurCache correctly initializes a cache instance
func NewOurCache(capacity int, createFunc getItemFunc,
    data ...ourCacheDatum) OurCache {

    result := OurCache{capacity: capacity, createAValue: createFunc,
        semaphore: &sync.RWMutex{}, data: ourCacheData{},
        enabled: true} // or start it disabled (e.g. to test without cached values)

    // initializing the cache is useful for testing
    for _, datum := range data {
        result.data[datum.key] = datum.value
    }
    return result
}

func (cache *OurCache) Open(partOfKey, otherPartOfKey string,
) OurCacheType {
    key := composeKey(partOfKey, otherPartOfKey)
    value, found := cache.Read(key)
    if !found {
        value = cache.createAValue(partOfKey, otherPartOfKey)
        cache.Write(key, value)
    }
    return value
}

func (cache *OurCache) Read(key string) (result OurCacheType, found bool) {
    cache.semaphore.RLock()
    defer cache.semaphore.RUnlock() // runs when Read() returns

    if !cache.enabled {
        return result, false
    }
    result, found = cache.data[key]

    if found {
        atomic.AddInt32(&(cache.hitCount), 1)
    } else {
        atomic.AddInt32(&(cache.missCount), 1)
    }

    return result, found
}

func (cache *OurCache) Write(key string, value OurCacheType) {
    cache.semaphore.Lock()
    defer cache.semaphore.Unlock()

    if cache.enabled {
        cache.checkCapacity()
        cache.data[key] = value
    }
}

// called from Write(), which has a write-lock
func (cache *OurCache) checkCapacity() {
    if cache.capacity > 0 && (len(cache.data) >= cache.capacity) {
        cache.data = ourCacheData{}
        atomic.AddInt32(&(cache.overCapacityCount), 1)
    }
}

func (cache *OurCache) Size() int {
    cache.semaphore.RLock()
    defer cache.semaphore.RUnlock()

    result := len(cache.data)
    return result
}

func (cache *OurCache) HitCount() int {
    result := atomic.LoadInt32(&cache.hitCount)
    return int(result)
}

func (cache *OurCache) MissCount() int {
    result := atomic.LoadInt32(&cache.missCount)
    return int(result)
}

func (cache *OurCache) OverCapacityCount() int {
    result := atomic.LoadInt32(&cache.overCapacityCount)
    return int(result)
}

func (cache *OurCache) Abled() bool {
    cache.semaphore.RLock()
    defer cache.semaphore.RUnlock()

    return cache.enabled
}

func (cache *OurCache) Enable() {
    cache.semaphore.Lock()
    defer cache.semaphore.Unlock()

    cache.enabled = true
}

func (cache *OurCache) Disable() {
    cache.semaphore.Lock()
    defer cache.semaphore.Unlock()

    cache.enabled = false
}

// Clear when the source of data changes
func (cache *OurCache) Clear() {
    cache.semaphore.Lock()
    defer cache.semaphore.Unlock()

    cache.data = ourCacheData{}
}

func (cache *OurCache) Reset() {
    cache.semaphore.Lock()
    defer cache.semaphore.Unlock()

    cache.data = ourCacheData{}
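    // the counters are read atomically elsewhere, so store them atomically too
    atomic.StoreInt32(&cache.hitCount, 0)
    atomic.StoreInt32(&cache.missCount, 0)
    atomic.StoreInt32(&cache.overCapacityCount, 0)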
}

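func (cache *OurCache) Statistics() (size, capacity, hit, miss, over int, enabled bool) {
    cache.semaphore.RLock()
    defer cache.semaphore.RUnlock()

    // read the fields directly: Size() and Abled() take the read lock themselves,
    // and sync.RWMutex does not support recursive read-locking
    return len(cache.data), cache.capacity, cache.HitCount(),
        cache.MissCount(), cache.OverCapacityCount(), cache.enabled
}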

// an example of a deterministic way of generating a unique string key for a cachable value
// e.g. a consistent string representation of a Time
func composeKey(partOfKey, otherPartOfKey string) string {
    return partOfKey + "|" + otherPartOfKey
}

A generic version is possible but requires a lot of sacrifices like exposing and delegating the format of the key strings to the caller and losing the variable parameters in NewOurCache(). I could not figure out a way to define a function type that returned the type of the cache, but that critical piece might be possible

One hundred lines of code is a small price to pay for type safety

Single Responsibility Lines‽

2022-07-14T22:22:35-07:00

From Wikipedia:

The Single Responsibility Principle (SRP) is a computer programming idea: every module or class should have responsibility for a single functionality. All its services should be narrowly aligned with that responsibility. I leave the discussion about SRP in modules and classes to the plethora of sources addressing them

Yeah, But Every Line? #

Yes, every line should perform a single¹ task. Even if you are using Perl, every line should have one specific goal. Complex expressions are ¡very impressive¡, but such preening makes the code hard to understand². Breaking complex expressions and commands into informational variables and steps makes them easier to read³, easier to debug, and easier to refactor

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. — Charles Hoare

Good Names Help #

Proper naming, as well as intermediate and informational variables, reduce the need for comments⁴ by making the code communicate well. They make debugging and logging easy and consistent. Intermediate variables do not make code slower (or faster). Interpreters and compilers break the source down to the operator level; a little whitespace and a local token are not going to faze them⁵

Do not confuse a compound expression with an overly complex one, although length is a consideration. (x > 2) && (x < 14) is a single test, and if it fits neatly on one line, is a perfectly nice expression

This monstrosity

state_of_being.find("walk to school", False) ||
           state_of_being.find("take your lunch", False) ||
                           are_tautologies_welcome_here

even with the descriptive names, is not readable
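One way to untangle it, sketched in Python: give each lookup its own line and its own name, then combine the names. This assumes state_of_being behaves like a dictionary of flags, with dict.get standing in for find; the sample values and the name is_ready_for_school are invented for illustration.

```python
# Hypothetical data: assumes state_of_being is a plain dict of flags.
state_of_being = {"walk to school": False, "take your lunch": True}
are_tautologies_welcome_here = False

# One lookup per line, each with a name that says what was asked.
walks_to_school = state_of_being.get("walk to school", False)
takes_lunch = state_of_being.get("take your lunch", False)

# The combining test is now a single, readable expression.
is_ready_for_school = (
    walks_to_school or takes_lunch or are_tautologies_welcome_here
)
```

Each intermediate value can now be logged or inspected in a debugger without re-running the lookups.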

Getting Data Is Not the Same as Using Data #

my_data_bag[thingie][wet_sproket == 4 ? "subkey1" : "subkey2"] =
   calculate_important_data(wet_sproket != 4 ? source2 : source1.child)

[^7]: nested and embedded ternaries are a code smell

As with complex mathematical expressions[^7], the inner-most expression evaluates first. Unlike mathematics, we don’t have to do it like this. Figure out the input parameters, then use them to get data, then apply the data

if wet_sproket == 4
    param = source1.child
else
    param = source2

important_data = calculate_important_data(param)

if thingie not in my_data_bag
    my_data_bag[thingie] = {}

if wet_sproket == 4
    my_data_bag[thingie]["subkey1"] = important_data
else
    my_data_bag[thingie]["subkey2"] = important_data

I’ve “wasted” ten lines of code (not counting whitespace) making the assignment readable, understandable, testable, debuggable, and robust

Returning Is a Thing #

Even short examples can have embedded complexity

func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
    }
}

should be more like

func handle(queue chan *Request) {
    for req := range queue {
        result := req.f(req.args)
        req.resultChan <- result
    }
}

Again, clever expressions holding lots of business rules and operations are bad; combining them with a control structure is worse

return this_too() ? "shall" : "pass".strip()

should be more like

result = this_too() ? "shall" : "pass".strip()
return result

or even

if this_too():
    result = "shall"
else:
    result = "pass".strip()

return result

Don’t Confuse Clarity with … #

Brevity #

A common mantra[^908] in business and marketing speak (although that is rarely succinct). We do not pay by the character anymore: we can afford good naming, vowels, and vertical whitespace

[^908]: which is usually followed by buzzwords, sadly enough. See Leveraging Our Synergies

Loquation #

Artificially long names don’t help any more than commenting every line. If good code reads like prose, “Never use a long word where a short one will do” (George Orwell). See also “Homeopathic naming” from Kevlin Henney

Efficiency #

A seemingly straight-forward line like this

for i, value in enumerate(data[::2]):
    value2 = data[i + 1]

tries to do two things at once. Slicing data is an operation that returns a new list holding every other element. The author intended to walk the original list two items at a time, but the loop walks every item of that new list (the first, third, fifth, … elements of data), so i counts positions in the slice and data[i + 1] is not the neighbour of value

This code implements the desired behaviour clearly

oddElements = data[::2]
evenElements = data[1::2]
for i, value1 in enumerate(oddElements):
    value2 = evenElements[i]

Iterating over an array is a separate responsibility from modifying the array
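To see the difference, here is a small runnable sketch that wraps both versions in functions and runs them on the same made-up list. The function names and sample data are mine, and it assumes the list has an even number of items.

```python
def pair_with_neighbour_buggy(data):
    """Mirrors the one-line version: i indexes the slice, not data."""
    pairs = []
    for i, value in enumerate(data[::2]):
        value2 = data[i + 1]
        pairs.append((value, value2))
    return pairs


def pair_with_neighbour(data):
    """Mirrors the two-slice version: each slice is walked separately."""
    odd_elements = data[::2]    # first, third, fifth, ... items
    even_elements = data[1::2]  # second, fourth, sixth, ... items
    pairs = []
    for i, value1 in enumerate(odd_elements):
        value2 = even_elements[i]
        pairs.append((value1, value2))
    return pairs


data = ["a", 1, "b", 2, "c", 3]
print(pair_with_neighbour_buggy(data))  # [('a', 1), ('b', 'b'), ('c', 2)]
print(pair_with_neighbour(data))        # [('a', 1), ('b', 2), ('c', 3)]
```

The buggy version is right for the first pair only, which is exactly the kind of failure a quick manual test will miss.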

Look at how seemingly straight-forward code can hide intent

for i in range(value * 4):

Even this is not too simple to expand to separate lines. This code does not tell us why it is multiplying by four (e.g. quarterly to annual). Using a constant might tell us more, and looking at it as two different operations, iteration and modification, will tell us everything
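A minimal sketch of that expansion, assuming the four really does mean quarters per year. The constant, the function name, and the label format are all hypothetical.

```python
QUARTERS_PER_YEAR = 4  # hypothetical name for the bare 4


def quarter_labels(year_count):
    # Modification: convert years to quarters on its own named line.
    quarter_count = year_count * QUARTERS_PER_YEAR

    labels = []
    # Iteration: the loop header now only manages the loop.
    for quarter_index in range(quarter_count):
        year = quarter_index // QUARTERS_PER_YEAR + 1
        quarter = quarter_index % QUARTERS_PER_YEAR + 1
        labels.append(f"Y{year}Q{quarter}")
    return labels


print(quarter_labels(2))
```

If the business rule ever changes to months, the reader knows which line to touch and why.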

Breaking the alternating array into two parallel arrays (as above) makes using them easy. Clever (worse) code might look like

# this is clever code. don't do this

oddElements = data[::2]
for i, value1 in enumerate(oddElements):
    value2 = data[i * 2 - 1]

This code has a bug in it[^1122], but it’s not obvious. Breaking the second index into its own line might make the bug more obvious

[^1122]: the first iteration reads data[-1], which in Python is quietly the last element rather than an error
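As a sketch of what "its own line" buys us, the version below gives the index a name and records it next to the values it fetched. The sample data and the names partner_index and rows are assumptions for illustration.

```python
data = ["a", 1, "b", 2, "c", 3]

odd_elements = data[::2]
rows = []
for i, value1 in enumerate(odd_elements):
    partner_index = i * 2 - 1  # the bug, now on a line by itself
    value2 = data[partner_index]
    rows.append((value1, partner_index, value2))

for row in rows:
    print(row)
# ('a', -1, 3)
# ('b', 1, 1)
# ('c', 3, 2)

# What the author presumably meant: the item right after value1.
intended_indexes = [i * 2 + 1 for i in range(len(odd_elements))]
print(intended_indexes)  # [1, 3, 5]
```

A negative index in the first row is hard to miss once it has a name and a place in the log.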

Examples #

Do one conceptual thing #

The compound operations are both easy to understand and fit the operation

log.info("user name: ${user.name.lower.trim}")

Does one thing: operates a loop #

for (i = array_length - 1; i >= 0; i--)

Does one thing: records and operates on the result of a call #

if value, err := DoThatThingYouDo(47); err == nil

Does several things #

Forcing code onto one line does no one any favours
Many, many comments might make this understandable (if they stayed up-to-date)

return transylvania_6_5000(total_value < MEDIAN_LIMIT ? total_value + extras(total_value, previous_value) : total_value * 1.2)

Rather, break the expression down into separate lines. The arithmetic is easy to see and understand when each step sits on its own line with a clear name[^14]

[^14]: Nomenclature is the Best Documentation

if total_value < MEDIAN_LIMIT
    total_extras = extras(total_value, previous_value)
    result = calculate_total_extras(total_value + total_extras)
else
    result = calculate_total_extras(total_value * ABOVE_MEDIAN_MARKUP)

return result

or even

if total_value < MEDIAN_LIMIT
    total_extras = extras(total_value, previous_value)
    extras_and_values = total_value + total_extras
    result = calculate_total_extras(extras_and_values)
else
    marked_up_value = total_value * ABOVE_MEDIAN_MARKUP
    result = calculate_total_extras(marked_up_value)

return result

Our goal is clarity; Do What’s Indicated[^98]

[^98]: Elihu King, 1992

Does two things #

This operates a loop by making an evaluation unrelated to the loop itself

// DOES TOO MUCH
for (i = array_length - 1;
      i >= 0 && array[i] > 14 && array[i] < max_val * 1.12;
      i--)

Managing the loop is one thing; testing the data is something else. Testing for a null-terminator might be simple enough to include in the loop-control expression

lowest_allowed_value = 14
highest_allowed_value = max_val * 1.12
for (i = array_length - 1; i >= 0; i--)
    if (array[i] <= lowest_allowed_value) || (array[i] >= highest_allowed_value)
        break

or break the calculation into a function with readable code. If the function name is good, readers will never have to look at that code

for (i = array_length - 1; i >= 0 && value_within_range(array[i]); i--)
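A sketch of what such a function might look like in Python. The constant names, the extra max_val parameter, and the wrapper function are my assumptions; the wrapper walks the list from the end, as the DOES TOO MUCH loop does.

```python
LOWEST_ALLOWED_VALUE = 14
ABOVE_MAX_TOLERANCE = 1.12  # hypothetical name for the bare 1.12


def value_within_range(value, max_val):
    """True when value sits strictly between the two limits."""
    highest_allowed_value = max_val * ABOVE_MAX_TOLERANCE
    return LOWEST_ALLOWED_VALUE < value < highest_allowed_value


def trailing_values_within_range(values, max_val):
    """Collect values from the end, stopping at the first one out of range."""
    accepted = []
    for value in reversed(values):
        if not value_within_range(value, max_val):
            break
        accepted.append(value)
    return accepted


print(trailing_values_within_range([3, 50, 20, 30], max_val=100))
# [30, 20, 50]
```

The loop manages the loop, the function tests the data, and each can be unit tested without the other.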

Smells and Tells #

Ternaries #

Ternary expressions are a common violation of doing one thing. They are best used to fill in default values, make plurals, select the correct indefinite article, and include or omit small bits of text. Apart from trivial formatting operations, don’t call functions in ternaries. Nesting ternaries is Right Out
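For contrast, here is a sketch of the small jobs where a ternary (a conditional expression, in Python) earns its keep. The function names are invented, and the article rule is deliberately naive: it assumes a non-empty noun and ignores words like "hour".

```python
def describe_matches(count, noun):
    # Make a plural: one decision, no function calls inside the ternary.
    plural_suffix = "" if count == 1 else "s"
    return f"{count} {noun}{plural_suffix}"


def with_indefinite_article(noun):
    # Get the data first, then let the ternary pick between two literals.
    first_letter = noun[0].lower()
    article = "an" if first_letter in "aeiou" else "a"
    return f"{article} {noun}"


def greeting(name=None):
    # Fill in a default value.
    display_name = name if name else "stranger"
    return f"Hello, {display_name}"


print(describe_matches(3, "file"))       # 3 files
print(with_indefinite_article("apple"))  # an apple
print(greeting())                        # Hello, stranger
```

In each case the ternary chooses between two simple values and nothing else happens on that line.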

Line Length #

Line length is a broad metric, and often a good one. If a non-string line is wrapping, breaking it into separate lines is usually better than wrapping it. Compilers and interpreters are very good at optimizing code; the number of lines the source spans, even with intermediate variables, is not going to make a meaningful difference to performance or object size

Indentation #

I like four-space indentation because it’s easier to read and makes it obvious when one has nested structures too deeply. Find a block of code starting at a shallow depth and move it to its own function where reasonable

Comments #

Comments cannot compensate for obtuse code. They decay and, in the words of Kevlin Henney, someone who writes obtuse code rarely has the ability to write clear comments

Code Reviews #

If any code reviewer does not understand what a line does, refactor the code instead of just telling them or, at worst, adding a comment. They represent the code-readers of the future (maybe even Future You). Their confusion indicates that the code does not express its purpose. Don’t argue that it is obvious or what a “reasonable” coder would think; take it as evidence that your code failed to communicate with another person[^8]

[^8]: To self-examine, consider an exercise: Explain to Me Like I’m Five. Not only will it help focus on the most important parts, it helps with descriptive naming