Cognitive Load is what matters

This is a living document; last updated May 2024.

Introduction

There are so many buzzwords and best practices out there, but let's focus on something more fundamental. What matters is the amount of confusion developers feel going through the code.

Confusion costs time and money. Confusion is caused by high cognitive load. It's not some fancy abstract concept, but rather a fundamental human constraint.

Since we spend far more time reading and understanding code than writing it, we should constantly ask ourselves whether we are embedding excessive cognitive load into our code.

Cognitive load

Cognitive load is how much a developer needs to think in order to complete a task.

When reading code, you put things like values of variables, control flow logic and call sequences into your head. The average person can hold roughly four such things in working memory. Once the cognitive load reaches this threshold, a significant effort is required to understand things.

Let's say we have been asked to make some fixes to a completely unfamiliar project. We were told that a really smart developer had contributed to it. Lots of cool architectures, fancy libraries and trendy technologies were used. In other words, the previous author had created a high cognitive load for us.

Cognitive Load

We should reduce the cognitive load in our projects as much as possible.

The tricky part is that the previous author may not have experienced a high cognitive load due to familiarity with the project.

Familiarity vs Simplicity
The problem is that familiarity is not the same as simplicity. They feel the same — that same ease of moving through a space without much mental effort — but for very different reasons. Every “clever” (read: “self-indulgent”) and non-idiomatic trick you use incurs a learning penalty for everyone else. Once they have done that learning, then they will find working with the code less difficult. So it is hard to recognise how to simplify code that you are already familiar with. This is why I try to get “the new kid” to critique the code before they get too institutionalised!

It is likely that the previous author(s) created this huge mess one tiny increment at a time, not all at once. So you are the first person who has ever had to try to make sense of it all at once.

In my class I describe a sprawling SQL stored procedure we were looking at one day, with hundreds of lines of conditionals in a huge WHERE clause. Someone asked how anyone could have let it get this bad. I told them: “When there are only 2 or 3 conditionals, adding another one doesn’t make any difference. By the time there are 20 or 30 conditionals, adding another one doesn’t make any difference!”

There is no “simplifying force” acting on the code base other than deliberate choices that you make. Simplifying takes effort, and people are too often in a hurry.

Thanks to Dan North for his comment above.

Once you onboard new people on your project, try to measure the amount of confusion they have (pair programming may help). If they're confused for more than ~40 minutes in a row - you've got things to improve.
Cognitive load and interruptions

Types of cognitive load

Intrinsic - caused by the inherent difficulty of a task. It can't be reduced, it's at the very heart of software development.

Extraneous - created by the way the information is presented. Caused by factors not directly relevant to the task, such as smart author's quirks. Can be greatly reduced. We will focus on this type of cognitive load.

Intrinsic vs Extraneous

Let's jump straight to the concrete practical examples of extraneous cognitive load.

P.S. contributions are welcome!


We will refer to the level of cognitive load as follows:
🧠: fresh working memory, zero cognitive load
🧠++: two facts in our working memory, cognitive load increased
🤯: working memory overflow, more than 4 facts

Complex conditionals

if val > someConstant // 🧠+
    && (condition2 || condition3) // 🧠+++, prev cond should be true, one of c2 or c3 has to be true
    && (condition4 && !condition5) { // 🤯, we are messed up here
    ...
}

Introduce intermediate variables with meaningful names:

isValid = val > someConstant
isAllowed = condition2 || condition3
isSecure = condition4 && !condition5 
// 🧠, we don't need to remember the conditions, there are descriptive variables
if isValid && isAllowed && isSecure {
    ...
}

Nested ifs

if isValid { // 🧠+, okay nested code applies to valid input only
    if isSecure { // 🧠++, we do stuff for valid and secure input only
        stuff // 🧠+++
    }
} 

Compare it with the early returns:

if !isValid
    return
 
if !isSecure
    return

// 🧠, we don't really care about earlier returns, if we are here then all good

stuff // 🧠+

We can focus on the happy path only, thus freeing our working memory from all sorts of preconditions.
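
For a runnable version of the same contrast, here is a minimal Python sketch. The `Request` type and both function names are hypothetical and exist only for illustration; the "stuff" is reduced to upper-casing a payload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    # Hypothetical input type, used only for this illustration.
    is_valid: bool
    is_secure: bool
    payload: str


def process_nested(request: Request) -> Optional[str]:
    # Nested version: each level adds a precondition the reader must keep in mind.
    if request.is_valid:
        if request.is_secure:
            return request.payload.upper()
    return None


def process_flat(request: Request) -> Optional[str]:
    # Guard clauses: reject bad input first, then stop thinking about it.
    if not request.is_valid:
        return None
    if not request.is_secure:
        return None

    # Only the happy path remains from here on.
    return request.payload.upper()
```

Both functions behave the same; the flat one simply lets the reader drop each precondition as soon as it has been checked.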

Inheritance nightmare

We are asked to change a few things for our admin users: 🧠

AdminController extends UserController extends GuestController extends BaseController

Ohh, part of the functionality is in BaseController, let's have a look: 🧠+
Basic role mechanics got introduced in GuestController: 🧠++
Things got partially altered in UserController: 🧠+++
Finally we are here, AdminController, let's code stuff! 🧠++++

Oh, wait, there's SuperuserController which extends AdminController. By modifying AdminController we can break things in the inherited class, so let's dive into SuperuserController first: 🤯

Prefer composition over inheritance. We won't go into detail - there's plenty of material out there.
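
As one possible illustration, the sketch below shows an admin controller that holds a collaborator instead of inheriting from a chain of controllers. `RoleChecker`, `delete_user` and the role names are assumptions made up for this example, not part of any real codebase.

```python
from typing import Iterable


class RoleChecker:
    """Hypothetical component that owns the role mechanics in one place."""

    def __init__(self, allowed_roles: Iterable[str]) -> None:
        self._allowed_roles = frozenset(allowed_roles)

    def is_allowed(self, role: str) -> bool:
        return role in self._allowed_roles


class AdminController:
    """Receives its collaborators explicitly; there is no parent class to read first."""

    def __init__(self, roles: RoleChecker) -> None:
        self._roles = roles

    def delete_user(self, actor_role: str, user_id: int) -> str:
        # Everything this method depends on is visible right here.
        if not self._roles.is_allowed(actor_role):
            return "forbidden"
        return f"user {user_id} deleted"


admin = AdminController(RoleChecker({"admin", "superuser"}))
print(admin.delete_user("admin", 7))  # user 7 deleted
print(admin.delete_user("guest", 7))  # forbidden
```

To change admin behaviour we read one class and the one component it is given, rather than four ancestors and a descendant.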

Too many small methods, classes or modules

Method, class and module are interchangeable in this context

Mantras like "methods should be shorter than 15 lines of code" or "classes should be small" turned out to be somewhat wrong.

Deep module - simple interface, complex functionality
Shallow module - interface is complex relative to the small functionality it provides

Deep module

Having too many shallow modules can make it difficult to understand the project. Not only do we have to keep in mind each module's responsibilities, but also all their interactions. To understand the purpose of a shallow module, we first need to look at the functionality of all the related modules. 🤯

Information hiding is paramount, and we don't hide as much complexity in shallow modules.

I have two pet projects, each about 5K lines of code. The first one has 80 shallow classes, whereas the second one has only 7 deep classes. I hadn't maintained either of these projects for a year and a half.

Once I came back, I realised that it was extremely difficult to untangle all the interactions between those 80 classes in the first project. I would have to rebuild an enormous amount of cognitive load before I could start coding. On the other hand, I was able to grasp the second project quickly, because it had only a few deep classes with a simple interface.

The best components are those that provide powerful functionality yet have simple interface.
John K. Ousterhout

The interface of the UNIX I/O is very simple. It has only five basic calls:

open(path, flags, permissions)
read(fd, buffer, count)
write(fd, buffer, count)
lseek(fd, offset, referencePosition)
close(fd)

A modern implementation of this interface has hundreds of thousands of lines of code. Lots of complexity is hidden under the hood. Yet it is easy to use due to its simple interface.
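
To get a feel for how little the caller needs to know, here is a small sketch that uses Python's thin wrappers around these five calls (`os.open`, `os.write`, `os.lseek`, `os.read`, `os.close`). The Python signatures differ slightly from the C ones: `os.write` takes the bytes directly and `os.read` returns them. The function name and the file name are arbitrary choices for this example.

```python
import os
import tempfile


def unix_io_roundtrip() -> bytes:
    """Write a string, seek back into it, and read four bytes using only the five calls."""
    with tempfile.TemporaryDirectory() as scratch_dir:
        path = os.path.join(scratch_dir, "demo.txt")  # arbitrary file name

        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # open(path, flags, permissions)
        try:
            os.write(fd, b"hello, deep module")  # write(fd, buffer, count)
            os.lseek(fd, 7, os.SEEK_SET)         # lseek(fd, offset, referencePosition)
            return os.read(fd, 4)                # read(fd, buffer, count)
        finally:
            os.close(fd)                         # close(fd)


print(unix_io_roundtrip())  # b'deep'
```

Nothing in this snippet mentions file systems, block devices, caching or scheduling, even though all of them are involved underneath.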

This deep module example is taken from the book A Philosophy of Software Design by John K. Ousterhout. Not only does this book cover the very essence of complexity in software development, but it also has the greatest interpretation of Parnas' influential paper On the Criteria To Be Used in Decomposing Systems into Modules. Both are essential reads. Other related readings: It's probably time to stop recommending Clean Code, Small Functions considered Harmful, Linear code is more readable.

If you think we are rooting for bloated God objects with too many responsibilities, you got it wrong.

Too many shallow microservices

We can apply the above scale-agnostic principle to microservices architecture as well. Too many shallow microservices won't do any good - the industry is heading towards something like "macroservices", i.e., services that aren't that shallow. One of the worst and hardest-to-fix phenomena is the so-called distributed monolith, which is often the result of this overly granular shallow separation.

I once consulted a startup where a team of three developers introduced 17(!) microservices. They were 10 months behind schedule and appeared nowhere close to the public release. Every new requirement led to changes in 4+ microservices. Diagnostic difficulty in integration space skyrocketed. Both time to market and cognitive load were unacceptably high. 🤯

Is this the right way to approach the uncertainty of a new system? It's enormously difficult to elicit the right logical boundaries in the beginning, and by introducing too many microservices we make things worse. The team's only justification was: "The FAANG companies proved microservices architecture to be effective".

A well-crafted monolith with truly isolated modules is often much more convenient and flexible than a bunch of microservices. It's only when the need for separate deployments becomes crucial (e.g. development team scaling) that you should consider adding a network layer between the modules (future microservices).

Feature-rich languages

We feel excited when new features get released in our favourite language. We spend some time learning these features, we build code upon them.

If there are lots of features, we may spend half an hour playing with a few lines of code, to use one or another feature. And it's kind of a waste of time. But what's worse, when you come back later, you would have to recreate that thought process! 🤯

You not only have to understand this complicated program, you have to understand why a programmer decided this was the way to approach a problem from the features that are available.

These statements were made by none other than Rob Pike.

Reduce cognitive load by limiting the number of choices.

Language features are OK, as long as they are orthogonal to each other.

Thoughts from an engineer with 20+ years of C++ experience ⭐️
I was looking at my RSS reader the other day and noticed that I have about three hundred unread articles under the "C++" tag. I haven't read a single article about the language since last summer, and I feel great!

I've been using C++ for 20 years now, that's almost two-thirds of my life. Most of my experience lies in dealing with the darkest corners of the language (such as undefined behaviours of all sorts). It's not a reusable experience, and it's kind of creepy to throw it all away now.

Like, can you imagine, the token || has a different meaning in requires ((!P<T> || !Q<T>)) and in requires (!(P<T> || Q<T>)). The first is the constraint disjunction, the second is the good-old logical OR operator, and they behave differently.

You can't allocate space for a trivial type and just memcpy a set of bytes there without extra effort - that won't start the lifetime of an object. This was the case before C++20. It was fixed in C++20, but the cognitive load of the language has only increased.

Cognitive load is constantly growing, even though things got fixed. I should know what was fixed, when it was fixed, and what it was like before. I am a professional after all. Sure, C++ is good at legacy support, which also means that you will face that legacy. For example, last month a colleague of mine asked me about some behaviour in C++03. 🤯

There were 20 ways of initialization. Uniform initialization syntax has been added. Now we have 21 ways of initialization. By the way, does anyone remember the rules for selecting constructors from the initializer list? Something about implicit conversion with the least loss of information, but if the value is known statically, then... 🤯

This increased cognitive load is not caused by a business task at hand. It is not an intrinsic complexity of the domain. It is just there due to historical reasons (extraneous cognitive load).

I had to come up with some rules. Like, if a line of code is not obvious and I have to remember the standard to understand it, I'd better not write it that way. The standard is about 1500 pages long, by the way.

By no means am I trying to blame C++. I love the language. It's just that I am tired now.

Business logic and HTTP status codes

On the backend we return:
401 for expired jwt token
403 for not enough access
418 for banned users

The guys on the frontend use backend API to implement login functionality. They would have to temporarily create the following cognitive load in their brains:
401 is for expired jwt token // 🧠+, ok just temporarily remember it
403 is for not enough access // 🧠++
418 is for banned users // 🧠+++

Frontend devs would (hopefully) introduce variables/functions like isTokenExpired(status), so that subsequent generations of developers wouldn't have to recreate this kind of status -> meaning mapping in their brains.
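
Such helpers could look roughly like the sketch below. It is written in Python for brevity rather than in a frontend language, and every name other than the idea of `isTokenExpired(status)` is an assumption. The constants mirror the mapping used by the backend in this example, not a general convention.

```python
# Status codes as the backend in this example uses them (not a general rule).
TOKEN_EXPIRED_STATUS = 401
NOT_ENOUGH_ACCESS_STATUS = 403
USER_BANNED_STATUS = 418


def is_token_expired(status: int) -> bool:
    return status == TOKEN_EXPIRED_STATUS


def is_access_denied(status: int) -> bool:
    return status == NOT_ENOUGH_ACCESS_STATUS


def is_user_banned(status: int) -> bool:
    return status == USER_BANNED_STATUS


# Call sites read as intent; the number-to-meaning mapping lives in one place.
if is_token_expired(401):
    print("redirect to login")
```

The mapping still has to be learned once by whoever writes these helpers, and again by anyone outside this codebase, such as QA.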

Then QA people come into play: "Hey, I got 403 status, is that expired token or not enough access?" QA people can't jump straight to testing, because first they have to recreate the cognitive load that the guys on the backend once created.

Why hold this custom mapping in our working memory? It's better to abstract away your business details from the HTTP transfer protocol, and return self-descriptive codes directly in the response body:

{
    "code": "jwt_has_expired"
}

Cognitive load on the frontend side: 🧠 (fresh, no facts are held in mind)
Cognitive load on the QA side: 🧠

The same rule applies to all sorts of numeric statuses (in database or wherever) - prefer self-describing strings. We are not in the era of 640K computers to optimise for memory.
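
A minimal sketch of how a backend might produce such bodies is shown below. Only `jwt_has_expired` comes from the example above; the other two code strings, the `ErrorCode` enum and the `error_body` helper are hypothetical names chosen for illustration.

```python
import json
from enum import Enum


class ErrorCode(str, Enum):
    JWT_HAS_EXPIRED = "jwt_has_expired"      # from the example above
    NOT_ENOUGH_ACCESS = "not_enough_access"  # hypothetical code string
    USER_IS_BANNED = "user_is_banned"        # hypothetical code string


def error_body(code: ErrorCode) -> str:
    """Serialise a self-descriptive error code into a JSON response body."""
    return json.dumps({"code": code.value})


print(error_body(ErrorCode.JWT_HAS_EXPIRED))  # {"code": "jwt_has_expired"}
```

Whoever reads the response, whether a frontend developer or a QA engineer, sees the meaning directly instead of looking up a number.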

People spend time arguing between 401 and 403, making choices based on their level of understanding. But in the end it just doesn't make any sense. We can separate errors into either user-related or server-related, but apart from that, things are kind of blurry. As for following this mystical "RESTful API" and using all sorts of HTTP verbs and statuses, the standard simply doesn't exist. The only valid document on the matter is a paper published by Roy Fielding, dating back to 2000, and it says nothing about verbs and statuses. People get along with just a few basic HTTP statuses and POSTs only, and they are doing just fine.

P.S. It's often mentally taxing to distinguish between "authentication" and "authorization". We can use simpler terms like "login" and "permissions" to reduce the cognitive load.

Abusing DRY principle

Do not repeat yourself - that is one of the first principles you are taught as a software engineer. It is so deeply embedded in us that we cannot stand a few extra lines of code. Although it is a good and fundamental rule in general, when overused it leads to cognitive load we cannot handle.

Nowadays, everyone builds software based on logically separated components. Often those are distributed among multiple codebases representing separate services. When you strive to eliminate any repetition, you might end up creating tight coupling between unrelated components. As a result changes in one part may have unintended consequences in other seemingly unrelated areas. It can also hinder the ability to replace or modify individual components without impacting the entire system. 🤯

In fact, the same problem arises even within a single module. You might extract common functionality too early, based on perceived similarities that might not actually exist in the long run. This can result in unnecessary abstractions that are difficult to modify or extend.

Rob Pike once said:

A little copying is better than a little dependency.

We are so strongly tempted not to reinvent the wheel that we are ready to import large, heavy libraries to use a small function that we could easily write ourselves. This introduces unnecessary dependencies and bloated code. Make informed decisions about when to import external libraries and when it is more appropriate to write concise, self-contained code snippets to accomplish smaller tasks.

Abuse of this principle could lead to indirect coupling (or just unnecessary coupling), premature abstractions and large, generic solutions, maintenance complexity, high cognitive load.
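
The contrast can be sketched with a deliberately tiny example. All function names and the invoice/report scenario are assumptions invented for illustration: one shared helper serving two unrelated callers, versus two small functions that look similar today but are free to diverge.

```python
# Premature abstraction: one shared helper serves two unrelated callers,
# so each new requirement adds a flag that both callers have to understand.
def format_amount(cents: int, *, for_invoice: bool = False, for_report: bool = False) -> str:
    text = f"{cents / 100:.2f}"
    if for_invoice:
        text = "$" + text
    if for_report:
        text = text + " USD"
    return text


# A little copying: each caller owns a tiny function and can change it freely.
def invoice_amount(cents: int) -> str:
    return f"${cents / 100:.2f}"


def report_amount(cents: int) -> str:
    return f"{cents / 100:.2f} USD"


print(invoice_amount(1250))  # $12.50
print(report_amount(1250))   # 12.50 USD
```

The duplicated formatting expression costs one line; the shared helper couples billing to reporting and forces readers to work out which flag combinations are actually used.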

Tight coupling with a framework

Frameworks evolve at their own pace, which in most cases doesn't match the lifecycle of our project.

By relying too heavily on a framework, we force all upcoming developers to learn that framework first (or its particular version). Even though frameworks enable us to launch MVPs in a matter of days, in the long run they tend to add unnecessary complexity and cognitive load.

Worse yet, at some point frameworks can become a significant constraint when faced with a new requirement that just doesn't fit the architecture. From here onwards people end up forking a framework and maintaining their own custom version. Imagine the amount of cognitive load a newcomer would have to build (i.e. learn this custom framework) in order to deliver any value. 🤯

By no means do we advocate inventing everything from scratch!

We can write code in a somewhat framework-agnostic way. The business logic should not reside within a framework; rather, it should use the framework's components. Put a framework outside of your core logic. Use the framework in a library-like fashion. This would allow new contributors to add value from day one, without first having to wade through the debris of framework-related complexity.
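
One way to picture this separation is the sketch below. The discount rule, the `Order` type and the handler name are all hypothetical; in a real project the adapter would be a route handler of whatever framework is in use, while the core function would stay free of framework imports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    subtotal_cents: int
    is_first_order: bool


def order_total_cents(order: Order) -> int:
    """Core business rule (made up for this example): plain Python, no framework imports."""
    discount = order.subtotal_cents // 10 if order.is_first_order else 0
    return order.subtotal_cents - discount


def handle_order_request(json_body: dict) -> dict:
    """Thin adapter: the only place that knows about the transport format.

    A framework route handler would play this role and simply call into the core.
    """
    order = Order(
        subtotal_cents=int(json_body["subtotal_cents"]),
        is_first_order=bool(json_body.get("is_first_order", False)),
    )
    return {"total_cents": order_total_cents(order)}


print(handle_order_request({"subtotal_cents": 1000, "is_first_order": True}))  # {'total_cents': 900}
```

A newcomer can read and test `order_total_cents` without knowing anything about the framework around it.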

Hexagonal/Onion architecture

There is a certain engineering excitement about all this stuff.

I myself was a passionate advocate of Onion Architecture for years. I used it here and there and encouraged other teams to do so. The complexity of our projects went up, and the sheer number of files alone doubled. It felt like we were writing a lot of glue code. With ever-changing requirements we had to make changes across multiple layers of abstraction, and it all became tedious. 🤯

Jumping from call to call to read along and figure out what goes wrong and what is missing is a vital requirement for quickly solving a problem. With this architecture’s layer uncoupling, it takes an exponential factor of extra, often disjointed, traces to get to the point where the failure occurs. 🤯

This architecture was something that made intuitive sense at first, but every time we tried applying it to projects it did a lot more harm than good. In the end, we gave it all up in favour of the good old dependency inversion principle. No port/adapter terms to learn, no unnecessary layers of horizontal abstractions, no extraneous cognitive load.
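
As a rough illustration of what plain dependency inversion can look like without any layered vocabulary, consider the sketch below. `UserStore`, `welcome_message` and `InMemoryUserStore` are hypothetical names; this is one possible reading of the principle, not the setup the authors actually used.

```python
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """The abstraction the business logic depends on (hypothetical name)."""

    def find_email(self, user_id: int) -> Optional[str]: ...


def welcome_message(store: UserStore, user_id: int) -> str:
    """High-level policy: knows nothing about databases or HTTP."""
    email = store.find_email(user_id)
    if email is None:
        return "unknown user"
    return f"Welcome, {email}!"


class InMemoryUserStore:
    """Low-level detail; a SQL-backed class could replace it without touching welcome_message."""

    def __init__(self, emails: Dict[int, str]) -> None:
        self._emails = emails

    def find_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)


store = InMemoryUserStore({1: "ann@example.com"})
print(welcome_message(store, 1))  # Welcome, ann@example.com!
print(welcome_message(store, 2))  # unknown user
```

There is one interface at the one point where substitution is actually needed, and no additional layers to keep in mind.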

Do not add layers of abstractions for the sake of an architecture. Add them whenever you need an extension point that is justified for practical reasons. Layers of abstraction aren't free of charge, they are to be held in our working memory.

Even though these layered architectures have accelerated an important shift from traditional database-centric applications to a somewhat infrastructure-independent approach, where the core business logic is independent of anything external, the idea is by no means novel.

These architectures are not fundamental, they are just subjective, biased consequences of more fundamental principles. Why rely on those subjective interpretations? Follow the fundamentals instead: dependency inversion principle, isolation, single source of truth, true invariant, complexity, cognitive load and information hiding.

Discussion

DDD

Domain-driven design has some great points, although it is often misinterpreted. People say "We write code in DDD", which is a bit strange, because DDD is about problem space, not about solution space.

Ubiquitous language, domain, bounded context, aggregate, event storming are all about problem space. They are meant to help us learn the insights about the domain and extract the boundaries. DDD enables developers, domain experts and business people to communicate effectively using a single, unified language. Rather than focusing on these problem space aspects of DDD, we tend to emphasise particular folder structures, services, repositories, and other solution space techniques.

Chances are that the way we interpret DDD is unique and subjective. And if we build code upon this understanding, i.e., if we create a lot of extraneous cognitive load - future developers are doomed. 🤯

Learning from the Giants

Take a look at the overarching design principles of one of the biggest tech companies:
Clarity: The code’s purpose and rationale is clear to the reader.
Simplicity: The code accomplishes its goal in the simplest way possible.
Concision: It is easy to discern the relevant details in the code, and the naming and structure guide the reader through these details.
Maintainability: The code is easy for a future programmer to modify correctly.
Consistency: The code is consistent with the broader codebase.

Does the new fancy buzzword comply with these principles? Or does it only create extraneous cognitive load?

Here's a fun picture
Code Complexity vs. Experience from @flaviocopes

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Brian Kernighan

Conclusion

The intricate and multifaceted nature of cognitive load within the realm of comprehension and problem-solving necessitates a diligent and strategic approach in order to navigate the complexities and optimize mental capacity allocation. 🤯

Do you feel it? The above statement is difficult to understand. We have just created an unnecessary cognitive load in your head. Do not do this to your colleagues.

Smart Author

We should reduce any cognitive load above and beyond what is intrinsic to the work we do.


cognitive-load's Issues

Suggestion: The Mental Game of Python

Some time ago I watched this talk by Raymond Hettinger on YouTube about the Mental Game of Python, where he gives some examples of cognitive load. If you are interested in the topic you might find it useful.

I've asked ChatGPT to make a summary of the talk:

In the talk "The Mental Game of Python," Raymond Hettinger, a Python core developer, presents ten programming strategies using live examples. Here are a few key points from the talk:

  1. Chunking and Aliasing: Hettinger brings up the theory that the human mind can only handle/remember 7 pieces of information at a time, give or take 2. Any more information can cause cognitive overload and errors. Hence, in programming, he emphasizes the importance of ensuring that programmers can use these cognitive 'slots' to improve the code, rather than having to decipher complex logic. This is achieved by modularizing and standardizing through functions, modules, and packages. Hettinger uses Python's random module to highlight the importance of chunking and modular code (source).

  2. Problem Solving: Hettinger quotes the Feynman method of problem-solving: write down a clear problem specification; think very, very hard; write down a solution. He demonstrates the strategies of incremental development and solving simpler programs using the example of a tree walker, showing how they can help build programs that solve complex problems (source).

  3. DRY Principle: Hettinger also discusses the Don't Repeat Yourself (DRY) principle in the context of object-oriented programming (OOP), classes, and inheritance. He suggests that programmers should manually repeat tasks until patterns emerge before moving code into functions, with examples from file conversion to illustrate his point (source).

  4. OOP as a Graph Traversal Problem: Hettinger argues that OOP is essentially a graph traversal problem. Given the richness of the Python ecosystem, he suggests that there's often no need to create new classes. Instead, programmers can identify their current position on the 'graph,' check where they need to go, and use available methods or write new ones to achieve their goals (source).

Example needed for onion-architecture

Hey,

I am myself an advocate of onion architecture.

You said:

we gave it all up in favour of the good old dependency inversion principle

I'd like to see an example, as for me, doing onion architecture is mainly "dependency inversion principle" plus keeping DOMAIN, BUSINESS LOGIC (Use cases) and INFRASTRUCTURE in 3 different layers, glued together with "dependency inversion principle".

I'd love to see an explanation of how "favour of the good old dependency inversion principle" is different from "onion architecture" so I can better understand that section.

best regards,
Maciej

Fallout counter

[Image: Fallout screenshot showing the player character's Action Points (AP)]

Having a web app for people to record and share where their cognitive Action Points are spent would help many stubborn devs and (maybe even more important) unaware managers to get what Development eXperience is all about, and how much money and/or how many potential contributors they are potentially losing every day.

In the Fallout screen above (taken from this guide) the player character has 10 AP until next turn. In development world each turn is a day, and I would say that 7 AP is the most for most people to handle. Unless there is a SOP (standard operating procedure) that you can follow mindlessly. Realistically I would say 4 AP is the limit per task after which people may start to feel tired.

Unlike physical activity, when a person is exhausted mentally, it can be draining and non-recoverable. A task that takes too much AP will deplete them over and over, even without taking actual steps and just thinking about them. I have observed that in people learning programming, and I observe it in myself when I bite off more than I can chew.

After the AP reserve is used up, the person falls out. Good design with respect to cognitive load reduces AP per action (for everybody) and extends the AP limit (for a specific person). Making things simpler and automating things can both help. Fallout AP can be used as a real metric. Not necessarily like a KPI/OKR metric, but probably a metric on its own.

Some concerns from a researcher in this space

Hi, I do research about the cognitive factors of programming. I just completed my Ph.D. at Stanford involving studies of cognitive load in program comprehension, as detailed here: https://arxiv.org/abs/2101.06305

Thanks for putting together this document! You bring up many important points about what makes code easier and harder to read. I'm sure that programmers will learn something useful by reading this document and using its ideas to improve their own code. I appreciate that you keep the focus close to working memory — many people (including researchers!) will invoke "cognitive load" to just mean "a thing is hard to think about", rather than the specific meaning "a task requires a person to hold information in their working memory".

However, my concern with this document is that cognitive load is still a very specific cognitive phenomenon about the use of working memory under specific individual, task, and environmental conditions. We have essentially no experimental data about how to optimize programs to minimize cognitive load. But this document presents a lot of programmer folklore under the authority of "reducing cognitive load", which I worry presents a veneer of scientific-ness to a subject that has very little scientific backing. This document presents ideas that I suspect most developers intuitively agree with (composition > inheritance, too many microservices is bad), and then retroactively justifies these via cognitive load. Readers get to think "ah good, there's science to back up my feelings," but there's no real science there!

Here are two examples from the document that I think misuse the concept of cognitive load.

"Inheritance nightmare"

Ohh, part of the functionality is in BaseController, let's have a look: 🧠+
Basic role mechanics got introduced in GuestController: 🧠++
Things got partially altered in UserController: 🧠+++
Finally we are here, AdminController, let's code stuff! 🧠++++ [..]

Prefer composition over inheritance. We won't go into detail - there's plenty of material out there.

What exactly is being held in a person's memory here? The contents of the function? The name of the class holding the content? The location of the class in the file? A visual representation of the inheritance hierarchy? The details matter! And are these details even held in working memory? Given the names, a person might be able to infer that UserController is a subclass of BaseController, and not need to store that fact in WM.

It sounds like the issue actually being described here is not inheritance, but rather abstraction -- code is separated over a number of functions and modules, but sometimes a person needs to make a cross-cutting change that involves knowledge of all of those separate pieces of code. (This kind of problem is what tools like Code Bubbles try to solve.) There is a working memory story somewhere here, but it's not just about composition versus inheritance! Using this as a "composition is better than inheritance" parable is a misuse of cognitive load.

"Too many small methods, classes or modules"

Mantras like "methods should be shorter than 15 lines of code" or "classes should be small" turned out to be somewhat wrong. [...]

Having too many shallow modules can make it difficult to understand the project. Not only do we have to keep in mind each module's responsibilities, but also all their interactions. To understand the purpose of a shallow module, we first need to look at the functionality of all the related modules. 🤯

I think this example is just way too abstract to be useful. For example, in theory "shallow modules" could actually reduce cognitive load. If a person internalizes each module, then that module could be a single chunk in working memory. For instance, consider two Rust implementations of a function that computes the minimum of the inverse of a vector of numbers:

fn min_inverse_1(v: Vec<i32>) -> Option<f32> {
  let mut min = None;
  for x in v {
    if x == 0 { 
      continue;
    }
    let n = 1. / (x as f32);
    match min {
      None => min = Some(n),
      Some(n2) => if n < n2 {
        min = Some(n);
      }
    }    
  }
  min
}

fn min_inverse_2(v: Vec<i32>) -> Option<f32> {
  v.into_iter()
    .filter(|&x| x != 0)
    .map(|x| 1. / (x as f32))
    .reduce(f32::min)
}    

min_inverse_2 relies on a system of shallow modules (in a sense). A person reading min_inverse_2 has to understand what into_iter, filter, map, and reduce all mean. The person reading min_inverse_1 only needs to understand the basic features of the language.

However, the fact that min_inverse_2 relies on many external interfaces is not a problem if a person has internalized those definitions. In fact, it is probably easier to see at a glance what its behavior is, and to verify whether it is implemented correctly. Again, that's why I emphasize that cognitive load is heavily dependent on not just the structure of the code, but also the programmer's knowledge and the tools they use.

One other thing... saying that UNIX I/O is "easy to use due to its simple interface" is a very contestable claim. A user of read has to be aware of the entire function specification, which is actually pretty complicated: https://pubs.opengroup.org/onlinepubs/009604599/functions/read.html

In sum...

I would strongly encourage you to consider renaming this the "Code Complexity Handbook", or a comparable title. There is good advice here that reflects the knowledge of experienced software engineers. But the science of working memory applied to programming is far too young to justify the kinds of claims made here.

[Comments] OMG YES

This isn't an "issue" but I just wanted to write a note of support 🎉

A lot of the useful tips and tools mentioned in this document are worth learning and using. OOP, Design Patterns, DDD, small method lengths (side note: never heard anyone say "small classes" -- that one makes no sense to me) are all extremely helpful in the pursuit of easily maintainable/extendable/scalable code... IF they're applied correctly.

I guess the issue is that you might need to apply them incorrectly a few times to learn their benefits.

But as I see it, a general rule of thumb is: Is the thing I'm trying to apply to my code making my FUTURE life harder or easier? When I return to this codebase in six months and have completely forgotten why I made the decisions I did, is the thing I'm trying to apply now actually going to make it easier for me to get back into it?

Or: Is it OK to leave this method long, breaking that recommendation, as long as the method is well-structured, has clearly understandable variables and logic, and has comments to explain what's going on?

You just want to make your future life as easy as possible for yourself (which also makes it easier for your fellow teammates).

Rather than trying to be clever, using obscure functions and trying to reduce the number of lines of code, we should be focussing on reducing the number of lines you need to scroll up in order to understand what's going on. That's a much better goal than playing code golf with yourself/your team.

Impress me with how BASIC you've made everything (and, when applied correctly, the recommendations mentioned in this article will HELP with that).

[New section] Premature optimisation

@erni27 suggested a new section here.

That's a very good idea, actually. We can start sharing our thoughts to see what would emerge.

People overcomplicate things for the sake of optimisation all too often. There's one particular real-world case I would like to highlight.

Often people optimise things not because the code is slow, but because they think it would be slow. The root cause of such behaviour is the fear of big numbers combined with a poor grasp of low-level latency numbers.

Once a team got a task: "implement a feature for items processing". The business had some 100K+ items in the storage.
Simple solution: load those 100K+ items into the app's memory and do the job there.
Complex solution: inject a morphology plugin into the storage alongside a Lua script, so that the code executes exclusively on the storage's side. This avoids both passing the data over the network and loading all the data into the app's memory.

The justification for this was:

  1. That's too much data to transfer over the network
  2. 100K items would take a lot of the app's memory, and it would be slow to go through all of them

Both concerns were imagined, based on a poor understanding of low-level numbers.

The reality was:

  1. The storage and the app reside in the same cluster
  2. It takes 10 ns (0.00001 ms) to send 1 byte over the network
  3. An item size is ~50 bytes
  4. Sending 100,000 items over the network would take 50,000,000 ns (50 milliseconds)
  5. It is surely not an issue to load 5 MB worth of data into the app's memory
  6. The business grew at a very slow rate, those 100K items were accumulated over 8 years
  7. The feature is used by the admin users once in a while

While the initial 100,000 number may scare developers, the final 50 ms is far less scary. We won't calculate the time needed to loop through all those items, as it is a far less significant number.
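The arithmetic above can be rechecked in a few lines. This is only a back-of-envelope sketch: the constants are the figures quoted in the list (10 ns per byte, ~50 bytes per item, 100,000 items), and the function name `transfer_estimate` is made up for illustration.

```python
# Back-of-envelope estimate using only the figures quoted in the list above.
NS_PER_BYTE = 10        # quoted cost of sending 1 byte over the network
ITEM_SIZE_BYTES = 50    # quoted approximate size of one item
ITEM_COUNT = 100_000    # quoted number of items in the storage


def transfer_estimate(item_count, item_size_bytes=ITEM_SIZE_BYTES, ns_per_byte=NS_PER_BYTE):
    """Return (total payload in bytes, transfer time in milliseconds)."""
    total_bytes = item_count * item_size_bytes
    total_ns = total_bytes * ns_per_byte
    return total_bytes, total_ns / 1_000_000  # 1 ms = 1,000,000 ns


total_bytes, total_ms = transfer_estimate(ITEM_COUNT)
print(f"{total_bytes / 1_000_000:.0f} MB, {total_ms:.0f} ms")  # prints "5 MB, 50 ms"
```

Plugging in a different item count or item size shows how far the data would have to grow before the transfer time becomes worth worrying about.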

Unfortunately, this kind of basic analysis wasn't done, and the complex solution was implemented. The team faced some serious issues with this solution along the way.

Given the business growth rate and all the other factors, we had optimised for a situation that could potentially occur in ~200 years. So the increased complexity of the project would pay off in ~200 years.

The payoff of the optimised solution lies in a distant, imagined future, whereas the unnecessary cognitive load is here with us now.

Documentation

The handbook needs a good section on code documentation. Off the top of my head I could only come up with a bad and an improved example, but it seems a bit bulky to fit into a handbook.

Well-documented code should let the reader shortcut right to the conclusion, instead of having to read and interpret a chunk of code. It should describe all the inputs, outputs and design decisions taken in the piece of code, so that there is no need to read an implementation that may not matter at the moment. At the same time, it should be concise and free of unnecessary details.

Example of a Bad Code Documentation:

# Function: calculate
# Parameters: x, y
# Returns: result
# Description: This function calculates the sum of two numbers.
def calculate(x, y):
    # add the numbers
    result = x + y
    # return the result
    return result

Explanation:

  • The function name "calculate" is not descriptive.
  • The parameter names "x" and "y" are not descriptive, making it unclear what kind of values they represent.
  • The return value "result" does not indicate what it represents or its purpose.
  • The description of the function is redundant and does not provide any additional meaningful information.
  • There is no information about possible errors, edge cases, or any other relevant details.

Improved Code Documentation:

def sum_two_numbers(num1, num2):
    """
    Calculate the sum of two numbers.

    Args:
        num1 (int): The first number.
        num2 (int): The second number.

    Returns:
        int: The sum of num1 and num2.

    Raises:
        ValueError: If either num1 or num2 is not a valid integer.

    Example:
        >>> sum_two_numbers(2, 3)
        5
    """
    try:
        return int(num1) + int(num2)
    except (TypeError, ValueError) as exc:
        # int() raises TypeError for None and other non-numeric types.
        raise ValueError("Both arguments must be valid integers.") from exc

Explanation:

  • The function name "sum_two_numbers" clearly conveys the purpose of the function.
  • The parameter names "num1" and "num2" are more descriptive and indicate that they represent numbers.
  • The return value is explicitly defined as an integer and its purpose is clear.
  • The documentation includes a description, specifying what the function does.
  • The "Raises" section mentions a specific error that can occur and provides a helpful error message.
  • An example is provided to demonstrate how to use the function and what output to expect.

In the improved example, the code documentation is more informative, providing better context, details, and examples. It helps developers understand the purpose, usage, and potential errors of the function, promoting better code comprehension and usage.
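The examples above cover inputs, outputs and errors, but not the "design decisions" part. Here is a rough sketch of what that could look like, borrowing the numbers from the premature optimisation case. The function `find_stale_items`, the `updated_at` field and the 365-day default are all hypothetical, invented only for illustration.

```python
from datetime import datetime, timedelta


def find_stale_items(items, now, max_age_days=365):
    """Return the items that were last updated more than max_age_days ago.

    Design decision:
        All items are loaded into the app's memory and filtered here rather
        than inside the storage. At ~100K items of ~50 bytes each that is
        about 5 MB, which is cheap to transfer and scan, and it keeps the
        logic in one place. Revisit this if the item count grows by orders
        of magnitude.

    Args:
        items (list[dict]): Items, each with an "updated_at" datetime.
        now (datetime): The reference point in time.
        max_age_days (int): Age threshold in days.

    Returns:
        list[dict]: The stale items, in their original order.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [item for item in items if item["updated_at"] < cutoff]
```

A reader who returns to this in six months gets the "why" from the docstring without having to rediscover the trade-off from the code.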
