
You can't read code that you can't find

When working in a large codebase, searchability is just as important as readability. You’ll be reading the code to understand it, but you’ll depend on search tools like grep to even find what to read. If you can’t find the right code to read, you won’t just be confused; you’ll be left with false confidence that you found every part of a system.

Write out full names, don't construct them

Important strings and names across your codebase

Most code is searchable by default, unless you’ve broken up terms that someone may search for. As an example, someone searching for "lidar_enable" would miss get_feature_flag(sensor_name + "_enable"), which could lead to bugs during a refactor. Even though it might lead to more verbose or duplicated code, get_feature_flag("lidar_enable") is worth it for the searchability it provides.

Obviously you don’t want to get too extreme with this, or you’ll end up with tons of duplicated code, but I think some cases, like feature flag names and parameter names, are well worth the duplication.
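As a rough sketch of the trade-off, here are both styles side by side. The flag store, the "sonar" sensor, and the two helper functions are hypothetical stand-ins; only get_feature_flag and "lidar_enable" come from the example above.

```python
# Hypothetical in-memory flag store standing in for a real flag service.
FEATURE_FLAGS = {"lidar_enable": True, "sonar_enable": False}


def get_feature_flag(name: str) -> bool:
    """Look up a flag by its full name."""
    return FEATURE_FLAGS[name]


def sensor_enabled_constructed(sensor_name: str) -> bool:
    # Hard to find: a search for "lidar_enable" never matches this line.
    return get_feature_flag(sensor_name + "_enable")


def sensor_enabled_written_out(sensor_name: str) -> bool:
    # Searchable: every flag name appears in full, at the cost of repetition.
    if sensor_name == "lidar":
        return get_feature_flag("lidar_enable")
    if sensor_name == "sonar":
        return get_feature_flag("sonar_enable")
    raise ValueError(f"unknown sensor: {sensor_name}")
```

Both functions return the same values; the difference is only in what grep can see.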

Function names within a class

An even worse case is dynamically creating function names to use. It might save a few characters and feel like deduplication, but you’re going to drive someone crazy looking for all the uses of the function sync_policy().

Fig 2. Good luck finding all uses of sync_policy() in a 2k line file written like this.

Using getattr isn’t the root of the problem here; the problem is dynamically creating the name passed to getattr. The code below is just as semantically compressed, while being fully searchable and only a few characters longer.

Fig 3. This code is more searchable, although someone searching for self.sync_policy would still miss it.
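For readers without the screenshots, a minimal sketch of the two styles might look like the following. This is not the original code from the figures: the Syncer class, sync_schedule, and the dispatch methods are hypothetical, and only sync_policy is a name from this post.

```python
class Syncer:
    """Hypothetical class used only to illustrate the two lookup styles."""

    def sync_policy(self) -> str:
        return "synced policy"

    def sync_schedule(self) -> str:
        return "synced schedule"

    def sync_constructed(self, kind: str) -> str:
        # Hard to find: the text "sync_policy" never appears at this call site.
        return getattr(self, "sync_" + kind)()

    def sync_written_out(self, kind: str) -> str:
        # Searchable: each method name is spelled out in full as a string,
        # so a search for sync_policy lands here (self.sync_policy does not).
        method_names = {"policy": "sync_policy", "schedule": "sync_schedule"}
        return getattr(self, method_names[kind])()
```

Storing the bound methods themselves in the dictionary (self.sync_policy rather than "sync_policy") would also make the self.sync_policy search work, at the cost of dropping getattr altogether.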

Don't use common substrings as names

We have a concept in our codebase called sfull (short for "software full restart"). Unfortunately, searching for it returns a lot of noise, because every use of the word “successfully” in our code or documentation also matches. In retrospect, I wish we had chosen the name swfull, because it would be much easier to search for.

When picking a new name, do a quick search to see if it has good SEO in your repo!
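One way to picture that quick check is a sketch that counts how many existing lines a plain substring search would return for each candidate name. The helper function and the sample lines are made up for illustration; in practice you would just run your usual search tool over the repo.

```python
def count_substring_hits(candidate: str, lines: list[str]) -> int:
    """Count the lines a plain substring search for `candidate` would return."""
    return sum(candidate in line for line in lines)


# Made-up lines standing in for existing code and documentation.
existing_lines = [
    "Upload finished successfully",
    "Map saved successfully to disk",
    "Docking completed",
]

sfull_hits = count_substring_hits("sfull", existing_lines)    # collides with "successfully"
swfull_hits = count_substring_hits("swfull", existing_lines)  # no collisions
```

A candidate that already returns hits before you have used it anywhere is a sign that future searches for it will be noisy.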

How to search if you are stuck with a bad SEO name

You can still find sfull without matching successfully using some command line options or a regex!

grep -w sfull * : Only finds “whole word” matches. This is usually what you want, but doesn’t help you if you’re using the search bar in an IDE.

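grep -E '\bsfull\b' * : Use a regex to specifically match “[word boundary]sfull[word boundary]”. Regex “word boundaries” include the start or end of a line as well as non-word characters. Quote the pattern so the shell passes the backslashes through to grep instead of stripping them.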

Many IDEs allow you to search with regex, so you can use this there too!
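The same word-boundary pattern works in Python's re module, which makes it easy to see exactly what it does and doesn't match. The sample lines below are made up for illustration.

```python
import re

# \b matches at a transition between a word character and a non-word character.
WHOLE_WORD_SFULL = re.compile(r"\bsfull\b")

sample_lines = [
    "robot completed sfull successfully",  # standalone word: matches
    "update applied successfully",         # only inside "successfully": no match
    "sfull_count += 1",                    # "_" is a word character: no match
    "reason=sfull",                        # "=" and end of line are boundaries: matches
]

matches = [line for line in sample_lines if WHOLE_WORD_SFULL.search(line)]
```

Note that underscores count as word characters, so a whole-word search skips identifiers such as sfull_count. If you want those too, search for the identifier explicitly.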

Fig 4. Using regex search with word boundaries in VS Code (note the highlighted .* in the top right)

Akshay Nagpal gets credit for the cover photo of this post and has a great advanced tutorial on grep.

In summary

  1. Fully write out semantically important names, both constants and functions.
  2. Use names with good SEO in your codebase.

Code Quality at Cobalt Robotics

At Cobalt, we build autonomous indoor security guard robots that patrol office buildings and warehouses, looking for anything out of the ordinary for our customers. We write code that controls a 120 lb robot navigating around people (basically an indoor self-driving car), and our customers rely on us to keep their most sensitive areas secure and protected. We're committed to keeping a great engineering culture, moving fast, and NOT breaking things. To do that, we need great engineers like you!


Fig 5. A Cobalt Robot patrolling an office space