Building Software

Dec 21, 2025

Building software is harder than it looks. I've never met anyone who has all the answers about how to do it successfully, but I've spent many years working with and watching people who get it wrong and right in various ways, so I think I have some thoughts worth sharing.

It seems obvious that of course you should develop a design for how your software is going to work, but not everyone actually does it. My whole team was once called to a meeting with another team (I'm not going to tell you which one, to protect the innocent and the guilty who might since have learned better) where an engineer lectured us about a new project his team were working on. He was inordinately proud that he actually had a design - he told us all about it, in great detail, and it was clear that he was convinced that this was a great and powerful innovation that he had to educate us all about. This guy was a favourite of the director, so I refrained from stating the obvious - it's definitely good that you have decided to think about your software before writing it, but what on earth have you been doing all these years up until now?!

Thorough design always seems to be one of the first casualties of time pressure, and it is always at risk when different people work on a project with different understandings of how the design works. In practice many non-trivial codebases turn into unmaintainable messes - I've seen a great many unmaintainable messes in my time. So how do you get it right?

If you are lucky the management of your team will be on board with the need for proper design - but you aren't always lucky. There are risks at both ends of this spectrum - the manager who just wants you to get on with it and stop wasting time on navel-gazing, and the manager who is so keen on design that they have a whole menagerie of specified templates, documents and processes they demand you follow. Both of these are bad. The first is probably easier to navigate around - writing some simple prototype code is often a good way to develop your design, as you often don't really understand the problem until you have made at least one run at solving it, and it will help the manager feel like progress is happening. Just make sure you don't invest much effort in the early code - write the simplest thing that might work, and be prepared to rewrite it all from scratch if needed. The document templates guy is harder to deal with. If you are working as a consultant developing software for an external client, then design documentation might be an actual deliverable: it shows the client what you are going to build, and gives them the opportunity to verify that's what they actually want - all good important stuff. In almost any other circumstance, however, you want to aim for the simplest possible statement of the design. Just write down, in whatever format you prefer, the basic functionality, then enumerate the components of your design and how they will interact with each other. Write down some specs for APIs in whatever format works (e.g. function header declarations with explanatory text). If you do that and the managers still want something more formal, then I suppose the best you can do is copy and paste what you have to fit the prescribed boilerplate, give that to management, and continue to work with your own copy of the design.

The obvious thing is to choose a metaphor. Tell yourself a story that relates to how your software is going to operate, and then the things and the events in your story become software components. Avoid falling in love with your own metaphor - it is very common to discover your metaphor isn't quite right once you start implementing it. You need to be willing to adjust as you go: let the software requirements be in charge, not your metaphor. Be aware that not everyone understands your metaphor - make sure you write it down in enough detail so people can follow your thought process. Avoid metaphors that are too specific - anything based on any form of popular culture is a risk as not everyone knows or likes the same things. Maybe you think that carrying a ring to Mordor works to model some process in your software, but not everyone will know what you mean and some people will be actively irritated by it. Name things after what they do, rather than after some name from your story - that makes it easier for others to relate. If people have no idea what you mean, that's bad but fixable, as you can explain. It is worse if they think they know what you mean but don't, as they won't ask for an explanation.

If you find yourself creating an object called a "manager" or anything similar, have another think. Just like in real life, managers need to justify their existence - good ones facilitate the work of their teams, bad ones just accumulate power and information with little obvious benefit to anyone else. Software components can be like that too - a point of centrality is a risk because it will be prone to contention, leading to either performance limitations or race conditions or both, and it will almost certainly make your dependency graph more complex because most things will probably have to depend on it. Rather than having some object that knows all and knows how to make everything happen, instead create small bundles of specific information that can be passed around your tree of components. If your programming language is flexible in the right ways you can define interfaces, protocols or templates and then some object can actually fulfill multiple roles without any one client needing to depend on the whole definition of the object.
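To make that concrete, here is a minimal sketch in Python of what I mean by small roles and small bundles of information. Everything in it (Clock, Logger, RetryPolicy, FakeEnvironment and the two functions) is a name I made up for illustration, and other languages would spell the same idea with interfaces or templates.

```python
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    """Role: something that can report the current time in seconds."""

    def now(self) -> float: ...


class Logger(Protocol):
    """Role: something that can record a message."""

    def log(self, message: str) -> None: ...


@dataclass(frozen=True)
class RetryPolicy:
    """A small bundle of specific information, passed down explicitly."""

    max_attempts: int
    delay_seconds: float


class FakeEnvironment:
    """One object that happens to fulfil both roles (hypothetical example)."""

    def __init__(self, start: float = 0.0) -> None:
        self._time = start
        self.messages: list[str] = []

    def now(self) -> float:
        return self._time

    def log(self, message: str) -> None:
        self.messages.append(message)


def describe_retries(policy: RetryPolicy, logger: Logger) -> None:
    # Depends only on the Logger role and the policy bundle,
    # not on the whole definition of whatever object is passed in.
    for attempt in range(1, policy.max_attempts + 1):
        logger.log(f"attempt {attempt} of {policy.max_attempts}")


def timestamp(clock: Clock) -> str:
    # Depends only on the Clock role.
    return f"t={clock.now():.1f}"
```

Neither function knows that FakeEnvironment exists, so nothing ends up depending on a central know-it-all object, and the dependency graph stays flat.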

Since we've mentioned dependencies in passing, we'd better dive in a bit further. You need to be borderline obsessive about keeping your dependency graph simple. It is so easy to transform a clean design into a tangled mess - eternal vigilance is required. You need a simple story about how the dependencies work, and you need to stick to it. And then because not everyone will understand or believe you, you need to keep enforcing it and be prepared to refactor code afterwards to fit the story.

A normal practice is to identify the things - or the nouns - in your problem domain, and build a class hierarchy around that. That is fine as a starting point - but you also want to pay attention to the verbs. What do your things do? This is especially important for a process (or task, or whatever you want to call it) that involves more than one thing - where does the logic live? If you put the logic on one of the noun classes, there are a number of risks. Other developers will have a harder time finding the code, as there are multiple places it might live. The nouns will need to depend on each other, as each needs to handle some other type of noun object. The more popular nouns will become huge source files because they support so much functionality, and then it's harder to find any of the code. Better, in my opinion, is to keep your noun classes as simple as possible - give them properties (ivars, member variables, whatever you want to call them) and accessors for the properties, along with some basic state validation to ensure property values are consistent. Avoid making the nouns depend on each other (see above about obsessive dependency control). And then make a class for each verb that has the actual logic for the process in question, and that can depend on exactly the nouns it needs to work with. This also has the advantage that if the task in question takes a while to perform, the verb class is the natural place to store the state of the task progress - putting that on the nouns is a nightmare.
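Here is a rough sketch of the noun/verb split in Python, assuming a made-up order-fulfilment domain. Order, Inventory and FulfilOrder are hypothetical names chosen for the example, not a recommendation for how to model any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """Noun: properties plus basic validation. Knows nothing about Inventory."""

    order_id: str
    quantities: dict[str, int]  # item name -> quantity requested

    def __post_init__(self) -> None:
        if any(q <= 0 for q in self.quantities.values()):
            raise ValueError("order quantities must be positive")


@dataclass
class Inventory:
    """Noun: properties plus basic validation. Knows nothing about Order."""

    stock: dict[str, int] = field(default_factory=dict)

    def available(self, item: str) -> int:
        return self.stock.get(item, 0)

    def remove(self, item: str, quantity: int) -> None:
        if quantity > self.available(item):
            raise ValueError(f"not enough {item} in stock")
        self.stock[item] = self.available(item) - quantity


class FulfilOrder:
    """Verb: the only class that depends on both nouns.

    It also owns the progress of the task (which items have been picked),
    so that state does not leak onto Order or Inventory.
    """

    def __init__(self, order: Order, inventory: Inventory) -> None:
        self.order = order
        self.inventory = inventory
        self.picked: list[str] = []

    def step(self) -> bool:
        """Pick the next unpicked item; return True once the order is complete."""
        remaining = [i for i in self.order.quantities if i not in self.picked]
        if remaining:
            item = remaining[0]
            self.inventory.remove(item, self.order.quantities[item])
            self.picked.append(item)
        return len(self.picked) == len(self.order.quantities)
```

Anyone looking for the fulfilment logic has exactly one place to look, and a new verb (say, cancelling an order) becomes a new class rather than yet more methods on an already popular noun.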

You will need state machines. Any non-trivial software system has state - it's better to describe it explicitly. I've been constantly surprised at how resistant so many people are to this, but honestly it really is a good idea. If you need to ask more than one question to know the state of something, you don't understand it and probably no-one else does either, and you won't be able to avoid hard-to-fix bugs like race conditions. The ideal state machine is just a single property on one object. Permit it to be changed only by a well-defined process that validates the correctness of the state transition. Put the state transition code in one place. Sometimes people get so enamoured of state machines they want to write a generic one and then you plug in your transition logic by overriding methods or providing template instances - the drawback there is that the code then gets split across numerous places and it becomes hard to keep track of it all. The actual logic of a state machine is quite simple - a case or switch statement normally does the job - so there is little to be gained from writing a general one, just keep it specific to your use-case. In multi-threaded code the state has to be protected to ensure concurrent access works correctly - if you're really keen to get maximum performance then you could do it locklessly with compare-and-swap and similar ideas, though being absolutely sure your code is bug-free when you do so is difficult. Otherwise just serialise it, using a mutex or serial queue or similar.
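As a sketch of the kind of thing I mean, here is a deliberately specific state machine in Python, assuming a hypothetical Download object with four states that I invented for the example. The state is a single property, every change goes through one method, and a plain mutex serialises access.

```python
import threading
from enum import Enum, auto


class DownloadState(Enum):
    IDLE = auto()
    RUNNING = auto()
    PAUSED = auto()
    FINISHED = auto()


class InvalidTransition(Exception):
    """Raised when a caller asks for a state change that is not allowed."""


class Download:
    def __init__(self) -> None:
        self._state = DownloadState.IDLE
        self._lock = threading.Lock()  # serialises all access to _state

    @property
    def state(self) -> DownloadState:
        with self._lock:
            return self._state

    def transition(self, new_state: DownloadState) -> None:
        """The one place where the state is allowed to change."""
        with self._lock:
            current = self._state
            # The if/elif chain plays the part of a switch statement.
            if current is DownloadState.IDLE:
                allowed = new_state is DownloadState.RUNNING
            elif current is DownloadState.RUNNING:
                allowed = new_state in (DownloadState.PAUSED, DownloadState.FINISHED)
            elif current is DownloadState.PAUSED:
                allowed = new_state is DownloadState.RUNNING
            else:  # FINISHED is terminal
                allowed = False
            if not allowed:
                raise InvalidTransition(f"{current.name} -> {new_state.name}")
            self._state = new_state
```

There is nothing generic about it, which is the point: the whole transition table fits on one screen and a reader never has to chase overridden methods to find out what is allowed.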

Naming things is one of the two difficult problems of computer programming, along with cache consistency and off-by-one errors. It is worth paying attention to. Be specific - avoid vague words like "category", "class", "context". There will typically be multiple ways you will want to categorise things, and even if there is only one to start with you will inevitably get some new requirement later that will introduce a new one. If you must use any of these words then make sure you always give it a prefix saying what sort of category or context it is. Be aware of words that can have multiple meanings - someone new to your code may not realise which one you mean. Especially for the numerous words in English that can be either a verb or a noun, make sure it's clear which you mean, or choose a synonym. Be consistent - for example, if you have a container, decide once and for all whether your container names will be singular or plural and stick with the decision. Database table names generally work better in the singular (e.g. in the usual example of an HR database there would be an Employee table, not an Employees table). For arrays, maps and the like in memory, one might be able to argue it either way, but whichever you do, be consistent.

Constrain your interfaces. If you give your API users two ways to use your interface, some proportion of them will choose the worse way, and then by the time you realise, it will be too late to close off that option because shipped code will be using that interface. The deprecation dance is long and tedious, much better to just not go there in the first place. Make sure that the only way to use your interface is the correct way. Be as heavy-handed as you like in enforcing that. Use whatever facilities your programming language offers to ensure that code that misuses your API won't build. Once you reach the limit of your language's features, asserts or exceptions are the way to go. Be particularly careful if your API is actually a class hierarchy, and API users are invited to provide their own subclasses. By default this opens you right up to users getting it wrong, or just not understanding what they are supposed to override. Avoiding subclassing altogether is one good solution, but if you think the convenience outweighs the risks then tightly constrain which classes can be subclassed and make sure those classes are providing additional specificity, not basic functionality.
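Here is a small sketch of heavy-handed enforcement in Python, assuming a made-up Exporter API with a Compression option (both names are hypothetical). Python has no build step, so the "won't build" part only applies if you run a static type checker, which should flag both the wrong argument type and a subclass of a class marked final. The exceptions are the fallback for everyone else.

```python
from enum import Enum
from typing import final


class Compression(Enum):
    """The complete set of supported options. No free-form strings."""

    NONE = "none"
    GZIP = "gzip"


@final  # a hint for static type checkers; not enforced at runtime
class Exporter:
    def __init__(self, *, compression: Compression) -> None:
        # Keyword-only, so call sites have to name the option explicitly.
        if not isinstance(compression, Compression):
            raise TypeError("compression must be a Compression member")
        self._compression = compression

    def __init_subclass__(cls, **kwargs: object) -> None:
        # Runtime backstop for @final: refuse any attempt to subclass.
        raise TypeError("Exporter is not designed to be subclassed")

    def describe(self) -> str:
        return f"export with {self._compression.value} compression"
```

With this in place `Exporter(compression=Compression.GZIP)` is the only spelling that works. Passing the string "gzip", passing the option positionally, or writing `class MyExporter(Exporter)` each raise a TypeError straight away rather than shipping and becoming something you have to support.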