This is a lightly edited update to a post originally published on 20 Aug 2018.
Do I need a technical design?
In agile software development, there is architecture (decisions that are
hard to change) and incremental design. Architecture, in this sense, is
a pretty small number of things—programming language and probably
application frameworks and data storage. Incremental design is the norm:
we add classes, endpoints, and database tables as we identify a need
for them, or remove them as they are unneeded or replaced.
But what about decisions in between these two extremes? For example, it
used to be that users all signed up for the website as individuals, and
now there is a need for some kind of organization which can manage the
users under it. Or we used to have a bunch of separate products with
their own logins, apps, and management and now there is a need to do
some or all of those things in ways which apply to all products. Or our
application used to assume that all users needed to be connected to the
internet at all times, and now we want to build in offline operation.
I won’t completely rule out handling larger changes via the usual
communication of incremental development—pair programming, discussion of
individual stories, pull request review, and the like. But it can be
hard to maintain a clear idea of the larger design that way, and I have
usually been happier with a discussion which happens at a higher level
and whose goal is to set a direction into which we can fit the
smaller decisions that we will make as we go.
I’ll write more later about who should drive this process, how to
develop such a design, and what is worth writing down and communicating.
But let's first ask when we should be doing this design.
It is tempting to say that the high level design of a system must happen
before we can start breaking down the work or implementing pieces of
it. Which sounds good, and is nice when it works out, but I have yet to
see a design of this sort which does not get changed during
implementation. Reality intervenes in several ways: interactions with
existing functionality, feedback which we only get when we have an early
version to show, and complications which we didn’t notice at first.
Therefore I wouldn’t try to finalize the design before we start acting
on it. And I wouldn’t go to the other extreme—of trying to make major
changes in a fully incremental way and doing all the communication after
the fact. My preference is to start with rough ideas and conversations
about the design, and as those get refined and conversations continue,
there is a point where the general contours start falling into place.
That’s about when I start implementation. I want at least some of the
coding to be happening (even if we know we might be revising it later),
because otherwise I don’t really trust the design. In parallel, I’m
stepping up the communication (documents, meetings, etc). As things fall
into place (which may include allocating people’s time, agreeing on
technical or business decisions, and getting a clearer picture of
implementation choices), you’ll fall into the rhythm of building the
thing, because the general contours of what you are building have been
established by this point.
Who drives a technical design?
So we have a problem which is large enough that we don’t think we want
to approach it in a purely tactical way, and we’ll even assume we have
defined at least the general outlines of what we want this design to
accomplish. Who should turn this into a design detailed enough to
implement?
Before I discuss who, let me say this is an intrinsically messy process.
There are a bunch of things we want out of our design. Things to do now
or save for another day. People (in various roles) with opinions
(either because, well, people have opinions, or more nobly, because they
have a specific organizational goal they are trying to achieve). See
for example Gregor Hohpe’s The Architect Elevator.
Issues like reliability, security, accessibility, and branding. A large
design space (a distinguishing character of software being its
malleability—or at least potential for malleability). Pros and cons for
pretty much every aspect.
If that seems daunting, don’t despair. Just don’t be surprised if a
decision which was discussed at length, carefully considered, agreed to by
all, and signed off on subsequently starts to seem less settled. Or someone
who you had thought was aware of what was going on suddenly “discovers”
your design and has suggestions. Or your scope seems to keep expanding
or contracting.
The most important person in this process is the one who is refining the
design and who will be involved in implementing it. We can call them
the “responsible” person (although don’t think of the roles too
rigidly—I did say this process tends to be on the messy side, didn’t
I?). To do all these things, and have time for this design, the
responsible person needs to be able to focus on this (usually, this
means they aren’t a manager).
But that person can’t produce a good design by sitting in a room and
thinking hard (if for no other reason, because getting buy-in is a key
part of what will make this design get implemented and achieve its
goals). Therefore their main activity is going to be communication. I’ll
talk later about how to communicate and what to communicate,
but in the context of “who”, identify who should be “consulted”.
That is, who needs to be aware of the design and would have good ideas
about how to do it. Broadcasting what you are doing and inviting input
works well, but I’d also directly seek out the people who will be most
knowledgeable or important.
One rule of thumb for involving a lot of people is “accept input widely,
accept direction narrowly”. You want to hear from as many perspectives
as you can. Whether or not you take the advice, thank people and
appreciate that they took the time to engage with you. These will be the
people who help communicate the changes you are making.
Saying “accept direction narrowly” raises the question of who ultimately
will be deciding. This role is generally called the “approver”
and will often be the manager of the responsible person (the details
will depend on your organization, though). Sign-offs are a good way of
formalizing decisions already made and making sure that there is
sufficient buy-in throughout the organization. They aren’t good at
exploring different possible solutions or weighing pros and cons, so
think of formal sign-off type processes (if you have them) as a way of
ratifying what is already understood, not as a way of hashing out
agreements.
Lastly we have people who aren’t necessarily providing input but who
should be “informed”
about the design. The basic goal here is to cast as wide a net as
feasible (in accordance with “err on the side of overcommunicating”
which tends to be good advice especially in larger organizations). Think
of ways to reach a variety of audiences: different levels of detail,
different ways of presenting the work (for example, it can work to have
one document which is technical and one which is more about the business
goals and rationales—as long as they are reasonably in sync on topics
such as what is in or out of scope), or different places you can
announce what you are doing and offer to answer questions or sync up
with interested parties.
Describing the responsible, approver, consulted, and informed roles
makes it clear that communication is central to the process of making
technical decisions and being ready to put them into practice. The next
two parts of this series will be about how to communicate, and what
topics to include in that communication.
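As a small illustration of the four roles, here is a minimal sketch of a stakeholder map that flags roles nobody holds yet. It is only one way to keep track; the class name, the method names, and the people in the usage lines are all hypothetical, and a shared spreadsheet would do the same job.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Role(Enum):
    RESPONSIBLE = "responsible"
    APPROVER = "approver"
    CONSULTED = "consulted"
    INFORMED = "informed"


@dataclass
class StakeholderMap:
    """Tracks who plays which role for one design (hypothetical helper)."""

    roles: Dict[str, Role] = field(default_factory=dict)  # person -> role

    def assign(self, person: str, role: Role) -> None:
        self.roles[person] = role

    def people(self, role: Role) -> List[str]:
        """Return the people holding the given role, sorted by name."""
        return sorted(p for p, r in self.roles.items() if r is role)

    def gaps(self) -> List[str]:
        """Warn about roles that nobody holds yet.

        "Informed" is not checked, since that audience is a wide net
        rather than a fixed list.
        """
        checked = (Role.RESPONSIBLE, Role.APPROVER, Role.CONSULTED)
        return [
            f"no one assigned as {role.value}"
            for role in checked
            if not self.people(role)
        ]


# Usage with made-up names: the approver has not been identified yet.
stakeholders = StakeholderMap()
stakeholders.assign("dana", Role.RESPONSIBLE)
stakeholders.assign("lee", Role.CONSULTED)
print(stakeholders.gaps())  # ['no one assigned as approver']
```

Given how messy the process is, treat the output as a prompt for a conversation, not as a gate.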
How do I develop and promote my technical design?
In the first two parts of this series we figured out we needed some kind
of technical design, and we figured out who should be making that
happen. How does the responsible party get this thing going? Do you call
a meeting? Write something up?
Typing “useless meeting” into an internet search engine and reading the
results should be enough to give us pause about calling a meeting to
hash out our technical design. Yet in so many organizations the meeting
is the mechanism by which attention is allocated, or is otherwise
necessary. So first, what are the pitfalls? The usual risk of a meeting
turning into (too much of) an open ended discussion is exacerbated by
the large design space and many stakeholders. Another sign that meeting
discussion is a bad idea is if the wrong people are there: don’t
hesitate to say “can the three of us (less than the whole meeting) have a
break-out on this topic after the meeting?” or “would you be willing to
talk to X (who is not present) and bring the information back?” Set
your goals, such as (1) make a brief announcement about what is underway
and how people can get more details or engage further, (2) present your
design to date and solicit clarifying questions, or (3) give people an
opportunity to raise concerns to be addressed in the future. Or if you
do want a longer discussion, set the topic, keep an eye on the clock,
and don’t be afraid to steer the group back to the agenda. Also, aim for
a level of detail appropriate for the people in the meeting. Software
developers may be most interested in database schemas and code
organization; infrastructure engineers may be most interested in
reliability, security, or how your design is spread across various
machines; product may be most interested in what functionality your
design will or will not unlock; and so on.
I’ve often had good luck circulating the design in document form. People
have something to react to and can leave comments on the document
itself or in other ways. So is this a Big Design Up Front? Not exactly.
I’m aiming for something closer to a High Level Design Written As We
Need It. It is at a higher level than code. It is at a higher level than
detailed descriptions of functionality (click on button X and see the
following fields with the following error conditions). It might contain
things like database schemas or protocol specifications, although
sometimes even that can be a bit fine grained.
What is a design document for? First of all, as a communication tool.
Secondly, to clarify the thinking of the person writing it. What about
things like traceability between requirements and implementation,
justifying the need for making a change, or documenting what has been
changed? I would tend to think of those kinds of documents (how many you
need will vary depending on your situation) as separate. The design doc
is written and revised as you are thinking something through and
figuring it out. More concrete documents (including breakout into tasks,
specifying behaviors in detail, or explaining code details), have a
greater need for detail and precision and are the output of the design
process, although of course the design document can link to them as they
are created. Seeing the design document as a communication tool helps
focus the process of writing it. Imagine that it is a conference talk
and you are trying to figure out who is the audience and what they would
want to know about your design.
Expect to iterate on the design. Gather some ideas. Think about them and
boil them down to a proposed design. Talk to people one on one.
Circulate it in writing. Figure out how else to get it out there. That
will generate ideas and reactions. Figure out what to revise based on
that. Expect to repeat this process until there is a sufficient degree
of convergence on a course of action. Don’t fall into either the extreme
of spending all your time talking to people (and not getting around to
taking in what they said, researching things as needed, and making some
decisions), or the other extreme, of thinking through something and
coming up with something which makes sense to you, but which may lack
buy-in from other people or may miss important requirements.
So we are developing our design and communicating in diverse ways
(presentations, written documents, informal discussions, and yes maybe
even meetings). But what topics should we cover? The last section goes into some specifics.
What goes into a technical design?
So far we have decided we need a technical design, figured out who would
be doing it, and worked out how we’ll be sending it out and getting input.
But what is the content of that communication
(for example, what sections would we put into a written design
document)?
What to include will vary depending on your organization and the needs
of a particular design. For an early stage startup, anything relating to
scaling and operations may take a back seat to “am I building something
people want and how can I most quickly validate my hypothesis?”. For a
company in a highly regulated space, there may be a lot of requirements
specific to your field.
The same applies to an individual design. Does my design concern a
server with a high or low need to be available? Does my design concern
data which is sensitive? Does this design change anything related to
this topic? (If not there’s probably little to say on the subject). For
that reason, I’d suggest treating templates (including this article) as
guidelines, and omitting sections which don’t seem relevant. One of the
fastest ways to lose an audience is to include a bunch of material that
you aren’t very interested in (and probably didn’t do a very good job
with). And of course to prioritize everything is to prioritize nothing, a
good motto in a variety of contexts.
So, what might we include?
Goals and non-goals
These are perhaps the most important sections. If you can figure out
what your design achieves and what you are leaving for another day or
deciding is not worth doing, you are well down the path of figuring out
how to do it.
Description of the proposed solution
What changes will we make to code, data, networks, and hardware? How
does this design achieve the goals? Give enough detail that people can
see some of the implications of various choices, but try to avoid the
kinds of details which can easily be fleshed out during implementation.
Security
What data is stored and sent where? How is access controlled? If
cryptography is involved, how are keys managed and have we chosen
appropriate algorithms? Are some parts of the system isolated from
others and if so how?
Reliability
Is there redundancy? What are the consequences of network outages? If
data is stored in a primary-replica setup, how do we choose a new primary? If
data is written in multiple places, how do we reconcile the copies? Are there
rate limits or other ways of keeping a problem in one place from cascading
elsewhere?
Capacity
What is the expected load on the various systems involved? Does load
ramp up gradually or do we expect a sudden spike in traffic? What needs
to be handled manually and is there sufficient staffing to do it?
Monitoring
Do we need to report new metrics? How will we know about errors?
Data analytics
How will we measure usage of the new functionality? What kind of analysis might we want to do?
History
Has the company considered this problem before? What previous decisions
got us here? If there are documents describing previous designs, I tend
to just link to them rather than going into a lot of detail about what has
gone before.
Storage
What database(s) are involved (new or existing)? What changes in database schemas are required?
Interfaces between systems
Defining these can help clarify the design and is particularly helpful
if one of the functions of your design is to coordinate between
different teams or companies who are responsible for different pieces.
Alternatives
How else did we consider solving the problem? Why did we choose the solution we are proposing?
Open questions
This section is particularly helpful if you know certain topics are
controversial or warrant further discussion. As questions are resolved,
move items from here into the main design section or the alternatives
section.
Rollout
In what order are we building this? Are we shipping it continuously? In a
series of phases? Is it rolled out selectively to certain users?
These questions can be taken as a template for a design document, but
they also can be used to figure out who to go talk to, what to put into a
presentation, or what anticipated questions to prepare for.
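If you do want to use the list as a template, here is a minimal sketch of a script that prints a plain-text skeleton and lets you leave out sections which aren’t relevant to a particular design. The section names come from the list above; the prompts are condensed from it, and the function name, the `skip` parameter, and the example title are assumptions made for illustration.

```python
# Section names follow the article; prompts are shortened paraphrases.
SECTIONS = [
    ("Goals and non-goals", "What does this achieve? What is left for another day?"),
    ("Description of the proposed solution", "What changes, and how do they meet the goals?"),
    ("Security", "What data goes where? How is access controlled?"),
    ("Reliability", "Is there redundancy? What happens in an outage?"),
    ("Capacity", "What load do we expect? Is anything handled manually?"),
    ("Monitoring", "Do we need new metrics? How will we know about errors?"),
    ("Data analytics", "How will we measure usage?"),
    ("History", "What previous decisions got us here?"),
    ("Storage", "Which databases are involved? What schema changes?"),
    ("Interfaces between systems", "What do teams or systems expect of each other?"),
    ("Alternatives", "What else did we consider, and why this choice?"),
    ("Open questions", "What is still controversial or undecided?"),
    ("Rollout", "In what order do we build and ship this?"),
]


def render_skeleton(title, skip=()):
    """Return a plain-text design doc skeleton, omitting skipped sections."""
    known = {name for name, _ in SECTIONS}
    unknown = set(skip) - known
    if unknown:
        # Catch typos in section names instead of silently ignoring them.
        raise ValueError(f"unknown sections: {sorted(unknown)}")

    lines = [title, ""]
    for name, prompt in SECTIONS:
        if name in skip:
            continue
        lines += [name, f"  ({prompt})", ""]
    return "\n".join(lines)


# Usage: a hypothetical early-stage design where scaling can wait.
print(render_skeleton("Offline mode", skip=["Capacity", "Data analytics"]))
```

Making omission a first-class option is deliberate: the skeleton is a guideline, and a section with nothing to say is better left out than padded.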
I've talked a lot about things to do: Did you
talk to X? Did you consider Y? What if we did Z? And those are all very
helpful up to a point. But only do those things which seem necessary for
your particular organizational culture and the problem you are trying to
solve. The purpose of all these suggestions is to help you build things
and solve problems, so as you go, don’t be afraid to keep asking
yourself and others: Are people on the same page now? Is this enough
specificity to build this? Is my technical design sufficient for what I
need?