— Remote Work, Engineering Management — 6 min read
This post is the first of a series of posts I've drafted with help from ChatGPT. I will be posting about my reasons for doing this and my experience with it shortly. Stay tuned!
Recently, I had a discussion in a Slack group focused on engineering management about remote work and productivity. A group member asked about managing productivity in a remote environment, and mentioned several issues their organization had experienced, such as slow onboarding of new team members, dropped productivity, and confusing communication.
The person specifically asked for a reproducible and systematic approach that could assist with making remote work more effective. In response, I offered the following advice:
Based on my experience, productivity doesn't suffer from being remote alone, but rather from a lack of clear expectations and incentives which has a much bigger impact when workers are remote. On-site managers have access to information that remote managers do not, and probably should not.
For example, an on-site manager, especially in an open workspace, can look over their reports' shoulders and see that they are typing, with their IDE, editor, or a copy of the product open. An on-site manager can see one-on-one conversations between team members taking place and can easily visualize collaboration. The only way managers can gather this data in a remote setting is to essentially install spyware on their team members' systems, something I strongly believe is unethical and unacceptable.
As a manager, it's important to have a meaningful way to measure individual contributors' impact on the organization as a whole. This should take into account both deliverable assets as well as value from collaboration, mentorship, and facilitation.
Once you have figured out how to measure collaborative value from your contributors, align your incentives and culture around greater collaboration. Encourage your team members to be active in your team's Slack, talk through what they're doing and what they're seeing, ask questions frequently, and seek engagement from others. Collaborating and spreading institutional knowledge helps the entire organization.
In a remote environment, time spent in high-bandwidth conversations through voice, video, and screenshare tools is a valuable resource simply being left untapped by companies and workers who have "transitioned" to remote. I recommend that you encourage your team members to jump into a video chat or a Slack huddle at the first sign of trouble, even if that trouble is simply feeling less comfortable with work that is about to start. Encourage people to proactively engage with people who seem to be having trouble, to offer to pair or simply to talk things through.
Sometimes an academic-individualism mindset gets in the way: people assume they're being graded on their ability to work independently. Perfectly independent workers don't help us much, but collaborating and spreading institutional knowledge throughout the organization helps us quite a lot.
Encourage your team members to learn, use, teach, and suggest new collaboration tools. There are many great ones out there, such as Draw.io and Obsidian, but they only pay off if you can get everyone to use them.
OK, so we've decided collaboration is important, but as managers, one of our jobs is to translate what is happening out of view of stakeholders into appropriate, realistic, actionable metrics. How do we empower stakeholders to visualize the progress and impact of efforts to improve collaboration?
The first thing to mention is that there is some great prior art on measuring engineering organizations that's worth reading. Will Larson's article on the subject is my top suggestion, as he takes a systematic approach to deciding which measurements are relevant, rather than seeking a one-size-fits-all approach.
The term "bus factor", referring to "how many people could be hit by a bus before the system becomes unmaintainable", is a rough metric that's frequently spoken about, but it does a very good job of providing a starting point for talking about collaboration effectiveness. Made more formal, it involves answering a couple of questions:

- Which areas of the system matter to the business?
- For each area, how many people are confident they could maintain it on their own?

You can gather this data through self-assessment surveys of the team.
Note: Survey responses may be distorted by Impostor Syndrome and the Dunning-Kruger Effect, which describe underestimating and overestimating one's own competence, respectively.
Once you have this data, you can think about the appropriate way to interpret it.
You could focus on the bus factor for specific important pieces of the system. Ostensibly this is data you want to have because those pieces matter to larger company objectives, and these views are often the easiest to translate into stakeholder concerns. A piece of critical code that only one person knows is a quantifiable business risk, one that an actuary with no technical knowledge could estimate with high confidence: determine how much absence translates into a system failure, then determine the probability of any person being absent for that long. If you also know the potential cost of such a failure, you have enough data to calculate an ongoing risk cost, and a very good guideline for which parts of the system to prioritize involving more of the team in.
You could also view it from the perspective of the team: how much confidence does each team member have, and how many times are they the only expert on a piece of the system? Given this information, you can identify people with lots of unique institutional knowledge and give them a goal of increasing the bus factor in their specific areas of expertise, pairing them with folks who have less institutional knowledge of that part of the product. This works best on parts of the product that are frequently touched, because that provides opportunities for pairs to learn together while delivering something.
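A quick sketch of how survey results might be turned into these two views; the area names and people here are hypothetical examples, not a prescribed schema:

```python
# Sketch: derive per-area bus factor and sole experts from
# (hypothetical) survey data mapping areas to confident maintainers.

knowledge = {
    "billing":   {"alice"},
    "auth":      {"alice", "bob"},
    "reporting": {"carol"},
}

# Stakeholder view: bus factor per area of the system.
bus_factor = {area: len(people) for area, people in knowledge.items()}

# Team view: people who are the only expert somewhere.
sole_experts = {
    person
    for people in knowledge.values()
    if len(people) == 1
    for person in people
}

print(bus_factor)            # {'billing': 1, 'auth': 2, 'reporting': 1}
print(sorted(sole_experts))  # ['alice', 'carol']
```

Here "alice" and "carol" are the people you'd pair with others on billing and reporting first, since those areas have a bus factor of one.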
This is also measurable in a way that stakeholders can appreciate. Being able to visualize the amount of valuable knowledge contributors hold is a great way to understand team health, the challenges facing the organization, the untapped expertise at play, and the right investments to make the team stronger.
For a lot of engineers, especially in a remote organization, a huge portion of lost productive time comes from idle-waiting on review or feedback. If you take an example card, you could break its cycle down into stages such as:

- assignment of the card
- first commit
- last commit before review (or the point at which review was requested)
- review completed
- merge
Of particular interest here is the amount of time between assignment and merge. More specifically, you'll want to identify the percentage of that time spent between the last commit before review (or the point at which review was requested) and the review itself. If you find a lot of time is spent waiting on reviews, that can indicate a collaboration and coordination issue. If instead you find a lot of time is spent coding (or "coding", as the case may be), you'll have to investigate further as to where that coding time is going. This could have collaborative causes (e.g. fear of breaking something they don't know about, or fear of having an approach rejected and having to redo the work), but it could also have individual causes like disengagement.
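The calculation itself is simple once you have the timestamps. This sketch uses made-up example events; real data would come from your tracker and code host:

```python
from datetime import datetime

# Sketch: what share of a card's cycle time was spent waiting on
# review? The timestamps below are hypothetical example data.

events = {
    "assigned":         datetime(2023, 1, 2, 9, 0),
    "review_requested": datetime(2023, 1, 4, 17, 0),
    "review_completed": datetime(2023, 1, 6, 11, 0),
    "merged":           datetime(2023, 1, 6, 12, 0),
}

total = events["merged"] - events["assigned"]                      # 99 hours
waiting = events["review_completed"] - events["review_requested"]  # 42 hours

print(f"waiting on review: {waiting / total:.0%} of cycle time")
# waiting on review: 42% of cycle time
```

Aggregated across cards, this one ratio tells you whether to look at coordination (high review-wait share) or at the coding phase itself (low review-wait share).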
Ultimately, discussions that boil down to "we're talking about this because you're not moving fast enough" have to be handled with extreme care, as they can come off as blame-casting or disciplinary when the most productive use of that time is focusing on ways to help. I've met a lot of engineers who spend a great deal of time coding things that take me much less time to accomplish, and all it takes is showing them my tools and workflows to accelerate their output. Most of engineering is not about working harder but about working smarter: using the right tools for the job, the right way, on the right thing.
Hard work and tactical flexibility are strongly emphasized, particularly in small organizations, and that leads to a highly individualized approach whose lack of open collaboration is amplified in a remote environment. Learning to leverage collaboration and effective tool use can enable more effective work than is possible on-site. A remote team run like a crucible, where individuals succeed and fail independently, will be ineffective in the long term, but a remote team with engaged mentors ready to teach and learn from each other constantly will be stronger and more prepared to take on large bodies of complex work together.