Wicked Good Development Episode 23: Demystifying tech debt

December 16, 2022 By Kadi Grigg

30 minute read time

Wicked Good Development is dedicated to the future of open source. This space is to learn about the latest in the developer community and talk shop with open source software innovators and experts in the industry.

In this episode, Kadi sits down with Sonatype’s Director of Product Management, Justin Young, and Engineering Manager, Brad Cupit, to discuss all things tech debt. What is it? Can different types be treated the same? How do you quantify it? And more importantly, how do you prioritize it?

Tune in and learn how you too can begin to understand your tech debt and tactically manage it.

 

Listen to the episode


 

Wicked Good Development is available wherever you find your podcasts. Visit our page on Spotify's anchor.fm

Show notes

Hosts

Kadi Grigg

Panelists

Justin Young, Director of Product Management, Sonatype
Brad Cupit, Engineering Manager, Sonatype

Relevant links


Transcript

Kadi Grigg (00:09):
Hi, my name's Kadi Grigg, and welcome to another episode of Wicked Good Development, where we talk shop with OSS innovators, experts in the industry, and dig into what's happening in the developer community. For today's episode, it's all about tech debt. And before we jump into it, I want to introduce our panelists we have today. First up, we have Justin Young. Justin, welcome to the show. Can you tell everybody at home who's listening a little bit about who you are and why you're excited about this topic?

Justin Young (00:34):
Hi, Kadi. Yeah, for sure. First, I'm so excited to be on this podcast. I'm almost a little nervous to be here today. I've been waiting. So, thanks for having me on. My name is Justin Young, and I'm a director of product management at Sonatype, overseeing a new product we're launching called Sonatype Lift. And technical debt is really interesting to me right now for multiple different reasons. The first being we have a new product, and when you have a new product, you're taking on a ton of technical debt, and I really want to understand what the impact in the future is. And then also, Sonatype Lift itself is a product that helps you manage quality and really helps you reduce the burden of technical debt. So, how are we able to bring that value to the market, and in particular to the developers who use the software?

Kadi Grigg (01:23):
Thanks Justin, and welcome. Brad?

Brad Cupit (01:25):
Yeah, my name is Brad Cupit. I'm an engineering manager at Sonatype. But before I was a manager, which I've been for about a year, I was a developer and tech lead. And I'm excited about this topic because I actually am excited about tech debt, and I have a passion for getting rid of it. So in general, I'm a fairly neat person, although if you looked at my desk right now, you might not agree. But I don't like it when things are confusing or unintuitive, or when you just have to memorize stuff, like 40 different things, before you can ramp up on a new project. Let's get rid of all that stuff.

Kadi Grigg (02:01):
All right, let's dive in. So, before we even get started on the fun little intricacies that go with tech debt, what is tech debt? You know, at a high level, our listeners are at different places in their experiences with tech. So, I think that's a good place to start. So Brad, do you want to just provide a high-level overview of what tech debt is?

Brad Cupit (02:26):
Yeah, the classic example of tech debt is you decide to do something that's a little faster so that you can deliver value to customers sooner, but it's got some negative consequences associated with it. Maybe it's confusing. Maybe it's not intuitive. Maybe it's going to lead to bugs in the future. Who knows? There's some negative consequence that's going to come if you don't pay off that debt. I actually have a story, a real-world piece of tech debt that's happened in my life in the last month or so. My family, we have two vehicles, and each of them has buttons near the rear-view mirror where you can open the garage door, for example. You have to program these buttons. And on our older car, it was such a pain. It took me like 45 minutes outside, no A/C on, trying to get these things programmed, and I didn't even know if it was ever going to work.

Brad Cupit (03:13):
And finally it did, and I was happy. And so we got one button for the garage door, and then another button for the gate, the community gate. And so if I had programmed those in different orders on the two cars, that would be really annoying, right? Depending on the car you're in, button number one opens the garage door, but on the other car it's button number two. I didn't do that. I set them up the same, and everything was great, and it's worked well for my family for years. But recently, my dad and my son noticed that in one of the cars there's an icon on button number two, a house icon, and they said, "Dad, it's really confusing. Why does the house icon not open the garage door?" And I just never noticed the house icon before. So now I could fix it. I could swap them around. But it's going to be another 45 minutes of me sitting out in the hot car trying to program this thing. And then on top of that, the people who drive the cars most frequently, my wife and I, we're gonna have to re-memorize the order of the buttons. But it would be more intuitive for new people that haven't driven the cars before. So, we're not going to change it. We're gonna stick with what we have, and everyone's just going to have to memorize it. But that's a real-world example of inadvertent tech debt that I introduced that bothered my son and my dad.

Justin Young (04:25):
<laugh>. Might I suggest a little sticker over the light?

Brad Cupit (04:28):
<laugh> There you go.

Justin Young (04:28):
So, I have a question about that one. Your example is very functional. So in software, do you classify technical debt as functional or nonfunctional? And where does that leave UX debt? Or capability debt? Or other types of debt that you have to manage as a software product?

Brad Cupit (04:46):
Oh, you absolutely can have things that are not user-friendly or not intuitive, where your users maybe complain to you, or maybe they just figure it out. "This is hard to do. It takes me 15 clicks, but it should only take me two clicks," for example. Or, "The label on that button, I never remember what it means, and I only press it once a month, and every single time I have to go look up in the documentation what it means." So, absolutely you can have things that are a form of debt in the UI; we would call that UX debt as opposed to tech debt. But they're very, very similar ideas. So when you say functional, can you elaborate on what you mean by functional and what the opposite of functional is, Justin?

Justin Young (05:28):
Yeah, for sure. So, when I think about building software, there are functional requirements, which are how you utilize the software, how you derive value from the software, and how, at the end of the day, software is able to fulfill its function. So, Microsoft Word allows you to type words and gives you spell check and stuff like that. And non-functional requirements are things that aren't unique to Microsoft Word. Is it running all the time? If Microsoft Word isn't running, that's not a functional requirement being missed, but it is something that's non-functional about it. Or is it easy to build more software on top of Microsoft Word? Is it really easy to add the grammar check on top of the spell check? And so I classify those types of things because, from a product management perspective, we're always hyper-focused on the functional. It's like, how do we add things to this widget to deliver more value to the market, at the expense of the non-functional? Because it's hard for a non-technical person to understand what makes that product easy to extend or what makes that product easy to maintain.

Brad Cupit (06:35):
This is a great question. In fact, if you ask four different developers to define tech debt and how it relates to bugs, for example, they'll all think they've got exactly the same answer as everyone else, but you'll get four different answers. And when you really dive into it with engineers, they don't realize, "Hold on, you have a different definition of tech debt than I do?" So, if we're not meeting the functional requirements, the most the user can tell is, "Hey, this thing doesn't work right." So we would call that a bug. And then on the non-functional side, there's just the table stakes. Well, we expect that any new features you're adding now or today are not going to slow down how fast we can get new features tomorrow. It's just the expectation that customers have, or that it's not going to come with any negative consequences. And so those table stakes or those non-functional things that impact development, we might consider those to be tech debt, and they might eventually even have an impact on the customer down the road if it gets bad enough.

Justin Young (07:36):
You mentioned a bug. When does tech debt become a bug, and when is a bug tech debt? Some tech debt could lead to bugs, but are they the same thing? When I was an individual contributor as a product manager, I would classify it all as just work that had to be done outside of my purview, because it's hard for me to even notice that delineation line.

Brad Cupit (07:58):
Yeah, that's another example where you talk to engineers, and they will give you different views. But everyone thinks their own view is the view that everyone else has. So, it is actually tough to know what the difference between tech debt and a bug is. So, let's say you're running in a cloud provider somewhere like AWS or Google Cloud or Azure, and you are starting to run out of IP addresses. So, at some point you're gonna completely run out, and you won't be able to run new stuff---that sounds like tech debt. But what happens when you're absolutely completely out of IP addresses? Is that tech debt or a bug? And some people would say, "Oh it's definitely a bug because it's not intended." Other people would say, "Well it shouldn't change from tech debt to bug based on something external changing."

Brad Cupit (08:44):
And so it really can be very difficult to classify something whether it's a tech debt or bug. We've got a couple of guidelines. For example, if a tester or human or user or anyone can discover it, it is almost guaranteed to be a bug in that case. If it's working now, but you're worried it'll break in the future, it's likely tech debt. If it's broken now, and it has to be fixed right now, it might be a bug, not a guarantee, but it might be. If it's a problem that we can live with now, it might be tech debt. And if everything's working fine, but engineers are complaining about it, it's likely tech debt. If the code is difficult to update, or to add new features to, or to understand, it's likely tech debt. And if the problem stems from you trying to complete something sooner, then there's a good chance it's tech debt. And so those are some guidelines that can help you decide whether something's tech debt or a bug, but it's much harder than it sounds to identify.

Justin Young (09:38):
If it can make us feel any better, I've been talking to Forrester analysts about the same thing, and they were fully transparent that they couldn't come to a definition of tech debt. And even within the analyst community, everyone has a different definition. Exactly what is this thing?

Brad Cupit (09:54):
I read a really good article by Martin Fowler where he divides it up into four different quadrants: reckless versus prudent, and deliberate versus inadvertent. So you can make a quadrant out of those. The definition that everyone agrees on, like we said at the beginning, is prudent and deliberate. That's where we specifically said, "We know what we're doing. We need to get this feature to market sooner. Maybe we need to beat the competition, or we need to generate revenue, or whatever. And so we're going to cut a corner. We're going to take a shortcut." But shortcuts aren't the way, and the reason they're not the way is that there's some disadvantage with them. And so you say, "All right, well maybe we'll pay that down." There's a consequence.

Brad Cupit (10:38):
But then if you look at the other parts of the quadrant, you can take reckless and deliberate where you're just like, "Hey, we're not even gonna spend the time looking at alternatives. We're just going to jump right into this thing, and we're going to use it. We're going to use this tool or build this feature, being kind of reckless. And it's going to generate tech debt as well." And then you've got inadvertent and reckless where you don't even realize what you're doing is wrong. You're ignorant to the fact that you may even be ignorant. And that could generate problems in the future. And sometimes that might even just be referred to as just bad engineering. And then the last area of the quadrant is inadvertent and prudent, and that's almost an oxymoron. But you still might intentionally do that. Let's say you're building a proof of concept, and you have no idea whether the market even wants this product. And so you don't want to spend a ton of time paying down tech debt early on, because a year from now you might not even be selling that thing.

Kadi Grigg (11:31):
How do you begin to even prioritize all of those different types that you just mentioned? Because some of it, like you said, it could be reckless, and you don't even know you're self-inflicting harm. So, as you're scaling, and you're pushing out more features, like you said, you're obviously going to incur some tech debt as you go. But there's also that portion that you have no idea about. So, I'm just trying to think as someone who would be looking to manage all of that. How do you even begin to wrap your arms around what your debt looks like, and then how do you begin to prioritize it?

Brad Cupit (12:00):
The great thing about tech debt is your developers will let you know. They'll complain about it, and there's usually a reason why they're complaining about it. Maybe it's confusing, like that garage door opener icon. Maybe it's not intuitive. Maybe it bothers them because it causes them to waste time, or it's something they have to memorize, and it's okay if you only have to memorize one or two things, but if they have to memorize 40 or 50 things, that's hard to keep all in their heads. Maybe it's a manual process, and they know that if they had the time to devote to it, they could automate it, make it consistent every time, and save developer time. So, there's some disadvantage and some reason why the developers are complaining about it.

Brad Cupit (12:43):
If it's inadvertent and reckless, you don't even know that you have it---well, maybe it's not a problem. At some point, someone will realize, "Oh, there's a better way," or, "We know better now than we did back then." And then they'll start complaining about it. And so you can track it. In fact, that's what our team has found. We have a graph that we could pull up anytime we want that'll show us the number of tech debt tickets we filed versus the number we fixed. And we can just see if the gap is widening. And if it is, that means we're not spending enough time paying down our tech debt. And if the gap is closing, then that's great. Eventually, we could one day potentially get to zero. I mean, nobody really expects that we'll ever get to zero in tech debt. But that helps the engineers feel a lot better. Okay, we're at least closing that gap. We're paying down some of this debt. It's an emotional reaction. We feel better about the team and the culture because we're paying down some of this stuff, not just going go, go, go, constantly building more debt that we're never going to fix.
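To make the filed-versus-fixed idea concrete, here is a minimal sketch in Java. The per-sprint counts are hand-entered and hypothetical (in practice they would come from your issue tracker); it simply tracks whether the cumulative gap between filed and fixed tech debt tickets is widening or closing.

```java
import java.util.List;

public class TechDebtGap {
    // Hypothetical per-sprint counts; real numbers would come from your issue tracker.
    record SprintCounts(String sprint, int filed, int fixed) {}

    public static void main(String[] args) {
        List<SprintCounts> sprints = List.of(
                new SprintCounts("Sprint 1", 8, 3),
                new SprintCounts("Sprint 2", 5, 6),
                new SprintCounts("Sprint 3", 7, 4));

        int openDebt = 0;
        for (SprintCounts s : sprints) {
            openDebt += s.filed() - s.fixed();
            // If openDebt keeps rising, the gap is widening and the team is
            // under-investing in paying down debt; if it falls, the gap is closing.
            System.out.printf("%s: filed=%d fixed=%d cumulative open=%d%n",
                    s.sprint(), s.filed(), s.fixed(), openDebt);
        }
    }
}
```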

Justin Young (13:40):
You've mentioned a couple times disadvantages of tech debt. And so we're kind of circling around a definition of exactly what is this thing. And maybe we'll never get it. But, what are the impacts of incurring tech debt as an engineering organization, and then also as an entire organization as someone who builds software and is trying to go out and innovate in the market?

Brad Cupit (14:01):
I've mentioned a couple times---confusing, not intuitive---that it's frustrating as an engineer when something bites you more than once. You're like, "Oh man, that bit me a year ago. And I forgot." Well, maybe it's not your fault. Maybe the thing is not intuitive. Or designed poorly---confusing code where every time you see it and need to add a feature to it, you've got to spend an hour just figuring it out, getting it back in your head. "Okay, finally, I understand what this does." It can take new hires longer to ramp up because there's lots of areas that they have to learn about or memorize as they join the team. Actually, in the worst case, it can lead to engineers wanting to quit. I did, as a manager, have someone come to me, and they were mostly complaining about the quality of a particular code base they were working on, and they weren't sure how much longer they were gonna be able to take it. And they weren't going to quit---they might have switched to another team. But at that point, you've let it get too far. And then you can also have time bombs, things that are working right now but are gonna blow up and completely stop development in the future. Ideally you'd like to fix that before it stops, before it prevents development. But that requires you to be on top of it, so that you at least know around the time you expect the time bomb to go off.

Justin Young (15:15):
That definitely jibes with my experience. In the best case, I'll typically see reduced velocity for teams who've taken on a lot of tech debt. We've cut a lot of corners because we wanted to optimize time-to-market, and all of a sudden we can't deliver to the extent that we want to. And then in the worst case, unreliable systems---that time bomb. Not only am I slowing down velocity because I have to bring in a bunch of work to recover from shortcuts made in the past, but there's also an impact on customers. I mentioned the reliability of Word. Although Word is fantastic software, anything that goes down because of tech debt they've incurred is going to impact my ability to build the brand and make a lovable experience for the user.

Brad Cupit (16:00):
Yeah, absolutely. And that's the most frequent way this impacts the customer---delays in deliveries. And it might not be massive. It might be some piece of tech debt that wastes 30 minutes every two weeks. Is the customer going to notice if the feature gets delivered to them 30 minutes earlier or 30 minutes later? Maybe not, right? But eventually it'll kind of grate on the team. "Oh yeah, I've got to do that 30-minute process." Or maybe it's an hour, and maybe they start up a rotation to spread it around. One person does the build this week and someone else does it next week, and if it's a manual process, it can grate on your engineers, and it can also impact your ability to deliver quickly. I'm assuming we're gonna talk about DORA metrics. The DORA metrics are deployment frequency, lead time for changes, mean time to recover, and change failure rate. And if you're doing well on these metrics, you have a mature team. And if you don't have a high deployment frequency because you have a very manual process that takes a long time to release, then you can't react as quickly to urgent bug reports being filed by your customers. So you're just a little slower because you've built up that tech debt, and customers just end up living with it.
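As a rough illustration of how three of the four DORA metrics fall out of deployment records, here is a minimal Java sketch. The deployment data and its shape are hypothetical (not any particular tool's data model), and mean time to recover is left out because it needs incident data rather than deployments.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.List;

public class DoraSketch {
    // Hypothetical record: when the change was committed, when it was deployed, and whether it failed.
    record Deployment(LocalDateTime committed, LocalDateTime deployed, boolean failed) {}

    public static void main(String[] args) {
        List<Deployment> deployments = List.of(
                new Deployment(LocalDateTime.of(2022, 12, 1, 9, 0), LocalDateTime.of(2022, 12, 1, 15, 0), false),
                new Deployment(LocalDateTime.of(2022, 12, 5, 10, 0), LocalDateTime.of(2022, 12, 6, 11, 0), true),
                new Deployment(LocalDateTime.of(2022, 12, 8, 8, 0), LocalDateTime.of(2022, 12, 8, 13, 0), false));

        long periodDays = 14; // length of the window being measured

        // Deployment frequency: deployments per day over the window.
        double deploymentFrequency = deployments.size() / (double) periodDays;

        // Lead time for changes: average hours from commit to deploy.
        double avgLeadTimeHours = deployments.stream()
                .mapToLong(d -> Duration.between(d.committed(), d.deployed()).toHours())
                .average().orElse(0);

        // Change failure rate: share of deployments that failed.
        double changeFailureRate = deployments.stream().filter(Deployment::failed).count()
                / (double) deployments.size();

        System.out.printf("Deployments/day: %.2f, lead time (h): %.1f, change failure rate: %.0f%%%n",
                deploymentFrequency, avgLeadTimeHours, changeFailureRate * 100);
    }
}
```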

Justin Young (17:19):
So, it's not just the user that gets affected by slowdowns and features. It's really us as a company. So, if I am Microsoft Word, and it takes 30 minutes every time to add another feature more so than my competition, then you have Google Docs come in and be able to add features at a clip 30 minutes faster. And that allows competitors to come and start to take market share and build better products. So that's a lagging indicator. I definitely don't want to get to the point where I have a product that starts to get out-competed in the market by someone who's innovating faster. How do the DORA---are those my leading indicators, the DORA metrics? Is that what we should be keeping an eye on to understand how much technical debt has been brought upon us?

Brad Cupit (18:04):
A little bit. There might be a correlation. Maybe not a causation. DORA metrics are a good indicator that you have a mature process. And if you have a mature process, you've probably put the time into it. You've probably tried automating it, and therefore you're less likely to have tech debt. Although you still could. Maybe that whole time you spent automating it, you went through it as fast as possible. And it's sloppy, and it breaks every third time you run it. You could still have things automated that have incurred tech debt. But that's an indicator, like you mentioned, a leading indicator. It can represent a mature process. And the more mature your company is, hopefully the less tech debt you have. Other indicators: like I said, the developers will let you know, because that's one of the easiest ways to know whether something is tech debt or not.

Brad Cupit (18:53):
If it's not a bug, it's not in the user's face, and the users aren't complaining about it but your developers are---there's a really, really high likelihood that's gonna be tech debt. That's fixable. And you can graph that. You can see how much tech debt you have filed versus how much you're fixing. Then you can also look at the intensity of the tech debt, the priority of it. Something that you know is going to break and cause bugs, or something that's constantly dragging the team down because it's slowing them down, or wasting time, or you're onboarding a bunch of new people and this is just a point of confusion that everyone has to memorize. Those things you might choose to prioritize, but it's really something you'd handle on a case-by-case basis.

Justin Young (19:33):
I was hoping you were going to say I could graph developer unrest over time and have an absolute metric to understand the technical debt. So, you talked about backlogs of tech debt. What kind of tools are out there to first get ahold of the problem---step one, understand what the problem is in front of us. And then I'd love to lead into what types of tools help with tech debt.

Kadi Grigg (19:56):
Can I piggyback off that too? Because I'm wondering, in that same light, are there certain criteria you have for when you know you're putting something bad into your code base, but you're like, "Okay, I can get back to it later"? Are there criteria for what's acceptable to go into the backlog, or what's not acceptable to go into the backlog?

Brad Cupit (20:16):
That's a great question, Kadi. Really, everyone's got a different tolerance for it. It's kind of like what you consider to be a clean room. I think almost everyone would agree that some empty pizza boxes in the corner of a room with a dead rat on top of it that's been there for three weeks---everyone's gonna agree that's a messy room. But what if you have a neat room? Everything's organized except you have several pieces of paper on a desk that are a little bit messy. Some people would say, "Oh I can't handle this. I've got to fix that now." And other people will say, "This room is beautiful. It's very neat." And so it's the same thing with engineers and tech debt. You have different tolerance levels, and that's a good indicator: if almost the whole team, or the whole team, is complaining about something, you probably should get that fixed, because you've got varying degrees of tolerance on the team, and if they're all on the same page, then it's a problem. And Justin, to your question earlier, or your point---

Justin Young (21:09):
Well, let me interrupt there because I'd like to extend that a bit. I think the dimension that was missed was a conversation with your product management counterpart and maybe design, because those three groups are going to figure out what the time-to-market requirements are, what UX debt they want to take on, what functional debt they want to take on, and together prioritize. So, engineering definitely has that input and is the only group that truly understands how dirty the room is. But using other functions to understand, "Well, what does this dirty room mean to our customer? What does it mean to the market? What does it mean to our ability to go out and sell?" is gonna be important too. So, I mentioned early stage products. Early stage products tend to take on a ton of technical debt because you haven't fully found product-market fit. You're out there trying to figure out what is the essence of what I want to bring to market, and that's the most important thing in the world. But as you grow the product and become enterprise-capable, enterprise-ready, well that's when technical debt needs to get pared back, because things like reliability are so important to your customer. You need to be able to continue to service them. And by then you've already found product-market fit, so you can start to balance it more toward paying down technical debt than just the innovative capabilities of the product.

Brad Cupit (22:21):
Yeah, that's a great point, Justin. You're right. Engineers can't just decide we're fixing this, and that's it. It is a team decision. Or you have a product manager at the top that's dictating priorities. One of the things that's worked really well for our team is we're given a tech debt and bug budget per sprint. So we know that 25%, for example, of the story points we're going to do per sprint can be devoted to fixing tech debt and bugs. And you can measure that based on what we said earlier---the number of tech debt tickets filed versus fixed. You can see, is that too high? For some teams, maybe 10% is enough. Or is that too low? And then that lets the product manager not have to get into the nitty gritty of every piece of tech debt: Is this important? Do we need to work on this right now? You can give the team a little bit of latitude. Something really cool Sonatype does also is one day every other week, we have something we call Improvement Day, where developers are allowed to work on whatever they want. They can learn something new. They can fix a nasty piece of tech debt that's been bothering them for weeks. Whatever they want. And so that also is another lever that teams can pull to feel like, "I'm able to conquer some of this stuff, especially the stuff that's really annoying and grating."
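The budget arithmetic itself is simple; a minimal sketch with made-up numbers (not Sonatype's actual figures) just carves a fixed share of the sprint's story points out for debt and bugs:

```java
public class SprintBudget {
    public static void main(String[] args) {
        int sprintCapacityPoints = 40;  // total story points the team expects to finish
        double debtAndBugShare = 0.25;  // the budget discussed above; tune per team

        int debtAndBugPoints = (int) Math.round(sprintCapacityPoints * debtAndBugShare);
        int featurePoints = sprintCapacityPoints - debtAndBugPoints;

        System.out.printf("Of %d points: %d for tech debt and bugs, %d for feature work%n",
                sprintCapacityPoints, debtAndBugPoints, featurePoints);
    }
}
```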

Justin Young (23:35):
That percent allocation is so useful to me on the product management side to understand whether we're right-sizing our investment for the maturity of the tool, what we expected of the software, and what we expect is required to build something. And so you mentioned 25%. How do you derive that number? And I'll go back to that question: is it tickets in a backlog? And then how do you understand who gets to classify and create those tickets? Different teams have different behaviors. You can expect that some teams that are hyper-focused on tech debt, just because of their history, may end up adding way more items to the backlog. They may want to sweep every corner of the room, to use your analogy, where some teams may be very used to a room like mine. I'm glad we're on a podcast, because I do have boxes and a rat in the background.

Brad Cupit (24:27):
Yeah. On our team, any engineer can file a tech debt ticket, at any point in time, that they want to. It actually is kind of cathartic to do that. You've got something that's bothering you, and you don't have the time or the resources to fix it. But just writing it out and explaining, "Hey, we want to fix this. Here's why." It actually does make you feel a little bit better, and you've at least made a little bit of progress. And so any engineer can file a tech debt ticket whenever they want to. Then we can measure that. And we can prioritize that. And we can actually have a list of prioritized tech debt tickets that we want to go through. Maybe this tech debt ticket impacts another team, so it's higher on the list. Maybe this one impacts new hires, but we're not scheduled to hire anyone new for six months, so we'll put it lower on the list, for example.

Brad Cupit (25:12):
And so then we just work on them in priority order, and we fill up our budget. Sometimes we come in under that budget; we have a lot of normal story work that we need to get done, things we need to finish. And then other times we might do a little bit more. And how do you know that percentage? We just took a guess, and we started going with it, and we've started to pay some things down that have been sitting around for a long time. So, that's a good indicator that we're giving it enough of a budget, enough of our percentage of time.

Kadi Grigg (25:43):
Is there any tooling that can help with this? Like to put metrics on it or actually quantify what your tech debt looks like?

Brad Cupit (25:51):
That's a great question.

Kadi Grigg (25:52):
I'm just trying to think. It sounds like a super manual process, which for me, I'd be losing my mind. So, I'm just wondering, there's got to be some tooling. We live in the world of automation now, so is that actually something that's possible?

Brad Cupit (26:03):
A little bit. Yeah, there are some tools that can help, for sure. It's just like it would be hard for a machine to come into your room and know whether it's clean or messy. You can teach it certain things, like recognizing a mess on the table, and then machines can do something with that. You can have static analysis. There are certain metrics like McCabe's cyclomatic complexity, code coverage, branch coverage, the length of a method. Linters can help too. There are several tools that can help. Most of that, though, is kind of superficial tech debt. And the big stuff that really irks people most of the time, not all the time but most of the time, is just really hard for a machine to identify. A machine can't tell you whether you've got something that's architected in a way that just doesn't match the real world.

Brad Cupit (26:50):
And as a result it's confusing, because people come in with certain expectations, and that's not how the code's written. Machines really can't help you with that. So, they can't find everything, but they can find some things, and they can help. And teams absolutely should use those, especially if they use them in a not-so-black-and-white way. Sometimes they'll find something where it's, "The machine told me this. It gave me a warning. But I know that this is okay, and I'd like to be able to dismiss that." That's an appropriate way to use some of these tools. Or, "Hey, our code coverage is really low in this particular area. But I know that it's okay because it's just getters and setters," for example, if something doesn't need to be tested. So, as long as you have some flexibility in how you interpret those metrics and don't just completely fail the build with, "Stop what you're doing. You have to fix every single little thing," then they can be additions to your tool belt as a developer.
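For a flavor of the "superficial" findings Brad mentions, here is a toy Java example (not taken from any real code base) of the kind of deep nesting a linter or complexity check tends to flag, and the small local fix. The business rule and names are invented purely for illustration.

```java
public class DiscountPolicy {
    // The kind of method a nesting or complexity check flags: branches pile up inside branches.
    static double discountBefore(Customer c) {
        double discount = 0;
        if (c != null) {
            if (c.isActive()) {
                if (c.years() > 5) {
                    discount = 0.15;
                } else {
                    discount = 0.05;
                }
            }
        }
        return discount;
    }

    // Same behavior rewritten with guard clauses: fewer nesting levels, shorter,
    // and every path through the method is visible at a glance.
    static double discountAfter(Customer c) {
        if (c == null || !c.isActive()) return 0;
        return c.years() > 5 ? 0.15 : 0.05;
    }

    // Hypothetical customer data used only to make the example compile.
    record Customer(boolean isActive, int years) {}
}
```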

Justin Young (27:43):
So, if I were to restate what you said, there's technical debt that requires domain-specific knowledge of how the software works. Then there's technical debt that's more table stakes, that tools can analyze and provide feedback on. Paying down that less complex technical debt, as you said---does that allow you to recover or get benefits similar to the more challenging stuff? Or how would you compare the benefit of solving that kind of stuff and making sure that you don't have too much cyclomatic complexity in a piece of software, because in the future that's going to bite you?

Brad Cupit (28:22):
It kind of depends. So, I've heard stories. I remember there was a video game. It was a soccer video game, and the story that I'd heard was that they had this huge, huge method that was really long called "Do best kick." And people were really scared to touch it, because you just had such an easy chance of breaking something else. But it was critical and very important and had to be edited sometimes. So, in that case, a tool can help you and say, "Hey, that method is really long." Or maybe a tool could say, "Hey, 90% of the commits to source control touch this file. So, this file is a hotspot, and it's going to generate merge conflicts. So you may want to..." So, they can absolutely help with things. And even though I said what a tool finds is kind of superficial, you can't say it doesn't matter; there are things the tools can find and identify that are problematic and will cause issues in the future. But it's important to know that they just can't find everything.
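The hotspot idea can be approximated with nothing more than git history. Here is a minimal Java sketch, assuming `git` is on the PATH and the program is run from inside the repository; it is not any particular product's hotspot feature, just a count of how often each file shows up in commits.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class GitHotspots {
    public static void main(String[] args) throws Exception {
        // List only the file names touched by each commit; the empty pretty format
        // suppresses commit headers so the output is just paths and blank lines.
        Process p = new ProcessBuilder("git", "log", "--name-only", "--pretty=format:")
                .redirectErrorStream(true)
                .start();

        Map<String, Integer> touches = new HashMap<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (!line.isBlank()) {
                    touches.merge(line.trim(), 1, Integer::sum);
                }
            }
        }
        p.waitFor();

        // Files touched by the most commits are the hotspots most likely to generate
        // merge conflicts and most worth simplifying or splitting up.
        touches.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```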

Brad Cupit (29:22):
And some of the really big things are only identifiable by a person, by someone who's in there in the weeds and can see, even though this thing, this code passes all the metrics, it's just really written in a confusing way. The names of variables were chosen really poorly, and machines, they can't tell you about that. So you at least have to have the human side too.

Justin Young (29:50):
Linters give you the ability to understand a lot of easy fixes. And some of them are cosmetic, like variable names, but some end up having a significant impact on your ability to build reliable and maintainable systems. So, I was wondering how something like an NPE weighs into your understanding of technical debt. You have a null pointer exception that teams moving fast and loose may miss. Do you classify that as technical debt, and does it have the same impacts? And furthermore, is it just as easy to fix as something cosmetic like a variable name?

Brad Cupit (30:26):
Yeah, so say you're looking at the logs of a system that's running, and you see stack traces, you see errors. Sometimes those errors can go on for pages and pages and pages, and they can obscure the real information that you want. Or it's an indicator of an actual problem that, for whatever reason, the code is ignoring. And so sometimes that might even be considered a bug, like the software is not functioning as expected. There are other things that static analysis or other tools can help you identify as problematic. For example, let's say you're using a really old open source component. Maybe it's eight years old. Maybe it's been replaced by newer versions that have fewer vulnerabilities, or are more optimized, or just offer a better way of working with the tool. And if you get too far behind on these packages, the older they are, usually the harder they are to upgrade.

Brad Cupit (31:18):
So, it's really nice to be able to keep up with them, to have a tool that can tell you, "Hey, this thing's getting kind of old. It's probably time for you to upgrade." And the worst-case scenario is, you look at shops that are using languages that are not in development anymore, languages that are extremely old, so old that they're not taught in universities anymore, and you have a hard time hiring new developers to use that language. If you were to let things get decades out of date, then you can face some serious consequences. So the sooner you upgrade the better, and the easier, I should say. And you don't want to just never upgrade, because that has consequences too.

Justin Young (31:58):
So, there's stuff that you want to address during development, and I'm assuming you're making triage decisions on a minute-by-minute basis as to what technical debt you want to fix and what you want to pass on. And there are signals about technical debt that get provided to a developer through the IDE, through linters, or, like you said, through static analysis tools. What types of findings do you believe just get fixed without a conversation? And what types of findings are fixed only when there is a conversation with product that says, "Hey, by the way, I know you want this out by the end of this sprint, but we're going to have to bump it into the next sprint because of the technical debt we would incur if we forced it into this sprint"?

Brad Cupit (32:48):
Yeah, great question. So, usually it's the scope that's involved. There are times when the editor that you're working in can tell you, "Hey, I noticed this problem on this line," and you can just fix it right away. It takes 30 seconds. It would take longer to have a conversation with other people about it than it would to just fix it. And so you just go ahead and fix it. Versus, "Hey, we have this tool that is business critical, and it is not meeting our needs, and it took us six weeks to add the last feature, and it's going to take us another six weeks to add the next feature. We have to re-architect it, and if we put that time in, then we'll be able to add features much more easily and much faster." And obviously, at that kind of time scale, it is going to impact customer deliverables, so you do have to talk to your product manager and say, "Hey, where can we prioritize this?"

Brad Cupit (33:35):
It's kind of like staying in a rental unit where you're paying rent, and that money is going to the owner, versus moving into a house or condo or something where now you're paying it off and getting some ownership out of it. Do you want to keep paying rent for a long time, or do you want to start to get some value out of it, even though it might cost more upfront? So, it's the same thing when it comes to re-architecting something. And that's a big thing to do, so you have to have that conversation.

Justin Young (34:09):
I'm interested in kind of the small things, the things that don't change scope. Can you classify them? Is there a nature of those types of findings, and what value do they have? Do they impact your deployment frequency? Do they impact your change failure rate?

Brad Cupit (34:27):
Sometimes they don't. Sometimes you're in there, you need to change some code, and you notice, "Hey, this variable name, I have a better idea for how we can name it, and that'll make this code so much easier to understand for others." And so you just do it. That's part of the normal day-to-day effort. Nobody sees it. Well, if someone's reviewing your code, they see it, and they can give you a thumbs up like, "Oh, that's so much better. Great work." So it's a quick fix, a quick win. And you've improved the code base a little bit by doing that. This happens all the time. Sometimes it's more than just one line. Sometimes people say, "All right, we've got the code. It's working, and it's fine. Everyone agrees it's fine. But now as I add this new feature, all of a sudden it crosses that imaginary line of, nope, it's not fine."

Brad Cupit (35:05):
I need to do some refactoring. And so as part of that change, you just go ahead and do it right there, and pay it off as part of implementing that feature. If you're in an extreme rush, like it's four o'clock and this thing's due at five, you might say, "I'm not going to do any refactoring right now. I'll implement it as fast as I can and get it out there. And then tomorrow I'll go do the refactor." And that's where some teams run into trouble, where they're always pushing things off until tomorrow and tomorrow, and tomorrow never comes. And eventually you've got to pay the piper. That's why the debt analogy is good, because you will have to pay it off at some point. Probably the longer you wait, the more painful it'll be. Although that also depends. It's case-by-case.

Kadi Grigg (35:46):
So, I think this is actually a good place to kind of wrap up today's conversation. I know, Justin and Brad, we've talked a lot about different types of tech debt, what the implications are, what things you can do, metrics involved. And for final thoughts, I'd like to ask both of you, we've talked about what mature organizations do, but for people who are just starting out to think about tech debt, maybe not as mature as these larger organizations, is there anything that you would recommend to them as they start to look at tech debt and how to really tackle that to make it work for them? To not be in the news about something going crazy. So Brad, do you want to go first?

Brad Cupit (36:28):
Yeah, the key takeaways---I guess you want to put the time into it, and this comes with some advantages. You have happier engineers. You have fewer of those "Oh yeah, we knew that would break one day" type of comments. You have faster ramp-up for new hires, and potentially even lower operating costs, if you knew the first pass at implementing something was really inefficient and the second pass makes it cheaper to run and costs you less money. So you want to put the time in, and you want to measure that. You want to measure that you're not putting in too much and not putting in too little. And then you want to trust your people, because one of the best sources of whether something is tech debt or not is the engineers on the team---the people who are in the weeds day-to-day. And so, if you trust them, then great, they file the tickets, and you give them the bandwidth to go and fix the ones that are the most important, and everyone's happy.

Kadi Grigg (37:20):
What about you, Justin? I know your viewpoint's going to be a little different coming from product management.

Justin Young (37:25):
Yeah, for sure. I think it's that observability---understanding how much tech debt you have, how much you're willing to take on, and how that impacts a multi-year plan. Because the tech debt you take on today is going to impact you in the future. And I love how Brad was talking about investing a certain amount, understanding the percentage you want to invest in tech debt to gain the acceleration in your team in the future. So, there may be requirements that you have today, and you've decided to take on tech debt. Well, don't build a multi-year roadmap on the velocity you're seeing today, because that velocity is incurring the technical debt that's going to have an impact on your deployment frequency and lead time for changes, and also an impact on your ability to serve your customer around mean time to recover and change failure rate.

Justin Young (38:09):
All those metrics are really important, and they become next year's problem. So, ensure that you have that understanding so that you can build a multi-year plan and start to pay down things when they're important. And don't forget the softer aspects of it that Brad was talking about. You do want to keep people happy and maintain a set of teams that are highly functioning. So, it's not all just DORA metrics. It is complex, but I think we do have techniques out there to measure and plan for technical debt.

Kadi Grigg (38:41):
Thank you both so much for taking the time to chat with me today. I really can't say thank you enough for this conversation. I know I definitely learned a lot. So, until next time guys.

Kadi Grigg (38:54):
Thanks for listening to another episode of Wicked Good Development, brought to you by Sonatype. This show was co-produced by Kadi Grigg and Omar Torres and made possible in partnership with our collaborators. Let us know what you think and leave us a review on Apple Podcasts or Spotify. If you have any questions or comments, please feel free to leave us a message. If you think this was valuable content, share this episode with your friends. Until next time.

Tags: Community, podcast, Technical Debt, DevZone, Wicked Good Development

Written by Kadi Grigg

Kadi has been passionate about the DevOps/DevSecOps community since her days of working with COBOL development and mainframe solutions. At Sonatype, she collaborates with developers and security researchers and hosts Wicked Good Development, a podcast about the future of open source. When she's not working with the developer community, she loves running, traveling, and playing with her dog Milo.