Transcript
Thomas Betts: Hello and welcome to The InfoQ Podcast. I’m Thomas Betts, and today I’m talking with Suhail Patel. Suhail is a principal engineer at Monzo focused on building their core platform.
His role involves building and maintaining Monzo’s infrastructure, which spans nearly 2,000 microservices. He recently spoke at QCon London about navigating personal growth as an individual contributor at a rapidly growing company.
In today’s conversation, we’ll dig into how to stay resilient during those high growth transitions and what that means for the sociotechnical systems we build and depend on. So Suhail, welcome back to The InfoQ Podcast.
Suhail Patel: Thank you. It is an absolute delight to be interviewed once again. I’m really, really excited for this topic.
Yes, it is something that I love talking about with other engineers, engineers from different organizations, and yes, hopefully we get a bit of feedback from people who are listening to this podcast on how they have it within their organizations. So yes, thank you once again.
Monzo’s Architecture Evolution [01:29]
Thomas Betts: Yes, definitely. So let’s start with the background. I think if people know one thing about Monzo’s architecture, it’s that diagram that shows thousands of microservices and how they connect and they create the systems within Monzo. But it wasn’t always like that, right? You don’t just start day one and have a thousand microservices, nor did you have a thousand employees. So in your time there, how have things changed?
Suhail Patel: Yes, absolutely. It definitely does not start like that from day one. I don’t know if there is a technical authority on how you build a bank. Many, many banks have had the mainframe architecture for multiple decades and it seems to be working pretty well for them. Once you look at revenue and profitability numbers, maybe the engineers might say something different, but they seem to be working pretty well.
For us, we’ve had a bit of fortune in that we have made a few bets that have really, really paid off. We invested in writing our microservices in Go relatively early on, and that has been a big bet that has paid off: the Go programming language has completely exploded. We also invested in things like Kubernetes. Kubernetes has very much exploded and we’ve been on it since 1.0. And yes, indeed, we did not start with thousands of microservices.
For us, we really wanted to have composability and because our ambition was to grow our organization, we wanted to make sure that our units of domain context were appropriately separated, and I think that has really, really paid off in the present and also for the future because what we’re able to do is migrate these microservices from team to team as our organization has grown and we can have different concerns being owned by different areas that are autonomously responsible for deployment, health, scaling, and all the other attributes within those services, whilst also maintaining that shared core architecture.
Thomas Betts: Well, that’s something I hadn’t really thought about before. The idea of microservices and DevOps is often you build it, you run it, but when you have that many things, it’s hard to constantly be adding new services and still having that baggage you have to support forever. You mentioned being able to hand those services off to other teams and that was built in from the get-go as part of the design that these would be moved around?
Suhail Patel: Yes, absolutely. For us, it’s as simple as updating our software catalog and updating code owners, and we’ve built a lot of machinery behind the scenes so that alerts get automatically migrated and the Sentry alerts and things like that get automatically updated behind the scenes. I speak to a lot of organizations that are adopting tools like Backstage and other service catalog platforms, and I think the real value is in the automation that you can build on top of those platforms so that you can manage this ownership over time.
So as your organization grows, your automation is able to kick in and effectively how people move around between teams, you’re able to move your systems around between teams as well, and I think that’s where the real value comes in. If you can invest in that automation, your life becomes much, much easier.
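As a rough illustration of the ownership automation described above, here is a minimal Go sketch in which a single catalog update drives both code ownership and alert routing. Everything in it is an assumption made for illustration: the `Catalog` type, its `Transfer`, `AlertRoute` and `CodeOwners` methods, the `example-org` handle and the `#alerts-` channel convention are hypothetical names, not Monzo's tooling or any catalog product's API.

```go
package main

import (
	"fmt"
	"sort"
)

// Service is a hypothetical catalog entry: a service and the team that owns it.
type Service struct {
	Name string
	Team string
}

// Catalog is a hypothetical in-memory software catalog keyed by service name.
type Catalog struct {
	services map[string]*Service
}

// NewCatalog returns an empty catalog.
func NewCatalog() *Catalog {
	return &Catalog{services: make(map[string]*Service)}
}

// Register adds a service (or overwrites an existing one) with its owning team.
func (c *Catalog) Register(name, team string) {
	c.services[name] = &Service{Name: name, Team: team}
}

// Transfer changes the owning team in one place; everything else is derived.
func (c *Catalog) Transfer(name, newTeam string) error {
	svc, ok := c.services[name]
	if !ok {
		return fmt.Errorf("unknown service %q", name)
	}
	svc.Team = newTeam
	return nil
}

// AlertRoute derives the alert destination from the current owner
// (the "#alerts-<team>" naming is an assumed convention).
func (c *Catalog) AlertRoute(name string) (string, error) {
	svc, ok := c.services[name]
	if !ok {
		return "", fmt.Errorf("unknown service %q", name)
	}
	return "#alerts-" + svc.Team, nil
}

// CodeOwners renders CODEOWNERS-style lines from the catalog, sorted by service name.
func (c *Catalog) CodeOwners() []string {
	names := make([]string, 0, len(c.services))
	for name := range c.services {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("/%s/ @example-org/%s", name, c.services[name].Team))
	}
	return lines
}

func main() {
	c := NewCatalog()
	c.Register("service.ledger", "payments")

	// Moving the service between teams is a single catalog update.
	if err := c.Transfer("service.ledger", "core-banking"); err != nil {
		fmt.Println("transfer failed:", err)
		return
	}

	route, _ := c.AlertRoute("service.ledger")
	fmt.Println(route)
	for _, line := range c.CodeOwners() {
		fmt.Println(line)
	}
}
```

The design choice worth noting is that ownership is stored once and the alert route and code owners are computed from it, so a team move cannot leave the two out of sync.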
Suhail’s Role and Career Growth [04:27]
Thomas Betts: Yes, yes, I think so. So I also want to talk about your role over that period of growth. What roles have you had and how has that changed over the time?
Suhail Patel: I am quite unique in the Monzo organization in that I’ve always remained in the world of platform. I’ve been at Monzo for seven years now, which is a very, very long amount of time compared to most of the folks within the organization and also most careers in tech, especially in today’s day and age. I joined in a platform team that was about four people. It was newly established.
It was at the time where we were like, “Okay, we want to invest in some core platform capabilities”. And now we’re over 10 squads. There’s nearly a hundred people working in the platform domain in some function or another, and a lot of that has been subsuming other areas and platformizing them. So taking care of things like machine learning and data. Now we’re looking at AI just like every other organization whilst also maintaining our core platform infrastructure as well, effectively maturing as an organization and investing in our world of platform.
I think it was a very conscious decision early on that technology was going to be a very, very key part of how we are able to scale and the products that we are able to build. We’ve been very, very fortunate that at exec level and at CTO level and across all levels, our board level as well, we’ve had massive buy-in to maintain a platform organization that is focused on building those underlying tools rather than actually building the product side of things. And it takes a constant effort to keep that investment up and running, but that investment provides a significant amount of value for other engineers in Monzo to be productive.
Platform Investment and Buy-in [06:02]
Thomas Betts: Yes, I think that buy-in like you said is critical. If you’re off building something, but the executive team doesn’t see value in that or the sales team or the product owners or whoever don’t understand why we need to invest in the platform, you’re just seen as one other system that’s being developed and not like this is the thing I need to add to the features for my part of the business.
Suhail Patel: Yes, absolutely.
Thomas Betts: But you’re saying the platform is what powers the business and everyone understands that, that’s definitely a good way to have the organization think about things.
Suhail Patel: Yes, absolutely. In many organizations, platform is usually just a cost center. They receive the AWS bill and someone from execs up high wants to see the AWS bill go down. So yes, it’s typically often just seen as a cost center. Whereas there is a real tangible monetary value that can be provided, especially when you quantify things like the cost of having incidents or the cost of technology slowing you down, being able to release a product. One of our fundamental philosophies is that technology shouldn’t be the blocker to our ideas and our product releases, our imagination should be.
MVP and MVA in Platform Design [07:07]
Thomas Betts: Yes, so beginning of those ideas, I want to talk about the idea of an MVP. We hear about an MVP, the minimum viable product, and MVA, the minimum viable architecture is becoming more common and it’s that useful way of saying we don’t need everything right now. What is absolutely essential, and that can be when you were the four-person team or maybe it’s on these new teams developing something new for AI and ML. So for the core platform team, what does that look like? How do you think about what’s the first thing and the first version of the platform and what you waited until later to implement?
Suhail Patel: For us, maybe our journey has been a little bit unique in that our technology choices have been strongly influenced by our founding engineers and organizations that they’ve worked in the past. So effectively our microservices architecture utilizing Go, we’ve built our systems on top of Cassandra and adopted container-based technologies.
Those are based on experiences pretty early on from our founding engineers. But I think we’re a decade old. So I think that minimum viable architecture is a constantly shifting landscape, and you look at in today’s day and age, I think there are some things that remain constant though. Sometimes when you think about architecture, you immediately think, “Well, I need a database and caching and queues and all of these different bits of functionality”. And you see really, really complicated diagrams being established for what are relatively simple applications.
If you’re looking from my perspective, you need a way to reliably store data. You’re going to need a database, you may need an ability to go and store larger binary blobs, so you may need an ability to do that, like access to an S3 or something. You need some level of compute infrastructure, whatever flavor that might be, whether you want to run containers in Kubernetes or run serverless, what have you, you need to have some elements of compute infrastructure, and I think there’s a fourth component that is often ignored, which is some form of audit trail, audit log, being able to understand what your application did behind the scenes, and I’m not just talking about standard logging or standard telemetry or tracing or audit log from a security point of view.
Event Logging as a Superpower [09:15]
Suhail Patel: I’m actually thinking about it more from if you’ve got disparate components, even if you’ve got some semblance of a microservice architecture or different functions or what have you, you’re going to be spending especially a lot of time early on debugging where a particular bug may have occurred or understanding what a customer did or what have you, if your logic is not fully formed, and if you’re able to effectively do the equivalent of attaching a debugger and stepping through your code, but in production, I think that is an absolute superpower of capability and there are some technologies out there that are not fully baked and fully mature yet that allow you to do this in real time, but I think really having the ability to emit logging into some form, whether it is into some blob storage, some form of audit log, which is the customer did this and then they did that, and then they did this, and then this went through this bit of your application.
Often in architecture, we call it the event sourcing model, but it is not necessarily pertaining to outputs that happen in the database. You’re not actually building your data from event sourcing itself, which comes with its own set of complexities, but effectively having some form of events so that you know what is going on behind the scenes, that is a real superpower because then you’re able to debug problems really, really quickly because you’re able to effectively see some timeline of events.
So I see that as a very, very core part of the minimum viable architecture. It’s not something that you often see in textbooks or in write-ups online. Typically, you see people rushing to have a CDN or a cache or what have you. Those things are okay, but I think having that event log is really, really key.
Thomas Betts: Yes, some of those things you don’t need until you scale up to the point that you need them, and so you want to make sure you have them in place before you get to the million users like, “Oh, we only need to support 100 users or 100,000 users, but at 10 million, we have different challenges”. You won’t have the 10 million people on day one. Maybe you’re that successful, but that’s unlikely that you have to design that as your MVP and your MVA.
But I like the idea of having what happened established as a core first feature because if that’s there, then every time you make another microservice another part of the platform, you tend to copy and paste the code. If it’s not even fully baked in, you’re like, “Oh, well just reuse this thing”. And then everybody gets that and then everyone expects it to be there and you rely on it when, like you said, when you’re doing production-level debugging. So you know that that’s part of the system as opposed to, “Oh, I try and figure it out and piece things together”.
Suhail Patel: Yes, and I think adding that on is much harder later on. These become really, really complex projects later on and you always wish that you had it, especially when you have an incident or something goes wrong or you’re debugging a complex bug and yes, you wish you had that to hand.
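To make the event log idea from this section concrete, here is a minimal Go sketch of an append-only log that can replay one customer's timeline ("the customer did this and then they did that"). It is a sketch under stated assumptions rather than a description of Monzo's system: `Event`, `EventLog`, `Emit` and `Timeline`, along with the service and action names, are hypothetical, and the log lives in memory where a real one would write to durable storage such as blob storage.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Event is one entry in a hypothetical append-only event log.
type Event struct {
	Seq        int       // position in the log, assigned when the event is appended
	At         time.Time // wall-clock time the event was recorded
	CustomerID string
	Service    string // which component emitted the event
	Action     string // what happened
}

// EventLog is an in-memory stand-in for durable event storage.
type EventLog struct {
	mu     sync.Mutex
	events []Event
}

// Emit appends an event to the log and returns the stored record.
func (l *EventLog) Emit(customerID, service, action string) Event {
	l.mu.Lock()
	defer l.mu.Unlock()

	e := Event{
		Seq:        len(l.events) + 1,
		At:         time.Now().UTC(),
		CustomerID: customerID,
		Service:    service,
		Action:     action,
	}
	l.events = append(l.events, e)
	return e
}

// Timeline returns one customer's events in the order they were recorded.
func (l *EventLog) Timeline(customerID string) []Event {
	l.mu.Lock()
	defer l.mu.Unlock()

	var out []Event
	for _, e := range l.events {
		if e.CustomerID == customerID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	el := &EventLog{}
	el.Emit("cust_1", "service.card", "card_presented")
	el.Emit("cust_2", "service.account", "balance_viewed")
	el.Emit("cust_1", "service.ledger", "payment_authorised")

	// Replay what happened for a single customer across services.
	for _, e := range el.Timeline("cust_1") {
		fmt.Printf("%d %s %s %s\n", e.Seq, e.At.Format(time.RFC3339), e.Service, e.Action)
	}
}
```

Unlike full event sourcing, nothing here rebuilds application state from the events; the log exists only so that a timeline can be read back when debugging.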
Balancing Scale and Architecture [12:07]
Thomas Betts: So I think there’s this symbiotic relationship when you’re dealing, especially the scale of microservices that Monzo has that in order to support this growing application, you need to make the platform bigger and you’re adding more employees constantly because the company is growing and so you’ve got more people, more features, more platform. How do you balance all those changes so that one area doesn’t get too far ahead or lag behind the other when it comes to architecture and features?
Suhail Patel: Yes, that’s a really, really good question. The key thing there is communication, being in the room when these decisions are being made on what things we want to work on, but also being in the room when people have the pie in the sky ideas. I think no one comes to work, well, most days and tries to drop a surprise in that we’re going to go and implement this feature and I want it delivered by next week. In most organizations that are scaling, there’s some form of planning and coordination and some advanced notice so that the time and energy is being spent appropriately and these things are sequenced in order and being part of that sequence is a very, very key thing. For us within a platform team, I always think about capabilities that we want to be able to provide the ability to unlock things for product teams.
Platform Capabilities and Speculation [13:25]
Suhail Patel: So being one step ahead of some of those ideas, what would we be able to enable if this thing was easier? If we were to unlock the capability for calling into LLMs really, really reliably in a cost-effective manner, what would that enable from a product perspective? That is something that we are constantly thinking about. And there is a bit of a dilemma because it’s very, very easy to make lots of bets and for many of those bets not to pay off.
For example, one would argue should we have had a team focused on blockchain capabilities for example? Or in stablecoins or what have you from a platform perspective, not even from a product perspective, but these are big shifts that have happened in the financial services and in the industry and have informed how we write ledgers and a bunch of other areas.
So it’s very, very easy to make lots and lots of speculative bets, but I think being in the room and having an ear of what is happening on the ground, speaking to various individuals, I keep a very, very close relationship with the staff-plus and principal engineer community, the engineering director community as well. Being engaged in those conversations and making sure that ideas can then bubble up into our planning as well is really, really key.
And having a bit of gut intuition of what is happening on the product level. And advice that I often give to platform engineers is it’s very, very easy to get fixated in your platform world. You start thinking about building blocks in Kubernetes containers and in Terraform modules and AWS bits of functionality and basically being a parrot for whatever your cloud vendor gives you or whatever a third party vendor can give you, and I think that is not the aim of the game.
The aim of the game is to go and understand what is going to be built from a product perspective or what things we want to be able to enable, not be a product manager or product designer, what have you, but to get a good understanding and then be able to translate that into things that might be needed from a technical perspective. As an example, if they’re building a piece of functionality that is constantly going to need communication back and forth between your servers, then you’re going to have to build a high throughput system.
And if that’s not a capability that you have at the moment or if you have a database technology that is going to constrain you, then that might be something that you want to get on top of, or if you need for example, a document database or like a vector store or something like that, that is the ability to translate something that might be needed from a product perspective into some speculative execution that you might need from the platform perspective, and that is what I like to think as core bets to be looked into.
Translating Product Needs to Tech Bets [15:58]
Suhail Patel: To have an opinion on what that speculative execution might look like so that when the rubber hits the road and you actually need to execute on some of these things, you have some plan of attack rather than just sitting in the room and saying, “Well, we’ll need to investigate how this thing might occur”. And some of this can come through industry like keeping a close eye on what is happening in industry. Again, very, very big plug for QCon. I have a lot of conversations at the conference with people at the forefront in the industry, and I think that is a very, very big takeaway from me understanding what technologies are there, going to talks about technologies that I’m not familiar with and gaining that context so that when I come back I am able to reference these technologies and take a bit of a shortcut, we’re not just starting from ground zero.
Thomas Betts: Yes, yes. I think the idea of you don’t want to be the blocker. Every new idea is going to have some bottlenecks and some blockers that come up, but you want to make sure that if you’re on the platform team for example, you don’t want the platform to be the persistent blocker because if you are, people are impatient, they’ll find workarounds.
So you want to make sure that you’re leading the way, providing enough runway so that you’re out there two steps ahead. So when they come to you and say, “We want to do something, are we able to?” It’s like, “We already have something in our roadmap, we’re going to have that ready for you. You’ll be able to just use”. They said high communication between services or whatever it is.
Suhail Patel: Yes, that’s exactly it.
Automating Developer Efficiency [17:23]
Thomas Betts: Going back to the whole journey from the early days to scaling up. You can get by doing a lot of stuff manually or with limited automation, like building and deploying code is probably the first example and we know that doesn’t scale, and I think a lot of companies are saying, “Okay, day one that MVP has to have a build script. I should be able to check in my code and it gets deployed”. But I like to take that analogy and say, “Okay, we know how to make our software deploy better. How do we get to making people more efficient?”
I think it was Kelsey Hightower has a great quote that he doesn’t want the 10x engineers, he wants someone who can empower 10 other engineers, and it seems like platforms are one example of how you can do that, but where’s the tipping point? How do you get from the one person doing the work in your role to say, “Okay, I’m an individual contributor”, to, “I’m going to enable and empower 10 other people to be productive”.
Empowering Engineers Through Communication [18:20]
Suhail Patel: Yes, I think there is a really, really key point in sharing your marbles, spreading your marbles. There’s an analogy in every organization. For me, I think that constant communication is speaking and writing. That for me is like a form of influence. Being able to write everywhere that you write, everywhere that you proliferate, your message is really, really key. Being very, very articulate in communicating what bits you’re looking into, how you’re doing something, what commands you’re running.
It’s like teach a person to fish and they’ll be able to do it themselves, but often we say, “Oh, I’m just going to tweak the variable over here”. But for a junior engineer, they might not know what that means exactly. So being very, very articulate in what you are doing. There can be very small distinctions. For example, you may say, let’s say in an incident or for example in a debrief talk, “I ran a command that changed X to Y”. And if it’s easy, including the actual command that you ran, so that someone else can take inspiration, might give someone else the opportunity to learn something new.
They might have learned a new command for example, or a new tool that you have or a capability that you have that they might have not known before. And effectively what you’re doing is you are spreading your wings, you’re accelerating the learning for other people. So being articulate doesn’t really mean writing reams and reams of documentation or spending all of your time doing lots and lots of tech talks. I think these are all positive things, but you shouldn’t need to write 50 pages. That’s probably eventually going to get summarized by AI anyway. But for example, being verbose where verbosity can be used as a tool to go and teach someone.
Verbosity and Learning from Incidents [20:11]
Suhail Patel: Again, I use the notion of incidents. I think incidents are a very, very great way of learning personally. In an incident channel, I like to be extremely verbose when I’m debugging an incident. Which is, “I’m going to go open this dashboard. I have looked at this panel. I have noticed an uptick over here, and here’s a screenshot of what I noticed. I have seen this thing trend downwards. I have seen an increase from 10 milliseconds to 20 milliseconds, which is unexpected because the P99 for this, and I’ve zoomed out.”
And I’ve had a lot of engineers reach out to me and were like, I really like that method of communication because you’re effectively breaking down the way that your mind is thinking about these things and some of those things which are gut intuition, I am vocalizing. So other people are then building up their own gut intuition as well. We’re seeing this a lot and I do see a very, very strong parallel to what we now call prompt engineering, I guess, right? In prompt engineering, you effectively are being very, very verbose.
If you leave your prompts as being quite terse, quite concise, you can often yield good results, but you can yield great results if you are very, very verbose. If you’re able to tell it, we typically capitalize our SQL keywords or we have underscores for these sorts of things, make sure you run the linter and address any errors. Being very, very verbose about your thought process, not exactly on how to do the task. You want to leave some freedom and autonomy, but being very verbose on things that are non-negotiable, things that are like gut intuition, things you want to encode as part of the process is very, very similar parallels.
Prompt Engineering and Debugging [21:53]
Thomas Betts: Yes, I’m glad you brought up prompt engineering because it’s something that I’ve been talking about recently with my coworkers. We used to have this challenge of onboarding new team members or even onboarding new teams and saying, “Oh, well, you have shared documentation or is there a video of a recording? Here’s where we brought someone up to speed”. And in the past, it had always been small tribal knowledge.
Back when everyone was in the office, you’d sit down and you’d be the buddy and someone would work with you very closely for your first month and pair program it a lot. And in remote environments, it’s harder to do that. And we’re relying more on Slack and asynchronous communications as ways to document things. And what we found recently with using (we use GitHub Copilot) is writing out the Copilot instructions file like in this repo, here’s how our code is structured, here are the things we do, here’s the unit testing frameworks we use.
And I realized that I’m doing that because I need to tell the AI to write code the way we write code, but that’s just like when I brought on a new engineer and we say, “Here’s how we write stuff in this project or in this company”. And the idea of the AI is your new junior dev and you have to teach them every time and then you write it down and then they get better at it. It’s the same way I think you teach other people. And so using that as a metaphor for, “Here’s a practical example of where I have to explain myself”. But using that for incidents where people are learning how to be better at debugging and troubleshooting because those are core skills.
Suhail Patel: Yes, absolutely. I think there are really, really strong parallels and engineers who are good at one will excel in the other. I do fundamentally believe that, and I see this in practice.
Transitioning to Technical Leadership [23:35]
Thomas Betts: So you’ve talked about your role, what has changed and what are the day-to-day activities that make that transition easier for someone going from hands-on, individual contributor to more of that systems thinking and looking across platforms and across multiple teams and being that leader, even if you don’t have a formal, I have a team and I manage people, but once you get to a staff or principal level, you are a leader and so you need to lead by example. How do you make that transition?
Suhail Patel: I don’t think there’s a one binary flip. There’s no one action that says I am now a leader. Maybe in some organizations you get given a tech lead title, but in most organizations nowadays that title is earned. So by the time that you earn the tech lead title, you’re already doing the work, you’re already acting as a tech lead for an organization, and same with a staff role as well, staff principal role. You’re already doing and having that level of impact that is recognition for the impact that you are having.
I think a very, very key thing is a staff principal role, and I think even maybe even a tech lead role is very specific to an organization. So understanding the organizational context, there’s behaviors and aspects that are really, really key. A staff/principal title is quite rare to be given just for code contributions, just for technical code contributions because there are fantastic software engineers out there that will write code faster than you or write code that is better factored than how you write code.
I am by far not the best programmer. There are engineers that are far better than me across the industry. There are engineers that are graduating now who are going to be far better than any of us in engineering and learning new technologies and having capabilities that we could have just dreamed of. However, understanding that organizational context and having that conviction is really, really key. The conviction to go put in bets and take ownership of seeing those bets to success.
Seeing Bets Through to Success [25:37]
Suhail Patel: I think that’s the unique capability that an individual contributor at this level really has: I am able to say with some element of data, but there’s not a lot of data because you’re proving a capability that you might not have had. So there’s no data to prove it. You can say that others in the industry have done it, but there’s no capability to say that this thing is going to be seen to success, but the thing being seen to success, you can’t just say, “Well, I am going to delegate this to another team and then I’m going to get out of the way and come back when it’s time to celebrate success”. Because what happens when it leads to failure? If it leads to failure?
And in many organizations without that constant sharing of that vision and without that sharing of that motivation, often these projects do lead to failure or partial success as I like to call it, rather than full success. So effectively, making bets and making sure that you see them to success as an individual contributor, being part of the discussions where you’re designing architecture, being part of the discussions where you’re making complex code decisions, setting up the team for success, make sure that you’re writing decision records and especially decision records and having discussions for areas that are going to be non-reversible changes, one-way doors is a very, very key thing. That doesn’t mean you need to be part of every discussion.
I think that is quite a negative behavior in fact because your teams or the teams that you build lose autonomy, but making sure that if the team structure is set up so that they are thinking about these things, it’s an opportunity for people within those teams to grow and setting those teams up for success and putting the right constraints and guardrails in place. So for example, if there is implicitly a cost constraint or if there is a performance constraint or if there’s a scalability angle that you want to target, make sure that those are communicated early on so that for example, let’s say that the system is not performing to the same order of magnitude that you’d intended it for.
You’re not going to end up having an uncomfortable conversation that you need to scale the system up. It’s going to become a hundred times more expensive than you budgeted for within some performance parameters. Trying to catch those things early on. And if you encode that thinking process as members of your team, for example, through one-to-ones and having discussions and being close enough to the discussion on technical architecture, you can see projects to success without directly being involved. And once people understand again that methodology of thinking, that will then proliferate to the rest of your organization.
Coaching and Building Team Capacity [28:06]
Suhail Patel: You’re effectively then investing in other people, which I think is also a really, really key part of being a staff-plus engineer is investing in other people leveling up your bench or your roster of people. Similar to a coach in a sports team, you are responsible for leveling up the players within your team and you’re going to have people from all spectrums. You’re going to have junior engineers, engineers that are new to the industry, interns maybe even, senior engineers as well.
And all of these folks, there’s many, many ways to engineer your way out of a problem and it requires that diversity of thinking. They’re going to have really, really good ideas, but they’re going to also want to understand the constraints that you are hearing and discussions that you are involved in. So having that high bandwidth two-way communication is really, really key to level up and build up your area of the organization and making sure that you’re building up your bench and your team as well.
Trailblazing and Decision-Making [29:02]
Thomas Betts: Yes, and I think the ideas of talking about I’m going to go off and explore the uncharted territory, that’s my role, whether it’s in that I’m going to become a lead engineer, all that matters is scale. I’m going to be lead for a team and then I’m going to be lead for a group of several teams and then higher level leading across an organization, and it’s like how far out into the wilderness are you going sometimes? Being a trailblazer and say, “We don’t have an established paved road yet”.
Somebody has to go there first, do the bushwhacking and then start putting down the gravel trail. And eventually someone comes along and is like, “We’ve now done this three times, we need to pave the road”. So some of it is that going out there, but also letting people see, “Here’s how we go and do that exploration. Let me take you on the hike the first time and we have to do the bushwhacking and then the next time you have to do that on your own. You don’t feel completely lost and afraid to do it”.
And that’s I think one of those ways to level people up: just give them the opportunity, give them just enough chance that they could make the wrong decision, with guardrails there to catch them. For the small things, like you said, the things that aren’t one-way doors, that might be easier. But once you get to a bigger decision, that’s when they might suggest, “I think we should do this. I hate saying it’s above my pay grade, but there’s someone else that should be involved in making this”. Go to the next level of engineering management and leadership and say, “Weigh in on this decision, but here’s my idea”.
Communicating Across Engineering Levels [30:26]
Suhail Patel: Yes, absolutely. And I think one of the opportunities you have, especially as an individual contributor, is you can dispel the ambiguity at a different level. For example, if you have an engineering manager or engineering director who’s not involved in day-to-day engineering, typically your method of communication will be messages, docs, maybe architectural diagrams.
Whereas as an individual contributor, you’re able to get to the code level. You’re able to help people unblock themselves through code because you’re effectively looking at the same shared language that you both are communicating in. The artifact that you are producing as engineers is something that is shared and you’re able to get at that lowest form and communicate through that lowest form.
Thomas Betts: Yes, I think I always go back to “audience, context, and purpose” when you’re trying to communicate. So knowing that I’m not going to show my VP, “Here’s the code”. That’s a good lesson learned: they don’t need that much detail, they don’t have time for that. But you need to be able to say, “What’s the high-level decision we’re making?” and walk people through that thought process. I’ve appreciated ADRs as a personal tool that helps me take the pause, think through the problem and write down what are the things I’m actually considering and getting them out of my head, but also as a way for other people to see, “Here’s how Thomas thought through this problem”.
And then letting them write the next ADR and I’m just an advisor weighing in. It’s like here you make the decision, you come up with the options, and if I know that they should have considered a third option that wasn’t on their list, I can suggest that and then they can think through what are the trade-offs? Maybe that third option isn’t great, but we should at least consider it.
Suhail Patel: Exactly. And it also serves as a form of future documentation as well for people that are coming along, which is these are the options that were considered. So if someone then comes by and has a fourth or fifth option, or maybe there are new changes in the industry, they’re able to reflect that and say, “Well, actually these tools have these capabilities now. So we exhausted our search space at the time, but industry has now moved on”. An open source tool might have come about, a new standard might’ve been procured, a new third party tool might’ve been onboarded. That makes life much easier.
Abstractions vs. Deep Understanding [32:40]
Thomas Betts: So in the same vein of getting people up to speed, I’ve encountered many experienced engineers and not just software engineers, mechanical and electrical engineers as well, and they think that the newer generation, the juniors that are coming up behind them, need to go through that same pain and suffering that they went through because that’s how they built up the scar tissue, that’s how they learned to be what they are, and that if we abstract away all those low level details, there’s a loss of understanding. They don’t know how the system works. They have to learn how the system works, therefore they have to experience the pain to understand that.
On the other side of the argument, if we hide those details, that means we can allow people to focus more on the domain-specific problems, the software that’s actually going to be the game changer. It’s not valuable for every engineer to know how Kubernetes works because you’re not a Kubernetes company, you are a bank. You should be writing banking software. So where do you see that balance playing out between having the abstractions to allow people to go faster but enough visibility so that they’re able to understand how is it actually working?
Suhail Patel: I am firmly in the territory that not every engineer needs to understand every single abstraction or needs to know all of the details, and this is actively even part of a hiring strategy. What we want engineers to have is the ability to do first principles thinking, to understand, for example, this is where the barrier of my knowledge is, but I know that there is a problem underlying this and I’m going to go and scratch the surface one level deeper or go solicit expertise one level deeper because I have a hypothesis. And rather interestingly, we are in a remarkable time because now you have a really knowledgeable rubber duck right beside you on command all of the time, because you can go and rubber duck with a prompt to ChatGPT or Claude or whatever tool is being used nowadays. Even including local tools, even if you’re interacting with an LLM locally or in Copilot or what have you, you’re able to rubber duck something. You’re able to showcase how you debug on a hypothesis and then go in and be able to do that.
For example, even if you go to most CS courses, computer science courses that is, or software engineering courses in university, most don’t teach assembly. There’s a whole bunch of CS courses that no longer teach Java. They’ve gone into Python for example, or other programming languages, and I find it fascinating. You look at a lot of the enablement of the AI generation and a lot of AI tools being created. Those are created by folks who aren’t computer engineers by trade, aren’t software engineers by trade, but they have an idea. And I think there is space for everyone to go in and contribute really, really good stuff. I think it’s going back to the whole diversity of thought.
I don’t believe that the only people that the industry is accessible for are people who have been programming since they were five. I think that is very, very exclusionary to individuals. I think the only skill that you need to have to be a software engineer is the willingness to do first principles thinking, and I think capability and ability can be learned over time and you learn through others. I have learned many, many things on the shoulders of others. Again, shouting out to QCon. I learned a lot through video tutorials online and writing code from books, and watching talks online is how I learned how to do AWS. So we’ve very much learned from our peers, but just having that curiosity and that first principles thinking I think is the only prerequisite that we need to be really, really successful in our industry.
Now on the abstraction of low-level details, even for mature organizations: for many engineers, when you abstract away low-level details, it allows them to spend their limited brain cycles on higher order things, because we only have so many hours in the day and there is a limit to how many brain cycles you have.
So for example, on a day-to-day basis I don’t think about racking servers and hard drive failures and RAID and RAID-Z and BIOSes and operating systems and the various third party packages that we build on top of. I simply spin up an EC2 instance or a VM somewhere and it just works, and that VM dies from time to time and a new one spins up in its place and it just works. I think there’s an element of magic to that. Just like I’m speaking to you through a computer, standing on the shoulders of giants for us to be able to have a real-time conversation right now. It’s remarkable technology and yet for us, we’re having a conversation about the world of software engineering and being a staff-plus and principal engineer.
We’re not thinking about what is going on behind the scenes. I’m not coming to you from a browser and a video codec that I’ve written and an audio codec that I’ve written. I sort of plug it in and then it just sort of works. And that is what I expect from my technology. When you have engineers build up that expectation, they can rely on foundations that are built by others. They can spend their brain cycles thinking about higher order things. And within an organization, to put this analogy straight onto the ground, it really allows engineers to focus on building a really, really good product for our customers, with technology being reliable.
And then when they notice deviances in that technology, they have a group of people that they can reach out to. And I mentioned a little bit earlier, when we were talking about minimum viable architecture, having that event log for us to be able to go and debug, having that capability, having engineers self-serve, and having that be a reliable suite of abstractions. Just like how we rely on our keyboard and our mouse to be working on a day-to-day basis, relying on those abstractions to be working and being taken care of on a day-to-day basis frees up our brain capacity to go and work on higher order things and deliver really, really good products for customers.
And you’re seeing this play out in the industry as well. The things provided by cloud providers and AI providers and a lot of hot new companies nowadays like Supabase and so on, they’re taking care of a lot of the details behind the scenes so that you can focus on higher order things.
AI’s Role in Software Engineering [38:51]
Thomas Betts: So I’m going to give you a chance to finish on a hot take. Where does AI come into this? The speculation seems to range from, “We don’t need to ever hire junior engineers because I can have an AI do their job”, to, “You know what? We could hire only junior engineers because if they know how to use AI, they’ll be doing the work of seniors”. What guidance do you have for people that are trying to figure out how to use AI as part of their professional development and even for hiring?
Suhail Patel: I think the people that I want to reach out to are folks who have immediately dismissed it. They may have used it once and, because it didn’t do things exactly how they would’ve done them, they have immediately dismissed it as a tool that is just part of the hype cycle or what have you. I very much lie straight in the middle between AI skeptic and AI fanatic.
I am someone who is very, very keen to see where the industry goes, but just like I’m very, very keen on looking at my IDE updates, for example, to see a new functionality that is being provided because it provides me shortcuts. It’s a tool that allows me to go and get what I want done, done.
I do not find myself very attached to the code that I write. I like my code to be clean. I like my code to be readable. I like my code to be well-designed and well-architected and for folks to be able to contribute. I like my code to be bug-free. And if I encode some of those principles into AI, I firmly believe it can generate code that is 90% of the way there. And if that means that that frees up a couple hours of my day so that I can go and do other stuff, I think that is a massive net benefit.
Again, doing some of that rubber ducking that I mentioned a little bit earlier, I found a lot of value in talking to LLMs and outlining my thought process for it to come up with some element of critique or some element of feedback on how my thought process may be informed or for identifying new bits of technology. Once you figure out where AI can be useful, I think it can be a really, really useful tool for engineers to have within their arsenal. I think a key part of AI is understanding how to converse with it and when it excels. Different tools for different jobs. There is a certain point where you stop writing shell scripts and write an application instead. You wouldn’t write a bank in bash. So we’ve already built this ability as engineers to understand where we apply certain tools for certain things.
But I think there’s an entire category of engineers that are immediately dismissing this technology, this marvel of a technology I firmly believe. It is one of the very, very few times where I believe I have the entire internet on my computer, especially if you use a local model, if you download Llama or something and you can use it on the plane. I just find it absolutely remarkable that this technology is available to us. So I think being dismissive of it is actively hurting you as an engineer.
I’m not in the camp where you must use AI within your professional organizations or else you’re not going to be a top performer or what have you, but understanding where this can be used to augment your day-to-day is really, really key. And I think my top advice as an engineer, if you’re getting started, is start asking AI about some of the things that you’ve been up to and see how the LLM would’ve done it.
Spend 20 bucks, download Cursor and sign up to a model and just sprinkle in a usage of AI here and there and get a gut feel about how you’re asking these questions and what prompts you’re giving and where you’re getting useful output. Where is it becoming a net positive in your contribution? Then you can continue to accelerate that net positive, and where it is being a detractor, you can simply not use it for that use case. You’re going to have to write the software by hand anyway, so you’re not at much of a loss, but you can still have it be a net positive in certain areas within your life, and that can even apply outside of the world of software engineering as well.
AI as a Nonjudgmental Mentor [42:57]
Thomas Betts: Yes, yes. I think someone had a bad analogy. If it was a carpenter and it’s like, “Oh, those new table saws, somebody chopped off a finger, I’m going to stick to hand saws”. You’re not going to be able to work as effectively. It’s a tool. I agree. The first generations didn’t work 100%. They’re still not 100%, but neither are people. So learn how to use your tools, learn how it can accelerate, but that takes some practice.
You have to build up the muscle memory of, “Oh, here’s how I use these appropriately and here’s how I find the benefit”. And I think if you think all it can do is just write code, you’re not using it to the full potential. Use it to think about how you think about a problem, to expand your capabilities. And for what we’re talking about today, if you’re growing from an IC role to a leadership role and you’re struggling with, “How do I do that?”, you can have that little one-on-one conversation about how do I get better? How would I explain this to someone else in my team?
Suhail Patel: Yes, absolutely. And I think there’s a lot of capabilities that we immediately dismiss. So for example, if you’re looking at a new code base or contributing to an open source piece of technology or learning a new programming language or debugging something in an area that you’re not familiar with, these are areas where new capabilities in AI are absolutely fantastic. It’s like having someone beside you whom you can ask any form of question. I had an engineer come to me and say they’ve routinely been afraid to ask other humans what they perceive as silly questions, and AI is non-judgmental.
You can ask whatever questions you like and AI will not judge your intelligence for it, and if it unlocks that capability for engineers who might be a bit shy or who are afraid of asking silly questions, I think that is a net positive for the world. And you mentioned as part of your question, whether AI will replace junior engineers or senior engineers. My thesis, and I’m happy to be challenged on this one, is that we are going to have now more software than ever with the AI’s ability to generate software.
First Principles in the AI Era [45:04]
If you get really, really good at having that first principles thinking again, and I use that phrase a fair bit, to be able to go and understand and debug and become proficient in what the AI has generated, and at being able to augment that with your clarifications and prompts and feedback and build up a virtuous feedback cycle, then no matter what role you are as an engineer, whether you’re junior, mid-level, senior, staff-plus, principal, distinguished, fellow, there will be a net positive of this tool in your life.
Just like a junior engineer doesn’t become a 10x engineer by using vi. They might have watched all the senior engineers do it, but that doesn’t mean they’re a 10x engineer. You need to use the tool effectively. You need to produce the right output. You need to be a net positive for your organization. The same principle, in my opinion, applies here.
Closing Thoughts and Farewell [45:59]
Thomas Betts: Well, I think we’re going to wrap it up right there. Suhail Patel, thanks again for joining me today.
Suhail Patel: Thank you, Thomas. It’s been an absolute delight.
Thomas Betts: And listeners, we hope you’ll join us again soon for another episode of The InfoQ Podcast.