Transcript
Cummins: I’m Holly Cummins. I work for Red Hat. I’m one of the engineers who’s helping to build Quarkus. Just as a level set before I start, how many of you are Java folks? How many of you are using Quarkus? How many of you have not even heard of Quarkus? I’ve worked on Java for most of my career. I’m here to talk about Java. I want to actually start by talking a little bit about Rust. I’m not a Rust developer. I have never developed Rust. I’m not here to criticize Rust, but actually I’m going to start by criticizing Rust. Of course, Rust has so many amazing features. It’s so well engineered. It’s really a needed language. It is incredibly efficient, but Rust does have a problem.
There’s a reason I have never learned Rust, which is, Rust has a reputation for being really hard to learn, and I am lazy. This is something that you see everywhere in the community. People talk about how hard Rust is. It’s too difficult to be widely adopted. Even people who really advocate strongly for Rust will talk about how hard it is. I love the title of this article, “Why Rust is Worth the Struggle”. They start by saying, with Rust, you approach it with trepidation, because it’s got this notoriously difficult learning curve. I love this, “Rust is the hardest language up to that time I’ve met”.
When people talk about Rust, people will tell you that Rust doesn’t have garbage collection, and that’s one of the things that makes it efficient. I have some questions about that. If we start with the assumption that not having garbage collection makes a language performant, which is wrong, but if we start with that assumption, what happens if we add garbage collection to Rust? Now at this point, all of the people who are Rust developers are sort of screaming quietly in the corner, going, why would you do that? What happens if you do that? It turns out, if you do that, Rust becomes much easier to use. They added a layer of garbage collection on top of Rust, and then they had a bunch of volunteers do a coding task. The people who had the garbage collected version were more likely to complete the task, and they did it in a third of the time.
Now I think we really need to rethink the efficiency of Rust, because Rust is very efficient in terms of its computational resources. If you can make a language so much easier to use by adding garbage collection, is that really an efficient language? Rust maybe is not so efficient. There's always this tradeoff: you've got your human efficiency and your machine efficiency, and with Rust, they've really gone all in on the machine efficiency at the expense of human efficiency. That's the tradeoff. I don't like that tradeoff. In fairness to Rust, I think the Rust folks don't like that tradeoff either, which is why they have things like the really powerful compiler. That's something that we'll come back to as well.
Quarkus (Java Framework)
The question is, can we do better? This is where Quarkus comes in. Quarkus is a Java framework. The programming model will be very familiar to you. We have integrations with the libraries that you’re almost certainly already using, like Hibernate, like RESTEasy, but it’s got some really nice characteristics. One of those, and this is probably the thing that people think of when they think of Quarkus, is that Quarkus applications start really fast. You can run Quarkus with GraalVM as a natively compiled binary, or you can run it on OpenJDK. Either way, it starts really fast. If you run it with GraalVM, it actually starts faster than an LED light bulb. Just to give you a scale of how instantaneous the start is. Quarkus applications also have a really low memory footprint. When we used to run on dedicated hardware, this didn’t really matter.
Now that we run in the cloud where memory footprint is money, being able to shrink our instances and have a higher deployment density really matters. If you compare Quarkus to the cloud native stack that you’re probably all using, if you are architecting for Java, we are a lot smaller. You can fit a lot more Quarkus instances in. It’s not just when you compare it to other Java frameworks. When you compare Quarkus even to other programming languages, you can see that we’re competing with Go in terms of our deployment density. Node.js has a higher deployment density than old-school Java, but it’s not as good as Quarkus. This is cool.
There’s another thing that Quarkus is quite good at which we don’t talk about so much, and I wish we would talk about it more, and that’s throughput. If you look at your traditional cloud native stack, you might get about 3000 requests per second. If you are taking Quarkus with the GraalVM native compilation, the throughput is a little bit lower, same order of magnitude, but it’s lower. This is your classic tradeoff. You’re trading off throughput against footprint. This is something that I think we’re probably all familiar with in all sorts of contexts. With native compilation, you get a really great startup time, you get a great memory footprint, but at the expense of throughput.
Many years ago, I worked as a Java performance engineer, and one of the questions we always got was, I don't like all of this stuff, this JIT and that kind of thing, couldn't we do ahead-of-time compilation? The answer was, at that time, no, this is a really terrible idea. Don't do ahead-of-time compilation. It will make your application slower. Now the answer is, it only makes your application a little bit slower, and it makes it so much more compact. Native compilation is a pretty reasonable choice, not for every circumstance, but for some use cases, like CLIs, like serverless. This is an awesome tradeoff, because you're not losing that much throughput. This is a classic tradeoff. This is something that we see. I just grabbed one example, but we see this sort of tradeoff all the time: do I optimize my throughput or do I optimize my memory? Depends what you're doing.
Let's look at the throughput a little bit more, though, because this is the throughput for Quarkus native. What about Quarkus on JVM? It's actually going faster than the alternative, while having a smaller memory footprint and a better startup time. That's kind of unexpected, and so there is no tradeoff, we just made it better. Really, we took this tradeoff that everybody knows exists, and we broke it. Instead of having to choose between the two, you get both, and they're both better. I keep trying to come up with a name for this kind of double win. I've tried a few names.
Someone suggested I should call it the überwinden. I don’t speak German, and so it sounded really cool to me, but it’s become clear to me now that the person who suggested it also didn’t speak German, because whenever I say it to a German person, they start laughing at me. German’s a bit like Rust. I always felt like I should learn it, and I never actually did. You may think, yes, this isn’t realistic. You can’t actually fold a seesaw in half. You can’t beat the tradeoff. It turns out you can fold a seesaw in half. There are portable seesaws that can fold in half.
What Are the Secrets?
How does this work? What's the secret? Of course, there's not just one thing. It's not like this one performance optimization will allow you to beat all tradeoffs. There's a whole bunch of things. I'll talk about some of the ones that I think are more interesting. Really, with a lot of these, the starting point is, you have to challenge assumptions. In particular, you have to challenge outdated assumptions, because there were things that were a good idea 5 years ago, things that were a good idea 10 years ago, that now are a bad idea. We need to keep revisiting this knowledge that we've baked in. With this slide, I wondered, can I do this? Because I don't know if you've heard the saying, when you assume you make an ass of you and me, and this is an African wild ass.
The first assumption that we need to challenge is this idea that we should be dynamic. This one I think is a really hard one, because everybody knows being dynamic is good, and I know being dynamic is good. I was a technical reviewer for the book, "Building Green Software", by Anne Currie, Sarah Hsu, and Sara Bergman. I was reading through, and I kept reading this bit where Anne and Sarah would say, "We need to stop doing this because it's on-demand". I was thinking, that's weird. I always thought on-demand was good. I thought on-demand made things efficient. This is sort of true. Doing something on-demand is a lot better than doing it when there's no demand and never will be. But when you do something on-demand, you're often doing it at the most expensive time. You're often doing it at the worst time. You can optimize further, and you can do something when it hurts you least.
This does need some unlearning, because we definitely, I think, all of us, we have this idea of like, I’m going to be really efficient. I’m going to do it on-demand. No, stop. Being on-demand, being dynamic is how we architected Java for the longest time. Historically, Java frameworks, they were such clever engineering, and they were optimized for really long-lived processes, because we didn’t have CI/CD, doing operations was terrible. You just made sure that once you got that thing up, it stayed up, ideally, for a year, maybe two years.
Of course, the world didn’t stay the same. What we had to do was we had to learn how to change the engine while the plane was flying, so we got really good at late-binding. We got really good at dynamic binding, so that we could change parts of the system without doing a complete redeployment. Everything was oriented towards, how can I reconfigure this thing without restarting it? Because if I restart it, it might never come up again, because I have experience of these things.
We optimized everything. We optimized Java itself. We optimized all of the frameworks on top of it for dynamism. Of course, this kind of dynamism isn’t free, it has a cost. That cost is worth paying if you’re getting something for it. Of course, how do we run our applications now? We do not throw them over the wall to the ops team who leave it up for a year, we run things in the cloud.
We run things in containers, and so our applications are immutable. That’s how we build them. We have it in a container. Does anybody patch their containers in production? If someone said to you, I patch my containers in production, you’d be like, “What are you doing? Why are you doing that? We have CI/CD. Just rebuild the thing. That’s more secure. That’s the way to do it”. Our framework still has all of this optimization for dynamism, but we’re running it in a container, so it’s completely pointless. It is waste. Let’s have a look at how we’ve implemented this dynamism in Java. We have a bunch of things that happen at build time, and we have a bunch of things that happen at runtime.
Actually, the bunch of things that happen at build time, it's pretty small. It's pretty much packaging and compilation to bytecode, and that is it. All of the other excitement happens at runtime. The first thing that happens at runtime is the files are loaded. Config files get parsed. Properties files get parsed. The YAML gets parsed. The XML gets parsed. Then once we've done that, then there's classpath scanning, there's annotation discovery. Quite often, because things are dynamic, we try and load classes to see if we should enable or disable features. Then we keep going. Then, eventually the framework will be able to build this metamodel.
Then, after that, we do the things that are quite environment specific. We start the thread pools. We initialize the I/O. Then eventually, after all of that, we’re ready to do work. We’ve done quite a lot of work before we did any work, and this is even before we consider any of the Java features, like the JIT. What happens if we start this application more than once, then we do all of that work the first time. We do it again the second time. We do it again the third time. We do it again the fourth time, and there’s so much work each time. It’s a little bit like Groundhog Day, where we’re doing the same work each time. Or it’s a little bit like a goldfish, where it’s got this 30-second memory, and the application has no memory of the answers that it just worked out and it has to do the same introspection each time.
Let’s look at some examples. In Hibernate, it will try and bind to a bunch of internal services. For example, it might try and bind to JTA for your transactions. The first thing it does is it doesn’t know what’s around it, so it says, ok, let me do a reflective load of an implementation. No, it’s not there. Let me try another possible implementation. No, it’s not there. Let me try another implementation. No, it’s not there. It keeps going. Keeps going. It keeps going. Of course, each time it does this reflective load, it’s not just the expense of the load, each time a class not found exception is thrown. Throwing exceptions is expensive, and it does this 129 times, because Hibernate has support for a wide range of possible JTA implementations. It does that every single time it starts. This isn’t just JTA, there are similar processes for lots of internal services. We see similar problems with footprint.
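To make that concrete, here is a minimal sketch of the kind of lookup loop being described. The candidate class names are hypothetical, and Hibernate's real list is much longer, but the shape is the same:

```java
// A sketch of the service lookup pattern described above; the class names are made up.
private static final String[] CANDIDATE_JTA_CLASSES = {
    "com.example.vendor1.TransactionManagerImpl",
    "com.example.vendor2.TransactionManagerImpl",
    "com.example.vendor3.TransactionManagerImpl"
    // ...and so on, once per supported implementation
};

static Object findTransactionManager() {
    for (String className : CANDIDATE_JTA_CLASSES) {
        try {
            // Reflective load: if the class isn't on the classpath, a
            // ClassNotFoundException is constructed and thrown, which is
            // expensive work to repeat on every single startup.
            Class<?> clazz = Class.forName(className);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // Not this one, try the next candidate.
        } catch (ReflectiveOperationException e) {
            // Present but not usable, try the next candidate.
        }
    }
    return null; // no implementation found
}
```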
Again, with Hibernate, it has support for lots of databases, and so it loads the classes for these databases. Then eventually, hopefully, they’re never used, and the JIT works out that they’re not used, and it unloads them, if you’re lucky. Some classes get loaded and then they never get unloaded. For example, the XML parsing classes, once they’re loaded, that’s it. They’re in memory, even if they never get used again. This is that same thing. It’s that really sort of forgetful model. There’s a lot of these classes. For example, for the Oracle databases, there’s 500 classes, and they are only useful if you’re running an Oracle database. It affects your startup time. It affects your footprint. It also affects your throughput.
If you look, for example, at how method dispatching works in the JVM: if you have an interface and you've got a bunch of implementations of it, then when the JVM tries to invoke a method on it, it has to take quite a slow path for the dispatch, because it doesn't know which implementation it's going to at some level. This is called a megamorphic call, and it's slow. If you only have one or two implementations of that interface, the method dispatching is fast. By not loading those classes in the first place, you're actually getting a throughput win, which is quite subtle but quite interesting. The way you fix this is to initialize at build time.
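A rough illustration of the call-site shape being described (this is not JVM internals, just the pattern):

```java
// Illustrative only: an interface with many loaded implementations.
interface Dialect {
    String quote(String identifier);
}

class OracleDialect implements Dialect {
    public String quote(String id) { return "\"" + id + "\""; }
}

class PostgresDialect implements Dialect {
    public String quote(String id) { return "\"" + id + "\""; }
}
// ...imagine dozens more implementations loaded by the framework.

class QueryBuilder {
    String column(Dialect dialect, String name) {
        // If only one or two Dialect implementations are ever loaded, the JIT
        // can keep this call site monomorphic or bimorphic and inline it.
        // With many loaded implementations it can become megamorphic, which
        // goes through a slower dispatch path.
        return dialect.quote(name);
    }
}
```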
The idea is that instead of redoing all of this work, we redo it once at build time, and then at runtime we only do the bare minimum that’s going to be really dependent on the environment. What that means is, if you start repeatedly, you’ve got that efficiency because you’re only doing a small amount of work each time. That is cool. Really, this is about eliminating waste. As a bonus with this, what it means is that if you want to do AOT, if you want to do native in GraalVM, you’re in a really good place. Even if you don’t do that, even if you’re just running on the JVM as a normal application, you’ve eliminated a whole bunch of wasted, repeated, duplicate, stupid work.
Really, this is about doing more upfront. The benefits that you get are, it speeds up your start. It shrinks your memory footprint. Then, somewhat unexpectedly, it also improves your throughput. What this means is that, all of the excitement, all of the brains of the framework is now at build time rather than at runtime, and there’s lots of frameworks.
One of the things that we did in Quarkus was we said, we have to make the build process extensible now. You have to be able to extend Quarkus, and they have to be able to participate in the build process, because that’s where the fun is happening. I think with anything that’s oriented around performance, you have to have the right plug-points so that your ecosystem can participate and also contribute performance wins. What we’ve done in Quarkus is we have a framework which is build steps and build items, and any extension can add build steps and build items.
Then, what we do is, build steps get declared, and then an extension can declare a method that says, I take in this build step, and I output that build step. We use that to dynamically order the build to make sure that things happen in the right time and everything has the information that it needs. The framework automatically figures out what order it should build stuff in. Of course, if you’re writing an extension, or even if you’re not, you can look to see what’s going on with your build, and you can see how long each build step is taking, and get the introspection there.
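As a sketch of what that looks like in an extension's deployment module (the extension name and processor class are hypothetical; the annotations and build items come from Quarkus's deployment API, but check the extension author guides for the exact current signatures):

```java
// A sketch of an extension's deployment-side processor.
import io.quarkus.deployment.annotations.BuildStep;
import io.quarkus.deployment.builditem.CombinedIndexBuildItem;
import io.quarkus.deployment.builditem.FeatureBuildItem;

class MyExtensionProcessor {

    @BuildStep
    FeatureBuildItem feature() {
        // Produces a build item; any step that consumes FeatureBuildItem
        // is automatically ordered after this one.
        return new FeatureBuildItem("my-extension");
    }

    @BuildStep
    void scanForHandlers(CombinedIndexBuildItem combinedIndex) {
        // Declaring CombinedIndexBuildItem as a parameter means this step runs
        // only after the Jandex index has been built. The framework derives the
        // build ordering from these inputs and outputs, as described above.
        combinedIndex.getIndex().getKnownClasses().forEach(classInfo -> {
            // inspect classes at build time, produce further build items, etc.
        });
    }
}
```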
Some of you are probably thinking, if you move all of the work to build time, and I, as a developer, build locally a lot, that sounds kind of terrible. What we’ve done to mitigate this is we’ve got this idea of live coding. I’ve been in the Quarkus team for about two years. When I joined the team, I always called live coding, hot reload. Every time my colleagues would get really annoyed with me, and they’d be like, it’s not hot reload, it’s different from hot reload. I think I now understand why. We have three levels of reload, and the framework, which knows a lot about your code, because so much excitement is happening at build time, it knows what the required level of reload is. If it’s something like a config file, we can just reload the file, or if it’s something like CSS or that kind of thing. If it’s something that maybe affects a little bit more of the code base, we have a JVM agent, and so it will do a reload there. It will just dynamically replace the classes.
Or, if it’s something pretty invasive that you’ve changed, it will do a full restart. You can see that full restart took one second, so even when it’s completely bringing the whole framework down and bringing it back up again, as a developer, you didn’t have to ask it to do it, and as a developer, you probably don’t even notice. That’s cool. I think this is a really nice synergy here, where, because it starts so fast, it means that live coding is possible. Because as a developer, it will restart, and you’ll barely notice. I think this is really important, because when we think about the software development life cycle, it used to be that hardware was really expensive and programmers were cheap.
Now, things have switched. Hardware is pretty cheap. Hardware is a commodity, but developers are really expensive. I know we shouldn’t call people resources, and people are not resources, but on the other hand, when we think about a system, people are resources. Efficiency is making use of your resources in an optimum way to get the maximum value. When we have a system with people, we need to make sure that those people are doing valuable things, that those people are contributing, rather than just sitting and watching things spin.
How to Make People Efficient
How do you make people efficient? You should have a programming language that’s hard to get wrong, idiot proof. You want strong typing and you want garbage collection. Then, it’s about having a tight feedback loop. Whether you’re doing automated testing or manual testing, you really need to know that if you did get it wrong despite the idiot proofing, you find out quickly. Then, typing is boring, so we want to do less typing. Java gives us those two, the strong typing and the garbage collection. I just showed that tight feedback loop. What about the typing? With Quarkus, we’ve looked at the performance, but then we’ve also really tried to focus on developer joy and making sure that using Quarkus is delightful and fast. One of the things that we do to enable this is indexing. Indexing seems like it’s actually just a performance technique, but we see it gives a lot of interesting benefits in terms of the programming model.
Most frameworks, if it’s doing anything framework-y and non-trivial, it needs to find all of the classes. It needs to find all of the interfaces that have some annotation, because everything is annotations, because we’ve learned that everything shouldn’t be XML. You also really often have to find all of the classes that implement or extend some class. Annoyingly, even though this is something that almost every Java library does, Java doesn’t really give us a lot of help for this. There’s nothing in the reflection package that does this. What we’ve done is we have a library called Jandex, which is basically offline reflection. It’s really fast. It indexes things like the annotations, but it also indexes who uses you. You can start to see, this could be quite useful.
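A small sketch of what querying a Jandex index can look like, assuming an index has already been built (Quarkus builds one for you); the annotation and interface names here are just examples:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Collection;

import org.jboss.jandex.AnnotationInstance;
import org.jboss.jandex.ClassInfo;
import org.jboss.jandex.DotName;
import org.jboss.jandex.Index;
import org.jboss.jandex.Indexer;

class JandexExample {

    static Index buildIndex(InputStream... classFiles) throws IOException {
        Indexer indexer = new Indexer();
        for (InputStream classFile : classFiles) {
            indexer.index(classFile); // reads the .class bytes, no class loading
        }
        return indexer.complete();
    }

    static void queries(Index index) {
        // "Find everything annotated with @Entity" without classpath scanning.
        Collection<AnnotationInstance> entities =
                index.getAnnotations(DotName.createSimple("jakarta.persistence.Entity"));

        // "Find every implementation of this interface."
        Collection<ClassInfo> handlers =
                index.getAllKnownImplementors(DotName.createSimple("com.example.Handler"));
    }
}
```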
What kind of things can we do with the index? What we can do is we can go back and we can start challenging more assumptions about what programming looks like, and we can say, what if developers didn't have to do this and that, and this and that? As an example, a little example, I always find it really frustrating when I'm doing logging that I have to initialize my logger, and I have to say, Logger.getLogger, whatever the call is, and tell it what class it's in. I only half the time know what class I'm programming in, and I get this wrong so often because I've cut and paste the declaration from somewhere else.
Then there's this mistake in the code base, and the logging is wrong. I was like, why do I have to tell you what class you're in when you should know what class you're in, because you're the computer, and I'm just a stupid person? What we've done with Quarkus is exactly that. You don't have to declare your logger. You can just call the static method Log.info, with a capital L, and it will have the correct logging with the correct class information. This is so little, but it just makes me so happy. It's so nice. I think this is a good general principle of like, people are stupid and people are lazy. Don't make people tell computers things that the computer already knows, because that's just a waste of everybody's time, and it's a source of errors. When I show this to people, sometimes they like it, and go, that's cool.
Sometimes they go, no, I don't like that, because I have an intuition about performance, I have an intuition about efficiency, and I know that doing that kind of dynamic call is expensive. It's not, because we have the Jandex index, so we can, at build time, use Jandex to find everybody who calls that Log class, inject a static field in them, and initialize the static field correctly. Because it's done at build time, you don't get that performance drag that you get with something like aspects. Aspects were lovely in many ways, but we all stopped using them, and one of the reasons was the performance of them was a bit scary. We assume that we can't do this thing that we really want to do because we assume it's expensive; it's not anymore. It gets compiled down to that. You can see that that is pretty inoffensive code. I don't think anybody would object to that code in their code base.
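Roughly, the experience looks like this. The class is made up, and the second block is only an approximation of what the transformation amounts to, not literal generated output:

```java
// What you write: no logger field declared anywhere.
import io.quarkus.logging.Log;

public class GreetingService {
    public String greet(String name) {
        Log.info("greeting requested");
        return "Hello " + name;
    }
}
```

```java
// Roughly what the build-time transformation amounts to: a correctly-named
// logger, injected for you (illustrative, not the literal bytecode).
import org.jboss.logging.Logger;

public class GreetingService {
    private static final Logger log = Logger.getLogger(GreetingService.class);

    public String greet(String name) {
        log.info("greeting requested");
        return "Hello " + name;
    }
}
```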
Let’s look at a more complex example. With Hibernate, obviously, Hibernate saves you a great deal of time, but you still end up with quite a bit of boilerplate in Hibernate, and repeated code. Things like, if I want to do a listAll query, you have to declare that for every entity. It’s just a little bit annoying. You think, couldn’t I just have a superclass that would have all of that stuff that’s always the same? What we can do with Hibernate, if you have your repository class, what we can do is we can just get rid of all of that code, and then we can just have a Panache repository that we extend.
That's the repository pattern, where you have a data access object because your entity is a bit stupid. For me, I find an active record pattern a lot more natural. Here I just have my entity, and everything that I want to do with my entity is on the entity. That's normally not possible with normal Hibernate, but with Hibernate with Panache, which is something that the Quarkus team have developed, you can do that. Again, you've got that superclass, so you don't have to do much work, and it all works. One interesting thing about this is it seems so natural. It seems like, why is this even hard?
Of course, I can inherit from a superclass and have the brains on the superclass. With how Hibernate is working, it’s actually really hard. If I was to implement this from scratch, I might do something like, I would have my PanacheEntity, and then it would return a list. The signature can be generic. It’s ok to say, it just returns a list of entities. In terms of the implementation, I don’t actually know what entity to query, because I’m in a generic superclass. It can’t be done, unless you have an index, and unless you’re doing lots of instrumentation at build time. Because here what you can do is you see the superclass as a marker, and then you make your actual changes to the subclass, where you know what entity you’re talking to. This is one of those cases where we broke the tradeoff that machine efficiency of having the index enabled the human efficiency of the nice programming model.
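A sketch of the two styles with Hibernate ORM with Panache; the Person entity is invented for illustration, and package names vary with the Quarkus version:

```java
// (Person.java) Active record style: the entity carries its own data access,
// inherited from PanacheEntity (persist, listAll, findById, and so on).
import java.util.List;

import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Entity;

@Entity
public class Person extends PanacheEntity {
    public String name;

    public static List<Person> findByName(String name) {
        // The build-time instrumentation makes this query about Person,
        // even though the machinery lives on the generic superclass.
        return list("name", name);
    }
}
```

```java
// (PersonRepository.java) Repository style, if you prefer a separate DAO:
// listAll(), findById(), persist() and friends come as defaults from the interface.
import io.quarkus.hibernate.orm.panache.PanacheRepository;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class PersonRepository implements PanacheRepository<Person> {
}
```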
Some people are probably still going, no, I have been burned before. I used Lombok once, and once I got into production, I knew that magic should be avoided at all cost. This is something that the Quarkus team have been very aware of. When I was preparing for this talk, I asked them, under the covers, what’s the difference between what we do and something like Lombok? Several of the Quarkus team started screaming. They know that, with this, what you want is you want something that makes sense to the debugger, and you want something where the magic is optional. Like that logging, some of my team really like it.
Some of my team don’t use it because they want to do it by hand. Panache, some people really like it. Some of the team just use normal Hibernate. All of these features are really optional. They’re a happy side effect. They’re not like the compulsory thing. I think again, this is a question of efficiency. What we see with a lot of these frameworks, or some of these low-code things, is they make such good demos, but then as soon as you want to go off the golden path, off the happy path, you spend so long fighting it that you lose any gain that you maybe had from that initial thing. Really, we’ve tried to optimize for real use, not just things that look slick in demos.
The Common Factor Behind Performance Improvements
I’ve talked about a few of the things that we do, but there’s a lot of them. When I was preparing this talk, I was trying to think, is there some common factor that I can pull out? I started thinking about it. This is my colleague, Sanne Grinovero. He was really sort of developer zero on Quarkus. He did the work with Hibernate to allow Hibernate to boot in advance. This is my colleague, Francesco Nigro. He’s our performance engineer, and he does some really impressive performance fixes. This is another colleague, this is Mario Fusco. He’s not actually in the Quarkus team. He tends to do a lot of work on things like Drools, but he’s given us some really big performance fixes too.
For example, with Quarkus and Loom, so virtual threads, we had really early support for virtual threads back when it was a tech preview. What we found was that virtual threads, you hope that it’s going to be like a magic go faster switch, and it is not, for a number of reasons. One of the reasons is that some libraries interact really badly with virtual threads, and so some libraries will do things like pinning the carrier thread. When that happens, everything grinds to a halt. Jackson had that behavior. Mario contributed some PRs to Jackson that allowed that problem in Jackson to be solved, so that Jackson would work well with virtual threads.
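A minimal illustration of the kind of pattern that pins a carrier thread; this is not Jackson's code, just the shape of the problem on the JDK versions of that era:

```java
import java.io.InputStream;

class PinningExample {
    private final Object lock = new Object();

    byte[] readAll(InputStream in) throws Exception {
        // Blocking while holding a monitor could prevent the virtual thread
        // from unmounting: the underlying carrier (platform) thread stays
        // pinned for the whole blocking operation, so other virtual threads
        // waiting for that carrier make no progress.
        synchronized (lock) {
            return in.readAllBytes(); // blocking I/O inside synchronized
        }
    }
}
```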
I was looking and I was like, what is that common factor? What is it? I realized they’re Italian. This is a classic example of confirmation bias. I decided the key to our performance was being Italian. Without even realizing it, I looked for the Italians who’d done good performance work. When we do a Quarkus release, we give out a T-shirt that says, I made Quarkus. On the most recent release, we gave out 900 T-shirts. There’s a lot of contributors. A lot of people have done really cool engineering on Quarkus, only some of them were Italian. You don’t have to be Italian to be good at performance, in case anybody is feeling anxious. The title of this talk is Italian graft, and so being Italian is optional, but the graft part is not. This stuff is work. When you’re doing that kind of performance optimization, you have to be guided by the data, and you have to do a lot of graft. You measure, because you don’t want to do anything without measuring.
Then you find some tiny improvement, and you shave it off. Then you measure and you find some tiny improvement, and you shave a little bit of time off. You measure, and then you find some tiny improvement. This was very much what we saw in this morning’s talk as well. It was in C rather than Java, but it was the same thing. If I’m going to profile, then I’m going to find some tiny optimization that I’m going to do. You keep going and you keep going. It’s not easy, so it needs a lot of skill, and it also needs a lot of hard work. I mentioned Francesco, our performance engineer, and he really is like a dog with a bone. When he sees a problem, he’ll just go and go. I think a lot of the rest of us would have gone, “Ooh”, and he just keeps going. He has this idea that what he offers to the team is obsession as a service. You need people like that.
I want to give one example. We run the TechEmpower benchmarks, and what we found was we were behaving really unexpectedly badly when there was a large number of cores. With a small number of cores, our flame graph looked as we hoped. When it was a lot of cores, all of a sudden, our flame graph had this really weird shape, and there was this flat bit, and we're like, what's going on there? Why is no work happening in this section of the flame graph? Again, many people would have gone, what a shame. To find out, Francesco and Andrew Haley, another colleague, read 20,000 lines of assembler. What they found was worth it. They found the pattern that was causing the scalability problem, and the pattern was checking if something is an instanceof.
At this point, hopefully some of you are screaming as well and going, I think there’s a lot of that. That’s not a weird, obscure pattern, that is a very common pattern. Once Franz had found the problematic pattern, he started to look at what other libraries might be affected. We found Quarkus was affected. Netty was affected. Hibernate was affected. Camel was affected. The Java Class library was affected. This was a really big, really bad bug. He found actually that there was an existing bug, but nobody had really realized the impact of it. I think this is partly because it happens when you’ve got like 32 cores, when you’ve got like 64 cores. We’re now much more often running at that kind of scale. It’s a cache pollution problem.
The problem is, when you do this check, the cache that is used for this check is shared across all of the cores. If you've got a lot of code running in parallel, basically the cache just keeps getting corrupted, and then you just keep having to redo the work. This was a bad problem. This was not like saving 2%. This is one of the TechEmpower benchmarks, and this was running before the fix and running after the fix. You can see we went from 1.8 million requests per second to 5.8 million requests per second. That's just a small benchmark, but it was a huge improvement.
What we did was, Franz wrote a little tool, because not every instanceof call is problematic. It depends on various factors. He wrote a tool that would go through and detect the problematic cases. We ran it through the whole code base, and we started doing the fixes. It's very sad, because this is fixed in the JVM now, but only at the head of development, so people won't get the benefit of the fix for quite a while. We had code that was, for example, like this. Then after the fix, you can see we had to do all of this stuff.
Again, you don't necessarily need to read the code, but you can just see that the throughput is a lot higher, but the code is a lot longer, so it's again exactly the same as Alan's talk. You have this tradeoff. I love this one, because the developer did the PR and then basically apologized in it for the code: I'm not a fan of the fix, it's not idiomatic, it's difficult to maintain, but it gives us so much more throughput that we have to do it. Again, it's that tradeoff of machine efficiency against human efficiency. Only in this case, it's not everybody else's efficiency, it's just my team's efficiency. This is what Anne was talking about when she said, you really want your platform to be doing the hard, grotty, nasty work so that you can have the delightful performance experience. We do the nasty fixes so that hopefully other people don't have to.
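As a schematic illustration of the kind of rewrite being described, with hypothetical types rather than the real Quarkus or Netty code:

```java
interface Frame { }
final class TextFrame implements Frame { }
final class BinaryFrame implements Frame { }

class Dispatcher {

    // Before: a hot instanceof check against an interface. Under heavy
    // parallelism, on affected JDKs, this kind of type check keeps rewriting
    // a shared per-class cache, and every core pays for the contention.
    void handleBefore(Object msg) {
        if (msg instanceof Frame) {
            // ...
        }
    }

    // After: check the concrete classes we actually expect first, so the hot
    // path usually never reaches the problematic check. Longer and less
    // idiomatic, but it avoids the cache pollution.
    void handleAfter(Object msg) {
        Class<?> c = msg.getClass();
        if (c == TextFrame.class || c == BinaryFrame.class || msg instanceof Frame) {
            // ...
        }
    }
}
```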
Another thing to note about efficiency is it's not a one-time activity. It's not like you can have the big bang and go, yes, we doubled the throughput, or we halved the cost. Life happens, and these things just tend to backslide. A while ago, Matt Raible was doing some benchmarking, and he said, this version of Quarkus is much slower than the previous version. We thought, that's odd. That's the opposite of what we hoped would happen. Then we said, "Are we measuring our performance?" Yes. "Don't we look to see if we're getting better or worse?" Yes. "What happened?" What it is, is, if you look at that bit of the graph, is the performance getting better or worse here? It looks like the performance is getting much better. If you look at it over the longer picture, you can see that actually it's probably getting a little bit worse, because we had this really big regression that masked a series of smaller regressions.
We had a change detection algorithm that was parametric, and it meant that we missed this regression. We did the work and we fixed it, and we fixed a lot. It was very cool. That was another engineer, who is not Italian, called Roberto Cortez. One of the things that Roberto did, which just makes me smile, is, again, about the assumptions. We do a lot of string comparison in config. Config tends to be name-based, and the way any normal human being would do a string comparison is you start at the first character and work forwards. With config names, though, the interesting bit is always at the end. Roberto worked out that if you compare from the other end, the config handling is much faster. I would recommend you all to have a Francesco, to have a performance engineer. You can't have Francesco, he's ours, but you need to find your own. It does need investment.
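A minimal sketch of the idea, not Roberto's actual implementation:

```java
// Compare two config keys starting from the last character. Config names often
// share long prefixes ("quarkus.datasource..."), so differences tend to show up
// at the end; scanning backwards fails fast on non-matching keys.
static boolean equalsFromEnd(String a, String b) {
    if (a.length() != b.length()) {
        return false;
    }
    for (int i = a.length() - 1; i >= 0; i--) {
        if (a.charAt(i) != b.charAt(i)) {
            return false;
        }
    }
    return true;
}
```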
I’ve got one last tradeoff I want to talk about. This is the efficient languages track, but we really do have a green focus here. There’s this classic tradeoff with sustainability between doing the stuff that we want to do and saving the planet. In general, historically, we have always tended to do the stuff we want to do rather than save the planet. I think there is some hope here. I’ve started talking about something called the vrroooom model. Naming is the hardest problem in computer science, because I didn’t think to actually do a Google before I did the name. It turns out there is a vroom model, which is a decision model. That’s with a slightly different spelling than I did. I did 3r’s and 2o’s and stuff, which was another terrible mistake.
If you Google vrroooom, it thinks you want the conventional spelling, but then it says, would you like to search instead for the vrroooom model with the idiosyncratic spelling? If you click on that, what do you think happens? The hope is that you get my stuff. The reality is rather different. Everything here, it's all about cars, and hot babes. That is what you get if you search for the vrroooom model. You can even see there, that's a Tesla advert. It says sexy above it. It's all about sexy cars. Naming, hardest problem in computer science. I should have thought about that.
My vrroooom model, the one that doesn’t involve sexy cars, I really started thinking about this when I looked at the paper. We were talking about this before, and Chris said, you know that stupid paper that compares the programming languages, and there’s a lot of problems with this paper. What I want to show you is not the details of it, but something that I noticed, which is, it has a column for energy and it has a column for time, and they look kind of almost the same.
If you plot it, you can confirm that this trend line is basically straight. It means languages that go fast have a low carbon footprint. We see this with Quarkus. With Quarkus on this graph, we benchmarked the energy consumption of Quarkus native, Quarkus on JVM, the other framework on JVM, the other framework on native. What we did was we had a single instance, and we just fired load at it until it ran out of throughput. The shorter lines are where it ran out of throughput earlier. Lower is better. Lower is the lower carbon footprint. You can see that there’s, again, this really strong correlation. Quarkus on JVM has the lowest carbon footprint of any of these options because it has the highest throughput. It’s the win-win again, that you get to have the really fast language and have the nice programming model and also save the world. We beat the tradeoff.
I just love this that instead of having this opposition between machine efficiency and human efficiency, the one helps us gain the other. If you start with efficient languages, you really need to consider both machine efficiency and human efficiency. When you’re looking at your machine efficiency, you need to challenge your assumptions. Only do work once, obviously. Move work to where it hurts the least. Index. Indexes are so cheap, they’re so good, they solve so many problems. Unfortunately, this isn’t a one-off activity. You do need that continued investment in efficiency. Then, when you look at your human efficiency again, same thing, you need to challenge your assumptions. You need to get those feedback loops as small as you can. Don’t make people tell the computer what the computer already knows, because that’s a waste of everybody’s time.