Transcript
Holly Cummins: I work for Red Hat. I help build Quarkus. When I was preparing this talk, I realized that part of my job now is really thinking about efficiency. Actually, almost all through my career, I’ve ended up working on efficiency in one way or another. When I started out my career, I was a performance engineer, and I worked on the JVM garbage collection. Of course, the aim of a performance engineer is to make things more efficient. I then switched, and I worked as a consultant for many years. I really thought about methodology. I was coaching teams in Lean, coaching teams in extreme programming, with the aim of making them more efficient.
Now I’ve come back round, and I’m back in the Java space, and I work on Quarkus — Quarkus is already extremely efficient — but with the aim of making Quarkus even more efficient. I realized quite a while ago, but especially when I started doing consulting, that our industry has a pretty terrible problem with waste. I heard a story from another team, and a client had come back to them complaining about broken provisioning software. You have to use a bit of imagination for this, because it was about 15 years ago, before the cloud, which is hard to even remember. We had sold them this amazing software that would allow them to provision a pre-cloud instance in 10 minutes, which now sounds absolutely terrible.
At the time, it was amazing. It was really cool. The client was not impressed, because they did not get this 10-minute provisioning time that we had sold them. Instead, they would start provisioning an instance, and three months later, the instance would appear. They came to us to complain. We did some investigation, and we realized that the problem was not the technology. The problem was that they had put an 84-step pre-approval process in front of this 10-minute provision time. In fact, it was a miracle that it got provisioned even in three months.
It was just so maddening and frustrating to see this horrible waste. Because when we think about efficiency, what we’re actually thinking about is waste, and trying to eliminate waste, or trying to reduce waste. This is something that we’ve been thinking about for quite a while, actually. We started really thinking about efficiency and trying to quantify efficiency in the 1700s. This research started when we started working on things like steam engines. People started working on steam engines. The way a steam engine works, or any kind of engine, is you put energy in, and you get energy out at the other end. What you’ve put in is heat energy for a steam engine, and what you get out is kinetic energy. Your engine is moving forward.
The moving forward is the useful part of the energy that comes out. You also get a whole bunch of other energy coming out, which is just more heat. That is the waste of the process. This idea of thinking about efficiency in this way was so good that it moved from the physicists to the business folks. In the 1800s, people started thinking about processes in the same way. Can we quantify how efficient a process is? Can we quantify how much waste there is in this process? When you have any business process, what happens is you put time and money in, and then it churns through the bureaucracy. What you get out at the other end, hopefully, is value. You also get waste. Here, what we’re thinking about isn’t machines, but people. The principles are pretty similar.
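As a rough sketch of the engine picture above, efficiency can be written as the useful share of the energy that goes in. The symbol names here are illustrative assumptions, not notation from the talk.

```latex
\eta = \frac{E_{\text{useful}}}{E_{\text{in}}},
\qquad
E_{\text{in}} = E_{\text{useful}} + E_{\text{waste}}
```

For a steam engine, the useful energy is the kinetic energy and the waste is the leftover heat. For a business process, the same ratio can be read with time and money as the input and value as the useful output.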
Then this idea was so good that in the 1960s, we applied it to software, to computers, to these new kinds of machines. The principle is the same. You have your computer, and you put in time and electricity and hardware and programmer tears. Then what comes out at the end is answers which may or may not be correct. There’s quite a lot of waste in that process as well. This is bad. Waste is really bad. Waste is killing the planet, which is where we come to the sustainable software side. It’s killing the planet in all sorts of ways. E-waste is killing the planet just because of the volume of stuff that we’re having to try and dispose of. Energy waste is killing the planet by driving climate change. What causes energy waste in the context of software is slow code. Our slow code is killing the polar bears. Our slow code has some other consequences as well that are maybe more visible. I heard this story from Sara. I’ve reused it because I liked it.
A while ago, LinkedIn were doing user research on a new feature. They tested it around the LinkedIn offices, and it worked great. Then they went out to India with their user research team to this city called Nashik, which is not a small city. What they found was that when they watched users using this new feature, the users were not actually using the new feature. The users were sat watching a spinner because the connectivity was not good enough to allow this huge bloated software to come down the wire onto the device. What they had made was a system that was so inefficient, it was actually useless for part of the audience, which is pretty terrible in business terms and in climate terms as well.
The Cost of Zombie Workloads
There’s another problem with our code, which is, some of the code we run is just slow. A lot of the code we’re running, we don’t even remember that we wrote. It’s still there, just trundling away. This is called zombie code, or zombie servers. This isn’t a small problem. The best estimate that we have says about 25% of servers are doing no useful work. The bar here for useful work is high. This isn’t that these servers are watching cat videos or something like that. These are servers where there’s no data going in or out. The average server is running at about 12% to 18% of capacity. You have this whole server, this marvel of engineering, and it’s only running at 12% of what it could be doing. This is terrible. Of course, there’s data. Some data is used. Quite a lot of data is stored, and then never used again. That has a cost. Unused data is killing the planet. We can quantify this as well. The best estimate we have says that 68% of data that’s stored is never used again.
The challenge, of course, is knowing which 68%. That’s why we don’t just throw 68% of the data away before we store it. There are things we can do. The cost of all this, the consequences of all this, obviously, there’s a huge financial consequence, storing data costs money, running servers costs money. It uses a lot of electricity as well. Then there’s other things as well. Those computers that we provision so that they can do nothing at all, manufacturing them has a cost. It has a carbon cost. This cost is called embodied carbon.
As soon as we manufacture a computer, we really want to be making as much use of it as we can. We want to try and avoid manufacturing future computers unless we’re going to get really good use out of them. Of course, if these computers are running anywhere that needs cooling, such as a data center, there is an enormous quantity of water that gets used just to cool these data centers to support the computation that’s being run. Then, finally, I mentioned already, there’s all of the e-waste, because once we finish with the computer, we have to dispose of it. At that point, it is garbage.
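The bar for a zombie server described above (no data going in or out) could be checked along these lines. This is a minimal sketch: the `ServerTraffic` record, the field names, and the sample values are all hypothetical, and real traffic figures would have to come from whatever monitoring you already run.

```java
import java.util.List;

final class ZombieFinder {
    // Hypothetical traffic totals for one server over some observation window.
    record ServerTraffic(String name, long bytesIn, long bytesOut) {}

    // Flags servers with no data going in or out at all.
    static List<String> findZombieCandidates(List<ServerTraffic> samples) {
        return samples.stream()
                .filter(s -> s.bytesIn() == 0 && s.bytesOut() == 0)
                .map(ServerTraffic::name)
                .toList();
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        List<ServerTraffic> samples = List.of(
                new ServerTraffic("billing-api", 120_000, 98_000),
                new ServerTraffic("old-demo", 0, 0));
        System.out.println(findZombieCandidates(samples));
    }
}
```

A server flagged here is only a candidate. Someone still has to confirm that nobody needs it before it is switched off.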
Solutions/Fixes for Waste – LightSwitchOps, and Faster Code
That’s pretty bleak. The good news is almost all of this, we actually know how to fix. We just need to do it. One solution that I’m quite fond of, I call LightSwitchOps. This is a really good solution to zombies. I say it’s a really good solution because LightSwitchOps is something that I came up with.
For a while, I was going around and I’d say LightSwitchOps, and people would go, “Hmm”. Then, eventually, people started saying LightSwitchOps to me, which was quite exciting. If you read Sara’s book, which I would recommend, LightSwitchOps is in Sara’s book. It’s, like, in print. That was very exciting because I thought leadered, which is cool. The idea with LightSwitchOps is that when we leave a room, we never have this existential dread of, what if I turn the light off and it doesn’t come back on again? We know it’s going to work. With a server, we have tried turning it off, and we know that, in general, it doesn’t come back on again.
A lot of us have this muscle memory that we must avoid turning servers off if we can. This is something we can fix. If we’re moving towards a cloud-native development model, which hopefully we all did 10 years ago, we’ve already fixed this. We have these systems that have the idempotency. They have the resiliency. We can shut them off, hopefully. Now all we need to do is put the infrastructure in place to support shutting them off. This can actually be really simple. For example, you can just do it with shell scripts. I heard a story from someone, she was working in the IT department in a Belgian school, and she just wrote some shell scripts to turn their computers off overnight, and it saved the school €12,000. It’s a huge amount of money for something really quite simple. I say really quite simple. I think she did have to use Cron in order to schedule the shell scripts, so maybe it was not so simple. There are alternatives.
Again, that’s in a school. You can do this in a business context as well. If you’re not using your servers 24-7, you can reduce your cloud bill potentially by 30% just by having that automation. If you don’t want to use Cron, there are options. There’s a project that has come out of AXA France, and it’s called Daily Clean. The idea is that you have a GUI so that you don’t have to remember the Cron syntax. You can set it up however you want, and then it just has a little pod that will run in your Kubernetes cluster and take care of turning things off and on. The Daily Clean, it’s open source. There are also now commercial alternatives that are starting to come in this space, which I find exciting. For example, there’s a product called Turn it Off. I haven’t actually used it, but I’ve seen it come by on LinkedIn.
The idea is that you can install Turn it Off, and it will save you a great deal of money. Then they will take some of that money as the payment for saving you a great deal of money. That’s their business model. The benefits of these are really big. It saves energy. It saves money. If you’re thinking about security, of course, it also reduces your attack surface. It gets rid of those servers that are probably running obsolete software because they’re the ones that everybody forgot about. There’s huge benefits. It also tests your disaster recovery because if you’ve proved that you can turn a server off and bring it back on again, you’re in pretty good shape for disaster recovery.
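The overnight switch-off idea described above comes down to one decision: should this machine be on right now? Here is a minimal sketch of that decision, assuming weekday working hours of 07:00 to 19:00. Those hours, the class name, and the method name are assumptions for illustration; they do not come from the school story, Daily Clean, or Turn it Off.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;

final class LightSwitchSchedule {
    // Assumed working hours; the talk does not give specific times.
    static final LocalTime ON_AT = LocalTime.of(7, 0);
    static final LocalTime OFF_AT = LocalTime.of(19, 0);

    // True on weekdays from ON_AT (inclusive) until OFF_AT (exclusive).
    static boolean shouldBeRunning(LocalDateTime now) {
        DayOfWeek day = now.getDayOfWeek();
        boolean weekend = day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
        LocalTime time = now.toLocalTime();
        boolean withinHours = !time.isBefore(ON_AT) && time.isBefore(OFF_AT);
        return !weekend && withinHours;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        System.out.println(shouldBeRunning(now) ? "keep running" : "safe to switch off");
    }
}
```

A scheduler such as Cron would call a check like this periodically and then start or stop the workload. The actual start and stop commands depend on the platform and are left out here.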
There’s another thing that we can do to reduce the carbon footprint of our software, and this is to make our software faster. This is very similar to the resource utilization proxy that Sara talked about. What we’re talking about here is a proxy, but instead of calling it a proxy, which is boring, I decided I would call it the vrroooom model because that was exciting. I didn’t really think this through. I should have just called it a proxy because if you Google for vrroom model, and you can see I was being really clever, and I put a non-deterministic number of Rs and Os in there. If you are naming something, never give it a non-deterministic name. The SEO is absolutely awful.
If you type it with a non-deterministic number of Rs and Os, Google will say, “Don’t be silly. You must mean Vroom spelt normally”. That doesn’t help you because what I didn’t think to check before I came up with this name was, is there already a Vroom model? There is already a Vroom model. The Vroom model was invented by Dr. Vroom. It hadn’t occurred to me that there was a Dr. Vroom who had invented a model. Naming is the hardest problem in computer science. It gets worse though, because you’ll notice that Google gives you the little link that says, maybe you do want to spell it with all the Rs and Os.
If you search for the spelling with the Rs and Os, what you get is this. I have that text banner there because otherwise I would be violating the conference code of conduct. It is not good what you get if you search for vrroom model. Naming is the hardest problem in computer science. Leaving all that aside, what is the vrroooom model? The vrroooom model was really inspired by looking at this paper that did the rounds of the internet a couple of years ago, which was looking at various programming languages, and seeing which one uses the most energy. Everybody loved this paper because what it showed was that your favorite programming language was good, and other people’s favorite programming language was bad, which is just what everybody wants. You can see that Python is quite near the bottom. PHP is quite near the bottom. I’m a Java person. Java’s quite near the top. Of course, it just goes to show that I’m right and everybody else is wrong.
The more important thing to notice here is that they’ve got three columns. The first is the energy used, and the second is the total time taken. These are normalized. If you squint, you can see that those two columns are basically almost the same. Instead of just squinting and guessing at the numbers, which is what I did the first time, you can actually be a bit more systematic and you can plot it. You can see that that line is almost straight. This is showing that the programs that ran really slowly used more energy. The programs that ran nice and quick used less energy. This is useful when choosing a programming language, but it’s also really useful just for all the things that you would normally do with that programming language. It means that if you make it faster, you’re also saving the world.
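Instead of squinting at the two columns, the "almost straight line" observation above can be checked by computing a correlation coefficient. This sketch uses the standard Pearson formula. The class and method names are hypothetical, and no figures from the paper are reproduced here; you would pass in the normalized energy and time columns yourself.

```java
final class VrroooomCheck {
    // Pearson correlation coefficient between two equally sized samples.
    // Returns a value near 1.0 when the points lie close to a rising straight line.
    // If either sample has zero variance the result is NaN.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }
}
```

Calling `VrroooomCheck.pearson(timeColumn, energyColumn)` with one value per language gives a single number to compare against the eyeballed line.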
I want to give a little bit of a case study about Quarkus, just to look at some of the things that we did and how they impacted this. Because one of the things about performance improvement is that often performance can be a lot faster than it is. We just need to challenge the assumptions. We need to challenge the things that we had been taking for granted. With Java, for example, Java had been optimized for dynamism. Java is incredibly dynamic. You can change all of the libraries, you can change all of the dependencies while your application is running, and it will continue to work through the magic of reflection.
The problem is that that is how we did ops 15, 20 years ago. That is not how we do ops now, because we have CI/CD. You really shouldn’t be SSHing into your production server in order to patch the application with a new copy of some JARs. No one is doing this. What we are doing instead is we have a CI/CD pipeline. We’re probably running in containers. When we need to update something, we just do a new run of the pipeline. We don’t try and patch the live system. We still have all this dynamism in the live system to support that patching. This dynamism and having this super dynamic runtime in a container is really pointless. It’s worse than pointless because it has a cost. We’re paying a dynamism tax for dynamism that we’re not even using.
If you think about what a Java application does and how it starts, the packaging stage, the build stage that we run in our pipeline is actually pretty small. It generates bytecode. It puts it into an archive. What happens when the application starts? There’s a ton of stuff that happens when the application starts. All of those config files that we packaged in, all of the XML, the YAML, they all get parsed. Then what happens is all of the libraries will scan the classpath, see what else they’re coexisting with, try and load classes to see what features they should be enabling and disabling. Then the libraries will build a metamodel of the world. Then eventually after all of that, it can do the actual start, so things like thread pools, I/O, so on. At that point, we’re ready to do work.
If we start the application more than once, which hopefully we will, then we have to do all of that work every single start, which is wasteful. It’s a bit like this goldfish model of an application that it forgets what it just did when it started the previous instance because that knowledge isn’t saved anywhere. This impacts all sorts of things. For example, with Hibernate, Hibernate will try and auto-wire to your transaction manager. What it will do is it will use Java reflection. It will reflectively load the most popular JTA implementation.
Then if that doesn’t work, it will load the second candidate. If that doesn’t work, it will load the third candidate. It will keep going. The number of ones that it tries to load if it’s not successful is 129. Every single time it fails, it will throw a ClassNotFoundException. This is pretty wasteful because you’re not going to change your JTA implementation live in production without running it through a CI/CD pipeline because it’s 2025. That affects the startup time. This dynamism, it also affects the runtime throughput. For example, the way that the JVM dispatches methods is that normally you will have an interface, and then you’ll have a whole bunch of possible implementations of that interface that are all on the classpath. If they’re on the classpath, what the JVM has to do is something called megamorphic dispatching, where it has to look at each one and go, are you the right one? No, you’re not the right one.
Then eventually it finds the right one and does the dispatch. If you have a tighter classpath, then what you can do instead is what’s called monomorphic dispatching, where it is fast. You have this throughput drag all of the time just from having stuff on your classpath that you weren’t really using. We can fix this. If instead of initializing at runtime, you initialize at build time, what you get is you just get rid of all of that performance drag. The only thing that you need to do at runtime is starting your thread pools and so on.
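The reflective probing described above for the transaction manager follows a general pattern that can be sketched in a few lines. This is not Hibernate's or Quarkus's actual code. The class name `TransactionManagerLookup`, the `ProbeResult` record, and the candidate class names are hypothetical, and `java.lang.String` stands in for an implementation that happens to be on the classpath.

```java
import java.util.List;
import java.util.Optional;

final class TransactionManagerLookup {
    // Which candidate loaded (if any) and how many lookups failed before it.
    record ProbeResult(Optional<String> found, int failedAttempts) {}

    // Tries each candidate class name in order until one can be loaded.
    // Every miss costs a ClassNotFoundException.
    static ProbeResult probeAtStartup(List<String> candidateClassNames) {
        ClassLoader loader = TransactionManagerLookup.class.getClassLoader();
        int failures = 0;
        for (String name : candidateClassNames) {
            try {
                // Load without initializing; we only want to know whether it exists.
                Class.forName(name, false, loader);
                return new ProbeResult(Optional.of(name), failures);
            } catch (ClassNotFoundException e) {
                failures++; // not on the classpath, try the next candidate
            }
        }
        return new ProbeResult(Optional.empty(), failures);
    }

    public static void main(String[] args) {
        List<String> candidates = List.of(
                "com.example.tx.FirstChoiceTransactionManager",  // hypothetical, absent
                "com.example.tx.SecondChoiceTransactionManager", // hypothetical, absent
                "java.lang.String");                             // stand-in that exists
        System.out.println(probeAtStartup(candidates));
    }
}
```

When this runs at startup, every instance pays for the failed lookups on every start. The build-time alternative runs the probe once in the pipeline and bakes the answer into the packaged application, so a start only has to use the recorded class.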
If you start repeatedly, you get that saving on every start, because you’re not doing all of the same work every single time, which reduces waste, which is good. This efficiency and other things that are efficient in Quarkus, we can translate directly from improved throughput because of these design choices to reduced energy impact, reduced carbon impact. What this chart is showing is Quarkus running as a JVM application, Quarkus running as a natively compiled binary, and another JVM framework that many of you will be using, either compiled as a native binary or just running on the JVM.
What you’ll see here, the length of the line is how many transactions we could shovel in before it gave up and said, I’ve capped out my throughput. The shorter line means the lower max throughput. Quarkus on JVM could do 20,000 requests per second. The height of the line is how much carbon footprint there was. You can see that in every single case, if the line is shorter, the line is also higher. This is the vrroooom model in action again. There’s this really strong correlation between having better throughput and having lower energy footprint.
I really like this, because memory footprint, we tend to think of that as a tradeoff against throughput. With Quarkus, what it was able to do was it was able to have a smaller memory footprint, a smaller startup time, and a faster throughput. We broke that tradeoff. This is the double win, which is cool.
Then, of course, there’s the climate aspect as well. You may think this idea of the seesaw folding in half and breaking the tradeoff isn’t realistic. It turns out you can actually buy seesaws that fold in half. This is physically realistic as well as being technically true. There’s another interesting tradeoff that we managed to beat with Quarkus, which is again, normally you think of machine efficiency and human efficiency as being traded off against each other. You say, I could program in Rust, and my machine efficiency will be very high, but my human efficiency, maybe not so much. With Quarkus, a lot of the performance improvements that we made also enabled user experience improvements that made it more efficient to program with. Again, it’s this double win. By having more understanding of what is in the application space, we’re able to get rid of a whole bunch of boilerplate code, which is nice.
Another thing that we can do, because we do more at build time, we have to have live coding. We have to have a hot reload, because otherwise it wouldn’t be very nice to use. What that means is if you’re using the live coding in dev mode, you’re not wasting all of that machine energy to run the full build cycle and do the Maven verify, Maven run, whatever. Then you’re also saving your time, because you’re getting faster feedback. It’s this really nice thing, again, where it’s better for the human, and it’s better for the efficiency, and so it’s better for the planet. We also do targeted continuous testing, where we will run the tests continuously as you make changes.
If we ran every single test continuously as you made changes, that would be really terrible on any non-trivial suite. We use code coverage techniques to work out just which tests we need to run, run just those. It means that there’s much less energy running the full test suite when we don’t have to, but it’s also less dev time wasted waiting for that feedback. We call all of these together developer joy. It’s this double win, or triple win, or quadruple win, depending how you’re counting.
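The idea of using coverage to pick only the affected tests can be sketched as a lookup from tests to the classes they touched. This is not the Quarkus implementation. The map shape, the class name `TargetedTestSelector`, and the method name are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TargetedTestSelector {
    // coverage: test name -> classes that test exercised on an earlier run.
    // Returns the tests that touch at least one changed class, in sorted order.
    static Set<String> testsToRun(Map<String, Set<String>> coverage, Set<String> changedClasses) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            // disjoint() is true when the two sets share no element.
            if (!Collections.disjoint(entry.getValue(), changedClasses)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

With a coverage map recorded from a previous full run, a change to one class reruns only the tests that exercised it, and tests with no overlap are skipped.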
The point of all of this isn’t really about Quarkus, or it’s not that if you’re using Java, you should be using Quarkus, although obviously you should. It’s about, if you challenge those assumptions, all of a sudden, all sorts of things can become unlocked. Design investment in efficiency can give you pretty astonishing results. We find that the runtime cost is halved for a lot of our users. That’s a huge difference. It’s because of challenging these assumptions. The same assumptions are elsewhere, or there are other assumptions elsewhere in other languages, in other programming models. If we can find them and just question them, we can get these really big, really amazing savings.
The ‘AI-lephant’ in the Room
Of course, it’s not a tech conference in 2025 if we don’t talk about AI. AI has some interesting efficiency implications. One of them is, again, going back to that Quarkus programming model. We worked really hard to come up with this really elegant, really concise programming model. Now sometimes people say, I don’t need to care about the programming model because I can just have a really terrible API and the AI will do it. I don’t think so. There’s three quite serious problems with this. One is cost. Running these models costs a lot of money, and it costs a lot of carbon potentially, so we don’t want to be taking that cost unless we’re getting a really significant benefit. The other problem is correctness. I was in Australia recently, and so I got a bit overexcited about kangaroos.
I asked ChatGPT what the efficiency was if you generate your electricity in a very organic, bio-friendly way by having a kangaroo on a trampoline and having the kangaroo bounce on the trampoline to power the generator. It turns out the efficiency of this system is 15%. I knew I’d made it up, and I got such a convincing answer that I was a little bit worried that actually researchers at the University of Sydney had done a system that gave 15% efficiency through putting kangaroos on a trampoline. There is definitely a problem with correctness in some of these systems. We can work around this problem in some cases. I should say as well, just in case you go away and you say, what did you learn at QCon? Kangaroos on trampolines, it’s the future of energy. Kangaroos on trampolines is not the future of energy. Researchers did not do this. I just put it in the slides because it was funny.
The next problem is bloat, and I think this is more subtle, because correctness is really easy to detect: if the code doesn’t compile, you don’t commit the code. A while ago, I was experimenting and I got ChatGPT to write me a demo Quarkus app. It started producing code, and I was really pleased. I was just, “This is saving me a lot of time”. Then it continued producing code, and I was like, “Yes, it would have taken me ages to write all this code”.
Then it continued producing code, and then it continued producing code. Eventually I looked and I realized it would have taken me ages to write all this code, but I never would have, because the code it wrote doesn’t need to be there. We have this really beautiful, concise programming model, and it just went, let me give you as much code as possible. I think you’ve probably all seen this with your getters and setters, where it will put a comment on every getter and setter saying, this gets the thing. I don’t need a comment to say it gets the thing, because it’s obvious, because I can read. There’s this weight that comes into the output, because it is trying to produce as much code as possible. It’s like it’s being measured on lines of code.
The cost of this is actually a cost to us, because we spend a lot more time reading code than writing code. If we optimize for writing code at the expense of reading code, and we end up with these really flabby code bases that have all these pointless comments, and just pointless code, and duplication, then it means that although in the short term maybe we’re able to generate stuff quicker, in the long term this code base is going to be more expensive to work with. It seems like we saved time, but actually all we did is shift the work around from present us to future us. Sometimes that’s the right thing to do, but we just need to be aware of the tradeoff and we need to think about who we’re shifting the work to.
I think what we’ve ended up with then is this illusion of efficiency, where we have this really naive metric of like, look, it produced so much code, that’s awesome, but actually we’re measuring the wrong thing and so we’re coming to the wrong conclusions about what we should be optimizing. Again, this is something that we need to be aware of, but there are things that we can do to improve it. We have solutions. One of these is if we have smaller, fine-tuned models, they’re much more likely to produce something correct, and they’re much more likely to be optimized for the programming model that we’re using.
Again, if you have small models and RAG, you will get better output and the cost of running them is much lower. We can also just not throw the baby out with the bathwater. We’ve developed a lot of really good techniques. We can continue using them in a hybrid way. If you combine symbolic reasoning with an LLM, you will in general get much better output than if you just use the LLM on its own. Again, it’s cheaper because the cost of running this symbolic reasoning is pretty low. These illusions of efficiency, I mentioned AI, but they are everywhere. It is so easy to come to the wrong conclusion when we’re thinking about efficiency.
Limits To Efficiency
With efficiency, there’s a couple of traps that we can fall into. One is that we optimize the wrong thing, we have this illusion of efficiency. There’s another one which is interesting, which is that actually there’s a really fundamental limit to how efficient we can get. There are actually three things that limit efficiency. One is Jevons’ paradox. Jevons’ paradox is something that was in the news, but I’d been thinking about it for a while. Jevons’ paradox says that efficiency improvements can lead to increased consumption.
What this means, I always think of it as the highway paradox. Because when we widen roads, what we imagine is that we’re going to have the same number of cars on a six-lane motorway, and every time we need to go somewhere, it’s just going to be this really incredibly fast journey. Of course, what tends to actually happen when we widen roads is this. We make the road wider, and then the number of cars increases. It actually takes you exactly as long to get to the destination as it did before. It’s just that there’s more people on the road. Maybe that’s what you want. Maybe you wanted to enable more, but just be aware that this is most likely going to happen. There’s another limit to efficiency, which comes from physics, and it comes from thermodynamics. Even for machines, there’s a limit to efficiency, and it’s actually really quite low. When they were inventing the steam engine and developing it, you can see that they made quite significant improvements in efficiency for a while.
Then it really leveled out, and it capped out. No matter what they did, they couldn’t make it any more efficient. This wasn’t because they weren’t doing the engineering properly. It was because there was a mathematical limit to how efficient the engine could be, that came from thermodynamics. That limit is 37%, which is astonishingly low, I think. It means that more than half the energy you put into that engine is going to get thrown out as heat, no matter how good an engineer you are. It gets worse, because if you make your engine really super-efficient because you’re an awesome engineer, you might actually be worse off than if you had made it less efficient, which makes no sense, but it’s true. The maximum efficiency of a combustion engine is 37%.
Most engines running on the road do not run at 37%. Most engines on the road run at around 20%. Why is this? It’s because they’re deliberately detuned, because otherwise the engine wears out faster. You have to run the engine at a lower efficiency. We have this tradeoff between efficiency and resiliency. We can have an engine that’s really efficient, or we can have an engine that lasts. Actually, we probably want at least some of that engine-lasting quality of service.
We see the same tradeoff everywhere. For example, in the rail network, the trains can go between stations much faster than they actually do. The reason that they don’t is if you designed your schedule for the train to absolutely rocket along, it would mean anytime there was a tiny problem, the whole schedule would collapse. Instead, they put in some padding into the schedule in order to make sure that the schedule is predictable. It’s this tradeoff between predictability and efficiency. There’s a similar tradeoff between redundancy and efficiency. When you have a failover in another region, there’s usually quite a lot of inefficiency that comes with that. You’re willing to accept that because you quite want the failover, thank you very much. Again, we can see this everywhere.
People often talk about a stool as very well engineered because it has the optimum number of legs. It has three legs, which is the minimum that you need for the stool to stand up properly. Whereas if you look at something like a dog, a dog has one more leg than it needs. You can verify this experimentally, although please don’t do this at home. If you’ve ever seen a dog that had an accident, you get these three-legged dogs. It is still a perfectly functional dog. It does all the doggy things and it runs around the park. It’s really happy because it has that resiliency from the extra leg. If you do the same experiment to your stool, it’s much less cruel, but it’s also much less successful because it is no longer a stool, it’s just a stick. The resiliency, it lowers the efficiency, but it gives you other good things instead.
Working Less and Achieving More (Double Wins)
This tradeoff, it’s exactly the same for people as well. All work and no play, it’s not going to give you the outcomes that you were hoping for. I saw this on social media a while ago. I just heard of a founder who monitors his co-founders’ and employees’ productivity via a Whoop group. I don’t know what Whoop is, but I assume it’s some health tracking system.
The team is collectively averaging five-and-a-half hours of sleep per night. I thought this was going to go on to say, this is terrible, because if you operate without enough sleep, it’s the equivalent of operating drunk. No one would want their employees to come into the office drunk. This is not where it went.
Instead, it said, if you haven’t done this too, to monitor the work ethic of your founders, I encourage it. I think the moral of the story maybe is to not be a founder and to be cautious with venture capital. It’s also that you shouldn’t do this. Because even if we’re not founders, I think we’ve probably all had the experience of, if we get overloaded, things get bad. We’ve also all seen it, I think, with our colleagues, where suddenly our colleagues become really intolerable. It’s because they have too much to do. You go to the colleague and you’re like, could you please do this thing? Before you can even finish the sentence, they’re like, no, I’m too busy. How dare you ask me for this completely reasonable, unreasonable thing? You’re like, sorry. It’s because they’re overloaded. It doesn’t just impact us personally, because we don’t like talking to them anymore. You can actually quantify it for the business.
For example, with Splunk, apparently at one point, they had 90% attrition in a 6-month period. How do you run your business if you’re losing 90% of your people every 6 months? You can’t. You’ve lost all of your institutional memory. Running it too hard has this really big cost. What we need is to build more slack into the system. I should clarify, when I say slack, I do not mean that [Slack, the messaging app], which is a human hamster wheel, which just goes to show that naming is still the hardest problem in computer science, because Slack is the opposite of slack. What I mean by slack is just that idleness, that doing nothing. Because doing nothing can lead to more efficiency, which is paradoxical and weird, but true.
I mentioned this, but I want to re-mention it because it’s so important. The default mode network is an area of the brain that when the rest of the brain becomes less active because you’re not doing anything, the default mode network becomes more active. It engages and does useful stuff only in the absence of external demands. We can trigger the default mode network in a variety of ways. One of the most convenient ones is to have a shower. We’ve probably all had the experience of solving problems in the shower, because one of the things that the default mode network is involved in is problem solving. Apparently, 72% of people get their best ideas in the shower, and 14% of people take showers specifically for the purpose of having ideas. I always feel a little bit uncomfortable showing this statistic because this research was sponsored by a shower company, so they had a vested interest.
The default mode network is something that we’ve learned from psychology. If psychology is too fluffy, we can come to the exact same conclusion just with math. Math says that systems need to run under capacity to function. Queuing theory is the study of what happens when you have work going into a system and you hope to have work coming out of the system. Basically, it’s the study of absolutely everything. The idea is that you have an arrival process. Usually that follows a Poisson distribution. That work goes into a queue, and things wait in the queue, getting progressively more unhappy as they wait in the queue. Then they go to servers, and they get sorted out with the server.
Then the work is completed and everybody is happy. If the arrival rate is really low, there’s going to be nothing in the queue and the servers are idle. This is an underutilization case. This is our servers at 10% of capacity. It’s wasteful. We can go too far in the other direction as well for both machines and people, which is that if the server capacity is too low, if you’re not actually able to get through the work, what happens is you build up this really long queue. This is exponential. What it means is that as your server utilization goes to 100%, it’s not like your wait times are a bit tedious. Your wait times go to infinity. If you have infinite wait time in a system, it just isn’t working. You really need to put yourself somewhere around that 80% utilization in most cases, because if you go to 90% utilization, your wait times will double, which is probably unacceptable in a business sense. On the other hand, you do need to balance that against the cost of the idle capacity, which is waste, which we want to avoid.
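As a rough illustration of the utilization curve described above, here is a minimal sketch in Python. It assumes the simplest textbook model, a single-server M/M/1 queue with Poisson arrivals and exponentially distributed service times; the talk does not name a specific model, so that choice and the function names are assumptions made for illustration. In this model the mean time a piece of work spends in the system is S / (1 - rho), where S is the mean service time and rho is the utilization.

```python
def mm1_time_in_system(utilization, service_time=1.0):
    """Mean time in system (waiting + being served) for an M/M/1 queue.

    Uses W = S / (1 - rho). This is an assumed model, not one named in the talk.
    """
    if not 0 <= utilization < 1:
        # At rho >= 1 work arrives at least as fast as it is served,
        # so the queue grows without bound and there is no steady state.
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


def mm1_wait_in_queue(utilization, service_time=1.0):
    """Mean time spent waiting before service starts: Wq = rho * S / (1 - rho)."""
    return utilization * mm1_time_in_system(utilization, service_time)


if __name__ == "__main__":
    # Times are expressed in multiples of the mean service time.
    for rho in (0.10, 0.50, 0.80, 0.90, 0.99):
        total = mm1_time_in_system(rho)
        queued = mm1_wait_in_queue(rho)
        print(f"{rho:.0%} utilized: {total:.1f} in system, {queued:.1f} in queue")
```

Under these assumptions, going from 80% to 90% utilization takes the time in the system from about 5 to about 10 service times, which matches the doubling mentioned in the talk, and the figure grows without bound as utilization approaches 100%. Real systems with multiple servers or less variable arrivals will behave differently in detail, but the shape of the curve is similar.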
Math says it, psychology says it. We can see it in business studies as well. It used to be that the 6-day working week was standard. The only day that people would have off was Sunday. This changed with Henry Ford. Henry Ford looked around and he realized that nobody had any leisure to buy cars because they were all working six days. He instituted the 5-day working week so that people had two days of leisure. The really interesting thing is his car sales went up because people wanted to do something at their weekend, but his productivity stayed the same. He was able to pay people more, have them work less, and actually produce more cars because people weren’t so tired, there were fewer errors on the assembly line, all of that kind of thing. We’re now exploring the next step, which is, what if we go to a 4-day working week? What happens then? It seems absolutely ridiculous. Why would you pay your employees to do 4 days instead of 5 days? The experiments with it are actually pretty positive.
Retention goes up, so companies have 42% fewer people leaving, which makes sense. I would stay someplace that only expected me to work 4 days. There’s also a 36% increase in revenue. Having people work less makes the company more profitable. I think what this shows is that efficiency can be really counterintuitive. We have this idea of what efficiency looks like. For example, a kangaroo is the most efficient land animal. This is not too surprising. Because a kangaroo is very bouncy, it’s very strong, it just looks muscular. It’s cool, and it’s bouncy. It can convert kinetic energy to potential energy, and then back in this efficient cycle. A kangaroo is not the most efficient animal overall, though.
The most efficient animal is actually the jellyfish. None of us go into work in the morning and say, I wish to be the jellyfish employee. It doesn’t really have a good look. You’re not going to put that on your appraisal. It’s actually incredibly efficient. I think we sometimes need to think about, what does efficiency look like? What am I optimizing for? Am I optimizing for the right thing, or should I maybe be just a little bit more jellyfish? We can take advantage of some of these paradoxes to give us better business and better life. If we know that increasing capacity doesn’t actually give us a benefit, because utilization just goes up, can we do the opposite? Can we shrink capacity and do this inverse Jevons maneuver, and achieve more by doing less? Because if we can, that’s really cool. Instead of having this tradeoff between being effective and working less, we can have the broken seesaw again. It’s a double win, which is cool.
Recap
To wrap up, because I know I’ve covered all of physics, psychology, biology, business studies, everything, and saving the world. Waste is really bad. Waste is everywhere. We can get rid of waste. We just need to sometimes challenge assumptions. We really should do that, because it’s just such a terrible problem. We also should not be wasting our time on things that aren’t valuable. We can be working less and achieving more, which is pretty cool. Happiness is not waste. We should eliminate waste. We shouldn’t have our happiness be eliminated as a byproduct of eliminating that waste, because happiness has a lot of value. Idleness is not waste. Idleness is often improving the efficiency and the resiliency of the system.
Finally, look for double wins everywhere, because, really, it’s not a zero-sum game. There are double wins that we can find everywhere, and get better sustainability and better business. That’s pretty awesome, I think.