[Video Podcast] Improving Valkey with Madelyn Olson

News Room | Published 9 February 2026

Watch the video:

Transcript

Thomas Betts: Hello, and welcome to the InfoQ Podcast. I’m Thomas Betts. Today I’m speaking with Madelyn Olson. Madelyn is the maintainer of the Valkey project and a principal software development engineer at Amazon ElastiCache and Amazon MemoryDB, focusing on building secure and highly reliable features for the Valkey engine.

She recently gave a deep dive technical presentation at QCon San Francisco about recent changes to the Valkey hash table and the associated performance improvements. I found it fascinating. So Madelyn, welcome to the InfoQ Podcast.

Madelyn Olson: Thanks so much for having me and I’m glad you enjoyed the talk. The QCon conference was one of the best ones I’ve been to in a while. The audience was really great. They asked really informative questions.

The Valkey Origin Story [01:19]

Thomas Betts: Yes. QCon is definitely my favorite conference. So I think we need to start off with the origin story of Valkey. Maybe people haven’t heard the name before. Where did Valkey come from?

Madelyn Olson: And so I’ll start by giving a little bit of history before the actual creation of Valkey. So I had actually been a maintainer of the open source Redis project since about 2020. So me and some of the other major contributors of Redis had kind of built a pretty exciting development community. And so when Redis decided to change their license, back in March 2024, Redis moved from an open source permissive BSD license to a commercial license, SSPL, and a variant called RSAL. That community sort of got together and said, “Hey, we want to continue building what we’ve been working on”. So me and another Redis maintainer, his name was Xiao, he works at Alibaba, went in. We got four other engineers from the Redis community. So we got an engineer from Ericsson, Tencent, Huawei and Google. And so that group of folks, we got together, we went to the Linux Foundation and we were able to create Valkey.

So it was a very fast creation. From the time the license change happened to when Valkey was created was only about eight days. And that’s because we had a tight-knit community already. So we went, created the project. And so that was about 18 months ago. So since then, Valkey’s been doing a lot of stuff. We had a bunch of ongoing engineering work that we just sort of continued. So we did launch a Valkey that was just like a fork of Redis. So that was version 7.2. The first real release was 8.0. That was last year and that was sort of like a statement like, “Hey, we can build stuff”. And since then we’ve had two more major releases. We had Valkey 8.1 earlier this year, and then, not really related to the fact that I just did a talk at QCon, we had another release, Valkey 9.0.

So that released in November. So we’ve had a bunch of releases under our belt now. There are a lot of managed providers of Valkey now. Folks like Amazon ElastiCache, the service I work on, now support Valkey. Memorystore, one of GCP’s offerings, has support for Valkey. There are also a lot of other third party providers like Aiven and Percona that also have managed Valkey offerings. So we’re seeing a lot of excitement kind of around the community and it’s going really well.

Getting Started with Valkey [03:40]

Thomas Betts: Yes. I think you segued into my next question, which is how does someone get started with this? Is it a Valkey-as-a-service offering? It sounds like it is. If you have Redis and you want to switch, is that something you can do? Is there anything developers need to know? There are a lot of getting-started, but also migration, kinds of questions tied into that, from both the infrastructure and the engineering side. So start wherever you want.

Madelyn Olson: We like to say Valkey is a drop-in replacement for Redis open source 7.2. So that was the last BSD version of Redis. So if you’re on that version, you can always safely upgrade to any Valkey version. Valkey is fully backwards compatible in that sense. There are some newer versions of Redis. I mentioned Redis went to a proprietary license. They’ve actually since moved back to AGPL. So if you’re on either of the proprietary or AGPL versions of Redis, there might be some incompatibilities you have to look for. We see that most users on old versions of Redis are able to move kind of safely to Valkey. And so what does that look like? Valkey and Redis are typically used as a cache. So typically you can just delete the cache, move it to Valkey and it should work just fine. But one of the reasons people really like Valkey is that it has high availability options.

So what you can do is basically attach replicas to your existing cluster and sync all the data and then you can do failovers. So this is what we call like an online upgrade process. If you’re using a managed service like ElastiCache or Memorystore or Aiven, they’ll make that all seamless for you. You usually just click a button. I can speak a little bit for ElastiCache since I work on the service. It really is just like you click on a button on the console and say, “Hey, I want to move to Valkey”, and it does it all. It’s all online. You don’t really have any outage and it’s quite seamless. So the upgrade process is pretty straightforward. From a tooling and client side application perspective, most clients that work with Redis will also work for Valkey. The big ones are like redis-py, the Spring Data Redis provider, all that type of stuff works just as well with Valkey as it does with Redis.
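
(For readers doing that attach-replica-then-failover upgrade by hand rather than through a managed service, a rough sketch of the flow with the hiredis C client might look like the following. The hostnames are placeholders, and a real migration would also wait for the sync to finish and repoint clients before promoting.)

    /* Rough sketch of a self-managed online migration: point a fresh Valkey
       node at the existing Redis 7.2 primary as a replica, wait for the sync,
       then promote it. Hostnames are placeholders; a managed service hides
       all of this behind a button. */
    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        /* Connect to the new Valkey node that will take over. */
        redisContext *valkey = redisConnect("valkey-new.example.internal", 6379);
        if (valkey == NULL || valkey->err) {
            fprintf(stderr, "connect failed\n");
            return 1;
        }

        /* Start replicating from the old Redis primary. */
        redisReply *r = redisCommand(valkey, "REPLICAOF %s %s",
                                     "redis-old.example.internal", "6379");
        if (r) { printf("REPLICAOF: %s\n", r->str); freeReplyObject(r); }

        /* ... wait until INFO replication reports master_link_status:up ... */

        /* Promote the new node, then repoint clients at it. */
        r = redisCommand(valkey, "REPLICAOF NO ONE");
        if (r) { printf("promote: %s\n", r->str); freeReplyObject(r); }

        redisFree(valkey);
        return 0;
    }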

What we’ve heard from a lot of users, one thing I think I mentioned in the talk as well is a lot of people move to Valkey and it’s so seamless and easy. We’ve been trying to get people to write blog posts being like, “Hey, this is what we migrated and this is what we learned”. And they’re like, “We migrated and we learned nothing. We just clicked a button”. So we’re still working on getting user stories.

Thomas Betts: Yes, you’re a victim of your own performance. You made it too easy.

Madelyn Olson: Right. Too easy. And part of that was because the fundamentals were really good, right? There was already an online upgrade story and we are a fork. So it’s not like we’re trying to build compatibility. We start with compatibility for free and we just have to maintain it. So yes, the compatibility story’s been pretty good.

Valkey is a hash map over TCP [06:26]

Thomas Betts: I do want to get into what was the focus of your QCon presentation. My naive understanding is Valkey’s just a cache. It’s a key value store. I stick stuff in, I pull stuff out. And that’s kind of, you said, you’re like, it’s a hash map over TCP. Can you say a little bit more about that? And like what is it underneath the covers? What’s the magic of Valkey?

Madelyn Olson: Yes. And I think you synthesized the simple version. The version that most people know about Valkey is that it’s just a hash map. It’s a key value store and people know that the values don’t have to be just strings, right? So a traditional key value store is something like Memcached: the key is a string and the value is a string. But the real power of Valkey is that it has complex data types for values. So some use cases we like to talk about, you can store like a set and be like, “Oh, has this user logged in recently? Should we show an advertisement to this user?” You can store all that in the set objects and then do very quick checks on that data type. So I think that’s the first thing that really differentiates Valkey from just a simple hash map. And then the other thing is, like, the hash map itself is straightforward to build.

It’s all the stuff around it that’s really complicated. The stuff like horizontal clustering, stuff like replication, stuff like durability, stuff like observability and statistics. That’s where most of the work goes into Valkey, right? The actual core hash map and the data is actually straightforward, but it’s all this other stuff around it that we have to maintain and build that is where we spend most of our engineering time.
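
(As a concrete illustration of the set use case above, a minimal sketch using the hiredis C client. The key and member names are made up, and the same commands work against either Redis or Valkey.)

    /* Toy version of the "has this user been seen recently?" set check,
       using the hiredis client. Key and member names are made up. */
    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) return 1;

        /* Record that user 42 was seen today: add them to a set. */
        redisReply *r = redisCommand(c, "SADD seen:2026-02-09 user:42");
        if (r) freeReplyObject(r);

        /* Later: should we show this user an ad? One cheap membership check. */
        r = redisCommand(c, "SISMEMBER seen:2026-02-09 user:42");
        if (r) {
            printf("seen today: %lld\n", r->integer);   /* 1 = yes, 0 = no */
            freeReplyObject(r);
        }

        redisFree(c);
        return 0;
    }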

Improving the performance without breaking anything [07:59]

Thomas Betts: That’s what makes it a product and not just a hash map. Not that I could implement a hash map myself. There’s no reason for me to do that, but the reason of having it as a service on some scalable appliance or scalable infrastructure is like all you said. And once you go to having all those things, that adds layers of complexity. And if I recall, your presentation was talking about basically changing everything under the covers, but keeping all of the horizontal clustering, the durability, everything else you mentioned, all that had to stay working while you tweaked around inside, right?

Madelyn Olson: Right. And that was really sort of the gist of the talk. So the talk was in like a modern performance track at QCon. And so we had recently rebuilt this hash table and that’s exciting in itself, but what’s really impressive is we didn’t have any performance regressions. And so I guess I can just kind of start doing the story now. So back in 2022, we were thinking about, this is actually back in the Redis days, this is pre-fork. Me and some of the other contributors were like, “Hey, we built a hash table in 2009”. And although some of this stuff was known in 2009, people were still kind of big into simplicity, they didn’t want to over engineer. And so we built a hash table that was pretty good at the time and kind of just looking back at it, we’re like, “Hey, we can make stuff a lot better”. And the big things we realized were, we were doing a lot of independent memory allocations.

So when we want to basically take an object and put it inside Valkey, we were building container objects, we were using linked lists to basically keep track of when objects hash to the same bucket inside the hash table and we had a relatively high load factor. So load factor is basically the ratio of how filled a hash table actually is versus how many places we can put stuff in it. And that’s because we were using kind of older techniques and we weren’t taking great advantage of modern hardware. So one of the big things that’s happened in the last 10 or so years is hardware hasn’t gone that much faster, but it’s gotten better able to operate on multiple pieces of data at the same time.

So the hash table we built really wasn’t that aware of that functionality. So the things we were trying to do by going over this hash table, and it took us over… Let’s see, I think we started working on this hash table modernization in 2023, and it took us until basically the end of last year to sort of like finish it all to make sure it all worked. And there were, like, a lot of problems we had to solve. And there are even problems I didn’t talk about in the talk, because the talk was a little bit, it had to be simplified to talk about one specific area. And for example, one of the problems we had to solve was in the clustering mode. So Valkey, when Valkey is horizontally distributed, you have multiple different servers. Each key is hashed to a specific slot, that’s deterministic. So key foo will always be like slot 12,000.

And so how these slots are distributed across the nodes is dynamic. So that’s how you scale out: you move slots to different nodes and you move those keys along with them. So one of the things we have to do, we have to basically know which keys are in which slot on a node. So we need, like, an O(1)-ish way to do that, and not just that, we need a way to iterate over the keys in a given slot when we want to migrate them. In the original version, like when we forked Valkey off, the data structure we were using basically looked like it was a giant linked list of all the keys. So we basically maintained this linked list, which adds 16 bytes of pointers, right? Basically a pointer to the next entry and, like, the previous one, right? A linked list. And so that was really expensive, right?

16 bytes doesn’t sound like a lot, but when your average item size is only like 100 bytes, that’s a lot of overhead. And one of the things we did is, instead of having these linked lists, we decomposed these giant hash tables, which held all the data, into basically a dictionary per slot. And now that sounds conceptually straightforward: you first compute the slot the key is in, and then you go look for it in the specific per-slot dictionary. But there are certain things that Valkey does to operate on the whole data set, stuff like expiration and eviction. And the way those were working is they were sampling items from every dictionary to sort of just determine which one to kick out. And all of a sudden we needed to do this across upwards of thousands of these per-slot dictionaries. And so we actually had to spend a bunch of time and research.

We prototyped a bunch and we consolidated on this data structure called a binary index tree, which basically lets us sample randomly across all of these per-slot dictionaries proportionally to how much data is in them. So a binary index tree is basically a… I mean, it’s a binary tree and it keeps cumulative counts of the number of items in, like, the leaf nodes. So we’re able to basically pick a random number and that’ll tell us which slot dictionary we should be sampling to get the specific item out. That’s a problem we had to solve and we solved it for Valkey 8 and that was like sort of our first big jump. And then the next big jump was we basically started to, instead of having dedicated allocations for the key and the Valkey objects themselves, we started kind of compacting all this memory stuff together. In my talk, I talked about it as moving from static structures, right?

So a bunch of small fixed size structures to basically dynamically allocating bigger blocks of memory, which is more aligned with how high performance caches kind of are built nowadays. We took a lot of inspiration from stuff like Segcache, which is based on top of Pelikan, which is a caching framework. So I know I just talked a lot about a lot of different low level details. And so this is all the stuff that makes me really excited about everything that’s going on in Valkey.
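
(A minimal sketch of the binary index tree idea described above, not Valkey’s actual implementation: cumulative key counts per cluster slot let one random number select a slot with probability proportional to how many keys it holds.)

    /* Minimal sketch of the binary index (Fenwick) tree idea: keep cumulative
       key counts per cluster slot so a single random number picks a slot with
       probability proportional to its key count. Illustration only. */
    #include <stdlib.h>

    #define NUM_SLOTS 16384                 /* cluster hash slots */
    static long long tree[NUM_SLOTS + 1];   /* 1-based Fenwick tree */

    /* Slot gained/lost `delta` keys (e.g. +1 on a new key, -1 on a delete). */
    void slot_update(int slot, long long delta) {
        for (int i = slot + 1; i <= NUM_SLOTS; i += i & -i)
            tree[i] += delta;
    }

    /* Total keys across all slots. */
    long long total_keys(void) {
        long long sum = 0;
        for (int i = NUM_SLOTS; i > 0; i -= i & -i)
            sum += tree[i];
        return sum;
    }

    /* Largest prefix whose cumulative count is <= target, i.e. weighted
       random selection when target is uniform in [0, total). */
    int slot_for_target(long long target) {
        int pos = 0;
        for (int step = 1 << 14; step > 0; step >>= 1) {  /* 2^14 = 16384 */
            if (pos + step <= NUM_SLOTS && tree[pos + step] <= target) {
                pos += step;
                target -= tree[pos];
            }
        }
        return pos;   /* 0-based slot index */
    }

    /* Usage: pick a slot for eviction/expiration sampling. */
    int sample_slot(void) {
        long long n = total_keys();
        return n ? slot_for_target(rand() % n) : -1;
    }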

What does performance mean for Valkey [13:58]

Thomas Betts: Well, I just love talking to someone who’s this passionate about like the specific thing they do and like, you couldn’t do this if you weren’t that excited about it. So you said the first jump was that binary index tree and then you started to make other changes. And I think if I go back a couple minutes in the recording, it said there were no performance regressions. How are you measuring performance as you’re making these changes? What does performance mean?

Madelyn Olson: It’s funny, when I was talking to some folks at QCon, this is like one of the soapboxes that I like to get on top of, which is a lot of people when they think about performance, they usually think about latency, like how long does it take to get a response back from when you send a request? But the problem with Valkey is Valkey is so fast that the actual time to get a response back is almost entirely dominated by network. So most simple commands in Valkey take about one microsecond. So if you’re doing any network hops, which is the intended mode of Valkey, you’re going to see at least hundreds of microseconds if you’re doing an intra-AZ hop and up to, like, a millisecond if you go cross-AZ. When we talk about performance, we’re almost always talking actually about throughput because once you sort of hit the limits of throughput, you start seeing these huge latency spikes because of contention within the engine itself.

So when I talk about performance, I’m always talking about throughput. So to your actual question, how do we measure throughput? So throughput’s pretty easy to measure, right? You just send a lot of traffic to the engine and see how much it can actually process at a given time, which is quite… I mean, it’s not trivial, but it’s relatively straightforward. We have built-in tooling inside Valkey to do basically load testing, sending lots of traffic, the tool called Valkey Benchmark. We’re currently evaluating some other approaches. So that’s like our end goal. We’re like, “Hey, this is the end thing we have to compare against”. But we also do a lot of what we call micro benchmarking, which is basically, oh, it’s nice. When you have a bunch of C code, you can just basically go put it on a machine and run it 10,000 times and see how long it takes.

So when we were rebuilding this hash table in a bunch of different ways, like every step along the way, we were doing this micro benchmarking to see how long did it take before, how long does it take now? And that was kind of the best way to sort of guide our performance journey, to make sure we weren’t regressing in weird cases.
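
(In the spirit of that micro-benchmarking, a tiny harness might look like the sketch below; lookup_once is a made-up stand-in for whatever hash-table operation is actually being measured.)

    /* Tiny micro-benchmark harness: run an operation many times and report
       wall-clock time per call. lookup_once() is a stand-in for the real
       hash-table operation being measured. */
    #include <stdio.h>
    #include <time.h>

    static volatile unsigned long sink;

    /* Stand-in workload so the example is self-contained. */
    static void lookup_once(void) {
        sink = sink * 2654435761u + 1;
    }

    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void) {
        const int iterations = 10000;
        double start = now_sec();
        for (int i = 0; i < iterations; i++)
            lookup_once();
        double elapsed = now_sec() - start;
        printf("%d iterations in %.6f s (%.1f ns/op)\n",
               iterations, elapsed, elapsed * 1e9 / iterations);
        return 0;
    }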

And then the other thing that we don’t do a lot, but we should do a lot more, is that in our world, a lot of the performance cost is that we spend a lot of time basically waiting for main memory access, right? So in the same way that if you have to fetch something from disk, it takes a long time. In our world, fetching something from main memory takes a long time. So we spend a lot of time looking at like CPU counters, which basically tell us how much time we’re waiting. And the terminology is backend stalls, right? How long are we waiting for memory to be available to be processed?

So we also spend a lot of time looking at those counters to see, “Hey, are we actually doing a good job pre-fetching memory?” So that’s another big thing we care a lot about: before we actually want to execute a command, we want to make sure all that memory is pulled in from main memory into the CPU caches so that it can execute very quickly. So we also compare how much time we’re spending on those different areas, like executing commands versus stalling for memory. And then also like perf and flame graphs and stuff. So perf is a way to basically sample, it’s a program that runs periodically and basically generates stack traces of, like, where the program was at a given point in time. And then you’re able to compact it all together and build these graphs which say, “Hey, where is the program spending most of its time?” So that’s a good way to get sort of like an intuitive understanding of, “Hey, what’s taking a long time?” And then that sort of helps you zoom in on specifically what you should be targeting.
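
(An illustrative sketch of the prefetching idea, not Valkey’s actual pipeline code: touch the memory a batch of commands will need before executing them, so time that would be spent on backend stalls is overlapped with useful work. The names here are hypothetical.)

    /* Illustrative sketch of batch prefetching; the names are hypothetical
       and this is not Valkey's actual command pipeline. */
    typedef struct command {
        void *key_entry;   /* hash-table entry the command will read */
        void *value;       /* value object it will touch */
    } command;

    void execute_batch(command *cmds, int n, void (*run_command)(command *)) {
        /* Pass 1: ask the hardware to start pulling in everything the batch
           will need. __builtin_prefetch (GCC/Clang) is only a hint. */
        for (int i = 0; i < n; i++) {
            __builtin_prefetch(cmds[i].key_entry, 0 /* read */, 3 /* keep in cache */);
            __builtin_prefetch(cmds[i].value, 0, 3);
        }

        /* Pass 2: by the time each command runs, its data is hopefully already
           in L1/L2, so cycles go to execution instead of backend stalls. */
        for (int i = 0; i < n; i++)
            run_command(&cmds[i]);
    }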

Designing for typical use cases [18:00]

Thomas Betts: So I know the details always matter. And you mentioned earlier the 100-byte average size. Was that the size of the key being 100 bytes, or the data size in bytes? And what if people stick bigger stuff in there? You said it can store anything. Does that play into your calculations and your designs for how to change things around?

Madelyn Olson: Yes, that’s an excellent question. So when I was talking about 100 bytes, so inside ElastiCache, we did some analysis a while ago and we came up with these numbers and they’re not really representative of everything, right? Everyone has their own workloads, but we found the P50 for key plus value size is about 100 bytes. Keys are actually usually very small. They’re usually between like 16 and like 32 bytes. And then the values are a bit bigger. For strings, they’re bigger, like 70, 80, 100 bytes.

But as you correctly pointed out, one of the things about Valkey that makes it kind of special is a lot of other caching projects are like, “Hey, you have to decide upfront how big your keys are, how big the values are, how many total items you’re going to have”, because that allows you to make a lot of optimizations and Valkey’s like, “We’ll just take whatever you give us. Give us any key size, we’ll accept any value”. The default maximum value size is 512 MB. You should not put 512 MB objects in Valkey, but you can, it will not explode. So as you said, we have to care about those cases as well. So typically we are kind of testing a range of stuff. We’re usually testing values like in 50-, 60-byte, 80-byte ranges, we’re testing in like 512-byte ranges, and then we’re also testing in like tens or hundreds of kilobyte ranges.

And those are like probably the most representative and so we want to make sure that those perform well. One of the things that comes up a lot when we’re talking about performance is, what if we regress on something weird, right? In my talk at QCon, we actually had a slight performance regression in an area that we just had no performance tests for. Functionally it didn’t break, but performance wise, we were basically not pre-fetching memory correctly. We were pre-fetching the wrong memory and that was really hard to detect. We only figured it out because someone like said, “This code looks wrong”. And we’re like, “Yes, it is wrong”. It’s not crashing, but it is wrong, so we should fix it. And they submitted a fix, which is why open source is great, once you have people reviewing all the code and able to make good suggestions.

What changed in Valkey 8 and 9? [20:40]

Thomas Betts: I know we can’t show your diagrams here. I know the link will eventually be up on QCon or on InfoQ, so people can watch the presentation. Can you do the hand wavy thing? Describe what you changed. You were just starting to get to… I remember seeing the presentation like, “Here is this thing that was wrong”. And I’m like, “Okay, I think I can see how it’s wrong, but you’re talking about changing a hash table and what you’re sticking in there and there’s a dictionary look up”. Walk us through the big pieces of the memory you’re talking about and how it matters because part of me is like, that shouldn’t matter, but I guess it does.

Madelyn Olson: It’s funny. As I was mentally walking through while I was discussing earlier, I’m like, “How am I going to explain that diagram?” And so I just didn’t. So I would definitely encourage you to go watch the talk after this, but I will do my best to explain the design here. So in the traditional hash table that we had before, so we talked a little bit about buckets, right? So within the hash table, you hash a key. So we’ll take foo as an example, just as an illustration. So you hash the key foo, let’s say you get, like, bucket 10. So you’ll go and check that bucket and it will point to a container object. So inside Valkey, we call these dictionary entries. So we’ve paid eight bytes so far; on 64-bit systems, eight bytes is how much memory it takes to hold a pointer, right?

So we point to a dictionary entry, that’s eight bytes. And inside that dictionary entry, we have three pointers. The first one is the pointer to the actual key. And so when we’re doing the lookup, we have our original key, foo. So we have to do a memory compare against those two. So we need to have that key somewhere. And the original implementation was always a separate pointer. So in Valkey 8, this was the first thing we actually also embedded into the structure itself. And the reason this is difficult is like, so before we just had three pointers. So we always had a fixed, 24-byte allocation. And all of a sudden when there’s a key there, like foo. In this case, foo is a little bit simple because three bytes is less than eight. You could just put it there. But as we said, the key could be 40 bytes.

So we need to have all the code in place to basically say, “Hey, this is where the key is inside this memory block”. So we have to write some custom code to basically say, “Hey, go read this address inside the block, then go jump based on that memory read address, and then you can start reading the key”. So this is a bunch of complexity and it’s a little bit less efficient per se to run, but one of the advantages of modern hardware is like, it’s really good at this. It’s very good at doing branch prediction to guess where the code path is going to go and start running instructions. We kind of expected to see maybe a small performance degradation here, but we didn’t see any. So that first thing was the key.
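
(Simplified sketches of the two layouts being described, not Valkey’s exact definitions: the old fixed three-pointer entry versus an entry that embeds the key bytes in the same allocation.)

    /* Old layout: a fixed 24-byte entry of three pointers; the key lives in
       its own separate allocation. */
    struct old_dict_entry {
        void *key;                    /* separately allocated key bytes */
        void *value;                  /* pointer to the value object */
        struct old_dict_entry *next;  /* collision chain */
    };

    /* New-style layout: a variable-sized entry with the key bytes embedded
       at the end of the same allocation (flexible array member). */
    struct embedded_entry {
        void *value;
        struct embedded_entry *next;
        unsigned int key_len;         /* length of the key inside this block */
        char key[];                   /* key bytes follow the header inline */
    };
    /* One allocation of sizeof(struct embedded_entry) + key_len replaces two
       allocations and removes one pointer chase during lookup. */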

So the second pointer that we had inside this dictionary entry structure was the pointer to the actual value. So in this case, there’s actually two hops here. The first is to this object container. So as we said earlier, Valkey can have a lot of different data types, sets, hashes. So we need some metadata to say what type of object this is. So that’s some of that information. And we also have information about ref counting. So we’re not a garbage collected language, so we have to keep track of how many different pointers exist to this object. So obviously this dictionary has one, but there are other ones.

It’s used in a couple of other places. For example, we do this thing called reply offloading now. So when someone requests the data, we can give it to a client and be like, “You go, client, go write this data out to the network”. And so the client holds a reference for it until it’s able to fully write out the object. So we have some metadata there. And so we were basically spending eight bytes to point to this container and we should just be able to embed the container. So that’s also one of the things we did in 8.1: we embedded that container. So we’ve now saved two pointers so far.
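
(A sketch of that object-header idea with illustrative field names, not Valkey’s actual robj layout: the type and refcount metadata live in a small header that can be embedded alongside the entry rather than pointed to.)

    #include <stdint.h>

    /* A small header carrying the per-object metadata discussed above. */
    struct object_header {
        uint8_t  type;       /* string, list, set, hash, sorted set, ... */
        uint32_t refcount;   /* owners: the main dictionary, in-flight replies, ... */
        void    *data;       /* type-specific payload */
    };

    /* Reference counting instead of garbage collection: reply offloading
       bumps the count while a client connection writes the value out. */
    static inline void obj_retain(struct object_header *o)  { o->refcount++; }
    static inline void obj_release(struct object_header *o) {
        if (--o->refcount == 0) {
            /* last owner gone: free o->data and the header itself */
        }
    }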

And then the third thing was the next pointer. So I mentioned briefly, when we do this hashing of the key, multiple keys can have the same hash value. So we need to find a way to figure out which actual key value pair the end user was looking for. We did that by having a linked list. So we have a pointer to basically the next object. It’s actually a singly linked list, not doubly, because the length will actually be quite short. And that’s actually a pretty hard pointer to get rid of because we need to do this resolution.

And so what we ended up doing was we took inspiration from some other hash tables that have been built recently. So the more modern way to handle these collisions is to do probing. So instead of putting it in like a linked list, you put it in a different bucket and you have some way to figure out which bucket it’s in. The most common way to do this is what’s called linear probing. So if an item should be in bucket 10, you’ll put it in bucket 11. And so if you check bucket 10 and it’s not there, you go check bucket 11. So we prototyped this a little bit. There are some problems with this, which is that if you have very long chains, like you might need to check dozens of buckets. That’s bad for Valkey because it is latency sensitive. I mean, I said commands are really fast, but if they’re all slow, that’s still bad.

We care a lot about throughput. So instead what we did is we tried to adopt a strategy from an implementation called SwissTable. And so what SwissTable does is it takes advantage of the fact that within CPUs, memory is fetched in what’s called a cache line. So if you try to access a piece of memory, it pulls in 64 bytes around that. So we basically tried to pack as many pointers to objects in that 64-byte block as possible. And it turns out we can store about seven. So we have seven eight-byte pointers, and then we also have about eight bytes of extra metadata. So we’re able to basically use, like, SIMD instructions to check all seven of those pointers at once to see if any of those are matches. So that sort of allows us, instead of paying eight bytes for the next pointer, to really only pay, like, one and a half, because these whole buckets get chained as opposed to individual entries.
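
(A simplified sketch of the cache-line bucket being described, not Valkey’s exact layout: seven entry pointers plus one metadata byte each fit in 64 bytes, and one SSE2 compare tests all the candidate tags at once.)

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>   /* SSE2 intrinsics */

    #define BUCKET_SLOTS 7

    /* One 64-byte bucket: 8 metadata bytes (a 1-byte hash tag per slot,
       0 meaning empty) plus 7 entry pointers. Exactly one cache line. */
    struct bucket {
        uint8_t meta[8];
        void   *entries[BUCKET_SLOTS];
    };

    /* Bitmask of slots whose stored tag equals `tag`, checked in one compare. */
    static unsigned match_slots(const struct bucket *b, uint8_t tag) {
        __m128i meta  = _mm_loadl_epi64((const __m128i *)b->meta); /* 8 bytes */
        __m128i probe = _mm_set1_epi8((char)tag);
        __m128i eq    = _mm_cmpeq_epi8(meta, probe);
        return (unsigned)_mm_movemask_epi8(eq) & 0x7Fu;  /* keep the 7 slots */
    }

    /* Only dereference entry pointers whose tag already matched. */
    void *bucket_find(const struct bucket *b, uint8_t tag,
                      const char *key, size_t keylen,
                      int (*entry_matches)(void *, const char *, size_t)) {
        unsigned m = match_slots(b, tag);
        while (m) {
            int i = __builtin_ctz(m);          /* lowest matching slot (GCC/Clang) */
            if (entry_matches(b->entries[i], key, keylen))
                return b->entries[i];
            m &= m - 1;                        /* clear it, try the next match */
        }
        return NULL;   /* caller would then follow the bucket chain */
    }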

So we’re able to better amortize out the cost of that next pointer. So what I just talked about is about 23 bytes of savings. It’s actually a little bit less, but in practice it’s also a little bit different because our memory allocator rounds to what are called arena sizes. So like there’s an arena of eight bytes and 16 bytes and 24 bytes, which are about eight bytes apart. And for the larger ones, if you request, like, 92 bytes, it’s going to give you a 96-byte allocation. So there’s some overhead loss there as well. Within the talk, I mentioned a little bit some of the actual results we saw. We had a customer who had a little bit of an unusual workload. They had lots of very small keys and values. So the keys were about eight bytes and the values were about eight bytes and they saw almost a 40% memory reduction, which is really cool.

Obviously most real world examples are going to save less than that, but since most caching workloads are actually bound by the amount of memory you have to cache the dataset, this translated to a lot of customers being quite happy. Even if you see like an 8% reduction in memory, that makes users very happy because that probably delays how often they have to scale out or scale up.

Performance improvements [28:37]

Thomas Betts: Yes. Or how often you need to expire things, you can have them live longer if that works for your case. So yes, I kind of wanted to back us up out of the rabbit hole. Thanks for diving down into there, but that’s what I wanted to get to. So obviously saving memory has scaling factors that you can take advantage of. And you said you didn’t want to have any performance regressions. Were you seeing performance improvements because you have less memory? It seems like, through all this, things should be faster. Were you getting the throughput that you were hoping for?

Madelyn Olson: So our goal was at the time, we wanted to save memory and we didn’t want to degrade performance. If performance was flat, we were okay with that reality. And what we found was the… So this hash table we have, this like central data type, does power the key value data store, but it also powers a bunch of other stuff in Valkey as well. It’s the backend structure for the sets, the hash map, the sorted sets. It’s also used kind of all over the place, like tracking which keys are currently blocked and stuff. So the main key-value workload performance is basically flat, and that’s because we so aggressively prefetch memory for command execution that it’s basically all L1 cache the whole way, L1 and L2 cache. So theoretically, without that memory prefetching, it was faster, but the main key-value workload is pretty flat. For some of these other workloads which don’t do as much pre-fetching, they saw quite a bit of improvement. They saw 20 to 30% higher throughput.

Thomas Betts: You’re just able to do more. So that translates to the other parts of your system that are relying on Valkey are able to be a little better. I think that was a good summary. You said your goal is to save memory, don’t degrade performance. Your goal was not to increase performance. What are the benchmarks? What’s the threshold? I don’t even know what the scale is. Millions of reads per timeframe? I don’t know.

Madelyn Olson: There’s two important metrics. The first is basically throughput per core, which is the cost play. Most people use caching to reduce costs or improve efficiency. So you could do about a quarter of a million requests per second, either reads or writes per core. And so that’s sort of our benchmark. The other benchmark we have is vertical scalability per key. How many requests per second can you serve on a specific key? Because caching tends to be lumpy. So we want to make sure we’re also able to handle those lumps well. And so in that dimension, we do about like 1.2 million requests per second. We have some improvements that are upcoming that will get us up to about 1.4 million requests per second, which we’re quite happy with.

Thomas Betts: That seems like it should be good enough. I’m trying to think of what I would be doing that has a million reads every second.

Madelyn Olson: There are plenty of workloads where people are like, “We need 10 million requests per second on a single key”. And we’re like, “Have you considered having multiple copies of this key?”

Thomas Betts: Right. There might be a different thing you’re running into.

Madelyn Olson: Yes.

Valkey’s open source governance [31:44]

Thomas Betts: I want to go back to, you talked about the origin story of Valkey. And I know open source projects have different challenges depending on, there’s a lot of open source projects, a lot of maintainers, there’s a lot of different models. What’s the current state of Valkey’s governance model?

Madelyn Olson: So as I mentioned, there were six original creators of Valkey. Those six companies, they formed what’s called the technical steering committee or the TSC. All six of those people still comprise the current TSC. There is some directional goal to make the TSC bigger, but it’s one of those things like we’ve got to make it happen. So it’s still quite vendor-neutral. There’s a bunch of other engineers who have gotten involved in the project that we’re hoping will kind of become TSC maintainers relatively soon.

Thomas Betts: And it’s being well maintained obviously by you and others.

Madelyn Olson: I think so. I wish I had more time to write code. I sometimes joke that I’m a principal engineer at Amazon, which means I don’t do anything. I just float around and give hot takes. I love working on the project. I love working on code. I’m hoping the rest of today we’ll be writing code.

The most peculiar place to run Valkey [32:56]

Thomas Betts: That would be great. Well, before I let you get back to writing code, I was given one planted question. And I don’t know the answer to this. I’m curious, what’s the most peculiar place that you’ve run Valkey?

Madelyn Olson: That I personally have run Valkey?

Thomas Betts: Or maybe the most interesting place you’ve heard someone using it?

Madelyn Olson: I really love Ericsson’s use case. Ericsson uses it in telecommunication equipment, which I think is cool. I’ve run Valkey on my Steam Deck for a demo for a conference, which I thought was kind of fun. It’s always going to be the embedded stuff that has the coolest use cases. Like just running in the cloud, like running in your business app is very boring. For the longest time, it’s what powered all of my home automation systems. I built a polling system where everything went through Valkey, so, like, when you sent a request, it would go put it in Valkey and then things would pull from the queue. I don’t know if that’s all that interesting though.

Thomas Betts: I think it’s one of those weird things to say, “I have this, how can I use it?” Rather than like, “Is it the most appropriate thing? Do I need it?” But knowing that you could. And then you mentioned, I think this is all written in C, is that correct?

Madelyn Olson: In C, yes.

No, Valkey will not be rewritten in Rust [34:07]

Thomas Betts: Rust is the new C. Are you going to rewrite it in Rust?

Madelyn Olson: No.

Thomas Betts: See, I knew I could save one hot take for you. It’s not that hot.

Madelyn Olson: No, no, it’s not that hot a take. I mean, my opinion, I actually love Rust. Many years ago, I actually wrote an internal doc for our team and our group inside Amazon that was like, we should write all future code in Rust, because I am such a believer in Rust. But one of the things that I also talk a lot about is rewriting code. Rust is very opinionated. It’s hard to take C code and port it to Rust without dramatically changing the structure. You don’t get a lot of the benefits of the Rust ecosystem by just porting all the C code into Rust. And one of my concerns is we might degrade performance, memory efficiency. There’s all these big risks of things we might impact. And what’s the benefit we’re getting? We’re going to be super dependencyless. Valkey has no external dependencies. We build everything in statically.

So we’re not going to use Cargo. Sure, maybe testing would be a little bit easier. Oh, there’s some great tooling around performance profiling, but we have a lot of that expertise already. It’s one of those things like, it’s true, there’s a lot of sunk costs in what we’re doing in C, but I think there’s a lot of risks of moving to Rust and I don’t think there’s a lot of benefits.

Thomas Betts: That’s actually a really clear distinction. I like people talking about that very clearly. Write all your new code this way, but don’t port the old one.

Madelyn Olson: Yes. And to be clear, we do write code in Rust. We have Rust modules. So Valkey has this plugin extensibility system. There’s an SDK written in Rust. LDAP authentication is written in Rust. It’s really quite elegant. It’s like 300 lines of code. We didn’t want to roll our own LDAP authentication integration in C. That sounds terrible. But yes, just importing it from Cargo is nice. But it’s also interesting, that module ecosystem. So we have this thing called Valkey Bundle. So the core Valkey is just called Valkey. And then we have this Valkey plus extensions. The core Valkey is like 10 megabytes. The bundle is like 50. And in the grand scheme, it’s 40 megabytes, whatever. But some people care about that. It’s important to some users. And it takes so much longer to build. The Valkey core is like 20 minutes total build time. Building all these extensions and merging them together is like 55 minutes.

It’s like, there are downsides. There are real drawbacks. So I stick with my opinion: deep core infrastructure that’s already built and well tuned should probably stick with C, but you should at least try building new things in Rust, for sure.

Thomas Betts: I like that. I like that distinction. Well, your QCon presentation will eventually be available to view for everyone on InfoQ. I don’t have a date, unfortunately, but I do recommend people go check it out. The computer sciencey diagrams of hash maps and pointers. I geeked out on it. I think other people at the conference did too. Until that’s available, where can people find more information about changes of Valkey, where are you publishing stuff?

Madelyn Olson: Yes. So the main thing is, so valkey.io/blog, there is a higher-level explanation of that new hash table there on the blog. So go check that out. We publish primarily deep technical stuff on the blog. If you’re interested in getting involved, there is a Slack community. You can get there at valkey.io/slack and it will redirect you to join our Slack group. You’re free to message me on there and be like, “Hey, I want to get involved or what’s cool, what’s happening”. There are some groups there that are very responsive.

Thomas Betts: Well, Madelyn, thank you again for joining me today.

Madelyn Olson: Yes, thanks so much for having me.

Thomas Betts: And listeners, we hope you’ll join us again soon for another episode of the InfoQ Podcast.

