Transcript
Gu: I’m very excited to share some of the ways we’re leveraging generative AI for productivity at Wealthsimple, and the journey that got us to this place. My talk is roughly broken into four sections. I’ll start by sharing some context about what we do. We’ll dive deeply into our LLM journey. I’ll also talk about the learnings that came out of it. Then I’ll end with a quick snapshot overview of generative AI today.
Wealthsimple is a Canadian FinTech company. Our mission is to help Canadians achieve their version of financial independence. We do this through our unified app, where investing, saving, and spending come together as one. At Wealthsimple, our generative AI efforts are primarily organized into three streams. The first is employee productivity. This was the original thesis for how we envisioned LLMs adding value, and it continues to be an area of investment today. As we started building up the foundations, the tools, and the guardrails for employee productivity, this also gave us the confidence to start extending the same technologies to our clients to actually optimize operations, which became our second stream of focus.
In optimizing these operations, our goal is to use LLMs and generative AI to provide a more delightful experience for our clients. Third, but certainly not least, there’s the underlying LLM platform, which powers both employee productivity and optimizing operations. Through the investments in our platform, we have a few wins to share from the past year and a half since we embarked on our LLM journey. We developed and open sourced our LLM gateway, which is used internally by over half the company. We developed and shipped our in-house PII redaction model. We made it really simple to self-host open source LLMs within our own cloud environment. We provided the platform support for fine-tuning and model training with hardware acceleration. We have LLMs in production optimizing operations.
LLM Journey (2023)
How did we get here? Almost two years ago, on November 30, 2022, OpenAI released ChatGPT, and that changed the way the world understood and consumed generative AI. It took what used to be a niche and hard-to-understand technology and made it accessible to virtually anyone. This democratization of AI led to unprecedented improvements in both innovation and productivity. We were just one of the many companies swept up in this hype and in the potential of what generative AI could do for us. The first thing that we did in 2023 was launch our LLM gateway. When ChatGPT first became popularized, the general public’s security awareness around third- and fourth-party data sharing was not as mature as it is today. There were cases where companies were inadvertently oversharing information with OpenAI, and this information was then being used to train new models that would become publicly available.
As a result, a lot of companies out there, Samsung being one of them, had to actually ban ChatGPT among employees to prevent this information from getting out. This wasn’t uncommon, especially within the financial services industry. At Wealthsimple, we really did see GenAI for its potential, so we quickly got to work building a gateway that would address these concerns while also providing the freedom to explore. This is a screenshot of what our gateway used to look like in the earlier days. In the first version of our gateway, all it did was maintain an audit trail. It would track what data was being sent externally, where it was being sent, and who sent it.
Our gateway was a tool that we made available to all employees behind a VPN, gated by Okta, and it would proxy the information from the conversation, send it to various LLM providers such as OpenAI, and track this information. Users could use a dropdown selection of the different models to initiate conversations. Our production systems could also interact with these models programmatically through an API endpoint from our LLM service, which also handles retry and fallback mechanisms. Another feature that we added fairly early on in our gateway was the ability to export and import conversations. Conversations can be exported to any of the other platforms we work with, and they can be imported as checkpoints to create a blended experience across different models.
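To make the proxy-and-audit idea concrete, here is a minimal sketch of what a gateway like our early version could look like. This is illustrative only: the endpoint path, the Okta header, and the save_audit_record helper are assumptions, not our actual implementation.

```python
import time

from fastapi import FastAPI, Request
from openai import OpenAI

app = FastAPI()
provider = OpenAI()  # reads OPENAI_API_KEY from the environment


def save_audit_record(record: dict) -> None:
    """Stub: a real gateway would write this to a durable audit store."""
    print(record)


@app.post("/v1/chat")
async def chat(request: Request):
    payload = await request.json()

    # Audit trail: who sent what, and to which provider.
    save_audit_record({
        "user": request.headers.get("x-okta-user"),  # assumed header set by the Okta proxy
        "provider": "openai",
        "model": payload["model"],
        "messages": payload["messages"],
        "timestamp": time.time(),
    })

    # Proxy the conversation to the external provider and return its response.
    response = provider.chat.completions.create(
        model=payload["model"],
        messages=payload["messages"],
    )
    return response.model_dump()
```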
After we built the gateway, we ran into another problem, which was adoption. A lot of people saw our gateway as a bootleg version of ChatGPT, and there wasn’t that much incentive to use it. One of our philosophies at Wealthsimple, whenever it comes to new technologies or new tools, is that we want to make the right way the simple way, or the right way the path of least resistance. We wanted to do something similar with our gateway. We wanted people to actually want to use it, and we wanted to make it really easy for them to use it. We emphasized and amplified a series of carrots and sticks to guide them in that direction. There was a lot of emphasis on the carrots, and we let a lot of the user feedback drive future iterations of our gateway. Some of the benefits of our gateway are that, one, it’s free to use. We pay for all of the costs. Second, we wanted to provide optionality. We wanted to provide a centralized place to interact with all of the different LLM providers.
At the beginning, it was just OpenAI and Cohere, so not much to choose from. This list also expanded as time went on. We also wanted to make it a lot easier for developers. In the early days of interacting with OpenAI, their servers were not the most reliable, so we increased reliability and availability through a series of retry and fallback mechanisms. We also worked with OpenAI to increase our rate limits.
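As a rough sketch of the retry-and-fallback idea, assuming an OpenAI-style client: the model names, ordering, and backoff policy here are illustrative, and the real gateway spans multiple providers.

```python
import time

from openai import OpenAI

client = OpenAI()


def complete_with_fallback(messages, models=("gpt-4o", "gpt-4o-mini"), retries=3):
    last_error = None
    for model in models:                   # fall back to the next model in the chain
        for attempt in range(retries):     # retry transient failures with backoff
            try:
                resp = client.chat.completions.create(model=model, messages=messages)
                return resp.choices[0].message.content
            except Exception as err:
                last_error = err
                time.sleep(2 ** attempt)
    raise last_error
```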
Additionally, we provided an integrated API with both our staging and production environments, so that anyone could explore the interactions between our gateway and other business processes. Alongside those carrots, we also had some very soft sticks to nudge people in the right direction. The first is what we called our nudge mechanisms. Whenever anyone visited ChatGPT or another LLM provider directly, they would get a gentle nudge on Slack saying, have you heard about our LLM gateway? You should be using that instead. Alongside that, we provided guidelines on appropriate LLM use which directed people to leverage the gateway for all work-related purposes.
Although the first iteration of our LLM gateway had a really great paper trail, it offered very few guardrails or mechanisms to actually prevent data from being shared externally. We had a vision that we were working towards, and that drove a lot of the future roadmap and improvements for our gateway. Our vision was centered around security, reliability, and optionality. We wanted to make the secure path the easy path, with the appropriate guardrails to prevent sharing sensitive information with third-party LLM providers. We wanted to make it highly available, and, again, provide the option of multiple LLM providers to choose from.
Building off of those enablement philosophies, the very next thing we shipped, in June of 2023, was our own PII redaction model. We leveraged Microsoft’s Presidio framework along with an NER model we developed internally to detect and redact any potentially sensitive information prior to sending it to OpenAI or any external LLM provider. Here’s a screenshot of our PII redaction model in action. I provide this dummy phone number: I would like you to give me a call at this number. The number is recognized by our PII redaction model as potentially sensitive PII, so it actually gets redacted prior to being sent to the external provider.
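For a sense of what that redaction step looks like in code, here is a minimal Presidio sketch; the sample phone number is made up, and wiring in a custom in-house NER model as an extra recognizer is an assumption for illustration.

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()      # an in-house NER model could be registered as an extra recognizer
anonymizer = AnonymizerEngine()

text = "I would like you to give me a call at 416-555-0123."

# Detect potentially sensitive entities, then redact them before the
# prompt ever leaves our environment.
findings = analyzer.analyze(text=text, language="en")
redacted = anonymizer.anonymize(text=text, analyzer_results=findings)

print(redacted.text)  # e.g. "I would like you to give me a call at <PHONE_NUMBER>."
```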
What was interesting is that with the PII redaction model, while we closed a gap in security, we actually introduced a different gap in the user experience. One piece of feedback we heard from a lot of people was that the PII redaction model is not always accurate, so a lot of the time it interfered with the accuracy and relevance of the answers provided. Two, for them to effectively leverage LLMs in their day-to-day work, the tools needed to be able to accept some degree of PII, because that fundamentally was the data that they worked with. For us, going back to our philosophy of making the right way the easy way, we started to look into self-hosting open source LLMs. The idea was that by hosting these LLMs within our own VPCs, we didn’t have to run the PII redaction model. We could encourage people to send any information to these models, because the data would stay within our cloud environments. We spent the next month building a simple framework using llama.cpp for self-hosting quantized open source LLMs. The first three models that we started self-hosting were Llama, it was Llama 2 at the time, the Mistral models, and also Whisper, which OpenAI had open sourced. I know technically Whisper is not an LLM, it’s a voice transcription model. For simplicity, we included it under the umbrella of our LLM platform.
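Here is a minimal sketch of what serving one of these models with llama-cpp-python might look like; the model path and quantization level are assumptions for illustration.

```python
from llama_cpp import Llama

# Load a quantized Llama 2 chat model from local disk; nothing leaves our VPC.
llm = Llama(model_path="./models/llama-2-13b-chat.Q4_K_M.gguf", n_ctx=4096)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this client ticket: ..."}],
    max_tokens=256,
)
print(resp["choices"][0]["message"]["content"])
```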
After introducing these self-hosted LLMs, we made a fast follow by introducing retrieval augmented generation as an API, which also included a very deliberate choice of vector database. We heard in a lot of the feedback, and we saw in both industry trends and our own use cases, that the most powerful applications of LLMs involved grounding them against context that was relevant to the company. In making these investments within our LLM platform, we first introduced Elasticsearch as our vector database. We built pipelines and DAGs in Airflow, our orchestration framework, to update and index our common knowledge bases. We offered a very simple semantic search as our first RAG API.
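A hedged sketch of what that first semantic search API could look like against Elasticsearch; the cluster URL, index name, embedding field, and embedding model are assumptions, not our production configuration.

```python
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("https://elasticsearch.internal:9200")  # assumed internal cluster URL
embedder = OpenAI()


def semantic_search(query: str, index: str = "help-articles", k: int = 5) -> list[dict]:
    # Embed the query, then run an approximate kNN search over the indexed chunks.
    vector = embedder.embeddings.create(
        model="text-embedding-ada-002", input=query
    ).data[0].embedding
    results = es.search(
        index=index,
        knn={"field": "embedding", "query_vector": vector, "k": k, "num_candidates": 50},
    )
    return [hit["_source"] for hit in results["hits"]["hits"]]
```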
We encouraged our developers and our end users to build upon these APIs and building blocks in order to leverage LLMs grounded against our company context. What we found very interesting was that even though grounding was one of the things that a lot of our end users asked for, and even though intuitively it made sense as a useful building block within our platform, the engagement and adoption was actually very low. People were not expanding our knowledge bases as we thought they would. They were not extending the APIs. There was very little exploration being done. We realized that we probably didn’t make this easy enough. There was still a gap when it came to experimentation. There was still a gap when it came to exploration. It was hard for people to get feedback on the LLM and GenAI products they were building.
Recognizing that, one of the next things we invested in was what we called our data applications platform. We built an internal service. It runs on Python and Streamlit. We chose that stack because it’s easy to use and it’s something a lot of our data scientists were familiar with. Once again, we put this behind Okta, made it available behind our VPNs, and created what was essentially a platform on which it was very easy to build new applications and iterate on them. The idea was that data scientists and developers, or really anyone who was interested and willing to get a little bit technical, could build their own applications, have them run on the data applications platform, and create a very fast feedback loop to share with stakeholders and get feedback.
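To give a feel for why Streamlit made iteration so fast, here is a hedged sketch of a small chat-style proof of concept of the kind that could run on such a platform; the app name, model, and prompt are placeholders, and in practice calls like this would presumably route through the LLM gateway.

```python
import streamlit as st
from openai import OpenAI

client = OpenAI()  # in practice, calls would be routed through the LLM gateway

st.title("Ticket summarizer (proof of concept)")

if "history" not in st.session_state:
    st.session_state.history = []

prompt = st.chat_input("Paste a client ticket to summarize")
if prompt:
    st.session_state.history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(
        model="gpt-4o", messages=st.session_state.history
    ).choices[0].message.content
    st.session_state.history.append({"role": "assistant", "content": reply})

# Re-render the conversation on every interaction.
for message in st.session_state.history:
    with st.chat_message(message["role"]):
        st.write(message["content"])
```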
In a lot of cases, these proof-of-concept applications expanded into something much bigger. Within just the first two weeks of launching our data applications platform, we had over seven applications running on it. Of those seven, two eventually made it into production, where they’re adding value, optimizing operations, and creating a more delightful client experience. With the introduction of our data applications platform, our LLM platform was also starting to come together. This is a very high-level diagram of what it looks like. In the first row, a lot of our contextual data, our knowledge bases, is ingested through our Airflow DAGs into our embedding models, and then updated and indexed in Elasticsearch. We also chose LangChain to orchestrate our data applications, which sits very closely with both our data applications platform and our LLM service. Then we have the API for our LLM gateway through our LLM service, tightly integrated with our production environments.
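As a rough sketch of the ingestion side of that diagram, here is what one of those Airflow DAGs could look like: pull a knowledge base, embed it, and index it into Elasticsearch. The schedule, index name, embedding model, and loader stub are all assumptions for illustration, not our actual pipeline.

```python
from datetime import datetime

from airflow.decorators import dag, task


def load_help_articles() -> list[dict]:
    """Stub loader; in practice this would pull from the knowledge-base source."""
    return [{"id": "kb-1", "title": "Example", "body": "Example article body."}]


@dag(schedule="@daily", start_date=datetime(2023, 9, 1), catchup=False)
def refresh_help_articles_index():

    @task
    def extract_documents() -> list[dict]:
        return load_help_articles()

    @task
    def embed_and_index(documents: list[dict]) -> None:
        from elasticsearch import Elasticsearch
        from openai import OpenAI

        es = Elasticsearch("https://elasticsearch.internal:9200")
        embedder = OpenAI()
        for doc in documents:
            vector = embedder.embeddings.create(
                model="text-embedding-ada-002", input=doc["body"]
            ).data[0].embedding
            # Upsert the document along with its embedding for kNN search.
            es.index(index="help-articles", id=doc["id"],
                     document={**doc, "embedding": vector})

    embed_and_index(extract_documents())


refresh_help_articles_index()
```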
As our LLM platform came together, we also started building internal tools that we thought would be very powerful for employee productivity. At the end of 2023, we built a tool we called Boosterpack, which combines a lot of the reusable building blocks that I mentioned earlier. The idea of Boosterpack is that we wanted to provide a personal assistant grounded against Wealthsimple context for all of our employees. We wanted to run this on our cloud infrastructure with three different types of knowledge bases, the first being a public knowledge base, accessible to everyone at the company, with source code, help articles, and financial newsletters. The second would be a private knowledge base for each employee, where they can store and query their own personal documents.
The third is a limited knowledge base that can be shared with a limited set of coworkers, delineated by role and project. This is what we call the Wealthsimple Boosterpack. I have a short video of what it looks like. Boosterpack was one of the applications we actually built on top of our data applications platform. In this recording, I’m uploading a file, a study of the economic benefits of productivity through AI, and adding it to a private knowledge base for myself. Once this knowledge base is created, I can leverage the chat functionality to ask questions about it. Alongside the question answering functionality, we also provided the source, and this was really effective, especially when documents were part of our knowledge bases. You could actually see where the answer was sourced from, and the link would take you there if you wanted to do any fact checking or further reading.
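A hedged sketch of the answer-with-sources pattern Boosterpack relies on: retrieve relevant passages, answer only from them, and surface the source links for fact checking. The retrieval step, model, and prompt are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()


def answer_with_sources(question: str, chunks: list[dict]) -> dict:
    """`chunks` are retrieved knowledge-base passages, each with 'text' and 'url' keys."""
    context = "\n\n".join(f"[{i}] {c['text']}" for i, c in enumerate(chunks))
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Answer using only the numbered context passages; cite them as [n]."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return {
        "answer": completion.choices[0].message.content,
        "sources": [c["url"] for c in chunks],  # surfaced so users can fact-check or read further
    }
```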
LLM Journey (2024)
2023 ended with a lot of excitement. We rounded the year off by introducing our LLM gateway, introducing self-hosted models, providing a RAG API, and building a data applications platform. We ended the year by building what we thought would be one of our coolest internal tools ever. We were in for a bit of a shock when it came to 2024. This graph is Gartner’s hype cycle, which maps out how expectations evolve for emerging technologies. This is very relevant for generative AI: in 2023, most of us were entering the peak of inflated expectations. We were so excited about what LLMs could do for us. We weren’t exactly sure in concrete ways where the business alignment came from, but we had the confidence, and we wanted to make big bets in this space.
On the other hand, as we were entering 2024, it was sobering for us as a company and for the industry as a whole. We realized that not all of our bets had paid off, and that in some cases we may have indexed a little bit too much into investments in generative AI, or into building tools for GenAI. What this meant for us at Wealthsimple in particular was that our strategy evolved to be a lot more deliberate. We started focusing a lot more on business alignment and on how we can get business alignment with our generative AI applications. There was less appetite for bets. There was less appetite for, let’s see what happens if we swap this out for one of the best performing models. We became a lot more deliberate and nuanced in our strategy as a whole. In 2024, we actually spent a big chunk of time at the beginning of the year just going back to our strategy, talking to end users, and thinking really deeply about the intersection between generative AI and the values our business cared about.
The first thing we actually did as a part of our LLM journey in 2024 was unship something we built in 2023. When we first launched our LLM gateway, we introduced the nudge mechanisms, which were the gentle Slack reminders for anyone not using our gateway. Long story short, it wasn’t working. We found very little evidence that the nudges were changing behavior. It was the same people getting nudged over and over again, and they became conditioned to ignore it. Instead, what we found was that improvements to the platform itself were a much stronger driver of behavioral change. We got rid of these mechanisms because they weren’t working and they were just causing noise.
Following that, in May of this year, we started expanding the LLM providers that we wanted to offer. The catalyst for this was Gemini. Around that time, Gemini had launched their 1 million token context window models, and these were later replaced by the 2-plus million ones. We were really interested to see what this could do for us and how it could circumvent a lot of our previous challenges with context window limitations. We spent a lot of time thinking about the providers we wanted to offer, and building the foundations and building blocks to first introduce Gemini, and eventually other providers as well. A big part of 2024 has also been about keeping up with the latest trends in the industry.
In 2023, a lot of our time and energy was spent on making sure we had the best state-of-the-art model available on our platform. We realized that this was quickly becoming a losing battle, because the state-of-the-art models were evolving. They were changing every week or every few weeks. That strategy shifted in 2024: instead of focusing on the models themselves, we took a step back and focused at a higher level on the trends. One of the emerging trends to come out this year was multi-modal inputs. Who knew you could have even lower-friction mediums of interacting with generative AI? Forget about text, now we can send a file or a picture. This was something that caught on really quickly within our company. We started out by leveraging Gemini’s multi-modal capabilities. We added a feature within our gateway where our end users could upload either an image or a PDF, and the LLM would be able to drive the conversation with an understanding of what was sent.
Within the first few weeks of launching this, close to a third of all our end users were using the multi-modal feature at least once a week. One of the most common use cases we found was when people were running into issues with our internal tools, when they were running into programming errors, or even errors working with our BI tool. As humans, if you’re a developer and someone sends you a screenshot of their stack trace, that’s an antipattern. We would want to get the text copy of it. Where humans offered very little patience for that sort of thing, LLMs embraced it. Pretty soon, we were actually seeing behavioral changes in the way people communicate, because LLMs’ multi-modal inputs made it so easy to just throw a screenshot, throw a message. A lot of people were doing it fairly often. This may not necessarily be one of the good things to come out of it, but the silver lining is we did provide a very simple way for people to get the help they needed in the medium they needed.
Here is an example of an error someone encountered when working with our BI tool. This is a fairly simple error. If you asked our gateway, I keep running into this error message while refreshing MySQL dashboard, what does this mean? It actually provides a fairly detailed list of how to diagnose the problem. Now, of course, you could get the same results by just copying and pasting it, but for a lot of our less technical users, it’s sometimes a little bit hard to distinguish the actual error message from the full result.
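As a hedged sketch of the kind of multi-modal call behind this feature, here is what sending a screenshot to Gemini might look like with Google’s generative AI Python SDK; the API key handling, model name, and file name are assumptions.

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="...")  # assumed: the key would be injected from the environment in practice

model = genai.GenerativeModel("gemini-1.5-pro")
screenshot = Image.open("dashboard_error.png")  # the user's uploaded screenshot

# The image and the question go in together; the model reads the error from the screenshot.
response = model.generate_content(
    [screenshot, "I keep running into this error while refreshing my dashboard. What does it mean?"]
)
print(response.text)
```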
After supporting multi-modal inputs, the next thing we added to our platform was Bedrock. Bedrock was a very interesting addition, because it marked a shift in our build versus buy strategy. Bedrock is AWS’s managed service for interacting with foundational large language models, and it also provides the ability to deploy and fine-tune these models at scale. There was a very big overlap between everything we had been building internally and what Bedrock had to offer. We had actually considered Bedrock back in 2023 but said no to it, in favor of building up a lot of these capabilities ourselves. Our motivation at that time was to build up the confidence and know-how internally to deploy these technologies at scale. With 2024 being a very different year, this was also a good inflection point for us, as we shifted and reevaluated our build versus buy strategy.
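For illustration, here is a hedged sketch of calling an Anthropic model through Bedrock’s Converse API with boto3; the region and model ID are assumptions, not a statement of which models we run.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Draft a reply to this client ticket: ..."}]}],
    inferenceConfig={"maxTokens": 512},
)
print(response["output"]["message"]["content"][0]["text"])
```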
The three points I have on the slide are our top considerations when it comes to build versus buy. The first is that we have a baseline requirement for security and privacy. If we want to buy something, it needs to meet that. The second is the consideration of time to market and cost. The third, and this was something that changed a lot between 2023 and 2024, is considering and evaluating our unique points of leverage, otherwise known as the opportunity cost of building something as opposed to buying it. There were a lot of trends that drove the evolution of this strategy and thinking. The first was that vendors’ and LLM providers’ security awareness got a lot better over time. LLM providers were offering mechanisms for zero data retention. They were becoming a lot more integrated with cloud providers. They had learned a lot from the risks and pitfalls of the previous year to know that consumers cared about these things.
The second trend that we’ve seen, and this was something that affected us a lot more internally, is that as we got a better understanding of generative AI, it also meant we had a better understanding of how to apply it in ways that add value and increase business alignment. Oftentimes, getting the most value out of our work is not about building GenAI tools that already exist on the marketplace. It’s about looking deeply into what we need as a business and evaluating the intersections with generative AI there. Both of these points shifted our strategy from what was initially very build focused to being a lot more buy focused. The last point I’ll mention, which makes this whole thing a lot more nuanced, is that over the past year to two years, a lot more vendors, both existing and new, are offering GenAI integrations. Almost every single SaaS product has an AI add-on now, and they all cost money.
One analogy we like to use internally is that this is really similar to the streaming versus cable paradigm, where, once upon a time, getting Netflix was a very economical decision when contrasted against the cost of cable. Today, with all of the streaming services, you can easily be paying a lot more for them than what you had initially been paying for cable. We found ourselves running into a similar predicament when evaluating all of these additional GenAI offerings provided by our vendors. All that is to say, the decision for build versus buy has gotten a lot more nuanced today than it was even a year ago. We’re certainly more open to buying, but there are a lot of considerations in making sure we’re buying the right tools that add value and are not just providing duplicate value.
After adopting Bedrock, we turned our attention to the API that we offered for interacting with our LLM gateway. When we first put together our gateway, when we first shipped it and offered this API, we didn’t think too deeply about what the structure should look like, and this ended up being a decision that we would regret. As OpenAI’s API spec became the gold standard, we ran into a lot of headaches with integrations. We had to monkey patch and rewrite a lot of code from LangChain and other libraries and frameworks because we didn’t offer a compatible API structure. We took some time in September of this year to ship v2 of our API, which does mirror OpenAI’s API spec. The lesson we learned here was that as the industry and the tools and frameworks within GenAI mature, it’s important to think about how those providers are thinking about what the right standards and integrations are.
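The payoff of mirroring OpenAI’s spec is that standard SDKs and frameworks work against the gateway out of the box, roughly like this; the base URL and auth token are assumptions for illustration.

```python
from openai import OpenAI

# Point the stock OpenAI client (or a framework like LangChain, which accepts
# the same base_url override) at the gateway instead of api.openai.com.
client = OpenAI(
    base_url="https://llm-gateway.internal/v2",  # assumed internal gateway URL
    api_key="internal-token",                    # assumed gateway-issued credential
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain this stack trace: ..."}],
)
print(response.choices[0].message.content)
```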
Learnings
This brings us a lot closer to where we are today. Over the past few years, our platform, our tools, and the landscape have changed a lot, and we’ve had a lot of learnings along the way. Alongside these learnings, we also gained a better understanding of how people use these tools and what they use them to do. I wanted to share some statistics that we’ve gathered internally on this usage. The first is that there is, at least within Wealthsimple, a very strong intersection between generative AI and productivity.
In the surveys and the client interviews we did, almost everyone who used LLMs found them to significantly increase or improve their productivity. This is more of a qualitative measure. We also found that LLM gateway adoption was fairly uniform across tenure and level. It’s a fairly even split between individual contributors and people leaders. This was great affirmation for us, because we had spent a lot of time building a tool and a platform that was very much bottom-up driven. It was good affirmation that we were offering tools that were genuinely delightful and frictionless for our end users.
In terms of how we were leveraging LLMs internally, this data is a few months out of date, but we actually spent some time annotating a lot of the use cases. The top usage was programming support. Almost half of all the usage was some variation of debugging, code generation, or just general programming support. The second was content generation/augmentation, so, “Help me write something. Change the style of this message. Complete what I had written”.
Then the third category was information retrieval. A lot of this was focused around research or parsing documents. What’s interesting is that almost all the use cases we saw basically fell within these three buckets; there were very few use cases outside of them. We also found that about 80% of our LLM usage came through our LLM gateway. This is not going to be a perfectly accurate measure, because we don’t have a comprehensive list of all of the direct LLM accesses out there, but only about 20% of our LLM traffic hit the providers directly, and most of it came through the gateway. We thought this was pretty cool. We also learned a lot of lessons in behavior. One of our biggest takeaways this year was that, as our LLM tooling became more mature, we learned that our tools are the most valuable when injected into the places we do work, and that the movement of information between platforms is a huge detractor. We wanted to create a centralized place for people to do their work.
An antipattern to this would be if they needed seven different tabs open for all of their LLM or GenAI needs. Having to visit multiple places for generative AI is a confusing experience, and we learned that even as the number of tools grew, most people stuck to using a single tool. We wrapped up 2023 thinking that Boosterpack was going to fundamentally change the way people leverage this technology. That didn’t really happen. We had some good bursts in adoption, there were some good use cases, but at the end of the day, we had actually bifurcated our tools and created two different places for people to meet their GenAI needs. That was detrimental for both adoption and productivity. The learning here is that we need to be a lot more deliberate about the tools we build, and we need to invest in centralizing a lot of this tooling. Because even though this is what people said they wanted, even though it intuitively made sense, user behavior for these tools is a tricky thing, and it will often surprise us.
GenAI Today
Taking all of these learnings, I wanted to share a little bit more about generative AI today at Wealthsimple, how we’re using it, and how we’re thinking about it going into 2025. The first thing is that, in spite of the missteps we’ve made, overall, Wealthsimple really loves LLMs. Across all the different tools we offer, over 2200 messages get sent daily. Close to a third of the entire company are weekly active users. Slightly over half of the company are monthly active users. Adoption and engagement for these tools are really great.
At the same time, the feedback that we’re hearing is that it is helping people be more productive. All of the lessons, foundations, and guardrails that we developed for employee productivity also pave the way to providing a more delightful client experience. These internal tools establish the building blocks to build and develop GenAI at scale, and they’re giving us the confidence to find opportunities to optimize operations for our clients. By providing the freedom for anyone at the company to freely and securely explore this technology, we had a lot of organic extensions and additions that involved generative AI, a lot of which we had never thought of before. As of today, we actually do have a lot of use cases, both in development and in production, that are optimizing operations. I wanted to share one of them. This is what our client experience triaging workflow used to look like. Every single day, we get a lot of tickets, both through text and through phone calls from our clients.
A few years ago, we actually had a team dedicated to reading all of these tickets and triaging them: which team should these tickets be sent to so that the clients can get their issue resolved? Pretty quickly, we realized this was not a very effective workflow, and the people on this team didn’t enjoy what they were doing. We developed a transformer-based model to help with this triage. This is what we’re calling our original client experience triaging workflow. This model only worked for emails. It would take the ticket and then map it to a topic and subtopic. This classification would determine where the ticket gets sent. This was one of the areas which very organically extended into generative AI, because the team working on it had experimented with the tools that we offered. With our LLM platform, there were two improvements that were made.
The first is that by leveraging Whisper, we could extend triaging to all tickets, not just emails. Whisper would transcribe any phone calls into text first, and then the text would be passed into the downstream system. Second, generations from our self-hosted LLMs were used to enrich the classification, so we were actually able to get huge performance boosts, which translated into many hours saved for both our client experience agents and our clients themselves (a rough sketch of this flow is shown below). Going back to the hype chart: in 2023 we were climbing up that peak of inflated expectations. 2024 was a little bit sobering as we made our way down. Towards the end of this year, and as we head into next year, I think we’re on a very good trajectory to ascend that slope of enlightenment. Even with the ups and downs over the past two years, there’s still a lot of optimism, and there’s still a lot of excitement for what next year could hold.
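A hedged sketch of that upgraded triaging flow: transcribe the call with Whisper, enrich with a self-hosted LLM, then classify. The model names, endpoint, topic labels, and the zero-shot classification stand-in are illustrative assumptions, not the production pipeline.

```python
import whisper
from openai import OpenAI

transcriber = whisper.load_model("base")
# Assumed: a self-hosted, OpenAI-compatible endpoint (for example a llama.cpp server).
llm = OpenAI(base_url="http://llm.internal:8080/v1", api_key="unused")


def triage_ticket(audio_path: str) -> dict:
    # 1. Transcribe the phone call into text.
    transcript = transcriber.transcribe(audio_path)["text"]

    # 2. Enrich: generate a concise summary to feed the downstream classifier.
    summary = llm.chat.completions.create(
        model="llama-2-13b-chat",
        messages=[{"role": "user",
                   "content": f"Summarize this support call in two sentences:\n{transcript}"}],
    ).choices[0].message.content

    # 3. Classify into a topic; a zero-shot LLM call stands in here for the
    #    transformer-based classifier used in production.
    topic = llm.chat.completions.create(
        model="llama-2-13b-chat",
        messages=[{"role": "user",
                   "content": "Classify this ticket as one of: funding, trading, "
                              "account-access, other.\n" + summary}],
    ).choices[0].message.content.strip()

    return {"transcript": transcript, "summary": summary, "topic": topic}
```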
Questions and Answers
Participant 1: When it comes to helping people realize that putting lots of personal information into an LLM is not necessarily a safe thing to do, how did you ensure, from a user education standpoint, that people weren’t sharing that compromising data?
Gu: I think there are two parts to this. One is that we found, over the years of introducing new tools, that good intentions are not necessarily enough. We couldn’t just trust people. We couldn’t just focus on education. We actually needed the guardrails and the mechanisms within our systems to guide them and ensure they make the right decision, beyond just informing them about what to do. That was one part of our philosophy. I think, to your point, definitely leveling up that understanding of the security risk was very important. Being a financial services company, we work with very sensitive information for our clients. As a part of our routine training, there’s a lot of education already about what is acceptable to share and what is not acceptable to share. The part that was very hard for people to wrap their heads around is what happens when this information is being shared directly with OpenAI, for instance, or, in a lot of cases, fourth-party data sharing.
For instance, Slack has their AI integration. Notion has their AI integration. What does that mean? To an extent, it does mean all of this information will get sent to the providers directly. That was the part that was really hard for people to wrap their heads around. This is definitely not a problem that we’ve solved, but some of the ways that we’ve been trying to raise that awareness is through onboarding. We’ve actually added a component to all employee onboarding that includes guidelines for proper AI usage. We’ve also added a lot more education for leaders and individuals in the company who may be involved in the procurement process for new vendors, about the implications that may have from a security point of view.
Participant 2: What did the data platform consist of, and how did you use it in your solution?
Gu: There’s definitely a very close intersection between our data platform and our machine learning platform. For instance, one of the bread-and-butter pieces of our data platform is our orchestration framework, Airflow. That was something we used to update the embeddings within our vector database and make sure it was up to date with our knowledge bases. Outside of that, when it comes to exploration, and especially for our data scientists as they’re building new LLM and ML products, there’s a very close intersection between the data we have available in our data warehouse and the downstream use cases. I would call those two out as the biggest intersections.
Participant 3: Early in the conversation, you talked about Elasticsearch as your vector database capability for similarity search for RAG purposes. Later, you talked about transitioning to Bedrock. Did you keep Elasticsearch, or did you get off of that when you transitioned to Bedrock?
Gu: We didn’t get off of that. Actually, we’re using OpenSearch, which is AWS’s managed version of Elasticsearch. At the time, we chose OpenSearch/Elasticsearch because it was already part of our stack. It was easy to make that choice. We didn’t go into it thinking that this would be our permanent choice. We understand this is a space that evolves a lot. Right now, Bedrock is still fairly new to us. We’re primarily using it to extend our LLM provider offerings, specifically for Anthropic models. We haven’t dug into or evaluated as deeply their vector database, fine-tuning, or other capabilities. I think that’s definitely one of the things we want to dig deeper into in 2025 as we’re looking into what the next iteration of our platform would look like.
Participant 3: Are you happy with the similarity results that you’re getting with OpenSearch?
Gu: I think we are. I think studies have shown that this is usually not the most effective way of doing things, from a performance and relevancy perspective at least. Where we’re really happy with it is that, one, it’s easy to scale. Latency is really good. It’s just overall simple to use. I think depending on the use case, using a reranker or leveraging a different technique may be better suited.