How to Train large language models (LLMs) like ChatGPT on Private Data Using Azure AI Hub
It has been observed that people become more proactive when they incorporate Generative AI into their day-to-day workflows. Generative AI enhances proactivity when utilized effectively. Its benefits extend not only to individuals but also to entire industries.
There have been rapid advancements in Generative AI technology, evolving from simple text generation to analyzing large amounts of data and generating images. ChatGPT can answer a wide range of questions, but if you are not widely known and want to ask specific questions about yourself, ChatGPT might not have the answers. However, with the help of Azure AI, large language models (LLMs) like ChatGPT can be customized to meet individual requirements, allowing it to function as a personalized AI assistant,
How to make your own personalized AI assistant or chatbot
Building your own AI assistant or chatbot is exciting, but you need to consider various components for it, such as the LLM module, AI services, orchestration, deployments, monitoring, and more.
If you are an Azure enthusiast, Azure AI Studio can be used for this purpose. In this article, I will be using it.
Azure AI Studio is a collection of various AI modules and services. It also provides a platform for building AI tools. It supports no-code, low-code, and full-code approaches.
How to access Azure AI Studio: It is part of portal.azure.com. You need to create an account if you don’t already have one. Login to Azure portal and Search for Azure AI Foundry , and click on it.
Click on Hub
Provide the required information for your project, then click on Review + Create, and finally, click on Create
Click on Go to resource
Click on Launch Azure AI Foundry
A separate window for https://ai.azure.com will open. Click on + New project, enter the project name, and click Create.
Project Name: aidemoproject created
Click on Models + endpoints and then +Deploy model
Click on Deploy base model
Select model of your choice. I am selecting gpt-4o-mini and clicking on Confirm
Check the Token per Minute Rate Limit as per your application requirements. In this example I took minimum as 1k and clicked on Deploy
I deployed gpt-4o-mini, Endpoint can be used in external applications.
Here I am going to deployed model ((gpt-4o-mini) in our chat playground. Click on playgrounds then Try the Chat playground
Give instruction to model, what you want it to do and click on Apply changes
Now Chat playground is ready to use. Type any word and see the result
I got below output from model
It is working in same way, what was instructed. I created own chat bot with custom system message. Here I can upload our data and model can use that data to answer of questions.
Train LLM model on private Data
You can retrieve data for LLM module in different way, such as Azure Blob Storage, get data with storage URL, upload files/folders.
Here I will be using upload files/folders option.
Click on Data + indexes and then + New data, I am going to create a new data source, where I can upload our data and train LLM module to answers on the basis on that data.
Click on Upload files/folders as for this demo I am going to upload data directly to Azure AI studio but other option also available to store private data.
I just uploaded my resume, now click Next
Now give Data name, in this example I gave as personaldata and click on Create
You can see your uploaded data path on next screen
Now I have created data store name as personaldata, now time to create index, click on Data + indexes
Click on Indexes
Click on + New index
Click on Data in Azure AI Foundry
Select data source what was created before.
Click on Create a new Azure AI Search resource
It will ask you to login with https://portal.azure.com , create a search service there and provide values as per your requirements, click on Review + create then Create
Now go back to https://ai.azure.com prompt and select created AI Search service aisearchservice124 and click on Next
Select vector search
Click on Create vector index . This step may takes couples of minutes
Status will be Completed once index is created. Index creation took me about 10 minutes.
Now click on Playgrounds and then Try the Chat playground
Select LLM model of your choice and then index what was created
Now type the question related that resume and you will get answers of that