Want To Train ChatGPT On Your OWN Data? Here's How To Do It

How to Train large language models (LLMs) like ChatGPT on Private Data Using Azure AI Hub

It has been observed that people become more proactive when they incorporate Generative AI into their day-to-day workflows. Generative AI enhances proactivity when utilized effectively. Its benefits extend not only to individuals but also to entire industries.

There have been rapid advancements in Generative AI technology, evolving from simple text generation to analyzing large amounts of data and generating images. ChatGPT can answer a wide range of questions, but if you are not widely known and want to ask specific questions about yourself, ChatGPT might not have the answers. However, with the help of Azure AI, large language models (LLMs) like ChatGPT can be customized to meet individual requirements, allowing it to function as a personalized AI assistant,

How to make your own personalized AI assistant or chatbot

Building your own AI assistant or chatbot is exciting, but you need to consider various components for it, such as the LLM module, AI services, orchestration, deployments, monitoring, and more.

If you are an Azure enthusiast, Azure AI Studio can be used for this purpose. In this article, I will be using it.

Azure AI Studio is a collection of various AI modules and services. It also provides a platform for building AI tools. It supports no-code, low-code, and full-code approaches.

How to access Azure AI Studio: It is part of portal.azure.com. You need to create an account if you don’t already have one. Login to Azure portal and Search for Azure AI Foundry , and click on it.

Click on Hub

Provide the required information for your project, then click on Review + Create, and finally, click on Create

Click on Go to resource

Click on Launch Azure AI Foundry

A separate window for https://ai.azure.com will open. Click on + New project, enter the project name, and click Create.

Project Name: aidemoproject created

Click on Models + endpoints and then +Deploy model

Click on Deploy base model

Select model of your choice. I am selecting gpt-4o-mini and clicking on Confirm

Check the Token per Minute Rate Limit as per your application requirements. In this example I took minimum as 1k and clicked on Deploy

I deployed gpt-4o-mini, Endpoint can be used in external applications.

Here I am going to deployed model ((gpt-4o-mini) in our chat playground. Click on playgrounds then Try the Chat playground

Give instruction to model, what you want it to do and click on Apply changes

Now Chat playground is ready to use. Type any word and see the result

I got below output from model

It is working in same way, what was instructed. I created own chat bot with custom system message. Here I can upload our data and model can use that data to answer of questions.

Train LLM model on private Data

You can retrieve data for LLM module in different way, such as Azure Blob Storage, get data with storage URL, upload files/folders.

Here I will be using upload files/folders option.

Click on Data + indexes and then + New data, I am going to create a new data source, where I can upload our data and train LLM module to answers on the basis on that data.

Click on Upload files/folders as for this demo I am going to upload data directly to Azure AI studio but other option also available to store private data.

I just uploaded my resume, now click Next

Now give Data name, in this example I gave as personaldata and click on Create

You can see your uploaded data path on next screen

Now I have created data store name as personaldata, now time to create index, click on Data + indexes

Click on Indexes

Click on + New index

Click on Data in Azure AI Foundry

Select data source what was created before.

Click on Create a new Azure AI Search resource

It will ask you to login with https://portal.azure.com , create a search service there and provide values as per your requirements, click on Review + create then Create