Elevating AI Models: The Art and Science of OpenAI Fine-Tuning

Introduction

Fine-tuning, in the realm of machine learning and AI, refers to the process of taking a pre-trained model and adapting it to a new, more specific task or dataset.

This method is particularly valuable when working with large-scale language models or deep neural networks like those offered by OpenAI. Instead of training a new model from scratch, which can be resource-intensive and time-consuming, fine-tuning leverages the existing knowledge encoded within a pre-trained model. By adjusting the model's parameters, often through additional training on task-specific data, fine-tuning enables practitioners to customize the model's learned representations, allowing it to excel in a specialized domain or perform a particular task with increased accuracy and efficiency.

This approach democratizes the use of sophisticated AI models, empowering developers, researchers, and businesses to adapt powerful pre-existing models to their unique needs, thus accelerating the deployment of AI solutions across various industries and applications.

Keywords

AI Prompt - any form of text, question, information, or code that communicates to the AI what response you are looking for. It is like giving directions so you get the right response or action from the AI.
AI tokens - the units of text a model reads and writes. Tokens can be thought of as pieces of words or short groups of characters; a short, common English word is often a single token, while longer words are split into several.

Which are the advantages?

  1. Enhanced Precision: Fine-tuning refines models for more precise and accurate outcomes compared to basic prompting.
  2. Expanded Training Data Capacity: It allows incorporating a larger volume of examples beyond what a prompt can accommodate.
  3. Efficient Token Utilization: Fine-tuning optimizes token usage, reducing the length of prompts required for effective training.
  4. Faster Response Times: It enables quicker processing of requests, leading to reduced latency in model outputs.

Enhanced Precision

Fine-tuning elevates model accuracy by adjusting the model's parameters, aligning it more closely with specific tasks or contexts. This process sharpens the model's understanding, allowing it to produce more refined and reliable results compared to prompting alone. By adapting to nuanced patterns and intricacies within the data, fine-tuning significantly enhances the precision and quality of the model's outputs.

Expanded Training Data Capacity

Fine-tuning enables the integration of a larger and more diverse set of examples than what can be feasibly accommodated within a single prompt. This capacity for extensive training data empowers the model to grasp a broader spectrum of language nuances, variations, and scenarios, ultimately enhancing its adaptability and performance across a wide range of tasks and domains.

Efficient Token Utilization

With fine-tuning, models can be trained more efficiently, optimizing the usage of tokens and reducing the dependency on longer prompts. By fine-tuning, the model can extract more meaningful insights from shorter, more targeted inputs, leading to more resourceful and concise interactions while maximizing the utility of available tokens.

Faster Response Times

Fine-tuned models exhibit improved efficiency in processing requests, resulting in reduced latency and faster response times. The model's refined parameters and focused training enable it to swiftly generate accurate outputs, enhancing the overall user experience by providing quicker and more responsive interactions. This speed enhancement is particularly advantageous in real-time or time-sensitive applications.

When to use fine-tuning?

Opt for OpenAI fine-tuning when your task demands specialized domain knowledge, specific input-output mappings, or enhanced task performance.

Domain-Specific Tasks: Utilize fine-tuning when your task involves domain-specific knowledge, such as medical diagnosis, legal document analysis, or code generation. This approach assists the model in comprehending specialized terminologies and contextual nuances within a specific field.

Specific Input-Output Mapping:  Employ fine-tuning if your task requires a particular input-output format or structure. Fine-tuning enables the model to learn and generate outputs in the desired format, catering precisely to the specified requirements of your task.

Improved Performance: Opt for fine-tuning when seeking enhanced performance metrics like accuracy, relevance, and task adherence. Fine-tuning has the potential to elevate the model's performance significantly compared to prompt engineering, ensuring better alignment with task objectives and expectations.

OpenAI models open for fine-tuning


This section explores OpenAI's pre-trained models available for fine-tuning. These models offer a customizable framework, empowering users to adapt AI capabilities for specific needs.

  1. gpt-3.5-turbo-1106 (recommended)
  2. gpt-3.5-turbo-0613
  3. babbage-002
  4. davinci-002
  5. gpt-4-0613 (experimental as of December 2023)

Steps of fine-tuning


In the process of fine-tuning, users first prepare a dataset and then upload this data for training. Subsequently, they develop an enhanced model by refining the existing one, ultimately leveraging this refined model to achieve improved performance across specific tasks or domains. Here are the necessary steps:

  1. Prepare Dataset
  2. Upload the training data
  3. Develop a new, refined model.
  4. Utilize the refined model.

Step 1: Prepare Dataset

Preparing the dataset is a pivotal phase in the fine-tuning journey of an OpenAI model. It entails the careful creation of a series of structured JSON examples (one JSON object per line, the JSONL format), each acting as a simulated conversation. These examples represent a variety of potential questions or scenarios directly linked to a specific task. The primary objective at this stage is to immerse the AI in a diverse array of possible prompts, effectively acquainting it with the breadth of situations it may encounter. The difference from writing a plain prompt is that here you also provide the expected answer to each question.

In simpler terms, this step is about creating a full list of all the related situations or questions that are important for the subject. The purpose is to expose the AI to this array of scenarios, enabling it to grasp and learn from these examples. This tailored approach essentially educates the AI, allowing it to become adept at handling similar situations in the future.

By providing the AI with this specialized dataset, it gains a deeper understanding of the nuances within the specified domain. This focused training equips the AI with the knowledge and context necessary to generate more precise and contextually appropriate responses. Consequently, the AI becomes proficient in delivering well-informed and relevant outputs when presented with queries or prompts related to the specific task it was trained on.

Here is an example of a dataset for a model that validates whether a text is a valid job description or job profession:


{"messages": [{"role": "system", "content": "You are an assistant for a employment agency that validates and checks if the provided text content is a real and valid job description or job title/profession. The text might be in different languages"}, {"role": "user", "content": "Here is the provided text that I need you to validate :  Backend developer"}, {"role": "assistant", "content": "Got it. I have the text. How can I help?"},{"role": "user", "content": "Reply only with the result, nothing else. I need you to validate if the text is a real and valid job description or title. If it's a real job title or profession name, I want you to reply with {\"valid\":true}. Otherwise if the text is longer, then is this case we have a description. I want you to check if this description is a valid job description that contains skills and reponsibilities of an open position that can later be extracted.  So as a conclusion, if the text is a real job title or a real job job description that has responsibilities and skills that can be extracted return a json containing 'valid' as key and true or false as value. For example if the text is a 'doctor' return only {\"valid\":true}. For all other cases return {\"valid\":false}. Reply only with the json no additional words before or after it."},{"role": "assistant", "content": "{\"valid\": true}"}]},
{"messages": [{"role": "system", "content": "You are an assistant for a employment agency that validates and checks if the provided text content is a real and valid job description or job title/profession. The text might be in different languages"}, {"role": "user", "content": "Here is the provided text that I need you to validate :  Graphic Designer:"}, {"role": "assistant", "content": "Got it. I have the text. How can I help?"},{"role": "user", "content": "Reply only with the result, nothing else. I need you to validate if the text is a real and valid job description or title. If it's a real job title or profession name, I want you to reply with {\"valid\":true}. Otherwise if the text is longer, then is this case we have a description. I want you to check if this description is a valid job description that contains skills and reponsibilities of an open position that can later be extracted.  So as a conclusion, if the text is a real job title or a real job job description that has responsibilities and skills that can be extracted return a json containing 'valid' as key and true or false as value. For example if the text is a 'doctor' return only {\"valid\":true}. For all other cases return {\"valid\":false}. Reply only with the json no additional words before or after it."},{"role": "assistant", "content": "{\"valid\": true}"}]}

Points to Consider During Dataset Creation:

  1. Validity of JSONs: Ensure that every JSON example (each line of the file) is valid JSON, otherwise the file upload will fail.
  2. Quantity of Examples: The dataset's effectiveness directly correlates with its size and diversity. A larger, more varied collection of examples generally produces a more accurate model; a robust model typically requires a minimum of 500 examples to achieve the desired outcomes.
  3. Minimum Upload Requirement: To initiate the process, a minimum of 10 examples is required.
  4. Context limit: Right now, in December 2023, we can only fine-tune gpt-3.5-turbo, while gpt-4 fine-tuning is experimental and available only through the OpenAI platform. gpt-3.5-turbo-1106 supports a context of 16k tokens, so make sure each JSON sample never exceeds this limit; otherwise the sample will be truncated, which can distort what the model learns from your data.

Calculating the number of tokens per JSON

Here is a function with which you can easily calculate the number of tokens per sample. All you have to do is import the encode function from the gpt-tokenizer package and use it to convert the JSON string into tokens, then count them.


import { encode } from 'gpt-tokenizer';

// Counts how many tokens a JSON sample (or any string) produces.
export const findTokensNumberSample = (str: string): number => {
  const encodedSample = encode(str);
  const encodedSampleLength = encodedSample.length;
  console.log('Encoded sample length is:', encodedSampleLength);
  return encodedSampleLength;
};
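
Building on this helper, here is a small validation sketch (an assumption for illustration, not part of the original flow) that you can run over the JSONL file before uploading it: it checks that every line is valid JSON (point 1 above) and that no sample exceeds the context limit (point 4). The validateDataset name, the file handling, and the 16,000-token default are illustrative choices.

import * as fs from 'fs';
import { encode } from 'gpt-tokenizer';

// Hypothetical helper: checks every line of a JSONL dataset before upload.
export const validateDataset = (filePath: string, maxTokens = 16000) => {
  const lines = fs
    .readFileSync(filePath, 'utf-8')
    .split('\n')
    .filter((line) => line.trim().length > 0);

  lines.forEach((line, index) => {
    // Point 1: every line must be valid JSON, otherwise the upload will fail.
    try {
      JSON.parse(line);
    } catch {
      console.error(`Line ${index + 1} is not valid JSON`);
      return;
    }
    // Point 4: every sample must fit within the model's context window.
    const tokens = encode(line).length;
    if (tokens > maxTokens) {
      console.error(`Line ${index + 1} has ${tokens} tokens, exceeding ${maxTokens}`);
    }
  });

  console.log(`Checked ${lines.length} samples`);
};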

Step 2: Upload the training data

After creating the different JSON samples, it is time to upload the resulting file to the OpenAI servers. You can do this either through the web interface, by clicking the "Create" button, or programmatically.

If you decide to go with the programmatic approach, here is how you can do it:


import { OpenAI, ClientOptions } from 'openai';
import * as dotenv from 'dotenv';

dotenv.config();

const configuration: ClientOptions = {
  apiKey: process.env.OPENAI_API_KEY,
};

const openai = new OpenAI(configuration);

export default openai;

import * as fs from 'fs';
// Note: the FileObject type import path may vary slightly between SDK versions.
import { FileObject } from 'openai/resources';
import openai from './openAI';

// Uploads the JSONL training file to the OpenAI servers with the 'fine-tune' purpose.
export const uploadFileToOpenAI = async (filePath: string) => {
  console.log(`Uploading file`);
  const file = await openai.files.create({
    file: fs.createReadStream(filePath),
    purpose: 'fine-tune',
  });
  console.log(`Uploaded file with ID: ${file.id}`);
  console.log('-----');
  return file;
};

// Polls the uploaded file until OpenAI has finished processing it.
export const checkUploadedFileStatus = async (file: FileObject) => {
  console.log(`Waiting for file to be processed`);
  while (true) {
    file = await openai.files.retrieve(file.id);
    console.log(`File status: ${file.status}`);

    if (file.status === 'processed') {
      break;
    }
    if (file.status === 'error') {
      throw new Error('File processing failed; check the JSONL samples for syntax errors');
    }
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
};

When uploading a file containing JSON examples to the OpenAI servers, the duration may vary based on the file size and the number of examples included. To monitor progress, use the second function above, which reports one of three statuses:

- Uploaded: the file has been received and is waiting to be processed.

- Error: an issue arose during processing. Most commonly, errors are caused by invalid JSON examples, typically due to syntax errors within the file.

- Processed: the file has been processed successfully and is ready for fine-tuning.

Keep an eye on these statuses to track the progress of your file upload, allowing you to identify any potential errors or successfully completed uploads.

Step 3: Develop a new, refined model

Upon the successful upload of the file containing your dataset, the next step is to start a job that generates the fine-tuned model. For this you need two things: the ID of the uploaded file containing your dataset and the name of the OpenAI base model you want to improve.

Typically, this process takes a few minutes to complete. To monitor the progress, consider implementing a second function dedicated to tracking the job's advancement. As the job progresses, keep an eye out for the "succeeded" status, which indicates the completion of the fine-tuning process. This status will be displayed on your console interface once everything has finished processing. Utilize this status update as confirmation that the new fine-tuned model has been successfully generated.


// Note: the FineTuningJob type import path may vary slightly between SDK versions.
import { FineTuningJob } from 'openai/resources/fine-tuning';
import openai from './openAI';

export const createFineTuningJob = async (fileId: string) => {
  console.log(`Starting fine-tuning`);
  // Create a fine-tuning job from the uploaded training file and the chosen base model.
  const fineTune = await openai.fineTuning.jobs.create({
    model: 'gpt-3.5-turbo-1106',
    training_file: fileId,
  });
  console.log(`Fine-tuning ID: ${fineTune.id}`);

  return fineTune;
};


export const trackFineTuningProgress = async (fineTune: FineTuningJob) => {
  console.log(`Track fine-tuning progress:`);
  let after: string | undefined;
  while (fineTune.status !== 'succeeded') {
    fineTune = await openai.fineTuning.jobs.retrieve(fineTune.id);
    console.log(`${fineTune.status}`);

    // Fetch the latest job events, continuing after the last one we have already seen.
    const options = after ? { limit: 50, after } : { limit: 50 };
    const events = await openai.fineTuning.jobs.listEvents(
      fineTune.id,
      options,
    );
    for (const event of events.data.reverse()) {
      console.log(`- ${event.created_at}: ${event.message}`);
    }

    if (events.data.length > 0) {
      after = events.data[events.data.length - 1]?.id;
      console.log(after);
    }

    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
};

Step 4: Utilize the refined model

To summarize, here is a "picture" of the whole flow:


export const createFineTuning = async (filePath: string) => {
  // 'api' groups the helper functions defined in the previous steps.
  const file = await api.uploadFileToOpenAI(filePath);

  await api.checkUploadedFileStatus(file);

  const fineTune = await api.createFineTuningJob(file.id);

  await api.trackFineTuningProgress(fineTune);
};
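
For reference, a minimal usage sketch, assuming the helpers above are exported from a hypothetical './fineTuning' module and the dataset from Step 1 is saved locally as training-data.jsonl (an illustrative path):

import { createFineTuning } from './fineTuning';

// Runs the full flow: upload, wait for processing, start the job, track progress.
createFineTuning('./training-data.jsonl')
  .then(() => console.log('Fine-tuning flow finished'))
  .catch((error) => console.error('Fine-tuning flow failed', error));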

Once the fine-tuning process concludes, obtaining the new model involves sending a GET request to the OpenAI models endpoint. This request retrieves a comprehensive list of all available models, including the newly created ones mixed with the existing ones.

To distinguish the fine-tuned models, they feature your organization's name within their names. Specifically, a fine-tuned model's name will follow this format: “ft:chosen-openAI-model:my-org:custom-suffix:id”

To isolate these fine-tuned models, filter the list using a unique identifier exclusive to them, such as your organization's name. After filtering, the most recently created model will typically appear as the first entry in the list.

Here's a code snippet showcasing how to achieve this filtering:


export const getFineTunedModels = async () => {
  // Fine-tuned model IDs contain your organization name, so we can filter on it.
  const keyword = 'my-organization';
  const list = await openai.models.list();

  const fineTunedModels = list.data.filter((model) => model.id.includes(keyword));
  console.log('My fine-tuned models', fineTunedModels);
  return fineTunedModels;
};

After successfully getting the name, all you have to do is replace the old OpenAI model (for example, gpt-3.5-turbo) with your newly created model in your chat completion request.


// Type import paths may vary slightly between SDK versions.
import {
  ChatCompletionMessageParam,
  ChatCompletionCreateParamsNonStreaming,
} from 'openai/resources/chat/completions';
import openai from './openAI';

export const checkIfTextIsValidJobDescriptionOrName = async (
  text: string,
) => {
  const messages: ChatCompletionMessageParam[] = [
    {
      role: 'system',
      content:
        'You are an assistant for an employment agency that validates and checks if the provided text content is a real and valid job description or job title/profession. The text might be in different languages.',
    },
    {
      role: 'user',
      content: `Here is the provided text that I need you to validate : \n ${JSON.stringify(text)}`,
    },
    {
      role: 'assistant',
      content: 'Got it. I have the text. How can I help?',
    },
    {
      role: 'user',
      content: `Reply only with the result, nothing else. I need you to validate if the text is a real and valid job description or title. If it's a real job title or profession name, I want you to reply with {"valid":true}. Otherwise, if the text is longer, then in this case we have a description. I want you to check if this description is a valid job description that contains skills and responsibilities of an open position that can later be extracted. So as a conclusion, if the text is a real job title or a real job description that has responsibilities and skills that can be extracted, return a JSON containing 'valid' as key and true or false as value. For example, if the text is 'doctor' return only {"valid":true}. For all other cases return {"valid":false}. Reply only with the JSON, no additional words before or after it.`,
    },
  ];

  const config: ChatCompletionCreateParamsNonStreaming = {
    // Use the fine-tuned model name instead of the base model.
    model: 'ft:gpt-3.5-turbo-1106:my-org::id',
    messages,
    response_format: {
      type: 'json_object',
    },
  };

  const response = await openai.chat.completions.create(config);

  const content = response.choices[0].message.content;

  return content;
};
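
To close the loop, here is a hedged usage sketch showing how the function above could be called and its JSON answer parsed; the './validateJobText' module path and the sample input are assumptions for illustration.

import { checkIfTextIsValidJobDescriptionOrName } from './validateJobText';

const run = async () => {
  // For a real profession name the fine-tuned model should answer {"valid": true}.
  const content = await checkIfTextIsValidJobDescriptionOrName('Backend developer');
  if (content) {
    const parsedResult: { valid: boolean } = JSON.parse(content);
    console.log('Is the text a valid job title or description?', parsedResult.valid);
  }
};

run();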

The future of Fine-tuning


As the horizon of AI continues to expand, the future of fine-tuning OpenAI models appears boundless. With ongoing advancements and a growing community of innovators, the evolution of tailored AI solutions seems limitless. Fine-tuning holds the promise of not just refining existing models but sculpting AI to fit increasingly specific needs across industries, driving innovation, and opening doors to uncharted possibilities. As we venture forward, the trajectory of fine-tuning OpenAI models seems poised to redefine the landscape of AI applications, bringing about a future where personalized, adaptable AI becomes the norm rather than the exception.