Recently, there has been a boom in using large language models (LLMs) to build chatbots, assistants, and other LLM-based applications for businesses. The ChatGPT API is a popular tool for this, as it gives you access to a powerful LLM with hundreds of billions of parameters without having to worry about the computational power or training required.
However, there is one major drawback to using the ChatGPT API: everything you send to it is also sent to OpenAI, the company behind ChatGPT. This means that OpenAI can collect your data and use it to train its own models.
This raises the question: what can you safely send to ChatGPT, and what can't you? And what are the alternatives to ChatGPT?
When ChatGPT isn't a good option
One popular recent example of using the ChatGPT API is a data warehouse LLM. If you have, for instance, a database of customers, you can use an LLM like ChatGPT to ask questions about that data and get quick answers. This lets you simply ask questions in plain English, without any coding, and get insights.
This is an interesting and promising technology, but it leads to the same problem: if you use ChatGPT for this task, your customers' data is available to OpenAI. So let's define a few rules for when not to use ChatGPT:
Prompts with personal data like names, addresses, passwords;
Prompts with private data like medical records, financial information, etc;
Prompts with company secrets like manufacturing processes, recipes;
Prompts with intellectual property that has copyrights, trademarks, patents;
Prompts with code of private (business) projects.
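If you do send prompts to a third-party API, one practical safeguard is to redact obvious personal data before the request leaves your systems. Below is a minimal illustrative sketch covering only emails and phone-like numbers; a production system would need far more thorough PII detection than two regexes.

```python
import re

# Minimal, illustrative redaction patterns. Real PII detection
# needs much more coverage (names, addresses, IDs, etc.).
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "[PHONE]": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(prompt: str) -> str:
    """Replace matched personal data with placeholders before sending."""
    for placeholder, pattern in PATTERNS.items():
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(redact("Contact John at john.doe@example.com or +1 555 123 4567."))
# -> Contact John at [EMAIL] or [PHONE].
```

The redacted prompt, not the original, is what gets sent to the API.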
Apart from this data, remember that ChatGPT also stores information about the browser and device you use, your account and billing information, and your location and IP address. This information may be shared with third parties, including:
Vendors and server providers;
AI trainers who review the conversations.
Although OpenAI's data sharing policies are not entirely clear, the people it shares data with should not have access to your private or sensitive information. Some countries have gone further: Italy temporarily banned ChatGPT after its data regulator (Garante per la Protezione dei Dati Personali) contended that OpenAI didn't have the legal right to use personal information.
What if you still want to use ChatGPT?
Here are some main rules that should be followed to maximize data privacy protection:
Don't share private information. This includes all of the sensitive data above.
Use a VPN. A VPN masks your IP address and location, which at least slightly reduces the amount of information leaked to OpenAI.
Turn off history saving. OpenAI recently introduced a setting that lets you choose which conversations can be used to train their models and which can't. On the one hand, disabling history means your chats are no longer saved, so you can't revisit old conversations. On the other hand, your data will not be used for training, which is something.
What are the alternatives to ChatGPT?
But what if you do need an LLM to work with the private data mentioned above? This is where open-source models come into play. On platforms like HuggingFace, you can browse the available options, compare them, and decide which model suits you best. You can visit their leaderboard to compare LLMs on different metrics, such as ARC (abstraction and reasoning benchmark), MMLU (multi-task language understanding), and others:
Most of the top models are based on LLaMA, an open-source model released by Meta. These models are free to use for commercial purposes, and on many benchmarks they nearly match the performance of GPT-3, and sometimes even GPT-4.
Most of these models are fine-tuned versions of LLaMA, meaning they were trained on additional data to perform better in specific fields. For example, WizardCoder is a model trained on a large amount of code, and it makes far fewer mistakes on coding tasks than general-purpose models.
If you decide you want an open-source model for your business, keep in mind that these models require a lot of computational power to run. A rough estimate of the GPU memory needed to run an LLM is given by this formula:
GPU memory (GB) ≈ number of parameters (in billions) × 4
So a 7-billion-parameter model needs around 28 GB of GPU memory. Note that this formula approximates the recommended amount of memory (full 32-bit precision); the minimum is usually several times lower.
The formula also helps estimate the cost of hosting such a model. For example, if a cloud provider charges around $140 per month for an NVIDIA T4 with 16 GB of memory, you would need about $280 per month to host the model. This cost can be cut by 2x or even 4x with optimization techniques such as quantization, which stores the model weights in fewer bytes per parameter. This slightly reduces the model's accuracy but saves a lot of memory.
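The arithmetic above can be sketched in a few lines. The $140/month price and 16 GB per GPU are the T4 figures assumed in this article; adjust them for your own provider.

```python
import math

def gpu_memory_gb(params_billions: float, bytes_per_param: int = 4) -> float:
    """Rough memory estimate: parameters x bytes per parameter.
    4 bytes = full 32-bit precision; 2 = fp16; 1 = int8 quantization."""
    return params_billions * bytes_per_param

def monthly_cost(memory_gb: float, gb_per_gpu: float = 16,
                 price_per_gpu: float = 140) -> float:
    """Cost of enough GPUs (here, hypothetical $140/month T4s) to fit the model."""
    gpus_needed = math.ceil(memory_gb / gb_per_gpu)
    return gpus_needed * price_per_gpu

mem = gpu_memory_gb(7)                       # 7B model at full precision
print(mem, monthly_cost(mem))                # 28.0 280 (two T4s)
mem8 = gpu_memory_gb(7, bytes_per_param=1)   # same model, int8 quantized
print(mem8, monthly_cost(mem8))              # 7.0 140 (one T4)
```

As the quantized case shows, halving or quartering the bytes per parameter is where the 2x-4x cost reduction comes from.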
Apart from that, cloud services can be configured so the model is only available during working hours, for example, which can cut expenses by roughly 3x.
You can also purchase your own server with GPUs. This saves money in the long term, as cloud providers usually charge at least 3x the price of the underlying computational power.
This private model can then be used for commercial purposes, too. So if you have NLP projects in mind that require an LLM, an already-hosted model can serve them as well. But what projects could an LLM help your business with?
Examples of using private LLMs to optimize your business
There are many ways to use LLMs to increase your employees' productivity and performance. Here are just a few ideas:
Private chatbot like ChatGPT. This helps with coding, general questions, ideation, and many other areas.
Data LLM. This can be implemented in several ways. For example, if you have a database of customers, as mentioned above, you can use an LLM to get quick insights from it. Another approach is to create a chatbot for your documents: using tools like embeddings, you can set up an LLM that answers questions based on the content of your documents and files. This is helpful, for example, if you have a huge manual for some device; with embeddings, you can ask a bot any question about the manual and get a quick answer.
Virtual assistant. If you need a bot on your website to answer customers' questions, an LLM can help here too. This doesn't require any computationally expensive fine-tuning, just an embedding database with a list of possible questions and answers, or general information about your product.
Content generation. This helps with generating articles, blog posts, social media posts, and more, making your marketing managers several times more productive.
Content summarization. If you have many customers producing feedback, reports, and other text, you can use an LLM to quickly summarize it and extract insights.
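The document-chatbot idea above boils down to a retrieve-then-answer loop. In the sketch below, `embed` is a stand-in bag-of-words embedding used purely for illustration; in practice you would call a real embedding model, but the flow of indexing chunks and retrieving the closest one for the LLM is the same.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: word counts. A real system would call an
    # embedding model here and get a dense vector instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Embed every chunk of the manual once and store the vectors.
chunks = [
    "To reset the device, hold the power button for ten seconds.",
    "The warranty covers manufacturing defects for two years.",
    "Charge the battery fully before first use.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. At question time, embed the question and pick the closest chunk.
def retrieve(question: str) -> str:
    q = embed(question)
    return max(index, key=lambda item: cosine(q, item[1]))[0]

# 3. The retrieved chunk is passed to the LLM as context for its answer.
print(retrieve("How do I reset the device?"))
```

The key point is that the manual itself never has to fit into the model's context window; only the most relevant chunks do.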
How to increase the performance of your private LLM for a specific use case?
LLMs like LLaMA are trained on huge amounts of text. This gives them surface-level knowledge of many fields, but deep knowledge of none. So, for example, if you want an LLM for your company that helps with coding, out of the box it knows about as much about programming as it does about unicorns and Game of Thrones, which are (I suppose) irrelevant here.
There are multiple ways to increase your model's performance and accuracy. Depending on your needs, you can choose any of them (or even combine them):
Prompt engineering. This is the easiest way, as it only requires writing a few sentences: you add extra context to the prompt. For example, for a customer-facing virtual assistant you could prepend something like "Act like a virtual, helpful assistant for [company] that works in [field]. Here is the question from the customer: ". This won't greatly boost performance, but it's better than nothing.
Embeddings. This is the best choice if you need to add a lot of rapidly changing information to the LLM's knowledge. For example, a stock advisor bot should know about the latest trends, news, and stock prices, and embeddings are a good way to feed it that information.
Fine-tuning. This is a good choice if you want to deepen your LLM's knowledge of a specific area. For example, to create a medical assistant you can fine-tune the LLM on medical articles, books, and journals. The bot will gain a lot of domain-specific knowledge and make far fewer mistakes. This is the hardest option, as it requires a lot of labeled data and computational power. OpenAI also recently added fine-tuning to its API, so you can fine-tune a GPT model on your own data.
Large language models (LLMs) like ChatGPT can be a valuable tool for businesses of all sizes. They can be used to improve customer service, generate content, and automate tasks. However, it is important to weigh the benefits of using an LLM against the potential risks to privacy.
If you are looking for a quick and easy solution, then ChatGPT may be a good option. However, if privacy is a top priority, then hosting your own LLM is the way to go.