Large Language Models (LLMs) are here to stay. With the recent release of Llama 2, open-source LLMs are approaching the performance of ChatGPT and, with proper tuning, can even exceed it.
Using these LLMs is often not as straightforward as it seems, especially if you want to fine-tune the LLM to your specific use case.
In this article, we will go through three of the most common methods for improving the performance of any LLM:
- Prompt Engineering
- Retrieval Augmented Generation (RAG)
- Parameter Efficient Fine-Tuning (PEFT)
There are many more methods, but these are the easiest and can result in major improvements without much work.
The three methods are ordered from the least complex, the so-called low-hanging fruit, to one of the more complex ways of improving your LLM.
To get the most out of LLMs, you can even combine all three methods!
Before we get started, here is a more in-depth overview of the methods for easier reference: