Everybody knows that large language models are, by definition, large. Not so long ago, they were available only to owners of high-end hardware, or to people willing to pay for cloud access or even for every API call. Nowadays, times are changing. In this article, I will show how to run the LangChain Python library, a FAISS vector database, and a Mistral-7B model in Google Colab completely for free, and we will do some fun experiments with them.
Components
There are many articles here on TDS about using large language models in Python, but they are often not easy to reproduce. For example, many examples built with the LangChain library use the OpenAI class, whose first parameter is (guess what?) OPENAI_API_KEY. Other examples of RAG (Retrieval Augmented Generation) and vector databases use Weaviate, and the first thing we see after opening its website is "Pricing." Here, I will use a set of open-source libraries that can be used completely for free:
- LangChain. It is a Python framework for developing applications powered by language models. It is also model-agnostic: the same code can be reused with different models (see the short sketch after this list).
- FAISS (Facebook AI Similarity Search). It’s a library designed for efficient similarity search and storage of dense vectors, which I will use for Retrieval Augmented Generation.
- Mistral 7B is a 7.3B-parameter large language model (released under the Apache 2.0 license) which, according to its authors, outperforms the 13B Llama 2 on all benchmarks. It is also available on HuggingFace, so it is pretty simple to use.
- Last but not least, Google Colab is also an important part of this test. It provides free access to Python notebooks powered by a CPU, a 16 GB NVIDIA Tesla T4, or even an 80 GB NVIDIA A100 (though I have never seen the last one available for a free instance).
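To make the "model-agnostic" point concrete, here is a minimal sketch of how these components fit together. This is my own illustration rather than code from the article itself, and the import paths match LangChain releases from late 2023 (they have since moved to the langchain_community package):

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import HuggingFacePipeline

# Embed a few documents and store the dense vectors in a FAISS index
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
db = FAISS.from_texts(
    ["LangChain is model-agnostic.", "FAISS stores dense vectors."],
    embeddings,
)

# Retrieve the documents most similar to a query
docs = db.similarity_search("Which library stores vectors?", k=1)
print(docs[0].page_content)

# The LLM behind a chain is just another swappable component:
# HuggingFacePipeline can wrap Mistral-7B (or any other transformers
# model) without changing the retrieval code above.
```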
Now, let's get into it.
Install
As a first step, we need to open Google Colab and create a new notebook. The required libraries can be installed with pip in the first cell:
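The exact package list and versions are my assumption of a typical set for this stack (faiss-gpu matches the GPU runtime; accelerate and bitsandbytes are needed to load Mistral-7B from HuggingFace in quantized form):

```python
# Assumed package set; exact names and versions may differ from the original.
# - langchain: the LLM application framework
# - faiss-gpu: vector similarity search on the Colab GPU runtime
# - transformers + accelerate + bitsandbytes: load Mistral-7B in 4-bit
# - sentence-transformers: embedding model for the FAISS index
!pip install langchain faiss-gpu transformers accelerate bitsandbytes sentence-transformers
```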