Nowadays, it is easy to use different large language models (LLMs) via a web interface or a public API. But can we seamlessly integrate an LLM into the data analysis process and use the model directly from Python or a Jupyter Notebook? Indeed, we can, and in this article, I will show three different ways to do it. As usual, all components used in the article are available for free.
Let’s get into it!
1. Pandas AI
The first Python library I am going to test is Pandas AI. It allows us to ask questions about a Pandas dataframe in natural language. As a toy example, I created a small dataframe with European countries (EU members plus a few non-EU neighbours such as Norway and Switzerland) and their populations:
import pandas as pd

df = pd.DataFrame({
"Country": ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark', 'Estonia', 'Finland',
'France', 'Germany', 'Greece', 'Hungary', 'Iceland', 'Ireland', 'Italy', 'Latvia', 'Liechtenstein', 'Lithuania',
'Luxembourg', 'Malta', 'Monaco', 'Montenegro', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania', 'Serbia',
'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland'],
"Population": [8_205000, 10_403000…