A deep dive into the one line of code that can bring thousands of ready-to-use AI solutions into your scripts, utilizing the power of the 🤗 Transformers library.
Human language, in all its forms, carries a wealth of information, but in an unstructured way. It is in people’s nature to communicate and express their opinions and views, especially nowadays with all the available outlets to do so. This has led to a growing amount of unstructured data that, so far, businesses have used minimally or not at all.
However, in recent years, a notable shift has occurred.
The rapid development of Artificial Intelligence (AI), especially in the area of Natural Language Processing (NLP), has allowed us to programmatically understand and interact with this information, prompting many businesses to revisit this source of knowledge as fuel for new products.
This shift gained urgency with the release of ChatGPT, which showed the world how effective transformer models can be and introduced the field of Large Language Models (LLMs) to a mass audience.
This product’s simplicity and general nature allowed everyone to use these AI processes to perform various tasks in the field of NLP without the need to understand complex mathematical equations or learn how to train and maintain machine learning models. Just open a chatbot (or call an API), craft a proper prompt in your native language, and then magically you have an AI product.
However, as with all great products, this one comes at a cost. In some tools the cost takes the form of a subscription; more commonly, it is charged based on usage, with rates set per word or token.
While the rate per token can seem really small in most cases (what can $0.03 per 1K tokens do?)[1], imagine using this tool to extract information from a book with hundreds of pages; the cost could skyrocket in a matter of seconds and come back to bite companies that don’t understand and monitor it correctly.
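To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch. It uses the $0.03 per 1K tokens rate quoted above. Everything else is an assumption made for illustration: the ratio of about 1.3 tokens per word, the hypothetical 300-page book at 500 words per page, and the helper names `estimate_tokens` and `estimate_cost`. Real tokenizers and real pricing (which often bills input and output tokens separately) will differ.

```python
# Back-of-the-envelope cost estimate for a usage-priced LLM API.
# All defaults are illustrative assumptions, not real pricing data.

RATE_PER_1K_TOKENS = 0.03   # USD, the example rate quoted in the text
TOKENS_PER_WORD = 1.3       # assumed rough average for English text


def estimate_tokens(num_words, tokens_per_word=TOKENS_PER_WORD):
    """Approximate the number of tokens for a given word count."""
    return round(num_words * tokens_per_word)


def estimate_cost(num_tokens, rate_per_1k=RATE_PER_1K_TOKENS):
    """Cost in USD of sending `num_tokens` tokens at a per-1K-token rate."""
    return num_tokens / 1000 * rate_per_1k


if __name__ == "__main__":
    # Hypothetical book: 300 pages at about 500 words per page.
    words = 300 * 500
    tokens = estimate_tokens(words)
    one_pass = estimate_cost(tokens)
    print(f"~{tokens:,} tokens, about ${one_pass:.2f} per pass")

    # The bill grows linearly with the number of passes or documents.
    for passes in (1, 100, 10_000):
        print(f"{passes:>6} passes: about ${one_pass * passes:,.2f}")
```

Under these assumptions a single pass over one book is cheap; the bill grows with every repeated pass, extra document, and generated output token, which is why monitoring usage matters.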