You can find the code in this GitHub repo:
https://github.com/amirarsalan90/personal_llm_assistant
The main components of the app include:
llama-cpp-python is a Python binding for the great llama.cpp, which implements many large language models in C/C++. Because of its wide adoption by the open-source community, I decided to use it in this tutorial.
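To give a sense of what the binding looks like, here is a minimal sketch of loading a quantized GGUF model and running a chat completion. The model path is a placeholder you would swap for your own download, and n_gpu_layers=-1 (offload every layer to the GPU) assumes the GPU-enabled build described later in this section:

from llama_cpp import Llama

# Load a quantized GGUF model; the path is a placeholder for your own download.
# n_gpu_layers=-1 offloads all layers to the GPU (requires a GPU-enabled build).
llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,
)

# Ask for a chat-style completion and print the reply text.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello! Who are you?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])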
Note: I have tested this app on a system with an Nvidia RTX 4090 GPU.
First things first, let's create a new conda environment:
conda create --name assistant python=3.10
conda activate assistant
Next, we need to install llama-cpp-python. As mentioned in the llama-cpp-python documentation, llama.cpp supports a number of hardware acceleration backends to speed up inference. To leverage the GPU and run the LLM on it, we will build the package with cuBLAS support. I had some issues getting the model to offload onto the GPU, and I finally found this post on how to install it properly:
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1
pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
pip install "llama-cpp-python[server]"
(The quotes prevent shells such as zsh from interpreting the square brackets as a glob pattern.)
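With the server extra installed, llama-cpp-python can also expose the model through an OpenAI-compatible HTTP API. As a quick sketch (again, the GGUF model path is a placeholder), the server can be started like this:

python -m llama_cpp.server --model ./models/mistral-7b-instruct.Q4_K_M.gguf --n_gpu_layers -1

Passing --n_gpu_layers -1 asks the server to offload all model layers to the GPU, which is the point of building with cuBLAS in the first place.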