There is also a chat version. Both models are available on the Hugging Face Hub.
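As a quick reference, here is a minimal sketch of how these checkpoints are typically loaded with Hugging Face Transformers. The Hub IDs tiiuae/falcon-180B and tiiuae/falcon-180B-chat are the official repositories at the time of writing; note that the model is gated, so you may first need to accept the license and authenticate with your Hugging Face account, and that a naive load like this requires hundreds of GB of memory, as discussed below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "tiiuae/falcon-180B" is the base model;
# use "tiiuae/falcon-180B-chat" for the chat version
model_id = "tiiuae/falcon-180B"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Warning: loading the full bfloat16 checkpoint this way
# needs roughly 360 GB of memory
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's bfloat16 dtype
    device_map="auto",    # let Accelerate spread layers across available devices
)
```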
Falcon 180B is completely free and state-of-the-art. But it’s also a huge model.
Can it run on your computer?
Unless your computer is equipped for very intensive computing, it can't run Falcon 180B out of the box. You will need to upgrade your computer and use a quantized version of the model.
In this article, I explain how you can run Falcon 180B on consumer hardware. We will see that it can be reasonably affordable to run a 180 billion parameter model on a modern computer. I also discuss several techniques that help reduce the hardware requirements.
The first thing you need to know is that Falcon 180B has 180 billion parameters stored as bfloat16. A (b)float16 parameter takes 2 bytes of memory.
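To make the arithmetic concrete, here is a back-of-the-envelope calculation (a sketch; the parameter count is rounded to exactly 180 billion):

```python
# Back-of-the-envelope memory footprint of Falcon 180B in bfloat16
n_params = 180e9          # 180 billion parameters (rounded)
bytes_per_param = 2       # bfloat16/float16 = 16 bits = 2 bytes

total_bytes = n_params * bytes_per_param
print(f"{total_bytes / 1e9:.0f} GB")  # -> 360 GB
```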
When you load a model, the standard PyTorch pipeline works like this:
- An empty model is created: 180B parameters * 2 bytes = 360 GB