How to Improve LLM Responses With Better Sampling Parameters | by Dr. Leon Eversberg

A deep dive into stochastic decoding with temperature, top_p, top_k, and min_p

10 min read

18 hours ago

Example Python code taken from the OpenAI Python SDK, where the chat completion API is called with the parameters temperature and top_p.

When calling the OpenAI API with the Python SDK, have you ever wondered what exactly the temperature and top_p parameters do?

When you ask a Large Language Model (LLM) a question, the model outputs a probability for every possible token in its vocabulary, each one a candidate for the next token.
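
To make this concrete, here is a minimal sketch of how raw model scores (logits) are commonly turned into probabilities with a softmax. The four-token vocabulary and the logit values are made up for illustration; a real model's vocabulary has tens of thousands of tokens.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and logits, chosen only for illustration
vocab = ["Paris", "London", "Berlin", "banana"]
logits = [4.0, 2.0, 1.5, -3.0]

probs = softmax(logits)
for token, p in zip(vocab, probs):
    print(f"{token:>7}: {p:.4f}")
```

The token with the largest logit receives the largest probability, but every token keeps a non-zero chance of being picked.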

After sampling a token from this probability distribution, we can append the selected token to our input prompt so that the LLM can output the probabilities for the next token.
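
The loop described above can be sketched with a toy stand-in for the model. Everything here is an assumption for illustration: `TOY_MODEL` is a hand-written lookup table that only looks at the last token, whereas a real LLM conditions on the entire sequence so far.

```python
import random

# Hypothetical stand-in for an LLM: maps the last token to a
# next-token probability distribution (a real model uses the full context).
TOY_MODEL = {
    "<s>": {"The": 0.6, "A": 0.4},
    "The": {"cat": 0.5, "dog": 0.5},
    "A": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, "</s>": 0.3},
    "dog": {"sleeps": 0.7, "</s>": 0.3},
    "sleeps": {"</s>": 1.0},
}

def generate(max_tokens=10, seed=0):
    """Sample one token at a time and append it to the running sequence."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = TOY_MODEL[tokens[-1]]  # "model" returns next-token probabilities
        next_token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        if next_token == "</s>":  # stop at the end-of-sequence token
            break
        tokens.append(next_token)  # the sampled token becomes part of the input
    return tokens[1:]  # drop the start marker

print(" ".join(generate()))
```

Running `generate` with different seeds yields different sentences, which is exactly the randomness that the sampling parameters are meant to control.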

This sampling process can be controlled by parameters such as the famous temperature and top_p.
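
As a rough sketch of what these two parameters do, assuming the commonly used definitions: temperature divides the logits before the softmax, and top_p keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold, then renormalizes. The function names and logit values below are hypothetical, and inference engines may differ in edge-case details.

```python
import math

def apply_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# Made-up logits for a four-token vocabulary
logits = [4.0, 2.0, 1.5, -3.0]

for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])

filtered = top_p_filter(apply_temperature(logits, 1.0), top_p=0.9)
print([round(p, 3) for p in filtered])
```

A lower temperature concentrates probability on the top token, a higher one flattens the distribution, and top_p removes the unlikely tail before sampling.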

In this article, I will explain and visualize the sampling strategies that define the output behavior of LLMs. By understanding what these parameters do and setting them according to our use case, we can improve the output generated by LLMs.

For this article, I’ll use vLLM as the inference engine and Microsoft’s new Phi-3.5-mini-instruct model with AWQ quantization. To run this model locally, I’m using my laptop’s NVIDIA GeForce RTX 2060 GPU.

· Understanding Sampling With Logprobs
∘ LLM Decoding Theory
∘ Retrieving Logprobs With the OpenAI Python SDK
· Greedy Decoding
· Temperature
· Top-k Sampling
· Top-p Sampling
· Combining Top-p…

How to Improve LLM Responses With Better Sampling Parameters | by Dr. Leon Eversberg | Sep, 2024

Gradient Boosting | Towards Data Science

A Practical Framework for Data Analysis: 6 Essential Principles | by Pararawendy Indarjo | Nov, 2024

How I Created a Data Science Project Following CRISP-DM Lifecycle | by Gustavo Santos | Nov, 2024

Leave A Reply Cancel Reply

How ML AI Can Help Businesses Reduce Overhead Costs

How the AI Surge May Help Current WFH Employees

The ultimate contact center automation guide

Top 5AI Development Companies To Transform Your Business | by Amyra Sheldon

Microsoft Released LLM2CLIP: A New AI Technique in which a LLM Acts as a Teacher for CLIP’s Visual Encoder

This Machine Learning Paper Transforms Embodied AI Efficiency: New Scaling Laws for Optimizing Model and Dataset Proportions in Behavior Cloning and World Modeling Tasks

Gradient Boosting | Towards Data Science

The Complete Guide to NetSuite Saved Searches

Our Picks

Microsoft Released LLM2CLIP: A New AI Technique in which a LLM Acts as a Teacher for CLIP’s Visual Encoder

This Machine Learning Paper Transforms Embodied AI Efficiency: New Scaling Laws for Optimizing Model and Dataset Proportions in Behavior Cloning and World Modeling Tasks

Gradient Boosting | Towards Data Science

What's Hot

How to Improve LLM Responses With Better Sampling Parameters | by Dr. Leon Eversberg | Sep, 2024

A deep dive into stochastic decoding with temperature, top_p, top_k, and min_p

Table Of Contents

Related Posts

Leave A Reply Cancel Reply