How to Run Bootstrap Analysis in BigQuery | by Tom Ellyatt

Make the most out of the little data you have, by grabbing your data by the bootstraps.

Obligatory pair of boots with straps | created via DALLE

Imagine you’re trying to gauge the average height of all the trees in a vast forest. It’s impractical to measure each one, instead, you measure a small sample and use those measurements to estimate the average for the entire forest. Bootstrapping, in statistics, works on a similar principle.

This involves taking a small sample from your data and, through a method of repeated sampling, estimates statistics (like the mean, median or standard deviation) for your dataset. This technique allows you to make inferences about populations from small samples with greater confidence.

In this article, we will cover:

The basics of bootstrapping, what is it exactly?
How to achieve a bootstrapped sample in BigQuery
An experiment to understand how results change based on varying sample sizes, and how that relates to a known statistic
A stored procedure you can take away and use yourself

At its core, bootstrapping involves randomly selecting a number of observations from a dataset, with replacement, to form what is known as a “bootstrap sample.”

Let’s simplify this concept using a scenario where you have a basket of 25 apples and you’re curious about the average weight of apples in a larger context, like a market.

The Grab and Note Technique

Start by diving into your basket to grab an apple at random, weigh it, and then, instead of setting it aside, you put it right back into your basket. This way, every time you reach in for an apple, every single one, including the one you just weighed, is fair game to be picked again.

Repeat

Now, you repeat the grab, weigh, and replace action the same amount of times as there are apples in your…

Source link

What's Hot

No Train, All Gain: Enhancing Deep Frozen Representations with Self-Supervised Gradients

BLIP3-KALE: An Open-Source Dataset of 218 Million Image-Text Pairs Transforming Image Captioning with Knowledge-Augmented Dense Descriptions

Meta AI Researchers Introduce Mixture-of-Transformers (MoT): A Sparse Multi-Modal Transformer Architecture that Significantly Reduces Pretraining Computational Costs

How to Run Bootstrap Analysis in BigQuery | by Tom Ellyatt | Feb, 2024

A Practical Framework for Data Analysis: 6 Essential Principles | by Pararawendy Indarjo | Nov, 2024

How I Created a Data Science Project Following CRISP-DM Lifecycle | by Gustavo Santos | Nov, 2024

Increase Trust in Your Regression Model The Easy Way | by Jonte Dancker | Nov, 2024

Leave A Reply Cancel Reply

How ML AI Can Help Businesses Reduce Overhead Costs

How the AI Surge May Help Current WFH Employees

The ultimate contact center automation guide

Top 5AI Development Companies To Transform Your Business | by Amyra Sheldon

No Train, All Gain: Enhancing Deep Frozen Representations with Self-Supervised Gradients

BLIP3-KALE: An Open-Source Dataset of 218 Million Image-Text Pairs Transforming Image Captioning with Knowledge-Augmented Dense Descriptions

Meta AI Researchers Introduce Mixture-of-Transformers (MoT): A Sparse Multi-Modal Transformer Architecture that Significantly Reduces Pretraining Computational Costs

A Practical Framework for Data Analysis: 6 Essential Principles | by Pararawendy Indarjo | Nov, 2024

Our Picks

No Train, All Gain: Enhancing Deep Frozen Representations with Self-Supervised Gradients

BLIP3-KALE: An Open-Source Dataset of 218 Million Image-Text Pairs Transforming Image Captioning with Knowledge-Augmented Dense Descriptions

Meta AI Researchers Introduce Mixture-of-Transformers (MoT): A Sparse Multi-Modal Transformer Architecture that Significantly Reduces Pretraining Computational Costs

What's Hot

How to Run Bootstrap Analysis in BigQuery | by Tom Ellyatt | Feb, 2024

Make the most out of the little data you have, by grabbing your data by the bootstraps.

The Grab and Note Technique

Repeat

Related Posts

Leave A Reply Cancel Reply