These three carts can be seen as three different data distributions. If we assumed from the outset that there are two classes (apples and bananas), the interpretations that follow would be incorrect. Rather, think of each cart as its own distribution: the first cart is a data distribution in which all data points belong to a single class, while the second and third carts are data distributions with two classes.
Looking at the example above, it is easy to identify the carts with the purest and the most impure data distributions (class distributions, to be precise). But to quantify the purity of a dataset mathematically, so that an algorithm can use it to make decisions, we need a measure. This is where entropy and the Gini Index come to the rescue.
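To make the idea of a purity score concrete, here is a minimal Python sketch using the standard definitions of Shannon entropy (in bits) and Gini impurity, both computed from class counts. The function names and the 4/4 split are hypothetical choices for illustration only; they are not the actual contents of the carts pictured above.

```python
import math


def class_probabilities(counts):
    """Turn class counts into probabilities, skipping empty classes."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]


def entropy(counts):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    return sum(p * math.log2(1 / p) for p in class_probabilities(counts))


def gini(counts):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in class_probabilities(counts))


# A cart where all 8 fruits belong to one class (fully pure).
pure_cart = [8]
# Hypothetical cart with an even 4/4 split between two classes (assumption).
even_cart = [4, 4]

for name, cart in [("pure", pure_cart), ("even split", even_cart)]:
    print(name, "entropy:", entropy(cart), "gini:", gini(cart))
```

Under these definitions, a single-class cart scores 0 on both measures, while an even two-class split gives the two-class maximum: an entropy of 1 bit and a Gini impurity of 0.5.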
Both of these measures look at the probability of occurrence (or presence) of each class in a dataset. In our example, we have a total of 8 data points (fruits) in each case, so we can…