Here we learn how to find the Mean (or Average), one of the measures of central tendency, for grouped data. Since the data is grouped into classes, it is not easy to locate the mean accurately, but we can find an approximate mean by using the mean formulae.
Prerequisite / Revise this:
Mean Formula / Average Formula
Mean is defined as the ratio of the sum of all observations to the number of observations.
For an ungrouped frequency distribution, the Mean,
x̄ = Σf_{i}x_{i} / Σf_{i}
where,
Σf_{i}x_{i} = Sum of observations
Σf_{i} = Total number of observations
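As a quick check of this formula, here is a minimal Python sketch. The function name `mean_ungrouped` and the sample values are hypothetical, made up only for illustration.

```python
def mean_ungrouped(values, freqs):
    """Mean of an ungrouped frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    sum_fx = sum(f * x for x, f in zip(values, freqs))  # sum of all observations
    n = sum(freqs)                                      # total number of observations
    return sum_fx / n


# Hypothetical data: the observation 2 occurs once, 3 occurs twice, 5 occurs once.
values = [2, 3, 5]
freqs = [1, 2, 1]
print(mean_ungrouped(values, freqs))  # 3.25, the same as the mean of [2, 3, 3, 5]
```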
↪ In most real-life situations, the data is so large that it needs to be condensed into grouped data before a meaningful study can be made.
↪ In a grouped frequency distribution, observations are classified into class intervals of the same width.
↪ By convention, an observation that coincides with a boundary shared by two classes belongs to the higher class, i.e., 10 belongs to the class interval 10-20 (and not to 0-10).
↪ The number of observations in each class is called the class frequency.
↪ It is assumed that the frequency of each class interval is centered around its mid-point. So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class.
Class mark = (Upper class limit + Lower class limit) / 2
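The grouping convention and the class mark can be sketched in Python as follows. This is only an illustration: the function names `group` and `class_mark`, the starting value 0, the width 10 and the sample observations are all assumptions, not part of the original example.

```python
def class_mark(lower, upper):
    """Mid-point of a class interval."""
    return (lower + upper) / 2


def group(observations, start, width):
    """Count observations per class interval of equal width.

    A value equal to a shared boundary goes to the higher class,
    e.g. 10 is counted in 10-20, not in 0-10.
    """
    freq = {}
    for obs in observations:
        k = int((obs - start) // width)  # floor division: index of the class
        interval = (start + k * width, start + (k + 1) * width)
        freq[interval] = freq.get(interval, 0) + 1
    return freq


# Hypothetical observations grouped into 0-10, 10-20, 20-30.
table = group([3, 10, 12, 19, 20], start=0, width=10)
for (lower, upper), f in sorted(table.items()):
    print(f"{lower}-{upper}: class mark {class_mark(lower, upper)}, frequency {f}")
```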
Direct Method
The class marks serve as the x_{i}’s in this method. For the ith class interval, the frequency f_{i} corresponds to the class mark x_{i}.
Now, the mean can be computed using the following mean formula:
x̄ = Σf_{i}x_{i} / Σf_{i}
- This method of finding the mean is known as the Direct Method.
- This method gives an approximate mean because of the mid-point assumption.
Remember: When this mean formula is used
(i) For Ungrouped frequency distribution,
x_{i} = ith observation
f_{i} = frequency of the ith observation.
(ii) For Grouped frequency distribution,
x_{i} = class mark of the ith class interval
f_{i} = frequency of the ith class interval.
Ex – Find the mean for given data
Solution – We can write the given data in a grouped frequency distribution table as follows.
So, the mean x̄ of the given data is given by
x̄ = Σf_{i}x_{i} / Σf_{i} = 62
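The Direct Method can be sketched in a few lines of Python. The class marks 17.5, 32.5, 47.5 and 62.5 and the class size 15 come from the text; the last two class marks and all the frequencies are assumptions, since the data table is not reproduced here. They were chosen so that the result agrees with the mean of 62 that the text reports for this example. The function name `direct_mean` is also hypothetical.

```python
def direct_mean(class_marks, freqs):
    """Direct Method: x_bar = sum(f_i * x_i) / sum(f_i)."""
    sum_fx = sum(f * x for x, f in zip(class_marks, freqs))
    return sum_fx / sum(freqs)


# Class marks of six classes of width 15 (partly assumed, see above).
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
# Assumed frequencies, not taken from the original table.
freqs = [2, 3, 7, 6, 6, 6]

print(direct_mean(class_marks, freqs))  # 62.0
```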
Assumed Mean Method
Sometimes when the numerical values of x_{i} (class mark) and f_{i} are large, finding the product of x_{i} and f_{i} becomes tedious and time consuming. We can’t change the f_{i}’s, but we can change each x_{i} to a smaller number, so that our calculations become easy. We can achieve this by subtracting a fixed number from each of these x_{i}’s.
- The first step is to choose one among the x_{i}’s as the assumed mean, and denote it by ‘a’. We may take ‘a’ to be that x_{i} which lies in the center of x_{1}, x_{2}, …, x_{n}.
So, in the previous example, we can choose a = 47.5 or a = 62.5. Let us choose a = 47.5.
- The next step is to find the difference between each of the x_{i}’s and ‘a’, that is, the deviation (d_{i}) of each x_{i} from ‘a’, i.e.,
d_{i} = x_{i} − a
- The third step is to find the product of d_{i} with the corresponding f_{i}, and take the sum of all the f_{i}d_{i}’s (Σf_{i}d_{i}).
Then the mean of the deviations is
d̄ = Σf_{i}d_{i} / Σf_{i}
- Since we subtracted ‘a’ from each x_{i} in obtaining d_{i}, we need to add ‘a’ to d̄ in order to get the mean x̄.
This can be explained mathematically as:
Mean of deviations,
d̄ = Σf_{i}d_{i} / Σf_{i} = Σf_{i}(x_{i} − a) / Σf_{i} = Σf_{i}x_{i} / Σf_{i} − aΣf_{i} / Σf_{i} = x̄ − a
∴ Mean = Assumed Mean + Mean of deviations
x̄ = a + d̄ = a + Σf_{i}d_{i} / Σf_{i}
Example: For the previous example, we can write the deviation table as follows (a = 47.5).
Substituting the values of a, Σf_{i}d_{i} and Σf_{i} from the table, we get
x̄ = a + Σf_{i}d_{i} / Σf_{i} = 47.5 + 14.5 = 62
Therefore, the mean of the marks obtained by the students is 62.
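The three steps of the Assumed Mean Method can be sketched in Python as below. The frequencies and the last two class marks are assumptions (the table is not reproduced in this text), picked so that the mean of the deviations comes out as 14.5 and the mean as 62, as stated above. The function name `assumed_mean_method` is hypothetical.

```python
def assumed_mean_method(class_marks, freqs, a):
    """Assumed Mean Method: x_bar = a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a."""
    deviations = [x - a for x in class_marks]                # step 2: d_i = x_i - a
    sum_fd = sum(f * d for d, f in zip(deviations, freqs))   # step 3: sum of f_i * d_i
    mean_dev = sum_fd / sum(freqs)                           # mean of the deviations
    return a + mean_dev                                      # add 'a' back


class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]  # partly assumed
freqs = [2, 3, 7, 6, 6, 6]                          # assumed frequencies

print(assumed_mean_method(class_marks, freqs, a=47.5))  # 47.5 + 14.5 = 62.0
print(assumed_mean_method(class_marks, freqs, a=62.5))  # 62.0 again
```

The second call uses the other central class mark as ‘a’ and returns the same mean.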
Step-Deviation Method
↪ In the previous example, if we take each of the x_{i}’s (i.e., 17.5, 32.5 and so on) in turn as ‘a’, the mean determined in each case will be the same, i.e., 62.
So, we can say that the value of the mean obtained does not depend on the choice of ‘a’.
↪ We can also observe that the deviations are all multiples of the class size, i.e., the values in Column 4 are all multiples of 15. So, if we divide all the values in Column 4 by 15, we get smaller numbers to multiply with f_{i}. (Here, 15 is the class size of each class interval.)
↪ Let,
u_{i} = (x_{i} − a) / h
where a is the assumed mean and h is the class size.
↪ Then, the mean of the reduced deviations,
ū = Σf_{i}u_{i} / Σf_{i}
↪ Now, x̄ can be found as follows. Since x_{i} = a + hu_{i}, taking the frequency-weighted mean of both sides gives
x̄ = a + hū = a + h(Σf_{i}u_{i} / Σf_{i})
↪ Example: For the previous example, we can write the step-deviation table as follows (a = 47.5, h = 15).
Now, substituting the values of a, h, Σf_{i}u_{i} and Σf_{i} from the table, we get
x̄ = a + h(Σf_{i}u_{i} / Σf_{i}) = 47.5 + 14.5 = 62
So, the mean mark obtained by a student is 62.
The method discussed above is called the Step-deviation method.
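A minimal Python sketch of the Step-deviation method follows. As before, the frequencies and the last two class marks are assumptions chosen to reproduce the mean of 62 quoted in the text, and the function name `step_deviation_mean` is hypothetical.

```python
def step_deviation_mean(class_marks, freqs, a, h):
    """Step-deviation method: x_bar = a + h * sum(f_i * u_i) / sum(f_i), with u_i = (x_i - a) / h."""
    u = [(x - a) / h for x in class_marks]          # reduced deviations
    sum_fu = sum(f * ui for ui, f in zip(u, freqs))
    return a + h * sum_fu / sum(freqs)


class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]  # partly assumed
freqs = [2, 3, 7, 6, 6, 6]                          # assumed frequencies

# a = 47.5 (assumed mean) and h = 15 (class size), as in the text.
print(step_deviation_mean(class_marks, freqs, a=47.5, h=15))  # 62.0
```

With these values the u_{i}’s are the small integers −2, −1, 0, 1, 2, 3, which is what makes the hand calculation easier.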
Note:
↪ The step-deviation method is convenient to apply if all the d_{i}’s have a common factor (= h).
↪ The mean obtained by all the three methods is the same (an approximate mean).
↪ The assumed mean method and step-deviation method are just simplified forms of the direct method. Calculation is simplified by reducing x_{i}.
↪ The choice of method to be used depends on the numerical values of x_{i} and f_{i}. If x_{i} and f_{i} are sufficiently small, then the direct method is an appropriate choice. If x_{i} and f_{i} are numerically large numbers, then we can go for the assumed mean method or step-deviation method. If the class sizes are unequal, and x_{i} are large numerically, we can still apply the step-deviation method by taking h to be a suitable divisor of all the d_{i}’s.
↪ The formula x̄ = a + hū still holds if a and h are not as given above (i.e., a = x_{i} & h = class size), but are any non-zero numbers such that u_{i} = (x_{i} − a)/h.
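The last two notes can be checked with a short Python sketch. Everything in it is hypothetical: the unequal class intervals 0-10, 10-30 and 30-60 (class marks 5, 20, 45), the frequencies, and the function names. Exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def direct_mean(class_marks, freqs):
    """x_bar = sum(f_i * x_i) / sum(f_i), as an exact fraction."""
    return Fraction(sum(f * x for x, f in zip(class_marks, freqs)), sum(freqs))


def step_mean(class_marks, freqs, a, h):
    """x_bar = a + h * u_bar with u_i = (x_i - a) / h, for any a and any non-zero h."""
    u = [Fraction(x - a, h) for x in class_marks]
    u_bar = sum(f * ui for ui, f in zip(u, freqs)) / sum(freqs)
    return a + h * u_bar


class_marks = [5, 20, 45]  # hypothetical unequal classes 0-10, 10-30, 30-60
freqs = [4, 10, 6]         # hypothetical frequencies

expected = direct_mean(class_marks, freqs)
# (20, 5): a is a class mark and h = 5 divides every d_i (-15, 0, 25).
# The other pairs are arbitrary choices of a and non-zero h.
for a, h in [(20, 5), (7, 3), (-4, -2)]:
    print(a, h, step_mean(class_marks, freqs, a, h) == expected)  # True each time
```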