Which measure of central tendency is best used when there is an outlier?

By now, everyone should know how to calculate the mean, median and mode. They each give us a measure of central tendency (i.e., where the center of our data falls), but often give different answers. So how do we know when to use each? Here are some general rules:

    1. Mean is the most frequently used measure of central tendency and is generally considered the best measure of it. However, there are some situations where either the median or the mode is preferred.
    2. Median is the preferred measure of central tendency when:
      1. There are a few extreme scores in the distribution of the data. (NOTE: Remember that a single outlier can have a great effect on the mean.)
      2. There are some missing or undetermined values in your data.
      3. There is an open-ended distribution. (For example, if you have a data field which measures number of children and your options are 0, 1, 2, 3, 4, 5 or “6 or more,” then the “6 or more” field is open-ended and makes calculating the mean impossible, since we do not know the exact values for this field.)
      4. You have data measured on an ordinal scale.
    3. Mode is the preferred measure when data are measured on a nominal (and sometimes even ordinal) scale.
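
The effect of a single extreme score on the mean, compared with the median, can be sketched in a few lines of Python. The salary figures below are made up for illustration and are not taken from the source.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands); values invented for this sketch.
salaries = [42, 45, 47, 50, 52]
salaries_with_outlier = salaries + [400]  # one extreme score added

mean_before = mean(salaries)
median_before = median(salaries)
mean_after = mean(salaries_with_outlier)
median_after = median(salaries_with_outlier)

print(f"mean:   {mean_before} -> {mean_after}")      # 47.2 -> 106
print(f"median: {median_before} -> {median_after}")  # 47 -> 48.5
```

The single added value more than doubles the mean, while the median moves by only 1.5.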

    CC licensed content, Shared previously

    • When to use each measure of Central Tendency?. Authored by: Paul Jones. Provided by: Columbia Basin College. License: CC BY: Attribution
    • Introductory Statistics. Authored by: Barbara Illowsky, Susan Dean. Provided by: OpenStax. Located at: http://cnx.org/contents/. License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/contents/

    J Pharmacol Pharmacother. 2011 Jul-Sep; 2(3): 214–215.

    INTRODUCTION

    Apart from the mean, the median and the mode are the two most commonly used measures of central tendency. The median is sometimes referred to as a measure of location, as it tells us where the data are.[1] This article describes the median and the mode, along with guidelines for selecting the appropriate measure of central tendency.

    MEDIAN

    Median is the value which occupies the middle position when all the observations are arranged in an ascending/descending order. It divides the frequency distribution exactly into two halves. Fifty percent of observations in a distribution have scores at or below the median. Hence median is the 50th percentile.[2] Median is also known as ‘positional average’.[3]

    It is easy to calculate the median. If the number of observations n is odd, the median is the ((n + 1)/2)th observation in the ordered set. If n is even, the median is the mean of the (n/2)th and (n/2 + 1)th observations.[2]
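
The positional rule above can be written out directly. This is a minimal sketch; the function name `median_by_position` is chosen here for illustration, and in practice Python's built-in `statistics.median` gives the same result.

```python
def median_by_position(values):
    """Median via the positional rule: sort, then pick the middle observation(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the (n + 1)/2-th observation (1-based position).
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n/2)-th and (n/2 + 1)-th observations (1-based).
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_by_position([7, 1, 5]))     # 5
print(median_by_position([4, 1, 3, 2]))  # 2.5
```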

    Advantages

    1. It is easy to compute and comprehend.

    2. It is not distorted by outliers/skewed data.[4]

    3. It can be determined for data measured on ratio, interval, and ordinal scales.

    Disadvantages

    1. It does not take into account the precise value of each observation and hence does not use all information available in the data.

    2. Unlike mean, median is not amenable to further mathematical calculation and hence is not used in many statistical tests.

    3. If we pool the observations of two groups, median of the pooled group cannot be expressed in terms of the individual medians of the pooled groups.
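
The pooling limitation can be seen with a small made-up example: two pairs of groups share the same sizes and the same medians, yet their pooled medians differ, whereas the pooled mean always follows from the group means and sizes. The data are invented for this sketch.

```python
from statistics import mean, median

# Two pairs of groups with identical sizes and identical medians (2 and 5).
group_a, group_b = [1, 2, 3], [4, 5, 6]
group_c, group_d = [2, 2, 2], [1, 5, 9]

pooled_median_ab = median(group_a + group_b)  # 3.5
pooled_median_cd = median(group_c + group_d)  # 2.0

# The pooled mean, by contrast, is the size-weighted average of the group means.
n_a, n_b = len(group_a), len(group_b)
weighted_mean_ab = (n_a * mean(group_a) + n_b * mean(group_b)) / (n_a + n_b)
pooled_mean_ab = mean(group_a + group_b)

print(pooled_median_ab, pooled_median_cd)  # same group medians, different pooled medians
print(weighted_mean_ab, pooled_mean_ab)    # both 3.5
```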

    MODE

    Mode is defined as the value that occurs most frequently in the data. Some data sets do not have a mode because each value occurs only once. On the other hand, some data sets can have more than one mode. This happens when the data set has two or more values of equal frequency which is greater than that of any other value. Mode is rarely used as a summary statistic except to describe a bimodal distribution. In a bimodal distribution, the taller peak is called the major mode and the shorter one is the minor mode.
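
A minimal sketch of the cases just described (no mode, one mode, several modes). The helper `modes` is a name chosen for this example. It follows the convention in the text that a data set in which every value occurs only once has no mode, which differs from `statistics.multimode` in the standard library, where every value would be returned in that case.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] if every value occurs only once."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # convention from the text: no value repeats, so no mode
    # Counter keeps first-seen order, so modes come back in order of appearance.
    return [value for value, count in counts.items() if count == top]


print(modes([1, 2, 3]))                  # []
print(modes([1, 2, 2, 3]))               # [2]
print(modes([1, 2, 2, 3, 3]))            # [2, 3]
print(modes(["red", "blue", "red"]))     # ['red'] (works for nominal data)
```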

    Advantages

    1. It is the only measure of central tendency that can be used for data measured in a nominal scale.[5]

    2. It can be calculated easily.

    Disadvantages

    1. It is not used in statistical analysis because it is not algebraically defined, and because the frequency of observations fluctuates more when the sample size is small.

    POSITION OF MEASURES OF CENTRAL TENDENCY

    The relative position of the three measures of central tendency (mean, median, and mode) depends on the shape of the distribution. All three measures are identical in a normal distribution [Figure 1a]. As mean is always pulled toward the extreme observations, the mean is shifted to the tail in a skewed distribution [Figure 1b and c]. Mode is the most frequently occurring score and hence it lies in the hump of the skewed distribution. Median lies in between the mean and the mode in a skewed distribution.[6,7]

    Figure 1: The relative position of the various measures of central tendency. (a) Normal distribution (b) Positively (right) skewed distribution (c) Negatively (left) skewed distribution
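
The ordering of the three measures in a skewed distribution can be checked on a small data set. The numbers below are invented for illustration: the first list has a long right tail, and the second is its mirror image with a long left tail.

```python
from statistics import mean, median, mode

# Made-up right-skewed data: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image around 8, giving a left-skewed data set.
left_skewed = [16 - x for x in right_skewed]

right = (mode(right_skewed), median(right_skewed), mean(right_skewed))
left = (mode(left_skewed), median(left_skewed), mean(left_skewed))

print("right skew (mode, median, mean):", right)  # mode < median < mean
print("left skew  (mode, median, mean):", left)   # mode > median > mean
```

In both cases the mean sits closest to the tail and the median falls between the mean and the mode.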

    SELECTING THE APPROPRIATE MEASURE

    Mean is generally considered the best measure of central tendency and the most frequently used one. However, there are some situations where the other measures of central tendency are preferred.

    Median is preferred to mean[3] when

    1. There are a few extreme scores in the distribution.

    2. Some scores have undetermined values.

    3. There is an open-ended distribution.

    4. Data are measured in an ordinal scale.

    Mode is the preferred measure when data are measured in a nominal scale. Geometric mean is the preferred measure of central tendency when data are measured in a logarithmic scale.[8]
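
As a rough illustration of the geometric mean on log-scale data, the sketch below uses made-up values that are evenly spaced on a logarithmic scale. It relies on `statistics.geometric_mean`, available in Python 3.8 and later.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical values spread evenly on a log scale (1, 10, 100).
values = [1, 10, 100]

arithmetic = mean(values)     # 37, pulled toward the largest value
geometric = geometric_mean(values)  # about 10, the middle of the log scale

# Equivalent view: average the logarithms, then transform back.
via_logs = 10 ** mean(math.log10(v) for v in values)

print(arithmetic, geometric, via_logs)
```

The geometric mean is the back-transformed arithmetic mean of the logarithms, which is why it suits data that are naturally analysed on a log scale.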

    Footnotes

    Source of Support: Nil

    Conflict of Interest: None declared.

    REFERENCES

    1. Swinscow TD, Campbell MJ. Statistics at square one. 10th ed (Indian). New Delhi: Viva Books Private Limited; 2003.

    2. Gravetter FJ, Wallnau LB. Statistics for the behavioral sciences. 5th ed. Belmont: Wadsworth – Thomson Learning; 2000.

    3. Sundaram KR, Dwivedi SN, Sreenivas V. Medical statistics principles and methods. 1st ed. New Delhi: B.I Publications Pvt Ltd; 2010.

    4. Petrie A, Sabin C. Medical statistics at a glance. 3rd ed. Oxford: Wiley-Blackwell; 2009.

    5. Norman GR, Streiner DL. Biostatistics the bare essentials. 2nd ed. Hamilton: B.C. Decker Inc; 2000.

    6. SundarRao PS, Richard J. Introduction to biostatistics and research methods. 4th ed. New Delhi: Prentice Hall of India Pvt Ltd; 2006.

    7. Glaser AN. High Yield Biostatistics. 1st Indian ed. New Delhi: Lippincott Williams and Wilkins; 2000.

    8. Dawson B, Trapp RG. Basic and Clinical Biostatistics. 4th ed. New York: McGraw-Hill; 2004.


    Articles from Journal of Pharmacology & Pharmacotherapeutics are provided here courtesy of Wolters Kluwer -- Medknow Publications


    What central tendency is used for outliers?

    The mean is the only measure of central tendency that is always affected by an outlier, so the median is usually the better choice when outliers are present. The mean, or average, is the most popular measure of central tendency.

    What measure can be used to identify an outlier?

    You can convert extreme data points into z scores that tell you how many standard deviations away they are from the mean. If a value has a high enough or low enough z score, it can be considered an outlier. As a rule of thumb, values with a z score greater than 3 or less than –3 are often considered to be outliers.
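
The z score rule of thumb can be sketched as follows. The readings are made up, the function names are chosen for this example, and the sample standard deviation (`statistics.stdev`) is assumed; using the population standard deviation instead would change the scores slightly.

```python
from statistics import mean, stdev


def z_scores(values):
    """Number of (sample) standard deviations each value lies from the mean."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return [0.0 for _ in values]  # all values identical: nothing deviates
    return [(x - m) / s for x in values]


def find_outliers(values, threshold=3.0):
    """Values whose z score is above +threshold or below -threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical readings: twenty values near 50 and one extreme value.
readings = [48, 49, 50, 51, 52] * 4 + [95]
outliers = find_outliers(readings)
print(outliers)  # [95]
```

Note that the extreme value itself inflates both the mean and the standard deviation, so in very small samples no point can reach a z score of 3 and the rule becomes uninformative.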

    Which measures of center are sensitive to outliers?

    The four measures of center are mean, median, mode, and midrange. Mean: the mean is what you know as the average. It is calculated by adding all of the values in a set and dividing the sum by the total number of values in that set. The mean is very sensitive to outliers.
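
The passage names the midrange without defining it. Under the usual definition (the average of the smallest and largest values), a short sketch with made-up numbers shows that it reacts to an outlier even more strongly than the mean, since it depends only on the two extremes.

```python
from statistics import mean, median


def midrange(values):
    """Midrange: halfway between the minimum and the maximum."""
    return (min(values) + max(values)) / 2


# Hypothetical data, with and without one extreme value.
data = [2, 4, 4, 5, 7]
data_with_outlier = data + [40]

midrange_before = midrange(data)               # 4.5
midrange_after = midrange(data_with_outlier)   # 21.0
median_after = median(data_with_outlier)       # 4.5
mean_after = mean(data_with_outlier)           # about 10.33

print(midrange_before, midrange_after, mean_after, median_after)
```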

    When outliers are present in the data set, which measure is best to describe central tendency?

    Of the three measures of central tendency, the mean is most heavily influenced by any outliers or skewness, so the median is generally the better choice when outliers are present. In a symmetrical, unimodal distribution, the mean, median, and mode are all equal. In these cases, the mean is often the preferred measure of central tendency.