Mahalanobis distance

The Mahalanobis distance is a measure of the distance between two multidimensional points that takes the covariance of the variables into account. It is particularly useful when multivariate normal distributions are involved, although its definition does not require the distributions to be normal.

This function calculates the distance from a point to the mean of a distribution.

 $$D^2 = (x - \mu)' \, \Sigma^{-1} (x - \mu)$$

where $x$ is the observation vector, $\mu$ is the mean vector and $\Sigma$ is the covariance matrix of the multivariate distribution. D is called the Mahalanobis distance of the point x to the mean of the distribution; the sample output below reports the squared distance, labelled D2.
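As a rough illustration of the formula, the following Python sketch computes the squared distance for a small made-up two-variable example. The function names `solve` and `mahalanobis_sq` and the numbers are assumptions for illustration only and are not taken from the software described here. Rather than forming the inverse covariance matrix explicitly, the sketch solves the linear system with the covariance matrix, which gives the same result.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def mahalanobis_sq(x, mean, cov):
    """Return D^2 = (x - mean)' cov^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, d)  # z = cov^{-1} d
    return sum(di * zi for di, zi in zip(d, z))


# Hypothetical example: two uncorrelated variables with variances 4 and 1.
d2 = mahalanobis_sq([2.0, 1.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]])
print(d2)  # 2.0
```

With uncorrelated variables each squared deviation is simply divided by its variance (4/4 + 1/1 = 2); a non-diagonal covariance matrix additionally accounts for correlation between the variables.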

The Mahalanobis distance D is useful for detecting multivariate outliers (observations suspected to be quite different from the "average" observations in the data set).

 

[Figure: sample input window]

Below is the sample output and explanation:

 

Mahalanobis distance for:
RATING
COMPLAINTS
PRIVILEGES
LEARNING
RAISES
CRITICAL
ADVANCE

Observations with missing values will be excluded

Total number of observations: 30

 

(1) Means for the variables were not specified, so the sample means are used. The means calculated from the sample are listed below.

 

Comparing to means:

RATING      64.63333
COMPLAINTS  66.60000
PRIVILEGES  53.13333
LEARNING    56.36667
RAISES      64.63333
CRITICAL    74.76667
ADVANCE     42.93333
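The two preparation steps described above (excluding observations with missing values, then using the sample means as the reference point) can be sketched in Python as follows. This is only an illustration under assumptions: the data are made up, `None` is used here to mark a missing value, and the helper names `complete_cases` and `column_means` are hypothetical.

```python
def complete_cases(rows):
    """Keep only rows with no missing value (None marks a missing value here)."""
    return [row for row in rows if all(v is not None for v in row)]


def column_means(rows):
    """Return the mean of each column of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical data: three variables, one observation with a missing value.
data = [
    [60.0, 50.0, 40.0],
    [70.0, None, 45.0],   # excluded: contains a missing value
    [80.0, 70.0, 50.0],
]
kept = complete_cases(data)
means = column_means(kept)
print(len(kept), means)  # 2 [70.0, 60.0, 45.0]
```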

(2) The D2 and its p-value for each observation are listed below.

 

Mahalanobis distance (D2) for each ID comparing to above means

      ID         D2     Pvalue
 [1,]  1 10.6222881 0.15596424
 [2,]  2  0.6538401 0.99866582
 [3,]  3  4.7748135 0.68742159
 [4,]  4  3.8175183 0.80053955
 [5,]  5  2.7765024 0.90487934
 [6,]  6 10.9948389 0.13884430
 [7,]  7  4.7508969 0.69033193
 [8,]  8  2.8684187 0.89690185
 [9,]  9  9.8296665 0.19843337
[10,] 10  2.7599293 0.90628737
[11,] 11  5.3408563 0.61844239
[12,] 12  6.4739678 0.48561924
[13,] 13  7.7988697 0.35066319
[14,] 14 14.8325448 0.03820686
[15,] 15  3.7096907 0.81254204
[16,] 16 14.9598931 0.03651697
[17,] 17  5.5322872 0.59529234
[18,] 18 15.1145674 0.03455854
[19,] 19  3.1478805 0.87099331
[20,] 20  5.4825762 0.60128633
[21,] 21  7.2473061 0.40359276
[22,] 22  2.2790047 0.94279973
[23,] 23  3.8410188 0.79789612
[24,] 24 11.5037256 0.11810615
[25,] 25  5.6767703 0.57795643
[26,] 26 13.3770493 0.06343744
[27,] 27  4.1924675 0.75736129
[28,] 28  5.7809935 0.56554182
[29,] 29  4.7655992 0.68854305
[30,] 30  8.0942186 0.32435921
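The p-values in this table appear consistent with the upper-tail probability of a chi-square distribution with 7 degrees of freedom (one per variable), which is the usual reference distribution for D2 under multivariate normality. That reading is an assumption, since this page does not state how the p-values are computed. A minimal Python sketch under that assumption, using only the standard library (the function name `chi2_sf` is hypothetical):

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(y))   # closed form for df = 1
    else:
        a, tail = 1.0, math.exp(-y)              # closed form for df = 2
    # Step up to df/2 with Q(s + 1, y) = Q(s, y) + y**s * exp(-y) / Gamma(s + 1).
    while a < df / 2.0:
        tail += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return tail


# D2 values for IDs 14 and 2 from the table above, with df = 7 (assumed).
print(round(chi2_sf(14.8325448, 7), 5))
print(round(chi2_sf(0.6538401, 7), 5))
```

If the assumption holds, the printed values should agree closely with the Pvalue column for those two observations.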

(3) The outliers (observations whose D2 has p < 0.05) are listed below.

Following observations with Mahalanobis distance (D2) p<0.05

     ID       D2     Pvalue
[1,] 14 14.83254 0.03820686
[2,] 16 14.95989 0.03651697
[3,] 18 15.11457 0.03455854
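The selection in step (3) amounts to a simple filter on the p-value. The Python sketch below applies it to a handful of rows copied from the table in step (2); the variable names are hypothetical and only a subset of the 30 observations is shown.

```python
ALPHA = 0.05  # significance threshold used in the output above

# (ID, D2, p-value) rows copied from the table in step (2); only a few are shown.
rows = [
    (13, 7.7988697, 0.35066319),
    (14, 14.8325448, 0.03820686),
    (16, 14.9598931, 0.03651697),
    (18, 15.1145674, 0.03455854),
    (26, 13.3770493, 0.06343744),
]

# Keep the observations whose p-value falls below the threshold.
outliers = [(obs_id, d2, p) for obs_id, d2, p in rows if p < ALPHA]
for obs_id, d2, p in outliers:
    print(obs_id, d2, p)
```

Observation 26 (p = 0.06343744) is close to the threshold but is not flagged, which matches the list above.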