Excel Tutorial: Statistical Functions
These are some of the common statistical functions in Excel that are useful in investing. Again, these are not the only ones to be found in Excel, I’m just filtering the onses most useful to investors.
AVERAGE(number1,number2...) Returns the average or arithmatic mean of the series.
GEOMEAN(number1,number2...) Returns the geometric mean of the series. See Note 1.
for difference between geometric and arithmatic mean.
MEDIAN(number1,number2...) Returns the median (i.e. midpoint) of the series
MAX(number1,number2...) Returns the maximum value in the series
MIN(number1,number2...) Returns the minimum value in the series
COUNT(number1,nubmer2...) Returns the number of values in the series.
FORECAST(X,known_X,known_Y) Extrapolates Y given a series of known X and Y values.
Uses linear regression (i.e. fitting a straight line
through the data) to project forward results.
TREND(known_y,known_x,X,const) Extrapolates Y given a series of known X and Y values.
Similar to FORECAST(), TREND() uses linear regression.
GROWTH(known_y,known_x,x,const) Extrapolates Y given a series of known X and Y values.
Similar to TREND(), but GROWTH() fits an exponential
curve through the data to project forward results.
LINEST(known_y,known_x,const,stats) Returns the statistical information on the line that
best fits the data. This function must be entered as
an array formula. See Note 2.
LOGEST(known_y,known_x,const,stats) Returns the statistical information on the exponential
curve that best fits the data. This function must be
entered as an array formula. See Note 2.
number1, number2, etc. are variables that can be entered directly into the equation or can be replace with a cell reference or range of cells.
known_X, known_Y are variables (usually a range of cells) that are input into the regression analysis.
X is the next X variable. Typically TREND(), GROWTH() etc. solve for an extrapolated Y value given the known_Xs & known_Ys and an assumed X. For example the annual returns on a portfolio are 3%, 17%, 23%, & -15%. These would be the known_Y values, the known_X values are the years 1, 2, 3, & 4. X would be the year for which you want TREND() to project a corresponding Y value.
Const is an optional TRUE/FALSE variable which can be excluded. It is used in the line/curve fitting function to force the y-interect to go through 0. If the flag is set to TRUE, the function will return a slope for a line/curve through the origin.
Stats is an optional TRUE/FALSE variable which can be excluded. If it is set to TRUE, additional statistical information (mn,mn-1,…,m1,b;sen,sen-1,…,se1,seb;r 2,sey) will be provided by the function.
Note 1:
When talking about average growth rates there are two types, arithmatic mean and geometric mean. Arithmatic mean is what is commonly known as the average, and is calculated from summing all the terms and dividing by the number of terms (i.e. (X + Y + Z) / 3). Geometric mean is calculated by multiplying all the terms and taking the n_th root of the product (i.e. (X * Y * Z) ^ (1/3) ).
The significance of this is in growth rates. Growth rates fluctuates from year to year and trying to nail down an average growth rate over a number years is a bit difficult if you use arithmatic mean. Take the following example of a portfolio’s 10 year performance.
Year Growth Value 0 $10,000 1 5% $10,500 2 8% $11,340 3 12% $12,701 4 23% $15,622 5 -2% $15,310 6 -6% $14,391 7 35% $19,428 8 10% $21,371 9 16% $24,790 10 1% $25,038
Average Growth (arithmatic mean):
(5% + 8% + 12% + 23% – 2% – 6% + 35% + 10% + 16% + 1%) / 10 = 10.2%
Now, according to the law of compounding, the ending value of the portfolio could be calculated given the starting value, growth rate and time period. Since there were no deposits or withdrawls to the portfolio, the compound interest equation can be use. Starting_Value * (1 * Growth_Rate) ^ Time. But if you plug in the values from this example you get
$10k * (1 + 10.2%) ^ 10 = $26,413.
$1,375 higher than the actual ending value.
To get the true average growth rate, also known as the compounded annual growth rate (CAGR), you need to use the geometric mean. It is calculated a bit differently because only positive numbers can be input into the equation.
(1.05 * 1.08 * 1.12 * 1*23 * 0.98 * 0.94 * 1.35 * 1.1 * 1.16 * 1.01) ^ (1 / 10) = 9.6%
Using the geometric mean in the compound interest equation we get
$10k * (1 + 9.6%) ^ 10 = $25,038
which matches the actual results. For further discussion on the difference and applicability of the geometric mean, see this article
Note 2:
Array formulas are for functions that return several pieces of data. LINEST() return two pieces of information, the slope and y-intercept of the best fit line. Since two separate numbers cannot occupy the same cell, the function needs to be spread across two cells. Example:
A B
Row 1 1 1
Row 2 2 2
Row 3 3 3
Row 4 4 5
Row 5 5 8
Row 6 6 13
Row 7 7 21
Row 8 8 34
Row 9 9 55
Row 10 10 89
To enter the LINEST() function, highlight cells A11 & B11. Then type
=LINEST(B1:B10,A1:A10)
but instead of hitting ENTER at the end, hold down SHIFT + CTRL + ENTER.
The results should be 8.31 and -22.60.
Similarly, =LOGEST(B1:B10,A1:A10) should return 1.63 and 0.69.
To project values of Column B for values of 11, 12, 13, 14 & 15 in Column A, either the TREND() or FORECAST() functions can be used for linear regression or the GROWTH() function can be used for exponential growth projections.
Example of TREND():
A B
Row 1 1 1
Row 2 2 2
Row 3 3 3
Row 4 4 5
Row 5 5 8
Row 6 6 13
Row 7 7 21
Row 8 8 34
Row 9 9 55
Row 10 10 89
Row 11 11 =TREND(B1:B10,A1:A10,A11)
Row 12 12 =TREND(B1:B10,A1:A10,A12)
Row 13 13 =TREND(B1:B10,A1:A10,A13)
Row 14 14 =TREND(B1:B10,A1:A10,A14)
Row 15 15 =TREND(B1:B10,A1:A10,A15)
Results:
A B
Row 1 1 1
Row 2 2 2
Row 3 3 3
Row 4 4 5
Row 5 5 8
Row 6 6 13
Row 7 7 21
Row 8 8 34
Row 9 9 55
Row 10 10 89
Row 11 11 68.80
Row 12 12 77.11
Row 13 13 85.42
Row 14 14 93.73
Row 15 15 102.04
If the GROWTH() function was used in place of TREND(), the results would be:
A B
Row 1 1 1
Row 2 2 2
Row 3 3 3
Row 4 4 5
Row 5 5 8
Row 6 6 13
Row 7 7 21
Row 8 8 34
Row 9 9 55
Row 10 10 89
Row 11 11 147.74
Row 12 12 240.69
Row 13 13 392.11
Row 14 14 638.79
Row 15 15 1040.67
To get an idea of how well the TREND() and GROWTH() projections work, I created some charts plotting the projected values against the real values. In this example I used Fibonacci numbers as a pattern to fit a regression line.
You can see from the charts that the exponential regression works much better than linear regression for fitting this data set. If LOGEST() had some more data points to work with, the slope of the regression line would go to phi.

Discussion Area - Leave a Comment