Data Smoothing Using a LOG10 Function
Forecasts developed using the LOG10 function are superior to the forecast based on the original values.
Fitting trend lines to service-oriented measurements is another problem that you can encounter. While trend-based estimates of future service levels can be inferior to the results that you can obtain using analytic queuing models, you can often fit regression models to service-oriented measurements to obtain a sense of direction for the values. Consider the following table of total pages per second collected for a 3090-200J processor:
          Observation    Page
 Month      Number       /Sec
=======   ===========   ======
 JAN98         1           75
 FEB98         2           77
 MAR98         3           80
 APR98         4           86
 MAY98         5           97
 JUN98         6          113
Figure 7-6 shows a scatter plot of the data. A linear regression model that was developed for this historical data series has the following parameters:
n    = 6,    the number of historical observations
b    = 83.1, the y intercept
m    = 1.41, the slope of the line
r**2 = 0.17, the coefficient of determination
F    = 29.3, the F value
p    = 0.01, the probability that we should reject the hypothesis
s(e) = 14.8, the standard error
The predicted and residual values for the historical data series are shown in the following table:
          Observation    Page     Est.    Residual
 Month      Number       /Sec    Pages    (error)
=======   ===========   ======   ======   ========
 JAN98         1           75     84.5       9.5
 FEB98         2           77     85.9       8.9
 MAR98         3           80     87.3       7.3
 APR98         4           86     88.7       2.7
 MAY98         5           97     90.1      -6.9
 JUN98         6          113     91.5     -21.5
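As an illustration, the estimated and residual columns can be derived from the quoted intercept and slope. The following Python sketch is not part of the product; the names estimate and rows are hypothetical. It assumes the residual is the estimate minus the observed value, which is the sign convention the table appears to use. The results agree with the table to within rounding.

```python
# Observed pages/sec for observations 1..6 (from the table above).
observed = [75, 77, 80, 86, 97, 113]

# Parameters of the linear model as quoted in the text.
b = 83.1   # y intercept
m = 1.41   # slope of the line


def estimate(obs_number):
    """Estimated pages/sec from the straight-line model b + m*x."""
    return b + m * obs_number


rows = []
for obs_number, pages in enumerate(observed, start=1):
    est = estimate(obs_number)
    # Assumed sign convention: estimate minus observed value.
    residual = est - pages
    rows.append((obs_number, pages, est, residual))

for obs_number, pages, est, residual in rows:
    print(f"{obs_number}  {pages:4d}  {est:6.2f}  {residual:7.2f}")
```

The residuals run from positive to strongly negative, which shows that a straight line with these parameters falls further behind the observations as paging accelerates.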
The model represents the historical paging data poorly, as the small value of r-squared indicates. To understand why this model fails, consider the underlying mechanism that produced the observations. Paging is an example of a service-oriented measurement that results from an underlying queuing relationship. That is, rather than expecting paging to increase linearly in response to a linear increase in load, you can expect it to increase exponentially. A standard technique (KEL74) for modeling exponential data is to use a logarithmic function to transform the data into a linear form. Univariate Model Forecasting employs a log base 10 function, as shown in the following equation:
x(j) = LOG10(x(j)+1.0), for all j (Eqn 11)
The 1.0 is added to the historical observation before the transformation to avoid the undefined result of the LOG10 function at zero. To interpret the forecasted values, we must reverse the transformation performed in Equation 11. The following equation shows this reverse transformation:
x(j) = 10**x(j) - 1.0, for all j (Eqn 12)
The 1.0 that is subtracted in Equation 12 corresponds to the 1.0 that was added in Equation 11. A LOG10 function was applied to the historical paging data presented in the previous table to "linearize" the observations. The result of this transformation is shown in the following table:
          Observation   LOG10
 Month      Number      Pages
=======   ===========   ======
 JAN98         1         1.88
 FEB98         2         1.89
 MAR98         3         1.90
 APR98         4         1.93
 MAY98         5         1.99
 JUN98         6         2.05
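Equations 11 and 12 form an inverse pair, which a short Python sketch can make concrete. The function names log10_transform and reverse_transform are hypothetical and chosen only for this illustration. The sketch checks that the round trip restores the original observations and that a zero observation is handled.

```python
import math


def log10_transform(x):
    """Equation 11: add 1.0 so that a zero observation stays defined."""
    return math.log10(x + 1.0)


def reverse_transform(y):
    """Equation 12: undo Equation 11 by exponentiating and subtracting 1.0."""
    return 10.0 ** y - 1.0


observed = [75, 77, 80, 86, 97, 113]
transformed = [log10_transform(x) for x in observed]
restored = [reverse_transform(y) for y in transformed]

# A zero observation maps to LOG10(1.0), which is 0.0, not an error.
zero_case = log10_transform(0.0)

for x, y, r in zip(observed, transformed, restored):
    print(f"{x:4d}  {y:.4f}  {r:.4f}")
```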
Using the smoothed observations, a second linear model was developed. The parameters of this model are shown below:
n    = 6,     the number of historical observations
b    = 1.84,  the y intercept
m    = 0.03,  the slope of the line
r**2 = 0.96,  the coefficient of determination
F    = 31.82, the F value
p    = .005,  the probability that you should reject the hypothesis
s(e) = -0.06, the standard error. (Note that you must transform this value back from the logarithmic scale to obtain the actual standard error, 10**-.06 = 0.88.)
The future observations predicted by the model must be transformed back from the logarithmic scale in the same manner as the standard error value (described above). The predicted and residual values developed from the model are shown in the following table:
                                   Est
          Obs    Page    LOG10    LOG10   Trans   Residual
 Month     #     /Sec    Page     Pages    Est    (error)
=======   ===   ======   ======   =====   =====   ========
 JAN98     1      75      1.88    1.87     74.1      0.9
 FEB98     2      77      1.89    1.90     79.4     -2.4
 MAR98     3      80      1.90    1.93     86.1     -6.1
 APR98     4      86      1.93    1.96     91.2     -5.2
 MAY98     5      97      1.99    1.99     97.7     -0.7
 JUN98     6     113      2.05    2.02    104.7      8.3
Forecasts developed using the transformed paging observations are superior to the forecast based on the original values. Consider logarithmic transformation any time you are modeling service-related values like device utilizations, turnaround times, or response times.
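The overall procedure described in this section (transform with Equation 11, fit a straight line, forecast, then reverse with Equation 12) can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a plain ordinary least squares fit, the helper names fit_line and forecast are hypothetical, and the parameters it computes from the six observations are not expected to match the rounded values quoted above.

```python
import math

# Observed pages/sec for observations 1..6 (from the first table).
observed = [75, 77, 80, 86, 97, 113]
xs = list(range(1, len(observed) + 1))


def fit_line(xs, ys):
    """Ordinary least squares for y = b + m*x; returns (b, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return b, m


# Step 1: linearize the observations (Equation 11).
logged = [math.log10(v + 1.0) for v in observed]

# Step 2: fit a straight line to the transformed values.
b, m = fit_line(xs, logged)


def forecast(obs_number):
    """Step 3: predict on the log scale, then reverse (Equation 12)."""
    return 10.0 ** (b + m * obs_number) - 1.0


next_month = forecast(7)
print(f"b = {b:.4f}, m = {m:.4f}, forecast for observation 7 = {next_month:.1f}")
```

Because the line is fitted on the logarithmic scale, each additional observation multiplies the forecast by a constant factor rather than adding a constant amount, which is the exponential behavior expected of paging.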
Figure 7-6. Monthly Page/Sec Values
[Scatter plot titled PAGING DATA: PAGES/SEC on the vertical axis (75 to 110) plotted against OBSERVATION NUMBER on the horizontal axis (1 to 6).]