This article aims to measure the impact of the Top-Down and Bottom-up approaches on the sales forecast error variance with the simple exponential smoothing method, relating the error variance to the most common characteristics of a sales series: coefficient of correlation, sales variance, and aggregate sales share.
To this end, the expressions for the forecast error variance under these two approaches were derived analytically for different response time conditions: unitary, constant and variable. The resulting forecast error variances for the Top-Down and Bottom-up approaches were then compared in two situations with respect to the alpha smoothing constant: equal between the approaches and allowed to differ.
It was found that, depending on whether the smoothing constant (alpha) is kept equal or allowed to differ between the approaches, two main effects explain the behavior of the sales forecast error variance as a function of the most common characteristics of the sales series: the Portfolio Effect (equal constants) and the Anchoring Effect (different constants). Until now, these effects have been considered antagonistic in the sales forecasting literature.
The rest of the article is structured in six more sections. Section 2 is dedicated to the literature review. In section 3, the variances of sales forecasting errors in these two sales forecasting approaches for different response time conditions are analytically demonstrated. In sections 4 and 5, the results found are analyzed and discussed, that is, the adequacy of these approaches to the most common characteristics of a series of sales. Finally, in section 6 the conclusions are presented and in section 7, the bibliographical references.
- LITERATURE REVIEW
The first part of this section reviews the Top-Down and Bottom-up approaches to sales forecasting and their adequacy to different characteristics of the sales series of products or items; the second reviews the Simple Exponential Smoothing method and recent research in this area; and the third addresses the determination of the variance of sales forecast errors with this method and the need to consider the response time.
2.1 Top-Down and Bottom-up Approaches
There is great consensus among authors on the conceptualization and operationalization of the Top-Down (TD) and Bottom-up (BU) approaches in sales forecasting. For example, according to Lapide (1998), in the TD approach, the sales forecast is made for the sum of all items and then disaggregated item by item, generally based on the historical percentage of the item in the total. In this sense, Schwarzkopf et al. (1988) point out that in the TD approach, the aggregated total is first predicted and then apportioned into items, families or regions based on their historical proportions.
In the BU approach, each of the items is predicted separately and the forecasts are added if an aggregated forecast is needed for the group (Lapide, 1998). In other words, in the BU approach, the forecaster first prepares forecasts for each SKU (stock keeping unit) and then aggregates them at the level of interest of the analysis (Jain, 1995).
Despite the consensus related to the conceptualization and operationalization of the TD and BU approaches, there is great conflict between the authors about their adequacy or use in order to minimize sales forecast errors and their variance. This discussion is particularly important in light of the research carried out by Mentzer and Cox (1984) in 160 North American companies to determine the main factors involved in the accuracy of sales forecasts. In addition to factors such as formal training, type of industry and sales volume, the level at which sales forecasts are prepared (corporate level with further aggregation of the data, or product level with further disaggregation) also significantly affects the accuracy of the sales forecast. According to the authors, greater accuracy in sales forecasting is associated with a higher level of data aggregation.
Several authors have tried to relate the adequacy of the TD and BU approaches to different characteristics of the items' sales series, such as, for example, the sales correlation coefficient between the item under study and all the remaining items grouped together (r), the fraction or proportion of the item under study in the aggregated total sales (f) and the ratio between the variances of the item under study and that of the other grouped items (k2).
Although there are controversies regarding the adequacy of the TD and BU approaches, an implicit premise in some of the researched articles is related to the Portfolio Effect, a concept initially defined by Zinn, Levy and Bowersox (1989) – and later expanded to the response time by Tallon (1993) and Evers (1998) – to assess the impact of inventory centralization on the variance of aggregated sales in different markets.
According to the Portfolio Effect, the centralization of inventories, that is, the aggregation of sales, minimizes the total variance when the sales correlation coefficient between markets is -1 and the relative sales variance between markets is 1. Generally speaking, the rationale for the reduction in variance, according to the authors, is the compensation of sales fluctuations between two markets: when sales in one market increase, sales in the other market decrease by the same amount.
An example in this sense is the article by Kahn (1998), according to which, in the TD approach, the characteristic peaks and valleys of each item are canceled by the aggregation, constituting an artificial representation of the true nature of the business. Negative correlation between items would reduce the aggregate sales variance. Also corroborating the main conclusions of the Portfolio Effect, Schwarzkopf et al. (1988) pointed out that estimates based on aggregated data are more accurate than estimates based on individual forecasts when items have independent sales patterns (null correlation).
However, Lapide (1998) states that, as a general rule, the TD approach only makes sense if, and only if, the sales patterns of each item are the same. In other words, if all items are growing, decreasing or remaining stable, which characterizes a positive correlation between the sales of different items. The author goes on to state that, frequently, a product family is composed of items that potentially cannibalize each other, as in the case of a family with new and old products. For these items, the demand pattern is quite different, as some items grow at the expense of others (negative correlation), which would make the BU approach preferable.
Gordon, Morris and Dangerfield (1998) and Gelly (1999) discuss the suitability of the TD and BU approaches to other characteristics of the items' sales series in addition to the correlation coefficient.
More specifically, Gordon, Morris and Dangerfield (1998) studied more than 15,000 sales series, aggregated and disaggregated, generating forecasts using the Triple Exponential Smoothing method. The BU approach resulted in more accurate forecasts in 75% of the series, with the greatest gains in accuracy obtained for items with a strong positive correlation and when the item represented a large fraction of the aggregated sales series. In contrast, when the data are negatively correlated, the TD approach proved to be more accurate, regardless of the item's participation in the aggregate sales series.
Finally, in the case study presented by Gelly (1999), the TD approach proved more appropriate for items with a predictable sales pattern over time, that is, a low coefficient of variation in sales, which could result from a large share of the item in the aggregate sales series and a small ratio between the item's demand variance and that of the other grouped items.
Table 1 summarizes the impacts related to the TD and BU approaches identified in this section.
[Table 1: summary of the impacts of the TD and BU approaches – image not reproduced]
2.2 Simple Exponential Smoothing
According to Gijbels, Pope and Wand (1999), Simple Exponential Smoothing (AES) is the most commonly used model in time series forecasting. Its main advantages are related to the fact that it is a non-parametric model – that is, not associated with a given probability distribution – based on a simple algebraic formula whose recurrence allows the local level estimate of the time series to be updated quickly. AES and its extensions were developed in the late 1950s by Brown, Winters and Holt, among other authors (Chatfield, Koehler and Snyder, 2001). Among its main assumptions or limitations, AES does not consider growth or decline trends, seasonal fluctuations or cyclical variations.
Over the past twenty years, some research has been undertaken to better understand and describe AES and its extensions from a statistical point of view. For example, Chatfield, Koehler and Snyder (2001) compare a variety of potential Exponential Smoothing models derived from autoregressive moving averages, structural models and nonlinear dynamic state-space models, and explain why AES and its extensions are robust even in the face of changes in the variance of the time series. Blackburn, Orduna and Sola (1995) show that AES can introduce spurious autocorrelations in series whose trend component has been removed, and that these autocorrelations depend on the mean age of the data and on the value of the smoothing constant. Finally, Gijbels, Pope and Wand (1999) compare AES with (non-parametric) kernel regression, allowing a better understanding of the equivalence and adequacy between the two approaches.
Let F_t^1 be the AES sales forecast for item 1 at time t, α the smoothing constant, D_t^1 the actual sales of item 1 at time t, D_t the aggregated actual sales of all items at time t, and f the fraction of item 1's sales in total sales (assumed constant over time). The AES forecasts for item 1 in the TD and BU approaches are then given by:
F_t^1 = α·D_{t−1}^1 + (1 − α)·F_{t−1}^1   (BU)

F_t^1 = f·F_t = f·[α·D_{t−1} + (1 − α)·F_{t−1}]   (TD)

where F_t denotes the AES forecast for the aggregated sales at time t.
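The TD and BU forecasts described above can be sketched in a few lines of code. All series, the share f and the value of α below are hypothetical illustrations, not data from the article.

```python
def ses(series, alpha, f0):
    """Simple exponential smoothing: one-step-ahead forecasts for each period."""
    forecasts = [f0]
    for d in series[:-1]:
        forecasts.append(alpha * d + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical sales series for two items (illustrative numbers only).
item1 = [100, 110, 95, 105, 120, 90, 115, 100]
item2 = [200, 190, 210, 195, 180, 215, 185, 205]
aggregate = [a + b for a, b in zip(item1, item2)]

alpha = 0.2
f = sum(item1) / sum(aggregate)  # item 1's historical share of total sales

# BU: smooth item 1's own series directly.
forecast_bu = ses(item1, alpha, item1[0])
# TD: smooth the aggregate series, then apportion the share f to item 1.
forecast_td = [f * x for x in ses(aggregate, alpha, aggregate[0])]
```

The TD forecast inherits the smoothness of the aggregate series, which is the mechanism behind the effects discussed in the remainder of the article.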
2.3 Variance of Forecast Errors and Response Time
Many sales forecasting models do not use all available historical data. Moving averages, for example, only use the last n data points and AES assigns declining weights to past data (Silver and Peterson, 1985). Under these sampling circumstances, the most correct estimates for the mean and variance of sales are not exactly clear. The best estimate of average sales is simply the sales forecast for the next period. To estimate the variance of sales, the variance of the forecast error must be used.
According to Silver and Peterson (1985) and Greene (1997), the forecast error variance and the sales variance are not equal. For these authors, the rationale for preferring the forecast error variance is that forecasts are used to estimate sales: safety stock, for example, should be sized to protect against variations in the sales forecast errors. In general, the forecast error variance tends to be greater than the sales variance (Silver, Pyke and Peterson, 2002), because of the additional sampling error introduced by forecast models that use only part of the available historical data. Another situation that must be considered, as pointed out by Harrison (1967) and Johnston and Harrison (1986), is whether the sales forecast is frozen during the resupply response time (TR); in this case, the forecast error variance over this response time must be estimated. According to Silver and Peterson (1985), the exact relationship between the forecast error variance and the forecast error variance over the response time depends on complicated interactions between the demand pattern in question, the procedure for revising forecasts, and the value of n used in the moving average or of the smoothing constant (see, for example, Harrison, 1967). One of the reasons for this complexity is that the smoothing recurrence introduces a degree of dependence between the forecast errors of periods separated by the response time.
Silver and Peterson (1985) point out that the following empirical relationship can be used to estimate the variance of forecast errors in resupply response time:
V(E_TR) = TR^c · V(E)
where c is a coefficient that must be estimated empirically and V(E) is the forecast error variance. Let e_t^1 be the sales forecast error for item 1 at time t (defined as the difference between actual sales and the sales forecast), F_{t−TR}^1 the AES sales forecast for item 1 updated at the beginning of the response time, at t − TR, and SF_{TR,t}^1 the cumulative sales forecast for item 1 over the response time TR ending at t. The sales forecast error for item 1 over the response time in the BU and TD approaches (E_{TR}^1) is then given by:
E_{TR,t}^1 = Σ_{i=0}^{TR−1} D_{t−i}^1 − SF_{TR,t}^1 = Σ_{i=0}^{TR−1} D_{t−i}^1 − TR · F_{t−TR}^1
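Since a frozen forecast repeats the same value over each of the TR periods, the cumulative forecast reduces to TR times the forecast issued at the start of the window. A minimal numeric sketch (all values hypothetical):

```python
# Error of a forecast frozen over the response time TR: cumulative actual
# sales over TR minus TR times the forecast issued at t - TR
# (hypothetical illustrative numbers, not data from the article).
TR = 3
actual_sales = [105, 98, 112]   # actual sales over the response time window
frozen_forecast = 100.0         # AES forecast updated at t - TR, then frozen
cumulative_forecast = TR * frozen_forecast
error_tr = sum(actual_sales) - cumulative_forecast
```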
- VARIANCE OF FORECAST ERRORS
In this section, the variances of the AES sales forecast errors in the TD and BU approaches are derived for three particular circumstances: unitary response time, constant response time and variable response time (the latter requiring the variance of the random sum of item 1's actual sales over the response time and the variance of the product of item 1's AES sales forecast and the response time).
3.1 Unit Response Time
Let V(D_t^1) = σ1² be the variance of the actual sales of item 1, V(D_t) = σ² the variance of the aggregated sales of all items, and V(F_t^1) the variance of the AES sales forecast for item 1. The AES forecast error variances for item 1 in the TD and BU approaches, V(E_t^1), are given below.
[Equations (11)–(18): AES forecast error variances V(E_t^1) in the TD and BU approaches – image not reproduced]
Equations (12) and (16) result from summing the infinite terms of the geometric progressions in equations (11) and (15), respectively; equations (13) and (17) add the variances of the actual sales and of the AES sales forecast for item 1, which are independent, as shown below.
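This decomposition – the error variance as the sum of the sales variance and the (independent) forecast variance – can be checked by simulation. The sketch below assumes level-stationary i.i.d. normal demand for a single item, a simplification, and compares a Monte Carlo estimate with V(E) = σ² + ασ²/(2 − α) = 2σ²/(2 − α), the classical closed form for AES under that assumption; all parameter values are illustrative.

```python
import random

def mc_error_variance(alpha, mu=100.0, sigma=10.0, n=200_000, seed=42):
    """Monte Carlo estimate of the one-step AES forecast error variance
    under i.i.d. normal demand (a simplifying assumption)."""
    rng = random.Random(seed)
    forecast, errors = mu, []
    for t in range(n):
        d = rng.gauss(mu, sigma)
        if t > 1_000:                      # discard the warm-up period
            errors.append(d - forecast)
        forecast = alpha * d + (1 - alpha) * forecast
    m = sum(errors) / len(errors)
    return sum((e - m) ** 2 for e in errors) / len(errors)

alpha, sigma = 0.3, 10.0
v_mc = mc_error_variance(alpha, sigma=sigma)
# V(E) = V(D) + V(F) = sigma^2 + alpha*sigma^2/(2 - alpha) = 2*sigma^2/(2 - alpha)
v_theory = 2 * sigma**2 / (2 - alpha)
```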
Correlation Coefficient between Demand and Forecast
Let r_{D×F} be the correlation coefficient between the AES sales forecast for item 1 and the actual sales of item 1, given the expected values of the actual sales and of the AES sales forecast for item 1. The correlation coefficient between sales and forecast in the TD and BU approaches is then given by:
[Equations: correlation coefficient r_{D×F} in the TD and BU approaches – image not reproduced]
3.2 Constant Response Time
[Equations for the constant response time case, including (23) and (24) – image not reproduced]
3.3 Variable Response Time
Different models were used to determine the variance of the actual sales of item 1 over the response time, V(D_{TR,t}^1), and the variance of item 1's sales forecast frozen over the response time, V(F_{TR,t}^1).
In the first case, the variance of the random sum of actual sales over the response time was calculated using a factorial moment generating function (Zwillinger and Kokoska, 2000), assuming that actual sales and response time are independent and follow discrete distributions. The result corroborates that previously presented by Mentzer and Krishnan (1988), obtained for actual sales with a continuous distribution, response time with a discrete distribution, and independence between actual sales and response time. Those authors used a moment generating function to demonstrate the relationship known in the logistics literature
V(D_TR^1) = E(TR) · V(D^1) + [E(D^1)]² · σ_TR², where σ_TR²
represents the response time variance. Their demonstration required geometric and Laplace transforms. Although the result is valid for any combination of probability distributions, the proof that follows aims to illustrate the lower complexity involved in the proof via the factorial moment generating function.
In the second case, the variance of the product of two random variables, item 1's sales forecast and the response time, was calculated, a result identical to those presented by Brown (1982) and Wanke and Saliby (2005). The result
V(TR · F^1) = [E(TR)]² · V(F^1) + [E(F^1)]² · σ_TR² + V(F^1) · σ_TR²
is equivalent to calculating the variance of the sales forecast frozen over the response time.
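The product-of-variables result can be checked numerically. The sketch below, with hypothetical distributions, verifies the classical identity for independent random variables X and Y, V(XY) = E(X)²V(Y) + E(Y)²V(X) + V(X)V(Y), where X plays the role of the frozen forecast and Y the response time.

```python
import random

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

# X plays the role of the frozen forecast, Y the response time
# (hypothetical distributions, assumed independent).
rng = random.Random(7)
n = 400_000
xs = [rng.gauss(100.0, 8.0) for _ in range(n)]
ys = [rng.choice([3, 4, 5]) for _ in range(n)]

v_product = var([x * y for x, y in zip(xs, ys)])
v_identity = mean(xs) ** 2 * var(ys) + mean(ys) ** 2 * var(xs) + var(xs) * var(ys)
```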
Variance of Actual Sales in Response Time
Let P_D(t) and P_TR(t) be, respectively, the factorial moment generating functions of the discrete random variables demand (D) and response time (TR), and let P(t) be the factorial moment generating function of the random sum of the demand over the response time. The demand variance over the response time (equation 28) is obtained by evaluating the expression P''(t) + P'(t) − (P'(t))² at t = 1, where P'(t) and P''(t) are the first and second derivatives of P(t) (equation 27).
[Equations (27) and (28): factorial moment generating function P(t) and the demand variance over the response time – image not reproduced]
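The random-sum relationship can likewise be verified by simulation. The discrete distributions below are hypothetical; demand and response time are drawn independently, as assumed in the derivation, and the Monte Carlo variance of the demand over the response time is compared with E(TR)·V(D) + E(D)²·V(TR).

```python
import random

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / len(v)

rng = random.Random(1)
n = 200_000

# Hypothetical discrete distributions for demand and response time.
demand_support = [80, 100, 120]   # uniform: E(D) = 100, V(D) = 800/3
tr_support = [2, 3, 4]            # uniform: E(TR) = 3, V(TR) = 2/3

totals = []
for _ in range(n):
    tr = rng.choice(tr_support)
    totals.append(sum(rng.choice(demand_support) for _ in range(tr)))

e_d, v_d = 100.0, 800.0 / 3.0
e_tr, v_tr = 3.0, 2.0 / 3.0
v_mc = var(totals)
v_formula = e_tr * v_d + e_d ** 2 * v_tr   # E(TR)V(D) + E(D)^2 V(TR)
```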
Correlation Coefficient between Demand and Forecast in Response Time
Let r_{D_TR×F_TR} be the correlation coefficient between the sales forecast for item 1 and the actual sales of item 1 over the response time, given the expected values of the actual sales and of the sales forecast for item 1 over the response time. The correlation coefficient between sales and forecast over the response time in the TD and BU approaches is then given by:
[Equations: correlation coefficient r_{D_TR×F_TR} in the TD and BU approaches – image not reproduced]
- RESULTS ANALYSIS
This section presents and analyzes the results that equate the variances of the TD and BU approaches when the smoothing constants are equal (same alpha) and when they differ (a different alpha value may be adopted in each approach).
4.1 Equal Smoothing Constants
Considering k² the ratio between the sales variance of item 1 and the sales variance of all other aggregated items, considering also that σ² = σ1² + σ2² + 2ρσ1σ2, and equating equations (14) and (18) for the unitary response time, (23) and (24) for the constant response time, and (37) and (38) for the variable response time, we obtain the value of k that makes the forecast error variances indifferent between the TD and BU approaches.
[Equation (39): critical value of k (k_critical) as a function of f and ρ – image not reproduced]
According to the results, for α = 0 the forecast error variance is the same in the TD and BU approaches and equals TR·σ1²; that is, it does not depend on α, f, k or σTR². For 0 < α ≤ 1, the value of k (k_critical) that equates the forecast error variances in the TD and BU approaches depends on f and ρ.
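The qualitative behavior behind k_critical can be illustrated by simulation. The sketch below uses hypothetical level-stationary bivariate-normal demand, a simplification of the article's setting, and compares the one-step forecast error variance for item 1 under BU and TD at two correlation levels, with item 1 holding a small share f of aggregate sales.

```python
import math
import random

def sim(rho, alpha=0.3, n=100_000, seed=3):
    """One-step AES forecast error variances for item 1 under BU and TD.
    Hypothetical parameters: item 1 is small (f = 0.2) and less volatile
    than the remaining items grouped together."""
    rng = random.Random(seed)
    mu1, mu2, s1, s2 = 100.0, 400.0, 10.0, 20.0
    f = mu1 / (mu1 + mu2)                 # item 1's long-run share
    f1_bu, f_agg = mu1, mu1 + mu2
    errs_bu, errs_td = [], []
    for t in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        d1 = mu1 + s1 * z1
        d2 = mu2 + s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        if t > 1_000:                     # discard warm-up
            errs_bu.append(d1 - f1_bu)
            errs_td.append(d1 - f * f_agg)
        f1_bu = alpha * d1 + (1 - alpha) * f1_bu
        f_agg = alpha * (d1 + d2) + (1 - alpha) * f_agg

    def var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    return var(errs_bu), var(errs_td)

v_bu_neg, v_td_neg = sim(rho=-0.9)   # strongly negative correlation
v_bu_pos, v_td_pos = sim(rho=+0.9)   # strongly positive correlation
```

Under these parameters the TD error variance shrinks as the correlation becomes more negative, while the BU variance is unaffected by ρ, consistent with the Portfolio Effect discussed in the next section.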
4.2 Different Smoothing Constants
Also considering
[Supporting equation – image not reproduced]
and assuming different values of the smoothing constant for the TD (α_TD) and BU (α_BU) approaches, equating equations (14) and (18) for the unitary response time, (23) and (24) for the constant response time, and (37) and (38) for the variable response time yields the value of α_TD, as a function of α_BU, f, k and ρ, that makes the forecast error variances indifferent between the two approaches.
[Equation (40): α_TD as a function of α_BU, f, k and ρ – image not reproduced]
According to the results presented in (39) and (40), the choice of sales forecasting approach is independent of TR and σTR².
- DISCUSSION OF THE RESULTS
In this section, the results generated in the previous sections are discussed in greater depth, especially equations (39) and (40). Specifically, different random combinations of f and ρ were generated to evaluate the response in terms of the values of k in equation (39), and different random combinations of α_BU, f, k and ρ to evaluate the response in terms of the values of α_TD in equation (40). This is necessary because the non-linearity of the terms of these equations makes it difficult to directly assess the marginal impact of the variables involved.
5.1 Equal Smoothing Constants
Graph 1, constructed from equation (39), presents the indifference lines between the TD and BU approaches for different values of f, k and ρ. In essence, if k > k_critical the TD approach should be chosen; otherwise, the BU approach. In other words, given a pair (f, ρ), if the value of k is greater than the respective k_critical value (given by equation 39) associated with that pair, the TD approach with simple exponential smoothing should be chosen over the BU approach.
[Graph 1: indifference lines between the TD and BU approaches for different values of f, k and ρ – image not reproduced]
Graph 1 shows that the smaller the values of ρ and f, the greater the chances (area of the graph above the indifference line) that the TD approach minimizes the forecast error variance in the AES. This result is explained by the Portfolio Effect on item 1: negative correlation with the other aggregated items and a small percentage share of total sales. However, for larger values of f and ρ, the TD approach can still yield a smaller forecast error variance if the value of k is large enough. That is, even for items with a high share of the total and a positive correlation with the other aggregated items, the TD approach may be the best choice, provided the ratio between the sales variance of the item under study and that of the other products is sufficiently large. There appears to be a compensatory effect, or trade-off, between the values of the pair (f, ρ).
Although the relationship between f, ρ and k could be explored in two dimensions in Graph 1, different random combinations of f and ρ were generated from Uniform probability distributions to evaluate the response in terms of the values of k in equation (39). The f values were drawn from the interval between 0 and 1, and the ρ values from the interval between -1 and +1.
The results of 10,000 iterations were analyzed using Binomial Logistic Regression: if, for a randomly generated pair (f, ρ), the value of k fell between 0 and 1, the TD approach was classified as FEASIBLE; otherwise, NOT FEASIBLE. This corresponds to the concrete situation in which it would be a rare event to find a single item with a sales variance greater than the aggregate sales variance of the other items.
Table 2 shows the results of the Binomial Logistic Regression. Of the 10,000 randomly generated (f, ρ) pairs, 74.64% implied values of k between 0 and 1 (FEASIBLE) and the remainder, values greater than 1. The positive signs of the f and ρ coefficients indicate that the greater the item's share of aggregated sales and the stronger the positive correlation, the greater the probability that the TD approach will NOT minimize the forecast error variance, making the BU approach preferable.
[Table 2: Binomial Logistic Regression results for the equal-constants case – image not reproduced]
These results, according to Table 3, corroborate the views of Kahn (1998) and Gordon, Morris and Dangerfield (1998) on the adequacy of the TD and BU approaches.
[Table 3: comparison of the results with the literature – image not reproduced]
5.2 Different Smoothing Constants
Different random combinations of f, ρ, α_BU and k were generated from Uniform probability distributions to evaluate the response in terms of the α_TD values in equation (40). The f values were drawn from the interval between 0 and 1; the ρ values, between -1 and +1; the α_BU values, between 0 and 1; and the k values, between 0 and 5.
A further 10,000 iterations were carried out, and the results were analyzed using Multinomial Logistic Regression. If, for a given set of values of f, ρ, α_BU and k, the value of α_TD fell between 0 and 1, the TD approach was classified as FEASIBLE; otherwise, NOT FEASIBLE. Table 4 presents the results of this regression analysis.
[Table 4: Logistic Regression results for the different-constants case – image not reproduced]
Of the 10,000 randomly generated sets of f, ρ, α_BU and k values, only 30.1% implied α_TD values between 0 and 1 (FEASIBLE) and the rest, values greater than 1. The positive signs of α_BU and k and the negative signs of f and ρ indicate that the higher the smoothing constant in the BU approach and the ratio between the item's variance and that of the other grouped items, and the lower the item's share of aggregated sales and the correlation of the item's series with that of the other aggregated items, the greater the probability that the TD approach will not minimize the forecast error variance, making the BU approach preferable.
On the other hand, the greater the correlation coefficient and the item's fraction of total demand, and the smaller the ratio between the variances and the alpha value in the BU approach, the greater the chances that the TD approach minimizes the forecast error variance. This result can be conceptualized as the Anchoring Effect, by analogy with the Portfolio Effect of the previous section. Under the Anchoring Effect, the item in question has a large share of the family, low uncertainty and a similar demand pattern, so the uncertainty of the other items is diluted when the data are aggregated in the TD approach. Again, neither the response time nor its standard deviation affects the determination of the best approach.
These results, according to Table 5, corroborate the views of Lapide (1998) and Gelly (1999) on determining the most appropriate sales forecasting approach.
[Table 5: comparison of the results with the literature – image not reproduced]
- CONCLUSION
Although there is no disagreement in the literature about the meaning and operationalization of Top-Down and Bottom-up sales forecasts, there is considerable disagreement regarding the adequacy of these approaches to different characteristics of a given item and its respective sales series: correlation coefficient with the other aggregated items, relative variance and percentage share in total sales.
This article analytically demonstrates the adequacy of the Top-Down and Bottom-up approaches to sales forecasting based on the Simple Exponential Smoothing method, one of the most widespread in academic and business circles. The minimization of the sales forecast error variance was adopted as the adequacy criterion, and different scenarios were considered for the response time: unitary, constant and variable.
The results indicate that the mean and variance of the response time are not relevant for determining the most appropriate sales forecasting approach. Furthermore, the possibility of varying the alpha smoothing constant between the two forecasting approaches plays a key role in understanding and interpreting the results.
When the same smoothing constant is considered and the indifference lines between these forecasting approaches are analyzed, the classic Portfolio Effect is verified: the Top-Down approach becomes more suitable for items with negative correlation, small participation in aggregate sales and high uncertainty. On the other hand, when the smoothing constant is allowed to vary between the approaches, the Anchoring Effect is verified, with the Top-Down approach being more suitable for items with positive correlation, large share of aggregate sales and low uncertainty. In the first case (Portfolio Effect), the variance of the item under study is compensated by the aggregated variance of the other items and by the negative correlation. In the second case (Anchoring Effect), the item under study, being dominant within the aggregate sales series, contributes to the dilution of the combined variance of the other items, generating a more stable sales pattern.
In summary, the results reconcile the apparently antagonistic views in the literature on the adequacy of the Top-Down and Bottom-up approaches, at least for the Simple Exponential Smoothing method, by identifying the possibility of varying the smoothing constant between the two approaches as the link between them and the key to understanding what happens to the aggregated sales series.
Managers can also benefit from these results, since the flexibility often sought in the sales forecasting process is preserved. The results presented in this article help determine the best forecasting approach (Top-Down or Bottom-up) with much less computational effort.
Finally, the presented results are particularly relevant for companies that seek to segment their forecasting process (cf. Table 6).
[Table 6: recommended forecasting approach by item segment – image not reproduced]
For example, C items typically have lower sales and higher sales coefficients of variation. As a general rule, if they are negatively correlated with the A items – which have higher sales and lower sales coefficients of variation – and different smoothing constants are employed, the Bottom-up approach may be the most suitable. If the same smoothing constant is used, the C items should follow the Top-Down approach.
On the other hand, A items should be forecast individually (Bottom-up approach) if they are positively correlated with the aggregated sales of the remaining items and the same smoothing constant is used. When different smoothing constants are employed, the A items should follow the Top-Down approach.
- BIBLIOGRAPHY
BLACKBURN, K.; ORDUNA, F.; SOLA, M. Exponential smoothing and spurious correlation: a note. Applied Economics Letters, n. 2, pp. 76-79, 1995.
BROWN, R.G. Advanced service parts inventory control. Vermont: Materials Management Systems Inc., 1982.
CHATFIELD, C.; KOEHLER, A.B.; SNYDER, R.D. A new look at models for exponential smoothing. The Statistician, v. 50, n. 2, pp. 147-159, 2001.
EVERS, P.T.; BEIER, F.J. Operational aspects of inventory consolidation decision making. Journal of Business Logistics, v. 19, n. 1, pp. 173-189, 1998.
GELLY, P. Managing bottom-up and top-down approaches: Ocean Spray's experiences. The Journal of Business Forecasting, Winter, pp. 3-6, 1999.
GIJBELS, I.; POPE, A.; WAND, M.P. Understanding exponential smoothing via kernel regression. Journal of the Royal Statistical Society, v. 61, n. 1, pp. 39-50, 1999.
GORDON, T.P.; MORRIS, J.S.; DANGERFIELD, B.J. Top-down or bottom-up: which is the best approach to forecasting? The Journal of Business Forecasting, Fall, pp. 13-16, 1997.
GREENE, J.H. Production and Inventory Control Handbook. 3rd ed. New York: McGraw-Hill, 1997.
HARRISON, P.J. Exponential smoothing and short-term sales forecasting. Management Science, v. 13, n. 1, pp. 821-842, 1967.
JAIN, C. How to determine the approach to forecasting. The Journal of Business Forecasting, Summer, p. 2, 1995.
JOHNSTON, F.R.; HARRISON, P.J. The variance of lead-time demand. Journal of the Operational Research Society, v. 31, n. 8, pp. 303-308, 1986.
KAHN, K.B. Revisiting top-down versus bottom-up forecasting. The Journal of Business Forecasting, Summer, pp. 14-19, 1998.
LAPIDE, L. A simple view of top-down versus bottom-up forecasting. The Journal of Business Forecasting, Summer, pp. 28-29, 1998.
MENTZER, J.T.; COX, J.E. A model of the determinants of achieved forecast accuracy. Journal of Business Logistics, v. 5, n. 2, pp. 143-155, 1984.
MENTZER, J.T.; KRISHNAN, R. The effect of the assumption of normality on inventory control/customer service. Journal of Business Logistics, v. 6, n. 1, pp. 101-120, 1988.
SCHWARZKOPF, A.B.; TERSINE, R.J.; MORRIS, J.S. Top-down versus bottom-up forecasting strategies. International Journal of Production Research, v. 26, n. 11, pp. 1833-1843, 1988.
SILVER, E.A.; PETERSON, R. Decision Systems for Inventory Management and Production Planning. 2nd ed. New York: Wiley & Sons, 1985.
SILVER, E.A.; PYKE, D.; PETERSON, R. Decision Systems for Inventory Management and Production Planning and Scheduling. New York: Wiley & Sons, 2002.
TALLON, W. The impact of inventory centralization on aggregate safety stock: the variable supply lead time case. Journal of Business Logistics, v. 14, n. 1, pp. 185-196, 1993.
WANKE, P.; SALIBY, E. Proposal for new product inventory management: solution of the (q,r) model for the uniform distribution of demand and supply lead-time. Gestão & Produção, v. 12, n. 1, pp. 1-20, 2005.
ZINN, W.; LEVY, M.; BOWERSOX, D.J. Measuring the effect of inventory centralization/decentralization. Journal of Business Logistics, v. 10, n. 1, 1989.
ZWILLINGER, D.; KOKOSKA, S. Standard Probability and Statistics Tables and Formulae. New York: Chapman & Hall, 2000.