
COMBINATION OR COMPETITION OF FORECASTS: A CASE STUDY IN AGRIBUSINESS FREIGHT – PART 1

In view of the growing importance of logistics activities in Brazil, it becomes increasingly necessary to measure their different components. Transport alone represents, on average, 64% of companies' logistics costs, which makes it an important object of analysis for academics and managers. From this perspective, a case study is presented that analyzes the behavior of freight prices in the agribusiness sector, one of the most dynamic in the Brazilian economy.

  1. INTRODUCTION

Agribusiness represents almost 33% of the national GDP and involves around 37% of the Economically Active Population (EAP). This sector also accounts for 42% of Brazilian exports and places the country in a prominent position on the international scene (MAPA, 2004).

However, agribusiness faces some difficulties, mainly in infrastructure, which may raise questions about its ability to maintain this prominent position. According to Rodrigues (2004, quoted by Carvalho and Caixeta Filho, 2007), the lack of infrastructure can result in production being retained in the field, since rural producers may be left without the means to market or store their products, which would characterize the so-called “abundance crisis perspective”. For Castro (1995, quoted by Carvalho and Caixeta Filho, 2007), efficient logistics is a basic condition for the competitiveness of all sectors of the economy.

Seeking a deeper analysis of the freight market, price forecasts are conducted using five different methods, which have been used in the most diverse applications, from predicting financial indicators and optimizing functions to applications related to medicine. Their results are analyzed in the light of combination and competition, two alternative approaches to deal with forecasts generated by different methods.

Given the breadth of the subject, this article is divided into two parts. The first discusses the advantages of combining forecasts to obtain a more accurate result and analyzes three of the five main forecasting methods currently used by companies (Econometric Model, Classical Decomposition and ARIMA). The second part analyzes the remaining two methods (Neural Networks and Genetic Algorithms) and presents a case study on forecasting freight prices for a company in the agribusiness sector.

  2. COMBINATION X COMPETITION OF FORECASTS

The combination of forecasts has been studied for a long time. Empirical evidence indicates that accuracy improves when individual forecasts are combined. Gordon (1924, cited by Chase Jr, 2000), one of the first to conduct research in this field, provides a good example of one of the most common ways of combining predictions: in sporting events such as boxing, the scores of the three judges are averaged in search of the fairest result.

The advantages of combined forecasting become more evident in cases of high uncertainty or when forecasts are negatively correlated. For Chase Jr (2000), combining forecasts is preferable because the biases of the methods and/or of the people who make the forecasts tend to offset one another. Furthermore, forecasts within companies are generated by individuals with different data and objectives, so combining them makes the final forecast more balanced.

There are several methods for combining forecasts, the most basic being those that provide the best results: the simple average and the weighted average. For example, the notation for the simple average of three forecasting methods is:

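$$F_c = \frac{F_1 + F_2 + F_3}{3}$$

where F_c is the combined forecast and F_1, F_2 and F_3 are the forecasts generated by the three methods.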

Building a combined forecast requires choosing weights for each method. This decision can be made through optimization software, whose objective is to determine the weights that minimize the forecast error. The sum of the weights is restricted to one. The notation for the weighted average is presented below:

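$$F_c = w_1 F_1 + w_2 F_2 + w_3 F_3, \qquad w_1 + w_2 + w_3 = 1$$

where F_c is the combined forecast, F_1, F_2 and F_3 are the individual forecasts and w_1, w_2 and w_3 are their weights.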
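
As an illustration of the two combination rules, the sketch below computes a simple average of three forecasts and then searches for weights that minimize the mean squared error. It is only a minimal sketch: the data are hypothetical, the coarse grid search stands in for the optimization software mentioned in the text, and the restriction to non-negative weights is an added assumption (the text only requires that the weights sum to one).

```python
from itertools import product

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def combine(forecasts, weights):
    """Weighted average of several forecast series, period by period."""
    return [sum(w * f for w, f in zip(weights, period)) for period in zip(*forecasts)]

def best_weights(actual, forecasts, step=0.05):
    """Grid search over three non-negative weights that sum to one (assumption)."""
    n = round(1 / step)
    best = None
    for i, j in product(range(n + 1), repeat=2):
        if i + j > n:
            continue  # the third weight would be negative
        w = (i / n, j / n, (n - i - j) / n)
        err = mse(actual, combine(forecasts, w))
        if best is None or err < best[1]:
            best = (w, err)
    return best

# Hypothetical freight prices and three hypothetical method forecasts.
actual = [100.0, 104.0, 98.0, 110.0, 107.0]
f1 = [102.0, 101.0, 99.0, 108.0, 109.0]
f2 = [97.0, 106.0, 95.0, 113.0, 104.0]
f3 = [101.0, 107.0, 100.0, 106.0, 110.0]

simple = combine([f1, f2, f3], (1 / 3, 1 / 3, 1 / 3))
weights, weighted_err = best_weights(actual, [f1, f2, f3])
print("simple-average MSE:", round(mse(actual, simple), 3))
print("best weights:", weights, "MSE:", round(weighted_err, 3))
```

Weights fitted this way describe the past sample only; in practice they would normally be checked on data that was not used to choose them.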

On the other hand, competition consists of comparing different individual forecasts and choosing the best one. It is an interesting approach, considering that a given method may be more accurate for a set of data. In this case, very high errors in other predictions are not incorporated in the best one.

There is no proven best forecasting approach. The types of data and the objectives to be achieved must be analyzed and each case evaluated individually. Combined forecasts, in addition to the empirical evidence supporting their greater effectiveness, tend to generate a more balanced result, closer to the real central tendency over time. However, competing forecasts can generate a more accurate result, especially when forecasts are positively correlated. Flores and White (1989) consider that the best approach is the one that systematically generates the best results over time, be it combination or competition.
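
The comparison between the two approaches can be sketched as follows. This is an illustrative example only: the method names follow the three methods covered in this part, but all numbers are hypothetical, and the mean absolute error (MAE) is an assumed accuracy measure, since the text does not prescribe one.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def compete(actual, forecasts):
    """Competition: return the name of the method with the lowest MAE, plus all errors."""
    errors = {name: mae(actual, f) for name, f in forecasts.items()}
    winner = min(errors, key=errors.get)
    return winner, errors

def simple_average(forecasts):
    """Combination: period-by-period simple average of all methods."""
    series = list(forecasts.values())
    return [sum(period) / len(period) for period in zip(*series)]

# Hypothetical observed prices and hypothetical forecasts from each method.
actual = [100.0, 104.0, 98.0, 110.0]
forecasts = {
    "econometric": [103.0, 101.0, 100.0, 107.0],
    "decomposition": [98.0, 106.0, 97.0, 111.0],
    "arima": [101.0, 108.0, 95.0, 114.0],
}

winner, errors = compete(actual, forecasts)
combined_error = mae(actual, simple_average(forecasts))
approach = "combination" if combined_error < errors[winner] else "competition"
print("best single method:", winner, errors[winner])
print("combined MAE:", combined_error, "-> preferred approach:", approach)
```

In this made-up sample the errors of the methods partly cancel, so the combination beats the best single method; with other data the competition winner could come out ahead, which is why the choice has to be tracked over time.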

3. MAIN FORECAST METHODS

In the first part of this article, we will address three of the five main forecasting methods – subject to combination and competition – adopted by companies.

3.1 ECONOMETRIC MODEL

Econometrics is the science that unites economics, statistics and mathematics. For Gujarati (2000, quoted by Rangel, 2007), the econometric research method essentially aims at a conjunction of economic theory with concrete measures, using as a bridge the theory and techniques of statistical inference.

Regression analysis is one of the statistical techniques used in econometrics to explain a dependent variable Y by means of one or more independent variables Xi. In multiple regression, the correlation between independent variables should not be high. When it is, the variables contribute similar information to the model, which does not generate more accurate predictions, since there is no really “new” information to explain the behavior of the dependent variable.

The equation describing a multiple regression model is given by the linear function:

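$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$$

where \beta_0 is the intercept, \beta_1, ..., \beta_k are the partial regression coefficients of the independent variables X_1, ..., X_k and \varepsilon is the error term.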

The model parameters are estimated from the data set by the method of least squares, using software such as Excel and SPSS. Hanke and Wichern (2005) describe the meaning of regression coefficients:

“The partial coefficient of regression measures the average variation in the dependent variable per unit of variation in the independent variable, keeping the other independent variables constant.”
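
To make the least-squares estimation concrete, the sketch below fits a multiple regression with two explanatory variables by solving the normal equations. It is a minimal illustration, not the procedure used by Excel or SPSS: the variables (fuel price and distance) and all values are hypothetical, and the dependent variable is generated without noise from known coefficients so that the fit should recover them.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]  # column of ones for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: (fuel price, distance in km) explaining the freight price.
rows = [(2.0, 100.0), (2.5, 300.0), (3.0, 200.0), (3.5, 500.0), (4.0, 400.0)]
# Noise-free by construction: intercept 10, fuel coefficient 2, distance coefficient 0.5.
y = [10.0 + 2.0 * fuel + 0.5 * dist for fuel, dist in rows]

b0, b1, b2 = ols(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Read in the sense of the quotation above, the fuel coefficient is the average change in the freight price per unit change in fuel price, with distance held constant.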

For example, Castro (2003) studies the formation of prices in cargo transportation in Brazil. To explain the freight price (dependent variable), the author uses as explanatory variables: fuel price, average wage, insurance expense per ton, average transport distance, average vehicle loading, average lot size and percentage of revenue from fractional (less-than-truckload) cargo.

As seen, econometric models can have several applications. It is important to emphasize that the addition of variables to a model does not necessarily make it more accurate, especially when the correlation index between them is high. Therefore, it is fundamental to analyze the real contribution of a variable to a model. The occurrence of multicollinearity increases the standard error of the regression coefficients, making the variable less significant. That is, at the limit, these coefficients can assume any value, including zero, which can impair the analysis of the decision maker.
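
A quick way to screen for this problem is to measure how strongly two candidate explanatory variables are correlated before adding both to the model. The sketch below is a hedged illustration with hypothetical data; it uses the Pearson correlation and the variance inflation factor (VIF), a standard diagnostic that the text does not mention by name. The formula 1 / (1 - r^2) used here holds for the two-regressor case only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def vif_two_regressors(x1, x2):
    """Variance inflation factor when the model has exactly two regressors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical explanatory variables.
fuel = [2.0, 2.5, 3.0, 3.5, 4.0]
toll = [4.1, 4.9, 6.2, 6.8, 8.0]            # moves almost in lockstep with fuel
distance = [100.0, 300.0, 200.0, 500.0, 400.0]

print("fuel vs toll:     r =", round(pearson(fuel, toll), 3),
      "VIF =", round(vif_two_regressors(fuel, toll), 1))
print("fuel vs distance: r =", round(pearson(fuel, distance), 3),
      "VIF =", round(vif_two_regressors(fuel, distance), 1))
```

The VIF indicates how much the variance of a coefficient is inflated by the correlation; a value above roughly 10 is a common rule of thumb for problematic multicollinearity, although the threshold is a convention rather than a strict test.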

3.2 CLASSIC DECOMPOSITION

The classical decomposition method is one of the simplest methods used to generate forecasts from time series. Morettin and Toloi (1987, cited by Werner and Ribeiro, 2003) define a time series as any set of observations ordered in time. Hanke and Wichern (2005) present the following definition:

“A time series consists of data collected, recorded, or observed over successive increments of time”

Observations of a time series must be recorded at fixed time intervals. In general, they are recorded annually or monthly, depending on the purpose of the forecast. To generate a good forecast, the sample must be large; Tabachnick and Fidell (2001) consider that the minimum should be 50 observations. Normally, the values of a series are autocorrelated, that is, each value is related to its neighbors in time. This dependency produces a pattern of variability that can be used to predict future values and help manage business operations (Hanke and Wichern, 2005).
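
Autocorrelation can be measured directly from the data. The sketch below computes the sample autocorrelation at a given lag for a hypothetical monthly series that repeats the same pattern every 12 months; the series and the function name are illustrative assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Hypothetical monthly pattern repeated over three years (36 observations).
pattern = [10, 12, 15, 19, 22, 24, 23, 20, 16, 13, 11, 10]
series = pattern * 3

r12 = autocorrelation(series, 12)  # same month, one year apart
r6 = autocorrelation(series, 6)    # opposite points of the yearly cycle
print("lag 12:", round(r12, 3), "lag 6:", round(r6, 3))
```

A high positive value at lag 12 and a negative value at lag 6 reflect the yearly pattern built into the example; note that the example is shorter than the 50 observations recommended above and serves only to show the calculation.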

A time series is made up of four basic elements:

  1. Trend (T): It is the component that shows the growth or decline of the series over time.
  2. Cycle (C): The cyclical component represents the wave motion or cycle of a series of one or more years in duration, which tends to be periodic over several years.
  3. Seasonality (S): It is the wave fluctuation of the series within a year. It is influenced by the weather and events such as Christmas, school holidays and holidays.
  4. Irregular (I): It is the component that represents the random fluctuations of the series, which cannot be predicted. It can also be called random noise or error.

Figure 1 illustrates a series with seasonality and a small growth trend.

[Figure 1: Time series with seasonality and a small growth trend]

The classical decomposition can generate predictions (Y) through two models, the additive and the multiplicative. The equations are presented below:

$$Y = T + C + S + I \quad \text{(additive)}$$

$$Y = T \times C \times S \times I \quad \text{(multiplicative)}$$

Considering the difficulty in working with cycles – they can be confused with the trend (T) – some authors simplify forecasting models:

$$Y = T + S + I \quad \text{(additive)}$$

$$Y = T \times S \times I \quad \text{(multiplicative)}$$

Still others propose different simplifications, taking into account that the random error cannot be calculated:

[Equations: additive and multiplicative models without the irregular component (I)]

The calculation of the indices is very simple (it can be done in Excel), which is one of the great advantages of this method. In addition, although the sample must be of considerable size, only one variable is required, unlike the econometric model, which in general needs more than two. On the other hand, classical decomposition does not take into account factors external to the time series.
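
The calculation can be sketched in a few lines. The example below applies an additive decomposition with trend and seasonality only (no cycle, and the irregular component is not forecast) to a hypothetical quarterly series. The specific steps, a centered moving average for the trend, averaged detrended values for the seasonal indices and a straight line fitted to the deseasonalized series, are one common textbook procedure and an assumption here, not necessarily the exact routine the authors used.

```python
def centered_moving_average(series, period):
    """Trend estimate for an even period: a 2 x period centered moving average."""
    half = period // 2
    trend = [None] * len(series)  # undefined at both ends of the series
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        total = sum(window) - 0.5 * window[0] - 0.5 * window[-1]
        trend[t] = total / period
    return trend

def seasonal_indices(series, period):
    """Additive seasonal indices: mean detrended value per season, centered on zero."""
    trend = centered_moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    shift = sum(raw) / period
    return [r - shift for r in raw]

def linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (intercept, slope)."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    slope = sxy / sxx
    return mean_v - slope * mean_t, slope

def forecast(series, period, steps):
    """Forecast Y = T + S for the next `steps` periods."""
    s = seasonal_indices(series, period)
    deseasonalized = [y - s[t % period] for t, y in enumerate(series)]
    intercept, slope = linear_trend(deseasonalized)
    n = len(series)
    return [intercept + slope * (n + h) + s[(n + h) % period] for h in range(steps)]

# Hypothetical quarterly series: trend 100 + 2t plus seasonal effects (5, -3, -8, 6).
series = [105.0, 99.0, 96.0, 112.0, 113.0, 107.0, 104.0, 120.0,
          121.0, 115.0, 112.0, 128.0]
print([round(v, 2) for v in forecast(series, period=4, steps=4)])
```

For the multiplicative model the same steps apply with division in place of subtraction and multiplication in place of addition.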

3.3 ARIMA

ARIMA (Auto-Regressive Integrated Moving Average) models, like the classical decomposition method, are used for forecasting time series. They are linear models capable of representing both stationary and non-stationary series (Hanke and Wichern, 2005). They are also known as Box-Jenkins models, after the authors who developed the methodology in the 1970s.

According to Werner and Ribeiro (2003), the Box-Jenkins models start from the idea that each value of the time series can be explained by previous values, based on the use of the temporal correlation structure. This can also be called autocorrelation. It is worth highlighting the difference between the autocorrelation in the time series and the correlation between the variables in the econometric model. In the first case, it is the relationship between the values of the series that generates a structure capable of making predictions, so it is essential that this relationship is strong. In the second, the relationship between different explanatory variables and the dependent variable is analyzed. In this case, it is important that the explanatory variables are not correlated with each other for the model to perform well.

The model is composed of three terms – the autoregressive term, the trend term (or integration filter) and the moving average term – represented by the letters p, d and q, respectively. These terms can be combined, generating different models: stationary (ARMA), non-stationary (ARIMA) and seasonal (SARIMA).

Tabachnick and Fidell (2001) present the following definition for the components p, d and q:

  • Autoregressive term (p) – Number of model terms describing the dependence between successive observations.
  • Moving average term (q) – It is the number of terms that describe the persistence of a random shock from one observation to another.
  • Trend term (d) – Number of terms required to transform a non-stationary series into a stationary one.

ARIMA forecasts are performed following the steps in the flowchart in Figure 2. If the suggested model is not adequate, the procedure is repeated.

[Figure 2: Flowchart of the ARIMA forecasting steps]

Stationary models, although not very common, can be just autoregressive (AR) models or moving average (MA) models. The following are functions that describe these models, in addition to the autoregressive moving averages model (ARMA):

Autoregressive Model:

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$

[Equations: moving average (MA) model and autoregressive moving average (ARMA) model]
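
For reference, the moving average and ARMA models are commonly written as shown below in standard textbook notation. This is an assumption about notation, not a reproduction of the original equations: \mu and \phi_0 denote constants, \phi_i the autoregressive coefficients, \theta_j the moving average coefficients and \varepsilon_t the random shock at time t. Some authors, depending on the convention adopted, write the moving average terms with minus signs instead of plus signs.

```latex
% Moving average model of order q, MA(q)
Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}

% Autoregressive moving average model, ARMA(p, q)
Y_t = \phi_0 + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p}
      + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}
```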

If the analyzed series is non-stationary, that is, it shows trend and/or seasonality, it must be transformed into a stationary series by taking successive differences. The first difference of a series Y_t is given by:

$$\Delta Y_t = Y_t - Y_{t-1}$$

An ARIMA model with p = 1 and q = 1 is given by:

[Equation: ARIMA model with p = 1 and q = 1]
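
To show how such a model produces a forecast, the sketch below computes a one-step-ahead forecast from a model with p = 1 and q = 1 applied to the first-differenced series. Several things here are assumptions for illustration: d = 1 is chosen because the text does not state the order of differencing, the moving average term enters with a plus sign, the shock before the first observation is taken as zero, and the coefficients are simply given as inputs rather than estimated, since estimation requires specialized software.

```python
def arima_111_forecast(series, phi, theta, const=0.0):
    """One-step-ahead forecast with given coefficients (no estimation).

    Assumed model on the differenced series w_t = y_t - y_(t-1):
        w_t = const + phi * w_(t-1) + e_t + theta * e_(t-1)
    """
    w = [current - previous for previous, current in zip(series, series[1:])]
    e_prev = 0.0  # assumption: the shock before the first observed difference is zero
    for t in range(1, len(w)):
        w_hat = const + phi * w[t - 1] + theta * e_prev  # one-step prediction of w_t
        e_prev = w[t] - w_hat                            # realized shock at time t
    next_w = const + phi * w[-1] + theta * e_prev
    return series[-1] + next_w  # undo the differencing

# Hypothetical series and hypothetical coefficients.
prices = [10.0, 12.0, 15.0, 19.0]
print(round(arima_111_forecast(prices, phi=0.5, theta=0.4), 2))
```

The recursion makes the roles of the two terms visible: phi carries the last observed change forward, while theta carries forward the part of that change the model failed to predict.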

In the SARIMA models, often used when the series is impacted by the weather or certain annual events, in addition to the non-seasonal components p, d and q, seasonal parameters P, D and Q are added.

The use of the Box-Jenkins methodology requires knowledge and attention from the forecaster. The choice of an adequate model is essential for making a good forecast. This choice is based on some statistical tests, which are not part of the scope of this work. The complexity of the algorithms used to define the model's coefficients makes it essential to use specialized software, such as SPSS.

4. CONCLUSION

Logistics is growing in importance within Brazilian companies and, along with it, so is concern about its costs. It is therefore important to know how to measure its different components, especially transport, which represents, on average, 64% of companies' logistics costs.

However, finding models that fit the data set and the companies' objectives is a delicate task that requires analytical knowledge. The second part of this article studies the remaining two forecasting methods adopted by companies and subject to combination and competition (Neural Networks and Genetic Algorithms) and presents a case study on forecasting freight prices for a company in the agribusiness sector.

5. BIBLIOGRAPHY

CARVALHO, LB; CAIXETA FILHO, JV Behavior of the Sugar Road Freight Market for Export in the State of São Paulo, Economics and Agribusiness Magazine, v. 5, no. 1, 2007.

CASTRO, N. Pricing in Cargo Transport, Research and Economic Planning, v. 33, no. 1, 2003. Available at http://ppe.ipea.gov.br/index.php/ppe/article/viewFile/89/64. Accessed on 25/03/2010.

CHASE JR, WC Composite Forecasting: Combining Forecasts for Improved Accuracy, The Journal of Business Forecasting Methods & Systems. p 2, 20-22, 2000.

FAUSETT, L. Fundamentals of Neural Networks, Upper Saddle River: Prentice Hall, 1994. 461 p.

FLORES, BE; WHITE, EM Combining Forecasts: Why, When and How, The Journal of Business Forecasting Methods & Systems. p 2-5, 1989.

GURNEY, K. An Introduction to Neural Networks. London: CRC Press, 2003. 234 p.

HANKE, JE; WICHERN, DW Business Forecasting. 8. ed. New Jersey: Pearson Prentice Hall, 2005. 535 p.

MAPA – Ministry of Agriculture, Livestock and Supply. Available at http://www.agricultura.gov.br/. Accessed on 30/03/2010.

MOREIRA, LM. Multicollinearity in Regression Analysis. 2003. Available at: http://www.ead.fea.usp.br/Semead/9semead/resultado_semead/trabalhosPDF/455.pdf. Accessed on 10/03/2010.

NEVES, MV Use of Neural Networks for Time Series Prediction Applied to Computational Idleness Detection

PACHECO, MAC Genetic Algorithms: Principles and Applications. 1999. Available at http://www.ica.ele.puc-rio.br/Downloads%5C38/CE-Apostila-Comp-Evol.pdf. Accessed on 15/03/2010.

RANGEL, LA The Use of Econometrics in Empirical Accounting Research, CRCRS Magazine. 2007 Available at: http://www.crcrs.org.br/revistaeletronica/artigos/05_leandro.pdf. Accessed on 04/03/2010.

SANTA ROSA, HN Neural Networks in Time Series Forecasting. 2004. Available at: http://inf.unisul.br/~ines/workcomp/cd/pdfs/2878.pdf. Accessed on 08/03/2010.

SOARES, GL Genetic Algorithms: Studies, New Techniques and Applications. 1997. 137 f. Dissertation (Master in Electrical Engineering) – School of Engineering, Federal University of Minas Gerais, Belo Horizonte. 1997.

TABACHNICK, BG; FIDELL, LS Using Multivariate Statistics. 4. ed. Needham Heights: Allyn & Bacon, 2001. 966 p.

WERNER, L.; RIBEIRO, JLD Demand Forecasting: An Application of Box-Jenkins Models in the Area of Technical Assistance for Personal Computers. Management and Production, v. 10, n. 1, p. 47-67, 2003.

Authors: Peter Wanke and Marina Andries Barbosa

https://ilos.com.br

Doctor of Science in Production Engineering from COPPE/UFRJ and visiting scholar at the Department of Marketing and Logistics at Ohio State University. He holds a Master's degree in Production Engineering from COPPE / UFRJ and a Production Engineer from the School of Engineering at the same university. Adjunct Professor at the COPPEAD Institute of Administration at UFRJ, coordinator of the Center for Studies in Logistics. He works in teaching, research, and consulting activities in the areas of facility location, simulation of logistics and transport systems, demand forecasting and planning, inventory management in supply chains, business unit efficiency analysis, and logistics strategy. He has more than 60 articles published in congresses, magazines and national and international journals, such as the International Journal of Physical Distribution & Logistics Management, International Journal of Operations & Production Management, International Journal of Production Economics, Transportation Research Part E, International Journal of Simulation & Process Modeling, Innovative Marketing and Brazilian Administration Review. He is one of the organizers of the books “Business Logistics – The Brazilian Perspective”, “Sales Forecast - Organizational Processes & Quantitative Methods”, “Logistics and Supply Chain Management: Product and Resource Flow Planning”, “Introduction to Planning of Logistics Networks: Applications in AIMMS” and “Introduction to Infrastructure Planning and Port Operations: Applications of Operational Research”. He is also the author of the book “Inventory Management in the Supply Chain – Decisions and Quantitative Models”.
