Bayesian Structural Time Series Model for Forecasting the Composite Stock Price Index in Indonesia

Abstract. One model that can be used to forecast time series data is the Bayesian Structural Time Series (BSTS) model. The BSTS model is a more modern model and can handle uncertainty in the data better. In the BSTS model, a Markov Chain Monte Carlo (MCMC) sampling algorithm is used to simulate the posterior distribution, which smooths the forecasts over a large number of potential models using Bayesian model averaging. The aim of this research is to obtain the best BSTS model for IHSG data in Indonesia based on the state components and the number of MCMC iterations, and to obtain forecasts of the IHSG value in Indonesia for the next 24 months, namely the period July 2023 to June 2024. The result obtained is that, based on a comparison of the R-square values of the models, the BSTS model with local linear trend and seasonal state components and n = 500 MCMC iterations is the best BSTS model that can


INTRODUCTION
Forecasting methods are currently among the most actively developed methods. One method that is often used for forecasting is time series analysis. Time series analysis is the analysis of a set of data from past time periods, which is useful for understanding or predicting future conditions (Rohmaningsih, et al., 2016). Time series analysis is often applied in economics and finance, especially in the capital market, namely to shares.
In practice, good capital market conditions are needed to attract investors to the capital market. The indicator that investors often use to analyse the capital market before investing is the Composite Stock Price Index (IHSG) (Syahadati, et al., 2021). According to Nurwani (2016), IHSG is the combined value of the company shares listed on the Indonesia Stock Exchange (BEI), whose movements indicate current economic conditions in the capital market. According to Syahadati, et al. (2021), IHSG is one of the economic indicators used to assess the state of the economy in Indonesia, so forecasts are needed to support future decisions.
One model that can be used to forecast time series data is the Bayesian Structural Time Series (BSTS) model. The BSTS model can be used for forecasting, finding related variables (feature selection), inferring causal relationships, and identifying aspects that have an impact at the present moment (nowcasting) (Scott & Varian, 2014). The BSTS model is a stochastic state space model that can investigate trend, seasonal and regression components separately (Feroze, 2020). The BSTS model is a more modern model and can handle uncertainty in the data better. This uncertainty is caused by stochastic or random movements over time, so more accurate forecasting requires a model that can handle it well. In the BSTS model, because analytical calculation of the Bayesian posterior distribution is very difficult, numerical calculation using the Markov Chain Monte Carlo (MCMC) method is used to simulate the posterior distribution, which smooths the forecasts over a large number of potential models using Bayesian model averaging (George & McCulloch, 1997; Hoeting, et al., 1999; Madigan & Raftery, 1994). In this research, the bsts package in R software is used to assist in the calculations.
Research on the BSTS model includes the work of Almarashi & Khan (2020) on time series modeling using BSTS on Flying Cement share price data. The BSTS model was then compared with the classic Autoregressive Integrated Moving Average (ARIMA) model, based on forecasting plots and the Mean Absolute Percent Error (MAPE). The results show that for short-term forecasting both ARIMA and BSTS perform well, but for long-term forecasting the BSTS model with a local level component is the best model to use. In addition, Tang & Halmkrona (2022) compared ARIMA, BSTS, and Generalized Additive Models (GAM) to develop a package delay forecasting model using tracking data. Using the RMSE, MAE, and MASE assessment criteria, their results show that the BSTS model provides better performance than the ARIMA and GAM models.
The aim of this research is to obtain the best BSTS model for IHSG data in Indonesia based on the state components and the number of MCMC iterations, and to obtain forecasting results for IHSG data in Indonesia for the next 24 months, namely the period July 2023 to June 2024. The calculation process is assisted by R software.

Structural Time Series (STS)
According to Almarashi & Khan (2020), in the Structural Time Series (STS) model the data come from an unobserved process known as the state space, and the observed data are generated from the state space with additional noise. The state space components are responsible for generating features of the data such as trend, seasonality, cyclicity, and the effects of independent variables, which are identified separately before being used in the STS model. The general STS model is as follows (Scott & Varian, 2012):

$$y_t = Z_t \alpha_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, H_t),$$
$$\alpha_{t+1} = T_t \alpha_t + R_t \eta_t, \quad \eta_t \sim N(0, Q_t). \tag{1}$$

The matrices $Z_t$, $T_t$, $R_t$, $H_t$, and $Q_t$ are initially assumed to be known, and the errors $\varepsilon_t$ and $\eta_t$ are assumed to be serially independent and independent of each other at all time points. The initial state vector $\alpha_1$ is assumed to be $N(a_1, P_1)$, independently of $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ and $\eta_1, \eta_2, \ldots, \eta_n$, where $a_1$ and $P_1$ are assumed to be known in advance.

Local Level
The local level model is the simplest of the state space models (Almarashi & Khan, 2020). It assumes that the trend is a random walk, so the local level model is defined as follows:

$$y_t = \mu_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2). \tag{2}$$

In the local level model, the structural parameters $Z_t$, $T_t$, $R_t$ have the scalar value 1, $H_t$ takes the form of the constant variance $\sigma_\varepsilon^2$, and $Q_t$ takes the form of the constant variance $\sigma_\eta^2$. The parameters of the model are the error variances $(\sigma_\varepsilon^2, \sigma_\eta^2)$. The prior of this component depends on the parameter $\sigma_\eta^2$.
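To make the two equations of the local level model concrete, the following minimal Python sketch simulates one path from it. It is an illustration only, not the bsts implementation; the function name `simulate_local_level` and all numeric values are assumptions chosen for the example.

```python
import random


def simulate_local_level(n, sigma_eps, sigma_eta, mu1=0.0, seed=0):
    """Simulate n steps of a local level model (hypothetical helper).

    Observation: y_t      = mu_t + eps_t,  eps_t ~ N(0, sigma_eps^2)
    State:       mu_{t+1} = mu_t + eta_t,  eta_t ~ N(0, sigma_eta^2)
    """
    rng = random.Random(seed)
    mu = mu1
    levels, observations = [], []
    for _ in range(n):
        levels.append(mu)
        # Observation equation: the level plus observation noise.
        observations.append(mu + rng.gauss(0.0, sigma_eps))
        # State equation: the level takes one random-walk step.
        mu = mu + rng.gauss(0.0, sigma_eta)
    return levels, observations


# Illustrative values only (not estimates from the IHSG data).
levels, y = simulate_local_level(n=120, sigma_eps=1.0, sigma_eta=0.5, mu1=100.0, seed=42)
print(len(y), levels[0])
```

Setting `sigma_eta` to 0 freezes the level at its starting value, while setting `sigma_eps` to 0 makes the observations coincide with the random-walk level.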

Local Linear Trends
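The local linear trend model assumes that both the mean and the slope of the trend follow a random walk (Durbin & Koopman, 2012). The local linear trend model equations are as follows (Brodersen, et al., 2015):

$$y_t = \mu_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2), \tag{3}$$
$$\delta_{t+1} = \delta_t + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2),$$

with:
$\mu_t$ = trend value at time $t$
$\delta_t$ = expected increase in $\mu$ between times $t$ and $t + 1$.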
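The local linear trend recursion can be sketched in a few lines of Python. This is a hedged illustration rather than the code used in the study; `simulate_local_linear_trend` and its arguments are hypothetical names, and the noise scales are arbitrary.

```python
import random


def simulate_local_linear_trend(n, sigma_eps, sigma_eta, sigma_zeta,
                                mu1=0.0, delta1=0.0, seed=0):
    """Simulate a local linear trend model (hypothetical helper).

    y_t         = mu_t + eps_t
    mu_{t+1}    = mu_t + delta_t + eta_t   (level moves by the current slope)
    delta_{t+1} = delta_t + zeta_t         (slope is itself a random walk)
    """
    rng = random.Random(seed)
    mu, delta = mu1, delta1
    levels, slopes, observations = [], [], []
    for _ in range(n):
        levels.append(mu)
        slopes.append(delta)
        observations.append(mu + rng.gauss(0.0, sigma_eps))
        # Update the level with the slope of the current period first,
        # then let the slope drift.
        mu = mu + delta + rng.gauss(0.0, sigma_eta)
        delta = delta + rng.gauss(0.0, sigma_zeta)
    return levels, slopes, observations


# With all noise switched off the level is an exact straight line.
line, _, _ = simulate_local_linear_trend(4, 0.0, 0.0, 0.0, mu1=1.0, delta1=2.0)
print(line)
```

Because the slope has no anchor, its forecast uncertainty keeps growing with the horizon, which is the behaviour the semi-local variant is designed to temper.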

Semi Local Linear Trends
The semi-local linear trend model is a generalization of the local linear trend model that is more useful for long-term forecasting (Almarashi & Khan, 2020). This model assumes that the level or mean component moves according to a random walk, while the slope component follows an AR(1) process centred on a potentially non-zero value $D$. The observation equation containing a semi-local linear trend component is as follows:

$$y_t = \mu_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2), \tag{4}$$
$$\delta_{t+1} = D + \rho(\delta_t - D) + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2),$$

with:
$\mu_t$ = trend value at time $t$
$\delta_t$ = expected increase in $\mu$ between times $t$ and $t + 1$
$|\rho| < 1$ = learning rate at which the local trend is updated.
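The difference from the local linear trend lies only in the slope equation, which the following Python sketch makes visible. It is an illustrative simulation under assumed parameter values; `simulate_semilocal_linear_trend`, `long_run_slope` (standing for D), and `rho` are names introduced here for the example.

```python
import random


def simulate_semilocal_linear_trend(n, sigma_eps, sigma_eta, sigma_zeta,
                                    long_run_slope, rho,
                                    mu1=0.0, delta1=0.0, seed=0):
    """Simulate a semi-local linear trend model (hypothetical helper).

    The level is a random walk with drift delta_t; the slope follows an
    AR(1) process that is pulled back towards long_run_slope (D).
    """
    if abs(rho) >= 1:
        raise ValueError("rho must satisfy |rho| < 1")
    rng = random.Random(seed)
    mu, delta = mu1, delta1
    levels, slopes, observations = [], [], []
    for _ in range(n):
        levels.append(mu)
        slopes.append(delta)
        observations.append(mu + rng.gauss(0.0, sigma_eps))
        mu = mu + delta + rng.gauss(0.0, sigma_eta)
        # AR(1) slope: only the fraction rho of the deviation from D survives.
        delta = long_run_slope + rho * (delta - long_run_slope) + rng.gauss(0.0, sigma_zeta)
    return levels, slopes, observations


# Without noise the slope decays geometrically from 3.0 towards D = 1.0.
_, decay, _ = simulate_semilocal_linear_trend(4, 0.0, 0.0, 0.0,
                                              long_run_slope=1.0, rho=0.5, delta1=3.0)
print(decay)
```

Since the slope is mean-reverting, long-horizon forecasts settle towards a trend with slope D instead of extrapolating the latest slope indefinitely.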

Seasonal
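The seasonal model can be regarded as a regression on $S$ seasonal dummy variables whose coefficients are constrained so that their sum over 1 full cycle of $S$ seasons has expected value 0 (Scott & Varian, 2013). The STS model observation equation containing a seasonal component is as follows:

$$y_t = \mu_t + \tau_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),$$
$$\tau_{t+1} = -\sum_{s=0}^{S-2} \tau_{t-s} + \omega_t, \quad \omega_t \sim N(0, \sigma_\omega^2), \tag{5}$$

with:
$S$ = number of seasons
$\tau_t$ = seasonal coefficient, the joint contribution of the seasons to the response variable $y_t$.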
The number of seasons $S$ can be changed depending on the seasonality of the data. For example, $S = 7$ is used for daily data, $S = 4$ for quarterly data, and $S = 12$ for monthly data. For an annual cycle in weekly data, $S = 52$ is used, because there are 52 weeks in 1 year (Durbin & Koopman, 2012).
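The zero-sum constraint on the seasonal effects is easiest to see in a small simulation. The Python sketch below assumes the standard dummy-variable seasonal recursion, in which each new effect is minus the sum of the previous S - 1 effects plus noise; the helper `simulate_seasonal` and the starting values are hypothetical.

```python
import random


def simulate_seasonal(n, n_seasons, sigma_omega, initial, seed=0):
    """Simulate seasonal effects with a sum-to-zero constraint (hypothetical helper).

    Each new effect equals minus the sum of the previous n_seasons - 1
    effects plus N(0, sigma_omega^2) noise, so any n_seasons consecutive
    effects sum to zero in expectation.
    """
    if n_seasons < 2:
        raise ValueError("n_seasons must be at least 2")
    if len(initial) != n_seasons - 1:
        raise ValueError("initial must hold n_seasons - 1 starting effects")
    rng = random.Random(seed)
    effects = list(initial)
    while len(effects) < n:
        previous = effects[-(n_seasons - 1):]
        effects.append(-sum(previous) + rng.gauss(0.0, sigma_omega))
    return effects[:n]


# Quarterly example (S = 4) without noise: the pattern repeats exactly.
pattern = simulate_seasonal(8, n_seasons=4, sigma_omega=0.0, initial=[1.0, 2.0, -0.5])
print(pattern)

# Monthly example (S = 12) with a small amount of noise.
monthly = simulate_seasonal(36, n_seasons=12, sigma_omega=0.1, initial=[0.5] * 11, seed=7)
```

With `sigma_omega` set to 0 the seasonal pattern is fixed; a positive value lets the pattern evolve slowly over time.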

Bayesian Structural Time Series (BSTS)
The BSTS model can be used for forecasting, finding related variables (feature selection), inferring causal relationships, and identifying aspects that have an impact at the present moment (nowcasting) (Scott & Varian, 2014). The BSTS model is an STS model estimated with a Bayesian approach, where the model is represented in state space form (Almarashi & Khan, 2020). The steps for applying the Bayesian approach to the STS model are:
1. Determine the prior distribution for each parameter in the model.
2. Obtain the posterior distribution. Because the analytical calculation, or integral solution, of the Bayesian posterior distribution is very difficult, numerical calculations are carried out using MCMC simulation methods such as Gibbs sampling, that is, by drawing samples from the posterior distribution so that parameter estimates of the BSTS model can be obtained (George & McCulloch, 1997).
The computation is done using the bsts package in R software.
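The two steps above (choose priors, then sample from the posterior) can be illustrated with a deliberately small Gibbs sampler. The sketch below is not the sampler used by the bsts package; it assumes a much simpler model, independent normal observations with unknown mean and variance, a normal prior on the mean and an inverse-gamma prior on the variance. The function name, prior settings, and simulated data are all assumptions made for the example.

```python
import random


def gibbs_normal(y, n_iter=2000, m0=0.0, s0_sq=1e6, a0=0.01, b0=0.01, seed=0):
    """Toy Gibbs sampler for y_i ~ N(mean, var) (hypothetical helper).

    Priors (assumed): mean ~ N(m0, s0_sq), var ~ Inverse-Gamma(a0, b0).
    Each iteration draws the mean given the variance, then the variance
    given the mean, from their full conditional distributions.
    """
    rng = random.Random(seed)
    n = len(y)
    total = sum(y)
    mean, var = total / n, 1.0
    draws = []
    for _ in range(n_iter):
        # Full conditional of the mean: normal, precision-weighted.
        precision = 1.0 / s0_sq + n / var
        cond_mean = (m0 / s0_sq + total / var) / precision
        mean = rng.gauss(cond_mean, (1.0 / precision) ** 0.5)
        # Full conditional of the variance: inverse-gamma, drawn as the
        # reciprocal of a gamma variate (gammavariate takes a scale, so 1/rate).
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sum((value - mean) ** 2 for value in y)
        var = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        draws.append((mean, var))
    return draws


# Simulated data with known mean 5 and standard deviation 2 (illustration only).
data_rng = random.Random(123)
data = [data_rng.gauss(5.0, 2.0) for _ in range(200)]
draws = gibbs_normal(data, n_iter=2000, seed=1)
kept = draws[500:]  # discard the first draws as burn-in
post_mean = sum(d[0] for d in kept) / len(kept)
post_var = sum(d[1] for d in kept) / len(kept)
print(round(post_mean, 2), round(post_var, 2))
```

In a state space model the sampler has more to do, because the unobserved states must be drawn as well as the variances, but the alternating structure of conditional draws is the same idea.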

RESULTS AND DISCUSSION
In this research, IHSG data in Indonesia for the period January 1995 to June 2023 were used. The IHSG data used in the research are presented in Figure 1 as follows.
Based on Figure 1 above, it can be seen that the IHSG data in Indonesia are not stationary, because the data fluctuate significantly and show upward and downward trends over time. There are 3 extreme points with the most significant declines in the IHSG value, namely in 2008, 2013 and 2020. This is because Indonesia was experiencing economic crises in those years, caused by recession or economic downturn due to economic activity, drastic technological developments, and the COVID-19 pandemic that occurred in 2020.

Time Series Data Decomposition
Time series data can be decomposed into 4 main components, namely trend, seasonal, cyclical, and irregular or random components. The decomposition of IHSG data in Indonesia is presented in Figure 2 as follows.
Based on Figure 2, it can be seen that the observed IHSG data in Indonesia can be broken down into trend, seasonal and random component patterns. The trend component rises and falls erratically. The seasonal component is additive because its amplitude stays constant over time. The random component is of similar size for most of the period, but at certain times the random values increase significantly.
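The text does not state which routine produced the decomposition, so the following Python sketch only illustrates one common choice, a classical additive decomposition based on a centred moving average. The function `decompose_additive` and the toy series are assumptions for the example, not the study's data or code.

```python
def decompose_additive(series, period):
    """Classical additive decomposition (hypothetical helper).

    trend:     centred moving average over one full period
    seasonal:  average detrended value per position in the cycle,
               shifted so that the effects sum to zero
    remainder: series - trend - seasonal
    The trend and remainder are None at the edges, where the moving
    average window does not fit.
    """
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on the two end points of the window.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    sums, counts = [0.0] * period, [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += series[t] - trend[t]
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / period
    cycle = [m - centre for m in means]
    seasonal = [cycle[t % period] for t in range(n)]
    remainder = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                 for t in range(n)]
    return trend, seasonal, remainder


# Toy quarterly series: straight-line trend plus a fixed seasonal cycle.
true_cycle = [1.0, -2.0, 0.5, 0.5]
toy = [10.0 + 0.5 * t + true_cycle[t % 4] for t in range(24)]
trend, seasonal, remainder = decompose_additive(toy, period=4)
print([round(v, 6) for v in seasonal[:4]])
```

For monthly data such as the IHSG series the same function would be called with `period=12`; a seasonal component of constant amplitude, as described above, is what this additive form assumes.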

BSTS Model
Based on the results of the time series decomposition of the IHSG data above, the data have trend and seasonal patterns, so candidate BSTS models are formed that contain either 1 or 2 state components.

Forecasting Results
Based on the best BSTS model obtained, the forecast values of IHSG data in Indonesia for the next 24 months, namely the period July 2023 to June 2024, are presented in Figure 7 as follows.
In Figure 5 above, the dark blue line is the forecast of the IHSG value in Indonesia for the next 24 months. It can be seen that the forecast IHSG values from July 2023 to June 2024 are quite volatile. The green dotted lines show the range of the IHSG forecast values: after June 2023, the IHSG value will fluctuate within the interval bounded by the green lines. This means that the IHSG value may increase or decrease in the following months. For greater clarity, a plot of the forecast values for the next 24 months is presented in Figure 8 as follows.

Figure 6. JCI Data Forecasting Results

Based on Figure 8 above, the forecast values of IHSG data for the next 24 months range from 6589 to 6760, with an average value of 6676. The lowest forecast value occurs in the 16th month, namely October 2023, and the highest forecast value occurs in the 9th month, namely March 2023. For more details, the IHSG forecast values in Indonesia for the next 24 months, along with the intervals of the forecast values, are presented in Table 3 as follows.

Figure 3. Decomposition of IHSG Data.

Based on Figure 3, the blue dots show the variable data used for modeling. The state components for the best BSTS model are as follows.

Figure 4. Components of the BSTS State Model.

The candidate BSTS models contain either 1 state component (local level, local linear trend, or semi-local linear trend) or 2 state components (local level and seasonal, local linear trend and seasonal, or semi-local linear trend and seasonal). The number of seasons used is S = 12, because the data are monthly with an annual cycle. In addition, 3 numbers of MCMC iterations are used, namely n = 200, 500, and 1000. The performance of each BSTS model is measured by its R-square value, so that the best model for forecasting IHSG values in Indonesia can be obtained. The comparison of R-square values for each model is presented in Table 1 as follows.

Table 1. Comparison of R-square values

Based on Table 1, the R-square values of the models differ very little, ranging from 99.92% to 99.96%. The model with the largest R-square value, namely 99.96%, is the BSTS model consisting of local linear trend and seasonal state components, with S = 12 seasons and n = 500 MCMC iterations. Based on the R-square value, the model can be considered good for forecasting IHSG values in Indonesia. The posterior distribution of the best BSTS model is as follows.
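For reference, the R-square criterion used to rank the models can be sketched as follows. This assumes the usual definition, 1 minus the ratio of the residual sum of squares to the total sum of squares; the exact computation inside the software used in the study may differ in detail, and the numbers in the example are made up.

```python
def r_squared(observed, fitted):
    """R-square = 1 - SS_residual / SS_total (assumed standard definition)."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up numbers for illustration, not IHSG values.
observed = [100.0, 104.0, 103.0, 108.0, 112.0]
fitted = [101.0, 103.5, 103.5, 107.0, 112.5]
print(round(r_squared(observed, fitted), 4))
```

A value close to 1 means the fitted values track the observations closely relative to the overall variation of the series, which is why differences between candidate models on a strongly trending series can be as small as a few hundredths of a percentage point.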

Table 3. Summary of Forecasting Results
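A summary table of forecasts with intervals is typically built from the simulated forecast paths, one path per retained MCMC draw. The Python sketch below shows one way to reduce such draws to a mean and a 95% interval per forecast month. It is an assumption about the general procedure rather than the study's code; the helper names and the toy draws are hypothetical and are not IHSG forecasts.

```python
def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarise_forecast(draws, lower=0.025, upper=0.975):
    """Summarise forecast draws (rows = draws, columns = horizon steps).

    Returns one (mean, lower bound, upper bound) tuple per horizon step.
    """
    horizon = len(draws[0])
    summary = []
    for h in range(horizon):
        column = sorted(row[h] for row in draws)
        mean = sum(column) / len(column)
        summary.append((mean, quantile(column, lower), quantile(column, upper)))
    return summary


# Toy draws for a 2-step horizon (illustration only).
toy_draws = [[float(i), 2.0 * i] for i in range(101)]
for step, (mean, low, high) in enumerate(summarise_forecast(toy_draws), start=1):
    print(step, round(mean, 2), round(low, 2), round(high, 2))
```

The interval bounds widen or narrow with the spread of the draws at each step, which is the information conveyed by the interval columns of a forecast table.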