Forecasting China's Inflation in a Data-Rich Environment

Ching-Yi Lin
Department of Economics, National Tsing Hua University

Chun Wang
Department of Economics, Brooklyn College, CUNY

Abstract

Inflation has been one of the most critical issues facing China in recent years. To improve inflation forecasts for China, this study investigates the predictive ability of three dimension reduction techniques used in a data-rich environment: Principal Components Analysis (PCA), Sliced Inverse Regression (SIR), and Partial Least Squares (PLS), applied in the Factor-augmented Autoregression (FAAR) model proposed by Stock and Watson (2005). A variety of macroeconomic data from China between January 1998 and December 2009 is used to construct factors with the three techniques. The empirical study finds that no single method is best for out-of-sample forecasting in all circumstances. The performance of the different dimension reduction methods depends on the forecasting horizon, the number of factors chosen, and the number of slices for SIR.

Keywords: Inflation Forecasting; Principal Components Analysis; Sliced Inverse Regression; Partial Least Squares; Factor-augmented Autoregression Model; China.

JEL classification: C53, E31

Correspondence to:
Ching-Yi Lin, Department of Economics, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan 30013, lincy@mx.nthu.edu.tw
Chun Wang, Department of Economics, Brooklyn College, CUNY, 2100 Bedford Ave, Brooklyn, NY 11210, cwang@brooklyn.cuny.edu

1. Introduction

The inflation rate in China had been surprisingly moderate over the past decades. Nevertheless, China is now on its way to a very high inflation rate. The government's full-year target was 3 percent in 2010; however, the inflation rate in China was last reported at 5.1 percent year-on-year in November 2010, the largest increase since June 2008. This follows a sharp increase over the October 2010 rate of 4.4 percent. To fight this sharp rise in inflation, the People's Bank of China announced in October 2010 the first interest rate increase in nearly three years. In December 2010 the central bank raised the benchmark interest rate by a quarter of a percentage point; this move increased the one-year lending rate to 5.81 percent and the one-year deposit rate to 2.75 percent. The government also announced price control guidelines and has taken a number of measures to rein in its monetary policy and boost the supply of key goods.

This paper is concerned with forecasting inflation in China; to assess inflation prospects, it is important to produce accurate inflation forecasts. The main contribution of this paper is to improve inflation forecasts for China in a data-rich environment. Dimension reduction techniques have been applied in a number of recent papers using large datasets from other countries. These studies include Stock and Watson (1999, 2002, 2005), Artis, Banerjee, and Marcellino (2001), Bernanke and Boivin (2003), and Lin and Tsay (2005). Mehrotra and Sánchez-Fung (2008) is one study that focuses on data from mainland China. They employed 15 alternative forecasting models and found that those considering many predictors through a principal component provided better relative forecasting performance than the univariate benchmark.

This paper extends this research to examine the predictive ability of three dimension reduction techniques, Principal Components Analysis (PCA), Partial Least Squares (PLS) and Sliced Inverse Regression (SIR), as used in the Factor-augmented Autoregression (FAAR) model. In principle, PCA consists of choosing a small number of linear combinations of the data that summarize a large portion of the original variation in the data. It forecasts a single time series of interest using these factors, estimated from a large panel of data; thus, information from a large number of variables can be used in forecasting while keeping the dimension of the model under control. However, PCA factors are constructed independently of the dependent variable Y, the variable that we want to forecast. In contrast, the SIR and PLS techniques consider information from both the dependent variable Y and all other available variables X. SIR uses the Y values to create slices and then applies PCA to the slice means of X. PLS uses the correlation information between Y and X to project the regressors onto a few factors.

Intuitively, SIR and PLS forecasts should outperform PCA forecasts, since the first two reduce the dimension of X given a particular Y rather than independently of Y. In addition, a prominent advantage of SIR is that it allows Y to depend on the factors in a non-linear fashion.

Our data set includes 38 major monthly macroeconomic time series from January 1998 to December 2009 in mainland China. In particular, 37 macroeconomic indicators are used to construct the factors for the FAAR model, which is estimated and then used to forecast the change in the annual inflation rate. During the sample period the average annual inflation rate in China was 1.35 percent; the highest was 8.34 percent in February 2008 and the record low was -2.22 percent in April 1999. Figure 1 [1] plots the annual inflation rate from January 1998 to December 2009. In terms of the one-step-ahead three-month out-of-sample forecasting, the SIR technique used with only one factor performs the best. For both the six-month and one-year out-of-sample forecasts, PCA outperforms the others, and its predictive ability improves as the number of factors increases.

[1] Figure 1 is consistent with the data reported in the Trading Economics database. http://www.tradingeconomics.com/economics/inflation-cpi.aspx?symbol=cny

Forecasting inflation is a fundamental and extremely important task associated with decision-making. The issue of inflation in China is highly important not only because of its effect on domestic monetary policy and on consumption and investment decisions, but also because of its spillover effect on the global economy. China's rapid economic growth over the past few decades has improved the standard of living of its residents and has strengthened the country's participation in the global economy. A high inflation rate in China is a threat to highly trade-dependent countries such as Korea, whose trade share with China is rising and stood at over 20 percent of GDP in 2010. As prices increase in China, Korean firms that import raw materials and intermediate goods from China, and Korean consumers who purchase Chinese commodities and foods, will suffer from higher costs. Choongsoo Kim, the Bank of Korea governor, stated at the end of 2010 that the country must closely watch rising inflation risk from China, as it could put upward pressure on local consumer prices.

The literature on forecasting inflation within China is quite limited. Burdekin and Siklos (2008) use a standard monetary approach to capture inflation. Chen, Tang and Li (2009) indicate that the predictive ability of the money supply for inflation is actually quite low. Mehrotra and Sánchez-Fung (2008) compare the performance of 15 models for forecasting inflation within China and find that only VAR, Phillips curve, and time series models considering many predictors through a principal component outperform the univariate benchmark; our study is motivated by this finding.

Since market-oriented reforms were instituted in 1978, this opening-up has induced rapid but volatile economic development within China. It is difficult to forecast inflation [2] in China using a univariate model. Intuitively, the more information forecasters have at hand, the more accurate the forecasts will be. Nevertheless, forecasts can still become less efficient when too many predictors are used, and in a mean-square-error metric it is often desirable to reduce the information set in order to avoid the curse of dimensionality. Therefore, choosing a methodology for reducing the dimensionality of the data is key to improving forecast performance.

[2] Stock, J. H. and Watson, M. W. (2007), Why has US inflation become harder to forecast?

This paper is organized as follows: Section 2 discusses the forecasting methodologies and briefly introduces the three dimension reduction techniques. The data set and the empirical results for PCA, SIR and PLS forecasts relative to the benchmark autoregression model are reported in Section 3, and Section 4 concludes.

2. Methodologies of Forecasting

A natural starting point for a forecasting model is to use past values of Y (that is, $Y_{t-1}, Y_{t-2}, \ldots$) to forecast $Y_t$; this approach is called the autoregressive (AR) model [3]. The number of AR lags can be determined by sequential downward t- or F-tests or by an information criterion such as AIC or BIC [4]. It is straightforward to extend AR models to an autoregressive moving average (ARMA) model, which also considers the autocorrelation in the error term. When additional predictors are available, autoregressive distributed lag (ADL) [5] models can also be applied. We do not apply an ADL model to forecast inflation in this paper because data on the unemployment rate within China are not available.

[3] According to Chapter 14 of Stock and Watson's Introduction to Econometrics, 2nd edition, forecasting models built on regression methods need not (and typically do not) have a causal interpretation; moreover, omitted variable bias is irrelevant for forecasting. The key requirement for the external validity of a time series regression is stationarity.
[4] AIC: Akaike information criterion; BIC: Bayes information criterion.
[5] An example application of the ADL model is to study whether lagged inflation rates and unemployment rates help to forecast inflation, based on the Phillips curve.

The factor forecasting model proposed by Stock and Watson (2005) is one of the recently developed forecasting models for data-rich environments. First, factors are constructed from many predictors [6] using the following expression:

(1)  $X_t = \Lambda F_t + e_t$

where $X_t$ contains N predictors, $F_t$ is the vector of K estimated factors, $\Lambda$ is the factor loading matrix and $e_t$ is the error term. The factors and the lags of the dependent variable are then used in estimation as follows:

(2)  $Y^{h}_{t+h} = \alpha + \beta' F_t + \gamma(L) Y_t + u_{t+h}$

Equation (2) is named the factor-augmented autoregression (FAAR); forecasts are constructed as follows:

(3)  $\hat{Y}^{FAAR,h}_{T+h|T} = \hat{\alpha} + \hat{\beta}' \hat{F}_T + \hat{\gamma}(L) Y_T$

[6] In practice, if the number of predictors is greater than the sample size, the simple regression model fails.

PCA has been used extensively in the literature to construct the factors in the first step. In addition, we consider two further dimension reduction methods: PLS and SIR. Two key issues are explored in this study. The first is whether the FAAR model performs better than the benchmark AR and ARMA models in in-sample prediction and out-of-sample forecasting, in terms of smaller relative RMSEs and RMSFEs. The other is to compare the performance of FAAR models that use different dimension reduction techniques to construct the factors. The three dimension reduction techniques, PCA, SIR and PLS, are introduced briefly in this section [7].

[7] More details about these three dimension reduction techniques are available upon request.

PCA factors are linear combinations of random variables which have special properties in terms of variance. The technique assumes that the predictor matrix X is $T \times N$, containing T observations of N predictors. The factor loading matrix corresponds to the eigenvectors associated with the eigenvalues, in descending order, of $(X - \bar{X})'(X - \bar{X})$.
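To make the two-step FAAR procedure concrete, the following minimal sketch extracts PCA factors from the eigenvectors of $(X - \bar{X})'(X - \bar{X})$ and then regresses the h-step-ahead target on the factors and one lag of Y, in the spirit of equation (2). It is an illustration under stated assumptions rather than the authors' implementation: the data are simulated, the helper names `pca_factors` and `ols_rmse` are hypothetical, a single lag stands in for the lag polynomial, and the predictors are only centered (standardizing each column first is a common alternative).

```python
import numpy as np

def pca_factors(X, K):
    """Return the first K principal-component factors of the T x N panel X."""
    Xc = X - X.mean(axis=0)                      # centre each predictor
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc)   # eigenvalues come back in ascending order
    order = np.argsort(eigval)[::-1][:K]         # positions of the K largest eigenvalues
    loadings = eigvec[:, order]                  # N x K loading matrix
    return Xc @ loadings                         # T x K estimated factors

def ols_rmse(Z, y):
    """OLS of y on [1, Z]; return the coefficients and the in-sample RMSE."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    resid = y - Z1 @ coef
    return coef, np.sqrt(np.mean(resid ** 2))

# Simulated data (illustrative only): one common factor drives X and next-period y.
rng = np.random.default_rng(0)
T, N, K, h = 144, 37, 5, 1
f = rng.standard_normal(T)
X = np.outer(f, rng.standard_normal(N)) + rng.standard_normal((T, N))
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * f[t - 1] + 0.5 * rng.standard_normal()

F = pca_factors(X, K)
target = y[h:]            # Y_{t+h}
lag_y = y[:-h]            # Y_t
F_t = F[:-h]              # F_t

_, rmse_ar = ols_rmse(lag_y[:, None], target)                     # AR(1) benchmark
_, rmse_faar = ols_rmse(np.column_stack([F_t, lag_y]), target)    # FAAR with K factors
relative_rmse = rmse_faar / rmse_ar
print(f"relative in-sample RMSE (FAAR / AR): {relative_rmse:.3f}")
```

Because this FAAR regression nests the AR(1) benchmark, its in-sample relative RMSE cannot exceed one; out-of-sample comparisons carry no such guarantee.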

The number of factors K can be chosen with an information criterion [8] or a rule of thumb [9].

[8] For example, Bai and Ng (2002) propose criteria for selecting the number of factors in large dimensional panels when both N and T go to infinity.
[9] The rule of thumb is to choose the number of factors with eigenvalues greater than 1.

Li (1991) proposed an alternative method, Sliced Inverse Regression, to estimate the factor loading matrix. The SIR algorithm for estimating the effective dimension-reduction directions is as follows:

(a) Standardize X by an affine transformation to get $\tilde{X} = \hat{\Sigma}_{XX}^{-1/2}(X - \bar{X})$, where $\hat{\Sigma}_{XX}$ is the sample covariance matrix and $\bar{X}$ is the sample mean of X.
(b) Sort the matrix $\tilde{X}$ according to the values of Y.
(c) Partition the sorted $\tilde{X}$ matrix into H [10] slices and then calculate the sample mean of $\tilde{X}$ in each slice: $\bar{X}_h = \frac{1}{T_h} \sum_{(i) \in \text{slice } h} \tilde{X}_{(i)}$, where $h = 1, \ldots, H$.
(d) Compute the covariance matrix of the slice means of $\tilde{X}$, weighted by the slice sizes: $\hat{\Sigma}_{\eta} = \frac{1}{T} \sum_{h=1}^{H} T_h (\bar{X}_h - \bar{X})(\bar{X}_h - \bar{X})'$.
(e) Compute the sample covariance of the $X_t$'s: $\hat{\Sigma}_{X} = \frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})(X_t - \bar{X})'$.
(f) Find the SIR directions by conducting the eigenvalue decomposition of $\hat{\Sigma}_{\eta}$ with respect to $\hat{\Sigma}_{X}$: $\hat{\Sigma}_{\eta} \hat{\beta}_i = \hat{\lambda}_i \hat{\Sigma}_{X} \hat{\beta}_i$, with $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_N$.

[10] According to Li (2000), the data set should be divided into H slices as equally as possible.

The PLS technique was first developed by Wold (1966, 1985) and has been widely used in chemistry. Generally speaking, OLS estimates are calculated by considering the covariance of X as well as the correlation between Y and X. Unlike PCA factors, which depend only on the covariance of X while ignoring the correlation between Y and X, PLS factors capture only the variation between Y and X while ignoring the covariance of X. An algorithm to calculate the PLS estimates $\hat{\beta}_{PLS}$ is as follows:

Step 1: The first PLS factor is defined as $f_1^{PLS} = X w_1$, where $w_1 = c X'Y$ is the weight, which uses the information from both X and Y. The constant c is chosen to be $1/\sqrt{Y'XX'Y}$ in order to normalize the length of $w_1$ to unity.

Step 2: To locate the second PLS factor $f_2^{PLS}$, regress Y on $f_1^{PLS}$ to get the residual $u_1$ (i.e. $Y = f_1^{PLS} q_1 + u_1$), and regress X on $f_1^{PLS}$ to get the residual $e_1$ (i.e. $X = f_1^{PLS} p_1' + e_1$), both by OLS. The residuals $u_1$ and $e_1$ contain the information that is unexplained by the first PLS factor. Setting $\hat{Y} = \hat{u}_1$, $\hat{X} = \hat{e}_1$ and repeating Step 1, we obtain the second PLS factor $f_2^{PLS} = \hat{X} w_2$.

Step 3: Apply Step 2 and Step 1 iteratively to the new $\hat{Y} = \hat{u}_i$, $\hat{X} = \hat{e}_i$ for $i = 2, \ldots, K$, so that the third and subsequent factors are found.

Step 4: Define the matrices $W = (w_1, w_2, \ldots, w_K)$ and $P = (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_K)$, and the $K \times 1$ vector $q = (\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_K)'$. Then calculate the PLS estimates for the model $Y = X\beta + \varepsilon$ as $\hat{\beta}_{PLS} = W (P'W)^{-1} q$.

Unlike OLS estimates, PLS estimates are biased because they focus only on the correlation between Y and X while ignoring the correlation among the columns of X; however, the amount of bias decreases as the number of factors increases. When the variance-covariance matrix of X is an identity matrix and the number of PLS factors is chosen to be the number of variables, PLS is equivalent to OLS. The next section employs the factors constructed with the above three techniques to forecast inflation in China.

3. Empirical Results

3.1 Data

This study uses a time-series dataset which covers the monthly inflation rate and 37 macroeconomic indicators between January 1998 and December 2009. Since the consumer price index (CPI) data obtained from the National Bureau of Statistics of China are reported on a year-on-year basis, the annual inflation rate in each period is defined as $\pi_t = \log(CPI_t) - \log(CPI_{t-12})$.
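The definition above, together with the monthly change in annual inflation that serves as the forecast target in this paper, can be illustrated with a short sketch. The index values are hypothetical (a series growing by 0.2 percent per month), and the function names `annual_inflation` and `first_difference` are illustrative assumptions rather than part of the study.

```python
import math

def annual_inflation(cpi):
    """pi_t = log(CPI_t) - log(CPI_{t-12}) for a monthly index series."""
    return [math.log(cpi[t]) - math.log(cpi[t - 12]) for t in range(12, len(cpi))]

def first_difference(series):
    """Month-to-month change of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical monthly index: three years of steady 0.2 percent monthly growth.
cpi = [100.0 * 1.002 ** t for t in range(36)]

pi = annual_inflation(cpi)       # annual (year-on-year) log inflation
d_pi = first_difference(pi)      # change in annual inflation, the forecast target

print(len(pi), round(pi[0], 4))  # 24 observations remain after losing the first 12 months
```

With constant monthly growth the annual rate is constant, so its change is zero in every month; real data would of course produce a non-trivial series.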

The 37 monthly macroeconomic variables used in the data-rich set include measures of financial policy, real activity, stock prices, exchange rates, monetary policy, commodity and producer prices, trade volume, and the oil price. All of these are available from one of three databases: CEIC, the National Bureau of Statistics of China (NBSC), and the International Financial Statistics (IFS). Table 1 lists the source used for each variable in this study. To obtain a stationary series, each macroeconomic indicator is transformed, whenever possible, into the difference of its logarithm, which corresponds to the growth rate of the variable. All transformed variables are verified to be stationary.

3.2 In-sample prediction

We begin by exploring the performance of in-sample predictions of the change in the annual inflation rate within China, using the whole sample period from February 1998 to December 2009. Figure 2 plots the change in the annual inflation rate, which is confirmed to be stationary by an Augmented Dickey-Fuller (ADF) test. Table 2 reports the coefficients on the first lag of the change in the annual inflation rate, the adjusted R-square statistic, the root mean square errors (RMSEs), and the relative RMSEs for the benchmark AR model, the candidate ARMA model, and FAAR models using 5 PCA factors, 11 PCA and PLS factors [11], and 1 SIR factor [12] with 5, 10 and 20 slices.

[11] The rule of thumb for selecting the number of PCA factors is to choose the number of factors with eigenvalues greater than 1. Thus, 11 PCA factors are chosen, which explain 77 percent of the variation in the 37 predictors. 5 PCA factors are also considered, which explain 54 percent of the variation in the 37 predictors. Cross-validation criteria suggest over 20 PLS factors. For comparison purposes, we choose 11 PLS factors.
[12] A chi-square test suggests 1 SIR factor.

Figure 3 displays the actual and the in-sample predicted change in the annual inflation rate for the selected models; the predictions of the FAAR model with one SIR factor follow the true change in the annual inflation rate most closely. In general, the FAAR model using 1 SIR factor significantly outperforms the other models in terms of a larger adjusted R-square and smaller relative RMSEs. This finding confirms the advantage of SIR, which uses information in the predicted variable to form the factors. Nevertheless, the performance of FAAR models using PCA and PLS depends heavily on the number of factors chosen. The estimates show that the relative RMSE is 0.952 for a FAAR model with 5 PCA factors but 0.735 for a FAAR model with 11 PCA factors. When 11 factors are chosen, FAAR models using PCA and PLS yield similar relative RMSEs (0.735 vs. 0.741) [13].

[13] Finally, we also observed that the coefficients on the first lag of the change in the annual inflation rate are significant at the 5% significance level for the AR(1) model and at the 1% significance level for the ARMA(1,1) model. However, when factors enter the model, giving the so-called FAAR model, the resulting coefficients on the first lag of the dependent variable become insignificant. Nevertheless, we still report these in Table 2 for consistency.

3.3 Out-of-sample Forecasting

To study the out-of-sample forecasting performance of the benchmark AR model and of FAAR models [14] using PCA, PLS and SIR in the short run and the medium run, we created direct one-step-ahead 3-month, 6-month and 12-month out-of-sample forecasts with different numbers of PCA and PLS factors (1, 5 and 10 [15]) and with 1 SIR factor under different slice choices (5, 10 and 20 slices). We report the root mean square forecasting errors (RMSFEs) and relative RMSFEs for each model in Tables 3 through 5. The selected models for 3-month out-of-sample forecasting are illustrated in Figure 4; the others are available upon request. In particular, Table 3 and Figures 4a and 4b report the 3-month out-of-sample forecasting results from October 2009 to December 2009. The findings for 6-month out-of-sample forecasting, from July 2009 to December 2009, are summarized in Table 4. The results for 12-month out-of-sample forecasting are reported in Table 5.

[14] The ARMA model is not considered in the out-of-sample forecasting because the MA coefficient is insignificant in the in-sample estimation.
[15] One PCA factor and one PLS factor are chosen for comparison with 1 SIR factor. Five PCA factors explain about 50 percent and 10 PCA factors explain about 75 percent of the variation present in the 37 macro variables. The same numbers of PLS factors are chosen for comparison.

It is striking that there is no magic dimension reduction method for out-of-sample forecasting. The performance of the different dimension reduction methods depends on the forecasting horizon, the number of factors chosen and the number of slices for SIR. For the 3-month out-of-sample forecasting reported in Table 3, the relative RMSFE for the FAAR model using 1 SIR factor with 5 slices is 0.288; this indicates that the model outperforms FAAR models using 1 PCA factor, 5 PCA factors, 1 PLS factor, 5 PLS factors and 10 PLS factors. However, the relative RMSFE of the FAAR model with 10 PCA factors is 0.146, which clearly beats the FAAR model using 1 SIR factor with 5, 10 or 20 slices. These observations are depicted in Figures 4a and 4b. In Figure 4a, the forecasted change in annual inflation from the FAAR model using 10 PCA factors, represented by the long-dash two-dot line, follows the true line most closely. In Figure 4b, the forecasted change in annual inflation from the FAAR model with 1 SIR factor, represented by the long-dash line, is much closer to the true line than the forecast from the FAAR model with 5 PCA factors, represented by the short-dash line.

The forecasting horizon plays an important role for the FAAR model using SIR. In particular, the longer the horizon, the worse the out-of-sample forecasting performance of the FAAR model using 1 SIR factor [16]. The estimates indicate that the relative RMSFE increases from 0.288 for 3-month out-of-sample forecasting to 0.901 for 6-month out-of-sample forecasting and finally to 0.998 for 12-month out-of-sample forecasting. This finding is consistent with the theoretical feature of SIR: SIR uses information in the forecasted variable to construct the factors, and the longer the forecasting horizon, the less information in the forecasted variable can be used to form them. We also observe variation across the number of SIR slices. Table 4 reports that the relative RMSFE for a FAAR model using 1 SIR factor is 0.901 with 5 SIR slices, 1.832 with 10 SIR slices and 0.818 with 20 SIR slices [17].

[16] A chi-square test chooses 1 SIR factor.
[17] While it would be desirable to use a criterion such as cross-validation to determine the optimal number of SIR slices, this is not the focus of this study.

The second empirical finding is that the out-of-sample forecasting performance of the FAAR model using PCA depends on the number of factors. This finding also matches the theoretical observation for PCA: the more PCA factors are chosen, the more of the variation in the original data is summarized. The relative RMSFE for the FAAR model is 1.033 with 1 PCA factor, 0.798 with 5 PCA factors and 0.146 with 10 PCA factors. However, when we keep 10 PCA factors, the out-of-sample forecasting performance deteriorates as the horizon increases. The relative RMSFE increases from 0.146 for 3-month out-of-sample forecasting to 0.189 for 6-month out-of-sample forecasting and finally to 0.687 for 12-month out-of-sample forecasting.
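As a reminder of how the relative RMSFE figures quoted in this section are formed, the sketch below computes the RMSFE of a model and divides it by the RMSFE of the benchmark over the same holdout window. All numbers are hypothetical and the function names `rmsfe` and `relative_rmsfe` are illustrative; they are not the paper's forecasts.

```python
import math

def rmsfe(actual, forecast):
    """Root mean square forecasting error over a holdout window."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def relative_rmsfe(actual, model_forecast, benchmark_forecast):
    """RMSFE of a model divided by the RMSFE of the benchmark."""
    return rmsfe(actual, model_forecast) / rmsfe(actual, benchmark_forecast)

# Hypothetical 3-month holdout for the change in annual inflation.
actual = [0.010, -0.004, 0.006]
ar_forecast = [0.000, 0.000, 0.000]       # stand-in for the AR benchmark
faar_forecast = [0.008, -0.002, 0.005]    # stand-in for a FAAR model

ratio = relative_rmsfe(actual, faar_forecast, ar_forecast)
improvement_pct = (1.0 - ratio) * 100.0   # percent reduction in RMSFE versus the benchmark
print(f"relative RMSFE = {ratio:.3f}, improvement = {improvement_pct:.1f}%")
```

A ratio below one means the model beats the benchmark, and a ratio above one (such as the 1.832 reported for 10 SIR slices at the 6-month horizon) means it does worse.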

Third, the forecasting performance of the FAAR model using PLS is not outstanding when compared to the benchmark AR model. The only scenario in which the FAAR model using PLS outperforms the AR model is shown in Table 3. In this case the relative RMSFE for the FAAR model with 10 PLS factors is 0.924, indicating a 7.6 percent improvement over the AR model. This may be due to the number of PLS factors [18], which was chosen to be consistent with the number of PCA factors. When it starts to construct the factors, PLS focuses on the correlation between the forecasted variable and each predictor but ignores the correlation among the predictors. As the number of PLS factors increases, the ignored correlation among the predictors gradually becomes useful in the estimation of the factor values. However, similar to SIR, PLS uses information in the forecasted variable to form the factors, so the longer the forecasting horizon, the less information in the forecasted variable is captured.

[18] Cross-validation criteria suggest over 20 PLS factors. We avoid this choice because the purpose of our study is to reduce the dimension of the predictors significantly.

4. Conclusion

Inflation within China has attracted great interest recently. To improve inflation forecasts for China, this paper investigates the predictive ability of three dimension reduction techniques, PCA, PLS and SIR, in a data-rich environment. The empirical study finds that there is no magic method that works for out-of-sample forecasting in all circumstances. Using 37 macroeconomic series available for China from January 1998 to December 2009, we find that the out-of-sample forecasting performance of the three dimension reduction methods depends on the forecasting horizon, the number of factors chosen and the number of slices for SIR. In particular, for three-month out-of-sample forecasting, the factor-augmented autoregression (FAAR) model with one SIR factor outperforms the others except for the FAAR model with ten PCA factors. Compared with other FAAR models that include only one factor, the FAAR model with 1 SIR factor performs best at the three-month horizon. However, the performance of SIR worsens as the forecasting horizon increases. For one-year forecasting, the FAAR model with ten PCA factors has the best predictive ability among all models. The performance of PCA improves as the number of PCA factors increases, since more of the variation in the original data is summarized. Further, the forecasting performance of the FAAR model using PLS is not outstanding when compared to the benchmark AR model and the others. This study sheds light on how to improve the accuracy of inflation forecasting in a data-rich environment, which is essential for policy makers within China today.
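For readers who want to experiment with the two supervised constructions described in Section 2, the sketch below follows SIR steps (a) to (f) and PLS Steps 1 to 4. It is a hedged illustration, not the authors' code: the function names `sir_directions` and `pls_beta` are hypothetical, the data are simulated, Y and X are centered before PLS (an assumption, since the Step 2 regressions have no intercept), and the generalized eigenproblem in step (f) is solved on the standardized scale and then mapped back to the original scale.

```python
import numpy as np

def sir_directions(X, y, H, K=1):
    """SIR steps (a)-(f): return an N x K matrix of estimated directions."""
    T, N = X.shape
    Xc = X - X.mean(axis=0)
    sigma_x = Xc.T @ Xc / T                          # (e) sample covariance of X
    val, vec = np.linalg.eigh(sigma_x)
    inv_sqrt = vec @ np.diag(val ** -0.5) @ vec.T    # inverse square root of the covariance
    Z = Xc @ inv_sqrt                                # (a) standardized predictors
    slices = np.array_split(np.argsort(y), H)        # (b)-(c) sort by Y, H near-equal slices
    M = np.zeros((N, N))
    for idx in slices:                               # (d) slice means weighted by slice size
        m = Z[idx].mean(axis=0)
        M += len(idx) / T * np.outer(m, m)
    lam, eta = np.linalg.eigh(M)                     # (f) eigen-decomposition
    top = np.argsort(lam)[::-1][:K]                  # K largest eigenvalues
    return inv_sqrt @ eta[:, top]                    # directions on the original X scale

def pls_beta(X, y, K):
    """PLS Steps 1-4: return the coefficient vector W (P'W)^{-1} q."""
    Xi = X - X.mean(axis=0)                          # centering is an assumption of this sketch
    yi = y - y.mean()
    W, P, q = [], [], []
    for _ in range(K):
        w = Xi.T @ yi
        w = w / np.linalg.norm(w)                    # Step 1: unit-length weight c X'Y
        f = Xi @ w                                   # current PLS factor
        ff = f @ f
        q_i = (f @ yi) / ff                          # Step 2: OLS of Y on the factor
        p_i = (Xi.T @ f) / ff                        # Step 2: OLS of each column of X on the factor
        yi = yi - q_i * f                            # residual u_i becomes the new Y
        Xi = Xi - np.outer(f, p_i)                   # residual e_i becomes the new X
        W.append(w); P.append(p_i); q.append(q_i)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)           # Step 4

# Simulated checks (illustrative only).
rng = np.random.default_rng(1)
T, N = 500, 5
X = rng.standard_normal((T, N))
b = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
y_nl = (X @ b) ** 3 + 0.1 * rng.standard_normal(T)   # non-linear link, as SIR allows
d = sir_directions(X, y_nl, H=10)[:, 0]
cosine = abs(d @ b) / np.linalg.norm(d)              # alignment with the true direction

X2 = X * np.array([1.0, 1.5, 2.0, 2.5, 3.0])         # unequal predictor variances
y_lin = X2 @ np.array([0.5, 0.2, -0.3, 0.0, 0.1]) + rng.standard_normal(T)
beta_pls = pls_beta(X2, y_lin, K=N)
beta_ols = np.linalg.lstsq(X2 - X2.mean(axis=0), y_lin - y_lin.mean(), rcond=None)[0]
```

In this simulation the single SIR direction should align closely with the true direction despite the cubic link, and PLS with as many factors as predictors should reproduce the OLS coefficients, in line with the remark in Section 2 that the PLS bias shrinks as factors are added.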

References:

Artis, M. J., Banerjee, A., Marcellino, M., 2001. Factor Forecasts for the UK. Journal of Forecasting 24, 279-298.

Bernanke, B., Boivin, J., 2003. Monetary Policy in a Data-rich Environment. Journal of Monetary Economics 50, 525-546.

Burdekin, R., Siklos, P., 2008. What has driven Chinese monetary policy since 1990s? Investigating the People's Bank's policy rule. Journal of International Money and Finance 27, 847-859.

Chen, Y. B., Tang, S. L., Du, L., 2009. Can money supply forecast inflation in China? Economic Theory and Business Management 02/2009.

Li, K. C., 1991. Sliced Inverse Regression for Dimension Reduction. Journal of the American Statistical Association 86, 316-327.

Lin, J. L., Tsay, R. S., 2005. Comparisons of Forecasting Methods with Many Predictors. mimeo.

Marcellino, M., Stock, J. H., Watson, M. W., 2006. A Comparison of Direct and Iterated AR Methods for Forecasting Macroeconomic Series h-steps Ahead. Journal of Econometrics 135, 499-526.

Mehrotra, A., Sánchez-Fung, J., 2008. Forecasting inflation in China. China Economic Journal 1(3), 317-322.

Stock, J. H., Watson, M. W., 1999. Diffusion Indexes. NBER Working Paper No. 6702.

Stock, J. H., Watson, M. W., 2002a. Forecasting Using Principal Components from a Large Number of Predictors. Journal of the American Statistical Association 97(460), 1167-1179.

Stock, J. H., Watson, M. W., 2002b. Macroeconomic Forecasting Using Diffusion Indexes. Journal of Business and Economic Statistics 20(2), 147-162.

Stock, J. H., Watson, M. W., Marcellino, M., 2003. Macroeconomic Forecasting in the Euro Area: Country Specific versus Area-Wide Information. European Economic Review 47, 1-18.

Stock, J. H., Watson, M. W., 2005. An Empirical Comparison of Methods for Forecasting Using Many Predictors. mimeo.

Stock, J. H., Watson, M. W., 2007. Why has US inflation become harder to forecast? Journal of Money, Credit and Banking 39, 13-33.

Wold, H., 1966. Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. R. (Ed.), Multivariate Analysis, Academic Press, New York, 391-420.

Wold, H., 1985. Partial least squares. In: Kotz, S., Norman, L.J. (Eds.), Encyclopedia of statistical sciences 6, Wiley Press, New York, 581-591.

Table 1

No.  Variables                                                   Source
1    Government Expenditure                                      CEIC
2    Government Revenue                                          CEIC
3    Industrial Sales: Collective Ownership                      CEIC
4    Industrial Sales: Heavy Industry                            CEIC
5    Industrial Sales: Light Industry                            CEIC
6    Industrial Sales: State Owned                               CEIC
7    Financial Institution Loans                                 CEIC
8    Production of Primary Energy: Coal                          CEIC
9    Production of Primary Energy: Electricity                   CEIC
10   Production of Primary Energy: Natural Gas                   CEIC
11   Production of Primary Energy: Crude Oil                     CEIC
12   Purchasing PI: Raw Materials (RM): Total                    CEIC
13   Financial Institution Deposits: Savings Deposits            CEIC
14   Index: Shanghai Stock Exchange: Composite                   CEIC
15   Index: Shenzhen Stock Exchange: Composite                   CEIC
16   Spot Exchange Rate: Period Avg: SAFE: RMB to US Dollar      CEIC
17   Spot Exchange Rate: Period Avg: SAFE: RMB to Japanese Yen   CEIC
18   M0                                                          NBSC
19   M1                                                          NBSC
20   M2                                                          NBSC
21   CPI                                                         NBSC
22   PPI                                                         NBSC
23   Foreign Exchange                                            NBSC
24   Gold                                                        NBSC
25   Retail Price Index (Urban)                                  NBSC
26   Social Consumption Retail Aggregate Value                   NBSC
27   Industrial Production (index number)                        IFS
28   Bank Rate                                                   IFS
29   Deposit Rate                                                IFS
30   Lending Rate                                                IFS
31   Share Price Index                                           IFS
32   SDR Holdings                                                IFS
33   Reserve Position in the Fund                                IFS
34   Nominal Effective Exchange Rate                             IFS
35   Real Effective Exchange Rate                                IFS
36   Exports, F.O.B.                                             IFS
37   Imports, C.I.F.                                             IFS
38   Oil Price                                                   IFS

Note: CEIC is "Macroeconomic, Industry and Financial time series databases for Global Emerging and Developed Markets"; NBSC is "National Bureau of Statistics of China"; and IFS is "International Financial Statistics".

Table 2 In-Sample Predictions for Change in Annual Inflation Rate in China from 1998:02 to 2009:12

Model                               dinf(t-1)     Adj R-square   RMSE    Relative RMSE
AR(1)                               0.193 (**)    0.03           0.645   1
ARMA(1,1)                           0.765 (***)   n.a.           0.643   0.997
FAAR, 5 PCA factors                 0.094         0.122          0.614   0.952
FAAR, 11 PCA factors                0.013         0.476          0.474   0.735
FAAR, 11 PLS factors                0.001         0.467          0.478   0.741
FAAR, 1 SIR factor (5 slices)       0.003         0.704          0.357   0.553
FAAR, 1 SIR factor (10 slices)      -0.009        0.696          0.363   0.563
FAAR, 1 SIR factor (20 slices)      -0.03         0.721          0.347   0.538

Note: dinf: change in annual inflation rate; ** indicates 5% significance level; *** indicates 1% significance level; AR(1) is the benchmark model.

Table 3 Out-of-Sample Forecasts for Change in Annual Inflation Rate in China from 2009:10 to 2009:12 (3-month Forecasting)

Model                               RMSFE   Relative RMSFE
AR(1)                               0.889   1
FAAR, 1 PCA factor                  0.918   1.033
FAAR, 5 PCA factors                 0.709   0.798
FAAR, 10 PCA factors                0.130   0.146
FAAR, 1 PLS factor                  1.040   1.170
FAAR, 5 PLS factors                 0.891   1.002
FAAR, 10 PLS factors                0.822   0.924
FAAR, 1 SIR factor (5 slices)       0.256   0.288
FAAR, 1 SIR factor (10 slices)      0.303   0.341
FAAR, 1 SIR factor (20 slices)      0.256   0.288

Table 4 Out-of-Sample Forecasts for Change in Annual Inflation Rate in China from 2009:7 to 2009:12 (6-month Forecasting)

Model                               RMSFE   Relative RMSFE
AR(1)                               0.691   1
FAAR, 1 PCA factor                  0.713   1.032
FAAR, 5 PCA factors                 0.643   0.931
FAAR, 10 PCA factors                0.130   0.189
FAAR, 1 PLS factor                  0.817   1.183
FAAR, 5 PLS factors                 0.767   1.110
FAAR, 10 PLS factors                1.932   2.797
FAAR, 1 SIR factor (5 slices)       0.622   0.901
FAAR, 1 SIR factor (10 slices)      1.266   1.832
FAAR, 1 SIR factor (20 slices)      0.565   0.818

Note: AR(1) is the benchmark model.

Table 5 Out-of-Sample Forecasts for Change in Annual Inflation Rate in China from 2009:1 to 2009:12 (12-month Forecasting)

Model                               RMSFE   Relative RMSFE
AR(1)                               0.934   1
FAAR, 1 PCA factor                  0.971   1.040
FAAR, 5 PCA factors                 0.888   0.950
FAAR, 10 PCA factors                0.642   0.687
FAAR, 1 PLS factor                  0.979   1.047
FAAR, 5 PLS factors                 0.971   1.039
FAAR, 10 PLS factors                1.192   1.275
FAAR, 1 SIR factor (5 slices)       0.932   0.998
FAAR, 1 SIR factor (10 slices)      1.269   1.358
FAAR, 1 SIR factor (20 slices)      1.118   1.196

Note: AR(1) is the benchmark model.
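The relative RMSFEs in Tables 3 to 5 appear to be each model's RMSFE divided by the RMSFE of the AR(1) benchmark, so that values below 1 indicate an improvement over the benchmark. The short sketch below recomputes these ratios from the RMSFEs reported in the tables and picks the best model at each horizon. The dictionary layout, the shortened model labels and the function name relative_rmsfe are illustrative choices, not part of the paper.

```python
# RMSFEs as reported in Tables 3, 4 and 5 (3-, 6- and 12-month horizons).
rmsfe = {
    "3-month": {
        "AR(1)": 0.889, "1 PCA": 0.918, "5 PCA": 0.709, "10 PCA": 0.130,
        "1 PLS": 1.040, "5 PLS": 0.891, "10 PLS": 0.822,
        "1 SIR (5 slices)": 0.256, "1 SIR (10 slices)": 0.303,
        "1 SIR (20 slices)": 0.256,
    },
    "6-month": {
        "AR(1)": 0.691, "1 PCA": 0.713, "5 PCA": 0.643, "10 PCA": 0.130,
        "1 PLS": 0.817, "5 PLS": 0.767, "10 PLS": 1.932,
        "1 SIR (5 slices)": 0.622, "1 SIR (10 slices)": 1.266,
        "1 SIR (20 slices)": 0.565,
    },
    "12-month": {
        "AR(1)": 0.934, "1 PCA": 0.971, "5 PCA": 0.888, "10 PCA": 0.642,
        "1 PLS": 0.979, "5 PLS": 0.971, "10 PLS": 1.192,
        "1 SIR (5 slices)": 0.932, "1 SIR (10 slices)": 1.269,
        "1 SIR (20 slices)": 1.118,
    },
}


def relative_rmsfe(table, benchmark="AR(1)"):
    """Divide every model's RMSFE by the benchmark's RMSFE."""
    base = table[benchmark]
    return {model: round(value / base, 3) for model, value in table.items()}


for horizon, table in rmsfe.items():
    rel = relative_rmsfe(table)
    best = min(rel, key=rel.get)
    print(f"{horizon}: best model = {best}, relative RMSFE = {rel[best]}")
```

Because the inputs are already rounded to three decimals, a recomputed ratio can differ from the published relative RMSFE in the last digit.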

Figure 1 Annual Inflation Rate in China from 1998:01 to 2009:12 (plot not reproduced; vertical axis: percent; horizontal axis: time, 1998m1 to 2010m1)

Figure 2 Change in Annual Inflation Rate in China from 1998:02 to 2009:12 (plot not reproduced; vertical axis: percent; horizontal axis: time, 1998m1 to 2010m1)

Figure 3 In-sample Predictions for Change in Annual Inflation Rate from 1998:03 to 2009:12 (plot not reproduced; horizontal axis: time, 1998m1 to 2010m1; series: change in annual inflation rate (dinf), predicted dinf from SIR with 1 factor, predicted dinf from AR(1), predicted dinf from PCA with 11 factors)

Figure 4

a. Annual Inflation Rate and Forecasted Annual Inflation Rates from 2009:10 to 2009:12 (3-month Out-of-sample Forecasting) (plot not reproduced; horizontal axis: time, 2009m1 to 2010m1; series: annual inflation rate (inf), forecasted inf from 1 SIR factor with 5 slices, forecasted inf from 10 PCA factors, forecasted inf from AR(1), forecasted inf from 10 PLS factors)

b. Annual Inflation Rate and Forecasted Annual Inflation Rates from 2009:10 to 2009:12 (3-month Out-of-sample Forecasting) (plot not reproduced; horizontal axis: time, 2009m1 to 2010m1; series: annual inflation rate (inf), forecasted inf from 1 SIR factor with 5 slices, forecasted inf from 5 PLS factors, forecasted inf from 5 PCA factors)