Methods and software of the statistical process

Prepare draft outputs

This sub-process is where the data are transformed into statistical outputs. It includes the production of additional measurements such as indices, trends or seasonally adjusted series, as well as the recording of quality characteristics.

Computation and evaluation of composite indices

A composite index is a mathematical combination (or aggregation, as it is termed) of a set of indicators that represent the different dimensions of the phenomenon to be measured.

Constructing a composite index is a complex task. Its phases involve several alternatives and possibilities that affect the quality and reliability of the results. The main problems in this approach concern the choice of the theoretical framework, the availability of the data, the selection of the most representative indicators, and their treatment in order to compare and aggregate them.

In particular, we can summarize the procedure in the following main steps:

  1. Defining the phenomenon to be measured. The definition of the concept should give a clear sense of what is being measured by the composite index. It should refer to a theoretical framework, linking various sub-groups and underlying indicators. The measurement model must also be defined, in order to specify the relationship between the phenomenon to be measured (concept) and its measures (individual indicators). If causality runs from the concept to the indicators we have a reflective model: indicators are interchangeable and correlations between indicators are explained by the model; if causality runs from the indicators to the concept we have a formative model: indicators are not interchangeable and correlations between indicators are not explained by the model.
  2. Selecting a group of individual indicators. The selection is generally based on theory, empirical analysis, pragmatism or intuitive appeal. Ideally, indicators should be selected according to their relevance, analytical soundness, timeliness, accessibility and so on. The selection step is the result of a trade-off between possible redundancies caused by overlapping information and the risk of losing information. However, the selection process also depends on the measurement model used: in a reflective model, all the individual indicators must be intercorrelated, whereas in a formative model they can show negative or zero correlations.
  3. Normalizing the individual indicators. This step aims to make the indicators comparable. Normalization is required before any data aggregation, as the indicators in a data set often have different measurement units. It is therefore necessary to bring the indicators to the same standard by transforming them into pure, dimensionless numbers. Another motivation for normalization is that some indicators may be positively correlated with the phenomenon to be measured (positive polarity), whereas others may be negatively correlated with it (negative polarity). We want to normalize the indicators so that an increase in the normalized indicators corresponds to an increase in the composite index. There are various methods of normalization, such as re-scaling (or Min-Max), standardization (or z-scores) and ‘distance’ from a reference (or index numbers).
  4. Aggregating the normalized indicators. This is the combination of all the components to form one or more composite indices (mathematical functions). This step requires the definition of the importance of each individual indicator (weighting system) and the identification of the technique (compensatory or non-compensatory) for summarizing the individual indicator values into a single number. Different aggregation methods can be used, such as additive methods (compensatory approach) or multiplicative methods and unbalance-adjusted functions (non-compensatory or partially compensatory approach); a minimal sketch illustrating the normalization and aggregation steps follows this list.
  5. Validating the composite index. The validation step aims to assess the robustness of the composite index, in terms of its capacity to produce correct and stable measures, and its discriminant capacity (Influence Analysis and Robustness Analysis).
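As a minimal sketch of steps 3 and 4, assuming two hypothetical indicators with opposite polarity, equal weights, a Min-Max normalization and an additive (fully compensatory) aggregation:

```python
import pandas as pd

# Hypothetical indicator values for four statistical units (e.g. regions);
# 'polarity' marks whether an indicator moves with (+1) or against (-1)
# the phenomenon being measured.
data = pd.DataFrame(
    {
        "employment_rate": [55.2, 61.8, 48.9, 67.3],
        "unemployment_rate": [12.4, 8.1, 15.6, 6.2],
    },
    index=["A", "B", "C", "D"],
)
polarity = {"employment_rate": 1, "unemployment_rate": -1}


def min_max(x: pd.Series, pol: int) -> pd.Series:
    """Re-scale to [0, 1]; reverse the scale for negative-polarity indicators."""
    z = (x - x.min()) / (x.max() - x.min())
    return z if pol > 0 else 1 - z


def z_score(x: pd.Series, pol: int) -> pd.Series:
    """Standardize to mean 0 and unit variance, aligning polarity."""
    return pol * (x - x.mean()) / x.std()


# Step 3: normalization (here with Min-Max re-scaling)
normalized = data.apply(lambda col: min_max(col, polarity[col.name]))

# Step 4: additive (fully compensatory) aggregation with equal weights
weights = pd.Series(0.5, index=data.columns)
composite = normalized.mul(weights, axis=1).sum(axis=1)
print(composite.sort_values(ascending=False))
```

A multiplicative (geometric) aggregation or an unbalance-adjusted function would instead limit the compensability between indicators.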

Seasonal adjustment of time series

Seasonality can be defined as the systematic intra-year movement caused by various factors, e.g. weather changes, the calendar, vacations or holidays, and usually consists of periodic, repetitive and generally regular and predictable patterns in the level of a time series. Seasonality can also be influenced by production and consumption decisions made by economic agents, which take into account several factors such as endowments, expectations, preferences and the production techniques available in the economy.

By contrast, a cyclical pattern presents rises and falls of non-fixed length, with fluctuations usually lasting no less than two years.

The overlap of these two kinds of fluctuations (seasonal and cyclical) in a time series can hamper the interpretation of short-term (monthly or quarterly) variations, especially when the seasonal component accounts for a large share of the observed data. For this reason, in order to measure cyclical changes, short-term variations are computed from seasonally adjusted series. In turn, seasonal adjustment is the process of removing seasonal and calendar effects from a time series. This process is performed by means of analytical techniques that break down the series into components with different dynamic features. These components are unobserved and have to be identified from the observed data based on ex-ante assumptions about their expected behaviour. Broadly speaking, seasonal adjustment includes the removal of both within-a-year seasonal movements and the influence of calendar effects (such as the different number of working days, or Easter and moving holidays).

Notice that calendar effects are not constant across countries or economic sectors, so that time series which include them are not directly comparable with each other. For this reason, calendar effects are generally removed together with the seasonal component in the seasonally adjusted series, making it possible to better capture the year-on-year variation (computed with respect to the same period of the previous year), as well as the mean yearly variation. Moreover, together with the seasonally adjusted series, time series corrected only for calendar effects can also be produced.

Once the repeated impact of these effects has been removed, seasonally adjusted data highlight the underlying long-term trend and the short-run innovations in the series.

Seasonal adjustment approaches

All seasonal adjustment methods are based on the assumption that each time series Yt (with time index t = 1, 2, …, T) can be decomposed into three different unobserved components (a minimal decomposition sketch follows the list below):

  • a trend-cycle component (CTt) representing the long-run movements of the series (like those associated with business cycles); it generally depends on structural conditions such as institutional settings, technological and demographic trends or patterns of civil and social organization.
  • a seasonal component (St) representing the intra-year (monthly, quarterly) fluctuations.
  • an irregular component (It) representing the short-term fluctuations that are not systematic and are, to a certain extent, unpredictable, e.g. uncharacteristic weather patterns.
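
As a minimal sketch of this three-component split, the example below builds a hypothetical monthly series and decomposes it with a classical moving-average routine (seasonal_decompose from statsmodels); it only illustrates the decomposition idea, not the AMB or filter-based procedures described below.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly series built as trend-cycle + seasonal + irregular
idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
rng = np.random.default_rng(0)
y = pd.Series(
    100 + 0.5 * t                          # trend-cycle (CTt)
    + 10 * np.sin(2 * np.pi * t / 12)      # seasonal component (St)
    + rng.normal(0, 2, 96),                # irregular component (It)
    index=idx,
)

# Additive decomposition: Yt = CTt + St + It
res = seasonal_decompose(y, model="additive", period=12)
seasonally_adjusted = y - res.seasonal     # remove the estimated seasonal component
print(seasonally_adjusted.head())
```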

Although the series may be decomposed in different ways, two main approaches, consistent with the European guidelines (Eurostat 2015), are generally considered:

  1. Arima Model Based (AMB) approach, developed among others by Burman (1980), Box, Hillmer and Tiao (1978) and Hillmer and Tiao (1982), based on the assumption that there exists a statistical parametric model (ARIMA) representing the probabilistic structure of the stochastic process connected to the observed time series. The time series is assumed to be a finite part of a particular realization of a stochastic process. The linear filters used in this approach consequently depend on the features of the time series considered. This kind of approach is adopted in the TRAMO-SEATS (Time series regression with ARIMA noise, missing observations and outliers and Signal Extraction in ARIMA time series – TS) procedure developed by Gómez and Maravall (1996).
  2. Filter Based Approach (FLB), a non-parametric or semi-parametric approach which, differently from the AMB approach, does not require hypothesizing a statistical model representing the series. Indeed, it is based on the iterative application to the series of several linear filters based on central moving averages. These procedures are referred to as ad hoc because the filters are chosen according to empirical rules, without taking into account the probabilistic structure of the stochastic process generating the series. The classical methods of the X-11 (X11) family belong to this approach: from the first X11 and X11-Arima (X-11A) to the more recent X-12-ARIMA (X-12A) (Findley et al. 1998) and X-13-ARIMA-SEATS (X-13AS) (Findley, 2005), which include several improvements over the previous versions. Among these, the most remarkable is the use of reg-ARIMA models aimed at pre-treating the data and at improving the forecasting performance of the series, which in turn translates into an improvement of the symmetric moving-average filters employed and, generally, into a higher stability of the estimated seasonal factors (see the sketch after this list).
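
As an illustration of how the X-13ARIMA-SEATS program mentioned above could be run from Python, the following sketch uses the x13_arima_analysis wrapper in statsmodels; it assumes the external x13as executable distributed by the US Census Bureau is installed, and the path passed to x12path is only a placeholder for its actual location.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.x13 import x13_arima_analysis

# Hypothetical monthly series (same structure as in the decomposition sketch above)
idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
rng = np.random.default_rng(0)
y = pd.Series(100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
              + rng.normal(0, 2, 96), index=idx)

# Requires the external x13as binary; the path below is only a placeholder.
res = x13_arima_analysis(y, x12path="/usr/local/bin", outlier=True, trading=True)

print(res.seasadj.head())    # seasonally adjusted series
print(res.trend.head())      # trend-cycle component
print(res.irregular.head())  # irregular component
```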

In both cases the data are pre-treated in order to select the decomposition scheme to be applied to the time series (additive, multiplicative, log-additive, etc.). Moreover, some deterministic effects such as outliers or calendar effects are removed. The pre-treated series is the input of the following step, whose output is the seasonally adjusted (SA) series. Once the seasonally adjusted series is obtained, there is a last step in which some elements identified in the pre-treatment phase and related to the trend-cycle component (such as level shifts) or to the irregular component (such as additive outliers or temporary changes) are reintroduced, while calendar effects and seasonal outliers are removed from the final series.
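
The selection of the decomposition scheme is normally made automatically by the software during pre-treatment. As a purely informal illustration of the idea, and not the actual test implemented in TRAMO-SEATS or X-13ARIMA-SEATS, the sketch below suggests a multiplicative (or log-additive) scheme when the within-year spread of the series grows with its yearly level, and an additive one otherwise.

```python
import numpy as np
import pandas as pd


def suggest_scheme(y: pd.Series, period: int = 12) -> str:
    """Informal heuristic: if the within-year spread grows with the yearly
    level, seasonal fluctuations are roughly proportional to the level and a
    multiplicative (or log-additive) scheme is suggested; otherwise an
    additive one. Not the formal test used by TRAMO-SEATS or X-13ARIMA-SEATS."""
    n_full = (len(y) // period) * period           # keep complete years only
    groups = y.iloc[:n_full].groupby(np.arange(n_full) // period)
    levels, spreads = groups.mean(), groups.std()
    corr = np.corrcoef(levels, spreads)[0, 1]      # level-spread association
    return "multiplicative" if corr > 0.5 else "additive"
```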