Asymptotic properties of symmetry and goodness-of-fit criteria based on characterizations

Thesis


  • 1. Supporting Information
    • 1.1. Information from the theory of U- and V-statistics
    • 1.2. Definition and calculation of Bahadur efficiency
    • 1.3. On large deviations of U- and V-statistics
  • 2. Baringhaus-Henze symmetry criteria
    • 2.1. Introduction
    • 2.2. Statistics
    • 2.3. Statistics
  • 3. Exponentiality criteria
    • 3.1. Introduction
    • 3.2. Statistics
    • 3.3. Statistics
  • 4. Normality criteria
    • 4.1. Introduction
    • 4.2. Statistics
    • 4.3. Statistics
    • 4.4. Statistics
  • 5. Criteria for agreement with the Cauchy law
    • 5.1. Introduction
    • 5.2. Statistics
    • 5.3. Statistics

Asymptotic properties of symmetry and goodness-of-fit criteria based on characterizations

This dissertation constructs and studies goodness-of-fit and symmetry criteria based on the characterization properties of distributions, and also calculates their asymptotic relative efficiency for a number of alternatives.

The construction of statistical criteria and the study of their asymptotic properties is one of the most important tasks of mathematical statistics. When a simple hypothesis is tested against a simple alternative, the problem is solved by the Neyman-Pearson lemma, which, as is well known, gives the optimal (most powerful) criterion in the class of all criteria of a given level. This is the likelihood ratio test.

However, for more difficult and practical hypothesis testing problems, which involve either composite hypotheses or composite alternatives, uniformly most powerful tests rarely exist, and the role of the likelihood ratio test changes significantly. The likelihood ratio statistic usually cannot be calculated explicitly, it loses its optimality property, and its distribution is unstable to changes in the statistical model. Moreover, the statistician often cannot specify the type of alternative at all, without which the construction of parametric criteria becomes meaningless.

Therefore, one of the ways in which the testing of statistical hypotheses developed was the path of "empirical" construction of criteria, when the test statistic is built on some principle, an ingenious idea, or common sense, but its optimality is not guaranteed.

Typical examples of such statistics are the sign statistic, Pearson's χ² statistic (1900), the Kolmogorov statistic (1933), which measures the uniform distance between the empirical and the true distribution function, the Kendall rank correlation coefficient (1938), and the Bickel-Rosenblatt statistic (1973), based on the quadratic risk of a kernel density estimator. At present, mathematical statistics has many dozens of "empirical" statistics for testing hypotheses of goodness of fit, symmetry, homogeneity, randomness and independence, and more and more statistics of this type are constantly being proposed in the literature. A huge literature is devoted to the study of their exact and limiting distributions, estimates of the rate of convergence, large deviations, asymptotic expansions, etc.

In order to justify the use of such statistics when testing hypotheses against a certain class of alternatives, their power is most often assessed by statistical simulation. However, for any consistent criterion the power tends to one as the sample size increases, and is therefore not always informative. A deeper comparison of the properties of statistics can be carried out on the basis of the concept of asymptotic relative efficiency (ARE). Various approaches to calculating the ARE were proposed by E. Pitman, J. Hodges and E. Lehmann, R. Bahadur, H. Chernoff and W. Kallenberg in the mid-20th century; the results of the development of ARE theory by the mid-1990s were summarized in a monograph. There is a generally accepted view that the construction of new criteria should be accompanied not only by an analysis of their properties but also by the calculation of their ARE, in order to assess their quality and give sound recommendations for their practical use.

This work uses the idea of constructing criteria based on characterizations of distributions by the equidistribution property. Characterization theory originates from the work of G. Pólya published in 1923. It was then developed in the works of J. Marcinkiewicz, S. N. Bernstein, E. Lukacs, Yu. V. Linnik, A. A. Zinger, G. Darmois, V. P. Skitovich, C. R. Rao, A. M. Kagan, J. Galambos, S. Kotz, L. B. Klebanov and many other mathematicians. The literature on this subject is extensive, and several monographs devoted to characterizations are currently available.

The idea of constructing statistical criteria based on characterizations by the equidistribution property belongs to Yu. V. Linnik. At the end of his extensive work he wrote: "…one can raise the question of constructing criteria for the agreement of a sample with a composite hypothesis, based on the identical distribution of the two corresponding statistics g_1(x_1,…,x_r) and g_2(x_1,…,x_r), thus reducing the question to a homogeneity test."

Let us return to the classical Pólya theorem to explain by a specific example how such an approach might work. In its simplest form, this theorem is formulated as follows.

Pólya's theorem. Let X and Y be two independent, identically distributed, centered random variables. Then the random variables (X + Y)/√2 and X are identically distributed if and only if the distribution law of X is normal.

Suppose we have a sample of centered independent observations X_1,…,X_n and want to test the (composite) null hypothesis that the distribution of this sample is normal with mean 0 and some variance. Using the sample, we construct the usual empirical distribution function (d.f.)

F_n(t) = n^{-1} Σ_{i=1}^{n} 1{X_i < t}, t ∈ R¹,

and the V-statistical empirical d.f.

G_n(t) = n^{-2} Σ_{i,j=1}^{n} 1{(X_i + X_j)/√2 < t}, t ∈ R¹.

By virtue of the Glivenko-Cantelli theorem, which is also valid for V-statistical empirical d.f.s, for large n the function F_n(t) uniformly approaches the d.f. F(t) = P(X < t), while G_n(t) uniformly approaches G(t) = P((X + Y)/√2 < t). Since F = G under the null hypothesis, F_n(t) is close to G_n(t), and a significance test can be based on a suitable functional T_n of the difference F_n(t) − G_n(t). Conversely, under the alternative (that is, when normality fails), Pólya's theorem gives F ≠ G, which leads to large values of T_n and allows the null hypothesis to be rejected, ensuring the consistency of the test.
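As an illustration, here is a minimal numerical sketch of this construction; the function name and the choice of the Kolmogorov-type sup-distance for T_n are our illustrative assumptions, since the dissertation studies its own specific functionals of F_n − G_n.

```python
import numpy as np

def polya_statistic(x):
    """Sup-distance T_n between the empirical d.f. F_n of the sample and
    the V-statistical d.f. G_n of (X_i + X_j)/sqrt(2) over all pairs."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    pairs = (x[:, None] + x[None, :]).ravel() / np.sqrt(2.0)  # all n^2 sums
    # evaluate both step functions on the pooled grid of jump points
    grid = np.sort(np.concatenate([x, pairs]))
    Fn = np.searchsorted(np.sort(x), grid, side='right') / n
    Gn = np.searchsorted(np.sort(pairs), grid, side='right') / n**2
    return np.abs(Fn - Gn).max()

rng = np.random.default_rng(0)
print(polya_statistic(rng.normal(size=200)))   # small under normality
print(polya_statistic(rng.laplace(size=200)))  # larger under the alternative
```

Under the normal null the statistic stays small, while under a non-normal law such as the Laplace distribution it tends to be noticeably larger, which mirrors the consistency argument above.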

However, this construction, based on the idea of Yu. V. Linnik, received almost no development, perhaps because of the technical difficulties in building and analyzing the resulting criteria. Another likely reason is that characterizations of distributions by the equidistribution property are few and far between.

We know of only a few works devoted, to one degree or another, to the development of Yu. V. Linnik's idea. These are the works of Baringhaus and Henze and of Muliere and Nikitin, which will be discussed below. There are also works in which goodness-of-fit criteria for specific distributions are constructed on the basis of characterizations, but not on the basis of equidistribution.

The most commonly used in the literature is the characterization of the exponential distribution by various variants of the lack-of-memory property.

It should be noted that in almost all of these works (with rare exceptions) the ARE of the criteria under consideration is not calculated or discussed. In this thesis we not only study the asymptotic properties of known and newly proposed characterization-based criteria, but also calculate their local exact (or approximate) Bahadur ARE.

Let us now define the concept of ARE. Let (T_n) and (V_n) be two sequences of statistics constructed from a sample X_1,…,X_n with distribution P_θ, where θ ∈ Θ ⊂ R¹, and suppose the null hypothesis H₀: θ ∈ Θ₀ ⊂ Θ is tested against the alternative A: θ ∈ Θ₁ = Θ \ Θ₀. Let N_T(α, β, θ) be the minimal sample size for which the sequence (T_n), at a given significance level α > 0, attains power β < 1 at the alternative parameter value θ ∈ Θ₁; N_V(α, β, θ) is introduced similarly. The relative efficiency of the test based on the statistic T_n with respect to the test based on V_n is defined as the inverse ratio of these sample sizes:

e_{T,V}(α, β, θ) = N_V(α, β, θ) / N_T(α, β, θ).

Since the relative efficiency, as a function of three arguments, cannot be calculated explicitly even for the simplest statistics, it is customary to consider the limits

lim_{α→0} e_{T,V}(α, β, θ), lim_{β→1} e_{T,V}(α, β, θ), lim_{θ→θ₀} e_{T,V}(α, β, θ).

In the first case the Bahadur ARE is obtained, the second limit determines the Hodges-Lehmann ARE, and the third leads to the definition of the Pitman ARE. Since in practical applications it is precisely the cases of small significance levels, high powers and close alternatives that are of most interest, all three definitions appear reasonable and natural.

In this work we use the Bahadur ARE to compare criteria. There are several reasons for this. First, the Pitman efficiency is suitable mainly for asymptotically normal statistics, and under this condition it coincides with the local Bahadur efficiency. We consider not only asymptotically normal statistics but also statistics of quadratic type, whose limiting distribution under the null hypothesis differs sharply from the normal one, so the Pitman efficiency does not apply. Second, the Hodges-Lehmann ARE is unsuitable for studying two-sided criteria, since they all turn out to be asymptotically optimal, while for one-sided criteria this ARE usually coincides locally with the Bahadur ARE. Third, significant progress has recently been made in the area of large deviations for test statistics, which is crucial for calculating the Bahadur ARE. We are referring to the large deviation results for U- and V-statistics described in recent works.

Let us now move on to an overview of the contents of the dissertation. The first chapter is of an auxiliary nature. It sets out the necessary theoretical and technical information from the theory of U-statistics, the theory of large deviations, and the theory of Bahadur asymptotic efficiency.

Chapter 2 is devoted to the construction and study of criteria for testing the symmetry hypothesis. Baringhaus and Henze proposed the idea of constructing symmetry criteria based on the following elementary characterization.

Let X and Y be i.i.d. random variables with a continuous d.f. Then |X| and |max(X, Y)| are identically distributed if and only if X and Y are symmetrically distributed around zero.

We use this characterization to construct new symmetry criteria. Recall that several classical symmetry criteria are based on an even simpler characterization of symmetry: the equidistribution of X and −X.

Let us return to the Baringhaus-Henze characterization. Let X_1,…,X_n be observations with a continuous d.f. G. Consider testing the symmetry hypothesis

H₀: G(x) = 1 − G(−x) for all x ∈ R¹.

This is a composite hypothesis, since the form of G is not specified. As alternatives we consider the parametric shift alternative, G(x; θ) = F(x − θ), θ > 0; the skew alternative, with density g(x; θ) = 2f(x)F(θx), θ > 0; the Lehmann alternative, G(x; θ) = F^{1+θ}(x), θ > 0; and the contamination alternative, G(x; θ) = (1 − θ)F(x) + θF^{r+1}(x), θ > 0, r > 0, where F(x) and f(x) are the d.f. and the density of some symmetric distribution.

In accordance with the above characterization, from |X_1|,…,|X_n| one constructs the usual empirical d.f. and the V-statistical empirical d.f.

H_n(t) = n^{-2} Σ_{j,k=1}^{n} 1{|max(X_j, X_k)| < t}.

The test statistics are then composed as functionals of the difference between these two functions.
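As an illustration, here is a minimal numerical sketch of a test built on this characterization; the sup-distance functional and the function name are our illustrative choices, not the specific statistics studied in the chapter.

```python
import numpy as np

def bh_symmetry_statistic(x):
    """Sup-distance between the empirical d.f. of |X_i| and the
    V-statistical d.f. H_n of |max(X_j, X_k)| over all pairs."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    abs_x = np.abs(x)
    abs_max = np.abs(np.maximum(x[:, None], x[None, :])).ravel()
    grid = np.sort(np.concatenate([abs_x, abs_max]))
    Fn = np.searchsorted(np.sort(abs_x), grid, side='right') / n
    Hn = np.searchsorted(np.sort(abs_max), grid, side='right') / n**2
    return np.abs(Fn - Hn).max()

rng = np.random.default_rng(1)
print(bh_symmetry_statistic(rng.normal(size=300)))        # symmetric: small
print(bh_symmetry_statistic(rng.normal(size=300) + 0.5))  # shift alternative: larger
```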

Let X and Y be non-negative, non-degenerate i.i.d. random variables with a d.f. F differentiable at zero, and let 0 < a < 1. Then X and min(X/a, Y/(1 − a)) are identically distributed if and only if F is the d.f. of an exponential law.

In addition to constructing the goodness-of-fit criterion itself and studying its asymptotic properties, it is of interest to calculate the ARE of the new criterion and to study its dependence on the parameter a.

The second generalization of this characterization belongs to Desu. We formulate it following a more recent work:

Let X_1,…,X_m, m ≥ 2, be non-negative, non-degenerate i.i.d. r.v.s with a d.f. F differentiable at zero. Then the statistics X_1 and m·min(X_1,…,X_m) are identically distributed if and only if F is the d.f. of an exponential law.

Let X_1,…,X_n be independent observations with d.f. G. Based on the characterizations formulated above, we can test the exponentiality hypothesis H₀, stating that G is the d.f. of an exponential law, against the alternative H₁, stating that it is not, under weak additional conditions.

In accordance with these characterizations, the usual empirical d.f. G_n(t) = n^{-1} Σ_{i=1}^{n} 1{X_i < t} and the corresponding V-statistical empirical d.f.s are constructed; for the first characterization, for example,

L_n(t) = n^{-2} Σ_{i,j=1}^{n} ½ ( 1{min(X_i/a, X_j/(1−a)) < t} + 1{min(X_j/a, X_i/(1−a)) < t} ).

We propose to base the exponentiality criteria on statistics of integral type, built from the difference between these functions, of the form ∫₀^∞ [L_n(t) − G_n(t)] dG_n(t), and on the corresponding Kolmogorov-type statistics.

As alternatives, we choose the standard alternatives used in the literature on testing exponentiality: the Weibull alternative, with density g(x) = (1 + θ)x^θ exp(−x^{1+θ}), x ≥ 0; the Makeham alternative, with g(x) = (1 + θ(1 − exp(−x))) exp(−x − θ(exp(−x) − 1 + x)), x ≥ 0; and the linear failure rate alternative, with g(x) = (1 + θx) exp(−x − θx²/2), x ≥ 0.
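The following sketch shows, under stated assumptions, how an integral-type statistic based on Desu's characterization with m = 2 might be computed; the function name and the Riemann-sum approximation of the integral are ours.

```python
import numpy as np

def desu_exp_statistic(x):
    """Integral-type statistic based on Desu's characterization, m = 2:
    compares the empirical d.f. G_n with the V-statistical d.f. L_n of
    2*min(X_i, X_j) over all pairs, approximating integral [L_n - G_n] dG_n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mins = 2.0 * np.minimum(x[:, None], x[None, :]).ravel()
    x_sorted = np.sort(x)
    Gn = np.arange(1, n + 1) / n                     # G_n at the order statistics
    Ln = np.searchsorted(np.sort(mins), x_sorted, side='right') / n**2
    return np.mean(Ln - Gn)

rng = np.random.default_rng(2)
print(desu_exp_statistic(rng.exponential(size=300)))   # near 0 under H0
print(desu_exp_statistic(rng.weibull(1.5, size=300)))  # drifts away under Weibull
```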

For the two statistics proposed above, the limiting distributions under the null hypothesis are established:

Theorem 3.2.1. For the first statistic, as n → ∞, the suitably normalized statistic converges in distribution to the normal law N(0, D₃(a)), where D₃(a) is defined in (3.2.2).

Theorem 3.3.1. For the second statistic, as n → ∞, the suitably normalized statistic converges in distribution to N(0, (m + 1)² D₄(m)), where D₄(m) is defined in (3.3.6).

Since the two statistics depend on the parameters a and m respectively, we establish at which parameter values the Bahadur ARE attains its maximum and find these values. In addition, we construct an alternative for which the maximum is attained at a point a ≠ ½.

The fourth chapter is devoted to testing the normality hypothesis. There are many characterizations of the normal law, one of the central laws of probability theory and mathematical statistics, including two monographs devoted exclusively to this issue. We will consider a slightly simplified version of a well-known characterization:

Let X_1, X_2,…,X_m be centered i.i.d. random variables with d.f. F, and let the constants a_1, a_2,…,a_m be such that 0 < a_i < 1 and Σ a_i² = 1. Then the statistics X_1 and a_1X_1 + … + a_mX_m are identically distributed if and only if F(x) = Φ(x/σ), that is, F is the d.f. of a normal law with zero mean and some variance σ² > 0.

Let X_1,…,X_n be a sample with d.f. G. Based on this characterization, we can test the main hypothesis H₀, stating that G is the d.f. of the normal law F_σ(x) = Φ(x/σ), against the alternative H₁, stating that G ≠ F_σ. The usual empirical d.f. G_n and the V-statistical d.f.

B_{m,n}(t) = n^{-m} Σ 1{a_1X_{i_1} + … + a_mX_{i_m} < t}

are constructed, where the summation extends over all m-tuples of indices (i_1,…,i_m) from 1 to n.

Hereinafter, the summation symbol denotes summation over all permutations of the indices. Criteria for testing normality can be based on the following statistics:

the integral-type statistic

B_{m,n}^{(1)} = ∫_{−∞}^{∞} [B_{m,n}(t) − G_n(t)] dG_n(t),

and the corresponding Kolmogorov-type statistic

B_{m,n}^{(2)} = sup_t |B_{m,n}(t) − G_n(t)|.

γ̂(k) = (1/T) Σ_{t=1}^{T−k} (x_t − μ̂)(x_{t+k} − μ̂) — the sample estimate of the autocovariance γ(k).

Sample partial autocorrelations are estimates of the partial autocorrelations p_part(τ) of a random process, constructed from the available realization of a time series.

Gaussian white noise process is a white noise process whose one-dimensional distributions are normal distributions with zero mathematical expectation.

Gaussian random process (Gaussian process) - a random process for which, for any integer m > 0 and any set of times t_1 < t_2 < … < t_m, the joint distributions of the random variables X_{t_1},…,X_{t_m} are m-dimensional normal distributions.

Innovation is the current value of the random error on the right-hand side of the relation defining the autoregressive process X_t. The innovation is not correlated with the lagged values X_{t−k}, k = 1, 2, …. Consecutive values of the innovations (the innovation sequence) form a white noise process.

Akaike information criterion (AIC) is one of the criteria for selecting the "best" model among several alternative models. Among the alternative values of the order of the autoregressive model, the value k is selected that minimizes

AIC(k) = ln σ̂²_k + 2k/T,

where T is the number of observations and σ̂²_k is the estimate of the variance of the innovations ε_t in the AR model of order k.

The Akaike criterion asymptotically overestimates the true order k₀ with non-zero probability.

The Hannan-Quinn information criterion (HQC) is one of the criteria for selecting the "best" model among several alternative models. Among the alternative values of the order of the autoregressive model, the value k is selected that minimizes

HQ(k) = ln σ̂²_k + 2k ln(ln T)/T,

where T is the number of observations and σ̂²_k is the estimate of the variance of the innovations ε_t in the AR model of order k.

The criterion converges fairly rapidly to the true value k₀ as T → ∞. However, for small values of T this criterion underestimates the order of the autoregression.

The Schwarz information criterion (SIC) is one of the criteria for selecting the "best" model among several alternative models. Among the alternative values of the order of the autoregressive model, the value k is selected that minimizes

SIC(k) = ln σ̂²_k + k ln T/T,

where T is the number of observations and σ̂²_k is the estimate of the variance of the innovations ε_t in the AR model of order k.
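A small sketch comparing the three criteria on a simulated AR(2) series; the helper name and the OLS fitting scheme are our choices, while the formulas are those given above.

```python
import numpy as np

def ar_order_criteria(x, k_max=8):
    """For each AR order k, fit by OLS and compute AIC, HQ and SIC
    from the innovation-variance estimate s2_k."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    rows = []
    for k in range(1, k_max + 1):
        Y = x[k:]
        X = np.column_stack([np.ones(T - k)] +
                            [x[k - j: T - j] for j in range(1, k + 1)])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        s2 = np.mean((Y - X @ beta) ** 2)
        rows.append((k,
                     np.log(s2) + 2 * k / T,                    # AIC
                     np.log(s2) + 2 * k * np.log(np.log(T)) / T,  # HQ
                     np.log(s2) + k * np.log(T) / T))            # SIC
    return rows

# simulate an AR(2) and pick the order minimizing each criterion
rng = np.random.default_rng(3)
e = rng.normal(size=600)
x = np.zeros(600)
for t in range(2, 600):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + e[t]
rows = ar_order_criteria(x)
for name, idx in [("AIC", 1), ("HQ", 2), ("SIC", 3)]:
    print(name, min(rows, key=lambda r: r[idx])[0])
```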

Correlogram - for a stationary series: a plot of the autocorrelations ρ(τ) of the stationary series against τ. A correlogram is also the name given to the pair of plots provided in the data analysis protocols of various statistical packages: the plot of the sample autocorrelation function and the plot of the sample partial autocorrelation function. Together these two plots help to identify the ARMA model generating the available set of observations.

Backcasting is a technique for obtaining a more accurate approximation of the conditional likelihood function when estimating a moving average model MA(q):

X_t = ε_t + b_1ε_{t−1} + b_2ε_{t−2} + … + b_qε_{t−q}, b_q ≠ 0,

from observations x_1,…,x_T. The result of maximizing (over b_1,…,b_q) the conditional likelihood function corresponding to the observed values x_1,…,x_T for fixed values of ε_0, ε_{−1},…,ε_{−q+1} depends on the chosen values of ε_0, ε_{−1},…,ε_{−q+1}. If the MA(q) process is invertible, one can set ε_0 = ε_{−1} = … = ε_{−q+1} = 0. But to improve the quality of estimation, the backcasting (reverse forecast) method can be used to "estimate" the values ε_0, ε_{−1},…,ε_{−q+1} and use these estimates in the conditional likelihood function.

Lag operator (L), back-shift operator - the operator defined by the relation LX_t = X_{t−1}. It is convenient for the compact recording of time series models and for formulating conditions that ensure certain properties of a series. For example, using this operator, the equation defining the ARMA(p, q) model,

X_t = Σ_{j=1}^{p} a_j X_{t−j} + ε_t + Σ_{j=1}^{q} b_j ε_{t−j}, a_p ≠ 0, b_q ≠ 0,

can be written as a(L)X_t = b(L)ε_t, where

a(L) = 1 − (a_1L + a_2L² + … + a_pL^p),

b(L) = 1 + b_1L + b_2L² + … + b_qL^q.
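For illustration, a minimal simulation of an ARMA(p, q) process written directly from the lag-polynomial form above; the burn-in length and helper name are our choices.

```python
import numpy as np

def arma_simulate(a, b, T, rng):
    """Simulate a(L) X_t = b(L) eps_t with
    a(L) = 1 - a1*L - ... - ap*L^p,  b(L) = 1 + b1*L + ... + bq*L^q."""
    p, q = len(a), len(b)
    burn = 200                      # discard a burn-in to wash out X_0 = 0
    eps = rng.normal(size=T + burn)
    x = np.zeros(T + burn)
    for t in range(max(p, q), T + burn):
        x[t] = (sum(a[j] * x[t - 1 - j] for j in range(p))
                + eps[t]
                + sum(b[j] * eps[t - 1 - j] for j in range(q)))
    return x[burn:]

rng = np.random.default_rng(4)
x = arma_simulate([0.6], [0.4], 500, rng)   # ARMA(1,1)
print(x.mean(), x.std())
```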

The problem of common factors is the presence of common factors in the polynomials a(L) and b(L) corresponding to the AR and MA components of the ARMA model. The presence of common factors in the specification of an ARMA model makes it difficult to identify the model in practice from a series of observations.

A first-order autoregressive process (AR(1)) is a random process, the current value of which is the sum of a linear function of the process value lagged by one step and a random error that is not correlated with past process values. In this case, a sequence of random errors forms a white noise process.

An autoregressive process of order p (pth-order autoregressive process - AR(p)) is a random process, the current value of which is the sum of a linear function of process values ​​lagged by p steps or less and a random error not correlated with past process values. In this case, a sequence of random errors forms a white noise process.

A moving average process of order q (qth-order moving average process - MA(q)) is a random process whose current value is a linear function of the current value of some white noise process and of the values of this white noise process lagged by q steps or less.

Wold's decomposition is a representation of a wide-sense stationary process with zero mathematical expectation as the sum of a moving average process of infinite order and a linearly deterministic process.

Seasonal autoregression of the first order (SAR(1) - first-order seasonal autoregression) is a random process whose current value is a linear function of the value of the process lagged S steps plus a random error not correlated with past values of the process. The sequence of random errors forms a white noise process. Here S = 4 for quarterly data and S = 12 for monthly data.

Seasonal moving average of the first order (SMA(1) - first-order seasonal moving average) is a random process whose current value is the sum of a linear function of the current value of some white noise process and of the value of this white noise process lagged S steps. Here S = 4 for quarterly data and S = 12 for monthly data.

The Yule-Walker equations are a system of equations that connects the autocorrelations of a stationary autoregressive process of order p with its coefficients. The system allows one to find the autocorrelation values successively and makes it possible, using the first p equations, to express the coefficients of a stationary autoregressive process through the first p autocorrelations, which can be used directly when fitting an autoregressive model to real statistical data.
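A sketch of Yule-Walker estimation for an AR(p) process (numpy-only; the function name is ours):

```python
import numpy as np

def yule_walker(x, p):
    """Estimate AR(p) coefficients from the first p sample
    autocorrelations by solving the Yule-Walker system."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    gamma = np.array([x[:T - k] @ x[k:] / T for k in range(p + 1)])
    rho = gamma / gamma[0]                       # sample autocorrelations
    R = np.array([[rho[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, rho[1:p + 1])

rng = np.random.default_rng(5)
e = rng.normal(size=2000)
x = np.zeros(2000)
for t in range(2, 2000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + e[t]
print(yule_walker(x, 2))   # should be close to [0.5, -0.3]
```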

A random process with discrete time (discrete-time stochastic process, discrete-time random process) is a sequence of random variables corresponding to observations made at successive moments in time, having a certain probabilistic structure.

A mixed autoregressive moving average process, an autoregressive process with moving average residuals (autoregressive moving average - ARMA(p, q)), is a random process whose current value is the sum of a linear function of the process values lagged by p steps or less, and a linear function of the current value of some white noise process and the values of this white noise process lagged by q steps or less.

Box-Pierce Q-statistic - one of the variants of the Q-statistic:

Q = T Σ_{k=1}^{m} r²(k).

Ljung-Box Q-statistic - another variant of the Q-statistic, preferable to the Box-Pierce statistic:

Q = T(T + 2) Σ_{k=1}^{m} r²(k)/(T − k),

where T is the number of observations and r(k) are the sample autocorrelations.

They are used to test the hypothesis that the observed data are a realization of a white noise process.
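A minimal implementation of both Q-statistics from the formulas above (function name ours):

```python
import numpy as np

def q_statistics(x, m=10):
    """Box-Pierce Q = T * sum r(k)^2 and
    Ljung-Box  Q = T(T+2) * sum r(k)^2 / (T-k), k = 1..m.
    Under the white-noise hypothesis both are approx. chi^2(m)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    gamma0 = x @ x / T
    r = np.array([(x[:T - k] @ x[k:] / T) / gamma0 for k in range(1, m + 1)])
    bp = T * np.sum(r**2)
    lb = T * (T + 2) * np.sum(r**2 / (T - np.arange(1, m + 1)))
    return bp, lb

rng = np.random.default_rng(6)
print(q_statistics(rng.normal(size=500)))   # white noise: both near m = 10
```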

A wide-sense stationary (weakly stationary, second-order stationary, covariance-stationary) random process is a random process with constant mathematical expectation, constant variance, and covariances of the random variables X_t, X_{t+τ} that depend only on the lag τ:

Cov(X_t, X_{t+τ}) = γ(τ).

A strictly stationary (stationary in the narrow sense, strict-sense stationary) random process is a random process for which the joint distributions of the random variables X_{t_1+τ},…,X_{t_m+τ} are invariant in τ.

Invertibility condition for MA(q) and ARMA(p, q) processes - for processes X_t of the form MA(q): X_t = b(L)ε_t or ARMA(p, q): a(L)(X_t − μ) = b(L)ε_t, a condition on the roots of the equation b(z) = 0 ensuring the existence of an equivalent representation of the process X_t as an autoregressive process of infinite order AR(∞).

Invertibility condition: all roots of the equation b(z) = 0 lie outside the unit circle |z| = 1.

Stationarity condition for AR(p) and ARMA(p, q) processes - for processes X_t of the form AR(p): a(L)(X_t − μ) = ε_t or ARMA(p, q): a(L)(X_t − μ) = b(L)ε_t, a condition on the roots of the equation a(z) = 0 ensuring the stationarity of the process X_t. Stationarity condition: all roots of the equation a(z) = 0 lie outside the unit circle |z| = 1. If the polynomials a(z) and b(z) have no common roots, this condition is necessary and sufficient for the stationarity of the process X_t.
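The root condition is easy to check numerically; a small sketch (helper name ours):

```python
import numpy as np

def is_stationary_ar(a):
    """Check the stationarity condition for AR(p) with
    a(L) = 1 - a1*L - ... - ap*L^p: all roots of a(z) = 0 must lie
    outside the unit circle."""
    # numpy's roots() wants highest degree first: -ap, ..., -a1, 1
    coeffs = np.concatenate([-np.asarray(a, dtype=float)[::-1], [1.0]])
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(is_stationary_ar([0.5, -0.3]))  # True: stationary AR(2)
print(is_stationary_ar([1.0]))        # False: unit root (random walk)
```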

Partial autocorrelation function (PACF) - for a stationary series, the sequence of partial autocorrelations p_part(τ), τ = 0, 1, 2, ….

Partial autocorrelation (PAC) - for a stationary series, the value p_part(k) of the correlation coefficient between the random variables X_t and X_{t+k}, cleared of the influence of the intermediate random variables X_{t+1},…,X_{t+k−1}.

Model diagnostic checking stage - diagnostics of the estimated ARMA model, selected based on the available series of observations.

Model identification stage - selection of a series generation model based on the available series of observations; determination of the orders p and q of the ARMA model.

Model evaluation stage (estimation stage) - estimation of the coefficients of the ARMA model, selected based on the available series of observations.

Q-statistics - test statistics used to test the hypothesis that the observed data are a realization of a white noise process.

To section 8

Vector autoregression of order p (pth-order vector autoregression - VAR(p)) is a model for generating a group of time series in which the current value of each series is the sum of a constant component, linear combinations of the lagged (up to order p) values of this series and of the other series, and a random error. The random errors in each equation are not correlated with the lagged values of all the series under consideration. The random vectors formed by the errors of the different series at the same moment are independent, identically distributed random vectors with zero means.

A long-run relationship is a relationship established over time between variables, around which relatively rapid oscillations occur.

Long-run multipliers (equilibrium multipliers) - in a dynamic model with autoregressive distributed lags, the coefficients c_1,…,c_s of the long-run dependence of the variable y_t on the exogenous variables x_{1t},…,x_{st}. The coefficient c_j reflects the change in the value of y_t when the current and all previous values of the variable x_{jt} change by one.

Impulse multipliers (impact multipliers, short-run multipliers) - in a dynamic model with autoregressive distributed lags, values showing the influence of one-time (impulse) changes in the values of the exogenous variables x_{1t},…,x_{st} on the current and subsequent values of the variable y_t.

Cross-covariances are covariances between the values of different components of a vector series at coinciding or different moments in time.

The cross-covariance function is the sequence of cross-covariances of two components of a stationary vector series.

Autoregressive distributed lag models (ADL) are models in which the current value of the explained variable is the sum of a linear function of several lagged values of this variable, linear combinations of the current and several lagged values of the explanatory variables, and a random error.

Transfer function is a matrix function that establishes the effect of unit changes in exogenous variables on endogenous variables.

The data generating process (DGP) is a probabilistic model that generates observable statistical data. The process of generating data is usually unknown to the researcher analyzing the data. The exception is situations when the researcher himself chooses the data generation process and obtains artificial statistical data by simulating the selected data generation process.

Statistical model (SM) is the model chosen for evaluation, the structure of which is assumed to correspond to the data generation process. The choice of statistical model is made on the basis of existing economic theory, analysis of available statistical data, and analysis of the results of earlier studies.

A stationary vector (K-dimensional) series (K-dimensional stationary time series) is a sequence of random vectors of dimension K having the same mean vectors and the same covariance matrices, for which the cross-correlations between the value of the k-th component of the series at moment t and the value of the l-th component of the series at moment (t + s) depend only on s.

To section 9

Unit root hypothesis (UR) - a hypothesis formulated within the ARMA(p, q) model a(L)X_t = b(L)ε_t: the hypothesis that the autoregressive polynomial a(L) of the ARMA model has at least one root equal to 1. In this case it is usually assumed that the polynomial a(L) has no roots with modulus less than 1.

Differencing - the transition from the series of levels X_t to the series of differences X_t − X_{t−1}. Successive differencing of a series makes it possible to eliminate a stochastic trend present in the original series.

A series integrated of order k is a series X_t that is not stationary or stationary relative to a deterministic trend (i.e., is not a TS-series) and for which the series obtained by k-fold differencing of X_t is stationary, while the series obtained by (k − 1)-fold differencing of X_t is not a TS-series.

Cointegration relationship is a long-term relationship between several integrated series, characterizing the equilibrium state of the system of these series.

An error-correction model is a combination of short-term and long-term dynamic regression models in the presence of a cointegration relationship between integrated series.

Differencing operator - the operator Δ transforming a series of levels X_t into the series of differences ΔX_t = X_t − X_{t−1}.

An overdifferenced time series is a series obtained by differencing a TS-series. Successive differencing of a TS-series eliminates a deterministic polynomial trend. However, differencing a TS-series has some undesirable consequences for selecting a model from statistical data and for using the selected model to forecast future values of the series.

Difference-stationary, DS-series (DS - difference stationary time series) - integrated series of various orders k = 1, 2, …. They are reduced to a stationary series by single or multiple differencing, but cannot be reduced to a stationary series by subtracting a deterministic trend.

A series of type ARIMA(p, k, q) (ARIMA - autoregressive integrated moving average) is a time series that is reduced to a stationary ARMA(p, q) series by k-fold differencing.

Series stationary relative to a deterministic trend, TS-series (TS - trend-stationary time series), are series that become stationary after a deterministic trend is subtracted from them. The class of such series also includes stationary series without a deterministic trend.

Random walk (random walk process) - a random process whose increments form a white noise process: ΔX_t = ε_t, so X_t = X_{t−1} + ε_t.

Random walk with drift - a random process whose increments are the sum of a constant and a white noise process: ΔX_t = X_t − X_{t−1} = a + ε_t, so X_t = X_{t−1} + a + ε_t. The constant a characterizes the drift of the random walk trajectories, constantly present in the transition to the next moment of time, on which a random component is superimposed.

Stochastic trend - a time series Z_t for which Z_t = ε_1 + ε_2 + … + ε_t. The value of a random walk at moment t is X_t = X_0 + Σ_{s=1}^{t} ε_s, so X_t − X_0 = ε_1 + ε_2 + … + ε_t. In other words, the model of a stochastic trend is a random walk "emerging from the origin" (for it, X_0 = 0).

A shock is a one-time (impulse) change in the innovation.

The Slutsky effect is the appearance of spurious periodicity when differencing a series that is stationary relative to a deterministic trend. For example, if the original series is the sum of a deterministic linear trend and white noise, the differenced series has no deterministic trend but turns out to be autocorrelated.
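The effect is easy to reproduce by simulation; in the short sketch below (the trend slope and noise scale are arbitrary choices), differencing a linear-trend-plus-white-noise series yields a lag-1 autocorrelation near −0.5, the spurious MA(1) pattern mentioned above.

```python
import numpy as np

def lag1_autocorr(x):
    x = x - x.mean()
    return (x[:-1] @ x[1:]) / (x @ x)

rng = np.random.default_rng(7)
t = np.arange(500)
ts_series = 0.1 * t + rng.normal(size=500)   # linear trend + white noise
dx = np.diff(ts_series)                      # differenced TS-series
print(lag1_autocorr(dx))                     # approx. -0.5
```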

TS-hypothesis - the hypothesis that the time series under consideration is stationary or stationary relative to a deterministic trend.

To section 10

Long-run variance - for a series u_t with zero mathematical expectation, defined as the limit

lim_{T→∞} Var(u_1 + … + u_T)/T.

Dickey-Fuller tests are a group of statistical criteria for testing the unit root hypothesis within the framework of models that assume zero or non-zero mathematical expectation of a time series, as well as the possible presence of a deterministic trend in the series.

When applying the Dickey-Fuller criteria, the following statistical models are most often estimated:

Δx_t = α + βt + φx_{t−1} + Σ_{j=1}^{p} θ_j Δx_{t−j} + ε_t, t = p + 1,…,T,

Δx_t = α + φx_{t−1} + Σ_{j=1}^{p} θ_j Δx_{t−j} + ε_t, t = p + 1,…,T,

Δx_t = φx_{t−1} + Σ_{j=1}^{p} θ_j Δx_{t−j} + ε_t, t = p + 1,…,T.

The t-statistic values obtained when estimating these statistical models for testing the hypothesis H₀: φ = 0 are compared with critical values t_crit that depend on the choice of the statistical model. The unit root hypothesis is rejected if t < t_crit.
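A sketch of the Dickey-Fuller t-statistic without augmentation lags (p = 0); the helper name is ours, and the resulting statistic must be compared with Dickey-Fuller, not Student, critical values.

```python
import numpy as np

def df_t_statistic(x, trend=True):
    """OLS Dickey-Fuller regression dx_t = alpha (+ beta*t) + phi*x_{t-1} + e_t;
    returns the t-statistic for H0: phi = 0."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    T = len(dx)
    cols = [np.ones(T), x[:-1]]
    if trend:
        cols.insert(1, np.arange(1, T + 1, dtype=float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ beta
    s2 = resid @ resid / (T - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    i = X.shape[1] - 1                      # phi is the last coefficient
    return beta[i] / np.sqrt(cov[i, i])

rng = np.random.default_rng(8)
rw = np.cumsum(rng.normal(size=400))        # random walk: t-stat near zero
print(df_t_statistic(rw))
ar = np.zeros(400)
e = rng.normal(size=400)
for t in range(1, 400):
    ar[t] = 0.5 * ar[t - 1] + e[t]
print(df_t_statistic(ar))                   # stationary: large negative t-stat
```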

The Kwiatkowski-Phillips-Schmidt-Shin test (KPSS test) is a criterion for distinguishing DS- and TS-series in which the TS-hypothesis is taken as the null.

Leybourne test - a criterion for testing the unit root hypothesis whose statistic is equal to the maximum of the two Dickey-Fuller statistics obtained from the original series and from the time-reversed series.

Perron test - a criterion for testing the null hypothesis that a series belongs to the DS class, generalizing the Dickey-Fuller procedure to situations where, during the observation period, structural changes in the model occur at some moment T_b, in the form of either a level shift (the "crash" model), or a change in the slope of the trend (the "changing growth" model), or a combination of these two changes. The moment T_b is assumed to be determined exogenously, in the sense that it is not chosen on the basis of a visual examination of the series plot, but is associated with the moment of a known large-scale change in the economic situation that significantly affects the behavior of the series in question.

The unit root hypothesis is rejected if the observed value of the t_α test statistic is below the critical level. The asymptotic distributions and critical values for the t_α statistic originally given by Perron are valid for models with innovation outliers.

Phillips-Perron test - a criterion that reduces testing the hypothesis that the series x_t belongs to the class of DS-series to testing the hypothesis H₀: φ = 0 within the statistical model

SM: Δx_t = α + βt + φx_{t−1} + u_t, t = 2,…,T,

where, as in the Dickey-Fuller criterion, the parameters α and β may be taken equal to zero.

However, unlike the Dickey-Fuller criterion, a wider class of time series is admitted for consideration.

The criterion is based on the t-statistic for testing the hypothesis H₀: φ = 0, but uses a variant Z_t of this statistic corrected for possible autocorrelation and heteroskedasticity of the series u_t.

Schmidt-Phillips test - a criterion for testing the unit root hypothesis within the model

x_t = ψ + ξt + w_t, where w_t = βw_{t−1} + ε_t, t = 2,…,T;

ψ is a parameter representing the level; ξ is a parameter representing the trend.

The DF-GLS test is a criterion that is asymptotically more powerful than the Dickey-Fuller criterion.

Kurtosis is a coefficient characterizing the peakedness of a distribution.

An additive outlier model is a model in which, after passing the break date T_b, the series y_t immediately begins to oscillate around a new level (or a new trend line).

An innovation outlier model is a model in which, after passing the break date T_b, the process y_t only gradually reaches a new level (or a new trend line), around which the series trajectory begins to oscillate.

Multivariate procedure for testing the unit root hypothesis (Dolado, Jenkinson, Sosvilla-Rivero) - a formalized procedure for using the Dickey-Fuller criteria with a sequential check of the possibility of reducing the original statistical model, which is taken to be

Δx_t = α + βt + φx_{t−1} + Σ_{j=1}^{p} θ_j Δx_{t−j} + ε_t, t = p + 1,…,T.

A prerequisite for using the formalized multivariate procedure is the low power of unit root tests. The multivariate procedure therefore involves repeated tests of the unit root hypothesis in simpler models with fewer parameters to estimate. This increases the probability of correctly rejecting the unit root hypothesis, but is accompanied by a loss of control over the significance level of the procedure.

Generalized Perron test - an unconditional criterion proposed by Zivot and Andrews (for the innovation outlier setting), in which the break date is selected in "automatic mode" by searching over all possible dating options and computing, for each option, the t-statistic t_α for testing the unit root hypothesis; the estimated break date is taken to be the one for which the value of t_α is minimal.

Cochrane procedure (variance ratio test) - a procedure for distinguishing TS- and DS-series, based on the specific behavior, for these series, of the ratio VR_k = V_k/V_1, where V_k = (1/k)·D(X_t − X_{t−k}).
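A short sketch of the variance ratio (function name ours): for a random walk VR_k stays near 1, while for a TS-series it decays towards 0 as k grows.

```python
import numpy as np

def variance_ratio(x, k):
    """Cochrane variance ratio VR_k = V_k / V_1,
    with V_k = (1/k) * Var(X_t - X_{t-k})."""
    x = np.asarray(x, dtype=float)
    vk = np.var(x[k:] - x[:-k]) / k
    v1 = np.var(x[1:] - x[:-1])
    return vk / v1

rng = np.random.default_rng(9)
rw = np.cumsum(rng.normal(size=2000))                  # DS-series
ts = 0.01 * np.arange(2000) + rng.normal(size=2000)    # TS-series
print(variance_ratio(rw, 20), variance_ratio(ts, 20))  # ~1 vs ~0.05
```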

Standard Brownian motion - a continuous-time random process W(r) that is a continuous analogue of the discrete random walk. It is a process for which:

the increments W(r_2) − W(r_1),…, W(r_k) − W(r_{k−1}) are jointly independent if 0 ≤ r_1 < r_2 < … < r_k, and W(s) − W(r) ~ N(0, s − r) for s > r;

the realizations of the process W(r) are continuous with probability 1.
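A minimal simulation of a standard Brownian path by scaled partial sums of i.i.d. N(0, 1) increments (the discretization level is an arbitrary choice):

```python
import numpy as np

def brownian_path(n_steps, rng):
    """Approximate standard Brownian motion on [0, 1]:
    W(j/n) = (e_1 + ... + e_j) / sqrt(n), e_i ~ N(0, 1)."""
    increments = rng.normal(size=n_steps) / np.sqrt(n_steps)
    return np.concatenate([[0.0], np.cumsum(increments)])  # W(0) = 0

rng = np.random.default_rng(10)
w = brownian_path(10_000, rng)
print(w[-1], w[5000])   # W(1) ~ N(0, 1), W(0.5) ~ N(0, 0.5)
```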

Window size is the number of sample autocovariances of the series used in the Newey-West estimator of the long-run variance of the series. Too small a window width leads to deviations from the nominal size of the test (significance level), while increasing the window width in order to avoid such deviations reduces the power of the test.

Two-dimensional Gaussian white noise is a sequence of independent, identically distributed random vectors having a two-dimensional normal distribution with zero mathematical expectation.

Deterministic cointegration is the existence, for a group of integrated series, of a linear combination of them that cancels both the stochastic and the deterministic trends. The series represented by this linear combination is stationary.

Identification of the cointegrating vectors is the selection of a basis for the cointegrating space, consisting of cointegrating vectors that have a reasonable economic interpretation.

Cointegrating space is the set of all possible cointegrating vectors for a cointegrating system of series.

Cointegrated time series (cointegrated in the narrow sense) are a group of time series for which there exists a non-trivial linear combination of these series that is a stationary series.

Cointegrating vector is a vector of coefficients of a nontrivial linear combination of several series, which is a stationary series.

The maximum eigenvalue test is a criterion that, in the Johansen procedure for estimating the cointegration rank r of a system of integrated (of order 1) series, is used to test the hypothesis H₀: r = r* against the alternative hypothesis H_A: r = r* + 1.

Trace test - a criterion that, in the Johansen procedure for estimating the cointegration rank r of a system of integrated (of order 1) series, is used to test the hypothesis H₀: r = r* against the alternative hypothesis H_A: r > r*.

Common trends are a group of series that govern the stochastic nonstationarity of a system of cointegrated series.

Granger causality is the fact that the quality of the forecast of the value y_t of a variable Y at moment t, based on the set of all past values of this variable, improves when the past values of some other variable are taken into account.

Five situations in the Johansen procedure - five situations on which the critical values of the likelihood ratio statistics used in the Johansen procedure for estimating the cointegration rank of a system of integrated (of order 1) series depend:

H₂(r): there are no deterministic trends in the data; neither a constant nor a trend is included in the CE;

H₁*(r): there are no deterministic trends in the data; a constant is included in the CE, but not a trend;

H₁(r): the data contain a deterministic linear trend; a constant is included in the CE, but not a trend;

H*(r): the data contain a deterministic linear trend; a constant and a linear trend are included in the CE;

H(r): the data contain a deterministic quadratic trend; a constant and a linear trend are included in the CE.

(Here CE is the cointegration equation.)

For a fixed rank r, the five situations listed form a chain of nested hypotheses:

H₂(r) ⊂ H₁*(r) ⊂ H₁(r) ⊂ H*(r) ⊂ H(r).

This makes it possible, using the likelihood ratio criterion, to test the fulfillment of the hypothesis located to the left in this chain within the framework of the hypothesis located immediately to the right.

Cointegrating rank is the maximum number of linearly independent cointegrating vectors for a given group of series, the rank of the cointegrating space.

Stochastic cointegration is the existence for a group of integrated series of a linear combination that cancels the stochastic trend. The series represented by this linear combination does not contain a stochastic trend, but may have a deterministic trend.

Phillips' triangular system is a representation of an N-dimensional system of cointegrated series with cointegration rank r in the form of a system of equations, the first r of which describe the dependence of r selected variables on the remaining (N − r) variables (the common trends), while the remaining equations describe models generating the common trends.

N-dimensional Gaussian white noise is a sequence of independent, identically distributed random vectors having an N-dimensional normal distribution with zero mathematical expectation.


"asymptotically optimal" in books

Optimal Visual Contrast (OVC)

From the book Color and Contrast. Technology and creative choice author Zheleznyakov Valentin Nikolaevich

Optimal Visual Contrast (OVC) Imagine a black suit illuminated by the sun and a white shirt illuminated by the moon. If we measure their brightness with an instrument, it turns out that under these conditions a black suit is many times brighter than a white shirt, and yet we know that

What is the optimal scale?

From the book Twitonomics. Everything you need to know about economics, short and to the point by Compton Nick

What is the optimal scale? The author of the concept of optimal scale is the German-British philosopher Fritz Schumacher, author of the book “Less is Better: Economics as Human Essence.” He said that the capitalist tendency towards “gigantism” is not only

8.4.2. Optimal growth path

From book Economic theory: textbook author Makhovikova Galina Afanasyevna

8.4.2. Optimal growth path Let us assume that resource prices remain unchanged, while the enterprise budget is constantly growing. By connecting the tangent points of isoquants with isocosts, we get line 0G - “development path” (growth path). This line shows the growth rate of the ratio

The best option

From the book USSR: from ruin to world power. Soviet breakthrough by Boffa Giuseppe

The best option In the fire of battles in 1928, the first five-year plan was born. Beginning in 1926, two institutions, Gosplan and VSNKh, prepared various draft plans one after another. Their development was accompanied by continuous discussions. As one scheme

OPTIMAL OPTION

From the book Russian Rock. Small encyclopedia author Bushueva Svetlana

Optimal

From the book Big Soviet Encyclopedia(OP) of the author TSB

Optimal order

From the book CSS3 for Web Designers by Siderholm Dan

Optimal Order When using browser prefixes, it is important to be mindful of the order in which properties are listed. You may notice that in the previous example the prefix properties are written first, followed by the unprefixed property. Why put the genuine

Optimal person

From the book Computerra Magazine No. 40 dated October 31, 2006 author Computerra magazine

An optimal person Author: Vladimir Guriev Some topics that were popular some forty years ago today seem so marginal that they are almost not discussed seriously. At the same time - judging by the tone of the articles in popular magazines - they seemed relevant and even

The best option

From the book Stalin's First Strike 1941 [Collection] author Kremlev Sergey

Optimal option Analysis of possible scenarios for the development of events inevitably makes one think about choosing the optimal option. It cannot be said that the various “summer” options, that is, alternatives tied to May-June - July 1941, inspire optimism. No, they

The best option

From the book The Great Patriotic Alternative author Isaev Alexey Valerievich

Optimal option Analysis of possible scenarios for the development of events inevitably makes one think about choosing the optimal option. It cannot be said that the various “summer” options, i.e. alternatives tied to May - June - July 1941, inspire optimism. No, they

Optimal control

From the book Self-esteem in children and adolescents. Book for parents by Eyestad Gyru

Optimal control What does it mean to hold moderately tightly? You must determine this yourself, based on your knowledge of your own child and the conditions of the environment in which you live. In most cases, parents of teenagers try to protect their children from smoking, drinking alcohol,

Optimal way

From the book The Perfectionist Paradox by Ben-Shahar Tal

The Optimal Path We are constantly bombarded by perfection. Adonis graces the cover of Men’s Health, Elena the Beautiful graces the cover of Vogue; women and men on the vast screen, in an hour or two, resolve their conflicts, act out an ideal plot, give themselves to ideal love. We've all heard

Optimal approach

From the book Expert No. 07 (2013) author's Expert Magazine

Optimal approach Sergey Kostyaev, candidate of political sciences, senior Researcher INION RAS The US Department of Defense spent a billion dollars on a non-working computer program Photo: EPA From March 1, Pentagon spending is likely to be reduced by 43 billion

The best option

From the book Two Seasons author Arsenyev L

Optimal option - Tell me, is it wise to play on several fronts at once? - journalists asked Bazilevich and Lobanovsky at the very beginning of the ’75 season. “It’s unreasonable, of course,” they answered. - But it is necessary. We believe that it is imperative to differentiate the significance

Optimal control

From the book Managing Personal (Family) Finances. Systems approach author Steinbock Mikhail

Optimal control >> With optimal control, we divide all expenses into two large groups: – “ordinary” – regular expenses, – one-time or non-standard expenses. Optimal control can only be used after several months of detailed control.

mob_info