Homework 6

Generalized Empirical Likelihood

(Due 4/2)

You need to type and email me your answers.

In an effort to improve the small sample properties of GMM, a number of alternative estimators have been suggested. These include the empirical likelihood (EL) estimator, the continuous updating estimator (CUE) and the exponential tilting (ET) estimator. All of these estimators and GMM have the same asymptotic distribution but different higher-order asymptotic properties.

EL has two theoretical advantages. First, its asymptotic bias does not grow with the number of moment restrictions, while the bias of the others often does. Consequently, for large numbers of moment conditions the bias of EL will be less than the bias of the other estimators. This property is important in econometrics, where many moment conditions are often used. For example, Hansen and Singleton (1982) and Arellano and Bond (1991) both use quite large numbers of moment conditions. The relatively low asymptotic bias of EL indicates that it is an important alternative to GMM in such applications. The second theoretical advantage of EL is that, after it is bias corrected using probabilities obtained from EL, it is higher-order efficient relative to the other estimators. This property has a simple explanation. When the data are discrete, having finite support, EL is equal to the MLE (can you show it?). Consequently, for discrete data EL inherits the well-known higher-order efficiency of MLE. Then, because discrete distributions can be used to approximate moments of a continuous distribution, the efficiency of EL for the discrete case leads to efficiency in general.

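Let $z_{i}$ $(i=1,\ldots ,n)$ be iid observations on a data vector $z$. Also let $\beta $ be a $p\times 1$ parameter vector and $g(z,\beta )$ be an $m\times 1$ vector of functions of the data observation $z$ and the parameter, where $m\geq p.$ The model has a true parameter $\beta _{0}$ satisfying the moment condition
$$E\left[ g(z,\beta _{0})\right] =0,$$
where $E\left[ \cdot \right] $ denotes expectation taken with respect to the distribution of $z_{i}.$ An important estimator of $\beta $ is the two-step GMM estimator of Hansen (1982). To describe it, let
$$g_{i}(\beta )=g(z_{i},\beta ),$$

$$\widehat{g}(\beta )=\frac{1}{n}\sum_{i=1}^{n}g_{i}(\beta ),$$
and
$$\widehat{\Omega }(\beta )=\frac{1}{n}\sum_{i=1}^{n}g_{i}(\beta )g_{i}(\beta )^{\prime }.$$
Also, let $\widetilde{\beta }$ be some preliminary estimator, given by
$$\widetilde{\beta }=\arg \min_{\beta \in B}\widehat{g}(\beta )^{\prime }\widehat{W}^{-1}\widehat{g}(\beta ),$$
where $B$ denotes the parameter space and $\widehat{W}$ is a random matrix with properties to be specified below. The GMM estimator is
$$\widehat{\beta }_{GMM}=\arg \min_{\beta \in B}\widehat{g}(\beta )^{\prime }\widehat{\Omega }(\widetilde{\beta })^{-1}\widehat{g}(\beta ).$$
The alternatives to GMM we consider are generalized empirical likelihood (GEL) estimators. To describe GEL let $\rho (v)$ be a function of a scalar $v$ that is concave on its domain, an open interval $V$ containing zero. The estimator is the solution to a saddle point problem
$$\widehat{\beta }_{GEL}=\arg \min_{\beta \in B}\sup_{\lambda \in \widehat{\Lambda }_{n}(\beta )}\sum_{i=1}^{n}\rho (\lambda ^{\prime }g_{i}(\beta )),$$
where $\widehat{\Lambda }_{n}(\beta )=\{\lambda :\lambda ^{\prime }g_{i}(\beta )\in V,\ i=1,\ldots ,n\}.$ The EL estimator is a special case with
$$\rho (v)=\ln (1-v)$$
and $V=(-\infty ,1).$ The exponential tilting (ET) estimator is a special case with
$$\rho (v)=-e^{v}.$$
The CUE is analogous to GMM except that the objective function is simultaneously minimized over $\beta $ in $\widehat{\Omega }(\beta )^{-1}.$ It is given by
$$\widehat{\beta }_{CUE}=\arg \min_{\beta \in B}\widehat{g}(\beta )^{\prime }\widehat{\Omega }(\beta )^{-}\widehat{g}(\beta ),$$
where $A^{-}$ denotes any generalized inverse of a matrix $A$, satisfying $AA^{-}A=A.$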
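
The two-step GMM and CUE definitions above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the procedure of any cited paper. The toy model $z_{i}\sim N(\beta _{0},1)$ with $g(z,\beta )=(z-\beta ,\ z^{2}-\beta ^{2}-1)^{\prime }$ (so $m=2,$ $p=1$), the choice $\widehat{W}=I$ for the preliminary step, and the grid search used in place of a numerical optimizer are all choices made here for illustration, and every function name is hypothetical. The code takes $\widehat{g}(\beta )$ and $\widehat{\Omega }(\beta )$ to be the sample averages of $g_{i}(\beta )$ and $g_{i}(\beta )g_{i}(\beta )^{\prime }.$

```python
import numpy as np

def moments(z, beta):
    """Return the n x m matrix whose i-th row is g_i(beta)' (toy model, m = 2)."""
    return np.column_stack([z - beta, z**2 - beta**2 - 1.0])

def g_hat(z, beta):
    """Sample moment vector: the average of g_i(beta) over i."""
    return moments(z, beta).mean(axis=0)

def omega_hat(z, beta):
    """Sample second-moment matrix: the average of g_i(beta) g_i(beta)'."""
    G = moments(z, beta)
    return G.T @ G / len(z)

def quad_form(g, W_inv):
    """Quadratic form g' W_inv g."""
    return float(g @ W_inv @ g)

def argmin_on_grid(objective, grid):
    """Grid search, standing in for a numerical optimizer (an assumption)."""
    values = np.array([objective(b) for b in grid])
    return float(grid[np.argmin(values)])

def two_step_gmm(z, grid):
    # Step 1: preliminary estimator with W-hat = identity (an assumption).
    beta_tilde = argmin_on_grid(
        lambda b: quad_form(g_hat(z, b), np.eye(2)), grid)
    # Step 2: weight by the inverse of Omega-hat at the preliminary estimate.
    W_inv = np.linalg.inv(omega_hat(z, beta_tilde))
    return argmin_on_grid(lambda b: quad_form(g_hat(z, b), W_inv), grid)

def cue(z, grid):
    # CUE: the weighting matrix is re-evaluated at every candidate beta.
    # pinv supplies one particular generalized inverse.
    return argmin_on_grid(
        lambda b: quad_form(g_hat(z, b), np.linalg.pinv(omega_hat(z, b))),
        grid)

rng = np.random.default_rng(0)
beta0 = 1.0                                  # hypothetical true value
z = rng.normal(loc=beta0, scale=1.0, size=500)
grid = np.linspace(0.0, 2.0, 401)

beta_gmm = two_step_gmm(z, grid)
beta_cue = cue(z, grid)
print("two-step GMM:", beta_gmm, "CUE:", beta_cue)
```

The only difference between the two estimators in this sketch is where the weighting matrix is evaluated: once at $\widetilde{\beta }$ for GMM, and at every candidate $\beta $ for the CUE.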

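The following result shows that if $\rho (v)$ is quadratic then $\widehat{\beta }_{GEL}=\widehat{\beta }_{CUE}.$ Let
$$G=\left( g_{1}(\beta ),\ldots ,g_{n}(\beta )\right) ^{\prime }$$
and $\iota $ be an $n$-vector of ones. Thus,
$$\widehat{g}(\beta )=G^{\prime }\iota /n$$
and
$$\widehat{\Omega }(\beta )=G^{\prime }G/n.$$
By Rao (1973), $G(G^{\prime }G)^{-}G^{\prime }$ is invariant to the generalized inverse (ginv) and $G^{\prime }G(G^{\prime }G)^{-}G^{\prime }=G^{\prime }$ for any ginv. Then the CUE objective function
$$\widehat{g}(\beta )^{\prime }\widehat{\Omega }(\beta )^{-}\widehat{g}(\beta )=\iota ^{\prime }G(G^{\prime }G)^{-}G^{\prime }\iota /n$$
is invariant to ginv.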
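
The invariance claims can be checked numerically. The sketch below is only an illustration. The matrix $G$ is simulated with a deliberately redundant column so that $G^{\prime }G$ is singular, and a second generalized inverse is built from the Moore-Penrose inverse $A^{+}$ as $A^{+}+U-A^{+}AUAA^{+}$ for an arbitrary matrix $U,$ a construction that satisfies $AA^{-}A=A$ by direct multiplication. The variable names and the tolerance `rcond=1e-10` are choices made here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
G = rng.normal(size=(n, m))
G[:, 2] = G[:, 0] + G[:, 1]          # redundant column, so G'G is singular
A = G.T @ G

# One generalized inverse: the Moore-Penrose inverse (explicit cutoff so the
# numerically zero singular value is not inverted).
A_plus = np.linalg.pinv(A, rcond=1e-10)

# A second generalized inverse, built from an arbitrary matrix U.
U = rng.normal(size=(m, m))
A_ginv = A_plus + U - A_plus @ A @ U @ A @ A_plus

is_ginv = np.allclose(A @ A_ginv @ A, A)            # A A^- A = A
differs = not np.allclose(A_ginv, A_plus)           # genuinely another ginv
proj_invariant = np.allclose(G @ A_plus @ G.T, G @ A_ginv @ G.T)
reproduces_G = np.allclose(A @ A_ginv @ G.T, G.T)   # G'G (G'G)^- G' = G'

# CUE objective written as iota' G (G'G)^- G' iota / n under both inverses.
iota = np.ones(n)
cue_plus = float(iota @ G @ A_plus @ G.T @ iota / n)
cue_ginv = float(iota @ G @ A_ginv @ G.T @ iota / n)
print(is_ginv, differs, proj_invariant, reproduces_G, cue_plus, cue_ginv)
```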

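  1. Show that if $\rho (v)$ is quadratic, then the second-order Taylor expansion of $\widehat{P}(\beta ,\lambda )$ in $\lambda $ about zero is exact, where $\widehat{P}(\beta ,\lambda )=\sum_{i=1}^{n}\rho (\lambda ^{\prime }g_{i}(\beta ))/n.$ That is, with $\rho _{0}=\rho (0)$ and the first and second derivatives of $\rho $ at zero normalized to $\rho _{1}=\rho _{2}=-1,$
    $$\widehat{P}(\beta ,\lambda )=\rho _{0}-\widehat{g}(\beta )^{\prime }\lambda -\frac{1}{2}\lambda ^{\prime }\widehat{\Omega }(\beta )\lambda .$$

By concavity of $\widehat{P}(\beta ,\lambda )$ in $\lambda ,$ any solution $\widehat{\lambda }(\beta )$ to the first-order conditions
$$0=\widehat{g}(\beta )+\widehat{\Omega }(\beta )\lambda $$
will maximize $\widehat{P}(\beta ,\lambda )$ with respect to $\lambda $ holding $\beta $ fixed. Then
$$\widehat{\Omega }(\beta )\widehat{\Omega }(\beta )^{-}\widehat{g}(\beta )=\widehat{g}(\beta ),$$
so that
$$\widehat{\lambda }(\beta )=-\widehat{\Omega }(\beta )^{-}\widehat{g}(\beta )$$
solves the first-order conditions. Since
$$\widehat{P}(\beta ,\widehat{\lambda }(\beta ))=\rho _{0}+\frac{1}{2}\widehat{g}(\beta )^{\prime }\widehat{\Omega }(\beta )^{-}\widehat{g}(\beta ),$$
the GEL objective function $\widehat{P}(\beta ,\widehat{\lambda }(\beta ))$ is a monotonic increasing transformation of the CUE objective function, so that the set of GEL estimators coincides with the set of CUE estimators.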
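
The link between quadratic GEL and the CUE can be checked numerically at a fixed $\beta .$ The sketch below is an illustration under assumptions made here. It uses the quadratic $\rho (v)=-v-v^{2}/2,$ so that $\rho (0)=0$ and the first two derivatives at zero equal $-1$ (a normalization assumed for this example). The rows of a simulated matrix stand in for $g_{i}(\beta )^{\prime },$ and all names are hypothetical. The code compares the maximized GEL criterion with one half of the CUE objective and confirms that perturbing $\lambda $ does not raise the criterion.

```python
import numpy as np

def rho(v):
    """Quadratic rho with rho(0) = 0 and first two derivatives at zero equal to -1."""
    return -v - 0.5 * v**2

def P_hat(G, lam):
    """GEL criterion: the sample average of rho(lambda' g_i)."""
    return float(rho(G @ lam).mean())

rng = np.random.default_rng(2)
n, m = 200, 3
# Rows play the role of g_i(beta)' at some fixed beta (simulated, not real data).
G = rng.normal(loc=0.2, size=(n, m))

g = G.mean(axis=0)                    # g-hat(beta)
Omega = G.T @ G / n                   # Omega-hat(beta)
Omega_ginv = np.linalg.pinv(Omega)    # one generalized inverse

lam_hat = -Omega_ginv @ g             # candidate solution of the first-order conditions
foc_residual = g + Omega @ lam_hat    # should be numerically zero

cue_objective = float(g @ Omega_ginv @ g)
P_max = P_hat(G, lam_hat)

# Concavity check: random perturbations of lambda should never do better.
perturbed = [P_hat(G, lam_hat + 0.1 * rng.normal(size=m)) for _ in range(1000)]

print("P_hat at lam_hat:", P_max, " half the CUE objective:", 0.5 * cue_objective)
```

With $\rho (0)=0,$ the maximized criterion is exactly one half of the CUE objective, so minimizing either over $\beta $ gives the same set of minimizers.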

Consider the following linear model, where the structural equation is given by
$$y=Y\theta _{0}+u$$
and the reduced form for $Y$ by
$$Y=Z\Pi +V,$$
where $\theta _{0}$ is $p\times 1$ and $Z$ is $n\times k.$ Assume $k\geq p,$ the order condition for identification.

Under strong identification of $\theta _{0}$, $\Pi $ is a fixed matrix of full column rank. Weak identification (Staiger and Stock, 1997) is modeled by letting the correlation between the instruments and the endogenous variables fade away as $n$ goes to infinity. Assume
MATH
where $C$ is a fixed $k\times p$ matrix. For a given sample size $n,$ define the random $k$-vector
MATH
Define the $k\times k$ matrix
MATH
Let
MATH
Assume (i) MATH are iid, (ii) MATH (iii) MATH MATH

Next, we give a formal definition of the GEL estimator, MATH, of $\theta _{0}.$ It exploits the moment condition
MATH
and is given as the solution to a saddle point problem
MATH
where
MATH
For quadratic $\rho ,$ a second-order Taylor expansion of MATH in $\lambda $ about zero is exact. This implies that for each $\theta ,$ the maximization in $\lambda $ in the saddle point problem above is unconstrained. It follows that for
MATH
and
MATH

MATH
It holds that
MATH
By concavity of MATH in $\lambda ,$ any solution MATH to the first-order conditions (FOC)
MATH
maximizes MATH with respect to $\lambda $ for fixed $\theta .$ Then
MATH
The next lemma establishes the limit process of MATH under weak identification.

Lemma

Assume $\xi =1/2.$ Let MATH be a $k$-dimensional Gaussian empirical process with mean zero and covariance function
MATH
Then
MATH

Theorem

MATH

The theorem shows that under weak identification the GEL estimator has a nonstandard distribution and is in general inconsistent.
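
A small Monte Carlo experiment can illustrate this statement. The sketch below is not the design of any cited paper and rests on several assumptions made here: a single endogenous regressor ($p=1$) with $k=3$ instruments, the standard linear IV form $y=Y\theta _{0}+u,$ $Y=Z\Pi +V,$ normal errors with correlation $0.5,$ moment functions $g_{i}(\theta )=Z_{i}(y_{i}-Y_{i}\theta ),$ and weak identification modeled as $\Pi _{n}=C/\sqrt{n}$ (the case $\xi =1/2$). Since GEL with quadratic $\rho $ coincides with the CUE, the CUE is computed by grid search. All names and parameter values are hypothetical.

```python
import numpy as np

def cue_grid(y, Y, Z, grid):
    """CUE for scalar theta by grid search, with g_i(theta) = Z_i (y_i - Y_i theta)."""
    n, k = Z.shape
    E = y[:, None] - Y[:, None] * grid[None, :]             # n x T residuals
    g_bar = (Z.T @ E / n).T                                 # T x k sample moments
    ZZ = (Z[:, :, None] * Z[:, None, :]).reshape(n, k * k)  # rows hold vec(Z_i Z_i')
    Omega = ((E**2).T @ ZZ / n).reshape(-1, k, k)           # T x k x k
    sol = np.linalg.solve(Omega, g_bar[:, :, None])[:, :, 0]
    objective = np.sum(g_bar * sol, axis=1)                 # g' Omega^{-1} g per grid point
    return float(grid[np.argmin(objective)])

def simulate(n, Pi, theta0, rng):
    """Draw one sample from the assumed linear IV design."""
    Z = rng.normal(size=(n, Pi.shape[0]))
    u = rng.normal(size=n)
    v = 0.5 * u + np.sqrt(0.75) * rng.normal(size=n)        # corr(u, v) = 0.5
    Y = Z @ Pi + v
    y = Y * theta0 + u
    return y, Y, Z

rng = np.random.default_rng(3)
theta0 = 1.0
C = np.ones(3)
grid = np.linspace(theta0 - 4.0, theta0 + 4.0, 161)
reps = 200

iqr = {}
for design in ("strong", "weak"):
    for n in (100, 1600):
        # Strong: Pi fixed.  Weak: Pi shrinks at rate n^(-1/2) (assumption).
        Pi = C if design == "strong" else C / np.sqrt(n)
        draws = [cue_grid(*simulate(n, Pi, theta0, rng), grid)
                 for _ in range(reps)]
        q25, q75 = np.percentile(draws, [25, 75])
        iqr[design, n] = float(q75 - q25)
        print(design, n, "interquartile range of theta-hat:", iqr[design, n])
```

In this design the interquartile range of the estimates should shrink as $n$ grows when $\Pi $ is fixed, and should stay wide when $\Pi _{n}=C/\sqrt{n},$ which is the pattern the inconsistency result leads one to expect.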
