Evaluating the CDF of the Skew Normal distribution

Posted: April 4, 2020 in Research papers, Uncategorized

A paper I wrote together with Christine Amsler and Peter Schmidt (yes, I cannot resist saying it: the Peter Schmidt of the KPSS time series stationarity test, and one of the founders of Stochastic Frontier Analysis) has just been accepted for publication in a special issue of Empirical Economics dedicated to efficiency and productivity analysis. The paper is

Amsler C, A Papadopoulos and P Schmidt (2020). “Evaluating the CDF of the Skew Normal distribution.” Forthcoming in Empirical Economics. Download the full paper incl. the supplementary file.

ABSTRACT. In this paper we consider various methods of evaluating the cdf of the Skew Normal distribution. This distribution arises in the stochastic frontier model because it is the distribution of the composed error, which is the sum (or difference) of a Normal and a Half Normal random variable. The cdf must be evaluated in models in which the composed error is linked to other errors using a Copula, in some methods of goodness of fit testing, or in the likelihood of models with sample selection bias. We investigate the accuracy of the evaluation of the cdf using expressions based on the bivariate Normal distribution, and also using simulation methods and some approximations. We find that the expressions based on the bivariate Normal distribution are quite accurate in the central portion of the distribution, and we propose several new approximations that are accurate in the extreme tails. By a simulated example we show that the use of approximations instead of the theoretical exact expressions may be critical in obtaining meaningful and valid estimation results.

The paper computes values of the Skew Normal distribution using 17 different mathematical formulas (exact or approximate), algorithms, and software packages, with particular focus on the accuracy of computing the Skew Normal CDF through the Bivariate standard Normal CDF, since the latter is readily available, but also on what happens deep in the tails. There, the CDF values are so close to zero or unity that it might seem harmless for empirical studies to simply impose a non-zero floor and a non-unity ceiling. It is not harmless. In Section 7 of the paper we show by a simulated example that using only the Bivariate standard Normal CDF (with or without floor/ceiling) may lead to failed estimation, while replacing it with an approximate expression in the left tail solves the problem. This is a result we did not anticipate: approximate mathematical expressions may perform better than exact formulas, because of the computational limitations of the latter.
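
For readers who want to experiment, here is a minimal Gretl sketch of one expression based on the Bivariate standard Normal CDF. It assumes the standardized Skew Normal in Azzalini's parametrization with shape parameter alpha (the composed error of the stochastic frontier model also involves scale parameters, which are left out here), and it uses the known identity $F(x;\alpha) = 2\Phi_2(x, 0; -\delta)$ with $\delta = \alpha/\sqrt{1+\alpha^2}$. The variable names and evaluation points are arbitrary choices for illustration; this is not the code used in the paper.

<hansl>
# Standardized Skew Normal CDF through the Bivariate standard Normal CDF.
# Assumption: Azzalini parametrization with shape parameter alpha.
scalar alpha = 1
scalar delta = alpha/sqrt(1 + alpha^2)
matrix x = {-8; -4; -2; 0; 2}   # evaluation points, chosen arbitrarily
scalar n = rows(x)
matrix F = zeros(n, 1)
loop i = 1..n
    # cdf(D, rho, a, b) returns P(X < a, Y < b) for a standard bivariate Normal
    F[i] = 2*cdf(D, -delta, x[i], 0)
endloop
print x F
# Sanity check at x = 0, where the closed form is 1/2 - atan(alpha)/pi
printf "closed form at x = 0: %.12f\n", 0.5 - atan(alpha)/$pi
</hansl>

Pushing the evaluation points further into the left tail is a simple way to see where the computed value stops being usable, for instance inside a log-likelihood.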

Ignorability and estimator consistency in binary Logistic regression

Posted: May 10, 2019 in Technical Reports, Uncategorized

In (counterfactual) Treatment Effects Analysis, we learn that a fundamental condition in order to be able to estimate treatment effects reliably is that the treatment variable is “ignorable conditional on the control variables” (see Rosenbaum and Rubin 1983). When ignorability does not hold, as it happens with most cases of observational, non-randomized data, various methods have been developed to obtain ignorability, or in more precise words, to construct a sample (through “risk adjustment”, “balancing on propensity scores”, etc) that “imitates” a randomized one.

We are also told that ignorability is analogous to regressor exogeneity in the linear regression setup, so that when ignorability does not hold we essentially have endogeneity, and the estimation will produce inconsistent and hence unreliable estimates; see e.g. Imbens (2004), or Guo and Fraser, “Propensity Score Analysis” (2010), 1st ed., pp. 30-35.

This is simply wrong. The treatment variable may not be ignorable and yet the estimator can be consistent. This means that we can estimate consistently the treatment effect even if the treatment is non-ignorable. We illustrate that non-ignorability does not necessarily imply inconsistency of the estimator, through the widely used Binary Logistic Regression model (BLR).

The BLR model starts properly with a latent-variable regression, usually linear,

$y^{\ell}_i = \beta_0 + \beta_1T_i + \mathbf z'_i \gamma + u_i,\;\;\; i=1,...,n \;\;\;\;(1)$

where $y^{\ell}_i$ is the unobservable (latent) variable, $T_i$ is the treatment variable, $\mathbf z_i$ is the vector of controls and $u_i$ is the error term. We obtain the BLR model if we assume that the error term follows the standard Logistic distribution conditional on the regressors, $u_i | \{T_i, \mathbf z_i\} \sim \Lambda (0, \pi^2/3)$. Then we define the indicator variable $y_i \equiv I\{y^{\ell}_i >0\}$, which is observable, and we ask what the probability distribution of $y_i$ is, conditional on the regressors. We obtain

$\Pr\left (y_i = 1 | \{T_i, \mathbf z_i\}\right) = \Lambda\left (\beta_0 + \beta_1T_i + \mathbf z'_i \gamma\right)\;\;\;\;(2)$

and in general,

$\Pr\left (y_i | \{T_i, \mathbf z_i\}\right) = \left[\Lambda\left (\beta_0 + \beta_1T_i + \mathbf z'_i \gamma\right)\right]^{y_i}\cdot \left[1-\Lambda\left (\beta_0 + \beta_1T_i + \mathbf z'_i \gamma\right)\right]^{1-y_i}\;\;\;\;(3)$

The parameters are estimated by maximizing the likelihood built from $(3)$, i.e. by the maximum likelihood estimator (MLE).
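
The step from $(1)$ to $(2)$ uses only the definition of $y_i$ and the symmetry of the standard Logistic distribution around zero. With the shorthand $m_i \equiv \beta_0 + \beta_1T_i + \mathbf z'_i \gamma$ (introduced here only to keep the display short), we have

$\Pr\left (y_i = 1 | \{T_i, \mathbf z_i\}\right) = \Pr\left (u_i > -m_i | \{T_i, \mathbf z_i\}\right) = 1-\Lambda\left (-m_i\right) = \Lambda\left (m_i\right)$

where the last equality holds because $\Lambda(-m) = 1-\Lambda(m)$ for the Logistic CDF.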

Turning to ignorability, it can be expressed as

$\Pr \left (y_i | \{T_i, \mathbf z_i\}\right) = \Pr \left (y_i |\mathbf z_i\right)\;\;\;\;(4)$

Essentially ignorability means that the treatment variable is totally determined by the controls, or maybe, that if it is only partly determined by them, its other “part” is independent from the dependent variable/outcome.

Comparing $(4)$ with $(3)$, we see that ignorability of treatment in the context of the BLR model is equivalent to the assumption $\beta_1=0$.
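
To spell out the comparison for a binary treatment, and assuming that both treatment states occur with positive probability given the controls: by $(2)$,

$\Pr\left (y_i = 1 | T_i=1, \mathbf z_i\right) - \Pr\left (y_i = 1 | T_i=0, \mathbf z_i\right) = \Lambda\left (\beta_0 + \beta_1 + \mathbf z'_i \gamma\right) - \Lambda\left (\beta_0 + \mathbf z'_i \gamma\right)$

Condition $(4)$ requires this difference to be zero, and since $\Lambda$ is strictly increasing, that happens if and only if $\beta_1=0$.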

“Great”, you could say. “So run the model and let the data decide whether ignorability holds or not”. Well, the issue is whether, when ignorability does not hold, the MLE remains a consistent estimator so that we can have confidence in the estimates that we will obtain. And the assertion that we find in the literature, is that non-ignorability destroys consistency.

Does it? Let’s see: in order for the MLE to be inconsistent, it must be the case that the regressors in the latent-variable regression (eq. 1) are correlated with the error term. The controls are assumed independent from the error term from the outset. What is argued is that if $T_i$ is non-ignorable, then it is associated with $u_i$.

We have just seen that ignorability implies that $\beta_1 =0$. So if non-ignorability is the case, we have that $\beta_1 \neq 0$. How does this imply the inconsistency condition “$T_i$ is not independent from $u_i$”?

It doesn’t. The (informal) argument is that if the treatment variable is not fully determined by the controls, it “must” be statistically associated with the unmodeled/random factors represented by $u_i$. But there is nothing here to support a priori this assertion. Whether the treatment variable is endogenous or not, must be argued per case, with respect to the actual situation that we analyze and model. Certainly, if the argument is that the treatment is ignorable, then, if the controls are exogenous to the error term (which is the maintained assumption), so will be the treatment variable also. But if it is non-ignorable, it does not follow automatically that it is endogenous.

Therefore, depending on the real-world phenomenon under study and the available sample, we may very well have a consistent MLE in the BLR model, and so

a) be able to test validly the ignorability assumption, and

b) estimate treatment effects reliably even if the treatment is non-ignorable.

In response to a request in the comments, here is a short Gretl script that simulates a situation where the treatment is not ignorable but is independent of the error term, so that its effect can be consistently estimated. Play around with the sample size (currently $n=5000$), or embed the script in a simple index loop (with a matrix to hold the estimates of each run, then fill a series with the estimates from the matrix, then compute basic statistics to see that the estimator is consistent).

<hansl>

nulldata 5000

set hc_version 2 #uses HC2 robust standard errors

#Data generation

genr U1 = randgen(U,0,1) #auxiliary variable
genr Er = -log((1-U1)/U1) #Logistic error term Λ(0,1)
genr X1 = randgen(G,1,2) # continuous regressor following Exponential
genr N1 = randgen (N,0,1) # codetermines the assignment of treatment
genr T = (X1+N1 >0) #Bernoulli treatment
genr yL = -0.5 + 0.5*T + X1 + Er # latent dependent variable

#The Treatment is not ignorable because it influences directly the latent dependent variable.
genr Depvar = (yL >0) #observable dependent variable

#Estimation

list Reglist = const T X1  #OLS estimation for starting values
ols Depvar Reglist --quiet

matrix bcoeff = $coeff  #OLS coefficients serve as starting values for the MLE

#This so that the names of the variables appear in the estimation output
string varblnames = varname(Reglist)
string allcoefnames = varblnames

#command for maximum likelihood estimation
catch mle logl = Depvar*log(CDF) + (1-Depvar)*log(1-CDF)
series g = lincomb(Reglist,bcoeff)

series CDF = 1/(1+ exp(-g)) #correct specification of the distribution of the error term

params bcoeff
param_names allcoefnames
end mle --hessian

</hansl>
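
The index loop suggested before the script could look roughly like the following sketch. It repeats the data generation of the script above and stores the estimates of each run in a matrix. Two things are my own choices for brevity and not part of the original script: the number of runs (nrep = 200, arbitrary) and the use of Gretl's built-in logit command in place of the explicit mle block (both maximize the likelihood built from $(3)$). The seed is also arbitrary.

<hansl>
set verbose off
set seed 20190510              # arbitrary seed, only for reproducibility
scalar nrep = 200              # number of Monte Carlo runs (illustrative choice)
nulldata 5000
matrix est = zeros(nrep, 3)    # one row of estimates (const, T, X1) per run

loop r = 1..nrep
    # same data generating process as in the script above
    series U1 = randgen(U,0,1)
    series Er = -log((1-U1)/U1)
    series X1 = randgen(G,1,2)
    series N1 = randgen(N,0,1)
    series T = (X1+N1 > 0)
    series yL = -0.5 + 0.5*T + X1 + Er
    series Depvar = (yL > 0)
    # built-in logit estimator, used here instead of the explicit mle block
    logit Depvar const T X1 --quiet
    est[r,] = $coeff'
endloop

# the true values in the data generating process are (-0.5, 0.5, 1)
matrix avg = meanc(est)
matrix sd = sdc(est)
print avg sd
</hansl>

If the estimator is consistent, the column means should be close to the true values, and the standard deviations should shrink as the sample size in nulldata is increased.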

Changes

Posted: October 18, 2017 in Uncategorized

It has just been made official: my PhD focus has changed, and it will now be about the Two-tier Stochastic Frontier model, on which I have already published a paper. The thesis will contain new distributional specifications for the model, among them one that allows for statistical dependence, and two applications where I apply the model to new situations that the existing literature has not touched upon. The projection is that the whole thing will be finished in the next 6 months, since most of the work has been done in the past years, as a … recreational break.

Two Exercises on the Intertemporal Approach to the Current Account

Posted: May 5, 2015 in Uncategorized

Economics.SE has gone into Public Beta (and CES into Leontief)

Posted: December 2, 2014 in Uncategorized

Economics.SE has gone into Public Beta and now everybody can participate there. Of course there is not much content yet; no one expects a lot of content during the private Beta. But it already appears that Economics.SE can eventually have the right mix and balance regarding style, level and focus of questions and answers.

For example, an answer proves that the C.E.S. production function converges to the Leontief technology (and to Cobb-Douglas), but the answer also treats the case where the CES is not homogeneous of degree one, i.e. it does not exhibit constant returns to scale: what happens then?