Base R ships with a lot of functionality useful for computational
econometrics, in particular in the stats package. This
functionality is complemented by many packages on CRAN; a brief overview
is given below. There is also a considerable overlap between the tools
for econometrics in this view and for finance in the
Finance
view.
Furthermore, the
Finance SIG
is a suitable mailing list for obtaining help
and discussing questions about both computational finance and econometrics.
Finally, there is also some overlap with the
SocialSciences
view, which
covers a broad variety of tools for the social sciences, including political science.
The packages in this view can be roughly structured into the following topics.
If you think that some package is missing from the list, please let me know.
Linear regression models
-
Linear models can be fitted (via OLS) with
lm()
(from stats) and standard tests for model comparisons are available in various
methods such as
summary()
and
anova().
-
Analogous functions
that also support asymptotic tests (
z
instead of
t
tests, and
Chi-squared instead of
F
tests) and plug-in of other covariance
matrices are
coeftest()
and
waldtest()
in
lmtest.
-
Tests of more general linear hypotheses are implemented in
linearHypothesis()
in
car.
-
HC and HAC covariance matrices that can be plugged
into these functions are available in
sandwich.
-
Diagnostic checking: The packages
car
and
lmtest
provide a large collection
of regression diagnostics and diagnostic tests.
-
Instrumental variables regression (two-stage least squares) is
provided by
ivreg()
in
AER; another implementation
is
tsls()
in package
sem.
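The first item in this list (OLS fitting with lm() and model comparison via summary() and anova()) can be illustrated with a minimal sketch that uses only base R. The built-in mtcars data, the choice of regressors, and all object names are assumptions made for illustration and are not part of this view.

```r
# Fit two nested linear models by OLS on the built-in 'mtcars' data
# (data set and regressors are illustrative assumptions).
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table with t tests for the individual coefficients.
coef_table <- coef(summary(fit_large))
print(coef_table)

# F test comparing the two nested models.
cmp <- anova(fit_small, fit_large)
print(cmp)
```

Because the larger model adds a single regressor, the F statistic reported by anova() equals the squared t statistic of that regressor in the coefficient table.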
Microeconometrics
-
Many standard microeconometric models belong to the
family of generalized linear models (GLM) and can be fitted by
glm()
from package stats. This includes in particular logit and probit models
for modeling choice data and Poisson models for count data. Effects for typical
values of regressors in these models can be obtained and visualized using
effects.
-
Negative binomial GLMs are available via
glm.nb()
in package
MASS.
Another implementation of negative binomial models
is provided by
aod, which also contains other models for overdispersed
data.
-
Zero-inflated and hurdle count models are provided in package
pscl.
-
Multinomial responses: Multinomial models
with individual-specific covariates only are available in
multinom()
from package
nnet. An implementation with both individual- and
choice-specific variables is
mlogit. Generalized additive models
(GAMs) for multinomial responses can be fitted with the
VGAM
package.
A Bayesian approach to multinomial probit models is provided by
MNP.
Various Bayesian multinomial models (including logit and probit) are available
in
bayesm.
-
Ordered responses: Proportional-odds regression for ordered responses is implemented
in
polr()
from package
MASS. The package
ordinal
provides cumulative link models for ordered data, which encompass proportional-odds
models but also include more general specifications. Bayesian ordered probit
models are provided by
bayesm.
-
Censored responses: Basic censored regression models (e.g., tobit models)
can be fitted by
survreg()
in
survival; a convenience
interface
tobit()
is in package
AER. Further censored
regression models, including models for panel data, are provided in
censReg.
Interval regression models are in
intReg.
Furthermore, hurdle models for data left-censored at zero can be estimated with
mhurdle. Models for sample selection are available in
sampleSelection.
-
Multivariate probit models: Estimation and marginal effect computations can be
carried out with
mvProbit.
-
Miscellaneous: Further, more refined tools for microeconometrics are provided in
the
micEcon
family of packages: Analysis with
Cobb-Douglas, translog, and quadratic functions is in
micEcon;
the constant elasticity of substitution (CES) function is in
micEconCES;
the symmetric normalized quadratic profit (SNQP) function is in
micEconSNQP.
The almost ideal demand system (AIDS) is in
micEconAids.
Stochastic frontier analysis is in
frontier.
The package
bayesm
implements a Bayesian
approach to microeconometrics and marketing. Inference for relative
distributions is contained in package
reldist.
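As a rough sketch of the GLM item at the top of this list, the following fits a logit, a probit, and a Poisson model with glm() from stats. The built-in data sets mtcars and warpbreaks, the model formulas, and the object names are illustrative assumptions and are not taken from this view.

```r
# Binary response: logit and probit models for transmission type (am)
# in the built-in 'mtcars' data (an illustrative choice).
logit_fit  <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
probit_fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

# Predicted probability of am == 1 at wt = 3 under the logit model.
p_logit <- predict(logit_fit, newdata = data.frame(wt = 3), type = "response")
print(p_logit)

# Count response: Poisson regression on the built-in 'warpbreaks' data.
pois_fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# Coefficient tables with asymptotic z tests.
print(coef(summary(logit_fit)))
print(coef(summary(probit_fit)))
print(coef(summary(pois_fit)))
```

Only the family argument changes between the three fits; the formula interface and the extractor functions are the same as for lm().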
Further regression models
-
Nonlinear least squares modeling is available via
nls()
in package stats.
-
Quantile regression:
quantreg
(including linear, nonlinear, censored,
locally polynomial and additive quantile regressions).
-
Linear models for panel data:
plm, providing a wide range of within,
between, and random-effect methods (among others) along with corrected standard
errors, tests, etc. For panel-corrected standard errors in OLS and GEE models,
see
geepack
and
pcse. Estimation of linear models with
multiple group fixed effects is contained in
lfe.
-
Generalized method of moments (GMM) and generalized empirical likelihood (GEL):
gmm.
-
Spatial econometric models: The
Spatial
view gives details about
handling spatial data, along with information about (regression) modeling. In particular,
spatial regression models can be fitted using
spdep
and
sphet
(the
latter using a GMM approach). A package for spatial panel
models,
splm, is under development on R-Forge.
-
Linear structural equation models:
sem
(including two-stage least squares).
-
Simultaneous equation estimation:
systemfit.
-
Nonparametric kernel methods:
np.
-
Beta regression:
betareg
and
gamlss.
-
Truncated (Gaussian) regression:
truncreg.
-
Nonlinear mixed-effect models:
nlme
and
lme4.
-
Generalized additive models (GAMs):
mgcv,
gam,
gamlss
and
VGAM.
-
Miscellaneous: The packages
VGAM,
rms
and
Hmisc
provide several tools for extended
handling of (generalized) linear regression models.
Zelig
is a unified
easy-to-use interface to a wide range of regression models.
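For the nonlinear least squares item at the top of this list, a minimal sketch with nls() might look as follows. The exponential-decay model, the simulated data, the seed, and the starting values are all assumptions chosen so that the example is self-contained.

```r
# Simulate data from y = a * exp(-b * x) + noise with a = 3, b = 0.7
# (model and parameter values are illustrative assumptions).
set.seed(1)
x <- seq(0, 5, length.out = 50)
y <- 3 * exp(-0.7 * x) + rnorm(50, sd = 0.05)
dat <- data.frame(x = x, y = y)

# Nonlinear least squares; starting values must be supplied by the user.
nls_fit <- nls(y ~ a * exp(-b * x), data = dat, start = list(a = 1, b = 0.5))

# Estimates with standard errors and t tests.
print(coef(summary(nls_fit)))
```

Unlike lm(), nls() requires starting values, and a poor choice can prevent convergence.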
Basic time series infrastructure
-
The
TimeSeries
task view provides much more detailed
information. Here, only the most important aspects are briefly mentioned.
-
The class
"ts"
in package stats is R's standard class for
regularly spaced time series (especially annual, quarterly, and monthly data).
-
Time series in
"ts"
format can be
coerced back and forth without loss of information to
"zooreg"
from package
zoo.
zoo
provides infrastructure for
both regularly and irregularly spaced time series (the latter via the class
"zoo") where the time information can be of arbitrary class.
This includes daily series (typically with
"Date"
time index)
or intra-day series (e.g., with
"POSIXct"
time index).
-
Several
other implementations of irregular time series building on the
"POSIXct"
time-date class are available in
its,
tseries
and
timeSeries
(previously: fSeries) which are all aimed particularly at
finance applications. See the
Finance
task view for
more information.
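The "ts" class described in this list can be sketched with base R alone. The quarterly values below are made up for illustration, and the object names are hypothetical.

```r
# Eight quarterly observations starting in 2000 Q1 (made-up values).
y <- ts(c(10, 12, 11, 13, 14, 16, 15, 17), start = c(2000, 1), frequency = 4)

print(y)        # printed in a year-by-quarter layout
print(time(y))  # numeric time index of the observations

# Extract the observations from 2000 Q3 through 2001 Q2.
sub <- window(y, start = c(2000, 3), end = c(2001, 2))

# Align the series with its own first lag on their common time span.
both <- ts.intersect(y, lag1 = stats::lag(y, -1))
print(both)
```

The time index is implied by start and frequency, which is why "ts" suits regularly spaced data only. Irregular series need an explicit index, as provided by the packages listed above.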
Time series modeling
-
The
TimeSeries
task view contains
detailed information about time series analysis in R. Here, only a brief overview
of the most important methods for econometrics is given.
-
Classical time series modeling tools are
contained in the stats package and include
arima()
for ARIMA modeling
and Box-Jenkins-type analysis.
-
Fitting linear regression models with AR error terms via GLS is possible
using
gls()
from
nlme.
-
Structural time series models are provided by
StructTS()
in stats.
-
Filtering and decomposition for time series is available in
decompose()
and
HoltWinters()
in stats.
-
Extensions to these
methods, in particular for forecasting and model selection, are provided in
the
forecast
package.
-
Miscellaneous time series filters are available in
mFilter.
-
For estimating VAR models, several
methods are available: simple models can be fitted by
ar()
in stats; more
elaborate models are provided in package
vars
and by
estVARXls()
in
dse; a Bayesian approach is available in
MSBVAR. A
convenient interface for fitting dynamic regression models via OLS is available
in
dynlm; a different approach
that also works with other regression functions is implemented in
dyn.
-
More advanced dynamic system equations can be fitted using
dse.
-
Various nonlinear autoregressive time series models are provided by
tsDyn.
-
Gaussian linear state space models can be fitted using
dlm
(via maximum likelihood,
Kalman filtering/smoothing and Bayesian methods).
-
Unit root and cointegration techniques are available in
urca,
tseries, and
CADFtest.
-
Time series factor analysis is available in
tsfa.
-
Package
sde
provides simulation and inference for stochastic
differential equations.
-
Asymmetric price transmission modeling is available in
apt.
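The classical tools from stats mentioned in this list, arima() and decompose(), can be sketched as follows. The built-in AirPassengers series, the log transformation, and the seasonal ARIMA(0,1,1)(0,1,1) specification are illustrative assumptions and not recommendations from this view.

```r
# Monthly airline passenger totals (built-in data), on the log scale.
y <- log(AirPassengers)

# Seasonal ARIMA(0,1,1)(0,1,1) with period 12, an illustrative specification.
arima_fit <- arima(y, order = c(0, 1, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12))
print(arima_fit)

# Point forecasts and standard errors for the next 12 months.
fc <- predict(arima_fit, n.ahead = 12)
print(fc$pred)

# Classical additive decomposition into trend, seasonal and random parts.
dec <- decompose(y)
print(round(dec$figure, 3))  # estimated seasonal effects, one per month
```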
Data sets
-
Packages
AER
and
Ecdat
contain comprehensive collections of data sets from various standard econometric
textbooks as well as several data sets from the Journal of
Applied Econometrics and the Journal of Business & Economic Statistics
data archives.
-
AER
additionally provides an extensive set of
examples reproducing analyses from the textbooks/papers, illustrating
various econometric methods.
-
FinTS
is the R companion to Tsay's 'Analysis of
Financial Time Series' (2nd ed., 2005, Wiley) containing data sets, functions
and script files required to work some of the examples.
-
CDNmoney
provides Canadian monetary aggregates.
-
pwt
provides (several releases of) the Penn world table.
-
The packages
expsmooth,
fma, and
Mcomp
are
data packages with time series data
from the books 'Forecasting with Exponential Smoothing: The State Space Approach'
(Hyndman, Koehler, Ord, Snyder, 2008, Springer) and 'Forecasting: Methods and Applications'
(Makridakis, Wheelwright, Hyndman, 3rd ed., 1998, Wiley) and the M-competitions,
respectively.
-
Package
erer
contains functions and datasets for the book
'Empirical Research in Economics: Growing up with R' (Sun, forthcoming).
Miscellaneous
-
Matrix manipulations
: As a vector- and matrix-based language, base R
ships with many powerful tools for doing matrix manipulations, which are
complemented by the packages
Matrix
and
SparseM.
-
Optimization and mathematical programming
: R and many of its contributed
packages provide many specialized functions for solving particular optimization
problems, e.g., in regression as discussed above. Further functionality for
solving more general optimization problems, e.g., likelihood maximization, is
discussed in the
Optimization
task view.
-
Bootstrap
: In addition to the recommended
boot
package,
there are some other general bootstrapping techniques available in
bootstrap
or
simpleboot
as well as some bootstrap techniques
designed for time-series data, such as the maximum entropy bootstrap in
meboot
or
tsbootstrap()
from
tseries.
-
Inequality
: For measuring inequality, concentration and poverty the
package
ineq
provides some basic tools such as Lorenz curves,
Pen's parade, the Gini coefficient and many more.
-
Structural change
: R is particularly strong when dealing with
structural changes and changepoints in parametric models, see
strucchange
and
segmented.
-
Exchange rate regimes
: Methods for inference about exchange
rate regimes, in particular in a structural change setting, are provided
by
fxregime.
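To illustrate the matrix manipulation item at the top of this list, the following sketch computes OLS coefficients and their covariance matrix with base R matrix operations and compares them with lm(). The mtcars data and the regressors are illustrative assumptions. Solving the normal equations directly is shown for exposition only; lm() itself uses a QR decomposition.

```r
# Regressor matrix with an intercept column, and the response vector
# (built-in 'mtcars' data; an illustrative choice).
X <- cbind(Intercept = 1, wt = mtcars$wt, hp = mtcars$hp)
y <- mtcars$mpg

# OLS coefficients from the normal equations (X'X) b = X'y.
XtX <- crossprod(X)
beta_hat <- solve(XtX, crossprod(X, y))

# Residual variance estimate and the usual covariance matrix estimate.
res <- y - X %*% beta_hat
sigma2 <- sum(res^2) / (nrow(X) - ncol(X))
vcov_hat <- sigma2 * solve(XtX)

# Reference fit for comparison.
lm_fit <- lm(mpg ~ wt + hp, data = mtcars)
print(cbind(matrix = as.numeric(beta_hat), lm = unname(coef(lm_fit))))
```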