
Presenting results

Overview

This lesson introduces the stargazer package for presenting regression results.

Objectives

After completing this module, students should be able to:

  1. Use the stargazer package to present nice-looking regression results.

Readings

NA

Issues with summary

  • An important part of conducting research is presenting results to the rest of the world.
  • The raw output of summary(lm(y ~ x, data = data)) is hard to read and may not be the most appropriate format for sharing. Take a look at the output of a regression printed with the summary command.
data("mtcars")
mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)
summary(mr1)

Call:
lm(formula = hp ~ mpg + cyl + wt + gear, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-39.076 -17.816  -2.354  14.645  67.680

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -133.995    106.205  -1.262 0.217859
mpg           -3.032      2.199  -1.379 0.179215
cyl           28.332      5.927   4.780  5.5e-05 ***
wt             6.739     12.116   0.556 0.582625
gear          39.215      9.134   4.293 0.000203 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 30.16 on 27 degrees of freedom
Multiple R-squared:  0.8315,    Adjusted R-squared:  0.8065
F-statistic: 33.31 on 4 and 27 DF,  p-value: 4.428e-10

There are a couple of things that are not ideal about the summary() output:

  • The variable names are coded. Unless the person reading the results is familiar with the dataset, it will be hard for them to figure out what each variable represents.
  • The Call and Residuals parts of the output are redundant in virtually all applications.
  • The output format is not editable.
  • Given the coefficient estimate, the Std. Error, t value, and p-value convey essentially the same information (the result of testing the hypothesis that the parameter is equal to zero). Therefore, we only need one of them. In fact, most published papers that use regressions present only the SE and omit the t-value and p-value.
  • summary only presents one regression output at a time. To save space and make our results easier to read, we’ll often present the output of several regressions side-by-side in the same table, especially when the models share estimated parameters.
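To make the redundancy point concrete, here is a minimal sketch in base R that recomputes the t value and p-value columns from the estimates and standard errors alone. The names t_manual and p_manual are introduced only for this illustration.

```r
# Same regression as above
data("mtcars")
mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Coefficient table as a numeric matrix
coefs <- coef(summary(mr1))

# t value = estimate / standard error
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# Two-sided p-value from the t distribution with the residual d.o.f.
p_manual <- 2 * pt(abs(t_manual), df = df.residual(mr1), lower.tail = FALSE)

# Both comparisons should return TRUE
all.equal(t_manual, coefs[, "t value"])
all.equal(p_manual, coefs[, "Pr(>|t|)"])
```

Since the last two columns follow from the first two, reporting the estimate with its standard error is enough for a reader to reconstruct the rest.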

Alternatives to summary

If you are not happy with the output of summary there are a couple of alternatives:

  1. Write your own regression table function.
  2. Use a package such as stargazer, jtools, huxtable (which provides the huxreg function), or xtable.

Doing (1) can be time-consuming and, more often than not, only a few types of users will benefit from the extra work. For most of our needs, at least for the topics discussed in this course, the package stargazer will suffice.\(^1\)
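For a sense of what option (1) involves, here is a minimal sketch of a hand-written table function in base R. The function simple_reg_table and its arguments are hypothetical and exist only for this illustration; it handles a single lm object and nothing else.

```r
# Hypothetical helper: one row per coefficient, SE shown in parentheses
simple_reg_table <- function(model, labels = NULL, digits = 3) {
  coefs <- coef(summary(model))
  if (is.null(labels)) labels <- rownames(coefs)
  stopifnot(length(labels) == nrow(coefs))
  data.frame(
    Variable  = labels,
    Estimate  = formatC(coefs[, "Estimate"], format = "f", digits = digits),
    Std.Error = paste0("(", formatC(coefs[, "Std. Error"],
                                    format = "f", digits = digits), ")"),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)
tab <- simple_reg_table(mr1,
                        labels = c("Intercept",
                                   "Miles per Gallon",
                                   "Cylinders",
                                   "Weight (1000 lbs)",
                                   "Number of Forward Gear"))
print(tab)
```

Even this small version leaves out significance stars, fit statistics, multiple models, and HTML or LaTeX output, which is why a package is usually the more practical choice.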

Installing and loading stargazer

To use stargazer you need to:

  1. Install it once with install.packages("stargazer").
# You will do this once
# in the console
#install.packages("stargazer", repos = "http://cran.us.r-project.org")
  2. Load it with the command library("stargazer").
# You will need this in
# every file that requires
# stargazer. Only once per
# file.
library("stargazer")
##
## Please cite as:
##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer

To avoid getting those messages when loading packages, you can add the option message=FALSE to the R chunk header, e.g. {r message=FALSE}.

\(^1\) Note: I know that if you are a perfectionist or obsessive about details, at some point you’ll write your own functions or even your own package, but for this course that’s not necessary.

Stargazer

Presenting regression outputs with the stargazer package:

Let’s take a look at the previous regression output, now using the stargazer command.

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          header = FALSE) 
Regression Results

                          Dependent variable:
                          Gross Horsepower
Intercept                 -133.995
                          (106.205)
Miles per Gallon          -3.032
                          (2.199)
Cylinders                 28.332***
                          (5.927)
Weight (1000 lbs)         6.739
                          (12.116)
Number of Forward Gear    39.215***
                          (9.134)
Observations              32
R²                        0.832
Adjusted R²               0.807
Residual Std. Error       30.157 (df = 27)
F Statistic               33.310*** (df = 4; 27)
Note: *p<0.1; **p<0.05; ***p<0.01


This looks way better than the summary output, and, as we’ll see in a minute, it can be tweaked to your liking.

How it works

The stargazer function takes an R object (generally of the class lm or data.frame), extracts the relevant components of the object, and processes them into an HTML, LaTeX, or plain-text table.

Let’s inspect the attributes of the regression mr1 that we just estimated:

attributes(mr1)
## $names
##  [1] "coefficients"  "residuals"     "effects"       "rank"
##  [5] "fitted.values" "assign"        "qr"            "df.residual"
##  [9] "xlevels"       "call"          "terms"         "model"
##
## $class
## [1] "lm"

All of these can be accessed manually using the $ symbol, e.g. mr1$coefficients.

mr1$coefficients
## (Intercept)         mpg         cyl          wt        gear
## -133.994984   -3.031787   28.332058    6.739500   39.215107
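Not everything in a regression table is stored directly in the lm object; standard errors and fit statistics are computed by summary(). The sketch below pulls those pieces out by hand using base R only. The object name s is introduced here for illustration.

```r
mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)
s <- summary(mr1)

# Standard errors (shown in parentheses in the table)
coef(s)[, "Std. Error"]

# Bottom panel of the table
nobs(mr1)        # Observations
s$r.squared      # R2
s$adj.r.squared  # Adjusted R2
s$sigma          # Residual Std. Error
s$fstatistic     # F statistic with its degrees of freedom
```

These values can be compared with the bottom rows of the stargazer table shown earlier.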

After stargazer extracts and processes the content of the object, it takes into account the options provided by the user and generates the output table. As we’ll see, we can use stargazer to:

  1. Display the table results as part of an Rmarkdown document, or,
  2. Export a table to the hard drive directly.

The second option is really useful if you are working in very large projects that use the same table in different documents.
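As a sketch of the export route, stargazer's out argument writes the table to a file in addition to printing it. The file name and the use of tempdir() below are assumptions made for the example; in a real project you would point out_file at a folder inside the project.

```r
library(stargazer)

mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Hypothetical destination; replace with a path inside your project
out_file <- file.path(tempdir(), "regression_table.tex")

# The table code is printed and also written to out_file
tex_lines <- stargazer(mr1,
                       title = "Regression Results",
                       type = "latex",
                       header = FALSE,
                       out = out_file)

file.exists(out_file)
```

Any document in the project can then include the saved file, so the table only needs to be regenerated in one place.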

Rmarkdown and the results = 'asis' option

In R Markdown, you need to add results='asis' to the chunk options, like so: {r results='asis'}, to produce stargazer tables. Otherwise, knitr won’t know that the output has to be passed through as LaTeX or HTML code (depending on what you are knitting) and will show it as an ordinary piece of R output.

See what happens if I don’t include the results='asis' option,

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          header = FALSE) 
##
## <table style="text-align:center"><caption><strong>Regression Results</strong></caption>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td>Gross Horsepower</td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Intercept</td><td>-133.995</td></tr>
## <tr><td style="text-align:left"></td><td>(106.205)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Miles per Gallon</td><td>-3.032</td></tr>
## <tr><td style="text-align:left"></td><td>(2.199)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Cylinders</td><td>28.332<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(5.927)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Weight (1000 lbs)</td><td>6.739</td></tr>
## <tr><td style="text-align:left"></td><td>(12.116)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Number of Forward Gear</td><td>39.215<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(9.134)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>32</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.832</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.807</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>30.157 (df = 27)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>33.310<sup>***</sup> (df = 4; 27)</td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>

As you can see, this is the HTML code of the previous table, which is shown verbatim instead of being rendered because I didn’t specify the results='asis' option.

Stargazer in Latex

  • Since these lectures are knitted to html, note that I always add the option type = "html" to the stargazer function. But your assignments are knitted to .pdf via Latex, therefore you have to specify type = "latex" as an option.

  • In any case, you always have to add the option {r results='asis'}.

  • Below is example code to generate the table in LaTeX (you won’t see any table output, as this is an HTML file).

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "latex",
          intercept.bottom = FALSE,
          header = FALSE) 

Stargazer in the Editor

  • Unfortunately, when using the {r results='asis'} option, the table code is not processed until we knit the document. Because of that, when working in the editor, the stargazer output will simply display the code for the table and not the table itself, which is a bit annoying if you like to see the regression results in the editor.

  • If you feel very strongly about inspecting the regression results in the editor you can use type = "text" while working in the editor and then change it to type = "latex" when knitting the .pdf file.

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "text",
          intercept.bottom = FALSE) 
##
## Regression Results
## ==================================================
##                            Dependent variable:
##                        ---------------------------
##                             Gross Horsepower
## --------------------------------------------------
## Intercept                       -133.995
##                                 (106.205)
##
## Miles per Gallon                 -3.032
##                                  (2.199)
##
## Cylinders                       28.332***
##                                  (5.927)
##
## Weight (1000 lbs)                 6.739
##                                 (12.116)
##
## Number of Forward Gear          39.215***
##                                  (9.134)
##
## --------------------------------------------------
## Observations                       32
## R2                                0.832
## Adjusted R2                       0.807
## Residual Std. Error         30.157 (df = 27)
## F Statistic              33.310*** (df = 4; 27)
## ==================================================
## Note:                  *p<0.1; **p<0.05; ***p<0.01

This is not the nicest output, but it is customizable, can be viewed in the editor, and is still better than summary.
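Switching type by hand between the editor and knitting is easy to forget. One possible workaround, sketched below, is to let knitr report the output format and pick the type from that. It assumes the knitr package is installed; the variable name table_type is introduced here for illustration.

```r
library(stargazer)

# Pick the stargazer type from the current knitr output format.
# Outside of knitting (e.g. running the chunk in the editor), both
# checks are FALSE and we fall back to plain text.
table_type <- if (knitr::is_latex_output()) {
  "latex"
} else if (knitr::is_html_output()) {
  "html"
} else {
  "text"
}

mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)
tbl <- stargazer(mr1,
                 title = "Regression Results",
                 type = table_type,
                 header = FALSE)
```

The chunk still needs results='asis' for the HTML or LaTeX code to be rendered when knitting.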

Options

stargazer has many options to tweak the format of a regression table to your liking. These are the most basic ones:

Variable Labels

  • dep.var.labels allows you to provide a custom variable name to the dependent variable (\(y\) variable).

  • covariate.labels does the same for the independent variables (\(x\) variables).

The input to the previous options is a character vector, so you have to specify it with c().

Space

  • The no.space=TRUE option removes the empty lines between rows. This is useful if you want to reduce the vertical space taken up by the regression table.

  • The header=FALSE option suppresses the comment lines (package name, author, and date) that stargazer otherwise writes at the top of the table code.

  • The omit.stat option determines which statistics are left out of the bottom of the table. See the documentation for the code of each stat that can be omitted.

  • Finally, the align=TRUE option makes the numbers line up nicely at the decimal points (this option only works in LaTeX, not in HTML). For this option to work, you need to add the following to the YAML header of the .Rmd file:

header-includes:
   - \usepackage{dcolumn}
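The following sketch combines the spacing options on the regression from the beginning of the lesson. It uses type = "text" so the effect is visible in the editor; the omit.stat codes "f" and "ser" are stargazer's codes for the F statistic and the residual standard error.

```r
library(stargazer)

mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

tbl <- stargazer(mr1,
                 title = "Regression Results",
                 dep.var.labels = c("Gross Horsepower"),
                 covariate.labels = c("Intercept",
                                      "Miles per Gallon",
                                      "Cylinders",
                                      "Weight (1000 lbs)",
                                      "Number of Forward Gear"),
                 type = "text",
                 intercept.bottom = FALSE,
                 no.space = TRUE,               # no empty lines between rows
                 omit.stat = c("f", "ser"))     # drop F stat and residual SE
```

Compared with the text table shown earlier, this version should be shorter and end with the Observations, R2, and Adjusted R2 rows only.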

Regressions side-by-side

  • To present several regressions side-by-side we simply need to specify a list with each regression as the input of stargazer. For example,
# Running four regressions
mr1 <- lm(hp ~ mpg, data = mtcars)
mr2 <- lm(hp ~ mpg + cyl, data = mtcars)
mr3 <- lm(hp ~ mpg + cyl + wt, data = mtcars)
mr4 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Collecting regressions in a list
regs <- list(mr1, mr2, mr3, mr4)

# Displaying regressions side-by-side
stargazer(regs,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          df = FALSE) # omits d.o.f of F-test
Regression Results

                          Dependent variable: Gross Horsepower
                          (1)          (2)          (3)          (4)
Intercept                 324.082***   54.067       115.662      -133.995
                          (27.433)     (86.093)     (113.207)    (106.205)
Miles per Gallon          -8.830***    -2.775       -4.220       -3.032
                          (1.310)      (2.177)      (2.778)      (2.199)
Cylinders                              23.979***    25.025***    28.332***
                                       (7.346)      (7.486)      (5.927)
Weight (1000 lbs)                                   -12.135      6.739
                                                    (14.382)     (12.116)
Number of Forward Gear                                           39.215***
                                                                 (9.134)
Observations              32           32           32           32
R²                        0.602        0.709        0.716        0.832
Adjusted R²               0.589        0.689        0.686        0.807
Residual Std. Error       43.945       38.223       38.414       30.157
F Statistic               45.460***    35.373***    23.585***    33.310***
Note: *p<0.1; **p<0.05; ***p<0.01

Confidence Intervals

  • In some instances, you may be interested in presenting the confidence intervals (C.I.) for the parameters instead of the standard errors. You can do so with the option ci = TRUE.
stargazer(regs,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          df = FALSE,
          ci = TRUE,
          digits = 2)
Regression Results

                          Dependent variable: Gross Horsepower
                          (1)                 (2)                 (3)                 (4)
Intercept                 324.08***           54.07               115.66              -133.99
                          (270.31, 377.85)    (-114.67, 222.81)   (-106.22, 337.54)   (-342.15, 74.16)
Miles per Gallon          -8.83***            -2.77               -4.22               -3.03
                          (-11.40, -6.26)     (-7.04, 1.49)       (-9.67, 1.23)       (-7.34, 1.28)
Cylinders                                     23.98***            25.03***            28.33***
                                              (9.58, 38.38)       (10.35, 39.70)      (16.71, 39.95)
Weight (1000 lbs)                                                 -12.13              6.74
                                                                  (-40.32, 16.05)     (-17.01, 30.49)
Number of Forward Gear                                                                39.22***
                                                                                      (21.31, 57.12)
Observations              32                  32                  32                  32
R²                        0.60                0.71                0.72                0.83
Adjusted R²               0.59                0.69                0.69                0.81
Residual Std. Error       43.95               38.22               38.41               30.16
F Statistic               45.46***            35.37***            23.58***            33.31***
Note: *p<0.1; **p<0.05; ***p<0.01
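The intervals in the table appear consistent with a normal approximation, estimate ± 1.96 × SE, rather than with the t-based intervals that confint() returns. For the intercept of model (1), 324.08 ± 1.96 × 27.43 gives roughly (270.31, 377.85). The sketch below compares the two calculations in base R; the names ci_normal and ci_t are introduced here for illustration.

```r
# Model (1) from the side-by-side table
mr1 <- lm(hp ~ mpg, data = mtcars)
coefs <- coef(summary(mr1))

# Normal-approximation 95% interval: estimate +/- z * SE
z <- qnorm(0.975)
ci_normal <- cbind(lower = coefs[, "Estimate"] - z * coefs[, "Std. Error"],
                   upper = coefs[, "Estimate"] + z * coefs[, "Std. Error"])

# t-based 95% interval (uses the residual degrees of freedom)
ci_t <- confint(mr1, level = 0.95)

round(ci_normal, 2)
round(ci_t, 2)
```

With only 32 observations the t-based intervals are somewhat wider, so it is worth stating in the table notes which kind of interval is reported.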

More options

Stargazer has many more options, which are described, along with a few examples, in the package documentation (run ?stargazer in the console).

More than regressions

  • stargazer can also process objects of the class data.frame, which can be used simply to present the data in table format:
stargazer(head(mtcars), type = "html", summary = FALSE)
                   mpg     cyl  disp  hp   drat   wt     qsec    vs  am  gear  carb
Mazda RX4          21      6    160   110  3.900  2.620  16.460  0   1   4     4
Mazda RX4 Wag      21      6    160   110  3.900  2.875  17.020  0   1   4     4
Datsun 710         22.800  4    108   93   3.850  2.320  18.610  1   1   4     1
Hornet 4 Drive     21.400  6    258   110  3.080  3.215  19.440  1   0   3     1
Hornet Sportabout  18.700  8    360   175  3.150  3.440  17.020  0   0   3     2
Valiant            18.100  6    225   105  2.760  3.460  20.220  1   0   3     1
  • Or, to display the summary statistics of a data.frame:
stargazer(mtcars, type = "html")
Statistic  N   Mean     St. Dev.  Min     Pctl(25)  Pctl(75)  Max
mpg        32  20.091   6.027     10      15.4      22.8      34
cyl        32  6.188    1.786     4       4         8         8
disp       32  230.722  123.939   71      120.8     326       472
hp         32  146.688  68.563    52      96.5      180       335
drat       32  3.597    0.535     2.760   3.080     3.920     4.930
wt         32  3.217    0.978     1.513   2.581     3.610     5.424
qsec       32  17.849   1.787     14.500  16.892    18.900    22.900
vs         32  0.438    0.504     0       0         1         1
am         32  0.406    0.499     0       0         1         1
gear       32  3.688    0.738     3       3         4         5
carb       32  2.812    1.615     1       2         4         8

These tables, just like the regression tables, can be edited: you can specify custom variable names, which statistics to compute, etc.
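As a sketch of such customization, the call below restricts the summary table to three variables, relabels them, and keeps only a few statistics through the summary.stat argument. The choice of variables and labels is an assumption made for the example.

```r
library(stargazer)

# Keep three variables and five statistics, with readable labels
tbl <- stargazer(mtcars[, c("mpg", "hp", "wt")],
                 type = "text",
                 title = "Summary Statistics",
                 covariate.labels = c("Miles per Gallon",
                                      "Gross Horsepower",
                                      "Weight (1000 lbs)"),
                 summary.stat = c("n", "mean", "sd", "min", "max"),
                 digits = 2)
```

Change type to "html" or "latex" (with results='asis' in the chunk options) when knitting.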

Example: Correlations Table

  • Let’s see how we can use stargazer to present a correlation table of all the variables in the data.frame with respect to the dependent variable (hp) and the main independent variable (mpg):
# Computing correlations
corData <- cor(mtcars, use = "pairwise.complete.obs")

# Extracting columns for mpg and hp
corData <- corData[, colnames(corData) %in% c("hp", "mpg")]

# Naming columns and rows
colnames(corData) <- c("Miles per Gallon","Gross Horsepower")
row.names(corData) <- c("Miles per Gallon",
                       "Number of Cylinders",
                       "Displacement (cu.in.)",
                       "Gross Horsepower",
                       "Rear Axle Ratio",
                       "Weight (1000 lbs)",
                       "1/4 Mile Time",
                       "Engine (0 = V-shaped, 1 = Straight)",
                       "Transmission (0 = Automatic, 1 = Manual)",
                       "Number of Forward Gears",
                       "Number of Carburetors")

stargazer(corData, summary = FALSE,
          type = "html",
          title = "Correlation Table",
          digits = 2)
Correlation Table

                                          Miles per Gallon  Gross Horsepower
Miles per Gallon                          1                 -0.78
Number of Cylinders                       -0.85             0.83
Displacement (cu.in.)                     -0.85             0.79
Gross Horsepower                          -0.78             1
Rear Axle Ratio                           0.68              -0.45
Weight (1000 lbs)                         -0.87             0.66
1/4 Mile Time                             0.42              -0.71
Engine (0 = V-shaped, 1 = Straight)       0.66              -0.72
Transmission (0 = Automatic, 1 = Manual)  0.60              -0.24
Number of Forward Gears                   0.48              -0.13
Number of Carburetors                     -0.55             0.75