Statistical Analysis 9.1: Presenting regression results


Presenting results

Overview

This lesson introduces the stargazer package for presenting regression results.

Objectives

After completing this module, students should be able to:

Use the stargazer package to present nice-looking regression results.

Readings

Issues with `summary`

An important part of conducting research is presenting results to the rest of the world.
The raw output of the command summary(lm(y ~ x, data = data)) is hard to read, and it is often not the most appropriate format for sharing. Take a look at the output of a regression using the summary command.

data("mtcars")
mr1 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)
summary(mr1)


Call:
lm(formula = hp ~ mpg + cyl + wt + gear, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-39.076 -17.816  -2.354  14.645  67.680

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -133.995    106.205  -1.262 0.217859
mpg           -3.032      2.199  -1.379 0.179215
cyl           28.332      5.927   4.780  5.5e-05 ***
wt             6.739     12.116   0.556 0.582625
gear          39.215      9.134   4.293 0.000203 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 30.16 on 27 degrees of freedom
Multiple R-squared:  0.8315,    Adjusted R-squared:  0.8065
F-statistic: 33.31 on 4 and 27 DF,  p-value: 4.428e-10

There are a couple of things that are not ideal about the summary() output:

The variable names are coded. Unless the person reading the results is familiar with the dataset or an expert, it will be hard for them to figure out what each variable represents.
The Call and Residuals parts of the output are redundant in virtually all applications.
The output format is not editable.
Note that, given the coefficient estimate, the Std. Error, t-value, and p-value all convey the same information (the result of testing the hypothesis that the parameter is equal to zero). Therefore, we only need one of them. In fact, most published papers that use regressions present only the SE and omit the t-value and p-value.
summary only presents one regression output at a time. To save space and make our results easier to read, we’ll often present the output of several regressions side-by-side in the same table, especially when they share estimated parameters.
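
To make the point about redundancy concrete, here is a minimal base R sketch that rebuilds the t-values and p-values from the estimates and standard errors alone. The object names (fit, ctab, t_manual, p_manual) are made up for this illustration.

```r
# Fit the same model as above (base R only, mtcars ships with R)
fit <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Coefficient table from summary() as a numeric matrix
ctab <- coef(summary(fit))
est  <- ctab[, "Estimate"]
se   <- ctab[, "Std. Error"]

# The t-value is the estimate divided by its standard error
t_manual <- est / se

# The two-sided p-value follows from the t distribution
# with the residual degrees of freedom
p_manual <- 2 * pt(abs(t_manual), df = df.residual(fit), lower.tail = FALSE)

# Both comparisons should return TRUE
all.equal(t_manual, ctab[, "t value"])
all.equal(p_manual, ctab[, "Pr(>|t|)"])
```

Once the estimate and its standard error are reported, the other two columns can be recovered by the reader, which is why many tables drop them.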

Alternatives to `summary`

If you are not happy with the output of summary there are a couple of alternatives:

1. Write your own regression table function.
2. Use a package such as stargazer, jtools, huxtable (through its huxreg function), or xtable.

Option (1) can be time consuming and, more often than not, only a few types of users will benefit from the extra work. For most of our needs, at least for the topics discussed in this course, the stargazer package will suffice.$^1$

Installing and loading stargazer

To use stargazer you need to:

Install it once with install.packages("stargazer"):

# You will do this once
# in the console
#install.packages("stargazer", repos = "http://cran.us.r-project.org")

Then load it with the command library("stargazer"):

# You will need this in
# every file that requires
# stargazer. Only once per
# file.
library("stargazer")

##
## Please cite as:

##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer

To avoid getting those messages when loading packages, you can add the chunk option message=FALSE, as in {r message=FALSE}, to the R chunk.

$^1$ Note: If you are a perfectionist or obsessed with details, at some point you’ll probably write your own functions or even your own package, but for this course that’s not necessary.

Stargazer

Presenting regression outputs with the stargazer package:

Let’s take a look at the previous regression output, now using the stargazer command.

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          header = FALSE)

**Regression Results**

	Dependent variable:

	Gross Horsepower

Intercept	-133.995
	(106.205)

Miles per Gallon	-3.032
	(2.199)

Cylinders	28.332^***
	(5.927)

Weight (1000 lbs)	6.739
	(12.116)

Number of Forward Gear	39.215^***
	(9.134)


Observations	32
R²	0.832
Adjusted R²	0.807
Residual Std. Error	30.157 (df = 27)
F Statistic	33.310^*** (df = 4; 27)

Note:	*p<0.1; **p<0.05; ***p<0.01

This looks way better than the summary output, and, as we’ll see in a minute, it can be tweaked to your liking.

How it works

The stargazer function takes an R object (generally of class lm or data.frame), extracts the attributes of the object, and processes them into an Html or Latex table.

Let’s inspect the attributes of the regression mr1 that we just estimated:

attributes(mr1)

## $names
##  [1] "coefficients"  "residuals"     "effects"       "rank"
##  [5] "fitted.values" "assign"        "qr"            "df.residual"
##  [9] "xlevels"       "call"          "terms"         "model"
##
## $class
## [1] "lm"

All of these can be accessed manually using the $ symbol, e.g. mr1$coefficients.

mr1$coefficients

## (Intercept)         mpg         cyl          wt        gear
## -133.994984   -3.031787   28.332058    6.739500   39.215107

After stargazer extracts and processes the content of the object, it takes into account the options provided by the user and generates the output table. As we’ll see, we can use stargazer to:

Display the table results as part of an Rmarkdown document, or,
Export a table to the hard drive directly.

The second option is really useful if you are working in very large projects that use the same table in different documents.
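
As a rough sketch of the export route, stargazer accepts an out argument with a file path. The file name below is hypothetical, and a temporary folder is used only so that the example runs anywhere; in a real project you would point it to a folder shared by your documents.

```r
library(stargazer)

fit <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Hypothetical output location (temporary folder, for illustration only)
out_file <- file.path(tempdir(), "regression_table.html")

# stargazer also prints the table code to the console;
# capture.output() hides that copy, the file is still written
invisible(capture.output(
  stargazer(fit,
            type = "html",
            title = "Regression Results",
            out = out_file)
))

# Check that the table was saved
file.exists(out_file)
```

The saved file can then be included in other documents without re-running the regression.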

Rmarkdown and the `results = 'asis'` option

In R Markdown, stargazer tables require the results='asis' chunk option, like so: {r results='asis'}. Otherwise, knitr treats the stargazer output as ordinary R output and prints the raw Latex or Html code (depending on what you are knitting) instead of rendering it as a table.

See what happens if I don’t include the results='asis' option:

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          header = FALSE)

##
## <table style="text-align:center"><caption><strong>Regression Results</strong></caption>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td>Gross Horsepower</td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Intercept</td><td>-133.995</td></tr>
## <tr><td style="text-align:left"></td><td>(106.205)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Miles per Gallon</td><td>-3.032</td></tr>
## <tr><td style="text-align:left"></td><td>(2.199)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Cylinders</td><td>28.332<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(5.927)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Weight (1000 lbs)</td><td>6.739</td></tr>
## <tr><td style="text-align:left"></td><td>(12.116)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td style="text-align:left">Number of Forward Gear</td><td>39.215<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(9.134)</td></tr>
## <tr><td style="text-align:left"></td><td></td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>32</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.832</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.807</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>30.157 (df = 27)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>33.310<sup>***</sup> (df = 4; 27)</td></tr>
## <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>

As you can see, this is the Html code of the previous table, which is not being rendered because I didn’t specify the results='asis' option.

Stargazer in Latex

Since these lectures are knitted to Html, I always add the option type = "html" to the stargazer function. Your assignments, however, are knitted to .pdf via Latex, so you have to specify type = "latex" instead.
In either case, you always have to add the chunk option {r results='asis'}.
Below is example code to generate the table in Latex (you won’t see any table output, as this is an Html file).

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "latex",
          intercept.bottom = FALSE,
          header = FALSE)

Stargazer in the Editor

Unfortunately, when using the {r results='asis'} option, the table code is not processed until we knit the document. Because of that, when working in the editor, the stargazer output will simply display the code for the table and not the table itself, which is a bit annoying if you like to see the regression results in the editor.
If you feel strongly about inspecting the regression results in the editor, you can use type = "text" while working in the editor and then change it to type = "latex" when knitting the .pdf file.

stargazer(mr1,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "text",
          intercept.bottom = FALSE)

##
## Regression Results
## ==================================================
##                            Dependent variable:
##                        ---------------------------
##                             Gross Horsepower
## --------------------------------------------------
## Intercept                       -133.995
##                                 (106.205)
##
## Miles per Gallon                 -3.032
##                                  (2.199)
##
## Cylinders                       28.332***
##                                  (5.927)
##
## Weight (1000 lbs)                 6.739
##                                 (12.116)
##
## Number of Forward Gear          39.215***
##                                  (9.134)
##
## --------------------------------------------------
## Observations                       32
## R2                                0.832
## Adjusted R2                       0.807
## Residual Std. Error         30.157 (df = 27)
## F Statistic              33.310*** (df = 4; 27)
## ==================================================
## Note:                  *p<0.1; **p<0.05; ***p<0.01

It is not the nicest output, but it is customizable, can be viewed in the editor, and is still better than the summary output.
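
Switching type by hand is easy to forget. One possible workaround, sketched below, is a small helper that asks knitr which format is being produced. table_type is a hypothetical helper written for this illustration, not part of stargazer; it assumes the knitr package is installed and relies on knitr::is_latex_output() and knitr::is_html_output().

```r
library(stargazer)

# Hypothetical helper: pick the stargazer type from the knitting target.
# Outside of knitting (e.g. running a chunk in the editor) both checks
# are FALSE, so the helper falls back to a plain text table.
table_type <- function() {
  if (knitr::is_latex_output()) {
    "latex"
  } else if (knitr::is_html_output()) {
    "html"
  } else {
    "text"
  }
}

fit <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

stargazer(fit,
          title = "Regression Results",
          type = table_type(),
          header = FALSE)
```

The chunk still needs the results='asis' option for the Latex or Html version to be rendered when knitting.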

Options

stargazer has many options to tweak the format of a regression table to your liking. These are the most basic ones:

Variable Labels

dep.var.labels allows you to provide a custom name for the dependent variable ($y$ variable).
covariate.labels does the same for the independent variables ($x$ variables).

The input to both options is a character vector, so you have to specify it with c().

Space

The no.space = TRUE option removes the empty lines between rows. This is useful if you want a more compact regression table.
The header = FALSE option removes some comments appended to the table.
The omit.stat option determines which statistics are left out at the bottom. See the documentation for the code of each statistic that can be omitted.
Finally, the align = TRUE option aligns the numbers on their decimal points (this option only works in Latex, not in Html). For this option to work, you need to add the following to the YAML header of the .Rmd file:

header-includes:
   - \usepackage{dcolumn}
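
The space-related options above can be combined as in the following sketch. It uses type = "text" so the result can be inspected in the console; the statistic codes "f" (F statistic) and "ser" (residual standard error) are taken from the stargazer documentation, and the object names fit and compact are made up for this illustration.

```r
library(stargazer)

fit <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# capture.output() keeps the printed table as a character vector
compact <- capture.output(
  stargazer(fit,
            type = "text",
            dep.var.labels = c("Gross Horsepower"),
            covariate.labels = c("Miles per Gallon",
                                 "Cylinders",
                                 "Weight (1000 lbs)",
                                 "Number of Forward Gears"),
            no.space = TRUE,             # no empty lines between rows
            omit.stat = c("f", "ser"))   # drop F statistic and residual SE
)

# Print the compact table
cat(compact, sep = "\n")
```

Here the intercept is left at the bottom (the default), so only the four slope coefficients need labels.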

Regressions side-by-side

To present several regressions side-by-side we simply need to specify a list with each regression as the input of stargazer. For example,

# Running four regressions
mr1 <- lm(hp ~ mpg, data = mtcars)
mr2 <- lm(hp ~ mpg + cyl, data = mtcars)
mr3 <- lm(hp ~ mpg + cyl + wt, data = mtcars)
mr4 <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

# Collecting regressions in a list
regs <- list(mr1, mr2, mr3, mr4)

# Displaying regressions side-by-side
stargazer(regs,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          df = FALSE) # omits d.o.f of F-test

**Regression Results**

	Dependent variable:

	Gross Horsepower
	(1)	(2)	(3)	(4)

Intercept	324.082^***	54.067	115.662	-133.995
	(27.433)	(86.093)	(113.207)	(106.205)

Miles per Gallon	-8.830^***	-2.775	-4.220	-3.032
	(1.310)	(2.177)	(2.778)	(2.199)

Cylinders		23.979^***	25.025^***	28.332^***
		(7.346)	(7.486)	(5.927)

Weight (1000 lbs)			-12.135	6.739
			(14.382)	(12.116)

Number of Forward Gear				39.215^***
				(9.134)


Observations	32	32	32	32
R²	0.602	0.709	0.716	0.832
Adjusted R²	0.589	0.689	0.686	0.807
Residual Std. Error	43.945	38.223	38.414	30.157
F Statistic	45.460^***	35.373^***	23.585^***	33.310^***

Note:	*p<0.1; **p<0.05; ***p<0.01

Confidence Intervals

In some instances, you may be interested in presenting the confidence intervals (C.I.) for the parameters instead of the standard errors. You can do so with the option ci = TRUE.

stargazer(regs,
          title = "Regression Results",
          dep.var.labels = c("Gross Horsepower"),
          covariate.labels = c("Intercept",
                               "Miles per Gallon",
                               "Cylinders",
                               "Weight (1000 lbs)",
                               "Number of Forward Gear"),
          type = "html",
          intercept.bottom = FALSE,
          df = FALSE,
          ci = TRUE,
          digits = 2)

**Regression Results**

	Dependent variable:

	Gross Horsepower
	(1)	(2)	(3)	(4)

Intercept	324.08^***	54.07	115.66	-133.99
	(270.31, 377.85)	(-114.67, 222.81)	(-106.22, 337.54)	(-342.15, 74.16)

Miles per Gallon	-8.83^***	-2.77	-4.22	-3.03
	(-11.40, -6.26)	(-7.04, 1.49)	(-9.67, 1.23)	(-7.34, 1.28)

Cylinders		23.98^***	25.03^***	28.33^***
		(9.58, 38.38)	(10.35, 39.70)	(16.71, 39.95)

Weight (1000 lbs)			-12.13	6.74
			(-40.32, 16.05)	(-17.01, 30.49)

Number of Forward Gear				39.22^***
				(21.31, 57.12)


Observations	32	32	32	32
R²	0.60	0.71	0.72	0.83
Adjusted R²	0.59	0.69	0.69	0.81
Residual Std. Error	43.95	38.22	38.41	30.16
F Statistic	45.46^***	35.37^***	23.58^***	33.31^***

Note:	*p<0.1; **p<0.05; ***p<0.01
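
The intervals in the table above look consistent with a normal approximation (the estimate plus or minus roughly 1.96 standard errors) rather than with the t-based intervals returned by confint(). The following base R sketch checks this by hand for the fourth model; the object names (fit, ci_normal, ci_t) are made up for this illustration.

```r
# Fourth model from the side-by-side table
fit <- lm(hp ~ mpg + cyl + wt + gear, data = mtcars)

ctab <- coef(summary(fit))
est  <- ctab[, "Estimate"]
se   <- ctab[, "Std. Error"]

# 95% intervals using the normal quantile (about 1.96)
z <- qnorm(0.975)
ci_normal <- cbind(lower = est - z * se,
                   upper = est + z * se)

# 95% intervals using the t distribution (what confint() does for lm)
ci_t <- confint(fit, level = 0.95)

round(ci_normal, 2)
round(ci_t, 2)
```

For the intercept, the normal-approximation interval rounds to (-342.15, 74.16), matching column (4) of the table. The confint() intervals are somewhat wider because they use the t distribution with 27 degrees of freedom, so keep that difference in mind when comparing tables across tools.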

More options

Stargazer has many more options, which are described here and demonstrated via a few examples here.

More than regressions

stargazer can also process objects of class data.frame, which can be used simply to present the data in table format:

stargazer(head(mtcars), type = "html", summary = FALSE)


	mpg	cyl	disp	hp	drat	wt	qsec	vs	am	gear	carb

Mazda RX4	21	6	160	110	3.900	2.620	16.460	0	1	4	4
Mazda RX4 Wag	21	6	160	110	3.900	2.875	17.020	0	1	4	4
Datsun 710	22.800	4	108	93	3.850	2.320	18.610	1	1	4	1
Hornet 4 Drive	21.400	6	258	110	3.080	3.215	19.440	1	0	3	1
Hornet Sportabout	18.700	8	360	175	3.150	3.440	17.020	0	0	3	2
Valiant	18.100	6	225	105	2.760	3.460	20.220	1	0	3	1

Or, to display the summary statistics of a data.frame:

stargazer(mtcars, type = "html")


Statistic	N	Mean	St. Dev.	Min	Pctl(25)	Pctl(75)	Max

mpg	32	20.091	6.027	10	15.4	22.8	34
cyl	32	6.188	1.786	4	4	8	8
disp	32	230.722	123.939	71	120.8	326	472
hp	32	146.688	68.563	52	96.5	180	335
drat	32	3.597	0.535	2.760	3.080	3.920	4.930
wt	32	3.217	0.978	1.513	2.581	3.610	5.424
qsec	32	17.849	1.787	14.500	16.892	18.900	22.900
vs	32	0.438	0.504	0	0	1	1
am	32	0.406	0.499	0	0	1	1
gear	32	3.688	0.738	3	3	4	5
carb	32	2.812	1.615	1	2	4	8

These tables, just like the regression tables, can be edited: you can specify custom variable names, which statistics to compute, etc.
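
As an illustration of that kind of editing, the sketch below restricts the summary table to three variables, gives them readable labels, and keeps only some of the statistics through the summary.stat option. The object names vars and summ are made up for this example, and type = "text" is used so the result can be checked in the console.

```r
library(stargazer)

# Keep a few variables only
vars <- mtcars[, c("hp", "mpg", "wt")]

summ <- capture.output(
  stargazer(vars,
            type = "text",
            title = "Summary Statistics",
            covariate.labels = c("Gross Horsepower",
                                 "Miles per Gallon",
                                 "Weight (1000 lbs)"),
            # Statistics to report; the percentiles are left out
            summary.stat = c("n", "mean", "sd", "min", "max"),
            digits = 2)
)

# Print the customized summary table
cat(summ, sep = "\n")
```

The labels in covariate.labels have to follow the column order of the data.frame passed to stargazer.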

Example: Correlations Table

Let’s see how we can use stargazer to present a correlation table of all the variables in the data.frame with respect to the dependent variable (hp) and the main independent variable (mpg):

# Computing correlations
corData <- cor(mtcars, use = "pairwise.complete.obs")

# Extracting columns for mpg and hp
corData <- corData[, colnames(corData) %in% c("hp", "mpg")]

# Naming columns and rows
colnames(corData) <- c("Miles per Gallon","Gross Horsepower")
row.names(corData) <- c("Miles per Gallon",
                       "Number of Cylinders",
                       "Displacement (cu.in.)",
                       "Gross Horsepower",
                       "Rear Axle Ratio",
                       "Weight (1000 lbs)",
                       "1/4 Mile Time",
                       "Engine (0 = V-shaped, 1 = Straight)",
                       "Transmission (0 = Automatic, 1 = Manual)",
                       "Number of Forward Gears",
                       "Number of Carburetors")

stargazer(corData, summary = FALSE,
          type = "html",
          title = "Correlation Table",
          digits = 2)

**Correlation Table**

	Miles per Gallon	Gross Horsepower

Miles per Gallon	1	-0.78
Number of Cylinders	-0.85	0.83
Displacement (cu.in.)	-0.85	0.79
Gross Horsepower	-0.78	1
Rear Axle Ratio	0.68	-0.45
Weight (1000 lbs)	-0.87	0.66
1/4 Mile Time	0.42	-0.71
Engine (0 = V-shaped, 1 = Straight)	0.66	-0.72
Transmission (0 = Automatic, 1 = Manual)	0.60	-0.24
Number of Forward Gears	0.48	-0.13
Number of Carburetors	-0.55	0.75
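
The code above assigns row names by position, which silently mislabels rows if the column order of the data ever changes. A slightly more defensive variant, sketched here in base R, keeps the labels in a named vector and looks them up by variable name. The vector var_labels is made up for this illustration and reuses the labels from above.

```r
# Lookup table: variable name -> display label
var_labels <- c(
  mpg  = "Miles per Gallon",
  cyl  = "Number of Cylinders",
  disp = "Displacement (cu.in.)",
  hp   = "Gross Horsepower",
  drat = "Rear Axle Ratio",
  wt   = "Weight (1000 lbs)",
  qsec = "1/4 Mile Time",
  vs   = "Engine (0 = V-shaped, 1 = Straight)",
  am   = "Transmission (0 = Automatic, 1 = Manual)",
  gear = "Number of Forward Gears",
  carb = "Number of Carburetors"
)

# Correlations of every variable with mpg and hp, selected by name
corData <- cor(mtcars, use = "pairwise.complete.obs")[, c("mpg", "hp")]

# Stop early if a variable has no label
stopifnot(all(rownames(corData) %in% names(var_labels)))

# Replace the coded names with labels, matched by name rather than position
colnames(corData) <- unname(var_labels[colnames(corData)])
rownames(corData) <- unname(var_labels[rownames(corData)])

round(corData, 2)
```

The resulting matrix can be passed to stargazer with summary = FALSE exactly as before.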