One of the most famous datasets in social science is the Lalonde dataset, which is used extensively by those
interested in making causal inferences. (You’ll learn more about this dataset when you learn about matching
techniques in Gov 2001.) This data was used in the late 1970s to explore the efficacy of job training programs
for the working poor. Basically, participants were assigned randomly either to a job training program or to
do nothing. The researchers then tracked the participants to see if their 1978 earnings went up after the
study was completed.
This problem will walk you through the basics of computing summary statistics in R. First, download the data
from the course website and save it to your working directory. Load it into R using the following command:
Footnote 1: This is known as “Monte Carlo simulation”. In more complicated distributions, the integrals for the expectation can often
be intractable, making simulation necessary.
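The Monte Carlo idea in the footnote above can be illustrated with a minimal sketch, written in Python for illustration (the same logic carries over to R). It approximates the expectation of a fair six-sided die by averaging simulated draws. The die, the function name `mc_expectation`, the seed, and the number of draws are all assumptions chosen for this example and are not part of the problem set.

```python
import random

def mc_expectation(draw, n_draws=100_000, seed=2000):
    """Approximate E[X] by the average of n_draws simulated values of X."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_draws):
        total += draw(rng)
    return total / n_draws

# Hypothetical example: X is the face shown by a fair six-sided die.
estimate = mc_expectation(lambda rng: rng.randint(1, 6))
exact = sum(range(1, 7)) / 6  # analytic expectation, for comparison
print(f"simulated: {estimate:.3f}, exact: {exact}")
```

Here the exact answer is easy to compute, so the simulation is only a check; the approach matters when no closed form is available.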
There is a great deal of literature (see footnote 2) indicating that the ordering of candidates on a ballot has an effect on
their vote-share. In an attempt to increase the fairness of the process, New Hampshire randomly chooses
a letter from the alphabet (meaning every letter A-Z has equal probability of being chosen) and then lists
the candidates in alphabetical order starting with that letter. Our goal is to characterize the distribution of
possible ballot placements for Barack Obama. Let the random variable X be Barack Obama’s position on
the New Hampshire primary ballot.
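The ordering rule described above can be sketched as follows (Python, for illustration). The surnames are hypothetical placeholders and are not the candidate list used in this problem. The helper names `ballot_order` and `position_pmf`, and the choice to break ties within a letter by ordinary alphabetical order, are assumptions of this sketch.

```python
import string
from collections import Counter
from fractions import Fraction

LETTERS = string.ascii_uppercase

def ballot_order(surnames, start_letter):
    """List surnames alphabetically, starting at start_letter and wrapping from Z to A."""
    start = LETTERS.index(start_letter.upper())

    def rotated_key(name):
        # Forward distance from the drawn letter to the surname's initial.
        offset = (LETTERS.index(name[0].upper()) - start) % 26
        return (offset, name.upper())  # assumption: ties broken alphabetically

    return sorted(surnames, key=rotated_key)

def position_pmf(surnames, target):
    """PMF of target's ballot position when each of the 26 letters is equally likely."""
    counts = Counter(
        ballot_order(surnames, letter).index(target) + 1 for letter in LETTERS
    )
    return {pos: Fraction(n, 26) for pos, n in sorted(counts.items())}

# Hypothetical surnames, NOT the candidates from the problem.
names = ["Alvarez", "Dunn", "Kim", "Reyes", "Young"]
print(ballot_order(names, "L"))
print(position_pmf(names, "Kim"))
```

With the letter L drawn, the hypothetical list is ordered Reyes, Young, Alvarez, Dunn, Kim. Enumerating all 26 letters, rather than simulating draws, gives exact probabilities because the sample space is small.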
To simplify the problem slightly, assume the shortened candidate list below.
Table 1: Candidates in a fictitious version of the 2008 New Hampshire primary.
A) Write down the PMF of Barack Obama’s ballot positioning with New Hampshire’s ballot ordering rule
and the shortened list of candidates.
B) Gov 2000 only: Create two graphs showing (1) the PMF from part A and (2) the CDF of the
distribution. Make sure the graphs are well organized, legible, and have informative labels and titles.
C) Calculate the expected value of X (Barack Obama’s ballot position) and the variance of X. Show your work.
Footnote 2: Some examples in this literature: less technical (http://www.nytimes.com/2006/11/04/opinion/04krosnick.html) and
more technical (http://imai.princeton.edu/research/files/alphabet.pdf).
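The expectation and variance of a discrete random variable follow directly from its PMF. A minimal sketch in Python, using a hypothetical three-point PMF rather than the answer to part A; the function names `expectation` and `variance` are illustrative.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * P(X = x); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    return expectation(pmf, lambda x: x**2) - expectation(pmf) ** 2

# Hypothetical PMF for illustration only (probabilities must sum to 1).
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
print(expectation(pmf))
print(variance(pmf))
```

Using `Fraction` keeps the arithmetic exact, which makes it easier to compare against a hand calculation.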
D) Assume that the ballot ordering effects are the same for all candidates and are known to be the following
for a five-candidate ballot:
Ballot Position (X) Vote-share bump relative to 5th place (%)
Table 2: Ballot order effects in a fictitious version of the 2008 New Hampshire primary.
Calculate the expected ballot order effect for Barack Obama.
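The expected ballot order effect is the expectation of a function of X. Writing g(x) for the vote-share bump at ballot position x (the notation g is introduced here for illustration and does not appear in the original), the general formula for a five-candidate ballot is:

```latex
E[g(X)] = \sum_{x=1}^{5} g(x)\, P(X = x)
```

Each term weights the bump at a position by the probability of landing in that position, so the PMF from part A is the only other ingredient needed.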
E) Now for comparison, calculate the expected ballot order effect for Hillary Clinton. Do you think New
Hampshire’s ballot ordering scheme is fair? Explain using your calculations for this problem.
Challenge Problem (Extra Credit)
Suppose we have iid random variables $X_1, \ldots, X_n$ with $E[X_i] = \mu$ for all $i = 1, \ldots, n$ and we also have constants
$c_1, \ldots, c_n$. Derive the condition that must hold in order for $E\left[\sum_{i=1}^{n} c_i X_i\right] = \mu$. Suppose we further restrict
$c_i \geq 0$ for all $i = 1, \ldots, n$. Interpret these constants and state in words what the condition you derived now
means for the estimator of $\mu$.