Problem 1

One of the most famous datasets in social science is the Lalonde dataset, which is used extensively by those interested in making causal inferences. (You’ll learn more about this dataset when you learn about matching techniques in Gov 2001.) This data was used in the late 1970s to explore the efficacy of job training programs for the working poor. Basically, participants were assigned randomly either to a job training program or to do nothing. The researchers then tracked the participants to see if their 1978 earnings went up after the study was completed.

This problem will walk you through the basics of providing summary statistics in R. First, download the data from the course website and save it to your working directory. Load it into R using the following command:

data 2).

1. This is known as “Monte Carlo simulation”. For more complicated distributions, the integrals for the expectation are often intractable, making simulation necessary.
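As a rough illustration of the idea in this footnote, the sketch below estimates an expectation by averaging simulated draws. It is written in Python for concreteness; R's `sample` and `mean` would serve the same purpose. The fair six-sided die, the seed, and the function name `mc_expectation` are assumptions chosen for illustration and do not come from the problem set.

```python
import random

def mc_expectation(draw, n_sims=100_000, seed=0):
    """Approximate E[X] by averaging n_sims independent draws of X."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_sims):
        total += draw(rng)
    return total / n_sims

# Hypothetical example: X is the roll of a fair six-sided die,
# for which the exact expectation is 3.5.
estimate = mc_expectation(lambda rng: rng.randint(1, 6))
print(estimate)
```

The estimate gets closer to the exact value as `n_sims` grows, which is what makes simulation useful when the exact sum or integral is hard to compute.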


Problem 3

There is a great deal of literature (see footnote 2) indicating that the ordering of candidates on a ballot has an effect on their vote-share. In an attempt to increase the fairness of the process, New Hampshire randomly chooses a letter from the alphabet (meaning every letter A-Z has equal probability of being chosen) and then lists the candidates in alphabetical order starting with that letter. Our goal is to characterize the distribution of possible ballot placements for Barack Obama. Let the random variable X be Barack Obama’s position on the New Hampshire primary ballot.

To simplify the problem slightly, assume the shortened candidate list below.

Joe Biden
Hillary Clinton
John Edwards
Barack Obama
Bill Richardson

Table 1: Candidates in a fictitious version of the 2008 New Hampshire primary.
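To make the ordering rule concrete, here is a minimal Python sketch that builds the ballot for a single drawn letter. It assumes that candidates are ordered by the first letter of their surname and that the alphabet wraps around from Z back to A; the function name `ballot_order` is hypothetical.

```python
import string

# Surnames from Table 1; ordering by surname is an assumption of this sketch.
CANDIDATES = ["Biden", "Clinton", "Edwards", "Obama", "Richardson"]

def ballot_order(letter, candidates=CANDIDATES):
    """Return the ballot order when `letter` is the randomly drawn letter.

    Names are listed alphabetically starting from the drawn letter,
    wrapping around from Z back to A. Only the first letter of each
    surname is used for the wrap-around; ties fall back to the full name.
    """
    alphabet = string.ascii_uppercase
    start = alphabet.index(letter.upper())
    return sorted(
        candidates,
        key=lambda name: ((alphabet.index(name[0].upper()) - start) % 26, name),
    )

order = ballot_order("D")
print(order)                     # ballot order if the letter D is drawn
print(order.index("Obama") + 1)  # Obama's ballot position for this draw
```

If D is drawn, no surname starts with D, so the ballot begins with Edwards and wraps around to end with Biden and Clinton.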

A) Write down the PMF of Barack Obama’s ballot position under New Hampshire’s ballot ordering rule and the shortened list of candidates.

B) Gov 2000 only: Create two graphs showing (1) the PMF from part A and (2) the CDF of the distribution. Make sure the graphs are well organized, legible, and have informative labels and titles.

C) Calculate the expected value of X (Barack Obama’s ballot position) and the variance of X. Show your work.
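For reference, the standard definitions for a discrete random variable $X$ are given below. The symbol $p(x) = P(X = x)$ for the PMF is notation assumed here, not taken from the problem.

```latex
\begin{align*}
E[X] &= \sum_{x} x \, p(x), \\
E[X^2] &= \sum_{x} x^2 \, p(x), \\
\operatorname{Var}(X) &= E\big[(X - E[X])^2\big] = E[X^2] - \big(E[X]\big)^2 .
\end{align*}
```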

2. Some examples in this literature: less technical (http://www.nytimes.com/2006/11/04/opinion/04krosnick.html) and more technical (http://imai.princeton.edu/research/files/alphabet.pdf).


D) Assume that the ballot ordering effects are the same for all candidates and are known to be the following for a five-candidate ballot:

Ballot Position (X)    Vote-share bump relative to 5th place (%)
1                      3.56
2                      2.72
3                      1.04
4                      0.372
5                      0

Table 2: Ballot order effects in a fictitious version of the 2008 New Hampshire primary.

Calculate the expected ballot order effect for Barack Obama.
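The expected effect is the expectation of a function of X: E[g(X)] is the sum over positions x of g(x) P(X = x), where g maps a ballot position to its vote-share bump in Table 2. A minimal Python sketch of that calculation follows. The uniform PMF is a placeholder only and is not the answer to part A; the names `BUMP` and `expected_effect` are hypothetical.

```python
# Vote-share bump relative to 5th place (%), by ballot position, from Table 2.
BUMP = {1: 3.56, 2: 2.72, 3: 1.04, 4: 0.372, 5: 0.0}

def expected_effect(pmf, bump=BUMP):
    """Return E[g(X)] = sum over x of g(x) * P(X = x)."""
    if abs(sum(pmf.values()) - 1.0) > 1e-9:
        raise ValueError("PMF probabilities must sum to 1")
    return sum(bump[x] * p for x, p in pmf.items())

# Placeholder PMF (uniform over the five positions).
# Substitute the PMF you derive in part A.
placeholder_pmf = {x: 1 / 5 for x in BUMP}
print(expected_effect(placeholder_pmf))
```

The same function can be reused for part E by passing in the PMF of a different candidate's ballot position.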

E) Now for comparison, calculate the expected ballot order effect for Hillary Clinton. Do you think New Hampshire’s ballot ordering scheme is fair? Explain using your calculations for this problem.

Challenge Problem (Extra Credit)

Suppose we have iid random variables $X_1, \dots, X_n$ with $E[X_i] = \mu$ for all $i = 1, \dots, n$, and we also have constants $c_1, \dots, c_n$. Derive the condition that must hold in order for $E\left[\sum_{i=1}^{n} c_i X_i\right] = \mu$. Suppose we further restrict $c_i \geq 0$ for all $i = 1, \dots, n$. Interpret these constants and state in words what the condition you derived now means for the estimator of $\mu$.
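A natural starting point is linearity of expectation, which holds for any constants $c_i$. Writing $\mu$ for the common mean $E[X_i]$ (a symbol assumed here), it gives:

```latex
E\Big[\sum_{i=1}^{n} c_i X_i\Big]
  = \sum_{i=1}^{n} c_i \, E[X_i]
  = \mu \sum_{i=1}^{n} c_i .
```

Setting the right-hand side equal to $\mu$ then yields the condition on the constants.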
