Reanalysis of Esther Duflo’s study of school building in Indonesia

Esther Duflo’s career as an economist can be said to have started with the first chapter of her PhD thesis, which estimated the impact of a big schooling expansion in Indonesia in the 1970s on the future earnings of the children who went to the schools. The final version remains important in the literature on the impacts of education in developing countries.

Since Open Philanthropy has been investigating education as a possible area for involvement, I revisited the paper in my own way. I went back to the original data (not easy!), thought critically about the methods, and added new rounds of data about the earnings of those children later in life. As often happens, I wind up pretty skeptical, though not without some self-doubt. The “natural experiment” in the paper is not the kind of clean experiment that Duflo would later champion. So it stands to reason that the tougher the standard of evidence you bring to it, the less convincing it will be.

My paper is on arXiv and a long but comparatively non-technical post is just up on the EA Forum.

Fast and wild: new paper on my “boottest” program

Three coauthors and I just released a working paper that explains what the wild cluster bootstrap is, how to extend it to various econometric contexts, how to make it go really fast, and how to do it all with my “boottest” program for Stata. The paper is meant to be pedagogic, as most of the methodological ideas are not new. The novel ideas pertain mainly to techniques for speeding up the bootstrap, and to something called Restricted Limited-Information Maximum Likelihood estimation. The title is “Fast and Wild: Bootstrap Inference in Stata Using boottest.”

A few years ago I read the clever study by Kevin Croke that turned a short-term deworming impact study into a long-term one. Back in 2006, Harold Alderman and coauthors reported on a randomized study in Uganda of whether routinely giving children albendazole, a deworming pill, increased their weight. (Most of these children were poorly enough off that any weight gain was probably a sign of improved health.) In that study, the average lag from treatment to follow-up was 16.6 months. But randomized trials, as I like to say, are like the drop of a pebble in a pond: their ripples continue to radiate. Kevin followed up much later on the experiment by linking it to survey data from Uwezo on the ability of Ugandan children to read and do math, gathered in 2010–11. He obtained reading and math scores for some 700 children in parishes (groups of villages) that had been part of the experiment. This let him turn a study of short-term effects on weight gain into one of long-term effects on academic ability.

In a standard move, the Croke paper clusters standard errors by parish, to combat the false precision that might arise if outcomes are correlated for children within a parish for unmeasured reasons. And because there are relatively few parishes—10 in the treatment group, 12 in the control—the paper uses the “wild cluster bootstrap” to interpret the results. This method has become popular since Cameron, Gelbach, and Miller proposed it about 10 years ago.
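To make the idea concrete, here is a minimal sketch of a wild cluster bootstrap test in plain Python. It is not the algorithm inside boottest, cgmreg, or cgmwildboot, and it uses none of the speed-ups discussed in the new paper. It assumes a regression of an outcome on a constant and one regressor, tests the null that the slope is zero, imposes that null when forming residuals, and flips the sign of all residuals in a cluster together (Rademacher weights). The function names and the toy data generator are hypothetical, made up for illustration.

```python
import random

def slope_t_stat(x, y, cluster):
    """t statistic for the slope in y = a + b*x + e, with cluster-robust variance."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    xt = [xi - x_bar for xi in x]                # x with the constant partialled out
    sxx = sum(v * v for v in xt)
    b = sum(v * yi for v, yi in zip(xt, y)) / sxx
    resid = [yi - y_bar - b * v for v, yi in zip(xt, y)]
    scores = {}                                  # sum of x*residual within each cluster
    for v, e, g in zip(xt, resid, cluster):
        scores[g] = scores.get(g, 0.0) + v * e
    var_b = sum(s * s for s in scores.values()) / sxx ** 2
    return b / var_b ** 0.5

def wild_cluster_bootstrap_p(x, y, cluster, reps=999, seed=0):
    """Two-sided bootstrap p value for H0: slope = 0, with the null imposed."""
    rng = random.Random(seed)
    t_obs = slope_t_stat(x, y, cluster)
    y_bar = sum(y) / len(y)
    u = [yi - y_bar for yi in y]                 # residuals from the restricted model
    groups = sorted(set(cluster))
    exceed = 0
    for _ in range(reps):
        # one Rademacher draw per cluster, shared by every observation in it
        w = {g: rng.choice((-1.0, 1.0)) for g in groups}
        y_star = [y_bar + w[g] * ui for ui, g in zip(u, cluster)]
        if abs(slope_t_stat(x, y_star, cluster)) >= abs(t_obs):
            exceed += 1
    return exceed / reps

def make_toy_data(effect, seed, treated=10, control=12, per_cluster=30):
    """Hypothetical data: cluster-level treatment plus a shared cluster shock."""
    rng = random.Random(seed)
    x, y, cluster = [], [], []
    for g in range(treated + control):
        d = 1.0 if g < treated else 0.0
        shock = rng.gauss(0, 1)
        for _ in range(per_cluster):
            x.append(d)
            y.append(effect * d + shock + rng.gauss(0, 1))
            cluster.append(g)
    return x, y, cluster

x, y, cluster = make_toy_data(effect=0.5, seed=1)
print(wild_cluster_bootstrap_p(x, y, cluster, reps=199))
```

The sketch omits small-sample corrections to the variance. A constant scale factor would multiply the actual and bootstrap t statistics alike, so it would not change the p value here.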

Kevin’s paper introduced me to this method. As a part of my effort to understand it, I wrote a code fragment to apply it. I quickly saw that the available programs for wild bootstrapping in Stata, cgmreg and cgmwildboot, were useful but could be dramatically improved upon, at least in speed. And so I wrote my own program, boottest, and shared it with the community of Stata users. As programs often do, this one grew in features and complexity, largely in response to feedback from users. In standard applications, like Kevin’s, the program is so damn fast it must seem like alchemy to new users, instantly returning results that would once have taken long enough that you could get a cup of coffee while you waited.

The new paper offers a pedagogic introduction to wild (cluster) bootstrapping. I’m pleased and honored to have coauthored it with James MacKinnon, Morten Nielsen, and Matthew Webb. James in particular is a giant in the field; he coauthored many of the papers that led to the development of the wild cluster bootstrap (among numerous other methods), as well as leading textbooks on econometrics.

The new paper also divulges the secrets of boottest’s speed. I think there’s a lesson here about just how much more efficiently mathematical code can sometimes be made to run when you carefully state and analyze the algorithm. And in computationally intensive techniques such as bootstraps, speed can matter.

Revised hookworm replication

After releasing and blogging a paper in December about the GiveWell replication of Hoyt Bleakley’s study of hookworm eradication in the American South, I submitted it to the Quarterly Journal of Economics, which published the original paper in 2007. Around the first of the year, QJE rejected the paper, enclosing comments from four reviewers, including from Bleakley. The comments were very helpful in identifying errors in the replication, suggesting new things to do, and pushing me to sharpen my thinking and writing.

I just posted a new version. The story does not change. As a result, I am more sure now that the relative gains in historically hookworm-burdened parts of the South continued trends that began well before and, in the case of income, continued well after. I made two significant substantive changes, both of which strengthen my skepticism.

Continue reading “Revised hookworm replication”

Disappointment about the war on worms in the American South 100 years ago

On GiveWell.org, I just blogged a new study revisiting the evidence on whether the campaign in the 1910s to rid the South of hookworm brought major benefits. A great 2007 paper by Hoyt Bleakley suggests that it did: after eradication, school attendance rose disproportionately in historically hookworm-heavy areas, and adult earnings of babies born in affected areas also later rose.

The new study revisits Bleakley’s original by reconstructing its database from primary sources, and replicating and revising the analysis. I ended up strongly questioning the original study’s conclusion. These two pairs of graphs show why. The first graph in each pair is from the original study, the second from the new version. The original graphs seem to show jumps in outcomes of interest (school attendance, earnings), but the new ones do not.
Continue reading “Disappointment about the war on worms in the American South 100 years ago”

Python program to scrape your solar panel production data from Enphase website


# prompts for Enphase Enlighten username and password
# then downloads panel-level production data for all panels, between dates hard-coded below
# time stamps expressed in Unix epoch time
# inverter ID numbers are not serial numbers; to determine those,
#   go to Devices tab on Enphase Enlighten site, hover mouse over hotlinked
#   serial numbers, and examine associated links
# saves to "Panelproduction.csv"
# prints each date for which data is scraped, along with number of inverters

import requests, csv, os, getpass
from datetime import timedelta, date
from bs4 import BeautifulSoup

start_date = date(2014, 3, 1)
end_date = date(2017, 11, 14)
user_name = input('User name: ')
password = getpass.getpass('Password: ') # this is only working for me in debug mode

os.chdir('C:\\[your csv destination path here]')

with open('Panelproduction.csv', 'w', newline='') as csvfile:
  writer = csv.writer(csvfile)
  writer.writerow(['Time','Inverter','Power'])

  with requests.Session() as s:
    # log in
    html = s.get('https://enlighten.enphaseenergy.com')
    soup = BeautifulSoup(html.text, 'html.parser')
    token = soup.find('input', attrs={'name': 'authenticity_token'})['value']
    payload = {'user[email]':user_name, 'user[password]':password, 'utf8':'✓', 'authenticity_token': token}
    html = s.post('https://enlighten.enphaseenergy.com/login/login', data=payload)

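    # step backward one day at a time from end_date down to the day after start_date
    # (loop variable named day so it does not shadow the imported date class)
    for day in (end_date - timedelta(n) for n in range((end_date - start_date).days)):
      payload = {'date': str(day)}
      data = s.get('https://enlighten.enphaseenergy.com/systems/[your system ID from URL]/inverter_data_x/time_series.json', params=payload).json()
      print(day, len(data))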
      for inverter, inverter_data in data.items():
        if inverter != 'date' and inverter != 'haiku':
          for datapoint in inverter_data['POWR']:
            writer.writerow([datapoint[0], inverter, datapoint[1]])

Four points on the debate over the impact of the Mariel boatlift

There’s been more back and forth this week in the argument over whether a giant influx of Cubans into Miami in 1980 lowered wages for low-education people already living there. A seminal 1990 paper by David Card said no. A 2015 reanalysis by immigration skeptic (and Cuban immigrant) George Borjas said yes. A 2015 blog post by me and a paper by Giovanni Peri and Vasil Yasenov said I don’t think so. And now Michael Clemens and Jennifer Hunt, both of whose work appears in my immigration evidence review, have announced the discovery of what they term a flaw in the Borjas analysis. It turns out that just as the Marielitos began arriving, the Census Bureau sharply increased its coverage of black Miamians in the surveys it conducts to monitor the pulse of the U.S. economy. Since black Miamians had especially low incomes, the racial shift had the power to generate the (apparent) wage decline that Borjas highlights. Borjas retorted on Tuesday, labeling the criticism “fake news.”

So, once more, academics are arguing. And concerned observers are confused by the dueling contentions and graphs. In an attempt to clarify, I’ll make a few points.

Disclosures and disclaimers: I used to work for the Center for Global Development, where I was a colleague of Michael Clemens. Now I work for the Open Philanthropy Project, which provides general support to CGD and specific support for Michael’s work on migration. This blog post represents my personal views and does not speak for the Open Philanthropy Project.

Four points:

Continue reading “Four points on the debate over the impact of the Mariel boatlift”

Worms and more worms

I just finished the second of two posts for GiveWell on the heated academic controversy over whether it is a good idea to mass-deworm children in regions where parasitic worm infections are common. The first post focusses on the “internal validity” of a particularly influential study that took place along Lake Victoria, in Kenya, in the late 1990s. The second thinks through how safely we can generalize from that study to other times and places. It has a lot more graphs, including some that look pretty wormy…

 

On the geometric interpretation of the determinant of a matrix

Most econometric methods are buttressed by mathematical proofs buried somewhere in academic journals that the methods converge to perfect reliability as sample size goes to infinity. Most arguments in econometrics are over how best to proceed when your data put you very far from the theoretical ideal. Prime examples are when your data are clustered (some villages get bednets and some don’t) and there are few clusters; and when instruments are weak (people offered microcredit were only slightly more likely to take it).

Mucking about in such debates recently, as they pertain to criminal justice studies I’m reviewing, I felt an urge to get back to basics, by which I mean to better understand the mathematics of methods such as LIML. That led me back to linear algebra. So I’ve been trying to develop stronger intuitions about such things as: how a square matrix can have two meanings (a set of basis vectors for a linear space, and the variances and covariances of a set of vectors); and what the determinant really is.
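
As a small illustration of the geometric reading, the sketch below checks in two dimensions that the absolute value of the determinant equals the area of the parallelogram spanned by the matrix's columns, with the area computed independently as base times height. It is only a numerical check under that 2×2 assumption, not a proof, and the function names are made up for illustration.

```python
import math

def det2(a, b):
    """Determinant of the 2x2 matrix whose columns are the vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]

def parallelogram_area(a, b):
    """Area spanned by a and b, from base times height (no determinant formula)."""
    base = math.hypot(a[0], a[1])
    # remove from b its projection onto a; what remains is the height vector
    scale = (a[0] * b[0] + a[1] * b[1]) / (base * base)
    perp = (b[0] - scale * a[0], b[1] - scale * a[1])
    return base * math.hypot(perp[0], perp[1])

a, b = (3.0, 1.0), (1.0, 2.0)
print(det2(a, b), parallelogram_area(a, b))
print(det2(b, a))  # swapping the columns flips the sign but not the area
```

The sign of the determinant records orientation: swapping the two columns leaves the parallelogram unchanged but negates the determinant.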

Continue reading “On the geometric interpretation of the determinant of a matrix”

Murder mystery

I started studying the causes and consequences of incarceration for the Open Philanthropy Project. The subject is full of mysteries. Here’s one.

As best we can measure, the US crime rate rose from the mid-1960s to the early 1990s and then reversed:

US crime rate 1960-2012

(Following FBI definitions, this graph is of “Part I” crimes and excludes drug crime, white collar crime, drunk driving offenses, traffic violations, and other minor crimes. The property crime rate is graphed against the right axis, the violent crime rate against the left.)

The strange thing is, the experts aren’t completely sure why the rise and fall. Continue reading “Murder mystery”