Saturday, February 25, 2017

Arrow, Edgeworth, and Millicent Garrett Fawcett

There's not much one can say about Kenneth Arrow that hasn't already been said, but there's one personal story that I can add to all the tributes and remembrances. 

I met Arrow just once, at a Stanford conference in April 2008 that he and Matt Jackson jointly organized. While everyone else was seated around the outside of a large ring of tables, Arrow was on the inside, directly in front of the speaker. He was 86 at the time.

I was first up, presenting an early version of a paper with Sam Bowles and Glenn Loury on group inequality. Arrow interrupted me within the first couple of minutes – not aggressively at all, just seeking clarification about the information structure. Then, during a coffee break after the talk, he asked if I’d read a piece by Millicent Fawcett on gender wage inequality, published in the Economic Journal in 1892. That’s not a typo – he really meant 1892. I confessed that I hadn't.

Arrow said that Fawcett’s work was extensively discussed in a 1922 presidential address by Francis Edgeworth, but while many were familiar with the Edgeworth lecture, few had bothered to read Fawcett herself. 

It’s true. Edgeworth mentioned “Mrs. Fawcett” seven times in his address, and cited three separate pieces by her. His lecture was on “Equal Pay to Men and Women for Equal Work,” and one of the papers he referenced was “Equal Pay for Equal Work,” published by Fawcett in 1918. Here’s how the latter begins:

I didn’t realize it at the time, but Dame Millicent Garrett Fawcett was every bit as remarkable as Edgeworth and Arrow, and economics was the least of her accomplishments. I imagine that Arrow saw in her a kindred spirit.

Wednesday, December 14, 2016

Thomas Schelling, Methodological Subversive

Thomas Schelling died at the age of 95 yesterday.

At a time when economic theory was becoming virtually synonymous with applied mathematics, he managed to generate deep insights into a broad range of phenomena using only close observation, precise reasoning, and simple models that were easily described but had complex and surprising properties.

This much, I think, is widely appreciated. But what also characterized his work was a lack of concern with professional methodological norms. This allowed him to generate new knowledge with great freedom, and to make innovations in method that may end up being even more significant than his specific insights into economic and social life. 

Consider, for instance, his famous "checkerboard" model of self-forming neighborhoods, first introduced in a memorandum in 1969, with versions published in a 1971 article and in his 1978 book Micromotives and Macrobehavior. This model is simple enough to be described verbally in a couple of paragraphs, but has properties that are extremely difficult to deduce analytically. It is also among the very earliest agent-based computational models, reveals some limitations of the equilibrium approach in economic theory, and continues to guide empirical research on residential segregation.

Here's the model. There is a set of individuals partitioned into two groups; let's call them pennies and dimes. Each individual occupies a square on a checkerboard, and has preferences over the group composition of its neighborhood. The neighborhood here is composed of the (at most) eight adjacent squares. Each person is content to be in a minority in their neighborhood, as long as minority status is not too extreme. Specifically, each wants strictly more than one-third of their neighbors to belong to their own group. 

Initially suppose that there are 60 individuals, arrayed in a perfectly integrated pattern on the board, with the four corners unoccupied. Then each individual in a central location has exactly half their neighbors belonging to their own group, and is therefore satisfied. Those on the edges are in a slightly different situation, but even here each individual has a neighborhood in which at least two-fifths of residents are of their own type. So they too are satisfied.

Now suppose that we remove twenty individuals at random, and replace five of these, placing them in unoccupied locations, also at random. This perturbation will leave some individuals dissatisfied. Now choose any one of these unhappy folks, and move them to a location at which they would be content. Notice that this affects two types of other individuals: those who were previously neighbors of the party that moved, and those who now become neighbors. Some will be unaffected by the move, others may become happy as a result, and still others may become unhappy. 

As long as there are any unhappy people on the board, repeat the process just described: pick one at random, and move them to a spot where they are content. What does the board look like when nobody wants to move?

Schelling found that no matter how often this experiment was repeated, the result was a highly segregated residential pattern. Even though perfect integration is clearly a potential terminal state of the dynamic process just described, it appeared to be unreachable once the system had been perturbed. The assumed preferences are tolerant enough to be consistent with integration, but decentralized, uncoordinated choices by individuals appear to make integration fragile, and segregation extremely stable. Here's how Schelling summarized the insight:
People who have to choose between polarized extremes... will often choose in a way that reinforces the polarization. Doing so is no evidence that they prefer segregation, only that, if segregation exists and they have to choose between exclusive association, people elect like rather than unlike environments.
One can tune the parameters of the model: the population size and density, or the preferences over neighborhood composition, and see that this key insight is robust. And for reasons discussed in this essay, equilibrium reasoning alone cannot be used to uncover it. 
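For readers who would like to try the experiment themselves, here is a minimal sketch of the procedure described above. It is my own rendering in Python rather than Schelling's original pencil-and-coins procedure, and it makes a few assumptions that the verbal description leaves open: groups are coded 0 and 1, someone with no neighbors at all counts as content, a mover picks uniformly among acceptable empty squares, and an unhappy person with nowhere acceptable to go stays put. All function names are mine.

```python
import random

SIZE = 8  # an 8x8 checkerboard

def neighbors(board, r, c):
    """Types of the occupants of the (at most) eight squares adjacent to (r, c)."""
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < SIZE and 0 <= cc < SIZE:
                if board[rr][cc] is not None:
                    found.append(board[rr][cc])
    return found

def content(board, r, c, kind):
    """True if strictly more than one-third of the neighbors at (r, c) are of type `kind`."""
    nbrs = neighbors(board, r, c)
    # Assumption: someone with no neighbors at all is treated as content.
    return not nbrs or 3 * nbrs.count(kind) > len(nbrs)

def integrated_board():
    """60 individuals in an alternating pattern, with the four corners empty."""
    board = [[(r + c) % 2 for c in range(SIZE)] for r in range(SIZE)]
    for r in (0, SIZE - 1):
        for c in (0, SIZE - 1):
            board[r][c] = None
    return board

def cells(board, occupied):
    """Coordinates of all occupied (or all empty) squares."""
    return [(r, c) for r in range(SIZE) for c in range(SIZE)
            if (board[r][c] is not None) == occupied]

def perturb(board, rng, removed=20, restored=5):
    """Remove some individuals at random, then put a few of them back in random empty squares."""
    taken = rng.sample(cells(board, True), removed)
    kinds = [board[r][c] for r, c in taken]
    for r, c in taken:
        board[r][c] = None
    spots = rng.sample(cells(board, False), restored)
    for (r, c), kind in zip(spots, rng.sample(kinds, restored)):
        board[r][c] = kind

def unhappy(board):
    """Coordinates of everyone who is not content where they are."""
    return [(r, c) for r, c in cells(board, True)
            if not content(board, r, c, board[r][c])]

def like_share(board):
    """Average own-type share among neighbors, over individuals with at least one neighbor."""
    shares = []
    for r, c in cells(board, True):
        nbrs = neighbors(board, r, c)
        if nbrs:
            shares.append(nbrs.count(board[r][c]) / len(nbrs))
    return sum(shares) / len(shares)

def run(seed=0, max_moves=10_000):
    """Perturb the integrated board, then move unhappy individuals until none remain."""
    rng = random.Random(seed)
    board = integrated_board()
    perturb(board, rng)
    for _ in range(max_moves):
        movers = unhappy(board)
        if not movers:
            break
        r, c = rng.choice(movers)
        kind, board[r][c] = board[r][c], None  # lift the mover off the board
        options = [(rr, cc) for rr, cc in cells(board, False)
                   if content(board, rr, cc, kind)]
        # Assumption: stay put if no empty square is acceptable.
        rr, cc = rng.choice(options) if options else (r, c)
        board[rr][cc] = kind
    return board
```

Comparing `like_share(run(seed))` across a range of seeds with `like_share(integrated_board())` is one rough way to see how far the terminal pattern sits from the integrated starting point. The cap on the number of moves is there only because nothing in this sketch guarantees that the process settles down.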

A very different kind of contribution, but also one with important methodological implications, may be found in Schelling's 1960 classic The Strategy of Conflict. Here he considers the adaptive value of pretending to be irrational, in order to make threats or promises credible (emphasis added):
How can one commit himself in advance to an act that he would in fact prefer not to carry out in the event, in order that his commitment may deter the other party? One can of course bluff, to persuade the other falsely that the costs or damages to the threatener would be minor or negative. More interesting, the one making the threat may pretend that he himself erroneously believes his own costs to be small, and therefore would mistakenly go ahead and fulfill the threat. Or perhaps he can pretend a revenge motivation so strong as to overcome the prospect of self-damage; but this option is probably most readily available to the truly revengeful.
Similarly, in bargaining situations, "the sophisticated negotiator may find it difficult to seem as obstinate as a truly obstinate man." And when faced with a threat, it may be profitable to be known to possess "genuine ignorance, obstinacy or simple disbelief, since it may be more convincing to the prospective threatener."

Starting with three classic papers in the same 1982 issue of the Journal of Economic Theory, a large literature in economics has dealt with the implications for rational behavior of interacting with parties who, with small likelihood, may not be rational. While this work has focused on characterizing rational responses to irrationality, Schelling's point speaks also to payoffs, and raises the possibility that departures from rationality may have adaptive value.

The methodological implications of this are profound, because the idea calls into question the normal justification for assuming that economic agents are in fact fully rational. Jack Hirshleifer explored the implications of this in a wonderful paper on the adaptive value of emotions, and Robert Frank wrote an entire book about the topic. But the idea is right there, hidden in plain sight, in Schelling's parenthetical comments.  

Finally, consider Schelling's burglar paradox, also described in The Strategy of Conflict:
If I go downstairs to investigate a noise at night, with a gun in my hand, and find myself face to face with a burglar who has a gun in his hand, there is a danger of an outcome that neither of us desires. Even if he prefers to just leave quietly, and I wish him to, there is danger that he may think I want to shoot, and shoot first. Worse, there is danger that he may think that I think he wants to shoot. Or he may think that I think he thinks I want to shoot. And so on. "Self-Defense" is ambiguous, when one is only trying to preclude being shot in self-defense.
Sandeep Baliga and Tomas Sjöström have shown exactly how such reciprocal fear can lead to a fatal unraveling, and explored the enormous consequences of allowing for pre-play communication in the form of cheap talk. And I have previously discussed the importance of this reasoning in accounting for variations in homicide rates across time and space, as well as the effects of Stand-your-Ground laws.

There are a handful of social scientists whose impact on my own work is so profound that I can't imagine what I'd be writing if I hadn't come across their work. Among them are Glenn Loury, Elinor Ostrom, and Thomas Schelling. I can think of at least five papers: on segregation, on variations in homicide across regions and communities, on reputation in bargaining, and on social norms, that flow directly from Schelling's thought. 

It may surprise some to know that Glenn Loury's Du Bois lectures are dedicated to Schelling, but it makes perfect sense to me. Here's how Glenn explains his choice in the preface:
Shortly after arriving at Harvard in 1982 as a newly appointed Professor of Economics and of Afro-American Studies, I began to despair of the possibility that I could successfully integrate my love of economic science with my passion for thinking broadly and writing usefully about the issue of race in contemporary America. How, I wondered, could one do rigorous theoretical work in economics while remaining relevant to an issue that seems so fraught with political, cultural and psychological dimensions? Tom Schelling not only convinced me that this was possible; he took me by the hand and showed the way. The intellectual style reflected in this book developed under his tutelage. My first insights into the problem of "racial classification" emerged in lecture halls at Harvard's Kennedy School of Government, where, for several years in the 1980s, Tom and I co-taught a course we called "Public Policies in Divided Societies." Tom Schelling's creative and playful mind, his incredible breadth of interests, and his unparalleled mastery of strategic analysis opened up a new world of intellectual possibilities for me. I will always be grateful to him.
As, indeed, will I.

Wednesday, November 02, 2016

The Prediction Market Paradox

There’s a reason why campaigns are eager to publicize polls that show them ahead, while downplaying those in which they happen to be trailing. The perception that a candidate is losing can depress donations and volunteer effort, and lower morale and turnout among supporters. Hence polls that show tightening of a race are often advertised as indicators of momentum by the trailing party, and as outliers by the leader. The actual likelihood of victory is not independent of beliefs about this likelihood.

This gives rise to what might be called a prediction market paradox. If prices are widely believed to accurately reflect underlying probabilities, then there is an incentive for deep-pocketed partisans to try and manipulate these prices at the margin. But if the possibility of manipulation is salient and prices are treated with skepticism, then incentives to manipulate are weakened and prices will in fact be quite accurate reflections of underlying beliefs.

An interesting illustration of this phenomenon is the recent decision by PredictIt to post an electoral college map, updated by the minute, that aggregates probabilities derived from all its state level markets. Here's what the map looks like at the moment:

There are seven categories: the safe, likely, and leaning states for each candidate and one toss-up category. States shift across categories as prediction market prices cross the relevant thresholds. This way, a broad range of probability assessments is mapped onto a much coarser set that is easy to visualize and process.

But this creates the possibility that small changes in price, of the order of one cent, can lead to reassignments across categories that generate a very different picture. The incentive to manipulate prices is amplified whenever such categorical switches are feasible.
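To make the coarsening concrete, here is a small sketch of how a price in cents might be mapped onto the seven categories. Only the 75-cent boundary between lean and likely Clinton states comes from this post (it is cited below); the other cut points, the label names, and the convention that a price exactly at a boundary falls in the higher category are assumptions of mine, not PredictIt's actual rules.

```python
from bisect import bisect_right

# Cut points in cents for a contract that pays off if Clinton wins the state.
# Hypothetical values, except for the 75-cent lean/likely boundary.
CUTS = [10, 25, 45, 55, 75, 90]
LABELS = [
    "Safe Trump", "Likely Trump", "Lean Trump", "Toss-up",
    "Lean Clinton", "Likely Clinton", "Safe Clinton",
]

def category(price_cents):
    """Map a price in whole cents to one of the seven map categories.

    Assumption: a price exactly equal to a cut point belongs to the
    category above it.
    """
    return LABELS[bisect_right(CUTS, price_cents)]

if __name__ == "__main__":
    for price in (73, 74, 75, 76):
        print(price, category(price))
```

A one-cent move from 74 to 75 changes the label even though the implied probability has barely moved, which is exactly what makes trades near a threshold so cheap and so consequential for the picture.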

Of course these incentives apply to both sides of the market, with some traders wishing to shift states to the left while others are pushing to the right. As a result, an unusually large number of states may be expected to bounce back and forth across boundaries, and to remain within a narrow band of prices close to those selected (somewhat arbitrarily) by the exchange as thresholds.

This seems to be what we are seeing. The boundary between the lean and likely Clinton states is determined by a 75% threshold, and we see four states (Wisconsin, Michigan, Colorado, and Pennsylvania) all within a point or two of this. Here are those above the threshold:

And those below:

New Hampshire is not far from the boundary either. 

All this could be just coincidence, but if one looks at probabilistic forecasts from other sources, there is no such pattern. The New York Times conveniently collects six probabilistic forecasts including its own, with the current picture looking like this:

These forecasts (from the Times, FiveThirtyEight, Huffington Post, Predictwise, Princeton Election Consortium and Daily Kos respectively) don't appear to be clustered around the PredictIt thresholds at all.

Still, the evidence is anecdotal at best, and a proper analysis would have to look for a discontinuity in prices around the time that the map was created, with a clustering of prices around boundary points that could not be accounted for by random chance alone. 

Meanwhile, some caution is probably warranted in interpreting prediction market data. This is a case in which the ease of visualization, aggregation and dissemination of data can have an impact on the underlying measurements themselves, and indeed on the objective probabilities that the measures are intended to reflect.

Friday, September 23, 2016

Thine Every Flaw

There’s a verse in America the Beautiful that I absolutely adore; it represents for me the very best traditions of my adopted country:
America! America!
God mend thine ev’ry flaw,
Confirm thy soul in self-control,
Thy liberty in law.
I’ve been thinking about these words a lot over the past year or so, as the election season has revealed just how divided and how lacking in common purpose we are as a nation.

It's glaringly obvious that international trade, migration, and technological progress have brought enormous benefits to many of us. Our handheld devices are more powerful than the computers that launched our first satellites into orbit. Our system of higher education remains a magnet for eager students from every corner of the world, in part because we have attracted and retained the finest research talent. We are on the verge of a revolution in transportation and urban form as driverless cars make their presence felt. Our cultural products—movies and music among them—continue to attract strong global demand. And our Olympic medal winners encompass many different identities, religions, and countries of origin.

But globalization and technological progress have also left in their wake economic devastation and social disintegration across large swathes of the country that were previously prosperous and stable. The kind of deprivation once confined to inner cities—and tolerated for decades by the rest of society—is now pervasive in once-thriving industrial areas. In his recent and acclaimed memoir, JD Vance laments the decline of Middletown, Ohio from a proud and bustling steel town to "a relic of American industrial glory," with abandoned shops and broken windows, derelict homes, druggies and dealers, and places to be avoided after dark.

Anne Case and Angus Deaton have reported a startling increase in midlife mortality among white Americans without a college degree, "largely accounted for by increasing death rates from drug and alcohol poisonings, suicide, and chronic liver diseases and cirrhosis." Stratification by sex reveals that this phenomenon has hit white working class women especially hard. Trends in criminal justice tell a similar story: the incarceration rate for white women has risen by a staggering fifty percent since 2000, while that for black women has fallen by more than thirty percent. Similar, but much less striking, trends are in evidence for males.

All this has led to what Dani Rodrik calls the politics of anger. In its American incarnation, this anger has lifted to the helm of a major political party a man who has apparent contempt for the greatest of our traditions: due process even for those accused of the most heinous crimes, the prohibition of cruel and unusual punishment, and freedom from discrimination on the basis of religion or race. He lacks the self-control for which the verse above pleads, and his appeal to liberty and law is opportunistic and entirely self-serving.

This has been too much for some in his own party to stomach. Meg Whitman, a Republican candidate for Governor of California as recently as 2010, has been actively campaigning for Hillary Clinton. And if unconfirmed reports are to be believed, former president George H.W. Bush intends to vote for her too.

But even if we manage to dodge this bullet in November, the conditions that have fueled Trump's rise will remain in place, and the anger will intensify rather than abate. Something has got to be done to prevent our social fabric from fraying further. But what?

Perhaps protectionist and exclusionary policies can provide some measure of short term relief, but much of the dislocation that results from globalization is also a consequence of technological progress, and giving up on the latter is a recipe for economic suicide. Targeted interventions that support retraining and transition to growing sectors of the economy have to be part of the solution, but these are piecemeal efforts with varying effectiveness and the potential for bureaucratic mismanagement.

An alternative approach is to target inequality and poverty directly, through cash transfer schemes such as a universal basic income or a negative income tax. But payments such as these are not contingent on the performance of the economy as a whole, and therefore provide no incentives for people to support policies that are beneficial in the aggregate but impose costs on them as individuals.

What we need is a distributive mechanism that allows for all to benefit when the country benefits. Debraj Ray has recently proposed something along these lines, a universal basic share. This is simply a share of nominal GDP, the value of which will ebb and flow with the nation's aggregate income. Aside from some obvious advantages relative to a basic income, such as the absence of any need for indexation, this would give all citizens a stake in the prosperity of the country as a whole.

How might such a scheme be implemented? I have previously proposed the creation of individual accounts at the Federal Reserve for every citizen, including minors, which could be credited with the profits of open market operations. These profits are currently transferred to the Treasury. Any shortfall relative to the basic income share would then have to be made up by transfers from the Treasury to the Fed. One considerable benefit of such accounts is that they would do away with the need for deposit insurance, and would remove at a stroke the implicit subsidy that such insurance provides for proprietary trading at commercial banks. 

Policies of this kind already exist. For instance, the Alaska Permanent Fund collects and invests a portion of the revenue from mineral leases, and periodically distributes dividends to all qualified residents of the state.

The hope is that an initiative such as this can distribute more evenly the benefits from policies that raise aggregate incomes, whether through trade, migration, or technological progress. This ought to mitigate the political obstacles to the implementation of such policies. And perhaps the sense of common ownership will help bridge some of the deep divisions that have become so salient during this electoral season.

Through his rhetoric, Donald Trump has emboldened and empowered some of the most virulently racist and anti-Semitic elements in our society. Just take a look, for instance, at the messages received on Twitter by the political theorist Danielle Allen, in response to her concerns about a Trump nomination. They are disheartening in the extreme.

But Trump has the support of about 40% of registered voters, which in my estimation is about 88 million people or 36% of the adult population. While many of them may hold views on some matters that are immensely distasteful and deeply hurtful to others, I think that JD Vance is right to point out that it is "difficult in the abstract to appreciate that those with morally objectionable viewpoints can still be good people." 

I have been an American for just six years, and it is far too soon for me to write off so substantial a fraction of my fellow citizens. Call it the naive optimism of the newly naturalized if you like, but I really do think that we can get past this. With or without divine intervention, we can mend our individual and collective flaws.

Thursday, July 21, 2016

A Fallacy of Composition

Peter Moskos is a sociologist by training, a professor at John Jay College of Criminal Justice, and a former Baltimore City police officer. In responding to the shooting of Philando Castile, he had this to say:
Honestly, in this shooting, with this cop, in this locale, I don't think there's a chance in hell Castile would have been shot had he been white. 
Nor did he think this was an entirely isolated incident; it reminded him of the (non-fatal) shooting of Levar Jones by Sean Groubert at a traffic stop in South Carolina. I had exactly the same reaction when I saw the Castile video, as did others. Even the Governor of Minnesota conceded that the shooting "probably would not have happened if he were white."

And yet, Moskos was unsurprised by Roland Fryer's recent claims of an absence of racial bias in police shootings:
I was not surprised by Fryer's conclusions... if one wishes to reduce police-involved shootings... there are good liberal reasons to de-emphasize the significance of race in policing.

Jonathan Ayers, Andrew Thomas, Diaz Zerifino, James Boyd, Bobby Canipe, Dylan Noble, Dillon Taylor, Michael Parker, Loren Simpson, Dion Damen, James Scott, Brandon Stanley, Daniel Shaver, and Gil Collar were all killed by police in questionable to bad circumstances... What they have in common is none were black and very few people seemed to know or care when they were killed. 
Moskos is not arguing here that the police can do no wrong; he is arguing instead that in the aggregate, whites and blacks are about equally likely to be victims of bad shootings. 

How can these two views be reconciled? If there is bias in individual incidents, ought it not to show up in aggregate data? Doesn't the congruence between the racial composition of arrestees nationwide and the racial composition of victims of police killings indicate an absence of bias, as Sendhil Mullainathan claimed a few months ago?

I have argued previously that it does not, because of systematic differences in the qualitative nature of encounters. If police initiate more encounters with blacks that are not objectively threatening (but may in some cases be subjectively perceived to be threatening) then parity in killings per encounter can indicate the presence rather than absence of bias. As Andrew Gelman put it at the time, it's all about the denominator.
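A toy calculation may help fix ideas. Every number in the sketch below is hypothetical and chosen only to illustrate the logic; none of it comes from actual data. Both groups have the same number of encounters and the same number of killings, but a smaller share of the encounters involving blacks is objectively threatening.

```python
from fractions import Fraction

# All numbers are hypothetical, chosen only to illustrate the argument.
groups = {
    "white": {"encounters": 1000, "threatening": 100, "killed": 10},
    "black": {"encounters": 1000, "threatening": 50, "killed": 10},
}

def per_encounter(g):
    """Killings per encounter of any kind."""
    return Fraction(g["killed"], g["encounters"])

def per_threat(g):
    """Killings per objectively threatening encounter."""
    return Fraction(g["killed"], g["threatening"])

if __name__ == "__main__":
    for name, g in groups.items():
        print(name, per_encounter(g), per_threat(g))
```

With all encounters as the denominator the two groups look identical. With objectively threatening encounters as the denominator, the rate for blacks in this made-up example is twice that for whites.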

But Moskos offers another, quite different reason why bias in individual incidents might not be detected in aggregate data: large regional variations in the use of lethal force. 

To see the argument, consider a simple example of two cities that I'll call Eastville and Westchester. In each of the cities there are 500 police-citizen encounters annually, but the racial composition differs: 40% of Eastville encounters and 20% of Westchester encounters involve blacks. There are also large regional differences in the use of lethal force: in Eastville 1% of encounters result in a police killing while the corresponding percentage in Westchester is 5%. That's a total of 30 killings, 5 in one city and 25 in the other.

Now suppose that there is racial bias in police use of lethal force in both cities. In Eastville, 60% of those killed are black (instead of the 40% we would see in the absence of bias). And in Westchester the corresponding proportion is 24% (instead of the no-bias benchmark of 20%). Then we would see 3 blacks killed in one city and 6 in the other. That's a total of 9 black victims out of 30. The black share of those killed is 30%, which is precisely the black share of total encounters. Looking at the aggregate data, we see no bias. And yet, by construction, the rate of killing per encounter reflects bias in both cities. 
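The arithmetic is easy to check. The short sketch below uses only the figures from the example above (the variable and function names are mine) and works in exact fractions so that the equality of the two aggregate shares is not blurred by rounding.

```python
from fractions import Fraction as F

# Figures taken from the Eastville/Westchester example in the text.
cities = {
    "Eastville": dict(encounters=500, black_enc=F(40, 100),
                      kill_rate=F(1, 100), black_killed=F(60, 100)),
    "Westchester": dict(encounters=500, black_enc=F(20, 100),
                        kill_rate=F(5, 100), black_killed=F(24, 100)),
}

def tally(city):
    """Counts of black encounters, total killings, and black victims in one city."""
    killed = city["encounters"] * city["kill_rate"]
    return {
        "black_encounters": city["encounters"] * city["black_enc"],
        "killed": killed,
        "black_killed": killed * city["black_killed"],
    }

rows = {name: tally(c) for name, c in cities.items()}

total_encounters = sum(c["encounters"] for c in cities.values())
total_black_encounters = sum(r["black_encounters"] for r in rows.values())
total_killed = sum(r["killed"] for r in rows.values())
total_black_killed = sum(r["black_killed"] for r in rows.values())

# Aggregate view: black share of encounters versus black share of those killed.
aggregate_share_of_encounters = total_black_encounters / total_encounters
aggregate_share_of_killed = total_black_killed / total_killed

def rates(name):
    """Killings per encounter for black and non-black residents of one city."""
    c, r = cities[name], rows[name]
    black = r["black_killed"] / r["black_encounters"]
    other = (r["killed"] - r["black_killed"]) / (c["encounters"] - r["black_encounters"])
    return black, other

if __name__ == "__main__":
    print(aggregate_share_of_encounters, aggregate_share_of_killed)
    for name in cities:
        print(name, rates(name))
```

Both aggregate shares come out to three-tenths, while within each city the rate of killing per encounter is higher for blacks than for everyone else: 1.5% against about 0.67% in Eastville, and 6% against 4.75% in Westchester.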

This is just a simple example to make a logical point. Does it have empirical relevance? Are regional variations in killings large enough to have such an effect? Here is Moskos again:
Last year in California, police shot and killed 188 people. That's a rate of 4.8 per million. New York, Michigan, and Pennsylvania collectively have 3.4 million more people than California (and 3.85 million more African Americans). In these three states, police shot and killed... 53 people. That's a rate of 1.2 per million. That's a big difference.

Were police in California able to lower their rate of lethal force to the level of New York, Michigan, and Pennsylvania... 139 fewer people would be killed by police. And this is just in California... If we could bring the national rate of people shot and killed by police (3 per million) down to the level found in, say, New York City... we'd reduce the total number of people killed by police 77 percent, from 990 to 231!
This is a staggeringly large effect. 

Additional evidence for large regional variations comes from a recent report by the Center for Policing Equity. The analysis there is based on data provided voluntarily by a dozen (unnamed) departments. Take a close look at Table 6 in that document, which reports use of force rates per thousand arrests. The medians for lethal force are 0.29 and 0.18 for blacks and whites respectively, but the largest recorded rates are much higher: 1.35 for blacks and 3.91 for whites. There is at least one law enforcement agency that is killing whites at a rate more than 20 times greater than that of the median agency.

On the reasons for these disparities, one can only speculate:
I really don't know what some departments and states are doing right and others wrong. But it's hard for me to believe that the residents of California are so much more violent and threatening to cops than the good people of New York or Pennsylvania. I suspect lower rates of lethal force has a lot to do with recruitment, training, verbal skills, deescalation techniques, not policing alone, and more restrictive gun laws. 
Moskos expands on these points in a recent conversation with Glenn Loury.

All of this must be interpreted with caution, since the information we have available is so patchy and deficient. As I wrote in a recent opinion piece with Willemien Kets, there is a desperate need for better data, collected and distributed in a comprehensive and uniform manner. Without this we are just groping in the dark.

Thursday, July 14, 2016

On Arrest Filters and Empirical Inferences

I've been thinking a bit more about Roland Fryer's working paper on police use of force, prompted by this thread by Europile and excellent posts by Michelle Phelps and Ezekiel Kweku.

The Europile thread contains a quick, precise, and insightful summary of the empirical exercise conducted by Fryer to look for racial bias in police shootings. There are two distinct pools of observations: an arrest pool and a shooting pool. The arrest pool is composed of "a random sample of police-civilian interactions from the Houston police department from arrests codes in which lethal force is more likely to be justified: attempted capital murder of a public safety officer, aggravated assault on a public safety officer, resisting arrest, evading arrest, and interfering in arrest." The shooting pool is a sample of interactions that resulted in the discharge of a firearm by an officer, also in Houston. 

Importantly, the latter pool is not a subset of the former, or even a subset of the set of arrests from which the former pool is drawn. Put another way, had the interactions in the shooting pool been resolved without incident, many of them would never have made it into the arrest pool. Think of the Castile traffic stop: had this resulted in a traffic violation or a warning or nothing at all, it would not have been recorded in arrest data of this kind.

The analysis in the paper is based on a comparison between the two pools. The arrest pool is 58% black while the shooting pool is 52% black, which is the basis for Fryer's claim that blacks are less likely to be shot than whites in the raw data. He understands, of course, that there may be differences in behavioral and contextual factors that make the black subset of the arrest pool different from the white, and attempts to correct for this using regression analysis. He reports that doing so "does not significantly alter the raw racial differences."

This analysis is useful, as far as it goes. But does this really imply that the video evidence that has animated the Black Lives Matter movement is highly selective and deeply misleading, as initial reports on the paper suggested? 

Not at all. The protests are about the killing of innocents, not about the treatment of those whose actions would legitimately plant them in the serious arrest pool. What Fryer's paper suggests (if one takes the incident categorization by police at face value) is that at least in Houston, those who would assault or attempt to kill a public safety officer are treated in much the same way, regardless of race. 

But think of the cases that animate the protest movement, for instance the list of eleven compiled here. Families of six of the eleven have already received large settlements (without admission of fault). Six led to civil rights investigations by the Justice Department. With one or two possible exceptions, it doesn't appear to me that these interactions would have made it past Fryer's arrest filter had they been handled more professionally. 

The point is this: if there is little or no racial bias in the way police handle genuinely dangerous suspects, but there is bias that leads some mundane interactions to turn potentially deadly, then the kind of analysis conducted by Fryer would not be helpful in detecting it. Which in turn means that the breathless manner in which the paper was initially reported was really quite irresponsible. 

For this the author bears some responsibility, having inserted the following into his discussion of the Houston findings:
Given the stream of video "evidence", which many take to be indicative of structural racism in police departments across America, the ensuing and understandable outrage in black communities across America, and the results from our previous analysis of non-lethal uses of force, the results displayed in Table 5 are startling... Blacks are 23.8 percent less likely to be shot by police, relative to whites.
His claim that this was "the most surprising result of my career" was an invitation to misunderstand and misreport the findings, which are important but clearly limited in relevance and scope.


Update. If you follow the links at the start of this post, you'll see a case made that Fryer's own findings of bias in the use of non-lethal force suggest that the composition of the arrest pool will be altered by bias in the charging of innocents for resisting or evading arrest.

It occurred to me that the same data used to examine use of non-lethal force (from the citizen's perspective) could also be used to get an estimate of this effect. This is the Bureau of Justice Statistics Police-Public Contact Survey. If anyone has already done this, please let me know; I'd be interested to see the findings.

Monday, July 11, 2016

Police Use of Force: Notes on a Study

A new empirical analysis of police use of force by Harvard economist Roland Fryer is attracting national attention. The paper deals with both lethal and non-lethal force, using a variety of different data sets, some public and some painstakingly assembled by the author and his team. Given the harrowing events of the past week, it's likely that his results on shootings will attract the most attention, but it's worth carefully considering both sets of findings.

Fryer provides evidence of significant racial disparities in the experience of non-lethal force at the hands of police, even in data that relies on self-reports by officers. Using official statistics from New York City’s Stop, Question and Frisk program, he finds that blacks and Latinos are more likely to be held, pushed, cuffed, sprayed or struck than whites who are stopped. This remains the case even after controlling for a broad range of demographic, behavioral, and environmental characteristics. And using data from a nationally representative sample of civilians, which does not rely on officer accounts, he finds evidence of even larger disparities in treatment.

But Fryer also reports an absence of racial bias in police shootings for a select group of jurisdictions. He recognizes that a proper analysis of police bias in the use of lethal force requires data not only on those incidents in which shootings occurred, but also those in which suspects were successfully pacified and disarmed. Data of this kind is extremely hard to come by, but he has managed to obtain incident reports on arrests in Houston that can be used for this purpose. 

The focus is on arrest categories that are more likely to involve incidents resulting in justified use of lethal force. It turns out that in this arrest data 58% of the population is black, while in the shooting data the corresponding share is 52%. This immediately implies that in the absence of controls for other features of the interaction, blacks in the arrest population are less likely to be shot than whites. He finds that controlling for other features of the interaction "does not significantly alter the raw racial differences." Here is how Fryer characterizes these findings:
Given the stream of video "evidence", which many take to be indicative of structural racism in police departments across America, the ensuing and understandable outrage in black communities across America, and the results from our previous analysis of non-lethal uses of force, the results displayed in Table 5 are startling... Blacks are 23.8 percent less likely to be shot by police, relative to whites.
He describes this as "the most surprising result of my career."
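
The raw comparison can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration, not Fryer's method: it collapses everyone who is not black into a single comparison group (the paper compares blacks with whites and includes other groups), treats shootings as cases and arrests as controls, and uses only the 58% and 52% shares quoted above. The function name is made up for this example.

```python
def relative_odds(share_in_shootings, share_in_arrests):
    """Odds of being shot for a group, relative to everyone else.

    Treats the shooting data as cases and the arrest data as controls,
    so the result is an odds ratio. The two-group setup is a
    simplifying assumption of this sketch, not the paper's design.
    """
    group = share_in_shootings / share_in_arrests
    others = (1 - share_in_shootings) / (1 - share_in_arrests)
    return group / others


# Shares quoted in the text: 52% of the shooting data, 58% of the arrest data.
ratio = relative_odds(0.52, 0.58)
print(f"relative odds: {ratio:.2f}")
```

Under these simplifying assumptions the ratio comes out at roughly 0.78, below one and in the same direction as the figure Fryer reports, though a two-group shortcut like this should not be expected to reproduce his 23.8 percent.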

While it is entirely possible that the Houston Police Department doesn't exhibit systematic racial bias in the use of lethal force, I'm not sure such an emphatic conclusion is warranted. A close look at the arrest data (Table 1D) alongside the shooting data (Table 1C, column 2) reveals a number of puzzles that should be a cause for concern. In the arrest data only 5% of suspects were armed, and yet 56% of suspects "attacked or drew weapon." This would suggest that over half of suspects attacked without a weapon (firearms, knives and vehicles are all classified as weapons). Moreover, there are large differences across groups in behavior: two-thirds of whites and one-half of blacks attacked, a difference that is statistically significant (the reported p-value is 0.006).  
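
The arithmetic behind the first puzzle is simple enough to spell out. As a rough sketch, using only the two percentages quoted from Table 1D and assuming (generously) that every armed suspect is counted among those who attacked or drew a weapon, the share who attacked while unarmed cannot be smaller than the difference between the two figures. The function name is hypothetical.

```python
def min_share_attacked_unarmed(share_attacked, share_armed):
    """Lower bound on the share of suspects who attacked while unarmed.

    At most `share_armed` of all suspects can be armed attackers, so at
    least the remainder of the attackers had no weapon.
    """
    return max(0.0, share_attacked - share_armed)


# Figures quoted in the text: 56% attacked or drew a weapon, 5% were armed.
bound = min_share_attacked_unarmed(0.56, 0.05)
print(f"at least {bound:.0%} attacked without a weapon")
```

Even under the most favorable assumption, then, at least 51% of suspects in the arrest data are recorded as attacking without a weapon.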

What this means is that the pool of black arrestees and the pool of white arrestees are systematically different, at least as far as behavior is concerned. So the raw data comparison described as startling in the quote above is not really valid. (I made a similar point in response to a piece by Sendhil Mullainathan a few months ago.) Still, Fryer controls for these differences in behavioral and contextual characteristics and finds that the basic picture doesn't change. This has to be taken seriously. The key question, to my mind, is whether these controls are adequate.

I personally would be more convinced if the arrestee pool looked more like the shooting victim pool. For instance, 18% of arrestees, but only 4% of shooting victims, are female. I suspect that many of the interactions in the arrestee pool are not threatening, even from the subjective perspective of the officers involved. And others are so obviously threatening---for instance those involving suicide-by-cop---that no discretion or judgement is really necessary. Pruning these from the data might give us a clearer picture of bias in the use of discretionary lethal force.

Despite these concerns, I think that there is a case to be made that there is no systematic bias against blacks in the lethal use of force within the Houston Police Department. What one ought not to conclude, however, is that this applies nationally. The analysis of other jurisdictions considered in the paper is restricted to encounters in which shootings actually occurred, and cannot therefore be used to answer the same kinds of questions that the Houston data allows. 

One last point about shootings: I'm not sure why there are quotation marks around the word "evidence" in the above quote. Video evidence, for all its flaws, is still very powerful evidence. It was video evidence that led to the indictment of Michael Slager on murder charges, and the conviction of Sean Groubert for assault and battery. It is selective and cannot establish the presence of racial bias in individual cases, but surely it can't be dismissed out of hand.

Finally, consider Fryer's analysis of non-lethal force, which is consistent with earlier findings. Aside from being fundamentally unjust, disparities in the use of non-lethal force have some really important implications for crime rates. The harassment of entire groups based on racial or ethnic identity is a major obstacle to witness cooperation in serious cases, including homicide. In fact, given the importance of corroboration, a belief that other witnesses will not step forward can be self-fulfilling.

With witnesses routinely unwilling to come forward in some neighborhoods, people can be killed with near impunity. And this significantly increases the incentives to kill preemptively, in a climate of reciprocal fear. Low clearance rates for homicide are directly responsible for high rates of killing, and both of these are held in place by distrust of the criminal justice system by potential witnesses. The excessive and discriminatory use of non-lethal force by police thus ends up having indirect lethal effects.