Comparing the Comparable

Why James Heckman won this year's Nobel Prize in Economics, and why this is important for labor market policies in Germany

By Christoph M. Schmidt and Jochen Kluve October 2000

This month James J. Heckman and Daniel L. McFadden were awarded this year's Nobel Prize in Economics for their outstanding work in the field of microeconometrics. In the days following the announcement, Daniel McFadden's contribution – in particular his statistical models for choice among a limited number of alternatives – was explained clearly enough to the interested public. German press coverage of James Heckman's achievement, on the other hand, seemed to suggest that his outstanding academic work could not be understood intuitively and must therefore be far removed from practical application. The Royal Swedish Academy of Sciences based its decision to award the prize to Heckman on his "development of theory and methods for analyzing selective samples." We want to take this as an opportunity to explain Heckman's contribution and its significance for economic research with the help of a simple example – one of central importance for the practical evaluation of labor market policies in Germany.

Suppose the federal government has decided to implement a training program – say, a comprehensive IT course – to improve the (re)employment prospects of the unemployed. In the pilot project the government provides funds for 1,000 selected participants for two months – say, September and October of a given year. The goal of this labor market intervention is to substantially increase the participants' chances of re-employment. In order to decide whether it makes sense to introduce the program nationwide, the government has to evaluate its success formally. Without such an evaluation, it would be impossible to tell whether the program has been successful.

Those who are supposed to carry out this evaluation, however, face a very complex problem, the so-called evaluation problem. In order to evaluate the effectiveness of the program, we also need to know what would have happened to the participants if they had not taken part. Our example might show that 500 of the participants found a job after taking the course. But what would have happened without the course? Perhaps 450 of the 1,000 participants would have found a job in November anyway – this would speak in favor of the course (500-450=50 additional jobs). Or perhaps 550 would have found a job – this would speak clearly against it (500-550=-50). Alas, this comparative quantity does not exist, because a person cannot be observed in two different states at the same time. A hypothetical statement about what would have happened without the program is referred to as a counterfactual situation. Since it is not observable, it must be constructed from the available information; in other words, it cannot be measured in the traditional sense.
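In the potential-outcomes notation the evaluation literature typically uses – with Y_1 and Y_0 denoting employment in November with and without the course, and D indicating participation – the quantity of interest can be sketched as

```latex
\Delta \;=\; \underbrace{E[\,Y_1 \mid D = 1\,]}_{\text{observed: e.g. } 500/1000}
\;-\; \underbrace{E[\,Y_0 \mid D = 1\,]}_{\text{counterfactual: not observable}}
```

The first term is what the data show; the second term is precisely the counterfactual that has to be constructed.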

But how are we going to find an answer to such a difficult question? A simple approach would be to compare the results with a group of unemployed persons who did not participate in the program. But such a procedure would cause problems, as that group would normally include many people who had shown little motivation to join the program, or who were not selected by the job center because they lacked, for instance, the basic qualifications to follow the course. The program participants are therefore, in all likelihood, not a representative sample of the unemployed population. We are facing a problem of selection.

In order to understand ways out of this dilemma, consider how our evaluation problem would be solved in an experimental study of the kind that is common in the natural sciences. A social experiment would not be much different from a medical study in which a randomly chosen group of test persons receives a certain medicine, while the other test persons are given a placebo. A comparison of the recovery process in the two groups readily reveals the effectiveness of the medicine. The crucial point of this experiment is the so-called randomization by which the original group is divided into a treatment group and a control group.

Applied to our labor market program, this experiment would look something like this: We select 2,000 unemployed persons who want to participate in an IT course and who meet the requirements. We then randomly split this group into a treatment group of 1,000 persons and a control group consisting of the remaining 1,000. It is easy to visualize this process if we imagine an employee of the job center making all 2,000 applicants line up in front of his office and tossing a coin for each individual to decide whether they participate in the program or belong to the control group. Randomization ensures that both groups are identical on average: they do not differ systematically, because every theoretically conceivable assignment could have occurred with equal probability. After this random selection, the group of 1,000 participants takes the IT course in September and October, while the control group of 1,000 persons remains in its status of "normal unemployed", i.e. they keep searching for a job as before. The researcher can then observe, for both groups, the share of people who find a job in November. And since the two groups do not differ systematically from one another, we can evaluate the effectiveness of the IT course by simply taking the difference between these two shares.
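As a purely illustrative sketch – with hypothetical variable names and no real data – the coin-toss assignment and the subsequent comparison could look like this in a few lines of code:

```python
import random

random.seed(0)
applicants = list(range(2000))        # the 2,000 eligible unemployed persons
random.shuffle(applicants)            # plays the role of the coin toss
treatment, control = applicants[:1000], applicants[1000:]

def employment_rate(group, employed_in_november):
    # employed_in_november: mapping person -> True/False from follow-up data
    return sum(employed_in_november[i] for i in group) / len(group)

# With the follow-up data in hand, the program effect is simply the
# difference between the two employment rates:
# effect = employment_rate(treatment, data) - employment_rate(control, data)
```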

Unfortunately, social experiments of this type are highly uncommon – at least in Europe. In the United States, policymakers realized as early as the 1980s that meaningful results are best derived from randomized studies, and – as U.S. economists had vigorously demanded for quite some time – such studies were implemented. Measured against this desirable state of the art, Europe, and Germany in particular, still has a lot of catching up to do. This, however, is not always due to a lack of willingness to innovate. Quite frequently experiments are impeded by practical considerations, such as the relatively high cost of implementing them.

In this case, the only alternative is a non-experimental study. Such studies are usually retrospective, i.e. they do not accompany the program scientifically from the very beginning – as a social experiment would – but analyze the existing data ex post. In our example, the researcher would receive a data set with more or less detailed information on the 1,000 unemployed who participated in the IT course in September and October, 500 of whom found a job in November. The researcher then has to construct a credible counterfactual quantity from information on non-participants – for instance, data provided by the job center on 5,000 non-participants. A comparison of this counterfactual employment rate in November with the participants' employment rate (500 out of 1,000) would then allow an evaluation of the program. The discussion above, however, clearly shows that – in contrast to the 5,000 non-participants – the 1,000 participants are not at all representative of the unemployed population. A direct comparison of the two groups would therefore be inappropriate.

Dealing with this lack of comparability is where Heckman's contribution to the solution of the problem starts. The main idea is that selective samples do not result from a chaotic process but from factors that systematically influence the collection of the data or the behavior of individuals. In our example, the 1,000 participants might all show an exceptionally high degree of "motivation": they might be the type of unemployed worker who often meets with a job center representative, who responds quickly to the announcement of such a program, and so on. Such a worker may therefore benefit more from an IT course and find employment more easily than his or her counterpart in the comparison group of non-participants. A situation like this distorts the results, because the treatment group and the comparison group differ in their degree of motivation and cannot be compared without reservation. Since characteristics such as "motivation" cannot be observed, this phenomenon is also referred to as unobservable heterogeneity.
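To see how strongly such self-selection can distort a naive comparison, consider a small simulation – entirely made up for illustration, with an assumed true effect of five percentage points – in which unobserved motivation drives both participation and job-finding:

```python
import random

random.seed(1)
TRUE_EFFECT = 0.05            # assumed gain in the employment probability

people = []
for _ in range(6000):
    motivation = random.random()              # unobserved by the researcher
    participates = motivation > 0.8           # the highly motivated self-select
    p_employed = 0.30 + 0.30 * motivation     # motivation also helps job search
    if participates:
        p_employed += TRUE_EFFECT
    employed = random.random() < p_employed
    people.append((participates, employed))

rate = lambda group: sum(e for _, e in group) / len(group)
participants    = [p for p in people if p[0]]
nonparticipants = [p for p in people if not p[0]]

naive_estimate = rate(participants) - rate(nonparticipants)
print(f"naive: {naive_estimate:.3f}  vs. true effect: {TRUE_EFFECT}")
# The naive difference comes out far above 0.05, because participants differ
# from non-participants in motivation, not only in course attendance.
```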

James Heckman contributed substantially to the development of statistical methods that offer a way out of this dilemma. In principle, these methods use observable characteristics of participants and non-participants and combine them with relevant information about how participants are chosen. In recent work Heckman rediscovered the matching method – a statistical instrument developed in the 1970s – and applied it to econometric evaluation studies: the researcher in our example uses the data on the 5,000 non-participants to construct a comparison group of 1,000 individuals who match the participants in most or even all relevant observable characteristics, such as age, education, gender, and above all occupational biography. Provided that the information about the individuals is detailed enough, a suitable comparison group is thus created ex post. This group resembles a randomized control group in that observable attributes are balanced across the participant and non-participant samples. Heckman has shown that, under ideal conditions, a retrospective study using the matching method comes closest to a social experiment.
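A stylized sketch of such a matching procedure – with hypothetical field names, and ignoring refinements such as propensity-score matching or matching with replacement that applied studies typically rely on – might look as follows:

```python
def distance(a, b):
    # Euclidean distance over (suitably standardized) observable characteristics,
    # e.g. age, education, gender, occupational biography encoded as numbers
    return sum((x - y) ** 2 for x, y in zip(a["covariates"], b["covariates"])) ** 0.5

def build_comparison_group(participants, nonparticipants):
    """For each participant, pick the most similar non-participant."""
    return [min(nonparticipants, key=lambda n: distance(p, n)) for p in participants]

def matching_estimate(participants, matched_group):
    rate = lambda group: sum(x["employed"] for x in group) / len(group)
    # difference in November employment rates between participants and their matches
    return rate(participants) - rate(matched_group)
```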

An alternative non-experimental method analyzed by Heckman with regard to its potential for evaluating labor market programs is instrumental variable estimation. This approach exploits the fact that a certain observable quantity in the data set may induce people to participate in the program while having no direct impact on their labor market prospects. In our example, the decision to participate may depend on motivation, but also decisively on the distance between a person's home and the course location – a factor the individual can hardly influence. If this is the case, a comparison of labor market success between individuals with short and long commuting distances – each group containing a corresponding number of participants and non-participants – can replace the comparison of participants with a randomized control group. Without going into statistical detail, it is safe to say that in practical work it is often difficult to find an appropriate instrument (here, the commuting distance). Whenever such an instrument can be found, we also speak of a natural experiment.
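In its simplest form – a hedged sketch with hypothetical field names, corresponding to what the literature calls the Wald estimator – the idea can be written down like this:

```python
def wald_estimate(people):
    """people: dicts with keys 'short_distance', 'participated', 'employed'."""
    near = [p for p in people if p["short_distance"]]
    far  = [p for p in people if not p["short_distance"]]

    rate = lambda group, key: sum(p[key] for p in group) / len(group)

    # Difference in employment between the two distance groups, scaled by how
    # much more often the "near" group participates: distance shifts
    # participation but is assumed not to affect job chances directly.
    return (rate(near, "employed") - rate(far, "employed")) / (
        rate(near, "participated") - rate(far, "participated")
    )
```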

Heckman's most famous contribution to the solution of the evaluation problem deals directly with the participation decision and is known as the control function approach. It assumes that, besides unobservable variables – in our example, motivation – observable variables such as age or family status have an important influence on the decision to participate in the program. If an individual whose characteristics seem to speak against participation – for example, a middle-aged woman with school-aged children – nevertheless chooses to participate, then this fact alone contains important information that can be exploited statistically: this particular woman is apparently more motivated than comparable women of the same age and family status. It is then possible – again without going into detail – to calculate a correction factor that counteracts the lack of representativeness of the participants. Heckman's early work in this area dates back about a quarter of a century and has had immense influence on the practice of social science. The so-called Heckman correction has been applied in thousands of empirical studies, not only in evaluation research but also in many other methodologically related fields. In addition, this contribution has inspired a great deal of methodological work; within this extensive literature, Heckman's ideas have been refined both by others and by Heckman himself.
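For readers who want to see the mechanics, here is a deliberately simplified sketch of the textbook two-step sample-selection version of this correction (variable names are hypothetical, the inputs are assumed to be NumPy arrays, and real applications require careful specification and adjusted standard errors):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

def heckman_two_step(X_outcome, X_participation, participated, outcome):
    # Step 1: probit model of the participation decision
    # (observables such as age or family status enter X_participation)
    probit = sm.Probit(participated, sm.add_constant(X_participation)).fit(disp=0)
    linear_index = probit.fittedvalues                 # X'beta for each person
    inverse_mills = norm.pdf(linear_index) / norm.cdf(linear_index)  # correction term

    # Step 2: outcome equation for the participants only, with the correction
    # term added as an extra regressor to absorb the selection on "motivation"
    mask = participated == 1
    X = sm.add_constant(np.column_stack([X_outcome[mask], inverse_mills[mask]]))
    return sm.OLS(outcome[mask], X).fit()
```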

This example of how to evaluate a prototypical labor market policy intervention roughly depicts central elements of Heckman's work. Above all, it shows how significant his work is for labor market policy research. Unfortunately, in Germany we would rather have to say how significant his work "could be" or "might be" – not just because social experiments have not yet been conducted in Germany, but also because even meaningful retrospective studies are still the exception. What impedes progress in this area is not a lack of methodological competence but rather a lack of willingness to let independent researchers evaluate labor market policy and other economic policy programs using methods that meet recognized international standards. This state of affairs is deplorable, particularly in view of the fact that the Nobel Prize has now been awarded for work in precisely this field.

 

[We would like to emphasize that we have limited the presentation of Heckman's work to what our example requires. Our presentation is therefore necessarily condensed and incomplete. But perhaps this example best illustrates the practical and immediate relevance of Heckman's work. We are well aware that we omit many underlying scientific problems and institutional details (such as the responsibilities of job centers), as well as fundamental methodological concepts (such as causality). This, of course, serves to make the article more illustrative and comprehensible.]

 
