Pay for performance and contractual choice: the case of general practitioners in England

The Quality and Outcomes Framework (QOF) is a Pay-for-Performance scheme introduced in England in 2004 to reward primary care providers. The scheme provides financial incentives that reward the overall performance of a practice, not individual effort. Consequently, an important question is how the QOF may affect contractual choices, quality provision and doctor mobility in the primary healthcare labour market. The paper provides a simple theoretical model showing that the introduction and further strengthening of the scheme may have induced practices to compete for the best doctors and modified their choices of contractual agreements with practitioners. We test the implications of this model using a linkage between Doctors Census data and practices' characteristics from 2003 to 2007. We use linear multilevel models with random intercepts and account for sample selection. We find that, after the introduction of the QOF, efficient doctors are more likely to become partners and mobility among doctors has increased. The strengthening of the scheme in 2005 is associated with an increase in the quality of primary care and a reduction in access to the market for new doctors.


Introduction
The number of GPs in the U.K. rose by an average of 1.8% each year between 2004 and 2014, but there are still problems in recruiting and retaining GPs in the healthcare market (The Health Foundation [16]). The U.K. also has a relatively long tradition of using Pay-for-Performance (P4P) schemes, but little is known about the sorting and retention effects they might produce (Lazear [9]).
The aim of this paper is to investigate the sorting and retention effects of the only P4P scheme for General Practitioners (GPs) in the U.K., the Quality and Outcomes Framework (QOF). GPs are paid under a mix of capitation and financial incentives including the QOF. The QOF awards an amount of money to practices that achieve certain levels of performance. Such pay is linked to the performance of the practice and may then be internally distributed to GPs. We examine the effects that the introduction of the QOF in 2004 and its subsequent strengthening in 2005 had on the contractual arrangements, the provision of quality and doctor mobility in the primary healthcare market.
*Correspondence: mario.pezzino@manchester.ac.uk, University of Manchester, Economics, Arthur Lewis Building, Oxford Road, Manchester, M13 9PL, UK
We motivate our empirical analysis with a simple theoretical model with two practices that combines three aspects of the primary healthcare sector which, to our knowledge, have not yet been considered together by the literature. These are: i) the altruistic nature of the GP's job; ii) different contractual arrangements (e.g. salaried positions or partnerships) for GPs (National Audit Office [11]); and iii) the multidimensional nature of quality experienced by patients as a mix of the effort of GPs, front desk and nursing staff. In our model each practice serves a representative patient and requires the effort of one doctor to provide healthcare. Doctors may be of two types: efficient and inefficient. Efficient doctors are able to provide medical effort at a lower cost than inefficient ones. The model predicts that the behaviour of a salaried doctor will not be affected by the introduction of the P4P scheme, since her remuneration is not quality-dependent. However, the introduction of the QOF may induce practices to compete (by offering partnerships) for the best doctors and increase mobility.
We test the empirical implications of this model using a unique linkage of GP Census data from 2003 to 2007 with practices' characteristics in England. We then link these data with information on practices' performance on the QOF. Whilst we find that the introduction of the QOF in 2004 decreased the probability of becoming a partner by one percentage point, more efficient doctors are more likely to be partners. Mobility of doctors increased by two percentage points after the QOF was introduced, but more efficient doctors are less likely to move, and doctors in better practices are also less likely to move. Overall, these results suggest some evidence of a sorting effect of the QOF, with more efficient doctors becoming partners and doctors being retained in higher-quality practices. Finally, we find that the strengthening of the scheme in 2005 produced, on average, an increase in the primary care quality of practices. Over the period between 2004 and 2007, more efficient doctors improve the quality of the practice and, if they are also partners, they exert more effort. However, we observe a decrease in GPs' effort connected to the strengthening of the scheme. We interpret this result as consistent with the existence of reputational effects, as described by the literature on pro-social behaviour and intrinsic motivation.

Literature
Incentive pay is a way to partially overcome information asymmetries in the labour market. The rationale behind P4P is to tie workers' compensation to their output, aligning their individual objectives with those of the firm (see, for example, Lazear [9]; Prendergast [13]; Ross [15]). If such schemes are introduced in multi-level organisations, there might be two potentially counteracting effects at play. On the one hand, individual agents might free-ride on the team, thus weakening the incentive effect of P4P. On the other hand, the sorting effect might cause more able workers to select and stay in performance pay jobs and less able workers to leave these jobs (see, for example, Dohmen and Falk [4]; Gielen et al. [6]; Lazear [10]). According to these studies, in equilibrium workers relocate themselves on the basis of their ability, thus increasing productivity and wages in firms that use P4P schemes.
In this paper we focus on the QOF, an incentive scheme that applied to practices, which has increased doctors' income by 25 percent (NHS Review Body [12]). We examine its effects on doctors' behaviour (e.g. mobility and effort provision) and on practices' behaviour (e.g. contractual arrangements). This is an important issue for the incentive and sorting effects in the GPs' labour market and for the resulting potential inequalities in the provision of healthcare to patients.

The Quality and Outcomes Framework
In the U.K. most healthcare is delivered by the National Health Service (NHS), which is a comprehensive freely available service funded out of general taxation. All citizens are registered with a GP who delivers primary care and acts as gatekeeper for elective secondary care. Primary care practices are typically a group of one to six doctors responsible for a pool of patients. They are independent contractors retaining the surpluses they make as private income.
In the early 2000s the terms of two new contracts for GPs' remuneration were negotiated (see the National Audit Office [11]). The first contract is the nationally negotiated General Medical Services (GMS), with a mix of capitation, lump sum allowances and P4P schemes, such as the QOF. The capitation payment is based on the number of patients treated by a practice, adjusted for the characteristics of both the patients and the area where the practice operates. The second contract is the Primary Medical Services (PMS), negotiated between the practice and the local primary care organisation. Under the PMS contract, the practice receives a lump sum payment for the same services provided in the GMS contract (Gravelle et al. [8]). The first three years of this new contract cost about £1.76 billion (National Audit Office [11]).
The new GMS contract was officially introduced on 1st April 2004, with measurement of achievements on 1st April 2005 for the previous 12 months (Roland [14]). In the first two years of the QOF, practices were rewarded on the basis of performance points reported on 146 indicators. The points correspond to an amount of money in pounds sterling (£) claimed annually. Points were awarded on the basis of reported coverage rates above a lower threshold of 40% and an upper threshold that varied depending on the particular indicator (see Doran et al., [5]

The theoretical model
Let us consider a market where patients are treated by two practices, say practice 1 and 2, located at a positive distance t > 0 from each other. Practices need only one doctor to serve their population. Let us suppose that there are two types of doctors: efficient (v) and inefficient (nv). The mass of doctors of type v is equal to π ∈ [0, 1], and consequently the mass of doctors of type nv is equal to (1 − π).
Suppose that the practices know the type of two doctors, each located next to a practice. These doctors' type may be known because they were previously employed.
Practices can, however, access the labour market and offer a salary to new doctors (whose type will be ex ante unknown), who receive salary $w_i$ from practice $i$, $i = 1, 2$. The utility of the salaried doctor of type $z \in \{v, nv\}$ who works for practice $i$ is given by Eq. (1), where the superscript $N$ indicates that we are considering the case of new salaried doctors; $Q_{zi}$ is the total level of quality of care of practice $i$; $\gamma_z e_{zi}^2 / 2$ is the doctor's cost of effort, with $1 = \gamma_{nv} \geq \gamma_v > 0$ representing the degree of inefficiency of doctor $z$; $e_{zi}$ is the level of effort invested by the doctor employed by practice $i$; and $b_i \geq 0$ is the benefit the doctor receives from accessing the amenities in the neighbourhood of practice $i$. Let us assume that the outside option of a new salaried doctor is equal to zero.
Notice that the utility in (1) increases with the quality of care $Q_{zi}$; this captures the assumption that doctors are altruistic and care about quality. The quality perceived by patients is the sum of the quality improvements given by the doctor's effort and the practice's investment in facilities and nursing staff:
$$Q_{zi} = e_{zi} + q_i \qquad (2)$$
where $q_i$ is the quality investment of the practice (e.g. nursing staff, front desk, IT). Practices are run by doctors who have taken a completely profit-oriented managerial position in the practice.
If a practice employs a new doctor, its (expected) profit function is given by Eq. (3), where $m$ is a capitation payment, $p \geq 0$ is the quality-related payment based on a P4P scheme, and $q_i^2 / 2$ is the cost function of the practice.
The timing unravels as follows.
In stage 1, practices (i.e. managers) choose the contract to offer to the doctor; specifically they may offer a partnership to one of the doctors whose type is known or a salaried contract to a new doctor.
In stage 2, practices choose investment in quality q i to maximise profits. Doctors who accepted a contract choose the level of effort e zi to maximise utility.
In stage 3, primary health care is provided.

Salaried contract
Let us look first at the provision of care when a practice hires a new doctor. The effort chosen by the salaried doctor in stage two follows from the first-order condition for maximising (1).
Notice that p has no effect on the effort choice of the salaried doctor. This is not surprising, since she receives a remuneration that does not depend on the quality of care.
Similarly the maximisation of profits defines the investment in quality of the practices at stage two.
Since the outside option of a new doctor is equal to zero and the resulting utility is strictly positive, practices will set $w_i = 0$. Notice that a positive $p$ is necessary to induce the practice to invest in quality. Notice also that, in equilibrium, when the practices choose to hire a new doctor, profits increase in $m$, $p$ and $\pi$, and decrease in $\gamma_v$. In addition, if $p = 0$ (i.e. no P4P scheme is in place), then $\Pi^N_i = m$.
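The salaried-contract results can be illustrated with a minimal sketch under assumed functional forms. The additive utility, the quality rule $Q_{zi} = e_{zi} + q_i$ and the expected P4P payment $p\,E[Q_{zi}]$ used below are assumptions chosen to reproduce the properties stated in the text (effort independent of $p$, investment requiring $p > 0$, profits equal to $m$ when $p = 0$); they are not necessarily the exact form of the paper's equations.

```latex
\begin{align*}
U^N_{zi} &= w_i + Q_{zi} - \frac{\gamma_z e_{zi}^2}{2} + b_i,
  \qquad Q_{zi} = e_{zi} + q_i \\
\frac{\partial U^N_{zi}}{\partial e_{zi}} &= 1 - \gamma_z e_{zi} = 0
  \;\Rightarrow\; e^N_{zi} = \frac{1}{\gamma_z} \\
\Pi^N_i &= m + p\,E[Q_{zi}] - \frac{q_i^2}{2} - w_i
  \;\Rightarrow\; q^N_i = p \\
\Pi^N_i \Big|_{w_i = 0} &= m + p\left(\frac{\pi}{\gamma_v} + 1 - \pi\right) + \frac{p^2}{2}
\end{align*}
```

Under these assumed forms the last expression increases in $m$, $p$ and $\pi$, decreases in $\gamma_v$, and collapses to $m$ when $p = 0$, in line with the comparative statics described above.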

Partnership contracts and competition over doctors
The subsection "Salaried contract" considered the case in which practices decided to hire a new doctor in stage one. However, next to each practice lives a doctor whose type is known and who could be offered a partnership. A practice may also be willing to pay a relocation package equal to $t$ and offer a partnership to the doctor located next to the other practice. As in the subsection "Salaried contract", let us study first the provision of care of a practice that employs a partner of type $z$.
We assume that a partner is entitled, in addition to a salary, to share profits equally with the management. In the expressions for the utility of a partner and the remaining profits of the practice, superscripts $P$ and $M$ indicate variables related, respectively, to a doctor working as a partner and to a manager in a partnership.
The simultaneous choice of medical effort and practice quality produces the expressions in (7). Comparing (7) with (4), it is clear that $e^P_{zi} \geq e^N_{zi}$ (this is the effect of the incentive of the P4P scheme). Not surprisingly, the practice investment is not directly affected by the partnership and $q^M_{zi} = q^N_i$. In addition, without the P4P scheme, i.e. $p = 0$, practice quality would be $q^N_i = q^M_{zi} = 0$. The study of the expressions in (4), (5) and (7) produces a testable implication: an increase in the QOF price per quality point (i.e. higher $p$) may increase medical effort and practice quality.
Let us suppose again that the outside option of the doctor is sufficiently low to allow practices to offer $w^P_i = 0$. An increase in $p$ (the P4P incentive) has a positive effect on quality (both $Q^P_{zi}$ and $e^P_{zi}$), on the partner's equilibrium utility and on the remaining equilibrium profits of the practice. If $p = 0$ (i.e. no P4P scheme is in place), it is not desirable for a profit-oriented practice to offer a partnership. This implies that with the introduction of P4P incentives practices become more concerned about the ability of the doctors. Consequently, they will consider offering partnerships and eventually compete to attract the best doctors.
This provides another testable implication: the introduction of the QOF ( p > 0 compared to p = 0) may have created mobility among doctors.
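The partnership results can be illustrated with a minimal sketch under assumed functional forms. The specification below assumes additive utility, quality equal to $e_{zi} + q_i$, and a partner who receives half of the practice's profits on top of her own utility while the manager keeps the other half. These are assumptions chosen to match the properties stated in the text, not necessarily the paper's exact equations; under the same assumed forms, salaried effort is $1/\gamma_z$.

```latex
\begin{align*}
\Pi^M_{zi} &= \tfrac{1}{2}\left(m + p\,Q_{zi} - \frac{q_i^2}{2} - w^P_i\right),
  \qquad Q_{zi} = e_{zi} + q_i \\
U^P_{zi} &= w^P_i + Q_{zi} - \frac{\gamma_z e_{zi}^2}{2} + b_i + \Pi^M_{zi} \\
\frac{\partial U^P_{zi}}{\partial e_{zi}} &= 1 + \frac{p}{2} - \gamma_z e_{zi} = 0
  \;\Rightarrow\; e^P_{zi} = \frac{2 + p}{2\gamma_z} \geq \frac{1}{\gamma_z} \\
\frac{\partial \Pi^M_{zi}}{\partial q_i} &= \tfrac{1}{2}\,(p - q_i) = 0
  \;\Rightarrow\; q^M_{zi} = p \\
\Pi^M_{zi} \Big|_{p = 0,\; w^P_i = 0} &= \frac{m}{2} < m \quad \text{for } m > 0
\end{align*}
```

In this sketch the partner's effort rises with $p$ while the practice's investment is unchanged by the partnership. With $p = 0$ the manager keeps only half of the capitation payment, so a partnership is unattractive without a P4P scheme.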
In stage one a practice needs to choose whether to offer a partnership to either of the two old doctors or to access the labour market.
Three scenarios may exist: (i) both old doctors are of type $v$; (ii) both old doctors are of type $nv$; (iii) the old doctor in practice 1 is of type $v$, while the old doctor in practice 2 is of type $nv$. Due to the existence of relocation costs $t$, in cases (i) and (ii) the practices will not find it profitable to offer a partnership to the doctor located in the other region. There is no competition for doctors in cases (i) and (ii) and, therefore, we discard these scenarios. Relocation will instead be feasible in case (iii), and this will be the focus of our analysis.
In any case, in stage one a practice will consider offering a partnership to the local doctor if $\Pi^M_{zi} > \max\{0, \Pi^N\}$. The necessary condition is satisfied in a subset of the parameters, $S$. It follows that a practice will offer a partnership only if the probability of hiring an efficient new doctor is low or if the P4P incentive is sufficiently strong to enhance the performance of the doctor and, consequently, profits. Notice that $\gamma_{nv} = 1$ belongs to neither $S_1$ nor $S_2$. This means that $\Pi^M_{nvi} < \Pi^N$. In other words, no practice will offer a partnership to an inefficient doctor.
This provides another testable implication: efficient doctors are more likely to be offered a partnership.
Focusing our attention on parameters in $S$, simple and intuitive comparative statics analysis on $\Pi^M_{vi} - \Pi^N$ shows that an increase in $\pi$ (the probability of hiring an efficient new doctor), in $\gamma_v$ (i.e. type $v$ doctors being less efficient in providing effort) or in the capitation payment $m$ reduces the profitability of offering a partnership and may increase turnover. An increase in $p$, instead, induces a practice to offer a partnership to the local efficient doctor. This last result implies that an increase in $p$ reduces access to the job market for new (salaried) doctors, another testable implication.
Let us focus our attention on the case in which an asymmetric allocation of doctors may induce practices to compete for the best doctors. Let us assume, for example, that an efficient doctor is located next to practice 1 and an inefficient one is located next to practice 2.
Let us restrict our attention to a subset of parameters, $\hat{S}$, which ensures that relocation costs are sufficiently small to make relocation of an efficient doctor feasible.
Practice 1 may face competition from practice 2 over the services of the efficient doctor and therefore may have to offer a higher salary in order to outbid the rival. Define $\tilde{\Pi}^M_{v1}$ as the profits earned by practice 1 when hiring the efficient doctor while practice 2 is also offering a partnership to the same doctor. The conditions under which practice 1 will offer a partnership to doctor $v$ include the requirement that the efficient doctor will not be worse off accepting a partnership from practice 1 rather than from practice 2. Using the expressions in (7), this condition can be simplified. Let $w_2$ denote the highest salary that practice 2 will be willing to offer to the efficient (vocational) doctor; it follows that $w_2 \geq 0$ for parameters in $\hat{S}$. Let $w_1$ denote the salary that practice 1 has to guarantee to the local efficient doctor to beat competition from practice 2. Simple comparative statics analysis shows that $dw_1/dp > 0$ (an increase in $p$ increases the performance of the partner and, therefore, the willingness to pay of practice 2) and $dw_1/db_2 > 0$ (if the neighbourhood of practice 2 provides attractive amenities, practice 1 needs to reward the efficient doctor more to convince her to stay); also $dw_1/dm < 0$, $dw_1/d\gamma_v < 0$ and $dw_1/d\pi < 0$ (increases in $m$, $\gamma_v$ and $\pi$ make the option of hiring a new doctor less desirable for practice 2 and, consequently, practice 1 can reduce the salary offered to the efficient doctor); and $dw_1/dt < 0$ and $dw_1/db_1 < 0$ (an increase in $t$ or $b_1$ makes it more difficult for practice 2 to attract the efficient doctor).
Given the profits obtained by practice 1 when offering the partnership to the local efficient doctor, mobility (in our example, the relocation of the efficient doctor from practice 1 to practice 2) ultimately depends on the relocation costs and the characteristics of the areas where the two practices operate. Indeed, mobility will increase if the relocation costs or the workplace quality differential ($b_1 - b_2$) decrease. The P4P scheme does not directly affect the willingness to pay of practice 1 to retain the efficient doctor. Since the scheme offers the same incentives to both practices, an increase in $p$ equally induces both practices to compete for the efficient doctor. The scheme, however, may have an effect on the welfare of the efficient doctor. In fact, the efficient doctor may be better off because of the competition from practice 2. The increase in utility (compared to a partnership in practice 1 without competition) ultimately depends on $w_1$.

Empirical strategy
The above-described theoretical model motivates our empirical strategy by providing the following testable implications:
1. Efficient doctors are more likely to be offered a partnership.
2. The introduction of the QOF (p > 0 compared to p = 0) may have created mobility among doctors.
3. An increase in the QOF price per quality point (i.e. higher p) may have (i) increased medical effort and practice quality, and (ii) decreased access to the job market for new (salaried) doctors to replace inefficient colleagues.
As a result, the empirical strategy focuses on how to model efficiency, partnership, mobility and access of new doctors to the healthcare market. Efficiency can be modelled as a function of an observed component, e.g. experience, and an unobserved one, e.g. ability. Ideally, we would need a single data source with a large longitudinal sample in which we observe sufficient doctors' mobility and measure wage as a proxy for efficiency. Since such a source of data is not available, we have to use age as a proxy for efficiency in the General Medical Services (GMS) Statistics Databases. In reality there are complex interactions between experience and efficiency, so this is not an unreasonable assumption. We can test whether age is positively correlated with practices' quality on a sample of single-handed practices in order to attribute efficiency to a single doctor rather than to the average doctor in a practice. We estimate three separate pooled linear regression models, formally described by Eq. (12), where $y$ indicates the proportion of total QOF points practice $j$ achieves. The subscript $i = 1, \ldots, N$ indicates each of the $N$ doctors, $j = 1, \ldots, N$ indicates each practice and $t = 2004, 2005$. Demographic characteristics of doctor $i$, including age, are denoted by $x_{ij(t-1)}$; $w_{ij(t-1)}$ is doctor $i$'s income in practice $j$; and $z_{j(t-1)}$ are practice $j$ characteristics. $\varepsilon_{ijt}$ indicates the residual of Eq. (12). We then test the implications of the theoretical model using pooled linear probability models and random intercept multilevel models, accounting for sample selection bias in the mobility equation.
The first and second testable implications are modelled as the probability of being a partner or of moving to another practice, as in Eq. (13), where, as in Eq. 12, $i$ and $j$ indicate the doctor and practice, respectively; $t = 2003, \ldots, 2007$; and $y_{ijt} = 1$ indicates that the doctor is a partner or has moved to another practice. We argue that $a_i$ captures the unobserved component of efficiency, the ability of doctor $i$. The time-variant error term is denoted by $\varepsilon_{ijt}$, which also includes practice effects $d_j$.
To test the second implication of the theory, we model mobility as a two-stage Heckman model. The first-stage regression is essentially the same as Eq. 13, with $y_{ijt}$ representing the realised value of doctors' propensity to stay in the healthcare job market. The exclusion restriction of the first-stage regression is the number of years to statutory retirement, which depends on the time of the survey. Statutory retirement does not vary by practice and should not affect the probability of changing practice.
From each year's probit model we calculate the Inverse Mills Ratio (IMR) as the ratio of the probability density function to the cumulative distribution function. We then "stack" all IMRs and include the time-variant IMR into the second stage random intercept multilevel model of mobility. Conditional on staying in the market, doctors might want to stay in the same practice or to move practices which is formalised with the inclusion of the time-varying IMR as additional covariate. In these specifications we include 2003 as a pre-QOF year because we want to test the effect of the introduction of a new performance scheme on the contractual choice and mobility of doctors.
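The Inverse Mills Ratio step described above can be sketched in a few lines. This is a minimal illustration using only the standard library; the fitted probit indices are hypothetical values, not estimates from the paper, and the function names are illustrative.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index: float) -> float:
    """Ratio of the normal pdf to the normal cdf at a fitted probit index."""
    return normal_pdf(index) / normal_cdf(index)


# Hypothetical fitted probit indices (x'beta) for a few doctors, by year.
fitted_index_by_year = {
    2003: [0.0, 1.2],
    2004: [0.5, -0.3],
}

# "Stack" the year-specific IMRs into one long list of (year, IMR) records,
# ready to be merged into the second-stage mobility data.
stacked_imr = [
    (year, inverse_mills_ratio(index))
    for year, indices in sorted(fitted_index_by_year.items())
    for index in indices
]
```

The ratio falls as the fitted index rises, so doctors who are very likely to stay in the market receive a small correction term. For very negative indices the cdf can underflow to zero, so a production implementation would need a numerically safer formula.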
Finally, we test the third implication of the theoretical model by examining whether the price increase induced by the QOF is associated with an increase in medical effort (measured by the workload of a doctor, or full-time equivalence (FTE)), an increase in practice quality (measured by the proportion of total QOF points) and a decrease in access of new doctors to the market. The first two models can be formalised as in Eq. (14), where $y_{ijt}$ indicates doctors' effort measured by their FTE, or the proportion of total QOF points practice $j$ achieves. The other covariates are the same as those described above, but $t = 2004, \ldots, 2007$. The third model is formally the same as the one in Eq. 13, with $y_{ijt}$ indicating whether a new doctor entered the market.

Robustness checks
One potential threat to our empirical strategy is that the effect of the QOF on doctors' mobility is identified from a before-and-after comparison, as the P4P scheme applied to all practices. We can indirectly test the robustness of the results of the 2005 price increase using a similar price change that occurred in 2006. We use the exogenous price shock determined by the increase in threshold levels in 2006, constructed as a potential income loss for each practice, and test its effect on doctors' effort, access of new doctors to the market and mobility. If the before/after comparison was not biased by other policies, then we would expect this price decrease to generate similar results as before (but with the opposite sign). We highlight a caveat. As this price decrease only applied to some indicators of cardiovascular diseases (CVD), there could be effort diversion to similar tasks for which payments remained unchanged. As an additional robustness check we consider the full set of CVD indicators for which effort diversion might have occurred. More formally, we estimate a model in which the unobserved component $\alpha_i$ is differenced out and $y_{ij}$ indicates the change in FTE, in access to the market of new doctors and in mobility. We estimate the change in mobility separately for the top and bottom performers, measured as practice performance in 2004.
$x_{ij}$ includes the same characteristics as in Eq. (14), except age. $S_{ij}$ is the exogenous price shock between 2005 and 2006.
Finally, models of partnership, mobility and access to the healthcare job market could be estimated as nonlinear models because the outcome variables are binary. The latent propensity to be a partner, to move to another practice or to access the healthcare job market can be modelled with a latent variable $y^*_{ijt}$, where, as in Eq. 12, $i$ and $j$ indicate the doctor and practice, respectively; $t = 2003, \ldots, 2007$; and $y_{ijt} = \mathbf{1}(y^*_{ijt} > 0)$.

Data sources
Our analysis is based on two data sources: i) the General Practitioners' Worklife Survey (GP WLS); and ii) the General Medical Services (GMS) Statistics Databases to which we link geodata and measures of practices' quality. We use multiple data sources to combine their relative strengths because the GP WLS records doctors' income, but as a small longitudinal sample it does not contain a large sample of doctors who move. The GMS database instead contains information on all doctors in England, but does not record their income.
We . They also include information on practice contract, whether it is a dispensing practice, its list size by age-group and gender.
Any doctor who treats patients is required by the law to register with the General Medical Council (GMC) to obtain a licence to practice. Once registered, doctors receive a GMC reference number that uniquely identifies them. This number has been used to trace doctors in the yearly Censuses so that our final dataset contains a description of characteristics and movements of doctors from 2003 to 2007.
Practice postcodes have been used to calculate distances and to link local area characteristics, such as the Low Income Scheme Index (LISI), to these data. LISI is a direct measure of practice list deprivation derived from prescription data.
Finally, we link practice codes to the Health and Social Care Information Centre database containing numerators, denominators and the number of points that each practice has obtained on all QOF indicators from 2004 onwards.

Summary statistics
Because income is recorded in the GP WLS in eight intervals from less than £25,000 to more than £150,000, we recode it as a continuous variable by assigning each doctor the mid value of the income interval they report. The average income in our GP WLS (2004-2005) sample is about £73,311. On average, doctors report working about 44 hours per week as a GP.
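The midpoint recoding described above can be sketched as follows. Only the first and last cut-offs (£25,000 and £150,000) come from the text; the six interior cut-offs, the zero lower bound and the width assumed for the open-ended top band are hypothetical choices made purely for illustration.

```python
from typing import List, Optional, Tuple

# Eight income bands as (lower, upper) bounds in pounds sterling.
# Interior cut-offs are HYPOTHETICAL; None marks the open-ended top band.
HYPOTHETICAL_BANDS: List[Tuple[int, Optional[int]]] = [
    (0, 25_000),
    (25_000, 50_000),
    (50_000, 70_000),
    (70_000, 90_000),
    (90_000, 110_000),
    (110_000, 130_000),
    (130_000, 150_000),
    (150_000, None),
]

# Assumed width used to close the open-ended top band (hypothetical).
ASSUMED_TOP_BAND_WIDTH = 20_000


def recode_income(band_index: int) -> float:
    """Return the midpoint of the reported income band as a continuous value."""
    lower, upper = HYPOTHETICAL_BANDS[band_index]
    if upper is None:
        # The top band has no upper bound, so one must be assumed.
        upper = lower + ASSUMED_TOP_BAND_WIDTH
    return (lower + upper) / 2.0


# Example: three doctors reporting bands 0, 3 and 7.
recoded = [recode_income(band) for band in (0, 3, 7)]
```

The treatment of the two open-ended bands drives the tails of the recoded distribution, so the assumed bounds are worth varying in a sensitivity check.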
We also consider a number of practices' and doctors' characteristics from the GMS database (2003-2007). We use the FTE of doctors as a measure of their effort (i.e. parameter $e_{zi}$ in the theoretical model). FTE indicates the workload of a doctor on a scale from 0 to 100, where 100 indicates that the doctor is employed as a full-time worker in the practice and 50 signals that the worker is only half-time.
[Table 1]
Contractual status is available in the GMS data and records whether the doctor is a partner or salaried. On average in our sample 82 percent of doctors work as partners.
We use the GMC reference code in the GMS data to identify new and old doctors. On average about seven percent of doctors between 2003 and 2007 are newly qualified, about six percent exit the healthcare job market or, if they don't exit the market, they move practice.
Our measure of quality (i.e. the parameter $Q$ in the theoretical model) is practice performance in the QOF as measured by the proportion of total points achieved by the practice. The revenue the practice earns varies linearly between lower and upper thresholds of coverage, called "achievement", $A \in [0, 1]$. It indicates the proportion of eligible patients for which a specific measurement of quality has been achieved.
We select a number of control variables that vary among doctors and practices. On the doctor's side, we consider gender and age in years. Approximately 42 percent of doctors are female and the average age of doctors is 45 years (with a standard deviation of 9 years). We show that age indicates doctors' experience, which is an important component of doctors' efficiency parameter (i.e. parameter $\gamma_z$).
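The linear payment rule between the lower and upper achievement thresholds can be sketched as below. This is a simplified illustration: the 40% lower threshold comes from the text, while the 90% upper threshold and the 10-point indicator are hypothetical, since the upper threshold varied by indicator.

```python
def points_awarded(
    achievement: float,
    max_points: float,
    lower: float = 0.40,   # lower threshold stated in the text
    upper: float = 0.90,   # HYPOTHETICAL upper threshold (varies by indicator)
) -> float:
    """Points earned on one indicator, linear in achievement between thresholds."""
    if not 0.0 <= achievement <= 1.0:
        raise ValueError("achievement must lie in [0, 1]")
    if upper <= lower:
        raise ValueError("upper threshold must exceed lower threshold")
    # Share of the maximum points, clamped to [0, 1] outside the thresholds.
    share = (achievement - lower) / (upper - lower)
    share = min(1.0, max(0.0, share))
    return max_points * share


# Example: a hypothetical 10-point indicator at three achievement levels.
example_points = [points_awarded(a, 10.0) for a in (0.30, 0.65, 0.95)]
```

Below the lower threshold no points are earned, and above the upper threshold additional coverage earns nothing extra, so the marginal financial incentive exists only between the two thresholds.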
On the practice's side, we consider the number of years the practice has existed, the size of its population and the proportion of its female and male population aged 65 or over. Practice years is calculated as the difference between the current date and the opening date. On average, practices have been in business for about 29 years, with a standard deviation of about 6 years. On average, about four percent of the practice population is aged 65 or over, although a practice could have up to a third of its population aged 65 or over. We also consider a measure of deprivation based on prescribing in general practice, the LISI (i.e. parameter $b_i$ in the theoretical model). More than 80 percent of prescriptions by NHS GPs in England are exempt from charges. Most such exemptions are made on the basis of age and specific diseases. However, 12.1 percent of items are exempt because of low income. LISI includes recipients (and their dependants) of family credit and income support and others who qualify because of low income. It indicates the proportion of total cost in a practice going to patients who are exempt for these reasons. On average, eleven percent of a practice's cost goes to patients with low income, but there is quite a large variation between practices. Practices in deprived areas may incur up to 90 percent of their costs on poorer patients.
We use the postcode to obtain the latitude and longitude of each practice. We then calculate the Vincenty (1975) distance to the 10 closest practices. Vincenty's formula is used in geodesy to calculate the distance between two points on the assumption that the figure of the Earth is an oblate spheroid rather than a sphere. We then calculate the distance (in miles) to the most distant of the 10 closest practices. We select the distance to the best practice (measured by its QOF points) that is furthest away amongst the 10 selected practices. Under the assumption that relocation costs increase with distance, this is an empirical proxy for parameter $t$ in the theoretical model. We recognise, however, that the attractiveness of a practice is determined by other occupation characteristics (e.g. wages) offered to the doctor that we cannot observe. On average, the best practice is at a distance of about nine miles, but there is a standard deviation of about 16 miles. Finally

Results and discussion
Before we discuss our results, we explain how we measure efficiency. In Table 2 we report estimates of Eq. 12, the pooled regression models of practice quality measured as the proportion of total QOF points. We indirectly show that age is a proxy for doctors' efficiency with three separate models. First, in Model I, estimated on the GMS database, we find that in single-handed practices an increase in the doctor's age is associated with a better performance of the practice. But this association is non-linear, as it is characterised by an inverted-U shape with a decline after the age of 42. The same non-linear relation between age and quality of the practice can be found in Model II using the full sample of practices. However, one limitation is that age could capture the effect of omitted factors such as income. In Model III, estimated on the GP WLS sample, we show that even after controlling for doctors' income, age has a statistically significant effect on practices' quality. As doctors gain more experience they improve practices' performance up to the age of 41, after which their efficiency starts declining.
Table 2 note: Model I on single practices, Models II-III on all practices. *** p < 0.01; ** p < 0.05; * p < 0.10
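The age at which an inverted-U relation peaks follows directly from the linear and quadratic age coefficients. The sketch below assumes a specification with age and age squared; the coefficient values are hypothetical and chosen only so that the peak falls at 42, since the estimates themselves are not reproduced in the text.

```python
def turning_point(beta_age: float, beta_age_sq: float) -> float:
    """Age at which beta_age * age + beta_age_sq * age**2 reaches its maximum."""
    if beta_age_sq >= 0:
        raise ValueError("an inverted-U shape requires a negative quadratic coefficient")
    # Setting the derivative beta_age + 2 * beta_age_sq * age to zero.
    return -beta_age / (2.0 * beta_age_sq)


# HYPOTHETICAL coefficients, not the paper's estimates.
peak_age = turning_point(beta_age=0.0084, beta_age_sq=-0.0001)
```

The same calculation applies to each of the age profiles discussed in the results, using the corresponding pair of estimated coefficients.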
We report results for the three testable implications in four tables. Results for the first and second testable implications, on doctor efficiency, partnership and mobility, are reported in Table 3. Results for the third testable implication are reported in Tables 4, 5 and 6.
In Table 3 we report two models of the partnership equation that show results for the first testable implication. Model I is a pooled linear regression and Model II is a random intercept multilevel model where doctors are nested in practices. Both models show that more efficient doctors, where efficiency is proxied by age, are more likely to become partners up to the age of 35, after which the probability of becoming a partner declines. Model II indicates that one additional year of experience increases the probability of becoming a partner by five percentage points when ability is accounted for, compared with seven percentage points in Model I, where ability is unaccounted for. Effort is a statistically significant factor in explaining the probability of becoming a partner. We show that the introduction of the QOF in 2004 decreased the probability of becoming a partner by one percentage point.
There is no evidence of sample selection in the mobility equation (see endnote 11). Mobility decreases with age, presumably because the costs of moving become higher. An additional year decreases the probability of moving by six percentage points. A doctor working in a practice that performs better in terms of QOF points is less likely to move. A ten percentage point increase in the proportion of total QOF points decreases the probability of moving practices by 0.4 percentage points. We nevertheless observe a rise in mobility over time: in 2004, with the introduction of the QOF, the probability of moving increases by two percentage points.
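Because the mobility equation is linear in probabilities, the effects quoted above are simple products of a coefficient and a change in the covariate. The sketch below makes that arithmetic explicit. It assumes the proportion of total QOF points is measured on a 0-1 scale, and the coefficient of -0.04 is back-calculated from the 0.4 percentage point effect quoted in the text rather than read from Table 3; the function name is hypothetical.

```python
def lpm_effect_pp(coefficient, change_in_x):
    """Change in predicted probability, in percentage points, in a linear probability model."""
    return coefficient * change_in_x * 100


# Back-calculated coefficient (assumption: QOF proportion on a 0-1 scale):
# a 0.10 increase lowers the probability of moving by 0.004, i.e. 0.4 points.
implied_coef = -0.004 / 0.10
print(round(lpm_effect_pp(implied_coef, 0.10), 2))  # -0.4

# A 0/1 year dummy with coefficient 0.02 corresponds to a two point increase.
print(round(lpm_effect_pp(0.02, 1), 2))  # 2.0
```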
Models I-III in Table 4 report the estimates for the third testable implication on medical effort, the quality of the practice and access of new doctors to the market. We find evidence of a decrease in medical effort (variable e_zi in the model) associated with the increase in the QOF price introduced in 2005. In Model II, however, we find an increase in the proportion of total QOF points achieved by practices. This result may be explained by the role that intrinsic motivation plays in the market. Indeed, theoretical contributions (Bénabou and Tirole [1]) and industry practitioners (see endnote 12) have argued that doctors' effort may not simply be the result of financial decisions, but also the product of social and reputational considerations. If reputation and prosocial behaviour are important to doctors, the introduction of a P4P scheme may induce them to reduce their effort in tasks that receive a financial incentive, to avoid being seen as greedy. Practice management may instead be less affected by reputational and prosocial considerations, and may react positively to the introduction of a P4P scheme. In Model I there is also evidence that more efficient doctors work more, but this association declines as they get older. Effort provision varies by contractual arrangement, as more efficient partners exert more effort. The more distant the nearest best QOF practice is, the less effort doctors exert, indicating either less competition or lower demand for services. Finally, in Model III we report estimates of the linear probability model of access to the market for new doctors. All covariates are contemporaneous because we do not observe the characteristics of doctors prior to their entry into the market.
Tables 5 and 6 report the effect of the price shock in 2005. We find results symmetric to the previous ones, with an increase in doctors' effort and in access to the market for new doctors, although the latter is not statistically significant.
There is statistically significant effect on mobility for either top or bottom practices based on their performance in 2004. We replicate this analysis using the exogenous price shock on all CVD indicators and find similar results, which are available on request.
In Table 7 in the Appendix we show that our results on partnership, mobility and access to the healthcare job market are unaffected by the linear specification because the coefficients of the non-linear models are qualitatively similar.

Conclusions
There is a lack of research on the sorting and retention effects that P4P may produce. In this paper we focus on the only P4P scheme for General Practitioners in England, the Quality and Outcomes Framework (QOF).
We have developed a simple two-practice theoretical model that accounts for doctors' altruism, different contractual arrangements and multidimensional quality. The model predicts that more efficient doctors should be more likely to be offered a partnership. The QOF may induce less turnover and generate more mobility among more efficient doctors.
We have then tested the empirical implications of this model using a unique linkage between Doctors Census data from 2003 to 2007, practices' characteristics and quality as measured by the QOF in England. After controlling for doctors' unobserved heterogeneity in multilevel models where doctors are nested in practices, we have found that the QOF is associated with a two percentage point increase in mobility. We have also found that the introduction of the QOF is associated with a decrease in the probability of becoming a partner by one percentage point, but that more efficient doctors are more likely to become partners. More efficient doctors improve the quality of the practice and, if they are also partners, they exert more effort.
Despite the relative richness of the data, our study has some limitations. Firstly, as the QOF applied to all practices in England, we identify its effect from time dummies. However, the National Audit Office Report [11] does not highlight any policy other than the QOF that was introduced in England between 2004 and 2005. We have also checked the robustness of our results using the 2006 negative price shock, which is exogenous to the individual doctor's behaviour, and we found similar results. Secondly, we cannot measure wages at the same time as mobility. We had to resort to an indirect method, arguing that age can proxy for experience, a component of efficiency. We have shown that this holds in single-handed practices, where practice quality is the doctor's quality. However, this relies on the assumption that the only factor that affects practice quality is the doctor's effort and not the effort of other staff.

Endnotes
1 See for example Glazer [7]. In the rest of the paper we will use the terms devotion, vocation, motivation and altruism interchangeably.
2 The assumption that the cost of effort is described by a quadratic function is commonly adopted in the literature (see for example Brekke and Sørgard [3] and Zweifel et al. [18]). The convexity of the cost function captures an opportunity cost concept, i.e. the idea that an additional unit of time/effort spent at work increases the value of leisure.
3 The assumption that health care providers face a convex function is standard. Convexity captures the idea that practices face a capacity constraint: there are limitations in physical resources. See for example Brekke et al. [2].
4 If the doctors had an outside option ensuring a monetary reward greater than zero, then practices would have to consider the satisfaction of a participation constraint and, in the case of a binding constraint, the fact that equilibrium profits would depend on the monetary remuneration of the outside option. Qualitatively, the results of the model remain unchanged.
5 The assumption that the partners of a practice share profits equally is clearly a simplification. Nonetheless, given the lack of information in the data available to us, it is an innocuous assumption.
6 The opposite case, where type v is located next to practice 2, is completely symmetric to case (iii).
7 A very recent debate in the news seems to indicate that m may have been reduced in the last few years. http://www.bbc.co.uk/news/uk-politics-24362902
8 Multilevel models are used because doctors are nested in practices.
9 We test the exclusion restriction assumption by including the number of years to statutory retirement directly in the mobility equation. We find that it does not have a statistically significant effect on mobility.
10 These statistics are not reported in Table 1 as they are not part of our main analysis.
11 We do not report the results of the year-by-year first stage probit selection model, but these can be made available on request.
12 http://www.telegraph.co.uk/news/health/news/11177632/Anger-over-cash-for-diagnoses-dementiaplan.html