Quantifying Relative Importance of Coronary Risk Factors on Patient Survival Following Coronary Artery Bypass Grafting: A Maximum Likelihood Analysis

Background: Coronary artery bypass grafting (CABG) is a major surgical intervention to relieve symptoms and promote survival for individuals with coronary heart disease (CHD). The benefits of the intervention are thought to be improved when underlying risk factors of CHD are ameliorated. However, in current health care systems the long-term follow-up of patients following CABG is not centralized in a way that allows survival trends to be determined and optimized. The survival of study participants who underwent CABG is compared with that of age- and gender-matched individuals from the general population. Differences in rates of survival are interpreted in terms of lifestyle choices and the impact of risk factors of CHD.


Introduction
Coronary Heart Disease (CHD) is the leading cause of death and disability in the industrialized world 1 . Much research and many treatments have been developed in an attempt to either prevent the disease, or in persons with CHD, to improve their survival and health quality of life through medical and surgical means. Coronary Artery Bypass Grafting (CABG) is one such treatment showing improved longevity for patients with continuing symptoms of CHD after optimal medical management 2 .
This work extends the temporal horizon of a cohort study which reported previously on earlier outcomes of patients who underwent CABG surgery [3][4][5][6][7]. Survival data for these patients are reported here to 1st January 2014 (≈18 years post surgery) and are used to determine which important risk factors have influenced the survival experience of this cohort 4,6,7. Data in the public domain are typically 30-day mortality and numbers of hospital procedures and their rates of mortality [8][9][10][11][12]. The patients investigated in this work had 30-day mortality rates of 11.4% for women and 3.6% for men, both of which are significantly higher than current mortality rates for CABG 3,9. Although it is important to understand the various factors controlling the short-term survival of patients post CABG, taking advantage of improved short-term survival rates requires an appreciation of the relative importance of longer-term risk factors in promoting survival. The term "long-term" in the literature largely refers to survival beyond several years post surgery. Such studies typically divide into analyses of survival from 3 to 10 years post-surgery, and a few exploring survival beyond 10 years, of which this investigation is a rare example. The latter often take their working data from a national database which has been systematically compiled from hospital records of CABG or synthesized from the findings of smaller studies. For example, Adelborg et al. recently reported on the 30-year mortality experience of individuals in the Danish population who underwent CABG between 1980 and 2009 13. The data comprised 51,307 CABG patients and 513,070 persons from the general population. They reported that patients who underwent CABG experienced higher rates of mortality than the general population, particularly within 30 days of surgery (3.2% vs. 0.2%) and after 10 years post surgery (51.1% vs. 35.6% from 10 to 20 years and 62.4% vs. 25.0% from 20 to 30 years).
The novelties in this work lie, first, in the use of mortality tables for diseased populations, second, in the incorporation of time evolving post-operative risk within these tables, and finally in the use of a maximum likelihood methodology to compare the survival of patients to 18 years post CABG against that of gender and age-matched individuals from the general population over the same period. The maximum likelihood approach is similar to Cox regression or multivariate logistic regression commonly used in survival studies [14][15][16][17][18][19][20][21][22] . Mortality tables allow mortality in a diseased population due to ageing alone to be compensated by comparing it against that of gender and age-matched individuals in the general population. Important risk factors influencing long-term survival can then be identified and quantified in relatively small samples of diseased patients. The approach is general, could be integrated within a computer package, and requires for its operation survival data for the general and sample populations together with a list of risk factors to be considered.

Methods
The study protocol for the overall study, methods, ethical and clinical access permissions are published elsewhere 4. Data used in this paper include new survival data and other data which are a subset of the larger dataset used in the original study cohort examining wider health outcomes and quality of life following CABG [3][4][5]. Data on survival and times of death were extracted through a data linkage with permissions from ISD Statistics Registrar General 23.
Baseline male and female age distributions are illustrated in Figure 1. A chi-squared test indicates that these distributions were not significantly different (χ²=1.395, p=0.989).

The Maximum Likelihood methodology
The maximum likelihood methodology in the context of this study is developed in four stages which are now described in overview in the text, but with a comprehensive description of the technical details underlying each stage in the Appendix. Maximum likelihood is an asymptotically unbiased and optimal estimation procedure in the sense that it achieves the Cramer-Rao lower bound as the sample size approaches infinity 24 . When feasible, maximum likelihood estimation is therefore the methodology of choice because it provides increasingly unbiased parameter estimates with the best achievable standard error as the size of the data set is progressively increased. The methodology is used widely, for example, in Finance, Biology and Economics [25][26][27] .
Stage I: Mortality tables for the diseased male and female populations are constructed and used to compare the rate of survival of individuals in the risk group with age matched individuals in the general population 28 . This strategy compensates for decreasing survival within the risk population by virtue of ageing alone, and also allows for the possibility that the important risk factors of CHD are gender dependent.
Stage II: Mortality tables developed at Stage I are used to calculate the probabilities of survival beyond or death within the period of the study for each participant. Participants behave independently, and therefore the likelihood function to be maximized for the male and female cohorts is the product of the contribution from each patient.

Stage III: In the parlance of insurance, extra risk resulting from lifestyle choices and the presence of risk factors of CHD is specified in terms of a 'loading' pertaining to each risk factor, with higher loadings being associated with greater risks 29. The possible reoccurrence of hypertensive and cardiac risks and the onset of diabetes mellitus, together with the relief of cholesterol risk, mean that loadings in this study are intrinsically time dependent.

Stage IV: The likelihood function constructed at Stage II is maximized and the value of the loading pertaining to each risk factor is identified together with its associated standard deviation. Initially all risk factors are considered, but risks with loadings within two standard deviations of zero have p>0.050 and are judged to be not significant. Using this strategy, the maximum likelihood procedure is iterated and the current least important risk factor is eliminated. Iteration terminates when all remaining risk loadings are significant. However, the number of important risks that can be identified in this investigation is limited by the small sizes of the male and female cohorts.

Results
Results divide into descriptive properties of the male and female cohorts including comparisons of incidences of male and female risk factors, and the identification and estimation of loadings for the important risk factors.

Cohort survival
Eight male and six female patients died prior to the one-year follow-up. A further 25 male and 8 female patients died between the one-year and eight-year follow-ups, and by the final time of data collection eighteen years post surgery a further 46 male and 14 female patients had died, leaving 85 (51.8%) surviving males and 16 (36.4%) surviving females from the initial sample of 164 male and 44 female participants (χ²=4.449, p=0.035). Male long-term survival is therefore significantly better than that for females. Equivalent percentages for an age-matched cohort from the general population are 73.0% for males and 71.5% for females (p=0.6114) 30. Figure 2 plots the percentage survival of the male and female diseased cohorts and, for comparison, the expected survival of the age-matched male and female cohorts from the general population.

Lifestyle, cardiac and CHD risks
Table 1 reports on the various lifestyle, cardiac and risk factors of CHD to be considered in this study. Disparities in numbers between baseline and the one-year follow-up are due to early deaths, incomplete data and participants withdrawing consent.
Raza et al. report diabetes mellitus rates for patients who have undergone CABG of ≈7% in the 1970s, reaching ≈33% around 2000 and increasing to ≈40% around 2009 31. The male and female incidences of diabetes mellitus (DM) in this study are low at 13.9% and 4.5% respectively, potentially limiting the identification of DM as a risk factor influencing patient survival. Table 1 records the absence/presence of each risk factor for all patients with complete data for that risk at two different times. These data are used to estimate how various risks of CHD evolve. For example, of 88 male patients providing cardiac data at the one and eight-year follow-ups, the number experiencing a cardiac symptom increased from 54 to 62. By contrast, male cholesterol risk was relieved in 33 patients over the same period. If a risk either recurs or is relieved at a constant annual rate of R/year, then the probability that that risk has not recurred or not been relieved after t years is $e^{-Rt}$. The bracketed data of Table 1 allow the value of R to be estimated for each risk factor. This information will be used in the specification of extra risk at Stage III.
A chi-squared comparison of the incidence of female hypertensive risk between baseline and the one-year follow-up gave χ²=1.309, p=0.253 (Table 1). The same comparison for male patients gave χ²=11.133, p=0.001. Female numbers may be insufficient to draw a statistically significant conclusion, but the male result indicates that a side effect of CABG is increased patient susceptibility to hypertensive risk. Many patients exhibit cholesterol risk (Table 1) at baseline, and almost all have cardiac symptoms. At the one-year follow-up cardiac symptoms were present in 75.8% of women and 68.3% of men (χ²=0.712, p=0.399), rising to 84.2% of women and 70.2% of men eight years post operation (χ²=1.556, p=0.212).
Although the majority of patients were relieved of angina symptoms at the one-year follow-up, surgery was rather less effective in relieving breathlessness symptoms and reducing cholesterol risk.
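The constant-rate model for the recurrence or relief of a risk can be illustrated with a short calculation. The sketch below solves $e^{-Rt}$ = (fraction unchanged) for R. The function name `annual_rate` is hypothetical, and the worked numbers rest on an assumption not stated in the text: that the net rise in male cardiac symptoms from 54 to 62 of 88 patients over the seven years between follow-ups consists entirely of new onsets, with no relief.

```python
import math

def annual_rate(n_unchanged_start, n_unchanged_end, years):
    """Solve n_end / n_start = exp(-R * years) for the annual rate R."""
    if not 0 < n_unchanged_end <= n_unchanged_start:
        raise ValueError("need 0 < n_unchanged_end <= n_unchanged_start")
    return -math.log(n_unchanged_end / n_unchanged_start) / years

# Male cardiac symptoms: 88 patients, 54 symptomatic at year 1, 62 at year 8.
# ASSUMPTION: every change is a new onset, so the symptom-free group only shrinks.
symptom_free_year1 = 88 - 54
symptom_free_year8 = 88 - 62
R_cardiac = annual_rate(symptom_free_year1, symptom_free_year8, years=7.0)
print(f"estimated recurrence rate: {R_cardiac:.4f} per year")
```

A rate for a relievable risk such as cholesterol would be obtained in the same way from the number of patients still at risk at each follow-up.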

Risk factor loadings
After the systematic elimination of risks judged not to be significant (Stage IV), the important risks for men were post-operative smoking and the presence of cardiac symptoms. For women the equivalent risks were post-operative smoking and the presence of hypertension. Table 3 shows the loadings of each important risk with its standard error.
By comparison with an age-matched male in the general population, the loadings in Table 3 [...] and for female patients this factor was obesity, which had a negative loading. Although not statistically significant in this study, the negative value of this loading suggests that obese female patients may have a marginally better survival post surgery. In a large study conducted by Terada et al. obese females were observed to have significantly better survival post CABG 21.
Goodness-of-fit of the important loadings in Table 3 was tested using Monte Carlo simulation involving 5000 repetitions of the survival of the male and female cohorts 32. Figure 3 illustrates the level of agreement between the Kaplan-Meier survival plots (stepped) and the mean simulated survival curves (smooth) for the male and female cohorts. Dashed lines denote the simulated upper and lower bounds of the 95% confidence interval. Nowhere do the observed survival curves lie outside the 95% confidence bounds. The loadings in Table 3 are therefore reliable indicators of the post-operative survival experience of the male and female cohorts.
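The simulation-based check described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and a single constant annual hazard stands in for the fitted, risk-adjusted mortality table, which is an assumption made to keep the example self-contained. Each repetition simulates the death times of a whole cohort, and the 2.5th and 97.5th percentiles across repetitions give the band against which an observed survival curve is compared.

```python
import random

def simulate_survival_band(n_patients, annual_hazard, years, repetitions, seed=0):
    """Return (mean, lower, upper) survival fractions at years 0..years."""
    rng = random.Random(seed)
    curves = []
    for _ in range(repetitions):
        # ASSUMPTION: exponential death times with one constant hazard.
        deaths = [rng.expovariate(annual_hazard) for _ in range(n_patients)]
        curves.append([sum(d > y for d in deaths) / n_patients
                       for y in range(years + 1)])
    mean, lower, upper = [], [], []
    for y in range(years + 1):
        column = sorted(curve[y] for curve in curves)
        mean.append(sum(column) / repetitions)
        lower.append(column[int(0.025 * (repetitions - 1))])
        upper.append(column[int(0.975 * (repetitions - 1))])
    return mean, lower, upper

def inside_band(observed, lower, upper):
    """True if the observed curve never leaves the simulated 95% band."""
    return all(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))

# Illustrative run: cohort size as for the male cohort, hazard value hypothetical.
mean, lower, upper = simulate_survival_band(
    n_patients=164, annual_hazard=0.04, years=18, repetitions=500)
print(inside_band(mean, lower, upper))
```

In practice the observed Kaplan-Meier fractions would replace `mean` in the final call, and the number of repetitions would be raised to the 5000 used in the study.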

Discussion of Results
This study set out to identify important risk factors affecting the post-operative survival of patients who underwent CABG, and to quantify the annual loading associated with each risk. Post-operative smoking and the presence of a post-operative cardiac symptom were identified as the important risks influencing the post-operative survival of male patients, accounting for annual reductions of 2.4% and 1.2% respectively (3.6% when both are present) by comparison with an age-matched male from the general population. For female patients the important risks are post-operative smoking and hypertension, accounting for 3.9% and 2.7% respectively (6.6% when both are present) by comparison with an age-matched female from the general population. The long-term survival of the patients in this study closely follows patterns reported in actual patient surveys involving the recall of patients in order to obtain follow-up data 33,34. The strategy used relies on survival data for the general population and the study population together with estimated rates of the recurrence/relief of risk factors of CHD 30. The maximum likelihood approach offers a way to manage missing data using population survival data and minimal patient follow-up.
A novel development in this work uses a mortality table for the general population to manage the effects of ageing in a diseased population and incorporate extra risks 30 . These risks for patients post CABG are expressed as time-dependent loadings, one for each risk factor present. The strategy, however, can be applied to any situations in which survival is degraded by excessive exposure to risks not representative of the general population. The output of the maximum likelihood approach is then optimal values of risk factor loadings and their significance.
The novel idea of evolving risk in this work, e.g. recurrence of cardiac symptoms or relief of cholesterol risk, has general applicability and allows risk to be managed long-term with minimal recall of patients. In this study the rates of recurrence/relief of risks are estimated from data collected at the one and eight-year follow-ups. In this same cohort of patients earlier follow-up showed that well-being (SF36 scores) had also declined despite overall improvements in the initial years after operation 35. This trend has been reported in other, larger studies. In another study of 30 years follow-up after CABG the majority of patients (94%) had a repeat revascularisation intervention, similarly showing the lack of endurance of the relief of cardiac symptoms after maximal pharmacological treatment 33. However, atherosclerosis is known to occur, particularly when risk factors of CHD remain above target levels. As can be noted from Table 1, the attainment of optimal levels of risk factors remains suboptimal and could provide some explanation for the recurrence of cardiac ischaemic symptoms. Unfortunately, these are not uncommon findings; adverse lifestyle trends have also been reported in large surveys of lifestyle and risk factor targets of patients with coronary or other atherosclerotic disease and people at high risk of developing cardiovascular disease 36.
The importance and novelty of the maximum likelihood approach lie in its optimality: it is the method of choice whenever feasible. Specifically, the maximum likelihood technique generates asymptotically unbiased estimates of risk factor loadings and achieves the Cramer-Rao lower bound as the volume of data is increased 24. Maximum likelihood estimation has widespread uses inter alia in Finance, Biology and Economics [25][26][27]. For example, Avdis and Wachter use this strategy to estimate the extra interest over and above the risk-free rate that is earned for holding a risky asset 25. Such premia inform investors how best to divide portfolios between risky (equities) and non-risky (bonds) assets. Lanot's use of the maximum likelihood approach in the labor market mirrors the survival modeling of this article 27. An individual choosing to participate or not participate in the labor market is analogous to a patient dying within the study period or surviving beyond the end of the study.
Diabetes mellitus was not identified as a strong risk factor of CHD for male or female long-term survival in this study despite its known adverse impact on life-span 18. An investigation of the long-term survival of 2766 patients who had undergone CABG, 43.4% of whom had type II diabetes, concluded that the impact of diabetes on survival was measurable only after 10 years 22. The finding here is therefore unsurprising given the sizes and incidences of diabetes in the study cohorts.
The importance and utility of cardiac patients' own assessments of symptom severity remain under-investigated, yet such assessments are of paramount importance to patient and family well-being and quality of life, and act as a catalyst for healthcare utilization 37. This work sheds light on the patients' health benefits in the longer term after an intensive intervention, and highlights that continued cardiac symptoms are associated with reduced survival in males. Even as electronic records systems become embedded in healthcare administration, further development would be required in order to capture these important long-term patient experiences to help support choices in clinical decision-making from both patient and clinician perspectives. As digital devices become more acceptable and reliable in health monitoring and feedback, such devices could provide a sense of empowerment and motivation for individuals to try to optimize their cardiovascular health. Through improved engagement of patients with their health care professionals and through appropriate health activities, improved health outcomes could be achieved, although expert bodies recommend that more research is required to identify the best way to harness the capabilities of these devices 38.

Conclusions
Male survival 18 years following CABG was significantly better than female survival, but both were worse than that of age-matched individuals from the general population. New insights into the relative importance of risk factors of CHD in post-operative survival were obtained using a novel application of maximum likelihood estimation. Continued smoking was the most significant risk factor associated with decreased survival in the total group. Thereafter recurrence of cardiac symptoms in males and persistent hypertension in females were the most important other factors degrading survival.

Limitations
The most important limitation in this study is the small sizes of the male and female cohorts. Consequently only the most important risk factors could be identified. This limitation was particularly relevant for diabetes risk due to the small fraction of participants with this disease.

Stage I -The Mortality Table
Consider a model population in which $l_x$ individuals attain the age of x. The rate of mortality at age x, denoted by $q_x$, is derived from $l_x$ by the formula

$$q_x = \frac{l_x - l_{x+1}}{l_x}. \tag{1}$$

Gender-based tables of $q_x$ used in this work were obtained from the Office for National Statistics [30] for the general population. Additional risk due to a history of CHD is incorporated within (1) by the specification

$$q_x^* = \ldots \tag{2}$$

where $\mu(x)$ quantifies the extra risks experienced by a patient at age x. The outcome is a new risk-adjusted table with $l_x^*$ individuals alive at age x. Within this table, the fraction of individuals aged x years reaching age x+t years is

$$\frac{l^*_{x+t}}{l^*_x}. \tag{3}$$

The central concept of this work is the construction and fitting of a model for $\mu(x)$ to survival data for the male and female patients in this study. The fitting procedure is based on the principle of maximum likelihood estimation, which in overview is similar to logistic regression, but differs from it via the construction of the underlying model.
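The construction of a risk-adjusted survival fraction from a population mortality table can be sketched as follows. The code is a minimal illustration, not the implementation used in this study. The life-table fragment is hypothetical (it is not ONS data), the function names are invented for the example, and the way extra risk enters is an assumption: each active risk is taken to add a constant annual hazard, so that population survival is multiplied by the exponential of minus the accumulated extra hazard.

```python
import math

def mortality_rate(l, x):
    """Equation (1): q_x = (l_x - l_{x+1}) / l_x."""
    return (l[x] - l[x + 1]) / l[x]

def population_survival(l, x, t):
    """Fraction of general-population lives aged x that reach age x + t."""
    return l[x + t] / l[x]

def risk_adjusted_survival(l, x, t, annual_loadings):
    """Population survival scaled by the accumulated extra hazard.

    ASSUMPTION: loadings are constant annual hazards acting for all t years.
    """
    extra_hazard = sum(annual_loadings) * t
    return population_survival(l, x, t) * math.exp(-extra_hazard)

# Hypothetical life-table fragment: number alive at each exact age.
l = {60: 100000, 61: 99000, 62: 97900, 63: 96700}

q60 = mortality_rate(l, 60)
baseline = population_survival(l, 60, 3)
# Illustrative loadings only (two active risks).
adjusted = risk_adjusted_survival(l, 60, 3, annual_loadings=[0.024, 0.012])
print(q60, baseline, adjusted)
```

Because the population table already carries the effect of ageing, the loadings in such a scheme measure only the survival deficit relative to an age-matched individual.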

Stage II -Construction of the Likelihood Function
Consider the survival experience of n patients aged $x_1,\ldots,x_n$ at the time of operation when observed over a post-operative period of T years. Let $l^*_x$ denote the number of individuals alive at age x in the risk-adjusted table of Stage I. The probabilities that a patient aged x at the time of surgery will die between ages $x+t$ and $x+t+\Delta t$, where t<T, or survive T years post-surgery are respectively

$$\frac{l^*_{x+t} - l^*_{x+t+\Delta t}}{l^*_x} \tag{4}$$

and

$$\frac{l^*_{x+T}}{l^*_x}. \tag{5}$$

Because each patient provides independent data, the probability of observing the survival experience of either cohort is the product of the probabilities contributed by each patient of that cohort, namely expression (4) for patients dying within T years of surgery, or expression (5) for surviving patients. For numerical reasons, however, it is preferable to minimize the sum of the negative logarithms of expressions (4) and (5), one contribution from each patient of the cohort.
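The sum of negative logarithms described above can be written down directly. The sketch below is illustrative: `negative_log_likelihood` and `make_survival` are hypothetical names, the four patient records are invented, and the survival function is a deliberately simple stand-in (a single constant hazard that ignores age) rather than the risk-adjusted table of Stage I. A coarse grid search takes the place of a proper optimizer.

```python
import math

def negative_log_likelihood(patients, survival, dt=1.0):
    """Sum of -log(probability) over patients.

    patients: iterable of (age_at_surgery, years, died) tuples, where `years`
    is the start of the interval of death, or the follow-up time T for survivors.
    survival(age, t): probability that a patient aged `age` survives t years.
    """
    total = 0.0
    for age, years, died in patients:
        if died:
            # Death in (years, years + dt): analogue of expression (4).
            prob = survival(age, years) - survival(age, years + dt)
        else:
            # Survival to the end of follow-up: analogue of expression (5).
            prob = survival(age, years)
        total -= math.log(prob)
    return total

def make_survival(hazard):
    """ASSUMPTION: constant-hazard stand-in for the risk-adjusted table."""
    return lambda age, t: math.exp(-hazard * t)

# Invented records: (age at surgery, years, died within study).
patients = [(62, 3.0, True), (58, 18.0, False), (70, 11.0, True), (65, 18.0, False)]

grid = [0.01 * k for k in range(1, 21)]
best_hazard = min(
    grid, key=lambda h: negative_log_likelihood(patients, make_survival(h)))
print(best_hazard)
```

Working with the sum of logarithms rather than the product of probabilities avoids numerical underflow when the cohort is large.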

Stage III -Specification of Extra Risk
The completion of the problem requires the specification of the extra-risk function $\mu(x)$, which must take into account the behavior of individual risk factors. Mons et al. 39 suggest that the impact of pre-operative smoking ranges from 4.25 to 6.75 years for current smokers and between 1.4 and 2.4 years for former smokers.
This information is incorporated within the specification of $\mu(x)$ by tapering pre-operative smoking risk to zero over five years post-surgery. Subsequent analysis will assume a fixed hazard rate for each risk factor independent of age. What matters is the duration of a patient's exposure to that risk. For a fixed annual hazard rate of h over t years post surgery, the impact of a recurring risk on survival is managed by the expression

$$\begin{cases} h\,t & \text{present immediately after surgery,} \\[4pt] h\left[t - \dfrac{1-e^{-Rt}}{R}\right] & \text{not present after surgery,} \end{cases} \tag{8}$$

where R is the annual rate of reoccurrence. Following a similar logic, the impact of a relievable risk is modeled over t years post-surgery by the expression

$$\begin{cases} h\,\dfrac{1-e^{-Rt}}{R} & \text{present immediately after surgery,} \\[4pt] 0 & \text{not present after surgery.} \end{cases} \tag{9}$$

The form of the time integral of $\mu$ post-surgery is the sum of expressions of type (8) or (9) taken over all risk factors, as encapsulated in the general expression

$$\int_0^t \mu(x+s)\,ds = \int_0^t \left[\lambda_{\text{PRE-SMK}}\,\Phi_{\text{PRE-SMK}}(s) + \cdots + \lambda_{\text{DM}}\,\Phi_{\text{DM}}(s)\right] ds \tag{10}$$

where s denotes elapsed time post-surgery and the values of $\lambda_{\text{PRE-SMK}}$ to $\lambda_{\text{DM}}$ measure the loadings of the associated risks and will be estimated from patient data by maximum likelihood estimation. The function $\Phi(s)$ has value one if the subscripted risk is active at time s post-surgery and zero otherwise. For example, suppose that a patient has no hypertensive risk immediately post-surgery, then $\Phi_{BP}(0)=0$. The onset of hypertensive risk during the interval $(s, s+\Delta s)$ post-surgery occurs with probability $e^{-R_{BP}s}R_{BP}\,\Delta s$, so that $\Phi_{BP}(u)=1$ for u>s with this probability. Thus, for patients who do not initially have hypertensive risk

$$\int_0^t \Phi_{BP}(u)\,du = t - s \quad \text{with } p = e^{-R_{BP}s}R_{BP}\,\Delta s, \quad 0 \le s < t, \tag{11}$$

where p is a probability. Cardiac symptoms, diabetes mellitus and cholesterol risk are treated similarly.
Once the values of the loadings $\lambda_{\text{PRE-SMK}}$ to $\lambda_{\text{DM}}$ have been estimated for each risk factor, the quality of the survival model for each gender is assessed by a simulation exercise.
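The effect of a risk that recurs or is relieved at a constant annual rate R can be made concrete by computing the expected number of years for which the risk is active. The sketch below is an illustration under stated assumptions rather than the study's own formulae: it assumes a recurring risk stays active once it appears and a relieved risk never returns, and the function names and the values of R and t are hypothetical. The closed forms are checked against a direct numerical integration of the probability that the risk is active.

```python
import math

def active_years_recurring(t, R, present_at_surgery):
    """Expected years active in [0, t] for a risk that can recur at rate R."""
    if present_at_surgery:
        return t
    # Probability active at time s is 1 - exp(-R s); integrate over [0, t].
    return t - (1.0 - math.exp(-R * t)) / R

def active_years_relievable(t, R, present_at_surgery):
    """Expected years active in [0, t] for a risk that can be relieved at rate R."""
    if not present_at_surgery:
        return 0.0
    # Probability still active at time s is exp(-R s); integrate over [0, t].
    return (1.0 - math.exp(-R * t)) / R

def numeric_recurring(t, R, steps=100_000):
    """Midpoint-rule check of the recurring-risk closed form."""
    ds = t / steps
    return sum(1.0 - math.exp(-R * (k + 0.5) * ds) for k in range(steps)) * ds

t, R = 10.0, 0.04   # hypothetical horizon (years) and annual rate
closed = active_years_recurring(t, R, present_at_surgery=False)
print(closed, numeric_recurring(t, R))
# Multiplying by an annual loading h gives the accumulated extra hazard.
```

Under these assumptions the two expressions are complementary: for the same R and t, the expected active time of an initially absent recurring risk and that of an initially present relievable risk sum to t.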

Stage IV -Estimation Procedure
The estimation strategy is an iterative procedure which begins by including the loadings $\lambda_{\text{PRE-SMK}}$ to $\lambda_{\text{DM}}$ associated with all risk factors. On completion of the first iteration, each loading is assigned a value together with its standard deviation. Ordinarily, loadings with values within two standard deviations of zero are discarded as not significantly different from zero. This strategy is repeated, with successive iterations used to systematically eliminate loadings that are not significant until only primary risk factors remain, namely risk factors associated with loadings whose values lie two standard deviations or more from zero. Risk factors associated with loadings close to two standard deviations from zero denote secondary risks. However, the sizes of the cohorts available to this investigation are small, particularly the female cohort, with the result that less important primary risk factors can be manifest as secondary risks.
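The iterative elimination described above can be sketched as a short loop. This is an illustration, not the study's implementation: `backward_eliminate` is a hypothetical name, and `fit` stands for any routine that maximizes the likelihood over the currently active risks and returns each loading with its standard deviation. The stub used here returns invented numbers that do not come from Table 3.

```python
def backward_eliminate(risks, fit, threshold=2.0):
    """Drop the least significant risk until all remaining loadings are significant.

    fit(active) must return {risk: (loading, standard_deviation)} for the
    risks in `active`; a loading is kept when |loading| / sd >= threshold.
    """
    active = list(risks)
    while active:
        estimates = fit(active)
        ratio = {r: abs(value) / sd for r, (value, sd) in estimates.items()}
        weakest = min(active, key=ratio.get)
        if ratio[weakest] >= threshold:
            return estimates          # every remaining loading is significant
        active.remove(weakest)        # eliminate and refit without it
    return {}

# Hypothetical stand-in for a maximum likelihood fit (invented values).
TOY_ESTIMATES = {
    "smoking": (0.030, 0.010),
    "cardiac": (0.015, 0.006),
    "obesity": (-0.004, 0.006),
    "diabetes": (0.003, 0.009),
}

def toy_fit(active):
    return {risk: TOY_ESTIMATES[risk] for risk in active}

kept = backward_eliminate(TOY_ESTIMATES, toy_fit)
print(sorted(kept))
```

In a real fit the estimates change each time a risk is removed, which is why the loop refits after every elimination rather than discarding all non-significant loadings at once.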