Symptomatic severe aortic stenosis (AS) is the most common indication for valvular interventions.1 AS is a degenerative and progressive disease that characteristically remains asymptomatic for decades, but once symptoms occur, survival is severely compromised. Historical data have shown that the time from the onset of symptoms to death is about 2 years in patients who develop heart failure (HF) symptoms, 3 years in those who present with syncope and 5 years in those presenting with angina.2 Long-term follow-up of the Placement of Aortic Transcatheter Valves (PARTNER 1B) trial showed that two-thirds of inoperable patients who received standard treatment did not survive beyond 2 years, while transcatheter aortic valve replacement (TAVR) halved mortality.3
Stages of cardiac damage in patients with severe AS have recently been defined (Figure 1).4 Stage 1 includes increased left ventricular mass, increased left ventricular filling pressures and systolic dysfunction defined as left ventricular ejection fraction (LVEF) <50%. Further stages relate to damage of the left atrium or mitral valve (stage 2), pulmonary vasculature or tricuspid valve (stage 3) and right ventricular damage (stage 4). Each stage is associated with an increased risk of mortality within 1 year, ranging from 4% at stage 0 (no damage) up to 25% at stage 4. Stage 1 patients, when compared with AS patients without cardiac damage, show an increased mortality (9% versus 4%), hospitalisation rate (17% versus 7%) and stroke rate (6% versus 2%). Recent studies have called into question the traditional 50% LVEF cut-off, suggesting that in patients with AS, an ejection fraction of ≤60% may precede the onset of symptoms and may also predict progression of the disease.5 Thus, evaluation of the left ventricular systolic function is critical in the follow-up of patients with asymptomatic AS and early detection of dysfunction prompts the need for accelerated aortic valve replacement.1 Moreover, ascertainment of history of congestive HF may better characterise the prognosis of patients who are having a planned intervention. The impact of chronic HF and systolic dysfunction on treatment selection – TAVR versus surgical aortic valve replacement (SAVR) – merits further research.1
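The staging scheme described above can be sketched as a simple lookup. This is a minimal illustration only: the class and field names are hypothetical, each flag is assumed to have been determined elsewhere (the measurement thresholds other than LVEF <50% are not given in this text), and the rule that a patient is assigned to the highest stage for which any damage is present is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class DamageFindings:
    """Hypothetical per-patient flags; how each is measured is outside this sketch."""
    left_ventricular: bool = False        # increased LV mass, raised filling pressures or LVEF <50%
    left_atrial_or_mitral: bool = False   # stage 2 findings
    pulmonary_or_tricuspid: bool = False  # stage 3 findings
    right_ventricular: bool = False       # stage 4 findings


def cardiac_damage_stage(findings: DamageFindings) -> int:
    """Return the highest stage (0 to 4) for which damage is flagged (assumed rule)."""
    if findings.right_ventricular:
        return 4
    if findings.pulmonary_or_tricuspid:
        return 3
    if findings.left_atrial_or_mitral:
        return 2
    if findings.left_ventricular:
        return 1
    return 0  # no cardiac damage
```

Under this assumed rule, a patient with both left ventricular and tricuspid involvement would be classified as stage 3.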
In this review, we summarise the prevalence of HF in patients included in clinical trials comparing TAVR with SAVR or medical treatment, as well as tools for assessment of the functional status, quality of life and clinical events during follow-up. We also discuss recent recommendations for broadening the definition of HF-related clinical events, as well as statistical methods that not only increase the power of comparisons but may also better capture the burden of the disease and the effects of experimental therapies in the setting of clinical trials. Cardiac imaging to assess LVEF, longitudinal strain, mitral regurgitation and haemodynamic parameters, such as pulmonary artery systolic pressure, as well as biomarkers, such as brain natriuretic peptide (BNP) and N-terminal pro-BNP (NT-proBNP), have been associated with HF symptoms and their worsening, but detailed discussion of these is beyond the scope of this review.6
Prevalence of Heart Failure and Related Comorbidities
HF is multifactorial in patients with severe AS and can be a consequence of the increased afterload and myocardial remodelling, with the contributory effect of the cardiac damage characteristic of stages 2–4, or secondary to ischaemia.4 Characterising the aetiology requires interrogation and assessment of previous MI or coronary artery disease, status of coronary artery lesions (i.e. existence of lesions requiring intervention), atrial fibrillation and pulmonary hypertension. The definition of prior congestive HF is not standardised and may range from ambulatory symptoms to prior hospitalisations for HF. In Table 1 we summarise the baseline characteristics of seven clinical trials, providing data for both the TAVR and control groups when available.7–13 Prior MI ranged from 5% in a study with an all-comers design to 31% in an extreme risk cohort;14 coronary artery disease affected two-thirds of patients and atrial fibrillation one-third.7–13 Remarkably, previous HF was captured in only three of the seven studies, and was highly prevalent (≥95%) in patients with intermediate, high or extreme risk.7–12 LVEF was reported in four of seven studies and mean values were always above the cut-off value accepted for normality (>50%). An ejection fraction <50% was seen in ~30–50% of patients with severe AS.7–9,12 The New York Heart Association (NYHA) functional class at baseline was used in all studies and accurately reflected the risk of the analysed cohorts. In an all-comers design,13 approximately half of the patients presented with NYHA class I or II (Table 1), while this number was less than 10% in cohorts with high or extreme risk.7,8,12 Likewise, NYHA class IV was present in up to half of patients at high surgical risk, while it was observed in <3% of the all-comers cohort.12,13 These findings underscore the value of NYHA class for characterising the baseline functional status of patients with severe AS.
Although the reproducibility of this assessment has been criticised, its simplicity and availability make it a useful functional assessment.15 It is noteworthy that prior congestive HF should be better defined and standardised and consistently captured in cardiovascular trials.
Ascertainment of Heart Failure-related Clinical Events at Follow-up
All-cause and Cardiovascular Mortality
TAVR has revolutionised the management of severe AS. This is largely due to continued improvement in transcatheter heart valves and implantation techniques. Efforts to expand its indication have targeted populations with progressively lower surgical risk.7–13 These combined factors resulted in a consistent decrease in overall rates of all-cause death at 1 year from 31% (n=179) in the inoperable cohort of the PARTNER IB trial treated with TAVR,8 to 7% (n=864) in the Surgical Replacement and Transcatheter Aortic Valve Implantation (SURTAVI) trial targeting patients with an intermediate risk, and 5% (n=145) in the all-comers Nordic Aortic Valve Intervention (NOTION) trial (Table 2).11,13 In cardiovascular research, all-cause mortality is considered the most robust and unbiased clinical endpoint (Figure 2).16 Nevertheless, it may lack specificity, and thus differentiation between cardiovascular and non-cardiovascular death is compulsory. Given the complexity in classifying events as cardiovascular or non-cardiovascular, the involvement of an independent clinical events committee is considered a quality marker when interpreting trial outcomes.16 It has been suggested that using cardiovascular death in composite primary endpoints instead of all-cause mortality, for example cardiovascular death and hospitalisations for HF, reduces statistical noise generated by non-cardiovascular fatal events that are generally not influenced by targeted cardiovascular interventions.17
Hospitalisation due to Heart Failure
Although rehospitalisation due to HF is considered a less robust endpoint in clinical trials due to the lack of implementation of standardised definitions, it remains the most important outcome from the patient prognosis and health economic perspectives. In three of the seven trials included in this review it was not reported, and definitions varied slightly when it was available.7,10,13 Frequently, it is difficult to distinguish between a hospitalisation due to aortic valve disease and/or complications of the valve procedure and a hospitalisation due to HF. These are not always mutually exclusive and strict criteria should be applied to be able to adjudicate and report both. A standardised definition of hospitalisation due to HF is needed if meaningful comparisons of rates among cardiovascular trials are to be made.16 The Standardised Data Collection for Cardiovascular Trials Initiative (SCTI), in collaboration with the Food and Drug Administration (FDA), the American College of Cardiology and the American Heart Association, recommends a standardised definition for HF events, which include urgent, unscheduled outpatient office/practice or emergency department visits and hospitalisations due to HF.18 An HF hospitalisation occurs when a patient is admitted to the hospital with a primary diagnosis of HF, the length of stay is at least 24 hours (or extends over a calendar date), the patient exhibits at least one new or worsening symptom of HF, has objective evidence of new or worsening HF (at least two signs, or one sign and one laboratory finding), and receives initiation or intensification of treatment specifically for HF. HF signs and symptoms, relevant changes in therapy and laboratory findings (BNP or NT-proBNP, radiological evidence, non-invasive cardiac imaging, right heart catheterisation) are carefully defined in the SCTI document.
The almost simultaneously published Mitral Valve Academic Research Consortium consensus manuscript defines what qualifies as a hospitalisation (≥24 hour stay), with criteria for HF hospitalisation or rehospitalisation requiring signs, symptoms and/or laboratory evidence of worsening HF and administration of IV or mechanical HF therapies.19 It further sub-classifies HF hospitalisation into primary (cardiac-related) and secondary (non-cardiac-related).
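The SCTI criteria for an HF hospitalisation listed above lend themselves to a rule-based check, which may help when drafting case report forms or adjudication worksheets. The sketch below is illustrative only: the `Admission` record and its field names are hypothetical, and the counts of symptoms, signs and laboratory findings are assumed to have been ascertained against the definitions in the SCTI document, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    """Hypothetical adjudication record for a single hospital admission."""
    primary_diagnosis_hf: bool
    length_of_stay_hours: float
    crosses_calendar_date: bool
    new_or_worsening_symptoms: int   # number of qualifying HF symptoms
    signs: int                       # number of qualifying physical signs
    lab_findings: int                # number of qualifying laboratory findings
    hf_treatment_started_or_intensified: bool


def is_hf_hospitalisation(a: Admission) -> bool:
    """Apply the five criteria quoted in the text; all must be met."""
    long_enough = a.length_of_stay_hours >= 24 or a.crosses_calendar_date
    # Objective evidence: at least two signs, or one sign plus one laboratory finding.
    objective_evidence = a.signs >= 2 or (a.signs >= 1 and a.lab_findings >= 1)
    return (
        a.primary_diagnosis_hf
        and long_enough
        and a.new_or_worsening_symptoms >= 1
        and objective_evidence
        and a.hf_treatment_started_or_intensified
    )
```

An admission with a raised NT-proBNP but no physical sign would not qualify under this reading, because laboratory evidence alone does not satisfy the objective-evidence criterion.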
A recent sub-analysis of the Prospective Comparison of ARNI with ACEI to Determine Impact on Global Mortality and Morbidity in Heart Failure (PARADIGM-HF) trial, a randomised, double-blind comparison of sacubitril/valsartan with enalapril in 8,399 patients with chronic HF, showed that patients hospitalised due to HF had a significantly increased risk of all-cause death (HR 5.0; 95% CI [4.4–5.7]) throughout the duration of the trial (27 months). The analysis was adjusted for randomised treatment, region and baseline covariates, and treated hospitalisation for HF, as the only event experienced, as a time-updated covariate.20 When this analysis was carried out for emergency department (ED) visits due to HF (without subsequent hospitalisation), the risk of all-cause death was three times higher than in patients without an event (HR 2.9; 95% CI [1.9–4.6]). The intensification of HF therapy as an outpatient was also evaluated, since many episodes of worsening HF are treated in the community with an increase in oral pharmacological therapy or the use of short-term IV therapy.20 When analysing this endpoint as the only event experienced as a time-updated covariate, the authors observed an increased risk of death (HR 4.2; 95% CI [3.3–5.3]), almost equivalent to that observed in patients hospitalised due to HF. These findings not only clarified the prognostic role of HF events not linked to hospitalisation, but also showed that the frequency of HF-related events doubled when intensification of HF therapy and ED visits due to HF were added. This suggests that an extended composite endpoint would increase statistical power without compromising specificity, which is especially appealing for event-driven clinical trials. These data further support the implementation of SCTI-defined HF events.18
Health Status Measures
The impact of transcatheter therapies for severe AS on functional capacity has been largely assessed by changes in NYHA class and, less frequently, by the use of validated disease-specific questionnaires such as the Kansas City Cardiomyopathy Questionnaire (KCCQ).7–13 When interpreting comparisons of cross-sectional measures, it is important to take into account that, generally, not all subjects will be present at any specific follow-up time point, owing to death, a missed appointment or loss to follow-up. Consequently, these comparisons will inevitably exclude the patients who are the most ill. Aortic valve replacement has significantly and consistently increased the number of patients classified as NYHA class I, ranging from a 24% increase relative to baseline in inoperable patients to an 80% increase in all-comer populations at 1-year follow-up.8,13 Likewise, the frequency of NYHA class III and IV has been reduced by up to 65% and 50%, respectively, in high-risk cohorts (Table 2).7,12
The KCCQ is a 23-item, self-administered instrument that quantifies physical function, symptoms (frequency, severity and recent changes), social function, self-efficacy and knowledge and quality of life. The instrument provides a score from 0 to 100. It has shown an excellent correlation with NYHA class and each quartile is reflective of an increased risk of mortality in patients with HF. It has been validated for the assessment of prognosis and effects of therapies in severe AS.21 A recent sub-analysis of the PARTNER 2 trial evaluated changes in KCCQ at 1 month, 1 year and 2 years among patients at intermediate risk randomised to TAVR or SAVR.22 For this analysis the authors categorised changes in KCCQ as follows: death, worse (reduction from baseline >5 points), no change (change between −5 and <5 points), mildly improved (increase between 5 and <10 points), moderately improved (increase between 10 and <20 points), and substantially improved (increase ≥20 points). Overall, there was a similar increase in KCCQ at 1 year in the transfemoral TAVR group (22.1 points; 20.4–23.9) and in the SAVR group (22.1 points; 20.1–24.1), albeit an earlier benefit was observed in patients undergoing transfemoral TAVR. Moreover, the frequency of moderate or substantial improvement (≥10 points in KCCQ) was consistent among groups (71.1% in the TAVR group and 68% in the SAVR group). Similar findings have been reported in the trials included in this review.11,14,21–23 Other instruments of proven value for the assessment of health status are the disease-specific Minnesota Living with Heart Failure Questionnaire and generic health status measures such as the EuroQoL, Health Utilities Index, Duke Activity Status Index, 12- or 36-item short-form questionnaires (SF-12 or SF-36).15,24,25
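The categories of KCCQ change used in the PARTNER 2 sub-analysis can be written as a small function, which makes the handling of the boundary values explicit. This is a sketch based only on the cut-offs quoted above; the function name and arguments are hypothetical, and the treatment of a missing follow-up score in a surviving patient (raising an error) is an assumption, since the text does not say how such cases were handled.

```python
from typing import Optional


def kccq_change_category(
    baseline: float, follow_up: Optional[float], died: bool = False
) -> str:
    """Classify the change in KCCQ score from baseline using the quoted cut-offs."""
    if died:
        return "death"
    if follow_up is None:
        # Assumption: survivors without a follow-up score are not classified.
        raise ValueError("follow-up KCCQ score is missing")
    change = follow_up - baseline
    if change < -5:
        return "worse"                   # reduction of more than 5 points
    if change < 5:
        return "no change"               # from -5 up to, but not including, 5
    if change < 10:
        return "mildly improved"         # 5 to <10
    if change < 20:
        return "moderately improved"     # 10 to <20
    return "substantially improved"      # 20 or more
```

Note that a fall of exactly 5 points is classified as no change, whereas a gain of exactly 5 points counts as mild improvement, following the inequalities as stated.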
Statistical Analysis of Heart Failure Endpoints
Clinical primary endpoints in HF and AS trials are customarily analysed using Kaplan–Meier estimates compared with the log-rank test, with the treatment effect calculated by Cox proportional hazards regression using one of several tests, such as the Wald test. These methods are well established and used in studies supporting regulatory approval.26 Primary endpoints in HF trials include, for example, all-cause death; all-cause death and hospitalisations for HF; and more recently cardiovascular death and hospitalisations for HF. Likewise, AS trials have used all-cause death or all-cause death and stroke. The time-to-first-event analysis allows the most intuitive presentation of results, but does not fully capture the burden of recurrent events, such as hospitalisations for HF, an issue that becomes more relevant when investigating patients with AS and impaired systolic function, as envisioned in the ongoing Transcatheter Aortic Valve Replacement to UNload the Left Ventricle in Patients With ADvanced Heart Failure (TAVR UNLOAD) trial.27 The high frequency of recurrent HF events is evident in trials such as PARADIGM-HF, in which one-third of patients who were hospitalised once during follow-up were hospitalised at least a second time during the trial, and one-tenth were hospitalised three or more times.28 Similar distributions of recurrent events have been reported in drug and device intervention trials.29–33 Patients with more unplanned hospitalisations exhibit a worse quality of life and survival; thus, being able to analyse recurrent hospitalisations not only better characterises the disease but also increases statistical power to detect differences in treatment effects.
Since the early 1980s, several statistical approaches to account for multiple hospitalisations have been introduced (Table 3), and recently a shift towards these more complex methods has been observed in trials enrolling patients with HF, aiming for efficiency and robustness.28,34–36 For instance, in the groundbreaking Cardiovascular Outcomes Assessment of the MitraClip Percutaneous Therapy for Heart Failure Patients with Functional Mitral Regurgitation (COAPT) trial, the chosen primary effectiveness endpoint was all hospitalisations for HF within 24 months of follow-up, including recurrent events in patients with more than one event, using the joint frailty model to account for correlated events and the competing risk of death.37 This is one of the available time-to-event approaches, which include the Wei, Lin and Weissfeld (WLW) method; the Lin, Wei, Ying and Yang (LWYY) model; the Prentice–Williams–Peterson model; and the Andersen–Gill model.28,38 The choice for the most appropriate statistical approach relates to:
- The distribution of timing of subsequent events – HF rehospitalisations may not occur after similar intervals but in clusters where some patients will present multiple adjacent episodes and others no recurrences.
- The within-patient correlation of subsequent events – it is known that hospitalisations beget more hospitalisations and worse prognosis, thus methods assuming independence of recurrent events may not be preferred for analysis of HF events.
- Frequency of the recurrent and terminal events – where methods that analyse death as non-informative censoring or as a recurrent event may not be ideal for cohorts in which mortality is expected to be relatively high.
A conservative approach is to include a method for a primary statistical analysis based on the study assumptions, and provide sensitivity analyses based on other methods for robustness.
Alternative approaches include the use of methods based on event rates, such as Poisson regression and negative binomial regression. The latter allows for more flexibility but assumes a constant event rate over time and does not analyse death as a competing risk; thus, it may not be preferred in scenarios where fatal events account for a high proportion of the composite. Of interest in this situation is the Ghosh and Lin cumulative incidence method, which handles fatal events as informative censoring.39 In a recent pre-specified sub-analysis of the PARADIGM-HF trial, the authors compared results of the analysis of recurrent hospitalisations using a cumulative incidence method, time-to-event models (WLW, LWYY and the joint frailty model) and the negative binomial model. All approaches provided similar estimates for the effect of the experimental therapy (sacubitril/valsartan) when compared with the traditional time-to-first-event analysis (log-rank test).28 The authors concluded that no single method can be recommended over another, and the preferred statistical approach for a specific trial should be discussed with regulatory agencies.26 It is noteworthy that the joint frailty and the LWYY methods offer advantages that have prompted their use in recent studies (Table 3).17,28,40
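To illustrate the rate-based view of recurrent events, the sketch below computes crude event rates that count every hospitalisation, not just the first, together with a simple dispersion index. All counts and follow-up times are invented for illustration, the function names are hypothetical, and this is not a substitute for fitting a Poisson or negative binomial regression model. The dispersion index is only interpretable here because every hypothetical patient has the same follow-up.

```python
from statistics import mean, variance
from typing import Sequence


def events_per_100_patient_years(counts: Sequence[int], years: Sequence[float]) -> float:
    """Crude rate using all events (first and recurrent) over total follow-up."""
    return 100 * sum(counts) / sum(years)


def dispersion_index(counts: Sequence[int]) -> float:
    """Sample variance divided by mean; roughly 1 under a Poisson assumption."""
    return variance(counts) / mean(counts)


# Hypothetical HF hospitalisation counts per patient, each followed for 2 years.
treated_counts = [0, 0, 1, 0, 2, 0]
control_counts = [0, 1, 3, 0, 4, 0]
follow_up_years = [2.0] * 6

treated_rate = events_per_100_patient_years(treated_counts, follow_up_years)
control_rate = events_per_100_patient_years(control_counts, follow_up_years)
rate_ratio = treated_rate / control_rate

print(f"treated: {treated_rate:.1f}, control: {control_rate:.1f} per 100 patient-years")
print(f"rate ratio: {rate_ratio:.3f}")
print(f"control dispersion index: {dispersion_index(control_counts):.2f}")
```

A dispersion index well above 1, as in the invented control arm, reflects events clustering in a few patients. This is the situation in which the extra flexibility of the negative binomial model over the Poisson model becomes relevant.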
Generalised pairwise comparison (GPC) methods, which use non-parametric approaches to compare outcomes on the basis of pairs of subjects, have also been developed.41 Hierarchical GPC methods include the Finkelstein–Schoenfeld method, the unmatched Pocock method (or win ratio) and the Buyse method, among others.35,42,43 These methods allow the creation of a hierarchy that gives a higher priority to the most severe outcomes and are able to accommodate multiple events. Characteristically, GPC methods are used for binary outcomes, as in the primary endpoint of the Tafamidis in Transthyretin Cardiomyopathy Clinical Trial (ATTR-ACT), which included a hierarchical assessment of all-cause death and frequency of cardiovascular-related hospitalisations, or for the combination of binary and continuous outcomes, as in the primary endpoint of the TAVR UNLOAD trial, defined as the hierarchical occurrence within 1 year of all-cause death, disabling stroke, hospitalisations (related to HF, symptomatic aortic valve disease or non-disabling stroke) and change in KCCQ relative to baseline.27,44 Non-hierarchical GPC methods include the O’Brien method.45 Little is known about the relative benefits of one method over another. The Finkelstein–Schoenfeld method is currently the GPC method most widely used in cardiovascular research.
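The logic of a hierarchical pairwise comparison can be shown with a simplified, unmatched win ratio calculation. The sketch makes several assumptions that should be stated plainly: all patients are taken to have complete follow-up over the same fixed period (no censoring, and deaths are compared by occurrence, not by timing), the hierarchy is reduced to three tiers (death, number of HF hospitalisations, change in KCCQ), and the 5-point KCCQ margin is a hypothetical choice. It does not reproduce the analysis plan of any trial mentioned here, and all data are invented.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Sequence, Tuple


@dataclass
class Outcome:
    died: bool
    hf_hospitalisations: int
    kccq_change: Optional[float]  # None when not measured, e.g. after death


def compare_pair(t: Outcome, c: Outcome, kccq_margin: float = 5.0) -> int:
    """Return +1 if the treated patient wins, -1 if the control wins, 0 for a tie."""
    # Tier 1: survival has the highest priority.
    if t.died != c.died:
        return 1 if c.died else -1
    # Tier 2: fewer HF hospitalisations wins.
    if t.hf_hospitalisations != c.hf_hospitalisations:
        return 1 if t.hf_hospitalisations < c.hf_hospitalisations else -1
    # Tier 3: KCCQ change, decided only if the difference reaches the margin.
    if t.kccq_change is None or c.kccq_change is None:
        return 0
    diff = t.kccq_change - c.kccq_change
    if diff >= kccq_margin:
        return 1
    if diff <= -kccq_margin:
        return -1
    return 0


def tally(treated: Sequence[Outcome], control: Sequence[Outcome]) -> Tuple[int, int, int]:
    """Compare every treated patient with every control; return (wins, losses, ties)."""
    results = [compare_pair(t, c) for t, c in product(treated, control)]
    return results.count(1), results.count(-1), results.count(0)


def win_ratio(treated: Sequence[Outcome], control: Sequence[Outcome]) -> float:
    wins, losses, _ = tally(treated, control)
    return wins / losses if losses else float("inf")


# Invented example data.
treated = [Outcome(False, 0, 20.0), Outcome(False, 1, 10.0), Outcome(True, 2, None)]
control = [Outcome(False, 0, 12.0), Outcome(True, 1, None), Outcome(False, 2, 3.0)]

print(tally(treated, control))      # (5, 4, 0)
print(win_ratio(treated, control))  # 1.25
```

Each pair is decided at the first tier where the two patients differ, so lower-priority outcomes such as KCCQ change only contribute when the more severe outcomes are tied.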
Conclusion
HF events, impaired functional status and reduced disease-specific quality of life are highly prevalent in patients with aortic stenosis and are significantly and positively affected by aortic valve interventions. The use of standardised definitions for HF-related events is recommended to improve our understanding of the disease and to allow comparisons among clinical trials. Further research on complex statistical approaches, which take into account the occurrence of multiple events, is warranted.