Petraco and colleagues1 have presented the “adjusted” accuracy of their new index (instantaneous wave-free ratio, iFR) against the gold standard of fractional flow reserve (FFR). However, Table 2 of their manuscript misrepresents theoretical calculations as clinical observations. Misleading column labels hide from the casual reader the fact that many of its numbers are assumed from a model rather than measured directly.
Specifically, in three of the four studies (ADVISE registry, ADVISE study, FFR-PET study) FFR values were measured only once, yet the table makes no distinction among the agreement data in its “Repeated FFR measurements” column. Even for the DEFER study, which actually repeated FFR measurements, the authors did not have access to the full raw data. Similarly, two of the four studies (DEFER, FFR-PET study) never measured iFR, yet the table presents an “observed” agreement between iFR and FFR for all rows. Should not measured values – true observations – be distinguished from assumptions? For the last row of the table (FFR-PET study), this confluence of hypothetical values reaches too far, calculating an “adjusted” iFR accuracy by dividing the iFR-FFR agreement (for a study that never measured iFR at all) by the repeated FFR agreement (for a study that only measured FFR once).
Only after careful reading of the methods section can the reader uncover that five of the eight values (>50%) in the “Overall classification agreement” columns of Table 2 are estimates rather than measurements. Indeed, each and every “adjusted” iFR accuracy in Table 2 contains at least one component that has been assumed from a model. Therefore, their statement that the so-called adjusted “iFR accuracy is almost identical, ranging from 94% to 96%” follows trivially from the underlying assumptions. To our knowledge, in the peer-reviewed literature only the VERIFY study2 has actually measured both iFR and FFR twice in the same patients. The VERIFY study found superior reproducibility for repeated FFR measurements compared to repeated iFR measurements.
Even their proposal to “adjust” the agreement suffers from three statistical shortcomings, as we will detail in a future manuscript. First, mathematically it does not estimate the true, underlying agreement between the two variables. Second, it only accounts for variability in FFR while neglecting the variability in iFR measurements. Third, it does not generalise beyond a single repetition, whereas investigators may perform two or even more repeated measurements.
Fundamentally, Table 2 by Petraco and colleagues falls short of presenting its contents accurately.
Funding
All authors received internal funding from the Weatherhead PET Center for Preventing and Reversing Atherosclerosis.
Conflict of interest statement
All authors have signed non-financial, mutual non-disclosure agreements to discuss coronary physiology with Volcano Corporation, maker of invasive FFR and CFR wires.
We thank Johnson et al for their attention to detail and scientific interest in iFR. Their concern is valuable, because studies of new technologies such as the ADVISE Registry1 directly impact on patient care and must be fully transparent in aim and methodology2,3.
Johnson et al usefully remind readers that Table 2 of the results (and indeed the rest of the section entitled “Results”) arose by the methods described in the section entitled “Methods”. Whilst we are sorry that our legend was too concise for readers who skip over the methods with understandable eagerness, we can reassure them that there is no error in Table 2 of the manuscript.
In the ADVISE Registry, the classification agreement between instantaneous wave-free ratio (iFR) and fractional flow reserve (FFR) was presented for different sample distributions, using a simple methodology which calculates the iFR-FFR agreement in all quantiles of FFR disease severity. This is important because the classification agreement (sometimes called “diagnostic accuracy”) of a new test against an old one (or of one test conducted twice) depends on the distribution of patients included in the sample. If only very severe and very mild patients are studied, classification agreement can easily be near 100%. In contrast, if only patients near the cut-off are evaluated, it is likely to be near 50%. In practice, this means that values of accuracy from one study cannot be extrapolated to others if the distributions of disease severity are different. Johnson et al point out that all FFR validation studies were conducted in samples whose distributions were very different from populations in which FFR is applied clinically, such as the ADVISE Registry and other clinical cohorts4, which have most of the patients in the intermediate zone. We therefore had to apply a per-range agreement methodology to combat the fact that in the landmark FFR studies the intermediate patients seemed strangely scarce. Such non-clinical, centrifugal patterns of FFR distribution are seen in the landmark NEJM 1996 study5 and other large outcome trials6. These differences in lesion distribution can severely affect the relationship between iFR and FFR (and indeed the intrinsic agreement between repeated FFR measurements7) as demonstrated in the ADVISE Registry.
The second question raised by Johnson et al is with respect to the iFR “adjusted” agreement with FFR. This is another extremely important, yet often underappreciated, point which deserves careful attention. Usually, values of iFR diagnostic accuracy to match FFR are taken at face value: 80%. But how high could it ever really get? To answer this question, we sought to investigate the accuracy of one FFR measurement in predicting a second 10 minutes later, in the same sample. This is essential because no other test can agree with FFR as well as FFR agrees with itself. For instance, in samples in which iFR agrees with FFR in 80% of cases, FFR agrees with itself 86% of the time. Therefore, iFR does (80/86 = 94%) as well as FFR does, in agreeing with FFR. This principle is not limited to the iFR-FFR relationship but applies equally to any other test compared against FFR. Clinicians should not expect any test (IVUS, SPECT, OCT, etc.) to match FFR in clinical samples more than 85% of the time. Reports exceeding mathematical and biological plausibility usually turn out to be obtained from non-clinical or unusual types of population5,8.
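The adjustment described above is a simple ratio, and can be sketched as follows. This is an illustrative sketch only: the function name `adjusted_agreement` is hypothetical, and the inputs are the rounded figures quoted in the text. With those rounded inputs the ratio evaluates to about 93%, so the 94% quoted presumably reflects unrounded agreement values, which is an assumption here.

```python
def adjusted_agreement(test_vs_ffr: float, ffr_vs_ffr: float) -> float:
    """Agreement of a test with FFR, expressed as a fraction of the
    agreement that FFR achieves with itself on repeat measurement.

    Both arguments are proportions between 0 and 1.
    (Hypothetical helper, named for illustration only.)
    """
    if not 0.0 < ffr_vs_ffr <= 1.0:
        raise ValueError("ffr_vs_ffr must be in the interval (0, 1]")
    if not 0.0 <= test_vs_ffr <= 1.0:
        raise ValueError("test_vs_ffr must be in the interval [0, 1]")
    return test_vs_ffr / ffr_vs_ffr


# Rounded figures from the text: iFR-FFR agreement 80%, FFR-FFR agreement 86%.
ratio = adjusted_agreement(0.80, 0.86)
print(f"Adjusted agreement: {ratio:.1%}")
```

Note that the ratio uses only the FFR-FFR agreement in the denominator; it does not account for the repeatability of the test in the numerator.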
Our analysis, therefore, aimed to evaluate how close iFR is to achieving FFR’s ability to match itself. Johnson et al correctly point out that iFR reproducibility was not taken into account in our methodology. Future studies collecting test-retest reproducibility of iFR and FFR under bias-resistant conditions using reliable methods are needed, but it would be important to ensure that portions of systole are not included in the detection of diastole, as this can easily cause unnoticed test failure9.
We share Johnson et al’s relief that the individual patient results of the landmark DEFER reproducibility study were not, after all, lost as had been feared, but narrowly escaped oblivion through the report of Kern et al10. Our digitised results are very conservative, yielding a value of standard deviation of the difference (SDD) between FFR measurements of 3.2%, which is slightly lower than the one reported in the original DEFER publication (mean absolute difference of 3%, which corresponds to an SDD of approximately 3.7%11). Also, for simplification, our analysis did not take into account the fact that the scatter in the DEFER reproducibility data is heteroscedastic and particularly narrow close to the then cut-off of 0.75, an enigma that statistical workers in London and Oxford have been unable to decipher12. Finally, we did not use the formal six-week test-retest reproducibility data for FFR13. Although clinically relevant, its wider SDD of approximately 5% might show FFR in an unfavourable light.
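The conversion from a mean absolute difference to an SDD mentioned above can be checked with a short calculation. The sketch below assumes that the differences between paired measurements are normally distributed with zero mean, in which case the mean absolute difference equals the SDD multiplied by the square root of 2/π. That distributional assumption, and the function name `mad_to_sdd`, are introduced here for illustration.

```python
import math


def mad_to_sdd(mean_abs_diff: float) -> float:
    """Convert a mean absolute difference to a standard deviation of
    differences, ASSUMING zero-mean, normally distributed differences.

    For such a distribution E|d| = SDD * sqrt(2 / pi),
    hence SDD = E|d| * sqrt(pi / 2).
    """
    if mean_abs_diff < 0:
        raise ValueError("mean_abs_diff must be non-negative")
    return mean_abs_diff * math.sqrt(math.pi / 2.0)


# Mean absolute difference of 3% (in percentage points), as quoted in the text.
sdd = mad_to_sdd(3.0)
print(f"Implied SDD: {sdd:.2f}%")
```

Under this assumption the implied SDD is about 3.76%, in line with the approximately 3.7% quoted.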
Individual readers’ hospitals might have different FFR distributions to landmark FFR validation studies and therefore FFR repeatability agreement might also be different. We therefore present an Online Appendix which allows the general clinical or research reader to calculate the FFR-FFR agreement for their own sample (using the DEFER reproducibility data) by simply entering their FFR values in an Excel spreadsheet. We recommend this as a basis for comparison between other modalities (e.g., iFR, IVUS, and OCT) and FFR in other datasets.
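For readers who prefer a script, the idea of estimating FFR-FFR classification agreement for a given sample can be sketched as below. This is not the Online Appendix spreadsheet and may differ from it in detail; it is a minimal sketch under stated assumptions. It assumes that a repeat measurement is normally distributed around the first with a standard deviation equal to the SDD, that this scatter is the same at every FFR value (homoscedastic), that the SDD of 3.2% corresponds to 0.032 FFR units, and that the cut-off is 0.80. The function names and default parameters are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_self_agreement(ffr_values, cutoff=0.80, sdd=0.032):
    """Expected proportion of lesions for which a repeat FFR measurement
    falls on the same side of the cut-off as the first measurement.

    ASSUMPTIONS (illustrative, not taken from the appendix):
      - repeat value ~ Normal(first value, sdd)
      - sdd is constant across the FFR range
    """
    values = list(ffr_values)
    if not values:
        raise ValueError("ffr_values must not be empty")
    if sdd <= 0:
        raise ValueError("sdd must be positive")
    # Probability that the repeat stays on the same side of the cut-off
    # depends only on the distance from the cut-off in units of sdd.
    probs = [normal_cdf(abs(x - cutoff) / sdd) for x in values]
    return sum(probs) / len(probs)


# Hypothetical samples: one with only extreme lesions, one clustered at the cut-off.
extreme_sample = [0.40, 0.45, 0.95, 0.98]
intermediate_sample = [0.78, 0.79, 0.80, 0.81, 0.82]

print(f"Extreme sample:      {expected_self_agreement(extreme_sample):.1%}")
print(f"Intermediate sample: {expected_self_agreement(intermediate_sample):.1%}")
```

The two hypothetical samples illustrate the dependence on distribution: lesions far from the cut-off give agreement near 100%, whereas lesions clustered around the cut-off give agreement much closer to 50%.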
We again thank Johnson et al for their level of interest in our study. We all share the aim of expanding the adoption of physiology-guided decision making to many more patients with coronary artery disease2. As recently highlighted by prominent interventional colleagues14,15, time (and randomised clinical trials) will tell whether these small differences in stenosis classification between iFR and FFR will affect patient outcome.
Conflict of interest statement
J.E. Davies holds patents pertaining to iFR technology, which is under licence to Volcano Corporation. J.E. Davies is a consultant for Volcano Corporation. The other authors have no conflicts of interest to declare.
Online data supplement
Online appendix. Estimation of the intrinsic FFR agreement in your sample.