
The Cardiac Surgical Register could be bad for your nerves: A health warning for the worried well

Tom Treasure

The Society's returns for cardiac surgical mortality are likely to be scrutinised as never before, and yet the provision of 95% confidence limits around some percentages in this year's Cardiac Surgical Register may cause inappropriate alarm and despondency. I will first explain what these confidence limits mean. I will then suggest how a surgeon might use the present Register to check his or her results.
First, what are the 95% confidence limits? Consider a very small data set, such as 20 cases operated upon with two deaths, giving a mortality of 10%. If you look up the 95% confidence limits for 2/20, they are from 1% to 32%. To understand what that range is intended to convey, imagine a sack holding a very large number of marbles, an unknown proportion of which are black. You are allowed to reach in for only twenty at a time, and you get two black ones (10%) in your sample. How confident would you be in saying that the sack contains exactly 10% black marbles? Put statistically, the best you could say is that there is a 95% chance that the true proportion of black marbles lies between 1% and 32%. This is simply because 20 cases is a small sample on which to base an estimate. There is a truth within the sack, but the estimate based on your handful of twenty is unreliable, and the percentage is unstable from one dip to another. The more marbles you took, the more confident you would be of your estimate, and the narrower the confidence limits would be.
To return to surgery, the point is that with one's own relatively small experience of uncommon conditions, the variation from year to year, and from surgeon to surgeon, is apparently great without there necessarily being any real difference from the national average.
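The limits quoted for 2/20 can be reproduced approximately with a short sketch. It assumes the "exact" (Clopper-Pearson) method, which is one common way of producing such tables; the article does not say which method the Register used, and the function names here are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) when X counts deaths in n independent cases, each with risk p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target):
    """Bisection on [0, 1] for the p where f(p) equals target; f falls as p rises."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_limits(deaths, n, level=0.95):
    """Clopper-Pearson confidence limits for a proportion (assumed method)."""
    tail = (1 - level) / 2
    # Lower limit: the risk at which seeing this many deaths or more has probability `tail`.
    lower = 0.0 if deaths == 0 else solve_decreasing(
        lambda p: binom_cdf(deaths - 1, n, p), 1 - tail)
    # Upper limit: the risk at which seeing this many deaths or fewer has probability `tail`.
    upper = 1.0 if deaths == n else solve_decreasing(
        lambda p: binom_cdf(deaths, n, p), tail)
    return lower, upper


small = exact_limits(2, 20)     # the handful of twenty marbles
large = exact_limits(20, 200)   # same 10%, ten times the sample
print(f"2/20:   {small[0]:.1%} to {small[1]:.1%}")   # roughly 1% to 32%
print(f"20/200: {large[0]:.1%} to {large[1]:.1%}")   # a narrower range around 10%
```

Both samples show 10% mortality, but the larger one gives a much tighter range, which is the sense in which a bigger handful of marbles makes the estimate more trustworthy.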
Looked at in practice, suppose there is a 10% national mortality, and 20 representative patients are operated upon by each of a number of equally skilled surgeons, with anywhere between 1 and 5 deaths (percentages 5% to 25%). In each instance the confidence limits include 10%. But if your sample of 20 had either no deaths at all, or seven or more, such a discrepancy would occur by chance only once in 20 times. That is to say, its probability (P) is less than 0.05. Therefore, if the confidence limits around your own sample overlap the national percentage, this is a simple check that your performance is not significantly different from that of the group. You are, if you like, checking whether you are likely to be sampling from the same sack as everyone else, within which it is the rules of chance that determine who gets more and who gets fewer black marbles this year.
But what is the point of putting confidence limits around selected, large national totals? None, in my view. The bigger the sample, the more sure the statistician can be of the "true" proportion, to the point that with 22,160 coronary bypass operations, the 95% confidence limits on the front page are narrow at 2.8% to 3.3%, spanning only half of one percent from top to bottom. All that says is that it is probable that if another similar 22,160 patients had operations, the mortality would come out very close to 3% again. This is not a useful range for self-inspection. If we compare ourselves with the lower confidence limit of the proportion from such a massive data set, it might well exclude 40% of all surgeons, completely inappropriately. It is the range and distribution of individual percentages that matter, and whether a surgeon is in the central bunch of the distribution or is a straggler. Remember, half of us will be "below average", but can the public handle that concept? Meanwhile, the question is whether you are an outlier, worryingly underperforming.
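The two checks described above can be sketched as follows: whether a surgeon's own limits overlap the national figure, and how quickly the limits narrow as the sample grows. The sketch assumes exact (Clopper-Pearson) limits for the small sample and a simple normal approximation for the width, and takes the national coronary mortality as a round 3%; the function names are hypothetical, and the exact cut-off counts depend on which interval method is used.

```python
from math import comb, sqrt


def binom_cdf(k, n, p):
    """P(X <= k) for n independent cases, each with risk p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target):
    """Bisection on [0, 1] for the p where f(p) equals target; f falls as p rises."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_limits(deaths, n, level=0.95):
    """Clopper-Pearson confidence limits for a proportion (assumed method)."""
    tail = (1 - level) / 2
    lower = 0.0 if deaths == 0 else solve_decreasing(
        lambda p: binom_cdf(deaths - 1, n, p), 1 - tail)
    upper = 1.0 if deaths == n else solve_decreasing(
        lambda p: binom_cdf(deaths, n, p), tail)
    return lower, upper


def overlaps_national(deaths, n, national_rate):
    """True if the surgeon's own 95% limits include the national rate."""
    lower, upper = exact_limits(deaths, n)
    return lower <= national_rate <= upper


def approx_half_width(rate, n, z=1.96):
    """Normal-approximation half-width of a 95% interval (rough guide only)."""
    return z * sqrt(rate * (1 - rate) / n)


# A surgeon's 20 cases set against a 10% national mortality.
print(overlaps_national(3, 20, 0.10))   # 3 deaths: limits include 10%
print(overlaps_national(8, 20, 0.10))   # 8 deaths: limits exclude 10%

# Width of the interval at an assumed 3% mortality, for two sample sizes.
for n in (200, 22160):
    print(n, f"+/- {approx_half_width(0.03, n):.2%}")
```

With 22,160 operations the half-width is around a fifth to a quarter of a percentage point, so the limits describe the precision of the national average and not the spread to be expected among individual surgeons, whose own series of a few hundred cases carry limits roughly ten times wider.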
So how else might we look at the problem? Suppose 100 surgeons pool their figures of 200 coronary operations each to provide a denominator of 20,000. With their pooled total of 600 deaths, the mortality is 3%. The figures for individuals will range perhaps from a spectacularly good (and lucky) 0% to a worrying 12%, with the majority clustering between 2% and 5%. The distribution of these 100 individual performances will almost certainly be skewed, because no one can have a mortality below 0% (so the distribution is curtailed at one end), with a tail out to the right made up of those who have had a tough year.
The 95th centile of the distribution is easily defined. It is simply the percentage mortality of the 95th ranking surgeon, beyond which lie the five individuals with the highest mortality for that year. But in its simplest form this is merely another version of a league table, and it has the problem that of 100 brilliant surgeons, five must occupy the lowest ranks in any given year; it might be a different five every year, and they might be only a fraction behind the others. Furthermore, it goes without saying that, without adjustment for case mix, it would be quite misleading and counterproductive. There are more sophisticated ways of studying the distribution and defining outliers, which we are working on, but the use of 95% confidence limits of the proportion of deaths on the front of the Register, in my view, needed a "health warning" for "the worried well".
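The league-table problem can be illustrated with a small simulation. It is a sketch under strong assumptions that are not in the Register: every one of the 100 surgeons has exactly the same true risk of 3%, each performs 200 operations, and outcomes are independent with no case-mix differences. The names and the seed are arbitrary.

```python
import random


def simulate_rates(n_surgeons=100, n_ops=200, true_rate=0.03, seed=1):
    """Observed mortality for each surgeon when all share the same true risk."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_surgeons):
        deaths = sum(rng.random() < true_rate for _ in range(n_ops))
        rates.append(deaths / n_ops)
    return rates


rates = simulate_rates()
ranked = sorted(rates)                  # best (lowest mortality) first
pooled = sum(rates) / len(rates)        # equal denominators, so this is the pooled rate
centile_95 = ranked[94]                 # mortality of the 95th ranking surgeon
last_five = ranked[95:]                 # the five beyond the 95th centile

print(f"pooled mortality: {pooled:.1%}")
print(f"range: {ranked[0]:.1%} to {ranked[-1]:.1%}")
print(f"95th centile: {centile_95:.1%}")
print("last five:", [f"{r:.1%}" for r in last_five])
```

Even though the simulated surgeons are identical by construction, five of them still fall beyond the 95th centile, and changing the seed changes which five. Real results would also reflect case mix and genuine differences, which this sketch deliberately leaves out.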
