You're looking at the alphabet utilization rate. Whereas the standard way to measure efficiency (which is the way that the author did it) is to look at entropy.
Here's how I would phrase it. Obviously, if you have a message that has 256 possible states and each one has the same probability of occurring, then the entropy is 8 bits per message.
If you have a message having 243 uniformly distributed states, then its entropy is defined as log base-2 of 243 = log_2 (243) = ln(243) / ln(2) ≈ 7.925 bits. But if you used 8 bits to express this message, then you've wasted 0.075 bits (or 75 millibits, mb) per message, so your efficiency is 7.925/8 ≈ 99.06%.
How would you achieve 7.925 bits per message in practice? By packing more messages together and sharing wasted space. By using some continued fraction magic, let me proposed to store 15867× of these base-243 messages together. There are 243^15867 possibilities, which is approximately equal to 2^125742.999994713, or just a hair under 125743 bits. So, a block of 125743 bits can store 15867× base-243 messages losslessly, which means each base-243 message uses 125743/15867 ≈ 7.925 bits on average.
243 is pretty good. You have to go up to 3^17 = 129140163 < 2^27 = 134217728 to beat it:
Generated with this: