You are quite right - I can't do simple arithmetic! However, I believe that the motion controller drives all axes effectively in parallel, not serially. So if X, Y, Z and A all need to step, the motion controller pulses every output at the same time. The maximum pulse frequency is therefore that of the "fastest" axis, not the sum over all the axes. I think (from memory) that the CSMIO gives out 10 µs pulses, which is longer than the minimum that the stepper driver needs. The maximum clock frequency relates to how precisely the pulses can be timed, but I doubt that in practice you are ever going to generate stepping pulses at that frequency. I could probably drive my router directly from Mach3 and the parallel port with a 20 kHz kernel speed; I don't want to do that, but it would probably work.
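To put rough numbers on the parallel-versus-serial point, here is a quick Python sketch. The per-axis step rates are made up for illustration and are not measurements from my router, and the function names are my own. The last function assumes the step line has to stay low for some minimum time between pulses; setting that equal to the 10 µs pulse width is a guess, not a CSMIO specification.

```python
def parallel_pulse_rate(axis_rates_hz):
    """Rate the controller must sustain if all axes are pulsed at once."""
    return max(axis_rates_hz.values())


def serial_pulse_rate(axis_rates_hz):
    """Rate it would need if each axis were pulsed one after another."""
    return sum(axis_rates_hz.values())


def max_rate_for_pulse_width(pulse_width_s, min_low_s):
    """Upper limit on step frequency for a given high time and low time."""
    return 1.0 / (pulse_width_s + min_low_s)


# Hypothetical step rates in Hz for a four-axis move (illustrative only).
rates = {"X": 15000, "Y": 12000, "Z": 4000, "A": 8000}

print("parallel:", parallel_pulse_rate(rates), "Hz")
print("serial:  ", serial_pulse_rate(rates), "Hz")
print("10 us pulse limit:", max_rate_for_pulse_width(10e-6, 10e-6), "Hz")
```

With these made-up figures the parallel scheme only has to keep up with the 15 kHz X axis, whereas pulsing the axes one after another would need 39 kHz.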

I haven't looked at the pulse generation mechanism in Mach3 or the CSMIO, partly because I can't see inside the code! However, I have looked at GRBL, the Arduino-based motion controller. That works by calculating which axes need a pulse at every internal clock cycle and then loading a single output register with bits representing pulses on each channel. That seems to make sense, and I would guess that something similar happens in the other motion controllers. It cuts the required clock rate by a large amount with no loss of functionality.
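The idea can be sketched in a few lines of Python. This is a simplified Bresenham-style illustration of "decide on every tick which axes step, then write one register", not GRBL's actual code; the axis order, bit assignment and the name `step_bits` are my own assumptions.

```python
AXES = ("X", "Y", "Z", "A")  # assumed bit order: bit 0 = X, bit 1 = Y, ...


def step_bits(steps_per_axis):
    """Yield one output-register value per tick for a straight-line move.

    steps_per_axis holds the number of steps each axis must make.
    Bit i of each yielded value is set when AXES[i] steps on that tick.
    """
    total_ticks = max(steps_per_axis)  # the "fastest" axis sets the tick count
    # Start each accumulator at half scale so steps are spread evenly.
    counters = [total_ticks // 2] * len(steps_per_axis)
    for _ in range(total_ticks):
        bits = 0
        for i, n in enumerate(steps_per_axis):
            counters[i] += n
            if counters[i] >= total_ticks:
                counters[i] -= total_ticks
                bits |= 1 << i  # this axis gets a pulse on this tick
        yield bits  # on real hardware: one write to the output port


# A move of 4 steps in X, 2 in Y, 1 in Z and none in A.
for tick, bits in enumerate(step_bits((4, 2, 1, 0))):
    stepping = [name for i, name in enumerate(AXES) if bits & (1 << i)]
    print(tick, format(bits, "04b"), stepping)
```

The whole move takes four ticks, the step count of the fastest axis, even though seven pulses are issued in total, because the pulses that coincide go out in the same register write.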