What kind of buffering do you want?
Anything that changes the relative timing of the LPT pin signals reaching the drives will mess up the part. There is no RTS/CTS-style handshake in the open loop with Mach3; the whole setup relies on the data at the parallel port being presented almost instantaneously to the drivers. All timings are configured within Mach3 to work within the limitations of the system being driven.
External motion devices are fed move instructions via their plugins and can thus use buffering.
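To put rough numbers on the timing point, here is a small Python sketch. It is only an illustration, not anything Mach3 actually runs: the 40 us step period, the 1 us fixed delay, the per-step buffer latencies, and all the function names are made-up assumptions. It compares two axes stepping together when both signals get the same fixed delay and when one of them passes through a buffer with varying latency.

```python
# Illustrative only: all timing values below are assumptions, not measurements.

def pulse_times(n_steps, period_us):
    """Ideal step pulse times in microseconds, one pulse every period_us."""
    return [i * period_us for i in range(n_steps)]

def apply_delays(times, delays_us):
    """Shift each pulse by the delay it picks up on the way to the drive."""
    return [t + d for t, d in zip(times, delays_us)]

def max_skew(x_times, y_times):
    """Largest timing difference between pulses that should be simultaneous."""
    return max(abs(x - y) for x, y in zip(x_times, y_times))

PERIOD_US = 40  # assumed step period
N = 5

x = pulse_times(N, PERIOD_US)
y = pulse_times(N, PERIOD_US)

# Direct connection: both pins see the same fixed delay (assumed 1 us).
fixed = [1] * N
direct_skew = max_skew(apply_delays(x, fixed), apply_delays(y, fixed))

# Hypothetical buffer on the X signal only, with latency that varies per step.
buffer_latency = [0, 15, 5, 30, 10]
buffered_skew = max_skew(apply_delays(x, buffer_latency), apply_delays(y, fixed))

print("skew with direct connection:", direct_skew, "us")
print("skew with varying buffer:", buffered_skew, "us")
```

With these made-up numbers the direct connection keeps the two axes in step, while the buffered X signal drifts by up to 29 us against Y, and nothing in the open loop tells Mach3 that it happened.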
- Nick