It is fun isn't it, the hours I spent making it work. Everytime I thought I had it figured I found the next little snag

I find the major axis which steps on every interrupt, and slave everything else to that. I reload the timer after every step from the acceleration table, stretching the delay by 1.5 (approx root 2) if both X and Y go together.

I can give you the working code that drives my mills but it may not help being written in Z80 assembler. It runs from a pre-digested buffer of straight lines without any multiplies or divides. On a slow CPU, 8MHz, the interrupt steps anyone set to step on entry then reconfigures the step and direction lines at exit.

I'm a bit of a dinosaur.