Interesting idea, but I'd compare the pulse output frequency of the DDCS and your intended stepping rate against the frequency response of the PC817s and the effective throughput you'll get on a 16MHz '328. You are limiting the performance of the machine. At the least I'd look to replace the optos. Pass-through can easily be added with a 74-series logic device that lets the DDCS signalling pass straight through, or be augmented from the Ardy.
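
To put rough numbers on that, here's a back-of-envelope sketch in Python. The pulse rates and the opto rise/fall times below are placeholder assumptions of mine, not DDCS or PC817 datasheet figures, so plug in your own. The "rise + fall must fit in half a period" test is just a crude rule of thumb assuming a 50% duty pulse train.

```python
# Rough budget check: how many CPU cycles a 16 MHz '328 gets per step pulse,
# and whether an optocoupler's switching time fits inside the pulse.
# All example values are assumptions for illustration only.

CPU_HZ = 16_000_000  # 16 MHz ATmega328


def cycles_per_step(pulse_hz, cpu_hz=CPU_HZ):
    """CPU clock cycles available between consecutive step pulses."""
    return cpu_hz / pulse_hz


def opto_keeps_up(pulse_hz, t_rise_s, t_fall_s):
    """Crude check: rise + fall time must be shorter than half the pulse
    period (assumes a 50% duty cycle pulse train)."""
    half_period = 1.0 / (2.0 * pulse_hz)
    return (t_rise_s + t_fall_s) < half_period


if __name__ == "__main__":
    # Hypothetical opto switching times; they vary a lot with load resistor.
    t_rise, t_fall = 4e-6, 3e-6
    for pulse_hz in (10_000, 50_000, 100_000, 500_000):
        print(
            f"{pulse_hz:>7} Hz: "
            f"{cycles_per_step(pulse_hz):8.0f} cycles/step, "
            f"opto ok: {opto_keeps_up(pulse_hz, t_rise, t_fall)}"
        )
```

Once the cycles-per-step figure drops to a few dozen, there isn't much room left for interrupt overhead on the '328, never mind doing anything useful with each pulse.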

Of course, your mileage might vary.

EDIT: not suggesting this, but have a look at the ESP32s - particularly the ones with onboard OLEDs. Having a dual-core processor at 240MHz made for a very easy protocol converter with a fancy graphic display (I was converting RS485 to RS232 at 1.2MHz, with one core dedicated to the graphics and one just copying from UART buffers)... a similar but different kind of solution.