It's worth mentioning that the parallel port was a good option for many years, and is still more than adequate for the vast majority of machines. The main reason I wouldn't recommend it now is that there are too many potential issues in getting it working. You need a suitable motherboard. You've got to make sure the port settings are correct. You've got to hope Mach/LinuxCNC will play nicely with the motherboard. In the case of Mach, you also need a suitable version of Windows.
None of these are insurmountable, but they're all things that can add time and cost to getting a machine running.

To me, it's worth spending the extra money on a dedicated motion controller, as it removes quite a bit of uncertainty from the setup process. Plus it usually means you get improved support to get it working in the first place.

I wouldn't get too hung up on the ideal theoretical option. In practice, as the parallel port proves, things can still work very well even if they're far from theoretically ideal.

Servo tuning is a whole other topic, but most closed-loop capable controllers will have some form of tuning tools available. The Dynomotion/KFlop tools are pretty advanced, and let you plot/adjust all sorts of things. CS-labs include a tuning screen, but IIRC it's pretty basic. With Galil, you have to buy their GDK to get servo tuning.