Don't fall for the idea of "too much processing" killing the sound "because less is more". This does not apply here.
There are real measurable acoustical effects at work, not audiophile philosophy.
Great results! As you found out there's quite a difference between USB and optical. And the results will be different for RCA again!
These values perfectly explain the first part of the puzzle: Each output (and speaker input!) can have a different latency!
What are the consequences? Whenever you compare connection types (USB, optical, RCA) you absolutely
must dial in the correct delay for each one before performing a RoomFit run. Otherwise the results will be flawed.
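To make the delay point concrete, here is a minimal sketch of the arithmetic. The function name and the latency numbers are made up for illustration; they are not your measurements and not anything the WiiM Home app exposes. The idea is simply that the faster path gets delayed by the difference, so both arrive together.

```python
def delay_compensation_ms(main_latency_ms: float, sub_latency_ms: float) -> tuple[str, float]:
    """Return which path needs extra delay, and how much, to time-align both.

    Both arguments are measured end-to-end latencies in milliseconds
    (hypothetical inputs; use your own measurements per connection type).
    """
    diff = main_latency_ms - sub_latency_ms
    if diff > 0:
        # Mains arrive later, so the subwoofer must be held back.
        return ("sub", diff)
    if diff < 0:
        # Subwoofer arrives later, so the mains must be held back.
        return ("main", -diff)
    return ("none", 0.0)


# Hypothetical example: mains at 12 ms, subwoofer at 20 ms.
print(delay_compensation_ms(12.0, 20.0))
```

Since each connection type can have its own latency, this would have to be repeated with fresh measurements for USB, optical and RCA.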
The second part is as simple as level differences. As can clearly be seen, the
relative level of the subwoofer compared to the main speakers differs significantly. It's the main speakers' level that really varies, of course, with USB being clearly the lowest.
RoomFit can take care of these differences to a certain degree, but that's far from ideal. It would be much better to adjust the subwoofer level in the WiiM Home app accordingly before running RoomFit. Even doing it very roughly by ear should be better than nothing. So far the assessment of the optical connection looks best, but when dialed in correctly (polarity switch, delay and level) there shouldn't be such massive differences at all.
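If you'd rather not do it by ear, the level adjustment is simple arithmetic. This is only a rough sketch with invented numbers and a hypothetical helper name: read the average level of the mains and of the subwoofer off your measurement and trim the subwoofer by the difference.

```python
def sub_trim_db(main_level_db: float, sub_level_db: float,
                target_offset_db: float = 0.0) -> float:
    """Return the gain change (dB) to apply to the subwoofer.

    target_offset_db is how far above the mains the subwoofer should sit
    (0.0 means level-matched). All values are assumptions for illustration.
    """
    return (main_level_db + target_offset_db) - sub_level_db


# Hypothetical example: mains average 72 dB, subwoofer averages 80 dB.
print(sub_trim_db(72.0, 80.0))
```

A negative result means turning the subwoofer down. Because the mains' level changes with the connection type, the trim would need to be worked out separately for each one.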
You cannot be satisfied with this result.
And this might be part of the solution.
What's your current crossover frequency? Is the subwoofer's upper limit far enough above this value?