So I made a matrix multiplication project, and the central code looked like this:
I wanted to make an algorithm a little faster than this one, as measured by that fps tester. (My matrix algorithm took 278.2 ms.)
But it turned out that any algorithm that didn't use the matrix didn't do well at all.
This one does very badly: 570 ms!
So why is the matrix algorithm actually faster? (It does a lot of extra steps, even handling that 0 in the bottom right of the rotation matrix.)
btw thanks @cornelios207 for making a matrix multiplication hyperblock that I couldn't even optimize
What the optimization did:
(the rotate-then-project matrix works like this)
(the two "optimize" algorithms just tally up the matrix times the vector:)
(you can do the calculations yourself, it definitely works)
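For anyone who wants to check the "tally up" idea in text form, here is a minimal Python sketch. It is not my actual blocks: the function names `mat_vec` and `unrolled`, the z-axis rotation matrix, and the test point are all made up for illustration. It only shows that writing out every term by hand gives the same result as a general matrix-times-vector routine.

```python
import math

def mat_vec(m, v):
    """General matrix-times-vector: one dot product per row."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def unrolled(m, v):
    """Same product with every term written out by hand (3x3 only)."""
    x, y, z = v
    return [
        m[0][0] * x + m[0][1] * y + m[0][2] * z,
        m[1][0] * x + m[1][1] * y + m[1][2] * z,
        m[2][0] * x + m[2][1] * y + m[2][2] * z,
    ]

# Hypothetical stand-in matrix (assumption, not the one from the project):
# a rotation by 30 degrees about the z axis.
angle = math.radians(30)
rot_z = [
    [math.cos(angle), -math.sin(angle), 0.0],
    [math.sin(angle),  math.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
point = [1.0, 2.0, 3.0]  # made-up test point

general = mat_vec(rot_z, point)
by_hand = unrolled(rot_z, point)

# Compare component by component, allowing for floating-point rounding.
same = all(math.isclose(a, b) for a, b in zip(general, by_hand))
print(same)
```

Both versions do the same arithmetic, so this says nothing about which one is faster in Snap!; it only confirms they agree on the answer.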