So I made a matrix multiplication project, and the central code looked like this:

I wanted to make an algorithm a little faster than this one, as measured by that fps tester. (My matrix algorithm took 278.2ms.)

But it turned out that every algorithm that did not use a matrix was no improvement at all.

300ms

This one does very badly: 570ms!

So why could the matrix algorithm actually be faster? (It actually does a lot more extra steps, even multiplying by that 0 at the bottom right of the rotation matrix.)

btw thanks @cornelios207 for making a matrix multiplication hyperblock I couldn't even optimize

## what the optimization did

(the rotate then project matrix works like this)

$$
\begin{pmatrix}
c_1 c_2 & -s_1 \\
s_1 c_2 & c_1 \\
s_2 & 0
\end{pmatrix}
$$
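Here is a rough Python sketch of one way to arrive at that matrix. It is my reading of "rotate then project", not the project's actual blocks. It assumes `c1, s1` and `c2, s2` are the cosine and sine of two angles, that a point is a row vector `[x, y, z]` multiplied on the left, and that projecting means dropping the third (depth) column. The names `matmul` and `rotate_then_project_matrix` are made up for the example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def rotate_then_project_matrix(theta1, theta2):
    """Build the 3x2 matrix: rotate by theta1, then theta2, then drop depth."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    # First rotation (assumed): mixes x and y, leaves z alone.
    rot1 = [[c1, -s1, 0.0],
            [s1,  c1, 0.0],
            [0.0, 0.0, 1.0]]
    # Second rotation (assumed): mixes x and z, leaves y alone.
    rot2 = [[c2, 0.0, -s2],
            [0.0, 1.0, 0.0],
            [s2, 0.0,  c2]]
    full = matmul(rot1, rot2)
    # Projection: keep only the first two columns (screen x and y).
    return [row[:2] for row in full]
```

Under those assumptions the rows come out as `[c1*c2, -s1]`, `[s1*c2, c1]`, `[s2, 0]`, which is the matrix above, including the 0 at the bottom right.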

(the two "optimized" algorithms just tally up the matrix times the vector, where $x'$ and $y'$ are the results:)

$$
x' = c_2 (c_1 x + s_1 y) + s_2 z
$$

$$
y' = c_1 y - s_1 x
$$

(You can do the calculations yourself, it definitely works.)
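If you'd rather check it with code than by hand, here is a small Python sketch that compares the two approaches on one point. It is only an illustration, not the Snap! code from the project. It assumes `c1, s1, c2, s2` are the cosine and sine of two angles and that the point is a row vector multiplied into the 3x2 matrix. The function names are hypothetical.

```python
import math

def project_with_matrix(point, theta1, theta2):
    """Row vector [x, y, z] times the 3x2 rotate-then-project matrix."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    m = [[c1 * c2, -s1],
         [s1 * c2,  c1],
         [s2,      0.0]]
    x, y, z = point
    # Each output is one column of the matrix dotted with the point,
    # including the multiplication by the 0 in the bottom right.
    new_x = x * m[0][0] + y * m[1][0] + z * m[2][0]
    new_y = x * m[0][1] + y * m[1][1] + z * m[2][1]
    return new_x, new_y

def project_expanded(point, theta1, theta2):
    """The same result written out as the two tallied-up formulas."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    x, y, z = point
    new_x = c2 * (c1 * x + s1 * y) + s2 * z
    new_y = c1 * y - s1 * x
    return new_x, new_y

if __name__ == "__main__":
    p = (1.0, 2.0, 3.0)
    print(project_with_matrix(p, 0.3, 0.7))
    print(project_expanded(p, 0.3, 0.7))
```

Both functions should return the same pair up to floating-point rounding, so whatever timing difference shows up is about how the steps get executed, not about the result.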