(This is more of a potential improvement than a bug – I'm not really sure if this is the appropriate place for it).
I was profiling a mod I was working on and found a lot of branch mispredictions in Minecraft's Perlin noise generator, in particular in the gradient function. Eliminating some of those branches gave me a 2.5% reduction in world generation time. (Obviously I'm working with decompiled code; hopefully in this case it is similar enough to your code that you can tell what I'm talking about.)
In the gradient function, the return statement is something like this:
return ((perm & 1) == 0 ? x : -x) + ((perm & 2) == 0 ? y : -y);
which costs you one branch misprediction per function call on average: perm is effectively random, so each of the two branches is mispredicted about half the time. It can be replaced with this:
int flip1 = -((perm & 1) << 1) + 1;
int flip2 = -(perm & 2) + 1;
return flip1*x + flip2*y;
which accounts for most of the speed increase I mentioned. Earlier in the function you branch on the condition:
perm != 12 && perm != 14
which can be slightly improved to:
(perm | 0x2) != 14
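Both rewrites can be checked exhaustively. The following is only a sketch: the class and method names are made up for illustration and are not taken from the decompiled code, and it assumes perm has already been masked to the range 0..15, as in Ken Perlin's reference implementation.

```java
// Hypothetical test harness; none of these names come from Minecraft's code.
final class GradSketch {
    // Original form: two data-dependent branches on the low bits of perm.
    static double signsBranchy(int perm, double x, double y) {
        return ((perm & 1) == 0 ? x : -x) + ((perm & 2) == 0 ? y : -y);
    }

    // Proposed form: turn bit 0 and bit 1 into +1/-1 multipliers.
    static double signsBranchless(int perm, double x, double y) {
        int flip1 = -((perm & 1) << 1) + 1; // bit 0 clear -> +1, set -> -1
        int flip2 = -(perm & 2) + 1;        // bit 1 clear -> +1, set -> -1
        return flip1 * x + flip2 * y;
    }

    // Original two-comparison condition.
    static boolean condOriginal(int perm) {
        return perm != 12 && perm != 14;
    }

    // Merged condition: 12 and 14 differ only in bit 1, so setting it maps both to 14.
    static boolean condMerged(int perm) {
        return (perm | 0x2) != 14;
    }

    // Compares old and new forms for every perm in 0..15 (assumed range).
    static boolean rewritesAgree() {
        double x = 0.375, y = -1.25; // arbitrary sample inputs
        for (int perm = 0; perm < 16; perm++) {
            if (signsBranchy(perm, x, y) != signsBranchless(perm, x, y)) return false;
            if (condOriginal(perm) != condMerged(perm)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(rewritesAgree()); // prints "true"
    }
}
```

Multiplying a double by exactly +1 or -1 is exact, so the branchless form should return bit-identical results and not change the generated terrain.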
In theory the compiler could then turn the remaining branches into conditional moves, eliminating the branch misprediction cost altogether. However, it doesn't seem to be able to generate floating-point conditional move instructions at the moment (integer conditional moves work fine).