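A better fix would be to use "-mfpmath=sse" to disable x87 math (on 32-bit x86 you also need "-msse2" for it to take effect on doubles; on x86-64 SSE math is already the default). This also makes your program slightly faster, whereas -ffloat-store can make it much slower.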
...or don't use an algorithm that depends on the hardware implementation of floating point math in the first place? How about just comparing the previous & current iteration values? If you're not making any progress, you're done.
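
To make the "no progress" idea concrete, here is a rough sketch in C. The original loop isn't shown in this thread, so Newton's iteration for a square root stands in for it; the function name newton_sqrt and the iteration cap are made up for illustration. Instead of waiting for two values to compare exactly equal, it stops as soon as the step between iterations stops shrinking:

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for the original loop: Newton's iteration
   for sqrt(a), a > 0. It terminates when the step size stops
   shrinking, not when two values happen to compare exactly equal. */
double newton_sqrt(double a)
{
    assert(a > 0.0);

    double x = a > 1.0 ? a : 1.0;     /* start at or above the root */
    double prev_step = INFINITY;

    for (int i = 0; i < 2000; i++) {  /* arbitrary cap as a safety net */
        double next = 0.5 * (x + a / x);
        double step = fabs(next - x);

        if (step >= prev_step)        /* no progress since last round */
            break;

        x = next;
        prev_step = step;
    }
    return x;
}
```

Starting at or above the root means the steps shrink on every round until rounding error takes over, so the check should fire once the result is within a few ulps, whether or not the intermediates carry extra x87 precision. The cap only guarantees termination if something unexpected happens.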