[Fast Float to Int]                           [Assembler][/][Pentium][+FPU]

Converting a floating point number to integer is normally done like this:

        fistp   [dword ptr temp]
        mov     eax,[temp]

An alternative method is:

DATASEG
ALIGN 8

temp    dq      ?
magic   dd      59c00000h       ; f.p. representation of 2^51 + 2^52

CODESEG

        fadd    [magic]
        fstp    [qword ptr temp]
        mov     eax,[dword ptr temp]

Adding the 'magic number' of 2^51 + 2^52 has the effect that any integer
between -2^31 and +2^31 will be aligned in the lower 32 bits of the mantissa
when the sum is stored as a double precision floating point number. The
result is the same as you get with FISTP for all rounding methods except
truncation towards zero. It also differs from FISTP in case of overflow.
You may need a WAIT instruction for compatibility with other processors.
This method is not faster than using FISTP, but it gives better scheduling
opportunities because there is a 3 clock void between FADD and FSTP which
may be filled with other instructions. You may multiply or divide the number
by a power of 2 in the same operation by doing the opposite to the magic
number: to multiply by 2^n, divide the magic number by 2^n. You may also
add a constant by adding it to the magic number, which then has to be
double precision.
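
The effect of the magic number can also be checked outside assembler. The
following is a minimal C sketch of the same arithmetic, not the code of this
gem. It assumes IEEE 754 doubles, the default round-to-nearest mode, and a
compiler that evaluates double arithmetic in true double precision (e.g.
SSE2 rather than x87 extended precision). All function names are made up for
illustration.

```c
#include <stdint.h>
#include <string.h>

/* 2^52 + 2^51, the value encoded by the single precision pattern 59c00000h. */
#define MAGIC 6755399441055744.0

/* Hypothetical helper: decode the 32-bit pattern used for 'magic'. */
static double magic_from_bits(void)
{
    uint32_t bits = 0x59c00000u;
    float f;
    memcpy(&f, &bits, sizeof f);        /* reinterpret the bits as a float */
    return (double)f;
}

/* Hypothetical helper: low 32 bits of a double's bit pattern, as signed. */
static int32_t low_dword(double d)
{
    uint64_t bits;
    uint32_t low;
    int32_t result;
    memcpy(&bits, &d, sizeof bits);     /* like FSTP [qword ptr temp] */
    low = (uint32_t)bits;               /* like MOV EAX,[dword ptr temp] */
    memcpy(&result, &low, sizeof result);
    return result;
}

/* Round x to a 32-bit integer; valid for -2^31 <= x < 2^31. */
static int32_t double_to_int_magic(double x)
{
    volatile double sum = x + MAGIC;    /* volatile forces a real store */
    return low_dword(sum);
}

/* Same, but scaled by 2^frac_bits by dividing the magic number instead.
   Assumes 0 <= frac_bits < 32 and that the scaled result fits in 32 bits. */
static int32_t double_to_fixed_magic(double x, int frac_bits)
{
    volatile double sum = x + MAGIC / (double)(1u << frac_bits);
    return low_dword(sum);
}
```

Under these assumptions double_to_int_magic(2.5) gives 2 and
double_to_int_magic(3.5) gives 4 (round to nearest even), and
double_to_fixed_magic(1.5, 8) gives 384, i.e. 1.5 * 2^8.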

A related problem is the inability of the FP unit to round a float to an
integer internally at any reasonable speed (FRNDINT takes 19 cycles
apparently). In some situations you could use:

        fistp   [qword mem]     ; 6 (estimated clock cycle count)
        fild    [qword mem]     ; 7-9

However, accuracy will be sacrificed since a 64-bit integer cannot
represent values as high as floating point numbers can. You can also use
the "magic" trick here, but you would get even less accuracy: just add a
similar "magic" number and then subtract it again. Because of the
insufficient precision, the floating point value will change. This method
has two drawbacks:

   * you must know the exact precision/accuracy of the FPU (it depends on
     the precision control bits of the control word), and if you don't
     know it, the method will fail.
   * depending on the rounding mode, you might get different results than
     with FRNDINT.

Another FRNDINT replacement:

magic   dd      59c00000h       ; f.p. representation of 2^51 + 2^52

        fadd    [magic]         ; 1-3
        fsub    [magic]         ; 4-6

The throughput of the above code is only 2 clocks. Unfortunately there are
situations where the results will not be the same as with FRNDINT. The
lowest bit of the result may be lost.
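
The add-and-subtract idea can be tried out in C as well. This is only a
sketch under assumptions, not the gem's own code: it assumes IEEE 754
doubles evaluated with a 53-bit mantissa (which is why the 2^51 + 2^52
constant fits), the default round-to-nearest mode, and |x| < 2^51. The
names ROUND_MAGIC and round_magic are made up for illustration.

```c
/* 2^52 + 2^51; only correct when sums are rounded to a 53-bit mantissa. */
#define ROUND_MAGIC 6755399441055744.0

/* Round x to an integral value without leaving floating point. */
static double round_magic(double x)
{
    /* The sum has no room for fraction bits, so they are rounded away.
       volatile keeps the compiler from simplifying (x + M) - M to x. */
    volatile double t = x + ROUND_MAGIC;
    return t - ROUND_MAGIC;             /* exact subtraction */
}
```

With these assumptions round_magic(2.3) gives 2.0 and round_magic(2.5)
also gives 2.0 (round to nearest even). Note that the sketch returns +0.0
for an input such as -0.3, so the sign of zero is not preserved.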
                                                 Gem writers: Vesa Karvonen
                                                                  Agner Fog
                                                   last updated: 1998-03-16
