[maemo-developers] Performance of floating point instructions
From: Siarhei Siamashka siarhei.siamashka at gmail.com
Date: Thu Mar 11 00:32:20 EET 2010
On Wednesday 10 March 2010, Laurent GUERBY wrote:
> On Wed, 2010-03-10 at 21:54 +0200, Siarhei Siamashka wrote:
> > I wonder why the compiler does not use real NEON instructions with
> > -ffast-math option, it should be quite useful even for scalar code.
> >
> > something like:
> >
> > vld1.32 {d0[0]}, [r0]
> > vadd.f32 d0, d0, d0
> > vst1.32 {d0[0]}, [r0]
> >
> > instead of:
> >
> > flds s0, [r0]
> > fadds s0, s0, s0
> > fsts s0, [r0]
> >
> > for:
> >
> > *float_ptr = *float_ptr + *float_ptr;
> >
> > At least NEON is pipelined and should be a lot faster on more complex
> > code examples where it can actually benefit from pipelining. On x86, SSE2
> > is used quite nicely for floating point math.
>
> Hi,
>
> Please open a report on http://gcc.gnu.org/bugzilla with your test
> sources and command line, at least GCC developers will notice there's
> interest :).
This sounds reasonable :)
> GCC comes with some builtins for neon, they're defined in arm_neon.h
> see below.
This does not sound like a good idea. If the code has to be modified and
changed into something nonportable, there are way better options than
intrinsics.
As for using NEON instructions via C++ operator overloading, a test
program is attached.
# gcc -O3 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp -ffast-math \
      -o neon_float neon_float.cpp
=== ieee754 floats ===
real 0m3.396s
user 0m3.391s
sys 0m0.000s
=== runfast floats ===
real 0m2.285s
user 0m2.273s
sys 0m0.008s
=== NEON C++ wrapper ===
real 0m1.312s
user 0m1.313s
sys 0m0.000s
But the quality of the generated code is still quite bad. That's also
something to be reported to gcc bugzilla :)
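Since the attachment is scrubbed from the archive, here is a rough sketch of
the general idea of wrapping a NEON register in a C++ class with overloaded
operators. This is not the attached neon_float.cpp: the class name neon_float,
the value() accessor, the nf_* helpers and sum_of_squares() are all made up
for illustration, and the scalar fallback is there only so that the sketch
also builds on non-ARM machines.

```cpp
#include <cstddef>

// Hypothetical sketch, NOT the attached neon_float.cpp.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
// Keep one float in lane 0 of a 64-bit NEON register (both lanes hold x).
typedef float32x2_t nf_storage;
static inline nf_storage nf_set(float x) { return vdup_n_f32(x); }
static inline float nf_get(nf_storage v) { return vget_lane_f32(v, 0); }
static inline nf_storage nf_add(nf_storage a, nf_storage b) { return vadd_f32(a, b); }
static inline nf_storage nf_sub(nf_storage a, nf_storage b) { return vsub_f32(a, b); }
static inline nf_storage nf_mul(nf_storage a, nf_storage b) { return vmul_f32(a, b); }
#else
// Plain scalar fallback so the same source compiles without NEON.
typedef float nf_storage;
static inline nf_storage nf_set(float x) { return x; }
static inline float nf_get(nf_storage v) { return v; }
static inline nf_storage nf_add(nf_storage a, nf_storage b) { return a + b; }
static inline nf_storage nf_sub(nf_storage a, nf_storage b) { return a - b; }
static inline nf_storage nf_mul(nf_storage a, nf_storage b) { return a * b; }
#endif

class neon_float {
public:
    // Implicit construction from float, so "x * 2.0f" works.
    neon_float(float x = 0.0f) : v_(nf_set(x)) {}

    // Explicit accessor instead of "operator float", which would make
    // mixed expressions such as "x + 1.0f" ambiguous.
    float value() const { return nf_get(v_); }

    friend neon_float operator+(neon_float a, neon_float b) { return wrap(nf_add(a.v_, b.v_)); }
    friend neon_float operator-(neon_float a, neon_float b) { return wrap(nf_sub(a.v_, b.v_)); }
    friend neon_float operator*(neon_float a, neon_float b) { return wrap(nf_mul(a.v_, b.v_)); }

private:
    // Build a neon_float directly from the underlying storage.
    static neon_float wrap(nf_storage v) { neon_float r; r.v_ = v; return r; }
    nf_storage v_;
};

// Example scalar loop written against the wrapper type.
float sum_of_squares(const float *data, std::size_t n)
{
    neon_float acc;
    for (std::size_t i = 0; i < n; i++) {
        neon_float x(data[i]);
        acc = acc + x * x;
    }
    return acc.value();
}
```

Code that does its arithmetic on neon_float instead of float should then, in
principle, be compiled to vadd.f32/vmul.f32 rather than fadds/fmuls when built
with the command line above. How well the compiler keeps the values in NEON
registers is a separate question.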
--
Best regards,
Siarhei Siamashka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: neon_float.cpp
Type: text/x-c++src
Size: 2801 bytes
Desc: not available
URL: <http://lists.maemo.org/pipermail/maemo-developers/attachments/20100311/5db2e285/attachment.cpp>
