[maemo-developers] Performance of floating point instructions
From: Siarhei Siamashka siarhei.siamashka at gmail.com
Date: Wed Mar 10 22:33:08 EET 2010
On Wednesday 10 March 2010, Laurent Desnogues wrote:
> On Wed, Mar 10, 2010 at 8:54 PM, Siarhei Siamashka
> <siarhei.siamashka at gmail.com> wrote:
> [...]
>
> > I wonder why the compiler does not use real NEON instructions with
> > -ffast-math option, it should be quite useful even for scalar code.
> >
> > something like:
> >
> > vld1.32 {d0[0]}, [r0]
> > vadd.f32 d0, d0, d0
> > vst1.32 {d0[0]}, [r0]
> >
> > instead of:
> >
> > flds s0, [r0]
> > fadds s0, s0, s0
> > fsts s0, [r0]
> >
> > for:
> >
> > *float_ptr = *float_ptr + *float_ptr;
> >
> > At least NEON is pipelined and should be a lot faster on more complex
> > code examples where it can actually benefit from pipelining. On x86, SSE2
> > is used quite nicely for floating point math.
>
> Even if fast-math is known to break some rules, it only
> breaks C rules IIRC.

If that's the case, some other option would be handy. Or even a new custom
data type like float_neon (or any other name). Probably it is even possible
with C++ and operators overloading.

> OTOH, NEON FP has no support
> for NaN and other nice things from IEEE754.
>
> Anyway you're perhaps looking for -mfpu=neon, no?

I lost my faith in gcc long ago :) So I'm not really looking for anything.

-- 
Best regards,
Siarhei Siamashka
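
A minimal sketch of the float_neon idea mentioned above, assuming a C++ wrapper type whose overloaded operators call NEON intrinsics from <arm_neon.h>. Only the name float_neon comes from the message; the struct layout, the operators and the helper double_in_place are hypothetical. On targets without NEON the sketch falls back to plain float arithmetic so that it still builds. Whether a given compiler actually emits vadd.f32 for it, and how well it optimizes the lane moves, is not guaranteed. Results can also differ from VFP for denormals, since NEON arithmetic on ARMv7 flushes them to zero.

```cpp
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define FLOAT_NEON_USE_NEON 1
#else
#define FLOAT_NEON_USE_NEON 0
#endif

// Hypothetical scalar wrapper: arithmetic goes through the NEON unit
// when it is available, otherwise through ordinary float operations.
struct float_neon {
    float value;
    float_neon(float v = 0.0f) : value(v) {}
};

inline float_neon operator+(float_neon a, float_neon b) {
#if FLOAT_NEON_USE_NEON
    // Broadcast each scalar into a 64-bit D register, add, read lane 0.
    float32x2_t va = vdup_n_f32(a.value);
    float32x2_t vb = vdup_n_f32(b.value);
    return float_neon(vget_lane_f32(vadd_f32(va, vb), 0));
#else
    return float_neon(a.value + b.value);
#endif
}

inline float_neon operator*(float_neon a, float_neon b) {
#if FLOAT_NEON_USE_NEON
    float32x2_t va = vdup_n_f32(a.value);
    float32x2_t vb = vdup_n_f32(b.value);
    return float_neon(vget_lane_f32(vmul_f32(va, vb), 0));
#else
    return float_neon(a.value * b.value);
#endif
}

// Same operation as the quoted C statement:
//   *float_ptr = *float_ptr + *float_ptr;
inline void double_in_place(float *float_ptr) {
    float_neon x(*float_ptr);
    *float_ptr = (x + x).value;
}
```

With GCC on ARM this would typically be built with something like -mfpu=neon (plus a matching -mfloat-abi setting), the option suggested in the quoted reply.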