[bitc-dev] Bitc and Simd
Ben Kloosterman
bklooste at gmail.com
Thu Aug 12 18:51:19 PDT 2010
In C I have been tempted to use more and more SIMD, as it can give a
massive boost (far more than, say, the difference between Java and C).
However, except for memcpy and memset (which are asm in the libs anyway), the
compiler does a pretty bad job of generating it, forcing you to use asm {}
(ask me for an example, but gcc and VC10 are not great).
Both AMD and Intel will continue in this direction, e.g. Q4 chips will offer
AVX:
"The size of the SIMD vector registers is increased from 128-bits XMM
registers to 256-bits registers called YMM0 - YMM15. Existing 128-bit
instructions use the lower half of the YMM registers. Further extensions to
512 or 1024 bits are expected in the future."
And while alignment requirements are relaxed, optimal performance will still
require 256-bit alignment.
This has some ramifications for BitC. For some compression and crypto
algorithms, for example, you cannot achieve these performance levels without
SIMD. This will force you to do one of the following:
1) Develop C-style intrinsics (or inline asm), which are ugly and don't allow
for much optimization.
2) Use external C or asm code.
3) Include native support for SIMD-type instructions. It would be very
nice to write:
ymm256 r1 = (12, 20, 13, 14, 1, 1, 1);
ymm256 r2 = (1, 2, 3, 4, 2000, 1<<30, 2, 2);
let r3 = r1 + r2;
The same would apply to the other bit options, with support for converting to
SIMD packing and alignment. You may want to go further and treat it as a
256-bit value but use different operators for SIMD.
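
For comparison, GCC and Clang already offer something close to option 3 in C,
as a non-standard vector extension. The following is a minimal sketch that
assumes one of those compilers. The names v8i32 and add8_i32 are made up for
illustration, and none of this is proposed BitC syntax.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang vector extension (not standard C): eight 32-bit lanes, 256 bits. */
typedef int32_t v8i32 __attribute__((vector_size(32)));

static_assert(sizeof(v8i32) == 32, "v8i32 must be 256 bits wide");

/* Hypothetical helper: out[i] = a[i] + b[i] for all eight lanes. */
void add8_i32(const int32_t a[8], const int32_t b[8], int32_t out[8])
{
    v8i32 va, vb, vr;

    /* memcpy keeps the loads and stores safe for unaligned arrays. */
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    vr = va + vb; /* lane-wise add written with a plain operator */

    memcpy(out, &vr, sizeof vr);
}
```

The compiler chooses the instructions here. With AVX2 enabled (for example
-mavx2 on gcc) the add can become a single 256-bit instruction, and without it
the operation is split into narrower ones.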
While I'm not suggesting it now, I would put some of this type of thing
(especially the 128- and 256-bit values) on the reserved keyword list as a
future option. It would make a compelling argument for many shops to switch
from C/asm for fast algorithms. High-performance algorithms are hard enough
without all the intrinsic baggage, or the asm needed where intrinsics do a bad
job. It may be worth considering some basic form of these earlier, such as the
128- and 256-bit normal and non-temporal copy and memset currently implemented
as asm in the C libs, as these would be critical for writing efficient
buffers, memcpy, etc.
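
To make the non-temporal copy concrete, here is roughly what it looks like
today with SSE2 intrinsics in C. This is only a sketch under stated
assumptions: copy_nt128 is a made-up name, the buffers must not overlap, and
on targets without SSE2 it simply falls back to memcpy. A tuned library
routine would do considerably more (prefetching, size thresholds, wider
registers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy n bytes using 128-bit non-temporal stores.
   dst and src must not overlap. */
void copy_nt128(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);

#if defined(__SSE2__)
    /* The streaming store needs a 16-byte aligned destination, so copy
       single bytes until d reaches that alignment. */
    while (n > 0 && ((uintptr_t)d & 15u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Main loop: unaligned 128-bit load, non-temporal 128-bit store. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Order the streaming stores before any later stores. */
    _mm_sfence();
#endif

    /* Remaining tail bytes, or the whole copy when SSE2 is unavailable. */
    memcpy(d, s, n);
}
```

The alignment prologue, the cast-heavy loop and the explicit fence are the
kind of baggage a native 128/256-bit value type could hide.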
Ben