using_20sse2_20instructions

Differences

This shows you the differences between two versions of the page.

using_20sse2_20instructions [2018/03/31 13:19]
127.0.0.1 external edit
using_20sse2_20instructions [2018/04/13 11:42] (current)
richardrussell
Line 1:
 =====Using SSE2 instructions=====
 
-//by Richard Russell, updated May 2015//\\ \\ [[http://en.wikipedia.org/wiki/SSE2|SSE2]] instructions are supported by the **ASMLIB2** library, and that will generally be the most appropriate way to incorporate them in a program. However, using the library has one significant disadvantage: the resulting program cannot be straightforwardly compiled, because the SSE2 instructions will not be accepted by the //cruncher//. To work around this issue the assembler code must be placed in a separate file (with an extension other than .BBC) which is executed at run time, for example:\\
+//by Richard Russell, updated May 2015//\\ \\ [[http://en.wikipedia.org/wiki/SSE2|SSE2]] instructions are supported by the **ASMLIB2** library, and that will generally be the most appropriate way to incorporate them in a program. However, using the library has one significant disadvantage: the resulting program cannot be straightforwardly compiled, because the SSE2 instructions will not be accepted by the //cruncher//. To work around this issue the assembler code must be placed in a separate file (with an extension other than .BBC) which is executed at run time, for example:
+
+<code bb4w>
         CALL "mysse2code.bba"
+</code>
+
 (the file should have a **RETURN** as the last statement).\\ \\  Whilst this solution is relatively straightforward it is arguably inconvenient, especially if the amount of assembler code is small. There is an alternative way of assembling many of the SSE2 instructions which does not require the use of a library and which allows the program to be compiled conventionally; that is to add a **word** qualifier to the equivalent MMX instruction. So for example the instruction:\\
         punpcklbw xmm0,xmm1
 can be assembled as follows:\\
         punpcklbw word mm0,mm1 ; punpcklbw xmm0,xmm1
-\\  The full set of **SSE2** instructions which can be assembled in this way is as follows:\\
+
+The full set of **SSE2** instructions which can be assembled in this way is as follows:
+
         punpcklbw word mm0,mm1 ; punpcklbw xmm0,xmm1
         punpcklwd word mm0,mm1 ; punpcklwd xmm0,xmm1
using_20sse2_20instructions.txt · Last modified: 2018/04/13 11:42 by richardrussell