Outputting Arabic text

by Richard Russell, April 2012

BBC BASIC for Windows can output Unicode text to the main output window or to the printer, and it supports right-to-left printing, so in principle it ought to be able to output Arabic text. However, there is a complication: in Arabic the shape of a character (its glyph) can depend on its position within a word, that is, on whether it is at the start of a word, in the middle, at the end, or on its own (isolated).
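
To make the four positional forms concrete, the following minimal sketch prints each form of a single letter, beh, on its own line. It assumes the selected font contains the glyphs of the Unicode Arabic Presentation Forms-B block, where beh occupies U+FE8F (isolated), U+FE90 (final), U+FE91 (initial) and U+FE92 (medial). The variable name form% is chosen purely for illustration.

        VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
        *FONT Times New Roman, 28

        REM Assumption: the font supplies glyphs for U+FE8F to U+FE92
        REM In UTF-8 these code points are the byte sequences EF BA 8F to EF BA 92
        FOR form% = 0 TO 3
          REM 0 = isolated, 1 = final, 2 = initial, 3 = medial
          PRINT CHR$&EF + CHR$&BA + CHR$(&8F + form%)
        NEXT
        END

Because each form is sent as its own explicit code point, the output does not depend on any shaping by the output routines.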

Because BBC BASIC outputs text as a VDU stream, each character is treated in isolation. When outputting Arabic, the 'isolated' forms of the characters are therefore used and the text is not rendered correctly. For example, try running this program (copy and paste it into the BB4W editor):

        VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
        *FONT Times New Roman, 28
 
        arabic1$ = "هذا هو مثال على النص العربي"
        arabic2$ = "هو مكتوب من اليمين إلى اليسار"
 
        VDU 23,16,2;0;0;0;13 : REM Select right-to-left printing
        PRINT arabic1$ ' arabic2$
        VDU 23,16,0;0;0;0;13 : REM Select left-to-right printing
        END

This is what is displayed:
The isolated forms of the characters have been used and the result is not correct.

Fortunately, there is a solution to this problem: if the strings are pre-processed before output, the correct glyphs can be generated. Here is a revised version of the program (the function FNarabic is listed later):

        VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
        *FONT Times New Roman, 28
 
        arabic1$ = "هذا هو مثال على النص العربي"
        arabic2$ = "هو مكتوب من اليمين إلى اليسار"
 
        VDU 23,16,2;0;0;0;13 : REM Select right-to-left printing
        PRINT FNarabic(arabic1$) ' FNarabic(arabic2$)
        VDU 23,16,0;0;0;0;13 : REM Select left-to-right printing
        END

This is what is displayed:
The correct cursive script has been produced.

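Here is the FNarabic function. It takes a UTF-8 string and returns a copy in which each Arabic letter it recognises (U+0622, U+0627 to U+063A and U+0641 to U+064A) is replaced by the appropriate contextual glyph from the Unicode Arabic Presentation Forms-B block. All other characters are passed through unchanged: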

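        DEF FNarabic(A$)
        LOCAL A%, B%, O%, P%, U%, B$
        REM O%, P% and U% hold the previous-but-one, previous and current characters
        A$ += CHR$0 : REM Sentinel, so that the final character is also processed
        FOR A% = !^A$ TO !^A$+LENA$-1
          REM Act only on the first byte of each UTF-8 character
          IF ?A%<&80 OR ?A%>=&C0 THEN
            O% = P% : P% = U%
            REM Decode a two-byte UTF-8 sequence; single-byte (ASCII) characters become 0
            U% = ((?A% AND &3F) << 6) + (A%?1 AND &3F)
            IF ?A%<&80 U% = 0
            REM Map each Arabic letter to the low byte of its isolated form (U+FE81 to U+FEF1)
            CASE TRUE OF
              WHEN U%=&622: U% = &81
              WHEN U%=&627: U% = &8D
              WHEN U%<&628:
              WHEN U%<=&629: U% = &8F+4*(U%-&628)
              WHEN U%<=&62E: U% = &95+4*(U%-&62A)
              WHEN U%<=&632: U% = &A9+2*(U%-&62F)
              WHEN U%<=&63A: U% = &B1+4*(U%-&633)
              WHEN U%<&641:
              WHEN U%<=&648: U% = &D1+4*(U%-&641)
              WHEN U%=&649: U% = &EF
              WHEN U%=&64A: U% = &F1
            ENDCASE
            REM If the previous character was mapped, select its contextual form
            IF P% IF P%<&600 THEN
              B% = P%
              REM Add 2 if it joins to the following letter, 1 if the preceding letter joins to it
              IF U% IF P%>&8D IF P%<>&93 IF P%<&A9 OR P%>&AF IF P%<>&ED IF P%<>&EF B% += 2
              IF O% IF O%>&8D IF O%<>&93 IF O%<&A9 OR O%>&AF IF O%<>&ED IF O%<>&EF B% += 1
              REM Replace the two bytes already copied with the three-byte code for U+FE00+B%
              B$ = LEFT$(LEFT$(B$))+CHR$&EF+CHR$(&B8+(B%>>6))+CHR$(&80+(B%AND&3F))
            ENDIF
          ENDIF
          B$ += CHR$?A%
        NEXT
        REM Strip the sentinel before returning
        = LEFT$(B$)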
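
One way to check what FNarabic has produced is to list the bytes of its result in hexadecimal. The sketch below is only an illustration: PROCdump is a hypothetical helper, not part of the original program, and it assumes FNarabic as listed above is present in the same program. The test string is beh followed by alef (U+0628, U+0627), supplied as explicit UTF-8 bytes so that the result does not depend on how the editor stores Arabic text.

        REM In the main program, before END:
        PROCdump(FNarabic(CHR$&D8 + CHR$&A8 + CHR$&D8 + CHR$&A7))
        END

        REM Hypothetical helper: print each byte of A$ as a hexadecimal number
        DEF PROCdump(A$)
        LOCAL I%
        FOR I% = 1 TO LEN(A$)
          PRINT STR$~(ASC(MID$(A$, I%))) " ";
        NEXT
        PRINT
        ENDPROC

Tracing the function by hand for this input gives EF BA 91 EF BA 8E, that is U+FE91 (beh, initial form) followed by U+FE8E (alef, final form), which is the expected shaping for this pair.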
outputting_20arabic_20text.txt · Last modified: 2018/04/17 18:05 by tbest3112