Character codes &81 to &9F

Discussions related to the BB4W & BBCSDL interpreters and run-time engines
guest
Posts: 268
Joined: Mon 02 Apr 2018, 09:12

Character codes &81 to &9F

Post by guest » Sat 22 Dec 2018, 21:07

Can anybody remember why BBC BASIC for SDL 2.0 (BBCSDL) displays these characters for character codes &81 to &9F inclusive:

(attached image: chars.png)
I have no recollection of why this set of characters was chosen, nor what the abbreviations mean. They are not the same as BB4W displays for the same codes, so compatibility is not enhanced.

Re: Character codes &81 to &9F

Post by guest » Thu 27 Dec 2018, 16:55

guest wrote:
Sat 22 Dec 2018, 21:07
Can anybody remember why BBC BASIC for SDL 2.0 (BBCSDL) displays these characters for character codes &81 to &9F inclusive
Nobody? If the reason has been forgotten, what are people's views on possibly changing them to be more compatible with BB4W? My expectation would have been that they are rarely used on either platform, but that's been shattered by the admin, no less, who uses at least one of them in his recent competition entry.

DDRM
Administrator
Posts: 104
Joined: Mon 02 Apr 2018, 18:04

Re: Character codes &81 to &9F

Post by DDRM » Fri 28 Dec 2018, 10:46

Hi Richard,

I think what that shows is that it would be an advantage to have a character set that matches that of BB4W more closely!

I think the stray character is a single quote, or something similar... or possibly some sort of hyphen. I haven't worked out why Word rendered them into some form that the text files don't like... It certainly wasn't intentional use of one of the "special" characters...

Best wishes,

D

Re: Character codes &81 to &9F

Post by guest » Fri 28 Dec 2018, 11:22

DDRM wrote:
Fri 28 Dec 2018, 10:46
I think what that shows is that it would be an advantage to have a character set that matches that of BB4W more closely!
A Google search found this: "ANSI characters 32 to 127 correspond to those in the 7-bit ASCII character set, which forms the Basic Latin Unicode character range. Characters 160–255 correspond to those in the Latin-1 Supplement Unicode character range. Positions 128–159 in Latin-1 Supplement are reserved for controls, but most of them are used for printable characters in ANSI".

So it looks as though BBCSDL is using the 'Latin-1' character set whereas BB4W is using the 'ANSI' character set. According to that reference the two are the same for characters 32-127 and 160-255, but not for characters 129-159, which is exactly what we observe. I worry that this difference was for a good reason, but if so I can't remember now what that reason was! Alternatively it was just an accident.
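
As a quick check of that reading, here is a minimal Python sketch (not BBC BASIC, and not taken from either interpreter). It decodes each byte with Python's `latin-1` and `cp1252` codecs, which are assumed here to stand in for the 'Latin-1' and 'ANSI' sets named above. The helper name `decode_pair` is made up for illustration.

```python
# Sketch: compare single-byte decoding under Latin-1 and Windows-1252 ("ANSI").
# Assumption: Python's "latin-1" and "cp1252" codecs model the two character sets.

def decode_pair(code):
    """Return (latin1_char, ansi_char) for one byte value."""
    raw = bytes([code])
    latin1 = raw.decode("latin-1")
    # cp1252 leaves a few positions unassigned; show those as U+FFFD.
    ansi = raw.decode("cp1252", errors="replace")
    return latin1, ansi

# Collect every code from 32 to 255 where the two decodings disagree.
differing = [c for c in range(0x20, 0x100)
             if decode_pair(c)[0] != decode_pair(c)[1]]

print("first differing code: &%02X" % differing[0])
print("last differing code:  &%02X" % differing[-1])
```

On this comparison the two decodings disagree only for codes 128 to 159. Code 128 (&80) is included because Windows-1252 places the euro sign there.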
I haven't worked out why Word rendered them into some form that the text files don't like... It certainly wasn't intentional use of one of the "special" characters...
Arguably you should have written your BASIC program to use Unicode (UTF-8) encoding for safety. That would have ensured that BB4W and BBCSDL rendered the same characters, and given you the peace of mind of knowing that any character that Word was likely to introduce would be rendered correctly. Even Notepad will save-to-file in UTF-8 encoding!
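
To illustrate why UTF-8 sidesteps the problem, here is a small Python sketch. It is an illustration only: the right single quotation mark is an assumed example, since the stray character in the competition entry has not been identified.

```python
# Sketch: the same character stored as one "ANSI" byte versus as UTF-8.
# Assumption: the right single quotation mark is used purely as an example.

quote = "\u2019"  # right single quotation mark, as Word's "smart quotes" produce

ansi_bytes = quote.encode("cp1252")  # one byte, inside the &81..&9F range
utf8_bytes = quote.encode("utf-8")   # a multi-byte sequence

print("ANSI bytes: ", ansi_bytes.hex())
print("UTF-8 bytes:", utf8_bytes.hex())

# Read back as Latin-1, the ANSI byte is a control code, not a quote.
misread = ansi_bytes.decode("latin-1")
print("misread code point: U+%04X" % ord(misread))
```

The single ANSI byte falls in the disputed range, whereas the UTF-8 bytes identify the same character whichever 8-bit set the platform would otherwise assume, provided the text is displayed as UTF-8.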

Do you have a source of 8x8 bitmapped character shapes for those that you think should be changed? It's not easy to represent some of them with such a coarse grid, especially things like permille (‰) or the ligatures (e.g. œ).
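
For anyone wanting to try drawing shapes, here is a rough Python sketch of how an 8x8 glyph can be held as eight row bytes, previewed as text, and formatted as a VDU 23 character definition (the usual BBC BASIC statement for defining a character shape). The bullet shape in `BULLET_ROWS` is invented for illustration and is not taken from any existing font; `render` and `vdu23` are hypothetical helper names.

```python
# Sketch: an 8x8 glyph as eight bytes, one per row, most significant bit leftmost.
# Assumption: BULLET_ROWS is a made-up shape for the bullet at code &95.
BULLET_ROWS = [0x00, 0x00, 0x3C, 0x7E, 0x7E, 0x3C, 0x00, 0x00]

def render(rows):
    """Return the glyph as eight strings, '#' for a set pixel and '.' for clear."""
    return ["".join("#" if row & (0x80 >> col) else "." for col in range(8))
            for row in rows]

def vdu23(code, rows):
    """Format the glyph as a BBC BASIC VDU 23 character-definition statement."""
    return "VDU 23," + ",".join(str(v) for v in [code] + list(rows))

for line in render(BULLET_ROWS):
    print(line)
print(vdu23(0x95, BULLET_ROWS))
```

A bullet fits the grid comfortably; the per mille sign and the ligatures are where eight pixels of width become tight.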

Re: Character codes &81 to &9F

Post by guest » Fri 28 Dec 2018, 12:57

guest wrote:
Fri 28 Dec 2018, 11:22
Do you have a source of 8x8 bitmapped character shapes for those that you think should be changed?
For reference the characters required are:

&82 ‚ single low quotation mark
&83 ƒ small letter f with hook
&84 „ double low quotation mark
&85 … horizontal ellipsis
&86 † dagger
&87 ‡ double dagger
&88 ˆ circumflex accent
&89 ‰ per mille
&8A Š capital S caron
&8B ‹ single left-pointing angle quotation mark
&8C Œ capital OE ligature

&8E Ž capital Z caron


&91 ‘ left single quotation mark
&92 ’ right single quotation mark
&93 “ left double quotation mark
&94 ” right double quotation mark
&95 • bullet
&96 – en dash
&97 — em dash
&98 ˜ small tilde
&99 ™ trade mark sign
&9A š small S caron
&9B › single right-pointing angle quotation mark
&9C œ small OE ligature

&9E ž small Z caron
&9F Ÿ capital Y diaeresis
It would also be helpful to have matching 16x16 characters for the MODE 7 extended character set.
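
The list above can be cross-checked with a short Python sketch. It assumes Python's `cp1252` codec reflects the 'ANSI' set that BB4W uses; `ansi_char` is a hypothetical helper name.

```python
# Sketch: find which codes in &81..&9F Windows-1252 assigns to a character.
# Assumption: Python's "cp1252" codec matches the "ANSI" set used by BB4W.

def ansi_char(code):
    """Return the cp1252 character for a byte value, or None if unassigned."""
    try:
        return bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        return None

defined = {}
unassigned = []
for code in range(0x81, 0xA0):
    ch = ansi_char(code)
    if ch is None:
        unassigned.append(code)
    else:
        defined[code] = ch

for code in sorted(defined):
    print("&%02X U+%04X %s" % (code, ord(defined[code]), defined[code]))
print("unassigned:", " ".join("&%02X" % c for c in unassigned))
```

The unassigned codes correspond to the gaps in the list, so those positions need no new character shapes.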
