RE: SrPersist, sql-c-wchar, and unicode/wide-characters
> when i use sql-c-char, i get the english text ok (roman/ascii
> alphabet), but no korean, arabic, etc. (they come back as question
> marks)
That makes sense, because sql-c-char represents 8-bit C characters.
> when i tried sql-c-wchar, the mzscheme interpreter said "illegal
> instruction" and exited.
I've never tried SrPersist with a Unicode database, so the wide-character code is untested.
Which primitive caused this problem? Was it make-buffer, read-buffer, or write-buffer?
Even if this code worked perfectly, you might still have problems. The MzScheme language does not support Unicode. In SrPersist, if you read from a buffer that contains Unicode characters into a Scheme string, only the least significant 8 bits of each character are stuffed into the resulting Scheme string. That strategy works (I think) if the Unicode represents ordinary Latin-1 text, since those characters all fit in 8 bits. With Korean, Arabic, etc., it probably fails miserably.
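To make the truncation concrete, here is a minimal sketch in C++ (the language of srpbuffer.cxx). It is not the actual SrPersist code: the function names are hypothetical, and it assumes each wide character is a 16-bit unit, which may not match how the real buffers are laid out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not taken from srpbuffer.cxx: keep only the low
// 8 bits of each 16-bit unit and discard the high byte.
std::string truncate_to_low_bytes(const std::vector<std::uint16_t>& wide) {
    std::string out;
    out.reserve(wide.size());
    for (std::uint16_t unit : wide) {
        out.push_back(static_cast<char>(unit & 0xFF));
    }
    return out;
}

// Hypothetical check: true if every unit fits in 8 bits, i.e. the text
// is in the Latin-1 range and truncation loses nothing.
bool fits_in_latin1(const std::vector<std::uint16_t>& wide) {
    for (std::uint16_t unit : wide) {
        if (unit > 0xFF) {
            return false;
        }
    }
    return true;
}
```

With this sketch, the units 0x0048 0x0069 come through as "Hi", but a Hangul character such as U+D55C is reduced to its low byte 0x5C, which is an unrelated ASCII character.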
You might consider modifying the wide-character code in srpbuffer.cxx to use a different strategy, say, placing the two bytes of each Unicode character in distinct characters of a Scheme string. Of course, it will look like garbage in the Scheme REPL.
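A rough sketch of that two-byte strategy, again in C++ and again hypothetical: the function names are made up for illustration, the wide characters are assumed to be 16-bit units, and the high-byte-first order is an arbitrary choice. None of this is existing SrPersist code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: emit two 8-bit characters per 16-bit unit,
// high byte first, so that no information is lost.
std::string split_into_bytes(const std::vector<std::uint16_t>& wide) {
    std::string out;
    out.reserve(wide.size() * 2);
    for (std::uint16_t unit : wide) {
        out.push_back(static_cast<char>(unit >> 8));
        out.push_back(static_cast<char>(unit & 0xFF));
    }
    return out;
}

// Hypothetical inverse: rebuild the 16-bit units from a byte string
// produced by split_into_bytes. The length must be even.
std::vector<std::uint16_t> join_from_bytes(const std::string& bytes) {
    assert(bytes.size() % 2 == 0);
    std::vector<std::uint16_t> wide;
    wide.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned hi = static_cast<unsigned char>(bytes[i]);
        const unsigned lo = static_cast<unsigned char>(bytes[i + 1]);
        wide.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return wide;
}
```

The resulting string is twice as long as the original text, and plain ASCII characters pick up a leading zero byte, so whatever consumes the string has to tolerate embedded zero bytes and decode the pairs itself.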
-- Paul