Unicode support in PowerBuilder
PowerBuilder uses UTF-16LE encoding internally. The source code in
PBLs is encoded in UTF-16LE, any text entered in an application is
automatically converted to Unicode, and the string and
character PowerScript datatypes hold Unicode data only. Any ANSI or DBCS
characters assigned to these datatypes are converted internally to
Unicode encoding.
Support for Unicode databases
Most PowerBuilder database interfaces support both ANSI and
Unicode databases.
A Unicode database is a database whose character set is set to a
Unicode format, such as UTF-8 or UTF-16. All data in the database is in
Unicode format, and any data saved to the database must be converted to
Unicode data implicitly or explicitly.
A database that uses ANSI (or DBCS) as its character set can use
special datatypes to store Unicode data. These datatypes are NChar,
NVarChar, and NVarChar2. Columns with one of these datatypes can store
Unicode data, but data saved to such a column must be converted to
Unicode explicitly.
For more specific information about each interface, see Connecting to Your Database.
String functions
PowerBuilder string functions, such as Fill, Len, Mid, and Pos,
take characters instead of bytes as parameters or return values and
return the same results in all environments. These functions have a
“wide” version (such as FillW) that is obsolete and will be removed in a
future version of PowerBuilder because it produces the same results as
the standard version of the function. Some of these functions also have
an ANSI version (such as FillA). This version is provided for backwards
compatibility for users in DBCS environments who used the standard
version of the string function in previous versions of PowerBuilder to
return bytes instead of characters.
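As a rough illustration of the character/byte distinction, the sketch below compares Len with its ANSI counterpart on the same string. It assumes LenA is available, following the FillA naming pattern described above; the variable names are illustrative only.

```powerscript
// Sketch: character count versus byte count for the same string.
// ls_text, ll_chars, and ll_bytes are illustrative names.
string ls_text
long ll_chars, ll_bytes

ls_text = "PowerBuilder"

ll_chars = Len(ls_text)    // number of characters, the same in every environment
ll_bytes = LenA(ls_text)   // number of bytes when the string is treated as ANSI/DBCS

// For pure ASCII text the two values match; text containing
// double-byte characters yields a larger value from LenA.
MessageBox("Lengths", "Len = " + String(ll_chars) + ", LenA = " + String(ll_bytes))
```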
You can use the GetEnvironment function to determine the character
set used in the environment:
    environment env
    getenvironment(env)

    choose case env.charset
    case charsetdbcs!
       // DBCS processing
       ...
    case charsetunicode!
       // Unicode processing
       ...
    case charsetansi!
       // ANSI processing
       ...
    case else
       // Other processing
       ...
    end choose
Encoding enumeration
Several functions, including Blob, BlobEdit, FileEncoding,
FileOpen, SaveAs, and String, have an optional encoding parameter. These
functions let you work with blobs and files with ANSI, UTF-8, UTF-16LE,
and UTF-16BE encoding. If you do not specify this parameter, the default
encoding used for SaveAs and FileOpen is ANSI. For other functions, the
default is UTF-16LE.
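A minimal sketch of the optional encoding parameter, using Blob and String to convert a string to UTF-8 bytes and back. The variable names and sample text are illustrative; the comment about the default encoding restates the paragraph above rather than adding a new claim.

```powerscript
// Sketch: convert a string to a UTF-8 blob and back.
string ls_text, ls_back
blob lb_utf8, lb_default

ls_text = "Unicode text"

lb_utf8 = Blob(ls_text, EncodingUTF8!)   // explicit encoding
lb_default = Blob(ls_text)               // no encoding argument: UTF-16LE, per the text above

// Pass the same encoding to String so the bytes are interpreted correctly.
ls_back = String(lb_utf8, EncodingUTF8!)

// Len on a blob returns its size in bytes.
MessageBox("Blob sizes", "UTF-8: " + String(Len(lb_utf8)) + &
   " bytes, default: " + String(Len(lb_default)) + " bytes")
```

Specifying the encoding on both sides of the conversion avoids relying on the per-function defaults.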
The following examples illustrate how to open different kinds of
files using FileOpen:
    // Read an ANSI File
    Integer li_FileNum
    String s_rec
    li_FileNum = FileOpen("Employee.txt")
    // or:
    // li_FileNum = FileOpen("Employee.txt", &
    //    LineMode!, Read!)
    FileRead(li_FileNum, s_rec)

    // Read a Unicode File
    // The encoding argument follows the filelock and writemode
    // arguments, so both are supplied with their default values.
    Integer li_FileNum
    String s_rec
    li_FileNum = FileOpen("EmployeeU.txt", LineMode!, &
       Read!, LockReadWrite!, Append!, EncodingUTF16LE!)
    FileRead(li_FileNum, s_rec)

    // Read a Binary File
    Integer li_FileNum
    blob bal_rec
    li_FileNum = FileOpen("Employee.imp", StreamMode!, &
       Read!)
    FileRead(li_FileNum, bal_rec)
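When the encoding of a file is not known in advance, one possible approach is to query it with FileEncoding and pass the result to FileOpen. The sketch below assumes that FileEncoding returns null when the file does not exist; the file name is the one used in the example above, and the variable names are illustrative.

```powerscript
// Sketch: detect a file's encoding, then open the file with that encoding.
integer li_FileNum
string ls_rec
encoding le_enc

le_enc = FileEncoding("EmployeeU.txt")

IF IsNull(le_enc) THEN
   // Assumption: a null result means the file was not found.
   MessageBox("Error", "File not found")
ELSE
   // The encoding argument is the sixth parameter of FileOpen, so the
   // filelock and writemode arguments are supplied explicitly.
   li_FileNum = FileOpen("EmployeeU.txt", LineMode!, Read!, &
      LockReadWrite!, Append!, le_enc)

   IF li_FileNum <> -1 THEN
      FileRead(li_FileNum, ls_rec)
      FileClose(li_FileNum)
   END IF
END IF
```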
Initialization files
The SetProfileString function can write to initialization files
with ANSI or UTF-16LE encoding on Windows systems, and ANSI or UTF-16BE
encoding on UNIX systems. The ProfileInt and ProfileString PowerScript
functions and DataWindow expression functions can read files with these
encoding schemes.
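The following sketch writes a value to an initialization file and reads it back. The file name app.ini, the section name, and the key name are hypothetical, and the file is assumed to exist already.

```powerscript
// Sketch: write a value to an initialization file and read it back.
// "app.ini", "Greeting", and "Text" are hypothetical names.
string ls_value
integer li_rc

li_rc = SetProfileString("app.ini", "Greeting", "Text", "Unicode text")

IF li_rc = 1 THEN
   // The last argument is the default returned when the key is not found.
   ls_value = ProfileString("app.ini", "Greeting", "Text", "")
   MessageBox("Profile value", ls_value)
ELSE
   MessageBox("Error", "SetProfileString failed")
END IF
```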
Exporting and importing source
The Export Library Entry dialog box lets you select the type of
encoding for an exported file. The choices are ANSI/DBCS, which lets you
import the file into PowerBuilder 9 or earlier, HEXASCII, UTF8, or
Unicode LE.
The HEXASCII export format is used for source-controlled files.
Unicode strings are represented by hexadecimal/ASCII strings in the
exported file, which has the letters HA at the beginning of the header
to identify it as a file that might contain such strings. You cannot
import HEXASCII files into PowerBuilder 9 or earlier.
If you import an exported file from PowerBuilder 9 or earlier, the
source code in the file is converted to Unicode before the object is
added to the PBL.
External functions
When you call an external function that returns an ANSI string or
has an ANSI string argument, you must use an ALIAS clause in the
external function declaration and add ;ansi to the function name. For
example:
    FUNCTION int MessageBox(int handle, string content, string title, int showtype) &
       LIBRARY "user32.dll" ALIAS FOR "MessageBoxA;ansi"
The following declaration is for the “wide” version of the
function, which uses Unicode strings:
    FUNCTION int MessageBox(int handle, string content, string title, int showtype) &
       LIBRARY "user32.dll" ALIAS FOR "MessageBoxW"
If you are upgrading an application from PowerBuilder 9 or
earlier, PowerBuilder replaces function declarations that use ANSI
strings with the correct syntax automatically.
Setting fonts for multiple language support
The default font in the System Options and Design Options dialog
boxes is Tahoma.
Setting the font in the System Options dialog box to Tahoma
ensures that multiple languages display correctly in the Layout and
Properties views in the Window, User Object, and Menu painters and in
the wizards.
If the font on the Editor Font page in the Design Options dialog
box is not set to Tahoma, multiple languages cannot be displayed in
Script views, the File and Source editors, the ISQL view in the Database
painter, and the Debug window.
You can select a different font for printing on the Printer Font
tab page of the Design Options dialog box for Script views, the File and
Source editors, and the ISQL view in the Database painter. If the
printer font is set to Tahoma and the Tahoma font is not installed on
the printer, PowerBuilder downloads the entire font set to the printer
when it encounters a multilanguage character. If you need to print
multilanguage characters, specify a printer font that is installed on
your printer.
To support multiple languages in DataWindow objects, set the font
in every column and text control to Tahoma.
The default font for print functions is the system font. Use the
PrintDefineFont and PrintSetFont functions to specify a font that is
available on users’ printers and supports multiple languages.
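A minimal sketch of defining and selecting a font for a print job. The Tahoma face name is an assumption and should be replaced with a font that is installed on the target printer; the job name and printed text are placeholders.

```powerscript
// Sketch: define a font for a print job and select it before printing.
// "Tahoma" is an assumption; use a font installed on the target printer.
long ll_job

ll_job = PrintOpen("Multilanguage test")

IF ll_job <> -1 THEN
   // Font number 1: 12 points (a negative height is a point size),
   // normal weight (400), default pitch, any font family,
   // no italic, no underline.
   PrintDefineFont(ll_job, 1, "Tahoma", -12, 400, Default!, AnyFont!, FALSE, FALSE)
   PrintSetFont(ll_job, 1)
   Print(ll_job, "Text containing characters from several languages")
   PrintClose(ll_job)
END IF
```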
PBNI
The PowerBuilder Native Interface is Unicode based. PBNI
extensions must be compiled using the _UNICODE preprocessor directive in
your C++ development environment.
Your extension’s code must use TCHAR, LPTSTR, or LPCTSTR instead
of char, char*, and const char* to ensure that it works correctly in a
Unicode environment. Alternatively, you can use the
MultiByteToWideChar function to map character strings to Unicode
strings. For more information about enabling Unicode in your
application, see the documentation for your C++ development
environment.
Unicode enabling for Web services
In a PowerScript target, the PBNI extension classes instantiated
by Web service client applications use Unicode for all internal
processing. However, calls to component methods are converted to ANSI
for processing by EasySoap (obsolete), and data returned from these calls is
converted to Unicode.
XML string encoding
The XML parser cannot parse a string that uses an eight-bit
character code such as windows-1253. For example, a string with the
following declaration cannot be parsed:
    string ls_xml
    ls_xml += '<?xml version="1.0" encoding="windows-1253"?>'
You must use a Unicode encoding value such as UTF-16LE.
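For comparison, a sketch of a declaration that names a Unicode encoding instead. The spelling UTF-16LE is the standard name of the encoding and is assumed here to be the value the parser expects; the root element is a placeholder.

```powerscript
// Sketch: the same kind of declaration with a Unicode encoding value.
// "UTF-16LE" is the standard spelling of the encoding name (an assumption here);
// the <root/> element is a placeholder.
string ls_xml

ls_xml = '<?xml version="1.0" encoding="UTF-16LE"?>'
ls_xml += '<root/>'
```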