About Unicode – PB Docs 2021 – PowerBuilder Library

About Unicode

Before Unicode was developed, there were many different encoding
systems, many of which conflicted with each other. For example, the same
number could represent different characters in different encoding
systems. Unicode provides a unique number for each character in all
supported written languages. For languages that can be written in
several scripts, Unicode provides a unique number for each character in
each supported script.
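
The conflict described above can be illustrated with a minimal sketch,
written in Python rather than PowerScript. The byte value 0xE9 and the
three legacy code pages are chosen here only for illustration, and
decode_with is a hypothetical helper name.

```python
# One byte value, three legacy code pages, three different characters.
RAW = bytes([0xE9])

def decode_with(code_page: str) -> str:
    """Decode the single byte 0xE9 using the given legacy code page."""
    return RAW.decode(code_page)

for code_page in ("cp1252", "cp437", "iso8859_5"):
    ch = decode_with(code_page)
    # Show the unique Unicode number that this legacy byte maps to.
    print(f"{code_page:>9}: byte 0xE9 -> U+{ord(ch):04X}")
```

Each code page maps the same byte to a different character, whereas
each of those characters has its own number in Unicode.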

For more information about the supported languages and scripts,
see the Unicode website at http://www.unicode.org/cldr/charts/latest/supplemental/scripts_and_languages.html.

Encoding forms

There are three Unicode encoding forms: UTF-8, UTF-16, and UTF-32.
UTF originally stood for Unicode Transformation Format. The acronym now
appears in the names of both the encoding forms, which map each
character in the character set to the code units that represent it, and
the encoding schemes, which are encoding forms with a specific byte
serialization.

  • UTF-8 uses an unsigned byte sequence of one to four bytes to
    represent each Unicode character.

  • UTF-16 uses one or two unsigned 16-bit code units, depending
    on the range of the scalar value of the character, to represent each
    Unicode character.

  • UTF-32 uses a single unsigned 32-bit code unit to represent
    each Unicode character.
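
As a rough illustration of the three encoding forms, the following
Python sketch counts the code units each form needs for a few sample
characters. The function name code_unit_counts and the sample
characters are assumptions made for this example. The byte-order
specific codecs are used only so that no byte order mark is added to
the count.

```python
def code_unit_counts(ch: str) -> dict:
    """Return how many code units each encoding form needs for one character."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }

# U+0041, U+00E9, U+20AC, and U+1F600 (outside the 16-bit range).
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", code_unit_counts(ch))
```

Only the last sample character, whose scalar value is above U+FFFF,
needs two UTF-16 code units. All four need exactly one UTF-32 code
unit.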

Encoding schemes

An encoding scheme specifies how the bytes in an encoding form are
serialized. When you manipulate files, convert blobs and strings, and
save DataWindow data in PowerBuilder, you can choose to use ANSI
encoding, or one of three Unicode encoding schemes:

  • UTF-8 serializes a UTF-8 code unit sequence in exactly the
    same order as the code unit sequence itself.

  • UTF-16BE serializes a UTF-16 code unit sequence as a byte
    sequence in big-endian format.

  • UTF-16LE serializes a UTF-16 code unit sequence as a byte
    sequence in little-endian format.

UTF-8 is frequently used in Web requests and responses. The
big-endian format, where the most significant byte of each code unit
is stored at the lowest storage address, is typically used on UNIX
systems. The little-endian format, where the least significant byte of
each code unit is stored first, is used on Windows.
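
The difference between the three encoding schemes can be seen by
serializing a single character. This is a minimal sketch in Python, not
PowerBuilder code. The helper name serialize and the choice of the euro
sign (U+20AC) are assumptions made for this example.

```python
def serialize(text: str) -> dict:
    """Return the hex byte sequence of text under each encoding scheme."""
    return {
        "UTF-8": text.encode("utf-8").hex(),
        "UTF-16BE": text.encode("utf-16-be").hex(),  # most significant byte first
        "UTF-16LE": text.encode("utf-16-le").hex(),  # least significant byte first
    }

# The euro sign is the single UTF-16 code unit 0x20AC.
for scheme, hex_bytes in serialize("\u20ac").items():
    print(f"{scheme:>8}: {hex_bytes}")
```

The two UTF-16 schemes contain the same two bytes in opposite order,
while UTF-8 has only one possible byte order.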


This document is taken from the PowerBuilder help.