Configure clients for Unicode
When you set up a server to work in Unicode mode, the client determines
what character set to use by examining the current environment.
Generally, you have nothing more to do to get a correct
translation. For example, a UNIX client examines the LANG or
LOCALE variables to determine the appropriate character set.
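As a quick way to see what the client is likely to detect on a UNIX workstation, the following sketch prints the variables named above together with the character map of the current locale. The function name `show_locale_env` is made up for this illustration; `locale charmap` is the standard POSIX utility, not a Helix Core Server command.

```shell
# Print the locale-related variables and the encoding they imply.
# show_locale_env is a hypothetical helper name used only in this example.
show_locale_env() {
    echo "LANG=${LANG:-<unset>}"
    echo "LOCALE=${LOCALE:-<unset>}"
    # locale(1) reports the character map (encoding name) of the current locale.
    echo "charmap=$(locale charmap 2>/dev/null)"
}

show_locale_env
```

If the reported character map is not the one your files actually use, that is a hint that you may need to override the automatic selection.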
However, there might be situations when you need to override the
selection made by the client:
- The automatically selected setting is producing bad translations.
See Troubleshoot user workstations in Unicode installations for more information.
- You want to use separate workspaces (clients) and each of these needs
to use a different character set. In this case, you must set a
different P4CHARSET value for each client.
- The files you check out need to be accessed by applications for which byte order is important.
See Unicode character sets and Byte Order Markers (BOMs) for more information.
- You need to set P4CHARSET to a utf16 or utf32 setting.
See Controlling translation of server output for more information.
- The file is checked out using Helix Core Server client applications that handle Unicode environments in different ways.
See Using other Helix Core Server client applications for more information.
In each of these cases, you will need to explicitly set
P4CHARSET to an appropriate value or take some other action.
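One possible way to give each workspace its own P4CHARSET is a per-directory configuration file named by the P4CONFIG variable. The sketch below is illustrative only: the workspace names `ws_a` and `ws_b`, the demo directory, and the chosen character sets are assumptions, not values taken from this guide.

```shell
# Demo location for two hypothetical workspace roots; substitute your own.
WS_ROOT="${WS_ROOT:-${TMPDIR:-/tmp}/p4-demo}"
mkdir -p "$WS_ROOT/ws_a" "$WS_ROOT/ws_b"

# Ask p4 to read settings from a file called .p4config found in the
# current directory or one of its parent directories.
export P4CONFIG=.p4config

# Each workspace root carries its own client name and character set.
printf 'P4CLIENT=ws_a\nP4CHARSET=utf8\n'    > "$WS_ROOT/ws_a/.p4config"
printf 'P4CLIENT=ws_b\nP4CHARSET=winansi\n' > "$WS_ROOT/ws_b/.p4config"
```

With this layout, commands run from inside one workspace root pick up that workspace's character set, so you do not have to re-export P4CHARSET when switching between them.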
To get a list of the possible values for P4CHARSET, use the
command:
$ p4 help P4CHARSET
Do not submit a file using a P4CHARSET that is different
from the one you used to sync it; the file is likely to be
translated incorrectly. In other words, do not change the value of
P4CHARSET while files are checked out.
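The warning above can be turned into a small guard. This is a sketch under assumptions: `safe_set_charset` is a hypothetical helper name, `p4` is assumed to be on the PATH and configured for the current workspace, and the check relies on `p4 opened` writing its list of open files to standard output and printing nothing there when no files are open.

```shell
# Change P4CHARSET only when the current workspace has no open files.
# safe_set_charset is a hypothetical helper, not part of the p4 client.
safe_set_charset() {
    new_charset="$1"
    # Any output on stdout is treated as "files are still checked out".
    if [ -n "$(p4 opened 2>/dev/null)" ]; then
        echo "Files are checked out; submit or revert them before changing P4CHARSET." >&2
        return 1
    fi
    export P4CHARSET="$new_charset"
}
```

Call it as `safe_set_charset utf8`. Note that the guard cannot tell an empty workspace from a failed connection, because both produce no standard output, so treat it as a convenience check rather than a guarantee.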
Unicode character sets and Byte Order Markers (BOMs)
Byte order markers (BOMs) are used in Unicode files to specify the order in which multi-byte characters are stored and to identify the file content as Unicode. Not all extended-character file formats use BOMs.
To ensure that Unicode files are translated correctly by the
Helix Core Server when they are synced or submitted, you must set
P4CHARSET to the character set that corresponds to the
format used on your workstation by the applications that access them,
such as text editors or IDEs. Typically the formats are listed when you
save the file using the menu
option.
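If you are unsure which format an application wrote, you can inspect the first bytes of a file for a BOM. The following is a minimal sketch using standard utilities; `detect_bom` is a hypothetical helper name, and the signatures it checks are the standard Unicode BOM byte sequences.

```shell
# Report which Unicode BOM, if any, starts the given file.
# detect_bom is a hypothetical helper used only in this example.
detect_bom() {
    # First four bytes as lowercase hex with no separators, e.g. "efbbbf41".
    sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        efbbbf*)  echo "UTF-8 with BOM" ;;
        # The UTF-32 patterns are tested before UTF-16 because the
        # UTF-32 little-endian BOM begins with the UTF-16 one.
        fffe0000) echo "UTF-32 little-endian with BOM" ;;
        0000feff) echo "UTF-32 big-endian with BOM" ;;
        fffe*)    echo "UTF-16 little-endian with BOM" ;;
        feff*)    echo "UTF-16 big-endian with BOM" ;;
        *)        echo "no BOM found" ;;
    esac
}
```

A result of "no BOM found" does not tell you the encoding; it only rules out the BOM variants, so you still need to know the format the application uses. A UTF-16 little-endian file whose first character after the BOM is U+0000 would be misreported as UTF-32, which is rare in text files.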
The following table lists valid settings for P4CHARSET for
specifying byte order properties of Unicode files.
| Client Unicode format | BOM? | Big or Little-Endian | Set P4CHARSET to | Remarks |
|---|---|---|---|---|
| UTF-8 | No | (N/A) | | Suppresses Helix Core Server UTF-8 validation |
| UTF-8 | Yes | (N/A) | | |
| UTF-8 | No | (N/A) | | |
| UTF-8 | Yes | (N/A) | | |
| UTF-16 | Yes | Per client | | Synced with a BOM according to the client platform byte order |
| UTF-16 | Yes | Little | | |
| UTF-16 | Yes | Big | | |
| UTF-16 | No | Per client | | |
| UTF-16 | No | Little | | |
| UTF-16 | No | Big | | |
| UTF-32 | Yes | Per client | | Synced with a BOM according to the client platform byte order |
| UTF-32 | Yes | Little | | |
| UTF-32 | Yes | Big | | |
| UTF-32 | No | Per client | | |
| UTF-32 | No | Little | | |
| UTF-32 | No | Big | | |
If you set P4CHARSET to a UTF-8 setting, the
Helix Core Server
does not translate text files when you sync or submit them.
The Helix Core Server
does, however, verify that such files contain valid UTF-8 data.
Controlling translation of server output
If you set P4CHARSET to any utf16 or
utf32 setting, you must also set
P4COMMANDCHARSET to a character set other than utf16 or
utf32, namely the one in which you want server output
displayed. "Server output" includes informational and error messages,
diff output, and information returned by reporting commands.
To specify P4COMMANDCHARSET on a per-command basis, use the
-Q flag. For example, to display all filenames in the depot,
as translated using the winansi code page, issue the
following command:
C:\> p4 -Q winansi files //...
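To set the same thing for a whole UNIX shell session rather than per command, you can export both variables. The sketch below also includes a small sanity check of the rule described in this section; `check_command_charset` is a hypothetical helper name, and `utf8` is used here only as an example of a non-utf16, non-utf32 value.

```shell
# File content uses UTF-16; server output is displayed as UTF-8.
export P4CHARSET=utf16
export P4COMMANDCHARSET=utf8

# Return non-zero when P4CHARSET is a utf16/utf32 setting but
# P4COMMANDCHARSET is missing or is itself a utf16/utf32 setting.
# check_command_charset is a hypothetical helper, not a p4 feature.
check_command_charset() {
    case "${P4CHARSET:-}" in
        utf16*|utf32*)
            case "${P4COMMANDCHARSET:-}" in
                ""|utf16*|utf32*)
                    echo "Set P4COMMANDCHARSET to a non-utf16, non-utf32 character set." >&2
                    return 1 ;;
            esac ;;
    esac
    return 0
}

check_command_charset
```

The check only looks at the variable names' prefixes, so it is a convenience for catching an obviously incomplete environment, not a validation of the values themselves.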
Using other Helix Core Server client applications
If you are using other Helix Core Server client applications, note how they handle Unicode environments:
- P4V (Helix Core Visual Client): the first time you connect to a Unicode-mode server, you are prompted to choose the character encoding. Thereafter, P4V retains your selection in association with the connection. P4V also has a global default setting for Charset. If you set this, it will be used instead of asking you to provide a charset.
- P4Eclipse will ask for a charset when connecting to a Unicode-mode server.
- P4Merge: To configure the character encoding used by P4Merge,
choose P4Merge’s menu option. When launched from
P4V, P4Merge uses P4V’s P4CHARSET instead of the one defined in its preferences.
- P4GT and P4EXP, the Helix Plugin for File Explorer, use environmental settings and will fail with a Unicode-mode server.






