Setting up a server for Unicode

How you configure a Unicode-mode server and the workstations that access it depends on whether you are starting a server for the first time or converting an existing non-Unicode server to Unicode mode. The following sections explain each case.

Note

The Perforce service limits the lengths of strings used to index job descriptions, to specify filenames and view mappings, and to identify client workspaces, labels, and other objects. The most common limit is 2,048 bytes. Because no basic Unicode character expands to more than three bytes, you can ensure that no name exceeds the Perforce limit by limiting the length of object names and view specifications to 682 characters for Unicode-mode servers.
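
The arithmetic behind the 682-character figure can be checked with a short shell sketch. The helper names utf8_bytes and fits_limit are hypothetical and are not part of any Perforce tool; the sketch only assumes the 2,048-byte limit and the three-bytes-per-character worst case stated in the note.

```shell
#!/bin/sh
# Byte limit and worst-case character width, both taken from the note above.
LIMIT_BYTES=2048
MAX_BYTES_PER_CHAR=3

# Integer division: 2048 / 3 = 682, with 2 bytes left over.
SAFE_CHARS=$(( LIMIT_BYTES / MAX_BYTES_PER_CHAR ))
echo "safe character limit: $SAFE_CHARS"

# utf8_bytes (hypothetical helper): number of bytes in a string as stored.
# wc -c counts bytes, not characters, regardless of locale.
utf8_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# fits_limit (hypothetical helper): succeeds if the name is within the byte limit.
fits_limit() {
    [ "$(utf8_bytes "$1")" -le "$LIMIT_BYTES" ]
}
```

For example, fits_limit "//depot/main/..." succeeds, because the name is well under 2,048 bytes in UTF-8.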

Configuring a new server for Unicode

To configure a new server for Unicode, start the server using the following command:

$ p4d -xi -r server_root [other options]

This command verifies that all existing metadata is valid UTF-8, and then sets the protected counter unicode to indicate that the server now runs in Unicode mode. If you stop and restart the server, it remains in Unicode mode. Once you have placed the server in this mode, you cannot change it back to non-Unicode mode.

When a client connects to the server, it attempts to discover what the server’s setting is, and it sets the P4_port_CHARSET variable to reflect that setting. If the server is not in Unicode mode, the variable is set to none. If the server is set to Unicode, the variable is set to auto. Likewise, the client sets the P4CHARSET variable to auto. The client then examines its environment to figure out what character set it needs to select.

The P4_port_CHARSET variable is stored in a file called .p4enviro. By default, this file is stored in the user’s home directory. To change the file location, the user must set the P4ENVIRO variable to the desired path.
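
As an illustration, the following sketch points P4ENVIRO at a different file and looks up the stored character set for a given port. The file layout (one NAME=value entry per line), the sample port names, and the helper charset_for_port are all assumptions made for this example; check an actual .p4enviro file before relying on the format.

```shell
#!/bin/sh
# Scratch location for the example; a real setup would use a fixed path.
ENVIRO_DIR="$(mktemp -d)"
P4ENVIRO="$ENVIRO_DIR/p4enviro"
export P4ENVIRO

# Sample content in the ASSUMED format: one NAME=value entry per line.
# "perforce:1666" and "legacy:1666" stand in for the port part of P4_port_CHARSET.
cat > "$P4ENVIRO" <<'EOF'
P4_perforce:1666_CHARSET=auto
P4_legacy:1666_CHARSET=none
EOF

# charset_for_port (hypothetical helper): print the stored value for one port.
charset_for_port() {
    while IFS='=' read -r name value; do
        if [ "$name" = "P4_${1}_CHARSET" ]; then
            printf '%s\n' "$value"
        fi
    done < "$P4ENVIRO"
}

charset_for_port "perforce:1666"
```

In this sample, the first port corresponds to a Unicode-mode server (auto) and the second to a server that is not in Unicode mode (none).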

Configuring an existing server for Unicode

To convert an existing server to Unicode mode, perform the following steps:

  1. Stop the server by issuing the p4 admin stop command.
  2. Create a server checkpoint, as described in Backup and Recovery.
  3. Convert the server to Unicode mode by invoking the server (p4d) and specifying the -xi flag, for example:

    p4d -xi -r server_root

    The server verifies that its existing metadata contains only valid UTF-8 characters, then creates and sets a protected configurable called unicode that is used as a flag to ensure that the next time you start the server, it runs in Unicode mode. After validating metadata and setting the configurable, p4d exits and displays the following message:

    Server switched to Unicode mode.

    If the server detects invalid characters in its metadata, it displays error messages like the following:

    Table db.job has 7 rows with invalid UTF8.

    In case of such errors, contact Perforce Technical Support for instructions on locating and correcting the invalid characters.

  4. Restart p4d, specifying server root and port as you normally do. The server now runs in Unicode mode.
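
The four steps above can be sketched as a script. This is a minimal outline, not a supported procedure: the server root and port values are placeholders, the checkpoint command (p4d -jc) and the restart options (-p, -d) are assumptions about a typical installation, and the run helper is hypothetical. By default the script only prints the commands.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace them with your own settings.
SERVER_ROOT="${SERVER_ROOT:-/path/to/server_root}"
SERVER_PORT="${SERVER_PORT:-1666}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; anything else = execute them

# run (hypothetical helper): print a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

convert_to_unicode() {
    run p4 admin stop                               || return 1  # step 1: stop the server
    run p4d -r "$SERVER_ROOT" -jc                   || return 1  # step 2: checkpoint (assumed command)
    run p4d -xi -r "$SERVER_ROOT"                   || return 1  # step 3: convert to Unicode mode
    run p4d -r "$SERVER_ROOT" -p "$SERVER_PORT" -d  || return 1  # step 4: restart (assumed options)
}

convert_to_unicode
```

Stopping at the first failing command matters most for step 3: if p4d reports rows with invalid UTF-8, the server should not be restarted until the metadata has been corrected.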

When a client connects to the server, it attempts to discover what the server’s setting is, and it sets the P4_port_CHARSET variable to reflect that setting. If the server is not in Unicode mode, the variable is set to none. If the server is set to Unicode, the variable is set to auto. Likewise, the client sets the P4CHARSET variable to auto. The client then examines its environment to figure out what character set it needs to select.

The P4_port_CHARSET variable is stored in the .p4enviro file, as described in the previous section; its default location depends on your operating system.

Localizing server error messages

By default, the Helix Versioning Engine informational and error messages are in English. You can localize server messages. To ensure best results, contact Perforce Technical Support. The following overview explains the localization process.

To localize Helix Versioning Engine messages:

  1. Obtain the message file from Perforce Technical Support.
  2. Edit the message file, translating messages to the target language. Each message includes a two-character language code. Change the language code from en (English) to the code for the target language. Do not translate any of the key parameters or named parameters (which are specified between percent signs and single quotes, for example, %depot%). You can change the order in which the parameters appear in the message.

    Original English:

    @en@ 0 @db.message@ @en@ 822220833 @Depot '%depot%' unknown - use 'depot' to create it.@

    Correct translation to Portuguese:

    @pt@ 0 @db.message@ @pt@ 822220833 @Depot '%depot%' inexistente - use o comando 'depot' para criá-lo.@

    Although you are free to use any two-letter language code to designate the target language (so long as it is not "en"), you might want to use a standard convention, such as the one described here:

    http://www.w3schools.com/tags/ref_language_codes.asp

    Many messages use Perforce command names. It is important to distinguish the word as a command name from the word as a description. For example:

    @Depot '%depot%' unknown - use 'depot' to create it.@

    In this case, depot and %depot% should not be translated.

  3. Load the translated messages into the server by issuing the following command:

    $ p4d -jr /fullpath/message.txt

    This command creates a db.message file in the server root. The Perforce service uses this database file when it displays error messages. The Perforce proxy can also use this db.message file; see the section on localizing P4P in the Helix Versioning Engine Administrator Guide: Multi-Site Deployment.

  4. For Unicode-mode servers, the translated message file must be encoded in UTF-8 and must not begin with a byte order mark (BOM).

    If the target server is not in Unicode mode, the translation file does not need to be in UTF-8. In this case you might want multiple instances of the translated messages in multiple character sets. You can effect this by combining the language code field with a character set name. For example, @ru_koi8-r@ to indicate Russian with a koi8-r encoding versus @ru_iso8859-5@ to indicate Russian with an ISO encoding.

  5. You can load translated message files into a p4d server by recovering them with the server’s journal recovery command:

    $ p4d -r server_root -jr translated_message_file
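
Before loading a translated file, it can help to confirm that each translated message keeps the same named parameters as the English original and that the file does not start with a BOM. The following is a rough sketch under those assumptions; params, same_params, and has_bom are hypothetical helpers, not Perforce commands.

```shell
#!/bin/sh
# params (hypothetical helper): list the %name% parameters in a message, sorted.
params() {
    printf '%s\n' "$1" | grep -o '%[A-Za-z0-9_][A-Za-z0-9_]*%' | sort
}

# same_params (hypothetical helper): succeeds if both messages use the same parameters.
same_params() {
    [ "$(params "$1")" = "$(params "$2")" ]
}

# has_bom (hypothetical helper): succeeds if the file starts with the UTF-8 BOM (EF BB BF).
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

original="@en@ 0 @db.message@ @en@ 822220833 @Depot '%depot%' unknown - use 'depot' to create it.@"
translated="@pt@ 0 @db.message@ @pt@ 822220833 @Depot '%depot%' inexistente - use o comando 'depot' para criá-lo.@"

if same_params "$original" "$translated"; then
    echo "parameters preserved"
else
    echo "parameter mismatch"
fi
```

A translation that drops a percent sign, such as '%depot' instead of '%depot%', is reported as a mismatch because the parameter no longer appears in the translated message. Because the parameter lists are sorted before comparison, reordering parameters within a message is still accepted.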

To view localized messages, set the P4LANGUAGE environment variable on user workstations to the language code you assigned to the messages in the translated message file. For example, to have your messages returned in Portuguese, set P4LANGUAGE to pt.
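
On a workstation with a POSIX shell, the setting might look like the following sketch. The value pt is the language code from the Portuguese example; making the setting persistent (for example, through a shell profile) is left to your environment.

```shell
#!/bin/sh
# "pt" must match the language code used in the translated message file.
P4LANGUAGE=pt
export P4LANGUAGE
echo "P4LANGUAGE=$P4LANGUAGE"
```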

To view localized messages using P4V, you must set the LANG environment variable to the language code that you use in the messages file.