
Localization and You: UTF-8 on NetBSD

NetBSD is a great little operating system, but it’s a much smaller project than Linux. This means there is less demand for better internationalization support, as most of the users and developers are perfectly comfortable with ASCII or the ISO-8859-1 western European character set. This can cause some problems when using software that expects Unicode text encoded as UTF-8, also known as the one true text encoding for the future. Here’s how to fix it. These instructions assume you’re using a Bourne-compatible shell like ksh, bash, or zsh. If you’re using (t)csh you’re on your own.

Environment Variables

Most of the time, you can “fake” proper UTF-8 support by exporting three environment variables and leaving it up to your local terminal emulator to handle the rest. Add the following three lines to your ~/.profile:

export LANG="en_US.UTF-8"
export LC_CTYPE="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"

Save the file, kill any screen or tmux sessions and other background processes, and log out. When you log in again, you should have a proper UTF-8 terminal as far as most programs are concerned.
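To confirm that the new settings took effect after logging back in, something like the following sketch may help. It relies on the standard precedence rule (LC_ALL overrides LC_CTYPE, which overrides LANG); the function names effective_ctype and wants_utf8 are made up for this example and are not part of any standard tool.

```shell
# Print the locale name that governs character handling, following the
# standard precedence: LC_ALL first, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Succeed if the locale name in $1 asks for a UTF-8 character set.
wants_utf8() {
    case "$1" in
        *.UTF-8*|*.utf-8*|*.UTF8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what the current session is asking for.
if wants_utf8 "$(effective_ctype)"; then
    echo "UTF-8 requested: $(effective_ctype)"
else
    echo "UTF-8 not requested: $(effective_ctype)"
fi
```

This only tells you what the environment requests, not whether the system can actually provide that locale.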

Perl will print the following warning when invoked, because the system cannot load the requested locale:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LC_ALL = "en_US.UTF-8",
        LC_CTYPE = "en_US.UTF-8",
        LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

Feel free to ignore this warning. As long as you’ve got those environment variables set, you should be fine.
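If you are curious why Perl complains, you can check whether the locale is in the list the system reports. This is a rough sketch: it assumes the locale utility with its -a option is available, and locale_listed is a helper name invented for this example. Spelling varies between systems (some list en_US.utf8 instead), so treat a miss as a hint rather than proof.

```shell
# Succeed if the locale named in $1 appears, as a whole line,
# in the list read from standard input.
locale_listed() {
    grep -Fqx -- "$1"
}

# Ask the system which locales it knows and look for ours.
if locale -a 2>/dev/null | locale_listed "en_US.UTF-8"; then
    echo "en_US.UTF-8 is installed"
else
    echo "en_US.UTF-8 is not listed; Perl will fall back to the C locale"
fi
```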

Python 3 assumes source files are UTF-8 text unless they declare another encoding, so please make sure to change these settings before working on Python 3 code.

Rxvt-Unicode

Rxvt-Unicode, urxvt, rxvt-unicode-256color. By whatever name you call it, it’s a very popular terminal emulator among Linux and *BSD “power users.” Unfortunately, using urxvt adds an extra degree of difficulty to connecting to SDF: NetBSD has no terminal entry that corresponds with urxvt’s $TERM setting! I’m sure some of you have tried logging in to SDF from urxvt, only to have scary warnings printed to stderr and have everything treated like a dumb paper teletype. Don’t worry, there’s a very simple fix for that as well. Open up ~/.profile again and add these lines:

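# urxvt is not recognized here, so fall back to plain rxvt.
# A single "=" is the portable test operator; "==" fails under zsh.
if [ "$TERM" = "rxvt-unicode" ] || [ "$TERM" = "rxvt-unicode-256color" ]; then
   export TERM="rxvt"
fi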

In simple terms, this tricks NetBSD into thinking your terminal is rxvt, the original program urxvt is based on. However, the same volume of home directories is mounted by the OpenBSD machine beastie, which does have an entry for rxvt-unicode in its terminfo database! So if you log in to both systems on a regular basis, and on both systems you use a shell that sources .profile, your OpenBSD experience might be needlessly downgraded. In that case, wrap the test above in another condition:

# Only downgrade TERM on NetBSD; OpenBSD already knows rxvt-unicode.
if [ "$(uname)" = "NetBSD" ]; then
   if [ "$TERM" = "rxvt-unicode" ] || [ "$TERM" = "rxvt-unicode-256color" ]; then
      export TERM="rxvt"
   fi
fi
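A possible alternative to testing the operating system name is to ask the machine whether it knows your terminal at all, and only downgrade when it does not. The sketch below assumes the infocmp utility is installed (it exits with an error for unknown terminal names); fallback_term and term_known are names invented for this example. Note that on a machine without infocmp, a urxvt session would always be downgraded to rxvt.

```shell
# Map a urxvt terminal name to the plain rxvt entry; leave others alone.
fallback_term() {
    case "$1" in
        rxvt-unicode|rxvt-unicode-256color) printf 'rxvt\n' ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Succeed if this system has a terminal description for the name in $1.
term_known() {
    infocmp "$1" >/dev/null 2>&1
}

# Downgrade TERM only when the local terminal database lacks it.
if ! term_known "${TERM:-dumb}"; then
    TERM=$(fallback_term "${TERM:-dumb}")
    export TERM
fi
```

Because the check is made on whichever machine is sourcing .profile, the same lines should behave sensibly on both the NetBSD and OpenBSD hosts.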

If you have a MetaArpa account, don’t worry - the MetaArray is running Debian, which understands urxvt just fine.

Escape Characters

NetBSD’s terminal has what are called “escape characters.” These are characters in the “high ASCII” range (decimal 128–255) that manipulate the shell session when read from stdin or written to stdout. As you might imagine, this interferes with programs that write large amounts of arbitrary bytes to standard output, like the “kermit -s” or “sz” file transfer programs.

For sx/sy/sz (the X/Y/ZMODEM protocols), your best bet is to just not use them with SDF for now. If you’re on a TCP/IP connection (which most of you are), it’s easier to stick with scp/sftp for secure transfers, and http or ftp for insecure ones.

If you really need “in-line” file transfer, you can make “kermit -s” work around NetBSD’s escape characters by adding the “-8” and “-0” flags. If I wanted to transfer the SQLite database “winning-lottery-numbers.sqlite” from SDF to my home machine, I would do it like this:

tidux@sdf:~$ kermit -s -8 -0 winning-lottery-numbers.sqlite

Then my local kermit program would receive the transfer and I could continue working on SDF as usual. If you do this often, it may be wise to add an alias in your shell configuration files, like so:

alias send='kermit -s -8 -0'
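If you would rather catch a mistyped file name before kermit starts, a small shell function can do the same job as the alias. This is only a sketch: ksend is a made-up name (chosen so it does not clash with the alias above), and the kermit flags are copied unchanged from the command shown earlier.

```shell
# Send one or more files with kermit, refusing to start if any of
# them is missing or unreadable.
ksend() {
    if [ "$#" -eq 0 ]; then
        echo "usage: ksend FILE..." >&2
        return 2
    fi
    for ksend_file in "$@"; do
        if [ ! -r "$ksend_file" ]; then
            echo "ksend: cannot read $ksend_file" >&2
            return 1
        fi
    done
    # Same flags as the alias: -s to send, plus -8 and -0.
    kermit -s -8 -0 "$@"
}
```

Usage is the same as the alias, for example: ksend winning-lottery-numbers.sqlite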

I hope this guide has been helpful to you. Happy UTF-8 hacking!



localization_and_you_-_utf_8_on_netbsd.1644266936.txt.gz · Last modified: 2022/02/07 20:48 by jquah