decoding keyboard input when using curses

Sun May 31 18:23:39 EDT 2009

On Sun, May 31, 2009 at 02:30:54PM EDT, Arnaud Delobelle wrote:
> Chris Jones <cjns1989 at gmail.com> writes:

> [...]

> > Try this:
> >
> > #include <locale.h>
> > #include <ncurses.h>
> > #include <stdlib.h>
> > #include <stdio.h>
> > #include <string.h>
> 
> /* Here I need to add the following include to get wint_t on Mac OS X */
> 
> #include <wctype.h>

Ah.. interesting. My posts were rather Linux-centric, I must say.

Naturally, the library and header file setup on Mac OS X is likely to
differ.

[..]

> > gcc -lncursesw uni10.c -o uni10   # different lib..
> >              ^
> 
> My machine doesn't know about libncursesw:
> 
> marigold:c arno$ ls /usr/lib/libncurses*
> /usr/lib/libncurses.5.4.dylib
> /usr/lib/libncurses.dylib
> /usr/lib/libncurses.5.dylib
> 
> So I've compiled it with libncurses as before and it works.

Nothing to complain about.. makes it even more seamless..!

> This is what I get:
> 
> If I run the program and type 'é', I get a code of 'e9'.
> 
> In python:
> 
> >>> print '\xe9'.decode('latin1')
> é
> 
> So it has been encoded using isolatin1.  I really don't understand why.
> I'll have to investigate this further.

That's why I tested both of my efforts with a euro symbol, entered via
Compose + E + =, which comes through as 0x20AC (U+20AC) in a UTF-8 setup.
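
To spell out why the euro makes a less ambiguous test than 'é': the
code point of 'é' is U+00E9, which is also its single-byte Latin-1
encoding, so a code of 'e9' on its own does not tell you which of the
two you are looking at. The euro has no Latin-1 encoding at all. A small
sketch in Python 3 (not the Python 2 used in the quoted session; the
variable names are mine):

```python
# Python 3 sketch: how the two test characters look in each encoding.
e_acute = "\u00e9"   # é
euro = "\u20ac"      # €

print(hex(ord(e_acute)))                 # 0xe9   (code point)
print(e_acute.encode("latin-1").hex())   # e9     (same value as the code point)
print(e_acute.encode("utf-8").hex())     # c3a9   (two bytes in UTF-8)
print(euro.encode("utf-8").hex())        # e282ac (three bytes in UTF-8)

# ISO 8859-1 (Latin-1) has no euro sign, so this encode fails.
try:
    euro.encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
print(euro_in_latin1)                    # False
```

So seeing 0x20AC can only mean a decoded wide character, whereas 0xE9
could be either a decoded U+00E9 or a raw Latin-1 byte.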

> If I change the line:
> 
>    setlocale(LC_ALL, "");                 /* make sure UTF8       */
> 
> to
> 
>    setlocale(LC_ALL, "en_GB.UTF-8");       /* make sure UTF8       */
> 
> then the behaviour is the same as before (i.e. get_wch() gets called
> twice instantly).
> 
> I'll do some more investigating (when I can think of *what* to
> investigate) and I will tell you my findings.

This is really strange.
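
One thing that might be worth checking, and this is only a guess at
where to look, is which codeset the locale call actually ends up
selecting on your machine. A minimal sketch in Python 3, assuming a Unix
system where locale.nl_langinfo() is available (active_codeset is just a
name I made up for the helper):

```python
import locale

def active_codeset():
    """Return the codeset chosen by the environment's locale settings."""
    try:
        # Same idea as setlocale(LC_ALL, "") in C: take LANG / LC_* as given.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; the previous one stays active.
        pass
    return locale.nl_langinfo(locale.CODESET)

print(active_codeset())
```

If this prints something other than UTF-8, the terminal and the program
may disagree about the encoding of what is being typed.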

I am absolutely certain that I had gotten my initial version, with the
old getch() .. etc. routines, to work, but since I continued
experimenting with the same source snippet, I no longer have that
working version available to investigate where I erred.

I copy/pasted the snippet back from my earlier post.. and so far I
haven't been able to get it to work again.. :-(

The logic behind my assuming that getch() et al. should work
transparently in a UTF-8 context is that:

1. the 3NCURSES man pages are rather detailed and I could not find UTF-8
   or "unicode" mentioned anywhere apart from some obscure configuration
   option that is likely turned on everywhere by default.

2. More importantly, having to switch to the "wide character" macros or
   functions instead of the "narrow" versions would have meant that just
   about every application that uses ncurses would have had to be
   heavily patched - I would have imagined that all the changes would be
   made in the library in a transparent fashion so that the maintainers
   of such apps would only have had to link against the new ncurses
   lib..??? 
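
For what it's worth, if the narrow getch() hands back a multibyte
character one byte per call (which is my understanding of how it behaves
in a UTF-8 locale, but treat that as an assumption), the application can
still reassemble the characters itself. A sketch in Python 3 that does
not touch curses at all: the list of integers stands in for successive
getch() return values, and assemble_keys is a hypothetical helper, not
anything from the library:

```python
import codecs

def assemble_keys(byte_values, encoding="utf-8"):
    """Group single-byte key codes into complete characters.

    byte_values stands in for successive return values of a narrow
    getch(); it is made-up input, not read from a terminal. Values
    above 255 (e.g. function-key codes) are not handled here.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    chars = []
    for value in byte_values:
        # decode() returns "" until a full multibyte sequence has arrived.
        text = decoder.decode(bytes([value]))
        if text:
            chars.append(text)
    return chars

# 'a', then the euro sign as three UTF-8 bytes, then 'b'
keys = [0x61, 0xE2, 0x82, 0xAC, 0x62]
print([hex(ord(c)) for c in assemble_keys(keys)])  # ['0x61', '0x20ac', '0x62']
```

That is more or less the work get_wch() is supposed to do for you, which
is why I would have expected it to be hidden inside the library.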

I'll keep you posted if I find something useful.

CJ