People seem to be entering lots of text using crippled telephone
keypads these days. I personally hate these things, because they're
so much slower than a regular keyboard, but I have to admit that
carrying such a keyboard in your pocket is slightly impractical.
The layout used on keypads is great for new users. I mean, since the
letters are alphabetically ordered, you don't have to hunt that much for
the key you want. However, a lot of phone users use their keypads so
much that they know the layout by heart. I'm not one of them, but when I
see them, I wonder: "wouldn't it be possible to create a layout which is
less intuitive but more efficient?"
Optimizing for frequency (without T9)
Of course, the answer is yes. If we arranged the letters on the keys
in a way which ensures that the most frequent letters only require one
keypress to reach, things would probably be a lot better.
Let's do the math. I assume that we will only be remapping the '2',
'3', '4', '5', '6', '7', '8' and '9' keys. The layout used by mobile
phones today is:
- 2: ABC
- 3: DEF
- 4: GHI
- 5: JKL
- 6: MNO
- 7: PQRS
- 8: TUV
- 9: WXYZ
Using letter frequency data for the English language, here is a
possible layout optimized for letter frequency:
- 2: ERG
- 3: TDY
- 4: ALP
- 5: OCB
- 6: IUV
- 7: NMKQ
- 8: SWJ
- 9: HFXZ
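The layout above can be rebuilt with a simple greedy rule: sort the letters by frequency, hand the eight most frequent ones the first-press slot of each key, the next eight the second-press slots, and so on. Here is a minimal Python sketch of that rule. The frequency table is an assumption (a commonly cited table of English letter frequencies, in percent; I am not claiming it is the exact data used for this post), and the names `FREQ`, `CAPACITY` and `frequency_layout` are made up for the example.

```python
# ASSUMPTION: commonly cited English letter frequencies, in percent.
FREQ = {
    "E": 12.702, "T": 9.056, "A": 8.167, "O": 7.507, "I": 6.966, "N": 6.749,
    "S": 6.327, "H": 6.094, "R": 5.987, "D": 4.253, "L": 4.025, "C": 2.782,
    "U": 2.758, "M": 2.406, "W": 2.360, "F": 2.228, "G": 2.015, "Y": 1.974,
    "P": 1.929, "B": 1.492, "V": 0.978, "K": 0.772, "J": 0.153, "X": 0.150,
    "Q": 0.095, "Z": 0.074,
}

# Number of letters each key holds on the standard keypad.
CAPACITY = {"2": 3, "3": 3, "4": 3, "5": 3, "6": 3, "7": 4, "8": 3, "9": 4}


def frequency_layout(freq, capacity):
    """Fill the first-press slot of every key, then the second-press slots, etc."""
    letters = sorted(freq, key=freq.get, reverse=True)
    # Slots in the order they get filled: all keys at position 0, then 1, ...
    slots = [key
             for position in range(max(capacity.values()))
             for key in capacity
             if capacity[key] > position]
    layout = {key: "" for key in capacity}
    for letter, key in zip(letters, slots):
        layout[key] += letter
    return layout


layout = frequency_layout(FREQ, CAPACITY)
for key, letters in layout.items():
    print(key, letters)
```

With the assumed table, this prints the same eight groups as the list above. A different frequency table could swap letters whose frequencies are close.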
Still using the letter frequency data, it's easy to compute the
average number of keypresses per letter:
- 2.149 keypresses per letter for the default layout
- 1.462 keypresses per letter for the alternative layout
The bottom line is that this simple alternative layout requires
32% fewer keypresses than the default one.
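For anyone who wants to check the arithmetic: the average is just the sum, over all letters, of the letter's frequency times its position on its key (1 for the first letter, 2 for the second, and so on). A minimal Python sketch follows. The frequency table is an assumption (a commonly cited table of English letter frequencies, in percent), and `average_keypresses` is a name invented for the example.

```python
# ASSUMPTION: commonly cited English letter frequencies, in percent.
FREQ = {
    "E": 12.702, "T": 9.056, "A": 8.167, "O": 7.507, "I": 6.966, "N": 6.749,
    "S": 6.327, "H": 6.094, "R": 5.987, "D": 4.253, "L": 4.025, "C": 2.782,
    "U": 2.758, "M": 2.406, "W": 2.360, "F": 2.228, "G": 2.015, "Y": 1.974,
    "P": 1.929, "B": 1.492, "V": 0.978, "K": 0.772, "J": 0.153, "X": 0.150,
    "Q": 0.095, "Z": 0.074,
}

DEFAULT = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
           "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
ALTERNATIVE = {"2": "ERG", "3": "TDY", "4": "ALP", "5": "OCB",
               "6": "IUV", "7": "NMKQ", "8": "SWJ", "9": "HFXZ"}


def average_keypresses(layout, freq):
    """Expected number of presses per letter: frequency-weighted key position."""
    total = sum(freq.values())
    presses = sum(freq[letter] * position
                  for letters in layout.values()
                  for position, letter in enumerate(letters, start=1))
    return presses / total


default_cost = average_keypresses(DEFAULT, FREQ)
alternative_cost = average_keypresses(ALTERNATIVE, FREQ)
saving = 1 - alternative_cost / default_cost

print(f"default:     {default_cost:.3f}")
print(f"alternative: {alternative_cost:.3f}")
print(f"saving:      {saving:.0%}")
```

Under the assumed table this gives 2.149 and 1.462 presses per letter, a saving of about 32%. Note that it counts letters only and ignores spaces, digits and punctuation.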
Optimizing for key repetition (still without T9)
The previous analysis fails to take into account the fact that,
whenever you want to type two characters in succession which are on the
same key, you have to wait, because the phone can't tell where the first
character ends and the second begins. Some phone users prefer to type a
dummy character and then delete it, which removes the time penalty at the
cost of two keypresses, or, more efficiently, to press the "right" key
(only one keypress).
Still, it might be possible to optimize the layout by ensuring that
the letters of the most frequent digraphs are on different keys. We can
try to do this without losing the frequency-optimization we did earlier,
just by exchanging letters which require the same number of keypresses.
(I'm not saying that it's the most efficient way to do it, just
that it's a simple way to improve things which preserves the
optimization we did above.)
I could compute the optimal permutation using English digraph
frequency data, but this would take some work. I might do it in a future
blog entry. In any case, the default layout clearly hasn't been
optimized for this.
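To give an idea of what that computation could look like, here is a rough Python sketch. It counts how often two different letters on the same key occur next to each other, then greedily swaps letters that sit at the same position on two keys whenever the swap lowers that count. This is a local search, not a search for the true optimum, and the tiny `SAMPLE` text is only a stand-in for real English digraph data. All names are invented for the example.

```python
from itertools import combinations

ALTERNATIVE = {"2": "ERG", "3": "TDY", "4": "ALP", "5": "OCB",
               "6": "IUV", "7": "NMKQ", "8": "SWJ", "9": "HFXZ"}

# ASSUMPTION: a toy text standing in for a real digraph frequency table.
SAMPLE = "the quick brown fox jumps over the lazy dog and then runs there"


def digraph_counts(text):
    """Count adjacent letter pairs inside each word of the text."""
    counts = {}
    for word in text.upper().split():
        for a, b in zip(word, word[1:]):
            counts[a + b] = counts.get(a + b, 0) + 1
    return counts


def same_key_pairs(layout, counts):
    """Occurrences of digraphs whose two (different) letters share a key."""
    key_of = {ch: key for key, letters in layout.items() for ch in letters}
    # Doubled letters are skipped: no layout can put them on different keys.
    return sum(n for pair, n in counts.items()
               if pair[0] != pair[1] and key_of[pair[0]] == key_of[pair[1]])


def swap(layout, key_a, key_b, position):
    """Copy of the layout with the letters at `position` on two keys exchanged."""
    a, b = list(layout[key_a]), list(layout[key_b])
    a[position], b[position] = b[position], a[position]
    new = dict(layout)
    new[key_a], new[key_b] = "".join(a), "".join(b)
    return new


def improve(layout, counts):
    """Keep any same-position swap that lowers the penalty, until none does."""
    best, best_cost = layout, same_key_pairs(layout, counts)
    improved = True
    while improved:
        improved = False
        for key_a, key_b in combinations(list(best), 2):
            for position in range(min(len(best[key_a]), len(best[key_b]))):
                candidate = swap(best, key_a, key_b, position)
                cost = same_key_pairs(candidate, counts)
                if cost < best_cost:
                    best, best_cost, improved = candidate, cost, True
    return best, best_cost


counts = digraph_counts(SAMPLE)
before = same_key_pairs(ALTERNATIVE, counts)
tuned, after = improve(ALTERNATIVE, counts)
print("same-key digraphs before:", before)
print("same-key digraphs after: ", after)
print(tuned)
```

Since letters are only exchanged between identical positions, every letter still needs the same number of presses as before, so the average computed earlier is unchanged.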
Distinguishing ambiguities (with T9)
The T9 system requires other kinds of optimizations in order to
distinguish similar-looking words.
A first idea would be to ensure that the keys are as equiprobable as
possible (where the probability of a key is the sum of the probabilities
of its letters). This is an idea from information theory: if a key has a
very low total probability and the user almost never presses it, the
information given by the user for each keypress is reduced, and more
keypresses will be required.
The frequency-optimized layout I presented earlier is a "good"
solution to the problem of making all keys equiprobable, though it's not
the optimal one. The default layout, of course, is further from optimal still.
Once again, the efficiency difference could be computed using the
English letter frequency tables.
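One way to make that comparison concrete is to compute the entropy of the key distribution, i.e. the average number of bits a single keypress carries if letters are drawn independently from the frequency table. With eight keys, the ceiling is log2(8) = 3 bits. The Python sketch below does this under two assumptions: the frequency table is a commonly cited one (in percent), and letters are treated as independent, which real text is not. The function names are invented for the example.

```python
import math

# ASSUMPTION: commonly cited English letter frequencies, in percent.
FREQ = {
    "E": 12.702, "T": 9.056, "A": 8.167, "O": 7.507, "I": 6.966, "N": 6.749,
    "S": 6.327, "H": 6.094, "R": 5.987, "D": 4.253, "L": 4.025, "C": 2.782,
    "U": 2.758, "M": 2.406, "W": 2.360, "F": 2.228, "G": 2.015, "Y": 1.974,
    "P": 1.929, "B": 1.492, "V": 0.978, "K": 0.772, "J": 0.153, "X": 0.150,
    "Q": 0.095, "Z": 0.074,
}

DEFAULT = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
           "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
ALTERNATIVE = {"2": "ERG", "3": "TDY", "4": "ALP", "5": "OCB",
               "6": "IUV", "7": "NMKQ", "8": "SWJ", "9": "HFXZ"}


def key_probabilities(layout, freq):
    """Probability of each key: the sum of the probabilities of its letters."""
    total = sum(freq.values())
    return {key: sum(freq[ch] for ch in letters) / total
            for key, letters in layout.items()}


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


default_bits = entropy_bits(key_probabilities(DEFAULT, FREQ).values())
alternative_bits = entropy_bits(key_probabilities(ALTERNATIVE, FREQ).values())

print(f"default:     {default_bits:.3f} bits per keypress")
print(f"alternative: {alternative_bits:.3f} bits per keypress")
print(f"ceiling:     {math.log2(8):.3f} bits per keypress")
```

Under the assumed table, the alternative layout lands closer to the 3-bit ceiling than the default one, mostly because the default layout has two keys (5 and 9) that are rarely pressed.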
However, since the user doesn't enter random strings of letters
following the frequency distribution of the English language, but real
English words, it would be possible (though perhaps
computationally intensive), using a list of the English words with their
frequency, to find the layout which minimizes ambiguities. Once again,
doing this is left as an exercise to the reader.
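As a starting point for that exercise, here is a Python sketch of the quantity to minimize: the share of typed words which are not the first suggestion for their key sequence, assuming the phone always suggests the most frequent match first. The word counts in `WORDS` are invented for illustration, and a real run would need a proper word frequency list. The search over layouts itself is not shown.

```python
from collections import defaultdict

DEFAULT = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
           "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
ALTERNATIVE = {"2": "ERG", "3": "TDY", "4": "ALP", "5": "OCB",
               "6": "IUV", "7": "NMKQ", "8": "SWJ", "9": "HFXZ"}

# ASSUMPTION: toy word counts, made up for this example.
WORDS = {"good": 50, "home": 40, "gone": 20, "hood": 5,
         "the": 100, "tie": 8, "cat": 15, "act": 12, "bat": 6}


def to_digits(word, layout):
    """Key sequence typed for a word, one press per letter (as with T9)."""
    key_of = {ch: key for key, letters in layout.items() for ch in letters}
    return "".join(key_of[ch] for ch in word.upper())


def ambiguity(words, layout):
    """Share of word occurrences that are not the top match for their keys."""
    groups = defaultdict(list)
    for word, count in words.items():
        groups[to_digits(word, layout)].append(count)
    # Within each group, only the most frequent word is suggested first.
    missed = sum(sum(counts) - max(counts) for counts in groups.values())
    return missed / sum(words.values())


print("good ->", to_digits("good", DEFAULT))
print("home ->", to_digits("home", DEFAULT))
print(f"default ambiguity:     {ambiguity(WORDS, DEFAULT):.3f}")
print(f"alternative ambiguity: {ambiguity(WORDS, ALTERNATIVE):.3f}")
```

On the default layout, "good", "home", "gone" and "hood" all share the sequence 4663, which is the kind of collision a better layout would try to break up. The toy list was picked to show such collisions, so its numbers say nothing about which layout is better in practice.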
Using grammar (still with T9)
When you use T9, you see that it often suggests words which match
what you typed but make no grammatical sense in the current context
(suggesting "the" after "a", for instance). Wouldn't it be great if T9
only suggested words which "make sense" (or, rather, used the
grammatical plausibility of its suggestions as a way to order them)?
This might sound unrealistic, because it seems that it would require
your phone to understand (i.e. correctly parse) the sentences you're
writing. In fact, it doesn't. We could simply use an n-gram model to
tag the words entered by the user with their probable grammatical role
in the sentence as they are being typed. This would require a bit more
memory on the phone (to store the likely tags for each word, and the
likely n-grams of tags), but is computationally feasible: it is
no harder than standard predictive text, just done on tags instead of
letters.
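Here is a toy Python sketch of the idea, using a bigram model over tags (n = 2). Each candidate word is scored by its own frequency times the plausibility of its tags following the tag of the previous word. Every number and tag below is invented for illustration, and the score is a simple heuristic rather than a properly normalized probability. The names `WORD_FREQ`, `WORD_TAGS`, `TRANSITION`, `score` and `rank` are hypothetical.

```python
# ASSUMPTION: all figures below are made up for the example.

# Relative frequency of each candidate word.
WORD_FREQ = {"the": 0.05, "tie": 0.0002}

# Likely grammatical tags for each word, with their probabilities.
WORD_TAGS = {
    "the": {"DET": 1.0},
    "tie": {"NOUN": 0.7, "VERB": 0.3},
}

# P(tag | previous tag); "<s>" marks the start of a sentence.
TRANSITION = {
    ("<s>", "DET"): 0.30, ("<s>", "NOUN"): 0.10, ("<s>", "VERB"): 0.05,
    ("DET", "DET"): 0.001, ("DET", "NOUN"): 0.60, ("DET", "VERB"): 0.01,
}

# Small floor for tag pairs missing from the table.
UNSEEN = 1e-6


def score(word, previous_tag):
    """Word frequency weighted by how well its tags follow the previous tag."""
    fit = sum(p_tag * TRANSITION.get((previous_tag, tag), UNSEEN)
              for tag, p_tag in WORD_TAGS[word].items())
    return WORD_FREQ[word] * fit


def rank(candidates, previous_tag):
    """Order the words matching a key sequence, most plausible first."""
    return sorted(candidates, key=lambda w: score(w, previous_tag), reverse=True)


# "the" and "tie" are both typed 843 on the default keypad.
candidates = ["the", "tie"]
print("at sentence start:", rank(candidates, "<s>"))
print("after 'a' (DET):  ", rank(candidates, "DET"))
```

With these made-up numbers, "the" comes first at the start of a sentence, but "tie" overtakes it after a determiner such as "a", which is the behaviour described above.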
Conclusion
My point was not to develop the optimal keypad, just to show that
there are a lot of extremely simple ideas that could make keypads more
efficient but which, to my knowledge, haven't been explored. Sadly, this
is the kind of thing which should have been done right the first time,
because, once widespread, such layouts are well-nigh impossible to change. If we
didn't manage to get rid of QWERTY as a default keyboard layout and to
replace it with Dvorak or any of its more efficient variants, then what
are the odds of displacing the inefficient alphabetical keypad layout?