a3nm's blog

Ambiguous verbal forms in French

— updated

This list is incomplete! You should see the new list instead! Cette liste est incomplète! Regardez plutôt la nouvelle liste!

English explanations

Added "méprise" and "méprises", thanks to Erik McDonald.

In this post, I present a list of French verbal forms which are ambiguous because they can be derived from different infinitives (always exactly two). To determine what those derivations are (infinitive, mood, tense, person), you can use french-deconjugator from Verbiste. This (hopefully complete) list has been computed using Lexique. See also my list of non-homophonous homographs in French and my list of French words without rhymes.
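As a rough illustration of how such a list can be computed, here is a minimal Python sketch. It assumes the lexicon has already been reduced to (form, infinitive) pairs for verbs; the function name `ambiguous_forms` and the tiny sample are hypothetical, and this is neither the script actually used nor the real layout of the Lexique files.

```python
from collections import defaultdict


def ambiguous_forms(pairs):
    """Return {form: sorted infinitives} for forms derivable from several infinitives."""
    infinitives = defaultdict(set)
    for form, infinitive in pairs:
        infinitives[form].add(infinitive)
    return {form: sorted(infs) for form, infs in infinitives.items() if len(infs) > 1}


# Illustrative sample only (not taken from Lexique).
sample = [
    ("suis", "être"),
    ("suis", "suivre"),
    ("méprise", "mépriser"),
    ("méprise", "méprendre"),
    ("mange", "manger"),
    ("mange", "manger"),  # same infinitive seen twice: not ambiguous
]

print(ambiguous_forms(sample))
```

Using a set per form means that a form which appears several times for the same infinitive (different persons or tenses) is not reported; only forms reachable from distinct infinitives are kept.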

Explications en français

Ajout de "méprise" et "méprises", merci à Erik McDonald.

Ce post présente une liste de formes verbales du français qui sont ambiguës car elles peuvent être dérivées de plusieurs infinitifs différents (toujours exactement deux). Pour déterminer quelles sont les dérivations possibles (infinitif, mode, temps, personne), vous pouvez utiliser french-deconjugator de Verbiste. Cette liste que j'espère exhaustive a été calculée avec Lexique. Voir aussi ma liste d'homographes non-homophones en français (en anglais) et ma liste de mots français sans rimes.

The list / la liste

Random notes on the Hercules eCafe EX

— updated

I happened to stumble upon an interesting offer for the Hercules eCafe EX (163 EUR including shipping fees), and I couldn't resist and bought one. (I have to admit that I did hesitate with "ordinary" netbooks which offer more decent performance for only 40 EUR more, but I was tempted by the exoticism of an ARM architecture.)

Looking back at this, the most important point about the machine is: a strange glitch means that a few keypresses are lost every 30 seconds or so. This doesn't sound like much, but it's enough to make typing on the device very annoying, which means I've never used it seriously for anything. It's really a pity, because the device could conceivably be useful for a lot of things otherwise (though the impossibility of running a Web browser is a bit limiting, and being stuck with a fork of the kernel at a specific version, and with certain binary variants of software to get hardware video acceleration, is a problem for sustainability). I've tried several ways to investigate the problem, including recompiling the kernel with additional debugging, disabling all USB power saving options, etc.; nothing shows up in dmesg whenever the keypresses start getting ignored... A possibility would be to replace the built-in keyboard with a USB one, but this would not be trivial. I'm a bit out of ideas with this machine, really.

The eCafe looks more or less like an ordinary netbook on the outside (though the design is a bit weird), but it is pretty different inside because it uses an ARM processor (and a flash chip for storage). This means no mechanical parts at all (and no ventilation, no heat vents even) and ridiculously high battery life (roughly over 12 hours, though I should really benchmark it) but ridiculously bad performance.

The documentation available about the device online is sparse and there aren't that many people using it, so I thought I'd post some random notes about it in case someone wants to know more about the beast. I'm still discovering the thing, so there are many questions here and not many answers, but here you go.

Non-removable battery
There is no way to remove or change the battery. There is, however, a switch to physically disconnect the battery from the device if you don't want to use it.
SIM port
There is a port to plug a SIM card. I have no idea what it's supposed to be used for, or if it's connected to anything.
SD-card reader
Yes, the thing really has two SD-card readers: one external on the right side, and one internal reachable from under the device.
DIP switch
The device is advertised as having a "DIP switch" to boot a different operating system. You can find more info about that in this PDF. What the switch does is instruct the computer to search for a U-Boot installation at a specific position on an SD-card in the external reader. Might be fun.
BIOS
The device has no proper BIOS, it uses U-Boot. I'm not exactly sure how good this is: the boot is not especially fast, and the U-Boot part seems to take some time (though it's hard to tell because of all the nice annoying splash screens which hide what is really going on). From the PDF mentioned above, it seems as if U-Boot is requesting a DHCP lease to do a TFTP boot, which might take a lot of time, but this is just a guess. I'd have to try with my own U-Boot setup.
ACPI
The device has no real ACPI support but seems to use custom stuff instead. The battery status can be queried through /sys/devices/platform/imx-i2c.0/i2c-0/0-000b/power_supply/BAT0/
Suspend
The device has pretty good suspend to RAM with pm-suspend. Resume really takes 4 seconds, as advertised. The stability of this is unclear.
Light sensor
The device seems to have a light sensor. I don't know yet how you can query it.
Backlight
The backlight brightness can be controlled via /sys/devices/platform/pwm-backlight.0/backlight/pwm-backlight.0/, all the way from full power to no LCD backlighting.
Special keys
There are a few special keys, which all generate X events. On the keyboard: XF86Sleep, XF86MonBrightnessUp, XF86MonBrightnessDown, XF86AudioMute, XF86AudioLowerVolume, XF86AudioRaiseVolume, and keycode 248. On the right side with LED effects: XF86AudioPrev, XF86AudioNext, XF86AudioPlay, XF86AudioStop. It's nice that you can remap those to whatever you want, especially the four on the right side that can be reached even when the lid is closed. I still have to find a creative use for this.
Numlock and Capslock LEDs
Yes, there are some, which is pretty rare nowadays.
Keyboard feel
The keyboard feel is slightly hard but pretty nice. However, there is a really annoying glitch: some keypresses seem to be lost occasionally. I still have to debug this. It occurs both in ttys and when using X, so either it's the kernel or it's a hardware problem.
Wifi chip
It is driven with the non-standard rt3070 module. It seems related to similar models with good kernel support, but I'm not really sure. It seems pretty limited: for instance, changing the MAC address (SIOCSIFHWADDR) is not supported. Once again, hopefully there's a way to make this work.
Kernel limitations
The provided kernel isn't compiled with much stuff. For instance, LUKS, iodine and openvpn don't work.
Performance
Performance is horrendously bad. /proc/cpuinfo says the processor is an "ARMv7 Processor rev 5 (v7l)". Using Firefox on a reasonably simple HTML page is already pretty unpleasant. However, it's a nice ssh client (or text-entry station), except for the keyboard glitch.
Shipped OS
The shipped OS is a customized Ubuntu with a crappy netbook interface and custom repositories from Hercules without much stuff inside. The interface is probably simple enough to be used by anyone (except that it's so slow...) but it's not really good for a power user. It is possible to add the Ubuntu Netbook Remix Lucid Lynx repos and install the applications you want, though installing their kernel will not work.
Multimedia acceleration
Hardware acceleration for multimedia decoding is available through a proprietary modified version of gstreamer. It really makes a difference, ie. you can play a video with these extensions but not without. I'm not sure yet about how versatile this can be.
Toolchain
There is some documentation and code available. I hadn't tried to play with it when I first wrote this, but I have since: you can rebuild the bootloader and kernel (it isn't entirely straightforward), but you're stuck with the provided kernel version, which includes some changes that would apparently not be trivial to port to a newer kernel...
Available memory
The machine is supposed to have 512 MB of RAM, but /proc/meminfo only reports 416680 kB MemTotal, which is about 105 MB short. Maybe this memory is reserved for something (video acceleration?), maybe the kernel doesn't detect it correctly, or maybe the hardware just doesn't match the spec.
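
The sysfs directories mentioned in the ACPI and Backlight entries can be inspected with a few lines of Python. This is only a sketch: the helper name `read_sysfs_dir` is made up for illustration, and since I don't want to guess which attribute files the device exposes, it simply dumps every readable file in the directory rather than assuming specific names.

```python
import os


def read_sysfs_dir(path):
    """Return {attribute name: text} for every readable regular file in a directory."""
    values = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # skip subdirectories and links to directories
        try:
            with open(full, errors="replace") as f:
                values[name] = f.read().strip()
        except OSError:
            # Some sysfs attributes are write-only or fail on read; skip them.
            continue
    return values


# Paths quoted in the notes above.
BATTERY = "/sys/devices/platform/imx-i2c.0/i2c-0/0-000b/power_supply/BAT0/"
BACKLIGHT = "/sys/devices/platform/pwm-backlight.0/backlight/pwm-backlight.0/"

if __name__ == "__main__":
    for directory in (BATTERY, BACKLIGHT):
        if os.path.isdir(directory):
            for name, value in read_sysfs_dir(directory).items():
                print(directory + name, "=", value)
```

On a machine where these directories do not exist, the script prints nothing.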

A list of French words with no rhymes

— updated

English explanations

In this post, I present a list of French words which have no satisfactory rhyme (aka. orphan rhymes). I computed it from the Lexique database and curated it manually to fix errors and omissions in the DB and exclude some words (and added some from the French Wikipedia).

Of course, this list is not definitive, since there is no clear definition of the boundaries of the French language, and no clear definition of what constitutes an acceptable rhyme. For this list, I excluded exotic-sounding foreign borrowings and slang terms, and required rhymes to match up to the last vowel phoneme (exactly, including the "brin"/"brun" distinction), to respect rhyme gender, and to avoid pairing derived terms (ie. terms with a common etymological suffix, conjugations of the same verb, singular and plural forms...).
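
To make the main criterion concrete, here is a small Python sketch of the "match up to the last vowel phoneme" test. It is not the code used to build the list: the SAMPA-like notation, the vowel inventory, the function names and the toy lexicon are all assumptions made for this example, and it ignores rhyme gender and the exclusion of derived terms.

```python
from collections import defaultdict

# Hypothetical vowel inventory, in a SAMPA-like notation chosen for this sketch.
VOWELS = {"a", "i", "u", "y", "e", "E", "o", "O", "2", "9", "@",
          "E~", "9~", "a~", "o~"}


def rhyme_key(phonemes):
    """Return the phonemes from the last vowel to the end of the word."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i] in VOWELS:
            return tuple(phonemes[i:])
    return tuple(phonemes)  # no vowel found: fall back to the whole word


def orphans(lexicon):
    """Return the words whose rhyme key is shared with no other word."""
    by_key = defaultdict(list)
    for word, phonemes in lexicon.items():
        by_key[rhyme_key(phonemes)].append(word)
    return sorted(words[0] for words in by_key.values() if len(words) == 1)


# Toy lexicon, for illustration only.
toy = {
    "chat": ["S", "a"],
    "rat": ["R", "a"],
    "brin": ["b", "R", "E~"],
    "brun": ["b", "R", "9~"],
    "ogre": ["O", "g", "R"],
}

print(orphans(toy))
```

In this toy lexicon "brin" and "brun" come out as orphans only because the lexicon is tiny; the point is that their keys differ, so they are not counted as rhyming with each other.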

I did not expect this list to be so long, and to include so many reasonably common terms, but here you are... Preparing the list was a very nice occasion to learn about a lot of weird-sounding rare words; I provide links to the French Wiktionary if you want to learn more about some words (but I didn't check if all those pages actually exist). Thanks to Ted for his help in cleaning up the list. See also my list of non-homophonous homographs in French and my list of ambiguous verbal forms in French.

Added "film" (thanks to Hervé Bercegol).

Added "ogre".

Removed "caroube" following comment by cl-r.

Explications en français

Dans ce post, je présente une liste de mots français qui n'ont pas de rime satisfaisante (alias des rimes orphelines). Je l'ai calculée à partir de la base de données Lexique et l'ai ajustée à la main pour corriger des erreurs et des omissions, et pour exclure certains mots (et en ajouter à partir de Wikipédia).

Évidemment, cette liste n'est pas la seule possible, parce qu'il n'y a pas de définition claire des limites de la langue française, et pas de définition claire de ce qui constitue une rime acceptable. Pour cette liste, j'ai exclu certains emprunts étrangers trop exotiques et certains mots familiers, et j'ai exigé que les rimes correspondent jusqu'au dernier phonème vocalique (en respectant notamment la distinction "brin"/"brun"), qu'elles respectent le genre des rimes, et qu'elles ne soient pas faites entre des termes dérivés (par exemple des termes avec un suffixe étymologique commun, des conjugaisons du même verbe, un même terme au singulier et au pluriel...).

Je ne pensais pas que cette liste serait si longue, et qu'elle inclurait autant de termes raisonnablement communs, mais comme vous pouvez le voir... Préparer cette liste a été une occasion plaisante d'apprendre l'existence de termes rares aux sonorités étranges ; je fournis des liens vers le Wiktionnaire pour vous aider à chercher (mais je n'ai pas vérifié que les pages existent vraiment). Merci à Ted qui m'a aidé à nettoyer la liste. Voir aussi ma liste d'homographes non-homophones en français (en anglais) et ma liste de formes verbales ambiguës en français.

Ajout de "film" (merci à Hervé Bercegol).

Ajout de "ogre".

Suppression de "caroube" suivant un commentaire de cl-r.

The list / la liste

More fnacbook hacking

— updated

In this followup to my previous post about hacking the Kobo, I present a few other tips.

Reindexing the collection

It turns out that you can reindex the collection on the device without actually plugging it into a computer. This is very useful if you add or remove files on the device directly (using sftp, for instance), though you do not need it if you are just replacing existing files. The idea is that nickel uses the FIFO /tmp/nickel-hardware-status to get notified about events: if you write usb plug add or usb plug remove to this file, it will act as if the Kobo had been connected to or disconnected from the computer.

Hence, you can create a file reindex.sh containing:

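#!/bin/sh
# Fake a USB plug, wait, then fake the unplug to trigger reindexing.
echo usb plug add >> /tmp/nickel-hardware-status \
  && sleep 10 \
  && echo usb plug remove >> /tmp/nickel-hardware-status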

Make it executable and run it with nohup ./reindex.sh. nickel will present a dialog: quickly select "connect", and wait for the fake disconnection and reindexing to take place. Unfortunately, this will disconnect the Wi-Fi (hence the nohup business).

[This one was sent to me by Andreas Heider. Thanks a lot for suggesting this!] I also noticed that, when you download a file through the built-in web-browser, it will get indexed. It could be sensible to run a minimalistic web-server on the device to serve files already stored on the device to make the device index them. Maybe I'll investigate this later.

Automatic reverse ssh

You might want to connect to the device easily without having to find out its IP or worrying about NATs. The right tool for this problem is openvpn, but it has a lot of dependencies and isn't exactly light. A satisfactory (and probably simpler) alternative is to get the device to reverse ssh to a trusted server (with key-based authentication) when it connects to the Internet, so that you can always use this to get a shell.

I already mentioned how to get dropbear running on the device. To get reverse ssh working, it turns out that you have to add busybox ifconfig lo up to the /etc/init.d/rcS file because it isn't done by default. Then, generate a key using dropbearkey -f /root/.ssh/id_rsa -t rsa and copy the public key printed by the command to your authorized_keys file on the trusted server. Obviously, if you worry that unauthorized people might have access to your device, you should use an account with very few permissions.

Now, once you have checked that dbclient -i /root/.ssh/id_rsa USER@SERVER works without asking for a password, you can add the following reverse ssh command at the end of the renew|bound) section of file /etc/udhcpc.d/default.script (note the unsafe "-y" to avoid unknown host key issues, and adapt if you don't like this):

dbclient -y -i /root/.ssh/id_rsa -R 4080:127.0.0.1:22 USER@SERVER < /dev/ptmx &

Now, when obtaining or renewing a DHCP lease, the device should run the reverse ssh. I have found this to be pretty fragile when testing, for unknown reasons: workarounds you might want to try are using the IP address of SERVER directly (in case DNS resolution is not working properly yet) or moving the command to a separate script which sleeps for a few seconds before running the command.

Leaving the Wi-Fi active

I was complaining in my previous post about the Wi-Fi getting disabled after a few minutes. According to the (very insightful) guide Hacking the Kobo Touch for Dummies, the culprit is nickel, and, indeed, just running killall nickel will fix the problem. Obviously, without nickel running, there is little you can do with the device apart from using ssh. When you're done, the /etc/init.d/rcS file suggests that you can get nickel back by issuing something like:

INTERFACE=wlan0
WIFI_MODULE=ar6000
if [ $PLATFORM == ntx508 ]; then
  INTERFACE=eth0
  WIFI_MODULE=dhd
fi

export INTERFACE
export WIFI_MODULE

export QWS_MOUSE_PROTO="tslib_nocal:/dev/input/event1"
export QWS_KEYBOARD=imx508kbd:/dev/input/event0
export QWS_DISPLAY=Transformed:imx508:Rot90
export NICKEL_HOME=/mnt/onboard/.kobo
export LD_LIBRARY_PATH=/usr/local/Kobo
export WIFI_MODULE_PATH=/drivers/$PLATFORM/wifi/$WIFI_MODULE.ko
export LANG=en_US.UTF-8
export UBOOT_MMC=/etc/u-boot/$PLATFORM/u-boot.mmc
export UBOOT_RECOVERY=/etc/u-boot/$PLATFORM/u-boot.recovery

/usr/local/Kobo/nickel -qws

Trouble is, this does not seem to work perfectly (the new nickel instance is a bit broken, especially with regard to Wi-Fi). I would need to investigate more. Of course, to fix things for real, you can always just reboot the device.

Displaying stuff on the Kobo from the shell

People have already figured out how to display stuff on the Kobo's screen from the shell. It's pretty minimalistic (only images in a weird format). To spare a click, here is how you do it:

ffmpeg -i $INPUT.png -vf transpose=2 -f rawvideo \
    -pix_fmt rgb565 -s 600x800 -y $OUTPUT.raw
cat $OUTPUT.raw | ssh DEVICE /usr/local/Kobo/pickel showpic

Of course, this means that you can replace the images in /etc/images and /usr/local/Kobo/slideshow by other images. Disappointingly, though, the interesting images (the "powered off" and "sleeping" logos) don't seem to be there and seem to be displayed in another way. The most interesting thing you can customize is the boot animation (/etc/images/on-*.raw.gz).
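
For the curious, the raw format produced by the ffmpeg command above is just a sequence of 16-bit RGB565 pixels. The following Python sketch builds such a frame by hand; the function names are mine, and the little-endian byte order is an assumption (it is what I would expect ffmpeg's rgb565 to produce on a little-endian machine, but I have not checked what pickel expects).

```python
import struct

WIDTH, HEIGHT = 600, 800  # frame size passed to ffmpeg's -s option


def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit pixel (5 bits red, 6 green, 5 blue)."""
    value = ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
    return struct.pack("<H", value)  # assumption: little-endian byte order


def solid_frame(r, g, b):
    """Return the raw bytes of a full frame filled with a single colour."""
    return pack_rgb565(r, g, b) * (WIDTH * HEIGHT)


if __name__ == "__main__":
    # Two bytes per pixel, so a full frame is 600 * 800 * 2 bytes.
    print(len(solid_frame(255, 255, 255)))
```

A frame built this way could be written to a file and sent to the device with the same cat and ssh pipeline, which makes it possible to generate test patterns without going through ffmpeg.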

Retrieving touchscreen events from the device

The (very insightful) guide Hacking the Kobo Touch for Dummies also explains how to recover the touchscreen input events from the device:

evtest /dev/input/event1

There is a lot you can do by forwarding these events to your computer and interpreting them creatively (a custom remote control, for instance). As always, buffering issues are a problem when trying to retrieve the events synchronously. Here is a possible way to retrieve the useful info, where DEVICE is where you can reach the device:

cat <(echo 'evtest /dev/input/event1;') - |
  socat EXEC:'ssh DEVICE',pty,ctty,echo=0 STDIO |
  sed -u 's/$/\n/' |
  stdbuf -i0 -o0 -e0 cut -d ' ' -f 9,11,13

In case you're wondering, the sed command is because the newline at the end of the last event is outputted when the next one happens, which isn't really convenient if you want to pipe the data to something.

Here is a small Python program to run on your computer which accepts this input and will draw what you do on your touchscreen:

import sys

import pygame

pygame.init()
window = pygame.display.set_mode((600, 800))


def line(x1, y1, x2, y2, r=255, g=255, b=255):
    """Draw an anti-aliased line on the window."""
    pygame.draw.aaline(window, (r, g, b), (x1, y1), (x2, y2))


pre = None  # previous point of the current stroke, or None

while True:
    l = sys.stdin.readline()
    if not l:
        break  # end of input
    l = l.rstrip().split(' ')[0:3]
    if len(l) != 3 or '' in l:
        continue  # skip incomplete lines
    print(l)
    if pre:
        # the touchscreen axes are rotated relative to the window
        line(600 - int(pre[1]), int(pre[0]), 600 - int(l[1]), int(l[0]))
    # a third field of '0' ends the current stroke
    pre = l if l[2] != '0' else None
    pygame.display.flip()

Obviously, you should try to minimize the network latency if you want this to be pleasant.
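
For uses other than drawing (the custom remote control mentioned above, for instance), it may be more convenient to group the events into strokes first. Here is a minimal sketch without pygame; the function name `strokes` is hypothetical, and it assumes, like the program above, that the three fields are two coordinates followed by a value which is 0 when the stroke ends.

```python
def strokes(lines):
    """Group 'x y value' lines into lists of points; a value of 0 ends a stroke."""
    current = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip empty or incomplete lines
        x, y, value = fields[:3]
        if not (x.isdigit() and y.isdigit()):
            continue  # skip anything that is not a coordinate pair
        current.append((int(x), int(y)))
        if value == "0":
            yield current
            current = []
    if current:
        yield current  # unfinished stroke at end of input


if __name__ == "__main__":
    demo = ["10 20 5", "11 22 5", "12 25 0", "", "40 40 3"]
    for stroke in strokes(demo):
        print(stroke)
```

Since it is a generator, it can be fed sys.stdin directly and will yield each stroke as soon as it is complete.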

Use, mention, and titles

— updated

In this post, I present the use-mention distinction. Consider the following sentences:

  • John told me a few words.
  • John told me "a few words".

Those sentences don't mean the same thing. The first one means that John told me a few words, without specifying what those words were. The second means that John told me literally "a few words", that is, he told me exactly three words, which were "a", "few", and "words".

The distinction seems trivial in this example, but it is seldom fully respected. Consider for instance:

  • There's a difference between him and her.
  • There's a difference between "him" and "her".

The first sentence refers to two people, one male and one female, and says that they are different. The second says that the words "him" and "her" are different. This example is clear, but consider:

  • Notice that affect is always a verb.
  • Notice that "affect" is always a verb.

People will often write the first sentence without noticing that it is not what they mean, and that it is not even grammatically correct. Indeed, what about:

  • Notice that that is never a verb.
  • Notice that "that" is never a verb.

Clearly, we need the quotation marks.

Worse still, there are names and titles.

  • I love War and Peace. (referring to the book by Tolstoy)
  • I love war and peace. (referring to war and peace in general)

There are often typographical and case distinctions to disambiguate, but they are not always obvious:

  • I love adventure. (referring to the classic video game called "adventure")
  • I love adventure. (referring to adventure in general)

Here is a final example:

  • I read a book yesterday. (ie. I read some book)
  • I read "a book" yesterday. (ie. I read the literal words "a book")
  • I read A Book yesterday. (ie. I read this book — not an endorsement, I don't know it, but the title is fun)

The fact that humans are almost never confused by missing quotation marks or missing italics is pretty surprising when you're used to working with computers.
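
Programming languages show how strict a computer is about this. In the following Python illustration (the variable name is of course arbitrary), the only difference between the two calls is a pair of quotation marks, and they compute different things:

```python
words = ["a", "few", "words"]

# Use: the name refers to the list it is bound to.
print(len(words))    # number of elements in the list

# Mention: the quotation marks turn it into the string "words" itself.
print(len("words"))  # number of characters in that string
```

Dropping the quotation marks around a word that is not bound to anything does not produce a sentence that can be charitably reinterpreted: Python raises a NameError instead.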