a3nm's blog

CalDAV and CardDAV with Radicale and DAVdroid on Android

— updated

In this post, I present an entirely self-hosted solution to manage your calendar and contacts on an Android phone running CyanogenMod. It synchronizes to your own server using the open CalDAV and CardDAV protocols, with the free (as in free speech) DAVdroid program as client (along with the stock calendar and contacts applications that come with Android) and Radicale as server. As of this writing I am using Radicale 0.8 and DAVdroid 0.5.1-alpha.

To my knowledge, as of this writing, this is the only way to manage your calendar and contacts on an Android phone which satisfies the following desiderata: no dependency on closed protocols (eliminates Exchange); no dependency on a third-party server (eliminates Google Calendar); synchronization with an external machine (eliminates "local" calendars and local contacts); no dependency on proprietary applications (eliminates all the proprietary DAV software in the Google Play Store); actually functional (eliminates all the DAV clients I tried before DAVdroid).

After a few weeks of usage, the setup described here seems to work fine for me, except for occasional crashes of DAVdroid. It restarts by itself and I have not experienced any data loss to date because of this, so it is just the minor inconvenience of occasionally having to acknowledge that DAVdroid had to exit. Note that I usually only edit data from the phone, not from a different DAV client (though I tested this once with Evolution and it seemed to synchronize fine), and I did not test having more than one calendar (maybe I will do so eventually, just to have different colors for different kinds of events on my phone), so I can't testify that those use cases work.

I have changed the server configuration: see this newer post for explanations. This update only concerns some sections of this guide, which are pointed out explicitly.

Warning about migrating existing data

Before we start, a big warning. On Android (at least on the versions I have used), contacts can be local to the phone, but events cannot be local. If you have been using the default calendar application with what you think are local events, they are actually associated with some account provider (Google Calendar, Exchange, or some other application -- in my case, it used to be one of the many other buggy calendar synchronization applications), and deleting the associated account (even though it might seem unused) will delete your entire calendar. I did this once, essentially "lost" my entire calendar except for a nigh-unusable SQLite backup, and then I lost most of the data on my phone through a misguided attempt to undo this mistake by hand, which messed up file ownership. Don't do that.

This being said, I don't know what would be a better way to migrate your existing events in the setup that I describe if you have been using pseudo-local events (with a dummy provider) on the stock Android applications. You can dig out the SQLite database in /data/data/com.android.providers.calendar/databases/ on the phone but that's not really helpful because I don't know of any way to convert it to something that can be imported. So well, if you used that, like me, you are screwed, and should have investigated switching costs more carefully.

For contacts, however, according to feedback from F., you can export them using the "Import-Export" functionality of the stock Android contacts application, and reimport them back once the CardDAV account has been set up, and it should work as intended.

For calendar events, according to Thomas Bartosik, there might be a solution to mass-migrate your calendar events from a local calendar to the DAVdroid one, by retrieving the file /data/data/com.android.providers.calendar/databases/calendar.db and editing it to copy the contents of the events table to itself, changing the value of the calendar_id attribute from the ID of the local calendar to that of the DAVdroid calendar (those IDs being obtained by peeking in that table). (Sorry, I won't provide the SQL command to do this because it is long and depends on the exact schema of that file.) Note that you should of course do backups before trying something like this, and I can't guarantee that this solution works as I have not tested it myself. The calendar and calendarstore should be killed on the phone once the modified database has been installed, so that the changes are picked up.

Now that you are warned, let us start with the actual guide on a more cheerful note: once the CalDAV setup described here is operational, your calendar and contacts will be synchronized with your computer and stored in the open iCalendar format, so you should be able to migrate them to something else whatever happens.

Install CyanogenMod

As CyanogenMod is discontinued, you should install, e.g., LineageOS instead.

This step is not strictly necessary: what I'm describing probably works on stock Android too with minor adaptations, but I prefer CyanogenMod because it is more customizable and makes it possible to cleanly avoid all the proprietary Google applications. I won't describe how to install it -- if you're interested, go to the official website and follow the instructions. I'm using CyanogenMod version 10.1.3-maguro, which means a Galaxy Nexus.

Install F-Droid

I manage applications on my phone using F-Droid, which only includes free as in free speech software, including DAVdroid. First download the F-Droid APK and install it, allowing the installation of third-party applications if you have not done so yet (Settings → Security → Device Administration → Unknown sources). Launch F-Droid, let it update the application list (or do it manually: menu → Update Repos), search for "davdroid", and install the application.

You can also install DAVdroid from the Google Play Store, in which case you will have to pay a small donation to the developers of DAVdroid. I do not encourage you to use the proprietary Google Play Store, but I do encourage you to donate directly to the developers.

Set up your server

This part of the guide is superseded by this newer post, in particular I'm no longer using port 5232.

I assume that you have a server (an always-on computer) running some Unix system, and that it is reachable from the outside by a domain name or static IP (say "example.com") where it can accept incoming connections on TCP port 5232 (i.e., if the server is behind a modem performing NAT, incoming connections to TCP port 5232 are forwarded to this computer as identified by its local network address -- having set up if needed a static DHCP lease for the machine based on its MAC address). Of course I assume that your phone will be able to reach the server, namely, that it has a working Internet connection and that port 5232 is not filtered by your mobile operator, by your Wi-Fi access point, etc.
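
To check this reachability assumption from a machine outside your network, a minimal sketch along the following lines may help. It relies on bash's /dev/tcp pseudo-device and on the coreutils timeout command; the helper name port_open is made up for this example, and nothing here is specific to Radicale.

```shell
#!/bin/bash
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be
# opened within 5 seconds. (port_open is an illustrative helper name.)
port_open() {
  # The inner bash receives HOST as $0 and PORT as $1.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

With the placeholder values of this guide, you would call it as port_open example.com 5232 && echo reachable. A failure only tells you that something on the path (NAT, firewall, operator filtering, or a server that is not running yet) prevents the connection, not which one.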

Install Radicale on your server -- for instance, on a Debian system, run apt-get install radicale. From here, we will not need to use administrative privileges on the server (no su, no sudo), as radicale is meant to be run by the user who wants to use it. (I do not present multi-user configuration in this guide.)

Generate the certificate

This part of the guide is superseded by this newer post where SSL is managed by Apache using Let's Encrypt certificates. With Let's Encrypt being available, free, and functional, it's probably a bad idea to generate your own certificates.

We will want to encrypt the connection between the phone and the server, because we do not want the details of your events and contacts leaking to anyone snooping on the link. To do so, we need to generate an X.509 certificate for the server. If you already have such a certificate that you use for other purposes (e.g., HTTPS), you may choose to reuse your existing certificate instead, but this may be complicated because the radicale daemon (which we will run in an unprivileged way) will probably not have the rights to read the Web server certificate's key file. In this case you can probably arrange to run radicale as root, or with rights to access the certificate, or at least you could sign both certificates with the same CA, but I will not cover this. I will just assume that you use a fresh self-signed certificate for radicale. To generate a self-signed certificate with an RSA 2048-bit key, valid for 10 years, run:

  openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
    -keyout radicale.key -out radicale.crt

The exact values asked by OpenSSL do not really matter except for the Common Name, which should be the exact name that you will use to access your server from your phone (e.g., "example.com", with no trailing dot). According to F., providing an IP address rather than a domain name will also work. Two files will be generated: radicale.crt, which contains the public part of the certificate, and radicale.key, which contains the private key (and should remain private).
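
If you prefer not to answer the interactive prompts, the Common Name can be given on the command line with the -subj option of openssl req. The following sketch does so in a scratch directory (so that it cannot overwrite a real key) and then prints what ended up in the certificate; "example.com" is the placeholder name used throughout this guide.

```shell
#!/bin/bash
set -e
# Work in a scratch directory so that no existing key is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Same command as above, with -subj supplying the Common Name
# non-interactively ("example.com" is a placeholder).
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
  -subj "/CN=example.com" \
  -keyout radicale.key -out radicale.crt 2>/dev/null

# Keep the private key readable by its owner only.
chmod 600 radicale.key

# Print the subject and the expiry date of the generated certificate.
openssl x509 -in radicale.crt -noout -subject -enddate
```

The last command is also a convenient way to double-check the Common Name of a certificate that you generated interactively.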

Install the certificate to your phone

This part of the guide is no longer necessary when following the newer post.

As this server certificate has not been signed, Android will refuse to connect to the server using SSL unless you specifically authorize the certificate. Transfer the radicale.crt file (not the .key file) to the /mnt/sdcard folder on your phone, i.e., copy it to your (micro-)SD card if your phone has one, or move it to the topmost folder available through MTP, or use adb (apt-get install android-tools-adb on Debian). Then, install the certificate: Settings → Security → Credential Storage → Install from storage, select the right file if prompted (or it will be automatically selected if there is only one), and give any name to the certificate.

Generate the auth file, logging file, and collections folder for Radicale

With the newer guide, part of the configuration here has changed, and there is no longer any authentication.

I will assume that security of the storage on your server is not a problem (e.g., your drive is encrypted). As the connection is also encrypted, we do not need to worry about authentication information being stored or transferred in plaintext. Hence, the simplest way to authenticate users is to put the following in a file (say passwd), adapting the values as you like:

myuser:mysecretpassword
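
Since the password is stored in plaintext, it seems prudent to make the file readable by your user only. Here is one possible way to create it, sketched in a scratch directory; the user name and password are the placeholders of this guide, and the umask trick is just one option among others (a chmod 600 afterwards would do as well).

```shell
#!/bin/bash
# Create the plaintext credentials file so that only its owner can read it.
# The demo writes to a scratch directory rather than to a real location.
dir=$(mktemp -d)
(
  umask 077   # files created in this subshell get mode 600
  printf '%s:%s\n' myuser mysecretpassword > "$dir/passwd"
)
ls -l "$dir/passwd"
```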

We also need to tell radicale not to write logs at locations unavailable to unprivileged users (e.g., not /var/log/radicale/radicale.log). To do this, you need to specify a logging policy, with a file of this sort (call it logging):

[loggers]
# Loggers names, main configuration slots
keys = root

[handlers]
# Logging handlers, defining logging output methods
keys = console

[formatters]
# Logging formatters
keys = simple,full

[logger_root]
# Root logger
level = DEBUG
handlers = console

[handler_console]
# Console handler
class = StreamHandler
level = INFO
args = (sys.stdout,)
formatter = simple

[formatter_simple]
# Simple output format
format = %(message)s

[formatter_full]
# Full output format
format = %(asctime)s - %(levelname)s: %(message)s

Finally, we also need to set up a location to store the calendar and address book data on your server. Once again, I am assuming that this location is safe (stored on an encrypted partition). Create an empty folder (call it collections), and run the following to initialize the collections (replacing the first path and "myuser" as needed -- you can also adapt the names "calendar" and "contacts" if you like):

cd /where/you/put/collections
mkdir myuser
touch myuser/calendar.ics
touch myuser/contacts.vcf

Configure Radicale

With the newer guide, part of the configuration here has changed.

Adapt the following configuration and write it to a file (say config). You can refer to the Radicale user documentation for more info. When writing file locations, always write them as absolute paths and do not use ~ as an abbreviation for your home folder:

[server]
# accept incoming connections and specify the port
hosts = 0.0.0.0:5232
# do not go to the background -- useful for debug
daemon = False
# use SSL to encrypt connections
ssl = True
# adapt the following to point to the certificate and key
certificate = /where/you/put/radicale.crt
key = /where/you/put/radicale.key
# displayed to request password on the client -- use any value
realm = radicale

[encoding]
request = utf-8
stock = utf-8

[auth]
type = htpasswd
# point to the authentication file
htpasswd_filename = /where/you/put/passwd
# no encryption on this file
htpasswd_encryption = plain
# adapt the following
private_users = myuser
public_users = myuser

[rights]
# only you have access to your collections
type = owner_only

[storage]
# store in flat files
type = filesystem
# point to the collections folder
filesystem_folder = /where/you/put/collections

[logging]
# more info for debug
debug = True
# specify the logging policy
config = /where/you/put/logging
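
Because relative paths and ~ are an easy mistake to make in this file, a small check can be sketched as follows. The helper name check_paths is made up for this example, the list of option names is simply taken from the configuration above, and the sample file (with its /home/me path) is invented for the demonstration.

```shell
#!/bin/bash
# check_paths CONFIG: print every path-valued option whose value is not an
# absolute path, and return non-zero if any was found.
# (check_paths is an illustrative helper, not part of Radicale.)
check_paths() {
  awk -F' *= *' '
    $1 ~ /^(certificate|key|htpasswd_filename|filesystem_folder|config)$/ &&
    $2 !~ /^\// { print "not absolute: " $0; bad = 1 }
    END { exit bad }
  ' "$1"
}

# Demonstration on a made-up config fragment with one faulty line.
demo=$(mktemp)
printf '%s\n' '[server]' 'certificate = ~/radicale.crt' \
  'key = /home/me/radicale.key' > "$demo"
check_paths "$demo" || echo "fix the paths listed above"
```

On your real file, you would run check_paths /where/you/put/config before starting radicale.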

You can now run radicale using radicale -C /where/you/put/config. If it does not work, copious debug output should be produced in the console. When it works, you need to arrange for this invocation to be run on the server whenever it is restarted -- remember it should be run unprivileged, not run by root. My current way of doing things is to run it by hand and keep it in a screen alongside offlineimap.

Yuval Levy wrote to mention that at this point you might need to create a calendar and a contact list in radicale for DAVdroid to pick them up in the next step. This can be done using, e.g., Evolution, and you should be careful when doing so to create the collections in a subfolder (e.g., "http://example.com:5232/user/addressbook/") rather than directly in the "user/" folder, otherwise DAVdroid will not see them. This should show up in your server as, e.g., "collections/user/calendar.ics" and "collections/user/contacts.vcf". According to Thomas Bartosik, a more lightweight solution is to use cadaver to create the collections. He also recommends that you put a trailing slash, like https://server:port/username/{calendar.ics,contacts.vcf}/, and also provide the trailing slash when configuring in DAVdroid.

Configure DAVdroid

With the newer guide, specifying 5232 is no longer necessary.

Now you need to configure DAVdroid on your phone. Go to Settings → Accounts → Add account, select DAVdroid, and provide the following information: "https" as method, example.com:5232/myuser/ as root URL (replacing "example.com" with your server domain name and "myuser" with your user name), "myuser" as user, and "mysecretpassword" as password. Leave the checkbox as-is (i.e., checked), and go forward. If all goes well you should now be able to select "contacts" and "calendar" as the collections to synchronize, and confirm.

From there, all should work as intended: new contacts and calendar events created in the stock Android "Calendar" and "People" applications on your phone should be eventually synchronized to your server. You can go to Settings → Accounts → DAVdroid to synchronize manually. Check on your server that all information is indeed stored to collections/myuser/calendar.ics and collections/myuser/contacts.vcf.
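
One quick way to perform this check is to count the entries in each file: in the iCalendar format every event starts with a BEGIN:VEVENT line, and in the vCard format every contact starts with a BEGIN:VCARD line. The sketch below does this on two tiny made-up sample files; the helper name count_items and the sample data are invented for the illustration.

```shell
#!/bin/bash
# count_items FILE MARKER: count the lines of FILE that are exactly MARKER,
# tolerating the CRLF line endings that these formats may use.
count_items() {
  tr -d '\r' < "$1" | grep -c -x "$2"
}

# Made-up sample data standing in for the real collections.
demo=$(mktemp -d)
printf 'BEGIN:VCALENDAR\r\nBEGIN:VEVENT\r\nSUMMARY:Dentist\r\nEND:VEVENT\r\nBEGIN:VEVENT\r\nSUMMARY:Lunch\r\nEND:VEVENT\r\nEND:VCALENDAR\r\n' > "$demo/calendar.ics"
printf 'BEGIN:VCARD\r\nFN:Ada Lovelace\r\nEND:VCARD\r\n' > "$demo/contacts.vcf"

echo "events:   $(count_items "$demo/calendar.ics" 'BEGIN:VEVENT')"
echo "contacts: $(count_items "$demo/contacts.vcf" 'BEGIN:VCARD')"
```

On the server, you would point count_items at collections/myuser/calendar.ics and collections/myuser/contacts.vcf, and compare the numbers before and after creating an event or a contact on the phone.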

Configure backups

As my contacts and calendar are important information, and as the software presented here is a bit young, I mitigate the potential risk of data loss by performing automated daily full backups of all the information stored on the server (in addition to my regular backup policy). To do this, you can use the following script (say backup.sh):

#!/bin/bash

# adapt this to point to your collections folder
COLLECTIONS="/where/you/put/collections"
# adapt to where you want to back up information
BACKUP="/where/you/want/backups"

mkdir -p "$BACKUP"
tar zcf "$BACKUP/dump-`date +%s`.tgz" "$COLLECTIONS"

Make the file executable (chmod +x backup.sh) and schedule it for daily execution at 5:00 AM by running crontab -e and adding the following line:

0 5 * * * /where/you/put/backup.sh

Run the backup.sh script to check that it works, and check that dumps are indeed generated regularly by cron.
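
To check more than the mere existence of the dumps, you can also verify that the newest one is a readable archive. This is a minimal sketch under the assumption that dumps follow the dump-*.tgz naming of the script above; the helper names latest_dump and check_dump are made up, and the demonstration runs on a scratch directory.

```shell
#!/bin/bash
# latest_dump DIR: print the most recently modified dump-*.tgz in DIR.
latest_dump() {
  ls -t "$1"/dump-*.tgz 2>/dev/null | head -n 1
}

# check_dump FILE: succeed if FILE can be listed as a gzip-compressed tar.
check_dump() {
  tar tzf "$1" >/dev/null 2>&1
}

# Demonstration on a scratch copy of the layout used in this guide.
demo=$(mktemp -d)
mkdir -p "$demo/collections/myuser" "$demo/backups"
touch "$demo/collections/myuser/calendar.ics"
tar zcf "$demo/backups/dump-$(date +%s).tgz" -C "$demo" collections

newest=$(latest_dump "$demo/backups")
if check_dump "$newest"; then
  echo "OK: $newest"
else
  echo "BROKEN or missing dump"
fi
```

Listing an archive does not prove that its contents are what you expect, so an occasional test restore remains a good idea.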

This concludes this guide: you now have a working address book and calendar on your Android phone, synchronized to your server using free and open source software and open protocols and file formats.

Mobile phones and privacy

— updated

There are multiple independent reasons to oppose mobile phones on privacy grounds, and they should be carefully distinguished. In this post, I attempt to sketch an exhaustive list.

Location tracking
It is easy for your mobile phone provider to know where your phone is located by looking at which cell sites it is connected to. As the provider usually knows your identity for billing reasons, and as your mobile phone is usually located on your person, this means that your provider usually knows where you are.
Avoiding this problem is not trivial. One solution is to use prepaid SIMs or other such systems where the billing is not performed directly by the cell phone operator (although in some countries, for instance France (see footnote 1), it is a legal requirement to provide proof of your identity when buying a prepaid SIM). Alternatively, you could also consider that your current location is not private information, because of all the other trends that tend to make it public (CCTV, etc.).
Internet interference
Internet traffic on mobile phones is often subjected to more invasive analysis than Internet traffic on computers through regular access providers. This may be because of the widespread policy of accounting for the volume of data transferred on mobile phone networks, which is not so common (at least in France) for landline Internet connections, or because of the wish of some mobile phone operators to restrict the services that they allow in order to bill different Internet services separately (because they are used to having complete control over the phone used by the subscriber to access the Internet through the connection they provide). Because of such violations of net neutrality, the Internet access provided on mobile phones seems less trustworthy than a regular broadband connection.
The problem can be circumvented by using a different medium to access the Internet on your mobile, such as Wi-Fi. Otherwise, there is little objective reason to believe that the undesirable behavior of your mobile Internet access provider could not be replicated, at least in principle, by your landline Internet access provider.
Transfer security
Even without assuming interference from the phone provider, one can reasonably doubt the security of the encryption used between the phone and the cell site, leaving the transferred data potentially available to nearby attackers: weaknesses of A5/1, possible spoofing of cell sites, etc.
Of course, this is not worse than using, say, an open Wi-Fi network. You just have to put your own encryption on top of the link layer encryption. This may be harder, however, for standard phone calls and texts.
Phone wiretapping
There is a long history of police forces and other governmental services using wiretapping to access an individual's phone calls. This precedent is what motivates intrusive access to mobile phone calls, text messages, and to some metadata (e.g., who calls or texts whom and when) which is specific to telephones in general and mobile phones in particular.
Such wiretapping can be avoided by using encrypted Internet-based alternatives to standard telephony or text, though this is usually inconvenient because of sparser connectivity, more expensive billing, high bandwidth requirements, and reduced battery life.
Non-federated protocols
One can dislike the standard telephone network because it is less federated than the Internet. Of course, one could also criticize the Internet because it is not exactly federated, but it is certainly undesirable to use this additional single purpose network for voice and text messages with its strange historical billing policies.
This is not really a privacy problem, however, except for the reason that poorly federated protocols may promote bad security and privacy violations.
Proprietary software
Mobile phones ship with software that may be proprietary. This is, of course, a danger to privacy, as such software may misuse your data or incorporate deliberate backdoors or involuntary security flaws (conceivably opening your phone's microphone to a third party...) without any possibility of reviewing what is going on. On current smartphones, for instance, Apple iOS is proprietary, and Google's Android is mostly open source but all Google-branded applications (Google Maps, Google Play Store, etc.) are proprietary and some critical low-level components are also proprietary. Furthermore, if you obtained your phone from your phone operator (rather than buying the naked phone with a stock Android install), the carrier may have added its own applications which are probably proprietary and maybe pretty treacherous. Much more worryingly, the radio firmware of mobile phones is essentially always proprietary.
This issue can be mitigated by using free software on your phone. On Android phones, a first easy step, which does not however eliminate all proprietary dependencies, is to use a community-maintained ROM such as CyanogenMod without installing the proprietary Google applications. More radically, you can use Replicant to eliminate proprietary dependencies altogether (except for radio). It is also interesting to investigate existing or upcoming options such as the Maemo-based Nokia N900, the Openmoko-based Neo Freerunner, the Firefox-OS-based GeeksPhone Keon, the Ubuntu Touch OS, etc. As for the radio firmware, my knowledge of this is somewhat limited, but it seems like there is one (only one) open source alternative, namely OsmocomBB, which you can use on some very specific phones (for GSM, not 3G). So the issue of the radio firmware can also be solved, at least in principle.
Undesirable integrated services
Even if you trust the software on your phone to serve your needs, mobile phone operating systems today are usually configured to be very tightly integrated with third party services that you may not trust. For instance, Android phones with Google applications will encourage you to hand over your email, calendar, location (with Google Maps), searches (with Google Search), nearby Wi-Fi networks, etc., to Google.
Of course, to solve this, you just have to avoid the default services recommended by your phone software, in favor of trusted, privacy-aware or (ideally) self-hosted alternatives. This can be easier said than done, however. Most Android software is only distributed through the Google Play Store, meaning that you will be forced to use this service if you want to use such software. As another example, consider the task of maintaining a calendar on your Android phone and synchronizing with the outside world, without using Google Calendar or the protocol of the proprietary Microsoft Exchange: to my knowledge, the only way to perform this using free software has appeared only fairly recently.

My point in making this list is just to make people aware of what I hope is the complete privacy case against mobile phones, so that they can distinguish the various possible dangers and know where they stand relative to each of them. From there, the decision of what to do is a personal choice; for instance, my own choice is to give up on privacy for my physical location, to mistrust the Internet connection as I would mistrust an open Wi-Fi (using a VPN and/or SSL connections), to use standard calls and texts while keeping in mind that they are insecure, to use CyanogenMod (without Google applications, but with some proprietary blobs and proprietary radio firmware), and to avoid third-party services in favor of self-hosted ones.

Of course, this list does not cover other reasons to oppose mobile phones, such as boycotting them because of how they are produced, boycotting mobile phone plans, avoiding them on unclear health grounds, refusing to be constantly available and taking time to disconnect, etc. (this latter list is certainly not complete).

1 I tried for about two hours to figure out the exact law which imposes this, but I couldn't find it. To my knowledge, all French mobile phone operators have a suspiciously similar activation procedure for prepaid SIMs requiring you to provide some proof of ID to use (or continue to use) the SIM you bought; however, it is never explained exactly why this activation procedure exists, the most explicit reference in the TOS being "conformément à une demande ministérielle intervenue dans le cadre de la loi 91-646 du 10 juillet 1991 et à l’article L34-1-1 du code des postes et communications électroniques" (roughly, "in accordance with a ministerial request made under law 91-646 of 10 July 1991 and article L34-1-1 of the postal and electronic communications code"), which is pretty vague: I couldn't find anywhere the exact nature or text of this ministerial request, and I find this vaguely worrying.

Ambiguous verbal forms in French: a larger list

— updated

In a previous post I gave a list of 44 French verbal forms that are ambiguous (in the sense that they can correspond to different verbs), and hoped that it was exhaustive. How wrong I was. Following Erik McDonald's suggestion that entries were missing from the list, I used Verbiste to compute a new list. I also added 59 more forms found using the extensional lexicon of the awesome Lefff. The result contains 529 verbal forms (!) and subsumes the previous list. I will not dare to hope that this list is complete, but it is certainly more complete than the previous one.

At such a scale, I elect to divide the list into chunks.
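
For the curious, the core of such a computation is easy to sketch: given a table of conjugated forms with their infinitives, keep the forms that map to more than one infinitive. The following is only an illustration of that idea, not the actual script used to produce this list; the tab-separated "form, infinitive" input format, the helper name ambiguous_forms and the tiny sample are assumptions made for the example (Verbiste and the Lefff have their own formats).

```shell
#!/bin/bash
# ambiguous_forms: read "form<TAB>infinitive" lines on standard input and
# print each form that belongs to at least two distinct infinitives,
# followed by those infinitives.
ambiguous_forms() {
  # Byte-wise sorting keeps all lines of a given form adjacent, and -u
  # drops duplicates (the same form listed twice for the same verb).
  LC_ALL=C sort -u | awk -F'\t' '
    $1 != form { if (n > 1) print form ": " verbs
                 form = $1; verbs = $2; n = 1; next }
    { verbs = verbs " and " $2; n++ }
    END { if (n > 1) print form ": " verbs }
  '
}

# Tiny made-up sample: two ambiguous forms and one unambiguous form.
printf '%s\t%s\n' \
  suis être  suis suivre \
  pariez parer  pariez parier \
  mangez manger \
  | ambiguous_forms
```

On this sample, the forms "pariez" and "suis" are reported and "mangez" is not. The hard part in practice is of course obtaining a complete and correct table of forms, not the grouping itself.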

Case 1: the -ss-

A frequent situation is that two verbs clash because one can be formed from the other by adding "-ss-" (often to denote a pejorative connotation), and "-ss-" also appears in the imperfect tense of the subjunctive mood of the shorter verb. In the following verbs from the first conjugation group, this is exactly what happens.

baver and bavasser
bavasse, bavassent, bavasses, bavassiez, bavassions
brouiller and brouillasser
brouillasse, brouillassent, brouillasses, brouillassiez, brouillassions
cailler and caillasser
caillasse, caillassent, caillasses, caillassiez, caillassions
crever and crevasser
crevasse, crevassent, crevasses, crevassiez, crevassions
damer and damasser
damasse, damassent, damasses, damassiez, damassions
débarrer and débarrasser
débarrasse, débarrassent, débarrasses, débarrassiez, débarrassions
dégueuler and dégueulasser
dégueulasse, dégueulassent, dégueulasses, dégueulassiez, dégueulassions
embarrer and embarrasser
embarrasse, embarrassent, embarrasses, embarrassiez, embarrassions
encrer and encrasser
encrasse, encrassent, encrasses, encrassiez, encrassions
enlier and enliasser
enliasse, enliassent, enliasses, enliassiez, enliassions
enter and entasser
entasse, entassent, entasses, entassiez, entassions
grogner and grognasser
grognasse, grognassent, grognasses, grognassiez, grognassions
ramer and ramasser
ramassent, ramasse, ramasses, ramassiez, ramassions
rêver and rêvasser
rêvassent, rêvasse, rêvasses, rêvassiez, rêvassions
terrer and terrasser
terrassent, terrasses, terrasse, terrassiez, terrassions
tourner and tournasser
tournassent, tournasses, tournasse, tournassiez, tournassions
traîner and traînasser
traînassent, traînasses, traînasse, traînassiez, traînassions

The same situation, but involving two impersonal verbs:

brumer and brumasser
brumasse
frimer and frimasser
frimasse
mouiller and mouillasser
mouillasse

The same situation sometimes occurs between first group and second group verbs:

pâtir and pâtisser
pâtissaient, pâtissais, pâtissait, pâtissant, pâtissent, pâtisse, pâtisses, pâtissez, pâtissiez, pâtissions, pâtissons
tapir and tapisser
tapissaient, tapissais, tapissait, tapissant, tapissent, tapisses, tapisse, tapissez, tapissiez, tapissions, tapissons
vernir and vernisser
vernissaient, vernissais, vernissait, vernissant, vernissent, vernisses, vernisse, vernissez, vernissiez, vernissions, vernissons

It may also occur between first group and third group verbs, as in the case of "voir" and derivatives:

voir and visser
vissent, visses, visse, vissiez, vissions
revoir and revisser
revissent, revisse, revisses, revissiez, revissions

Or in the more complex case of "bruir" and "bruire" (two verbs with different meanings):

bruir and bruisser
bruissais, bruisses, bruissez, bruissiez, bruissions, bruissons
bruire and bruir and bruisser
bruissaient, bruissait, bruissant, bruisse, bruissent
bruire and bruir
bruit

Case 2: the -i-

Adding an "-i-" to the infinitive can give a different verb, but the conjugations will once again overlap (for instance, "pariez" is both an imperfect form of "parer" and a present form of "parier"). The boundary is often blurry here: I judged that "tarifer" and "tarifier" were alternative spellings of the same verb and did not include them in the list, but there is no doubt that "parer" and "parier" are entirely different verbs:

affiler and affilier
affiliez, affilions
aller and allier
alliez, allions
colorer and colorier
coloriez, colorions
déparer and déparier
dépariez, déparions
distancer and distancier
distanciez, distancions
parer and parier
pariez, parions
rader and radier
radiez, radions
raller and rallier
ralliez, rallions
référencer and référencier
référenciez, référencions

Case 3: other cases with the first group

I removed several entries from the list which were forms ambiguous between two alternative spellings of the same first group infinitive, like "interpeler" and "interpeller". Sometimes it was a closer call ("rengrener" and "rengréner"). However, there seems to be no doubt that "taveler" and "taveller" are very different verbs that just happen to share a large part of their conjugation:

taveler and taveller
tavellent, tavelleraient, tavellerais, tavellerai, tavellerait, tavelleras, tavellera, tavellerez, tavelleriez, tavellerions, tavellerons, tavelleront, tavelles, tavelle

Then there is the case of verbs where the third person plural of the simple past tense indicative matches another verb with an additional -er-:

galérer and galer
galèrent
maniérer and manier
manièrent
lacérer and lacer
lacèrent

And there are two more cases within the first group, after which point all cases will include at least a verb of the second or third group; one of them includes the funny verb "raller" (which, in the sense of "to go again", is putatively conjugated as the very irregular aller), we will meet it again later:

capéer and caper
capée, capées
raller and railler
raille, railles, raillent

Case 4: non-homophones

These verbs are exceptional because the forms are ambiguous in writing but are pronounced differently, so they also appear in my list of French non-homophonous homographs.

obvenir and obvier
obvient
convenir and convier
convient
pressentir and presser
pressent
surfaire and surfer
surfais, surfait, surfassent, surfasses, surfasse, surfassiez, surfassions, surferaient, surferais, surferai, surferait, surferas, surfera, surferez, surferiez, surferions, surferons, surferont

Case 5: être

The unique conjugation of "être" has three forms which overlap with other verbs. Too bad.

être and suivre
suis
être and sommer
sommes
être and étayer
étaient

Case 6: large overlaps

In some cases a third group verb's conjugation is very irregular but looks like that of a regular first group verb. For example, "peindre" often looks like "peigner":

peindre and peigner
peignaient, peignais, peignait, peignant, peignent, peigne, peignes, peignez, peigniez, peignions, peignons
dépeindre and dépeigner
dépeignaient, dépeignais, dépeignait, dépeignant, dépeigne, dépeignent, dépeignes, dépeignez, dépeigniez, dépeignions, dépeignons
repeindre and repeigner
repeignaient, repeignais, repeignait, repeignant, repeignent, repeigne, repeignes, repeignez, repeigniez, repeignions, repeignons

Or "raire" and "rayer":

raire and rayer
raient, raie, raies, rayaient, rayais, rayait, rayant, rayez, rayiez, rayions, rayons
braire and brayer
braie, braient, braies, brayaient, brayais, brayait, brayant, brayez, brayiez, brayions, brayons

Or "ouvrir" and "ouvrer":

ouvrir and ouvrer
ouvraient, ouvrais, ouvrait, ouvrant, ouvrent, ouvre, ouvres, ouvrez, ouvriez, ouvrions, ouvrons
recouvrir and recouvrer
recouvraient, recouvrais, recouvrait, recouvrant, recouvrent, recouvre, recouvres, recouvrez, recouvriez, recouvrions, recouvrons

Or "faillir" and "failler", "saillir" and "sailler" (but pay attention to the fact that those two cases are slightly different because "faillir" and "saillir" do not follow the same pattern):

faillir and failler
faillaient, faillais, faillait, faillant, faillent, faillez, failliez, faillions, faillons
saillir and sailler
saillaient, saillait, saillant, saillent, sailleraient, saillerait, saillera, sailleront, saille

Or, well, a bunch of other cases:

fondre and fonder
fondaient, fondais, fondait, fondant, fonde, fondent, fondes, fondez, fondiez, fondions, fondons
refondre and refonder
refondaient, refondais, refondait, refondant, refonde, refondent, refondes, refondez, refondiez, refondions, refondons
moudre and mouler
moulaient, moulais, moulait, moulant, moule, moulent, moules, moulez, mouliez, moulions, moulons
remoudre and remouler
remoulaient, remoulais, remoulait, remoulant, remoule, remoulent, remoules, remoulez, remouliez, remoulions, remoulons
vermoudre and vermouler
vermoulaient, vermoulais, vermoulait, vermoulant, vermoule, vermoulent, vermoules, vermoulez, vermouliez, vermoulions, vermoulons
mouvoir and mouver
mouvaient, mouvais, mouvait, mouvant, mouvez, mouviez, mouvions, mouvons
venir and vener
venaient, venais, venait, venant, venez, veniez, venions, venons
matir and mater
mataient, matais, matait, matant, mate, matent, mates, matez, matiez, mations, matons
mouvoir and musser
musse, mussent, musses, mussiez, mussions
choir and cherrer
cherra, cherrai, cherraient, cherrais, cherrait, cherras, cherrez, cherriez, cherrions, cherrons
savoir and saurer
sauraient, saurai, saurais, saurait, saura, sauras, saurez, sauriez, saurions, saurons
médire and médiser
médisaient, médisais, médisait, médisant, médise, médisent, médises, médisez, médisiez, médisions, médisons
choir and choyer
choient, choyant, choyez, choyons

Case 7: slight overlaps

In some cases, the overlap is just on one form (here, the first and second person present indicative of "paraître" versus the imperfect indicative of "parer", not the third person because of the "î"):

paraître and parer
parais
comparaître and comparer
comparais

Or a first group verb with a misplaced -r- ends up sharing its third person plural present indicative with the third person plural simple past indicative of an irregular verb:

mettre and mirer
mirent
admettre and admirer
admirent
voir and virer
virent
revoir and revirer
revirent
moudre and moulurer
moulurent
devoir and durer
durent
mouvoir and murer
murent

Or the third group verb's feminine past participle has a tempting "-e" ending which makes it look like the present indicative tense of a perfectly regular verb. Do not miss the triple ambiguity between "paître", "pouvoir" and "puer": along with "bruir"/"bruire"/"bruisser", it is one of only two cases of triple ambiguity in the whole list:

prendre and priser
prise, prises
déprendre and dépriser
déprise, déprises
méprendre and mépriser
méprise, méprises
reprendre and repriser
reprise, reprises
cuire and cuiter
cuite, cuites
médire and méditer
médite, médites
feindre and feinter
feinte, feintes
teindre and teinter
teinte, teintes
mettre and miser
mise, mises
remettre and remiser
remise, remises
remplir and remplier
remplie, remplies
décroître and décruer
décrue, décrues
mouvoir and muer
mue, mues
paître and pouvoir and puer
pue, pues
savoir and suer
sue, sues
joindre and jointer
jointe, jointes
traire and traiter
traite, traites
taire and tuer
tue, tues

We mentioned "faillir" and "failler" above, but the list would be incomplete without:

faillir and falloir
faut
failler and falloir
faille

And we still have a few more:

rentraire and rentrer
rentraient, rentrais, rentrait
ailler and aller
aille, aillent, ailles

Case 8: second and third group

Those last cases only involve second and third group verbs, and thus differ from all of the preceding ones (except the "bruir"/"bruire" case handled with "bruisser" above, the "être"/"suivre" case above, and the "paître"/"pouvoir" subcase).

vivre and voir
vis, vit
revivre and revoir
revis, revit
croire and croître
crois, cru, crue, crues, crûmes, crurent, crus, crusse, crussent, crusses, crussiez, crussions, crut, crût, crûtes
rasseoir and rassir
rassîmes, rassirent, rassis, rassise, rassises, rassissent, rassisse, rassisses, rassissiez, rassissions, rassîtes, rassit, rassît
plaire and pleuvoir
plue, plues, plu, plus, plut, plût, plurent
paître and pouvoir
pu, pus
raller and rire
rira, rirai, riraient, rirais, rirait, riras, rirez, ririez, ririons, rirons, riront

The complete list

Here is the complete list, in case you want to process it automatically:

admirent
affiliez
affilions
aille
aillent
ailles
alliez
allions
bavasse
bavassent
bavasses
bavassiez
bavassions
braie
braient
braies
brayaient
brayais
brayait
brayant
brayez
brayiez
brayions
brayons
brouillasse
brouillassent
brouillasses
brouillassiez
brouillassions
bruissaient
bruissais
bruissait
bruissant
bruisse
bruissent
bruisses
bruissez
bruissiez
bruissions
bruissons
bruit
brumasse
caillasse
caillassent
caillasses
caillassiez
caillassions
capée
capées
cherra
cherrai
cherraient
cherrais
cherrait
cherras
cherrez
cherriez
cherrions
cherrons
choient
choyant
choyez
choyons
coloriez
colorions
comparais
convient
crevasse
crevassent
crevasses
crevassiez
crevassions
crois
cru
crue
crues
crûmes
crurent
crus
crusse
crussent
crusses
crussiez
crussions
crut
crût
crûtes
cuite
cuites
damasse
damassent
damasses
damassiez
damassions
débarrasse
débarrassent
débarrasses
débarrassiez
débarrassions
décrue
décrues
dégueulasse
dégueulassent
dégueulasses
dégueulassiez
dégueulassions
dépariez
déparions
dépeignaient
dépeignais
dépeignait
dépeignant
dépeigne
dépeignent
dépeignes
dépeignez
dépeigniez
dépeignions
dépeignons
déprise
déprises
distanciez
distancions
durent
embarrasse
embarrassent
embarrasses
embarrassiez
embarrassions
encrasse
encrassent
encrasses
encrassiez
encrassions
enliasse
enliassent
enliasses
enliassiez
enliassions
entasse
entassent
entasses
entassiez
entassions
étaient
faillaient
faillais
faillait
faillant
faille
faillent
faillez
failliez
faillions
faillons
faut
feinte
feintes
fondaient
fondais
fondait
fondant
fonde
fondent
fondes
fondez
fondiez
fondions
fondons
frimasse
galèrent
grognasse
grognassent
grognasses
grognassiez
grognassions
jointe
jointes
lacèrent
manièrent
mataient
matais
matait
matant
mate
matent
mates
matez
matiez
mations
matons
médisaient
médisais
médisait
médisant
médise
médisent
médises
médisez
médisiez
médisions
médisons
médite
médites
méprise
méprises
mirent
mise
mises
mouillasse
moulaient
moulais
moulait
moulant
moule
moulent
moules
moulez
mouliez
moulions
moulons
moulurent
mouvaient
mouvais
mouvait
mouvant
mouvez
mouviez
mouvions
mouvons
mue
mues
murent
musse
mussent
musses
mussiez
mussions
obvient
ouvraient
ouvrais
ouvrait
ouvrant
ouvre
ouvrent
ouvres
ouvrez
ouvriez
ouvrions
ouvrons
parais
pariez
parions
pâtissaient
pâtissais
pâtissait
pâtissant
pâtisse
pâtissent
pâtisses
pâtissez
pâtissiez
pâtissions
pâtissons
peignaient
peignais
peignait
peignant
peigne
peignent
peignes
peignez
peigniez
peignions
peignons
plu
plue
plues
plurent
plus
plut
plût
pressent
prise
prises
pu
pue
pues
pus
radiez
radions
raie
raient
raies
raille
raillent
railles
ralliez
rallions
ramasse
ramassent
ramasses
ramassiez
ramassions
rassîmes
rassirent
rassis
rassise
rassises
rassisse
rassissent
rassisses
rassissiez
rassissions
rassit
rassît
rassîtes
rayaient
rayais
rayait
rayant
rayez
rayiez
rayions
rayons
recouvraient
recouvrais
recouvrait
recouvrant
recouvre
recouvrent
recouvres
recouvrez
recouvriez
recouvrions
recouvrons
référenciez
référencions
refondaient
refondais
refondait
refondant
refonde
refondent
refondes
refondez
refondiez
refondions
refondons
remise
remises
remoulaient
remoulais
remoulait
remoulant
remoule
remoulent
remoules
remoulez
remouliez
remoulions
remoulons
remplie
remplies
rentraient
rentrais
rentrait
repeignaient
repeignais
repeignait
repeignant
repeigne
repeignent
repeignes
repeignez
repeigniez
repeignions
repeignons
reprise
reprises
rêvasse
rêvassent
rêvasses
rêvassiez
rêvassions
revirent
revis
revisse
revissent
revisses
revissiez
revissions
revit
rira
rirai
riraient
rirais
rirait
riras
rirez
ririez
ririons
rirons
riront
saillaient
saillait
saillant
saille
saillent
saillera
sailleraient
saillerait
sailleront
saura
saurai
sauraient
saurais
saurait
sauras
saurez
sauriez
saurions
saurons
sommes
sue
sues
suis
surfais
surfait
surfasse
surfassent
surfasses
surfassiez
surfassions
surfera
surferai
surferaient
surferais
surferait
surferas
surferez
surferiez
surferions
surferons
surferont
tapissaient
tapissais
tapissait
tapissant
tapisse
tapissent
tapisses
tapissez
tapissiez
tapissions
tapissons
tavelle
tavellent
tavellera
tavellerai
tavelleraient
tavellerais
tavellerait
tavelleras
tavellerez
tavelleriez
tavellerions
tavellerons
tavelleront
tavelles
teinte
teintes
terrasse
terrassent
terrasses
terrassiez
terrassions
tournasse
tournassent
tournasses
tournassiez
tournassions
traînasse
traînassent
traînasses
traînassiez
traînassions
traite
traites
tue
tues
venaient
venais
venait
venant
venez
veniez
venions
venons
vermoulaient
vermoulais
vermoulait
vermoulant
vermoule
vermoulent
vermoules
vermoulez
vermouliez
vermoulions
vermoulons
vernissaient
vernissais
vernissait
vernissant
vernisse
vernissent
vernisses
vernissez
vernissiez
vernissions
vernissons
virent
vis
visse
vissent
visses
vissiez
vissions
vit
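
If you do want to process this kind of data automatically, the overlaps listed in the cases above can be found by inverting a table that maps each verb to its forms. The following Python sketch shows the idea on a tiny hand-written excerpt (the function name and the sample table are mine, and the sample is nowhere near a full conjugation table):

```python
from collections import defaultdict


def ambiguous_forms(conjugations):
    """Map each form shared by at least two verbs to the set of those verbs."""
    verbs_by_form = defaultdict(set)
    for verb, forms in conjugations.items():
        for form in forms:
            verbs_by_form[form].add(verb)
    return {form: verbs for form, verbs in verbs_by_form.items() if len(verbs) > 1}


# Hand-written excerpt for illustration only, not complete conjugations.
SAMPLE = {
    "être": {"suis", "es", "est", "sommes"},
    "suivre": {"suis", "suit", "suivons"},
    "sommer": {"somme", "sommes", "sommons"},
}

for form, verbs in sorted(ambiguous_forms(SAMPLE).items()):
    print(form, sorted(verbs))
```

On this excerpt, the only forms reported are "suis" and "sommes", which matches the "être" case above.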

Even more Kobo hacking

— updated

I broke my Kobo Touch (the screen was damaged, probably because the device was crushed against something in a bag, interesting to know that you ought to be careful with it), and bought a Kobo Glo (model N613) to replace it. Here is some info about hacks I've done.

Old stuff

You can check my original post for details about what needs to be done at first, I'm just going to allude to it. You need to fake activation in the usual way, though you might need some more clever choices to make it look plausible for the Kobo (caution though, the last column mentioned by those instructions did not exist in my sqlite file). Install the latest firmware, prepare a fake update to activate a telnet daemon, and get root. Install dropbear, edit /etc/hosts. From the contents of /mnt/onboard/.kobo/Kobo/Kobo eReader.conf I feel safer adding the following to my previous list:

0.0.0.0 www.kobobooks.com webstore.kobobooks.com webstore2.kobobooks.com
0.0.0.0 secure.kobobooks.com ecimages.kobobooks.com social.kobobooks.com
0.0.0.0 partner.kobobooks.com mobilepartner.kobobooks.com

There is no home button anymore, but factory reset can be performed by booting while pressing the light button (the LED will turn to red). Reset button is still here. Interestingly, you need to press the reset button to reboot when nickel is dead, a long press on the power switch will not be enough like I think it used to be on the Touch. (Remember that nickel is the Kobo's proprietary frontend software.) So... have a paperclip ready whenever you kill nickel, or be sure to always use busybox reboot from the shell (and not to drop the connection, of course...).

Connecting to the device via USB

A useful trick from here. Just add the following at the end of /etc/init.d/rcS:

busybox insmod /drivers/ntx508/usb/gadget/arcotg_udc.ko
busybox insmod /drivers/ntx508/usb/gadget/g_ether.ko

Add the following at the end of /usr/local/Kobo/udev/ac and /usr/local/Kobo/udev/plug:

/sbin/ifconfig usb0 192.168.2.2

You should now connect the device to your computer, issue ifconfig usb0 192.168.2.1 on the computer, and connect to 192.168.2.2. Depending on your network connection manager and the phase of the moon, it might help to rerun this command "occasionally" (I did it every 2 seconds or so).

Interestingly, this trick does not interfere with the proper workings of nickel, though it will prevent you from mounting /mnt/onboard as UMS.

Putting an offline copy of Wikipedia on the device

I find it pretty cool to have a copy of the entire Wikipedia on my device. I managed to do so using Kiwix, which is comparatively easy, but then some effort is needed to use the built-in browser in offline mode.

Retrieve the ZIM file corresponding to the Wikipedia that you want from this page. For the English Wikipedia without images, the onboard storage of the device will not be sufficient, and you will need a MicroSD card. If the ZIM file is over 4 GB, you will not be able to put it on a FAT32 filesystem. This is not a problem for the Linux kernel running on the device, of course, but by default the device will complain unless the first partition of the SD card is a FAT partition.

Fortunately, this isn't managed by nickel and we can do things properly. The file to edit is /usr/local/Kobo/udev: for instance, you can add mount /dev/mmcblk1p2 /mnt/wikipedia before the dosfsck command and umount -l /mnt/wikipedia after the umount command. This assumes that your Wikipedia SD card has a first FAT partition and a second partition containing Wikipedia, and will mount the Wikipedia partition on /mnt/wikipedia (or fail silently if you insert a card with no suitable second partition). You can tune this to your liking. Once you're done, reboot the device and check that your Wikipedia ZIM file is indeed visible at the expected location at boot.

We now need a tool to browse the ZIM file. Fortunately, the Kiwix project has a very nice tool called kiwix-serve which runs as an HTTP server to serve the content of the dump (unlike lots of other offline Wikipedia tools which insist on serving the content with their own crappy user interface that we couldn't use here even if we wanted to). What's even more fortunate, there are ARM binaries of the Kiwix tools available, so we won't need to cross-compile. Retrieve an ARM build of Kiwix from this page. Transfer it to the device (say in /root), and add the following at the end of /etc/init.d/rcS to run the HTTP server:

(sleep 10; /root/kiwix-serve --port=80 /mnt/wikipedia/wikipedia_en_all_nopic_01_2012.zim) &

For convenience, add 127.0.0.1 a to /etc/hosts to make access to localhost easier. It seems that everything's been taken care of and that we just have to access "a" (i.e. localhost) from nickel's built-in web browser in Settings -> Extras... except that, as you will be pleased to notice, this won't work because nickel will require you to connect to a Wifi network to use the browser, even if what you want to do is just access localhost. Damn. Damn!

It seems that the only way around this extremely annoying misfeature is to reverse-engineer and patch nickel. What follows is not my own work (although it's not available online elsewhere to the best of my knowledge): I am extremely grateful to Glyn from Oxford Hackspace who managed to achieve this while I generously volunteered subtly misleading information to make his job a bit harder.

The relevant file to edit is /usr/local/Kobo/libnickel.so. If you have firmware version 2.5.2 (i.e., the SHA1 sum of your copy of this file is 4c3d7d8cdce4927cbffbde8d3d4c6b7bd35de5c1) and you are in a hurry, you can just grab this file and apply it with bspatch to your libnickel.so file (keep a backup copy of the original file to restore it if things go wrong!), and reboot your Kobo, and hopefully things should work. If you're not in a hurry or don't have the same version, I will go into some detail about how this patch was prepared, so that the process can still be applied to different versions of the firmware (assuming that this part of the code doesn't change too much between versions).
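
Before applying the patch, it may be worth checking that your copy really has the expected SHA1 sum. Here is a minimal Python sketch of that check, to be run on your computer against a copy of the file pulled from the device (the function names are mine, and the path you pass is wherever you saved that copy):

```python
import hashlib

# SHA1 sum quoted above for libnickel.so on firmware 2.5.2.
EXPECTED_SHA1 = "4c3d7d8cdce4927cbffbde8d3d4c6b7bd35de5c1"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so that a large library is not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_version(path, expected=EXPECTED_SHA1):
    """Tell whether the file at `path` has the SHA1 sum the patch was made for."""
    return sha1_of_file(path) == expected
```

If is_expected_version returns False for your copy, the ready-made patch presumably does not apply and the manual procedure below is the way to go.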

We will need to patch at two places. First, in the function _ZN23WirelessWorkflowManager11openBrowserERK4QUrl that is invoked when opening the browser from settings, we need to work around an attempt to connect to a Wifi network. On firmware 2.5.2, the objdump output looks like this:

007e52a4 <_ZN23WirelessWorkflowManager11openBrowserERK4QUrl>:
  7e52a4:       e92d4070        push    {r4, r5, r6, lr}
  7e52a8:       e1a04000        mov     r4, r0
  7e52ac:       e24dd008        sub     sp, sp, #8
  7e52b0:       e1a06001        mov     r6, r1
  7e52b4:       e59f5048        ldr     r5, [pc, #72]   ; 7e5304 <_ZN23WirelessWorkflowManager11openBrowserERK4QUrl+0x60>
  7e52b8:       ebf24996        bl      477918 <_init+0x256b8>
  7e52bc:       e1a01006        mov     r1, r6
  7e52c0:       e2840010        add     r0, r4, #16
  7e52c4:       ebf262a1        bl      47dd50 <_init+0x2baf0>
  7e52c8:       e59f1038        ldr     r1, [pc, #56]   ; 7e5308 <_ZN23WirelessWorkflowManager11openBrowserERK4QUrl+0x64>
  7e52cc:       e59f3038        ldr     r3, [pc, #56]   ; 7e530c <_ZN23WirelessWorkflowManager11openBrowserERK4QUrl+0x68>
  7e52d0:       e08f5005        add     r5, pc, r5
  7e52d4:       e0851001        add     r1, r5, r1
  7e52d8:       e3a0c080        mov     ip, #128        ; 0x80
  7e52dc:       e0853003        add     r3, r5, r3
  7e52e0:       e1a00004        mov     r0, r4
  7e52e4:       e1a02004        mov     r2, r4
  7e52e8:       e58dc000        str     ip, [sp]
  7e52ec:       ebf1ebab        bl      4601a0 <_init+0xdf40>
  7e52f0:       e1a00004        mov     r0, r4
  7e52f4:       e3a01001        mov     r1, #1
  7e52f8:       e28dd008        add     sp, sp, #8
  7e52fc:       e8bd4070        pop     {r4, r5, r6, lr}
  7e5300:       eaf252c3        b       479e14 <_init+0x27bb4>
  7e5304:       00e7eae8        rsceq   lr, r7, r8, ror #21
  7e5308:       ffce57e8                        ; <UNDEFINED> instruction: 0xffce57e8
  7e530c:       ffd28838                        ; <UNDEFINED> instruction: 0xffd28838

That last jump at 0x7e5300 needs to be changed to jump instead to a function called _ZN23WirelessWorkflowManager25openBrowserAfterConnectedEv that is located at offset 0x7e0e88 in version 2.5.2. So, the binary must be patched to change the four bytes at position 0x7e5300 to branch instead to the previously mentioned function. This should involve leaving the fourth byte at 0xea (the opcode for an unconditional jump without link) and changing the first three bytes to the correct offset.
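
To work out the replacement bytes, recall that an ARM branch stores a signed 24-bit word offset relative to the address of the instruction plus 8. The following Python sketch computes the encoding (the function names are mine, and it assumes that the addresses shown by objdump are the ones to use, as in the text above). It first checks itself against the existing instruction from the listing:

```python
def encode_arm_branch(source, target):
    """Encode an unconditional ARM `b` placed at `source` that jumps to `target`."""
    # In ARM state the PC reads as the instruction address plus 8,
    # and the 24-bit immediate counts 4-byte words.
    delta = target - (source + 8)
    if delta % 4:
        raise ValueError("source and target must differ by a multiple of 4")
    words = delta >> 2
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target out of range for a 24-bit branch offset")
    # 0xEA is the condition "always" followed by the branch-without-link opcode.
    return 0xEA000000 | (words & 0x00FFFFFF)


def little_endian_bytes(word):
    """Return the four bytes of `word` in the order they appear in the file."""
    return word.to_bytes(4, "little")


# Sanity check against the original instruction in the listing: b 479e14.
assert encode_arm_branch(0x7E5300, 0x479E14) == 0xEAF252C3

patched = encode_arm_branch(0x7E5300, 0x7E0E88)
print(hex(patched), little_endian_bytes(patched).hex())
```

For the 2.5.2 addresses this gives the instruction 0xeaffeee0, i.e. the bytes e0 ee ff ea in file order, which indeed leaves the fourth byte at 0xea. Disassembling the patched file again is a cheap way to double-check the result.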

Second, even once the browser has been started, there will be regular checks for an Internet connection. The incriminated function is _ZN19N3BrowserController15checkConnectionEv which looks like this:

00a80914 <_ZN19N3BrowserController15checkConnectionEv>:
  a80914:       e92d4070        push    {r4, r5, r6, lr}
  a80918:       e1a04000        mov     r4, r0
  a8091c:       ebe75f20        bl      4585a4 <_init+0x6344>
  a80920:       e5901000        ldr     r1, [r0]
  a80924:       e5913030        ldr     r3, [r1, #48]   ; 0x30
  a80928:       e12fff33        blx     r3
  a8092c:       e3500000        cmp     r0, #0
  a80930:       18bd8070        popne   {r4, r5, r6, pc}
  a80934:       ebe75f1a        bl      4585a4 <_init+0x6344>
  a80938:       ebe7ecbe        bl      47bc38 <_init+0x299d8>
  a8093c:       e3500000        cmp     r0, #0
  a80940:       1a000004        bne     a80958 <_ZN19N3BrowserController15checkConnectionEv+0x44>
  a80944:       e594002c        ldr     r0, [r4, #44]   ; 0x2c
  a80948:       e3500000        cmp     r0, #0
  a8094c:       0a000005        beq     a80968 <_ZN19N3BrowserController15checkConnectionEv+0x54>
  a80950:       e8bd4070        pop     {r4, r5, r6, lr}
  a80954:       eae787ce        b       462894 <_init+0x10634>
  a80958:       ebe75f11        bl      4585a4 <_init+0x6344>
  a8095c:       e3a01001        mov     r1, #1
  a80960:       e8bd4070        pop     {r4, r5, r6, lr}
  a80964:       eae7e52a        b       479e14 <_init+0x27bb4>
  a80968:       e3a00014        mov     r0, #20
  a8096c:       ebe77415        bl      45d9c8 <_init+0xb768>
  a80970:       e1a01004        mov     r1, r4
  a80974:       e1a05000        mov     r5, r0
  a80978:       ebe771d8        bl      45d0e0 <_init+0xae80>
  a8097c:       e594202c        ldr     r2, [r4, #44]   ; 0x2c
  a80980:       e1520005        cmp     r2, r5
  a80984:       01a00005        moveq   r0, r5
  a80988:       0afffff0        beq     a80950 <_ZN19N3BrowserController15checkConnectionEv+0x3c>
  a8098c:       e284002c        add     r0, r4, #44     ; 0x2c
  a80990:       e1a01005        mov     r1, r5
  a80994:       ebe7b2ba        bl      46d484 <_init+0x1b224>
  a80998:       e594002c        ldr     r0, [r4, #44]   ; 0x2c
  a8099c:       eaffffeb        b       a80950 <_ZN19N3BrowserController15checkConnectionEv+0x3c>
  a809a0:       e1a04000        mov     r4, r0
  a809a4:       e1a00005        mov     r0, r5
  a809a8:       ebe77124        bl      45ce40 <_init+0xabe0>
  a809ac:       e1a00004        mov     r0, r4
  a809b0:       ebe79be5        bl      46794c <_init+0x156ec>

The first blx instruction is calling something else to check if a WiFi connection exists, and the cmp is checking its return value. We need to alter the next instruction so that the result of this comparison is ignored, so that the pop proceeds unconditionally. So, the binary must be patched to change the four bytes starting at 0xa80930 to make the pop unconditional. This amounts to changing the fourth byte from 0x18 to 0xe8.
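
In ARM instructions, the condition lives in the top four bits of the word, so making an instruction unconditional means setting those bits to 0xe. Here is a small Python sketch of this change and of a careful in-place patch (the function names are mine, the demonstration runs on a dummy buffer rather than on the real library, and using the listed address as a file position follows the wording above):

```python
def make_unconditional(word):
    """Replace the condition field (top 4 bits) of an ARM instruction by 'always'."""
    return (word & 0x0FFFFFFF) | 0xE0000000


def patch_word(image, offset, expected_old, new):
    """Overwrite the little-endian word at `offset`, refusing if the old value differs."""
    current = int.from_bytes(image[offset:offset + 4], "little")
    if current != expected_old:
        raise ValueError(f"unexpected word {current:#010x} at {offset:#x}")
    image[offset:offset + 4] = new.to_bytes(4, "little")


POPNE = 0x18BD8070  # popne {r4, r5, r6, pc}, taken from the listing
assert make_unconditional(POPNE) == 0xE8BD8070

# Demonstration on a small in-memory buffer standing in for the library.
demo = bytearray(8)
demo[4:8] = POPNE.to_bytes(4, "little")
patch_word(demo, 4, POPNE, make_unconditional(POPNE))
print(demo.hex())
```

Checking the old value before writing is there to avoid silently patching the wrong place on a different firmware version.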

Now that this has been taken care of, the setup is pretty usable. Just connect to "a" and you should be all set. A word of caution, though: if the browser can't connect to the HTTP server (e.g. if kiwix-serve isn't running properly), it will fail silently. The static Wikipedia copy that you can browse in this way looks like what you would expect from running kiwix-serve on your own machine. I did not try to generate a full-text index, but maybe it is possible to use one (see kiwix-index), though it does seem to take up a lot of space. Oh, another caveat: the article names in the search box need to be typed with an initial capital, because kiwix-serve is too dumb to figure things out otherwise.

Debian chroot

You can install Debian into an image and chroot there, which makes it easier to install software thanks to Debian's package management system. What's more, you can even get an X server to run on the device, including touchscreen management. I won't go into the details of this because it's been covered elsewhere. I will just mention that you might have some commands fail because you need to tweak the PATH and LD_LIBRARY_PATH, i.e. something like:

export PATH="$PATH:/usr/local/sbin:/usr/local/bin:/sbin:/usr/bin:/usr/sbin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/lib/arm-linux-gnueabi"

To get the X server running correctly, you should also replace the binaries in the /fb_update folder of the Debian chroot by those from this post. A very fun thing to do is to install openssh-server (don't forget to change its port to something other than 22 if you have the dropbear ssh server running outside the chroot!), install x2x, and, from your PC, assuming that the sshd port is 2222 and the Kobo's IP is 192.168.2.2, run:

ssh -XC -p 2222 root@192.168.2.2 x2x -north -to :0

This allows you to control the Kobo's X server using your computer's keyboard and mouse. You could conceivably use it as some sort of additional screen (keeping in mind that this is only about input events, you couldn't "move" programs from your computer to the Kobo and back).

This shows that there is basically no limit to how much you could tweak your Kobo's interface. For instance, people on the Mobileread forums have gotten alternative PDF readers to run, e.g. to get reflow functionality.

From now on, I'll assume that you have a Debian chroot, so that you can easily install Debian programs using apt-get and run them in the chroot. If you don't, you can still download the armel packages from packages.debian.org, extract them (using ar) and install their contents manually.

Wifi AP

Something that I find funny is the idea of using your Kobo's WiFi to serve an access point. The ability to do so depends on the exact WiFi hardware used, but on my hardware at least it works.

To enable an open WiFi AP, ensure that you are running on the USB connection as described above so as not to kill your connection, and issue:

busybox insmod /drivers/ntx508/wifi/dhd.ko
busybox insmod /drivers/ntx508/wifi/sdio_wifi_pwr.ko
pkill wpa_supplicant
wlarm_le up
wlarm_le ap 1
wlarm_le ssid your_ssid

You can also use wlarm_le cur_etheraddr to change your MAC address. While we're at it, I also advise you to run wlarm_le | less and marvel at all those options. I'm not sure they are all operational. For instance I didn't manage to set it in monitor mode for iwconfig's purposes, even though it seems that, when enabling promiscuous and monitor mode and running tcpdump -i eth0, you can receive traffic on the channel that is not addressed to you -- but it looks like this confuses the hell out of tcpdump because the MAC addresses and ethertype appear garbled. I'm a bit curious about wlarm_le PM as it seems the power saving mode isn't being enabled by default for some reason.

The AP will not be very useful unless you install something to serve DHCP leases. You can apt-get install dnsmasq as described here, adjusting the addresses so as not to conflict with that of the USB network connection. However, the sad truth is that the Kobo's kernel has no support for iptables, which severely restricts what you can do to intercept network connections (e.g. to redirect people in a user-friendly way to the web services running on the Kobo), or to relay them (e.g. bridging the wireless and USB interfaces to share your computer's Ethernet connection by creating a WiFi access point with your Kobo would be very nice...). I have tried to cross-compile iptables as a module for the Kobo's kernel, but so far I have mostly failed. I'll do a followup if I manage to achieve something there.

How to compile wl

You don't need to do that -- I just didn't realize at once that a functional wl was provided on the device. I'm just providing this for reference.

Retrieve the source archive from the Kobo github repository. We will need the wl tool from this repository to enable SoftAP mode. Sadly, the precompiled version segfaults, so we will need to cross-compile our own.

I will assume you are running Debian. Install the emdebian-archive-keyring package. Add the following to /etc/apt/sources.list, adapting for your version:

deb http://www.emdebian.org/debian wheezy main

Run apt-get update, and install g++-4.7-arm-linux-gnueabi and xapt. Extract the archive that you downloaded above, go in src/wl/exe, and cross-compile:

make -f GNUmakefile CC=arm-linux-gnueabi-gcc-4.7

Learning the gender of French nouns

— updated

The gender of French nouns is a pain for foreigners and even occasionally for native speakers. Learners of French usually rely (besides rote learning) on rules that classify words as masculine or feminine depending on their ending. In this post, I present what happens if you try to derive the minimal set of rules to determine the gender of a French noun from its ending. (In brief: it doesn't give a very compact set of rules because there are too many exceptions.)

The problem that we will study is: given a French noun, determine its gender. Let us start by taking the database from Lexique and keep the words that:

  • are not "derived" forms (e.g., plurals);
  • are nouns;
  • are either masculine or feminine (there's not much I can say about nouns that can be both, except that most of them are words like "journaliste" or "enfant" that are used to refer to people so the gender to choose is usually clear depending on the person you're referring to);
  • do not contain spaces or hyphens (because the gender of such words is usually determined from the component words, so a strategy that looks at their ending will not work well);
  • do not contain dots (remove pesky abbreviations).

Here is the code I use (see lexique.org to obtain lexique, and note that I use a custom version with some errors fixed by hand so your result may differ slightly).

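cut -f1,4,5,14 lexique |   # keep the four columns used below
  grep '1$' |              # keep non-derived forms (last kept column is 1)
  cut -f1,2,3 |
  grep NOM |               # keep nouns
  grep  '[mf]$' |          # keep words that are either masculine or feminine
  cut -f1,3 |              # keep only the word and its gender
  grep -v ' ' |            # drop words containing spaces
  grep -v -- '[-\.]' > nouns.txt   # drop words containing hyphens or dots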

Now, we need to find rules to predict the label ('m' or 'f') of nouns in the input list, in a manner that is as concise as possible. To do so, we said that we would try to determine the label by reading the noun starting from its ending. I will describe what we want to do with an example. Suppose we get an unknown noun and start reading it. The last letter is 'e'. At this point we don't know the gender, so we continue. The last two letters are 've'. We must continue still. The last three letters are 'uve'. This narrows down the set of possible nouns, but it can still be either masculine or feminine (think "fauve" vs "guimauve"). The last four letters are 'luve'. At this point, we know that all nouns ending in "luve" are masculine (there is only "effluve"), so we answer 'm'.

According to this example, we want a set of rules that says, for every possible suffix read so far, whether we can decide 'm' or 'f' or must continue reading. The set of rules should be minimal, which means that it should decide 'm' or 'f' as soon as possible (i.e., as soon as all nouns ending with this suffix have the same gender). Such classification strategies look a lot like deterministic finite automata, except they are acyclic. A more standard term is trie. With such strategies, you can determine the gender of all nouns of the list, and (hopefully) do a reasonable job for unknown nouns by answering based on the longest suffix that they share with a rule.
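
To make the notion of minimal rules concrete, here is a small Python sketch that computes them directly by grouping nouns by suffix, without building an explicit trie. This is not the haspirater code, just an illustration on a hand-picked toy list; the function name is mine, and the leading space is used as a beginning-of-word marker so that a full word can be told apart from longer words ending with it:

```python
from collections import defaultdict


def deciding_suffixes(nouns):
    """Return {suffix: gender} for the shortest suffixes that determine the gender.

    `nouns` maps each noun to 'm' or 'f'. A leading space in a suffix means
    "beginning of word", so " rive" only matches the full word "rive".
    """
    genders_by_suffix = defaultdict(set)
    for noun, gender in nouns.items():
        marked = " " + noun
        for start in range(len(marked) + 1):
            genders_by_suffix[marked[start:]].add(gender)

    rules = {}
    for noun, gender in nouns.items():
        marked = " " + noun
        # Try suffixes from the shortest (empty) to the longest (whole word).
        for start in range(len(marked), -1, -1):
            suffix = marked[start:]
            if len(genders_by_suffix[suffix]) == 1:
                rules[suffix] = gender
                break
    return rules


# Toy list for illustration, far smaller than the real nouns.txt.
TOY = {
    "effluve": "m", "fauve": "m", "guimauve": "f", "cuve": "f",
    "dérive": "f", "drive": "m", "rive": "f",
}

for suffix, gender in sorted(deciding_suffixes(TOY).items()):
    print(repr(suffix), gender)
```

On such a tiny list there are as many rules as nouns, but the rule for "effluve" is indeed "luve", as in the example above.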

Now, it turns out I already wrote some code to generate tries from examples, for my project about determining if an initial 'h' in a French word is aspirated or not. Let us reuse that.

So, let us reverse those nouns, and pass them to programs from the haspirater suite to compile the trie and obtain the leaves of the trie, namely, the suffixes at which a decision is taken (and sort them nicely).

rev nouns.txt |
  buildtrie.py |
  compresstrie.py |
  leavestrie.py -1 |
  rev |
  LC_ALL=C sort -k1,1 |
  rev > leaves.txt

Following our previous example, observe that the leaves.txt file contains a line for "luve" (line 4,417). This means that "luve" is a suffix at which we decide 'm', but all shorter suffixes ("uve", "ve", "e", "") were still ambiguous. An initial space in a word indicates "beginning of word" (when we read "rive" we don't know yet between, say, "dérive" and "drive", but if the full word is "rive" then we should decide 'f'). To determine the gender of a noun using this list, look at the line containing the longest suffix of the noun, and the first field of the line should be its gender. Note that the longest leaves in this file are "patriarche" and "matriarche", for which reading "atriarche" is still insufficient to decide (that illustrates that sometimes the relevant info isn't at the end of words...).
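
The lookup procedure just described can be sketched in Python as follows. The function name is mine, and the rules are given as a small hand-written dictionary from suffix to gender rather than parsed from leaves.txt, to avoid assuming anything about the file layout beyond what is said above:

```python
def guess_gender(noun, rules):
    """Return the gender given by the longest rule suffix matching `noun`, or None."""
    marked = " " + noun  # leading space = beginning-of-word marker
    for start in range(len(marked) + 1):  # longest suffix first
        suffix = marked[start:]
        if suffix in rules:
            return rules[suffix]
    return None


# Hand-written excerpt of rules for illustration only.
RULES = {"luve": "m", " rive": "f", "drive": "m", "érive": "f"}

for word in ["effluve", "rive", "dérive", "drive", "table"]:
    print(word, guess_gender(word, RULES))
```

With this excerpt, "rive" is matched by the rule with the initial space, whereas "table" matches no rule at all, so the function returns None for it.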

The leaves file has 7,032 lines, to be compared to the 24,839 initial nouns in the example list. Thus, the strategy of looking at word endings gives classification rules that are shorter than the full example list, but not by much. In a way, this result illustrates that rules telling you "words in -tion are feminine" and such will always lead to mistakes, unless you have a large number of them.

To see how bad this is, I tested a strategy which reads words from the beginning instead of from the end: it has 20,607 leaves, so reading from the end is definitely the better idea. Maybe different rules would be more helpful to classify (maybe using general decision trees without restricting the order of choices by saying "read from the end" or "read from the beginning"), but it doesn't seem that obvious to me.

If you ever learnt this list by heart (for instance using a spaced repetition system), you would know the gender of every French noun (except the ones with hyphens, except the ones missing from Lexique, except the ones in which both genders are possible depending on meaning, and accounting for possible errors in Lexique). I wouldn't recommend it, though, because of those caveats, and also because it still seems too long so there has to be a better way than what I did. If you still wanted to do it, though, it might be more convenient to use this file, in which I replaced the suffixes by one noun that matches this suffix (the one with the highest registered frequency in Lexique). So, if you know this last file by heart, your intuition for gender will be flawless, modulo the caveats and modulo the big assumption that your intuition proceeds by matching the longest suffix of the unknown word with a word that you know.

[Further work: looking at pronunciation instead of spelling (or in addition to it), give weights to the rules and rank them by weight, have a richer rule language (e.g., allow to give a fixed list of exceptions for each rule, which would seriously cut down the impact of pesky words like "cation")...]