a3nm's blog

Setting up SVN and git shared hosting on Debian

This note explains how to set up a shared Git and a shared SVN repository on a Debian stable (jessie) server, incidentally a Raspberry Pi. Both repositories are served with Apache, using mod_dav_svn for SVN, and using the git-http-backend CGI for git. (Note that these aren't the only possible choices, but they are the ones I will present.)

The setup uses Apache Digest authentication (with a file common to both SVN and git): the SVN and git setups are otherwise independent. The setup uses SSL provided with Let's Encrypt, set up with certbot. For brevity I am omitting sudo in all commands of this tutorial.

Note that I am dumping this from memory after having done the setup, but I'm not guaranteeing that everything will work. If something does not work, please email me.

Set up the domain names

If your main domain is example.com, then you probably want to set up svn.example.com and git.example.com to point to the machine where you will be hosting the repositories. This will allow you to configure each service easily using virtual hosts, and makes it easier to move the hosting to a different machine if you want.

You probably want to wait for the new DNS entries to have propagated before doing what follows, in particular for the Let's Encrypt challenge to work.

Install apache2 and certbot

apt-get install apache2 libapache2-svn libapache2-mod-svn
apt-get install certbot python-certbot-apache

Side note: certbot is not yet available on jessie, so you need to enable jessie-backports. As my Raspberry Pi uses Raspbian, this also means I had to set up the Debian signing keys: download the keys that apt-get update complains about and install them with apt-key add.

Obviously you should also make sure that SVN and git are installed on the server.

Start your apache2 configuration

Add to the beginning of /etc/apache2/apache2.conf the default hostname of your machine (of course, change it to something reasonable):

ServerName machine.example.com

While you are at it, you can add the following to the bottom to avoid disclosing details of your Apache version:

ServerTokens Prod
ServerSignature Off

Now, create a file /etc/apache2/sites-available/000-git.conf and add the following, changing the domain name and email address (don't worry about the DocumentRoot, it doesn't matter):

<VirtualHost *:80>
        ServerName git.example.com

        ServerAdmin you@example.com
        DocumentRoot /var/www/html/

        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

Create /etc/apache2/sites-available/000-svn.conf analogously (replacing git by svn). Then, enable the right sites with:

a2dissite 000-default
a2ensite 000-git
a2ensite 000-svn

Set up SSL with certbot

Run certbot --apache: it should pick up the configuration files that you created and understand what you are doing. Say that you want to enable encryption for both subdomains, that you accept the license, and that you want the Secure setup where users are redirected to the SSL version (we don't want the version control traffic ever to happen in cleartext). Check that https://git.example.com/ and https://svn.example.com/ both work with SSL, and that http:// URLs redirect to the https:// variant. Yes, it's just a 301 redirect, it's not HSTS, but to me this is good enough.

Create the user file

Authentication will be performed by Apache Digest, so create a password file with your own user using the following command (note that the second argument, the realm, is important because it is included in the digest):

htdigest -c /etc/vcs_passwd "example.com version control" yourlogin

Then, when you want to add more users, issue:

htdigest /etc/vcs_passwd "example.com version control" theirlogin

Ensure that the file is owned by www-data and only readable by them:

chown www-data:www-data /etc/vcs_passwd
chmod og-rwx /etc/vcs_passwd
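To see why the realm matters, here is a minimal sketch of how a line of the password file is built: each line has the form user:realm:hash, where the hash is the MD5 of the string user:realm:password. The function name digest_line and the sample password are placeholders for illustration only. In practice you should keep using htdigest, which prompts for the password instead of taking it on the command line.

```shell
# Print the line that the password file would contain for USER in REALM:
#   user:realm:MD5(user:realm:password)
# Illustration only: passing a password as an argument exposes it to other
# users of the machine (process list, shell history).
digest_line() {
    user=$1 realm=$2 password=$3
    hash=$(printf '%s:%s:%s' "$user" "$realm" "$password" | md5sum | cut -d' ' -f1)
    printf '%s:%s:%s\n' "$user" "$realm" "$hash"
}

# The same login and password give different hashes under different realms,
# so the AuthName in the Apache configuration must match the realm used here.
digest_line yourlogin "example.com version control" "not-a-real-password"
digest_line yourlogin "some other realm" "not-a-real-password"
```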

Setting up Subversion

Now that the common SSL and user setup is done, let's start by setting up Subversion. For reference, the part of the SVN Book about what we are trying to do is here. I prefer creating one global repository and using path-based authorization to give access to relevant subsets to various users.

Creating the Subversion repository and setting up authorization

Create the repository, e.g.:

mkdir -p /var/svn
svnadmin create /var/svn/repos

We will be using path-based authorization. Create a file /var/svn/paths with a content like the following (check the path-based authorization documentation for details). Caution, it appears that path specifications (between square brackets) do not work if you put a trailing slash, so do not put a trailing slash, as in the following:

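# The section headers below are illustrative placeholders: use your own paths
# (without a trailing slash).
[/proj1]
yourlogin = rw
close_friend = r

[/proj2]
close_friend = rw
other_friend = rw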

Give ownership of everything to Apache with:

chown -R www-data:www-data /var/svn
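Since a trailing slash in a path specification appears to make the rule silently ineffective, a quick check of the file can save some debugging. The following is a minimal sketch: the function name check_authz_headers is hypothetical, and the pattern only looks at section headers of the form [/path] or [repos:/path], leaving the root section [/] alone.

```shell
# Succeed if no section header ends with a slash (the root section [/] is
# allowed). Offending lines are printed with their line numbers.
check_authz_headers() {
    ! grep -nE '^\[([^]:]+:)?/.+/\][[:space:]]*$' "$1"
}

# Demonstration on a temporary file rather than on /var/svn/paths.
demo_paths=$(mktemp)
printf '[/proj1/]\nyourlogin = rw\n' > "$demo_paths"
check_authz_headers "$demo_paths" || echo "trailing slash found in a section header"
rm -f "$demo_paths"
```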

Configuring Apache

First, you should enable the necessary modules:

a2enmod dav_svn authz_svn auth_digest

Next, restart Apache with service apache2 restart.

Edit the file /etc/apache2/sites-enabled/000-svn-le-ssl.conf which was created by certbot, and add the following block in the VirtualHost block:

<Location />
        DAV svn
        SVNPath /var/svn/repos
        AuthType digest
        AuthName "example.com version control"
        AuthUserFile /etc/vcs_passwd
        AuthzSVNAccessFile /var/svn/paths
        Require valid-user
</Location>

Reload the Apache configuration with service apache2 reload. Hopefully the config should be accepted, and you should be able to check out the repository on a remote machine:

svn co https://svn.example.com/ myrepos

You should then be able to commit, etc.

This guide does not cover the task of setting up a backup (e.g., a periodic rsync) of the repository, i.e., the /var/svn folder.

Setting up git

We will be using git-http-backend, and create multiple repositories with one user group per repository.

For now, create the structure:

mkdir -p /var/git/repos
touch /var/git/groups

You should then make sure that this is readable by www-data.

Creating each git repository and setting up access rights

To create a git repository for project proj1, do:

cd /var/git/repos
git init --bare proj1
chown -R www-data:www-data .

Now, add to the file /var/git/groups a line to indicate who has the right to access this project (this is both read and write access):

proj1: yourlogin close_friend other_friend
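To double-check who can access what, the groups file is easy to query, as each line is a group name followed by a colon and a space-separated list of logins. Here is a small sketch. The function name in_group is hypothetical, and it assumes each project is declared on a single line as above.

```shell
# Succeed if USER is listed for PROJECT in a groups file with lines of the
# form "project: login1 login2 ...".
in_group() {
    groups_file=$1 project=$2 user=$3
    awk -v p="$project" -v u="$user" '
        BEGIN { found = 0 }
        $1 == p ":" { for (i = 2; i <= NF; i++) if ($i == u) found = 1 }
        END { exit !found }
    ' "$groups_file"
}

# Demonstration on a temporary file rather than on /var/git/groups.
demo_groups=$(mktemp)
printf 'proj1: yourlogin close_friend other_friend\n' > "$demo_groups"
in_group "$demo_groups" proj1 close_friend && echo "close_friend can access proj1"
rm -f "$demo_groups"
```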

Configuring Apache (initial configuration)

This only needs to be done the first time.

First, you should enable the necessary modules:

a2enmod macro authz_groupfile cgi auth_digest

And restart Apache.

Next, edit the file /etc/apache2/sites-enabled/000-git-le-ssl.conf and add, inside the VirtualHost block:

DocumentRoot /var/git/repos/
SetEnv GIT_PROJECT_ROOT /var/git/repos/
ScriptAlias / /usr/lib/git-core/git-http-backend/

<Macro Project $repos>
        <Location /$repos>
                AuthType digest
                AuthName "example.com version control"
                AuthUserFile /etc/vcs_passwd
                AuthGroupFile /var/git/groups
                Require group $repos
        </Location>
</Macro>

UndefMacro Project

Configuring Apache (for each project)

Now, each time you add a project, say proj1, you should add the following line just before the UndefMacro line:

Use Project proj1

And of course you should issue service apache2 reload. You should now be able to do the following:

git clone https://git.example.com/proj1
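If you add projects often, the edit to the Apache configuration can be scripted. The following is a rough sketch, not something I have battle-tested: add_project_line is a hypothetical helper which inserts a Use Project line just before the UndefMacro line of the file given as its first argument. You would still need to create the repository, add the line to /var/git/groups, and reload Apache.

```shell
# Insert "Use Project NAME" just before the "UndefMacro Project" line of CONF.
add_project_line() {
    conf=$1 project=$2
    tmp=$(mktemp) || return 1
    if awk -v p="$project" '
        /^[ \t]*UndefMacro[ \t]+Project/ { print "Use Project " p }
        { print }
    ' "$conf" > "$tmp"; then
        # Write back with cat so that the file keeps its owner and permissions.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}

# Demonstration on a temporary file rather than on the real configuration.
demo_conf=$(mktemp)
printf 'Use Project proj1\nUndefMacro Project\n' > "$demo_conf"
add_project_line "$demo_conf" proj2
cat "$demo_conf"
rm -f "$demo_conf"
```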

To avoid being asked for your password each time, you can have git store it in plaintext (which is OK if you trust the security of your machine and you use disk encryption) by issuing the following (see the end of this answer):

git config --global credential.helper store

As with Subversion, you should then arrange to back up the /var/git folder.

Trying btrfs... and giving up for now

I recently tried to use btrfs to manage some data. My main use case was incremental backups relying on btrfs's snapshot feature. Indeed, btrfs allows you to:

  • create snapshots of your filesystem at various points in time: the snapshots essentially take no additional space, except that files of that FS will not really be deleted as long as they survive in a snapshot;
  • send snapshots to remote hosts, even computing efficiently the diff between each snapshot and the previous one, to minimize IO, bandwidth, and backup storage costs;
  • browse old snapshots seamlessly as if they were actual folders, and restore files easily.

This is much better than my current solution, which is to use rsync. Indeed, by contrast, rsync has the following drawbacks:

  • rsync only synchronizes the current version (overwriting the previous one), and if you want to keep multiple versions they are not compressed relative to each other;
  • each rsync must rescan the entire filesystem, even if almost nothing has changed;
  • rsync is not always intelligent about transfers, as it tries to avoid re-sending files that haven't changed, but receives no help from the FS to understand what went on: for instance, if you move a large directory on the master, in most cases rsync will fail to notice and re-transfer the whole directory to the backup.

This post is a terse documentation of what I have learnt about btrfs in the process of exploring it. Sadly, the main outcome of my investigations is that btrfs does not seem sufficiently mature for me to use it yet. I am sorry about the negative conclusion: I think that btrfs is a great project and I imagine that the remaining rough edges will eventually be fixed. Further, the good news is that (as far as I can tell) I have only encountered crashes but I have not encountered any data loss issue.

General considerations about btrfs

So here are some general things about btrfs that I discovered when playing around:

  • btrfs supports transparent file compression with zlib and lzo. This is done by passing an option to mount. I am not too sure about what happens if you forget to pass this option (or pass the wrong value for this option). It seems to work fine, though.
  • btrfs supports deduplication, but it turns out that this did not mean what I thought it would.
    Unlike, e.g., git repositories, if you write data to the disk which happens to already exist someplace else, btrfs will not notice it and use it to share space. What it means is that btrfs supports copy-on-write, i.e., when you write data on the FS that comes from another file of the FS, then btrfs will only save a pointer to the old data, and will not create two different copies until one piece of data is modified.
    This implies that, if you want to deduplicate data which has not been created using copies, you need to do it offline with specific tools: btrfs does not support it out of the box. I tried bedup, which was quite slow; its savings amounted to 110 GB out of 2.6 TB of data when I tested it on a partition. (Of course, your mileage may vary.) It is quite worrying that the deduplication tools (in particular, bedup) do not seem very sure of what they are doing, so this does not give at all the impression of being robust.
  • btrfs supports many nice features that I didn't need: splitting a FS across multiple devices (with replication or not), adding/removing devices on the fly, performing resizes online, etc. I did not try these features out.

Here are things you will need to know when trying out btrfs, and traps in which you may fall:

  • The btrfs utilities are not shipped with Debian by default: you need to apt-get install btrfs-tools.
  • If you want to start playing with btrfs, you will probably want to convert data from an ext3 or ext4 partition. There is a tool designed to do that, btrfs-convert, but closer inspection reveals that it is now reported to be unreliable.
    As I didn't want to build the FS on shaky foundations, I created a partition from scratch, and moved my terabytes of data around.
  • When creating test filesystems, note that you cannot create btrfs filesystems that are too small (apparently, less than 100 MB), and you will get a confusing error message if you try.
  • btrfs exposes quite a lot of its internals which you apparently may need to be aware of. In particular, you may have to defragment it [1]. It seems that you may also need to balance the filesystem (amongst other things) to avoid problems when it is full.
  • btrfs makes it possible to have subvolumes which you can mount independently. In other words, if your disk contains games and music, you could imagine having a subvolume games/ and a subvolume music/, and mounting only one of the two (or mounting them at different endpoints). In this case, if you mount the root of the filesystem, games/ and music/ will appear as folders (which are actually different filesystems).
    This means that you should be careful when starting to organize your filesystem: the root of the filesystem doesn't play the same role as in other filesystems, and you should probably always be mounting a subvolume of it instead. If you miss this point initially and want to change your mind later, it's not so simple.
  • While btrfs supports copy-on-write, cp will not use it by default. You need to pass the option: --reflink=always to cp, as explained in this FAQ entry. This is a bit unpleasant because it means that scripts must be using cp properly to take advantage of copy-on-write, and that other programs will not necessarily support it. In particular, rsync does not, for now.
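For scripts that may also run on filesystems without copy-on-write, GNU cp has a more forgiving variant of the option mentioned in the last point: --reflink=auto attempts a copy-on-write copy and falls back to an ordinary copy when that is not possible, whereas --reflink=always fails in that case. A minimal sketch, where the wrapper name cow_copy is hypothetical and GNU coreutils is assumed:

```shell
# Copy SRC to DST, sharing data blocks when the filesystem supports it and
# falling back to a regular copy otherwise (GNU cp only).
cow_copy() {
    cp --reflink=auto -- "$1" "$2"
}

# Demonstration on temporary files.
demo_src=$(mktemp)
demo_dst=$(mktemp)
echo "some data" > "$demo_src"
cow_copy "$demo_src" "$demo_dst" && cmp "$demo_src" "$demo_dst" && echo "copy is identical"
rm -f "$demo_src" "$demo_dst"
```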

Incremental backups: snapshotting, sending, receiving

Now, here is more about my specific experience with subvolumes, snapshots, and btrfs send and btrfs receive, which were the main features I was interested in. In summary, here are the basic principles:

  • You can run btrfs subvolume snapshot foo/ snap/ to create a snapshot of foo/ as snap/. This creates snap/ as a folder (but it's actually a different subvolume), which contains the contents of foo/ (using copy-on-write, so without duplicating the actual contents on disk). For backups, you want to create read-only snapshots (btrfs subvolume snapshot -r).
    If you create snapshots at different points in time, you do not need (and cannot) tell btrfs subvolume snapshot which ones are the most recent: however, for your own purposes, you should probably indicate it in the volume name.
    You can be quite trigger-happy with snapshots: I created one every hour for weeks without any problem.
  • You can run btrfs send snap/ to produce on standard output a representation of the (read-only) snapshot snap/. Alternatively, you can run btrfs send -p old_snap/ snap/ to prepare efficiently a compressed representation of snap/ that relies on old_snap/. I tested that, indeed, when the difference from old_snap/ to snap/ is that a very large folder was moved, btrfs send is able to create a very concise representation in comparatively little time.
  • You can run btrfs receive snapshots/, where snapshots/ is in the backup FS, to read on standard input a dump produced by btrfs send, and create in snapshots/ the corresponding snapshot (here, snap/: the name depends on what btrfs send is sending). Of course, the backup FS can be on a different machine: you can pipe the stream across ssh, or simply store it to a file and move that from one machine to the other.
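As a small illustration of the naming advice above, here is a sketch that builds the command for a read-only snapshot whose name contains a timestamp. It only prints the command instead of running it. The function name snapshot_cmd, the /data paths and the timestamp format are all placeholders.

```shell
# Print (rather than run) the command that would take a read-only snapshot of
# SOURCE into SNAPDIR, named after the given timestamp.
snapshot_cmd() {
    source_vol=$1 snapdir=$2 stamp=$3
    printf 'btrfs subvolume snapshot -r %s %s/%s\n' "$source_vol" "$snapdir" "$stamp"
}

# A timestamp in this format sorts chronologically when sorted as text.
snapshot_cmd /data /data/.snapshots "$(date -u +%Y-%m-%dT%H%M%SZ)"
```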

That's the theory. Now, details and traps. First, about snapshot creation:

  • When creating snapshots periodically, it is quite easy to end up with filesystems with a very large number of files (which are very similar copies of the same hierarchy). This is very undesirable, e.g., for locate: I had updatedb wasting lots of CPU and disk space indexing a large number of these snapshots and polluting my locate results. You'll want to tell updatedb not to explore the snapshot folder, using the setting PRUNEPATHS in /etc/updatedb.conf.
  • In terms of access rights, you do not need to be root to create a snapshot (or subvolume). Indeed, if you couldn't read some files in the source, you will still be unable to read them from the snapshot.
    However, deleting subvolumes is not possible as an unprivileged user unless you pass a specific mount option: I am not sure of the implications of this, in particular, I do not know why it is not the default. Further, deleting subvolumes that were created to be read-only requires a specific step to make them writable.
    Another thing to understand is that, to remove a subvolume, whether as root or otherwise, using rm will fail with Operation not permitted; a different error than the usual Permission denied, but a possible source of confusion. You should be using btrfs subvolume delete instead.
  • Having snapshots also makes it quite complicated to understand where your disk space is going. Is it used by files currently in your FS? Or files deleted in the FS but retained because of some snapshot? If so, which snapshot(s)? How much space would you reclaim by removing a given snapshot, or, say, all snapshots older than one month?
    To answer such questions, you need to use (in particular, enable) btrfs's quota support. But even then it is not very obvious to figure all of this out.
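Coming back to the updatedb issue mentioned in the first point: on Debian, PRUNEPATHS in /etc/updatedb.conf is a quoted, space-separated list of directories. The sketch below appends a directory to it. The function name add_prunepath and the snapshot path are hypothetical, and it assumes that PRUNEPATHS is defined on a single line and that the directory name contains no |, & or backslash characters.

```shell
# Append DIR to the PRUNEPATHS="..." line of CONF.
add_prunepath() {
    conf=$1 dir=$2
    tmp=$(mktemp) || return 1
    if sed "s|^PRUNEPATHS=\"\\(.*\\)\"|PRUNEPATHS=\"\\1 $dir\"|" "$conf" > "$tmp"; then
        # Write back with cat so that the file keeps its owner and permissions.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}

# Demonstration on a temporary file rather than on /etc/updatedb.conf.
demo_updatedb=$(mktemp)
printf 'PRUNEPATHS="/tmp /var/spool"\n' > "$demo_updatedb"
add_prunepath "$demo_updatedb" /data/.snapshots
cat "$demo_updatedb"
rm -f "$demo_updatedb"
```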

About sending and receiving snapshots:

  • btrfs send requires root, even for snapshots that you created: this is unsurprising, since you can snapshot files that you cannot read, and of course you shouldn't be able to read them from the output of btrfs send.
  • You should not interrupt btrfs send or btrfs receive, either with SIGSTOP or by putting the computer in hibernation mode. If you do so, the operation will fail. In particular, an incomplete copy of the subvolume will stay around on the receiving end, which can easily mislead you and make you believe that the operation succeeded. Apparently, btrfs is smart enough to notice that the copy is incomplete (in particular, fortunately, refusing to use it as a parent to reconstruct another snapshot), but it is not sufficiently intelligent to delete the leftover files or (preferably) to resume the operation from where it left off, like rsync does. This means that, in practice, you probably want to snapshot often and have relatively small diffs between snapshots.
    Also note that btrfs send and btrfs receive give no progress information when they run.
  • Once you have created snapshots and you want to transfer them to the backup host, the problem is figuring out which backup depends on which, and what to send. You can only choose this at the level of btrfs send: snapshot creation does not need a parent, and btrfs receive is apparently able to use some ID specified in the btrfs send invocation to identify which volume it should use (or fail if a suitable volume does not exist, although I don't know whether this check is bulletproof or not).
  • Hence, when sending snapshots, btrfs leaves you free to choose the right set of send operations with the right parents to minimize IO and network cost.
    A program called buttersink attempts to do this, i.e., choosing an intelligent sequence of transfers. For my use case, sadly, it did not work. This is pretty surprising, as my case is quite simple: a series of chronological snapshots, each of which should be sent based on the previous one. Maybe the reason is that buttersink does not know in which order the snapshots were made, and relies on a size estimation of the diff between two btrfs snapshots, which apparently is both slow to compute and wildly inaccurate.
    So I wrote instead a much simpler script which orders the snapshots by date (as indicated in their name) and sends them in that order. There probably exist more elaborate tools for that purpose, like btrbk, which I did not test.
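To illustrate the chronological approach, here is a simplified sketch (not the script itself) that reads snapshot names on standard input, sorts them, and prints the sequence of transfers: a full send for the oldest snapshot, then incremental sends with -p using the previous snapshot as parent. It assumes that the names sort chronologically as text and contain no whitespace. The function name plan_sends and the paths are placeholders, and the commands are only printed, not run.

```shell
# Read snapshot names on stdin and print the send/receive pipelines needed to
# transfer them in chronological order to DEST.
plan_sends() {
    dest=$1
    prev=
    LC_ALL=C sort | while IFS= read -r snap; do
        [ -n "$snap" ] || continue
        if [ -z "$prev" ]; then
            printf 'btrfs send %s | btrfs receive %s\n' "$snap" "$dest"
        else
            printf 'btrfs send -p %s %s | btrfs receive %s\n' "$prev" "$snap" "$dest"
        fi
        prev=$snap
    done
}

printf '%s\n' snap-2016-05-02 snap-2016-05-01 snap-2016-05-03 | plan_sends /mnt/backup/snapshots
```

A real script would also need to skip snapshots already present on the backup, and to clean up incomplete copies left behind by interrupted transfers.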

Messy problems

And finally, here are the nasty problems I ran into. When running my script to perform the transfers, and disconnecting hard drives at random points to simulate messy hardware failures, I observed the following:

  • Backtraces in syslog suggesting a problem with btrfs (even during normal operation, I think):
kernel: [52053.405416] ------------[ cut here ]------------
kernel: [52053.405456] WARNING: CPU: 0 PID: 12046 at /build/linux-HoPide/linux-4.5.1/fs/btrfs/qgroup.c:2650 btrfs_qgroup_free_meta+0x88/0x90 [btrfs]()
kernel: [52053.405459] Modules linked in: ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) ntfs(E) vfat(E) msdos(E) fat(E) jfs(E) xfs(E) libcrc32c(E) crc32c_generic(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) veth(E) ebtable_filter(E) ebtables(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) xt_addrtype(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) pci_stub(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) fuse(E) ip6t_REJECT(E) nf_reject_ipv6(E) ip6table_filter(E) ip6_tables(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) xt_owner(E) xt_multiport(E) iptable_filter(E) ip_tables(E) x_tables(E) binfmt_misc(E) quota_v2(E) quota_tree(E) dm_crypt(E) algif_skcipher(E) af_alg(E) snd_hda_codec_hdmi(E) uas(E) usb_storage(E) iTCO_wdt(E) iTCO_vendor_support(E) ppdev(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) hmac(E) dm_mod(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) uvcvideo(E) pcspkr(E) snd_hda_codec_realtek(E) sg(E) serio_raw(E) snd_hda_codec_generic(E) videobuf2_vmalloc(E) i2c_i801(E) i915(E) videobuf2_memops(E) videobuf2_v4l2(E) videobuf2_core(E) drm_kms_helper(E) snd_usb_audio(E) videodev(E) snd_hda_intel(E) media(E) snd_hda_codec(E) snd_usbmidi_lib(E) snd_rawmidi(E) snd_hda_core(E) snd_seq_device(E) snd_hwdep(E) snd_pcm_oss(E) drm(E) snd_mixer_oss(E) snd_pcm(E) snd_timer(E) evdev(E) joydev(E) lpc_ich(E) snd(E) mei_me(E) cdc_acm(E) mfd_core(E) i2c_algo_bit(E) soundcore(E) mei(E) shpchp(E) 8250_fintek(E) battery(E) parport_pc(E) parport(E) video(E) soc_button_array(E) tpm_infineon(E) tpm_tis(E) button(E) tpm(E) processor(E) it87(E) hwmon_vid(E) coretemp(E) loop(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) xor(E) raid6_pq(E) sr_mod(E) cdrom(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) crc32c_intel(E) ahci(E) libahci(E) r8169(E) psmouse(E) libata(E) xhci_pci(E) ehci_pci(E) xhci_hcd(E) ehci_hcd(E) scsi_mod(E) mii(E) usbcore(E) usb_common(E) fan(E) thermal(E) fjes(E) [last unloaded: vboxdrv]
kernel: [52053.405581] CPU: 0 PID: 12046 Comm: rsync Tainted: G        W  OE   4.5.0-1-amd64 #1 Debian 4.5.1-1
kernel: [52053.405583] Hardware name: Gigabyte Technology Co., Ltd. H87M-HD3/H87M-HD3, BIOS F3 05/09/2013
kernel: [52053.405585]  0000000000000286 000000008e92a6d5 ffffffff81307b65 0000000000000000
kernel: [52053.405589]  ffffffffc02f15b0 ffffffff8107905d ffff8800a959e800 0000000000004000
kernel: [52053.405592]  ffff8800a959e800 0000000000004000 0000000000000002 ffffffffc02cfaf8
kernel: [52053.405595] Call Trace:
kernel: [52053.405605]  [<ffffffff81307b65>] ? dump_stack+0x5c/0x77
kernel: [52053.405610]  [<ffffffff8107905d>] ? warn_slowpath_common+0x7d/0xb0
kernel: [52053.405630]  [<ffffffffc02cfaf8>] ? btrfs_qgroup_free_meta+0x88/0x90 [btrfs]
kernel: [52053.405650]  [<ffffffffc0268702>] ? start_transaction+0x3e2/0x4a0 [btrfs]
kernel: [52053.405668]  [<ffffffffc026e507>] ? btrfs_dirty_inode+0x97/0xc0 [btrfs]
kernel: [52053.405672]  [<ffffffff81205538>] ? touch_atime+0xa8/0xd0
kernel: [52053.405676]  [<ffffffff8116d7bd>] ? generic_file_read_iter+0x63d/0x790
kernel: [52053.405681]  [<ffffffff811ee2b1>] ? cp_new_stat+0x151/0x180
kernel: [52053.405683]  [<ffffffff811e8913>] ? new_sync_read+0xa3/0xe0
kernel: [52053.405686]  [<ffffffff811e9101>] ? vfs_read+0x81/0x120
kernel: [52053.405689]  [<ffffffff811ea072>] ? SyS_read+0x52/0xc0
kernel: [52053.405693]  [<ffffffff815b6ab2>] ? system_call_fast_compare_end+0xc/0x67
kernel: [52053.405695] ---[ end trace 6c76a866f1f3e28c ]---
kernel: [52053.790081] ------------[ cut here ]------------
  • At one point, when I disconnected a hard drive that contained a mounted btrfs system, an instant hard reset of the machine (!).
  • Messed up filesystems where some operations would take apparently forever (e.g., subvolume delete, on the target of the transfer), during which mysterious processes like btrfs-cleaner and btrfs-transaction were performing varying levels of CPU/IO, and the lagging operation could not be aborted with SIGINT. I saw no way to find out what the processes were trying to do.
  • Even weirder filesystems with which the entire machine started being unresponsive and OOM-ing for unclear reasons, around 2 hours after they had been mounted. I eventually had the idea of checking slabtop, which showed that the kernel was filling the RAM (8 GB) with its own structures, presumably because of some sisyphean operations that btrsync was currently undertaking on them.
  • In other cases, syslog got flooded with messages about btrsync, filling up my root partition.

This is where I give up: even though I would very much like to have incremental backups at the FS level, for now, I do not feel comfortable handing over my data to a FS that suffers from these kinds of problems. I know that, in principle, I should try to report the bugs to developers and help fix these issues, but sadly I do not feel I can invest the time and effort to help debug an FS before I can use it. Note that I did not even do very ambitious things: essentially just snapshots, send, and receive, randomly disconnecting the devices at various stages of the process. So it may be straightforward to reproduce the problems I ran into.

So I'm back to rsync for now, and I'll have to investigate incremental backup programs that are smarter than rsync but do not rely on collaboration from the FS, e.g., Borg. Or maybe I could try ZFS...

  1. It's very funny to hear that btrfs must be defragmented when you have heard for years the propaganda "only Microsoft file systems must be defragmented, because they are inferior"... 

A coat-of-arms from pixel art

My personal logo, which I use as a favicon on my website and as an avatar, is a pixel art representation of an 'a' in a weird frame that spells out "3nm" in all directions:

a3nm logo

Its size is 16 pixels by 16 pixels.

I like coats of arms, so I thought that it would be fun to try to draw one from my logo. The idea that I had is to use quartering to draw the pixel art logo recursively using standard heraldic designs.

For starters, let us get rid of the messy part at the bottom of the shield to get a nice square drawing area:

empty coat of arms

This is described as follows in French (I will only be using French descriptions as they are the only ones I know about):

D'argent à la champagne d'azur.

In the drawing above, I used a non-standard color for azure to match the pixel art logo, and I used a modern shield to get a more reasonable drawing area. I hope you don't mind. The fact that azure is a colour and silver is a metal will ensure that we do not violate the rule of tincture.

Now, let us describe the actual pixel art logo, using recursive quarterings. A complication is that simpler forms should be correctly described when we reach them: for instance, a 2x2 design should be described with a "parti" or "coupé" division if applicable, and if only one pixel is drawn differently from the three others, it is better described as a quarter, or "franc-quartier", whose position must be specified.
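To make these rules concrete, here is a sketch of the base case only, i.e., describing a single block of four pixels. It is an illustration rather than the program I used, and the function name describe_2x2 is hypothetical. It assumes the usual conventions: "parti" divides the block into a left and a right half, "coupé" into a top and a bottom half, and a franc-quartier sits at the top left as seen by the viewer unless it is said to be "à senestre" (viewer's right) or "en pointe" (bottom).

```shell
# Describe a 2x2 block given its four tinctures ("argent" or "azur") in the
# order: top-left, top-right, bottom-left, bottom-right (as seen by the viewer).
describe_2x2() {
    tl=$1 tr=$2 bl=$3 br=$4
    if [ "$tl" = "$tr" ] && [ "$bl" = "$br" ]; then
        if [ "$tl" = "$bl" ]; then
            echo "d'$tl"                      # plain field
        else
            echo "coupé d'$tl et d'$bl"       # top half, then bottom half
        fi
    elif [ "$tl" = "$bl" ] && [ "$tr" = "$br" ]; then
        echo "parti d'$tl et d'$tr"           # left half, then right half
    elif [ "$tl" = "$br" ] && [ "$tr" = "$bl" ]; then
        echo "écartelé d'$tl et d'$tr"        # diagonally opposite pixels match
    else
        # Exactly one pixel differs from the three others: find which one.
        if [ "$tr" = "$bl" ] && [ "$bl" = "$br" ]; then
            field=$tr odd=$tl pos=""
        elif [ "$tl" = "$bl" ] && [ "$bl" = "$br" ]; then
            field=$tl odd=$tr pos=" à senestre"
        elif [ "$tl" = "$tr" ] && [ "$tr" = "$br" ]; then
            field=$tl odd=$bl pos=" en pointe"
        else
            field=$tl odd=$br pos=" à senestre et en pointe"
        fi
        echo "d'$field au franc-quartier d'$odd$pos"
    fi
}

describe_2x2 argent azur argent azur   # prints: parti d'argent et d'azur
describe_2x2 azur argent azur azur     # prints: d'azur au franc-quartier d'argent à senestre
```

The full description is then obtained by applying the same idea recursively to larger squares, with a quartering whenever no simpler description applies.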

I used a small program to generate the correct description, and verified it by drawing the resulting coat of arms by hand. Here it is, in French:

Grand-écartelé : 
  au premier, écartelé : 
    au premier, contre-écartelé : 
      au premier, parti d'argent et d'azur,
      au deuxième, parti d'azur et d'argent,
      au troisième d'azur au franc-quartier d'argent à senestre,
      au quatrième d'azur au franc-quartier d'argent,
    au deuxième, contre-écartelé : 
      au premier d'azur,
      au deuxième, parti d'argent et d'azur,
      au troisième, coupé d'argent et d'azur,
      au quatrième d'azur au franc-quartier d'argent,
    au troisième, contre-écartelé : 
      au premier, coupé d'argent et d'azur,
      au deuxième, parti d'argent et d'azur,
      au troisième, coupé d'azur et d'argent,
      au quatrième d'azur au franc-quartier d'argent,
    au quatrième, contre-écartelé : 
      au premier d'argent au franc-quartier d'azur,
      au deuxième, coupé d'argent et d'azur,
      au troisième d'azur au franc-quartier d'argent à senestre et en pointe,
      au quatrième, coupé d'azur et d'argent,
  au deuxième, écartelé : 
    au premier, contre-écartelé : 
      au premier, parti d'argent et d'azur,
      au deuxième, parti d'azur et d'argent,
      au troisième d'azur au franc-quartier d'argent à senestre,
      au quatrième, coupé d'argent et d'azur,
    au deuxième, contre-écartelé : 
      au premier d'azur au franc-quartier d'argent à senestre et en pointe,
      au deuxième, coupé d'argent et d'azur,
      au troisième d'azur au franc-quartier d'argent à senestre,
      au quatrième, coupé d'azur et d'argent,
    au troisième, contre-écartelé : 
      au premier, coupé d'argent et d'azur,
      au deuxième d'argent au franc-quartier d'azur à senestre,
      au troisième, coupé d'azur et d'argent,
      au quatrième d'argent,
    au quatrième, contre-écartelé : 
      au premier, parti d'azur et d'argent,
      au deuxième d'azur,
      au troisième d'azur au franc-quartier d'argent à senestre,
      au quatrième, coupé d'argent et d'azur,
  au troisième, écartelé : 
    au premier, contre-écartelé : 
      au premier, coupé d'azur et d'argent,
      au deuxième d'azur au franc-quartier d'argent en pointe,
      au troisième d'azur,
      au quatrième, parti d'argent et d'azur,
    au deuxième, contre-écartelé : 
      au premier d'argent,
      au deuxième d'azur,
      au troisième d'argent au franc-quartier d'azur en pointe,
      au quatrième, coupé d'azur et d'argent,
    au troisième, contre-écartelé : 
      au premier, coupé d'argent et d'azur,
      au deuxième d'azur au franc-quartier d'argent en pointe,
      au troisième, coupé d'azur et d'argent,
      au quatrième d'azur au franc-quartier d'argent,
    au quatrième, contre-écartelé : 
      au premier, coupé d'azur et d'argent,
      au deuxième d'azur au franc-quartier d'argent en pointe,
      au troisième, parti d'argent et d'azur,
      au quatrième, parti d'azur et d'argent,
  au quatrième, écartelé : 
    au premier, parti :
      au premier, coupé :
        au premier d'azur,
        au deuxième, écartelé d'azur et d'argent,
      au deuxième d'argent,
    au deuxième, contre-écartelé : 
      au premier d'azur au franc-quartier d'argent à senestre et en pointe,
      au deuxième, coupé d'argent et d'azur,
      au troisième, parti d'azur et d'argent,
      au quatrième, coupé d'azur et d'argent,
    au troisième, contre-écartelé : 
      au premier d'azur au franc-quartier d'argent à senestre et en pointe,
      au deuxième, coupé d'azur et d'argent,
      au troisième, parti d'azur et d'argent,
      au quatrième d'azur,
    au quatrième, contre-écartelé : 
      au premier d'azur au franc-quartier d'argent à senestre et en pointe,
      au deuxième d'azur au franc-quartier d'argent en pointe,
      au troisième, parti d'argent et d'azur,
      au quatrième, parti d'azur et d'argent ;
à la champagne d'azur.

You can use the indentation to recursively match indented blocks of the code above to squares following a kind of quadtree decomposition. Here is the result:

my coat of arms

So my coat of arms is drawn from a pixel art logo, and has a textual description which should hopefully follow (the letter of) the rules of the art, and should allow any (sufficiently motivated) herald to reconstruct the design from the text alone.

A direction for future work would be to design a general program to convert arbitrary pixel art designs to textual descriptions of coats of arms. :) I thought about it a bit, but it's trickier than it sounds if you want to use the full power of known heraldic operators.

Readers interested in heraldry and computer science may also be interested in Pascal Manoury's work on drawing coats of arms. The task studied there is to produce the graphical rendering of a coat of arms given a (structured) textual description of its contents. Here is the article about this topic (in French), as well as slides (in French), and sample output (in French). Thanks to Olivier Roussel for pointing this out!

Encrypt email to known GPG users with mutt using crypt_opportunistic_encrypt

If you use mutt and GPG, you may want to say that messages sent to other GPG users should be encrypted by default, and others should not.

This used to be surprisingly hacky to do, with the most common solution apparently being a script that listed known GPG keys and added mutt hooks to enable encryption for them. This was ugly, and it also didn't work well because it would try to encrypt messages as soon as some recipient supported GPG, even when not all recipients did.

This post advertises a recent solution to this problem: the crypt_opportunistic_encrypt setting of mutt, which was merged [1] in mutt version 1.5.24 (the one currently in Debian testing). This setting allows you to do essentially what the hacky script did, but in a much cleaner and simpler way, also fixing the problem I mentioned.

I am currently using the setting in my mutt config and I am quite happy about it. Here are things to know when the setting is enabled:

  • The choice to encrypt or not encrypt the message is re-evaluated whenever the recipients are edited; it is not based only on the initial recipients.
  • Encryption is chosen only if all recipients have a key.
  • You can always edit signing options with the usual pgp-menu command ('p'). You can also disable the opportunistic encryption setting for a single message in the pgp-menu and you can then fall back to configuring encryption in the usual way.
  • Encryption is enabled as soon as the recipients have a GPG key that looks reasonable (i.e., when I tested, it was not enabled for recipients where all keys were either disabled or expired), but there is no check [2] to see whether the key is known to be valid. If a recipient has no trusted key, you will get the usual prompt to choose a key and confirm that you want to use the key even though its validity is unknown.
  • GPG seems to get invoked for the purposes of the option, so, whenever gpg decides to (presumably) check the trustdb, mutt may mysteriously hang. Just be patient.
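
For reference, enabling the setting should be a one-line change in your mutt configuration. This is a minimal sketch, assuming that your configuration lives in ~/.muttrc and that GPG support is otherwise already set up in mutt:

```muttrc
# Assumed location: ~/.muttrc (adapt to wherever your mutt config lives).
# Encrypt automatically when every recipient has a usable key,
# and leave the message unencrypted otherwise.
set crypt_opportunistic_encrypt = yes
```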

  1. Thanks a lot to Kevin for developing and merging this! 

  2. I think that this is quite reasonable, because in practice active attacks with GPG are not a huge problem, much less than the problem that few people are using GPG. I think it especially makes sense in combination with GPG's recent support of trust on first use (TOFU) to make it less painful to use. 

Self-hosted, server-side MathJax

I use MathJax on my website to render math equations to HTML. The standard way to use MathJax is to include its JavaScript so that visitors fetch it from the MathJax CDN. While this is simpler and also a good idea for caching, it has drawbacks which I did not find acceptable:

  • It means that the MathJax CDN may be pinged whenever a visitor loads a page, which is bad for privacy.
  • It makes your website's security dependent on that of the CDN: if the CDN starts distributing malicious JS (i.e., MathJax turns evil or it gets hacked), then your visitors will be getting it.
  • It renders math in the browser using JavaScript. This is jarring (as the page jumps around while rendering is done), and I find it esthetically unpleasant. All of this website is static and pre-generated on my machine; I don't see why math rendering would be an exception. I find static websites preferable in terms of deployability, security, and elegance.

This post is just to explain how to render MathJax when generating static pages, using MathJax-node. As I wanted to play with Docker, and didn't want to install Node or run the code directly on my real machine, I will also explain how to set up the environment in a Docker container, but there's nothing forcing you to do that. :)

I am now using this setup on this website, so all math should now be served with server-side rendering, without requests to third-party CDNs. (In fact, this removes what was essentially the only use of JavaScript on this site.)

Setting up a mirror of the fonts

While we won't need to serve the MathJax JavaScript code to readers, we will need to serve them the fonts used by MathJax.

Fortunately, on Debian systems, these fonts are already packaged as fonts-mathjax and fonts-mathjax-extras, so you can just rely on the package manager to retrieve them and keep them up to date. The fonts are installed in /usr/share/javascript/mathjax/, so you just have to configure your Web server to serve this folder. I serve it as a3nm.net/mathjax. It's preferable to serve it from the same domain as the rest of your website: if you use a different domain, you have to jump through additional hoops because of the same-origin policy: see an explanation here.
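
As an illustration, here is roughly what serving that folder could look like. This is only a sketch, assuming Apache 2.4 and assuming that you want the folder under /mathjax as I do; the fragment would go in the virtual host of the website, and other Web servers will need a different configuration:

```apache
# Hypothetical fragment (Apache 2.4 syntax assumed): expose the
# Debian-packaged MathJax files under /mathjax on the site's own domain.
Alias /mathjax /usr/share/javascript/mathjax
<Directory /usr/share/javascript/mathjax>
    # No directory listings; allow everyone to fetch the files.
    Options -Indexes
    Require all granted
</Directory>
```

With this alias, the fonts would end up under /mathjax/fonts/HTML-CSS on your domain.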

Installing MathJax-node

I installed MathJax-node in a Docker image, and as I was paranoid I also generated my own base image for the underlying system. Feel free to simplify the instructions if you don't need to do any of this.

I'm using an amd64 Debian system. I installed Docker as docker.io (packaged with Debian), added myself to the docker group, logged out and logged in. I tweaked Docker by editing /etc/default/docker and symlinking /var/lib/docker to move its files to a partition with more disk space.

I created the base system by issuing the following. Note that we will need Debian testing to be able to run mathjax-node, as otherwise the packaged version of Node is too old:

mkdir testing
sudo debootstrap testing testing/
sudo tar -C testing/ -c . | docker import - my-debian-testing

Here is the Dockerfile:

FROM my-debian-testing:latest
RUN apt-get -y update && apt-get install -y npm nodejs-legacy && apt-get clean
RUN npm install mathjax-node
RUN npm install cssstyle
CMD ["bash"]

In the folder of the file, issue:

docker build -t my-mathjax .

You can now use the image by starting a container, let's call it my-container:

docker run -di --name=my-container my-mathjax bash >/dev/null

And you can then apply page2html by piping your HTML code into the following invocation:

docker exec -i my-container node_modules/mathjax-node/bin/page2html \
    --fontURL https://a3nm.net/mathjax/fonts/HTML-CSS

Replace the fontURL parameter with the URL from which you are serving the MathJax fonts.
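
To make the piping concrete, here is a small wrapper around that invocation. It is only a sketch and not the actual code that I use: the function name render_page and the FONT_URL variable are made up for the example, and it assumes that the container my-container from above is running. It writes to a temporary file first, so that a failed rendering does not clobber the output file:

```shell
#!/bin/sh
# Placeholder: change this to the URL where you serve the MathJax fonts.
FONT_URL="https://a3nm.net/mathjax/fonts/HTML-CSS"

# render_page INPUT OUTPUT (hypothetical helper)
# Pipe the complete HTML document INPUT through page2html in the
# container and write the result to OUTPUT.
render_page() {
    tmp="$2.tmp"
    if docker exec -i my-container node_modules/mathjax-node/bin/page2html \
            --fontURL "$FONT_URL" < "$1" > "$tmp"; then
        # Rendering succeeded: move the result into place.
        mv "$tmp" "$2"
    else
        # Rendering failed: drop the partial output and report the error.
        rm -f "$tmp"
        return 1
    fi
}
```

You would then call it as render_page page.html page.rendered.html (with file names of your choosing) for each page of the website.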

Another possibility is to use page2svg to render the math to SVG instead of HTML markup. However, this means that text-based browsers will not be able to see it.

The actual code that I use is here. I also increase the size of math by adding the following CSS as indicated here:

.mjx-chtml {font-size: 2.26ex ! important}
.mjx-chtml .mjx-chtml {font-size: inherit ! important}

Marking up MathJax

I use Markdown to write Web pages. I use python-markdown-math to convert math notation from a dollar-based notation in Markdown to HTML spans with the right classes (class="tex" or class="math"). To use python-markdown-math with the server-side setup, simply prevent it from adding the MathJax script boilerplate. I also use the render_to_span config parameter to ensure that no script is being generated.

To render the MathJax afterwards, you should be careful that the command of the previous section needs to apply to the entire HTML document, not just the HTML markup generated from Markdown before applying a template. Indeed, you will see that it modifies the head element as well.


On modern hardware with this setup, it takes about one second to process an HTML page (even when there is no math in it). It takes a few seconds to process HTML pages with real markup such as this one.

Here's an example formula to see how it works: it should look the same as any regular MathJax formula, but without the blinking caused by JavaScript rendering: $\pi(I) = \left(\prod_{F \in I} \pi(F)\right) \times \left(\prod_{F \in J \setminus I} (1 - \pi(F))\right)$.