Minifying files with Delta
In many situations, e.g., when filing bug reports or asking questions on mailing-lists or forums, one needs to take a file which triggers a certain behavior and reduce it to a file of minimal size that still triggers the behavior. For instance, you have written a long program that makes your compiler segfault, and you want to extract from it a minimal program that does the same. This is called minification, and the minimal file is often called a minimal working example.
You can minify your file by hand, testing again each time you remove something, but this is quite inefficient. This post is a brief tutorial on how to use the tool Delta, which does this automatically.
First, install Delta. On Debian systems, it is packaged as delta, and its main command, which we will use here, is named singledelta.
Second, and this is the interesting part, create a shell script test.sh that takes a file as its parameter and decides whether this file triggers the behavior of interest, returning 0 if the file is interesting and 1 if it is not. singledelta will use this script to test intermediate versions of the file while minifying.
For instance, to detect a segfault:
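#!/bin/bash
# Run the program on the candidate file passed as first argument.
myprogram --option "$1"
# Exit status 139 is 128 + 11: the program was killed by SIGSEGV.
if ! test $? = 139; then
    exit 1
fi
exit 0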
To flag the file as interesting when the output differs from the contents of file "reference" (drop the leading ! to flag a match instead):
myprogram --option "$1" > output
! diff output reference
To test whether the standard output or standard error contains the string "problem":
myprogram --option "$1" 2>&1 | grep problem
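Before handing the script to singledelta, it is worth checking that it reports your original file as interesting; if it does not, there is nothing for Delta to preserve. A minimal sketch of such a check follows. The helper name check_interesting is made up for illustration and is not part of Delta.

```shell
#!/bin/bash
# check_interesting is a hypothetical helper, not part of Delta.
# It runs a test command on a file and reports the verdict, following
# the convention described above: exit status 0 means "interesting".
check_interesting() {
    local test_cmd=$1 file=$2
    if "$test_cmd" "$file" > /dev/null 2>&1; then
        echo "interesting"
    else
        echo "not interesting"
    fi
}

# Run the check only when called with a test script and a file.
if [ "$#" -eq 2 ]; then
    check_interesting "$1" "$2"
fi
```

Saved as, say, check.sh, it would be run as "bash check.sh ./test.sh original_file", and it should print "interesting" before you start minifying.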
Third, copy your original file to a different name, say "minified_file", then run singledelta, which will minify "minified_file" in place.
cp original_file minified_file
singledelta -in_place -test=./test.sh minified_file
The process is very chatty. Once it completes, "minified_file" is a file that still triggers the behavior and is as small as possible.
Well, technically, this is not quite true: I have observed that in some cases, for reasons unknown, rerunning singledelta on the supposedly minified file can minify it further. I have written a trivial script, manydelta, to run singledelta repeatedly until the file no longer shrinks. Use it thus:
cp original_file minified_file
manydelta ./test.sh minified_file
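The script itself is not reproduced here. As a rough idea of what such a wrapper might look like, here is a minimal sketch, which is not the actual manydelta: it reruns singledelta with the same flags as above until the file size stops decreasing. The function name and the MINIFIER variable (which exists only so that another command can be substituted for singledelta) are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch of a "rerun until stable" wrapper; this is NOT the
# author's manydelta script. MINIFIER is an assumed hook that defaults to
# singledelta and is called with the same flags as in the text.
minify_until_stable() {
    local test_script=$1 file=$2
    local before after
    while true; do
        before=$(wc -c < "$file")
        "${MINIFIER:-singledelta}" -in_place -test="$test_script" "$file"
        after=$(wc -c < "$file")
        # Keep going only while a pass actually shrinks the file.
        [ "$after" -lt "$before" ] || break
    done
}

# Same argument order as the manydelta call above: test script, then file.
if [ "$#" -eq 2 ]; then
    minify_until_stable "$1" "$2"
fi
```

Comparing byte counts is a simple stopping criterion; it assumes that a pass which removes nothing leaves the file unchanged.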
Once again, manydelta will minify "minified_file" in place by invoking singledelta repeatedly. Of course, once the process has completed, you may still be able to apply human intelligence to minify the file further in ways that singledelta cannot. Indeed, singledelta only tries to remove lines; it will not, e.g., shorten identifiers or strings.
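To make "removing lines" concrete, here is a deliberately naive sketch of line-based minification. It is an illustration only and not Delta's actual algorithm; the function name naive_minify is made up. It deletes one line at a time and keeps the deletion whenever the test still reports the file as interesting.

```shell
#!/bin/bash
# Naive illustration only: this is NOT how Delta works internally.
# naive_minify is a hypothetical name. The first argument is a test command
# following the convention above (exit status 0 means "interesting"),
# the second is the file to shrink in place.
naive_minify() {
    local test_cmd=$1 file=$2
    local candidate i total
    candidate=$(mktemp)
    i=1
    total=$(wc -l < "$file")
    while [ "$i" -le "$total" ]; do
        # Build a candidate with line i removed.
        sed "${i}d" "$file" > "$candidate"
        if "$test_cmd" "$candidate" > /dev/null 2>&1; then
            # Still interesting: keep the shorter file, retry the same index.
            cp "$candidate" "$file"
            total=$((total - 1))
        else
            # Line i is needed: move on to the next one.
            i=$((i + 1))
        fi
    done
    rm -f "$candidate"
}

if [ "$#" -eq 2 ]; then
    naive_minify "$1" "$2"
fi
```

Nothing in this loop ever rewrites the contents of a line, which is why shortening identifiers or strings remains a manual job.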
If you need more advanced features, Delta can also be used for other things, e.g., running on multiple files. See for instance this guide.