Songflow is a collection of scripts to reflow sheet music (e.g., from a PDF) to
a new page size, e.g., to fit it on a mobile phone or tablet. It does not
require additional information about the music, and works with bitmap
renderings.

In particular, I have used Songflow to reformat the public-domain sight-reading
course "Melodia"
<https://ia800203.us.archive.org/17/items/cu31924021781434/cu31924021781434.pdf>
to a more convenient format.

This is the invocation used:

  ./master.sh cu31924021781434.pdf 750 555 out.pdf 20 250 ./fix_melodia.sh

If you change the parameters, you should modify "fix_melodia.sh", at least to
avoid removing the hardcoded file paths (or you can replace "fix_melodia.sh"
above with "echo" to skip the fixes).

== 1. What it does ==

Songflow performs the following steps:

- Splitting a PDF file into multiple pages

- Splitting pages into systems, separated by sufficient consecutive lines of
  white or near-white space. For this to work, your file must have sufficient
  contrast, and must not be skewed (the separation between systems should be
  horizontal).

  The systems are also trimmed of near-white content at the left and right.

- Splitting systems into blocks of measures of the right width, and resizing
  them to the desired width. This is the most fragile step.

  Empty spaces in the system are detected as having near-minimal height of
  non-white content, and near-minimal total weight. Bars are identified as two
  consecutive empty spaces that are sufficiently close and such that the space
  between them has a significantly higher density of non-white content. This
  step requires the bars to be sufficiently vertical and the scan to be
  sufficiently crisp. It works more reliably on systems where measure bars take
  the whole system. For this step to work, you will probably need to adjust
  the thresholds and distances.
  The program may fail by refusing to split (or, more accurately, by making
  aggressive splits at random points), or may misdetect some patterns as bars
  (especially the stems of half-notes). The program will also cut at bars that
  break a slur.

  Once bars are detected, the program splits them into blocks of the right
  width (by a bruteforce algorithm that finds a solution minimizing the length
  of the shortest segment), and each block is stretched to the required width
  by stretching empty space only, to avoid distorting the picture. This step
  may fail by stretching things (e.g., notes) that should not be stretched, or
  by failing to detect some empty space and stretching too much the places
  that it does detect (especially when constrained because not all bars were
  correctly detected).

- Combining the blocks back into pages of the right size (greedily fitting them
  and arranging them on the page)

== 2. What it requires ==

You need imagemagick, Python 3, pdftk, GNU parallel (not to be confused with
parallel from moreutils, which won't work) and some Python libraries (numpy,
imageio).

== 3. How to use it ==

A script, master.sh, is provided to automate all of the conversion. Basic usage
would be:

  ./master.sh INFILE.pdf WIDTH HEIGHT OUTFILE.pdf

The process can take several hours for large PDF files (e.g., for Melodia).

However, it is likely that you will need to peer into the internals, so read on.

== 4. The scripts ==

The interesting scripts are:

=== 4.1. Splitting pages into systems ===

splith.py splits pages into systems. The way to use it is:

  ./splith.py file.png out/

It will write files out/file_0001.png, out/file_0002.png, etc., one for each
system, covering disjoint regions of the page.

- The parameter --whitethreshold indicates the sensitivity to consider things
  as white space (this is a grayscale value, i.e., between 0 and 255).
  Setting it to a higher value will make the program cut more aggressively.

- The parameter --maxheight can be used to ensure that the extracted "systems"
  will not be taller than the specified height (in pixels). This can be useful
  in case you want to be sure the extraction won't fail later (e.g., combine.py
  will ignore stuff which is too tall to fit on the requested page height).

- The parameter --mincontentheight (in pixels) indicates the height of content
  that is "too small to matter". If some small junk on the page gets extracted
  as a system, or decorations or lyrics are attached to the wrong system, try
  increasing this parameter.

- The parameter --minheight indicates the minimal height of an extracted system
  (in pixels). If small junk gets extracted as a system, you can increase this
  parameter.

- The parameter --distthreshold (in pixels) can be lowered to stop cutting when
  the empty space between two "systems" is too small compared to the largest
  empty space. You can lower this parameter if the program cuts too
  aggressively, but it may then fail to cut, e.g., on pages with large space
  because of a title.

=== 4.2. Splitting lines into chunks ===

splitw.py splits lines into chunks. The way to use it is:

  ./splitw.py file.png out/ WIDTH

It will write files out/file_0001.png, out/file_0002.png, etc., one for each
chunk, having exactly the requested width WIDTH.
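
For orientation, the "height" and "weight" that the parameters below refer to
can be pictured as per-column profiles of the image. The following is only a
minimal sketch and not the code of splitw.py: the function names, the default
threshold of 200, and the choice to measure height as the span between the
first and last non-white pixel of a column are all assumptions made for
illustration.

```python
import numpy as np


def column_profiles(img, white_threshold=200):
    """Return (height, weight) per column of a grayscale image.

    img is a 2D array with 0 = black and 255 = white.
    Hypothetical helper: splitw.py may compute these quantities differently.
    """
    img = np.asarray(img)
    dark = img < white_threshold              # True where a pixel is not white
    has_dark = dark.any(axis=0)
    n_rows = img.shape[0]
    first = np.argmax(dark, axis=0)           # first non-white row per column
    last = n_rows - 1 - np.argmax(dark[::-1], axis=0)  # last non-white row
    # Assumption: "height" is the vertical span of non-white content.
    height = np.where(has_dark, last - first + 1, 0)
    # Assumption: "weight" is the total darkness of the column.
    weight = (255 - img.astype(np.int64)).sum(axis=0)
    return height, weight


def low_columns(values, tolerance):
    """Mask of columns whose value is within `tolerance` of the minimum."""
    values = np.asarray(values)
    return values <= values.min() + tolerance
```

On a toy image with two horizontal staff lines and one vertical bar, the bar
column stands out in both profiles, while the columns on either side of it are
flagged as "low-height".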

Parameters to detect the height:

- The parameter --whitethreshold (grayscale value between 0 and 255) indicates
  what counts as "white" when measuring the height of a line

- The parameter --minlength controls the minimum consecutive number of pixels at
  which something can be "low-height" (i.e., the minimal bar height)

- The parameter --margin indicates the number of pixels to be excluded at the
  left and right of the screen when finding the minimal height

- The parameter --outlierquantile is a percentage indicating the proportion of
  minimal height values to discard (consider as outliers)

- The parameter --heightthreshold indicates the tolerance (in pixels) up to
  which something is considered "low-height"

Parameters to detect the weight:

- The parameter --outlierquantile mentioned above is also used to eliminate
  outlier weights

- The parameter --weightthreshold (grayscale value between 0 and 255) indicates
  the weight threshold up to which something is still considered as minimal
  weight.

- The parameter --weightwindow (in pixels) indicates the width of the window
  over which weight is computed (for smoothing)

Parameters to detect bars:

- The parameter --maxbardistance (in pixels) indicates the maximal distance
  between two low-height, low-weight parts of the staff on each end of a bar

- The parameter --minbarweight (float) indicates how much more weight a bar
  should have relative to the minimal weight

- The parameter --minchunk (in pixels) indicates the smallest admissible chunk
  width for cutting at an "empty point" when no bar can be found at which to
  cut

Parameters to debug:

- With --debug, the program will write out/debug.png with the input file colored
  to indicate low-height and low-weight parts as well as detected and added bars
  and the bars chosen to cut

=== 4.3. Combining chunks into pages ===

combine.py combines chunks into a page. The way to use it is:

  ./combine.py infolder/ outfolder/ HEIGHT

All files in infolder/ should have the same width, and they will be considered
in alphabetical order. They will be grouped into files outfolder/out_0001.png,
etc., having the prescribed HEIGHT and the common width. Input files whose
height is too large to fit will be ignored with a warning.

- The parameter --hmargin indicates the space in pixels left at the left and at
  the right of the produced images

- The parameter --vmargin indicates the space in pixels left at the top and at
  the bottom of the produced images

- The parameter --separator indicates the minimal vertical space in pixels
  between two chunks

== 5. Limitations ==

- Songflow will rasterize vector PDFs. On raster PDFs, you need to specify by
  hand a new rasterization density which will re-scale the content during the
  export.
  It would be conceivable to change splith.py to extract regions vectorially,
  but for splitw.py, which stretches the score in a complicated way, it would
  be trickier.

- When splitting pages into systems, sometimes lyrics are lost or not put at the
  right place. The handling of lyrics could be improved in splith.py to attach
  them to the staff above instead of dropping them, and splitw.py could detect
  the lyrics layer (looking for a cut), then ignore it when searching for cut
  and stretch points, and the lyrics could then be moved to the right place...

- The word-wrapping algorithm is an exponential-time bruteforce algorithm,
  whereas it could be implemented using dynamic programming. That said, given
  the size of instances, I doubt it's a bottleneck.

- Splitting lines is pretty error-prone, with some inadequate cuts (not at bars)
  and some weird stretching. The accuracy could be improved by refining the
  heuristics, or by using genuine learning techniques instead of crude
  heuristics.

- If you call Songflow on something that is not music, it will not notice and
  will happily botch the content instead of leaving it alone.
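
As an illustration of the dynamic-programming idea mentioned in the
limitations above, here is a minimal sketch that chooses cut positions in
quadratic time. It is not taken from splitw.py: the function name, the input
format (a sorted list of candidate cut positions that includes both ends of
the system) and the objective (make the narrowest block as wide as possible
while keeping every block within the maximal width) are assumptions. If the
actual objective differs, only the scoring lines would change.

```python
def split_blocks(bars, max_width):
    """Choose cut positions among `bars` (hypothetical helper).

    bars: non-empty, increasing x positions of candidate cuts, including the
    start and the end of the system.
    Returns the chosen positions (first and last included) such that every
    block is at most max_width wide and the narrowest block is as wide as
    possible, or None if no valid split exists.
    """
    n = len(bars)
    # best[i]: best achievable "narrowest block" width for a split of
    # bars[0]..bars[i] that ends with a cut at bars[i]; None if impossible.
    best = [None] * n
    prev = [None] * n
    best[0] = float("inf")
    for i in range(1, n):
        for j in range(i):
            width = bars[i] - bars[j]
            if best[j] is None or width > max_width:
                continue
            score = min(best[j], width)
            if best[i] is None or score > best[i]:
                best[i], prev[i] = score, j
    if best[-1] is None:
        return None
    # Walk the predecessor links back from the last position.
    cuts, i = [], n - 1
    while i is not None:
        cuts.append(bars[i])
        i = prev[i]
    return cuts[::-1]
```

For example, with candidates at 0, 30, 50, 90 and 100 and a maximal width of
60, the sketch keeps the cut at 50 only, giving two blocks of width 50.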