plint

French poetry validator (local mirror of https://gitlab.com/a3nm/plint)
git clone https://a3nm.net/git/plint/

TODO (4403B)


      1 == Ongoing ==
      2 
      3 - fix pytest-3 plint to make it work
      4 - migrate the readme to markdown
      5 - turn should_be_accepted into a test
      6 - expand the corpus of classical poetry: more Racine, more other authors
      7   (Boileau, Corneille, Prudhomme, etc.)
      8 
      9 - fix problems in the new works
     10 - Train diaeresis.json on new works
     11 - check that diaeresis:permissive is indeed more permissive
     12 - check for duplicates in additions.txt
     13 - check again xmllitre
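The duplicate check on additions.txt could be sketched as below. This is a minimal sketch under an assumption that is not confirmed by this file: that each entry sits on its own line and is identified by its first whitespace-separated field. `find_duplicates` is a hypothetical helper, not part of plint.

```python
from collections import Counter


def find_duplicates(lines):
    """Return the sorted list of keys that occur on more than one line.

    Assumption: the key of an entry is its first whitespace-separated
    field; blank lines are ignored.
    """
    keys = [line.split()[0] for line in lines if line.strip()]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)
```

It accepts any iterable of lines, so an open file object can be passed directly, e.g. `find_duplicates(open("additions.txt", encoding="utf-8"))`.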
     14 
     15 - download in bulk all possible sources to train on them in an error-tolerant
     16   way, and to be able to check easily the usage of a given word
     17 
     18 == Ideas ==
     19 
     20 - Use the latest Lexique (with our corrections) to generate a file of known
     21   words with their syllable count, and when we have exactly one of these
     22   words ensure that we do not allow fewer syllables than indicated (but it
     23   can be more, because of diaeresis)
     24 - Ensure that, on words known in Lexique, frhyme returns exactly the known
     25   pronunciation(s); so we can use it confidently, e.g. to predict elision of the
     26   ending for rhyme genre and number of syllables
     27 - Remove the kludge for invalid characters, split them into specific chunks
     28 - Improve performance with profiling
     29 - Only indicate hemistich status when there is a problem with hemistichs
     30 - Clean up the code to the extent possible
     31 - Look at dicollecte
     32   <http://grammalecte.net/download/fr/lexique-dicollecte-fr-v6.4.1.zip>, which
     33   also features pronunciation, and see how it differs from Lexique
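The first idea above (a lower bound on the syllable count of known words) might look like the following. This is only a sketch: `allowed_counts` and the `known_lengths` mapping are hypothetical names, and the numbers in the usage note are illustrative values, not figures taken from Lexique.

```python
def allowed_counts(word, candidate_counts, known_lengths):
    """Drop candidate syllable counts that fall below the known length.

    known_lengths maps a lowercase word to its reference syllable count
    (assumed to come from a file generated from Lexique). Larger counts
    are kept, since diaeresis can add syllables. Unknown words are left
    unconstrained.
    """
    minimum = known_lengths.get(word.lower())
    if minimum is None:
        return list(candidate_counts)
    return [c for c in candidate_counts if c >= minimum]
```

With `known_lengths = {"violon": 3}`, calling `allowed_counts("violon", [2, 3], known_lengths)` keeps only `3`, while a word absent from the mapping keeps all of its candidates.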
     34 
     35 == Low priority ==
     36 
     37 === Error reporting ===
     38 
     39 - When reporting hemistich errors, highlight the positions where a hemistich
     40   boundary could have been placed
     41 
     42 === Diaeresis/synaeresis ===
     43 
     44 - When training, take into consideration the contexts where we haven't been able
     45   to infer the number of syllables, and only learn at each step from the
     46   contexts where we are the most certain (including the unknown occurrences),
     47   instead of having a hardcoded default threshold
     48 - Formally evaluate the performance of the approach without additions
     49 
     50 === Other approaches ===
     51 
     52 - Learn rhyme and gender agnostically by clustering: prepare an undirected graph
     53   of rhyming verses, factor out suffixes, do SCC, prepare a trie
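The graph step of the clustering idea above can be sketched with a union-find structure. In an undirected graph the strongly connected components coincide with the ordinary connected components, so that is what the code computes. The suffix factoring and the trie are left out, and `rhyme_clusters` and its input format (pairs of verse endings known to rhyme) are assumptions made for illustration.

```python
def rhyme_clusters(pairs):
    """Group verse endings into clusters, given pairs known to rhyme.

    Each pair is an undirected edge; clusters are the connected
    components of the resulting graph.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    # Sort for a deterministic result.
    return sorted(sorted(cluster) for cluster in clusters.values())
```

For example, the pairs `("amour", "jour")`, `("jour", "toujours")` and `("flamme", "lame")` yield two clusters, one with three endings and one with two.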
     54 
     55 === Misc ===
     56 
     57 - Fuzz testing: try giving random input to plint and check that it behaves
     58 - Better exception logging for the Web frontend
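A fuzzing harness along the lines of the first item could look like this. It is a sketch only: `check` stands for whatever callable runs plint on a piece of text (a hypothetical wrapper, since this file does not describe plint's API), and "behaves" is read here as "does not raise an exception".

```python
import random


def fuzz(check, rounds=1000, max_len=80, seed=0):
    """Feed random strings to `check` and collect the inputs that raise.

    Returns a list of (text, exception) pairs. The seed makes a run
    reproducible, so a failing input can be replayed.
    """
    rng = random.Random(seed)
    # Letters, French diacritics, punctuation, and a few control characters.
    alphabet = "abcdefghijklmnopqrstuvwxyzéèêàâîïôùûüçœ' -,.;:!?\n\t\x00"
    failures = []
    for _ in range(rounds):
        length = rng.randint(0, max_len)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            check(text)
        except Exception as exc:  # any exception counts as misbehaviour
            failures.append((text, exc))
    return failures
```

An empty result from `fuzz(check)` only means that none of the sampled inputs raised; it says nothing about whether the verdicts plint returned were correct.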
     59 
     60 === Problems ===
     61 
     62 - Loanwords from English ("crumble", "single", etc.) shouldn't be elidable
     63 - Loanwords from Latin ("ad patres") shouldn't be elidable
     64 
     65 == Other possible sources ==
     66 
     67 The following could be easily integrated, either from
     68 https://dramacode.github.io/ or from the indicated URL:
     69 
     70 corneille_surena https://fr.wikisource.org/wiki/Sur%C3%A9na
     71 corneille_pulcherie https://fr.wikisource.org/wiki/Pulch%C3%A9rie
     72 corneille_tite_et_berenice https://fr.wikisource.org/wiki/Tite_et_B%C3%A9r%C3%A9nice
     73 corneille_attila https://fr.wikisource.org/wiki/Attila
     74 corneille_othon https://fr.wikisource.org/wiki/Othon/Texte_entier
     75 corneille_sophonisbe https://fr.wikisource.org/wiki/Sophonisbe_(Corneille)
     76 corneille_sertorius https://fr.wikisource.org/wiki/Sertorius
     77 corneille_toison_dor https://fr.wikisource.org/wiki/La_Toison_d%E2%80%99or_(Corneille)
     78 corneille_nicomede https://fr.wikisource.org/wiki/Nicom%C3%A8de
     79 corneille_don_sanche_daragon https://fr.wikisource.org/wiki/Don_Sanche_d%E2%80%99Aragon
     80 corneille_heraclius https://fr.wikisource.org/wiki/H%C3%A9raclius_empereur_d%E2%80%99Orient
     81 corneille_theodore https://fr.wikisource.org/wiki/Th%C3%A9odore_vierge_et_martyre
     82 corneille_rodogune https://fr.wikisource.org/wiki/Th%C3%A9odore_vierge_et_martyre
     83 corneille_menteur_suite https://fr.wikisource.org/wiki/La_Suite_du_Menteur
     84 corneille_menteur https://fr.wikisource.org/wiki/Le_Menteur
     85 corneille_pompee https://fr.wikisource.org/wiki/Pomp%C3%A9e
     86 corneille_polyeucte https://fr.wikisource.org/wiki/Polyeucte/%C3%89dition_Masson,_1887
     87 corneille_cinna https://fr.wikisource.org/wiki/Cinna_ou_la_Cl%C3%A9mence_d%E2%80%99Auguste
     88 corneille_horace https://fr.wikisource.org/wiki/Horace_(Corneille)
     89 corneille_cid https://fr.wikisource.org/wiki/Le_Cid
     90 corneille_comedie_des_tuileries https://fr.wikisource.org/wiki/La_Com%C3%A9die_des_Tuileries
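The list above is one `name URL` pair per line, so it can be loaded mechanically before any download is attempted. The sketch below parses such lines, decodes the percent-encoded page title, and reports URLs that appear under more than one name. `parse_sources` and `duplicate_urls` are hypothetical helpers, and the `/wiki/` prefix handling assumes Wikisource-style URLs like those listed.

```python
from urllib.parse import unquote, urlsplit


def parse_sources(lines):
    """Map each source name to (url, decoded page title)."""
    sources = {}
    for line in lines:
        if not line.strip():
            continue
        name, url = line.split(None, 1)
        url = url.strip()
        title = unquote(urlsplit(url).path)
        if title.startswith("/wiki/"):
            title = title[len("/wiki/"):]
        sources[name] = (url, title)
    return sources


def duplicate_urls(sources):
    """Return {url: [names]} for every URL listed under several names."""
    by_url = {}
    for name, (url, _title) in sources.items():
        by_url.setdefault(url, []).append(name)
    return {url: names for url, names in by_url.items() if len(names) > 1}
```

Running `duplicate_urls` on the parsed list is a cheap sanity check for copy-paste mistakes, since two different works should not point at the same page.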
     91 
     92 Other ideas (trickier):
     93 
     94 - https://fr.wikisource.org/wiki/Imitation_de_J%C3%A9sus-Christ/Texte_entier
     95 - https://fr.wikisource.org/wiki/Po%C3%A9sies_diverses_(Corneille)
     96 - corneille_andromede https://fr.wikisource.org/wiki/Androm%C3%A8de (but much free verse)
     97