haspirater

detect aspirated 'h' in French words
git clone https://a3nm.net/git/haspirater/

commit 7bbc79a5ef1dee811df235f8d8f43703c6a74fa1
Author: Antoine Amarilli <a3nm@a3nm.net>
Date:   Mon, 30 May 2011 04:41:16 -0400

initial commit

Diffstat:
README | 105+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
additions | 29+++++++++++++++++++++++++++++
buildtrie.py | 42++++++++++++++++++++++++++++++++++++++++++
compresstrie.py | 22++++++++++++++++++++++
detect.pl | 22++++++++++++++++++++++
haspirater.json | 1+
haspirater.py | 40++++++++++++++++++++++++++++++++++++++++
majoritytrie.py | 24++++++++++++++++++++++++
make.sh | 13+++++++++++++
prepare.sh | 6++++++
trie2dot.py | 60++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
wikipedia | 593+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
12 files changed, 957 insertions(+), 0 deletions(-)

diff --git a/README b/README
@@ -0,0 +1,105 @@
+haspirater -- a toolkit to detect aspirated 'h' in French words
+Copyright (C) 2011 by Antoine Amarilli
+
+== 0. Licence ==
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+== 1. Features ==
+
+haspirater is a tool to detect if a French word starts with an aspirated
+'h' or not. It is not based on a list of words but on a trie trained
+from a corpus, which ensures that it should do a reasonable job for
+unseen words which are similar to known ones, without carrying a big
+exceptions list. The json trie used is less than 5 Kio, and the lookup
+script is 40 lines of Python.
+
+== 2. Usage ==
+
+If you just want to use the included training data, you can either run
+haspirater.py, giving one word per line in stdin and getting the
+annotation on stdout, or you can import it in a Python file and call
+haspirater.lookup(word) which returns True if the leading 'h' is
+aspirated, False if it isn't, and raises ValueError if there is no
+leading 'h'.
+
+Please report any errors in the training data, keeping in mind that only
+one possibility is returned even when both are attested.
+
+== 3. Training ==
+
+The training data used by haspirater.py is loaded at runtime from the
+haspirater.json file which has been trained from French texts taken from
+Project Gutenberg <www.gutenberg.org>, from the list in the Wikipedia
+article <http://fr.wikipedia.org/wiki/H_aspir%C3%A9>, and from a custom
+set of exceptions. If you want to create your own data, or adapt the
+approach here to other linguistic features, read on.
+
+The master script is make.sh which accepts French text on stdin and a
+list of exceptions files as arguments. Included exception files are
+additions and wikipedia. These exceptions are just like training data
+and are not stored as-is; they are just piped later on in the training
+phase. make.sh produces on stdout the json trie. Thus, you would run
+something like:
+
+  $ cat corpus | ./make.sh exceptions > haspirater.json
+
+== 4. Training details ==
+
+=== 4.1. Corpus preparation (prepare.sh) ===
+
+This script removes useless characters, and separates words (one per
+line).
+
+=== 4.2. Property inference (detect.pl) ===
+
+This script examines the output, notices occurrences of words for which
+the preceding word indicates the aspirated or non-aspirated status, and
+outputs them.
+
+=== 4.3. Removing leading 'h' ===
+
+This is a quick optimization.
+
+=== 4.4. Trie construction (buildtrie.py) ===
+
+The occurrences are read one after the other and are used to populate a
+trie carrying the value count for each occurrence having a given prefix.
+
+=== 4.5. Trie compression (compresstrie.py) ===
+
+The trie is then compressed by removing branches which are not needed to
+infer a value. This step could be followed by a removal of branches with
+very little dissent from the majority value if we wanted to reduce the
+trie size at the expense of accuracy: for aspirated h, this isn't
+needed.
+
+=== 4.6. Trie majority relabeling (majoritytrie.py) ===
+
+Instead of the list of values with their counts, nodes are relabeled to
+carry the most common value. This step could be skipped to keep
+confidence values.
+
+== 5. Additional stuff ==
+
+You can use trie2dot.py to convert the output of buildtrie.py or
+compresstrie.py to the dot format which can be used to render a drawing
+of the trie.
+
diff --git a/additions b/additions
@@ -0,0 +1,29 @@
+1 heaume
+1 hè1ment
+1 hertz
+1 héraut
+1 hit-parade
+1 high-five
+1 hlm
+1 hobby
+1 hongrois
+1 homard
+1 hoquet
+1 hors-piste
+1 hors-bord
+1 huée
+1 hildegarde
+1 hiroshima
+1 heimatlos
+0 Haÿ-1s-Roses
+0 heur
+0 heure
+0 h
+1 have1r
+0 hallucination
+0 hallucine
+0 halène
+0 halèner
+0 hadopisme
+1 hadopi
+0 hellénisme
diff --git a/buildtrie.py b/buildtrie.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+
+"""From a list of values (arbitrary) and keys (words), create a trie
+representing this mapping"""
+
+import json
+import sys
+
+# first item is a dictionary from values to an int indicating the
+# number of occurrences with this prefix having this value
+# second item is a dictionary from letters to descendant nodes
+def empty_node():
+    return [{}, {}]
+
+trie = empty_node()
+
+def insert(trie, key, val):
+    """Insert val for key in trie"""
+    values, children = trie
+    # create a new value, if needed
+    if val not in values.keys():
+        values[val] = 0
+    # increment count for val
+    values[val] += 1
+    if len(key) > 0:
+        # create a new node if needed
+        if key[0] not in children.keys():
+            children[key[0]] = empty_node()
+        # recurse
+        return insert(children[key[0]], key[1:], val)
+
+for line in sys.stdin.readlines():
+    line = line.split()
+    value = line[0]
+    word = line[1] if len(line) == 2 else ''
+    # a trailing space is used to mark termination of the word
+    # this is useful in cases where a prefix of a word is a complete,
+    # different word with a different value
+    insert(trie, word+' ', value)
+
+print(json.dumps(trie))
+
diff --git a/compresstrie.py b/compresstrie.py
@@ -0,0 +1,22 @@
+#!/usr/bin/env python3
+
+"""Read json trie in stdin, trim unneeded branches and output json dump
+to stdout"""
+
+import json
+import sys
+
+trie = json.load(sys.stdin)
+
+def compress(trie):
+    """Compress the trie"""
+    if len(trie[0].keys()) <= 1:
+        # no need for children, there is no more doubt
+        trie[1] = {}
+    for child in trie[1].values():
+        compress(child)
+
+compress(trie)
+
+print(json.dumps(trie))
+
diff --git a/detect.pl b/detect.pl
@@ -0,0 +1,22 @@
+#!/usr/bin/perl
+
+# From a list of '\n'-separated words, output occurrences of words
+# starting with 'h' when it can be inferred whether the word is aspirated
+# or not. The format is "0 word" for non-aspirated and "1 word" for
+# aspirated.
+
+my $asp = -1; # will the next word be aspirated? (-1: unknown)
+
+while (<>) {
+  $_ = lc($_);
+  print "$asp $_" if (/^h/i && $asp >= 0);
+  chop;
+  # we store in asp what the current word indicates about the next word
+  $asp = -1; # default is unknown
+  $asp = 0 if /^[lj]'$/;
+  $asp = 0 if /^cet$/;
+  $asp = 1 if /^ce$/;
+  # only meaningful are "je", "de", "le" and "la"
+  $asp = 1 if /^[jdl][ea]$/;
+}
+
diff --git a/haspirater.json b/haspirater.json
@@ -0,0 +1 @@
+["0", {"a": ["1", {" ": ["1", {}], "c": ["1", {}], "b": ["0", {"i": ["0", {}], "a": ["1", {}], "e": ["0", {}]}], "d": ["1", {"a": ["1", {}], "d": ["1", {}], "j": ["1", {}], "o": ["1", {"p": ["1", {"i": ["1", {" ": ["1", {}], "s": ["0", {}]}]}]}], "\u00ee": ["1", {}], "r": ["0", {}]}], "g": ["1", {}], "i": ["1", {}], "h": ["1", {}], "m": ["1", {}], "l": ["1", {"a": ["1", {}], "b": ["1", {}], "e": ["1", {"i": ["0", {}], "c": ["1", {}], "r": ["1", {}], "t": ["1", {}]}], "d": ["1", {}], "i": ["0", {}], "\u00e8": ["0", {"t": ["1", {}], "n": ["0", {}]}], "l": ["1", {"a": ["0", {}], " ": ["1", {}], "e": ["1", {}], "i": ["1", {}], "s": ["1", {}], "u": ["0", {}]}], "o": ["1", {}], "t": ["1", {}]}], "\u00ef": ["1", {}], "n": ["1", {}], "q": ["1", {}], "p": ["1", {}], "s": ["1",
{}], "r": ["1", {"a": ["1", {}], "c": ["1", {}], "e": ["1", {}], "d": ["1", {}], "g": ["1", {}], "f": ["1", {}], "i": ["1", {}], "k": ["1", {}], "m": ["0", {}], "l": ["1", {}], "o": ["1", {}], "n": ["1", {}], "p": ["1", {}], "r": ["1", {}], "t": ["1", {}], "v": ["1", {}]}], "u": ["1", {}], "v": ["1", {}], "y": ["1", {}]}], " ": ["0", {}], "\u00e2": ["1", {}], "e": ["0", {"a": ["1", {"r": ["1", {}], "u": ["1", {"m": ["1", {}], "t": ["0", {}]}]}], "i": ["1", {}], "m": ["1", {}], "l": ["0", {"l": ["0", {"\u00e9": ["0", {}], "e": ["0", {}], "o": ["1", {}]}], "v": ["0", {}]}], "n": ["1", {}], "p": ["1", {}], "s": ["1", {}], "r": ["0", {"c": ["1", {"h": ["1", {}], "u": ["0", {}]}], "b": ["0", {}], "m": ["0", {"\u00e9": ["0", {}], "i": ["0", {"t": ["1", {"a": ["0", {}], "i": ["1", {}]}], "n": ["0", {}]}]}], "n": ["1", {}], "s": ["1", {}], "t": ["1", {}]}], "u": ["0", {"s": ["1", {}], "r": ["0", {" ": ["0", {}], "e": ["0", {}], "t": ["1", {}]}], "l": ["1", {}], "/": ["1", {}]}], "x": ["0", {}]}], "i": ["0", {"a": ["1", {}], " ": ["1", {}], "c": ["1", {}], "b": ["1", {}], "e": ["1", {}], "d": ["1", {}], "g": ["1", {}], "f": ["1", {}], "\u00e9": ["1", {}], "h": ["1", {}], "l": ["1", {"a": ["0", {"i": ["1", {}], "r": ["0", {}]}], "b": ["1", {}], "e": ["1", {}], "d": ["1", {}], "o": ["1", {}]}], "n": ["1", {"d": ["1", {"i": ["1", {}], "o": ["0", {}]}]}], "p": ["0", {"p": ["0", {"i": ["1", {}], "o": ["0", {}]}], "h": ["1", {}], " ": ["1", {}]}], "s": ["0", {"s": ["1", {}], "t": ["0", {}]}], "r": ["1", {"a": ["1", {}], "o": ["1", {"s": ["1", {}], "n": ["0", {}]}]}], "t": ["1", {}], "v": ["0", {}]}], "\u00e8": ["1", {"1": ["1", {}], "r": ["1", {}], "b": ["0", {}], "l": ["1", {}]}], "\u00ea": ["1", {}], "l": ["1", {}], "o": ["0", {" ": ["1", {}], "c": ["1", {}], "b": ["1", {}], "d": ["1", {}], "g": ["1", {}], "f": ["1", {}], "m": ["0", {"\u00e9": ["0", {}], "a": ["1", {}], "m": ["0", {}], "e": ["1", {}], "o": ["0", {}]}], "l": ["1", {"\u00e0": ["1", {}], "l": ["1", {}], "o": ["0", 
{}], "d": ["1", {}]}], "o": ["1", {}], "n": ["0", {" ": ["1", {}], "d": ["1", {}], "g": ["1", {}], "o": ["0", {}], "n": ["0", {"i": ["1", {}], "\u00ea": ["0", {}], "e": ["0", {}]}], "s": ["1", {}], "u": ["0", {}], "t": ["1", {}]}], "q": ["1", {}], "p": ["1", {}], "s": ["0", {"a": ["1", {}], "p": ["0", {}], "t": ["0", {}]}], "r": ["0", {"d": ["1", {}], "i": ["0", {"z": ["0", {}], "o": ["1", {}]}], "m": ["1", {}], "l": ["0", {}], "n": ["1", {}], "s": ["1", {}], "r": ["0", {}]}], "u": ["1", {}], "t": ["1", {}], "y": ["1", {}]}], "\u00e9": ["0", {"s": ["0", {}], "r": ["1", {"i": ["0", {"s": ["1", {}], "t": ["0", {}]}], "a": ["1", {"u": ["1", {}], "l": ["0", {}]}], "\u00e9": ["0", {}], "o": ["1", {"s": ["1", {}], "\u00ef": ["0", {}], "n": ["1", {}]}]}], "m": ["0", {}], "l": ["0", {"i": ["0", {}], "a": ["1", {" ": ["1", {}], "s": ["0", {}]}], "e": ["1", {}]}], "b": ["0", {}]}], "u": ["0", {"a": ["1", {}], "c": ["1", {}], "b": ["1", {}], "e": ["1", {}], "d": ["0", {}], "g": ["1", {}], "p": ["1", {}], "i": ["1", {"s": ["0", {}], "t": ["1", {}], "l": ["0", {}]}], "m": ["0", {"a": ["0", {"i": ["0", {"s": ["1", {}], "n": ["0", {}]}], "g": ["1", {}], "n": ["0", {}]}], " ": ["1", {}], "b": ["0", {"l": ["0", {}], "o": ["1", {}]}], "e": ["0", {" ": ["1", {}], "r": ["1", {}], "u": ["0", {"x": ["1", {}], "r": ["0", {}]}], "m": ["1", {}]}], "i": ["0", {}], "o": ["1", {}], "p": ["1", {}], "u": ["0", {}]}], "l": ["1", {}], "\u00ee": ["0", {}], "q": ["1", {}], "\u00e9": ["1", {}], "s": ["1", {}], "r": ["1", {}], "t": ["1", {}], "n": ["1", {}]}], "\u00f4": ["0", {"p": ["0", {}], "t": ["0", {}], "l": ["1", {}]}], "H": ["0", {}], "y": ["0", {}], "\u00d4": ["0", {}]}]
diff --git a/haspirater.py b/haspirater.py
@@ -0,0 +1,40 @@
+#!/usr/bin/python3
+
+"""Determine if a word starts with an aspirated 'h' or not, by a lookup in
+a precompiled trie"""
+
+import os
+import json
+import sys
+
+f = open(os.path.join(os.path.dirname(
+    os.path.realpath(__file__)), 'haspirater.json'))
+trie = json.load(f)
+f.close()
+
+def do_lookup(trie, key):
+    if len(key) == 0 or (key[0] not in trie[1].keys()):
+        return trie[0]
+    return do_lookup(trie[1][key[0]], key[1:])
+
+def lookup(key):
+    """Return True iff key starts with an aspirated 'h'"""
+    if key == '' or key[0] != 'h':
+        raise ValueError
+    return do_lookup(trie, key[1:] + ' ') == '1'
+
+if __name__ == '__main__':
+    while True:
+        line = sys.stdin.readline()
+        if not line:
+            break
+        line = line.lower().lstrip().rstrip()
+        try:
+            result = lookup(line)
+            if result:
+                print("%s: aspirated" % line)
+            else:
+                print("%s: not aspirated" % line)
+        except ValueError:
+            print("%s: no leading 'h'" % line)
+
diff --git a/majoritytrie.py b/majoritytrie.py
@@ -0,0 +1,24 @@
+#!/usr/bin/env python3
+
+"""Read json trie in stdin, keep majority value at each node and output
+trie to stdout"""
+
+import json
+import sys
+
+trie = json.load(sys.stdin)
+
+def get_majority(d):
+    """What is the most probable value?"""
+    return max(d, key=d.get)
+
+def majority(trie):
+    """Keep only the most probable value at each node"""
+    trie[0] = get_majority(trie[0])
+    for child in trie[1].values():
+        majority(child)
+
+majority(trie)
+
+print(json.dumps(trie))
+
diff --git a/make.sh b/make.sh
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+# From a French text input and an exceptions dictionary, prepare the
+# trie.
+
+./prepare.sh | # reformat the text
+  ./detect.pl | # identify and label occurrences
+  cat - $* | # add in exceptions
+  sed 's/ h/ /g' | # we don't keep the useless leading 'h' in the trie
+  ./buildtrie.py | # prepare the trie
+  ./compresstrie.py | # compress the trie
+  ./majoritytrie.py # keep only the most frequent information
+
diff --git a/prepare.sh b/prepare.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+# Prepare a text for piping into detect.pl
+
+tr ' ' '\n' | tr -dc "a-zA-ZÀ-Ÿà-ÿ \n'-" | sed "s/'/'\n/"
+
diff --git a/trie2dot.py b/trie2dot.py
@@ -0,0 +1,60 @@
+#!/usr/bin/env python3
+
+"""Read json trie in stdin and output a graphviz dot description of it
+to stdout"""
+
+import json
+import sys
+from math import log
+
+trie = json.load(sys.stdin)
+
+free_id = 0
+
+def cget(d, k):
+    if k in d.keys():
+        return d[k]
+    else:
+        return 0
+
+def int2strbyte(i):
+    s = hex(i).split('x')[1]
+    if len(s) == 1:
+        return '0' + s
+    else:
+        return s
+
+def fraction2rgb(fraction):
+    n = int(255*fraction)
+    return int2strbyte(n)+'00'+int2strbyte(255 - n)
+
+def total(x):
+    key, node = x
+    return sum(node[0].values())
+
+def to_dot(trie, prefix=''):
+    global free_id
+
+    values, children = trie
+    my_id = free_id
+    free_id += 1
+    count = cget(values, "0") + cget(values, "1")
+    fraction = cget(values, "1") / count
+
+    # TODO illustrate count
+    print("%d [label=\"%s\",color=\"#%s\",penwidth=%d]" % (my_id, prefix,
+        fraction2rgb(fraction), 1+int(log(count))))
+
+    for (key, child) in sorted(children.items(), key=total, reverse=True):
+        i = to_dot(child, prefix+key)
+        print("%d -> %d [label=\"%s\",penwidth=%d]" % (my_id, i,
+            key, 1+int(log(total((None, child))))))
+
+    return my_id
+
+# TODO aspect causes graphviz crash?
+# TODO check if nodes don't get removed with aspect, it seems too good
+# to be true
+print("digraph G {\naspect=\"3\"\n")
+to_dot(trie, 'h')
+print("}")
diff --git a/wikipedia b/wikipedia
@@ -0,0 +1,593 @@
+1 habanera
+1 hâbler
+1 hâblerie
+1 hâbleur
+1 hache
+1 hacheécorce
+1 hacheécorces
+1 hachefourrage
+1 hachelégumes
+1 hachemaïs
+1 hachepaille
+1 hacher
+1 hachereau
+1 hachesarment
+1 hachesarments
+1 hachette
+1 hacheviande
+1 hachage
+1 hacheur
+1 hachis
+1 hachich
+1 hachisch
+1 hachoir
+1 hachure
+1 hack
+1 hackeur
+1 hacquebute
+1 hacquebutier
+1 hadal
+1 haddock
+1 hadîth
+1 hadj
+1 hadji
+1 hadopi
+1 haguais
+1 haguais
+1 hague
+1 hagard
+1 ha
+1 haha
+1 hahé
+1 haie
+1 haïe
+1 haïr
+1 haïk
+1 haillon
+1 haillonneux
+1 haine
+1 haineux
+1 haineusement
+1 haïr
+1 haïssable
+1 halage
+1 halbran
+1 halde
+1 hâle
+1 halecret
+1 haler
+1 hâler
+1 haleter
+1 halètement
+1 hall
+1 halle
+1 hallebarde
+1 hallebardier
+1 hallier
+1 hallstatien
+1 halo
+1 haloir
+1 halophile
+1 halot
+1 halte
+1 hamac
+1 hamada
+1 hamal
+1 hambourg
+1 hamburger
+1 hameau
+1 hammal
+1 hammam
+1 hammerfest
+1 hammerless
+1 hampe
+1 hamster
+1 han
+1 hanap
+1 hanche
+1 hanchement
+1 hancher
+1 hand
+1 handball
+1 handballeur
+1 handicap
+1 handicaper
+1 hangar
+1 hanneton
+1 hannetonner
+1 hanse
+1 hanséatique
+1 hanter
+1 hantise
+1 happe
+1 happelourde
+1 happer
+1 happening
+1 happement
+1 happyend
+1 haquebute
+1 haquebutier
+1 haquenée
+1 haquet
+1 harakiri
+1 harangue
+1 haranguer
+1 harangueur
+1 haras
+1 harassant
+1 harasser
+1 harassement
+1 harceler
+1 harcèlement
+1 harceleur
+1 hachich
+1 harald
+1 harde
+1 harder
+1 hardes
+1 hardi
+1 hardiesse
+1 hardiment
+1 hardware
+1 harem
+1 hareng
+1 harengère
+1 haret
+1 harfang
+1 hargne
+1 hargneux
+1 hargneusement
+1 haricot
+1 haricoter
+1 haridelle
+1 harissa
+1 harka
+1 harki
+1 harle
+1 harlou
+1 harnacher
+1 harnacheur
+1 harnachement
+1 harnais
+1 harnois
+1 harold
+1 haro
+1 harpailler
+1 harpe
+1 harper
+1 harpie
+1 harpiste
+1 harpon
+1 harponner
+1 harponneur
+1 harponnage
+1 harry
+1 hart
+1 harvard
+1 hasard
+1 hasarder
+1 hasardeux
+1 hasardeusement
+1 hasbeen
+1 haschich
+1 hase
+1 hast
+1 hastaire
+1 haste
+1 hastings
+1 hâte
+1 hâtelet
+1 hâtelette
+1 hâter
+1 hâtier
+1 hâtif
+1 hâtiveau
+1 hâtivement
+1 hauban
+1 haubanner
+1 haubanneur
+1 haubergeon
+1 haubert
+1 hausse
+1 haussecol
+1 haussement
+1 haussepied
+1 hausser
+1 hausseur
+1 hausseusement
+1 haussier
+1 haussier
+1 haut
+1 hautain
+1 hautain
+1 hautbois
+1 hautdechausses
+1 hautdeforme
+1 hautecontre
+1 hauteforme
+1 hautement
+1 hautesse
+1 hauteur
+1 hautescontre
+1 hautesformes
+1 hautfond
+1 hautin
+1 hautlecœur
+1 hautlecorps
+1 hautlepied
+1 hautparleur
+1 hautparleurs
+1 hautrelief
+1 hautsdechausses
+1 hautsdeforme
+1 hautsfonds
+1 hautsreliefs
+1 hauturier
+1 havage
+1 havanais
+1 havanais
+1 havane
+1 havane
+1 hâve
+1 haveneau
+1 havenet
+1 haver
+1 haveur
+1 havir
+1 havrais
+1 havrais
+1 havre
+1 havre
+1 havresac
+1 havresacs
+1 hayon
+1 heaume
+1 heaumier
+1 heimatlos
+1 hein
+1 héler
+1 héleur
+1 hèlement
+1 hello
+1 hem
+1 hemloc
+1 henné
+1 hennir
+1 hennissant
+1 hennissement
+1 hennisseur
+1 henri
+1 henry
+1 henry
+1 hep
+1 héraut
+1 herchage
+1 hercher
+1 hercheur
+1 hère
+1 hérissement
+1 hérisser
+1 hérisseur
+1 hérisson
+1 hérissonner
+1 hermitique
+1 herniaire
+1 hernie
+1 hernieux
+1 héron
+1 héronnier
+1 héros
+1 herschage
+1 herscher
+1 herscheur
+1 herse
+1 herser
+1 hertz
+1 hertzien
+1 hesse
+1 hêtraie
+1 hêtre
+1 heu/heux
+1 heulandite
+1 heurt
+1 heurtement
+1 heurtequin
+1 heurter
+1 heurteur
+1 heurtoir
+0 hélas
+1 hi
+1 hiatal
+1 hibou
+1 hic
+1 hic
+1 hickory
+1 hideur
+1 hideusement
+1 hideux
+1 hie
+1 hiement
+1 hier
+1 hiéracocéphale
+1 hiérarchie
+1 hiérarchique
+1 hiérarchiquement
+1 hiérarchiser
+1 hiérarchisation
+1 hiératique
+1 hiératiquement
+1 hiératisant
+1 hiératisé
+1 hiératisme
+1 hiérochromie
+1 hiérocrate
+1 hiérocratisme
+1 hiérodrame
+1 hiérogamie
+1 hiérogamique
+1 hiéroglyphe
+1 hiéroglyphé
+1 hiéroglyphie
+1 hiéroglyphié
+1 hiéroglyphique
+1 hiéroglyphiquement
+1 hiéroglyphisme
+1 hiéroglyphite
+1 hiérogramme
+1 hiérogrammate
+1 hiérogrammatisme
+1 hiérographe
+1 hiéromancie
+1 hiéromoine
+1 hiérophanie
+1 hiéroscopie
+1 hiéroscopique
+1 hifi
+1 highlandais
+1 highlander
+1 highlands
+1 highlife
+1 highlifer
+1 highlifeur
+1 hihan
+1 hilaire
+1 hile
+1 hiloire
+1 hilbert
+1 hildegarde
+1 hindi
+1 hip
+1 hiphop
+1 hippie
+1 hiragana
+1 hiroshima
+1 hissage
+1 hisser
+1 hissement
+1 hisseur
+1 hit
+1 hitparade
+1 hitparades
+1 hittite
+1 hittite
+1 ho
+1 hobart
+1 hobby
+1 hobereau
+1 hobereautaille
+1 hoberelle
+1 hoc
+1 hoca
+1 hocco
+1 hoche
+1 hochement
+1 hochepot
+1 hochequeue
+1 hochequeues
+1 hocher
+1 hochet
+1 hockey
+1 hockeyeur
+1 hocko
+1 hodja
+1 hoffmannesque
+1 hoffmannien
+1 hognement
+1 hogner
+1 holà
+1 holding
+1 hôler
+1 holdup
+1 hollandais
+1 hollandais
+1 hollandaisement
+1 hollande
+1 hollande
+1 hollandé
+1 hollandiser
+1 hollandobelge
+1 hollandofrançais
+1 hollandonorvégien
+1 hollandosaxon
+1 hollywood
+1 hollywoodesque
+1 hollywoodien
+1 homard
+1 homarderie
+1 homardier
+1 home
+1 homecinema
+1 homespun
+1 hon
+1 honduras
+1 hondurien
+1 hondurien
+1 hongkong
+1 hongkongais
+1 hongre
+1 hongreline
+1 hongrer
+1 hongreur
+1 hongrie
+1 hongrois
+1 hongrois
+1 hongroyage
+1 hongroyer
+1 hongroyeur
+1 honnir
+1 honnissement
+1 honshu
+1 honte
+1 honteux
+1 honteusement
+1 hooligan
+1 hop
+1 hoquet
+1 hoqueter
+1 hoquètement
+1 hoqueton
+1 horde
+1 horion
+1 hormis
+1 hornblende
+1 hors
+1 horsain
+1 horsbord
+1 horsbords
+1 horscaste
+1 horscastes
+1 horsd’œuvre
+1 horseguard
+1 horseguards
+1 horsepox
+1 horsjeu
+1 horslaloi
+1 horssérie
+1 horst
+1 horstexte
+1 hosanna
+1 hosannière
+1 hotdog
+1 hotdogs
+1 hotte
+1 hottée
+1 hotter
+1 hottentot
+1 hotteur
+1 hou
+1 houp
+1 houblon
+1 houblonner
+1 houblonneur
+1 houblonnière
+1 houdan
+1 houdan
+1 houe
+1 houhou
+1 houille
+1 houiller
+1 houillère
+1 houilleux
+1 houka
+1 houle
+1 houler
+1 houlette
+1 houleux
+1 houleusement
+1 houlier
+1 houlque
+1 hoummous
+1 houp
+1 houppe
+1 houppelande
+1 houppette
+1 houppier
+1 houque
+1 houraillis
+1 hourd
+1 hourdage
+1 hourder
+1 hourdi
+1 hourdis
+1 houret
+1 houri
+1 hourque
+1 hourra
+1 hourvari
+1 houseau
+1 houspiller
+1 houspilleur
+1 houspillement
+1 houssage
+1 houssaie
+1 housse
+1 housser
+1 houssine
+1 houssoir
+1 houston
+1 houx
+1 hoyau
+1 huard
+1 hublot
+1 huche
+1 huchée
+1 hucher
+1 huchet
+1 huchier
+1 hue
+1 huée
+1 huer
+1 huerta
+1 huehau
+1 hugo
+1 hugolâtre
+1 hugolâtrie
+1 hugolien
+1 hugotique
+1 hugotisme
+1 hugues
+1 huguenot
+1 huit
+1 huitain
+1 huitaine
+1 huitante
+1 huitième
+1 hulotte
+1 hululation
+1 hululer
+1 hum
+1 humage
+1 humement
+1 humer
+1 humeux
+1 huns
+1 humoter
+1 hune
+1 hunier
+1 hunter
+1 huppe
+1 huppé
+1 huque
+1 hure
+1 hurlade
+1 hurlée
+1 hurlement
+1 hurler
+1 hurleur
+1 huroiroquois
+1 huroiroquois
+1 huron
+1 huron
+1 huronien
+1 huronien
+1 hurricane
+1 husky
+1 hussard
+1 hussite
+1 hussitisme
+1 hutin
+1 hutinet
+1 hutte
+1 hutteau
+1 hutter
+1 huttier
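
Reviewer's note: the training steps in this commit (buildtrie.py, compresstrie.py, majoritytrie.py) and the lookup in haspirater.py can be tried end to end without a corpus. The following is a minimal sketch of that pipeline in one file, not the repository's code: the function names build, compress, relabel, lookup_trie and is_aspirated and the five-word SAMPLE list are assumptions made up for illustration. Only the node layout (a two-item list of value counts and children), the dropped leading 'h' and the trailing-space terminator are taken from the scripts above.

```python
import json

def empty_node():
    # [value counts, children]: the two-slot layout used by buildtrie.py
    return [{}, {}]

def build(pairs):
    """Count, for every prefix, how often each value occurs."""
    root = empty_node()
    for value, word in pairs:
        node = root
        # drop the leading 'h'; a trailing space marks the end of the word
        for letter in word[1:] + ' ':
            node[0][value] = node[0].get(value, 0) + 1
            node = node[1].setdefault(letter, empty_node())
        node[0][value] = node[0].get(value, 0) + 1
    return root

def compress(node):
    """Drop the children of any node whose occurrences all agree."""
    if len(node[0]) <= 1:
        node[1] = {}
    for child in node[1].values():
        compress(child)

def relabel(node):
    """Replace the count table by the most frequent value."""
    node[0] = max(node[0], key=node[0].get)
    for child in node[1].values():
        relabel(child)

def lookup_trie(node, key):
    """Follow key as far as the trie goes and return the label found there."""
    for letter in key:
        if letter not in node[1]:
            break
        node = node[1][letter]
    return node[0]

def is_aspirated(trie, word):
    if not word.startswith('h'):
        raise ValueError("no leading 'h'")
    return lookup_trie(trie, word[1:] + ' ') == '1'

# hypothetical training data, as (value, word) pairs
SAMPLE = [('1', 'haricot'), ('1', 'hasard'), ('0', 'habit'),
          ('0', 'heure'), ('1', 'heurt')]

trie = build(SAMPLE)
compress(trie)
relabel(trie)
print(json.dumps(trie))
print(is_aspirated(trie, 'harpe'))    # True: falls back to the 'har' node
print(is_aspirated(trie, 'heureux'))  # False: shares the prefix 'heure'
```

Neither 'harpe' nor 'heureux' is in SAMPLE; each gets the label of the deepest node its letters reach, which is the generalisation to unseen words described in the README. With only five words the labels are of course not linguistically reliable.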