Hyphenation exception dictionaries

Hyphenation exception dictionaries allow you to declare your own hyphenation break points or stop TopLeaf from hyphenating a word altogether. Each dictionary file declares a list of exception words for a specific language. When processing multi-lingual content, you can define and load multiple exception lists.

Using a hyphenation exception dictionary

To use a hyphenation exception dictionary you must:

  1. Create an exception dictionary file;

  2. Enable Use Hyphenation Dictionary from the Format » Options dialog;

  3. Load the exceptions list using the <dictionary/> directive.

Exception dictionary file format

An exception dictionary file is a text file created using either UTF-8 or UTF-16 character encoding. The first two or three bytes of the file are examined to see if a byte order mark (U+FEFF) is present. If so, it is used to determine the encoding. The byte order mark is not treated as part of the data. If no byte order mark is found, UTF-8 is assumed.
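
The encoding check described above can be sketched in Python. This is only an illustration of byte order mark detection in general, not TopLeaf's own code; the function names detect_encoding and read_dictionary_text are hypothetical.

```python
import codecs
from typing import Tuple


def detect_encoding(raw: bytes) -> Tuple[str, int]:
    """Return (encoding name, byte order mark length) for raw file contents."""
    if raw.startswith(codecs.BOM_UTF8):        # bytes EF BB BF
        return "utf-8", 3
    if raw.startswith(codecs.BOM_UTF16_LE):    # bytes FF FE
        return "utf-16-le", 2
    if raw.startswith(codecs.BOM_UTF16_BE):    # bytes FE FF
        return "utf-16-be", 2
    return "utf-8", 0                          # no byte order mark: assume UTF-8


def read_dictionary_text(raw: bytes) -> str:
    """Decode the file contents, leaving the byte order mark out of the data."""
    encoding, bom_length = detect_encoding(raw)
    return raw[bom_length:].decode(encoding)
```

A file saved without a byte order mark falls through to the final return and is decoded as UTF-8.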

The dictionary file declares one or more hyphenation word definitions, entered on separate lines terminated by a recognized line break. A hyphenation word definition specifies the permitted hyphenation points for a single alphabetic word. Hyphenation points are identified using either a hard hyphen (Unicode code point U+002D) or a soft hyphen (Unicode code point U+00AD), as shown in the following examples:

pre-car-iously
pol-y-mor-phic
mis-place
mis-placed
pre&#x002D;ce&#x002D;dent
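
To make the entry format concrete, the following Python sketch reads one definition and returns the plain word together with its permitted break positions. It is an illustration only: the names expand_char_refs and parse_entry are hypothetical, and the sketch assumes that character references are expanded before hyphens are located and that words are compared in lower case.

```python
import re

HARD_HYPHEN = "\u002D"
SOFT_HYPHEN = "\u00AD"

# Matches numeric XML character references such as &#x002D; or &#45;
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(text):
    """Replace each numeric character reference with the character it names."""
    def replace(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        return chr(int(match.group(2)))
    return _CHAR_REF.sub(replace, text)


def parse_entry(line):
    """Return (word, breaks) where breaks are offsets into the unhyphenated word."""
    entry = expand_char_refs(line.strip()).lower()  # assumption: lower case as the canonical form
    letters = []
    breaks = []
    for ch in entry:
        if ch in (HARD_HYPHEN, SOFT_HYPHEN):
            breaks.append(len(letters))  # a break is permitted before the next letter
        else:
            letters.append(ch)
    return "".join(letters), breaks
```

Under these assumptions, parse_entry("pol-y-mor-phic") yields the word "polymorphic" with breaks at offsets 3, 4 and 7.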

Restrictions

Entries within an exception dictionary:

  • have a maximum length of 255 characters;

  • are case insensitive;

  • can be entered in any order;

  • can include XML character references to refer to specific Unicode letter code point values;

  • cannot include white space characters.

Word definitions must not include any punctuation characters (other than hyphens).
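
Before loading a dictionary, it can be useful to check entries against these restrictions. The Python sketch below is one possible check, not part of TopLeaf: the name entry_problems is hypothetical, the length limit is assumed to apply to the entry as typed, and "punctuation" is assumed to mean the Unicode punctuation categories.

```python
import re
import unicodedata

MAX_ENTRY_LENGTH = 255
HYPHENS = {"\u002D", "\u00AD"}  # hard hyphen and soft hyphen

# Numeric XML character references such as &#x00E9; are permitted in entries
_CHAR_REF = re.compile(r"&#(?:x[0-9A-Fa-f]+|[0-9]+);")


def entry_problems(entry):
    """Return a list of restriction violations for one dictionary entry."""
    problems = []
    if len(entry) > MAX_ENTRY_LENGTH:  # assumption: measured on the entry as typed
        problems.append("longer than 255 characters")
    if any(ch.isspace() for ch in entry):
        problems.append("contains white space")
    # Drop character references first so their '&', '#' and ';' are not flagged
    remainder = _CHAR_REF.sub("", entry)
    if any(
        ch not in HYPHENS and unicodedata.category(ch).startswith("P")
        for ch in remainder
    ):
        problems.append("contains punctuation other than a hyphen")
    return problems
```

An empty list means the entry passed every check in this sketch; for example, entry_problems("mis place") reports the white space.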