HTML table processing

TopLeaf supports a subset of the HTML table model DTD as defined at http://www.w3.org/TR/REC-html40/struct/tables.html.

The table element identifies the top level container that declares the content and structure of an HTML table.

Implementation

The TopLeaf composition engine renders HTML tables incrementally, as table rows arrive, rather than waiting for all of a table's data before it begins rendering.

Table width and alignment

The table width and align attributes cannot be used to specify a preferred width or alignment for the entire HTML table.

Table directionality

The directionality of a table is either the inherited directionality (the default is left-to-right) or that specified by the dir attribute of the table element.

Table nesting

A table cell can contain another nested table subject to the following restrictions:

  • Cells in a header row may not contain tables.

  • Nested tables may have a header, but nested table header rows do not repeat if the table breaks across a page or column.

  • The maximum depth of a nested table header is the data block depth less the minimum table row depth.

Table frames

The table frame attribute specifies which sides of the frame surrounding a table will be visible.

The table border attribute is recognised and is processed in the following way:

  • Setting border="0" implies frame="void" and, unless otherwise specified, rules="none".

  • Other values of border imply frame="border" and, unless otherwise specified, rules="all".

  • If the border attribute is not explicitly declared, the default value of the rules attribute is "none".
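
The border rules above can be sketched as a small function. This is only an illustration, not TopLeaf code: the name implied_frame_and_rules is hypothetical, None stands for an attribute that was not declared, and the sketch assumes that an explicitly declared frame value takes precedence over the implied one (the rules above only state this for the rules attribute).

```python
from typing import Optional, Tuple


def implied_frame_and_rules(
    border: Optional[str],
    frame: Optional[str] = None,
    rules: Optional[str] = None,
) -> Tuple[Optional[str], str]:
    """Return (frame, rules) after applying the border attribute rules.

    None means the attribute was not declared on the table element.
    """
    if border is None:
        # No border attribute: rules defaults to "none"; frame stays as declared.
        return (frame, rules if rules is not None else "none")
    if border.strip() == "0":
        implied_frame, implied_rules = "void", "none"
    else:
        implied_frame, implied_rules = "border", "all"
    # Assumption: explicitly declared frame or rules values win over implied ones.
    return (
        frame if frame is not None else implied_frame,
        rules if rules is not None else implied_rules,
    )


print(implied_frame_and_rules("0"))                # ('void', 'none')
print(implied_frame_and_rules("1"))                # ('border', 'all')
print(implied_frame_and_rules("2", rules="rows"))  # ('border', 'rows')
```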

You cannot use the value of the border attribute to set the table border width.

Certain table frame requests may produce unexpected results. For example, <table frame="void" rules="rows"> will produce a ruleoff under each table row, including the last row in the table.

Internal table rules

The table element rules attribute specifies which rules appear between cells within an HTML table. The implied value of the rules attribute is "none". Unless otherwise specified, a non-zero border implies rules="all".

A non-standard extension to the HTML table model recognises the colsep and rowsep attributes within a <td> or <th> element. These attributes are analogous to the equivalent attributes within a CALS <entry> tag. They can be used to declare the internal table column and row separator for an individual table cell.

Individual mappings can control the way TopLeaf interprets the implied rules attribute value. This is done by toggling the "Implied row/column separator is visible" settings on the mapping Table tab. The state of the implied column separator is used when processing vertical rules that appear between table cells or column groups. The state of the implied row separator is used when processing rules that appear below table cells or row groups.

Row groups

Table rows may be grouped into a table head using the thead element and into one or more table body sections, each defined by a tbody element. The tfoot element is not supported. A tfoot block generates a non-fatal warning and is processed as a tbody block.

The scope and headers attributes of table cells (td and th elements) are not supported.

Table columns

A table may contain optional col and colgroup elements to specify column widths and groupings. If you do not specify a colgroup block or explicit col declarations then each table column will be allocated a percentage of the available data column width.

TopLeaf applies the following rules when calculating the number of table columns:

  1. If the table element contains colgroup or col elements, then the number of columns is calculated by summing the following:

    • For each col element, take the value of its span attribute (default value 1).

    • For each colgroup element containing at least one col element, ignore the span attribute for the colgroup element. For each col element within the colgroup take the value of its span attribute (default value 1).

    • For each empty colgroup element, take the value of its span attribute (default value 1).

  2. Otherwise, if the table element does not contain any colgroup or col elements, the number of columns is calculated on the basis of what is required by the table cells for each row processed as the table is incrementally rendered.

It is an error if a table contains colgroup or col declarations and the column counts determined by steps 1 and 2 are not identical.
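
The column counting rules above can be sketched as follows. This is an illustration under stated assumptions rather than TopLeaf's implementation: the function names are hypothetical, the table must be well-formed XML, the row count ignores cells that span down from earlier rows, and the whole table is inspected at once instead of incrementally.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def _span(element: ET.Element) -> int:
    """Read a span attribute, defaulting to 1 when it is absent."""
    return int(element.get("span", "1"))


def declared_column_count(table: ET.Element) -> Optional[int]:
    """Step 1: sum col/colgroup spans; None if nothing is declared."""
    total, declared = 0, False
    for child in table:
        if child.tag == "col":
            declared = True
            total += _span(child)
        elif child.tag == "colgroup":
            declared = True
            cols = child.findall("col")
            # A colgroup's own span only counts when the group is empty.
            total += sum(_span(c) for c in cols) if cols else _span(child)
    return total if declared else None


def table_rows(table: ET.Element) -> List[ET.Element]:
    """Collect rows directly under the table and its row groups."""
    rows = table.findall("tr")
    for tag in ("thead", "tbody", "tfoot"):
        for section in table.findall(tag):
            rows.extend(section.findall("tr"))
    return rows


def row_column_count(row: ET.Element) -> int:
    """Step 2 (simplified): columns required by the cells of one row."""
    return sum(int(c.get("colspan", "1")) for c in row if c.tag in ("td", "th"))


def check_columns(table: ET.Element) -> int:
    """Return the column count, raising if steps 1 and 2 disagree."""
    declared = declared_column_count(table)
    required = max((row_column_count(r) for r in table_rows(table)), default=0)
    if declared is not None and declared != required:
        raise ValueError(f"declared {declared} columns, rows require {required}")
    return required


SAMPLE = """<table>
<colgroup span="2"/>
<colgroup span="5"><col span="3"/><col/></colgroup>
<tbody><tr><td colspan="2">a</td><td>b</td><td>c</td><td>d</td><td>e</td></tr></tbody>
</table>"""

print(check_columns(ET.fromstring(SAMPLE)))  # 6: 2 + (3 + 1); span="5" is ignored
```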

Because TopLeaf renders HTML tables incrementally, it is preferable that an HTML table defines the number of columns and their widths before the first row of the table is processed. The simplest way to do this is to specify the table columns using col or colgroup declarations.

Column widths

Column widths can be specified as a pixel value, a percentage, or a relative length. Percentage (e.g. width="20%") and proportional (e.g. width="20*") specifications are resolved as a percentage of the available measure (the space available after applying paragraph formatting).

A non-standard extension to the HTML table model permits the specification of column widths using one of the following fixed measure units: pt (points), cm (centimeters), mm (millimeters), pi (picas), in (inches), pc (treated as a synonym for pi) and dp (decipoints).

The column width is interpreted as a pixel value (px) if no percentage, relative length, or non-standard fixed unit is specified.

Measurements in px (pixels) are converted to an absolute measure using the current device resolution. The <topleaf-properties/> directive can be used to set the device resolution.

Use of the special form width="0*" (zero asterisk) is not supported.
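
The width forms described above can be told apart with a small parser. The sketch below is illustrative only: classify_width and POINTS_PER_UNIT are hypothetical names, the conversion factors assume 72 points per inch (12 points per pica, one tenth of a point per decipoint), which is an assumption and not a documented TopLeaf value, and the device resolution must be supplied by the caller. A bare "*" is not handled.

```python
import re
from typing import Tuple

# Assumed conversion factors to points (72 pt per inch); not taken from TopLeaf.
POINTS_PER_UNIT = {
    "pt": 1.0,
    "dp": 0.1,
    "pi": 12.0,
    "pc": 12.0,  # treated as a synonym for pi
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
}

_WIDTH = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(%|\*|[a-z]{2})?\s*$", re.IGNORECASE)


def classify_width(value: str, dpi: float) -> Tuple[str, float]:
    """Classify a width as ("percent", n), ("relative", n) or ("points", n)."""
    match = _WIDTH.match(value)
    if match is None:
        raise ValueError(f"unrecognised width: {value!r}")
    number = float(match.group(1))
    unit = (match.group(2) or "px").lower()  # no unit means pixels
    if unit == "%":
        return ("percent", number)
    if unit == "*":
        if number == 0:
            raise ValueError('width="0*" is not supported')
        return ("relative", number)
    if unit == "px":
        # Pixels become an absolute measure via the device resolution.
        return ("points", number * 72.0 / dpi)
    if unit in POINTS_PER_UNIT:
        return ("points", number * POINTS_PER_UNIT[unit])
    raise ValueError(f"unknown unit in width: {value!r}")


print(classify_width("20%", dpi=96))  # ('percent', 20.0)
print(classify_width("2pi", dpi=96))  # ('points', 24.0)
print(classify_width("96", dpi=96))   # ('points', 72.0)
```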

In the case where an HTML table does not define a set of col or colgroup declarations, the table renderer will attempt to determine the column widths using the value of the width attribute in td and th elements. Specifying column widths in this way can lead to malformed tables.

If the total requested width for all columns exceeds the available measure, TopLeaf will proportionally reduce all column widths to ensure that the table fits within the block.
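
A proportional reduction of this kind can be sketched in a few lines. The function name fit_to_measure is hypothetical, and the sketch assumes all widths have already been resolved to the same absolute unit; it is not TopLeaf's actual algorithm.

```python
from typing import List


def fit_to_measure(widths: List[float], measure: float) -> List[float]:
    """Scale column widths down uniformly when their total exceeds the measure."""
    total = sum(widths)
    if total <= measure:
        # The table already fits; leave the requested widths unchanged.
        return list(widths)
    scale = measure / total
    return [w * scale for w in widths]


print(fit_to_measure([200.0, 300.0, 100.0], 300.0))  # [100.0, 150.0, 50.0]
print(fit_to_measure([100.0, 100.0], 300.0))         # [100.0, 100.0]
```

Scaling every column by the same factor keeps the requested proportions between columns intact.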

Cell margins

The table cellspacing, cellpadding, scope and headers attributes are not supported.

Cell horizontal alignment

Explicit references to the char and charoff attributes may generate a warning when the table is rendered. A value of align="char" is interpreted as align="right".

Cell vertical alignment

A non-standard extension to the HTML table model recognises an additional vertical alignment mode: step. Step alignment vertically aligns the first line of content in a cell with the last line of content in the previous adjacent cell. See the <cell-properties/> directive for more details.

Cross page cell vertical spanning

Cell vertical spanning is supported. The typesetting engine will generate a fatal error if a cell declares a negative vertical span. By default, a vertical span will not split across a column or page boundary. Use the command <table-properties split-rows="yes"/> if you want a cell vertical span to split across such a boundary.

Row splitting

HTML tables that are allowed to continue across column or page boundaries will normally break between table rows. When table row splitting is enabled, a break across a column or page boundary may also occur within a table row.

Tag processing

When HTML table processing is enabled, the following elements are assumed to be components of an HTML table structure:

  • table

  • caption

  • thead

  • tbody

  • tfoot

  • colgroup

  • col

  • tr

  • th

  • td

In DTD-less mode, TopLeaf refers to an internally defined table DTD fragment that defines these tags and a set of default attribute values.

Unsupported features

The tfoot element is recognised, but does not generate a table footer block.

Restrictions

See Limitations - Tables.