CALS Parser Implementation

This module can parse the tables (table elements) of a CALS file.

Specifications and examples:

The main elements of a CALS table are:

  • table: a table can contain one or several tgroup elements.

    • titles: table titles (not supported by the CALS parser)

    • tgroup: a portion of table

      • colspec: column specifications

      • spanspec: spanning specifications (not supported by the CALS parser)

      • thead: table header

        • colspec: header column specifications (not supported by the CALS parser)

        • row: table row (see tbody)

      • tfoot: table footer

        • colspec: footer column specifications (not supported by the CALS parser)

        • row: table row (see tbody)

      • tbody: table body

        • row: table row

          • entry: table entry which contains paragraphs

          • entrytbl: table entry which contains a table (not supported by the CALS parser)
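To make the hierarchy above concrete, here is a minimal sketch that walks a small CALS table with Python's standard xml.etree.ElementTree. It is not how CalsParser works internally (the module itself relies on lxml); the sample document and the summarize_table helper are made up for illustration, and the sample uses no XML namespace.

```python
import xml.etree.ElementTree as ET

# A hand-written sample table (illustrative only, no namespace).
CALS_SAMPLE = """\
<table>
  <tgroup cols="2">
    <colspec colname="c1" colwidth="30mm"/>
    <colspec colname="c2" colwidth="50mm"/>
    <thead>
      <row><entry>Name</entry><entry>Value</entry></row>
    </thead>
    <tbody>
      <row><entry>alpha</entry><entry>1</entry></row>
      <row><entry>beta</entry><entry>2</entry></row>
    </tbody>
  </tgroup>
</table>
"""


def summarize_table(table):
    """Return one summary dict per tgroup (hypothetical helper)."""
    summaries = []
    for tgroup in table.findall("tgroup"):
        summary = {
            "colnames": [c.get("colname") for c in tgroup.findall("colspec")]
        }
        # thead, tfoot and tbody all hold rows made of entry elements.
        for part in ("thead", "tfoot", "tbody"):
            rows = tgroup.findall(part + "/row")
            summary[part] = [
                [entry.text for entry in row.findall("entry")] for row in rows
            ]
        summaries.append(summary)
    return summaries


summary = summarize_table(ET.fromstring(CALS_SAMPLE))
```

Each tgroup is summarized separately because, as listed above, a table may contain several of them, each with its own column specifications.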

An example of a CALS table is available on Wikipedia: CALS Table Model

New in version 0.5.0.

class benker.parsers.cals.CalsParser(builder, cals_ns=None, width_unit='mm', **options)

Bases: benker.parsers.base_parser.BaseParser

CALS tables parser

get_cals_qname(name)
parse_cals_colspec(cals_colspec)

Parse a CALS-like colspec element.

For instance:

<colspec
  colname="c1"
  colnum="1"
  colsep="1"
  rowsep="1"
  colwidth="30mm"
  align="center"/>

Parameters

cals_colspec (ElementType) – CALS-like colspec element.
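As a rough illustration of what reading such a colspec involves, the sketch below collects the attributes of the element shown above into a plain dictionary and splits the colwidth value into a number and a unit. The read_colspec helper, its return format, and the default_unit parameter are assumptions made for this example; they do not describe what parse_cals_colspec actually returns or stores.

```python
import re
import xml.etree.ElementTree as ET

COLSPEC = ET.fromstring(
    '<colspec colname="c1" colnum="1" colsep="1" rowsep="1" '
    'colwidth="30mm" align="center"/>'
)

# A number followed by an optional unit, e.g. "30mm" or "1.5in".
WIDTH_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z*%]*)\s*$")


def read_colspec(colspec, default_unit="mm"):
    """Collect colspec attributes into a plain dict (hypothetical helper)."""
    colnum = colspec.get("colnum")
    info = {
        "colname": colspec.get("colname"),
        "colnum": int(colnum) if colnum else None,
        # Simplification: "1" is read as "draw the separator",
        # any other value (or a missing attribute) as "no separator".
        "colsep": colspec.get("colsep") == "1",
        "rowsep": colspec.get("rowsep") == "1",
        "align": colspec.get("align"),
        "width": None,
    }
    match = WIDTH_RE.match(colspec.get("colwidth", ""))
    if match:
        value, unit = match.groups()
        # Fall back to the default unit when the width is a bare number.
        info["width"] = (float(value), unit or default_unit)
    return info


info = read_colspec(COLSPEC)
```

The default_unit fallback mirrors the idea behind the width_unit='mm' constructor argument, although how CalsParser applies that argument is not documented here.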

parse_cals_entry(cals_entry)

Parse an entry element.

Parameters

cals_entry (ElementType) – table entry

Changed in version 0.5.1: The “vertical-align” style is built from the @cals:valign attribute.

Changed in version 0.5.2: Add support for the @cals:cellstyle attribute (extension). This attribute is required for two-way conversion of Formex tables to CALS and vice versa. If the CELL/@TYPE and the ROW/@TYPE are different, we add a specific “cellstyle” style. This style will keep the CELL/@TYPE value.

Changed in version 0.5.3: Improved empty cells detection for Formex4 conversion (<IE/> tag management).
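The attribute handling described in these version notes can be pictured with a small sketch. It assumes un-namespaced valign and cellstyle attributes and a plain dict of styles; the entry_styles helper is hypothetical and only illustrates the idea of turning entry attributes into styles, not the parser's real logic.

```python
import xml.etree.ElementTree as ET


def entry_styles(entry):
    """Build a styles dict from an entry's attributes (hypothetical helper)."""
    styles = {}
    # CALS allows "top", "middle" and "bottom" for valign.
    valign = entry.get("valign")
    if valign in ("top", "middle", "bottom"):
        styles["vertical-align"] = valign
    # "cellstyle" is an extension attribute, kept as-is when present.
    cellstyle = entry.get("cellstyle")
    if cellstyle is not None:
        styles["cellstyle"] = cellstyle
    return styles


entry = ET.fromstring('<entry valign="middle" cellstyle="header">Total</entry>')
styles = entry_styles(entry)
```

Unknown valign values are ignored here rather than raising an error; that choice is part of the sketch, not a documented behavior.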

parse_cals_row(cals_row)

Parse a row element which contains entry elements.

This element may be in a BLK.

Parameters

cals_row (ElementType) – table row

Changed in version 0.5.1: The “vertical-align” style is built from the @cals:valign attribute.

parse_cals_table(cals_table)

Parse a CALS table element.

Parameters

cals_table (ElementType) – CALS table Element.

Returns

State of the parser (for debugging purposes).

Changed in version 0.5.1: Add support for the @cals:width attribute (table width).

parse_cals_tgroup(cals_tgroup)
parse_table(cals_table)

Convert a CALS <table> element into a table object.

Parameters

cals_table (ElementType) – CALS element.

Return type

benker.table.Table

Returns

Table.

setup_table(styles=None, nature=None)
transform_tables(tree)
benker.parsers.cals.ElementType

alias of lxml.etree._Element

Submodules