Formex 4 Builder
This module can construct a Formex 4 table from an instance of type Table.
Formex describes the format for the exchange of data between the Publication Office and its contractors. In particular, it defines the logical markup for documents which are published in the different series of the Official Journal of the European Union.
This builder allows you to convert Word document tables into Formex 4 tables using the Formex 4 schema (formex-05.59-20170418.xd).
Specifications and examples:
The Formex 4 documentation and schema are available online from the Publication Office: Formex Version 4.
An example of a Formex 4 table is available in the Schema documentation: TBL
Changed in version 0.5.0: Refactoring (rename “Formex4” to “Formex”): the class Formex4Builder is renamed FormexBuilder.
- benker.builders.formex.ElementTreeType
alias of lxml.etree._ElementTree
- benker.builders.formex.ElementType
alias of lxml.etree._Element
- class benker.builders.formex.FormexBuilder(detect_titles=False, use_cals=False, cals_ns='https://lib.benker.com/schemas/cals.xsd', cals_prefix='cals', width_unit='mm', **options)
Bases: benker.builders.base_builder.BaseBuilder
Formex 4 builder used to convert tables into TBL elements according to the TBL Schema.
- build_cell(row_elem, cell, row)
Build the Formex 4 <CELL> element.
Formex 4 attributes:
@COL
The mandatory COL attribute is used to specify in which column the cell is located.
@COLSPAN
When a cell in a row ‘A’ must be linked to a group of cells in the same row, the first CELL element of this group has to provide the COLSPAN attribute. The value of the COLSPAN attribute is the number of cells in the group. The COL attribute of the first cell indicates the number of the first column in the group.
The use of the COLSPAN attribute is only allowed to relate the value of a cell in several columns within the same row. Its value must be at least equal to ‘2’.
@ROWSPAN
When a cell in column ‘A’ is linked to a cell in row ‘B’ located just below row ‘A’, the CELL element of this single cell must provide the ROWSPAN attribute. The value of the ROWSPAN attribute is equal to the number of cells in the group. The CELL element relating to the single cell must be placed within the first ROW element in the group. The ROW elements corresponding to the other rows in the group may not contain any CELL elements for the column containing the single cell ‘A’.
The use of the ROWSPAN attribute is only authorised to relate the value of a cell in several rows. Its value must be at least equal to ‘2’.
@ACCH
If the group of related cells is physically delimited by a horizontal brace, this symbol must be marked up using the ACCH attribute.
@ACCV
If the group of related cells is physically delimited by a vertical brace, this symbol must be marked up using the ACCV attribute.
@TYPE
The TYPE attribute of the CELL element is used to indicate locally the type of contents of the cells. It overrides the value of the TYPE attribute defined for the row (ROW) which contains the cell.
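The COL, COLSPAN and ROWSPAN rules above can be illustrated with a minimal sketch. It is not the builder's actual code: it uses the standard library xml.etree.ElementTree as a stand-in for lxml, and the helper name make_cell is hypothetical.

```python
import xml.etree.ElementTree as ET


def make_cell(row_elem, col, colspan=1, rowspan=1):
    """Append a CELL to row_elem (hypothetical helper, not part of benker).

    COL is always written; COLSPAN and ROWSPAN are written only when
    their value is at least 2, as the attribute rules require.
    """
    attrs = {"COL": str(col)}
    if colspan >= 2:
        attrs["COLSPAN"] = str(colspan)
    if rowspan >= 2:
        attrs["ROWSPAN"] = str(rowspan)
    return ET.SubElement(row_elem, "CELL", attrs)


corpus = ET.Element("CORPUS")

row1 = ET.SubElement(corpus, "ROW")
make_cell(row1, col=1, rowspan=2).text = "A"  # covers rows 1 and 2
make_cell(row1, col=2, colspan=2).text = "B"  # covers columns 2 and 3

row2 = ET.SubElement(corpus, "ROW")
# Column 1 is already covered by the ROWSPAN cell of the first row,
# so the second row contains no CELL for that column.
make_cell(row2, col=2).text = "C"
make_cell(row2, col=3).text = "D"

print(ET.tostring(corpus, encoding="unicode"))
```

The second row starts at column 2 because a ROW below a ROWSPAN cell may not contain a CELL for the spanned column.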
- Parameters
row_elem (ElementType) – Parent element: <ROW>.
cell (benker.cell.Cell) – The cell.
row (benker.table.RowView) – The parent row.
Changed in version 0.4.4: Modification of the Formex4 builder to better deal with empty cells (management of <IE/> tags).
Changed in version 0.5.0: Add support for CALS-like elements and attributes. Add support for bgcolor (Table background color).
Changed in version 0.5.1: Preserve processing instruction in cell content.
Changed in version 0.5.2: Add support for the @cals:cellstyle attribute (extension). This attribute is required for two-way conversion of Formex tables to CALS and vice versa. If the CELL/@TYPE and the ROW/@TYPE are different, we add a specific “cellstyle” style. This style will keep the CELL/@TYPE value.
- build_colspec(group_elem, col)
Build the CALS <colspec> element (only if use_cals is True).
CALS attributes:
@colnum is the column number.
@colname is the column name. Its format is “c{col_pos}”.
@colwidth width of the column (with its unit). The unit is defined by the width_unit option.
@align horizontal alignment of table entry content. Possible values are: “left”, “right”, “center”, “justify” (“char” is not supported).
@colsep column separators (vertical ruling). Possible values are “0” or “1”.
@rowsep row separators (horizontal ruling). Possible values are “0” or “1”.
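As an illustration of the attribute list above, here is a small sketch that builds a <colspec> element. It uses the standard library xml.etree.ElementTree rather than lxml and leaves out the CALS namespace for brevity. The helper make_colspec, its signature and the way the width is formatted are assumptions, not the library's implementation.

```python
import xml.etree.ElementTree as ET

# Alignment values listed in the documentation ("char" is not supported).
ALLOWED_ALIGN = {"left", "right", "center", "justify"}


def make_colspec(group_elem, col_pos, width, width_unit="mm",
                 align=None, colsep=None, rowsep=None):
    """Append a <colspec> to group_elem (hypothetical helper)."""
    attrs = {
        "colname": f"c{col_pos}",            # documented format: c{col_pos}
        "colwidth": f"{width}{width_unit}",  # assumed formatting: value + unit
    }
    if align is not None:
        if align not in ALLOWED_ALIGN:
            raise ValueError(f"unsupported alignment: {align!r}")
        attrs["align"] = align
    # Separators are written as "0" or "1" and omitted when unknown.
    if colsep is not None:
        attrs["colsep"] = "1" if colsep else "0"
    if rowsep is not None:
        attrs["rowsep"] = "1" if rowsep else "0"
    return ET.SubElement(group_elem, "colspec", attrs)


tgroup = ET.Element("tgroup")
spec = make_colspec(tgroup, 2, 25.5, align="center", colsep=True, rowsep=False)
print(ET.tostring(tgroup, encoding="unicode"))
```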
Note
The @colnum attribute (number of column) is not generated because this value is usually implied, and can be deduced from the @colname attribute.
- Parameters
group_elem (ElementType) – Parent element: <tgroup>.
col (benker.table.ColView) – Columns
Changed in version 0.5.0: Add support for CALS-like elements and attributes.
Changed in version 0.5.1: Add support for CALS-like attributes: @colnum, @align, @colsep, and @rowsep.
- build_corpus(tbl_elem, table)
Build the Formex 4 <CORPUS> element.
- Parameters
tbl_elem (ElementType) – Parent element: <TBL>.
table (benker.table.Table) – Table
Changed in version 0.5.1: If the detect_titles option is enabled, a title is generated when the first row contains a single cell with centered text.
Changed in version 0.5.1: Add support for the @width CALS-like attribute (table width).
- build_row(corpus_elem, row)
Build the Formex 4 <ROW> element.
Formex 4 attributes:
@TYPE
The TYPE attribute indicates the specific role of the row in the table. The allowed values are:
ALIAS: if the row contains aliases. Such references may be used when the table is included on several pages of a publication. The references are associated to column headers on the first page and are repeated on subsequent pages.
HEADER: if the row contains cells which may be considered as a column header. This generally occurs for the first row of a table.
NORMAL: if most of the cells of the row contain ‘simple’ or ‘normal’ data. This is the default value.
NOTCOL: if the cells of the row contain units of measure relating to subsequent rows.
TOTAL: if the row contains data which could be considered as ‘totals’.
Note that this TYPE attribute is also provided for the cells (CELL), which could be used to override the value defined for the row. On the other hand, ‘NORMAL’ is the default value, so it is necessary to specify the TYPE attribute value in each cell of a row which has a specific type in order to avoid the default overriding (see the first row of the example below).
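The interaction between ROW/@TYPE and CELL/@TYPE can be sketched as follows. This is a reading of the rule stated above, written with the standard library xml.etree.ElementTree, and the helper make_row is hypothetical. For a row whose type is not NORMAL, the type is repeated on every cell so that the cell default does not override it.

```python
import xml.etree.ElementTree as ET

# Row types listed in the documentation.
ROW_TYPES = {"ALIAS", "HEADER", "NORMAL", "NOTCOL", "TOTAL"}


def make_row(corpus_elem, row_type, values):
    """Append a ROW with one CELL per value (hypothetical helper)."""
    if row_type not in ROW_TYPES:
        raise ValueError(f"unknown row type: {row_type!r}")
    # NORMAL is the default, so it is left implicit.
    type_attr = {} if row_type == "NORMAL" else {"TYPE": row_type}
    row_elem = ET.SubElement(corpus_elem, "ROW", dict(type_attr))
    for col, value in enumerate(values, start=1):
        cell_attrs = {"COL": str(col)}
        # Repeat the row type on each cell so that the cell's default
        # ('NORMAL') does not override the type defined for the row.
        cell_attrs.update(type_attr)
        ET.SubElement(row_elem, "CELL", cell_attrs).text = value
    return row_elem


corpus = ET.Element("CORPUS")
header = make_row(corpus, "HEADER", ["Product", "Quantity"])
normal = make_row(corpus, "NORMAL", ["Apples", "12"])
print(ET.tostring(corpus, encoding="unicode"))
```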
- Parameters
corpus_elem (ElementType) – Parent element: <CORPUS>.
row (benker.table.RowView) – The row.
Changed in version 0.5.0: Add support for CALS-like elements and attributes.
Changed in version 0.5.1: The @cals:valign attribute is built from the “vertical-align” style.
- build_tbl(table)
Build the Formex 4 <TBL> element.
Formex 4 attributes:
@NO.SEQ
This mandatory attribute provides a sequence number to the table. This number represents the order in which the table appears in the document.
@CLASS
The CLASS attribute is mandatory and is used to specify the type of data contained in the table. The allowed values are:
GEN: if the table contains general data (default value),
SCHEDULE: if it is a schedule,
RECAP: if it is a synoptic table.
The last two values are only used for documents related to the general budget.
@COLS
This mandatory attribute provides the actual number of columns of the table.
@PAGE.SIZE
The PAGE.SIZE attribute takes one of these values:
DOUBLE.LANDSCAPE: table on two A4 pages forming an A3 landscape page,
DOUBLE.PORTRAIT: table on two A4 pages forming an A3 portrait page,
SINGLE.LANDSCAPE: table on a single A4 page in landscape,
SINGLE.PORTRAIT: table on a single A4 page in portrait (default).
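The four attributes described above could be assembled as in the following sketch. It relies on the standard library xml.etree.ElementTree, the helper make_tbl is hypothetical, and the zero-padded four-digit form of NO.SEQ is an assumption rather than something stated in this page.

```python
import xml.etree.ElementTree as ET

TBL_CLASSES = {"GEN", "SCHEDULE", "RECAP"}
PAGE_SIZES = {
    "DOUBLE.LANDSCAPE",
    "DOUBLE.PORTRAIT",
    "SINGLE.LANDSCAPE",
    "SINGLE.PORTRAIT",
}


def make_tbl(no_seq, cols, tbl_class="GEN", page_size="SINGLE.PORTRAIT"):
    """Create a <TBL> element with its main attributes (hypothetical helper)."""
    if tbl_class not in TBL_CLASSES:
        raise ValueError(f"unknown CLASS: {tbl_class!r}")
    if page_size not in PAGE_SIZES:
        raise ValueError(f"unknown PAGE.SIZE: {page_size!r}")
    attrs = {
        "NO.SEQ": f"{no_seq:04d}",  # assumed format: zero-padded, 4 digits
        "CLASS": tbl_class,
        "COLS": str(cols),
        "PAGE.SIZE": page_size,
    }
    return ET.Element("TBL", attrs)


tbl = make_tbl(no_seq=1, cols=3)
print(ET.tostring(tbl, encoding="unicode"))
```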
- Parameters
table (benker.table.Table) – Table
- Returns
The newly-created <TBL> element.
Changed in version 0.5.0: Add support for CALS-like elements and attributes. Add support for bgcolor (Table background color).
- build_title(tbl_elem, row)
Build the table title using the <TITLE> element.
For instance:
<TITLE> <TI> <P>Table IV</P> </TI> </TITLE>
- Parameters
tbl_elem (ElementType) – Parent element: <TBL>.
row (benker.table.RowView) – The row which contains the title.
Changed in version 0.4.4: Modification of the Formex4 builder to better deal with empty cells (management of <IE/> tags).
- cleanup_tbl_in_tbl(fmx_root)
Clean up the TBL elements when they are direct children of another TBL.
- Parameters
fmx_root (ElementType) – The result tree which contains the TBL elements to remove.
- drop_superfluous_attrs(fmx_root)
Drop superfluous CALS-like attributes at the end of the Formex building.
@cals:namest and @cals:nameend are defined by @COLSPAN
@cals:morerows is defined by @ROWSPAN
@cals:rowstyle is defined by ROW/@TYPE, GR.NOTES, TI.BLK or STI.BLK.
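The idea behind this cleanup can be sketched as below. This is one conservative reading of the list above, not the library's code: a CALS-like attribute is removed only when the Formex attribute that carries the same information is present, and only the ROW/@TYPE case is handled for @cals:rowstyle. The function name drop_redundant_cals_attrs is hypothetical, and the standard library xml.etree.ElementTree replaces lxml.

```python
import xml.etree.ElementTree as ET

# Default value of the cals_ns option of FormexBuilder.
CALS_NS = "https://lib.benker.com/schemas/cals.xsd"


def cals(name):
    """Return the qualified name of a CALS attribute (Clark notation)."""
    return f"{{{CALS_NS}}}{name}"


def drop_redundant_cals_attrs(root):
    """Remove CALS-like attributes already expressed by Formex attributes."""
    for cell in root.iter("CELL"):
        if "COLSPAN" in cell.attrib:
            cell.attrib.pop(cals("namest"), None)
            cell.attrib.pop(cals("nameend"), None)
        if "ROWSPAN" in cell.attrib:
            cell.attrib.pop(cals("morerows"), None)
    for row in root.iter("ROW"):
        if "TYPE" in row.attrib:
            row.attrib.pop(cals("rowstyle"), None)


root = ET.Element("CORPUS")
row = ET.SubElement(root, "ROW", {"TYPE": "HEADER", cals("rowstyle"): "HEADER"})
spanned = ET.SubElement(row, "CELL", {
    "COL": "1",
    "COLSPAN": "2",
    cals("namest"): "c1",
    cals("nameend"): "c2",
})
# No ROWSPAN here, so the sketch leaves cals:morerows untouched.
plain = ET.SubElement(row, "CELL", {"COL": "3", cals("morerows"): "1"})

drop_redundant_cals_attrs(root)
print(ET.tostring(root, encoding="unicode"))
```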
- Parameters
fmx_root (ElementType) – Root element of the Formex file.
New in version 0.5.1.
- extract_gr_notes(fmx_root)
Extract GR.NOTES from the table footers.
This function moves or creates a GR.NOTES just before the CORPUS.
- Parameters
fmx_root (ElementType) – The result tree with GR.NOTES.
Changed in version 0.5.1: If the ROW contains a GR.NOTES, we move it before the CORPUS, else we create it.
- finalize_tree(tree)
Finalize the resulting tree structure:
Calculate the @NO.SEQ values: sequence number of each table;
Clean up the TBL elements when they are direct children of another TBL;
Extract GR.NOTES from the table footers;
Group ROW elements by BLK based on the @cals:rowstyle attribute (CALS extension).
- Parameters
tree (ElementTreeType) – The resulting tree.
Changed in version 0.5.1: Drop superfluous CALS-like attributes at the end of the Formex building.
- generate_table_tree(table)
Build the XML table from the Table instance.
- Parameters
table (benker.table.Table) – Table
- Returns
Table tree
- get_cals_qname(name)
- get_formex_qname(name)
- insert_blk(fmx_root)
Group ROW elements by BLK based on the @cals:rowstyle attribute (CALS extension).
- Parameters
fmx_root (ElementType) – The result tree which contains the CORPUS/ROW elements.
- property ns_map
- setup_table(table)
- update_no_seq(fmx_root)
Calculate the @NO.SEQ values: sequence number of each table.
- Parameters
fmx_root (ElementType) – The result tree which contains the TBL elements to update.
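A minimal sketch of such a renumbering pass is shown below. It assumes that tables are numbered in document order starting at 1 and that the value is zero-padded to four digits. Neither assumption is stated in this page, and the function name number_tables is hypothetical. The standard library xml.etree.ElementTree is used in place of lxml.

```python
import xml.etree.ElementTree as ET


def number_tables(root):
    """Set @NO.SEQ on every TBL element, in document order."""
    for no_seq, tbl in enumerate(root.iter("TBL"), start=1):
        tbl.set("NO.SEQ", f"{no_seq:04d}")  # assumed 4-digit format


doc = ET.fromstring(
    "<DOC>"
    "<TBL COLS='2'><CORPUS/></TBL>"
    "<P>Some text</P>"
    "<TBL COLS='3'><CORPUS/></TBL>"
    "</DOC>"
)
number_tables(doc)
print([tbl.get("NO.SEQ") for tbl in doc.iter("TBL")])
```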
- benker.builders.formex.ProcessingInstructionType
alias of lxml.etree._ProcessingInstruction
- class benker.builders.formex.RowInfo(tag, type, level)
Bases: tuple
- level
Alias for field number 2
- tag
Alias for field number 0
- type
Alias for field number 1
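The field aliases above are what a named tuple provides. The following sketch recreates an equivalent structure with collections.namedtuple to show how the fields can be read. The field values are purely illustrative and are not taken from the library.

```python
import collections

# Stand-in with the same field order as the documented RowInfo class.
RowInfo = collections.namedtuple("RowInfo", ["tag", "type", "level"])

# Illustrative values only.
info = RowInfo(tag="ROW", type="HEADER", level=0)

# Fields can be read by name or by position.
print(info.tag, info[0])
print(info.type, info[1])
print(info.level, info[2])

# A named tuple also unpacks like a regular tuple.
tag, row_type, level = info
```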
- benker.builders.formex.guess_row_info(rowstyle)
- benker.builders.formex.revision_mark(name, attrs)