Formex 4 Builder¶
This module can construct a Formex 4 table from an instance of type Table.
Formex describes the format for the exchange of data between the Publication Office and its contractors. In particular, it defines the logical markup for documents which are published in the different series of the Official Journal of the European Union.
This builder allows you to convert Word document tables into Formex 4 tables using the Formex 4 schema (formex-05.59-20170418.xd).
Specifications and examples:
- The Formex 4 documentation and schema are available online from the Publication Office: Formex Version 4.
- An example of a Formex 4 table is available in the Schema documentation: TBL
Changed in version 0.5.0: Refactoring (rename “Formex4” to “Formex”):
- the class Formex4Builder is renamed FormexBuilder.
-
benker.builders.formex.
ElementTreeType
¶ alias of
lxml.etree._ElementTree
-
benker.builders.formex.
ElementType
¶ alias of
lxml.etree._Element
-
class
benker.builders.formex.
FormexBuilder
(detect_titles=False, use_cals=False, cals_ns='https://lib.benker.com/schemas/cals.xsd', cals_prefix='cals', width_unit='mm', **options)¶ Bases:
benker.builders.base_builder.BaseBuilder
Formex 4 builder used to convert tables into TBL elements according to the TBL Schema.
-
build_cell
(row_elem, cell, row)¶ Build the Formex 4 <CELL> element.
Formex 4 attributes:
@COL
The mandatory COL attribute is used to specify in which column the cell is located.
@COLSPAN
When a cell in a row ‘A’ must be linked to a group of cells in the same row, the first CELL element of this group has to provide the COLSPAN attribute. The value of the COLSPAN attribute is the number of cells in the group. The COL attribute of the first cell indicates the number of the first column in the group.
The use of the COLSPAN attribute is only allowed to relate the value of a cell in several columns within the same row. Its value must be at least equal to ‘2’.
@ROWSPAN
When a cell in column ‘A’ is linked to a cell in row ‘B’ located just below row ‘A’, the CELL element of this single cell must provide the ROWSPAN attribute. The value of the ROWSPAN attribute is equal to the number of cells in the group. The CELL element relating to the single cell must be placed within the first ROW element in the group. The ROW elements corresponding to the other rows in the group may not contain any CELL elements for the column containing the single cell ‘A’.
The use of the ROWSPAN attribute is only authorised to relate the value of a cell in several rows. Its value must be at least equal to ‘2’.
@ACCH
If the group of related cells is physically delimited by a horizontal brace, this symbol must be marked up using the ACCH attribute.
@ACCV
If the group of related cells is physically delimited by a vertical brace, this symbol must be marked up using the ACCV attribute.
@TYPE
The TYPE attribute of the CELL element is used to indicate locally the type of contents of the cells. It overrides the value of the TYPE attribute defined for the row (ROW) which contains the cell.
Parameters:
- row_elem (ElementType) – Parent element: <ROW>.
- cell (benker.cell.Cell) – The cell.
- row (benker.table.RowView) – The parent row.
Changed in version 0.5.0: Add support for CALS-like elements and attributes. Add support for bgcolor (Table background color).
Changed in version 0.5.1: Preserve processing instructions in cell content.
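The COL, COLSPAN and ROWSPAN rules above can be illustrated with a minimal sketch. It uses the standard library `xml.etree.ElementTree` instead of lxml, and `make_cell` is a hypothetical helper written for this example, not part of benker's API. The span attributes are only written when their value is at least 2, as the Formex rules require.

```python
import xml.etree.ElementTree as ET


def make_cell(row_elem, col, text, colspan=1, rowspan=1):
    """Append a <CELL> to row_elem (hypothetical helper, not benker's API)."""
    attrs = {"COL": str(col)}
    # COLSPAN and ROWSPAN must be at least 2, so a span of 1 is simply omitted.
    if colspan > 1:
        attrs["COLSPAN"] = str(colspan)
    if rowspan > 1:
        attrs["ROWSPAN"] = str(rowspan)
    cell_elem = ET.SubElement(row_elem, "CELL", attrs)
    cell_elem.text = text
    return cell_elem


row_elem = ET.Element("ROW")
make_cell(row_elem, 1, "Region", colspan=2)  # covers columns 1 and 2
make_cell(row_elem, 3, "Total")              # ordinary cell in column 3
print(ET.tostring(row_elem, encoding="unicode"))
```

Note that the second cell starts at column 3, because COL always gives the first column covered by the cell.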
-
build_colspec
(group_elem, col)¶ Build the CALS <colspec> element (only if use_cals is True).
CALS attributes:
@colnum
is the column number.
@colname
is the column name. Its format is “c{col_pos}”.
@colwidth
width of the column (with its unit). The unit is defined by the width_unit option.
@align
horizontal alignment of table entry content. Possible values are: “left”, “right”, “center”, “justify” (“char” is not supported).
@colsep
column separators (vertical ruling). Possible values are “0” or “1”.
@rowsep
row separators (horizontal ruling). Possible values are “0” or “1”.
Note
The @colnum attribute (number of column) is not generated because this value is usually implied, and can be deduced from the @colname attribute.
Parameters:
- group_elem (ElementType) – Parent element: <tgroup>.
- col (benker.table.ColView) – The column.
Changed in version 0.5.0: Add support for CALS-like elements and attributes.
Changed in version 0.5.1: Add support for CALS-like attributes: @colnum, @align, @colsep, and @rowsep.
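As an illustration of the attributes listed above, here is a minimal sketch of a `<colspec>` element built with the standard library. The helper `make_colspec`, the two-decimal width formatting and the absence of a namespace prefix are assumptions made for this example; they are not taken from benker's implementation.

```python
import xml.etree.ElementTree as ET

SUPPORTED_ALIGN = {"left", "right", "center", "justify"}  # "char" is not supported


def make_colspec(group_elem, col_pos, width, width_unit="mm", align=None):
    """Append a <colspec> to group_elem (hypothetical helper)."""
    attrs = {
        "colname": "c{0}".format(col_pos),                   # format "c{col_pos}"
        "colwidth": "{0:.2f}{1}".format(width, width_unit),  # width with its unit
    }
    # Unsupported alignments are silently ignored in this sketch.
    if align in SUPPORTED_ALIGN:
        attrs["align"] = align
    return ET.SubElement(group_elem, "colspec", attrs)


tgroup = ET.Element("tgroup", {"cols": "2"})
make_colspec(tgroup, 1, 25.0, align="left")
make_colspec(tgroup, 2, 40.5, align="char")  # "char" is dropped
print(ET.tostring(tgroup, encoding="unicode"))
```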
-
build_corpus
(tbl_elem, table)¶ Build the Formex 4 <CORPUS> element.
Parameters:
- tbl_elem (ElementType) – Parent element: <TBL>.
- table (benker.table.Table) – Table
Changed in version 0.5.1: If the detect_titles option is enabled, a title is generated when the first row contains a single cell with centered text.
Changed in version 0.5.1: Add support for the @width CALS-like attribute (table width).
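The title detection rule (a first row made of a single cell with centered text) can be sketched as a small predicate. Cells are modelled here as plain dictionaries, and the `"styles"` and `"align"` keys are assumptions chosen for this example; benker's own Cell objects and style names may differ.

```python
def looks_like_title(first_row_cells):
    """Return True if the row is a single cell with centered text.

    Each cell is a plain dict such as {"text": ..., "styles": {...}}
    (a hypothetical structure used only for this sketch).
    """
    if len(first_row_cells) != 1:
        return False
    styles = first_row_cells[0].get("styles", {})
    return styles.get("align") == "center"


title_row = [{"text": "Table IV", "styles": {"align": "center"}}]
data_row = [{"text": "A", "styles": {}}, {"text": "B", "styles": {}}]
print(looks_like_title(title_row), looks_like_title(data_row))
```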
-
build_row
(corpus_elem, row)¶ Build the Formex 4 <ROW> element.
Formex 4 attributes:
@TYPE
The TYPE attribute indicates the specific role of the row in the table. The allowed values are:
- ALIAS: if the row contains aliases. Such references may be used when the table is included on several pages of a publication. The references are associated to column headers on the first page and are repeated on subsequent pages.
- HEADER: if the row contains cells which may be considered as a column header. This generally occurs for the first row of a table.
- NORMAL: if most of the cells of the row contain ‘simple’ or ‘normal’ data. This is the default value.
- NOTCOL: if the cells of the row contain units of measure relating to subsequent rows.
- TOTAL: if the row contains data which could be considered as ‘totals’.
Note that this TYPE attribute is also provided for the cells (CELL), which could be used to override the value defined for the row. On the other hand, ‘NORMAL’ is the default value, so it is necessary to specify the TYPE attribute value in each cell of a row which has a specific type in order to avoid the default overriding (see the first row of the example below).
Parameters:
- corpus_elem (ElementType) – Parent element: <CORPUS>.
- row (benker.table.RowView) – The row.
Changed in version 0.5.0: Add support for CALS-like elements and attributes.
Changed in version 0.5.1: The @cals:valign attribute is built from the “vertical-align” style.
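The note about TYPE overriding can be made concrete with a short sketch: because a CELL without TYPE defaults to ‘NORMAL’, the type of a non-normal row has to be repeated on each of its cells. The function `propagate_row_type` is a hypothetical helper for this illustration and is not claimed to be what benker does internally.

```python
import xml.etree.ElementTree as ET


def propagate_row_type(row_elem):
    """Copy a non-default ROW/@TYPE onto cells that have no TYPE of their own."""
    row_type = row_elem.get("TYPE", "NORMAL")
    if row_type == "NORMAL":
        return  # nothing to do: NORMAL is already the default for cells
    for cell_elem in row_elem.findall("CELL"):  # direct children only
        cell_elem.attrib.setdefault("TYPE", row_type)


row = ET.fromstring(
    '<ROW TYPE="HEADER">'
    '<CELL COL="1">Name</CELL>'
    '<CELL COL="2" TYPE="NORMAL">Value</CELL>'
    "</ROW>"
)
propagate_row_type(row)
print(ET.tostring(row, encoding="unicode"))
```

The second cell keeps its explicit ‘NORMAL’ type, since a cell-level TYPE overrides the row-level one.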
-
build_tbl
(table)¶ Build the Formex 4 <TBL> element.
Formex 4 attributes:
@NO.SEQ
This mandatory attribute provides a sequence number to the table. This number represents the order in which the table appears in the document.
@CLASS
The CLASS attribute is mandatory and is used to specify the type of data contained in the table. The allowed values are:
- GEN: if the table contains general data (default value),
- SCHEDULE: if it is a schedule,
- RECAP: if it is a synoptic table.
The last two values are only used for documents related to the general budget.
@COLS
This mandatory attribute provides the actual number of columns of the table.
@PAGE.SIZE
The PAGE.SIZE attribute takes one of these values:
- DOUBLE.LANDSCAPE: table on two A4 pages forming an A3 landscape page,
- DOUBLE.PORTRAIT: table on two A4 pages forming an A3 portrait page,
- SINGLE.LANDSCAPE: table on a single A4 page in landscape,
- SINGLE.PORTRAIT: table on a single A4 page in portrait (default).
Parameters: table (benker.table.Table) – Table
Returns: The newly-created <TBL> element.
Changed in version 0.5.0: Add support for CALS-like elements and attributes. Add support for bgcolor (Table background color).
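A minimal sketch of a `<TBL>` element carrying the four attributes described above. The helper `make_tbl` is hypothetical, and the zero-padded four-digit form of NO.SEQ is an assumption made for this example; only the attribute names and allowed values come from the description.

```python
import xml.etree.ElementTree as ET

TBL_CLASSES = ("GEN", "SCHEDULE", "RECAP")
PAGE_SIZES = (
    "DOUBLE.LANDSCAPE",
    "DOUBLE.PORTRAIT",
    "SINGLE.LANDSCAPE",
    "SINGLE.PORTRAIT",
)


def make_tbl(cols, no_seq, tbl_class="GEN", page_size="SINGLE.PORTRAIT"):
    """Create a <TBL> element with its Formex 4 attributes (hypothetical helper)."""
    if tbl_class not in TBL_CLASSES:
        raise ValueError("unknown CLASS: {0!r}".format(tbl_class))
    if page_size not in PAGE_SIZES:
        raise ValueError("unknown PAGE.SIZE: {0!r}".format(page_size))
    attrs = {
        "NO.SEQ": "{0:04d}".format(no_seq),  # assumed formatting
        "CLASS": tbl_class,
        "COLS": str(cols),
        "PAGE.SIZE": page_size,
    }
    return ET.Element("TBL", attrs)


tbl = make_tbl(cols=3, no_seq=1)
print(ET.tostring(tbl, encoding="unicode"))
```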
-
build_title
(tbl_elem, row)¶ Build the table title using the <TITLE> element.
For instance:
<TITLE>
  <TI>
    <P>Table IV</P>
  </TI>
</TITLE>
Parameters:
- tbl_elem (ElementType) – Parent element: <TBL>.
- row (benker.table.RowView) – The row which contains the title.
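The TITLE structure shown above can be produced with a few lines of standard library code. This is only a sketch: `make_title` is a hypothetical helper, and placing the title as the first child of `<TBL>` is an assumption about the element order.

```python
import xml.etree.ElementTree as ET


def make_title(tbl_elem, text):
    """Insert <TITLE><TI><P>text</P></TI></TITLE> into tbl_elem (hypothetical helper)."""
    title_elem = ET.Element("TITLE")
    ti_elem = ET.SubElement(title_elem, "TI")
    p_elem = ET.SubElement(ti_elem, "P")
    p_elem.text = text
    tbl_elem.insert(0, title_elem)  # assumption: the title comes before the CORPUS
    return title_elem


tbl_elem = ET.Element("TBL")
ET.SubElement(tbl_elem, "CORPUS")
make_title(tbl_elem, "Table IV")
print(ET.tostring(tbl_elem, encoding="unicode"))
```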
-
cleanup_tbl_in_tbl
(fmx_root)¶ Clean up the TBL elements when they are direct children of another TBL.
Parameters: fmx_root (ElementType) – The result tree which contains the TBL elements to remove.
-
drop_superfluous_attrs
(fmx_root)¶ Drop superfluous CALS-like attributes at the end of the Formex building.
- @cals:namest and @cals:nameend are defined by @COLSPAN
- @cals:morerows is defined by @ROWSPAN
- @cals:rowstyle is defined by ROW/@TYPE, GR.NOTES, TI.BLK or STI.BLK.
Parameters: fmx_root (ElementType) – Root element of the Formex file.
New in version 0.5.1.
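Dropping the redundant attributes can be sketched as follows, using the default CALS namespace from the FormexBuilder signature. The function `drop_cals_attrs` is a hypothetical stand-in written with the standard library; it is not benker's implementation.

```python
import xml.etree.ElementTree as ET

CALS_NS = "https://lib.benker.com/schemas/cals.xsd"  # default value of cals_ns
SUPERFLUOUS = ("namest", "nameend", "morerows", "rowstyle")


def drop_cals_attrs(fmx_root, names=SUPERFLUOUS):
    """Remove the listed CALS-like attributes from every element (hypothetical helper)."""
    qnames = ["{%s}%s" % (CALS_NS, name) for name in names]
    for elem in fmx_root.iter():
        for qname in qnames:
            elem.attrib.pop(qname, None)  # ignore elements without the attribute


row = ET.fromstring(
    '<ROW xmlns:cals="https://lib.benker.com/schemas/cals.xsd" cals:rowstyle="header">'
    '<CELL COL="1" COLSPAN="2" cals:namest="c1" cals:nameend="c2" cals:valign="top"/>'
    "</ROW>"
)
drop_cals_attrs(row)
print(sorted(row[0].attrib))
```

Attributes that carry information not expressed elsewhere, such as `cals:valign` in this example, are left untouched.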
-
extract_gr_notes
(fmx_root)¶ Extract GR.NOTES from the table footers.
This function moves or creates a GR.NOTES just before the CORPUS.
Parameters: fmx_root (ElementType) – The result tree with GR.NOTES.
Changed in version 0.5.1: If the ROW contains a GR.NOTES, we move it before the CORPUS, else we create it.
-
finalize_tree
(tree)¶ Finalize the resulting tree structure:
- Calculate the @NO.SEQ values: sequence number of each table;
- Clean up the TBL elements when they are direct children of another TBL;
- Extract GR.NOTES from the table footers;
- Group ROW elements by BLK based on the @cals:rowstyle attribute (CALS extension).
Parameters: tree (ElementTreeType) – The resulting tree.
Changed in version 0.5.1: Drop superfluous CALS-like attributes at the end of the Formex building.
-
generate_table_tree
(table)¶ Build the XML table from the Table instance.
Parameters: table (benker.table.Table) – Table
Returns: Table tree
-
get_cals_qname
(name)¶
-
get_formex_qname
(name)¶
-
insert_blk
(fmx_root)¶ Group ROW elements by BLK based on the @cals:rowstyle attribute (CALS extension).
Parameters: fmx_root (ElementType) – The result tree which contains the CORPUS/ROW elements.
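The idea of grouping rows by their row style can be sketched in a deliberately simplified form: consecutive ROW elements sharing the same `@cals:rowstyle` value are wrapped in one BLK, and rows without that attribute stay directly in the CORPUS. The real method also has to deal with block titles and nesting levels, which this sketch ignores. The function `group_rows` and the style values `blk1` and `blk2` are hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET

CALS_NS = "https://lib.benker.com/schemas/cals.xsd"
ROWSTYLE = "{%s}rowstyle" % CALS_NS


def group_rows(corpus_elem):
    """Wrap runs of rows with the same rowstyle in a <BLK> (simplified sketch)."""
    rows = list(corpus_elem)
    for row in rows:
        corpus_elem.remove(row)
    # groupby only merges adjacent rows, so document order is preserved.
    for style, group in itertools.groupby(rows, key=lambda r: r.get(ROWSTYLE)):
        if style is None:
            parent = corpus_elem
        else:
            parent = ET.SubElement(corpus_elem, "BLK")
        parent.extend(list(group))


corpus = ET.fromstring(
    '<CORPUS xmlns:cals="https://lib.benker.com/schemas/cals.xsd">'
    '<ROW cals:rowstyle="blk1"/><ROW cals:rowstyle="blk1"/>'
    "<ROW/>"
    '<ROW cals:rowstyle="blk2"/>'
    "</CORPUS>"
)
group_rows(corpus)
print([child.tag for child in corpus])
```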
-
ns_map
¶
-
setup_table
(table)¶
-
update_no_seq
(fmx_root)¶ Calculate the @NO.SEQ values: sequence number of each table.
Parameters: fmx_root (ElementType) – The result tree which contains the TBL elements to update.
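Numbering the tables in document order can be sketched with the standard library as shown here. The function name `number_tables` and the zero-padded four-digit format are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def number_tables(fmx_root):
    """Set @NO.SEQ on every TBL, in document order (hypothetical helper)."""
    for index, tbl_elem in enumerate(fmx_root.iter("TBL"), 1):
        tbl_elem.set("NO.SEQ", "{0:04d}".format(index))  # assumed formatting


root = ET.fromstring("<DOC><TBL><CORPUS/></TBL><P/><TBL><CORPUS/></TBL></DOC>")
number_tables(root)
print([tbl.get("NO.SEQ") for tbl in root.iter("TBL")])
```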
-
-
benker.builders.formex.
ProcessingInstructionType
¶ alias of
lxml.etree._ProcessingInstruction
-
class
benker.builders.formex.
RowInfo
(tag, type, level)¶ Bases:
tuple
-
level
¶ Alias for field number 2
-
tag
¶ Alias for field number 0
-
type
¶ Alias for field number 1
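A tuple subclass with the fields tag, type and level at positions 0, 1 and 2 behaves like a named tuple. The sketch below recreates an equivalent type with `collections.namedtuple` to show how the fields are accessed; the field values are hypothetical and only serve the illustration.

```python
import collections

# Equivalent structure, rebuilt here for illustration only.
RowInfo = collections.namedtuple("RowInfo", ["tag", "type", "level"])

info = RowInfo(tag="ROW", type="HEADER", level=None)  # hypothetical values
tag, row_type, level = info  # unpacks like any tuple
print(info.tag, info[1], info.level)
```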
-
-
benker.builders.formex.
guess_row_info
(rowstyle)¶
-
benker.builders.formex.
revision_mark
(name, attrs)¶