Formex 4 Parser
This module can parse the tables (TBL elements) of a Formex 4 file.
The TBL element is used to mark up a Formex table, which actually contains text structured in columns with related data.
A table usually contains the following information:
- an optional title (TITLE),
- one or more structured text blocks (GR.SEQ) in order to mark up optional explanatory information about the table content, located between the title of the table and the table itself,
- optionally a group of notes called in the table (GR.NOTES),
- the corpus of the table (CORPUS).
When building the internal table object, this builder will:
- interpret the title (TITLE) and structured text blocks (GR.SEQ) as rows. The nature attribute of each row will be “title” and “text-block” respectively.
- interpret the group of notes (GR.NOTES) as a row of nature “footer”.
- interpret the corpus of the table (CORPUS) as the body of the table. The nature attribute of each row will be “body”.
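The mapping above can be illustrated with a small standalone sketch. It does not use benker itself: it relies on the standard library `xml.etree.ElementTree`, and the helper name `classify_tbl_children`, the `NATURE_BY_TAG` table and the sample document are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Mapping taken from the description above: Formex element -> row nature.
NATURE_BY_TAG = {
    "TITLE": "title",
    "GR.SEQ": "text-block",
    "GR.NOTES": "footer",
    "CORPUS": "body",
}


def classify_tbl_children(tbl):
    """Return (tag, nature) pairs for the direct children of a TBL element.

    Hypothetical helper: unknown children are simply skipped.
    """
    return [
        (child.tag, NATURE_BY_TAG[child.tag])
        for child in tbl
        if child.tag in NATURE_BY_TAG
    ]


# Made-up sample table, reduced to the four parts listed above.
SAMPLE = """
<TBL>
  <TITLE><TI><P>Table title</P></TI></TITLE>
  <GR.SEQ><P>Explanatory text</P></GR.SEQ>
  <GR.NOTES><NOTE NOTE.ID="N0001"><P>Table note</P></NOTE></GR.NOTES>
  <CORPUS><ROW><CELL COL="1">A</CELL><CELL COL="2">B</CELL></ROW></CORPUS>
</TBL>
"""

tbl = ET.fromstring(SAMPLE)
for tag, nature in classify_tbl_children(tbl):
    print(f"{tag:10} -> {nature}")
```

The real parser builds full row and cell objects; this sketch only shows which nature each part of the table receives.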
Note
Since the Formex table structure is not suitable for typesetting/page layout, this parser is also able to parse CALS-like attributes (for instance frame, cols, colsep, rowsep, …) and CALS-like elements (for instance colspec). These attributes and elements may be added with the Formex 4 builder, see FormexBuilder.
New in version 0.5.0.
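As a rough illustration of how namespaced CALS-like attributes can be read from a Formex element, here is a minimal sketch based on the standard library only. The namespace URI, the placement of the attributes on TBL, and the helpers `cals_qname` and `read_cals_attrs` are all assumptions for this example; they are not taken from benker.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this example.
CALS_NS = "http://example.com/cals"


def cals_qname(name, namespace=CALS_NS):
    """Build a qualified name in Clark notation: {namespace}name.

    Without a namespace, the bare name is returned.
    """
    return "{%s}%s" % (namespace, name) if namespace else name


def read_cals_attrs(elem, names=("frame", "cols", "colsep", "rowsep")):
    """Collect the CALS-like attributes that are present on an element."""
    found = {}
    for name in names:
        value = elem.get(cals_qname(name))
        if value is not None:
            found[name] = value
    return found


# Made-up sample: where the attributes sit is an assumption.
SAMPLE = """
<TBL xmlns:cals="http://example.com/cals"
     cals:frame="all" cals:cols="2" cals:colsep="1" cals:rowsep="1">
  <CORPUS/>
</TBL>
"""

tbl = ET.fromstring(SAMPLE)
print(read_cals_attrs(tbl))
```

Clark notation is how both `xml.etree.ElementTree` and lxml expose namespaced attribute names, which is why the lookup key is built that way.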
- benker.parsers.formex.ElementType
  alias of lxml.etree._Element
- class benker.parsers.formex.FormexParser(builder, formex_ns=None, cals_ns=None, embed_gr_notes=False, **options)
  Bases: benker.parsers.base_parser.BaseParser
  Formex 4 tables parser
- contains_ie(fmx_node)
- get_cals_qname(name)
- get_formex_qname(name)
- parse_cals_row_styles(fmx_elem)
  Parse the row styles.
  Parameters: fmx_elem (ElementType) – Formex element: ROW, TI.BLK, STI.BLK or GR.NOTES.
  Returns: CSS-like styles
  Changed in version 0.5.1: The “vertical-align” style is built from the @cals:valign attribute.
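The version note above can be sketched as follows. This is not the library's implementation: the function `row_styles` and the namespace URI are hypothetical, and only the `valign` to “vertical-align” mapping comes from the text. The accepted values (top, middle, bottom) are the ones defined by the CALS table model.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this example.
CALS_NS = "http://example.com/cals"


def row_styles(fmx_elem, cals_ns=CALS_NS):
    """Return a dict of CSS-like styles built from CALS-like attributes.

    Only the vertical alignment is handled in this sketch.
    """
    styles = {}
    valign = fmx_elem.get("{%s}valign" % cals_ns)
    # CALS allows "top", "middle" and "bottom"; other values are ignored here.
    if valign in ("top", "middle", "bottom"):
        styles["vertical-align"] = valign
    return styles


row = ET.fromstring(
    '<ROW xmlns:cals="http://example.com/cals" cals:valign="middle">'
    '<CELL COL="1">A</CELL></ROW>'
)
plain_row = ET.fromstring('<ROW><CELL COL="1">A</CELL></ROW>')

print(row_styles(row))
print(row_styles(plain_row))
```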
- parse_fmx_cell(fmx_cell)
  Parse a CELL element.
  Parameters: fmx_cell (ElementType) – table cell
- parse_fmx_colspec(cals_colspec)
  Parse a CALS-like colspec element.
  For instance:
  <colspec colname="c1" colnum="1" colsep="1" rowsep="1" colwidth="30mm" align="center"/>
  Parameters: cals_colspec (ElementType) – CALS-like colspec element.
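To make the colspec example concrete, the sketch below reads the same element with the standard library and turns it into a plain dictionary. The helper `colspec_to_dict` and the choice to convert `colnum`, `colsep` and `rowsep` to integers are assumptions for illustration; the actual parser stores this information in its own table model.

```python
import xml.etree.ElementTree as ET

# The colspec element quoted in the documentation above.
COLSPEC_XML = (
    '<colspec colname="c1" colnum="1" colsep="1" rowsep="1" '
    'colwidth="30mm" align="center"/>'
)

# Attributes that hold integers in the CALS table model.
INT_ATTRS = ("colnum", "colsep", "rowsep")


def colspec_to_dict(elem):
    """Copy the attributes of a colspec element, converting numeric ones."""
    spec = dict(elem.attrib)
    for key in INT_ATTRS:
        if key in spec:
            spec[key] = int(spec[key])
    return spec


colspec = ET.fromstring(COLSPEC_XML)
print(colspec_to_dict(colspec))
```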
- parse_fmx_corpus(fmx_corpus)
- parse_fmx_row(fmx_row)
  Parse a ROW element which contains CELL elements.
  This element may be in a BLK.
  Parameters: fmx_row (ElementType) – table row
- parse_fmx_sti_blk(fmx_sti_blk)
  Parse a STI.BLK element, considering it like a row of a single cell.
  For instance:
  <STI.BLK COL.START="1" COL.END="1">
    <P>STI.BLK title</P>
  </STI.BLK>
  Parameters: fmx_sti_blk (ElementType) – subtitle of the BLK.
- parse_fmx_ti_blk(fmx_ti_blk)
  Parse a TI.BLK element, considering it like a row of a single cell.
  For instance:
  <TI.BLK COL.START="1" COL.END="2">
    <P><HT TYPE="BOLD">TI.BLK title</HT></P>
  </TI.BLK>
  Parameters: fmx_ti_blk (ElementType) – title of the BLK.
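The idea of treating a TI.BLK or STI.BLK as a one-cell row can be sketched like this. The helper `blk_title_to_row`, the dictionary layout and the default values used when COL.START or COL.END is missing are assumptions for this example, not benker's behaviour.

```python
import xml.etree.ElementTree as ET


def blk_title_to_row(elem):
    """Describe a TI.BLK or STI.BLK element as a row with a single cell.

    The cell starts at COL.START and spans up to COL.END (inclusive).
    """
    start = int(elem.get("COL.START", "1"))       # assumed default
    end = int(elem.get("COL.END", str(start)))    # assumed default
    # Flatten the inner markup to text and normalize whitespace.
    text = " ".join("".join(elem.itertext()).split())
    return {"tag": elem.tag, "x": start, "width": end - start + 1, "text": text}


ti_blk = ET.fromstring(
    '<TI.BLK COL.START="1" COL.END="2"> '
    '<P><HT TYPE="BOLD">TI.BLK title</HT></P> </TI.BLK>'
)
sti_blk = ET.fromstring(
    '<STI.BLK COL.START="1" COL.END="1"> <P>STI.BLK title</P> </STI.BLK>'
)

print(blk_title_to_row(ti_blk))
print(blk_title_to_row(sti_blk))
```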
- parse_gr_notes(fmx_gr_notes)
  Parse a GR.NOTES element, considering it like a row of a single cell.
  For instance:
  <GR.NOTES>
    <TITLE>
      <TI>
        <P>GR.NOTES Title</P>
      </TI>
    </TITLE>
    <NOTE NOTE.ID="N0001">
      <P>Table note</P>
    </NOTE>
  </GR.NOTES>
  Parameters: fmx_gr_notes (ElementType) – group of notes called in the table (GR.NOTES)
  Changed in version 0.5.1: GR.NOTES elements can be embedded if the embed_gr_notes option is True.
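For readers unfamiliar with the GR.NOTES structure, the following sketch extracts the title and the notes from the example above with the standard library. The helper names `flatten_text` and `collect_notes` are hypothetical; the actual parser wraps this content in a single-cell row of nature “footer” rather than returning a tuple.

```python
import xml.etree.ElementTree as ET

# The GR.NOTES element quoted in the documentation above.
GR_NOTES_XML = """
<GR.NOTES>
  <TITLE>
    <TI>
      <P>GR.NOTES Title</P>
    </TI>
  </TITLE>
  <NOTE NOTE.ID="N0001">
    <P>Table note</P>
  </NOTE>
</GR.NOTES>
"""


def flatten_text(elem):
    """Return the text content of an element with normalized whitespace."""
    return " ".join("".join(elem.itertext()).split())


def collect_notes(gr_notes):
    """Return (title, [(note_id, text), ...]) for a GR.NOTES element."""
    title = gr_notes.find("TITLE")
    title_text = flatten_text(title) if title is not None else None
    notes = [
        (note.get("NOTE.ID"), flatten_text(note))
        for note in gr_notes.findall("NOTE")
    ]
    return title_text, notes


gr_notes = ET.fromstring(GR_NOTES_XML)
print(collect_notes(gr_notes))
```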
- parse_table(fmx_corpus)
  Convert a <CORPUS> Formex element into a table object.
  Parameters: fmx_corpus (ElementType) – Formex element.
  Return type: ElementType
  Returns: Table.
- parse_tbl_styles(fmx_tbl)
  Parse a TBL element and extract the styles.
  Parameters: fmx_tbl (ElementType) – table
  Returns: dictionary of styles and nature
- setup_table(styles=None, nature=None)
- transform_tables(tree)