armi.utils.tabulate module

Pretty-print tabular data.

This file started out as the MIT-licensed “tabulate” package, though we have made, and will continue to make, changes as we need them. Thanks to the tabulate team.

https://github.com/astanin/python-tabulate

Usage

The module provides just one function, tabulate, which takes a list of lists or another tabular data type as the first argument, and outputs a nicely formatted plain-text table:

>>> from armi.utils.tabulate import tabulate

>>> table = [["Sun",696000,1989100000],["Earth",6371,5973.6],
...          ["Moon",1737,73.5],["Mars",3390,641.85]]

>>> print(tabulate(table))
-----  ------  -------------
Sun    696000     1.9891e+09
Earth    6371  5973.6
Moon     1737    73.5
Mars     3390   641.85
-----  ------  -------------

The following tabular data types are supported:

list of lists or another iterable of iterables
list or another iterable of dicts (keys as columns)
dict of iterables (keys as columns)
list of dataclasses (field names as columns)
two-dimensional NumPy array
NumPy record arrays (names as columns)
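
The list above can be read as "everything is reduced to headers plus rows". The following is only a sketch of that idea for three of the listed shapes; `to_rows` and `Body` are hypothetical names invented for illustration and are not part of this module, and the real implementation may handle these cases differently.

```python
from dataclasses import dataclass, fields, is_dataclass
from itertools import zip_longest


@dataclass
class Body:
    """Hypothetical record type used only for this illustration."""

    name: str
    radius: int


def to_rows(data):
    """Hypothetical helper: return (headers, rows) for a few input shapes."""
    if isinstance(data, dict):
        # dict of iterables: keys are the columns, values hold the column contents
        headers = list(data.keys())
        rows = [list(row) for row in zip_longest(*data.values())]
        return headers, rows
    data = list(data)
    if data and isinstance(data[0], dict):
        # list of dicts: collect keys in first-seen order, fill gaps with None
        headers = []
        for record in data:
            for key in record:
                if key not in headers:
                    headers.append(key)
        rows = [[record.get(key) for key in headers] for record in data]
        return headers, rows
    if data and is_dataclass(data[0]):
        # list of dataclasses: field names are the columns
        headers = [f.name for f in fields(data[0])]
        rows = [[getattr(item, name) for name in headers] for item in data]
        return headers, rows
    # list of lists (or another iterable of iterables): no headers available
    return [], [list(row) for row in data]


headers, rows = to_rows([Body("Moon", 1737), Body("Mars", 3390)])
```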

Table headers

To print nice column headers, supply the second argument (headers):

headers can be an explicit list of column headers

if headers="firstrow", then the first row of data is used

if headers="keys", then dictionary keys or column indices are used

Otherwise a headerless table is produced.

If there are fewer headers than columns, the headers are assumed to name the last (rightmost) columns. This is consistent with the plain-text format of R:

>>> print(tabulate([["sex","age"],["Alice","F",24],["Bob","M",19]],
...       headers="firstrow"))
       sex      age
-----  -----  -----
Alice  F         24
Bob    M         19
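
The rule for short header lists amounts to padding the headers with blanks on the left. A minimal sketch, where `pad_headers` is a hypothetical helper written for this illustration and not a function of this module:

```python
def pad_headers(headers, n_columns):
    """Hypothetical helper: left-pad headers with blanks so they name the last columns."""
    missing = max(0, n_columns - len(headers))
    return [""] * missing + list(headers)


# Two headers for three columns: the first column is left unnamed.
padded = pad_headers(["sex", "age"], 3)
print(padded)  # ['', 'sex', 'age']
```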

Column and Headers alignment

tabulate tries to detect column types automatically, and aligns the values properly. By default it aligns decimal points of the numbers (or flushes integer numbers to the right), and flushes everything else to the left. Possible column alignments (numAlign, strAlign) are: “right”, “center”, “left”, “decimal” (only for numAlign), and None (to disable alignment).
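
Decimal alignment can be pictured as two steps: pad each number on the right until all have the same number of characters after the decimal point, then right-justify. The sketch below illustrates that idea only; `after_point` and `align_decimal` are hypothetical names, and the module's own code may differ in detail.

```python
def after_point(text):
    """Number of characters after the decimal point, or -1 if there is none."""
    pos = text.rfind(".")
    return len(text) - pos - 1 if pos >= 0 else -1


def align_decimal(strings):
    """Hypothetical helper: align already-formatted numbers on the decimal point."""
    decimals = [after_point(s) for s in strings]
    max_decimals = max(decimals)
    # Right-pad so every entry has the same count of characters after the point.
    # Integers (-1) get one extra space that stands in for the missing point.
    padded = [s + " " * (max_decimals - d) for s, d in zip(strings, decimals)]
    width = max(len(p) for p in padded)
    return [p.rjust(width) for p in padded]


for line in align_decimal(["41.9999", "451"]):
    print(repr(line))
```

With the inputs above, the point of 41.9999 lands in the same column as the position just after 451, which is the layout shown in the “plain” and “simple” examples in this document.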

colGlobalAlign allows for global alignment of columns, before any specific override from colAlign. Possible values are: None (defaults according to coltype), “right”, “center”, “decimal”, “left”.
colAlign allows for column-wise override starting from the left-most column. Possible values are: “global” (no override), “right”, “center”, “decimal”, “left”.
headersGlobalAlign allows for global headers alignment, before any specific override from headersAlign. Possible values are: None (follow columns alignment), “right”, “center”, “left”.
headersAlign allows for header-wise override starting from the left-most given header. Possible values are: “global” (no override), “same” (follow column alignment), “right”, “center”, “left”.

Note on intended behaviour: If there is no data, any column alignment argument is ignored. Hence, in this case, header alignment cannot be inferred from column alignment.

Table formats

intFmt is a format specification used for columns which contain numeric data without a decimal point. This can also be a list or tuple of format strings, one per column.

floatFmt is a format specification used for columns which contain numeric data with a decimal point. This can also be a list or tuple of format strings, one per column.

None values are replaced with a missingVal string (like floatFmt, this can also be a list of values for different columns):

>>> print(tabulate([["spam", 1, None],
...                 ["eggs", 42, 3.14],
...                 ["other", None, 2.7]], missingVal="?"))
-----  --  ----
spam    1  ?
eggs   42  3.14
other   ?  2.7
-----  --  ----
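
The three options above are ordinary Python format specifications plus a placeholder string. As a rough sketch of how a single cell could be rendered under those rules (`format_cell` is a hypothetical helper, not part of this module; its defaults mirror the defaults shown in the tabulate signature):

```python
def format_cell(value, float_fmt="g", int_fmt="", missing_val=""):
    """Hypothetical helper: render one cell using the documented options."""
    if value is None:
        return missing_val  # None is replaced by the missing-value string
    if isinstance(value, int):
        return format(value, int_fmt)  # numeric data without a decimal point
    if isinstance(value, float):
        return format(value, float_fmt)  # numeric data with a decimal point
    return str(value)


print(format_cell(None, missing_val="?"))
print(format_cell(3.14159, float_fmt=".2f"))
print(format_cell(1234567, int_fmt=","))
```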

Various plain-text table formats (tableFmt) are supported, including “plain”, “simple”, “grid”, “rst”, and “tsv”. The variable tabulateFormats contains the list of currently supported formats.

“plain” format doesn’t use any pseudographics to draw tables; it separates columns with a double space:

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]],
...                 ["strings", "numbers"], "plain"))
strings      numbers
spam         41.9999
eggs        451

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]], tableFmt="plain"))
spam   41.9999
eggs  451

“simple” format is like Pandoc simple_tables:

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]],
...                 ["strings", "numbers"], "simple"))
strings      numbers
---------  ---------
spam         41.9999
eggs        451

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]], tableFmt="simple"))
----  --------
spam   41.9999
eggs  451
----  --------

“grid” is similar to tables produced by Emacs table.el package or Pandoc grid_tables:

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]],
...                ["strings", "numbers"], "grid"))
+-----------+-----------+
| strings   |   numbers |
+===========+===========+
| spam      |   41.9999 |
+-----------+-----------+
| eggs      |  451      |
+-----------+-----------+

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]], tableFmt="grid"))
+------+----------+
| spam |  41.9999 |
+------+----------+
| eggs | 451      |
+------+----------+

“rst” is like a simple table format from reStructuredText; please note that reStructuredText also accepts “grid” tables:

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]],
...                ["strings", "numbers"], "rst"))
=========  =========
strings      numbers
=========  =========
spam         41.9999
eggs        451
=========  =========

>>> print(tabulate([["spam", 41.9999], ["eggs", "451.0"]], tableFmt="rst"))
====  ========
spam   41.9999
eggs  451
====  ========

Number parsing

By default, anything which can be parsed as a number is a number. This ensures numbers represented as strings are aligned properly. It can lead to surprising results for particular strings, such as certain git SHAs: for example, “42992e1” will be parsed into the number 429920 and aligned as such.

To completely disable number parsing (and alignment), use disableNumParse=True. For more fine-grained control, pass a list of column indices to disable number parsing only on those columns; e.g. disableNumParse=[0, 2] would disable number parsing only on the first and third columns.
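
The git SHA case follows directly from how Python reads scientific notation. The sketch below shows the effect and one way a per-column switch could work; `looks_like_number` and `parse_row` are hypothetical helpers, and their `disable_num_parse` argument only imitates the documented disableNumParse option.

```python
def looks_like_number(text):
    """True if Python can read the string as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def parse_row(row, disable_num_parse=False):
    """Hypothetical helper: convert numeric-looking strings unless parsing is disabled.

    disable_num_parse may be False, True, or a list of column indices.
    """
    parsed = []
    for idx, cell in enumerate(row):
        skip = disable_num_parse is True or (
            isinstance(disable_num_parse, (list, tuple)) and idx in disable_num_parse
        )
        if not skip and isinstance(cell, str) and looks_like_number(cell):
            parsed.append(float(cell))
        else:
            parsed.append(cell)
    return parsed


print(parse_row(["42992e1", "7"]))        # both cells become numbers
print(parse_row(["42992e1", "7"], [0]))   # the SHA in column 0 stays a string
```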

Column Widths and Auto Line Wrapping

Tabulate will, by default, set the width of each column to the length of the longest element in that column. However, when fields are expected to be too long to look good on a single line, tabulate can wrap long fields automatically. Use the parameter maxColWidths to provide a list of maximal column widths:

>>> print(tabulate(
...     [('1', 'John Smith',
...       'This is a rather long description that might look better if it is wrapped a bit')],
...     headers=("Issue Id", "Author", "Description"),
...     maxColWidths=[None, None, 30],
...     tableFmt="grid",
... ))
+------------+------------+-------------------------------+
|   Issue Id | Author     | Description                   |
+============+============+===============================+
|          1 | John Smith | This is a rather long         |
|            |            | description that might look   |
|            |            | better if it is wrapped a bit |
+------------+------------+-------------------------------+

Header column width can be specified in a similar way using maxHeaderColWidths.
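
The line breaks in the wrapped Description column above are the same ones the standard library's textwrap module produces at a width of 30. This sketch uses only textwrap and makes no claim about how the module performs wrapping internally:

```python
import textwrap

description = (
    "This is a rather long description that might look better "
    "if it is wrapped a bit"
)

# Wrap to at most 30 characters per line, breaking on whitespace.
lines = textwrap.wrap(description, width=30)
for line in lines:
    print(line)
```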

armi.utils.tabulate.tabulate(data, headers=(), tableFmt='simple', floatFmt='g', intFmt='', numAlign='default', strAlign='default', missingVal='', showIndex='default', disableNumParse=False, colGlobalAlign=None, colAlign=None, maxColWidths=None, headersGlobalAlign=None, headersAlign=None, rowAlign=None, maxHeaderColWidths=None)[source]

Format a fixed width table for pretty printing.

Parameters:

data (object) – The tabular data you want to print. This can be a list-of-lists/iterables, dict-of-lists/ iterables, 2D numpy arrays, or list of dataclasses.
headers (optional) – Nice column names. If this is “firstrow”, the first row of the data will be used. If it is “keys”, then dictionary keys or column indices are used.
tableFmt (str, optional) – There are custom table formats defined in this file, and you can choose between them with this string: “armi”, “simple”, “plain”, “grid”, “github”, “pretty”, “psql”, “rst”, “tsv”.
floatFmt (str, optional) – A format specification used for columns which contain numeric data with a decimal point. This can also be a list or tuple of format strings, one per column.
intFmt (str, optional) – A format specification used for columns which contain numeric data without a decimal point. This can also be a list or tuple of format strings, one per column.
numAlign (str, optional) – Specially align numbers, options: “right”, “center”, “left”, “decimal”.
strAlign (str, optional) – Specially align strings, options: “right”, “center”, “left”.
missingVal (str, optional) – None values are replaced with a missingVal string.
showIndex (str, optional) – Whether to show row indices. If “always”, show row indices for all types of data. If “never”, don’t show row indices for any type of data. If showIndex is an iterable, show its values.
disableNumParse (bool, optional) – To disable number parsing (and alignment), use disableNumParse=True. For more fine-grained control, pass a list of column indices: [0, 2] would disable number parsing on the first and third columns.
colGlobalAlign (str, optional) – Allows for global alignment of columns, before any specific override from colAlign. Possible values are: None, “right”, “center”, “decimal”, “left”.
colAlign (str, optional) – Allows for column-wise override starting from left-most column. Possible values are: “global” (no override), “right”, “center”, “decimal”, “left”.
maxColWidths (list, optional) – A list of the maximum column widths.
headersGlobalAlign (str, optional) – Allows for global headers alignment, before any specific override from headersAlign. Possible values are: None (follow columns alignment), “right”, “center”, “left”.
headersAlign (str, optional) – Allows for header-wise override starting from left-most given header. Possible values are: “global” (no override), “same” (follow column alignment), “right”, “center”, “left”.
rowAlign (str, optional) – How to align rows, options: “right”, “center”, “decimal”, “left”.
maxHeaderColWidths (list, optional) – List of column widths for the header.

Returns:

A text representation of the tabular data.

Return type:

str

class armi.utils.tabulate.DataRow(begin, sep, end)

Bases: tuple

Create new instance of DataRow(begin, sep, end)

begin: Alias for field number 0

end: Alias for field number 2

sep: Alias for field number 1

class armi.utils.tabulate.Iterable: Bases: object

class armi.utils.tabulate.Line(begin, hline, sep, end)

Bases: tuple

Create new instance of Line(begin, hline, sep, end)

begin: Alias for field number 0

end: Alias for field number 3

hline: Alias for field number 1

sep: Alias for field number 2

class armi.utils.tabulate.Sized: Bases: object

class armi.utils.tabulate.TableFormat(lineabove, linebelowheader, linebetweenrows, linebelow, headerrow, datarow, padding, withHeaderHide)

Bases: tuple

Create new instance of TableFormat(lineabove, linebelowheader, linebetweenrows, linebelow, headerrow, datarow, padding, withHeaderHide)

datarow: Alias for field number 5

headerrow: Alias for field number 4

lineabove: Alias for field number 0

linebelow: Alias for field number 3

linebelowheader: Alias for field number 1

linebetweenrows: Alias for field number 2

padding: Alias for field number 6

withHeaderHide: Alias for field number 7

class armi.utils.tabulate.TextWrapper(width=70, initial_indent='', subsequent_indent='', expand_tabs=True, replace_whitespace=True, fix_sentence_endings=False, break_long_words=True, drop_whitespace=True, break_on_hyphens=True, tabsize=8, *, max_lines=None, placeholder=' [...]')[source]

Bases: object

Object for wrapping/filling text. The public interface consists of the wrap() and fill() methods; the other methods are just there for subclasses to override in order to tweak the default behaviour. If you want to completely replace the main wrapping algorithm, you’ll probably have to override _wrap_chunks().

Several instance attributes control various aspects of wrapping:

width (default: 70): the maximum width of wrapped lines (unless break_long_words is false)
initial_indent (default: “”): string that will be prepended to the first line of wrapped output. Counts towards the line’s width.
subsequent_indent (default: “”): string that will be prepended to all lines save the first of wrapped output; also counts towards each line’s width.
expand_tabs (default: true): Expand tabs in input text to spaces before further processing. Each tab will become 0 .. ‘tabsize’ spaces, depending on its position in its line. If false, each tab is treated as a single character.
tabsize (default: 8): Expand tabs in input text to 0 .. ‘tabsize’ spaces, unless ‘expand_tabs’ is false.
replace_whitespace (default: true): Replace all whitespace characters in the input text by spaces after tab expansion. Note that if expand_tabs is false and replace_whitespace is true, every tab will be converted to a single space!
fix_sentence_endings (default: false): Ensure that sentence-ending punctuation is always followed by two spaces. Off by default because the algorithm is (unavoidably) imperfect.
break_long_words (default: true): Break words longer than ‘width’. If false, those words will not be broken, and some lines might be longer than ‘width’.
break_on_hyphens (default: true): Allow breaking hyphenated words. If true, wrapping will occur preferably on whitespaces and right after hyphens part of compound words.
drop_whitespace (default: true): Drop leading and trailing whitespace from lines.
max_lines (default: None): Truncate wrapped lines.
placeholder (default: ' [...]'): Append to the last line of truncated text.

unicode_whitespace_trans = {9: 32, 10: 32, 11: 32, 12: 32, 13: 32, 32: 32}

uspace = 32

wordsep_re = re.compile('\n ( # any whitespace\n [\\\t\\\n\\\x0b\\\x0c\\\r\\ ]+\n | # em-dash between words\n (?<=[\\w!"\\\'&.,?]) -{2,} (?=\\w)\n | # word, possibly hyphenated\n , re.VERBOSE)

wordsep_simple_re = re.compile('([\\\t\\\n\\\x0b\\\x0c\\\r\\ ]+)')

sentence_end_re = re.compile('[a-z][\\.\\!\\?][\\"\\\']?\\Z')

wrap(text: string) → [string][source]: Reformat the single paragraph in ‘text’ so it fits in lines of no more than ‘self.width’ columns, and return a list of wrapped lines. Tabs in ‘text’ are expanded with string.expandtabs(), and all other whitespace characters (including newline) are converted to space.

fill(text: string) → string[source]: Reformat the single paragraph in ‘text’ to fit in lines of no more than ‘self.width’ columns, and return a new string containing the entire wrapped paragraph.

x = ' '
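
The attributes listed for TextWrapper match those of the standard library's textwrap.TextWrapper, so the sketch below imports it from textwrap; that it is the same class is an assumption. It shows wrap() and fill() together with max_lines and placeholder.

```python
from textwrap import TextWrapper

text = "The quick brown fox jumps over the lazy dog"

# Plain wrapping: a list of lines, each at most `width` characters long.
wrapper = TextWrapper(width=20)
lines = wrapper.wrap(text)

# fill() returns the same lines joined by newlines.
paragraph = wrapper.fill(text)

# With max_lines, the output is truncated and the placeholder is appended
# to the last line that is kept.
short = TextWrapper(width=20, max_lines=2, placeholder=" [...]").wrap(text)

print(lines)
print(short)
```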

class armi.utils.tabulate.chain

Bases: object

chain(*iterables) --> chain object

Return a chain object whose .__next__() method returns elements from the first iterable until it is exhausted, then elements from the next iterable, until all of the iterables are exhausted.

from_iterable(): Alternative chain() constructor taking a single iterable argument that evaluates lazily.

armi.utils.tabulate.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)[source]

Returns a new subclass of tuple with named fields.

>>> Point = namedtuple('Point', ['x', 'y'])
>>> Point.__doc__                   # docstring for the new class
'Point(x, y)'
>>> p = Point(11, y=22)             # instantiate with positional args or keywords
>>> p[0] + p[1]                     # indexable like a plain tuple
33
>>> x, y = p                        # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y                       # fields also accessible by name
33
>>> d = p._asdict()                 # convert to a dictionary
>>> d['x']
11
>>> Point(**d)                      # convert from a dictionary
Point(x=11, y=22)
>>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
Point(x=100, y=22)

class armi.utils.tabulate.partial[source]

Bases: object

partial(func, *args, **keywords) - new function with partial application of the given arguments and keywords.

args: tuple of arguments to future partial calls

func: function object to use in future partial calls

keywords: dictionary of keyword arguments to future partial calls

armi.utils.tabulate.reduce(function, sequence[, initial]) → value: Apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.
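
The description of reduce matches functools.reduce from the standard library, so this sketch imports it from there (an assumption). It folds over rows to find the widest cell in each column, with the initial value acting as the starting accumulator; `widen` is a hypothetical helper.

```python
from functools import reduce

rows = [["spam", "41.9999"], ["eggs", "451"]]


def widen(widths, row):
    """Combine the running column widths with the cell lengths of one more row."""
    return [max(width, len(cell)) for width, cell in zip(widths, row)]


# Start from zero widths; each step keeps the larger value per column.
widths = reduce(widen, rows, [0, 0])
print(widths)  # [4, 7]
```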

class armi.utils.tabulate.zip_longest

Bases: object

zip_longest(iter1 [,iter2 [...]], [fillvalue=None]) --> zip_longest object

Return a zip_longest object whose .__next__() method returns a tuple where the i-th element comes from the i-th iterable argument. The .__next__() method continues until the longest iterable in the argument sequence is exhausted and then it raises StopIteration. When the shorter iterables are exhausted, the fillvalue is substituted in their place. The fillvalue defaults to None or can be specified by a keyword argument.
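
The description of zip_longest matches itertools.zip_longest from the standard library, so this sketch imports it from there (an assumption). It transposes rows of unequal length into columns, filling the gaps:

```python
from itertools import zip_longest

# The second row is one cell short.
rows = [["Alice", "F", 24], ["Bob", "M"]]

# Transpose rows into columns; missing cells are replaced by the fill value.
columns = list(zip_longest(*rows, fillvalue=""))
print(columns)
```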