Main index | Section 5 | Options |
libxo uses format strings to control the rendering of data into various output styles, including text, XML, JSON, and HTML. Each format string contains a set of zero or more "field descriptions", which describe independent data fields. Each field description contains a set of "modifiers", a "content string", and zero, one, or two "format descriptors". The modifiers tell libxo what the field is and how to treat it, while the format descriptors are formatting instructions using printf(3)-style format strings, telling libxo how to format the field. The field description is placed inside a set of braces, with a colon (':') after the modifiers and a slash ('/') before each format descriptor. Text may be intermixed with field descriptions within the format string.
The field description is given as follows:
'{' [ role | modifier ]* [ ',' long-names ]* ':' [ content ] [ '/' field-format [ '/' encoding-format ]] '}'
The role describes the function of the field, while the modifiers enable optional behaviors. The contents, field-format, and encoding-format are used in varying ways, based on the role. These are described in the following sections.
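As a rough illustration of the layout described above, the following sketch splits one field description into its modifier, content, and format parts. The `struct field_parts` type and the `split_field` function are hypothetical names invented for this example and are not part of libxo. The sketch assumes that the content contains no '/' or '}' characters, and it does not handle escaped braces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for this sketch; not a libxo structure */
struct field_parts {
    const char *mods;           /* roles, modifiers, and long names */
    size_t mods_len;
    const char *content;        /* text between ':' and the first '/' */
    size_t content_len;
    const char *format;         /* field-format, or NULL if absent */
    size_t format_len;
    const char *encoding;       /* encoding-format, or NULL if absent */
    size_t encoding_len;
};

/* Split "{mods:content/format/encoding}"; returns 0 on success, -1 on error */
static int
split_field (const char *field, struct field_parts *fp)
{
    const char *end, *colon, *slash;

    assert(field != NULL && fp != NULL);
    *fp = (struct field_parts) { 0 };

    if (field[0] != '{')
        return -1;

    end = strchr(field, '}');
    colon = strchr(field, ':');
    if (end == NULL || colon == NULL || colon > end)
        return -1;

    fp->mods = field + 1;
    fp->mods_len = (size_t) (colon - fp->mods);
    fp->content = colon + 1;

    /* The first slash, if any, ends the content */
    slash = memchr(fp->content, '/', (size_t) (end - fp->content));
    if (slash == NULL) {
        fp->content_len = (size_t) (end - fp->content);
        return 0;
    }
    fp->content_len = (size_t) (slash - fp->content);
    fp->format = slash + 1;

    /* A second slash separates the field-format from the encoding-format */
    slash = memchr(fp->format, '/', (size_t) (end - fp->format));
    if (slash == NULL) {
        fp->format_len = (size_t) (end - fp->format);
        return 0;
    }
    fp->format_len = (size_t) (slash - fp->format);
    fp->encoding = slash + 1;
    fp->encoding_len = (size_t) (end - fp->encoding);
    return 0;
}
```

Given "{:in-stock/%u}", the sketch reports an empty modifier string, the content "in-stock", and the field-format "%u". The parts are returned as pointer and length pairs so that no copying is needed.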
Braces can be escaped by using double braces, similar to "%%" in printf(3). The format string "{{braces}}" would emit "{braces}".
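The escaping rule can be illustrated with a small standalone sketch. The helper name `unescape_braces` is hypothetical and is not a libxo function. It only collapses doubled braces in literal text, as the paragraph above describes, and makes no attempt to parse field descriptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a libxo function): copy src into dst,
 * collapsing "{{" to "{" and "}}" to "}".  Single braces are copied
 * unchanged, since a real parser would treat them as field delimiters.
 * Returns the number of bytes written, excluding the trailing NUL.
 */
static size_t
unescape_braces (char *dst, size_t dstsize, const char *src)
{
    size_t len = 0;

    assert(dst != NULL && src != NULL && dstsize > 0);

    while (*src != '\0' && len + 1 < dstsize) {
        /* A doubled brace stands for one literal brace */
        if ((src[0] == '{' || src[0] == '}') && src[1] == src[0])
            src += 1;           /* skip the first brace of the pair */
        dst[len++] = *src++;
    }

    dst[len] = '\0';
    return len;
}
```

Called with "{{braces}}", the helper writes the eight bytes of "{braces}" into the destination buffer.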
In the following example, three field descriptors appear. The first is a padding field containing three spaces of padding, the second is a label ("In stock"), and the third is a value field ("in-stock"). The in-stock field has a "%u" format that will parse the next argument passed to the xo_emit(3) function as an unsigned integer.
    xo_emit("{P:   }{Lwc:In stock}{:in-stock/%u}\n", 65);
This single line of code can generate text ("In stock: 65\n"), XML ("<in-stock>65</in-stock>"), JSON ('"in-stock": 65'), or HTML (too lengthy to be listed here).
While roles and modifiers typically use a single character for brevity, there are alternative names for each which allow more verbose formatting strings. These names must be preceded by a comma, and may follow any single-character values:
    xo_emit("{L,white,colon:In stock}{,key:in-stock/%u}\n", 65);
M Name Description |
C color Field is a color or effect |
D decoration Field is non-text (e.g. colon, comma) |
E error Field is an error message |
L label Field is text that prefixes a value |
N note Field is text that follows a value |
P padding Field is spaces needed for vertical alignment |
T title Field is a title value for headings |
U units Field is the units for the previous value field |
V value Field is the name of field (the default) |
W warning Field is a warning message |
[ start-anchor Begin a section of anchored variable-width text |
] stop-anchor End a section of anchored variable-width text |
EXAMPLE:
    xo_emit("{L:Free}{D::}{P: }{:free/%u} {U:Blocks}\n", free_blocks);
When a role is not provided, the "value" role is used as the default.
Roles and modifiers can also use more verbose names, when preceded by a comma:
EXAMPLE:
    xo_emit("{,label:Free}{,decoration::}{,padding: }"
            "{,value:free/%u} {,units:Blocks}\n", free_blocks);
    xo_emit("{C:bold}{:value}{C:no-bold}\n", value);
Colors and effects remain in effect until modified by other "C"-role fields.
    xo_emit("{C:bold}{C:inverse}both{C:no-bold}only inverse\n");
If the content is empty, the "reset" action is performed.
    xo_emit("{C:bold,underline}{:value}{C:}\n", value);
The content should be a comma-separated list of zero or more colors or display effects.
    xo_emit("{C:bold,underline,inverse}All three{C:no-bold,no-inverse}\n");
The color content can be either static, when placed directly within the field descriptor, or a printf-style format descriptor can be used, if preceded by a slash ("/"):
xo_emit("{C:/%s%s}{:value}{C:}", need_bold ? "bold" : "", need_underline ? "underline" : "", value);
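Because the content of a color field is a comma-separated list, concatenating two effect names with "%s%s" yields "boldunderline" when both are requested. One possible workaround, sketched here in plain C, is to build the list first and pass it as a single "%s" argument. The helper name `build_effects` is hypothetical and is not part of libxo.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a libxo function): build "", "bold",
 * "underline", or "bold,underline" in buf.  Returns the length that
 * snprintf reports, excluding the trailing NUL.
 */
static int
build_effects (char *buf, size_t bufsize, int need_bold, int need_underline)
{
    assert(buf != NULL && bufsize > 0);

    return snprintf(buf, bufsize, "%s%s%s",
                    need_bold ? "bold" : "",
                    (need_bold && need_underline) ? "," : "",
                    need_underline ? "underline" : "");
}
```

The resulting string could then be supplied as the argument for a "{C:/%s}" field in place of the two separate arguments.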
Color names are prefixed with either "fg-" or "bg-" to change the foreground and background colors, respectively.
    xo_emit("{C:/fg-%s,bg-%s}{Lwc:Cost}{:cost/%u}{C:reset}\n", fg_color, bg_color, cost);
The following table lists the supported effects:
Name Description |
bg-xxxxx Change background color |
bold Start bold text effect |
fg-xxxxx Change foreground color |
inverse Start inverse (aka reverse) text effect |
no-bold Stop bold text effect |
no-inverse Stop inverse (aka reverse) text effect |
no-underline Stop underline text effect |
normal Reset effects (only) |
reset Reset colors and effects (restore defaults) |
underline Start underline text effect |
The following color names are supported:
Name |
black |
blue |
cyan |
default |
green |
magenta |
red |
white |
yellow |
    xo_emit("{D:((}{:name}{D:))}\n", name);
The simplified version can be generated for a single message using the "xopo -s <text>" command, or an entire .pot can be translated using the "xopo -f <input> -o <output>" command.
    xo_emit("{G:}Invalid token\n");
The {G:} role allows a domain name to be set. gettext() calls will continue to use that domain name until the current format string processing is complete, enabling a library function to emit strings using its own catalog. The domain name can be either static as the content of the field, or a format can be used to get the domain name from the arguments.
    xo_emit("{G:libc}Service unavailable in restricted mode\n");
    xo_emit("{Lwc:Cost}{:cost/%u}\n", cost);
    xo_emit("{:cost/%u} {N:per year}\n", cost);
    xo_emit("{P: }{Lwc:Cost}{:cost/%u}\n", cost);
    xo_emit("{P:/%30s}{Lwc:Cost}{:cost/%u}\n", "", cost);
    xo_emit("{T:Interface Statistics}\n");
    xo_emit("{T:/%20.20s}{T:/%6.6s}\n", "Item Name", "Cost");
    xo_emit("{Lwc:Distance}{:distance/%u}{Uw:miles}\n", miles);
Note that the sense of the 'w' modifier is reversed for units; a blank is added before the contents, rather than after it.
When the XOF_UNITS flag is set, units are rendered in XML as the "units" attribute:
<distance units="miles">50</distance>
Units can also be rendered in HTML as the "data-units" attribute:
<div class="data" data-tag="distance" data-units="miles" data-xpath="/top/data/distance">50</div>
    xo_emit("{:length/%02u}x{:width/%02u}x{:height/%02u}\n", length, width, height);
    xo_emit("{:author} wrote \"{:poem}\" in {:year/%4d}\n", author, poem, year);
To give a width directly, encode it as the content of the anchor tag:
    xo_emit("({[:10}{:min/%d}/{:max/%d}{]:})\n", min, max);
To pass a width as an argument, use "%d" as the format, which must appear after the "/". Note that only "%d" is supported for widths. Using any other value could ruin your day.
    xo_emit("({[:/%d}{:min/%d}/{:max/%d}{]:})\n", width, min, max);
If the width is negative, padding will be added on the right, suitable for left justification. Otherwise the padding will be added to the left of the fields between the start and stop anchors, suitable for right justification. If the width is zero, nothing happens. If the number of columns of output between the start and stop anchors is greater than or equal to the absolute value of the given width, nothing happens.
Widths over 8k are considered probable errors and not supported. If XOF_WARN is set, a warning will be generated.
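The justification rules can be sketched in plain C. The helper name `anchor_pad` is hypothetical and is not libxo code. The sketch assumes ASCII text, so that one byte occupies one column, and it assumes that padding is added only when the text is narrower than the absolute value of the width.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not libxo code): pad already-rendered text to
 * an anchor width.  A positive width pads on the left, a negative
 * width pads on the right, and zero adds nothing.
 * Returns the number of bytes written, excluding the trailing NUL.
 */
static size_t
anchor_pad (char *dst, size_t dstsize, const char *text, int width)
{
    size_t cols, want, pad;

    assert(dst != NULL && text != NULL);
    assert(width >= -8192 && width <= 8192);    /* larger widths are unsupported */

    cols = strlen(text);
    want = (size_t) abs(width);
    pad = (want > cols) ? want - cols : 0;
    assert(dstsize > cols + pad);

    if (width > 0) {            /* right justification: pad on the left */
        memset(dst, ' ', pad);
        memcpy(dst + pad, text, cols);
    } else {                    /* left justification: pad on the right */
        memcpy(dst, text, cols);
        memset(dst + cols, ' ', pad);
    }

    dst[cols + pad] = '\0';
    return cols + pad;
}
```

With the text "15/20", a width of 10 gives five leading spaces and a width of -10 gives five trailing spaces, while a width of 3 leaves the text unchanged.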
M Name Description |
a argument The content appears as a "const char *" argument |
c colon A colon (":") is appended after the label |
d display Only emit field for display styles (text/HTML) |
e encoding Only emit for encoding styles (XML/JSON) |
h humanize (hn) Format large numbers in human-readable style |
hn-space Humanize: Place space between numeric and unit |
hn-decimal Humanize: Add a decimal digit, if number < 10 |
hn-1000 Humanize: Use 1000 as divisor instead of 1024 |
k key Field is a key, suitable for XPath predicates |
l leaf-list Field is a leaf-list, a list of leaf values |
n no-quotes Do not quote the field when using JSON style |
q quotes Quote the field when using JSON style |
t trim Trim leading and trailing whitespace |
w white A blank (" ") is appended after the label |
For example, the modifier string "Lwc" means the field has a label role (text that describes the next field) and should be followed by a colon ('c') and a space ('w'). The modifier string "Vkq" means the field has a value role, that it is a key for the current instance, and that the value should be quoted when encoded for JSON.
Roles and modifiers can also use more verbose names, when preceded by a comma. For example, the modifier string "Lwc" (or "L,white,colon") means the field has a label role (text that describes the next field) and should be followed by a colon ('c') and a space ('w'). The modifier string "Vkq" (or ",key,quotes") means the field has a value role (the default role), that it is a key for the current instance, and that the value should be quoted when encoded for JSON.
EXAMPLE:
    xo_emit("{La:} {a:}\n", "Label text", "label", "value");
TEXT:
    Label text value
JSON:
    "label": "value"
XML:
    <label>value</label>
The argument modifier allows field names for value fields to be passed on the stack, avoiding the need to build a field descriptor using snprintf(3). For many field roles, the argument modifier is not needed, since those roles have specific mechanisms for arguments, such as "{C:fg-%s}".
EXAMPLE:
    xo_emit("{Lc:Name}{:name}\n", "phil");
TEXT:
    Name:phil
The colon modifier is only used for the TEXT and HTML output styles. It is commonly combined with the space modifier ('{w:}'). It is purely a convenience feature.
EXAMPLE:
    xo_emit("{Lcw:Name}{d:name} {:id/%d}\n", "phil", 1);
TEXT:
    Name: phil 1
XML:
    <id>1</id>
The display modifier is the opposite of the encoding modifier, and they are often used to give two distinct views of the underlying data.
EXAMPLE:
    xo_emit("{Lcw:Name}{:name} {e:id/%d}\n", "phil", 1);
TEXT:
    Name: phil
XML:
    <name>phil</name><id>1</id>
The encoding modifier is the opposite of the display modifier, and they are often used to give two distinct views of the underlying data.
"hn" can be used as an alias for "humanize".
The humanize modifier only affects display styles (TEXT and HTML). The "no-humanize" option will block the function of the humanize modifier.
There are a number of modifiers that affect details of humanization. These are only available as full names, not single characters. The "hn-space" modifier places a space between the number and any multiplier symbol, such as "M" or "K" (ex: "44 K"). The "hn-decimal" modifier will add a decimal point and a single tenths digit when the number is less than 10 (ex: "4.4K"). The "hn-1000" modifier will use 1000 as divisor instead of 1024, following the SI decimal convention instead of the binary powers-of-two tradition.
EXAMPLE:
    xo_emit("{h:input/%u}, {h,hn-space:output/%u}, "
            "{h,hn-decimal:errors/%u}, {h,hn-1000:capacity/%u}, "
            "{h,hn-decimal:remaining/%u}\n",
            input, output, errors, capacity, remaining);
TEXT:
    21, 57 K, 96M, 44M, 1.2G
In the HTML style, the original numeric value is rendered in the "data-number" attribute on the <div> element:
<div class="data" data-tag="errors" data-number="100663296">96M</div>
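The scaling behind humanized output can be sketched in plain C. This is not libxo's implementation. The function name `humanize_sketch` is hypothetical, the "hn-decimal" behavior is omitted, and the sketch truncates where a real implementation may round.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical sketch (not libxo code): divide value by divisor
 * (1024, or 1000 for "hn-1000" behavior) until it fits, then append
 * a unit suffix.  A nonzero use_space mimics "hn-space".
 * Returns the length that snprintf reports.
 */
static int
humanize_sketch (char *buf, size_t bufsize, uint64_t value,
                 uint64_t divisor, int use_space)
{
    static const char suffixes[] = "KMGTPE";
    int idx = -1;               /* -1 means no suffix is needed */

    assert(buf != NULL && bufsize > 0 && divisor > 1);

    while (value >= divisor && idx < (int) sizeof(suffixes) - 2) {
        value /= divisor;       /* truncates toward zero */
        idx += 1;
    }

    if (idx < 0)
        return snprintf(buf, bufsize, "%" PRIu64, value);

    return snprintf(buf, bufsize, "%" PRIu64 "%s%c", value,
                    use_space ? " " : "", suffixes[idx]);
}
```

With a divisor of 1024, the value 100663296 from the "data-number" attribute above is divided twice, giving "96M".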
In the following example, the strings "State" and "full" are passed to gettext() to find locale-based translated strings.
    xo_emit("{Lgwc:State}{g:state}\n", "full");
EXAMPLE:
    xo_open_list("user");
    for (i = 0; i < num_users; i++) {
        xo_open_instance("user");
        xo_emit("User {k:name} has {:count} tickets\n",
                user[i].u_name, user[i].u_tickets);
        xo_close_instance("user");
    }
    xo_close_list("user");
Currently the key modifier is only used when generating XPath values for the HTML output style when XOF_XPATH is set, but other uses are likely in the near future.
EXAMPLE:
    xo_open_list("user");
    for (i = 0; i < num_users; i++) {
        xo_emit("Member {l:user}\n", user[i].u_name);
    }
    xo_close_list("user");
XML:
    <user>phil</user>
    <user>pallavi</user>
JSON:
    "user": [ "phil", "pallavi" ]
EXAMPLE:
    const char *bool = is_true ? "true" : "false";
    xo_emit("{n:fancy/%s}", bool);
JSON:
    "fancy": true
    xo_emit("{:bytes} {Ngp:byte,bytes}\n", bytes);
The plural modifier is meant to work with the gettext modifier ({g:}) but can work independently.
When used without the gettext modifier or when the message does not appear in the message catalog, the first token is chosen when the last numeric value is equal to 1; otherwise the second value is used, mimicking the simple pluralization rules of English.
When used with the gettext modifier, the ngettext(3) function is called to handle the heavy lifting, using the message catalog to convert the singular and plural forms into the native language.
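The fallback rule used without a message catalog can be sketched in plain C. The helper name `pick_plural` is hypothetical and is not libxo code. It assumes the content holds the singular and plural forms separated by a single comma, as in "byte,bytes".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not libxo code): given content such as
 * "byte,bytes" and the last numeric value, return a pointer to the
 * chosen token and store its length in *lenp.  The singular token is
 * not NUL-terminated, so callers must honor the length.
 */
static const char *
pick_plural (const char *content, unsigned long value, size_t *lenp)
{
    const char *comma;

    assert(content != NULL && lenp != NULL);

    comma = strchr(content, ',');
    if (comma == NULL) {        /* only one form was given */
        *lenp = strlen(content);
        return content;
    }

    if (value == 1) {           /* English singular */
        *lenp = (size_t) (comma - content);
        return content;
    }

    *lenp = strlen(comma + 1);  /* everything else is plural, including 0 */
    return comma + 1;
}
```

A value of 1 selects "byte", while 0 and 2 both select "bytes".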
EXAMPLE:
    xo_emit("{q:year/%d}", 2014);
JSON:
    "year": "2014"
EXAMPLE:
    xo_emit("{Lw:Name}{:name}\n", "phil");
TEXT:
    Name phil
The white space modifier is only used for the TEXT and HTML output styles. It is commonly combined with the colon modifier ('{c:}'). It is purely a convenience feature.
Note that the sense of the 'w' modifier is reversed for the units role ({Uw:}); a blank is added before the contents, rather than after it.
If the format string is not provided for a value field, it defaults to "%s".
Note that a field definition can contain zero or more printf-style "directives", which are sequences that start with a '%' and end with one of the following characters: "diouxXDOUeEfFgGaAcCsSp". Each directive is matched by one or more arguments to the xo_emit(3) function.
The format string has the form:
'%' format-modifier * format-character
The format-modifier can be:
Note that 'q', 'D', 'O', and 'U' are considered deprecated and will be removed eventually.
The format character is described in the following table:
C Argument Type Format |
d int base 10 (decimal) |
i int base 10 (decimal) |
o int base 8 (octal) |
u unsigned base 10 (decimal) |
x unsigned base 16 (hex) |
X unsigned base 16 (hex) |
D long base 10 (decimal) |
O unsigned long base 8 (octal) |
U unsigned long base 10 (decimal) |
e double [-]d.ddde+-dd |
E double [-]d.dddE+-dd |
f double [-]ddd.ddd |
F double [-]ddd.ddd |
g double as 'e' or 'f' |
G double as 'E' or 'F' |
a double [-]0xh.hhhp[+-]d |
A double [-]0Xh.hhhp[+-]d |
c unsigned char a character |
C wint_t a character |
s char * a UTF-8 string |
S wchar_t * a unicode/WCS string |
p void * '%#lx' |
The 'h' and 'l' modifiers affect the size and treatment of the argument:
Mod d, i o, u, x, X |
hh signed char unsigned char |
h short unsigned short |
l long unsigned long |
ll long long unsigned long long |
j intmax_t uintmax_t |
t ptrdiff_t ptrdiff_t |
z size_t size_t |
q quad_t u_quad_t |
For strings, the 'h' and 'l' modifiers affect the interpretation of the bytes pointed to by the argument. The default '%s' string is a 'char *' pointer to a string encoded as UTF-8. Since UTF-8 is compatible with ASCII data, a normal 7-bit ASCII string can be used. "%ls" expects a "wchar_t *" pointer to a wide-character string, encoded as 32-bit Unicode values. "%hs" expects a "char *" pointer to a multi-byte string encoded with the current locale, as given by the LC_CTYPE, LANG, or LC_ALL environment variables. The first of this list of variables is used and if none of the variables are set, the locale defaults to UTF-8.
libxo will convert these arguments as needed to either UTF-8 (for XML, JSON, and HTML styles) or locale-based strings for display in text style.
xo_emit("All strings are utf-8 content {:tag/%ls}", L"except for wide strings");
"%S" is equivalent to "%ls".
For example, a function is passed a locale-based name, a hat size, and a time value. The hat size is formatted into a UTF-8 (ASCII) string, and the time value is formatted into a wchar_t string.
    #include <stdio.h>
    #include <time.h>
    #include <wchar.h>
    #include <libxo/xo.h>

    void print_order (const char *name, int size, struct tm *timep) {
        char buf[32];
        const char *size_val = "unknown";

        /* Format the size only when it is known */
        if (size > 0) {
            snprintf(buf, sizeof(buf), "%d", size);
            size_val = buf;
        }

        /* wcsftime takes a count of wide characters, not bytes */
        wchar_t when[32];
        wcsftime(when, sizeof(when) / sizeof(when[0]), L"%d%b%y", timep);

        xo_emit("The hat for {:name/%hs} is {:size/%s}.\n", name, size_val);
        xo_emit("It was ordered on {:order-time/%ls}.\n", when);
    }
It is important to note that xo_emit(3) will perform the conversion required to make appropriate output. Text style output uses the current locale (as described above), while XML, JSON, and HTML use UTF-8.
UTF-8 and locale-encoded strings can use multiple bytes to encode one column of data. The traditional "precision" (aka "max-width") value for "%s" printf formatting becomes overloaded since it specifies both the number of bytes that can be safely referenced and the maximum number of columns to emit. xo_emit(3) uses the precision as the former, and adds a third value for specifying the maximum number of columns.
In this example, the name field is printed with a minimum of 3 columns and a maximum of 6. Up to ten bytes are used in filling those columns.
xo_emit("{:name/%3.10.6s}", name);
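The interaction between the byte limit and the column limit can be sketched in plain C. The helper name `utf8_clip` is hypothetical and is not libxo code. It assumes valid UTF-8 input and one column per code point, which ignores wide and combining characters, and it leaves out the padding needed to reach the minimum width.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not libxo code): decide how many bytes of a
 * UTF-8 string to emit, reading at most max_bytes bytes and emitting
 * at most max_cols columns.  Stores the column count in *colsp and
 * returns the byte count.
 */
static size_t
utf8_clip (const char *str, size_t max_bytes, size_t max_cols, size_t *colsp)
{
    size_t bytes = 0, cols = 0;

    assert(str != NULL && colsp != NULL);

    while (bytes < max_bytes && str[bytes] != '\0' && cols < max_cols) {
        unsigned char lead = (unsigned char) str[bytes];
        /* The lead byte gives the length of the encoded code point */
        size_t len = (lead < 0x80) ? 1 : (lead < 0xE0) ? 2
            : (lead < 0xF0) ? 3 : 4;

        if (bytes + len > max_bytes)
            break;              /* never read past the byte limit */

        bytes += len;
        cols += 1;
    }

    *colsp = cols;
    return bytes;
}
```

For an 11-column string that starts with "h" followed by a two-byte character, limits of ten bytes and six columns select the first seven bytes, while a byte limit of two stops after the "h" rather than splitting the two-byte character.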
EXAMPLE:
    xo_emit("The hat is {:size/%s}.\n", size_val);
TEXT:
    The hat is extra small.
XML:
    <size>extra small</size>
JSON:
    "size": "extra small"
HTML:
    <div class="text">The hat is </div>
    <div class="data" data-tag="size">extra small</div>
    <div class="text">.</div>
    xo_emit("{P:   }{Lwc:In stock}{:in-stock/%u}\n", instock);
This call will generate the following output:
TEXT:
       In stock: 144
XML:
    <in-stock>144</in-stock>
JSON:
    "in-stock": 144,
HTML:
    <div class="line">
      <div class="padding">   </div>
      <div class="label">In stock</div>
      <div class="decoration">:</div>
      <div class="padding"> </div>
      <div class="data" data-tag="in-stock">144</div>
    </div>
Clearly HTML wins the verbosity award, and this output does not include XOF_XPATH or XOF_INFO data, which would expand the penultimate line to:
<div class="data" data-tag="in-stock" data-xpath="/top/data/item/in-stock" data-type="number" data-help="Number of items in stock">144</div>
if ($src1/process[pid == $pid]/name == $src2/proc-table/proc/p[process-id == $pid]/proc-name) { ... }
Find someone else who is expressing similar data and follow their fields and hierarchy. Remember the quote is not "Consistency is the hobgoblin of little minds" but "A foolish consistency is the hobgoblin of little minds".
Field names constitute the means by which client programmers interact with our system. By choosing wise names now, you are making their lives better.
After using xolint(1) to find errors in your field descriptors, use "xolint -V" to spell check your field names and to detect different names for the same data. "dropped-short" and "dropped-too-short" are both reasonable names, but using them both will lead users to ask the difference between the two fields. If there is no difference, use only one of the field names. If there is a difference, change the names to make that difference more obvious.
LIBXO (3) | December 4, 2014 |