.. index:: Field Formatting

The field format is similar to the format string for printf(3).  Its
use varies based on the role of the field, but it is generally used to
format the field's contents.

If the format string is not provided for a value field, it defaults to
"%s".

Note a field definition can contain zero or more printf-style
'directives', which are sequences that start with a '%' and end with
one of the following characters: "diouxXDOUeEfFgGaAcCsSp".  Each
directive is matched by one or more arguments to the xo_emit function.

The format string has the form::

    '%' format-modifier * format-character

The format-modifier can be:

- a '#' character, indicating the output value should be prefixed
  with '0x', typically to indicate a base 16 (hex) value.
- a minus sign ('-'), indicating the output value should be padded on
  the right instead of the left.
- a leading zero ('0'), indicating the output value should be padded on
  the left with zeroes instead of spaces (' ').
- one or more digits ('0' - '9'), indicating the minimum width of the
  argument.  If the width in columns of the output value is less than
  the minimum width, the value will be padded to reach the minimum.
- a period followed by one or more digits, indicating the maximum
  number of bytes which will be examined for a string argument, or the
  maximum width for a non-string argument.  When handling ASCII strings,
  this functions as the field width, but for multi-byte characters, a
  single character may be composed of multiple bytes.  xo_emit will
  never dereference memory beyond the given number of bytes.
- a second period followed by one or more digits, indicating the maximum
  width for a string argument.  This modifier cannot be given for
  non-string arguments.
- one or more 'h' characters, indicating shorter input data.
- one or more 'l' characters, indicating longer input data.
- a 'z' character, indicating a 'size_t' argument.
- a 't' character, indicating a 'ptrdiff_t' argument.
- a ' ' character, indicating a space should be emitted before
  positive numbers.
- a '+' character, indicating a sign should be emitted before any number.
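
Since the field format follows printf(3), several of the modifiers above can be tried out with plain snprintf().  The following is a minimal sketch that uses only the standard C library, not xo_emit; the structure and function names are hypothetical and exist only for illustration:

```c
#include <stdio.h>

/* Hypothetical holder for a few formatted results. */
struct modifier_demo {
    char hex[16];       /* '#' modifier */
    char left[16];      /* '-' modifier with a minimum width */
    char zero[16];      /* leading '0' with a minimum width */
    char prec[16];      /* '.' precision applied to a string */
    char sign[16];      /* '+' modifier */
    char space[16];     /* ' ' modifier */
};

/* Fill each member using one printf-style modifier. */
static void
modifier_demo_fill (struct modifier_demo *d)
{
    snprintf(d->hex, sizeof(d->hex), "%#x", 255u);          /* 0x prefix */
    snprintf(d->left, sizeof(d->left), "%-6d|", 42);        /* pad on right */
    snprintf(d->zero, sizeof(d->zero), "%06d", 42);         /* pad with zeroes */
    snprintf(d->prec, sizeof(d->prec), "%.3s", "abcdef");   /* at most 3 bytes */
    snprintf(d->sign, sizeof(d->sign), "%+d", 42);          /* explicit sign */
    snprintf(d->space, sizeof(d->space), "% d", 42);        /* leading space */
}
```

The second period (maximum column width for strings) has no printf(3) counterpart, so it is not shown here.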

Note that 'q', 'D', 'O', and 'U' are considered deprecated and will be

The format character is described in the following table:

===== ================= ======================
Ltr   Argument Type     Format
===== ================= ======================
d     int               base 10 (decimal)
i     int               base 10 (decimal)
u     unsigned          base 10 (decimal)
x     unsigned          base 16 (hex)
X     unsigned          base 16 (hex)
D     long              base 10 (decimal)
O     unsigned long     base 8 (octal)
U     unsigned long     base 10 (decimal)
e     double            [-]d.ddde+-dd
E     double            [-]d.dddE+-dd
g     double            as 'e' or 'f'
G     double            as 'E' or 'F'
a     double            [-]0xh.hhhp[+-]d
A     double            [-]0Xh.hhhp[+-]d
c     unsigned char     a character
s     char \*           a UTF-8 string
S     wchar_t \*        a unicode/WCS string
===== ================= ======================

The 'h' and 'l' modifiers affect the size and treatment of the
argument:

===== ============= ====================
Mod   d, i          o, u, x, X
===== ============= ====================
hh    signed char   unsigned char
h     short         unsigned short
ll    long long     unsigned long long
===== ============= ====================

UTF-8 and Locale Strings
~~~~~~~~~~~~~~~~~~~~~~~~

For strings, the 'h' and 'l' modifiers affect the interpretation of
the bytes pointed to by the argument.  The default '%s' string is a
'char \*' pointer to a string encoded as UTF-8.  Since UTF-8 is
compatible with ASCII data, a normal 7-bit ASCII string can be used.
'%ls' expects a 'wchar_t \*' pointer to a wide-character string, encoded
as 32-bit Unicode values.  '%hs' expects a 'char \*' pointer to a
multi-byte string encoded with the current locale, as given by the
LC_CTYPE, LANG, or LC_ALL environment variables.  The first of this list
of variables is used and if none of the variables are set, the locale

libxo will convert these arguments as needed to either UTF-8 (for XML,
JSON, and HTML styles) or locale-based strings for display in text
style::

    xo_emit("All strings are utf-8 content {:tag/%ls}",
            L"except for wide strings");

======== ================== ===============================================
Format   Argument Type      Argument Contents
======== ================== ===============================================
%s       const char \*      UTF-8 string
%S       const wchar_t \*   Wide character UNICODE string (alias for '%ls')
%ls      const wchar_t \*   Wide character UNICODE string
%hs      const char \*      locale-based string
======== ================== ===============================================
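
To make the difference between wide and locale-based strings concrete, the sketch below converts a wide-character string into the current locale's multi-byte encoding with the standard wcstombs() function.  It is only an illustration of the kind of conversion involved; it is not libxo's own conversion code, and the helper name is hypothetical:

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a wide string to the current locale's
 * multi-byte encoding.  Returns 0 on success, or -1 if a character
 * cannot be converted or the result (with its NUL) does not fit.
 */
static int
wide_to_locale (char *out, size_t outsiz, const wchar_t *in)
{
    size_t n = wcstombs(out, in, outsiz);

    if (n == (size_t) -1 || n >= outsiz)
        return -1;      /* conversion error, or no room for the NUL */
    return 0;
}
```

A program would normally call setlocale(LC_CTYPE, "") first so that the conversion honors the user's environment.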

.. admonition:: "Long", not "locale"

   The "*l*" in "%ls" is for "*long*", following the convention of "%ld".
   It is not "*locale*", a common mis-mnemonic.  "%S" is equivalent to
   "%ls".

For example, the following function is passed a locale-based name, a
hat size, and a time value.  The hat size is formatted in a UTF-8
(ASCII) string, and the time value is formatted into a wchar_t
string::
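
    void print_order (const char *name, int size, struct tm *timep) {
        char buf[32];                   /* buffer size is an assumption */
        const char *size_val = "unknown";
        wchar_t when[32];               /* buffer size is an assumption */
        if (size > 0) {                 /* assumed validity check */
            snprintf(buf, sizeof(buf), "%d", size);
            size_val = buf;
        }
        wcsftime(when, sizeof(when) / sizeof(when[0]), L"%d%b%y", timep);
        xo_emit("The hat for {:name/%hs} is {:size/%s}.\n", name, size_val);
        xo_emit("It was ordered on {:order-time/%ls}.\n", when);
    }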
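
The wide-character time formatting step can be exercised on its own with the standard library.  The sketch below is a hypothetical helper, not part of libxo; it shows that wcsftime() takes its limit as a count of wide characters rather than a count of bytes:

```c
#include <stddef.h>
#include <time.h>
#include <wchar.h>

/*
 * Hypothetical helper: format a date such as "05Jan24" into a wide
 * string.  out_count is the number of wchar_t elements in out, not the
 * size in bytes.  Returns the number of wide characters written, or 0
 * if the result did not fit.
 */
static size_t
format_order_date (wchar_t *out, size_t out_count, const struct tm *timep)
{
    return wcsftime(out, out_count, L"%d%b%y", timep);
}
```

For an array declared as "wchar_t when[32]", the count would be passed as "sizeof(when) / sizeof(when[0])".  The month abbreviation produced by "%b" depends on the current locale.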

Note that xo_emit performs the conversion required to produce
appropriate output.  Text style output uses the current locale (as
described above), while XML, JSON, and HTML use UTF-8.

UTF-8 and locale-encoded strings can use multiple bytes to encode one
column of data.  The traditional "precision" (aka "max-width") value
for "%s" printf formatting becomes overloaded since it specifies both
the number of bytes that can be safely referenced and the maximum
number of columns to emit.  xo_emit uses the precision as the former,
and adds a third value for specifying the maximum number of columns.

In this example, the name field is printed with a minimum of 3 columns
and a maximum of 6.  Up to ten bytes of data at the location given by
'name' are used in filling those columns::

    xo_emit("{:name/%3.10.6s}", name);
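
The interaction of the three values can be modeled with a small stand-alone function.  This is a sketch under simplifying assumptions, not libxo's implementation: the helper name is hypothetical, the input is assumed to be UTF-8, and every code point is counted as exactly one column (wide and combining characters are ignored):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy UTF-8 text from src into out, examining no
 * more than max_bytes bytes, emitting no more than max_cols characters,
 * and padding on the left with spaces up to min_cols.  Returns the
 * number of columns produced, or 0 if out is too small.
 */
static size_t
fit_columns (char *out, size_t outsiz, const char *src,
             size_t min_cols, size_t max_bytes, size_t max_cols)
{
    size_t used = 0, cols = 0, pad, i;

    while (used < max_bytes && src[used] != '\0' && cols < max_cols) {
        unsigned char lead = (unsigned char) src[used];
        size_t len = (lead < 0x80) ? 1 : (lead >= 0xF0) ? 4
            : (lead >= 0xE0) ? 3 : 2;

        if (used + len > max_bytes)
            break;              /* character would cross the byte limit */
        for (i = 1; i < len; i++)
            if (((unsigned char) src[used + i] & 0xC0) != 0x80)
                break;
        if (i < len)
            break;              /* malformed or truncated sequence */
        used += len;
        cols += 1;
    }

    pad = (cols < min_cols) ? min_cols - cols : 0;
    if (pad + used + 1 > outsiz)
        return 0;               /* caller's buffer is too small */

    memset(out, ' ', pad);
    memcpy(out + pad, src, used);
    out[pad + used] = '\0';
    return pad + cols;
}
```

With the values from the example above (3, 10, 6), a long ASCII name is cut at six columns, while a name made of two-byte characters stops after five characters because the ten-byte limit is reached first.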

Characters Outside of Field Definitions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Characters in the format string that are not part of a field
definition are copied to the output for the TEXT style, and are
ignored for the JSON and XML styles.  For HTML, these characters are
placed in a <div> with class "text"::

    EXAMPLE:
        xo_emit("The hat is {:size/%s}.\n", size_val);
    TEXT:
        The hat is extra small.
    XML:
        <size>extra small</size>
    JSON:
        "size": "extra small"
    HTML:
        <div class="text">The hat is </div>
        <div class="data" data-tag="size">extra small</div>
        <div class="text">.</div>

libxo supports the '%m' directive, which formats the error message
associated with the current value of "errno".  It is the equivalent
of "%s" with the argument strerror(errno)::

    xo_emit("{:filename} cannot be opened: {:error/%m}", filename);
    xo_emit("{:filename} cannot be opened: {:error/%s}",
            filename, strerror(errno));
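
The "%s" plus strerror() form can be checked with the standard library alone.  The following sketch uses snprintf() instead of xo_emit, and the helper name is hypothetical:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the same message as the second xo_emit
 * call above, using an error number captured by the caller.
 */
static int
format_open_error (char *buf, size_t bufsiz, const char *filename, int err)
{
    return snprintf(buf, bufsiz, "%s cannot be opened: %s",
                    filename, strerror(err));
}
```

Taking the error number as a parameter lets the caller save "errno" immediately after the failing call, before any other library call has a chance to overwrite it.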

"%n" Is Not Supported
~~~~~~~~~~~~~~~~~~~~~

libxo does not support the '%n' directive.  It's a bad idea and we

The Encoding Format (eformat)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The "eformat" string is the format string used when encoding the field
for JSON and XML.  If not provided, it defaults to the primary format
with any minimum width removed.  If the primary is not given, both

For padding and labels, the content string is considered the content,
unless a format is given.

.. index:: printf-like

Many compilers and tool chains support validation of printf-like
arguments.  When the format string fails to match the argument list,
a warning is generated.  This is a valuable feature and while the
formatting strings for libxo differ considerably from printf, many of
these checks can still provide build-time protection against bugs.
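
As an illustration of this kind of checking, GCC and Clang accept a "format" function attribute that asks the compiler to compare a format string against the arguments that follow it.  The macro and wrapper below are hypothetical and are not taken from libxo; they only sketch the general mechanism:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical macro: enable format checking where the compiler supports it. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_PRINTFLIKE(fmt_idx, arg_idx) \
    __attribute__((__format__(__printf__, fmt_idx, arg_idx)))
#else
#define DEMO_PRINTFLIKE(fmt_idx, arg_idx)
#endif

/* The format string is parameter 3; the checked arguments start at 4. */
static int demo_format (char *buf, size_t bufsiz, const char *fmt, ...)
    DEMO_PRINTFLIKE(3, 4);

static int
demo_format (char *buf, size_t bufsiz, const char *fmt, ...)
{
    va_list ap;
    int rc;

    va_start(ap, fmt);
    rc = vsnprintf(buf, bufsiz, fmt, ap);
    va_end(ap);
    return rc;
}
```

With the attribute in place and format warnings enabled (for example with -Wall), a call such as demo_format(buf, sizeof(buf), "%d", "text") would typically be reported at build time.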

libxo provides variants of functions that provide this ability, if the
"--enable-printflike" option is passed to the "configure" script.
These functions use the "_p" suffix, like "xo_emit_p()",

The following are features of libxo formatting strings that are
incompatible with printf-like testing:

- implicit formats, where "{:tag}" has an implicit "%s";
- the "max" parameter for strings, where "{:tag/%4.10.6s}" means up to
  ten bytes of data can be inspected to fill a minimum of 4 columns and
  a maximum of 6;
- percent signs in strings, where "{:filled}%" makes a single,
  trailing percent sign;
- the "l" and "h" modifiers for strings, where "{:tag/%hs}" means
  locale-based string and "{:tag/%ls}" means a wide character string;
- distinct encoding formats, where "{:tag/#%s/%s}" means the display
  styles (text and HTML) will use "#%s" where other styles use "%s";

If none of these features are in use by your code, then using the "_p"
variants might be wise:

================== ========================
Function           printf-like Equivalent
================== ========================
xo_emit_hv         xo_emit_hvp
xo_emit_warn_hcv   xo_emit_warn_hcvp
xo_emit_warn_hc    xo_emit_warn_hcp
xo_emit_warn_c     xo_emit_warn_cp
xo_emit_warn       xo_emit_warn_p
xo_emit_warnx      xo_emit_warnx_p
xo_emit_err        xo_emit_err_p
xo_emit_errx       xo_emit_errx_p
xo_emit_errc       xo_emit_errc_p
================== ========================

.. index:: performance
.. index:: XOEF_RETAIN

Retaining Parsed Format Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

libxo can retain the parsed internal information related to the given
format string.  On subsequent xo_emit calls, the retained information
is used, avoiding repetitive parsing of the format string::

    int xo_emit_f(xo_emit_flags_t flags, const char *fmt, ...);

    xo_emit_f(XOEF_RETAIN, "{:some/%02d}{:thing/%-6s}{:fancy}\n",

To retain parsed format information, pass the XOEF_RETAIN flag to the
xo_emit_f() function.  A complete set of xo_emit_f functions exists to
match all the xo_emit function signatures (with handles, variadic
arguments, and printf-like flags):

================== ========================
Function           Flags Equivalent
================== ========================
xo_emit_hv         xo_emit_hvf
xo_emit_hvp        xo_emit_hvfp
xo_emit_hp         xo_emit_hfp
================== ========================

The format string must be immutable across multiple calls to xo_emit_f(),
since the library retains the string.  Typically this is done by using
static constant strings, such as string literals.  If the string is not
immutable, the XOEF_RETAIN flag must not be used.

The functions xo_retain_clear() and xo_retain_clear_all() release
internal information on either a single format string or all format
strings, respectively.  Neither is required, but the library will
retain this information until it is cleared or the process exits::

    const char *fmt = "{:name} {:count/%d}\n";
    for (i = 0; i < 1000; i++) {
        xo_open_instance("item");
        xo_emit_f(XOEF_RETAIN, fmt, name[i], count[i]);
    }
    xo_retain_clear(fmt);

The retained information is kept as thread-specific data.
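
One way to see why the format string must stay immutable is a toy cache that remembers parsed results by the address of the string.  This is a deliberately simplified model built on assumptions: it is not libxo's actual data structure, all names are hypothetical, "parsing" is reduced to counting fields, and a plain static array stands in for the thread-specific storage described above:

```c
#include <stddef.h>

#define DEMO_CACHE_SIZE 8

/* Hypothetical cache entry keyed by the address of the format string. */
struct demo_entry {
    const char *fmt;        /* key: pointer value, not string contents */
    size_t nfields;         /* stand-in for the parsed information */
};

static struct demo_entry demo_cache[DEMO_CACHE_SIZE];
static size_t demo_cache_used;
static size_t demo_parse_count;     /* number of real parses performed */

/* Stand-in for parsing: count the '{' characters in the format. */
static size_t
demo_parse (const char *fmt)
{
    size_t n = 0;

    demo_parse_count += 1;
    for (; *fmt != '\0'; fmt++)
        if (*fmt == '{')
            n += 1;
    return n;
}

/* Return the field count, parsing only on the first use of fmt. */
static size_t
demo_fields_retained (const char *fmt)
{
    size_t i, n;

    for (i = 0; i < demo_cache_used; i++)
        if (demo_cache[i].fmt == fmt)
            return demo_cache[i].nfields;   /* hit: no parsing needed */

    n = demo_parse(fmt);
    if (demo_cache_used < DEMO_CACHE_SIZE) {
        demo_cache[demo_cache_used].fmt = fmt;
        demo_cache[demo_cache_used].nfields = n;
        demo_cache_used += 1;
    }
    return n;
}
```

In this model, changing the bytes behind a cached pointer would leave a stale entry in place, which is the kind of hazard the immutability rule guards against.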

In this example, the value for the number of items in stock is emitted::

    xo_emit("{P: }{Lwc:In stock}{:in-stock/%u}\n",

This call will generate the following output::

    <in-stock>144</in-stock>

    <div class="padding"> </div>
    <div class="label">In stock</div>
    <div class="decoration">:</div>
    <div class="padding"> </div>
    <div class="data" data-tag="in-stock">144</div>

Clearly HTML wins the verbosity award, and this output does
not include XOF_XPATH or XOF_INFO data, which would expand the
penultimate line to::

    <div class="data" data-tag="in-stock"
         data-xpath="/top/data/item/in-stock"
         data-help="Number of items in stock">144</div>