Commit 69dcf65

jimjonesbr authored and Commitfest Bot committed
Add XMLSerialize: version and explicit XML declaration
* Explicit XML declaration (SQL/XML:2023, X078)

This patch adds the options INCLUDING XMLDECLARATION and
EXCLUDING XMLDECLARATION to XMLSERIALIZE, allowing users to explicitly
control the presence of the XML declaration (e.g.,
<?xml version="1.0" encoding="UTF-8"?>) in the serialized output of
XML values. If neither option is specified, the output includes the
declaration only if the input XML value already contained one.

* Version support (SQL/XML:2023, X076)

The VERSION option allows specifying the version string to use in the
XML declaration. If specified, the version must conform to the lexical
rules of the XML standard, e.g., '1.0' or '1.1'. If omitted or NULL,
version '1.0' is assumed. In DOCUMENT mode, the version string is
validated by libxml2’s `xmlNewDoc()`, which will raise an error for
invalid versions and a warning for unsupported ones. No validation is
performed in CONTENT mode. This option has no effect unless
INCLUDING XMLDECLARATION is also specified or the input XML value
already contains a declaration.

Examples:

    SELECT xmlserialize(
        DOCUMENT xmlval AS text
        VERSION '1.0' INCLUDING XMLDECLARATION);

    SELECT xmlserialize(
        DOCUMENT xmlval AS text
        EXCLUDING XMLDECLARATION);

This patch also includes regression tests and documentation.
1 parent 2648eab commit 69dcf65
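
The rule the commit message gives for when a declaration appears can be sketched as a small decision function. This is only an illustration of the described behaviour, not code from the patch: `DeclOption`, its members, and `emit_declaration` are hypothetical names standing in for the three-state option the parser records, and the caller is assumed to already know whether the input value carried a declaration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three-state option recorded by the parser. */
typedef enum
{
    DECL_NO_OPTION,             /* neither INCLUDING nor EXCLUDING was written */
    DECL_INCLUDING,             /* INCLUDING XMLDECLARATION */
    DECL_EXCLUDING              /* EXCLUDING XMLDECLARATION */
} DeclOption;

/*
 * Decide whether the serialized output should start with an XML declaration.
 * An explicit option always wins; without one, the output mirrors the input.
 */
bool
emit_declaration(DeclOption opt, bool input_has_declaration)
{
    switch (opt)
    {
        case DECL_INCLUDING:
            return true;
        case DECL_EXCLUDING:
            return false;
        case DECL_NO_OPTION:
            break;
    }

    /* Only the "no option" case reaches this point. */
    assert(opt == DECL_NO_OPTION);
    return input_has_declaration;
}
```

Under this reading, VERSION only matters on the paths where the function returns true, which matches the statement that the option has no effect otherwise.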

15 files changed: +1490 -59 lines changed

doc/src/sgml/datatype.sgml

Lines changed: 68 additions & 1 deletion
@@ -4518,7 +4518,10 @@ xml '<foo>bar</foo>'
 <type>xml</type>, uses the function
 <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
 <synopsis>
-XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> [ [ NO ] INDENT ] )
+XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable>
+    [ VERSION <replaceable>xmlversion</replaceable> ]
+    [ INCLUDING XMLDECLARATION | EXCLUDING XMLDECLARATION ]
+    [ [ NO ] INDENT ] )
 </synopsis>
 <replaceable>type</replaceable> can be
 <type>character</type>, <type>character varying</type>, or
@@ -4528,13 +4531,77 @@ XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <repla
 you to simply cast the value.
 </para>
 
+<para>
+The <literal>VERSION</literal> option sets the version string
+used in the XML declaration. If specified, the value of
+<replaceable>xmlversion</replaceable> must conform to the lexical
+rules of the XML standard: it must be a string of the form
+<literal>'1.'</literal> followed by one or more digits (e.g.,
+<literal>'1.0'</literal>, <literal>'1.1'</literal>). Versions other
+than 1.x (e.g., <literal>'2.0'</literal>) are not valid in
+<literal>DOCUMENT</literal> mode and will result in an error. This format
+is defined in section 2.8,
+<ulink url="https://www.w3.org/TR/xml/#sec-prolog-dtd">"Prolog and Document Type Declaration"</ulink>,
+of the XML standard. If the <literal>VERSION</literal> option is omitted or specified as
+<literal>NULL</literal>, version <literal>1.0</literal> is used by default.
+
+In <literal>CONTENT</literal> mode, no validation is performed on the version,
+and it is emitted as specified if an XML declaration is requested.
+</para>
+
+<para>
+The <literal>INCLUDING XMLDECLARATION</literal> and
+<literal>EXCLUDING XMLDECLARATION</literal> options control
+whether the serialized output includes an XML declaration such as
+<literal>&lt;?xml version="1.0" encoding="UTF-8"?&gt;</literal>.
+If neither option is specified, the presence of the declaration in
+the output depends on whether the input <replaceable>value</replaceable>
+originally contained one.
+
+The <literal>INCLUDING XMLDECLARATION</literal> option is allowed in both
+<literal>DOCUMENT</literal> and <literal>CONTENT</literal> modes. In
+<literal>CONTENT</literal> mode, no structural validation is performed,
+and the declaration is emitted according to the specified options,
+even if it would not normally be considered valid by the XML specification.
+</para>
+
 <para>
 The <literal>INDENT</literal> option causes the result to be
 pretty-printed, while <literal>NO INDENT</literal> (which is the
 default) just emits the original input string. Casting to a character
 type likewise produces the original string.
 </para>
 
+<para>
+Examples:
+</para>
+<screen><![CDATA[
+SELECT
+  xmlserialize(
+    DOCUMENT xmlelement(name foo, xmlelement(name bar,42)) AS text
+    VERSION '1.0'
+    INCLUDING XMLDECLARATION
+    INDENT);
+
+             xmlserialize
+---------------------------------------
+ <?xml version="1.0" encoding="UTF8"?>+
+ <foo>                                 +
+   <bar>42</bar>                       +
+ </foo>
+(1 row)
+
+SELECT
+  xmlserialize(
+    DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><foo><bar>42</bar></foo>' AS text
+    EXCLUDING XMLDECLARATION);
+
+       xmlserialize
+--------------------------
+ <foo><bar>42</bar></foo>
+(1 row)
+]]></screen>
+
 <para>
 When a character string value is cast to or from type
 <type>xml</type> without going through <type>XMLPARSE</type> or

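The lexical rule quoted in the new documentation paragraph (`'1.'` followed by one or more digits) is simple enough to check by hand. The following is a minimal standalone sketch of that rule only. It is not how the patch validates versions (the commit message says DOCUMENT mode relies on libxml2's `xmlNewDoc()`), and `xml_version_is_lexically_valid` and `check_documented_examples` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if s has the form "1." followed by one or more ASCII digits,
 * i.e. the shape the documentation text describes for xmlversion.
 */
bool
xml_version_is_lexically_valid(const char *s)
{
    const char *p;

    if (s == NULL || s[0] != '1' || s[1] != '.')
        return false;

    p = s + 2;
    if (*p == '\0')
        return false;           /* "1." with no digits after the dot */

    for (; *p != '\0'; p++)
    {
        if (*p < '0' || *p > '9')
            return false;       /* anything other than a digit is rejected */
    }
    return true;
}

/* The examples named in the documentation text, expressed as checks. */
void
check_documented_examples(void)
{
    assert(xml_version_is_lexically_valid("1.0"));
    assert(xml_version_is_lexically_valid("1.1"));
    assert(!xml_version_is_lexically_valid("2.0"));
}
```

A check like this says nothing about whether a parser supports the version. The commit message distinguishes invalid versions (an error) from unsupported ones (a warning), and that distinction is left to libxml2.
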
src/backend/catalog/sql_features.txt

Lines changed: 2 additions & 2 deletions
@@ -665,9 +665,9 @@ X072 XMLSerialize: character string serialization YES
 X073 XMLSerialize: binary string serialization and CONTENT option NO
 X074 XMLSerialize: binary string serialization and DOCUMENT option NO
 X075 XMLSerialize: binary string serialization NO
-X076 XMLSerialize: VERSION NO
+X076 XMLSerialize: VERSION YES
 X077 XMLSerialize: explicit ENCODING option NO
-X078 XMLSerialize: explicit XML declaration NO
+X078 XMLSerialize: explicit XML declaration YES
 X080 Namespaces in XML publishing NO
 X081 Query-level XML namespace declarations NO
 X082 XML namespace declarations in DML NO

src/backend/executor/execExprInterp.c

Lines changed: 3 additions & 1 deletion
@@ -4621,7 +4621,9 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
                 *op->resvalue =
                     PointerGetDatum(xmltotext_with_options(DatumGetXmlP(value),
                                                            xexpr->xmloption,
-                                                           xexpr->indent));
+                                                           xexpr->indent,
+                                                           xexpr->xmldeclaration,
+                                                           xexpr->version));
                 *op->resnull = false;
             }
             break;

src/backend/parser/gram.y

Lines changed: 25 additions & 4 deletions
@@ -626,6 +626,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <defelt> xmltable_column_option_el
 %type <list> xml_namespace_list
 %type <target> xml_namespace_el
+%type <ival> opt_xml_declaration_option
+%type <str> opt_xml_declaration_version xmlserialize_version
 
 %type <node> func_application func_expr_common_subexpr
 %type <node> func_expr func_expr_windowless
@@ -794,8 +796,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
     WHEN WHERE WHITESPACE_P WINDOW WITH WITHIN WITHOUT WORK WRAPPER WRITE
 
-    XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLNAMESPACES
-    XMLPARSE XMLPI XMLROOT XMLSERIALIZE XMLTABLE
+    XML_P XMLATTRIBUTES XMLCONCAT XMLDECLARATION XMLELEMENT XMLEXISTS XMLFOREST
+    XMLNAMESPACES XMLPARSE XMLPI XMLROOT XMLSERIALIZE XMLTABLE
 
     YEAR_P YES_P
 
@@ -16200,14 +16202,16 @@ func_expr_common_subexpr:
                     $$ = makeXmlExpr(IS_XMLROOT, NULL, NIL,
                                      list_make3($3, $5, $6), @1);
                 }
-            | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename xml_indent_option ')'
+            | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename opt_xml_declaration_version opt_xml_declaration_option xml_indent_option ')'
                 {
                     XmlSerialize *n = makeNode(XmlSerialize);
 
                     n->xmloption = $3;
                     n->expr = $4;
                     n->typeName = $6;
-                    n->indent = $7;
+                    n->version = $7;
+                    n->xmldeclaration = $8;
+                    n->indent = $9;
                     n->location = @1;
                     $$ = (Node *) n;
                 }
@@ -16432,6 +16436,21 @@ xml_indent_option: INDENT { $$ = true; }
             | /*EMPTY*/ { $$ = false; }
         ;
 
+xmlserialize_version:
+            VERSION_P Sconst { $$ = $2; }
+            | VERSION_P NULL_P { $$ = NULL; }
+        ;
+
+opt_xml_declaration_version:
+            xmlserialize_version { $$ = $1; }
+            | /*EMPTY*/ { $$ = NULL; }
+        ;
+
+opt_xml_declaration_option: INCLUDING XMLDECLARATION { $$ = XMLSERIALIZE_INCLUDING_XMLDECLARATION; }
+            | EXCLUDING XMLDECLARATION { $$ = XMLSERIALIZE_EXCLUDING_XMLDECLARATION; }
+            | /*EMPTY*/ { $$ = XMLSERIALIZE_NO_XMLDECLARATION_OPTION; }
+        ;
+
 xml_whitespace_option: PRESERVE WHITESPACE_P { $$ = true; }
             | STRIP_P WHITESPACE_P { $$ = false; }
             | /*EMPTY*/ { $$ = false; }
@@ -18126,6 +18145,7 @@ unreserved_keyword:
             | WRAPPER
             | WRITE
             | XML_P
+            | XMLDECLARATION
             | YEAR_P
             | YES_P
             | ZONE
@@ -18784,6 +18804,7 @@ bare_label_keyword:
             | XML_P
             | XMLATTRIBUTES
             | XMLCONCAT
+            | XMLDECLARATION
             | XMLELEMENT
             | XMLEXISTS
             | XMLFOREST

src/backend/parser/parse_expr.c

Lines changed: 2 additions & 0 deletions
@@ -2502,6 +2502,8 @@ transformXmlSerialize(ParseState *pstate, XmlSerialize *xs)
 
     xexpr->xmloption = xs->xmloption;
     xexpr->indent = xs->indent;
+    xexpr->version = xs->version;
+    xexpr->xmldeclaration = xs->xmldeclaration;
     xexpr->location = xs->location;
     /* We actually only need these to be able to parse back the expression. */
     xexpr->type = targetType;

src/backend/utils/adt/ruleutils.c

Lines changed: 9 additions & 0 deletions
@@ -10232,6 +10232,15 @@ get_rule_expr(Node *node, deparse_context *context,
                         appendStringInfo(buf, " AS %s",
                                          format_type_with_typemod(xexpr->type,
                                                                   xexpr->typmod));
+
+                        if (xexpr->version)
+                            appendStringInfo(buf, " VERSION '%s'", xexpr->version);
+
+                        if (xexpr->xmldeclaration == XMLSERIALIZE_INCLUDING_XMLDECLARATION)
+                            appendStringInfoString(buf, " INCLUDING XMLDECLARATION");
+                        else if (xexpr->xmldeclaration == XMLSERIALIZE_EXCLUDING_XMLDECLARATION)
+                            appendStringInfoString(buf, " EXCLUDING XMLDECLARATION");
+
                         if (xexpr->indent)
                             appendStringInfoString(buf, " INDENT");
                         else
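
The deparse order in the ruleutils.c hunk (VERSION, then the declaration option, then the indent option) mirrors the order the grammar accepts. A standalone sketch of that ordering follows, using a fixed character buffer in place of PostgreSQL's `StringInfo`. `DeclOption`, `append_str`, and `deparse_serialize_options` are hypothetical names. The `" NO INDENT"` branch is an assumption, since the hunk is cut off after `else`. Like the hunk, the sketch copies the version string verbatim between single quotes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the three-state declaration option. */
typedef enum
{
    DECL_NO_OPTION,
    DECL_INCLUDING,
    DECL_EXCLUDING
} DeclOption;

/* Append s to the NUL-terminated string in buf, truncating if it would overflow. */
static void
append_str(char *buf, size_t buflen, const char *s)
{
    size_t      used = strlen(buf);

    if (used + 1 < buflen)
        snprintf(buf + used, buflen - used, "%s", s);
}

/*
 * Write the trailing XMLSERIALIZE options into buf in grammar order:
 * VERSION, then the declaration option, then INDENT / NO INDENT.
 */
void
deparse_serialize_options(char *buf, size_t buflen,
                          const char *version, DeclOption decl, bool indent)
{
    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (version != NULL)
    {
        append_str(buf, buflen, " VERSION '");
        append_str(buf, buflen, version);
        append_str(buf, buflen, "'");
    }

    if (decl == DECL_INCLUDING)
        append_str(buf, buflen, " INCLUDING XMLDECLARATION");
    else if (decl == DECL_EXCLUDING)
        append_str(buf, buflen, " EXCLUDING XMLDECLARATION");

    /* Assumed else-branch text; the hunk above ends before it is shown. */
    append_str(buf, buflen, indent ? " INDENT" : " NO INDENT");
}
```

Emitting the options in the same order the parser expects is what lets a deparsed view definition be parsed back without change.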
