The XML data type
You can use the XML
data type to store XML data. For this data type, PostgreSQL validates the input value structure and provides functionality to perform type-safe operations.
The XML
type can store well-formed XML documents and content fragments, whose root node is not a document node. A content fragment also can have more than one top-level element.
NOTE
The PostgreSQL XML data type and processing functions have compatibility limitations. For more information, refer to XML Limits and Conformance to SQL/XML.
|
Create XML values
You can use the options described below to generate values of the XML
data type. To test these options, create a new table with an XML
column:
CREATE TABLE xmldocs (id SERIAL PRIMARY KEY, xmldata XML);
The xmlparse function
The xmlparse
function produces XML
from character data. It has the following syntax:
xmlparse ({ DOCUMENT | CONTENT } <value>)
The first argument determines the type of value
. It can be an XML document (DOCUMENT
) or a content fragment (CONTENT
).
Example:
INSERT INTO xmldocs (xmldata)
VALUES
(xmlparse (DOCUMENT '<?xml version="1.0"?><book><title>Hyperion</title><author>Dan Simmons</author></book>')),
(xmlparse (CONTENT '<title>1984</title><author>George Orwell</author>'))
RETURNING *;
The result:
id | xmldata ----+------------------------------------------------------------------ 1 | <book><title>Hyperion</title><author>Dan Simmons</author></book> 2 | <title>1984</title><author>George Orwell</author>
PostgreSQL-specific conversion
The following syntax allows you to convert character strings to XML
:
xml <value>
<value>::xml
Note, the XML
type conversion does not validate input values against a document type declaration (DTD) and other XML schema languages.
Example:
INSERT INTO xmldocs (xmldata)
VALUES
(xml '<?xml version="1.0"?><book><title>Hyperion</title><author>Dan Simmons</author></book>'),
('<title>1984</title><author>George Orwell</author>'::xml)
RETURNING *;
The result is the same:
id | xmldata ----+------------------------------------------------------------------ 1 | <book><title>Hyperion</title><author>Dan Simmons</author></book> 2 | <title>1984</title><author>George Orwell</author>
Convert XML to a character string
To create a character string value from XML
, use the xmlserialize
function:
xmlserialize ({ DOCUMENT | CONTENT } <value> AS <type>)
type
can be the character
, character varying
, or text
data type.
Example:
SELECT xmlserialize (CONTENT xmldata AS text )
FROM xmldocs;
The result:
xmlserialize --------------------------------------------------------------------------------------- <?xml version="1.0"?><book><title>Hyperion</title><author>Dan Simmons</author></book> <title>1984</title><author>George Orwell</author>
According to the SQL standard, the xmlserialize
function is the only way to convert XML
to character types, but PostgreSQL also allows you to cast the value:
SELECT CAST(xmldata AS text)
FROM xmldocs;
When a character string value is cast to or from the XML
type without using the xmlparse
and xmlserialize
functions, the xmloption session configuration parameter determines the type of XML
value. It can be DOCUMENT
or CONTENT
. The default value is CONTENT
. Use the SET
command to change the xmloption
parameter:
SET xmloption TO DOCUMENT;
/* or */
SET XML OPTION DOCUMENT;
Create XML content elements
This section describes functions and expressions that create XML content from SQL data. You can use them to format query results into XML documents.
xmlcomment
The xmlcomment
function creates an XML comment with the specified text.
xmlcomment ( <text> ) → xml
The text cannot contain --
or end with -
, otherwise the result is not a valid XML comment. If the argument is null
, the result is null
.
Example:
SELECT xmlcomment ('This is a comment');
The result:
xmlcomment -------------------------- <!--This is a comment-->
xmlconcat
The xmlconcat
function concatenates a list of individual XML
values to create a single value containing an XML content fragment.
xmlconcat ( <xml> [, ...] ) → xml
Null
values are omitted. The result is null
, if all individual XML
values are null
.
Example:
SELECT xmlconcat('<title>Hyperion</title>','<author>Dan Simmons</author>');
The result:
xmlconcat ----------------------------------------------------- <title>Hyperion</title><author>Dan Simmons</author>
xmlelement
The xmlelement
expression creates an XML element with the given name, attributes, and content.
xmlelement ( NAME <name>
[, XMLATTRIBUTES ( <attvalue> [ AS <attname> ] [, ...] ) ] [, <content> [, ...]] ) → xml
Where:
-
name
andattname
are identifiers; -
attvalue
andcontent
are expressions that can return any PostgreSQL data type; -
xmlattributes
are attributes of the XML element; -
the
content
values are concatenated to form the element content.
Example:
SELECT xmlelement (name book,
xmlattributes('novel' as genre, current_date as record_date), 'Leo Tolstoy,', ' War and Peace');
The result:
xmlelement -------------------------------------------------------------------------------- <book genre="novel" record_date="2022-09-06">Leo Tolstoy, War and Peace</book>
If an attribute value is a column reference, you do not need to specify the attribute name, xmlelement
uses the column name as the attribute name.
Example:
SELECT xmlelement (name book, xmlattributes(b.genre, b.public_year as year), a.name, ', ',b.title)
FROM books as b, authors as a
WHERE b.author_id = a.id;
The result:
xmlelement ---------------------------------------------------------------------------------------------- <book genre="novel" year="1960">Harper Lee, To Kill a Mockingbird</book> <book genre="novel" year="1925">F. Scott Fitzgerald, The Great Gatsby</book> <book genre="fantasy" year="1955">J.R.R. Tolkien, The Lord of the Rings</book> <book genre="sci-fi" year="1949">George Orwell, 1984</book> <book genre="fantasy" year="1937">J.R.R. Tolkien, The Hobbit, or There and Back Again</book> <book genre="novel" year="1869">Leo Tolstoy, War and Peace</book> <book genre="sci-fi" year="1989">Dan Simmons, Hyperion</book> <book genre="sci-fi" year="1895">Herbert Wells, The Time Machine</book>
xmlforest
The xmlforest
expression creates an XML sequence of elements with the given names and content.
xmlforest (<content> [ AS <name> ] [, ...] ) → xml
Where name
is an identifier, content
can have any data type.
Example:
SELECT xmlforest(b.title, b.public_year as year, a.name as author)
FROM books as b, author as a
WHERE b.author_id = a.id;
The result:
<title>To the Lighthouse</title><year>1927</year><author>Virginia Woolf</author> <title>Mrs. Dalloway</title><year>1925</year><author>Virginia Woolf</author> <title>To Kill a Mockingbird</title><year>1960</year><author>Harper Lee</author> <title>The Great Gatsby</title><year>1925</year><author>F. Scott Fitzgerald</author> <title>The Hobbit, or There and Back Again</title><year>1937</year><author>J.R.R. Tolkien</author> <title>The Lord of the Rings</title><year>1955</year><author>J.R.R. Tolkien</author> <title>1984</title><year>1949</year><author>George Orwell</author> <title>War and Peace</title><year>1869</year><author>Leo Tolstoy</author> <title>Hyperion</title><year>1989</year><author>Dan Simmons</author> <title>The Time Machine</title><year>1895</year><author>Herbert Wells</author>
Note that an XML forest is not a valid XML document if it consists of more than one element. You can wrap an xmlforest
expression in xmlelement
to make it valid.
xmlpi
The xmlpi
expression creates an XML processing instruction.
xmlpi ( NAME <name> [, <content> ] ) → xml
Where name
is an identifier, content
can have any data type. Also, content
cannot contain the ?>
characters.
Example:
SELECT xmlpi(name php, 'echo "hello world";');
The result:
xmlpi ----------------------------- <?php echo "hello world";?>
xmlroot
The xmlroot
expression changes the properties of the XML root node.
xmlroot ( xml, version {text|NO VALUE} [, standalone {YES|NO|NO VALUE} ] ) → xml
If version
and standalone
settings are specified, they replace values in the root node’s version declaration.
Example:
SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>example</content>'),
version '1.0', standalone yes);
The result:
xmlroot -------------------------------------------------------------- <?xml version="1.0" standalone="yes"?><content>example</content>
xmlagg
The xmlagg
function is an aggregate function. It concatenates the input values across rows.
Call xmlagg
for the xmldocs
table created above:
SELECT xmlagg(xmldata) FROM xmldocs;
The result:
<book><title>Hyperion</title><author>Dan Simmons</author></book><book><title>1984</title><author>George Orwell</author></book>;
Check XML value properties
XML predicates in this section allow you to determine whether an XML
value is a document or a content fragment, if it contains the specified XML element and has the well-formed structure.
IS DOCUMENT / IS NOT DOCUMENT
The IS DOCUMENT/IS NOT DOCUMENT
expressions determine if an XML
value is an XML document or content fragment.
<xml> IS DOCUMENT → boolean
<xml> IS NOT DOCUMENT → boolean
If xml
is a document, IS DOCUMENT
returns true
and IS NOT DOCUMENT
returns false
. If xml
is not a document, IS DOCUMENT
returns false
and IS NOT DOCUMENT
returns true
.
Examples:
SELECT xmldata IS DOCUMENT FROM xmldocs;
The result:
?column? ---------- t f
SELECT xmldata IS NOT DOCUMENT FROM xmldocs;
The result:
?column? ---------- f t
xmlexists
The xmlexists
function evaluates an XPath 1.0 expression (the text
argument), with the specified xml
value. The function returns false
if the evaluation result is an empty node-set. If the evaluation returns any other value, the xmlexists
result is true
.
xmlexists ( <text> PASSING [BY {REF|VALUE}] <xml> [BY {REF|VALUE}] ) → boolean
xml
must be an XML document, not a content fragment or non-XML value. The function returns null
if any argument is null
.
Update the xmldocs
table to make XML data valid:
UPDATE xmldocs
SET xmldata = xmlparse (DOCUMENT '<?xml version="1.0"?><book><title>1984</title><author>George Orwell</author></book>')
WHERE id = 2;
Call the xmlexists
function to determine whether title
with the Hyperion
value exists:
SELECT xmlexists('//title[text() = ''Hyperion'']' PASSING BY VALUE xmldata)
FROM xmldocs;
The result:
xmlexists ----------- t f
The SQL standard supports two mechanisms to pass an XML argument from SQL: BY REF
and BY VALUE
. According to SQL standard, PostgreSQL accepts the BY REF
and BY VALUE
clauses, but ignores them. For additional information, see Incidental Limits of the Implementation.
In the SQL standard, the xmlexists
function also evaluates an expression in the XML Query language. PostgreSQL allows only an XPath 1.0 expression as described in Queries Are Restricted to XPath 1.0.
xml_is_well_formed
These functions check whether a text string is a well-formed XML and returns a Boolean result.
xml_is_well_formed ( <text> ) → boolean
xml_is_well_formed_document ( <text> ) → boolean
xml_is_well_formed_content ( <text> ) → boolean
xml_is_well_formed_document
verifies whether a text string is a well-formed document, while xml_is_well_formed_content
checks for well-formed content.
xml_is_well_formed
verifies a text string according to the xmloption value.
Example:
SELECT xml_is_well_formed ('<book><title>Hyperion</title><author>Dan Simmons</author></book>');
The result is true
.
You can use xml_is_well_formed
to check if the cast to the XML
type will be successful. The xml_is_well_formed_document
and xml_is_well_formed_content
functions help you to determine if the xmlparse function can convert a text string to XML
.
Process XML data
To process XML
values, PostgreSQL offers xpath
and xpath_exists
functions that evaluate XPath 1.0 expressions. xpath
extracts XML
values and xpath_exists
determines if the specified XML
values exist.
PostgreSQL also provides the xmltable
expression that creates a table based on an XML
value.
Extract XML values
The xpath
function returns an array of XML
values corresponding to the XPath expression. If the XPath expression returns a scalar value, xpath
returns a single-element array.
xpath ( <xpath_expression> text, <xml> xml [, <nsarray> text[] ] ) → xml[]
Where:
-
xpath_expression
is a XPath 1.0 expression given as text. -
xml
is a well-formedXML
document, it must have a single root node element. -
nsarray
is an array of namespace mappings (optional). It should be an array of arrays that consists of two elements. The first element of each nested array is the namespace name (alias), the second one is the namespace URI.
Examples:
SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['my', 'http://example.com']]);
/* a default (anonymous) namespace */
SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
ARRAY[ARRAY['mydefns', 'http://example.com']]);
The result of both statements is the same:
xpath -------- {test}
The xpath_exists
function returns a Boolean value that determines whether any value other than an empty node-set is returned.
xpath_exists ( <xpath_expression> text, <xml> xml [, <nsarray> text[] ] ) → boolean
This function is equivalent to the xmlexists predicate with additional support for a namespace mapping argument.
SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['my', 'http://example.com']]);
The result is true
.
Create a table based on an XML value
The xmltable
expression creates a table based on an XML
value. You can use it only in the FROM
clause.
The xmltable
expression has the following syntax:
xmltable (
[ XMLNAMESPACES ( <namespace_uri> AS <namespace_name> [, ...] ), ]
<row_expression> PASSING [BY {REF|VALUE}] <document_expression> [BY {REF|VALUE}]
COLUMNS <name> { <type> [PATH <column_expression>]
[DEFAULT <default_expression>] [NOT NULL | NULL] | FOR ORDINALITY }
[, ...]
) → setof record
Where:
-
XMLNAMESPACES
(optional) — is a comma-separated list of namespace definitions, where eachnamespace_uri
is atext
expression and eachnamespace_name
is a identifier.XMLNAMESPACES
specifies the XML namespaces and their aliases used in the document. A default namespace specification is not supported. -
row_expression
is an XPath 1.0 expression given as text.xmltable
usesdocument_expression
asrow_expression
context to obtain a set of XML nodes and transform them into output rows. If thedocument_expression
isnull
orrow_expression
returns an empty node-set or any value other than a node-set,xmltable
returns no rows. -
document_expression
is a context forrow_expression
. It must be a well-formed XML document. -
COLUMNS
specifies the columns created in the output table.name
andtype
are required for each column. ThePATH
,DEFAULT
, andNOT NULL | NULL
clauses are optional. A column with theFOR ORDINALITY
attribute is populated with row numbers, starting with 1. At most one column can have theFOR ORDINALITY
attribute. -
column_expression
is an XPath 1.0 expression that is evaluated for each row, with the current node from therow_expression
result. If nocolumn_expression
is set, the column name is used as an implicit path. -
default_expression
is used ifcolumn_expression
returns an empty node-set for a current row. The column value is set tonull
, unlessdefault_expression
is specified.
Run the following query that converts the XML data into the booklist
table:
CREATE TABLE booklist AS SELECT xml
$$<books>
<book id="01">
<title>Hyperion</title>
<author>Dan Simmons</author>
<price>60</price>
</book>
<book id="02">
<title>1984</title>
<author>George Orwell</author>
<price>45</price>
</book>
</books>$$ AS books;
The query below returns a table from the XML
value listed above:
SELECT xmltable.* FROM booklist,
XMLTABLE ('/books/book' PASSING books
COLUMNS
id CHAR(2) PATH '@id' NOT NULL,
title TEXT PATH 'title' NOT NULL,
author TEXT PATH 'author' NOT NULL,
price FLOAT PATH 'price' NOT NULL);
The result:
id | title | author | price ----+----------+---------------+------- 01 | Hyperion | Dan Simmons | 60 02 | 1984 | George Orwell | 45
You can use aggregate functions with the xmltable
expression:
SELECT count(id) as total_books, avg(price) as avg_price FROM booklist1,
XMLTABLE ('/books/book' PASSING books
COLUMNS
id CHAR(2) PATH '@id' NOT NULL,
title TEXT PATH 'title' NOT NULL,
author TEXT PATH 'author' NOT NULL,
price FLOAT PATH 'price' NOT NULL);
The result:
total_books | avg_price -------------+----------- 2 | 52.5
Export tables to XML
The following functions create XML
values based on relational table contents:
table_to_xml ( <table> regclass, <nulls> boolean,
<tableforest> boolean, <targetns> text ) → xml
query_to_xml ( <query> text, <nulls> boolean,
<tableforest> boolean, <targetns> text ) → xml
cursor_to_xml ( <cursor> refcursor, <count> integer, <nulls> boolean,
<tableforest> boolean, <targetns> text ) → xml
The table_to_xml
function creates an XML
value from the content of the table, passed as the table
parameter. The regclass
type accepts strings that identify tables.
The query_to_xml
function executes the SQL query passed as the query
parameter and creates an XML
value from the result set.
The cursor_to_xml
function fetches the specified number of rows (the count
parameter) from the cursor passed as the cursor
parameter and creates an XML
value from them. Since results of these functions are built up in memory, use cursor_to_xml
if you need to export a large table.
All these functions have the following common parameters:
-
The
nulls
parameter determines whethernull
values should be included in the output. -
The
targetns
parameter specifies the XML namespace. If you do not need to define a namespace, pass an empty string. -
If the
tableforest
parameter isfalse
, the resulting XML document has additional<row>
nodes:<tablename> <row> <columnname1>data</columnname1> <columnname2>data</columnname2> </row> ... </tablename>
If
tableforest
istrue
, the result contains multiple<tablename>
nodes instead of<row>
:<tablename> <columnname1>data</columnname1> <columnname2>data</columnname2> </tablename> <tablename> ... </tablename>
For example, we have the authors
table:
id | name | country ----+---------------------+--------------- 1 | Harper Lee | USA 2 | F. Scott Fitzgerald | USA 3 | J.R.R. Tolkien | Great Britain
Execute the following statement to convert the authors
table to XML
:
SELECT table_to_xml ('authors', false, true, '' );
The result:
<authors xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">+ <id>1</id> + <name>Harper Lee</name> + <country>USA</country> + </authors> + + <authors xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">+ <id>2</id> + <name>F. Scott Fitzgerald</name> + <country>USA</country> + </authors> + + <authors xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">+ <id>3</id> + <name>J.R.R. Tolkien</name> + <country>Great Britain</country> + </authors> +