Use PL/Perl
Overview
PL/Perl is a loadable procedural language that allows you to write stored procedures and functions using the Perl programming language in ADPG/PostgreSQL. The main advantage of PL/Perl is that it allows you to use many of the string manipulation functions and operators available in Perl. Parsing complex strings in Perl can be much easier than using string functions and control structures in PL/pgSQL.
Usually, PL/Perl is installed as a trusted programming language named plperl
. In this language, certain Perl operations are disabled to preserve security. In general, all operations that interact with the environment are prohibited. This includes operations with files, the require
and use
Perl functions used for external modules. Therefore, PL/Perl functions cannot interact with the internal mechanisms of a database server or access the operating system with the permissions of the server process. As a result, any unprivileged database user can be allowed to use this language.
CAUTION
Trusted PL/Perl relies on the Opcode Perl module to preserve security. However, the Perl documentation states that this module does not work effectively for trusted PL/Perl. If you have stringent security requirements, consider executing the REVOKE command: REVOKE USAGE ON LANGUAGE plperl FROM PUBLIC . It revokes the privilege to use plperl from PUBLIC . The PUBLIC keyword refers to the implicitly defined group of all roles.
|
To install PL/Perl in a database as a trusted language, use the CREATE EXTENSION command as follows:
CREATE EXTENSION plperl;
If you need to write Perl functions that are not restricted, install Perl as an untrusted PL/PerlU language. In this case, all the features of the Perl language will be available. To select an untrusted PL/Perl variant, specify the name plperlu
when creating the extension:
CREATE EXTENSION plperlu;
Using PL/PerlU, you must ensure that functions cannot be utilized to do anything unexpected, since they can perform any operation allowed to a user with database administrator rights. Note that ADPG/PostgreSQL grants permission to create functions in untrusted languages only to database superusers.
PL/Perl functions run in a separate Perl interpreter for each SQL role. All PL/PerlU functions executed in the current session run in a single Perl interpreter. This allows PL/PerlU functions to share data, but the communication between PL/Perl and PL/PerlU functions is not possible.
NOTE
If the plperl or plperlu extension is created in the template1 database used as the default template, all subsequently created databases will have this extension installed.
|
ADPG/PostgreSQL provides some configuration parameters that affect PL/Perl. They are listed in the table below. These parameters can be set in the postgresql.conf field on the Primary configuration tab of the ADPG service. See Configuration parameters.
Name | Type | Description |
---|---|---|
plperl.on_init |
String |
Specifies the Perl code to be executed when a Perl interpreter is first initialized, before it is specialized for use by This parameter can be set only in the postgresql.conf field |
plperl.on_plperl_init plperl.on_plperlu_init |
String |
The parameters specify the Perl code to be executed when a Perl interpreter is specialized for the Only superusers can set the |
plperl.use_strict |
Bool |
When set to |
For more information, see PL/Perl Under the Hood. Configuration.
PL/Perl functions, procedures, and arguments
To create a function in the PL/Perl language, use the CREATE FUNCTION SQL command:
CREATE FUNCTION funcname (argument-types)
RETURNS return-type
-- function attributes
AS $$
# PL/Perl function body
$$ LANGUAGE plperl;
The function body is ordinary Perl code. The PL/Perl code wraps the function code inside a Perl subroutine. A PL/Perl function is called in a scalar context, so it cannot return a list. You can return non-scalar values (arrays, records, and sets) by returning a reference, as described below.
In a PL/Perl procedure, any return value from the Perl code is ignored.
PL/Perl also supports anonymous code blocks called with the DO statement:
DO $$
# PL/Perl code
$$ LANGUAGE plperl;
An anonymous code operates like a function but takes no arguments, and any returned value is discarded.
It is not recommended to use named nested subroutines in Perl, especially if they refer to lexical variables in the enclosing scope. Since a PL/Perl function is wrapped in a subroutine, any named subroutine you place inside it will be nested. In general, it is much safer to create anonymous routines and call them by reference to the code. For more information, see the entries for Variable "%s" will not stay shared
and Variable "%s" is not available
in perldiag.
The syntax of the CREATE FUNCTION
command requires the function body to be a string constant. Usually, a dollar-quoted string constant is used. If you want to utilize the escape string syntax E''
, double single quote marks ('
) and backslashes (\
) in the function body.
Arguments and results are handled as in any other Perl subroutine: arguments are passed in @_
, and a result value is returned with return
or as the last expression evaluated in the function.
For example, a function returning the greater of two integer values can be defined as:
CREATE FUNCTION return_max (integer, integer) RETURNS integer AS $$
if ($_[0] > $_[1]) { return $_[0]; }
return $_[1];
$$ LANGUAGE plperl;
Arguments will be converted from the database encoding to UTF-8 for use inside PL/Perl, and then converted from UTF-8 back to the database encoding upon return.
If an SQL NULL
value is passed to a function, the argument value is processed as undefined
in Perl. However, the above function definition handles NULL
input values as zeroes, which can lead to unpredictable results. To avoid this situation, add STRICT to the function definition. In this case, if a NULL
value is passed, the function is not called and returns a NULL
result automatically. Alternatively, you can check for undefined inputs in the function body. To illustrate this approach, update the return_max
function to return a non-null argument if one of the arguments is NULL
:
CREATE FUNCTION return_max (integer, integer) RETURNS integer AS $$
my ($a, $b) = @_;
if (not defined $a) {
return undef if not defined $b;
return $b;
}
return $a if not defined $b;
return $a if $a > $b;
return $a;
$$ LANGUAGE plperl;
As shown in the example, to return an SQL NULL
value from a PL/Perl function, pass an undefined value. This can be done whether the function is strict or not.
Any function argument that is not a reference is a string. This string is in the standard PostgreSQL external text representation for the corresponding data type. In case of ordinary numeric or text types, Perl processes them as expected, and you do not need to perform additional operations. However, with more complex types, you need to convert the argument to a suitable form for Perl. For example, to transform bytea
to a binary value, use the decode_bytea
function.
Values passed back to PostgreSQL must be in the external text format. For example, to prepare binary data to be returned as a bytea
value, use the encode_bytea
function. See Utility functions.
The transformation of logical types deserves special attention.
Boolean value transformation
By default, boolean values are passed to Perl as text, that is, as t
or f
. This is where the problem occurs, because Perl does not treat f
as false
. To improve the situation, use a transform provided by the bool_plperl
extension (bool_plperlu
for PL/PerlU). See CREATE TRANSFORM.
You need to install the bool_plperl
or bool_plperlu
extension first:
CREATE EXTENSION bool_plperl;
-- or
CREATE EXTENSION bool_plperlu;
Then use the TRANSFORM
attribute for the PL/Perl function that takes or returns bool
, for example:
CREATE FUNCTION perl_or(bool, bool) RETURNS bool
TRANSFORM FOR TYPE bool
AS $$
my ($a, $b) = @_;
return $a || $b;
$$ LANGUAGE plperl;
When this transform is applied, Perl receives the bool
arguments as 1
or an empty value, which appears as true
or false
to Perl. If the function’s result is of type bool
, it will return true
or false
, depending on whether Perl evaluates the returned value as true
. Similar transformations are also performed for the arguments and results of SPI queries executed within the function.
Arrays
Perl can return PostgreSQL arrays as references to Perl arrays. For example:
CREATE OR REPLACE function return_array()
RETURNS text[][] AS $$
return [['id " 1','text 1'],['id 2','text 2']];
$$ LANGUAGE plperl;
SELECT return_array();
Result:
return_array ------------------------------------------ {{"id \" 1","text 1"},{"id 2","text 2"}}
Perl passes an array to PostgreSQL as an object mapped to PostgreSQL::InServer::ARRAY
. This object may be treated as an array reference or a string for backward compatibility with Perl code written for PostgreSQL versions below 9.1. The result will be different. In the functions below, shift
returns the first element of the function’s argument array.
The function with an array reference:
CREATE OR REPLACE FUNCTION concat_elem_reference(text[]) RETURNS TEXT AS $$
my $arg = shift;
my $result = "";
return undef if (!defined $arg);
for (@$arg) {
$result .= $_;
}
return $result;
$$ LANGUAGE plperl;
SELECT concat_elem_reference(ARRAY['PL','/','Perl']);
Result:
concat_elem_reference ----------------------- PL/Perl
The function uses a string:
CREATE OR REPLACE FUNCTION concat_elem_string(text[]) RETURNS TEXT AS $$
my $arg = shift;
my $result = "";
return undef if (!defined $arg);
$result .= $arg;
return $result;
$$ LANGUAGE plperl;
SELECT concat_elem_string(ARRAY['PL','/','Perl']);
Result:
concat_elem_string -------------------- {PL,/,Perl}
Multidimensional arrays are represented as references to lower-dimensional arrays of references — this is a common way for Perl.
Composite types
Composite-type arguments are passed to a function as references to hashes. The keys of the hash are the attribute names of the composite type.
Create tables for demonstration purposes:
CREATE TABLE book (
book_id SERIAL PRIMARY KEY,
title VARCHAR(50) NOT NULL,
author_id INT NOT NULL,
in_stock INTEGER DEFAULT 0);
INSERT INTO book(title, author_id, in_stock) VALUES
('Mrs. Dalloway',1,25),
('To the Lighthouse',1,13),
('To Kill a Mockingbird',2,0),
('The Great Gatsby',3,3),
('The Lord of the Rings',4,45);
CREATE TABLE author (
id SERIAL PRIMARY KEY,
author_name VARCHAR(100) NOT NULL
);
INSERT INTO author(author_name) VALUES
('Virginia Woolf'),
('Harper Lee'),
('F. Scott Fitzgerald'),
('J.R.R. Tolkien'),
('George Orwell'),
('Dan Simmons');
The example below uses the book
composite type as an argument:
CREATE FUNCTION arrival(book) RETURNS integer AS $$
my ($bk) = @_;
return $bk->{in_stock} + 5;
$$ LANGUAGE plperl;
SELECT title, arrival(book.*) FROM book;
Result:
title | arrival -----------------------+--------- Mrs. Dalloway | 30 To the Lighthouse | 18 To Kill a Mockingbird | 5 The Great Gatsby | 8 The Lord of the Rings | 50
A PL/Perl function can return a composite-type result using the same approach — return a reference to a hash that has required attributes. For example:
CREATE TYPE test_row AS (book_id integer, title text, author_name text);
CREATE OR REPLACE FUNCTION create_row() RETURNS test_row AS $$
return {book_id => 1, title => 'Animal Farm', author_name => 'George Orwell'};
$$ LANGUAGE plperl;
SELECT create_row();
Result:
book_id | title | author_name ---------+-------------+--------------- 1 | Animal Farm | George Orwell
Any columns in the declared result data type that are not present in the hash will be returned as NULL
values.
Similarly, output arguments of procedures can be returned as a hash reference:
CREATE PROCEDURE calculate_cube(INOUT x integer, INOUT y integer) AS $$
my ($x, $y) = @_;
return {x => $x * $x * $x, y => $y * $y * $y};
$$ LANGUAGE plperl;
CALL calculate_cube(2, 8);
Result:
x | y ---+----- 8 | 512
PL/Perl functions can return sets of scalar or composite types. It is better to return rows one at a time to reduce preparation time and avoid storing the entire result set in memory. For this approach, you can use return_next
as illustrated below. Note that after the last return_next
call, you should invoke return
or return undef
. The last one is preferable, since it returns a single value.
The function below returns a set of scalar types:
CREATE OR REPLACE FUNCTION return_int_set(int)
RETURNS SETOF INTEGER AS $$
foreach (0..$_[0]) {
return_next($_);
}
return undef;
$$ LANGUAGE plperl;
SELECT return_int_set(4);
Result:
return_set_int ---------------- 0 1 2 3 4
The following function returns a set of composite types:
CREATE OR REPLACE FUNCTION return_composite_set()
RETURNS SETOF test_row AS $$
return_next({ book_id => 1, title => 'Mrs. Dalloway', author_name => 'Virginia Woolf' });
return_next({ book_id => 2, title => 'To Kill a Mockingbird', author_name => 'Harper Lee' });
return_next({ book_id => 3, title => 'The Great Gatsby', author_name => 'F. Scott Fitzgerald' });
return undef;
$$ LANGUAGE plperl;
SELECT return_composite_set();
Result:
return_composite_set ---------------------------------------------- (1,"Mrs. Dalloway","Virginia Woolf") (2,"To Kill a Mockingbird","Harper Lee") (3,"The Great Gatsby","F. Scott Fitzgerald")
For a small result set, you can return a reference to an array that contains scalar values, references to arrays, or references to hashes for simple types, array types, and composite types, respectively.
The function below returns a set of integer values:
CREATE OR REPLACE FUNCTION return_set_of_int(int)
RETURNS SETOF INTEGER AS $$
return [0..$_[0]];
$$ LANGUAGE plperl;
SELECT return_set_of_int(3);
Result:
return_set_of_int ------------------- 0 1 2 3
The function below returns a set of composite test_row
values:
CREATE OR REPLACE FUNCTION return_set_of_composite()
RETURNS SETOF test_row AS $$
return [
{book_id => 1, title => 'Mrs. Dalloway', author_name => 'Virginia Woolf'},
{book_id => 2, title => 'To Kill a Mockingbird', author_name => 'Harper Lee'},
{book_id => 3, title => 'The Great Gatsby', author_name => 'F. Scott Fitzgerald'}
];
$$ LANGUAGE plperl;
SELECT return_set_of_composite();
Result:
return_set_of_composite ---------------------------------------------- (1,"Mrs. Dalloway","Virginia Woolf") (2,"To Kill a Mockingbird","Harper Lee") (3,"The Great Gatsby","F. Scott Fitzgerald")
Use STRICT with functions and procedures
If you wish to use the STRICT
clause, there are several options:
-
Add the following code at the top of the function body:
use strict;
-
For temporary global use with all functions and procedures, execute:
$ SET plperl.use_strict to true
See the Configuration parameters that affect PL/Perl table. This will set the
STRICT
mode for all subsequent compilations of PL/Perl functions, but not functions already compiled in the current session. -
For permanent global use with all functions and procedures, set
plperl.use_strict
totrue
in the postgresql.conf field that is located on the Clusters → ADPG cluster → Services → ADPG → Primary configuration tab in ADCM UI.
Database access from the PL/Perl code
This section describes functions that can be used to access a database form the PL/Perl code.
Return an entire row set
The spi_exec_query
function executes an SQL command and returns the entire row set as a reference to an array of hash references. You should only use this command when you know that the result set will be relatively small.
spi_exec_query(query [, <limit>])
If <limit>
is specified and is greater than zero, spi_exec_query
retrieves at most limit rows, as if the query included the LIMIT
clause. Without specifying <limit>
or when it is 0
, the row limit is omitted.
The following code returns up to 3 rows from the book
table.
$result = spi_exec_query('SELECT * FROM book', 3);
If book
has the title
column, you can get the value from the row with the i
number:
$book_title = $result->{rows}[$i]->{title};
The total number of rows returned from a SELECT
query can be accessed as follows:
$r_number = $rv->{processed}
The following example uses a different SQL command type with spi_exec_query
:
$query = "INSERT INTO book VALUES (6, 'Hyperion', 7, 10)";
$result = spi_exec_query($query);
You can access the command status as follows:
$st = $result->{status};
The entire example looks like this:
CREATE OR REPLACE FUNCTION change_letter_case() RETURNS SETOF test_row AS $$
my $result = spi_exec_query('SELECT book.book_id, book.title, author.author_name FROM book INNER JOIN author ON author.id = book.author_id');
my $status = $result->{status};
my $r_number = $result->{processed};
foreach my $rn (0 .. $r_number - 1) {
my $row = $result->{rows}[$rn];
$row->{book_id} += 100 if defined($row->{book_id});
$row->{title} =~ tr/A-Za-z/a-zA-Z/ if (defined($row->{title}));
return_next($row);
}
return undef;
$$ LANGUAGE plperl;
SELECT * FROM change_letter_case();
Result:
book_id | title | author_name ---------+-----------------------+--------------------- 101 | mRS. dALLOWAY | Virginia Woolf 102 | tO THE lIGHTHOUSE | Virginia Woolf 103 | tO kILL A mOCKINGBIRD | Harper Lee 104 | tHE gREAT gATSBY | F. Scott Fitzgerald 105 | tHE lORD OF THE rINGS | J.R.R. Tolkien
Return a large row set
Use the spi_query
and spi_fetchrow
functions together for row sets that might be large, or for cases where you wish to return rows as they are retrieved. Note that spi_fetchrow
works only with spi_query
.
The example below accesses a file from a disk in PL/PerlU code. You can use /etc/passwd or create any other text file for this example.
Also, this example requires the Digest::MD5 module. Use the following command to install it:
The example generates the number of rows specified by the input function parameter. The type of the rows is type1
defined in the example. The text
field is created as a digest in hexadecimal form.
CREATE TYPE type1 AS (number INTEGER, text TEXT);
CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF type1 AS $$
use Digest::MD5 qw(md5_hex);
my $file = '/etc/passwd';
my $t = localtime;
elog(NOTICE, "opening file $file at $t" );
open my $fh, '<', $file # a file access
or elog(ERROR, "cannot open $file for reading: $!");
my @words = <$fh>;
close $fh;
$t = localtime;
elog(NOTICE, "closed file $file at $t");
chomp(@words);
my $row;
my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)");
while (defined ($row = spi_fetchrow($sth))) {
return_next({
number => $row->{a},
text => md5_hex($words[rand @words])
});
}
return;
$$ LANGUAGE plperlu;
SELECT * FROM lotsa_md5(50);
Result:
NOTICE: opening file /etc/passwd at Tue Feb 25 08:42:00 2025 NOTICE: closed file /etc/passwd at Tue Feb 25 08:42:00 2025 number | text --------+---------------------------------- 1 | d429087cc8a5927d6eac8ec266da4426 2 | 8bce1016b9caf72f9e981d4c7fb828cd 3 | 963a084efcda655c5312ff2264f931e7 ...
Normally, calls of spi_fetchrow
should be repeated until it returns undef
indicating that there are no more rows to read. The cursor returned by spi_query
is automatically freed when spi_fetchrow
returns undef
. If you do not need to read all the rows, call spi_cursor_close
to free the cursor.
Use prepared query plans
The spi_prepare
, spi_query_prepared
, spi_exec_prepared
, and spi_freeplan
functions operate with prepared query plans.
spi_prepare(<command>, <argument types>)
spi_query_prepared(<plan>, <arguments>)
spi_exec_prepared(<plan> [, <attributes>], <arguments>)
spi_freeplan(<plan>)
The spi_prepare
function prepares a query plan. It accepts a query string with numbered argument placeholders ($1
, $2
, etc.) and a string list of argument types:
$plan = spi_prepare('SELECT * FROM book
WHERE id > $1 AND title = $2','INTEGER', 'TEXT');
The plan is used in the spi_exec_prepared
function instead of a string query. Its result is the same as returned by spi_exec_query.
Also, you can pass the plan as a parameter of the spi_query_prepared
function, which returns a cursor exactly as spi_query
. This cursor can be passed to spi_fetchrow
. The optional second parameter of spi_exec_prepared
is an attribute hash reference. The only attribute currently supported is limit
, which sets the maximum number of rows returned from a query.
The advantage of prepared queries is that you can use one prepared plan for more than one query execution. After the plan is not needed, it can be freed with spi_freeplan
.
CREATE OR REPLACE FUNCTION init() RETURNS VOID AS $$
$_SHARED{my_plan} = spi_prepare('SELECT * FROM book WHERE in_stock < $1', 'INTEGER');
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION add_books( INTEGER ) RETURNS SETOF book AS $$
return spi_exec_prepared(
$_SHARED{my_plan},
$_[0]
)->{rows};
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION done() RETURNS VOID AS $$
spi_freeplan( $_SHARED{my_plan});
undef $_SHARED{my_plan};
$$ LANGUAGE plperl;
SELECT init();
SELECT add_books(15);
SELECT done();
Result:
add_books --------------------------------- (2,"To the Lighthouse",1,13) (3,"To Kill a Mockingbird",2,0) (4,"The Great Gatsby",3,3)
Manage transactions
From PL/Perl code, you can commit or roll back the current transaction using the spi_commit
and spi_rollback
functions, respectively. These functions can only be invoked in a procedure or anonymous code block (the DO
command) called from the top level. Note that it is not possible to run the COMMIT
or ROLLBACK
SQL commands via spi_exec_query
or similar Perl functions. After a transaction is ended, a new transaction is automatically started, so there is no separate function for that.
Example:
CREATE TABLE test (a int);
CREATE PROCEDURE transaction_test1()
LANGUAGE plperl
AS $$
foreach my $i (0..9) {
spi_exec_query("INSERT INTO test (a) VALUES ($i)");
if ($i % 2 == 0) {
spi_commit();
} else {
spi_rollback();
}
}
$$;
CALL transaction_test1();
SELECT a FROM test;
Result:
a --- 0 2 4 6 8
There are no entries for odd values because such transactions were rolled back.
Utility functions
The utility functions available in PL/Perl are listed in the table below.
Name | Description |
---|---|
elog(<level>, <msg>) |
Emits a log or error message. Possible levels are |
quote_literal(<string>) |
Returns the given string suitably quoted to be used as a string literal in an SQL statement string. Single quotes ( |
quote_nullable(<string>) |
Returns the given string suitably quoted to be used as a string literal in an SQL statement string. If the argument is |
quote_ident(<string>) |
Returns the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary. For example, if the string contains non-identifier characters or letters in different cases. Embedded quotes are doubled |
decode_bytea(<string>) |
Returns the unescaped binary data represented by the contents of the given string, which should be |
encode_bytea(<string>) |
Returns the given string encoded in the |
encode_array_literal(<array>) encode_array_literal(<array>, <delimiter>) |
Returns the contents of the referenced array as a string in array literal format (see Array value input). |
encode_typed_literal(<value>, <type_name>) |
Converts a Perl variable to the value of the data type passed as a second argument and returns a string representation of this value. This function handles correctly nested arrays and values of composite types |
encode_array_constructor(<array>) |
Returns the contents of the referenced array as a string in array constructor format (see Array constructors). Individual values are quoted using |
looks_like_number(<string>) |
Returns the |
is_array_ref(<argument>) |
Returns the |