Use PL/pgSQL
Overview
PL/pgSQL is a loadable procedural language for the ADPG/PostgreSQL database system.
SQL is the language that ADPG/PostgreSQL and most other relational databases use as query language. Every SQL statement must be executed individually by the database server. That means that a client application must send each query to the database server, wait for it to be processed, receive and process the results, and then send further queries to the server. All this requires inter-process communication and can also incur network overhead.
With PL/pgSQL, you can group queries and calculations inside a database server. This gives you the power of the procedural language and advantages of SQL, significantly reducing the amount of data transfer between client and server. It can result in a considerable performance increase as compared to an application without stored functions.
PL/pgSQL has the following advantages:
-
Allows you to create functions, procedures, and triggers. Functions and stored procedures developed with PL/pgSQL can be used like any built-in functions and stored procedures.
-
Provides control structures to the SQL language.
-
Allows performing complex computations.
-
Inherits all user-defined types, functions, procedures, and operators.
-
Can be defined to be trusted by the PostgreSQL database server.
-
Is easy to use.
In ADPG, the plpgsql
extension, required to use PL/pgSQL, is preinstalled in the postgres
database. When creating a new database, the extensions that are present in the template1
database are installed. By default, template1
has plpgsql
installed. See Template Databases.
PL/pgSQL structure
PL/pgSQL supports anonymous code blocks called with the DO statement.
DO
$$
DECLARE
book_num integer;
BEGIN
SELECT count(*) into book_num FROM book;
RAISE NOTICE 'The number of books: %', book_num;
END;
$$;
If you need to reuse the code, you can create a function or procedure. Use CREATE FUNCTION or CREATE PROCEDURE for this. The following example defines a function written in PL/pgSQL. LANGUAGE
specifies the name of the language that the function is implemented in:
CREATE FUNCTION my_function(integer, text) RETURNS integer
AS 'function body text'
LANGUAGE plpgsql;
The function body is a string literal. Very often dollar quoting ($$
) is used to write the function body, rather than the normal single quote syntax (see Dollar-quoted string constants). Without dollar quoting, any single quotes or backslashes in the function body must be escaped by doubling them.
PL/pgSQL is a block-structured language. The complete text of a function body must be a block. A block is defined as:
[ <<<label>>> ]
[ DECLARE
<declarations> ]
BEGIN
<statements>
END [ <label> ];
Each declaration and each statement within a block is terminated by a semicolon. A block that is located within another block must have a semicolon after END
, as shown above. However, the final END
that concludes a function body does not require a semicolon.
A label
is only needed if you want to identify the block for use in an EXIT
statement, or to qualify the names of the variables declared in the block. If a label
is given after END
, it must match the label
at the block’s beginning.
PL/pgSQL allows you to use the same comments as in ordinary SQL. A double dash (--
) starts a one-line comment, a block comment begins with /*
and ends with */
.
Any statement in the statement section of a block can be a subblock. Subblocks are used for logical grouping or to localize variables to a small group of statements. Variables with the same names declared in a subblock substitute any variables of outer blocks, but you can access the outer variables if you qualify their names with their block’s label. For example:
CREATE FUNCTION my_function() RETURNS integer AS $$
<< my_label >>
DECLARE
variable1 integer := 100;
BEGIN
RAISE NOTICE 'Variable1 is %', variable1; -- Prints 100
variable1 := 50;
-- Create a subblock
DECLARE
variable1 integer := 200;
BEGIN
RAISE NOTICE 'Variable1 is %', variable1; -- Prints 200
RAISE NOTICE 'Variable1 is %', my_label.variable1; -- Prints 50
END;
RAISE NOTICE 'Variable1 is %', variable1; -- Prints 50
RETURN variable1;
END;
$$ LANGUAGE plpgsql;
SELECT my_function();
Result:
NOTICE: Variable1 is 100 NOTICE: Variable1 is 200 NOTICE: Variable1 is 50 NOTICE: Variable1 is 50 my_function ------------- 50
It is important not to confuse the use of BEGIN/END
for grouping statements in PL/pgSQL with the similarly-named SQL commands for transaction control. In PL/pgSQL, BEGIN/END
are only for grouping. They do not start or end a transaction. For more information on how to work with transactions in PL/pgSQL, refer to Transaction management. Additionally, a block with an EXCEPTION
clause essentially creates a nested transaction that can be rolled back without affecting the outer transaction. This is described in the Trapping errors section.
Declarations
All variables used in a block must be declared in the DECLARE
section of the block. The only exception is the FOR
loop variable. PL/pgSQL variables can have any SQL data type, such as integer, char, etc.
Examples of variable declarations:
account_id integer;
amount numeric(5);
article varchar;
my_row tablename%ROWTYPE;
my_field tablename.columnname%TYPE;
arow RECORD;
The general syntax of variable declaration:
<name> [ CONSTANT ] <type> [ COLLATE <collation_name> ] [ NOT NULL ] [ { DEFAULT | := | = } <expression> ];
name |
Variable name |
CONSTANT |
Prevents the value of the variable from changing after initialization, so the value remains constant in the block |
type |
Variable type |
COLLATE <collation_name> |
Specifies a collation to use for the variable (see Collation of PL/pgSQL variables) |
NOT NULL |
If specified, an assignment of a null value results in a run-time error. All variables declared as |
{DEFAULT | := | = } <expression> |
Specifies the initial value assigned to the variable when the block is entered. If the |
The equal sign (=
) can be used instead of Oracle PL/SQL-compliant :=
.
Examples:
account_id CONSTANT integer := 10;
quantity integer DEFAULT 32;
article varchar = 'This is text';
Declare function parameters
Parameters passed to functions are named with the identifiers $1
, $2
, etc. Optionally, aliases can be declared for parameter names to increase readability. An alias or a numeric identifier can be used to refer to the parameter value. The following two functions are identical:
CREATE FUNCTION deposit_interest(real) RETURNS real AS $$
BEGIN
RETURN $1 * 1.2;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION deposit_interest(total real) RETURNS real AS $$
BEGIN
RETURN total * 1.2;
END;
$$ LANGUAGE plpgsql;
PL/pgSQL functions also support output parameters. Output parameters are assigned numeric identifiers $1…$n
and can be given aliases in the same way as input parameters. An output parameter is a variable that should be assigned during the function execution. The last assigned value is returned. Output parameters are most useful when returning multiple values:
CREATE FUNCTION calculations (x int, y int, OUT sum int, OUT prod int) AS $$
BEGIN
sum := x + y;
prod := x * y;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM calculations(10, 5);
Result:
sum | prod -----+------ 15 | 50
Aliases
You can use the ALIAS FOR
clause to declare an alias for a variable or a function parameter.
<new_name> ALIAS FOR <old_name>;
The following example creates the variable2
alias for variable1
:
DECLARE
variable2 ALIAS FOR variable1;
The example below declares the parameter2
alias for the second function parameter:
DECLARE
parameter2 ALIAS FOR $2;
Copy types
%TYPE
allows you to assign a new variable a data type of the existing variable or a table column. By using %TYPE
you do not need to know the data type of the object you are referencing. Also, if the data type of the referenced item changes in the future, you do not need to change your function definition.
The following code illustrates how to declare variables with the data type of a table column and another variable:
variable1 pgbench_accounts.abalance%TYPE;
variable3 variable2%TYPE;
Row types
A variable of a composite type is called a row variable (or row-type variable). This type of variables can hold a whole row of a SELECT
or FOR
query result. You can declare a row variable with the same type as the rows of an existing table or view. Use the <table_name>%ROWTYPE
notation for this:
row_var pgbench_accounts%ROWTYPE;
The individual fields of the row value are accessed using the dot notation, for example my_row.field
.
A function parameter can have a composite type (a table row). In this case, the corresponding identifier $n
will be a row variable, and fields can be selected from it as follows: $1.field1
. The example below uses tables from the Example of working with tables via psql article.
CREATE FUNCTION get_author(t_row book) RETURNS text AS $$
DECLARE
author text;
BEGIN
SELECT name INTO author FROM author WHERE id = t_row.author_id;
RETURN author;
END;
$$ LANGUAGE plpgsql;
SELECT get_author(b.*), title FROM book b;
Result:
get_author | title ---------------------+----------------------- Virginia Woolf | Mrs. Dalloway Virginia Woolf | To the Lighthouse Harper Lee | To Kill a Mockingbird F. Scott Fitzgerald | The Great Gatsby J.R.R. Tolkien | The Lord of the Rings George Orwell | 1984 George Orwell | Animal Farm
Record types
Record variables are similar to row-type variables, but they have no predefined structure. They take on the structure of the row they are assigned during a SELECT
or FOR
command. The structure of a record variable can be changed each time it is assigned. Until a record variable is first set, it has no structure, and any attempt to access a field in it will result in a run-time error.
Example:
CREATE FUNCTION get_book_name(int) RETURNS text AS $$
DECLARE
my_row record;
BEGIN
SELECT * INTO my_row FROM book WHERE id = $1;
RETURN my_row.title;
END;
$$ LANGUAGE plpgsql;
SELECT get_book(5);
Result:
get_book_name ----------------------- The Lord of the Rings
Execute SQL commands
In general, any SQL command that does not return rows can be executed within a PL/pgSQL function just by writing the command. For example, you can create and fill a table:
CREATE TABLE shelves (
book_id SERIAL PRIMARY KEY,
location VARCHAR(10) NOT NULL);
INSERT INTO shelves VALUES (5, 'B32');
If the command does return rows (for example SELECT
, or INSERT/UPDATE/DELETE
with RETURNING
), there are two ways to proceed. When a command returns at most one row, or you only care about the first row of output, use INTO
clause to capture the output. See Executing a command with a single-row result. INTO
is used in an example above:
...
SELECT * INTO my_row FROM book WHERE id = $1;
...
To process all the output rows, write the command as the data source for a FOR
loop, as described in Looping through query results. The following function refreshes all materialized views using this approach:
CREATE FUNCTION refresh_mat_views() RETURNS integer AS $$
DECLARE
mviews RECORD;
BEGIN
RAISE NOTICE 'Refreshing all materialized views...';
FOR mviews IN
SELECT n.nspname AS mv_schema,
c.relname AS mv_name,
pg_catalog.pg_get_userbyid(c.relowner) AS owner
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON (n.oid = c.relnamespace)
WHERE c.relkind = 'm'
ORDER BY 1
LOOP
-- "mviews" has one record with information about each materialized view
RAISE NOTICE 'Refreshing materialized view %.% (owner: %)...',
quote_ident(mviews.mv_schema),
quote_ident(mviews.mv_name),
quote_ident(mviews.owner);
EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I', mviews.mv_schema, mviews.mv_name);
END LOOP;
RAISE NOTICE 'Done refreshing materialized views.';
RETURN 1;
END;
$$ LANGUAGE plpgsql;
SELECT refresh_mat_views();
Non-optimizable SQL commands (also called utility commands) do not accept query parameters. Automatic substitution of PL/pgSQL variables does not work in such commands. To include non-constant text in a utility command executed from PL/pgSQL, build the utility command as a string and then use EXECUTE
to run it, as described in Executing dynamic commands. In the example above, EXECUTE
is used to run REFRESH MATERIALIZED VIEW
:
...
EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I', mviews.mv_schema, mviews.mv_name);
...
Sometimes it is necessary to evaluate an expression or SELECT
query but discard the result, for example when calling a function that has side effects but no useful result value. To do this in PL/pgSQL, use the PERFORM
statement:
PERFORM <query>;
Example:
PERFORM pg_advisory_lock(1234);
Control structures
Control structures are probably the most useful part of PL/pgSQL. With PL/pgSQL control structures, you can manipulate PostgreSQL data in a very flexible and powerful way.
RETURN
RETURN
allows you to return data from a function.
RETURN
with an expression terminates the function and returns the value of the expression to the caller. This form is used for PL/pgSQL functions that do not return a set. In the examples above, RETURN
is executed as follows:
...
RETURN total * 1.2;
...
RETURN my_row.title;
...
When a PL/pgSQL function is declared to return SETOF <sometype>
, the individual items to return are specified by a sequence of RETURN NEXT
or RETURN QUERY
commands, and then a final RETURN
command with no argument is used to indicate that the function has finished executing. RETURN NEXT
can be used with both scalar and composite data types; with a composite result type, an entire "table" of results will be returned. RETURN QUERY
appends the results of executing a query to the function’s result set.
CREATE OR REPLACE FUNCTION get_all_books() RETURNS SETOF book AS $$
DECLARE
row book%ROWTYPE;
BEGIN
FOR row IN SELECT * FROM book
LOOP
-- can do some processing here
-- the "book" table are not changed, the string "processed by RETURN NEXT" is only added in the output
row.title= row.title||' processed by RETURN NEXT';
RETURN NEXT row; -- return the current row of the `SELECT` result set
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
SELECT title FROM get_all_books();
Result:
title ------------------------------------------------ The Great Gatsby processed by RETURN NEXT The Lord of the Rings processed by RETURN NEXT 1984 processed by RETURN NEXT Animal Farm processed by RETURN NEXT Mrs. Dalloway processed by RETURN NEXT To the Lighthouse processed by RETURN NEXT To Kill a Mockingbird processed by RETURN NEXT
CREATE OR REPLACE FUNCTION get_books_by_author(int) RETURNS SETOF book AS $$
BEGIN
RETURN QUERY SELECT * FROM book WHERE author_id=$1;
IF NOT FOUND THEN
RAISE EXCEPTION 'No books';
END IF;
RETURN;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM get_books_by_author(3);
Result:
id | title | author_id | public_year ----+------------------+-----------+------------- 4 | The Great Gatsby | 3 | 1925
A procedure does not have a return value. A procedure can therefore end without a RETURN
statement. If you wish to use a RETURN
statement to exit the code early, write just RETURN
with no expression. If a procedure has output parameters, the final values of the output parameter variables will be returned to the caller.
Call a procedure
A PL/pgSQL function, procedure, or DO
block can call a procedure using CALL
. Output parameters are handled differently from the way that CALL
works in plain SQL. Each OUT
or INOUT
parameter of the procedure must correspond to a variable in the CALL
statement, and whatever the procedure returns is assigned back to that variable after it returns. For example:
CREATE PROCEDURE triple(INOUT x int)
LANGUAGE plpgsql
AS $$
BEGIN
x := x * 3;
END;
$$;
DO $$
DECLARE myvar int := 5;
BEGIN
CALL triple(myvar);
RAISE NOTICE 'myvar = %', myvar; -- prints 15
END;
$$;
Conditionals
The IF
and CASE
statements let you execute alternative commands based on certain conditions. PL/pgSQL provides three forms of IF
:
-
IF … THEN … END IF
-
IF … THEN … ELSE … END IF
-
IF … THEN … ELSIF … THEN … ELSE … END IF
and two forms of CASE:
-
CASE … WHEN … THEN … ELSE … END CASE
-
CASE WHEN … THEN … ELSE … END CASE
An example of using IF
:
IF number = 0 THEN
result := 'zero';
ELSIF number > 0 THEN
result := 'positive';
ELSIF number < 0 THEN
result := 'negative';
ELSE
-- the only possibility is that number is null
result := 'NULL';
END IF;
The simple form of CASE
provides conditional execution based on the equality of operands. A search expression is evaluated once and successively compared to each expression in the WHEN
clauses. If a match is found, then the corresponding statements are executed, and then control passes to the next statement after END CASE
.
CASE <search-expression>
WHEN <expression> [, <expression> [ ... ]] THEN
<statements>
[ WHEN <expression> [, <expression> [ ... ]] THEN
<statements>
... ]
[ ELSE
<statements> ]
END CASE;
Example:
CASE book_id
WHEN 1,2 THEN
msg := 'Author is Virginia Woolf';
WHEN 3 THEN
msg := 'Author is Harper Lee';
WHEN 4 THEN
msg := 'Author is F. Scott Fitzgerald';
WHEN 5 THEN
msg := 'Author is J.R.R. Tolkien';
WHEN 6,7 THEN
msg := 'Author is George Orwell';
ELSE
msg := 'Author is unknown';
END CASE;
The searched form of CASE
provides conditional execution based on truth of boolean expressions. Each WHEN
clause’s boolean expression is evaluated in turn, until one is found that is true. Then the corresponding statements are executed, and control passes to the next statement after END CASE
. Subsequent expressions in the remaining WHEN
are not evaluated. If no true result is found, the ELSE
statements are executed. If ELSE
is not specified, a CASE_NOT_FOUND
exception is raised.
CASE
WHEN <boolean-expression> THEN
<statements>
[ WHEN <boolean-expression> THEN
<statements>
... ]
[ ELSE
<statements>]
END CASE;
Example:
CASE
WHEN balance > 1000 THEN
status = 'Platinum' ;
WHEN balance > 500 THEN
status = 'Gold' ;
ELSE
status = 'Base';
END CASE;
Loops
With the LOOP
, EXIT
, CONTINUE
, WHILE
, FOR
, and FOREACH
statements, you can repeat a series of commands in PL/pgSQL code.
LOOP
defines an unconditional loop that is repeated indefinitely until terminated by an EXIT
or RETURN
statement. The optional label
can be used by EXIT
and CONTINUE
statements within nested loops to specify which loop these statements refer to.
[ <<<label>>> ]
LOOP
<statements>
END LOOP [ <label> ];
EXIT [ <label> ] [ WHEN <boolean-expression> ];
EXIT
exits the loop. If no label
is specified for EXIT
, the innermost loop is terminated and the statement following END LOOP
is executed next. If a label is given, it must be the label of the current or an outer level of nested loop or block. Then the named loop or block is terminated and control continues with the statement after the corresponding END
.
If WHEN
is specified, the loop exit occurs only if a boolean expression is true. Otherwise, control passes to the statement after EXIT
.
EXIT
can be used with all types of loops, it is not limited to use with unconditional loops.
CONTINUE [ <label> ] [ WHEN <boolean-expression> ];
If no label is given, the next iteration of the innermost loop is begun. All statements remaining in the loop body are skipped, and control returns to the loop control expression (if any) to determine whether another loop iteration is needed. If a label is added, it determines the loop whose execution will be continued.
If WHEN
is specified, the next iteration of the loop starts only if a boolean expression is true. Otherwise, control passes to the statement after CONTINUE
.
CONTINUE
can be used with all types of loops; it is not limited to use with unconditional loops.
The following example uses LOOP
, EXIT
, and CONTINUE
to calculate the sum of even numbers from 1
to the given parameter:
CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
num int := 0;
my_sum int :=0;
BEGIN
LOOP
num = num + 1;
-- Exit the loop if num > the passed parameter
EXIT WHEN num > $1;
-- Skip the iteration if num is an odd number
CONTINUE WHEN mod(num,2) != 0;
my_sum = my_sum + num;
END LOOP;
RETURN my_sum;
END;
$$ LANGUAGE plpgsql;
SELECT get_sum(12);
Result:
get_sum --------- 42
The WHILE
loop statement has the following syntax:
[ <<<label>>> ]
WHILE <boolean-expression> LOOP
<statements>
END LOOP [ <label> ];
The WHILE
statement repeats a sequence of statements so long as the boolean-expression
evaluates to true. The expression is checked just before each entry to the loop body.
This example rewrites with WHILE
the get_sum
function that calculates the sum of even numbers:
CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
my_sum int :=0;
num int :=0;
BEGIN
WHILE num <= $1 LOOP
num = num +1;
IF mod(num,2) = 0 THEN
my_sum = my_sum + num;
END IF;
END LOOP;
RETURN my_sum;
END;
$$ LANGUAGE plpgsql;
SELECT get_sum(12);
Result:
get_sum --------- 42
The syntax of the FOR
loop statement that iterates over a range of integers:
[ <<label>> ]
FOR <counter> IN [ REVERSE ] <expression> .. <expression> [ BY <expression> ] LOOP
<statements>
END LOOP [ label ];
The counter
variable is defined as a type of integer and exists only inside the loop (any pre-existing definition of the counter
variable is ignored within the loop). The two expressions giving the lower and upper bound of the range are evaluated once when entering the loop. If the BY
clause is not specified, the iteration step is 1
, otherwise it is the value specified in the BY
clause. If REVERSE
is specified, the step value is subtracted rather than added after each iteration.
This example rewrites with FOR
the get_sum
function that calculates the sum of even numbers:
CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
my_sum int :=0;
BEGIN
FOR i in 1..$1 LOOP
IF mod(i,2) = 0 THEN
my_sum = my_sum + i;
END IF;
END LOOP;
RETURN my_sum;
END;
$$ LANGUAGE plpgsql;
SELECT get_sum(12);
Result:
get_sum --------- 42
Also, PL/pgSQL provides the FOREACH
loop for iterating over arrays.
Example
This example creates tables with random data.
-
Create a function that returns a string of the specified length from the given characters:
CREATE OR REPLACE FUNCTION generate_text(len int) RETURNS text AS $$ DECLARE symbols text := 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ'; result text := ''; place int := 0; symbols_lenght int = length(symbols); BEGIN WHILE LENGTH(result) < len LOOP place = floor(random()*symbols_lenght); result := result || SUBSTRING(symbols, place, 1); END LOOP; RETURN result; END $$ LANGUAGE 'plpgsql';
-
Write a procedure that creates a specified number of tables, using the
generate_text
function to populate them. The required number of tables is passed as a parameter:CREATE OR REPLACE PROCEDURE create_tables(num int, t_name text, row_number int)AS $$ DECLARE table_name text; BEGIN FOR i IN 1 .. num LOOP table_name= t_name || '_' || i; EXECUTE ('CREATE TABLE ' || table_name || '(id int, text varchar);'); FOR j IN 1 .. row_number LOOP -- The row number determines the number of symbols in the "text" column EXECUTE ('INSERT INTO ' || table_name || ' VALUES (' || j ||','||''''|| generate_text(j)||'''' || ');'); END LOOP; END LOOP; END $$ LANGUAGE 'plpgsql';
-
Call the
create_tables
procedure. The first parameter (3
) specifies the number of tables to create, the second parameter (my_table
) sets the prefix for table names, and the third one (25
) — the number of rows in tables.CALL create_tables(3, 'my_table', 25);
As a result, the
my_table_1
,my_table_2
, andmy_table_3
tables will be created.