Use PL/pgSQL

Eugenia Kuzina

Contents

Overview
PL/pgSQL structure
Declarations
Execute SQL commands
Control structures
Example

Overview

PL/pgSQL is a loadable procedural language for the ADPG/PostgreSQL database system.

SQL is the language that ADPG/PostgreSQL and most other relational databases use as query language. Every SQL statement must be executed individually by the database server. That means that a client application must send each query to the database server, wait for it to be processed, receive and process the results, and then send further queries to the server. All this requires inter-process communication and can also incur network overhead.

With PL/pgSQL, you can group queries and calculations inside a database server. This gives you the power of the procedural language and advantages of SQL, significantly reducing the amount of data transfer between client and server. It can result in a considerable performance increase as compared to an application without stored functions.

PL/pgSQL has the following advantages:

Allows you to create functions, procedures, and triggers. Functions and stored procedures developed with PL/pgSQL can be used like any built-in functions and stored procedures.
Provides control structures to the SQL language.
Allows performing complex computations.
Inherits all user-defined types, functions, procedures, and operators.
Can be defined to be trusted by the PostgreSQL database server.
Is easy to use.

In ADPG, the plpgsql extension, required to use PL/pgSQL, is preinstalled in the postgres database. When creating a new database, the extensions that are present in the template1 database are installed. By default, template1 has plpgsql installed. See Template Databases.

PL/pgSQL structure

PL/pgSQL supports anonymous code blocks called with the DO statement.

DO
$$
    DECLARE
        book_num integer;
    BEGIN
        SELECT count(*) into book_num FROM book;
        RAISE NOTICE 'The number of books: %', book_num;
END;
$$;

If you need to reuse the code, you can create a function or procedure. Use CREATE FUNCTION or CREATE PROCEDURE for this. The following example defines a function written in PL/pgSQL. LANGUAGE specifies the name of the language that the function is implemented in:

CREATE FUNCTION my_function(integer, text) RETURNS integer
AS 'function body text'
LANGUAGE plpgsql;

The function body is a string literal. Very often dollar quoting ($$) is used to write the function body, rather than the normal single quote syntax (see Dollar-quoted string constants). Without dollar quoting, any single quotes or backslashes in the function body must be escaped by doubling them.

PL/pgSQL is a block-structured language. The complete text of a function body must be a block. A block is defined as:

[ <<<label>>> ]
[ DECLARE
    <declarations> ]
BEGIN
    <statements>
END [ <label> ];

Each declaration and each statement within a block is terminated by a semicolon. A block that is located within another block must have a semicolon after END, as shown above. However, the final END that concludes a function body does not require a semicolon.

A label is only needed if you want to identify the block for use in an EXIT statement, or to qualify the names of the variables declared in the block. If a label is given after END, it must match the label at the block’s beginning.

PL/pgSQL allows you to use the same comments as in ordinary SQL. A double dash (--) starts a one-line comment, a block comment begins with /* and ends with */.

Any statement in the statement section of a block can be a subblock. Subblocks are used for logical grouping or to localize variables to a small group of statements. Variables with the same names declared in a subblock substitute any variables of outer blocks, but you can access the outer variables if you qualify their names with their block’s label. For example:

CREATE FUNCTION my_function() RETURNS integer AS $$
    << my_label >>
    DECLARE
        variable1 integer := 100;
    BEGIN
        RAISE NOTICE 'Variable1 is %', variable1; -- Prints 100
        variable1 := 50;
        -- Create a subblock
    DECLARE
        variable1 integer := 200;
    BEGIN
        RAISE NOTICE 'Variable1 is %', variable1; -- Prints 200
        RAISE NOTICE 'Variable1 is %', my_label.variable1; -- Prints 50
    END;

    RAISE NOTICE 'Variable1 is %', variable1; -- Prints 50

    RETURN variable1;
END;
$$ LANGUAGE plpgsql;


SELECT my_function();

Result:

NOTICE:  Variable1 is 100
NOTICE:  Variable1 is 200
NOTICE:  Variable1 is 50
NOTICE:  Variable1 is 50
 my_function
-------------
          50

It is important not to confuse the use of BEGIN/END for grouping statements in PL/pgSQL with the similarly-named SQL commands for transaction control. In PL/pgSQL, BEGIN/END are only for grouping. They do not start or end a transaction. For more information on how to work with transactions in PL/pgSQL, refer to Transaction management. Additionally, a block with an EXCEPTION clause essentially creates a nested transaction that can be rolled back without affecting the outer transaction. This is described in the Trapping errors section.

Declarations

All variables used in a block must be declared in the DECLARE section of the block. The only exception is the FOR loop variable. PL/pgSQL variables can have any SQL data type, such as integer, char, etc.

Examples of variable declarations:

account_id integer;
amount numeric(5);
article varchar;
my_row tablename%ROWTYPE;
my_field tablename.columnname%TYPE;
arow RECORD;

The general syntax of variable declaration:

<name> [ CONSTANT ] <type> [ COLLATE <collation_name> ] [ NOT NULL ] [ { DEFAULT | := | = } <expression> ];

Parameters of variable declaration

name

Variable name

CONSTANT

Prevents the value of the variable from changing after initialization, so the value remains constant in the block

type

Variable type

COLLATE <collation_name>

Specifies a collation to use for the variable (see Collation of PL/pgSQL variables)

NOT NULL

If specified, an assignment of a null value results in a run-time error. All variables declared as NOT NULL must have a non-null default value specified

{DEFAULT | := | = } <expression>

Specifies the initial value assigned to the variable when the block is entered. If the DEFAULT clause is not given, the variable is initialized to the SQL null value

The equal sign (=) can be used instead of Oracle PL/SQL-compliant :=.

Examples:

account_id CONSTANT integer := 10;
quantity integer DEFAULT 32;
article varchar = 'This is text';

Declare function parameters

Parameters passed to functions are named with the identifiers $1, $2, etc. Optionally, aliases can be declared for parameter names to increase readability. An alias or a numeric identifier can be used to refer to the parameter value. The following two functions are identical:

CREATE FUNCTION deposit_interest(real) RETURNS real AS $$
BEGIN
RETURN $1 * 1.2;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION deposit_interest(total real) RETURNS real AS $$
BEGIN
RETURN total * 1.2;
END;
$$ LANGUAGE plpgsql;

PL/pgSQL functions also support output parameters. Output parameters are assigned numeric identifiers $1…$n and can be given aliases in the same way as input parameters. An output parameter is a variable that should be assigned during the function execution. The last assigned value is returned. Output parameters are most useful when returning multiple values:

CREATE FUNCTION calculations (x int, y int, OUT sum int, OUT prod int) AS $$
BEGIN
    sum := x + y;
    prod := x * y;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM calculations(10, 5);

Result:

 sum | prod
-----+------
  15 |   50

Aliases

You can use the ALIAS FOR clause to declare an alias for a variable or a function parameter.

<new_name> ALIAS FOR <old_name>;

The following example creates the variable2 alias for variable1:

DECLARE
  variable2 ALIAS FOR variable1;

The example below declares the parameter2 alias for the second function parameter:

DECLARE
    parameter2 ALIAS FOR $2;

Copy types

%TYPE allows you to assign a new variable a data type of the existing variable or a table column. By using %TYPE you do not need to know the data type of the object you are referencing. Also, if the data type of the referenced item changes in the future, you do not need to change your function definition.

The following code illustrates how to declare variables with the data type of a table column and another variable:

variable1 pgbench_accounts.abalance%TYPE;

variable3 variable2%TYPE;

Row types

A variable of a composite type is called a row variable (or row-type variable). This type of variables can hold a whole row of a SELECT or FOR query result. You can declare a row variable with the same type as the rows of an existing table or view. Use the <table_name>%ROWTYPE notation for this:

row_var pgbench_accounts%ROWTYPE;

The individual fields of the row value are accessed using the dot notation, for example my_row.field. A function parameter can have a composite type (a table row). In this case, the corresponding identifier $n will be a row variable, and fields can be selected from it as follows: $1.field1. The example below uses tables from the Example of working with tables via psql article.

CREATE FUNCTION get_author(t_row book) RETURNS text AS $$
DECLARE
    author text;
BEGIN
    SELECT name INTO author FROM author WHERE id = t_row.author_id;
    RETURN author;
END;
$$ LANGUAGE plpgsql;

SELECT get_author(b.*), title FROM book b;

Result:

     get_author      |         title
---------------------+-----------------------
 Virginia Woolf      | Mrs. Dalloway
 Virginia Woolf      | To the Lighthouse
 Harper Lee          | To Kill a Mockingbird
 F. Scott Fitzgerald | The Great Gatsby
 J.R.R. Tolkien      | The Lord of the Rings
 George Orwell       | 1984
 George Orwell       | Animal Farm

Record types

Record variables are similar to row-type variables, but they have no predefined structure. They take on the structure of the row they are assigned during a SELECT or FOR command. The structure of a record variable can be changed each time it is assigned. Until a record variable is first set, it has no structure, and any attempt to access a field in it will result in a run-time error.

Example:

CREATE FUNCTION get_book_name(int) RETURNS text AS $$
DECLARE
    my_row record;
BEGIN
    SELECT * INTO my_row FROM book WHERE id = $1;
    RETURN my_row.title;
END;
$$ LANGUAGE plpgsql;

SELECT get_book(5);

Result:

     get_book_name
-----------------------
 The Lord of the Rings

Execute SQL commands

In general, any SQL command that does not return rows can be executed within a PL/pgSQL function just by writing the command. For example, you can create and fill a table:

CREATE TABLE shelves (
    book_id SERIAL PRIMARY KEY,
    location VARCHAR(10) NOT NULL);

INSERT INTO shelves VALUES (5, 'B32');

If the command does return rows (for example SELECT, or INSERT/UPDATE/DELETE with RETURNING), there are two ways to proceed. When a command returns at most one row, or you only care about the first row of output, use INTO clause to capture the output. See Executing a command with a single-row result. INTO is used in an example above:

...
 SELECT * INTO my_row FROM book WHERE id = $1;
...

To process all the output rows, write the command as the data source for a FOR loop, as described in Looping through query results. The following function refreshes all materialized views using this approach:

CREATE FUNCTION refresh_mat_views() RETURNS integer AS $$
DECLARE
    mviews RECORD;
BEGIN
    RAISE NOTICE 'Refreshing all materialized views...';

    FOR mviews IN
       SELECT n.nspname AS mv_schema,
              c.relname AS mv_name,
              pg_catalog.pg_get_userbyid(c.relowner) AS owner
         FROM pg_catalog.pg_class c
    LEFT JOIN pg_catalog.pg_namespace n ON (n.oid = c.relnamespace)
        WHERE c.relkind = 'm'
     ORDER BY 1
    LOOP

        -- "mviews" has one record with information about each materialized view

        RAISE NOTICE 'Refreshing materialized view %.% (owner: %)...',
                     quote_ident(mviews.mv_schema),
                     quote_ident(mviews.mv_name),
                     quote_ident(mviews.owner);
        EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I', mviews.mv_schema, mviews.mv_name);
    END LOOP;

    RAISE NOTICE 'Done refreshing materialized views.';
    RETURN 1;
END;
$$ LANGUAGE plpgsql;

SELECT refresh_mat_views();

Non-optimizable SQL commands (also called utility commands) do not accept query parameters. Automatic substitution of PL/pgSQL variables does not work in such commands. To include non-constant text in a utility command executed from PL/pgSQL, build the utility command as a string and then use EXECUTE to run it, as described in Executing dynamic commands. In the example above, EXECUTE is used to run REFRESH MATERIALIZED VIEW:

...
EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I', mviews.mv_schema, mviews.mv_name);
...

Sometimes it is necessary to evaluate an expression or SELECT query but discard the result, for example when calling a function that has side effects but no useful result value. To do this in PL/pgSQL, use the PERFORM statement:

PERFORM <query>;

Example:

 PERFORM pg_advisory_lock(1234);

Control structures

Control structures are probably the most useful part of PL/pgSQL. With PL/pgSQL control structures, you can manipulate PostgreSQL data in a very flexible and powerful way.

RETURN

RETURN allows you to return data from a function.

RETURN with an expression terminates the function and returns the value of the expression to the caller. This form is used for PL/pgSQL functions that do not return a set. In the examples above, RETURN is executed as follows:

...
RETURN total * 1.2;
...
RETURN my_row.title;
...

When a PL/pgSQL function is declared to return SETOF <sometype>, the individual items to return are specified by a sequence of RETURN NEXT or RETURN QUERY commands, and then a final RETURN command with no argument is used to indicate that the function has finished executing. RETURN NEXT can be used with both scalar and composite data types; with a composite result type, an entire "table" of results will be returned. RETURN QUERY appends the results of executing a query to the function’s result set.

Example of RETURN NEXT

CREATE OR REPLACE FUNCTION get_all_books() RETURNS SETOF book AS $$
DECLARE
    row book%ROWTYPE;
BEGIN
    FOR row IN SELECT * FROM book
    LOOP
    -- can do some processing here
    -- the "book" table are not changed, the string "processed by RETURN NEXT" is only added in the output
    row.title= row.title||' processed by RETURN NEXT';
    RETURN NEXT row; -- return the current row of the `SELECT` result set
    END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;

SELECT title FROM get_all_books();

Result:

                     title
------------------------------------------------
 The Great Gatsby processed by RETURN NEXT
 The Lord of the Rings processed by RETURN NEXT
 1984 processed by RETURN NEXT
 Animal Farm processed by RETURN NEXT
 Mrs. Dalloway processed by RETURN NEXT
 To the Lighthouse processed by RETURN NEXT
 To Kill a Mockingbird processed by RETURN NEXT

Example of RETURN QUERY

CREATE OR REPLACE FUNCTION get_books_by_author(int) RETURNS SETOF book AS $$

BEGIN
    RETURN QUERY SELECT * FROM book WHERE author_id=$1;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'No books';
    END IF;

    RETURN;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM get_books_by_author(3);

Result:

 id |      title       | author_id | public_year
----+------------------+-----------+-------------
  4 | The Great Gatsby |         3 |        1925

A procedure does not have a return value. A procedure can therefore end without a RETURN statement. If you wish to use a RETURN statement to exit the code early, write just RETURN with no expression. If a procedure has output parameters, the final values of the output parameter variables will be returned to the caller.

Call a procedure

A PL/pgSQL function, procedure, or DO block can call a procedure using CALL. Output parameters are handled differently from the way that CALL works in plain SQL. Each OUT or INOUT parameter of the procedure must correspond to a variable in the CALL statement, and whatever the procedure returns is assigned back to that variable after it returns. For example:

CREATE PROCEDURE triple(INOUT x int)
LANGUAGE plpgsql
AS $$
BEGIN
x := x * 3;
END;
$$;

DO $$
DECLARE myvar int := 5;
BEGIN
CALL triple(myvar);
RAISE NOTICE 'myvar = %', myvar;  -- prints 15
END;
$$;

Conditionals

The IF and CASE statements let you execute alternative commands based on certain conditions. PL/pgSQL provides three forms of IF:

IF … THEN … END IF
IF … THEN … ELSE … END IF
IF … THEN … ELSIF … THEN … ELSE … END IF

and two forms of CASE:

CASE … WHEN … THEN … ELSE … END CASE
CASE WHEN … THEN … ELSE … END CASE

An example of using IF:

IF number = 0 THEN
result := 'zero';
ELSIF number > 0 THEN
result := 'positive';
ELSIF number < 0 THEN
result := 'negative';
ELSE
-- the only possibility is that number is null
result := 'NULL';
END IF;

CASE

The simple form of CASE provides conditional execution based on the equality of operands. A search expression is evaluated once and successively compared to each expression in the WHEN clauses. If a match is found, then the corresponding statements are executed, and then control passes to the next statement after END CASE.

CASE <search-expression>
WHEN <expression> [, <expression> [ ... ]] THEN
<statements>
[ WHEN <expression> [, <expression> [ ... ]] THEN
<statements>
... ]
[ ELSE
<statements> ]
END CASE;

Example:

CASE book_id

    WHEN 1,2 THEN
        msg := 'Author is Virginia Woolf';
    WHEN 3 THEN
        msg := 'Author is Harper Lee';
    WHEN 4 THEN
        msg := 'Author is F. Scott Fitzgerald';
    WHEN 5 THEN
        msg := 'Author is J.R.R. Tolkien';
    WHEN 6,7 THEN
        msg := 'Author is George Orwell';
    ELSE
        msg := 'Author is unknown';
END CASE;

The searched form of CASE provides conditional execution based on truth of boolean expressions. Each WHEN clause’s boolean expression is evaluated in turn, until one is found that is true. Then the corresponding statements are executed, and control passes to the next statement after END CASE. Subsequent expressions in the remaining WHEN are not evaluated. If no true result is found, the ELSE statements are executed. If ELSE is not specified, a CASE_NOT_FOUND exception is raised.

CASE
    WHEN <boolean-expression> THEN
      <statements>
  [ WHEN <boolean-expression> THEN
      <statements>
    ... ]
  [ ELSE
      <statements>]
END CASE;

Example:

CASE
    WHEN balance > 1000 THEN
        status = 'Platinum' ;
    WHEN balance > 500 THEN
        status = 'Gold' ;
    ELSE
        status = 'Base';
END CASE;

Loops

With the LOOP, EXIT, CONTINUE, WHILE, FOR, and FOREACH statements, you can repeat a series of commands in PL/pgSQL code.

LOOP

LOOP defines an unconditional loop that is repeated indefinitely until terminated by an EXIT or RETURN statement. The optional label can be used by EXIT and CONTINUE statements within nested loops to specify which loop these statements refer to.

[ <<<label>>> ]
LOOP
    <statements>
END LOOP [ <label> ];

EXIT

EXIT [ <label> ] [ WHEN <boolean-expression> ];

EXIT exits the loop. If no label is specified for EXIT, the innermost loop is terminated and the statement following END LOOP is executed next. If a label is given, it must be the label of the current or an outer level of nested loop or block. Then the named loop or block is terminated and control continues with the statement after the corresponding END.

If WHEN is specified, the loop exit occurs only if a boolean expression is true. Otherwise, control passes to the statement after EXIT.

EXIT can be used with all types of loops, it is not limited to use with unconditional loops.

CONTINUE

CONTINUE [ <label> ] [ WHEN <boolean-expression> ];

If no label is given, the next iteration of the innermost loop is begun. All statements remaining in the loop body are skipped, and control returns to the loop control expression (if any) to determine whether another loop iteration is needed. If a label is added, it determines the loop whose execution will be continued.

If WHEN is specified, the next iteration of the loop starts only if a boolean expression is true. Otherwise, control passes to the statement after CONTINUE.

CONTINUE can be used with all types of loops; it is not limited to use with unconditional loops.

The following example uses LOOP, EXIT, and CONTINUE to calculate the sum of even numbers from 1 to the given parameter:

CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
   num int := 0;
   my_sum int :=0;
BEGIN

  LOOP
     num = num + 1;

	 -- Exit the loop if num > the passed parameter
	 EXIT WHEN num > $1;

	 -- Skip the iteration if num is an odd number
	 CONTINUE WHEN mod(num,2) != 0;
	 my_sum = my_sum + num;
  END LOOP;
  RETURN my_sum;
END;
$$ LANGUAGE plpgsql;

SELECT get_sum(12);

Result:

 get_sum
---------
      42

WHILE

The WHILE loop statement has the following syntax:

[ <<<label>>> ]
WHILE <boolean-expression> LOOP
<statements>
END LOOP [ <label> ];

The WHILE statement repeats a sequence of statements so long as the boolean-expression evaluates to true. The expression is checked just before each entry to the loop body.

This example rewrites with WHILE the get_sum function that calculates the sum of even numbers:

CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
   my_sum int :=0;
   num int :=0;
BEGIN

  WHILE num <= $1 LOOP
  	num = num +1;
    IF mod(num,2) = 0 THEN
        my_sum = my_sum + num;
	END IF;
  END LOOP;
  RETURN my_sum;
END;
$$ LANGUAGE plpgsql;

SELECT get_sum(12);

Result:

 get_sum
---------
      42

FOR

The syntax of the FOR loop statement that iterates over a range of integers:

[ <<label>> ]
FOR <counter> IN [ REVERSE ] <expression> .. <expression> [ BY <expression> ] LOOP
<statements>
END LOOP [ label ];

The counter variable is defined as a type of integer and exists only inside the loop (any pre-existing definition of the counter variable is ignored within the loop). The two expressions giving the lower and upper bound of the range are evaluated once when entering the loop. If the BY clause is not specified, the iteration step is 1, otherwise it is the value specified in the BY clause. If REVERSE is specified, the step value is subtracted rather than added after each iteration.

This example rewrites with FOR the get_sum function that calculates the sum of even numbers:

CREATE OR REPLACE FUNCTION get_sum(int) RETURNS int AS $$
DECLARE
   my_sum int :=0;
  BEGIN

  FOR i in 1..$1 LOOP
    IF mod(i,2) = 0 THEN
        my_sum = my_sum + i;
	END IF;
  END LOOP;
  RETURN my_sum;
END;
$$ LANGUAGE plpgsql;

SELECT get_sum(12);

Result:

 get_sum
---------
      42

Also, PL/pgSQL provides the FOREACH loop for iterating over arrays.

Example

This example creates tables with random data.

Create a function that returns a string of the specified length from the given characters:

CREATE OR REPLACE FUNCTION generate_text(len int) RETURNS text AS $$
DECLARE
    symbols text := 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ';
    result text := '';
    place int := 0;
    symbols_lenght int = length(symbols);
BEGIN
    WHILE LENGTH(result) < len
    LOOP
    place = floor(random()*symbols_lenght);
        result := result || SUBSTRING(symbols, place, 1);
    END LOOP;

    RETURN result;
END
$$ LANGUAGE 'plpgsql';

Write a procedure that creates a specified number of tables, using the generate_text function to populate them. The required number of tables is passed as a parameter:

CREATE OR REPLACE PROCEDURE create_tables(num int, t_name text, row_number int)AS $$
    DECLARE
    table_name text;
    BEGIN
    FOR i IN 1 .. num
        LOOP
            table_name= t_name || '_' || i;
            EXECUTE ('CREATE TABLE ' || table_name || '(id int, text varchar);');
            FOR j IN 1 .. row_number
                LOOP
                    -- The row number determines the number of symbols in the "text" column
                    EXECUTE ('INSERT INTO ' || table_name || ' VALUES (' || j ||','||''''|| generate_text(j)||'''' || ');');
                END LOOP;
        END LOOP;
    END
$$ LANGUAGE 'plpgsql';

Call the create_tables procedure. The first parameter (3) specifies the number of tables to create, the second parameter (my_table) sets the prefix for table names, and the third one (25) — the number of rows in tables.
```
CALL create_tables(3, 'my_table', 25);
```
As a result, the my_table_1, my_table_2, and my_table_3 tables will be created.

Found a mistake? Seleсt text and press Ctrl+Enter to report it