Oracle vs PostgreSQL: First Glance

enter image description here

We are facing some interesting times at work as we are going to start changing our technology stack in the near future. Most of our Oracle processes will be replaced by Python, processing the data stored in parquet files. For indispensable database processes we will change to PostgreSQL. I like Oracle and it has served us well, but I am excited about learning and working with new technologies. This is a great opportunity for us to take part in building something from the very beginning and gain experience in other technologies.

After having the first wider look at PostgreSQL, I decided to gather some of the differences and point them out in this article. I am planning to share my experience during and after the migration as well.

DUAL Table

Our often used DUAL table from Oracle doesn’t exist in PostgreSQL. Don’t worry though, you may even find the other way simpler.

Oracle:

SELECT 1*3 FROM dual;

PostgreSQL:

SELECT 1*3;

String Concatenation

'Concat with ' || NULL

Result in Oracle:

Concat with 

Result in PostgreSQL:

[null]

To achieve the same functionality in PostgreSQL, use the concat function:

concat('Concat with ', NULL)

ROWNUM and ROWID

Oracle:

SELECT rowid,
       rownum,
       country
FROM country
WHERE rownum <= 5;

PostgreSQL:

SELECT ctid AS rowid,
       row_number() over() AS rn,
       country 
FROM country
LIMIT 5;

IF EXISTS for DDL operations

I really miss this from Oracle, but luckily PostgreSQL has the ability to check whether or not a database object exists before running a DDL operation on it.
Some examples:

CREATE TABLE IF NOT EXISTS table_name (...);
DROP TABLE IF EXISTS table_name;
ALTER TABLE IF EXISTS table_name RENAME TO new_name;
ALTER TABLE table_name DROP COLUMN IF EXISTS column_name;

We could go on and on with all the different database objects.

Outer Join using (+)

I am confident that anyone who works with Oracle often uses the (+) inside a query to simply force an outer join. In PostgreSQL we don’t have this “luxury”, so we have to go the default way.

Oracle:

SELECT * 
FROM countries c, locations l
WHERE c.country_id = l.country_id (+);

PostgreSQL:

SELECT * 
FROM countries c
LEFT OUTER JOIN locations l ON c.country_id = l.country_id;

Postgre’s solution also works in Oracle.

UPDATE using another table

Imagine that you want to update values in a table depending on values in another table. In Oracle normally we use a subquery, in PostgreSQL, however, there is a really nice way to do it.

Oracle:

UPDATE countries
SET country_name = upper(country_name)
WHERE country_id IN (SELECT country_id FROM locations);

PostgreSQL:

UPDATE city
SET city = upper(city)
FROM country
WHERE city.country_id = country.country_id;

DELETE using another table

The same way as described before with UPDATE, in PostgreSQL we can simply use values from another table when we execute a DELETE statement.

Oracle:

DELETE FROM locations
WHERE country_id in (SELECT country_id FROM countries);

PostgreSQL:

DELETE FROM city
USING country
WHERE city.country_id = country.country_id;

Oracle’s solution also works in PostgreSQL.

MERGE (UPSERT)

MERGE is a useful and often used statement that we use to insert or update data depending on its presence. If the data exists, we update the required columns, if not, we insert a new record. In PostgreSQL it is called UPSERT.
In the following example we insert a new record or update the email address if the name columns match.

Oracle:

MERGE INTO customers a
    USING suppliers b
    ON (a.name = b.name)
  WHEN MATCHED THEN
    UPDATE SET a.email = b.email
  WHEN NOT MATCHED THEN
    INSERT (name, email)
    VALUES (b.name, b.email);

PostgreSQL:

INSERT INTO customers (name, email)
VALUES ('Somebody', 'somebody@email.com') 
ON CONFLICT (name) 
DO UPDATE
SET email = customers.email;
-- By EXCLUDED.email we could refer to the "new" email value that was excluded from inserting

Both in Oracle or in PostgreSQL we can choose to do nothing if there is a match (conflict).

Table Inheritance

In PostgreSQL tables can inherit the data and the structure from an existing table. You can think of it as being similar to the class inheritance in object-oriented programming. In Oracle table inheritance doesn’t exist in this form.
You can find a really good example for Postgre inheritance here.

Copying Tables

PostgreSQL offers some good ways to copy tables.
To copy both table structure and data:

CREATE TABLE new_table AS 
TABLE existing_table;

To copy only the table structure:

CREATE TABLE new_table AS 
TABLE existing_table 
WITH NO DATA;

In Oracle there is a way also to do something similar. However, this solution only creates the table without any additional objects such as indexes or constraints. The newly created table may or may not have data depending on the condition.

CREATE TABLE country_2 AS
SELECT * FROM country;

The above code will create a new table based on country table, with all the data.
If you just want an empty table without data, execute this:

CREATE TABLE country_2 AS
SELECT * FROM country
WHERE 1=0;

This solution also works in PostgreSQL.

TRUNCATE

Unlike in Oracle, TRUNCATE is transaction-safe in PostgreSQL. It means that if you place it within the transaction statements such as BEGIN and ROLLBACK, the truncation operation will be rolled back safely.

Hierarchical Queries

The commonly used CONNECT BY-START WITH-PRIOR clauses from Oracle don’t exist in PostgreSQL. At least not by default. By installing the tablefunc extension you can have equivalent functions. Also, in PostgreSQL you can use recursive Common Table Expressions (CTE) to achieve the same result. (CTEs are also available in Oracle.)
You can find more information with examples here.

Partition Handling

As our team works in the market research industry, the huge volume of data often requires partitioned tables. Let’s have a look at the following example on how to handle partitioning by list.

Oracle:

CREATE TABLE sales (
  salesman_id   INTEGER PRIMARY KEY,
  salesman_name VARCHAR2(30),
  sales_region  VARCHAR2(30),
  sales_date    DATE,
  sales_amount  INTEGER
) 
PARTITION BY LIST (sales_region)
(
  PARTITION p_asia    VALUES ('INDIA','CHINA'),
  PARTITION p_euro    VALUES ('FRANCE','UK'),
  PARTITION p_america VALUES ('USA','CANADA'),
  PARTITION p_rest    VALUES (DEFAULT)
);

PostgreSQL:

CREATE TABLE sales (
  salesman_id   INTEGER,
  salesman_name VARCHAR(30),
  sales_region  VARCHAR(30),
  sales_date    DATE,
  sales_amount  INTEGER
) 
PARTITION BY LIST (sales_region);

CREATE TABLE sales_p_asia    PARTITION OF sales FOR VALUES IN ('INDIA','CHINA');
CREATE TABLE sales_p_euro    PARTITION OF sales FOR VALUES IN ('FRANCE','UK');
CREATE TABLE sales_p_america PARTITION OF sales FOR VALUES IN ('USA','CANADA');
CREATE TABLE sales_p_rest    PARTITION OF sales DEFAULT;

Querying data from a specific partition in Oracle:

SELECT * FROM sales PARTITION (p_america);

In PostgreSQL:

SELECT * FROM sales_p_america;

For more information check here and here.

Block Structure

Normally we organize our code into blocks, procedures or functions. Let’s have a look at the differences.

Oracle:

DECLARE
... variables
BEGIN
... code to be executed
EXCEPTION
... exception handling
END;

PostgreSQL:

DO $$ 
DECLARE
... variables
BEGIN 
... code to be executed
EXCEPTION
.. exception handling
END $$;

The body of a block, procedure or function is passed as a string literal. To avoid having to write the code as a long string and having to escape all single quotes, PostgreSQL offers us the $$ syntactic sugar that indicates a single quote. If we wanted, however, we could still use a single quote instead of $$.
The syntax difference is the same with regards to procedures and functions as well. You can have a deeper look here and here.

Packages

There is no such thing as a package in PostgreSQL. This will probably take some time to get used to after having worked quite some years with Oracle. However, we can organize our package into separate PostgreSQL procedures and functions. (Just like an Oracle package contains separate procedures and functions.) By using a standard naming convention, we can “overcome” this and name our procedures and functions by starting with the imaginary package name followed by two underscores.

package_name__proc_name

The following procedure name would substitute Oracle’s pkg_index.prc_calculate_index procedure:

pkg_index__prc_calculate_index

Although it is very important to highlight that even if we can “organize” our PostgreSQL procedures and functions by using this standard naming convention, unfortunately we can forget about the package level global variables.

Conclusion

These are the differences that I found useful to point out at first. As we start migrating to PostgreSQL in the future, I will share our experience and highlight the things to watch out for.

I think we cannot say if one or the other database is better, both of them have their own strengths. As far as the performance is concerned, I will know more after we have completed the migration and started processing production data with PostgreSQL.

If you have any comments or if you have any experience in migrating Oracle PL SQL to PostgreSQL, don’t hesitate to share your thoughts in the comments. I would be happy and interested in hearing them.

Comments

  1. FYI truncate in postgres isn't transaction safe in a different way, if you truncate and then put stuff into the table inside a transaction other transactions may see it as empty.

    ReplyDelete
    Replies
    1. Thanks for you input, I will try it out. Good to know!

      Delete
  2. If you're going to use code examples to demonstrate how things are different in various systems, PLEASE use the same examples to do so. Your DELETE and UPDATE examples confused the heck out of me.

    ReplyDelete
    Replies
    1. You are right. I used the tables from the the sample databases for this example.
      Good advice, I will keep in mind for future articles.

      Delete
  3. Most of us moved away from the (+) outer join about 20 years ago. The (INNER|LEFT OUTER|RIGHT OUTER|CROSS) JOIN syntax is clearer, portable, and an ANSI standard supported by both databases. I've used Oracle since '83 and the "(+)" syntax always had me confused especially when multiple join criteria were involved. And don't get me started on Sybase's (MS SQL Servers) "*=" outer join syntax!

    Also the bigger surprises I've had are that the PG optimizer is not as smart which means I have to work much harder to optimize and end up with queries that are much harder to maintain.

    ReplyDelete
    Replies
    1. "Most" not really what I have seen in the wild -- the (+) shorthand in my experience is still widely used especially in long lived codebases or companies with seasoned (read older) DBAs.

      Delete
    2. I think also important to touch upon is schemas. Schemas and users are tightly integrated in Oracle. You are not required to have a schema for each user or have a schema correspond to a user. You could create a schema to store your functions, having the schema act akin to an Oracle package.

      Delete
    3. Anonymous: thanks for your input about the Postgre optimizer. I will bring it up in the team and surely check it out before we start the migration.
      Regarding the (+), I agree it can be confusing with huge queries, but in normal queries I think it is well-used. At least in the projects that I have worked for so far, we have liked to use it.

      Delete
    4. John: regarding your input about creating a schema to store functions, this was my idea as well to substitute Oracle packages. Of course, probably not a separate schema for all packages, but to organize them. Thanks.

      Delete
  4. The MERGE (UPSERT) postgres example is wrong. EXCLUDED.email gives you the new value, customers.email will give you the value that is in the already existing row.
    EXCLUDED as in 'this new row that was excluded from inserting' as it conflicts with an existing row.

    ReplyDelete
    Replies
    1. Ah, what a mistake. Thanks Christopher, I have corrected it.

      Delete
  5. Postgres has not packages, but it has schemas. Schemas in Postgres are very similar to packages in Oracle, and can be used instead of packages.

    ReplyDelete
    Replies
    1. I agree, in Postgre using schemas is the way to organize your procedures and functions.

      Delete
  6. In PostgreSQL you can also CREATE TABLE x LIKE y which is quite handy

    ReplyDelete
  7. thank you for your post Roland, but please don't use "Postgre" it is not the right name. the shorter form that can be used is Postgres.

    ReplyDelete
    Replies
    1. With colleagues we just call it Postgre, it seems I got used to it too much. :)

      Delete
  8. Nice collection, I looking forward to see more of your experience.
    Keep up the good work!
    Thanks.

    ReplyDelete
    Replies
    1. Thanks a lot for your comment, I am really happy you liked it.

      Delete

Post a Comment