Fetching a file from a URL and saving it using PL SQL

enter image description here

I am going to present an example how we can fetch a file from a URL and save it for later usage. And all using only Oracle PL SQL. Our clients at work have many ways of providing data to us and now one of them decided to do it in a form of a file through their web API. The task is to download the file and save it on our server so that later it can be the source of an external table.

The Final Solution

declare
    V_SQL_DIR varchar2(35) := 'TEST_FILES';
    N_MAX_LEN_TO_WRITE_IN_BYTES number(5) := 32767;
    uri httpuritype;
    b_remote_file_content blob;
    f_output_csv utl_file.file_type;
    r_blob_piece_to_write raw(32767);
    n_start_position_to_read number := 1;
    n_remaining_bytes_to_read number;
    n_bytes_to_read number(5);
begin
    -- fetch the CSV file and read it into a blob
    uri := httpuritype('http://winterolympicsmedals.com/medals.csv');
    b_remote_file_content := uri.getblob();

    -- create a file into which we save the remote file content
    f_output_csv := utl_file.fopen_nchar(V_SQL_DIR, 'result.csv', 'wb',
                                         N_MAX_LEN_TO_WRITE_IN_BYTES);

    n_start_position_to_read := 1;
    n_remaining_bytes_to_read := dbms_lob.getlength(b_remote_file_content);

    -- looping over the remote file content
    while n_remaining_bytes_to_read > 0
    loop
        -- determine how many bytes we have to read; either the maximum or the remaining
        n_bytes_to_read := least(N_MAX_LEN_TO_WRITE_IN_BYTES, n_remaining_bytes_to_read);
        -- read a piece from the blob
        dbms_lob.read(b_remote_file_content, n_bytes_to_read, n_start_position_to_read,
                      r_blob_piece_to_write);
        -- write the piece into the file
        utl_file.put_raw(f_output_csv, r_blob_piece_to_write);
        utl_file.fflush(f_output_csv);
        -- calculate the start position for the next cut and the remaining bytes
        n_start_position_to_read := n_start_position_to_read + n_bytes_to_read;
        n_remaining_bytes_to_read := n_remaining_bytes_to_read - n_bytes_to_read;
    end loop;

    utl_file.fclose(f_output_csv);
end;
/

The Solution Breakdown

uri := httpuritype('http://winterolympicsmedals.com/medals.csv');
b_remote_file_content := uri.getblob();

In the first line we create an HTTPURITYPE object from the given URL.
Then we load the content of the remote file into the memory as a blob.

f_output_csv := utl_file.fopen_nchar(V_SQL_DIR, 'result.csv', 'wb',
                                     N_MAX_LEN_TO_WRITE_IN_BYTES);
-- N_MAX_LEN_TO_WRITE_IN_BYTES = 32767

We create a file to store the remote content. The V_SQL_DIR variable contains the name of the SQL directory object. Note that I use fopen_nchar to open the file in Unicode format.

n_start_position_to_read := 1;
n_remaining_bytes_to_read := dbms_lob.getlength(b_remote_file_content);

-- looping over the remote file content
while n_remaining_bytes_to_read > 0
loop
    -- determine how many bytes we have to read; either the maximum or the remaining
    n_bytes_to_read := least(N_MAX_LEN_TO_WRITE_IN_BYTES, n_remaining_bytes_to_read);
    -- read a piece from the blob
    dbms_lob.read(b_remote_file_content, n_bytes_to_read, n_start_position_to_read, 
                  r_blob_piece_to_write);
    -- write the piece into the file
    utl_file.put_raw(f_output_csv, r_blob_piece_to_write);
    utl_file.fflush(f_output_csv);
    -- calculate the start position for the next cut and the remaining bytes
    n_start_position_to_read := n_start_position_to_read + n_bytes_to_read;
    n_remaining_bytes_to_read := n_remaining_bytes_to_read - n_bytes_to_read;
end loop;

We start a loop that goes until we run out of bytes to be written into the file. If the number of actual (remaining) bytes is less than the value in the N_MAX_LEN_TO_WRITE_IN_BYTES variable, then we use the former, otherwise we must use N_MAX_LEN_TO_WRITE_IN_BYTES. The N_MAX_LEN_TO_WRITE_IN_BYTES variable is equal to 32767, which is the maximum number of bytes to be written at once.
A piece of blob is returned in the r_blob_piece_to_write output variable. Then we write it into the file. First into the buffer, then physically to the disc by calling the fflush method. Instead of calling fflush separately, we could just set the autoflush parameter to true when we call put_raw. At last, we calculate the new starting position and the remaining number of bytes to be written.

Then we do this.

As I mentioned it in the beginning, this file will serve as the source of an external table. If you want to check a working example for external tables with a preprocessor, please have a look at my article on this.

Comments