Oracle – Equivalent Idioms in Google BigQuery

google-bigquery oracle

I was reading this article about Google BigQuery:

Exploring a powerful SQL pattern: ARRAY_AGG, STRUCT and UNNEST

It uses a couple of functions whose Oracle equivalents I am trying to identify, if they even exist.

While digging through the Oracle documentation, I found something similar to ARRAY_AGG in Oracle called LISTAGG.

What I have not been able to find are the idiomatic equivalents of STRUCT and UNNEST.

Are there things that do the same thing as `STRUCT` and `UNNEST` or will I have to write my own?

Best Answer

Technique Translation

The technique for solving a problem in one RDBMS doesn't always translate well into another.

I've had performance problems in Oracle with code similar to that in the article, hence my comment: "This technique will kill performance in Oracle".

I don't see CAST(MULTISET()) (Oracle's version of ARRAY_AGG) being used very often. As a result, your replacement may struggle to maintain the code after you get a promotion.

Translating Terms

The translation goes like this:

STRUCT in BigQuery is TYPE in Oracle
ARRAY_AGG in BigQuery is CAST(MULTISET()) in Oracle
UNNEST in BigQuery is TABLE() in Oracle
UNNEST of XML Data is XMLTABLE() in Oracle
UNNEST of JSON Data is JSON_TABLE() in Oracle (12c+)

You can't create the structure on the fly. You have to define it ahead of time.

EXAMPLES

Creating a structure of a single row:

create type emp_t as object (
  EMPNO             NUMBER(4),    
  ENAME             VARCHAR2(10) ,
  JOB               VARCHAR2(9)  ,
  MGR               NUMBER(4)    ,
  HIREDATE          DATE         ,
  SAL               NUMBER(7,2)  ,
  COMM              NUMBER(7,2)  ,
  DEPTNO            NUMBER(2)  
);
/

Creating a structure for a set of rows based on the above (a nested table):

create type emp_tt as table of emp_t;
/

Using CAST(MULTISET())

-- one row per department, with its employees nested as an emp_tt collection
select d.deptno
  ,cast(multiset(
           select * from scott.emp e where e.deptno=d.deptno
        ) as emp_tt) as emp_table
from scott.dept d
;

Using TABLE()

with data as (
  select d.deptno
    ,cast(multiset(
             select * from scott.emp e where e.deptno=d.deptno
           ) as emp_tt) AS emp_table
  from scott.dept d
)
-- TABLE() expands each nested collection back into one row per employee
select b.*
from data a, table( a.emp_table ) b
order by b.empno;
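
For readers without an Oracle instance at hand, the nest/unnest round trip shown above can be modeled in a few lines of Python. This is only a conceptual sketch: the Emp tuple and the sample rows are invented here as stand-ins for emp_t, scott.emp and scott.dept, and it says nothing about how Oracle actually executes the query.

```python
from collections import namedtuple

# Hypothetical stand-in for the emp_t object type (only three attributes kept).
Emp = namedtuple("Emp", ["empno", "ename", "deptno"])

# Invented sample rows standing in for scott.emp and scott.dept.
emps = [
    Emp(7839, "KING", 10),
    Emp(7369, "SMITH", 20),
    Emp(7782, "CLARK", 10),
    Emp(7566, "JONES", 20),
]
depts = [10, 20, 40]


def nest(depts, emps):
    """Model of CAST(MULTISET(...) AS emp_tt): one row per dept with a nested list."""
    return [(d, [e for e in emps if e.deptno == d]) for d in depts]


def unnest(nested):
    """Model of TABLE(a.emp_table): one output row per nested element."""
    return [e for _, emp_table in nested for e in emp_table]


nested = nest(depts, emps)
flat = sorted(unnest(nested), key=lambda e: e.empno)

for deptno, emp_table in nested:
    print(deptno, [e.ename for e in emp_table])
print([e.empno for e in flat])
```

Department 40 has an empty nested list in this model, so it contributes no rows once the data is unnested again.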

Oracle's Solution

Business Requirement: I want the first time a hurricane reached its maximum category along with its position.

Other, inferred identifiers to "GROUP BY" would be season, basin, and subbasin, in addition to the hurricane name.

One method to solve this in Oracle is to use analytic functions.

Since I don't have access to the hurricane data, I'll have to improvise.

  select deptno as hurricane_name
        ,2017   as season
        ,'NA'   as basin
        ,'WP'   as subbasin
        ,sal    as category
        ,ename  as position
        ,rownum as time
  from scott.emp

The first thing we do is use the analytic function RANK() to rank all of the rows by category (descending) and time (ascending), partitioning the rankings by season, basin, subbasin, and hurricane_name.

  select h.*
    ,RANK() over (partition by season, basin, subbasin, hurricane_name
                  order by category desc, time)
      as rank_score
  from hurricane_data h

Finally, we'll keep only the top-ranked row for each hurricane (rank_score=1).

select *
from analyized_data
where rank_score=1
order by season, basin, subbasin, hurricane_name, time

Putting it all together

with hurricane_data as (
  select deptno as hurricane_name
        ,2017   as season
        ,'NA'   as basin
        ,'WP'   as subbasin
        ,sal    as category
        ,ename  as position
        ,rownum as time
  from scott.emp
), analyized_data as (
  select h.*
    ,RANK() over (partition by season, basin, subbasin, hurricane_name
                  order by category desc, time) rank_score
  from hurricane_data h
)
select *
from analyized_data
where rank_score=1
-- place season/basin/subbasin filters here
order by season, basin, subbasin, hurricane_name, time;
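
To try the ranking logic without Oracle or the real hurricane data, here is a minimal sketch that runs the same shape of query through Python's built-in sqlite3 module. The track points are invented for illustration, the CTE is named analyzed_data here, and it assumes the bundled SQLite is 3.25 or newer (the first release with window functions). It only demonstrates the RANK() pattern, not Oracle's performance characteristics.

```python
import sqlite3

# Invented track points: (name, season, basin, subbasin, category, position, time).
rows = [
    ("ALPHA", 2017, "NA", "MM", 1, "25.0N 70.0W", 1),
    ("ALPHA", 2017, "NA", "MM", 3, "26.1N 71.2W", 2),
    ("ALPHA", 2017, "NA", "MM", 3, "27.3N 72.0W", 3),
    ("BETA", 2017, "NA", "MM", 2, "15.0N 50.0W", 1),
    ("BETA", 2017, "NA", "MM", 1, "16.2N 51.5W", 2),
]

QUERY = """
with analyzed_data as (
  select h.*
    ,rank() over (partition by season, basin, subbasin, hurricane_name
                  order by category desc, time) as rank_score
  from hurricane_data h
)
select hurricane_name, category, position, time
from analyzed_data
where rank_score = 1
order by season, basin, subbasin, hurricane_name, time
"""


def first_peak(track_rows):
    """Return, per hurricane, the first row at which it reached its maximum category."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table hurricane_data ("
            "hurricane_name text, season integer, basin text, subbasin text, "
            "category integer, position text, time integer)"
        )
        conn.executemany(
            "insert into hurricane_data values (?, ?, ?, ?, ?, ?, ?)", track_rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = first_peak(rows)
for row in result:
    print(row)
```

ALPHA reaches category 3 twice in the sample data; ordering by time within the partition is what makes the earlier of the two rows rank first.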

Related Solutions

Oracle sort varchar2 column with special characters last

If the sort order that you want to specify is already supported by Oracle, you can do this by ordering by the NLSSORT function - like so:

ORDER BY NLSSORT(sorted_column, 'NLS_SORT = XDanish') -- Replace XDanish as appropriate

You can find a list of supported sort orders here.

Find a syntax reference for Oracle SQL Developer’s “Generate DB Doc” function

Community wiki answer initially based on comments left by thatjeffsmith:

This is an exhaustive list of what it supports:

(reproduced from http://pldoc.sourceforge.net/maven-site/samples/sample1.sql)

CREATE OR REPLACE
PACKAGE CUSTOMER_DATA
IS
/** 
* Project:         Test Project (<a href="http://pldoc.sourceforge.net">PLDoc</a>)<br/>
* Description:     Customer Data Management<br/>
* DB impact:       YES<br/>
* Commit inside:   NO<br/>
* Rollback inside: NO<br/>
* @headcom
*/

/**
* Record of customer data.
*
* @param id     customer ID
* @param name       customer name
* @param regno      registration number or SSN
* @param language   preferred language
*/
TYPE customer_type IS RECORD (
  id                        VARCHAR2(20),
  name                      VARCHAR2(100),
  regno                     VARCHAR2(50),
  language                  VARCHAR2(10)
);

/** Table of customer records. */
TYPE customer_table IS TABLE OF customer_type INDEX BY BINARY_INTEGER;

/**
* Gets customer by ID.
*
* @param p_id       customer ID
* @param r          record of customer data
* @throws no_data_found if no such customer exists
*/
PROCEDURE get_customer (
  p_id              VARCHAR2,
  customer_rec      OUT customer_type);

/**
* Searches customer by criteria.
*
* @param p_criteria record with assigned search criteria
* @param r_records  table of found customers <b>(may be empty!)</b>
*/
PROCEDURE get_by_criteria (
  p_criteria        customer_type,
  r_records         OUT customer_table);

/**
* Creates a customer record.
*
* @param customer_rec record of customer data
*/
PROCEDURE create_customer (
  customer_rec      customer_type);

/**
* Changes customer data.
*
* @param customer_rec record of updated customer data
*/
PROCEDURE update_customer (
  customer_rec      customer_type);

END;
/

We support everything in PLDoc; we just have a GUI rather than a CLI for it. There are three code samples there, and you should be able to do anything listed in those samples.
