ELT Data – Does it End Up in an RDBMS?

etl

I think I am confusing things.

In ETL, data conversion starts at Database A and ends up in Database B, both are relational databases. Maybe I have a 20-year-old system. I use ETL to get the data between the two databases. I do not know what else ETL is used for.

In ELT, is that the same except the data from Database A never ends up in Database B? Instead, the raw data is sitting in tables or some other data structure on Server B, from Database A, but requires something like Hadoop to transform that data into something an application that is specifically designed to use Hadoop?

Edit: I have done it again. Hadoop != ELT. I was looking only at Hadoop and thought it was ELT or the manifestation of it. edit: and that ELT meant you required a unique non-RDBMS file data structure instead of an RDBMS and you dumped the RDBMS altogether.

Best Answer

I think this will be easier with an example:

Database A has table C. Database B has table D. C and D are very similar but not identical and the data needs to be cleaned up before being loaded in to D.

  • ETL

    An ETL program (SSIS, Pentaho, whatever) pulls the data from table C. It then makes some changes to the data to clean it up and put it in the format that is needed. The ETL process then moves/copies that data up to table D on Database B.

  • ELT

    The data from Table C is extracted. It is uploaded to Database B as table C. Database B now has two tables, C and D. A database process (SQL, a job, whatever) now makes some changes to the data in table C and puts it in the format that is needed. That same process then copies the now clean data in to table D. The temporary Table C is no longer needed in Database B and can be deleted or truncated.

In both cases the data is now loaded in Database B, Table D. The front end application continues to use Table D.

Related Question