The IKM is in charge of writing the final, transformed data to the target
table. Every interface uses a single IKM. When the IKM starts, it assumes that
all loading phases for the remote servers have already completed. This means
that every remote source data set has been loaded by an LKM into a "C$"
temporary table in the staging area, or that the source datastores are on the
same data server as the staging area. Therefore, the IKM simply needs to
execute the "Staging and Target" transformations, joins and filters on the
"C$" tables and on the tables located on the same data server as the staging
area. The resulting set is usually processed by the IKM and written into the
"I$" temporary table before being loaded into the target. These final
transformed records can be written in several ways, depending on the IKM
selected in your interface. They may simply be appended to the target, or
compared with it for incremental updates or for slowly changing dimensions.
There are two types of IKMs: those that assume that the staging area is on the
same server as the target datastore, and those that can be used when it is
not. These are illustrated below:
When the staging area is on the
target server, the IKM usually follows these steps:
1. The IKM executes a single set-oriented SELECT statement to carry out staging area and target declarative rules on all "C$" tables and local tables (such as D in the figure). This generates a result set.
2. Simple "append" IKMs directly write this result set into the target table. More complex IKMs create an "I$" table to store this result set.
3. If the data flow needs to be checked against target constraints, the IKM calls a CKM to isolate erroneous records and cleanse the "I$" table.
4. The IKM writes records from the "I$" table to the target following the defined strategy (incremental update, slowly changing dimension, etc.).
5. The IKM drops the "I$" temporary table.
6. Optionally, the IKM can call the CKM again to check the consistency of the target datastore.
These types of KMs do not manipulate
data outside of the target server. Data processing is set-oriented for maximum
efficiency when performing jobs on large volumes.
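The steps above can be sketched in a few lines of SQL. The following is a minimal illustration only, not the code an actual IKM generates: it uses Python's built-in sqlite3 module as a stand-in for the target server, and the table names (`customer`, `d_region`, `C$_0CUSTOMER`, `I$_CUSTOMER`, `E$_CUSTOMER`), the columns, and the NOT NULL check that plays the role of the CKM are all assumptions made up for this example.

```python
import sqlite3

def run_staging_on_target_demo():
    """Hypothetical walk-through of steps 1 to 5 with the staging area on the target."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Illustrative setup: a target table, a local table D, and a C$ table
    # that an LKM is assumed to have loaded already.
    cur.executescript('''
        CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL, region TEXT);
        CREATE TABLE d_region (region_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE "C$_0CUSTOMER" (cust_id INTEGER, name TEXT, region_id INTEGER);
        INSERT INTO customer VALUES (1, 'OLD NAME', 'EMEA');
        INSERT INTO d_region VALUES (10, 'EMEA'), (20, 'APAC');
        INSERT INTO "C$_0CUSTOMER" VALUES (1, 'alice', 10), (2, 'bob', 20), (3, NULL, 10);
    ''')
    # Steps 1 and 2: one set-oriented SELECT applies the join and the
    # transformation, and its result set is stored in the I$ flow table.
    cur.execute('''
        CREATE TABLE "I$_CUSTOMER" AS
        SELECT c.cust_id AS cust_id, UPPER(c.name) AS name, d.region AS region
        FROM "C$_0CUSTOMER" c JOIN d_region d ON d.region_id = c.region_id
    ''')
    # Step 3: stand-in for the CKM call. Rows that would violate the target's
    # NOT NULL constraint are isolated in an error table and removed from I$.
    cur.execute('CREATE TABLE "E$_CUSTOMER" AS SELECT * FROM "I$_CUSTOMER" WHERE name IS NULL')
    cur.execute('DELETE FROM "I$_CUSTOMER" WHERE name IS NULL')
    # Step 4: incremental update strategy: update existing keys, insert new ones.
    cur.execute('''
        UPDATE customer
        SET name = (SELECT i.name FROM "I$_CUSTOMER" i WHERE i.cust_id = customer.cust_id),
            region = (SELECT i.region FROM "I$_CUSTOMER" i WHERE i.cust_id = customer.cust_id)
        WHERE cust_id IN (SELECT cust_id FROM "I$_CUSTOMER")
    ''')
    cur.execute('''
        INSERT INTO customer (cust_id, name, region)
        SELECT i.cust_id, i.name, i.region FROM "I$_CUSTOMER" i
        WHERE NOT EXISTS (SELECT 1 FROM customer t WHERE t.cust_id = i.cust_id)
    ''')
    # Step 5: drop the temporary flow table.
    cur.execute('DROP TABLE "I$_CUSTOMER"')
    conn.commit()
    target = cur.execute("SELECT cust_id, name, region FROM customer ORDER BY cust_id").fetchall()
    errors = cur.execute('SELECT cust_id, name, region FROM "E$_CUSTOMER"').fetchall()
    conn.close()
    return target, errors
```

Every statement here operates on whole sets of rows inside one database, which is the property the paragraph above describes: no row is fetched into the client until the final SELECTs that return the result for inspection.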
When the staging area is different
from the target server, the IKM usually follows these steps:
1. The IKM executes a single set-oriented SELECT statement to carry out declarative rules on all "C$" tables and tables located on the staging area (such as D in the figure). This generates a result set.
2. The IKM loads this result set into the target datastore, following the defined strategy (append or incremental update).
This architecture has certain
limitations, such as:
- A CKM cannot be used to perform a data integrity audit on the data being processed.
- Data needs to be extracted from the staging area before being loaded to the target, which may lead to performance issues.
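To make the two steps and their limitations concrete, here is a rough sketch under stated assumptions: two separate in-memory SQLite databases stand in for the staging server and the target server, and the table names (`C$_0CUSTOMER`, `d_region`, `customer`) and the helper functions are hypothetical. It uses the simple append strategy and is not the code a real multi-server IKM produces.

```python
import sqlite3

def load_across_servers(staging, target):
    """Hypothetical two-step flow when staging and target are different servers."""
    # Step 1: a single set-oriented SELECT runs on the staging area only.
    rows = staging.execute('''
        SELECT c.cust_id, UPPER(c.name), d.region
        FROM "C$_0CUSTOMER" c JOIN d_region d ON d.region_id = c.region_id
        ORDER BY c.cust_id
    ''').fetchall()
    # Step 2: the result set leaves the staging server and is written to the
    # target (append strategy). There is no I$ table on the target, so there
    # is nothing for a CKM to audit before the rows arrive.
    target.executemany(
        "INSERT INTO customer (cust_id, name, region) VALUES (?, ?, ?)", rows
    )
    target.commit()
    return len(rows)

def run_multi_server_demo():
    # Two independent connections play the role of two data servers.
    staging = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    staging.executescript('''
        CREATE TABLE d_region (region_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE "C$_0CUSTOMER" (cust_id INTEGER, name TEXT, region_id INTEGER);
        INSERT INTO d_region VALUES (10, 'EMEA'), (20, 'APAC');
        INSERT INTO "C$_0CUSTOMER" VALUES (1, 'alice', 10), (2, 'bob', 20);
    ''')
    target.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
    load_across_servers(staging, target)
    loaded = target.execute("SELECT cust_id, name, region FROM customer ORDER BY cust_id").fetchall()
    staging.close()
    target.close()
    return loaded
```

The `fetchall()` followed by `executemany()` is where both limitations show up: the rows have to travel through the process running the load, and they reach the target without passing through a flow table that could be checked first.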