Hi All,
We have some logic in the transformation from PSA to DSO. The PSA receives a full load every week and also holds the historical requests for the past 2 years. The PSA has 30 million records.
The logic is: every record is stamped with 1 in COUNT, so that when the same COMPID occurs more than once in the 30 million records (including the current request), the values add up.
That means the target DSO has to store how many times each COMPID occurs in the source (i.e. the number of duplicates).
This is achieved with a SORT statement by COMPID and a COLLECT statement for the count.
For example:

Source PSA
COMPID  DATE   TIME  (and some other fields)
101     28/03
101     29/03
102     30/03

Target DSO
COMPID  DATE                 TIME  COUNT
101     29/03 (latest date)        2
102     30/03                      1
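
To make the expected result concrete, here is a rough sketch in Python (not our actual ABAP routine). The function name and the MM-DD date strings are only assumptions for illustration. It makes a single pass over the rows and keeps the latest date and a running count per COMPID:

```python
def aggregate_by_compid(rows):
    """Single pass: keep the latest date and the occurrence count per COMPID.

    rows is an iterable of (compid, date) pairs. Dates are assumed to be
    strings in a sortable form (here MM-DD), so max() picks the latest one.
    """
    result = {}
    for compid, date in rows:
        if compid in result:
            latest, count = result[compid]
            result[compid] = (max(latest, date), count + 1)
        else:
            result[compid] = (date, 1)
    return result


# Hypothetical sample mirroring the example above (28/03 written as "03-28").
sample = [(101, "03-28"), (101, "03-29"), (102, "03-30")]
result = aggregate_by_compid(sample)
print(result)
```

Each record is touched once here, which is the behaviour I would like to get close to instead of comparing all records against each other.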
The issue is that the data load takes 1 day to complete, because every week all 30 million records are compared in a LOOP statement (30 million iterations) to identify the duplicate records, together with the current week's request.
Can you please suggest an approach to reduce the load time?
I feel that scanning the two years of PSA requests takes a lot of time. So if I first load to a write-optimized DSO without any logic and then load from there to the standard DSO, would that save a few hours, instead of modifying the existing code?
Regards,
Pradeep