My batches need to have some sequential ids added to them. These ids need to be system-wide.
Here is example input data. I am given the first two cols and need to generate the third:
Code Snippet
Row RowType ID
A LN LN1001
B LN LN1002
C TR TR2001
D LN LN1003
E TR TR2002
F TR TR2003
There is no relation between Row and ID except for the sequence of the records.
I have a table that looks like this:
Code Snippet
RowType MaxID
LN 1003
TR 2003
and that is where I go to get/update my latest id. (SELECT ? = MaxID = MaxID + 1 Where RowType = 'TR')
My question is this:
I will probably use a script step to do the work of obtaining and applying a new row id to each row. How do I run the requisite SQL from within a script step? Or is there a better way of doing it?
You would need to cache the current max ID by row type, in preparation for the main script execuction, so the script PreExecute event would seem like a good place. You can use an SSIS connction of type ADO.NET, and use that to get the System.Data.SqlClient.SqlConnection object, and write some stanadard .Net data access code. See http://msdn2.microsoft.com/en-us/library/ms136045.aspx for a start.You do not want to be making that SQL call per row, so track them in code, and write back the latest values in the PostExecute event.
Alternatively you could just use the Exec SQL Task to query this data and store it in a variable.
You would use a second Exec SQL Task to write back the latest values used after the Data Flow.
|||Cool.
That is just the sort of info I was after.
I think I will go with the query before and update after model. You've given me some excellent pointers. I guess the main thing I was concerned about was the impact on performance of doing multiple updates (running SQL for each row of data.)
Thanks,
Dylan.
No comments:
Post a Comment