Thursday, March 8, 2012

Aggregate/Concatenate Strings in a Script Component?

Hi--

I am uncertain how to do this. I am thinking it could be done in a script component, but after a day of experimentation, I'm not getting any closer. I'd also like to know if there is another component I may be able to use for this.

I have data coming from an Excel Spreadsheet that looks like this:

CustNumb Invoice

1 a

1 b

2 c

3 d

3 e

3 f

I would like an output that looks like this:

CustNumb Invoice

1 a, b

2 c

3 d, e, f

I am not even sure if I should be trying to do this in the subroutine that looks at each row or the one that looks at the entire buffer.

Thanks for any help or ideas...

Hilary

If you know how the maximum number of values you might have, you might be able to do this in a pivot transform. However, I would probably do it in a script. You'd need to set up the output as asynchronous (make the SynchronousOutputID = None on the output). In the ProcessInputRow method, you'll need to out a row on the async output each time the CustNumb changes from the previous value using the Output0Buffer.AddRow method.

|||

Thank you for your reply. When I was trying to do this in script earlier, that was one of the things I had trouble understanding--how to find out when the CustNumb changed. I thought that ProcessInputRow was looking at one row at a time, so I tried grabbing the customer number and then checking the next row to see if it matched... something like:

Code Snippet

Dim intCustNumber as Integer

intCustNumber = Row.CustNumber

With Row.NextRow

Dim intCustNumber1 as Integer

intCustNumber1 = Row.CustNumber

If intCustNumber = intCustNumber1 then

Etc.... not surprisingly, I lost all the rows that didn't have a repetition of the customer number. How do I look at each row and know when the number changes?

Thank you,

Hilary

|||

ProcessInputRow does look at a row each time. You need to declare a variable at the instance level (or a static variable, but I prefer instance) to hold the customer number. I freehanded the code below, so it may not compile, but hopefully it gives you an idea. You'll need to make sure it handles the first row and last row properly - I haven't tested it.

Code Snippet

Public Class ScriptMain

Inherits UserComponent

Private intCustomer As Integer

Private strString As String

Private blnFirstRow As Boolean = True

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

If Row.CustNumber = intCustomer Then

'Same customer

strString = strString + Row.ColumnVal

Else

If blnFirstRow Then

blnFirstRow = False

Else

'new customer - output the current values

Output0Buffer.AddRow()

Output0Buffer.CustNumber = intCustomer

Output0Buffer.NewCol = strString

End If

'now reset the variables

intCustomer = Row.CustNumber

strString = Row.ColumnVal

End If

End Sub

Public Overrides Sub Input0_ProcessInput(ByVal Buffer As Input0Buffer)

MyBase.Input0_ProcessInput(Buffer)

If Row.EndOfRowset Then

Output0Buffer.AddRow()

Output0Buffer.CustNumber = intCustomer

Output0Buffer.NewCol = strString

Output0Buffer.SetEndOfRowSet()

End If

End Sub

|||

Thank you, that is just what I needed-- a bit of a map for how to approach it.

Hilary

|||

There was a bug in the script above, where the last row could be skipped. I found it when when I was writing this problem up to post on my blog. Here's the full sample:

http://agilebi.com/cs/blogs/jwelch/archive/2007/09/14/dynamically-pivoting-rows-to-columns.aspx

I've also edited the script above to correct the issue.

|||

Thank you so very much. That is just above and beyond, you know? I really appreciate the help.

Hilary

No comments:

Post a Comment