Hi,
Can anyone from MS exaplain why the AGGREGATE component doesn't allow you to select MIN/MAX when the column is DT_STR/DT_WSTR?
Thanks
Jamie
Anyone?
|||
Thanks Jamie.
Frankly, it did not seem like a common request for our core data warehousing scenarios - so it was not coded in from day one. During beta, a couple of customers did request it, but they were able to work around the issue using a script component. And, as I remember, becuase they had some additional processing to do once they had found the max string value, a script would have been needed at some point anyway.
Always interested to hear scenarios of course. Meanwhile, this would be an interesting DCR, but so far we have not had much demand.
Donald
However, as my Aunt once said to a salesman who suggested there was "no demand" for something she was seeking - "There is a demand standing right in front of you, young man!"
|||OK thanks Donald. Sommeone on this forum was indeed asking for it and when he asked why it wasn't there I couldn't answer him. Eventually he used, as you say, a script component.
-Jamie
|||
That someone was me
I am implementing a DW/DM, where I collect data from ca. 25 different source systems / DW's. One case where I need max on varchar: In some cases, as I collect data on invoice row level, one invoice row is allocated to more than one cost center (1-n), and I somehow have to collect only one of the cost centers. The cost center data is in a varchar column, and thus I need some way to collect one of the many choices. For the sake of plausible validation, I always want to take value using same method (max or min, since they are easy to write into sql).. I know the proper way of collecting this kind of data would be at the cost center level, but let's not get into that.
In earlier DW implementations I have made (using Ascential/IBM Datastage), I had tens of cases where I had to take max/min of a varchar column.
It could easily be so that even if the data itself is numeric, it is stored in a varchar column, and I hate strong type casts, since I can never be sure if there could sometimes be text information as the column allows it. I have seen that also.
Not implementing max/min on varchar columns seems as a silly limitation in SSIS. In my opinion, this kind of features should be included in the basic transforms, so that there would be no need to always write short scripts - thats what was used in DTS.
IMHO Datastage / Informatica are much more user friendly than DTS in sense for not needing programming skills, I would really like to see SSIS to evolve more into that direction (which it already has, when comparing to DTS).
Markus
|||If you consider adding min/max aggregation of text values to the aggregate transform, while MS in the code consider adding first and last. Occasionally they are very helpful.|||Guys,
You're more likely to get this functionality if you ask for it thru the proper channels. Click through here and vote, and add a comment
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=131210
Anecdotes of why you need it are a huge help as well.
-Jamie
|||How about including it because it's valid transact-sql to use it on non-numeric fields?
It'd be better to allow for the aggregate functions to work as they are supposed to according to the MS Transact-SQL documentation. It's also better to use an aggregate transform in my package rather than having to query outside of the datastream to do something that should've been included in the first place.
For a given group of records, I want to, for instance, find the minimum text value in a column -- I shouldn't need a script to tell me that.|||
Phil Brammer wrote: It'd be better to allow for the aggregate functions to work as they are supposed to according to the MS Transact-SQL documentation.
Why?
SSIS is an ETL-tool, not a relational database. Nowhere in the SSIS documentation does it state that there is any adherance to T-SQL, ANSI SQL or any other SQL dialect.
I agree that the AGGREGATE should allow min/max on string fields but my justification for that is because it is something useful for ETL developers, not because some other development platform supports it.
-Jamie
|||
Jamie Thomson wrote: Phil Brammer wrote: It'd be better to allow for the aggregate functions to work as they are supposed to according to the MS Transact-SQL documentation.
Why?
SSIS is an ETL-tool, not a relational database. Nowhere in the SSIS documentation does it state that there is any adherance to T-SQL, ANSI SQL or any other SQL dialect.
I agree that the AGGREGATE should allow min/max on string fields but my justification for that is because it is something useful for ETL developers, not because some other development platform supports it.
-Jamie
Actually, yes it does reference Transact-SQL. Read the MSDN page on aggregate transformations. For each operation in the aggregation, they state to read the Transact-SQL documentation for more information.
Granted, I see the exception on the min/max operations... I'm just providing yet another reason to include the functionality. I can also appreciate why it was left out... It's far easier to calculate min/max on a strictly numeric field, considering you have to do an expensive sort on non-numeric fields to determin min/max.
Phil|||
Phil Brammer wrote: Jamie Thomson wrote: Phil Brammer wrote: It'd be better to allow for the aggregate functions to work as they are supposed to according to the MS Transact-SQL documentation.
Why?
SSIS is an ETL-tool, not a relational database. Nowhere in the SSIS documentation does it state that there is any adherance to T-SQL, ANSI SQL or any other SQL dialect.
I agree that the AGGREGATE should allow min/max on string fields but my justification for that is because it is something useful for ETL developers, not because some other development platform supports it.
-Jamie
Actually, yes it does reference Transact-SQL. Read the MSDN page on aggregate transformations. For each operation in the aggregation, they state to read the Transact-SQL documentation for more information.
Phil Brammer wrote: Does it? Then I stand corrected.
I personally think that's a bad idea because of the reasons elucidated in my last post in this thread, but hey!
Granted, I see the exception on the min/max operations... I'm just providing yet another reason to include the functionality. I can also appreciate why it was left out... It's far easier to calculate min/max on a strictly numeric field, considering you have to do an expensive sort on non-numeric fields to determin min/max.
Phil
true!
-J
|||
All,
The original feedback item was posted under the wrong category so I've re-posted here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=246223
Thanks to Phil for pointing it out.
-Jamie
No comments:
Post a Comment