Thursday, March 8, 2012

Aggregate only top 20 records in a table- MSSQL2000

Hi All

I need to aggregate a query to produce the following:

Workplace Avg
M100 4.7
M120 3.45

Which would be a normal aggregate:
SELECT Workplace, Avg(VALUE)
FROM PROD
GROUP BY Workplace

However I need the average to only be based on the most recent 20
results from each of the Workplace groups.

I've never had to do something like this before so can't think of any
way to only take off the most recent 20 for each group (ordered by
Date). It doesn't really matter if there were 25 spread across 2 days
I would just cut the list at 20 VALUEs as there is no time component
invloved.

Is there any way to do a sub-query that uses select top 20 ... for
each group that could then be aggregated?

I would prefer to do it through a select statement rather than having
to use a stored procedure using and variables, etc which I can do. The
table is not huge but is growing rapidly so I'm concerned that
anything using dyamic SQL or similar would be become painfully as the
number of groups grows to 5,000 or more.

If anyone has any ideas they would be greatly appreciated.

Thanks in advance,
BevanOne option is to use a derived table construct like:

SELECT col1, AVG(col2)
FROM ( SELECT TOP 20 col1, col2
FROM tbl
ORDER BY col2 ) D
GROUP BY col1 ;

--
- Anith
( Please reply to newsgroups only )|||"Bevan Ward" <bevan_ward@.hotmail.com> wrote in message
news:b9cc76b2.0306270622.4ec978c5@.posting.google.c om...
> Hi All
> I need to aggregate a query to produce the following:
> Workplace Avg
> M100 4.7
> M120 3.45
> Which would be a normal aggregate:
> SELECT Workplace, Avg(VALUE)
> FROM PROD
> GROUP BY Workplace
> However I need the average to only be based on the most recent 20
> results from each of the Workplace groups.
> I've never had to do something like this before so can't think of any
> way to only take off the most recent 20 for each group (ordered by
> Date). It doesn't really matter if there were 25 spread across 2 days
> I would just cut the list at 20 VALUEs as there is no time component
> invloved.
> Is there any way to do a sub-query that uses select top 20 ... for
> each group that could then be aggregated?
> I would prefer to do it through a select statement rather than having
> to use a stored procedure using and variables, etc which I can do. The
> table is not huge but is growing rapidly so I'm concerned that
> anything using dyamic SQL or similar would be become painfully as the
> number of groups grows to 5,000 or more.
> If anyone has any ideas they would be greatly appreciated.
> Thanks in advance,
> Bevan

CREATE TABLE Prod
(
workplace VARCHAR(10) NOT NULL,
dt DATETIME NOT NULL,
value FLOAT NOT NULL,
PRIMARY KEY (workplace, dt)
)

SELECT workplace, AVG(value) AS avg_value
FROM Prod AS P
WHERE dt IN (SELECT TOP 20 dt
FROM Prod
WHERE workplace = P.workplace
ORDER BY dt DESC)
GROUP BY workplace

Regards,
jag

No comments:

Post a Comment