Friday, February 24, 2012

Age Old Question about GROUP BY clause (i think) - Probably easy answer

How does one get the primary key of the row that is joined in via a
group by aggregate clause when the aggregate is not performed on the
primary key?

For example,

Person table
(
PersonID int,
FirstName varchar(50)
LastName varchar(50)
)

Visit table
(
VisitID int,
PersonID int,
VisitDate datetime
)

These are simplified versions of my tables. I'm trying to create a
view that gets the first time each person Visited:

selectp.PersonID,
min(v.VisitDate)
fromVisit v
joinPerson p on p.PersonID = v.PersonID
group byp.PersonID

The problem is that I would like to return the VisitID in the
resultset, but when I do it expands the query since I have to also put
it in the group by clause.

What are the different ways to achieve this?
Subqueries?
Only return the date and then join off of date on the outside?

Neither of these seem too entising...

Thanks in advance for any help.

-DaveHow about:

select person.*, visit.*
from person left join visit
on person.personid = visit.visitid
and visit.visitdate =
(select min(visitdate) from visit where personid = person.personid)

FN

malcolm wrote:
> How does one get the primary key of the row that is joined in via a
> group by aggregate clause when the aggregate is not performed on the
> primary key?
> For example,
> Person table
> (
> PersonID int,
> FirstName varchar(50)
> LastName varchar(50)
> )
>
> Visit table
> (
> VisitID int,
> PersonID int,
> VisitDate datetime
> )
> These are simplified versions of my tables. I'm trying to create a
> view that gets the first time each person Visited:
> selectp.PersonID,
> min(v.VisitDate)
> fromVisit v
> joinPerson p on p.PersonID = v.PersonID
> group byp.PersonID
> The problem is that I would like to return the VisitID in the
> resultset, but when I do it expands the query since I have to also put
> it in the group by clause.
> What are the different ways to achieve this?
> Subqueries?
> Only return the date and then join off of date on the outside?
> Neither of these seem too entising...
> Thanks in advance for any help.
> -Dave|||Oops, make that third line:

on person.personid = visit.personid

But I'm sure you got the gist.

FN

fn wrote:

> How about:
> select person.*, visit.*
> from person left join visit
> on person.personid = visit.visitid
> and visit.visitdate =
> (select min(visitdate) from visit where personid = person.personid)
> FN
> malcolm wrote:
>> How does one get the primary key of the row that is joined in via a
>> group by aggregate clause when the aggregate is not performed on the
>> primary key?
>>
>> For example,
>>
>> Person table
>> (
>> PersonID int,
>> FirstName varchar(50)
>> LastName varchar(50)
>> )
>>
>>
>> Visit table
>> (
>> VisitID int,
>> PersonID int,
>> VisitDate datetime
>> )
>>
>> These are simplified versions of my tables. I'm trying to create a
>> view that gets the first time each person Visited:
>>
>> select p.PersonID,
>> min(v.VisitDate)
>> from Visit v
>> join Person p on p.PersonID = v.PersonID
>> group by p.PersonID
>>
>> The problem is that I would like to return the VisitID in the
>> resultset, but when I do it expands the query since I have to also put
>> it in the group by clause.
>>
>> What are the different ways to achieve this? Subqueries? Only return
>> the date and then join off of date on the outside?
>>
>> Neither of these seem too entising...
>>
>> Thanks in advance for any help.
>>
>> -Dave|||"malcolm" <chakachimp@.yahoo.com> wrote in message
news:4fe7c9e8.0406241624.18ca60ed@.posting.google.c om...
> How does one get the primary key of the row that is joined in via a
> group by aggregate clause when the aggregate is not performed on the
> primary key?
> For example,
> Person table
> (
> PersonID int,
> FirstName varchar(50)
> LastName varchar(50)
> )
>
> Visit table
> (
> VisitID int,
> PersonID int,
> VisitDate datetime
> )
> These are simplified versions of my tables. I'm trying to create a
> view that gets the first time each person Visited:
> select p.PersonID,
> min(v.VisitDate)
> from Visit v
> join Person p on p.PersonID = v.PersonID
> group by p.PersonID
> The problem is that I would like to return the VisitID in the
> resultset, but when I do it expands the query since I have to also put
> it in the group by clause.
> What are the different ways to achieve this?
> Subqueries?
> Only return the date and then join off of date on the outside?
> Neither of these seem too entising...
> Thanks in advance for any help.
> -Dave

SELECT P.PersonID, V1.VisitID, V1.VisitDate
FROM Persons AS P
LEFT OUTER JOIN
Visits AS V1
ON P.PersonID = V1.PersonID
LEFT OUTER JOIN
Visits AS V2
ON P.PersonID = V2.PersonID AND
V2.VisitDate < V1.VisitDate
WHERE V2.VisitDate IS NULL

--
JAG|||Please post DDL in the future. What you did post had no keys, too
many NULLs, the wrong datatypes (ever meet anyone with a fifty letter
first name? Only if they were named for a full Bible verse) and
singular table names. Is this what you meant?

CREATE TABLE Persons
(person_id INTEGER NOT NULL PRIMARY KEY, --assumption
first_name VARCHAR(15) NOT NULL, --USPS size name
last_name VARCHAR(15) NOT NULL); --USPS size name

Now I have to make assumptions about not having visits from unknown
people in my DRI.

CREATE TABLE Visits
(visit_nbr INTEGER NOT NULL PRIMARY KEY, --assumption
person_id INTEGER NOT NULL -- DRI assumption
REFERENCES Persons (person_id),
visit_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL -- default
assumption
);

>> I'm trying to create a view that gets the first time each person
visited .. The problem is that I would like to return the visit_id in
the resultset, <<

CREATE VIEW FirstVisits (person_id, visit_nbr, visit_date)
AS
SELECT V1.person_id, V1.visit_nbr, V1.visit_date
FROM Visits AS V1
WHERE V1.visit_date
= (SELECT MIN(v2.visit_date)
FROM Visits AS V2
WHERE V1.person_id = V2.person_id);

Yeah, yeah, I know it was quicky posting, but get in the habit of
doing it right all the time. Most DML problems come from bad DDL.|||You're right I did quickly slop together the question. What I posted
doesn't even come close to my example so I simply took too many
shortcuts in my post; I was simply trying to get to the point of my
question. Consider it DDL-UML ;)

The reason I didn't post working DDL is because I assumed someone
would have the answer off the top of thier head. Thanks for the
detailed response though.

-dave

jcelko212@.earthlink.net (--CELKO--) wrote in message news:<18c7b3c2.0406251250.13026daa@.posting.google.com>...
> Please post DDL in the future. What you did post had no keys, too
> many NULLs, the wrong datatypes (ever meet anyone with a fifty letter
> first name? Only if they were named for a full Bible verse) and
> singular table names. Is this what you meant?
> CREATE TABLE Persons
> (person_id INTEGER NOT NULL PRIMARY KEY, --assumption
> first_name VARCHAR(15) NOT NULL, --USPS size name
> last_name VARCHAR(15) NOT NULL); --USPS size name
> Now I have to make assumptions about not having visits from unknown
> people in my DRI.
> CREATE TABLE Visits
> (visit_nbr INTEGER NOT NULL PRIMARY KEY, --assumption
> person_id INTEGER NOT NULL -- DRI assumption
> REFERENCES Persons (person_id),
> visit_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL -- default
> assumption
> );
> >> I'm trying to create a view that gets the first time each person
> visited .. The problem is that I would like to return the visit_id in
> the resultset, <<
> CREATE VIEW FirstVisits (person_id, visit_nbr, visit_date)
> AS
> SELECT V1.person_id, V1.visit_nbr, V1.visit_date
> FROM Visits AS V1
> WHERE V1.visit_date
> = (SELECT MIN(v2.visit_date)
> FROM Visits AS V2
> WHERE V1.person_id = V2.person_id);
> Yeah, yeah, I know it was quicky posting, but get in the habit of
> doing it right all the time. Most DML problems come from bad DDL.|||Don't use a group by clause, use a subquery instead.

SELECT p.personID,v.VisitDate
FROM Person p,Visit v
WHERE p.PersonID = v.PersonID
and p.VisitDate = (SELECT MIN(v2.VisitDate)
FROM Visit v2
WHERE v2.PersonID = p.PersonID)

No comments:

Post a Comment