Monday, March 19, 2012

Ah, now we get to the reason

Your scenario is not all that uncommon. In fact,
I've been in several situations where clients refuse
to sign contracts if "their" data is mixed in with
someone else's. Not just for security reasons but
"mishaps" where someone runs an adhoc sql
statement could screw up "their" data or programming
bugs could more easily expose their data.
Granted, it is a risk either way but they
feel it is less likely to happen to them if their
data is separated at least at the database level.
If they are footing the bill, it is awfully hard
to turn them away...
That said, I'd suggest keeping your user id's,
client ids, and other generic info in a single master
database. GUID or uniqueidentifiers as the keys
versus bigint would be preferred. Depending
on your architecture, you might want either
a column or separate table that holds the
database info as to where the client's data is
held and related back to the client.
You could just hold the database name in
memory (static variable for desktop and session
for web).
This gives you the advantage of being able to easily
move one or more databases to new servers based
on activity. If 100 clients are rare users on the system,
they could all go to the same server. If 2 or 3 are heavy
users, they could go somewhere else.
Read up on the maximum number of databases per
instance of sql server to get a better idea of potential
hardware costs.
Robbe Morris - 2004/2005 Microsoft MVP C#
Earn money answering .NET Framework
messageboard posts at EggHeadCafe.com.
http://www.eggheadcafe.com/forums/merit.asp
"tshad" <tfs@.dslextreme.com> wrote in message
news:ev9vUtUpFHA.3996@.TK2MSFTNGP12.phx.gbl...
> "Stu" <stuart.ainsworth@.gmail.com> wrote in message
> news:1124507833.319732.67700@.o13g2000cwo.googlegroups.com...
> Good point.
>
> That makes sense, but my boss is concerned that if someone got past
> security, that all the data would be compromised. He feels that if each
> company had its own database and one companies data was compromised the
> security of the databases wouldn't be.
> He feels that the other companies would be more comfortable if they knew
> their data was separate from the others, especially after the other
> companies data had been broken into that were in the news in the last 6
> months.
> Tom
>"Robbe Morris [C# MVP]" <info@.eggheadcafe.com> wrote in message
news:e6QT7PYpFHA.3552@.TK2MSFTNGP10.phx.gbl...
> Your scenario is not all that uncommon. In fact,
> I've been in several situations where clients refuse
> to sign contracts if "their" data is mixed in with
> someone else's. Not just for security reasons but
> "mishaps" where someone runs an adhoc sql
> statement could screw up "their" data or programming
> bugs could more easily expose their data.
> Granted, it is a risk either way but they
> feel it is less likely to happen to them if their
> data is separated at least at the database level.
This was one of the things I was getting at when I posted the question
"Table theory question" a month ago.
I was curious if it was a problem to carry the CompanyCode and UserID down
the Table chain (bookings/passenger/tickets). If I have the CompanyCode in
the Booking table, should I also have it in the Passenger and Ticket tables
when I also have the BookingID in these tables. I assume this would break
"Normal Forms", but if all the records had the CompanyCodes in them, it
might be better security to allow companies to only see their data for
certain reports and queries (that may not be the case, but it was a
question).

> If they are footing the bill, it is awfully hard
> to turn them away...
Agreed.
And it does make things a little more complicated than having all in one
database. But what are you going to do.
One of the things we want to be able to sell is the ease of giving them
their data completely. They would be more confident if the data was in one
database that we could just give them, versus intermingled with other
companies data, that we would have to extract from.

> That said, I'd suggest keeping your user id's,
> client ids, and other generic info in a single master
> database. GUID or uniqueidentifiers as the keys
> versus bigint would be preferred.
Actually, I have already started changing the ID's to integers instead of
BigInts, but I am as to why GUID (uniqueidentifiers) would be
better than an identity with either (tinyint, int or bigint)? One of the
arguments I got against using a BigInt was the amount of space I was wasting
as well as the size of any index that it was included it. But the GUID is
twice the size of a Bigint and 4 times the size of an Int.
The only advantage I can see, is that the numbers would not be sequential
and harder to guess.

>Depending
> on your architecture, you might want either
> a column or separate table that holds the
> database info as to where the client's data is
> held and related back to the client.
> You could just hold the database name in
> memory (static variable for desktop and session
> for web).
> This gives you the advantage of being able to easily
> move one or more databases to new servers based
> on activity. If 100 clients are rare users on the system,
> they could all go to the same server. If 2 or 3 are heavy
> users, they could go somewhere else.
> Read up on the maximum number of databases per
> instance of sql server to get a better idea of potential
> hardware costs.
>
I will look into that. I am not sure that would be an issue, but it may be.
Also, would more databases cause a performance problem for Sql Server?
Thanks,
Tom
> --
> Robbe Morris - 2004/2005 Microsoft MVP C#
> Earn money answering .NET Framework
> messageboard posts at EggHeadCafe.com.
> http://www.eggheadcafe.com/forums/merit.asp
>
> "tshad" <tfs@.dslextreme.com> wrote in message
> news:ev9vUtUpFHA.3996@.TK2MSFTNGP12.phx.gbl...
>|||BTW, I would suggest using bigint for most
of your tables that will have large amounts of data in them. Space
is cheap. This way you can feel pretty safe that you won't
have problems in the future.
Why GUIDs? If your web developers expose certain
database keys in hidden fields, viewstate, or as a part of
urls, then it would be easy for users of your site to start
tinkering with ids at runtime. Your clients would easily spot
the security hole and perhaps close their accounts or
force you to fix the problem post production.
It is a whole lot harder to guess a GUID...
Robbe Morris - 2004/2005 Microsoft MVP C#
Earn money answering .NET Framework
messageboard posts at EggHeadCafe.com.
http://www.eggheadcafe.com/forums/merit.asp
"tshad" <tfs@.dslextreme.com> wrote in message
news:u8zx$dopFHA.3064@.TK2MSFTNGP15.phx.gbl...
> "Robbe Morris [C# MVP]" <info@.eggheadcafe.com> wrote in message
> news:e6QT7PYpFHA.3552@.TK2MSFTNGP10.phx.gbl...
> This was one of the things I was getting at when I posted the question
> "Table theory question" a month ago.
> I was curious if it was a problem to carry the CompanyCode and UserID down
> the Table chain (bookings/passenger/tickets). If I have the CompanyCode
> in
> the Booking table, should I also have it in the Passenger and Ticket
> tables
> when I also have the BookingID in these tables. I assume this would break
> "Normal Forms", but if all the records had the CompanyCodes in them, it
> might be better security to allow companies to only see their data for
> certain reports and queries (that may not be the case, but it was a
> question).
>
> Agreed.
> And it does make things a little more complicated than having all in one
> database. But what are you going to do.
> One of the things we want to be able to sell is the ease of giving them
> their data completely. They would be more confident if the data was in
> one
> database that we could just give them, versus intermingled with other
> companies data, that we would have to extract from.
>
> Actually, I have already started changing the ID's to integers instead of
> BigInts, but I am as to why GUID (uniqueidentifiers) would be
> better than an identity with either (tinyint, int or bigint)? One of the
> arguments I got against using a BigInt was the amount of space I was
> wasting
> as well as the size of any index that it was included it. But the GUID is
> twice the size of a Bigint and 4 times the size of an Int.
> The only advantage I can see, is that the numbers would not be sequential
> and harder to guess.
>
> I will look into that. I am not sure that would be an issue, but it may
> be.
> Also, would more databases cause a performance problem for Sql Server?
> Thanks,
> Tom
>|||"Robbe Morris [C# MVP]" <info@.eggheadcafe.com> wrote in message
news:e6SXxKppFHA.3380@.TK2MSFTNGP10.phx.gbl...
> BTW, I would suggest using bigint for most
> of your tables that will have large amounts of data in them. Space
> is cheap. This way you can feel pretty safe that you won't
> have problems in the future.
That was what I thought. But others in this group seem to think I was crazy
for doing this.

> Why GUIDs? If your web developers expose certain
> database keys in hidden fields, viewstate, or as a part of
> urls, then it would be easy for users of your site to start
> tinkering with ids at runtime. Your clients would easily spot
> the security hole and perhaps close their accounts or
> force you to fix the problem post production.
> It is a whole lot harder to guess a GUID...
>
So it would be more of a perception issue?
The other reason I was seeing about using GUIDs over Identities was a
problem of spanning tables across multiple servers (I really don't see that
happening), in that you could get multiple numbers this way - not sure why
this would happen.
Also, I see why hidden fields and urls would be a problem, but why
viewstate? I thought that was encoded?
Also, based on your security issue, which I agree with, I can see using the
GUID for a UserID and possibly a Company ID, but not to uniquify rows, such
as Passenger records or Ticket numbers.
I think I mentioned that one of the ways I would uniquify a row, was to use
an integer combined with the parent tables key to uniquify the row. For
example, in my travel scenario, I would use a 5 character alphanumeric
number for my booking number and add a 3 character numeric number. I would
start at 1. If the booking number was 7A331, the passenger records would be
7A331001, 7A331002 etc. I would store Passenger Uniquifier as an integer
and display it as a characters with leading 0's. But this meant the sql
statement to add the record would be something like:
****************************************
************************************
****************
SET ISOLATION LEVEL REPEATABLE READ
INSERT INTO ftsolutions.dbo.Pax (BookingID,PaxUnique, FirstName,LastName)
Select
@.BookingID,COALESCE(max(PaxUnique)+1,1),
@.FirstName, @.LastName
from ApplicantProfilePriorJobs where CompanyCode = @.Code and BookingID
= @.Booking ID
Select max(PaxUnique) from ApplicantProfilePriorJobs where CompanyCode =
@.Code and BookingID = @.Booking ID
****************************************
************************************
*****************
Even though the above record would be uniquified by the BookingID + the
PaxUnique - would it still be a good idea to have a surrogate key (identity
or GUID)? Some have stated that is is not a good idea to have a composite
key.
I started doing a lot of my child records this way, but then felt in most
cases, I didn't need a sequential key, such as for our tickets. So in that
case I would just do a normal select with an identity for the Ticket number,
which worked fine, since it had to be sequential (a GUID would not work
here).
Thanks,
Tom
> --
> Robbe Morris - 2004/2005 Microsoft MVP C#
> Earn money answering .NET Framework
> messageboard posts at EggHeadCafe.com.
> http://www.eggheadcafe.com/forums/merit.asp
>
> "tshad" <tfs@.dslextreme.com> wrote in message
> news:u8zx$dopFHA.3064@.TK2MSFTNGP15.phx.gbl...
down
break
of
the
is
sequential
or
your
databases.
in
the
6
>|||On Sun, 21 Aug 2005 16:57:25 -0400, Robbe Morris [C# MVP] wrote:

>BTW, I would suggest using bigint for most
>of your tables that will have large amounts of data in them. Space
>is cheap.
Hi Robbe,
Space is, speed isn't. The speed of most queries is determined by the
amount of data that has to be taken from disk to storage. Double the
size of a key (that is also included in any index!), and the number of
page reads will increase.

> This way you can feel pretty safe that you won't
>have problems in the future.
With int columns, the allowed range is from -2^31 (-2,147,483,648)
through 2^31 - 1 (2,147,483,647). That means that if you insert one row
per second, you'll run out of values in 136 years.
For databases with very high-speed insert rate that are estimated to be
running for lots of years - yes, bigint would be the better choice in
that case.

>Why GUIDs? If your web developers expose certain
>database keys in hidden fields, viewstate, or as a part of
>urls,
(snip)
then they should be fired. And those responsible for testing should be
fired as well.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||On Sun, 21 Aug 2005 20:30:24 -0700, tshad wrote:
(snip)
>Even though the above record would be uniquified by the BookingID + the
>PaxUnique - would it still be a good idea to have a surrogate key (identity
>or GUID)? Some have stated that is is not a good idea to have a composite
>key.
Hi Tom,
Composite keys are not bad. But there are situation where a surrogate
key can be beneficial. Just remember the basic rules: keep the "real"
key in your table as well, declare it either UNIQUE or PRIMARY KEY and
never show the surrogate key value to the users.
To decide if a surrogate key can be beneficial, consider:
1. Are many other tables referring to this table? A surrogate key
(assuming it's int -or bigint if you really expect horseloads of data)
takes less space than most rel keys. Of course, the size of the real key
should be taken into account as well - three varchar(300) columns is a
better candidate to get a surrogate key than three tinyint columns.
2. Are there many nonclustered indexes on the table? Each nonclustered
index includes the clustered index value as well. If you have the index
on the surrogate key clustered, many indexes will become smaller
3. How about the joins? Joining on a small integer column is faster than
joining on a composite key with icky datatypes. On the other hand,
having the "real" key values in related tables might also reduce the
number of joins required.
4. Finally: insert performance. If you add an IDENTITY as surrogate key
and define a clustered index on it, you'll reduce the number of page
splits on inserts. OTOH, if you choose GUID and use it in the surrogate
key, you'll have nothing but page splits (in other words: if you choose
GUID, don't make it the clustered index!).
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||"Hugo Kornelis" <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message
news:2blkg15nufj3oevciq7dme7v4vik058m88@.
4ax.com...
> On Sun, 21 Aug 2005 20:30:24 -0700, tshad wrote:
> (snip)
> Hi Tom,
> Composite keys are not bad. But there are situation where a surrogate
> key can be beneficial. Just remember the basic rules: keep the "real"
> key in your table as well, declare it either UNIQUE or PRIMARY KEY and
> never show the surrogate key value to the users.
>
In my case, for the Pax file, the real key would be the "BookID +
PaxUnique"?
Create Table(
BookingID char(6) Not Null,
PaxUnique int Not Null,
FirstName varChar(30),
LastName varChar(45)
) on Primary
INSERT INTO Pax (BookingID,PaxUnique, FirstName,LastName)
Select
@.BookingID,COALESCE(max(PaxUnique)+1,1),
@.FirstName, @.LastName
from Pax where CompanyCode = @.Code and BookingID = @.Booking ID
I assume that a surrogate key would not be necessary here based on size.
But I might use an int identity surrogate be if I was going to use this as
Foreign keys to 10 other tables as I would need to use 2 fields in each
table and if I had a surrogate, I would only need one integer.

> To decide if a surrogate key can be beneficial, consider:
> 1. Are many other tables referring to this table? A surrogate key
> (assuming it's int -or bigint if you really expect horseloads of data)
> takes less space than most rel keys. Of course, the size of the real key
> should be taken into account as well - three varchar(300) columns is a
> better candidate to get a surrogate key than three tinyint columns.
> 2. Are there many nonclustered indexes on the table? Each nonclustered
> index includes the clustered index value as well. If you have the index
> on the surrogate key clustered, many indexes will become smaller
> 3. How about the joins? Joining on a small integer column is faster than
> joining on a composite key with icky datatypes. On the other hand,
> having the "real" key values in related tables might also reduce the
> number of joins required.
This was my question about having company or user id in most of my tables.
Even though not necessary if one of the tables above it had these fields, it
would mean more joins everytime I was to run a query that needed the company
or user.

> 4. Finally: insert performance. If you add an IDENTITY as surrogate key
> and define a clustered index on it, you'll reduce the number of page
> splits on inserts. OTOH, if you choose GUID and use it in the surrogate
> key, you'll have nothing but page splits (in other words: if you choose
> GUID, don't make it the clustered index!).
Thanks,
Tom
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)|||On Mon, 22 Aug 2005 16:39:31 -0700, tshad wrote:
(snip)
>In my case, for the Pax file, the real key would be the "BookID +
>PaxUnique"?
>Create Table(
> BookingID char(6) Not Null,
> PaxUnique int Not Null,
> FirstName varChar(30),
> LastName varChar(45)
> ) on Primary
Hi Tom,
I don't know your data as good as you do, but I guess that this would
indeed be the real key. You probably can't be sure that BookingID +
FirstName + LastName is unique.

>INSERT INTO Pax (BookingID,PaxUnique, FirstName,LastName)
>Select
> @.BookingID,COALESCE(max(PaxUnique)+1,1),
@.FirstName, @.LastName
> from Pax where CompanyCode = @.Code and BookingID = @.Booking ID
I'm not sure where this CompanyCode column that you're referring to
comes from. If the key is on BookingID + PaxUnique and you have to find
the next PaxUnique value, leave out the CompanyCodeand use:
INSERT INTO Pax (BookingID,PaxUnique, FirstName,LastName)
SELECT @.BookingID, COALESCE(MAX(PaxUnique)+1,1), @.FirstName, @.LastName
FROM Pax
WHERE BookingID = @.Booking ID
If the clustered index is on (BookingID, PaxUnique), this should be
about as fast as it gets too.

>I assume that a surrogate key would not be necessary here based on size.
Yes.

>But I might use an int identity surrogate be if I was going to use this as
>Foreign keys to 10 other tables as I would need to use 2 fields in each
>table and if I had a surrogate, I would only need one integer.
Indeed. That makes it easier for you to type the JOIN conditions, and it
will speed up the joins as well. But for reports that require you to
show the BookingID but not the names, the surrogate key would introduce
the need for an extra join that would not be needed if you use the real
key only.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)

No comments:

Post a Comment