Showing posts with label app. Show all posts
Showing posts with label app. Show all posts

Monday, March 19, 2012

Database Encryption - Employee performance review

Hi,
I am writing an .NET 2.0 app that has different users logging in who
have different access levels in the app. One of the user roles is "HR".
When the user logs in with these credentials, they have a whole heap of
buttons/links/pages related to HR stuff that normal team members don't.
One of the sensitive information that needs to be displayed and more
importantly, stored in the database is the employee performance reviews
(HR access only).
I am wondering how I would go about storing/maintaining this
information as even the DBA's are not supposed to have access to this
information, but should still be able to administer it and/or retrieve
the encryption keys if the HR personnel forget their passwords etc.
The simplest way I can think of is to issue all the HR people with one
password, encrypt the data in the DB layer (in code) and then store it
in a table in the database. This way, anyone with access to the
Database won't necessarily have access to the data. The problem with
this one is that all the HR personnel share one password, which doesn't
seem like a nice (secure) solution.
The other option was to store the encryption key in the database and
encrypt the key itself with the password provided by the individual
user (HR person), which will be stored against that user's record. This
way, whenever a data request is made, the database sends the encrypted
performance review data and the password encrypted key to the user. The
user will then decrypt the key using their password, and then decrypt
the data using the key. This way, if the user ever forgets their
password, all the DBA (and/or App with owner acess) has to do is
reencrypt the key with a new user password and store it against the
user's record. The problem with this is storing the key in the database
and yet restricting access to the DBA. They should not have raw access
to the key (which essentially means they can see the data in the
database), but still should have enough access that if a user forgets
their password, the DBA can reset the password without losing all the
existing data.
I was thinking of storing the key on a key server somewhere with
different access rights etc. but that means the system starts to get
complex.
How would you guys go about building such a system? Does MS Sql Server
2005 provide any mechanisms for such functionality? Plus, as a design
issue, is it better to let the database handle the encryption, key
management, roles/privileges or is it better to do it in the
application itself? One of the advantages I can think of for doing it
in the application is that it gives a bit of database independence, i
can change the underlying database easily without having to rewrite a
whole heap of functionality. The other reason is that this way, i won't
have unencrypted data or keys on wire.
Looking forward to some expert opinions and discussions regarding this
problem.
Thanks in advance.Hello? Anyone? Surely this is not such a hard problem, or particularly
unusual. I would guess there are many times when you would want to
encrypt the data in a database, so as to make it inaccessible to the
DBA's, but still give them some administrative privileges. One of the
constraints is that the user with access to the data doesn't
necessarily "own" the data, so that if they forget the password (i.e
lose the encryption key), all the data is not lost.
On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
> Hi,
> I am writing an .NET 2.0 app that has different users logging in who
> have different access levels in the app. One of the user roles is "HR".
> When the user logs in with these credentials, they have a whole heap of
> buttons/links/pages related to HR stuff that normal team members don't.
> One of the sensitive information that needs to be displayed and more
> importantly, stored in the database is the employee performance reviews
> (HR access only).
> I am wondering how I would go about storing/maintaining this
> information as even the DBA's are not supposed to have access to this
> information, but should still be able to administer it and/or retrieve
> the encryption keys if the HR personnel forget their passwords etc.
> The simplest way I can think of is to issue all the HR people with one
> password, encrypt the data in the DB layer (in code) and then store it
> in a table in the database. This way, anyone with access to the
> Database won't necessarily have access to the data. The problem with
> this one is that all the HR personnel share one password, which doesn't
> seem like a nice (secure) solution.
> The other option was to store the encryption key in the database and
> encrypt the key itself with the password provided by the individual
> user (HR person), which will be stored against that user's record. This
> way, whenever a data request is made, the database sends the encrypted
> performance review data and the password encrypted key to the user. The
> user will then decrypt the key using their password, and then decrypt
> the data using the key. This way, if the user ever forgets their
> password, all the DBA (and/or App with owner acess) has to do is
> reencrypt the key with a new user password and store it against the
> user's record. The problem with this is storing the key in the database
> and yet restricting access to the DBA. They should not have raw access
> to the key (which essentially means they can see the data in the
> database), but still should have enough access that if a user forgets
> their password, the DBA can reset the password without losing all the
> existing data.
> I was thinking of storing the key on a key server somewhere with
> different access rights etc. but that means the system starts to get
> complex.
> How would you guys go about building such a system? Does MS Sql Server
> 2005 provide any mechanisms for such functionality? Plus, as a design
> issue, is it better to let the database handle the encryption, key
> management, roles/privileges or is it better to do it in the
> application itself? One of the advantages I can think of for doing it
> in the application is that it gives a bit of database independence, i
> can change the underlying database easily without having to rewrite a
> whole heap of functionality. The other reason is that this way, i won't
> have unencrypted data or keys on wire.
> Looking forward to some expert opinions and discussions regarding this
> problem.
> Thanks in advance.|||The key (ahem) here is this:
>> Whoever has access to the decryption key has access to the data. <<
Plain and simple. If the DBAs can read the decryption key, they can
read the data. So the statement "but should still be able to administer
it and/or retrieve the encryption keys if the HR personnel forget their
passwords etc." is what I call a "non-starter". By definition, the DBAs
have access to the data.
You're right on one point though. If all people share access to the
same password or decryption key, you have an insecure solution. Think
of it this way -- the bigger the "secret" is, and the more people who
have access to that secret, the less secure the system.
Encrypting the key doesn't work either because you have to encrypt the
key with another key. Where are you going to put that key? Are the
DBAs going to have access to it too?
Here's my take on it:
Carefully analyze what is and what is not a secret in the system. Only
encrypt the real secrets. Limit the access to the database to the
smallest set of people possible. If you don't want the DBAs to have
access to the encryption key, that by definition means you have to store
the key outside the database. I recommend that you use .NET's
encryption routines to encrypt your secrets *outside* of the database,
and simply store the data in the system once it's encrypted.
So it boils down to this. SOMEBODY needs access to the keys. They need
to exist somewhere (even if it's not on your SQL Server). The
administrator of that machine will probably be able to read the key.
Deny that person read access to the database. That person needs to be
trustworthy, too. If he/she is not, they can share the key with an
untrustworthy DBA and get your data by writing a small program.
HTH
-Dave
nyathancha@.hotmail.com wrote:
> Hello? Anyone? Surely this is not such a hard problem, or particularly
> unusual. I would guess there are many times when you would want to
> encrypt the data in a database, so as to make it inaccessible to the
> DBA's, but still give them some administrative privileges. One of the
> constraints is that the user with access to the data doesn't
> necessarily "own" the data, so that if they forget the password (i.e
> lose the encryption key), all the data is not lost.
>
>
>
> On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
>> Hi,
>> I am writing an .NET 2.0 app that has different users logging in who
>> have different access levels in the app. One of the user roles is "HR".
>> When the user logs in with these credentials, they have a whole heap of
>> buttons/links/pages related to HR stuff that normal team members don't.
>> One of the sensitive information that needs to be displayed and more
>> importantly, stored in the database is the employee performance reviews
>> (HR access only).
>> I am wondering how I would go about storing/maintaining this
>> information as even the DBA's are not supposed to have access to this
>> information, but should still be able to administer it and/or retrieve
>> the encryption keys if the HR personnel forget their passwords etc.
>> The simplest way I can think of is to issue all the HR people with one
>> password, encrypt the data in the DB layer (in code) and then store it
>> in a table in the database. This way, anyone with access to the
>> Database won't necessarily have access to the data. The problem with
>> this one is that all the HR personnel share one password, which doesn't
>> seem like a nice (secure) solution.
>> The other option was to store the encryption key in the database and
>> encrypt the key itself with the password provided by the individual
>> user (HR person), which will be stored against that user's record. This
>> way, whenever a data request is made, the database sends the encrypted
>> performance review data and the password encrypted key to the user. The
>> user will then decrypt the key using their password, and then decrypt
>> the data using the key. This way, if the user ever forgets their
>> password, all the DBA (and/or App with owner acess) has to do is
>> reencrypt the key with a new user password and store it against the
>> user's record. The problem with this is storing the key in the database
>> and yet restricting access to the DBA. They should not have raw access
>> to the key (which essentially means they can see the data in the
>> database), but still should have enough access that if a user forgets
>> their password, the DBA can reset the password without losing all the
>> existing data.
>> I was thinking of storing the key on a key server somewhere with
>> different access rights etc. but that means the system starts to get
>> complex.
>> How would you guys go about building such a system? Does MS Sql Server
>> 2005 provide any mechanisms for such functionality? Plus, as a design
>> issue, is it better to let the database handle the encryption, key
>> management, roles/privileges or is it better to do it in the
>> application itself? One of the advantages I can think of for doing it
>> in the application is that it gives a bit of database independence, i
>> can change the underlying database easily without having to rewrite a
>> whole heap of functionality. The other reason is that this way, i won't
>> have unencrypted data or keys on wire.
>> Looking forward to some expert opinions and discussions regarding this
>> problem.
>> Thanks in advance.
>
-Dave Markle
http://www.markleconsulting.com/blog|||Hi Dave, Thanks for the reply.
I was thinking along the same lines as well. I couldn't think of anyway
of storing the key in the database and still restricting access to the
DBA. Which is why I mentioned in my original post the "key server
somewhere with
different access rights" . Unless sql server has some magic
features/functionality that I don't know about.
So how hard would it be to build/deploy a key server? Maybe this is
more of an ecryption/security question rather than a database question.
My idea of encrypting the key with another key was that the second
"key" would be the user's password. This way the DBA doesn't have
access to the original data encryption key even if its stored in the
database (because it can only be decrypted using the user's password),
but if the user does forget their password, the data is not lost
because it wasn't encrypted with their password(/key). The DBA can
"reset" their password by reencrypting the orginal data key with a new
user password key and storing it against the column. Which is where I
came to the key server. The fact that the DBA should be able to get the
data key encrypted with a user password, but not the raw data key
itself. I can't think of anyway of doing this without an independent
application.
Does anyone know any commercial encryption key storage and distribution
solutions/products?
I am sure at some stage the data encryption key will itself have to be
encrypted before being put on the wire. Other wise you would have the
encrypted data coming from the database, the plain encryption key
coming from the key server. Doesn't seem particularly secure.
What is the general industry solution to a situation like this, where
you want to store sensitive data in a database, but the data is
accessible (i.e belongs to) to more than one person (so that if that
one person forgets their password/key, the data itself is not lost)
On Jan 25, 1:28 pm, Dave Markle <"dma[remove_ZZ]ZZrkle"@.gmail.dot.com>
wrote:
> The key (ahem) here is this:
> >> Whoever has access to the decryption key has access to the data. <<
> Plain and simple. If the DBAs can read the decryption key, they can
> read the data. So the statement "but should still be able to administer
> it and/or retrieve the encryption keys if the HR personnel forget their
> passwords etc." is what I call a "non-starter". By definition, the DBAs
> have access to the data.
> You're right on one point though. If all people share access to the
> same password or decryption key, you have an insecure solution. Think
> of it this way -- the bigger the "secret" is, and the more people who
> have access to that secret, the less secure the system.
> Encrypting the key doesn't work either because you have to encrypt the
> key with another key. Where are you going to put that key? Are the
> DBAs going to have access to it too?
> Here's my take on it:
> Carefully analyze what is and what is not a secret in the system. Only
> encrypt the real secrets. Limit the access to the database to the
> smallest set of people possible. If you don't want the DBAs to have
> access to the encryption key, that by definition means you have to store
> the key outside the database. I recommend that you use .NET's
> encryption routines to encrypt your secrets *outside* of the database,
> and simply store the data in the system once it's encrypted.
> So it boils down to this. SOMEBODY needs access to the keys. They need
> to exist somewhere (even if it's not on your SQL Server). The
> administrator of that machine will probably be able to read the key.
> Deny that person read access to the database. That person needs to be
> trustworthy, too. If he/she is not, they can share the key with an
> untrustworthy DBA and get your data by writing a small program.
> HTH
> -Dave
>
> nyathan...@.hotmail.com wrote:
> > Hello? Anyone? Surely this is not such a hard problem, or particularly
> > unusual. I would guess there are many times when you would want to
> > encrypt the data in a database, so as to make it inaccessible to the
> > DBA's, but still give them some administrative privileges. One of the
> > constraints is that the user with access to the data doesn't
> > necessarily "own" the data, so that if they forget the password (i.e
> > lose the encryption key), all the data is not lost.
> > On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
> >> Hi,
> >> I am writing an .NET 2.0 app that has different users logging in who
> >> have different access levels in the app. One of the user roles is "HR".
> >> When the user logs in with these credentials, they have a whole heap of
> >> buttons/links/pages related to HR stuff that normal team members don't.
> >> One of the sensitive information that needs to be displayed and more
> >> importantly, stored in the database is the employee performance reviews
> >> (HR access only).
> >> I am wondering how I would go about storing/maintaining this
> >> information as even the DBA's are not supposed to have access to this
> >> information, but should still be able to administer it and/or retrieve
> >> the encryption keys if the HR personnel forget their passwords etc.
> >> The simplest way I can think of is to issue all the HR people with one
> >> password, encrypt the data in the DB layer (in code) and then store it
> >> in a table in the database. This way, anyone with access to the
> >> Database won't necessarily have access to the data. The problem with
> >> this one is that all the HR personnel share one password, which doesn't
> >> seem like a nice (secure) solution.
> >> The other option was to store the encryption key in the database and
> >> encrypt the key itself with the password provided by the individual
> >> user (HR person), which will be stored against that user's record. This
> >> way, whenever a data request is made, the database sends the encrypted
> >> performance review data and the password encrypted key to the user. The
> >> user will then decrypt the key using their password, and then decrypt
> >> the data using the key. This way, if the user ever forgets their
> >> password, all the DBA (and/or App with owner acess) has to do is
> >> reencrypt the key with a new user password and store it against the
> >> user's record. The problem with this is storing the key in the database
> >> and yet restricting access to the DBA. They should not have raw access
> >> to the key (which essentially means they can see the data in the
> >> database), but still should have enough access that if a user forgets
> >> their password, the DBA can reset the password without losing all the
> >> existing data.
> >> I was thinking of storing the key on a key server somewhere with
> >> different access rights etc. but that means the system starts to get
> >> complex.
> >> How would you guys go about building such a system? Does MS Sql Server
> >> 2005 provide any mechanisms for such functionality? Plus, as a design
> >> issue, is it better to let the database handle the encryption, key
> >> management, roles/privileges or is it better to do it in the
> >> application itself? One of the advantages I can think of for doing it
> >> in the application is that it gives a bit of database independence, i
> >> can change the underlying database easily without having to rewrite a
> >> whole heap of functionality. The other reason is that this way, i won't
> >> have unencrypted data or keys on wire.
> >> Looking forward to some expert opinions and discussions regarding this
> >> problem.
> >> Thanks in advance.--
> -Dave Markle
> http://www.markleconsulting.com/blog

Database Encryption - Employee performance review

Hi,
I am writing an .NET 2.0 app that has different users logging in who
have different access levels in the app. One of the user roles is "HR".
When the user logs in with these credentials, they have a whole heap of
buttons/links/pages related to HR stuff that normal team members don't.
One of the sensitive information that needs to be displayed and more
importantly, stored in the database is the employee performance reviews
(HR access only).
I am wondering how I would go about storing/maintaining this
information as even the DBA's are not supposed to have access to this
information, but should still be able to administer it and/or retrieve
the encryption keys if the HR personnel forget their passwords etc.
The simplest way I can think of is to issue all the HR people with one
password, encrypt the data in the DB layer (in code) and then store it
in a table in the database. This way, anyone with access to the
Database won't necessarily have access to the data. The problem with
this one is that all the HR personnel share one password, which doesn't
seem like a nice (secure) solution.
The other option was to store the encryption key in the database and
encrypt the key itself with the password provided by the individual
user (HR person), which will be stored against that user's record. This
way, whenever a data request is made, the database sends the encrypted
performance review data and the password encrypted key to the user. The
user will then decrypt the key using their password, and then decrypt
the data using the key. This way, if the user ever forgets their
password, all the DBA (and/or App with owner acess) has to do is
reencrypt the key with a new user password and store it against the
user's record. The problem with this is storing the key in the database
and yet restricting access to the DBA. They should not have raw access
to the key (which essentially means they can see the data in the
database), but still should have enough access that if a user forgets
their password, the DBA can reset the password without losing all the
existing data.
I was thinking of storing the key on a key server somewhere with
different access rights etc. but that means the system starts to get
complex.
How would you guys go about building such a system? Does MS Sql Server
2005 provide any mechanisms for such functionality? Plus, as a design
issue, is it better to let the database handle the encryption, key
management, roles/privileges or is it better to do it in the
application itself? One of the advantages I can think of for doing it
in the application is that it gives a bit of database independence, i
can change the underlying database easily without having to rewrite a
whole heap of functionality. The other reason is that this way, i won't
have unencrypted data or keys on wire.
Looking forward to some expert opinions and discussions regarding this
problem.
Thanks in advance.
Hello? Anyone? Surely this is not such a hard problem, or particularly
unusual. I would guess there are many times when you would want to
encrypt the data in a database, so as to make it inaccessible to the
DBA's, but still give them some administrative privileges. One of the
constraints is that the user with access to the data doesn't
necessarily "own" the data, so that if they forget the password (i.e
lose the encryption key), all the data is not lost.
On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
> Hi,
> I am writing an .NET 2.0 app that has different users logging in who
> have different access levels in the app. One of the user roles is "HR".
> When the user logs in with these credentials, they have a whole heap of
> buttons/links/pages related to HR stuff that normal team members don't.
> One of the sensitive information that needs to be displayed and more
> importantly, stored in the database is the employee performance reviews
> (HR access only).
> I am wondering how I would go about storing/maintaining this
> information as even the DBA's are not supposed to have access to this
> information, but should still be able to administer it and/or retrieve
> the encryption keys if the HR personnel forget their passwords etc.
> The simplest way I can think of is to issue all the HR people with one
> password, encrypt the data in the DB layer (in code) and then store it
> in a table in the database. This way, anyone with access to the
> Database won't necessarily have access to the data. The problem with
> this one is that all the HR personnel share one password, which doesn't
> seem like a nice (secure) solution.
> The other option was to store the encryption key in the database and
> encrypt the key itself with the password provided by the individual
> user (HR person), which will be stored against that user's record. This
> way, whenever a data request is made, the database sends the encrypted
> performance review data and the password encrypted key to the user. The
> user will then decrypt the key using their password, and then decrypt
> the data using the key. This way, if the user ever forgets their
> password, all the DBA (and/or App with owner acess) has to do is
> reencrypt the key with a new user password and store it against the
> user's record. The problem with this is storing the key in the database
> and yet restricting access to the DBA. They should not have raw access
> to the key (which essentially means they can see the data in the
> database), but still should have enough access that if a user forgets
> their password, the DBA can reset the password without losing all the
> existing data.
> I was thinking of storing the key on a key server somewhere with
> different access rights etc. but that means the system starts to get
> complex.
> How would you guys go about building such a system? Does MS Sql Server
> 2005 provide any mechanisms for such functionality? Plus, as a design
> issue, is it better to let the database handle the encryption, key
> management, roles/privileges or is it better to do it in the
> application itself? One of the advantages I can think of for doing it
> in the application is that it gives a bit of database independence, i
> can change the underlying database easily without having to rewrite a
> whole heap of functionality. The other reason is that this way, i won't
> have unencrypted data or keys on wire.
> Looking forward to some expert opinions and discussions regarding this
> problem.
> Thanks in advance.
|||The key (ahem) here is this:
[vbcol=seagreen]
Plain and simple. If the DBAs can read the decryption key, they can
read the data. So the statement "but should still be able to administer
it and/or retrieve the encryption keys if the HR personnel forget their
passwords etc." is what I call a "non-starter". By definition, the DBAs
have access to the data.
You're right on one point though. If all people share access to the
same password or decryption key, you have an insecure solution. Think
of it this way -- the bigger the "secret" is, and the more people who
have access to that secret, the less secure the system.
Encrypting the key doesn't work either because you have to encrypt the
key with another key. Where are you going to put that key? Are the
DBAs going to have access to it too?
Here's my take on it:
Carefully analyze what is and what is not a secret in the system. Only
encrypt the real secrets. Limit the access to the database to the
smallest set of people possible. If you don't want the DBAs to have
access to the encryption key, that by definition means you have to store
the key outside the database. I recommend that you use .NET's
encryption routines to encrypt your secrets *outside* of the database,
and simply store the data in the system once it's encrypted.
So it boils down to this. SOMEBODY needs access to the keys. They need
to exist somewhere (even if it's not on your SQL Server). The
administrator of that machine will probably be able to read the key.
Deny that person read access to the database. That person needs to be
trustworthy, too. If he/she is not, they can share the key with an
untrustworthy DBA and get your data by writing a small program.
HTH
-Dave
nyathancha@.hotmail.com wrote:
> Hello? Anyone? Surely this is not such a hard problem, or particularly
> unusual. I would guess there are many times when you would want to
> encrypt the data in a database, so as to make it inaccessible to the
> DBA's, but still give them some administrative privileges. One of the
> constraints is that the user with access to the data doesn't
> necessarily "own" the data, so that if they forget the password (i.e
> lose the encryption key), all the data is not lost.
>
>
>
> On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
>
-Dave Markle
http://www.markleconsulting.com/blog
|||Hi Dave, Thanks for the reply.
I was thinking along the same lines as well. I couldn't think of anyway
of storing the key in the database and still restricting access to the
DBA. Which is why I mentioned in my original post the "key server
somewhere with
different access rights" . Unless sql server has some magic
features/functionality that I don't know about.
So how hard would it be to build/deploy a key server? Maybe this is
more of an ecryption/security question rather than a database question.
My idea of encrypting the key with another key was that the second
"key" would be the user's password. This way the DBA doesn't have
access to the original data encryption key even if its stored in the
database (because it can only be decrypted using the user's password),
but if the user does forget their password, the data is not lost
because it wasn't encrypted with their password(/key). The DBA can
"reset" their password by reencrypting the orginal data key with a new
user password key and storing it against the column. Which is where I
came to the key server. The fact that the DBA should be able to get the
data key encrypted with a user password, but not the raw data key
itself. I can't think of anyway of doing this without an independent
application.
Does anyone know any commercial encryption key storage and distribution
solutions/products?
I am sure at some stage the data encryption key will itself have to be
encrypted before being put on the wire. Other wise you would have the
encrypted data coming from the database, the plain encryption key
coming from the key server. Doesn't seem particularly secure.
What is the general industry solution to a situation like this, where
you want to store sensitive data in a database, but the data is
accessible (i.e belongs to) to more than one person (so that if that
one person forgets their password/key, the data itself is not lost)
On Jan 25, 1:28 pm, Dave Markle <"dma[remove_ZZ]ZZrkle"@.gmail.dot.com>
wrote:
> The key (ahem) here is this:
>
> Plain and simple. If the DBAs can read the decryption key, they can
> read the data. So the statement "but should still be able to administer
> it and/or retrieve the encryption keys if the HR personnel forget their
> passwords etc." is what I call a "non-starter". By definition, the DBAs
> have access to the data.
> You're right on one point though. If all people share access to the
> same password or decryption key, you have an insecure solution. Think
> of it this way -- the bigger the "secret" is, and the more people who
> have access to that secret, the less secure the system.
> Encrypting the key doesn't work either because you have to encrypt the
> key with another key. Where are you going to put that key? Are the
> DBAs going to have access to it too?
> Here's my take on it:
> Carefully analyze what is and what is not a secret in the system. Only
> encrypt the real secrets. Limit the access to the database to the
> smallest set of people possible. If you don't want the DBAs to have
> access to the encryption key, that by definition means you have to store
> the key outside the database. I recommend that you use .NET's
> encryption routines to encrypt your secrets *outside* of the database,
> and simply store the data in the system once it's encrypted.
> So it boils down to this. SOMEBODY needs access to the keys. They need
> to exist somewhere (even if it's not on your SQL Server). The
> administrator of that machine will probably be able to read the key.
> Deny that person read access to the database. That person needs to be
> trustworthy, too. If he/she is not, they can share the key with an
> untrustworthy DBA and get your data by writing a small program.
> HTH
> -Dave
>
> nyathan...@.hotmail.com wrote:
>
>
>
>
> -Dave Markle
> http://www.markleconsulting.com/blog

Database Encryption - Employee performance review

Hi,
I am writing an .NET 2.0 app that has different users logging in who
have different access levels in the app. One of the user roles is "HR".
When the user logs in with these credentials, they have a whole heap of
buttons/links/pages related to HR stuff that normal team members don't.
One of the sensitive information that needs to be displayed and more
importantly, stored in the database is the employee performance reviews
(HR access only).
I am wondering how I would go about storing/maintaining this
information as even the DBA's are not supposed to have access to this
information, but should still be able to administer it and/or retrieve
the encryption keys if the HR personnel forget their passwords etc.
The simplest way I can think of is to issue all the HR people with one
password, encrypt the data in the DB layer (in code) and then store it
in a table in the database. This way, anyone with access to the
Database won't necessarily have access to the data. The problem with
this one is that all the HR personnel share one password, which doesn't
seem like a nice (secure) solution.
The other option was to store the encryption key in the database and
encrypt the key itself with the password provided by the individual
user (HR person), which will be stored against that user's record. This
way, whenever a data request is made, the database sends the encrypted
performance review data and the password encrypted key to the user. The
user will then decrypt the key using their password, and then decrypt
the data using the key. This way, if the user ever forgets their
password, all the DBA (and/or App with owner acess) has to do is
reencrypt the key with a new user password and store it against the
user's record. The problem with this is storing the key in the database
and yet restricting access to the DBA. They should not have raw access
to the key (which essentially means they can see the data in the
database), but still should have enough access that if a user forgets
their password, the DBA can reset the password without losing all the
existing data.
I was thinking of storing the key on a key server somewhere with
different access rights etc. but that means the system starts to get
complex.
How would you guys go about building such a system? Does MS Sql Server
2005 provide any mechanisms for such functionality? Plus, as a design
issue, is it better to let the database handle the encryption, key
management, roles/privileges or is it better to do it in the
application itself? One of the advantages I can think of for doing it
in the application is that it gives a bit of database independence, i
can change the underlying database easily without having to rewrite a
whole heap of functionality. The other reason is that this way, i won't
have unencrypted data or keys on wire.
Looking forward to some expert opinions and discussions regarding this
problem.
Thanks in advance.Hello? Anyone? Surely this is not such a hard problem, or particularly
unusual. I would guess there are many times when you would want to
encrypt the data in a database, so as to make it inaccessible to the
DBA's, but still give them some administrative privileges. One of the
constraints is that the user with access to the data doesn't
necessarily "own" the data, so that if they forget the password (i.e
lose the encryption key), all the data is not lost.
On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
> Hi,
> I am writing an .NET 2.0 app that has different users logging in who
> have different access levels in the app. One of the user roles is "HR".
> When the user logs in with these credentials, they have a whole heap of
> buttons/links/pages related to HR stuff that normal team members don't.
> One of the sensitive information that needs to be displayed and more
> importantly, stored in the database is the employee performance reviews
> (HR access only).
> I am wondering how I would go about storing/maintaining this
> information as even the DBA's are not supposed to have access to this
> information, but should still be able to administer it and/or retrieve
> the encryption keys if the HR personnel forget their passwords etc.
> The simplest way I can think of is to issue all the HR people with one
> password, encrypt the data in the DB layer (in code) and then store it
> in a table in the database. This way, anyone with access to the
> Database won't necessarily have access to the data. The problem with
> this one is that all the HR personnel share one password, which doesn't
> seem like a nice (secure) solution.
> The other option was to store the encryption key in the database and
> encrypt the key itself with the password provided by the individual
> user (HR person), which will be stored against that user's record. This
> way, whenever a data request is made, the database sends the encrypted
> performance review data and the password encrypted key to the user. The
> user will then decrypt the key using their password, and then decrypt
> the data using the key. This way, if the user ever forgets their
> password, all the DBA (and/or App with owner acess) has to do is
> reencrypt the key with a new user password and store it against the
> user's record. The problem with this is storing the key in the database
> and yet restricting access to the DBA. They should not have raw access
> to the key (which essentially means they can see the data in the
> database), but still should have enough access that if a user forgets
> their password, the DBA can reset the password without losing all the
> existing data.
> I was thinking of storing the key on a key server somewhere with
> different access rights etc. but that means the system starts to get
> complex.
> How would you guys go about building such a system? Does MS Sql Server
> 2005 provide any mechanisms for such functionality? Plus, as a design
> issue, is it better to let the database handle the encryption, key
> management, roles/privileges or is it better to do it in the
> application itself? One of the advantages I can think of for doing it
> in the application is that it gives a bit of database independence, i
> can change the underlying database easily without having to rewrite a
> whole heap of functionality. The other reason is that this way, i won't
> have unencrypted data or keys on wire.
> Looking forward to some expert opinions and discussions regarding this
> problem.
> Thanks in advance.|||The key (ahem) here is this:

Plain and simple. If the DBAs can read the decryption key, they can
read the data. So the statement "but should still be able to administer
it and/or retrieve the encryption keys if the HR personnel forget their
passwords etc." is what I call a "non-starter". By definition, the DBAs
have access to the data.
You're right on one point though. If all people share access to the
same password or decryption key, you have an insecure solution. Think
of it this way -- the bigger the "secret" is, and the more people who
have access to that secret, the less secure the system.
Encrypting the key doesn't work either because you have to encrypt the
key with another key. Where are you going to put that key? Are the
DBAs going to have access to it too?
Here's my take on it:
Carefully analyze what is and what is not a secret in the system. Only
encrypt the real secrets. Limit the access to the database to the
smallest set of people possible. If you don't want the DBAs to have
access to the encryption key, that by definition means you have to store
the key outside the database. I recommend that you use .NET's
encryption routines to encrypt your secrets *outside* of the database,
and simply store the data in the system once it's encrypted.
So it boils down to this. SOMEBODY needs access to the keys. They need
to exist somewhere (even if it's not on your SQL Server). The
administrator of that machine will probably be able to read the key.
Deny that person read access to the database. That person needs to be
trustworthy, too. If he/she is not, they can share the key with an
untrustworthy DBA and get your data by writing a small program.
HTH
-Dave
nyathancha@.hotmail.com wrote:[vbcol=seagreen]
> Hello? Anyone? Surely this is not such a hard problem, or particularly
> unusual. I would guess there are many times when you would want to
> encrypt the data in a database, so as to make it inaccessible to the
> DBA's, but still give them some administrative privileges. One of the
> constraints is that the user with access to the data doesn't
> necessarily "own" the data, so that if they forget the password (i.e
> lose the encryption key), all the data is not lost.
>
>
>
> On Jan 22, 6:37 pm, nyathan...@.hotmail.com wrote:
>
-Dave Markle
http://www.markleconsulting.com/blog|||Hi Dave, Thanks for the reply.
I was thinking along the same lines as well. I couldn't think of anyway
of storing the key in the database and still restricting access to the
DBA. Which is why I mentioned in my original post the "key server
somewhere with
different access rights" . Unless sql server has some magic
features/functionality that I don't know about.
So how hard would it be to build/deploy a key server? Maybe this is
more of an ecryption/security question rather than a database question.
My idea of encrypting the key with another key was that the second
"key" would be the user's password. This way the DBA doesn't have
access to the original data encryption key even if its stored in the
database (because it can only be decrypted using the user's password),
but if the user does forget their password, the data is not lost
because it wasn't encrypted with their password(/key). The DBA can
"reset" their password by reencrypting the orginal data key with a new
user password key and storing it against the column. Which is where I
came to the key server. The fact that the DBA should be able to get the
data key encrypted with a user password, but not the raw data key
itself. I can't think of anyway of doing this without an independent
application.
Does anyone know any commercial encryption key storage and distribution
solutions/products?
I am sure at some stage the data encryption key will itself have to be
encrypted before being put on the wire. Other wise you would have the
encrypted data coming from the database, the plain encryption key
coming from the key server. Doesn't seem particularly secure.
What is the general industry solution to a situation like this, where
you want to store sensitive data in a database, but the data is
accessible (i.e belongs to) to more than one person (so that if that
one person forgets their password/key, the data itself is not lost)
On Jan 25, 1:28 pm, Dave Markle <"dma[remove_ZZ]ZZrkle"@.gmail.dot.com>
wrote:
> The key (ahem) here is this:
>
> Plain and simple. If the DBAs can read the decryption key, they can
> read the data. So the statement "but should still be able to administer
> it and/or retrieve the encryption keys if the HR personnel forget their
> passwords etc." is what I call a "non-starter". By definition, the DBAs
> have access to the data.
> You're right on one point though. If all people share access to the
> same password or decryption key, you have an insecure solution. Think
> of it this way -- the bigger the "secret" is, and the more people who
> have access to that secret, the less secure the system.
> Encrypting the key doesn't work either because you have to encrypt the
> key with another key. Where are you going to put that key? Are the
> DBAs going to have access to it too?
> Here's my take on it:
> Carefully analyze what is and what is not a secret in the system. Only
> encrypt the real secrets. Limit the access to the database to the
> smallest set of people possible. If you don't want the DBAs to have
> access to the encryption key, that by definition means you have to store
> the key outside the database. I recommend that you use .NET's
> encryption routines to encrypt your secrets *outside* of the database,
> and simply store the data in the system once it's encrypted.
> So it boils down to this. SOMEBODY needs access to the keys. They need
> to exist somewhere (even if it's not on your SQL Server). The
> administrator of that machine will probably be able to read the key.
> Deny that person read access to the database. That person needs to be
> trustworthy, too. If he/she is not, they can share the key with an
> untrustworthy DBA and get your data by writing a small program.
> HTH
> -Dave
>
> nyathan...@.hotmail.com wrote:
>
>
>
>
>
>
>
>
> -Dave Markle
> http://www.markleconsulting.com/blog

Saturday, February 25, 2012

Database Design Question

I am writing a job tracking app with ASP/MSSQL and I have a question about setting up an archive system.

For my app, the database consists of several relational tables that track all of the information about a job. I want to set it up so that if a job is completed, it gets archived.

Over the next 2 or 3 years, this system will likely grow to 100,000+ records.

My question is, should I (1) just use a flag to mark the job as archived, (2) make a flat record of the job and move it into one archive table, or (3) set up archive tables that mirror my production tables and copy the record to the archive tables keeping the relationships intact?

Does anyone have a method that has worked well for them in the past? What will keep my app running the fastest? Any suggestions would be appreciated. Thanks!i always prefer a single table with the jobs flagged as not active (1 or 0)
it makes for simpler querying later with just
select * from table1
where active is 1
and
when you want to see completed tasks
select * from table1
where active is 0

(less tables, less joins etc,... )|||Cool ... thanks!

Do you think 100,000+ records somewhere down the line will be a problem when running reports or doing searches if I just use a flag? I'm trying to design this w/ optimized tuning in mind because I don't know how long it will be in place.

When I flag a job as complete (bit 0,1) and then query w/

WHERE open=(0 or 1)

It still has to check every record correct? How many records before there is a noticable difference in the speed of the app?|||two ways to think of this
you could make the column a bit datatype to save on space but it stores 1,0 or null small to the point but you cannot create an index on this column

OR
create the column as tinyint (1 byte , largest value 255) and then you could index it
but the indexing on a heavily duplicated columns is not recommended
you also could create statistics on the flagged column
i'm of the mind that you should be okay with this but just monitor performance during operations|||Thanks for your ideas!|||The optimizer will likely not use an index on a boolean column, regardless of whether it is bit, tiny int, or even char(1). There just isn't sufficient cardinality to make it worthwhile.|||true
this is an area where i have (gulp!) envied ORACLE
they have bitmap indexes that you can use on heavily duplicated columns like gender and yes\no

oh well maybe in yukon.|||Yeah, but that's not the point.

You could create an index on boolean values stored in a tinyint field, but the optimizer would likely ignore it because its not any more efficient than using a table scan. The cardinality is likely so low (high? I can never remember...) that the small performance boost gained by using the index is offset by the overhead of accessing the index.

I'm not an Oracle DBA, but I don't see how Oracle would gain much from indexing boolean values either.|||yes and using bitmap indexes on yes no or gender or small sets like rating can give you a boost
but you are right about the boolean but still i prefer the 1/0 flag to the splitting up of tables|||I agree. In most cases (not all) a flag is better than splitting the data.|||And the rambling goes on. Well, I don't know who would come up with an idea on creating an index just on a Gender or Yes/No field. And you guys keep on going on it. Kind of silly, would you really consider doing it? Its value WILL come to play if such a field is combined with something else.

And another thing, - YOU CAN CREATE AN INDEX ON A BIT FIELD, even by itself.|||Yes and no.

You can't create an index on a bit field through Enterprise Manager's table design form.

Also, Books Online states:

"Columns of type bit cannot have indexes on them. "

But you can create the index using SQL:

CREATE INDEX [IX_BitTest] ON [dbo].[YourTable]([BitValue]) ON [PRIMARY]
GO

...so there is a loophole or inconsistency in SQL server regarding bit values.

Lot's of people come up with the idea of indexing boolean fields because they don't have an in depth understanding of how SQL Server works. That's why they come to dbforums for advice, not ridicule Bob.|||Originally posted by blindman
...Lot's of people come up with the idea of indexing boolean fields because they don't have an in depth understanding of how SQL Server works. That's why they come to dbforums for advice, not ridicule Bob. Hey, you started first, I just try to survive here. Remember your comments about my inability to understand English? Help me out here, - weren't you ridiculing me?

Somet things I found funny, others I saw as an insult, but in neither of those posts were you even attempting to hide or retract your intent to ridicule. I guess you yourself can't take your own medicine, hey?! ;)

Originally posted by blindman
...You can't create an index on a bit field through Enterprise Manager's table design form.

Also, Books Online states:

"Columns of type bit cannot have indexes on them. "

Which Books Online are you reading, Lindman? Get on with the program, everybody else did!

SQL Server Books Online January 2004 Update (http://www.microsoft.com/downloads/details.aspx?FamilyId=A6F79CB1-A420-445F-8A4B-BD77A7DA194B&displaylang=en#filelist)|||YEAH BOB!!!!

rdjabarov does insulted....|||Always go with your strength.|||At the scale of 100,00 or so records, you'll certainly want to be able to index the field, so bit is out. Plus people writing code against boolean fields get confused about null evaluations and equalities, so it's better to use a more humane data type anyway.

tinyint will take the least amount of space and provide the fastest field evaluation available. Although, at such a small table size, neither is a major concern. Whether or not you are expanding a server tree in the Enterprise Manager during query execution will have a much greater effect on performance than how you optimize this field. I agree that standard indexing would have little effect, but I would use a clustered index with the status field as the first element and then your primary where condition field as the second element of the index. The optimizer is unable to overlook this index becuase it is reflected in the table's physicality.|||The optimizer does not overlook boolean indexes because it is lazy, or because it can't use them. It chooses to overlook them when it is more efficient NOT to use them. You could force it to use the index using a query hint, but forcing the use of the index does not necessarily make the query more efficient. The same goes for putting the boolean index at the front of the key.|||Originally posted by dbslave
tinyint will take the least amount of space

Isn't tinyint and char(1) the same size?|||Originally posted by Lindman
The optimizer does not overlook boolean indexes because it is lazy, or because it can't use them. It chooses to overlook them when it is more efficient NOT to use them... The optimizer DOES NOT overlook indexes when it is more efficient not to use them. That decision is based on STATISTICS associated with tables referenced by the query. Don't give too much credit to the Optimizer.

And again, according to Books Online (UPDATED!!!) and personal experience, you can have an index on a bit field. In fact, you can even have a clustered index on a bit field (wouldn't know what for though). The point is that you shouldn't have an index on a field which data is equally split 50/50. Such a field should be part of a composite index to make a difference.

Going back to the original question, whether to have a flag or a replica of the original table for archive purposes...It depends on how many records are going to be archived at a time, and how many records regardless of whether they are archived or not would be retrieved. And also, how wide is the table now.

The decision should depend on these questions because if you plan to archive thousands of rows at a time and the table is wide enough (judgment call, no particulat width is in mind), I'd go with a flag of bit datatype, since updating of 1 field will not kill my transaction log as opposed to inserting into a different table.|||Originally posted by rdjabarov
In fact, you can even have a clustered index on a bit field (wouldn't know what for though). The point is that you shouldn't have an index on a field which data is equally split 50/50. Such a field should be part of a composite index to make a difference.


Logically, yes. Physically, this can be useful for partitioning data.|||Originally posted by Brett Kaiser
Isn't tinyint and char(1) the same size?

Yes, but comparisons on a tinyint are way faster than on a char(1)|||Why would that be?

Got any links that discuss this?

and whats waaaaaaaaaay faster?|||Originally posted by dbslave
Logically, yes. Physically, this can be useful for partitioning data. You're planning to partition 2 records?|||Clustered does not necessarily mean unique.|||Originally posted by blindman
Clustered does not necessarily mean unique. Did you get your "real" Books Online? You have a lot to read...|||USE Northwind
GO

CREATE TABLE myTable99(Col1 int, Col2 int)
GO

CREATE CLUSTERED INDEX myTable99_IX1 ON myTable99(Col1)
GO

INSERT INTO myTable99 SELECT 1,1 UNION ALL SELECT 1,1
GO

SELECT * FROM myTable99
GO

DROP TABLE myTable99
GO

Now say you're sorry...

And I want you all to play nice or it's time out...

Oh, wait, that stuff doesn't start till I get home...

Sorry...|||Hey Brett, I wasn't arguing that clustered does not mean unique! I just don't like "him" butting into other people's conversation, just to make some pointless point (or however "he" put it).|||Oh good heavens, that's exactly why I participate. Having a dozen eyes available helps "keep me honest" because I know that there are many points of view, most of which are probably more "on track" at any given point in time than mine are.

While there are plenty of people that annoy me in one way or another, its still a good thing that they all participate (at least in my opinion). If one person annoys me too badly, I just filter them from my view for a while (until I cool off).

There's a lot of advice that is well meant, and well received that nearly sends me ballistic. I know how badly certain approaches can hammer things up with large (anything between 150 and 1200 Gb) databases, but those approaches often work fine in smaller, simpler environments. I have to keep reminding myself to give appropriate answers based on the questions!

-PatP|||Originally posted by Brett Kaiser
Isn't tinyint and char(1) the same size?

yes they are both 1 byte datatypes but
integer vs char|||I'm with Brett in never having read anything about tinyint being a faster comparison than char(1). White papers?|||boys, this is a non-smoking flight, put your guns down

tiny, small, why not bit? you'd have to have at least 9 of them before it reaches the size of the tinyint. what's the fuss about?|||If tinyint is indeed faster than char(1) then I'll use it in the future.|||oh that's just absurd! how can a datatype be faster than another datatype?|||Originally posted by ms_sql_dba
oh that's just absurd! how can a datatype be faster than another datatype? Way to go, she got you on that one :D :D :D :D :D|||Hey .. where was I when all this was going on ...

hmm ... have been busy lately ... but will be back soon ...|||she?|||Now that you point out .. i was also confused by "She" ...

Hmm .. i m getting confused a lot lately ...|||Originally posted by r937
she? The "ms" can be interpreted as many things,k among them: Microsoft, or as an honoric "Ms." that was quite popular among what were called the "rabidly liberated" in the 1970s. Ms. was used instead of Mrs. or Miss by a very small group of women that didn't want to reveal their marital status. I assume that rdjabarov read it in that context.

-PatP|||She got me?

I never claimed it was faster. The notion that tinyint was faster than char(1) was news to both Brett and I, and we simply asked if there was any documentation to support the assertion.

As usual, ms_sql_dba didn't bother to really read the previous posts. She keeps to her old habit of dropping in at the end of the thread and repeating something somebody else had already written 10 posts before, without adding a shred of value to the conversation.

Nyah, nyah, ya missed me! Ya missed me! PPPbbbbbtttt! :p

Grow up.|||It's got to be a full moon or something...

does ANYONE want to put forth WHY tinyint would be more effecient that char(1)?

I'm here for education (and social interaction, well I guess some types of it...the others type are a watse of time...life is too short)|||Originally posted by ms_sql_dba
boys, this is a non-smoking flight, put your guns down

tiny, small, why not bit? you'd have to have at least 9 of them before it reaches the size of the tinyint. what's the fuss about?

[With the guns blazing]

This is straight out of the Holy book
bit

If there are 8 or fewer bit columns in a table, the columns are stored as 1 byte

tinyint

Integer data from 0 through 255. Storage size is 1 byte.

[/With the guns blazing]

first Get a copy of the Holy Book and then read it from end to end.|||OK, so that means...

1 bit = 1 byte
2 bits = 1 byte
3 bits = 1 byte
4 bits = 1 byte
5 bits = 1 byte
6 bits = 1 byte
7 bits = 1 byte
8 bits = 1 byte

But the values of bit is still 0 or 1 or null (which isn't a value at all)

And has very limited meaning, and I've never seen a good reason to index it..

And Char(1) and tinyint are still both 1 byte

And is it a misconception that 1 of the above is more effecient in an index than the other...|||Originally posted by Brett Kaiser
OK, so that means...

1 bit = 1 byte
2 bits = 1 byte
3 bits = 1 byte
4 bits = 1 byte
5 bits = 1 byte
6 bits = 1 byte
7 bits = 1 byte
8 bits = 1 byte

9 bit = 2 byte
10 bits = 2 byte
11 bits = 2 byte
12 bits = 2 byte
13 bits = 2 byte
14 bits = 2 byte
15 bits = 2 byte
16 bits = 2 byte

and so on ...

so 1 single bit storage space = tiny int storage space|||It was dbslave who originally stated that tinyint was faster, and since we haven't heard anything back from him I'm going to assume that was an error and recommend we drop the inquiry.|||i feel slightly responisble because it was me who brought it up.
but if you are gonna flame someone else then count me in
Mob Rules
string him up...|||Nah, I wasn't stringing him up. Everybody makes mistakes, or has made incorrect assumptions while wading through the masses of documentation for SQL server. I've learned a hell of a lot from this forum.|||Originally posted by blindman
I've learned a hell of a lot from this forum.

Ditto...

Just trying to realize either way...

My gut says no....|||My gut says no....

Thats wierd ... you still got guts|||Originally posted by blindman
She got me?

I never claimed it was faster. The notion that tinyint was faster than char(1) was news to both Brett and I, and we simply asked if there was any documentation to support the assertion.

As usual, ms_sql_dba didn't bother to really read the previous posts. She keeps to her old habit of dropping in at the end of the thread and repeating something somebody else had already written 10 posts before, without adding a shred of value to the conversation.

Nyah, nyah, ya missed me! Ya missed me! PPPbbbbbtttt! :p

Grow up. So, that makes 2 of us, me and ms_sql_dba. At least I am not alone!!!|||Not to jump in and strive for polarity, but there's been nothing I've seen that say that's true...

And are you going to the rainbow lounge?

Make sure you bring a lot of singles...

:D|||Originally posted by Enigma
Thats wierd ... you still got guts I'll bet that I've got more gut than anybody around here!

Somebody made a comment about a rather heated discussion here being an "international incident", and I pointed out that even I'm not that fat (yet).

-PatP|||...Sounds like a call for a pissing contest...|||Originally posted by rdjabarov
...Sounds like a call for a pissing contest... Well, I'd guess that I could "hold my own" (if you'll pardon the term) in that area too!

-PatP|||Try this for yourself. Run test few times to be sure table is in cache. You will be surprised.

create table test(i int identity primary key, a tinyint, b char)
set nocount on
declare @.c int
set @.c = 0
while @.c < 20000 begin
insert test(a,b)
values(0,'0')
insert test(a,b)
values(1,'1')
set @.c = @.c + 1 end

--test scan
declare @.t datetime
set @.t = getdate()
select a
from test
where a = 1
print 'tinyint: ' + cast(datediff(ms,@.t,getdate()) as varchar) + 'ms'
set @.t = getdate()
select b
from test
where b = '1'
print 'char: ' + cast(datediff(ms,@.t,getdate()) as varchar) + 'ms'

--test seek as run above script after adding these indexes:
create index test_a on test(a)
create index test_b on test(b)

Chars were slower before SQL Server version 7.|||I got about 700ms faster on tinyint after creating indexes.|||And about 1600 faster than using char if I replace tinyint with bit...|||Well, in the interests of scientific speculation:

-- CREATE TABLE test_datatype_speed_1(
-- ident INT IDENTITY PRIMARY KEY,
-- tinyint_noindex TINYINT,
-- char1_noindex CHAR(1),
-- bit_noindex BIT,
-- tinyint_index TINYINT,
-- char1_index CHAR(1),
-- bit_index BIT)
--
-- CREATE INDEX idx_test_datatype_speed_1_tinyint ON test_datatype_speed_1(tinyint_index)
-- CREATE INDEX idx_test_datatype_speed_1_char1 ON test_datatype_speed_1(char1_index)
-- CREATE INDEX idx_test_datatype_speed_1_bit ON test_datatype_speed_1(bit_index)
--
-- SET NOCOUNT ON
--
-- DECLARE @.c INT
--
-- SELECT @.c = 0
--
-- WHILE @.c < 20000
-- BEGIN
--
-- INSERT test_datatype_speed_1(tinyint_noindex, char1_noindex, bit_noindex, tinyint_index, char1_index, bit_index)
-- VALUES(0,'0',0,0,'0',0)
-- INSERT test_datatype_speed_1(tinyint_noindex, char1_noindex, bit_noindex, tinyint_index, char1_index, bit_index)
-- VALUES(1,'1',1,1,'1',1)
-- SELECT @.c = @.c + 1
--
-- END

GO

SET STATISTICS IO ON

--test scan
DECLARE @.t DATETIME

--Test tinyint_noindex
SELECT @.t = GETDATE()
SELECT tinyint_noindex
FROM test_datatype_speed_1
WHERE tinyint_noindex = 1
PRINT 'tinyint_noindex: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

--Test char1_noindex
SELECT @.t = GETDATE()
SELECT char1_noindex
FROM test_datatype_speed_1
WHERE char1_noindex = '1'
PRINT 'char1_noindex: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

--Test bit_noindex
SELECT @.t = GETDATE()
SELECT bit_noindex
FROM test_datatype_speed_1
WHERE bit_noindex = '1'
PRINT 'bit_noindex: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

--Test tinyint_index
SELECT @.t = GETDATE()
SELECT tinyint_index
FROM test_datatype_speed_1
WHERE tinyint_index = 1
PRINT 'tinyint_index: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

--Test char1_index
SELECT @.t = GETDATE()
SELECT char1_index
FROM test_datatype_speed_1
WHERE char1_index = '1'
PRINT 'char1_index: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

--Test bit_index
SELECT @.t = GETDATE()
SELECT bit_index
FROM test_datatype_speed_1
WHERE bit_index = '1'
PRINT 'bit_index: ' + CAST(DATEDIFF(ms,@.t,GETDATE()) AS VARCHAR) + 'ms'

SET STATISTICS IO OFF
GO

This was the average result on my machine:

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
tinyint_noindex: 30ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
char1_noindex: 93ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
bit_noindex: 63ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
tinyint_index: 63ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
char1_index: 93ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
bit_index: 60ms

Kind of interesting.

I'm running Windows 2003 Standard Edition/SQL Server 2000 Enterprise Edition/2ghz Xeon/1gb RAM|||I'm running Windows 2003 Standard Edition/SQL Server 2000 Enterprise Edition/2ghz Xeon/1gb RAM

i'm more impressed by your server.
oh yeah nice code too|||OK, cool, I'm up for testing things out...

Anyone got anything from M$ though?

Also, how curious...an non indexed tinuint is 33 ms faster?

Doesn't make sense...

I bet the results are significantly different when we're talking volumes...|||I just got around to running HansVE's code, and sometimes tinyint was faster and sometimes char was faster. It was about 50/50, so it still looks like a wash to me, at least until I see some kind of documentation or at least some reasoning.

And indexes made no difference.|||How many records are you guys testing with? Also, testing against 1-field index with 1/0 or y/n is going to yield an index scan. And we all know that the number of index pages for all 3 datatypes (char, tinyint, bit) will be the same. In order to test it for practical real-life situations we at least need to agree that such a field needs to be tested with another field with higher selectivity than 50/50.|||Agreed. 50/50 may be a typical distribution of boolean data, but it is not always the case and performance may be different with different cardinality.

90/10?|||No, I was talking about combining this boolean field with another, let's say a 10-character field or a date, while creating an index, and then using this combination of fields in the WHERE clause or in a JOIN. Something like this:

create table t1 (f1 char(10) not null, f1 tinyint/char(1)/bit not null)
--here we need to insert a large number of records, about 100K, where f1 would be 10% duplicates and f2 would be 50% dups.
select * from t1 where f1 = <something> and f2 = 'y'/1/1|||If you read the whole post, the entire thing was centered around bit vs. tinyint. Then, the whole char(1) thing got thrown in. So, if we're talking true bit comparison, you should stick with 0 and 1. Otherwise, you're expanding the field to do a full-scale comparison of datatypes.

In this case, you would have to set baselines at all kinds of different selectivity levels, index level (fill-factors, clustered vs. non-clustered, single vs multi-column), scale, precision, datatype, etc.

After you set the baseline, you have just began the real work of analyzing the datatype performance. You would then have to compare across size of system, memory, load in tps, size of recordset per selectivity, etc.

It would be great fun, but it's probably a little out of the scope of the thread isn't it.

:)|||Oh no...A thread that has strayed....

What will happen next?

No more tequila?

AAAAAAAAAAAAAHHHHHHHHHHHHHHHHHHHHHHHHHHHH

:D|||Let the thread keep going as long as it is interesting and informative.

rdjabarov, can you adapt the prevous sample code to test or illustrate your conjecture?|||I was hoping Brett will do his "create mytable99..." thing and I'll tweak it ;)|||Spoken a like a true lazy DBA! (And I was hinting that YOU should do it because I didn't want to.)

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.|||use a cross join

:cool:|||Originally posted by blindman
Spoken a like a true lazy DBA! (And I was hinting that YOU should do it because I didn't want to.)

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.

That's coming out in Yukon

:D|||Let the thread keep going as long as it is interesting and informative.

We better just stick to the interesting part. :)|||Personally, I don't give a rats arse which is faster but I am surprised that the collation of the CHAR data type has not been mentioned.

A binary collation outperforms a non binary collation...

Thanks to Derrick's code and simply adding "COLLATE Latin1_General_BIN " against the CHAR columns reveals...

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
tinyint_noindex: 73ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
char1_noindex: 100ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
bit_noindex: 90ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
tinyint_index: 80ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
char1_index: 80ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
bit_index: 80ms|||(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
tinyint_noindex: 230ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
char1_noindex: 360ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
bit_noindex: 290ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
tinyint_index: 183ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
char1_index: 230ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
bit_index: 250ms|||Tried out the following

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.

It simply closed my SQL workbench and shut down the system ...

Weird !!! :D|||That's becasue it engages the margarita request task...

It involves shut down, turn off lights, catch train, fire up blender....

and ok...if we say tinyint IS faster, that's great...

Now add the join back to the code table to obtain the meaningful value for the surrogate key...|||Originally posted by Brett Kaiser
That's becasue it engages the margarita request task...

It involves shut down, turn off lights, catch train, fire up blender....

Take a train ... you must be living quite far from your office ...|||3 stops...can read 1 article...

town shuttle picks me up 3 houses away to the train...monthly pass US$36.00 a month
Company shuttle takes me to work...

reverse on the way home...

transmission blew on the piece o junk chrysler...after they replaced it under warranty...

No payement, No Gas, No Insurance, No hassle

beautiful Newark NJ|||Wish we had commuter rail in Ohio...|||Originally posted by blindman
Wish we had commuter rail in Ohio...

http://www.cota.com/cota/cotaweb/main.xml

Actually it's just proposed

http://www.cotafasttrax.com/nclrt_transit_large.php|||Originally posted by Enigma
Take a train ... you must be living quite far from your office ... I guess that depends on your definition of far...

I can work from a number of places. They range from zero to three hundred Km from home (home itself moves twice a week too, but that's another story).

Working downtown can be ugly. If you do it the hard way, the last 100 Km took significantly longer than the first 200 Km! The commuter rail makes the whole process a lot easier, faster, and safer.

-PatP|||km?

I thought you were in southern Cal...|||I created a monster.

Thank you all for the long, very educational discussion regarding my problem. I think I am all set.|||What about another approach? Instead of using binary information such as an archive flag, use e.g. a datetime attribute for creation and another for the closing or also archiving date/time. Have an index for each. Set the value of creation and close to the same value and exclude in every selection, don't use the NOT argument but the CREATION=CLOSE for active. The index seed will be optimal and you can access only archived or open or both and in addition every range of time. This is how you would manage temporal valid records in one to many relationships, e.g. portfolio constructions or account records which you may correct but not delete for tracability reasons.

More far..if you would need uniqueness you could use a timestamp as well, but then you have to define a second attribute not as a timestamp but as a (guess) long. The timestamp field is filled automatically, a value cannot be assigned, but you can read it with an insert trigger, the inserted record will have the timestamp value, wich you can access while inserting the record and write it to the second attribute, which would not be of type timestamp.This would make the values unique.

If you have to access the archived values very often and you are not happy with the performance you have to store the latest timestamp used in a meta table. This will allow direct access of the last and compare with earlier closed ones.

Tried this with SQL Server 2000 and Sybase System 11, it is very fast.

Regards|||News flash! Dead posts are rising from the grave to walk the Earth once again! Authorities urge all citizens to remain calm! Stayed tuned for further instructions...|||mambo,

Thanks for the additional insight ... I am still in the design phase on this project (it got pushed back), so your idea may prove to be useful. I'll check it out. Thanks!|||Wow. Blast to the past. lol Let us know what you finally end up doing Wigz. It's always fun to see 1800 year old posts resurrect themselves.

Friday, February 24, 2012

Database design for versioned data

We're working on an app that needs to keep versioned data (i.e., the
current values plus all previous values). The versioning is integral
to the app so it's more than just an audit trail or history (some
versioned data needs to link to specific versions of other data).
Can anyone share experiences with the database structure for this type
of requirement or point me to helpful resources?
Thanks,
Sam
----
We're hiring! B-Line Medical is seeking .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.Check out the free book from Richard Snodgrass at:
http://www.cs.arizona.edu/people/rts/publications.html
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.|||I have an article on one way to do this, here:
http://simple-talk.com/sql/t-sql-programming/a-primer-on-managing-data-bitemporally/
Note that one of my main references was the book Tibor posted the link to,
so you might want to read that instead of or in addition to this.
Adam Machanic
SQL Server MVP
Author, "Expert SQL Server 2005 Development"
http://www.apress.com/book/bookDisplay.html?bID=10220
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.|||Perhaps an app like ApexSQL's Audit can help. Not sure based on your spec
whether you need the data to exist in the exact same tables or not.
--
TheSQLGuru
President
Indicium Resources, Inc.
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.

Database design for versioned data

We're working on an app that needs to keep versioned data (i.e., the
current values plus all previous values). The versioning is integral
to the app so it's more than just an audit trail or history (some
versioned data needs to link to specific versions of other data).
Can anyone share experiences with the database structure for this type
of requirement or point me to helpful resources?
Thanks,
Sam
We're hiring! B-Line Medical is seeking .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.
Check out the free book from Richard Snodgrass at:
http://www.cs.arizona.edu/people/rts/publications.html
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.
|||I have an article on one way to do this, here:
http://simple-talk.com/sql/t-sql-programming/a-primer-on-managing-data-bitemporally/
Note that one of my main references was the book Tibor posted the link to,
so you might want to read that instead of or in addition to this.

Adam Machanic
SQL Server MVP
Author, "Expert SQL Server 2005 Development"
http://www.apress.com/book/bookDisplay.html?bID=10220
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.
|||Perhaps an app like ApexSQL's Audit can help. Not sure based on your spec
whether you need the data to exist in the exact same tables or not.
TheSQLGuru
President
Indicium Resources, Inc.
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.

Database design for versioned data

We're working on an app that needs to keep versioned data (i.e., the
current values plus all previous values). The versioning is integral
to the app so it's more than just an audit trail or history (some
versioned data needs to link to specific versions of other data).
Can anyone share experiences with the database structure for this type
of requirement or point me to helpful resources?
Thanks,
Sam
----
We're hiring! B-Line Medical is seeking .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.Check out the free book from Richard Snodgrass at:
http://www.cs.arizona.edu/people/rts/publications.html
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.
4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.|||I have an article on one way to do this, here:
http://simple-talk.com/sql/t-sql-pr...orall
y/
Note that one of my main references was the book Tibor posted the link to,
so you might want to read that instead of or in addition to this.
Adam Machanic
SQL Server MVP
Author, "Expert SQL Server 2005 Development"
http://www.apress.com/book/bookDisplay.html?bID=10220
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.
4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.|||Perhaps an app like ApexSQL's Audit can help. Not sure based on your spec
whether you need the data to exist in the exact same tables or not.
TheSQLGuru
President
Indicium Resources, Inc.
"Samuel R. Neff" <samuelneff@.nomail.com> wrote in message
news:uh0j73hhdv0po8bbq7cd15nv42inl63v4c@.
4ax.com...
> We're working on an app that needs to keep versioned data (i.e., the
> current values plus all previous values). The versioning is integral
> to the app so it's more than just an audit trail or history (some
> versioned data needs to link to specific versions of other data).
> Can anyone share experiences with the database structure for this type
> of requirement or point me to helpful resources?
> Thanks,
> Sam
> ----
> We're hiring! B-Line Medical is seeking .NET
> Developers for exciting positions in medical product
> development in MD/DC. Work with a variety of technologies
> in a relaxed team environment. See ads on Dice.com.