Saturday, February 25, 2012

Database design question - Isolated, unrelated tables

Hi,

I have a question regarding best practices in database design. In a
relational database, is it sometimes wise or necessary to create tables
that are not related to other tables through a foreign key
relationship, or does this always indicate some sort of underlying
design flaw, something that requires a re-evaluation of the problem
domain?

The reason I ask is because in our application, the user can perform x
number of high-level operations (creating/updating projects, creating/
answering surveys, etc.). Different users can perform different
operations, and each operation can manipulate one or more tables. This
part of the system is done and working. Now there is a requirement to
have some sort of audit logging inside the database (separate from the
text-based log file that the application generates anyway). This
"audit logging" table will contain high-level events that occur inside
the application (which may or may not relate to a particular
operation). This table is in some sense related to every other table
in the database, as well as to data that is not in the database itself
(exceptions, external events, etc.). For example: it might have
entries that specify that at time X user created project Y, at time A
user filled out survey B, at time C the LDAP server was down, at time D
an unauthorized login attempt occurred, etc.

As I said, this seems to suggest a standalone, floating table with a
few fields that store entries regarding what's going on in the system,
without any direct relationship to other tables in the database. But I
just feel uneasy about creating such an isolated table. Another option
is to store the "logging" information in another schema/database, but
that doubles the maintenance workload. I'm not really looking forward
to maintaining/designing two different schemas.

I had a look at the Microsoft AdventureWorks database schema diagram
and they seem to have 3 standalone tables - AWBuildVersion, ErrorLog
and DatabaseLog (unless I am reading it wrong!).

Any advice, information or resources are much appreciated.

On 25 Jun, 08:08, nyathan...@.hotmail.com wrote:

Quote:

Originally Posted by

[original question snipped]


Hi Nyathan,

In general terms it is quite acceptable to have a standalone table
with no FK relationships instantiated.
Indeed, in times gone by whole databases were created in this manner,
as the overhead for OLTP of all the index data manipulation behind
the scenes could bring a system to its knees (PKs and FKs are backed by
'hidden' indexes).

As regards your specific task (and here you'll realise I'm not from a
SQL Server background), check to see if there is an existing audit
ability in the RDBMS.
Some RDBMSs allow for tracking all the activity on a table or all the
activity of a certain user; some allow both, and some changed from one
to the other (rats).

If you need to create your own system then think generic.
A single table with a datetime stamp, username, table affected, action
taken, and then, say, 100 columns of varchar(100).
Each table you are seeking to audit will need triggers (insert,
update, delete) that call a generic stored procedure that populates
the table you have created.
If you're going to be doing this for a lot of tables, or repeatedly, I'd
advise writing a little noddy program to get the source table column
information from the systables and then generate the triggers.
The triggers will need to cause the stored procedure to write away
both the before and after image of the data.
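Something along these lines, as a rough sketch only (all table and column names are purely illustrative, and here the before/after images are captured with FOR XML rather than the 100 varchar columns):

```sql
-- Generic audit table (illustrative names only)
CREATE TABLE AuditLog (
    AuditId     int IDENTITY(1,1) PRIMARY KEY,
    EventTime   datetime NOT NULL DEFAULT GETDATE(),
    UserName    sysname  NOT NULL DEFAULT SUSER_SNAME(),
    TableName   sysname  NOT NULL,
    Action      char(1)  NOT NULL,   -- 'I'nsert, 'U'pdate or 'D'elete
    BeforeImage xml NULL,            -- row image(s) before the change
    AfterImage  xml NULL             -- row image(s) after the change
);
GO

-- One such trigger, on a hypothetical Project table; the same pattern
-- (or a generated variant of it) is repeated for every audited table.
CREATE TRIGGER trg_Project_Audit ON Project
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO AuditLog (TableName, Action, BeforeImage, AfterImage)
    SELECT 'Project',
           CASE WHEN EXISTS (SELECT 1 FROM inserted)
                 AND EXISTS (SELECT 1 FROM deleted) THEN 'U'
                WHEN EXISTS (SELECT 1 FROM inserted) THEN 'I'
                ELSE 'D'
           END,
           (SELECT * FROM deleted  FOR XML AUTO, ELEMENTS),
           (SELECT * FROM inserted FOR XML AUTO, ELEMENTS);
END;
```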

BEWARE of bulk updates or deletes (best to disable the triggers
before you do them).
Also, archiving/purging of your generic table becomes interesting,
depending on activity levels.

I would not recommend the above for busy tables.

Hope that helps, Tim

|||<nyathancha@.hotmail.com> wrote in message
news:1182755325.216207.318140@.g37g2000prf.googlegroups.com...

Quote:

Originally Posted by

[original question snipped]


An isolated table that is not logically connected to the rest of the system
by foreign key references suggests something that, in concept, is really a
separate schema. Ordinarily the Universe of Discourse (the subject matter
that the data describes) is composed of items that are all related to each
other in one way or another. The schema that is derived from a model of the
Universe of Discourse will likewise be interrelated.

The situation you describe may be just such a situation, a separate schema.
An audit logging table may be storing data not for what it says about the
"real world" subject matter that the rest of the data describes, but for
what it says about the series of events that the application processed.

If so, it could be legitimate design. Ultimately, the question boils down
to this: how do you intend to use the data captured in the logging table?
If it's going to be used in an isolated fashion (not combined with data in
other tables by the DBMS), then your design may well be legitimate.

|||On Mon, 25 Jun 2007 00:08:45 -0700, nyathancha@.hotmail.com wrote:

Quote:

Originally Posted by

>Hi,
>
>I have a question regarding best practices in database design. In a
>relational database, is it wise/necessary to sometimes create tables
>that are not related to other tables through a foreign Key
>relationship or does this always indicate some sort of underlying
>design flaw. Something that requires a re evaluation of the problem
>domain?


Hi nyathancha,

It can happen, but it's definitely not common. For me, it would be a
reason to look again, but not to dismiss the design right away.

I have once encountered a situation where I needed unrelated tables.
This had to do with auditing, but not at all like the method you are
proposing - in fact, I don't really like what I think you're trying to
do. Having one table to log "everything" shares many of the problems of
the EAV design - you'll be creating a very generic table with a few very
generic columns. They can hold everything, making it virtually
impossible to constrain or query the data in the table. If this is the
kind of audit table that should normally never be used, but is only kept
for the one-in-a-million chance of a completely unforeseen disaster, and
wasting countless man-hours to sift manually through the collected data
is an acceptable price to pay in that situation, then this design MIGHT
be considered. In all other cases, I'd steer away from it and go for a
more constrained design.

The situation where I had to use unrelated tables was at a firm that had
to keep a full record of changes for some tables - so for each of those
tables, a history table was made with all the same columns, plus a
datetime (as part of the primary key), userid of who made the change,
etc. We then added triggers to the main tables to ensure that each
modification in those tables was properly recorded in the corresponding
history table. But we did NOT define any foreign keys, for the simple
reason that after e.g. a deletion of a customer, the change history of
that customer still had to be kept on file; we couldn't remove the
customer from the history table, and a foreign key to the customers
table would have prevented the DELETE.
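In outline, that pattern looked something like this (Customer is a made-up example table; the real tables were different):

```sql
-- Base table (made-up example)
CREATE TABLE Customer (
    CustomerId int          NOT NULL PRIMARY KEY,
    Name       varchar(100) NOT NULL
);
GO

-- History table: the same columns plus audit columns, and deliberately
-- NO foreign key back to Customer, so the change history survives a
-- DELETE of the customer row.
CREATE TABLE CustomerHistory (
    CustomerId int          NOT NULL,
    Name       varchar(100) NOT NULL,
    ChangedAt  datetime     NOT NULL DEFAULT GETDATE(),
    ChangedBy  sysname      NOT NULL DEFAULT SUSER_SNAME(),
    Operation  char(1)      NOT NULL,  -- 'U'pdate or 'D'elete
    PRIMARY KEY (CustomerId, ChangedAt)
);
GO

-- Record the before-image of every update and delete
CREATE TRIGGER trg_Customer_History ON Customer
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO CustomerHistory (CustomerId, Name, Operation)
    SELECT d.CustomerId, d.Name,
           CASE WHEN EXISTS (SELECT 1 FROM inserted) THEN 'U' ELSE 'D' END
    FROM deleted AS d;
END;
```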

Another example of a design with an unrelated table that I never used in
practice but can imagine easily enough would be a single-row table to
hold an application's "processing date" (so that a batch that runs past
midnight can all be processed as if it were on the same date, and so that
actions can be "redone" [or tested] on a simulated day - I have worked
with such systems back in my mainframe PL/I programming days, but they
used flat files rather than databases <g>).
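Such a single-row table can be sketched as follows (names are illustrative); a CHECK constraint on a constant key is one way to guarantee the table never holds more than one row:

```sql
-- Single-row "processing date" table (illustrative names)
CREATE TABLE ProcessingDate (
    LockId         tinyint  NOT NULL PRIMARY KEY CHECK (LockId = 1),
    ProcessingDate datetime NOT NULL
);

INSERT INTO ProcessingDate (LockId, ProcessingDate)
VALUES (1, '20070625');  -- the date the batch should run "as of"
```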

Quote:

Originally Posted by

>I had a look at the microsoft adventureworks database schema diagram
>and they seem to have 3 standalong tables - AWBuildVersion, ErrorLog
>and DatabaseLog (unless i am reading it wrong!)


For examples of good design, please don't look at any Microsoft supplied
samples. Even though AdventureWorks is miles ahead of Northwind and
pubs, it's still filled to the brim with bad practices.

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

|||In addition to the other replies, I would add that foreign key constraints
are just one of many tools the database designer can use to help ensure that
bad data does not get placed in your database. Other tools include check
constraints, using the right datatypes (eg, store dates in a datetime
column, not a varchar column), sometimes triggers, etc.
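Applied to an audit table, a couple of those tools might look like this (the table and event names are invented for illustration):

```sql
-- Audit table that uses datatypes and CHECK constraints, rather than
-- foreign keys, to keep obviously bad data out (names are invented)
CREATE TABLE AuditTrail (
    AuditId   int IDENTITY(1,1) PRIMARY KEY,
    EventTime datetime    NOT NULL,   -- a real datetime, not a varchar
    UserName  sysname     NOT NULL,
    EventType varchar(30) NOT NULL
        CHECK (EventType IN ('ProjectCreated', 'SurveyAnswered',
                             'LdapServerDown', 'UnauthorizedLogin')),
    Detail    varchar(400) NULL
);
```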

So an important question is what the consequences will be if (when! - in
my experience, if bad data can be put into a database, sooner or later it
will be) invalid data is put into your audit table(s). That might range
from nobody really cares, to it's going to be a lot of work to fix it, to
somebody (you?) gets fired, to your company being subject to a
significant fine, to somebody going to prison (if, for example, your
audit trail is being used to prove compliance with SOX). So ask yourself
questions like: what will happen if your boss comes to you and says the
audit trail says that user x created project y at time z, but there is no
project y in the system?

I certainly have tables in databases I have designed that do not have any
foreign key relationships to other tables, but before implementing one, I
would always think carefully about it.

Tom

<nyathancha@.hotmail.com> wrote in message
news:1182755325.216207.318140@.g37g2000prf.googlegroups.com...

Quote:

Originally Posted by

[original question snipped]

|||On Jun 26, 10:29 am, "Tom Cooper"
<tomcoo...@.comcast.no.spam.please.net> wrote:

Quote:

Originally Posted by

[Tom's reply, including the quoted original question, snipped]


Thanks for the prompt replies everyone.

From what I am hearing, the consensus seems to be: use it if you
absolutely must, but try to avoid it if you can.

One good point everyone seems to raise is "what is it used for?" ...
To be perfectly honest, I am not entirely sure myself. It's one of those
requirements that filtered down from the management cloud. I think
the view is to use it mainly for "reporting" kind of functionality, and
maybe only on some rare occasion for some sort of postmortem
debugging. Although in the latter situation, the application logs and
the SQL Server logs will probably end up being more helpful. I think
there is a system table somewhere in SQL Server that logs all the
transactions and changes that happen in the tables, right?

Crystal Reports was being considered at some stage for more
sophisticated reports, but for now they want some sort of entries in
there to see what's happening (not necessarily at the database level,
but at the application level). The resolution of the reporting and
entries hasn't been decided yet ... as in, do we want to know
every time someone retrieves a list of customers, or only when someone
adds/removes customers? I have a feeling that if I chase this up, the
answer is going to be "both": "we may not want to start logging very
detailed stuff into the database right away, but if at some stage we
want to do it, the design should allow for it."

So, just thinking in terms of some sort of "reporting" solution - in
the abstract, a sort of condensed data for easier consumption - does it
make sense to store isolated table(s)/schemas along with the actual
data?

As to the consequences of a bad audit trail/log entry, I don't think
it would be catastrophic (fines, people going to prison etc.). It's an
internal application used to streamline in-house processes. But of
course, we still don't want bad, inconsistent data in there; it would
lead to a lot of headaches, finger-pointing, late nights etc.

|||<nyathancha@.hotmail.com> wrote in message
news:1182820095.745339.128220@.e9g2000prf.googlegroups.com...

Quote:

Originally Posted by

One good point everyone seems to raise is "what is it used for?" ...
To be perfectly honest I am not entirely sure myself. Its one of those
requirements that filtered down from the management cloud. I think
the view is to use it mainly for "reporting" kind of functionality and
maybe only on some rare occasion for some sort of postmortem
debugging. Although in the latter situation, the application logs and
the sql server logs will probably end up being more helpful. I think
there is a system table somewhere in sql server that logs all the
transactions and changes that happen in the table right?
>


In architecture, form follows function. If the question "what is this data
being captured for" is unanswered, then the question of whether the design
is appropriate becomes moot.

|||<nyathancha@.hotmail.com> wrote in message
news:1182820095.745339.128220@.e9g2000prf.googlegroups.com...

Quote:

Originally Posted by

[earlier quoted text snipped]
One good point everyone seems to raise is "what is it used for?" ...
To be perfectly honest I am not entirely sure myself. Its one of those
requirements that filtered down from the management cloud. I think
the view is to use it mainly for "reporting" kind of functionality and
maybe only on some rare occasion for some sort of postmortem
debugging. Although in the latter situation, the application logs and
the sql server logs will probably end up being more helpful. I think
there is a system table somewhere in sql server that logs all the
transactions and changes that happen in the table right?
>


No. It doesn't. And depending on database options, the transaction log may
not contain a persistent record of changes either, not to mention that
special tools are required to read those logs.

It is sometimes important to determine who did what, when, where (from which
workstation), and how (with which application). It's easier to track down a
bug if you don't have to rely on what the user claims they did. It can also
be used to determine which users need additional training. A DBMS can only
guarantee that the result of a modification is consistent with respect to
the schema, it cannot determine whether or not a consistent modification is
also correct.

Quote:

Originally Posted by

Crystal reports were being considered at some stage for more
sophisticated reports, but for now they want some sort of entries in
there to see whats happening (not necessarily at the database level,
but at the application level). The resolution of the reporting and
entries hasn't been decided yet ... as in, do we want to know
everytime someone retrieves a list of customers or only when someone
adds/removes customers. I have a feeling that if I chase this up, the
answer is going to be "both", "we may not want to start logging very
detailed stuff into the database right away, but if at some stage we
want to do it, the design should allow for it."
>
So just thinking in terms of some sort of "reporting" solution, in
abstract a sort of condensed data for easier consumption, does it make
sense to store an isolated table(s)/schemas along with the actual
data?
>


I think it all depends on the requirements. Logging consumes resources and
therefore impacts all database operations.

Quote:

Originally Posted by

As to the consequences of a bad audit trail/log entry, I don't think
it would be catastrophic (fines, people going to prison etc.). Its an
internal application used to streamline inhouse processes. But of
course, we still don't want bad, inconsistent data in there and it
would lead to a lot of headaches, finger pointings, late nights etc.
>


Actually, you DO want bad information in there. Nobody's perfect, and it's
easier to find and correct mistakes if you have a record of how they got
into the database in the first place.

Quote:

Originally Posted by

>
>

|||On Jun 26, 11:08 am, nyathan...@.hotmail.com wrote:

Quote:

Originally Posted by

On Jun 26, 10:29 am, "Tom Cooper"
>
>
>
<tomcoo...@.comcast.no.spam.please.netwrote:

Quote:

Originally Posted by

In addition to the other replies, I would add that foreign key constraints
are just one of many tools thedatabasedesigner can use to help ensure that
bad data does not get placed in yourdatabase. Other tools include check
constraints, using the right datatypes (eg, store dates in a datetime
column, not a varchar column), sometimes triggers, etc.


>

Quote:

Originally Posted by

So an important question is what the consequences will be if (when!, my
experience is if bad data can be put into adatabase, sooner or later, it
will be) invalid data is put into your audit table(s). That might range
from nobody really cares, to it's going to be a lot of work to fix it, to
somebody (you?) gets fired, to your company would be subject to a
significant fine, to somebody might go to prison (if, for example, your
audit trail is being used to prove compliance with SOX). So ask yourself
questions like what will happen if your boss comes to you and says the audit
trail says that user x created project y at time z, but there is no project
y in the system.


>

Quote:

Originally Posted by

I certainly have tables in databases I have designed that do not have any
foreign key relationships to other tables, but before implementing one, I
would always think carefully about it.


>

Quote:

Originally Posted by

Tom


>

Quote:

Originally Posted by

<nyathan...@.hotmail.comwrote in message


>

Quote:

Originally Posted by

news:1182755325.216207.318140@.g37g2000prf.googlegr oups.com...


>

Quote:

Originally Posted by

Quote:

Originally Posted by

Hi,


>

Quote:

Originally Posted by

Quote:

Originally Posted by

I have a question regarding best practices indatabasedesign. In a
relationaldatabase, is it wise/necessary to sometimes create tables
that are not related to other tables through a foreign Key
relationship or does this always indicate some sort of underlying
design flaw. Something that requires a re evaluation of the problem
domain?


>

Quote:

Originally Posted by

Quote:

Originally Posted by

The reason I ask is because in our application, the user can perform x
number of high level operations (creating/updating projects, creating/
answering surveys etc. etc.). Different users can perform different
operations and each operation can manipulate one or more table. This
part of the system is done and working. Now there is a requirement to
have some sort of audit logging inside thedatabase(separate from the
text based log file that the application generates anyway). This
"audit logging" table will contain high level events that occur inside
the application (which may or may not relate to a particular
operation). This table is in some sense related to every other table
in thedatabase, as well as data that is not in thedatabaseitself
(exceptions, external events etc.). For example : it might have
entries that specify that at time x user created project y, at time A
user filled out survey B, at time C LDAP server was down, At time D an
unauthorized login attempt occurred etc.


>

Quote:

Originally Posted by

Quote:

Originally Posted by

As I said, these seems to suggest a stand alone, floating table with a
few fields that store entries regarding whats going on the system
without any direct relationship to other tables in thedatabase. But I
just feel uneasy about creating such an isolated table. Another option
is to store the "logging" information in another schema/database, but
that doubles the maintainance work load. Not really looking forward to
maintaining/designing two different schemas.


>

Quote:

Originally Posted by

Quote:

Originally Posted by

I had a look at the microsoft adventureworksdatabaseschema diagram
and they seem to have 3 standalong tables - AWBuildVersion, ErrorLog
and DatabaseLog (unless i am reading it wrong!)


>

Quote:

Originally Posted by

Quote:

Originally Posted by

Any advice, Information or resources are much appreciated.


>
Thanks for the prompt replies everyone.

From what I am hearing, the consensus seems to be: use it if you absolutely must, but try to avoid it if you can.

One good point everyone seems to raise is "what is it used for?" ... To be perfectly honest, I am not entirely sure myself. It's one of those requirements that filtered down from the management cloud. I think the view is to use it mainly for "reporting" kind of functionality, and maybe only on some rare occasion for some sort of postmortem debugging. Although in the latter situation, the application logs and the SQL Server logs will probably end up being more helpful. I think there is a system table somewhere in SQL Server that logs all the transactions and changes that happen in the tables, right?

Crystal Reports was being considered at some stage for more sophisticated reports, but for now they want some sort of entries in there to see what's happening (not necessarily at the database level, but at the application level). The resolution of the reporting and entries hasn't been decided yet ... as in, do we want to know every time someone retrieves a list of customers, or only when someone adds/removes customers? I have a feeling that if I chase this up, the answer is going to be "both": "we may not want to start logging very detailed stuff into the database right away, but if at some stage we want to do it, the design should allow for it."

So just thinking in terms of some sort of "reporting" solution, in the abstract a sort of condensed data for easier consumption, does it make sense to store isolated table(s)/schemas along with the actual data?

As to the consequences of a bad audit trail/log entry, I don't think it would be catastrophic (fines, people going to prison, etc.). It's an internal application used to streamline in-house processes. But of course, we still don't want bad, inconsistent data in there, and it would lead to a lot of headaches, finger pointing, late nights, etc.

Actually, another best practices question now that I am here. Does it make sense for a table to have two (or more) different foreign keys, both (or all) of which can be nullable, and then tie them to different tables for different records? For example, I have a survey table. It has all the fields and relations for describing various survey data (survey questions, participants, start/finish dates, etc.). Now a survey can be related to a project or a supplier. Of course, the same thing can be done with two different junction tables. Which is the better method? Adding the junction tables, increasing the number of tables and the complexity of the system (and the number of joins required for a query), or just adding extra nullable foreign key field(s) to the table? Is there a rule of thumb I should be following here?|||<nyathancha@.hotmail.com> wrote in message
Yes. The presence of a null should only ever indicate that a value for an
*applicable* attribute is absent. If an attribute is not universally
applicable, then it should appear in a different relation schema. A
functional dependency A → B requires that whenever two different tuples
have the same set of values for A, they have the same set of values for B.
Just because one of the values in B has not been supplied doesn't alter that
requirement: whenever a value is supplied, it must necessarily be the same
in all tuples with the same set of values for A. Furthermore, if an
attribute is not universally applicable, then the dependency between A and B
is no longer functional, since even in a world in which all missing values
were supplied there can be some values for A that do not determine a value
for each element of B. Since a key implies a set of functional
dependencies, including attributes that do not universally apply is an
indication that the relation is not fully normalized, because the relationship
between the key and the attribute is definitely not a functional dependency.
On the other hand, decomposing a relation where all attributes universally
apply but some may not be supplied introduces ambiguity: the sense that an
attribute universally applies is lost in translation. It cannot be
determined from the schema whether an attribute applies only some of the
time or whether the attribute universally applies but a value hasn't yet been
supplied. So a null should only be used as a placeholder for a value that
has yet to be supplied.|||
Tim <tim_rogers01@.hotmail.com> wrote:

>In general terms it is quite acceptable to have a standalone table
>with no FK relationships instantiated.
>Indeed, in times gone by whole databases were created in this manner,
>as the overhead for OLTP of all the index data manipulation behind
>the scenes could bring a system to its knees. (PK & FK are backed by
>'hidden' indexes.)

The only thing that will bring a system to its knees faster than
having indexes and FKs is *not* having indexes and FKs.

They are purely and simply a nightmare.

<auditing>

>I would not recommend the above for busy tables.

And there's a point to auditing tables that are relatively static -
normally lookups?

Paul...

>Hope that helps, Tim
--
plinehan __at__ yahoo __dot__ __com__

|||nyathancha@.hotmail.com wrote:

With all due respect, performing work you lack the qualifications for by
using arbitrary answers from Usenet amounts to malpractice. I strongly
urge you to learn the fundamentals BEFORE engineering solutions for anyone.|||On Mon, 25 Jun 2007 21:07:00 -0700, nyathancha@.hotmail.com wrote:

>Actually, another best practices question now that I am here. Does it
>make sense for a table to have two (or more) different foreign keys,
>both (or all) of which can be nullable, and then tie them to different
>tables for different records?


Hi nyathancha,

This pattern isn't uncommon:

CREATE TABLE SomeTable
(SomePrimKey int NOT NULL,
FirstForeignKey int NULL,   -- references one parent table
SecondForeignKey int NULL,  -- references the other
-- Other columns,
PRIMARY KEY (SomePrimKey),
FOREIGN KEY (FirstForeignKey) REFERENCES OneTable,
FOREIGN KEY (SecondForeignKey) REFERENCES OtherTable,
-- exactly one of the two foreign keys must be filled in:
CHECK ((FirstForeignKey IS NULL AND SecondForeignKey IS NOT NULL)
OR (FirstForeignKey IS NOT NULL AND SecondForeignKey IS NULL))
);
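A quick way to see the constraint in action. This is only a sketch: it runs the same pattern against SQLite from Python for the sake of runnability (SQL Server syntax differs slightly), and OneTable/OtherTable are placeholder parent tables, not anything from the thread's real schema.

```python
import sqlite3

# Either/or nullable foreign keys, enforced by a CHECK constraint.
# OneTable/OtherTable are placeholder parents for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE OneTable (Id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE OtherTable (Id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE SomeTable (
        SomePrimKey      INTEGER PRIMARY KEY,
        FirstForeignKey  INTEGER NULL REFERENCES OneTable (Id),
        SecondForeignKey INTEGER NULL REFERENCES OtherTable (Id),
        -- exactly one of the two nullable foreign keys must be set
        CHECK ((FirstForeignKey IS NULL) != (SecondForeignKey IS NULL))
    )
""")
conn.execute("INSERT INTO OneTable (Id) VALUES (1)")

# A row tied to OneTable only is accepted:
conn.execute("INSERT INTO SomeTable VALUES (10, 1, NULL)")

# A row with both keys set violates the constraints and is rejected:
try:
    conn.execute("INSERT INTO SomeTable VALUES (11, 1, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The CHECK condition is just a terser spelling of the either/or test above; on SQL Server the longer AND/OR form works the same way.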

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

Database design question

I have a need to keep a bunch of data regarding some WAN links, and I don't know how to design my database for it. Here's the data I need to store.

I have about 35 different links, and about 50 different application volumes that need to be stored in this database, for every hour of the day. Right now the way I have it set up doesn't seem too good, but I can't figure out any other way. I currently have one table per link, with every application as a column in these tables. So for example:

Table Link1
Date         App1    App2     App3     App4
Jan 1 01:00  54613   351546   6848435  6847684
Jan 1 02:00  668468  6846433  646464   6546846

Table Link2
Date         App1    App2     App3     App4
Jan 1 01:00  6846    6844354  6846434  38463434
Jan 1 02:00  648458  3848646  6846     684684

etc.

The primary key on each table is the date. I'm only storing about 7 applications right now, but I need to be able to increase this to about 50, so it's going to make for huge queries for reporting. The thing is, I need to be able to report on total application volumes across every link. So that means right now I'm doing Sum(Link1.App1) + Sum(Link2.App1)..., which I have to repeat 35 times, and then again for the next application. Most reports cover a lot of applications, so the queries end up 20 pages long...

Is there any design solution that would make this more efficient?
Thanks!

You need to supply more information. It sounds to me like you might need an APPLICATION table to describe each application and a LINK table to describe each specific link -- but additional information might change that. It also sounds like you need at least one more table to record your time-sensitive data; however, I don't clearly see this at the moment.

Like I said: more information is needed.

|||

Ok.

Basically, let's say I have 35 different WAN links going across my network.

Going across those links are a bunch of different applications. Let's say there are 50 of them.

I have a system that records the volume generated by each application over every one of those 35 links, every hour.

Now I need to be able to store this data long term.

So the only thing really being stored is the volume for each application.

Right now it's set up this way:

TableLink1

Date         Citrix   FTP      Telnet   HTTP   ......etc.
Jan 1 01:00  839203   923849   239487   83823

The date/time is the primary key, as there will never be two entries for the same date/time. The numbers being stored under the application columns are bytes.

I will need to be able to report on a few different things. For example, I will have to be able to produce the top 10 applications for a given time period. That will mean adding up every application for every link and determining which 10 volumes are highest. With 35 different tables right now, my query is extremely long and I don't know if it's supposed to be like that.

There's also a requirement for reports on a specific application across all links. For example, I'll need to show the total volume used by FTP across the 35 links for a given time period. With the design I'm using right now, it's possible, but it's a long query which looks like:

SELECT SUM(TableLink1.Citrix) + SUM(TableLink2.Citrix) + SUM(TableLink3.Citrix) .....etc AS Citrix

I hope this is enough info... if not, tell me what you require. I'm pretty sure my design is OK, but I just want to confirm there's no better way.

I was also considering a different design that would look like this:

ApplicationCitrix

Date         Link1      Link2    Link3   ......  Link35
Jan 1 01:00  23487234   234234   23487   ......  3248324

Basically, instead of using a different table for each link, I'd use a different table for each application... but I think it comes back to the same thing.

|||

Probably create application table something like this:

tb_APPLICATION

ApplicationID tinyint identity(1,1)

ApplicationName nvarchar(255)

Link table something like this:

tb_LINK

LinkID tinyint identity(1,1)

LinkValue bigint

and finally a cross-reference table:

xRef_App_Link

ApplicationID (fk)

LinkID (fk)

LinkStartDate smalldatetime

LinkEndDate smalldatetime

|||

Umm... I'm not sure I'm quite following...

If you can explain a little more, maybe.

I don't get how the tb_LINK table would work... How can I store a LinkValue without associating it to a specific application?

The way I'm thinking is maybe this: keep the same tb_APPLICATION, then the tb_LINK:

LinkID tinyint identity(1,1)

LinkName nvarchar(255)

And then the cross-reference table:

xRef_App_Link

ApplicationID (fk) (pk)

LinkID (fk) (pk)

Date (pk)

Volume

Would this work?

I didn't quite get the part where you wrote:

LinkStartDate smalldatetime

LinkEndDate smalldatetime
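For what it's worth, the xRef_App_Link shape proposed above -- (ApplicationID, LinkID, Date, Volume) -- can be sanity-checked quickly. The sketch below uses SQLite from Python purely so it is runnable (the production system would be SQL Server), and the IDs and volumes are made up; the point is that both reports mentioned earlier collapse to a GROUP BY and a WHERE, with no 35-way SUM:

```python
import sqlite3

# One fact table replaces the 35 per-link tables.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE xRef_App_Link (
        ApplicationID INTEGER NOT NULL,
        LinkID        INTEGER NOT NULL,
        Date          TEXT    NOT NULL,
        Volume        INTEGER NOT NULL,
        PRIMARY KEY (ApplicationID, LinkID, Date)
    )
""")
conn.executemany("INSERT INTO xRef_App_Link VALUES (?, ?, ?, ?)", [
    (1, 1, "2007-01-01 01:00", 500),   # app 1 on link 1
    (1, 2, "2007-01-01 01:00", 300),   # app 1 on link 2
    (2, 1, "2007-01-01 01:00", 900),   # app 2 on link 1
])

# Top applications by total volume across every link -- one short
# query no matter how many links exist:
top = conn.execute("""
    SELECT ApplicationID, SUM(Volume) AS Total
    FROM xRef_App_Link
    GROUP BY ApplicationID
    ORDER BY Total DESC
    LIMIT 10
""").fetchall()

# Total volume for one application (say ID 1) across all links:
app1_total = conn.execute(
    "SELECT SUM(Volume) FROM xRef_App_Link WHERE ApplicationID = ?", (1,)
).fetchone()[0]
```

Adding a 51st application or a 36th link is then just new rows, not new columns or tables, so the reporting queries never change shape.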

|||

I think you want the app table:

tb_APPLICATION

ApplicationID tinyint identity(1,1)

ApplicationName nvarchar(255)

I am not sure if you need a LINK table or whether all you need is a VOLUME table that contains something like

Volume

RowId identity

LinkId

ApplicationId

LinkVolume

LinkStartTime

LinkEndTime

|||

Ok,

I think I will do it this way instead of having 35 tables lol...

Thanks a lot for the help.

|||I did not mean my response as "the answer" and I think it is premature to mark it as such. I was hoping to get other opinions. Other ideas?

Database Design Question

Which of the following options is best?

Option A)

One database with all the tables for all applications (HR, TechStaff, Inventory, Purchasing, etc.). The bonus to this is that all the data is stored in one place and the data is not duplicated, but when the database has to come down, all applications will be unavailable. Plus, the volume may present a read/write row/column contention issue.

Option B)

Multiple databases... one for each application. Then replicate the necessary tables to each database as needed. For example, replicate the employee table (ID, firstname, lastname) to the databases that require that data. The replication would be set up as one-way replication. The bonus is that if one database goes down, the others are still available, but a downside is that the same data is stored in multiple places and updated through replication (which could slow down the server).

Thoughts?

Seems like you are using 'database' to refer to an 'instance' of SQL Server? If your primary concern is high-availability, you may want to consider cluster or mirroring or even log shipping.

Thanks

|||

I would go with option B. It proved to be more flexible in my case. I used a database for each application. The employees were stored in a separate database that was populated from Active Directory. Each application had an employee table that contained only the authorized employees for that specific application.

|||

Sunil Agarwal wrote:

Seems like you are using 'database' to refer to an 'instance' of SQL Server? If your primary concern is high-availability, you may want to consider cluster or mirroring or even log shipping.

Thanks

Separate databases vs. a single database on one instance of SQL Server.|||

Cristian Lefter wrote:

I would go with B option. It proved to be more flexiblein my case. I used a database for each application. The employees were stored in a separate database that was populated from Active Directory. Each application had an employee table that contained only the authorized employees for that specific application.

That is what I was doing... but instead of maintaining multiple employee tables across many databases, I would set up the server to automatically replicate the employee tables from the main employee database, cascading the changes down to all the other databases that required some sort of employee data.|||Both have their benefits, but replication does seem to be a bit of an overhead in this case.

There is an alternative: create individual databases for each application. Then, rather than replicate the data, you can create a view in the source database selecting the data you need to expose to the new database (e.g. EmployeeList, a list of employees in an HR database). This can join several tables together, as in any normal view. In the new database you create a view containing a simple SQL statement of "SELECT * FROM HrDbName.EmployeeList" and call it anything you like. You can then use this view as you would any other table or view. Provided your users have access to the data in both databases, it will appear as though the data resides in the same database.

This does not stop the problem of one database being unavailable affecting the others but it should result in a number of smaller databases targeted at one objective while still enabling you to effectively share the data between applications.

If the databases are on a different server then you can include the server name in the SELECT statement but you will also have to set up security between the servers.

This is a technique I have used for externally provided applications where we write our own queries. By setting up an enquiry database with views direct to the tables in the supplier's database, I can create as many views as I need without any danger of breaking the application now or in the future.
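The cross-database view technique described above can be sketched as follows. This uses SQLite's ATTACH from Python in place of SQL Server's three-part names, purely so the idea is runnable; HrDb, Employee, and EmployeeList are invented names for the example, not anything from the posts.

```python
import os
import sqlite3
import tempfile

# "HR" database owns the employee data.
hr_path = os.path.join(tempfile.mkdtemp(), "hr.db")
hr = sqlite3.connect(hr_path)
hr.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT)")
hr.execute("INSERT INTO Employee VALUES (1, 'Ada')")
hr.commit()
hr.close()

# The application database reads HR data through a view instead of
# keeping a replicated copy of the table:
app = sqlite3.connect(":memory:")
app.execute("ATTACH DATABASE ? AS HrDb", (hr_path,))
app.execute(
    "CREATE TEMP VIEW EmployeeList AS SELECT Id, Name FROM HrDb.Employee")
names = [row[1] for row in app.execute("SELECT * FROM EmployeeList")]
```

Because the view is just a stored query, changes in the HR database show up immediately, with none of the replication overhead discussed above.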

Database design question

I am brand new to both programming and databases and am trying to design a database for my job as a paramedic.

I am using Visual Studio 2005 and MSSQL. I am trying to create a log for all of our ambulance calls.

My question comes in when I try to design a field that holds our case number, which is formatted as 07-0001. Basically, the year, then four digits for the remainder of the case number.

Any help on how to accomplish this would be SO appreciated!!!|||I would use a varchar for this.

Presumably you want to preserve the minus sign and the leading zero - this rules out numeric storage types such as int.|||Thanks! So, does this mean I can't use that field as an ID for my database? Also, how would I go about the formatting I would need?

Sorry to be such a novice, but I do SO appreciate the help!|||The answer depends on many other questions. Will the database generate the next call number, or is it a user-input/front-end-generated number? Is the number editable? Will the number recycle missing numbers, especially the last deleted ones?
There are quite a few options. It could be saved in 2 separate fields and formatted properly during display, it could be a formula field, it could be a plain varchar field with validations, etc. etc...
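One way to sketch the two-field option mentioned above; the table and column names here (CallLog, CaseYear, CaseSeq) are hypothetical, and the YY-NNNN string is assembled only at display time:

```sql
-- Store the case number as two parts; format '07-0001' on output.
CREATE TABLE dbo.CallLog (
    CaseYear smallint NOT NULL,     -- e.g. 7 for 2007
    CaseSeq  int      NOT NULL,     -- 1, 2, 3, ... within the year
    CallDate datetime NOT NULL DEFAULT GETDATE(),
    CONSTRAINT PK_CallLog PRIMARY KEY (CaseYear, CaseSeq)
)
GO

-- Format the two parts as YY-NNNN at query time
-- (assumes the sequence stays under 10000 per year)
SELECT RIGHT('00' + CAST(CaseYear AS varchar(2)), 2) + '-'
     + RIGHT('0000' + CAST(CaseSeq AS varchar(4)), 4) AS CaseNumber
FROM dbo.CallLog
```

The two-part key still works as the table's primary key, so searching and sorting stay numeric; only the display format is computed.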

Database Design Question

Anyone know a good database scheme for storing formatted text? The text will be coming from XML in this type of format:

<text id="01">
<headline>Here is the title</headline>
<subhead>Here is a subhead</subhead>
<paragraph>Here is some text</paragraph>
</text>
<text id="02">
<headline>Here is the title</headline>
<subhead>Here is a subhead</subhead>
<paragraph>Here is some text</paragraph>
<subhead>Here is a subhead</subhead>
<paragraph>Here is some text</paragraph>
</text>
... etc

How can I store in a database and keep the headlines, subheads, and paragraphs delimited without just storing the XML tags and keeping everything in order (each "text" may contain multiple paragraphs and subheadings ... )?

I'm just wondering if anyone has done this before and what has worked for them ... just looking for opinions on the most efficient way to accomplish it ... thanks!|||If the data length will be less than 8000 characters, drop the data into a varchar column. If the data will exceed 8000 characters, you will have to drop the data into a text column.|||Originally posted by smorton
If the data length will be less than 8000 characters, drop the data into a varchar column. If the data will exceed 8000 characters, you will have to drop the data into a text column.

Thanks for the reply. I'm looking for the most efficient way to set up relationships that will keep my text formatted the same way it is in the XML doc and in the same order. Any ideas?|||Again, why don't you just drop this into a varchar or text field? Do you need to be able to query on different Tags? I don't get why you need to establish relationships to the XML.

blindman|||I need to be able to store other properties about each piece of text ... I guess this could be done with attributes of the XML tags. Is that the best way? Just dump the XML right into the db? Thanks!|||What other attributes?

How do you define a piece of text? One line? All the text between two tags?

What about nested tags?

blindman|||There will be no nested tags ... no need for them at this point. Attributes will be font size, family, color, etc ... and a transition id.

(I'm using ASP to generate XML for Flash ... but I also need to save everything I write to the XML in a database).

So I guess I could just store in the db as follows:

<text id="1">
<head font="arial" size="12" color="000000" transition_id="1">This is the title</head>
<sub font="arial" size="11" color="000000" transition_id="2">This is a sub title</sub>
<para font="arial" size="10" color="000000" transition_id="3">this is a paragraph</para>
<para font="arial" size="10" color="000000" transition_id="3">this is a paragraph</para>
<sub font="arial" size="11" color="FF0000" transition_id="2">This is a sub title</sub>
<para font="arial" size="10" color="FF0000" transition_id="3">this is a paragraph</para>
</text>
... etc

Does that sound right to you? Thanks!|||(follow up) ... text is everything between 2 tags.|||Well, you could have each line be an independent record containing both the text and the common attributes as separate fields. To keep the lines in order, use an incrementing key.

blindman|||you can also identify all the attributes that define a piece of text and separate them into a set of tables that would contain key-based references, all grouped together by a page reference (if applicable).|||Thanks guys. This is kind of what I was thinking ... just wanted to get an expert's opinion. If I use both the incrementing key and the page id with several related tables, then I can keep the data all separated out into organized tables and carry all the required attributes. Then when I am ready to write to XML, for each page I just do a select across the related tables by page id and order it by the incremented order id.

That brings the text out for each page in the right order with all of the formatting in place.

Sounds right to me. Thanks again!
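A sketch of the schema the thread converges on, with hypothetical names: one row per head/sub/para element, an identity column to preserve document order, and the formatting attributes as ordinary fields:

```sql
CREATE TABLE dbo.TextElement (
    ElementId    int IDENTITY(1,1) PRIMARY KEY,  -- preserves order
    PageId       int NOT NULL,                   -- the <text id="..."> block
    ElementType  varchar(10) NOT NULL,           -- 'head', 'sub', or 'para'
    Content      varchar(8000) NOT NULL,
    Font         varchar(50) NULL,
    FontSize     tinyint NULL,
    Color        char(6) NULL,                   -- e.g. 'FF0000'
    TransitionId int NULL
)
GO

-- Pull one page back in original order, ready to re-emit as XML
SELECT ElementType, Content, Font, FontSize, Color, TransitionId
FROM dbo.TextElement
WHERE PageId = 1
ORDER BY ElementId
```

The IDENTITY column plays the role of the "incrementing key" mentioned above, so the ORDER BY reproduces the document order exactly.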

Database Design Question

I am writing a job tracking app with ASP/MSSQL and I have a question about setting up an archive system.

For my app, the database consists of several relational tables that track all of the information about a job. I want to set it up so that if a job is completed, it gets archived.

Over the next 2 or 3 years, this system will likely grow to 100,000+ records.

My question is, should I (1) just use a flag to mark the job as archived, (2) make a flat record of the job and move it into one archive table, or (3) set up archive tables that mirror my production tables and copy the record to the archive tables keeping the relationships intact?

Does anyone have a method that has worked well for them in the past? What will keep my app running the fastest? Any suggestions would be appreciated. Thanks!|||I always prefer a single table with the jobs flagged as active or not (1 or 0)
it makes for simpler querying later with just
select * from table1
where active = 1
and
when you want to see completed tasks
select * from table1
where active = 0

(fewer tables, fewer joins, etc.)|||Cool ... thanks!
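A minimal sketch of that single-table approach, using hypothetical names; "archiving" becomes a one-column update rather than a copy between tables:

```sql
CREATE TABLE dbo.Job (
    JobId   int IDENTITY(1,1) PRIMARY KEY,
    JobName varchar(100) NOT NULL,
    Active  tinyint NOT NULL DEFAULT 1   -- 1 = open, 0 = archived
)
GO

-- "Archive" a completed job: one small update
UPDATE dbo.Job SET Active = 0 WHERE JobId = 42

-- Active and archived views of the same data
SELECT * FROM dbo.Job WHERE Active = 1
SELECT * FROM dbo.Job WHERE Active = 0
```

The flag is shown as tinyint here, in line with the indexing discussion that follows; a bit column would store the same information.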

Do you think 100,000+ records somewhere down the line will be a problem when running reports or doing searches if I just use a flag? I'm trying to design this w/ optimized tuning in mind because I don't know how long it will be in place.

When I flag a job as complete (bit 0,1) and then query w/

WHERE open=(0 or 1)

It still has to check every record, correct? How many records before there is a noticeable difference in the speed of the app?|||two ways to think of this
you could make the column a bit datatype to save on space (it stores 1, 0, or null - small and to the point), but you cannot create an index on this column

OR
create the column as tinyint (1 byte, largest value 255) and then you could index it
but indexing heavily duplicated columns is not recommended
you could also create statistics on the flagged column
i'm of the mind that you should be okay with this, but just monitor performance during operations|||Thanks for your ideas!|||The optimizer will likely not use an index on a boolean column, regardless of whether it is bit, tinyint, or even char(1). There just isn't sufficient cardinality to make it worthwhile.|||true
this is an area where i have (gulp!) envied ORACLE
they have bitmap indexes that you can use on heavily duplicated columns like gender and yes/no

oh well maybe in yukon.|||Yeah, but that's not the point.

You could create an index on boolean values stored in a tinyint field, but the optimizer would likely ignore it because it's not any more efficient than using a table scan. The cardinality is likely so low (high? I can never remember...) that the small performance boost gained by using the index is offset by the overhead of accessing the index.

I'm not an Oracle DBA, but I don't see how Oracle would gain much from indexing boolean values either.|||yes, and using bitmap indexes on yes/no or gender or small sets like rating can give you a boost
but you are right about the boolean; still, i prefer the 1/0 flag to splitting up tables|||I agree. In most cases (not all) a flag is better than splitting the data.|||And the rambling goes on. Well, I don't know who would come up with the idea of creating an index just on a Gender or Yes/No field. And you guys keep on going about it. Kind of silly - would you really consider doing it? Its value WILL come into play if such a field is combined with something else.

And another thing - YOU CAN CREATE AN INDEX ON A BIT FIELD, even by itself.|||Yes and no.

You can't create an index on a bit field through Enterprise Manager's table design form.

Also, Books Online states:

"Columns of type bit cannot have indexes on them. "

But you can create the index using SQL:

CREATE INDEX [IX_BitTest] ON [dbo].[YourTable]([BitValue]) ON [PRIMARY]
GO

...so there is a loophole or inconsistency in SQL server regarding bit values.

Lots of people come up with the idea of indexing boolean fields because they don't have an in-depth understanding of how SQL Server works. That's why they come to dbforums for advice, not ridicule, Bob.|||Originally posted by blindman
...Lots of people come up with the idea of indexing boolean fields because they don't have an in-depth understanding of how SQL Server works. That's why they come to dbforums for advice, not ridicule, Bob. Hey, you started first, I just try to survive here. Remember your comments about my inability to understand English? Help me out here - weren't you ridiculing me?

Some things I found funny, others I saw as an insult, but in neither of those posts were you even attempting to hide or retract your intent to ridicule. I guess you yourself can't take your own medicine, hey?! ;)

Originally posted by blindman
...You can't create an index on a bit field through Enterprise Manager's table design form.

Also, Books Online states:

"Columns of type bit cannot have indexes on them. "

Which Books Online are you reading, Lindman? Get on with the program, everybody else did!

SQL Server Books Online January 2004 Update (http://www.microsoft.com/downloads/details.aspx?FamilyId=A6F79CB1-A420-445F-8A4B-BD77A7DA194B&displaylang=en#filelist)|||YEAH BOB!!!!

rdjabarov does insulted....|||Always go with your strength.|||At the scale of 100,00 or so records, you'll certainly want to be able to index the field, so bit is out. Plus people writing code against boolean fields get confused about null evaluations and equalities, so it's better to use a more humane data type anyway.

tinyint will take the least amount of space and provide the fastest field evaluation available. Although, at such a small table size, neither is a major concern. Whether or not you are expanding a server tree in the Enterprise Manager during query execution will have a much greater effect on performance than how you optimize this field. I agree that standard indexing would have little effect, but I would use a clustered index with the status field as the first element and then your primary where-condition field as the second element of the index. The optimizer is unable to overlook this index because it is reflected in the table's physicality.|||The optimizer does not overlook boolean indexes because it is lazy, or because it can't use them. It chooses to overlook them when it is more efficient NOT to use them. You could force it to use the index using a query hint, but forcing the use of the index does not necessarily make the query more efficient. The same goes for putting the boolean index at the front of the key.|||Originally posted by dbslave
tinyint will take the least amount of space

Isn't tinyint and char(1) the same size?|||Originally posted by Lindman
The optimizer does not overlook boolean indexes because it is lazy, or because it can't use them. It chooses to overlook them when it is more efficient NOT to use them... The optimizer DOES NOT overlook indexes when it is more efficient not to use them. That decision is based on STATISTICS associated with tables referenced by the query. Don't give too much credit to the Optimizer.

And again, according to Books Online (UPDATED!!!) and personal experience, you can have an index on a bit field. In fact, you can even have a clustered index on a bit field (wouldn't know what for though). The point is that you shouldn't have an index on a field whose data is equally split 50/50. Such a field should be part of a composite index to make a difference.

Going back to the original question, whether to have a flag or a replica of the original table for archive purposes... It depends on how many records are going to be archived at a time, and how many records, regardless of whether they are archived or not, would be retrieved. And also, how wide the table is now.

The decision should depend on these questions because if you plan to archive thousands of rows at a time and the table is wide enough (judgment call, no particular width in mind), I'd go with a flag of bit datatype, since updating 1 field will not kill my transaction log the way inserting into a different table would.|||Originally posted by rdjabarov
In fact, you can even have a clustered index on a bit field (wouldn't know what for though). The point is that you shouldn't have an index on a field which data is equally split 50/50. Such a field should be part of a composite index to make a difference.


Logically, yes. Physically, this can be useful for partitioning data.|||Originally posted by Brett Kaiser
Isn't tinyint and char(1) the same size?

Yes, but comparisons on a tinyint are way faster than on a char(1)|||Why would that be?

Got any links that discuss this?

and whats waaaaaaaaaay faster?|||Originally posted by dbslave
Logically, yes. Physically, this can be useful for partitioning data. You're planning to partition 2 records?|||Clustered does not necessarily mean unique.|||Originally posted by blindman
Clustered does not necessarily mean unique. Did you get your "real" Books Online? You have a lot to read...|||USE Northwind
GO

CREATE TABLE myTable99(Col1 int, Col2 int)
GO

CREATE CLUSTERED INDEX myTable99_IX1 ON myTable99(Col1)
GO

INSERT INTO myTable99 SELECT 1,1 UNION ALL SELECT 1,1
GO

SELECT * FROM myTable99
GO

DROP TABLE myTable99
GO

Now say you're sorry...

And I want you all to play nice or it's time out...

Oh, wait, that stuff doesn't start till I get home...

Sorry...|||Hey Brett, I wasn't arguing that clustered does not mean unique! I just don't like "him" butting into other people's conversation, just to make some pointless point (or however "he" put it).|||Oh good heavens, that's exactly why I participate. Having a dozen eyes available helps "keep me honest" because I know that there are many points of view, most of which are probably more "on track" at any given point in time than mine are.

While there are plenty of people that annoy me in one way or another, its still a good thing that they all participate (at least in my opinion). If one person annoys me too badly, I just filter them from my view for a while (until I cool off).

There's a lot of advice that is well meant, and well received that nearly sends me ballistic. I know how badly certain approaches can hammer things up with large (anything between 150 and 1200 Gb) databases, but those approaches often work fine in smaller, simpler environments. I have to keep reminding myself to give appropriate answers based on the questions!

-PatP|||Originally posted by Brett Kaiser
Isn't tinyint and char(1) the same size?

yes they are both 1 byte datatypes but
integer vs char|||I'm with Brett in never having read anything about tinyint being a faster comparison than char(1). White papers?|||boys, this is a non-smoking flight, put your guns down

tiny, small, why not bit? you'd have to have at least 9 of them before it reaches the size of the tinyint. what's the fuss about?|||If tinyint is indeed faster than char(1) then I'll use it in the future.|||oh that's just absurd! how can a datatype be faster than another datatype?|||Originally posted by ms_sql_dba
oh that's just absurd! how can a datatype be faster than another datatype? Way to go, she got you on that one :D :D :D :D :D|||Hey .. where was I when all this was going on ...

hmm ... have been busy lately ... but will be back soon ...|||she?|||Now that you point out .. i was also confused by "She" ...

Hmm .. i'm getting confused a lot lately ...|||Originally posted by r937
she? The "ms" can be interpreted as many things, among them: Microsoft, or the honorific "Ms." that was quite popular among what were called the "rabidly liberated" in the 1970s. Ms. was used instead of Mrs. or Miss by a very small group of women that didn't want to reveal their marital status. I assume that rdjabarov read it in that context.

-PatP|||She got me?

I never claimed it was faster. The notion that tinyint was faster than char(1) was news to both Brett and me, and we simply asked if there was any documentation to support the assertion.

As usual, ms_sql_dba didn't bother to really read the previous posts. She keeps to her old habit of dropping in at the end of the thread and repeating something somebody else had already written 10 posts before, without adding a shred of value to the conversation.

Nyah, nyah, ya missed me! Ya missed me! PPPbbbbbtttt! :p

Grow up.|||It's got to be a full moon or something...

does ANYONE want to put forth WHY tinyint would be more efficient than char(1)?

I'm here for education (and social interaction, well, I guess some types of it... the other types are a waste of time... life is too short)|||Originally posted by ms_sql_dba
boys, this is a non-smoking flight, put your guns down

tiny, small, why not bit? you'd have to have at least 9 of them before it reaches the size of the tinyint. what's the fuss about?

[With the guns blazing]

This is straight out of the Holy book
bit

If there are 8 or fewer bit columns in a table, the columns are stored as 1 byte

tinyint

Integer data from 0 through 255. Storage size is 1 byte.

[/With the guns blazing]

First get a copy of the Holy Book and then read it from end to end.|||OK, so that means...

1 bit = 1 byte
2 bits = 1 byte
3 bits = 1 byte
4 bits = 1 byte
5 bits = 1 byte
6 bits = 1 byte
7 bits = 1 byte
8 bits = 1 byte

But the value of a bit is still 0 or 1 or null (which isn't a value at all)

And has very limited meaning, and I've never seen a good reason to index it..

And Char(1) and tinyint are still both 1 byte

And is it a misconception that 1 of the above is more efficient in an index than the other...|||Originally posted by Brett Kaiser
OK, so that means...

1 bit = 1 byte
2 bits = 1 byte
3 bits = 1 byte
4 bits = 1 byte
5 bits = 1 byte
6 bits = 1 byte
7 bits = 1 byte
8 bits = 1 byte

9 bit = 2 byte
10 bits = 2 byte
11 bits = 2 byte
12 bits = 2 byte
13 bits = 2 byte
14 bits = 2 byte
15 bits = 2 byte
16 bits = 2 byte

and so on ...

so 1 single bit storage space = tiny int storage space|||It was dbslave who originally stated that tinyint was faster, and since we haven't heard anything back from him I'm going to assume that was an error and recommend we drop the inquiry.|||i feel slightly responisble because it was me who brought it up.
but if you are gonna flame someone else then count me in
Mob Rules
string him up...|||Nah, I wasn't stringing him up. Everybody makes mistakes, or has made incorrect assumptions while wading through the masses of documentation for SQL server. I've learned a hell of a lot from this forum.|||Originally posted by blindman
I've learned a hell of a lot from this forum.

Ditto...

Just trying to realize either way...

My gut says no....|||My gut says no....

That's weird ... you still got guts|||Originally posted by blindman
She got me?

I never claimed it was faster. The notion that tinyint was faster than char(1) was news to both Brett and me, and we simply asked if there was any documentation to support the assertion.

As usual, ms_sql_dba didn't bother to really read the previous posts. She keeps to her old habit of dropping in at the end of the thread and repeating something somebody else had already written 10 posts before, without adding a shred of value to the conversation.

Nyah, nyah, ya missed me! Ya missed me! PPPbbbbbtttt! :p

Grow up. So, that makes 2 of us, me and ms_sql_dba. At least I am not alone!!!|||Not to jump in and strive for polarity, but there's been nothing I've seen that says that's true...

And are you going to the rainbow lounge?

Make sure you bring a lot of singles...

:D|||Originally posted by Enigma
That's weird ... you still got guts
I'll bet that I've got more gut than anybody around here!

Somebody made a comment about a rather heated discussion here being an "international incident", and I pointed out that even I'm not that fat (yet).

-PatP|||...Sounds like a call for a pissing contest...|||Originally posted by rdjabarov
...Sounds like a call for a pissing contest... Well, I'd guess that I could "hold my own" (if you'll pardon the term) in that area too!

-PatP|||Try this for yourself. Run the test a few times to be sure the table is in cache. You will be surprised.

create table test(i int identity primary key, a tinyint, b char)
set nocount on
declare @c int
set @c = 0
while @c < 20000 begin
insert test(a,b)
values(0,'0')
insert test(a,b)
values(1,'1')
set @c = @c + 1 end

--test scan
declare @t datetime
set @t = getdate()
select a
from test
where a = 1
print 'tinyint: ' + cast(datediff(ms,@t,getdate()) as varchar) + 'ms'
set @t = getdate()
select b
from test
where b = '1'
print 'char: ' + cast(datediff(ms,@t,getdate()) as varchar) + 'ms'

--test seek: re-run the script above after adding these indexes
create index test_a on test(a)
create index test_b on test(b)

Chars were slower before SQL Server version 7.|||I got about 700ms faster on tinyint after creating indexes.|||And about 1600ms faster than char if I replace tinyint with bit...

-- CREATE TABLE test_datatype_speed_1(
-- ident INT IDENTITY PRIMARY KEY,
-- tinyint_noindex TINYINT,
-- char1_noindex CHAR(1),
-- bit_noindex BIT,
-- tinyint_index TINYINT,
-- char1_index CHAR(1),
-- bit_index BIT)
--
-- CREATE INDEX idx_test_datatype_speed_1_tinyint ON test_datatype_speed_1(tinyint_index)
-- CREATE INDEX idx_test_datatype_speed_1_char1 ON test_datatype_speed_1(char1_index)
-- CREATE INDEX idx_test_datatype_speed_1_bit ON test_datatype_speed_1(bit_index)
--
-- SET NOCOUNT ON
--
-- DECLARE @c INT
--
-- SELECT @c = 0
--
-- WHILE @c < 20000
-- BEGIN
--
-- INSERT test_datatype_speed_1(tinyint_noindex, char1_noindex, bit_noindex, tinyint_index, char1_index, bit_index)
-- VALUES(0,'0',0,0,'0',0)
-- INSERT test_datatype_speed_1(tinyint_noindex, char1_noindex, bit_noindex, tinyint_index, char1_index, bit_index)
-- VALUES(1,'1',1,1,'1',1)
-- SELECT @c = @c + 1
--
-- END

GO

SET STATISTICS IO ON

--test scan
DECLARE @t DATETIME

--Test tinyint_noindex
SELECT @t = GETDATE()
SELECT tinyint_noindex
FROM test_datatype_speed_1
WHERE tinyint_noindex = 1
PRINT 'tinyint_noindex: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

--Test char1_noindex
SELECT @t = GETDATE()
SELECT char1_noindex
FROM test_datatype_speed_1
WHERE char1_noindex = '1'
PRINT 'char1_noindex: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

--Test bit_noindex
SELECT @t = GETDATE()
SELECT bit_noindex
FROM test_datatype_speed_1
WHERE bit_noindex = '1'
PRINT 'bit_noindex: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

--Test tinyint_index
SELECT @t = GETDATE()
SELECT tinyint_index
FROM test_datatype_speed_1
WHERE tinyint_index = 1
PRINT 'tinyint_index: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

--Test char1_index
SELECT @t = GETDATE()
SELECT char1_index
FROM test_datatype_speed_1
WHERE char1_index = '1'
PRINT 'char1_index: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

--Test bit_index
SELECT @t = GETDATE()
SELECT bit_index
FROM test_datatype_speed_1
WHERE bit_index = '1'
PRINT 'bit_index: ' + CAST(DATEDIFF(ms,@t,GETDATE()) AS VARCHAR) + 'ms'

SET STATISTICS IO OFF
GO

This was the average result on my machine:

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
tinyint_noindex: 30ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
char1_noindex: 93ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
bit_noindex: 63ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
tinyint_index: 63ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
char1_index: 93ms

Table 'test_datatype_speed_1'. Scan count 1, logical reads 30, physical reads 0, read-ahead reads 0.
bit_index: 60ms

Kind of interesting.

I'm running Windows 2003 Standard Edition/SQL Server 2000 Enterprise Edition/2ghz Xeon/1gb RAM|||I'm running Windows 2003 Standard Edition/SQL Server 2000 Enterprise Edition/2ghz Xeon/1gb RAM

i'm more impressed by your server.
oh yeah nice code too|||OK, cool, I'm up for testing things out...

Anyone got anything from M$ though?

Also, how curious... a non-indexed tinyint is 33ms faster?

Doesn't make sense...

I bet the results are significantly different when we're talking volumes...|||I just got around to running HansVE's code, and sometimes tinyint was faster and sometimes char was faster. It was about 50/50, so it still looks like a wash to me, at least until I see some kind of documentation or at least some reasoning.

And indexes made no difference.|||How many records are you guys testing with? Also, testing against 1-field index with 1/0 or y/n is going to yield an index scan. And we all know that the number of index pages for all 3 datatypes (char, tinyint, bit) will be the same. In order to test it for practical real-life situations we at least need to agree that such a field needs to be tested with another field with higher selectivity than 50/50.|||Agreed. 50/50 may be a typical distribution of boolean data, but it is not always the case and performance may be different with different cardinality.

90/10?|||No, I was talking about combining this boolean field with another, let's say a 10-character field or a date, while creating an index, and then using this combination of fields in the WHERE clause or in a JOIN. Something like this:

create table t1 (f1 char(10) not null, f2 tinyint/char(1)/bit not null)
--here we need to insert a large number of records, about 100K, where f1 would be 10% duplicates and f2 would be 50% dups.
select * from t1 where f1 = <something> and f2 = 'y'/1/1|||If you read the whole post, the entire thing was centered around bit vs. tinyint. Then, the whole char(1) thing got thrown in. So, if we're talking true bit comparison, you should stick with 0 and 1. Otherwise, you're expanding the field to do a full-scale comparison of datatypes.

In this case, you would have to set baselines at all kinds of different selectivity levels, index level (fill-factors, clustered vs. non-clustered, single vs multi-column), scale, precision, datatype, etc.

After you set the baseline, you have just begun the real work of analyzing the datatype performance. You would then have to compare across size of system, memory, load in tps, size of recordset per selectivity, etc.

It would be great fun, but it's probably a little out of the scope of the thread, isn't it?

:)|||Oh no...A thread that has strayed....

What will happen next?

No more tequila?

AAAAAAAAAAAAAHHHHHHHHHHHHHHHHHHHHHHHHHHHH

:D|||Let the thread keep going as long as it is interesting and informative.

rdjabarov, can you adapt the previous sample code to test or illustrate your conjecture?|||I was hoping Brett will do his "create mytable99..." thing and I'll tweak it ;)|||Spoken like a true lazy DBA! (And I was hinting that YOU should do it because I didn't want to.)

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.|||use a cross join

:cool:|||Originally posted by blindman
Spoken like a true lazy DBA! (And I was hinting that YOU should do it because I didn't want to.)

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.

That's coming out in Yukon

:D|||Let the thread keep going as long as it is interesting and informative.

We better just stick to the interesting part. :)|||Personally, I don't give a rat's arse which is faster, but I am surprised that the collation of the CHAR data type has not been mentioned.

A binary collation outperforms a non-binary collation...

Thanks to Derrick's code, simply adding "COLLATE Latin1_General_BIN" against the CHAR columns reveals...

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
tinyint_noindex: 73ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
char1_noindex: 100ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 91, physical reads 0, read-ahead reads 0.
bit_noindex: 90ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
tinyint_index: 80ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
char1_index: 80ms

Table 'TEST_DATATYPE_SPEED_1'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0.
bit_index: 80ms|||(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
tinyint_noindex: 230ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
char1_noindex: 360ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 92, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
bit_noindex: 290ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
tinyint_index: 183ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
char1_index: 230ms

(20000 row(s) affected)
Table 'test_datatype_speed_1'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
bit_index: 250ms|||Tried out the following

In Enterprise Manager, right-click on the Procedures folder, select "New Stored Procedure", and choose the "Ask Brett to do it" button.

It simply closed my SQL workbench and shut down the system ...

Weird !!! :D|||That's because it engages the margarita request task...

It involves shut down, turn off lights, catch train, fire up blender....

and ok...if we say tinyint IS faster, that's great...

Now add the join back to the code table to obtain the meaningful value for the surrogate key...|||Originally posted by Brett Kaiser
That's because it engages the margarita request task...

It involves shut down, turn off lights, catch train, fire up blender....

Take a train ... you must be living quite far from your office ...|||3 stops...can read 1 article...

town shuttle picks me up 3 houses away to the train...monthly pass US$36.00
Company shuttle takes me to work...

reverse on the way home...

transmission blew on the piece o junk chrysler...after they replaced it under warranty...

No payment, No Gas, No Insurance, No hassle

beautiful Newark NJ|||Wish we had commuter rail in Ohio...|||Originally posted by blindman
Wish we had commuter rail in Ohio...

http://www.cota.com/cota/cotaweb/main.xml

Actually it's just proposed

http://www.cotafasttrax.com/nclrt_transit_large.php|||Originally posted by Enigma
Take a train ... you must be living quite far from your office ... I guess that depends on your definition of far...

I can work from a number of places. They range from zero to three hundred Km from home (home itself moves twice a week too, but that's another story).

Working downtown can be ugly. If you do it the hard way, the last 100 km can take significantly longer than the first 200 km! The commuter rail makes the whole process a lot easier, faster, and safer.

-PatP|||km?

I thought you were in southern Cal...|||I created a monster.

Thank you all for the long, very educational discussion regarding my problem. I think I am all set.|||What about another approach? Instead of storing binary information such as an archive flag, use e.g. a datetime attribute for the creation time and another for the closing/archiving time, with an index on each. On insert, set creation and close to the same value; to select active rows, don't use a NOT predicate but simply CREATION = CLOSE. The index stays efficient, and you can access only archived rows, only open rows, or both, as well as any range of time. This is also how you would manage temporally valid records in one-to-many relationships, e.g. portfolio constructions or account records that you may correct but not delete, for traceability reasons.
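A minimal sketch of the creation/close idea above, using Python's sqlite3 (the table and column names are made up for illustration; SQL Server syntax and types would differ slightly). Active rows have creation equal to close; archiving just stamps the close time, so no NOT predicate is ever needed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account_record (
        id       INTEGER PRIMARY KEY,
        payload  TEXT,
        creation TEXT NOT NULL,   -- set once at insert time
        close    TEXT NOT NULL    -- equals creation while the row is active
    )
""")
conn.execute("CREATE INDEX ix_creation ON account_record (creation)")
conn.execute("CREATE INDEX ix_close ON account_record (close)")

# Insert an active row: creation and close get the same value.
conn.execute(
    "INSERT INTO account_record (payload, creation, close) "
    "VALUES ('v1', '2004-06-01 10:00:00', '2004-06-01 10:00:00')"
)

# Archive it by stamping the close time (no DELETE, no NOT in queries).
conn.execute(
    "UPDATE account_record SET close = '2004-06-02 09:00:00' WHERE id = 1"
)

# Active rows: creation = close; archived rows: creation < close.
active = conn.execute(
    "SELECT COUNT(*) FROM account_record WHERE creation = close"
).fetchone()[0]
archived = conn.execute(
    "SELECT COUNT(*) FROM account_record WHERE creation < close"
).fetchone()[0]
print(active, archived)  # 0 1
```

Both predicates are plain equality/range comparisons, so each can use one of the two indexes directly.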

Furthermore, if you need uniqueness you could use a timestamp as well, but then you have to define a second attribute that is not a timestamp but e.g. a long. The timestamp field is filled automatically and a value cannot be assigned to it directly, but you can read it in an insert trigger: the inserted record will have the timestamp value, which you can access while inserting the record and write to the second, non-timestamp attribute. This would make the values unique.
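A rough analogue of the trigger idea above, again in Python's sqlite3. SQLite has no timestamp/rowversion type, so here the auto-assigned INTEGER PRIMARY KEY stands in for the automatically generated value, and an AFTER INSERT trigger copies it into a plain integer column (a SQL Server version would read the real timestamp from the inserted pseudo-table instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_record (
        id      INTEGER PRIMARY KEY,  -- auto value, stand-in for the timestamp
        payload TEXT,
        version INTEGER               -- the second, non-timestamp attribute
    );
    -- Copy the auto-generated value into the plain column on every insert.
    CREATE TRIGGER copy_version AFTER INSERT ON account_record
    BEGIN
        UPDATE account_record SET version = NEW.id WHERE id = NEW.id;
    END;
""")
conn.execute("INSERT INTO account_record (payload) VALUES ('a')")
conn.execute("INSERT INTO account_record (payload) VALUES ('b')")
rows = conn.execute(
    "SELECT payload, version FROM account_record ORDER BY id"
).fetchall()
print(rows)  # [('a', 1), ('b', 2)]
```

The `version` column now holds a unique, monotonically increasing value that the application can query and index like any ordinary integer.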

If you have to access the archived values very often and you are not happy with the performance, you can store the latest timestamp used in a meta table. This allows direct access to the latest value and comparison with earlier closed ones.

Tried this with SQL Server 2000 and Sybase System 11, it is very fast.

Regards|||News flash! Dead posts are rising from the grave to walk the Earth once again! Authorities urge all citizens to remain calm! Stayed tuned for further instructions...|||mambo,

Thanks for the additional insight ... I am still in the design phase on this project (it got pushed back), so your idea may prove to be useful. I'll check it out. Thanks!|||Wow. Blast to the past. lol Let us know what you finally end up doing Wigz. It's always fun to see 1800 year old posts resurrect themselves.