Saturday, February 25, 2012

Database Design Question

Hello all,

Over the years as I design more database schemas the more I come up with
patterns in database design. The more patterns I recognize the more I
want to try to design some kind of generic design patterns that can be
used and shared amongst many sub-schemas.

For example, the grouping of entities. I may have the following tables:
employee, product and client. These tables have no direct relationship
with each other. But each can be grouped: employee_department,
product_category, client_region.

The idea of grouping for those three tables is exactly the same and thus
I would decide to perhaps create one table named grouping with the
following fields: parent_table_name, parent_table_pk, child_table_name,
child_table_id.

This would make programing and maintaining of the database and sourcode
(Java, ASP, ColdFusion, etc) a lot simpler. Of course there is a limits
and drawbacks to such a solution.

My question is: Is there any literature on such kind of database design.

Thank You

--
* Don Vaillancourt
Director of Software Development
*
*WEB IMPACT INC.*
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: donv@.web-impact.com <mailto:donv@.webimpact.com>
web: http://www.web-impact.com

/ This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright. If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.
/"Don Vaillancourt" <donv@.webimpact.com> wrote in message
news:E3hCd.87498$vO1.514048@.nnrp1.uunet.ca...
> Hello all,
> Over the years as I design more database schemas the more I come up with
> patterns in database design. The more patterns I recognize the more I
> want to try to design some kind of generic design patterns that can be
> used and shared amongst many sub-schemas.
> For example, the grouping of entities. I may have the following tables:
> employee, product and client. These tables have no direct relationship
> with each other. But each can be grouped: employee_department,
> product_category, client_region.
> The idea of grouping for those three tables is exactly the same and thus
> I would decide to perhaps create one table named grouping with the
> following fields: parent_table_name, parent_table_pk, child_table_name,
> child_table_id.
> This would make programing and maintaining of the database and sourcode
> (Java, ASP, ColdFusion, etc) a lot simpler. Of course there is a limits
> and drawbacks to such a solution.
> My question is: Is there any literature on such kind of database design.
> Thank You
>
> --
> * Don Vaillancourt
> Director of Software Development

My question is: Is there any literature on such kind of database design.

Thank You

When you think about, there would not be many patterns to talk about.
Relational database design is constrained by the fact that at the end of the
day you have only binary relationships to work with. So you have:

1:0-1
1:0-m
m:m (which resolves to binaries, anyways)

The great thing about relational databases when done right is the semantics
you can captured in the design. Once you begin to generalize tables,
columns, and relationships, you lose the semantics.

I've have seen attempts to accomplish what you are talking about, but in the
long run the resulting design and implementation actually harder to maintain
and understand then have a normalized database with good naming conventions
and foreign key constraints. You lose all semantics when the column names
and values refer to other database objects. Much like the same problems with
intelligent keys.

If you are thinking along the lines of 'Design Patterns', I just don't think
there are that many that would make life any easier. A good design makes
everything easier!

You might be interested in looking at Fowler's 'Analysis Patterns', Inmon,
et all 'The Data Model Resource Book.' and David Hay's 'Data Model Patterns:
Conventions of Thought'.
I haven't read Hay's book. But from Book News, Inc. 'Analyzes data
structures common to many types of businesses in areas such as accounting,
material requirements planning, process manufacturing, contracts,
laboratories, and documents, for analysts who have learned the basics of
data modeling (or entity/relationship modeling) but who need more insight to
prepare a model of a real business. Includes b&w diagrams, and a set of
ready-to-use models for typical applications in many industries.'

Good luck. I'm sure you've started a lively discussion.

Regards,

Rich

> / This email message is intended only for the addressee(s)
> and contains information that may be confidential and/or
> copyright. If you are not the intended recipient please
> notify the sender by reply email and immediately delete
> this email. Use, disclosure or reproduction of this email
> by anyone other than the intended recipient(s) is strictly
> prohibited. No representation is made that this email or
> any attachments are free of viruses. Virus scanning is
> recommended and is the responsibility of the recipient.
> /|||Sure, if you look at the very low level, you do end up with pure
relationships.

But when you look a bit higher I notice two types of patterns:

1. The relationship between tables
2. The similarity between tables

The following example falls within both of the above bullets. I needed
to create a number of different types of users. These users needed to
be distinct much like employees and clients. In my case it was staff,
members and advertisers. I needed to store a number of bits of
information which was similar to all three. Those bits of information
was stored in four tables that are contact_information (email, phone,
etc), identification (first name, last name, etc), address and login
(username , password).

The reason I created three user tables was between each had added
specifications which required 1:1 or 1:m or m:n relationships which were
not required by the other three.

Rich R wrote:
> "Don Vaillancourt" <donv@.webimpact.com> wrote in message
> news:E3hCd.87498$vO1.514048@.nnrp1.uunet.ca...
>>Hello all,
>>
>>Over the years as I design more database schemas the more I come up with
>>patterns in database design. The more patterns I recognize the more I
>>want to try to design some kind of generic design patterns that can be
>>used and shared amongst many sub-schemas.
>>
>>For example, the grouping of entities. I may have the following tables:
>>employee, product and client. These tables have no direct relationship
>>with each other. But each can be grouped: employee_department,
>>product_category, client_region.
>>
>>The idea of grouping for those three tables is exactly the same and thus
>>I would decide to perhaps create one table named grouping with the
>>following fields: parent_table_name, parent_table_pk, child_table_name,
>>child_table_id.
>>
>>This would make programing and maintaining of the database and sourcode
>>(Java, ASP, ColdFusion, etc) a lot simpler. Of course there is a limits
>>and drawbacks to such a solution.
>>
>>My question is: Is there any literature on such kind of database design.
>>
>>Thank You
>>
>>
>>
>>--
>>* Don Vaillancourt
>>Director of Software Development
>
> My question is: Is there any literature on such kind of database design.
> Thank You
> When you think about, there would not be many patterns to talk about.
> Relational database design is constrained by the fact that at the end of the
> day you have only binary relationships to work with. So you have:
> 1:0-1
> 1:0-m
> m:m (which resolves to binaries, anyways)
> The great thing about relational databases when done right is the semantics
> you can captured in the design. Once you begin to generalize tables,
> columns, and relationships, you lose the semantics.
> I've have seen attempts to accomplish what you are talking about, but in the
> long run the resulting design and implementation actually harder to maintain
> and understand then have a normalized database with good naming conventions
> and foreign key constraints. You lose all semantics when the column names
> and values refer to other database objects. Much like the same problems with
> intelligent keys.
> If you are thinking along the lines of 'Design Patterns', I just don't think
> there are that many that would make life any easier. A good design makes
> everything easier!
> You might be interested in looking at Fowler's 'Analysis Patterns', Inmon,
> et all 'The Data Model Resource Book.' and David Hay's 'Data Model Patterns:
> Conventions of Thought'.
> I haven't read Hay's book. But from Book News, Inc. 'Analyzes data
> structures common to many types of businesses in areas such as accounting,
> material requirements planning, process manufacturing, contracts,
> laboratories, and documents, for analysts who have learned the basics of
> data modeling (or entity/relationship modeling) but who need more insight to
> prepare a model of a real business. Includes b&w diagrams, and a set of
> ready-to-use models for typical applications in many industries.'
> Good luck. I'm sure you've started a lively discussion.
> Regards,
> Rich
>
>>/ This email message is intended only for the addressee(s)
>>and contains information that may be confidential and/or
>>copyright. If you are not the intended recipient please
>>notify the sender by reply email and immediately delete
>>this email. Use, disclosure or reproduction of this email
>>by anyone other than the intended recipient(s) is strictly
>>prohibited. No representation is made that this email or
>>any attachments are free of viruses. Virus scanning is
>>recommended and is the responsibility of the recipient.
>>/

--
* Don Vaillancourt
Director of Software Development
*
*WEB IMPACT INC.*
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: donv@.web-impact.com <mailto:donv@.webimpact.com>
web: http://www.web-impact.com

/ This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright. If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.
/|||Don Vaillancourt (donv@.webimpact.com) writes:
> Over the years as I design more database schemas the more I come up with
> patterns in database design. The more patterns I recognize the more I
> want to try to design some kind of generic design patterns that can be
> used and shared amongst many sub-schemas.
> For example, the grouping of entities. I may have the following tables:
> employee, product and client. These tables have no direct relationship
> with each other. But each can be grouped: employee_department,
> product_category, client_region.
> The idea of grouping for those three tables is exactly the same and thus
> I would decide to perhaps create one table named grouping with the
> following fields: parent_table_name, parent_table_pk, child_table_name,
> child_table_id.
> This would make programing and maintaining of the database and sourcode
> (Java, ASP, ColdFusion, etc) a lot simpler. Of course there is a limits
> and drawbacks to such a solution.
> My question is: Is there any literature on such kind of database design.

I would not be surprised if you can find something on the topic. An SQL
conference I attended there was a guy who gave a presentation on how you
could employ model in a relational database. I didn't attend his talk,
but some friends did, and they were curious enough to talk with him in
the hotel bar later on, and that's were I met them. Sure, his demo was
very slick. But while he was away to the loo, I asked one my friends
"does it scale", and he answered "of course not".

And there lies the problem with a design you suggest: it could look
neat. There could be considerably fewer tables in the database. For
the object-oriented oriented - not to say fixated - programmer, it
would look very clean. But once you have 100 GB data in your database,
you may find that performance is horrible. Remember, that there are
no tools that are so powerful in making things going really slow as
relational database engines.

The key here is that while the tables may look very similar on the surface,
their usage, their size and their distribution of data may vary
vividly, and that's why they need to be separate tables.

If you find that you have many similarily looking tables, some sort of
preprocessor or code-generator may be the way to do.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||You are quite right in regards to the table size getting extremely large
which is one of my chief concerns. One of the tables that I have in one
of our existing installation is already reaching 2.5 millions records
and it's only a 3-way relationshop table. So I do understand the
implications which is why I'm trying to do some research.

But I had been thinking of doing what you suggested and approach
everything from a programming standpoint.

So I'll keep doing research either way until I find some kind of solution.

Here's something that I had been thinking of at one point. Another one
of our tables is getting fairly large and was thinking of using some
kind of hash methodology where I would spread the rows across a number
of tables.

Erland Sommarskog wrote:
> Don Vaillancourt (donv@.webimpact.com) writes:
>>Over the years as I design more database schemas the more I come up with
>>patterns in database design. The more patterns I recognize the more I
>>want to try to design some kind of generic design patterns that can be
>>used and shared amongst many sub-schemas.
>>
>>For example, the grouping of entities. I may have the following tables:
>>employee, product and client. These tables have no direct relationship
>>with each other. But each can be grouped: employee_department,
>>product_category, client_region.
>>
>>The idea of grouping for those three tables is exactly the same and thus
>>I would decide to perhaps create one table named grouping with the
>>following fields: parent_table_name, parent_table_pk, child_table_name,
>>child_table_id.
>>
>>This would make programing and maintaining of the database and sourcode
>>(Java, ASP, ColdFusion, etc) a lot simpler. Of course there is a limits
>>and drawbacks to such a solution.
>>
>>My question is: Is there any literature on such kind of database design.
>
> I would not be surprised if you can find something on the topic. An SQL
> conference I attended there was a guy who gave a presentation on how you
> could employ model in a relational database. I didn't attend his talk,
> but some friends did, and they were curious enough to talk with him in
> the hotel bar later on, and that's were I met them. Sure, his demo was
> very slick. But while he was away to the loo, I asked one my friends
> "does it scale", and he answered "of course not".
> And there lies the problem with a design you suggest: it could look
> neat. There could be considerably fewer tables in the database. For
> the object-oriented oriented - not to say fixated - programmer, it
> would look very clean. But once you have 100 GB data in your database,
> you may find that performance is horrible. Remember, that there are
> no tools that are so powerful in making things going really slow as
> relational database engines.
> The key here is that while the tables may look very similar on the surface,
> their usage, their size and their distribution of data may vary
> vividly, and that's why they need to be separate tables.
> If you find that you have many similarily looking tables, some sort of
> preprocessor or code-generator may be the way to do.

--
* Don Vaillancourt
Director of Software Development
*
*WEB IMPACT INC.*
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: donv@.web-impact.com <mailto:donv@.webimpact.com>
web: http://www.web-impact.com

/ This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright. If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.
/|||Don Vaillancourt (donv@.webimpact.com) writes:
> You are quite right in regards to the table size getting extremely large
> which is one of my chief concerns. One of the tables that I have in one
> of our existing installation is already reaching 2.5 millions records
> and it's only a 3-way relationshop table. So I do understand the
> implications which is why I'm trying to do some research.
> But I had been thinking of doing what you suggested and approach
> everything from a programming standpoint.
> So I'll keep doing research either way until I find some kind of solution.
> Here's something that I had been thinking of at one point. Another one
> of our tables is getting fairly large and was thinking of using some
> kind of hash methodology where I would spread the rows across a number
> of tables.

The number of rows is not necssarily the problem. It mainly becomes a
problem if

1) You want to spread out the data over several disks.
2) You periodically need to remove a portion of the data.
3) Most of the queries concern only a small set of the data, and these
queries can include a different search conditions.

2.5 million is not a small table - but it is by no means huge.

What I found problematic in your approach is that you in the same
table would mix different sorts of data, with different distrbution
etc. Now you may get into problem, because the query plan that is
good for the ABC-relation is bad for the XYZ-relation.

I should add that what is essential for big tables are proper indexing.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment