Saturday, February 25, 2012

Database design question

I have a series of database objects that represent things such as people,
accounts, etc. I have a set of options (boolean) that I need to add to
these objects. Normally I would just create a bit field for each one and be
done with it. The challenge however is that there could be hundreds of
these options (maybe 300) and a user decides which of these options
(actually attributes) that should apply to these db objects.

For example, a user can create a new type of person object. What makes this
person object different is the fact that it has a unique set of these
attributes. So whenever someone creates a new instance of this person
object (another one), it contains these attributes (whose values can be set
uniquely).

These attributes are mainly for searching, in other words, you can flag an
account with some of these attributes and then search on them. In terms of
database design, what is a good approach? I was thinking of having a master
list of all the available attributes in a separate table. Then a separate
table containing the mappings to the types (e.g. persontype1 has attribute
1,3,4,5,6...). Then a third table containing the specific instances of
these persons and their value (e.g. 'bob' of persontype1 values for 1=on,
3=off, etc.).

If this is the case, how would would one search on these items? It seems
that the code to build the searching (considering all the items are dynamic)
would be very ugly.

Sorry if this isn't totally clear, this is actually the first time I am
explaining it on paper (well sort of).>> I have a series of database objects that represent things such as
people, accounts, etc. I have a set of options (boolean) that I need
to add to these objects. Normally I would just create a bit field
[sic] for each one and be
done with it. <<

Series? SQL uses sets. Booleans? SQL has no boolean data type. You
are probably about to detroy your data integrity with a EAV design.

>> The challenge however is that there could be hundreds of these
options [sic] (maybe 300) <<

Options? Well, at least you know they are really attributes. An
object does not have optional attributes; it is the sum of all its
attributes in an RDBMS. This is foundations, not rocket science.

>> and a user decides which of these options (actually attributes) that
should apply to these db objects. <<

I certainly hope not! You are supposed to know what you are doing and
not let any random user design the schema.

You then describe mixing data and metadata in such a way that you will
never have data integrity. Stop what you are doing and read a book
RDBMS basics; you missed the whole concept.|||300 *boolean* attributes! I wonder how many you would really need if
you modelled the same attributes with well-chosen codes and valued
attributes instead of check boxes (which I bet is the origin of this
design). 300 bits of data is actually very little.

Represent classes and sub-classes with the common attributes in a
common table and the specific attributes in related tables sharing the
same primary key. Optional attributes can also be nullable or use
tokens for the inapplicable values.

--
David Portas
SQL Server MVP
--|||Tim Mavers wrote:
> I have a series of database objects that represent things such as people,
> accounts, etc. I have a set of options (boolean) that I need to add to
> these objects. Normally I would just create a bit field for each one and be
> done with it. The challenge however is that there could be hundreds of
> these options (maybe 300) and a user decides which of these options
> (actually attributes) that should apply to these db objects.
> For example, a user can create a new type of person object. What makes this
> person object different is the fact that it has a unique set of these
> attributes. So whenever someone creates a new instance of this person
> object (another one), it contains these attributes (whose values can be set
> uniquely).
> These attributes are mainly for searching, in other words, you can flag an
> account with some of these attributes and then search on them. In terms of
> database design, what is a good approach? I was thinking of having a master
> list of all the available attributes in a separate table. Then a separate
> table containing the mappings to the types (e.g. persontype1 has attribute
> 1,3,4,5,6...). Then a third table containing the specific instances of
> these persons and their value (e.g. 'bob' of persontype1 values for 1=on,
> 3=off, etc.).
> If this is the case, how would would one search on these items? It seems
> that the code to build the searching (considering all the items are dynamic)
> would be very ugly.
> Sorry if this isn't totally clear, this is actually the first time I am
> explaining it on paper (well sort of).

I think you've done a good job with design if I understood you correctly.
I presume you have a 3 tables: Person, Attribute and PersonAttribute.
PersonAtribute is N:N link that has PersonID and AttributeID.

One way to preform search is to create a stored procedure which will
find all persons who have a set of attributes linked to it. Pass the
selected attributes to a procedure as a parameter and then join them
with Persons thru PersonAttribute.|||Tim Mavers (webview@.hotmail.com) writes:
> These attributes are mainly for searching, in other words, you can flag
> an account with some of these attributes and then search on them. In
> terms of database design, what is a good approach? I was thinking of
> having a master list of all the available attributes in a separate
> table. Then a separate table containing the mappings to the types (e.g.
> persontype1 has attribute 1,3,4,5,6...). Then a third table containing
> the specific instances of these persons and their value (e.g. 'bob' of
> persontype1 values for 1=on, 3=off, etc.).

Hm, if I understand this correctly, you appear to need more tables.

First there is:

CREATE TABLE attributes (attributeid int NOT NULL,
attributename varchar(50) NOT NULL,
CONSTRAINT pk_attributes PRIMARY KEY (attributeid))

Then you have:

CREATE TABLE persontyoes (persontypeid int NOT NULL,
persontypename varchar(50) NOT NULL,
CONSTRAINT pk_persontypes PRIMARY KEY (persontypeid))

Which defines the possible persontypes. Whether this should really be
an objecttypes table, or you should have one table for persons, another
for accounts etc I can't really tell with the scant information that I
have.

Then to define which attributes that are possible for a person type:

CREATE TABLE persontypeattributes (persontypeid NOT NULL,
attributeid NOT NULL,
CONSTRAINT pk_personstypeattr PRIMARY KEY (persontypeid,
attreibuteid), CONSTRAINT fk1_persontype FORIEGN KEY (persontypeid)
REFERENCES persontypes (persontypeid),
CONSTRAINT fk2_attribute FOREIGN KEY (attributeid)
REFERNCES attributes (attributeid)

Then you would need a column in the persons table to identify the
persontype - or if a person can belong to more than one person table,
you need a personpersontypes table. And finally you would need a
personattributes table. This last table is a little tricky, because
you somehow need to ascertain that the attributes applicable to the
person's person type(s) go into the table. You probably need a trigger
for that.

> If this is the case, how would would one search on these items? It
> seems that the code to build the searching (considering all the items
> are dynamic) would be very ugly.

I'm not really sure how these searches really looks like. But if a user
searches for users with certain settings flags of five attributes (and
the rest thus "don't care"), I think you could do something like:

SELECT personid
FROM personatttributes pa
JOIN searchcriteris sa ON pa.attributeid = sa.attributeid
AND pa.attributeval = sa.attributeval
WHERE sa.searchkey = @.searchkey
GROUP BY personid
HAVING COUNT(*) = (SELECT COUNT(*)
FROM searchcriteria
WHERE searchkey = @.searchkey)

That is, you would shove down the uses choices in a table, and identify
each search with some session key.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Thanks for the all the replies so far. Right now I have:

Person Table (name, phone, etc.)

I have a pre-set list of attributes (attributes for a person). These mostly
consist of yes/no values, but some are can contain numbers or text.
Functionally, special PersonTypes (like MarketingPerson, EducationalPerson,
etc.) need to be created. What makes them a specific type is the set of
these attributes (for lack of a better term) that needs to be assigned to
them. So again funcitonally, someone will go in and define a
"MarketingPerson" and select one or more of these values to be included.

When someone creates a new MarketingPerson, those new attributes (booleans,
text fields,etc) are available even though they really aren't part of the
"Person" table.

I am concerned about searching? Having so many attributes available can
generate a really ugly (and large) query, correct?

"Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
news:Xns960E128F8CDAYazorman@.127.0.0.1...
> Tim Mavers (webview@.hotmail.com) writes:
>> These attributes are mainly for searching, in other words, you can flag
>> an account with some of these attributes and then search on them. In
>> terms of database design, what is a good approach? I was thinking of
>> having a master list of all the available attributes in a separate
>> table. Then a separate table containing the mappings to the types (e.g.
>> persontype1 has attribute 1,3,4,5,6...). Then a third table containing
>> the specific instances of these persons and their value (e.g. 'bob' of
>> persontype1 values for 1=on, 3=off, etc.).
> Hm, if I understand this correctly, you appear to need more tables.
> First there is:
> CREATE TABLE attributes (attributeid int NOT NULL,
> attributename varchar(50) NOT NULL,
> CONSTRAINT pk_attributes PRIMARY KEY (attributeid))
> Then you have:
> CREATE TABLE persontyoes (persontypeid int NOT NULL,
> persontypename varchar(50) NOT NULL,
> CONSTRAINT pk_persontypes PRIMARY KEY (persontypeid))
> Which defines the possible persontypes. Whether this should really be
> an objecttypes table, or you should have one table for persons, another
> for accounts etc I can't really tell with the scant information that I
> have.
> Then to define which attributes that are possible for a person type:
> CREATE TABLE persontypeattributes (persontypeid NOT NULL,
> attributeid NOT NULL,
> CONSTRAINT pk_personstypeattr PRIMARY KEY (persontypeid,
> attreibuteid), CONSTRAINT fk1_persontype FORIEGN KEY (persontypeid)
> REFERENCES persontypes (persontypeid),
> CONSTRAINT fk2_attribute FOREIGN KEY (attributeid)
> REFERNCES attributes (attributeid)
> Then you would need a column in the persons table to identify the
> persontype - or if a person can belong to more than one person table,
> you need a personpersontypes table. And finally you would need a
> personattributes table. This last table is a little tricky, because
> you somehow need to ascertain that the attributes applicable to the
> person's person type(s) go into the table. You probably need a trigger
> for that.
>> If this is the case, how would would one search on these items? It
>> seems that the code to build the searching (considering all the items
>> are dynamic) would be very ugly.
> I'm not really sure how these searches really looks like. But if a user
> searches for users with certain settings flags of five attributes (and
> the rest thus "don't care"), I think you could do something like:
> SELECT personid
> FROM personatttributes pa
> JOIN searchcriteris sa ON pa.attributeid = sa.attributeid
> AND pa.attributeval = sa.attributeval
> WHERE sa.searchkey = @.searchkey
> GROUP BY personid
> HAVING COUNT(*) = (SELECT COUNT(*)
> FROM searchcriteria
> WHERE searchkey = @.searchkey)
> That is, you would shove down the uses choices in a table, and identify
> each search with some session key.
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp|||Tim Mavers (webview@.hotmail.com) writes:
> I have a pre-set list of attributes (attributes for a person). These
> mostly consist of yes/no values, but some are can contain numbers or
> text.

This can be covered by sql_sqlvariant. The attribute definition would
have a column that defines the data type. This can then be enforced
in a trigger by using sql_variant_property() to find the current data
type.

> I am concerned about searching? Having so many attributes available can
> generate a really ugly (and large) query, correct?

Did you look at this query that I proposed:

>> SELECT personid
>> FROM personatttributes pa
>> JOIN searchcriteris sa ON pa.attributeid = sa.attributeid
>> AND pa.attributeval = sa.attributeval
>> WHERE sa.searchkey = @.searchkey
>> GROUP BY personid
>> HAVING COUNT(*) = (SELECT COUNT(*)
>> FROM searchcriteria
>> WHERE searchkey = @.searchkey)
>>
>> That is, you would shove down the uses choices in a table, and identify
>> each search with some session key.

Doesn't look ugly to me. However, it may not be effective. In fact, since
I only made it up, it may not even work.

I would be essential to enforce the actual data types for the sql_variants
in searchcriterias too.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment