database - when - why is it better to have multiple separate tables?
Three customer addresses in one table or in separate tables? (6)
In my application I have a
Customer class and an
Address class. The
Customer class has three instances of the
Whats the best way to reflect this structure in a database?
- The straightforward way would be a customer table and a separate address table.
- A more denormalized way would be just a customer table with columns for every address (Example for "street": customer_street, delivery_street, invoice_street)
What are your experiences with that? Are there any advantages and disadvantages of these approaches?
I'd go (as database theory teaches) for two separate tables: Customer and Address.
The idea of putting three fields in the Customer table is bad, as you say, because it would violate normalization rules (and fail when addresses would become more than three).
edit: also, I'd split the Address table record in several fields, one for the toponomastic prefix, one for the street, etc. and put a unique key on those. Otherwise, you'd end with a database full of duplicates.
I'd go with denormalized. It's easier.
If you normalize it, the address table would require you to remove duplicates and relink records from the customer table. If I had the same delivery and invoice address, it seems like it should link to the same record. If I change one, you're required to:
- Create a new address record.
- Relink the customer record.
If I change it back you need to check:
- Does a similar address entry already exist.
- Relink the customer record.
This programming overhead seems to obviate the advantage of less space that normalization seems to offer. A denormalized solution, like you pointed out, would provide faster queries and easier maintenance (programming-wise). This seems to be the deciding factor for me.
A normalized solution, like pointed out above, would allow you to add more addresses later. But you'd have to add a field for the foreign key anyways, unless you planned on organizing the tables without linkage from the customer to the address table.
Advantages of Normalized
- Less space.
- Duplicate logic kinda built in (though maybe they weren't actually duplicates?)
- Support addition of new address fields (kinda).
Advantages of Denormalized
- Faster queries.
- Less programming.
If you are 100% certain that a customer will only ever have the 3 addresses you described then this is OK:
CREATE TABLE Customer ( ID int not null IDENTITY(1,1) PRIMARY KEY, Name varchar(60) not null, customerAddress int not null CONSTRAINT FK_Address1_AddressID FOREIGN KEY References Address(ID), deliveryAddress int null CONSTRAINT FK_Address2_AddressID FOREIGN KEY References Address(ID), invoiceAddress int null CONSTRAINT FK_Address3_AddressID FOREIGN KEY References Address(ID), -- etc ) CREATE TABLE Address ( ID int not null IDENTITY(1,1) PRIMARY KEY, Street varchar(120) not null -- etc )
Otherwise I would model like this:
CREATE TABLE Customer ( ID int not null IDENTITY(1,1) PRIMARY KEY, Name varchar(60) not null -- etc ) CREATE TABLE Address ( ID int not null IDENTITY(1,1) PRIMARY KEY, CustomerID int not null CONSTRAINT FK_Customer_CustomerID FOREIGN KEY References Customer(ID), Street varchar(120) not null, AddressType int not null -- etc )
One important fact you may need to consider (depending on your problem domain) is that people change addresses, and may want to let you know in advance of their address change; this is certainly true for utility companies, telcos, etc.
In this case you need to have a way to store multiple addresses for the customer with validity dates, so that the address can be set up in advance and automatically switch at the correct point. If this is a requirement, then a variation on (2) is the only sensible way to model it, e.g.
Customer (id, ...) Address (id, customer_id, address_type, valid_from, valid_to)
On the other hand, if you don't need to cater for this (and you're sure you won't in the future) then probably (1) is simpler to manage because it's much easier to maintain data integrity as there's no issues with ensuring only one address of the same type exists, and the joins become simpler as they're only on one field.
So either (1) or (2) are fine depending on whether you need house-moves, but I'd steer clear of (3) because you're then repeating the definition of what an address is in the table, and you'll have to add multiple columns if you change what an address looks like. It's possibly slightly more performant, but to be honest when you're dealing with properly indexed joins in a relational database there isn't a lot to be gained, and it's likely to be slower in some scenarios where you don't need the address as the record size for a customer will be larger.
Best way to model Customer <--> Address
I tend towards first approach for all the usual reasons of normalisation. This approach also makes it easier to perform data cleansing on mailing details.
If you are possibly going to allow multiple addresses (mail, residential, etc) or wish to be able to use effective dates, consider this approach
Customer (id, phys_address_id) Cust_address_type (cust_id, mail_address_id, address_type, start_date, end_date) Address (id, street, city, etc.)
SQL Database Design
Option 1 for sure. All your addresses should be in one spot. This shouldn't hinder performance either. You can limit the addresses returned by the WHERE statement when you query the data.
Three customer addresses in one table or in separate tables?
Referencing the aforementioned link: Since you're saying your customer address relationship is modeled as one-to-many, I would use the following example of one address table, an AddressType and a EntityId FK. Using this method would allow for an Entity (client/employee/contact) to have many addresses.
CREATE TABLE Entity ( ID int not null IDENTITY(1,1) PRIMARY KEY, Name varchar(60) not null, EntityType int not null -- etc ) CREATE TABLE Address ( ID int not null IDENTITY(1,1) PRIMARY KEY, EntityID int not null CONSTRAINT FK_Entity_EntityID FOREIGN KEY References Entity(ID), Street varchar(120) not null, AddressType int not null -- etc )