Let’s take a table that stores invoices. A lot of people would name that table “Invoices”.
Simple enough, but what about related tables?
A table that contains invoice line items may be called “InvoiceItems”. Let’s add a third called “InvoiceItemDetails”.
What you end up with now, is the following order:
Now your tables are in alphabetical order, but completely out of logical order. Invoices is the parent table, so it should come first, followed by InvoiceItems, and InvoiceItemDetails.
This is why I like using the singular:
Simply removing the “s” puts everything in logical order, the way they should be.
It’s a simple thing, but it makes it easier to find what you’re looking for.
Don’t ever update your database without taking a backup first.
That shouldn’t need to be said, but it does.
I built a critical database system for a client, and it ran great for years.
Until Company B came along to install a new system. I’ll leave them unnamed, although do we really need to protect the guilty?
Anyway, they needed to install THEIR database.
Usually not a problem. You have SQL Server running, you add a database, you add another database, you even add a new instance if need be.
But what would possess you to delete the existing database, uninstall SQL Server, re-install it, set up your database, and then restore a 4-month old copy of MY database you found somewhere on the network?
It happened. All I could say was, “Wow…”.
Because of the particular nature of this client, the loss of a database like this could have crippled them for months, or even put them out of business completely.
Fortunately, I had them making regular backups, and after a massive panic on their part I was able to restore everything back to normal.
They lost a day’s worth of work before they noticed something was wrong, and another day waiting to have it fixed, but that was a minor inconvenience compared to what could have happened.
There are a lot of things I could have foreseen happening to the system that would require backups, but never once did it occur to me that another database “expert” would come in and flat out delete something that didn’t belong to them.
At the very least, they could have taken a backup first. Instead, they found an old copy and restored it, hoping nobody would notice.
If you are doing any kind of work on a database, no matter how minor, take a backup first. You never know what could go wrong…
Sometimes we have a calculation in a query, and we need to use the result of that calculation in multiple places.
The way I usually see that done is to repeat the calculation multiple times. For example:
SELECT DISTINCT A.SomeNumber * LOG(B.NextNumber) + POWER(C.ThisNumber, D.Exponent) AS FirstValue, 100 - (A.SomeNumber * LOG(B.NextNumber) + POWER(C.ThisNumber, D.Exponent)) AS SecondValue, 25 * (A.SomeNumber * LOG(B.NextNumber) + POWER(C.ThisNumber, D.Exponent)) AS ThirdValue FROM TableOne A INNER JOIN TableTwo B ON B.AKey = A.AKey INNER JOIN TableThree C ON C.BKey = B.BKey INNER JOIN TableFour D ON D.CKey = C.CKey
There’s one main calculation in there, which is being used for multiple columns:
A.SomeNumber * LOG(B.NextNumber) + POWER(C.ThisNumber, D.Exponent)
Besides being difficult to read and understand, it’s really easy to make a mistake when you have to do updates.
And this is a fairly simple, made-up example. I’ve seen queries that used CASE statements to perform different calculations depending on various criteria, with the entire CASE statement repeated multiple times throughout the query. Ugly stuff…
When writing code, we try to use the Don’t Repeat Yourself (DRY) Principle, but this applies to queries as well.
What we can do is simplify the main query to get rid of the repeated code, and then wrap it with another query that uses those calculated values:
SELECT InnerValue, (100 - InnerValue) AS SecondValue, (25 * InnerValue) AS ThirdValue FROM ( SELECT DISTINCT A.SomeNumber * LOG(B.NextNumber) + POWER(C.ThisNumber, D.Exponent) AS InnerValue FROM TableOne A INNER JOIN TableTwo B ON B.AKey = A.AKey INNER JOIN TableThree C ON C.BKey = B.BKey INNER JOIN TableFour D ON D.CKey = C.CKey ) AS T
The main calculation is now done in a subquery. The outer query uses InnerValue for the SecondValue and ThirdValue columns, instead of repeating the calculation itself.
So now the calculation is written only once, and only needs to be updated in one place. Plus, it’s much easier to read, understand, and debug.
There are lots of things that irritate me about poorly named database fields, but one of the worst is naming the primary key field “ID”.
Naming primary keys, or any other database field, is an afterthought for many programmers, but not taking the time to use a consistent naming convention leads to confusion and errors.
Let’s take 3 tables – Product, Invoice and InvoiceItem. And let’s assume we’re using artificial (surrogate) keys (my preference).
It’s pretty easy to guess in this example that Invoice.ID = InvoiceItem.InvoiceID. But you still need to go into the table design to verify the relationship actually exists.
But what about that ItemID field? It should relate to the Product.ID field, but the name is misleading. Now we have no choice but to go into the table design to find out which field links back to the Product table.
Also consider your queries:
SELECT Invoice.ID, InvoiceItem.ID, Product.ID, Product.ProductName
INNER JOIN InvoiceItem ON InvoiceItem.InvoiceID = Invoice.ID
INNER JOIN Product ON Product.ID = InvoiceItem.ItemID
First, the relationship between the fields is a mess because there is no consistent naming. You can’t tell just from looking whether or not the query is correct.
Then, you end up with the following dataset:
ID ID ID ProductName 16 23 49 My Product
Which ID came from which table?
This is a simple example, of course, but when you’re writing a complex query with multiple tables, having an “ID” field in every table gets confusing really fast. It gets exponentially worse when you have more tables and lots of foreign keys.
A better way is to use the table name in the ID field:
The primary key is always tableID. It’s consistent in every table, and the foreign key has the same name as the primary key.
The InvoiceID fields are now obviously related, and you don’t need to guess which fields to join on. Same with the ProductID fields.
Take the time to give your primary key fields unique names that are consistent from one table to the next. It avoids confusion, and makes it easier to check that your query is correct.
Naming primary keys properly is a simple way to make your database easier to use. The programmer who takes over after you leave will appreciate it.