Common Database Design Mistakes

Project Management

When creating a new database instance, people will often make mistakes. While I can’t list all the mistakes that people can or will make, I hope this brief list will help you know what mistakes are possible, and help guide you to not making as many mistakes. Sometimes we attack a design problem with the idea that we will just get the work done, but most times it is better to take the extra time to do it right.

I’m not perfect, and I have made these (any many other) mistakes in database design. I’m not trying to tell you what to do or even how to do it. I’m just trying to take my lessons learned and provide a simple list so that you might not make the same mistakes. I also want to point out that no list will ever be the only way to do anything. With database design questions, the best answer is usually “it depends”. When considering the many variables that make up your environment, you will need to make many decisions that help your database instance work best in your unique environment. You have to take into account the personnel you are working with, limits of your hardware, company policies, etc.

Database design and implementation is the cornerstone of any database related project and should be treated will the importance that deserves. If you do your job really well, people will tend to minimize how important your job is in getting their  projects completed. Like a police department that does a good job catching and locking up criminals, people start wondering why they need so many policemen when the crime rate goes down. People might start asking why they need your help in getting good database design, but it will only take a few failed projects for them to come back to you for your professional help.

Database Design Mistakes:

  1. Poor System Design
  2. Poor Documentation
  3. Ignoring Database Normalization
  4. Using Poor Keys
  5. Not Using Stored Procedures
  6. Bad Naming Standards
  7. Poor Testing

Poor System Design

Many people are bad at planning, and that is reflected in the business and personal life. It has been said that the failure to plan is planning to fail. You need to plan out your database design before you start creating anything. Think about what this new database instance will be required to support, and design the system to support that activity before you actually create anything. I agree you can’t possibly think about all possible scenarios, but you can try to think of what could happen in the next three years. What size will the database be in three years? How many users will be accessing the database? What is the expected number of transactions per minute? How ill the database be backed up or restored? What is the incident recovery plan, and how will the database design make that plan possible?

Poor documentation

When creating a new database instance, everything seems obvious. Of course you named the database objects in a way that is obvious to their purpose and everyone will understand what each object does just by looking at them. Maybe this is true, but it probably isn’t as obvious as you think. While carefully naming your objects, table columns, and database files can help make it clear to everyone what each object is used for, given the passage of time the meaning of undocumented objects becomes less clear.

Using a standard naming convention is only the first step in effective documentation. When you have to go back and make changes in a few years, you will really appreciate any written documentation that is available. Create a short document that provides some description of the decisions you are making, and why those decisions seemed so important today.  Things change, and even naming standards change. When someone wonders about the overall design in three or four years, a simple document that outlines the decision process will go a long way to helping explain why the design decisions made sense at the time. Sure, you can include the documentation in the form of extended properties, but it isn’t required.

You want enough documentation that you can hand the database over to a new database administrator or programmer and they can easily figure out what each object is supposed to do without having to track you down for an explanation. You may think that poor documentation is a form of job security, but doing a great job is always a better way of keeping your high-paying job.

Ignoring Database Normalization

The concepts of database normalization have been around for many years, and is the basis for relational databases. Database normalization defines a set of methods to break down tables to their constituent parts until each table represents one and only one “thing”, and its columns serve to fully describe only the one “thing” that the table represents.

Normalizing your data is essential to good performance, and ease of development, but you will often find yourself asking how much normalization is enough. The accepted standards say that 3rd Normal Form is essential, but 4th and 5th Normal Forms are really useful and well worth the time required to implement them.

Unfortunately it is quite common that 1st Normal Form is implemented incorrectly.

Using Poor Keys

With database normalization, 1st Normal Form dictates that all rows in a table must be uniquely identifiable. That means each and every table should have a primary key. SQL Server allows you to define a numeric column as an IDENTITY column, and then automatically generates a unique value for each row. Alternatively, you can use NEWID() to generate a random, 16 byte unique value for each row. These types of values, when used as keys, are what are known as surrogate keys. The word surrogate means “something that substitutes for” and in this case, a surrogate key should be the stand-in for a natural key.

The problem is that too many designers use a surrogate key column as the only key column on a given table. The surrogate key values have no actual meaning in the real world; they are just there to uniquely identify each row. If you make it obvious how to identify the unique row, you have selected a good key.

Not Using Stored Procedures

If you haven’t already figured this out, stored procedures are your best friend when it comes to using SQL Server effectively. You don’t want the developer accessing the data layer directly, so you create a stored procedure to insulate the database layer from the users. Stored procedures will make database development much easier to understand, helps developers maintain and share business logic, and helps keep your database logic cleaner. It can also help you maintain a secure database by forcing security to the stored procedures instead of at the table level.

Poor naming standards

People will sometimes think of database object names as unimportant and might not put much time into naming database objects in a way that is easily understood or documented. The best choice is to use a standard naming convention, be consistent in your object names, and remember that the names shouldn’t require explanation. No administrator or developer wants to read through a long document looking for the meaning of your object name.

Avoid spaces in the object names (MenuItem is better than [Menu Item]), and try not to include metadata in the object name (Customer is always better than tblCustomer).

There are several options for reading more about this subject.

Lack of testing

Testing during development and design will save you time once the system goes into production. Any database administrator knows that the database is always blamed if a systems starts running slow.The best way to combat that perception is to test everything, and always know what has changed.

I know it is easier to say you should test thoroughly than you actually accomplishing a true in-depth testing plan. As pressure is put on a project plan to make up lost time or shorten the development schedule, the first thing that is the target for abbreviation is the testing plan.

Even after something moves to production doesn’t mean the testing has to stop. A good database administrator will continue to evaluate database performance, review error logs, and look for those things that aren’t working well. You should know what normal looks like and be able to warn the developers and management if the database starts behaving abnormally.

You might find some additional help on this subject here, here, or here.

6 thoughts on “Common Database Design Mistakes”

  1. Good article, and I applaud the support for 4NF and 5NF which is so often disregarded. Just a nitpick about your description of normalization.

    Tables don’t represent “things”. Peter Chen explained in his seminal paper on the entity-relationship model that entities are abstract and represented by values. Tables represent facts about entities and relations between entities (attributes are just functional relations). The goal of normalization is to ensure that every fact is represented only once, and free of interference from updates to other facts.

    The misconception that relational databases represent things is behind many arguments that they represent things poorly, and much unnecessary effort is created by the conflation of concepts in software engineering disciplines.

    Like

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.