What is your methodology for troubleshooting an issue with your database systems?
Most of the technical problems I have encountered with database systems are not really the fault of the technology at hand. They are the result of rushed work, apathetic work, ignorance of proper procedures, or botched troubleshooting.
Before talking about what troubleshooting is, let’s talk about what it is not. Troubleshooting is not:
- Searching Google for solutions based on information you don’t even understand
- Trying something you knew worked in some completely unrelated situation
- Trying to work out a quick solution that doesn’t address the original problem
- Replacing the system because you can’t figure out the cause
For a professional database administrator, guessing or trying various solutions without understanding what you are doing is not a reasonable option.
You probably hire people who have a methodical approach toward technical issues, apply common sense to business problems, and maybe even have extensive certifications or years of experience. All those things can be taught, but do they have the required traits to be a troubleshooter?
So what are some common traits among true troubleshooters?
- Calm Under Pressure – Do they understand the importance of a deadline, but refuse to rush to a solution just to meet a deadline?
- Uses Tools Wisely – There is nothing wrong with using an internet search to help gather information about a problem. But do they seek to understand the proposed solution, and do they vet the possible answer to verify that it is the correct one? There is a difference between finding a solution that might work and understanding what will work.
- Knows When To Say When – Do they know when to admit they don’t know an answer and also know who to call for support? No one will think less of you if you ask for help and get the problem solved.
- Simple Solutions First – If a server fails to boot, you probably shouldn’t just grab a screwdriver and start replacing the motherboard. A smart troubleshooter starts with the simple things first, like checking the power cord. They should look for the simple solutions and increase scope and complexity as needed until a solution is found.
- Learn from Disaster – Figuring out the problem doesn’t mean you get to skip the “post mortem” process. Do they take the time to understand why the issue occurred, how it can be prevented, what specifically solved it, what it revealed about the disaster recovery strategy, where immediate improvements can be made, and when the longer term improvements will happen?
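The “Simple Solutions First” trait can be sketched in a few lines of code. This is only an illustration, not a real diagnostic tool: the `Check` class, the `first_failure` function, and the four probes are hypothetical names made up for this example, and the probes return hard-coded results where a real script would test the actual system. The idea is simply to order the checks from cheapest to most invasive and stop at the first one that fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Check:
    """One diagnostic step; probe() returns True when that layer looks healthy."""
    name: str
    probe: Callable[[], bool]


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run the checks in order and return the name of the first one that fails."""
    for check in checks:
        if not check.probe():
            return check.name
    return None  # every check passed, so widen the scope of the investigation


# Hypothetical probes with hard-coded results, ordered simplest first.
checks = [
    Check("power cord connected", lambda: True),
    Check("host reachable on the network", lambda: True),
    Check("database service running", lambda: False),
    Check("storage hardware healthy", lambda: True),
]

print(first_failure(checks))
```

Because the list is ordered, the expensive hardware check never runs here: the search stops at the database service, which is the first simple thing found to be wrong.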
These same troubleshooting skills can be applied everywhere in just about any job. Understand what you are doing, learn from your mistakes, and apply a consistent (and calm) methodology to escape trouble wherever it hits your business.
There are books on the subject, like this book.