This month, my good friend and fellow Dune-fan Joe Fleming (@muaddba.bsky.social) from SQL Tailor Consulting asks us about weird problems, and in particular, how we troubleshoot them.
Now, ‘weird’ is an interesting word to use, and Joe actually says “weird or unique”. I’m imagining those times when someone looks at something and is surprised by what they see. That’s certainly a situation I recognise. There are plenty of times when I’m working at a customer site and someone calls me over because there’s a situation they don’t understand. Being available for my customers like this is one of the reasons I still like to be on-site when I can.
The way I troubleshoot these situations is typically to look for things that I don’t seem right. I’ve heard it said that people learning how to spot counterfeit bank notes study legitimate ones, so they know a real note so intimately that they can sense when something isn’t right, and then narrow it down as to why their “spidey-sense” is tingling. I like to think that after a lot of years of dealing with customer situations, I have a good sense for where problems are.
Except that occasionally I don’t, and that’s when I start to break things down.
Way back in the very early days of T-SQL Tuesday, there was a month where we had to present a puzzle (and its solution). My entry was about the minutiae of the HAVING clause. I chose this because when I teach Advanced T-SQL, I like to get people to understand what each component of the language is really doing. After all, these things are the building blocks of queries, and it’s important to be very familiar with them. Components like HAVING or ON or SUM(col) are easy to understand on the surface, but knowing some of the nuances can really help make a difference when you’re troubleshooting something ‘weird’.
Using frameworks or getting AI agents to create code can be very helpful for getting code created quickly, but if you don’t understand what each component does or the implications of the choices made, then troubleshooting is going to be that much harder. It can help to know really know those nuances.
Just about any process is made up of a combination of patterns. I’ll either recognise them or I won’t. If I don’t, I’ll break them down until they’re made up of patterns I do recognise, at which point I can see how they interact with each other as they get built back up. Almost always, I can spot that a particular pattern isn’t behaving the way I’d expect, or there’s an issue in the way that some patterns are interacting.
For example, I know what it means to have a clustered index on an identity column in a SQL database table. I know that if the clustered index is not marked as unique (as it is if the column is the table’s primary key), then it’s possible to have duplicates and this could affect the performance of a query (whether there are duplicates or not). I’ve seen the impact of joins that use columns which are thought of as the primary key, but which aren’t. And I’ve seen times when the business thinks things should be unique even when they’re not. This is more of a ‘uniqueness’ problem than a ‘unique’ problem, but it’s certainly the type of thing that can make a report-creator in a business say that something is ‘weird’. The query might look like it’s fine. But by understanding what’s happening with the joins being used, or with the table/index designs, I can generally spot where the patterns I’m supposed to recognise don’t seem to be behaving in the ways that I’d expect.
Joe asks about non-technical problems too. For this, I’m turning to some lessons from a business-leaders group that I’m in (but much like my customers, these meetings are confidential, so I’m not going to go into detail). Just like with technical problems, there are patterns that can be easily recognised and patterns that are less familiar. But with non-technical situations, things typically come down to the deeper questions that my fellow business leaders know will get asked in that group – questions like “What would be the impact of getting this wrong?”, “What information don’t we have here?”, and “Whose perspective are we not considering here?”. If these (and other) types of questions can be explored honestly, the underlying issues almost always come to the surface. I apply the same things with my customers at LobsterPot Solutions, where our ‘why’ is to help our customers do what they do better, whether the problems are technical or non-technical.
So my advice for you, reading this, is to troubleshoot by looking for where the patterns you recognise are not behaving the way you’d expect – maybe something hasn’t been implemented correctly, or maybe there’s an underlying opinion causing a problem. And please remember, the challenge is not just to find where the problem is, but to find a solution. Although can you really find a solution if you don’t know what the problem is?
@robfarley.com@bluesky (previously @rob_farley@twitter)