Round-up for T-SQL Tuesday 182 – Integrity

January 16, 2025

I was the host for T-SQL Tuesday this month, inviting people to write about integrity. And because I don’t believe in just providing a list of posts (although I’ve also done that at the end of the post), I have a completely fabricated story for you. And no, this wasn’t created using AI. I have more integrity than that (see what I did there?).

Here’s the story:

Eugene was frustrated. He’d been working hard on his database, only to find that something wasn’t right. He suspected corruption. It wasn’t a block of missing data, like Chad’s missing 20 minutes of data, but Chad certainly felt his pain. Andy also sympathised, and asked whether the corruption was in the storage layer, or coming from somewhere else. Shane had some excellent processes for detecting storage corruption, so figured it must’ve been coming from somewhere else. Deborah piped up. She could see the referential integrity was in place (I could too, but that didn’t matter to me so much), but asked if she could see the data lineage too, because that’s important. Luckily, Kevin’s CI/CD processes were in place, so everyone could see that the data was being fed through correctly. And then the nature of the problem came to light – Gene could see bad DAX posts, created using AI tools that were simply wrong. The corruption was in the people who were using AI tools without considering the ethics. Jeff pointed out that it would be better to not have any DAX posts rather than DAX posts that were bad. Quite rightly, Hugo questioned the professional integrity of those people, and everyone agreed that integrity is a rare commodity these days.

And here’s the list of posts that were referenced in the story:

Andy Yun – https://sqlbek.wordpress.com/2025/01/14/t-sql-tuesday-182-integrity – wrote about how corrupt data isn’t always because of issues in the storage layer, but can also come from bugs in the application code. Personally, I think those are the most terrifying ones, and have spent considerable time addressing those kinds of things myself. I feel your pain, Andy.

Deborah Melkin – https://debthedba.wordpress.com/2025/01/14/t-sql-tuesday-182-data-integrity – wrote a wonderful piece that started off talking about standard technical constraints, but then dipped into the fact that data integrity is more than just what happens in a single database. She talked about how data makes its journey from system to system, and challenges us to make sure we understand the lineage of data and have appropriate controls to ensure the data integrity across its whole lifecycle.

Chad Callihan (@callihandata) – https://callihandata.com/2025/01/14/t-sql-tuesday-182-integrity – wrote about a time when 20 minutes of data went missing in a migration, thanks to a failed backup & restore process. Chad’s personal integrity is such that he couldn’t just turn a blind eye – he needed to get that data sorted. Well done, Chad.

Kevin Chant – https://www.kevinrchant.com/2025/01/14/t-sql-tuesday-182-improve-your-data-platform-integrity-with-ci-cd-practices – wrote about how strong CI/CD practices can help make sure your data is good. He points out that automation is our friend, because a strong process creates demonstrable evidence of data integrity. And I agree!

Gene Meidinger (@sqlgene.com) – https://www.sqlgene.com/2025/01/01/the-fraught-ethics-around-ai-chatgpt-and-power-bi – wrote a piece about how the lack of integrity in others is changing the space in AI and BI. (I’m less sure about CI and DI, but certainly AI and BI.) He outlines some obligations that people should have if they’re using AI tools, and some very real warnings too. It’s a good read and I’m pleased he included it.

Shane O’Neill (@sozdba.bsky.social) – https://nocolumnname.blog/2025/01/14/t-sql-tuesday-182-integrity/ – shares some useful scripts that he uses to check for corruption in his database. He said he’s never had to deal with corruption in an actual production database, only the Corruption Challenge that Steve Stedman ran ten years ago. I have fond memories of that challenge – it was fun, and I managed to win it. I still point people to it as a useful reference for fixing corruption. But Shane’s post is a great resource for noticing the corruption in the first place. (Hint: fixing corruption can be much easier if you have a good backup to use for comparison.)

Jeff Mlakar (@jmlakar) – https://www.mlakartechtalk.com/integrity-database-corruption – wrote about the significance of integrity as a pillar of databases and data: you need integrity to ensure trust, and no data is better than bad data. A lot of foundational stuff that is amazingly useful.

Hugo Kornelis (@hugokornelis.bsky.social) – https://sqlserverfast.com/blog/hugo/2025/01/t-sql-tuesday-182-integrity – brings out some great points about professional integrity and personal integrity. He highlights the fact that professional integrity can take a long time to build up but be destroyed very quickly. He also mentions that the values that underpin his personal integrity also come through in his professional integrity, which I love. I’m pretty sure people who lack personal integrity struggle with professional integrity as well, so I’m very pleased to see that Hugo also recognises the connection between them.

And my own post – https://lobsterpot.com.au/blog/2025/01/14/how-much-does-that-primary-key-or-foreign-key-matter – in which I talk about some of the reasons why it’s not that much of a problem that Fabric DW doesn’t have enforceable primary and foreign keys.

Thanks to all of you, and again, to the wider T-SQL Tuesday community!

@robfarley.com@bluesky (previously @rob_farley@twitter)