Figuring out what has changed in a system isn’t always easy, which makes it a good topic for Meagan Longoria to have chosen this month – read the invitation and see what other people have contributed at https://datasavvy.me/2026/05/04/t-sql-tuesday-198-invitation-how-do-you-detect-data-changes/
I’ve used quite a few different methods over the years, because each has its pros and cons, and what works best in one situation might not work at all in another. For example – it’s rarely okay to make changes to an existing application, but sometimes it’s okay to replicate the data to a second database and use that to detect changes. However, if your database doesn’t have primary keys, then some kinds of replication won’t work either…
In an ideal world, the database is controlled in-house, so that methods like Change Tracking, Change Data Capture, or the new-in-2025 Change Event Streaming (CES) can be used. Teams writing bespoke applications locally can often agree to design around the restrictions that these methods impose, but it all depends on what’s needed.
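Change Tracking, for example, needs nothing more than a couple of ALTER statements – here’s a sketch, with made-up database and table names:

-- Turn on Change Tracking for the database, with a retention window for cleanup
ALTER DATABASE SalesDb
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

-- Then opt in each table you care about (it needs a primary key)
ALTER TABLE dbo.Orders
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);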
For example, I really like CES. It’s an easily-configured feature which streams changes out to Azure Event Hubs using standard protocols like Apache Kafka or AMQP. The data appears in a CloudEvents message in JSON format, and includes both the old values and the new values (how good is that!), so it’s easy to watch the event stream for messages that suit particular purposes, but also to dump the data into longer-term storage so you can go back to it later.

I wouldn’t want to search through a backlog of changes from years ago in JSON format, but it’s easy to push the changes into a table for the next ETL load (which saves me trying to use start/end times to pick up the changes for each load, and hoping I can rely on arrival times), and also to store the data in a different format to suit a longer history of changes, at whatever grain I like. Tracking every row with triggers feels a little too close to the application in most cases, while temporal tables physically add columns (even if they’re ‘hidden’ columns, that’s probably not a good thing) and lock the tables down a lot.
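Landing those messages in a staging table is simple enough. Here’s a minimal sketch – the CloudEvents envelope fields (id, time, data) come from the CloudEvents spec, but the staging table and the shape of everything under data are my assumptions, not something copied from the product:

-- A hypothetical staging table for the next ETL load
CREATE TABLE dbo.ChangeStaging (
    EventId   varchar(100),
    EventTime datetime2,
    Payload   nvarchar(max)
);

-- @msg would come from whatever is reading the Event Hub
DECLARE @msg nvarchar(max) = N'{ "id": "abc-123", "time": "2026-05-04T10:00:00Z", "data": {} }';

INSERT dbo.ChangeStaging (EventId, EventTime, Payload)
SELECT JSON_VALUE(@msg, '$.id'),
       TRY_CONVERT(datetime2, JSON_VALUE(@msg, '$.time')),
       JSON_QUERY(@msg, '$.data'); -- keep the raw old/new values, to reshape later at whatever grain suits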
But every silver cloud has a dark lining… You know how I mentioned that local developers can often code around the restrictions here? Well, that’s a thing.
At least for the moment (and I’m not hinting at inside knowledge here – I don’t have any), there are some limitations.
Some are fine – for example, you can add or remove columns, and CES doesn’t care. If a value appears in a column that didn’t exist when CES was first configured, that’s fine! The message going to Event Hubs will reference the new column (it’s just JSON, after all), and hopefully the code consuming the stream can cope with that. Maybe have a column for “other column data”, as in the sketch below? But if you want to rename a column that’s in a CES-enabled table… sorry, that won’t work.
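Something like this, perhaps – the flat shape of the new-values object is my assumption about the payload, and the column names are invented:

-- Pick out the columns you know about, and bucket anything unexpected
DECLARE @newValues nvarchar(max) =
    N'{ "OrderId": 42, "Status": "Shipped", "GiftWrap": "Y" }'; -- GiftWrap appeared after CES was set up

SELECT JSON_VALUE(@newValues, '$.OrderId') AS OrderId,
       JSON_VALUE(@newValues, '$.Status')  AS Status,
       (SELECT [key], [value]
        FROM OPENJSON(@newValues)
        WHERE [key] NOT IN ('OrderId', 'Status')
        FOR JSON PATH)                     AS OtherColumnData; -- the "other column data" bucket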
If you’re using partitions, CES isn’t for you. You might not notice it on day one, but when you go to use SWITCH, you’ll get an error. Worth knowing. Oh, and you’re probably not using TRUNCATE TABLE on a table you’re tracking, but stranger things have happened.
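That is, statements like these are the ones that start failing once the table is in a stream group (the table names and partition number here are hypothetical):

ALTER TABLE dbo.FactSales SWITCH PARTITION 1 TO dbo.FactSales_Archive; -- errors on a CES-enabled table
TRUNCATE TABLE dbo.FactSales; -- likewise disallowed while the table is being tracked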
But when we get to the column restrictions, there are some surprises. If you’re someone who uses geometry and geography, bad luck – they’re not supported (a shame, because I always felt like there should be more people using those data types). But json and xml aren’t supported either – and I do see a lot of xml columns. Like I said, if you’re wanting to use CES in a locally-controlled environment, then maybe you can code around this? Or consider CDC, CT, temporal tables, or plain old replication, which all support those types because they don’t need the data serialised as much as CES does.
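CDC, for instance, is happy enough with an xml column, and enabling it is just two system procedures (the schema and table names below are made up):

-- Enable CDC for the database, then for the table with the xml column
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Messages',
    @role_name     = NULL; -- NULL means no gating role for access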
For me, CES is very promising for detecting changes, and gives me a bunch of flexibility about how I do that. It’s very probably my preferred method for tracking history, but the limitations are still pretty annoying. Hopefully in vNext…?
@robfarley.com@bluesky (previously @rob_farley@twitter)