SQL Server Reporting Services plays nicely. You can have things in the catalogue that get shared. You can have Reports that have Links, Datasets that can be used across different reports, and Data Sources that can be used in a variety of ways too.
So if you find that someone has deleted a shared data source, you potentially have a bit of a horror story going on. And this works for this month’s T-SQL Tuesday theme, hosted by Nick Haslam, who wants to hear about horror stories. I don’t write about LobsterPot client horror stories, so I’m writing about a situation that a fellow MVP friend asked me about recently instead.
The best thing to do is to grab a recent backup of the ReportServer database, restore it somewhere, and figure out what’s changed. But of course, this isn’t always possible.
And it’s much nicer to help someone with this kind of thing, rather than to be trying to fix it yourself when you’ve just deleted the wrong data source. Unfortunately, it lets you delete data sources, without trying to scream that the data source is shared across over 400 reports in over 100 folders, as was the case for my friend’s colleague.
So, suddenly there’s a big problem – lots of reports are failing, and the time to turn it around is small. You probably know which data source has been deleted, but getting the shared data source back isn’t the hard part (that’s just a connection string really). The nasty bit is all the re-mapping, to get those 400 reports working again.
I know from exploring this kind of stuff in the past that the ReportServer database (using its default name) has a table called dbo.Catalog to represent the catalogue, and that Reports are stored here. However, the information about what data sources these deployed reports are configured to use is stored in a different table, dbo.DataSource. You could be forgiven for thinking that shared data sources would live in this table, but they don’t – they’re catalogue items just like the reports. Let’s have a look at the structure of these two tables (although if you’re reading this because you have a disaster, feel free to skim past).
Frustratingly, there doesn’t seem to be a Books Online page for this information, sorry about that. I’m also not going to look at all the columns, just ones that I find interesting enough to mention, and that are related to the problem at hand. These fields are consistent all the way through to SQL Server 2012 – there doesn’t seem to have been any changes here for quite a while.
dbo.Catalog
The Primary Key is ItemID. It’s a uniqueidentifier. I’m not going to comment any more on that. A minor nice point about using GUIDs in unfamiliar databases is that you can more easily figure out what’s what. But foreign keys are for that too…
Path, Name and ParentID tell you where in the folder structure the item lives. Path isn’t actually required – you could’ve done recursive queries to get there. But as that would be quite painful, I’m more than happy for the Path column to be there. Path contains the Name as well, incidentally.
Type tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it).
Content is an image field, remembering that image doesn’t necessarily store images – these days we’d rather use varbinary(max), but even in SQL Server 2012, this field is still image. It stores the actual item definition in binary form, whether it’s actually an image, a report, whatever.
LinkSourceID is used for Linked Reports, and has a self-referencing foreign key (allowing NULL, of course) back to ItemID.
Parameter is an ntext field containing XML for the parameters of the report. Not sure why this couldn’t be a separate table, but I guess that’s just the way it goes. This field gets changed when the default parameters get changed in Report Manager.
There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead.
dbo.DataSource
The Primary key is DSID. Yes it’s a uniqueidentifier…
ItemID is a foreign key reference back to dbo.Catalog
Fields such as ConnectionString, Prompt, UserName and Password do what they say on the tin, storing information about how to connect to the particular source in question.
Link is a uniqueidentifier, which refers back to dbo.Catalog. This is used when a data source within a report refers back to a shared data source, rather than embedding the connection information itself. You’d think this should be enforced by foreign key, but it’s not. It does allow NULLs though.
Flags this is an int, and I’ll come back to this.
When a Data Source gets deleted out of dbo.Catalog, you might assume that it would be disallowed if there are references to it from dbo.DataSource. Well, you’d be wrong. And not because of the lack of a foreign key either.
Deleting anything from the catalogue is done by calling a stored procedure called dbo.DeleteObject. You can look at the definition in there – it feels very much like the kind of Delete stored procedures that many people write, the kind of thing that means they don’t need to worry about allowing cascading deletes with foreign keys – because the stored procedure does the lot.
Except that it doesn’t quite do that.
If it deleted everything on a cascading delete, we’d’ve lost all the data sources as configured in dbo.DataSource, and that would be bad. This is fine if the ItemID from dbo.DataSource hooks in – if the report is being deleted. But if a shared data source is being deleted, you don’t want to lose the existence of the data source from the report.
So it sets it to NULL, and it marks it as invalid.
We see this code in that stored procedure.
1 2 3 4 5 6 7 8 9 |
UPDATE [DataSource] SET [Flags] = [Flags] & 0x7FFFFFFD, -- broken link [Link] = NULL FROM [Catalog] AS C INNER JOIN [DataSource] AS DS ON C.[ItemID] = DS.[Link] WHERE (C.Path = @Path OR C.Path LIKE @Prefix ESCAPE '*') |
Unfortunately there’s no semi-colon on the end (but I’d rather they fix the ntext and image types first), and don’t get me started about using the table name in the UPDATE clause (it should use the alias DS). But there is a nice comment about what’s going on with the Flags field.
What I’d LIKE it to do would be to set the connection information to a report-embedded copy of the connection information that’s in the shared data source, the one that’s about to be deleted. I understand that this would cause someone to lose the benefit of having the data sources configured in a central point, but I’d say that’s probably still slightly better than LOSING THE INFORMATION COMPLETELY. Sorry, rant over. I should log a Connect item – I’ll put that on my todo list.
So it sets the Link field to NULL, and marks the Flags to tell you they’re broken. So this is your clue to fixing it.
A bitwise AND with 0x7FFFFFFD is basically stripping out the ‘2’ bit from a number. So numbers like 2, 3, 6, 7, 10, 11, etc, whose binary representation ends in either 11 or 10 get turned into 0, 1, 4, 5, 8, 9, etc. We can test for it using a WHERE clause that matches the SET clause we’ve just used. I’d also recommend checking for Link being NULL and also having no ConnectionString. And join back to dbo.Catalog to get the path (including the name) of broken reports are – in case you get a surprise from a different data source being broken in the past.
1 2 3 4 5 6 |
SELECT c.Path, ds.Name FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; |
When I just ran this on my own machine, having deleted a data source to check my code, I noticed a Report Model in the list as well – so if you had thought it was just going to be reports that were broken, you’d be forgetting something.
So to fix those reports, get your new data source created in the catalogue, and then find its ItemID by querying Catalog, using Path and Name to find it.
And then use this value to fix them up. To fix the Flags field, just add 2. I prefer to use bitwise OR which should do the same. Use the OUTPUT clause to get a copy of the DSIDs of the ones you’re changing, just in case you need to revert something later after testing (doing it all in a transaction won’t help, because you’ll just lock out the table, stopping you from testing anything).
1 2 3 4 5 6 7 |
UPDATE ds SET [Flags] = [Flags] | 2, [Link] = '3AE31CBA-BDB4-4FD1-94F4-580B7FAB939D' /*Insert your own GUID*/ OUTPUT deleted.Name, deleted.DSID, deleted.ItemID, deleted.Flags FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; |
But please be careful. Your mileage may vary. And there’s no reason why 400-odd broken reports needs to be quite the nightmare that it could be. Really, it should be less than five minutes.
This Post Has 11 Comments
Very helpful and I’m glad you’ve published this just in case somebody needs help around 9pm at night because a Junior DBA did something "bad" during the day, this resource can be discovered.
Very informative.. Thanks for the details..
Great article that I found 1 day too late 🙂 I had this situation last night, luckily it was only 20 reports and not 400. I’ve book marked this article for future reference.
Hi Rob,
Interesting article! Some feedback:
"There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead."
I think it’s logical that the report record does not show what shared data source it uses. As a report can use more than one shared data source and a shared data source can be used by more than one report, an additional table is needed to store the many-to-many relation. But I’m sure you know that? (might I have misunderstood what you meant?)
"Type tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it)."
3 = resource, 8 = dataset, 9 = report part (ref. http://bidn.com/blogs/ChrisAlbrektson/bidn-blog/1083/cross-reference-for-reportserver-dbo-catalog-type)
cya,
@ValentinoV42
Thanks Valentino.
Regarding the potential many-to-many table required, I agree but can also point out that the parameters are populated using a single ntext column. It breaks first normal form, but there’s definitely a precedent right there. I prefer the way they have it though, and actually wish parameters had their own table too.
Source control!
Ken,
Yes. If you can redeploy everything and have it sort out which reports use which shared data sources, then that’s even better.
Rob
Great post! I had a slightly different issue: A new report was deployed to 70 servers but most of them didn’t link the data source even though the name was the same. Slight modification to your code plus a little SQL Multi-Script and all is right and good in the world. Thanks!
If you know of the person that deleted the datasource have them check their recycle bin and then restore it. It’s worked for me in the past without having to update the datasource in each report.
Lifesaver – thank you! 410 reports in my case 🙂
Pingback: rsInvalidDataSourceReference Error after Moving RDL Files to Another SSRS Server - DBA Fire