Publish a Duplicate Detection Rule: Behind the scenes

When you import a solution you get a nice progress bar that you can use to approximately estimate the finish time. But when you publish a duplicate detection rule, you get nothing. All you can basically do is hit the publish and check back 10-15 minutes later.

Sometimes, it might sit in the publishing state longer that you expect. It can be frustating to just wait, without actually knowing whether the process has frozen or it is actually doing something. The underlying entity DuplicateRule is not searcheable from advanced find, and so there is no other information available except the window saying “Publishing”.

I did some research into what actually happens under the covers when you hit publish. When you hit publish CRM has to:

  1. Create new matchcode tables for the base entity and the matching entity
  2. Populate the matchcode tables with the duplicate criteria specified in the rule

For some reason that I haven’t been able to figure out so far, it seems the matchcode table count is always actual entity count + 1. There is a matchcode record with ObjectId 00000000-0000-0000-0000-000000000000 in each matchcode table. I have no idea why this record exists, but my theory is that this is some sort of bit mask. I maybe totally wrong on this though and this could just be some sort of sequence.

I wrote the script below to confirm the duplicate rules are indeed doing something.

declare @duplicaterules table (rownum int identity(1,1), name varchar(200),matchcodetable varchar(200),entity varchar(200))
declare @i int,@matchcodetable varchar(200),@entity varchar(200),@name varchar(200),@max int
declare @result varchar(max),@query nvarchar(max),@matchcodecount int,@entityrowcount int
insert into @duplicaterules
select name,MatchingEntityMatchCodeTable,MatchingEntityName from DuplicateRule where statecode=0
select @max=@@ROWCOUNT,@i=1
	select @matchcodetable=matchcodetable,@entity=entity,@name=name from @duplicaterules where rownum=@i
	set @query=N'select @matchcodecountOUT=count(*) from ' + @matchcodetable + ' where ObjectId<>''00000000-0000-0000-0000-000000000000'''
	exec sp_executesql @query,N'@matchcodecountOUT int OUTPUT',@matchcodecountOUT=@matchcodecount OUTPUT
	set @query=N'select @entityrowcountOUT=count(*) from ' + @entity
	exec sp_executesql @query,N'@entityrowcountOUT int OUTPUT',@entityrowcountOUT=@entityrowcount OUTPUT
	print 'Duplicate Rule '+@name+' is '+CAST(CAST(@matchcodecount*100.0/@entityrowcount as numeric(5,2)) as varchar(6))+'% complete: '
	print 'Entity: '+@entity+', Matchcode Table: '+@matchcodetable+', Matchcodes: '+CAST(@matchcodecount as varchar(20))+', Entity Rows: '+CAST(@entityrowcount as varchar(20))+CHAR(10)
	set @i=@i+1

When you run the script in SSMS on the MSCRM database, you will see a output that looks something like this.


So now you can quickly check the status in the database, if you are little impatient. Once again, sorry CRMOnline, but I like having access to the actual database.

EDIT (22/09/15): Updated SQL Script slightly to produce the percentages correctly. The previous script was always displaying integer. I also exclude matchcode row with objectid 00000000-0000-0000-0000-000000000000


One comment

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s