My organization has been successfully using service broker, dialog security, for a few years.
I have a new instance of service broker that I would like to configure, but I believe I am running into a situation where the order in which the new broker instance is being incorrectly placed into the databases (with existing broker instances in place) causing the new broker instance to fail.
My error is on the target server; Certificate Not Found. Security context could not be retrieved.
My situation is:
Source server A (dmz) sending to target server B, where there are already existing broker instances sending messages both ways but on different databases.
Test 1: I switched the role of the source and target server around, making server B my source and server A (dmz) my target while using the same databases. This was successful.
Test 2: I dropped the test 1 instance and created 2 new databases on each server. I have again, reconfigured the source and target roles, now sending from server A (dmz) to server B and this was successful. There is only the one service broker instance configured in the new databases, so the order of placement is not an issue.
I have come into this situation in the past when we first set up our dev/test environments; However, the situation was slightly different as many other broker instances were also failing. Not all of them, but some. Once I realized what was happening, dropping all instances and recreating them in a specific order, source instances before target instances, the situation was resolved.
I would think that service broker would add a new instance in the correct place of order, automatically.
Am I correct in thinking that the broker instances must be configured within a specific order? If so, how can I add new broker instances without interrupting what is already there?
Please share your thoughts and suggestions.
Best Answer
Set up a profiler trace and monitor for
Security Audit/Audit Broker Conversation
. Repro the issue, an event of this class should be produced. Look at the event details. It should shed light. I think two of the event columns (can't remeber which, and MSDN seems to omit the info) contain the problem certificate issuer name/serial number, which should further help narrow down which certificate is being used.[Edit] Run this query query on all three databases to find out which certificates are used by which service and remote service binding:
[Edit after your data]
1) In PRd2DB services tab: all services in the DB are owned by
dbo
, anddbo
owns 4 certificates that are valid for SSB (certificate ids 280, 288, 327 and 357). The xls did not correctly capture theexpiry_date
(is all 00:00.0) so I cannot know for sure which one is used, but if the results come from the query I posted above then the first one will be used, certificate id 280 with the serial number54 89 2c 1a...
. So this is the certificate used to identify any service from this db.2) In Prd2 in the RSB tab, the query shows a mixture of certificates with private key and ones without a private key. This should never happen. An user associated with an RSB represents the identity of a remote service and should own only certificates with no private key. Also, there is only one RSB user (the one with principal_id 58) for all remote services. There should be a different identity user for each remote service. Right now any dialog started in PrdDB2, to any service, will use the certificate with serial number
34 99 ...
.3) DMZDB1 services: same as prd2, all services owned by
dbo
, dbo has 3certificates (ids 335, 341, 371) and the one used will be the one with serial number74 0a ...
4) DMZDB RSBs: there are two RSB users, one principal id 9 and one 30. The principal id 9 has a mixture of certificates, including some with private key, while principal id 30 has only one certificate but that certificate has a private key.
Overall, the configuration is deeply flawed and and no wonder it doesn't work. You do not understand the role and the proper configuration of a remote service binding.
Please follow the following rules. Consider a remote service 'Foo':
remote_user_Foo
remote_user_Foo
)remote_user_Foo
should own exactly one certificate, and it should not have a private keySEND
) toremote_user_Foo
Next time you run the query for showing the RSB it should show a number of rows equal to the number of remote service bindings, the
principal_id
values should be all distinct and thepvt_key_encryption_type
should all beNA
.