Skype for Business Mediation Service, Call Park Service and RGS Service crash on Windows Server 2008R2

Hi all,
yesterday I dealt with a very strange behaviour on a customer’s Skype for Business deployment. I want to share it with you in case it helps someone.

Scenario
One SfB Front-End Standard (updated to CU5) on Windows Server 2008R2
One SfB EDGE (updated to CU5) on Windows Server 2008R2
One Sonus SBC 1000 as PSTN Gateway
Enterprise Voice, Audio Conferencing, RGS, Call Park for Pickup enabled

Yesterday, at some point, the Mediation Service, the Call Park Service and the Response Group Service started to crash, not all at the same time but a few seconds apart. If we restarted a crashed service, it kept working for a few seconds and then crashed again!
Only these three services were involved. The others were not affected; for example, the Front-End Service continued to work perfectly.

Event Viewer Errors
On the Front-End server, the Application log in Event Viewer showed these two related errors for every service that crashed.
These are the errors for the Mediation Service:

Log Name:      Application
Source:        Application Error
Date:          1/3/2018 3:04:06 PM
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      frontend.domain.lan
Description: Faulting application name: MediationServerSvc.exe, version: 6.0.9319.272, time stamp: 0x57ff4069
Faulting module name: Microsoft.Rtc.Internal.Media.dll, version: 6.0.8953.265, time stamp: 0x58c2fe98
Exception code: 0xc0000005
Fault offset: 0x0000000000388362
Faulting process id: 0x35b0
Faulting application start time: 0x01d3849944f869f2
Faulting application path: C:\Program Files\Skype for Business Server 2015\Mediation Server\MediationServerSvc.exe
Faulting module path: C:\Windows\Microsoft.Net\assembly\GAC_64\Microsoft.Rtc.Internal.Media\v4.0_6.0.0.0__31bf3856ad364e35\Microsoft.Rtc.Internal.Media.dll
Report Id: f56561c7-f08e-11e7-b798-005056a143b5 

Log Name:      Application
Source:        .NET Runtime
Date:          1/3/2018 3:04:04 PM
Event ID:      1026
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      frontend.domain.lan
Description:
Application: MediationServerSvc.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: exception code c0000005, exception address 000007FED9EB8362
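The two events above can be tied together with a little arithmetic: the exception address in event 1026 minus the fault offset in event 1000 should give the load address of the faulting module. The following is only a minimal sketch, assuming the description text has been copied out of Event Viewer as plain strings; the function name `parse_crash` and the 64 KB alignment check are my own illustration, not something taken from a Microsoft tool.

```python
import re

# Description text copied from the two events above.
EVENT_1000 = """Faulting application name: MediationServerSvc.exe, version: 6.0.9319.272, time stamp: 0x57ff4069
Faulting module name: Microsoft.Rtc.Internal.Media.dll, version: 6.0.8953.265, time stamp: 0x58c2fe98
Exception code: 0xc0000005
Fault offset: 0x0000000000388362"""

EVENT_1026 = "Exception Info: exception code c0000005, exception address 000007FED9EB8362"


def parse_crash(event_1000, event_1026):
    """Hypothetical helper: extract the key fields and correlate the two events."""
    app = re.search(r"Faulting application name: ([^,]+),", event_1000).group(1)
    module = re.search(r"Faulting module name: ([^,]+),", event_1000).group(1)
    code = int(re.search(r"Exception code: 0x([0-9a-fA-F]+)", event_1000).group(1), 16)
    offset = int(re.search(r"Fault offset: 0x([0-9a-fA-F]+)", event_1000).group(1), 16)

    m = re.search(r"exception code ([0-9a-fA-F]+), exception address ([0-9a-fA-F]+)", event_1026)
    clr_code = int(m.group(1), 16)
    address = int(m.group(2), 16)

    # Exception address minus fault offset = address where the module was loaded.
    module_base = address - offset
    return {
        "application": app,
        "module": module,
        "exception_code": code,
        "module_base": module_base,
        # Assumption: same exception code and a 64 KB aligned base suggest
        # that both events describe the same fault.
        "same_fault": code == clr_code and module_base % 0x10000 == 0,
    }


info = parse_crash(EVENT_1000, EVENT_1026)
print(info["application"], "crashed in", info["module"])
print("exception code:", hex(info["exception_code"]))
print("module base:", hex(info["module_base"]))
print("same fault:", info["same_fault"])
```

Exception code 0xc0000005 is an access violation. Running the same check on the events of the other crashed services is a quick way to see whether they all fail in the same module.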

 

The root cause
Long story short: after many (many!) different tests on the Front-End (updating to CU6, updating .NET, installing a brand new Front-End) without any success, I turned my attention to the Edge (it is not clear why I did not do that earlier, but that’s how it went), and I found this error event:

Log Name:      Lync Server
Source:        LS A/V Edge Server
Date:          1/3/2018 9:29:20 PM
Event ID:      22032
Task Category: (1028)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      edge.domain.lan
Description: The system is low on non-paged memory. LS A/V Edge Server will start dropping packets.
Cause: The system needs more non-paged memory to handle the current work load.
Resolution: Increase the size of the non-paged memory.

Bingo!
On the Edge server I found the RAM full. After a restart, everything started to work fine again.
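To catch this condition before the A/V Edge starts dropping packets, the non-paged pool can be sampled with the built-in `typeperf` tool, for example `typeperf "\Memory\Pool Nonpaged Bytes" -sc 5`, and the CSV output checked against a limit. The snippet below is only a sketch under assumptions: the 1 GiB limit is an arbitrary value I picked for illustration, the function names are hypothetical, and the sample rows are made up to show the format, not measured on the customer's server.

```python
import csv
import io

# Hypothetical alert limit; the right value depends on the server's RAM.
NONPAGED_LIMIT_BYTES = 1 * 1024**3


def parse_typeperf_csv(text):
    """Return (timestamp, value) samples from typeperf-style CSV text."""
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # blank lines and status messages
        try:
            samples.append((row[0], float(row[1])))
        except ValueError:
            continue  # header row or empty sample
    return samples


def over_limit(samples, limit=NONPAGED_LIMIT_BYTES):
    """Keep only the samples whose value exceeds the limit."""
    return [(ts, value) for ts, value in samples if value > limit]


# Made-up sample in the CSV layout written by typeperf (illustration only).
SAMPLE = r'''"(PDH-CSV 4.0)","\\EDGE\Memory\Pool Nonpaged Bytes"
"01/03/2018 21:29:20.000","1500000000.000000"
"01/03/2018 21:29:21.000","900000000.000000"
'''

samples = parse_typeperf_csv(SAMPLE)
alerts = over_limit(samples)
for ts, value in alerts:
    print(f"{ts}: non-paged pool at {value / 1024**2:.0f} MiB, above the limit")
```

A scheduled task running something like this on the Edge would probably have pointed to the real cause much earlier than the tests on the Front-End did.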

I suppose this issue is related to MRAS and to media flow candidate identification via the ICE protocol. What is strange is the effect on the services on the Front-End!
I do not know why they crashed instead of simply timing out during the candidate search.

Remember that you can define whether an Edge Server, and which one, is associated with your Front-End servers as a media flow candidate.
The defined Edge Server will always be used in every call, inbound or outbound, PSTN related or not.

If you are interested in the media flow process and in the ICE, TURN and STUN protocols, I suggest watching this great video from Ignite:
Troubleshoot media flows in Skype for Business across online, server and hybrid

I hope this short article helps some of you.
Best regards
Luca
