VoLTE call drop case study

Many people know what anchoring and domain selection are, but not why they matter. This case makes the reason clear.

This case is an easily misjudged call drop scenario.

First, let’s examine the call signaling. The caller sends a Service Request (SR) followed by an INVITE to initiate the call and establish a dedicated bearer. The INVITE enters the IMS domain (highlighted in yellow in the diagram). The I-CSCF forwards the INVITE to the callee side, where it is evident that the caller and callee share the same S-CSCF. Once the callee’s service triggering is complete, the INVITE reaches the SBC, and the EPC then pages the callee. The callee responds to the page and establishes a dedicated bearer.

After the callee’s dedicated bearer is established, the callee answers the INVITE (shown in red above) with a 183 (shown in orange here). The callee-side trace is missing the Gm interface, so we added those messages manually. The caller then sends a PRACK to acknowledge the 183 and receives a 200 OK, followed by an UPDATE to confirm the media negotiation result.

Note that in the diagram, the caller receives the 200 OK for the PRACK before the callee receives the PRACK itself, because it is the callee’s AS that answers with the 200 OK. When the callee’s own 200 OK arrives later, the AS terminates it instead of forwarding it. Because the AS acts as a B2BUA, it can answer each segment separately, which increases concurrency and shortens the end-to-end signaling time.

The callee answers the UPDATE with a 200 OK, then sends a 180 Ringing. After the caller receives the 180 Ringing, the callee rings for 12 seconds. During this time there is an inactivity release (shown in yellow in the diagram), so the answer arrives as a Service Request (SR) followed by the off-hook 200 OK for the INVITE.

The caller receives the answer and replies with an ACK; once the callee receives the ACK, the end-to-end call is connected. Up to this point, everything is normal. The call lasts nearly 4 minutes, during which there are two TAUs and one CS service notification. Checking the Extended Service Request (ESR) reveals a circuit-switched (CS) call that requires a fallback. On a first pass through the trace, this CSFB (shown in red in the diagram) is easy to overlook. Afterward, the SBC sends a BYE in both directions to end the call, carrying the error code 503 (bearer released).

At the end of the flow, the BYE toward the caller is retransmitted four times, and the caller’s UE never responds. In the other direction, the BYE sent by the SBC toward the callee does not get far: the first-hop AS initially does not respond, and only after the fourth retransmission does it return a 100 Trying, yet it still does not forward the BYE right away. As a result, the S-CSCF times out and reports a 500 (no response from peer) to the SBC. The AS finally forwards the BYE 4 seconds after sending its 100 Trying.
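To put the repeated BYE retransmissions in context, the following is a minimal sketch of the non-INVITE retransmission schedule defined in RFC 3261 (Timer E doubles from T1 up to the cap T2, and Timer F ends the transaction at 64 × T1). It assumes UDP transport and the RFC default values T1 = 0.5 s and T2 = 4 s; the timers actually configured in this network are not known from the trace, and the function name is only illustrative.

```python
def non_invite_retransmit_offsets(t1=0.5, t2=4.0):
    """Offsets (seconds after the first send) at which a non-INVITE request
    such as BYE is retransmitted over UDP under RFC 3261 Timer E rules.

    t1 and t2 default to the RFC 3261 values; real deployments may differ.
    """
    timer_f = 64 * t1        # transaction gives up when Timer F fires
    offsets = []
    elapsed = 0.0
    interval = t1            # Timer E starts at T1
    while True:
        elapsed += interval
        if elapsed >= timer_f:
            break
        offsets.append(elapsed)
        interval = min(2 * interval, t2)   # double, capped at T2
    return offsets


offsets = non_invite_retransmit_offsets()
print(offsets[:4])   # first four retransmissions: [0.5, 1.5, 3.5, 7.5]
```

Under these assumed defaults, a fourth retransmission would fall about 7.5 seconds after the original BYE, so "four retries without an answer" already means the peer has been silent for several seconds.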

A few seconds later, before the BYE from the caller’s SBC has reached the callee side, the callee’s SBC also sends a BYE because of an RTP timeout, that is, the expiry of a media inactivity timer (the caller has fallen back, so there is no media stream). This BYE is also sent in both directions, and we added the BYE on the callee’s Gm manually. Afterward, the caller’s and the callee’s core networks each clean up separately.
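The RTP timeout mentioned above is a plain inactivity watchdog. A minimal sketch of the idea follows; the function name, the timeout value, and the packet timestamps are all hypothetical, since the SBC's real timer setting is not visible in the trace.

```python
def rtp_inactivity_expiry(packet_times, timeout_s, observe_until):
    """Return the time at which a media inactivity timer would expire,
    or None if it does not expire within the observation period.

    The timer is armed at t = 0 and re-armed by every received RTP packet.
    """
    deadline = timeout_s
    for t in sorted(packet_times):
        if t > deadline:
            break                    # timer already expired before this packet
        deadline = t + timeout_s     # packet arrived in time: re-arm the timer
    return deadline if deadline <= observe_until else None


# Hypothetical example: media stops after t = 3 s, timeout is 5 s.
print(rtp_inactivity_expiry([0, 1, 2, 3], timeout_s=5, observe_until=20))  # 8
```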

End of the process.

True Cause of Call Drop

Although the call flow in this case is simple, if the ESR is overlooked, whether in manual analysis or in automatic fault detection, the repeated retransmissions of the downstream BYE and the subsequent RTP timeout at the SBC make one suspect a radio problem that broke the connection. The 503 in the BYE and the 500 from the S-CSCF both strongly suggest a server-side error. Taken together, they point to radio disconnection + bearer loss + call drop, which leads to a misjudgment.

The bearer release is a result, not a cause. To analyze the call drop, one must find the reason for the release, which inevitably leads to the CS service notification and the ESR: the bearer was released because of a circuit-switched fallback. Strangely, for such an obvious call drop there was no ASR, and it is unclear why the network equipment did not send one. At this point, it is natural to think of the AS that dawdled earlier…

Now the story is more complete. A CSFB occurred, so the bearer was released and the call dropped. The radio side had nothing to do with it.
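The reasoning above (look for the ESR before blaming the radio side) can be sketched as a small check over a call's event timeline. This is only an illustration: the `Event` record, the event kind strings, the timestamps, and the 10-second window are hypothetical and do not correspond to any real probe or analysis tool.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float          # seconds since call setup (hypothetical timeline)
    kind: str         # e.g. "ESR", "BYE", "RTP_TIMEOUT"
    detail: str = ""


def classify_drop(events, window_s=10.0):
    """Attribute a call drop to CSFB if an ESR shortly precedes the first BYE."""
    byes = [e for e in events if e.kind == "BYE"]
    if not byes:
        return "no_drop"
    first_bye = min(byes, key=lambda e: e.t)
    for e in events:
        # An ESR within the window before the BYE explains the bearer release.
        if e.kind == "ESR" and 0 <= first_bye.t - e.t <= window_s:
            return "csfb"
    return "suspected_radio"


timeline = [
    Event(0, "INVITE"),
    Event(230, "CS_SERVICE_NOTIFICATION"),
    Event(231, "ESR", "CSFB"),
    Event(233, "BYE", "503 bearer released"),
    Event(240, "RTP_TIMEOUT"),
]
print(classify_drop(timeline))  # csfb
```

Without the ESR entry, the same timeline would be classified as `suspected_radio`, which is exactly the misjudgment described above.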

But why did the CSFB occur? Wasn’t the callee supposed to be anchored?

What Are Anchoring and Domain Selection?

Simply put, whether the incoming call is a VoLTE call or a CS call, the callee’s services are processed in the IMS; this is service domain anchoring. Then, depending on whether the callee is currently better reached over LTE or over the CS domain, an access domain is selected for that call; this is domain selection.

Significance of Anchoring

In this case, service control for the user who received the CS call was not anchored in the IMS domain, and the user was registered in both the IMS and CS domains. When a VoLTE call comes in, it is handled in the IMS; when a CS call comes in, it is handled in the CS domain. If calls arrive one at a time, each finishing before the next starts, there is no problem. In this case, however, a CS call arrives while a VoLTE call is in progress. Without service domain anchoring there is no unified control: the two domains act independently and handle services inconsistently. The CS call arrives and immediately forces a fallback, causing the VoLTE call to drop.

If service control were anchored in the IMS domain, how would this case likely proceed?

Simply put, when you’re on a call and another call comes in, the result is call waiting.

With unified service control, the network determines that there are now two calls and that the second one must wait. It notifies the callee of the new incoming call, and either plays a notification tone to the second caller or returns busy directly.
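The difference between the anchored and non-anchored behaviour can be summarised in a small decision sketch. It is a simplification for illustration only: the function name, parameters, and outcome strings are hypothetical and do not model any real network element's logic.

```python
def handle_second_call(anchored_in_ims, arrival_domain, busy_on_volte):
    """Outcome for an incoming call, given the subscriber's current state.

    arrival_domain: "IMS" or "CS", the domain where the new call originates.
    """
    if anchored_in_ims:
        # Every terminating call is controlled by the IMS first,
        # regardless of the domain it arrived from.
        if busy_on_volte:
            return "call_waiting"        # notify callee, tone or busy to caller
        return "domain_selection"        # deliver over LTE or CS as appropriate
    # Not anchored: each domain handles its own calls independently.
    if arrival_domain == "CS" and busy_on_volte:
        return "csfb_drops_volte_call"   # what happened in this case
    return "delivered_in_arrival_domain"


print(handle_second_call(False, "CS", True))  # csfb_drops_volte_call
print(handle_second_call(True, "CS", True))   # call_waiting
```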

As shown in Case Library #19, the callee receives an INVITE from the IMS domain, and the message carries an XML attachment notifying the callee that a new call is waiting.

As shown in Case Library #18, the second caller receives a 180 from the IMS domain. Its P-Early-Media header is set to sendrecv so that early media is played, and its SDP gives the IP address and port of the early media. The Alert-Info header of the 180, <urn:alert:service:call-waiting>, tells the caller that the call is waiting.
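When scanning traces for this behaviour, the call-waiting indication in a 180 can be detected by looking at its Alert-Info header. The sketch below assumes the URN appears in its RFC 7462 form, `urn:alert:service:call-waiting`. The sample message is made up for illustration and is not taken from the case library; the function name is hypothetical.

```python
import re

CALL_WAITING_URN = "urn:alert:service:call-waiting"


def indicates_call_waiting(sip_message: str) -> bool:
    """Return True if an Alert-Info header carries the call-waiting URN."""
    for line in sip_message.splitlines():
        if not line:
            break                          # blank line: end of SIP headers
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "alert-info":
            uris = re.findall(r"<([^>]*)>", value)
            if any(u.strip().lower() == CALL_WAITING_URN for u in uris):
                return True
    return False


# Made-up 180 response for illustration only.
sample_180 = (
    "SIP/2.0 180 Ringing\r\n"
    "P-Early-Media: sendrecv\r\n"
    "Alert-Info: <urn:alert:service:call-waiting>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
print(indicates_call_waiting(sample_180))  # True
```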

In summary, the callee was not anchored in the IMS domain, dual registration led to inconsistent service handling, and when the two domains conflicted, the VoLTE call dropped.

Try flowshark here: http://shark.haohandata.com:20280/
