Besides a shorter TTI via higher numerologies and the use of mini-slots, other ways to reduce latency include frequent transmission opportunities to minimize wait time, shorter processing time via pipelined processing, grant-free UL transmission and a flexible TDD frame structure.
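To make the effect of numerology and mini-slots on TTI concrete, here is a minimal sketch. It assumes the usual NR scaling (subcarrier spacing of 15 kHz × 2^μ, slot duration of 1 ms / 2^μ, 14 OFDM symbols per slot with normal cyclic prefix) and approximates a mini-slot as an equal share of the slot, ignoring small differences in cyclic prefix length. The function names are illustrative, not taken from any library.

```python
from fractions import Fraction

SYMBOLS_PER_SLOT = 14  # normal cyclic prefix (assumption for this sketch)


def subcarrier_spacing_khz(mu: int) -> int:
    """Subcarrier spacing for numerology mu: 15 kHz * 2^mu."""
    return 15 * 2 ** mu


def slot_duration_ms(mu: int) -> Fraction:
    """Slot duration for numerology mu: 1 ms / 2^mu."""
    return Fraction(1, 2 ** mu)


def mini_slot_duration_ms(mu: int, n_symbols: int) -> Fraction:
    """Approximate mini-slot duration as n_symbols / 14 of a slot."""
    if not 1 <= n_symbols <= SYMBOLS_PER_SLOT:
        raise ValueError("n_symbols must be between 1 and 14")
    return slot_duration_ms(mu) * Fraction(n_symbols, SYMBOLS_PER_SLOT)


if __name__ == "__main__":
    for mu in range(4):
        slot = slot_duration_ms(mu)
        mini = mini_slot_duration_ms(mu, 2)
        print(f"mu={mu}: SCS={subcarrier_spacing_khz(mu)} kHz, "
              f"slot={float(slot):.4f} ms, 2-symbol mini-slot={float(mini):.4f} ms")
```

Each step up in numerology halves the slot duration, and a mini-slot shortens the transmission further without changing the numerology.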
DMRS is front-loaded, that is, it comes at the start of the slot. A UE can therefore start channel estimation and decoding as early as possible.
To enable quick HARQ feedback, 5G NR uses a self-contained slot structure (often described as a self-contained subframe). A DL-centric slot contains DL control, DL data, a guard period and UL control. Thus, ACK/NACK for DL data can be sent in the same slot. A similar self-contained UL-centric structure exists.
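Multiple code blocks are grouped into a Code Block Group (CBG). If there’s an error, only the affected CBG is retransmitted, not the entire Transport Block (TB). Since a CBG carries fewer bits than the TB, a given CBG is less likely to be in error than the TB as a whole. Moreover, slot aggregation, which repeats a TB over consecutive slots, aims to reduce the need for retransmissions.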
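The CBG idea can be sketched as follows. The grouping rule used here (number of CBGs is the smaller of the configured maximum and the number of code blocks, with the first groups taking one extra code block when the split is uneven) is assumed to match the rule in 3GPP TS 38.213 and should be checked against the specification. The function names are hypothetical.

```python
def group_code_blocks(num_code_blocks: int, max_cbgs: int) -> list[list[int]]:
    """Split code block indices 0..C-1 into CBGs (assumed grouping rule)."""
    if num_code_blocks < 1 or max_cbgs < 1:
        raise ValueError("both arguments must be positive")
    num_cbgs = min(max_cbgs, num_code_blocks)
    larger_groups = num_code_blocks % num_cbgs      # groups with one extra block
    big = -(-num_code_blocks // num_cbgs)           # ceil(C / M)
    small = num_code_blocks // num_cbgs             # floor(C / M)
    sizes = [big] * larger_groups + [small] * (num_cbgs - larger_groups)

    groups, start = [], 0
    for size in sizes:
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def cbgs_to_retransmit(groups: list[list[int]], failed_code_blocks) -> list[int]:
    """Return indices of CBGs that contain at least one failed code block."""
    failed = set(failed_code_blocks)
    return [i for i, group in enumerate(groups) if failed.intersection(group)]


if __name__ == "__main__":
    groups = group_code_blocks(num_code_blocks=10, max_cbgs=4)
    resend = cbgs_to_retransmit(groups, failed_code_blocks=[4])
    blocks_resent = sum(len(groups[i]) for i in resend)
    print("CBGs:", groups)
    print("CBGs to retransmit:", resend)
    print(f"Code blocks resent: {blocks_resent} of 10")
```

In this example a single failed code block triggers retransmission of one CBG of three code blocks rather than all ten code blocks of the TB.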
For channel coding of data channels, 5G NR uses Low-Density Parity-Check (LDPC) codes. LDPC decoders are highly parallelizable, which reduces processing time.
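A rough intuition for the parallelism: every row of a parity-check matrix is an independent check over a few bits, so the checks can be evaluated at the same time. The toy sketch below only computes a syndrome. It is not an LDPC decoder, and the small (7,4) Hamming parity-check matrix is an assumption chosen for illustration, not one of the NR base graphs.

```python
# Toy parity-check matrix (Hamming (7,4)), used only for illustration.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def check_row(row: list[int], word: list[int]) -> int:
    """Evaluate one parity check: XOR of the bits selected by this row."""
    return sum(h & b for h, b in zip(row, word)) % 2


def syndrome(word: list[int]) -> list[int]:
    """Evaluate all checks. Each row depends only on the received word,
    so the rows could be processed in parallel in hardware."""
    if len(word) != len(H[0]):
        raise ValueError("word length must match the number of columns in H")
    return [check_row(row, word) for row in H]


if __name__ == "__main__":
    codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies every check
    corrupted = [1, 1, 1, 0, 1, 0, 0]  # one bit flipped
    print("valid word syndrome:    ", syndrome(codeword))
    print("corrupted word syndrome:", syndrome(corrupted))
```

A non-zero syndrome signals an error. Practical LDPC decoders iterate message passing over such checks, and the independence of the checks is what lets the work be spread across many processing units.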
Source: Sayon Ghatak on LinkedIn: #5g_nr #low_latency #urllc