# High Speed Decoding of Serial Concatenated Codes

## Globecom 2006 CTH02-1

<u>Doré Jean-Baptiste</u>, Hamon Marie-Hélène and Pénard Pierre {jeanbaptiste.dore;mhelene.hamon;pierre.penard}@orange-ftgroup.com







# Outline

#### Introduction

#### S-SCP codes

- Introduction
- Decoding Strategy

#### Joint strategy for decoding and code design

- Memory contention
- Towards a pipeline decoding

#### Conclusion

# Introduction

## Natural approach to design advanced coding schemes:

- First design a code (LDPC codes, Turbo and Turbo like codes..):
  - For required system performance: threshold, minimal distance...
- Find an efficient hardware architecture for such a code
  - In many case, such a constructed code has very little chance to be suited for hardware implementation...
- "Architecture driven" approach
  - For advanced coding scheme: First introduced for Turbo-Codes Interleaver design
    - Avoid memory contention when decoders are parallelized
  - Methodology extended for Turbo-like and LDPC codes
    - Increase throughput...
  - Joint code design and encoder/decoder architecture

## Outline

#### Introduction

#### S-SCP codes

- Introduction
- Decoding Strategy

#### Joint decoding strategy and code design

- Memory contention
- Towards a pipeline decoding

#### Conclusion

## S-SCP codes : Systematic with Serially Concatenated Parity

#### S-SCP codes description [1] :

- The structure can be viewed either as:
  - Serial concatenation of three codes (ARA like codes)
  - Punctured irregular LDPC codes



- We consider a particular class of S-SCP codes
  - <u>Outer code</u> is a circular convolutional code: <u>1+D</u>
  - Inner code is an accumulator code: 1/1+D

[1] K.M Chugg et al. "A new class of turbo-like codes with universally good performance and high speed decoding ", IEEE Milcom 2005



Parity check matrix is derived from encoding equations:

$$\mathbf{H}\underline{x}^{T} = \begin{bmatrix} \mathbf{G_{1}} & \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{V} & \mathbf{G_{2}} \end{bmatrix} \begin{bmatrix} \underline{c}^{T} \\ \underline{h}^{T} \\ \underline{p}^{T} \end{bmatrix} = \underline{\mathbf{0}}^{T}$$

- G<sub>1</sub>: dual-diagonal KxK matrix when outer code is 1+D
- G<sub>2</sub>: dual-diagonal MxM matrix when inner code is 1/1+D
- V : MxK matrix: Interleaver + SPC
  - $-\mathbf{J}$  ones per row
  - **Q** ones per column

- Design of Quasi-Cyclic (QC) S-SCP codes [2]:
  - New definition of the parity check matrix H
    - Construction based on shifted permutation matrices I<sub>x</sub>
    - $I_x$  is a ( $z \times z$ ) circularly right shifted identity matrix by ( $x \mod z$ ) positions, if x is positive
    - if x is negative,  $I_x$  is defined to be a (z x z) zero matrix
    - Interleaver and SPC functions are jointly characterized by matrix V of size (mz x nz)

$$\mathbf{V} = \begin{bmatrix} \mathbf{I}_{\delta(0,0)} & \mathbf{I}_{\delta(0,1)} & & \mathbf{I}_{\delta(0,n-1)} \\ \mathbf{I}_{\delta(1,0)} & \mathbf{I}_{\delta(1,1)} & & \mathbf{I}_{\delta(1,n-1)} \\ & & \mathbf{I}_{\delta(i,j)} \\ \mathbf{I}_{\delta(m-1,0)} & \mathbf{I}_{\delta(m-1,1)} & & \mathbf{I}_{\delta(m-1,n-1)} \end{bmatrix}$$

• We also define all **positive** coefficients through 3 integers:

$$\delta(i,j) = \left(a^{(i+1)}b^{(j+1)}\right) \mod \hat{z}$$

[2] Dore et al. "Design and decoding of a serial concatenated code structure based on quasi-cyclic ldpc codes," 4th International Symposium on Turbo-Codes and Related Topics, April 2006.



- New definition of inner and outer code
  - Outer code definition from circularly shifted identity matrices
    - Circular 1+D convolutional code

$$\mathbf{G_1} = \begin{bmatrix} \mathbf{I} & & \mathbf{I_1} \\ \mathbf{I} & \mathbf{I} & & \\ & \mathbf{I} & & \\ \mathbf{0} & & \mathbf{I} & \mathbf{I} \\ \end{bmatrix}$$

- Inner code definition
  - Accumulator code (1/1+D)

$$\mathbf{G_2} = \begin{bmatrix} \mathbf{I} & & \mathbf{I'_1} \\ \mathbf{I} & \mathbf{I} & & \\ & \mathbf{I} & & \\ & & \mathbf{I} & \\ \mathbf{0} & & \mathbf{I} & \mathbf{I} \end{bmatrix}$$

• I'<sub>x</sub> is a non circulary shifted identity matrix by x positions

[2] Dore et al. "Design and decoding of a serial concatenated code structure based on quasi-cyclic ldpc codes," 4th International Symposium on Turbo-Codes and Related Topics, April 2006.

Decoding strategy

- Various decoding algorithms [2]:
  - BP familiy algorithms
- Turbo like decoding algorithm:
  - Outer and Inner SISO decoder: Forward Backward Algorithm (FBA)
  - SPC decoder: can be viewed as a LDPC decoder
    - 2 steps, Inward and Outward



[2] Dore et al. "Design and decoding of a serial concatenated code structure based on quasi-cyclic ldpc codes," 4th International Symposium on Turbo-Codes and Related Topics, April 2006.



# Outline

Introduction

- S-SCP codes
  - Introduction
  - Decoding Strategy

### Joint decoding strategy and code design

- Memory contention
- Towards a pipeline decoding

#### Conclusion

- Brute force" architecture
  - Serial Scheduling
    - Low data rate
  - High memory requirements
    - All forward and backward metrics must be stored...
  - No particular design rules on the code



Brute force" architecture with sliding windows decoding

- Serial Scheduling
  - Low data rate
- Outer and Inner SISO SW decoding
  - Pipeline decoding
  - Latency is **reduced**
  - Memory requirements are reduced
- No particular design rules



#### Pipeline architecture (I)

- Serial scheduling
  - Pipeline between SPC Inward and Inner SISO
- No particular design rules
  - Scheduling of parity equations to check

$$\mathbf{H} = \left[ \begin{array}{ccc} \mathbf{G_1} & \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{V} & \mathbf{G_2} \end{array} \right]$$





#### Pipeline architecture (II)

- Serial scheduling
  - Pipeline between Outer SISO ,SPC Inward, Inner SISO and SPC Outward
- Throughput is increased
  - Without hardware duplication
  - Without additional border effects due to parallelization
- Design rules required for efficient pipeline decoding







#### Pipeline decoding

- Idea: Find design rules on matrix V to characterize pipeline decoding
- Introduction of the overlapping parameter  $\Delta$ 
  - A SPC Inward window starts when  $\Delta$  Outer SISO windows have been decoded



- Rules on matrix V to characterize pipeline decoding
  - First Hypothesis:
    - Outer code is not a circular code



$$\mathbf{H} = \begin{bmatrix} \mathbf{G_1} & \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{V} & \mathbf{G_2} \end{bmatrix} \qquad \mathbf{V} = \begin{bmatrix} \mathbf{0} & \mathbf{1} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}$$

| <b>Outer SISO</b> | 0 | 1 | 2 | 3 |   |   |   |   |   |   |   |
|-------------------|---|---|---|---|---|---|---|---|---|---|---|
| <b>SPC I</b>      |   |   | 0 | 1 | 2 | 3 |   |   |   |   |   |
| Inner SISO        |   |   |   | 0 | 1 | 2 | 3 |   |   |   |   |
| SPE 0             |   |   |   |   |   |   |   | 0 | 1 | 2 | 3 |

Relation between V coefficients and A

$$\Delta = 1 + \max(\delta(i, j)), \ \forall \delta(i, j) \geq 0$$

• Remark:

$$\Delta \le 1 + \hat{z}$$
  
$$\delta(i,j) = \left(a^{(i+1)}b^{(j+1)}\right) \mod \hat{z}$$

- Rules on matrix V to characterize pipeline decoding
- Hypothesis: **Outer SISO**  Outer code is not a circular code SPC I Δ Graphical representation **Inner SISO** SPC O • z = 54  $\mathbf{V} = \begin{bmatrix} 26 & 12\\ 50 & 27 \end{bmatrix}$ 50 12  $\Delta = 51$ 26. 27

- Rules on matrix V to characterize pipeline decoding
  - Hypothesis:
    - Outer code is a circular code
  - Graphical representation
    - z = 54





- Pipeline decoding
  - Throughput analysis
    - 12 iterations
    - J = 2, n = 12



#### Design example





# Conclusion

## Architecture driven" approach

- Parity check matrix of the code is based on shifted identity matrices
- Simple constraints on code design (law on coefficients choice)
  - Avoid memory conflicts
  - Overlapping decoding for a particular architecture
- Throughput analysis :
  - Improve throughput by two without hardware duplication in the example considered

### Methodology and design rules can be extended

- For an architecture with <u>only</u> one memory bank
- To enable decoder <u>parallelization</u>
  - Avoid memory conflicts
- To guarantee a good <u>convergence</u> of the decoding algorithm
  - Rules on shift coefficients are derived for layered decoding

# **Iterative decoding of QC S-SCP**

### **Outer and inner SISO decoders**

Forward Backward Algorithm on the graph

Simply described using parity check function q(...)•

$$F_{i} = g(c_{i-1} + F_{i-1}, h_{i})$$
  

$$B_{i} = g(c_{i+1} + B_{i+1}, h_{i+1})$$

$$E_{c_i} = B_i + B_i$$

 $\begin{array}{rcl} B_i &=& g\left(c_{i+1}+B_{i+1},h_{i+1}\right)\\ E_{c_i} &=& B_i+F_i\\ \bullet \ \mathrm{Memor}E_{h_i} &=& g\left(c_{i-1}+F_{i-1},c_i+B_i\right) \text{liding windows r} \end{array}$ 





research & development



# Hardware plateform



