working on eq

2025-04-16 07:06:48 +00:00 · 2017-04-12 15:49:17 -04:00 · 2017-04-12 15:49:17 -04:00 · 4e3c57c5f3
commit 4e3c57c5f3
parent 297162af13
13 changed files with 288 additions and 97 deletions
--- a/docs/source/eq.rst
+++ b/docs/source/eq.rst
@@ -0,0 +1,91 @@
+Sub-carrier Equalization and Pilot Correction
+============================================
+
+- **Module**: :file:`equalizer.v`
+- **Input**: ``I (16), Q (16)``
+- **Output**: ``I (16), Q (16)``
+
+This is the first module in the frequency domain. It has two main tasks:
+sub-carrier gain equalization and correcting the residual phase offset using
+the pilot sub-carriers.
+
+Sub-carrier Structure
+---------------------
+
+The basic channel width in 802.11a/g/n is 20 MHz, which is further divided into
+64 sub-carriers (0.3125 MHz each).
+
+.. _fig_subcarrier:
+.. figure:: /images/subcarrier.png
+    :align: center
+
+    Sub-carriers in 802.11 OFDM
+
+:numref:`fig_subcarrier` shows the sub-carrier structure of the 20 MHz band. 52
+out of the 64 sub-carriers are utilized: 4 of them (-21, -7, 7, 21) are used as
+pilot sub-carriers and the remaining 48 carry data. As we will see later, the
+pilot sub-carriers can be used to correct the residual frequency offset.
+
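A minimal sketch of the index bookkeeping described above. It assumes the used sub-carriers are -26..26 without DC (the range used in the channel-gain formula of this page) and NumPy's FFT ordering, where negative sub-carriers wrap to the upper bins. The names `PILOTS`, `USED`, `DATA` and `fft_bin` are illustrative only and are not taken from `equalizer.v`.

```python
# Sub-carrier indices of one 20 MHz 802.11a/g OFDM symbol (illustrative names).
PILOTS = [-21, -7, 7, 21]
USED = [i for i in range(-26, 27) if i != 0]   # 52 sub-carriers, DC excluded
DATA = [i for i in USED if i not in PILOTS]    # 48 data sub-carriers


def fft_bin(i, n_fft=64):
    """Map a signed sub-carrier index to its bin in an n_fft-point FFT output.

    Assumes NumPy-style ordering: bin 0 is DC, negative indices wrap around.
    """
    return i % n_fft
```

For example, `fft_bin(-26)` is bin 38 and `fft_bin(26)` is bin 26. Whether the Verilog FFT core emits bins in this order is not stated here, so treat the mapping as an assumption.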
+Each sub-carrier carries I/Q modulated information, corresponding to the output
+of the 64-point FFT from the :file:`sync_long.v` module.
+
+
+.. _fig_lts_fft:
+.. figure:: /images/lts_fft.png
+    :align: center
+    :scale: 80%
+
+    FFT of the Perfect and Two Actual LTS
+
+To plot :numref:`fig_lts_fft`:
+
+.. code-block:: python
+
+    # `samples` and `lts` are loaded as described in the Symbol Alignment section
+    import numpy as np
+    import matplotlib.pyplot as plt
+
+    lts1 = samples[11+160:][32:32+64]
+    lts2 = samples[11+160:][32+64:32+128]
+    fig, ax = plt.subplots(nrows=3, ncols=1, sharex=True)
+    ax[0].plot([c.real for c in np.fft.fft(lts)], '-bo')
+    ax[1].plot([c.real for c in np.fft.fft(lts1)], '-ro')
+    ax[2].plot([c.real for c in np.fft.fft(lts2)], '-ro')
+    plt.show()
+
+:numref:`fig_lts_fft` shows the FFT of the perfect LTS and the two actual LTSs
+in the samples. We can see that each sub-carrier exhibits different magnitude
+gain. In fact, they also have different phase drift. The combined effect of
+magnitude gain and phase drift (known as *channel gain*) can clearly be seen in
+the I/Q plane shown in :numref:`fig_lts_fft_iq`.
+
+
+.. _fig_lts_fft_iq:
+.. figure:: /images/lts_fft_iq.png
+    :align: center
+    :scale: 80%
+
+    FFT in I/Q Plane of The Actual LTS
+
+
+To map the FFT points to constellation points, we need to compensate for the
+channel gain. This can be achieved by normalizing the data OFDM symbols using
+the LTS. In particular, the mean of the two LTSs is used as the channel gain
+(:math:`H`):
+
+.. math::
+
+    H[i] = \frac{1}{2}(LTS_1[i] + LTS_2[i])\times L[i], i \in
+    [-26,\ldots, -1, 1, \ldots, 26]
+
+where :math:`L[i]` is the sign of the LTS sequence:
+
+.. math::
+
+    L_{-26,26} = \{
+    &1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, 1, 1, 1, 1, -1, -1, 1,\\
+    &1, -1, 1, -1, 1, 1, 1, 1, 0, 1, -1, -1, 1, 1, -1, 1, -1, 1,\\
+    &-1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, -1, 1, 1, 1, 1\}
+
+And the FFT output at sub-carrier :math:`i` is normalized as:
+
+.. math::
+
+    S'[i] = \frac{S[i]}{H[i]}
+
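The two formulas above can be sketched in NumPy as follows. This is a floating-point illustration under stated assumptions, not the fixed-point logic of `equalizer.v`: the inputs are 64-point FFT outputs in NumPy bin order, and the helper names `channel_gain` and `equalize` are made up for this example.

```python
import numpy as np

# Sign of the LTS on sub-carriers -26..26 (position 26 is the unused DC bin).
L = np.array([
    1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, 1, 1, 1, 1, -1, -1, 1,
    1, -1, 1, -1, 1, 1, 1, 1, 0, 1, -1, -1, 1, 1, -1, 1, -1, 1,
    -1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, -1, 1, 1, 1, 1])

# The 52 used sub-carrier indices (DC excluded).
USED = np.array([i for i in range(-26, 27) if i != 0])


def channel_gain(lts1_fft, lts2_fft):
    """H[i] on the 52 used sub-carriers from the FFTs of the two received LTSs."""
    mean = 0.5 * (lts1_fft[USED % 64] + lts2_fft[USED % 64])
    # Multiplying by the sign equals dividing by it, since L[i] is +1 or -1.
    return mean * L[USED + 26]


def equalize(symbol_fft, H):
    """S'[i] = S[i] / H[i] on the 52 used sub-carriers."""
    return symbol_fft[USED % 64] / H
```

Both functions return arrays of length 52, ordered from sub-carrier -26 to 26.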
--- a/docs/source/files/lts.txt
+++ b/docs/source/files/lts.txt
@@ -1,67 +1,3 @@
-1.559999999999999998e-01
-0.000000000000000000e+00
-1.200000000000000025e-02
-9.800000000000000377e-02
-9.199999999999999845e-02
-1.059999999999999970e-01
-9.199999999999999845e-02
-1.150000000000000050e-01
-3.000000000000000062e-03
-5.399999999999999939e-02
-7.499999999999999722e-02
-7.399999999999999634e-02
-1.270000000000000018e-01
-2.100000000000000130e-02
-1.219999999999999973e-01
-1.700000000000000122e-02
-3.500000000000000333e-02
-1.509999999999999953e-01
-5.600000000000000117e-02
-2.199999999999999872e-02
-5.999999999999999778e-02
-8.100000000000000255e-02
-7.000000000000000666e-02
-1.400000000000000029e-02
-8.200000000000000344e-02
-9.199999999999999845e-02
-1.310000000000000053e-01
-6.500000000000000222e-02
-5.700000000000000205e-02
-3.899999999999999994e-02
-3.699999999999999817e-02
-9.800000000000000377e-02
-6.199999999999999956e-02
-6.199999999999999956e-02
-1.189999999999999947e-01
-4.000000000000000083e-03
-2.199999999999999872e-02
-1.610000000000000042e-01
-5.899999999999999689e-02
-1.499999999999999944e-02
-2.400000000000000050e-02
-5.899999999999999689e-02
-1.370000000000000107e-01
-4.700000000000000011e-02
-1.000000000000000021e-03
-1.150000000000000050e-01
-5.299999999999999850e-02
-4.000000000000000083e-03
-9.800000000000000377e-02
-2.599999999999999881e-02
-3.799999999999999906e-02
-1.059999999999999970e-01
-1.150000000000000050e-01
-5.500000000000000028e-02
-5.999999999999999778e-02
-8.799999999999999489e-02
-2.100000000000000130e-02
-2.800000000000000058e-02
-9.700000000000000289e-02
-8.300000000000000433e-02
-4.000000000000000083e-02
-1.110000000000000014e-01
-5.000000000000000104e-03
-1.199999999999999956e-01
 1.559999999999999998e-01
 0.000000000000000000e+00
 -5.000000000000000104e-03
@@ -126,3 +62,67 @@
 1.059999999999999970e-01
 1.200000000000000025e-02
 9.800000000000000377e-02
+-1.559999999999999998e-01
+0.000000000000000000e+00
+1.200000000000000025e-02
+-9.800000000000000377e-02
+9.199999999999999845e-02
+-1.059999999999999970e-01
+-9.199999999999999845e-02
+-1.150000000000000050e-01
+-3.000000000000000062e-03
+-5.399999999999999939e-02
+7.499999999999999722e-02
+7.399999999999999634e-02
+-1.270000000000000018e-01
+2.100000000000000130e-02
+-1.219999999999999973e-01
+1.700000000000000122e-02
+-3.500000000000000333e-02
+1.509999999999999953e-01
+-5.600000000000000117e-02
+2.199999999999999872e-02
+-5.999999999999999778e-02
+-8.100000000000000255e-02
+7.000000000000000666e-02
+-1.400000000000000029e-02
+8.200000000000000344e-02
+-9.199999999999999845e-02
+-1.310000000000000053e-01
+-6.500000000000000222e-02
+-5.700000000000000205e-02
+-3.899999999999999994e-02
+3.699999999999999817e-02
+-9.800000000000000377e-02
+6.199999999999999956e-02
+6.199999999999999956e-02
+1.189999999999999947e-01
+4.000000000000000083e-03
+-2.199999999999999872e-02
+-1.610000000000000042e-01
+5.899999999999999689e-02
+1.499999999999999944e-02
+2.400000000000000050e-02
+5.899999999999999689e-02
+-1.370000000000000107e-01
+4.700000000000000011e-02
+1.000000000000000021e-03
+1.150000000000000050e-01
+5.299999999999999850e-02
+-4.000000000000000083e-03
+9.800000000000000377e-02
+2.599999999999999881e-02
+-3.799999999999999906e-02
+1.059999999999999970e-01
+-1.150000000000000050e-01
+5.500000000000000028e-02
+5.999999999999999778e-02
+8.799999999999999489e-02
+2.100000000000000130e-02
+-2.800000000000000058e-02
+9.700000000000000289e-02
+-8.300000000000000433e-02
+4.000000000000000083e-02
+1.110000000000000014e-01
+-5.000000000000000104e-03
+1.199999999999999956e-01
--- a/docs/source/freq_offset.rst
+++ b/docs/source/freq_offset.rst
@@ -20,22 +20,26 @@ visually how each correction step helps in the final constellation plane.
 .. _fig_cons:
 .. figure:: /images/cons.png
    :align: center
+    :scale: 80%

    Constellation Points Without Any Correction

 .. figure:: /images/cons_w_coarse.png
    :align: center
+    :scale: 80%

    Constellation Points With Only Coarse Correction

 .. figure:: /images/cons_w_coarse_fine.png
    :align: center
+    :scale: 80%

    Constellation Points With both Coarse and Fine Correction 

 .. _fig_cons_full:
 .. figure:: /images/cons_w_coarse_fine_pilot.png
    :align: center
+    :scale: 80%

    Constellation Points With Coarse, Fine and Pilot Correction

@@ -49,7 +53,7 @@ The coarse CFO can be estimated using the short preamble as follows:

 .. math::

-    \alpha_{ST} = \frac{1}{16}\angle(\sum_{i=0}^{N}\overline{S[i]}S[i+16])
+    \alpha_{ST} = \frac{1}{16}\angle(\sum_{i=0}^{N-1}\overline{S[i]}S[i+16])

 where :math:`\angle(\cdot)` is the phase of complex number and :math:`N \le 144
 (160 - 16)` is the subset of short preambles utilized. The intuition is that the
@@ -69,5 +73,22 @@ set :math:`N=64`. The ``prod_avg`` in :numref:`fig_sync_short` is fed into a
 ``moving_avg`` module with window size set to 64.


+Fine CFO Correction
+-------------------
+
+A finer estimate of the CFO can be obtained with the help of the long training
+sequences inside the long preamble.
+
+The long preamble contains two identical training sequences (64 samples each at
+20 MSPS), so the phase offset can be calculated as:
+
+.. math::
+
+    \alpha_{LT} = \frac{1}{64}\angle(\sum_{i=0}^{63}\overline{S[i]}S[i+64])
+
+This step is omitted in |project| due to the limited resolution of phase
+estimation and rotation in the look up table.
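
Since this step is omitted in the Verilog, the formula can only be illustrated in software. A minimal NumPy sketch, assuming `lts_pair` holds the 128 samples of the two back-to-back LTS copies (the function name `fine_cfo` is hypothetical):

```python
import numpy as np


def fine_cfo(lts_pair):
    """Per-sample phase offset (radians) estimated from the two LTS copies."""
    s = np.asarray(lts_pair)
    # Each product conj(S[i]) * S[i+64] has phase 64 * alpha; summing averages noise.
    return np.angle(np.sum(np.conj(s[:64]) * s[64:128])) / 64
```

Because `np.angle` returns values in (-pi, pi], the estimate is only unambiguous for offsets smaller than pi/64 radians per sample in magnitude.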

 .. [1] Sourour, Essam, Hussein El-Ghoroury, and Dale McNeill.  "Frequency Offset Estimation and Correction in the IEEE 802.11 a WLAN." Vehicular Technology Conference, 2004. VTC2004-Fall. 2004 IEEE 60th. Vol. 7.  IEEE, 2004. 
+
+
--- a/docs/source/images/lts.png
+++ b/docs/source/images/lts.png
--- a/docs/source/images/lts_16.png
+++ b/docs/source/images/lts_16.png
--- a/docs/source/images/lts_fft.png
+++ b/docs/source/images/lts_fft.png
--- a/docs/source/images/lts_fft_iq.png
+++ b/docs/source/images/lts_fft_iq.png
--- a/docs/source/images/match_size.png
+++ b/docs/source/images/match_size.png
--- a/docs/source/images/quadrant.png
+++ b/docs/source/images/quadrant.png
--- a/docs/source/images/subcarrier.png
+++ b/docs/source/images/subcarrier.png
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -14,6 +14,7 @@ Welcome to |project|'s documentation!
   detection
   freq_offset
   sync_long
+   eq
   setting
   verilog

--- a/docs/source/sync_long.rst
+++ b/docs/source/sync_long.rst
@@ -1,6 +1,10 @@
 Symbol Alignment
 ================

+- **Module**: :file:`sync_long.v`
+- **Input**: ``I (16), Q (16), phase_offset (32), short_gi (1)``
+- **Output**: ``long_preamble_detected (1), fft_re (16), fft_im (16)``
+
 After detecting the packet, the next step is to determine precisely where each
 OFDM symbol starts. In 802.11, each OFDM symbol is 4 |us| long. At 20 MSPS
 sampling rate, this means each OFDM symbol contains 80 samples. The task is to
@@ -14,20 +18,20 @@ achieved using the long preamble following the short preamble.
    802.11 OFDM Packet Structure (Fig 18-4 in 802.11-2012 Std)

 As shown in :numref:`fig_training`, the long preamble duration is 8 |us| (160
-samples), and contains two identical long training sequence (LTS), 64 samples each.
-The LTS is known and we can use `matched filter
-<https://en.wikipedia.org/wiki/Matched_filter>`_ to find it.
+samples), and contains two identical long training sequence (LTS), 64 samples
+each.  The LTS is known and we can use `cross correlation
+<https://en.wikipedia.org/wiki/Cross-correlation>`_ to find it.

-The match *score* at sample :math:`i` can be calculated as follows.
+The cross correlation *score* at sample :math:`i` can be calculated as follows.

 .. math:: 
-    :label: eq_matched
+    :label: eq_cross_corr

-    Y[i] = \sum_{k=0}^{63}(S[i+k]\overline{H[63-k]})
+    Y[i] = \sum_{k=0}^{63}(S[i+k]\overline{H[k]})

 where :math:`H` is the 64 sample known LTS in time domain, and can be found in
-Table L-6 in :download:`802.11-2012 std </files/802.11-2012.pdf>` (index 64 to
-127). A numpy readable file of the LTS (64 samples) can be found :download:`here
+Table L-6 in :download:`802.11-2012 std </files/802.11-2012.pdf>` (index 96 to
+159). A numpy readable file of the LTS (64 samples) can be found :download:`here
 </files/lts.txt>`, and can be read like this:

 .. code-block:: python
@@ -39,7 +43,7 @@ Table L-6 in :download:`802.11-2012 std </files/802.11-2012.pdf>` (index 64 to
 .. figure:: /images/lts.png
    :align: center

-    Long Preamble and Matched Filter Result
+    Long Preamble and Cross Correlation Result

 To plot :numref:`fig_lts`, load the data file (see :ref:`sec_sample`), then:

@@ -53,30 +57,30 @@ To plot :numref:`fig_lts`, load the data file (see :ref:`sec_sample`), then:
    fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True)
    ax[0].plot([c.real for c in samples][:500])
    # lts is from the above code snippet
-    ax[1].plot([abs(c) for c in np.convolve(samples, lts, mode='same')][:500], '-ro')
+    ax[1].plot([abs(c) for c in np.correlate(samples, lts, mode='valid')][:500], '-ro')
    plt.show()

    

-:numref:`fig_lts` shows the long preamble samples and also the result of matched
-filter. We can clearly see two spikes corresponding the two LTS in long
+:numref:`fig_lts` shows the long preamble samples and also the result of cross
+correlation. We can clearly see two spikes corresponding to the two LTS in long
 preamble. And the spike width is only 1 sample which shows exactly the beginning
 of each sequence. Suppose the sample index if the first spike is :math:`N`, then
-the 160 sample long preamble starts at sample :math:`N-33`.
+the 160 sample long preamble starts at sample :math:`N-32`.

 This all seems nice and dandy, but as it comes to Verilog implementation, we
-have to make a few compromises.
+have to make a compromise.

-First, from :eq:`eq_matched` we can see for each sample, we need to perform 64
+From :eq:`eq_cross_corr` we can see for each sample, we need to perform 64
 complex number multiplications, which would consume a lot FPGA resources.
-Therefore, we need to reduce the matched filter size. The idea is to only use
-a portion instead of all the LTS samples.
+Therefore, we need to reduce the size of the cross correlation. The idea is to
+only use a portion instead of all the LTS samples.

 .. _fig_match_size:
 .. figure:: /images/match_size.png
    :align: center

-    Matched Filter with Various Size (8, 16, 32, 64)
+    Cross Correlation with Various Size (8, 16, 32, 64)

 :numref:`fig_match_size` can be plotted as:

@@ -86,17 +90,44 @@ a portion instead of all the LTS samples.

    fig, ax = plt.subplots(nrows=5, ncols=1, sharex=True)
    ax[0].plot([c.real for c in lp])
-    ax[1].plot([abs(c) for c in np.convolve(lp, lts[:8], mode='same')], '-ro')
-    ax[2].plot([abs(c) for c in np.convolve(lp, lts[:16], mode='same')], '-ro')
-    ax[3].plot([abs(c) for c in np.convolve(lp, lts[:32], mode='same')], '-ro');
-    ax[4].plot([abs(c) for c in np.convolve(lp, lts, mode='same')], '-ro')
+    ax[1].plot([abs(c) for c in np.correlate(lp, lts[:8], mode='valid')], '-ro')
+    ax[2].plot([abs(c) for c in np.correlate(lp, lts[:16], mode='valid')], '-ro')
+    ax[3].plot([abs(c) for c in np.correlate(lp, lts[:32], mode='valid')], '-ro');
+    ax[4].plot([abs(c) for c in np.correlate(lp, lts, mode='valid')], '-ro')
    plt.show()

-:numref:`fig_match_size` shows the long preamble (160 samples) as well as
-matched filter with different size. It can be seen that using the first 16
-samples of LTS is good enough to exhibit two narrow spikes. Therefore, |project|
-use matched filter of size 16 for symbol alignment. And the first sample of the
-long preamble starts at :math:`N_{16}-57`, where :math:`N_{16}` is the index of
-the first spike when the filter size is 16 (for completeness, it is
-:math:`N_{32}-49` when filter size is
-32).
+:numref:`fig_match_size` shows the long preamble (160 samples) as well as the
+cross correlation with different sizes. It can be seen that using the first 16
+samples of the LTS is good enough to exhibit two narrow spikes. Therefore,
+|project| uses the cross correlation of the first 16 samples of the LTS for
+symbol alignment. To confirm, :numref:`fig_lts_16` shows the cross correlation
+of the first 16 samples of the LTS on the actual packet. The two spikes are not
+as obvious as the ones in :numref:`fig_lts`, but are still clearly visible.
+
+.. _fig_lts_16:
+.. figure:: /images/lts_16.png
+    :align: center
+
+    Cross Correlation using the First 16 Samples of LTS
+
+To find the two spikes, we keep a record of the max correlation sample for the
+first 64 samples (since the first spike is supposed to be at the 32nd sample).
+Similarly, we also keep a record of the max correlation sample for the second 64
+samples. To further eliminate false positives, we also check if the two spike
+sample indexes are :math:`64 \pm 1` apart.
+
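The spike search described above can be sketched in NumPy as follows. This is a software illustration, not the streaming Verilog logic. It assumes `lp` is the 160-sample long preamble and `lts` the known 64-sample LTS, and the function name `find_lts_spikes` is hypothetical.

```python
import numpy as np


def find_lts_spikes(lp, lts, size=16):
    """Return (first, second) spike indexes, or None if the spacing check fails."""
    # np.correlate conjugates its second argument, matching the score Y[i].
    corr = np.abs(np.correlate(lp, lts[:size], mode='valid'))
    first = int(np.argmax(corr[:64]))            # max over the first 64 samples
    second = 64 + int(np.argmax(corr[64:128]))   # max over the second 64 samples
    if abs((second - first) - 64) > 1:           # spikes must be 64 +/- 1 apart
        return None
    return first, second
```

On a clean long preamble the expected result is `(32, 96)`, which matches the statement that the first spike sits at sample 32.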
+
+FFT
+---
+
+Now that we have located the start of each OFDM symbol, the next task is to
+perform FFT on the last 64 data samples inside each symbol. For this we utilize
+the `XFFT core
+<https://www.xilinx.com/support/documentation/ip_documentation/xfft_ds260.pdf>`_
+generated by Xilinx ISE. Depending on whether `short guard interval (SGI)
+<https://en.wikipedia.org/wiki/Guard_interval>`_ is used, the first 8 (with SGI)
+or 16 (without SGI) samples of each OFDM symbol need to be skipped.
+
+But before performing FFT, we need to first apply the frequency offset
+correction (see :ref:`freq_offset`). This is achieved via the ``rotate`` module
+(see :ref:`rotate`).
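
A minimal software equivalent of the guard-interval skip and FFT, assuming `symbol` already starts at the first sample of an aligned, frequency-corrected OFDM symbol. `symbol_fft` is a hypothetical helper that uses NumPy in place of the XFFT core.

```python
import numpy as np


def symbol_fft(symbol, short_gi=False):
    """FFT of the 64 data samples that follow the guard interval."""
    gi = 8 if short_gi else 16      # guard interval length in samples at 20 MSPS
    return np.fft.fft(symbol[gi:gi + 64])
```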
--- a/docs/source/verilog.rst
+++ b/docs/source/verilog.rst
@@ -9,7 +9,9 @@ quantization and look up table. In |project|, these approximations are used.
 Magnitude Estimation
 --------------------

-**Module**: :file:`complex_to_mag.v`
+- **Module**: :file:`complex_to_mag.v`
+- **Input**: ``i (32), q (32)``
+- **Output**: ``mag (32)``

 In the ``sync_short`` module, we need to calculate the magnitude of the
 ``prod_avg``, whose real and imagine part are both 32-bits. To avoid 32-bit
@@ -36,10 +38,15 @@ second cycle, ``max`` and ``min`` are determined. In the final cycle, the
 magnitude is calculated.


+.. _sec_phase:
+
 Phase Estimation
 ----------------

-**Module**:: :file:`phase.v`
+- **Module**: :file:`phase.v`
+- **Input**: ``i (32), q (32)``
+- **Output**: ``phase (32)``
+- **Note**: The returned phase is scaled up by 512 (i.e., :math:`int(\theta *512)`)

 When correcting the frequency offset, we need to estimate the phase of a complex
 number. The *right* way of doing this is probably using the `CORDIC
@@ -102,3 +109,43 @@ Refer to `this guide
 <https://www.xilinx.com/itp/xilinx10/isehelp/cgn_p_memed_single_block.htm>`_ on
 how to create a look up table in Xilinx ISE. The generated module is stored in
 :file:`verilog/coregen/atan_lut.v`.
+
+
+
+.. _rotate:
+
+Rotation
+--------
+
+- **Module**: :file:`/verilog/rotate.v`
+- **Input**: ``i (16), q (16), phase (32)``
+- **Output**: ``out_i (16), out_q (16)``
+- **Note**: The input phase is assumed to be scaled up by 512.
+
+To rotate a complex number :math:`C=I+jQ` by an angle :math:`\theta`, we can
+multiply it by :math:`e^{j\theta}`, as shown in :eq:`eq_rot`.
+
+.. math::
+    :label: eq_rot
+
+    C' = (I+jQ)\times(\cos(\theta)+j\sin(\theta))
+
+Again, this can be done using the CORDIC algorithm. But similar to
+:ref:`sec_phase`, we use the look up table.
+
+
+.. _fig_quadrant:
+.. figure:: /images/quadrant.png
+    :align: center
+    :scale: 60%
+
+    Quadrant in I/Q Plane
+
+As shown in :numref:`fig_quadrant`, we split the I/Q plane into 8 quadrants,
+:math:`\pi/4` each. To avoid storing nearly duplicate entries in the table, we
+first map the phase to be rotated (:math:`[-\pi, \pi]`) into the :math:`[0,
+\pi/4]` range. Next, since the incoming phase is scaled up by 512, each quadrant
+is further split into :math:`402=int(\pi/4 \times 512)` sectors. The
+:math:`\cos(\theta)` and :math:`\sin(\theta)` values (scaled up by 2048) are
+stored in the look up table, which is generated by
+:file:`scripts/gen_rot_lut.py`.
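
The folding into `[0, pi/4]` and the table lookup can be sketched in Python as below. This is not the content of `scripts/gen_rot_lut.py` or `rotate.v`. It is an illustration that reuses the scaling factors from the text (512 for phase, 2048 for cos/sin); the folding order, the extra table entry and all names are assumptions of this sketch.

```python
import math

SCALE_PHASE = 512                          # phase unit is 1/512 rad
SCALE_AMP = 2048                           # cos/sin values are scaled by 2048
EIGHTH = int(math.pi / 4 * SCALE_PHASE)    # 402 sectors per pi/4 slice
QUARTER, HALF = 2 * EIGHTH, 4 * EIGHTH     # scaled pi/2 and pi (804, 1608)

# One extra entry so that index EIGHTH itself is valid (assumption of this sketch).
LUT = [(round(math.cos(k / SCALE_PHASE) * SCALE_AMP),
        round(math.sin(k / SCALE_PHASE) * SCALE_AMP)) for k in range(EIGHTH + 1)]


def cos_sin(phase):
    """Scaled (cos, sin) for a scaled phase in [-HALF, HALF]."""
    p = abs(phase)
    flip_cos = p > QUARTER        # cos(t) = -cos(pi - t), sin(t) = sin(pi - t)
    if flip_cos:
        p = HALF - p
    swap = p > EIGHTH             # cos(t) = sin(pi/2 - t), sin(t) = cos(pi/2 - t)
    if swap:
        p = QUARTER - p
    c, s = LUT[p]
    if swap:
        c, s = s, c
    if flip_cos:
        c = -c
    if phase < 0:                 # sin is odd, cos is even
        s = -s
    return c, s


def rotate(i, q, phase):
    """Rotate I + jQ by phase / 512 radians using integer arithmetic."""
    c, s = cos_sin(phase)
    return (i * c - q * s) >> 11, (i * s + q * c) >> 11   # >> 11 undoes the 2048 scale
```

For instance, `rotate(1000, 0, 804)` rotates by roughly pi/2 and yields `(0, 1000)`. The scaled phase quantizes pi to 1608/512, so results deviate from the exact rotation by a few least-significant bits.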