What is SERDES?

FREE-SKY (HK) ELECTRONICS CO.,LIMITED / 06-30 14:11

SERDES is short for SERializer/DESerializer. It is a point-to-point (P2P) serial communication technique based on time division multiplexing (TDM): at the transmitting end, multiple low-speed parallel signals are converted into a high-speed serial signal, which travels over the transmission medium (optical cable or copper wire) and is converted back into low-speed parallel signals at the receiving end.

Ⅰ. What is SERDES?

As a point-to-point (P2P) serial technique built on time division multiplexing (TDM), SERDES (SERializer/DESerializer) converts multiple low-speed parallel signals into a high-speed serial signal at the transmitting end, sends it over the transmission medium (optical cable or copper wire), and converts it back into low-speed parallel signals at the receiving end. This makes full use of the medium's channel capacity, reduces the number of transmission channels and device pins required, raises the signal transmission speed, and significantly lowers communication costs.

1

Ⅱ. What's the Function of SERDES?

2.1 Parallel Bus Interface

Before SerDes became widespread, chips exchanged data over system-synchronous or source-synchronous parallel interfaces. Both interface types are shown in the diagram below.

2.

In system-synchronous mode, several factors shrink the valid data window as the interface frequency increases:

a) The clock propagation delay to the two chips is not equal (clock skew).

b) Each bit of the parallel data has a different propagation delay (data skew).

c) The clock's propagation delay and the data's propagation delay do not match (skew between data and clock).

Although the clock skew can be compensated by a PLL in the target chip (chip #2), the clock delay and the data delay change by different amounts when process, voltage, and temperature (PVT) vary. This narrows the data window even further.

In source-synchronous mode, the transmitter (Tx) sends the clock along with the data, which limits the impact of clock skew on the valid data window. The transmitting chip usually treats the clock signal and the data signals identically, so they travel through the same kind of path and see the same delay. When PVT changes, the clock and data delays increase or decrease together by the same amount, which keeps the skew between them small.

Consider a 32-bit parallel data bus under some reasonable assumptions:

a) Data skew at the sender = 50 ps (a very stringent requirement).

b) Skew introduced by PCB routing = 50 ps (a very stringent requirement).

c) Clock period jitter = +/-50 ps (a very stringent requirement).

d) The IO flip-flop of a high-end Xilinx V7 device needs a sampling window of 250 ps at the receiving end.

The minimum bit time is then 50 + 50 + 100 + 250 = 450 ps, so the highest data rate of the parallel interface is roughly 1/450 ps = 2.2 Gbps per pin, which corresponds to a clock of about 2.2 GHz (SDR) or 1.1 GHz (DDR).
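The budget above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name and parameter names are illustrative, and the +/-50 ps jitter is assumed to consume 100 ps peak to peak.

```python
def max_bit_rate_gbps(tx_skew_ps, pcb_skew_ps, jitter_pk_pk_ps, rx_window_ps):
    """Return the highest per-pin bit rate (Gbps) that fits the timing budget.

    All inputs are in picoseconds. The minimum bit time is simply the sum of
    every term that eats into the valid data window.
    """
    min_bit_time_ps = tx_skew_ps + pcb_skew_ps + jitter_pk_pk_ps + rx_window_ps
    # 1 / (1 ps) = 1000 GHz, so 1000 / ps gives Gbps.
    return 1000.0 / min_bit_time_ps


# Values from the list above; +/-50 ps jitter is taken as 100 ps peak to peak.
rate = max_bit_rate_gbps(tx_skew_ps=50, pcb_skew_ps=50,
                         jitter_pk_pk_ps=100, rx_window_ps=250)
print(f"max bit rate per pin: {rate:.2f} Gbps")
```

Relaxing any one of the four terms to a more realistic value lowers the result quickly, which is why practical parallel interfaces stay well below this figure.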

A source-synchronous interface greatly improves the valid data window, but in practice its frequency is usually below 1 GHz. In real applications, the SPI4.2 interface runs at up to 700 MHz DDR with a 16-bit width. The DDR memory interface is also source synchronous; in an FPGA, DDR3 reaches a clock speed of around 800 MHz.

There are two ways to increase an interface's transmission bandwidth: raise the clock frequency or widen the data bus. So can the data width be expanded indefinitely? No, because Simultaneous Switching Noise (SSN) becomes the limiting factor.

The principle of SSN is not explored here; the formula is given directly: SSN = L × N × di/dt, where L is the chip package inductance, N is the data width, and di/dt is the slope of the current change.

SSN has become the key constraint on transmission bandwidth as frequency and data width have increased. DDR3 crosstalk is shown in Figure 1.2. The low level in the figure has a theoretical value of 0 V, but it oscillates under the influence of SSN. Because the oscillating noise peaks at 610 mV, the noise margin is only 1.5 V/2 - 610 mV = 140 mV.

3.

DDR3 Crosstalk Demo

As a result, bandwidth cannot be increased by widening the data bus indefinitely. Replacing single-ended signals with differential signals handles SSN very well, but it costs more chip pins and still does not solve the data skew problem. A wide, differential parallel interface with tight timing requirements is therefore very difficult to implement.

2.2 SERDES Interface

The clock frequency of the source-synchronous interface has reached its limit. Because the channel is not ideal, raising the frequency further degrades the signal substantially, so equalization and data clock phase detection are required. This is the technology SerDes is built on. SerDes is short for Serializer-Deserializer: the transmitter (Tx) is the serializer, and the receiver (Rx) is the deserializer. The diagram below shows N pairs of SerDes transmit and receive channels connected together. N is usually less than 4.

4.

As the diagram shows, SerDes does not send a clock signal, which is its most distinctive feature. Instead, the receiving end contains a CDR (Clock Data Recovery) circuit that extracts the clock from the edge information in the data and determines the best sampling position.

SerDes transmits data differentially. Several channels are usually grouped together to share PLL resources, but each channel still operates independently.

SerDes requires a reference clock, which is usually differential to reduce noise. The reference clocks of the receiver (Rx) and transmitter (Tx) may differ in frequency by up to several hundred ppm (a plesiochronous system), or they may have the same frequency; in either case no fixed phase relationship is required.

In short, a SerDes channel needs only four pins (Tx+/-, Rx+/-), and current FPGAs support line rates of up to 28 Gbps. A 16-bit DDR3-1600 interface has a line throughput of 1.6 Gbps × 16 = 25.6 Gbps, but it requires 50 pins. This comparison shows the advantage of SerDes in transmission bandwidth.
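The comparison can be expressed as bandwidth per pin. The sketch below uses the figures quoted above and, as a conservative assumption, counts only one direction of the SerDes lane (28 Gbps) against all four of its pins.

```python
def gbps_per_pin(total_gbps, pins):
    """Bandwidth delivered per package pin."""
    return total_gbps / pins


# One SerDes lane: 28 Gbps in one direction, 4 pins (Tx+/-, Rx+/-).
serdes_density = gbps_per_pin(28.0, 4)

# 16-bit DDR3-1600: 1.6 Gbps per data pin, 50 pins for the whole interface.
ddr3_density = gbps_per_pin(1.6 * 16, 50)

print(f"SerDes: {serdes_density:.2f} Gbps/pin")
print(f"DDR3:   {ddr3_density:.3f} Gbps/pin")
print(f"ratio:  {serdes_density / ddr3_density:.1f}x")
```

Counting both directions of the lane would double the SerDes figure, so the gap shown here is a lower bound under these assumptions.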

SerDes has several advantages over the source-synchronous interface:

a) The clock is embedded in the data stream, so no separate clock signal has to be transmitted.

b) SerDes can use emphasis/equalization to achieve high-speed, long-distance transmission, for example across backplanes.

c) SerDes uses fewer chip pins.

2.3 Intermediate Types

Some interface types sit between SerDes and parallel interfaces. They use serializers and deserializers, but, like source-synchronous interfaces, they also transmit a clock signal for synchronization. The 7:1 LVDS video display interface is one example.


Ⅲ. What's the Structure of SERDES?

SerDes is made up of three primary components: a PLL module, a transmitting module (Tx), and a receiving module (Rx). To make maintenance and testing easier, it also includes control and status registers, loopback testing, and PRBS testing. See the figure below.

5.

Basic Blocks of a Typical SERDES

The PCS layer, shown in blue in the figure, is ordinary synthesizable CMOS digital logic. It can be implemented in hard logic or in FPGA soft logic and is relatively easy to understand. The sub-modules with a brown background form the PMA layer, a mixed-signal CML/CMOS circuit. The PMA is the key to understanding the difference between SerDes and the parallel interface, and it is the focus of this article.

The signal flow in the transmit direction (Tx) is as follows. The parallel signal from the FPGA soft logic (fabric) passes through the interface FIFO to the 8B/10B encoder or the scrambler, which prevents the data from containing too many consecutive zeros or ones. It then goes to the serializer for parallel-to-serial conversion. An equalizer conditions the serial data before a driver sends it out.

In the receive direction (Rx), the external serial signal first passes through a linear equalizer or a DFE (Decision Feedback Equalizer) to remove part of the deterministic jitter. The CDR recovers the sampling clock from the data, and the deserializer converts the serial stream into an aligned parallel signal. An 8B/10B decoder or descrambler then completes the decoding or descrambling. If the clock system is plesiochronous, an elastic FIFO should be placed before the user FIFO to compensate for the frequency difference.

Supplement: Equalizer

An equalizer is a compensation filter inserted in the baseband or intermediate-frequency section of a communication system to reduce inter-symbol interference. There are two types: frequency-domain equalizers and time-domain equalizers.

Frequency domain equalizer

The frequency-domain equalizer compensates for the amplitude-frequency and group delay characteristics of the actual channel using the frequency characteristics of the tunable filter, so that the total frequency characteristics of the entire system, including the equalizer, meet the transmission conditions without inter-symbol interference.

Time-domain equalizer

The time-domain equalizer works directly on the time response: it shapes the impulse response of the entire transmission system, including the equalizer, so that there is no inter-symbol interference. Frequency-domain equalization has to satisfy the Nyquist shaping criterion, whereas requiring no inter-symbol interference only at the decision points is a looser condition. As a result, time-domain equalizers are the more common choice in digital communications.

Time-domain equalizers are either linear or nonlinear. If the receiver's decision results are fed back to adjust the equalizer, it is a nonlinear equalizer; otherwise, it is a linear equalizer. The most common linear structure is the linear transversal equalizer, a tapped delay line whose tap spacing equals the symbol interval. Nonlinear equalizers include the decision feedback equalizer (DFE), the maximum likelihood (ML) symbol detector, and maximum likelihood sequence estimation.

The PLL generates the clock signals that each SerDes module requires and controls the phase relationship between them. In the figure, for example, the line rate is 10 Gbps and the reference clock frequency is 250 MHz. The serializer and deserializer need at least 5 GHz clocks at 0-degree and 90-degree phase, as well as a 1 GHz (10-bit parallel) or 1.25 GHz (8-bit parallel) clock, among others.
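The clock frequencies in this example follow directly from the line rate. The sketch below reproduces them; the function and key names are illustrative, and it assumes a half-rate (DDR) architecture in which the PLL multiplies the reference clock up to half the line rate.

```python
def clock_plan(line_rate_gbps, ref_clk_mhz, parallel_width):
    """Derive the main SerDes clock frequencies for a half-rate architecture."""
    half_rate_ghz = line_rate_gbps / 2.0               # DDR sampling clock
    parallel_ghz = line_rate_gbps / parallel_width     # word-rate clock
    pll_multiplier = half_rate_ghz * 1000.0 / ref_clk_mhz
    return {
        "half_rate_ghz": half_rate_ghz,
        "parallel_ghz": parallel_ghz,
        "pll_multiplier": pll_multiplier,
    }


plan_10bit = clock_plan(line_rate_gbps=10.0, ref_clk_mhz=250.0, parallel_width=10)
plan_8bit = clock_plan(line_rate_gbps=10.0, ref_clk_mhz=250.0, parallel_width=8)
print(plan_10bit)
print(plan_8bit)
```

For the 10 Gbps case this gives a 5 GHz sampling clock, a 1 GHz or 1.25 GHz parallel clock, and a multiplication factor of 20 from the 250 MHz reference.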

A SerDes usually includes debugging features as well, such as generation and checking of pseudo-random bit streams, various loopback tests, control and status registers with an access interface, LOS detection, and eye diagram testing.

3.1 Serializer/Deserializer

The serializer converts parallel signals into a serial signal, and the deserializer converts the serial signal back into parallel signals. The parallel signal is typically 8/10 bits or 16/20 bits wide, while the serial signal is 1 bit wide (serialization can also be done in stages, such as 8bit->4bit->2bit->equalizer->1bit, to lower the equalizer's operating frequency). Scrambled protocols such as SDH/SONET and SMPTE SDI use a parallel width of 8/16 bits, while 8B/10B-encoded protocols such as PCI-Express and GbE use 10/20 bits.

The figure below depicts a 4:1 serializer; 8:1 and 16:1 serializers are implemented in a similar way. In practice, to reduce the equalizer's operating frequency, the serializer first converts the parallel data to 2 bits, sends them to the equalizer for filtering, and then performs the final 2:1 serialization. The rest of this article describes the serial signal as 1 bit wide.

6.

4:1 Serializer Demo

The figure below depicts a 1:4 deserializer; 1:8 and 1:16 deserializers use a similar structure. In practice, to reduce the operating frequency of a DFE-based equalizer, the DFE works in DDR mode and the deserializer's input is 2 bits or wider. The rest of this article still describes the serial signal as 1 bit wide.

7.

1:4 De-Serializer Demo

The serializer/deserializer uses double-edge (DDR) operation and trades area for speed, which reduces the proportion of high-frequency circuitry and therefore the noise.
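At the behavioral level, the 4:1 serializer and 1:4 deserializer are just a parallel-to-serial and serial-to-parallel conversion. The following sketch models only that data movement, not the DDR clocking; the LSB-first bit order is an assumption made for illustration.

```python
def serialize(words, width):
    """Flatten parallel words into a serial bit list, LSB first (assumed order)."""
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits, width):
    """Group a serial bit list back into words of `width` bits, LSB first."""
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for i in range(width):
            word |= bits[start + i] << i
        words.append(word)
    return words


words = [0b1010, 0b0011, 0b1111]
serial = serialize(words, 4)
recovered = deserialize(serial, 4)
shifted = deserialize(serial[1:], 4)   # receiver starts one bit late

print(recovered == words)   # word boundaries known: data recovered
print(shifted)              # wrong boundaries: different words come out
```

The second call shows what happens when the receiver does not know where a word starts, which is the problem word alignment has to solve.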

In addition to the deserializer, the receive direction usually contains alignment logic (an aligner). Relative to the transmitting end, the receiving end can start operating at any moment, so the first bit the receiver captures may fall at any bit position of the transmitted parallel word. Alignment logic is therefore needed to decide where the word boundaries are. It determines the start position of the serial-to-parallel conversion by searching the serial data stream for an alignment code. In the 8B/10B encoding scheme, for example, the alignment word is commonly K28.5 (0011111010 or 1100000101 in transmission order, depending on the running disparity). The alignment logic is illustrated in the figure above. A state machine slides a window over the data and compares it bit by bit with the alignment code; after finding the alignment code at the same position several times, it locks onto that position and outputs the aligned data from it.
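The sliding-window search can be sketched as follows. This is a simplified software model rather than any vendor's state machine: the function name, the `required_hits` threshold, and the test stream are illustrative, and the K28.5 patterns are written in transmission order (abcdeifghj), which is an assumption about bit ordering.

```python
# K28.5 code groups in transmission order, for both running disparities.
K28_5_NEG = (0, 0, 1, 1, 1, 1, 1, 0, 1, 0)
K28_5_POS = (1, 1, 0, 0, 0, 0, 0, 1, 0, 1)
D21_5 = (1, 0, 1, 0, 1, 0, 1, 0, 1, 0)   # filler data word for the demo


def find_alignment(bits, align_codes, width=10, required_hits=3):
    """Return the word offset (0..width-1) once an alignment code has been
    seen `required_hits` times at the same offset, or None if never locked."""
    hits = [0] * width
    for pos in range(len(bits) - width + 1):
        window = tuple(bits[pos:pos + width])
        if window in align_codes:
            offset = pos % width
            hits[offset] += 1
            if hits[offset] >= required_hits:
                return offset
    return None


# Receiver starts mid-word: 3 stray bits, then alternating comma and data words.
stream = [1, 0, 1]
for comma in (K28_5_NEG, K28_5_POS, K28_5_NEG):
    stream += list(comma) + list(D21_5)

offset = find_alignment(stream, {K28_5_NEG, K28_5_POS})
print("locked at bit offset", offset)
```

Requiring several hits at the same offset guards against locking onto a chance match caused by bit errors.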

3.2 Tx Equalizer

The SerDes signal travels through a channel, which includes components such as chip packaging, PCB traces, vias, cables, and connectors, on its way from the sending chip to the receiving chip. The channel can be simplified to a low-pass filter (LPF) model in the frequency domain. The signal will be corrupted to some extent if the rate of SerDes is higher than the channel's cutoff frequency. The equalizer's job is to compensate for the signal's degradation caused by the channel.

The equalizer at the transmitting end uses an FFE (feed-forward equalizer) structure, and transmit-side equalization is also known as emphasis. There are two kinds: de-emphasis, which reduces the swing of the differential signal, and pre-emphasis, which increases it. Most FPGAs use de-emphasis; the stronger the emphasis, the lower the average signal amplitude.

Transmit-side equalization acts as a high-pass filter (HPF) that approximates the inverse of the channel frequency response, H^-1(f). The purpose of the FFE is to ensure that the signal arriving at the receiving end is clean. An FFE can be implemented in a variety of ways; the figure below depicts a typical example.

 8.

Baud-Spaced 3-Taps FFE

The frequency response of the filter can be changed by adjusting the tap coefficients to suit different channel characteristics, and the coefficients can usually be configured dynamically. Taking a 10 Gbps line rate as an example, the FFE frequency response is shown in the figure below. With C0=0, C1=1.0, and C2=-0.25, the gain at 5 GHz is about 4 dB higher than in the low-frequency region, which compensates for the channel's attenuation of high-frequency components.

9.

Frequency Response of Different FFE
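The roughly 4 dB boost quoted for C0=0, C1=1.0, C2=-0.25 can be reproduced by evaluating the FIR response of a baud-spaced 3-tap filter. The sketch below assumes the output is y[n] = C0·x[n] + C1·x[n-1] + C2·x[n-2]; the function name is illustrative.

```python
import cmath
import math


def ffe_gain_db(taps, freq_hz, baud_hz):
    """Magnitude response in dB of a baud-spaced FIR filter at freq_hz."""
    response = sum(c * cmath.exp(-2j * math.pi * freq_hz * k / baud_hz)
                   for k, c in enumerate(taps))
    return 20.0 * math.log10(abs(response))


taps = [0.0, 1.0, -0.25]                       # C0, C1, C2
low_freq_db = ffe_gain_db(taps, 0.0, 10e9)     # gain at DC
nyquist_db = ffe_gain_db(taps, 5e9, 10e9)      # gain at Fs/2 = 5 GHz
boost_db = nyquist_db - low_freq_db
print(f"high-frequency boost: {boost_db:.2f} dB")
```

At DC the taps add to 0.75, while at 5 GHz adjacent taps alternate in sign and add to 1.25, so the ratio 1.25/0.75 gives a boost of a little over 4 dB.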

Because this FFE runs at the sampling clock frequency Fs, it can only shape the response up to Fs/2 (here, Fs/2 = 5 GHz). According to the sampling theorem, the information in the serial data lies within 5 GHz, so this is sufficient here. Compensating frequencies above Fs/2 requires an FFE clocked faster than Fs or a continuous-time filter (continuous-time FFE).

The time-domain effect of the FFE is shown in the diagram below, using a 10 Gbps line rate (UI = 0.1 ns = 100 ps) as an example. The binary serial bit stream shown is [00000000100001111011110000].

10.
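The same bit stream can be passed through a software model of the 3-tap filter to see the emphasis in the time domain. This sketch assumes the taps C0=0, C1=1.0, C2=-0.25 used earlier, maps bits to ideal +1/-1 levels, and includes no channel model, so it shows only the transmitter output shape.

```python
def ffe_filter(bits, taps):
    """Apply a baud-spaced FIR to an NRZ stream mapped to +1/-1 levels."""
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, c in enumerate(taps):
            idx = max(n - k, 0)    # hold the first symbol before the stream starts
            acc += c * symbols[idx]
        out.append(acc)
    return out


bits = [int(b) for b in "00000000100001111011110000"]
out = ffe_filter(bits, [0.0, 1.0, -0.25])
print([round(v, 2) for v in out])
```

The first bit after every transition comes out at full swing (magnitude 1.25), while repeated bits are reduced to 0.75, which is the de-emphasis behavior described above.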

3.3 Rx Equalizer

3.3.1 Linear Equalizer

The receive equalizer has the same goal as the transmit equalizer. Low-speed (<5 Gbps) SerDes commonly use a continuous-time linear equalizer, such as a peaking amplifier, which has higher gain for high-frequency components than for low-frequency components. The frequency-domain characteristics of a linear equalizer are shown in the diagram below. Vendors usually package the equalization settings into a few levels, such as High/Med/Low, which can be switched dynamically to suit different channel characteristics.

11.

Frequency Response of a Peaking Amplifier Based Rx Equalizer

3.3.2 DFE(Decision Feedback Equalizer)

In high-speed (>5 Gbps) SerDes, signal jitter (such as ISI-related deterministic jitter) may approach or exceed one symbol interval (UI, Unit Interval), and a linear equalizer alone is no longer adequate. A linear equalizer amplifies the noise along with the signal, so it does not improve SNR or BER. High-speed SerDes therefore use a nonlinear equalizer, the DFE (Decision Feedback Equalizer). The DFE sets the sampling threshold of the current bit based on the history bits of several preceding UIs. It boosts only the signal and not the noise, which improves SNR significantly.

Supplement:

Unit Interval: the standard unit for expressing jitter amplitude in communication signal jitter tests. It is the nominal time difference between two adjacent significant instants of an isochronous signal.

An example 5th-order DFE is shown in the diagram below. The comparator (slicer) decides whether each incoming bit is 0 or 1, a filter then estimates the inter-symbol interference (ISI) that the decided bits cause, and this estimate is subtracted from the incoming signal to leave a clean signal. To keep the DFE circuit within its linear range, the serial signal first passes through a VGA (variable gain amplifier), which automatically controls the signal amplitude entering the DFE.

 12.

SERDES DFE Equalizer Structure, with Linear Equalizer&Eye-Test

To understand how the DFE works, first look at the pulse response of a 10 Gbps backplane. The backplane model, provided with Matlab, is based on actual measurements and has typical characteristics.

13.

Pulse Response of a 10G Backplane

Each horizontal grid division in the diagram above represents one UI. After traveling through the backplane, a pulse one UI wide (0.1 ns = 1/10 GHz) spreads into several neighboring UIs before and after it, interfering with the data in those UIs. Interference that falls after the sampling point is called post-cursor interference, and interference that falls before it is called pre-cursor interference. The first DFE coefficient h1 (0.175 in this example) corrects the first post-cursor, and the second coefficient h2 (0.075 in this example) corrects the second post-cursor. The higher the DFE order, the more post-cursors can be corrected.

14.

"ISI" of Bitstream "11011" for a 10G Backplane

Now send the bit stream 11011 through the backplane described above. Without equalization, the '0' cannot be recognized because of post-cursor and pre-cursor leakage, as the figure above shows. With a second-order DFE, the h2 contribution of the first '1' bit and the h1 contribution of the second '1' bit are subtracted from the amplitude at the '0' bit: 0.35 - 0.075 - 0.175 = 0.1, which is low enough to be recognized as 0.
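The subtraction in this example is the core of a DFE and can be sketched in a few lines. The sample value 0.35 and the coefficients h1 = 0.175 and h2 = 0.075 come from the text; the slicer threshold of 0.3 is a hypothetical value chosen only so that the uncorrected sample would be misread as 1.

```python
def dfe_equalize(samples, taps, threshold, initial_bits):
    """Decision feedback equalizer sketch.

    taps[0] is h1 (first post-cursor), taps[1] is h2, and so on.
    initial_bits holds the previous decisions, most recent first.
    Returns the corrected samples and the sliced bit decisions.
    """
    history = list(initial_bits)
    corrected, decisions = [], []
    for x in samples:
        isi = sum(h * b for h, b in zip(taps, history))   # predicted post-cursor ISI
        y = x - isi
        bit = 1 if y > threshold else 0
        corrected.append(y)
        decisions.append(bit)
        history = [bit] + history[:-1]                    # shift in the new decision
    return corrected, decisions


# The '0' bit of "11011": raw amplitude 0.35, preceded by two '1' bits.
corrected, decisions = dfe_equalize(
    samples=[0.35], taps=[0.175, 0.075], threshold=0.3, initial_bits=[1, 1])
print(round(corrected[0], 3), decisions)
```

Because the feedback uses already-decided bits rather than the noisy analog signal, the correction removes ISI without amplifying noise, at the cost of possible error propagation when a decision is wrong.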

As shown, the DFE estimates the post-cursor interference of previous bits and subtracts it from the current bit to obtain a clean signal. Because the DFE can only correct post-cursor ISI, it is usually preceded by a linear equalizer (LE). The result is close to ideal as long as the DFE coefficients match the channel's pulse response. However, the channel is time-varying: slow changes in temperature, voltage, and process alter its characteristics. The DFE coefficients therefore need an adaptive algorithm that acquires and tracks channel changes automatically.

DFE coefficient adaptation is a highly specialized topic, and each manufacturer's algorithm is proprietary and not published. For NRZ codes, the usual criterion is sign-error driven. The sign error is the difference between the amplitude of the equalized signal and its expected value, and the algorithm adjusts h1/h2/h3... to minimize the mean square of the sign error. Because the sign error and the sampling position are linked and influence each other, both the sign error and the eye width can be used as criteria for adapting the DFE coefficients.

For this reason, a SerDes with a DFE often includes an eye diagram test circuit, as shown in the figure of the SerDes DFE equalizer structure. The circuit shifts the signal threshold in the vertical direction and the sampling position in the horizontal direction, measures the bit error rate (BER) at each offset, and so produces an "eye diagram" that relates each offset position to the BER. See the figure below.

15.

SERDES Embedded Eye-Diagram Test Function

3.4 CDR

The CDR's job is to find the best sampling time, and to do so it needs enough data transitions. One CDR specification is the longest run of consecutive 0s or 1s it can tolerate (Max Run Length, or Consecutive Identical Digits). If the data has no transitions for a long time, the CDR cannot be trained, its sampling time drifts, and it may sample more or fewer 1s or 0s than were actually sent. When transitions resume, sampling errors may also occur. Some CDRs, for example, are implemented with a PLL, whose output frequency wanders if the data stops toggling for a long time. In practice, the data sent through SerDes is scrambled or encoded to keep the Max Run Length within a reasonable range.

a) Using the 8B/10B encoding approach, the Max Run Length is limited to 5 UIs.

b) Using the 64B/66B encoding method, the Max Run Length will not exceed 66 UIs.

c) The SONET/SDH scrambling mechanism can guarantee a maximum run length of 80 UI (at a BER of 10^-12).
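Max Run Length is easy to measure on a captured bit stream. The helper below is a minimal sketch with an illustrative name; the test pattern is the K28.5 code group in transmission order, which contains the longest run that 8B/10B allows.

```python
def max_run_length(bits):
    """Return the longest run of consecutive identical bits in the stream."""
    longest = 0
    current = 0
    previous = None
    for b in bits:
        current = current + 1 if b == previous else 1
        previous = b
        longest = max(longest, current)
    return longest


k28_5 = [0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
print(max_run_length(k28_5))
```

Running the same function over unscrambled payload data (for example a long block of zeros) shows why encoding or scrambling is needed before the data reaches the CDR.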

Most SerDes protocols use continuous mode on point-to-point connections, which means the data flow on the line is uninterrupted. Point-to-multipoint connections, such as PON, commonly use burst mode, which places tight requirements on the SerDes lock time.

Continuous-mode protocols such as SONET/SDH have strict requirements on the CDR's jitter transfer performance (because of loop timing) and must tolerate long runs of consecutive 0s.

If the receiver (Rx) and transmitter (Tx) run asynchronously, or in spread-spectrum clocking (SSC) applications, the CDR needs a wider phase tracking range to follow the Rx/Tx frequency difference.

Many CDR architectures exist to meet the requirements of different applications. FPGA SerDes frequently use CDRs based on a digital PLL or on a phase interpolator. Compared with an analog charge pump plus analog filter, both use digital filters in the loop, which saves area.

16.

Phase Rotators Based CDR

The figure above shows a phase-interpolator-based CDR. The phase detector array compares the incoming serial data with M clocks at equally spaced phases (possibly spanning several UIs) to produce phase error signals. The phase error signal has a very high rate and a wide bit width, so the decimator reduces its rate and smooths it before passing it to the digital filter. The performance of the digital filter determines the bandwidth, stability, and response speed of the loop. The phase rotators use the error signal smoothed by the digital filter to adjust the clock phase. Once the loop is locked, the phase error is theoretically zero, and the 90-degree offset clock serves as the recovered clock for sampling the serial input.

17.

Digital PLL Based CDR

The figure above shows a DPLL-based CDR with two loops. The phase tracking loop works on the same principle as the phase-rotator-based CDR. The phase detector array compares the incoming serial data with M clocks at equally spaced phases (possibly spanning several UIs) to obtain a phase error signal, which is passed to the digital filter. The performance of the digital filter determines the bandwidth, stability, and response speed of the loop. The error signal smoothed by the digital filter drives the VCO to correct the clock phase. Once the loop is locked, the phase error is theoretically zero, and the 90-degree offset clock serves as the recovered clock for sampling the serial input.

The DPLL-based CDR also contains a frequency tracking loop, which shortens the CDR's lock time and eases the design constraints on the loop filter. The CDR switches to the data phase tracking loop only after the frequency tracking loop has locked, and it switches back to the frequency tracking loop automatically when the phase tracking loop loses lock. Because N times the reference clock frequency is approximately equal to the line rate, the steady-state VCO control voltages of the two loops are nearly the same, so the frequency tracking loop shortens the acquisition time of the phase tracking loop.

Once the phase tracking loop is locked, the frequency tracking loop no longer affects it. As a result, the SerDes receive side does not place strict requirements on the jitter of the reference clock.

In the phase-interpolator-based CDR, the clock may come from a PLL shared by the transmitter and receiver or from a separate PLL for each channel. In this structure, reference clock jitter directly affects the jitter of the recovered clock and the received bit error rate.

3.4.1 PD

The phase detector measures the phase error and expresses it as UP or DN signals, whose duration is related to the size of the phase error. The figure below shows an example of a bang-bang phase detector. For simplicity, only a four-phase recovered clock is shown.

18.

Bang-Bang Phase Detector
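The decision rule of a bang-bang detector can be sketched with three samples per bit boundary: the previous data sample, the edge sample taken between two data samples, and the current data sample. This is the common Alexander-style rule and may differ in detail from the circuit in the figure; the +1/-1 polarity (UP meaning the clock is late) is an assumption.

```python
def bang_bang_pd(prev_data, edge, curr_data):
    """Alexander-style early/late decision from three binary samples.

    Returns +1 (UP: clock is late, advance it), -1 (DN: clock is early,
    delay it), or 0 when there is no data transition and so no information.
    """
    if prev_data == curr_data:
        return 0                  # no transition between the two data samples
    if edge == prev_data:
        return -1                 # edge sample taken before the transition: early
    return +1                     # edge sample already shows the new bit: late


print(bang_bang_pd(0, 0, 1))   # early
print(bang_bang_pd(0, 1, 1))   # late
print(bang_bang_pd(1, 1, 1))   # no transition
```

The output carries only the sign of the phase error, not its size, and it is zero whenever the data does not toggle, which is one reason the CDR depends on a bounded run length.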

3.4.2 Decimator and Filter

The decimator lowers the rate at which the filter has to operate. The decimation step size and the smoothing method both influence loop performance. The digital filter consists of a proportional branch and an integral branch, which track the phase error and the frequency error, respectively. The filter's processing latency must not be too long; otherwise the loop cannot follow rapid changes in phase and frequency, and bit errors result.
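A behavioral sketch of the decimator and the proportional-integral filter is shown below. The block-sum decimation, the class name, and the gain values kp and ki are illustrative assumptions, not taken from any particular SerDes.

```python
def decimate(phase_errors, factor):
    """Sum each block of `factor` early/late votes into one lower-rate sample."""
    return [sum(phase_errors[i:i + factor])
            for i in range(0, len(phase_errors) - factor + 1, factor)]


class ProportionalIntegralFilter:
    """Digital loop filter: the proportional path follows phase error,
    the integral path accumulates to follow a frequency offset."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def update(self, phase_error):
        self.integral += self.ki * phase_error
        return self.kp * phase_error + self.integral


votes = [1, 1, -1, 1, 1, 1, 1, -1]          # raw +1/-1 phase detector outputs
slow = decimate(votes, 4)
loop_filter = ProportionalIntegralFilter(kp=0.5, ki=0.1)
control = [loop_filter.update(e) for e in slow]
print(slow, [round(c, 2) for c in control])
```

With a persistent error of the same sign, the integral term keeps growing, which is how the loop absorbs a constant frequency difference between the Tx and Rx clocks.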

CDR structures are not limited to the two examples above; there are several others. In essence, each is a phase-locked loop. Analysis of loop tracking performance, stability, bandwidth, and gain is a highly specialized subject, and many books and papers describe how to quantify loop performance with a small-signal linear model. Some characteristics of the CDR loop are described below.

3.4.3 Loop Bandwidth

1. Phase jitter at frequencies below the loop bandwidth passes through the CDR to the recovered clock. In other words, the CDR can track jitter below the loop bandwidth without bit errors. Jitter components above the loop bandwidth may cause bit errors, depending on their amplitude.

2. The higher the loop bandwidth, the shorter the lock time, but the more jitter the recovered clock carries. Conversely, the lower the jitter of the recovered clock, the longer the lock time. For a CDR, a higher loop bandwidth is desirable because it increases jitter tolerance, but loop-timed applications such as SONET/SDH limit the jitter allowed on the recovered clock, so the bandwidth cannot be too high.

3. The switching frequency of a switching power supply is typically below the loop bandwidth, so the CDR can track the jitter it causes. However, the loop cannot track supply noise that couples into the digital to multi-phase converter, and a low-cost ring VCO is especially sensitive to power supply noise. In addition, harmonics of the switching supply may exceed the loop bandwidth.

Some protocols, such as SDH/SONET, specify CDR gain templates. To comply with these protocols, the input and output jitter budgets must be calculated.

3.5  PLL

SerDes requires an internal clock that runs at the data baud rate, or at 1/2 the data baud rate when it works in DDR mode. The reference clock supplied from off-chip has a much lower frequency and is multiplied by the PLL to produce the internal high-frequency clock. To support the commonly used SerDes protocols, FPGA SerDes PLLs typically offer 8x, 16x, 10x, 20x, and 40x modes. For PCI-Express, for example, the off-chip reference clock must be 125 MHz in 40x mode and 250 MHz in 20x mode.

The following diagram depicts a third-order PLL circuit. The phase detector compares the phase of the input signal with the phase of the VCO feedback signal. The charge pump converts the phase error into a voltage or current signal, and the loop filter smooths it into the control voltage that corrects the VCO phase, so that the phase error finally tends to zero.

19.

A 3-Order Type II PLL

The operation of a PLL has two stages: acquiring lock and tracking. While the loop is acquiring lock, it is modeled by a nonlinear differential equation, from which the capture time, capture bandwidth, and similar figures can be evaluated. After lock, within the small-signal range, the PLL is modeled by a linear equation with constant coefficients, and its bandwidth, gain, stability, and other characteristics can be studied in the Laplace transform domain. The small-signal mathematical model is shown in the diagram below.

20.

PLL Small-Signal Model by Laplace Transform

The order of the loop is the number of poles of the transfer function (the roots of the denominator). Because the VCO integrates phase (Kvco/s), a loop without a filter is a first-order loop, and a loop with a first-order filter is a second-order loop. First-order and second-order loops are unconditionally stable. Higher-order loops have more poles and zeros, which allow bandwidth, gain, stability, capture band, capture time, and other properties to be adjusted separately.

The loop filter F(s)|s=jw determines the frequency-domain transfer characteristics of the PLL. Two significant properties of a typical PLL transfer curve are loop bandwidth and jitter peaking. Excessive peaking amplifies jitter. A large damping factor reduces peaking but increases the loop lock time and changes the roll-off rate and natural frequency.

a) The steady-state phase error is fixed while the loop is locked: θe = Δω/Kdc, where Kdc is the loop's DC open-loop gain and Δω is the difference between the VCO's center frequency and the locked frequency. For the charge pump + passive filter configuration, the phase error of the PLL is zero.

b) When the loop is locked, the phase difference between the two signals at the phase detector inputs is fixed, and the two signals have the same frequency:

fo/N = fr/M
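As a quick arithmetic check of this relation, the sketch below solves fo/N = fr/M for fo. It assumes that the "40x" and "20x" PLL modes mentioned earlier correspond to N = 40 and N = 20 with M = 1; that mapping and the function name are assumptions made for illustration.

```python
from fractions import Fraction

def vco_frequency_mhz(ref_mhz, n, m=1):
    """Locked-loop condition fo/N = fr/M, solved for fo (all values in MHz)."""
    return Fraction(ref_mhz) * n / m

if __name__ == "__main__":
    # Both PCI-Express reference clock options land on the same PLL output.
    print(vco_frequency_mhz(125, 40))  # 125 MHz reference, 40x mode
    print(vco_frequency_mhz(250, 20))  # 250 MHz reference, 20x mode
```

Both settings give 5000 MHz, which is why either reference clock can serve the same line rate.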

The loop acts as a low-pass filter for noise at the input, suppressing noise or interference above the loop's cut-off frequency. In a SerDes PLL, a smaller bandwidth is preferable because it reduces the interference and noise passed on from the reference clock.

The loop works as a high-pass filter for VCO noise: only VCO noise below the loop's cut-off frequency is suppressed. Excessive high-frequency VCO noise therefore worsens the clock jitter. For cost reasons, low-speed SerDes (5Gbps) use ring VCOs, which are noisy and sensitive to power supply noise. High-speed SerDes use the low-noise LC VCO.

 

Ⅳ. Jitter and Signal Integrity

Jitter is the deviation of a signal's edge transitions from their ideal or intended timing. Jitter is caused by noise, non-ideal channels, and non-ideal circuits.

4.1 Clock Jitter

21.

Clock Jitter

The definition of jitter for clock signals varies with the application. Digital logic, for example, considers period jitter when calculating timing margin. Clock designers favor phase jitter, because it can be analyzed with the spectrum, and the spectrum shows how much a specific interference source contributes to the total phase jitter.

The diagram above illustrates the different definitions of jitter.

Phase jitter

Jphase(n) = tn - n*T. The ideal clock has no jitter and every one of its periods equals T. Phase jitter is the difference between the real clock's edge and the ideal clock's edge.

Period jitter

Jperiod(n) = (tn - tn-1) - T. Period jitter is the difference between the real clock's period and the ideal clock's period. It follows that Jperiod(n) = Jphase(n) - Jphase(n-1).

Cycle-to-cycle jitter

Jcycle(n) = (tn - tn-1) - (tn-1 - tn-2). Cycle-to-cycle jitter is the difference between two consecutive periods. It follows that Jcycle(n) = Jperiod(n) - Jperiod(n-1).

Assume that the phase jitter has a peak value of +/-Jp and that the jitter frequency is fjitter = 0.5 fclock = 0.5/T.

The phase jitter is at its maximum, +Jp, at time tn-2 and at its minimum, -Jp, at time tn-1.

It is again at its maximum, +Jp, at time tn and at its minimum, -Jp, at time tn+1.

The maximum period jitter is then Jperiod = +/- 2*Jp.

The maximum cycle-to-cycle jitter is then Jcycle = +/- 4*Jp.
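The three definitions and the worst case above can be checked numerically. The sketch below applies the formulas to a list of edge timestamps; the period of 1000 ps and the peak jitter of 10 ps are hypothetical values chosen only to keep the arithmetic exact.

```python
def phase_jitter(edges, period):
    """Jphase(n) = tn - n*T for each edge time tn."""
    return [t - n * period for n, t in enumerate(edges)]

def period_jitter(edges, period):
    """Jperiod(n) = (tn - tn-1) - T."""
    return [(edges[n] - edges[n - 1]) - period for n in range(1, len(edges))]

def cycle_jitter(edges):
    """Jcycle(n) = (tn - tn-1) - (tn-1 - tn-2)."""
    periods = [edges[n] - edges[n - 1] for n in range(1, len(edges))]
    return [periods[i] - periods[i - 1] for i in range(1, len(periods))]

# Hypothetical clock: T = 1000 ps, phase jitter alternating +Jp / -Jp
# on successive edges, i.e. a jitter frequency of half the clock frequency.
T, JP = 1000, 10
EDGES = [n * T + (JP if n % 2 == 0 else -JP) for n in range(8)]

if __name__ == "__main__":
    print("phase :", phase_jitter(EDGES, T))
    print("period:", period_jitter(EDGES, T))
    print("cycle :", cycle_jitter(EDGES))
```

With these inputs the period jitter swings between -2*Jp and +2*Jp and the cycle-to-cycle jitter between -4*Jp and +4*Jp, matching the worst case derived above.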

4.2. Data Jitter

Jitter is a central topic in the high-speed SerDes field because it is directly tied to the bit error rate (BER).

Jitter generation, the jitter produced by the SerDes transmitter for a given pattern, rate, and load, is a key specification of the SerDes transmitter.

The jitter is amplified further as the signal travels through the channel to the receiving end. The channel generates deterministic jitter that depends on the data pattern: different patterns contain different frequency components, and the channel's transmission delay differs for different frequency components (non-linear phase). Reflections at impedance discontinuities, crosstalk from nearby signals, and noise also create data jitter.

Jitter tolerance, the amount of jitter that the SerDes receiver can accept for a specified pattern and bit error rate requirement (BER = 10^-12), is a key specification of the SerDes receiver. Graphical tools such as the eye diagram, the bathtub curve, the jitter distribution histogram (PDF), and the jitter spectrum are used to evaluate jitter.

One point needs to be clarified. Low-frequency jitter is not included when discussing high-speed SerDes data jitter (Tj, Rj, Dj, etc.). Low-frequency jitter is regarded as wander, which the CDR can track without producing bit errors. When measuring data jitter with an oscilloscope (SDA), you can select the CDR loop bandwidth built into the oscilloscope, so that the measured jitter already has the low-frequency jitter filtered out.

Jitter is usually classified into several groups based on its cause and its probability density function. The classification matters because some types of jitter can be compensated while others cannot. Total jitter Tj is traditionally divided into deterministic jitter Dj and random jitter Rj. Jitter is measured in UI or ps and can be expressed as an RMS value or a peak-to-peak value.

4.2.1 Dj

Dj is further subdivided:

DCD (Duty cycle distortion) is jitter caused by duty cycle distortion. The duty cycle is distorted if the bias voltages of the positive and negative ends of the differential signal are inconsistent, or if the rise time and fall time are inconsistent. Because DCD is tied to the data pattern, it is a jitter that can be corrected.

DDJ (Data dependent jitter) is jitter related to the data pattern, also known as intersymbol interference (ISI). A non-ideal channel is the source of DDJ. The equalizer can compensate for this jitter.

Pj (Periodic jitter) is jitter that repeats periodically. Periodic interference sources on the circuit cause Pj, for instance the switching frequency of a switching power supply or crosstalk from clock signals. Although the power supply's switching frequency is usually within the CDR's tracking range, its low-order harmonics may fall beyond the loop bandwidth or in the jitter peaking region. Power supply harmonics that disturb the VCO inside the CDR, meanwhile, can be neither suppressed nor tracked, so an LDO power supply should be used wherever feasible for a CDR based on a ring VCO. The equalizer cannot correct Pj.

BUJ (Bounded uncorrelated jitter) is caused by non-clock interference sources. If the aggressor and the victim are asynchronous, the probability distribution of the jitter is a bounded Gaussian distribution, in which case it is also known as CBGJ (Correlated Bounded Gaussian Jitter). BUJ/CBGJ cannot be corrected.

4.2.2 Rj

Rj is created by the semiconductor's own noise. Its key properties are that its probability density function is Gaussian and unbounded, and that it is independent of the data pattern. It can only be treated as bounded for a specified bit error rate.

4.2.3 Tj

In mathematical terms, the probability distribution function of jitter can be thought of as the convolution of a Gaussian distribution and a dual-Dirac distribution.

The following are the jitters that contribute to the Gaussian distribution:

Rj represents a Gaussian distribution.

The superposition of a large number of Pj components approximates a Gaussian distribution.

Part of BUJ is also Gaussian.

The jitter that contributes to the dual-Dirac distribution is:

DCD

The probability distribution obtained by convolving the Gaussian distribution with the dual-Dirac distribution is:

22.

Here, W is the peak-to-peak value of the deterministic jitter and σ is the standard deviation (RMS value) of the Gaussian distribution. As the deterministic jitter W increases, a double peak forms at the top of the probability density curve, as seen in the picture below. In general, the top of the curve reflects the amount of deterministic jitter.

23.

PDF of Tj With Different Dj and Rj
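The shape of this distribution can be sketched numerically. The code below assumes the usual dual-Dirac form, in which the two impulses sit at -W/2 and +W/2 with equal weight, so the convolution is the average of two Gaussians of RMS σ centered at those points. The function names are illustrative.

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def dual_dirac_pdf(x, w, sigma):
    """Gaussian of RMS sigma convolved with equal impulses at -w/2 and +w/2 (assumed model)."""
    return 0.5 * (gaussian(x, -w / 2, sigma) + gaussian(x, w / 2, sigma))

def has_double_peak(w, sigma, steps=2000):
    """True when the PDF is higher somewhere in (0, w/2] than at the center x = 0."""
    center = dual_dirac_pdf(0.0, w, sigma)
    peak = max(dual_dirac_pdf(i * (w / 2) / steps, w, sigma) for i in range(steps + 1))
    return peak > center * (1 + 1e-9)

if __name__ == "__main__":
    # Jitter values in UI; both cases use sigma = 0.05 UI.
    print(has_double_peak(0.05, 0.05))  # small W: single peak
    print(has_double_peak(0.20, 0.05))  # large W: double peak
```

Under this model the double peak appears only once W is large compared with σ, which matches the behavior described for the figure.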

The jitter bathtub curve is obtained by plotting the probability distribution functions of the two transition edges of one UI (at 0 UI and 1 UI) in the same graph. The Y axis is logarithmic because of the wide dynamic range. The bathtub curve for deterministic jitter W = 0.05UI and Gaussian jitter with an RMS value of 0.05UI is shown below.

24.

Bathtub Curve of Tj with 0.05 Dj peak and 0.05 Rj RMS
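One side of such a bathtub curve can be approximated with a short calculation. The sketch assumes the dual-Dirac model (equal impulses at -W/2 and +W/2 convolved with a Gaussian of RMS σ) and takes the BER at a sampling offset x as the probability that an edge nominally at 0 UI lands later than x. Transition density and the edge at 1 UI are ignored, so this is an estimate, not a reproduction of the plotting program.

```python
import math

def gaussian_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def edge_ber(x, w, sigma):
    """Probability that an edge nominally at 0 UI lands later than x (assumed dual-Dirac model)."""
    return 0.5 * (gaussian_tail((x - w / 2) / sigma) + gaussian_tail((x + w / 2) / sigma))

def eye_closure(ber, w, sigma):
    """Offset x (in UI) at which edge_ber falls to the target BER, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if edge_ber(mid, w, sigma) > ber:
            lo = mid
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    x = eye_closure(1e-12, w=0.05, sigma=0.05)
    print(f"closure per edge = {x:.3f} UI, Tj(p-p) = {2 * x:.3f} UI")
```

For W = 0.05UI and σ = 0.05UI this gives a closure close to the 0.373 UI per edge read from the figure.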

The bathtub curve also marks the BER coordinate for a given bit error rate. For example, the peak-to-peak jitter at BER = 10^-12 in the figure is Tj(p-p) = 0.373*2 = 0.746 UI. The bit error rate is the ratio of the area under the curve to the total area. For instance, in the diagram,

25.

The top of the bathtub curve is determined mainly by the deterministic jitter Dj. The closer to the bottom, the larger the contribution of the Gaussian jitter, and the curve falls off with the slope of the Gaussian tail. As a result, the properties of the Gaussian distribution are frequently used for estimation. The table below shows the relationship between the Gaussian distribution and its standard deviation.

26.

 

27.

This table can be used to quickly relate the RMS value to the peak-to-peak value at a given BER. For example, if the RMS value of the Gaussian jitter is 0.05UI and the required bit error rate is 10^-12, the table gives Q = 7, so the peak-to-peak value of the Gaussian jitter is 0.05UI*7*2 = 0.7UI.

As stated above, W = 0.05UI and Rj = 0.05UI give Tj = 0.746UI in the bathtub curve.

The Gaussian property gives an estimate of 0.7UI for the Gaussian jitter.

Tj = Rj(0.7UI) + Dj(0.05UI) yields 0.75UI, which is essentially the same. The difference is due to the quantization error of the drawing program.
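The same estimate can be written as a few lines of code. The sketch assumes the common convention BER = 0.5*erfc(Q/sqrt(2)) for relating Q to the Gaussian tail; the table in this article may use a slightly different convention, and the function names are illustrative.

```python
import math

def ber_from_q(q):
    """Gaussian tail probability beyond q standard deviations (assumed convention)."""
    return 0.5 * math.erfc(q / math.sqrt(2))

def q_from_ber(ber):
    """Invert ber_from_q by bisection; valid for 0 < ber < 0.5."""
    lo, hi = 0.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if ber_from_q(mid) > ber:
            lo = mid
        else:
            hi = mid
    return hi

def total_jitter_estimate(dj_pp, rj_rms, q):
    """Tj(p-p) = Dj(p-p) + 2 * Q * Rj(rms), all in UI."""
    return dj_pp + 2 * q * rj_rms

if __name__ == "__main__":
    q = q_from_ber(1e-12)
    print(f"Q at BER 1e-12 = {q:.2f}")
    print(f"Tj with Q = 7  = {total_jitter_estimate(0.05, 0.05, 7):.3f} UI")
```

With Q rounded to 7 the estimate is 0.75UI, as in the text. The unrounded Q for BER = 10^-12 under this convention is slightly above 7.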

 

Ⅴ. Signal Integrity (SI) and Simulation

5.1 Channel

The frequency range of interest for a SerDes channel runs from 0 Hz up to twice the signal's fundamental frequency. The fundamental frequency is half of the line rate (this is what is usually called the Nyquist frequency), so the range of interest extends up to the line rate. Insertion loss, reflection, crosstalk, and other signal degradation effects are all caused by the channel. An S-parameter channel model can express these impairments, and a vector network analyzer (VNA) can measure the S-parameters. The channel is not just a resistive network; it also has capacitive and inductive components. As a result, the delay differs for components at different frequencies, which produces data-pattern-dependent jitter.

Reflections occur at every impedance discontinuity on the channel. Depending on the position of the discontinuity, the reflected signal is superimposed on the original signal with a different phase, increasing or decreasing the signal amplitude.

The SerDes signal is differential, which suppresses common-mode interference effectively. Crosstalk is introduced if the interference differs between the + and - ends. On the external PCB, the SerDes data and the interference source are usually kept at a sufficient distance. Inside the chip, however, cost considerations make it difficult to guarantee sufficient isolation between the SerDes signal and the interference source, especially when a channel's own transmit signal interferes with its own receive signal.

5.2 Package

The channel also includes the package. A VNA can measure the channel outside the chip, and the chip maker normally provides the package S-parameters, which can be cascaded during simulation. Because the path through the package is short, insertion loss is rarely a concern; the primary concern is impedance matching.

5.3 Signal Integrity (SI) Simulation

Signal integrity (SI) simulation builds a simulation platform by cascading the SerDes transmitter SPICE model, the package and channel S-parameter models, and the receiver SPICE model, and then uses simulation tools to run circuit simulations under various stimuli and test conditions. The eye diagram at the SerDes receiving end is measured to see whether it satisfies the design requirements. It can also be checked against the receiver eye mask specified in the protocol.

This conventional circuit simulation method can no longer meet the design requirements of high-speed SerDes (>5Gbps). First, severe inter-symbol interference (ISI) can close the eye diagram at the receiving end completely, yet the eye can be quite good after equalization by the chip's DFE. Second, circuit simulation (SPICE) is extremely slow. Even if DFE equalization could be added to the simulation, the run time would be unacceptable, since DFE simulation needs a sufficiently long bit sequence for training.

Statistical analysis methods are required for high-speed SerDes simulation. The statistical analysis technique treats the transmitter-channel-receiver link as a linear system, calculates the system impulse response h(t), adds a noise source to model jitter, and finally convolves the impulse response with the stimulus to obtain the signal at the receiver. The manufacturer's proprietary FFE and DFE adaptive algorithms can be included in the simulation in this way.
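The linear-system step can be illustrated with a toy example. The sketch below convolves +/-1 symbols with a UI-spaced pulse response and derives the worst-case eye height from the ISI taps (peak distortion analysis), then confirms it by enumerating bit patterns. The pulse response values are hypothetical, and the sketch omits the noise source, jitter, and any FFE or DFE.

```python
from itertools import product

# Hypothetical UI-spaced pulse response: one pre-cursor, main cursor, two post-cursors.
PULSE = [0.1, 0.6, 0.25, 0.1]
CURSOR = 1  # index of the main cursor in PULSE

def received_samples(bits, pulse):
    """Linear-system output y[n] = sum_k pulse[k] * x[n-k], with x = +1/-1 symbols."""
    symbols = [1 if b else -1 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, h in enumerate(pulse):
            if n - k >= 0:
                acc += h * symbols[n - k]
        out.append(acc)
    return out

def worst_case_eye_height(pulse, cursor):
    """Peak distortion analysis: every ISI tap works against the main cursor."""
    isi = sum(abs(h) for i, h in enumerate(pulse) if i != cursor)
    return 2 * (pulse[cursor] - isi)

def brute_force_eye_height(pulse, cursor):
    """Enumerate all bit patterns spanning the pulse response and measure the eye."""
    n = len(pulse)
    ones, zeros = [], []
    for bits in product([0, 1], repeat=n):
        y = received_samples(bits, pulse)[n - 1]  # sample where all taps overlap
        (ones if bits[n - 1 - cursor] else zeros).append(y)
    return min(ones) - max(zeros)

if __name__ == "__main__":
    print(worst_case_eye_height(PULSE, CURSOR))
    print(brute_force_eye_height(PULSE, CURSOR))
```

Because the link is treated as linear, the worst-case eye follows directly from the pulse response without simulating a long bit stream, which is where the speed advantage over SPICE comes from.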

Because statistical analysis methods cannot model nonlinear and time-varying circuit characteristics, SI simulation of high-speed SerDes frequently combines the two approaches. Please see for more information on statistical analysis approaches.

