#### Moscow-Bavarian Joint Advanced Student School (MB-JASS)

# Characterizing Multistage Nonlinear Drivers and Variability for Accurate Timing and Noise Analysis

# Clemens Satzger

Abstract: Because conventional gate delay models are difficult to adapt to nanoscale designs, this document presents a parameterizable waveform independent gate model (PWiM). The model also works for multistage cells while still providing a compact description and an efficient simulation time.

*Index Terms*: Gate delay modeling, timing and noise analysis

## I. INTRODUCTION

## A. Causes of Variation in Circuit Timing

Circuit timing varies because of several influencing factors. These factors are due to the production process and the physical layout of the FPGA. Because some of them can be reduced during production, it is necessary to understand their effects. The list below shows the effects that cause timing variations.

- Imperfect CMOS manufacturing process
- Environmental factors such as drops in Vdd
- Substrate temperature changes
- Device fatigue phenomena
  - Electromigration, hot electron effects
  - Negative bias temperature instability

The first and most evident cause of variation is the imperfect CMOS manufacturing process. Because it is not possible to have exactly the same conditions for the production of every chip, some variation in production has to be taken into account.

The second source of timing variation arises while the chip is active. Because the chip is part of a larger system, the environment of that system can change. If, for example, a DC motor is started, the supply voltage $V_{dd}$ can vary as a result of the high startup current of the motor. The timing then varies because the slope of the rising edge becomes smaller.

Another environmental effect is a shift in temperature. A changing temperature affects the behavior of the CMOS transistors, which conduct better or worse at lower or higher temperatures.

The last point is a group of phenomena caused by the fatigue of the chip. As the FPGA ages, its material changes through electromigration, the gradual transport of metal atoms in the interconnect caused by the current flowing through it. Hot electron effects, which result from the high electric fields in typical VLSI MOSFETs, can severely degrade device characteristics; threshold voltage shifts and reduced current drive capability are typical. Negative bias temperature instability is the phenomenon in which the threshold voltage of a field-effect transistor shifts in the negative direction when the device is stressed under negative bias.

## B. Why is there an Increasing Deviation?

The larger deviation of the timing analysis is caused by three factors:

- Increasing circuit speed
- Crosstalk noise (smaller design process)
- Inductive coupling in nanoscale designs

We can therefore conclude that a model for a nanoscale process has to take these effects into account. Because the common models are not as adaptive as they would have to be, a new model for timing analysis is presented in this paper.

## C. Increasing Deviation

As circuit speed increases, non-negligible crosstalk occurs. At the same time, process technologies are getting smaller, which leads to high capacitive coupling. In addition, the impedance of the interconnect lines does not scale down by the same factor as the gate impedance. As a result of these changes, the common models are no longer applicable, because they do not sufficiently consider these effects. A new modeling concept is therefore necessary.

#### D. The two Types of Noise

The noise of a logic circuit can be separated into two different noise types [8], [5].

The functional noise is the noise induced in quiet nets, which are called victims, by their switching neighbors, the aggressors. This noise can cause unwanted logic activity and consequently logical errors.

The other type of noise is the delay noise [10], which is the noise caused by a switching aggressor on a switching victim. In this case complex signals can occur, which cannot be modeled as a ramp signal.

Figure 1. The Two Different Types of Noise

## E. Criteria of a Good Model

A good model has to fulfill the following criteria.

- Adequate coverage for wave shapes typically seen in circuits
- Concurrent usage of both old and new model
- Intuitive parameters
- Simple gate characterization  $\rightarrow$  no additional characterization necessary
- Minimal storage space for gate characterization
- Controllability of the complexity by the user

The most important of these criteria is adequate coverage of the wave shapes typically seen. Because the typical signals of nanoscale designs are not approximated well enough by ramps, most of the standard models are no longer applicable.

#### II. THE DIFFERENT MODELS

## A. Model Categories

Three model categories are in current use. The linear timing models are based on adapting the effective capacitance; this gives a system of nonlinear equations that can be solved by the simulator. The best-fit resistance models use much the same approach, but adapt the equivalent gate resistance instead of the effective capacitance. The third category is the large-signal driver current model, in which a current source models the gate. The current source is derived from DC measurements of the gate output current.

All three models have in common that they cannot model complex input signals. Section III presents a nonlinear current model that can also be applied in that case.

#### B. The Most Popular Approach

The most popular approach [2], [3] is a ramp model with a delay and a transition time. In this model the wire between the gates is modeled as a $\pi$-structure of two capacitances and a resistance. This $\pi$-structure is then transformed into an effective capacitance: the output waveform is analyzed to obtain the two timing values, and from them the value of the effective capacitance. The gate is modeled as a voltage source with a resistance in series. Together with the effective capacitance, these components give the delay and transition time of the model. This yields a good approximation for a typical ramp signal, which is why the model has become the most common approximation. In nanoscale designs, however, the input signal can deviate strongly from this typical ramp waveform, and the ramp model is then no longer a good approximation.



Figure 2. The most popular approach

#### C. The Variation-Aware Gate Timing Analysis

The variation-aware gate timing analysis [4] uses the same $\pi$-structure as the previous model, but adds a statistical analysis. Whereas the common practice is to fix a single value for delay and transition time, here the parameters of the RC $\pi$-structure can vary according to the canonical first-order (CFO) model of Equation 1.

$$A = a_{nom} + \sum_{i=1}^{m} a_i \cdot \Delta X_i + a_{m+1} \cdot \Delta S_a \tag{1}$$

In this way the imperfect CMOS manufacturing process can be considered through both the global sources of variation $(\Delta X_i)$ and the random sources of variation $(\Delta S_a)$. The average error of the approximation can thereby be reduced to only about 7%, while the runtime stays 145 times faster than a normal SPICE simulation. However, this model too can only use a ramp to approximate the signal.
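As an illustration of the canonical first-order form in Equation 1, the following minimal sketch evaluates a CFO quantity and its first two moments. It assumes that all variation sources are independent with zero mean and unit variance, which the text above does not state; the function names and all numbers are hypothetical.

```python
import math
import random

def cfo_value(a_nom, global_sens, random_sens, dx, ds):
    """Evaluate Equation 1 for one sample of the variation sources."""
    return a_nom + sum(a * x for a, x in zip(global_sens, dx)) + random_sens * ds

def cfo_mean_std(a_nom, global_sens, random_sens):
    """Mean and standard deviation of A, ASSUMING independent variation
    sources with zero mean and unit variance."""
    variance = sum(a * a for a in global_sens) + random_sens ** 2
    return a_nom, math.sqrt(variance)

# Hypothetical numbers: a 100 ps nominal delay with two global sources.
a_nom, global_sens, random_sens = 100.0, [3.0, 4.0], 12.0
mean, std = cfo_mean_std(a_nom, global_sens, random_sens)

# Monte Carlo cross-check of the closed-form mean.
rng = random.Random(0)
samples = [
    cfo_value(a_nom, global_sens, random_sens,
              [rng.gauss(0.0, 1.0) for _ in global_sens], rng.gauss(0.0, 1.0))
    for _ in range(20000)
]
mc_mean = sum(samples) / len(samples)
```

Under these assumptions the spread of the delay follows directly from the sensitivities, so no repeated circuit simulation is needed to estimate it.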

## D. Model Based on Finite Elements Method (FEM)

Another idea is a model that uses finite elements for the modeling process [5]. Here the transistor gate is characterized by a resistance and a voltage source using the two parameters $\omega$ and $u$. In this way reusable models can be created, and the delay can be modeled as well. The error should stay between 1% and 1.5% for a ramp signal. However, this model cannot model more complex signals either.

Figure 3. Finite Elements Method

## E. Equivalent Waveform Propagation

This idea is a kind of add-on for the previous models [6]. It combines static timing analysis with the gate model. The idea is to derive the equivalent input waveform that produces the matching output waveform, using the sensitivity of the output to the input (Equation 2).

$$\frac{\partial v_{out}}{\partial v_{in}} = \frac{\partial v_{out}}{\partial t} \cdot \frac{1}{\frac{\partial v_{in}}{\partial t}} \tag{2}$$

With this sensitivity as a weight, the following error term can be minimized:

$$\int_{t_1}^{t_2} \left| \frac{\partial v_{out}}{\partial v_{in}} \right| (f(t) - g(t))^2 dt \tag{3}$$

Here $f(t)$ is the equivalent waveform and $g(t)$ is the actual waveform.

In this way the model becomes more accurate, at an additional cost of at most 15-30%.
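The weighted error of Equation 3 can be approximated on sampled waveforms. The sketch below uses the trapezoidal rule and picks the better of two candidate ramps; the sample values, the candidate names, and the function name are hypothetical and serve only as an illustration.

```python
def weighted_error(times, f, g, sensitivity):
    """Trapezoidal approximation of Equation 3 on sampled waveforms.

    f: equivalent waveform, g: actual waveform,
    sensitivity: samples of d(v_out)/d(v_in) at the same time points.
    """
    integrand = [abs(s) * (fi - gi) ** 2 for s, fi, gi in zip(sensitivity, f, g)]
    total = 0.0
    for k in range(len(times) - 1):
        total += 0.5 * (integrand[k] + integrand[k + 1]) * (times[k + 1] - times[k])
    return total

# Illustrative samples: an actual waveform and two candidate ramps.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
g = [0.0, 0.1, 0.6, 0.9, 1.0]
sens = [0.1, 1.0, 2.0, 1.0, 0.1]
candidates = {
    "early": [0.0, 0.25, 0.5, 0.75, 1.0],
    "late": [0.0, 0.0, 0.5, 1.0, 1.0],
}
best = min(candidates, key=lambda name: weighted_error(times, candidates[name], g, sens))
```

The weight makes deviations count most where the gate output reacts most strongly to its input, so the chosen ramp matches the actual waveform in the region that determines the output.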

#### F. Current-Based Gate Models

A more recent idea is to use current sources instead of voltage sources for the gate [7]. The ramp is then constructed by a current source charging a capacitance, which makes the construction of the ramp much easier.

The error of the model is up to 4.6% for a model that is not pre-characterized.
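The ramp construction by a current source can be made explicit with a short worked example. It assumes an ideal constant current $I$ charging a constant capacitance $C$ from 0 V, which is a simplification of the model in [7]; the numbers are illustrative only.

$$v(t) = \frac{1}{C} \int_0^t I \, d\tau = \frac{I}{C} \, t, \qquad T = \frac{C \cdot V_{dd}}{I}$$

For example, with $I = 100\,\mu\text{A}$, $C = 10\,\text{fF}$ and $V_{dd} = 1\,\text{V}$, the output reaches $V_{dd}$ after $T = 100\,\text{ps}$.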

## G. Blade and Razor

The Blade and Razor model is a current source model [12]. Blade is a novel cell model and runtime engine based on current. The cell model contains a voltage-controlled current source, an internal capacitance, and a time shift of the output waveform. Razor is an interconnect model that uses a novel implementation of recursive convolution.

The combination of Blade and Razor gives an acceleration of tens of thousands of times compared to SPICE and operates on arbitrary waveforms.

## H. Their Problems

The problem with all these models is that they cannot handle multistage designs and that they are precise only for ramp signals, not for more complex signal waveforms.



Figure 4. Ramp approximation of more complex signals

#### III. WAVEFORM INDEPENDENT MODEL (WIM)

## A. What is the Difference?

The waveform independent model uses a completely different approach to obtain the cell values. The idea is not to reproduce the waveform behavior, but to extract the model parameters from the SPICE simulation [1]. It thus differs from the common practice, in which the gate is pre-characterized for a given ramp input.

## B. The Advantage of this Approach

Because this approach also uses more complex signals in the SPICE simulation, it provides certain advantages over the existing models. It encapsulates the intrinsic nonlinear DC and dynamic behaviors of a nonlinear driver, and it makes multistage simulation possible. Its cost stays comparable to that of a waveform-centric model, yet it can be applied to arbitrary input signals. It is suitable for capturing resistive shielding, inductive ringing, and capacitive and inductive coupling noise, and it even supports accurate timing and noise analysis under process, voltage, and temperature variations. Additionally, the runtime of the analysis is reduced by 40%.

## C. The Structure of the WiM

The waveform independent model proposes a two-level structure. The first stage considers the "internal" delay and creates a fictitious internal control node; the second stage contains static and dynamic nonlinearities to drive the output. Figure 5 shows these two stages. It is directly visible that this approach can easily be used for a multistage simulation, as the output is driven only by the fictitious control node. The input capacitance models the loading of the preceding stage and is controlled by the input and output voltages. The transfer stage consists of a double RC stage, and the output stage consists of a nonlinear current source and a nonlinear capacitance.

Figure 5. Waveform independent model structure

The nonlinear input capacitance (left in Figure 6) depends on $V_{input}$ and $V_{output}$, so it simulates the load on the preceding stage. This nonlinear capacitance is necessary because the capacitance of a MOSFET depends on the channel. If no channel is established, the capacitance differs considerably from the saturation case, where the channel is established over $\frac{2}{3}$ of the channel length.

Figure 6. Waveform independent model structure detailed

The double RC input stage (in the middle of Figure 6) gives a second-order model for the wire connection between the transistors. The nonlinear output behavior (right in Figure 6) is modeled by the nonlinear current source, which depends on the voltage of the fictitious control node $(V_c)$ and the output voltage $(V_o)$. This nonlinear current source models the behavior of the MOSFET. To model the nonlinear charge-voltage relation, a nonlinear capacitance that depends on the fictitious control voltage and the output voltage is used.
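The two-stage structure can be sketched as a small time-domain simulation. This is a simplified illustration, not the model of [1]: it assumes two ideal cascaded first-order sections for the input stage, a constant output capacitance instead of the nonlinear one, a forward-Euler integrator, and a hypothetical inverter-like current source `i_n_placeholder` that stands in for the extracted DC current table. All names and values are assumptions.

```python
import math

VDD = 1.0  # supply voltage in volts (illustrative value)

def i_n_placeholder(v_c, v_o, g=1e-3, steepness=10.0):
    """HYPOTHETICAL inverter-like current source I_n(V_c, V_o).

    It pulls the output toward VDD for a low control voltage and toward
    0 V for a high one; a real model would use the DC lookup table.
    """
    target = 0.5 * VDD * (1.0 - math.tanh(steepness * (v_c - 0.5 * VDD)))
    return g * (target - v_o)

def simulate_two_stage(times, v_in, p1, p2, i_n, c_out, v_o0):
    """Forward-Euler simulation: input -> double RC stage -> V_c -> output."""
    v_mid = v_c = v_in[0]          # the RC stage starts in DC steady state
    v_o = v_o0
    v_c_trace, v_o_trace = [v_c], [v_o]
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        d_mid = p1 * (v_in[k] - v_mid)   # first RC section (pole p1)
        d_c = p2 * (v_mid - v_c)         # second RC section (pole p2)
        d_o = i_n(v_c, v_o) / c_out      # current source charges c_out
        v_mid += dt * d_mid
        v_c += dt * d_c
        v_o += dt * d_o
        v_c_trace.append(v_c)
        v_o_trace.append(v_o)
    return v_c_trace, v_o_trace

# Rising input ramp from 0 V to VDD between 20 ps and 70 ps, 0.1 ps time step.
times = [k * 1e-13 for k in range(3001)]
v_in = [VDD * min(max((t - 20e-12) / 50e-12, 0.0), 1.0) for t in times]
v_c_trace, v_o_trace = simulate_two_stage(
    times, v_in, p1=1e11, p2=5e10, i_n=i_n_placeholder, c_out=10e-15, v_o0=VDD)
```

Because the output depends on the input only through $V_c$, the output of one such stage can be used directly as the input of the next one, which is what makes the structure convenient for multistage cells.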

## D. Detailed Model Extraction Steps

The detailed extraction steps for the components of the model are best followed with the help of Figure 7, which represents the steps in compact form.



Figure 7. Extraction steps

#### D.1 Creation of the DC Current Lookup Table

The first step, as can be seen in Figure 7, is the creation of a DC current lookup table, which gives almost exactly the current of the nonlinear current source at the output. Because we perform a DC voltage sweep, we can neglect the nonlinear output capacitance $(Q_{nc})$. Figure 8 shows the DC voltage sweep at the input and output of the model.
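The lookup table from this step can be sketched as a two-dimensional sweep followed by bilinear interpolation. The sketch assumes that the input stage has unity DC gain, so that the swept DC input voltage equals $V_c$; `fake_spice_dc` is a hypothetical stand-in for a real SPICE DC measurement, and the grid values are illustrative.

```python
import bisect

def build_dc_table(measure_dc_current, vc_grid, vo_grid):
    """Sweep both DC voltages and store the measured output current."""
    return [[measure_dc_current(vc, vo) for vo in vo_grid] for vc in vc_grid]

def _bracket(grid, x):
    """Index i and weight w so that x is about (1 - w)*grid[i] + w*grid[i + 1]."""
    i = min(max(bisect.bisect_right(grid, x) - 1, 0), len(grid) - 2)
    w = (x - grid[i]) / (grid[i + 1] - grid[i])
    return i, min(max(w, 0.0), 1.0)  # clamp values outside the grid

def lookup_current(table, vc_grid, vo_grid, vc, vo):
    """Bilinear interpolation of I_n(V_c, V_o) in the table."""
    i, wi = _bracket(vc_grid, vc)
    j, wj = _bracket(vo_grid, vo)
    low = (1 - wj) * table[i][j] + wj * table[i][j + 1]
    high = (1 - wj) * table[i + 1][j] + wj * table[i + 1][j + 1]
    return (1 - wi) * low + wi * high

def fake_spice_dc(v_in, v_o):
    """HYPOTHETICAL measurement used only to fill the table in this sketch."""
    return 1e-3 * (1.0 - v_in - v_o) + 2e-3 * v_in * v_o

grid = [0.0, 0.25, 0.5, 0.75, 1.0]
table = build_dc_table(fake_spice_dc, grid, grid)
current = lookup_current(table, grid, grid, 0.3, 0.6)
```

A finer grid reduces the interpolation error at the price of a larger library, which is the trade-off behind the criterion of minimal storage space.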

#### D.2 Extraction of the Input Stage (RC Stage)

Figure 8. DC voltage sweep for the creation of a DC current lookup table

This step is the most complicated one, because the two poles of the RC stage have to be found. The transfer function of the double RC stage can be written as in Equation 4, where $k_1$ and $k_2$ are constants depending on $p_1$ and $p_2$.

$$H(s) = H_1(s) + H_2(s) = \frac{k_1}{s+p_1} + \frac{k_2}{s+p_2} \tag{4}$$
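How the constants in Equation 4 depend on the poles can be illustrated under one assumption that is not stated in the text: that the double RC stage behaves as a two-pole low-pass filter with unity DC gain. A partial-fraction expansion then gives

$$H(s) = \frac{p_1 p_2}{(s+p_1)(s+p_2)} \quad \Rightarrow \quad k_1 = \frac{p_1 p_2}{p_2 - p_1}, \qquad k_2 = -\frac{p_1 p_2}{p_2 - p_1} = -k_1$$

As a check, $H(0) = \frac{k_1}{p_1} + \frac{k_2}{p_2} = \frac{p_2 - p_1}{p_2 - p_1} = 1$, so a DC input voltage appears unchanged at the fictitious control node.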

Here again a DC voltage has to be applied at the output; at the input, however, a transient voltage has to be applied. The output current is measured again, and the output capacitance is again neglected (Figure 9). In contrast to the previous case, neglecting the current into the output capacitance is not perfectly correct, because the capacitance $Q_{nc}$ varies with $V_c$, which is transient in this case. We therefore have to accept a certain error from the approximation in Equation 5, which assumes that the current comes only from the nonlinear output current source.

Figure 9. Extraction of the input stage (RC stage)

$$I(V_o, t_i) = I_n(V_c(t_i), V_{o,dc}) \tag{5}$$

To minimize the error, we have to perform a nonlinear optimization to find the optimal RC parameters. Least-squares fitting is used here: the idea of this method is to find the minimum of the squared error (Equation 6).

$$E(p_1, p_2, i) = (I(V_o, t_i) - I_n(V_c(t_i), V_{o,dc}))^2 \tag{6}$$

The condition for the minimum is $\frac{\partial E}{\partial p_{1,2}} = 0$. It is therefore only necessary to differentiate the error of our approximation with respect to the two poles $p_1, p_2$ to get the two equations we need, which gives Equation 7.

$$\frac{\partial E(p_1, p_2, i)}{\partial p_{1,2}} = -2\left(I(V_o, t_i) - I_n(V_c(t_i), V_{o,dc})\right) \cdot \frac{\partial I_n(V_c(t_i), V_{o,dc})}{\partial V_c(t_i)} \cdot \frac{\partial V_c(t_i)}{\partial p_{1,2}} \tag{7}$$

We therefore need the fictitious control voltage to calculate the minimum. We obtain it by applying the Laplace transform to the transfer function $H(s)$ with a ramp as input (Equation 8), which for one RC stage gives the response in Equation 9. Because arbitrary inputs can be composed as a sum of $n$ ramps with different scaling factors $a_j$ and start times $t_j$, the sum of the $n$ ramp responses gives the output voltage $y(t)$ of one RC stage (Equation 10).

$$u(t) = at \tag{8}$$

$$y(t) = ak_1 \left( -\frac{1}{p_1^2} + \frac{t}{p_1} + \frac{1}{p_1^2} e^{-p_1 t} \right) \tag{9}$$
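Equation 9 can be checked with a short worked derivation. The ramp $u(t) = at$ has the Laplace transform $a/s^2$, so the response of the first branch $H_1(s)$ of Equation 4 is

$$Y_1(s) = \frac{k_1}{s+p_1} \cdot \frac{a}{s^2} = a k_1 \left( -\frac{1}{p_1^2} \cdot \frac{1}{s} + \frac{1}{p_1} \cdot \frac{1}{s^2} + \frac{1}{p_1^2} \cdot \frac{1}{s+p_1} \right)$$

Transforming the three terms back with $\frac{1}{s} \rightarrow 1$, $\frac{1}{s^2} \rightarrow t$ and $\frac{1}{s+p_1} \rightarrow e^{-p_1 t}$ reproduces Equation 9 term by term.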

$$y(t) = \sum_{j=1}^{n} a_j k_1 \left( -\frac{1}{p_1^2} + \frac{t - t_j}{p_1} + \frac{1}{p_1^2} e^{-p_1 (t - t_j)} \right) \cdot U(t - t_j) \tag{10}$$

Now we just have to sum the two responses $y_1(t)$ and $y_2(t)$ of the two transfer functions $H_1(s)$ and $H_2(s)$ to get $V_c$ (Equations 11 and 12).

$$V_c(t) = y_1(t) + y_2(t) \tag{11}$$

$$V_{c}(t) = \sum_{i=1}^{2} \sum_{j=1}^{n} \frac{a_{j}k_{i}}{p_{i}^{2}} \left( -1 + (t - t_{j})p_{i} + e^{-p_{i}(t - t_{j})} \right) \cdot U(t - t_{j}) \tag{12}$$

Now we have all the necessary components to formulate the least-squares minimization problem with two equations and two unknowns.
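The pieces above can be put together in a small sketch. It evaluates $V_c(t)$ as in Equation 12 and searches for the pole pair with the smallest summed squared error of Equation 6. Two points are assumptions of this sketch and not taken from the text: the gains $k_1$, $k_2$ are those of a unity-DC-gain two-pole stage, and a brute-force search over hypothetical candidate poles replaces the nonlinear optimizer based on Equation 7. The current source `i_n_demo` is a hypothetical placeholder.

```python
import math

def partial_fraction_gains(p1, p2):
    """k1, k2 of Equation 4, ASSUMING H(s) = p1*p2 / ((s + p1)*(s + p2))."""
    k1 = p1 * p2 / (p2 - p1)
    return k1, -k1

def control_voltage(t, ramps, p1, p2):
    """Equation 12: V_c(t) for an input built from ramps [(a_j, t_j), ...]."""
    total = 0.0
    for p, k in zip((p1, p2), partial_fraction_gains(p1, p2)):
        for a_j, t_j in ramps:
            if t >= t_j:  # unit step U(t - t_j)
                tau = t - t_j
                total += a_j * k / p ** 2 * (-1.0 + tau * p + math.exp(-p * tau))
    return total

def squared_error(p1, p2, times, ramps, measured, i_n, v_o_dc):
    """Sum over all time samples of the error of Equation 6."""
    return sum((i_meas - i_n(control_voltage(t, ramps, p1, p2), v_o_dc)) ** 2
               for t, i_meas in zip(times, measured))

def fit_poles(times, ramps, measured, i_n, v_o_dc, candidates):
    """Brute-force search over candidate pole pairs with p1 < p2."""
    pairs = [(p1, p2) for p1 in candidates for p2 in candidates if p1 < p2]
    return min(pairs, key=lambda pp: squared_error(*pp, times, ramps, measured, i_n, v_o_dc))

def i_n_demo(v_c, v_o):
    """HYPOTHETICAL current source standing in for the DC lookup table."""
    return 1e-3 * (1.0 - v_c - v_o)

# Saturated input ramp 0 V -> 1 V in 50 ps, written as two ramps of opposite slope.
ramps = [(2e10, 0.0), (-2e10, 50e-12)]
times = [k * 5e-12 for k in range(61)]
# Synthetic "measurement" generated with known poles, so the fit can be checked.
measured = [i_n_demo(control_voltage(t, ramps, 5e10, 1e11), 0.5) for t in times]
best = fit_poles(times, ramps, measured, i_n_demo, 0.5, [2e10, 5e10, 1e11, 2e11])
```

The restriction to $p_1 < p_2$ avoids testing every pair twice, because exchanging the two poles leaves $V_c$ unchanged.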

## D.3 Nonlinear Output Capacitance

To get the dependency of the nonlinear output capacitance on $V_c$, it is first necessary to set $V_o$ to 0 and to apply a ramp at the input. Because the current of the nonlinear output current source is already known, the difference between the measured current and the current of this source can be obtained (Equation 13); this difference is directly the current flowing into the capacitance.

$$I_{nc}(t_i) = I(V_o(t_i)) - I_n(V_c(t_i), 0) \tag{13}$$

By integrating this current, the charge $Q_{nc}$ of the nonlinear capacitance can be obtained. For example, the trapezoidal rule gives:

$$Q_{nc}(V_c(t_i), 0) = 0.5 \sum_{k=1}^{i-1} I_{nc}(t_k)(t_{k+1} - t_{k-1}) + 0.5 I_{nc}(t_i)(t_i - t_{i-1}) \tag{14}$$
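The subtraction of Equation 13 and the integration of Equation 14 can be sketched as follows. The code uses the usual pairwise trapezoid sum, which regroups to the form of Equation 14 if $t_0$ is taken equal to $t_1$; the function name and the sample values are hypothetical.

```python
def capacitor_charge(times, i_total, i_source):
    """Cumulative charge Q_nc at every time sample.

    i_total:  measured output current I(V_o(t_i))
    i_source: current of the nonlinear source I_n(V_c(t_i), 0)
    """
    i_nc = [it - isrc for it, isrc in zip(i_total, i_source)]  # Equation 13
    charge = [0.0]
    for k in range(1, len(times)):
        # Trapezoid between sample k-1 and sample k (Equation 14, regrouped).
        charge.append(charge[-1] + 0.5 * (i_nc[k - 1] + i_nc[k]) * (times[k] - times[k - 1]))
    return charge

# Illustrative data: the capacitor current rises linearly from 0 to 2.
charge = capacitor_charge([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

Pairing each charge sample with the value of $V_c$ at the same time point then gives the charge as a function of the control voltage.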

To get the dependency of the nonlinear capacitance on the output voltage $V_o$ for each $V_c(t_i)$, a ramp voltage has to be applied at the output. The same procedure as before is then used: the two currents are subtracted, and the current into the nonlinear capacitance is integrated to get its value.

If the model has to be more precise, the extraction of the input stage (Section D.2) can be repeated using the new information about how the output capacitance varies with the fictitious voltage $V_c$.

#### D.4 Nonlinear Input Capacitance

The characterization of the nonlinear input capacitance follows a procedure quite similar to that of the output capacitance. First, the output voltage $V_o$ is set to 0 and a ramp is applied at the input. The nonlinear capacitance can then again be calculated by integrating the difference of the input currents. After that, multiple transient analyses have to be computed to get the dependency on the output voltage.

## D.5 Post-Tuning

To get better accuracy, a post-tuning process is possible, as shown in Figure 7. This post-tuning process makes it simple to reduce the delay error by increasing the pole $p_2$ and to optimize the slew rate by increasing the nonlinear output capacitance. In this way a fast optimization can be obtained without recalculating the least-squares fit for the poles $p_1$ and $p_2$.

## IV. PARAMETERIZABLE WAVEFORM INDEPENDENT MODEL

The Parameterizable Waveform Independent Model (PWiM) extends the WiM. All the steps described before are carried out for different channel lengths, ambient temperatures, widths, or threshold voltages. This yields a set of models from which the right one can be obtained using the response surface modeling technique [11].
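The idea of a response surface can be illustrated with the simplest possible case: a first-order fit of one extracted model parameter over one process parameter. This is only a sketch of the general technique, not the method of [11]; the choice of the pole $p_1$ as the fitted quantity, the data points, and the function name are hypothetical.

```python
def fit_first_order(x, y):
    """Least-squares line y = c0 + c1*x (one-variable, first-order response surface)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1

# HYPOTHETICAL characterization data: pole p1 extracted at five
# channel-length deviations (in nm) from the nominal value.
delta_l = [-4.0, -2.0, 0.0, 2.0, 4.0]
p1_extracted = [5.4e10, 5.2e10, 5.0e10, 4.8e10, 4.6e10]
c0, c1 = fit_first_order(delta_l, p1_extracted)

# Model parameter for a deviation that was never characterized directly.
p1_at_1nm = c0 + c1 * 1.0
```

With such a fit, the model parameters for an intermediate process condition can be evaluated from a few coefficients instead of storing one full model per condition.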

#### V. EXPERIMENTAL RESULTS

The experimental results show promising accuracy for complex signals as well as for different process conditions.

#### A. Results Using Complex Inputs

Figure 10 shows the simulation results of SPICE compared with the simulation results of the WiM model for complex input signals.



Figure 10. Results of complex input signals

#### B. Results of Crosstalk Noise / Variational Modeling

Figure 11 shows the simulation results of SPICE compared to the results of the WiM model for crosstalk noise and variational modeling.

## C. Delay / Slew Errors of WiM and the Speedup

For delay and slew, the model keeps the errors below 7.61% and 5.27%, respectively, combined with a speedup of up to 224 times over the SPICE simulation.



Figure 11. Results of crosstalk noise / variational modeling

For the variational modeling, the delay error stays below 8.0% and the slew error below 11.1%, with a speedup of up to 357 times over SPICE.

#### VI. CONCLUSION

The WiM is easy to adopt, provides near-SPICE accuracy even for complex signals, can be used for multistage simulations, provides a compact library, can handle delay noise simulation, and achieves a speedup of two orders of magnitude over SPICE. It is therefore a good model for semiconductors produced in the nanometer regime.

#### References

- [1] Peng Li, Zhuo Feng, and Emrah Acar, "Characterizing Multistage Nonlinear Drivers and Variability for Accurate Timing and Noise Analysis," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 15, no. 11, November 2007.
- [2] Florentin Dartu, Noel Menezes, Jessica Qian, and Lawrence T. Pillage, "A Gate-Delay Model for High-Speed CMOS Circuits," IEEE Design Automation Conference, Department of Electrical and Computer Engineering, University of Texas at Austin, 1994.
- [3] Ravishankar Arunachalam, Florentin Dartu, and Lawrence T. Pileggi, "CMOS Gate Delay Models for General RLC Loading," IEEE, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, 1997.
- [4] Soroush Abbaspour, Hanif Fatemi, and Massoud Pedram, "VGTA: Variation-Aware Gate Timing Analysis," IEEE, Electrical Engineering Department, University of Southern California, 2005.
- [5] Bogdan Tutuianu, Ross Baldick, and Mark S. Johnstone, "Nonlinear Driver Models for Timing and Noise Analysis," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2004.
- [6] Masanori Hashimoto, Yuji Yamada, and Hidetoshi Onodera, "Equivalent Waveform Propagation for Static Timing Analysis," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2004.
- [7] Andrew B. Kahng, Bao Liu, and Xu Xu, "Constructing Current-Based Gate Models Based on Existing Timing Library," IEEE, Proceedings of the 7th International Symposium on Quality Electronic Design, 2006.
- [8] Igor Keller, Ken Tseng, and Nishath Verghese, "A Robust Cell-Level Crosstalk Delay Change Analysis," IEEE, Cadence Design Systems, San Jose, USA, 2004.
- [9] Peng Li and Emrah Acar, "A Waveform Independent Gate Model for Accurate Timing Analysis," IEEE, Proceedings of the 2005 International Conference on Computer Design.
- [10] David Blaauw, Supamas Sirichotiyakul, and Chanhee Oh, "Driver Modeling and Alignment for Worst-Case Delay Noise," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2003.
- [11] K. K. Low and Stephen W. Director, "An Efficient Methodology for Building Macromodels of IC Fabrication Processes," IEEE Transactions on Computer-Aided Design, 1989.
- [12] John F. Croix and D. F. Wong, "Blade and Razor: Cell and Interconnect Delay Analysis Using Current-Based Models," IEEE, Anaheim, California, USA, 2003.