Technical Feature
Implementation of a Neural Network HEMT Model into ADS
Neural network algorithms have been applied to various areas of engineering and computer science. A high electron-mobility transistor (HEMT) large-signal model has been implemented in the Advanced Design System (ADS) software package using a multilayered neural network. The neural network model was trained to the drain current, gate-source capacitance and gate-drain capacitance characteristics of a previously developed Angelov model. Training was done using the backpropagation algorithm, which achieved excellent results for the model. Linear and harmonic balance simulations were performed to compare the linear and power performance of the neural network model to the Angelov model, without any additional optimization after the initial training.
Willie L. Thompson II, Eric V. Miller and Carl White
Morgan State University Center of Microwave, Satellite and RF Engineering
Baltimore, MD
In recent years, device characterization and modeling have become essential components of the design cycle. Many CAD packages allow engineers to simulate and modify their circuits during the initial design cycle, which helps to reduce the need for additional iterations and shortens the time-to-market, especially for microwave monolithic integrated circuits (MMIC). For accurate design simulations, engineers require device models for the active and passive components used in the CAD packages. These models must be accurate, reliable, easily extracted and have limited computational requirements.
Currently, there are several types of device models. Physics-based models (PBM)^{1,2} utilize solid-state physics equations to obtain the device characteristics. PBMs are computationally intensive and, in some cases, the accuracy of the model is affected by the assumptions made to obtain closed-form equations. In addition, most PBMs are device dependent, which means that a PBM of a HEMT device will not model the behavior of a metal-oxide-semiconductor field-effect transistor (MOSFET) without modification of the model's equations. Empirical-based models (EBM)^{3-6} use multidimensional polynomials to simulate the device characteristics. Polynomial expressions provide a good approximation for mildly nonlinear characteristics, but usually break down for strong nonlinearities such as the DC characteristics at pinch-off. The empirical approach usually requires intensive extraction procedures following several measurements to obtain an accurate model. Table-based models (TBM)^{7} use a table of measured data to model the device characteristics. The size of the table grows at an exponential rate as the number of elements being modeled increases. In addition, TBMs require interpolation or extrapolation functions to obtain results at points that are not included within the range of the table's measurements, thus degrading the model accuracy. One major advantage of EBMs and TBMs is that they are less computationally intensive than PBMs.
Neural networks (NN) have been reported for device and circuit modeling.^{8-10} A neural network is a mathematical approach for mimicking the way the human brain is able to learn and process data. The brain is a network of neurons that are connected to each other through axons. By modifying the axons or connections between the neurons in response to input data, the brain is able to learn the data. Using its neurons and axons, the brain is able to generate a representation of the data that can be used later in response to similar input data. Neural network algorithms use this same principle to learn data using a network of mathematical functions (neurons) and weights (axons) to generate a representation of the data. The major advantages of NN models are: theoretically, they can handle any degree of nonlinearity within the data;^{8} the size of the model does not increase exponentially with an increase in the number of characteristics being modeled; and the computational requirement is limited to the one-time training of the NN model, which is usually performed offline.


Fig. 1 Multilayered structure used for implementation of the NN model.  Fig. 2 A typical large-signal HEMT equivalent circuit. 
A typical multilayered neural network is shown in Figure 1, consisting of an input layer, hidden layer and output layer. Using a similar NN structure, Zaabab, et al.^{8,9} have demonstrated NN algorithms that predict device I-V curves and S-parameters with excellent results. Also, Shirakawa, et al.^{10} have used NN algorithms to model the large-signal behavior of a HEMT. They have demonstrated good results for modeling C_{gs} and g_{m} versus bias.
This article presents an approach for implementing a NN algorithm to develop a large-signal HEMT model to be used in ADS by circuit designers. A typical large-signal model is shown in Figure 2. The NN model proposed in this article implements NN algorithms for I_{ds}, C_{gs} and C_{gd}, with I_{ds} being the most nonlinear element in the circuit, and C_{gs} and C_{gd} being secondary nonlinear elements that affect the accuracy of the model, as suggested by Statz, et al.^{4} The NN model was trained using the backpropagation algorithm on I_{ds}, C_{gs} and C_{gd} data generated by an Angelov model^{6} that had been developed for a PHEMT device. Training produces a set of weights for the NN structure that represents the behavior of the elements in the model. After the weight set is imported into a user-defined ADS model, any simulation can be performed without additional optimization of the model. In particular, linear and harmonic balance simulations were performed to compare the S-parameters, maximum available gain, k-factor, power at the 1 dB compression point and power-added efficiency of the NN model with those of the Angelov model.
Multilayered Neural Network
Neural Network Structure
A typical multilayered NN structure consists of an input layer, hidden layer and output layer. Each circle represents a neuron with I neurons for the input layer, J neurons for the hidden layer and K neurons for the output layer. The hidden layer allows the NN to model the complex nonlinear relationship of the input and output efficiently and accurately. The network can have as many hidden layers as needed.
In this work, the model utilizes a three-layer NN structure. The total weighted input and output for the ith neuron in the Lth layer are given by
and
where M = number of neurons in the (L-1)th layer
The other parameters are defined in a sidebar on page 75. The size of the model does not grow exponentially when the number of input parameters is increased. The total number of model parameters is (I × J) + (K × J) + (I + J + K). The first term is equal to the number of weights between the input layer and hidden layer, the second term is equal to the number of weights between the hidden layer and output layer, and the third term is equal to the number of bias weights for each neuron in the network. The input parameters for the model are V_{gs} and V_{ds}, with the output parameters being I_{ds}, C_{gs} and C_{gd}.
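For reference, the forward computation and the parameter count can be written out explicitly. The block below is a sketch in the standard multilayer-perceptron form and is not necessarily the article's own notation: the symbols w, θ, z, γ and the activation function g are assumed names, and only M, I, J and K come from the text. The last lines evaluate the parameter count for the network sizes listed in Table 1 (I = 2, J = 75, K = 3).

```latex
\begin{align*}
\gamma_i^{L} &= \sum_{j=1}^{M} w_{ij}^{L}\, z_j^{L-1} + \theta_i^{L} \\
z_i^{L} &= g\!\left(\gamma_i^{L}\right) \\
N_{\text{param}} &= (I \times J) + (K \times J) + (I + J + K) \\
                 &= (2 \times 75) + (3 \times 75) + (2 + 75 + 3) \\
                 &= 150 + 225 + 80 = 455
\end{align*}
```

By the same formula, adding a third input would add only J weights and one bias (76 parameters for J = 75), which is the sense in which the model size does not grow exponentially with the number of inputs.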
Backpropagation Algorithm
The backpropagation algorithm is a popular training algorithm that minimizes the error between the outputs of the NN and desired outputs of the training set. The total error is given by
where
M = number of neurons in output layer
n = loop counter that indicates the current iteration of the training points, which ranges from 1 to the maximum number of training points
The main goal of the backpropagation algorithm is to adjust the weights and biases of the neurons to minimize the total error of the outputs. They are adjusted using
where
n = loop counter as shown previously
The training algorithm that was implemented for the model was based on the gradient descent method.^{9,11} The important structure and training parameters of the network are given in Table 1 .
Table 1  
NN layers  3 
Neurons in input layer, I  2 
Neurons in hidden layer, J  75 
Neurons in output layer, K  3 
Learning rate of weights, β  0.01 
Learning rate of biases, η  0.1 
Momentum term, α  0.4 
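The training procedure can be illustrated with a minimal sketch of gradient-descent backpropagation with a momentum term. It uses the three training parameters of Table 1 (weight learning rate 0.01, bias learning rate 0.1, momentum 0.4) as defaults. Everything else is an assumption made for illustration: the sigmoid hidden layer, the linear output layer, the per-pattern update order, the random initialization and all names (`ThreeLayerNet`, `train_pattern`, `total_error`, `train`).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input -> sigmoid hidden layer -> linear output layer (both assumed)."""

    def __init__(self, n_in, n_hid, n_out, beta=0.01, eta=0.1, alpha=0.4, seed=0):
        rng = random.Random(seed)
        self.beta, self.eta, self.alpha = beta, eta, alpha
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
        self.b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
        self.b2 = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        # Previous updates, kept for the momentum term.
        self.dw1 = [[0.0] * n_in for _ in range(n_hid)]
        self.db1 = [0.0] * n_hid
        self.dw2 = [[0.0] * n_hid for _ in range(n_out)]
        self.db2 = [0.0] * n_out

    def forward(self, x):
        z = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * zj for w, zj in zip(row, z)) + b
             for row, b in zip(self.w2, self.b2)]
        return z, y

    def train_pattern(self, x, d):
        """One gradient-descent step on E = 0.5 * sum_k (y_k - d_k)^2."""
        z, y = self.forward(x)
        err_out = [yk - dk for yk, dk in zip(y, d)]
        # Back-propagate through the sigmoid before the output weights change.
        err_hid = [zj * (1.0 - zj) * sum(e * self.w2[k][j] for k, e in enumerate(err_out))
                   for j, zj in enumerate(z)]
        for k, e in enumerate(err_out):
            for j, zj in enumerate(z):
                # update = -(learning rate) * gradient + momentum * previous update
                self.dw2[k][j] = -self.beta * e * zj + self.alpha * self.dw2[k][j]
                self.w2[k][j] += self.dw2[k][j]
            self.db2[k] = -self.eta * e + self.alpha * self.db2[k]
            self.b2[k] += self.db2[k]
        for j, e in enumerate(err_hid):
            for i, xi in enumerate(x):
                self.dw1[j][i] = -self.beta * e * xi + self.alpha * self.dw1[j][i]
                self.w1[j][i] += self.dw1[j][i]
            self.db1[j] = -self.eta * e + self.alpha * self.db1[j]
            self.b1[j] += self.db1[j]


def total_error(net, data):
    """Sum of the squared-error cost over all training patterns."""
    return sum(0.5 * sum((yk - dk) ** 2 for yk, dk in zip(net.forward(x)[1], d))
               for x, d in data)


def train(net, data, epochs):
    """Present every pattern once per epoch; return the final total error."""
    for _ in range(epochs):
        for x, d in data:
            net.train_pattern(x, d)
    return total_error(net, data)
```

A network of the size in Table 1 would be created as `ThreeLayerNet(2, 75, 3)`, with each training pattern a pair of the form `((V_gs, V_ds), (I_ds, C_gs, C_gd))`. In practice the inputs and outputs would normally be scaled to comparable ranges before training, which this sketch leaves out.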
Angelov Empirical-based Model
Angelov^{6} describes a large-signal EBM for HEMT and MESFET devices. The model is capable of simulating the DC characteristics and capacitances of the device. The drain-source current is modeled using a harmonic series, while the derivatives of the current with respect to the potentials are modeled using singular-value decomposition. In addition, the model is capable of simulating C_{gs} and C_{gd}, which increases the model's accuracy in predicting the different harmonics.^{11} The empirical equations are
I_{ds} = I_{pk}(1 + λV_{ds}){1 + tanh(Ψ)}tanh(δV_{ds}) (6)
where
I_{pk} = drain current at maximum g_{m}
λ = channel length modulation
δ = saturation voltage parameter
Ψ = P_{1}(V_{gs} - V_{pk}) + P_{2}(V_{gs} - V_{pk})^{2} + P_{3}(V_{gs} - V_{pk})^{3} + ...
P_{i} = fitting coefficients
V_{pk} = gate voltage at peak g_{m}
C_{gs} = C_{gso}{1 + tanh(φ_{1})}{1 + tanh(φ_{2})} (7)
C_{gd} = C_{gdo}{1 + tanh(φ_{3})}{1 + tanh(φ_{4})} (8)
where
φ_{1} = P_{0gsg} + P_{1gsg}V_{gs} + P_{2gsg}V_{gs}^{2} + ...
φ_{2} = P_{0gsd} + P_{1gsd}V_{ds} + P_{2gsd}V_{ds}^{2} + ...
φ_{3} = P_{0gdg} + P_{1gdg}V_{gs} + P_{2gdg}V_{gs}^{2} + ...
φ_{4} = P_{0gdd} + (P_{1gdd} + P_{1cc}V_{gs})V_{ds} + P_{2gdd}V_{ds}^{2} + ...
The term P_{1cc}V_{gs}V_{ds} reflects the cross coupling of C_{gd} with the V_{gs} and V_{ds} potentials. The remaining terms are fitting parameters.
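Equations 6 to 8 translate directly into code. The following is a minimal sketch, truncating each polynomial after its quadratic (or, for the drain current, cubic) term. The default arguments are the values as printed in Table 2; the units (amperes, farads, volts) and all function and argument names are assumptions made for illustration.

```python
import math


def poly(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... for the given coefficients."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


def angelov_ids(vgs, vds, ipk=0.24, lam=0.03, delta=3.9,
                p=(0.47, 0.5, 0.4), vpk=0.0):
    """Drain current, Equation 6; lam is the channel length modulation and
    delta the saturation voltage parameter."""
    dv = vgs - vpk
    psi = sum(pn * dv ** (n + 1) for n, pn in enumerate(p))
    return ipk * (1.0 + lam * vds) * (1.0 + math.tanh(psi)) * math.tanh(delta * vds)


def angelov_cgs(vgs, vds, cgs0=6.97e-13,
                p_gsg=(4.45, 2.56, 0.0), p_gsd=(0.77, 0.02, 0.0)):
    """Gate-source capacitance, Equation 7."""
    return (cgs0 * (1.0 + math.tanh(poly(p_gsg, vgs)))
            * (1.0 + math.tanh(poly(p_gsd, vds))))


def angelov_cgd(vgs, vds, cgd0=2.13e-14,
                p_gdg=(2.29, 1.43, 0.0), p_gdd=(0.38, 0.34, 0.0), p1cc=0.0):
    """Gate-drain capacitance, Equation 8, including the cross-coupling term."""
    phi3 = poly(p_gdg, vgs)
    phi4 = p_gdd[0] + (p_gdd[1] + p1cc * vgs) * vds + p_gdd[2] * vds ** 2
    return cgd0 * (1.0 + math.tanh(phi3)) * (1.0 + math.tanh(phi4))
```

Sweeping these three functions over a grid of V_gs and V_ds values is one way to produce the kind of I_ds, C_gs and C_gd data set that the NN is trained on.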
An Angelov EBM was developed for a 600 µm NEC device (NEC12801).^{11} The device was fully characterized from measured S-parameters with V_{gs} ranging from pinch-off to saturation (-2.0 to 0.2 V) and V_{ds} ranging from 0 to 8 V, over a frequency range from 45 MHz to 26.5 GHz. DC measurements were taken for I_{ds}, I_{gd} and I_{gs}. From the measured S-parameters, a small-signal equivalent model was created at each bias point to obtain the intrinsic parameters. The C_{gs} and C_{gd} values, obtained from the small-signal models as a function of bias, were then fitted to Equations 7 and 8. The drain-source current was fitted to the Angelov current expression, Equation 6, while I_{gd} and I_{gs} were fitted to normal diode I-V expressions. The model parameters for Angelov's expressions are given in Table 2. Good results for DC characteristics and S-parameters were demonstrated.^{11}



Fig. 3 Comparison of the neural network output to training data for I_{ds}.  Fig. 4 Comparison of the neural network output to training data for C_{gs}.  Fig. 5 Comparison of the neural network output to training data for C_{gd}. 
Training Results
For this work, the training data for I_{ds}, C_{gs} and C_{gd} was generated directly from the Angelov model described previously. Figures 3 to 5 show sample training results for I_{ds}, C_{gs} and C_{gd}, respectively. The training of the NN model was performed on a 166 MHz computer and took 150 minutes. The training ranges for each output are shown in Table 3. The training ranges are large for this first iteration of the model. A smaller, more optimized data set could be used, but optimizing its size was not the focus of this work.
Table 2  
Drain current at max. g_{m}, I_{pk}  0.24 
Channel length modulation, λ  0.03 
Saturation voltage, δ  3.9 
I_{ds} fitting coefficient, P_{1}  0.47 
I_{ds} fitting coefficient, P_{2}  0.5 
I_{ds} fitting coefficient, P_{3}  0.4 
Gate voltage at peak g_{m}, V_{pk}  0 
Capacitance parameter, C_{gso}  6.97e-13 
C_{gs} fitting coefficient, P_{0gsg}  4.45 
C_{gs} fitting coefficient, P_{1gsg}  2.56 
C_{gs} fitting coefficient, P_{2gsg}  0 
C_{gs} fitting coefficient, P_{0gsd}  0.77 
C_{gs} fitting coefficient, P_{1gsd}  0.02 
C_{gs} fitting coefficient, P_{2gsd}  0 
Capacitance parameter, C_{gdo}  2.13e-14 
C_{gd} fitting coefficient, P_{0gdg}  2.29 
C_{gd} fitting coefficient, P_{1gdg}  1.43 
C_{gd} fitting coefficient, P_{2gdg}  0 
C_{gd} fitting coefficient, P_{0gdd}  0.38 
C_{gd} fitting coefficient, P_{1gdd}  0.34 
C_{gd} fitting coefficient, P_{2gdd}  0 
C_{gd} coupling coefficient, P_{1cc}  0 
Implementation
The implementation of a NN model into ADS requires the development of a user-defined model. The three main steps in developing a user-defined model are:
1. Defining the parameters through which the user interfaces with the model from the ADS schematic.
2. Defining the circuit symbol and number of pins for interfacing with ADS simulators.
3. Developing the C code.

Fig. 6 Implementation of the neural network into the circuit simulator 
The first two steps consist of developing the application extension language (AEL) code to interface the model with ADS. AEL provides the coupling of the model's parameters and pins in the schematic design to the simulator. The C code is used to define the device's response to its parameter configuration, simulation controls and pin voltages.
ADS has several types of simulators in which a model can be used. The linear simulator is used for emulating the small-signal response versus frequency of a transistor. Typical outputs include S-parameters, stability factor and maximum available gain. A small-signal or a large-signal model can be used within the linear simulator. The nonlinear simulator (harmonic balance) is used for the large-signal response of a transistor at different power excitations. The transient simulator is used to model the response of a transistor versus time. The harmonic balance and transient simulators require a large-signal model which accurately represents the behavior of a transistor.
In developing the user-defined model, the typical large-signal equivalent circuit is divided into a linear subnetwork and a nonlinear subnetwork. The linear subnetwork includes R_{i}, R_{c}, R_{s}, R_{g}, R_{d}, C_{rf}, C_{ds}, L_{s}, L_{g} and L_{d}, whose values can be obtained from extraction procedures and small-signal modeling. The linear elements do not exhibit bias dependence, but exhibit frequency dependence. The nonlinear subnetwork includes I_{ds}, I_{gs}, I_{dg}, C_{gs} and C_{gd}, which exhibit strong bias dependence.
The implementation of a NN subnetwork into a harmonic balance simulator proceeds as follows. In a harmonic balance simulation, the simulator solves the harmonic balance equation
F(V) = I(V) + jΩQ(V) + YV + I_{ss} = 0 (9)
where
I(V) = currents out of the nonlinear subnetwork
Q(V) = charges out of the nonlinear subnetwork
Y = admittance matrix of the linear subnetwork
V = node voltages in the circuit
Ω = angular frequency matrix
I_{ss} = sources
Zaabab, et al.^{9} proposed dividing the equivalent circuit into three subnetworks: the linear elements, the NN algorithm that models the nonlinear relationship of the current and capacitances, and all other nonlinear elements of the equivalent circuit, as demonstrated in Figure 6. The new harmonic balance equation is
F(V) = I(V) + jΩQ(V) + I_{n}(V) + jΩQ_{n}(V) + YV + I_{ss} = 0 (10)
where
I_{n}(V) = nonlinear NN current
Q_{n}(V) = charges of NN capacitances
The NN subnetwork is simply included as an additional nonlinear subnetwork within the harmonic balance equation.
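How the NN subnetwork enters Equation 10 as one more nonlinear block can be sketched for a deliberately simplified case: a single circuit node, so that V, Y and I_ss are lists of phasors indexed by harmonic number k and the angular frequency matrix reduces to the scalar kω0. This is an illustrative assumption, not the ADS implementation; the function names, the one-sided phasor convention and the sample count are all hypothetical.

```python
import cmath
import math


def to_time(V, n_samples):
    """One-sided phasors V[0..H] -> real waveform samples over one period."""
    samples = []
    for n in range(n_samples):
        theta = 2.0 * math.pi * n / n_samples
        v = V[0].real + sum((V[k] * cmath.exp(1j * k * theta)).real
                            for k in range(1, len(V)))
        samples.append(v)
    return samples


def to_freq(x, n_harm):
    """Real waveform samples -> one-sided phasors X[0..H] (plain DFT)."""
    N = len(x)
    X = []
    for k in range(n_harm + 1):
        s = sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        X.append(s / N if k == 0 else 2.0 * s / N)
    return X


def hb_residual(V, Y, I_ss, omega0, blocks, n_samples=64):
    """Residual F_k = sum over blocks of (I_k + j*k*omega0*Q_k) + Y_k*V_k + I_ss,k.

    Each entry of `blocks` is a pair (current(v), charge(v)) of time-domain
    functions. The conventional nonlinear subnetwork and the NN subnetwork
    are simply two entries of this list.
    """
    H = len(V) - 1
    v_t = to_time(V, n_samples)  # n_samples must exceed 2*H
    F = [Y[k] * V[k] + I_ss[k] for k in range(H + 1)]
    for current, charge in blocks:
        I = to_freq([current(v) for v in v_t], H)
        Q = to_freq([charge(v) for v in v_t], H)
        for k in range(H + 1):
            F[k] += I[k] + 1j * k * omega0 * Q[k]
    return F
```

A harmonic balance solver would adjust V until every entry of the returned residual is close to zero, which is where the derivatives of the currents and charges are needed.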
The NN user-defined model code contains three main functions that correspond to the linear subnetwork, nonlinear subnetwork and NN subnetwork. The linear function models the linear elements of the circuit and computes the admittance matrix, Y of Equation 10, for each node. The nonlinear function models I_{gs} and I_{dg}, the two nonlinear current sources that were fitted to normal diode I-V expressions as previously mentioned, and computes the nonlinear I(V) and Q(V) matrices for Equation 10. Finally, the NN function models I_{ds}, C_{gs} and C_{gd}, and computes the nonlinear I_{n}(V) and Q_{n}(V) matrices for Equation 10. In addition to providing the currents and charges, the nonlinear and NN functions must provide the derivatives of the currents and charges with respect to each voltage source. These derivatives are used in the linear simulator and improve the convergence of the nonlinear simulator.
For this work, the NN algorithm was trained to the drain current, gate-source capacitance and gate-drain capacitance, as discussed previously, using data generated by an Angelov model. Once the offline training was completed, the generated weight set was imported and used by the NN function within the ADS user-defined model. The charges for the subnetwork are generated from the capacitances, and the current derivatives are computed. The linear and nonlinear functions remained the same as those used for the Angelov user-defined model, with the same element values, excluding the nonlinear elements modeled by the NN subnetwork.
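Because the trained network is a closed-form function of its inputs, the current derivatives can be obtained analytically with the chain rule rather than by numerical differencing. The sketch below shows this for a network with a sigmoid hidden layer and a linear output layer. Both activation choices, the absence of input and output scaling, and the function name are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def nn_outputs_and_jacobian(x, w1, b1, w2, b2):
    """Return (y, jac) where jac[k][i] = d y_k / d x_i.

    w1[j][i] and b1[j] are the hidden-layer weights and biases,
    w2[k][j] and b2[k] those of the (linear) output layer.
    """
    z = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    y = [sum(w * zj for w, zj in zip(row, z)) + b
         for row, b in zip(w2, b2)]
    # Chain rule: dy_k/dx_i = sum_j w2[k][j] * z_j * (1 - z_j) * w1[j][i]
    jac = [[sum(w2[k][j] * z[j] * (1.0 - z[j]) * w1[j][i] for j in range(len(z)))
            for i in range(len(x))]
           for k in range(len(y))]
    return y, jac
```

With the inputs ordered as (V_gs, V_ds) and I_ds as the first output, `jac[0][0]` corresponds to the transconductance and `jac[0][1]` to the output conductance.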


Fig. 7 DC simulation results within training range.  Fig. 8 DC simulation results for expanded range of V_{gs} and V_{ds}. 
Verification
DC Simulations
DC characteristics are essential for the modeling of the linear and nonlinear performance of the transistor. DC characteristics within the training range of the neural network were simulated. Excellent DC performance of the NN model and the ability of the NN algorithm to predict untrained V_{gs} curves are demonstrated in Figure 7. The untrained curves are V_{gs} = -0.15, -0.35, -0.85 and -1.35 V. Figure 8 shows the DC characteristics over a range of V_{gs} and V_{ds} that extends beyond the training range of the drain current. Once again, good DC performance of the NN model and the ability of the NN algorithm to predict untrained V_{gs} curves are demonstrated.

Fig. 9 S-parameters for V_{gs} = -0.95 V and V_{ds} = 5.0 V. 
Linear Simulations
Linear simulations were performed to verify the small-signal behavior of the NN model against the Angelov model. S-parameter simulation results are shown in Figure 9 for a frequency range of 45 MHz to 26.5 GHz at a bias point of V_{gs} = -0.95 V and V_{ds} = 5.0 V. Maximum available gain, stability factor, and simultaneous-matched input gamma and output gamma were also simulated. They are excellent indicators of the accuracy of the model in small-signal analysis.^{12} Excellent results are demonstrated in Figures 10, 11 and 12. In addition to verifying the small-signal analysis of the model, performing linear simulations with a nonlinear large-signal model verifies the convergence of the model to the small-signal case.
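Two of the figures of merit used in this comparison follow directly from the S-parameters at a single frequency. The sketch below uses the standard two-port definitions of the stability factor k and of the maximum available gain, which is defined only for k ≥ 1. The function names are hypothetical, and the handling of the k < 1 case (raising an error) is a choice made for this example only.

```python
import math


def stability_factor(s11, s12, s21, s22):
    """Stability factor k of a two-port from its complex S-parameters."""
    delta = s11 * s22 - s12 * s21
    return ((1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2)
            / (2.0 * abs(s12 * s21)))


def max_available_gain_db(s11, s12, s21, s22):
    """Maximum available gain in dB; raises ValueError when k < 1."""
    k = stability_factor(s11, s12, s21, s22)
    if k < 1.0:
        raise ValueError("maximum available gain is undefined for k < 1")
    mag = abs(s21) / abs(s12) * (k - math.sqrt(k * k - 1.0))
    return 10.0 * math.log10(mag)
```

Evaluating both functions over the simulated frequency sweep for the NN model and for the Angelov model gives the kind of side-by-side curves shown in Figures 10 and 11.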



Fig. 10 Maximum available gain for V_{gs} = -0.95 V and V_{ds} = 5.0 V.  Fig. 11 Stability factor for V_{gs} = -0.95 V and V_{ds} = 5.0 V.  Fig. 12 Simultaneous-matched input/output gamma for V_{gs} = -0.95 V and V_{ds} = 5.0 V. 
Nonlinear Simulations
Nonlinear performance of the NN model was simulated using ADS's harmonic balance simulator. The harmonic balance simulation requires accurate modeling of the charge and capacitance at each node, and the derivatives of the drain current, which are used in Equation 10. The simulations were performed at a fundamental frequency of 2.4 GHz and a bias point of V_{gs} = -0.95 V and V_{ds} = 5.0 V. Good output power versus input power characteristics are demonstrated in Figure 13, with the power characteristics at the 1 dB compression point (P1dB) given in Table 4. The NN model predicted greater compression (or less output power) than the Angelov model, but the result is within 2 percent of the Angelov model. The power-added efficiency (PAE) versus input power is shown in Figure 14 and shows good agreement with the Angelov model.
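Both quantities compared here can be extracted from a simulated power sweep. The sketch below finds the 1 dB compression point by linear interpolation of the gain and computes PAE from its usual definition, (P_out - P_in) / P_DC. It assumes that the first point of the sweep lies in the small-signal region and that powers are given in dBm; the function names are hypothetical.

```python
def dbm_to_watt(p_dbm):
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def power_added_efficiency(p_in_dbm, p_out_dbm, p_dc_watt):
    """PAE as a fraction: RF power added by the device over DC power."""
    return (dbm_to_watt(p_out_dbm) - dbm_to_watt(p_in_dbm)) / p_dc_watt


def p1db(p_in_dbm, p_out_dbm):
    """Return (P_in, P_out) in dBm at 1 dB gain compression.

    The small-signal gain is taken from the first sweep point. Returns
    None when the sweep never reaches 1 dB of compression.
    """
    gains = [po - pi for pi, po in zip(p_in_dbm, p_out_dbm)]
    target = gains[0] - 1.0
    for i in range(1, len(gains)):
        if gains[i] <= target:
            # Interpolate linearly between the two bracketing sweep points.
            frac = (gains[i - 1] - target) / (gains[i - 1] - gains[i])
            p_in = p_in_dbm[i - 1] + frac * (p_in_dbm[i] - p_in_dbm[i - 1])
            return p_in, p_in + target
    return None
```

Applying `p1db` to the sweeps of both models gives a direct numerical comparison of their compression behavior.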


 
A large-signal model must model the harmonics that can be generated during nonlinear operation. The output power levels of the first three harmonics are demonstrated in Figure 15. This simulation demonstrates the NN model's ability to model the different harmonics and shows good agreement with the Angelov model. The generation of harmonics will cause distortion of the output waveform, as demonstrated in Figure 16.




Fig. 13 Output power vs. input power.  Fig. 14 PAE vs. input power.  Fig. 15 Output power for the first three harmonics.  Fig. 16 Output waveform vs. different input power. 
Conclusion
A nonlinear large-signal multilayered NN model implemented into Agilent's ADS is described. The model is fully integrated into the simulator, allowing any simulation to be performed with the model (DC, linear, harmonic balance and transient). The NN was trained offline to the drain current, gate-source capacitance and gate-drain capacitance using a backpropagation algorithm. After training, the weight set of the NN was imported into ADS, where no additional optimization of the model was required. Harmonic balance results of the model strongly agreed with the Angelov model. To the authors' knowledge, this is the first reported fully integrated NN model in a commercially available CAD package demonstrating excellent linear and nonlinear results.
Acknowledgment
The authors would like to thank the Maryland Industrial Partnerships (MIPS) for funding of this work. The authors would also like to thank Marek Mierzwinski of Agilent Technologies, EEsof EDA Division, Kreative Control Systems Inc. (KCS) and Adrian Gilbert of Hughes Space and Communications. Thanks also to the undergraduate research students of COMSARE (John Brice, Chris Guisto, Clifton Martin, Jerhome Petway and Ammyanna Williams).
References
1. J.W. Bandler, R.M. Biernacki, Q. Cai, S.H. Chen, S. Ye and Q.J. Zhang, "Integrated Physics-oriented Statistical Modeling, Simulation and Optimization," IEEE Transactions on Microwave Theory and Techniques, Vol. 40, 1992, pp. 1374-1400.
2. D.E. Stoneking, G.L. Bilbro, P.A. Gilmore, R.J. Trew and C.T. Kelley, "Yield Optimization Using a GaAs Process Simulator Coupled to a Physical Device Model," IEEE Transactions on Microwave Theory and Techniques, Vol. 40, 1992, pp. 1353-1363.
3. W.R. Curtice and M. Ettenberg, "A Nonlinear GaAs FET Model for Use in the Design of Output Circuits for Power Amplifiers," IEEE Transactions on Microwave Theory and Techniques, Vol. 33, 1985, pp. 1383-1394.
4. H. Statz, P. Newman, I.W. Smith, R.A. Pucel and H. Haus, "GaAs FET Device and Circuit Simulation in SPICE," IEEE Transactions on Electron Devices, Vol. 34, 1987, pp. 160-166.
5. S. Maas and D. Neilson, "Modeling of MESFETs for Intermodulation Analysis of Mixers and Amplifiers," 1990 IEEE MTT-S Digest, pp. 1291-1294.
6. I. Angelov, H. Zirath and N. Rorsman, "A New Empirical Nonlinear Model for HEMT and MESFET Devices," IEEE Transactions on Microwave Theory and Techniques, Vol. 40, 1992, pp. 2258-2266.
7. D.E. Root, S. Fan and J. Meyer, "Technology Independent Large-signal Non-quasi-static FET Models by Direct Construction from Automatically Characterized Device Data," Proceedings of the 21st European Microwave Conference, Stuttgart, Germany, September 1991, pp. 927-932.
8. A.H. Zaabab, Q.J. Zhang and M. Nakhla, "Analysis and Optimization of Microwave Circuits and Devices Using Neural Network Models," IEEE International Microwave Symposium Digest, San Diego, CA, 1994, pp. 393-396.
9. A.H. Zaabab, Q.J. Zhang and M. Nakhla, "A Neural Network Modeling Approach to Circuit Optimization and Statistical Design," IEEE Transactions on Microwave Theory and Techniques, Vol. 43, 1995, pp. 1349-1358.
10. K. Shirakawa, M. Shimizu, N. Okubo and Y. Daido, "A Large-signal Characterization of an HEMT Using a Multilayered Neural Network," IEEE Transactions on Microwave Theory and Techniques, Vol. 45, 1997, pp. 1630-1633.
11. A. Gilbert and C. White, "Implementing a Bias Dependent Large-signal Into a Well Known Circuit Simulator," Morgan State University, 1996.
12. C. Giusto and C. White, "Techniques for Small-signal Modeling," Applied Microwave and Wireless Magazine, May 2000, pp. 42-46.