# A Low-Power and High-Speed Dynamic PLA Circuit Configuration for Single-Clock CMOS\*

Chua-Chin Wang\* (王朝欽), Chi-Feng Wu (吳啓豐), Rain-Ted Hwang (黃潤德), and Chia-Hsiung Kao (高家雄)

Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424

Email: ccwang@ee.nsysu.edu.tw

<sup>\*</sup>This research was partially supported by the National Science Council under grant NSC 87-2215-E-110-010. Chua-Chin Wang is the contact author.

# Abstract

We present a low-power, high-speed CMOS circuit implementation of a NOR-NOR PLA that uses a single-phase clock. Static NAND gates are inserted as buffers between the two dynamic NOR planes to eliminate the racing problem and to shorten the duration of glitches, which reduces the dynamic power. The design also offers low static power dissipation, no ground switch, no charge sharing, and zero offset.

Key Words: single-clock, NOR-NOR PLA, programmable logic array, low-power, high-speed.

# 1. Introduction

A PLA consists of three parts: an input decoder, an AND plane, and an OR plane. PLAs can be implemented in either a static or a dynamic style, and the choice depends on the timing strategy. Modern CAD tools are required to support the integration of commonly used single-phase edge-triggered basic elements [1], including PLAs. Although Blair [2] proposed a single-clock PLA design, it leaves plenty of room for improvement in power and speed. We combine the dynamic, pseudo-NMOS, and domino logic design styles to develop a low-power, high-speed PLA design that uses only one clock. The basic idea is to insert a buffering NAND gate between the two NOR planes in order to eliminate the ground switch, reduce the duration of dynamic power spikes, and avoid racing problems. In this paper, we briefly discuss the drawbacks of prior PLA design alternatives and then present a new circuit design in which NAND gates serve as buffers between the NOR planes.

# 2. Low-Power and High-Speed (LP-HS) Single-Clock PLA Design

### 2.1 Prior PLA Design Alternatives

Before discussing the proposed PLA design, we list the shortcomings of several existing PLA design methods.

Pseudo-NMOS [6]: This is the simplest design style for realizing PLAs. A PMOS transistor with its gate tied to ground serves as a static load, while the desired function is implemented as a pull-down network of NMOS transistors. The main disadvantage of this approach is the DC path from Vdd to ground, which causes large static power dissipation. In addition, when the pull-up time is critical, the PMOS load device must be large, which in turn forces the pull-down NMOS transistors to be enlarged to avoid the ratio problem.

Dynamic NOR-NOR [6], [4]: The standard technique for single-phase operation is to charge (or discharge) the logic gate output to "high" (or "low") through a single PMOS (or NMOS) transistor. That transistor is then turned OFF, and the logic gate is resolved to determine whether the output is charged or discharged through the pre-defined evaluation pull-down (or pull-up) block. The major problem of this type of logic is racing when two dynamic logic gates are cascaded in series: the output of the first gate may wrongly turn the second gate ON or OFF, so that the final result is incorrect. In addition, the ground switch introduces a large parasitic capacitance, which reduces the speed.

Domino [6]: In domino logic design, the gates are either all precharged or all predischarged, and each is connected to the next stage through an inverter, which increases delay and power dissipation. In addition, the series NMOS transistors of the front AND plane cause a large pull-down delay.

Dhong's design [3]: Dhong et al.
proposed a PLA design approach that employs a predischarged OR array and a charge-sharing AND array. Because charge sharing is used, the output high voltage stays at approximately 3.0 V when Vdd is 5.0 V. The design therefore cannot provide a full voltage swing, in addition to a possible noise margin problem.

Blair's design [2]: Blair replaced the usual AND plane with a predischarging pseudo-NMOS NOR plane in order to shorten the series NMOS transistors in the evaluation block. However, the buffering inverters between the NOR plane and the OR plane, together with the inverter for the clock fed into the OR plane, add delay. In addition, the PMOS load transistor is constrained by the sizing ratio, so it is hard to drive a large capacitive load.

### 2.2 LP-HS PLA Circuit

Referring to Fig. 1, a gated NOR array replaces the usual AND array. All inputs of the first NOR plane are ANDed with the clock signal, clk, before they are fed into the gates of the evaluation block. Thus, during the precharging phase of the clock, clk = 0, node p is charged high. Another novel feature is the use of a NAND gate as the buffer. As shown in Fig. 1, this NAND gate precharges node q high and predischarges node r to ground, so that the output s is also precharged high. Note that the second plane of the PLA is a NOR plane.

When the clock goes high, clk = 1, which is the evaluation phase, the input is fed through the gating AND to the NMOS transistors in the evaluation block. At the same time, the buffering NAND gate acts as an inverter. If the pull-down NMOS network resolves "high," node p is discharged, which keeps q high and r low, and the state of output s remains unchanged. If the pull-down NMOS network resolves "low," node p remains "high," which flips q to low and r to high, and the output s is then pulled to ground.
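The precharge and evaluation behavior described above can be sketched as a small logic-level model. This is a minimal illustration, not the authors' netlist: the function name `lp_hs_product_line` and the `Nodes` tuple are hypothetical, node r is assumed to be the complement of q (inferred from the text and Table 1, since Fig. 1 is not reproduced here), and the second NOR plane is reduced to a single product term so that s is the complement of r. Timing, transistor sizing, and analog effects are ignored.

```python
from typing import Iterable, NamedTuple


class Nodes(NamedTuple):
    """Logic levels of the internal nodes and the output of one product line."""
    p: int
    q: int
    r: int
    s: int


def lp_hs_product_line(clk: int, inputs: Iterable[int]) -> Nodes:
    """Logic-level sketch of one product line of the LP-HS PLA.

    Assumptions (not taken from a netlist): r = NOT q, and the second
    NOR plane sees only this one product term, so s = NOT r.
    """
    # Every input of the first NOR plane is ANDed with the clock.
    gated = [clk & x for x in inputs]
    # First NOR plane: p is discharged if any gated input is high.
    p = 0 if any(gated) else 1
    # Buffering NAND gate: forces q high while clk = 0, inverts p while clk = 1.
    q = 1 - (p & clk)
    # Assumed inversion between q and r.
    r = 1 - q
    # Second NOR plane with a single pull-down term driven by r.
    s = 1 - r
    return Nodes(p, q, r, s)


if __name__ == "__main__":
    for clk, x in [(0, 0), (0, 1), (1, 1), (1, 0)]:
        print(f"clk={clk} input={x} ->", lp_hs_product_line(clk, [x]))
```

With a single input, the three distinct cases (clk = 0, clk = 1 with input high, clk = 1 with input low) give the same p, q, r, and s values as the rows of Table 1.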
In short, the function of the proposed PLA design is listed in Table 1.

### 2.3 Analysis of Speed and Power

Speed: The speed of a dynamic-style PLA depends on the charging (and discharging) speed of nodes p and s. The buffering NAND gate charges node q high during the precharging phase. As a result, the second NOR plane does not have to wait for node p to charge and can operate on its own: r goes low, which turns off the pull-down NMOS network, and s is then charged through the load PMOS. In addition, charging through the NAND gate reduces the load on node p, so the PMOS load can be made larger to accommodate a large capacitance.

Power: The static power dissipation of the proposed design compares favorably with that of the pseudo-NMOS implementation. In our design, static power is dissipated only when the pull-down NMOS network resolves "high"; otherwise there is no DC path from Vdd to ground. The most important factor regarding power dissipation is the buffering NAND gate. Statistically speaking, half of the input vectors will change the state of node p. In all of the prior PLA designs, the states of the inverters between the first plane (array) and the second plane (array) change as a consequence, which can result in considerable unnecessary dynamic power loss. For example, in the conventional dynamic NOR-NOR design and in Blair's design, the power supply has to charge or discharge the nodes of those buffering inverters whenever the state of node p flips. In contrast, in our PLA design, if the previous state of node p is "low," which sets q high, then the state of q does not change when the clock goes low. This saves about 50% of the dynamic power, provided that the input vectors are random.

# 3. Simulation and Analysis

Speed (Delay) Simulations: To verify the proposed low-power high-speed PLA configuration, we simulate a series of PLAs and compare the proposed design with the other PLA designs shown in Fig. 2. Table 2 lists the transistor sizes of each PLA design, all of which are implemented in TSMC 0.6 µm SPDM technology. Note that "ourpla" in the following tables denotes the proposed low-power high-speed PLA. Fig. 3 shows the timing responses of these PLA configurations. For the comparison, the output load of the first plane of each PLA is assumed to be 0.5 pF, the capacitance of the ground switch is also assumed to be 0.5 pF, the wire capacitance of the buffers is assumed to be 1.0 pF, and the output load of each PLA is set to 1.0 pF. Note also that the output of the Domino logic is complementary to its input. We therefore apply to the Domino logic an input waveform that is out of phase with the input shown at the top of Fig. 3 in order to compare the fall-time delay response. The waveforms in Fig. 3 are simulated with the CADENCE and HSPICE tools. The average delays of these PLAs are tabulated in Table 3. Note that Dhong's PLA, which uses charge sharing, has a shorter delay than our design because its output high voltage is only 3.0 V; it cannot reach the full voltage swing. If we attempt to restore the full voltage swing by adding an inverter at the end of Dhong's circuit, both the delay and the power will increase.

Power Dissipation Simulations: For the power consumption comparison, we conduct a series of simulations using the Monte Carlo method of HSPICE. The number of sweeps is 30, and the signal frequency is 10 MHz. The power dissipation results are tabulated in Table 4. The proposed PLA has the lowest average power consumption among these PLA design approaches. These results agree with what we expect in terms of dynamic power saving.

# 4. Conclusion

The proposed PLA configuration uses one NAND gate instead of one inverter between the product line and the output line. This shortens the charging and discharging times of the PLA and, consequently, the duration of the dynamic power dissipation from Vdd to GND. Statistically, the buffering NAND gate also tends to preserve its internal state between cycles. This approach makes a low-power and high-speed PLA possible. Its performance is verified by the simulations.

# References

- [1] M. Afghahi, "A robust single phase clocking for low power, high-speed VLSI applications," IEEE J. Solid-State Circuits, vol. 31, no. 2, pp. 247-253, Feb. 1996.
- [2] G. M. Blair, "PLA design for single-clock CMOS," IEEE J. Solid-State Circuits, vol. 27, no. 8, pp. 1211-1213, Aug. 1992.
- [3] Y. B. Dhong and C. P. Tsang, "High speed CMOS POS PLA using predischarged OR array and charge sharing AND array," IEEE Trans. Circuits & Systems-II: Analog and Digital Signal Processing, vol. 39, no. 8, pp. 557-564, Aug. 1992.
- [4] N. F. Goncalves and H. J. De Man, "NORA: A race-free dynamic CMOS technology for pipelined logic structures," IEEE J. Solid-State Circuits, vol. 18, pp. 261-266, June 1983.
- [5] R. Linz, "A low-power PLA for a signal processor," IEEE J. Solid-State Circuits, vol. 26, no. 2, pp. 107-115, Feb. 1991.
- [6] N. H. E. Weste and K. Eshraghian, "Principles of CMOS VLSI Design: A Systems Perspective," 2nd edition. Reading, MA: Addison-Wesley, 1993.
- [7] C.-C. Wang and M.-D. Jeng, "Power Estimation of Internal Nodes for Finite State Machine Using Gray Code Encoding in State Assignment," National Computer Symposium 1995, pp. 842-949, Dec. 1995.

| Signals                   | p | q | r | Output s |
|---------------------------|---|---|---|----------|
| clk=0, input = don't care | 1 | 1 | 0 | 1        |
| clk=1, input = high       | 0 | 1 | 0 | 1        |
| clk=1, input = low        | 1 | 0 | 1 | 0        |

Table 1: The function table of the proposed PLA design.
| Name     | PMOS (l:w) µm | NMOS (l:w) µm |
|----------|---------------|---------------|
| pseudo-N | 1.2:0.9       | 0.6:0.9       |
| NOR-NOR  | 0.6:2.25      | 0.6:0.9       |
| Domino   | 0.6:2.25      | 0.6:0.9       |
| Dhong    | 0.6:2.25      | 0.6:0.9       |
| Blair    | 0.6:2.25      | 0.6:0.9       |
| ourpla   | 0.6:2.25      | 0.6:0.9       |

Table 2: The transistor sizes of different PLA designs.

| Name     | Delay (ns) | $V_H$ (volts) |
|----------|------------|---------------|
| pseudo-N | 42.0       | 5.0           |
| NOR-NOR  | 42.6       | 5.0           |
| Domino   | 33.7       | 5.0           |
| Dhong    | 20.2       | 2.7           |
| Blair    | 30.5       | 5.0           |
| ourpla   | 24.0       | 5.0           |

Table 3: The average delay of different PLA designs.

| Name     | Average (µW) | Min. (µW)  | Max. (mW) |
|----------|--------------|------------|-----------|
| pseudo-N | 604.7585     | 531.2836   | 1.6221    |
| NOR-NOR  | 68.6455      | 17.2681E-6 | 2.8317    |
| Domino   | 61.7357      | 9.0306E-6  | 3.8929    |
| Dhong    | 76.1136      | 2.8798E-6  | 4.7348    |
| Blair    | 294.9889     | 1.8295E-3  | 4.7854    |
| ourpla   | 61.0667      | 23.8389E-6 | 5.0932    |

Table 4: The power dissipation comparison of different PLA designs.

Figure 1: Low-power and high-speed PLA circuit

Figure 2: Schematic diagrams of PLA design alternatives

Figure 3: Waveforms of PLA design alternatives
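One way to read Tables 3 and 4 together is to multiply each design's average delay by its average power. The short sketch below does this arithmetic with the tabulated values. The power-delay product is not a figure reported in this paper; it is used here only as an assumed illustrative metric, and the names `DELAY_NS`, `AVG_POWER_UW`, `power_delay_product_fj`, and `ranked_by_pdp` are hypothetical.

```python
from typing import List

# Average delay in ns, copied from Table 3.
DELAY_NS = {
    "pseudo-N": 42.0,
    "NOR-NOR": 42.6,
    "Domino": 33.7,
    "Dhong": 20.2,
    "Blair": 30.5,
    "ourpla": 24.0,
}

# Average power in µW, copied from Table 4.
AVG_POWER_UW = {
    "pseudo-N": 604.7585,
    "NOR-NOR": 68.6455,
    "Domino": 61.7357,
    "Dhong": 76.1136,
    "Blair": 294.9889,
    "ourpla": 61.0667,
}


def power_delay_product_fj(name: str) -> float:
    """Average power times average delay; ns * µW gives femtojoules."""
    return DELAY_NS[name] * AVG_POWER_UW[name]


def ranked_by_pdp() -> List[str]:
    """Design names sorted from lowest to highest power-delay product."""
    return sorted(DELAY_NS, key=power_delay_product_fj)


if __name__ == "__main__":
    for name in ranked_by_pdp():
        print(f"{name:9s} {power_delay_product_fj(name):10.1f} fJ")
```

When comparing the entries, keep in mind that Dhong's delay in Table 3 is measured at a reduced output high voltage of 2.7 V, so its product is not directly comparable with the full-swing designs.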