12.11 Performance-Driven Synthesis
Many logic synthesizers allow the use of directives. The pseudocomment in the following code directs the logic synthesizer to minimize the delay of an addition:
module add_directive (a, b, z);
input [3:0] a, b;
output [3:0] z;
//compass maxDelay 2 ns
//synopsys and so on.
assign z = a + b;
endmodule
These directives become complicated when we need to describe complex timing constraints. Figure 12.7(a) shows an example of a more flexible method to measure and specify delay using timing arcs (or timing paths). Suppose we wish to improve the performance of the comparator/MUX example from Section 12.2. First we define a pathcluster (a group of circuit nodes; see Figure 12.7b). Next, we specify the required time for a signal to reach the output nodes (the end set) as 2 ns. Finally, we specify the arrival time of the signals at all the inputs as 0 ns. We have thus constrained the delay of the comparator/MUX, measured between any input and any output, to be no more than 2 ns. The logic-optimization step will simplify the logic network and then map it to a cell library while attempting to meet the timing constraints.
FIGURE 12.7 Timing constraints. (a) A pathcluster. (b) Defining constraints.
Table 12.11 shows the results of a timing-driven logic optimization for the comparator/MUX. Comparing these results with the default optimization results shown in Table 12.3 reveals that the timing has dramatically improved: the critical-path delay was 2.43 ns with the default optimization settings, whereas with timing-driven optimization the delay to the outputs varies between 0.31 ns and 1.64 ns.
TABLE 12.11 Timing-driven synthesis reports for the comparator/MUX example of Section 12.2.

Command | Synthesizer output

> set pathcluster pc1
> set requiredTime 2 outp[0] outp[1] outp[2] -pathcluster pc1
> set arrivalTime 0 * -pathcluster pc1

> optimize

            Num    Gate Count  Tot Gate  Width     Total
Cell Name   Insts  Per Cell    Count     Per Cell  Width
---------   -----  ----------  --------  --------  --------
an02d1      1      1.3         1.3       12.0      12.0
in01d0      2      .8          1.5       7.2       14.4
mx21d1      2      2.2         4.5       21.6      43.2
nd02d0      2      1.0         2.0       9.6       19.2
oa03d1      1      1.8         1.8       16.8      16.8
oa04d1      1      1.3         1.3       12.0      12.0
---------   -----  ----------  --------  --------  --------
Totals:     9                  12.2                117.6

> report timing -allpaths

path cluster name: pc1
path type: maximum
----------------------------------------------------------------------
end node    current    required    slack
----------------------------------------------------------------------
outp[1]     1.64       2.00        .36     MET
outp[0]     1.64       2.00        .36     MET
outp[2]     .31        2.00        1.69    MET
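The slack column of the timing report is the required time minus the current arrival time at each end node. The following Python sketch recomputes that column from the values in Table 12.11. The function name `slack_report`, the dictionary layout, and the `VIOLATED` label are illustrative assumptions and not part of any synthesizer's interface.

```python
# Values copied from the timing report in Table 12.11 (all in ns).
REQUIRED_NS = 2.00
ARRIVAL_NS = {"outp[1]": 1.64, "outp[0]": 1.64, "outp[2]": 0.31}


def slack_report(arrival, required):
    """Return {end node: (slack in ns, status)} with slack = required - arrival."""
    report = {}
    for node, t in arrival.items():
        slack = round(required - t, 2)  # round away floating-point noise
        # "VIOLATED" is a hypothetical label; the report above only shows "MET".
        report[node] = (slack, "MET" if slack >= 0 else "VIOLATED")
    return report


if __name__ == "__main__":
    for node, (slack, status) in slack_report(ARRIVAL_NS, REQUIRED_NS).items():
        print(f"{node:8s} {slack:5.2f} {status}")
```

A positive slack means the signal arrives before it is required, so the constraint is met with that much margin.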
Figure 12.8 shows that timing-driven optimization and the subsequent mapping have simplified the logic considerably. For example, the logic for outp[2] has been reduced to a two-input AND gate. Using sis reveals how optimization works in this case. Table 12.12 shows the equations for the intermediate signal sel and the three comparator/MUX outputs in the BLIF. Thus, for example, the following line of the BLIF code in Table 12.12 (the first line following .names a0 b0 a1 b1 a2 b2 sel) includes the term a0·b0'·a1'·b1'·a2'·b2' in the equation for sel:

100000 1

There are six similar lines that describe the six other product terms for sel. These seven product terms form a cover for sel in the Karnaugh maps of Figure 12.5.
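To make the correspondence between BLIF rows and product terms concrete, the following Python sketch converts each row of the sel cover into a product term and evaluates the cover over all 64 input combinations. The cube strings are copied from Table 12.12. The helper names (`cube_to_term`, `cover_value`, `on_set`) are illustrative assumptions, and the sketch handles only single-output rows whose output value is 1.

```python
from itertools import product

# Input order and on-set rows copied from the ".names ... sel" block of Table 12.12.
INPUTS = ["a0", "b0", "a1", "b1", "a2", "b2"]
SEL_CUBES = ["100000", "101100", "--1000", "----10", "100011", "101111", "--1011"]


def cube_to_term(cube, names):
    """Render a cube as a product term, e.g. '----10' -> a2·b2'."""
    lits = [n if c == "1" else n + "'" for c, n in zip(cube, names) if c != "-"]
    return "·".join(lits)


def cover_value(cubes, bits):
    """Return 1 if any cube matches the input bits ('-' matches 0 or 1)."""
    return int(any(all(c in ("-", b) for c, b in zip(cube, bits)) for cube in cubes))


def on_set(cubes, n):
    """List every n-bit input assignment for which the cover evaluates to 1."""
    return ["".join(bits) for bits in product("01", repeat=n) if cover_value(cubes, bits)]


if __name__ == "__main__":
    for cube in SEL_CUBES:
        print(cube, "->", cube_to_term(cube, INPUTS))
    print(len(on_set(SEL_CUBES, len(INPUTS))), "of 64 input combinations set sel")
```

A dash in a row means the corresponding input does not appear in the product term, which is why a row such as ----10 stands for 16 input combinations at once.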
`timescale 1ns / 10ps
module comp_mux_o (a, b, outp);
input [2:0] a;
input [2:0] b;
output [2:0] outp;
supply1 VDD;
supply0 VSS;
mx21d1 B1_i1 (.I0(a[0]), .I1(b[0]), .S(B1_i6_ZN), .Z(outp[0]));
oa03d1 B1_i2 (.A1(B1_i9_ZN), .A2(a[2]), .B1(a[0]), .B2(a[1]), .C(B1_i4_ZN), .ZN(B1_i2_ZN));
nd02d0 B1_i3 (.A1(a[1]), .A2(a[0]), .ZN(B1_i3_ZN));
nd02d0 B1_i4 (.A1(b[1]), .A2(B1_i3_ZN), .ZN(B1_i4_ZN));
mx21d1 B1_i5 (.I0(a[1]), .I1(b[1]), .S(B1_i6_ZN), .Z(outp[1]));
oa04d1 B1_i6 (.A1(b[2]), .A2(B1_i7_ZN), .B(B1_i2_ZN), .ZN(B1_i6_ZN));
in01d0 B1_i7 (.I(a[2]), .ZN(B1_i7_ZN));
an02d1 B1_i8 (.A1(b[2]), .A2(a[2]), .Z(outp[2]));
in01d0 B1_i9 (.I(b[2]), .ZN(B1_i9_ZN));
endmodule
FIGURE 12.8 The comparator/MUX example of Section 12.2 after logic optimization with timing constraints. The figure shows the structural netlist, comp_mux_o2.v, and its derived schematic. Compare this with Figures 12.2 and 12.3.
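One rough way to see why outp[2] is so much faster than the other two outputs is to count the cells on the longest path to each output in the netlist of Figure 12.8. The Python sketch below does this. The connectivity is transcribed from the netlist, but treating every cell as one unit of delay is a simplifying assumption made here for illustration. A real timing analyzer uses per-cell delays and loading, so the depths indicate only the shape of the paths, not the reported delays.

```python
from functools import lru_cache

# Instance -> (output net, input nets), transcribed from the netlist in Figure 12.8.
NETLIST = {
    "B1_i1": ("outp[0]", ["a[0]", "b[0]", "B1_i6_ZN"]),
    "B1_i2": ("B1_i2_ZN", ["B1_i9_ZN", "a[2]", "a[0]", "a[1]", "B1_i4_ZN"]),
    "B1_i3": ("B1_i3_ZN", ["a[1]", "a[0]"]),
    "B1_i4": ("B1_i4_ZN", ["b[1]", "B1_i3_ZN"]),
    "B1_i5": ("outp[1]", ["a[1]", "b[1]", "B1_i6_ZN"]),
    "B1_i6": ("B1_i6_ZN", ["b[2]", "B1_i7_ZN", "B1_i2_ZN"]),
    "B1_i7": ("B1_i7_ZN", ["a[2]"]),
    "B1_i8": ("outp[2]", ["b[2]", "a[2]"]),
    "B1_i9": ("B1_i9_ZN", ["b[2]"]),
}

# Map each driven net to the nets feeding the cell that drives it.
DRIVER_INPUTS = {out: ins for out, ins in NETLIST.values()}


@lru_cache(maxsize=None)
def depth(net):
    """Number of cells on the longest path from any primary input to `net`."""
    if net not in DRIVER_INPUTS:  # primary input: no driving cell
        return 0
    return 1 + max(depth(src) for src in DRIVER_INPUTS[net])


if __name__ == "__main__":
    for out in ("outp[0]", "outp[1]", "outp[2]"):
        print(out, "logic depth:", depth(out))
```

Under this unit-delay assumption, outp[0] and outp[1] each sit behind five cells (through the shared select signal B1_i6_ZN), while outp[2] sits behind a single AND gate.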
In addition, sis must be informed of the don’t care values (called the external don’t care set) in these Karnaugh maps. This is the function of the PLA-format input that follows the .exdc line. Now sis can simplify the equations, taking the don’t care values into account, using a standard script, script.rugged, that contains a sequence of sis commands. This particular script uses a series of factoring and substitution steps. The output (Table 12.12) reveals that sis finds the same equation for outp[2] (named outp2 in the sis equations):

{outp2} = a2 b2

The other logic equations in Table 12.12 that sis produces are also equivalent to the logic in Figure 12.8. The technology-mapping step hides the exact details of the conversion between the internal representation and the optimized logic.
TABLE 12.12 Optimizing the comparator/MUX equations using sis.

sis input file (BLIF)

.model comp_mux
.inputs a0 b0 a1 b1 a2 b2
.outputs outp0 outp1 outp2
.names a0 b0 a1 b1 a2 b2 sel
100000 1
101100 1
--1000 1
----10 1
100011 1
101111 1
--1011 1
.names sel a0 b0 outp0
1-1 1
01- 1
.names sel a1 b1 outp1
1-1 1
01- 1
.names sel a2 b2 outp2
1-1 1
01- 1
.exdc
.names a0 b0 a1 b1 a2 b2 sel
000000 1
110000 1
001100 1
111100 1
000011 1
110011 1
001111 1
111111 1
.end

sis results

</usr/user1/msmith/sis> sis
UC Berkeley, SIS Development Version
(compiled 11-Oct-95 at 11:50 AM)
sis> read_blif comp_mux.blif
sis> print
{outp0} = a0 sel' + b0 sel
{outp1} = a1 sel' + b1 sel
{outp2} = a2 sel' + b2 sel
sel = a0 a1 a2 b0' b1 b2
    + a0 a1 a2' b0' b1 b2'
    + a0 a1' a2 b0' b1' b2
    + a0 a1' a2' b0' b1' b2'
    + a1 a2 b1' b2
    + a1 a2' b1' b2'
    + a2 b2'
sis> source script.rugged
sis> print
{outp0} = a0 sel' + b0 sel
{outp1} = a1 sel' + b1 sel
{outp2} = a2 b2
sel = [9] a2 b0'
    + [9] b0' b2'
    + a1 a2 b1'
    + a1 b1' b2'
    + a2 b2'
[9] = a1 + b1'
sis> quit
</usr/user1/msmith/sis>
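The effect of the external don't care set can be checked by brute force. The following Python sketch evaluates the original BLIF cover and the optimized sis equations from Table 12.12 for all 64 input combinations and counts where they disagree. The function names and the counting scheme are illustrative assumptions. The on-set, don't care set, and equations are transcribed from the table.

```python
from itertools import product

INPUTS = ["a0", "b0", "a1", "b1", "a2", "b2"]
# On-set and external don't care set for sel, copied from the BLIF in Table 12.12.
SEL_ON = ["100000", "101100", "--1000", "----10", "100011", "101111", "--1011"]
SEL_DC = ["000000", "110000", "001100", "111100",
          "000011", "110011", "001111", "111111"]


def in_cover(cubes, bits):
    """True if the bit string matches any cube ('-' matches 0 or 1)."""
    return any(all(c in ("-", b) for c, b in zip(cube, bits)) for cube in cubes)


def mux(a, b, sel):
    """outp = a sel' + b sel, as in the sis equations."""
    return (a & (1 - sel)) | (b & sel)


def original(bits):
    a0, b0, a1, b1, a2, b2 = map(int, bits)
    sel = int(in_cover(SEL_ON, bits))
    return sel, (mux(a0, b0, sel), mux(a1, b1, sel), mux(a2, b2, sel))


def optimized(bits):
    a0, b0, a1, b1, a2, b2 = map(int, bits)
    nb0, nb1, nb2 = 1 - b0, 1 - b1, 1 - b2
    n9 = a1 | nb1  # [9] = a1 + b1'
    sel = ((n9 & a2 & nb0) | (n9 & nb0 & nb2) | (a1 & a2 & nb1)
           | (a1 & nb1 & nb2) | (a2 & nb2))
    return sel, (mux(a0, b0, sel), mux(a1, b1, sel), a2 & b2)


def compare():
    """Count disagreements between the two descriptions over all 64 inputs."""
    counts = {"sel_care": 0, "sel_dc": 0, "outputs": 0}
    for combo in product("01", repeat=len(INPUTS)):
        bits = "".join(combo)
        (s_orig, o_orig), (s_opt, o_opt) = original(bits), optimized(bits)
        if s_orig != s_opt:
            counts["sel_dc" if in_cover(SEL_DC, bits) else "sel_care"] += 1
        if o_orig != o_opt:
            counts["outputs"] += 1
    return counts


if __name__ == "__main__":
    print(compare())
```

In this check the two versions of sel differ only for input combinations listed in the don't care set, and the three outputs never differ. This is the freedom the optimizer exploits: every don't care row has a0 = b0, a1 = b1, and a2 = b2, so the value chosen for sel there cannot change any output.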