Next: RTL Interconnect Power Estimation
Up: Observations and Motivational Examples
Previous: Spurious switching activity (SSA)
In this section, we give an example to motivate the techniques
that follow. Fig. 4(a) shows the CDFG for
Diffeq and one of its possible schedules (the numbers on
the right indicate clock cycles). Suppose that the datapath contains one
adder, one subtracter, and three multipliers. Then
the bindings for the addition and subtraction operations are fixed.
Meanwhile, how the multiplication operations are bound to the
three multipliers will significantly affect the interconnect
structure. For example, Fig. 5 shows two
bindings of operations to functional units (an oval depicts a
functional unit). It is clear that the bindings in
Fig. 5(a) have fewer data exchanges between
different functional units than the bindings in
Fig. 5(b). Binding of variables to registers
has a similar effect. Functional unit and register binding
therefore both have a significant impact
on the interconnect structure. Nevertheless, interconnect power
optimization through judicious binding has not been adequately
explored.
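The effect of binding on data exchanges can be illustrated with a minimal sketch. The data-flow edges, operation names (m1 to m4), and unit names (MUL1 to MUL3) below are hypothetical and are not taken from the Diffeq CDFG in Fig. 4(a); schedule feasibility is ignored. The sketch only counts how many producer-consumer pairs end up on different functional units under a given binding.

```python
def count_inter_unit_transfers(edges, binding):
    """Count data-flow edges whose producer and consumer are bound to different units."""
    return sum(1 for src, dst in edges if binding[src] != binding[dst])


# Hypothetical data-flow edges among four multiplications (NOT the Diffeq CDFG).
edges = [("m1", "m2"), ("m2", "m3"), ("m1", "m4")]

# Binding A keeps the dependent chain m1 -> m2 -> m3 on a single multiplier.
binding_a = {"m1": "MUL1", "m2": "MUL1", "m3": "MUL1", "m4": "MUL2"}
# Binding B scatters the same chain across all three multipliers.
binding_b = {"m1": "MUL1", "m2": "MUL2", "m3": "MUL3", "m4": "MUL2"}

transfers_a = count_inter_unit_transfers(edges, binding_a)
transfers_b = count_inter_unit_transfers(edges, binding_b)
print(transfers_a, transfers_b)
```

Under these assumptions, the binding that keeps dependent operations on the same unit needs fewer inter-unit transfers, which is the kind of difference Fig. 5 illustrates.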
Figure 4: Diffeq: (a) scheduled CDFG, and (b) an RTL implementation.
Figure 5: Different bindings for the multiplication operations.
Next, let us consider an RTL implementation of Diffeq, as
shown in Fig. 4(b), with one multiplier,
one subtracter, one adder, 12 multiplexers and six registers. The schedule is the same as
that in Fig. 4(a). Let us examine the output
network of the multiplier.
Values of variables t1, t2, t3, t4, t5
and y1 are transmitted to registers Reg1, Reg2 and
Reg3. Each of these registers, however, only
needs a subset of these variables. Such data broadcasting
introduces SSA not only in data transfer wires but also in the
steering logic along these wires, such as multiplexer mux1. The
impact of interconnect SSA is not limited to the interconnects themselves. For
example, the multiplier, the adder, and the subtracter need only
variables t3, y1, and t6 from Reg2, respectively, yet Reg2 sends
the values of all these variables to all three functional units. As
in the output network of the multiplier, the data broadcasting
causes SSA in the wires and along the multiplexer tree above the
functional units. Even worse, according to the schedule in
Fig. 4(a), when the transition from t3 to t6
occurs in Reg2, the adder is idle (cycles 10 to 11), and when the
transition from t6 to y1 occurs, the subtracter is idle (cycles 13
to 14). Therefore, if care is not taken, the SSA may propagate
into the adder and the subtracter. By eliminating SSA in the
output networks, we can therefore suppress a significant portion
of the SSA in the functional units as well. In
Section V, we offer techniques to suppress
interconnect SSA.
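The Reg2 example can be made concrete with a rough sketch. It assumes that Reg2 holds only the three values named in the text, in the order t3, t6, y1, and that every value is broadcast to all three functional units. It counts unneeded value deliveries per unit, which is a coarse stand-in for SSA; it does not model bit-level switching or power. The function name is hypothetical.

```python
def spurious_deliveries(register_values, needs):
    """For each consumer, count broadcast values it receives but does not need."""
    return {
        unit: sum(1 for value in register_values if value not in wanted)
        for unit, wanted in needs.items()
    }


# Successive contents of Reg2, following the transitions described in the text.
reg2_values = ["t3", "t6", "y1"]

# The single variable each functional unit actually needs from Reg2.
needs = {
    "multiplier": {"t3"},
    "adder": {"y1"},
    "subtracter": {"t6"},
}

ssa = spurious_deliveries(reg2_values, needs)
print(ssa)
```

In this simplified model, each unit receives two values it never uses, so two thirds of the deliveries from Reg2 are spurious.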
Lin Zhong
2003-10-11