optimizing RTL via code profiling... 

Hi,

We have some long running verilog simulations.  I'm trying to improve
the simulation performance and would like to get the best increase with
the least effort.  I was wondering if there are any code profiling
tools that would help me identify where verilog is spending most of its
time.  I'm not sure if this is even possible in an event based language.

I suspect that if you look at the number of events generated or some
such metric you could compare 2 coding styles and select one for
simulation time efficiency.  I have even considered rewriting any
especially "slow" code in C/PLI.  I am looking for a tool to help me
identify which modules/tasks blocks are good candidates.  I have my
regression tests to assure that my new model is functionally correct
and am truly only concerned about efficiency and basic functionality.

I followed the discussion on blocking vs non-blocking. I tend to follow
the non-blocking approach, but I wonder if the large sensitivity list
causes that block to be evaluated *many* times as the inputs ripple in,
with only the last evaluation being important. I considered making an
ifdef with a dummy sensitivity list.  I envision a +define+FAST that
loads/ifdefs whatever and runs significantly faster at the possible
expense of a few "checks". Has anyone else looked at this?

thanks,
randy melton
Atmel Corp.




Sat, 08 Mar 2003 03:00:00 GMT  

Quote:

> I suspect that if you look at the number of events generated or some
> such metric you could compare 2 coding styles and select one for
> simulation time efficiency.  I have even considered rewriting any
> especially "slow" code in C/PLI.  I am looking for a tool to help me
> identify which modules/tasks blocks are good candidates.  I have my
> regression tests to assure that my new model is functionally correct
> and am truly only concerned about efficiency and basic functionality.

> I followed the discussion on blocking vs non-blocking. I tend to follow
> the non-blocking approach, but I wonder if the large sensitivity list
> causes that block to be evaluated *many* times as the inputs ripple in,
> with only the last evaluation being important. I considered making an
> ifdef with a dummy sensitivity list.  I envision a +define+FAST that
> loads/ifdefs whatever and runs significantly faster at the possible
> expense of a few "checks". Has anyone else looked at this?

Funny you should mention this - I'm working on performance tuning in our
Verilog compiler at the moment. For us at least, blocking vs. non-blocking
vs. assignments with delays is probably a wash, mostly because we
heavily optimise them. They all probably get 'gang-scheduled' on a clock edge
anyway, so big sensitivity lists are probably not a big deal, provided
all their inputs are from flops. Wires and outputs from other
always statements may result in things being run more than once.
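
To make the "run more than once" case concrete, here is a minimal sketch (not from our compiler; the module name, signal names and delay values are all made up for illustration). Two wires derived from the same register arrive at different times, so the block that listens to both is evaluated once per arrival, and only the last evaluation matters.

```verilog
// Hypothetical demo: names and delays are illustrative assumptions.
module ripple_demo;
    reg  a;
    wire p, q;
    reg  y;
    integer evals;      // counts evaluations of the always block

    // Two paths from the same source that settle at different times.
    assign #1 p = a;
    assign #2 q = a;

    // Wakes up whenever either input changes.
    always @(p or q) begin
        y = p & q;
        evals = evals + 1;
    end

    initial begin
        a = 0;
        #10 evals = 0;  // discard start-up activity
        a = 1;          // a single change at the source
        #10 $display("evaluations for one change of a: %0d", evals);
        $finish;
    end
endmodule
```

With these delays the block should be evaluated twice for the single change on a. With zero-delay wires the count depends on how a given simulator orders events within the time step, so treat the number as implementation-specific.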

So, with that caveat, here's my recommended laundry list from our compiler.
It's not complete and of course YMMV in other people's implementations.

What you do want to do is reduce the number of events. Things to avoid:

        - wires with delays
        - force/release/assign/deassign
        - non-local disables

Use whole-register models rather than individual flop models (let
Synopsys make them for you if at all possible).

Use registers rather than wires if possible.

Keep wires with multiple drivers to a minimum (tri-state buses and the like).

Don't use:

        always @(...) begin
                x = 0;
                ....
                if (a)
                        x = 1;
        end

Instead, avoid the potential glitch/event pair on x. Use:

        always @(...) begin
                if (a) begin
                        x = 1;
                end else begin
                        x = 0;
                end
        end

(I know this is contrary to a normal synthesis practice - but you're after
simulation performance ...)
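
A rough way to see the difference between the two styles above is a small testbench like the sketch below. Everything in it (module name, the extra input b, the counters) is assumed for illustration. The block is re-evaluated by b while a stays at 1, which is the case where the default-then-override style writes 0 and then 1 to a register that already holds 1.

```verilog
// Hypothetical testbench: all names here are illustrative assumptions.
module glitch_demo;
    reg a, b;
    reg x_default, y_default;   // default-then-override style
    reg x_ifelse,  y_ifelse;    // if/else style
    integer n_default, n_ifelse;

    // Style 1: x is cleared, then conditionally set again.
    always @(a or b) begin
        x_default = 0;
        y_default = b;          // stands in for the elided statements
        if (a)
            x_default = 1;
    end

    // Style 2: x is written exactly once per evaluation.
    always @(a or b) begin
        y_ifelse = b;
        if (a) begin
            x_ifelse = 1;
        end else begin
            x_ifelse = 0;
        end
    end

    // Count wake-ups that a block listening to x would see.
    always @(x_default) n_default = n_default + 1;
    always @(x_ifelse)  n_ifelse  = n_ifelse + 1;

    initial begin
        #1 a = 1; b = 0;
        #1 n_default = 0; n_ifelse = 0;   // ignore start-up activity
        repeat (4) #1 b = ~b;             // re-evaluate with a held at 1
        #1 $display("default style: %0d  if/else style: %0d",
                    n_default, n_ifelse);
        $finish;
    end
endmodule
```

In an event-driven simulator the first counter would be expected to grow while the second stays at zero, but an optimising compiler may remove the intermediate write, so the exact numbers are implementation-specific.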

And the number one thing to avoid: don't compile in PLIs
unless you are going to use them, ESPECIALLY wave libraries
(VCD/Debussy/Signalscan etc). Use `ifdef, DON'T use plusargs. Instead,
when you start a regression cycle, build 2 versions of your test (with and
without the dump code) and rerun failures with the dumping version.
In our compiler, at least, if there's any possibility that
a write to a variable might cause an event, we have to disable a
whole range of inter-assignment optimisations and allocate storage
for managing the events. The result is slower, larger code
if something like $dumpvars is compiled in somewhere, even if
it isn't called (if it is, things get even slower of course :-)
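
As a minimal sketch of the `ifdef approach (the macro name DUMP_WAVES, the module name and the file name are assumptions, not anything our compiler requires), the dump calls are only present in the build that defines the macro:

```verilog
// Hypothetical top-level testbench: names are illustrative assumptions.
module tb_top;
    reg clk;

    initial clk = 0;
    always #5 clk = ~clk;

`ifdef DUMP_WAVES
    // Only compiled into the debug build, so the fast build
    // contains no $dumpvars call at all.
    initial begin
        $dumpfile("waves.vcd");
        $dumpvars(0, tb_top);
    end
`endif

    initial #1000 $finish;
endmodule
```

The fast regression build leaves the macro undefined; the debug build defines it at compile time (many simulators accept something like +define+DUMP_WAVES, but check your own tool's manual) and is used only to rerun failures.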

        Paul Campbell



Sat, 08 Mar 2003 03:00:00 GMT  

Quote:

>Hi,

>We have some long running verilog simulations.  I'm trying to improve
>the simulation performance and would like to get the best increase with
>the least effort.  I was wondering if there are any code profiling
>tools that would help me identify where verilog is spending most of its
>time.  I'm not sure if this is even possible in an event based language.

Verilog-XL has limited code coverage built in. The relevant
system tasks you want to explore are $startprofile, $stopprofile, $reportprofile
and $listcounts.


Sun, 09 Mar 2003 03:00:00 GMT  
You should also check your simulator manual for specific ways to speed it up. For
VCS, the manual has specific sections on coding styles and compilation and runtime
switches that affect speed.

Of course, in terms of speed impact, the manual's suggestions may be
non-portable...

-cb


