Posts filed under 'EDA'

Synthesis and Implementation (P&R) of low-power realtime H.264/AVC baseline decoder

As the title says, anyone interested in an H.264 video decoder can check out the Verilog code from OpenCores. Many kudos to the author for providing a detailed spec. There are few cores with a decent feature set, and this is one of them. For details and features, please visit the OpenCores/author's webpage at:

http://www.opencores.org/projects.cgi/web/nova/overview

Apart from offering valuable cores for reuse as IP, OpenCores is also a valuable resource for anyone who wants to learn the different aspects of design from RTL to GDSII.

Design Stats & snapshots

Full Chip Layout

Full Chip Layout - Nova

Clock-Tree

Nova-CTS

Congestion

nova-congestion

Pin and Cell Density

Nova - Cell Density, Nova - Pin Density

Technology Node: 45nm process
Clock Freq: 333 MHz (Please note, I have changed the clock frequency. I was scaling it to study design feasibility. It might be possible to close timing at a higher frequency, but I haven't tried that, as this is intended purely for fun and learning purposes.)


  • Used non-default rules with double spacing during CTS
  • Enabled crosstalk delay and crosstalk noise based optimizations during detailed routing

Std Cell Area: 0.744 mm² with 70% utilization and 3.8 m wire length
Cell Count: 176K
Scan Insertion: Done. Number of scan chains: 6
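
The stats above imply a couple of derived numbers that are handy for sanity checks. Here is a minimal sketch in Python (not part of the flow; the function names are made up for illustration). It assumes utilization means standard cell area divided by placeable core area, which the post does not state explicitly.

```python
# Back-of-the-envelope numbers derived from the stats quoted in the post.
STD_CELL_AREA_MM2 = 0.744   # standard cell area from the post
UTILIZATION = 0.70          # utilization from the post
CELL_COUNT = 176_000        # "176K" cells from the post


def core_area_mm2(cell_area_mm2: float, utilization: float) -> float:
    """Core area implied by the cell area.

    Assumption: utilization = standard cell area / core area.
    """
    return cell_area_mm2 / utilization


def avg_cell_area_um2(cell_area_mm2: float, cell_count: int) -> float:
    """Average standard cell area in square microns (1 mm^2 = 1e6 um^2)."""
    return cell_area_mm2 * 1e6 / cell_count


if __name__ == "__main__":
    core = core_area_mm2(STD_CELL_AREA_MM2, UTILIZATION)
    avg = avg_cell_area_um2(STD_CELL_AREA_MM2, CELL_COUNT)
    print(f"implied core area : {core:.3f} mm^2")
    print(f"average cell area : {avg:.2f} um^2")
```

An unusually small average cell area for the library is one hint that the cell count, rather than the area, is what will stress placement and routing.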

EDA Tools: Talus Design for logic synthesis; Talus Vortex for placement, clock tree synthesis and routing
Power: We haven't really done many power optimizations except clock gating, which helps dynamic power. I will try to post another version of the script with more power optimizations that save leakage/dynamic power without hurting timing. The cell count, area and wire length are a direct consequence of tightening the clock frequency.

That is another reason why over-constraining is bad :). As long as the tool is predictable and has very good front-end to back-end correlation, you don't need to over-constrain. Synthesis tools like Talus Design, and those from other vendors, offer correlation in the 6-10% range, or maybe less in some cases (please note this largely depends on the design, its timing criticality, whether it is macro intensive, etc.).

Based on the spec, I have written some timing constraints to take the design through the entire flow. I haven't had time to close timing, but it is nearly there. The final timing is about -50ps, which is easily fixable, with few signal DRCs left (fewer than 30).

The full flow script, timing constraints and related scripts are pasted below.
############
import volcano  library.volcano
set m /work/nova/nova
set l /library

config rtl clockgate on -integrated $l/ICG
config rtl verilog 2000 on
config map clockedge on
config rtl datapath physical off
config primary unique off
config timing clockgating on
config timing inout net on
config timing inout cell off
config timing slew default generation on
config sdc unit capacitance p
config sdc unit resistance k
config sdc unit time n
config timing clock multiple on
config timing borrow method relax
config timing borrow automatic on
config timing slew mode largest
config timing propagate constants combinational
config timing check recovery on
config volcano -crash.volcano off
config snap error_volcano off
config async concur off
config snap output on [config snap level] volcano prefixtime-dft-insert-scan
config message limit SWP-8 1
config message limit MAP-111 1

config multithread -thread auto -feature all -gr on

set rtl_list [glob ../src/*.v ]
eval import rtl -verilog -include ../src $rtl_list

fix rtl $m

config snap replace [config snap level] fix-netlist-sweep " run gate sweep $m -hier -cross_boundary -uniquify "
run bind logical -no_uniquify $m $l

force dft scan style $m muxed_flip_flop
force gate opt_mode $m delay -hier
fix netlist $m $l -effort high
export verilog netlist $m snap/compile_mingate.v -minsize

# Scan Insertion starts here
enwrap "config dft scan lockup on
config dft scan shift_register on
config dft setup clock_groups on
config dft repair violation clock_violation on
config dft repair violation comb_loop on
config dft repair violation disable_tribus on
config dft repair violation latch on
config dft repair violation reset_violation on

" prefixtime-dft-configure $m

enwrap {
force dft scan clock $m [list $m/mpin:clk]

for {set sid 0} {$sid < 6 } {incr sid} {
data create port [data only model_entity $m] SI${sid} -direction in
data create port [data only model_entity $m] SO${sid} -direction out
force dft scan chain $m $sid SI${sid} SO${sid}
}

force dft scan control $m $m/mpin:SE scan_enable

} prefixtime-dft-force $m

enwrap "run dft check $m -pre_scan
run dft scan insert $m
run dft check $m -post_scan
run dft scan trace $m " prefixtime-dft-insert-scan $m

# Scan Insertion Ends here

# Source Timing Constraints
source -echo $SCRPATH/nova_constraints.tcl

enwrap {
force undriven $m 0
run gate sweep $m -hier -cross_boundary
} pre-ftime-sweep $m

#To enable pipelining with 5 stages using clk as the clock, and also to enable retiming, uncomment the 2 lines below
#force gate pipeline $m $l -stage 5 -clk $m/clk
#force gate retime $m on -hier

# Turn on netlist level clock gating. Helps save area by eliminating feedback
# loops with muxes and helps routing because there are fewer pins

config gate clockgate on

# Use sized netlist flow and use new mapper with -smap option
set  FT_FLOW sized
fix time $m $l -effort high -timing_effort high -size -smap

enwrap {
export verilog netlist $m snap/logic_opt_mingate.v
} ftime-export-verilog $m

enwrap {
config hierarchy separator "_"
data flatten $m
} flatten-design $m

### Floorplanning  starts here

enwrap {
force model routing layer $m highest M6

force plan net VDD $m -usage power -port VDD
force plan net VSS $m -usage ground -port VSS

config optimize leakage on -auto on
config capacitance congestion true

} floorplan-configs $m

set fp [ data create floorplan $m fp ]
run floorplan size $fp -target_total_util 0.5 -aspect_ratio 1.0
config autoflow $m set fix_plan
config autoflow $m set fix_shape

fix power $m $l -default_mesh -auto_domains -mesh_range { M4 M5 }

enwrap { run plan create pin $m -incremental } pin-pl-incr $m

config flow $m set floorplan
check design $m

if {[data exists /macro_lib ] } {
export volcano ./snap/floorplan.volcano -object /work -object /macro_lib } else {
export volcano  ./snap/floorplan.volcano -object /work }

#Floorplanning ends here

#Spare Cell Insertion
config snap procedure spare_flops {
global SCRPATH
#Spare cells to be created are defined in spare_cell.tcl; they are attached to the VSS net and belong to the floorplan of $m
run plan create sparecell $m $SCRPATH/spare_cell.tcl VSS -floorplan [data only model_floorplan $m ]
run plan identify sparecell $m -file snap/spare_cells_identified.tcl
run plan place sparecell $m
puts "\n\n Executing spare flops proc \n\n" }

config snap output on [config snap level] spare_flops -snap after fix-cell-place-global1

#force boundary_cell $m -cell "*BND_UF*" -blockage buffer

#Sub Cap and End Cap Insertion
enwrap {
run plan create subcaps $m -subcap [ find_model FILL $l ] -stepdistance 120u
set welltie [find_model FILL $l ]
run plan create endcaps $m -left_endcap $welltie -right_endcap $welltie
} cap-insertion $m

# While Scan reordering, dont order the first flop
config scan optimize $m -not_order_first_flop on

fix cell $m $l -timing

fix opt global $m $l -effort high -label 1

enwrap {
config clock auto_skew_balance on
force plan clock $m -buffer $l/BUF/BUF_HYPER
force plan clock $m -inverter $l/INV/INV_HYPER
force plan clock $m -max_skew 50ps -max_useful_skew 40ps
force timing adjust_latency $m boundary_average } PRE-FIXCLOCK $m
config snap replace  [ config snap level ] fix-clock-route-clock "run route clock $m $l -nondefault_mode double_s -shielding_mode noleaf -effort high -overdrive 3"
fix clock $m $l -weight skew -critical_slack 0ps -clock_effort high -timing -nondefault_mode double_s -shielding_mode noleaf

fix opt global $m $l -effort high -dont_move_reg -critical_slack 0ps -secondary_effort off -label 2

fix hold $m $l

enwrap {
config prepare access mode enhanced -ui on
run prepare model access $m -reset
run prepare model access $m
config route flow adaptive $l on
} pre-fix-wire-eap-wrapper $m

fix wire $m $l -slew -crosstalk_delay -crosstalk_effort high

enwrap { config condition case both } rod-case-both $m

enwrap { run place detail $m -eco } rod-rpd-eco $m

run optimize detail $m $l -dont_move_reg -critical_slack 10ps -optimize all -hold_fix_hold_margin 10p -hold_fix_setup_margin 100p -useful_skew

# DRC Cleanup
enwrap {
check route drc $m
check route spacing_short $m
run route refine $m
run route final -incremental -reroute_tile_width 30 $m
run route final -incremental -reroute_tile_width 60 $m
run route final -incremental -reroute_tile_width 20 $m
run route final -incremental -reroute_tile_width 10 $m
run route final -incremental -reroute_tile_width 30 -effort maximum $m
run route final -incremental -reroute_tile_width 60 -effort maximum $m
run route final -incremental -reroute_tile_width 10 -effort maximum $m
run route final -incremental -reroute_tile_width 30 -effort maximum $m
run route final -incremental -reroute_tile_width 50 -effort maximum $m
run route refine $m -type nontrivial
run route refine $m -type notch
run route refine $m -type island
check route antenna $m
check route drc $m
} post-fix-wire-opt-wrapper $m

###########

####Timing Constraints #######

force timing clock {mpin:clk} 3ns -waveform { -rise 0p -fall 1.5ns} -context /work/nova/nova

###############################################################################
# Collect all inputs with some exclusions
###############################################################################
set Inputs [data list "model_pin -direction in" $m]
set Bidirs [data list "model_pin -direction inout" $m ]

# Add the inout ports to the list of inputs

set Inputs [concat $Inputs [data list "model_pin -direction inout" $m ]]

# Remove the clock ports from the list of constrainable inputs

foreach clockPort [data list model_clock $m ] {
set Inputs  [lsearch -all -inline -not -exact $Inputs $clockPort]
}

puts "The number of inputs is     : [llength $Inputs]"

set io_delay_max [ expr [ expr 1.0 / 1500000.0 ] * 0.9 ]
set io_delay_min [ expr [ expr 1.0 / 1500000.0 ] * 0.1 ]

foreach iport $Inputs {
puts "Adding input delay on port $iport"
force timing delay clk $iport -time {-worst  6e-07p  } -type rising_edge  -context $m
force timing delay clk $iport -time {-best 6.66666666667e-08p } -type rising_edge  -context $m
}

###############################################################################
# Collect all outputs with some exclusions
###############################################################################

set Outputs [data list "model_pin -direction out" $m]
set Bidirs [data list "model_pin -direction inout" $m ]
# Add the inout ports to the list of outputs
set Outputs [concat $Outputs $Bidirs]
puts "The number of outputs is    : [llength $Outputs]"

foreach oport $Outputs {
puts "Adding output delay timing check to output port $oport"
force timing check $oport clk -time  6e-07p -type setup_rising  -context $m
force timing check $oport clk -time  6.666666667e-08p -type hold_rising   -context $m
}
###############################################################################
# Assign a default input transition time or set a driving cell if you rather.
###############################################################################
foreach iport $Inputs {
puts "Setting input transition on port $iport"
force timing slew  $iport { -rise {-max 0318p} }
force timing slew  $iport { -rise {-min 0008p} }
force timing slew  $iport { -fall {-max 0314p} }
force timing slew  $iport { -fall {-min 0004p} }
}

###############################################################################
# Assign default loads
###############################################################################
foreach oport $Outputs {
puts "Setting load on port $oport"
force load capacitance $oport {-worst 0.181p }
force load capacitance $oport {-best  0.001p }
}

##########################################################################
puts "Multicycle the async applied inputs"
##########################################################################
set a01  [data list "model_pin -direction in" $m -names {.*mpin:reset_n}]

catch {set a01 [concat $a01] }
foreach p $a01 {
puts "Adding MCP through scan control input $p"
force timing multicycle -from $p -cycles 7 -type setup -reference start
force timing multicycle -from $p -cycles 6 -type hold  -reference start
}

##########################################################################
puts "Set timing constants"
##########################################################################
# Define functional mode
force timing constant $m/mpin:SE 0

return
##########################################################################
puts "Multicycle the output ports that are async in nature"
##########################################################################
set a01 [data list "model_pin -direction out" -names .*mpin:ready.* $m]
set a02 [data list "model_pin -direction out" -names .*mpin:erfound.* $m]
catch {set a01 [concat $a01 $a02] }
foreach aport $a01 {
puts "Setting MCP to async output port $aport"
force timing multicycle -to $aport -cycles 5 -type setup -reference start
force timing multicycle -to $aport -cycles 4 -type hold  -reference start
}

##########################################################################
puts "Set max delay on IO through path, use rarely"
# Mindelay and Maxdelay should be the last resort.
##########################################################################
set a01  [data list "model_pin -direction in" $m -names {.*mpin:some_goofy_input}]
set b01  [data list "model_pin -direction out" $m -names {.*mpin:some_goofy_output}]
foreach p $a01 {
puts "Adding min/max delay through input $p"
force timing maxdelay 1.0n -from $p -to $b01
force timing mindelay 0.5n -from $p -to $b01
}

#####Spare Cells Addition  (spare_cells.tcl) ######

spare_and2 $l/AND2/AND4 20
spare_inv  $l/INV/INVD3 20
spare_buf  $l/BUF/BUF2P 20
spare_nand $l/NAND2/ND4P 20
spare_or $l/OR2/OR4P 20
spare_nor $l/NOR4/NRD2P 20
spare_ff  $l/SDFF/SDFF2 20

###########
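
The constraint file above sets io_delay_max and io_delay_min with expr, but the force timing delay lines use hard-coded values. As a cross-check, here is a minimal Python sketch (not part of the original flow; the function names are made up for illustration) that reproduces the arithmetic and confirms the hard-coded numbers match the computed ones. It also derives the clock frequency and duty cycle implied by the 3ns period and the 1.5ns falling edge.

```python
# Reproduce the arithmetic used in the timing constraints.

def io_delays(base: float, max_frac: float = 0.9, min_frac: float = 0.1):
    """Return (max_delay, min_delay) as fractions of a base value.

    Mirrors: expr [expr 1.0 / 1500000.0] * 0.9  and  ... * 0.1
    """
    return base * max_frac, base * min_frac


def clock_freq_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1e3 / period_ns


def duty_cycle(rise_ns: float, fall_ns: float, period_ns: float) -> float:
    """Fraction of the period during which the clock is high."""
    return (fall_ns - rise_ns) / period_ns


if __name__ == "__main__":
    d_max, d_min = io_delays(1.0 / 1500000.0)
    print(f"io_delay_max = {d_max:.6g}")   # compare with the hard-coded 6e-07
    print(f"io_delay_min = {d_min:.12g}")  # compare with 6.66666666667e-08
    print(f"clock        = {clock_freq_mhz(3.0):.1f} MHz")
    print(f"duty cycle   = {duty_cycle(0.0, 1.5, 3.0):.0%}")
```

Referencing the computed variables in the constraint file, instead of pasting their values, would keep the delays in step if the base value ever changes.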

October 23rd, 2008

Congestion Analysis : Logic Synthesis and Floorplanning

Often we don't know what is causing congestion. You can rarely point at a single source and say that it alone contributes to congestion. It has to be tracked from very early in the flow (logic synthesis).

I will try to highlight some of the things to check, based on my experience. By no means is this a complete reference.

1. Does your cell count look suspicious? Even if the area is comparable, a very high cell count means the P&R tool may have trouble placing and routing so many cells, which leads to congestion.

2. Is the synthesis tool using a lot of complex gates, or are big muxes being inferred from the RTL itself? If so, you might need to recode the RTL to make the router's job easier.

3. If the RTL is OK and synthesis is inferring complex gates, it might at times help to decompose that logic.

4. Sometimes logic restructuring with a cone depth greater than the tool's default will help.

5. Check whether you are over-constraining or giving very aggressive slack targets to the logic synthesis tool.

6. See if you can flatten some smaller modules on which no constraints are set. This helps all optimization commands.

7. Sometimes dont touch/force keep attributes prevent the synthesis tool from remapping.

8. Check the library to see if any functionally equivalent cells with a smaller area footprint should have been used. For example, if you have hidden a DFFS flop and only a DFFRS (set-reset flop) is available, you are adding one more pin and using a larger cell. Check for these.

9. Often incorrect constraints (synthesis, timing, floorplanning or placement constraints) lead to the problem. If the problem is here, don't expect the tool to override them: it is a user issue, and the tool tries to honour the constraints.

10. If you see too many levels of logic, you might want to collapse them. One more point that pops up in this context is hierarchy maintenance. Check whether you can do selective hierarchy maintenance and whether it is set up correctly.

11. Check for high fanout nets (HFN).

12. Check secondary cost function objectives.

13. If DFT is already done, check the number of test points (TP) inserted. You might be inserting too many TPs for a very small coverage gain. You need to consult with your DFT team to quantify how much you can really sacrifice. It varies for every design.

14. Check if a particular block/module can be optimized for area while the timing critical part of the circuit is optimized for delay.

15. The majority of congestion issues can be traced to the floorplan.
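
As an illustration of the high fanout net check in the list above, here is a minimal Python sketch. It is not tied to any tool: it assumes you can dump the netlist connectivity as (net, sink pin) pairs, and the function name and threshold are made up for illustration.

```python
from collections import Counter


def high_fanout_nets(connections, threshold):
    """Return [(net, fanout), ...] for nets whose fanout exceeds threshold.

    connections: iterable of (net_name, sink_pin) pairs, one per sink.
    The result is sorted by descending fanout, then by net name.
    """
    fanout = Counter(net for net, _sink in connections)
    flagged = [(net, count) for net, count in fanout.items() if count > threshold]
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical connectivity dump: one enable net fanning out to 40 flops.
    conns = [("en", f"ff{i}/E") for i in range(40)]
    conns += [("d0", "ff0/D"), ("d1", "ff1/D")]
    for net, count in high_fanout_nets(conns, threshold=32):
        print(f"{net}: fanout {count}")
```

Running the same report on the synthesized netlist and on the placed netlist shows whether buffering and cloning actually reduced the fanout.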

Things to check at the floorplan stage, in no specific order:

a) Is the congestion around channels between macros? You might need to resize the channels so that all macro pins can be accessed by the router.

b) See if you are wasting too much space on channel/island widths etc. You might not need channels all the time. An example is CAMs, where a pair of macros has to be aligned and you can abut the pair on the side where there are no pins. This will save some space.

c) Check the pin density for the overall block.

d) Check if there are routes around the macro corners.

e) Check the average and peak track overflows on each metal layer. This will give you hints about the reason.

f) Use blockages cautiously.

g) Did you set the highest routing layer incorrectly?

h) Is too much wire causing the issue?

i) Are connected endpoints placed far apart? If so, check why.

j) Is it because of scan chain wirelength? Did the scan optimization happen correctly? At times, incorrect scan order constraints prevent scan chain wirelength optimization.

k) Check the scan repartitions (tools like LV will often print which groups are compatible, and so what can be reordered and optimized).

l) Check if the tool is buffering a lot to fix timing issues.

m) Check whether pre-placed analog blocks etc. are in optimal locations.

n) Check whether the floorplan grid is defined and set correctly. If it is incorrect, you are wasting some routing resources. Setting the right, efficient routing constraints is essential to get the best routing results.

o) Check whether the MBIST controllers are placed at optimal locations relative to the memories they control. Optimal sharing of memories and the mode of MBIST sharing (serial or parallel) also play a significant role.

p) If the macro placement floorplan provided for MBIST insertion is different from the actual floorplan you are trying to come up with, then expect congestion issues, as the memory sharing is no longer correct in the revised floorplan. Floorplan changes should only be incremental. This is a bit subjective and has to be reviewed on a case by case basis.

q) Check the cell density. Are there sizeable empty spaces in the floorplan while other areas are heavily congested? This happens if the tool is trying to squeeze the logic together in order to meet timing.

r) Check for overlaps. Is there enough space for all the macros/std cells to legalize?

s) Check your power planning (power mesh/rail creation).

If you have checked all of the above and still cannot resolve the congestion, maybe you are trying to stuff in too much logic and you need to expand/grow your floorplan. This often involves top level floorplanning changes.

If anyone has any other suggestions/tips for congestion analysis, or any ideas or methods for predicting the congestion itself, I would welcome that feedback as well.
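
To make the average and peak track overflow check from the list above concrete, here is a minimal Python sketch. It assumes a hypothetical congestion dump with one (layer, demand, capacity) record per routing tile; real tools report this in their own formats, and the function name is made up for illustration.

```python
from collections import defaultdict


def overflow_by_layer(tiles):
    """Return {layer: (average_overflow, peak_overflow)} in tracks.

    tiles: iterable of (layer, demand, capacity), one record per routing tile.
    Overflow for a tile is max(0, demand - capacity).
    """
    per_layer = defaultdict(list)
    for layer, demand, capacity in tiles:
        per_layer[layer].append(max(0, demand - capacity))
    return {
        layer: (sum(values) / len(values), max(values))
        for layer, values in per_layer.items()
    }


if __name__ == "__main__":
    # Hypothetical data: M2 is overflowing in one tile, M3 is fine.
    tiles = [("M2", 12, 10), ("M2", 8, 10), ("M3", 10, 10)]
    for layer, (avg, peak) in sorted(overflow_by_layer(tiles).items()):
        print(f"{layer}: average overflow {avg:.2f}, peak overflow {peak}")
```

A high peak with a low average points at a local hotspot (macro corners, pin-dense cells), while a high average on a layer suggests a global shortage of routing resources.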

May 27th, 2008

Debugging Logic Synthesis & Timing Optimization QOR issues

Let's face it. You have run the logic synthesis/physical synthesis tool and you have a problem: you haven't met slack. Now what? Where do you start? Every issue depends on the design and process node, so I'm going to keep it simple and list a few pointers on where to look. These are just for guidance; you as the designer/CAD engineer/applications engineer have to dig deep and find the root cause. I'm assuming you have stopped the flow right after global placement and routing and haven't entered CTS (of course, it doesn't make sense to do CTS when you are failing timing).

Before we start discussing how to debug timing/QOR issues, I'm assuming that the floorplan is of good quality. Please remember that a bad floorplan can give you bad QOR no matter how hard the tool tries. In that case, don't even bother debugging; correct your floorplan first. If you are still in the early exploration phase, don't complain about QOR, but rather concentrate on a correct-by-construction approach to get the best results.

Let's use the old and well-known divide and conquer approach and break the problem down into 2 areas.
1. Your logic synthesis tool did a good job, and it is your physical synthesis that made things worse.
2. The QOR after your logic synthesis is already bad.

Below is a checklist of what to do if you have an issue with either of the above 2 items.

A1) What does the critical path look like? Is it a datapath, a register-IO path, an IO-IO path or some macro-register path?

Look at the timing histogram and see how many paths fail. This gives you a good idea of how bad the timing is on your design. Check the top 15-20 critical paths. If it is an IO path, check the IO constraints. Also relax the IO margins so that they don't fail and become critical, and do an incremental timing optimization. Now check again how the timing looks. Is the critical path a reg-reg path? If so, go to A2.

A2) For a register-register path, check the detailed timing path:
A2.1) Check if it is a high fanout issue and, if yes, whether the path has been buffered/cloned correctly.
A2.2) Also check if the path is over-buffered or under-buffered.
A2.3) Check if the tool has picked cells of the correct drive strength. At times, tools don't upsize/downsize the cells correctly and, as a result, more buffers/inverters are added, making the timing worse. Remember that in 65nm a buffer has around 50ps of delay.
A2.4) Also check if the tool has picked multi-stage cells. It is always a good idea to give the tool the freedom to pick a cell and buffer it, rather than allowing it to select an available multi-stage cell. If you are trying to extract every picosecond out of the tool, you might want to check this.

A3) Check if the tool complains about congestion. If you see congestion, then you might want to check the utilization. A quick visual inspection of the floorplan will tell you if there is more space available in the primary inner shape for the tool to move around. If this is the issue, then you have a placement issue.

A4) If cloning/buffering doesn't seem to be the issue, then cross-probe the timing path into the layout window using fly lines and see how the path is laid out. Is the path very long and jogging all around? If yes, then it is a global placement/routing issue.

A5) Check the macro placement. How does it look? Does it look optimal? Did you create halos around the macros? What about placement blockages? If these are missing, please provide them and re-run the physical synthesis.

A6) If you are utilizing the auto floorplan capabilities of modern tools, then check the quality of the floorplan. Many tools have trouble creating good, optimal rectilinear floorplans.

A7) If all else looks good, check the number of logic levels on the register-register timing path. If this seems too high or suspicious, then check the logic levels at the end of logic synthesis. If you still see the same thing there, then you have a logic synthesis issue rather than a physical synthesis issue.

A8) To debug a logic synthesis issue, check the timing in front-end STA for the same path you see at the end of physical synthesis. Check if it is a datapath, a control path or some kind of distributed logic shared between two modules.

A9) For a datapath, check if the tool has picked the correct architecture. For example, in the case of adders, it might have selected a ripple adder where it could have selected a carry-lookahead adder. Similarly, check if the tool can pick better architectures for other datapath components.

A10) If it is a control path, then check how the logic is written in RTL. Is it deeply nested if-else logic with mutually exclusive conditions? Should we really create a deep mux chain? How was the case logic written? Maybe the tool is inferring a big shifter when only a partial shifter is used, so the tool unnecessarily created the big shifter logic.

A11) At times, it is a problem of optimization itself. Maybe the tool could have knocked off extra registers by doing more aggressive constant flop optimization and dead code removal.

A12) Sometimes users mistakenly set unnecessary dont touch (Synopsys) or force keep (Magma) attributes, or they mess with the configs and insist that the tool retain floating logic. Be careful about what you want to retain and what can be knocked off.

A13) Yes, the world is not always flat. Some users want to keep all the hierarchy. You don't want that either. Many optimization algorithms work best when there is no hierarchy. Whenever there is hierarchy, the scope of the optimizations is limited to within that module. So only retain the hierarchy on which timing constraints are present, or when there are special requirements from other 3rd party tools to maintain the hierarchy they introduced. So, check if any of the logic in the critical path can be flattened.

A14) Sometimes hiding the high drive strength cells in the library, or preventing the tool from using very complex gates like XOR/XNOR and in some cases AOI/OAI cells, helps to improve timing. But this should not be done blindly. Check the library and the cells the tool is picking. Then decide whether selective hiding is the way to go.

A15) Over synthesis: Many users blindly push the tool to meet a high target slack in the design. I'm not referring to the clock uncertainty you normally account for. This target slack is applied only for front-end synthesis. Be reasonable about what you want the target slack to be. Normally 15% of the clock period is decent enough.

A16) Over-constraining: This is a setup margin you apply all across the flow until CTS is done. Again, be reasonable about how much you constrain. Over-constraining can sometimes break the algorithms. The optimization algorithms see more negative slack than there really is, so in order to meet slack they try over-buffering, cloning and bad sizing, and then give up. Just for the record, this is not a bug in the tool. It has more to do with constraining the design correctly.

A17) One more thing to check is the slew limits and fanout limits, and whether the tool is honouring them.

A18) Sometimes, because of incorrect false paths or multicycle paths, you are misguiding the tool. Remember folks, too many exceptions kill the design. Don't set a false path unless it is needed. Perhaps setting a multicycle path is the way to go.

A19) Perhaps this should have been mentioned at the beginning: the quality of your constraints dictates your timing results. Bad constraints lead to bad QOR. So check your timing constraints before you start your timing analysis. There are many tools out there which can help you with this. As a rule of thumb:

A19.1) Check your IO constraints.

A19.2) Check your exceptions (multicycle/false paths).

A19.3) Check if there are unconstrained nodes in the design. You should not have any unconstrained nodes.

A19.4) Check if there are too many events happening on a given node. Say, 12 or 16 timing events on the same node is not a good sign. Check the node.

A19.5) Also check the timing event density. This will tell you if you have overlapping or conflicting constraints, or a high number of timing events in the design. Either way, it is not good. For example, when some 3rd party DFT tools write out the post-scan SDC, it sometimes creates a large number of timing events on some nodes, and this wreaks havoc on timing algorithms.

A19.6) Also check if there are any nodes with zero timing events. This is not the same as the design being unconstrained.

A19.7) Check the clock definitions and units.

A19.8) Check the generated clock definitions and whether the source clock is specified correctly.

A19.9) Check whether there are any cases in the design where a clock becomes data. If so, timing analysis tools will treat this as a data node as opposed to a clock node.

A19.10) Check if the case analysis constraints have been set up correctly.

A19.11) If you have a clock gating cell, say CKG1, between 2 registers FF1 and FF2, then the tool will see two paths: one from FF1/Q to CKG1/EN and one from CKG1/OUT to FF2. But in reality there is only one path, from FF1 to FF2, and the setup check is done at FF2. So you have to tell the tool somehow to consider the delay through the clock gate, and that the clock cycle time is from FF1/CK to FF2. The way to do it is to apply a negative margin at the clock pin of the clock gating cell and a setup margin on the clock gating output pin, and constrain the path.

A19.12) The cleaner and better the constraints are, the better the timing results will be.

Misc Tips: It is always a good idea to study the library. It gives you a good idea of which cells, and at what strengths, are available in the library. It helps you fine tune your optimization and guide your implementation tools.
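
The first step of the checklist (look at the timing histogram and at what kind of paths are failing) can be sketched in a few lines of Python. This is only an illustration: it assumes you have already parsed a timing report into records with a start type, an end type and a slack in ps. The field names and functions are made up and do not correspond to any tool's report format.

```python
import math
from collections import Counter


def failing_groups(paths):
    """Count failing paths per group, e.g. 'reg-reg', 'io-reg', 'reg-io'."""
    return Counter(
        f"{p['start']}-{p['end']}" for p in paths if p["slack_ps"] < 0
    )


def slack_histogram(paths, bin_ps=50):
    """Histogram of failing paths; each key is the lower edge of a bin in ps."""
    hist = Counter()
    for p in paths:
        if p["slack_ps"] < 0:
            hist[math.floor(p["slack_ps"] / bin_ps) * bin_ps] += 1
    return dict(sorted(hist.items()))


if __name__ == "__main__":
    # Hypothetical parsed timing report.
    paths = [
        {"start": "reg", "end": "reg", "slack_ps": -120},
        {"start": "io", "end": "reg", "slack_ps": -30},
        {"start": "reg", "end": "reg", "slack_ps": -20},
        {"start": "reg", "end": "io", "slack_ps": 15},
    ]
    print(failing_groups(paths))
    print(slack_histogram(paths))
```

If most failures sit in the IO groups, start with the IO constraints; if they are reg-reg, move on to the buffering, drive strength and logic level checks.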

September 17th, 2007

Debugging Formal Verification Problems : Part II

Folks who want to read part I of my formal verification series can click this link:

www.srikiran.net/blog/2007/01/22/debugging-formal-verification-fv-problems-fv-primer


I won’t elaborate on points that are self-explanatory, and some have already been covered in part I (the section on setting up the FV env).

1. Check for unmapped points and naming-rule/mapping issues. Unless all sequential elements and black box I/Os in Reference and Implementation have been matched, the formal verification results are meaningless. Some tools report mapping failures when objects in the Reference (Golden) don’t exist in the Implementation (Revised). This is the one exception, as it is quite possible that the synthesis tool optimized them out. But if objects in the Implementation netlist don’t exist in the Reference, it means your formal tool is doing more optimization than necessary or something is wrong.

2. Blackbox (BB) issues: If there is a formal failure, check that the BBs are the same in both netlists and that the number of BBs is equal.

3. Debugging using hierarchical and flat approaches: When the hierarchical approach lists failures, don’t jump to conclusions immediately. Maybe the context was not propagated correctly, so the tool propagated wrong values and reported a failure. Do a flat-level verification either at the failing module level or one level up the hierarchy.

4. Spare cells: Check for spare cells getting removed.

5. Port information: Check whether ports are getting removed, such as DFT test mode ports.

6. Overlay/Feedthroughs: If you are loading your Implementation (Revised) netlist from your physical synthesis flow, check if there are any overlay cells or feedthroughs. Normally these don’t exist in the original (RTL) or logic synthesis netlist. They are added by the physical synthesis tools during the top-down chip/block routing or integration stage. Overlay cells change the netlist by adding a level of hierarchy, while logically/functionally the design is still the same.

7. Cloning during CTS: Again in the physical synthesis flow, the place and route tool might have cloned some ports during CTS (for example a reset pin/port), and you need to tell your formal tool that the cloned reset port is just an alias of the original reset port.

8. Clock gate cloning: Another common issue is that clock gates are cloned in the physical synthesis flow, so you need to tell your FV tool about all the different clock gates being used.

9. Power connectivity: Some synthesis tools don’t write power connectivity information for non-standard cells by default, for example analog power information (like AVDD or AVSS), and sometimes they write empty power connections for all blocks. This can cause formal failures too.

Lastly, FAILURES are different from UNSOLVED points. FAILURES mean the netlists differ; UNSOLVED means there are certain datapath elements, such as multipliers or ECC, which were not verified. They can still be verified if you can afford to wait for real looooooong times (perhaps years..decades).
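To make item 1 a bit more concrete, here is a minimal Python sketch of the kind of cross-check I mean. It is not how any particular FV tool works internally; the register names and the `unmapped_points` helper are made up for illustration, and in practice the name lists would come from your tool’s mapping report.

```python
def unmapped_points(reference, implementation):
    """Return (only_in_reference, only_in_implementation) as sorted lists."""
    ref, impl = set(reference), set(implementation)
    return sorted(ref - impl), sorted(impl - ref)

# Hypothetical sequential element names taken from each netlist.
golden = ["u_core/state_reg", "u_core/cnt_reg", "u_core/spare_reg"]
revised = ["u_core/state_reg", "u_core/cnt_reg", "u_core/cg_clone_1"]

only_ref, only_impl = unmapped_points(golden, revised)

# Objects missing from the Implementation may simply have been optimized out.
print("Only in Reference (possibly optimized away):", only_ref)
# Objects that exist only in the Implementation deserve a closer look.
print("Only in Implementation (investigate):", only_impl)
```

The second list is the one to worry about first, for the reason given in item 1.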

Add comment March 1st, 2007

Push Button design Flows

I was recently talking to a couple of design engineers at various companies, and most of them want push-button RTL-to-Placed-Gates or RTL-GDSII flows. Though this sounds like a reasonable expectation, in reality, as many experienced designers know, it is often not practical. The issue here is setting the right expectations.

We can develop a push-button flow if we have a good design methodology with reasonable and manageable expectations. A designer or CAD engineer needs to understand that there are certain things they have to do, like setting up the synthesis env and providing good quality timing constraints. I have seen numerous cases where designers blame the tool for poor timing results, but when analyzed, it turns out they have messed up their timing constraints or specified timing exceptions where none were necessary. Simply put, they might have over-constrained their designs. It is reasonable to expect the tool to do a good job on classical physical synthesis problems, where the designer has very little to do, but not on logic synthesis issues, where a lot depends on the quality of the RTL, constraints, DFT methodology etc.

Each chip is unique (I’m not talking about revisions of the same chip here), and the requirements differ from chip to chip in terms of design complexity, design size, number of macros used, number and frequency of clock domains, DFT logic (along with JTAG etc), clock latencies, skew balancing, cross-talk etc. Clearly an OOTB (out of the box) flow might not always deliver the best QOR (of course, that means it is enhancement time for R&D), and some amount of playing with different knobs/options is necessary to get the best QOR.

Add comment March 1st, 2007

Sequential Equivalence Checking…

This will gain momentum once the design industry starts seeing “the value of ESL”.

http://www.eetimes.com/news/design/showArticle.jhtml;jsessionid=GEM2H4T50V5WUQSNDLRCKH0CJUNN2JVN?articleID=197000961

Still a very niche area, and we gotta see how Calypto does and whether it breaks the barrier…but I gotta tell ya…I’m impressed with their technology.

Add comment January 26th, 2007

Debugging Formal Verification (FV) Problems .. FV Primer

Anyone who has been around the FV (formal verification) world for a while agrees that FV is easy as long as the FV environment is set up correctly. Of course, when FV reports a failure, it is a whole different issue. BTW, I’m talking about formal verification for digital circuits. I will cover FV for analog circuits later.

Just for the record, when most of us in the industry, or most companies that develop FV tools, say formal verification, we mean equivalence checking. There is another type called model checking as well. Search Google if you want to know more about model checking.

This is the definition of formal verification according to Wikipedia:

http://en.wikipedia.org/wiki/Formal_verification

Many people don’t understand when FV has to be used and when functional simulation has to be used. I met one person who thought FV replaces simulation. So I want to clear things up first. Once the RTL coding is done, you want to check that the written RTL conforms to the spec and is functionally correct, without any glitches etc. Once these RTL functional simulations pass, the design is synthesized.

You do formal verification to answer the following question: Is the synthesized circuit functionally the same as what was intended in the RTL? Yes, you can still run gate-level simulations to verify the same thing. But simulations are slow, and you need exhaustive simulation to cover all possible vectors for a given circuit. Formal verification can do the same mathematically in less time.
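To see why exhaustive simulation does not scale, here is a small Python sketch that checks two descriptions of a 2-to-1 mux by brute force over every input vector. The functions `reference`, `implementation` and `equivalent` are my own toy names. Note that this enumeration is the simulation-style approach; real equivalence checkers rely on techniques such as BDDs or SAT solving precisely to avoid walking all 2^n vectors.

```python
from itertools import product

def reference(a, b, c):
    # "RTL" intent: a 2-to-1 mux, c is the select
    return b if c else a

def implementation(a, b, c):
    # "Gate-level" version built from AND/OR/NOT
    return (a and not c) or (b and c)

def equivalent(f, g, num_inputs):
    """Compare f and g on all 2**num_inputs vectors.

    Returns (True, None) if they agree everywhere, otherwise
    (False, first_mismatching_vector).
    """
    for vec in product([False, True], repeat=num_inputs):
        if bool(f(*vec)) != bool(g(*vec)):
            return False, vec
    return True, None

print(equivalent(reference, implementation, 3))
```

With 3 inputs this is 8 vectors; with 64 inputs the same loop would need 2^64 iterations, which is the point of the paragraph above.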

I’m not going to talk about the different types of simulation and the pros/cons of each here. Similarly, I’ll not talk about assertion-based simulation either. There are tons of articles and lecture notes on the web.

Formal verification is still a niche area, and less than 25% of the design companies are using formal verification today. This is partly because formal tools are costly, take time to set up, and there aren’t enough resources for FV.

From the methodology perspective, there are two methodologies: hierarchical and flat.

Hierarchical Methodology: It is very fast and verifies the design by partitioning it based on the hierarchy. It is very similar to the bottom-up approach in synthesis. Once the lower modules/blocks are verified, the tool blackboxes them and then verifies the next level up in the hierarchy. Once all the modules/blocks are verified and blackboxed, it verifies the top level (glue logic). There is an inherent issue with this methodology: the blackbox approach can produce false failures. If constants have to be propagated from the top level, the tool will not be able to do so, and the affected points might be flagged as failures. This is not a problem with the FV tools but an inherent drawback of the methodology itself.
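The constant propagation issue is easy to reproduce on a toy example. In the Python sketch below (all names are hypothetical), the top level ties the `mode` input of a block to 0, and synthesis has used that fact to simplify the block. Verified in isolation, the block looks broken; verified in context, it is fine.

```python
from itertools import product

def block_ref(data, mode):
    # Reference block: passes data when mode is 0, inverts it when mode is 1
    return data ^ mode

def block_impl(data, mode):
    # Synthesized block: the tool saw mode tied to 0 at the top level
    # and simplified the logic to a plain wire
    return data

def block_level_mismatches():
    """Hierarchical view: mode is treated as a free input of the block."""
    return [(d, m) for d, m in product([0, 1], repeat=2)
            if block_ref(d, m) != block_impl(d, m)]

def top_level_mismatches():
    """Flat view: the top level drives mode with constant 0."""
    return [d for d in [0, 1] if block_ref(d, 0) != block_impl(d, 0)]

print("block in isolation:", block_level_mismatches())
print("block in context:  ", top_level_mismatches())
```

The mismatches reported in isolation all have `mode` set to 1, a value the block never sees in the real design, so they are false failures in the sense described above.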

Flat Level Methodology: If the design is flattened (hierarchy is collapsed) in synthesis, this is the only way to verify the circuit. The runtimes vary depending on the design size and complexity, and of course on the tool.

Having talked about the FV methodologies, let’s talk about two important pieces of formal verification:

Setting up Formal Verification Environment: This is not a straightforward process. Some of the things to note are:

Talk to your logic/physical synthesis engineer :). Yes, without his inputs, you will never know what options/configurations he used in synthesis, and without this knowledge, you will never be able to tell your FV tool what the synthesis tool has done in terms of optimizations. You need to mimic the synthesis environment as much as possible.

At the minimum, you need to know the following:

1. Any pragmas that have been used in the RTL, like compiler directives, translate_off/on, full_case/parallel_case, async_set_reset etc.

2. Whether a specific value is used for don’t-care optimization.

3. Whether constant propagation is enabled and used.

4. Whether any logic is marked as set_dont_touch or force kept.

5. Whether any ports/nets/pins are tied to constants.

6. If you are comparing pre- and post-DFT netlists, you need to disable the scan logic by driving the scan_enable/test_mode ports to a constant value. So get this information.

7. If any models don’t have a definition available, or if any model is assumed or modeled as a blackbox in synthesis, you need to get that list and blackbox the same models.

8. If clock gating is enabled in synthesis, you need to get the clock gating configuration (the name of the clock gating cell used in synthesis).

9. Get the list of libraries used in synthesis. It is recommended to read the .lib into the FV tool (assuming the FV tool can read .lib directly) rather than relying on the Verilog netlist written out of the synthesis tool. Some synthesis tools don’t write the complete definition of a model and write just the interface for the module. If you don’t have the .lib files, at the minimum the Verilog simulation models need to be read in.

10. DesignWare: Get the list of the DesignWare components used in the design. There are some limitations as to which DesignWare components can be formally verified. For example, DesignWare multipliers, ECC etc can’t be formally verified and have to be simulated to check the functionality.

11. Re-timing: Check if re-timing is enabled in synthesis. Some synthesis tools automatically detect and perform re-timing when certain datapath components are used.

12. Clock gate cloning: If clock gate cloning is done, some FV tools require you to explicitly list all the clock gate cells that got cloned and their names. Otherwise you might see false failures.

13. Sometimes synthesis tools synthesize sequential elements whose state is inverted with respect to the RTL. If that is the case, you need to enable the corresponding option in the FV tool.

14. Naming convention: Most synthesis tools follow standard naming conventions, and sometimes your FV tool might not be able to map an element/object in the reference netlist to the implementation netlist. So get the naming conventions of your synthesis tool and configure your FV tool accordingly.

15. Check if there is any default value set for undriven nets/pins/ports.

16. Some advanced options: if the D input of a flop is tied to a constant (1 or 0), or if the clock has been or needs to be tied to constant high or low, enable the relevant configs.

17. Check if there are datapath components in the design. If RTL-inferred multipliers are wider than 40 bits, special datapath solving techniques might be necessary. There is also a limit in FV technology on how wide a multiplier you can verify. If you believe you might hit the limit, it is better to blackbox that component. A 128-bit multiplier can take ages to verify.

18. Port mapping/aliases: Sometimes during CTS, while building clock-tree/reset-tree networks, the physical synthesis tools might duplicate ports (for example, there might be a reset port called “rst” and the tool might have created an rst_l port as well, which is logically and functionally equivalent to “rst”). So if you are doing FV between pre-CTS and post-CTS netlists, you need to take care of this either by port mapping or through some aliasing feature in the tool. Otherwise the tool might report false failures.
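For item 14 in the list above, this is roughly what “configuring the naming convention” boils down to. The Python sketch below maps register names from both netlists to a canonical form before matching them. The renaming rules (hierarchy separator, bus bit style, case) are assumptions I picked for the example; your synthesis tool’s actual rules will differ and should come from its documentation.

```python
import re

def normalize(name):
    """Map a register name to a canonical form (hypothetical rules)."""
    name = name.replace("/", ".")                # hierarchy separator
    name = re.sub(r"\[(\d+)\]", r"_\1_", name)   # bus bit: [3] -> _3_
    return name.lower()                          # ignore case changes

def match_points(reference, implementation):
    """Return ({ref_name: impl_name}, [unmatched ref names])."""
    impl_by_key = {normalize(n): n for n in implementation}
    matched, unmatched = {}, []
    for ref_name in reference:
        key = normalize(ref_name)
        if key in impl_by_key:
            matched[ref_name] = impl_by_key[key]
        else:
            unmatched.append(ref_name)
    return matched, unmatched

# Hypothetical names: RTL style on the left, netlist style on the right.
rtl_regs = ["top/u1/count_reg[3]", "top/u1/flag_reg", "top/u1/spare_reg"]
gate_regs = ["top.u1.count_reg_3_", "top.u1.FLAG_reg"]

matched, unmatched = match_points(rtl_regs, gate_regs)
print(matched)
print(unmatched)
```

Anything left in the unmatched list is either a naming rule you have not told the tool about or an element that was really removed.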

Debugging Formal Verification Issues: These are some tips and by no means a complete guide. Debugging comes with experience, and the more issues you solve, the more insight you will have. Some of the points below might look like a repeat of the points I mentioned in the section above, “Setting up Formal Verification Environment”.

General suggestion: Formal verification can be done between RTL and synthesis netlist, or between synthesis netlist and DFT netlist etc, depending on the stage and flow you are in. It is always recommended to verify the circuit in increments, between two immediate stages in the flow. Otherwise, if you hit an FV failure, it will be very difficult to narrow down which stage in the flow, and which optimization step or command, caused it.

Also, when a formal tool raises a flag, it is recommended to first run a simulation on the reference (Golden) and implementation netlists. Most formal tools generate test vectors for the failing compare point, and you can use them to simulate and see if the simulation results match for reference and implementation. If they match, it is a false failure and you know that your synthesis tool is correct.
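The replay step can be sketched in a few lines of Python. Here `ref_cone` and `impl_cone` stand in for simulation models of the logic cone behind one failing compare point, and `reported` stands in for the vectors the formal tool printed; all of these are invented for the example, and in a real flow you would drive your simulator with the tool’s vectors instead.

```python
def replay(vectors, reference, implementation):
    """Simulate each vector on both models; return the vectors that really differ."""
    return [v for v in vectors if reference(**v) != implementation(**v)]

# Hypothetical models of the logic cone driving one compare point.
def ref_cone(a, b, sel):
    return b if sel else a

def impl_cone(a, b, sel):
    return (a & (1 - sel)) | (b & sel)

# Vectors as the formal tool might report them for the failing point.
reported = [{"a": 1, "b": 0, "sel": 0}, {"a": 0, "b": 1, "sel": 1}]

real_failures = replay(reported, ref_cone, impl_cone)
print("false failure" if not real_failures else real_failures)
```

If the list of real failures comes back empty, the flag is most likely a setup issue rather than a synthesis bug.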

Some formal tools have very advanced capabilities, like generating a test bench for a given logic cone. So if you are ever caught between two formal tools that disagree on a result, generate the test vectors and run functional simulation with them. Some formal tools don’t generate test vectors (they use a vectorless approach); in that case, get the logic cone the formal tool is complaining about, generate the testbench for that logic cone in the other formal tool, and then run simulations.

casez and casex statements: Sometimes synthesis tools interpret casex/casez statements, based on their don’t-care optimization algorithms, in a way that creates different logic than the formal tool expects. Similarly, if there is an if-else block and a variable is used in it without being given a value earlier, the synthesis tool might not always infer a latch if it sees no necessity to store state. What I would suggest is to check over 2 cycles whether state really needs to be stored; if not, the synthesis tool is correct, as the latch is redundant here.
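Here is a small Python illustration of how don’t-care handling alone can make two correct results look different. The toy `rtl` function leaves its output unspecified (X) for one select value, and two hypothetical synthesis results fill that X in differently. Both are consistent with the RTL, yet they disagree with each other on the don’t-care input, which is exactly where a formal tool with different assumptions would complain.

```python
X = None  # stands for a don't-care output

def rtl(sel):
    # 2-bit select; the RTL leaves the output as X for sel == 3
    return {0: 0, 1: 1, 2: 1}.get(sel, X)

def synth_a(sel):
    # One tool treats the X as 1: output is simply (sel != 0)
    return int(sel != 0)

def synth_b(sel):
    # Another choice treats the X as 0: output is sel[1] xor sel[0]
    return (sel >> 1) ^ (sel & 1)

def consistent(spec, impl, inputs):
    """True if impl matches spec wherever spec is not a don't-care."""
    return all(spec(i) is X or spec(i) == impl(i) for i in inputs)

print(consistent(rtl, synth_a, range(4)), consistent(rtl, synth_b, range(4)))
print(synth_a(3), synth_b(3))
```

The fix in practice is to tell the FV tool how the synthesis tool treated X, rather than to touch either netlist.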

Pragmas: Make sure the interpretation of pragmas is the same across the synthesis, formal and simulation tools. Some simulators, like ncsim, don’t support certain synthesis pragmas such as translate_off/on, so they might simulate that logic and give incorrect results when you are trying to see if your formal tool is flagging a false error.

Also, in most cases, formal errors get introduced during the RTL elaboration stage, and certain RTL constructs like generate statements, genvar, multi-dimensional arrays etc can confuse our little buddy (the RTL elaborator). So check for these advanced or complex constructs in the RTL.

Then check whether any variables are used but never given a value, etc. These are corner cases though :)

Lastly, formal errors usually come from a very small snippet of the RTL code. So, using a divide and conquer approach, reduce the design to as small as possible while maintaining the same formal errors on the same compare points/logic cone.
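The divide and conquer idea can be automated with a simple greedy loop, sketched below in Python. The `still_fails` callback is a stand-in: in a real flow it would rebuild the reduced testcase, rerun the FV tool and check that the same compare point still fails. Here it is faked with a rule that the failure needs both the `decoder` and `alu` modules, and the module names are invented.

```python
def reduce_design(modules, still_fails):
    """Greedily drop modules while the same formal failure still reproduces."""
    kept = list(modules)
    changed = True
    while changed:
        changed = False
        for m in list(kept):
            trial = [x for x in kept if x != m]
            if still_fails(trial):
                kept = trial        # module m was not needed for the failure
                changed = True
    return kept

# Stand-in for "rerun FV and check the same compare point still fails".
def still_fails(modules):
    return "decoder" in modules and "alu" in modules

design = ["fetch", "decoder", "alu", "lsu", "cache"]
print(reduce_design(design, still_fails))
```

Each call to the callback is a full FV run, so on a big design it pays to drop large chunks first and only then go module by module.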

If you see thousands of failing compare points, you’d better check your SETUP to see if there is anything wrong. Unless the synthesis tool was developed by people who don’t understand synthesis, it is not common to see thousands of failing points for a decent sized testcase (300k cell count). If you do see that, then my advice would be: STOP USING THAT SYNTHESIS tool :)

Add comment January 22nd, 2007

Formal verification of SOC

There was an interesting question asked during the interview with Moshar from Broadcom:
http://www.eetimes.com/news/design/showArticle.jhtml;jsessionid=MKCJBAP5HOPZKQSNDLPCKH0CJUNN2JVN?articleID=196902151

==================
EE Times: Are you using formal verification, and does it reduce the need for simulation and acceleration?

Moshar: We are using formal verification, but I don’t believe it is reducing the scope of the work we need to do. It will help you make sure that your IP is golden, but formal verification really does not apply at the SoC level. You have to go through all the traffic scenarios you need to cover.

===================

It would have been nice if a detailed answer had been given. From the question posed and the answer, it appears as if this is a limitation of formal verification. I think if you describe the behavior in terms of formal properties using PSL, it would still be possible to formally verify the traffic between the various cores on an SoC.

I think it would be interesting to know how companies operating in the ESL space view this. As SystemC and C-based design languages become popular and support TLM, it would be interesting to see how one can extract information from these abstract models and verify the design intent.

Add comment January 22nd, 2007

Logic Synthesis Primer

I have recently seen many folks asking questions like “How to write a synthesis script” or “What is in a synthesis environment”. Many freshers right out of school claim to know about synthesis but in reality don’t have a clue where to start.

When I refer to synthesis in this post, I mean logic synthesis. I mostly cover only logic synthesis, and that doesn’t include STA (timing analysis). I will cover some points which overlap between logic synthesis and front-end timing optimization. I will write a separate post on front-end timing optimization and physical synthesis (floorplan, global placement and routing, CTS).

Synthesis is not just a script you write for a specific tool. It is actually much more than that. I have seen many folks who loosely equate it with a specific tool like Synopsys DC or Magma Blast Create. It is in a sense a methodology which evolves over time, through many runs, as the block/chip evolves. Before you start synthesis, you have to know:

1. What are the area goals? If area is one of the important criteria, you need to know what area reduction techniques are available in the tool. I’m assuming that the RTL conforms to best design practices. Check the number of logic levels and cells used after the first iteration, and decide your area reduction strategy accordingly. One benefit of reducing the area is that it helps you come up with a relatively smaller floorplan. Of course most EDA folks prefer bigger floorplans so that their floorplanning or placement tools can easily do the job :) and easily avoid congestion and cross-talk issues :). But I strongly suggest that synthesis folks have some understanding of physical synthesis.

2. What are the power goals? Some tools have advanced power saving schemes in synthesis itself, like power gating flops/retention flops. They apply these when the RTL designer uses special pragmas in the RTL code, which the tools capture when they parse the RTL.

3. Is clock gating allowed? Are there any modules/blocks for which clock gating has to be disabled? You need to understand why it has to be enabled/disabled and should be aware of its impact. You should know if the technology library has support for ICG (integrated clock gating) cells.

4. How much hierarchy has to be kept? Keep in mind that logical hierarchy is different from physical hierarchy. When you decide to maintain hierarchy, consider that some optimization algorithms are limited by module hierarchies/boundaries, so take care in deciding whether and how much QOR can be sacrificed.

5. Is flattening allowed? The answer to this question depends partly on the decision you make on the question above. If yes, is it allowed on the entire design? If not, can you at least do flattening selectively? Other relevant information you need is whether RTL-inferred models can be flattened.

6. What is your DFT strategy and methodology? Many might wonder why this is important to consider at the logic synthesis stage. It is especially important when you plan for DFT during RTL development. For example, you might have declared the test/scan ports in the RTL, and since scan insertion will not be done until synthesis is complete, many optimization algorithms in the synthesis engine will see them as floating and will blow them away. So you might need to instruct the tool not to touch them.

7. Resource sharing and operator merging: If these options are available in the synthesis tool and you don’t have any constraint or reason for not using them, it is highly recommended to take advantage of them. But take care, as some formal tools either don’t have good support for these or don’t support them at all.

8. Datapath architecture selection: Some advanced synthesis tools allow you to configure/pre-select the datapath architectures. If timing is critical for a particular block, you might want to override the area optimization steps by selecting the fastest architecture available for all the datapath components in that block. Do remember that selecting the fastest architecture might blow up the area sometimes.

9. Formal verification: FV tools don’t do aggressive optimizations as synthesis tools do (or should I say, it is not a good idea for them to do so :) ). So it is highly important that you let your FV tool know about the synthesis options where they exist and where possible. You should try to mimic the synthesis env in your FV environment. Otherwise you can see false failures.

10. Hard macros: When macros are used and some of their inputs/outputs are unused, they might be removed. So if any macros or hand-instantiated gates are present, you should set the appropriate commands, like force keep or set_dont_touch.

11. Spare gates/registers: When spare registers are described in the RTL itself, the synthesis tool should be instructed to preserve them; otherwise they are treated as unreachable registers and might be thrown away during synthesis (dead code removal). Some people add spare registers in the backend and sprinkle them evenly.

12. It is important to analyze the technology library for cell delays and area. Another important factor most people forget is the effect of EM (electromigration) and yield. All the bad cells (those with high delays, or that are bad for EM, yield or area) should be hidden or disabled from use by the synthesis engine. Forgetting to disable cells that are bad for EM or yield affects timing closure with cross-talk/SI during backend.

Apart from this, make sure all complex cells like AOI, OAI, full adders and half adders, XOR etc are available to the synthesis tool. This helps save area and increases drive capability, resulting in less buffering.

13. Don’t use the highest effort levels in synthesis by default (unless you know what you are doing). Some optimization algorithms might hurt your design by optimizing too aggressively. Synthesis knobs have to be used with care, after studying what each does to the design.

14. DesignWare usage: Sometimes it makes sense to use DesignWare components in RTL. Make sure the synthesis tool you use can detect, understand and synthesize the DesignWare components. Some synthesis tool vendors convert them to their own equivalent models (for example, Lavaware components from Magma). If not, you might need to blackbox them and read in the gate-level netlist of those DesignWare components after synthesis is done.

15. Pipelining: Almost all synthesis tools support this. So where necessary, the designer or synthesis expert has to know which block/module needs pipelining, how many stages are required, what the latency at each stage is, etc.

16. Re-timing: Sometimes when the RTL has DesignWare or Lavaware components, some synthesis tools automatically apply re-timing, while others require you to explicitly set the relevant re-timing configs. But keep in mind that re-timing is not supported that well in FV tools.

17. Don’t-care optimization: If the RTL contains don’t-cares (X), many synthesis tools allow you to choose whether X is treated as “0” or “1”. I suggest leaving it to the synthesis tool to decide. Most don’t-care algorithms select the value of X that results in a smaller circuit if area optimization is enabled, or a circuit with better timing if timing mode is enabled.

18. Clock edge mapping: Some synthesis tools map to negative-edge flops and add an inverter if they see that this saves more area than picking a positive-edge flop. Some design methodologies, especially back-end teams, don’t like this. So you need to set the configs accordingly.

19. If there are any complex cells with multiple outputs in your library, like full adders and half adders, most synthesis tools don’t utilize them. So if you want the synthesis tool to use them, you have to hand-instantiate them in the RTL. Remember that these cells will be timed, but will not be inferred or decomposed into simpler cells/logic during the optimization phases.

With all this said, I can’t stress enough how important it is for RTL coders to follow best practices. There is a lot of information out there, or one can refer to the STARC methodology guide or the Design-Reuse methodology manual.
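For readers who have not seen resource sharing (item 7 in the list above) in action, here is a toy Python model of it. The RTL-style description uses two adders and a mux on the results; the shared version muxes the operands and uses a single adder. The bit width and function names are arbitrary choices of mine, and the exhaustive loop is only feasible because the width is tiny.

```python
from itertools import product

WIDTH = 3                    # small width so the exhaustive check stays cheap
MASK = (1 << WIDTH) - 1

def unshared(sel, a, b, c, d):
    # As written in RTL: two adders, then a mux on the results
    return ((a + b) & MASK) if sel else ((c + d) & MASK)

def shared(sel, a, b, c, d):
    # After resource sharing: muxes on the operands, then one adder
    x = a if sel else c
    y = b if sel else d
    return (x + y) & MASK

def same_function():
    """Compare both structures on every input combination."""
    values = range(1 << WIDTH)
    return all(unshared(s, a, b, c, d) == shared(s, a, b, c, d)
               for s in (0, 1)
               for a, b, c, d in product(values, repeat=4))

print(same_function())
```

The two structures compute the same function but look quite different in the netlist, which is why some formal tools need to be told that sharing was enabled.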

Add comment January 12th, 2007

Being a Field Application Engineer

It is a long post (you are warned!! :)
I was recently asked by someone what it takes to be a successful application engineer. So I thought, why not blog about it. Though much of it is written from the EDA industry perspective, it applies to appln engineers in other industries as well. So, here it goes….

1. Technical expertise: You have to be at least good, if not proficient, in the domain. For example, if you are an application engineer for a formal verification product, you need expertise in FV techniques and a good understanding of what logic/physical synthesis tools do in terms of optimization. Mere tool knowledge will not suffice.

2. You should be like a double-edged sword. You need to be able to understand the hardware design, be it RTL, scan insertion, P&R or CTS, and at the same time understand how the algorithm (tool) behaves from a software perspective. If you don’t understand both, you will not be able to understand what the HW designer is trying to accomplish, nor find out whether the tool is missing a feature, whether it is a limitation of the technology, or whether it is a bug. One more important reason: you might need to translate the designer’s intention into a feature specification and direct your R&D.

3. Business sense: I think this is a very important component for an appln engineer. You need to be in constant touch with the customer and get feedback on the product. You should be able to sense the impact of that feedback. Whenever there is an opportunity to promote a new product, you should do so immediately and let your marketing/sales team know about it. Just being technical is not enough. An application engineer without good business sense can negatively impact the company he represents.

4. Pre-Sales: The ability to benchmark against the competitor and convince the customer of your product’s technical merits. Depending on the competition, the product and the domain in which you operate, this can be very intensive and grilling. Failure is not an option. A true winning attitude and a readiness to do whatever it takes are an absolute must. No compromises. I’m not exaggerating, but it might involve some sacrifices like working during Xmas or Thanksgiving :). Pre-sales campaigns can be very stressful and can burn a person out. So if you can’t work in a high-pressure environment and have strict rules about your working hours, you might not like this role. Believe me, there are some customers who keep evaluating for a very long time, or who evaluate now and then re-evaluate after a couple of months, and there are reasons why they do this (the first and foremost is to check the quality of the tool :) ). So it is tiresome, and a willingness to walk that extra mile to win the benchmark is a must.

5. Post-Sales/Deployment: A successful tech campaign and business (pre-sales) win is just the starting step. The 20/80 rule applies here (80% of the business comes from 20% of your customers). So successful deployment of the product across the depth and breadth of the company is key. It also gives the sales folks a chance to push other products into the company. Dedicated and fast support is one of the strategies. Providing support for their first tapeout with your company’s product is another key. A successful deployment also means working with the design methodology groups and designers (front end and back end), understanding their design goals and resolving their issues. It might be necessary to come up with a design methodology/flow, either on a project basis or company wide. Constant interaction with the design team is a must. This also helps the appln engineer see what is lacking and fill in the gaps, either through scripting or by getting R&D to implement the missing features and enhance the product.

6. Evangelism: Not many folks know about this. Some people confuse it with marketing. It is virtually non-existent in the EDA/semiconductor industry. Marketing is more about the product; evangelism is about creating a community around the product. Who could be a better person than the appln engineer to do this?

7. Customer facing skills: Only a few people have this skill and like to be in front of customers. You need to have a thick skin and take all the yelling :). Imagine you are presenting or giving a demo to a customer and your tool crashes every time you invoke it :). Scary, isn’t it? OK, let’s ease up a bit: it crashes only a few times. How do you face the customer now? You should be able to ease and control the situation. I can list hundreds of scenarios like this. It also takes a great deal of energy to say NO to a customer. Believe me, it’s not an easy situation. You need to be diplomatic when saying so, so that relationships aren’t hurt. It all comes with experience and the ability to adapt to the situation on the fly :)

8. Issue management: A very important skill. You should be in constant touch with the customer, track down their issues and have a proper resolution for each, with a fix schedule. It is important that the customer acknowledges and is actually OK with the fix schedule. If the schedule is missed for any issue, the customer should be informed immediately.

9. Time management: The ability to multi-task is a must.

10. Debugging skills: If you are not good at debugging or can’t debug fast enough, you are not fit to be an appln engineer.

11. Attitude: Having the right attitude and the ability to learn things fast is necessary to succeed in the job. You might need to learn different technologies and products/tools to perform your job better.

12. Peer-to-peer communication: Try to maintain peer-to-peer communication. There is no book which teaches you how to debug faster or perform each of the skills I have mentioned so far. It is only through peer-to-peer communication that you can learn. You might have an experienced AE in your organization who can give you pointers. It’s not that you can’t solve it; it’s that the other AE has done it 100 times and so knows the common pitfalls. You can avoid making the same mistakes and save valuable time.

13. Product strategy: This requires knowledge of competitors’ products and their features, different technologies, and business sense. Only then will you be able to position the product strategically in front of the customer.

14. Licensing model: It is not essential, but it is a very good skill to possess: understanding how licensing works, such as which features can be licensed (to understand this, you need to be able to justify why the customer would pay for the feature in the first place). If you know of any other avenues through which you can generate revenue for your software, it surely helps the sales organization. Remember, sales fix everything :)

In short, the appln engineer is the best evangelist an EDA company can have. He is the face of the company; the most knowledgeable technical person, who can deliver solutions out of the box; the person with the best access to the people who use the tool, who can therefore promote the product to the real decision influencers; and the best person to give feedback to the marketing and sales organizations, drive product usability in the field, and enhance and validate the product (and its features).

So it sounds like a fun job, right!! At least I love it, and I’m constantly being challenged with newer technologies, products, and sales and marketing campaigns :)

I would appreciate any feedback or comments.

1 comment January 8th, 2007

About the Author

Kiran Bulusu is a Field Applications Engineer with experience in Formal Verification, Logic Synthesis, DFT, Timing Closure, Floorplan and Place and Route, RTL-GDSII Design Methodology and Flow development, and Pre-Sales and Post-Sales of the product. He is an evangelist and has experience in technical marketing in the EDA and semiconductor industry. His other interests include Management Consulting, Marketing and Entrepreneurship. He is currently employed at Magma Design Automation.
