

# Agenda

- ► Introduction:
  - Freescale
  - MSC8144
  - dal\_4exec test case
- ► Timing/Area closure problem
- ► Solutions:
  - Encounter flow
  - Timing improvement methods
  - placeDesign flow
- ► Summary

Freescale<sup>™</sup> and the Freescale logo are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. © Freescale Semiconductor, Inc. 2007.






# **Introduction to Freescale**

## ► About Freescale Semiconductor

- Freescale Semiconductor is a global leader in the design and manufacture of embedded semiconductors for the automotive, consumer, industrial, networking and wireless markets.
- Freescale is one of the world's largest semiconductor companies with 2006 sales of \$6.4 billion (USD). www.freescale.com

## ► Freescale Semiconductor Israel (FIL)

- There are two design centers in Israel: the main center in Herzliya and a second center in Omer.
- "Diversity is the keyword"
  - FIL is a unique design center with groups developing solutions for networking, DSP cores & platforms, cellular & base stations.
  - Design groups deal with <u>all</u> design aspects, starting from product definitions, going through design, verification & implementation stages, and ending with Silicon testing.





# Introduction: MSC8144

- ► Multicore StarCore<sup>™</sup>-based programmable DSP at 4 x 1GHz (16GMACs), with the industry's highest performance at the lowest power per channel
- ► 3rd generation of market leadership in multicore DSPs, with a proven, robust architecture
- ► Industry's largest embedded memory, eliminating the need for external memories and reducing total system cost, board space and power dissipation
- ► Supports next-generation and legacy interfaces including Dual Gigabit Ethernet, Serial RapidIO, UTOPIA, PCI, DDR2 and TDM
- ► Accelerates 'Time-to-Market' with Freescale software libraries, Framework and Codecs, 'Best-in-Class' Multicore IDE, SmartDSP OS or Commercial RTOS
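
As a quick sanity check on the headline figure, the quoted 16GMACs at 4 x 1GHz works out to 4 MACs per core per cycle. The sketch below only restates that arithmetic; the function name is illustrative, and the even split across identical cores is an assumption rather than something stated on the slide.

```python
def macs_per_core_per_cycle(total_gmacs: float, cores: int, clock_ghz: float) -> float:
    """Average MAC operations per core per clock cycle.

    Assumes the total throughput is split evenly across identical cores.
    """
    return total_gmacs / (cores * clock_ghz)


# Figures quoted above: 16 GMACs from 4 cores at 1 GHz each.
print(macs_per_core_per_cycle(16, 4, 1.0))
```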





# Introduction: dal\_4exec

## ► Data arithmetic logic unit

- Performs 4 operations per cycle.
- Highly optimized RTL

## ► Design Details:

- Design Name: dal\_4exec
- Technology: 90nm SOI
- Max clock frequency: 1 GHz
  - Custom clock tree
- Number of instances: approx. 100K
  - No Macros
- Mostly data path
- Total Modes: 1







# Timing/Area closure problem

## ► Timing

- Total paths in the design: 2600.
- 40% of paths are failing, with slacks less than -60ps.
- The design can reach a WNS of -60ps easily. Reducing WNS/TNS further is the problem.
- The design is very sensitive to placement.
- We are fighting for picoseconds, yet 60ps is 6% of the 1 ns clock period at the 1 GHz target!
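
The 6% figure follows from the 1 GHz target: the clock period is 1000ps, so a 60ps violation is 6% of it. The sketch below works through that arithmetic and, assuming the 40% applies to the 2600 total paths, the resulting failing-path count. The function names are illustrative only.

```python
def clock_period_ps(freq_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def slack_fraction_of_period(slack_ps: float, freq_ghz: float) -> float:
    """Magnitude of a slack value as a fraction of the clock period."""
    return abs(slack_ps) / clock_period_ps(freq_ghz)


def failing_path_count(total_paths: int, failing_percent: int) -> int:
    """Number of failing paths, assuming the percentage is of all paths."""
    return total_paths * failing_percent // 100


print(clock_period_ps(1.0))                 # period at the 1 GHz target
print(slack_fraction_of_period(-60, 1.0))   # WNS of -60ps relative to the period
print(failing_path_count(2600, 40))         # 40% of 2600 paths
```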

## ► Area

- The design needed to close timing at the maximum possible density, taking into account the addition of:
  - Custom clock tree
  - Hold fixing
  - SI closure
- The extra effort on timing caused the area to increase significantly.



# **Encounter Flow**









# **Timing improvement methods**

## ► RTL Compiler:

- Smart ungrouping during synthesis.
- Path adjust flow: makes RC work harder on critical paths.
- Clock-gating cells instead of muxes to retain sequential cell state, which reduces the logic in the "data" path.

## ► Encounter:

- RCfactor=1. Although Ostrich reported a default vs. detailed C factor of 1.13, we stayed with a factor of 1 and fixed the bad capacitance cases in post-route optimization.
- Placement with "inPlace Optimization".
- Increasing the share of paths treated as critical during placement to 30% (default is 2%).
- "setOptMode -noPreserveModuleFunction"









- ► A special variable (beta) was provided by R&D to increase the share of paths given priority to 30%.
- Note: The slack distribution in dal\_4exec is very symmetric. Applying a higher priority share on other designs will create excessive congestion.
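
To give a feel for the jump from the 2% default to 30%, the sketch below counts how many paths would be prioritized in each case. It assumes the percentage is measured against the 2600 total paths quoted earlier, which the slides do not state explicitly; the function name is illustrative and is not an Encounter setting.

```python
def prioritized_paths(total_paths: int, percent: int) -> int:
    """Paths given priority, assuming the percentage is of all paths."""
    return total_paths * percent // 100


TOTAL_PATHS = 2600  # total paths reported for dal_4exec

default_count = prioritized_paths(TOTAL_PATHS, 2)    # 2% default
raised_count = prioritized_paths(TOTAL_PATHS, 30)    # 30% setting used here

print(default_count, raised_count, raised_count // default_count)
```

Under that assumption the placer is asked to prioritize 15 times as many paths, which is consistent with the note that the same setting can congest designs whose slack distribution is less symmetric.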





# setOptMode -noPreserveModuleFunction

► Determines whether to preserve logical functions at hierarchical output ports. Using this feature will enable the Encounter software to share logic across hierarchical boundaries.

Default: -preserveModuleFunction









## ► Results:

- The timing target was met: the design achieved 1GHz!
- Area was much lower (20%) after implementing all methods, compared to the starting point.

## ► Summary notes:

- The dal\_4exec is a unique block, containing a large number of symmetrical timing critical data paths. Solutions used in this block will not necessarily be useful in more general blocks.
- Closing timing on this block required combining very good knowledge of the design with expert, in-depth knowledge of the tools.










