Test Infrastructure Design for Core-Based System-on-Chip Under Cycle-Accurate Thermal Constraints
Thomas Edison Yu, Tomokazu Yoneda, Krishnendu Chakrabarty and Hideo Fujiwara
Nara Institute of Science and Technology, Duke University
Outline
- Background
  - Core-based testing
  - Power / heat-related problems during test
  - Limits of power-constrained SoC testing
- Related Works
- Objectives
- Proposed Method
  - Fixed-TAM architecture
  - Scheduling techniques
  - Thermal model & cost function
- Experimental Results
- Summary
Core-Based System-on-Chip
SoC (System-on-Chip): one-chip system
- Uses pre-designed functional blocks called IP cores
- Reduced design and manufacturing cost
- IP design and test reuse
Test challenges
- High test data volume
- Limited access to internal circuitry
- Long test application time (TAT)
- High test power and temperature
[Figure: example SoC; image courtesy of WindowsForDevices.com]
Design for Test (DFT) for SoCs
TAM (Test Access Mechanism)
- Dedicated test bus connecting the test source/sink and the core-under-test (CUT)
Wrapper
- Isolates the CUT from other cores during test
- Interfaces TAM & core
- Enables test reuse
[Figure: example SoC with wrapped cores (MPU 1 GHz, DRAM 500 MHz, UDL 300 MHz, MPEG 200 MHz, I/O 100 MHz, DSP 150 MHz) connected by a TAM between test source and test sink. Annotations: limited number of I/O pins; most cores are not directly controllable or observable]
Test Scheduling
- Determine the test sequence for each core
- Minimize test time under given constraints: power, TAM width, temperature, etc.
- Parallel testing results in high test power and temperature
[Figure: test schedule drawn as rectangles for Cores 1-6, bounded by the overall TAM width and plotted against time t; each core occupies an area of TAM width x time]
Power & Heat-Related Issues During Test
- High test power dissipation can cause chip damage or random errors, resulting in yield loss
- High power dissipation can cause overheating
  - Every 20 °C rise in temperature adds approx. 5-6% timing delay
  - Chip packaging is designed for the worst case of typical application
Limits of Power-Constrained SoC Testing
- Ignores the non-uniform spatial power distribution across the chip
  - Layout and core proximity affect temperature
  - Cannot ensure thermal safety
- Ignores the effect of time on temperature
[Figure: two floorplans of cores 1-9, testing 5 cores in parallel; chart of max. power and max. temperature (°C) per schedule no. (1-5) for the p93791 SoC]
*Results using the HotSpot temperature simulator (Skadron et al., ISCA '03)
Objectives
Given an SoC, test data and a maximum allowable temperature, propose a thermal-safe test architecture design and test scheduling methodology such that:
- the given thermal constraint is satisfied at all times during test
- the test application time is minimized
Related Works
Thermal-safe test scheduling (Rosinger et al., DATE '05)
- Group cores with fixed wrappers into test sessions
- Minimize temperature per session
Uniform heat distribution (Liu et al., DFT '05)
- Use layout information to determine wrapper configuration and test schedule
Test set partitioning and interleaving (He et al., DFT '06)
- Partition the test set when the temperature constraint is exceeded
- Interleave test partitions sequentially; allow other cores to cool down
Characteristics of Related Works
- Use constant power per core
- Ignore effects of active cores on non-active cores
- Do not optimize both the TAM and wrapper configuration
Thermal-safe TAM / Wrapper Co-optimization (Yu et al., ATS '07)
- Uses cycle-accurate power profiles per core wrapper configuration
- Considers heat exchange across all cores and the test sequence via heat dissipation paths
- First work to optimize TAM and wrapper under a thermal constraint
Research Contributions
Improvements over the ATS '07 work
- Allow schedule reshaping
- Allow dynamic test partitioning and interleaving
- Allow insertion of bandwidth matching circuitry
- Find solutions under much tighter temperature constraints
Preserve the advantages of the ATS '07 work
- Cycle-accurate power and temperature profiles
- Consider inter-core temperature effects
- Optimize TAM & wrapper architecture
Target Test Architecture
Characteristics:
1. TAM has fixed partitioning
2. Each core is assigned to only one TAM partition
3. Cores in the same TAM are scheduled sequentially; the order is variable
4. Cores in different TAMs can be scheduled in parallel
[Figure: SoC d695 with three TAM partitions (TAM 1 = 16 bits, TAM 2 = 8 bits, TAM 3 = 8 bits) serving the core groups {c1, c3, c5}, {c2, c6, c8} and {c4, c7, c9, c10}; two alternative schedules, A and B, order the cores differently within each partition along the time axis t]
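The four characteristics above imply a simple rule for the test application time: the load of each TAM partition is the sum of its cores' test times, and the TAT is the largest load. The sketch below illustrates that rule only. The function name, the core grouping and the test times are illustrative assumptions, not benchmark data, and it assumes no idle time is inserted between tests.

```python
def schedule_tat(tam_assignment, test_time):
    """Return (TAT, per-partition load) for a fixed-TAM schedule.

    Cores on the same TAM partition run back to back, partitions run in
    parallel, so the TAT is the largest per-partition sum.
    Assumption: no idle time is inserted between consecutive tests.
    """
    per_tam = {
        tam: sum(test_time[core] for core in cores)
        for tam, cores in tam_assignment.items()
    }
    return max(per_tam.values()), per_tam


# Hypothetical grouping and test times in cycles (not taken from the slides).
tam_assignment = {
    "TAM1": ["c1", "c3", "c5"],
    "TAM2": ["c2", "c6", "c8"],
    "TAM3": ["c4", "c7", "c9", "c10"],
}
test_time = {
    "c1": 100, "c3": 300, "c5": 600,
    "c2": 200, "c6": 500, "c8": 150,
    "c4": 250, "c7": 120, "c9": 80, "c10": 400,
}

tat, per_tam = schedule_tat(tam_assignment, test_time)
print(tat, per_tam)
```

Under this assumption, reordering cores inside a partition leaves the TAT unchanged, which is why the order is available as a free variable for controlling temperature.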
Test Schedule Reshaping, Partitioning & Interleaving
Test schedule reshaping
- Reconfigure the schedule to minimize the temperature of the target core
- Ex.: avoid testing hot cores in parallel or in sequence
Test partitioning and interleaving
- Partition a test before the temperature exceeds the constraint
[Figure: three schedules of cores c1-c10 on three TAM partitions, with max. temp. = 110 °C, 100 °C and 95 °C; in one of them core c5 is split into c5a and c5b with c3 tested in between]
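The partitioning and interleaving step can be pictured as a list transformation on one TAM partition: the hot core's test is cut in two and another core's test is placed in the gap. The sketch below shows only that reordering. The function name and the "a"/"b" suffixes are assumptions made for illustration, and it does not decide where to cut the test, which in the method depends on when the temperature would exceed the constraint.

```python
def partition_and_interleave(order, hot_core, filler):
    """Split hot_core into two partitions and test `filler` in between.

    order    -- test order on one TAM partition, e.g. ["c5", "c3", "c1"]
    hot_core -- core whose test is partitioned
    filler   -- core moved between the two partitions
    Assumption: each core appears at most once in `order`.
    """
    if hot_core == filler or hot_core not in order or filler not in order:
        raise ValueError("hot_core and filler must be distinct cores in order")
    result = []
    for core in order:
        if core == filler:
            continue  # re-inserted between the two partitions below
        if core == hot_core:
            result += [hot_core + "a", filler, hot_core + "b"]
        else:
            result.append(core)
    return result


print(partition_and_interleave(["c5", "c3", "c1"], "c5", "c3"))
```

With the order c5, c3, c1 this yields c5a, c3, c5b, c1, the same pattern as the split of c5 in the schedule figure.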
Bandwidth Matching
Frequency throttling
- Lower scan frequency => lower power
Add bandwidth matching circuitry to a TAM partition
- 2x TAM width at 1/2 frequency => temperature reduction, ideally without TAT increase
[Figure: Cores 1-3 on a TAM partition, with a DMUX and a MUX inserted on the TAM]
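The "2x width, 1/2 frequency" claim can be checked with back-of-the-envelope arithmetic. The sketch below assumes a deliberately simple model that is not from the slides: the number of scan cycles equals the test data volume divided by the TAM width, and each scan cycle costs a fixed amount of energy because every scan cell shifts on every cycle regardless of how many chains there are. All names and numbers are hypothetical.

```python
def scan_test_estimate(width_bits, freq_hz, data_bits, energy_per_cycle_j):
    """Estimate (test time in seconds, average power in watts).

    Assumed model (illustrative only):
      cycles = data_bits / width_bits
      time   = cycles / freq_hz
      power  = energy_per_cycle_j * freq_hz
    """
    cycles = data_bits / width_bits
    time_s = cycles / freq_hz
    avg_power_w = energy_per_cycle_j * freq_hz
    return time_s, avg_power_w


# Hypothetical core: 8 Mbit of test data, 2 nJ per scan cycle.
base = scan_test_estimate(8, 100e6, 8_000_000, 2e-9)      # 8 bits at 100 MHz
matched = scan_test_estimate(16, 50e6, 8_000_000, 2e-9)   # 16 bits at 50 MHz
print(base, matched)
```

Under these assumptions the test time stays the same while the average power halves, which is the ideal case named on the slide. Real designs pay for the extra TAM wires and the DMUX/MUX circuitry.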
Proposed Method Flow
1. Initial TAM design & schedule
2. Thermal simulation
3. Maximum temperature over time within Temp_max?
   - YES: END
   - NO: go to 4
4. Reshape possible?
   - YES: reshape schedule, return to 2
   - NO: go to 5
5. Can partition hot core?
   - YES: do partitioning, return to 2
   - NO: go to 6
6. Virtual TAM limit reached?
   - YES: END
   - NO: do bandwidth matching, return to 2
Design considerations
- Scheduling is NP-hard: need a heuristic scheduling algorithm
- Thermal simulation is time consuming: need a simplified thermal model and a simple thermal cost function
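The decision order in the flow (reshape first, then partition, then bandwidth matching until the virtual TAM limit) can be sketched as a small driver loop. Everything below is a hypothetical illustration: the callback names, the toy schedule encoding and the toy temperature model are assumptions, and the real steps are far more involved. The sketch also assumes that a constraint is met when the simulated maximum temperature is less than or equal to the limit, and that the bandwidth-matching callback restarts from a fresh schedule.

```python
def thermal_safe_flow(schedule, temp_max, simulate, reshape, partition,
                      match_bandwidth, virtual_tam_limit):
    """Return (schedule, success) following the reshape/partition/match order.

    simulate(s)        -> maximum temperature of schedule s
    reshape(s)         -> new schedule, or None if reshaping is not possible
    partition(s)       -> new schedule, or None if no hot core can be split
    match_bandwidth(s) -> new schedule with one more bandwidth-matching step
    All four callbacks are hypothetical placeholders.
    """
    matching_steps = 0
    while True:
        if simulate(schedule) <= temp_max:   # assumed: "within" means <=
            return schedule, True
        candidate = reshape(schedule)
        if candidate is None:
            candidate = partition(schedule)
        if candidate is None:
            if matching_steps >= virtual_tam_limit:
                return schedule, False       # no thermal-safe schedule found
            matching_steps += 1
            candidate = match_bandwidth(schedule)
        schedule = candidate


# Toy model: a schedule is (reshapes, partitions, bandwidth-matching steps).
def simulate(s):
    return 110.0 - 5.0 * s[0] - 5.0 * s[1] - 10.0 * s[2]

def reshape(s):
    return (s[0] + 1, s[1], s[2]) if s[0] < 1 else None

def partition(s):
    return (s[0], s[1] + 1, s[2]) if s[1] < 1 else None

def match_bandwidth(s):
    return (0, 0, s[2] + 1)  # restart from the initial ordering

print(thermal_safe_flow((0, 0, 0), 95.0, simulate, reshape, partition,
                        match_bandwidth, virtual_tam_limit=3))
```

The loop terminates as long as `reshape` and `partition` eventually return `None`, which the toy callbacks guarantee by capping each at one application per bandwidth-matching step.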
Thermal Model
Model the SoC as a network of thermal resistances
- First proposed by P. Rosinger, K. Chakrabarty, et al. (DATE '05)
- Takes advantage of thermal and electrical duality
- Only lateral thermal resistances are considered
Models lateral heat flow
- More heat flow from cores = higher temperature
[Figure: floorplan of CORE1, CORE2 and CORE3 with lateral resistances R_1,North, R_1,West, R_1,2, R_1,3, R_2,North, R_2,East, R_2,1, R_2,3, R_3,1, R_3,2 and R_3,South; beside it, a TAM schedule of Cores 1-3 over time t]
Thermal Cost Function
Tcont_j(c_i): thermal contribution of core c_j to core c_i
[Equation: Tcont_j(c_i) expressed in terms of R_ji, R_TOT,j, Pavg_j, Trel_ji and TAT_i]
where:
- R_ji: lateral thermal resistance from core c_j to c_i (R_ii = 0)
- R_TOT,j: total lateral resistance of core c_j
- Pavg_j: average power dissipation of c_j
- Trel_ji: relative test time of c_j and c_i
Goal: minimize the thermal contribution to the hotspot core c_i
[Figure: the proposed method flow (as on the Proposed Method Flow slide), with an added step: reset cost values and revert to initial schedule]
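To make the idea of a per-core thermal contribution concrete, here is a minimal sketch under an assumed formulation that is not taken from the slide: the heat a core sends toward the hotspot core is its share of lateral conductance on that path, multiplied by its average power and by the fraction of the hotspot core's test that it overlaps. The function name, the dictionary layout and all values are hypothetical, and the slide's own expression may differ.

```python
def thermal_contributions(hot, lateral_r, p_avg, overlap):
    """Return {core j: assumed thermal contribution of j to core `hot`}.

    lateral_r[j][k] -- lateral thermal resistance from core j to neighbor k
    p_avg[j]        -- average test power of core j
    overlap[j]      -- fraction of hot's test time during which j is active
    Assumed model: heat splits across a core's lateral paths in proportion
    to their conductance (1 / resistance).
    """
    contributions = {}
    for j, neighbors in lateral_r.items():
        if j == hot or hot not in neighbors:
            continue  # a core does not contribute to itself in this sketch
        total_conductance = sum(1.0 / r for r in neighbors.values())
        share = (1.0 / neighbors[hot]) / total_conductance
        contributions[j] = share * p_avg[j] * overlap[j]
    return contributions


# Hypothetical three-core layout (arbitrary units).
lateral_r = {
    "c1": {"c2": 2.0, "c3": 2.0},
    "c2": {"c1": 2.0, "c3": 4.0},
    "c3": {"c1": 2.0, "c2": 4.0},
}
p_avg = {"c1": 10.0, "c2": 8.0, "c3": 12.0}
overlap = {"c1": 1.0, "c2": 0.5}

contrib = thermal_contributions("c3", lateral_r, p_avg, overlap)
worst = max(contrib, key=contrib.get)
print(contrib, worst)
```

In this toy layout c1 contributes the most to hotspot core c3, so a scheduler guided by such a cost would first try to move c1's test away from c3's in time.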
Experimental Setup
Benchmark ITC '02 SoCs
- d695 and p22810 with hand-crafted layouts
- Cycle-accurate power profiles from Samii et al. ("Cycle-Accurate Test Power Modeling and its Application to SoC Test Scheduling," Proc. of IEEE International Test Conference (ITC), pp. 1-10, 2006)
Scheduling parameters
- TAM width = 16, 24, 32, 64 bits
- Tmpmax: initially set to the maximum temperature of the schedule obtained with no power or thermal constraint, then decreased in 5 °C steps
Max. Temperature & Power vs. Temp. Constraint: d695 with TAM = 16 bits
[Figure: max. temperature (deg. C) and max. power (switches) plotted against the temperature constraint, from 106.28 to 56.28 deg. C in 5 deg. C steps]
Observation: max. power stays relatively constant while temperature is lowered further
Max. Temperature & TAT vs. Temp. Constraint: d695 with TAM = 16 bits
[Figure: max. temperature (deg. C) and TAT (cycles) plotted against the temperature constraint, from 106.28 to 56.28 deg. C in 5 deg. C steps]
Observation: TAT changes relatively little while temperature is lowered further
Max. Temperature & Power vs. Temp. Constraint: p22810 with TAM = 32 bits
[Figure: max. temperature (deg. C) and max. power (switches) plotted against the temperature constraint, from 161.17 to 71.17 deg. C in 5 deg. C steps]
Observation: lower max. power doesn't ensure lower temperature
Max. Temperature & TAT vs. Temp. Constraint: p22810 with TAM = 32 bits
[Figure: max. temperature (deg. C) and TAT (cycles) plotted against the temperature constraint, from 161.17 to 71.17 deg. C in 5 deg. C steps]
Minimum Temperature Comparison

SoC      TAM   ATS '07      Proposed     dTAT (%)
d695     16    92.79 °C     55.8 °C      -152
d695     24    91.49 °C     58.46 °C     -46
d695     32    77.15 °C     69.91 °C     16
d695     64    84.71 °C     80.59 °C     26
p22810   16    133.02 °C    77.71 °C     -53
p22810   24    110.1 °C     102.29 °C    7
p22810   32    109.36 °C    71.17 °C     -98
p22810   64    107.25 °C    92.79 °C     -20

Max. 40% reduction in minimum temperature
Summary
- Studied the impact of test-set partitioning and bandwidth matching on thermal-aware TAM / wrapper optimization and test scheduling
- The algorithm is based on a computationally tractable thermal-cost model, which makes thermal simulation more usable and ensures thermal safety
- The results show that the method:
  - allows more flexibility to trade off temperature against TAT while minimizing the TAT increase
  - provides solutions even under tight temperature constraints, including situations where previous work fails to find a solution