# Impact of Faulty Links on Quality-of-Service in Network-on-Chip under Different Traffic Patterns

Krishan Kumar Paliwal<sup>†</sup>, Vijay Janyani<sup>††</sup>, M.S.Gaur<sup>†</sup> and Vijay Laxmi<sup>†</sup>,

<sup>†</sup>Department of Computer Engineering, <sup>††</sup> Department of Electronics & Communication Engineering, Malaviya National Institute of Technology, Jaipur, INDIA-302017

#### Summary

The objective of this paper is to introduce the impact of faulty links on Quality of Service in Network-on-Chip (NoC) under different traffic pattern. NoC paradigm has made it possible to concurrently run multiple applications on IP-core based System on Chip. It is therefore necessary to predict the multi-processor systems-on-chip communication, which is a critical issue and needs to be addressed by the right mix of soft and hard real-time guarantees. To meet this requirement state of the art packet switched NoC provide different levels of quality of service (QoS) such as best effort (BE) and guaranteed throughput (GT). This paper presents a novel scheme which compares and evaluates the performance of guaranteed throughput and best effort traffic in Network-on-Chip under different synthetic traffic generators and highlights its dependence in terms of latency on the type of traffic patterns and number of link failures for mesh topology. It also explores the effect of robustness of a particular traffic pattern in terms of fault tolerance (variation in average latency as the number of link failures increases) for planar adapter routing function on latency of GT and BE traffic class for mesh topology.

#### Key words:

Guaranteed Throughput, Best Effort, robustness, fault tolerance and QoS.

## **1. Introduction**

Advances in the semiconductor technologies allow designers to integrate tens or hundreds of IP blocks together. To date, the shared bus scheme has been the system communication architecture of choice but suffers from the problem of non scalability. A global bus implies large delays and huge power consumption and in deep submicron (DSM) design of long, wide bus becomes a real challenge. The scalability and success of switch based networks and packet based communication has inspired the researchers to propose the network-on-chip (NoC) architecture as a viable solution to the complex on-chip communication problems [1]. The advantages of the NoC approach lies in the reuse of its communication network, predictability (since the network wires are structured and their electrical parameters can be very well controlled and

optimized) and the scalability [2]. Due to its simpler design

two dimensional mesh topology remains as the most popular choice.



QoS refers to the capability of a network to provide better service to selected network traffic. There are primarily two types of traffic classes- Guaranteed Throughput (GT) and Best Effort (BE) [3, 4, 5]. Communication services with guarantees on throughput and latency enables predictable system design. Although necessary for real time applications this results in poor resource utilization for applications that require variable bit rate (VBR) communication. Best Effort is a communication service with no guarantees on latency and bandwidth. It can give high resource utilization by using unreserved unused resources. However BE traffic is prone to congestion. Guarantees in case of GT traffic are given by reserving the communication resources in the NoC (wires and buffers). QoS can also be defined as fairly allocating the resources as per the demand of the application rather than efficiently allocating the resources. Basically, Guaranteed Services

Manuscript received March 5, 2009 Manuscript revised March 20, 2009

(GS) incur predictability, a quality which is often desirable, for example, in real-time systems, while BE improves the average resource utilization [6, 7, 8]. Strictly speaking, BE refers to communication for which no commitment can be given, whatsoever. In most NoC works, however, BE covers the traffic for which only correctness and completion are guaranteed, while GS is traffic for which additional guarantees are given, that is, on the performance of a transaction. In order to give hard guarantees, GS communication must be logically independent of other traffic in the system. This requires connection oriented routing. Connections are instantiated as virtual circuits, which use logically independent resources, thus avoiding contention [9, 10]. The virtual circuits can either be implemented by virtual channels, time slots, parallel switch fabric, etc. Rest of the paper is organized as follows. Section-II deals with NoC simulation framework. Section-III deals with the experimental setup, Section-IV deals with simulation results and analysis and Section-V covers conclusion and scope of future work.

### 2. NoC Simulation Framework

Booksim, a network-on-Chip discrete event, cycle accurate simulator targeted at Network on Chip (NoC) research is used. It allows to experiment with various options available at every stage of NoC design: topology, switching technique, virtual channels, buffer parameters, routing mechanisms and applications. Besides built in capabilities, it can be easily extended to include new applications and routing algorithms. The simulator can output performance metrics (latency and throughput) for a given set of choices. It provides support to experiment with NoC design in terms of different routing functions, different traffic types and varying load on various topologies.

## 3. Experimental Setup

To validate the network a discrete event, cycle accurate simulation is performed with the help of NoC simulator. In this experimental setup a 8x8, 2-dimensional mesh topology is selected and wormhole switching technique is employed using both types of traffic classes that is, GT as well as BE traffic. Basically we have two types of flits. BE flits having the priority as 0 (lower priority) and GT flits having the priority as 1 (higher priority).

The communication traffic pattern used in the simulation is basically of seven types of permutation synthetic traffic pattern, out of which five are bit permutations and two are digit permutation (Tornado and neighbour) [11]. In this

paper we restrict ourselves to uniform, Bit complement, Bit reverse, shuffle and transpose traffic patterns only. Bit permutations traffic pattern includes a uniform traffic pattern, in which each source sends an equal amount of traffic to each destination. Bit complement, Bit reverse, Shuffle and Transpose are those traffic patterns in which the destination address in bit is complemented, reversed, shuffled and transposed of the source bit address. The traffic pattern other than uniform comes under the category of permutation traffic, which is typically used to stress the given topology or the routing algorithm. To describe these communication patterns we make use of the following notations.  $s_i$  (d<sub>i</sub>) denotes the i<sup>th</sup> bit of source (destination) address whereas  $s_x$  (d<sub>x</sub>) denotes the x<sup>th</sup> radix-k digit of the source (destination) address. The bit length of the address is  $b=\log_2 N$ , where N is the number of nodes in the network.  $d_i = s_i$ 

Bit complement

| Bit reverse | $d_i = s_{b - i - 1}$                 |
|-------------|---------------------------------------|
| Shuffle     | $d_i\!=s_{i\text{-}1 \text{ mod } b}$ |
| Transpose   | $d_i = s_{i+b/2 \mod b}$              |

These traffic patterns are based on communication patterns arising in particular applications, for example transpose pattern is induced as a result of matrix transpose or cornerturn operations while shuffle permutation is caused as a result of Fast Fourier Transform (FFT) or sorting applications [12]. In this experimental setup 10% of the total traffic is GT which is given more priority than the remaining 90% BE traffic on the assumption that the priority traffic constitute a smaller fraction of the total traffic.

In this experimental setup average latency and peak latency of the two traffic classes GT and BE are evaluated for mesh and torus topologies for different traffic patterns and keeping other parameters constant. Routing function is taken as planar adapter, fail seed parameter is taken as 1, number of virtual channels is kept 8, and virtual channel buffer size is kept as 8, which is of a flit size. iSLIP virtual channel allocator is used. iSLIP is separable allocation method that uses round-robin arbiters and updates the priority of each arbiter only when that arbiter generates a winning grant. By rotating the winning arbiters, iSLIP acts to stagger the priority of the input arbiters resulting in a fewer conflicts at the output stage. The update of a priority only occurs when arbitration results in a grant and, as in a round robin arbiter, priorities are updated so that a winning request has the lowest priority in the next round.

Number of flits per packet is kept at 20. Class priority is used and packet injection rate (PIR) is varied from 0.05 to 0.4 and the numbers of link failures are varied from 0 to 15. The link failures are introduced as shown in TABLE I. The network is evaluated on the basis of average latency of both GT and BE traffic by varying the number of link failures and the type of traffic pattern.

Table 1.

| No. of               | Failure at                                         |                             |  |
|----------------------|----------------------------------------------------|-----------------------------|--|
| Link<br>Failure<br>s | Node Number                                        | Inter-<br>connect<br>Number |  |
| 1                    | 13                                                 | 0                           |  |
| 2                    | 14, 46                                             | 0                           |  |
| 3                    | 14,46,37                                           | 0                           |  |
| 4                    | 14, 46, 37, 12                                     | 0                           |  |
| 5                    | 14, 46, 37, 12                                     | 0                           |  |
|                      | 52                                                 | 2                           |  |
| 6                    | 14, 46, 37, 12                                     | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 7                    | 14, 46, 37, 12, 10                                 | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 8                    | 14, 46, 37, 12, 10, 41                             | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 9                    | 14, 46, 37, 12, 10, 41, 17                         | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 10                   | 14, 46, 37, 12, 10, 41, 17, 34                     | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 11                   | 14, 46, 37, 12, 10, 41, 17, 34, 19                 | 0                           |  |
| 11                   | 52, 43                                             | 2                           |  |
| 12                   | 14, 46, 37, 12, 10, 41, 17, 34,<br>19, 50          | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 13                   | 14, 46, 37, 12, 10, 41, 17, 34, 19, 50, 21         | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 14                   | 14, 46, 37, 12, 10, 41, 17, 34, 19, 50, 21, 28     | 0                           |  |
|                      | 52, 43                                             | 2                           |  |
| 15                   | 14, 46, 37, 12, 10, 41, 17, 34, 19, 50, 21, 28, 30 | 0                           |  |
|                      | 52, 43                                             | 2                           |  |

#### 4. Simulation Results and Analysis

Analyzing the simulation results of 8x8, 2-dimensional mesh the following deductions were made. For lower values of injection rate, that is, when PIR equals to 0.05 all the traffic patterns have lower values of average latency and the value of average latency is lowest for Shuffle traffic followed by Bit reverse, Transpose, Uniform and Bit complement. It is evident from the Fig.2 (a) and 2(b) that as one of the links goes faulty in case of GT traffic the saturation point shifts from 0.4 to 0.3 PIR value in case of Uniform and Shuffle traffic pattern. Again as we increase the number of link failures gradually a shift in the saturation point of different traffic patterns from higher value of PIR to a lower value of PIR occurs. Increasing the number of link failure further from 4 to 5, Fig. 2(e), 2(f) the saturation point of Uniform traffic pattern changes from 0.3 to 0.2 PIR. With the increase in the number of link failure from 5 to 6, Fig. 2(f), 2(g) the saturation point of Transpose traffic pattern changes from 0.3 to 0.2 PIR. A further increase in the number of link failure from 6 to 7, Fig. 2(g), 2(h) the saturation point of Bit reverse traffic pattern changes from 0.3 to 0.2 PIR. As the number of link failure increases from 8 to 9, Fig. 2(i), 2(j) the saturation point of Shuffle traffic pattern changes from 0.3 to 0.2 PIR. The value of average latency grows although there is no shift in the saturation point of the traffic patterns with an increase in the number of link failure from 9 to 10 and 10 to 11 as shown in Fig. 2(j), (k), (l). However, again as the number of link failure increases from 11 to 12, Fig. 2(1), 2(m) the saturation point of Bit reverse traffic pattern changes from 0.2 to 0.1 PIR. As the number of link failure increases from 12 to 13, Fig. 2(m), 2(n) the saturation point of Transpose traffic pattern changes from 0.2 to 0.1 PIR and finally as the number of link failure increases from 13 to 14 and 14 to 15, for GT traffic class, Fig. 2(n), (o), (p) the value of average latency grows although there is no shift in the saturation point of the traffic patterns.

Analyzing the results for BE traffic it is evident from the Fig. 3(a), (b), (c), (d) that as one of the links goes faulty in case of BE traffic, that is from 0 to 1, 1 to 2 and 3 to 4 there is an increase in the average latency of all types of traffic patterns. A further increase in the number of link failure from 4 to 5, Fig. 3(e), 3(f) the saturation point of Transpose traffic pattern changes from 0.3 to 0.2 PIR. As the number of link failure increases from 5 to 6, Fig. 3(f), 3(g) the saturation point of Uniform traffic pattern changes from 0.3 to 0.2 PIR. Although there is no shift in the saturation point of any traffic pattern yet the average latency of all the traffic pattern increases as the number of link failure increases from 6 to 7 and 7 to 8 as shown in Fig. 3(g), 3(h) and 3(i). As the number of link failure increases from 8 to 9, Fig. 3(i), 3(j) the saturation point of Shuffle traffic pattern changes from 0.3 to 0.2 PIR. As the number of link failure increases from 9 to 10, 10 to 11, Fig. 3(j), 3(k) and 3(l) there is no shift in the saturation point of any traffic pattern

yet the average latency of all the traffic pattern increases. As the number of link failure increases from 11 to 12, Fig. 3(1), 3(m) the saturation point of Bit reverse traffic pattern changes from 0.2 to 0.1 PIR. Increasing the number of link failure from 12 to 13 shifts the saturation point of Transpose traffic pattern changes from 0.2 to 0.1 PIR as shown in Fig. 3(m), 3(n). As the number of link failure increases from 13 to 14, Fig. 3(n), 3(o) the average latency of all the traffic pattern increases. As the number of link failure increases from 14 to 15, Fig. 3(o), 3(p) the saturation point of Bit complement traffic pattern changes from 0.1 to 0.05 PIR.

Comparing the results of both GT and BE traffic classes for same number of link failures for all the traffic patterns it is evident that the average latency of BE traffic is more as compared to the average latency of GT traffic for a particular value of PIR. It is also evident from the results that when 13.39% of the total links (15 out of 112) goes faulty at 0.05 PIR the increase in average latency of GT class is more pronounced in case of Bit complement and Shuffle traffic patterns than for the BE class.

Jitter for both GT and BE traffic for different PIR values are determined. For lower values of PIR that is less than or equal to 0.2 Bit complement traffic pattern shows a large value of jitter for both type of traffic classes (BE and GT) as the number of link failures increases. As in this case network reaches saturation before PIR becomes 0.2.

In general, a large value of jitter is observed in all types of traffic pattern for those values of PIR at which the network approaches saturation. One of the possible reasons could be the increase in the maximum difference in the latency between two packets within a particular flow.

## 5. Conclusion & Future Work

It can be concluded that as the Shuffle traffic experienced lower average latency for both BE and GT classes with increase in number of link failures implying that it is more fault tolerant whereas Bit complement traffic pattern is least fault tolerant as it experiences a higher value of average latency with the increase in number of link failures among all the traffic pattern for both BE and GT. It can also be concluded that Bit complement traffic pattern is also least fault tolerant while the Shuffle traffic is most fault tolerant in terms of jitter as the value of jitter becomes greater than the threshold value of jitter sensitive traffic for a particular type of traffic with the introduction of link failures in the network then the performance of the network degrades for that particular type of traffic pattern. In other words it can be estimated that a particular traffic pattern can sustain so much of link failures with a fixed bound on jitter.

In our future work, network performance needs to be evaluated for different routing functions and other traffic types to analyze the impact of link failures on Quality of Service on NoC for different network topologies. In this paper we restricted ourselves to assigning higher priority to GT traffic which constitutes a smaller fraction (10%) of the total traffic. It can further be evaluated by increasing the fraction (greater than 10%) of the GT traffic.



GT Traffic, No. of Link failures = 0

Fig. 2 (a)



Fig. 2 (b)





Fig. 2 (c)





Fig. 2 (d)

GT Traffic, No. of Link failures = 4



Fig. 2 (e)





Fig. 2 (f)

GT Traffic, No. of Link failures = 6



Fig. 2 (g)



Fig. 2 (h)



Fig. 2 (i)

GT Traffic, No. of Link failures = 9

GT Traffic, No. of Link failures = 12



GT Traffic, No. of Link failures = 10

200





Fig. 2 (n)







Fig. 2 (l)

GT Traffic, No. of Link failures = 14



Fig. 2 (o)











Fig. 3 (b)

Fig. 3 (c)











40

20

0

0.05

0.1



Fig. 3 (h)

Fig. 3 (k)

0.3

PIR

0.4

0.2



Fig. 3 (l)



Fig. 3 (m)





Fig. 3 (n)

BE Traffic, No. of Link failures=15



Fig. 3 Average latency of BE traffic for different values of PIR and different no. of Link Failures in case of different traffic pattern.

#### References

- W.Dally and B.Towles, "Route Packets, Not Wires: On-Chip Interconnection Networks", DAC, June 2001, pp. 684-689.
- [2] L.Benini, G.De Micheli, "Networks on Chip: A New SoC Paradigm.", IEEE Computer, vol.35, January 2002, pp. 70-78.
- [3] Andreas Hansson, Kees Goossens, and Andrei Radulescu, "A Unified Approach to Mapping and Routing on a Network-on-Chip for both Best-Effort and Guaranteed Service Traffic", VLSI design, volume 2007.
- [4] Rostilav (Reuven), Dobkin, Ran Ginosar, Avinoam Kolodny, "QNoC Asynchronous Router", Integration VLSI Journal (2008), volume no:814, pp:1-13.
- [5] E. Rijpkema, K.Goossens, A. Radulescu, J. Deilissen, J. van Meerbergen, P. Weilage, and E. Waterlander, "Trade-offs in the design of a router with both guaranteed and best effort services for networks on chip" In DATE, 2003

- [6] Daniel Andreasson and Shashi Kumar, "On Improving Best Effort throughput by better utilisation of Guaranteed-Throughput channels in an on-chip communication system", 2005.
- [7] Daniel Andreasson and Shashi Kumar, "Improving BE traffic QoS using GT slack in NoC systems", NORCHIP conference, 2005, pp.44-47,21-22 Nov.2005.
- [8] Evgeny Bolotin, Israel Cidon, Ran Ginosar, Avioam Kolodny, "QNoC: QoS architecture and design process for network on chip", pp.105-128, Journal of systems Architecture 50 (2004)
- [9] W. Dally, "Virtual-channel flow control", IEEE Transactions on parallel and distributed systems, vol..3, March, 1992, pp.194-205.
- [10] A. Kumar, L.S Peh, P.Kundu, and N.K. Jha, "Express Virtual Channels: Towards the Ideal Interconnection Fabric," in ISCA '07: Proceedings of the 34th Annual International Symposium on Computer Architecture. New York, NY, USA: ACM,2007, pp. 150-161
- [11] W. Dally, B. Towles, "Principles and Practices of Interconnection Networks", San Francisco, CA, USA: Morgan Kaufmann publishers, 2004.
- [12] Jun Ho Bahn, and Nader Bagherzadeh, "A Generic Traffic Model for On-Chip Interconnection Networks," in NoCArc '08: Proceedings of the first International Workshop on Network on Chip Architectures. Lake Como, Italy, 8th Nov 2008.



Krishan Kumar Paliwal received the B.E. degree in Electronics & Communication Engineering from Malaviya National Institute of Technology, Jaipur in 1994 and served Indian Army as Officer Commanding Communication Company in Corps of Signals. He completed his M.E. degree

in Digital Communication Engineering in 2005 from J.N.V. University, Jodhpur and is presently pursuing his PhD from Malaviya National Institute of Technology, Jaipur. His Research interests are Network on Chip.



**Vijay Janyani** received the B.E. and M.E. degree in Electronics & Communication Engineering in 1994 and 1996 respectively from Malaviya National Institute of Technology, Jaipur and completed his PhD from Nottingham University, UK. He is Reader in the department of Electronics and

Communication Engineering, Malaviya National Institute of Technology, Jaipur. His areas of interest include Optoelectronics, Nonlinear Optics, Optical switching and Numerical Modeling.



Manoj Singh Gaur received the B.E. degree in Electronics & Communication Engineering and M.E. degree from Indian Institute of Science, Bangalore and completed his PhD from University of Southampton (UK). He is Professor and Head of the department of Computer Engineering, Malaviya

National Institute of Technology, Jaipur. His research interests are High Level Synthesis, Network on Chip, Network Security.



Vijay Laxmi received the B.E. degree from J.N.V. University, Jodhpur and M.Tech. Degrees in Computer Engineering from Indian Institute of Technology, Delhi and completed her PhD from University of Southampton (UK). She is Reader in the department of Computer Engineering, Malaviya

National Institute of Technology, Jaipur. Her research interests are Algorithms, Biometrics, Image Processing, and Crypt Analysis.