This page lists all the talks (offline and online) and the details at the department of Computer Science and Engineering, IIT Palakkad.
Accelerating Graph Neural Networks on Heterogeneous Accelerators
- Speaker: Kevin Jude Concessao (PhD Scholar)
- Date : 21 September 2022
- Venue: Room 301, Samgatha, Nila Campus
- Abstract:
Graph Neural Networks (GNN) are used extensively for analyzing graph-structured data with higher accuracy for node classification, graph classification, and link prediction, in areas such as social networks (analyzing human interactions), chemistry (molecule discovery, protein folding prediction), transportation, fast-placement of chips, etc. The vertices and edges in the case of a GNN are associated with high dimensional feature vectors. The patterns of interests are often hidden/entangled in these feature vectors. The operations in GNNs consist of data-driven, irregular, with interleaved dense and regular neural network operations. Hence, performance optimizations applied for graph processing and traditional DNNs cannot be applied easily to GNNs. Several forms of parallelism can be explored in GNNs - operator-level, feature vector-level, graph-partition-level, model-level, etc. Since there is a need for studying larger and deeper GNN models, there is also the need to accelerate and improve the performance and scalability of GNN training/inference algorithms on heterogeneous hardware architectures. We aim to provide GNN-specific programming abstractions for heterogeneous hardwares. We briefly introduce how graph programming models have evolved to process graph neural networks, and then introduce the challenges in processing graph neural networks on heterogeneous architectures. The presentation will conclude with an high level overview of future work.
Graphs and Posets as intersection pattern of line segments
- Speaker: Sreejith K P (PhD Scholar)
- Date : 18 July 2022
- Venue: Online
- Abstract:
An intersection graph of a family F of sets is a graph G with each set in F considered as a vertex, and an edge added between exactly those pairs of vertices that correspond to non-disjoint sets. B0-VPG graphs are intersection graphs of axis-parallel line segments in the plane and any such intersection representation is called a B0-VPG representation. In this thesis, we study a few graphs and posets which have such a B0-VPG representation. Cohen, Golumbic, Trotter, and Wang (Order, 2016) pose the question of characterizing B0-VPG permutation graphs. We respond here by characterizing B0-VPG cocomparability graphs. This helps us show that a simple necessary condition in fact characterizes B0-VPG permutation graphs. The characterization also leads to a polynomial-time recognition algorithm and its proof gives us a B0-VPG drawing algorithm for the class of B0-VPG cocomparability graphs. Our drawing algorithm starts by fixing any one of the many posets P whose cocomparability graph is the input graph G. On the set of axis-parallel line segments in the plane, we define a binary relation $<_2$ as $p <_2 q$ if and only if they are non-intersecting and the bottom-left endpoint of p precedes the top-right endpoint of q in the product order on $R^2$. The reflexive closure $\le_2$ of the relation $<_2$ restricted to the line segments of our drawing turns out to be a partial order isomorphic to the poset P. This gives a new 2D visualization for such posets.
Following the above result, we show that all AT-free outerplanar graphs are B0-VPG. In the course of the argument, we show that any AT-free outerplanar graph can be identified as an induced subgraph of a biconnected outerplanar graph whose weak dual is a path. Our B0-VPG drawing procedure works for such graphs and has the potential to be extended to larger classes of outerplanar graphs. Finally, we define a new class of graphs, viz. linear outerplanar graph, as the class of all biconnected outerplanar graphs whose weak dual is a path, and their subgraphs. This does not include all outerplanar graphs whose weak dual is a linear forest. We characterize this graph class and show that a graph in this class can be realized both as an induced subgraph and as a spanning subgraph of (different) biconnected linear outerplanar graphs. The recognition of linear outerplanar graphs and the construction of a biconnected linear outerplanar supergraph can be done in polynomial time. Hence one can extend various drawings and geometric representations available for biconnected linear outerplanar graphs to this larger class.
Low Power and High-Throughput Accelerator Architectures for CNN Inferencing
- Speaker: Shine P Sunny (PhD Scholar)
- Date : 06 July 2022
- Venue: Online
- Abstract:
Deep Neural Networks (DNN) are deployed in many real-world problems with stringent constraints on the processing hardware. However, growth in network sizes of the DNN models leads to an exponential increase in computational effort and required memory size. Dedicated domain-specific hardware accelerators are thus used to increase the performance of the inference of DNNs. CNNs constitute the major class of DNN networks. Here, The input tensors in each layer of Deep Neural Network (DNN) models are often partitioned/tiled to get accommodated in the limited onchip memory of accelerators. This leads to redundant data movement between the accelerator and different levels of the memory hierarchy degrading the performance and energy consumption during convolution. Studies show that efficient tiling schedules (commonly referred to as mapping) for a given accelerator and DNN model reduce the data movement between the accelerator and different levels of the memory hierarchy improving the performance. However, finding layer-wise optimum mapping for a target architecture with a given energy and latency envelope is an open problem due to the huge search space in the mappings.
We worked on a Reinforcement Learning (RL) based automated mapping approach to find optimum schedules of DNN layers for a given architecture model without violating the specified energy and latency constraints. In the work done, A Deep Q-Learning (DQN) agent learns a generalized policy by analyzing the complex relationship between the layer shape, target resources, and performance parameters. The learned policies easily adapt to a wide range of DNN models with different hardware configurations, facilitating transfer learning and improving the training time. Experiments show that the proposed work improves latency and energy consumption by an average of 21.5% and 15.6% respectively compared to the state-of-the-art approach for a wide range of DNN models running on NVIDIA Deep Learning Accelerator (NVDLA). The training time of RL-based transfer learning is 15× faster than that of GAMMA. RL-based tiling experiments showed that despite utilizing a highly optimized tiling, the PE utilization is relatively low. In addition to this, the weight tensor occupies a major chunk of on-chip storage and has to persist throughout the convolution process. All these are fundamental limitations of spatial domain convolution where arithmetic complexity is less. In the meantime, Recent studies show that FFTbased convolutions are faster compared to spatial domain convolutions due to higher arithmetic intensity resulting in better utilization of hardware systems whose compute–to–memory ratios are increasing over time. Moreover, FFT-based convolution offers an opportunity to tile the tensors more efficiently. Compared to spatial domain convolution acceleration, research on FFT-based convolution accelerators are in a nascent stage. In the due course of research. it is planned to explore FFT-based convolution accelerators particularly to work on dataflow and memory mapping problems associated with it. Currently, An initial design of FFT-based convolution accelerator is proposed. The proposed design features a optimum Overlap and Discard (OaD) method of convolution, DCT-based compression of filter kernels, JPEG-based compression of ofmap, event-based scheduling of FFT processor and a Gauss Complex multiplier. The OaD method of convolution simplifies the tiling problem by decomposing the ifmap into well defined patches. The convolution on these patches can be computed independently.
Balanced Connected Subgraphs
- Speaker: Ardra P S (PhD Scholar)
- Date : 28 April 2022
- Venue: Online
- Abstract:
Graphs are abstract structures used to represent binary relationships among objects. The objects are called vertices and the relation between them are called edges. Colored graphs are those where each vertex/edge is colored with a color from a finite set of colors. Graphs are tremendously powerful in modeling complex structures and hence efficient methods for searching for desired patterns in graphs are of extreme importance in Computer Science. In particular, problems related to finding subgraphs of a certain color pattern have received a lot of attention in the literature. A well-known example is the class of questions in Ramsey Theory that deals with the existence of monochromatic subgraphs (where all edges have the same color) of a specific type like cliques, spanning trees and paths. In this work, we focus on fundamental questions related to finding balanced subgraphs instead of monochromatic subgraphs where balanced means having equal number of edges of each color. These problems come under a subarea of Ramsey Theory known as Zero-Sum Ramsey Theory.
Given a graph whose edges are colored, the Maximum Balanced Connected subgraph problem is to identify a maximum-sized subgraph that is connected and balanced. While finding a maximum-sized monochromatic connected subgraph is easy (using breadth-first search or depth-first search), we show that Maximum Balanced Connected subgraph is NP-hard even on graphs whose edges are colored using 2 colors. We then study the complexity of the problem on special graph classes like split graphs, bipartite graphs, chordal graphs and planar graphs. We also describe a fixed-parameter tractable (with respect to the size of the solution subgraph as the parameter) algorithm for general graphs. Then, we study a variant of the problem where the solution subgraph is required to be a path. We call this problem Maximum Balanced Path and show that it is NP-hard even on split graphs but is fixed-parameter tractable (with respect to the size of the solution subgraph as the parameter) in general graphs.
We believe that this study is the starting point towards understanding the complexity of finding balanced substructures in graphs.
Graph Orientation Algorithms with Distance Guarantees
- Speaker: Deepu Benson (PhD Scholar)
- Date : 22 March 2022
- Venue: Zoom
- Abstract:
This presentation mainly focuses on orientations of graphs and mixed multigraphs. An orientation of an undirected graph G is an assignment of exactly one direction to each edge of $G$. The oriented radius of an undirected graph $G$ is the smallest radius among all the orientations of $G$. Let $\overline{f}(F)$ denote the maximum oriented radius of graphs in the family $F$. Let $\overline{f}(r) = \overline{f}(F_r)$ where $F_r$ denotes the family of all graphs of radius $r$. The oriented diameter of an undirected graph $G$ is the smallest diameter among all the orientations of G. Let $\overline{g}(F)$ denote the maximum oriented diameter of graphs in the family $F$. Let $\overline{g}(d) = \overline{g}(F_d)$ where $F_d$ denotes the family of all graphs of diameter $d$. A mixed multigraph is a multigraph which may contain both undirected and directed edges. An orientation of a mixed multigraph $G$ is an assignment of exactly one direction to each undirected edge of $G$. For each $r \in \mathbb{N}$, let $\overline{f}(r)$ denote the smallest number such that any strongly connected bridgeless mixed multigraph with radius at most $\overline{f}(r)$.
One of the most famous studies on strong orientations of undirected graphs was carried out by Chvátal and Thomassen [J. Comb. Theory. Ser. B., 1978]. They proved that $\forall r \in \mathbb{N}$, there exists an undirected 2-edge connected graph of radius $r$ such that every orientation of it has radius at least $r^2+r$. First, we focus on such strongly connected orientations of undirected graphs. Chvátal and Thomassen [J. Comb. Theory. Ser. B., 1978] gave a lower bound of $(1/2)d^2+d$ and an upper bound of $2d^2+2d$ for $\overline{g}(d)$. This upper bound is obtained as a corollary from the upper bound for $\overline{f}(r)$. We improved the upper bound on $\overline{g}(d)$ to $1.373d^2+6.971d-1$, which outperforms the former upper bound for all values of d greater than or equal to 8. The proof is constructive and lends itself to a polynomial time algorithm.
Since our attempts to reduce the gap between the upper and lower bounds on $\overline{g}(d)$ any further were not fruitful, we studied the bounds on $\overline{g}(d)$ for specific values of $d$. It is an easy exercise to prove that $\overline{g}(1) = 3$. Chvátal and Thomassen [J. Comb. Theory. Ser. B., 1978, 1978] proved that $\overline{g}(2) = 6$. For $d \ge 3$, the exact value of $\overline{g}(d)$ is unknown. For $\overline{g}(3)$, Kwok, Liu and West [J. Comb. Theory. Ser. B., 2010] obtained improved lower and upper bounds of 9 and 11 respectively. For $\overline{g}(4),$ the bounds provided by Chvátal and Thomassen were 12 and 40 and no better bounds were known. By extending the method we used for diameter-d graphs, along with an asymmetric extension of a technique used by Chvátal and Thomassen, we have improved this upper bound to 21. We also give an upper bound of 47 for $\overline{g}(5)$ as an easy corollary.
We then shifted our focus to the problem of finding strong orientations of mixed multigraphs. A mixed multigraph G can be oriented to a strongly connected digraph if and only if G is connected and bridgeless [Boesch and Tindell, Am. Math. Mon., 1980]. We improve the current best upper bound of $4r^2+4r$ on $\overline{f}(r)$ [Chung, Garey and Tarjan, Networks, 1985] to $1.5r^2+r+1$. En route, we also show that if each edge of $G$ lies in a cycle of length at most $\eta$, then the oriented radius of G is at most $1.5r\eta$. All these proofs are constructive and lend themselves to polynomial time algorithms.
Our study of strongly connected orientations of undirected and mixed graphs and algorithms furnishing such orientations led us to the discovery of some families of graphs that provide lower bounds to the oriented radius and diameter of undirected and mixed graphs. From this, we were able to provide a marginally better lower bound, $\overline{f}(r) \ge r^2+3r-1$, for mixed multigraphs. While this marginal improvement does not help with asymptotic estimates, it clears a natural suspicion that, like undirected graphs where $\overline{f}(r) = r^2+r$, $\overline{f}(r)$ may be equal to $r^2+r$ for mixed multigraphs. Further, we now know that the new upper bound for $\overline{f}(r)$ is tight up to a multiplicative factor of 1.5.
Finally, we discuss some existing open problems which defied all attacks for a long period of time. Chvátal and Thomassen [J. Comb. Theory. Ser. B., 1978] asked the following important question. Does there exist an orientation $G*$ of $G$ such that for every edge $uv$ that lies in a cycle of length $k$ in $G$, either arc $uv$ or arc $vu$ will lie in a directed cycle of length at most $h(k)$ in $G*$? The bound for $h(k)$ provided by Chvátal and Thomassen is exponential. It is still unknown whether $h(k)$ could be bounded by a polynomial or even a linear function of $k$.
On Complexity and Algorithms for the Eternal Vertex Cover problem (Pre-submission research seminar)
- Speaker: Veena Prabhakaran (PhD Scholar)
- Date : 15 March 2022
- Venue: Zoom
- Abstract:
The eternal vertex cover problem is a dynamic variant of the classical vertex cover problem defined in terms of an infinite attacker-defender game played on a graph. In each round of the game, the defender reconfigures a fixed number of guards from one vertex cover to another following certain rules of the game in response to an attack on an edge chosen by the attacker. The minimum number of guards to be placed on the vertices of a graph G so that any sequence of attacks on its edges can be defended this way is the eternal vertex cover number of G, denoted by evc(G). It was known from the literature that given a graph G and an integer k, checking whether evc(G) <= k is NP-hard. Further, it was known that for any graph G, mvc(G) <= evc(G) <= 2mvc(G), where mvc(G) is the vertex cover number of G.
Though a characterization of graphs for which evc(G)=2mvc(G) existed in literature, a characterization of graphs for which evc(G)=mvc(G) remained as an open problem, since 2009. We solved this problem partly by achieving such a characterization that works for a class of graphs that includes chordal graphs, internally triangulated planar graphs and locally connected graphs. This characterization yielded a polynomial time algorithm for precisely determining eternal vertex cover number of any biconnected chordal graph G and an algorithm for determining a safe strategy for guard movements in each round of the game using only evc(G) guards.
Though the eternal vertex cover problem was only known to be in PSPACE in general, it followed from our new characterization that the problem is in NP for locally connected graphs, a graph class which includes all biconnected internally triangulated planar graphs. We also provided reductions establishing NP-completeness of the problem for locally connected graphs and in particular, for biconnected internally triangulated planar graphs. As far as we know, these were the first NP-completeness results known for the problem for any graph class.
Our later attempts were to extend these algorithmic and complexity results to larger graph classes. To achieve this, it was found to be necessary to improve the lower bound on the eternal vertex cover number from the known trivial bound given by the vertex cover number. We showed that the size of the smallest vertex cover of a graph G that contains all its cut vertices is a lower bound to its eternal vertex cover number. For a class of graphs including chordal graphs, the eternal vertex cover number is shown to be at most one more than the new lower bound. As a consequence of this lower bound, we could generalize our earlier results to obtain a quadratic complexity algorithm for finding the eternal vertex cover number of any chordal graph. Another consequence was a polynomial time approximation scheme (PTAS) for finding the eternal vertex cover number of internally triangulated planar graphs, for which exact determination of eternal vertex cover number was shown to be NP-hard from our previous result.
Though the new lower bound we obtained was useful for graphs classes like chordal graphs, the bound was far from being tight for some simple graph classes like cactus graphs. To address this gap, we introduced the notion of a certain substructure property in the context of the eternal vertex cover problem and derived a new lower bounding technique for the problem based on the property. We applied the technique to obtain a linear time algorithm for computing the eternal vertex cover number of cactus graphs. We also generalized this to obtain a quadratic time algorithm for the problem for any graph whose each block is either a chordal graph or a cactus graph.
We could simplify the proof of correctness of a linear time recursive algorithm for computing the eternal vertex cover number of maximal outerplanar graphs known in literature. This algorithm works by splitting the input graph G into some relevant maximal outerplanar subgraphs and computing some parameters including the eternal vertex cover number of these subgraphs. The simplification of the proof is obtained by applying our earlier characterization to decide whether evc(H)=mvc(H) or not for such relevant maximal outerplanar subgraphs H of G.
Malware Analysis for Resource-Constrained Computing Systems (Research proposal)
- Speaker: Akshara Ravi (PhD scholar)
- Date : 04 February 2022
- Venue: Zoom
- Abstract:
Malware has been a constant threat to the community and it impacts almost all devices which are connected to the Internet like laptops, mobile phones, servers, etc. Malware detection is a well-studied research area from the past 20 years and several static and dynamic techniques were developed to deal with malware in an efficient manner. But still, the malware remains an unsolved problem and new variants of malware are emerging day by day. Hence we started our research with malware detection on Windows OS based personal computers. We have analyzed basically the Portable Executable (PE) features since they are the most popular and effective static features for detecting Windows malware. We conducted experiments to reduce the number of features required to predict the maliciousness of executables. We achieved an accuracy of 98.55% and a false positive rate of 1.58% with just 10 PE features. Due to the gaining popularity of Linux OS in recent years, we also analyzed the importance of Executable and Linkable Format (ELF) features in malware detection. Recent trend in malware attacks is to use vulnerable Internet of Things (IoT) devices as attack vectors for distributed attacks targeting critical computing infrastructures. These devices are an easy target for malware due to its limited computational resources, constant online connection, and poor security design. Our goal is to develop a novel, fast, cross-architecture, and resource-efficient malware detection methodology for resource-constrained computing systems which make use of static (and dynamic, if needed to improve the efficiency) features from the executables and implement the framework on a real IoT network. The developed system will deal with the challenges raised by the IoT devices like constrained-resources, heterogeneity of architecture etc. Since most of the IoT devices use different versions of Linux OS, we first extracted the ELF header features from Linux executables and applied the chi-square feature selection technique to reduce the number of features to 34, without impacting the overall accuracy. Compared to other state-of-the-art works, the framework achieved an accuracy around 99% with less than 0.1% false positive and false negative rate and time taken to train these models was very less.
Research Proposal Seminar of Mr. Lijo M Jose
- Speaker: Lijo M Jose (PhD scholar)
- Date : 01 February 2022
- Venue: Zoom
- Abstract:
Graph coloring usually refers to a proper vertex coloring of graphs where no two adjacent vertices are colored with the same color. The proper vertex coloring problem is a minimization problem where the challenge is to identify the minimum possible number of colors that can be used to properly vertex color a graph. Coupon coloring is another type of vertex coloring problem. A coloring of a graph using k colors is said to be a coupon coloring if the neighbourhood of each vertex has at least one vertex of each of the k colors. Here maximizing the number of colors is the challenge.Coupon coloring problem has many direct applications in network science and other disciplines. Like a resource allocation problem in a network where all nodes are to be assigned with resources in an efficient way with the constraint that each node can only access the resources allocated to it or its neighbours. In real world applications it may not be always necessary to limit the neighbourhood to open neighbourhood as a node can make use of the resource assigned to them as well. The problem we are considering is less challenging in closed neighbourhood hence we stick to the open neighbourhood. We refer to coupon coloring of a graph using k colors as a k-coupon coloring. While it is easy to achieve and verify a 1-coupon coloring, to determine whether a graph is 2-coupon colorable is a NP complete problem. Goddard and Henning conjecture that If G is a simple triangulated planar graph of order at least 4, then G has a 2-Coupon coloring. This paper mainly focuses on our study of this conjecture and attempts to settle it.
Energy Efficient Loop Acceleration on Coarse-Grained Reconfigurable Array (CGRA) Architectures (Research proposal)
- Speaker: Chilankamol Sunny (PhD scholar)
- Date : 15 January 2022
- Venue: Zoom
- Abstract:
With the increasing demand for high-performance computing in the application domains with stringent power budgets, coarse-grained reconfigurable array (CGRA) architectures have become a popular choice among researchers and manufacturers. Loops are the hot-spots of kernels running on CGRAs and hence several techniques have been devised to optimize the loop execution. However, works in this direction are predominantly software-based solutions. This work addresses the optimization opportunities at a deeper level and introduces a hardware-based loop control mechanism that can support arbitrarily nested loops up to four levels. Major contributions of this work are, a lightweight Hardware Loop Block (HLB) for CGRAs that eliminates control instruction overhead of loops and an acyclic graph transformation that removes loop branches from the application CDFG. A centralized implementation of HLB will be devised to reduce the area/power overhead. The effect of combining hardware and software-based loop optimizations will be analyzed. In addition, this work will focus on optimizing the CGRA instruction memory hierarchy to better support the loop execution. We plan to introduce a loop buffer to cache the instructions that make up the loop body so that instruction fetch/decode costs can be reduced.
Video Compression: Challenges and Opportunities for Performance Optimization
- Speaker: Dr. Swathi Gurumani, MulticoreWare
- Date : 17 December 2021
- Venue: Google meet
- Abstract:
This talk will provide an overview of video compression algorithms and the current standards that are widely used. It will highlight some of the challenges in video compression to tradeoff compression efficiency with encoding speed and the potential opportunities for performance optimization in codecs. The talk will cover examples of prior performance optimization work in x265, MulticoreWare’s open-source implementation of HEVC/H.265.