Available Online at www.ijcsmc.com # **International Journal of Computer Science and Mobile Computing** A Monthly Journal of Computer Science and Information Technology ISSN 2320-088X IJCSMC, Vol. 3, Issue. 3, March 2014, pg.438 – 444 ## RESEARCH ARTICLE # DESIGN OF MULTIPLE CONSTANT MULTIPLICATION ALGORITHM FOR FIR FILTER T. Sandhya Pridhini\*, Diana Aloshius, Aarthi Avanthiga, Rubesh Anand Department of Electronics and Communication Engineering, Hindustan University, Chennai, Tamilnadu pridhinithamilarasan@gmail.com \* Abstract---Design of low power systems has become a significant performance goal in the present world. A fast and energy-efficient multiplier is required in electronics industry especially in Digital Signal Processing, image processing and arithmetic units in microprocessors. Multiplier is an important element which contributes substantially to the total power consumption of the system. In VLSI design, the designers have more constraints which include less silicon area, high speed and minimal power consumption. The Aim of this research is to design a low cost finite impulse response filter using the concept of faithfully rounded truncated multipliers. The optimization of bit width and the hardware resources are done with good accuracy. In direct FIR filter the multiple constant multiplication are implemented using the improved version of truncated multipliers. Key Words --- FIR filter; multiple constant multiplication; truncated multiplier; compressors #### 1. INTRODUCTION The Finite impulse response filter is an important component for designing an efficient digital signal processing system. In this paper FIR filters are constructed, which consumes less power and area. Adders are the main building block for the construction of FIR filter. Addition is one of the fundamental arithmetic operations, used extensively in many VLSI systems such as microprocessors and application specific DSP architectures. The complexity of FIR filter is dominated by the number of adders and subtractors which are used to implement the co-efficient multipliers. To get optimized solutions, the original co-efficient are multiplied by constant value. This can be done by using multiplier block called Multiple Constant Multiplication (MCM) followed by accumulation of all products. There are two FIR structures namely, direct form and transposed form. Direct form needs extra pipeline registers between the adders to reduce the delay of the adder and to achieve high throughput which is shown in fig 1. Transposed form does not require pipeline registers so the delay will be high. Fig.1 FIR filter – Direct form [4] The hardware implementation of digital filters is divided into multiplierless based and memory based in order to reduce the cost. Multiplierless MCM based FIR filter based on the transposed form consumes large power and area when compared to the direct form. Optimizing the bit width of the filter coefficients has a direct impact on the reduction in the size of the registers. After multiplication, the bit width grows and in order to obtain the desired output, it is necessary round the outputs. The total error occurring in the quantization and rounding should not be more than 1ulp.In brief, the partial product bits are accumulated for realizing the MCM/A module and the precised output is formed by removing unnecessary partial product bits (PPB). Design and the implementation of the FIR filter is classified into 3 stages namely finding the filter order and filter coefficients, quantization of the coefficients and optimization of hardware resources. In the first stage to satisfy the specification of frequency responses the filter coefficient and filter order are determined. In the stage second stage to attain finite bit accuracy the coefficients are quantized. Finally to minimize the area cost of hardware optimization approaches is used. Fig 2 shows the stages of design and implementation of FIR filter. Fig.2 Stages of digital FIR filter design and implementation ## 2. MULTIPLICATION USING TRUNCATED MULTIPLIER In many computer systems, the parallel multipliers produce 2n bits which are rounded to n bits in order to avoid growth in word size. An efficient method for reducing the hardware requirements for the rounded parallel multiplier is truncated multipliers. Truncated multipliers give significance improvements in area, power and delay The partial products bits are truncated to get the precised form to reduce area and cost .If there is any truncation error, compensation circuits can be used to reduce the truncation error. In this method, it jointly considers the tree reduction, truncation and rounding of the partial product bits. There is no need of compensation circuits because the truncation error is not more than 1ulp and the final output will be precised. The fig 3 describes the improved version of the faithfully rounded truncated multiplier. In this method only the deletion and truncation takes place to eliminate the partial product bits. The range of the deletion error is two times larger than the previous results. The grey dot represents the deleted bits while the green dots represent the truncated bits. The Wallace tree multiplier is considered as faster than parallel multiplier and it is efficient implementation of a digital circuit which multiplies two integers. To reduce the latency a Wallace tree multiplier uses carry save addition algorithms. Basically Wallace tree multiplies two unsigned integers. A multiplier is divided into three stages:-the first stage is partial product generation where the multiplicand and multiplier are multiplied by bit wise to generate partial products. The second stage is the partial product reduction which is more complicated and the final stage is the carry propagation stage using different compressors employed in high speed multipliers to reduce the latency of the accumulation stage. Fig.3 Improved version of truncated multiplier [4] # 3. MULTIPLICATION USING COMPRESSION TECHNIQUE Multiplication using improved version of the truncated multipliers consumes more area and power. In order to reduce power and area, in this paper 5:2 compression technique is used which gives good accuracy with less power consumption. During multiplication process the partial products are accumulated by the basic building block called compressor. The multiplier architecture consists of partial product generation, partial product reduction and partial product accumulation. By decreasing the number of adders the latency of the Wallace tree multiplier can be reduced in the partial product reduction stage. The full adders and half adders are replaced by the different compressors which speeds up the summation in general and multiplication in particular. ## 3.1. 5:2 compressor The basic idea of a n:2 compressor is that the n operands can be reduced by 2. The architecture of 5:2 compressors comprises of two serially connected 3:2 compressor as shown in the fig.4. The A, B, C, Sum1 and Carry1 are the inputs and outputs of the first compressor. The sum1, D, E Sum2 and Carry2 are the inputs and outputs of the second compressor shown in fig 5. Fig 5.Serially connected 3:2 compressor Consider an example shown in fig.6,in the given 8x8 multiplication the first 8 bit input is multiplied by 8 bit multiplier. If each bit is multiplied by another 8 bit totally 64 bits obtained . These bits are than added by using compressors which reduces the latency and increases the speed. The sum output of the intermediate compressor is fed as input to the next compressor of the same column and generated carry of the corresponding adders are propagated to the next column . Finally a 16 bit result will be obtained. The obtained partial products are than optimized by using the several 5:2 compressors as shown in fig.7. Fig. 6 8x8 multiplication with their partial product bits. Fig.7 Reducing of partial products bits using optimized 5:2 compressors. ## 4. SYNTHESIS REPORT Synthesis report shown in fig 8 gives the complete details of device and total memory utilization for multiplication of two 8 bit data using 5:2 compression techniques. Using 5:2 compressor the number of gates is reduced to 2136 compared to the improved version of truncated multiplier. Fig 8 shows the synthesis report of area in 5:2 compressor. The table 1 gives the comparative study on area utilization and power consumption of the improved version of the truncated multiplier and the 5:2 compressors. Table 1. Comparison of the area and power of the improved version of the faithfully rounded truncated multiplier and the 5:2 compressor. | PARAMETERS | IMPROVED VERSION OF THE TRUNCATED MULTIPLIER | 5:2<br>COMPRESSOR | |------------------------------------------|----------------------------------------------|-------------------| | NO OF GATE COUNTS (AREA) mm <sup>2</sup> | 2203 | 2136 | | POWER (mW) | 56 | 48 | ## 5. CONCLUSION In this research, a linear phase direct form FIR filter is designed by considering the optimization of bit widths and the hardware resources. The proposed method reduces the number of stages in partial product reduction which leads to reduction in silicon area and power consumption. The comparison results shows that a significant reduction in area and power. The results prove that the proposed method is more conventional one in terms of area and power. #### 6. REFERENCES - [1]Bickerstaff, K. C., Schulte, M. and Swartzlander, E.E., Reduced Area Multipliers, in *Proc. Int. Conf. Appl.-Specific Array Processors*, pp.478–489, 1993. - [2] Chang, C.H Chen, J. and Vinod, A.P., Information theoretic approach to complexity reduction of FIR filter design, *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 8, pp. 2310–2321, Sep. 2008. - [3] Gustafsson, O., Lower bounds for constant multiplication problems, *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 54, no.11, pp. 974–978, Nov. 2007. - [4]Ko, H.-J. and Hsiao, S.-F., Design and application of faithfully rounded and truncated multipliers with combined deletion, reduction, truncation, and rounding, *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 58, no. 5, pp. 304 308, May 2011. - [5]Lin,Y.C. and Parker, S., Discrete coefficient FIR digital filter design based upon an LMS criteria, *IEEE Trans. Circuits Syst.*, vol. 30, no. 10, pp. 723–739, Oct. 1983. - [6] Meher, P.K., New approach to look-up-table design and memory-based realization of FIR digital filter, *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 3, pp. 592–603, Mar. 2010. - [7]Park, I.C., and Kang, H.J., Digital filter synthesis based on an algorithm to generate all minimal signed digit representations, *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 21, no. 12, pp. 1525–1529, Dec. 2002. [8] Yao, C.Y., Chen, H.-H., Lin, T.-F., Chien, C.-J. J., and Hsu, X.-T., A novel Common subexpression elimination method for synthesizing fixed-point FIR filters, *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 51, no. 11, pp. 2215–2221, Sep. 2004. [9]Yu, Y. J., and Lim, Y.C., Design of linear phase FIR filters in subexpression space using mixed integer linear programming, *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 10, pp. 2330–2338, Oct. 2007.