p. 1
fast fourier transforms collection editor c sidney burrus
[close]
p. 3
fast fourier transforms collection editor c sidney burrus authors c sidney burrus matteo frigo steven g johnson markus pueschel ivan selesnick online
[close]
p. 4
© 2008 c sidney burrus this selection and arrangement of content is licensed under the creative commons attribution license http creativecommons.org/licenses/by/2.0/
[close]
p. 5
table of contents 1 preface fast fourier transforms 1 2 introduction fast fourier transforms 3 3 multidimensional index mapping 5 4 polynomial description of signals 15 5 the dft as convolution or filtering 19 6 factoring the signal processing operators 27 7 winograd s short dft algorithms 31 8 dft and fft an algebraic view 47 9 the cooley-tukey fast fourier transform algorithm 57 10 the prime factor and winograd fourier transform algorithms 71 11 implementing ffts in practice 81 12 algorithms for data with restrictions 101 13 convolution algorithms 103 14 comments fast fourier transforms 113 15 conclusions fast fourier transforms 115 16 appendix 1 fft flowgraphs 117 17 appendix 2 operation counts for general length fft 121 18 appendix 3 fft computer programs 123 bibliography 155 index 178 attributions 179
[close]
p. 7
chapter 1 preface fast fourier transforms 1 this book focuses on the discrete fourier transform dft discrete convolution and particularly the fast algorithms to calculate them exciting as far as we can tell gauss was the rst to propose the techniques that we now call the fast fourier transform fft for calculating the coecients in a trigonometric expansion of an astroid s orbit in 1805 [172 however it was the seminal paper by cooley and tukey [87 in 1965 that caught the attention of the science and engineering community and in a way founded the discipline of digital signal processing dsp the impact of the cooley-tukey fft was enormous problems could be solved quickly that were not even considered a few years earlier a urry of research expanded the theory and developed excellent practical programs as well as opening new applications [93 in 1976 winograd published a short paper [399 that set a second urry of research in motion [85 this was another type of algorithm that greatly expanded the data lengths that could be transformed eciently the ground work for this algorithm had be set earlier by good [147 and by rader [305 in 1997 frigo and johnson developed a program they called the fftw fastest fourier transform in the west [129 [134 which is a composite of many of ideas in other algorithms as well as new results to give a robust very fast system for general data lengths on a variety of computer and dsp architectures this work won the 1999 wilkinson prize for numerical software it is hard to overemphasis the importance of the dft convolution and fast algorithms with a history that goes back to gauss [172 and a compilation of references on these topics that in 1995 resulted in over 2400 entries [358 the fft may be the most important numerical algorithm in science engineering and applied mathematics new theoretical results still are appearing advances in computers and hardware continually restate the basic questions and new applications open new areas for research it is hoped that this book will provide the background references programs and incentive to encourage further research and results in this area as well as provide tools for practical applications studying the fft is not only valuable in understanding a powerful tool it is also a prototype or example of how algorithms can be made ecient and how a theory can be developed to dene optimality the history of this development also gives insight to the process of research where serendipity plays an interesting role the material contained in this book has been collected over 40 years of teaching and research in dsp therefore it is dicult to attribute just where it all came from some comes from my earlier fft book [58 and much from the fft chapter in [215 certainly the interaction with people like jim cooley and charlie rader was central but the work with graduate students and undergraduates was probably the most formative i would particularly like to acknowledge ramesh agarwal howard johnson mike heideman henrik sorensen doug jones ivan selesnick haitao guo and gary sitton interaction with my colleagues tom parks hans schuessler al oppenheim and sanjit mitra has been essential over many years support has come from the nsf texas instruments and the wonderful teaching and research environment at rice these topics have been at the center of digital signal processing since its beginning and new results in hardware theory and applications continue to keep them important and 1 this content is available online at
[close]
p. 8
2 chapter 1 preface fast fourier transforms university and in the ieee signal processing society several chapters or sections are written by authors who have extensive experience working on the particular topics ivan selesnick had written several papers on the design of short ffts to be used in the prime factor algorithm pfa fft and on automatic design of these short ffts markus puschel has developed ¨ a theoretical framework for algebraic signal processing which allows a structured generation of fft programs and a system called spiral for automatically generating algorithms specically for an architicture steven johnson along with his colleague matteo frigo created developed and now maintains the powerful fftw system the fastest fourier transform in the west i would also like to thank prentice hall inc who returned the copyright on chapter 4 of in signal processing [48 around which this book is built advanced topics much of the content of this book is in the connexions http cnx.org open repository and all of it will be eventually additional fft material can be found in connexions particularly content by doug jones [203 note that this book and all the content in connexions are copyrighted under the creative commons attribution license http creativecommons.org c sidney burrus houston texas 2008/05/23 13:09:46
[close]
p. 9
chapter 2 introduction fast fourier transforms 1 the development of fast algorithms usually consists of using special properties of the algorithm of interest to remove redundant or unnecessary operations of a direct implementation because of the periodicity symmetries and orthogonality of the basis functions and the special relationship with convolution the discrete fourier transform dft has enormous capacity for improvement of its arithmetic eciency there are four main approaches to formulating ecient dft [49 algorithms dft into multiple shorter ones the rst two break a this is done in multidimensional index mapping chapter 3 by using an index map and in polynomial description of signals chapter 4 by polynomial reduction the fourth in factoring the signal the signal processing operators chapter 6 factors the dft operator matrix into sparse factors the dft as convolution or filtering chapter 5 develops a method which converts a prime-length dft into cyclic convolution still another approach is interesting where for certain cases the evaluation of the dft can be posed recursively as evaluating a dft in terms of two half-length dfts which are each in turn evaluated by a quarter-length dft and so on the very important computational complexity theorems of winograd are stated and briey discussed in the specic details and evaluations of the cooley-tukey fft and split-radix fft are given in the cooley-tukey fast fourier transform algorithm chapter 9 and the prime factor algorithm pfa and winograd fourier transform algorithm wfta are covered in the prime factor and winograd fourier transform algorithms chapter 10 a short discussion of high speed convolution is given in algorithms for data with restrictions chapter 12 both for its own importance and its theoretical connection to the dft we also present the chirp goertzel qft ntt sr-fft approx fft autogen practical programs and more the organization of the book represents the various approaches to understanding the fft and to obtaining ecient computer programs it also shows the intimate relationship between theory and implementation that can be used to real advantage the disparity in material devoted to the various approaches represent the tastes of this author not any intrinsic dierences in value a fairly long list of references is given but it is impossible to be truly complete i have referenced the work that i have used and that i am aware of the collection of computer programs is also somewhat idiosyncratic they are in matlab and fortran because that is what i have used over the years they also are written for their educational value although some are quite ecient there is excellent content in the connexions book by doug jones [204 1 this content is available online at
[close]
p. 10
4 chapter 2 introduction fast fourier transforms
[close]
p. 11
chapter 3 multidimensional index mapping 1 a powerful approach to the development of ecient algorithms is to break a large problem into multiple small ones one method for doing this with both the dft and convolution uses a linear change of index variables to map the original one dimensional problem into a multi-dimensional problem this approach provides a unied derivation of the cooley-tukey fft the prime factor algorithm pfa fft and the winograd fourier transform algorithm wfta fft it can also be applied directly to convolution to break it down into multiple short convolutions the basic denition of the discrete fourier transform dft is n -1 c k where nk x n wn n=0 3.1 n k and n are integers j -1 wn e-j2/n 3.2 and if the k 0 1 2 · · · n 1 n values of the transform are calculated from the n values of the data x n it is easily seen that n2 complex multiplications and approximately that same number of complex additions are required one method for reducing this required arithmetic is to use an index mapping a change of variables to change the one-dimensional dft into a two or higher dimensional dft this is one of the ideas behind the very ecient cooley-tukey [88 and winograd [400 algorithms the purpose of index mapping is to change a large problem into several easier ones [45 [119 this is sometimes called the divide and conquer approach [25 but a more accurate description would be organize and share which explains the process of redundancy removal or reduction 3.1 the index map for a length-n sequence the time index takes on the values n 0 1 2 n 1 when the length of the dft is not prime variables can be dened over the ranges 3.3 n can be factored as n n1 n2 and two new independent n1 0 1 2 n1 1 1 this content is available online at
[close]
p. 12
6 chapter 3 multidimensional index mapping n2 0 1 2 n2 1 3.5 a linear change of variables is dened which maps n1 and n2 to n and is expressed by 3.6 n k1 n1 k2 n2 n where ki are integers and the notation x n n denotes the integer residue of x modulo n [229 this map denes a relation between all possible combinations of 3.3 the question as to whether all of the is one-to-one two cases must be considered n1 and n2 in 3.4 and 3.5 and the values for n in in 3.3 are represented i.e whether the map is one-to-one unique has been answered in [45 showing that certain integer ki always exist such that the map in 3.6 3.1.1 case 1 n1 and n2 are relatively prime i.e the greatest common divisor n1 n2 1 the integer map of 3.6 is one-to-one if and only if k1 an2 where and/or k2 bn1 and k1 n1 k2 n2 1 3.7 a and b are integers 3.1.2 case 2 n1 and n2 are not relatively prime i.e n1 n2 1 the integer map of 3.6 is one-to-one if and only if k1 an2 or and k2 bn1 and a n1 k2 n2 1 3.8 k1 an2 and k2 bn1 and k1 n1 b n2 1 3.9 reference [45 should be consulted for the details of these conditions and examples two classes of index maps are dened from these conditions 3.1.3 type-one index map the map of 3.6 is called a type-one map when integers a and b exist such that 3.10 k1 an2 and k2 bn1 3.1.4 type-two index map the map of 3.6 is called a type-two map when when integers a and b exist such that 3.11 k1 an2 the type-one can be used or k2 bn1 n but not both only if the factors of are relatively prime but the type-two can be used whether they are relatively prime or not good [148 thomas and winograd [400 all used the type-one map in their dft algorithms cooley and tukey [88 used the type-two in their algorithms both for a xed radix n rm and a mixed radix [298
[close]
p. 13
7 the frequency index is dened by a map similar to 3.6 as k k3 k1 k4 k2 n the integers 3.12 where the same conditions 3.7 and 3.8 are used for determining the uniqueness of this map in terms of k3 and k4 two-dimensional arrays for the input data and its dft are dened using these index maps to give x n1 n2 x k1 n1 k2 n2 n x k1 k2 x k3 k1 k4 k2 n of variables applied to the denition of the dft given in 3.1 give 3.13 3.14 in some of the following equations the residue reduction notation will be omitted for clarity these changes n2 -1 n1 -1 c k n2 =0 n1 =0 kkkkx n wn 1 k3 n1 k1 wn 1 k4 n1 k2 wn 2 k3 n2 k1 wn 2 k4 n2 k2 3.15 where all of the exponents are evaluated modulo n ki can be chosen in such a way the amount of arithmetic required to calculate 3.15 is the same as in the direct calculation of 3.1 however because of the special nature of the dft the constants integers that the calculations are uncoupled and the arithmetic is reduced the requirements for this are k1 k4 n 0 and/or k2 k3 n 0 ki may 3.16 when this condition and those for uniqueness in 3.6 are applied it is found that the chosen such that one of the terms in 3.16 is zero if the make always be both ni are relatively prime it is always possible to terms zero if the ni w are not relatively prime only one of the terms can be set to zero when terms in 3.15 to become unity they are relatively prime there is a choice it is possible to either set one or both to zero this in turn causes one or both of the center two an example of the cooley-tukey radix-4 fft for a length-16 dft uses the type-two map with k1 4 k2 1 k3 1 k4 4 giving n 4n1 n2 k k1 4k2 the residue reduction in 3.6 is not needed here since values since in this example the factors of can hold and therefore 3.15 becomes 3.17 3.18 does not exceed n n as n1 and n2 take on their n 3 have a common factor only one of the conditions in 3.16 3 nnnx n w4 1 k1 w162 k1 w4 2 k2 3.19 c k1 k2 c k n2 =0 n1 =0 note the denition of k w161 k2 w4 this has the form of a two-dimensional dft with an extra term w16 called a twiddle factor [268 [145 [279 the inner sum over n1 represents four length-4 dfts the w16 term represents 16 complex multiplications and the outer sum over n2 represents another four length-4 dfts this choice of the ki uncouples the calculations since the rst sum over n1 for n2 0 calculates the dft of the rst row of the data array x n1 n2 and those data values are never needed in the succeeding row calculations the wn in 3.3 allows the simple form of row calculations are independent and examination of the outer sum shows that the column calculations are likewise independent this is illustrated in figure 3.1.
[close]
p. 14
8 chapter 3 multidimensional index mapping figure 3.1 uncoupling of the row and column calculations rectangles are data arrays the left 4-by-4 array is the mapped input data the center array has the rows transformed and the right array is the dft array the row dfts and the column dfts are independent of each other the twiddle factors tf which are the center of figure 3.1 this uncoupling feature reduces the amount of arithmetic required and allows the results of each row dft to be written back over the input data locations since that input row will not be needed again this is called in-place calculation and it results in a large memory requirement savings an example of the type-two map used when the factors of w in 3.19 are the multiplications which take place on the center array n are relatively prime is given for n 15 as n 5n1 n2 k k1 3k2 of the type-two map sets only one of the terms in 3.16 to zero the dft in 3.15 becomes 3.20 3.21 the residue reduction is again not explicitly needed although the factors 3 and 5 are relatively prime use 4 2 nnnxw3 1 k1 w152 k1 w5 2 k2 3.22 x n2 =0 n1 =0 which has the same form as 3.19 including the existence of the twiddle factors tf here the inner sum is ve length-3 dfts one for each value of 3 data arrays k1 this is illustrated in 3.2 where the rectangles are the 5 by
[close]
p. 15
9 figure 3.2 uncoupling of the row and column calculations rectangles are data arrays an alternate illustration is shown in figure 3.3 where the rectangles are the short length 3 and 5 dfts figure 3.3 uncoupling of the row and column calculations rectangles are short dfts the type-one map is illustrated next on the same length-15 example this time the situation of 3.7 with the and condition is used in 3.10 using an index map of n 5n1 3n2 and 3.23 k 10k1 6k2 the residue reduction is now necessary since the factors of 3.24 n are relatively prime and the type-one map is being used both terms in 3.16 are zero and 3.15 becomes 4 2 n x w3 1 k1 w5 2 k2 n 3.25 x n2 =0 n1 =0 which is similar to 3.22 except that now the type-one map gives a pure two-dimensional dft calculation with no tfs and the sums can be done in either order figures figure 3.2 and figure 3.3 also describe this case but now there are no twiddle factor multiplications in the center.
[close]