Beamforming: FPGAs rise to the challenge

Military Embedded Systems

Several design approaches exist for implementing beamforming processing tasks, with options ranging from GPUs to multicore CPUs, DSPs, and FPGAs. The unique strengths of FPGAs make them an increasingly appealing choice for beamforming when compared to their counterparts.

The active electronically scanned array (AESA) architecture has transformed what radar can do and how fast it can do it. It has transformed how many threats radar can track and over what instantaneous bandwidth the tracking can be performed. It has also influenced which additional functions radar can handle. Achieving all of these capabilities requires an enormous amount of processing power, high-speed memory, and high-speed interconnects, as well as analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) with high resolution, sampling rate, and dynamic range. Adaptive beamforming, one of an AESA radar's key functions, is a complex process that requires all of the formidable capabilities that today's cutting-edge signal processing can deliver.

Adaptive beamforming, and beamforming in general for that matter, involves spatial filtering techniques. These techniques enable an antenna array to combine its elements to form a wave pattern that provides higher sensitivity in specific, desired directions. This allows control over the shape and steering of the array's directivity pattern (see Figure 1) and increases signal reception or transmission performance in the desired direction. Improved reception and transmission result from constructive (in-phase) interference of the desired propagating wave and destructive (out-of-phase) interference of waves from all other directions. Beamforming enhances performance in a specific spatial region, in both azimuth and elevation, while nulling out interference, noise, and extraneous signals, including those from jammers, in other regions.

Figure 1: The effects of beamforming
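The constructive and destructive interference described above can be illustrated with a minimal phase-shift beamformer for a uniform linear array. The sketch below is illustrative only: the 16-element array, half-wavelength spacing, 20-degree steering angle, and all function names are assumptions chosen for the example, not parameters of any particular radar.

```python
import cmath
import math

def steering_vector(num_elements, spacing_wavelengths, angle_deg):
    """Per-element phase factors for a plane wave from angle_deg (0 = broadside)."""
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase_step * n) for n in range(num_elements)]

def array_gain_db(weights, spacing_wavelengths, angle_deg):
    """Normalized power response (dB) of the weighted array toward angle_deg."""
    arrival = steering_vector(len(weights), spacing_wavelengths, angle_deg)
    # The beamformer output is the inner product w^H a. Conjugating the weights
    # cancels the arrival phase, so all elements add in phase at the steer angle.
    response = sum(w.conjugate() * a for w, a in zip(weights, arrival))
    power = abs(response) ** 2 / len(weights) ** 2
    return 10.0 * math.log10(max(power, 1e-12))

# Assumed example values (hypothetical, for illustration only).
NUM_ELEMENTS = 16
SPACING = 0.5        # element spacing in wavelengths
STEER_DEG = 20.0     # desired look direction

weights = steering_vector(NUM_ELEMENTS, SPACING, STEER_DEG)
for angle in (-40.0, 0.0, 20.0, 45.0):
    print(f"{angle:6.1f} deg: {array_gain_db(weights, SPACING, angle):7.2f} dB")
```

The response peaks at the steering angle, where the element contributions add in phase, and falls off sharply elsewhere, where they partially cancel.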

The combination of beamforming and other radar functions results in enormous signal processing requirements. These requirements will increase further as, for example, the number of elements in the array grows. Because large AESA arrays already typically have thousands of elements, beamforming can pose an exceptionally difficult processing task: it requires the lowest possible latency and a high level of accuracy, all in real time. As mentioned earlier, beamforming can be implemented using GPUs, DSPs, and multicore CPUs alone or in combination, as well as with FPGAs. As the floating-point support within FPGAs continues to improve, they have become capable of delivering performance an order of magnitude greater than what can be achieved by the other approaches.

At the lowest level, beamforming requires digital down-conversion and filtering, which in turn involve many multiplications and additions of data with configurable coefficients. Furthermore, matrix mathematics is required to process large arrays of data. Because of their highly modular architecture, FPGAs are well suited to large problems that require parallel processing. Attaching QDR SRAMs to the FPGAs yields additional processing gains: these devices permit random access for striding functions and simultaneous read and write operations, which makes matrix calculations more efficient. SRAMs also have significantly lower latency than DDR3 SDRAMs, the primary memories used with general-purpose processors.
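To make the multiply-accumulate workload concrete, the following is a minimal software sketch of digital down-conversion: mixing with a numerically generated carrier, FIR filtering with configurable coefficients, and decimation. The carrier frequency, the four-tap averaging filter, and the decimation factor are assumptions picked so the result is easy to verify by hand. An FPGA would run many such multiply-accumulate chains in parallel rather than in a loop.

```python
import cmath
import math

def digital_down_convert(samples, carrier_cycles_per_sample, taps, decimation):
    """Mix real samples to baseband, FIR filter, and keep every decimation-th output."""
    # Mixer: multiply each sample by a complex exponential at minus the carrier frequency.
    mixed = [x * cmath.exp(-2j * math.pi * carrier_cycles_per_sample * n)
             for n, x in enumerate(samples)]
    outputs = []
    # Only the filter outputs that survive decimation are computed.
    for start in range(0, len(mixed) - len(taps) + 1, decimation):
        window = mixed[start:start + len(taps)]
        acc = 0j
        # One multiply-accumulate per coefficient: y[n] = sum_k taps[k] * x[n - k].
        for coeff, value in zip(taps, reversed(window)):
            acc += coeff * value
        outputs.append(acc)
    return outputs

# Assumed example: a real tone at one quarter of the sample rate.
CARRIER = 0.25                      # cycles per sample
TAPS = [0.25, 0.25, 0.25, 0.25]     # 4-tap averager, nulls the image at 0.5 cycles/sample
signal = [math.cos(2.0 * math.pi * CARRIER * n) for n in range(64)]

baseband = digital_down_convert(signal, CARRIER, TAPS, decimation=4)
print(baseband[:3])
```

With these values the tone lands at DC with amplitude 0.5, and the mixing image at twice the carrier frequency is removed by the filter.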

With each new generation of FPGAs, the number of dedicated DSP blocks also increases. The DSP blocks within an FPGA contain hardened multiply-accumulate logic that is designed to support complex fixed-point and floating-point calculations. While FPGA processing is currently best limited to single-precision floating-point (since double-precision cannot be implemented very efficiently), single-precision floating-point is sufficient for beamforming applications. For instance, the 3600 DSP blocks that exist in current-generation AMD Virtex-7 devices can support approximately five trillion multiply-accumulate operations per second when factoring in clock frequency. These blocks play a key role when performing beamforming and other radar functions.
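A back-of-the-envelope calculation shows how block count and clock frequency combine into a peak throughput figure. The 600 MHz clock and the one-operation-per-block-per-cycle default below are illustrative assumptions, not datasheet values. Achievable rates depend on speed grade and design, and vendor peak figures may be quoted under more favorable assumptions.

```python
def peak_mac_rate(num_dsp_blocks, clock_hz, macs_per_block_per_cycle=1):
    """Upper bound on multiply-accumulate operations per second."""
    return num_dsp_blocks * clock_hz * macs_per_block_per_cycle

# Assumed clock frequency (hypothetical, for illustration only).
ASSUMED_CLOCK_HZ = 600_000_000

rate = peak_mac_rate(3600, ASSUMED_CLOCK_HZ)
print(f"{rate / 1e12:.2f} trillion MAC/s")
```

Even under these conservative assumptions, a device with 3600 DSP blocks lands in the trillions of operations per second.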

The other important area for beamforming in which FPGAs excel is I/O. Beamforming algorithms require high degrees of interconnectivity to enable the combination of data from many beams. That interconnection matrix grows as the number of beams increases. The largest FPGAs now have 36, 72, or more serializer/deserializers (SERDES), each of which is capable of running faster than 10 Gbps. When interconnecting FPGAs, low-overhead, low-latency protocols such as AMD's Aurora protocol can be used to pass data between them efficiently. These SERDES can be used to interconnect multiple FPGAs on the same card or FPGAs on different cards.

Modern AESA radars also use adaptive beamforming. With adaptive beamforming, the coefficients being used to shape the beams are changed based on the data being received. This type of processing requires a view of the whole system, and is typically done by a general-purpose processor (GPP) that better handles sequential processing. FPGAs pass summary data to the GPP to recalculate the coefficients, after which the coefficients are then fed back into the system.
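The coefficient update loop described above can be sketched in software. The article does not name a specific algorithm, so the example below assumes a minimum-variance distortionless-response (MVDR) weight calculation as a stand-in. A sample covariance matrix plays the role of the summary data, and the four-element array, the jammer angle, and the diagonal loading value are all hypothetical.

```python
import cmath
import math

def steering_vector(num_elements, angle_deg, spacing_wavelengths=0.5):
    step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]

def sample_covariance(snapshots, loading):
    """Summary data: averaged outer products x x^H plus diagonal loading."""
    n = len(snapshots[0])
    cov = [[0j] * n for _ in range(n)]
    for x in snapshots:
        for i in range(n):
            for j in range(n):
                cov[i][j] += x[i] * x[j].conjugate()
    count = len(snapshots)
    return [[cov[i][j] / count + (loading if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0j] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution

def mvdr_weights(covariance, look_direction):
    """w = R^-1 a / (a^H R^-1 a): unit gain toward a, minimum output power elsewhere."""
    r_inv_a = solve_linear(covariance, look_direction)
    norm = sum(a.conjugate() * x for a, x in zip(look_direction, r_inv_a))
    return [x / norm for x in r_inv_a]

def response(weights, direction):
    return abs(sum(w.conjugate() * d for w, d in zip(weights, direction)))

# Assumed scenario: look direction at broadside, strong jammer at 20 degrees.
NUM_ELEMENTS = 4
look = steering_vector(NUM_ELEMENTS, 0.0)
jammer = steering_vector(NUM_ELEMENTS, 20.0)
snapshots = [[10.0 * cmath.exp(0.7j * k) * v for v in jammer] for k in range(32)]

covariance = sample_covariance(snapshots, loading=1.0)
weights = mvdr_weights(covariance, look)
fixed = [a / NUM_ELEMENTS for a in look]   # non-adaptive weights for comparison
print(response(fixed, jammer), response(weights, jammer))
```

The covariance accumulation is regular, data-parallel work, while the weight solve is a small sequential calculation. That division mirrors the split between FPGA and general-purpose processor described in the text.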

For these reasons, an adaptive beamforming network constructed from multiple CPUs will be larger and heavier, consume significantly more power, require more devices, and offer limited memory and external interface options compared to an FPGA-based design. AESA radars (and soon, electronic warfare [EW] systems based on the AESA architecture) must accommodate the electrical and physical confines of smaller platforms and stringent size, weight, power, and cost (SWaP-C) demands. It is therefore not surprising that the capabilities of FPGAs are being exploited for beamforming applications.

Beamforming using CPUs is shown in Figure 2. In this admittedly simple scenario, signals captured from each element are sent to an ADC where they are converted to the digital domain. The digital data streams from each ADC are sent to processors for filtering, decoding, and pulse compression, and then to a single processor for Doppler processing. While not shown in this diagram, there are also paths between the processors to pass data to each other. The number of sensors that can be handled by each processor will be limited by the processing and I/O capability of the processor.

Figure 2: Beamforming using general-purpose processors (GPPs)

Figure 3 shows a similar simple example with processing performed by FPGAs. Again, the number of sensors that can be handled by a single FPGA will depend on the FPGA's processing capacity and the amount of I/O. However, because FPGAs can support more parallel processing and I/O than a GPP, fewer FPGAs are required, and therefore less space and power are needed.

Figure 3: Beamforming using FPGAs

Another example of a radar application in which FPGAs are increasingly making an impact is the implementation of space-time adaptive processing (STAP) algorithms (see Figure 4). STAP is a two-dimensional (space and time) filtering technique that combines spatial channels and pulse-Doppler waveforms. In dense signal environments, STAP can achieve order-of-magnitude improvements in target detection by effectively extracting and revealing signals of interest below the clutter level.

Figure 4: A parallel-pipelined STAP implementation showing data transfer between tasks

To achieve this goal, STAP must increase the signal-to-interference-plus-noise ratio (SINR) by suppressing noise, clutter, jammers, and other signals while retaining the desired radar return. Because there are often multiple potential targets, each of which requires location and velocity calculations that must be processed simultaneously in real time, intense processing power is essential.
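The two-dimensional nature of STAP can be sketched with a space-time steering vector, formed as the Kronecker product of a Doppler (pulse-to-pulse) phase ramp and a spatial (element-to-element) phase ramp. The example below is a non-adaptive matched filter under assumed values (4 elements, 8 pulses, normalized frequencies chosen for a clean result). A full STAP processor would additionally adapt the weights using an estimated interference covariance.

```python
import cmath
import math

def phase_ramp(length, cycles_per_step):
    return [cmath.exp(2j * math.pi * cycles_per_step * k) for k in range(length)]

def space_time_steering(num_elements, num_pulses, spatial_freq, doppler_freq):
    """Kronecker product of the temporal and spatial steering vectors."""
    spatial = phase_ramp(num_elements, spatial_freq)
    temporal = phase_ramp(num_pulses, doppler_freq)
    # One copy of the spatial vector per pulse, rotated by that pulse's Doppler phase.
    return [t * s for t in temporal for s in spatial]

def filter_output(weights, snapshot):
    """Space-time filter output w^H x."""
    return sum(w.conjugate() * x for w, x in zip(weights, snapshot))

# Assumed scenario: target and clutter share the same angle but differ in Doppler.
NUM_ELEMENTS, NUM_PULSES = 4, 8
target = space_time_steering(NUM_ELEMENTS, NUM_PULSES, spatial_freq=0.1, doppler_freq=0.25)
clutter = space_time_steering(NUM_ELEMENTS, NUM_PULSES, spatial_freq=0.1, doppler_freq=0.0)

weights = target  # matched to the target in both angle and Doppler
print(abs(filter_output(weights, target)), abs(filter_output(weights, clutter)))
```

Because the target and the zero-Doppler return arrive from the same angle, a purely spatial filter could not separate them. The joint space-time filter can, which is the property STAP builds on.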

The STAP requirement for intensive numeric processing, low latency, high dynamic range, and floating-point processing has historically made it practical only when high-performance computing resources were available, which has limited its use in many environments. This is unfortunate, as airborne platforms can greatly benefit from STAP. Another critical concern precluding the use of STAP has been the SWaP limitations of airborne platforms.

The processing of STAP algorithms requires Doppler filter processing, weight computation, beamforming, pulse compression, and constant false-alarm rate processing. While FPGAs feature most of the inherent attributes to make them very useful in STAP processing, issues of routing congestion and poor floating-point performance have previously limited their overall performance.
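Of the stages listed above, constant false-alarm rate (CFAR) detection is the simplest to sketch. The example below assumes the cell-averaging variant, in which each cell is compared against a threshold scaled from the average power of neighboring training cells. The window sizes, the scale factor, and the synthetic data are hypothetical.

```python
def ca_cfar(power, num_train, num_guard, scale):
    """Cell-averaging CFAR: return indices whose power exceeds scale * local noise estimate.

    num_train: training cells on each side of the cell under test
    num_guard: guard cells on each side, excluded from the noise estimate
    """
    detections = []
    half = num_train + num_guard
    for i in range(half, len(power) - half):
        leading = power[i - half:i - num_guard]
        trailing = power[i + num_guard + 1:i + half + 1]
        noise = (sum(leading) + sum(trailing)) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections

# Synthetic range profile: flat noise floor with one strong return at cell 20.
profile = [1.0] * 40
profile[20] = 30.0

hits = ca_cfar(profile, num_train=4, num_guard=1, scale=5.0)
print(hits)
```

Each cell's test depends only on a small sliding window of neighbors, so the computation maps naturally onto a pipelined hardware implementation.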

As a result, FPGAs were not competitive for STAP with approaches using GPUs or multicore CPUs. Recently, however, major FPGA manufacturers have found ways to overcome these limitations and have circumvented the problems associated with the high-performance computing resources formerly required to implement STAP. The door to using FPGAs for STAP has now opened for the first time. Combined with their aforementioned SWaP advantages for beamforming applications, today's FPGAs now allow STAP to be implemented where it never could be before.

In both beamforming and STAP applications, which are in fact directly related, OpenVPX cards such as Curtiss-Wright's 6U CHAMP-FX4 (see Figure 5) provide all of the required processing power. The CHAMP-FX4 carries three large Virtex-7 FPGAs, which are supported by both QDR and SDRAM memory, large numbers of SERDES that provide very wide bandwidths for FPGA interconnect, and both PCIe and SRIO for passing data from the FPGAs to an Intel Core i7 or a GPU via an OpenVPX backplane. These boards can scale efficiently to handle a very large number of beams and enable complex subsystems to be constructed using far less hardware while offering substantial high-speed interconnects (SERDES). In contrast, both CPU- and GPU-based solutions are limited by their reliance on PCIe. In addition, the huge amounts of data output from the ADCs in a broadband radar can only be accommodated by FPGAs, such as the CHAMP-FX4's Virtex-7s. The 6U form factor provides ample real estate for these resources.

Figure 5: The CHAMP-FX4 utilizes the board space available in the 6U form factor to provide resources for multiple beamforming tasks

FPGAs going forward

AESA radars require more processing power, bandwidth, I/O, and other resources than their predecessors. This requirement is likely to grow as their instantaneous bandwidths and element counts increase, along with the tasks they must perform in order to handle more formidable threats from electronic attack systems. Not surprisingly, beamforming, which has always been a critical element in active phased arrays, has taken on increased importance. With the greater demand for beamforming comes the need to deliver greater performance with fewer devices, consuming less power in less space. Thanks to recent advances in FPGA technology, the DSP capability of FPGAs continues to increase, and they are rapidly becoming the device of choice in current designs as well as those in development.

Boards in the 6U OpenVPX form factor are well suited for beamforming, as they can pack extraordinary amounts of the parallel processing, floating-point support, and I/O offered by FPGAs in combination with other resources to dramatically reduce the size and cost of implementing beamforming and related functions.

This article was written by Denis Smetana and was published in Military Embedded Systems