Turbocharge HPEC System Design with HPC Development Tools
Published in Military Embedded Systems
As parallel programming grows in importance and popularity, the critical challenge has become how to intelligently manage, develop, and debug the increasingly complex code. Traditional tools such as trace analysis, serial debuggers, and the venerable "print" statement just aren't up to the task. Although some commercial off-the-shelf (COTS) vendors and customers in the embedded-defense space have attempted to develop their own parallel programming tools, the task has proved difficult and the resulting tools are far from full-featured. What's more, using proprietary development tools can add risk to a program's cost and schedule. The good news: A better source of tools for designing cutting-edge high-performance embedded computing (HPEC) systems already exists in an adjacent market - the commercial high-performance computing (HPC) market. Sourcing proven and powerful tools from the HPC community, long supported by an expansive user base, can greatly speed delivery time while decreasing costs and program risk.
The largest cost of developing an HPEC system for aerospace and defense applications is not the hardware, but rather the software. Research consistently shows that at least 50 percent of programming time is typically spent debugging, so the right tools can make all the difference. Using a comprehensive system debugger can help slash development time, reduce schedule creep, and eliminate cost overruns. Profilers are also of great utility for HPEC system development because they help optimize and benchmark code and can support regression testing. Another important tool for HPEC development is a cluster manager, which organizes all of the nodes in the system. Taken together, debuggers, profilers, and cluster managers have become critical tools for creating a fully tested, validated, and benchmarked HPEC development environment.
The importance of these development tools has greatly increased in parallel with the availability of the latest generation of multicore central processing units (CPUs), graphics processing units (GPUs), and field-programmable gate arrays (FPGAs). As the newest generation of devices has become the building blocks of choice for demanding embedded intelligence, surveillance, and reconnaissance (ISR) systems, the process of developing and debugging these systems has also become increasingly complicated. This situation occurs because system designers tasked with taking full advantage of the processing power of next-generation devices to develop embedded HPEC "supercomputers" have found that their code must be executed in parallel across many nodes. Serial coding techniques, however fast, have proved inefficient compared to programming the multiple independent and concurrent activities required to process the complex algorithms typically used in the radar, image, and signal processing jobs that HPEC systems perform.
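The shift from serial to parallel execution can be illustrated with a minimal sketch. This toy example (all names, such as `process_bin`, are invented here and stand in for a real FFT or filter stage) splits independent chunks of sensor data across worker processes, analogous to spreading work across the nodes of a cluster:

```python
# Illustrative sketch only: a toy workload of independent "range bins"
# dispatched to a process pool, standing in for the data-parallel
# decomposition an HPEC signal-processing chain uses across nodes.
from multiprocessing import Pool

NUM_BINS = 8  # each bin represents an independent chunk of sensor data

def process_bin(bin_id: int) -> int:
    # Placeholder for a real FFT/filter stage; here just a cheap computation.
    return sum(i * i for i in range(bin_id * 1000, (bin_id + 1) * 1000))

if __name__ == "__main__":
    # Serial baseline: one bin after another on a single core.
    serial = [process_bin(b) for b in range(NUM_BINS)]

    # Parallel: the same independent bins processed concurrently.
    with Pool(processes=4) as pool:
        parallel = pool.map(process_bin, range(NUM_BINS))

    assert serial == parallel  # same answer, computed concurrently
```

Because each bin is independent, the parallel version produces identical results; the debugging challenge the article describes arises precisely when such concurrent activities interact across many nodes.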
HPC Tools to the Rescue
Over the past decade, the HPC market has developed a mature and feature-rich set of software development tools that includes math libraries, communications APIs, testing tools, and cluster managers. What makes these resources so appealing for the COTS market is the fact that the supercomputers used in commercial HPC applications, such as ultra-high-speed financial transactions and weather simulation, are built with the same hardware building blocks (processors, GPUs, and fabrics) now used in the HPEC world. For example, the University of Texas's Stampede supercomputer is built with Intel Xeon processors and NVIDIA Tesla GPUs, also fundamental components of military HPEC systems. Commercial and military supercomputer system developers face many of the same concerns, including floating-point performance, throughput, latency, and a preference for standard software APIs.
For the HPEC system developer, cluster managers, debuggers, and profilers are the software equivalent of an oscilloscope, spectrum analyzer, and logic/network analyzer. As critical as those instruments are for hardware development, these software tools are equally important for developing today's complex parallel HPEC systems.
Cluster Managers Handle System Configuration
An HPC cluster manager saves time and money by easing the setup and maintenance of the system configuration. It also eases the strain on developer resources and scheduling by providing a simple method for sharing the lab-development system. Later, during the production phase, the cluster manager helps ensure quality and customer satisfaction by enabling exact duplication of the software images. A cluster-management system provides all the tools needed to build, manage, operate, and maintain a cluster in an HPEC system.
An example of a leading cluster manager, well-proven in the HPC market, is Bright Computing’s Bright Cluster Manager. Using an intelligent cluster-management installation tool like Bright Cluster Manager for HPC enables the cluster to be installed, from bare boards to a full development system, in a matter of minutes. It can configure all of the system resources such as custom kernels, disks, and networks. Since the cluster manager supports image-based provisioning, kernel images can be maintained for different board types in the system, including GPUs, which makes adding, deleting, or moving a board to another slot as simple as a mouse-click.
Cluster managers also support developers loading their own kernel images to a single board or a combination of boards. This ability enables multiple developers to work on separate groups of processors on the system – a limited resource – simultaneously. The separate kernel images allow the developers to always start their work session at a known point, eliminating any doubt regarding the state the system was left in by a previous user. This image-based provisioning architecture can also guarantee that the same version(s) of software is loaded on all the boards, of all processing types, for full system testing and delivery, thereby eliminating the headache of reprogramming boards. The revision control also empowers the user to track changes to the software images using standardized methods, and effortlessly roll nodes back to a previous revision if needed.
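The guarantee that every board carries the same software image can be reduced to comparing image checksums across nodes. The sketch below is illustrative only (the node names and image contents are hypothetical, and a real cluster manager performs this bookkeeping internally), but it shows the underlying idea:

```python
# Hedged sketch: verifying that all boards carry the same software image
# by comparing content digests. Node names and image bytes are invented
# for illustration; a cluster manager tracks this automatically.
import hashlib

def image_digest(data: bytes) -> str:
    """Return a SHA-256 digest identifying an image revision."""
    return hashlib.sha256(data).hexdigest()

def all_nodes_match(images: dict) -> bool:
    """True if every node reports an identical image digest."""
    digests = {image_digest(blob) for blob in images.values()}
    return len(digests) == 1

# Example: three compute nodes, one left at an older revision.
images = {
    "node01": b"kernel-image-rev7",
    "node02": b"kernel-image-rev7",
    "node03": b"kernel-image-rev6",  # stale node caught before delivery
}
print(all_nodes_match(images))  # prints False: node03 needs re-provisioning
```

A mismatch flagged this way is exactly the condition image-based provisioning resolves: the stale node is simply rolled forward (or back) to the known-good image.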
The health and monitoring features – including temperature, CPU loading, and disk space – in the cluster manager provide a visual status of the entire system. It also logs and displays the boot-up messages for all of the boards and conveys any errors or warnings from the system logs. This feature removes the need to connect a terminal to serial ports to debug and configure the compute nodes, saving time that may have been spent searching for the right serial cable, a serial-to-USB adapter, or the driver for the adapter (especially if the user is on a network that can’t access the outside world). This total management of the system supports the complete life cycle, enabling a seamless transition from lab development to flight readiness through manufacturing and delivery.
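The per-node metrics described above can be sampled with very little code. This minimal sketch uses only the Python standard library; temperature sensors are platform-specific, so it covers CPU load and disk space only, and the function name `node_health` is invented for this example:

```python
# Minimal sketch of the per-node health metrics a cluster manager gathers,
# using only the standard library (Unix-only, due to os.getloadavg).
import os
import shutil

def node_health(path: str = "/") -> dict:
    """Return a snapshot of CPU load and disk usage for this node."""
    usage = shutil.disk_usage(path)
    load1, load5, load15 = os.getloadavg()
    return {
        "load_1min": load1,
        "disk_free_gb": usage.free / 1e9,
        "disk_used_pct": 100.0 * usage.used / usage.total,
    }

if __name__ == "__main__":
    for key, value in node_health().items():
        print(f"{key}: {value:.2f}")
```

A cluster manager runs an agent like this on every node, aggregates the samples, and raises the visual warnings described above when a threshold is crossed.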
Senior Product Manager
Tammy Carter is the Senior Product Manager for GPGPUs and software products, including OpenHPEC, for Curtiss-Wright Defense Solutions. In addition to an M.S. in Computer Science, she has over 20 years of experience designing, developing, and integrating real-time embedded systems in the defense, communications, and medical arenas.
Deliver Supercomputing Processing Performance
HPEC systems have a proven track record of delivering supercomputing performance in rugged, compact, deployable architectures optimized for harsh military environments. These systems consist of a large number of distributed processors, I/O, and software stacks connected by a low-latency system fabric.