On Saturday afternoon (November 16) at Supercomputing 2019, Intel unveiled a new programming model called oneAPI. Intel describes the need to tightly couple middleware and frameworks to specific hardware as one of the biggest pain points in AI and machine learning development. The oneAPI model is designed to break this tight coupling so that developers can focus on their actual project and reuse the same code as the underlying hardware changes.
This "write once, run anywhere" mantra recalls Sun's early pitches for the Java language. However, Bill Savage, general manager of compute performance at Intel, told Ars that this is not an accurate characterization. Although both approaches address the same basic problem, a tight coupling to hardware that makes developers' lives harder and hinders code reuse, the approaches themselves are very different.
When a developer writes Java code, the source is compiled to bytecode, and a Java virtual machine tailored to the local hardware executes that bytecode. Although many optimizations have improved Java's performance in the more than 20 years since its launch, in most applications it is still significantly slower than C++ code, usually running at between one half and one tenth of the speed. In contrast, oneAPI is intended to produce object code directly, with no or negligible performance penalty.
When we asked Savage about oneAPI's design and performance expectations, he clearly distanced it from Java, pointing out that no bytecode is involved. Instead, oneAPI is a set of libraries that bind hardware-agnostic API calls directly to low-level, highly optimized code that drives the hardware actually available in the local environment. Rather than "Java for Artificial Intelligence," the high-level takeaway is more like "OpenGL / DirectX for Artificial Intelligence."
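To make the "OpenGL / DirectX" comparison concrete, the following is a minimal sketch in plain C++ of how a library can bind one hardware-agnostic call to per-device implementations. It does not use the oneAPI libraries; every name in it (`Device`, `vector_add`, `kernels`, `add_reference`) is hypothetical, and a single reference routine stands in for what would be separately tuned code per device family.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical device families for illustration; not oneAPI identifiers.
enum class Device { Cpu, Gpu, Fpga };

using Vec = std::vector<float>;
using AddKernel = std::function<Vec(const Vec&, const Vec&)>;

// Stand-in for a hand-tuned, hardware-specific routine.
Vec add_reference(const Vec& a, const Vec& b) {
    if (a.size() != b.size()) throw std::invalid_argument("size mismatch");
    Vec out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// Registry mapping each supported device family to its implementation.
// A real library would register a different optimized routine per entry;
// here no FPGA backend is registered, to show the failure path.
const std::map<Device, AddKernel>& kernels() {
    static const std::map<Device, AddKernel> table = {
        {Device::Cpu, add_reference},
        {Device::Gpu, add_reference},
    };
    return table;
}

// The hardware-agnostic entry point that application code calls.
Vec vector_add(Device target, const Vec& a, const Vec& b) {
    auto it = kernels().find(target);
    if (it == kernels().end()) throw std::runtime_error("no backend for device");
    return it->second(a, b);
}
```

Application code calls `vector_add(Device::Gpu, a, b)` and never touches the backend routine itself, so swapping the hardware means changing the registry rather than the caller. The dispatch happens in native code, with no bytecode or virtual machine in between.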
For even higher-performance coding inside tight loops, oneAPI also introduces a new language variant called "Data Parallel C++", which allows even very low-level optimized code to target multiple architectures. Data Parallel C++ uses and extends SYCL, a single-source abstraction layer for OpenCL programming.
In the current release, a oneAPI developer still has to target the basic hardware type being coded for, e.g. CPUs, GPUs or FPGAs. Beyond this basic targeting, oneAPI optimizes the code for each supported hardware variant. For example, users of a project developed with oneAPI could run the same code on Nvidia's Tesla V100 or Intel's newly announced Ponte Vecchio GPU.
Ponte Vecchio is the first product in Intel's new Xe GPU line and is aimed at HPC supercomputers and data center applications. Although neither Savage nor the other Intel executives Ars spoke with offered timelines or discussed specific products, a slide from Intel's Supercomputing 2019 presentation clearly shows that Intel plans to bring the Xe architecture to the workstation, mobile, and gaming markets as well.
Savage also told Ars that while the current version of oneAPI still requires developers to target a specific architecture family (CPU, GPU, FPGA, etc.), Intel plans for a future version to allow automatic selection of the available hardware type as well.
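The difference between explicit targeting and automatic selection can be sketched in a few lines of plain C++. This is an illustration only, not Intel's implementation or the oneAPI interface: the names `DeviceType`, `select_explicit` and `select_automatic` are hypothetical, and the preference order (GPU, then FPGA, then CPU) is an assumption chosen for the example.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hypothetical architecture families; not oneAPI identifiers.
enum class DeviceType { Cpu, Gpu, Fpga };

// Returns true if `wanted` appears in the list of available device types.
bool is_available(DeviceType wanted, const std::vector<DeviceType>& available) {
    return std::find(available.begin(), available.end(), wanted) != available.end();
}

// Explicit targeting: the developer names the family and the request
// fails if that family is not present on the machine.
DeviceType select_explicit(DeviceType requested,
                           const std::vector<DeviceType>& available) {
    if (!is_available(requested, available))
        throw std::runtime_error("requested device type not available");
    return requested;
}

// Automatic selection: walk a fixed preference order and take the first
// family that is present. The order is an assumption for illustration.
DeviceType select_automatic(const std::vector<DeviceType>& available) {
    for (DeviceType candidate :
         {DeviceType::Gpu, DeviceType::Fpga, DeviceType::Cpu}) {
        if (is_available(candidate, available)) return candidate;
    }
    throw std::runtime_error("no supported device type available");
}
```

With explicit targeting the choice lives in the application code, so moving from a GPU machine to a CPU-only machine means editing the call. With automatic selection the same call works on both, at the cost of the developer giving up direct control over which family is used.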
The oneAPI toolkit is available now for use and testing on Intel's DevCloud.