We tend to think of AI as a monolithic entity, but it has actually been developed along several branches. One of the main branches performs traditional calculations but feeds the results into another layer, which takes and weighs inputs from multiple calculations before performing its own calculations and passing the results on. Another branch mimics the behavior of actual neurons: many small units that communicate in bursts of activity called spikes and keep track of the history of past activity.
Each of these branches comes in several varieties, differing in their layers and communication networks, the types of calculations performed, and so on, depending on their structure. Rather than being able to act in a way we would recognize as intelligent, many of them are very good at specialized problems, like pattern recognition or playing poker. And processors designed to accelerate the software's performance can typically only improve a subset of it.
That last division may have come to an end with the development of Tianjic by a large team of researchers based mainly in China. Tianjic is designed so that its individual processing units can switch from spiking communication back to binary communication and perform a wide range of calculations, in almost all cases faster and more efficiently than a GPU. To demonstrate the chip's capabilities, the researchers threw together a self-driving bicycle that ran three different AI algorithms on a single chip simultaneously.
Divided into two
While there are many types of AI software, the key division identified by the researchers is between what can be termed layered calculations and spiking communications. The former (which includes convolutional neural networks and deep learning algorithms) uses layers of calculating units that feed the results of their calculations to the next layer using standard binary data. Each of these units has to keep track of which other units it communicates with and how much weight to give each of its inputs.
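The layered style can be sketched in a few lines of plain Python. This is only an illustration of the general idea, not the software the researchers ran: the layer sizes, weights, and the ReLU activation are all assumptions chosen to keep the arithmetic easy to follow.

```python
# Illustrative sketch of layered calculation; all numbers are made up.

def relu(x):
    """A common activation choice (assumed here): pass positives, zero out the rest."""
    return x if x > 0.0 else 0.0

def layer_forward(inputs, weights, biases):
    """Each unit weighs every input, sums the result, and emits one number."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(relu(total))
    return outputs

def network_forward(inputs, layers):
    """Feed each layer's outputs to the next layer as ordinary numeric data."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values

# Each layer is (weights, biases); one row of weights per unit.
layers = [
    ([[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0]),  # layer 1: two units, two inputs each
    ([[1.0, 2.0]], [0.5]),                    # layer 2: one unit, two inputs
]
result = network_forward([2.0, 1.0], layers)
print(result)  # [6.5]
```

Note that each unit only needs its list of weights and its current inputs; nothing about earlier inputs is remembered between calls.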
On the other side of the divide are approaches inspired more directly by biology. These communicate in analog "spikes" of activity rather than in data. Individual units have to keep track of not only their present state but also their past, because their chance of sending a spike depends on how often they have received spikes in the past. They are also arranged in large networks, but they don't necessarily have a clean layered structure or perform the same sort of detailed calculations within a unit.
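To make the role of history concrete, here is a minimal sketch of a spiking unit. It uses a simple deterministic leaky integrate-and-fire model, which is an assumption made for illustration; the article does not say which neuron model Tianjic's software uses, and the threshold, leak, and weight values are invented.

```python
# Illustrative leaky integrate-and-fire unit; model and parameters are assumptions.

class LeakyNeuron:
    def __init__(self, threshold=1.0, leak=0.5, weight=0.625):
        self.threshold = threshold  # potential needed to fire
        self.leak = leak            # fraction of potential kept each time step
        self.weight = weight        # potential added per incoming spike
        self.potential = 0.0        # the unit's memory of past input

    def step(self, incoming_spike):
        """Advance one time step; return 1 if the unit fires, else 0."""
        self.potential *= self.leak
        if incoming_spike:
            self.potential += self.weight
        if self.potential >= self.threshold:
            self.potential = 0.0    # reset after firing
            return 1
        return 0

def run(inputs):
    neuron = LeakyNeuron()
    return [neuron.step(s) for s in inputs]

dense = run([1, 1, 1, 0, 0, 1, 1, 1])      # spikes arriving in quick succession
sparse = run([1, 0, 0, 1, 0, 0, 1, 0, 0])  # the same kind of spikes, spread out
print(dense)   # [0, 0, 1, 0, 0, 0, 0, 1]
print(sparse)  # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The same input spike produces different behavior depending on what arrived before it: closely spaced spikes build up and trigger an output, while widely spaced ones leak away.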
Both approaches have benefited from dedicated hardware, which tends to be at least as good as implementing the software on GPUs and far more energy efficient. (One example of this is IBM's TrueNorth processor.) But because of the large differences in communication and calculation between the two classes, a given processor has only been good for one type or the other.
That is what the Tianjic team has changed with what it calls the FCore architecture. FCore is designed so that the two different classes of AI can either be represented by a common underlying compute architecture or be handled by compute units that are reconfigured on the fly for one or the other.
To enable communication among its compute units, FCore uses the native language of traditional neural networks: binary. But FCore is also able to output spikes in a binary format, allowing it to communicate in terms that a neuron-based algorithm can understand. The local memory in each processing unit can be used either to track the history of spikes or as a buffer for input and output data. Some of the calculation hardware needed for neural networks is shut down and bypassed in artificial neuron mode.
In the chip
With these and a few additional features in place, each individual compute unit in an FCore can be switched between the two modes, performing either type of calculation and communication as needed. More critically, a single unit can be set into a sort of hybrid mode. That means taking inputs from one type of AI algorithm but formatting the output so that it can be understood by the other.
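One way to picture the hybrid mode is as a unit that decodes whichever input format it is given and encodes whichever output format the next stage expects. The sketch below is hypothetical throughout: the `HybridUnit` class, the mode names, the `gain` stand-in computation, and the use of simple rate coding to convert between spikes and numbers are assumptions for illustration, not a description of FCore's circuitry.

```python
# Hypothetical sketch of a dual-mode unit; rate coding is an assumed conversion.

def spikes_to_value(spike_train):
    """Rate decoding: the fraction of time steps that carry a spike."""
    return sum(spike_train) / len(spike_train)

def value_to_spikes(value, steps):
    """Rate encoding: emit spikes so that their frequency approximates `value`."""
    train, accumulator = [], 0.0
    for _ in range(steps):
        accumulator += value
        if accumulator >= 1.0:
            train.append(1)
            accumulator -= 1.0
        else:
            train.append(0)
    return train

class HybridUnit:
    def __init__(self, input_mode, output_mode, gain=1.0, steps=8):
        self.input_mode = input_mode    # "spiking" or "layered"
        self.output_mode = output_mode  # "spiking" or "layered"
        self.gain = gain                # stand-in for the unit's real computation
        self.steps = steps              # length of any spike train produced

    def process(self, data):
        value = spikes_to_value(data) if self.input_mode == "spiking" else data
        result = min(1.0, self.gain * value)
        if self.output_mode == "spiking":
            return value_to_spikes(result, self.steps)
        return result

to_layered = HybridUnit("spiking", "layered", gain=2.0)
to_spiking = HybridUnit("layered", "spiking")
number = to_layered.process([0, 0, 0, 1, 0, 0, 0, 1])
train = to_spiking.process(number)
print(number)  # 0.5
print(train)   # [0, 1, 0, 1, 0, 1, 0, 1]
```

Chaining the two units shows the point of the hybrid mode: a spiking stage can hand its output to a layered stage and back again without either side changing its own format.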
The FCore architecture was also designed to scale. The map of connections among its compute units is held in a memory area that is separate from the compute units themselves, and it is large enough to allow connections to be made outside a single chip. Thus, a single neural network could potentially be spread across multiple cores in a processor or even multiple processors.
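As a rough picture of what keeping the connection map apart from the compute units buys, consider a routing table keyed by full addresses. Everything here is hypothetical: the `Address` fields, the `RoutingTable` class, and the example numbers are invented for illustration and say nothing about how Tianjic actually encodes its routes.

```python
# Hypothetical sketch: connections live in a table, not inside the compute units.
from collections import namedtuple

# An address wide enough to name a unit on another core or another chip.
Address = namedtuple("Address", ["chip", "core", "unit"])

class RoutingTable:
    def __init__(self):
        self._routes = {}  # source Address -> list of destination Addresses

    def connect(self, source, destination):
        self._routes.setdefault(source, []).append(destination)

    def destinations(self, source):
        return list(self._routes.get(source, []))

    def crosses_chip(self, source):
        """True if any output of `source` leaves the source's own chip."""
        return any(d.chip != source.chip for d in self.destinations(source))

table = RoutingTable()
a = Address(chip=0, core=0, unit=5)
b = Address(chip=0, core=1, unit=0)
table.connect(a, Address(chip=0, core=3, unit=17))  # another core, same chip
table.connect(a, Address(chip=1, core=0, unit=2))   # a different chip entirely
table.connect(b, Address(chip=0, core=1, unit=1))   # stays inside one core

print(table.crosses_chip(a))  # True
print(table.crosses_chip(b))  # False
```

Because a compute unit in this sketch never stores where its output goes, rewiring a network across cores or chips only means editing the table.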
In fact, the Tianjic chip is made up of multiple FCores (156 of them) arranged in a 2D network. Overall, there are approximately 40,000 individual compute units on the chip, which implies that a single FCore has 256 of them. It is fabricated on a 28-nanometer process, more than double the size of the cutting-edge processes used by desktop and mobile chipmakers. Despite that, it can move more than 600GB per second internally and perform nearly 1.3 tera-ops per second when run at 300MHz.
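The figures above can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted in the article; the derived operations-per-cycle value is a back-of-the-envelope upper estimate based on the rounded "nearly 1.3 tera-ops" figure, not something the researchers report.

```python
# Back-of-the-envelope check using only the rounded figures quoted in the article.

fcores = 156
units_per_fcore = 256
total_units = fcores * units_per_fcore
print(total_units)  # 39936, i.e. "approximately 40,000"

clock_hz = 300e6          # 300 MHz
ops_per_second = 1.3e12   # "nearly 1.3 tera-ops", so treat this as an upper bound

# Derived estimate: operations the whole chip completes per clock cycle.
ops_per_cycle = ops_per_second / clock_hz
print(round(ops_per_cycle))  # 4333
```

In other words, the throughput comes from thousands of operations completing in parallel on every cycle rather than from a fast clock.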
Despite the low clock speed, Tianjic put up some impressive numbers when run against the same algorithms implemented on an NVIDIA Titan-Xp. Performance ranged from 1.6 to 100 times that of the GPU, depending on the algorithm. And once energy use was considered, the performance per watt was almost comical, ranging from 12 times to more than 10,000 times. Other dedicated AI processors have had strong performance per watt, but they could not run all of the different types of algorithms demonstrated here.
Like riding a … well, you know
On its own, that would have been an interesting paper. But the research team went on to show that Tianjic's abilities could be put to use even in its experimental form. "To demonstrate the usefulness of building a brain-like cross-paradigm system," the researchers write, "we designed an unmanned bicycle experiment by implementing several specialized networks in parallel in a Tianjic chip."
The bike detected objects via a convolutional neural network, and a continuous attractor neural network provided target tracking so the bike could follow a researcher around. Meanwhile, a spiking neural network let the bike follow voice commands. Something called a multilayer perceptron kept track of the bike's balance. And all of these inputs were coordinated by a neural state machine based on a spiking neural network.
And it worked. Although the bike was not self-driving in the sense that it was ready to take someone through the bike lanes of a major city, it was certainly good enough to be a researcher's faithful companion during a walk around a test track that included obstacles.
Overall, this is an impressive piece of work. Either the processor alone or the automated bike would have made a solid paper on its own. And the idea of getting a single chip to natively host two radically different software architectures was a bold one.
One thing worth pointing out, however, is that the researchers pitch this as a path to artificial general intelligence. In many ways, Tianjic does resemble a brain: the brain uses a single architecture (the neuron) to host a variety of different processes that, together, make sense of the world and plan actions in response to it. To a certain extent, the researchers are right that being able to run and integrate several algorithms at once is a step in that direction.
But this is still not necessarily a route to general intelligence. In our brain, specialized regions (the equivalent of an algorithm) can perform a collection of ill-defined and only vaguely related activities. And a single task, like deciding where to focus our attention, takes countless inputs. They range from our recent history to our emotional state to what we are holding in short-term memory to the biases built up over millions of years of evolution. Just being able to run multiple algorithms is still a long way from anything we would recognize as intelligence.
Nature, 2019. DOI: 10.1038/s41586-019-1424-8 (About DOIs).