
NationalChip releases GX8010, an AI chip for the Internet of Things

Release time: 2017-11-01


Artificial intelligence technology is developing rapidly and has made breakthroughs in speech recognition and image recognition. We have long looked forward to bringing the fruits of these technologies to everyday objects as soon as possible, so that speakers, toys, household appliances and other products in daily life can talk.

 

Yesterday, NationalChip held a product launch in Shenzhen and released its new chip, the GX8010, an IoT artificial intelligence chip equipped with an NPU that is intended to empower all kinds of IoT products.

 

Designed around the characteristics of artificial intelligence and the Internet of Things, GX8010 deeply integrates algorithms, software and hardware, and adopts the latest technologies such as NPU and DSP. The aim of this new AI chip is to make all kinds of everyday products intelligent, so that they can "see", "listen" and "speak". What is inside the AI chip? How does it differ from a traditional chip? What are the difficulties and pain points of deploying artificial intelligence on the Internet of Things, and how does GX8010 solve them? Let's take a look inside this AI chip together.


Solving pain point 1: Neural network computation is difficult to run locally

Deep learning has made great breakthroughs in various fields, but it also demands much more computing power from the processor. Traditional chips mostly rely on CPUs, but as neural networks grow larger, the heavy computation and high throughput they require overwhelm the CPU. From CPU to DSP and GPU, processor performance has improved continuously, and finally the NPU, a processor dedicated to artificial intelligence, was born. GX8010 embeds gxNPU, a neural network processor developed in-house by NationalChip. It is customized for AI, accelerates neural networks, and addresses the low efficiency of traditional chips in neural network operations.


The neural network processor gxNPU is tailored to the IoT and supports current mainstream models such as DNN, CNN and LSTM. The network structure can be freely designed and extended, and the operation units can be customized to the needs of the algorithm. In terms of data format, the NPU supports both fixed-point and floating-point operations, which makes it convenient to use.


To cope with the limited memory bandwidth typical of IoT devices, NationalChip designed a neural network compression engine, NCompressor. It exploits the data sparsity in the neural network to compress the weights, and achieves 6 to 10 times compression without affecting precision. Once the network is compressed, the required memory capacity and bandwidth are greatly reduced, and operation speed also improves. For compression, NationalChip also provides a compiler tool. It quantizes and compresses the model with one click, and the hardware engine in the chip then decompresses it, with no retraining or extra processing required.
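
As a rough illustration of how weight sparsity turns into storage savings, the sketch below stores only the non-zero weights together with their positions. It is not NCompressor's actual algorithm, which the article does not describe; the function names, the 90% sparsity of the example layer, and the 32-bit value and 16-bit index widths are all assumptions made for illustration.

```python
def compress_sparse(weights):
    """Store only non-zero weights as (index, value) pairs."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


def decompress_sparse(pairs, length):
    """Rebuild the dense weight list; positions not stored are zero."""
    dense = [0.0] * length
    for i, w in pairs:
        dense[i] = w
    return dense


def compression_ratio(length, pairs, value_bits=32, index_bits=16):
    """Dense size divided by sparse size, both in bits (assumed bit widths)."""
    dense_bits = length * value_bits
    sparse_bits = len(pairs) * (value_bits + index_bits)
    return dense_bits / sparse_bits


# Hypothetical layer: 1000 weights, of which 90% are exactly zero.
weights = [0.0] * 1000
for i in range(0, 1000, 10):
    weights[i] = 0.5

pairs = compress_sparse(weights)
restored = decompress_sparse(pairs, len(weights))
ratio = compression_ratio(len(weights), pairs)
print(restored == weights)       # True: this encoding is lossless
print(f"{ratio:.2f}x smaller")   # 6.67x smaller
```

Under these assumed numbers the ratio happens to fall in the 6 to 10 times range quoted above, but the real figure depends on the sparsity of the model and on the encoding the hardware uses.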


Together with the compiling and compressing tools, NationalChip also released a full neural network development SDK. Deploying a model from server to chip takes only three steps. The first step is to train the model on TensorFlow or another platform and generate the model's net table file. The second step is to compile and compress it with gxNPUC (the neural network compiler) to generate a binary instruction file. Finally, the gxDNN acceleration library is used on the chip to run the compiled model locally.


Considering the cost and power constraints of the Internet of Things, this generation of gxNPU does not use a very large MAC array, but a 64x64 configuration. Even so, in performance evaluations of typical applications, gxNPU at 200 MHz is nearly 30 times faster than the multi-core 1 GHz CPU in the RPi, and its energy efficiency is more than 100 times higher.
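
The theoretical peak throughput implied by these figures can be estimated with simple arithmetic. The sketch below assumes that every unit in the 64x64 array completes one multiply-accumulate per clock cycle, which the article does not state; the function name is hypothetical, and real workloads will run below this peak.

```python
def peak_macs_per_second(rows, cols, clock_hz, macs_per_unit_per_cycle=1):
    """Upper bound on multiply-accumulates per second for a MAC array."""
    return rows * cols * macs_per_unit_per_cycle * clock_hz


# 64x64 array at 200 MHz, as quoted in the article.
peak = peak_macs_per_second(64, 64, 200e6)
print(f"{peak / 1e9:.1f} GMAC/s")     # 819.2 GMAC/s
# Counting each MAC as two operations (one multiply, one add):
print(f"{2 * peak / 1e12:.2f} TOPS")  # 1.64 TOPS
```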


So the question is: how does this NPU differ from the Google TPU and the NPU in the Huawei Kirin 970 chip? Google's TPU is a processor for servers, which cares more about raw computing power and is less sensitive to cost and power. By comparison, gxNPU is designed for IoT: it adds a neural network compression engine, needs less memory and bandwidth, and consumes less power when computing, which makes it better suited to deployment in all kinds of Internet of Things products. Huawei's NPU is designed for mobile scenarios; because little public information is available, a comparison is difficult at present.

 

Solving pain point 2: The AI interactive system is complicated and costly

 

For AI chips to be deployed in real products, an NPU alone is far from enough. AI interaction as a whole is a very complex process. Besides neural network computing, it includes sensor access, signal processing, detection and recognition, as well as decision-making and feedback at the software level. There are many stages, and each requires different algorithms and has different computing characteristics. In response, NationalChip has put forward a strategy of "comprehensive integration and full stack".


Taking intelligent voice interaction as an example, the biggest challenge in speech recognition today still lies in front-end noise reduction. To separate effective speech from noise, the industry has introduced microphone arrays, which use spatial information for noise reduction and filtering. Multiple microphones first require hardware interfaces; some traditional chips do not have enough of them and can only be extended through additional devices. At the same time, multichannel input greatly increases the computation required for front-end speech processing. Handling this with a software solution on the CPU of a traditional chip is very difficult.
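
To make the idea of using spatial information concrete, here is a minimal delay-and-sum sketch, one of the simplest microphone-array techniques. It is not the algorithm used by NationalChip or its partners, which the article does not describe. The arrival delays are assumed to be known whole-sample values, the sine wave is a stand-in for speech, and the noise on each microphone is assumed independent; all names are hypothetical.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its known integer sample delay, then average."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(usable)
    ]


def mse(a, b):
    """Mean squared error over the overlapping part of two signals."""
    n = min(len(a), len(b))
    return sum((x - y) ** 2 for x, y in zip(a[:n], b[:n])) / n


random.seed(0)
NUM_MICS = 8                     # matches the 8 microphone channels in the article
delays = list(range(NUM_MICS))   # assumed arrival delay per microphone, in samples
speech = [math.sin(2 * math.pi * 5 * t / 2000) for t in range(2000)]

# Each microphone hears the delayed signal plus its own independent noise.
channels = []
for d in delays:
    channel = []
    for t in range(len(speech)):
        clean = speech[t - d] if t >= d else 0.0
        channel.append(clean + random.gauss(0.0, 0.5))
    channels.append(channel)

enhanced = delay_and_sum(channels, delays)
mse_single = mse(channels[0], speech)   # error of one raw microphone
mse_array = mse(enhanced, speech)       # error after combining all eight
print(mse_array < mse_single)           # True
```

Because the speech adds coherently after alignment while the independent noise averages out, the combined signal is cleaner than any single microphone. The per-sample work also grows with the number of channels, which is the computational burden described above.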


Notably, GX8010 integrates Cadence Tensilica's highest-end voice DSP, the HiFi 4, which is designed for intelligent speech and can efficiently perform a wide range of speech signal processing calculations. The GX8010 chip also supports an 8-channel microphone interface: besides PDM and I2S digital interfaces, it has a built-in 8-channel ADC that directly supports analog microphones, a first in the industry. On this DSP, NationalChip is working with Rokid, AISpeech and other leading voice algorithm companies to port their algorithms and build joint voice solutions.

 

In addition to the voice system, GX8010 includes a vision system with modules such as 1080P camera input, image preprocessing and MJPEG encoding. After signal processing, voice and image data are sent to the central decision and application system for business and application processing.

 

The whole chip adopts a multi-core heterogeneous architecture that integrates several processors, including the gxNPU, an ARM Cortex-A7 CPU and the HiFi 4 DSP. The DSP handles speech signal processing and enhancement, the NPU handles deep learning computation, and the CPU handles software operation and application decision control. Integrated on a single chip, these modules form a complete AI processing system, which makes GX8010 a true AI SoC.


In addition, GX8010 integrates DRAM directly in the package through SiP. The level of integration is therefore very high, few peripheral components are needed, and the BOM cost of the whole product is greatly reduced. This makes the chip very competitive, especially in voice applications such as smart speakers and voice interaction modules.

 

Solving pain point 3: Power consumption is too high

 

A major difficulty in IoT applications is that products are small, scenarios vary widely, and battery power is often required, all of which place stricter demands on power consumption. GX8010 also offers a solution to this problem.

 

For dynamic power consumption, GX8010 makes full use of its multi-core heterogeneous architecture and carefully schedules the working frequency and the start and stop times of each module, so that modules run on demand and stop when idle. In typical speech interaction, GX8010 needs only 100-200 MHz to complete offline speech recognition, the DSP runs at 300-400 MHz to perform multi-microphone array processing, and the CPU adjusts dynamically according to system load. This scheme lets the chip operate efficiently while keeping power consumption very low. According to tests, the power consumption of GX8010 can be less than 0.7 W (including DRAM) at full speed in offline voice interaction scenarios.

 

Standby is another difficulty for voice interactive devices. Because the system must still be able to be woken by voice while on standby, a series of operations such as voice capture, noise reduction and wake-word recognition must keep running during standby.

 

GX8010 introduces a multi-level wake-up mechanism, divided into levels according to whether there is sound, whether it is a human voice, and whether it is a keyword. In standby mode, GX8010's new VAD (Voice Activity Detection) technology detects whether the microphone has voice input. Once voice is received, the DSP starts noise reduction, and then the NPU runs wake-word recognition; if the keyword is detected, the whole application system is activated.
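
The staged wake-up described above can be sketched as a small state machine. This is only an illustration of the control flow, not NationalChip firmware: the stage names, the `next_stage` function, and the assumption that each failed check drops the system straight back to the lowest-power stage are all hypothetical.

```python
from enum import Enum


class Stage(Enum):
    VAD = 1     # standby: only voice activity detection is running
    DSP = 2     # sound detected: DSP noise reduction is running
    NPU = 3     # human voice detected: NPU wake-word recognition is running
    AWAKE = 4   # wake word detected: whole application system is active


def next_stage(stage, sound_detected, is_human_voice, is_wake_word):
    """Advance one level when the current check passes, else fall back to VAD."""
    if stage is Stage.VAD:
        return Stage.DSP if sound_detected else Stage.VAD
    if stage is Stage.DSP:
        return Stage.NPU if is_human_voice else Stage.VAD
    if stage is Stage.NPU:
        return Stage.AWAKE if is_wake_word else Stage.VAD
    return Stage.AWAKE


def run(frames):
    """Feed (sound, human_voice, wake_word) observations through the stages."""
    stage = Stage.VAD
    for frame in frames:
        stage = next_stage(stage, *frame)
    return stage


print(run([(True, True, True)] * 3).name)    # AWAKE
print(run([(True, False, False)] * 2).name)  # VAD
```

The point of the structure is that the more power-hungry blocks are switched on only after a cheaper check has passed, so background noise alone never activates the full system.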


This step-by-step wake-up mechanism ensures that the voice assistant can respond to instructions in real time while extending the device's battery life. According to tests, GX8010 can support voice wake-up at a standby power of 0.05 W. This is far lower than other chips currently on the market, and makes long standby times easy to achieve.


With local offline neural network computing, very high integration and low power consumption, the GX8010 chip is expected to show its strength in many applications. Key applications include smart speakers, voice interfaces and smart toys.

 

 

》》A battery-powered, offline smart speaker solution

 

Most smart speakers on the market rely on mains power because of their power consumption. With the GX8010 solution, its low power and standby characteristics come into play, and a battery-powered device can easily stay on standby for several days. GX8010's offline capability also keeps the device usable without a network connection. Finally, because GX8010 integrates so many modules, it has a clear cost advantage, which helps reduce product cost and expand the market further.
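
The claim of several days on standby can be checked with a back-of-the-envelope estimate. The sketch below uses the 0.05 W standby figure quoted earlier in the article; the battery (3.7 V, 2000 mAh) is an assumed example, not a figure from NationalChip, and the estimate ignores regulator losses and every other component in the product.

```python
def standby_hours(battery_mah, battery_volts, standby_watts):
    """Ideal standby time: battery energy (Wh) divided by standby power (W)."""
    energy_wh = battery_mah / 1000 * battery_volts
    return energy_wh / standby_watts


# Assumed 2000 mAh, 3.7 V battery; 0.05 W standby power from the article.
hours = standby_hours(2000, 3.7, 0.05)
print(f"{hours:.0f} h, about {hours / 24:.1f} days")  # 148 h, about 6.2 days
```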


》》A front-end solution that gives everything a voice interface

 

Many products such as TVs, set-top boxes and home appliances already have mature hardware and software systems, but their makers still want an intelligent upgrade, especially voice interaction. Based on GX8010, NationalChip has launched a cut-down version, GX8008, aimed specifically at the voice front-end market. It lets traditional devices keep their original hardware and add voice capabilities through a simple USB port. In standby, the host can sleep completely, relying on GX8008 alone for noise reduction and wake-word detection and to wake the system. Earlier TVs and set-top boxes could not solve the problem of low-power standby with voice monitoring.


》》A smart toy solution with both voice and vision

 

With its voice and vision interfaces, GX8010 should perform well in the smart toy and preschool education markets. Thanks to its offline, low-power characteristics, smart toys will no longer be tied to Wi-Fi and will be able to play with children outdoors in the future.


High intelligence, low power and full integration are the main features of GX8010, which is expected to bring new changes to the Internet of Things. The latest artificial intelligence algorithms and computations now have the opportunity to be deployed in embedded devices. Speakers, home appliances, toys, vehicles and other products will become more intelligent, richer in function and better in experience. The age of intelligent everything is no longer far away. Let us look forward to what NationalChip's GX8010 does next!
