The loneliest neural network: just one neuron, but it can make "shadow clones"

Reporting by XinZhiyuan

Editor: LRS

[XinZhiyuan Introduction] Neural network models keep getting bigger, and more expensive to train. A research team at the Technical University of Berlin went the opposite way, building a "network" out of a single neuron that can emulate a multi-layer neural network, and its performance is not bad!

What is the most advanced neural network model in the world? That's definitely the human brain.

The human brain has 86 billion neurons, and the networks they form with one another not only outperform artificial neural networks but also consume astonishingly little energy.

Today's AI systems attempt to mimic the human brain by creating multi-layer neural networks, aiming to cram as many neurons as possible into as little space as possible.

Although this approach has delivered performance gains, such designs not only demand enormous amounts of electricity, but their output also pales in comparison with the human brain's.

It is estimated that training the GPT-3 neural network on Nvidia GPUs in Microsoft's data centers cost OpenAI about 190,000 kWh of electricity, equivalent to the annual consumption of 126 Danish households. Converted into the carbon dioxide produced by burning fossil fuels, that is equivalent to driving a car from the Earth to the moon and back.

And the size of neural networks, along with the amount of hardware needed to train them on huge datasets, keeps growing. Take GPT as an example: GPT-3 already has 175 billion parameters, more than 100 times as many as its predecessor GPT-2.

This "bigger is better" neural network design is clearly not in line with a sustainable scientific outlook on development.

A multidisciplinary research team from the Technical University of Berlin recently created a new type of neural "network". Calling it a network is a bit of a stretch, though, because it contains only one neuron!

The researchers propose a new method for folding a deep neural network of arbitrary size into a single neuron with multiple delayed feedback loops. This single-neuron deep neural network consists of just one nonlinearity plus appropriately adjusted feedback signals; it can fully represent standard deep neural networks (DNNs), includes sparse DNNs as a special case, and extends the DNN concept to dynamical-systems implementations.

The new model, called the folded-in-time DNN (Fit-DNN), also showed respectable performance on benchmark tasks.

Can a single tree make a forest?

A regular neural network requires multiple nodes connected to one another in space, whereas the single-neuron model spreads its connections out along the temporal dimension.

The folded-in-time method for multilayer feedforward DNNs devised by the researchers requires only a single neuron with feedback-modulated delay loops. Through a chronological sequence of nonlinear operations, a DNN of arbitrary depth or width can be realized.

In traditional neural networks such as GPT-3, each neuron carries weight values used to fine-tune the result. The usual consequence of this approach is more neurons producing more parameters, because only more parameters yield more precise results.

But the team at the Technical University of Berlin found that they could achieve similar functionality by weighting the same neuron differently at different times, rather than spatially distributing differently weighted neurons.
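
As a rough illustration, here is a minimal Python sketch of that idea, with made-up dimensions and random weights: a single nonlinear unit, reused with different weights on each pass, reproduces the forward pass of a multi-layer network.

```python
import numpy as np

def neuron(a):
    """The single nonlinear node; everything below reuses it."""
    return np.tanh(a)

rng = np.random.default_rng(0)
N, L = 4, 3                                       # nodes per layer, layers (made up)
Ws = [rng.normal(size=(N, N)) for _ in range(L)]  # time-varying weights

x = rng.normal(size=N)                            # input vector
for W in Ws:                                      # each pass plays one hidden layer
    # The same neuron evaluates every pre-activation one after another,
    # instead of N spatially separate neurons firing in parallel.
    x = np.array([neuron(w @ x) for w in W])
print(x)                                          # output of the "folded" network
```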

It's like a banquet where one person simulates the conversation around the dinner table by quickly switching seats and speaking the parts of the different guests.

It sounds a bit like a "split personality", but with this temporal expansion, one person (neuron) can accomplish what would otherwise take many.

Mentioning "fast" switching, the Berlin team said that this statement has been very low-key.

In fact, their system drives the neuron's time-based feedback loops with lasers, theoretically approaching the universe's speed limit: the network's state can be switched at or near the speed of light.

According to the researchers, this could significantly reduce the energy cost of training hyperscale neural networks.

To implement this idea, the researchers assume that the state of the system evolves in continuous time according to a differential equation of the following general form:

ẋ(t) = −x(t) + f(a(t)), with a(t) = J(t) + b(t) + Σd Md(t)·x(t−τd)

Here x(t) represents the state of the neuron at time t; f is a nonlinear function whose argument a(t) combines the data signal J(t), a time-varying bias b(t), and the delayed feedback signals x(t−τd), each modulated by a function Md(t). Multiple loops with different delay lengths τd can be included explicitly. Thanks to the feedback loops, the system becomes a so-called delay dynamical system.

Intuitively, the feedback loops in the Fit-DNN let the neuron re-ingest information that has already passed through the nonlinearity f, which allows f to be applied many times in a chain. Classical DNNs build their trainable representations by applying layer after layer of neurons; the Fit-DNN accomplishes the same thing by repeatedly feeding signals back into the same neuron.

In each pass, the time-varying bias b(t) and the modulations Md(t) on the delay lines ensure that the system's time evolution processes the information in the desired way. Obtaining the data signal J(t) and the output y requires appropriate pre- and post-processing operations.
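
To make the dynamics concrete, here is a minimal Python sketch that integrates a system of this form with a simple explicit Euler scheme. The signals, delays, and constants are illustrative placeholders, not values from the paper.

```python
import numpy as np

# Euler integration of x'(t) = -x(t) + f(J(t) + b(t) + sum_d Md(t)*x(t - tau_d))
f = np.tanh                                   # nonlinearity f
dt = 0.01                                     # integration step
steps = 5000
taus = [1.0, 2.0]                             # two delay loops tau_d (made up)
lags = [int(tau / dt) for tau in taus]        # delays in grid steps

t = np.arange(steps) * dt
J = 0.5 * np.sin(0.3 * t)                     # toy data signal J(t)
b = 0.1 * np.ones(steps)                      # toy bias b(t)
M = [0.8 * np.cos(0.2 * t),                   # toy modulations Md(t)
     0.4 * np.ones(steps)]

x = np.zeros(steps)
for n in range(1, steps):
    # Delayed feedback x(t - tau_d); treat times before t = 0 as zero history.
    fb = sum(M[d][n - 1] * (x[n - 1 - lags[d]] if n - 1 >= lags[d] else 0.0)
             for d in range(len(taus)))
    a = J[n - 1] + b[n - 1] + fb              # argument a(t)
    x[n] = x[n - 1] + dt * (-x[n - 1] + f(a)) # explicit Euler step
```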

To see that the Fit-DNN is functionally equivalent to a multilayer neural network, note that the dynamics of a single neuron with multiple delay loops can be converted into a DNN.

The temporal evolution of x(t) can be divided into intervals of length T, each of which emulates one hidden layer. Within each interval, N points are selected on an equidistant time grid with small spacing θ; for a hidden layer with N nodes, θ = T/N. Each grid point tn = nθ represents one node, and the system state x(tn) represents that node's state. The data signal J(t), bias b(t), and modulation signals Md(t) are further assumed to be step functions with step size θ.
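
A small Python sketch of this time grid (with made-up values for N, T, and the layer count) shows how each grid point maps to a node of the equivalent DNN.

```python
N = 5          # nodes per emulated hidden layer (made up)
T = 1.0        # duration of one layer interval
theta = T / N  # node spacing on the time grid
L = 3          # number of hidden layers

for layer in range(L):
    for node in range(N):
        t_n = layer * T + node * theta
        # x(t_n) plays the role of node `node` in hidden layer `layer`
        # of the equivalent DNN.
        print(f"layer {layer}, node {node}: t = {t_n:.2f}")
```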

Since the Fit-DNN is a very sparse network, the researchers first applied it to an image-denoising task: Gaussian noise with variance 1 was added to Fashion-MNIST images treated as vectors with values between 0 (white) and 1 (black). The resulting vector entries were then clipped at the thresholds 0 and 1 to obtain noisy grayscale images. The denoising task is to reconstruct the original image from its noisy version.
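
The corruption step is easy to reproduce; here is a minimal numpy sketch under the same assumptions (images already flattened and scaled to [0, 1]).

```python
import numpy as np

rng = np.random.default_rng(42)

def corrupt(img_vec):
    # Add variance-1 Gaussian noise, then cut entries at thresholds 0 and 1.
    noisy = img_vec + rng.normal(0.0, 1.0, size=img_vec.shape)
    return np.clip(noisy, 0.0, 1.0)

clean = rng.random(784)          # stand-in for a flattened 28x28 image
noisy = corrupt(clean)           # the denoiser must reconstruct `clean`
```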

The experiment compared original Fashion-MNIST images, their noisy versions, and examples of the reconstructed images. The quality of the recovered images is quite good.

But the real question for the Fit-DNN is whether a single neuron looping through time can produce the same results as billions of neurons.

To demonstrate the computational power of the Fit-DNN's time states, the researchers chose five image-classification tasks: MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN.

The experiments compared the Fit-DNN's performance on these tasks with N = 50, 100, 200, and 400 nodes per hidden layer. The results show that the single neuron achieved high accuracy on the relatively simple MNIST and Fashion-MNIST tasks, but lower accuracy on the more challenging CIFAR-10, CIFAR-100, and SVHN tasks.

While these results clearly do not match the records set by current SOTA models, they were achieved on a novel, completely different architecture. In particular, the Fit-DNN here uses only half of the available diagonals of the weight matrices. For these test tasks, increasing N visibly improves performance.

With further development, the scientists believe the system can be extended to an "infinite number" of neuron connections in the temporal dimension.

They say such a system is feasible and could surpass the human brain to become the most powerful neural network in the world, achieving what AI researchers call "superintelligence".

Resources:

https://thenextweb.com/news/how-ai-brain-with-only-one-neuron-could-surpass-humans
