
Multi-Valued Neurons
The Multi-Valued Neuron (MVN) is a neuron with complex-valued weights and with inputs and output located on the unit circle. The latter means that the MVN's output depends only on the argument (phase) of its weighted sum and does not depend on its magnitude. This important property distinguishes the MVN from other complex-valued neurons and determines all its advantages. The most important of these advantages is a derivative-free learning algorithm for both a single neuron and an MVN-based feedforward neural network.
Some Historical Notes and Essentials
The term “Multi-Valued Neuron” was suggested in 1992 by Naum Aizenberg and Igor Aizenberg in their paper [1]. However, the MVN story started much earlier, when in 1971 Naum Aizenberg et al., in their paper [2], suggested a model of multiple-valued logic over the field of complex numbers and introduced the notion of a multiple-valued threshold function over the field of complex numbers. In that seminal paper they also introduced the first historically known complex-valued activation function. The main idea behind that work was to generalize Boolean threshold logic and the notion of a Boolean threshold function to the multiple-valued case. Unlike classical multiple-valued logic, where the values of k-valued logic are encoded by integers from the set K = {0, 1, ..., k-1}, in multiple-valued logic over the field of complex numbers they are encoded by the k-th roots of unity, as suggested in [2]. Thus, the values of k-valued logic are located on the unit circle:

\(\varepsilon = e^{i2\pi/k}\), the primitive k-th root of unity (i is the imaginary unit).

An important advantage of this model of multiple-valued logic over the traditional one is that all values of k-valued logic encoded by the k-th roots of unity are normalized: their absolute values are equal to 1, and they differ only by their arguments (phases). A key definition given in [2], which is the background behind the multi-valued neuron, is the definition of a multiple-valued threshold function. Let
\(E_k = \{1, \varepsilon_{k}, \varepsilon^{2}_{k}, ..., \varepsilon^{k-1}_{k}\}\),
where
\(\varepsilon_{k} = e^{i2\pi/k}\)
is the primitive k-th root of unity (i is the imaginary unit and k is some positive integer), be the set of the k-th roots of unity. Then a function
\(f(x_{1}, ..., x_{n}): E^{n}_{k}\rightarrow E_{k}\)
of k-valued logic is called a k-valued threshold function (or a threshold function of k-valued logic) if there exists a complex-valued vector
\((w_{0}, w_{1}, ..., w_{n})\)
such that for all
\((x_{1}, ..., x_{n})\)
from the domain of the function
\(f(x_{1}, ..., x_{n})\):

\(f(x_{1}, ..., x_{n}) = P(w_{0} + w_{1}x_{1} + ... + w_{n}x_{n})\),

(1)

where

\(P(z)=e^{i2\pi j/k}\), if \(2\pi j/k \le \arg z < 2\pi(j+1)/k\),

(2)

\(j=0,1,...,k-1\) are the values of k-valued logic, i is the imaginary unit, and \(\arg z\) is the argument of the complex number z. The vector
\((w_{0}, w_{1}, ..., w_{n})\)
is called a weighting vector of the threshold function f. Function (2) separates the complex plane into k equal sectors, and P(z) depends only on \(\arg z\). This is illustrated as follows:

\(P(z)=e^{i2\pi j/k}= \varepsilon^{j}\), if \(2\pi j/k \le \arg z < 2\pi(j+1)/k\)

Function P maps the complex plane onto the set of the k-th roots of unity.

Function (2), suggested in [2] in 1971, is the first historically known complex-valued activation function. In [3][4], a multi-valued threshold element, which implements a multiple-valued threshold function, and methods of its synthesis were introduced. A theory of multiple-valued threshold logic over the field of complex numbers was developed in depth in [5].
The Neuron
The discrete Multi-Valued Neuron (MVN) was introduced in [1] in 1992. It is a neuron with n complex-valued inputs located on the unit circle and a single complex-valued output, which is a k-th root of unity and is therefore also located on the unit circle. The weights can be arbitrary complex numbers (the weighted sum, accordingly, can also be an arbitrary complex number). Function (2) is the activation function of the MVN. Let O be the continuous set of points located on the unit circle. Then the discrete MVN implements an input/output mapping described by a function
\(f(x_{1}, ..., x_{n}): X^{n} \rightarrow E_{k} \),
where \(X=E_{k}\) or \(X=O\):

Multi-Valued Neuron

Hence the MVN implements mapping (1) with the activation function (2).
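
To make this concrete, here is a minimal sketch of the discrete MVN in Python (the function and variable names are ours, not from the sources): it computes the weighted sum \(z = w_{0} + w_{1}x_{1} + ... + w_{n}x_{n}\) and applies the activation function (2), returning the k-th root of unity of the sector that contains \(\arg z\).

    import cmath

    def discrete_mvn_output(weights, inputs, k):
        """Discrete MVN: weights[0] is the bias w0; inputs lie on the unit circle."""
        z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
        # Find the sector index j with 2*pi*j/k <= arg z < 2*pi*(j+1)/k
        phase = cmath.phase(z) % (2 * cmath.pi)        # map arg z into [0, 2*pi)
        j = int(phase / (2 * cmath.pi / k))
        return cmath.exp(1j * 2 * cmath.pi * j / k)    # the k-th root of unity eps^j

    # Hypothetical example: k = 4, two inputs encoded as 4th roots of unity
    # y = discrete_mvn_output([0.5 + 0.5j, 1j, -1 + 0j], [1j, -1 + 0j], k=4)
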
The main advantages of the MVN over other neurons are its higher functionality (that is, its ability to learn input/output mappings that other neurons cannot learn) and the simplicity of its learning, which is derivative-free.
The high functionality of the MVN is determined by its activation function. Let us take a look at its 3D graphical representation:

3D graph of the MVN activation function, k=16

It looks like a circular staircase, where each stair is a sector on the complex plane. The next picture illustrates why the MVN activation function ensures higher functionality of a single MVN over a neuron with a sigmoid activation function. When we train a sigmoidal neuron to produce some exact output, we have to ensure that the weighted sum takes one specific value, the only possible one. This is very difficult (and often impossible) to achieve. In contrast, when we train the MVN to produce some exact output, the weighted sum does not need to take any specific value: it only has to fall into the predefined sector corresponding to the desired output. Since this sector is infinite, the MVN is much more flexible and, respectively, much more functional than a sigmoidal neuron.

MVN vs. Sigmoidal Neuron

In [6], a continuous activation function for the MVN was introduced. The concept of the continuous MVN was then developed in [7]. Activation function (2) becomes continuous when k→∞. Indeed, if k→∞, then the angular size of a sector on the complex plane approaches 0, and function (2) turns into the projection of the weighted sum
\(z=w_{0} + w_{1}x_{1} + ... + w_{n}x_{n} \)
onto the unit circle:
\(P(z)=e^{i\arg z} = z/|z|\).

(3)

Continuous activation function (3) is illustrated as follows:

Continuous MVN Activation Function


3D Representation of the Continuous MVN Activation
Function
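
Under the same conventions as the sketch above (again our own naming, not the authors' code), the continuous activation function (3) is even simpler: the weighted sum is just projected onto the unit circle.

    def continuous_mvn_output(weights, inputs):
        """Continuous MVN: P(z) = z/|z| projects the weighted sum onto the unit circle."""
        z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
        return z / abs(z)   # assumes z != 0; (3) is undefined at the origin
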

The continuous MVN is a very suitable tool for working with continuous input/output mappings and for solving regression problems using an MVN-based neural network. There is also another interesting property: the continuous MVN has a direct analogy with a biological neuron. In fact, biological neurons communicate with each other by spike trains. The information contained in a spike train is encoded by the frequency of the spikes, while their magnitude is constant. The information contained in the signals transmitted by continuous MVNs to each other is completely contained in the phases of these signals, while their magnitude is always equal to 1. The correspondence between frequency and phase can easily be established. Let f be the frequency. As is commonly known from oscillation theory, if t is the time and φ is the phase, then
\(\varphi = \theta_{0} + 2\pi \int f\,dt = \theta_{0} + \theta(t) \).
If the frequency f is fixed for some time interval
\( \Delta t \),
then the last equation may be transformed as follows:
\(\varphi = \theta_{0} + 2\pi f \Delta t \).
Thus, if the frequency f generated by a biological neuron is known, it is very easy to transform it into the phase φ and into the complex number
\( e^{i\varphi} \)
located on the unit circle, which encodes the MVN state. The opposite is also true: given any complex number lying on the unit circle, which is the MVN's output, it is possible to transform it into a frequency. This means that all signals generated by biological neurons may be unambiguously transformed into a form acceptable to the MVN, and vice versa, preserving the physical nature of the signals.
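
As a toy illustration of this correspondence (our own sketch; \(\theta_{0} = 0\) and a fixed observation interval \(\Delta t\) are assumed for simplicity), a frequency can be mapped to an MVN state on the unit circle and back:

    import cmath

    def frequency_to_state(f, dt, theta0=0.0):
        """phi = theta0 + 2*pi*f*dt; returns e^{i*phi}, a point on the unit circle."""
        phi = theta0 + 2 * cmath.pi * f * dt
        return cmath.exp(1j * phi)

    def state_to_frequency(state, dt, theta0=0.0):
        """Inverse transform: recovers f from the phase (defined up to multiples of 1/dt)."""
        phi = cmath.phase(state) % (2 * cmath.pi)
        return (phi - theta0) / (2 * cmath.pi * dt)
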
Derivative-Free Learning Algorithm for MVN
MVN learning is derivative-free. It is not necessary to consider it as an optimization problem. Thus, neither a derivative of the error function nor a derivative of the activation function appears in the learning rule. Moreover, neither the discrete (2) nor the continuous (3) activation function is differentiable as a function of a complex variable. Intuitively, it is easy to conclude that since an MVN output is always located on the unit circle, the learning process is reduced to movement along the unit circle. Evidently, a circular movement along the unit circle can never lead in the incorrect direction. If we need to move from one point on the circle to another, we will always reach the target, even if we take a longer way instead of a shorter one. However, there exists a learning rule that guarantees taking the shortest way to the target. This is the error-correction learning rule, which is illustrated as follows:

Error-Correction Learning Rule

If D is the desired output and Y is the actual output, then the error δ is their difference, δ = D − Y. This rule is common for both the discrete and the continuous MVN. The error completely determines the adjustment of the weights, and the error-correction learning rule for the MVN is as follows:
\(W_{r+1} = W_{r} + \frac{\alpha}{n+1}\,\delta \overline{X} \),

(4)

where \(W_{r}\) is the current weighting vector, \(W_{r+1}\) is the following weighting vector (after adjustment), α is a learning rate, which should be equal to 1, X is the vector of inputs (the bar over it means that it is taken complex-conjugated), and n is the number of inputs. A learning algorithm based on the error-correction learning rule (4) is considered in detail in [8], where its convergence is proven for the discrete-valued MVN. For the continuous MVN the convergence is proven in [7]. The learning algorithm consists of consecutively checking whether, for a given learning sample, the actual output coincides with the desired one; if not, the weights are adjusted according to (4).
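
A minimal sketch of this algorithm for the discrete MVN, reusing discrete_mvn_output from the earlier sketch (the epoch limit, tolerance, and zero initialization are our own assumptions; [8] treats the algorithm and its convergence in detail):

    def mvn_learn(samples, k, alpha=1.0, max_epochs=1000, tol=1e-9):
        """Error-correction learning (4). samples: list of (inputs, desired) pairs,
        with inputs on the unit circle and desired outputs k-th roots of unity."""
        n = len(samples[0][0])
        weights = [0j] * (n + 1)                  # w0 (bias), w1, ..., wn
        for _ in range(max_epochs):
            converged = True
            for inputs, desired in samples:
                actual = discrete_mvn_output(weights, inputs, k)
                if abs(desired - actual) > tol:   # outputs differ: adjust the weights
                    converged = False
                    delta = desired - actual      # the error delta = D - Y
                    x = [1 + 0j] + list(inputs)   # x0 = 1 multiplies the bias w0
                    # Rule (4): W_{r+1} = W_r + alpha/(n+1) * delta * conj(X)
                    weights = [w + (alpha / (n + 1)) * delta * xi.conjugate()
                               for w, xi in zip(weights, x)]
            if converged:
                break
        return weights
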
The theory of multi-valued neurons, the essentials of their learning, and the theory of multiple-valued logic over the field of complex numbers are considered in detail in [8].
Multi-valued neurons have been used in various applications: in associative memories, in cellular neural networks, in feedforward neural networks, etc. There are important contributions in this area made by Dr. Jacek Zurada and his co-authors, Dr. Hiroyuki Aoki and his co-authors, Dr. Dong-Liang Lee, and other scientists.
One of the most successful applications of multi-valued neurons is a multilayer feedforward neural network based on them (the Multilayer Neural Network with Multi-Valued Neurons, MLMVN). Its idea was proposed in [6]; it was then considered in detail in [7] and further developed in [9]. This network significantly outperforms a standard backpropagation network and many kernel-based techniques in terms of learning speed and generalization capability.
REFERENCES
[1] N. N. Aizenberg and I. N. Aizenberg, "CNN Based on Multi-Valued Neuron as a Model of Associative Memory for Gray-Scale Images", Proceedings of the Second IEEE Int. Workshop on Cellular Neural Networks and their Applications, Technical University Munich, Germany, October 1992, pp. 36-41.
[2] N. N. Aizenberg, Yu. L. Ivaskiv, and D. A. Pospelov, "About one generalization of the threshold function", Doklady Akademii Nauk SSSR (The Reports of the Academy of Sciences of the USSR), vol. 196, No 6, 1971, pp. 1287-1290 (in Russian).
[3] N. N. Aizenberg, Yu. L. Ivas'kiv, D. A. Pospelov, and G. F. Khudyakov, "Multivalued threshold functions I. Boolean complex-threshold functions and their generalization", Cybernetics and Systems Analysis, vol. 7, No 4, 1971, pp. 626-635.
[4] N. N. Aizenberg, Yu. L. Ivas'kiv, D. A. Pospelov, and G. F. Khudyakov, "Multivalued threshold functions II. Synthesis of Multi-Valued Threshold Element", Cybernetics and Systems Analysis, vol. 9, No 1, 1973, pp. 61-77.
[5] N. N. Aizenberg and Yu. L. Ivaskiv, Multiple-Valued Threshold Logic, Naukova Dumka Publishing House, Kiev, 1977 (in Russian).
[6] I. Aizenberg, C. Moraga, and D. Paliy, "A Feedforward Neural Network based on Multi-Valued Neurons", in Computational Intelligence, Theory and Applications. Advances in Soft Computing, XIV, B. Reusch, Ed., Springer, Berlin, Heidelberg, New York, 2005, pp. 599-612.
[7] I. Aizenberg and C. Moraga, "Multilayer Feedforward Neural Network based on Multi-Valued Neurons and a Backpropagation Learning Algorithm", Soft Computing, vol. 11, No 2, January 2007, pp. 169-183.
[8] I. Aizenberg, N. Aizenberg, and J. Vandewalle, Multi-Valued and Universal Binary Neurons: Theory, Learning, Applications, Kluwer Academic Publishers, Boston/Dordrecht/London, 2000.
[9] I. Aizenberg, D. Paliy, J. Zurada, and J. Astola, "Blur Identification by Multilayer Neural Network based on Multi-Valued Neurons", IEEE Transactions on Neural Networks, vol. 19, No 5, May 2008, pp. 883-898.
