Keras in a single McCulloch-Pitts neuron

Arnaldo Gunzi
Feb 6, 2017


What can a single neuron do?

Keras is cool. It is clear, concise and powerful. In this tutorial, I will build the simplest possible Keras model: a single neuron.

As part of the great Udacity self-driving car nanodegree, we work with Keras, a deep neural network package. Keras is a wrapper that runs on top of another powerful package, TensorFlow (or Theano). In the nanodegree, we use neural networks to classify traffic signs (project 2) and to predict steering angles in a simulator (project 3).

Let's go back some 70 years in time, to the beginning of it all: the McCulloch-Pitts artificial neuron.

McCulloch and Pitts

The very first notion of an artificial neuron comes from a 1943 paper by two guys: Warren McCulloch and Walter Pitts.

Imagine Walter Pitts as an adolescent little genius, born into a tough family, where his father despised school and wanted to put him to work. Imagine Pitts hiding in the public library at night, reading the Principia Mathematica of Russell and Whitehead and dreaming of understanding the world.

Walter Pitts (note the forehead)

I cannot resist the temptation. Pitts looks like the Leader, a character from the Hulk comics!

Years later, Pitts met Warren McCulloch, a much older and already respected neurophysiologist. McCulloch explained that he wanted to model the brain in a logical way: how neurons work, the analogies with Alan Turing's computer model, the Principia Mathematica, and so on. A few hours later, it was clear that Pitts was the right guy to do the mathematical formulation of the problem.

Warren McCulloch

Impressed by Pitts's brilliance, McCulloch “adopted” him. Pitts started to live in McCulloch's home, and they worked together every night, after McCulloch's family was in bed. The bright, respected older scientist and the runaway, jobless, high-school-dropout young genius.

Together, they developed the first idea of artificial neurons in the 1943 paper “A logical calculus of the ideas immanent in nervous activity”.

Their artificial neuron receives signals as input, multiplies them by weights, and compares the result to a discriminant (a threshold). If the result is greater, the output is one; if not, zero.

The McCulloch-Pitts neuron is binary, built from only a few neurons, with no backpropagation technique to fit the weights. Pitts showed that a combination of these neurons can emulate the main logic gates (OR, AND, NOR), and, by doing this, can perform the calculations a digital computer does. At that time, digital computers were still being designed by the great John von Neumann (another genius), which meant McCulloch and Pitts did all their calculations by hand.

Image from the original 1943 paper
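To make this concrete, here is a minimal Python sketch (my own, not from the paper) of a McCulloch-Pitts-style neuron emulating AND and OR gates; the weights and thresholds are illustrative choices:

#A McCulloch-Pitts-style neuron: binary inputs, fixed weights, step output
def mp_neuron(inputs, weights, threshold):
    total = sum(i * w for i, w in zip(inputs, weights))  #weighted sum of inputs
    return 1 if total >= threshold else 0  #fire only at or above the threshold

#AND gate: both inputs must fire to reach the threshold of 2
def and_gate(x1, x2):
    return mp_neuron([x1, x2], [1, 1], threshold=2)

#OR gate: a single firing input already reaches the threshold of 1
def or_gate(x1, x2):
    return mp_neuron([x1, x2], [1, 1], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))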

After this seminal work, a whole new area of knowledge began to flourish. Today, there are backpropagation, multiple layers, hundreds of thousands of neurons, several possible activation functions, dropout, convolution, transfer learning, regularization… and much, much more to come.

Original 1943 paper:

http://link.springer.com/article/10.1007%2FBF02478259

The paper cites only three references. One of them is the Principia Mathematica of Russell and Whitehead.

Just to finish the historical note: real life is not a fairy tale of rags to riches. It was more like rags to riches and back to rags again. Walter Pitts got depressed after some disappointments. He started to drink heavily, became more and more isolated from others, and died alone, in poverty. He was 46.

Source: Nautilus magazine.

http://nautil.us/issue/21/information/the-man-who-tried-to-redeem-the-world-with-logic

The single neuron Keras model

Creating a single-neuron model in Keras is as easy as described below: define the type of model (Sequential) and add to it a single layer with a single neuron. The input_shape parameter specifies the shape of the input data; in this case, a one-dimensional input.

#Model architecture
modelSimple = Sequential()
modelSimple.add(Dense(1, init='uniform', input_shape=(1,)))

This is not truly the McCulloch-Pitts model, because theirs is binary (it has a step activation), but for didactic reasons, let's begin with it. There is a lot we can learn from this minimalist model.
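Just to illustrate (this is my own sketch, not part of the project, and it assumes a Keras version where keras.backend provides greater): a step activation can be built from backend functions. Note that the step function has zero gradient almost everywhere, so such a neuron cannot be fitted by backpropagation; it only reproduces the binary output.

from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

#Hypothetical step activation: outputs 1.0 if the input is positive, else 0.0
def step(x):
    return K.cast(K.greater(x, 0.0), K.floatx())

#A binary neuron in the McCulloch-Pitts spirit (not trainable by gradient descent)
modelBinary = Sequential()
modelBinary.add(Dense(1, init='uniform', input_shape=(1,), activation=step))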

To complete the model, there are other parameters (or better, hyper-parameters) to pass in.

#Compile and fit model
LEARNING_RATE = 0.05
modelSimple.compile(optimizer=Adam(lr=LEARNING_RATE), loss='mse')
modelSimple.fit(X_train, Y_train, batch_size=1, nb_epoch=20, verbose=1, validation_split=0.2)

Again: in McCulloch and Pitts's work, there was no learning. They predefined the weights and designed each neuron for its purpose. To save time, we're working with these modern concepts here.

Suppose we want a single neuron with weight = 0.5 and bias = 0.
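If we just wanted to predefine those values, McCulloch-Pitts style, we could set them by hand instead of training. A sketch using Keras's set_weights, which takes a list of numpy arrays matching the layer's kernel and bias shapes:

#Predefine weight = 0.5 and bias = 0 by hand: no training involved
import numpy as np
modelSimple.set_weights([np.array([[0.5]]), np.array([0.0])])
print(modelSimple.get_weights())

But the interesting part is making the network learn these values from data.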

The input and output training data might look like this (each output is the corresponding input multiplied by 0.5):

X_train = [.1, .2, .3, .4]
Y_train = [.05, .1, .15, .2]

After the optimization, we'll take a look at the weights of the model.

print("Weights: \n")
print(modelSimple.get_weights())

And at the prediction, given the inputs.

print("Prediction: \n")
print(modelSimple.predict(X_train, batch_size=1))

The whole model is listed below.

#Test single neuron
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Adam, SGD

#Input data
X_train = [.1, .2, .3, .4]
Y_train = [.05, .1, .15, .2]
print("X_train ", X_train)
print("Y_train ", Y_train)

#Model architecture
modelSimple = Sequential()
modelSimple.add(Dense(1, init='uniform', input_shape=(1,)))

#Compile and fit model
LEARNING_RATE = 0.05
modelSimple.compile(optimizer=Adam(lr=LEARNING_RATE), loss='mse')
modelSimple.fit(X_train, Y_train, batch_size=1, nb_epoch=20, verbose=1, validation_split=0.2)

#Print weights
print("")
print("Weights: \n")
print(modelSimple.get_weights())

#Print prediction
print("")
print("Prediction: \n")
print(modelSimple.predict(X_train, batch_size=1))

I'm assuming here that Keras and TensorFlow are correctly installed. Please refer to https://keras.io/ and https://www.tensorflow.org/ for installation and other information.

The beauty of this single neuron model is that it is easy to understand what happens. Let's take a look.

Running the model

Let’s run it with one neuron, and just one epoch (nb_epoch=1).

I'm running it on Ubuntu 16.04, with 8 GB of memory and a GeForce 920M GPU.

The neuron converged to a weight of 0.05761143 and a bias of 0.1081204 after one epoch. For the first input, 0.1, it gives a forecast of 0.05761143*0.1 + 0.1081204 = 0.11388154.

The reader can check the other input values as an exercise; they give exactly the array shown in “Prediction”.
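To check this by hand (a small sketch; the weight and bias values come from the run above and will differ with another random initialization):

#Manual forecast with the fitted weights from the run above
import numpy as np
w, b = 0.05761143, 0.1081204
X = np.array([.1, .2, .3, .4])
print(X * w + b)  #matches the output of modelSimple.predict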

However, this is very far from what we expected (weight 0.5 and bias 0). So, let's increase the number of epochs to 20.

Now the loss is much smaller, and the weight (0.46188787) and bias (0.00758998) are very close to the expected values. The prediction is close to the training targets.

With this single neuron model, we can easily test (and, more importantly, understand) how the parameters affect the model: activations, learning rates, number of epochs, and so on.
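For example, a quick experiment (my own sketch, reusing the imports and data from the listing above) sweeping the learning rate with everything else fixed:

#How does the learning rate affect the fitted weights?
for lr in [0.01, 0.05, 0.25]:
    m = Sequential()
    m.add(Dense(1, init='uniform', input_shape=(1,)))
    m.compile(optimizer=Adam(lr=lr), loss='mse')
    m.fit(X_train, Y_train, batch_size=1, nb_epoch=20, verbose=0)
    print("Learning rate", lr, "->", m.get_weights())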

Once the concepts behind this single neuron are understood, we can increase the number of neurons and layers, and build complex architectures on top of these simple ideas.

This model is shared on GitHub, at the link below.

Conclusion

Walter Pitts worked hidden in the library at night, with only paper and pen. I wonder what he could have done with a digital computer, TensorFlow, Keras, Ubuntu 16.04, 8 GB of memory, a GeForce 920M GPU…

A whole world was built on top of the McCulloch-Pitts neuron, and a much bigger one is still being built. The world is great!

Other writings: https://medium.com/@arnaldogunzi

Main blog: https://ideiasesquecidas.com/
