Screencast about Deep Learning and IBM Watson is online!

Just a short post to inform you that I labored to produce:

This is a screencast video, meaning I talk a lot while filming a presentation, about some AI-related topics. I tell a bit about my background, how I came in contact with AI research and where I see the connections between AI, neuroscience and psychology. Then I explain why AlphaGo was such a breakthrough from the perspective of a psychologist, what went wrong with Microsoft's Tay, why AI also needs Artificial Emotions and how models of emotions can be implemented. Next is HRL Labs' revolutionary approach to pilot training. Finally I elaborate a bit on IBM Watson and how it is connected to psychology.

Oh yes and you can find the standalone presentation here:
Click here to be taken to the corresponding Prezi presentation!


The black box of deep learning

In this amazing blog post from the magazine Nature, Davide Castelvecchi pointed out that we understand pretty little about how machines learn, where 'we' is actually defined as the experts who develop and implement deep learning systems. I would even state that we now understand far more about human learning than about how the machines we create learn!

The underlying causes of this are simplicity, complexity and chaos theory - let me explain.


The basic learning rules of an artificial neural network, aka deep learning system, can only be understood when focusing on the smallest unit of this construct, the artificial neuron. It goes way back to the year 1949, when neuropsychologist Donald O. Hebb formulated the idea of 'what fires together, wires together' in his book 'The Organization of Behavior'. Today it is known as Hebbian learning theory or in short: the Hebb rule.

Put simply, you can imagine two nerve cells and mentally observe the strength of their connection to each other. Hebb postulated that if these cells are excited at the same time and even fire an electrical impulse at the same time, then their connection strength will increase.
Here it is interesting to note that this postulate came from the theory of associative learning.

So, Donald Hebb took a theory for macroscopic learning effects and transferred it to the microscopic level of nerve cell assemblies. Some years later, this theory was proven right by Eric Kandel, who was able to confirm these predictions in the nerve cells of the California sea hare. The reason for choosing this animal was simply that the sea hare has nerve cell bodies up to 1 mm in diameter - which made it easy to place measurement electrodes inside of them.

The learning rule itself is very easy to understand and laid the foundation for the later theoretical formulations of artificial neural networks.
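The rule can be written down in a few lines. Here is a minimal sketch in Python, assuming the simplest textbook form of the Hebb rule: the change in connection strength is a learning rate times the product of both cells' activity. The function name hebbian_update, the learning rate of 0.1 and the firing patterns are my own illustrative assumptions, not something taken from Hebb's book.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Return the new connection strength after one time step.

    pre and post are the activities of the two cells (1 = fires, 0 = silent).
    The weight only grows when both cells are active at the same time.
    """
    return weight + learning_rate * pre * post


# Two cells over five time steps: they fire together in steps 1, 2 and 5.
pre_activity = [1, 1, 0, 1, 1]
post_activity = [1, 1, 1, 0, 1]

weight = 0.0
for pre, post in zip(pre_activity, post_activity):
    weight = hebbian_update(weight, pre, post)
    print(f"pre={pre} post={post} -> connection strength {weight:.1f}")
```

Note that this plain form can only strengthen a connection. Practical network models usually add some kind of decay or normalization so that the weights stay bounded.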


The movement of one pendulum is something which is mathematically very easy to describe - high school stuff. Take two of these buggers, connect them and it gets quite complicated. Be extremely mischievous and attach them to each other to form a double pendulum, give a mathematician the task to formally (aka with lots of equations) describe its movement and he will most likely quit the job or end up in an insane asylum. This is because describing the movement of a single pendulum is possible with formal mathematics to a sufficient level, so that we confuse the mathematical model with reality. Trying to apply this formalism to a double pendulum reveals the inaccuracies of this single model. The slightest influences, like air movement, magnetic fields or even the infinitesimally small force of light pushing, may change the movement of this system so drastically that any formalism must yield.
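This sensitivity can be made visible by calculating instead of describing. The following is a rough sketch, assuming an idealized double pendulum (point masses on massless rods, no friction) with the standard textbook equations of motion and a classical Runge-Kutta integrator. All values (1 kg masses, 1 m rods, a start angle of 120 degrees, a nudge of one billionth of a radian) are assumptions chosen for illustration. Two copies of the pendulum are started almost identically and the gap between them is tracked.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def derivatives(state, m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Time derivatives of (theta1, omega1, theta2, omega2) for an ideal double pendulum."""
    t1, w1, t2, w2 = state
    delta = t1 - t2
    den = 2 * m1 + m2 - m2 * math.cos(2 * delta)
    a1 = (-G * (2 * m1 + m2) * math.sin(t1)
          - m2 * G * math.sin(t1 - 2 * t2)
          - 2 * math.sin(delta) * m2
          * (w2 * w2 * l2 + w1 * w1 * l1 * math.cos(delta))) / (l1 * den)
    a2 = (2 * math.sin(delta)
          * (w1 * w1 * l1 * (m1 + m2)
             + G * (m1 + m2) * math.cos(t1)
             + w2 * w2 * l2 * m2 * math.cos(delta))) / (l2 * den)
    return (w1, a1, w2, a2)


def rk4_step(state, dt):
    """Advance the state by one time step with the classical Runge-Kutta method."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def angle_gap_over_time(theta_start, nudge, seconds=20.0, dt=0.002):
    """Simulate two pendulums whose first angle differs by `nudge`.

    Returns the gap between them (in radians), sampled once per simulated second.
    """
    a = (theta_start, 0.0, theta_start, 0.0)
    b = (theta_start + nudge, 0.0, theta_start, 0.0)
    steps_per_second = round(1 / dt)
    gaps = []
    for step in range(round(seconds / dt)):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        if (step + 1) % steps_per_second == 0:
            gaps.append(math.hypot(a[0] - b[0], a[2] - b[2]))
    return gaps


gaps = angle_gap_over_time(theta_start=math.radians(120), nudge=1e-9)
for second, gap in enumerate(gaps, start=1):
    print(f"t = {second:2d} s   gap = {gap:.3e} rad")
```

Started from a small angle such as 5 degrees, the same two pendulums stay practically together, which is the well-behaved regime we know from the single pendulum. Started from 120 degrees, the tiny nudge is amplified again and again until the two copies have nothing to do with each other anymore.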

Now, above we had two nerve cells - but deep learning networks can reach sizes of hundreds of billions of parameters, where parameters are the numerical representations of the mathematical formalism in the machine's memory. This can be very roughly compared to the size of the human brain by assuming that the numbers of neurons and connections (synapses) resemble these parameters. The number of parameters in the human brain would be some orders of magnitude higher.

So we can conclude that deep learning networks and biological networks are very complex stuff.

Chaos theory

Chaos theory postulates that such complex systems, no matter how vast they are, do not necessarily behave completely randomly. Even if we take the simple mechanics of two colliding molecules in the air and scale the description up to the level of all molecules in the air of this planet Earth - there is still hope.
Otherwise there would be no weather forecast!
Right, weather can be seen as a very complex assembly of simple physical rules - that is why weather frogs, sorry meteorologists, need computers so powerful that even the most megalomaniac gaming rig would pale in comparison. In physics there is a simple rule:

If mathematicians can't describe it, then calculate it!

Only, there are a lot of calculations to make for the weather forecast!

The same holds for deep learning networks. The difference is that in the weather forecast we observe a natural phenomenon and try to predict its behavior. In deep learning networks the system expresses behavior and we try to understand it.


For a bit more than one hundred years psychologists have been trying to do exactly what deep learning geeks are trying to figure out now: to understand and predict the observable behavior of very complex systems. The difference is that we psychologists mostly deal with organic computers walking on two legs and having mood swings, whereas computer scientists deal with silicon computers without legs (for now!) which have no mood swings (for now!) but are also 'doing' strange things.

My point is:

Since both disciplines try to understand the macroscopic behavior of complex systems, which are based on similar principles (!), is it too far-fetched to try to apply psychological theory construction to artificial information systems?

In all this lies the root of another major humiliation of the human condition, on the scale of Copernicus's, Galileo's and Darwin's conclusions. More about this another time!


IBM's Response to - Request for Information: Multi-disciplinary research

Response to - Request for Information: Multi-disciplinary research

This is the last part in my blog series about IBM's answer to the White House Request for Information.

I come from multi disciplinary research approaches. It began with my involvement in Artificial Neural Network (ANN) research as a young student of psychology in the early 1990's. When our tiny three person research nucleus went through the book series 'Parallel Distributed Processing' it became apparent that we had to learn much more than the usual curriculum for students of psychology offered. There were uncommon mathematical concepts like gradient descent, numerical approximations of differential equations and topology. We had to understand the basics of learning vector quantization to enable us to implement the self organizing maps contributed by Professor Teuvo Kohonen. Our programming skills had to improve too, to be able to implement meaningful ANNs. So we learned object oriented programming by teaching ourselves C++ with the second edition of the book 'The C++ Programming Language' by Bjarne Stroustrup. We also used Pascal and later Delphi to be able to discuss our code with our Professor, who knew Turbo Pascal and was unwilling to switch to a fancy new programming language. But I have to give an honorable mention to Professor Gert Haubensak, who was nevertheless willing to get involved in the concepts of object oriented programming with Object Pascal besides his duties as a full professor of psychology.

My minor in my psychology studies was sociology, where I learned to work with qualitative research methods. I specialized in 'The Civilizing Process' by Norbert Elias, which gave me insights into a completely different approach to viewing and describing social systems.

Later, in my Neuroscience years, I worked in projects together with mathematicians, physicists, physicians, linguists and computer science folks. My field of research was 'dynamic causal modelling for fMRI research', a highly multi disciplinary research field. Still today researchers with various backgrounds, spanning from engineering to medical science, work in this field.

So I know a bit about multi disciplinary research.

'AI researchers will need to work with educators, researchers, and entrepreneurs on developing cognitive assistants to help learners to follow alternative career pathways, moving from academic disciplines to entry level jobs to additional occupational transitions.'

There is a point somewhat hidden in the quote above that I will expand on now. In my opinion one of the biggest issues of multi disciplinary research is the translation of concepts. All international scientists collaborate using the English language; however, the most confusing experience in multi disciplinary research environments is that the same words don't mean the same to everyone. Furthermore, researchers from different fields might use the same concepts but express them using different words.

For example, when a physician mentions a 'vector' it is probably about the transmission of viral diseases, whereas the mathematician is confused about how his 'vector' is linked to the common flu, and the bystanding psychologist knows 'vector' from his statistics courses about factor analysis and tries desperately to link the elaborations to intelligence theory.

When a physicist is new to the field of human neuroscience he might be confused by the disparate lingo for a certain brain area. Let's assume he begins his work in a team researching the perception of visual movement. A neurosurgeon might describe this area of the brain using Brodmann's system, whereas the neurophysiologist might want to show off his knowledge of Talairach and MNI coordinates. The psychologist in the team is not interested in brain geography and just speaks of the 'V5 or MT' area to emphasize his interest in function over location. The poor physicist scratches his head and maybe wonders how it happened that talking about a chunk of grey matter at a specific location got so complicated!

This is all very confusing! 

So multi disciplinary researchers could really use AI systems to 'translate' between disciplines and to speed up learning for young researchers new to multi disciplinary research. In multi disciplinary research everyone is a 'young researcher' from time to time, because new theories might require involvement in formerly unknown areas of research.

'T-shaped professionals with depth and breadth better at teamwork and more adaptable than I-shaped professionals who only have depth.'

Well this is pretty much common knowledge in multi disciplinary research nowadays. For example when our tiny research nucleus about artificial neural networks meddled with mathematical concepts and new programming techniques we were far from being experts in these fields. Also later I didn't claim and still don't claim full understanding of the mathematical formulations of brain state dynamics.
But it is important to have an understanding of foreign terminology and concepts to a certain degree. This degree should enable one to apply multi disciplinary concepts and to collaborate with colleagues from different professional backgrounds.

This ends my mini series about IBM's response to the White House's RFI about 'preparing for the future of artificial intelligence'. Find part one here and part two here.


IBM briefs the White House about artificial intelligence: fundamental questions & emotions

Fundamental questions in AI research, and the most important research gaps

(Version from: July 28, 2016)
And emotions! Psychological research shows that emotions are a very important component in human action regulation. One may now think of the ideal of an AI that is not affected by any emotions. I am skeptical that this will work for true universal AI systems of the first generations. Such systems are still elements of science fiction, despite the progress in AI research. My reasoning is simple: AI is currently modeled after the only intelligent beings we know - us humans. Of course animals have certain degrees of intelligence too, but when I look at the examples and visions given for AI entities - it seems the human level is the goal. Now we humans are terribly bad at logical reasoning in daily life; we pass most of our daily life challenges with the support of our 'gut feeling', as has been shown by Gerd Gigerenzer. So, if we take ourselves as a model and leave out half of the necessary ingredients, we will fail!

Microsoft Tay, AE & the way to it.

At the beginning of 2016 Microsoft experienced a PR disaster. Tay is a chatbot that was brought online on the 23rd of March 2016 and was taken offline again after less than 24 hours. What happened? Microsoft's vision was well intended and Tay was a promising step towards a new generation of chatterbots. Normally a chatterbot or chatbot is a more or less sophisticated assembly of scripts that react to certain keywords someone types in a chat conversation. Tay is a deep learning AI system that was supposed to learn from its interactions, i.e. the exchange of text messages via Twitter. It did indeed learn! It learned the things a gullible dimwit (emotionally) learns when you put him on a schoolyard and tell kids to come and teach him stuff. Thousands of enthusiastic teenagers around the world began to 'teach' Tay. Some of the highlights can be seen in an article on gizmodo.com. A clinical psychologist would have probably diagnosed Tay as sociopathic or even psychopathic, lacking empathy and common moral values.

How do we humans acquire 'common moral values'?

In the movie 'Kindergarten Cop' there is a pretty self explanatory scene. Hardened police officer John Kimble has to go undercover as a kindergarten teacher. When he is introduced to his class for the first time a small boy says 'Girls have a vagina and boys have a penis.' All children laugh and John Kimble tries to control the situation by saying "Thanks for the tip!". What happened here is called social learning and was described as a theory by the psychologist Albert Bandura. The model for the other children was the little boy who made the delicate statement, which is a running gag during the movie. The other children now see the reaction of the grown-up to that statement and realize it is something special and funny - a conscious representation of the concepts 'penis', 'vagina', 'moral', 'shame' doesn't even need to be established here. It is sufficient for the other children to realize these words are emotionally loaded and create a tension.

Deep learning vs. semantic networks.

Trying to model the above example, we would have little success using a semantic network, assuming the AI has the same quantity of words and concepts as a kindergarten child. Most of them don't know the definitions of the words 'penis' and 'vagina', but they realize through social learning that something is 'funny' with these words, because they feel 'funny'. Here an approach utilizing deep learning would be promising. Training a system with enough 'funny' examples, witnessed on a model, would connect the feeling 'funny' with the words 'vagina' or 'penis' without the need for a semantic definition of these words.
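To make the idea a bit more tangible, here is a toy sketch in Python. It is deliberately not deep learning: it is a single-layer logistic model, used as a much simpler stand-in to show only the principle of learning an emotional loading from examples instead of definitions. Everything in it is hypothetical: the placeholder tokens taboo_a and taboo_b, the handful of observed utterances and the 'tension' signal (1 if the grown-up visibly reacted, 0 if not).

```python
import math


def train_word_feelings(observations, epochs=200, learning_rate=0.5):
    """Learn one 'emotional loading' per word from (words, tension) pairs.

    No word is ever defined. The only training signal is whether an
    utterance containing the word was followed by a tense reaction.
    """
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for words, tension in observations:
            unique = set(words)
            total = bias + sum(weights.get(w, 0.0) for w in unique)
            predicted = 1.0 / (1.0 + math.exp(-total))
            error = tension - predicted
            bias += learning_rate * error
            for w in unique:
                weights[w] = weights.get(w, 0.0) + learning_rate * error
    return weights, bias


# Hypothetical observations: (utterance, did the grown-up react with tension?)
observations = [
    (["boys", "have", "taboo_a"], 1),
    (["girls", "have", "taboo_b"], 1),
    (["boys", "have", "toys"], 0),
    (["girls", "have", "toys"], 0),
    (["we", "have", "lunch"], 0),
]

weights, bias = train_word_feelings(observations)
for word in sorted(weights, key=weights.get, reverse=True):
    print(f"{word:8s} {weights[word]:+.2f}")
```

After training, the placeholder tokens carry a positive loading and harmless words like 'toys' a negative one, although the program has no idea what any of these words mean.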

But emotions are chaotic! How to model them?

That something is chaotic doesn't mean you can't model it, as chaos theory has shown in the 1980's. Back then very simple computer programs could model deterministic chaos, which means that the chaos created by the machine is based on a quite simple set of rules. The theory also teaches us that deterministic chaos is not to be confused with completely random events. Many forms of chaotic states tend to gravitate to an attractor, a relatively stable system state - for the moment. So do emotions! Emotions can change in a seemingly chaotic way, but on closer look there is a system in these changes and they are usually never completely random. Maybe emotions can be modeled on sets of rules?
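One of those very simple programs fits in a few lines. The sketch below uses the logistic map, a classic textbook example of deterministic chaos, which I picked as an illustration. It is not a model of emotions. The parameter values 2.8 and 4.0 and the start values are assumptions chosen to show the two regimes: a stable attractor and chaos.

```python
def logistic_map(x, r):
    """One step of the logistic map, a one-line deterministic rule."""
    return r * x * (1.0 - x)


def iterate(x0, r, steps):
    """Return the trajectory [x0, x1, ..., x_steps]."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(logistic_map(trajectory[-1], r))
    return trajectory


# r = 2.8: very different start values all settle on the same point attractor.
for start in (0.1, 0.5, 0.9):
    print(start, "->", round(iterate(start, 2.8, 200)[-1], 6))

# r = 4.0: deterministic chaos. Two almost identical starts drift far apart,
# although not a single random number is involved.
a = iterate(0.2, 4.0, 100)
b = iterate(0.2 + 1e-10, 4.0, 100)
print("largest gap:", max(abs(x - y) for x, y in zip(a, b)))
```

For r = 2.8 the attractor is the fixed point 1 - 1/r, about 0.643. For r = 4.0 the same rule never settles, yet running the program twice with the same start value gives exactly the same numbers again: chaotic, but not random.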

Professor Dietrich Dörner, the PSI Theory and Artificial Emotions (AE).

The question of how to model emotions in a computer program was exactly the challenge Professor Dietrich Dörner at the University of Bamberg faced. His research field was human performance in complex situations modeled by computer simulations. So as early as the 1970's Dietrich Dörner and his team began to write computer programs which interacted with humans. Today we would probably call them computer games. The only difference to computer games was that the interaction between human and program was recorded on a very detailed level. His research showed that we humans generally perform terribly in complex situations. He researched which people perform well and what the factors of good and bad performance are. In the 1990's the necessity arose to simulate emotions in simulated entities to study the interaction between human and computer. So interestingly the goal was not to implement a general theory of Artificial Emotions, or AE - but to have a believable simulated entity on a computer screen.
This gave birth to the PSI-Theory, which was implemented and simulates a small robot that explores an island to satisfy its needs for food, water and friendship. For me the funny fact here is that Dietrich Dörner managed to implement a pretty good model of emotions that works believably! Today one can implement this model on any smartphone. Joscha Bach was one of his students and evolved the PSI-Theory by creating MicroPsi. Version two is publicly available at GitHub.

So, does it mean AI systems will have feelings and a soul?
My opinion is that these are philosophical questions. I believe that in some decades we will have fully evolved general AI systems that seem to have feelings, dream of electric sheep and maybe even believe they have a soul. But since medicine, neuroscience and psychology have not yet answered the question about the nature of consciousness, we simply have no empirical benchmark to prove the above postulates. Until then, it has to be good enough to have working models of emotions.

To be continued...


IBM briefs the White House about artificial intelligence: society, economics & education

In June 2016 the White House published a request for information about the future of Artificial Intelligence, seeking input from industry and thought leaders. In my humble opinion that was because 2016 seems to be the year in which Artificial Intelligence left the lab. Sure, there have been widespread applications of deep learning in image classification, customer classification in retail and other specialized areas.

But since IBM stirs the pot with its Watson technology, Tesla gets publicity with its self-driving cars everyone can buy and Google's AlphaGo won its epic Go match against Lee Sedol, things have become more visible to the public eye. I think there is now a widespread awareness rising that Artificial Intelligence is no longer part of science fiction playbooks but is arriving in our everyday life.

I mostly agree with IBM's answer to the RFI; however, on some points I have a different opinion, which I will lay out here.

Social and economic implications of AI 

(Version from: July 28, 2016)
Overall the assessment is that the positive consequences will outweigh the negative ones and that history has shown us that everyone will benefit from a rise in productivity. Well, I think this will depend on how society, investors and policy makers react to this productivity rise. I still remember an article I read in the 80's in the German popular science magazine P.M. Back then, the robotics industry was on the rise, especially brought to public awareness by the success of Japanese companies. The prediction in this article for the new millennium was that it would be sufficient for employees to work around four hours a day to earn a living, even a good one! Well, we are far from that - surprisingly, though, the assumed rises in productivity turned out to be quite right. So where did the benefit go? Only a fraction of society in the western industrialized world benefited from the productivity gains from industrial robotics and the widespread introduction of ERP systems. The coming challenge will be to really let the whole society benefit from increased productivity by AI.

Education for harnessing AI technologies 

(Version from: July 28, 2016)
I think it is prudent to point out that the biggest potential for education lies in AI itself, in AI coaches to be specific. I use the term coach instead of teacher because in the beginning AI systems for teaching will be more like a coach, supporting instead of guiding and developing a curriculum. This will already be a big step towards tackling the one-teacher-many-students problem. AI coaches have the potential to give every student the feeling of being taken care of by offering a tailored learning experience. Chalapathy Neti, Director of Education Transformation at IBM, has already pointed this out.

That AI systems can be utilized in teaching has already been demonstrated by Professor Ashok Goel, who introduced 'Jill Watson' as a teaching assistant. For the whole semester his students were unaware that Jill was in fact an AI system. They were quite surprised when it was revealed to them, because they were about to give very positive feedback for this assistant. The Sydney Morning Herald wrote a nice article about this. Professor Goel also thinks that AI systems, AI coaches in my wording, could be a big improvement for MOOCs.
Official press release from Georgia Tech.

To be continued...


A psychologist's view on Artificial Intelligence, Cognitive Systems, Deep Learning and similar disruptive events.

AlphaGo challenged Lee Sedol this year and won four out of five games of Go. Google made the Deep Learning API TensorFlow available last year, with explicit permission to use it for commercial purposes. IBM's Watson technology left the laboratory and is now a commercial product available for everyone. Nearly every day now I read news worth spreading about disruptive events in the world related to AI technology. I finally thought it is time to start a blog.

Where do I come from?

I was a Star Wars kid and a Trekkie. In my teenage years Blade Runner was one of the most inspiring movies for me. It is based on the Novel 'Do Androids Dream of Electric Sheep?' by Philip K. Dick and asked the question if artificially created beings can have real emotions and should be treated like real beings. Which raised the interesting question for me of what the quality of this reality might be.

After a short interlude of studying physics I switched my subject to psychology, because it was (and still is) such a young science, full of promises for new discoveries. In the early 90's I discovered that one of the first courses about 'Artificial Neural Networks' would be held at the Justus Liebig University Giessen at the department of Psychology. I had heard vague rumors about these things and that they might be the technology future R2D2s and C3POs would be based on. As I entered the lecture hall five minutes before the designated time there was only one other guy with long hair and an electric guitar hanging from his ear. I had expected the lecture hall to be bursting with students! It turned out we were only three persons for this course, including our Professor: Gert Haubensak. We started with the book 'Parallel Distributed Processing: Explorations in the Microstructure of Cognition' by James McClelland and David Rumelhart. Soon the theory was too dry and we implemented our own Neural Networks based on the source code in the book 'Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises'. This course was conducted for several years to come and we were constant participants. So a small AI research nucleus was formed at the psychological faculty in Giessen.

After extended study years I was eager to get a research job in this promising and emerging field of cognitive science. I was quite dumbfounded to learn that there seemed to be no need for a psychologist in this field. There were some computer science labs in Germany applying this technology but they had no use for a psychologist. The only psychologist with a research group in this field was Professor Dietrich Dörner at the University of Bamberg. He and his team developed the PSI-Theory of action regulation, intentions and emotions, which was quite cool back then and quite tempting for me. Unfortunately his research team was small and he could not take everyone in. So looking for a job was quite a sobering experience.

The Neuroscience years.

I don't know how many job interviews I had where I gave presentations of my research with Artificial Neural Networks to no avail. So many that I didn't care any more and just viewed them as lottery tickets. To my luck, Professor Mark W. Greenlee gave me a position as a research assistant with the possibility to work on my PhD thesis. However, my research field was no longer Artificial Neural Networks but Neuroscience research on the human brain, specifically researching the human visual system with Dynamic Causal Modelling and fMRI experiments. It seemed a far cry from AI research back then, but in the years to come the connections between these research fields became more visible.

Artificial Neural Networks try to mimic the principles of biological nerve cells in a very simplifying way. In fact the incredible complexity of a single neuron is broken down into a handful of mathematical equations that should approximate its computational behavior, creating a crude simulation of a wetware computer.

My PhD thesis researched the interaction of regions in the human brain, each region containing hundreds of millions of biological neurons. Until some years ago, simulating such vast numbers in a computer was considered a theoretical question at best.

Mind, Brain & Chips

But the computing power problem was tackled in the last decade in two ways. In a talk at Google Tech Talks Geoffrey Hinton jokingly boasted how he made computations 100,000 times faster: he optimized the algorithm to calculate 100 times faster, which took him 17 years, and in that time computers got 1,000 times faster.
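The arithmetic behind the joke is quickly checked. The small Python sketch below takes the three figures exactly as quoted above and additionally assumes that the hardware gains compounded evenly over the 17 years, which is my simplification and not something claimed in the talk.

```python
import math

algorithm_speedup = 100    # figure as quoted above
hardware_speedup = 1_000   # figure as quoted above
years = 17                 # figure as quoted above

# Both speedups multiply, they do not add up.
total_speedup = algorithm_speedup * hardware_speedup

# Assumption: the hardware gain compounds evenly year by year.
hardware_growth_per_year = hardware_speedup ** (1 / years)
hardware_doubling_time = years * math.log(2) / math.log(hardware_speedup)

print(f"total speedup: {total_speedup:,}")
print(f"hardware growth per year: x{hardware_growth_per_year:.2f}")
print(f"hardware doubling time: {hardware_doubling_time:.2f} years")
```

Under that assumption the computers got about 1.5 times faster every year, which corresponds to a doubling roughly every 1.7 years.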

Furthermore the technology of graphics accelerators, widely used for enhancing 3D computer gaming experiences, was utilized for Artificial Neural Networks, speeding up the calculations drastically. As of this year the first commercial off-the-shelf supercomputer for Artificial Neural Network computations is available, NVIDIA's DGX-1.

In 2015 IBM introduced a computer chip called True North that resembles an artificial neural network. I wrote a short post about SAP HANA & True North in the SAP SCN community.

The other approach to speeding things up is to understand the vast complexity of mammalian brain regions and to boil this complexity down into mathematical equations that model the behavior of these brain regions. I had the pleasure of visiting the lab of Professor Gustavo Deco at the Pompeu Fabra University in Barcelona in 2007. I was introduced to a fascinating field of research that gave hope that one day it will be possible to describe large parts of our mind's workings via mathematics.

Professor Klaas Enno Stephan is the head of the Translational Neuromodeling Unit of the University of Zürich and the ETH Zürich. He and his team are researching the system dynamics of the brain, deepening the understanding of it as a cognitive system and making numerical testing of mathematical models possible.

So the year 2016 is a quite fitting year to start a blog about Artificial Intelligence, Cognitive Systems & Theories, Deep Learning, Neuro -science, -computation & -economics, Brain Computer Interfaces etc. pp.

Or in short about Mind, Brain & Chips!

Especially because this is the 60th year of the term Artificial Intelligence, which was officially coined at the Dartmouth Conference in 1956.


Wait! You wrote Deep Learning?

Yes - but I am too old and too tired to be impressed by the latest hype speak. For me Deep Learning systems are Artificial Neural Networks. I know that some information scientists would strongly disagree now, pointing out that Deep Learning systems have a much higher degree of complexity and variation of underlying algorithms.

From the viewpoint of a neuroscientist a mammal brain is a mammal brain, even if a biologist would strongly disagree and point out the differences in behavior and abilities between a rhesus monkey and a human being. The psychologist in me also shrugs and thinks of decades of cognitive psychology research where models were made that consisted of many boxes connected by arrows (like here). Whether these systems are computed by single CPUs, massively parallel computing architectures or vast assemblies of stone pebbles makes little difference to me.

But I will get used to the term Deep Learning.