Interpreting deep neural networks with cognitive psychology
Deep neural networks have learned to perform a remarkable range of tasks – from recognising and reasoning about objects in images to playing Atari and Go at levels that beat humans. As these tasks and network architectures become more complex, the solutions that neural networks learn become harder to understand.
This is known as the ‘black box’ problem, and it is becoming increasingly important as neural networks are used in more and more real-world applications.
Researchers are working to expand the toolkit for understanding and interpreting these systems. In a paper accepted at ICML, DeepMind proposed a new approach to this problem that applies methods from cognitive psychology to understand deep neural networks. Cognitive psychology measures behaviour to make inferences about cognitive mechanisms, and has produced a vast literature detailing these mechanisms, along with experiments for verifying them. As our neural networks reach human-level performance on particular tasks, methods from cognitive psychology are becoming increasingly relevant to the black-box problem.
To illustrate this point, the paper reports a case study in which an experiment designed to elucidate human cognition was used to help understand how deep networks solve an image classification task.
The results showed that behaviours observed by cognitive psychologists in humans are also exhibited by these deep networks. Further, the results revealed useful and surprising insights into how the networks solve the classification task. More generally, the success of the case study demonstrated the potential of using cognitive psychology to understand deep learning systems.
Measuring the shape bias in one-shot word learning models
The case study examined how children identify and label objects – a rich area of research in developmental cognitive psychology. Children’s ability to guess the meaning of a word from a single example – so-called ‘one-shot word learning’ – happens with such ease that it is tempting to view it as a simple process. But a classic thought experiment from the philosopher Willard Van Orman Quine shows just how complicated it actually is:
A field linguist has gone to visit a culture whose language is entirely different from his own. The linguist is trying to learn some words from a helpful native speaker, when a rabbit runs past. The native speaker declares “gavagai”, and the linguist is left to infer the meaning of this new word. The linguist is faced with an abundance of possible inferences, including that gavagai refers to rabbits, animals, white things, that specific rabbit, or “undetached parts of rabbits”. There is an infinity of possible inferences. How are people able to choose the correct one?
Fifty years later, we face the same question with respect to deep neural networks that can perform one-shot learning. Consider the Matching Network, a neural network developed by DeepMind. This model uses recent advances in attention and memory to achieve state-of-the-art performance classifying ImageNet images using only a single example from a class. But we do not know what assumptions the network makes in order to classify these images.
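To make the one-shot setup concrete, the core of a matching network can be sketched as attention over a labelled support set: the query is compared to each support example and inherits a label distribution weighted by similarity. This is a minimal sketch assuming pre-computed embeddings; the real model learns the embedding functions end-to-end, and the function name, shapes, and example vectors below are illustrative assumptions, not the paper’s code.

```python
import numpy as np

def matching_network_predict(support_emb, support_labels, query_emb):
    """Classify a query by attending over a labelled support set.

    support_emb:    (n, d) embeddings of the support examples
    support_labels: (n, k) one-hot labels of the support examples
    query_emb:      (d,)   embedding of the query image
    """
    # Cosine similarity between the query and each support embedding.
    norms = np.linalg.norm(support_emb, axis=1) * np.linalg.norm(query_emb)
    sims = support_emb @ query_emb / np.maximum(norms, 1e-8)

    # Softmax over similarities gives the attention weights.
    attn = np.exp(sims - sims.max())
    attn /= attn.sum()

    # The predicted label distribution is an attention-weighted
    # combination of the support labels.
    return attn @ support_labels

# One-shot example: a single support image per class.
support = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.eye(2)
query = np.array([0.9, 0.1])  # embedding closest to class 0
probs = matching_network_predict(support, labels, query)
print(probs.argmax())  # class 0
```

Because classification is a soft nearest-neighbour lookup over the support set, the network can label examples from classes it has never been trained on, given just one labelled instance.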
To shed light on this, we turn to the work of developmental psychologists (1-4) who found evidence that children arrive at the correct inference by applying inductive biases that eliminate many of the incorrect inferences:
Whole object bias, by which children assume that a word refers to an entire object and not its components (eliminating Quine’s concern about undetached rabbit parts);
Taxonomic bias, by which children assume that a word refers to the basic-level category an object belongs to (quelling Quine’s fear that all animals might be chosen as the meaning of “rabbit”);
Shape bias, by which children assume that the meaning of a noun is based on an object’s shape rather than its colour or texture (quelling Quine’s worry that all white things might be assigned the meaning of “rabbit”).
We took up the classic shape bias experiment, which proceeds as follows: we present our deep networks with images of three distinct objects – a probe object, a shape-match object (which matches the probe in shape but not colour), and a colour-match object (which matches the probe in colour but not shape). We then measure the shape bias as the proportion of times the probe image is assigned the same label as the shape-match image rather than the colour-match image.
We used images of objects from the human experiments run at the Cognitive Development Lab at Indiana University.
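The measurement described above can be sketched in a few lines. This is a hypothetical harness, not the paper’s code: images are stand-in (shape, colour) pairs, and `assign_label` is an assumed interface wrapping whatever model decides which candidate shares the probe’s label.

```python
def shape_bias(trials, assign_label):
    """Fraction of trials where the probe receives the shape-match's label.

    trials: list of (probe, shape_match, colour_match) image triples
    assign_label(probe, a, b): returns 0 if the probe is matched to a,
                               1 if it is matched to b
    """
    shape_choices = 0
    for probe, shape_match, colour_match in trials:
        if assign_label(probe, shape_match, colour_match) == 0:
            shape_choices += 1
    return shape_choices / len(trials)

# Toy images as (shape, colour) pairs; a matcher that compares shapes
# only should produce the maximum possible shape bias.
trials = [(("cube", "red"), ("cube", "blue"), ("ball", "red")),
          (("ball", "green"), ("ball", "red"), ("cube", "green"))]
by_shape = lambda p, a, b: 0 if p[0] == a[0] else 1
print(shape_bias(trials, by_shape))  # 1.0
```

A bias of 1.0 means labels always follow shape, 0.0 means they always follow colour, and 0.5 indicates no preference either way.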
We ran this experiment with our deep networks (matching networks and an Inception baseline model) and found that – like humans – our networks have a strong bias towards object shape over colour or texture. In other words, they have a ‘shape bias’.
This suggests that matching networks and the Inception classifier use an inductive bias for shape to rule out incorrect hypotheses, giving us a clear insight into how these networks solve the one-shot word learning problem.
The observation of shape bias was not the only interesting finding.
We observed that shape bias emerges gradually over the course of early training in our networks. This resembles the emergence of shape bias in humans: younger children show less shape bias than older children, and adults show the strongest bias of all.
We found that networks exhibit different levels of shape bias depending on the random seed used for training. This teaches us that we must use many trained networks to draw valid conclusions when experimenting with deep learning systems, just as psychologists have learned not to draw conclusions from a single subject.
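The many-seeds point amounts to treating each trained network as one subject and reporting statistics over the population. A minimal sketch, where `train_and_measure_bias` is a hypothetical stand-in for training a network from a given seed and running the probe trials on it (the numbers it returns here are synthetic, not real results):

```python
import random
import statistics

def train_and_measure_bias(seed):
    # Stand-in: real code would train a network with this seed and
    # measure its shape bias on the probe/shape-match/colour-match trials.
    rng = random.Random(seed)
    return 0.6 + 0.3 * rng.random()  # bias varies with the seed

seeds = range(10)
biases = [train_and_measure_bias(s) for s in seeds]
print(f"mean={statistics.mean(biases):.2f} "
      f"sd={statistics.stdev(biases):.2f}")
```

Reporting a mean and spread across seeds, rather than a single number, makes the seed-to-seed variability visible instead of hiding it.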
We found that networks achieve the same one-shot learning performance even when their shape biases differ radically, demonstrating that different networks can discover a variety of equally effective solutions to a complex problem.
The discovery of this previously unrecognised bias in standard neural network architectures shows the promise of using artificial cognitive psychology to interpret neural network solutions. In other domains, insights from the episodic memory literature could help us understand episodic memory architectures, and techniques from the semantic cognition literature could help us understand recent models of concept formation. The psychological literature is rich in these and many other areas, offering powerful tools to tackle the ‘black box’ problem and to understand the behaviour of our neural networks more deeply.