Teaching AI to Improve Visual Recognition

How does the brain recognize an image? Humans can tell a rose from a camellia at a glance, and a cat can track a flying bug and catch it immediately. During visual recognition, many neurons located across several layers of the visual cortex are activated hierarchically. Recent research has characterized neurons that respond not only to the direction of a visual stimulus but also to its size, position, and rotation. However, the visual information we receive at every moment is highly diverse, and the resulting neuronal responses are often nonlinear and complex. Nonlinear neuronal responses are difficult to analyze, and it is hard to predict which visual properties they represent; therefore, scientists are turning to artificial neural networks to “visualize” the complexity of visual responses.

To understand complex visual information, a group led by Prof. Kenichi Ohki of the International Research Center for Neurointelligence (IRCN) and the Department of Physiology, Graduate School of Medicine, the University of Tokyo, developed a new method for characterizing nonlinear responses, in particular for the nonlinear estimation of receptive fields (RFs), a basic property describing the response profile of a neuron. They used a computer model called a convolutional neural network (CNN), a key component of recent AI technology, to build an “encoding model” of visual neurons. First, they trained the CNN to predict neuronal responses to natural images (Figure 1). Then, using the encoding model, the researchers generated RF images that would evoke a maximum response in the target neuron. When the trained CNN was applied to neurons in the mouse primary visual cortex (V1), it generated reasonably accurate RFs. Remarkably, the computed RF patterns looked slightly different when different conditions were used (Figure 2), and they could be used to classify V1 neurons as either shift-variant or shift-invariant.
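The second step described above, generating an image that evokes a maximum response from an encoding model, can be illustrated with a minimal sketch. This is not the authors' implementation: the trained CNN is replaced here by a hypothetical toy model (a squared filter output), and the names gabor_patch, toy_response, toy_response_gradient, and generate_rf, as well as all parameter values, are assumptions made for illustration only.

```python
import numpy as np

def gabor_patch(size=10, wavelength=4.0, sigma=2.0):
    """Build a small unit-norm Gabor-like filter (a hypothetical 'true' RF)."""
    coords = np.arange(size) - (size - 1) / 2.0
    y, x = np.meshgrid(coords, coords, indexing="ij")
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    patch = envelope * np.cos(2 * np.pi * x / wavelength)
    return patch / np.linalg.norm(patch)

def toy_response(image, filt):
    """Toy stand-in for a trained encoding model: squared filter output."""
    return float(np.sum(image * filt) ** 2)

def toy_response_gradient(image, filt):
    """Analytic gradient of toy_response with respect to the image."""
    return 2.0 * np.sum(image * filt) * filt

def generate_rf(filt, steps=200, lr=0.1, seed=0):
    """Gradient ascent on the image, keeping it at unit norm after each step."""
    rng = np.random.default_rng(seed)
    image = rng.normal(size=filt.shape)
    image /= np.linalg.norm(image)
    for _ in range(steps):
        image = image + lr * toy_response_gradient(image, filt)
        image /= np.linalg.norm(image)  # constrain stimulus energy
    return image

filt = gabor_patch()
rf_image = generate_rf(filt)
# Similarity between the generated image and the toy model's filter
# (absolute value, because the squared response ignores the sign).
similarity = abs(float(np.sum(rf_image * filt)))
```

In this toy case the generated image converges to the model's own filter, up to sign. With a real CNN encoding model the gradient would come from automatic differentiation rather than a closed-form expression.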

The new method, reported in the journal Scientific Reports, is expected to be reliable enough to serve as a tool for analyzing the complex nonlinear responses of various types of neurons in the brain. Although computer-generated artificial neural networks and the brain’s own biological neural networks have much in common, Ohki cautioned that the former are far from an exact in silico implementation of the latter. If researchers can design and train artificial neural networks to learn further nonlinear responses, then CNN-based computer vision might one day instantly “see” roses and camellias.

Correspondent: Mayumi Kimura, Ph.D., IRCN Science Writing Core

Reference: Ukita J, Yoshida T, and Ohki K (2019) Characterisation of nonlinear receptive fields of visual neurons by convolutional neural network. Scientific Reports. Published online: March 7, 2019. DOI:10.1038/s41598-019-40535-4

Graphic: See below

Media Contact: The author is available for interviews in English.

Professor Kenichi Ohki, M.D., Ph.D.
Department of Physiology, Graduate School of Medicine
International Research Center for Neurointelligence
The University of Tokyo

Mayuki Satake
Public Relations
International Research Center for Neurointelligence
The University of Tokyo
pr@ircn.jp

Fig. 1 The CNN encoding model

A convolutional neural network (CNN) was trained to predict the responses of visual neurons to natural images more accurately.

Fig. 2 Trained CNNs estimate neuron RFs

Ten iteratively generated RFs for V1 neurons are shown. In contrast to the RF estimates for Neuron #639, some RF images for Neuron #646 (blue-framed) were shifted relative to one another by one pixel.
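One simple way to check whether two generated RF images differ by a small shift is to compare them at several pixel offsets and keep the best match. The sketch below is only an illustration under stated assumptions, not the analysis used in the paper: best_shift is a hypothetical helper, the images are random placeholder data, and np.roll applies a circular shift, which only approximates a true translation near the image borders.

```python
import numpy as np

def best_shift(reference, image, max_shift=2):
    """Return the (dy, dx) shift of `reference` that best matches `image`,
    together with the correlation coefficient at that shift."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Circularly shift the reference by (dy, dx) pixels.
            shifted = np.roll(reference, shift=(dy, dx), axis=(0, 1))
            score = np.corrcoef(shifted.ravel(), image.ravel())[0, 1]
            if score > best_score:
                best, best_score = (dy, dx), float(score)
    return best, best_score

# Placeholder data: rf_b is rf_a shifted one pixel along the horizontal axis.
rng = np.random.default_rng(0)
rf_a = rng.normal(size=(10, 10))
rf_b = np.roll(rf_a, shift=1, axis=1)

shift, score = best_shift(rf_a, rf_b)
```

A neuron whose generated RF images match one another only after a nonzero shift would be a candidate for the shift-invariant class, while one whose images match best at zero shift would look shift-variant. Any threshold used for that decision would be a further assumption.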