Sparsey, but Not Most Other Neural Models, Explicitly Enforces Mesoscale Structure and Function

Sparsey explicitly enforces structural/functional organization at the mesoscale, in particular at both the minicolumn and macrocolumn scales, as can be seen in figures throughout these pages.  Specifically, the mac is formalized as a set of Q winner-take-all (WTA) competitive modules (CMs), proposed as analogs of cortical minicolumns, each consisting of K cells.  For simplicity, and in all instantiations thus far, the CMs are disjoint.  Thus far, macs are disjoint as well, though their receptive and projective fields (RFs, PFs) are allowed to have arbitrary overlap.  There is nothing in the theory that in principle prevents either CMs or macs from overlapping.  That is, any given cell could be part of more than one CM (WTA network), and CMs, or portions of CMs, could be part of multiple macs as well.  The relation of the many types of cortical interneurons to this modular organization is far from comprehensively understood, especially when gap junction connectivity is considered as well.  This is a future research topic.
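
To make this organization concrete, here is a minimal sketch in plain NumPy (the array shapes, the random stand-in inputs, and the hypothetical mac_activation function are my own illustrative choices, not Sparsey's actual code).  It represents a mac as a Q x K array partitioned into Q disjoint WTA CMs, so that any code active in the mac is a set of exactly Q cells, one winner per CM.

    import numpy as np

    def mac_activation(inputs):
        # Hypothetical sketch: a mac as Q disjoint WTA competitive modules (CMs),
        # each consisting of K cells.  inputs has shape (Q, K), one row of input
        # summations per CM.  Returns a binary code with exactly one winner per CM.
        Q, K = inputs.shape
        code = np.zeros((Q, K), dtype=int)
        winners = inputs.argmax(axis=1)     # independent WTA competition within each CM
        code[np.arange(Q), winners] = 1     # exactly Q active cells, one per CM
        return code

    # Example: a mac with Q=6 CMs (minicolumn analogs) of K=8 cells each
    u = np.random.default_rng(0).random((6, 8))   # stand-in for the cells' input summations
    print(mac_activation(u).sum())                # -> 6: one winner per CM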

However, in general, most other existing neural, more specifically cortical, models do not explicitly enforce macrocolumn (or minicolumn) scale structure/function.  This is likely due to the historically strong bias toward viewing the single neuron as the "atomic" unit of intelligent computation, or in other words to the bias toward thinking in terms of localist representations.  My claim may be viewed with skepticism, so I'll explain it with respect to RBMs and ConvNets, the two main sub-types of the most popular and successful class of models, the Deep Learning (DL) models.

  1. RBMs: Specifically, a layer of an RBM is simply a field of units with no connections between them and no structural constraints influencing the correlations (or lack thereof) that may accrue amongst units.  True, more recent instantiations may have sparsity-enforcing constraints, but these only directly constrain the overall number of strongly-responding units, not the patterns of which units respond; i.e., they do not directly impose mesoscale structure (see the first sketch following this list).
  2. ConvNets: At first sight, ConvNet-type DL models might appear to have an explicit mesoscale structure. Specifically, the ConvNet concept of a kernel (filter), which conceptually slides over the input surface, might be mistaken for a module. But a kernel is not a module; conceptually, a kernel is a receptive field (RF), i.e., an afferent set of synapses (or of the source cells of those synapses).  In the literature, a kernel is always a property of a single processing unit, proposed as analogous to a neuron, that receives/processes all information from the kernel.  As with a single RBM layer, any individual map in a ConvNet is a flat field of units.  It is true that typically, during recognition/retrieval operation (i.e., following a learning mode), only the unit having the maximum match to its input kernel becomes active (in a max-pooling level) and sends outputs. However, in all ConvNet instances that I've seen, every individual map covers the entire subjacent level. Thus, a map is a single module whose input is the global input space and, as such, involves no structural enforcement of mesoscale functionality (see the second sketch following this list).  And the top level(s) of ConvNets are usually all-to-all-connected flat fields of units, thus again possessing no mesoscale structure.
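
To make the distinction in item 1 concrete, here is a minimal sketch (my own illustration in NumPy, not code from any published RBM; the function names and the 5% target are hypothetical).  A generic sparsity penalty of the kind used in sparse RBMs/autoencoders only pushes the average activation level toward a target, saying nothing about which units may co-activate, whereas a modular constraint partitions the units into Q modules of K cells and dictates the activation pattern directly.

    import numpy as np

    def global_sparsity_penalty(activations, target=0.05):
        # RBM-style sparsity term (e.g., penalizing deviation of mean activation
        # from a target level): it constrains roughly HOW MANY units are active,
        # but not WHICH units co-activate -- no mesoscale structure is imposed.
        return float(((activations.mean(axis=0) - target) ** 2).sum())

    def modular_wta(activations, Q, K):
        # By contrast, a mesoscale constraint: units are partitioned into Q
        # modules of K cells, and exactly one unit per module is active.
        per_module = activations.reshape(-1, Q, K)
        winners = per_module.argmax(axis=2)
        code = np.zeros_like(per_module)
        np.put_along_axis(code, winners[..., None], 1.0, axis=2)
        return code.reshape(activations.shape)

    acts = np.random.default_rng(1).random((4, 24))        # 4 samples, 24 hidden units
    print(global_sparsity_penalty(acts))                   # a single scalar penalty
    print(modular_wta(acts, Q=6, K=4).sum(axis=1))         # -> [6. 6. 6. 6.]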
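
Similarly for item 2, the following sketch (again my own illustration; shapes and names are hypothetical) shows in what sense a kernel is a receptive field rather than a module: one and the same weight set is applied at every position of the entire subjacent surface, so a feature map is a single unit type replicated over the global input space, with no partition into local competitive modules.

    import numpy as np

    def feature_map(image, kernel):
        # One ConvNet feature map: a single kernel (the afferent weight set of one
        # unit type, i.e., its receptive field) is swept over the ENTIRE input
        # surface; the map itself is not partitioned into competitive modules.
        H, W = image.shape
        kh, kw = kernel.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)  # same weights everywhere
        return out

    img = np.random.default_rng(2).random((8, 8))
    k = np.ones((3, 3)) / 9.0          # one shared kernel = one unit's receptive field
    print(feature_map(img, k).shape)   # (6, 6): the map's input is the whole 8x8 surface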

Above, we have explained that the currently most successful learning systems, the DL models, do not possess explicit mesoscale structure.  In fact, throughout most of the history of neural network research, absence of explicit mesoscale structure and function, and presence of all-to-all mappings, has been the rule.  This includes Kohonen's self-organizing maps, Grossberg's ART models, Fukushima's neocognitron models, Hopfield nets, Rolls' VisNet, Litvak & Ullman's model, all of the probabilistic population code models, e.g., Georgopoulos, Pouget et al., Sanger, Movshon, and many more.  By and large, there has been relatively little focus on mesoscale structure/function within the machine intelligence and even the neural network communities through the decades. There have been some exceptions, e.g., Lansner's work and Murre's CALM modules.  Recently, Hinton has acknowledged this long-standing absence within the classes of models with which he has been involved, e.g., MLP/backprop and Boltzmann Machines.  His current foray into this domain features a module, called a "capsule", which he posits as analogous to the minicolumn, though this work is very new and little has been published on it.

