Complex Hierarchical Trace (Engram) of Sequence

The Figure 1 animation shows the hierarchical memory trace (engram) of a short preprocessed video, a Weizmann snippet of a person bending down, in a small Sparsey model with two internal levels. The left panel shows the preprocessed (edge-filtered, binarized, skeletonized) input sequence. The middle panel shows a plan view of the grid of macrocolumns ("macs"), i.e., V1 hypercolumns, that receive input from the input pixels. We do not show the receptive fields (RFs) of these macs in Figure 1, but they are rounded patches of approximately 40-50 pixels centered below each mac (see Figure 3). RFs of neighboring macs overlap to varying extents (seen here and here).

Each of these macs is a sparse distributed representation (SDR) coding field consisting of Q=7 winner-take-all (WTA) competitive modules (CMs), each composed of K=7 binary units. What you see here is the pattern of V1 macs that become active as the input plays out. A V1 mac becomes active (rose color) only when the number of active pixels in its RF meets the activation criterion. As a result, the pattern of V1 mac activation through time more or less tracks the contour of the bending figure. When a mac activates, that activation takes the form of an SDR (code), i.e., a cell assembly, consisting of exactly Q active units, one per CM. [You need to zoom in to see the actual units in the V1 macs (go fullscreen or maybe download the mp4).] At V1, these codes activate for one frame, i.e., the code "persistence" at V1 is one frame. These active codes send out large numbers of efferent signals (not shown here) via bottom-up (U) synapses to the macs at the next higher level (analog of V2) and via horizontal (H) synapses, both recurrently to other units in their own mac and to neighboring V1 macs. In general, they will also send top-down signals.
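To make the structure of a mac's code concrete, here is a minimal Python sketch of an SDR as described above: one winning unit in each of Q CMs, so exactly Q of the Q×K binary units are active. The function names are hypothetical, and the random choice of winners is only a placeholder; it is not the model's actual code-selection procedure.

```python
import random

Q, K = 7, 7  # V1 values from the text: Q CMs per mac, K binary units per CM


def random_sdr_code(q=Q, k=K, rng=None):
    """Pick one winning unit index in each of q WTA competitive modules.

    Random selection is an illustrative placeholder only.
    """
    rng = rng or random.Random()
    return tuple(rng.randrange(k) for _ in range(q))


def code_to_bits(code, k=K):
    """Expand a code into a flat binary vector of len(code) * k units."""
    bits = []
    for winner in code:
        bits.extend(1 if unit == winner else 0 for unit in range(k))
    return bits


def overlap(code_a, code_b):
    """Number of CMs in which two codes have the same winner (0..Q)."""
    return sum(a == b for a, b in zip(code_a, code_b))


code = random_sdr_code(rng=random.Random(0))
bits = code_to_bits(code)
num_possible_codes = K ** Q  # one of K winners in each of Q CMs

print(len(bits), sum(bits), num_possible_codes)
```

Because each CM contributes exactly one winner, every code has the same number of active units, and two codes can share anywhere from 0 to Q units. With Q=7 and K=7 the code space has K^Q = 823,543 distinct codes.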

Figure 1: Depiction of the engram, in the form of a spatiotemporal pattern ("Hebbian phase sequence") of many cell assemblies playing out over the macrocolumns of V1 and V2.

The right panel shows the grid of "V2" macs, each with Q=6 CMs, each having K=6 binary units. Note that they only appear larger than V1 macs here because our software expanded the grid to have the same overall dimensions as the V1 grid and the input grid. Nevertheless, the increased size of the V2 macs does correctly suggest the larger RFs of V2 macs. That is, just as a V1 mac receives input from a patch of pixels below it, a V2 mac receives input from a patch of V1 macs below it (see here for more detail on increasing RF size with level). Figure 2 shows some of the U signals (blue lines) and H signals (green arcs) that propagate as this engram plays out. It suggests the fan-in/out, i.e., overlapping RFs. As you can see in Figure 1, the pattern of V2 mac activation through time also more or less tracks the bending figure, but much more coarsely than at V1. Similar to V1, a V2 mac activates only when the number of active V1 macs in its RF meets the activation criterion. In general, the larger V2 mac RFs will overlap even more than the V1 mac RFs. Also, the persistence of V2 SDR codes is two frames. This allows single SDR codes at V2 to become synaptically linked (bidirectionally) to two-frame-long sequences of V1 SDR codes (i.e., temporal "chunking", compression).

[Note: the color scheme for the units is: black=correctly active; red=incorrectly active; green=incorrectly inactive. This is a recognition test trace, so we compare it to the trace (engram) that was created on the original presentation of the sequence; that is why we can say that individual unit activations are correct or incorrect. For V2, you will also often see units half one color and half another: V2 units stay on for two frames, so their status can change between the two frames of their activation.]
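The temporal chunking described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that a V1 mac and the V2 mac above it are active on every frame, which is not true of the actual trace, where macs activate only on criterion-meeting frames. The function names are hypothetical.

```python
def active_code_ids(num_frames, persistence):
    """Index of the code active on each frame when a new code is
    selected every `persistence` frames (idealized: always active)."""
    return [t // persistence for t in range(num_frames)]


def chunks(num_frames, v1_persistence=1, v2_persistence=2):
    """Map each V2 code index to the V1 code indices co-active with it."""
    v1 = active_code_ids(num_frames, v1_persistence)
    v2 = active_code_ids(num_frames, v2_persistence)
    linked = {}
    for v1_code, v2_code in zip(v1, v2):
        linked.setdefault(v2_code, []).append(v1_code)
    return linked


six_frame_chunks = chunks(6)
print(six_frame_chunks)  # {0: [0, 1], 1: [2, 3], 2: [4, 5]}
```

With a V1 persistence of one frame and a V2 persistence of two frames, each V2 code is co-active with two successive V1 codes, so under this idealization a 14-frame sequence would be covered by 7 V2 codes.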

Figure 2: 3D view of engram of Figure 1, showing some of the bottom-up (U, blue) and horizontal (H, green) signals whose propagation determines the unfolding engram. There are also top-down signals (D) propagating (magenta in videos on this site) but not shown here.

Figure 3 focuses on the RF of a single V1 mac (near the center of the depicted patch of V1 macs). It is from the same 14-frame "bending" sequence as the other figures. It shows the mac activating on several of the frames, specifically those in which the number of active pixels in the mac's RF falls within a tight range, about 12-14. This range is chosen to be approximately the diameter of the RF in pixels. The idea is that given the preprocessing applied to the inputs (edge filtering, binarization, skeletonization), contour segments consisting of approximately that many active pixels will occur within the RF with appreciable frequency, while contour segments with substantially fewer or substantially more active pixels will be filtered out by the preprocessing. The larger idea here is that it is sufficient to represent only a relatively small set of such contour segments, i.e., of such size (indirectly, complexity), for that RF. In other words, a basis of such features is sufficient for representing all future inputs to the RF.

Note that the particular contours that cause the mac to activate are idiosyncratic to the particular overall input, i.e., the bending figure. If the depicted sequence occurs early enough in the life of this model, i.e., while newly experienced inputs to this mac can still be stored, before the "critical period" closes for this mac, then these particular contour segments, i.e., features, will become part of the set of basis features (lexicon) permanently stored in this mac. And the specific SDR codes shown activating in the mac will be the codes (memory traces) of those particular features. It may seem arbitrary to admit as basis elements whichever contour segments happen to occur in this mac's RF before its critical period closes. After all, a basis usually consists of more regular-looking features, e.g., a set of 12 straight contours at 30-degree spacing. But in fact it doesn't matter how regular-looking the basis features are, as long as the expected fidelity with which inputs in the RF are represented, across all future experience, is sufficient.
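The activation criterion and the critical-period idea above can be sketched as a toy Python class. Everything here is an illustrative assumption: the class name, the fixed `capacity` used as a stand-in for critical-period closure, and the exact-match test for recognizing a stored feature. None of these details are taken from the actual model.

```python
class MacLexicon:
    """Toy basis (lexicon) for one mac's RF; all parameters are illustrative."""

    def __init__(self, lo=12, hi=14, capacity=3):
        self.lo, self.hi = lo, hi  # activation range from the text (about 12-14)
        self.capacity = capacity   # hypothetical stand-in for critical-period closure
        self.features = []         # stored contour segments, as frozensets of pixels

    def meets_criterion(self, active_pixels):
        """True if the count of active pixels in the RF is within range."""
        return self.lo <= len(active_pixels) <= self.hi

    def critical_period_open(self):
        return len(self.features) < self.capacity

    def present(self, active_pixels):
        """Present the set of active RF pixels for one frame."""
        pattern = frozenset(active_pixels)
        if not self.meets_criterion(pattern):
            return "inactive"      # mac does not activate on this frame
        if pattern in self.features:
            return "recognized"    # exact match only; an assumption of this sketch
        if self.critical_period_open():
            self.features.append(pattern)
            return "stored"        # becomes part of the basis
        return "not stored"        # critical period has closed


mac = MacLexicon(capacity=2)
short = [(i, 0) for i in range(5)]   # 5 pixels: below the range
diag = [(i, i) for i in range(13)]   # 13 pixels
horiz = [(i, 0) for i in range(12)]  # 12 pixels
vert = [(0, i) for i in range(14)]   # 14 pixels
results = [mac.present(p) for p in (short, diag, horiz, diag, vert)]
print(results)
# ['inactive', 'stored', 'stored', 'recognized', 'not stored']
```

The point of the sketch is that whichever criterion-meeting segments arrive first become the basis, regardless of how regular they look; later segments are only compared against that stored set.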

Note that in those cases where the input pattern in the RF meets criterion on consecutive frames, T and T+1, the horizontal (H, green) weights from the SDR active at T to the SDR active at T+1 will be increased (i.e., chaining). But also note that the horizontal weight matrices generally extend beyond the source mac to surrounding macs (to some parameter-defined radius). Thus, not only will an SDR active in the mac at T be associated with an active SDR in that same mac at T+1, but also with SDRs that become active at T+1 in any of the nearby macs falling within its efferent H radius.
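As a rough illustration of this chaining, the following Python sketch strengthens H weights from every unit active at T to every unit active at T+1, in the same mac and in any mac within the efferent H radius. The names are hypothetical, and two details are assumptions of the sketch rather than statements about the model: weights are treated as binary (set to 1), and the radius is measured in grid steps (Chebyshev distance).

```python
def within_h_radius(src_mac, dst_mac, radius):
    """True if dst_mac lies within `radius` grid steps of src_mac
    (Chebyshev distance; includes src_mac itself)."""
    return max(abs(src_mac[0] - dst_mac[0]),
               abs(src_mac[1] - dst_mac[1])) <= radius


def chain_h_weights(weights, codes_t, codes_t1, radius=1):
    """Strengthen H weights from codes active at T to codes active at T+1.

    codes_t, codes_t1: dict mapping mac grid coordinate -> set of active
    (cm, unit) pairs, i.e., that mac's SDR code on that frame.
    weights: dict keyed by (src_mac, src_unit, dst_mac, dst_unit).
    """
    for src_mac, src_code in codes_t.items():
        for dst_mac, dst_code in codes_t1.items():
            if not within_h_radius(src_mac, dst_mac, radius):
                continue
            for src_unit in src_code:
                for dst_unit in dst_code:
                    weights[(src_mac, src_unit, dst_mac, dst_unit)] = 1
    return weights


code_a = {(cm, 0) for cm in range(7)}  # a Q=7 code: unit 0 wins in every CM
code_b = {(cm, 1) for cm in range(7)}  # a different Q=7 code

codes_t = {(0, 0): code_a}
codes_t1 = {(0, 0): code_b, (0, 1): code_b, (3, 3): code_b}
weights = chain_h_weights({}, codes_t, codes_t1, radius=1)
print(len(weights))  # 98
```

In this example the code active in mac (0, 0) at T becomes linked to the codes active at T+1 in macs (0, 0) and (0, 1), giving 7 × 7 × 2 = 98 strengthened weights, while mac (3, 3) lies outside the radius and receives none.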

Figure 3: Here, we show the RF of a single V1 mac (to which the blue lines connect) and the sequence of activations of codes in the mac as contours of the overall input sweep through its RF. This RF is a patch of about 120 pixels, which is perhaps a bit larger than optimal, or than corresponds to real V1 macrocolumns.