The term "binding problem" is used in several ways, which muddies the waters, as often happens in a convergence discipline that is also "neat". The clearest meaning is the limited one: investigators ask how we can perceive a specific, coherent stimulus with multiple specific properties when those properties are received and processed by discrete channels. Color and shape in the visual perception of an object are a common arena for these questions. In other words, when you look at a green bottle on a black table, you don't have to decide whether the bottle or the table is green or black (or whether the green thing is the bottle-shaped surface or the flat square surface). In fact you can't not see the bottle-shaped surface as green; you have no choice in the matter, since the two properties are combined before they enter your experience.
Sometimes people also use the term "binding problem" in a more general sense: how all these stimuli are combined to form our full, coherent conscious experience of the world. While explaining that is a worthy goal, it's easier and more productive at this point to ask questions about the first, limited sense of the term. Read about or click through to one such recent productive investigation here.
Another binding problem that I've not seen investigated is the tactile binding problem. Areas of skin on our torsos are innervated by discrete nerves emerging from specific positions between our thoracic vertebrae, yet to learn this, humans needed to study anatomy rather than introspect on our experience. When something touches your chest and moves down to your navel, it's traversing 10 separate nerves, but as with visual binding, there's no sense of "transfer" between channels; it's completely continuous. Why? Is the answer the same kind of answer as for visual property-binding pairs? Maybe that's another potential angle of investigation, and maybe an easier one.
Note that some forms of pattern recognition are optional and non-automatic (unlike the "mandatory" pre-conscious pairs) and are highly semantic. For example, I go for a walk in the canyon behind my house, and if I'm paying attention, I see tracks in the dry dirt; if I'm paying more attention, I can identify them as coyote tracks rather than domestic dog tracks; and if I'm really concentrating, I can determine that there were two coyotes, that they were there about dawn, that they were chasing a rabbit, and that they ran down into the creekbed after it. Unlike sensory binding, this process would be easy to derail. That is, imagine that I like bunny rabbits and find it unpleasant to think about their being devoured. If I see all the tracks and begin to suspect that Peter Cottontail met an untimely end on this very trail, I could choose to distract myself from these highly voluntary semantic pattern-recognition efforts and never consciously realize what happened. But if I come upon the coyotes at the start of their meal, then no matter how much I might want to avoid it, I have no free will in the matter; my brain will bind "red" and "bunny rabbit shape" and I will be conscious of it.
Of course, my semantic reasoning about these marks on the ground could have been all wrong, and in fact an error there is much, much more likely than an error when I preconsciously integrate "bottle-shaped" and "green" or "bunny rabbit-shaped" and "red". Semantic reasoning allows us to have false beliefs, but even the integration of direct sensory input is not without glitches; optical illusions do occur. Both means of perception involve the integration of discretely sensed information; it's just that one of those integrations occurs voluntarily, consciously, and semantically.
What's more interesting (and the subject of another post) is semantic learning that's so well-conditioned that it is also automatic and pre-conscious, for example written language for literate people. You can't look at an "A" shape without thinking of the letter, even if it's a natural rock formation that humans have never touched. You can't see the word "cat" without thinking of the animal. (If you disagree, email me with your perfect Stroop test score.) There are various forms of synesthesia, many of them having to do with graphemes, and I submit that literacy is a form of conditioned grapheme-sememe synesthesia that only seems non-miraculous because we've been doing it for a few millennia. (A sememe is just a unit of semantic meaning.) Connections between various forms of synesthesia and reading/writing ability are therefore interesting, and anecdotally there is a positive association between dyslexia and synesthesia.