AI and cognitive psychology rant (getting more and more OT - tell me if I should shut up)

Alex Martelli aleax at aleax.it
Thu Oct 16 14:52:04 EDT 2003


Stephen Horne wrote:
   ...
>>>>no understanding, no semantic modeling.
>>>>no concepts, no abstractions.
>>> 
>>> Sounds a bit like intuition to me.
   ...
> What is your definition of intuition?

I can accept something like, e.g.:

> 2.  insight without conscious reasoning
> 3.  knowing without knowing how you know

but they require 'insight' or 'knowing', which are neither claimed
nor disclaimed in the above.  Suppose I observe that a paramecium,
within a liquid, moves more often than not in the direction of
increasing sugar concentration.  Given my model of paramecium
behavior, I would
verbalize that as "no understanding" etc on the paramecium's part,
but I would find it quaint to hear it claimed that such behaviour
shows the paramecium has "intuition" about the location of food --
I might accept as slightly less quaint mentions of "algorithms" or
even "heuristics", though I would take them as similes rather than
as rigorous claims that the definitions do in fact apply.

If we went more deeply about this, I would claim that these
definitions match some, but not all, of what I would consider
reasonable uses of "intuition".  There are many things I know,
without knowing HOW I do know -- did I hear it from some teacher,
did I see it on the web, did I read it in some book?  Yet I would
find it ridiculous to claim I have such knowledge "by intuition":
I have simply retained the memory of some fact and forgotten the
details of how I came upon that information.  Perhaps one could
salvage [3], a bit weakened, by saying that I do know that piece
of knowledge "came to me from the outside", even though I do not
recall whether it was specifically from a book (and which book),
a Discovery Channel program, a teacher, etc.; in any case it was
an "outside source of information" of some kind or other -- I do
know it's not something I have "deduced" or dreamed up by myself.


> The third definition will tend to follow from the second (if the
> insight didn't come from conscious reasoning, you won't know how you
> know the reasoning behind it).

This seems to ignore knowledge that comes, not from insight nor
reasoning, but from outside sources of information (sources which one 
may remember, or may have forgotten, without the forgetting justifying
the use of the word "intuition", in my opinion).


> Basically, the second definition is the core of what I intend and
> nothing you said above contradicts what I claimed. Specifically...

I do not claim the characteristics I listed:

>>>>no understanding, no semantic modeling.
>>>>no concepts, no abstractions.

_contradict_ the possibility of "intuition".  I claim they're very
far from _implying_ it.

> ...sounds like "knowing without knowing how you know".

In particular, there is no implication of "knowing" in the above.


> Intuitive understanding must be supplied by some algorithm in the
> brain even when that algorithm is applied subconsciously. I can well

A "petitio principii" (begging the question), which I may accept
(taking 'algorithm' as a simile, as it may well not rigorously meet
the definition), at least
as a working hypothesis.  Whether the locus of the mechanisms may in
some cases be elsewhere than in the brain is hardly of the essence,
anyway.

> believe that (as you say, after long practice) a learned algorithm may
> be applied entirely unconsciously, in much the same way that (after
> long practice) drivers don't have to think about how to drive.

Or, more directly, people don't (normally) have to think about how
to walk (after frequent trouble learning how to, early on).

> Besides, just because a long multiplication tends to be consciously
> worked out by humans, that doesn't mean it can't be an innate ability
> in either hardware or software.

Of course, it could conceivably be "hard-wired" as simpler operations
such as "one more than" (for small numbers) seem to be for many
animals (us included, I believe).


> Take your voice recognition example. If the method is Markov chains,
> then I don't understand it as I don't much remember what Markov chains
> are - if I was to approach the task I'd probably use a Morlet wavelet
> to get frequency domain information and feature detection to pick out
> key features in the frequency domain (and to some degree the original
> unprocessed waveform) - though I really have no idea how well that
> would work.

You can process the voice waveform into a string of discrete symbols
from a finite alphabet in just about any old way -- we used a model of
the ear, as I recall -- but I wasn't involved with that part directly,
so my memories are imprecise in the matter: I do recall it did not
make all that much of a difference to plug in completely different
front-end filters, as long as they spit out strings of discrete
symbols from a finite alphabet representing the incoming sound (and
the rest of the system was trained, e.g. by Viterbi algorithm, with
the specific front-end filter in use).
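
Just to make the general shape of such a front end concrete -- this is
only a toy sketch I'm making up here, nothing like the ear model we
actually used -- one might frame the waveform, compute some crude
per-frame feature, and map each frame to the nearest entry of a small
codebook, so that downstream code only ever sees symbols from a finite
alphabet:

# Toy front-end sketch (NOT the ear-model front end mentioned above).
import math

def frames(samples, size=160, step=80):
    """Split a list of samples into overlapping fixed-size frames."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]

def feature(frame):
    """Crude 2-D feature: log-energy and zero-crossing rate."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / len(frame)
    return (math.log(energy + 1e-9), zcr)

# A small hand-picked codebook: (feature vector, symbol) pairs.
CODEBOOK = [((-9.0, 0.05), 'a'), ((-2.0, 0.05), 'b'),
            ((-2.0, 0.40), 'c'), ((-6.0, 0.25), 'd')]

def quantize(feat):
    """Map a feature vector to the symbol of the nearest codebook entry."""
    def dist(entry):
        return sum((x - y) ** 2 for x, y in zip(entry[0], feat))
    return min(CODEBOOK, key=dist)[1]

def front_end(samples):
    """Waveform in, string of symbols from a finite alphabet out."""
    return ''.join(quantize(feature(f)) for f in frames(samples))

Any number of front ends of this general shape would do, which is
exactly the point: the rest of the system only cares about the
resulting symbol strings.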

So, we have our (processed input) sequence of 'phones' (arbitrary name
for those discrete symbols from finite alphabet), S.  We want to
determine the sequence of words W which with the highest probability
may account for the 'sequence of sounds' (represented by) S: W
such that P(W|S) is maximum.  But P(W|S)=P(S|W)P(W) / P(S) -- and
we don't care about P(S), the a priori probability of a sequence of
sounds, for the purpose of finding W that maximizes P(W|S): we only
care about P(S|W), the probability that a certain sequence of
words W would have produced those sounds S rather than others (a
probabilistic "model of pronunciation"), and P(W), the a priori
probability that somebody would have spoken that sequence of words
rather than different sequences of words (a probabilistic "model
of natural language").  HMM's are ways in which we can model the
probabilistic process of "producing output symbols based on non-
observable internal state" -- not the only way, of course, but quite
simple, powerful, and amenable to solid mathematical treatment.
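
To make the argmax-over-W idea concrete, here's a toy sketch -- with
made-up scoring functions, and a brute-force loop over a tiny candidate
list, whereas a real recognizer scores HMM paths with Viterbi or beam
search instead:

import math

def best_words(S, candidates, acoustic_logp, language_logp):
    """Return the candidate W maximizing P(S|W)*P(W) -- P(S) is a
    common factor across candidates and can simply be ignored."""
    return max(candidates,
               key=lambda W: acoustic_logp(S, W) + language_logp(W))

# Hypothetical stand-ins for the two probabilistic models:
def acoustic_logp(S, W):   # log P(S|W), the "model of pronunciation"
    return -abs(len(S) - 3 * len(W.split()))   # crude length matching

def language_logp(W):      # log P(W), the "model of natural language"
    table = {'recognize speech': math.log(0.7),
             'wreck a nice beach': math.log(0.3)}
    return table.get(W, math.log(1e-6))

print(best_words('abacdab',
                 ['recognize speech', 'wreck a nice beach'],
                 acoustic_logp, language_logp))
# prints 'recognize speech': the length match and the higher prior win

Note how P(S) never appears: it's the same for every candidate W, so
it cannot change which W wins.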

You can read a good introduction to HMM's at:
http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html


> However, I don't need any details to make the following
> observations...
> 
> The software was not 'aware' of the method being used - the Markov
> chain stuff was simply programmed in and thus 'innate'. Any 'learning'
> related to 'parameters' of the method - not the method itself. And the
> software was not 'aware' of the meaning of those parameters - it
> simply had the 'innate' ability to collect them in training.

The software was not built to be "aware" of anything, right.  We did
not care about software to build sophisticated models of what was
going on, but rather about working software giving good recognition
rates.  As much estimation of parameters as we could do in advance,
off-line, we did, so the run-time software only got those as large
tables of numbers somehow computed by other means; at the time, we
separated out the on-line software into a "learning" part that would
run only once per speaker at the start, and the non-learning (fixed
parameters) rest of the code (extending the concept to allow some
variation of parameters based on errors made and corrected during
normal operation is of course possible, but at the time we kept things
simple and didn't do anything like that).  Any such "learning" would
in any case only affect the numerical parameters; there was no
provision at all for changing any other aspect of the code.
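
In outline -- with hypothetical names, this is just a sketch of that
division of labor, not our actual code -- it looked something like:

# Parameters estimated off-line arrive as plain tables of numbers, a
# one-shot per-speaker "enrollment" step adjusts a few of them, and
# the recognition code itself never modifies any parameter.

OFFLINE_TABLES = {               # stand-in for big precomputed tables
    'symbol_scores': {'a': 0.1, 'b': 0.5, 'c': 0.3, 'd': 0.1},
    'gain': 1.0,
}

class Recognizer:
    def __init__(self, tables):
        self.params = dict(tables)   # numbers only, computed off-line

    def enroll(self, speaker_frames):
        """Run once per speaker: adjust only numerical parameters."""
        avg = sum(speaker_frames) / len(speaker_frames)
        self.params['gain'] = 1.0 / max(avg, 1e-9)

    def recognize(self, symbols):
        """Fixed-parameter decoding: nothing here writes to self.params."""
        scores = self.params['symbol_scores']
        return sum(scores.get(s, 0.0) for s in symbols) * self.params['gain']

r = Recognizer(OFFLINE_TABLES)
r.enroll([0.8, 1.2, 1.0])            # once, at the start of a session
print(r.recognize('abca'))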

> This reminds me a lot of human speech recognition - I can't tell you
> how I know what people are saying to me without going and reading up
> on it in a cognitive psychology textbook (and even then, the available
> knowledge is very limited). I am not 'aware' of the method my brain
> uses, and I am not 'aware' of the meaning of the learned 'parameters'
> that let me understand a new strong accent after a bit of experience.

People do have internal models of how people understand speech -- not
necessarily accurate ones, but they're there.  When somebody has trouble
understanding you, you may repeat your sentences louder and more slowly,
perhaps articulating each word rather than slurring them as usual: this
clearly reflects a model of auditory performance which may have certain
specific problems with noise and speed.  Whether you can become conscious
of that model by introspection is secondary, when we're comparing this
with software that HAS no routines for introspection at all -- _your_
software has some, which may happen to be not very effective in any
given particular case but in certain situations may help reach better
understanding; "from their fruits shall ye know them" is more reliable.

And the models are adjustable to a surprising degree -- again, rather
than relying on introspection, we can observe this in the effects.  When
a speaker becomes aware that a certain listener is deaf and relying on
lip-reading, after some experience, the "louder" aspect of repetition
in case of trouble fades, while ensuring that the 'listener' can see
one's lips clearly, and articulating clearly without slurring, become
important (so does hand-gesturing supporting speech).  When devices
are invented that human beings can have no possible "biologically innate"
direct adaptation to, such as writing, human beings are flexible enough
in adapting their model of "how people understand speech" to enrich
the "clarification if needed" process to include spelling words out in
some cases.  Indeed, the concept of "understanding a word" is clearly
there -- sometimes, in case of trouble communicating (particularly when
the purely auditory aspects are out of the picture), you may try using
another perhaps "simpler" synonym word in lieu of one which appears
to puzzle the listener; if the social situation warrants the time and
energy investment you might even provide the definition of a useful
term which you may expect to use repeatedly yet appears misunderstood.

In all of these cases, if the model is consciously available to you via
introspection, that may be useful because it may let you "play with it"
and make decisions about how to compensate for miscommunication more
effectively.  Of course, that will work better if the model has SOME
aspects that are "accurate".  Is your reaction to problems getting
across going to be the same whether [a] you're talking in a very noisy
room with a person you know you've talked to many times in the past,
in different situations, without trouble, [b] you're talking to a two
year old child who appears to not grasp the first principles of what
you're talking about, [c] you're talking to an octogenarian who keeps
cupping his hand to his ear and saying "eh?", [d] you're talking to
a foreigner whom you have noticed speaks broken, barely understandable
English, ... ?  If so, then it's credible that you have no model of
speech understanding by people.  If your attempts to compensate for
the communication problems differ in the various cases, some kind of
model is at work; the more aware you become of those models, the better
chance you stand of applying them effectively and managing to communicate
(as opposed to, e.g., the proverbial "ugly American" whose caricatural
reaction to foreigners having trouble understanding English would be
to repeat exactly the same sentences, but much louder:-).

In all of these cases we're modeling how OTHERS understand speech --
a model of how WE understand speech is helpful only secondarily, i.e.
by helping us "project" our model of ourselves outwards into a "model
of people".  There may be some marginal cases in which the model would
be directly useful: e.g. when trying to understand words in a badly
botched tape recording we might use some of the above strategies in
"replaying"
it, but the choices are generally rather limited (volume; what else?-);
or perhaps explicitly asking a speaker we have trouble understanding
to speak slowly / loudly / using simpler words / etc appropriately.  I
think this situation is quite general: models of ourselves with direct
usefulness appear to me to be a rarer case than useful models of _others_
which may partly be derived by extroflecting our self-models.  In part
this is because we're rarely un-lazy enough to go to deliberate efforts
to change ourselves even if our self-models should suggest such changes
might be useful -- better (easier) to rationalize why our defects aren't
really defects or cannot be remedied anyway (and such rationalizations
_diminish_ the direct usefulness of models-of-self...).


> The precise algorithms for speech recognition used by IBM/Dragons
> dictation systems and by the brain are probably different, but to me

Probably.

> fussing about that is pure anthropocentricity. Maybe one day we'll

Actually it isn't -- if you're aware of certain drastic differences
in the process of speech understanding in the two cases, this may be
directly useful to your attempts at enhancing communication that is
not working as you desire.  E.g., if a human being with whom you're
very interested in discussing Kant keeps misunderstanding each time
you mention Weltanschauung, it may be worth the trouble to EXPLAIN
to your interlocutor exactly what you mean by it and why the term is
important; but if you have trouble dictating that word to a speech
recognizer you had better realize that there is no "meaning" at all
connected to words in the recognizer -- you may or may not be able
to "teach" spelling and pronunciation of specific new words to the
machine, but "usage in context" (for machines of the kind we've been
discussing) is a lost cause and you might as well save your time.

But, you keep using "anthropocentric" and its derivatives as if they
were acknowledged "defects" of thought or behavior.  They aren't.


> meet some alien race that uses a very different method of speech
> recognition to the one used in our brains.

Maybe -- the likely resulting troubles will be quite interesting.


> In principle, to me, the two systems (human brain and dictation
> program) have similar claims to being intelligent. Though the human

I disagree, because I disagree about the importance of models (even
not fully accurate ones).

> mind wins out in terms of being more successful (more reliable, more
> flexible, better integrated with other abilities).

Much (far from all) of this is about those models (of self, of others,
of the world).  Programs that try to be intelligent, IMHO, must
use some kind of semantic model -- both for its usefulness' sake,
and in order to help us understand human intelligence at all.  This,
btw, appears to me to be essentially the same stance as implied on
the AAAI site which I have previously referenced.


> But giving non-answers such as "I don't know" or "I just felt like it"
> tends not to be socially acceptable. It creates the appearance of
> evasiveness. Therefore, giving accurate 'self-aware' answers can be a
> bad idea in a social context.

Indeed, from a perfectly rational viewpoint it may be a bad idea in
general.  Why give others help in understanding you, if they're
liable to use that understanding for _their_ benefit and possibly to
your detriment?  For a wonderful treatment of this idea from a POV 
that's mostly from economics and a little from political science,
see Timur Kuran's "Private Truths, Public Lies", IMHO a masterpiece
(but then, I _do_ read economics for fun:-).

But of course you'd want _others_ to supply you with information about
_their_ motivations (to refine your model of them) -- and reciprocity
is important -- so you must SEEM to be cooperating in the matter.
(Ridley's "Origins of Virtue" is what I would suggest as background
reading for such issues).


>>> 1.  This suggests that the only intelligence is human
>>>     intelligence. A very anthropocentric viewpoint.
>>
>>Of course, by definition of "anthropocentric".  And why not?
> 
> Because 'the type of intelligence humans have' is not, to me, a valid
> limitation on 'what is intelligence?'.

But if there are many types, the one humans have is surely the most
important to us -- others being important mostly for the contrast
they can provide.  Turing's Test also operationally defines it that
way, in the end, and I'm not alone in considering Turing's paper
THE start and foundation of AI.


> Studying humanity is important. But AI is not (or at least should not
> be) a study of people - if it aims to provide practical results then
> it is a study of intelligence.

But when we can't agree whether e.g. a termite colony is collectively
"intelligent" or not, how would it be "AI" to accurately model such a
colony's behavior?  The only occurrences of "intelligence" which a
vast majority of people will accept to be worthy of the term are those
displayed by humans -- because then "model extroflecting", such an
appreciated mechanism, works fairly well; we can model the other
person's behavior by "putting ourselves in his/her place" and feel
its "intelligence" or otherwise indirectly that way.  For non-humans
it only "works" (so to speak) by anthropomorphisation, and as the
well-known saying goes, "you shouldn't anthropomorphise computers: they
don't like it one bit when you do".

A human -- or anything that can reliably pass as a human -- can surely 
be said to exhibit intelligence in certain conditions; for anything
else, you'll get an unbounded amount of controversy.  "Artificial life",
where non-necessarily-intelligent behavior of various lifeforms is
modeled and simulated, is a separate subject from AI.  I'm not dissing
the ability to abstract characteristics _from human "intelligent"
behavior_ to reach a useful operating definition of intelligence that
is not limited by humanity: I and the AAAI appear to agree that the
ability to build, adapt, evolve and generally modify _semantic models_
is a reasonable discriminant to use.

If what you want is to understand intelligence, that's one thing.  But
if what you want is a program that takes dictation, or one that plays
good bridge, then an AI approach -- a semantic model etc -- is not
necessarily going to be the most productive in the short run (and
"in the long run we're all dead" anyway:-).  Calling programs that use
completely different approaches "AI" is as sterile as similarly naming,
e.g., Microsoft Word because it can do spell-checking for you: you can
then say that ANY program is "AI" and draw the curtains, because the
term has then become totally useless.  That's clearly not what the AAAI
may want, and I tend to agree with them on this point.


>>connection: that "proper study of man" may well be the key reason
>>that made "runaway" brain development adaptive in our far forebears --
> 
> Absolutely - this is IMO almost certainly both why we have a
> specialised social intelligence and why we have an excess (in terms of
> our ancestors' apparent requirements) of general intelligence.
> Referring back to the Baldwin effect, an ability to learn social stuff
> is an essential precursor to it becoming innate.
> 
> But note we don't need accurate self-awareness to handle this. It
> could even be counter productive. What we need is the ability to give
> convincing excuses.

What we most need is a model of _others_ that gives better results
in social interactions than a lack of such a model would.  If natural
selection has not wiped out Asperger's syndrome (assuming it has some
genetic component, which seems to be an accepted theory these days),
there must be some compensating adaptive advantage to the disadvantages
it may bring (again, I'm sure you're aware of the theories about that).
Much as for, e.g., sickle-cell anemia (better malaria resistance), say.

I suspect, as previously detailed, that the main role of self-awareness
is as a proxy for a model of _other_ people.  There would then appear
to be a correlation between the accuracy of the model as regards your
own motivations, and the useful insights it might provide on the various
motivations of others (adaptively speaking, such insights may be useful
by enhancing your ability to cooperate with them, convince them, resist
their selfish attempts at convincing you, etc, etc -- or directly, e.g.
by providing you with a good chance of seducing usefully fertile members
of the opposite sex; in the end, "adaptive" means "leading to enhanced
rates of reproduction and/or survival of offspring").


>>But the point remains that we don't have "innate" mental models
>>of e.g. the way the mind of a dolphin may work, nor any way to
>>build such models by effectively extroflecting a mental model of
>>ourselves as we may do for other humans.
> 
> Absolutely true. Though it seems to me that people are far too good at
> empathising with their pets for a claim that human innate mental
> models are completely distinct from other animals. I figure there is a

Lots of anthropomorphisation and not-necessarily-accurate projection
is obviously going on.  Historically, we bonded as pets mostly with
animals for which such behavior on our part led to reasonably useful
results -- dogs first and foremost (all the rest came far more recently,
after we had developed agriculture, but dogs have been our partners
for a far longer time) -- for obvious reasons of selection.


> The only problem is that if you apply social psychology principles to
> understand people, you may predict their behaviour quite well but you
> absolutely cannot explain your understanding that way - unless, of
> course, you like being lynched :-(

Develop a reputation as a quirky, idiosyncratic poet, and you'll be
surprised at how much you CAN get away with -- "present company
excepted" being the main proviso to generally keep in mind there;-).


> In contexts where this has worked for me, I would say the final
> intuition goes beyond what the original rules are capable of. ie it
> isn't just a matter of the steps being held in procedural memory.
> Presumably heuristics are learned through experience which are much
> better than the ones verbally stated in the original rules.

Yes, this experience does show one limit of _exclusively_ conscious /
verbalized models, of course.  I particularly like the way this is
presented in Cockburn's "Agile Software Development" (Addison-Wesley
2002), by the way.  Of course, SW development IS a social activity,
but a rather special one.


>>And I sure DO know I don't mentally deal a few tens of thousands
>>of possibilities in a Monte Carlo sample to apply the method I had
>>my computer use in the research leading to said published results...;-)
> 
> How sure are you of that? After all, the brain is a massively parallel
> machine.

If I did, I would play a far better game of bridge than in fact I
do.  Therefore, I don't -- QED;-).
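
Just to make concrete what "a few tens of thousands of possibilities
in a Monte Carlo sample" looks like in a bridge setting -- a minimal
sketch of the flavor only, not the method behind the published
results -- here's how one might estimate how often five outstanding
trumps split 3-2 by dealing the unseen cards at random many times:

import random

def trumps_split_3_2(n_deals=30000, seed=42):
    """Estimate P(five outstanding trumps split 3-2) by simulation."""
    rng = random.Random(seed)
    unseen = ['T'] * 5 + ['x'] * 21   # 5 trumps among 26 unseen cards
    hits = 0
    for _ in range(n_deals):
        rng.shuffle(unseen)
        west = unseen[:13].count('T') # trumps dealt to one opponent
        if west in (2, 3):            # the other then holds 3 or 2
            hits += 1
    return hits / n_deals

print(trumps_split_3_2())   # roughly 0.68 (theory: about 67.8%)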

> My guess is that even then, there would be more dependence on
> sophisticated heuristics than on brute force searching - but I suspect
> that there is much more brute force searching going on in people's
> minds than they are consciously aware of.

I tend to disagree, because it's easy to show that the biases and
widespread errors with which you can easily catch people are ones
that would not occur with brute force searching but would with
heuristics.  As you're more familiar with the literature in the field
than I am, I may just suggest the names of a few researchers
who have accumulated plenty of empirical evidence in this field:
Tversky, Gigerenzer, Krueger, Kahneman... I'm only peripherally
familiar with their work, but on the whole it seems quite indicative.

It IS interesting how often an effective way to understand how
something works is to examine cases where it stops working or
misfires -- "how it BREAKS" can teach us more about "how it WORKS"
than studying it under normal operating conditions would.  Much
like our unit tests should particularly ensure they test all the
boundary conditions of operation...;-).
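
In that spirit, a tiny Python example of what "testing the boundaries"
means, for a hypothetical clamp() function:

import unittest

def clamp(x, lo, hi):
    """Return x limited to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))

class ClampBoundaryTests(unittest.TestCase):
    def test_at_lower_bound(self):
        self.assertEqual(clamp(0, 0, 10), 0)
    def test_just_below_lower_bound(self):
        self.assertEqual(clamp(-1, 0, 10), 0)
    def test_at_upper_bound(self):
        self.assertEqual(clamp(10, 0, 10), 10)
    def test_just_above_upper_bound(self):
        self.assertEqual(clamp(11, 0, 10), 10)
    def test_degenerate_interval(self):
        self.assertEqual(clamp(5, 7, 7), 7)

if __name__ == '__main__':
    unittest.main()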


Alex




