<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Knowledge Nuggets</title>
	<atom:link href="http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://scone1.scone.cs.cmu.edu/nuggets</link>
	<description>Scott Fahlman's Notes on AI and Knowledge Representation</description>
	<lastBuildDate>Tue, 25 Dec 2012 23:27:33 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Tutorial Information on Scone</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=188</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=188#comments</comments>
		<pubDate>Tue, 25 Dec 2012 23:27:33 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[Scone]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=188</guid>
		<description><![CDATA[Plans for Posting Scone Tutorials As I wrote in the “About This Blog” description for Knowledge Nuggets, I started this blog with two goals in mind:  First, I wanted to present a series of short “nuggets” of tutorial information about the Scone knowledge-base system and about knowledge representation and reasoning (KRR) issues related to Scone.  [...]]]></description>
				<content:encoded><![CDATA[<h3>Plans for Posting Scone Tutorials</h3>
<p>As I wrote in the “About This Blog” description for Knowledge Nuggets, I started this blog with two goals in mind:  First, I wanted to present a series of short “nuggets” of tutorial information about the Scone knowledge-base system and about knowledge representation and reasoning (KRR) issues related to Scone.  This collection of nuggets would serve as a place-holder for the planned tutorial book on Scone and its uses, and eventually these fragments would be woven into the book itself.  Second, I wanted a venue where I could present occasional informal essays about AI and KRR in general.</p>
<p>I have partially met the second goal, though the pace of producing these essays has been slower than I had anticipated.  However, the goal of producing and distributing tutorial material on Scone has not yet been addressed at all.  I now want to fix that.  I’ve got a lot of little pieces of tutorial information in my head and in Emails I have sent to my students and collaborators, and I just need to start putting this information out where others can benefit from it.</p>
<p>Scone has been working for several years, and has been used in a number of research projects, both within CMU and with a few outside collaborators.  Of course, we have a long list of things that we want to improve and new capabilities that we want to add to Scone, as time and resources allow.  But the system is useful as it is.</p>
<p>Though it is legally open-source (under the industry-friendly Apache 2 license), I have hesitated to release Scone on the Internet because we don’t have the resources to support a much larger user community, including people with little experience in KRR, AI, Common Lisp programming, or even in computing.  Several attempts to obtain the funding and resources to properly support Scone as a community resource have come up empty, but we keep trying.</p>
<p>However, I’ve come around to the view that it’s time – probably long past time – to put out a “no support” release of Scone, and just see what happens.  Maybe some of the more knowledgeable people out in net-land will become involved, and will help us lift Scone to the next level.  We hope to do this release in the first half of 2013, though the timing is subject to some external constraints beyond my control.  And, if all goes according to plan, you will be seeing a stream of tutorial posts on this blog.  Look for the “Scone” category tag for these Scone-specific posts, and the other tags for more general musings.</p>
<p>Scone is a living system, still evolving fast in some areas.  So some information that appears in these tutorial posts may be superseded by later developments.  In general, I think it will be best to leave the body of the original posts alone, and to indicate in the intro and/or the comments area that something in the post is no longer valid, with a pointer to the update.</p>
<h3>Existing Scone Documents</h3>
<p>Two important topic areas I want to cover early are (1) Scone’s unusual marker-passing algorithms for fast common-sense reasoning, and (2) the general ideas behind Scone’s multiple-context mechanism and its uses – probably Scone’s most novel feature compared to other KRR systems.  Both of these topics are fairly well covered by conference papers on my website, though like most conference papers they suffer a bit from the editing required to fit into tight page limits.  I may someday address these topics in “nugget” form, but for now I’m just going to point to these papers and get on with documenting some of Scone’s other features and ideas.  So if you are seriously interested in Scone, I suggest you go read these papers first:</p>
<ul>
<li>Scott E. Fahlman: &#8220;Marker-Passing Inference in the Scone Knowledge-Base System&#8221;, First International Conference on Knowledge Science, Engineering and Management (KSEM&#8217;06), Guilin, China, August 2006. Proceedings published by and copyright by Springer-Verlag (Lecture Notes in AI).  <a href="http://www.cs.cmu.edu/~sef/scone/publications/MarkerPaper.pdf">PDF format</a></li>
<li>Scott E. Fahlman, “Using Scone’s multiple-context mechanism to emulate human-like reasoning”, <i>Proceedings of the AAAI Fall Symposium on Advances in Cognitive Systems</i>, 2011.  <a href="http://www.cs.cmu.edu/~sef/scone/publications/ACS-2011.pdf">PDF format</a></li>
</ul>
<p>For those interested in where the ideas came from, you might also want to look at my old Ph.D. thesis from 1977, which was published in book form and is still available online, in scanned form, from MIT’s archives.  Of course, there has been a good deal of progress in the 35 years since that was written, but it is remarkable to see how many of these ideas live on, in recognizable form, in present-day Scone.</p>
<ul>
<li>Scott E. Fahlman: <a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;tid=9750"><i>NETL: A System for Representing and Using Real-World Knowledge</i></a>, MIT Press, 1979.<br />
Online Thesis/Tech Report<br />
<a href="http://dspace.mit.edu/handle/1721.1/6888">http://dspace.mit.edu/handle/1721.1/6888</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=188</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Online Journal: Advances in Cognitive Systems</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=165</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=165#comments</comments>
		<pubDate>Tue, 07 Aug 2012 13:45:01 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=165</guid>
		<description><![CDATA[Those who have followed my comments in this blog will know that I’ve been advocating that AI researchers – at least a few of us – should renew our focus on flexible, integrated, human-like AI.  This was the original focus of the AI field, and is still a very exciting open problem, but research with [...]]]></description>
				<content:encoded><![CDATA[<p>Those who have followed my comments in this blog will know that I’ve been advocating that AI researchers – at least a few of us – should renew our focus on flexible, integrated, human-like AI.  This was the original focus of the AI field, and is still a very exciting open problem, but research with this focus (and funding for such research) has largely been pushed aside in the rush to exploit powerful but narrow approaches – particularly various forms of statistical learning driven by “big data”.</p>
<p>There have always been a few researchers who have kept the flame alive for the original AI goals and some of the knowledge-based approaches – approaches that are much more viable now that machines are orders of magnitude faster and memories are orders of magnitude larger than they were back in the 1980s.  There are also a lot of new ideas and new resources to mix with the older ones.</p>
<p>Over the last few years, I have seen some signs that perhaps the pendulum is swinging back in our direction.  Certainly the statistical learning methods have earned a permanent role in the AI field, but it appears to me that a growing number of people have begun to realize that this isn’t – and can’t be – the whole story.  In the past few years, this community (the “Rebel Underground” of AI?) has been gathering at the “Advances in Cognitive Systems” or “ACS” symposium – one track in the AAAI Fall Symposium Series.  Pat Langley has been particularly active in organizing the ACS movement.</p>
<p>I am pleased to report that this effort has resulted in a new, free online journal, named (not surprisingly) “Advances in Cognitive Systems”.  Pat is the editor.  Paul Bello, Ken Forbus, John Laird, and Patrick Winston are associate editors, and I am on the editorial board, along with many of the leaders of the rejuvenated Cognitive Systems movement.</p>
<p>The inaugural issue has now been released: <a href="http://cogsys.org/journal/volume-1">http://cogsys.org/journal/volume-1</a>.  I urge you to take a look, and especially to read Pat’s essay, <a href="https://docs.google.com/viewer?url=http%3A%2F%2Fcogsys.org%2Fpdf%2Fpaper-1-2.pdf">“The Cognitive Systems Paradigm”</a>, which gives his view of what this is all about.</p>
<p>My own essay from this blog, “Human vs. Super-Human AI”, has been revised, updated, and slightly expanded, and is included as an invited essay in this inaugural issue.  The new (and somewhat more controversial) title is <a href="https://docs.google.com/viewer?url=http%3A%2F%2Fcogsys.org%2Fpdf%2Fpaper-1-3.pdf">“Beyond Idiot-Savant AI”</a>.  The full citation is</p>
<blockquote><p>Fahlman, Scott E. (2012): “Beyond Idiot-Savant AI” in Advances in Cognitive Systems 1, pages 15-22.</p></blockquote>
<p>An <a href="http://www.cogsys.org/conference/2012/">ACS conference</a> is also being planned for Dec 7-9, 2012, in Mountain View, California.</p>
]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=165</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Some Thoughts on the &#8220;Man vs. Computer&#8221; Match on Jeopardy</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=152</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=152#comments</comments>
		<pubDate>Wed, 23 Feb 2011 04:42:27 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=152</guid>
		<description><![CDATA[A friend, Carl Kurlander, runs a Pittsburgh-oriented blog on the Pittsburgh Post-Gazette web site.  He asked if I would write up a few quick thoughts on the recent &#8220;Man vs. Computer&#8221; match on Jeopardy, so I did that.  The article was written for an intelligent but non-techy audience, and is a bit more superficial than the usual [...]]]></description>
				<content:encoded><![CDATA[<p><span style="color: #ff0000;">A friend, Carl Kurlander, runs a <a href="http://communityvoices.sites.post-gazette.com/index.php/arts-entertainment-living/six-degrees-of-pittsburgh">Pittsburgh-oriented blog</a> on the Pittsburgh Post-Gazette web site.  He asked if I would write up a few quick thoughts on the recent &#8220;Man vs. Computer&#8221; match on Jeopardy, so I did that.  The article was written for an intelligent but non-techy audience, and is a bit more superficial than the usual fare on Knowledge Nuggets.  But I thought I&#8217;d put it here anyway just in case some readers of this blog may be interested.</span></p>
<p>I watched the recent Jeopardy challenge match – IBM&#8217;s Watson system vs. Ken Jennings and Brad Rutter, Jeopardy&#8217;s all-time human champions – with great interest for several reasons: first, I&#8217;m a long-time fan of Jeopardy, though I&#8217;ve never been a contestant; second, I&#8217;ve been doing research on artificial intelligence, specializing in knowledge representation and common-sense reasoning, for 40 years; third, several of my colleagues at Carnegie Mellon, both faculty and students, worked with the IBM Watson team, contributing both ideas and software.  CMU Professor Eric Nyberg was a key member of the IBM team.</p>
<p>Since I was not a part of this project, I do not know all the details of how Watson is organized internally, but I have been able to pick up a fair amount of information from the team&#8217;s public statements.  So here are a few of my personal thoughts (not reflecting the official views of Carnegie Mellon or anyone else) on Watson&#8217;s victory:</p>
<p><strong>1.</strong> <strong>The first thing to say is that this victory is a very exciting achievement for IBM and for the science of AI<em>.</em> </strong> This four-year effort has been a real <em>tour de force</em>, pulling together a lot of existing (but scattered) ideas – plus a few new ideas – into an integrated system that can hold its own against the best humans on earth on a true Grand Challenge problem.  Few of us, even in the AI field, would have predicted this level of success in so short a time.</p>
<p><strong>2.</strong> <strong>There has been a lot of griping on the Internet about the advantage that Watson had in ringing in.</strong> I believe that these critics have a good point.  I suspect that for something like 70-80% of the questions, both Watson and at least one of the humans thought they had the answer by the time Alex finished reading the question and the buzzers became &#8220;live&#8221;.  Watson has a huge advantage in these situations, and indeed it won most of these toss-ups, especially on the crucial Double Jeopardy round of the first game.  If you take away this buzz-in advantage, for example by letting all three contestants answer if they buzz in as soon as the buzzer becomes live (they would have to be in isolation booths), then I think that the game would have been a virtual tie, or at least much closer than it seemed on TV.  But despite this opinion, I think that even <em>tying</em> the two best humans on the planet is a tremendous victory for the Watson team.</p>
<p style="padding-left: 30px;">A bit more detail on this: The way the button works (as I understand it), a production assistant offstage makes a judgment call about when Alex has finished reading the clue.  (There&#8217;s some slop in that process).  The assistant then pushes a button that arms the buzzers.  This also turns on a little light on each podium, and (in this contest) simultaneously sends an electrical &#8220;OK to buzz&#8221; signal to Watson.  If Watson is ready to answer, it immediately triggers a solenoid that pushes its button.</p>
<p style="padding-left: 30px;">No human is going to beat Watson at that. Humans take about 0.2 to 0.3 seconds to push a button in response to a light.  The only way a human is going to beat Watson is if the human listens to Alex&#8217;s voice and <em>anticipates</em> when the button is going to go live, rather than waiting until he sees the light.  The human contestants seem to have done this a few times during the contest, but it&#8217;s a risky strategy: if you jump the gun, you are locked out for something like half a second, so your second try will definitely come in too late.  Note that I&#8217;m not talking here about relative <em>thinking</em> speed, though that&#8217;s an interesting topic in its own right – I’m just talking about button-pushing reflexes.</p>
<p><strong>3.</strong> <strong>Whether we call it a win or a tie on the task of Jeopardy, the key thing to remember about Watson&#8217;s performance is that it makes no sense to argue about whether Watson is more or less intelligent than a human.</strong> The IQ-testing industry notwithstanding, intelligence is not something that we can measure with a single number.  Intelligence, whether in a human or a machine, is a bundle of many different capabilities.  Each of us (and Watson too) has these capabilities in different amounts.  Some of us are good at arithmetic, but even the earliest computers were far better at this limited aspect of intelligence than any human; some humans know a tremendous amount of assorted information, but perhaps lack the ability to apply that knowledge in solving real-world problems; some of us can solve complex problems, but can barely get through a day without someone to remind us of simple things; some have talent in written or spoken communication, in music, in interpersonal relations, and so on – all aspects of intelligence that can be measured separately.</p>
<p>Watson is very, very good at retrieving raw facts.  In the contest, Watson was not allowed to access the Internet, but no matter – it was pre-loaded with a large fraction of the reference materials available on the Web, including Wikipedia, dictionaries, tables of states and countries, Olympic winners, Oscar winners, characters in old TV shows, and so on.  Not even Ken and Brad have this much factual information in their brains.  But Watson is much worse than any normally functioning human at understanding the precise <em>meaning</em> of a complex sentence or reasoning about the consequences of what it knows.  Humans gain knowledge (in part) by reading about it, digesting what the text says, and putting the knowledge away in a pre-digested abstract form that can be reasoned about.</p>
<p>Watson doesn&#8217;t have much of that pre-digested information.  When a question arrives, it does a sort of Google search over its stored documents, looking for chunks of text that share many of the same words (and perhaps synonyms or closely associated words) with the text of the question.  Then it does a lot of processing to see which candidate answers score the best, based on many different tests – that&#8217;s where the thousands of processors, working in parallel, come in &#8212; and finally it picks a winner, along with a level of confidence.  It does all this in three seconds or less.  This filtering technology has come a very long way in a short time – it is the thing that separates a question-answering system from something like Google that returns pages that <em>might</em> contain the answer you want, but that leaves the filtering to the human.  But this part of Watson is still very weak compared to human capabilities.  For example, Watson is more likely than a human to make category errors, such as giving you the name of a book when the question is clearly asking for a person.  And Watson can&#8217;t even begin to approach a human&#8217;s ability to handle metaphors, wordplay (though it does better than I would have guessed), or simple problem-solving.</p>
<p><strong>So Watson has a different mix of strengths and weaknesses than you will find in any human.</strong> It would be very good at some tasks &#8212; answering simple questions about a product&#8217;s features online &#8212; and very bad at others, such as answering a frustrated customer&#8217;s questions about how to assemble an Ikea bookcase.</p>
<p>The interesting thing is that Watson&#8217;s great strengths and great weaknesses more or less balance out on the Jeopardy task.  Fans of the show know that many questions have a factual part and a bit of clue that requires human cleverness – but the clever bit is only really important if you&#8217;re not sure about the answer.  For example, here is a question from a previous season:  &#8220;Logically, it was the radioactive transuranium metal discovered right after neptunium&#8221;.  Watson might well answer this by digging through its memory and finding the actual dates at which various elements were discovered – that is, it just looks up the answer; a human – even one who likes chemistry – is unlikely to have memorized those dates, but may well get the answer, &#8220;plutonium&#8221;, by analogy to the discovery of the planets (and by knowing that there is a radioactive element named &#8220;plutonium&#8221;).  So you can win at Jeopardy by having a super-human collection of facts (plus some reasoning), or by having excellent reasoning and a smaller collection of facts.  It would be possible to write Jeopardy questions to favor one side or the other, but in the actual contest, the Jeopardy staff thought they were writing questions for a regular show with human contestants.</p>
<p><strong>4.</strong> <strong>So, is Watson going to conquer the world, or at least take away all our jobs?</strong> I think that the world is safe for a while.  Watson, after all, doesn&#8217;t actually do anything but play one specific game.  And while Watson has the ability to access a lot of knowledge and do some reasoning about it, the real-world planning (and coup-plotting) abilities of today&#8217;s AI systems are a long way from scary-good.</p>
<p>As for jobs, it&#8217;s complicated.  Watson can answer factual questions very fast, but it would not beat a human player who has access to the internet and enough time to make a few queries.  This would give the human access to a store of factual information comparable to Watson&#8217;s.  Combined with the human&#8217;s much greater powers of reasoning and language understanding, I can state with confidence that the human would usually produce better answers – but it would take longer and probably cost more.</p>
<p>In its present state, Watson makes far too many mistakes – &#8220;howlers&#8221; that no human would ever make &#8212; to entrust it with decision-making powers in law or medicine or any field where it&#8217;s important to get things right.  <strong>But as part of a team, a near-future descendant of Watson might be an extremely valuable partner.</strong> As the IBM people have pointed out, Watson could take a list of symptoms and suggest possible diseases and treatments, including some too rare or too new to be known to the average human physician.  But Watson (with today&#8217;s technology) should not be in charge of the treatment.   We want a human to have the ultimate control so that Watson&#8217;s occasional blunders don&#8217;t kill people.  In the near future, I think that Watson and its progeny will be used mostly as <em>intelligence amplifiers</em>, working with humans.  Watson&#8217;s very impressive English-language capability is the key to making these human-machine partnerships attractive.  So some jobs &#8212; digging through libraries and heaps of documents &#8212; will probably be taken over by the machines, but others &#8212; the ones involving judgment and complex reasoning &#8212; are safe for some time to come.</p>
<p><strong>5.</strong> One last comment:  You may have noticed that <strong>Watson&#8217;s computer-generated voice sounded very human</strong>, and not much like the tinny &#8220;computer voices&#8221; that you hear in old science fiction movies.  In fact, there are reports that the IBM people wanted a Watson voice that didn&#8217;t sound <em>too</em> human, for fear that it would come across as &#8220;creepy&#8221;, so they deliberately made it a bit flat.   A lot of the basic research in generating realistic-sounding computer voices was done by my colleagues at Carnegie Mellon&#8217;s Language Technologies Institute.</p>
]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=152</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scientific Creativity: How to Get More</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=123</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=123#comments</comments>
		<pubDate>Sat, 12 Feb 2011 07:11:35 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=123</guid>
		<description><![CDATA[In an earlier article, I sketched a mini-theory of human scientific creativity – a theory that, I believe, is in principle implementable in an AI system.  I also mentioned that, if this theory is (more or less) correct, it may suggest some techniques that we humans can employ to increase our own scientific creativity.  In [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://scone1.scone.cs.cmu.edu/nuggets/wp-content/uploads/2011/02/Hungry-Gull2.jpg"><img class="aligncenter size-full wp-image-140" title="Hungry Gull" src="http://scone1.scone.cs.cmu.edu/nuggets/wp-content/uploads/2011/02/Hungry-Gull2.jpg" alt="" width="550" height="518" /></a></p>
<p>In an <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101">earlier article</a>, I sketched a mini-theory of human scientific creativity – a theory that, I believe, is in principle implementable in an AI system.  I also mentioned that, if this theory is (more or less) correct, it may suggest some techniques that we humans can employ to increase our own scientific creativity.  In this article, I will try to spell out some of these techniques.  I don’t want Knowledge Nuggets to become a “self help” blog – I just think it’s interesting to see where some of these theories lead us.</p>
<p>Let’s begin by reviewing the key points of the theory presented earlier:</p>
<ol>
<li>What we call “scientific creativity” is not magic.  It’s just good, effective problem solving that happens to lead to a surprising result.</li>
<li>A “flash of inspiration” – the part that seems creative or magical to us – is basically a recognition that the problem fits (or almost fits) some representation or metaphor or recipe already stored in our memory.  This approximate matching is computationally very demanding, but it uses our parallel recognition machinery – considering many possible matches at once – so it feels like a flash.</li>
<li>These flashes of inspiration hardly ever occur until you’ve done a lot of work to investigate and understand the structure of the problem you’re grappling with.  And, once the flash has occurred, it is of no value until you have done all the detailed “grind it out” work to fit your idea to the problem and verify that it works.  This part is perceived as hard mental work – the “99% perspiration” of which Edison spoke.</li>
<li>For a problem that is difficult, important, and generally recognized, many smart people will already have worked on it.  So all the obvious things have been tried, and they didn’t work (or didn’t work well enough).  To solve the problem “creatively” will require you to come up with a new approach.</li>
</ol>
<p>So, if you accept these ideas, what techniques do they suggest?</p>
<h2>Two non-solutions</h2>
<p>Let’s dispose of two ideas that usually <em>don’t</em> work:</p>
<p><strong>Do what everyone else is doing, only more so.</strong> That is, work harder, work longer, explore more alternatives, or bring more resources to bear (people, computing cycles, data, or whatever).  If you really do have access to a unique level of resources, this can sometimes work.  In fact, it might be the only way to solve certain big, ugly problems.  But even if you succeed in this way, people are unlikely to recognize your solution as “creative” – it’s just “brute force”. <sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=123#footnote_0_123" id="identifier_0_123" class="footnote-link footnote-identifier-link" title="Of course, it may require a lot of creative problem-solving to assemble an unprecedented level of resources, but that&rsquo;s often not appreciated.">1</a>]</sup></p>
<p><strong>Try a lot of things at random. </strong>Again, this may occasionally work, but the odds are very much against you.  For interesting, hard problems, the useful answers will be very sparse in the space of possibilities, and all the obvious things will have been tried already.  Probably your only hope is to use some kind of knowledge or model in choosing what new alternatives to consider, rather than wandering around without any plan or guidance.</p>
<h2>Cultivate your stock of metaphors.</h2>
<p><strong>The most creative people are generally the ones who seem to be interested in everything.</strong><sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=123#footnote_1_123" id="identifier_1_123" class="footnote-link footnote-identifier-link" title="Obviously, I&rsquo;m generalizing here from the creative people I happen to have met.&nbsp; But if you&rsquo;ve spent your entire adult life working in places like Carnegie Mellon and MIT, you will have had the opportunity to observe a large number of scientifically creative people in action, including some whose creativity is legendary.">2</a>]</sup> And it’s a special kind of interest: not just collecting trivia (though some very creative people do that as well), but trying to figure out how everything <em>works</em>.   If the tides are caused by the pull of the moon, why are there two high tides every day instead of just one?  What is the trick that allows a stomach to digest meat, when it’s made of meat?  When a 6-way symmetrical snowflake is forming in the atmosphere, how does one branch know what pattern has been chosen by the other branches?  What’s really going on with the “casting out nines” trick for checking arithmetic?  The more you ponder such questions, the larger will be your stock of metaphors and models.</p>
<p>These curious-about-everything people can sometimes seem rather odd to bystanders, but most of them don’t care – some even revel in that oddness.  An example: one day, out of the blue, Marvin Minsky speculated that there are no parasites that eat hair because, after biting through the hair shaft, the parasite would have no <em>local</em> way of knowing which of the two pieces to hold onto if it wants to remain with the host.  Whether that’s true or not – it doesn’t really matter – that thought has stuck with me, and I’ve used it as a model a few times in thinking about data structures and distributed processing.</p>
<p><strong>Collaborate. </strong>A great way to multiply your effective store of metaphors is to collaborate with someone.  Ideally, you want someone that you can communicate with easily, but whose background is different from yours: different training, different generation, different style of approaching problems, or whatever.  The history of science and technology is full of creative breakthroughs made by two people working closely together.  Larger groups provide even more diversity of ideas, but with more than two people it can be hard to maintain the very close communication that truly collaborative problem-solving requires.</p>
<p>One collaboration model that is widespread and often successful is the professor and the grad student.  This is the academic equivalent of the old master/apprentice model.  The conventional view is that the professor has deep knowledge of the field and its accepted ways of doing things, while the student provides enthusiasm, a fresh point of view, and often the creative spark.  As the saying goes, the student “doesn’t yet know what is impossible”.</p>
<p>But in my experience, it just as often works the other way.  The students have recently taken courses and are full of all the latest knowledge, but it is a rare student who has the self-confidence to venture very far out of the box.  Those students who do have that self-confidence, the skill to exercise it successfully, and the wisdom to <em>occasionally</em> listen to their elders, are the ones who end up as faculty in the top universities – or, in more applied fields, as successful entrepreneurs.</p>
<h2>Relax reality – temporarily!</h2>
<p>We hypothesized that the “flash of inspiration” is really a recognition that the problem you are grappling with matches, more or less, some metaphor, template, or recipe in your bag of tricks.  But that “more or less” can cause problems: every problem is different and you generally don’t get an exact match.  Sometimes you have to find a near-miss solution and massage it a bit to fit the problem you’re working on.  But if it’s not a very close fit, that flash of recognition may never occur.</p>
<p>An alternative approach that sometimes works is to modify the problem and the rules – perhaps even the laws of nature – and see if you can solve this modified problem.  Then you must find a way to massage that unrealistic solution back into something you can really use.</p>
<p>An example:  One day in the mid-1970s Marvin Minsky offered the following challenge to the grad students in the MIT AI Lab (I was one of them):  “Suppose you had an unlimited hardware budget.  You can have as much hardware as you want, but it has to be well-defined hardware – no magic boxes.  Your goal is to solve (or partly solve) some big problem in AI.  What would you ask for and how would you use it?”</p>
<p>At the time, I had been thinking about the core problem of recognition: you have a bunch of features and expectations, and you want to find the stored description that best matches these inputs.  So it occurred to me that we could build a little hardware recognition-box for each stored description.  As input features arrive, we broadcast them to all of these boxes at once.  Each box keeps score, asking “Is this me?  How well do I match?”, and one of them ultimately emerges as the winner.  I kept thinking about that model, and it gradually evolved into the NETL architecture, which could handle both this recognition task and simple inference in a knowledge base.  And that became my Ph.D. thesis.</p>
<p>Of course, eventually you have to get back to reality.  I was not given an unlimited hardware budget, so this model could not be implemented as it was.  But it led to a lot of knowledge-representation and recognition ideas that have been used in other systems, and that today, decades later, form the basis of my Scone implementation.  So by <em>temporarily</em> setting aside some real-world constraints, I was able to gain some deeper insights into the problems I was grappling with.</p>
<p>Scientists do something very similar when they develop a simplified model and then put in the real-world complications.  For example, Galileo and Newton developed their dynamics by postulating an ideal world without friction, and verifying their models in minimal-friction settings such as billiard tables.  (Planetary orbits are essentially friction-free so they provided another way to test the simplified theory.)  Then they and their successors put the friction back in so that they could model more complex real-world situations, such as the flight of cannon balls.</p>
<h2>Do your homework, but don’t be captured by it.</h2>
<p>It follows from points <strong>a</strong> and <strong>c</strong> above that it is very rare for someone to make creative discoveries in a field if they don’t have a reasonably solid knowledge of that field.  Lucky accidents and flashes of inspiration are all well and good, but you will be very inefficient if you don’t have the knowledge and skill to determine whether your brilliant (or lucky) idea has some chance of working.</p>
<p>Even if your idea is a great one, it is not going to change the world unless you have the skill and perseverance to work out all the details and <em>prove</em> that it works.  The world is full of people who “invented” something, but didn’t follow up, and then had to watch while someone else reaped all the glory – and sometimes riches – for what seems to be the same idea.  Well, if you don’t follow up (or collaborate with someone who will), it doesn’t count.  And if you don’t have the skills to follow up efficiently, you will waste a lot of time chasing wild geese.  So it’s important to master the “conventional wisdom” in a field – or a good part of it, at least – before you try to innovate.</p>
<p><strong>However… </strong>If a problem has been around for a while, and it is generally understood that it’s important, then there’s probably something wrong with the “conventional wisdom”.  It might be a big flaw or a seemingly tiny one, but if the conventional approach actually worked, the problem would already have been solved.</p>
<p>So it’s important to learn what everyone else “knows” and what has already been tried, but not to <em>accept</em> it all at face value.  Be alert for things that are stated dogmatically without a good reason.</p>
<h2>Don’t look where everyone else is looking.</h2>
<p>If everyone else is looking at a problem in a certain way and is applying a certain set of tools and techniques, the best chance of finding a creative solution is to try something else.  Here are some techniques for doing that successfully.</p>
<p><strong>Find a new problem. </strong>If you identify some new problem that needs to be solved, that may be creativity enough.  If nobody has looked at this problem before, it’s quite possible that simple, well-known techniques will be sufficient to solve it.  Many useful (and often lucrative) inventions have been created in this way.  A silly but illustrative example: someone says, “Gee, for some people it’s a real hassle to get up and turn the lights on and off – what if people could just clap their hands?”  The electronics of the day might or might not be up to the task.  Even if they are, it might require marshalling resources and conducting a program of research.  But it may not require much creative thinking to solve this problem, once some creative person has recognized the need.</p>
<p>Similarly, in science, a lot of discoveries have been made by people who are the first – or <em>among</em> the first – to ask a new question.  One of the best ways to do this is to look for anomalies – observations that don’t <em>quite</em> fit the current theories.  It’s easy to write off most anomalies as experimental error, or as simply being unimportant, and most busy researchers will do just that.  But until such anomalies have been explained, they may contain the seeds of important new questions.  So pay close attention to the little mysteries.</p>
<p>As Isaac Asimov once observed:  “The most exciting phrase to hear in science, the one that heralds new discoveries, is not &#8216;Eureka!&#8217; (I found it!) but &#8216;That&#8217;s funny &#8230;&#8217; ”</p>
<p><strong>Make sure the problem can be solved – and maybe that will give you some clues about how to solve it. </strong>This is an important technique in AI and in some other areas.  As I observed in an <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39">earlier article</a>, some AI researchers focus on problems that most human brains can solve, while others focus on problems that are beyond the capacity of any unaided human.  For problems of the first kind (e.g. understanding natural language, whether spoken or written), we humans serve as a “two-legged existence proof” that the problem can be solved by some sort of physical information-processing device.  So those of us who work in this area are not wasting our time by trying to solve an inherently impossible problem, though it might be impossible with today’s technology and today’s ideas.</p>
<p>The more we can learn about how this “existence proof” works, the more clues we will have about how to solve the problem.   I should say “<em>one</em> way to solve the problem”, since there may be other ways.  But if there’s only one existence proof, it’s good to understand as much as possible about that one.  From neuroscience we know that the problem of understanding natural language and speech can be solved using a very large network of millisecond-speed components, all running in parallel – that’s what the brain seems to be.  It doesn’t seem to require nanosecond-speed logic circuits, and it probably doesn’t require a lot of floating-point arithmetic, since we see no evidence of floating-point hardware in the brain.  So the “two-legged existence proof” has given us a few clues about what a solution might look like.</p>
<p><strong>Ask the question a different way. </strong>Discovering a new problem to solve is fine, but sometimes you want to solve a <em>particular</em> problem or answer a <em>particular</em> question, rather than finding a new one.  If everyone else is asking this question in a certain way, maybe that’s what is holding them back.  The way you frame a question usually carries certain hidden assumptions.</p>
<p>Gerry Sussman, my Ph.D. research advisor at MIT, had a favorite response: whenever I would ask him what he thought was the best way to solve problem X, he would ask, “What is the problem of which this is a sub-problem?”  In other words, stand back and think about the larger problem.  Maybe the question you’re asking is not really the one you need to answer.  If you attack the larger problem in some other way, it might be easier.  It’s very common for a field to become fixated on a certain way of posing some important problem, and never to consider whether that is the problem that they really want to solve.</p>
<p><strong>Think about what has changed. </strong>The world changes, and that creates new opportunities for researchers.  Sometimes an approach that was impossible a few years ago is a good approach today.  This is especially true in computer science, where Moore’s Law and a steady flow of inventions change the game every few years.  None of the apps running on your smart phone would have been possible with the technology of ten or fifteen years ago, and they certainly wouldn’t have fit into your pocket.</p>
<p>Faster machines, larger memories, and more pixels are one engine of change, but there are many others: new data sources, including all the information now available on the internet; new theoretical and analytical techniques; new software tools; better instruments, new materials…</p>
<p>Before the Wright Brothers developed their airplane, there were many false starts by others, but heavier-than-air flight was just not going to happen until there was a power source with an adequate power-to-weight ratio – the internal combustion engine finally solved that problem.  So the Wrights attacked the right problem at just the right moment &#8211; and it didn&#8217;t hurt that, as bicycle mechanics, they had the skill-set to try out their ideas.</p>
<p>In science, one very important kind of change is the development of a new way to look at what’s going on.  Time and again, some new visualization technology has led to an explosion of scientific discovery: the telescope, the microscope, X-rays, spectroscopy, high-speed strobe photography, satellite sensing of the earth, functional MRI of the brain in action&#8230;  The list goes on and on.  Each of these new visualization technologies created an exciting opportunity for the first researchers to exploit the tool in new ways.  Galileo didn’t invent the telescope, but he was one of the first to point it at the heavens.  That led to a number of revolutionary discoveries that changed our understanding of the universe.</p>
<p>Today, ever more powerful computer simulations tied to graphical displays are providing a similar opportunity.  Through simulation, we can “see” natural and synthetic phenomena that we could never visualize before.  And new analytic tools, such as plotting webs of inter-personal connections, are giving us new insight into large data sets that were just unreadable printouts in the past.  Opportunity beckons.</p>
<p><strong>Take a close look at past failures. </strong>It is often worth revisiting old failures and thinking hard about what really went wrong.  When some effort succeeds, there’s not much need for post-analysis.  It worked.  Yay!  End of story.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=123#footnote_2_123" id="identifier_2_123" class="footnote-link footnote-identifier-link" title="Of course, you might want to work on ways to make a good solution better.">3</a>]</sup>  But when an effort fails or falls short of its goals, there are many possible reasons.  It’s easy for people to draw the wrong conclusions, and easy for these wrong conclusions to harden into a consensus – our old friend “conventional wisdom”.</p>
<p>A project might have ten good ideas and one really bad one, and it fails because of that one flaw.  This doesn’t mean that <em>all</em> the ideas were bad, or that the effort was hopeless and should never be tried again, but it may be hard to assign the blame correctly.  Or perhaps all the ideas were good, but the people on the project didn’t execute it well.  Perhaps the project was managed in such a way that good ideas and a good technical effort were crippled.  Perhaps funding was cut just as success was within reach.  All of these things happen, and very often the “conventional wisdom” enshrines the wrong diagnosis.  Or perhaps the project was indeed doomed to failure at the time, but the world has changed since then, and the approach that failed earlier could (with a bit of tuning) succeed now.</p>
<p>So old failures are a very rich source of “almost-right plans”.  Instead of abandoning these plans, it’s worth trying to debug them – that is, to figure out what went wrong, fix it, and try again.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=123#footnote_3_123" id="identifier_3_123" class="footnote-link footnote-identifier-link" title="This idea of &ldquo;debugging almost-right plans&rdquo; has been championed over the years by Gerry Sussman at MIT.">4</a>]</sup>  This can be a very good way of breaking away from the crowd – sometimes the creative “new” idea is actually a recycling of a creative old idea that others have given up on.  Sometimes those involved in the earlier project know very well what went wrong, but they may not be listened to; sometimes it takes an outsider, with no emotional investment in the earlier project, to understand what really happened.</p>
<p><strong>Cross boundaries. </strong>One great way to break away from the pack in your own field is to sneak across the border into someone else’s field – for example, crossing from computer science to some area of biology.  A surprising number of discoveries are made by people who show up in a new field with a set of tools, skills, and ways of looking at things that are very different from those employed by the natives.  If the immigrant is trained in some scientific or engineering field, much of that general training will carry over, but the immigrant will have a rather different set of metaphors to draw upon.</p>
<p>But as we discussed earlier, you are unlikely to have much success until you understand the core knowledge of the field you are working in.  Some people will move into a new field that they have been interested in all along, so they will have a head start in acquiring the necessary knowledge; others succeed by just working very hard for a year or two.  One interesting shortcut is to develop a close collaboration with someone who is well established in your new field – they can serve as a guide and, for a while, as a critic.</p>
<h2>One last thought…</h2>
<p>The suggestions in this paper may help you to approach scientific and engineering problems more creatively.  But if you apply them too aggressively, you may be regarded as a crackpot – or you may find that you have <em>become</em> a crackpot.  So strive for greater creativity, but try to keep your balance.  Show some respect for those who stick to the conventional paths – they are conventional for a reason – and be sure to visit reality from time to time.</p>
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_123" class="footnote">Of course, it may require a lot of creative problem-solving to assemble an unprecedented level of resources, but that’s often not appreciated.</li><li id="footnote_1_123" class="footnote">Obviously, I’m generalizing here from the creative people I happen to have met.  But if you’ve spent your entire adult life working in places like Carnegie Mellon and MIT, you will have had the opportunity to observe a large number of scientifically creative people in action, including some whose creativity is legendary.</li><li id="footnote_2_123" class="footnote">Of course, you might want to work on ways to make a good solution better.</li><li id="footnote_3_123" class="footnote">This idea of “debugging almost-right plans” has been championed over the years by Gerry Sussman at MIT.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=123</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Bit More on Scientific Creativity</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=116</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=116#comments</comments>
		<pubDate>Wed, 08 Dec 2010 16:01:55 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=116</guid>
		<description><![CDATA[As a follow-up to my previous post:  I ran into this interesting New York Times article today.  I  think it&#8217;s pretty compatible with the view I presented in my article.  The author, Benedict Carey, talks about &#8220;flashes of inspiration&#8221; in terms of exploring loose or &#8220;out of the box&#8221; connections among ideas, rather than the [...]]]></description>
				<content:encoded><![CDATA[<p>As a follow-up to my <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101">previous post</a>:  I ran into this interesting <a href="http://www.nytimes.com/2010/12/07/science/07brain.html" target="_blank">New York Times article</a> today.  I  think it&#8217;s pretty compatible with the view I presented in my article.  The author, Benedict Carey, talks about &#8220;flashes of inspiration&#8221; in terms of exploring loose or &#8220;out of the box&#8221; connections among ideas, rather than the more structured path-following that we do in &#8220;analytic mode&#8221;.</p>
<p>He presents some evidence from neuroscience and cognitive science suggesting that careful analytic thinking and exploratory, out-of-the-box thinking are really two rather distinct modes of thought.  You will need to do some of each, switching back and forth, in order to solve hard scientific problems.</p>
<p>He also cites some evidence that being in a happy, playful, carefree mood can lead to greater success in &#8220;out of the box&#8221; mode &#8212; something that most creative scientists know instinctively, and that some of the people running funding agencies would do well to ponder.</p>
]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=116</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An AI View of Scientific Creativity</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=101</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=101#comments</comments>
		<pubDate>Sat, 09 Oct 2010 12:05:22 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://scone1.scone.cs.cmu.edu/nuggets/?p=101</guid>
		<description><![CDATA[  Can an AI system be creative?  A lot of people believe that the answer is no - obviously no.  After all, we are talking about a computer program.  It only does what its instructions tell it to do, and some human programmer wrote those instructions.  Furthermore, computer programs are deterministic: give a program the same input [...]]]></description>
				<content:encoded><![CDATA[<div>
<p style="text-align: center;"><a href="http://scone1.scone.cs.cmu.edu/nuggets/wp-content/uploads/2010/10/Gull-in-Profile2.jpg"><img class="size-medium wp-image-106 aligncenter" title="Gull in Profile" src="http://scone1.scone.cs.cmu.edu/nuggets/wp-content/uploads/2010/10/Gull-in-Profile2-300x225.jpg" alt="" width="576" height="452" /></a> </p>
<p>Can an AI system be creative?  A lot of people believe that the answer is no - obviously no.  After all, we are talking about a computer program.  It only does what its instructions tell it to do, and some human programmer wrote those instructions.  Furthermore, computer programs are deterministic: give a program the same input a thousand times, and it will give you the same answer a thousand times.  If you liked the system’s answer the first time, credit goes to the human programmer; if you didn’t, it’s not very creative to get stuck on that bad answer for all eternity.</p>
<p>That’s the popular view, but I disagree:  I believe that an AI system can exhibit what any fair observer would call creativity.  To explore the question properly, we have to take a closer look at what we mean by creativity, and then think about where it comes from.  That is the subject of this article.</p>
<h3>Why scientific creativity?</h3>
<p>In this discussion, I want to focus on scientific creativity, including also engineering, puzzle-solving, and creative planning of all kinds.  For now, I am avoiding artistic creativity because I think it’s harder to discuss that coherently &#8211; so let’s attack the easier (or better-defined) problem first.  In scientific (etc.) creativity, you generally are able to tell when you finally have a good solution to the problem you were working on: you have a theory that explains the phenomena at hand, with some predictive power for new observations; the bridge you designed stays up, and it proves to be less expensive or easier to construct than bridges built according to existing cookbook methods; the puzzle is solved; the plan you came up with accomplishes all of its goals with good efficiency.  The creative act here is generally in finding a good solution; the evaluation criteria are generally agreed upon.</p>
<p>In artistic creativity, you have this same problem of creating a solution, but it is coupled with the more complex problem of deciding whether the solution is any good.  We can usually tell whether the product is original in some way &#8211; until recently, nobody had ever dribbled paint all over a canvas, framed the result, and invited an audience to admire it &#8211; but is it art? Is it a good painting (or poem, novel, symphony, performance…)?  Is there some audience &#8211; preferably including more people than just the artist himself &#8211; who will find this work interesting or beautiful or provocative in some way?  There is a vast literature on artistic value and aesthetics, and (for now) I don’t intend to add to the clutter.</p>
<p>There is an even deeper problem when we think about computers making art that will be appreciated by humans: some excellent art is very intellectual and abstract, but a lot of what moves us is rooted in shared “visceral” human experience &#8211; things that a computer or robot would not have experienced in the same way.  For example, some of the emotional reaction we have to music is related to the human heartbeat, to other rhythms of our bodies and surroundings, and to sounds like crying and laughter and unexpected threatening noises. Our reactions to these things are (to some degree) hard-wired and shared across all human cultures.  Most good literature relates in some way to human fears, yearnings, and inter-personal relationships.  These are things that a computer might understand, as a blind person might understand colors, but they would not be a normal part of the experience of an AI system &#8211; even a long-lived robot that spends its whole “life” learning.  I would not absolutely rule out the possibility that an AI system will one day produce art that connects with a human audience with what feels, to us, like shared visceral experience, but getting there will be a much longer journey than coming up with AI systems that “creatively” solve scientific and engineering problems.</p>
<p>So, for now, we will focus only on the kind of creativity where there is some objective measure or general consensus on what is a good or bad solution &#8211; in which the cleverness is in how the solution was discovered.</p>
<h3>Randomness</h3>
<p>Among those people who will admit that an AI system might show some degree of scientific creativity, there is a common belief that some degree of randomness is the secret ingredient.  If we put a random number generator in our code and use it to alter some of the system’s behaviors, then the system can do things that surprise its creator, avoiding the trap of always coming up with the same answer to a given problem.</p>
<p>I think that’s true, but it’s only a tiny part of the answer.  For one thing, in any really difficult problem, and any AI system complex enough to grapple with it, there will be plenty of randomness; we don’t have to add it explicitly.  Every problem (and every data set) will be a little bit different from the ones that we have seen before, and this inherent variation may cause the system to try different solution paths in different orders.</p>
<p>The AI system will also be a bit different each time.  A good problem-solving system will learn as it goes, encapsulating and generalizing lessons from its experience and using these to guide it toward some strategies and away from others.  If the system tries a particular path that fails to find a solution, or if it finds a solution that ultimately doesn’t work, there will be a memory of that effort.  If the system tries again on the same problem, it will be a different system because of this memory.  So an AI system with some learning built in will exhibit one of the key elements of creativity: Keep making new mistakes instead of repeating the same old mistakes.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101#footnote_0_101" id="identifier_0_101" class="footnote-link footnote-identifier-link" title="This maxim &amp;#8212; I&amp;#8217;ve seen it worded in various ways &amp;#8212; is often attributed to Esther Dyson.">1</a>]</sup></p>
<p>Also, while some random component may be necessary, random behavior by itself generally gets you nowhere.  Doing a random walk in the space of possible low-level actions or solutions is not going to solve any really hard problem, since good solutions will be extremely sparse in this space.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101#footnote_1_101" id="identifier_1_101" class="footnote-link footnote-identifier-link" title="&hellip;Well, a random walk might work if you can afford to invest the immense time and parallelism that we see in biological evolution by natural selection.&nbsp; But even in natural selection, we see some very clever mechanisms to guide the search, for example by selecting and stabilizing various useful&nbsp;partial&nbsp;solutions and re-using them in various combinations, rather than wandering randomly in the lowest-level feature-space.&nbsp; &nbsp;This preservation and shuffling of partial solutions seems to be the primary (non-recreational) role of sexual reproduction.">2</a>]</sup>  A much larger role is played by the knowledge that we use to select strategies and to structure the space of possibilities that we want to explore, so that the random part of the search takes place in what the military would call a “target-rich environment”.  The right problem representation, a partial recipe, or a good plan for how to explore the space can often take the place of several billion years of unguided random search.</p>
<h3>The French Horn theory of creativity</h3>
<p>I think that a good metaphor for this creative process is playing a French horn.  Yes, there is a partially random excitation &#8211; flapping your lips to make the flatulent sound that we sometimes call “blowing raspberries”.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101#footnote_2_101" id="identifier_2_101" class="footnote-link footnote-identifier-link" title="Some sources suggest that this term is derived from Cockney rhyming slang via &ldquo;raspberry tarts&rdquo;.">3</a>]</sup>   No sound comes out if you don’t do this, but the process of making music has a lot more to do with four or five meters of beautifully crafted brass tubing, plus years spent learning how to work the valves, how to shape the hand in the bell, and how to make exactly the right kind of “random” excitation that the score demands at any given moment.  In the realm of problem-solving, knowledge of the domain plays the role of the instrument, and accumulated experience is the essence of the player’s skill.</p>
<p>Some parents who want their kids to be creative teach them to run around behaving in a random and undisciplined manner.  To me, that makes as much sense as teaching kids to play the French horn by telling them to run around making raspberry sounds.  Yes, some spontaneity is required for creative problem solving, but most kids come equipped with this unless it is driven out of them.  Far more important is to teach them all the knowledge they will need about various fields, and the discipline to gain systematic experience in applying that knowledge.  (It helps if the kids think all of that learning and exploration is great fun &#8211; which it can be!)  It does take some courage and self-confidence to move away from what everyone else is doing and to try something new and perhaps heretical &#8211; to “think outside the box” &#8211; but that courage is much easier to muster if you have a solid understanding of the domain, what others are working on, and what possibilities they seem to be overlooking.</p>
<h3>No Magic Needed</h3>
<p>Here’s a hypothesis &#8211; actually only part of a hypothesis, but we will get to the other part later:  What we call “scientific creativity” is just good, effective problem solving that happens to lead to a surprising result.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101#footnote_3_101" id="identifier_3_101" class="footnote-link footnote-identifier-link" title="I&rsquo;m certainly not the first person to suggest this.&nbsp; This idea was &ldquo;in the air&rdquo; during my grad-school days at the MIT AI Lab, probably introduced by Marvin Minsky.&nbsp; However, I don&rsquo;t know if he would agree completely with the view of creativity I am presenting here.">4</a>]</sup>  Sometimes the surprise is that any result at all was found.  No magic is required &#8211; just a lot of knowledge, some good search strategies (that evolve as you gain experience with the domain), a willingness to look at some possibilities that others have missed, and perhaps just a dash of randomness or unpredictability in choosing new paths.</p>
<p>If this hypothesis is correct, there is no reason in principle why an AI problem-solving program could not exhibit this kind of creativity.  We just have to build in the right knowledge, representations, and strategies, plus the ability to learn more of these things based on experience, observation, and “being told”.  That word “just” is a stand-in for decades of research on this topic, not much of which is taking place at the moment.  (See my rant on this topic in an <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39" target="_blank">earlier article</a>.)  However, I don’t see anything really fundamental that is missing from this picture.</p>
<p>The only way to prove that this hypothesis is correct is to do a lot more work in this area: either we will start to see AI systems with undeniable creativity, or we will get stuck and will have to admit that we’ve run into some fundamental barrier to further progress.  But we’re not stuck yet &#8211; just moving more slowly than most of us would like.</p>
<p>The idea of creative machines is troubling to some.  We like to think that scientific creativity is one of those special talents that we humans will never have to share with machines.  Some skeptics would even say that we don’t share this kind of problem-solving creativity with animals.  People who believe this obviously haven’t spent much time trying to keep squirrels out of a bird-feeder.  I think that the big difference is that humans are much better than animals at accumulating and passing on great heaps of knowledge and experience &#8211; language gives us an enormous advantage in doing this.  Animals without language must depend almost entirely on their own life-experience, and will probably never develop anything as complex as the bow and arrow, let alone integrated circuits and quantum physics.</p>
<p>One additional point:  Problem-solving is fractal, and therefore so is problem-solving creativity.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=101#footnote_4_101" id="identifier_4_101" class="footnote-link footnote-identifier-link" title="I believe that I first heard this idea from Seymour Papert, but I don&rsquo;t know if it was original with him.">5</a>]</sup>  When you tackle a big problem, you have to find the right overall structure for a solution.  That may be handed to you, or it may require some creativity to discover it.  But once you’ve got the overall structure, there are many smaller sub-problems that must be solved in order to fill in the details and execute (or test) the grand design.  Each of those sub-problems may require some degree of creativity as well &#8211; and then the sub-sub-problems and so on, until we get down to operations that are well understood and completely routine.  Few of us will experience the exciting rush of creativity that goes with creating a major scientific theory or a world-changing invention, but every one of us must do some small-scale creative problem-solving every day.  And, at every level, I would claim, that is just effective problem solving that leads to a surprising answer &#8211; not magic.</p>
<h3>But what about “flashes of inspiration”?</h3>
<p>Perhaps the biggest problem with this “no magic” hypothesis is that it doesn’t match our subjective experience.  Time and again, scientists and other problem-solvers speak of having a “flash of inspiration” or an “Aha! experience” in which some key missing idea comes to them suddenly, often when they are thinking about something else.</p>
<p>There are many well-known examples.   In some cases, the “flash” is triggered by an observation:  Archimedes, settling into his bath and watching the water rise, suddenly understands how to measure the volume &#8211; and therefore the density &#8211; of a crown that may or may not be pure gold.  Eli Whitney, working on the cotton gin, observes a cat clawing at a chicken behind a wire fence, and suddenly understands that he should snag the cotton fibers and pull them away from the seeds, rather than trying to comb or roll the seeds out of a tangled mass of cotton.</p>
<p>In other cases, the “flash” seems to come from the inventor’s own mind, without any particular external stimulus.  One famous example: after years of trying to understand the structure of benzene, the chemist Friedrich August Kekulé hit upon a model with a ring of six carbons with alternating (but ever-shifting) single and double bonds.  That model worked, explaining all the odd properties of this compound that had puzzled chemists for decades.  Kekulé later wrote that the breakthrough came when, in a daydream, he saw the image of a snake eating its own tail &#8211; a common symbol for reincarnation or cyclic renewal in many cultures &#8211; and that suggested the dynamic, resonant ring of carbons.  Of course, it would have suggested no such thing if he had not already spent years wrestling with more conventional approaches to this problem.</p>
<p>Some of these famous examples may be apocryphal &#8211; explanations invented long after the fact &#8211; but they ring true to us because we all have experienced such “Aha!” moments in the small creative tasks we must perform in our daily lives.  It feels like magic, not mundane, grind-it-out problem-solving.  So what could be going on here?</p>
<p>Well, introspection is notoriously unreliable, so we could just dismiss these flashes as an illusion. But I think that there is something deeper going on here, and the flashes we experience provide a clue.</p>
<p>There are at least three different activities bundled into what we call “problem solving”.  First, there is information gathering: learning whatever is known about a problem, what approaches have already been tried, and perhaps running some experiments to learn more.  Second, there is choosing a representation or framework for attacking the problem.  Third, there is “grinding out” and testing the answer: working out all the details to see if the chosen framework can, indeed, provide a successful solution.</p>
<p>The first and third of these activities require conscious, step-by-step mental effort.  We are well aware when we’re doing this kind of work.  But the second step, choosing a representation and approach &#8211; a metaphor for the problem, if you will &#8211; is different.  It can be a conscious serial process: make a list of all the possibilities and try them out, starting with the ones that seem most likely to succeed.  But this can also be a sort of recognition process: having studied the problem to be solved, we have a mental sketch of its essential features and overall structure.  We then reach into our mental storehouse of schemes and metaphors, and something clicks &#8211; an approximate match, not a perfect one.  This is the candidate solution structure that we may or may not be able to massage into a complete answer.</p>
<p>This recognition process is very similar to other recognition tasks that we humans perform without apparent effort: vision, speech understanding, and so on.   Matching a set of observed features against a huge number of possible descriptions or schemas is not an easy task in terms of the computation required &#8211; far from it!  But it feels easy to us humans because our brains have powerful, massively parallel hardware to throw at the matching task.  I believe that the same (or similar) recognition hardware is used when we try to find a framework or metaphor that matches a problem.</p>
<p>Because so much is happening in parallel, we do not have a conscious awareness of all the mental effort that this requires.  We just see the candidate answers that pop out.  When one of those answers works, it’s a “flash of inspiration”.  Of course, we do this kind of matching all the time in our daily planning, but we don’t remember the matches that come easily, without a struggle.  The flashes we remember as creative are the ones that we have to struggle to find &#8211; often ones that others have struggled with and have failed to find.</p>
<p>In many cases our existing description of the problem will fail to match anything useful in our storehouse of metaphors.  The matcher may get stuck, perhaps because it has latched onto one obvious solution that doesn’t work, and it refuses to let go.  When that happens, one useful strategy is to think about something else for a while, and then come back to the problem.  Or, as noted above, some external stimuli &#8211; themselves being fed through the recognition machinery &#8211; may take on new meaning and catalyze a match.</p>
<p>So here is the revised theory:</p>
<ul>
<li>What we call “scientific creativity” is just good, effective problem solving that happens to lead to a surprising result.</li>
<li>The part that seems creative or magical to us is the selection of the representation or approach. That is fundamentally a recognition process: matching the problem (as we understand it) against a vast store of metaphors and techniques. This is computationally demanding, but it happens in parallel, so it feels like a flash. We don’t feel the mental effort required to do the match.</li>
<li>These flashes of inspiration hardly ever occur until you’ve done a lot of work to investigate and understand the structure of the problem. That part is perceived as hard work. And, once the flash has occurred, it is of no value until you have done all the detailed “grind it out” work to fit your idea to the problem and verify that it works. That part, too, is perceived as hard work. As Thomas Edison put it, “Genius is 1% inspiration and 99% perspiration.”</li>
<li>For a problem that is both difficult and important, many people will try the standard methods and representations and will do all the necessary hard work. So what sets apart the “creative” person is their success in doing the recognition part, and in coming up with an answer that the others have missed.</li>
</ul>
<p>If creativity were magic, there would not be much any of us could do to become more creative &#8211; the flash occurs or it doesn’t.  But if the above theory is correct, then there are problem-solving strategies that we humans can adopt that might lead us to more creative results, more of the time.  That will be the topic of a future article, coming soon.</p>
</div>
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_101" class="footnote">This maxim &#8212; I&#8217;ve seen it worded in various ways &#8212; is often attributed to Esther Dyson.</li><li id="footnote_1_101" class="footnote">…Well, a random walk might work if you can afford to invest the immense time and parallelism that we see in biological evolution by natural selection.  But even in natural selection, we see some very clever mechanisms to guide the search, for example by selecting and stabilizing various useful partial solutions and re-using them in various combinations, rather than wandering randomly in the lowest-level feature-space.   This preservation and shuffling of partial solutions seems to be the primary (non-recreational) role of sexual reproduction.</li><li id="footnote_2_101" class="footnote">Some sources suggest that this term is derived from Cockney rhyming slang via “raspberry tarts”.</li><li id="footnote_3_101" class="footnote">I’m certainly not the first person to suggest this.  This idea was “in the air” during my grad-school days at the MIT AI Lab, probably introduced by Marvin Minsky.  However, I don’t know if he would agree completely with the view of creativity I am presenting here.</li><li id="footnote_4_101" class="footnote">I believe that I first heard this idea from Seymour Papert, but I don’t know if it was original with him.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=101</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Human vs. Super-Human AI</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=39</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=39#comments</comments>
		<pubDate>Mon, 07 Sep 2009 22:14:31 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://sef-linux.radar.cs.cmu.edu/nuggets/?p=39</guid>
		<description><![CDATA[Note:  A revised, updated, and slightly expanded version of this essay has been published in the inaugural issue of the new online journal, Advances in Cognitive Systems, or ACS: Fahlman, Scott E. (2012): “Beyond Idiot-Savant AI” in Advances in Cognitive Systems 1, pages 15-22. As for the photo, it doesn’t really have anything to do with [...]]]></description>
				<content:encoded><![CDATA[<p><img src="http://scone1.scone.cs.cmu.edu/nuggets/images/Scolding Duck.jpg" alt="" /></p>
<p><span style="color: #ff0000;">Note:  A </span><span style="color: #ff0000;">revised, updated, and slightly expanded version of this essay has been published in the inaugural issue of the new online journal, <em>Advances in Cognitive Systems</em>, or ACS:</span></p>
<blockquote><p><span style="color: #ff0000;">Fahlman, Scott E. (2012): <a title="“Beyond Idiot-Savant AI”" href="https://docs.google.com/viewer?url=http%3A%2F%2Fcogsys.org%2Fpdf%2Fpaper-1-3.pdf">“Beyond Idiot-Savant AI”</a> in Advances in Cognitive Systems 1, pages 15-22.</span></p></blockquote>
<p class="MsoBodyText" style="margin-top: 5.75pt;"><span style="color: red;">As for the photo, it doesn’t really have anything to do with the topic.  People seem to like a bit of eye-candy in the blog, just for variety.  I took this photo a few years ago in Marwood Hill Gardens, near Barnstaple in England – a highly recommended garden, by the way.</span></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;"><strong>What this article is about</strong></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Despite the title, this article is not about super-intelligent, autonomous AI systems that <em>might</em> attempt to take over the world and that, if they succeed, might or might not decide to keep us humans around as pets.  There has been a certain amount of discussion about this in recent months, triggered in part by an <a href="http://research.microsoft.com/en-us/um/people/horvitz/AAAI_Presidential_Panel_2008-2009.htm">AAAI panel</a> set up to look at such issues – and in part by a lot of sensationalized press accounts.  Nothing sells papers like the threat of killer robots run amok.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">I agree that those of us working on AI have a responsibility to consider the long-term human consequences of our work.  Fortunately, we&#8217;ve got some time to think about this.  AI systems are still a very long way from achieving even a child-like level of common sense and general planning ability.  This article discusses one reason why progress has been so slow.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;"><strong>Slow progress toward the original goal of AI</strong></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">If measured by the number of useful applications, tools, and vibrant spin-off fields it has produced, Artificial Intelligence has been a spectacular success.  However, a lot of people (including me) believe that AI has been a disappointment in terms of achieving its original goal: to understand and, ultimately, to replicate the computational mechanisms responsible for human-like intelligence, in all its generality, flexibility, and resilience.  In an <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=29">earlier article</a> I listed some of the major elements of intelligence that we still don&#8217;t understand after 50+ years of work on AI.  In <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=27">another article</a> I echoed Ron Brachman&#8217;s call for continuing work on an integrated architecture for AI.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Back in the early days of the field, we seemed to be making good progress toward this goal.  There were a number of key discoveries along the way: first, that computers could manipulate symbols as well as numbers; second, that search through a space of possibilities, with occasional backtracking, was a powerful and resilient way to solve many problems; third, that human-like performance is going to require a lot of knowledge, not just search-power; fourth, that it&#8217;s too tedious to assemble and organize by hand enough knowledge for broad, general intelligence, so we had better find ways to increase our store of knowledge by learning.  But somehow, since the mid-1980s, progress toward this central goal of AI seems to have run out of steam.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Why is that?  Well, one explanation is that the funding climate changed.  In the old days, there was steady, long-term support for research on the central problems of AI – not an enormous amount of funding, but enough to enable a small community of AI researchers to focus on the most challenging fundamental problems.  This effort attracted some of the most brilliant minds in the field of computing.  But times changed.  Sponsors lost patience with basic, long-term AI research; they began demanding a focus on specific applications, with constant benchmarks, competitions, “go/no-go” decisions, and short-term deliverables.  The patient, curiosity-driven funding that characterized the early days of AI is now very rare.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">But that is only a part of the story.  I think that we face a more fundamental problem: in an odd way, AI has been a victim of its own success.  More specifically, the field&#8217;s success in producing useful but narrow technologies in particular areas – what I call &#8220;super-human AI&#8221; – has almost completely crowded out work on our original goal of creating flexible, integrated, human-like AI.  We have seen one gold rush after another to exploit new, highly specialized technologies with their roots in AI.  In the short run, this may be good for the field, since it pulls in both people and money; in the long run, I think it&#8217;s a serious problem.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;"><strong>&#8220;Super-Human&#8221; AI</strong></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">What do I mean by &#8220;super-human AI&#8221;?  I wrote briefly about this in an earlier <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=23">article</a>.  The idea is that intelligence is really a bundle of many capabilities.  It is possible (and is now very common) to have super-human performance in one of these areas, or a few of them, without having anything that resembles the breadth, resilience, and resourcefulness of “merely human” intelligence.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_0_39" id="identifier_0_39" class="footnote-link footnote-identifier-link" title="Note that the word &ldquo;merely&rdquo;, as used here, is meant to be read in a voice dripping with sarcasm.&nbsp;If we could somehow develop an artificial system with &ldquo;merely human&rdquo; performance, that would be one of the crowning intellectual achievements of mankind &ndash; infinitely more important than making incremental progress in some particular subfield such as data mining or optimal planning.">1</a>]</sup></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">There are many examples of narrow super-human AI, but the story is similar for each.  First, researchers grappling with some important problem within AI try a variety of approaches, inspired to some degree by the questions &#8220;How do humans perform this task?&#8221; or &#8220;What is really required to achieve human-like performance?&#8221;  And then someone comes up with an elegant mathematical approach that, <em>under certain conditions</em> and <em>with sufficient computing power</em>, can produce results much better than an unaided human.  In many cases, this leads to a commercially valuable technology.  In some cases, it gives rise to an active field of investigation that takes on a life of its own, attracting many researchers and lots of funding, and spawning its own specialized conferences and journals.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">There are many examples of these super-human AI technologies: computer algebra systems that can solve integrals that no unassisted human can handle; search-intensive chess programs that can consistently beat (almost) every human player; search engines that can browse and index the entire Internet, but without any understanding of the content; statistical machine translation systems that can produce useful (if imperfect) translations without ever considering the meaning of the text; statistical data-mining programs that can extract subtle regularities from a mountain of noisy data; poker-playing programs that employ powerful techniques from game theory; optimal or provably near-optimal planning systems; theorem-proving inference systems, with their guarantees of soundness, logical completeness, and provable consistency; and statistical inference systems that (if their models and input probabilities are correct) can very precisely infer the probabilities of various outcomes in a way that no unaided human can match.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">This is great, but in every case (so far, at least) these developments contribute little or nothing toward achieving our original goal.  The super-human techniques apply only to a very narrow set of problems, or the assumptions underlying the mathematical model are unrealistic in practice, or the method is too computationally demanding to be used on large problems – often problems that we humans can solve easily using our more informal approaches.  Or all of the above.  All of these systems are impressive, and many are commercially valuable, but none of them would be called <em>intelligent</em>, in the normal sense of that word.  None of these systems can begin to match the common sense or flexible problem-solving ability of a young child.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;"><strong>An Example</strong></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">To understand what’s going on here, let&#8217;s look at one of these areas – the evolution of AI planning and problem-solving systems – in a bit more detail.  A lot of the early work in this area took an intuitive approach, informed to some degree by introspection about how we humans approach complex planning tasks.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">The first problem was to represent the universe in which the planning is to take place, the allowable set of operations, and the preconditions and effects of each operation.  (We still have not completely solved these representation problems, but that&#8217;s a topic for another article.)  Given an adequate representation, the next problem is how to find a legal path from the current state to the goal.  Sometimes a legal path is easily found; sometimes it requires a great deal of search and non-obvious application of the available operators.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Ideally, we would like both a reasonably efficient plan and a reasonably efficient planning process.  One powerful idea is hierarchical planning: first, use high-level, abstract operators to sketch the outlines of a plan; then use more specific operators to fill in the details.  Another powerful idea is to save a sequence of operations that is useful in one context, generalize it a bit, and turn the sequence into a &#8220;macro-operator&#8221; that can be used in other problems.</p>
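<p class="MsoBodyText" style="margin-top: 5.75pt;">As a rough illustration of these two ideas, here is a minimal Python sketch.  It is not taken from any actual planning system: the step names, the <code>REFINEMENTS</code> table, and the <code>expand</code> and <code>make_macro</code> functions are all hypothetical, and the sketch ignores preconditions, effects, and search entirely.  It shows only the shape of the idea: abstract steps are filled in with more specific ones, and a finished sequence can be stored under a new name and reused.</p>
<pre>
```python
# Hypothetical toy domain: every name here is made up for illustration.
# Each abstract step maps to the more specific steps that fill it in.
REFINEMENTS = {
    "travel_to_airport": ["leave_campus", "drive_parkway", "follow_airport_signs"],
    "leave_campus": ["walk_to_car", "start_car"],
}


def expand(step, refinements):
    """Recursively replace an abstract step with its more specific steps."""
    if step not in refinements:
        return [step]  # primitive step: nothing left to fill in
    plan = []
    for sub_step in refinements[step]:
        plan.extend(expand(sub_step, refinements))
    return plan


def make_macro(name, steps, refinements):
    """Return a copy of the table with a saved sequence added as a macro-operator."""
    extended = dict(refinements)
    extended[name] = list(steps)
    return extended


# Hierarchical planning: sketch the outline first, then fill in the details.
plan = expand("travel_to_airport", REFINEMENTS)

# Save the finished plan as a macro-operator and reuse it in a new task.
library = make_macro("airport_run", plan, REFINEMENTS)
library["catch_flight"] = ["pack_bag", "airport_run", "check_in"]
flight_plan = expand("catch_flight", library)

print(plan)
print(flight_plan)
```
</pre>
<p class="MsoBodyText" style="margin-top: 5.75pt;">A real planner would also have to check that each step&#8217;s preconditions hold, and would generalize the saved sequence (for example, by replacing specific objects with variables) before storing it.  This sketch stores the sequence unchanged.</p>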
<p class="MsoBodyText" style="margin-top: 5.75pt;">These ideas were explored extensively in the early days of AI by systems such as GPS<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_1_39" id="identifier_1_39" class="footnote-link footnote-identifier-link" title="Newell, A.; Shaw, J.C.; Simon, H.A. (1959). Report on a general problem-solving program.&nbsp;Proceedings of the International Conference on Information Processing. pp. 256-264.">2</a>]</sup>, STRIPS<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_2_39" id="identifier_2_39" class="footnote-link footnote-identifier-link" title="R. Fikes and N. Nilsson (1971). STRIPS: a new approach to the application of theorem proving to problem solving.&nbsp;Artificial Intelligence, 2:189-208.">3</a>]</sup>, ABSTRIPS<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_3_39" id="identifier_3_39" class="footnote-link footnote-identifier-link" title="Sacerdoti, E. D., &ldquo;Planning in a Hierarchy of Abstraction Spaces,&rdquo;&nbsp;Artificial Intelligence, 5:115-135, 1974.">4</a>]</sup>, SOAR<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_4_39" id="identifier_4_39" class="footnote-link footnote-identifier-link" title="John Laird, Paul Rosenbloom, and Allen Newell (1987). &ldquo;Soar: An Architecture for General Intelligence&rdquo;.&nbsp;Artificial Intelligence, 33: 1-64.">5</a>]</sup>, and many others.  My own BUILD program<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=39#footnote_5_39" id="identifier_5_39" class="footnote-link footnote-identifier-link" title="Fahlman, S. E. (1974), &ldquo;A Planning System for Robot Construction Tasks&rdquo;, Artificial Intelligence 5 (1974), 1-49.&nbsp;Tech Report version available&nbsp;online.">6</a>]</sup> (my MIT master&#8217;s thesis from 1973) was typical of early work in this area.
BUILD tried to figure out a plan by which a (simulated) one-handed robot could build a specified structure on a table, given a collection of blocks.  BUILD could be quite resourceful: it would first try a straightforward approach, placing the blocks one by one, starting from the bottom of the desired structure and working upward.  But if the desired structure was unstable during the construction, it would consider more complex plans.  It would try to use other blocks as scaffolding or temporary counterweights, and if that didn’t work it would try to build a sub-structure on the table and then lift the whole sub-assembly into place.  BUILD would do some extra work to produce good plans – for example, it would eliminate redundant steps in the plan – but its plans were by no means optimal, and were never intended to be.  It just returned the first reasonably good plan that worked.  In that respect, it seemed very human-like in its planning.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Not long after BUILD was published, the AI planning field changed radically.  Methods were developed that, for a certain limited class of problems, would guarantee optimal results, or results that were provably close to optimal.  Other things being equal, that&#8217;s good: who wouldn&#8217;t prefer an optimal solution over one that is merely &#8220;good enough&#8221;?  But, of course, other things were not equal.  The optimal planning programs were very computationally demanding because the programs had to consider <em>every</em> possible solution – or formally exclude some parts of the search space where no optimal solution could possibly be hiding.  For many problems of interest, these techniques were computationally intractable, or at least impossibly inefficient.  So these techniques were limited to small problems in very clean, easy-to-model domains.  In the real world, it makes little sense to spend a lot of supercomputer time seeking an optimal solution when a single pothole – not represented in the model – could force the whole planning process to be re-run.  (If you really care about optimality, a local patch to the plan isn’t good enough.)  And, as all Pittsburghers know, potholes are everywhere.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Given these limitations, some of us felt that the obvious move would be to continue work on flexible, resourceful, trainable, &#8220;good enough&#8221; planning systems.  After all, we humans don&#8217;t worry about optimal planning in our daily lives.  &#8220;Good enough&#8221; planning is good enough for us, and we can show great cleverness and resiliency when things go wrong at execution time – as they so often do – forcing us to re-plan on the fly.  We can even pass partially instantiated plans from one person to another via informal high-level recipes:  &#8220;To get from CMU to the airport by car, take Fifth Avenue to the Parkway East (heading west), cross the Fort Pitt Bridge, and just follow the &#8216;Airport&#8217; signs from there.&#8221;</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">But the idea of optimal or near-optimal solutions, built on a sound and elegant mathematical foundation of theorems and lemmas, was too alluring to pass up.  Since the mid-1980s, the planning field has been dominated by this approach.  Most of the papers at planning conferences focus on how to deal with the resulting intractability, so that at least some problems of practical interest can be addressed.  If an optimal solution is infeasible, you at least need to prove something about how close your technique can come to the optimum – impossible in most messy real-world planning domains.  So it is now difficult to publish planning results that do not address optimality concerns, and several generations of students have learned to take this approach for granted.  Not only has a super-human sub-field of AI been spawned, but work on more human-like approaches to planning has mostly shriveled and died, unable to thrive in the shade of this mighty oak.</p>
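<p class="MsoBodyText" style="margin-top: 5.75pt;">The trade-off can be seen even in a toy example.  The Python sketch below is only an illustration under stated assumptions: the four-node <code>GRAPH</code>, its step costs, and the function names are all hypothetical, and the brute-force enumeration stands in for the far more sophisticated methods used by real optimal planners.  The &#8220;good enough&#8221; planner stops at the first plan it finds; the optimal planner must account for every candidate before it can answer.</p>
<pre>
```python
# Hypothetical toy problem: nodes are states, numbers are step costs.
GRAPH = {
    "start": {"a": 1, "b": 4},
    "a": {"goal": 6, "b": 1},
    "b": {"goal": 1},
    "goal": {},
}


def all_paths(graph, node, goal, path=None, cost=0):
    """Lazily yield every loop-free path from node to goal, with its total cost."""
    path = (path or []) + [node]
    if node == goal:
        yield path, cost
        return
    for next_node, step_cost in graph[node].items():
        if next_node not in path:  # never revisit a state on the same path
            yield from all_paths(graph, next_node, goal, path, cost + step_cost)


def first_good_enough(graph, start, goal):
    """Return the first plan found, without looking for a cheaper one."""
    return next(all_paths(graph, start, goal))


def optimal(graph, start, goal):
    """Examine every candidate plan and return the cheapest."""
    return min(all_paths(graph, start, goal), key=lambda plan: plan[1])


print(first_good_enough(GRAPH, "start", "goal"))
print(optimal(GRAPH, "start", "goal"))
print(len(list(all_paths(GRAPH, "start", "goal"))), "candidate plans in total")
```
</pre>
<p class="MsoBodyText" style="margin-top: 5.75pt;">In this tiny graph the first plan found costs 7 and the optimal one costs 3, and there are only three candidates to compare.  The number of candidates grows very quickly as the graph grows, which is where the intractability comes from.</p>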
<p class="MsoBodyText" style="margin-top: 5.75pt;"><strong>The Problem</strong></p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">And that, I think, is the problem.  AI is one field with two very different sets of goals.  It would be healthy for the field if these two approaches could co-exist: one set of researchers working on various super-human areas of AI and another set working on the original core problem of broad, human-like intelligence.  These efforts could reinforce one another, and some people would move back and forth between them in the course of a career.  Unfortunately, it seldom works out that way, for two reasons:</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">First, when one of these super-human technologies takes off, it creates a sort of gold rush that attracts a lot of talent and resources away from the core problems of AI.  In recent years, it seems that 80-90% of the people at the big AI conferences are working on super-human AI problems, not on human-like AI.  So it is little wonder that progress on the core problems has slowed down.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Second, researchers in some of these super-human areas develop a certain contempt for the less elegant human-like approaches in the same or neighboring areas: their own work is based on elegant mathematics and clean abstractions – the approach is scientific and <em>principled</em> – while those working on less formal approaches to human-like AI are just messing around.  “That’s the sort of thing we did in the old days, before we understood how to properly frame the problem.  Anyone still messing with those <em>ad hoc</em> approaches must be doing so out of ignorance, unaware of all the amazing progress that has taken place in AI.”</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Well, OK, there has been amazing progress, and we should build on that whenever we can.  But in most cases, <em>we’re not talking about the same area of research.</em>  There is a place for optimal planning, but we also need to understand human-like good-enough planning, which is faster and much more flexible.  There is an important place for theorem-proving, but (as I have argued <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=34">elsewhere</a>), we need something more quick-and-dirty if we want our systems to read the daily newspaper.  And so on.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">Unless and until these super-human approaches can be extended to cover the kinds of large, messy, hard-to-formalize tasks that we humans handle with such aplomb, we have to keep working on these things, by whatever scruffy means are necessary.  Maybe some of these problems can be handled by techniques that will ultimately be formalized and wrapped in elegant theory, or maybe they are inherently messy, but any reasonable person must admit that AI still contains many challenging problems that don’t fit into the elegant theoretical frameworks we have today.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">One might argue that these super-human techniques are <em>more</em> valuable than understanding and emulating human-like intelligence.  After all, we already have plenty of humans, so why not just focus on the areas where machines can extend human capabilities?  I think there is some merit in that argument, but it would be a shame to let the scramble for super-human capabilities crowd out the quest for human-like AI.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">The quest to understand and replicate human-like intelligence remains one of the great intellectual challenges of mankind – one of the last great mysteries.  Yes, this problem has proven to be more difficult than we thought it would be, and the solution is unlikely to rest on a foundation of clean, beautiful mathematics, but that should not discourage us.  If, along the way to understanding intelligence, we can create some valuable technologies that provide super-human performance in specific narrow domains, that&#8217;s a bonus.  AI as a field may pause occasionally to take advantage of these new technologies, but we should not let them divert us from the ultimate goal.  Better yet, we can combine these technologies to get the best of both worlds: flexible, resourceful human-like systems with a &#8220;telepathic&#8221; link to an array of super-human tools, for the times when those tools are applicable.</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">And then we can go back to worrying about the death robots. ;-)</p>
<p class="MsoBodyText" style="margin-top: 5.75pt;">
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_39" class="footnote">Note that the word “merely”, as used here, is meant to be read in a voice dripping with sarcasm. If we could somehow develop an artificial system with “merely human” performance, that would be one of the crowning intellectual achievements of mankind – infinitely more important than making incremental progress in some particular subfield such as data mining or optimal planning.</li><li id="footnote_1_39" class="footnote">Newell, A.; Shaw, J.C.; Simon, H.A. (1959). Report on a general problem-solving program. <em>Proceedings of the International Conference on Information Processing.</em> pp. 256-264.</li><li id="footnote_2_39" class="footnote">R. Fikes and N. Nilsson (1971). STRIPS: a new approach to the application of theorem proving to problem solving. <em>Artificial Intelligence</em>, 2:189-208.</li><li id="footnote_3_39" class="footnote">Sacerdoti, E. D., “Planning in a Hierarchy of Abstraction Spaces,” <em>Artificial Intelligence</em>, 5:115-135, 1974.</li><li id="footnote_4_39" class="footnote">John Laird, Paul Rosenbloom, and Allen Newell (1987). “Soar: An Architecture for General Intelligence”. <em>Artificial Intelligence</em>, 33: 1-64.</li><li id="footnote_5_39" class="footnote">Fahlman, S. E. (1974), &#8220;A Planning System for Robot Construction Tasks&#8221;, Artificial Intelligence 5 (1974), 1-49. Tech Report version available <a href="http://dspace.mit.edu/handle/1721.1/6918">online</a>.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=39</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>In Defense of Incomplete Inference</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=34</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=34#comments</comments>
		<pubDate>Thu, 17 Jul 2008 07:14:47 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[KR Issues]]></category>

		<guid isPermaLink="false">http://sef-linux.radar.cs.cmu.edu/nuggets/?p=34</guid>
		<description><![CDATA[In this article, I will expand a bit on some comments I made in the previous article about the tension between expressiveness, scalability, and the &#8220;general theorem proving&#8221; approach to inference – an approach that currently dominates the field of knowledge representation (KR). For a practical and useful knowledge base system (KBS), I think we [...]]]></description>
				<content:encoded><![CDATA[<p class="MsoNormal">In this article, I will expand a bit on some comments I made in the previous article about the tension between expressiveness, scalability, and the &#8220;general theorem proving&#8221; approach to inference – an approach that currently dominates the field of knowledge representation (KR).<span> </span>For a practical and useful knowledge base system (KBS), I think we have to find another path, and I will explain why I have come to this conclusion.</p>
<p class="MsoNormal">It is not my goal here to present a tutorial on first-order logic (FOL), resolution theorem proving, or the decidability and tractability of inference in FOL and in all the various subsets and supersets of FOL that have been proposed over the years.<span> </span>There is not enough space for this in a short article, and I certainly am not the best person to explain these issues.<span> </span>There are many good tutorials on these topics, both online and in the published literature, and most of those tutorials offer pointers to original sources for those who want to dig deeper. <span> </span>One good entry point, which ties these topics explicitly to the concerns of practical knowledge representation, is the book <em><a href="http://www.amazon.com/Knowledge-Representation-Reasoning-Artificial-Intelligence/dp/1558609326">Knowledge Representation and Reasoning</a></em>, by Ron Brachman and Hector Levesque, mentioned in an <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33">earlier article</a> in this blog.<span> </span>So in this article I will assume some prior knowledge of FOL and related matters; readers totally lacking such a background will probably find this article to be impenetrable and uninteresting.</p>
<p><strong>First-Order Logic for KR</strong></p>
<p class="MsoNormal">When faced with the problem of representing symbolic knowledge in a computer, most researchers immediately turn to First-Order Logic (FOL).<span> </span>It seems the obvious choice.<span> </span>This approach rests on a solid mathematical foundation, built up by great minds since the time of Aristotle.<span> </span>There is a very solid theory telling us what FOL can represent and what kinds of inference are possible in this logic.<span> </span>John McCarthy, with the publication of his 1959 paper <em><a href="http://www-formal.stanford.edu/jmc/mcc59.html">Programs with Common Sense</a></em>, and in much of his subsequent work, gave the just-created field of AI a mighty push in this direction, and away from the rather messy and obscure embedding of knowledge in programs that most of the early pioneers of the AI field had engaged in.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=34#footnote_0_34" id="identifier_0_34" class="footnote-link footnote-identifier-link" title="At the time, McCarthy&amp;#8217;s arguments in favor of a declarative, logic-based approach to KR had only partial success.&nbsp;Most AI researchers accepted, more or less, the &amp;#8220;declarative&amp;#8221; part, but for a long time the AI field was split between those who favored a logic-based approach and those who just wanted to get the job done.&nbsp;The split remains, though it is now less visible.&nbsp;In recent years, the logicians have gained the ascendancy among theoretical AI researchers and those focused on the &amp;#8220;semantic web&amp;#8221;.&nbsp;Many people focused on AI applications (i.e. &amp;#8220;getting the job done&amp;#8221;) still often employ other, mostly older KR approaches that avoid some of the representational limitations discussed in this article.">1</a>]</sup></p>
<p class="MsoNormal">For the inference procedure, it seems natural to choose something like resolution theorem-proving<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=34#footnote_1_34" id="identifier_1_34" class="footnote-link footnote-identifier-link" title="John Alan Robinson, &amp;#8220;A Machine-Oriented Logic Based on the Resolution Principle&amp;#8221;,&nbsp;Journal of the ACM, 12(1):23&ndash;41, 1965.">2</a>]</sup>, with its guarantees of sound inference and logical completeness.<span> </span>The basic idea is simple: if you want to determine whether proposition X is true (provable) relative to a given knowledge base (KB), assert not-X and see if the KB is inconsistent given that addition.<span> </span>If so, X must be true.<span> </span>Note, however, that this inserts a potentially very expensive step into the procedure for even simple queries: &#8220;determine whether the entire KB is now consistent&#8221;.<span> </span>In many cases where X is true, a well-designed inference algorithm will terminate quickly, having found some inconsistency that derives from not-X.<span> </span>However, proving that <em>no</em> inconsistency exists, even by the most circuitous path through the assertions and predicates of the KB, can take a long time.<span> </span>In more expressive logical systems, the search may not terminate at all.<span> </span></p>
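<p class="MsoNormal">The refutation idea can be sketched in a few lines of Python. This is only a toy, under several assumptions: it handles propositional clauses rather than full FOL, the query must be a single literal, and the names <code>negate</code>, <code>resolvents</code>, <code>is_inconsistent</code>, and <code>proves</code> are invented for this illustration. It is not Scone code and not an efficient prover.</p>

```python
from itertools import combinations

def negate(literal):
    """Return the complementary literal: 'p' becomes '~p' and '~p' becomes 'p'."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolvents(clause_a, clause_b):
    """Return every clause obtainable by resolving clause_a against clause_b."""
    results = set()
    for literal in clause_a:
        if negate(literal) in clause_b:
            merged = (clause_a - {literal}) | (clause_b - {negate(literal)})
            results.add(frozenset(merged))
    return results

def is_inconsistent(clauses):
    """Saturate the clause set; True if the empty clause is ever derived."""
    known = set(clauses)
    while True:
        new = set()
        for a, b in combinations(known, 2):
            for r in resolvents(a, b):
                if not r:          # empty clause: a contradiction was found
                    return True
                new.add(r)
        if new.issubset(known):    # nothing new can be derived: consistent
            return False
        known |= new

def proves(kb, query):
    """Refutation: add the negated query, then test the whole set for inconsistency."""
    return is_inconsistent(set(kb) | {frozenset({negate(query)})})

# Hypothetical toy KB, written as clauses (each clause is a disjunction).
kb = {
    frozenset({"~elephant", "mammal"}),   # elephant implies mammal
    frozenset({"~mammal", "has_heart"}),  # mammal implies has_heart
    frozenset({"elephant"}),              # Clyde is an elephant
}

print(proves(kb, "has_heart"))  # a contradiction is found, so the query is provable
print(proves(kb, "can_fly"))    # saturation finds no contradiction
```

<p class="MsoNormal">Note the asymmetry in this sketch: the positive answer stops as soon as the empty clause appears, while the negative answer requires saturating the entire clause set, which is the expensive case described above.</p>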
<p class="MsoNormal">There is the added problem that if some inconsistency was <em>already</em> present in the KB before not-X was added, the system will agree that any statement X must be true, even if it is something absurd like &#8220;one equals two&#8221;.<span> </span>So when using resolution, or any other logically complete inference method, knowledge-base hygiene is all-important: inconsistency must be kept out at all costs.<span> </span>This can be a problem when combining large bodies of information from multiple sources with varying degrees of reliability.<span> </span>But if we have a sufficiently fast procedure for testing the KB for global consistency (as we must have when using resolution), the problem of KB hygiene can be solved, though perhaps at great computational cost.</p>
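<p class="MsoNormal">A small sketch of why an inconsistent KB "proves" everything. For brevity it swaps resolution for brute-force truth-table checking, which gives the same answers for any sound and complete propositional method. The symbol names and the <code>entails</code> helper are assumptions made up for this example.</p>

```python
from itertools import product

def entails(kb, query, symbols):
    """True if the query holds in every truth assignment that satisfies the KB."""
    for values in product([False, True], repeat=len(symbols)):
        world = dict(zip(symbols, values))
        if all(rule(world) for rule in kb) and not query(world):
            return False  # a model of the KB in which the query fails
    return True           # no counter-model exists

symbols = ["clyde_male", "clyde_female", "one_equals_two"]

consistent_kb = [
    lambda w: w["clyde_male"],
    lambda w: not (w["clyde_male"] and w["clyde_female"]),  # male and female exclude each other
]
# Adding one more assertion makes the KB unsatisfiable.
inconsistent_kb = consistent_kb + [lambda w: w["clyde_female"]]

absurd = lambda w: w["one_equals_two"]

print(entails(consistent_kb, absurd, symbols))    # a counter-model exists
print(entails(inconsistent_kb, absurd, symbols))  # no models at all, so anything follows
```

<p class="MsoNormal">Because the inconsistent KB has no satisfying assignment, the loop never finds a counter-model, and the absurd query is reported as entailed.</p>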
<p><strong>Speed and Scalability vs. Expressiveness</strong></p>
<p class="MsoNormal">Here&#8217;s the big problem: it has been proven that logically complete inference in a full first-order logic system is computationally intractable.<span> </span>By <em>intractable</em>, we mean that the time required for the worst cases grows exponentially with the size of the KB. Over the years, some very clever inference engines have been developed that speed up typical-case queries by caching partial results or by ruling out parts of the KB that are (provably) not relevant to a given query, but the fundamental problem of intractability remains.<span> </span>The practical consequence is that a system based on full FOL and logically complete inference cannot grow to the very large size that we need for most interesting real-world KB applications.</p>
<p class="MsoNormal">When faced with this problem, most recent researchers in the KR field have turned to more restricted, less expressive logical systems than full FOL. There are many possible tradeoffs between tractability/scalability and expressiveness, and KR conferences are full of papers analyzing and advocating various points in the space of possible tradeoffs.<span> </span>A rather complicated alphabet soup of abbreviations has grown up in the theoretical KR community for describing the various expressive mechanisms that are included or not included in a given KR system.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=34#footnote_2_34" id="identifier_2_34" class="footnote-link footnote-identifier-link" title="For some pointers into this space, see the Wikipedia Article on&nbsp;Description Logics.&nbsp;For a good example of the complex ways in which the inclusion of expressive features may be traded off against the speed and tractability of inference, see Ian Horrocks, Ulrike Sattler, and Stephan Tobies:&nbsp;&amp;#8220;Practical Reasoning for Very Expressive Description Logics&amp;#8221; in&nbsp;Logic Journal of the IGPL, 8(3):239-264, 2000.">3</a>]</sup></p>
<p class="MsoNormal">The <a href="http://www.w3.org/2004/OWL/">OWL</a> notation, which currently seems to be the most popular KR system in actual use (especially for &#8220;semantic web&#8221; applications, since it has been blessed by the World Wide Web Consortium), is actually a family of languages.<span> </span>OWL-Lite is a very restricted form of OWL that is tractable and for which reasonably fast inference engines exist.<span> </span>Some popular inference engines are <a href="http://www.racer-systems.com/">Racer</a>, <a href="http://www.cs.man.ac.uk/~horrocks/FaCT/">FaCT</a>, and <a href="http://kaon2.semanticweb.org/">KAON2</a>.</p>
<p class="MsoNormal">However, in OWL-Lite it is impossible to express much of the knowledge that we need to express for real applications. OWL-DL, based on <a href="http://dl.kr.org/">Description Logics</a>, is somewhat more expressive: OWL-DL is itself a family with multiple versions, corresponding to the various ways in which description logics can be restricted.<span> </span>Some versions of OWL-DL are theoretically tractable and some are not; of the theoretically tractable versions of OWL, some fit into existing &#8220;fast&#8221; inference engines, and some do not.<span> </span>OWL-Full is more expressive still, but does not generally work with the inference engines that have been developed for description logics.</p>
<p class="MsoNormal">This research on the agonizing tradeoffs between scalability and expressiveness is valuable, but for our purposes in the Scone project – and for any other KB system with similar practical goals – these tradeoffs are beside the point.<span> </span><strong>For human-like reasoning on real-world problems with a large KB, we need both scalability to very large size and expressiveness that is <em>greater</em> – not <em>less</em> – than is provided by first-order logic.</strong></p>
<p class="MsoNormal">Regarding scalability, the core inferences that occur frequently in the KBS must be not only tractable, but fast.<span> </span>Even if a logical system can produce results in polynomial time, that is not good enough if the polynomial involves high-degree terms or large coefficients.<span> </span>If the KB is to scale to the size required for human-like common sense – or beyond – we need more-or-less constant-time responses for most kinds of queries, and linear-time response (or near-linear time) for most of the rest.</p>
<p class="MsoNormal">Of course, there is no magic approach for solving provably intractable inference problems.<span> </span>Some difficult reasoning problems will always require exponential or high-degree polynomial time, no matter what we do.<span> </span>But I would argue that these are not the inference problems that we need to solve frequently in the course of everyday human-like reasoning.<span> </span>So instead of holding the KBS design hostage to the worst-case performance of these most-difficult problems, we could instead optimize the KBS for the easier problems – the ones that we humans routinely solve with no apparent display of mental effort – and banish the more difficult problems to a specialized theorem-proving or puzzle-solving module, where they won&#8217;t bog down our reasoning when we want to determine, for example, whether Clyde the elephant has a heart.</p>
<p class="MsoNormal">Regarding expressiveness: as mentioned in the previous article, and as we will explore in more detail in future articles, a real-world human-like KBS needs higher-order constructs (i.e. the ability to refer to assertions as objects in the KB and to reason about the assertions themselves), multiple overlapping contexts (which really are just a convenient re-packaging of the higher-order constructs), and default reasoning with exceptions (non-monotonic constructs).<span> </span>If we add any of these features to our knowledge representation system, it takes us beyond the intractability of FOL and into the space of undecidability – that is, if we try to apply a logically complete inference method like resolution, it may not return an answer at all for some queries, no matter how long we wait.</p>
<p><strong>Three Major Goals: Pick Any Two</strong></p>
<p class="MsoNormal">So that&#8217;s the problem. We want three things from a practical KR system: (1) logical completeness and provable consistency; (2) speed and scalability; and (3) enough expressive power to cover all of our needs for representing common-sense knowledge.<span> </span>There is a well-established body of theory that says we can&#8217;t have all three properties at once.<span> </span>So researchers must choose which of these goals to give up (or at least to compromise on).</p>
<p class="MsoNormal">For many researchers, giving up the logical properties of completeness and provable consistency seems to be inconceivable – they don&#8217;t even discuss this as a choice – so they end up arguing endlessly about how much expressiveness to trade off for somewhat better speed and scalability.<span> </span>Often, they end up with the worst of both worlds: an awkward, inexpressive system that is still too slow to be used in many applications.</p>
<p class="MsoNormal">Some of the logicians in the KR field have effectively given up on scalability: they remain rooted in the logical tradition and continue to explore ever more complex and expressive logics, even if the resulting representations are intractable or undecidable.</p>
<p class="MsoNormal">And a few of us, unwilling to compromise either on expressiveness or scalability, have made the choice to give up on logically complete proof procedures and to reason in a more local and limited way.<span> </span>Our systems are not general theorem-provers, but they seem to support human-like common-sense reasoning quite well.<span> </span>In any case, given our goals, this is the only reasonable choice.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=34#footnote_3_34" id="identifier_3_34" class="footnote-link footnote-identifier-link" title="I am certainly not the first person to reach this conclusion.&nbsp;For example, see Jon Doyle and Ramesh S. Patil&amp;#8217;s paper&nbsp;&amp;#8220;Two Theses of Knowledge Representation: Language Restrictions, Taxonomic Classifications, and the Utility of Representation Services&amp;#8221;,&nbsp;Artificial Intelligence, Vol.48, No. 3 (April 1991), pp. 261-297.&nbsp;They present a longer and more detailed argument leading to the same general conclusion: that in a KR system for large real-world problems, we must use some kind of incomplete inference scheme, though their ideas about exactly what sort of incomplete inference to use are a bit different from mine.">4</a>]</sup></p>
<p><strong>Incomplete Inference</strong></p>
<p class="MsoNormal">So what are the implications of this choice?<span> </span>On the plus side, giving up on logically complete inference is very liberating: the greater expressiveness available in an incomplete-inference system like Scone allows us to easily express and reason about general-knowledge statements and concepts that would be extremely difficult, if not impossible, in a system based on FOL or a tractable subset of FOL.<span> </span>In Scone it is easy to represent and reason about pink elephants and flightless birds, and to represent complex multi-context statements like &#8220;Tom believes that Fred knows Mary&#8217;s phone number, but Fred doesn&#8217;t really know it.&#8221; <span> </span>This expressive power makes it much easier to build up large bodies of knowledge in real-world domains, since we are not constantly fighting against the fundamental limitations of our descriptive machinery.<span> </span>Since our KBS is more expressive, we can import knowledge from representations such as OWL into Scone, though we cannot, in general, export Scone knowledge back into less expressive systems – some of it just can&#8217;t be represented there.<span> </span>Freed from the expressiveness/tractability dilemma, we can focus on making our system&#8217;s limited inference as fast and scalable as possible.</p>
<p class="MsoNormal">On the other hand, there will be some valid inferences that could be made from the knowledge in our KB that a system like Scone will never make – that&#8217;s the definition of incomplete inference. <span> </span>Scone&#8217;s reasoning is mostly local – it will not follow chains of inference to arbitrary depth.<span> </span>When faced with some loop (&#8220;Who shaves the barber?&#8221;), Scone ideally would flag this as a problematic case, but we cannot guarantee that all such cases will be detected.<span> </span>Sometimes Scone will just return one answer or the other.<span> </span>For human-like reasoning, that strikes me as good enough. <span> </span>I do the same thing.<span> </span>We all do, when we&#8217;re not in theorem-proving mode.</p>
<p class="MsoNormal">Scone can still catch most inconsistencies as they are entered into the KB.<span> </span>For example, if you say that Clyde is male and later that he is someone&#8217;s mother, implying that Clyde is female, that will be caught.<span> </span>However, more subtle inconsistencies may sneak into the KB unnoticed.<span> </span>If somehow we end up in a situation where Clyde is both male and female, then we will experience some local confusion regarding Clyde.<span> </span>(What restroom should Clyde use?<span> </span>What sort of sex organs does Clyde have?)<span> </span>But our local, limited inference machinery will then not go on to conclude that one equals two.<span> </span>So the lack of logical completeness is not always a bad thing.</p>
<p class="MsoNormal">The deductions made by the limited inference engine in our KBS are normally sound, as far as they go, but there are situations where an unsound conclusion can be reached.<span> </span>In a non-monotonic but incomplete reasoning system, this is unavoidable.<span> </span>For example, suppose that elephants are, by default, gray, but that Clyde is pink.<span> </span>Since our KBS supports exceptions, it is legal to cancel the inherited &#8220;Clyde is gray&#8221; property and to add the contradictory assertion &#8220;Clyde is pink&#8221;.<span> </span>Suppose, however, that the information about Clyde&#8217;s non-standard color is not stated directly, but is derived as the result of a long and circuitous chain of reasoning.<span> </span>In this case, when asked the color of Clyde, our KBS might quickly infer that Clyde is gray and, if we stop the reasoning too soon, might never discover any reason to retract this conclusion. The KBS would return gray as the color, and that is incorrect.</p>
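<p class="MsoNormal">The Clyde example can be made concrete with a toy sketch of default inheritance under an effort bound. Everything here is an assumption for illustration: the <code>PARENT</code>, <code>DEFAULTS</code>, and <code>EXCEPTIONS</code> tables, the <code>lookup</code> function, and especially the single number standing in for the length of the chain of reasoning needed to discover the exception. It does not reflect Scone's actual data structures or API.</p>

```python
# Hypothetical is-a hierarchy and default properties.
PARENT = {"clyde": "elephant", "elephant": "mammal"}
DEFAULTS = {"elephant": {"color": "gray"}, "mammal": {"has_heart": True}}

# An exception, paired with the number of inference steps assumed to be
# needed to derive it. A directly stated exception would cost 0 steps.
EXCEPTIONS = {("clyde", "color"): ("pink", 5)}

def lookup(entity, prop, max_steps):
    """Return prop for entity, spending at most max_steps looking for exceptions."""
    exception = EXCEPTIONS.get((entity, prop))
    if exception is not None:
        value, steps_needed = exception
        if max_steps >= steps_needed:
            return value          # the exception was discovered in time
    # Otherwise fall back on the nearest inherited default.
    node = entity
    while node is not None:
        if prop in DEFAULTS.get(node, {}):
            return DEFAULTS[node][prop]
        node = PARENT.get(node)
    return None

print(lookup("clyde", "has_heart", max_steps=1))  # inherited from mammal
print(lookup("clyde", "color", max_steps=10))     # enough effort: the exception is found
print(lookup("clyde", "color", max_steps=1))      # stopped too soon: the unsound default
```

<p class="MsoNormal">With a small budget the sketch answers quickly but returns the default color, which is exactly the kind of unsound conclusion described above; with a larger budget it finds the exception.</p>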
<p class="MsoNormal">It is troubling to have unsound inferences in our system under any circumstances.<span> </span>However, default inference with exceptions is so valuable that, in my opinion, we cannot live without it in real-world domains.<span> </span>If you know a lot about elephants, it is unlikely that any individual elephant will fit the typical-elephant description without a few exceptions.<span> </span>And if you banish from the elephant description any property that is not universally true, you will be left with a very sparse and not very useful description.<span> </span>So I think that this particular kind of unsoundness is the lesser of evils – the only reasonable choice for a system that must solve real problems in messy, non-mathematical domains.</p>
<p class="MsoNormal">Again, I believe that this choice is quite consistent with human-like common-sense reasoning. <span> </span>In my daily life, I will often draw an invalid conclusion because I haven&#8217;t thought deeply enough about the problem – I have just applied the usual default instead of noticing that the case I&#8217;m dealing with is exceptional.<span> </span>We humans all do this in our day-to-day reasoning, whenever we’re not in ultra-careful theorem-proving mode.</p>
<p class="MsoNormal">Perhaps the greatest disadvantage of an incomplete-inference approach is that it is hard for us to make sweeping, closed-form statements describing exactly what inferences these systems can and cannot make.<span> </span>In most of these systems, the depth of reasoning depends in complicated ways on exactly what is in the KB and on parameters that govern how much time and effort should be spent on a query before the system gives up.<span> </span>So the incomplete-inference systems lack the satisfying mathematical elegance of the theorem-proving systems; in exchange, they offer speed, scalability and expressiveness.<span> </span>I think that each of us must choose which of these properties we value most, depending on our goals.</p>
<p><strong>In Conclusion&#8230;</strong></p>
<p class="MsoNormal">None of this is meant to suggest that researchers who cling tightly to the logically complete, theorem-proving approach are wrong to do so.<span> </span>Theorem proving – and, more recently, <em>automated</em> theorem proving – are among the crown jewels of human intellectual achievement.<span> </span>Theorem-proving enables us to reason far more deeply than we otherwise could, with a strong guarantee that, if our premises are correct, our conclusions are correct as well.<span> </span>Even for practical inference systems, this approach has some value.<span> </span>There are small but important reasoning problems for which intractability is not a fatal problem.<span> </span>And there are many uses for logic-based representation with limited expressive power.<span> </span>For example, a system like OWL-Lite may be sufficiently expressive for the Semantic Web as it is currently conceived by many: a simple hierarchical scheme to provide somewhat more meaningful labels for URLs and web pages.<span> </span></p>
<p class="MsoNormal">So I have no problem with researchers who want to focus on exploring the limits and tradeoffs of logically complete inference systems.<span> </span>It’s important work, and perhaps they deserve the dominant position they currently occupy in the KR universe.<span> </span>But I do wish that they would regard those of us who have chosen to work on incomplete inference systems as equally serious researchers who have chosen a different approach for good reason, and not as ignorant or lazy bumblers who are too slow or too stubborn to perceive the One True Way.</p>
<p class="MsoNormal">It is rather annoying when the logic people start congratulating themselves for having taken a &#8220;principled&#8221; approach to representing one kind of knowledge or another, with the sneering implication that other approaches are &#8220;unprincipled&#8221; – just a collection of <em>ad hoc</em> design choices made more or less at random by people who don&#8217;t know any better.<span> </span>Principles are important.<span> </span>But if strict adherence to a particular set of principles leads you into a part of the design space that is provably worthless for many important real-world problems, maybe it&#8217;s time to look for some new principles.</p>
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_34" class="footnote">At the time, McCarthy&#8217;s arguments in favor of a declarative, logic-based approach to KR had only partial success. Most AI researchers accepted, more or less, the &#8220;declarative&#8221; part, but for a long time the AI field was split between those who favored a logic-based approach and those who just wanted to get the job done. The split remains, though it is now less visible. In recent years, the logicians have gained the ascendancy among theoretical AI researchers and those focused on the &#8220;semantic web&#8221;. Many people focused on AI applications (i.e. &#8220;getting the job done&#8221;) still often employ other, mostly older KR approaches that avoid some of the representational limitations discussed in this article.</li><li id="footnote_1_34" class="footnote">John Alan Robinson, &#8220;A Machine-Oriented Logic Based on the Resolution Principle&#8221;, <a title="Journal of the ACM" href="http://en.wikipedia.org/wiki/Journal_of_the_ACM">Journal of the ACM</a>, 12(1):23–41, 1965.</li><li id="footnote_2_34" class="footnote">For some pointers into this space, see the Wikipedia Article on <a href="http://en.wikipedia.org/wiki/Description_logic">Description Logics</a>. For a good example of the complex ways in which the inclusion of expressive features may be traded off against the speed and tractability of inference, see Ian Horrocks, Ulrike Sattler, and Stephan Tobies: <a href="http://www.comlab.ox.ac.uk/people/ian.horrocks/Publications/download/2000/HoST00.pdf">&#8220;Practical Reasoning for Very Expressive Description Logics&#8221;</a> in Logic Journal of the IGPL, 8(3):239-264, 2000.</li><li id="footnote_3_34" class="footnote">I am certainly not the first person to reach this conclusion. For example, see Jon Doyle and Ramesh S. Patil&#8217;s paper <a href="http://www.csc.ncsu.edu/faculty/doyle/publications/td91.ps">&#8220;Two Theses of Knowledge Representation: Language Restrictions, Taxonomic Classifications, and the Utility of Representation Services&#8221;</a>, Artificial Intelligence, Vol.48, No. 3 (April 1991), pp. 261-297. They present a longer and more detailed argument leading to the same general conclusion: that in a KR system for large real-world problems, we must use some kind of incomplete inference scheme, though their ideas about exactly what sort of incomplete inference to use are a bit different from mine.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=34</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mini-Nuggets: Knowledge Base Requirements for Human-Like Thought</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=33</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=33#comments</comments>
		<pubDate>Wed, 25 Jun 2008 10:12:45 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[KR Issues]]></category>

		<guid isPermaLink="false">http://sef-linux.radar.cs.cmu.edu/nuggets/?p=33</guid>
		<description><![CDATA[This is the second mini-nuggets article: a collection of propositions, with some minimal explanation for each, on a given topic. The idea is to sketch out an overall approach or point of view quickly, without getting bogged down in a lot of detail or lengthy justification about each point. I will come back and expand [...]]]></description>
				<content:encoded><![CDATA[
<p><img style="vertical-align: top; border: 2px solid black;" src="http://scone1.scone.cs.cmu.edu/nuggets/images/Rainbow2.jpg" alt="" width="565" height="464" /></p>
<p class="MsoNormal">This is the second <em><a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=31">mini-nuggets</a></em> article: a collection of propositions, with some minimal explanation for each, on a given topic. The idea is to sketch out an overall approach or point of view quickly, without getting bogged down in a lot of detail or lengthy justification about each point.<span> </span>I will come back and expand on many of these points in later articles.</p>
<p class="MsoNormal">In the <a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=32">previous article</a>, I gave some specific narrative examples illustrating the amazing capabilities of the human memory system, and I argued that these capabilities are at the core of human intelligence.<span> </span>Until we understand how this memory system works, precisely enough to build some approximation of it, our AI systems will never approach the breadth, flexibility, and common sense that we humans take for granted in our own thinking.<span> </span>(It all seems so easy for us that it took some time just to realize that there&#8217;s an interesting problem here.)<span> </span>This, in my opinion, is the single most important missing piece in our current understanding of intelligence, whether natural or artificial.</p>
<p class="MsoNormal"><strong>In this article I will try to enumerate some of the specific capabilities that a memory system or knowledge-base system (KBS)</strong><strong><span style="font-weight: normal;"><sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33#footnote_0_33" id="identifier_0_33" class="footnote-link footnote-identifier-link" title="For clarity, in this and future articles I will try to consistently refer to a specific collection of knowledge as a&nbsp;knowledge base&nbsp;(KB), and to the system that contains this knowledge, with some support for search and inference, as the&nbsp;knowledge-base system&nbsp;(KBS).&nbsp;Informally, it is common to refer to both with the term &amp;#8220;knowledge base&amp;#8221; &ndash; similar to the practice in the database world.">1</a>]</sup></span> must have in order to support human-like performance.<span> </span></strong>I don&#8217;t expect this brief statement of positions to persuade anyone who is inclined toward another view.<span> </span>My goal here is just to provide a rough sketch of the overall approach that (as of this moment) looks best to me and my research group – an approach that we are working busily to implement in the Scone knowledge-base system.</p>
<p class="MsoNormal">One point before we dive in: I will be talking here about <em>symbolic memory</em> – the component that handles propositions like &#8220;the capital of Ohio is Columbus&#8221; or &#8220;the typical elephant hates snakes&#8221; or procedural knowledge like &#8220;the first step in making chocolate chip cookies is to let some butter soften to room temperature.&#8221;</p>
<p class="MsoNormal">The symbolic knowledge base is not the whole story of human-like memory.<span> </span>I believe that we have other – probably separate – memory mechanisms to deal with images and visual patterns (2D, 3D, and 4D), sounds and music, smells and tastes, low-level motion planning, and probably several other things.<span> </span>I will have more to say about these other memory components in a future article.<span> </span>Nor is this symbolic KBS meant to be an intelligent system in its own right: as we will see, the KBS is more than a passive repository of facts, but it is only the base upon which we can build higher-level planning and cognition, and intelligent applications of all kinds.</p>
<p class="MsoNormal">Some AI researchers, psychologists, and philosophers would argue that it makes no sense to study the human memory system (or an artificial KBS) in isolation, as a separate component distinct from the processes that use the knowledge.<span> </span>They argue that intelligence and &#8220;consciousness&#8221; must be considered holistically if we are to make any sense of what is going on.<span> </span>I disagree: I believe that &#8220;divide and conquer&#8221; is a good approach whenever it can be applied without too much damage to the subject, and I believe that the division suggested here between the KBS and everything else is a natural one. My students and I have already seen ways in which understanding the KBS – what it contains and what it does – can help us<span> </span>to think more clearly about how the overall system works.</p>
<p class="MsoNormal">So with those caveats in mind, here we go:</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The knowledge base can be thought of as a <em>semantic network</em>, with <em>nodes</em> representing entities (physical objects like &#8220;elephant&#8221;, but also more abstract entities like &#8220;September&#8221; or &#8220;juggling&#8221;) and <em>links</em> representing the relational connections among these entities.<span> </span></strong>We refer to nodes and links, collectively, as <em>elements</em> in the KBS.<span> </span>As we will see, the semantic network we contemplate here is much more precise in its representation and reasoning than the early, sloppy semantic networks in which it was often impossible to determine whether a node labeled &#8220;elephant&#8221; represented the typical elephant, some specific elephant, the set of all elephants, or the word &#8220;elephant&#8221;.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33#footnote_1_33" id="identifier_1_33" class="footnote-link footnote-identifier-link" title="See William A. Woods, &amp;#8220;What&amp;#8217;s in a Link: Foundations for Semantic Networks&amp;#8221;. In D. Bobrow and A. Collins (eds.),&nbsp;Representation and Understanding: Studies in Cognitive Science, New York: Academic Press, 1975.">2</a>]</sup> <span> </span>We need to represent all of these things, but in distinct ways.</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>Elements in the KBS represent <em>concepts</em>, not words or word-definitions.<span> </span></strong>So in Scone (and, we would argue, in the memory system of humans) the &#8220;elephant&#8221; node represents the <em>concept</em> of elephant, and is the anchor point for our description of the typical elephant: its parts, properties, relations, and behaviors.<span> </span>This conceptual representation is independent of the particular human-language descriptions or other ways in which this knowledge was originally expressed.</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>A conceptual element may be associated with any number of <em>names</em> (words or phrases in some human language), and a name may be associated with many different <em>meanings</em> (concept nodes), so the mapping is many-to-many within a single human language.<span> </span></strong>Some concepts may have no names.<span> </span>The set of concept nodes doesn&#8217;t change as we switch from one human language to another, but each new language will provide a new set of names and associations.<strong> </strong><span> </span>If a word or phrase maps to more than one concept, we call it ambiguous or polysemous.<span> </span>This ambiguity must be resolved, at least tentatively, before we are ready to store the conceptual meaning of an utterance in the KBS.<span> </span>The conceptual representation does not preserve the original ambiguity, but represents <em>one</em> specific meaning.<span> </span>If a sentence is truly ambiguous we might store several possible meanings in the KBS, labeling them as distinct alternative choices.<span> </span>In any case, we will probably store the original natural-language form as well, at least for a short time, just in case we need to refer back to it to see if there could be an alternative meaning.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">Many linguists and some AI researchers blur this distinction between linguistic expressions and conceptual representations.<span> </span>Some even claim that no separate conceptual representation exists, and that no representation <em>can</em> exist that is independent of language.<span> </span>This radical view leads to all sorts of problems and paradoxes.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33#footnote_2_33" id="identifier_2_33" class="footnote-link footnote-identifier-link" title="An excellent account of some of these problems can be found in Chapter 3 of&nbsp;Steven Pinker&amp;#8217;s&nbsp;The Stuff of Thought, Viking Press, 2007.">3</a>]</sup></p>
<p class="MsoNormal" style="margin-left: 0.25in;">The <a href="http://csc.media.mit.edu/">Commonsense Computing</a> project at MIT (affiliated with the <a href="http://www.openmind.org/">Open Mind Project</a>) has advocated building and using a knowledge base that is a collection of undigested English sentences.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33#footnote_3_33" id="identifier_3_33" class="footnote-link footnote-identifier-link" title="Their effort now seems to have broadened to encompass both raw-English representations and more structured ones.">4</a>]</sup> <span> </span>Such a collection has its uses, but I believe that a language-independent, concept-level representation is essential for human-like thought.<span> </span>It is too hard to reason effectively with raw English expressions, full of ambiguities and unresolved references.<span> </span>In any case, we need a mechanism that can reason about the many concepts that, in a given human language, do not have conventional names.<span> </span>For example, we can easily represent and reason about the concept of &#8220;juggling while standing on one foot, blindfolded, during a hurricane&#8221; – concluding, say, that this activity is much more difficult than conventional juggling – but I would be very surprised to find a human language or subculture that has a name for this activity.</p>
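<p class="MsoNormal" style="margin-left: 0.25in;">To make the many-to-many mapping between names and concepts concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not Scone's actual data structures: the <code>Lexicon</code> class, its method names, and the <code>concept:...</code> identifiers are all hypothetical. The point it shows is that the set of concepts stays fixed while each language contributes its own names, and that a name with more than one meaning is flagged as ambiguous.</p>

```python
from collections import defaultdict


class Lexicon:
    """Hypothetical many-to-many map between (language, name) and concept ids."""

    def __init__(self):
        self._concepts_by_name = defaultdict(set)
        self._names_by_concept = defaultdict(set)

    def add_name(self, language, name, concept):
        # Names are stored lower-cased so lookups are case-insensitive.
        key = (language, name.lower())
        self._concepts_by_name[key].add(concept)
        self._names_by_concept[concept].add(key)

    def meanings(self, language, name):
        """All concepts that this name may denote in the given language."""
        return set(self._concepts_by_name.get((language, name.lower()), set()))

    def names(self, concept, language):
        """All names for a concept in one language (possibly none)."""
        keys = self._names_by_concept.get(concept, set())
        return {name for (lang, name) in keys if lang == language}

    def is_ambiguous(self, language, name):
        return len(self.meanings(language, name)) >= 2


lexicon = Lexicon()
lexicon.add_name("en", "bank", "concept:financial-institution")
lexicon.add_name("en", "bank", "concept:river-edge")
lexicon.add_name("en", "elephant", "concept:elephant")
lexicon.add_name("de", "Elefant", "concept:elephant")  # same concept, new language

print(lexicon.is_ambiguous("en", "bank"))
print(lexicon.names("concept:elephant", "de"))
print(lexicon.names("concept:blindfolded-hurricane-juggling", "en"))  # unnamed concept
```

<p class="MsoNormal" style="margin-left: 0.25in;">An unnamed concept simply has an empty set of names in every language, yet it can still be stored and reasoned about through its identifier.</p>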
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The KBS must support at least simple kinds of search and inference; it cannot be a passive repository of entities and assertions.<span> </span></strong>As described in the previous article, human memory <em>appears to contain</em> much more information than it <em>explicitly contains</em>; the extra information must be supplied by access-time inference, even though we have no subjective experience of doing significant mental work as these inferences take place.<span> </span>If we are told that Clyde is an elephant, we immediately appear to know that Clyde is gray, four-legged, has a liver, and so on.<span> </span>Our KBS must provide a similar capability for simple inference.<span> </span>It must also be able to support moderately complex searches, such as finding all the matching elements in the KB, given a list of features, even if the chosen elements have those features only by inheritance and not directly.<span> </span>We could do this search and inference in a separate module, distinct from the KBS, but it is much more efficient to build these operations into the KBS itself – then we can organize the representations in the knowledge base to speed up these operations.</p>
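<p class="MsoNormal" style="margin-left: 0.25in;">The feature search described above can be sketched naively as follows. This is an assumption-laden toy, not how Scone performs the search: the <code>KB</code> dictionary layout, the function names, and the extra elephant &#8220;Ernie&#8221; are invented for illustration. It shows only the required behavior, namely that an element matches even when it has the requested features purely by inheritance.</p>

```python
# Hypothetical toy knowledge base: each entry lists its is-a parents
# and the features asserted directly on it.
KB = {
    "animal":   {"parents": [],           "features": {"alive"}},
    "elephant": {"parents": ["animal"],   "features": {"gray", "four-legged"}},
    "mouse":    {"parents": ["animal"],   "features": {"gray", "small"}},
    "Clyde":    {"parents": ["elephant"], "features": {"skateboarder"}},
    "Ernie":    {"parents": ["elephant"], "features": set()},
}


def all_features(name, kb=KB):
    """Features of an element, including everything inherited via is-a."""
    entry = kb[name]
    features = set(entry["features"])
    for parent in entry["parents"]:
        features |= all_features(parent, kb)
    return features


def find_matching(required, kb=KB):
    """Names of all elements that have every required feature."""
    required = set(required)
    return sorted(name for name in kb if required.issubset(all_features(name, kb)))


print(find_matching({"gray", "four-legged"}))
print(find_matching({"gray", "skateboarder"}))
```

<p class="MsoNormal" style="margin-left: 0.25in;">This version recomputes inherited features for every element on every query, which is exactly the kind of cost that motivates building search into the KBS and organizing the representation around it.</p>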
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>For human-like performance, the KBS must be capable of scaling up to a large size.<span> </span></strong>As we saw above, the amount of information implicit in the knowledge base, and easily accessible to the KBS, is much larger than the amount of information explicitly present.<span> </span>But, even so, a large amount of explicit knowledge is still required.<span> </span>My best guess as to the size of the <em>symbolic</em> memory in a typical human is something like 10 million to 100 million elements (i.e. node and link equivalents, however the brain actually stores these).<span> </span>We will revisit this question in a future article.<span> </span>For now, the point is that the storage and inference machinery in our KBS had better be able to scale up to sizes like this.</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The KBS <em>does not</em> have to directly support more complex forms of reasoning, such as general theorem proving, optimal planning, Bayesian combination of probabilities, or puzzle solving.</strong><span> </span>I would argue that these operations are <em>not</em> required for general human-like intelligence.<span> </span>Some humans have learned to do these things, and they are important capabilities for solving certain problems.<span> </span>But the vast majority of humans have not mastered these skills, and these people get along just fine, even in our complex society.<span> </span>Of course, these things may be going on deep in our cognitive machinery, even if we have no conscious access to them, but I see no evidence of that.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">So, we may want to provide a theorem-proving or optimal-planning module as part of an AI application, but I do not believe that we need to build such capabilities into the KBS itself.<span> </span>As a general rule of thumb, we want the KBS to handle all the memory-related operations that appear to be almost effortless for humans, and not those things that require deep thought, advanced education, or a pencil and paper.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">This is a controversial view.<span> </span>Many AI researchers equate inference with theorem-proving, almost as an article of faith.<span> </span>Their inference systems provide guarantees of logical completeness and provable consistency, but they pay a terrible price for this: if they cling to these guarantees, they cannot have a KBS that is both sufficiently expressive for human-like reasoning and computationally tractable – that is, able to scale up to a large size without exponential growth in the time required for inference.<span> </span>So we take the radical (to some) step of separating the fast, simple inference that is integral to the KBS from the more general kinds of theorem proving that we view as optional.<span> </span>I will have more to say about this topic in a future article.</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The backbone of the network representation is the<em> is-a hierarchy</em>.<span> </span></strong>The is-a relation determines class membership and is also the main pathway for inheritance of properties and descriptions.<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=33#footnote_4_33" id="identifier_4_33" class="footnote-link footnote-identifier-link" title="Many knowledge-base systems, such as&nbsp;Cyc, use distinct link-types for &amp;#8220;is a kind of&amp;#8221; and &amp;#8220;is an instance of&amp;#8221;.&nbsp;In Scone we just use &amp;#8220;is-a&amp;#8221; for both of these, since these link-types behave in very similar ways. The type/instance distinction is represented in a different way.">5</a>]</sup> <span> </span>So if there is a chain of is-a links from &#8220;Clyde&#8221; to &#8220;elephant&#8221; to &#8220;mammal&#8221; to &#8220;vertebrate&#8221; to &#8220;physical object&#8221; to &#8220;thing&#8221;, we know that Clyde is an instance of each of these types, and that Clyde should inherit the properties of the typical member of each of these types.<span> </span>If the typical vertebrate has a liver, Clyde has a liver.</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>Multiple inheritance – the ability for a node to have is-a links to more than one immediate superior (more general) class – is essential in a KBS designed for human-like reasoning.<span> </span></strong>In addition to being an elephant, Clyde may be a male, an adult, a circus performer, and an expert skateboarder, inheriting properties and other information from each of these descriptions.<span> </span>Some classes are defined as intersections of others: a &#8220;boy&#8221; is a child, male, human, while a &#8220;woman&#8221; is an adult, female, human.<span> </span>The type hierarchy in a programming language like Java may (barely) get away with only single inheritance, but for representing knowledge, multiple inheritance is crucial.</p>
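<p class="MsoNormal" style="margin-left: 0.25in;">A small Python sketch may help to picture the is-a hierarchy with multiple inheritance. Everything here is an assumption made for illustration (the <code>Node</code> class, the property names, and the breadth-first lookup order); it is not Scone's implementation. It shows Clyde inheriting from two immediate superior classes and picking up a property asserted several levels up the chain.</p>

```python
class Node:
    """Hypothetical KB node with any number of is-a parents."""

    def __init__(self, name, parents=(), **properties):
        self.name = name
        self.parents = list(parents)
        self.properties = dict(properties)

    def ancestors(self):
        """Yield this node and all superior classes, nearest first."""
        seen, queue = set(), [self]
        while queue:
            node = queue.pop(0)
            if node.name in seen:
                continue
            seen.add(node.name)
            yield node
            queue.extend(node.parents)

    def is_a(self, other):
        return any(node is other for node in self.ancestors())

    def lookup(self, prop):
        """Return the value from the nearest node that asserts the property."""
        for node in self.ancestors():
            if prop in node.properties:
                return node.properties[prop]
        return None


thing = Node("thing")
physical_object = Node("physical object", [thing])
vertebrate = Node("vertebrate", [physical_object], has_liver=True)
mammal = Node("mammal", [vertebrate])
elephant = Node("elephant", [mammal], color="gray", legs=4)
circus_performer = Node("circus performer", [thing], works_in="circus")

# Clyde has two immediate superior classes.
clyde = Node("Clyde", [elephant, circus_performer])

print(clyde.is_a(vertebrate), clyde.lookup("has_liver"))
print(clyde.lookup("color"), clyde.lookup("works_in"))
```

<p class="MsoNormal" style="margin-left: 0.25in;">With single inheritance, Clyde could be an elephant or a circus performer, but not both, without duplicating one of the two descriptions.</p>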
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The KBS must support default reasoning with exceptions.</strong><span> </span>When we hear that Clyde is an elephant, we assume (because of inheritance) that Clyde is gray, has four legs, and eats only vegetation.<span> </span>Default reasoning of this kind is very important in any KBS – it&#8217;s a big part of the &#8220;value added&#8221; by including some inference mechanisms in the KBS.<span> </span>However, for a real-world KBS, there are very few general statements that do not have at least a few exceptions.<span> </span>We may be told explicitly that, unlike typical elephants, Clyde is white, is three-legged, and that he enjoys an occasional hamburger, but despite all that, Clyde is still an elephant and he still inherits all the other elephant properties.<span> </span>So we must be able to add these specific statements about Clyde and cancel or over-ride the conflicting default properties that would otherwise be inherited.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">The use of exceptions is controversial in the knowledge representation community.<span> </span>Almost everyone agrees that exceptions are potentially very useful, and perhaps unavoidable, for real-world reasoning.<span> </span>But for knowledge representation systems whose inference is based on theorem-proving, this capability imposes a cost that many researchers consider unacceptable: any use of exceptions takes you into the realm of non-monotonic logic, which introduces many new problems and greatly increases the computational cost of inference.<span> </span>So most current knowledge representation systems exclude exceptions: if you add a statement to the knowledge base, it must be <em>universally</em> true; statements that admit exceptions, if they are allowed at all, are banished to some disreputable annex of the KBS, outside the purview of the core inference machinery.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">Controversial it may be, but for human-like performance on real-world knowledge, there is really no option: statements with exceptions are too common to ignore or to consign to second-class status.<span> </span>There are too many flightless birds, white elephants, meat-eating plants, and honest politicians in the world – they may be rare, but we can’t just label them as inconceivable, nor can we give up the default reasoning that handles the non-exceptional cases.<span> </span>So we must bite the bullet and allow default reasoning with exceptions, whatever the cost.<span> </span>And, if we don&#8217;t cling to the assumptions of inference-as-theorem-proving, the cost is not so bad.</p>
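<p class="MsoNormal" style="margin-left: 0.25in;">Outside a theorem-proving framework, default reasoning with exceptions can be sketched very simply: the most specific statement wins, and a default can be cancelled outright. The code below is a minimal illustration under assumed names (<code>Concept</code>, <code>assert_property</code>, <code>cancel</code>, and the <code>CANCELLED</code> marker are hypothetical), not a description of Scone's machinery.</p>

```python
CANCELLED = object()  # marker meaning "the inherited default does not apply here"


class Concept:
    """Hypothetical node whose local statements override inherited defaults."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.local = {}

    def assert_property(self, prop, value):
        self.local[prop] = value

    def cancel(self, prop):
        self.local[prop] = CANCELLED

    def get(self, prop):
        # Walk up the is-a chain; the first (most specific) entry decides.
        node = self
        while node is not None:
            if prop in node.local:
                value = node.local[prop]
                return None if value is CANCELLED else value
            node = node.parent
        return None


elephant = Concept("elephant")
elephant.assert_property("color", "gray")
elephant.assert_property("legs", 4)
elephant.assert_property("diet", "vegetation only")
elephant.assert_property("has_trunk", True)

clyde = Concept("Clyde", parent=elephant)
clyde.assert_property("color", "white")  # exception overrides the default
clyde.assert_property("legs", 3)         # exception overrides the default
clyde.cancel("diet")                     # default cancelled, nothing asserted instead

print(clyde.get("color"), clyde.get("legs"), clyde.get("diet"), clyde.get("has_trunk"))
print(elephant.get("color"))  # the typical elephant is unaffected
```

<p class="MsoNormal" style="margin-left: 0.25in;">Clyde remains an elephant and still inherits every property that was not explicitly overridden or cancelled.</p>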
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The KBS must support meta-information and meta-reasoning about the statements themselves; statements must be first-class objects with properties that we can reason about.<span> </span></strong>For many problem domains, it is very important to record certain information about the statements themselves: where they came from, what probability or level of confidence we assign to each, what various analysts and critics have said about specific statements or groups of statements, and so on.<span> </span>And, having reasoned about what statements we choose to believe in a given situation, we then must be able to go ahead and use only the &#8220;believed&#8221; statements in our reasoning.<span> </span>This kind of meta-reasoning is particularly important in domains such as military intelligence, in which the task is to draw conclusions from uncertain and often conflicting reports, but it comes up in all kinds of human reasoning – even in children&#8217;s stories – and we humans are very good at this.<strong> </strong></p>
<p class="MsoNormal" style="margin-left: 0.25in;">Once again, this choice is controversial in the knowledge representation community, since it takes us beyond the comfortable and well-understood domain of first-order logic.<span> </span>But, once again, it is unavoidable in a KBS designed for human-like reasoning in real-world situations.<strong> </strong></p>
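<p class="MsoNormal" style="margin-left: 0.25in;">One way to picture statements as first-class objects is to attach the meta-information directly to each statement and then reason only over the statements we have chosen to believe. The sketch below is hypothetical throughout: the <code>Statement</code> fields, the <code>believed</code> filter, and the sample reports are assumptions made for illustration, not Scone's representation.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Hypothetical statement object carrying its own meta-information."""
    subject: str
    relation: str
    obj: str
    source: str = "unknown"
    confidence: float = 1.0
    notes: list = field(default_factory=list)  # analyst comments, critiques, etc.


def believed(statements, min_confidence=0.5, distrusted_sources=()):
    """Meta-reasoning step: choose which statements to accept."""
    return [
        s for s in statements
        if s.confidence >= min_confidence and s.source not in distrusted_sources
    ]


def query(statements, subject, relation):
    """Object-level reasoning step: uses whatever statements it is given."""
    return {s.obj for s in statements if s.subject == subject and s.relation == relation}


reports = [
    Statement("bridge", "status", "intact", source="scout report", confidence=0.9),
    Statement("bridge", "status", "destroyed", source="enemy broadcast", confidence=0.6),
]
reports[1].notes.append("Analyst: likely disinformation.")

print(query(reports, "bridge", "status"))  # conflicting raw reports
trusted = believed(reports, distrusted_sources={"enemy broadcast"})
print(query(trusted, "bridge", "status"))  # only the believed statements
```

<p class="MsoNormal" style="margin-left: 0.25in;">The two steps are deliberately separate: first decide what to believe, then reason with only that subset.</p>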
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>Each entity (node) in memory is a <em>description</em>; each subtype or instance is a <em>virtual copy</em> of that description.<span> </span></strong>So far, we have spoken of properties being inherited through the is-a hierarchy.<span> </span>In fact, each entity (node) in memory is potentially a full description: it contains not just membership in superior classes and some properties, but also relations, parts and components, and relationships among those components.<span> </span>So, for example, the &#8220;elephant&#8221; node (representing the typical elephant) has a &#8220;predominant color&#8221; property of &#8220;gray&#8221;, one head, one nose (connected to the head), four legs, a &#8220;hates&#8221; relation to the &#8220;typical snake&#8221; node, and some typical activities such as eating and breathing.<strong> </strong></p>
<p class="MsoNormal" style="margin-left: 0.25in;">When we say that Clyde is an elephant, we want Clyde to suddenly become a virtual copy of the typical-elephant description: henceforth, as far as queries to the KBS are concerned, we want Clyde to behave as if we had copied the entire &#8220;elephant&#8221; description, replacing &#8220;elephant&#8221; with &#8220;Clyde&#8221;.<span> </span>However, we don&#8217;t want to make an actual copy: that requires a lot of effort, wastes memory in the KBS, and creates the maintenance problem of keeping many &#8220;elephant&#8221; copies up to date as we learn new things about the typical elephant.<span> </span>So this must be a <em>virtual copy</em>: the inference machinery built into the KBS must make it appear that the copy has been made, without actually doing that.<span> </span>It can be tricky to get this right, but it is really the fundamental operation in any well-behaved KBS, and a bit of introspection will persuade you that your own memory system is also creating virtual copies, or something very similar.</p>
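<p class="MsoNormal" style="margin-left: 0.25in;">The difference between a real copy and a virtual copy can be sketched in a few lines of Python. The <code>Description</code> class and its methods are hypothetical, and this toy handles only simple relation/value pairs rather than full structured descriptions. What it shows is that Clyde stores nothing locally, answers queries as if the elephant description had been copied and renamed, and sees later changes to the typical elephant immediately.</p>

```python
class Description:
    """Hypothetical node that acts as a virtual copy of its prototype."""

    def __init__(self, name, prototype=None):
        self.name = name
        self.prototype = prototype
        self.facts = set()  # (relation, value) pairs stored locally

    def add(self, relation, value):
        self.facts.add((relation, value))

    def statements(self):
        """Everything that holds for this node, computed at query time.

        Inherited facts are reported with this node's own name as subject,
        as if the prototype's description had been copied and renamed.
        """
        result = set()
        node = self
        while node is not None:
            for relation, value in node.facts:
                result.add((self.name, relation, value))
            node = node.prototype
        return result


elephant = Description("elephant")
elephant.add("predominant color", "gray")
elephant.add("has part", "head")
elephant.add("hates", "typical snake")

clyde = Description("Clyde", prototype=elephant)  # no facts are copied here

print(len(clyde.facts))
print(("Clyde", "hates", "typical snake") in clyde.statements())

# Learning something new about the typical elephant needs no update to Clyde.
elephant.add("typical activity", "breathing")
print(("Clyde", "typical activity", "breathing") in clyde.statements())
```

<p class="MsoNormal" style="margin-left: 0.25in;">The cost has moved from storage and maintenance to query time, which is why the efficiency of access-time inference matters so much.</p>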
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>The KBS must be able to support many distinct but overlapping world-models (<em>contexts</em>) at once.<span> </span></strong>We humans are quite adept at creating and<span> </span>working with multiple world-models.<span> </span>We use these in many ways: exploring possible futures, alternative or &#8220;what if&#8221; possibilities (including entire fictional universes, such as the Harry Potter universe), and different states of the real world at different times.<span> </span>We also use these to represent the states of another human&#8217;s belief or knowledge, and to reason about what that person might conclude or might do, given that belief-set.<span> </span>From a young age, we are able to reason about situations like this:</p>
<p class="MsoNormal" style="margin-left: 0.5in;"><em>&#8220;The three pigs are safe inside the brick house.<span> </span>The wolf is outside, and the pigs know it.<span> </span>The wolf pretends to leave.<span> </span>The pigs believe this deception and open the door… You can guess what happens next!&#8221;</em></p>
<p class="MsoNormal" style="margin-left: 0.25in;">It should be clear that in order to understand this simple story, the reader must be able to reason about what the pigs know and believe (pre-deception), what they believe and act upon (post-deception), what the wolf intends the pigs to believe, and so on.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">So to achieve human-like performance, a KBS must be able to represent each of these distinct world models at once, moving freely between them and reasoning about what would be true in each.<span> </span>To entertain a &#8220;what if X?&#8221; hypothesis, we must be able to create a context in which X is true and then explore the consequences, but we do not want to mix X into the set of things that we believe to be true in current reality.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">Note that these contexts are distinct – each one differs from the others in some specific ways – but they overlap: in each of the contexts in the pig story, we know that brick-houses are wolf-proof unless the inhabitants open the door.<span> </span>So we want each new context to begin life as a virtual copy of some other context, and then we add or cancel statements as needed.<span> </span>We will have much more to say about this in future articles.</p>
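<p class="MsoNormal" style="margin-left: 0.25in;">The idea that each new context begins life as a virtual copy of another, with statements then added or cancelled, can be sketched as follows. The <code>Context</code> class and the plain-string statements are assumptions made purely for illustration; real contexts would hold structured KB elements rather than sentences.</p>

```python
class Context:
    """Hypothetical world-model that inherits from a parent context."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.added = set()
        self.cancelled = set()

    def add(self, statement):
        self.cancelled.discard(statement)
        self.added.add(statement)

    def cancel(self, statement):
        self.added.discard(statement)
        self.cancelled.add(statement)

    def holds(self, statement):
        # The nearest context that mentions the statement decides.
        ctx = self
        while ctx is not None:
            if statement in ctx.added:
                return True
            if statement in ctx.cancelled:
                return False
            ctx = ctx.parent
        return False


story = Context("story reality")
story.add("brick houses are wolf-proof while the door is shut")
story.add("the wolf is outside")

# The pigs' post-deception beliefs: a virtual copy with two changes.
pigs_after = Context("pigs' beliefs after the deception", parent=story)
pigs_after.cancel("the wolf is outside")
pigs_after.add("the wolf has left")

print(story.holds("the wolf is outside"), pigs_after.holds("the wolf is outside"))
print(pigs_after.holds("brick houses are wolf-proof while the door is shut"))
print(story.holds("the wolf has left"))
```

<p class="MsoNormal" style="margin-left: 0.25in;">The hypothesis lives only in the child context, so exploring its consequences never contaminates what we hold to be true in the parent.</p>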
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><span style="font-size: 14pt; font-family: Symbol;"><span>·<span> </span></span></span><!--[endif]--><strong>In addition to static knowledge, the KBS must support episodic memory: the ability to store and reason about events, actions, and plans.<span> </span></strong>Again, this is central to human reasoning ability.<span> </span>It appears that we use the same episodic-memory representation for many tasks.<span> </span>First, we need a way of representing and reasoning about event sequences that we have observed.<span> </span>In the course of a lifetime, we will collect a large number of these.<span> </span>Sometimes we will just know the sequence of events, or a partially-observed sequence; often (as with our own plans) we will know the higher-level motivations as well.<span> </span>Second, we can use the same representation for recognizing familiar plan-templates when we observe a few steps in the plan.<span> </span>Third, we can use all these stored plans and sequences as templates or recipes to guide our own planning, or our estimates of what some other actor may do next.<span> </span>We want our human-like KBS to have all these capabilities as well.</p>
<p class="MsoNormal" style="margin-left: 0.25in;">This is a very big topic that I will address in future articles.<span> </span>For now, let me just say that the virtual-copy mechanism will be crucial in representing actions, events, and plans at various levels of abstraction, and that the multiple-context mechanism will provide the means for reasoning about &#8220;before&#8221; and &#8220;after&#8221; world-models, and the state of the universe after each step in a plan.</p>
<p class="MsoNormal">So that is our preliminary wish-list for a KBS that can support human-like reasoning.<span> </span>There is much more to say about each of these points, and many more points I might have added, but I hope that this is enough to give you a general idea of what my students and I are trying to achieve in our work on knowledge representation in general and on Scone in particular.<span> </span>It is a rather different wish-list from that of most other researchers in the knowledge representation field, but we think it is a coherent and productive set of goals.</p>
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_33" class="footnote">For clarity, in this and future articles I will try to consistently refer to a specific collection of knowledge as a knowledge base (KB), and to the system that contains this knowledge, with some support for search and inference, as the knowledge-base system (KBS). Informally, it is common to refer to both with the term &#8220;knowledge base&#8221; – similar to the practice in the database world.</li><li id="footnote_1_33" class="footnote">See William A. Woods, &#8220;What&#8217;s in a Link: Foundations for Semantic Networks&#8221;. In D. Bobrow and A. Collins (eds.), Representation and Understanding: Studies in Cognitive Science, New York: Academic Press, 1975.</li><li id="footnote_2_33" class="footnote">An excellent account of some of these problems can be found in Chapter 3 of Steven Pinker&#8217;s <a href="http://pinker.wjh.harvard.edu/books/stuff/index.html">The Stuff of Thought</a>, Viking Press, 2007.</li><li id="footnote_3_33" class="footnote">Their effort now seems to have broadened to encompass both raw-English representations and more structured ones.</li><li id="footnote_4_33" class="footnote">Many knowledge-base systems, such as <a href="http://www.cyc.com/">Cyc</a>, use distinct link-types for &#8220;is a kind of&#8221; and &#8220;is an instance of&#8221;. In Scone we just use &#8220;is-a&#8221; for both of these, since these link-types behave in very similar ways. The type/instance distinction is represented in a different way.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=33</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Human-Like Memory Capabilities</title>
		<link>http://scone1.scone.cs.cmu.edu/nuggets/?p=32</link>
		<comments>http://scone1.scone.cs.cmu.edu/nuggets/?p=32#comments</comments>
		<pubDate>Tue, 17 Jun 2008 19:09:45 +0000</pubDate>
		<dc:creator>Scott Fahlman</dc:creator>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[KR Issues]]></category>

		<guid isPermaLink="false">http://sef-linux.radar.cs.cmu.edu/nuggets/?p=32</guid>
		<description><![CDATA[In an earlier article, &#8220;AI: What&#8217;s Missing&#8220;, I listed several capabilities that are missing from current AI systems – capabilities that must somehow be provided before our systems will be able to exhibit anything approaching general, human-like intelligence. Perhaps the most important of these, because it interacts with several others, is &#8220;The ability to assimilate [...]]]></description>
				<content:encoded><![CDATA[<p class="MsoNormal">In an earlier article, &#8220;<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=29">AI: What&#8217;s Missing</a>&#8220;, I listed several capabilities that are missing from current AI systems – capabilities that must somehow be provided before our systems will be able to exhibit anything approaching general, human-like intelligence.<span> </span>Perhaps the most important of these, because it interacts with several others, is <em>&#8220;<strong><span style="font-weight: normal;">The ability to assimilate and store large amounts of symbolic knowledge and to make that knowledge effective.&#8221;</span></strong></em><strong><span style="font-weight: normal;"><span> </span>In this article, we will explore what it means when we talk about “human-like” intelligence and “human-like” memory.</span></strong><strong> </strong></p>
<p class="MsoNormal">A key point: when I speak of human-like intelligence, I am talking about human-like <em>capabilities</em>, not necessarily human-like <em>implementation</em>.<span> </span>I want to focus on the breadth, flexibility, and the amazing search and inference capabilities of human memory, and not necessarily on how this is implemented in the marvelous pile of neuro-stuff that is the human brain.<span> </span>If we could understand the general principles and mechanisms of human memory, at some level of abstraction, that would guide us both in trying to understand what the structures of the brain are doing, and also in trying to implement similar capabilities using the very different electronic hardware that is available today – or it would tell us what kind of hardware we really need in order to do this job.</p>
<p class="MsoNormal">So I suggest that we – some of us, anyway – approach this problem not as neuroscientists or as cognitive psychologists, but as computer scientists: Here is what the system must do; we know that this is possible (humans can do it), so what algorithms, data structures, and organizing principles are required to do the job?<span> </span>Oh, and as computer scientists, we must understand this well enough to implement it – no hand-waving or vague metaphors about shadows on the wall of a cave are allowed in our final theory, though we might make good use of metaphorical reasoning along the way.<span> </span>That is the challenge.</p>
<p class="MsoNormal">Note that it’s possible – some would say probable, in this case – that we will come up with models that cannot be implemented, in any practical sense, on today’s computing hardware.<span> </span>That’s OK.<span> </span>But if we come up with the answer that the problem is computationally impossible or intractable in some fundamental way, we need to look again at how we are framing the problem, because <em>at least one solution exists</em>.</p>
<p class="MsoNormal">(Well, the brain could be using magic or some as-yet-undiscovered kind of physics, but we should only accept that conclusion if we’re absolutely certain, through proof or exhaustive search, that no solution exists in computation as we know it.<span> </span>And, today, we are very far from being able to make that claim, despite the ravings of some dilettante physicists and mathematicians who have wandered into the field.)</p>
<p class="MsoNormal">When I say that we are focused on the capabilities of human memory and not on implementation, I don’t mean to suggest that we should ignore the findings of neuroscientists and cognitive psychologists, just that we should not be constrained by them.<span> </span>There may be many possible ways to build a human-like memory, but at present we have only one example, so we should learn what we can from it.<span> </span>A few examples:</p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><strong><span style="font-family: Symbol;"><span>·<span> </span></span></span></strong><!--[endif]-->The brain <em>apparently</em> does what it does using millisecond-speed components (neuron firings and protein bindings), but with a huge degree of parallelism.<span> </span>So that’s a clue.<span> </span></p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><strong><span style="font-family: Symbol;"><span>·<span> </span></span></span></strong><!--[endif]-->While access to the human memory is fast, creating new memories is much slower, at least in the long-term, symbolic part of memory – maybe one or a few new items can be permanently saved per second.<span> </span></p>
<p class="MsoNormal" style="margin-left: 0.25in; text-indent: -0.25in;"><!--[if !supportLists]--><strong><span style="font-family: Symbol;"><span>·<span> </span></span></span></strong><!--[endif]-->In humans (especially in aging computer science professors) some memories just vanish, but for many more we seem to lose some of the access threads, or these become inactive until something “reminds” us of the item we are looking for; then we can easily repair the thread.<span> </span>I might be able to picture a certain actor who has appeared in recent pirate movies and remember a dozen other movies he has been in, but be unable to access his name until someone says, “Rhymes with ‘strep’.”<span> </span>(We might prefer to construct an artificial intelligence without this partial-forgetting phenomenon, making it super-human in that respect.<span> </span>That might work, or we might find that this decay of access paths is not a hardware limitation, but that it is somehow essential to the operation of a human-like memory.)</p>
<p class="MsoNormal">So what are the important aspects of human-like symbolic memory<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=32#footnote_0_32" id="identifier_0_32" class="footnote-link footnote-identifier-link" title="I am not suggesting that&nbsp;symbolic&nbsp;memory &ndash; the things we humans can easily describe in natural language &ndash; is all that there is.&nbsp;I&rsquo;ll have more to say about visual, sound, and motor memory in future articles.&nbsp;But, for now, I think that it&rsquo;s best to focus on the most accessible and familiar part of our memory.">1</a>]</sup> that I am so eager to capture?<span> </span>Let me try to illustrate the problem here with an example or two, and in a near-future article I will try to enumerate some particular abilities and features.<span> </span>Here is a description I&#8217;ve been using for a long time, in various forms.<span> </span>It first appeared in my 1977 Ph.D. thesis<sup>[<a href="http://scone1.scone.cs.cmu.edu/nuggets/?p=32#footnote_1_32" id="identifier_1_32" class="footnote-link footnote-identifier-link" title="Reprinted in book form by MIT Press:&nbsp;NETL: A System for Representing and Using Real-World Knowledge, Scott E. Fahlman, 1979.&nbsp; In addition, there&#8217;s a scanned version of the tech-report form of the thesis online at&nbsp;ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-450.pdf.">2</a>]</sup> at the MIT AI Lab:</p>
<p class="MsoNormal"><em>Suppose I tell you that a certain animal – let&#8217;s call him </em><em>Clyde</em><em> – is an elephant.<span> </span>You accept this simple assertion and file it away with no apparent display of mental effort.<span> </span>And yet, as a result of this transaction, you suddenly appear to know a great deal about </em><em>Clyde</em><em>.<span> </span>You can tell me, with a fair degree of certainty, how many legs he has, what color he is, and whether he would be a good pet in a small third-floor apartment.<span> </span>You know not only that he has eyes, but what they are used for, and what it implies if they are closed.<span> </span>If I try to tell you that </em><em>Clyde</em><em> builds his nest in a tree or that he amuses himself by hiding in a teacup, you will immediately begin to doubt my credibility.<span> </span>And you can do this very quickly and easily, with none of the sort of apparent mental effort that would accompany, say, adding two four-digit numbers.<span> </span>This effortlessness may be an illusion, but it is a compelling one.</em></p>
<p class="MsoNormal"><em>“Elephant”, of course, is not the only concept that behaves in this way.<span> </span>The average person knows a huge number of concepts of comparable or greater complexity – the number is probably in the millions.<span> </span>Consider for a moment the layers of structure and meaning that are attached to concepts like lawsuit, birthday party, fire, mother, walrus, cabbage, or king.<span> </span>These are words we use casually in our daily lives, and yet each of them represents a very substantial package of information.<span> </span>In technical fields (except, perhaps, for the more austere parts of mathematics), the situation is the same.<span> </span>Consider how much you would have to tell someone in order to fully convey the meaning of concepts like meson, oscillator, hash-table, valence, ribosome, or leukemia.<span> </span>And yet, once these concepts are built up, they can be tossed around with abandon and can be used as the building blocks for concepts of even greater complexity.</em></p>
<p class="MsoNormal"><em>The point is not just that we can handle large chunks of knowledge as though they were atoms; the important thing is that we can find our way through these complex, nested structures to whatever fact or relationship we might need at any given time, that we can do this in a very flexible way, and that we can somehow avoid having to look individually at each of the vast number of facts that could be – but are not – relevant to the problem at hand.<span> </span>If I tell you that a house burned down, and that the fire started at a child’s birthday party, you will think immediately of the candles on the cake and perhaps of paper decorations.<span> </span>You will not, in all probability, find yourself thinking about playing pin-the-tail-on-the-donkey, or about the color of the cake’s icing or about the fact that birthdays come once a year.<span> </span>These concepts are there when you need them, but they do not seem to slow down the search for a link between fires and birthday parties.<span> </span>If, hidden away somewhere, there is a sequential search for this connection, that search is remarkably quick and efficient, and it does not become noticeably slower as the knowledge base expands to its adult proportions.</em></p>
<p class="MsoNormal">So that’s the challenge, as I see it: To build a memory system or knowledge base that can support operations of this kind, on a very large scale, and that gives us access not only to the knowledge that is <em>explicitly</em> present, but also to the much larger body of knowledge that is <em>implicitly</em> present.<span> </span>This <em>active knowledge base</em> is not, by itself, an intelligent system, but it is the key missing piece – the piece that ties together our sensory systems, higher-level problem-solving modules, natural language processing, and all of our experiences: past, present, and imagined-future.</p>
<p class="MsoNormal">In future articles we will try to break down this problem and will begin to explore how we might implement a system of this kind.</p>
---------------------------
  
  <ol class="footnotes"><li id="footnote_0_32" class="footnote">I am not suggesting that symbolic memory – the things we humans can easily describe in natural language – is all that there is. I’ll have more to say about visual, sound, and motor memory in future articles. But, for now, I think that it’s best to focus on the most accessible and familiar part of our memory.</li><li id="footnote_1_32" class="footnote">Reprinted in book form by MIT Press: <a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;tid=9750">NETL: A System for Representing and Using Real-World Knowledge</a>, Scott E. Fahlman, 1979.  In addition, there&#8217;s a scanned version of the tech-report form of the thesis online at <a title="ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-450.pdf" href="ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-450.pdf" target="_blank">ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-450.pdf</a>.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://scone1.scone.cs.cmu.edu/nuggets/?feed=rss2&#038;p=32</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
