Sunday 14 June 2015

History of Artificial Intelligence

The gestation of artificial intelligence

The first work that is now generally recognized as AI was done by Warren McCulloch and Walter Pitts (1943). They drew on three sources: knowledge of the basic physiology and function of neurons in the brain; a formal analysis of propositional logic due to Russell and Whitehead; and Turing's theory of computation. They proposed a model of artificial neurons in which each neuron is characterized as being "on" or "off," with a switch to "on" occurring in response to stimulation by a sufficient number of neighboring neurons.
The state of a neuron was conceived of as "factually equivalent to a proposition which proposed its adequate stimulus." They showed, for example, that any computable function could be computed by some network of connected neurons, and that all the logical connectives (and, or, not, etc.) could be implemented by simple net structures. McCulloch and Pitts also suggested that suitably defined networks could learn. Donald Hebb (1949) demonstrated a simple updating rule for modifying the connection strengths between neurons. His rule, now called Hebbian learning, remains an influential model to this day.
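
In modern terms, a McCulloch-Pitts unit is simply a threshold test over weighted binary inputs, and the logical connectives correspond to particular choices of weights and threshold; Hebb's rule then strengthens a connection when the units it joins are active together. The following Python sketch illustrates the idea in today's notation rather than the original 1943 or 1949 formalism; the function names and the learning-rate value are illustrative assumptions.

    # A McCulloch-Pitts-style threshold unit (a modern illustration, not the 1943 notation).
    def mp_neuron(inputs, weights, threshold):
        # Fire (return 1) iff the weighted sum of the inputs reaches the threshold.
        return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

    # The logical connectives realized as single threshold units.
    AND = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
    OR  = lambda a, b: mp_neuron([a, b], [1, 1], threshold=1)
    NOT = lambda a:    mp_neuron([a],    [-1],   threshold=0)

    assert AND(1, 1) == 1 and AND(1, 0) == 0
    assert OR(0, 1) == 1 and OR(0, 0) == 0
    assert NOT(1) == 0 and NOT(0) == 1

    # Hebb's idea in modern form: strengthen a weight when the presynaptic input
    # and the postsynaptic output are active together (learning rate lr is an assumption).
    def hebbian_update(weights, inputs, output, lr=0.1):
        return [w + lr * x * output for w, x in zip(weights, inputs)]
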
Two undergraduate students at Harvard, Marvin Minsky and Dean Edmonds, built the first neural network computer in 1950. The SNARC, as it was called, used 3000 vacuum tubes and a surplus automatic pilot mechanism from a B-24 bomber to simulate a network of 40 neurons. Later, at Princeton, Minsky studied universal computation in neural networks. His Ph.D. committee was skeptical about whether this kind of work should be considered mathematics, but von Neumann reportedly said, "If it isn't now, it will be someday." Minsky was later to prove influential theorems showing the limitations of neural network research. There were a number of early examples of work that can be characterized as AI, but it was Alan Turing who first articulated a complete vision of AI in his 1950 article "Computing Machinery and Intelligence." Therein, he introduced the Turing test, machine learning, genetic algorithms, and reinforcement learning.

The birth of artificial intelligence

Princeton was home to another influential figure in AI, John McCarthy. After graduation, McCarthy moved to Dartmouth College, which was to become the official birthplace of the field. McCarthy convinced Minsky, Claude Shannon, and Nathaniel Rochester to help him bring together U.S. researchers interested in automata theory, neural nets, and the study of intelligence. They organized a two-month workshop at Dartmouth in the summer of 1956. There were 10 attendees in all, including Trenchard More from Princeton, Arthur Samuel from IBM, and Ray Solomonoff and Oliver Selfridge from MIT.

Two researchers from Carnegie Tech, Allen Newell and Herbert Simon, rather stole the show. Although the others had ideas and in some cases programs for particular applications such as checkers, Newell and Simon already had a reasoning program, the Logic Theorist (LT), about which Simon claimed, "We have invented a computer program capable of thinking non-numerically, and thereby solved the venerable mind-body problem."

The Dartmouth workshop did not lead to any new breakthroughs, but it did introduce all the major figures to each other. For the next 20 years, the field would be dominated by these people and their students and colleagues at MIT, CMU, Stanford, and IBM. Perhaps the longest-lasting thing to come out of the workshop was an agreement to adopt McCarthy's new name for the field: artificial intelligence. Perhaps "computational rationality" would have been better, but "AI" has stuck. Looking at the proposal for the Dartmouth workshop (McCarthy et al., 1955), we can see why it was necessary for AI to become a separate field. Why couldn't all the work done in AI have taken place under the name of control theory, or operations research, or decision theory, which, after all, have objectives similar to those of AI? Or why isn't AI a branch of mathematics? The first answer is that AI from the start embraced the idea of duplicating human faculties like creativity, self-improvement, and language use. None of the other fields were addressing these issues. The second answer is methodology. AI is the only one of these fields that is clearly a branch of computer science (although operations research does share an emphasis on computer simulations), and AI is the only field to attempt to build machines that will function autonomously in complex, changing environments.

Early enthusiasm, great expectations

The early years of AI were full of successes, in a limited way. Given the primitive computers and programming tools of the time, and the fact that only a few years earlier computers were seen as things that could do arithmetic and no more, it was astonishing whenever a computer did anything remotely clever. The intellectual establishment, by and large, preferred to believe that "a machine can never do X." (See Chapter 26 for a long list of X's gathered by Turing.) AI researchers naturally responded by demonstrating one X after another. John McCarthy referred to this period as the "Look, Ma, no hands!" era.

Newell and Simon's early success was followed up with the General Problem Solver, or GPS. Unlike Logic Theorist, this program was designed from the start to imitate human problem-solving protocols. Within the limited class of puzzles it could handle, it turned out that the order in which the program considered subgoals and possible actions was similar to that in which humans approached the same problems. Thus, GPS was probably the first program to embody the "thinking humanly" approach. The success of GPS and subsequent programs as models of cognition led Newell and Simon (1976) to formulate the famous physical symbol system hypothesis, which states that "a physical symbol system has the necessary and sufficient means for general intelligent action." What they meant is that any system (human or machine) exhibiting intelligence must operate by manipulating data structures composed of symbols. We will see later that this hypothesis has been challenged from many directions.

At IBM, Nathaniel Rochester and his colleagues produced some of the first AI programs. Herbert Gelernter (1959) constructed the Geometry Theorem Prover, which was able to prove theorems that many students of mathematics would find quite tricky. Starting in 1952, Arthur Samuel wrote a series of programs for checkers (draughts) that eventually learned to play at a strong amateur level. Along the way, he disproved the idea that computers can do only what they are told to: his program quickly learned to play a better game than its creator. The program was demonstrated on television in February 1956, creating a very strong impression. Like Turing, Samuel had trouble finding computer time. Working at night, he used machines that were still on the testing floor at IBM's manufacturing plant.

John McCarthy moved from Dartmouth to MIT and there made three crucial contributions in one historic year: 1958. In MIT AI Lab Memo No. 1, McCarthy defined the high-level language Lisp, which was to become the dominant AI programming language. Lisp is the second-oldest major high-level language in current use, one year younger than FORTRAN. With Lisp, McCarthy had the tool he needed, but access to scarce and expensive computing resources was also a serious problem. In response, he and others at MIT invented time sharing. Also in 1958, McCarthy published a paper entitled Programs with Common Sense, in which he described the Advice Taker, a hypothetical program that can be seen as the first complete AI system. Like the Logic Theorist and Geometry Theorem Prover, McCarthy's program was designed to use knowledge to search for solutions to problems. But unlike the others, it was to embody general knowledge of the world. For example, he showed how some simple axioms would enable the program to generate a plan to drive to the airport to catch a plane. The program was also designed so that it could accept new axioms in the normal course of operation, thereby allowing it to achieve competence in new areas without being reprogrammed. The Advice Taker thus embodied the central principles of knowledge representation and reasoning: that it is useful to have a formal, explicit representation of the world and of the way an agent's actions affect the world, and to be able to manipulate these representations with deductive processes. It is remarkable how much of the 1958 paper remains relevant even today.

1958 also marked the year that Marvin Minsky moved to MIT. His initial collaboration with McCarthy did not last, however. McCarthy stressed representation and reasoning in formal logic, whereas Minsky was more interested in getting programs to work and eventually developed an anti-logical outlook. In 1963, McCarthy started the AI lab at Stanford. His plan to use logic to build the ultimate Advice Taker was advanced by J. A. Robinson's discovery of the resolution method. Work at Stanford emphasized general-purpose methods for logical reasoning. Applications of logic included Cordell Green's question-answering and planning systems (Green, 1969b) and the Shakey robotics project at the new Stanford Research Institute (SRI). The latter project was the first to demonstrate the complete integration of logical reasoning and physical activity.

Minsky supervised a series of students who chose limited problems that appeared to require intelligence to solve. These limited domains became known as microworlds. James Slagle's SAINT program (1963a) was able to solve closed-form calculus integration problems typical of first-year college courses. Tom Evans's ANALOGY program (1968) solved geometric analogy problems that appear in IQ tests, such as the one in Figure 1.4. Daniel Bobrow's STUDENT program (1967) solved algebra story problems, such as the following:

If the number of customers Tom gets is twice the square of 20 percent of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?
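
Worked out directly from the problem statement, the arithmetic STUDENT had to recover is:

    customers = 2 × (0.20 × 45)² = 2 × 9² = 2 × 81 = 162

so the intended answer is 162 customers.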

The most famous microworld was the blocks world, which consists of a set of solid blocks placed on a tabletop (or more often, a simulation of a tabletop), as shown in the figure. A typical task in this world is to rearrange the blocks in a certain way, using a robot hand that can pick up one block at a time. The blocks world was home to the vision project of David Huffman (1971), the vision and constraint propagation work of David Waltz (1975), the learning theory of Patrick Winston (1970), the natural language understanding program of Terry Winograd (1972), and the planner of Scott Fahlman (1974).

Early work building on the neural networks of McCulloch and Pitts also flourished. The work of Winograd and Cowan (1963) showed how a large number of elements could collectively represent an individual concept, with a corresponding increase in robustness and parallelism. Hebb's learning methods were enhanced by Bernie Widrow (Widrow and Hoff, 1960; Widrow, 1962), who called his networks adalines, and by Frank Rosenblatt (1962) with his perceptrons. Rosenblatt proved the perceptron convergence theorem, showing that his learning algorithm could adjust the connection strengths of a perceptron to match any input data, provided such a match existed.
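
Expressed in present-day notation, Rosenblatt's procedure nudges the weights toward each misclassified example, and the convergence theorem says the mistakes must eventually stop whenever a separating set of weights exists. The sketch below is a modern rendering of that idea, not Rosenblatt's original presentation; the variable names, learning rate, and bias handling are illustrative assumptions.

    # A perceptron-style training loop (modern sketch; labels are +1 or -1).
    def train_perceptron(examples, n_features, lr=1.0, max_epochs=100):
        w = [0.0] * n_features
        b = 0.0
        for _ in range(max_epochs):
            mistakes = 0
            for x, label in examples:
                prediction = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
                if prediction != label:          # misclassified: shift the boundary toward x
                    w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                    b += lr * label
                    mistakes += 1
            if mistakes == 0:                    # no errors on a full pass: done
                break
        return w, b

    # Example: the OR function is linearly separable, so training converges.
    data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
    weights, bias = train_perceptron(data, n_features=2)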
