Mar 6, 2012

AI class notes

Some notes I made for an AI class I took a while back:
  • An AI program is called an intelligent agent.
  • An agent interacts with an environment. This environment can be physical or virtual.
  • The agent will perceive the state of this environment, apply internal logic and implement actions upon the environment to change its state. The feedback loop this creates can be termed the Perception-action cycle (sketched in code after the numbered steps below):
  1. Perceive environment through sensors
  2. Apply control logic to determine next best possible move
  3. Implement action through actuators
  4. Repeat from step 1
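    As a rough illustration (not from the original notes), the cycle can be sketched in Python with a toy thermostat agent; the environment, threshold and actions here are invented:

    # Toy thermostat agent illustrating the perception-action cycle.
    # The environment, the 20-degree threshold and the actions are made up.
    class Environment:
        def __init__(self, temperature):
            self.temperature = temperature

    def run_agent(env, steps=5):
        for _ in range(steps):
            percept = env.temperature                      # 1. perceive through a (virtual) sensor
            action = "heat" if percept < 20 else "idle"    # 2. control logic picks the next action
            if action == "heat":                           # 3. actuate: change the environment
                env.temperature += 1.0
            else:
                env.temperature -= 0.5
            # 4. repeat from step 1

    run_agent(Environment(temperature=17))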
  • AI has been used in the financial industry (to make trading decisions quickly), robotics (to help control cameras, motors, grippers, microphones, touch pads and speakers), games (to make adversaries seem 'real'), medicine (to provide diagnoses of patients) and on the Internet (search engines).
  • Fully observable: What your agent can sense is enough to deduce the exact state of the environment for decision-making purposes.
  • Partially observable: Aspects of the environment are hidden, so the agent needs to make estimates and deduce the probable state of the environment.
  • An example of a partially observable environment is poker. The hands of your opponents and the cards in the deck are kept hidden from the agent; the agent then needs to remember what cards have been played, etc., to build up an estimate of the probable state on which to base its move.
  • Deterministic: The actions of the agent uniquely determine the outcome. For instance, in checkers if the agent moves its piece to a square it will always appear in that square; the piece will not randomly appear elsewhere on the board.
  • Stochastic: The outcome has some randomness or uncertainty involved. For instance, in snakes and ladders the agent rolls a die to determine its next move; the number of spaces it moves is left up to the roll and chance.
  • Discrete: The set of possible inputs and outputs is finite and distinct.
  • Continuous: The set of possible inputs and outputs is infinite (the values form a continuum).
  • Benign: The objective of the agent does not conflict with yours or anyone else's.
  • Adversarial: The objective of the agent is to beat you or others.
  • AI is used to manage uncertainty. If the behaviour of a system is predictable, we can just write normal software to respond and act. AI is designed to respond appropriately when the system displays unpredictable behaviour.
  • The reasons for uncertainty in a system can be adversaries (we can't predict their moves), stochastic effects (randomness in the environment), laziness (not modelling every detail) and ignorance (incomplete knowledge of the domain).
  • A good example of AI is machine translation. Rather than hard-coding word pairs (which would inevitably produce incorrect results), machine translation was achieved through machine learning and AI techniques. The basic process (a toy code sketch follows these steps) is:
    1. Build a translation model of the probability that a given word correlates with another.
    2. Given a word, the AI will go through its model and find a translation based on the probability table.
    3. If it is correct, it is given reinforcement to improve the correlation for future translations. If incorrect, it will decrease the correlation.
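    A toy sketch of that loop; the word pairs, probabilities and update rule below are invented purely for illustration:

    # Toy translation model: P(target word | source word). Values are made up.
    model = {
        "house": {"maison": 0.6, "domicile": 0.4},
        "cat":   {"chat": 0.7, "matou": 0.3},
    }

    def translate(word):
        # 2. pick the most probable translation from the table
        candidates = model[word]
        return max(candidates, key=candidates.get)

    def reinforce(word, translation, correct, step=0.1):
        # 3. strengthen the correlation on success, weaken it on failure,
        #    then renormalise so the probabilities still sum to 1
        delta = step if correct else -step
        model[word][translation] = max(model[word][translation] + delta, 0.01)
        total = sum(model[word].values())
        for t in model[word]:
            model[word][t] /= total

    guess = translate("house")                        # "maison"
    reinforce("house", guess, correct=(guess == "maison"))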

Navigation and Search algorithms

In this unit we are only concerned with navigational complexity, where the complexity comes from deciding on the sequence of actions needed to get us where we want to go. This is in contrast to complexity due to not fully perceiving our environment.

Some definitions for the route-finding problem (a toy code sketch follows the StepCost definition):
  • Initial state of the system is what we know of the environment prior to execution of the system.
  • The Action function determines the actions we can achieve from our current state. The input to this function will be a state, and the output a list of possible actions.

    Action(s) -> {a1, a2, a3, .... , an}
     
  • The Result function determines the possible result of performing an action from our current state. The input will be the action and current state, and the output will be a new state.

    Result(s, a) -> s1
     
  • The GoalTest function tells us if this new state is our goal.

    GoalTest(s) -> True | False
     
  • The PathCost function will tell us the true cost of a sequence of state transitions.

    PathCost((s1, a1)->(s2, a2)->(s3, a3)) -> n
     
  • The StepCost will tell us the true cost of a single transition.

    StepCost(s, a, s1) -> n
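
    These definitions map naturally onto a small problem class. Here is a hedged sketch for a toy route-finding problem; the road map, city names and costs are invented:

    # Hypothetical route-finding problem illustrating Action, Result,
    # GoalTest and StepCost. The road map and costs are made up.
    ROADS = {
        "A": {"B": 1, "D": 4},
        "B": {"A": 1, "C": 2},
        "C": {"B": 2, "D": 1},
        "D": {"A": 4, "C": 1},
    }

    class RouteProblem:
        def __init__(self, initial, goal):
            self.initial, self.goal = initial, goal

        def actions(self, s):            # Action(s) -> {a1, ..., an}
            return list(ROADS[s])        # an action is "drive to a neighbouring city"

        def result(self, s, a):          # Result(s, a) -> s1
            return a                     # driving to a city puts us in that city

        def goal_test(self, s):          # GoalTest(s) -> True | False
            return s == self.goal

        def step_cost(self, s, a, s1):   # StepCost(s, a, s1) -> n
            return ROADS[s][s1]

    PathCost is then just the sum of the step costs along a path.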
     
  • The Frontier is the set of furthest points explored so far; it separates the explored region from the unexplored region.
     
  • A Breadth-first or shortest-first search will always choose the path with the shortest length (fewest steps) first, not the lowest cost!
     
  • A Tree-search is a dumb algorithm that simply explores all state transitions, even if it has explored a state previously.
function TREE.search(problem p) returns path
    frontier = {path(p.initial)}
    loop:
        if frontier is empty: return fail
        path = remove_choice(frontier)
        s = path.end
        if GoalTest(s): return path
        for a in p.actions(s):
            add [path + a -> Result(s, a)] to frontier
  • A Graph-search solves the problem of re-exploring known states.
function GRAPH.search(problem p) returns path
    frontier = {path(p.initial)}
    explored = {}
    loop:
        if frontier is empty: return fail
        path = remove_choice(frontier)
        s = path.end
        add s to explored
        if GoalTest(s): return path
        for a in p.actions(s):
            add [path + a -> Result(s, a)] to frontier
                unless Result(s, a) in frontier + explored

Remember that tree and graph searches do not simply end as soon as the goal state first appears in the frontier, because the path that first reaches it may not be the best path to the goal; the goal test is only applied when a path is removed from the frontier for expansion (i.e. while we have found the goal state, the path found so far may not be the optimal one).
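As a rough sketch (not from the original notes), the graph search above might look like this in Python, with remove_choice passed in so the same skeleton serves the different strategies below; it assumes a problem object like the hypothetical RouteProblem sketched earlier:

    # Sketch of GRAPH.search. remove_choice decides which frontier path to expand.
    def graph_search(problem, remove_choice):
        frontier = [[problem.initial]]          # each path is a list of states
        explored = set()
        while frontier:
            path = remove_choice(frontier)      # removes and returns one path
            s = path[-1]
            explored.add(s)
            if problem.goal_test(s):
                return path
            for a in problem.actions(s):
                s1 = problem.result(s, a)
                if s1 not in explored and all(p[-1] != s1 for p in frontier):
                    frontier.append(path + [s1])
        return None                             # fail: frontier exhausted

    def remove_first(frontier):
        # Breadth-first / shortest-first: expand the oldest (shortest) path.
        return frontier.pop(0)

    # e.g. graph_search(RouteProblem("A", "C"), remove_first) -> ["A", "B", "C"]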
  • The Uniform cost or cheapest-first search is guaranteed to find the cheapest overall path cost. The only change to our algorithm is that the remove_choice() function now selects the next path based on the lowest total cost.
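    A minimal sketch of that change, reusing the hypothetical ROADS map and graph_search skeleton above (a real implementation would use a priority queue rather than a linear scan):

    def path_cost(path):
        # True total cost of a path: the sum of the step costs along it.
        return sum(ROADS[a][b] for a, b in zip(path, path[1:]))

    def remove_cheapest(frontier):
        # Cheapest-first: expand the frontier path with the lowest total cost.
        best = min(frontier, key=path_cost)
        frontier.remove(best)
        return best

    # e.g. graph_search(RouteProblem("A", "C"), remove_cheapest) -> ["A", "B", "C"] (cost 3)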
  • The Depth-first algorithm is the opposite of the breadth-first algorithm in that it will always select the longest path first. It is not optimal and is not guaranteed to find the goal; however, it requires far less memory than the other algorithms as you only really need to store one path.
These types of searches simply expand out from the initial state until they hit the goal. The search is not directed at all. To improve search performance we can add more information, namely the estimated distance from the current state to the goal.

  • The Greedy Best-first search algorithm is a directed search, but if there are obstacles it will take you along the route that appears to get you closer to the goal, which may not be the best solution.
  • The A* algorithm combines the directed aspect of the Greedy best-first search and the cheapest-path aspect of the uniform-cost search (a code sketch follows the optimality discussion below). It does this by selecting the path that gives the minimal value for f.

    f = g + h
    where:
    g(path) = path cost
    h(path) = h(s) = estimated distance to goal

  • Minimising g() keeps the total path cost down, while minimising h() keeps the search focused on the goal.
     
  • A* depends on the heuristic function h() to find the optimal path. For this we need to keep h() optimistic: it must never overestimate the distance to the goal.
     
  • Optimal if
    h(s) ≤ true cost to the goal through that state
    The logic for the above statement is that, when the goal path is removed from the frontier, all other paths on the frontier have an estimated total cost at least as large as that of the found path, because we are exploring cheapest-first. Since h() is optimistic (the estimated cost never exceeds the true cost), their true costs can only be larger, so the found path must be the optimal path. For the graph search it is slightly more complicated, but the principles are the same.
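    Continuing the earlier sketches (an illustration, not part of the original notes), A* is again just a different remove_choice, minimising f = g + h; the heuristic values below are invented and kept optimistic for the toy ROADS map:

    # A*: expand the frontier path minimising f = g + h, where g is the true
    # path cost so far and h an optimistic estimate of the remaining cost.
    # These h values are made up and never overestimate the true cost to "C".
    H = {"A": 2, "B": 1, "C": 0, "D": 1}

    def remove_astar(frontier):
        best = min(frontier, key=lambda path: path_cost(path) + H[path[-1]])
        frontier.remove(best)
        return best

    # e.g. graph_search(RouteProblem("A", "C"), remove_astar) -> ["A", "B", "C"]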
     
  • A state space diagram can help map out state transitions.
     
  • Heuristic functions can be generated from the rules of a problem, generally by relaxing (or removing) rules. By doing this it is as if we are adding operators that dictate how we traverse the state space, which reduces complexity. By making the problem easier we never overestimate, and hence we have a good h().
     
  • We must remember that h() is just an estimate!!!
     
  • Search works when:
    • Fully observable: Must be able to see the state map.
    • Known: Domain is known; the actions are known to the system.
    • Discrete: Finite number of actions to choose from.
    • Deterministic: The result of taking an action is known.
    • Static: Nothing else in the environment can affect the state of that environment.
  • Nodes are data structures that consist of:
    • State: final state of path
    • Action: what took us here
    • Cost: true total cost up to this point
    • Parent: a pointer to the previous node in the path
Our search algorithms would keep two collections of nodes: a frontier (priority queue/set) and an explored set.
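
    A minimal sketch of such a node (the field names mirror the list above; the path() helper is an addition for illustration):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        state: str                        # final state of the path
        action: Optional[str] = None      # the action that took us here
        cost: float = 0.0                 # true total cost up to this point
        parent: Optional["Node"] = None   # previous node in the path

        def path(self):
            # Follow parent pointers back to the start to recover the full path.
            node, states = self, []
            while node is not None:
                states.append(node.state)
                node = node.parent
            return list(reversed(states))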

Hopefully this is useful to someone!
