A* Algorithm Notes on Amit Patel's Work

17 Apr 2019 • a star algorithms computer science cs learning

Notes on Amit Patel's work

Introduction (1)

Pathfinders have advantages over movement algorithms when the routes or obstacles are slowly changing, or they are long. Movement algorithms are better when the paths are short, and the paths or obstacles are changing. Pathfinding algorithms from computer science books work on graphs. Humans understand things innately that algorithms don't: distances, directions, and symmetry.

Dykstra's algorithm and Greedy Best-First Search (4)

Dykstra's algorithm works by repeatedly examining closest not-yet-examined vertices from the origin, expanding outwards until it reaches the goal. It scans a large area, but it is guaranteed to find the shortest path as long as more of the edges have a negative cost.

It runs much faster since it uses a heuristic to estimate how far from the goal it is and thus guide its decisions. It is that much faster since it examines fewer vertices and it naturally proceeds toward the goal rather than searching in all directions. When encountering a concave obstacle, Dykstra's algorithm finds the shortest path, but it does significantly more work. Greedy best-first-search only considers the cost to the goal and ignores the cost of the path thus far. It continues on paths even if they are quite long.

A*

Can be used to find the shortest path and it uses a heuristic to guide itself. Thus it seemingly attains the best of both Dykstra's algorithm and greedy best-first-search. It favors indices that are close to the starting point (like Dykstra's), but it also favors vertices close to the goal (like greedy best-first-search). Algorithm (f) is cost of path plus heuristic: f(n) = g(n) + h(n).

Heuristics (1)

The heuristic function (h(n)) can be used to tailor your application for speed or accuracy, and it doesn't have to be static. g(n) is the cost of the path.

if h(n) <= g(n) then A* expands its search and is slower
if h(n) = g(n) then A* will be very fast though it won't happen in every case
if h(n) is sometimes > g(n) then A* is not guarenteed to find the shortest path.
if h(n) is much > g(n) then only h(n) plays a role and it turns into greedy best-first-search.

You often don't need the best solution. You often only need something close. Similarly, using a heuristic function that never over-estimates means that it will sometimes underestimate (quite a bit).

There's a choice between speed and accuracy though it doesn't have to be global nor immutable.
A* computes f(n) = g(n) + h(n). g(n) + h(n) must be on the same scale in order to add them.

Exact Heuristics (4)

A* is computer the function (f()) at every node. When h(n) = g(n) then f(n) doesn't change. Nodes not on the right path will have > f() than the nodes that are on the right path. One way to speed up on exact h(n) is to precompute the shortest path between pairs of points. However, this isn't feasible because of performance considerations in many cases. One way to get around this is to use a coarse grid or compute between waypoints (formula).

Heuristics for Grid Maps (5)

For 4 or 6 directions of movement use Manhattan
For 8 directions use diagonal
For any use Euclidean
Don't square the distance, use the square root for euclidean otherwise it degrades into greedy best first search or Dykstra's.

If you want to search for paths to all of several, then Dykstra's might be your best shot.

Breaking Ties (12)

The quick hack is to adjust the g or h values. The tiebreaker needs to be deterministic with respect to the vertex.

One way is to nudge h slightly upward (it will prefer vertices close to the goal). A* will explore far less of the map by using the tiebreaker: p < (min cost of taking one step) / (expected max path length).

One way is to pass h to the comparison function (f). Another way is to add a deterministic random number to the heuristic or edge costs. You can do so by computing a hash of the coordinates. It breaks more ties than the former (added to Manhattan).

Another way is to prefer paths along a straight line from the starting point to the goal (formula in the paper). Another way to construct the priority queue is to ensure that new insertions with a specific f are always greater than old with the same f.

Another way is to minimize turns by looking at the x,y and potentially adding a penalty depending on the direction in relation to the goal. However, breaking ties is just a bandaid to the underlying problem. Alternate map representations can help by reducing the number of nodes.

Reducing the nodes visited also helps. You can also speed up node processing.

Approximate Heuristics (17)

Calculating exact distances is ideal but is usually impractical. You can often use an approximation. ALT A* uses landmarks and triangle inequality to preprocess the pathfinding graph. Some call this differential heuristics. Can be compressed. You can also use distance oracles. Landmarks are a specific case of a general approach.

Dealing with Moving Obstacles

Recalculating Paths (1)

Methods
- Every N steps
- When extra CPU cycles are available
- Turns a corner or passes a waypoint
- The world near the unit has changed

Path recalculation is expensive since a lot of information is thrown away. Not a good idea if you expect to have many long paths. Would be better to reuse.

Path Splicing (2)

To splice paths, one way is to use the local repair strategy (basically focus on closer paths and not worry about ones further away until you get closer). You can often detect bad paths by looking at their length. Path splicing will compute 2N to 3N path steps independent of M (number of steps for splicing).

Predicting Obstacle Movement (4)

You can also have units subscribe to changes in a region and polling. You can also predict positions of obstacles for pathfinding. A* can be modified to keep track of the time required to reach a point. The cost function can take this into account.

Implementation Notes

The A* algorithm has two sets: open, candidates to examine; closed, candidates already examined. Open is the frontier while closed is the interior. The main loop repeatedly pulls out the best node in open and checks if it is the goal node, then we're done else it gets added to closed. Then N neighbors are examined.

Ways to increase performance (3)

Decrease the size of the graph, improve the accuracy of the heuristic, or make the priority queue faster.

Set Representation (5)

Three primary operations of the data structure or set: remove-best, check if exists, insert new nodes. This matches up closely with a priority queue. The best generic choice is a binary heap. Various data structures for A* discussed. In a data structure, we're not just looking for asymptotic (Big O) behavior. We're also looking for a low constant. None of the underlying data structures are entirely satisfactory, and it is recommended to use a hybrid data structure.

Optimizations (14)

A shorter path is often a better path. However, they can still be expensive to compute. Strategies: immediate start, get the unit moving and compute a path later; early exit, get a partial path; Interruptible, compute a path for N distance and store the state to come back to it later; and group by having multiple units flocking together.

Space Used by Precalculated Paths

Path compression, waypoints, and beacons are all space saving techniques for paths. Waypoints rely on well-known paths calculated beforehand.

Variants of A*

Beam Search (1)

Beam search places a limit on the size of the open set. When it becomes too large the node with the worst chances of giving a good path is dropped.

Iterative Deepening (1)

Iterative Deepening A* enhancements such as IDA* start with an approximate answer and make it more accurate. The name comes from game tree searches. When your answer doesn't change or improve much, then you assume you have a pretty good answer. IDA* tends to increase computation while reducing memory. Not a huge win typically.

Dynamic Weighting (2)

It's more important to get anywhere quickly at the start and at the end it's more important to get to the goal.

Bandwidth Search

Not very useful

Bidirectional Search

Starts two searches one at the start and another at the end to keep the trees smaller. Two variants: front-to-front and retargeting

Dynamic A* and Lifelong Planning A*

D* allows for changes after the initial path is computed and when you don't have complete info. It can correct mistakes. LPA* is for when costs are changing, or the map may be invalidated. It reuses previous A* computations.

Both use lots of memory and isn't ordinarily useful with lots of moving units.