You've essentially just derived my Parallel Hierarchical Replanner (PHR)... :-) ...

You've essentially just derived my Parallel Hierarchical Replanner (PHR)... :-) However, your step 6 isn't always useful because not only do you sometimes discover new obstacles (higher costs), occasionally you discover new shortcuts (lower costs).

In the PHR, you spawn two (or more, but ignore that for now) threads. One runs the globally optimal planner on as up-to-date data as it can; slow, but important to keep working upon. The other runs a local window planner from the agent's current location to as far ahead along the most recent global plan as fits within the window.

The result is a pretty good trade-off between actually executing the plan, and waiting to finish computing a globally optimal one. Essentially you're tracking along a stale but once-upon-a-time optimal global plan, locally optimising subsections of it. There are also some more generally useful trade-offs to this too, like using an anytime planner for the global thread.

My PhD was then an attempt to generalize these strategies into a cohesive framework. It starts from the assumption that what you really want is not to travel along `optimal' paths, but just to get to the damn goal as fast as possible. To that end, measuring everything in terms of time becomes sensible. Ultimately, it comes down to solving y* = arg_min_over_Y{ p + e + c }. y* is the optimal set of adjustable parameters in your system (e.g. choice of algorithm, number of threads to use, amount of RAM to use, tuning parameters in the algorithm, etc) in the space Y of all possible parameter sets. p is planning time, e is execution time, and c is the time taken to solve that optimisation equation itself. i.e. it's a transindental equation, so the only way to solve it is to estimate each of those time factors.

Finally, everything gets wrapped up in a nice package that combines a learning model, state estimator, optimiser, and planning system, and solves that equation repeatedly. Every time it does, it "applies" the y* parameters to the system (whatever that means depends upon what they represent), and starts again. If it helps, you can think of it as a feedback loop that adjusts its own controller gains, and where one of the inputs is the time taken to run the controller.