Are you aware of the workflow from reinforcement learning where you generate several trajectories (like 8 different plans), rank them, and then judge them?

I noticed it gives me better results and greater variety, even though I don't end up using the remaining plans.
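
In case it helps, here's a rough sketch of the loop I mean. Everything in it is made up for illustration: `best_of_n`, `toy_generate`, `toy_score` and `toy_judge` are hypothetical names, and the toy functions are stand-ins for what would really be model calls (one to write a plan, one to score it, one to accept or reject it).

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    generate: Callable[[random.Random], str],
    score: Callable[[str], float],
    judge: Callable[[str], bool],
    n: int = 8,
    seed: int = 0,
) -> Tuple[str, List[Tuple[float, str]]]:
    """Generate n candidate plans, rank them by score, then judge them in rank order."""
    rng = random.Random(seed)
    # 1. Generate: sample n independent candidates.
    candidates = [generate(rng) for _ in range(n)]
    # 2. Rank: highest score first.
    ranked = sorted(((score(c), c) for c in candidates), key=lambda p: p[0], reverse=True)
    # 3. Judge: return the best-ranked candidate the judge accepts.
    for _, candidate in ranked:
        if judge(candidate):
            return candidate, ranked
    # If the judge rejects everything, fall back to the top-ranked candidate.
    return ranked[0][1], ranked


# Toy stand-ins (assumptions, not a real setup).
def toy_generate(rng: random.Random) -> str:
    steps = rng.sample(["read", "plan", "code", "test", "ship"], k=rng.randint(1, 5))
    return " -> ".join(steps)


def toy_score(plan: str) -> float:
    return float(len(plan.split(" -> ")))  # longer plans score higher


def toy_judge(plan: str) -> bool:
    return "test" in plan.split(" -> ")  # accept only plans that include testing


if __name__ == "__main__":
    chosen, ranked = best_of_n(toy_generate, toy_score, toy_judge)
    print("chosen:", chosen)
    for s, plan in ranked:
        print(s, plan)
```

The other seven plans get thrown away, but generating them is what gives the ranking and judging steps something to choose between.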

Note that the rule doesn't make much sense out of context and the math is wrong... oops :D