  1. No Access

    Article

    Reward Function Design Method for Long Episode Pursuit Tasks Under Polar Coordinate in Multi-Agent Reinforcement Learning

    Multi-agent reinforcement learning has recently been applied to solve pursuit problems. However, it suffers from a large number of time steps per training episode, thus always struggling to converge effectivel...

    Yubo Dong 董玉博, Tao Cui 崔 涛, Yufan Zhou 周禹帆 in Journal of Shanghai Jiaotong University (S… (2024)