Many systems like autonomous vehicle fleets and drone swarms can be modeled as Multi-Agent Reinforcement Learning (MARL) tasks, which deal with how multiple machines can learn to collaborate, ...