In this work, we demonstrate the potential of dynamic reinforcement learning (RL) methods to revolutionize cybersecurity. We develop an RL framework and show that it can shut down an aggressive botnet, which initially uses spear phishing to establish itself in a Department of Defense (DoD) network. To ensure a suitable real-time response, we employ CP, a transformer model trained for network anomaly detection, to factorize the state space accessible to our RL agent. Because the fidelity of the cyber scenario is of the utmost importance for meaningful RL training, we use the CyberVAN emulation environment to model an appropriate DoD enterprise network to attack and defend. Our work represents an important step towards harnessing the power of RL to automate general and fully realistic Defensive Cyber Operations (DCOs).