
Microsoft open-sources tool to use AI in simulated assaults


As part of Microsoft’s research into ways to use machine learning and AI to improve security defenses, the company has released an open source attack toolkit that lets researchers create simulated network environments and see how they fare against attacks.

Microsoft 365 Defender Research launched CyberBattleSim, which creates a network simulation and models how threat actors can move laterally through the network looking for weak points. When building the attack simulation, enterprise defenders and researchers create various nodes on the network and indicate which services are running, which vulnerabilities are present, and what type of security controls are in place. Automated agents, representing threat actors, are deployed in the attack simulation to randomly execute actions as they try to take over the nodes.
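The node-and-vulnerability model described above can be sketched in plain Python. This is an illustrative data structure only — the names (`Node`, `services`, `firewall_blocks`) are hypothetical and do not reflect the actual CyberBattleSim API:

```python
# Illustrative sketch of a node-based network model
# (hypothetical names, not the real CyberBattleSim API).
from dataclasses import dataclass, field

@dataclass
class Node:
    services: list                  # services running on the machine
    vulnerabilities: list           # planted weaknesses an attacker can exploit
    firewall_blocks: list = field(default_factory=list)  # security controls

# Defenders and researchers describe the simulated network as a set of nodes.
network = {
    "client": Node(services=["browser"], vulnerabilities=["leaked-creds"]),
    "web":    Node(services=["http", "ssh"], vulnerabilities=["weak-password"]),
    "db":     Node(services=["sql"], vulnerabilities=[], firewall_blocks=["ssh"]),
}
```

An automated attacker agent would then probe these nodes, succeeding only where a vulnerability is present and no control blocks the path.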

“The simulated attacker’s goal is to take possession of some portion of the network by exploiting these planted vulnerabilities. While the simulated attacker moves through the network, a defender agent watches the network activity to detect the presence of the attacker and contain the attack,” the Microsoft 365 Defender Research Team wrote in a post discussing the project.

Using reinforcement learning for security

Microsoft has been exploring how machine learning algorithms such as reinforcement learning can be used to improve information security. Reinforcement learning is a type of machine learning in which autonomous agents learn how to make decisions based on what happens while interacting with the environment. The agent’s goal is to optimize the reward, and agents gradually make better decisions (to get a bigger reward) through repeated attempts.
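As a minimal illustration of that trial-and-reward loop (unrelated to CyberBattleSim’s internals), here is a tabular Q-learning agent that learns, over repeated attempts, which of two actions pays the bigger reward:

```python
import random

# Tabular Q-learning on a one-state, two-action problem:
# action 1 pays a reward of 1, action 0 pays nothing.
random.seed(0)
q = [0.0, 0.0]       # estimated value of each action
alpha = 0.1          # learning rate
epsilon = 0.2        # exploration probability

for _ in range(500):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    a = random.randrange(2) if random.random() < epsilon else q.index(max(q))
    reward = 1.0 if a == 1 else 0.0
    q[a] += alpha * (reward - q[a])   # nudge the estimate toward the observed reward

print(q.index(max(q)))  # → 1: the agent has learned that action 1 is better
```

The same principle scales up: in CyberBattleSim the "actions" are exploit attempts and the "reward" comes from taking over nodes, but the learning loop is conceptually identical.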

The most common example is playing a video game. The agent (player) gets better at playing the game after repeated tries by remembering the actions that worked in previous rounds.

In a security scenario, there are two types of autonomous agents: attackers trying to steal information out of the network and defenders trying to block, or mitigate the effects of, an attack. The agents’ actions are the instructions that attackers can execute on the computers and the steps defenders can perform in the network. Using the language of reinforcement learning, the attacking agent’s goal is to maximize the reward of a successful attack by discovering and taking over more systems on the network, and finding more things to steal. The agent has to execute a series of actions to gradually discover the network but do so without setting off any of the security defenses that may be in place.

Security training and games

Much as people do, AI agents learn well by playing games, so Microsoft turned CyberBattleSim into a game. Capture-the-flag competitions and phishing simulations help strengthen security by creating situations in which defenders can learn from attacker methods. By using reinforcement learning to earn the reward of “winning” a game, the CyberBattleSim agents can make better decisions about how they interact with the simulated network.

CyberBattleSim focuses on modeling how an attacker can move laterally through the network after the initial breach. In the attack simulation, each node represents a machine with an operating system, software applications, specific properties (security controls), and a set of vulnerabilities. The toolkit uses the OpenAI Gym interface to train automated agents using reinforcement learning algorithms. The open source Python source code is available on GitHub.
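To give a flavor of the Gym-style interaction loop, here is a toy lateral-movement environment with the familiar `reset`/`step` shape. All names here are made up for illustration; the real CyberBattleSim environments are richer and live in the GitHub repository:

```python
import random

class ToyLateralMovementEnv:
    """Gym-style toy environment (hypothetical, not CyberBattleSim itself):
    the attacker owns node 0 and tries to compromise nodes along a chain."""

    def __init__(self, n_nodes=4):
        self.n_nodes = n_nodes

    def reset(self):
        self.owned = {0}            # initial breach: node 0 is compromised
        return frozenset(self.owned)

    def step(self, target):
        # An exploit attempt succeeds only against the next node in the chain
        # from an already-owned node (a stand-in for "lateral movement").
        reachable = {n + 1 for n in self.owned} & set(range(self.n_nodes))
        if target in reachable and target not in self.owned:
            self.owned.add(target)
            reward = 1.0            # reward for each newly taken node
        else:
            reward = 0.0
        done = len(self.owned) == self.n_nodes
        return frozenset(self.owned), reward, done

env = ToyLateralMovementEnv()
obs = env.reset()
total, done = 0.0, False
while not done:
    target = random.randrange(env.n_nodes)   # a random-policy attacker
    obs, r, done = env.step(target)
    total += r
print(total)  # → 3.0: three nodes taken beyond the initial foothold
```

A trained agent would replace the random policy with one that picks targets learned to pay off, reaching the goal in fewer steps.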

Erratic behavior should quickly trigger alarms, and security tools would respond and evict the malicious actor. But if the agent has learned how to compromise systems more quickly by reducing the number of steps it needs to succeed, that gives defenders insight into the places that need security controls and helps them detect the activity sooner.

CyberBattleSim is part of Microsoft’s broader research into applying machine learning and AI to automate many of the tasks security defenders currently handle manually. In a recent Microsoft survey, almost three-quarters of organizations said their IT teams spend too much time on tasks that should be automated. Autonomous systems and reinforcement learning “can be harnessed to construct resilient real-world threat detection technologies and robust cyber-protection strategies,” Microsoft wrote.

“With CyberBattleSim, we are just scratching the surface of what we believe is a huge potential for applying reinforcement learning to security,” the company added.

