Reinforcement learning-guided control strategies for CAR T-cell
activation and expansion
Abstract
Reinforcement learning (RL), a subset of machine learning (ML), can
potentially optimize and control biomanufacturing processes, such as
the production of therapeutic cells. Here, the process of CAR-T
cell activation by antigen-presenting beads and the subsequent
expansion of the activated cells is formulated in silico. The simulation is used as an
environment to train RL-agents to dynamically control the number of
beads in culture with the objective of maximizing the population of
robust effector cells at the end of the culture. At periodic control
steps, the agent decides whether to add an increment of beads or to
remove all beads from the culture. The simulation is designed to operate
in OpenAI Gym, which enables testing of
different environments, cell types, agent algorithms and state inputs to
the RL-agent. Agent training is demonstrated with three different
algorithms (PPO, A2C and DQN) each sampling three different state input
types (tabular, image, mixed); PPO-tabular performs best for this
simulation environment. Using this approach, training of the RL-agent on
different cell types is demonstrated, resulting in unique control
strategies for each type. Sensitivity to input noise (sensor
performance), the number of control interventions, and the advantage of
pre-trained agents are also evaluated. Overall, we present a general
computational framework to maximize the population of robust effector
cells in CAR-T cell therapy production.
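
To make the control problem concrete, the following is a minimal sketch of an environment with a Gym-style reset/step interface in which an agent adds beads incrementally or removes them all, and is rewarded with the effector population at the end of the culture. It is not the simulation used in this work: the class name ToyBeadEnv, the three-compartment dynamics, every rate constant, and the HOLD (no-change) action are assumptions invented for illustration, and the sketch deliberately avoids importing Gym so that it runs on its own.

```python
from dataclasses import dataclass

# Discrete actions. ADD and REMOVE mirror the decisions described in the
# abstract; HOLD (leave beads unchanged) is an assumption of this sketch.
ADD, REMOVE, HOLD = 0, 1, 2


@dataclass
class ToyBeadEnv:
    """Hypothetical toy environment; all dynamics and constants are invented."""

    n_steps: int = 10            # number of control interventions per culture
    bead_increment: float = 1.0  # beads added per ADD action (arbitrary units)

    def reset(self):
        # Start with only naive cells and no beads in culture.
        self.naive, self.effector, self.exhausted = 100.0, 0.0, 0.0
        self.beads = 0.0
        self.t = 0
        return self._obs()

    def _obs(self):
        # "Tabular" style observation: a flat tuple of state variables.
        return (self.naive, self.effector, self.exhausted, self.beads, self.t)

    def step(self, action):
        if action == ADD:
            self.beads += self.bead_increment
        elif action == REMOVE:
            self.beads = 0.0
        elif action != HOLD:
            raise ValueError(f"unknown action: {action}")

        # Saturating stimulation signal in [0, 1) from the bead count.
        stim = self.beads / (1.0 + self.beads)
        activated = 0.5 * stim * self.naive      # naive -> effector
        exhausted = 0.3 * stim * self.effector   # overstimulated effectors
        growth = 0.4 * self.effector             # effector proliferation

        self.naive -= activated
        self.effector += activated + growth - exhausted
        self.exhausted += exhausted
        self.t += 1

        done = self.t >= self.n_steps
        # Sparse reward: effector population at the end of the culture only.
        reward = self.effector if done else 0.0
        return self._obs(), reward, done, {}


def rollout(env, policy):
    """Run one culture with policy(obs) -> action; return the final reward."""
    obs, done, reward = env.reset(), False, 0.0
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
    return reward


if __name__ == "__main__":
    env = ToyBeadEnv()
    # Two hand-written policies for comparison; a trained RL agent would
    # replace these with a learned mapping from observation to action.
    always_add = lambda obs: ADD
    pulse = lambda obs: ADD if obs[4] < 3 else REMOVE
    print("always add:", rollout(env, always_add))
    print("add 3x, then remove:", rollout(env, pulse))
```

In this toy setting, the trade-off the agent must learn is between stimulating naive cells and avoiding exhaustion of effectors through prolonged bead exposure. An environment written against the actual Gym interface could then be passed to an off-the-shelf PPO, A2C or DQN implementation.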