Quantum Circuit Optimization using Deep Reinforcement Learning

Tuesday, February 3, 2026 - 12:30

Location:

CpE Conference Room 126

Thesis or Project Presentation

Presenter(s):

Eng. Yousef Alfezea

Computer Engineering Department

As quantum computing advances toward becoming a practical solution to many real-world problems, the current Noisy Intermediate-Scale Quantum (NISQ) era highlights the need for efficient circuit optimization frameworks, driven by hardware limitations, short coherence times, and the vulnerability of quantum operations to noise. Conventional rule-based and heuristic optimizers depend on fixed rewrite rules and local transformations, which limits their capacity to discover deeper improvements. To address these limitations, we propose reinforcement learning as an adaptive strategy for navigating the search space of circuit transformations. In this framework, a Deep Q-Network-based optimizer, named QuAda, applies a sequence of gate transformations with the objective of reducing gate count while preserving functional equivalence. Circuits are represented as binary matrices of qubit activity across moments, which helps the agent identify structural patterns. The action space consists of native transformation operations supported by the Cirq framework. The optimizer also incorporates a custom reward function that provides positive feedback only when a transformation reduces the gate count. In addition, an ε-greedy strategy promotes exploration of transformations that may temporarily increase the gate count, ultimately allowing the agent to discover deeper optimizations. Experimental results on standard quantum circuit benchmarks show that our method achieves, on average, a 6% improvement in gate count reduction over a Greedy approach, while keeping runtimes within seconds, compared with the fixed 6-hour and 24-hour budgets of Quarl and Quartz. In a separate evaluation on randomly generated circuits, our optimizer reduces the average gate count from 128 to 103 (a 19% reduction) while running 57% faster than RL-CNN.
This work shows that learning-based optimization can be adapted to a variety of quantum architectures, paving the way for practical and scalable quantum compilers.
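
The three ingredients named in the abstract (a binary qubit-activity matrix, a reward that is positive only when the gate count drops, and ε-greedy action selection) can be illustrated with a minimal Python sketch. This is not the QuAda implementation: the circuit encoding (a list of moments, each a list of `(gate_name, qubits)` tuples), the function names, and the exact reward magnitude are assumptions made for illustration only, and no Cirq calls are used.

```python
import random

def activity_matrix(moments, n_qubits):
    """Rows are qubits, columns are moments; 1 marks a qubit acted on in that moment."""
    matrix = [[0] * len(moments) for _ in range(n_qubits)]
    for t, moment in enumerate(moments):
        for _gate_name, qubits in moment:
            for q in qubits:
                matrix[q][t] = 1
    return matrix

def gate_count(moments):
    """Total number of gates across all moments."""
    return sum(len(moment) for moment in moments)

def reward(count_before, count_after):
    """Assumed reward shape: gates removed if the count dropped, otherwise zero."""
    if count_after < count_before:
        return float(count_before - count_after)
    return 0.0

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

# Hypothetical 3-qubit toy circuit with three moments.
toy_circuit = [
    [("H", (0,)), ("X", (2,))],
    [("CNOT", (0, 1))],
    [("H", (2,))],
]

state = activity_matrix(toy_circuit, n_qubits=3)
action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=0.1)
```

Because the reward is zero for any step that does not shrink the circuit, the random branch of `epsilon_greedy` is what lets the agent in this sketch try transformations that temporarily increase the gate count.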

Supervisor: Dr. Ameer Mohammed
Convener: Prof. Tassos Dimitriou
Examination Committee: Prof. Maitham Safar