Publications

(2025). Stream-HLS: Towards Automatic Dataflow Acceleration. In FPGA'25 (Best Paper Candidate ★).
(2024). TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs. In ASPLOS'24.
(2023). FlexCNN: An End-to-End Framework for Composing CNN Accelerators on FPGA. In TRETS'23.
(2023). Callipepla: Stream Centric Instruction Set and Mixed Precision for Accelerating Conjugate Gradient Solver. In FPGA'23.
(2022). A Versatile Systolic Array for Transposed and Dilated Convolution on FPGA. In FCCM'22.
(2021). A Customizable Domain-Specific Memory-Centric FPGA Overlay for Machine Learning Applications. In FPL'21.
(2020). SPAR-2: A SIMD Processor Array for Machine Learning in IoT Devices. In ICDIS'20.