Publications

MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines

Author(s): L. Gao, A. Ziashahabi, Y. Niu, S. Avestimehr, M. Annavaram

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (Main Conference), Nov 2025.

Paper Link

Reconfigurable Optical Recognition of Independent Data Patterns in a QPSK Data Stream Using Nonlinear Wave Mixing and Time or Wavelength Multiplexing

Author(s): A. Minoofar, A. Alhaddad, W. Ko, H. Lian, H. Zhou, A. Almaiman, M. Ramakrishnan, N. Karapetyan, Z. Jiang, M. Annavaram, M. Tur, J.L. Habif, A.E. Willner

Journal of Lightwave Technology, Jun 2025.

Paper Link

Reconfigurable optical recognition of 2 independent data patterns using wavelength multiplexing and nonlinear wave mixing

Author(s): A. Minoofar, A. Alhaddad, W. Ko, H. Lian, H. Zhou, M. Ramakrishnan, N. Karapetyan, A. Almaiman, M. Annavaram, M. Tur, J.L. Habif, A.E. Willner

CLEO: Science and Innovations, May 2025.

Paper Link

Mind the Dialect: NLP Advancements Uncover Fairness Disparities for Arabic Users in Recommendation Systems

Author(s): A. Alshabanah, M. Annavaram

The Findings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025.

Paper Link

DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding

Author(s): H. Entezari Zarch, L. Gao, C. Jiang, M. Annavaram

The Second Conference on Language Modeling, Oct 2025.

Paper Link

Icy-Hot: Decoupled Compute Paradigm towards a General-Purpose Superconducting CPU Design

Author(s): T. Renduchintala, J. Lee, H. Zha, M. Annavaram

The 17th edition of the European Conference on Applied Superconductivity, Sep 2025.

Paper Link

LEAF: Lightweight, Efficient, Adaptive and Flexible Embedding for Large-Scale Recommendation Models

Author(s): C. Jiang, A. Alshabanah, M. Annavaram

Proceedings of the 19th ACM Conference on Recommender Systems, Sep 2025.

Paper Link

KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation

Author(s): C. Jiang, L. Gao, H. Entezari Zarch, M. Annavaram

The Findings of the 63rd Annual Meeting of the Association for Computational Linguistics, Jul 2025.

Paper Link

Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models

Author(s): J. Flemings, B. Jiang, W. Zhang, Z. Takhirov, M. Annavaram

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Main Conference), Jul 2025.

Paper Link

HyCache: Hybrid Caching for Accelerating Input Preprocessing Pipelines in DNN Training

Author(s): K. Vinayak, S. Pandey, M. Annavaram, A. Basu

The 2025 USENIX Annual Technical Conference, Apr 2025.

Paper Link

On Using Arabic Language Dialects in Recommendation Systems

Author(s): A. Alshabanah, M. Annavaram

The Findings of the Nations of the Americas Chapter of the Association for Computational Linguistics, Apr 2025.

Paper Link

Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation

Author(s): C. Jiang, L. Gao, H. Entezari Zarch, M. Annavaram

AAAI Workshop on Scalable and Efficient Artificial Intelligence Systems, Mar 2025.

Paper Link

MPC-Pipe: An Efficient Pipeline Scheme for Semi-honest MPC Machine Learning

Author(s): Y. Wang, R. Rajat, M. Annavaram

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Feb 2025.

Paper Link

PIGEON: A High Throughput Framework for Private Inference of Neural Networks using Secure Multiparty Computation

Author(s): C. Harth-Kitzerow, Y. Wang, R. Rajat, G. Carle, M. Annavaram

The 25th Privacy Enhancing Technologies Symposium, Feb 2025.

Paper Link

Meta-Learn to Unlearn: Enhanced Exact Machine Unlearning in Recommendation Systems with Meta-Learning

Author(s): A. Alshabanah, K. Balasubramanian, M. Annavaram

The 25th Privacy Enhancing Technologies Symposium, Feb 2025.

Paper Link

Characterizing Context Influence and Hallucination in Summarization

Author(s): J. Flemings, W. Zhang, B. Jiang, Z. Takhirov, M. Annavaram

NeurIPS Workshop on Safe & Trustworthy Agents, Dec 2024.

Paper Link

Enabling Resource-Efficient On-Device Fine-Tuning of LLMs Using Only Inference Engines

Author(s): L. Gao, A. Ziashahabi, Y. Niu, S. Avestimehr, M. Annavaram

NeurIPS Workshop on Efficient Natural Language and Speech Processing, Dec 2024.

Paper Link

Reconfigurable Optical Recognition of 3 Independent Data Patterns Using Intra-Symbol Time Multiplexing and Nonlinear Wave Mixing

Author(s): A. Minoofar, A. Alhaddad, W. Ko, H. Lian, H. Zhou, N. Karapetyan, M. Ramakrishnan, Z. Jiang, A. Almaiman, M. Annavaram, M. Tur, J.L. Habif, A.E. Willner

Frontiers in Optics, Sep 2024.

Paper Link

Cross-core data sharing for energy-efficient GPUs

Author(s): H. Falahati, M. Sadrosadati, Q. Xu, J. Gómez-Luna, B. Saber Latibari, H. Jeon, S. Hesaabi, H. Sarbazi-Azad, O. Mutlu, M. Annavaram, M. Pedram

ACM Transactions on Architecture and Code Optimization, Sep 2024.

Paper Link

Biased User History Synthesis for Personalized Long-Tail Item Recommendation

Author(s): K. Balasubramanian, A. Alshabanah, E. Markowitz, G. Ver Steeg, M. Annavaram

Proceedings of the 18th ACM Conference on Recommender Systems, Oct 2024. (Acceptance rate 58/266, 21.8%)

Paper Link

CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data

Author(s): H. Entezari Zarch, A. Alshabanah, C. Jiang, M. Annavaram

Workshop on Risks, Opportunities, and Evaluation of Generative Models in Recommender Systems, Oct 2024.

Paper Link

High-Throughput Secure Multiparty Computation with an Honest Majority in Various Network Settings

Author(s): C. Harth-Kitzerow, A. Suresh, Y. Wang, H. Yalame, G. Carle, M. Annavaram

The 25th Privacy Enhancing Technologies Symposium, Aug 2024.

Paper Link

Differentially Private Knowledge Distillation via Synthetic Text Generation

Author(s): J. Flemings, M. Annavaram

The Findings of the Association for Computational Linguistics, Aug 2024.

Paper Link

Tunable optical matrix convolution of 20-Gbit/s QPSK 2-D data with a kernel using optical wave mixing

Author(s): A. Minoofar, A. Alhaddad, W. Ko, N. Karapetyan, A. Almaiman, H. Zhou, M. Ramakrishnan, M. Annavaram, M. Tur, J.L. Habif, A.E. Willner

Optics Letters, Aug 2024.

Paper Link

Differentially Private Next-Token Prediction of Large Language Models

Author(s): J. Flemings, M. Razaviyayn, M. Annavaram

North American Chapter of the Association for Computational Linguistics, Jun 2024. (Acceptance rate 565/2434, 23.2%)

Paper Link

Ethos: Rectifying Language Models in Orthogonal Parameter Space

Author(s): L. Gao, Y. Niu, T. Tang, S. Avestimehr, M. Annavaram

The Findings of the North American Chapter of the Association for Computational Linguistics, Jun 2024. (Acceptance rate 304/2434, 12.5%)

Paper Link

Edge Private Graph Neural Networks with Singular Value Perturbation

Author(s): T. Tang, Y. Niu, S. Avestimehr, M. Annavaram

Proceedings of the 24th annual Privacy Enhancing Technologies Symposium (PETS), Jul 2024. (Acceptance rate 99/456, 22%)

Paper Link

Demonstration of Tunable Optical Matrix Convolution of a 10-Gbaud QPSK 2-D Image with a Kernel Using Wave Mixing

Author(s): A. Minoofar, A. Alhaddad, W. Ko, H. Zhou, N. Karapetyan, Z. Jiang, A. Almaiman, M. Ramakrishnan, M. Annavaram, J.L. Habif, M. Tur, A.E. Willner

CLEO: Science and Innovations, May 2024.

Paper Link

LAORAM: Look Ahead ORAM for Recommendation Model Privacy

Author(s): Y. Wang, R. Rajat, M. Annavaram

Proceedings of the International Symposium on Computer Architecture (ISCA), Jun 2023. (Acceptance rate 79/373, 21%)

Paper Link

PageORAM: An Efficient DRAM Page Aware ORAM

Author(s): R. Rajat, Y. Wang, M. Annavaram

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, Oct 2022. (Acceptance rate 83/369, 22%)

Paper Link

StATIK: Structure and Text for Inductive Knowledge Graph Completion

Author(s): E.S. Markowitz, K. Balasubramanian, M. Mirtaheri, M. Annavaram, A. Galstyan, G. Ver Steeg

2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jul 2022.

Paper Link

Characterization of MPC-based Private Inferences for Transformer-based Models

Author(s): Y. Wang, E. Suh, W. Xiong, M. Annavaram, H. Lee

Proceedings of the International Symposium on Performance Analysis of Systems and Software, May 2022.

Paper Link

Enhancing Privacy Through Domain Adaptive Noise Injection for Speech Emotion Recognition

Author(s): T. Feng, H. Hashemi, M. Annavaram, S. Narayanan

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022.

Paper Link

AVCC: Adaptive Verifiable Coded Computing

Author(s): T. Tang, H. Hashemi, R. Ali, S. Avestimehr, M. Annavaram

Proceedings of the International Conference on Parallel and Distributed Processing Systems, May 2022. (Acceptance rate 46/474, 10% Round 1 acceptance)

Paper Link

SpreadGNN: Serverless Multi-task Federated Learning for Molecular Graphs

Author(s): E. Ceyani, C. He, K. Balasubramanian, M. Annavaram, S. Avestimehr

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), Feb 2022. (Acceptance rate 1349/9251, 15%)

Paper Link

Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models

Author(s): A. Eisenman, K. Matam, S. Ingram, D. Mudigere, R. Krishnamoorthi, K. Nair, M. Smelyanskiy, M. Annavaram

Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Apr 2022.

Paper Link

DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware

Author(s): H. Hashemi, Y. Wang, M. Annavaram

Proceedings of the 54th IEEE/ACM International Symposium on Microarchitecture, Oct 2021. (Acceptance rate 94/423, 22%)

Paper Link

cDLRM: Look Ahead Caching for Scalable Training of Recommendation Models

Author(s): K. Balasubramanian, A. Alshabanah, J. Choe, M. Annavaram

Proceedings of the 15th ACM Conference on Recommender Systems, Oct 2021. (Acceptance rate 49/267, 18%)

Paper Link

Origami Inference: Private Inference Using Hardware Enclaves

Author(s): K. Narra, Z. Lin, Y. Wang, K. Balasubramanian, M. Annavaram

IEEE International Conference on Cloud Computing (Short paper), Sep 2021. (Acceptance rate 23.8%)

Paper Link

MultiLogVC: Efficient Out-of-Core Graph Processing Framework on Flash Storage

Author(s): K. Matam, H. Hashemi, M. Annavaram

Proceedings of the International Conference on Parallel and Distributed Processing Systems, May 2021.

Paper Link

Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge

Author(s): C. He, M. Annavaram, S. Avestimehr

Advances in Neural Information Processing Systems, Dec 2020. (Acceptance rate 1900/9454, 20%)

Paper Link

Multi fluxon storage and its implications for microprocessor design

Author(s): N.K. Katam, H. Zha, M. Pedram, M. Annavaram

Journal of Physics: Conference Series, Jun 2020.

Paper Link

Collage Inference: Using Coded Redundancy for Lowering Latency Variation in Distributed Image Classification Systems

Author(s): K. Narra, Z. Lin, S. Avestimehr, G. Ananthanarayanan, M. Annavaram

Proceedings of the International Conference on Distributed Computing Systems, Jul 2020. (Acceptance rate 105/584, 18%)

Paper Link