About me

I am a fourth-year PhD student in Computer Science at Oregon State University (OSU), advised by Dr. Xiao Fu. My research focuses on understanding and designing principled methods for unsupervised domain translation, generative models, and multiview/multimodal learning.

Prior to my PhD, I received my MS in Computer Science from OSU in 2023, also advised by Dr. Xiao Fu. During my MS, I developed efficient and principled ML methods for important problems in wireless communications and federated learning. The work was funded jointly by Intel and NSF under the Machine Learning for Wireless Networking Systems (MLWiNS) program.

Before joining OSU, I co-founded Paaila Technology, where I led the development of waiter robots and a banking assistant chatbot.

During my undergraduate studies, I was a member of Team Nepal for ABU Robocon 2015 and 2016, where my team won many awards.

Select Publications

Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions

Sagar Shrestha, Xiao Fu

[Under Review]

This work develops a novel framework for identifying latent content and style variables from unaligned multi-domain data, a crucial challenge in domain translation and generative modeling. We introduce cross-domain latent distribution matching (LDM) along with a sparsity constraint to identify the latent content and style variables without knowledge of the latent dimensions. Furthermore, we translate LDM into a computationally efficient regularized multi-domain GAN formulation, achieving superior results with reduced overhead compared to existing methods. Experiments validate the practicality and theoretical soundness of our approach, advancing the frontiers of unsupervised representation learning.

Paper | Bibtex | Code (Coming Soon)

Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures

Subash Timilsina, Sagar Shrestha, Xiao Fu

Neural Information Processing Systems (NeurIPS) 2024

This work investigates shared component identifiability from multi-modal linear mixtures with unaligned cross-modality samples, extending beyond previous research on aligned samples. We propose a distribution divergence minimization-based loss and derive sufficient conditions for shared component identifiability. Our approach, based on cross-modality distribution discrepancy and density-preserving transform removal, offers milder conditions than existing methods. We also provide relaxed conditions through structural constraints motivated by side information. Promising results are obtained for domain adaptation, single-cell multi-modal data alignment, and multilingual lexicon embedding alignment.

Paper | Bibtex | Code

Towards Identifiable Unsupervised Domain Translation: A Diversified Distribution Matching Approach

Sagar Shrestha, Xiao Fu

International Conference on Learning Representations (ICLR) 2024

Unsupervised domain translation (UDT) is the problem of learning a mapping between two domains without paired data. Existing methods (e.g., CycleGAN and follow-up works) generally tackle the problem with distribution matching objectives. A fundamental issue with these approaches is the "non-identifiability" of translation maps due to the existence of multiple solutions. This results in "content-misaligned" translations in practice. Our work proposes a simple yet provable solution to address this issue. We reveal interesting insights into the nature of the UDT problem. The proposed identifiable solution is demonstrated to be effective in avoiding the content-misalignment issue.

Paper | Bibtex | Code

Translation Identifiability-Guided Unsupervised Cross-Platform Super-Resolution for OCT Images

Jiahui Song*, Sagar Shrestha*, Xueshen Li, Yu Gan, and Xiao Fu (* Equal Contribution)

IEEE SAM Workshop 2024

Optical Coherence Tomography (OCT) provides detailed cross-sectional images of coronary arteries, but cost-effective systems produce only low-resolution images. Unsupervised OCT super-resolution (OCT-SR) offers a solution without requiring high-resolution systems or paired images. Existing methods use CycleGAN, which lacks translation identifiability, leading to incorrect results. Our work proposes a translation identifiability-guided framework using a diversified distribution matching module. This approach guarantees OCT translation identifiability under reasonable conditions with a simple learning loss. Results show our framework matches or exceeds state-of-the-art performance while requiring fewer resources such as annotations, computation time, and memory.

Paper | Bibtex

Optimal Solutions for Joint Beamforming and Antenna Selection: From Branch and Bound to Graph Neural Imitation Learning

Sagar Shrestha, Xiao Fu, Mingyi Hong

IEEE Transactions on Signal Processing (TSP) 2023

Joint beamforming and antenna selection is a mixed non-convex continuous and combinatorial optimization problem. Existing methods are based on continuous convex approximation or greedy solutions that are suboptimal. We address this issue by first proposing an optimal Branch and Bound (B&B) algorithm for the problem. To address the potential scalability issue of B&B, we propose a GNN-based policy, learned via imitation learning, to speed up the B&B procedure. The proposed approach is demonstrated to be effective in achieving near-optimal solutions with significantly reduced computational complexity.

Paper | Bibtex | Code

Communication-efficient Federated Linear and Deep Generalized Canonical Correlation Analysis

Sagar Shrestha, Xiao Fu

IEEE Transactions on Signal Processing (TSP) 2023

Classic and deep learning-based generalized canonical correlation analysis (GCCA) algorithms seek low-dimensional common representations of data entities from multiple “views” (e.g., audio and image) using linear transformations and neural networks, respectively. For settings where the views are acquired and stored at different computing agents (e.g., organizations and edge devices) and data sharing is undesired due to privacy or communication cost considerations, this work puts forth a convergence-guaranteed, communication-efficient federated learning framework for both linear and deep GCCA under the Maximum Variance (MAX-VAR) formulation. Compared to the unquantized version, the proposed algorithm enjoys more than 90% communication overhead reduction with virtually no loss in accuracy and convergence speed.

Paper | Bibtex | Code

Quantized Radio Map Estimation Using Tensor and Deep Generative Models

Subash Timilsina, Sagar Shrestha, Xiao Fu, Mingyi Hong

IEEE Transactions on Signal Processing (TSP) 2023

Spectrum cartography (SC), also known as radio map estimation (RME), aims at crafting multi-domain (e.g., frequency and space) radio power propagation maps from limited sensor measurements. Existing SC approaches assume that sensors send real-valued (full-resolution) measurements to the fusion center, which is unrealistic. This work puts forth a quantized SC framework that generalizes block-term tensor decomposition (BTD)-based and deep generative model (DGM)-based SC to scenarios where heavily quantized sensor measurements are used. A maximum likelihood estimation (MLE)-based SC framework under a Gaussian quantizer is proposed. Recoverability of the radio map using the MLE criterion is characterized under realistic conditions, e.g., imperfect radio map modeling and noisy measurements. Simulations and real-data experiments are used to showcase the effectiveness of the proposed approach.

Paper | Bibtex | Code

Deep Spectrum Cartography: Completing Radio Map Tensors Using Learned Neural Models

Sagar Shrestha, Xiao Fu, Mingyi Hong

IEEE Transactions on Signal Processing (TSP) 2022

The spectrum cartography (SC) technique constructs multi-domain (e.g., frequency, space, and time) radio frequency (RF) maps from limited measurements, which can be viewed as an ill-posed tensor completion problem. In this work, an emitter radio map disaggregation-based approach is proposed, under which only individual emitters' radio maps are modeled by DNNs. Using the learned DNNs, a fast nonnegative matrix factorization-based two-stage SC method and a performance-enhanced iterative optimization algorithm are proposed. Theoretical aspects (such as recoverability of the radio tensor, sample complexity, and noise robustness) under the proposed framework are characterized; such theoretical properties have been elusive in the context of DL-based radio tensor completion. Experiments using synthetic and real data from indoor and heavily shadowed environments are employed to showcase the effectiveness of the proposed methods.

Paper | Bibtex | Code

Blogs

October 4, 2024

Unified Perspective on Diffusion and Flow Matching

A note on how to view flow matching and diffusion under the same framework, which gives great flexibility in designing the probability path, training the score or vector field, and sampling with an SDE or ODE.

September 3, 2024

Optimal Solutions for Diffusion and Flow Matching Objectives

A short note on understanding the diffusion and flow matching objectives and their solutions.

June 28, 2024

Vision Language Representation Learning

A brief survey of joint representation learning of text and images.

February 20, 2024

Understanding Flow Matching-based Generative Models

A distilled explanation of flow matching / stochastic interpolants / rectified flow. It features an easy-to-follow proof (not found in the original paper) of the flow-matching objective with stochastic interpolants.

October 29, 2023

A High Schooler's Guide to Independent Component Analysis

A beginner's introduction to ICA.