I am a fourth-year PhD student in Computer Science at Oregon State University (OSU), advised by Dr. Xiao Fu. My research focuses on understanding and designing principled methods for unsupervised domain translation, generative models, and multiview/multimodal learning.
Prior to my PhD, I received my MS in Computer Science from OSU in 2023, also advised by Dr. Xiao Fu. During my MS, I developed efficient and principled ML methods for important problems in wireless communications and federated learning. The work was funded jointly by Intel and NSF under the Machine Learning for Wireless Networking Systems (MLWiNS) program.
Before joining OSU, I co-founded Paaila Technology, where I led the development of waiter robots and a banking assistant chatbot.
During my undergraduate studies, I was a member of Team Nepal for ABU Robocon 2015 and 2016, where my team won many awards.
Subash Timilsina, Sagar Shrestha, Xiao Fu
Neural Information Processing Systems (NeurIPS) 2024
This work investigates shared component identifiability from multi-modal linear mixtures with unaligned cross-modality samples, extending beyond previous research on aligned samples. We propose a distribution divergence minimization-based loss and derive sufficient conditions for shared component identifiability. Our approach, based on cross-modality distribution discrepancy and density-preserving transform removal, offers milder conditions than existing methods. We also provide relaxed conditions through structural constraints motivated by side information. Promising results are obtained for domain adaptation, single cell multi-modal data alignment, and multilingual lexicon embeddings alignment.
Sagar Shrestha, Xiao Fu
International Conference on Learning Representations (ICLR) 2024
Unsupervised domain translation (UDT) is the problem of learning a mapping between two domains without paired data. Existing methods (e.g., CycleGAN and follow-up works) generally tackle the problem with distribution matching objectives. A fundamental issue with these approaches is the "non-identifiability" of translation maps due to the existence of multiple solutions. This results in "content-misaligned" translations in practice. Our work proposes a simple yet provable solution to address this issue. We reveal interesting insights into the nature of the UDT problem. The proposed identifiable solution is demonstrated to be effective in avoiding the content-misalignment issue.
Jiahui Song*, Sagar Shrestha*, Xueshen Li, Yu Gan, and Xiao Fu (* Equal Contribution)
IEEE SAM Workshop 2024
Optical Coherence Tomography (OCT) provides detailed cross-sectional images of coronary arteries, but cost-effective systems produce only low-resolution images. Unsupervised OCT super-resolution (OCT-SR) offers a solution without requiring high-resolution systems or paired images. Existing methods use CycleGAN, which lacks translation identifiability, leading to incorrect results. Our work proposes a translation identifiability-guided framework using a diversified distribution matching module. This approach guarantees OCT translation identifiability under reasonable conditions with a simple learning loss. Results show our framework matches or exceeds state-of-the-art performance while requiring fewer resources such as annotations, computation time, and memory.
Sagar Shrestha, Xiao Fu, Mingyi Hong
IEEE Transactions on Signal Processing (TSP) 2023
Joint beamforming and antenna selection is a mixed non-convex continuous and combinatorial optimization problem. Existing methods are based on continuous convex approximation or greedy solutions that are suboptimal. We address this issue by first proposing an optimal Branch and Bound (B&B) algorithm for the problem. To address the potential scalability issue of B&B, we propose a GNN-based policy, learned via imitation learning, to speed up B&B. The proposed approach is demonstrated to be effective in achieving near-optimal solutions with significantly reduced computational complexity.
Sagar Shrestha, Xiao Fu
IEEE Transactions on Signal Processing (TSP) 2023
Classic and deep learning-based generalized canonical correlation analysis (GCCA) algorithms seek low-dimensional common representations of data entities from multiple “views” (e.g., audio and image) using linear transformations and neural networks, respectively. This work considers the setting where the views are acquired and stored at different computing agents (e.g., organizations and edge devices) and data sharing is undesired due to privacy or communication cost considerations. It puts forth a convergence-guaranteed communication-efficient federated learning framework for both linear and deep GCCA under the Maximum Variance (MAX-VAR) formulation. Compared to the unquantized version, the proposed algorithm enjoys more than 90% communication overhead reduction with virtually no loss in accuracy and convergence speed.
Subash Timilsina, Sagar Shrestha, Xiao Fu, Mingyi Hong
IEEE Transactions on Signal Processing (TSP) 2023
Spectrum cartography (SC), also known as radio map estimation (RME), aims at crafting multi-domain (e.g., frequency and space) radio power propagation maps from limited sensor measurements. Existing SC approaches assume that sensors send real-valued (full-resolution) measurements to the fusion center, which is unrealistic. This work puts forth a quantized SC framework that generalizes the BTD and DGM-based SC to scenarios where heavily quantized sensor measurements are used. A maximum likelihood estimation (MLE)-based SC framework under a Gaussian quantizer is proposed. Recoverability of the radio map using the MLE criterion is characterized under realistic conditions, e.g., imperfect radio map modeling and noisy measurements. Simulations and real-data experiments are used to showcase the effectiveness of the proposed approach.
Sagar Shrestha, Xiao Fu, Mingyi Hong
IEEE Transactions on Signal Processing (TSP) 2022
The spectrum cartography (SC) technique constructs multi-domain (e.g., frequency, space, and time) radio frequency (RF) maps from limited measurements, which can be viewed as an ill-posed tensor completion problem. In this work, an emitter radio map disaggregation-based approach is proposed, under which only individual emitters' radio maps are modeled by DNNs. Using the learned DNNs, a fast nonnegative matrix factorization-based two-stage SC method and a performance-enhanced iterative optimization algorithm are proposed. Theoretical aspects of the proposed framework, such as recoverability of the radio tensor, sample complexity, and noise robustness, are characterized; such theoretical properties have been elusive in the context of DL-based radio tensor completion. Experiments using synthetic and real data from indoor and heavily shadowed environments are employed to showcase the effectiveness of the proposed methods.
A note on how to view flow matching and diffusion under the same framework, which gives great flexibility in designing the probability path, training the score or vector field, and sampling with an SDE or ODE.
A short note on understanding the diffusion and flow matching objectives and their solutions.
A brief survey of joint representation learning of text and images.
A distilled explanation of flow matching / stochastic interpolants / rectified flow. It features an easy-to-follow proof (not found in the original paper) of the flow-matching objective with stochastic interpolants.
A beginner's introduction to ICA.