Advances in AI at Johns Hopkins University - April 23, 2026
Apr 23, 2026
9 AM Pacific Time
Online. Register for the Zoom!
About this event
Join our virtual Meetup to hear talks from researchers at Johns Hopkins University on cutting-edge AI topics.
Schedule
Recent Advancements in Image Generation and Understanding
In this talk, I will provide an overview of my research and then take a closer look at three recent works. Image generation has progressed rapidly in the past decade—evolving from Gaussian Mixture Models (GMMs) to Variational Autoencoders (VAEs), GANs, and more recently diffusion models, which have set new standards for quality.
I will begin with DiffNat (TMLR’25), which draws inspiration from a simple yet powerful observation: the kurtosis concentration property of natural images. By incorporating a kurtosis concentration loss together with a perceptual guidance strategy, DiffNat can be plugged directly into existing diffusion pipelines, leading to sharper and more faithful generations across tasks such as personalization, super-resolution, and unconditional synthesis.
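The kurtosis concentration property says that the kurtosis of band-pass-filtered natural images stays nearly constant across filters, so a generation can be penalized when that kurtosis spreads out. The sketch below is an illustrative numpy toy, not DiffNat's actual loss: the random zero-mean kernels stand in for proper band-pass filters, and the penalty is simply the variance of kurtosis across filter responses.

```python
import numpy as np

def kurtosis(x):
    # Pearson kurtosis: E[(x - mu)^4] / sigma^4 (Gaussian data gives ~3).
    x = np.asarray(x, dtype=float).ravel()
    mu, sigma = x.mean(), x.std()
    return np.mean((x - mu) ** 4) / (sigma ** 4 + 1e-12)

def convolve2d_valid(img, kernel):
    # Naive 'valid' 2-D correlation, to keep the sketch dependency-free.
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def kurtosis_concentration_penalty(image, rng, n_filters=8, k=5):
    # Filter the image with several zero-mean random kernels (a crude
    # band-pass stand-in) and penalize how much the kurtosis of the
    # responses varies: low variance = concentrated kurtosis.
    responses = []
    for _ in range(n_filters):
        kernel = rng.standard_normal((k, k))
        kernel -= kernel.mean()
        responses.append(kurtosis(convolve2d_valid(image, kernel)))
    return float(np.var(responses))
```

In a diffusion pipeline this penalty would be added to the denoising objective; the filter bank, weighting, and perceptual-guidance details are as defined in the TMLR'25 paper.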
Continuing the theme of improving quality under constraints, I will then discuss DuoLoRA (ICCV’25), which tackles the challenge of content–style personalization from just a few examples. DuoLoRA introduces adaptive-rank LoRA merging with cycle-consistency, allowing the model to better disentangle style from content. This not only improves personalization quality but also achieves it with 19× fewer trainable parameters, making it far more efficient than conventional merging strategies.
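The core merging idea can be illustrated in a few lines: each LoRA update is a low-rank product, and a rank-wise merge scales each rank direction of the content and style adapters by its own gate before summing. The numpy sketch below uses fixed gates for clarity; in DuoLoRA the gates (and the cycle-consistency objective that trains them) are learned, which this toy does not show.

```python
import numpy as np

def lora_delta(A, B):
    # A LoRA update is a low-rank product delta_W = B @ A,
    # with A of shape (r, d_in) and B of shape (d_out, r).
    return B @ A

def merge_loras_rankwise(A_c, B_c, A_s, B_s, gates_c, gates_s):
    # Rank-wise merge: each of the r rank-1 directions in the content
    # adapter (A_c, B_c) and style adapter (A_s, B_s) is scaled by its
    # own gate before the two deltas are summed.
    delta_content = B_c @ np.diag(gates_c) @ A_c
    delta_style = B_s @ np.diag(gates_s) @ A_s
    return delta_content + delta_style
```

Setting all gates to 1 recovers plain additive merging; learning the gates lets the model keep only the rank directions that carry content from one adapter and style from the other.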
Finally, I will turn to Cap2Aug (WACV’25), which directly addresses data scarcity. This approach uses captions as a bridge for semantic augmentation, applying cross-modal backtranslation (image → text → image) to generate diverse synthetic samples. By aligning real and synthetic distributions, Cap2Aug boosts both few-shot and long-tail classification performance on multiple benchmarks.
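The backtranslation loop itself is simple control flow: caption the real image, then sample several images from the caption. The sketch below shows only that loop; `captioner` and `generator` are hypothetical stand-ins for a real captioning model and a text-to-image model, and the distribution-alignment step from the paper is not shown.

```python
from typing import Callable, List

def cap2aug_backtranslate(image, captioner: Callable, generator: Callable,
                          n_variants: int = 4) -> List:
    # Cross-modal backtranslation: image -> text -> image.
    # `captioner(image)` returns a caption string; `generator(caption, seed)`
    # returns a synthetic image. Varying the seed yields diverse samples
    # that all share the caption's semantics.
    caption = captioner(image)
    return [generator(caption, seed=s) for s in range(n_variants)]
```

For few-shot or long-tail classes, the returned synthetic samples would be mixed into the training set alongside the originals.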
From Representation Analysis to Data Refinement: Understanding Correlations in Deep Models
This talk examines how deep learning models encode information beyond their intended objectives and how such dependencies influence reliability, fairness, and generalization.
Representation-level analysis using mutual information–based expressivity estimation is introduced to quantify the extent to which attributes such as demographics or anatomical structural factors are implicitly captured in learned embeddings, even when they are not explicitly used for supervision. These analyses reveal hierarchical patterns of attribute encoding and highlight how correlated factors emerge across layers.
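One common way to make "implicitly captured" concrete is a probe: if a simple model trained on frozen embeddings can predict an attribute well above chance, the embeddings encode it. The numpy sketch below uses a ridge-regularized linear probe and reports accuracy above chance as a crude lower-bound proxy; the talk's mutual-information-based estimator is a different, more principled quantity that this toy only approximates in spirit.

```python
import numpy as np

def probe_expressivity(embeddings, attribute, rng):
    # Split embeddings into probe-train and probe-test halves.
    n = len(attribute)
    idx = rng.permutation(n)
    tr, te = idx[: n // 2], idx[n // 2:]
    # Fit a ridge-regularized linear probe by least squares,
    # with binary attribute labels mapped to {-1, +1}.
    X, y = embeddings[tr], np.where(attribute[tr] > 0, 1.0, -1.0)
    w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)
    pred = (embeddings[te] @ w) > 0
    acc = float(np.mean(pred == (attribute[te] > 0)))
    # Held-out accuracy above chance: a crude proxy for how much
    # attribute information the embeddings carry.
    return max(0.0, acc - 0.5)
```

Running this probe layer by layer is what exposes the hierarchical encoding patterns the talk describes: attributes that score near zero at early layers can become highly decodable deeper in the network.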
Data attribution techniques are then discussed to identify influential training samples that contribute to model errors and reinforce dependencies that reduce robustness. By auditing the training data through influence estimation, harmful instances can be identified and removed to improve model behavior.
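As one concrete instance of influence estimation (a TracIn-style gradient-alignment score, shown here for a linear model with squared loss; not necessarily the talk's exact estimator): a training example whose loss gradient aligns positively with a test example's gradient is one whose updates would reduce that test loss, while strongly negative scores flag candidates for removal.

```python
import numpy as np

def grad_loss_linear(w, x, y):
    # Gradient of the squared loss 0.5 * (w.x - y)^2 with respect to w.
    return (w @ x - y) * x

def influence_scores(w, X_train, y_train, x_test, y_test):
    # TracIn-style influence: the dot product between the test example's
    # loss gradient and each training example's loss gradient. Positive
    # scores mark helpful examples; strongly negative scores mark
    # training points whose updates push the test loss up.
    g_test = grad_loss_linear(w, x_test, y_test)
    return np.array([g_test @ grad_loss_linear(w, x, y)
                     for x, y in zip(X_train, y_train)])
```

Auditing then amounts to ranking the training set by these scores against a pool of error cases and removing (or relabeling) the consistently harmful tail.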
Together, these components highlight a unified, data-centric perspective for analyzing and refining correlations in deep models.
Scalable & Precise Histopathology: Next-Gen Deep Learning for Digital Histopathology
Whole slide images (WSIs) present a unique computational challenge in digital pathology, with single images reaching gigapixel resolution—equivalent to 500+ photos stitched together.
This talk presents two complementary deep learning solutions for scalable and accurate WSI analysis. First, I introduce a Task-Specific Self-Supervised Learning (TS-SSL) framework that uses spatial-channel attention to learn domain-optimized feature representations, outperforming existing foundation models across multiple cancer classification benchmarks.
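To give a feel for what "spatial-channel attention" means, here is a CBAM-style numpy sketch (an assumption for illustration, not the TS-SSL module itself): a channel gate computed from globally pooled features, followed by a spatial gate computed from per-location channel statistics.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_channel_attention(feat, W1, W2, w_sp):
    # feat: (C, H, W) feature map; W1 (r, C) and W2 (C, r) form the
    # channel-gate bottleneck; w_sp scales the spatial descriptor.
    # Channel attention: squeeze (global average pool) -> 2-layer gate.
    squeeze = feat.mean(axis=(1, 2))                       # (C,)
    ch_gate = sigmoid(W2 @ np.maximum(W1 @ squeeze, 0.0))  # (C,) in (0, 1)
    feat = feat * ch_gate[:, None, None]
    # Spatial attention: per-location channel mean -> gate map.
    sp_desc = feat.mean(axis=0)                            # (H, W)
    sp_gate = sigmoid(w_sp * sp_desc)                      # (H, W) in (0, 1)
    return feat * sp_gate[None, :, :]
```

Both gates multiply the features by values in (0, 1), so the module re-weights where and in which channels the representation concentrates without changing its shape.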
Second, I present CEMIL, a contextual attention-based MIL framework that leverages instructor-learner knowledge distillation to classify cancer subtypes using only a fraction of WSI patches. This approach achieves state-of-the-art accuracy with significantly reduced computational cost.
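The MIL backbone behind such frameworks can be sketched briefly. Below is standard attention-based MIL pooling in numpy (in the style of Ilse et al.'s attention MIL, used here as an assumed baseline; CEMIL's contextual attention and instructor-learner distillation are not shown): each patch embedding gets a learned weight, and the slide-level representation is the weighted sum.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention_mil_pool(instances, V, w):
    # instances: (n_patches, d) patch embeddings from one WSI.
    # V (d, h) and w (h,) parameterize the attention scorer. The softmax
    # weights decide how much each patch contributes to the slide-level
    # embedding, which then feeds a subtype classifier.
    scores = np.tanh(instances @ V) @ w      # (n_patches,)
    alpha = softmax(scores)
    return alpha @ instances, alpha          # slide embedding, patch weights
```

Because the pooled embedding depends only on the sampled patches, classifying from a small, well-chosen fraction of patches (as CEMIL does) directly cuts the gigapixel-scale compute.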
Together, these methods address critical bottlenecks in generalization and efficiency for clinical-grade computational pathology.