publications
(*) denotes equal contribution
2024
- ICML (Oral): Symbolic Music Generation with Non-differentiable Rule Guided Diffusion. Yujia Huang, Adishree Ghatare, Yuanzhe Liu, Ziniu Hu, Qinsheng Zhang, Chandramouli S Sastry, Siddharth Gururani, Sageev Oore, and Yisong Yue. In International Conference on Machine Learning, 2024.
We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form as constraints on note characteristics, such as note density or chord progression; many of these are non-differentiable, which poses a challenge when using them for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that requires only forward evaluation of the rule functions and works with pre-trained diffusion models in a plug-and-play way, thus achieving training-free guidance for non-differentiable rules for the first time. Additionally, we introduce a latent diffusion architecture for symbolic music generation with high time resolution, which can be composed with SCG in a plug-and-play fashion. Compared to standard strong baselines in symbolic music generation, this framework demonstrates marked advances in music quality and rule-based controllability, outperforming current state-of-the-art generators in a variety of settings. For detailed demonstrations, code, and model checkpoints, please visit our [project website](https://scg-rule-guided-music.github.io).
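The core idea of guiding with only forward rule evaluations can be sketched as a best-of-n selection among stochastic reverse-diffusion candidates at each step. Everything below is a toy stand-in, not the paper's implementation: the reverse-step dynamics are placeholders, and `rule` is a hypothetical note-density rule on a piano-roll-like array.

```python
import numpy as np

rng = np.random.default_rng(0)

def rule(x):
    # Hypothetical non-differentiable rule: note density, i.e. the
    # fraction of "active" cells in a (time x pitch) piano-roll slice.
    return (x > 0.5).mean()

def reverse_step_candidates(x, n_candidates=8):
    # Several stochastic realizations of one reverse-diffusion step,
    # sketched as a denoising drift plus fresh noise (placeholder
    # dynamics; a real sampler would use a trained diffusion model).
    drift = -0.1 * x
    return [x + drift + 0.1 * rng.standard_normal(x.shape)
            for _ in range(n_candidates)]

def guided_step(x, target=0.3):
    # Keep the candidate whose rule value is closest to the target.
    # Only forward evaluations of the rule are needed, so the rule
    # may be non-differentiable.
    candidates = reverse_step_candidates(x)
    losses = [abs(rule(c) - target) for c in candidates]
    return candidates[int(np.argmin(losses))]

x = rng.standard_normal((16, 88))   # small piano-roll-like array
for _ in range(10):
    x = guided_step(x)
```

The selection step never needs gradients of `rule`, which is why the same scheme can work with rules like chord progression that are defined only symbolically.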
2023
- Preprint: FI-ODE: Certified and Robust Forward Invariance in Neural ODEs. Yujia Huang*, Ivan Dario Jimenez Rodriguez*, Huan Zhang, Yuanyuan Shi, and Yisong Yue. arXiv preprint arXiv:2210.16940, 2023.
Forward invariance is a long-studied property in control theory that is used to certify that a dynamical system stays within some pre-specified set of states for all time, and it also admits robustness guarantees (e.g., the certificate holds under perturbations). We propose a general framework for training and provably certifying robust forward invariance in Neural ODEs. We apply this framework to provide certified safety in robust continuous control. To our knowledge, this is the first instance of training Neural ODE policies with such non-vacuous certified guarantees. In addition, we explore the generality of our framework by using it to certify adversarial robustness for image classification.
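The forward-invariance idea can be illustrated on a toy system: a sublevel set {x : V(x) <= c} is invariant if dV/dt <= 0 on its boundary. The dynamics and certificate below are simple stand-ins (a stable linear flow and V(x) = ||x||^2), not the paper's Neural ODE, and the boundary check is a sampled sanity check rather than a formal certificate.

```python
import numpy as np

def f(x):
    return -x                     # toy stable dynamics, not a Neural ODE

def V(x):
    return float(x @ x)           # hand-picked candidate certificate

def V_dot(x):
    return float(2 * x @ f(x))    # dV/dt along trajectories of f

rng = np.random.default_rng(0)
c = 1.0

# Invariance condition: V_dot <= 0 everywhere on the boundary {V(x) = c}
# implies trajectories starting inside never leave. Here we only spot-check
# it on random boundary samples.
boundary = rng.standard_normal((100, 3))
boundary *= np.sqrt(c) / np.linalg.norm(boundary, axis=1, keepdims=True)
certified = all(V_dot(x) <= 0 for x in boundary)

# Empirical confirmation: an Euler rollout from inside the set stays inside.
x = np.array([0.5, 0.5, 0.5])
for _ in range(1000):
    x = x + 0.01 * f(x)
inside = V(x) <= c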
2022
- ICML: Diffusion Models for Adversarial Purification. Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, and Anima Anandkumar. In International Conference on Machine Learning, 2022.
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods make no assumptions about the form of the attack or the classification model, and thus can defend pre-existing classifiers against unseen threats. However, their performance currently falls behind adversarial training methods. In this work, we propose DiffPure, which uses diffusion models for adversarial purification: given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through a reverse generative process. To evaluate our method against strong adaptive attacks in an efficient and scalable way, we propose to use the adjoint method to compute full gradients of the reverse generative process. Extensive experiments on three image datasets (CIFAR-10, ImageNet, and CelebA-HQ) with three classifier architectures (ResNet, WideResNet, and ViT) demonstrate that our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods, often by a large margin.
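The diffuse-then-denoise recipe can be sketched in a few lines. This is a minimal sketch under assumed toy dynamics: the noise schedule is a simple exponential and the "score" is that of a standard Gaussian prior, standing in for a pretrained diffusion model's score network.

```python
import numpy as np

rng = np.random.default_rng(0)

def purify(x_adv, t_star=0.1, n_steps=20):
    """Sketch of diffusion purification (toy VP-style schedule).

    1) Forward-diffuse the adversarial input by a small time t_star,
       which washes out high-frequency adversarial perturbations.
    2) Integrate a deterministic reverse update back to t = 0.
    The score below is a placeholder (score of N(0, I)); a real
    implementation would call a pretrained score network.
    """
    alpha_bar = np.exp(-t_star)                    # toy noise schedule
    x = (np.sqrt(alpha_bar) * x_adv
         + np.sqrt(1.0 - alpha_bar) * rng.standard_normal(x_adv.shape))

    dt = t_star / n_steps
    for _ in range(n_steps):
        score = -x                                 # placeholder score model
        x = x + (0.5 * x + score) * dt             # contracts toward the data
    return x

x_adv = rng.standard_normal((3, 32, 32))           # adversarial image stand-in
x_clean = purify(x_adv)
```

The purified `x_clean` would then be fed to the unmodified, pre-existing classifier, which is what makes the defense plug-and-play.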
2021
- NeurIPS: Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds. Yujia Huang, Huan Zhang, Yuanyuan Shi, J. Zico Kolter, and Anima Anandkumar. In Neural Information Processing Systems, 2021.
Certified robustness is a desirable property for deep neural networks in safety-critical applications, and popular training algorithms can certify robustness of a neural network by computing a global bound on its Lipschitz constant. However, such a bound is often loose: it tends to over-regularize the neural network and degrade its natural accuracy. A tighter Lipschitz bound may provide a better tradeoff between natural and certified accuracy, but is generally hard to compute exactly due to non-convexity of the network. In this work, we propose an efficient and trainable local Lipschitz upper bound by considering the interactions between activation functions (e.g., ReLU) and weight matrices. Specifically, when computing the induced norm of a weight matrix, we eliminate the corresponding rows and columns where the activation function is guaranteed to be a constant in the neighborhood of each given data point, which provides a provably tighter bound than the global Lipschitz constant of the neural network. Our method can be used as a plug-in module to tighten the Lipschitz bound in many certifiable training algorithms. Furthermore, we propose to clip activation functions (e.g., ReLU and MaxMin) with a learnable upper threshold and a sparsity loss to help the network achieve an even tighter local Lipschitz bound. Experimentally, we show that our method consistently outperforms state-of-the-art methods in both clean and certified accuracy on the MNIST, CIFAR-10, and TinyImageNet datasets with various network architectures.
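The row/column elimination idea can be demonstrated on a two-layer ReLU network. The interval relaxation below (per-unit slack from row norms of the first layer) is a crude stand-in for the paper's bound-propagation machinery, but the key step is the same: units whose pre-activation is provably non-positive over the whole neighborhood have constant (zero) ReLU output, so the matching rows and columns can be dropped before taking induced norms.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_norm(M):
    # Induced 2-norm (largest singular value).
    return np.linalg.norm(M, 2)

# Toy two-layer net: x -> ReLU(W1 @ x) -> W2 @ h.
W1 = rng.standard_normal((64, 32))
W2 = rng.standard_normal((10, 64))
x0 = rng.standard_normal(32)
eps = 0.1

# Crude interval bound on pre-activations over the l2 ball around x0:
# |W1[i] @ (x - x0)| <= eps * ||W1[i]||.
z = W1 @ x0
slack = eps * np.linalg.norm(W1, axis=1)
upper = z + slack

# ReLU is provably constant (identically zero) on units with upper <= 0:
# drop the matching rows of W1 and columns of W2 before taking norms.
active = upper > 0
local_bound = spectral_norm(W2[:, active]) * spectral_norm(W1[active])
global_bound = spectral_norm(W2) * spectral_norm(W1)
```

Removing rows or columns can never increase a matrix's largest singular value, so `local_bound <= global_bound` holds by construction; the gap is what buys the better natural/certified accuracy tradeoff.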
2020
- NeurIPS: Neural Networks with Recurrent Generative Feedback. Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, and Anima Anandkumar. In Neural Information Processing Systems, 2020.
Neural networks are vulnerable to input perturbations such as additive noise and adversarial attacks. In contrast, human perception is much more robust to such perturbations. The Bayesian brain hypothesis states that human brains use an internal generative model to update the posterior beliefs of the sensory input. This mechanism can be interpreted as a form of self-consistency between the maximum a posteriori (MAP) estimation of the internal generative model and the external environment. Inspired by this framework, we enforce consistency in neural networks by incorporating generative recurrent feedback. We implement this framework on convolutional neural networks (CNNs). The proposed framework, Convolutional Neural Networks with Feedback (CNN-F), introduces generative feedback with latent variables into existing CNN architectures, making consistent predictions via alternating MAP inference under a Bayesian framework. CNN-F shows considerably better adversarial robustness than regular feedforward CNNs on standard benchmarks. In addition, with higher V4 and IT neural predictivity, CNN-F produces object representations closer to primate vision than conventional CNNs.
- JBO: Investigating ultrasound–light interaction in scattering media. Yujia Huang, Michelle Cua, Joshua Brake, Yan Liu, and Changhuei Yang. Journal of Biomedical Optics, 2020.
Significance: Ultrasound-assisted optical imaging techniques, such as ultrasound-modulated optical tomography, allow for imaging deep inside scattering media. In these modalities, a fraction of the photons passing through the ultrasound beam is modulated. The efficiency by which the photons are converted is typically referred to as the ultrasound modulation’s “tagging efficiency.” Interestingly, this efficiency has been defined in varied and discrepant fashion throughout the scientific literature. Aim: The aim of this study is to quantify the ultrasound tagging efficiency in a manner consistent with its definition and to experimentally verify the contributive (or noncontributive) relationship between the mechanisms involved in the ultrasound optical modulation process. Approach: We adopt a general description of the tagging efficiency as the fraction of photons traversing an ultrasound beam that is frequency shifted (inclusive of all frequency-shifted components). We then systematically studied the impact of ultrasound pressure and frequency on the tagging efficiency through a balanced detection measurement system that measured the power of each order of the ultrasound-tagged light, as well as the power of the unmodulated light component. Results: Through our experiments, we showed that the tagging efficiency can reach 70% in a scattering phantom with a scattering anisotropy of 0.9 and a scattering coefficient of 4 mm−1 for a 1-MHz ultrasound with a relatively low (and biomedically acceptable) peak pressure of 0.47 MPa. Furthermore, we experimentally confirmed that the two ultrasound-induced light modulation mechanisms, particle displacement and refractive index change, act in opposition to each other. Conclusion: Tagging efficiency was quantified via simulation and experiments. These findings reveal avenues of investigation that may help improve ultrasound-assisted optical imaging techniques.
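Under the definition adopted above, tagging efficiency is just the frequency-shifted fraction of the total detected power, summed over every sideband order. The numbers below are hypothetical measured powers, chosen for illustration to land on the 70% figure reported in the abstract.

```python
# Hypothetical measured optical powers (arbitrary units): the unshifted
# carrier and the ultrasound-shifted sideband orders (|n| = 1, 2 pairs).
p_unmodulated = 0.30
p_sidebands = [0.35, 0.25, 0.06, 0.04]   # illustrative values only

# Tagging efficiency per the definition in the text: the fraction of
# photons traversing the ultrasound beam that is frequency shifted,
# including all shifted orders.
total = p_unmodulated + sum(p_sidebands)
efficiency = sum(p_sidebands) / total
print(f"tagging efficiency: {efficiency:.0%}")
```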
- Nature Photonics: Fluorescence imaging through dynamic scattering media with speckle-encoded ultrasound-modulated light correlation. Haowen Ruan, Yan Liu, Jian Xu, Yujia Huang, and Changhuei Yang. Nature Photonics, 2020.
Fluorescence imaging is indispensable to biomedical research, and yet it remains challenging to image through dynamic scattering samples. Techniques that combine ultrasound and light as exemplified by ultrasound-assisted wavefront shaping have enabled fluorescence imaging through scattering media. However, the translation of these techniques into in vivo applications has been hindered by the lack of high-speed solutions to counter the fast speckle decorrelation of dynamic tissue. Here, we report an ultrasound-enabled optical imaging method that instead leverages the dynamic nature to perform imaging. The method utilizes the correlation between the dynamic speckle-encoded fluorescence and ultrasound-modulated light signal that originate from the same location within a sample. We image fluorescent targets with an improved resolution of ≤75 µm (versus a resolution of 1.3 mm with direct optical imaging) within a scattering medium with 17 ms decorrelation time. This new imaging modality paves the way for fluorescence imaging in highly scattering tissue in vivo.
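The correlation principle behind this method can be sketched numerically. The model below is a deliberately simplified stand-in for the experiment: each fluorescent source is modulated by an independent random speckle time trace, the detector sees only their unresolved sum, and the ultrasound-modulated signal at a scan position is assumed to carry the same trace as the source there.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: 5 fluorescent sources behind a dynamic scatterer.
n_sources, n_frames = 5, 20000
concentration = np.array([0.0, 1.0, 0.0, 0.5, 0.0])  # illustrative brightness
speckle = rng.standard_normal((n_sources, n_frames))  # dynamic speckle traces

# Bucket-detected fluorescence: the sum of all sources, each flickering
# with its own speckle-encoded trace -- no spatial resolution on its own.
fluorescence = concentration @ speckle

# Scanning the ultrasound focus over the sources and correlating the
# modulated-light signal (same local speckle trace) with the total
# fluorescence recovers the brightness at each focus position.
image = np.array([np.mean(fluorescence * speckle[i])
                  for i in range(n_sources)])
```

Because the speckle traces of different locations decorrelate, cross terms average toward zero and the correlation image approaches the source brightness map; the fast decorrelation of living tissue becomes the signal rather than the obstacle.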
2018
- Nature Comm.: Multi-color live-cell super-resolution volume imaging with multi-angle interference microscopy. Youhua Chen, Wenjie Liu, Zhimin Zhang, Cheng Zheng, Yujia Huang, Ruizhi Cao, Dazhao Zhu, Liang Xu, Meng Zhang, Yu-Hui Zhang, and others. Nature Communications, 2018.
Imaging and tracking of near-surface three-dimensional volumetric nanoscale dynamic processes of live cells remains a challenging problem. In this paper, we propose a multi-color live-cell near-surface-volume super-resolution microscopy method that combines total internal reflection fluorescence structured illumination microscopy with multi-angle evanescent light illumination. We demonstrate that our approach of multi-angle interference microscopy is perfectly adapted to studying subcellular dynamics of mitochondria and microtubule architectures during cell migration.