Bibliography · MMXVI—MMXXVI
Publications
A decade of papers across multimodal learning, vision, audio, and computational understanding of media.
· · ·
Below is the complete list — peer-reviewed papers, workshop contributions, arXiv preprints, and one patent — grouped by year, newest first. Papers marked with a small accent dot (selected) are the ones I’d point a new reader to first. BibTeX for the entire list is available here, and an almost-always-fresher list lives on Google Scholar.
19 papers · 10 years · 9 venues · 1 patent
2025 · 1 paper
- Interspeech 2025. Can Multimodal Foundation Models Help Analyze Child-Inclusive Autism Diagnostic Videos? Proceedings of Interspeech 2025, Rotterdam.
2024 · 2 papers
- ICMI 2024. Can Text-to-image Models Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? Proceedings of the 26th International Conference on Multimodal Interaction, San Jose, Costa Rica.
- ICASSP 2024. Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization. IEEE International Conference on Acoustics, Speech and Signal Processing.
2023 · 9 papers
- EMNLP Findings 2023. Domain Adaptation for Sentiment Analysis Using Robust Internal Representations. Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore.
- ACM MM 2023 (first author). MM-AU: Towards Multimodal Understanding of Advertisement Videos. Proceedings of the 31st ACM International Conference on Multimedia.
- ACM MM 2023. SEAR: Semantically-grounded Audio Representations. Proceedings of the 31st ACM International Conference on Multimedia.
- WACV 2023 (selected, first author). MovieCLIP: Visual Scene Recognition in Movies. IEEE/CVF Winter Conference on Applications of Computer Vision.
- KDD 2023 (selected). FedMultimodal: A Benchmark for Multimodal Federated Learning. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
- ICASSP 2023 (first author). Contextually-Rich Human Affect Perception Using Multimodal Scene Information. IEEE International Conference on Acoustics, Speech and Signal Processing.
- ICASSP 2023. A Dataset for Audio-Visual Sound Event Detection in Movies. IEEE International Conference on Acoustics, Speech and Signal Processing.
- ICASSP 2023. Signal Processing Grand Challenge 2023 — E-Prevention: Sleep Behavior as an Indicator of Relapses in Psychotic Patients. IEEE International Conference on Acoustics, Speech and Signal Processing.
- ICASSP ’23 · Workshop. Multimodal Estimation of Change Points of Physiological Arousal During Driving. ICASSP Workshop on Ambient AI: Multimodal Wearable Sensor Understanding.
2022 · 1 paper
- FPSAM 2022 (selected, first author). Automatic Analysis of Asymmetry in Facial Paralysis Patients Using Landmark-Based Measures. Facial Plastic Surgery & Aesthetic Medicine, Vol. 24, No. 6.
2021 · 3 papers
- ICCV CLVL 2021 (first author). Understanding of Emotion Perception from Art. ICCV Workshop on Closing the Loop Between Vision and Language.
- arXiv · 2021. Cross-Domain Emotion Recognition Using Few-Shot Knowledge Transfer. arXiv preprint 2110.05021.
- CODS-COMAD 2021. Robust Resource Demand Estimation Using Hierarchical Bayesian Model in a Distributed Service System. Proceedings of the 8th ACM IKDD CODS & 26th COMAD.
2016 · 1 paper
- ICVGIP 2016 (first author). Hierarchical Spectral Clustering-Based Large-Margin Classification of Visually Correlated Categories. Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing, IIT Bombay.
Patents
- US Patent · 2020 (granted). Visually Guided Query Processing. US Patent 10,878,291 · Assigned to Google Patents.
This list is maintained by hand. The canonical bibliography, in BibTeX form, lives at /bibliography. Corrections and omissions: dbose [at] usc [dot] edu. Last rebuilt on the seventeenth of April, MMXXVI.
— D. B.