CSE Colloquium: Toward Powerful AI Agents Users Understand and Trust

Abstract: I will describe how my research enables AI to learn relationships between complex, unstructured inputs, specifically: 1) similarity relationships between two images (e.g., that a matching top and skirt are similar), and 2) similarity between descriptive text and an image or video (e.g., that a caption is similar to the image it describes). I will focus on two contributions.
First, my research has greatly expanded our understanding of the relationship between vision and language, including introducing the task of phrase grounding, which relates natural language phrases to image regions, along with a new dataset called Flickr30K Entities that is commonly used to train these models. Second, I will present my work on learning explainable visual similarity methods that can capture a greater variety of similarity functions than those in prior work.

Biography: Dr. Bryan Plummer is currently a Research Assistant Professor in the Department of Computer Science at Boston University. He completed his Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign in 2018, after which he became a postdoctoral associate at Boston University before joining as a member of the faculty.
His primary research interests are in Machine Learning, Computer Vision, Vision and Language Understanding, and Robotics. He is a member of the Image and Video Computing group at Boston University, where he currently supervises 4 graduate students and 2 undergraduates. He has an excellent track record of producing innovative research and a commitment to teaching and outreach.


Media Contact: Daniel Kifer



