Pritam Sarkar (প্রীতম সরকার)
Postdoc at UBC and Vector Institute
Email: pritam.sarkar@queensu.ca

Google Scholar GitHub LinkedIn

I am interested in advancing safe multimodal intelligence and in designing algorithms that require minimal human supervision. Currently, I am a postdoc at the University of British Columbia and Vector Institute, working with Leonid Sigal. I completed my PhD in 2025 at Queen’s University, where I worked with Ali Etemad. During my PhD, I interned at Google and RBC Borealis, and was affiliated with the Vector Institute, Ingenuity Labs, and Aiim Lab. I received the First Prize in the Research Excellence Award (PhD), IEEE Kingston Section, in 2023. Outside of research, I am passionate about photography and film-making, and I enjoy coffee, long walks, and a bit of badminton.

Open to collaboration
If you are interested in collaborating on research ideas of mutual interest, please feel free to email me.

Pro-bono activity

    I am dedicating a weekly time slot (45 minutes) to speak with graduate students who might lack access to strong mentorship or peer support. Whether you want to brainstorm, refine your research direction, or get an external perspective on your work, I am happy to help.

    The Motivation: I have been incredibly fortunate to have supportive mentors and peers throughout my career, but I also know how difficult it is to work without them. This is my way of paying forward the support I have received, and I hope to reach students outside my immediate network who might benefit from it.

    To express your interest, please send a brief message through this form, and I will get back to you. Before reaching out, please review my research interests to see if they (at least somewhat) align with yours.

    To make the best use of our time, let us plan the discussion in the following format:

    • 5 minutes: Quick introductions.
    • 10 minutes: You describe your research problem, current progress, and specific challenges. Ideally, bring a few slides with specific points.
    • 25 minutes: Brainstorming ideas and some actionable feedback.
    • 5 minutes: Wrap up and any final questions/suggestions for me.

    Please be assured that I will periodically review the submitted responses and reach out to selected candidates. However, due to limited time, I may not be able to respond to everyone. Based on my availability, I may schedule sessions during the weekends. Thank you for your understanding.

Research


My current research focus is multimodal AI with video, image, audio, and language.

Broadly interested in: generative models (LLMs, multimodal LLMs, diffusion models), foundation models (video, image, vision-language, audio-visual), self-supervised learning, alignment, reasoning, AI agents, computer vision. Please find more about my research here.

News


  • [Oct 25] I am serving as an AC for WACV 2026.
  • [Sep 25] Our proposed Self-alignment with RRPO got accepted in NeurIPS 2025.
  • [Sep 25] I successfully defended my PhD thesis! Here are the slides.
  • [May 25] Introduced VCRBench, the first video-based multi-step causal reasoning benchmark.
  • [Apr 25] Introduced RRPO, a fine-grained self-alignment recipe to align Multimodal LLMs.
  • [Jan 25] DPA got accepted in ICLR 2025.
  • [Dec 23] XKD and RDDM got accepted in AAAI 2024.
  • [Nov 23] I won the First Prize in the IEEE Research Excellence Award (PhD).
  • [Sep 23] Our paper on Video SSL in OOD got accepted in NeurIPS 2023 as a Spotlight.
  • [Aug 23] Accepted an offer from Google to join as a Student Researcher.
  • [Nov 22] AVCAffe and CrissCross (Oral) got accepted in AAAI 2023.
  • [Oct 22] We are organizing AAAI 2023 Workshop on R2HCAI.
  • [Oct 22] Honourable Mention in poster competitions (1.) Robotics and AI Symposium 2022 and (2.) FEAS Research Symposium 2022 at Queen’s University, Canada.
  • [Jun 22] Accepted an offer from Borealis AI for a fall internship as a Machine Learning Research Intern.
  • [Oct 21] Best poster award at Robotics and AI Symposium, Ingenuity Labs, 2021.
  • [Aug 21] We are organizing AAAI 2022 Workshop on HC-SSL.
  • [Mar 21] I received a postgraduate affiliation award from the Vector Institute. News
  • [Dec 20] Our paper CardioGAN got accepted in AAAI 2021.
  • [Aug 20] My first journal paper as first author got accepted in IEEE Transactions on Affective Computing.
  • [Apr 20] Successfully defended my M.A.Sc. thesis.
  • [Jan 20] Conference paper on ECG-based SSL got accepted in IEEE ICASSP 2020 for oral presentation.
  • [Jun 19] My first paper got accepted for oral presentation in IEEE ACII 2019.
  • [Sep 18] Joined Queen's for a master's degree.
  • [Dec 17] Joined Infosys as a Sr. System Engineer.
  • [Nov 15] Joined Tech Mahindra as a Software Engineer.
  • [Jun 15] Completed B.Tech!

Education


  • PhD at Queen’s University, Canada, 2020 - 2025. PhD Thesis. Convocation Video.
  • MASc at Queen’s University, Canada, 2018 - 2020. MASc Thesis.
  • B.Tech at West Bengal University of Technology, India, 2011 - 2015.

Employment


  • Postdoctoral Research Fellow at University of British Columbia, Vancouver, Canada, 2025 - Present.
  • Distinguished Postdoctoral Fellow at Vector Institute, Toronto, Canada, 2025 - Present.
  • Research Assistant at Queen’s University, Kingston, Canada, 2018 - 2025.
  • Teaching Assistant/Guest Lecturer at Queen’s University, Kingston, Canada, 2018 - 2025.
  • Student Researcher at Google, Sunnyvale, USA, Fall 2023.
  • Machine Learning Research Intern at RBC Borealis, Toronto, Canada, Fall 2022.
  • Sr. System Engineer at Infosys Ltd., Bangalore, India, 2017 - 2018.
  • Software Engineer at Tech Mahindra Ltd., Hyderabad, India, 2015 - 2017.

Academic Service


I serve as an AC for the following:

  • WACV 2026

I frequently review for the following venues:

  • NeurIPS (2023, 2025), ICLR (2024, 2025), ICML (2024)
  • CVPR (2023, 2024, 2026), ICCV (2023), ECCV (2022, 2024)
  • PAMI (2023 - Present), T-Affc (2022 - Present)

Invited Talks


My recent talks overlap with one another because they are derived from my PhD thesis, so I am sharing the slides from my PhD defence, which give a high-level overview of most of that content. Link

  • [Nov 2025] MLRG at University of Guelph, Advancing Safe Multimodal Intelligence
  • [Oct 2025] Vector Institute, Advancing Safe Multimodal Intelligence
  • [Jul 2025] Amazon AGI, Multimodal Learning from Videos: Pre-training, Post-training, and Benchmarks
  • [Jul 2025] MLR at Apple, Multimodal Learning from Videos: Pre-training, Post-training, and Benchmarks
  • [May 2025] Google DeepMind, Multimodal Learning from Videos: Pre-training, Post-training, and Benchmarks
  • [Dec 2024] FAIR at Meta, Multimodal Visual Understanding
  • [Jun 2024] CAIR at Google, Mitigating Vision-Language Hallucinations via Phrase-level Alignment
  • [Jul 2023] Ingenuity Labs at Queen’s University, Learning without Human Supervision
  • [Feb 2023] AAAI, Self-supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
  • [Jan 2023] Borealis AI, AugESeq: Augmentation improves Event Sequence prediction
  • [Sep 2019] ACII, Classification of Cognitive Load and Expertise for Adaptive Simulation using Deep Multitask Learning

Some recorded talks/videos: