ACM MM — LLMpedia

ACM MM
Name	ACM International Conference on Multimedia
Abbreviation	ACM MM
Discipline	Multimedia, Computer Vision, Machine Learning, Human-Computer Interaction
Publisher	Association for Computing Machinery
Frequency	Annual
First	1993

The ACM International Conference on Multimedia (ACM MM) is an annual flagship venue for research on multimedia systems, content analysis, retrieval, and applications. The conference brings together researchers affiliated with the Association for Computing Machinery, the IEEE Computer Society, the International Organization for Standardization, and the National Science Foundation, as well as industrial labs such as Google Research, Facebook AI Research, Microsoft Research, and Adobe Research, to present advances spanning algorithms, systems, datasets, and applications. The event typically features peer-reviewed technical papers, demonstrations, workshops, tutorials, and keynote talks by leaders associated with institutions such as the Massachusetts Institute of Technology, Stanford University, Carnegie Mellon University, the University of California, Berkeley, and Tsinghua University.

Overview

ACM MM serves as a multidisciplinary forum linking work from the University of Oxford, the University of Cambridge, ETH Zurich, Peking University, and Seoul National University with that of industry teams at Amazon Web Services, Apple Inc., NVIDIA, Samsung Research, and Baidu Research. The program commonly covers multimedia indexing, retrieval, summarization, captioning, cross-modal learning, and real-time codecs, and draws authors from Google DeepMind, OpenAI, Huawei Noah's Ark Lab, SenseTime, and ByteDance AI Lab.

Founded in 1993 to consolidate research that had been presented across multimedia-related venues, the conference evolved alongside work by groups at Bell Labs, Mitsubishi Electric Research Laboratories, Siemens Corporate Research, and Xerox PARC. Milestones include early multimedia standards and initiatives led by the Moving Picture Experts Group, breakthroughs in content-based image retrieval associated with Cornell University, and the rise of deep learning methods influenced by publications from the University of Toronto, University College London, and Google Brain. Over the decades, the meeting expanded to include workshops similar to those associated with NeurIPS, ICCV, CVPR, ECCV, and ACL.

The program normally includes a main technical track, poster sessions, demo tracks, tutorials, and co-located workshops. These are organized by groups associated with IEEE Transactions on Multimedia, ACM Transactions on Multimedia Computing, Communications, and Applications, and SIGMM, along with editorial boards whose members come from Princeton University, Brown University, the University of Washington, and the University of Illinois Urbana-Champaign. Typical topic areas include multimedia search and multimodal learning; speech and audio processing, attributed to teams at ETH Zurich, Johns Hopkins University, and the University of Southern California; visual understanding, linked to Caltech, Duke University, and the National University of Singapore; and human-centered multimedia, with an emphasis on work from Cornell Tech, the Georgia Institute of Technology, and the University of Michigan.

Noteworthy contributions presented at the conference include early work on content-based image retrieval by researchers affiliated with Cornell University and the University of California, Berkeley; multimedia indexing frameworks from Microsoft Research and IBM Research; cross-modal embedding advances influenced by teams at Facebook AI Research and Google Research; and large-scale dataset releases inspired by efforts at ImageNet-related labs and groups at Yahoo! Research. Influential methods for video understanding, caption generation, and action recognition were advanced by collaborations involving Carnegie Mellon University, the University of Oxford, the University of Tokyo, and KAIST. Benchmarks and challenge tracks have catalyzed progress in a manner akin to initiatives such as Kaggle competitions, the ImageNet Challenge, and PASCAL VOC.

The conference recognizes outstanding work through Best Paper Awards, Best Demo Awards, and Doctoral Consortium honors. These are administered by committees with members from SIGMM, the IEEE Signal Processing Society, and ACM SIGCHI, together with academic leaders at Columbia University, Yale University, the University of Pennsylvania, and the University of Toronto. Recipients have included researchers who later received accolades such as Turing Award-level recognition, fellowships from the Gordon and Betty Moore Foundation, and funding from organizations such as the European Research Council and the Japan Society for the Promotion of Science.

Organizational leadership typically comes from program chairs and general chairs drawn from universities and industry labs, including the University of California, Los Angeles, Nanyang Technological University, the Hong Kong University of Science and Technology, Infosys, Intel Labs, Qualcomm Research, and Lenovo Research. Sponsors often include the Association for Computing Machinery, Microsoft Research, Amazon Web Services, Google, and Huawei, with regional hosting coordinated with municipal entities and conference bureaus in cities such as Beijing, New York City, San Francisco, Barcelona, Suzhou, and Seattle.