Overview | ICCV 2021

Program Overview
Main Conference	2021 October 12 - 15 (Tuesday - Friday)
Workshops and Tutorials	2021 October 11, 16 and 17 (Monday, Saturday & Sunday)

Main Conference Schedule (pdf, Paper Visualizations)

KEYNOTE Tuesday, October 12 9:00 AM – 10:00 AM EDT

Judith Donath, Harvard University

Judith Donath is a writer, designer and artist whose work focuses on the co-evolution of technology and society. She has published numerous articles about social media, AI, ethics and anonymity, and is the author of The Social Machine: Designs for Living Online (MIT Press). As the former director of the MIT Media Lab's Sociable Media Group, she and her students designed innovative interfaces for on-line communities; their art projects examining the future of identity, privacy and mediated life have been exhibited in museums and galleries worldwide. Currently, she is a faculty fellow at Harvard's Berkman Klein Center and is writing a book about technology, trust and deception.

She received her doctoral and master's degrees in Media Arts and Sciences from MIT and her bachelor's degree in History from Yale University.

Fireside Chat Friday, October 15 1:00 PM – 2:00 PM EDT

Caroline Sinders, Convocation Design + Research
Moderated by Camilo J Taylor (CJ) and Negar Rostamzadeh


Caroline Sinders is a critical designer, researcher, and artist. For the past few years, she has been examining the intersections of artificial intelligence, intersectional justice, systems design, harm, and politics in digital conversational spaces and technology platforms. She has worked with the United Nations, Amnesty International, IBM Watson, the Wikimedia Foundation, and others. Sinders has held fellowships with the Harvard Kennedy School, Google’s PAIR (People and Artificial Intelligence Research group), Ars Electronica’s AI Lab, the Weizenbaum Institute, the Mozilla Foundation, Pioneer Works, Eyebeam, Ars Electronica, the Yerba Buena Center for the Arts, the Sci Art Resonances program with the European Commission, and the International Center of Photography. Her work has been featured in the Tate Exchange in Tate Modern, the Contemporary Art Center of New Orleans, Telematic Media Arts, Victoria and Albert Museum, MoMA PS1, LABoral, Wired, Slate, Hyperallergic, Clot Magazine, Quartz, the Channels Festival, and others. Sinders holds a Masters from New York University’s Interactive Telecommunications Program.

CJ Taylor is the Raymond S. Markowitz President’s Distinguished Professor in the Computer and Information Science Department at the University of Pennsylvania. He is also currently the inaugural Associate Dean for Diversity, Equity and Inclusion in the School of Engineering and Applied Science at Penn. Dr. Taylor received his A.B. degree in Electrical Computer and Systems Engineering from Harvard College in 1988 and his M.S. and Ph.D. degrees from Yale University in 1990 and 1994 respectively. Dr. Taylor was the Jamaica Scholar in 1984, a member of the Harvard chapter of Phi Beta Kappa and held a Harvard College Scholarship from 1986-1988. From 1994 to 1997 Dr. Taylor was a postdoctoral researcher and lecturer with the Department of Electrical Engineering and Computer Science at the University of California, Berkeley. He joined the faculty of the Computer and Information Science Department at the University of Pennsylvania in September 1997. He received an NSF CAREER award in 1998 and the Lindback Minority Junior Faculty Award in 2001. In 2012 he received a best paper award at the IEEE Workshop on the Applications of Computer Vision. Dr. Taylor's research interests lie primarily in the fields of Computer Vision and Robotics and include: reconstruction of 3D models from images, vision-guided robot navigation and scene understanding. Dr. Taylor has served as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He has also served on numerous conference organizing committees: he is a General Chair of the International Conference on Computer Vision (ICCV) 2021 and was a Program Chair of the 2006 and 2017 editions of the IEEE Conference on Computer Vision and Pattern Recognition and of the 2013 edition of 3DV. In 2012 he was awarded the Christian R. and Mary F. Lindback Foundation Award for Distinguished Teaching at the University of Pennsylvania.


Negar Rostamzadeh is a Research Scientist on Google's Ethical AI team, where she studies the social impact of machine learning technologies and evaluation systems. Prior to that, Negar was a research scientist at Element AI, where she worked on efficient sampling and labeling approaches in a variety of multimedia and computer vision problems. Negar obtained her PhD from the University of Trento, where she was advised by Nicu Sebe. Her main area of research during her PhD was large-scale video understanding. During her PhD, she spent two years at MILA, where she worked on attention mechanisms in videos, video captioning and generation under the supervision of Aaron Courville. Negar was a co-founder of Women in Deep Learning (WiDL) in 2016, and a co-organizer of WiML, WiCV and WiDL in 2017.

ICCV 2021 Invited Speakers and Discussion Panels

Old School/New School Panel Wednesday, October 13 4:00 PM – 5:00 PM EDT

A discussion about deep learning vs classical methods and their roles in computer vision.

Moderated by: Serge Belongie, University of Copenhagen & Pioneer Centre for AI

Featuring:

Andrew Davison, Imperial College London
Alexei Efros, University of California at Berkeley
Svetlana Lazebnik, University of Illinois at Urbana-Champaign
Jitendra Malik, University of California at Berkeley
Aude Oliva, MIT–IBM Watson AI Lab
Richard Szeliski, University of Washington


Andrew Davison is Professor of Robot Vision and Director of the Dyson Robotics Laboratory at Imperial College London. His long-term research focus is on SLAM (Simultaneous Localisation and Mapping) and its evolution towards general 'Spatial AI': computer vision algorithms which enable robots and other artificial devices to map, localise within and ultimately understand and interact with the 3D spaces around them. With his research group and collaborators he has consistently developed and demonstrated breakthrough systems, including MonoSLAM, KinectFusion, SLAM++ and CodeSLAM, and recent prizes include Best Paper at ECCV 2016 and Best Paper Honourable Mention at CVPR 2018. He has also had strong involvement in taking this technology into real applications, in particular through his work with Dyson on the design of the visual mapping system inside the Dyson 360 Eye robot vacuum cleaner and as co-founder of applied SLAM start-up SLAMcore. He was elected Fellow of the Royal Academy of Engineering in 2017.

Svetlana Lazebnik received a Ph.D. in Computer Science at University of Illinois in 2006. After serving as assistant professor at the University of North Carolina at Chapel Hill from 2007 to 2011, she returned as faculty to the University of Illinois, where she is currently Full Professor in the Department of Computer Science. Her notable awards include the NSF CAREER Award (2008), Microsoft Research Faculty Fellow (2009), Sloan Research Fellow (2013), and IEEE Fellow (2021). Her CVPR 2006 paper on Spatial Pyramid Matching received the 2016 Longuet-Higgins Prize for a paper with significant impact on computer vision. She served as Program Chair for ECCV 2012 and ICCV 2019, and is currently serving as an Editor in Chief of the International Journal of Computer Vision. Her main research themes include scene understanding, modeling of large-scale photo collections, joint representations for images and text, and deep learning techniques for visual recognition problems.


Jitendra Malik was born in Mathura, India in 1960. He received the B.Tech degree in Electrical Engineering from Indian Institute of Technology, Kanpur in 1980 and the PhD degree in Computer Science from Stanford University in 1985. In January 1986, he joined the University of California at Berkeley, where he is currently the Arthur J. Chick Professor in the Department of Electrical Engineering and Computer Sciences. He is also on the faculty of the Department of Bioengineering, and the Cognitive Science and Vision Science groups. During 2002-2004 he served as the Chair of the Computer Science Division, and as the Department Chair of EECS during 2004-2006 as well as 2016-2017. In 2018 and 2019, he served as Research Director and Site Lead of Facebook AI Research in Menlo Park. Prof. Malik's research group has worked on many different topics in computer vision, computational modeling of human vision, computer graphics and the analysis of biological images. Several well-known concepts and algorithms arose in this research, such as anisotropic diffusion, normalized cuts, high dynamic range imaging, shape contexts and R-CNN. He has mentored more than 60 PhD students and postdoctoral fellows. He received the gold medal for the best graduating student in Electrical Engineering from IIT Kanpur in 1980 and a Presidential Young Investigator Award in 1989. At UC Berkeley, he was selected for the Diane S. McEntyre Award for Excellence in Teaching in 2000 and a Miller Research Professorship in 2001. He received the Distinguished Alumnus Award from IIT Kanpur in 2008. His publications have received numerous best paper awards, including five test-of-time awards: the Longuet-Higgins Prize for papers published at CVPR (twice) and the Helmholtz Prize for papers published at ICCV (three times). He received the 2013 IEEE PAMI-TC Distinguished Researcher in Computer Vision Award, the 2014 K.S. Fu Prize from the International Association of Pattern Recognition, the 2016 ACM-AAAI Allen Newell Award, the 2018 IJCAI Award for Research Excellence in AI, and the 2019 IEEE Computer Society Computer Pioneer Award. He is a fellow of the IEEE and the ACM. He is a member of the National Academy of Engineering and the National Academy of Sciences, and a fellow of the American Academy of Arts and Sciences.

Aude Oliva, Ph.D. is the MIT director of the MIT–IBM Watson AI Lab and director of MIT Quest Corporate, MIT Schwarzman College of Computing, leading collaborations with industry to translate natural and artificial intelligence research into tools for the wider world. She is also a senior research scientist at the Computer Science and Artificial Intelligence Laboratory where she heads the Computational Perception and Cognition group. Her research is cross-disciplinary, spanning human perception and cognition, computer vision and cognitive neuroscience, and focuses on research questions at the intersection of all three domains.


Richard Szeliski is an Affiliate Professor at the University of Washington, a Member of the National Academy of Engineering, and a Fellow of the ACM and IEEE. Dr. Szeliski has done pioneering research in the fields of Bayesian methods for computer vision, image-based modeling, image-based rendering, and computational photography, which lie at the intersection of computer vision and computer graphics. His work on Photo Tourism, Photosynth, and Hyperlapse offers exciting examples of the promise of large-scale image and video-based rendering. Dr. Szeliski received his Ph.D. degree in Computer Science from Carnegie Mellon University in 1988. He joined Facebook as the founding Director of the Computational Photography group in 2015 and retired in 2020. Prior to Facebook, he worked at Microsoft Research for twenty years as well as several other industrial research labs. He has published over 180 research papers in computer vision, computer graphics, neural networks, and numerical analysis, as well as the books Computer Vision: Algorithms and Applications and Bayesian Modeling of Uncertainty in Low-Level Vision. He was a Program Chair for CVPR'2013 and ICCV'2003, served as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and on the Editorial Board of the International Journal of Computer Vision, and was a Founding Editor of Foundations and Trends in Computer Graphics and Vision.

Alyosha Efros is a professor at UC Berkeley. He obtained his PhD in 2003 from UC Berkeley, and spent time in Oxford, CMU, and INRIA/Paris before moving back to Berkeley in 2013. Alyosha is a big fan of data, pixels, nearest neighbors, and simple things that work, while being suspicious of complex (especially probabilistic) models, semantic labels, and language. Alyosha's proudest achievements are his wonderful former students and postdocs. He loves Paris and gelato.

Deepfakes and Data Security Thursday, October 14 11:00 AM – 12:00 PM EDT

Moderated by Tal Hassner, Facebook AI

Tal Hassner received his M.Sc. and Ph.D. degrees in applied mathematics and computer science from the Weizmann Institute of Science, Israel, in 2002 and 2006, respectively. In 2008 he joined the Department of Math. and Computer Science at The Open Univ. of Israel where he was an Associate Professor until 2018. From 2015 to 2018, he was a senior computer scientist at the Information Sciences Institute (ISI) and a Visiting Associate Professor at the Institute for Robotics and Intelligent Systems, Viterbi School of Engineering, both at USC, CA, USA, working on the IARPA Janus face recognition project. From 2018 to 2019, Tal was a Principal Applied Scientist at AWS, where he led the design and development of the latest AWS face recognition pipelines. Since June 2019, he has been an Applied Research Lead at Facebook AI, supporting both the text (OCR) and people (faces) photo understanding teams. Tal is an associate editor for both IEEE TPAMI and IEEE TBIOM. Some of his recent organizational roles include program chair for WACV'17, workshop chair for CVPR'20, and program chair for ICCV'21.

Featuring:

Hany Farid, UC Berkeley
Cristian Canton Ferrer, Facebook AI Red Team
Sam Gregory, WITNESS NGO
Mei Ngan, NIST
Luisa Verdoliva, University Federico II of Naples


Hany Farid is a Professor at the University of California, Berkeley with a joint appointment in Electrical Engineering & Computer Sciences and the School of Information. His research focuses on digital forensics, forensic science, misinformation, image analysis, and human perception. He received his undergraduate degree in Computer Science and Applied Mathematics from the University of Rochester in 1989 and his Ph.D. in Computer Science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in Brain and Cognitive Sciences at MIT, he joined the faculty at Dartmouth College in 1999 where he remained until 2019. He is the recipient of an Alfred P. Sloan Fellowship, a John Simon Guggenheim Fellowship, and is a Fellow of the National Academy of Inventors.

Cristian Canton is the Research Manager at Facebook AI, focused on understanding weaknesses and vulnerabilities derived from the use (or misuse) of AI. He was the project lead for the Deepfake Detection Challenge (DFDC) and has co-chaired several workshops on media forensics. In the past, he managed the computer vision team within the objectionable and harmful content domain. From 2012 to 2016, he was at Microsoft Research in Redmond (USA) and Cambridge (UK); from 2009 to 2012, he was at Vicon (Oxford), applying computer vision to produce visual effects for the cinema industry. He got his PhD and MS from Technical University of Catalonia (Barcelona) and his MS Thesis from EPFL (Switzerland) on computer vision topics.


Sam Gregory is an award-winning technologist, media-maker, and advocate, and Program Director of WITNESS (www.witness.org) which helps people use video and technology to defend human rights. An expert on new forms of misinformation and disinformation as well as innovations in preserving trust, authenticity and evidence, he leads WITNESS’ work on global Technology Threats and Opportunities related to the nexus of broader use of AI and increasing audiovisual and immersive communication and trends in both smartphone witnessing and rising authoritarianism. In coordination with technical researchers, policy-makers, companies, media organizations, journalists and civic activists, WITNESS's 'Prepare, Don't Panic' initiative builds better globally-inclusive preparedness for malicious usages of synthetic media (wit.to/Synthetic-Media-Deepfakes) including current work on deepfake detection equity, deepfake ethics and satire, and emerging standards in authenticity infrastructure. Quoted in major media worldwide, he publishes frequently on technology and human rights and has spoken at Davos and the White House. Sam is a former Young Global Leader of the World Economic Forum, is on the Technology Advisory Board of the International Criminal Court, co-chaired the Partnership on AI’s Expert Group on AI and the Media and previously taught at the Harvard Kennedy School from 2010 to 2018.

Mei Ngan is a scientist at the National Institute of Standards and Technology (NIST). Her research focus includes evaluation of face recognition and tattoo recognition technologies. She is currently involved in a number of key face recognition testing activities at NIST, including leading the Face Recognition Vendor Test (FRVT) MORPH project to evaluate face morphing detection algorithms. Mei has authored and co-authored a number of technical publications, including work on the accuracy of face recognition with face masks and the performance of facial age and gender estimation algorithms, and published a seminal open tattoo database for developing tattoo recognition research, for which she received the Special Contribution Award at the 2015 IEEE International Conference on Identity, Security and Behavior Analysis (ISBA). Mei was awarded the Department of Commerce Gold Medal Award in 2020 and was a recipient of the 2020 Women in Biometrics Award, a globally-recognized award honoring innovative women in the biometrics field.


Luisa Verdoliva is Associate Professor at University Federico II of Naples, Italy, where she leads the Multimedia Forensics Lab. In 2018 she was a visiting professor at Friedrich-Alexander-University (FAU) and in 2019-2020 she was a visiting scientist at Google AI in San Francisco. Her scientific interests are in the field of image and video processing, with main contributions in the area of multimedia forensics. She has published over 120 academic publications, including 45 journal papers. She is the PI for University Federico II of Naples in the DISCOVER (a Data-driven Integrated Approach for Semantic Inconsistencies Verification) project funded by DARPA under the SEMAFOR program (2020-2024). She has contributed to the academic community through service as technical Chair of the 2019 IEEE Workshop in Information Forensics and Security and the 2021 ACM Workshop on Information Hiding and Multimedia Security, and as area Chair of IEEE ICIP since 2017. She was also co-Chair of the IEEE CVPR Media Forensics Workshop in both 2020 and 2021. She is on the Editorial Board of IEEE Transactions on Information Forensics and Security and IEEE Signal Processing Letters and has been Guest Editor for IEEE Journal of Selected Topics in Signal Processing. Dr. Verdoliva is Chair of the IEEE Information Forensics and Security Technical Committee and vice-Chair of the EURASIP Signal and Data Analytics for Machine Learning Special Area Teams. She is the recipient of the 2018 Google Faculty Award for Machine Perception and a TUM-IAS Hans Fischer Senior Fellowship (2020-2023). She has been an IEEE Fellow since January 2021.

Industry and Computer Vision Panel Friday, October 15 2:00 PM – 3:00 PM EDT

Moderated by John Tsotsos, York University

John Konstantine Tsotsos received his Hons. BASc in Engineering Science, MSc in Computer Science and PhD in Computer Science, all from the University of Toronto, in 1974, 1976 and 1980 respectively. He then joined the faculty of the University of Toronto in both the Departments of Computer Science and of Medicine. There, he founded and led the computer vision group for almost two decades. He moved to York University in 2000 where he is Distinguished Research Professor of Vision Science, and maintains Adjunct Professorships at the University of Toronto in the Departments of Computer Science and in Ophthalmology and Vision Sciences. He directed York’s Centre for Vision Research from 2000-2006 and is the founding Director of York’s Centre for Innovation in Computing at Lassonde.

He has been in the ICCV family since the beginning, being part of the first group of Marr Prize winners in 1987, on several program committees and General Chair of ICCV 1999. He is best known for contributions to visual attention and active perception. He is currently focused on the role of visual attention and cognitive control in how active observers (human or artificial) solve complex 3D visuospatial tasks in real-world settings.

Awards include: Fellow, Artificial Intelligence and Robotics Program of the Canadian Institute for Advanced Research (CIFAR); Fellow of the Royal Society of Canada; Fellow IEEE; Canadian Image Processing and Pattern Recognition Society Award for Research Excellence and Service; Geoffrey J. Burton Memorial Lectureship from the United Kingdom’s Applied Vision Association for significant contribution to vision science; the Royal Society of Canada’s 2015 Sir John William Dawson Medal for sustained excellence in multidisciplinary research, the first computer scientist so honored; and the 2020 Lifetime Achievement Award from Canada's national computing society, CS-Can|Info-Can.

Featuring:

Sanja Fidler, Nvidia and University of Toronto
Rita Cucchiara, Universita di Modena
David Forsyth, University of Illinois
Vladlen Koltun, Apple
Georgia Gkioxari, Facebook AI Research Lab


Sanja Fidler is an Associate Professor at the Department of Computer Science, University of Toronto. She joined UofT in 2014. In 2018, she took on the role of Director of AI at NVIDIA, leading a research lab in Toronto. Previously she was a Research Assistant Professor at TTI-Chicago, a philanthropically endowed academic institute located on the campus of the University of Chicago. She completed her PhD in computer science at University of Ljubljana in 2010, and was a postdoctoral fellow at University of Toronto during 2011-2012. She received the NVIDIA Pioneer of AI award, Amazon Academic Research Award, Facebook Faculty Award, and the Connaught New Researcher Award. In 2018 she was appointed as the Canadian CIFAR AI Chair. Her work on semi-automatic object instance annotation won the Best Paper Honorable Mention at CVPR’17. Her main research interests are scene parsing from images and videos, interactive annotation, 3D scene understanding, 3D content creation, and multimodal representations.

Rita Cucchiara is Professor of Computer Architecture and Computer Vision at the Engineering Department “Enzo Ferrari” of the University of Modena and Reggio Emilia, Italy. She heads the AImagelab research lab at UNIMORE where she is Director of the interdepartmental AIRI Artificial Intelligence Research. She is also the Director of the Modena-Firenze Unit of the European Labs of Learning and Intelligent Systems (ELLIS). She is a Fellow of ELLIS and IAPR. She is on the Board of Directors of the Italian Institute of Technology (IIT) and coordinator of the Italian Ministerial Group of the National Research Plan in AI.


David Forsyth is currently a full professor at U. Illinois at Urbana-Champaign, where he has occupied the Fulton-Watson-Copp chair in Computer Science since 2014. Prior to UIUC, he was a full professor at UC Berkeley. David has published over 130 papers on computer vision, computer graphics and machine learning. David has served as program co-chair for IEEE Computer Vision and Pattern Recognition in 2000, 2011, and 2018, general co-chair for CVPR 2006 and 2015, program co-chair for the European Conference on Computer Vision 2008, and is a regular member of the program committee of all major international conferences on computer vision. David has served six years on the SIGGRAPH program committee, and is a regular reviewer for that conference. David has received best paper awards at the International Conference on Computer Vision and at the European Conference on Computer Vision. David received an IEEE technical achievement award for 2005 for his research. He became an IEEE Fellow in 2009, and an ACM Fellow in 2014. His recent textbook, “Computer Vision: A Modern Approach” (joint with J. Ponce and published by Prentice Hall) is now widely adopted as a course text (at such institutions as MIT, U. Wisconsin-Madison, UIUC, Georgia Tech and U.C. Berkeley). David is currently serving a second term as Editor in Chief, IEEE TPAMI.

Vladlen Koltun is currently a Distinguished Scientist at Apple, where he is building a new international research organization. He was the Chief Scientist for Intelligent Systems at Intel from 2014 to 2021, where he led an international lab of researchers working in machine learning, robotics, computer vision, computational science, and related areas. He was a Senior Research Scientist at Adobe Research from 2013 to 2014. From 2005 to 2013 he was an Assistant Professor in Computer Science at Stanford University. From 2002 to 2005 he was a postdoctoral researcher at the University of California, Berkeley. He was the recipient of the Sloan Research Fellowship in 2007 and the NSF CAREER Award in 2006. He received the Presidential Grant for Junior Faculty in 2006 from Stanford University. He received his PhD from Tel Aviv University in Computer Science in 2002.


Georgia Gkioxari is a research scientist at Facebook AI Research. She received her PhD from UC Berkeley, where she was advised by Jitendra Malik. She did her bachelor's in ECE at NTUA in Athens, Greece, where she worked with Petros Maragos. She is the recipient of the 2021 PAMI Young Researcher Award. She was named one of the "30 Influential Women Advancing AI in 2019" by Re-Work.