Speaker University
Inaugural Director, JD Explore Academy, China
Senior Vice President, JD.com
Speaker Biography

Dacheng Tao is the Inaugural Director of the JD Explore Academy and a Senior Vice President of JD.com. He is also an advisor and chief scientist of the digital science institute at the University of Sydney. He mainly applies statistics and mathematics to artificial intelligence and data science, and his research is detailed in one monograph and over 200 publications in prestigious journals and the proceedings of leading conferences. He received the 2015 Australian Scopus-Eureka Prize, the 2018 IEEE ICDM Research Contributions Award, and the 2021 IEEE Computer Society McCluskey Technical Achievement Award. He is a fellow of the Australian Academy of Science, the World Academy of Sciences, the Royal Society of NSW, AAAS, ACM, IAPR, and IEEE.

Program Speaker Topic and Featured Program Summary
Question
More Is Different: ViTAE elevates the art of computer vision
Answer

Big data contains a tremendous amount of dark knowledge. The community has realized that effectively exploring and using such knowledge is essential to achieving superior intelligence. How can we effectively distill the dark knowledge from ultra-large-scale data? One possible answer is: “through Transformers”. Transformers have proven their prowess at extracting and harnessing the dark knowledge from data. This is because more is truly different when it comes to Transformers. In this talk, I will showcase ViTAE, our recent work on Transformers, along many dimensions of “more”: model parameters, labeled and unlabeled data, prior knowledge, computing resources, tasks, and modalities. Specifically, ViTAE has more model parameters and supports more input modalities; it can absorb and encode more data to extract more dark knowledge; it can adopt more prior knowledge in the form of biases and constraints; and it can be easily adapted to larger-scale parallel computing resources to achieve faster training.

ViTAE has been applied to many computer vision tasks, such as image recognition, object detection, semantic segmentation, image matting, pose estimation, scene text understanding, and remote sensing, and has shown promise across them.

You can find the source code for this work here.

Speaker Category
Forum Program Speakers Category