Vision Transformers

Last updated , generated by Sumble

What are Vision Transformers?

Vision Transformers (ViTs) apply the Transformer architecture, originally designed for natural language processing, to image recognition tasks. Instead of processing a sequence of word tokens, a ViT splits an image into fixed-size patches, embeds each patch as a token, and feeds the resulting sequence into a standard Transformer encoder. ViTs have shown performance competitive with convolutional neural networks (CNNs) on image classification benchmarks and are commonly used in tasks such as image recognition, object detection, and image segmentation.
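As a concrete illustration of the patch-to-token step described above, here is a minimal sketch in PyTorch (the framework and all hyperparameters are assumptions for illustration, not taken from this page) that turns a batch of 224×224 images into a token sequence ready for a standard Transformer encoder:

```python
# Minimal ViT patch-embedding sketch (illustrative hyperparameters).
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Splits an image into patches and projects each patch to an embedding."""
    def __init__(self, img_size=224, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is a common way to combine patch extraction
        # with the linear projection of each patch.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):                       # x: (batch, 3, 224, 224)
        x = self.proj(x)                        # (batch, embed_dim, 14, 14)
        x = x.flatten(2).transpose(1, 2)        # (batch, 196, embed_dim) patch tokens
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)          # prepend a [CLS] token
        return x + self.pos_embed               # add learned positional embeddings

tokens = PatchEmbedding()(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 197, 768]) -> input to the Transformer encoder
```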

What other technologies are related to Vision Transformers?

Vision Transformers Competitor Technologies

A competing Vision Transformer architecture.
mentioned alongside Vision Transformers in 88% (64) of relevant job posts
DINO is a self-supervised learning method for Vision Transformers, making it a competitor only in the broader sense of alternative approaches to training them.
mentioned alongside Vision Transformers in 43% (68) of relevant job posts
U-Net is a convolutional neural network architecture, often used for image segmentation, representing a competing approach to Vision Transformers in computer vision tasks.
mentioned alongside Vision Transformers in 26% (75) of relevant job posts
ResNet is a CNN architecture and a competing approach to Vision Transformers in computer vision tasks; the sketch after this list shows the two used interchangeably as image classifiers.
mentioned alongside Vision Transformers in 9% (88) of relevant job posts
CNNs are a general class of neural networks frequently used in computer vision, representing a competing approach to Vision Transformers.
mentioned alongside Vision Transformers in 3% (85) of relevant job posts
YOLO is a real-time object detection system, usually based on CNNs, and a competing approach to Vision Transformers for object detection tasks.
mentioned alongside Vision Transformers in 3% (75) of relevant job posts
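
To make the "competing approach" framing concrete, the sketch below (the torchvision model names and input size are assumptions for illustration) loads a CNN and a ViT classifier that accept the same input and produce the same kind of output:

```python
# Illustrative only: torchvision ships both CNN and ViT classifiers with the
# same interface, which is why the two families are often weighed against
# each other for the same task.
import torch
from torchvision.models import resnet50, vit_b_16

images = torch.randn(1, 3, 224, 224)        # one dummy 224x224 RGB image
cnn = resnet50(weights=None).eval()         # convolutional baseline
vit = vit_b_16(weights=None).eval()         # Vision Transformer counterpart

with torch.no_grad():
    print(cnn(images).shape)                # torch.Size([1, 1000]) class logits
    print(vit(images).shape)                # torch.Size([1, 1000]) class logits
```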

Vision Transformers Complementary Technologies

ONNX facilitates interoperability between different deep learning frameworks and can be used to deploy Vision Transformers; see the export sketch after this list.
mentioned alongside Vision Transformers in 98% (51) of relevant job posts
FSDP (Fully Sharded Data Parallel) is a technique for training large models, including Vision Transformers, and is therefore complementary.
mentioned alongside Vision Transformers in 13% (75) of relevant job posts
XLA (Accelerated Linear Algebra) is a compiler that can optimize and accelerate the execution of deep learning models, including Vision Transformers.
mentioned alongside Vision Transformers in 8% (89) of relevant job posts
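
As a rough sketch of how ONNX fits in, the snippet below exports a Vision Transformer to an ONNX file with torch.onnx.export; the specific model, file name, and opset version are assumptions for illustration, not details from this page:

```python
# Export a torchvision ViT to ONNX so it can be served by any
# ONNX-compatible runtime (illustrative model and settings).
import torch
from torchvision.models import vit_b_16

model = vit_b_16(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)         # example input used to trace the model

torch.onnx.export(
    model, dummy, "vit_b_16.onnx",
    input_names=["pixel_values"],
    output_names=["logits"],
    opset_version=17,
    dynamic_axes={"pixel_values": {0: "batch"}, "logits": {0: "batch"}},
)
```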

Which job functions mention Vision Transformers?

Job function | Jobs mentioning Vision Transformers | Orgs mentioning Vision Transformers

Which organizations are mentioning Vision Transformers?

Organization | Industry | Matching Teams | Matching People

This tech insight summary was produced by Sumble. We provide rich account intelligence data.

On our web app, we make a lot of our data available for browsing at no cost.

We have two paid products, Sumble Signals and Sumble Enrich, that integrate with your internal sales systems.