Vision Transformers (ViTs) apply the Transformer architecture, originally designed for natural language processing, to image recognition. Instead of processing a sequence of words, a ViT splits an image into fixed-size patches, treats each patch as a token, and feeds the token sequence into a standard Transformer encoder. ViTs have shown performance competitive with convolutional neural networks (CNNs) on image classification benchmarks and are now widely used for classification, object detection, and image segmentation.
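The patch-to-token step described above can be sketched in a few lines. This is a minimal illustration, not a real model: the function name `image_to_patch_tokens` is hypothetical, and the random projection matrix stands in for the learned linear embedding a trained ViT would use.

```python
import numpy as np

def image_to_patch_tokens(image, patch_size, embed_dim, rng=None):
    """Split an image into non-overlapping patches and linearly project
    each flattened patch to an embedding vector (one token per patch)."""
    rng = rng or np.random.default_rng(0)
    h, w, c = image.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image dims must be divisible by patch size"
    # Rearrange (H, W, C) -> (num_patches, P*P*C): one row per flattened patch
    patches = (image.reshape(h // p, p, w // p, p, c)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(-1, p * p * c))
    # In a real ViT this projection is learned; random weights here for illustration
    w_proj = rng.standard_normal((p * p * c, embed_dim)) * 0.02
    return patches @ w_proj  # shape: (num_patches, embed_dim)

# A 224x224 RGB image with 16x16 patches yields (224/16)^2 = 196 tokens
tokens = image_to_patch_tokens(np.zeros((224, 224, 3)), patch_size=16, embed_dim=768)
print(tokens.shape)  # (196, 768)
```

In a full ViT, a learnable class token and positional embeddings would be added to this sequence before it enters the encoder.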
This tech insight summary was produced by Sumble.