Research Papers
CLIP (Contrastive Language-Image Pre-training)
Vision Transformer
ResNet (Residual Network)
Swin Transformer
Xception
EfficientNet
VGG
Faster R-CNN
Mask R-CNN
SSD (Single Shot MultiBox Detector)
YOLOv3
RetinaNet
DETR (Detection Transformer)
U-Net
FCN (Fully Convolutional Network)
FPN (Feature Pyramid Network)
ALIGN
BLIP (Bootstrapping Language-Image Pre-training)
MobileNetV2