Watch out for reused names in CV papers (The DINO Collision)
Story
I was reviewing literature today and hit a classic computer vision naming collision.
If a collaborator sends you a paper mentioning “DINO”, ask right away which DINO they mean, because the two solve entirely different problems:
- DINO (Object Detection): DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection. This is IDEA Research’s transformer-based object detector.
- DINO (Self-Supervised Learning): Emerging Properties in Self-Supervised Vision Transformers. This is Meta AI’s self-supervised training method for Vision Transformers (ViTs); the name stands for self-distillation with no labels.
One is an object detector. The other is a method for training foundational vision models without labels.
The Takeaway
Never assume a model or method name is unique in AI; only one of the two DINOs is even an architecture. Always check the arXiv link and the reference list before diving into a codebase or designing an integration.
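
If you want to make that check a habit, here is a minimal sketch of a helper that guesses which DINO a reference string points to. The function name `which_dino`, the `DINO_PAPERS` table, and the keyword lists are all my own assumptions for illustration, built only from the two paper titles above. It is a rough filter, not a replacement for opening the arXiv link.

```python
import re

# Hypothetical lookup table: keywords are lifted from the two paper titles above.
DINO_PAPERS = {
    "detection": (
        "detr",
        "denoising anchor boxes",
        "object detection",
    ),
    "self-supervised": (
        "emerging properties",
        "self-supervised",
        "vision transformers",
    ),
}


def which_dino(reference: str) -> str:
    """Return 'detection', 'self-supervised', or 'ambiguous' for a reference string."""
    # Lowercase and collapse whitespace so line-wrapped citations still match.
    text = re.sub(r"\s+", " ", reference.lower())
    hits = set()
    for name, keywords in DINO_PAPERS.items():
        # Word boundaries keep "detr" from matching inside unrelated words.
        if any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords):
            hits.add(name)
    # Zero matches or matches for both papers mean we cannot tell: ask the sender.
    return hits.pop() if len(hits) == 1 else "ambiguous"


if __name__ == "__main__":
    print(which_dino("DINO: DETR with Improved DeNoising Anchor Boxes"))  # detection
    print(which_dino("Emerging Properties in Self-Supervised Vision Transformers"))  # self-supervised
    print(which_dino("We fine-tune DINO on our dataset."))  # ambiguous
```

Anything that comes back as "ambiguous" is exactly the case where you should ask for the arXiv link before going further.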