Vision-language models, which map images and text to a common representational space, have demonstrated remarkable performance on a wide range of multimodal AI tasks. But they’re...
Zongyi (Joe) Liu is a principal scientist in the Amazon Customer Experience and Business Trends (CXBT) organization, which evaluates the customer experience across Amazon’s products and...
Most state-of-the-art computer vision models depend on supervised learning, in which labeled data is used for training. But labeling is costly, and the cost is compounded...
Logo recognition is the task of identifying a specific logo and its location in images or videos. It helps create a safe and trustworthy shopping experience,...