As part of a student team, I am building a system to classify used shoes.
I know that Google Lens does a really good job at this.
I came across the Google Cloud Vision API (which I assume is a similar technology) and implemented it in Python.
For clean, well-angled images like this Air Force One:
I am getting really promising results:
10 Web entities found:
Score : 0.9957345128059387
Description: Nike Air Force 1 07 LV8 EMB Raiders Mens
Score : 0.7279999852180481
Description: Nike
Score : 0.7279999852180481
Description:
Score : 0.7130167484283447
Description: Nike Mens Air Force 1 '07 LV8 'Metallic Swoosh Pack
Score : 0.7052000164985657
Description: Sneakers
Score : 0.7049999833106995
Description: Shoe
Score : 0.6831490993499756
Description: Nike Mens Air Force 1 Low
Score : 0.6559000015258789
Description: Nike
Score : 0.6399800181388855
Description: Nike Air Max
Score : 0.6158000230789185
Description: Men's Shoe
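For reference, a minimal sketch of a web detection request that prints entities in the format shown above. It assumes the `google-cloud-vision` client library is installed and credentials are configured; the function names `format_web_entities` and `detect_web_entities` are illustrative only, not part of the library.

```python
from typing import Iterable, List


def format_web_entities(entities: Iterable) -> List[str]:
    """Render entities (objects with .score and .description) as printable lines."""
    entities = list(entities)
    lines = [f"{len(entities)} Web entities found:"]
    for entity in entities:
        lines.append(f"Score : {entity.score}")
        lines.append(f"Description: {entity.description}")
    return lines


def detect_web_entities(path: str):
    """Send a local image to Vision API web detection and return its web entities."""
    # Imported lazily so the formatting helper works without the client library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as image_file:
        image = vision.Image(content=image_file.read())

    response = client.web_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.web_detection.web_entities
```

With a local file (the name `shoe.jpg` is a placeholder), this would be called as `print("\n".join(format_web_entities(detect_web_entities("shoe.jpg"))))`.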
However, if I input real-world images like this old, used Nike Tanjun:
Things fall apart:
8 Web entities found:
Score : 0.5776046514511108
Description: Shoe
Score : 0.4444863796234131
Description: Product design
Score : 0.42980000376701355
Description: Design
Score : 0.4197726845741272
Description: Product
Score : 0.39287227392196655
Description: Activewear
Score : 0.384799987077713
Description: Walking
Score : 0.35569998621940613
Description: Walking Shoe
Score : 0.3215000033378601
Description: outdoor
But if I upload the same image to Google Lens, I can still figure out the right label:
Logo detection (Nike) almost always works. Using the detected brand, I could, for example, look for the word that most often follows the brand name in the results (here: Tanjun) to figure out the model.
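The word-counting idea could be sketched roughly as follows. This is only an illustration under assumptions: `guess_model` is a hypothetical helper, and the list of result strings would have to come from whatever source returns matching titles for the image.

```python
import re
from collections import Counter
from typing import Iterable, Optional


def guess_model(brand: str, descriptions: Iterable[str]) -> Optional[str]:
    """Return the word that most often directly follows `brand` in the given texts."""
    # Match the brand as a whole word, then capture the next word-like token.
    pattern = re.compile(rf"\b{re.escape(brand)}\b\W+(\w+)", re.IGNORECASE)
    counts = Counter()
    for text in descriptions:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up example strings, not real API output.
sample = ["Nike Tanjun Black", "nike tanjun running shoe", "Nike Air Max", "Shoe"]
print(guess_model("Nike", sample))
```

A real version would probably need a stop-word list, since filler words such as "Mens" or "Womens" often follow the brand name directly.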
I should mention that the data in our system will be better than this: there will be multiple images taken from different angles under very good lighting conditions.
Now I am trying to figure out how to
EITHER: Get the Vision API working in the same way as Google Lens
OR: Access Google Lens data in a reasonably convenient way (ideally something that runs on a Raspberry Pi)