Figure 7: Qualitative results comparing the old embeddings
with our new multi-task embeddings in Flashlight. For each
query on the left, results from the old embeddings are
shown in the top row, and results from the new embeddings
are shown in the bottom row.
us to move faster towards our most important objective – to build
and improve products for our users.
REFERENCES
[1]
Sean Bell and Kavita Bala. 2015. Learning Visual Similarity for Product Design
with Convolutional Neural Networks. ACM Trans. on Graphics (SIGGRAPH) 34, 4
(2015).
[2]
Chantat Eksombatchai, Pranav Jindal, Jerry Zitao Liu, Yuchen Liu, Rahul Sharma,
Charles Sugnet, Mark Ulrich, and Jure Leskovec. 2018. Pixie: A System
for Recommending 3+ Billion Items to 200+ Million Users in Real-Time. In
Proceedings of the International Conference on World Wide Web.
[3]
Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich.
2017. GradNorm: Gradient Normalization for Adaptive Loss Balancing in
Deep Multitask Networks. CoRR abs/1711.02257 (2017). arXiv:1711.02257
http://arxiv.org/abs/1711.02257
[4]
Sumit Chopra, Raia Hadsell, and Yann LeCun. 2005. Learning a Similarity Metric
Discriminatively, with Application to Face Verification. In 2005 IEEE Computer
Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 1.
IEEE, 539–546.
[5]
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A
Large-Scale Hierarchical Image Database. In CVPR09.
[6]
Priya Goyal, Piotr Dollár, Ross B. Girshick, Pieter Noordhuis, Lukasz Wesolowski,
Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. Accurate,
Large Minibatch SGD: Training ImageNet in 1 Hour. CoRR abs/1706.02677 (2017).
arXiv:1706.02677 http://arxiv.org/abs/1706.02677
[7]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep Residual
Learning for Image Recognition. arXiv preprint arXiv:1512.03385 (2015).
[8]
Elad Hoffer and Nir Ailon. 2014. Deep metric learning using Triplet network.
CoRR abs/1412.6622 (2014). http://arxiv.org/abs/1412.6622
[9]
Houdong Hu, Yan Wang, Linjun Yang, Pavel Komlev, Li Huang, Xi (Stephen)
Chen, Jiapei Huang, Ye Wu, Meenaz Merchant, and Arun Sacheti. 2018. Web-Scale
Responsive Visual Search at Bing. In Proceedings of the 24th ACM SIGKDD
International Conference on Knowledge Discovery & Data Mining, KDD 2018, London,
UK, August 19-23, 2018. 359–367. https://doi.org/10.1145/3219819.3219843
[10]
Jie Hu, Li Shen, and Gang Sun. 2017. Squeeze-and-excitation networks. arXiv
preprint arXiv:1709.01507 (2017).
[11]
Y. Jing, D. Liu, D. Kislyuk, A. Zhai, J. Xu, and J. Donahue. [n. d.]. Visual Search at
Pinterest. In Proceedings of the International Conference on Knowledge Discovery
and Data Mining (SIGKDD).
[12]
Alex Kendall, Yarin Gal, and Roberto Cipolla. 2017. Multi-Task Learning
Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. CoRR
abs/1705.07115 (2017).
[13]
A. Krizhevsky, I. Sutskever, and G. E. Hinton. 2012. ImageNet Classification with Deep
Convolutional Neural Networks. In Advances in Neural Information Processing
Systems (NIPS). 1097–1105.
[14]
David C. Liu, Stephanie Rogers, Raymond Shiau, Dmitry Kislyuk, Kevin C. Ma,
Zhigang Zhong, Jenny Liu, and Yushi Jing. 2017. Related Pins at Pinterest: The
Evolution of a Real-World Recommender System. CoRR abs/1702.07969 (2017).
[15]
Ishan Misra, Abhinav Shrivastava, Abhinav Gupta, and Martial Hebert. 2016.
Cross-stitch Networks for Multi-task Learning. CoRR abs/1604.03539 (2016).
[16]
Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, and
Saurabh Singh. 2017. No Fuss Distance Metric Learning using Proxies. CoRR
abs/1703.07464 (2017). http://arxiv.org/abs/1703.07464
[17]
Henning Müller, Wolfgang Müller, David McG. Squire, Stéphane Marchand-
Maillet, and Thierry Pun. 2001. Performance Evaluation in Content-based Image
Retrieval: Overview and Proposals. Pattern Recogn. Lett. 22, 5 (April 2001), 593–
601. https://doi.org/10.1016/S0167-8655(00)00118-5
[18]
Zhongzheng Ren and Yong Jae Lee. 2017. Cross-Domain Self-supervised Multi-
task Feature Learning using Synthetic Imagery. CoRR abs/1711.09082 (2017).
[19]
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton,
and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale
Recommender Systems. In Proceedings of the International Conference on Knowledge
Discovery and Data Mining (SIGKDD).
[20]
Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. FaceNet: A
Unified Embedding for Face Recognition and Clustering. In The IEEE Conference
on Computer Vision and Pattern Recognition (CVPR).
[21]
K. Simonyan and A. Zisserman. 2014. Very Deep Convolutional Networks for
Large-Scale Image Recognition. CoRR abs/1409.1556 (2014).
[22]
Kihyuk Sohn. 2016. Improved Deep Metric Learning with Multi-class N-pair Loss
Objective. In Advances in Neural Information Processing Systems 29, D. D. Lee,
M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates,
Inc., 1857–1865.
[23]
Hyun Oh Song, Yu Xiang, Stefanie Jegelka, and Silvio Savarese. 2016. Deep Metric
Learning via Lifted Structured Feature Embedding. In The IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).
[24]
Andreas Veit, Serge J. Belongie, and Theofanis Karaletsos. 2017. Conditional
Similarity Networks. 2017 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) (2017), 1781–1789.
[25]
Wenjie Luo, Bin Yang, and Raquel Urtasun. 2018. Fast and Furious: Real Time
End-to-End 3D Detection, Tracking and Motion Forecasting with a Single
Convolutional Net. In The IEEE Conference on Computer Vision and Pattern Recognition
(CVPR).
[26]
Chao-Yuan Wu, R. Manmatha, Alexander J. Smola, and Philipp Krähenbühl. 2017.
Sampling Matters in Deep Embedding Learning. CoRR abs/1706.07567 (2017).
arXiv:1706.07567 http://arxiv.org/abs/1706.07567
[27]
Yuxin Wu and Kaiming He. 2018. Group Normalization. CoRR abs/1803.08494
(2018).
[28]
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. 2017.
Aggregated residual transformations for deep neural networks. In 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 5987–5995.
[29]
K. Yamaguchi, M. H. Kiapour, and T. L. Berg. 2013. Paper Doll Parsing: Retrieving
Similar Styles to Parse Clothing Items. In 2013 IEEE International Conference on
Computer Vision (ICCV), Vol. 00. 3519–3526. https://doi.org/10.1109/ICCV.2013.437
[30]
Fan Yang, Ajinkya Kale, Yury Bubnov, Leon Stein, Qiaosong Wang, M. Hadi
Kiapour, and Robinson Piramuthu. 2017. Visual Search at eBay. In Proceedings of
the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, Halifax, NS, Canada, August 13 - 17, 2017. 2101–2110.
https://doi.org/10.1145/3097983.3098162
[31]
Amir Roshan Zamir, Alexander Sax, William B. Shen, Leonidas J. Guibas, Jitendra
Malik, and Silvio Savarese. 2018. Taskonomy: Disentangling Task Transfer
Learning. CoRR abs/1804.08328 (2018).
[32]
Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue,
Yue Li Du, and Trevor Darrell. 2017. Visual Discovery at Pinterest. arXiv preprint
arXiv:1702.04680 (2017).
[33]
Andrew Zhai and Hao-Yu Wu. 2018. Making Classification Competitive for
Deep Metric Learning. CoRR abs/1811.12649 (2018). arXiv:1811.12649
http://arxiv.org/abs/1811.12649
[34]
Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Yingya Zhang, Xiaofeng Ren,
and Rong Jin. 2018. Visual Search at Alibaba. In Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD
2018, London, UK, August 19-23, 2018. 993–1001.
https://doi.org/10.1145/3219819.3219820
[35]
Xiangyun Zhao, Haoxiang Li, Xiaohui Shen, Xiaodan Liang, and Ying Wu. 2018. A
Modulation Module for Multi-task Learning with Applications in Image Retrieval.
In ECCV.