Image Tag Completion by Applying Single-Pass Fuzzy C-Means Clustering to Features Learned by a Deep Convolutional Neural Network

Javanmardi, Shima; Latif, Ali Mohammad; Derhami, Vali

Journal of Electrical Engineering, University of Tabriz
Article 11, Volume 49, Issue 1 (Serial No. 87), Ordibehesht 1398 (April-May 2019), pages 111-123
Authors
Shima Javanmardi; Ali Mohammad Latif*; Vali Derhami
Computer Department, Yazd University
Abstract
Image tag completion is a process that simultaneously enriches image tags and removes noise from them. Many images on the web are labeled with tags that are ambiguous or unrelated to the image content, and these irrelevant tags reduce retrieval accuracy. For this reason, algorithms known as tag completion have been proposed in recent years to denoise and complete image tags; their goal is to obtain tags that are relevant to the image content and to remove irrelevant ones. Given the effectiveness of deep learning in many research areas, this paper uses deep convolutional neural networks to extract suitable visual and semantic features from images. In addition, because loading a large-scale image collection into memory is challenging, the single-pass fuzzy C-Means clustering algorithm is used to group visually similar images and to refine each image's tags based on similar samples. Experimental results indicate that the proposed approach is effective for image tag completion.
Keywords
Image tag completion; deep convolutional neural network; tag refinement; single-pass fuzzy C-Means clustering; image retrieval
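
The abstract names the components of the pipeline but not their details. As a rough illustration only, the sketch below assumes the common weighted-FCM formulation of single-pass fuzzy C-Means: each chunk of feature vectors is clustered together with the weighted centers that summarise all earlier chunks, so the full collection never has to sit in memory at once. The feature vectors are assumed to have been extracted beforehand by a CNN. The function names (`memberships`, `weighted_fcm`, `farthest_point_init`, `single_pass_fcm`, `refine_tags`), the farthest-point initialisation, and the membership-weighted tag voting in `refine_tags` are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def memberships(X, V, m=2.0):
    """Fuzzy memberships of points X (n, d) to centers V (c, d); returns shape (c, n)."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against a point lying exactly on a center
    inv = d2 ** (-1.0 / (m - 1.0))  # equals d^(-2/(m-1)) for Euclidean distance d
    return inv / inv.sum(axis=0, keepdims=True)


def weighted_fcm(X, w, V, m=2.0, n_iter=100, tol=1e-6):
    """Weighted fuzzy C-Means started from centers V; w (n,) holds per-point weights."""
    for _ in range(n_iter):
        G = memberships(X, V, m) ** m * w  # w_j * u_ij^m, shape (c, n)
        V_new = (G @ X) / G.sum(axis=1, keepdims=True)
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return V, memberships(X, V, m)


def farthest_point_init(X, c):
    """Assumed initialisation: start at row 0, then repeatedly take the farthest row."""
    idx = [0]
    d2 = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(c - 1):
        nxt = int(d2.argmax())
        idx.append(nxt)
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()


def single_pass_fcm(chunks, c, m=2.0):
    """Cluster an iterable of (n_k, d) chunks in one pass; returns centers and their weights."""
    V = w_V = None
    for X in chunks:
        X = np.asarray(X, dtype=float)
        if V is None:
            # First chunk: every point has unit weight.
            V = farthest_point_init(X, c)
            data, w = X, np.ones(len(X))
        else:
            # Later chunks: prepend the weighted centers that summarise earlier chunks.
            data = np.vstack([V, X])
            w = np.concatenate([w_V, np.ones(len(X))])
        V, U = weighted_fcm(data, w, V, m)
        # Each center's new weight is the weighted membership mass assigned to it.
        w_V = U @ w
    return V, w_V


def refine_tags(U, T):
    """Hypothetical tag voting: U is (c, n) memberships, T is (n, t) binary tags.

    Returns (n, t) relevance scores: each image inherits the tag frequencies
    of the clusters it belongs to, weighted by its memberships.
    """
    P = (U @ T) / U.sum(axis=1, keepdims=True)  # per-cluster tag frequencies, (c, t)
    return U.T @ P
```

In use, `chunks` could be any iterable that yields blocks of feature vectors read from disk. After the single pass, `memberships` can be evaluated chunk by chunk against the final centers, and the scores from `refine_tags` thresholded or ranked to drop weakly supported tags and add strongly supported ones.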
