استفاده از الگوریتم بهینه‌سازی گرگ خاکستری در خوشه‌یابی کلان‌داده‌ها

بهروان, ایمان; ظهیری, سید حمید; رضوی, سیّد محمّد; ترازارتی, روبرتو

فهرست نشریات دارای اعتبار وزارت علوم، تحقیقات و فناوری

تعداد نشریات	45
تعداد شماره‌ها	1,417
تعداد مقالات	17,419
تعداد مشاهده مقاله	56,168,345
تعداد دریافت فایل اصل مقاله	18,489,739

	استفاده از الگوریتم بهینه‌سازی گرگ خاکستری در خوشه‌یابی کلان‌داده‌ها
مجله مهندسی برق دانشگاه تبریز
مقاله 5، دوره 50، شماره 1 - شماره پیاپی 91، خرداد 1399، صفحه 41-62 اصل مقاله (4.94 M)
نویسندگان
ایمان بهروان¹؛ سید حمید ظهیری^* ¹؛ سیّد محمّد رضوی¹؛ روبرتو ترازارتی²
¹دانشکده مهندسی برق و کامپیوتر- دانشگاه بیرجند
²آزمایشگاه استخراج اطلاعات و داده‌کاوی- موسسه علوم و فناوری اطلاعات- پیزا- ایتالیا
چکیده
امروزه حجم بسیار زیادی از اطلاعات و داده‌ها از منابع مختلف نظیر گوشی‌های هوشمند، شبکه‌های اجتماعی، تکنولوژی‌های عکاسی و سایر منابع تولید می‌شود. بررسی و پردازش این حجم عظیم از اطلاعات چالش دهه‌های اخیر است که به آن کلان‌داده گفته می‌شود. یکی از روش‌های پرکاربرد استخراج اطلاعات، خوشه‌یابی است. خوشه‌یابیِ کلان‌داده‌ها چالش بزرگی است که توجه بسیاری از محققین را به خود جلب کرده است. در این پژوهش ابتدا یک روش خوشه‌یابی غیر خودکار (برای حالتی که تعداد خوشه‌ها از قبل مشخص است) و سپس یک روش خوشه‌یابی خودکار (قادر به یافتن تعداد خوشه‌ها) با استفاده از الگوریتم بهینه‌سازی گرگ خاکستری برای خوشه‌یابی کلان‌داده‌ها ارائه شده است. روش خوشه‌یابی خودکار یک روش دو مرحله‌ایست که در مرحله‌ی اول یک ساختار درخت گونه از الگوریتم مورد نظر برای یافتن تعداد خوشه‌ها اجرا می‌شود و در مرحله‌ی دوم الگوریتم اصلی فضا را برای یافتن موقعیت مراکز خوشه‌ها جست‌وجو می‌کند. عملکرد روش ارائه شده بر روی ۱۳ مجموعه داده‌ی مصنوعی و ۲ مجموعه کلان‌داده‌ی واقعی مربوط به مسیرهای طی شده توسط خودروها در سطح شهر پیزا مورد ارزیابی قرار گرفته و نتایج آن بررسی شده است. نتایج به دست آمده نشان از دقت بالای این الگوریتم در خوشه‌یابی داده‌های بزرگ و حجیم دارد.
کلیدواژه‌ها
کلان‌داده؛ خوشه‌یابی خودکار؛ روش‌های هوش جمعی؛ الگوریتم بهینه‌سازی گرگ خاکستری

مراجع
[1] S. LaValle, E. Lesser, R. Shockley, M. S. Hopkins, and N. Kruschwitz, "Big data, analytics and the path from insights to value," MIT sloan management review, vol. 52, p. 21, 2011. [2] S. Cheng, Y. Shi, Q. Qin, and R. Bai, "Swarm intelligence in big data analytics," in International Conference on Intelligent Data Engineering and Automated Learning, 2013, pp. 417-426. [3] J. Leskovec, A. Rajaraman, and J. D. Ullman, Mining of Massive Datasets: Cambridge University Press, 2014. [4] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM computing surveys (CSUR), vol. 31, pp. 264-323, 1999. [5] J. A. Hartigan, "Clustering algorithms (probability & mathematical statistics)," ed: John Wiley & Sons Inc New York, 1975. [6] R. Xu and D. Wunsch, "Survey of clustering algorithms," IEEE Transactions on neural networks, vol. 16, pp. 645-678, 2005. [7] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering algorithm," Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28, pp. 100-108, 1979. [8[ احسان نادرنژاد، حمید حسن‌پور و حسین میارنعیمی، «استفاده از مشخصه‌های آماری داده‌ها و پردازش بلوکی برای قطعه بندی تصاویر»، فصلنامه مهندسی برق دانشگاه تبریز، دوره‌ی ۳۹ شماره‌ی ۱، صفحه‌ی ۴۸ تا ۵۷، بهار ۱۳۸۸. [9] A. S. Shirkhorshidi, S. Aghabozorgi, T. Y. Wah, and T. Herawan, "Big data clustering: a review," in International Conference on Computational Science and Its Applications, 2014, pp. 707-720. [10] R. C. Eberhart, Y. Shi, and J. Kennedy, Swarm intelligence: Elsevier, 2001. [11] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Grey wolf optimizer," Advances in engineering software, vol. 69, pp. 46-61, 2014. [12] A. Sinha and P. K. Jana, "A novel K-means based clustering algorithm for big data," in Advances in Computing, Communications and Informatics (ICACCI), 2016 International Conference on, 2016, pp. 1875-1879. [13] M. Jain and C. Verma, "Adapting k-means for Clustering in Big Data," International Journal of Computer Applications, vol. 101, pp. 19-24, 2014. [14] A. Saini, J. Minocha, J. Ubriani, and D. Sharma, "New approach for clustering of big data: DisK-means," in Computing, Communication and Automation (ICCCA), 2016 International Conference on, 2016, pp. 122-126. [15] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," presented at the Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, New Orleans, Louisiana, 2007. [16] S. H. Razavi, E. O. M. Ebadati, S. Asadi, and H. Kaur, "An efficient grouping genetic algorithm for data clustering and big data analysis," in Computational Intelligence for Big Data Analysis, ed: Springer, 2015, pp. 119-142. [17] S. Saitta, B. Raphael, and I. F. Smith, "A comprehensive validity index for clustering," Intelligent Data Analysis, vol. 12, pp. 529-548, 2008. [18] A. Abraham, S. Das, and S. Roy, "Swarm intelligence algorithms for data clustering," in Soft computing for knowledge discovery and data mining, ed: Springer, 2008, pp. 279-313. [19] S. H. Kwon, "Cluster validity index for fuzzy clustering," Electronics letters, vol. 34, pp. 2176-2177, 1998. [20] X. Cui, T. E. Potok, and P. Palathingal, "Document clustering using particle swarm optimization," in Swarm Intelligence Symposium, 2005. SIS 2005. Proceedings 2005 IEEE, 2005, pp. 185-191. [21] M. G. Omran, A. Salman, and A. P. Engelbrecht, "Dynamic clustering using particle swarm optimization with application in image segmentation," Pattern Analysis and Applications, vol. 8, p. 332, 2006. [22] U. Maulik and S. Bandyopadhyay, "Performance evaluation of some clustering algorithms and validity indices," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 1650-1654, 2002. [23] C. Zhang, D. Ouyang, and J. Ning, "An artificial bee colony approach for clustering," Expert Systems with Applications, vol. 37, pp. 4761-4767, 2010. [24] G. Krishnasamy, A. J. Kulkarni, and R. Paramesran, "A hybrid approach for data clustering based on modified cohort intelligence and K-means," Expert Systems with Applications, vol. 41, pp. 6009-6016, 2014. [25] Y. Lu, B. Cao, C. Rego, and F. Glover, "A Tabu search based clustering algorithm and its parallel implementation on Spark," Applied Soft Computing, vol. 63, pp. 97-109, 2018. [26] D. Karaboga and C. Ozturk, "A novel clustering approach: Artificial Bee Colony (ABC) algorithm," Applied soft computing, vol. 11, pp. 652-657, 2011. [27] X. Han, L. Quan, X. Xiong, M. Almeter, J. Xiang, and Y. Lan, "A novel data clustering algorithm based on modified gravitational search algorithm," Engineering Applications of Artificial Intelligence, vol. 61, pp. 1-7, 2017. [28] A. Banharnsakun, "A MapReduce-based artificial bee colony for large-scale data clustering," Pattern Recognition Letters, vol. 93, pp. 78-84, 2017. [29] C. Muro, R. Escobedo, L. Spector, and R. Coppinger, "Wolf-pack (Canis lupus) hunting strategies emerge from simple rules in computational simulations," Behavioural processes, vol. 88, pp. 192-197, 2011. [30] T. Caliński and J. Harabasz, "A dendrite method for cluster analysis," Communications in Statistics-theory and Methods, vol. 3, pp. 1-27, 1974. [31] School of Computing University of Eastern Finland. "clustering basic benchmarks,” June 10, 2018; https://cs.joensuu.fi/sipu/datasets/. [32] W. M. Rand, "Objective criteria for the evaluation of clustering methods," Journal of the American Statistical association, vol. 66, pp. 846-850, 1971. [33] D. Pelleg and A. W. Moore, "X-means: Extending k-means with efficient estimation of the number of clusters," in Icml, 2000, pp. 727-734. [34] Z. F. Knops, J. A. Maintz, M. A. Viergever, and J. P. Pluim, "Normalized mutual information based registration using k-means clustering and shading correction," Medical image analysis, vol. 10, pp. 432-439, 2006. [35] J. Demšar, "Statistical comparisons of classifiers over multiple data sets," Journal of Machine learning research, vol. 7, pp. 1-30, 2006. [36] P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems," Pattern Recognition, vol. 39, pp. 761-775, 2006. [37] I. Kärkkäinen and P. Fränti, Dynamic local search algorithm for the clustering problem: University of Joensuu, 2002. [38] P. Fränti, R. Mariescu-Istodor, and C. Zhong, "XNN graph," in Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), 2016, pp. 207-217. [39] P. Franti, O. Virmajoki, and V. Hautamaki, "Fast agglomerative clustering using a k-nearest neighbor graph," IEEE transactions on pattern analysis and machine intelligence, vol. 28, pp. 1875-1881, 2006. [40] سمیرا رفیعی، پرهام مرادی، «بهبود عملکرد الگوریتم خوشه‌بندی فازی سی-مینز با وزن‌دهی اتوماتیک و محلی ویژگی‌ها»، مجله‌ی مهندسی برق دانشگاه تبریز، دوره‌ی ۴۶، صفحه‌ی ۷۵ تا ۸۶، تابستان ۱۳۹۵. [41] I. Aljarah and S. A. Ludwig, "Parallel particle swarm optimization clustering algorithm based on mapreduce methodology," in Nature and biologically inspired computing (NaBIC), 2012 fourth world congress on, 2012, pp. 104-111. [42] B. Wu, G. Wu, and M. Yang, "A mapreduce based ant colony optimization approach to combinatorial optimization problems," in Natural Computation (ICNC), 2012 Eighth International Conference on, 2012, pp. 728-732. [43] J. Li, X. Hu, Z. Pang, and K. Qian, "A parallel ant colony optimization algorithm based on fine-grained model with GPU-acceleration," International Journal of Innovative Computing, Information and Control, vol. 5, pp. 3707-3716, 2009. [44] D.-W. Huang and J. Lin, "Scaling populations of a genetic algorithm for job shop scheduling problems using MapReduce," in Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on, 2010, pp. 780-785.
آمار تعداد مشاهده مقاله: 786 تعداد دریافت فایل اصل مقاله: 752

سامانه مدیریت نشریات علمی. قدرت گرفته از سیناوب

پیوندهای مفید

آمار

استفاده از الگوریتم بهینه‌سازی گرگ خاکستری در خوشه‌یابی کلان‌داده‌ها