Research Article

Detection of moving fish schools using reinforcement learning technique

Year 2025, Volume: 42, Issue: 1, 21-26, 08.03.2025
https://doi.org/10.12714/egejfas.42.1.03

Abstract

This study aims to contribute to the fishing sector by determining the locations of moving fish schools. Using the Q-Learning algorithm, a reinforcement learning technique, areas where fish schools are frequently observed were marked so that autonomous ships could reach them faster. The region of interest was divided into small cells, each cell was assigned reward and penalty points, and the cells where fish schools are abundant were identified. In addition, the fish density matrix of the region was extracted by the autonomous systems. The algorithm can also be updated automatically according to fish species and fishing bans: a separate Q-gain matrix was kept for each target species, and autonomous ships moved according to the corresponding gain matrix. As a result, recognizing the region allowed autonomous ships to achieve substantial savings in time and travel cost when finding or following fish schools.
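
The method summarized above can be illustrated with a short sketch. The following Python snippet is a minimal, hypothetical example of grid-based Q-learning in the spirit of the abstract, not the authors' implementation: the grid size, reward values, start cell, and hyperparameters are all assumptions chosen for clarity, and the per-species Q-gain matrices and fishing-ban updates are omitted.

    import random

    # Minimal Q-learning sketch. ASSUMPTIONS for illustration only: a 4x5 grid,
    # two fish-school cells, a fixed starting port, and arbitrary hyperparameters.
    # This is not the paper's implementation.
    ROWS, COLS = 4, 5
    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2          # assumed learning parameters

    # Reward grid: +10 where a fish school is assumed to be, -1 per move to
    # model travel cost, so shorter routes toward abundant cells score higher.
    REWARD = [[-1] * COLS for _ in range(ROWS)]
    REWARD[2][4] = 10
    REWARD[0][3] = 10

    Q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in range(len(ACTIONS))}

    def step(state, action):
        """Apply a move, clamping at the grid border; return (next_state, reward)."""
        dr, dc = ACTIONS[action]
        r = min(max(state[0] + dr, 0), ROWS - 1)
        c = min(max(state[1] + dc, 0), COLS - 1)
        return (r, c), REWARD[r][c]

    for episode in range(500):
        state = (0, 0)                        # ship departs from an assumed port cell
        for _ in range(50):
            if random.random() < EPSILON:     # explore a random heading
                action = random.randrange(len(ACTIONS))
            else:                             # exploit the best known heading
                action = max(range(len(ACTIONS)), key=lambda a: Q[(state, a)])
            nxt, reward = step(state, action)
            best_next = max(Q[(nxt, a)] for a in range(len(ACTIONS)))
            # Q-learning update (Watkins & Dayan, 1992)
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                           - Q[(state, action)])
            state = nxt
            if REWARD[state[0]][state[1]] > 0:    # school reached: end episode
                break

    # Per-cell maximum Q-value: a gain matrix in the spirit of the paper's
    # density matrix, with higher values along routes toward abundant cells.
    for r in range(ROWS):
        print([round(max(Q[((r, c), a)] for a in range(len(ACTIONS))), 1)
               for c in range(COLS)])

Under these assumptions, the printed matrix ranks cells by expected gain, showing how reward and penalty points over a small grid can flag the fish-abundant regions described above.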

Ethics Statement

For this type of study, formal consent is not required.

References

  • Angiuli, A., Fouque, J.P., & Laurière, M. (2022). Unified reinforcement Q-learning for mean field game and control problems. Mathematics of Control, Signals, and Systems, 34(2), 217-271. https://doi.org/10.1007/s00498-021-00310-1
  • Aydındağ Bayrak, E., Kırcı, P., Ensari, T., Seven, E., & Dağtekin, M. (2022). Diagnosing breast cancer using machine learning methods (in Turkish with English abstract). Journal of Intelligent Systems: Theory and Applications, 5(1), 35-41. https://doi.org/10.38016/jista.966517
  • Barto, A.G., Bradtke, S.J., & Singh, S.P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1-2), 81-138. https://doi.org/10.1016/0004-3702(94)00011-O
  • Chapman, D., & Kaelbling, L.P. (1991). Input generalization in delayed reinforcement learning: An algorithm and performance comparisons. Proceedings of the 1991 International Joint Conference on Artificial Intelligence, pp. 726-731, Sydney, Australia.
  • Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30. https://doi.org/10.48550/arXiv.1706.03741
  • D'Eramo, C., Cini, A., Nuara, A., Pirotta, M., Alippi, C., Peters, J., & Restelli, M. (2021). Gaussian approximation for bias reduction in Q-learning. Journal of Machine Learning Research, 22(277), 1-51.
  • D'Eramo, C., Nuara, A., Pirotta, M., & Restelli, M. (2017). Estimating the maximum expected value in continuous reinforcement learning problems. In Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), 1846-1846.
  • Dayan, P. (1993). Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5(4), 613-624. https://doi.org/10.1162/neco.1993.5.4.613
  • Devlin, S., Yliniemi, L., Kudenko, D., & Tumer, K. (2014). Potential-based difference rewards for multiagent reinforcement learning. In Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems, pp. 165-172.
  • Elallid, B.B., Benamar, N., Hafid, A.S., Rachidi, T., & Mrani, N. (2022). A comprehensive survey on the application of deep and reinforcement learning approaches in autonomous driving. Journal of King Saud University-Computer and Information Sciences, 34(9), 7366-7390. https://doi.org/10.1016/j.jksuci.2022.03.013
  • Everitt, T., Krakovna, V., Orseau, L., Hutter, M., & Legg, S. (2017). Reinforcement learning with a corrupted reward channel. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 4705-4713.
  • Gümüş, E. (2016). Q-Learning Algoritması ile Labirentte Yol Bulmak. 7(2), 1-23. https://github.com/emrahgumus/java-q-learning-labirent.git (Accessed: 10.09.2024)
  • Jogunola, O., Adebisi, B., Ikpehai, A., Popoola, S.I., Gui, G., Gačanin, H., & Ci, S. (2021). Consensus algorithms and deep reinforcement learning in energy market: A review. IEEE Internet of Things Journal, 8(6), 4211-4227. https://doi.org/10.1109/JIOT.2020.3032162
  • Jones, G.L., & Qin, Q. (2022). Markov chain Monte Carlo in practice. Annual Review of Statistics and Its Application, 9(1), 557-578. https://doi.org/10.1146/annurev-statistics-040220-090158
  • Jordan, M.I., & Mitchell, T.M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260. https://doi.org/10.1126/science.aaa8415
  • Kober, J., Bagnell, J.A., & Peters, J. (2013). Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 32(11), 1238-1274. https://doi.org/10.1177/0278364913495721
  • Lin, L.J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8, 293-321. https://doi.org/10.1007/BF00992699
  • Liu, H., Bi, W., Teo, K.L., & Liu, N. (2019). Dynamic optimal decision making for manufacturers with limited attention based on sparse dynamic programming. Journal of Industrial & Management Optimization, 15(2). https://doi.org/10.3934/jimo.2018050
  • Meng, T.L., & Khushi, M. (2019). Reinforcement learning in financial markets. Data, 4(3), 110. https://doi.org/10.3390/data4030110
  • Nykjaer, K. (2012). Q Learning Library. https://kunuk.wordpress.com/2012/01/14/q-learning-library-example-with-csharp/ (Accessed: 11.09.2024)
  • Pandey, P., Pandey, D., & Kumar, S. (2010). Reinforcement learning by comparing immediate reward. International Journal of Computer Science and Information Security, 8(5). https://doi.org/10.48550/arXiv.1009.2566
  • Parisotto, E. (2021). Meta reinforcement learning through memory. Doctoral dissertation, Carnegie Mellon University, Pittsburgh.
  • Van Seijen, H., Fatemi, M., Romoff, J., Laroche, R., Barnes, T., & Tsang, J. (2017). Hybrid reward architecture for reinforcement learning. Advances in Neural Information Processing Systems, 30. ISBN: 9781510860964.
  • Wang, J., Liu, Y., & Li, B. (2020). Reinforcement learning with perturbed rewards. In Proceedings of the AAAI Conference on Artificial Intelligence, 34(4), 6202-6209. https://doi.org/10.1609/aaai.v34i04.6086
  • Watkins, C.J.C.H. (1989). Learning from delayed rewards. Doctoral dissertation, King's College, Cambridge, UK.
  • Watkins, C.J.C.H., & Dayan, P. (1992). Q-learning. Machine Learning, 8, 279-292. https://doi.org/10.1007/BF00992698

Details

Primary Language: English
Subjects: Fisheries Management
Section: Articles
Authors

Mehmet Yaşar Bayraktar (ORCID: 0000-0003-3182-120X)

Publication Date: March 8, 2025
Submission Date: September 5, 2024
Acceptance Date: January 15, 2025
Published in Issue: Year 2025, Volume: 42, Issue: 1

How to Cite

APA Bayraktar, M. Y. (2025). Detection of moving fish schools using reinforcement learning technique. Ege Journal of Fisheries and Aquatic Sciences, 42(1), 21-26. https://doi.org/10.12714/egejfas.42.1.03