Research Article Open Access

Reinforcement Learning in Financial Services: Modelling Payment Switching as a Multi-Armed Bandit Problem

Ishaya Gambo1, Christopher Agbonkhese2, Segun Aina1, Mogboluwaga Tayo Otegbayo3, Johnson Bayo Adekunle4, Israel Odetola1, Omobola Gambo5, Tolulope Oluwadare1 and Oluwatoni Odetola1
  • 1 Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
  • 2 Department of Digital and Computational Studies, Bates College, Lewiston, United States
  • 3 Vitruvian Shield PT, LDA, Portugal
  • 4 Venture Garden Group, Ikeja, Lagos, Nigeria
  • 5 Department of Arts and Social Science Education, Lead City University, Nigeria

Abstract

The digital payments landscape evolves quickly and demands continuous innovation. This study addresses that need by simulating a model for payment routing, a crucial part of the digital payment ecosystem. Interviews with industry professionals informed the approach and pointed to data randomization as an effective way to obtain data. Using Python, we create a randomized dataset and implement and evaluate three Reinforcement Learning (RL) algorithms: Epsilon Greedy, Upper Confidence Bound (UCB), and Thompson Sampling. We adopt the Multi-Armed Bandit (MAB) framework to model payment routing as a resource allocation problem, offering a computational approach to real-world resource allocation dilemmas. Simulation avoids the cost of real-time transactions, so the algorithms can be studied without consequences for customers, businesses, or payment providers. Among the RL algorithms studied, UCB emerges as the most effective on this MAB problem, corroborating findings from prior research. The results suggest both that real-world problems can usefully be modeled as MABs and that UCB performs strongly on such problems. The paper underscores the need for more attention to the non-consumer-facing aspects of the financial services industry and for cross-disciplinary research to build infrastructure and software solutions. Researchers can extend this study by applying MAB algorithms in other domains where a system must choose among several options. The simulation-based approach offers a cost-effective way to test system performance and hypotheses across industries.
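To make the setup concrete, the following is a minimal sketch of how the three algorithms named above could be compared on a simulated routing problem. It is not the authors' implementation. It assumes each payment processor is an arm with a fixed success probability and that a transaction yields a reward of 1 on success and 0 on failure. The success rates, the function names, and the parameters (`epsilon`, `rounds`, `seed`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical per-processor success probabilities (illustrative, not from the paper).
SUCCESS_RATES = [0.70, 0.80, 0.95]


def _argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def epsilon_greedy(counts, successes, t, rng, epsilon=0.1):
    """With probability epsilon pick a random arm, otherwise the best observed mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    means = [s / c if c else 0.0 for s, c in zip(successes, counts)]
    return _argmax(means)


def ucb1(counts, successes, t, rng):
    """UCB1: try every arm once, then pick the highest mean plus confidence bonus."""
    for arm, c in enumerate(counts):
        if c == 0:
            return arm
    scores = [s / c + math.sqrt(2 * math.log(t) / c)
              for s, c in zip(successes, counts)]
    return _argmax(scores)


def thompson(counts, successes, t, rng):
    """Thompson Sampling with a Beta(1, 1) prior on each arm's success rate."""
    samples = [rng.betavariate(s + 1, c - s + 1)
               for s, c in zip(successes, counts)]
    return _argmax(samples)


def simulate(policy, rates, rounds=5000, seed=0):
    """Route `rounds` simulated transactions; return (pulls per arm, total successes)."""
    rng = random.Random(seed)
    counts = [0] * len(rates)
    successes = [0] * len(rates)
    for t in range(1, rounds + 1):
        arm = policy(counts, successes, t, rng)
        reward = 1 if rng.random() < rates[arm] else 0  # Bernoulli outcome
        counts[arm] += 1
        successes[arm] += reward
    return counts, sum(successes)


if __name__ == "__main__":
    for policy in (epsilon_greedy, ucb1, thompson):
        counts, total = simulate(policy, SUCCESS_RATES)
        print(f"{policy.__name__}: pulls={counts}, success rate={total / sum(counts):.3f}")
```

Which algorithm comes out ahead in such a sketch depends on the assumed success rates, the horizon, and the exploration parameter, so it should not be read as reproducing the paper's result.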

Journal of Computer Science
Volume 20 No. 11, 2024, 1519-1529

DOI: https://doi.org/10.3844/jcssp.2024.1519.1529

Submitted On: 7 March 2024
Published On: 12 October 2024

How to Cite: Gambo, I., Agbonkhese, C., Aina, S., Otegbayo, M. T., Adekunle, J. B., Odetola, I., Gambo, O., Oluwadare, T. & Odetola, O. (2024). Reinforcement Learning in Financial Services: Modelling Payment Switching as a Multi-Armed Bandit Problem. Journal of Computer Science, 20(11), 1519-1529. https://doi.org/10.3844/jcssp.2024.1519.1529

Keywords

  • Multi-Armed Bandit Problem
  • Reinforcement Learning
  • Digital Payments
  • Transaction
  • Simulation