Hery, H., & Wawolangi, A. C. (2026). Decision Policy Optimization for Human–AI Collaboration Using Off-Policy Reinforcement Learning from Logged Interaction Data. International Journal for Applied Information Management, 6(2), 272–289. https://doi.org/10.47738/ijaim.v6i2.121