Hery, H., and A. C. Wawolangi. “Decision Policy Optimization for Human–AI Collaboration Using Off-Policy Reinforcement Learning from Logged Interaction Data”. International Journal for Applied Information Management, vol. 6, no. 2, June 2026, pp. 272-89, doi:10.47738/ijaim.v6i2.121.