Using Reinforcement Learning Methods to Price a Perishable Product, Case Study: Orange

Document Type : Journal of Mathematics and Modeling in Finance (JMMF)


1 Industrial Engineering Department, Science and Technology of Behshahr, Mazandaran, Iran

2 Department of Industrial Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran

3 Industrial Engineering Department, Science and Technology of Behshahr, Mazandaran, Iran



Determining the optimal selling price of a commodity has long been a central topic in scientific and industrial research. Perishable products have a short shelf life, and because they deteriorate over time they cause substantial losses if not managed properly. Many industries, retailers, and service providers can increase their revenue by optimally pricing perishable products that must be sold within a limited period. In this pricing problem, a seller must set the price of a number of units of a perishable or seasonal product that can be sold only for a limited time. This article examines pricing policies that increase the revenue from selling a given inventory with an expiration date. Reinforcement learning algorithms are used to analyze how companies can simultaneously learn and optimize their pricing strategy in response to buyers. It is also shown that reinforcement learning can be used to model a demand-dependent problem. This paper presents an optimization method for a model-free environment in which demand is learned and pricing decisions are updated in real time. We compare the performance of the learning algorithms using Monte Carlo simulations.
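To make the setting concrete, the following is a minimal sketch of model-free pricing with tabular Q-learning, evaluated by Monte Carlo simulation. It is an illustration under stated assumptions, not the algorithm or data used in the article: the price menu, horizon, inventory size, and the linear purchase probability in `buy_probability` are all hypothetical, and the learner never sees that demand function directly, only the sales it produces.

```python
import random

PRICES = [1.0, 1.5, 2.0]   # hypothetical price menu (assumption)
HORIZON = 10               # selling periods before the product expires (assumption)
INVENTORY = 5              # units available at the start (assumption)


def buy_probability(price):
    """Hypothetical demand: chance that the one buyer arriving this period purchases."""
    return max(0.0, 1.0 - 0.4 * price)


def step(inventory, price, rng):
    """Simulate one period and return (revenue, remaining inventory)."""
    sold = 1 if inventory > 0 and rng.random() < buy_probability(price) else 0
    return price * sold, inventory - sold


def greedy_action(q, t, inv):
    """Index of the price with the highest learned value in state (t, inv)."""
    return max(range(len(PRICES)), key=lambda i: q[t][inv][i])


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (period, inventory); undiscounted finite horizon."""
    rng = random.Random(seed)
    # q[t][inv][a]; the extra row q[HORIZON] stays zero (unsold stock is worthless).
    q = [[[0.0] * len(PRICES) for _ in range(INVENTORY + 1)]
         for _ in range(HORIZON + 1)]
    for _ in range(episodes):
        inv = INVENTORY
        for t in range(HORIZON):
            if inv == 0:
                break
            # Epsilon-greedy exploration over the price menu.
            if rng.random() < epsilon:
                a = rng.randrange(len(PRICES))
            else:
                a = greedy_action(q, t, inv)
            reward, nxt = step(inv, PRICES[a], rng)
            target = reward + max(q[t + 1][nxt])
            q[t][inv][a] += alpha * (target - q[t][inv][a])
            inv = nxt
    return q


def evaluate(q, runs=2000, seed=1):
    """Monte Carlo estimate of average revenue per selling season under the greedy policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        inv = INVENTORY
        for t in range(HORIZON):
            if inv == 0:
                break
            reward, inv = step(inv, PRICES[greedy_action(q, t, inv)], rng)
            total += reward
    return total / runs
```

Calling `q = train()` followed by `evaluate(q)` gives an estimated average revenue for the learned policy; repeating the evaluation for a different learning rule or exploration rate is one simple way to compare algorithms by simulation.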