We present a framework for a class of sequential decision-making problems in the context of max-min bilevel programming, where a leader and a follower repeatedly interact. At each period, the leader allocates resources to disrupt the performance of the follower (e.g., as in defender-attacker or interdiction problems), who in turn minimizes some cost function over a set of activities that depends on the leader’s decision. While the follower has complete knowledge of his problem, the leader has only partial information, and needs to learn about the cost parameters, available resources, and the follower’s activities from the feedback generated by the follower’s actions. We measure policy performance in terms of its time-stability, defined as the number of periods it takes for the leader to match the actions of an oracle with complete information. In particular, we propose a class of greedy and robust policies and show that these policies are weakly optimal, eventually match the oracle's actions, and provide a real-time certificate of optimality. We also study a lower bound on any policy performance based on the notion of a semi-oracle. Our numerical experiments demonstrate that the proposed policies consistently outperform reasonable benchmark, and perform fairly close to the semi-oracle. This is a joint work with Juan Borrero (Oklahoma State University) and Denis Saure (University of Chile).
Dr. Oleg Prokopyev is a Professor in the Department of Industrial Engineering at the University of Pittsburgh. He received MS and PhD degrees in industrial and systems engineering from the University of Florida and BS and MS degrees in applied mathematics and physics from Moscow Institute of Physics and Technology (Moscow, Russia). Dr. Prokopyev’s research interests are in the areas of combinatorial optimization, stochastic programming and applications of Operations Research in health care, bioinformatics, network analysis and military problems. His research has been supported by the National Science Foundation, Air Force Office of Scientific Research (AFOSR), Department of Veteran Affairs and the Defense Threat Reduction Agency. Dr. Prokopyev is a recipient of the AFOSR Young Investigator Program Award. He is the Co-Editor-in-Chief of Optimization Letters and serves on the editorial boards of IIE Transactions, Journal of Global Optimization and Omega.