Authors:
(1) Avrim Blum, Toyota Technological Institute at Chicago, IL, USA;
(2) Melissa Dutz, Toyota Technological Institute at Chicago, IL, USA.
Table of Links
Abstract and 1 Introduction
2 Setting and 2.1 Models of behaviorally-biased opponents
3 Preliminaries and Intuition
4.1 Myopic Best Responder and 4.2 Gambler’s Fallacy Opponent
4.3 Win-Stay, Lose-Shift Opponent
4.4 Follow-the-Leader Opponent and 4.5 Highest Average Payoff Opponent
5 Generalizing
5.1 Other Behaviorally-Biased Strategies
5.2 Exploiting an Unknown Strategy from a Known Set of Strategies
6 Future Work and References
A Appendix
A.1 Win-Stay Lose-Shift Variant: Tie-Stay
A.2 Follow-the-Leader Variant: Limited History
A.3 Ellipsoid Mistake Bounds
A.4 Highest Average Payoff Opponent
4.4 Follow-the-Leader Opponent
Recall that the Follow-the-Leader opponent plays the best action in retrospect, defined as the action that would have achieved the highest cumulative payoff against our entire history of play. Against this opponent, our strategy is to first learn a best response to each of the opponent's actions, and then use the well-known ellipsoid algorithm to predict the opponent's next action each round while playing the best response to that prediction.
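To make the shape of this strategy concrete, here is a minimal sketch of the outer loop; it is our illustration, not the paper's pseudocode. The `best_response` map, the `predictor` object (an ellipsoid-based version is sketched in the next subsection), and the `game.play` / `game.n_our_actions` interface are all assumptions of this sketch.

```python
import numpy as np

def play_against_ftl(rounds, best_response, predictor, game):
    """Hypothetical outer loop against the Follow-the-Leader opponent.
    `best_response[a]` is our learned best response to opponent action a;
    `predictor` exposes predict(counts) and update(counts, predicted, actual);
    `game.play(ours)` returns the opponent's simultaneous action, and
    `game.n_our_actions` is the size of our action set (interface assumed)."""
    counts = np.zeros(game.n_our_actions)      # how often we have played each action
    history = []
    for _ in range(rounds):
        predicted = predictor.predict(counts)  # opponent reacts to past counts only
        ours = best_response[predicted]
        actual = game.play(ours)               # observe the opponent's action
        predictor.update(counts, predicted, actual)
        counts[ours] += 1                      # the opponent's next choice sees this
        history.append((ours, actual))
    return history
```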
Using Ellipsoid for Prediction
Note that although our bound on the number of losses and ties is exponential, the algorithm is computationally efficient: the action to play in each round can be chosen efficiently. This is an improvement over the simple general prediction algorithm of Section 3, which requires considering an exponential number of candidate game matrices just to choose the action for a single round. We also consider a limited-history variant of the Follow-the-Leader opponent below, against which we can achieve a polynomial bound on losses and ties.
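As one way to picture the prediction step, the following is a hedged sketch of an ellipsoid-based predictor, not the paper's exact construction. It assumes the opponent's unknown payoff matrix lies in a ball of known radius, and applies a standard central-cut ellipsoid update whenever a prediction is wrong; the class name, parameters, and tie-breaking behavior are illustrative assumptions.

```python
import numpy as np

class EllipsoidPredictor:
    """Sketch: maintain an ellipsoid over the opponent's unknown payoff
    matrix (flattened to a vector) and shrink it on each misprediction."""

    def __init__(self, n_opp_actions, n_our_actions, radius=1.0):
        self.k, self.m = n_opp_actions, n_our_actions
        d = self.k * self.m                   # dimension of the flattened matrix
        self.c = np.zeros(d)                  # center: current guess of the payoffs
        self.A = (radius ** 2) * np.eye(d)    # shape matrix of the ellipsoid

    def predict(self, counts):
        """Predict the opponent's next action: the one whose cumulative payoff
        against our play counts is highest under the center's payoff guess."""
        M = self.c.reshape(self.k, self.m)
        return int(np.argmax(M @ counts))

    def update(self, counts, predicted, actual):
        """A misprediction certifies that the true matrix satisfies
        (row_actual - row_predicted) . counts >= 0, while the center scored
        `predicted` at least as high; apply a central-cut ellipsoid step."""
        if predicted == actual:
            return
        g = np.zeros_like(self.c)
        g[actual * self.m:(actual + 1) * self.m] = counts
        g[predicted * self.m:(predicted + 1) * self.m] = -counts
        a = -g                                # feasible region: a . x <= 0
        Aa = self.A @ a
        aAa = float(a @ Aa)
        if aAa <= 1e-12:                      # degenerate cut (e.g. round one)
            return
        d = len(self.c)
        self.c = self.c - Aa / ((d + 1) * np.sqrt(aAa))
        self.A = (d * d / (d * d - 1.0)) * (
            self.A - (2.0 / (d + 1)) * np.outer(Aa, Aa) / aAa)
```

The per-round work here is a few matrix-vector products in dimension k·m, which is why each round's choice is efficient even though the total mistake bound is exponential: each cut shrinks the ellipsoid's volume by a fixed factor, so the number of mispredictions is bounded by how far the volume can shrink.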
4.4.1 Variant: Limited History
We discuss this variant in Appendix A.2.
4.5 Highest Average Payoff Opponent
The Highest Average Payoff opponent plays the action that has achieved the highest average payoff over the rounds in which they have played it. We discuss this opponent in Appendix A.4.
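For concreteness, here is a minimal sketch of this opponent's decision rule; the model only pins down the rule for actions already played, so the first-play behavior and tie-breaking below are assumptions of the sketch.

```python
import numpy as np

def highest_average_payoff_action(totals, counts):
    """Sketch of the opponent's rule: play the action with the highest average
    payoff over the rounds it has been played. `totals[a]` and `counts[a]` are
    action a's cumulative payoff and play count so far."""
    totals = np.asarray(totals, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if (counts == 0).any():          # assumption: unplayed actions are tried first
        return int(np.argmax(counts == 0))
    return int(np.argmax(totals / counts))
```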