Learning Heuristics

AI discussion, ideas, and SDK help.
MrEnder
Lux Newbie
Posts: 6
Joined: Thu Feb 10, 2011 4:26 pm

Learning Heuristics

Post by MrEnder » Thu Mar 03, 2011 8:23 pm

So I am taking an AI class on planning algorithms and we are studying algorithms which learn as they play more. I might be asking a variety of questions on this forum, so if you are interested let me know.

First Question:
One such method of creating a learning AI needs a variety of "features" to look at that might be relevant to the success of an agent. As the AI plays, it will decide which features were actually worth evaluating, but for now I just need to provide them. Here is what I have so far, but I would love to hear what kinds of things other human players look for.

Players Remaining
My # of Armies
My Income
Number of bordering Armies
Surface Area: A ratio of how many of my countries are touching enemy countries divided by the number of countries I own (lower is better)
Income ratio: My Income / Avg. Income, indicating how I am doing compared to other players.
Army ratio: My Armies / Avg # of Armies
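To make the feature idea concrete, here is a minimal sketch of a linear evaluation function over the features listed above. The feature names, values, and starting weights are all illustrative assumptions, not anything from the Lux SDK; a learning agent would adjust the weights from game outcomes.

```python
def evaluate(features, weights):
    """Score a board state as a weighted sum of its feature values."""
    return sum(weights[name] * value for name, value in features.items())

# Example feature vector for one hypothetical board state.
features = {
    "players_remaining": 4,
    "my_armies": 35,
    "my_income": 7,
    "bordering_armies": 22,
    "surface_area": 0.4,   # border countries / owned countries (lower is better)
    "income_ratio": 1.25,  # my income / average income
    "army_ratio": 1.1,     # my armies / average armies
}

# Hand-picked starting weights; a learner would tune these from self-play.
weights = {
    "players_remaining": -1.0,   # fewer opponents left is better
    "my_armies": 0.1,
    "my_income": 0.5,
    "bordering_armies": -0.05,
    "surface_area": -2.0,
    "income_ratio": 3.0,
    "army_ratio": 2.0,
}

score = evaluate(features, weights)
```

The learner's job then reduces to finding weights that make `evaluate` a good predictor of eventual success, and driving useless features' weights toward zero.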

Sophrosyne
Luxer
Posts: 18
Joined: Fri Nov 12, 2010 1:45 pm
Location: Birmingham, London

Post by Sophrosyne » Fri Mar 04, 2011 5:05 pm

Hey there,

I'd be interested to know how the algorithm would evaluate a decision once it has been made. Looking back over the decisions in a game, can you clearly distinguish between good and bad decisions? Or do you just consider every decision in a lost game to be a bad one, and every decision in a won game to be good?

I don't really play the game much at all, but I would also consider:

Continent bonuses (conquering continents/defending them)
The income/position of the strongest player
Whether I can take a player out to get their cards
Whether I have conquered a country yet to get a card

Hope that's helped,

Sophro

jesterme
Lux Duck Lover
Posts: 3178
Joined: Fri Oct 30, 2009 12:06 am

Post by jesterme » Fri Mar 04, 2011 5:19 pm

CARDS CARDS CARDS!!!

MrEnder
Lux Newbie
Posts: 6
Joined: Thu Feb 10, 2011 4:26 pm

Post by MrEnder » Fri Mar 04, 2011 5:24 pm

That is part of the trick. While it is possible to evaluate a game's decisions based solely on whether I won or lost, it wouldn't converge to a good policy very quickly. Instead, I might need some sort of distance function to reward myself for getting close to winning. This can be a very small part of the reward (entirely dominated by the reward for winning), but it will provide some guidance along the way to encourage good decisions. I will probably use something based on the number of players left and/or the percentage of countries and total armies I control. While these might not directly indicate that I am on my way to winning, they do seem reasonable.
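The shaped-reward idea above could be sketched like this: a large terminal reward for winning or losing, plus a tiny progress term so intermediate states still provide some gradient. The constants and field names are illustrative assumptions, chosen only so the win/loss reward clearly dominates the shaping term.

```python
def reward(state, won=None):
    """Reward for a state; `won` is only set once the game has ended."""
    if won is not None:
        return 100.0 if won else -100.0  # dominant terminal reward
    # Small shaping term: fewer players left and a larger share of the
    # board both suggest progress toward victory.
    progress = (1.0 / state["players_remaining"]
                + state["my_countries"] / state["total_countries"])
    return 0.1 * progress  # kept tiny so the terminal reward dominates
```

Because the shaping term is bounded well below the terminal reward, it can guide early learning without changing which end states the agent ultimately prefers.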

Thanks for the feature ideas! I think that what I will probably end up doing is converting the ability to defeat an AI from a boolean to an estimated percent chance.

Oh also, I think another feature might be: cards/3 * card-set value
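That card feature is a one-liner: potential sets held, scaled by what the next set would be worth. The value of the next set would come from the game's card turn-in schedule; here it is just a parameter.

```python
def card_feature(num_cards, next_set_value):
    """Approximate card strength: fraction of a set held times its cash-in value."""
    return (num_cards / 3) * next_set_value
```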

Sophrosyne
Luxer
Posts: 18
Joined: Fri Nov 12, 2010 1:45 pm
Location: Birmingham, London

Post by Sophrosyne » Fri Mar 04, 2011 5:47 pm

I don't think a policy would converge at all if you based the game's decisions solely on a win or a loss, to be honest. They tried similar things with Deep Blue and ended up going the brute-force route.

MrEnder
Lux Newbie
Posts: 6
Joined: Thu Feb 10, 2011 4:26 pm

Post by MrEnder » Fri Mar 04, 2011 6:13 pm

Ahh yes, this is sometimes the problem with courses in theory, isn't it? I wasn't planning on relying on that alone, though, and I think the evaluation of a Lux board is easier than the evaluation of a chess board.

Sophrosyne
Luxer
Posts: 18
Joined: Fri Nov 12, 2010 1:45 pm
Location: Birmingham, London

Post by Sophrosyne » Fri Mar 04, 2011 6:39 pm

I would say the opposite. Although by looking at the state of a Risk game you can get more information than by looking at a position in chess, I think the fact that Risk has stochastic elements and usually has more than two players makes it incredibly difficult for a machine to determine the outcome of the game just by looking at one state. The search space for chess is massive, so I'd hate to think what it's like for Risk :? but no one's going to be doing a brute-force thing for Risk for a while.

MrEnder
Lux Newbie
Posts: 6
Joined: Thu Feb 10, 2011 4:26 pm

Post by MrEnder » Sat Mar 05, 2011 7:41 pm

I guess what I said was a bit inaccurate. I didn't really mean that Risk is simpler, just that in Risk it is often easier to tell by looking at the board whether you are closer to or further from victory. There are good indicators of your relative chances at certain times. This assumes a few things (that your opponents behave rationally, for one), but if I am playing a game against some bots, I frequently know that if I control certain spots, I am in really good shape.

That brings me back to my feature list: the features I was describing above are what I am looking at as possible indicators of the current state of the game. Not necessarily of being close to or far from victory, but they are a good way of generalizing the position of a player. You can vastly reduce the state space if you consider all boards with the same feature values to be about the same. It isn't the way to break the game, but it is the way I am going to try to teach my AI to play. We'll see how it goes!
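That state-space reduction could be sketched as follows: discretize each feature into coarse buckets and treat all boards that land in the same bucket tuple as one abstract state, so a tabular learner keeps one value estimate per tuple. The bucket edges and feature names here are assumptions for illustration.

```python
from collections import defaultdict

def bucket(value, edges):
    """Index of the first edge the value falls below (last bucket otherwise)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def abstract_state(features):
    """Map a raw feature dict to a small, hashable abstract state."""
    return (
        features["players_remaining"],                      # already discrete
        bucket(features["income_ratio"], [0.5, 1.0, 1.5]),  # 4 coarse buckets
        bucket(features["army_ratio"], [0.5, 1.0, 1.5]),
        bucket(features["surface_area"], [0.3, 0.6]),
    )

# One value estimate per abstract state, updated as games are played.
values = defaultdict(float)
state = abstract_state({"players_remaining": 3, "income_ratio": 1.2,
                        "army_ratio": 0.9, "surface_area": 0.5})
values[state] += 0.1  # e.g. a small update after a favorable outcome
```

With four features at a handful of buckets each, the table stays tiny compared to the raw board space, at the cost of conflating every board that shares a bucket tuple.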
