As I continue to play with the IPD, I wrote some code to genetically evolve IPD strategies, to see whether cooperation could emerge from random behaviours. I used the standard GA I've built in Wintermute, with each agent represented by the five weights detailed in my last post, and with fitness calculated as the average points earned in each bout, subtracted from 5 so that the GA (which searches for a minimum) could be used unchanged. I then ran a population of 500 agents with mutation and elitism rates of 0.5% per generation.
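To make the setup concrete, here is a minimal sketch of how a bout and the fitness function might look. The weight ordering (first move, then the response to each of the four payoffs: Reward, Sucker, Temptation, Punishment) is my assumption, as are the function names — Wintermute's internals may differ:

```python
import random

# Standard IPD payoffs from (my move, their move): C = cooperate, D = defect
PAYOFF = {('C', 'C'): 3,   # Reward
          ('C', 'D'): 0,   # Sucker's payoff
          ('D', 'C'): 5,   # Temptation
          ('D', 'D'): 1}   # Punishment

def play_bout(a, b, rounds=100):
    """Play one IPD bout between two five-weight agents.

    Each agent is (p_first, p_R, p_S, p_T, p_P): the probability of
    cooperating on the first move, then after receiving the Reward,
    Sucker, Temptation or Punishment payoff (assumed ordering).
    Returns each agent's average score per round."""
    idx = {3: 1, 0: 2, 5: 3, 1: 4}  # last payoff -> weight index
    score_a = score_b = 0
    pa, pb = a[0], b[0]
    for _ in range(rounds):
        move_a = 'C' if random.random() < pa else 'D'
        move_b = 'C' if random.random() < pb else 'D'
        sa, sb = PAYOFF[(move_a, move_b)], PAYOFF[(move_b, move_a)]
        score_a += sa
        score_b += sb
        pa, pb = a[idx[sa]], b[idx[sb]]
    return score_a / rounds, score_b / rounds

def fitness(avg_score):
    # The GA minimises, so subtract the average score from 5
    # (the Temptation payoff, the best possible per-round score).
    return 5 - avg_score
```

Two unconditional cooperators score the Reward payoff of 3 every round, giving them a fitness of 5 − 3 = 2 — the floor the GA converges towards later in the post.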
It took me a while to tweak the GA before it started working, but I got there in the end, with fascinating results. Here is a chart of the distribution of strategies, along with the best fitness, for each iteration:
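For reference, a single GA generation over these five-weight agents might look something like the sketch below. This is my own minimal version, not Wintermute's actual code; the tournament selection and uniform crossover are assumptions, while the 0.5% mutation and elitism rates match the run described above:

```python
import random

def next_generation(pop, fitnesses, mutation_rate=0.005, elitism_rate=0.005):
    """One GA generation over five-weight agents (a sketch; the real
    GA may differ). Lower fitness is better, matching the
    5-minus-average-score fitness used in the experiment."""
    n = len(pop)
    # Rank agents best-first (minimisation)
    ranked = [a for _, a in sorted(zip(fitnesses, pop), key=lambda t: t[0])]
    n_elite = max(1, int(n * elitism_rate))
    new_pop = ranked[:n_elite]  # elites are copied over unchanged
    while len(new_pop) < n:
        # Tournament selection: best of three random agents, twice
        p1 = min(random.sample(range(n), 3), key=lambda i: fitnesses[i])
        p2 = min(random.sample(range(n), 3), key=lambda i: fitnesses[i])
        # Uniform crossover on the five weights
        child = tuple(random.choice((pop[p1][k], pop[p2][k])) for k in range(5))
        # Per-weight mutation: replace with a fresh random weight
        child = tuple(random.random() if random.random() < mutation_rate else w
                      for w in child)
        new_pop.append(child)
    return new_pop
```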
Over the first 200 generations, defecting strategies (All-D and Spiteful) are by far the dominant strategies. At that point, though, TFT invades the population, and within 100 generations it makes up more than half of it. TFT quickly crumbles, however, with Pavlov and Spiteful picking up the slack. Interestingly, All-D never recovers, never exceeding a single-digit share of the population after that.
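Under my assumed five-weight encoding (first move, then the cooperation probability after the Reward, Sucker, Temptation and Punishment payoffs), the named strategies correspond to these weight vectors — note Spiteful is the memory-one rendering of the grim trigger:

```python
# Assumed encoding: (p_first, p_after_R, p_after_S, p_after_T, p_after_P),
# each entry being the probability of cooperating in that situation.
STRATEGIES = {
    'All-D':    (0, 0, 0, 0, 0),  # always defect
    'All-C':    (1, 1, 1, 1, 1),  # always cooperate
    'TFT':      (1, 1, 0, 1, 0),  # copy the opponent's last move
    'Pavlov':   (1, 1, 0, 0, 1),  # win-stay, lose-shift
    'Spiteful': (1, 1, 0, 0, 0),  # cooperate until betrayed, then defect forever
}
```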
More interesting, however, are the cycles the world enters, with strategies exploiting each other over successive generations. TFT and Pavlov appeared to enjoy longer and longer periods of "prosperity", and running the GA for 2,000 iterations ultimately let TFT win, sometimes taking 96% of the total population.
However, despite wild swings in the dominant strategies, neither the average fitness nor the best fitness changes much at all past the 200-300th iteration, with both hovering just above 2 (meaning strategies are scoring close to '3', the Reward payoff). What I did notice was that the initial-move weight nearly always converged quickly to "Cooperate" in the most successful strategies, making all evolved strategies "nice".
The next step was to introduce noise, to see if a strategy such as Pavlov would evolve, as many have noted. Introducing noise at a very high level (10%) simply caused an All-D strategy to rocket to the top, and no other strategy could gain a foothold. At 5%, All-D was the biggest strategy until about the 1,000th generation, when Spiteful took hold, followed by TFT and Pavlov. At 2.5% noise, All-D was quickly superseded, with Spiteful and TFT vying for the dominant position. In this case, though, the fitnesses varied much more significantly and oscillated quite regularly. Pavlov, however, seemed to gain little additional traction compared to previous runs, despite numerous tweaks.
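Noise hurts TFT in particular because a single accidental defection between two TFT players triggers an endless echo of alternating retaliation (which Pavlov, by contrast, shakes off in two rounds). A small sketch of the echo, with a single forced defection standing in for one noise event (the function names are mine):

```python
def tft(opp_last):
    """Tit-for-Tat: cooperate on the first move, then copy the
    opponent's last move."""
    return opp_last if opp_last is not None else 'C'

def tft_pair(rounds, error_round=None):
    """Two TFT players; optionally force player A to defect once
    (simulating a single noise event) and watch the echo it causes."""
    moves = []
    last_a = last_b = None
    for r in range(rounds):
        a = 'D' if r == error_round else tft(last_b)
        b = tft(last_a)
        moves.append((a, b))
        last_a, last_b = a, b
    return moves
```

Without the error the pair cooperates forever; with one forced defection the players alternate (D, C), (C, D), (D, C)… indefinitely, dragging both well below the Reward payoff.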
Finally, I wanted to see if running the simulation for longer would "breed" further cooperation. Running a larger population over a larger number of generations did seem to create more cooperative strategies, like one I named "Forgive" (1, 0, 1, 1), which defects in response to a Sucker's payoff but otherwise always cooperates. This has all the traits of a successful strategy per Axelrod's original observations: "nice", "retaliating", "forgiving" and "non-envious". However, this strategy never evolved into even more cooperative behaviour, as a TFT or Spiteful strategy would exploit the inherent naivety in the population.
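That naivety is easy to see against an unconditional defector: Forgive retaliates for exactly one round after being suckered and then cooperates again, so it alternates between the Sucker's payoff and the Punishment while the defector alternates between Temptation and Punishment. A small sketch (the function names are mine, and reading (1, 0, 1, 1) as the post-Reward/Sucker/Temptation/Punishment cooperation weights is my assumption):

```python
# Standard IPD payoffs from (my move, their move)
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def forgive_move(last_payoff):
    """'Forgive' (1, 0, 1, 1): defect only straight after receiving the
    Sucker's payoff (0); otherwise cooperate, including on the first move."""
    return 'D' if last_payoff == 0 else 'C'

def forgive_vs_alld(rounds=10):
    """Average per-round scores when Forgive meets an unconditional defector."""
    last = None
    score_f = score_d = 0
    for _ in range(rounds):
        f = forgive_move(last)
        score_f += PAYOFF[(f, 'D')]   # Forgive's payoff this round
        score_d += PAYOFF[('D', f)]   # the defector's payoff this round
        last = PAYOFF[(f, 'D')]
    return score_f / rounds, score_d / rounds
```

Over an even number of rounds Forgive averages (0 + 1) / 2 = 0.5 points per round while the defector averages (5 + 1) / 2 = 3, so any exploiter in the population feeds on it.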
This has been very interesting, but I have some thoughts on expanding this:
- My fitness function seems to limit the GA's ability to search beyond strategies that average the Reward payoff. While this was useful for this experiment, how would you keep applying evolutionary pressure to these strategies?
- Co-evolving the payoff matrix? Which way would this drag the population?
- Would more complex strategies (such as TF2T etc.) enable greater diversity and therefore produce a more cooperative population?
- Would running a very large population over a much greater number of iterations (Ball spoke about 92,000 generations) generate different results? My initial findings are pretty cyclical in nature… but it is hard to determine how the GA would do long-term.
If people have any interesting results to share, please let me know!