Probabilistic Genomes for Genetic Programming

This thesis investigates the problem-solving capabilities of Probabilistic Plushies.

Additional notes on the changes I would have made to the thesis paper if given more time for revisions:

  • State that our attempt at equalization probably overcompensates, so the probabilistic approach is likely better than the data suggest at first glance.
  • Explain that although the average sizes of the Probabilistic Plushy genomes are much larger than those of the Non-Probabilistic Plushy genomes, many of the genes in the Probabilistic Plushies have a probability of 0, so their effective size is smaller.
  • Suggest future work on developing a fairer comparison that actually accounts for all computational resources, and on initializing the sizes of probabilistic genomes differently.
  • Add in section 4.1 that the evidence suggests crossover is particularly effective in the context of large, untuned genetic sources, and discuss how it could make promising contributions to the GP field.
  • Provide more tables that present the number of near-1 and near-0 probabilities in the Probabilistic Plushy genomes that Propeller outputs from the probabilistic runs.
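
To make the notes on effective size and near-0/near-1 counts concrete, here is a minimal Python sketch of how such summary numbers could be computed. It is not taken from the thesis or from Propeller. It assumes a probabilistic genome can be represented as a list of (gene, probability) pairs, and that a gene's probability is its chance of being included in the program. The gene names, the function names, and the 0.05 threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each gene paired with its inclusion probability.
ProbGenome = List[Tuple[str, float]]


def expected_size(genome: ProbGenome) -> float:
    """Expected number of expressed genes, assuming each gene is included
    independently with its own probability (an assumption of this sketch)."""
    return sum(p for _, p in genome)


def count_extremes(genome: ProbGenome, eps: float = 0.05) -> Dict[str, int]:
    """Count genes whose probability is within eps of 0 or of 1.
    The threshold eps is an arbitrary illustrative choice."""
    near_zero = sum(1 for _, p in genome if p <= eps)
    near_one = sum(1 for _, p in genome if p >= 1.0 - eps)
    return {
        "near_0": near_zero,
        "near_1": near_one,
        "other": len(genome) - near_zero - near_one,
    }


# Made-up example genome, for illustration only.
example: ProbGenome = [
    ("in1", 1.0),
    ("integer_add", 0.97),
    ("exec_dup", 0.0),
    ("integer_mult", 0.5),
    ("boolean_and", 0.02),
]

print(len(example))             # raw genome length
print(expected_size(example))   # effective (expected) size
print(count_extremes(example))  # near-0 / near-1 tallies
```

In this made-up example the raw length is 5 but the expected size is about 2.49, which is the kind of gap between raw and effective size that the second bullet describes.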

Very cool! I just had a glance and will read the paper in detail later. One quick thought for follow-up work: the gene probabilities could also be used to interpret the program. (This may be incorrect; just some thoughts.) I would assume that more important, essential genes will have higher probabilities, or, if we think of the genes in a certain module, that interchangeable genes will have similar probabilities in the ideal case. The ultimate goal would be a single probabilistic genome representation that can explain most of the variants of valid programs. If we then want to expand on that program, it would be much easier, since we can start from an existing search space that covers the variants of the prerequisites.


My suspicion is that many/most genes will end up with probabilities close to 0 or close to 1. Looking into this is one of the reasons for @dndang23’s last bullet item above.