Tags (and what happened to Names)

“Tags,” in genetic programming and particularly in the context of Push/PushGP, are identifiers that can be used to name, store, and retrieve code and data as a program runs. The distinctive property of Tags relative to the “Names” used in Push1-Push3 is that Tags can match inexactly. This feature is hypothesized to be helpful for the evolution of reference, and hence for the evolution of complex and modular systems.

History and Motivation

Tags were developed as a successor/replacement for Push “Names,” which were included in the language (in a couple of different experimental forms) through Push3 (GECCO paper, language specification). While Names performed as expected in hand-written Push programs, they were rarely (if ever) used in evolved programs. One reason probably stems from the number of Names in circulation during evolution: If there are too few then evolving systems will not be able to scale, while if there are too many then it will be rare for storage/retrieval operations to refer to the same names by chance, and therefore unlikely that name usage will evolve.

The conceptual difference between Tags and Names is that Tags can match inexactly: If one tries to retrieve something with a particular Tag, but that exact tag has not previously been used to tag (store) anything, then the retrieval operation will return the item that has been tagged with the closest matching tag. This concept of tags, with inexact matching, stems from the work of John Holland (perhaps discussed most fully in his book Hidden Order: How Adaptation Builds Complexity).

With Tags, as long as at least one thing has been tagged, all tag references will refer to something. The motivation for providing Tags in an evolutionary computation system is the hypothesis that this feature – that references may be produced easily by chance and change incrementally over evolutionary time – may be enough to get the ball rolling on the use of storage/reference in an evolving system.

In principle, tags and tag-based reference mechanisms could be added to many different kinds of evolutionary computation systems, including many different kinds of genetic programming systems. However, attempts to provide tagging mechanisms in tree-based genetic programming have had limited success.

A related concept of tags, also deriving from Holland’s work and the notion of inexact matching, but different in details from the concept described here for evolutionary computation, has been used in game-theoretic studies of the evolution of altruism (e.g. this, which started it all, and this, and this from members of the Push community).

Tags in Clojush

In Clojush tags are integers that are “baked in” to tagging and tag-based retrieval instructions. For example, the instruction tag_exec_123 will tag the top item on the exec stack with the tag 123, and the instruction tagged_456 will push whatever has been tagged with the tag that most closely matches 456 onto the exec stack. “Closeness” is implemented in a one-directional way, with wraparound: if nothing has been tagged with 456 then 457 is the closest, followed by 458, 459, etc., and if no higher-numbered tags were used then the closest is 0, then 1, etc.
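To make the matching rule concrete, here is a minimal Python sketch of that one-directional, wraparound lookup. This is my own paraphrase, not Clojush’s actual implementation; the function name `closest_tag` and the tag-space size `limit` are hypothetical.

```python
# Hypothetical sketch of one-directional tag matching with wraparound
# (not Clojush's actual code). Tags are integers in [0, limit); a lookup
# returns the stored tag reached first by counting upward from the query,
# wrapping past the top of the tag space back around to 0.

def closest_tag(tagged, query, limit=1000):
    """Return the key in `tagged` that most closely matches `query`,
    searching upward from `query` and wrapping around at `limit`."""
    if not tagged:
        return None  # nothing has been tagged yet, so lookups find nothing
    # Distance is how far we must count upward (mod limit) to reach t,
    # so an exact match has distance 0 and lower-numbered tags are "far".
    return min(tagged, key=lambda t: (t - query) % limit)

store = {123: "item-a", 500: "item-b"}
closest_tag(store, 456)   # → 500 (first tag at or above 456)
closest_tag(store, 600)   # → 123 (wraps past 999 back around to 0)
```

Note the asymmetry: a query of 456 matches 500 even though 123 is numerically closer, because the search only counts upward (mod the tag-space size).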

The full set of tag-related instructions can be found in clojush.instructions.tag.

Tagging is generally enabled by including tag-related “erc” (ephemeral random constant) functions in the atom-generators of a PushGP run. These functions are defined in clojush.instructions.tag and demonstrated in problems such as clojush.problems.demos.tagged-regression.

Early experiments with tags in Clojush are described here and here.

I’d like to add that we have never found evidence of tags providing useful functionality beyond a small set of problems with very particular characteristics, namely the lawnmower and DSOAR problems. On these problems we found increases in performance, but when we studied tag usage further, we found that tags were primarily used to implement looping constructs. Additionally, these problems work by side effect, meaning an infinite loop to “mow the whole lawn” is sufficient, as the program does not need to return a correct functional value.

We have tried to show the utility of tags on numerous other problems, including symbolic regression and classification problems for which modularity would be very natural to use for a human programmer. On none of these problems did the presence of tags increase performance.

I personally haven’t given up on tags entirely, and would be very interested to see their evolved use on non-trivial problems. On the other hand, I think some sort of additional objectives for modularity or tag use might be necessary to get the critical mass of tags necessary for them to be useful.



Indeed, the evolved programs did things that looked more like looping in spaghetti code than like human-engineered “modularity,” but I think that whether or not this was nonetheless significant/interesting is a complicated question, and I’m not sure that the functional/side-effecty nature of the problems says anything about this one way or another.

But I agree that so far the demonstrations of utility for tags have been pretty limited and weak.

[EDIT: One context in which tag use has definitely evolved has been in work on clojush.problems.software.calc. But there it’s impossible to make any progress at all without using tags, because of the “tagged entry point” architecture. And this problem is unusual for other reasons as well, and has not yet (to my knowledge!) been solved in any significant sense.]

Agreed also that there’s more to investigate, refine, and develop!

Can I just say: Computer Science nerds are the finickiest nerds I have ever met.

I used to have to teach this stuff. To undergrads. And they had to memorize it. And they were happy. (Maybe some of those are lies).

But it’s arguably the worst spaghetti code ever written down.



What a coincidence – that’s exactly what we evolved with PushGP tags! :wink:


So in looking at @lspector’s slides for the “Expressive GP” tutorial, I found myself asking a lot of questions about these tag things since there are several slides devoted to them.

I went and looked through all of @thelmuth’s “clustering” results (essentially anything where he saved CSV files with entire populations across the whole run). There are 900 log files (runs) in that set, 249 of which were successful. 16 of those 249 have the string “tag” somewhere in the simplified successful program at the end. 11 have both “tag_” and “tagged_”, so 11 of 249 (less than 5%) both declare a tag and use a tag in the simplified, successful program.

Maybe. It could be less. I’ve got a few questions about how these things are actually implemented that I’m hoping @thelmuth and/or @lspector can help clarify:

  • When an instruction (or block of instructions) is tagged, is it popped from the exec stack? This appears to be under the control of a global-pop-when-tagging variable, which appears to default to true and to still have been true in these runs.
  • It is not executed when it’s popped, right? So I think it quietly vanishes when it’s tagged, only to appear again later if a tag is searched for. Is that true?
  • Tagging is done when the tag instruction is executed, and not sooner, right? So if I have a tagged instruction first, it won’t find anything, right? If so, my simple grepping above may be overcounting, as there might be programs there that have both a tag_ and a tagged_, but in the wrong order, so the tagging actually has no effect.

So there could be fewer than 11 because of the ordering thing. There could also be fewer than 11 because some of the tag_ instructions might be trying to tag items on empty stacks, and thus turn into NOOPs.

A quick skim (some of these are quite long, even after simplification) identifies a few that use tags to re-arrange the order of things. This solution to Syllables:

(\e "ThF &umb1r oV syllable=1isa" integer_fromchar 
string_last \y \o in1 char_swap \i string_replacechar 
tag_exec_950 (\u \i string_replacechar \i string_replacechar 
   \i char_yankdup string_replacechar string_occurrencesofchar 
string_replacechar "The number of syllables is " \o \e 

for example, appears to be using tags to essentially move the tagged block from where it is to the end of the program, where it is called once. So not really any kind of modularity so much as a way to fix a bit of bad positioning.

I think this solution to string-length-backwards:

(in1 vector_string_reverse 
   (string_length tagged_225 print_integer 
        tag_exec_465 print_newline))

is also using tags to move an instruction (the print_newline) from the end of that block to the second spot, but only after the loop has been executed once, thereby avoiding a spurious newline the first time through the loop.

And I may have identified one instance where tagging is being used in a tag-y sort of way to create a module. If I’m understanding all the pieces correctly, I think this solution to negative-to-zero actually uses tags to create some sort of module:

   (boolean_stackdepth integer_min boolean_stackdepth 
    vector_integer_replace tagged_279) 
exec_dup (vector_integer_pushall in1) tagged_279)

This starts by tagging an exec block, then it sets some things up and calls that block. That block ends in a tagged call, so it is effectively an infinite loop, relying on the step limit to stop it eventually. I think the block takes the next value from the integer stack (which contains all the values in the input vector as part of the setup), mins it with 0, and then either (a) replaces all instances of 0 with 0 if the vector value was ≥0 or (b) replaces all instances of the value with 0 if the vector value was ≤0. So it works, which is cool, and it is actually a tagged block that is being called with something akin to a GOTO (less cool? more cool?).
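That min-then-replace reading of the evolved program can be paraphrased in Python. To be clear, this is my interpretation of what the Push code seems to compute, not a translation of it; the function name is hypothetical, and the evolved version loops by tagged GOTO under a step limit rather than a bounded for-loop.

```python
# Hypothetical paraphrase of the evolved negative-to-zero trick: for each
# value v in the input vector, replace every occurrence of min(v, 0) with 0.
# When v >= 0 the target is 0, so the replacement is a no-op (0 -> 0);
# when v < 0 the target is v itself, so every copy of v gets zeroed.
# After one pass over the values, all negatives have been replaced by 0.

def negative_to_zero(vec):
    result = list(vec)
    for v in vec:
        target = min(v, 0)
        result = [0 if x == target else x for x in result]
    return result

negative_to_zero([3, -1, 0, -7, 2])  # → [3, 0, 0, 0, 2]
```

Seen this way, the evolved program isn’t doing anything conditional at all: it applies the same min/replace step uniformly, and the arithmetic makes the non-negative cases harmless.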

In general, though, it really doesn’t look like tagging is happening much, and I’d bet good money that @thelmuth could have run all these experiments without the tagging instructions and seen little to no impact on the results.

I’ve included all 11 of the simplified solutions that my quick grep identified in this file: tagged_programs.txt (8.6 KB)


This is really great to see!!

I think that the answers to all of @mcphee’s questions are “yes.”

We have suspected for some time that tags were doing less than we had hoped (and less than our initial, methodologically weak tests suggested), but we never previously had the analytic tools/chops to tell for sure. This seems to be a solid confirmation of that suspicion, along with documentation of one notable exception.

The motivations for tags, or something like them, still seem to me to be compelling in the long run. And I think that digging deeper into this issue will help us to understand more about the relationships between modularity, robustness, and evolvability (because I think we’re finding that non-modular solutions evolve more because they’re more robust… but I assume that modularity is necessary for some kinds of large-scale evolutionary innovation).

I’ve put this on the agenda for our next meeting (which hasn’t yet been scheduled).

[Added: we should probably remove tags, among other things, from the GECCO tutorial, since we need to cut anyway to add stuff. I hope to work on that later today.]


Yay! I can read code! :stuck_out_tongue:

Actually, this only used data you’ve had for ages (the log files) and some very simplistic grepping. I had thought I might do something more sophisticated, which is why I chose the runs with the CSV outputs, but in the end the number of runs that used tags was so small that grepping and eyeballs got me much of the way there.

One thing I’d like to look at is whether the same is true for the autoconstructive runs, but I haven’t gotten there.

Agreed. I took a bunch of notes on those slides, which I’m writing up now in an e-mail to you, and one of the suggestions is to drop the material on names and tags since there isn’t anything stronger than speculation to say on either of them.


This is great to look into! TBH, I had forgotten I had even included tagging in the instruction sets. They have not come up (often) in the solutions I’ve looked at! :anguished:

Could you share your grep/script for finding all of these simplified programs? I’ve wanted to look at similar things in the past, but it seemed daunting to figure out how to grab all of those.

It sounds like evolution isn’t often making interesting use of tagging. In our initial results, which ran on the lawnmower and DSOAR problems, we got improved results with tags. But, looking at the evolved programs, it seemed like almost always those tags were implementing rudimentary looping, like in your last example. In those experiments we didn’t provide any other methods for looping/recursion, so it was a big boon to have tags.

But, I haven’t seen any results that suggest that tags help things when used alongside all of the typical Push :exec stack manipulators. Your findings line up with what I would expect – little use of tags, and probably no significant improvements. We could of course do more runs without tags to see if we get any difference, though this doesn’t seem worthwhile at this point.

I agree wholeheartedly with this. It would be great to get modularity things like tags being used more, but at least we are starting to understand why they don’t seem to be used often.


Since it has long been observed that tags do not improve anything, should they still be used by default in Clojush?

I had done some experiments (not published anywhere) 2 years back as part of a course project and the results clearly showed that tags do not affect the success rate. Since then, I have stopped using tags in my experiments. Should we change the default, and let the users decide if they want to include tags in their experiments?


I too have done some experiments showing tags neither help nor hurt.

Our typical approach to the instruction set is to include anything that isn’t obviously hurting evolution. Is that a good idea? I don’t know. I imagine we could eke out a few more successes by tuning our instruction sets better, but I wouldn’t bet on huge improvements unless you went to problem-specific (and maybe domain knowledge-inspired) instruction sets. That said, I’d be happy if someone wants to work on ways to better tune instruction sets!

All of that said, I’d vote for leaving them on by default for the program synthesis problems, but have no problem with people turning them off.


In a sense, we already do: If you implement a new problem, it’s up to you to include or not include the tag-related instructions in the atom-generators, either directly or with the “tag-enrichment” facility. We only have them included “by default” in the sense that many of our pre-implemented problems include them, so if you want to run those problems without them then you have to change the atom-generators.

I’m not sure it matters so much what the default is, but rather that we think about whether or not to include them in particular experiments.

Also, I’d note that they may become more useful if, for example, someone achieves success in encouraging evolution to make use of modular designs :wink:.