(The following is an idea I’ve been mulling over and talking to friends about for a few months. I thought I’d finally share it to see if anyone liked it or was interested in working on it.)
Warning: Evo psych just-so story to follow. Think of it as a parable, not as a theory. It’s just here to contextualize the idea that follows.
The Story
Suppose there’s a monkey. Suppose also that the monkey has evolved to have an inbuilt proto-toolmaking behavior.
For this specific example, let’s say he’s learned to snap a twig off a tree and stick it in an anthill. When he pulls it out, it’s covered with tasty protein-rich ants.
This monkey is unlike you and me in that he takes no pleasure in finding the right stick. He knows the stick must have certain qualities – long, thin, not too brittle. However, he does not experience any pleasure until he actually eats the tasty ants.
Suppose this monkey represents a species. This species does well because it has this one trick for getting protein out of the ground in abundance at a low cost.
Now, suppose one day a monkey is born who has a quirk. Instead of taking pleasure only in the ant part, he takes pleasure in the stick selection part too. That is, when he finds an appropriate stick, his brain rewards him with premature pleasure. So, whereas his brethren experience pleasure only upon eating the ants, this one monkey also gets pleasure from selecting an appropriate branch.
This confers an advantage on that monkey. The other monkeys will select an appropriate branch, then use it until it breaks. This new monkey will change branches often until he finds the best one, because he enjoys the selection process per se. He doesn’t know why he enjoys it, but as a result, he tends to get more ants per twig. He enjoys making the best twig for its own sake as much as he enjoys the inevitable payoff.
His mental pathway, simplified, goes something like this:
Confusion —> Understanding —> Pleasure (CUP)
Over evolutionary time, all the monkeys come to have this pathway, and it becomes a point of competition. The supply of good branches is limited and it takes work to select the best one. So, selection rewards the monkey who can go from confusion to understanding the quickest.
This pathway, originally useful for twig selection, leads to other beneficial effects. The monkeys now have the desire to understand systems. One day, a monkey finds a sharp rock and decides he wants to understand how to make sharp rocks. He creates the first hand ax, and outcompetes his brothers.
And so the mental pathway, CUP, gets strengthened and strengthened, producing useful results over time. We could set up an equation that looks something like this:
Level of understanding, denoted by some value from 0 to infinity (potential size depends on complexity of thing being understood) = U
Amount of time taken to go from one value of U to another = T
The equation would be
ΔU/T = Quality of monkey brain.
If it takes a long time to go from confusion to understanding, the monkey is a bad toolmaker. If the time is short, the monkey is a good toolmaker. In general, a higher ΔU/T score indicates a better toolmaker.
It goes without saying that the equation for U could be complicated and dependent on many things. For example, a stupid musician would understand sheet music faster than a very smart person with no music background. However, in cases like these, we could still find the latter to be the better brain. After all, the musician may be going from U=99900 to U=99950 while the non-musician may be going from U=50 to U=99950. So, the non-musician’s longer time needs wouldn’t necessarily indicate lower intelligence.
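To put rough numbers on the musician example, here is a tiny sketch. The U values come from the paragraph above; the times (in hours) are invented purely for illustration, and the function name is just a label I made up.

```python
def learning_rate(u_start, u_end, time_taken):
    """Return ΔU/T: the change in understanding per unit of time."""
    if time_taken <= 0:
        raise ValueError("time_taken must be positive")
    return (u_end - u_start) / time_taken

# U values are from the text; the times (in hours) are invented for illustration.
musician = learning_rate(99900, 99950, time_taken=1)
non_musician = learning_rate(50, 99950, time_taken=100)

# The non-musician takes far longer but still covers more ground per hour.
print(musician, non_musician)
```

With these made-up times, the non-musician needs a hundred times longer to read the music yet still comes out with the higher ΔU/T.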
Over evolutionary time, ΔU/T should increase. The more selection pressure on toolmaking, the faster it should go up. Although this is generally good, it results in some perverse side effects.
For one, the monkeys now indulge in behavior without a clear evolutionary payoff. For example, they make up riddles for each other to solve. Sitting in the dark of winter with no natural puzzles to solve, they invent puzzles for each other, to generate pleasure for its own sake. These puzzles make use of a new concept – cleverness.
Here, you can think about ΔU/T in a second sense – how good a puzzle is.
If it’s too easy (What has two wings and a beak?) ΔU is small. So, the possible ΔU/T is limited.
If it is too hard (What is the product of the first 400 Fibonacci numbers? Please solve using multiplication only.) then T is too large compared to ΔU.
If it is non-inferable (What’s my middle name?), it’s no fun because you can’t solve it, so there is no change in understanding.
A clever puzzle threads the needle. For example, the classic riddle from The Hobbit: “What box has no hinges key or lid yet inside golden treasure is hid?” The answer is “an egg.” Within reason, it is the only possible solution. It is also not obvious. It requires you to make inferences about what is acceptable in the category of “box” and “treasure.” In this case, the ΔU/T is at some favorable ratio.
It is fun because it uses the confusion-understanding-pleasure (CUP) pathway. In the golden treasure example, the question presents confusion. After some thinking, it leads to understanding. The reason the CUP pathway exists is for the advantage it conferred in toolmaking and strategizing. However, its existence also makes for a peculiar monkey behavior called puzzles and riddles. These behaviors are mere byproducts of the natural selection for monkeys who derive pleasure from understanding.
Now, suppose these monkeys come up with a couple versions of this game. They have games, they have puzzles, they have riddles, and they have jokes. They conceive of these as different things, when in fact they’re just points on the ΔU/T line. The game lets you move slowly from non-understanding to understanding as you begin to comprehend all the possible tactics. The puzzle is like the game, but a bit faster, and with less to understand. The riddle is faster still, and has the bonus of essentially allowing you to make a discrete jump from non-understanding to understanding, once you catch the answer. The joke is nearly instantaneous. It takes you from complete confusion to complete understanding very rapidly.
For example: “Why did the church hate Dungeons and Dragons? Because it’s a form of birth control.”
The confusion is very brief, and concerns why there is a connection between Dungeons and Dragons and birth control. However, making the connection requires only a short chain of inference. So, the CUP pathway runs very quickly, and all the pleasure comes at once.
Thus, the game supplies a very large amount of pleasure over a long time. The puzzle supplies a smaller amount of pleasure, but in a shorter time. The riddle supplies even less pleasure, but in shorter time still. And the joke supplies the least pleasure, but it supplies it in an infinitesimal amount of time. When these things are combined, they result in more pleasure still. They all combine readily, as each is just a different member of the same family. Many jokes, for example, could be rephrased as riddles.
If true, this would explain why pleasure is experienced in all these things, and it would explain why pleasure is often more acute but less profound for jokes – they supply the highest ΔU/T, but the lowest value for ΔU.
It would also explain why people sometimes laugh when understanding a concept or solving a mystery. These are all just expressions of CUP.
The enjoyment of jokes has two prominent aspects – pleasure and laughter. The pleasure may be explained by the above. The laughter could possibly be explained as follows:
The good problem solvers are the best mates. Thus, it benefits a monkey to signal understanding of a concept. In this case, the fact that the noise is a “HA” made at the back of the throat could be entirely arbitrary. It could just as easily have been a click or a bark.
This would have some implications that could be tested. For one, it would mean that a person is more likely to signal amusement (via vocalization and facial expression) when there are other monkeys to hear. That person might also be more likely to vocalize when the concept understood is an especially tricky one.
I suspect this is the case in humans. For example, if you just understood something interesting, where would you be most likely to vocalize – near other members of your social group or at home alone? Are you more likely to laugh out loud when watching a movie with friends or when watching it alone?
Similar behavior has been seen in female sex vocalizations. For example, in some primate species, females are more likely to vocalize during sex if males can hear.
ΔU/T would also explain why dissected jokes are never funny. For the joke to have the proper ΔU/T, T must be very very low. When jokes have to be explained, T gets bigger and the joke becomes less pleasurable.
The Logic
I don’t know if the above is true, but I suspect something very like it, in principle, is. If so, it has implications for how jokes are written.
It means the ideal joke presents something confusing that can be quickly understood with a key piece of information. I propose that you could in fact write a fairly simple program that would create at least a certain type of joke. With modification, it could potentially handle more types.
The general way in which this type of joke runs is as follows: two things are at first glance unrelated, but then shown to have some relation in a sensible way. The above Dungeons and Dragons joke is an example. The perception of the joke proceeds as follows:
Understanding the church is involved.
Understanding the church opposes D&D.
(Note, so far, everything is just empirical statements)
Changing D&D to mean birth control.
(Note, the new statement is confusing, but still maintains all prior logical connections. That is, it’s still something the church dislikes, and it’s related on at least one metric to D&D)
Confusion over whether the statement makes sense.
Understanding.
Pleasure. (Hopefully)
Many classic jokes follow this format. For example, “Take my wife. PLEASE!”
An understandable statement is made – “Take my wife.” The meaning of the word “take” is altered, but all logical connections are maintained. Brief confusion results. The confusion is followed by understanding – the comedian means a different statement that maintains all prior logical connections. Once understood, pleasure results. Note the pattern – sense, nonsense, sense, pleasure.
(Of course, in the above case, we all know the joke, so ΔU = 0. But, the first time it was told, this would not have been so.)
For another example, I once wrote a joke in which Jesus tells his disciples to give all they have to the poor. This results in the poor’s economy crashing because the free product pushes their economy into deflation.
This joke follows a similar structure. You are told that Jesus favors helping the poor and is acting in a way to harm the poor. This results in confusion. When the connections are explained – dumping product results in deflation – understanding results. Once again, an idea (giving to the poor) has its meaning changed in a way that preserves logical sense but alters the meaning of pre-existing connections. Ideally this happens quickly, and the reader will laugh.
Note that in both cases a connection is discovered. In the first case, there is a strange equivalence. Imagine you discovered it by doing the following:
Start with a concept. Build all possible relations off of that concept as bridges to other concepts. From each of those concepts, build more possible relationships to more concepts. Eventually you have a branching tree. At some point, you will have a situation where you fork off of a concept, only to have two paths come back together. The following is an example.
1) Church opposes->D&D->is loved by->Geeks->who have->no sex
2) Church opposes->Birth control->whose methods include->abstinence
You can see that we fork from what the church opposes, only to “close the loop” at not having sex. This is, of course, simplified. In an actual diagram, “church opposes” would branch to many things, as would D&D, as would “is loved by,” and so on. We’re just creating chains of relationships. Saying geeks have no sex might seem like cheating, since it’s similar to a joke. However, consider it as being one quality of a stereotypical geek among many. Others might be shyness, social awkwardness, etc.
Here’s a doodle of a more worked out chart, that is still obviously rather artificial.
The point here is that we follow the perfect structure of a one-liner via this pathway. When we find one of these loops, it represents a surprising shared relationship, which is essentially how we described jokes above.
So, structurally, this whole diagram would look like lots of nodes with lots of links coming off each node. To find the potential jokes, we simply need to look for these “closed loops.” That is, places where something forks, only to recombine later.
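Here is a minimal sketch of what that search might look like, assuming the diagram is stored as a dictionary from each concept to its outgoing (relation, concept) links. The toy graph, the function names, and the extra “Abstinence means No sex” link (added so the two chains literally meet) are all my own illustrative assumptions, not a finished design.

```python
from itertools import combinations

# Toy concept graph: concept -> list of (relation, concept). Entries are illustrative.
GRAPH = {
    "Church": [("opposes", "D&D"), ("opposes", "Birth control")],
    "D&D": [("is loved by", "Geeks")],
    "Geeks": [("have", "No sex")],
    "Birth control": [("includes", "Abstinence")],
    "Abstinence": [("means", "No sex")],  # assumed link so the branches meet
}

def paths_from(graph, start, max_len):
    """Yield every simple path (list of concepts) of 1..max_len links from `start`."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            yield path
        if len(path) - 1 < max_len:
            for _relation, nxt in graph.get(path[-1], []):
                if nxt not in path:  # never revisit a concept on the same path
                    stack.append(path + [nxt])

def closed_loops(graph, fork, max_len=4):
    """Return pairs of paths that leave `fork` by different links and meet again."""
    loops = []
    for a, b in combinations(list(paths_from(graph, fork, max_len)), 2):
        same_end = a[-1] == b[-1]
        # The two branches may share only the fork and the meeting point.
        separate = a[1] != b[1] and set(a[1:-1]).isdisjoint(b[1:-1])
        if same_end and separate:
            loops.append((a, b))
    return loops

for left, right in closed_loops(GRAPH, "Church"):
    print(" -> ".join(left), "|", " -> ".join(right))
```

On this toy graph the only loop found is the D&D and birth control one. Comparing every pair of paths is fine for a doodle-sized graph but would be far too slow for a real one, so a proper version would need a smarter search.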
My suspicion is, based on the ΔU/T concept, that there is an ideal size to the loop. Too big a loop would require too much inference, thus making T large. Too small a loop would make ΔU too small and the joke would be dull. The ideal joke takes a second to understand, but only a second. So, there is probably a desirable length for a closed loop.
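As a sketch of how that filter might work, suppose each loop is stored as a pair of paths (lists of concepts) that start at the same fork and end at the same place. The bounds of 4 and 8 links are placeholders I picked out of the air; the real values would have to be found by testing jokes on people. The first and third candidate loops are invented just to show the cutoffs.

```python
def loop_length(loop):
    """Total number of links around the loop (both branches combined)."""
    left, right = loop
    return (len(left) - 1) + (len(right) - 1)

def in_sweet_spot(loop, min_links=4, max_links=8):
    """True if the loop is neither trivially short nor too long to follow quickly.

    The bounds are placeholders; the text only suggests some ideal range exists.
    """
    return min_links <= loop_length(loop) <= max_links

candidates = [
    # 3 links: too obvious, so ΔU is small (invented example)
    (["Church", "Sin"], ["Church", "Vice", "Sin"]),
    # 6 links: the D&D loop
    (["Church", "D&D", "Geeks", "No sex"],
     ["Church", "Birth control", "Abstinence", "No sex"]),
    # 11 links: too much inference, so T is large (invented example)
    (["A", "B", "C", "D", "E", "F", "Z"], ["A", "G", "H", "I", "J", "Z"]),
]

keepers = [loop for loop in candidates if in_sweet_spot(loop)]
print(len(keepers))
```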
In addition, note that there are two types of closed loop. I’m calling these Loop of Equivalence (LOEq) and Loop of Contradiction (LOCo).
In LOEq, connections proceed from a fork until two places contain the same thing (e.g. fork from things the church hates to reconnection at lack of sex).
In LOCo, connections proceed from a fork until two places contain perfect contradiction (e.g. fork from things Jesus wants to reconnection when one end is “alleviation for the poor” and one is “suffering for the poor”). Jesus wants the poor both to be alleviated and to suffer.
In LOEq, the reader is presented with a strange equivalence that is then resolved, along the CUP pathway.
In LOCo, the reader is presented with a strange contradiction that is then resolved, along the CUP pathway.
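A rough sketch of telling the two loop types apart, given two branches that leave the same fork. The table of opposites is a hypothetical stand-in; in practice something (probably people) would have to say which concepts contradict each other.

```python
# Hypothetical table of concept pairs treated as direct opposites.
OPPOSITES = {
    frozenset({"Alleviation for the poor", "Suffering for the poor"}),
}

def classify_loop(left_path, right_path, opposites=OPPOSITES):
    """Return 'LOEq', 'LOCo', or None for two branches leaving the same fork."""
    if left_path[0] != right_path[0]:
        return None  # not branches of the same fork
    end_a, end_b = left_path[-1], right_path[-1]
    if end_a == end_b:
        return "LOEq"  # both branches arrive at the same thing
    if frozenset({end_a, end_b}) in opposites:
        return "LOCo"  # the branches arrive at contradictory things
    return None

jesus_wants = ["Jesus", "Alleviation for the poor"]
jesus_says = ["Jesus", "Give all to the poor", "Deflation", "Suffering for the poor"]
print(classify_loop(jesus_wants, jesus_says))  # LOCo
```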
The Program
Thus, to make a program, one would need to do the following:
1) Acquire many concepts
This could be accomplished by creating a website where people can enter nouns.
2) Acquire many relations
Suggest a noun to a website user, then ask for a relation that could come off it to another noun. For example, suggest the noun “star.” The relation could be “shines on” or “destroys” or “creates” or “is loved by.”
3) Acquire more concepts
Present the website user with subject relation combinations. For example, “Batman is loved by.” The user supplies a new thing, such as “The people of Gotham,” “Catwoman,” or “Comic Book Readers.”
4) Find similarities
Present users with similarly-connected or similarly-spelled things. For example, Jesus Christ or Jesus. The users identify when two things are in fact the same thing, thus reducing errors and false positives.
5) Construct the tree.
Note that at no point in this process do users input any jokes. They merely input concepts and relations. This is akin to a comedian observing the world. We’re just feeding the computer raw facts about the universe.
6) Search for loops of the ideal size.
If the program works, at least some of the time, the result should be a “clever” joke. With human assistance, it might be possible to pull out the good ones and make them into new jokes.
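Steps 1 through 5 could be sketched roughly as below. This ignores the website entirely and just shows one possible way to hold the facts; the class and method names are made up for illustration, and the search in step 6 is left out.

```python
from collections import defaultdict

class ConceptStore:
    """Collects (subject, relation, object) facts and merges duplicate names."""

    def __init__(self):
        self.triples = set()
        self.alias_of = {}  # e.g. "Jesus Christ" -> "Jesus"

    def canonical(self, name):
        """Follow alias links until reaching the preferred name."""
        seen = set()
        while name in self.alias_of and name not in seen:
            seen.add(name)  # guards against accidental alias cycles
            name = self.alias_of[name]
        return name

    def add_fact(self, subject, relation, obj):
        """Steps 1-3: a user supplies a concept, a relation, and another concept."""
        self.triples.add((subject, relation, obj))

    def merge(self, duplicate, preferred):
        """Step 4: users confirm that two names refer to the same thing."""
        self.alias_of[duplicate] = preferred

    def build_graph(self):
        """Step 5: turn the facts into concept -> [(relation, concept), ...]."""
        graph = defaultdict(set)
        for subject, relation, obj in self.triples:
            graph[self.canonical(subject)].add((relation, self.canonical(obj)))
        return {node: sorted(links) for node, links in graph.items()}

store = ConceptStore()
store.add_fact("Batman", "is loved by", "Catwoman")
store.add_fact("Jesus", "wants", "Alleviation for the poor")
store.add_fact("Jesus Christ", "says", "Give all to the poor")
store.merge("Jesus Christ", "Jesus")
print(store.build_graph()["Jesus"])
```

After the merge, both facts hang off the single concept “Jesus,” which is what lets a loop search see them as branches of the same fork.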
Limitations and Potential
It would be hard to make this program come up with longer story-based jokes. These require much more than just logic chains. In principle, the idea for a compelling story could be created using ΔU/T and logic chains, but the actual story itself requires a human.
Additionally, much of humor relies on unspoken concepts and context. This could be fed into a machine, but the output wouldn’t necessarily be a funny joke. For example, a raised eyebrow can serve to change the meaning of a phrase quickly from surprising to arousing. This is funny for the reasons above – it changes the meaning while preserving logic. It’s not clear how the proposed system would come up with the eyebrow raise, even if it came up with the arousing part.
In general, presentation would probably require human assistance. Once the loops are discovered, they have to be conveyed in a way that maximizes ΔU/T in the reader. It’s conceivable a stock method could be determined for the machine to do this. However, that’d have the built-in limitation that it would be less funny every time it was used, thus lowering ΔU.
Discussion of Weirdness
This may seem like it shouldn’t work, since humans create jokes through something called “creativity” or “cleverness.” And, in fact, it may only work (if it works at all) for a certain class of jokes. However, in essence, it works the same way a comedian does. It is fed observations, then looks for a certain type of connection.
It has been said that a computer can’t make up a joke. However, neither would a person raised in a blank room. Humor requires observations in order to establish then subvert a logical chain. If a modern computer is incapable of joking, it may be more about the computer’s memory than its hardware or software.

It should also use the International Phonetic Alphabet representation of words to search for puns.
I know someone at my school did a research project where they built a program that made puns by analyzing a dictionary and looking for connections. Kind of the same idea, only less reliant on common knowledge.
Any chance you’ll actually make this? I’d love to see the results; also, I wanna be part of that website ^~^
Jonathan Rosenberg from Scenes from a Multiverse tweeted something similar some time ago:
‘Funny is the reward your brain gives you for connecting two things you already knew in a new way. Funny is learning. That’s why sci-fi humor is so difficult and rarely works; scifi deals in the fantastic, not the ordinary.
Sci-fi is about new concepts. The more familiar you are w/ concepts being connected, the stronger the funny. That’s why so many jokes rely on plane food and pop culture. So the trick to sci-fi humor is to take the outlandish and root it in the familiar somehow, like Hotblack Desiato avoiding taxes for example’
Count me in if you plan to start coding this. As a start point, the concepts and relations can be gathered from http://www.freebase.com/
There are a few forms which wouldn’t be difficult to mechanize — the snowclone, the “1 XXX, 2 YYY, 3 ???, 4 PROFIT!”, the pun, the knock-knock joke… not every loop would fit one of these, but some of them might.
It’s certainly an interesting concept. Assuming the machine/program was able to make funny jokes fairly consistently, do you think there would be a stigma that developed against this sort of joke generation?
For that matter, I’m curious whether seemingly non-sequitur humor is something that could be generated eventually as well, or whether it’s too complex to automate.
Interesting!
The part about emergence of jokes as an unintended side-effect relates well to Dennett’s philosophy of consciousness (http://en.wikipedia.org/wiki/Consciousness_Explained) and as such laughing may not just be a signal to others, but to ourselves as well; self-stimulation as an outward way to connect brain-parts.
From a design perspective, universal story-generation itself is not considered the hard part (for an excellent attempt at it, try http://hmi.ewi.utwente.nl/showcase/The%20Virtual%20Storyteller); it’s the AI-completeness-problem of the ontology (http://en.wikipedia.org/wiki/AI-complete), in part because of the (sub-)cultural and language dependence of the required knowledge. For example, I am Dutch and our culture has no equivalent of knock-knock-jokes, because the key part, ” who?” does not translate well. The same holds for functional data relationships; as another example, Dutch churches do not (actively) oppose birth control. I had to “switch to American” to get both.
That said, formalizing leaps in understanding as a source of happiness opens up a host of opportunities! I’m intrigued to see how this pans out. Mr Wiener, you continue to impress me with your versatility.
At least in the case of the church, I think the joke is not a cultural one, in that it’s really talking about the Vatican. But yes, I think you make a good point.
Dennett et al.’s Inside Jokes is a great book that lays out a theory of humor that’s VERY close to what you sketched above. I’m not familiar with FreeBase, but you could (also) use something like WordNet, which also has the upside of being included in the Natural Language Toolkit for Python (2.x only, so far.)
I’ve thought about stuff like this in the context of puns (i.e., how far can a word be deformed before it stops being a pun?), but I suspect the tricky part might lie in the “filler” of the jokes: ontologies like WordNet encode things like “subset-of,” “in-same-set-as,” etc, that might be tricky to map onto automatically generated natural language that doesn’t feel robotic or clunky.
I agree; this is similar to Dennett’s Inside Jokes idea, and it’s a very good idea. Well worth pursuing.
Shorter Zach: I want to make a strong AI to do nothing but write pithy one-liners for me.
I like the idea and may take a stab at coding it up over the weekend.
Steps 1-3 seem difficult, or at least very slow, if we just count on kind people supplying words, though I suppose running a major website could speed up participation. If I were to do it without that participation, I would probably use Amazon’s Mechanical Turk to crowdsource it fairly cheaply.
A solid prototype to test the idea can probably be built entirely from already freely available sources like http://www.freebase.com/, as suggested by Ricardo. My suspicion is that an automatically generated graph would have more boring word connections than one built by users with some idea of the purpose. That is, we could find a loop between a horse and a plane, but it would make a meh joke. If we identified the better joke topic concepts, such as D&D and church, then we could just look at the loops between quality joke concepts and get better results.
I’m actually already working with someone to build something, but you’re free to design your own one as well :)
Great idea. That seems like one of the more advanced issues in computational linguistics, and you might wanna read a little more about that because I feel like the entire problem might be more complicated (i.e., how to overcome the problem of “picking the good ones” by an actual human being, because I think that as described above, only a fraction of thus created jokes would be found to be really funny, or their ΔU/T being favorable).
Couple of ideas that I thought of – might help you or not at all (I’m a college undergrad, neuroscience major – that’s where this comes from):
- I like your “ΔU/T” ratio, but maybe you could start thinking of the “T” part more as “effort required” rather than “time required” – because I feel like that would be more appropriate: when one has to think really, really hard about a joke that turns out to be mediocre, the pleasure (amusement) produced is also lowered, despite the fact that only a small amount of time elapsed.
- When I read that the monkeys would “indulge in behavior without a clear evolutionary payoff,” I thought – “the only real evolutionary payoff is survival.” What I mean is that as long as individuals indulge in some behavior in the long term and still survive, it means it is not harmful to their survival and, in a more complex way, it is most probably beneficial in this respect, although, yes, it might not be immediately apparent. (I don’t think that absolutely neutral behavior with respect to favorability for survival is really possible – the long time needed for evolutionary steps to take place usually picks one way or another.)
Later in your post, this connected to another insight you had (not sure if intentionally): “The good problem solvers are the best mates. Thus, it benefits a monkey to signal understanding of a concept.” – exactly, here you go with the reason why even make jokes in the first place and then why laugh at them (and I bet that also “hearing laughter” sets off another gratification pathway in the brain): evolutionarily, it is beneficial to make jokes to see who can understand them. When someone else understands a joke, they wanna make sure you know they understood, and that’s why they laugh. You hear the laughter and see: this guy is smart(er), it’s good to be friends with them (and the laughing guy wants to be picked – therefore he laughs) – because cooperation with smart individuals increases the chances for survival. (This entire process is, of course, not conscious but rather automatized by evolution.) – If this is what you actually meant, I apologize for reiterating it here.
Good luck with making it all work, it seems to be feasible!
I took an AI class at UCSC a couple of years ago, and one team of students made a program almost exactly like the one you describe, minus the part that identifies loops of ideal size. After entering a few new nouns, verbs, or relations, it would prompt you with an assertion, “X is a Y” for example, and you were to tell it whether the assertion was true.
The assertions were often funny, which makes sense in light of your theory. With some consideration and knowledge of the past relations entered into the program, you could often piece out why the computer was saying something surprising or confusing like “A baby has wheels.” (Cars move. Cars have wheels. Babies move. Therefore…) It never accumulated a dataset large enough to surface truly useful connections, but it was on the right track, I think. As it was, the humor was tied too much to your knowledge of what was in the tree so far for it to be useful for joke writing.
I wish I remembered who made the program. I’d point you their way if I did.
Humour requires intelligence, and social skill. For humour to work it has to be appropriate to the audience and so requires a clear understanding of the social setting and the personalities of the people around. So humour is a powerful advert of one’s intelligence and social standing; it is the mental equivalent of the peacock’s tail.
A more difficult problem is Groucho’s famous line, “Time flies like an arrow, but fruit flies like a banana.”
We can say that it’s funny because of the change in grammatical structure between the first and second parts and our pleasure in recognizing it — but that’s not enough.
If we reverse it and say “Fruit flies like a banana, but time flies like an arrow,” it’s no longer funny, and it isn’t even a poor joke.
On the other hand, maybe the monkey who gets pleasure from good sticks rather than ants winds up spending all his time enjoying sticks. Rather than flourishing, he dies out of the gene pool. A couple of million years later, his spiritual (though not genetic) descendants are humans spending all their time on the Internet.
There are some other theories of humor that get to the same result by different paths. I’m not sure what this one adds to our understanding. Freud had some very different views, and better ones in my view.
Finally, I did not understand the D&D joke because the ‘connections’ I could think of were not in the least bit funny. Reading on, I discovered I had, in fact, quickly hit upon the same understanding that was intended. But it remains very unfunny.
Perhaps it’s because the conclusion is false. The Church should support D&D rather than oppose it. Because abstinence is the ONLY form of birth control that the Church supports.
Indeed, that’s what’s wrong with the joke, it’s totally backwards!
If this works–considering the healing power of laughter–you may have to get it certified as a medical device.
(badump-tssshh)
Zach, that is an interesting theory. This recent paper looks like it is similar in spirit to your joke generation idea. You may find it useful if you decide to explore existing research on this topic in computational linguistics.
Humor as Circuits in Semantic Networks
Igor Labutov and Hod Lipson
The 50th Annual Meeting of the Association for Computational Linguistics
http://aclweb.org/anthology/P/P12/P12-2030.pdf
Eric
If this works, you could ask people to rate the generated jokes, and use the ratings to group people according to their background knowledge. This might be useful for a matchmaking site, or even fancy team selection for businesses. (Maybe you could sell it to OkCupid?)
Once the groups are established, you could probably make a good estimate of a person’s background knowledge based on their rating of a relatively small number of jokes. This could be used by educational software to predict where the gaps are in a student’s knowledge.
The problem with those things is that it’ll only work for people whose pleasure response is in the normal ∆U/joke range. If ∆U/T is a measure of intelligence, it might be harder to classify people who aren’t close to average intelligence. It could probably be made less of a problem by using different loop-lengths, and for each user choosing mostly jokes near the loop-length they rate highest.
But I don’t think preferred loop-length is a good measure of intelligence, or of ∆U/T. I expect loop-length preference to be too influenced by culture or personal linguistic habits. That influence might improve matching people with each other, but might foil attempts to match students with appropriate educational materials.