The Wallace Effect
Charles Darwin put Origin of Species in a drawer for 20 years, until Alfred Russel Wallace sent him a copy of his own manuscript outlining the same basic ideas, and Darwin realized it was now or never. Call it “The Wallace Effect” – the realization that you are being overtaken by time and progress and that if you wait long enough, it will be too long and someone else will beat you to the punch. Whether it’s something you see on an infomercial (“That’s MY idea!”), or a theory, or a novel, there are just too damn many people on this planet who are also thinking the same thoughts and beating you to the punch, because they are uninhibited by whatever it is that’s holding you back – in the case of Darwin, scruples about the impact of the book on his personal and professional lives, among other things. In my case, it’s the exhausting effect of emotions on my well-being, and an ardent desire to avoid having them if at all possible.
But these articles just keep popping up that remind me that the AI of the sort I’m aiming to create in my novel is coming closer every day. One author (I think it was Tom Wolfe but I can’t seem to verify this a.m.) wrote an article about how it was nearly impossible to write satire any more – how the most absurd thing you can imagine just keeps happening before you can get it into print. I’m feeling that way myself lately.
First up, there’s this NPR story about a program called “StatsMonkey,” which takes a baseball game’s stats and generates a newspaper-worthy recap story. The amazing thing isn’t that the numbers can be plugged into text (any form letter linked to a database does the same), but that the game is actually analyzed, and linguistic modifiers inserted that make the article sound, well, human-generated (underlining mine):
An outstanding effort by Willie Argo carried the Illini to an 11-5 victory over the Nittany Lions on Saturday at Medlar Field.
Argo blasted two home runs for Illinois. He went 3-4 in the game with five RBIs and two runs scored.
Illini starter Will Strack struggled, allowing five runs in six innings, but the bullpen allowed only no runs and the offense banged out 17 hits to pick up the slack and secure the victory for the Illini.
OK, clearly there’s a syntax error there (“the bullpen allowed only no runs”), but still. Kris Hammond of the Intelligent Information Laboratory at Northwestern University outlines the generative process this way:
Well, it starts with the numbers. And, in fact, in general, what we do is we go from numbers to story. So it looks at the box scores, it looks at the play-by-play information. And then it uses that to figure out what we call the angle. That is, what kind of game was this? Was it a back and forth? Was it a pitcher’s duel? And then from that, it actually generates the language.
Which of course is also what a human does. You can make the argument that the words and phrases that make up “color commentary,” like right-wing slogans, are easily databased – who isn’t sick of watching football games and listening to some idiot saying “You’ve got to move the ball down the field! You’ve got to put points on the board! That’s how you win the game!”
Obviously adding modifiers like “banged out” and “blast” and “struggle” isn’t the same as performing a deep and insightful analysis, but still – it’s definitely a step towards replacing the sort of writers who can only generate cliches. Personally, I believe that the more AI can replicate “thin speech” (“We are Standing Tall to Fight Socialism and Defend Marriage!” “Our Corporate Family is Synergizing Excellence!”) and take the place of those who are paid to create ever more bullshit, the more humans will be forced towards “rich speech” – how else will you know if you’re talking to a person? How else will we differentiate ourselves from the robots, if not by ceasing to be the robots we become now when we use thin speech?
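Just to convince myself how little machinery the numbers-to-angle-to-language pipeline Hammond describes actually requires, here’s a toy sketch. Every function name, threshold, and template below is my own invention for illustration – this is not StatsMonkey’s actual code, just the shape of the idea: classify the game from the box score, then pour the numbers into an angle-appropriate template with some “color” baked in.

```python
# Toy sketch of a numbers -> angle -> language pipeline.
# All names, thresholds, and templates are invented for illustration.

def pick_angle(box):
    """Classify the game from its box score: what kind of game was this?"""
    margin = abs(box["home_runs_scored"] - box["away_runs_scored"])
    total = box["home_runs_scored"] + box["away_runs_scored"]
    if total <= 4:
        return "pitchers_duel"
    if box["lead_changes"] >= 3:
        return "back_and_forth"
    if margin >= 6:
        return "blowout"
    return "solid_win"

# The "color commentary" layer: modifiers keyed by angle,
# the equivalent of "banged out" and "blasted."
TEMPLATES = {
    "pitchers_duel": "{winner} edged {loser} {ws}-{ls} in a tense pitcher's duel.",
    "back_and_forth": "{winner} outlasted {loser} {ws}-{ls} in a back-and-forth affair.",
    "blowout": "{winner} pounded {loser} {ws}-{ls} behind a relentless offense.",
    "solid_win": "{winner} beat {loser} {ws}-{ls}.",
}

def recap(box):
    """Generate a one-line recap: angle first, then language."""
    angle = pick_angle(box)
    home_won = box["home_runs_scored"] > box["away_runs_scored"]
    winner, loser = (box["home"], box["away"]) if home_won else (box["away"], box["home"])
    ws = max(box["home_runs_scored"], box["away_runs_scored"])
    ls = min(box["home_runs_scored"], box["away_runs_scored"])
    return TEMPLATES[angle].format(winner=winner, loser=loser, ws=ws, ls=ls)

game = {"home": "Illinois", "away": "Penn State",
        "home_runs_scored": 11, "away_runs_scored": 5, "lead_changes": 1}
print(recap(game))  # -> Illinois pounded Penn State 11-5 behind a relentless offense.
```

A form letter with a database could do the last step; the interesting part is the middle step, where the program decides what kind of story the numbers are telling before it says a word.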
AI research is definitely coming closer to realistic speech – check out the “Neverending Language Learner” at Carnegie Mellon:
NELL (“never ending language learner”) is only a prototype so far. It is to operate 24 hours a day, 7 days a week and perform two tasks each day: (1) reading and (2) learning. When NELL is reading, it extracts knowledge from web text and adds it to its internal knowledge base. When it goes into the learning phase, it applies machine learning algorithms to its newly enlarged knowledge base, thereby enhancing its understanding of language. The researchers believe that NELL “holds the potential to yield major steps forward in the state of the art of natural language understanding.”
[Here in the story there’s a graphic that displays the input, “Carnegie Mellon has seven colleges and independent schools,” from which NELL decides “Carnegie Mellon is probably a university.”]
For now, they have focused on a specific language learning task, that of discovering noun phrases that belong to different classes. For example, “Carnegie Mellon” belongs to the class “University,” and “General Electric” belongs to the class “Company.” However, it gets more complicated because “Organization” is a superclass that both “Carnegie Mellon” and “General Electric” belong to. The task is highly appropriate for something like NELL because there are a nearly limitless number of object classes in the English language. There is simply no existing database to tell computers that “cups” are kinds of “dishware” and that “calculators” are types of “electronics.” NELL could create a massive database like this, which would be extremely valuable to other AI researchers.
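The shape of that database is worth dwelling on for a second. Here’s a toy sketch of it – the facts and class names come straight from the article’s examples, but the code is my own illustration, nothing like NELL’s actual internals: noun phrases map to classes, classes map to superclasses, and answering “what is Carnegie Mellon?” means walking up the chain.

```python
# Toy sketch of a noun-phrase knowledge base with superclass relations.
# Facts follow the article's examples; the structure is invented for illustration.

# Noun phrase -> most specific class it belongs to.
INSTANCE_OF = {
    "Carnegie Mellon": "University",
    "General Electric": "Company",
    "cups": "Dishware",
    "calculators": "Electronics",
}

# Class -> superclass (e.g. both University and Company are Organizations).
SUBCLASS_OF = {
    "University": "Organization",
    "Company": "Organization",
}

def classes_of(noun_phrase):
    """Return every class a noun phrase belongs to, walking up the superclass chain."""
    result = []
    cls = INSTANCE_OF.get(noun_phrase)
    while cls is not None:
        result.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return result

print(classes_of("Carnegie Mellon"))   # -> ['University', 'Organization']
print(classes_of("General Electric"))  # -> ['Company', 'Organization']
```

The hard part, of course, isn’t the lookup – it’s populating those two dictionaries from raw web text, at the scale of every object class in English, which is exactly what NELL is grinding away at 24/7.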
Exactly the sort of database Christopher would raid/steal to enhance Alex. I am creeping back towards actual novel writing – waking up early again and restless enough to write something, if only a blog post. Six weeks till my next trip to NYC, a place that always energizes me creatively, and reminds me that there’s a world outside Reno where I might be able to afford to live, and thrive, someday. I need to believe there’s something besides emotional upset to be had as the output from my own writing process.