Meaning is the building block of human language, uncovered by context.

Meaning is core to language: the meaning of a sentence determines which word and phrase forms are selected, and vice versa. Or as I say: Form follows meaning®. But what is meaning?

In language, the word forms we use to communicate follow the meaning of what we want to say and, just as importantly, the meaning of what we say runs far deeper than the words we use to say it. Therefore, meaning, not word forms, needs to be at the core of our language understanding systems.

What is missing from data science…


Knowledge representation is key to the future of natural language understanding because the right model enables all languages to share a common ‘repository of knowledge.’ But to date, models remain immature. By analogy, we haven’t seen the kind of breakthrough in explaining knowledge that Copernicus brought to astronomy. Fundamentally, today’s models are misaligned with what we know from the cognitive sciences.

Today, I’ll look into the arbitrary nature of the current approach to knowledge representation as an enabler of artificial intelligence (AI) and consider an alternative optimized for human language representation. My justification for the alternative…


Representing knowledge, in a language-independent manner that is also bidirectional, is needed to make NLP more effective.

Yesterday, my copy of the book, Rebooting AI by Marcus and Davis[i], arrived. Although I’ve only looked at a couple of pages so far, it is going to be a good reference point for scientific observations about artificial intelligence (AI) because its authors are experts “at the forefront of AI research.” If they can’t explain the state-of-the-art, nobody can!

Because my work doesn’t come from the academic world, its findings aren’t yet broadly known, but it’s easy to show solutions to the problems the book raises. I want to share solutions to current problems to help reboot AI and my…


Aiming at the target is the best way to hit it. An NLU benchmark needs to have the same target — i.e. NLU in conversation. Search is NOT language.

An NLU benchmark should advance NLP performance in conversation, making it as accurate as mathematics on a computer.

My SuperGLUE benchmark article notes that the consortium neither asks questions in language nor generates answers in language. It is more a test of search than of natural language understanding (NLU), which could explain the limitations we observe in conversational AI built on technology that is improving at the GLUE benchmark.

I was immediately asked what a benchmark for natural language understanding should look like.

The benchmark for natural language processing (NLP), which should comprise NLU and natural language generation (NLG), should test language, not knowledge. What’s the difference?

Language allows communications to take place, leveraging shared…


Linguistic models scale exponentially when taught; NLP training data does not.

Linguistic models add exponential knowledge: that’s good. The data science training model, by comparison, is slow: that’s bad.

The “data” model promised effective NLP (natural language processing) given just “more data,” and later, perhaps, AGI (artificial general intelligence). But data availability is terribly limited compared to the scale of a natural language, and that may explain why the data model doesn’t scale to conversations.

I’ll use English to show the scale that machine learning systems need to deal with. …
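A rough back-of-envelope sketch of that scale, assuming a commonly cited figure of roughly 170,000 English words in current use and sentences of up to 20 words (both numbers are my illustrative assumptions, not from the article):

```python
# Illustrative estimate of the combinatorial scale of English.
# Assumptions: ~170,000 words in current use, sentences up to 20 words.
vocab_size = 170_000
max_len = 20

# Upper bound on word sequences of each length. Most are ungrammatical,
# but even a tiny grammatical fraction dwarfs any training corpus.
sequences = sum(vocab_size ** n for n in range(1, max_len + 1))

# A large web-scale corpus might hold on the order of 10^12 tokens.
corpus_tokens = 10 ** 12

print(f"possible sequences up to length {max_len}: ~10^{len(str(sequences)) - 1}")
print(f"fraction a corpus could ever cover: {corpus_tokens / sequences:.3e}")
```

However generous you make the grammatical fraction, the gap between the sentence space and any finite corpus stays astronomically large, which is the scaling problem the data model runs into.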


I spent three days last week in Buffalo, New York, at the International Role and Reference Grammar (RRG) conference at the University at Buffalo, which reported on progress in humanity’s final frontier: how our languages work.

Or as I say: how human intelligence is enabled, because intelligence comes from language (language use is the differentiator between humans and other animals).

Amazingly, I was the sole industry representative at the conference! In this article, I want to explain some of the features of language discussed and why they are needed for natural language processing (NLP).

Understanding conversations explained at RRG 2019.

Why NLP needs this scientific progress

RRG models languages with three…


Benchmark titles should reflect their purpose for AI

Today, the top computer chess programs are much better at chess than humans. The best human rating today is around 2900, while the best computers are in the 3000s. Humans simply can’t track as many possible future moves as accurately as a computer can. In short, humans are not as good at chess[i] as machines can be. Similarly, humans are slower than machines, with cars, rockets and jet airplanes winning almost every time.
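To put those ratings in perspective, the standard Elo model converts a rating gap into an expected score. The ratings below come from the paragraph above; the 3050 figure for the engine is my illustrative pick from “the 3000s”:

```python
# Expected score for player A under the standard Elo model:
#   E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Ratings from the text: a ~2900 human vs an engine in the 3000s.
human, engine = 2900, 3050
print(f"engine expected score: {elo_expected(engine, human):.2f}")  # roughly 0.70
```

A 150-point gap already means the engine scores about 70% against the best human, and real engine ratings keep climbing while human ones do not.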

There’s nothing wrong with losing to machines, but the final frontier is in the use of natural language. We’ve been waiting a long time for machines to…


Meaning hits the target missed by data: generalization and understanding

A decade ago, Google’s scientists published “The Unreasonable Effectiveness of Data[i].” It focused on “natural-language-related machine learning” successes, like statistical speech recognition and statistical machine translation, that use large amounts of data. It preceded the best successes of deep learning, which came a few years later, around 2012, improving on statistical systems.

Today, I will look at a likely consequence of that paper’s recommendations: the divergence from brain and language theory toward data science that has led to today’s gap between user expectations and system performance in conversational AI. I’ll look at applying data to natural language…


Is this really impossible?

I’m not a fan of scientists telling us that things are impossible, because it stifles innovation. The claims also concern things we haven’t solved yet, creating an incentive not to try. In the case of natural language processing (NLP), which should involve language understanding, many people tell me that it is impossible in theory, despite the obvious:

You are reading and understanding this.

Therefore, full language understanding is possible in a machine such as a brain. It’s fast, too!

Today, I will look into what was actually claimed by: “language is impossible,” and why an improved scientific model based on linguistics, not…


Device interactions need conversation, not keywords

The inability to accurately recognize human language remains a problem for state-of-the-art science.

In part 1, we looked at the difference between systems that don’t understand what the words mean (keyword-NLU) and systems that do (meaning-NLU). Today, I compare the operation of state-of-the-art (keyword-NLU) conversational AI platforms with one based on modern linguistics, brain theory and computer science (meaning-NLU). We also cover the developer’s high-level tasks to set up the different platforms.
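The keyword-NLU vs meaning-NLU distinction can be sketched with a toy example. This is my own illustration, not any vendor’s actual platform: keyword matching scores an intent by word overlap, so it cannot tell a request from its negation, while even a crude meaning-oriented check of polarity changes the answer:

```python
# Toy keyword-NLU: score an intent by word overlap with the utterance.
def keyword_match(utterance: str, intent_keywords: set) -> float:
    words = set(utterance.lower().split())
    return len(words & intent_keywords) / len(intent_keywords)

BOOK_FLIGHT = {"book", "flight"}

# Both utterances score a perfect match, although they mean opposite things.
print(keyword_match("please book a flight to Sydney", BOOK_FLIGHT))  # 1.0
print(keyword_match("do not book a flight to Sydney", BOOK_FLIGHT))  # 1.0

# A meaning-NLU system first builds an interpretation (who does what,
# under what polarity) before acting. Even a naive polarity check,
# standing in for that interpretation step, separates the two:
def naive_polarity(utterance: str) -> bool:
    """True if the request is affirmative, False if negated."""
    return not any(w in utterance.lower().split() for w in ("not", "don't", "never"))

print(naive_polarity("do not book a flight to Sydney"))  # False
```

A real meaning-NLU system does far more than scan for negation words, but the toy shows why word overlap alone cannot carry a conversation.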

My conclusion is that conversational AI is impossible with today’s platforms because human languages are too precise to be represented with the current…

John Ball

I'm a cognitive scientist working on NLU (Natural Language Understanding) systems based on RRG (Role and Reference Grammar). A mouthful, I know!
