Thursday, May 2, 2024

How Language Works in a Nutshell

In an earlier discussion about Artificial Intelligence, a parallel was drawn between how machines communicate via the traditional seven layers of the networking stack (starting with hardware and moving up to applications) and how humans communicate within society, as broken down into the classic disciplines of the Hard and Social Sciences. Diving deeper into this analogy provides a foundation for understanding how to teach a machine to parse and then recreate human speech patterns using Natural Language Understanding (NLU) and Natural Language Generation (NLG). These two disciplines are the emergent sciences of Human-to-Machine and Machine-to-Machine (M2M) communications. Of primary concern in this article is the question of how language works: what is its main functional component? Without this understanding, there is no framework from which to operate. If you don't understand how human languages work, how can you teach computers to understand and parse them?

Social Contracts

The Language Instinct, as described in a prior article, is the desire to communicate with another human. When communities first coalesced in the early days of human society, by at least 14,000 BC if not earlier, people came together as tribes, usually small family units or hunting parties. They had to figure out how to trade and interact with others in order to survive. Small groups banding together into larger and larger groupings formed the first winter settlements. People created a common cause through inter-marriage between tribes, sharing a common language and traditions. This is the basis of culture. In order to live together in harmony, a common understanding, or law, evolved. These social contracts between groups kept conflicts from breaking out and stabilized society, and they were based on common beliefs, culture, and practices.

Archæo-linguistics and demographic analysis of migration patterns show, to take just one example, that the speakers of Proto-Indo-European on the Pontic-Caspian steppes about 4,000 years ago had a strong horse culture. Technological revolutions such as the development of the mouth bit, the wheel, and the chariot created commercial marketplaces for trade in bronze and ironwork related to equine tack and tools. All of this data is stored in the way the language evolved, and it can be traced alongside excavated middens and grave goods. Culture is instantiated in language, and the two are inescapably linked to commerce and trade talk. This has been demonstrated time and again, not least by David Anthony in his seminal work The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (Princeton University Press, 2007).

Evolutionary Linguistics as a discipline examines the macro-trends in language to determine linkages between languages in family groups, as well as micro-trends such as vowel shifts and structural grammatical loss (losing a verb tense, for example). One example of machine learning applied to recovering ancient languages comes out of MIT's CSAIL lab, where researchers have developed algorithms that can automatically decipher a lost language without prior knowledge of its structure. "The team's ultimate goal is for the system to be able to decipher lost languages that have eluded linguists for decades, using just a few thousand words."

Marketplaces and a type of simplified accounting must have developed. Consider this scenario. One tribe has domesticated a few horses and figured out how to ride them by developing a thing that you stick in the horse's mouth to guide it, with straps attached to either side. You can ride the horse or attach it to a sledge that drags behind with your household goods loaded on. Now your family is able to move faster and catch more game. You make several of these "things" and want to trade them. You have to figure out what to call them: you agree on the term "bit." A new noun has entered the vocabulary. It is an Entity as well, something that can be tracked and counted. You become wealthy off your invention. But the king wants his Taxes, and the gods want their Tithes. Someone has to keep track of it all. When tribes come together to trade in the town market, there is a class of people who know how to write and calculate: the scribes. They are the ones who take the tithes and taxes and handle the money. They are the bankers, the priests, the accountants. And, most importantly for our purposes, scribes are the only ones who know how to write; they are the keepers of Language. In this simple scenario, it is clear to see how language and culture evolve side by side while also being intertwined with the advance of commerce and technology.

Semantics and Semiotics

When looking at the two "big" areas of how language works, linguists distinguish Semantics, the structure of language, from Semiotics, the content of language. Every language has a form and function (the structural semantics) and a vocabulary of words (the semiotics). It's best to think of it in medical terms: a human has a skeleton to which the muscles are attached. Without the muscles, the skeleton cannot move or function. The grammar and rules are the skeleton; the words are the muscles. You have to pour words, the content, into the structure in order to make it work. But without a structure, words are meaningless. If you just pour out words in any random order, there's no sense, no meaning, just a bag of words. You need grammar and the word order of the sentence to make the words make sense. This is true in any language. You can't just memorize a vocabulary list in order to know how to talk the talk. Sooner or later, you have to learn the rules for how to string those words in a particular order, how to pronounce them, and where to put emphasis: Does your voice rise or drop at the start or end of a sentence? Which syllable do you stress? How do you indicate time passing? What about intent and emotion? These are critical to conveying meaning, and they are the hardest part of teaching a machine to understand human speech.
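To make the "bag of words" point concrete, here is a minimal Python sketch (the two sentences are invented for illustration): a pure word-count representation throws away word order, and with it the meaning.

from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    # Represent a sentence as unordered word counts, ignoring word order.
    return Counter(sentence.lower().split())

a = "the dog bit the man"
b = "the man bit the dog"

# Prints True: identical bags of words, yet the sentences describe opposite
# events. Only grammar and word order tell us who did the biting.
print(bag_of_words(a) == bag_of_words(b))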

Here we get to the dirty little secret of NLP and AI when it comes to training data and human languages. It is really easy to do Semiotics with statistical analysis techniques, and we have come a long way in the past decade given the increase in computing capacity and data in the cloud. However, it is incredibly hard to do Semantics: to understand and tease out intent and context, things like sentiment and emotion, from the data, because here statistics alone just don't cut it. We simply don't yet have sophisticated enough tools to teach a machine to recognize sneaky, giddy, enthusiastic, political, depressed, and so forth.
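To see why surface statistics fall short, consider a deliberately naive sketch (the tiny word lists and the sample sentence are invented for illustration). A word-counting sentiment score happily marks a sarcastic complaint as positive, because it sees the words but not the intent.

# A naive lexicon-based sentiment scorer: count positive and negative words
# and take the difference. Real systems are far more elaborate, but purely
# statistical approaches share the same blind spot around intent and tone.
POSITIVE = {"great", "wonderful", "love", "perfect"}
NEGATIVE = {"terrible", "awful", "hate", "broken"}

def naive_sentiment(text: str) -> int:
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

sarcastic = "Oh great, another outage. Just wonderful, I love waiting on hold."
print(naive_sentiment(sarcastic))  # Scores +3, yet the speaker is clearly annoyed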

What does this really mean? It all goes back to Semantics and how languages work versus Semiotics and what languages contain. Remember, this is the same regardless of whether you are talking about Sino-Tibetan, Indo-European, or Afroasiatic languages. Semiotics looks at the "bag of words" problem. Computers are really good at counting things, and in documents they can count words, whether a sentence is segmented by Kanji characters or by the spaces at the ends of strings of characters. Think of a Google search: the computer knows which words occur with higher frequency and are therefore treated as more important.
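As a toy illustration of that counting view of documents (the mini document collection and the query are invented), here is a sketch that ranks documents for a query word purely by how often the word occurs, which is roughly the naive core of keyword search.

from collections import Counter

# A toy document collection; in a real search engine these would be web pages.
docs = {
    "doc1": "the horse pulled the chariot across the steppe",
    "doc2": "a bronze bit guides the horse and the horse obeys",
    "doc3": "scribes recorded taxes and tithes in the town market",
}

def term_frequency(text: str) -> Counter:
    return Counter(text.lower().split())

def rank(query: str) -> list:
    # Rank documents by the raw frequency of the query word (naive relevance).
    return sorted(docs, key=lambda d: term_frequency(docs[d])[query], reverse=True)

print(rank("horse"))  # ['doc2', 'doc1', 'doc3']: doc2 mentions 'horse' twice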

But just because a word is more frequent, does that really mean it is more important? The next step, a concept called clustering, is to look at location: do words co-occur next to each other, within two or three words of one another in a phrase or sentence, forming concepts or ideas? Parsing a complete sentence is harder still: for each language, you must know the parts of speech, parse the order of the words, and then apply the rules of grammar. This is a multi-step processing problem that builds up to a structural view, diagrammed below for many Western, Latin-based languages.
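Before the full structural parse, the co-occurrence step itself can be sketched in a few lines (the window size and the example text are arbitrary choices for illustration).

from collections import Counter

def cooccurrences(text: str, window: int = 3) -> Counter:
    # Count pairs of words that appear within `window` words of each other.
    words = text.lower().split()
    pairs = Counter()
    for i, word in enumerate(words):
        for neighbor in words[i + 1 : i + window]:
            pairs[tuple(sorted((word, neighbor)))] += 1
    return pairs

text = "the horse pulled the chariot and the horse wore a bronze bit"
# Frequently co-occurring pairs hint at recurring concepts in the text.
print(cooccurrences(text).most_common(3))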



In turn, each part of speech can have its own specialization, as we saw with Entity Extraction, where nouns (subject, direct object, prepositional object) are categorized and examined for their relationships. There are mature toolsets for English, but many major languages still lack maturity in part-of-speech tagging, the ability to recognize and parse parts of speech and perform entity recognition, let alone higher-order functions like sentiment analysis. Commercially viable toolsets for Spanish, Japanese, and Chinese are only now emerging, with German, French, and Arabic also in the works.
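As a rough illustration of what such a toolset provides for English, here is a minimal sketch using the open-source spaCy library (it assumes the small English model en_core_web_sm has been installed; the sentence is invented).

import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("David Anthony published his study of the Eurasian steppes at Princeton in 2007.")

# Part-of-speech tag and dependency relation for each token
for token in doc:
    print(f"{token.text:10} {token.pos_:6} {token.dep_:10} head={token.head.text}")

# Named entities recognized in the sentence (people, places, dates, and so on)
for ent in doc.ents:
    print(ent.text, ent.label_)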

So how do we get to the point where machines can properly understand the semantics, the structure and form of languages, in order to reach the context and deeper meaning of language such as sentiment, intention, and the nuances of speech? Essentially, what was the person really thinking when they said what they said? This is a hard problem that computational linguists are still working on: the ability of a machine to comprehend and reason. One step is to tag each part of speech in a sentence so that the complete utterance can be parsed by the machine according to known grammar rules for the language. Then the machine can understand, word by word, what is being discussed, marrying Semantics with Semiotics. This capability leads to a discussion of Natural Language Understanding, the topic of our next article and one of the most active current frontiers of AI research.
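As a small, hedged sketch of that tagging-and-parsing step, here is an example using NLTK's chart parser with a hand-written toy grammar (the grammar covers only this one sentence and is invented purely for illustration): the parts of speech are declared as rules, and the utterance is then parsed against them.

import nltk

# A toy grammar with explicit rules: a sentence is a noun phrase plus a verb
# phrase, a noun phrase is a determiner plus a noun, and so on.
grammar = nltk.CFG.fromstring("""
  S  -> NP VP
  NP -> Det N
  VP -> V NP
  Det -> 'the'
  N  -> 'scribe' | 'taxes'
  V  -> 'counted'
""")

parser = nltk.ChartParser(grammar)
tokens = "the scribe counted the taxes".split()

for tree in parser.parse(tokens):
    tree.pretty_print()  # prints the parse tree built from the grammar rules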

S Bolding—Copyright © 2021 · Boldingbroke.com


