When users of a system (customers or employees) communicate with each other in the regular course of business, they convey their plans and intentions through written and spoken language. These communications are recorded to ensure business continuity, financial record keeping, compliance with regulatory law and codes of conduct, and other needs. The work environment is dynamic, and language patterns and terminology change constantly. The systems and tools that monitor these communications must keep pace: they have to learn continuously and reveal unforeseen, actionable connections that uncover both opportunity and risk. To do this, machines need to be able to understand human language.
Natural Language Processing (NLP) is the science of breaking human language down into discrete patterns that a machine can understand and interpret. While understanding and responding to basic commands is straightforward, a machine cannot grasp the nuances of why a person says what they say, or their intent and sentiment. This is where data science and computational linguistics have created tools that help machines understand what humans are really trying to accomplish when they type or say something.
NLP has several related disciplines, including Natural Language Understanding (NLU), or speech comprehension based on disambiguation (understanding precisely what a word means), and Natural Language Generation (NLG), or speech creation, in which a computer independently composes sentences, as with a “chatbot.” We will look at these related disciplines in future blog posts. For now, you can follow the links to Wikipedia for quick reference definitions of these areas.
Why Should You Care
In the 2021 Algorithmia Enterprise Trends in Machine Learning survey, respondents reported an urgency around AI/ML projects: “When we asked respondents why, 43% said their AI/ML initiatives ‘matter way more than we thought.’ Nearly one in four said that their AI/ML initiatives should have been their top priority sooner.” (p. 6) While many areas of IT budgets are shrinking, the AI/ML line item is ramping up significantly. If 2020 taught businesses anything, it is that automation of DevOps and management of data assets are key strategic investments that recession-proof operations and keep a business viable in uncertain times.
Organizations are looking at an increasing number of use cases for ML and, with it, NLP. In future articles, we will dive into these scenarios and their ML/NLP applications in depth. In the meantime, here are just a few of the areas and outcomes where this technology can be applied:
- Improving customer acquisition, retention, interactions, and experience, and therefore customer loyalty
- Process, supply chain, and back-office automation, reducing operational costs and increasing ROI
- Fraud and Insider Threat detection
- Sales pipeline, recommendation systems, loyalty, brand awareness, and marketing program intelligence
- Financial planning and insights
- Governance, Risk, and Compliance management and workflow for audit and regulatory reporting
Governance is by far the most problematic of these use cases, with over half of all organizations ranking it as their top challenge. The Ethics, Explainability, and Data Privacy concerns inherent in the AI/ML discipline are fodder for much conversation and debate in this emerging space. As with any new technology, standards are not yet established. But governance mandates that the handling and processing of data, especially PII, be treated with kid gloves and with an audit trail, in order to minimize risk. Data is exploding at a rate that makes it hard to trace: even with data lakes and cloud solutions, data is messy by nature.
Why Does “Big Data” Matter
Big Data is a term that describes the growth in the volume, velocity, and variety of data in the world. It is exemplified by an explosion in the quantity of data, primarily in the unstructured, “messy” data of chats and the semi-structured data of emails. Notes, images, and attachments increase the complexity of what must be captured and supervised. And while individual communications appear small when viewed independently, they provide far more insightful patterns and context when viewed in aggregate. Until recently, a human eye was required to spot these patterns and understand the context, but now we have access to advanced tools such as statistical analysis, machine learning, data mining, NLP, information retrieval, and predictive analytics.
Fundamentals of Pattern Analysis in Language
Detecting behavioral patterns in unstructured, text-based data is often compared to a “needle in a haystack” scenario. The basic assumption is that the vast majority of people follow common patterns, and only a small percentage are outliers, such as first movers with a new technology (a positive use case), “bad actors” who try to mask their intent, as in Insider Threats (a negative use case), or a person acting under duress due to life circumstances (a neutral use case). Still, NLP systems must look at everything in order to find evidence of those few individuals in the communications. Most communications flagged as potential matches are innocuous; we call these “false positives” when we have to review them.
NLP software therefore seeks to sift through and filter out the vast majority of documents that are valid, business-related communications, reducing the haystack to an interesting set of data where the “needles,” or outlier behaviors, are hiding. It is these “interesting” communications that are considered high value, or “true positives,” and for which we want to generate data sets for further examination.
In any search application, there is a classic tradeoff between Recall and Precision [Fig. 1]. Recall is the scope of coverage: “Did I miss anything?” Precision is the accuracy of the results: “How much of what I found is exactly what I intended to find?” We seek to optimize the balance between reducing False Positives (increasing Precision) and making sure as few True Positives as possible are missed (increasing Recall). Complete coverage with high precision is the goal of all NLP solutions.
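To make the tradeoff concrete, here is a minimal sketch in Python of how Precision and Recall are computed from review results; the counts are hypothetical.

```python
# Minimal sketch: computing Precision and Recall from review counts.
# The counts below are hypothetical, for illustration only.

def precision(true_positives: int, false_positives: int) -> float:
    """Of everything the system flagged, how much was truly interesting?"""
    return true_positives / (true_positives + false_positives)

def recall(true_positives: int, false_negatives: int) -> float:
    """Of everything truly interesting, how much did the system catch?"""
    return true_positives / (true_positives + false_negatives)

# Example: 40 real "needles" found, 160 innocuous alerts, 10 needles missed.
tp, fp, fn = 40, 160, 10
print(f"Precision: {precision(tp, fp):.2f}")  # 0.20 -- most alerts are noise
print(f"Recall:    {recall(tp, fn):.2f}")     # 0.80 -- most needles are caught
```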
One method for managing this competition is to map the business needs and risks to a behavior taxonomy and then link individual rules and algorithms to them. The taxonomic technique allows us to measure the tool’s performance with respect to each of the managed use cases and to demonstrate that each risk has corresponding coverage. This approach makes the typical tradeoff of sacrificing Recall in favor of Precision far less painful, because there is now an assurance of business coverage, along with targeted risk reporting for the behaviors that senior management prioritizes.
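As a rough illustration of the taxonomy approach, the sketch below links a few invented risk behaviors to simple keyword rules, so that hits (and, over time, precision and recall) can be reported per behavior. Real deployments would use far richer rules and models; the categories and patterns here are hypothetical.

```python
import re

# Hypothetical behavior taxonomy: each risk category is linked to its own rules,
# so hits (and measured precision/recall) can be reported per category.
TAXONOMY = {
    "Channel Evasion":   [r"\bcall (my|the) (cell|mobile)\b", r"\btake (this|it) offline\b"],
    "Secrecy":           [r"\bkeep (this|it) between us\b", r"\bdelete (this|the) (chat|email)\b"],
    "Market Misconduct": [r"\bguarantee(d)? (a )?(profit|return)\b"],
}

def categorize(message: str) -> list[str]:
    """Return every taxonomy category whose rules match the message."""
    text = message.lower()
    return [
        category
        for category, patterns in TAXONOMY.items()
        if any(re.search(p, text) for p in patterns)
    ]

print(categorize("Let's take it offline -- call my cell and keep this between us."))
# ['Channel Evasion', 'Secrecy']
```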
NLP—An In-Depth Explanation
NLP tries to break down the complexity of human speech into two parts and solve each part independently, using what are called Semiotic and Semantic analysis.
The first challenge is to understand word origins and their evolution over time, in order to find relationships between them. This is called “Semiotic Analysis” and is the subject of much research in the industry. It has been broadly addressed by breaking words down into character strings and sub-strings (their stem and “lemma,” or main representative form). For example, “run, runs, ran, running” are all forms of the lemma “run.” The system then clusters and counts words within documents, finding which are most commonly present in proximity to others. The tools for this work are effective and openly available as open-source toolkits.
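As an illustrative sketch of lemmatization and co-occurrence counting, using the open-source NLTK toolkit (assuming it is installed and its WordNet data has been downloaded; the sample documents are invented):

```python
from collections import Counter
from itertools import combinations

# Assumes: pip install nltk, plus nltk.download("wordnet") for the lemma data.
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

# "run, runs, ran, running" all reduce to the lemma "run" (treated as verbs).
for form in ["run", "runs", "ran", "running"]:
    print(form, "->", lemmatizer.lemmatize(form, pos="v"))

# Toy co-occurrence count: which lemmas appear together within a document?
documents = [
    "She runs the desk and ran the trade",
    "He is running the numbers before the trade",
]
pair_counts = Counter()
for doc in documents:
    lemmas = {lemmatizer.lemmatize(w.lower(), pos="v") for w in doc.split()}
    pair_counts.update(combinations(sorted(lemmas), 2))

print(pair_counts.most_common(3))  # e.g. ("run", "trade") appears in both documents
```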
Common toolkits used in the industry are Stanford Core NLP and Apache OpenNLP, which support the following operations (a brief sketch of a few of them follows the list):
- Normalization: Correcting spelling errors and standardizing words.
- Stemming: Looking for the stem, i.e., the most basic, common form of a word.
- Entity Extraction: Identifying nouns and tagging them with properties that are useful for analysis (for example, Barclays can be tagged with “Bank” or “Counterparty,” and “cell phone” can be tagged with “communication channel”).
- Fuzzy Matching: A type of string-based matching of phrases to a dictionary of interesting words, or topics, that accounts for variations in position or spelling (so “call my cell” would be the same as “call @ cell”).
- Synonyms: Simple word substitutes to capture the context provided by other words (so “call my cell” would be the same as “call my mobile” in search terms).
- Feature Construction: The use or combination of the above techniques to generate and store complex context along with the data (so “call my cell” would be stored as “use of external communication channel” and would match the text “reach me on my mobile”).
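Here is a rough sketch of the last three operations using only the Python standard library; the synonym table, topic phrase, and feature label are invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table and feature label, for illustration only.
SYNONYMS = {"cell": "mobile", "cellphone": "mobile", "phone": "mobile"}
TOPIC_PHRASES = {"call my mobile": "use of external communication channel"}

def normalize(text: str) -> str:
    """Lowercase the text and replace synonyms with a canonical form."""
    words = [SYNONYMS.get(w, w) for w in text.lower().split()]
    return " ".join(words)

def extract_features(text: str, threshold: float = 0.75) -> list[str]:
    """Fuzzy-match normalized text against topic phrases and emit feature labels."""
    normalized = normalize(text)
    features = []
    for phrase, feature in TOPIC_PHRASES.items():
        score = SequenceMatcher(None, normalized, phrase).ratio()
        if score >= threshold:
            features.append(feature)
    return features

# "call my cell" and "call @ cell" normalize and fuzzy-match to the same feature.
print(extract_features("call my cell"))  # ['use of external communication channel']
print(extract_features("call @ cell"))   # ['use of external communication channel']
```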
The second challenge is to understand each word’s meaning, and how that
meaning changes within a single sentence and within the context. This becomes
more complex in larger passages because the context and sentiment will shift
between sentences or chat lines in a conversation. This type of analysis is
called “Semantic Analysis” and is a harder problem to address because it
reflects the deeper linguistic intricacies of human communication.
In language, context is everything. We humans are exceptionally good at understanding emotions, nuance, and innuendo, all things that machines cannot grasp. A diagram of the language’s structure helps explain why [Fig. 2]. There are two levels of semantics in human speech, shown here as breadth and depth. The basic structure of a sentence is represented in the top line: Subject → Verb → Direct Object. The conduct-risk behaviors that we are trying to detect are constructed as Verb → Direct Object phrases, representing the discrete activities that the Subjects perform. The deeper structures of prepositional phrases provide context. This is where intent can best be discerned: the “why” that explains a person’s motivation and actions.
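As a sketch of how that structure can be surfaced in practice, a dependency parse can pull out the Verb → Direct Object activity and the prepositional context. This assumes spaCy and its small English model (en_core_web_sm) are installed; the sample sentence is invented.

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("The trader moved the funds to a personal account before the audit.")

for token in doc:
    # Verb -> Direct Object phrases: the discrete activity being performed.
    if token.dep_ == "dobj":
        print("activity:", token.head.lemma_, "->", token.text)
    # Objects of prepositions: the deeper context (where, when, why).
    if token.dep_ == "pobj":
        print("context: ", token.head.text, token.text)
```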
NLP enables systems to learn the context and meaning of words from
sentences, paragraphs, and entire documents by reading each line.
Unfortunately, though, computers cannot yet read “between” the lines. It is
instructive to point out the ambiguity inherent in human language by examining
a few expressions:
- “He was on fire lst nite.” (last night)
How does a machine know that a person is not literally burning when someone says that a sports player is “on fire,” meaning that they are performing well? In an NLP platform, the fuzzy matching capability can accommodate the spelling error or abbreviation and recognize “lst nite” as a “timeframe.” Its semiotic techniques can then understand that “on fire” is a synonym for “high performance” when combined with the dual contexts of “person” and “timeframe.”
- “She jumped for joy.”
In the same way, “jumping for joy” is usually not a literal action, though it could be, especially when the context involves small children.
- “They really love to do bad things.”
In this example, the software needs to be trained to understand that while “love” is positive (think of it as +1) and “bad things” is negative (-1), loving a negative makes the statement doubly negative (-2) rather than neutral (which would otherwise seem reasonable, given +1 - 1 = 0). For the machine to understand the examples above, the instructions we give it must remove the ambiguity that humans handle naturally. This is, in essence, the semantic power of NLP. A small sketch of this kind of scoring rule follows.
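The toy sketch below illustrates that scoring logic; the valence lexicon and the amplification rule are invented for illustration and are not a production sentiment model.

```python
# Toy valence lexicon; values are invented for illustration.
VALENCE = {"love": +1, "like": +1, "good": +1, "hate": -1, "bad": -1}

def score(sentence: str) -> int:
    """Naive scoring: loving a negative deepens the negative instead of cancelling it."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    values = [VALENCE[w] for w in words if w in VALENCE]
    if len(values) >= 2 and values[0] > 0 and values[1] < 0:
        # Positive feeling directed at a negative object: doubly negative, not neutral.
        return 2 * values[1]
    return sum(values)

print(score("They really love to do bad things."))  # -2 (amplified), not 0
print(score("They love good things."))              # +2
```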
Our NLP tools can look at adverbs and prepositional phrases to discern sentiment or emotions (“love,” “hate”) and intensifiers (“really,” “maybe,” “hesitantly”). However, the ability to discern free-will choices and the intent behind a person’s speech patterns based on deeper semantic structures is still an area of research that has yet to enter the commercial space. To approximate an understanding of such human psychology, it is possible to create semi-static structures against which rules can be mapped. We call these structures “taxonomies of risk,” or “behavioral taxonomies.” In future posts, we will look at how language works and the way taxonomies aid computers in organizing human knowledge.
In this post, we have looked at the basics of NLP at a very high level and posited several use cases for businesses to consider when applying advanced AI/ML techniques to their operations. Making an investment in ML for the long run is a strategic decision that should be mapped out with deliberation and with an understanding of the investments in DevOps, infrastructure, and data science necessary to fully support the initiative. This kind of transformation takes buy-in at the most senior levels of the company. It is therefore critical that the C-suite understand the concepts involved and the investments necessary to drive innovation in the new world of Predictive and Behavioral Analytics powered by NLP.
S Bolding—Copyright © 2020 · Boldingbroke.com