Thursday, May 2, 2024

Generative AI: Risks and Rewards

Benefits of GenAI

The landscape of general computing has changed significantly since the introduction of ChatGPT in November 2022 and of similar LLMs for use by developers and end users alike. And yet the technology is not new, just expanded. Built on previous generations of research and model building, ChatGPT and its fellow models are examples of stitching together several targeted models in various domains to reflect a more global view of reality. ChatGPT itself comprises at least eight distinct models under the hood. It's not the apocalypse some critics predict: the end of the middle class, the destruction of white-collar jobs, the advent of the Age of Machines.

Cynicism and fear generate clicks; the titans of tech invoke the specter of 'radical tech' for marketing purposes and to show you how important they are. Human-computer interaction via AI went from prescriptive to collaborative almost overnight. The shift centers on a new interface paradigm. No longer tied to the physical elements of screen, keyboard, and mouse, humans can now use natural language to order their machines around. This currently includes voice, gestures, and facial expressions, which augment traditional written text input. We talk, using words to explain what we mean, and AI talks back to us.

These types of interactions, described as more 'human,' imply collaboration, in that humans do part of the task. People scope, bound, and provide context to the questions. This is now known as 'prompt engineering,' a whole new job category that emerged with the advent of interactive models. Collaboration means that humans do some of the tasks, AI does others, and the back and forth of it all feels more 'natural,' with a better result at the end of the day. The trend is to view these tools and bots as augmentation of existing efforts. While Iron Man's JARVIS is a long way from the old paradigm of Jeeves (as in Ask Jeeves of Internet 1.0), generative AI is far from independent; it has no agency or purpose without human input.

There is a potential downside, in that AI amplifies what's already there. If you are automating great processes, they will become more efficient. If your processes are faulty, inefficient, or misdirected, those flaws will multiply, causing more loss of time, income, and ultimately clients. It's a double-edged sword.

Operationalizing AI

AI as a topic is about more than just LLMs and content creation. At their core, the LLMs that underpin today's advances in AI are built on data. When you operationalize AI, you are essentially operationalizing data. For the longest time (in internet terms) the focus has been on building out networks, infrastructure, and the applications that run on top of them. Now, however, protecting your valuable information means protecting the data that runs through and powers that infrastructure. Think of data as the gas that fuels the engine. Without fuel, the systems and processes are just so much piping and potential.

To get to a point where companies are protecting both the infrastructure and the data, a core discipline of “always on encryption” will be needed to protect sensitive information and prevent breaches in both internal systems and cloud environments.

Data Security in Today’s Environment

Data, your personal information, is a form of currency in today's online world. Your privacy is no longer guaranteed, as geographical boundaries and jurisprudence are increasingly ignored by hackers, thieves, and well-intentioned analytics gurus. Data mining is a profitable business, and you are the product being milked for all it's worth. There are reports every day of medical information being shared without permission with insurance groups, biostatisticians, and others in the name of the 'public good.' Standards organizations can't even agree on basic definitions. Rules and regulations like GDPR give many the right to be forgotten, but you must deliberately opt out. And when you don't realize you're even 'in' the game in the first place, it's a never-ending battle to protect your digital footprint.

How does the smart consumer of digital life go about shielding their most intimate details, while still enjoying the many entertainments and services at hand on the web of today? Some experts say it’s already too late. And they may be right. Your cellphone alone provides a wealth of data via GPS, location tracking and connectivity services, the baseline necessary to stay connected while out and about, before you even begin to indulge in a little TikTok. 

Needless to say, data security is a core discipline for companies that want to protect sensitive information and prevent breaches in both internal systems and cloud environments. In the case of expert systems, where software mines information internally from repositories and active channels such as MS Teams or Slack, employees are exposed on a regular basis. These smart knowledge management systems advertise that they distribute the right information to the right people at the right time. However, if an employee is chatting about personal plans or sensitive health information, that content could be shared with coworkers unintentionally. Another example: executives discussing the financial implications of an upcoming merger. That information must remain confidential to comply with SEC regulations and avoid violating insider trading restrictions. Ensuring that such conversations remain limited to a small cohort is where data privacy rules and policies come into play.

Encryption at rest and in transit

Maintaining data security is critical for most organizations. Unauthorized access to even moderately sensitive data can damage the trust, relationships, and reputation you have built with your customers. Any truly responsible company encrypts data at rest by default. Any object uploaded to the cloud can be encrypted using, for example, an Amazon-managed encryption key or a Google key store. Adopting a key management solution is a critical step for any company. Data in transit should also be encrypted, following industry best practices, to prevent leakage or exposure to bad actors. Again, each company will employ its own keys and methods to ensure this protection is in place. It's all about good hygiene along the entire route the bits of information travel to and from where they are most needed.
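To make that concrete, here is a minimal sketch, assuming AWS S3 with a hypothetical bucket name and KMS key alias, of storing an object encrypted at rest with a customer-managed key; the client's default HTTPS endpoints provide the encryption in transit. Your own cloud provider and key store will differ, but the pattern is the same: the application references a managed key rather than handling raw key material.

```python
# Minimal sketch: encryption at rest via a KMS key, encryption in transit via TLS.
# The bucket name and key alias below are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")  # boto3 talks to HTTPS endpoints by default (in-transit protection)

def upload_encrypted(bucket: str, key: str, data: bytes, kms_key_id: str) -> None:
    """Store an object encrypted at rest with a customer-managed KMS key."""
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ServerSideEncryption="aws:kms",  # encrypt the object at rest
        SSEKMSKeyId=kms_key_id,          # which managed key to use
    )

upload_encrypted(
    bucket="example-sensitive-data",         # hypothetical bucket
    key="reports/q3-financials.csv",
    data=b"customer_id,balance\n42,1000.00\n",
    kms_key_id="alias/example-data-key",     # hypothetical key alias
)
```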

Risks of Unbounded AI

The government, as usual, is behind the curve in regulating the industry. There is no clear way to apply old paradigms enshrined in law to new ways of viewing the world. Regulatory agencies such as the FEC and SEC are designed to monitor human beings and force their compliance with laws regarding money and markets. If an AI agent acts contrary to the rules, who do you punish? The programmers who created the agent or bot? The company that deployed the AI? Murphy's Law and the law of unintended consequences always raise their heads.

A case recently before the Supreme Court, Gonzalez v. Google, addresses YouTube's use of content to generate video recommendations for users. Does liability rest in the organization's ability to shape content via recommendation engines and the algorithms that power them? The case is relevant to the debate, as ChatGPT and its brethren operate on the same principles as those recommendation engines. In other words, do Section 230 protections, which generally cover third-party content from users, extend to content or information a company creates out of that third-party data? Do they protect companies from the consequences of their own products? More importantly, who 'owns' the output of the engines? In Thaler v. Vidal, the courts maintained that US patent law requires a human inventor.

Many institutions and groups are organizing around 'Ethical AI' to lay the groundwork for the public policy debate that surely must come. Groups such as Oxford University, the Institute for Ethical AI and ML (UK), Stanford University, AI for Good (UN), and the Global AI Ethics Institute (Paris) are all attempting to lead the way. Many more offer classes, certificates, and services such as audits and systemic quality reviews. It's early days.

Ethical and Privacy Concerns Inherent in the AI/ML Space

When it comes to the use of data to create LLMs, and to the output of those models in the GenAI world, there are a few points to consider. When you stop the flow of data, the gas that fuels the engine, you interrupt the flow of business. Effective cybersecurity means safeguarding the flow as well as the contents of data. In terms of AI, and how that data is used to build models, securing your digital assets starts at the moment of creation; from then on, curation is the name of the game. The points below explain how AI sucks in your digits, chews them up, and ultimately spits out a version of your reality, or view of the world.

Provenance: Who chooses the data to train the model?

There are many sources of data that a company generates: documents, communications, imagery, marketing, financial records, and so forth. Granting access to databases, file repositories, and financial systems inherently opens those systems up to intrusion, because once you provide an API or other access point, it can be compromised. Guaranteeing security at the source is critical, and it always comes down to a people problem. Trust the people, secure the API endpoints, and instantiate best practices such as 2FA or MFA to ensure the person or system accessing your data is really authorized and authenticated to use it.

But people have biases, and data must be cleansed, normalized, labeled, and in many other ways massaged to be useful to machines. Unsupervised learning techniques avoid many of these issues; however, they are less efficient and more prone to statistical errors than having humans curate the data set by preprocessing it. The quality of the data matters, and just as 'you are what you eat,' your model is what it ingests. If you only feed it emails, a skewed view of your company results.

Data has a sort of provenance to it, and the evidence can be found in the metadata: was it created by a human or a computer, when was it created, where is it housed, and so forth. Each element, while able to be faked, adds up to a sort of fingerprint or watermark, an information trail as it were.
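As an illustration, here is a minimal sketch, with hypothetical field names, of capturing that kind of provenance trail alongside a data asset; a content hash gives you the fingerprint, so later tampering or substitution is easier to detect.

```python
# Minimal sketch of a provenance record for a data asset (field names are hypothetical).
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    source_path: str       # where the data is housed
    created_by: str        # the person, system, or model that produced it
    created_at: str        # ISO timestamp of when the record was taken
    human_generated: bool  # was it created by a human or a computer?
    sha256: str            # content fingerprint; changes if the data is altered

def record_provenance(path: str, created_by: str, human_generated: bool) -> ProvenanceRecord:
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return ProvenanceRecord(
        source_path=path,
        created_by=created_by,
        created_at=datetime.now(timezone.utc).isoformat(),
        human_generated=human_generated,
        sha256=digest,
    )

# Example: log provenance for a document before it enters a training corpus.
# print(record_provenance("reports/q3.txt", "jane.doe", human_generated=True))
```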

Methodology: What technique is used to produce and maintain the model

There are many approaches to model generation: Supervised Learning, Unsupervised Learning, Reinforcement Learning, and Deep Learning, to name the most popular. You may also have heard of techniques like 'zero-shot' and 'one-shot' (single-shot) models.

Each method is an approach to statistically evaluating the data described under provenance above. How and why you pick one approach over another depends purely on the results you are looking for. Business applications and solutions to process bottlenecks require differing perspectives to achieve improved outcomes.

These techniques are fundamentally descriptive and focus on what humans have actually done. They do not handle the ambiguity of what we 'should do,' or its negation, what we 'shouldn't do.' Consider the method used to train an AI to identify which recruits to interview for a new hire. If the data set is skewed and the method is weighted toward a desired outcome, the machine's decision making automatically throws out resumes it deems statistically insignificant. There is no way to argue against it; valid candidates don't even get a rejection email. Classifications can be reviewed, but who's to say it's the right set of classifiers in the first place? The explainability of your method is just as critical to the process as the data. Values such as beneficence, non-maleficence, autonomy, accountability, responsibility, and privacy are also key dimensions.
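One simple way to surface that kind of skew, sketched below against a hypothetical screening model's decisions, is to compare selection rates across groups (the so-called disparate impact ratio). It does not prove fairness, but a low ratio is a signal that human review of the data and the method is warranted.

```python
# Minimal sketch: compare selection rates across groups for a hypothetical resume screen.
from collections import defaultdict

def selection_rates(decisions):
    """decisions: iterable of (group_label, selected_bool) pairs."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

# Hypothetical outcomes produced by a screening model.
decisions = [("group_a", True), ("group_a", True), ("group_a", False),
             ("group_b", True), ("group_b", False), ("group_b", False)]

rates = selection_rates(decisions)
ratio = min(rates.values()) / max(rates.values())
print(rates, f"disparate impact ratio: {ratio:.2f}")  # a low ratio warrants human review
```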

Don't exaggerate what your algorithm or model can do, and don't claim that it delivers fair or unbiased results unless you can back the statement up with evidence. That brings us to the next issue: quality and veracity.

Quality: What technique is used to test the model and validate its output

Transparency is the cornerstone of trust when it comes to what is, for all intents and purposes, a black-box process. The way you start must be the way you finish, and the ends cannot justify the means. If a company is not careful, decisions and activities can be delegated to bots and justified by the thought that a bot can get away with everything a human can't. This is a logical fallacy: the evasion of responsibility becomes no more than a legal dodge, a sort of 'human behavior laundering' in the guise of technical advances. Who's to blame? You can't point to a single person; the machine's doing the work.

While the alphabet agencies are developing frameworks and talking about compliance and certification, there are basic steps toward testing, validation, and certification you can take now. Obvious proactive measures include:

  • Independent audits
  • Legal reviews
  • Opening your data or source code to inspection by 3rd parties

Quality control of the data input lays the foundation for verification of the output. Do the results match the expected area of information and the standard responses a reasonable person would accept? Are statistical measures within tolerances? All of this implies that you have established acceptance criteria and KPIs in advance, expectations that support your business goals.
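As a small illustration, here is a minimal sketch, with hypothetical metric names and thresholds, of gating a model release on acceptance criteria agreed in advance; the point is that the criteria exist before the evaluation run, not after.

```python
# Minimal sketch: gate a model release on pre-agreed acceptance criteria.
# Metric names and thresholds are hypothetical placeholders.
ACCEPTANCE_CRITERIA = {
    "accuracy":  0.90,  # minimum acceptable accuracy on the holdout set
    "precision": 0.85,
    "recall":    0.80,
}

def passes_acceptance(measured: dict, criteria: dict = ACCEPTANCE_CRITERIA) -> bool:
    failures = {name: (measured.get(name, 0.0), floor)
                for name, floor in criteria.items()
                if measured.get(name, 0.0) < floor}
    for name, (value, floor) in failures.items():
        print(f"FAIL {name}: {value:.2f} < required {floor:.2f}")
    return not failures

# Example: metrics produced by an evaluation run.
print(passes_acceptance({"accuracy": 0.93, "precision": 0.88, "recall": 0.71}))
```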

In testing, skepticism about the data and the output is your best tool. Deepfakes are an increasing problem, and confirmation bias creates a feedback loop that increases the willingness of some to believe whatever a machine creates. 'The data says so' and 'that's what the data shows' are no longer strong arguments. Corroboration of information means human curation at some point, and quality control is that step.

Product: Who owns the output

The legal responsibility for the impact of a model is the question at hand. As ridiculous as it may seem, lawyers are already being hired to represent AI chatbots. This maneuver is an attempt to displace responsibility away from the humans behind the curtain. In effect, the developers and the companies who employ them are trying to say, ‘hand to heart, we can’t be held responsible for discrimination, it’s an algorithm.’

Logic dictates that we examine this attitude. Common sense says 'the creator is greater than the creation.' When humans encode instructions and give direction to a model, they are essentially the puppet masters controlling the strings. A computer sifts through data, looking for patterns. Who told it which patterns were relevant and which to ignore? The humans who wrote the algorithms and code, obviously. The teams that create, curate, process, and quality-control the data should be held responsible for the outcome of their creation.

The next question that immediately comes to mind is: who then owns the output? Some companies argue that the end user is the legal owner; OpenAI's terms assign the right, title, and interest in output to the user. Legal experts caution that while it may be a nice position to take, large corporations are wont to change their minds retroactively. Additional criteria may include 'independent intellectual effort' (perhaps demonstrated by prompt creation), 'originality' of authorship, and so forth. None of these are inherent in AI-generated output.

In Conclusion

Learning about all these issues and more will protect you and your company in the coming months and years as the current wild west of GenAI turns into the established way of doing business. Taking care now to balance and protect your interests will ensure a safe and prosperous future for all.


How Language Evolves Over Time: Especially in the Marketplace

Behind all of the buzz about NLP is language. We wouldn't have text-based communications, intermediated by machines, without two people using a common method of transferring ideas to one another. How did all of this arise? In How Language Works in a Nutshell, we examined the surface issues of social contracts, marketplace dynamics, and tribal interactions. In this post, I'll take a deeper look at shifts in language, the micro and macro forces at play, and how to adjust and account for those elements. Some refer to this field as Evolutionary Linguistics, which relies heavily on biology and comes across as rather Darwinian. Because that approach rests on a 19th-century understanding of language, it carries biases; for example, its proponents state that there is no archaeological trace of early human language, which is false, and much of the relevant research can be found in sociology and anthropology studies. In addition, there is a new and emerging trend in linguistics, where archaeo-linguistics seeks to combine archaeology and linguistics into a blended greenfield approach.

An interesting 2007 book by anthropologist D. W. Anthony, The Horse, the Wheel, and Language, presents primarily archaeological findings about the Kurgan hypothesis and takes the position that language arises and evolves in parallel with technical innovations; to wit, the inventions surrounding the domestication of horses on the steppes north of the Black and Caspian Seas (an area that includes much of modern-day Ukraine). Combined with the invention of the wheel, which created a mobilized society in the Bronze Age, this theory of evolutionary linguistics takes on the origins of Proto-Indo-European (PIE) by means of archaeology and evolutionary biology, devoting a large portion of the book to examining middens and pottery shards.

Imagine if you will a bronze-age innovator who decides to stop eating horses and instead domesticates one of them. This enterprising individual inserts a stick into the mouth of the horse, and puts two ropes on the ends, to make the horse follow his guidance. That eventually leads to the invention of a durable bit. Soon you have a set of traces connecting a pallet of your worldly possessions to the sides of the horse, so it pulls your goods to and from the winter camps. You no longer need to pull the sledge yourself. Now someone else comes along and invents the wheel, turning that flat pallet dragging along the ground into a cart that moves faster. Imagine the advantages you have over your neighbors. The horse is doing the work of men.

There are several inventions here to take note of. One technology builds on another. First, the idea of a pallet to stack food and possessions on, instead of carrying it on your back or in your arms. Then the horse or the wheel to make the transportation of items faster, easier. Now instead of a cart, a chariot for war. And so it goes, as history attests.

But more importantly to our story of language evolution, our clever inventor decides to trade copies of his bit, wheel, and other improvements to his fellow tribesmen. We now have business that exchanges goods for technology innovations. Perhaps a few sacks of grain for this new thing they decide to call a bit. Or two cattle for a set of those so-called wheels. As new inventions emerge, words are created to label that thing over there versus this thing here. Pointing just doesn’t suffice.

Now imagine, if you will, that a neighboring tribe sees the increased mobility, the speed, and the advantages of fighting from horseback instead of on foot. There are two obvious choices: the tribes align as allies and exchange technology and goods as friends, or one tribe decides that might makes right, goes to war, and whoever wins absorbs the other tribe as slaves. The rise of wealth, the need for a common medium of communication, and the desire to safely buy and sell possessions and crops all lead to the rise of marketplaces: common meeting grounds where people must talk to each other in order to achieve the desired outcomes. Soon, instead of a barter system, a token is found to equate value with goods, and this is the rise of money. Whatever a society values most becomes an easy medium of exchange, whether it is gold, beads, shells, or simply a piece of paper that promises there's gold behind it somewhere in a bank.

The gods must have their tithes and the king his taxes: Not only to keep this social construct of the marketplace supported and protected, but also to maintain their primacy of power over the people who gather to exchange goods and ideas. It all needs a warrior class to guard against invading neighbors. Authority is always based on power and money. Following the money, the rise of a military state almost seems inevitable. Protection rackets are not just for the mafia.

One can easily discern the causal links between technology, commerce, and language development. Example: Google is a new noun and verb based on technology shifts.

Shifts in Language

Language, while a social and cultural construct, is not constant. Definitions change, words drop out of popularity, and, as we have seen, language is subject to the forces of history. You only need to look at English to know that a speaker of Old English would have no clue what today's Queen's English is conveying. Researchers refer to the concept of language shift as a large-scale phenomenon in which a population changes from using one language to another. But what forces lead up to such a radical shift?

Given that the British Isles were invaded and conquered many times, by sundry Nordic groups from the far north and by neighboring France (creating Anglo-Norman in the 11th century), it seems self-evident that Old English, primarily a Germanic language, would be endangered and die out. Indeed, OE, or Anglo-Saxon, was itself an invading culture, brought over in the mid-5th century, and it replaced the native Celtic languages. The dynamics of language communities demand a certain amount of maintenance and care if a mother tongue is to survive historic circumstances. Survival of language is why dictionaries exist: to codify the spelling, definition, etymology, and variants of words. An example is the Académie Française in the 17th century forcing language standards on publications and teaching institutions and attempting to outlaw local dialects.

The progression of any type of speech within a new context is characterized by migration, infiltration, or diffusion. When a whole speech community moves to a new location, that group tends to cling to its language, halting change for a time. Think of Québécois French, where a colony tried to keep its connection to the old world by having the older generation pass 17th-century colloquialisms on to the next. Then, after the original set of colonists had died off, the language began to change again, borrowing from the surrounding native tribes and inventing new words for the discoveries made in the conquest of the continent. A variant, or creole, is created for that community, causing a branching of the mother tongue in a new direction. Other New World examples are of course American vs. British English, Brazilian vs. European Portuguese, and Mexican vs. Castilian vs. Andalusian Spanish, and so forth. Spanish has numerous dialects due to Spain's colonization of many parts of the world from the 16th century onward.

War (infiltration) is another factor: forcing a conquered people to adopt the language and culture of the victors, a sort of cultural assimilation technique. A great example is the Russification effort of Soviet-era policy, under which native languages and songs were outlawed, people from Russia were forcibly relocated (or encouraged to settle) in the territories, and schools were banned from teaching literature and history that might glorify the original regime. This happened in Estonia, Latvia, and Lithuania and in other post-WWII Soviet-controlled territories such as Ukraine. In reality, the policy of forced Russification started under Tsar Alexander II in the 1860s, and even earlier in medieval times, and it was most successful in Belarus.

Diffusion is the cultural spread of a language. Here, a more modern example is English spreading through pop culture such as movies, books, and the internet. Another example is the popularity of anime and manga helping to promote the learning of Japanese.

Micro and macro forces at play

The sociological forces discussed above constitute the obvious macro influences on language shift. What are the micro forces? Literacy is surely one of them; so are borrowing terminology to expand the lexicon and grammar, and the individual choices that lead to localized slang. It is through an individual's speech behavior that language is either maintained or lost in the family context, and hence in the broader society.

Slang

Trade slang is a particularly interesting case to examine. When Dutch traders arrived in Indonesia in the late 16th century, they surely did not speak the local language. Stepping off the ships, to the locals they must have appeared as aliens: unintelligible, and oh so white. 'Do we kill them? Do we approach with caution? Do we try to make first contact?' So many conflicting thoughts must have gone through the minds of each side. 'What have they got that we want?' This scene plays out repeatedly throughout history.

The need for common terms arises, and the forces of the global marketplace win over the sword and/or spear. Of course, by the end of the relationship, the sword wins after all. The Dutch need to keep the British at bay, let alone the Spanish, would dictate having forts and closed ports to protect their monopoly. Soon it means taking advantage of and controlling the local population; a Dutch monopoly on exports is paramount. But back to pidgin, which usually evolves around the domains of trade and labor.

A jargon, or extremely limited set of vocabulary terms, enables a basic form of communication between speakers of mutually incomprehensible languages. It is often accompanied by hand signals and gestures, and sometimes an imperfect grasp, but still some knowledge, of the other's native vernacular is involved. A double illusion is created when, for example, the French think they are speaking an Indian language and the natives believe they are speaking good French. Out of such conversations, slang develops.

A clear example is Russenorsk, which arose in northern Norway and was used by Russian merchants, Norwegian fishermen, and the like. The first historical instance is from a 1785 lawsuit, and the last examples show it being stamped out around WWI. It was a seasonal trade language for the summer months and never established itself as a creole with native speakers. Another example is the Lingua Franca (Sabir) of the Middle Ages, which grew up post-Crusades and dominated commerce in the Mediterranean, Black, and Irish seas.

Trade slang exists today, most notably on the trading floors of major banks, where a specific vocabulary and shorthand grammar is used in combination with hand gestures.

Phonology

Strong arguments are made on both sides of the aisle about phonology and the forced pronunciation rules governing 'proper speech.' How words are pronounced influences spelling via errors in orthography. Consider a few examples that are now accepted regional dialect forms. 'Y'all' instead of 'you' for the second person plural: it started out as saying 'you all' to indicate a group of people, as distinct from the singular 'you,' and at some point the error became the new standard. 'Can't' instead of 'cannot.' 'Thru' instead of 'through.' And a classic favorite, 'Halloween' instead of 'All Hallows' Eve.' Now consider a functional shift from adjective to adverb. The standard adverb is 'well,' as in 'I am doing well.' In the past twenty years or so it has become fashionable to say 'I'm doing good,' or just 'Good' as a response to 'How are you?' Many will look at you strangely if you respond 'Well' instead of 'Good.' A proper English teacher of prior generations would not just cringe but flunk any student who speaks thusly (and sound pompous for doing so).

Adjusting and Accounting for These Elements

As always, why should we care about these issues of language and grammar in the context of NLP? First, language changes, and so models must change to reflect the current state of the culture. After all, a model is just a reflection of the data it ingests. And each domain within an area of knowledge will have shifting patterns of language. There is "Banking English," "Healthcare English," "Legal English," and "Academic English" to contend with, let alone "East Coast English," "West Coast English," "Street English," "Australian English," and so forth. Each one requires an understanding, or at least an awareness, of the culture that created it.

All of this variation leads to the "natural" part of NLP. The discipline is not necessarily trying to reach a formal understanding of the rules, but rather a practical understanding of usage. The challenge is not just to count nouns and word frequencies in a text; it is to understand the interrelated parts of speech that cause meaning to arise from an interaction between two individuals. That is a much more complicated challenge than TF-IDF (term frequency–inverse document frequency) or other statistical approaches. To truly perform NLP at a level that leads to meaning and intent, a data scientist must understand how language works. Practitioners who truly love languages and want to understand them must study pure linguistics as well as computational linguistics, the structure of speech as well as the measurement and tallying of speech.
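To make the contrast concrete, here is a minimal sketch using scikit-learn on a few toy sentences; it shows what a purely statistical TF-IDF view captures, which is term weights, not meaning or intent.

```python
# Minimal sketch: TF-IDF weights terms by frequency and rarity; it has no notion of intent.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "pain treatment for migraines",
    "holistic treatment for migraines",
    "find the best restaurant",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(docs)
terms = vectorizer.get_feature_names_out()

# Show each document's highest-weighted terms: a statistical summary, not an understanding.
for i, doc in enumerate(docs):
    weights = matrix[i].toarray().ravel()
    top = sorted(zip(terms, weights), key=lambda pair: -pair[1])[:3]
    print(doc, "->", [(term, round(w, 2)) for term, w in top])
```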

S Bolding—Copyright © 2022 · Boldingbroke.com

Use Case: Explainable AI, Learning from Learning Machines

Pattern Computer, out of Friday Harbor, WA has published a new whitepaper about XAI (Explainable AI). You can download the PDF here. The press release is here. The researchers at Pattern Computer present an interesting use case of how AI in its current state is able to advance scientific research, and yet still faces fundamental challenges. I have summarized the article in plain English as some of the concepts are highly technical and related to data sciences theory.

Hypothesis: one can use AI and data science to change the basic nature of scientific discovery. Researchers no longer need to form a hypothesis; instead, they should let the data tell them which emerging issue or idea to pursue in the broader world. To achieve this, adequate tools must be developed.

Today, science leans on human intuition to form a hypothesis about the issue or idea being investigated, which depends on that human having deep domain knowledge. With Machine Learning (ML), the machines are domain experts by virtue of the data they hold and analyze. The methods for creating the models and neural networks depend on the humans who select and input the data and code the algorithms. However, machines can go beyond human capacity in that they can be domain-agnostic, are faster and more efficient, and can learn. AI models make profound connections that a human would miss.

Will this impact the sciences as much as it has other industries? Biology is one area where AI is already delivering successes. Other early adopters include climate science, drug discovery, agriculture, cosmology, and neuroscience.

Why is AI more productive in Industry than in the Sciences? It is the nature of the questions being asked. Industry looks to solve specific, discrete problems and achieve returns on investment; money is on the line. The Sciences, however, seek to answer broader questions about the wider world, such as natural phenomena, at an order of magnitude and scale that goes well beyond the smaller, more targeted goals of Industry. As a result, AI in the Sciences requires more investment and time, more data, and a more complex set of algorithms. Indeed, it all goes back to the questions being asked.

Additionally, scientific discovery seeks to answer 'how' and 'why' an outcome occurred, not just to arrive at the 'what' of the data output. The goal of the Pattern Computer whitepaper is to examine the promise of 'AI for Science' where, as they state, 'we must develop methods to extract novel, testable hypotheses directly from data-driven AI models.' In essence, use AI to automate one of the steps of the scientific process, that of forming a correct hypothesis. This will save time, money, and increase opportunities for furthering science. One of the most costly aspects of research is chasing incorrect hypotheses or investigating an idea only to find out that a competitor has already investigated and abandoned that idea. If you eliminate these false trails early on in the process, you increase the likelihood that your research will result in tangible outcomes.

Taking a 'scientific first-principles' approach to modeling means that the model is a secondary tool to the primary question of understanding the world. Therefore the model, regardless of the methodology used to create it, must adjust to the data and the needs of the domain being investigated. The simple approach of using a training set to create a candidate model and then test, adjust, test, and so on is not sufficient, because it only looks for the positive and does not account for counterfactual or contradictory data. You get curve fitting rather than an accounting of the whole world. That can produce a correct outcome, but it does not answer the critical questions of 'why' and 'how' mentioned earlier.

Statistical approaches investigate not only the data but the actual model: how it was built and the domain being modeled. They provide confidence scores, answer the uncertainty factors, and look at the impact of parameters, both chosen and rejected. Statistical methods are just one way of testing a model; there are others that are not dependent on parameters, and the authors list several. The crucial factor is that the model be tested by outside methods that validate its ability to reason quantitatively and qualitatively.

The authors provide a clean and well-presented history of improvements in tools and technology in Figure 1, which summarizes the advances in math and computing leading to today's state of the art in AI. They question whether data-driven AI can mix with Statistics the same way that Calculus and Computing are linked. Figure 2 provides a view of the applications of current AI in Science versus Industry.

Starting with a statistical approach of creating a minimal model based on a limited set of parameters, the authors posit that it is possible to infer from that proto-model a hypothetical collection of 'parameters to outcomes' that indicates which parameters to include in a more complex alpha model, and then to iterate from there. However, in my opinion, this is little different from creating a training data set and then testing, the very approach rejected in their introduction to the problem; they are simply starting from a statistical approach instead of an ML approach. It is a matter of degree rather than process at this juncture: we are splitting hairs.

In their discussion of Generalization versus Extrapolation (Section 3), the authors expound upon their theory regarding the construction of models for scientific understanding of the world. And it all comes down to data: access to quality data is always a barrier to entry in any effort. They break the discovery of patterns into two classic parts: the capacity to learn from data and the capacity to learn from outliers that do not 'fit the curve' of the statistically interesting nodes. Why is the second important? Because it is in the outliers that the potential for new scientific discoveries may lie. Both are critical for the scientific method to succeed. In Industry, by contrast, those outliers are anomalies to be cleansed and removed.

Extrapolation is the principle of using one concept to transfer learning to another domain. Figure 3 of the paper illustrates this principle of science. For AI research, the concept of extrapolation applies when a first-principles model is used to create phenomenological or semi-empirical models so that data can be extended beyond its narrow domain to neighboring areas of research.

But where does the principle of Generalization fit into this for AI? It is in the testing of accuracy that we see generalization methods being applied, and again it goes back to training sets, where a data set is split in two: one part for creating the model and one for testing its output. This allows the creator to measure accuracy, precision, recall, and other important benchmarks. If you can then apply that model to a very different data set and get similar benchmarks, the model is said to be generalizable, or extensible to other domains.
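As an illustration of that split-and-measure step, here is a minimal sketch using scikit-learn's built-in breast cancer dataset and a logistic regression model; both are placeholder choices that stand in for whatever data and model a team is actually using.

```python
# Minimal sketch: split the data, train on one part, measure generalization on the other.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=5000)  # placeholder model
model.fit(X_train, y_train)                # built only from the training half

preds = model.predict(X_test)              # judged only on the held-out half
print("accuracy: ", round(accuracy_score(y_test, preds), 3))
print("precision:", round(precision_score(y_test, preds), 3))
print("recall:   ", round(recall_score(y_test, preds), 3))
```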

Extrapolation, in contrast, takes into account counterfactual reasoning about data that could not be collected and what we can posit about it. The essential difference is that the AI model is either discovering patterns within a given data set or discovering patterns outside that data set in the broader world: two entirely different problems. However, without proper tools and methods to evaluate the performance of the model being used, extrapolation may end up with poor results. In other words, it provides a vector for further investigation but does not provide conclusive evidence.

Where is the answer to this conundrum? Adversarial AIs which challenge those models may provide an automated means of testing. Such testing would increase the robustness of the models and reduce the noise in the signals produced.

Explaining AI is always a challenge, and this whitepaper makes a serious attempt at explaining AI for Science. The authors go into detail in Section 4 on the importance of differentiating between 'explaining' and 'interpreting' the matter at hand. While it may seem like semantics, the details are important: one can understand a model yet have a hard time understanding the implications of its output. The authors denote the difference as follows: 'the distinction between identifying patterns in the data/model (hence, ML-style generalization) versus discovering patterns in the world from data/model (hence, scientific extrapolation, or ML-style transfer learning).' But there is a simpler way to understand the difference: explaining is understanding what is going on; interpreting is translating that into new insights, the 'why' and the 'how.' This, according to the authors, results in a process of vetting the model in question for its fitness for use.

The completeness of a model as a metric is not really addressed in this paper, but it is significant to the discussion and somewhat implied. In modeling a domain for any purpose, the choice of data for training or input to the model is based on the question being asked. But as the paper notes, science seeks to answer questions about the wider world. Therefore, in assessing the quality of a model and applying principles of extrapolation, would it not also be important to examine the gaps and assumptions in the model, thereby avoiding bias? Examples of bias are well documented, such as facial recognition AI that fails on minority faces, or the lack of data about women's health in third-world countries when modeling heart disease. If healthcare models are based on first-world countries and the bulk of data over time has been collected about men, it is hard to claim that a model is complete in that domain. The authors discuss density estimations as embedded preponderances of data to some degree and link that to regional effects. These and other barriers to integrating AI into the scientific process again trace back to the availability of input data. The authors acknowledge that the 'capacity to understand data representations' is a 'grand challenge' with significant impacts on the scientific discovery process.

Learning additional information always results in a decision being taken; this is how science advances. You make a hypothesis, test and learn from it, then decide on next steps. What that information has taught you about the world determines a new way of interacting with the data and parameters you are working with. Starting out with semantically meaningful parameters describes a known system or area of knowledge; you then examine what the model is telling you about the information, and the decisions you take to extrapolate and form new hypotheses follow a familiar sequence. When the data is within the same domain but lacks semantic meaning in and of itself, the model may take that unstructured data and provide a new insight into it. Or it may distort it and provide false trails.

The distinction is important, because how you decide which path to pursue is critical to the process. The authors posit that AI algorithms can discover emergent parameters that lead to defining new semantics, intrinsic meaning, within a domain that lacks definition. The problem is non-trivial because the process as well as the results need to be subjected to rigorous testing. Emergent system features for modeling in a domain are a wide-open field of research at the moment, due to the lack of good, let alone sufficient, testing methods. One example the authors give is whether a model holds within it hidden knowledge about a structure or pattern that humans lack the ability to discern. How do you extract that knowledge in a scientific manner and then make it more than an anecdote of the research? This is the non-trivial part, where scientists must be rigorous and detailed in their validation steps.

The lack of clear connections between parameters and patterns is where AI excels, but it is also where the most doubt can be found. Where there are as-yet-unknown system controls at play, the definition of an emergent phenomenon lies in the outliers. But many data scientists treat those outliers as data to be rejected or cleansed from the set because it doesn't match their hypothesis. This is classic confirmation bias at play. The authors rightly acknowledge that outlier data cannot be rejected out of hand; it must be accounted for in order to validate the hypothesis as well as the model. Just because you don't like something doesn't mean you can ignore its existence. That's not scientific.

Scale is another factor to consider in the problem space, whether the domain is life sciences or any other. What is the correct resolution of the data, or of information inputs when they are images? Which data points should be collected from those images? The system being queried has different layers of complexity, whether it is a civilization studied by sociology or the human brain studied by neuroscience. The correct level, or scale, at which to examine the data is based not only on the question being asked but also on the governing principles, the decisions taken to this point, and the responses gleaned thus far. You learn the right scale as you perform the research; trial and error come into play.

The solution may lie in creating multiple variant models and then combining them into a single model that yields higher-quality predictions, because the errors of the individual sub-models cancel each other out while their outputs tend to reinforce one another. This may be faulty logic unless the ensemble methods used actually address the errors, because the sum of those errors could also compound in the output. It creates a black-box effect in Industry, where end users rarely look at how the model was built, that is, at its explainability. Industry mostly cares about effectiveness and accuracy, because there is little need to worry about negative or counterfactual implications; the causal relationships do not matter. AI solves a discrete task, a function of operations, a limited need to know. Science is not like this. The need to know in science is boundless, and science needs the causal factors as well as the relationships to gain a deeper understanding of system dynamics.
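As a small illustration of the error-cancelling intuition (and its limits), here is a minimal sketch comparing three individual regressors to a simple averaging ensemble on synthetic data; the models and the data are placeholders, and the averaging only helps when the sub-models' errors are not all biased in the same direction.

```python
# Minimal sketch: averaging diverse sub-models can reduce error when their mistakes differ.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=10, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = [Ridge(),
          DecisionTreeRegressor(max_depth=5, random_state=0),
          KNeighborsRegressor(n_neighbors=7)]

predictions = []
for model in models:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    predictions.append(pred)
    print(type(model).__name__, "MSE:", round(mean_squared_error(y_test, pred), 1))

ensemble = np.mean(predictions, axis=0)  # simple average of the sub-models' outputs
print("Averaged ensemble MSE:", round(mean_squared_error(y_test, ensemble), 1))
```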

Extracting structures from the model representation of a domain (i.e., real-life situations) is a pattern recognition problem. Repeating patterns allow scientists to identify the building blocks of life, nature, and the cosmos at the largest scale. When something repeats, it is called a pattern; when patterns occur over and over again, we call them structures. The closer to the core, the more fundamental the structure; the further out from the core, the more detailed and finer the scale of the dimension becomes, often with less and less data to support it. Scale is handled by filter selection: how fine-grained, and whether low-dimensional or high-dimensional for higher-order patterns. Minimal models with a few parameters are said to be low-dimensional and are useful when the data set is finite.


The problem with high-dimensional systems is that they exhibit chaotic behavior. They quickly expand beyond the model's ability to handle them and have high noise-to-signal ratios. Sometimes the data points merge or diverge when a method such as linear regression is applied, losing the unique qualities that interest the researchers. This again goes to the question of scale: where is the right cutoff point? The computing power required for complex probabilistic models can also be prohibitive. (Quantum computing may solve that problem in the near future.)

The authors take away two conclusions from this analysis: an AI model with good predictive accuracy must be learning some sort of approximated parsimonious model, and existing methods are insufficient to identify or learn from that parsimonious model in any scientific way. It still requires human intervention to test and validate the learnings, showing that only the first step of the scientific method, forming the hypothesis, can be automated by AI. Therefore, complex models, while they may contain well-defined, low-dimensional sub-models, do not lend themselves to aggregation into a globally applicable structure that can be analyzed with ease by a machine or a human. A taxonomy of model structures with definitions is provided in Figures 4A and 4B for reference.

Global sparsity does not translate to local sparsity in this configuration. But the intermediate forms of model structures depending on either type of sparsity may provide probative areas of interest and lead to high predictive quality output. In other words, start small, combine what is good, and in testing along the way, you may learn something interesting to pursue.

Statistical theory rests on models of global sparsity, lacking local bias in some way. Results for locally biased models have proven brittle when tested. The thought is that by extrapolating from the local to the intermediate to the global scale, the global features will yield higher-quality predictions. Seems logical. But if all models are wrong and some are useful (as George Box stated[1]), how does one pick the models to incorporate at the local level in order to build up to the global? This problem of sub-models is inherent in both Industry and Science when looking for domain knowledge to extend. Global models treated as a black box end up being a sort of Mechanical Turk, doing the heavy lifting without much insight. The authors look to local models to provide the finer insights of a white-box approach, since there the inner workings of the model have been examined more closely due to finer-grained parameterization.

Ultimately, the challenge is to find the happy medium, to pursue the intermediate-scale models. It is believed that this intermediate space is where high-quality predictions lie. Data plus computation holds the potential to solve challenges such as computer vision, predictability of drug outcomes, and other hard problems. As always, the issue is testing the models, validating the results, challenging them with negative cases.

To face this 'grand challenge,' the authors take the approach of employing surrogate models: scientifically validated models, smaller models that have been tested, core sets of data cleansed for training, and other sources of ground truth called into service to establish a baseline. These surrogate models are a type of lens through which one can examine the model being tested. Even though the mechanisms of the two models may be different or even unknown, if their responses match, that agreement provides a level of validation for the candidate model. The proposition put forth by the authors is that it is essential to develop a set of 'scientifically-validated surrogates for state-of-the-art AI models' in order for AI-enabled science to progress.
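As a rough illustration of that lens idea, here is a minimal sketch in which a small, interpretable decision tree is fitted to mimic a black-box model's predictions and the agreement (fidelity) between the two is measured. The models and data are placeholder choices illustrating the general surrogate technique, not the authors' specific method.

```python
# Minimal sketch: fit an interpretable surrogate to a black-box model and check agreement.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

data = load_breast_cancer()
X, y = data.data, data.target

# Stand-in for an opaque, high-capacity model.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
bb_preds = black_box.predict(X)

# Surrogate: a shallow tree trained to reproduce the black box's outputs, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, bb_preds)

fidelity = accuracy_score(bb_preds, surrogate.predict(X))  # how often the two models agree
print(f"Surrogate fidelity to the black box: {fidelity:.2%}")
print(export_text(surrogate, feature_names=list(data.feature_names)))  # human-readable rules
```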

Operationally, they want to 'enable domain scientists to identify, test, and validate' properties of surrogate models. This implies creating a suite of tools and processes for that aspiration to be realized. The authors then go on to provide a history lesson in analogous developments from chemistry and mathematics, where similar approaches were taken. They state that better, faster computers will not solve the problem when the challenge lies in the area of validation and verification.

Naturally, the Next Steps, as Section 8 is titled, revolve around the development of this set of surrogate models, including counterfactual ones. Acknowledging that this is virgin territory for researchers, with few or no methods in existence, the authors rest the success of their proposal on a three-point plan: create theories and methods for surrogate model development; perform statistical analysis of the output of those surrogate models, especially the outliers; and devise strategies for counterfactual testing and reasoning about the models. All of the above requires valid use cases for each domain, as well as learning from historic examples in engineering practice, the hard sciences, and the life sciences.

Conclusions

Addressing the elephant in the room: ethical AI is especially relevant when discussing medical applications, life sciences, or anything that affects people in their daily lives. Socially biased predictions, blindly following the data without examining who chose that data in the first place, and allowing a machine to make decisions with no human intervention in the interpretation all lead to potentially damaging results for individuals and communities. Keeping a human in the loop is critical. An example given is crowd-sourcing the data input to ensure a wider, more diverse representation of demographics. Using local surrogate models against global models tests the global model's applicability to a new region. The cost of being wrong in these situations is high when medical treatments are the subject matter.

Teaching a machine to know what it doesn't know, in essence to identify a gap in the data, a bias, is another area for research and exploration. These 'known unknowns' go to the question of the model's competency, but again it is a question of trust. Trusting the technology means trusting the people behind its creation, and absent tools to validate the output as well as the methodologies, it is a matter of reputation and awareness. Once again, the authors argue that surrogate models provide the answer, as they are orthogonal to the methods used to train the candidate models being tested. The measure of uncertainty for models would then have a more theoretical foundation from which to operate.

Data, being retrospective in nature, is hard to mold into a predictive tool. Decisions based on AI and data-driven models are entering people's daily lives at an increasing rate, without societal awareness. It is more incumbent on scientists than ever, both data scientists and hard scientists, to take their ethical obligations seriously. This whitepaper is a step forward in advancing and proposing methods and processes for doing so.

S Bolding—Copyright © 2022 · Boldingbroke.com


[1] George E. P. Box. Science and statistics. Journal of the American Statistical Association, 71(356):791–799, 1976. (See footnote 22 of the whitepaper.)


Discerning Intent, the Holy Grail of Any NLP Solution

What was I thinking when I said or typed "Find the best restaurant"? To a human, the obvious meaning of this statement is a clear, declarative desire to get something to eat, and to find the most popular choice in the area. But to a machine, the intent is not at all clear, even for such a simple search query. Consider the phrase "pain treatment." Here the intent is not clear even to a human listening to the speaker. Is the pain physical, emotional, existential? Perhaps the best treatment is a psychiatrist instead of ibuprofen.

A person looking for intelligent responses would need to add some context in the form of a prepositional phrase or two to have any chance of narrowing the results of a search. Even a person would not understand the true intent of "pain treatment" until the user added this: "pain treatment for migraines". Now we and the machine know to narrow the search to the field of medicine, specifically neurological medications. However, what if the person really meant "holistic treatment for migraines"? With the addition of an adjective, the focus shifts from drug therapy to organic protocols such as meditation, light reduction techniques, and perhaps herbal or traditional Chinese teas and tinctures.

This simple example of communication highlights the nature of language as ambiguous and changing, depending on the context. The semantic weight of the various components is essential to understanding the desire of the speaker. It is therefore necessary for the machine to also learn the semantics and grammar structure of the language at hand. What is a noun? An adjective? How do you recognize and break down a prepositional phrase? More importantly, are there verbs involved? Verbs indicate actions, which are vectors of intent. 'Intent' in NLP is the desired outcome of a behavior, and behaviors are the outcome of verbs: how you behave is what you do. So it should be straightforward to see the value of verb analysis, instead of just counting nouns for statistical purposes.
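To ground the idea of verb analysis, here is a minimal sketch using spaCy (assuming the small English model en_core_web_sm has been downloaded) that pulls out the verbs and their direct objects from an utterance; it illustrates parsing as a step toward intent, not a complete intent detector.

```python
# Minimal sketch: extract verbs and their direct objects as rough signals of intent.
# Assumes the model has been installed: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Find the best restaurant near the station and book a table for two.")

for token in doc:
    if token.pos_ == "VERB":
        objects = [child.text for child in token.children if child.dep_ in ("dobj", "obj")]
        print(f"verb: {token.lemma_:<6} objects: {objects}")
# Expected along the lines of: verb: find  objects: ['restaurant']
#                              verb: book  objects: ['table']
```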



Stringing together words is a skill that children pick up from their social environment. Consider "The dog bited me," and then imagine the mother or father correcting the child to "the dog bit me," thus reinforcing the irregular past-tense form of the verb in the child's understanding. Semantics cannot be underrated in search and in the broader NLP processing endeavor.

Many computational linguists rely on statistics: count the nouns, and how many times they appear in a text, then cluster them to see what a document contains. But statistics are not enough. To have more than a topical understanding of text, or of conversations transcribed to text such as a phone call, the ideas or subject matter must be discerned from the context of the larger subject matter. A paragraph has a theme or major idea to it. If that paragraph contains a lot of medical jargon, to use the original example above, it could be about migraines without ever using that term.

NLU, or Natural Language Understanding, attempts to solve this issue. Reading a document sentence by sentence and paragraph by paragraph is more complex than just looking at heading titles and counting nouns. NLP and its sub-disciplines, NLU and NLG (Natural Language Generation), rely on grammar parsers, part-of-speech tagging, and other tools to break a sentence into its component parts, examine their relationships to other parts of the sentence, and apply rules to grasp its core meaning. Then the paragraph can be examined, again looking at structure and content, to label that grouping of sentences with tags indicating the topic, the major ideas, and other nuances of semantic significance.

But all of this still does not answer the question of intent. What does the person really want, or want to convey, when they make an utterance? Sometimes I don't even know what I meant by what I just said; I think of different ways to phrase it, to get just the right emphasis, the right idea across to my listener. This is why people practice speeches instead of speaking extemporaneously. Some people just say what they're thinking and then have to explain what they really meant by the words they chose; the ensuing verbal vomit results in comedy. Detection of sarcasm, irony,[2] and the like is another problem. Were they serious, or being emotionally manipulative? All of these issues feed into a practice of NLP called "sentiment analysis," yet another sub-discipline: how to determine the emotional attitude of the speaker.

As you may be beginning to realize, the many factors behind "intent" in speech analysis create a fascinating landscape for research. And they open the door to bias and pollution by world views that are in conflict.

Teaching a Machine to Understand Intent

When we interact with another human, we rarely recognize the surface behaviors. What the brain cares about is the underlying message. Facial expression, hand motions, body language are all subconsciously cataloged. But the computer often does not have these types of kinematic inputs available. The machine only has the plain text in front of it. "A generative knowledge system underlies our skill at discerning intentions, enabling us to comprehend intentions even when action is novel and unfolds in complex ways over time. Recent work spanning many disciplines illuminates some of the processes involved in intention detection."[3] Even the medical community struggles to define and quantify the elements of communication leading to a discernment of intent.

As we have detailed before, a computer is a logic machine, working step by step to break a problem down into a number of processes, sub-processes, and algorithms that interact to create an outcome or goal. Here are the common steps for NLU, the first part of intent analysis; a rough code sketch follows the list.

  • Analyze the Topic: Extract key concepts and the overall subject matter.
  • Extract Relevant Context: What is the general background of the discussion?
  • Syntactic Analysis: Determine sentence structure and the roles of nouns and other parts of speech using a part-of-speech tagger.
  • Identify Actors: Who are the people, organizations, and agents involved, and how important are they relative to each other?
  • Semantic Analysis: Resolve the contextual meaning of a word where it may have multiple meanings.
  • Sentiment Analysis: What are the 'moods' of the user? Emotion, attitude, mental state.
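
Here is the promised sketch, mapping these steps roughly onto off-the-shelf components (spaCy for parsing, noun chunks, and entities; NLTK's VADER for sentiment). It is illustrative only, not a production intent engine.

```python
# A rough, illustrative mapping of the NLU steps above onto common tools.
import nltk
import spacy
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
nlp = spacy.load("en_core_web_sm")
sia = SentimentIntensityAnalyzer()

def rough_nlu(text: str) -> dict:
    """Return a crude per-step breakdown of one utterance."""
    doc = nlp(text)
    return {
        "topics": [chunk.text for chunk in doc.noun_chunks],       # analyze the topic
        "actors": [(ent.text, ent.label_) for ent in doc.ents],     # identify actors
        "syntax": [(tok.text, tok.pos_, tok.dep_) for tok in doc],  # syntactic analysis
        "sentiment": sia.polarity_scores(text),                     # sentiment analysis
    }

print(rough_nlu("Acme Corp's support bot frustrated Maria during her refund request."))
```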

Discerning the actor's intent is one of NLU's key strengths. To successfully identify intent, NLU leverages translation, text summarization, clustering, speech recognition, topic analysis, named entity recognition, semantic analysis, and the other subcomponents of the NLP toolkit that have been developed through intensive research. Most of the current use cases are related to consumer analysis applied to commerce and support systems.

An example of a good, multilingual sentiment analysis and intent discernment tool is Rosette, available in over 30 languages. It provides a combination of tools (as listed above) that automate the effort. There are other services out there such as MarsView.ai to ease the development cycle.

Data Implications

As with any ML (machine learning) problem, the quality of the training data for a model determines its viability and value. Many use cases, including the one around intent, require labeled data to give the model a jumpstart in building its view of the content. Business goals determine in large part what labels a client chooses. If you are in the medical industry, you choose medical terms as the labels. If you are in technology, you choose programming concepts, terms, and structures as the labels.

Why is this important? Essentially you are trying to create vectors pointing to a common meaning. If you parse the 'concept' of Mercury, you need to indicate the domain in which the term has meaning. Is it a god from Roman mythology? A chemical element? A planet? Or the car brand Ford launched in 1938? See the problem? Context matters. The model is only as good as the tags you put on the data when it is cleansed and prepared for processing.
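
In practice, the labels might look something like this hypothetical snippet, where the same surface form is tagged with the domain that gives it meaning:

```python
# Hypothetical labeled examples: the same surface form, 'Mercury',
# tagged with the domain that disambiguates it.
labeled_data = [
    {"text": "Mercury was the Roman god of commerce and travel.", "label": "MYTHOLOGY"},
    {"text": "Mercury is liquid at room temperature.",            "label": "CHEMISTRY"},
    {"text": "Mercury completes an orbit every 88 days.",         "label": "ASTRONOMY"},
    {"text": "Ford introduced the Mercury brand in 1938.",        "label": "AUTOMOTIVE"},
]
```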

And what's with the 'vector' thing? Word embeddings, called vectors, are representations of text data in which words with similar contextual meaning have a similar representation. In other words, synonyms: what are all the various ways to say the same thing? A Roget's Thesaurus comes to mind. Words from the text are represented as calculated vectors in a predefined vector space, or domain of knowledge. One of the most common tools for this is Word2Vec, which largely superseded LSA (Latent Semantic Analysis). Each looks to build an understanding of a word based on the context in which it is used. The creation of a vector space is more graph-ical (as in graph based) than the tree-structure approach of an ontology. In a coordinate-based system, related words sit in close proximity to one another, based on a corpus of relationships that can be calculated and turned into mathematical formulas.
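
A minimal Word2Vec sketch, assuming the gensim library and a toy corpus (real embeddings need far more text to be meaningful), shows the mechanics:

```python
# A minimal Word2Vec sketch with gensim; a toy corpus like this yields poor
# vectors, but it shows the mechanics of placing words in a vector space.
from gensim.models import Word2Vec

sentences = [
    ["migraine", "treated", "with", "medication"],
    ["migraine", "eased", "by", "meditation"],
    ["headache", "treated", "with", "medication"],
    ["headache", "eased", "by", "herbal", "tea"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=100, seed=1)

print(model.wv["migraine"][:5])                   # first few dimensions of the vector
print(model.wv.most_similar("migraine", topn=3))  # nearest neighbours in the space
```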

Again, why should we care? Here's why: These models are increasingly used to drive services over the internet. And they may contain biases and suppositions based in world views that we agree or disagree with. The outcome of the model and how it is applied to a problem can be discriminatory at its very root if the tagging and resulting word vectors are not carefully constructed. Training a model with data that brings in multiple perspectives is essential to creating a well-rounded knowledge domain.

An Impossible Dream?

Back to the Holy Grail. "What I meant to say was..." How often do you hear people explaining themselves after saying something that is difficult to understand? Even humans have a hard time with intent. To let a machine do the work for us means that multiple techniques must be deployed to get within acceptable tolerances of what the 'Natural' in NLP implies.

This is why it's a 'holy grail' type of problem in machine learning and AI. The desire to parse out what a person really means when they say something is a hard problem in computational linguistics. And we are far from a solution that works well.

S. Bolding, Copyright © 2022 ·Boldingbroke.com


[1]      Conversational AI over Military Scenarios Using Intent Detection and Response Generation, Hsiu-Min Chuang, Ding-Wei Cheng, Current Approaches and Applications in Natural Language Processing, Appl. Sci. 2022, 12(5), 2494. doi: 10.3390/app12052494.

[2]      Sociolinguistically Informed Natural Language Processing: Automating Irony Detection, D.A. Baird, J.A. Baird, Trends in Cognitive Science, 2001 Apr 1;5(4):171-178. doi: 10.1016/s1364-6613(00)01615-6.

[3]      Discerning intentions in dynamic human actions, D.A. Baldwin, J.A. Baird, Trends in Cognitive Sciences, 2001 Apr;5(4):171-178. doi: 10.1016/s1364-6613(00)01615-6.

Structural Semantics—A Deeper Dive

As we saw in our discussion of How Language Works, there are two major disciplines in Linguistics: Semantics and Semiotics. Often Computational Linguistics depends upon statistical methods for counting words and using frequency to determine importance. To wit, the more a word is used the more important it must be as a concept. This is a semiotic approach, looking to individual words and their meaning. In contrast, Semantics is the relationship of words to each other within the sentence and paragraphs of a text, the context that gives a greater intent to the individual words.

Why does this matter to NLP and Machine Learning? Basically, if you don't understand how language functions, how can you teach a machine to process it? It's as simple as that. For too long, Computational Linguistics has been dependent on pure mathematics and has ignored the deeper structures of language. It touches only the most essential elements, with Part of Speech (POS) tagging and grammar parsing for a few languages, and those grammar parsers look at simple Subject → Verb → Direct Object relationships. This analysis is hard to perform, and, in the meantime, the discipline has come a long way by just counting words and looking at how close or far they are from each other in a text. But it is long past time for the Linguistics in Computational Linguistics to take on a greater role.

In simplest terms, structuralists take linguistics as a model and attempt to develop “grammars”—systematic inventories of elements and their possibilities of combination—that would account for the form and meaning of literary works; post-structuralists investigate the way in which this project is subverted by the workings of the texts themselves. Structuralists are convinced that systematic knowledge is achievable by moving the focus of criticism from thematics to the conditions of signification, the different sorts of structures and processes involved in the production of meaning. Semiotics, the successor to structuralism, is best defined as the science of signs; it treats any medium as a system of signs to be studied. While content analysis takes a quantitative approach to a text, semantics seeks to analyze texts as structured wholes.

The history of this discipline goes back to the early 1900s and a gentleman named Ferdinand de Saussure, who kicked things off with his seminal work Cours de Linguistique Générale (1916). Things really took hold as a school of thought in the 1950s and '60s with contributions from A. J. Greimas and François Rastier. Let’s take a moment to understand how this systematic breakdown of grammar and words works.

Key to Saussure's theory is the concept of the “sign”: each language is composed of linguistic units in which a sound pattern (the signifier) is paired with the concept it references (the signified). This is a psychological and not necessarily a material relationship, and therefore a social construct. Both components of the “sign” are inseparable, like two sides of a coin. The value of a sign is determined by all the other signs in the corpus of the language, adding to its nature as a social construct. Therefore the signs in French are the sum total of French culture, just as the signs in English reflect English culture and its long history of melding various languages through invasion and conquest. This leads to the maxim that "culture is instantiated in language."

Meaning

For signs there are syntagmatic and paradigmatic relationships. A syntagmatic relationship involves a sequence of signs that together create meaning. A paradigmatic relationship involves signs that can replace each other, usually changing the meaning slightly with the substitution. Think of a syntagmatic relationship as grammatical (how signs combine in sequence) and a paradigmatic relationship as a list of synonyms (what could fill a given slot).

A syntagma is an elementary constituent segment within a text. Such a segment can be a phoneme, a word, a grammatical phrase, a sentence, or an event within a larger narrative structure, depending on the level of analysis. Syntagmatic analysis involves the study of relationships among syntagmas.

Paradigmatic analysis is the analysis of paradigms embedded in the text rather than of the surface structure of the text which is termed syntagmatic analysis.

The concept of syntagmatic and paradigmatic analysis was extended to narrative discourse by A. J. Greimas, best known for his Semiotic Square. Vladimir Propp, in his analysis of Russian folk tales, Morphologie du conte (1928), concentrated on internal analysis rather than historical explanation to formulate his classifications for folk tales. In distinguishing between form and structure, he argued that structure included both form and content, whereas form restricted one to examining the medium of a given system of communication. Propp’s seminal work greatly influenced Greimas,[1] who replaced Propp’s concept of “functions” with that of “actants.” Greimas developed a semiotic approach to narrative structure, the “semiotic square”[2] and the “actantial model.”[3] He uses the term “seme” (from the Greek sēma, “sign”) to refer to the smallest unit of meaning in a sign.

Greimas’ model of the semiotic square, where the interaction of opposing symbolic interpretations creates semantic categories, is based on three relationships: contradiction, contrariety, and complementarity.

“Cette structure élémentaire (...) doit être conçue comme le développement logique d’une catégorie sémique binaire du type blanc vs. noir, dont les termes sont, entre eux, dans une relation de contrariété, chacun étant en même temps susceptible de projeter un nouveau terme qui serait son contradictoire, les termes contradictoires pouvant à leur tour contracter une relation de présupposition à l’égard du terme contraire opposé.”[4]

"This elementary structure (…) must be conceived as the logical development of a binary semic category of the type white vs. black, whose terms are, between them, in a relation of contrariety, each being at the same time capable of projecting a new term which would be his contradictory, the contradictory terms being able in their turn to contract a relation of presupposition with regard to the term opposite opposite. ”

Courtés uses this square to summarize the theories of semantic relations proposed by Greimas in both Sémantique structurale (1966) and Du Sens (1970). Using the example of true and false, S1 represents true, S2 represents false, non-S1 represents not-true, and non-S2 represents not-false. Its primary value resides in its usefulness as a means of establishing for a text the pertinent opposition(s) which generate(s) signification.

One may substitute any valid set of contradictory terms into the semiotic square. For example, discussing theories of the Fantastic, Fantasy, or Science Fiction, “real” and “supernatural” provide the contradictory pair. One of the most prominent types of operations anticipated by the model of veridiction[5] is the narrativization of /otherness/ in the process of linking, or concatenating, episodes that contain examples of the supernatural. The narrative dimension of a text signifies itself as a series of states and the transformations linking them.[6] There are two states, conjunction (symbolized as ∩) and disjunction (symbolized as ∪), while there can be numerous types of transformations.

Semiotic analysis details the relations between these static and dynamic aspects of narrative by studying the characters and the roles that they play in the succession of transformations between states. According to Greimas, the actantial model defines the relationships between characters according to six categories of actants:[7] the subject (who desires the object), the object (which is desired by the subject), the helper (who aids the subject), the opponent (who opposes the subject), the sender (who initiates the quest of the subject), and the receiver (who benefits from the acquisition of the object). Greimas’ system is particularly appealing because of the way it highlights the Subject and the desire for the Object. The actantial model is usually drawn as a diagram with the sender, object, and receiver along an axis of communication, the helper and opponent flanking the subject along an axis of power, and the subject’s desire pointing toward the object.[8]

The Greimassian method of narrative analysis creates correspondences among the various themes of a story, which serve as the spatio-temporal coordinates for the continuity of the intrigue. The narration of a story that is encoded by the author thus encourages the cognitive and pragmatic act of decoding the meaning of a text by the reader.[9]

Stages of a Structural Analysis

A structural analysis can be said to have three major stages. First, it is necessary to determine the principal actants, their relationships, and the resulting episodes of the narrative (syntagmatic analysis). Segmenting the text in this manner leads to the problem of deciding what is significant and what is secondary. Next, the critic must establish major divisions or units of the text that underlie the episodic structure in order to determine the larger, overall meaning of the narrative pattern. As a final step, the relationship between episodes is defined (paradigmatic analysis).

The syntagmatic analysis usually involves studying the text as a narrative sequence. The first step in such an analysis is to identify the actants and their relationship to each other. Who is the subject? Can there be multiple subjects (and therefore multiple points of view) within a single narrative? Can a subject desire multiple objects and how do the conflicting desires affect the outcome of the narration? The relationship of desire between the subject and object provides motivation and closure to the narrative. The social implications of desire are manifested in the motivation of the subject. Does his desire for the object benefit the society at large or is it self-serving? On another level, the consequences of desire are played out within the “sender–object–receiver” relationship. The “sender” is the agent who grants the subject permission to pursue the object—for example, King Arthur initiates the quest for Guinevere in Le Chevalier de la Charrette by sending Gawain to search for Kay and the queen. The most frequent example of the receiver, the one who possesses the object at the end of the story, is the hero’s society. In this example Arthur’s court and kingdom benefit from the return of their queen and seneschal. Secondary characters fulfill the roles of helper if they aid the hero in acquiring the object. When hindering the hero from reaching the object, secondary characters take on the function of opponent. By identifying the major roles in the text, one is able to divide the text into units and episodes. A unit represents a major segment of the text, while episodes are subdivisions of units.

The next stage of a structural analysis is to formulate a narrative structure, or outline, for a given text. That structure is often expressed in terms of equations. These serve as a means of summarizing the relationship of desire between the subject and object. When linked together, the narrative units permit one to quantify the structure of the whole text. These equations reveal the narrative development of actants as they interact with each other and the object of desire. Greimas, in his Sémantique structurale, proposed a mathematical representation of semiotic structures in order to more precisely reveal the hierarchical structure.[10] In the preface to Courtés’ book, Greimas reemphasizes that the division of a text may be based on the various actantial and thematic roles of interaction with the object.[11] Courtés builds upon the work of Greimas by incorporating the concept of “isotopes,” which are defined by Greimas as a redundant set of semantic categories which makes possible a uniform reading of the narrative.[12]

Courtés’ work specifies a method of equations used to describe the disjunctive and conjunctive states of characters.[13] When a subject (S), such as the hero, obtains an object (O), he is said to be in a state of conjunction (symbolized as ∩) with that object, represented by the formula UN = S ∩ O, where UN is the narrative unit. When the hero is separated from the object he desires, he is said to be in a state of disjunction (symbolized as ∪), represented in the formula as UN = S ∪ O. At any given moment in a text, the opponent may possess the object desired by the subject. The subject is thus a potential agent of the function(s) that will bring him into a state of conjunction with the object. The transformation enacted in this instance is an exchange by transfer of the object from the opponent to the subject. The process whereby the subject realizes this objective is called a narrative program.[14]
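
As a playful aside for the computationally minded, here is a toy encoding of that notation, entirely my own sketch and not anything proposed by Greimas or Courtés: a narrative unit records whether a subject is in conjunction or disjunction with the object it desires.

```python
# A toy encoding of the UN = S ∩ O / UN = S ∪ O notation: each narrative unit
# records whether the subject is joined to, or separated from, its object.
from dataclasses import dataclass
from enum import Enum

class State(Enum):
    CONJUNCTION = "∩"
    DISJUNCTION = "∪"

@dataclass
class NarrativeUnit:
    subject: str
    obj: str
    state: State

    def __str__(self) -> str:
        return f"UN = {self.subject} {self.state.value} {self.obj}"

# Hypothetical story: the hero starts separated from the object, then a
# transfer (a narrative program) brings the two into conjunction.
story = [
    NarrativeUnit("Hero", "Grail", State.DISJUNCTION),
    NarrativeUnit("Hero", "Grail", State.CONJUNCTION),
]
for unit in story:
    print(unit)
```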

According to Greimas, the final step in this type of structural analysis is to examine the ways in which the various narrative units (or situations) relate to each other. This is referred to as the paradigmatic analysis. The distinction between paradigmatic and syntagmatic structures is a key concept in structuralist semiotic analysis. These two dimensions are often represented as axes, where the vertical axis is the paradigmatic and the horizontal axis is the syntagmatic. A paradigm is a set of associated signs which form a defining category of meaning or significance, such as “love.” A paradigmatic analysis of a text studies patterns other than internal relationships (which are covered by the syntagmatic analysis). The use of one paradigm rather than another shapes the preferred meaning of a text. This is the author’s encoded meaning for the text which the reader must decode.

François Rastier continues the work of Greimas by developing a unified theory of Interpretive Semantics. You can learn more about Interpretive Semantics and Semantic Classes by reading this overview.

Again, why should we care in this modern age? What relationship do narratology and storytelling have to a business world seeking answers to domain-specific issues such as finding bad actors before they commit cybercrimes, creating algorithms to sell more widgets, or automating boring tasks in the back office so that people can focus on the creative work they do best? In reality, there is always some object that someone is seeking:

  • A business is seeking profits by providing a service to clients.
  • The clients are seeking a product or service that facilitates or eases their way of operating.
  • A hacker is trying to steal your data.

Each of these use cases is a story in progress: there is a Subject seeking an Object. Sometimes there is an Opponent or Villain trying to keep the Subject from the Object. It’s a tale as old as time. Whether talking about mythologies or common business practices, Subject → Verb → Direct Object is the formula, with variations on that theme. It’s the variations that make the story interesting.
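
To close the loop with the NLP discussion above, here is a minimal Subject → Verb → Direct Object extractor, again assuming spaCy; it only handles simple declarative sentences, but it shows the same actantial skeleton hiding in everyday business language.

```python
# A minimal Subject → Verb → Direct Object extractor (spaCy assumed).
# It only handles simple declarative sentences; a real system needs far more.
import spacy

nlp = spacy.load("en_core_web_sm")

def svo(text: str):
    doc = nlp(text)
    for token in doc:
        if token.dep_ == "ROOT" and token.pos_ == "VERB":
            subj = [t.text for t in token.lefts if t.dep_ in ("nsubj", "nsubjpass")]
            dobj = [t.text for t in token.rights if t.dep_ == "dobj"]
            if subj and dobj:
                return (subj[0], token.lemma_, dobj[0])
    return None

print(svo("The hacker steals your data."))      # e.g. ('hacker', 'steal', 'data')
print(svo("The business provides a service."))  # e.g. ('business', 'provide', 'service')
```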

“In the end, we’ll all become stories. Or else we’ll become entities. Maybe it’s the same.” –Margaret Atwood, Moral Disorder

S. Bolding, Copyright © 2021 · Boldingbroke.com


[1]      “ ... la sémiotique française a voulu voir dans l’œuvre de Propp un modèle permettant de mieux comprendre les principes mêmes de l’organisation des discours narratifs dans leur ensemble.” A. J. Greimas, “Préface,” in Joseph Courtés, Introduction à la sémiotique narrative et discursive (Paris: Hachette, 1976) 5.

[2]      A. J. Greimas, Du sens: Essais sémiotiques (Paris: Seuil, 1970); Dictionnaire raisonné de la théorie du langage (Paris: Hachette, 1979).

[3]      Greimas described the actantial system in Sémantique structurale (Paris: Larousse, 1966); he has elaborated upon his concepts somewhat in later writings, such as Du sens (1970), and in Sémiotique narrative et textuelle (Paris: Larousse, 1973).

[4]      A. J. Greimas, Du Sens…, 160.

[5]      Veridiction concerns the manner in which the intertextual category “true” signifies differentially from the other categories in the model. J. Courtés, Introduction à la sémiotique…, 131–36; see also A. J. Greimas and J. Courtés, “The Cognitive Dimension of Narrative Discourse,” New Literary History 7 (1976): 433–47.

[6]      A. J. Greimas, Du Sens…, 157–83; Claude Bremond, Logique du récit (Paris: Seuil, 1973) 11–128; Seymour Chatman, Story and Discourse: Narrative Structure in Fiction and Film (Ithaca, N.Y. and London: Cornell UP, 1978) 15–145.

[7]      The English translations of Greimas’ terms are taken from Gerald Prince, Dictionary…, 40, 67, 80, 86, 93.

[8]      Reproduced from A. J. Greimas, Sémantique structurale (Paris: Larousse, 1966) 180. For a discussion of the various actantial roles and models, see specifically pages 129–134, 172–191.

[9]      A. J. Greimas, Maupassant: La Sémiotique du texte (Paris: Seuil, 1976) 167.

[10]    “L’exemple des mathématiques, mais aussi de la logique symbolique et, plus récemment encore, de la linguistique, montre ce qu’on peut gagner en précision dans le raisonnement et en facilité opératoire si, en disposant d’un corps de concepts défini de façon univoque, on abandonne la langue « naturelle » pour noter ces concepts symboliquement, à l’aide de caractères et de chiffres.” Greimas, Sémantique structurale…, 17.

[11]    Greimas gives a clear summary of his actantial model as it relates to the formulaic representation of narrative structures in “Préface,” Introduction à la sémiotique narrative et discursive (Paris: Hachette, 1976) 5–25.

[12]    “Par isotopie nous entendons un ensemble redondant de catégories sémantiques qui rend possible la lecture uniforme du récit...” Greimas, Du Sens…, 188.

[13]    For an example of the application of such formulas to a text, see Part II of Courtés’ book, where he applies this method to Cinderella. Courtés, op. cit., 111–138. François Rastier is another critic who uses similar formulas to represent structural relations on a semantic level. François Rastier, Sémantique interprétative (Paris: Presses Universitaires de France, 1987). In particular, see Part III, “Le Sémème dans tous ses états” and Part IV, “Le Concept d’isotopie.” Donald Maddox applies the theories of Greimas to Erec et Enide in Structuring and Sacring: The Systematic Kingdom in Chrétien’s Erec et Enide (Lexington: French Forum, 1978). See especially Chapter 3, “Segmental Reading: The Structure of Content,” 41–72.

[14]    J. Courtés, op. cit., 62–100.

