IAASB Digital Technology Market Scan: Natural Language Processing

Jun 22, 2022 | English

Welcome to the fourth market scan from the IAASB's Disruptive Technology team. Building on our previous work, we will issue a Market Scan on topics from the report approximately every two to three months. Market Scans will consist of exciting trends, including new developments, corporate and start-up innovation, noteworthy investments and what it all might mean for the IAASB.

In this Market Scan, we explore natural language processing (NLP), a technology that has applications within Accessing Information & Data (NLP and Computer Vision for Digitizing Documents) and within Assessing Internal Controls (Optical Character Recognition, NLP and Machine Learning for Intelligent Document and Voice Analysis). This technology has the potential to impact many areas of the audit—enhancing the way auditors work and providing opportunities for greater insight.

We cover:

What is natural language processing and why is it important?
The latest developments
What this might mean for the IAASB

What is Natural Language Processing? Why Is It Important?

Natural language processing (NLP) is a branch of artificial intelligence that is concerned with giving computers the ability to understand, interpret and manipulate human language, both written text and spoken words.

[Figure: Artificial intelligence technologies, with natural language processing highlighted]

NLP uses a combination of technologies, including computational linguistics (rule-based modeling of human language), statistical modeling and machine learning. NLP involves both natural language understanding (NLU) and natural language generation (NLG). See the video from Simplilearn below for an explanation of how the technologies fit together.

What Is NLP And How Does It Work? | Simplilearn, five-minute watch

There are many benefits of having technology that can fully understand human speech and text (including tone and meaning), determine the right response and provide that response in a human-like format. These include:

Enabling better decision support through analysis of large quantities of unstructured data (e.g., emails, documents, social media) to understand sentiment, identify text similarity or discover specific themes.
Supporting improved communication through effective translation applications (e.g., Google Translate).
Providing around-the-clock support to customers, clients, users and other help-seekers through chatbots, voice assistants and applications that use semantic search functionality.
Assisting with efficient documentation through generative language applications that can automatically create text content, such as high-level insights or summaries.
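
To make the text-similarity idea above more concrete, here is a minimal sketch in Python. It assumes a simple bag-of-words representation compared with cosine similarity. The function names (`tokenize`, `cosine_similarity`) and the sample sentences are illustrative assumptions and are not taken from any tool mentioned in this Market Scan. Production NLP systems typically rely on learned language models rather than raw word counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Return the cosine similarity of two texts' word-count vectors (0.0 to 1.0)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Dot product over the words that appear in both texts.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Hypothetical sample texts, for illustration only.
email = "The invoice was approved by the finance manager"
policy = "Every invoice must be approved by a finance manager"
unrelated = "Quarterly weather patterns affected crop yields"

print(round(cosine_similarity(email, policy), 2))
print(round(cosine_similarity(email, unrelated), 2))
```

In this sketch the email and the policy sentence share several words and so score well above zero, while the unrelated sentence shares none and scores zero. Word counts ignore meaning and word order, which is one reason such a score is best treated as a prompt for human review rather than a conclusion.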

In an audit and assurance context, NLP-based technologies can be utilized to support and enhance auditor activities. For example:

Providing auditor assistance by delivering relevant guidance to the auditor, when and where they need it, through voice assistants and help bots.
Enriching risk identification and assessment activities by providing valuable insights about the entity and its environment, from analysis of unstructured data from a variety of sources such as regulatory notices, social media, and news articles.
Supporting an understanding of the entity's internal controls by extracting and summarizing what has been written down in process documents, emails and articles, and what has been learned from employee inquiries.
Augmenting documentation activities through automatic text generation, for example, by providing commentary from data analysis results. See the video below from Wordsmith for an example of technology that can do this.

Wordsmith: Extend the Power of Tableau, Automated Insights, two-minute watch
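
As a rough illustration of generating commentary from data analysis results, the following Python sketch turns two period figures into a sentence using a fixed template. This is a toy under stated assumptions: the function `describe_change`, its wording and the sample figures are hypothetical, and the sketch does not represent how Wordsmith or any other product works.

```python
def describe_change(metric, prior, current):
    """Generate a one-sentence commentary comparing two period values."""
    if prior == 0:
        # Percentage change is undefined when the prior value is zero.
        return f"{metric} moved from 0 to {current:,.0f}; a percentage change cannot be computed."
    change = (current - prior) / prior * 100
    if change > 0:
        direction = "increased"
    elif change < 0:
        direction = "decreased"
    else:
        return f"{metric} was unchanged at {current:,.0f}."
    return (
        f"{metric} {direction} by {abs(change):.1f}% "
        f"from {prior:,.0f} to {current:,.0f}."
    )


# Hypothetical figures, for illustration only.
print(describe_change("Revenue", 1_200_000, 1_350_000))
# prints "Revenue increased by 12.5% from 1,200,000 to 1,350,000."
```

A fixed template like this is fully traceable to its inputs, which makes it easy to review. Generative language models produce more flexible text, but their output is harder to trace back to the underlying data.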

These technologies offer opportunities, such as those described above, to the auditor or the audited entity. They could reduce manual, time-intensive activities; enhance the effectiveness of certain procedures, tasks, or actions; and augment decision making. However, they are not without risks, and adoption needs to go hand in hand with a recognition that outputs may be imperfect.

Recent Noteworthy Developments in Natural Language Processing

These recent developments may signal future disruption in this area; this is not a complete list of all activities in natural language processing. For a reminder of key venture capital and investment terms, please refer to the first Market Scan.

1. Transformers—boosting NLP development

Transformers are deep learning models used in natural language processing for translation and text summarization. Recent developments have seen these technologies trained with large language datasets, leading to a growth in pre-trained systems like Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT). Some notable developments include:

GPT-3 from OpenAI is a hugely popular model that can generate large amounts of text from a small amount of text input. It is used in many copywriting and content generation applications available today. However, the model has gained notoriety for the bias it picked up from the human bias in the internet text it was trained on.

The Gender Bias Inside GPT-3 | Made by McKinney, five-minute read

What is GPT-3 (Generative Pre-Trained Transformer 3) – Tech Target, three-minute watch

Google’s BERT is the underlying model for the Google search engine. It is due to be replaced during 2022 by MUM, the Multitask Unified Model, which is described as 1,000 times more powerful than BERT.
Megatron-Turing NLG is the latest model from Microsoft and Nvidia. With 530 billion parameters, it is described as the largest and most powerful generative language model.

2. Start-ups

Enterpret launches with $4.3M, NLP technology to decipher customer feedback | TechCrunch
- Enterpret, a start-up founded by brothers Varun and Arnav Sharma, is developing natural language models to improve insights from customer feedback. This technology, which is focused on enabling more informed product development, could have widespread application across all industries with a digital presence.
Cohere raises $125m in Series B Funding
- Cohere builds large language models and makes them available through an Application Programming Interface (API). In November 2021, Google Cloud announced a multiyear partnership with Cohere to provide infrastructure for the development of its platform.
Popular AI Writer Jarvis Rebrands to Jasper
- This popular Y Combinator-backed start-up has built an AI copywriting assistant that writes original, creative content, including blog articles and social media posts.
OpenAI develops new tool to create code
- OpenAI, which developed GPT-3 for generating text, has leveraged this technology, working with GitHub, to develop a tool called Copilot that helps generate code. It is part of a growing trend in developing technology that enables everyone to become a code creator.

What this might mean for the IAASB

The IAASB is interested in maintaining its collective knowledge base on digital technologies, including on specific sub-topics such as NLP, promoting digital readiness and enablement through its engagement with stakeholders, and encouraging action by others to supplement and support standard-setting activities.

The IAASB is also keen to explore how technologies such as NLP could be used to enhance interaction with auditing standards. Subject to the IAASB’s work plan decisions, possible use cases of digital technologies such as NLP, for both audited entities and audit engagements, might provide input for further modernizing the IAASB’s standards so that they are adaptable to and reflect the current business and audit environment (while recognizing that the standards would address digital technologies in a principles-based manner).

NLP-based technologies present exciting opportunities for auditors to enhance the effectiveness of procedures, tasks, or actions. Widespread adoption is also likely to be more straightforward than for other technologies as it can often be introduced with limited training requirements. From the perspective of both an auditor performing an audit engagement and a firm’s quality management around the technology resources used by engagement teams, the opportunities as well as the potential challenges should be considered.

Technologies that make suggestions to the auditor, for example, about risks they may not have considered (using data extraction) or about appropriate next steps in the audit process (using a chatbot or voice assistant), offer just-in-time guidance but may give rise to automation bias, that is, the tendency to over-rely on automated aids or, more broadly, on the outputs from technology solutions. In addition, challenges are likely to exist around the availability of suitable data sets for training these support technologies to provide appropriate recommendations, given that facts and circumstances will differ from audit to audit.

Technologies that help auditors create documentation, for example, those that summarize documents or those that provide commentary on data analysis, warrant consideration about how they may blur the lines of responsibility. With these technologies often acting so seamlessly with humans, reviewers of documentation prepared by a combination of computers and humans may find it difficult to determine who did what and therefore decide on the appropriate level of scrutiny required. A principles-based approach continues to recognize that the responsibility for documentation remains that of the auditor, including that the documentation provides a sufficient and appropriate record of the work performed and conclusions reached.

Useful Articles and Resources

Interesting Story

AI helps historians complete ancient Greek inscriptions damaged over millennia | TechCrunch, two-minute read

DeepMind, the Google-owned AI company behind the AlphaGo program that was the first to beat a professional Go player, has now developed a new tool called Ithaca. Ithaca is a machine learning model that guesses what the missing words might be in incomplete ancient Greek texts, potentially shedding light on the meaning of inscriptions that are thousands of years old.
