Technology that speaks the same language

Mouser Electronics
By Paul Golata, Senior Technical Content Specialist
Wednesday, 01 July, 2020



Introduction

‘Goo goo gaga’ is a simple interjection. It is an onomatopoeia, a term derived from the Greek ‘onoma’ (‘name’) and ‘poiein’ (‘to make’), meaning ‘to make a name’. The phrase resembles the sounds that a baby makes before being old enough to carry on a conversation. Somehow, as if by osmosis, babies naturally learn to speak and ultimately converse in the native language that surrounds them. If only it were like this for computers and machines; instead, they need to be programmed by humans.

Natural language processing (NLP), a subset within the domain of artificial intelligence (AI), combines high technology with the linguistics of human language, enabling machines and humans to communicate. It provides machines with the ability to understand both written and spoken human communication. It allows humans and machines to speak the same language and to talk to each other to exchange information and ideas. This article will look at how AI is helping us to have conversations with our machines.

Human language

The field of linguistics studies the methods and manner whereby processed information is communicated both internally and externally to things with intelligence. Humans use what is called natural language. Natural language can be coupled with technology. Conversational AI is helping to merge humans’ sophisticated communication and intellectual abilities with their technological capabilities.

Human communication happens by way of complex symbols that are perceivable to the senses. Notable examples include speech (hearing), written or sign language (sight), and physical contact such as handshakes and hugs (touch). Raymond Kurzweil, who was hired by Google in 2012 with the mission of bringing natural language understanding to the company, asserts that language was humanity’s first invention. Language is the means by which humans work together, build a society, develop a culture and create technology.

All intelligence-manipulating communication requires a method to structure language. Grammar, syntax and discourse provide structure to a language so that its constituent components may be appropriately understood and interpreted. Author, critic and educator Neil Postman (1931–2003) held that language is “pure ideology” and should be viewed as an “invisible technology”. By this, Postman meant that language is not neutral: it reflects the starting assumptions of those who use it, and its use frames the entire informational content that intelligence utilises.

Interpretation is the art of receiving a communication and processing it in the manner its communicator intended. The circumstances surrounding the specific grammar, syntax and discourse employed are called the context, which provides the external and internal environment within which the information is processed. Context is therefore critical to ascertaining what a communication means from the perspective of the communicator’s intention. Because that intention matters, the issue of agency is brought to bear. If there is no intention behind a message, then whatever is transmitted cannot produce meaningful action, since it derives only from happenstance, from which no meaning can be asserted.

Humans can create and utilise symbols to express themselves in new and unique ways without limitation. A human can cry in pain, read Shakespeare or sing an opera. These symbols come to have meaning as a result of social interaction and agreement. Because human language is based upon social purpose, it allows both change over time and unlimited variety as society develops new symbols to communicate what people experience.

Other life forms communicate in ways that are natural yet distinct from human language. This communication is generally a form of signalling understood within the species, but it does not involve the manipulation of symbols and creative thought. For example, a dog’s bark may provide information to other dogs, who understand it in a manner beyond general human understanding. Animals may also communicate in ways that are not understandable to humans at first glance, such as the dance that bees perform to indicate the direction to fly to obtain pollen for the hive. Scientists recognise that, even though animals may communicate with other animals of their species, there is no animal, including apes such as chimpanzees, that can manipulate signs and symbols to the degree that humanity can. Animals communicate only about particular contexts and not about universal or abstract relationships.

From humans to machines

In contrast, machines and computers do not use human language. Their intelligence takes the form of AI. All AI utilises programming, which enables it to receive information, compute and act in an attempt to make sense of what it is experiencing. Humans created these machines, and the programming languages that direct them, so that they could take part in what the machines are capable of doing.

These languages follow a specific set of rules that have been agreed upon by social and primarily scientific convention. Because of the general desire to be utilised universally, they are most frequently constructed with formality; that is, there is a universally agreed-upon method to the logic contained within the artificial language. Artificial languages (machine code or code) can be set up to perform specific predefined tasks.

Programming is the art and science of writing machine code. It is performed by manipulating the functionally equivalent elements found in human language, including grammar, syntax, semantics and discourse. Programming is initially set up by humans but can be assigned to be done by machines (robots/computers) after the initial set-up. An algorithm is a set of instructions that have been formatted and arranged to achieve a specific function. Programming code is generally broken down into a long series of discrete binary digital signals. These signals, representing particular ON and OFF sequences, are then stored, analysed and processed in conjunction with the available intelligence of the machine. All AI programming is based upon human conceptions of structure. AI semantics and syntax thus function in a manner that emulates humans rather than, for instance, another species like apes, dolphins or rats.
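As a rough illustration of the binary representation described above, the following Python sketch turns a short piece of text into a sequence of ON (1) and OFF (0) values and back again. The function names `to_bits` and `from_bits`, and the choice of UTF-8 as the text encoding, are assumptions made for this example only.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and return its bits as a string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the original text from a string of 0s and 1s (8 bits per byte)."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    bits = to_bits("Hi")
    print(bits)             # 0100100001101001
    print(from_bits(bits))  # Hi
```

The round trip shows that nothing is lost: the machine stores and processes only the ON and OFF sequence, and the human-readable text is recovered by applying the agreed encoding in reverse.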

Large model sizes

Human language is vast and complicated. It is a collection of shared knowledge and wisdom. Understanding and meaning are derived from experience and context. The sheer number of variables means that the model size for one language, such as English, is vast. When expanded to understand other languages that operate in different ways, such as Chinese, French, German, Hindi, Japanese and Spanish, the model sizes required are genuinely staggering. Language models must train on the largest and broadest data sets available to capture the finest nuances implied in a message. The upshot is that AI and NLP models must be able to handle a vast amount of data and access it quickly and efficiently to have everything needed for understanding.

High computation demands

Machines must be able to train themselves quickly from a vast field of language in order to understand humans. This requires high computational capabilities. GPUs, FPGAs, CPUs, ASICs, crossover processors and microcontroller units (MCUs) are among the building blocks of a successful implementation. Let’s look further at how a crossover processor might be part of a conversational AI solution.

Applications processors and MCUs are both employed in embedded applications. Applications processors provide excellent integration and performance, while MCUs are easy to use and low cost. NXP Semiconductors has combined the two approaches in a single low-cost part, the crossover processor, that provides high performance, low latency, power efficiency and security. Such a product is well suited to a variety of human language and voice-assistance applications.

NXP Semiconductors’ i.MX RT106A Crossover Processor is a solution-specific variant of the i.MX RT1060 family of MCUs, targeting cloud-based embedded voice applications. It features NXP’s advanced implementation of the Arm Cortex-M7 core, which operates at speeds up to 600 MHz to provide high CPU performance and the best real-time response. i.MX RT106A-based solutions enable system designers to easily add voice control capabilities to a wide variety of smart appliances, smart home, smart retail and smart industry devices. The i.MX RT106A is licensed to run NXP turnkey voice-assistant software solutions, which may include a far-field audio front-end softDSP, wake-word inference engine, media player/streamer and a host of associated items.

NXP Semiconductors’ i.MX RT106A Crossover Processor. (Source: Mouser Electronics)

Instant inferencing

As well as training quickly on the massive field of human language, machines must be able to perform inference exceptionally fast, with extremely low latency if not in real time. Products like Intel Xeon Second Generation Scalable Gold Processors are designed to deliver excellent inferencing results.

Intel’s Xeon Second Generation Scalable Gold Processors are optimised for inferencing. (Source: Mouser Electronics)

These Intel processors are 64-bit, multicore server microprocessors built on a 14 nm lithography process. They are based on the Cascade Lake microarchitecture, which allows for higher clock speeds, and are optimised for demanding mainstream data centre, multi-cloud computing, and network and storage workloads. They offer up to 22 cores/44 threads and feature Intel Turbo Boost Technology 2.0, which ramps up to 4.4 GHz. The processors also feature up to four-socket scalability and support up to 46 bits of physical address space and 48 bits of virtual address space. The devices take embedded AI performance to the next level with new AI acceleration, including Intel Deep Learning Boost.
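To put the 46-bit and 48-bit address figures quoted above into more familiar units, the short Python sketch below converts a number of address bits into addressable memory. It assumes byte-addressable memory, so each extra address bit doubles the addressable range; the helper name `addressable_tib` is hypothetical and used for illustration only.

```python
TIB = 2 ** 40  # one tebibyte in bytes


def addressable_tib(address_bits: int) -> int:
    """Return how many tebibytes can be addressed with the given number of bits.

    Assumes one address per byte, so the range is 2 ** address_bits bytes.
    """
    return (2 ** address_bits) // TIB


if __name__ == "__main__":
    print(addressable_tib(46))  # 64  (physical address space, in TiB)
    print(addressable_tib(48))  # 256 (virtual address space, in TiB)
```

In other words, 46 bits of physical address space corresponds to 64 TiB and 48 bits of virtual address space to 256 TiB, which gives a sense of the headroom available for the large language models discussed earlier.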

Conclusion

Technology is talking sense. The implementation of AI and NLP is an example of an emerging technology that will allow humans and machines to communicate seamlessly and in real time. Humans and machines are now starting to speak the same language.

Top image credit: ©stock.adobe.com/au/phonlamaiphoto


All content Copyright © 2020 Westwick-Farrow Pty Ltd