Print.IT - issue 47

SPEECH RECOGNITION

into context. Location awareness

could indicate you mean Green

Park in London and help make

assumptions about transportation

mode. If you were sitting at

Piccadilly Circus, the answer could

be ‘Take one stop, Westbound, on

the Piccadilly line’. But what if you

meant Green Park in Manchester or

Birmingham?
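
To make the location-awareness idea concrete, here is a minimal sketch of how a system might choose between several places sharing one name by picking the candidate nearest to the speaker. It is an illustration only: the `Place` type, the `resolve` function and the candidate list are hypothetical, and the Manchester and Birmingham entries use approximate city-centre coordinates as stand-ins rather than the positions of real parks.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Place:
    name: str
    city: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def resolve(name, user_lat, user_lon, candidates):
    """Return the candidate with the requested name that is closest to the user."""
    matches = [p for p in candidates if p.name.lower() == name.lower()]
    if not matches:
        return None
    return min(matches,
               key=lambda p: haversine_km(user_lat, user_lon, p.lat, p.lon))


# Hypothetical candidates; coordinates are approximate and for illustration only.
CANDIDATES = [
    Place("Green Park", "London", 51.5067, -0.1428),
    Place("Green Park", "Manchester", 53.4808, -2.2426),   # city-centre stand-in
    Place("Green Park", "Birmingham", 52.4862, -1.8904),   # city-centre stand-in
]

# A speaker standing near Piccadilly Circus (approximate coordinates).
best = resolve("Green Park", 51.5101, -0.1340, CANDIDATES)
print(best.city)
```

Nearest-match is only one possible heuristic. A real assistant would probably also weigh travel history, calendar entries or a clarifying question before committing to an answer.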

The search for a deeper meaning

The real challenge lies in what’s

behind the voice recognition system

– from the integration of IoT devices

to the system itself – and in

ensuring that requested commands

make sense. To achieve this, we

need to use cognitive engines as a

check and validation system.

Think of someone accidentally

giving a command to ‘Turn off

cooling system to reactor 4’,

instead of reactor 3, or of a doctor

using the system to prescribe

a harmful dose of medication

because he accidentally said 400

grams instead of 400 milligrams.

There will need to be a holistic

view of actions being automated to

prevent human error and broader

intelligence to understand the

actions related to voice-controlled

requests. For example, even if ‘Turn

off cooling system to reactor 4’ was

correct, the system would need to

understand a set of operational

procedures to implement the

command.
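
As a rough illustration of what such a check and validation layer might look like, the sketch below converts a spoken dose to milligrams and compares it with a configured ceiling before anything is executed. Everything here is an assumption made for the example: the drug name `exampledrug`, the 1,000 mg limit and the function names are hypothetical placeholders, not clinical guidance or any vendor's API.

```python
import re

# Conversion factors to milligrams for the units this sketch understands.
TO_MG = {"microgram": 0.001, "milligram": 1.0, "gram": 1000.0}

# Hypothetical safety ceiling per single dose, in milligrams (not medical guidance).
MAX_SINGLE_DOSE_MG = {"exampledrug": 1000.0}

DOSE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(microgram|milligram|gram)s?\b")


def parse_dose_mg(utterance):
    """Extract the first 'number + unit' pair and return the dose in milligrams."""
    match = DOSE_PATTERN.search(utterance.lower())
    if match is None:
        raise ValueError("no dose found in utterance")
    return float(match.group(1)) * TO_MG[match.group(2)]


def check_prescription(drug, utterance):
    """Return (allowed, reason); unknown drugs and over-limit doses are refused."""
    limit = MAX_SINGLE_DOSE_MG.get(drug.lower())
    if limit is None:
        return False, "no limit configured for this drug; ask for confirmation"
    dose = parse_dose_mg(utterance)
    if dose > limit:
        return False, f"{dose:g} mg exceeds the {limit:g} mg limit"
    return True, f"{dose:g} mg is within the {limit:g} mg limit"


print(check_prescription("exampledrug", "prescribe 400 grams"))
print(check_prescription("exampledrug", "prescribe 400 milligrams"))
```

The same pattern, a rule table consulted before a command is carried out, could in principle hold the operational procedures for the reactor example, although a genuine cognitive engine would need far richer context than a lookup of limits.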

Creating an API platform for true

voice integrated solutions

An interesting element that

could tie in strategically with

the development of true voice-

controlled enterprise environments

comes from innovations in the

traditional voice communication

world, where we are seeing an

explosion of CPaaS (Communications

Platform as a Service), which uses

APIs to integrate voice into other

applications.

Some major voice

communication vendors are now

entering this market, providing

CPaaS infrastructures with a

standardised set of APIs that

enable companies to integrate

communications into their business

processes.

While we traditionally look at

integration in terms of incorporating

voice and video services into

existing applications – think of a

banking application that allows

you to move from an online

application to a voice call with a

banking advisor – I believe these platforms

will play a big part in a ‘voice-first’

environment by leveraging the rich

API infrastructure of CPaaS to

communicate with applications and

things.

How CPaaS and other platforms

communicate with devices really

needs to be standardised before

we see rapid development of voice

technology. All today’s consumer-

based voice-controlled systems

have their own interfaces, their

own API integrations and, as with

the Beta vs. VHS battle from

decades ago, the potential for

product obsolescence. Just as a

consumer doesn’t want to invest

in the latest smart coffee maker

only to see the platform that

controls it be discontinued, so an

enterprise wants to make sure

that the investment it makes in

new technologies won’t become

obsolete before it is able to realise

a return.

The good news is that there

is a set of technologies in the

works to help minimise potential

obsolescence, with frameworks like

IoTivity being developed to build a

standardised platform.

The best is yet to come

We are already seeing the value,

benefits and rapid expansion

of new voice applications for

consumers, and in the near term

we will see basic use cases move

into the enterprise. Longer term,

as advances continue to be made

in voice recognition, voice security

and simplification/standardisation

in device connectivity, we will

see more and more voice-first

activities in both the consumer

and enterprise world, helping to

reduce complexity and improve our

productivity.

Craig Walker is Director of Cloud

Services at Alcatel-Lucent Enterprise

(ALE). He has more than 25

years’ experience in publicly held

telecommunication companies,

start-up ventures and within the

partner environment. He has been

with ALE since the acquisition of

Xylan Corporation in 2000, where he

was Technical Director EMEA.

Hands-free simplicity

Is voice the answer to office productivity?

Anyone who has ever struggled with copier

settings and wished for an easier way of

completing complex copy jobs will appreciate the

Vision-e Voice app for Xerox ConnectKey MFPs,

including the new VersaLink and AltaLink series.

The solution, which combines

voice recognition technology,

an MFP app and an Amazon Echo Dot

speaker, lets you interact with an

MFP through spoken requests,

such as ‘Please make 20 double-

sided copies’, ‘Please scan and

email to Mike Jones’ or ‘Please

request a service call’.

The MFP talks back through the Amazon Echo Dot, giving

answers to questions like ‘Is there enough paper to

complete this job?’.

Vision-e Voice is not just useful for people with

disabilities. By providing a quick and easy way to

request information, such as toner levels, or to

initiate multi-step processes, such as scanning

and emailing, it has the potential to improve the

productivity of all employees.

Another new voice-enabled application that

provides users with instant access to information

has just been launched by BrightHR.

It has integrated its people management

software with Amazon Alexa, creating a virtual

assistant who can answer absence-related queries,

such as ‘Alexa, ask BrightHR who is out today?’ and

‘Alexa, ask BrightHR is Dave out on October 23rd?’.

Alastair Brown, Chief Technical Officer at

BrightHR, said: “There’s no need to switch to

browsers or open an app either. People can

keep working while they ask Alexa a question.

At the heart of this is the drive to reduce people

workloads. Voice is a much more natural way to

interact and if voice technologies continue to

develop further this is likely to transform the way

businesses interact.”

He added: “At BrightHR HQ, on top of keeping up

to date with who’s out of the office, Alexa is being

used as a brew roulette where it decides whose turn

it is to make teas and coffees for their team. This

is decided simply by saying ‘Alexa, ask BrightHR

brew-lette’.”
