Worth its SALT

Guest Opinion: SALT (Speech Application Language Tags) is a speech-markup language that sets the scene for a new dimension of user interaction with electronic devices.

Stephen Potter, Microsoft and SALT Forum Technical Working Group -- EDN, 9/10/2003

SALT (Speech Application Language Tags) is a speech-markup language that sets the scene for a new dimension of user interaction with electronic devices. SALT enables both voice-driven and so-called "multimodal" Web applications—those that blend voice interaction with more conventional interface modes. The technology integrates directly with visual and core Web technologies, enabling the enrichment of applications across a wide range of devices—mobile, telephony, desktop, and beyond—with the most natural and effective user interface known to humans: speech.

The SALT Forum (www.saltforum.org), a group of more than 70 companies with a shared interest in speech and multimodality, developed the SALT 1.0 specification, which is available royalty-free and has been contributed to the World Wide Web Consortium (W3C). In addition, a number of companies have already announced SALT products on a variety of platforms.

Multimodal applications offer more than one modality to the end user: for example, speech input in addition to a graphical user interface (UI). Since Web pages today can host all kinds of visual and multimedia components, this enables rich interactive possibilities.

Let's consider a few scenarios. On a PDA, users typically have to peck at a miniature keyboard to enter data into Web forms. A SALT speech interface would let them speak their input directly into the form.

In other mobile scenarios, such as in warehouses or when driving, users may need largely "eyes-free" and/or "hands-free" interactions with their device. A SALT speech interface would enable voice input and output to drive the application.

In desktop or home settings, many applications would benefit from the extra dimension of a voice channel. A SALT speech interface can provide significant enrichments to the UI, such as screen reading, surfing by voice, rapid data entry, and point-click-and-speak features (such as asking a map, "How do I get from here to there?").

A number of architectures can support the convergence of user-interface modalities. SALT operates independently of the host markup language, which means that the speech interface can be integrated into whatever markup is suitable for the client device. The existing Web infrastructure remains the same.

With SALT, scripts and other code on Web pages can access and control speech functions. SALT can also be used with any current or future Web standard, including HTML (hypertext markup language), XHTML (extensible HTML), WML (wireless markup language), and SMIL (synchronized multimedia integration language). Whether developers are enhancing visual pages with speech or porting visual interfaces entirely into telephony, they'll find that SALT is a natural extension to their skill set.
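To make the idea of speech tags living inside a host page concrete, here is a minimal Python sketch. It is not a SALT browser: it only parses a small XHTML page that carries a listen element with a grammar and a bind child, then copies a value out of a mock recognition result into the form field the bind names. The element and attribute names (listen, grammar, bind, targetelement, targetattribute, value) and the namespace URI are given as I recall them from the SALT 1.0 specification and should be checked against it. The page, the file name city.grxml, the shape of the result XML, and the apply_bindings helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI as recalled from SALT 1.0; verify against the spec.
SALT_NS = "http://www.saltforum.org/2002/SALT"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical host page: an ordinary XHTML text field plus SALT tags.
PAGE = f"""<html xmlns="{XHTML_NS}" xmlns:salt="{SALT_NS}">
  <body>
    <input id="txtCity" type="text" value="" />
    <salt:listen id="listenCity">
      <salt:grammar src="city.grxml" />
      <salt:bind targetelement="txtCity" value="//city" />
    </salt:listen>
  </body>
</html>"""


def apply_bindings(page_xml: str, result_xml: str) -> ET.Element:
    """Copy values from a mock recognition result into bound page elements."""
    page = ET.fromstring(page_xml)
    result = ET.fromstring(result_xml)
    # Index every page element that has an id, regardless of namespace.
    by_id = {el.get("id"): el for el in page.iter() if el.get("id")}
    for bind in page.iter(f"{{{SALT_NS}}}bind"):
        path = bind.get("value", "")
        # ElementTree rejects absolute paths, so search relative to the root.
        if path.startswith("//"):
            path = "." + path
        node = result.find(path)
        if node is None:
            continue  # nothing recognized for this binding
        target = by_id[bind.get("targetelement")]
        # Assumption: the target attribute defaults to "value" when omitted.
        target.set(bind.get("targetattribute", "value"), node.text or "")
    return page


if __name__ == "__main__":
    filled = apply_bindings(PAGE, "<result><city>Seattle</city></result>")
    field = next(el for el in filled.iter() if el.get("id") == "txtCity")
    print(field.get("value"))
```

The point of the sketch is that the speech tags sit beside the visual markup rather than replacing it, so the same pattern would apply whichever host markup language carries them.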

In addition, speech services are componentized. That is, speech recognition and/or synthesis can be either embedded into the device or run on a remote machine. This enables smaller devices, such as cell phones, to use the resources of distant servers to run SALT applications. In addition, large or dynamic resources such as audio files and grammar rules can reside in remote locations.
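The componentized arrangement described above can be pictured as one interface with two interchangeable back ends. The Python sketch below is purely illustrative and is not part of SALT or of any vendor's API: the Recognizer protocol, both classes, the server URL, and the fake transport are hypothetical names invented for this example, and no real audio processing or network traffic takes place.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class Recognizer(Protocol):
    """Hypothetical interface that the application codes against."""

    def recognize(self, audio: bytes, grammar_uri: str) -> str: ...


@dataclass
class EmbeddedRecognizer:
    """Stand-in for an engine that runs on the device itself."""

    vocabulary: Dict[bytes, str]  # toy lookup table in place of a real engine

    def recognize(self, audio: bytes, grammar_uri: str) -> str:
        return self.vocabulary.get(audio, "")


@dataclass
class RemoteRecognizer:
    """Stand-in for an engine hosted on a distant server."""

    server_url: str
    # The transport is injected so the sketch runs without a network.
    transport: Callable[[str, bytes, str], str]

    def recognize(self, audio: bytes, grammar_uri: str) -> str:
        return self.transport(self.server_url, audio, grammar_uri)


def fill_city_field(recognizer: Recognizer, audio: bytes) -> str:
    """Application code: identical whichever back end is plugged in."""
    return recognizer.recognize(audio, "city.grxml")


if __name__ == "__main__":
    def fake_transport(url: str, audio: bytes, grammar_uri: str) -> str:
        # Pretends a server recognized the utterance.
        return "Seattle"

    local = EmbeddedRecognizer({b"\x01\x02": "Seattle"})
    remote = RemoteRecognizer("https://speech.example.invalid", fake_transport)
    print(fill_city_field(local, b"\x01\x02"))
    print(fill_city_field(remote, b"\x01\x02"))
```

Because the application function never learns where recognition happens, a small device could use the remote back end while a desktop uses the embedded one, which is the flexibility the paragraph above describes.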

Companies are developing SALT-enabled browsers for a number of platforms, both multimodal and telephony. For example, Microsoft will soon be releasing SALT add-ins for Internet Explorer and Pocket Internet Explorer, as well as a SALT telephony application server. In addition, the company offers a SALT software-development kit. You can find a list of additional SALT products from other companies at www.saltforum.org/products/products1.asp.

As greater numbers of Web developers become excited by the possibilities of using speech to create multimodal or telephony UIs, more and more applications will deliver on the promises of speech and multimodal capability, offering the user the richest and most natural way of interacting with the Web.




©1997-2010 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.