APIs for speech recognition and text analytics: news from the Azure AI world

Microsoft has announced several new features for Azure Cognitive Services. The name covers a set of cognitive APIs for building intelligent apps. These interfaces let machines see and hear, that is, perceive their environment. So far, more than 30 services cover decision-making, speech recognition and input, image analysis, and web search.

Text Analytics for Health is aimed at the healthcare sector and lets medical professionals draw informed insights from unstructured medical data. In light of the coronavirus crisis, Microsoft and the Allen Institute for AI have released the COVID-19 Open Research Dataset, a free collection of more than 47,000 scientific documents. In addition, a newly developed COVID-19 search engine uses cognitive search to help gain new insights for researching and combating the coronavirus.

What mood resonates between the lines

Advances in natural language processing (NLP) enable Microsoft to offer a new opinion mining capability for text analysis. According to the developers, it detects sentiment in texts and so allows, for example, more precise analysis of customer opinions on social media.
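To give a rough idea of what calling such a capability might look like, here is a minimal Python sketch that only builds (and does not send) a REST request for sentiment analysis with opinion mining switched on. The endpoint path `/text/analytics/v3.1/sentiment`, the `opinionMining` query flag, the example resource URL, and the helper name `build_opinion_mining_request` are assumptions for illustration and are not taken from the announcement; check the current API reference before relying on them.

```python
import json
import urllib.parse
import urllib.request


def build_opinion_mining_request(endpoint, key, texts, language="en"):
    """Build (but do not send) a sentiment request with opinion mining enabled.

    Hypothetical helper: the path and the query flag below are assumptions.
    """
    query = urllib.parse.urlencode({"opinionMining": "true"})
    url = f"{endpoint.rstrip('/')}/text/analytics/v3.1/sentiment?{query}"
    # One entry per input text, with ids counting up from "1".
    body = {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(texts, start=1)
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # placeholder key, never hard-code a real one
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage: inspect the request before sending it with urllib.request.urlopen().
request = build_opinion_mining_request(
    "https://example.cognitiveservices.azure.com",  # placeholder resource URL
    "<your-key>",
    ["The food was great, but the service was slow."],
)
print(request.full_url)
```

Keeping request construction separate from sending makes it easy to review the payload, for instance a batch of social media posts, before any data leaves the machine.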

Form Recognizer, which is now available, recognizes unstructured data in forms that contain tables, objects, and other elements. Previously, companies had to classify such data manually.

Communicating with users

The newly available interfaces also include Custom Commands. Developers can use them to integrate customer-specific speech functions into applications that process (Speech to Text) and understand (Language Understanding) spoken language. Voice Response and Text to Speech also enable applications to communicate with their users. Microsoft follows the low-code principle here, so the programming effort for developers stays limited.

With Neural Text to Speech, the linguistic capabilities of Cognitive Services grow by 15 new voices, all based on modern models for neural speech synthesis. They cover Arabic (Egypt, Saudi Arabia), Catalan (Spain), Danish (Denmark), English (India), Hindi (India), Dutch (Netherlands), Polish (Poland), Portuguese (Portugal), Russian (Russia), Swedish (Sweden), Thai (Thailand), and Chinese (Cantonese, Traditional and Taiwanese Mandarin).
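Text-to-speech services of this kind are commonly driven by SSML (Speech Synthesis Markup Language), where a locale and a voice are selected per request. The following Python sketch shows how such a document could be assembled. The helper name `build_ssml` and the voice name `example-neural-voice` are placeholders for illustration, not actual voice identifiers from the announcement.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, locale, voice_name):
    """Return a minimal SSML document for one voice.

    Hypothetical helper: the voice name passed in is the caller's responsibility.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        # escape() protects characters such as & and < in the spoken text.
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


# Usage: "example-neural-voice" is a placeholder, not a real voice identifier.
ssml = build_ssml("Hello & welcome!", "en-IN", "example-neural-voice")
print(ssml)
```

Escaping the input text matters because user-supplied strings often contain characters that would otherwise break the XML.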
