Explore Microsoft Cognitive Speech Services, including language translation, speech and speaker recognition, and customized language models, then create your own AI app that can translate, recognize, synthesize, and perform authentication using speech.
- 5 weeks long
- 3-5 hours per week
- Learn for FREE, Upgradable
- Taught by: Lei Ma, Scott Peterson
Online Course Details:
Microsoft Cognitive Services is a set of cloud-based intelligence services and APIs for building richer, smarter, and more sophisticated applications. The Speech APIs available in Microsoft Cognitive Services offer many ready-to-use and easy-to-consume features that help you use Artificial Intelligence (AI) to solve your business problems. In this practical course, take an in-depth look at Speech APIs, work through hands-on exercises to learn how to piece them together, and find out how to put them to work in your organization.
Start with an overview of Microsoft Cognitive Services, and then take a look at the Bing Speech API, which provides algorithms, exposed as simple REST-based service calls, to convert audio to text, understand speech intent, and convert text back to speech for natural responsiveness. Explore the Translator Speech API to add end-to-end, real-time, speech translation to applications and services. Get the details on the Speaker Recognition API, designed to perform speaker verification and identification. And dig into the Custom Speech API, which enables you to customize speech language models to perform domain-specific and use case-specific speech recognition.
Leverage the latest best practices and Fluent Design principles as you learn how to create Windows 10 Universal Windows Platform applications that can run on multiple devices, including desktops, tablets, phones, HoloLens, and Xbox consoles. Proficiency in a C-based programming language such as C, C#, C++, or Java is a prerequisite. Follow along with the instructor as you work through the labs to replicate and modify the code in the examples.
Wrap up the course by creating an application that authenticates users via speaker verification and searches for relevant and popular news articles based on information returned from the Bing News Search API. The app can also optionally translate news headlines into your language of choice, using the Translator Speech API. From a general overview to specific use cases and hands-on practice, this course gives you what you need to create AI apps with off-the-shelf features in Cognitive Services Speech APIs.
- Module 1: Bing Speech: Introduction to Microsoft Cognitive Services Bing Speech concepts and best practices, as well as integrating speech recognition and synthesis into applications.
- Module 2: Translator Speech: Introduction to Microsoft Cognitive Services Translator Speech concepts and best practices, as well as integrating real-time speech translation into applications.
- Module 3: Speaker Recognition: Introduction to Microsoft Cognitive Services Speaker Recognition concepts and best practices, as well as integrating speaker identification and verification into applications.
- Module 4: Custom Speech: Introduction to Microsoft Cognitive Services Custom Speech concepts, as well as integrating custom language models and speech recognition into applications.
- Module 5: Final Project: Developing a Universal Windows Platform (UWP) application using various aspects of Microsoft Cognitive Speech Services.
What you’ll learn
- Translate spoken content into other languages
- Perform speech synthesis and recognition
- Replace standard authentication with speaker verification
- Integrate speech commanding into app experiences
- Identify speakers via voice identification
- Build a speech and speaker recognition app