Google Cloud Speech API gets an enterprise-focused update

Google on Monday is rolling out updates to the Cloud Speech API, introducing a set of features built to meet enterprise customers’ needs. The new features mark a step toward maturity for the product, which was initially built just for Google’s internal use.

“We’ve been working on speech for more than a decade, closer to 20 years… but primarily we’ve always been centered on making Google’s products better and creating a greater experience for Google users,” said Google Cloud product manager Dan Aharon. “Then last year things changed. We kicked the gears up a little in terms of cloud efforts, and we wanted to help third party companies take advantage of machine learning.”

Google has made the same pitch across its Cloud Platform: customers can take advantage of the same cutting-edge technology that powers Google’s own products.

So when the Cloud Speech API was released in beta last year, it represented the first phase in Google’s journey as a cloud vendor, in which it could take its own tools and offer them to other companies. “We’re now looking at what our cloud customers need and doing some R&D to support that and build better products,” Aharon said.

The first new feature is better long-form audio support. The API now accepts files up to three hours long, up from 80 minutes. Files longer than three hours can be supported on a case-by-case basis by applying for a quota extension.

Longer files, Aharon said, will enable a range of use cases, such as analyzing calls between customer support agents and customers, or video transcription services.

Google is also adding word-level timestamps, its most requested feature. With word-level timestamps, users can jump to the exact spot in a file that they’re looking for. The feature is clearly useful for any kind of transcription service. Aharon said that after Google improved the quality of its Speech API, the most common critique from prospective customers on the fence about the product was, “They like the quality, but they’re held back because transcripts don’t come up with timestamps.”
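To illustrate how word-level timestamps let a user jump to a spot in a recording, here is a minimal Python sketch. It does not call the Cloud Speech API; the `TimedWord` type, the `find_word_offsets` helper and the sample transcript are all hypothetical, and it simply assumes a transcript is available as a list of words with start and end times in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimedWord:
    """Hypothetical record: one transcribed word and its time span in seconds."""
    word: str
    start_sec: float
    end_sec: float


def find_word_offsets(words: List[TimedWord], query: str) -> List[float]:
    """Return the start time of every occurrence of `query` (case-insensitive)."""
    target = query.lower()
    return [
        w.start_sec
        for w in words
        # Strip trailing punctuation so "support." still matches "support".
        if w.word.lower().strip(".,!?") == target
    ]


# Made-up sample data standing in for a timestamped transcript.
transcript = [
    TimedWord("Thanks", 0.0, 0.4),
    TimedWord("for", 0.4, 0.6),
    TimedWord("calling", 0.6, 1.1),
    TimedWord("support.", 1.1, 1.7),
    TimedWord("Support", 12.3, 12.8),
    TimedWord("hours", 12.8, 13.2),
]

# A player could seek to each of these offsets in the audio file.
offsets = find_word_offsets(transcript, "support")
print(offsets)
```

A transcription service could pass each returned offset to its audio player to seek directly to the moment the word was spoken.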

Google is also adding support for 30 additional language varieties on top of the 89 it already supported, bringing the total to 119. The additions include Bengali, Latvian and Swahili, and cover more than one billion speakers.

“When you think about all of the large enterprises, they have a global presence and need to be in all of these markets,” Aharon said. “For a lot of these languages, it’s the first time they’re going to have capabilities in this space.”

It should also help Google win more customers in emerging markets. Even government representatives from around the globe have expressed interest to Google in seeing more languages supported, Aharon said, because “they see it as an important part of their economic evolution.”

So far, there are “many thousands” of customers using the Cloud Speech API, Aharon said, adding that it has seen consistently strong growth over the last year. “If it continues growing at this pace, in two to three years it will be very significant,” he said.
