One of our missions at HackerNoon is to make our platform available to all readers and writers across the globe. That means optimizing for niche use cases and making our platform as accessible as possible. As such, we're excited to launch a brand new AI-powered feature that will make our stories more accessible to the visually impaired, and make it easier to enjoy our content while you're on the go. We've added an audio player that uses a synthetic voice to read aloud a full transcription of the story. Now, you can listen to HackerNoon stories while you're commuting, exercising, folding laundry, or whatever else life has you doing.
What Does the Synthetic Audio Player Mean for Writers?
While the main goal was to improve reader accessibility, this feature should do wonders for our writers too. First, the audio player gives your readers the option to listen to your story instead of reading it, turning an active reading experience into an active or passive listening one.
Sometimes readers get fatigued and stop reading your story halfway through, or even sooner. Listening, on the other hand, demands less attention and could increase a reader's time on site.
Basically, a longer time on site = a more valuable story in the eyes of search engines. This audio player should keep readers on your stories longer, improving time on site and, in turn, your rankings on Google and other search engines.
Learn more about the feature from the lead developer himself, Marcos Fabian, below...
This thread by Limarc Ambalina and Marcos Fabian occurred in Slogging's official #software-development channel, and has been edited for readability.
Maybe some of you have noticed and maybe some of you haven't, but there is a little green icon on some HackerNoon stories. Ever wonder what that icon means?
Well, if you haven't noticed by now, HackerNoon has added full synthetic audio transcriptions to our stories!!!! Every story published from now on can be read aloud to you by one of our synthetic voices.
Today I'm joined by the sole developer on this project, Marcos Fabian, to talk a bit about how he made it.
Marcos, tell us a bit about this awesome feature! How did you build it? How many voices are there? How does it work?
Hey Limarc Ambalina, this feature is awesome. I personally use it a lot and I am glad it is now live. It simply adds audio readability to our stories. It comes with 4 different voices: 2 in US English (one male, one female), 1 with a British (GB) accent (female), and 1 with an Indian (IN) accent (male). It is really amazing how natural these voices sound.
If an article has the audio icon, it means audio has already been produced for that article. You simply click on the play button and it will read the story to you. Users can switch to the voice of their choice. As you scroll down an article, there is also a play button on the nav bar, which lets you pause and play the article as you go.
Thanks for the intro Marcos Fabian. Having personally worked with you on this project, I know the ride we went through to get where we are now 🤣. Without naming names, we looked at companies that provide this software on a SaaS basis, and we looked into a few of these providers quite thoroughly.
Can you tell people a bit about why we felt it'd be best to build this in-house instead of paying for a pre-built solution on the market, and what APIs we used to make this happen?
Yessss, it was a long road but we got here. To be honest, when I took this project, I always felt like there was a gap in our articles. Just like myself, many people are active listeners and not so much active readers, so I knew I wasn't the only one. Working on this was sort of personal, as I understood the importance of this feature for our users and for myself.
HackerNoon is unique: its colors, its people, its users. We are not like everyone else, and so we explored multiple options when it came to developing this feature. We initiated conversations with a few companies that we believed would build it almost 0% their way and almost 100% HackerNoon's way 😬. Unfortunately, it didn't work out, as we are really picky ✌ I mean we were looking for more customization and for things to look a bit more like the HackerNoon style. So partnerships did not go as expected, and I sort of took over this project and found Google's Text-to-Speech API, which was cheaper and provides a good way of converting text into audio. So I used this API to bring audio to our articles.
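For readers curious what that API call looks like, here is a minimal sketch using Google's official Node.js client. The voice name, output path, and function name are illustrative assumptions, not HackerNoon's actual configuration.

```typescript
// Minimal sketch: synthesize a story's text with Google Cloud Text-to-Speech.
// The voice choice and output path below are placeholders.
import * as textToSpeech from '@google-cloud/text-to-speech';
import { writeFile } from 'fs/promises';

async function synthesizeStory(text: string, outputPath: string): Promise<void> {
  const client = new textToSpeech.TextToSpeechClient();

  const [response] = await client.synthesizeSpeech({
    input: { text },
    // One of several available voices; the exact voices used in production may differ.
    voice: { languageCode: 'en-US', name: 'en-US-Wavenet-D' },
    audioConfig: { audioEncoding: 'MP3' },
  });

  // audioContent holds the MP3 bytes, which can then be stored and served with the story.
  await writeFile(outputPath, response.audioContent as Uint8Array);
}
```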
The main challenge was literally converting the actual stories into readable text. When we go to one of our stories, we see the story, but in reality all there is is a bunch of DOM elements. Things got trickier because we have two editors, so I had to convert one format into the other, then extract the text, pass it to the API, and finally get the audio file back. Saving and hosting the audio was a challenge too, but the actual conversion was really intense in terms of work.
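The extraction step could look something like the sketch below: walking the article markup and keeping only the elements that should be spoken. This is just an illustration with assumed selectors and jsdom as an example parser, not HackerNoon's actual pipeline.

```typescript
// Illustrative sketch: reduce an article's HTML to plain, speakable text
// before handing it to the text-to-speech API.
import { JSDOM } from 'jsdom';

function extractReadableText(articleHtml: string): string {
  const { document } = new JSDOM(articleHtml).window;

  // Keep elements that carry spoken content; skip code blocks, figures, and embeds.
  const readable = document.querySelectorAll('h1, h2, h3, p, li, blockquote');

  return Array.from(readable)
    .map((el) => el.textContent?.trim() ?? '')
    .filter((text) => text.length > 0)
    .join('\n');
}
```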
The coolest part was designing our own audio player, which is built with plain CSS, no extra library. The audio player is not 100% there yet, as I still want to fix a few styling issues, but overall it's looking good and doing the job.
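As a rough illustration of the "no extra library" approach, the player logic can be little more than the browser's native Audio API wired to a styled button. The element id and audio URL here are hypothetical placeholders.

```typescript
// Rough sketch of a dependency-free play/pause control; the visuals live in plain CSS.
const audio = new Audio('/audio/story-en-us.mp3');
const playButton = document.getElementById('hn-audio-play') as HTMLButtonElement;

playButton.addEventListener('click', () => {
  if (audio.paused) {
    void audio.play();
    playButton.classList.add('playing'); // CSS swaps the play icon for a pause icon
  } else {
    audio.pause();
    playButton.classList.remove('playing');
  }
});
```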
And I'm loving your design! Especially the use of HackerNoon's robot characters.
So would you say, Marcos Fabian, that if you're not looking for customization and just want audio on your site ASAP, a 3rd-party provider could be a good option, but if you want customization for the player and its features, it's probably faster and cheaper to build it yourself?
And in your research, was the Google Text-to-Speech API the cheapest option you could find?
Yes, there is a little more to it. If we were to have this feature from a partner:
- we would not be 100% owners of it
- we would have to accept their designs, layout, colors, conditions, etc.
- it is just not the way we do stuff at HN
We love our partners, but when it comes to dropping a feature, we want to be as unique as possible so that the feature can be best used by our users. It shows dedication and that someone cares. Us developers also read and navigate the web looking for information, so we understand what a good experience is, and I wanted to provide that in a personalized way. 3rd parties are good, depending on your interests. We are grateful to have a great team that guides us to build features like this, including you Limarc Ambalina. 👍
The Google Text-to-Speech API offers great synthetic voices, with different accents, genders, and languages. It was cheaper compared to our previous partner. I also love the challenge that comes with building this feature from scratch 😁 without involving a 3rd party.
Awesome Marcos Fabian! Final question: what iterations do you think our readers can expect in the future? What do we have in the works? I'm assuming we'll be working on adding voices with other accents and things like that?
Yes, we definitely want to start by providing audio to every single story that we have; right now we are only providing audio to recent stories. Once that is done, we will personalize and optimize the audio player a little more so that it looks smoother. Lastly, providing more languages is almost a done deal because of the impact this feature has had, but it is still up for discussion. We have different stages that we want to complete before getting there, but certainly, we would love users to be able to listen to stories in multiple accents. Maybe it will be done via requests, or integrated into users' settings; we will find a way to expand this feature to make sure users get the best out of it.