A Peek At What I Am Doing In My Voice First Lab

Here Are Some Aspects Of My Research.

I am a solo researcher with no budget, working in a garage lab. Keep this in mind when you compare what I am doing with the work of billion-dollar companies that employ literally thousands of engineers, researchers, and scientists.

My Voice First systems use a number of techniques. Here is one I use to create deeper meaning and deeper context: I have been successful in using regional accents and on-demand sound effects during long narratives to enhance the theatrical effect.

The example below is from my 2011 Quora posting [1]. It was directly read by my custom Voice First systems using the AI I have built to add real-time theatrical character to long-form text passages. I use a number of cues that activate the sonic landscape.
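To illustrate the general idea of cues activating sound effects, here is a minimal Python sketch. It is not the system described above, whose internals are not published here. The cue table, the sound names, and the `plan_sound_cues` function are all hypothetical, and the sketch only plans which sound would accompany which sentence. It does not synthesize any speech or audio.

```python
import re

# Hypothetical cue table: a keyword in the text maps to the name of a
# sound effect that could be layered under the spoken sentence.
CUE_SOUNDS = {
    "launch": "rocket_rumble",
    "space": "deep_space_hum",
    "radio": "radio_static",
}


def plan_sound_cues(text):
    """Return (sentence_index, cue_word, sound_name) tuples in reading order."""
    # Split into sentences at whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    plan = []
    for index, sentence in enumerate(sentences):
        # Lowercase the sentence and pull out its words, ignoring punctuation.
        for word in re.findall(r"[a-z']+", sentence.lower()):
            if word in CUE_SOUNDS:
                plan.append((index, word, CUE_SOUNDS[word]))
    return plan


passage = (
    "The probe left after launch. It now drifts through space. "
    "A faint radio signal still reaches us."
)
for entry in plan_sound_cues(passage):
    print(entry)
```

A real-time system would presumably hand a plan like this to an audio mixer alongside the text-to-speech output. Simple keyword matching is used here only because it is easy to follow.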

The systems are experimental, and I have thus far stabilized 12 English-language accents with over 400 sonic landscapes. It is early days, but in my tests I have found that a vast majority of users prefer this form of Voice First interaction over monotone, dry interactions for long-form passages.

One secret I will share about my research at this time concerns comprehension and retention. When the correct sonic landscape and soundicons are used along with the correct vocal type, especially for long-form passages, my research shows a consistent 62% increase in comprehension and retention. You can try this with the passage below. You will remember far more than if you just read it yourself or had plain Siri or Alexa read it. Comprehension and retention are vitally important, and why will become clearer over the next few years.

Sonic landscapes overlaid on top of real-time text-to-speech will become commonplace over the arc of the next 10 years.  It all started in my garage.

Both the vocal passage below and its sonic landscape are fully synthesized in real time:




[1] What is the farthest human-made object?





