By Sivani Voruganti, MEng ’23 (EECS)
This Life in Tech interview is part of a series from E295: Communications for Engineering Leaders. In this course, Master of Engineering students were tasked with conducting an informational interview to learn more about working in tech. They then submitted a written account of the interview, edited and organized to create a clear, compelling narrative.

“VUIs serve to empower people–of different ages, levels of literacy, socio-economic backgrounds, and those with mobility or visual impairments–to become more included in everyday society.”

For instance, in developing countries like India, with the recent proliferation of affordable smartphones and some of the cheapest mobile data prices in the world, a new kind of internet user has emerged–one that relies more on voice and video than text and typing.² As millions of people from all walks of life get online for the first time, voice technology plays an enormous role in digital inclusion, enabling less-educated or illiterate communities to access and participate in digital interactions like messaging, social media, and even financial transactions.

However, my interviewee believes that VUI technology still needs to mature in the area of speech recognition. Recently, there has been discussion and debate about the socio-economic and racial biases that can become unintentionally ingrained in speech recognition machine learning models. Often, the voice data fed into the models at training time is itself biased or not diverse enough. Web-mined training data may not capture the variations of all the world's languages, dialects, and accents, causing trained speech recognition models to underperform for, or even exclude, certain under-represented populations. For instance, due to the limitations of their training datasets, many large Natural Language Processing models are not yet equipped to fully understand and cater to specialized modes of speech like African American Vernacular English.³

My interviewee suggests that the solution to overcoming limitations in speech recognition is perhaps both technical and organizational–improving the representation of diverse populations in speech models, while also recognizing and mitigating the systemic injustices of society at large.
After all, he says, “technology reflects the society that builds it.”

Over the next decade, IoT is poised to become mainstream–with an estimated 43 billion connected devices by 2023 alone.⁴ With every device becoming “smart,” there will likely be a proliferation of the “conversations” that humans have with their devices, enabled by a multitude of VUIs. My interviewee’s work resonates with me and has sparked a newfound interest in working on initiatives that transform Siri-like voice assistants into their future avatars and make way for the next wave of technological revolution in voice conversations between humans and ubiquitous IoT devices.

References
1. https://hbr.org/2019/05/using-voice-interfaces-to-make-products-more-inclusive
2. https://www.wsj.com/articles/the-end-of-typing-the-internets-next-billion-users-will-use-video-and-voice-1502116070
3. https://hai.stanford.edu/news/jazmia-henry-building-inclusive-nlp
4. https://www.mckinsey.com/industries/private-equity-and-principal-investors/our-insights/growing-opportunities-in-the-internet-of-things
Life in Tech: Is Voice AI Technology the Future? A Siri Engineer’s Perspective was originally published in Berkeley Master of Engineering on Medium.