
Data Engineer

Location: Cambridge, UK 

"The chance to join a growing company in a growing field does not come along very often and I have been on an amazing journey with Speechmatics. The opportunities to learn and grow come every day. We work on machine learning problems with a direct path to customer happiness and data is a huge part of achieving success in that. If data is your passion and you want to use it to fuel the next generation of machine learning breakthroughs, then Speechmatics could be the perfect place for you!"

Tom Ash, Machine Learning Engineer


Speechmatics leads the market as an any-context speech recognition engine for companies to rapidly build innovative applications at scale. The UK Government, Deloitte UK, Vonage, what3words and Adobe use Speechmatics in scenarios such as contact centers, CRM, consumer electronics, security, media and entertainment and software. Speechmatics processes millions of hours of transcription worldwide every month.  

As a recognized pioneer in machine learning voice engineering, Speechmatics enables companies to build applications that detect and transcribe voice in any context in real-time. Its neural networks consider acoustics, languages, dialects, multiple speakers, punctuation, capitalization, context and implicit meanings. In 2019, Speechmatics received the Queen’s Award for Enterprise Innovation.

Speechmatics is a rapidly growing, global company with offices in Cambridge, UK; Denver, USA; Chennai, India; and Brno, Czech Republic. With ambitious growth plans come great opportunities that are exciting and progressive. You’ll be working with some of the smartest minds in the industry on cutting-edge projects, deploying the latest machine learning techniques to disrupt the market. You’ll provide customers with the best speech technology available, while immersed in a great team and company culture. We’re building a company that strives to be world-leading and we’re looking for people who wholeheartedly believe they can add to our team, bring new ideas and join us on our journey. If that’s you, we’d love to hear from you.

Speechmatics is an equal opportunities employer and positively encourages applications from suitably qualified and eligible candidates regardless of sex, race, disability, age, sexual orientation, transgender status, religion or belief, marital status, or pregnancy and maternity.

The Opportunity

Data is a core requirement for any machine learning company, and Speechmatics is no exception. Producing the high-quality speech recognition technology that we build across many languages requires not just data, but attention to detail. That attention gives us a clear understanding of the data we use in relation to the use cases it is there to support. There is constant learning and adjustment as the company grows, use cases diversify and customers expect continuous improvements. As part of this, there is constant innovation, research and development at Speechmatics, and we now require a dedicated person responsible for managing and understanding our data and for supporting the machine learning teams in delivering on these challenges at pace.

Key Responsibilities

  • Ensuring that we are building the best machine learning models by supporting the teams with the best data available. Supporting our customers’ use cases, understanding our deficiencies and striving for excellence.
  • Ensuring that recording of the data used is undertaken to support any compliance/regulatory needs.
  • Owning the processes by which we gather, measure, understand and improve the data needed to deliver world-beating accuracy. This could include:
  • Building and supporting testing frameworks/tools to support testing of accuracy for speech recognition and related features.
  • Obtaining data for both testing and training across different use cases; identifying, coordinating and building out a network of third-party vendors to support multiple languages as needed for labelling both speech and related features.
  • Supporting the data sharing agreements with 3rd parties and the management of data transfers between customers and Speechmatics.
  • Understanding where we are deficient in data and working out how that data gap could be closed to widen support for things like accents, dialects and languages. Resolving a deficiency could mean cleaning or fixing issues with current data as well as identifying new sources of data.
  • Exploring where we are good and bad in terms of ASR quality: proactively gathering data from different acoustic setups and exploring how well we do against competitors. Supporting our understanding of where our systems are strong and where they are weak.
  • Prioritising the budget, with support from the product roadmap, to achieve the best value when obtaining data.
  • Helping to define and implement strategies to improve models through continuous-improvement data processes, data loop closure and the preparation of data for use in ML projects.
  • Taking inventory, understanding, and organising existing data, including availability, usage, and obtaining additional metadata as needed.

Role Requirements

  • Good communication skills, with an understanding of both technical and business considerations across multiple functions of the business.
  • Ability to understand data labels and to write scripts that support quality evaluation, data processing and manipulation of results.
  • Ability to understand and analyse data in order to make informed, data-driven recommendations for improvements.
  • Understanding of how to evaluate and learn from large quantities of data and find the important aspects that need attention.
  • Ability to script in Python and Bash to support the processing of data.
  • Degree in Computer Science or related field.
  • Knowledge and/or experience of automatic speech recognition and associated fields.


Competitive salary (dependent on experience), flexible working and some awesome benefits and perks.

Who are we?

Our vision at Speechmatics is to deliver products that unlock meaning at scale by applying machine learning to understanding communication.

Innovation is what we do. We build, we iterate, we develop the next thing that delivers that wow moment. We see value in building long-term, authentic relationships that last and are based on trust and honesty. With our customers, our colleagues, our leaders, our suppliers or within our local community. Our journey should be fun and exciting. We will celebrate our successes and learn from our mistakes together along the way. We embrace learning and change to grow naturally and organically as a company and individuals. We trust, we’re honest, kind and respectful.