Methodology & Objectives

To identify the characteristics of the voice that are related to chronic diseases

Colive Voice aims to better understand how your voice can be used to monitor your health. More precisely, we are looking to identify vocal biomarkers: combinations of characteristics of the voice signal that can be associated with a symptom, a disease or the effect of a treatment.

Our study searches for vocal biomarkers of various chronic diseases and common health symptoms:

  • Cancer
  • Diabetes
  • Stress
  • Anxiety
  • Fatigue
  • Depression
  • Covid-19
  • Multiple Sclerosis
  • Inflammatory Bowel Disease

Audio features extracted from the voice recordings are used to train machine learning and deep learning models to identify selected vocal biomarkers or related symptoms.

Take a look at the following representations of audio waves and notice how different they are: the first shows the absence of a symptom, while the second shows its presence (here, fatigue in patients with Covid-19).

Fig.1 Audio waveplot – Covid-19 patient with no fatigue symptom
Fig.2 Audio waveplot – Covid-19 patient with fatigue

Impact on the future of healthcare

In the future, vocal biomarkers could be used to predict disease severity, to support diagnosis or to monitor patients remotely using digital technologies. However, at this stage, the main aim of this study is to identify candidate vocal biomarkers and to study the feasibility of using the voice to monitor health.

How does Colive Voice work?

Colive Voice aims to collect data from participants worldwide, in various languages: French, English, German and Spanish. Other languages will be added later.

We simultaneously collect voice recordings and clinical, epidemiological and patient-reported outcomes (PROs) data through an anonymous survey on the Colive Voice web app.

Participants first answer a detailed questionnaire on their health status and then make 5 different short voice recordings.

Participation in the study is voluntary. All data is gathered in a single session that lasts about 20 minutes and is accessible online. You can participate directly from a smartphone, a tablet or a laptop equipped with a microphone.

Adults and adolescents above 15 years of age, regardless of their health status, can participate in the study. We hope that more than 50,000 people from all around the world will take part in the survey.

What are we going to do with the collected answers and audio recordings?

Voice recordings must be preprocessed before the data can be analyzed. Preprocessing includes steps such as resampling, normalization, noise reduction, framing and windowing, as described in the figure below, which represents the typical pathway to identify a vocal biomarker.
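To make these steps more concrete, here is a minimal sketch in plain Python of resampling, normalization, framing and windowing. It is an illustration only, not the study's actual pipeline: the function names, the 16 kHz target rate and the 25 ms / 10 ms frame and hop sizes are assumptions chosen for the example, the resampler is a simple linear interpolation without an anti-aliasing filter, and noise reduction is left out because it depends on the method chosen.

```python
import math

def resample_linear(signal, orig_rate, target_rate):
    """Resample by linear interpolation (illustration only, no anti-aliasing)."""
    if not signal:
        return []
    n_out = int(len(signal) * target_rate / orig_rate)
    out = []
    for i in range(n_out):
        pos = i * orig_rate / target_rate      # position in the original signal
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out

def peak_normalize(signal, peak=1.0):
    """Scale the signal so that its largest absolute sample equals `peak`."""
    max_abs = max((abs(s) for s in signal), default=0.0)
    if max_abs == 0.0:
        return list(signal)                    # silent input: nothing to scale
    return [s * peak / max_abs for s in signal]

def frame_signal(signal, frame_len, hop_len):
    """Split into overlapping frames; samples that do not fill a last frame are dropped."""
    last_start = len(signal) - frame_len
    return [signal[start:start + frame_len]
            for start in range(0, last_start + 1, hop_len)]

def hann_window(n):
    """Symmetric Hann window of length n."""
    if n <= 1:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def preprocess(signal, orig_rate, target_rate=16000, frame_ms=25, hop_ms=10):
    """Resample, normalize, frame and window a mono signal (noise reduction omitted)."""
    resampled = resample_linear(signal, orig_rate, target_rate)
    normalized = peak_normalize(resampled)
    frame_len = int(target_rate * frame_ms / 1000)
    hop_len = int(target_rate * hop_ms / 1000)
    window = hann_window(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(normalized, frame_len, hop_len)]
```

Calling `preprocess(samples, orig_rate=44100)` on a list of float samples returns a list of windowed frames, each a list of floats, ready for feature extraction. In practice a dedicated audio library would normally handle resampling and denoising.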

Features are then extracted from the audio signals. These are characteristics that will be used to train machine learning algorithms to automatically predict or classify a clinical, medical or epidemiological outcome of interest, alone or in combination with other health-related data.
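As an illustration of what extracting features can look like, the sketch below computes two simple, widely used descriptors for each audio frame: the root-mean-square (RMS) energy and the zero-crossing rate. This is only an assumed example. The source does not state which features Colive Voice actually uses, and the function names are hypothetical.

```python
import math

def rms_energy(frame):
    """Root-mean-square energy of one frame (a rough measure of loudness)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose sign differs."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(frames):
    """Return one [rms_energy, zero_crossing_rate] vector per non-empty frame."""
    return [[rms_energy(f), zero_crossing_rate(f)] for f in frames if f]
```

Each recording then becomes a sequence of small numeric vectors, which can be summarized (for example by averaging over frames) and passed to a machine learning model together with the questionnaire data.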

The figure below shows audio preprocessing and feature extraction in more detail.

What topics are covered in our questionnaire?

A detailed health questionnaire is associated with the voice recording and addresses the following aspects:

Basic characteristics: language, age, gender, weight, lifestyle factors, quality of life, alcohol, smoking habits

Symptoms: stress, anxiety, constipation, pain, sleep disorders, respiratory quality of life, cough, fatigue, fever…

Current treatments: for pain, cholesterol, diabetes, hypertension, anticoagulants, antidepressants, anti-reflux, hormonal treatments…

Diseases: chronic diseases (diabetes, CVD…), cancer, endocrine diseases, mental health (depression, stress…), neurological diseases, communicable diseases (HIV, Covid-19, influenza, tuberculosis, malaria, Zika)…

To see the complete questionnaire, click here