Estimating downbeat positions in a musical audio signal


We show the activation of each of the four adapted networks presented in our Ph.D. manuscript on a challenging example, as well as the mean activation of all four networks. In each case, the audio is also provided with a click superimposed at the downbeat positions estimated from the mean activation of all four networks. We see that the combination of the networks is able to produce a suitable downbeat detection function that leads to an appropriate downbeat sequence, while each network individually is less reliable.
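The combination step described above can be sketched as a frame-wise average of the four network outputs. This is only a minimal illustration: the function name `combine_activations`, the array values, and the assumption that all four outputs share the same frame grid are hypothetical and not taken from the manuscript, and the step that turns the mean activation into a downbeat sequence is not shown.

```python
import numpy as np


def combine_activations(activations):
    """Average per-frame downbeat likelihoods from several networks.

    `activations` is a list of 1-D arrays of equal length, one per network.
    Assumption: all outputs are already aligned on the same time frames.
    """
    stacked = np.stack([np.asarray(a, dtype=float) for a in activations])
    return stacked.mean(axis=0)


# Hypothetical per-frame downbeat likelihoods (made-up values, 5 frames).
harmonic = np.array([0.9, 0.1, 0.2, 0.3, 0.8])
rhythmic = np.array([0.7, 0.2, 0.6, 0.1, 0.9])
melodic = np.array([0.8, 0.3, 0.1, 0.2, 0.4])
bass = np.array([0.6, 0.2, 0.1, 0.2, 0.7])

mean_activation = combine_activations([harmonic, rhythmic, melodic, bass])
print(mean_activation)
```

In this toy input, the rhythmic output has a spurious bump at the third frame and the melodic output is weak at the last frame; averaging attenuates both, which is the kind of effect the figures below illustrate.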


1. Harmonic Network

Audio signal Network output
HCNN_ce

2. Rhythmic Network

Audio signal Network output (after reduction)
RCNN_ce

3. Melodic Network

Audio signal Network output
RCNN_ce

4. Bass Network

Audio signal Network output (after reduction)
BCNN_ce

5. Network Combination

Audio signal Network output
meanCNN_ce