`mexca.audio.identification`

Speech segment and speaker identification.

Module Contents

SpeakerIdentifier – Identify speech segments and cluster speakers using speaker diarization.

cli() – Command line interface for identifying speech segments and speakers.

class mexca.audio.identification.SpeakerIdentifier(num_speakers: Optional[int] = None, use_auth_token: Union[bool, str] = True)

Identify speech segments and cluster speakers using speaker diarization.

Wrapper class for pyannote.audio.SpeakerDiarization.

Parameters

num_speakers (int, optional) – Number of speakers to which speech segments will be assigned during the clustering (oracle speakers). If None, the number of speakers is estimated from the audio signal.
use_auth_token (bool or str, default=True) – Whether to use the HuggingFace authentication token stored on the machine (if bool) or a HuggingFace authentication token with access to the models pyannote/speaker-diarization and pyannote/segmentation (if str).

Notes

This class requires pretrained models for speaker diarization and segmentation from HuggingFace. To download the models, first accept the user conditions on hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation, then generate an authentication token on hf.co/settings/tokens.
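The constructor arguments described above can be assembled as in the following minimal sketch. Only `num_speakers` and `use_auth_token` come from the documented signature; the helper names `identifier_kwargs` and `make_identifier` and the environment variable `HF_TOKEN` are hypothetical and introduced purely for illustration.

```python
import os
from typing import Any, Dict, Optional


def identifier_kwargs(num_speakers: Optional[int] = None,
                      token_env: str = "HF_TOKEN") -> Dict[str, Any]:
    """Build keyword arguments for SpeakerIdentifier (hypothetical helper)."""
    if num_speakers is not None and num_speakers < 1:
        raise ValueError("num_speakers must be a positive integer or None")
    # Assumption: a token may be supplied through an environment variable.
    # If it is set, pass it as a string; otherwise pass True so that the
    # token stored on the machine is used.
    token = os.environ.get(token_env)
    return {
        "num_speakers": num_speakers,
        "use_auth_token": token if token else True,
    }


def make_identifier(**kwargs: Any):
    """Create a SpeakerIdentifier; mexca is imported lazily, only when called."""
    from mexca.audio.identification import SpeakerIdentifier
    return SpeakerIdentifier(**identifier_kwargs(**kwargs))
```

Calling `make_identifier(num_speakers=2)` would then request clustering into two speakers, while `make_identifier()` leaves the number of speakers to be estimated from the audio signal.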

property pipeline: pyannote.audio.Pipeline

The pretrained speaker diarization pipeline. See pyannote.audio.SpeakerDiarization for details.

Identify speech segments and speakers.

Parameters: filepath (str) – Path to the audio file.
Returns: A data class object that contains detected speech segments and speakers.
Return type: SpeakerAnnotation
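A hedged sketch of how the method documented above might be called on a single file. The method name is not shown in this section; the sketch assumes it is exposed as `apply`, and the wrapper `identify_file` is a hypothetical helper, not part of mexca.

```python
from pathlib import Path
from typing import Any


def identify_file(identifier: Any, filepath: str) -> Any:
    """Run an identifier on one audio file after checking that the path exists."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    # Assumption: the identification method is named `apply` and takes the
    # file path as a string, returning a SpeakerAnnotation object.
    return identifier.apply(str(path))
```

Checking the path first gives a clear error before the pretrained pipeline is invoked. The `identifier` argument is expected to be a `SpeakerIdentifier` instance.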

mexca.audio.identification.cli()

Command line interface for identifying speech segments and speakers. See identify-speakers -h for details.