`mexca.audio.identification`

Speech segment and speaker identification.

Module Contents

Classes

SpeakerIdentifier

Identify speech segments and cluster speakers using speaker diarization.

Functions

cli()

Command line interface for identifying speech segments and speakers.

exception mexca.audio.identification.AuthenticationError(msg: str)[source]

Failed authentication to HuggingFace Hub.

Parameters: msg (str) – Error message.

class mexca.audio.identification.SpeakerIdentifier(num_speakers: int | None = None, device: torch.device = torch.device(type='cpu'), use_auth_token: bool | str = True)[source]

Identify speech segments and cluster speakers using speaker diarization.

Wrapper class for pyannote.audio.SpeakerDiarization. Uses the pretrained speaker diarization model pyannote/speaker-diarization-3.1 from HuggingFace.

Parameters:

num_speakers (int, optional) – Number of speakers to which speech segments will be assigned during the clustering (oracle speakers). If None, the number of speakers is estimated from the audio signal.
device (torch.device, default=torch.device("cpu")) – The device on which the speaker diarization model is run.
use_auth_token (bool or str, default=True) – Either a bool indicating whether to use the HuggingFace authentication token stored on the machine, or a str containing a HuggingFace authentication token with access to the models pyannote/speaker-diarization and pyannote/segmentation.

Notes

This class requires pretrained models for speaker diarization and segmentation from HuggingFace. To download the models, first accept the user conditions on hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation, then generate an authentication token on hf.co/settings/tokens.
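
As an illustration of the authentication options above, the following is a minimal sketch of how a value for use_auth_token might be chosen. The helper names resolve_auth_token and build_identifier, and the environment variable name HF_TOKEN, are assumptions made for this example and are not part of mexca. The constructor arguments are the ones documented above.

```python
import os
from typing import Optional, Union


def resolve_auth_token(env_var: str = "HF_TOKEN") -> Union[bool, str]:
    """Hypothetical helper: pick a value for ``use_auth_token``.

    Returns the token string found in the environment variable (assumed name),
    or True so that the token stored on the machine is used instead.
    """
    token = os.environ.get(env_var, "").strip()
    return token if token else True


def build_identifier(num_speakers: Optional[int] = None):
    """Hypothetical helper: construct a SpeakerIdentifier on the CPU."""
    # Imported lazily so the sketch can be loaded without mexca installed.
    import torch
    from mexca.audio.identification import SpeakerIdentifier

    return SpeakerIdentifier(
        num_speakers=num_speakers,  # None: estimate the number of speakers
        device=torch.device("cpu"),
        use_auth_token=resolve_auth_token(),
    )
```

Passing a str is mainly useful on machines where no token has been stored; otherwise the default of True is sufficient.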

property pipeline: pyannote.audio.Pipeline[source]

The pretrained speaker diarization pipeline. See pyannote.audio.SpeakerDiarization for details.

apply(filepath: str, show_progress: bool = True) → mexca.data.SpeakerAnnotation[source]

Identify speech segments and speakers.

Parameters:

filepath (str) – Path to the audio file.
show_progress (bool, default=True) – Enables the display of a progress bar.

Returns:

A data class object that contains detected speech segments and speakers.

Return type:

SpeakerAnnotation
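
A minimal usage sketch for apply, assuming mexca is installed and the model conditions have been accepted. The wrapper function identify_speakers is hypothetical. It checks that the file exists before any model is loaded and turns an AuthenticationError into a more descriptive error. Exactly when authentication is attempted (at construction or on first use of the pipeline) is not stated here, so both steps are placed inside the try block.

```python
from pathlib import Path
from typing import Optional


def identify_speakers(
    filepath: str,
    num_speakers: Optional[int] = None,
    show_progress: bool = False,
):
    """Hypothetical wrapper around SpeakerIdentifier.apply."""
    path = Path(filepath)
    if not path.is_file():
        # Fail early, before any model download or loading is attempted.
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Imported lazily so the sketch can be loaded without mexca installed.
    from mexca.audio.identification import AuthenticationError, SpeakerIdentifier

    try:
        identifier = SpeakerIdentifier(num_speakers=num_speakers)
        # Returns a mexca.data.SpeakerAnnotation object.
        return identifier.apply(str(path), show_progress=show_progress)
    except AuthenticationError as exc:
        raise RuntimeError(
            "HuggingFace authentication failed; check the token and that the "
            "model user conditions have been accepted."
        ) from exc
```

Leaving num_speakers at None lets the number of speakers be estimated from the audio signal, as described for the class parameters.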

mexca.audio.identification.cli()[source]

Command line interface for identifying speech segments and speakers. See identify-speakers -h for details.
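
The help text mentioned above can also be retrieved programmatically. The sketch below assumes that installing mexca places an identify-speakers executable on the PATH. The function cli_help is a hypothetical helper, and no command line options other than -h are assumed.

```python
import shutil
import subprocess
from typing import Optional


def cli_help(command: str = "identify-speakers") -> Optional[str]:
    """Hypothetical helper: return the command's help text, or None if not found."""
    executable = shutil.which(command)
    if executable is None:
        # The command is not installed or not on PATH.
        return None
    result = subprocess.run(
        [executable, "-h"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A return value of None indicates that the command is not available in the current environment.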
