mexca.audio.identification
Speech segment and speaker identification.
Module Contents
Classes
SpeakerIdentifier: Identify speech segments and cluster speakers using speaker diarization.
Functions
Command line interface for identifying speech segments and speakers.
- class mexca.audio.identification.SpeakerIdentifier(num_speakers: Optional[int] = None, use_auth_token: Union[bool, str] = True)[source]
Identify speech segments and cluster speakers using speaker diarization.
Wrapper class for pyannote.audio.SpeakerDiarization.
- Parameters
num_speakers (int, optional) – Number of speakers to which speech segments will be assigned during the clustering (oracle speakers). If None, the number of speakers is estimated from the audio signal.
use_auth_token (bool or str, default=True) – Whether to use the HuggingFace authentication token stored on the machine (if bool) or a HuggingFace authentication token with access to the models pyannote/speaker-diarization and pyannote/segmentation (if str).
Notes
This class requires pretrained models for speaker diarization and segmentation from HuggingFace. To download the models, accept the user conditions on hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation, then generate an authentication token on hf.co/settings/tokens.
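The following is a minimal sketch of how the constructor and apply() documented on this page might be used together. The helper names resolve_auth_token and identify_speakers, and the environment variable HF_TOKEN, are assumptions made for illustration and are not part of mexca. Only SpeakerIdentifier, its num_speakers and use_auth_token parameters, and apply(filepath) are taken from this page.

```python
import os
from typing import Optional, Union


def resolve_auth_token(env_var: str = "HF_TOKEN") -> Union[bool, str]:
    """Pick a value for `use_auth_token`.

    Hypothetical helper: returns the token string found in `env_var`,
    or True so that the token stored on the machine is used instead.
    """
    token = os.environ.get(env_var)
    return token if token else True


def identify_speakers(filepath: str, num_speakers: Optional[int] = None):
    """Run speaker identification on one audio file (hypothetical wrapper)."""
    # Imported lazily so this sketch can be loaded without mexca installed.
    from mexca.audio.identification import SpeakerIdentifier

    identifier = SpeakerIdentifier(
        num_speakers=num_speakers,  # None: estimate the number of speakers
        use_auth_token=resolve_auth_token(),
    )
    # Per the signature on this page, apply() returns a
    # mexca.data.SpeakerAnnotation object.
    return identifier.apply(filepath)
```

Calling identify_speakers("audio.wav", num_speakers=2) with a placeholder file path would assign the detected speech segments to two speakers, provided the model conditions have been accepted and a valid token is available.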
- property pipeline: pyannote.audio.Pipeline[source]
The pretrained speaker diarization pipeline. See pyannote.audio.SpeakerDiarization for details.
- apply(filepath: str) → mexca.data.SpeakerAnnotation[source]
Identify speech segments and speakers.
- Parameters
filepath (str) – Path to the audio file.
- Returns
A data class object that contains detected speech segments and speakers.
- Return type