`mexca.audio.extraction`

Extract voice features from an audio file.

Module Contents

Classes

`BaseFeature`	Base class for features.
`FeaturePitchF0`	Extract voice pitch as the fundamental frequency F0 in Hz.
`FeatureJitter`	Extract local jitter relative to the fundamental frequency.
`FeatureShimmer`	Extract local shimmer relative to the fundamental frequency.
`FeatureHnr`	Extract the harmonics-to-noise ratio (HNR) in dB.
`FeatureFormantFreq`	Extract formant central frequency in Hz.
`FeatureFormantBandwidth`	Extract formant frequency bandwidth in Hz.
`FeatureFormantAmplitude`	Extract formant amplitude relative to F0 harmonic amplitude.
`VoiceExtractor`	Extract voice features from an audio file.

Functions

`cli()`	Command line interface for extracting voice features.

class mexca.audio.extraction.BaseFeature[source]

Base class for features.

Can be used to create custom voice feature extraction classes.

requires() → Optional[Dict[str, type]][source]

Specify objects required for feature extraction.

Override this method to return a dictionary whose keys are the names of the objects required for computing the feature and whose values are the types of those objects. The VoiceExtractor object looks for objects of the specified types and adds them to the feature instance as attributes named after the dictionary keys.

Returns: Dictionary where keys are the names and values the types of required objects.
Return type: dict
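
The lookup described above can be mimicked without the library. The following is only a sketch of the idea: `PitchFrames`, `SketchFeature`, `SketchPitch`, and `attach_requirements` are hypothetical stand-ins written for this illustration, not mexca classes or functions, and the real matching logic inside VoiceExtractor may differ.

```python
from typing import Dict, Optional


class PitchFrames:
    """Hypothetical stand-in for an object holding pitch data."""

    def __init__(self, f0):
        self.f0 = f0


class SketchFeature:
    """Stand-in for a feature base class (not mexca's BaseFeature)."""

    def requires(self) -> Optional[Dict[str, type]]:
        # By default, nothing is required.
        return None


class SketchPitch(SketchFeature):
    def requires(self) -> Optional[Dict[str, type]]:
        # Key: attribute name to set; value: type of the object to look for.
        return {"pitch_frames": PitchFrames}


def attach_requirements(feature, available):
    """Assumed behaviour: set each required object as an attribute on the feature."""
    required = feature.requires()
    if required is None:
        return feature
    for name, required_type in required.items():
        for obj in available:
            if isinstance(obj, required_type):
                setattr(feature, name, obj)
                break
        else:
            raise ValueError(f"No object of type {required_type.__name__} found")
    return feature


feature = attach_requirements(SketchPitch(), [PitchFrames([110.0, 112.5])])
print(feature.pitch_frames.f0)
```

After the call, the feature instance carries a `pitch_frames` attribute because that is the key its `requires()` method returned.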

apply(time: numpy.ndarray) → numpy.ndarray[source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray
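
To make "linear interpolation at time points" concrete, here is a minimal sketch in plain Python. The function name `interpolate_linear` and the sample values are made up for illustration. Clamping to the first or last value outside the sampled range is an assumption (it mirrors the default behaviour of `numpy.interp`); the documentation above does not say how the library treats such points.

```python
from bisect import bisect_right


def interpolate_linear(time, frame_times, frame_values):
    """Interpolate frame_values (sampled at ascending frame_times) at each point in time."""
    result = []
    for t in time:
        if t <= frame_times[0]:
            # Assumption: clamp to the first value before the sampled range.
            result.append(frame_values[0])
        elif t >= frame_times[-1]:
            # Assumption: clamp to the last value after the sampled range.
            result.append(frame_values[-1])
        else:
            hi = bisect_right(frame_times, t)  # first frame strictly after t
            lo = hi - 1                        # last frame at or before t
            weight = (t - frame_times[lo]) / (frame_times[hi] - frame_times[lo])
            result.append(frame_values[lo] + weight * (frame_values[hi] - frame_values[lo]))
    return result


frame_times = [0.0, 0.5, 1.0]        # hypothetical frame time stamps
f0_values = [100.0, 110.0, 130.0]    # hypothetical F0 values in Hz
interpolated = interpolate_linear([0.25, 0.75, 2.0], frame_times, f0_values)
print(interpolated)  # [105.0, 120.0, 130.0]
```

For NumPy arrays, `numpy.interp(time, frame_times, frame_values)` performs the same kind of computation.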

class mexca.audio.extraction.FeaturePitchF0[source]

Bases: BaseFeature

Extract voice pitch as the fundamental frequency F0 in Hz.

requires() → Optional[Dict[str, mexca.audio.features.PitchFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key pitch_frames.
Return type: dict

apply(time: numpy.ndarray) → numpy.ndarray[source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureJitter[source]

Bases: BaseFeature

Extract local jitter relative to the fundamental frequency.

requires() → Optional[Dict[str, mexca.audio.features.JitterFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key jitter_frames.
Return type: dict

apply(time: numpy.ndarray) → numpy.ndarray[source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureShimmer[source]

Bases: BaseFeature

Extract local shimmer relative to the fundamental frequency.

requires() → Optional[Dict[str, mexca.audio.features.ShimmerFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key shimmer_frames.
Return type: dict

apply(time: numpy.ndarray) → numpy.ndarray[source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureHnr[source]

Bases: BaseFeature

Extract the harmonics-to-noise ratio (HNR) in dB.

requires() → Optional[Dict[str, mexca.audio.features.HnrFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key hnr_frames.
Return type: dict

apply(time: numpy.ndarray) → numpy.ndarray[source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureFormantFreq(n_formant: int)[source]

Bases: BaseFeature

Extract formant central frequency in Hz.

Parameters: n_formant (int) – Index of the formant (starting at 0).

requires() → Optional[Dict[str, mexca.audio.features.FormantFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key formant_frames.
Return type: dict

apply(time: numpy.ndarray) → Optional[numpy.ndarray][source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureFormantBandwidth(n_formant: int)[source]

Bases: FeatureFormantFreq

Extract formant frequency bandwidth in Hz.

Parameters: n_formant (int) – Index of the formant (starting at 0).

apply(time: numpy.ndarray) → Optional[numpy.ndarray][source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.FeatureFormantAmplitude(n_formant: int)[source]

Bases: BaseFeature

Extract formant amplitude relative to F0 harmonic amplitude.

Parameters: n_formant (int) – Index of the formant (starting at 0).

requires() → Optional[Dict[str, mexca.audio.features.FormantAmplitudeFrames]][source]

Specify objects required for feature extraction.

Returns: Dictionary with key formant_amp_frames.
Return type: dict

apply(time: numpy.ndarray) → Optional[numpy.ndarray][source]

Extract features at time points by linear interpolation.

Parameters: time (numpy.ndarray) – Time points.
Returns: Feature values interpolated at time points.
Return type: numpy.ndarray

class mexca.audio.extraction.VoiceExtractor(features: Optional[Dict[str, BaseFeature]] = None)[source]

Extract voice features from an audio file.

For default features, see the Output section.

Parameters: features (dict, optional, default=None) – Dictionary mapping feature names (keys) to feature extraction objects (values). If None, the default features are extracted.

apply(filepath: str, time_step: float, skip_frames: int = 1) → mexca.data.VoiceFeatures[source]

Extract voice features from an audio file.

Parameters:

filepath (str) – Path to the audio file.
time_step (float) – The interval between time points at which features are extracted.
skip_frames (int) – Only process every nth frame, starting at 0.

Returns:

A data class object containing the extracted voice features.

Return type:

VoiceFeatures
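
The roles of time_step and skip_frames can be illustrated with a small sketch. The helpers `time_points` and `every_nth` are hypothetical and not part of mexca. Starting the grid at 0 and including the end point is an assumption, and the reading of skip_frames as a simple stride follows the parameter description above.

```python
def time_points(duration, time_step):
    """Evenly spaced time points from 0 up to duration (assumed grid layout)."""
    n_points = int(duration / time_step) + 1
    return [i * time_step for i in range(n_points)]


def every_nth(frames, skip_frames=1):
    """Keep every nth frame, starting at index 0."""
    return frames[::skip_frames]


grid = time_points(1.0, 0.25)
kept = every_nth(list(range(7)), skip_frames=3)
print(grid)  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(kept)  # [0, 3, 6]
```

With the default skip_frames=1, every frame is kept.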

mexca.audio.extraction.cli()[source]

Command line interface for extracting voice features.

See extract-voice -h for details.
