mexca.pipeline
Build a pipeline to extract emotion expression features from a video file.
Module Contents
Classes
Pipeline – Build a pipeline to extract emotion expression features from a video file.
- class mexca.pipeline.Pipeline(face_extractor: FaceExtractor | FaceExtractorContainer | None = None, speaker_identifier: SpeakerIdentifier | SpeakerIdentifierContainer | None = None, voice_extractor: VoiceExtractor | VoiceExtractorContainer | None = None, audio_transcriber: AudioTranscriber | AudioTranscriberContainer | None = None, sentiment_extractor: SentimentExtractor | SentimentExtractorContainer | None = None)[source]
Build a pipeline to extract emotion expression features from a video file.
Takes either component objects or container component objects (or a mix of both) as input.
- Parameters:
face_extractor (FaceExtractor or FaceExtractorContainer, optional, default=None) – Component for detecting and identifying faces as well as extracting facial features.
speaker_identifier (SpeakerIdentifier or SpeakerIdentifierContainer, optional, default=None) – Component for identifying speech segments and speakers.
voice_extractor (VoiceExtractor or VoiceExtractorContainer, optional, default=None) – Component for extracting voice features.
audio_transcriber (AudioTranscriber or AudioTranscriberContainer, optional, default=None) – Component for transcribing speech segments to text.
sentiment_extractor (SentimentExtractor or SentimentExtractorContainer, optional, default=None) – Component for extracting sentiment from text.
Examples
Create a pipeline with standard components.
>>> from mexca import Pipeline
>>> from mexca.audio import SpeakerIdentifier, VoiceExtractor
>>> from mexca.text import AudioTranscriber, SentimentExtractor
>>> from mexca.video import FaceExtractor
>>> num_faces = 2
>>> num_speakers = 2
>>> pipeline = Pipeline(
...     face_extractor=FaceExtractor(num_faces=num_faces),
...     speaker_identifier=SpeakerIdentifier(
...         num_speakers=num_speakers
...     ),
...     voice_extractor=VoiceExtractor(),
...     audio_transcriber=AudioTranscriber(),
...     sentiment_extractor=SentimentExtractor()
... )
Create a pipeline with container components.
>>> from mexca import Pipeline
>>> from mexca.container import (
...     AudioTranscriberContainer,
...     FaceExtractorContainer,
...     SentimentExtractorContainer,
...     SpeakerIdentifierContainer,
...     VoiceExtractorContainer
... )
>>> num_faces = 2
>>> num_speakers = 2
>>> pipeline = Pipeline(
...     face_extractor=FaceExtractorContainer(num_faces=num_faces),
...     speaker_identifier=SpeakerIdentifierContainer(
...         num_speakers=num_speakers
...     ),
...     voice_extractor=VoiceExtractorContainer(),
...     audio_transcriber=AudioTranscriberContainer(),
...     sentiment_extractor=SentimentExtractorContainer()
... )
Create a pipeline with standard and container components.
>>> from mexca import Pipeline
>>> from mexca.audio import SpeakerIdentifier, VoiceExtractor
>>> from mexca.container import (
...     AudioTranscriberContainer,
...     FaceExtractorContainer,
...     SentimentExtractorContainer
... )
>>> num_faces = 2
>>> num_speakers = 2
>>> pipeline = Pipeline(
...     face_extractor=FaceExtractorContainer(num_faces=num_faces),
...     speaker_identifier=SpeakerIdentifier(  # standard
...         num_speakers=num_speakers
...     ),
...     voice_extractor=VoiceExtractor(),  # standard
...     audio_transcriber=AudioTranscriberContainer(),
...     sentiment_extractor=SentimentExtractorContainer()
... )
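Since every component is optional (default None), a pipeline can also be built from a subset of components. A minimal sketch of an audio-only pipeline, assuming the omitted components can simply be left at their defaults:

>>> from mexca import Pipeline
>>> from mexca.audio import SpeakerIdentifier, VoiceExtractor
>>> pipeline = Pipeline(
...     speaker_identifier=SpeakerIdentifier(num_speakers=2),
...     voice_extractor=VoiceExtractor()
... )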
- apply(filepath: str | collections.abc.Iterable, frame_batch_size: int = 1, skip_frames: int = 1, process_subclip: Tuple[float | None] = (0, None), return_embeddings: bool = False, language: str | None = None, keep_audiofile: bool = False, merge: bool = True, show_progress: bool = True) mexca.data.Multimodal | collections.abc.Iterable [source]
Extract emotion expression features from a video file.
This is the main function to apply the complete mexca pipeline to a video file.
- Parameters:
filepath (str or collections.abc.Iterable) – Path to the video file or iterable returning paths to multiple video files.
frame_batch_size (int, default=1) – Size of the batch of video frames that are loaded and processed at the same time.
skip_frames (int, default=1) – Only process every nth frame, starting at 0.
process_subclip (tuple, default=(0, None)) – Process only a part of the video clip. Must be the start and end of the subclip in seconds. None indicates the end of the video.
return_embeddings (bool, default=False) – Return embeddings for each detected face. For large input files, this can increase the size of the output substantially, as a 512-element vector is stored for each face. Face embeddings are stored in the video_annotation attribute of the Multimodal object.
language (str, optional, default=None) – The language of the speech that is transcribed. If None, the language is detected for each speech segment.
keep_audiofile (bool, default=False) – Keep the audio file after processing. If False, the audio file is only stored temporarily.
merge (bool, default=True) – Whether to merge the output from the different components into a single polars.LazyFrame. If True (default), the method merge_features() is called after all components have finished processing, and a polars.LazyFrame is stored in the features attribute. If False, the method is not called and the features attribute is None.
show_progress (bool, default=True) – Enables progress bars and prints info logging messages to the console. The logging is overridden when a custom logger is explicitly created.
- Returns:
A data class object that contains the extracted merged features in the features attribute. See the Output section for details. If filepath is a collections.abc.Iterable, returns a collections.abc.Iterable of mexca.data.Multimodal objects.
- Return type:
mexca.data.Multimodal or collections.abc.Iterable
Examples
>>> import polars as pl
>>> from mexca.data import Multimodal
>>> # Single video file
>>> filepath = 'path/to/video'
>>> output = pipeline.apply(filepath)
>>> isinstance(output, Multimodal)
True
>>> isinstance(output.features, pl.LazyFrame)
True
>>> # List of video files
>>> filepaths = ['path/to/video', 'path/to/another/video']
>>> output = pipeline.apply(filepaths)
>>> isinstance(output, list)
True
>>> all(isinstance(r, Multimodal) for r in output)
True
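A sketch of a call with non-default options, using only the keyword arguments documented above; the file path is a placeholder:

>>> output = pipeline.apply(
...     'path/to/video',
...     frame_batch_size=8,          # load and process 8 frames at a time
...     skip_frames=5,               # only process every 5th frame
...     process_subclip=(0, 30.0),   # only the first 30 seconds
...     return_embeddings=True,      # store face embeddings in video_annotation
...     language='en',               # skip per-segment language detection
...     keep_audiofile=False         # discard the extracted audio afterwards
... )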