VAD Models

aana.core.models.vad

VadSegments

VadSegments = list[VadSegment]

List of VadSegment objects.

VadParams

Bases: BaseModel

Parameters for the Voice Activity Detection (VAD) model.

ATTRIBUTE	DESCRIPTION
`chunk_size`	The maximum length of each VAD output chunk. TYPE: `float`
`merge_onset`	Onset to be used for the merging operation. TYPE: `float`
`merge_offset`	Optional offset to be used for the merging operation. TYPE: `float`
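To illustrate what a maximum chunk length means for merging, here is a minimal sketch that greedily groups consecutive voiced spans into chunks no longer than `chunk_size`. The function `merge_voiced_spans` is hypothetical and is not aana's implementation; it treats times as seconds (an assumption) and ignores `merge_onset` and `merge_offset`, whose exact semantics are not described here.

```python
def merge_voiced_spans(
    voiced: list[tuple[float, float]], chunk_size: float
) -> list[dict]:
    """Hypothetical helper: group consecutive (start, end) spans into chunks.

    A chunk is closed as soon as adding the next span would make it longer
    than chunk_size. A single span that is itself longer than chunk_size
    is kept as its own chunk rather than being split.
    """
    merged: list[dict] = []
    current: list[tuple[float, float]] = []

    def flush() -> None:
        # Emit the accumulated spans as one merged chunk.
        merged.append(
            {
                "start": current[0][0],
                "end": current[-1][1],
                "segments": list(current),
            }
        )

    for start, end in voiced:
        # Close the current chunk if this span would exceed the limit.
        if current and end - current[0][0] > chunk_size:
            flush()
            current = []
        current.append((start, end))

    if current:
        flush()
    return merged


chunks = merge_voiced_spans(
    [(0.0, 4.0), (5.0, 9.0), (10.0, 14.0), (20.0, 22.0)], chunk_size=10.0
)
```

With these inputs the first two spans fit into one chunk covering 0.0 to 9.0, while the remaining two spans each end up in a chunk of their own.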

VadSegment

Bases: BaseModel

Pydantic schema for a segment produced by the Voice Activity Detection model.

ATTRIBUTE	DESCRIPTION
`time_interval`	The start and end time of the segment. TYPE: `TimeInterval`
`segments`	Smaller voiced segments within a merged VAD segment. TYPE: `list[tuple[float, float]]`

to_whisper_dict

to_whisper_dict()

Generate a dictionary with `start`, `end` and `segments` keys from a VadSegment, for use with faster-whisper.

RETURNS	DESCRIPTION
`dict`	Dictionary with start, end and segments keys TYPE: `dict`

Source code in aana/core/models/vad.py

def to_whisper_dict(self) -> dict:
    """Generate dictionary with start, end and segments keys from VADSegment for faster whisper.

    Returns:
        dict: Dictionary with start, end and segments keys
    """
    return {
        "start": self.time_interval.start,
        "end": self.time_interval.end,
        "segments": self.segments,
    }
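The following standalone sketch shows the shape of the dictionary this method returns. So that it runs without aana installed, it uses plain dataclasses as stand-ins for the real pydantic models; the stand-in `TimeInterval` is assumed to expose only `start` and `end`, which is all the method above relies on.

```python
from dataclasses import dataclass, field


@dataclass
class TimeInterval:
    """Stand-in for aana's TimeInterval (assumed fields: start, end)."""

    start: float
    end: float


@dataclass
class VadSegment:
    """Stand-in for the pydantic VadSegment model documented above."""

    time_interval: TimeInterval
    segments: list[tuple[float, float]] = field(default_factory=list)

    def to_whisper_dict(self) -> dict:
        # Same body as the method shown in the source listing.
        return {
            "start": self.time_interval.start,
            "end": self.time_interval.end,
            "segments": self.segments,
        }


segment = VadSegment(
    time_interval=TimeInterval(start=0.0, end=9.0),
    segments=[(0.0, 4.0), (5.0, 9.0)],
)
whisper_input = segment.to_whisper_dict()
```

Here `whisper_input` holds the outer interval under `start` and `end`, and the finer voiced spans under `segments`.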