Processors
aana.processors.remote
run_remote
Wrap a function to run it remotely on Ray.
PARAMETER | DESCRIPTION |
---|---|
func | The function to wrap. |

RETURNS | DESCRIPTION |
---|---|
Callable | The wrapped function. |
Source code in aana/processors/remote.py
aana.processors.video
extract_audio
Extract the audio file from a Video and return an Audio object.
PARAMETER | DESCRIPTION |
---|---|
video | The video file to extract audio from. |

RETURNS | DESCRIPTION |
---|---|
Audio | An Audio object containing the extracted audio. |
Source code in aana/processors/video.py
aana.processors.batch
BatchProcessor
Class for processing data in chunks, in parallel.
The BatchProcessor class encapsulates the logic required to take a large collection of data, split it into manageable batches, process these batches in parallel, and then combine the results into a single cohesive output.
Batching works by iterating through the input request, which is a dictionary where each key maps to a list-like collection of data. The class splits each collection into sublists of length up to batch_size, ensuring that corresponding elements across the collections remain grouped together in their respective batches.
Merging takes the output from each processed batch, which is also a dictionary structure, and combines these into a single dictionary. Lists are extended, numpy arrays are concatenated, and dictionaries are updated. If a new data type is encountered, an error is raised prompting the implementer to specify how these types should be merged.
This class is particularly useful for batching requests to a machine learning model.
The thread pool for parallel processing is managed internally and is shut down automatically when the BatchProcessor instance is garbage collected.
ATTRIBUTE | DESCRIPTION |
---|---|
process_batch | A function to process each batch. |
batch_size | The size of each batch to be processed. |
num_threads | The number of threads in the thread pool for parallel processing. |
PARAMETER | DESCRIPTION |
---|---|
process_batch | Function that processes each batch. |
batch_size | Size of the batches. |
num_threads | Number of threads in the pool. |
Source code in aana/processors/batch.py
batch_iterator
Convert the request into an iterator of batches.
Iterates over the input request, breaking it into smaller batches for processing. Each batch is a dictionary with the same keys as the input request, but the values are sublists containing only the elements for that batch.
Example:
request = {
    'images': [img1, img2, img3, img4, img5],
    'texts': ['text1', 'text2', 'text3', 'text4', 'text5']
}
# Assuming a batch size of 2, this iterator would yield:
# 1st iteration: {'images': [img1, img2], 'texts': ['text1', 'text2']}
# 2nd iteration: {'images': [img3, img4], 'texts': ['text3', 'text4']}
# 3rd iteration: {'images': [img5], 'texts': ['text5']}
PARAMETER | DESCRIPTION |
---|---|
request | The request data to split into batches. |

YIELDS | DESCRIPTION |
---|---|
dict[str, list[Any]] | An iterator over the batched requests. |
Source code in aana/processors/batch.py
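The batching behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the aana implementation: the function name `iter_batches` is hypothetical, and the sketch assumes that every list in the request has the same length.

```python
from collections.abc import Iterator
from typing import Any


def iter_batches(
    request: dict[str, list[Any]], batch_size: int
) -> Iterator[dict[str, list[Any]]]:
    """Yield dictionaries holding at most `batch_size` items per key."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Assumption: all lists in the request have the same length.
    lengths = {len(values) for values in request.values()}
    if len(lengths) > 1:
        raise ValueError("all lists in the request must have the same length")
    total = lengths.pop() if lengths else 0
    for start in range(0, total, batch_size):
        # Slice every key with the same bounds so related items stay aligned.
        yield {
            key: values[start : start + batch_size]
            for key, values in request.items()
        }


request = {
    "images": ["img1", "img2", "img3", "img4", "img5"],
    "texts": ["text1", "text2", "text3", "text4", "text5"],
}
batches = list(iter_batches(request, batch_size=2))
```

Slicing every key with the same bounds is what keeps the fifth image paired with the fifth text in the final, shorter batch.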
process
Process a request.
Splits the input request into batches, processes each batch in parallel, and then merges the results into a single dictionary.
PARAMETER | DESCRIPTION |
---|---|
request | The request data to process. |

RETURNS | DESCRIPTION |
---|---|
dict[str, Any] | The merged results from processing all batches. |
Source code in aana/processors/batch.py
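The split, process, and merge flow can be approximated as follows. This is a hedged sketch rather than the library's code: `process_in_batches` and `uppercase_batch` are hypothetical names, only list outputs are merged, and the thread pool is created per call with a context manager, whereas the class described here keeps its pool for the lifetime of the instance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def process_in_batches(
    request: dict[str, list[Any]],
    process_batch: Callable[[dict[str, list[Any]]], dict[str, list[Any]]],
    batch_size: int,
    num_threads: int,
) -> dict[str, list[Any]]:
    """Split a request into batches, process them in threads, merge the lists."""
    # Assumption: all lists in the request have the same length.
    total = len(next(iter(request.values()), []))
    batches = [
        {key: values[start : start + batch_size] for key, values in request.items()}
        for start in range(0, total, batch_size)
    ]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in input order, so the merged
        # output keeps the order of the original request.
        outputs = list(pool.map(process_batch, batches))
    merged: dict[str, list[Any]] = {}
    for output in outputs:
        for key, value in output.items():
            merged.setdefault(key, []).extend(value)
    return merged


def uppercase_batch(batch: dict[str, list[Any]]) -> dict[str, list[Any]]:
    """Stand-in for a model call: upper-case every text in the batch."""
    return {"texts": [text.upper() for text in batch["texts"]]}


result = process_in_batches(
    {"texts": ["a", "b", "c", "d", "e"]},
    uppercase_batch,
    batch_size=2,
    num_threads=2,
)
```

In practice `process_batch` would wrap a model inference call, which is where running several batches concurrently pays off.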
merge_outputs
Merge outputs.
Combine processed batch outputs into a single dictionary. It handles various data types by extending lists, updating dictionaries, and concatenating numpy arrays.
Example:
outputs = [
    {'images': [processed_img1, processed_img2], 'labels': ['cat', 'dog']},
    {'images': [processed_img3, processed_img4], 'labels': ['bird', 'mouse']},
    {'images': [processed_img5], 'labels': ['fish']}
]
# The merged result would be:
# {
#     'images': [processed_img1, processed_img2, processed_img3, processed_img4, processed_img5],
#     'labels': ['cat', 'dog', 'bird', 'mouse', 'fish']
# }
PARAMETER | DESCRIPTION |
---|---|
outputs | List of outputs from the processed batches. |

RETURNS | DESCRIPTION |
---|---|
dict[str, Any] | The merged result. |
Source code in aana/processors/batch.py
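The type-dependent merge rules can be illustrated with a small stand-alone function. This is a sketch under stated assumptions: `merge_batch_outputs` is a hypothetical name, each key is assumed to hold the same type in every batch, and the numpy branch is left out to keep the example dependency-free, so arrays would fall through to the error case here.

```python
from typing import Any


def merge_batch_outputs(outputs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge per-batch outputs: extend lists, update dictionaries."""
    merged: dict[str, Any] = {}
    for output in outputs:
        for key, value in output.items():
            if isinstance(value, list):
                merged.setdefault(key, []).extend(value)
            elif isinstance(value, dict):
                merged.setdefault(key, {}).update(value)
            else:
                # No merge rule is defined for this type in the sketch.
                raise TypeError(
                    f"no merge rule for values of type {type(value).__name__}"
                )
    return merged


merged = merge_batch_outputs(
    [
        {"labels": ["cat", "dog"], "scores": {"cat": 0.9, "dog": 0.8}},
        {"labels": ["bird"], "scores": {"bird": 0.7}},
    ]
)

try:
    merge_batch_outputs([{"count": 1}])
    unsupported_type_rejected = False
except TypeError:
    unsupported_type_rejected = True
```

Raising on an unknown type, rather than silently overwriting the value, mirrors the documented behaviour of prompting the implementer to define how that type should be merged.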
aana.processors.speaker.PostProcessingForDiarizedAsr
Class to handle post-processing for diarized ASR output by combining diarization and transcription segments.
The post-processing involves assigning speaker labels to transcription segments and words, aligning speakers with punctuation, optionally merging homogeneous speaker segments, and reassigning confidence information to the segments.
ATTRIBUTE | DESCRIPTION |
---|---|
diarized_segments | Contains speaker diarization segments. |
transcription_segments | Transcription segments. |
merge | Whether to merge the same speaker segments in the final output. |
process
Executes the post-processing pipeline that combines diarization and transcription segments.
This method performs the following steps:
1. Assign speaker labels to each segment and word in the transcription based on the diarization output.
2. Align speakers with punctuation.
3. Create new transcription segments by combining the speaker-labeled words.
4. Optionally, merge consecutive speaker segments.
5. Add confidence and no_speech_confidence to the new segments.
PARAMETER | DESCRIPTION |
---|---|
diarized_segments | Contains speaker diarization segments. |
transcription_segments | Transcription segments. |
merge | If True, merges consecutive speaker segments in the final output. Defaults to False. |

RETURNS | DESCRIPTION |
---|---|
list[AsrSegment] | Updated transcription segments with speaker information per segment and word. |
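
Steps 1 and 3 of the pipeline (labeling words with a speaker, then regrouping them into segments) could look roughly like the sketch below. It is illustrative only: plain dictionaries stand in for the aana segment types, the field names (`start`, `end`, `speaker`, `text`) and function names are hypothetical, and choosing the speaker by largest time overlap is an assumption, since this page does not state the rule the library uses. Punctuation alignment and confidence handling are omitted.

```python
from typing import Any


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    words: list[dict[str, Any]], diarized_segments: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Label each word with the speaker whose segment overlaps it the most."""
    # Assumption: diarized_segments is not empty.
    labeled = []
    for word in words:
        best = max(
            diarized_segments,
            key=lambda seg: overlap(
                word["start"], word["end"], seg["start"], seg["end"]
            ),
        )
        labeled.append({**word, "speaker": best["speaker"]})
    return labeled


def group_by_speaker(labeled_words: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Combine consecutive words from the same speaker into one segment."""
    segments: list[dict[str, Any]] = []
    for word in labeled_words:
        if segments and segments[-1]["speaker"] == word["speaker"]:
            segments[-1]["text"] += " " + word["text"]
            segments[-1]["end"] = word["end"]
        else:
            segments.append(
                {
                    "speaker": word["speaker"],
                    "text": word["text"],
                    "start": word["start"],
                    "end": word["end"],
                }
            )
    return segments


diarized = [
    {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_00"},
    {"start": 2.0, "end": 4.0, "speaker": "SPEAKER_01"},
]
words = [
    {"text": "Hello", "start": 0.2, "end": 0.8},
    {"text": "there.", "start": 0.9, "end": 1.5},
    {"text": "Hi!", "start": 2.1, "end": 2.6},
]
segments = group_by_speaker(assign_speakers(words, diarized))
```

With this input the first two words fall inside the first diarization segment and are grouped under one speaker, while the third word starts a new segment for the second speaker.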