Deployments

aana.deployments

AanaDeploymentHandle

AanaDeploymentHandle(deployment_name, num_retries=3, retry_exceptions=False, retry_delay=0.2, retry_max_delay=2.0)

A handle to interact with a deployed Aana deployment.

Use the create class method rather than the constructor to obtain a handle: create also loads the deployment's methods, which must be awaited.

deployment_handle = await AanaDeploymentHandle.create("deployment_name")
ATTRIBUTE DESCRIPTION
handle

Ray Serve deployment handle.

TYPE: DeploymentHandle

deployment_name

The name of the deployment.

TYPE: str

PARAMETER DESCRIPTION
deployment_name

The name of the deployment.

TYPE: str

num_retries

The maximum number of retries for the method.

TYPE: int DEFAULT: 3

retry_exceptions

Whether to retry on application-level errors or a list of exceptions to retry on.

TYPE: bool | list[Exception] DEFAULT: False

retry_delay

The initial delay between retries.

TYPE: float DEFAULT: 0.2

retry_max_delay

The maximum delay between retries.

TYPE: float DEFAULT: 2.0

Source code in aana/deployments/aana_deployment_handle.py
def __init__(
    self,
    deployment_name: str,
    num_retries: int = 3,
    retry_exceptions: bool | list[Exception] = False,
    retry_delay: float = 0.2,
    retry_max_delay: float = 2.0,
):
    """A handle to interact with a deployed Aana deployment.

    Args:
        deployment_name (str): The name of the deployment.
        num_retries (int): The maximum number of retries for the method.
        retry_exceptions (bool | list[Exception]): Whether to retry on application-level errors or a list of exceptions to retry on.
        retry_delay (float): The initial delay between retries.
        retry_max_delay (float): The maximum delay between retries.
    """
    self.handle = serve.get_app_handle(deployment_name)
    self.deployment_name = deployment_name
    self.__methods = None
    self.num_retries = num_retries
    self.retry_exceptions = retry_exceptions
    self.retry_delay = retry_delay
    self.retry_max_delay = retry_max_delay
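
The retry parameters above can be illustrated with a small standalone sketch. It is not Aana's implementation: the helper name call_with_retries is hypothetical, the doubling backoff schedule is an assumption (the source only names an initial and a maximum delay), and treating retry_exceptions=False as "never retry on exceptions" is a simplification.

```python
from __future__ import annotations

import asyncio


async def call_with_retries(
    func,
    num_retries: int = 3,
    retry_exceptions: bool | list[type[Exception]] = False,
    retry_delay: float = 0.2,
    retry_max_delay: float = 2.0,
):
    """Await func(), retrying failures with a capped, growing delay.

    Hypothetical helper: the doubling schedule is an assumption, not
    something the Aana source shown here confirms.
    """
    delay = retry_delay
    for attempt in range(num_retries + 1):
        try:
            return await func()
        except Exception as exc:
            if retry_exceptions is True:
                retryable = True  # retry on any application-level error
            elif retry_exceptions:
                # retry only on the listed exception types
                retryable = isinstance(exc, tuple(retry_exceptions))
            else:
                retryable = False  # simplification: no retries on exceptions
            if not retryable or attempt == num_retries:
                raise
            await asyncio.sleep(delay)
            # Grow the delay, but never beyond retry_max_delay.
            delay = min(delay * 2, retry_max_delay)
```

With the defaults, a call that keeps failing with a retryable error would be attempted four times in total (one initial call plus three retries), waiting 0.2, 0.4, and 0.8 seconds between attempts under the assumed doubling schedule.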

create

create(deployment_name, num_retries=3, retry_exceptions=False, retry_delay=0.2, retry_max_delay=2.0)

Create a deployment handle.

PARAMETER DESCRIPTION
deployment_name

The name of the deployment to interact with.

TYPE: str

num_retries

The maximum number of retries for the method.

TYPE: int DEFAULT: 3

retry_exceptions

Whether to retry on application-level errors or a list of exceptions to retry on.

TYPE: bool | list[Exception] DEFAULT: False

retry_delay

The initial delay between retries.

TYPE: float DEFAULT: 0.2

retry_max_delay

The maximum delay between retries.

TYPE: float DEFAULT: 2.0

Source code in aana/deployments/aana_deployment_handle.py
@classmethod
async def create(
    cls,
    deployment_name: str,
    num_retries: int = 3,
    retry_exceptions: bool | list[Exception] = False,
    retry_delay: float = 0.2,
    retry_max_delay: float = 2.0,
):
    """Create a deployment handle.

    Args:
        deployment_name (str): The name of the deployment to interact with.
        num_retries (int): The maximum number of retries for the method.
        retry_exceptions (bool | list[Exception]): Whether to retry on application-level errors or a list of exceptions to retry on.
        retry_delay (float): The initial delay between retries.
        retry_max_delay (float): The maximum delay between retries.
    """
    handle = cls(
        deployment_name=deployment_name,
        num_retries=num_retries,
        retry_exceptions=retry_exceptions,
        retry_delay=retry_delay,
        retry_max_delay=retry_max_delay,
    )
    await handle.__load_methods()
    return handle
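
The reason for a separate create method is that __init__ cannot await, while loading the deployment's methods is asynchronous. The pattern can be sketched in isolation as follows; LazyHandle, _load_methods, and the method names it stores are hypothetical stand-ins and do not reflect what Aana actually loads.

```python
import asyncio


class LazyHandle:
    """Hypothetical stand-in for a handle that needs asynchronous setup."""

    def __init__(self, name: str):
        # __init__ cannot await, so it only stores plain attributes.
        self.name = name
        self.methods = None

    async def _load_methods(self):
        # Placeholder for an asynchronous lookup (for example, a remote call).
        await asyncio.sleep(0)
        self.methods = ["generate", "predict"]

    @classmethod
    async def create(cls, name: str) -> "LazyHandle":
        # Async factory: construct the object, finish setup, then return it.
        handle = cls(name)
        await handle._load_methods()
        return handle
```

Calling the constructor directly would leave methods unset in this sketch, which is why callers go through `await LazyHandle.create(...)` instead.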

BaseDeployment

BaseDeployment()

Base class for all deployments.

To create a new deployment, inherit from this class, implement the apply_config method, and add your own methods such as generate or predict.

Source code in aana/deployments/base_deployment.py
def __init__(self):
    """Inits to unconfigured state."""
    self.config = None
    self._configured = False
    self.num_requests_since_last_health_check = 0
    self.raised_exceptions = []
    self.restart_exceptions = [InferenceException]

check_health

check_health()

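Check the health of the deployment.

The check fails when more than 50% of the requests since the last health check raised one of the restart exceptions (InferenceException by default). In that case the first such exception is re-raised; otherwise the counters are reset.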

Source code in aana/deployments/base_deployment.py
async def check_health(self):
    """Check the health of the deployment.

    Raises:
        Exception: The exception that caused the deployment to become unhealthy.
    """
    raised_restart_exceptions = [
        exception
        for exception in self.raised_exceptions
        if exception.__class__ in self.restart_exceptions
    ]
    # Restart the deployment if more than 50% of the requests raised restart exceptions
    if self.num_requests_since_last_health_check != 0:
        ratio_restart_exceptions = (
            len(raised_restart_exceptions)
            / self.num_requests_since_last_health_check
        )
        if ratio_restart_exceptions > 0.5:
            raise raised_restart_exceptions[0]

    self.raised_exceptions = []
    self.num_requests_since_last_health_check = 0
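
The threshold logic can be tried out in isolation with the sketch below. It copies the check from the source above into a standalone class; HealthTracker, InferenceError, and the record helper are hypothetical, since the source does not show how the request counter and exception list are updated.

```python
import asyncio


class InferenceError(Exception):
    """Hypothetical stand-in for Aana's InferenceException."""


class HealthTracker:
    """Standalone copy of the health-check bookkeeping, for illustration."""

    def __init__(self):
        self.num_requests_since_last_health_check = 0
        self.raised_exceptions = []
        self.restart_exceptions = [InferenceError]

    def record(self, exception=None):
        # Hypothetical helper: count one request and remember its error, if any.
        self.num_requests_since_last_health_check += 1
        if exception is not None:
            self.raised_exceptions.append(exception)

    async def check_health(self):
        restart = [
            exc
            for exc in self.raised_exceptions
            if exc.__class__ in self.restart_exceptions
        ]
        if self.num_requests_since_last_health_check != 0:
            ratio = len(restart) / self.num_requests_since_last_health_check
            if ratio > 0.5:
                # Strictly more than half of the requests failed.
                raise restart[0]
        # Healthy: start a fresh observation window.
        self.raised_exceptions = []
        self.num_requests_since_last_health_check = 0
```

Two failures out of three requests exceed the threshold and raise; one failure out of two is exactly 50% and passes, because the comparison is strict. Note also that the class comparison is exact, so subclasses of a restart exception do not count.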

apply_config

apply_config(config)

Apply the configuration.

This method is called when the deployment is created or updated.

Define the logic to load the model and configure it here.

PARAMETER DESCRIPTION
config

The configuration.

TYPE: dict[str, Any]

Source code in aana/deployments/base_deployment.py
async def apply_config(self, config: dict[str, Any]):
    """Apply the configuration.

    This method is called when the deployment is created or updated.

    Define the logic to load the model and configure it here.

    Args:
        config (dict): the configuration
    """
    raise NotImplementedError
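
A subclass might look roughly like the sketch below. To keep it runnable without Aana or Ray installed, it defines a minimal stand-in BaseDeployment instead of importing the real one; EchoDeployment, its prefix config key, and its generate method are all hypothetical.

```python
import asyncio
from typing import Any


class BaseDeployment:
    """Minimal stand-in for aana.deployments.BaseDeployment (not the real class)."""

    def __init__(self):
        self.config = None

    async def apply_config(self, config: dict[str, Any]):
        raise NotImplementedError


class EchoDeployment(BaseDeployment):
    """Hypothetical deployment that 'loads' a prefix instead of a model."""

    async def apply_config(self, config: dict[str, Any]):
        # A real deployment would load and configure its model here.
        self.config = config
        self.prefix = config.get("prefix", "")

    async def generate(self, prompt: str) -> dict[str, str]:
        # Custom method exposed by the deployment.
        return {"text": f"{self.prefix}{prompt}"}
```

In this sketch, awaiting `apply_config({"prefix": ">> "})` and then `generate("hi")` yields a dict whose text is the prompt with the configured prefix prepended.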