Overview

What is Serverless GPU API?

Serverless GPU API is a lightweight, intuitive library for launching multi-GPU workloads from Databricks notebooks. It’s designed to make distributed computing on Databricks simple and accessible.

Key Features

  • Easy Integration: Works seamlessly with Databricks notebooks

  • Multi-GPU Support: Efficiently utilize multiple GPUs for your workloads

  • Flexible Configuration: Customizable compute resources and runtime settings

  • Comprehensive Logging: Built-in logging and monitoring capabilities

Architecture

Serverless GPU API consists of several key components:

  • Compute Manager: Handles resource allocation and management

  • Runtime Environment: Manages Python environments and dependencies

  • Launcher: Orchestrates job execution and monitoring

Use Cases

Serverless GPU API is ideal for:

  • Machine learning model training at scale

  • Distributed data processing

  • GPU-accelerated computations

  • Research and experimentation workflows

Distributed Execution Details

When running in distributed mode:

  • The function is serialized and distributed across the specified number of GPUs

  • Each GPU runs a copy of the function with the same parameters

  • The environment is synchronized across all nodes

  • Results are collected and returned from all GPUs
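The flow above can be sketched with the standard library. This is an illustrative simulation, not the library's implementation: threads stand in for GPUs, and the `launch` and `train_step` names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def train_step(lr):
    # Each "GPU" runs a copy of the same function with the
    # same parameters, mirroring the distributed-mode behavior.
    return lr * 2

def launch(fn, num_gpus, *args):
    # Sketch of the flow: fan the function out to one worker per
    # GPU (threads here; the real library serializes the function
    # and ships it to remote workers), then collect one result
    # per GPU.
    with ThreadPoolExecutor(max_workers=num_gpus) as pool:
        futures = [pool.submit(fn, *args) for _ in range(num_gpus)]
        return [f.result() for f in futures]

results = launch(train_step, 4, 0.1)  # one result per GPU
```

Because every worker receives identical parameters, the returned list has one entry per GPU; in a real workload each copy would typically process a different shard of the data.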

Best Practices

  • Always specify gpu_type when using remote=True

  • Use async execution for non-blocking workloads like sweeps
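The non-blocking pattern recommended for sweeps can be illustrated with the standard library. This is a sketch of the pattern only; `evaluate` and the sweep values are hypothetical stand-ins for real training trials.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def evaluate(lr):
    # Hypothetical per-trial workload; in a real sweep this would
    # be a training run launched on remote GPUs.
    return {"lr": lr, "loss": lr * lr}

# Async-style launch: submit every trial up front without blocking,
# then gather results as they complete rather than waiting on each
# trial in turn.
sweep = [0.1, 0.01, 0.001]
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(evaluate, lr) for lr in sweep]
    results = [f.result() for f in as_completed(futures)]
```

Submitting all trials before collecting any results keeps the notebook responsive and lets independent trials run concurrently.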

Limitations

  • Pip environment size is limited to 10GB.