Building Your Own Services

Once you've built a few simple bots by combining existing services, you'll want to start solving more complex problems, which means building your own services. Fortunately, this is straightforward: you create a subclass of AIService and implement a method called process_frame. For example, here's a TranslationProcessor service, used in the translation example:

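from typing import AsyncGenerator

# AIService, Frame, TextFrame, and LLMMessagesFrame come from the framework.


class TranslationProcessor(AIService):
    def __init__(self, language):
        super().__init__()  # run any setup the base class defines
        self._language = language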

    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if isinstance(frame, TextFrame):
            context = [
                {
                    "role": "system",
                    "content": f"You will be provided with a sentence in English, and your task is to translate it into {self._language}.",
                },
                {"role": "user", "content": frame.text},
            ]
            yield LLMMessagesFrame(context)
        else:
            yield frame

The __init__ method allows us to specify what language we want to use when we create an instance of the service.

The process_frame method gets called by the pipeline with each frame emitted from the previous service (or the pipeline's input queue if this is the first service in the pipeline). In this case, if the current frame is a TextFrame, we're putting the frame's text inside a context with instructions for an LLM, and then putting that context inside an LLMMessagesFrame that gets sent to the next service in the pipeline. If the current frame is anything other than a TextFrame, we pass it along unmodified.

This is an important convention you'll see in almost all services. If your service receives a frame it doesn't do anything with, you should pass it along unmodified.
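To make the pass-through convention concrete, here is a minimal sketch that runs on its own. The Frame, TextFrame, AudioFrame, and AIService classes below are simplified stand-ins written for this example, not the framework's real classes, and UppercaseService and run_service are hypothetical names chosen for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


# Hypothetical stand-ins for the framework's classes, so the sketch runs alone.
class Frame:
    pass


@dataclass
class TextFrame(Frame):
    text: str


@dataclass
class AudioFrame(Frame):
    data: bytes


class AIService:
    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        yield frame


class UppercaseService(AIService):
    """Hypothetical service: upper-cases text, passes everything else along."""

    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if isinstance(frame, TextFrame):
            yield TextFrame(frame.text.upper())
        else:
            yield frame  # not ours to handle, so pass it along unmodified


async def run_service(service: AIService, frames: List[Frame]) -> List[Frame]:
    """Feed each frame to the service and collect everything it yields."""
    out: List[Frame] = []
    for frame in frames:
        async for result in service.process_frame(frame):
            out.append(result)
    return out


audio = AudioFrame(b"\x00\x01")
results = asyncio.run(run_service(UppercaseService(), [TextFrame("hello"), audio]))
print(results)
```

The text frame comes out transformed, while the audio frame comes out as the very same object that went in. A service that silently dropped the audio frame would starve every service after it.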

Another important thing to notice is the use of yield throughout the method, as well as the return type of the process_frame function: AsyncGenerator[Frame, None].

If you're just getting into Python because of all the interesting things happening with AI, you should familiarize yourself with Python's generators, and more specifically, async generators and asyncio.
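If async generators are new to you, the following sketch shows the basic mechanics using only the standard library. The names count_up and collect are made up for this example; nothing here is specific to the framework.

```python
import asyncio
from typing import AsyncGenerator, List


async def count_up(limit: int) -> AsyncGenerator[int, None]:
    """Yield 0..limit-1, giving the event loop a turn before each value."""
    for n in range(limit):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield n


async def collect(limit: int) -> List[int]:
    # An async generator is consumed with `async for`, not a plain `for`.
    return [n async for n in count_up(limit)]


print(asyncio.run(collect(3)))  # prints [0, 1, 2]
```

A function becomes an async generator as soon as it is declared with async def and contains a yield. It can await other coroutines between yields, which is what lets a service wait on a network call without blocking the rest of the app.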

The pipeline calls your process_frame function with each input frame in a loop of the form async for frame in service.process_frame(frame), which means you can use multiple yield statements to create multiple output frames from a single input frame. For example, here's a custom service that enables extremely basic animation by displaying a "talking" image while the bot is talking:

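# talking_frame and quiet_frame are ImageFrame instances created elsewhere
# (mouth open and mouth closed, respectively).
class ImageSyncAggregator(AIService):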
    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if isinstance(frame, TextFrame):
            yield talking_frame
            yield frame
            yield quiet_frame
        else:
            yield frame

When this service receives a TextFrame, it yields an ImageFrame that contains a character with its mouth open, then it yields the received TextFrame, then it yields an ImageFrame with the character's mouth closed. Because of the way subsequent services (like TTS) keep frames in order, the end result will be the transport displaying the "talking" image, then playing back the TTS audio, then displaying the "quiet" image. As before, anything that isn't a TextFrame gets passed along unmodified.
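The ordering described above can be sketched with a simplified driver loop. Everything in this example is an assumption made for illustration: the frame classes and AIService are minimal stand-ins, FakeTTS is a hypothetical service that pretends to synthesize audio, and run_chain is a toy replacement for the real pipeline, which is more involved than this.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


# Hypothetical stand-ins for the framework's classes.
class Frame:
    pass


@dataclass
class TextFrame(Frame):
    text: str


@dataclass
class ImageFrame(Frame):
    name: str


@dataclass
class AudioFrame(Frame):
    source_text: str


class AIService:
    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        yield frame


talking_frame = ImageFrame("talking")
quiet_frame = ImageFrame("quiet")


class ImageSyncAggregator(AIService):
    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if isinstance(frame, TextFrame):
            yield talking_frame
            yield frame
            yield quiet_frame
        else:
            yield frame


class FakeTTS(AIService):
    """Hypothetical TTS: turns text into a pretend audio frame, in order."""

    async def process_frame(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if isinstance(frame, TextFrame):
            yield AudioFrame(frame.text)
        else:
            yield frame


async def run_chain(services: List[AIService], frame: Frame) -> List[Frame]:
    """Toy driver: each service's output frames feed the next service."""
    frames: List[Frame] = [frame]
    for service in services:
        next_frames: List[Frame] = []
        for f in frames:
            async for out in service.process_frame(f):
                next_frames.append(out)
        frames = next_frames
    return frames


output = asyncio.run(run_chain([ImageSyncAggregator(), FakeTTS()], TextFrame("hello")))
print([type(f).__name__ for f in output])  # prints ['ImageFrame', 'AudioFrame', 'ImageFrame']
```

One TextFrame goes in and three frames come out, with the audio sandwiched between the two images. The order survives only because FakeTTS yields each frame in the order it receives it, including the image frames it does nothing with.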

Eventually, you'll find your bot doing something you don't expect. Let's learn how to debug Pipecat apps in the next section.