The managed LLM-engineering platform. Structured outputs, durable workflows, human in the loop, and more with delightful DX.

inferablehq/inferable

Inferable

The managed LLM-engineering platform for production-ready AI applications.

What is Inferable?

Inferable is a fully managed platform that handles state, reliability, and orchestration for custom LLM-based applications. It is developer-first and API-driven, and provides production-ready LLM primitives for building them.

⚡️ Quick Start

Follow the quick start guide to get started with Inferable.

🔑 Key Features

Here are some of the key features of Inferable.

📦 Workflows that execute in your own infrastructure

Workflows execute in your own infrastructure, even behind firewalls or in private VPCs. No deployment step is required. The connection is made outbound from your infrastructure using long polling, so you do not need to open any inbound ports.

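import { z } from "zod";

// `inferable` is assumed to be an already-initialized Inferable SDK client.
const workflow = inferable.workflows.create({
  name: "simple",
  inputSchema: z.object({
    executionId: z.string(),
    greeting: z.string(),
  }),
});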
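
The outbound-only connection model described above can be illustrated with a small long-polling loop. This is a sketch of the general technique only, not Inferable's wire protocol: `ControlPlane`, `InMemoryControlPlane`, and `runWorker` are hypothetical names, and the in-memory stub answers immediately where a real server would hold the request open.

```typescript
// Hypothetical sketch of outbound long polling; not the Inferable protocol.
type Job = { id: string; payload: string };

interface ControlPlane {
  // Resolves with a job, or null if none arrived before the timeout.
  poll(timeoutMs: number): Promise<Job | null>;
  complete(jobId: string, result: string): Promise<void>;
}

// In-memory stand-in for a remote control plane, for illustration only.
class InMemoryControlPlane implements ControlPlane {
  readonly results = new Map<string, string>();
  private queue: Job[];

  constructor(jobs: Job[]) {
    this.queue = [...jobs];
  }

  async poll(): Promise<Job | null> {
    // A real server would hold the request open until a job arrives or the
    // timeout elapses. This stub answers immediately.
    return this.queue.shift() ?? null;
  }

  async complete(jobId: string, result: string): Promise<void> {
    this.results.set(jobId, result);
  }
}

// The worker only ever makes outbound calls, so no inbound port is needed.
// It stops after `maxIdlePolls` consecutive empty polls.
async function runWorker(cp: ControlPlane, maxIdlePolls: number): Promise<number> {
  let handled = 0;
  let idlePolls = 0;
  while (idlePolls < maxIdlePolls) {
    const job = await cp.poll(20000);
    if (job === null) {
      idlePolls++;
      continue;
    }
    idlePolls = 0;
    await cp.complete(job.id, job.payload.toUpperCase());
    handled++;
  }
  return handled;
}
```

Because every request originates from the worker, the same loop works unchanged behind a firewall or NAT.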

🔄 Versioned Workflows for backward compatibility

When you need to change the input schema or the logic of a workflow, you can create a new version of the workflow. Inferable will maintain version affinity for currently executing workflows, so you can roll out new versions gradually. See Workflows.

workflow.version(1).define(async (ctx, input) => {
  // ...
});

workflow.version(2).define(async (ctx, input) => {
  // ...
});
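
To make version affinity concrete, here is a minimal standalone sketch of the idea: an execution is pinned to the newest version that exists when it starts, and keeps using it after newer versions are defined. `VersionedWorkflow` and its methods are hypothetical and do not reflect how Inferable implements this internally.

```typescript
// Hypothetical sketch of version affinity; names are illustrative, not the SDK's API.
type Handler = (input: string) => string;

class VersionedWorkflow {
  private handlers = new Map<number, Handler>();
  // executionId -> version pinned when the execution started
  private pinned = new Map<string, number>();

  define(version: number, handler: Handler): void {
    this.handlers.set(version, handler);
  }

  // New executions start on the highest version defined so far.
  start(executionId: string): number {
    const versions = Array.from(this.handlers.keys());
    if (versions.length === 0) throw new Error("no versions defined");
    const latest = Math.max(...versions);
    this.pinned.set(executionId, latest);
    return latest;
  }

  // Every later step of an execution runs on its pinned version.
  step(executionId: string, input: string): string {
    const version = this.pinned.get(executionId);
    if (version === undefined) throw new Error(`unknown execution ${executionId}`);
    const handler = this.handlers.get(version);
    if (!handler) throw new Error(`version ${version} is not defined`);
    return handler(input);
  }
}

const wf = new VersionedWorkflow();
wf.define(1, (who) => `Hello, ${who}`);
wf.start("exec-a"); // pinned to version 1
wf.define(2, (who) => `Hi there, ${who}`);
wf.start("exec-b"); // pinned to version 2
```

Here `exec-a` continues to run the version 1 handler even though version 2 now exists, which is what allows a gradual rollout.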

🏗️ Structured Outputs with automatic parsing, validation, and retries

Inferable automatically parses and validates structured outputs, and retries failed executions. See Structured Outputs.

workflow.version(1).define(async (ctx, input) => {
  // Await the call so the parsed, validated result can be destructured
  const { ticketType } = await ctx.llm.structured({
    input: `Ticket text: ${input.ticketText}`,
    schema: z.object({
      ticketType: z.enum(["data-deletion", "refund", "other"]),
    }),
  });

  // do something with the ticket type
  console.log(ticketType);
});
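
The parse, validate, and retry cycle can be sketched without any SDK. The example below is an assumption-laden illustration, not Inferable's implementation: `parseTicket`, `structured`, and the scripted `replies` are hypothetical, and a plain function stands in for the model call.

```typescript
// Hypothetical sketch of parse-validate-retry; not Inferable's implementation.
type TicketType = "data-deletion" | "refund" | "other";
const TICKET_TYPES: readonly string[] = ["data-deletion", "refund", "other"];

// Returns a validated value, or throws with a reason.
function parseTicket(raw: string): { ticketType: TicketType } {
  const data: unknown = JSON.parse(raw); // throws on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("expected an object");
  }
  const ticketType = (data as { ticketType?: unknown }).ticketType;
  if (typeof ticketType !== "string" || TICKET_TYPES.indexOf(ticketType) === -1) {
    throw new Error("ticketType is missing or not an allowed value");
  }
  return { ticketType: ticketType as TicketType };
}

// Calls the model until its output validates or the attempts run out.
// The previous error is passed back so a real model could correct itself.
function structured(
  callModel: (attempt: number, lastError?: string) => string,
  maxAttempts = 3,
): { ticketType: TicketType; attempts: number } {
  let lastError: string | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const parsed = parseTicket(callModel(attempt, lastError));
      return { ticketType: parsed.ticketType, attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  throw new Error(`no valid output after ${maxAttempts} attempts: ${lastError}`);
}

// Stub model: malformed JSON, then an invalid enum value, then a valid answer.
const replies = ["not json", '{"ticketType":"billing"}', '{"ticketType":"refund"}'];
const result = structured((attempt) => replies[attempt - 1]);
```

In this run the first two replies are rejected and the third is accepted, so the caller only ever sees a value that matches the schema.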

🧑‍💼 Human-in-the-Loop with approval workflows

Inferable allows you to integrate human approval and intervention with full context preservation. See Human-in-the-Loop.

deleteUserWorkflow.version(1).define(async (ctx, input) => {
  // ... existing workflow code ...

  if (!ctx.approved) {
    return Interrupt.approval({
      message: `I need your approval to delete the user ${input.userId}. Is this ok?`,
      destination: {
        type: "email",
        // The email address to notify
        email: "test@example.com",
      },
    });
  }

  await db.customers.delete({
    userId: input.userId,
  });
});
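
One way to picture the pause and resume behaviour is the standalone sketch below. It assumes a runner that stores the original input when a handler returns an approval request and re-runs the handler once a human approves. `ApprovalRunner`, `ApprovalRequest`, and `deleteUser` are hypothetical names, not part of the Inferable SDK.

```typescript
// Hypothetical sketch of approval interrupts; not the Inferable SDK.
type ApprovalRequest = { kind: "approval"; message: string };
type Ctx = { approved: boolean };
type Outcome = { status: "paused"; message: string } | { status: "done" };

const deletedUsers: string[] = [];

// Returns an approval request instead of performing the side effect until approved.
function deleteUser(ctx: Ctx, input: { userId: string }): ApprovalRequest | undefined {
  if (!ctx.approved) {
    return { kind: "approval", message: `Approve deleting user ${input.userId}?` };
  }
  deletedUsers.push(input.userId);
  return undefined;
}

class ApprovalRunner {
  // executionId -> input preserved while waiting for a human decision
  private waiting = new Map<string, { userId: string }>();

  start(executionId: string, input: { userId: string }): Outcome {
    const request = deleteUser({ approved: false }, input);
    if (request !== undefined) {
      this.waiting.set(executionId, input);
      return { status: "paused", message: request.message };
    }
    return { status: "done" };
  }

  // Re-runs the handler with the original input once a human approves.
  approve(executionId: string): Outcome {
    const input = this.waiting.get(executionId);
    if (input === undefined) throw new Error(`nothing waiting for ${executionId}`);
    this.waiting.delete(executionId);
    deleteUser({ approved: true }, input);
    return { status: "done" };
  }
}

const runner = new ApprovalRunner();
const paused = runner.start("exec-1", { userId: "u-42" });
const deletionsWhilePaused = deletedUsers.length; // nothing deleted yet
const finished = runner.approve("exec-1");
```

The destructive step runs only on the second pass, and the input captured at the first pass is what gets replayed.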

🤖 Agents with Tool Use

Inferable agents can use tools to achieve pre-defined goals. See Agents.

const agentInstructions = `
  Evaluate the provided support ticket body and extract the user from the database.

  When searching for users, if you don't get specific results, try to search with a more general term with sub strings with unique nouns.
  For example, "John Smith": searchUser("John Smith"), searchUser("John"), searchUser("Smith"), etc.
`;

workflow.tools.register({
  name: "searchUser",
  schema: z.object({
    userId: z.string(),
  }),
  handler: async (ctx, input) => {
    // your own code to search for the user
  },
});

workflow.version(1).define(async (ctx, input) => {
  const { userId } = await ctx.llm.agents.react({
    name: "userSearch",
    instructions: agentInstructions,
    // Assumes the workflow input carries the ticket body as `ticketText`
    input: JSON.stringify({ ticket: input.ticketText }),
    tools: ["searchUser"],
    resultSchema: z.object({
      userId: z.string(),
    }),
  });

  // do something with the userId
  console.log(userId);
});
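
The search strategy in the instructions above (try the full name, then fall back to its individual words) can be sketched as a small tool-calling loop. This is only an illustration under stated assumptions: a scripted function stands in for the LLM, the user table is made up, and `scriptedModel` and `runAgent` are hypothetical names rather than Inferable APIs.

```typescript
// Hypothetical sketch of a tool-using agent loop; the "model" is a scripted stub.
type Tool = (query: string) => string[];
type ModelStep =
  | { type: "tool"; tool: string; query: string }
  | { type: "final"; userId: string | null };

// Made-up data for the example.
const users: Record<string, string> = { "u-1": "John Smith", "u-2": "Jane Doe" };

const tools: Record<string, Tool> = {
  // Returns the ids of users whose name contains the query as a substring.
  searchUser: (query) =>
    Object.keys(users).filter((id) => users[id].indexOf(query) !== -1),
};

// Stand-in for an LLM: tries the full name first, then each word on its own,
// and finishes as soon as a search returns exactly one user.
function scriptedModel(fullName: string, observations: string[][]): ModelStep {
  const last = observations[observations.length - 1];
  if (last && last.length === 1) return { type: "final", userId: last[0] };
  const queries = [fullName, ...fullName.split(" ")];
  const next = queries[observations.length];
  return next === undefined
    ? { type: "final", userId: null }
    : { type: "tool", tool: "searchUser", query: next };
}

function runAgent(
  fullName: string,
  maxSteps = 5,
): { userId: string | null; calls: string[] } {
  const observations: string[][] = [];
  const calls: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = scriptedModel(fullName, observations);
    if (step.type === "final") return { userId: step.userId, calls };
    calls.push(`${step.tool}(${step.query})`);
    observations.push(tools[step.tool](step.query));
  }
  return { userId: null, calls };
}

// The misspelled first name forces the agent to broaden its search.
const found = runAgent("Jon Smith");
```

The `maxSteps` bound keeps the loop from running forever when no search narrows down to a single user.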

And more:

  • Notifications to send notifications to users via Slack or Email.
  • Memoized Results to cache the results of side-effects and expensive operations in a distributed way.
  • Observability in a timeline view, or plug into your own observability tools.
  • Developer-friendly SDKs for Node.js and Go, with more languages coming soon.
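
As an illustration of the memoized results item in the list above, the sketch below caches the result of a side effect per execution so that a re-run replays the stored value instead of repeating the effect. `MemoStore` is hypothetical, and a local `Map` stands in for whatever shared store a distributed implementation would use.

```typescript
// Hypothetical sketch of result memoization; a Map stands in for a shared store.
class MemoStore {
  private store = new Map<string, string>();

  // Runs `fn` once per (executionId, key) and replays the stored result afterwards.
  memo<T>(executionId: string, key: string, fn: () => T): T {
    const cacheKey = `${executionId}:${key}`;
    const cached = this.store.get(cacheKey);
    if (cached !== undefined) return JSON.parse(cached) as T;
    const value = fn();
    // Serialize so the value could be shared across processes.
    this.store.set(cacheKey, JSON.stringify(value));
    return value;
  }
}

// A side effect that must not run twice for the same execution.
let chargeCount = 0;
const chargeCard = () => ({ chargeId: `ch-${++chargeCount}` });

const memoStore = new MemoStore();
const firstCharge = memoStore.memo("exec-1", "charge", chargeCard);
const replayedCharge = memoStore.memo("exec-1", "charge", chargeCard); // replayed, no second charge
const otherCharge = memoStore.memo("exec-2", "charge", chargeCard); // different execution, runs again
```

Keying on both the execution and a step name is a design choice for this sketch: it lets one execution retry safely without sharing results across unrelated executions.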

📚 Language Support

Language               Source        Package
Node.js / TypeScript   Quick start   NPM
Go                     Quick start   Go

🚀 Open Source

This repository contains the Inferable control plane and SDKs for various languages.

Core services:

  • /control-plane - The core Inferable control plane service
  • /app - Playground front-end and management console
  • /cli - Command-line interface tool (alpha)

SDKs:

  • /sdk-node - Node.js/TypeScript SDK
  • /sdk-go - Go SDK
  • /sdk-dotnet - .NET SDK (experimental)

💾 Self Hosting

Inferable is completely open source and can be self-hosted on your own infrastructure for complete control over your data and compute. This gives you:

  • Full control over your data and models
  • No vendor lock-in
  • Enhanced security with your own infrastructure
  • Customization options to fit your specific needs

See our self-hosting guide for more details.

🤝 Contributing

We welcome contributions to all projects in the Inferable repository. Please read our contributing guidelines before submitting any pull requests.

📝 License

All code in this repository is licensed under the MIT License.