
Getting Started

info

Synthetic was formerly named glhf.chat. You may still find references to glhf in the codebase.

Start the backends

docker compose up --detach

First-time setup

Currently, the canonical keys are stored in Azure Key Vault. Someone will need to grant you access.

Then install the Azure CLI, and run:

# Make sure you're logged into the correct Azure account on the CLI
az login --allow-no-subscriptions --use-device-code

# Install packages
npm install

# Set up your .env file. You can run this again in the future if new secrets or
# environment variables get added.
npm run sync-dotenv

First-time setup: Migrations

npx tsx ./scripts/migrate.ts

First-time setup: Enable local HTTPS

# macOS
brew install mkcert

# Linux & other OSes:
# https://github.com/FiloSottile/mkcert?tab=readme-ov-file#linux

# Add mkcert local Root CA to local trusted CA store.
mkcert -install

# Generate and sign certificates
cd certs
mkcert localhost
tip

If you are accessing the dev instance from a different URL, replace localhost with the correct hostname and either rename your certs to localhost*.pem or edit server.ts.
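For reference, here's a minimal sketch of what pointing the server at renamed certs might look like, assuming server.ts builds standard Node HTTPS options (the hostname and filenames below are placeholders):

// Hypothetical sketch: certs generated for a non-localhost hostname. mkcert names
// them hostname.pem and hostname-key.pem, so either rename them to localhost*.pem
// or update the paths where server.ts sets up TLS.
import fs from "node:fs";

const httpsOptions = {
  key: fs.readFileSync("./certs/dev.example.internal-key.pem"),  // placeholder hostname
  cert: fs.readFileSync("./certs/dev.example.internal.pem"),     // placeholder hostname
};
// ...pass httpsOptions to https.createServer() wherever server.ts configures HTTPS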

Run the dev server

npm run dev

# In another terminal session
# This enqueues jobs to turn down GPU clusters that are idle, to save money.
npx tsx autoscaler.ts

# In another terminal session
# This processes jobs from all queues
npx tsx queue-worker.ts --work-all-queues

Note that the autoscaler runs locally, so if you turn off your laptop while a cluster is running, the cluster won't be stopped until you turn your laptop back on! Please destroy your clusters before shutting your laptop down, using:

npx tsx ./scripts/destroy-cluster.ts

Chat with LLMs

Just open https://localhost:3000/ and launch one.

Don't worry if the first launch seems slow: a newly-created cluster takes 3+ minutes to come up, depending on how long the machine takes to pull the vLLM Docker image. Stopped machines scale back up faster, but it depends on how long the model weights take to load into VRAM: for a 70B Q4 model, for example, it takes roughly 2 minutes until the model is ready. You can also check the machine logs and status in the Fly.io dashboards.

Running a REPL

Use the npm run repl command, which automatically opens a TypeScript REPL with the correct env vars configured and tracing set up. This works both in development and, if necessary, when SSHed into production machines.
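For example, once the REPL is open you can import application modules and exercise them directly. A minimal sketch, assuming top-level await is available and using a hypothetical queue module like the one defined in the Queues & workers section below:

// Hypothetical REPL session; the module path is a placeholder.
const { myQueue } = await import("@/app/queues/my-queue");
await myQueue.push("myJob", "hello from the REPL");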

Queues & workers

We have a small layer over Postgres and Redis for distributed job queues. Jobs are stored in Postgres, and we occasionally poll it for new work; however, we also attempt to instantly distribute work via Redis LPUSH/BLPOP, to avoid the latency of polling. The basic abstraction is a Queue (imported from @/app/queues), which takes:

  1. A name, and
  2. A mapping of job names to worker configurations

Worker configurations should generally be set up by first defining their max allowed attempts before failure, and then giving a Structural type for the job data and an async callback that runs the job. For example:

import { t } from "structural";
import { queue, maxAttempts } from "@/app/queues";

const worker = maxAttempts(5);
export const myQueue = queue("myQueue", {
  myJob: worker(t.str, async (someString) => {
    // ... business logic goes here
  }),
});


// To push a job:
await myQueue.push("myJob", "hello world");

Then, import all your queue files in @/queue-worker.ts, which is the main worker entrypoint; your queue will automatically start running. Jobs are delivered in order; however, since jobs can be worked across multiple machines, there's no guarantee they finish in order: only in-order delivery is guaranteed.
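A minimal sketch of what that might look like in queue-worker.ts, assuming the queue lives in a hypothetical @/app/queues/my-queue module and a side-effect import is all that's needed:

// queue-worker.ts (sketch)
// Importing the module that defines the queue registers it with the worker.
import "@/app/queues/my-queue"; // placeholder path to the file exporting myQueue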

If a job fails too many times, it will get added to its queue's corresponding dead-letter queue, which is a capped-size list in Redis for holding permanently-failed jobs for debugging. You can check your queue's dead-letter queue with:

const deadLetters = await myQueue.deadLetters();

The dead-letter queue's size is fixed, so if you have too many job failures, there's no guarantee you'll be able to see all of them. The dead-letter queue isn't a durability mechanism: it's only a debugging tool.
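For example, a quick way to inspect failures from the REPL (a sketch; the shape of each entry is whatever deadLetters() returns):

const deadLetters = await myQueue.deadLetters();
for (const letter of deadLetters) {
  console.log(letter); // inspect the failed job's name, payload, and error, as available
}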

DevOps

GitHub Actions and Workflows

The .github directory contains actions and workflows for Synthetic DevOps.

Testing

Workflows can be tested locally, without commits or pushes, using nektos/act:

act push -s GITHUB_TOKEN="$(gh auth token)" --secret-file .env -v