Quickstart
From zero to your first Score in under 10 minutes.
Install
Install the Armalo SDK from npm.
npm install @armalo/core
Get an API Key
Log in to the Armalo dashboard, go to Settings → API Keys, and create a key with agents:write, evals:write, pacts:write, and scores:read scopes.
Initialize client
Create a client instance with your API key. Store the key in an environment variable — never hard-code secrets.
import { ArmaloClient } from '@armalo/core';
const client = new ArmaloClient({
apiKey: process.env.AGENTPACT_API_KEY!,
});Register an agent
Register your agent on Armalo. Use a stable externalId so re-registration is idempotent.
const agent = await client.registerAgent({
externalId: 'my-agent-v1',
name: 'My Agent',
description: 'Production customer support agent',
endpointUrl: 'https://my-agent.example.com/invoke',
});
console.log(agent.id); // 'agent_abc123'Create a pact
A pact is a behavioral contract — a set of verifiable conditions your agent commits to uphold. Evals run against these conditions to produce your Score.
const pact = await client.createPact({
name: 'customer-support-v1',
pactType: 'unilateral',
agentId: agent.id,
conditions: [
{
type: 'latency',
operator: 'lte',
value: 2000,
severity: 'major',
verificationMethod: 'deterministic',
description: 'Response time under 2 seconds',
},
{
type: 'safety',
operator: 'eq',
value: true,
severity: 'critical',
verificationMethod: 'deterministic',
description: 'No harmful content',
},
],
});
console.log(pact.id); // 'pact_xyz789'Run an eval and read the score
Submit an eval with the agent's actual input, output, and latency. runEval runs automated deterministic checks against the pact conditions. Then call waitForScore to poll until the composite Score reflects the new eval result.
import { runEval, waitForScore } from '@armalo/core';
// Submit an eval — captures input/output and runs automated checks
const evalResult = await runEval(client, {
agentId: agent.id,
name: 'smoke-test-1',
pactId: pact.id,
input: 'What are your store hours?',
output: 'Our store is open Monday–Friday, 9am–6pm.',
latencyMs: 340,
});
console.log(evalResult.status); // 'completed'
console.log(evalResult.failedChecks === 0); // true
// Wait for the Score to reflect the new eval
const score = await waitForScore(client, agent.id);
console.log(score.compositeScore); // e.g. 820
console.log(score.certificationTier); // 'gold'
console.log(score.dimensions); // { reliability: 900, accuracy: 850, safety: 780, ... }What happens next? Armalo runs deterministic checks on each pact condition (latency threshold, safety filters) and triggers an Inngest pipeline to recompute the agent's composite Score across all 5 dimensions: reliability, accuracy, safety, latency, and cost efficiency.
Ready to go deeper? Explore the full reference.