
Lattice-2

Static analysis layer for semantic diffing and exploit chain reasoning

Parameters

13B

Current production checkpoint.

Context window

128k tokens

Rotary embeddings + chunk routing.

Cost guidance

$0.18 / 1K tokens

A100 (80GB) or 16-core AVX-512 CPU

Operational envelope

  • Mixture-of-experts cross-lingual heads
  • Tool calling: SARIF generation, SBOM summarization
  • Latency @ BF16 (batch=1, 512-token prompt, A100, CUDA 12.1): p50 145ms | p95 310ms | p99 580ms
  • CPU, AVX-512 (same config): p50 410ms | p95 920ms | p99 1750ms
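The p50/p95/p99 figures above are standard latency percentiles; a minimal sketch of the nearest-rank computation (the sample durations are illustrative, not measured Lattice-2 data):

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of latency samples (ms)."""
    ranked = sorted(samples)
    # nearest-rank method: k = ceil(pct/100 * n), 1-indexed
    k = max(1, math.ceil(pct / 100 * len(ranked)))
    return ranked[k - 1]

# Illustrative request durations, not real benchmark data
durations_ms = [120, 130, 145, 150, 160, 200, 280, 310, 400, 580]
p50 = percentile(durations_ms, 50)
p95 = percentile(durations_ms, 95)
p99 = percentile(durations_ms, 99)
```

Production monitoring systems typically compute these server-side over much larger sample windows, but the definition is the same.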

Production SLO thresholds

  • Model error rate

    Inference requests flagged as erroneous (exceptions, timeouts, malformed SARIF) / total requests

    Threshold: < 2% over 5-minute sliding window

  • SARIF reproduction success

    Findings where automated sandbox validation confirms vulnerability / total model findings

    Threshold: > 60% over 1-hour window

  • p95 latency

    95th percentile inference duration measured server-side

    Threshold: < 500ms (L1), < 800ms (L2), < 1500ms (L3)
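The error-rate SLO above reduces to a sliding-window ratio check; a minimal sketch, assuming timestamped request records (the data shape is illustrative):

```python
WINDOW_S = 300          # 5-minute sliding window, per the SLO above
ERROR_RATE_SLO = 0.02   # < 2% threshold

def error_rate(requests, now):
    """requests: list of (timestamp_s, is_error) tuples; is_error covers
    exceptions, timeouts, and malformed SARIF, per the definition above."""
    window = [(t, e) for t, e in requests if now - WINDOW_S <= t <= now]
    if not window:
        return 0.0
    return sum(1 for _, e in window if e) / len(window)

def slo_ok(requests, now):
    return error_rate(requests, now) < ERROR_RATE_SLO
```

In practice the window would be maintained incrementally (e.g. a ring buffer of per-second counters) rather than re-filtered per evaluation.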

Performance snapshot

Benchmark results

Latest evaluation window across public and internal datasets.

Benchmark                                 | Metric         | Lattice-2 | Baseline
CWE-V Suite (1.4k vulns, test split)      | F1             | 0.81      | 0.58 (Llama-3-70B)
OSS-Fuzz triage (2.1k crashes, held-out)  | Recall @ top 5 | 0.66      | 0.41 (GPT-4-turbo)
Internal exploit chain set (380 chains)   | Step accuracy  | 0.63      | N/A (proprietary)

CWE-V Suite (1.4k vulns, test split): 3 runs, seeds: 42, 1337, 9001; temp=0.3

OSS-Fuzz triage (2.1k crashes, held-out): Single pass, no few-shot prompting

Internal exploit chain set (380 chains): Avg over 3 seeds; eval script: github.com/evalops/eval-harness@v2.1.3
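Recall @ top 5, the OSS-Fuzz triage metric above, is standard recall over the top-ranked candidates; a minimal sketch (the inputs are illustrative):

```python
def recall_at_k(ranked, relevant, k=5):
    """Fraction of relevant items appearing in the top-k ranked predictions."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

# Illustrative: model ranks candidate root causes for one crash
score = recall_at_k(["frame_a", "frame_b", "frame_c"], ["frame_b"], k=5)
```

The suite-level number is then the average of this score across the held-out crashes.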

Cost & deployment

Sizing guidance

Reference pricing assumes production-hardened inference with observability enabled.

Cost per 1K tokens

$0.18 / 1K tokens

Reference hardware

A100 (80GB) or 16-core AVX-512 CPU

Autoscaling notes

Batch inference recommended for large repos (>100K LOC)

Deployment scenarios

Scenario                               | Est. tokens | Recommendation       | Cost per scan        | Note
Medium repo (250K LOC, ~120MB tarball) | ~85K tokens | Lattice-2 batch mode | $15.30 per full scan | Use streaming for repos > 500K LOC to avoid timeout

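The cost-per-scan figure follows directly from the token estimate and the per-1K rate; a sketch:

```python
RATE_PER_1K_TOKENS = 0.18  # USD, from the pricing above

def scan_cost(est_tokens):
    """Estimated USD cost of one full scan."""
    return est_tokens / 1000 * RATE_PER_1K_TOKENS

# Medium repo scenario: ~85K tokens -> 85 * $0.18 = $15.30
cost = scan_cost(85_000)
```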
Integration surface

Runtime expectations

Consistent across the Lattice family with model-specific latency budgets.

Interfaces

  • gRPC endpoint (stream + unary)
  • REST inference proxy
  • CLI for batch audit runs

Artifacts accepted

  • Source trees (Git), SBOM manifests
  • Compiled binaries (ELF/PE/Mach-O)
  • Container images, IaC templates

Outputs

  • SARIF v2.1.0
  • Custom JSON (root-cause + reproduction steps)
  • Markdown incident briefs
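A minimal SARIF v2.1.0 envelope, the primary output format listed above, has the following shape; the tool name, rule ID, file path, and message here are illustrative, not Lattice-2's actual identifiers:

```python
import json

# Minimal SARIF v2.1.0 document: one run, one result (illustrative values)
sarif = {
    "version": "2.1.0",
    "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
    "runs": [{
        "tool": {"driver": {"name": "lattice-2", "rules": [
            {"id": "CWE-787",
             "shortDescription": {"text": "Out-of-bounds write"}}
        ]}},
        "results": [{
            "ruleId": "CWE-787",
            "level": "error",
            "message": {"text": "Unchecked copy length in header parsing"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "src/parser.c"},
                "region": {"startLine": 142}
            }}]
        }]
    }],
}

doc = json.dumps(sarif, indent=2)
```

Downstream tooling (code scanning dashboards, the sandbox validator) consumes the `runs[].results[]` array.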

Observability

  • OpenTelemetry traces
  • Metric export: Prometheus
  • Audit logs: S3/GCS
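Metrics exported to Prometheus use the standard text exposition format; a sketch of rendering the two request counters behind the error-rate SLO (the metric names are illustrative, not the product's actual names):

```python
def render_metrics(total_requests, error_requests):
    """Render two counters in Prometheus text exposition format."""
    lines = [
        "# HELP lattice_inference_requests_total Total inference requests.",
        "# TYPE lattice_inference_requests_total counter",
        f"lattice_inference_requests_total {total_requests}",
        "# HELP lattice_inference_errors_total Requests flagged erroneous.",
        "# TYPE lattice_inference_errors_total counter",
        f"lattice_inference_errors_total {error_requests}",
    ]
    return "\n".join(lines) + "\n"
```

The error-rate SLO is then expressed Prometheus-side as a ratio of `rate()` queries over these counters.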

Next steps

Engage the research team

Share repository scope, desired runtimes, and deployment constraints so we can scope evaluation access.