Lattice-1
Entry-point model for CI/CD and dependency diffing
- Parameters: 7B (current production checkpoint)
- Context window: 64k tokens (rotary embeddings + chunk routing)
- Cost guidance: $0.08 / 1K tokens on an A100 (40GB) or an 8-core AVX2 CPU
Operational envelope
- Quantization: Q4_0, Q8_0
- Latency @ Q8_0 (batch=1, 512-token prompt, A100, CUDA 12.1): p50 82ms | p95 180ms | p99 320ms
- CPU AVX2 (same config): p50 220ms | p95 510ms | p99 980ms
- Batch size: 16 tokens/step
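The p50/p95/p99 figures above summarize per-request timings; a minimal sketch of one common way to compute such a summary (the nearest-rank method; the sample values below are illustrative, not measurements):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (milliseconds)."""
    ranked = sorted(samples)
    k = max(1, math.ceil(p / 100 * len(ranked)))  # nearest-rank, 1-indexed
    return ranked[k - 1]

# Illustrative samples -- not the measured production distribution.
latencies_ms = [80, 82, 85, 90, 120, 180, 200, 250, 320, 400]
summary = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}
```

Note that tail percentiles (p99) are noisy at low sample counts, which is why the SLO section below evaluates them over fixed time windows.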
Production SLO thresholds
- Model error rate: inference requests flagged as erroneous (exceptions, timeouts, malformed SARIF) / total requests. Threshold: < 2% over a 5-minute sliding window.
- SARIF reproduction success: findings where automated sandbox validation confirms the vulnerability / total model findings. Threshold: > 60% over a 1-hour window.
- p95 latency: 95th-percentile inference duration, measured server-side. Threshold: < 500ms (L1), < 800ms (L2), < 1500ms (L3).
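The error-rate SLO can be checked with a sliding window over recent request outcomes; a sketch, assuming the 5-minute window and 2% threshold stated above (the class and method names are illustrative, not part of any shipped API):

```python
import time
from collections import deque

class ErrorRateSLO:
    """Sliding-window error-rate check: <2% erroneous requests over 5 minutes."""

    def __init__(self, window_s=300, threshold=0.02):
        self.window_s = window_s
        self.threshold = threshold
        self.events = deque()  # (timestamp, is_error) pairs, oldest first

    def record(self, is_error, now=None):
        now = time.monotonic() if now is None else now
        self.events.append((now, bool(is_error)))
        self._evict(now)

    def _evict(self, now):
        # Drop events that have aged out of the sliding window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()

    def error_rate(self, now=None):
        now = time.monotonic() if now is None else now
        self._evict(now)
        if not self.events:
            return 0.0
        return sum(e for _, e in self.events) / len(self.events)

    def breached(self, now=None):
        return self.error_rate(now) >= self.threshold
```

In production this logic typically lives in the metrics backend (e.g. a rate-of-errors query) rather than in application code; the sketch just makes the window semantics concrete.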
Benchmark results
Latest evaluation window across public and internal datasets.
| Benchmark | Metric | Lattice-1 | Baseline |
|---|---|---|---|
| CWE-V Suite (1.4k vulns, test split) | F1 | 0.71 | 0.58 (Llama-3-70B) |
| OSS-Fuzz triage (2.1k crashes, held-out) | Recall @ top5 | 0.54 | 0.41 (GPT-4-turbo) |
| Internal exploit chain set (380 chains) | Step accuracy | 0.42 | N/A (proprietary) |
- CWE-V Suite (1.4k vulns, test split): 3 runs (seeds 42, 1337, 9001), temp=0.3
- OSS-Fuzz triage (2.1k crashes, held-out): single pass, no few-shot prompting
- Internal exploit chain set (380 chains): average over 3 seeds; eval script: github.com/evalops/eval-harness@v2.1.3
Sizing guidance
Reference pricing assumes production-hardened inference with observability enabled.
- Cost per 1K tokens: $0.08
- Reference hardware: A100 (40GB) or 8-core AVX2 CPU
- Autoscaling: horizontal pod autoscaling targeting 70% GPU utilization
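The 70% utilization target plugs into the standard horizontal-pod-autoscaler scaling rule, desired = ceil(current × observed / target); a sketch (the min/max replica bounds are illustrative defaults, not Lattice-specific values):

```python
import math

def desired_replicas(current_replicas, current_util, target_util=0.70,
                     min_replicas=1, max_replicas=16):
    """Standard HPA rule: desired = ceil(current * observed / target),
    clamped to the configured replica bounds."""
    desired = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas at 90% GPU utilization scale out to 6, while 4 replicas at 35% scale in to 2.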
Deployment scenarios
| Scenario | Est. tokens | Recommendation | Cost per scan | Note |
|---|---|---|---|---|
| Microservice (15K LOC, ~8MB) | ~5K tokens | Lattice-1 streaming | $0.40 per scan | Ideal for CI/CD on every commit |
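The cost column is simply token volume times the $0.08 / 1K rate; a quick arithmetic check:

```python
def cost_per_scan(tokens, rate_per_1k=0.08):
    """Cost in USD for one scan at the published $0.08 / 1K token rate."""
    return tokens / 1000 * rate_per_1k

# ~5K tokens for the microservice scenario -> $0.40 per scan
microservice_cost = cost_per_scan(5000)
```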
Runtime expectations
Consistent across the Lattice family with model-specific latency budgets.
Interfaces
- gRPC endpoint (stream + unary)
- REST inference proxy
- CLI for batch audit runs
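A request against the REST inference proxy might be assembled like the sketch below; the field names and payload shape here are assumptions for illustration, not the documented API schema:

```python
import json

def build_scan_request(artifact_uri, artifact_type, output_format="sarif"):
    """Assemble a JSON body for a scan request.
    All field names are hypothetical -- consult the actual API reference."""
    return {
        "model": "lattice-1",
        "artifact": {"uri": artifact_uri, "type": artifact_type},
        "output": {"format": output_format},  # sarif | json | markdown
        "stream": False,
    }

body = json.dumps(build_scan_request(
    "git+https://example.com/repo.git", "source_tree"))
```

The same payload could be sent over the unary gRPC endpoint; streaming would instead set `stream` and consume incremental findings.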
Artifacts accepted
- Source trees (Git), SBOM manifests
- Compiled binaries (ELF/PE/Mach-O)
- Container images, IaC templates
Outputs
- SARIF v2.1.0
- Custom JSON (root-cause + reproduction steps)
- Markdown incident briefs
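SARIF v2.1.0 nests findings under `runs[].results[]`; a minimal sketch of flattening model output for downstream tooling (the sample log is illustrative, not real model output):

```python
import json

def extract_findings(sarif_text):
    """Flatten a SARIF v2.1.0 log into (ruleId, level, message) tuples."""
    doc = json.loads(sarif_text)
    findings = []
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            findings.append((
                result.get("ruleId"),
                result.get("level", "warning"),  # SARIF's default level
                result.get("message", {}).get("text", ""),
            ))
    return findings

sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "lattice-1"}},
        "results": [{
            "ruleId": "CWE-787",
            "level": "error",
            "message": {"text": "Out-of-bounds write in parser"},
        }],
    }],
})
```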
Observability
- OpenTelemetry traces
- Metric export: Prometheus
- Audit logs: S3/GCS
Engage the research team
Share your repository scope, desired runtimes, and deployment constraints so we can arrange evaluation access.