INTRODUCING · PUNKY TIGER LABS

NYMPH S-Quantum.
The first persistent cognitive AI PCIe card
for personal computers.

Powered by the CES-winning DeepX DX-M1M AI accelerator. Wrapped in Punky Tiger Labs' patented persistent cognitive architecture, backed by 91+ USPTO patent filings, which gives consumer machines a memory that survives a reboot.

Insert it. Everything changes.

31 TOPS dedicated AI
DeepX DX-M1M + RK3588 NPU
Dual M.2 expansion
KV-cache & VRAM liberation
ONNX · PyTorch · TensorFlow
8K decode + multimodal
Fanless low-profile PCIe
100% local + cloud-enhanced
Dedicated AI (INT8): 31 TOPS
Power: <20W peak · ~6W typical
USPTO patent filings: 91+
Estimated retail: $590
SILICON SUPPLIERS
DeepX + Rockchip
Integrated under the NYMPH stack.
We don't make silicon. We make it remember. The DeepX DX-M1M (the AI chip that won CES 2026 Best of Innovation) provides dedicated neural acceleration. Rockchip's RK3588 SoC orchestrates the system. The persistent cognitive architecture, the 91+ USPTO patent filings, and the product belong to Punky Tiger Labs.
The Problem

Every AI system forgets the moment a session ends.

Every context is rebuilt from scratch. Every conversation starts at zero. The majority of compute — and cost — goes toward re-establishing what the system already knew.

NYMPH was designed to end that.

The Solution

Cognitive state lives on the card, not in the prompt.

NYMPH S-Quantum offloads context management, speech, vision, and inference to dedicated on-card processors — freeing GPU VRAM entirely for model weights and rendering.

The card does not replace your GPU. It complements it.

Architecture

Multi-processor cognitive system. Future-proof by design.

Five blocks, each with a dedicated role: three compute engines and persistent NAND at launch, plus a second M.2 slot reserved for the DX-M2. Parallel by construction. No contention.

DX-M1M | 25 TOPS | DeepX · M.2 Slot 1 | KV Cache · LLM · Vision · Audio
DX-M2 | 2027 | M.2 Slot 2 | Samsung 2nm GAA · Upgrade-Ready
RK3588 NPU | 6 TOPS | Rockchip | System Orchestrator
Mali-G610 | GPU | Arm | On-card render
NAND 64GB | Persistent | NYMPH IP | State storage

PCIe Gen3 x4 · <20W peak · ~6W typical · Fanless heatsink · Low profile · Dual M.2 slots (1 populated, 1 future-ready)
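
To make "parallel by construction" concrete, here is a minimal host-side sketch of three independent stages dispatched at the same time. It assumes nothing about the NYMPH or DXNN APIs: the stage functions are hypothetical stubs, and the `time.sleep` calls only stand in for on-device latency.

```python
# Sketch: dispatch independent speech, vision, and language stages in parallel.
# The stage functions are stubs standing in for device calls; no real
# NYMPH or DXNN API is used or implied.
import time
from concurrent.futures import ThreadPoolExecutor


def speech_stage(audio):
    time.sleep(0.05)               # placeholder for on-device latency
    return f"transcript({audio})"


def vision_stage(frame):
    time.sleep(0.05)
    return f"objects({frame})"


def language_stage(prompt):
    time.sleep(0.05)
    return f"reply({prompt})"


def run_parallel(audio, frame, prompt):
    """Submit all three stages at once and wait for every result."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "speech": pool.submit(speech_stage, audio),
            "vision": pool.submit(vision_stage, frame),
            "language": pool.submit(language_stage, prompt),
        }
        return {name: fut.result() for name, fut in futures.items()}


start = time.perf_counter()
results = run_parallel("mic.wav", "frame0", "hello")
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because the stages do not wait on one another, the wall-clock time should track the slowest stage rather than the sum of all three, which is the property the dedicated-processor layout is meant to provide.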

Beyond an Accelerator

NYMPH is not just another accelerator card.
It enables entirely new forms of AI at the hardware level.

01

Persistent-memory AI agents that survive reboots

Context survives session boundaries, app restarts, and full system reboots. Warm-state resume measured in the tens of milliseconds.

No consumer hardware currently offers this.
02

Up to 8× larger context windows · 325% faster generation

KV cache moves to dedicated on-card memory, freeing GPU VRAM for model weights. With the cache off the GPU, long-context token generation ran up to 325% faster in internal testing.

Up to 8× context · +325% long-context throughput
03

Parallel voice + vision + LLM pipelines

Speech, object detection, and language modeling run simultaneously on dedicated processors. No contention, no serialization.

3,523 FPS classification at 3W · DeepX-verified
04

Autonomous agents running 24/7 at ~6W

Always-on AI under $2/month in electricity. State persists across any interruption. Agents resume from exact state after crash or reboot.

~6W typical · <$2/month · 24/7 operation
05

GPU remains free for rendering, gaming, training

Run larger models on the same GPU while NYMPH handles cognitive load. Cognitive NPCs that remember the player. Zero impact on FPS.

Zero impact on rendering FPS
06

Local AI that remembers and evolves over time

Multiple live AI workspaces cached simultaneously. Instant context resume in milliseconds. Knowledge accumulates across sessions, weeks, months.

The cognitive equivalent of virtual desktops.
Specifications

Technical details.

AI Performance (V1): 31 TOPS (INT8), DX-M1M 25 TOPS + RK3588 NPU 6 TOPS
Active AI Processors: DX-M1M · RK3588 NPU · Mali-G610 GPU · 8-core ARM CPU
M.2 Expansion Slot: 1× M.2 2280 ready for DX-M2 (2027 upgrade)
Quantization Engine: DeepX IQ8 (FP32-level precision in INT8 format)
On-Card Memory: 4 GB LPDDR4X (DX-M1M) + LPDDR5 (RK3588)
Persistent Storage: 64 GB NAND (cognitive state survives reboots)
Host Interface: PCIe Gen3 x4
Power Consumption: <20W peak · ~6W typical
Thermal Solution: Passive heatsink (fanless operation)
Form Factor: Low profile PCIe card
Video Processing: RK3588 VPU (8K decode, 4K encode)
OS Support: Linux (Ubuntu) · Windows · Android AOSP
AI Frameworks: ONNX · PyTorch · TensorFlow via DXNN SDK
Compatibility: Any x86 desktop/workstation with PCIe slot
Estimated Retail Price: $590 USD
Patented technologies: KV-Pinning · State Capsules · SCMP · TAPIM · PNCA · OCCS · HCIS · TOKENFLOW · STREAMFLOW
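
For readers unfamiliar with what running a model "in INT8 format" involves, the sketch below shows plain symmetric per-tensor quantization in Python. It is a generic textbook scheme, not DeepX's proprietary IQ8 method, whose internals are not described here. The function names and sample weights are hypothetical.

```python
# Generic symmetric per-tensor INT8 quantization (illustrative only;
# NOT the proprietary DeepX IQ8 algorithm).

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]   # made-up sample values
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
# Rounding error per element is at most half a quantization step.
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, worst_error)
```

The gap between `weights` and `restored` is the precision lost to quantization. Schemes that claim FP32-level accuracy in INT8 aim to keep that gap from affecting model outputs.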
Observed Impact

What changes when you insert it.

Dedicated AI (expandable 2027): 31 TOPS
Larger context windows: 8×
Faster long-context token generation: +325%
Context persists after reboot: 100%
Typical always-on power draw: 6W

Observed in controlled internal testing. Actual results vary by configuration.
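
To see why context length and memory are tied together, the following back-of-the-envelope sketch estimates KV-cache size for a transformer decoder. The model shape (32 layers, 8 KV heads, head dimension 128, 16-bit cache entries) and the 4,096-token baseline are assumptions chosen for illustration, not measurements from NYMPH hardware.

```python
# Back-of-the-envelope KV-cache sizing for a transformer decoder.
# Model dimensions below are hypothetical, chosen only for illustration.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes needed to hold keys and values for seq_len tokens."""
    # Factor 2 = one key tensor + one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len


GIB = 1024 ** 3
model = dict(n_layers=32, n_kv_heads=8, head_dim=128)  # assumed shape, FP16 cache

for tokens in (4096, 8 * 4096):
    size = kv_cache_bytes(tokens, **model)
    print(f"{tokens:>6} tokens -> {size / GIB:.1f} GiB of KV cache")
```

The cache grows linearly with context, so an 8× longer window needs 8× the cache memory. Every byte of that cache kept out of VRAM is a byte left for model weights.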

Comparison

Your PC today vs Your PC + NYMPH.

Scenario | Without NYMPH | With NYMPH
Close session and return | Full context lost | Resumes instantly
Reboot PC | All AI state destroyed | 100% preserved in NAND
Switch between projects | Previous destroyed | All cached, instant switch
8-hour continuous session | Progressive degradation | Stable hour 1 to hour 8
LLM + vision + audio | One at a time | All parallel (dedicated)
Run AI agent overnight | GPU 350W, PC unusable | NYMPH 6W, PC free
GPU model capacity | Limited by VRAM | Significantly expanded
Cloud AI token costs | Majority is recomputation | Substantially reduced
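
The power figures above, and the under $2/month claim for 24/7 operation, can be checked with simple arithmetic. The sketch below assumes a 30-day month and an electricity price of $0.15 per kWh. The price is a placeholder, so substitute your own tariff.

```python
# Energy and cost arithmetic for an always-on device.
# The electricity price is an assumption; substitute your local tariff.

HOURS_PER_MONTH = 24 * 30


def monthly_kwh(watts):
    """Energy used in a 30-day month at a constant power draw."""
    return watts * HOURS_PER_MONTH / 1000


def monthly_cost(watts, price_per_kwh):
    return monthly_kwh(watts) * price_per_kwh


PRICE = 0.15  # USD per kWh, assumed

card_kwh = monthly_kwh(6)        # ~6W typical, from the spec sheet
gpu_kwh = monthly_kwh(350)       # 350W GPU figure from the comparison table
break_even = 2.00 / card_kwh     # tariff at which 6W costs exactly $2/month

print(f"Card: {card_kwh:.2f} kWh/month, ${monthly_cost(6, PRICE):.2f}")
print(f"GPU:  {gpu_kwh:.0f} kWh/month, ${monthly_cost(350, PRICE):.2f}")
print(f"$2/month holds below ${break_even:.2f}/kWh")
```

At 6W the card draws about 4.3 kWh over a 30-day month, so the $2/month ceiling holds for any tariff below roughly $0.46 per kWh.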
DeepX Verified · DX-M1M · 3W

Throughput on dedicated silicon.

All workloads run on DX-M1M while the host GPU remains 100% available.

Model / Task | Throughput | Source
MobileNetV2 (Classification) | 3,523 FPS | Verified
ResNet50 (Classification) | 1,186 FPS | Verified
YOLOv8L (Detection) | 366 FPS | Verified
DeepLabV3 (Segmentation) | 223 FPS | Verified
Pose Estimation | 200+ FPS | Verified

DeepX benchmarks from published Model Zoo data.
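
Raw FPS figures are easier to reason about once converted into per-frame latency, efficiency, and camera-stream capacity. The sketch below does that conversion under two assumptions: that the 3W figure in the section header applies to every row, and that one video stream needs 30 frames per second. Pose Estimation is left out because its throughput is quoted only as a lower bound.

```python
# Derived metrics from the throughput table above.
# Assumes the 3W figure in the section header applies to every row, and
# that each camera stream needs 30 frames per second (both assumptions).

BENCHMARKS_FPS = {
    "MobileNetV2": 3523,
    "ResNet50": 1186,
    "YOLOv8L": 366,
    "DeepLabV3": 223,
}
POWER_W = 3
STREAM_FPS = 30


def derived(fps):
    """Convert a throughput figure into latency, efficiency, and stream count."""
    return {
        "ms_per_frame": 1000 / fps,
        "fps_per_watt": fps / POWER_W,
        "streams_at_30fps": fps // STREAM_FPS,
    }


for name, fps in BENCHMARKS_FPS.items():
    d = derived(fps)
    print(f"{name:<12} {d['ms_per_frame']:.2f} ms/frame  "
          f"{d['fps_per_watt']:.0f} FPS/W  {d['streams_at_30fps']} streams")
```

The stream count is an upper bound from simple division. It ignores decode cost, batching, and scheduling overhead, so real deployments would likely support fewer.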

Use Cases

Who it's for.

AI Developers

Persistent-memory models, stateful agents

Local fine-tuning on private data. Any ONNX model with cloud-grade capabilities. Open SDK.

Power Users

Faster cloud workflows at lower cost

Claude Code, ChatGPT, Cursor with dramatically reduced token costs. Multiple cached contexts. Voice, vision, language in parallel.

Gamers & Creators

Cognitive NPCs, AI game master

Pose detection at 200+ FPS. Stable Diffusion or Flux alongside an LLM, simultaneously. Zero FPS impact on games.

Streamers

Moderation, subtitles, overlays — parallel

AI chat moderation, live subtitles, detection overlays — all on NYMPH while the GPU handles game and encoding.

Local AI Enthusiasts

Extended contexts, instant model switching

Llama, Mistral, DeepSeek with extended contexts. Models that remember you across sessions. Instant model switching in Ollama.

Privacy-First Users

100% local AI · full sovereignty

Nothing leaves your machine. Complete offline cognitive system. Patented architecture, your data only.

OPEN SDK

Build what wasn't possible before.

NYMPH SDK will be released as open-source software. Hardware is proprietary. Ecosystem is free.

01
Persistent-Memory Models
AI that accumulates knowledge across weeks and months. Hardware-level state the user owns and controls.
02
Real-Time Multimodal Pipelines
Audio, vision, language on dedicated processors. Simultaneous, not sequential. Under 20W total.
03
Stateful Autonomous Agents
Survive crashes, reboots, power loss. Resume from exact point of interruption. Run indefinitely at ~6W.
04
Local Fine-Tuning on Private Data
GPU trains while NYMPH handles inference and state. Private data never leaves the machine.
05
Any ONNX-Compatible Model
Llama, Mistral, Phi, Qwen, DeepSeek — cloud-grade capabilities on a desktop.
06
Cognitive Game Characters
Persistent memory, evolving behavior, zero rendering impact. A new interactive category.
07
Local AI Security Cameras
RK3588 VPU decodes 8K, DX-M1M runs detection. Multiple streams, complete privacy.
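
As an illustration of what "resume from exact point of interruption" requires in software, here is a generic host-side checkpoint pattern: write the new state to a temporary file, flush it to storage, then atomically rename it over the old checkpoint. This is not the NYMPH SDK, which has not been published. Every function name and the JSON state layout are hypothetical.

```python
# Host-side sketch of crash-safe agent state: write-then-rename checkpoints.
# This is a generic pattern, not the NYMPH SDK; all names are hypothetical.
import json
import os
import tempfile


def save_state(path, state):
    """Atomically replace the checkpoint so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())    # force data to stable storage
        os.replace(tmp_path, path)  # atomic rename over the old checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path, default):
    """Return the last checkpoint, or default on first run."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run_agent(path, steps):
    """Resume from the saved step counter and advance by `steps`."""
    state = load_state(path, {"step": 0, "notes": []})
    for _ in range(steps):
        state["step"] += 1
        state["notes"].append(f"did step {state['step']}")
        save_state(path, state)
    return state
```

Calling `run_agent` twice on the same path continues the step counter instead of restarting it. A hardware-backed state store would aim to give the same guarantee without the host file system in the loop.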
Silicon Partners

Best-in-class silicon.
Our cognitive architecture.

Punky Tiger Labs does not manufacture silicon. We integrate the best available, and we make it remember. The persistent cognitive architecture, the 91+ USPTO patent filings, and the card itself are Punky Tiger Labs IP.

DeepX
AI Silicon
DX-M1M neural processor · 25 TOPS INT8 at 3W. Proprietary IQ8 quantization delivers FP32-level accuracy in INT8. Winner of CES 2026 Best of Innovation.
CES 2026 ×2 · EE Times · WEF MINDS
Rockchip
System SoC
RK3588 SoC · 8-core ARM CPU + 6 TOPS NPU + Mali-G610 GPU + 8K VPU. The system orchestrator that coordinates the cognitive pipeline.
8-core ARM · 8K decode · Mali-G610
Upgrade Path · 2027

NYMPH V2: Insert the future.

NYMPH S-Quantum ships with two M.2 2280 slots. The first holds the DX-M1M. The second is empty — ready for DeepX's next-generation DX-M2 processor on Samsung's 2nm GAA (Gate-All-Around) process node. When the DX-M2 becomes available, insert it into the second slot. No new card. No new drivers. No lost state.

DX-M2 Process Node: Samsung 2nm GAA (Gate-All-Around)
Expected Sampling: Q3 2026
NYMPH V2 Upgrade Target: 2027
Installation: Insert DX-M2 into second M.2 slot on existing card
Compatibility: Full backward compatibility with the V1 SDK, State Capsules, and NAND data
Result: DX-M1M + DX-M2 running in parallel on one card

Buy NYMPH today with one processor. Add the next generation tomorrow. Your state, your data, your models — all carry forward.

From the Lab

AI should not reset every session.
AI should remember.

An evolution of the LLM paradigm, not a replacement. Hardware-level persistent state, accumulating knowledge across sessions. 91+ USPTO patent filings cover the full technology stack.

A different product for a different future.
One where AI remembers.
In a few weeks

The first persistent cognitive AI card for consumers.
For gamers, devs, and hardcore users.

S-Quantum is weeks from launch. Leave your email and we will reach out when units, pricing, and availability are confirmed — no spam, only product news.

Notify me at launch →