Abstract
We introduce OAM (OpenAGI Model), a file format designed to democratize artificial intelligence through efficiency and universal portability. OAM enables AI models to run on any device, from high-end servers to web browsers and embedded systems, by providing a low-overhead binary container format with minimal parsing complexity. This paper presents the technical specification, design philosophy, and practical applications of the OAM format, and shows how it achieves model portability across programming languages and platforms while maintaining computational efficiency.
1. Introduction
The current landscape of AI model deployment is fragmented. Different frameworks use incompatible formats (PyTorch's .pt, TensorFlow's SavedModel, ONNX, etc.), each requiring specific runtime dependencies and often platform-specific optimizations. This fragmentation creates barriers to AI accessibility and limits the potential for cross-platform deployment.
OAM addresses this fundamental challenge by providing a universal, language-agnostic format that prioritizes:
- Extreme Portability: Run on any device, any platform, any language
- Zero Dependencies: No external libraries or frameworks required
- Minimal Overhead: Designed for instant parsing and loading
- Architecture Agnostic: Support for any model architecture
- Hyper-Compression: Models smaller than training data through learned generalization
2. Technical Specification
2.1 File Format Structure
The OAM format uses a simple binary container: a four-byte magic string followed by three length-prefixed sections:
┌───────────────────────────────────────────┐
│ Magic Bytes: 'OAM1' (4 bytes)             │
├───────────────────────────────────────────┤
│ Metadata Length (uint32, little-endian)   │
│ Metadata (JSON, UTF-8)                    │
├───────────────────────────────────────────┤
│ Vocabulary Length (uint32, little-endian) │
│ Vocabulary (JSON, UTF-8)                  │
├───────────────────────────────────────────┤
│ Weights Length (uint32, little-endian)    │
│ Weights (JSON or binary)                  │
└───────────────────────────────────────────┘
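To make the layout concrete, a minimal file can be assembled by hand; the field values here are illustrative, not normative:

import json
import struct

meta = json.dumps({"model_name": "tiny"}).encode('utf-8')
vocab = json.dumps(["a", "b"]).encode('utf-8')
weights = json.dumps([0.5, 0.5]).encode('utf-8')

blob = b'OAM1'  # magic bytes
for section in (meta, vocab, weights):
    # uint32 little-endian length prefix, then the payload
    blob += struct.pack('<I', len(section)) + section

with open('tiny.oam', 'wb') as f:
    f.write(blob)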
2.2 Metadata Block
The metadata block contains essential model information in JSON format:
{
  "created_at": "2025-12-03T21:00:00Z",
  "model_name": "example-model",
  "description": "Model description",
  "params_count": 1250000,
  "params_count_readable": "1.25M",
  "layers": 5,
  "architecture": "rnn|ngram|transformer|custom"
}
2.3 Design Rationale
The format deliberately uses JSON for metadata and vocabulary to maximize human readability and cross-language compatibility. While binary formats could be more compact, JSON ensures that any programming language with basic JSON parsing can read OAM files without specialized libraries.
The weights section is flexible—implementers can choose JSON for maximum portability or binary formats for efficiency. This flexibility allows optimization for specific use cases while maintaining the universal container structure.
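As an illustrative sketch of the binary option (a hypothetical scheme, not part of the specification), a flat list of float32 weights could be packed with Python's struct module and stored in the weights section in place of JSON:

import struct

def encode_weights_binary(values):
    # Hypothetical layout: uint32 count, then packed little-endian float32s.
    return struct.pack('<I', len(values)) + struct.pack(f'<{len(values)}f', *values)

def decode_weights_binary(blob):
    count = struct.unpack_from('<I', blob, 0)[0]
    return list(struct.unpack_from(f'<{count}f', blob, 4))

The metadata block would then need a field indicating which weight encoding is in use.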
3. Implementation
3.1 Reference Implementation (Python)
The reference implementation demonstrates the simplicity of the format:
import json
import struct

class OAM:
    def save(self, filepath):
        # Serialize each section to UTF-8 JSON bytes.
        meta_bytes = json.dumps(self.metadata).encode('utf-8')
        vocab_bytes = json.dumps(self.vocab).encode('utf-8')
        weights_bytes = json.dumps(self.weights).encode('utf-8')
        with open(filepath, 'wb') as f:
            f.write(b'OAM1')  # magic bytes
            # '<I' pins the length prefix to little-endian uint32,
            # so files are byte-identical across platforms.
            f.write(struct.pack('<I', len(meta_bytes)))
            f.write(meta_bytes)
            f.write(struct.pack('<I', len(vocab_bytes)))
            f.write(vocab_bytes)
            f.write(struct.pack('<I', len(weights_bytes)))
            f.write(weights_bytes)

    @staticmethod
    def load(filepath):
        model = OAM()
        with open(filepath, 'rb') as f:
            magic = f.read(4)
            if magic != b'OAM1':
                raise ValueError("Invalid OAM file")
            # Each section is a uint32 length prefix followed by its payload.
            meta_len = struct.unpack('<I', f.read(4))[0]
            model.metadata = json.loads(f.read(meta_len))
            vocab_len = struct.unpack('<I', f.read(4))[0]
            model.vocab = json.loads(f.read(vocab_len))
            weights_len = struct.unpack('<I', f.read(4))[0]
            model.weights = json.loads(f.read(weights_len))
        return model
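A minimal round trip with illustrative field values (the class stores metadata, vocab, and weights as plain attributes):

model = OAM()
model.metadata = {"model_name": "demo", "architecture": "ngram", "layers": 1}
model.vocab = {"hello": 0, "world": 1}
model.weights = {"hello": {"world": 1.0}}
model.save("demo.oam")

loaded = OAM.load("demo.oam")
assert loaded.vocab == model.vocab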
3.2 Cross-Language Support
The format's simplicity enables implementation in any language. A JavaScript implementation requires only basic file I/O and JSON parsing. C implementations can use standard library functions. This universality is a core design goal.
4. Model Architectures
4.1 N-gram Models
OAM supports efficient n-gram language models with hierarchical propagation. These models achieve remarkable compression ratios—often smaller than the training data itself—by learning true statistical patterns rather than memorizing sequences.
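To make this concrete, here is a minimal generation sketch that reads "hierarchical propagation" as backoff from longer to shorter contexts; the key layout and backoff rule are illustrative assumptions, not part of the specification:

import random

def next_token(weights, context, order):
    # weights maps a context string ("a b") to a {next_token: count} dict,
    # a structure that serializes directly to JSON for the weights section.
    for n in range(order - 1, 0, -1):  # longest context first, then back off
        key = ' '.join(context[-n:])
        if key in weights:
            tokens, counts = zip(*weights[key].items())
            return random.choices(tokens, weights=counts)[0]
    return None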
4.2 RNN Models
Recurrent neural networks stored in OAM format include full weight matrices, hidden state dimensions, and vocabulary mappings. The format supports both character-level and word-level tokenization strategies.
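One possible (non-normative) weights layout for a small RNN, with matrices stored as nested JSON lists; the key names are illustrative:

{
  "hidden_size": 2,
  "W_xh": [[0.10, -0.20], [0.05, 0.30]],
  "W_hh": [[0.90, 0.00], [0.10, 0.80]],
  "W_hy": [[0.20, -0.10], [0.40, 0.60]],
  "b_h": [0.0, 0.0],
  "b_y": [0.0, 0.0]
}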
4.3 Custom Architectures
The architecture-agnostic design allows researchers to experiment with novel model designs. As long as the weights and metadata can be serialized to JSON, any architecture can be stored in OAM format.
5. Performance Characteristics
5.1 Loading Speed
OAM files load with minimal overhead. The sequential structure allows streaming reads, and the simple format requires no complex parsing logic. For small and mid-sized models, load times are measured in milliseconds; for larger files, the dominant cost is JSON parsing of the weights section, which grows linearly with file size.
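Because the sections are length-prefixed and sequential, a reader can inspect a model's metadata without touching the potentially large weights section. A minimal sketch:

import json
import struct

def peek_metadata(filepath):
    # Read only the magic bytes and the metadata block; never touch
    # the vocabulary or weights sections.
    with open(filepath, 'rb') as f:
        if f.read(4) != b'OAM1':
            raise ValueError("Invalid OAM file")
        meta_len = struct.unpack('<I', f.read(4))[0]
        return json.loads(f.read(meta_len))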
5.2 Inference Efficiency
Models in OAM format are optimized for CPU inference with minimal memory footprint. N-gram models achieve near-instant token generation, while RNN models maintain competitive performance without GPU acceleration.
5.3 Compression Ratios
Through learned generalization, OAM models often achieve file sizes smaller than their training corpora. A 5-gram model trained on 6.5MB of text data can compress to under 1MB while maintaining strong generative capabilities—demonstrating true pattern learning rather than memorization.
6. Use Cases and Applications
6.1 Browser-Based AI
OAM enables sophisticated AI models to run entirely in web browsers. JavaScript implementations can load and execute models without server communication, enabling privacy-preserving, offline-capable AI applications.
6.2 Edge Computing
The minimal resource requirements make OAM ideal for embedded systems and IoT devices. Models can run on microcontrollers and edge devices where traditional deep learning frameworks are impractical.
6.3 Research and Education
The human-readable format facilitates understanding of model internals. Students and researchers can inspect weights, vocabulary, and architecture details without specialized tools.
6.4 Rapid Prototyping
The simplicity of the format accelerates experimentation. Researchers can quickly implement custom architectures and training procedures without wrestling with framework-specific serialization logic.
7. Comparison with Existing Formats
7.1 ONNX
While ONNX provides broad framework interoperability, it requires an ONNX-compatible runtime and carries significant specification complexity. OAM trades some optimization potential for radical simplicity and zero dependencies.
7.2 SafeTensors
SafeTensors focuses on secure tensor storage for large models. OAM targets a different niche: ultra-portable, lightweight models that prioritize accessibility over raw scale.
7.3 Framework-Specific Formats
PyTorch (.pt), TensorFlow (SavedModel), and similar formats are tightly coupled to their frameworks. OAM provides true framework independence at the cost of framework-specific optimizations.
8. Future Directions
8.1 Binary Weight Encoding
Future versions may standardize efficient binary weight representations while maintaining the JSON-based metadata and vocabulary for maximum compatibility.
8.2 Quantization Support
Explicit support for quantized weights (int8, int4) could further reduce model sizes and improve inference speed on resource-constrained devices.
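A sketch of what symmetric int8 quantization could look like on top of the current JSON weights; the scheme and field names are hypothetical, not part of the specification:

def quantize_int8(values):
    # Store one float scale plus small integers in place of raw floats.
    scale = max(abs(v) for v in values) / 127 or 1.0
    return {"scale": scale, "q": [round(v / scale) for v in values]}

def dequantize_int8(packed):
    return [q * packed["scale"] for q in packed["q"]]

Because the quantized form is still plain JSON, it stays readable by any existing OAM loader.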
8.3 Streaming Inference
The sequential format naturally supports streaming—loading and executing models in chunks for extremely large models that exceed available RAM.
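A sketch of chunked reading, assuming the caller has already consumed the header through the weights length prefix:

def iter_weight_chunks(f, weights_len, chunk_size=1 << 20):
    # Yield the weights payload in bounded chunks so peak memory stays
    # at chunk_size regardless of model size.
    remaining = weights_len
    while remaining > 0:
        chunk = f.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("Truncated OAM file")
        remaining -= len(chunk)
        yield chunk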
8.4 Ecosystem Development
Community-driven implementations in additional languages (Rust, Go, Swift) will expand the OAM ecosystem and validate the format's universality claims.
9. Conclusion
OAM represents a paradigm shift in AI model distribution and deployment. By prioritizing simplicity, portability, and accessibility over framework-specific optimizations, OAM democratizes AI and enables new categories of applications.
The format proves that sophisticated AI models need not be locked behind proprietary frameworks or require massive computational resources. With OAM, intelligence becomes truly portable—running anywhere from billion-dollar datacenters to $5 microcontrollers.
As the AI community continues to push the boundaries of model capabilities, OAM ensures that these advances remain accessible to everyone, everywhere, on any device.
10. References
- OAM Reference Implementation: oam_web/oam.py
- OAM Documentation: oam_web/docs.html
- Example Models: oam_web/model_ng.oam, oam_web/model_rnn.oam
- OpenAGI Project: https://open-agi.netlify.app
Acknowledgments
This work is part of the OpenAGI initiative to build accessible, safe, and beneficial artificial general intelligence. We thank the open-source community for their continued support and contributions.