Technology
Technical architecture and AI models
Data Flow
1. User sends fax or uploads image via marketing website demo
2. Backend API receives request and queues processing job in Redis
3. AI pipeline analyzes image: Vision AI extracts text, Annotation Detector finds marks, Intent Extractor determines action
4. MCP servers execute actions (send email, place order, book appointment, etc.)
5. Results stored in PostgreSQL, files saved to S3, confirmation fax sent via Telnyx
6. Metrics aggregated and displayed on dashboard for monitoring and analysis
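The steps above can be read as a job moving through a fixed sequence of states. The following is a minimal sketch of that idea, not Faxi's actual schema: the names `FaxJob`, `JobStatus`, `advance`, `enqueue`, and `processNext` are hypothetical, and a plain in-memory array stands in for the Redis queue.

```typescript
// Hypothetical job states, loosely following the data flow steps above.
type JobStatus = "received" | "queued" | "analyzed" | "executed" | "confirmed";

interface FaxJob {
  id: string;
  source: "fax" | "web-demo";
  status: JobStatus;
}

// Allowed forward transitions; "confirmed" is terminal.
const NEXT_STATUS: Record<JobStatus, JobStatus | null> = {
  received: "queued",
  queued: "analyzed",
  analyzed: "executed",
  executed: "confirmed",
  confirmed: null,
};

// Returns a copy of the job in its next state; throws if the job is already finished.
function advance(job: FaxJob): FaxJob {
  const next = NEXT_STATUS[job.status];
  if (next === null) {
    throw new Error(`Job ${job.id} is already complete`);
  }
  return { ...job, status: next };
}

// In-memory stand-in for the Redis queue (assumption: first in, first out).
const jobQueue: FaxJob[] = [];

// Step 2: accept a request and queue it.
function enqueue(id: string, source: FaxJob["source"]): FaxJob {
  const job = advance({ id, source, status: "received" });
  jobQueue.push(job);
  return job;
}

// Steps 3 to 5: take the oldest job and walk it through the remaining states.
function processNext(): FaxJob | undefined {
  let job = jobQueue.shift();
  if (job === undefined) return undefined;
  while (job.status !== "confirmed") {
    job = advance(job);
  }
  return job;
}
```

Keeping the transitions in one lookup table means an out-of-order update (for example, confirming a job that was never analyzed) cannot be expressed through `advance`.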
Multi-Model AI Pipeline
Faxi's AI pipeline combines several specialized models, each responsible for one task: text extraction, annotation detection, or intent classification. Their outputs are combined to raise overall accuracy. This ensemble-style approach is intended to hold up across diverse fax formats and handwriting styles.
Core AI Models
Vision AI (GPT-4 Vision)
Optical Character Recognition and Visual Analysis
Extracts text from fax images including both printed and handwritten content. Uses advanced computer vision to understand document structure, identify form fields, and recognize Japanese characters with high accuracy.
Annotation Detector
Visual Annotation Recognition
Identifies hand-drawn marks on faxes such as checkmarks, circles, arrows, and underlines. Associates annotations with nearby text to understand user intent (e.g., circled product = selected item).
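One simple way to associate a mark with nearby text is to pick the text region whose bounding-box center is closest to the mark's center, within a cutoff distance. The sketch below illustrates that heuristic only; the source does not say how Faxi performs the association, and the types `Box`, `TextRegion`, `DetectedMark`, and the function `nearestText` are hypothetical.

```typescript
// Axis-aligned bounding box in image pixel coordinates.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface TextRegion {
  text: string;
  box: Box;
}

interface DetectedMark {
  kind: "checkmark" | "circle" | "arrow" | "underline";
  box: Box;
}

// Center point of a box.
function center(b: Box): [number, number] {
  return [b.x + b.width / 2, b.y + b.height / 2];
}

// Returns the text region whose center is closest to the mark's center,
// or null if no region lies within maxDistance pixels.
function nearestText(
  mark: DetectedMark,
  regions: TextRegion[],
  maxDistance: number
): TextRegion | null {
  const [mx, my] = center(mark.box);
  let best: TextRegion | null = null;
  let bestDistance = maxDistance;
  for (const region of regions) {
    const [rx, ry] = center(region.box);
    const distance = Math.hypot(mx - rx, my - ry);
    if (distance <= bestDistance) {
      best = region;
      bestDistance = distance;
    }
  }
  return best;
}
```

A center-distance rule suits circles and checkmarks; an arrow would more plausibly be matched from its tip, which this sketch does not attempt.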
Intent Classifier (Claude)
Natural Language Understanding and Action Extraction
Analyzes extracted text and annotations to determine what action the user wants to perform. Classifies intents (email, shopping, appointment, etc.) and extracts relevant parameters with high confidence.
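The classifier's output can be modeled as a tagged union: one variant per intent, each with its own parameters. The sketch below assumes three intents and a hypothetical set of fields; neither the `Intent` type nor `missingParameters` comes from Faxi's codebase.

```typescript
// One variant per intent. Parameters are optional because extraction may be incomplete.
type Intent =
  | { kind: "email"; to?: string; body?: string }
  | { kind: "shopping"; productName?: string; quantity?: number }
  | { kind: "appointment"; provider?: string; date?: string };

// Lists the parameters that still need to be filled in before the intent can be acted on.
function missingParameters(intent: Intent): string[] {
  const missing: string[] = [];
  switch (intent.kind) {
    case "email":
      if (!intent.to) missing.push("to");
      if (!intent.body) missing.push("body");
      break;
    case "shopping":
      if (!intent.productName) missing.push("productName");
      if (intent.quantity === undefined) missing.push("quantity");
      break;
    case "appointment":
      if (!intent.provider) missing.push("provider");
      if (!intent.date) missing.push("date");
      break;
  }
  return missing;
}
```

For example, `missingParameters({ kind: "shopping", productName: "Green tea" })` reports that `quantity` is still missing, which is the kind of gap a follow-up question could fill.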
Processing Pipeline
Image Preprocessing
Enhance image quality, remove noise, correct skew and rotation
Vision Analysis
Extract text regions and identify visual elements
Annotation Detection
Find and classify hand-drawn marks
Intent Extraction
Understand user intent and extract parameters
Confidence Scoring
Assess reliability of each component
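The five stages above can be sketched as functions applied in order to a shared context object. This is an illustration of the sequencing only: the `PipelineContext` fields are hypothetical, and each stage returns canned placeholder values where a real model call would go.

```typescript
// Shared state passed from stage to stage (all field names are illustrative).
interface PipelineContext {
  imageId: string;
  textRegions: string[];
  annotations: string[];
  intent: string | null;
  confidence: Record<string, number>;
  completedStages: string[];
}

interface Stage {
  name: string;
  run: (ctx: PipelineContext) => PipelineContext;
}

// Placeholder stages: the returned values are stand-ins, not real model output.
const stages: Stage[] = [
  { name: "Image Preprocessing", run: (ctx) => ctx },
  { name: "Vision Analysis", run: (ctx) => ({ ...ctx, textRegions: ["Product A", "Product B"] }) },
  { name: "Annotation Detection", run: (ctx) => ({ ...ctx, annotations: ["circle"] }) },
  { name: "Intent Extraction", run: (ctx) => ({ ...ctx, intent: "shopping" }) },
  {
    name: "Confidence Scoring",
    run: (ctx) => ({ ...ctx, confidence: { vision: 0.9, annotation: 0.9, intent: 0.9 } }),
  },
];

// Runs each stage in order and records its name once it has finished.
function runPipeline(stageList: Stage[], initial: PipelineContext): PipelineContext {
  return stageList.reduce((ctx, stage) => {
    const next = stage.run(ctx);
    return { ...next, completedStages: [...next.completedStages, stage.name] };
  }, initial);
}
```

Because every stage has the same signature, a stage can be swapped or tested in isolation without touching the others.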
Key Innovations
🎯 Context-Aware Processing
AI understands the relationship between text and annotations. A circle around a product name indicates selection, while an arrow points to important information.
🔄 Iterative Refinement
Models build on one another's output. Vision AI output informs annotation detection, which in turn gives intent classification more context to work with.
🌐 Multilingual Support
Specialized handling for Japanese characters (Kanji, Hiragana, Katakana) alongside English, with cultural context awareness for proper interpretation.
📊 Confidence Calibration
Each prediction includes a calibrated confidence score. Low-confidence results trigger a clarification request, which reduces the risk of executing an incorrect action.
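The gating step described here can be sketched as a threshold check on the weakest component score. The threshold of 0.8, the component names, and the `decide` function are assumptions made for illustration; the source gives no actual values, and the calibration itself is not shown.

```typescript
// Components that each report a confidence score between 0 and 1 (names are illustrative).
const SCORED_COMPONENTS = ["vision", "annotation", "intent"] as const;
type ScoredComponent = (typeof SCORED_COMPONENTS)[number];
type Scores = Record<ScoredComponent, number>;

type Decision =
  | { action: "execute" }
  | { action: "clarify"; weakest: ScoredComponent };

// Assumed cutoff; a real system would tune this value against labeled data.
const CLARIFY_THRESHOLD = 0.8;

// Asks for clarification when the least confident component falls below the threshold.
function decide(scores: Scores, threshold: number = CLARIFY_THRESHOLD): Decision {
  let weakest: ScoredComponent = SCORED_COMPONENTS[0];
  for (const component of SCORED_COMPONENTS) {
    if (scores[component] < scores[weakest]) {
      weakest = component;
    }
  }
  if (scores[weakest] < threshold) {
    return { action: "clarify", weakest };
  }
  return { action: "execute" };
}
```

Gating on the minimum rather than an average is a deliberately conservative choice: one unreliable component is enough to ask the user before acting.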
For Technical Evaluators
Our AI pipeline leverages transfer learning from pre-trained foundation models, fine-tuned on domain-specific data including Japanese handwriting samples and fax artifacts. We employ ensemble methods to combine predictions, use active learning to continuously improve accuracy, and implement robust error handling with human-in-the-loop fallbacks for edge cases. The system is designed for production deployment with monitoring, A/B testing, and continuous model updates.
Frontend
Next.js 14
React framework with App Router for server-side rendering and optimal performance
React 18
Modern UI library with hooks and concurrent features for responsive interfaces
TypeScript
Type-safe JavaScript for robust code and better developer experience
Tailwind CSS
Utility-first CSS framework for rapid UI development and consistent styling
Recharts
Composable charting library for interactive data visualizations
Framer Motion
Animation library for smooth, performant UI transitions
Backend
Express.js
Fast, minimalist web framework for Node.js handling API requests
Node.js
JavaScript runtime for scalable server-side applications
PostgreSQL
Robust relational database for storing users, jobs, and metrics
Redis
In-memory data store for job queues and caching
AWS S3
Object storage for fax images and generated documents
AI & Machine Learning
Claude (Anthropic)
Advanced language model for intent extraction and natural language understanding
GPT-4 Vision
Multimodal AI for OCR, handwriting recognition, and visual analysis
Custom ML Models
Specialized models for annotation detection and form field recognition
Infrastructure
Telnyx
Cloud communications platform for sending and receiving faxes
Vercel
Deployment platform for Next.js with edge network and automatic scaling
Docker
Containerization for consistent development and production environments
Why This Stack?
- ✓ Performance: Server-side rendering and edge deployment for fast load times
- ✓ Scalability: Horizontal scaling with containerization and cloud infrastructure
- ✓ Reliability: Type safety, robust error handling, and comprehensive testing
- ✓ Developer Experience: Modern tooling with hot reload and TypeScript support
- ✓ AI-First: Integration with leading AI models for state-of-the-art accuracy
What is MCP?
Model Context Protocol (MCP) is an open standard that enables AI systems to securely connect with external data sources and tools. Faxi uses MCP servers to extend functionality beyond basic fax processing, allowing users to interact with email, shopping, appointments, and more, all through their familiar fax machine.
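MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method, passing the tool's `name` and its `arguments`. The sketch below builds such a request; the helper `buildToolCall` and any tool name passed to it (such as `send_email`) are hypothetical, since the source does not list the tools Faxi's servers expose.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0 request with method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Simple incrementing request id; JSON-RPC requires each in-flight request id to be unique.
let nextRequestId = 1;

// Builds the request object; sending it over a transport is out of scope here.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}
```

Calling `buildToolCall("send_email", { to: "user@example.com", body: "Hello" })` would produce a request that an MCP server exposing a tool of that name could execute.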
Extensibility
The MCP architecture makes Faxi highly extensible. Organizations can develop custom MCP servers to integrate with their own systems, such as healthcare records, inventory management, and CRM platforms. This lets Faxi adapt to a wide range of use cases while keeping a simple fax interface for users.
Healthcare
Integrate with EHR systems for appointment booking and prescription refills
Government
Connect to public services for permit applications and benefit enrollment
Enterprise
Build custom integrations for internal workflows and legacy systems
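The extensibility idea, adding capabilities without changing the core, can be sketched as a registry that maps tool names to handlers. This is a simplified in-process illustration rather than an MCP server implementation; `ToolRegistry`, `ToolHandler`, and the `book_appointment` tool are hypothetical names.

```typescript
// A handler receives the tool arguments and returns a human-readable result.
type ToolHandler = (args: Record<string, unknown>) => string;

class ToolRegistry {
  private handlers = new Map<string, ToolHandler>();

  // Adds a new tool; the dispatch logic in call() never needs to change.
  register(toolName: string, handler: ToolHandler): void {
    if (this.handlers.has(toolName)) {
      throw new Error(`Tool already registered: ${toolName}`);
    }
    this.handlers.set(toolName, handler);
  }

  // Names of all registered tools, sorted for stable output.
  list(): string[] {
    return Array.from(this.handlers.keys()).sort();
  }

  // Looks up the handler by name and runs it; unknown tools are rejected.
  call(toolName: string, args: Record<string, unknown>): string {
    const handler = this.handlers.get(toolName);
    if (handler === undefined) {
      throw new Error(`Unknown tool: ${toolName}`);
    }
    return handler(args);
  }
}

// Example: a healthcare integration registers its own tool.
const registry = new ToolRegistry();
registry.register("book_appointment", (args) => `Appointment requested for ${String(args.date)}`);
```

A new integration only calls `register`; existing tools and the dispatch path stay untouched.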
Why MCP Matters
- Standardized: Open protocol ensures compatibility and interoperability
- Secure: Built-in authentication and authorization mechanisms
- Scalable: Add new capabilities without modifying core system
- Future-proof: Adapt to new services and technologies as they emerge
