Yappify Models

Technical information about Brian's AI chat models

Yappify 1.0

Local RAG Model

A lightweight, pattern-based response system that runs entirely in your browser. Uses Retrieval-Augmented Generation (RAG) with pre-computed embeddings to provide fast responses about Brian's background and projects.

Technical Specifications

Architecture: Rule-based + RAG
Response Time: ~500ms
Memory Usage: < 5MB
Vector Database: JSON embeddings
Production Ready
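
A minimal sketch of the retrieval step, assuming the JSON vector database is an array of { text, embedding } entries and that the user's query has already been embedded. The names EmbeddedChunk, cosineSimilarity, and retrieve are illustrative, not Yappify's actual internals.

```typescript
// Assumed shape of the pre-computed JSON embeddings file.
interface EmbeddedChunk {
  text: string;        // a snippet about Brian's background or projects
  embedding: number[]; // pre-computed vector for that snippet
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding; these are then
// used to assemble a grounded response.
function retrieve(queryEmbedding: number[], chunks: EmbeddedChunk[], k = 3): EmbeddedChunk[] {
  return chunks
    .map(chunk => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(entry => entry.chunk);
}
```

Because the embeddings ship as a small static JSON file, the whole lookup runs in the browser with no network round trip, which is what keeps responses in the ~500ms range.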

Qwen 2.5-1.5B-Instruct

Large Language Model

A 1.5-billion-parameter instruction-tuned language model from Alibaba Cloud, running entirely in the browser via WebLLM and WebGPU. Provides more flexible conversational AI, drawing on both general knowledge and context about Brian's work.

Technical Specifications

Parameters: 1.5B
Quantization: 4-bit (q4f16_1)
Model Size: ~900MB
Context Length: 32K tokens
Acceleration: WebGPU
Framework: MLC-AI WebLLM
Beta Release
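
A minimal sketch of loading and querying the model with MLC-AI WebLLM. It assumes the prebuilt model ID Qwen2.5-1.5B-Instruct-q4f16_1-MLC and a WebGPU-capable browser; the exact model ID, load options, and prompts used here may differ from Yappify's.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function chatWithQwen(question: string): Promise<string | null> {
  // Downloads the ~900MB quantized weights on first run (cached afterwards);
  // requires a WebGPU-capable browser.
  const engine = await CreateMLCEngine("Qwen2.5-1.5B-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // OpenAI-style chat completion served entirely by the local model.
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You answer questions about Brian's work." },
      { role: "user", content: question },
    ],
  });

  return reply.choices[0].message.content;
}
```

Since inference happens on-device, no prompt or response ever leaves the browser; the trade-off is the one-time model download and the WebGPU requirement.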

GPT-4 & Claude Integration

Cloud API Models

Future integration with OpenAI's GPT-4 and Anthropic's Claude models via secure API connections. These cloud models will provide more advanced capabilities than the local options while maintaining privacy and security.

Coming Soon
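
A hedged sketch of how such an integration might route requests through a server-side proxy so API keys never reach the browser. The /api/chat endpoint, the provider parameter, and the response field are hypothetical, not a committed design.

```typescript
// Hypothetical browser-side call: the request goes to a server-side proxy
// (/api/chat is an assumed endpoint) that holds the OpenAI or Anthropic
// API key, so no credentials are exposed to the client.
async function askCloudModel(provider: "gpt-4" | "claude", prompt: string): Promise<string> {
  const response = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ provider, prompt }),
  });
  if (!response.ok) {
    throw new Error(`Proxy request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.reply; // assumed response field
}
```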