
COSMO 42 is an open source platform, released under the MIT license, that turns corporate and public sector documents into AI-queryable resources, with all processing performed on-premises.
COSMO 42 is now available on GitHub under the MIT license. It is built on a modern enterprise-grade stack: Java 25 and Spring AI 2 on the backend, React and TypeScript on the frontend. The platform ingests documents in PDF, DOCX, and XLSX formats and makes them queryable through AI. Visual processing, chunking, and embedding all run on-premises: nothing leaves the infrastructure, and nothing is tracked externally.
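To make the chunking step mentioned above more concrete, here is a minimal sketch of one common approach: fixed-size character windows with overlap. It is not taken from the COSMO 42 codebase. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical, and the platform may well chunk by tokens, layout, or semantics instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk.
    Illustrative only: names and defaults are assumptions, not COSMO 42's.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # otherwise the loop would emit a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be passed to a locally hosted embedding model, so the text never has to leave the machine it was loaded on.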
The platform is designed to run entirely on-premises: all documents remain within the client's infrastructure, with no external tracking or data transmission. By deliberate architectural choice, COSMO 42 Open Source includes no native authentication system. This lets teams start experimenting immediately and simplifies integration with the organization's existing access management systems, with no extra layers in between.
Open, modular codebase ready to be extended and adapted to any specific use case within the organization
Integrates with existing authentication and access management systems, with no migrations or invasive adaptations required
Visual processing, chunking, and embedding all run locally, with no data exposed to external services or infrastructure
Native configuration for Gemma 4, optimized for GPUs with 4 to 24 GB of VRAM, with no cloud service dependencies
Optimized for local, scalable infrastructure, with no need for high-cost hardware investment
An artist search system that replaces traditional complex interfaces with a natural conversational experience. Thanks to a custom model created by Ex Machina, the system understands requests in multiple languages and responds in milliseconds, without requiring GPUs. The combination of vector search and metadata ensures precise and immediate results.
A system dedicated to hydro-meteorological risk management that analyzes documents in any format, including images, graphs, and tables. The platform contextualizes data through a multi-agent system that adapts to user needs, creating connections between pieces of information. A specialized language model guides analysis in the specific context of environmental risk.
A solution dedicated to human resources management that operates on complex textual datasets with multiple overlapping sources. The system enables fast, targeted searches that return structured data and integrates seamlessly with existing platforms. Its lean architecture allows rapid responses without requiring complex language models.
A versatile system that extracts structured information from various sources: HR documents, insurance contracts, utility bills, videos, and images. Processing can occur in real time or through scheduled jobs, adapting to specific sector needs. Domain logic guides interaction with language models to ensure accurate and relevant results.
With the open source release, COSMO 42 becomes a freely extensible engine that communities and organizations can adapt to their own use cases. The source code, documentation, and configuration files are available on GitHub under the MIT license. Contact us to find out how to build your own AI document solution on top of COSMO 42.
