COSMO 42 by Ex Machina:
Open Source for secure, on-premises AI document management

COSMO 42 is an open source platform, released under the MIT license, that turns corporate and public sector documents into AI-queryable resources, with all processing happening entirely on-premises.

Open Source and Modular

COSMO 42 is now available on GitHub under the MIT license, built on a modern enterprise-grade stack: Java 25 and Spring AI 2 on the backend, React and TypeScript on the frontend. The platform loads documents in PDF, DOCX, and XLSX formats and makes them queryable through AI. Visual processing, chunking, and embedding all happen entirely on-premises: nothing leaves the infrastructure, nothing is tracked externally.
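To make the chunking step concrete, here is a minimal sketch in Java of fixed-size chunking with overlap, one common way to split extracted text before embedding. It is not taken from the COSMO 42 codebase: the class name `ChunkingSketch`, the method `chunk`, and the window sizes are assumptions chosen for illustration, and the project may split documents differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and parameters are hypothetical, not COSMO 42 APIs.
public class ChunkingSketch {

    /**
     * Splits text into windows of at most chunkSize characters.
     * Each window starts (chunkSize - overlap) characters after the previous one,
     * so consecutive chunks share `overlap` characters of context.
     */
    public static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "COSMO 42 loads PDF, DOCX, and XLSX documents and makes them queryable.";
        for (String c : chunk(text, 30, 10)) {
            System.out.println("[" + c + "]");
        }
    }
}
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of embedding some text twice.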

Security and Privacy

The platform is designed to run entirely on-premises: all documents remain within the client's infrastructure, with no external tracking or data transmission. COSMO 42 Open Source ships without a native authentication system by deliberate architectural choice: teams can start experimenting immediately, and the platform can sit behind the organization's existing access management systems without an extra authentication layer to reconcile.

Adaptability

An open, modular codebase, ready to be extended and adapted to any specific use case within the organization.

Integration

Integrates with existing authentication and access management systems, with no migrations or invasive adaptations required.

Automation

Visual processing, chunking, and embedding all run locally, with no data exposed to external services or infrastructure.

Local AI Models

Native configuration for Gemma 4, optimized for GPUs with 4 to 24 GB of VRAM, with no cloud service dependencies.

Efficiency

Optimized for local, scalable infrastructure, with no need for high-cost hardware investment.

What You Can Build with COSMO 42

Stagend

An artist search system that replaces traditional, complex interfaces with a natural conversational experience. Thanks to a custom model created by Ex Machina, the system understands requests in multiple languages and responds in milliseconds, without requiring GPUs. Combining vector search with metadata delivers precise, immediate results.
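As a rough illustration of how vector search and metadata can work together, the sketch below filters candidates on a metadata field and then ranks the survivors by cosine similarity to a query embedding. It is not Stagend's implementation: the `Item` record, the `city` field, the sample artists, and the two-dimensional vectors are all hypothetical, and a production system would use a vector store rather than an in-memory list.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative only: data model and names are assumptions, not part of COSMO 42 or Stagend.
public class FilteredVectorSearch {

    /** A searchable entry: display name, free-form metadata, and its embedding vector. */
    public record Item(String name, Map<String, String> metadata, double[] embedding) {}

    /** Cosine similarity between two vectors of equal length; 0 if either has zero norm. */
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Keeps items whose metadata[key] equals value, then returns the k most similar names. */
    public static List<String> search(List<Item> items, double[] query,
                                      String key, String value, int k) {
        return items.stream()
                .filter(it -> value.equals(it.metadata().get(key)))
                .sorted(Comparator.comparingDouble(
                        (Item it) -> cosine(it.embedding(), query)).reversed())
                .limit(k)
                .map(Item::name)
                .toList();
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("jazz trio", Map.of("city", "Zurich"), new double[]{1.0, 0.0}),
                new Item("rock band", Map.of("city", "Zurich"), new double[]{0.0, 1.0}),
                new Item("jazz quartet", Map.of("city", "Milan"), new double[]{0.9, 0.1}));
        System.out.println(search(items, new double[]{1.0, 0.1}, "city", "Zurich", 1));
    }
}
```

Filtering first narrows the candidate set cheaply, so the similarity ranking only runs over entries that already satisfy the hard constraints in the request.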

Airas

A system dedicated to hydro-meteorological risk management that analyzes documents in any format, including images, graphs, and tables. The platform contextualizes data through a multi-agent system that adapts to user needs and connects related pieces of information. A specialized language model guides the analysis in the specific context of environmental risk.

Skill Match

A solution dedicated to human resources management that operates on complex textual datasets with multiple overlapping sources. The system enables fast, targeted searches that return structured data and integrates seamlessly with existing platforms. Its lean architecture delivers rapid responses without requiring complex language models.

Content Extraction

A versatile system that extracts structured information from a variety of sources: HR documents, insurance contracts, utility bills, videos, and images. Processing can run in real time or as scheduled jobs, adapting to the needs of each sector. Domain logic guides the interaction with language models to ensure accurate, relevant results.

Always a Step Ahead

With the open source release, COSMO 42 becomes a freely extensible engine that communities and organizations can adapt to their own use cases. The source code, documentation, and configuration files are available on GitHub under the MIT license. Contact us to find out how to build your own AI document solution on top of COSMO 42.


Set up a meeting

For further details about the project or how to adopt COSMO 42 Open Source, get in touch.