COSMO 42 – On-premises, open-source AI

With the launch of ChatGPT, individuals and businesses alike were steered toward cloud-based AI.

A technology that had been unthinkable was suddenly at everyone’s fingertips: a single click, (nearly) free of charge, with the promise of unlocking the future and revolutionizing the way we work. Once the initial euphoria faded, companies and public administrations began to grapple with the uncomfortable questions surrounding cloud AI: Where does our data end up? Who controls sensitive information, and how is it used? What happens when a document leaves the corporate network to be processed by a cloud AI? Saying yes to commercial artificial intelligence was easy. Managing its risks, especially regarding data confidentiality and sovereignty, has proven to be an operational nightmare for many organizations.

At Ex Machina, we have always believed that for enterprises and public administrations to adopt the AI revolution successfully, it must guarantee information ownership, confidentiality, security, and sovereignty. It therefore needs to be an on-premises technology operating strictly within the corporate perimeter. This is why we decided to release COSMO 42 as an open-source framework (MIT license) for developing local AI solutions. The framework is modular and includes tools for integrating mainstream technologies and models, including Gemma 4, which is optimized to run on graphics cards with 4 GB to 24 GB of VRAM. This allows it to run on local, scalable infrastructure without prohibitive hardware investments.

COSMO 42 is released with a reference implementation consisting of a simple web app that allows users to upload their documents in PDF, DOCX, or XLSX format and chat with them locally in total safety. By deliberate architectural choice, the web app does not include a native authentication system. This allows users to immediately test the tool locally, verify its effectiveness, and later integrate it with the authentication and access management systems already in place within the company. Visual processing, chunking, and embedding all take place on-site. Not a single bit leaves the organization!
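As a rough illustration of the chunking step mentioned above, the following Python sketch splits extracted document text into overlapping, fixed-size chunks, the kind of pieces that would then be embedded locally. It is not taken from the COSMO 42 code base: the function name `chunk_text` and the `chunk_size` and `overlap` parameters are assumptions made for this example, and the framework itself may chunk documents quite differently.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper, for illustration only. Consecutive chunks share
    `overlap` characters, so text near a chunk boundary keeps some of its
    surrounding context in the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example: windows of 4 characters that overlap by 1.
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Everything here is plain string handling that runs on the local machine, which is the point of the on-site design: no step in the sketch needs a network call.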

Alongside the reference implementation, the COSMO 42 repository includes a tool called Studio, which is disabled by default but ready for development teams to use. Studio was created to meet a practical need: easily testing and comparing different language models behind a shared user interface. At Ex Machina, we use Studio daily as a testing ground to see which model responds best to the exact same input. In this sandbox environment, you can enter a prompt, add attachments, and evaluate the model’s response in real time without going through the embedding phase. The output can be viewed as raw text, Markdown, or JSON, depending on your integration needs.
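The idea behind Studio, sending the exact same input to several models and viewing the answers in different formats, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `compare_models`, `render`, and the stub models below are hypothetical names invented for this example and are not Studio's actual API. A real setup would replace the stubs with calls to locally hosted models.

```python
import json
from typing import Callable

# Assumption: a "model" is any callable that maps a prompt to a text answer.
ModelFn = Callable[[str], str]


def compare_models(prompt: str, models: dict[str, ModelFn]) -> dict[str, str]:
    """Send the same prompt to every model and collect the answers by name."""
    return {name: run(prompt) for name, run in models.items()}


def render(results: dict[str, str], fmt: str = "text") -> str:
    """Render the collected answers as raw text, Markdown, or JSON."""
    if fmt == "json":
        return json.dumps(results, indent=2, ensure_ascii=False)
    if fmt == "markdown":
        return "\n\n".join(f"## {name}\n\n{answer}" for name, answer in results.items())
    if fmt == "text":
        return "\n".join(f"{name}: {answer}" for name, answer in results.items())
    raise ValueError(f"unknown format: {fmt}")


# Stand-in models (hypothetical): they only transform the prompt string.
stub_models: dict[str, ModelFn] = {
    "model-a": lambda prompt: prompt.upper(),
    "model-b": lambda prompt: prompt[::-1],
}

results = compare_models("hello", stub_models)
print(render(results, "markdown"))
```

Keeping the models behind a single callable signature is what makes a side-by-side comparison straightforward: the prompt and the rendering stay fixed, and only the model changes.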

COSMO 42 is a flexible, open-source framework that communities and companies can adapt to various use cases, but its core promise remains unchanged: Artificial Intelligence can assist and enhance corporate workflows without compromising data security and ownership.

The source code, configuration files, and documentation are publicly available on GitHub in the official Ex Machina project repository. We look forward to supporting developers and organizations that use COSMO 42 to build their own secure, local AI. You can find more information on our COSMO 42 page.