Rossi Stefano

Standalone AI Assistant

features
  • ML
  • AI
  • Transformers.js
  • VueJS
  • WebGPU
  • A standalone zero-knowledge AI for this site

    On this website I implemented ChaD GPT, a web assistant that can answer questions about me using the content of this site and other custom resources. What makes him special is that he runs directly in your browser, on your own hardware. There is no server-side processing and no cost for me. All data and processing stay local, which protects your privacy and keeps AI costs at essentially zero. This is a significant step forward for web technology and AI: it allows more interactive and intelligent user experiences without compromising on privacy.

    It is important to note that this is not a super-intelligent AI assistant, and it requires modern hardware to run. A GPU with 3GiB of memory should be enough to run the model. It can be used on a wide range of devices, including desktops, laptops, and even mobile devices.

    Key Concepts

    What is a GPU? A Graphics Processing Unit is a specialized processor designed for parallel processing. Most complex AI and graphics algorithms rely on linear algebra, in particular matrix operations and vector operations such as the dot product. These mathematical operations can be split into many small operations that run in parallel.
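
    To make the parallelism concrete, here is a minimal TypeScript sketch (not code from this site) of a dot product and a matrix-vector product. The names `dotProduct` and `matVec` are illustrative. Each output value depends on only one row of the matrix, so the rows can be computed independently. A GPU exploits this by handing them to many threads at once, whereas this sketch simply runs them one after another on the CPU.

```typescript
// Dot product of two equal-length vectors: sum of element-wise products.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Matrix-vector product: one independent dot product per matrix row.
// On a GPU, each row could be computed in parallel.
function matVec(matrix: number[][], vector: number[]): number[] {
  return matrix.map((row) => dotProduct(row, vector));
}

const exampleResult = matVec(
  [
    [1, 2, 3],
    [4, 5, 6],
  ],
  [1, 0, -1],
);
// exampleResult is [-2, -2]
```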

    How it works

    First of all, ChaD GPT is built using Transformers.js, a library for running machine learning models, including large language models, in JavaScript. Then, to interface with your hardware, it uses WebGPU, the new web graphics API that provides high-performance access to the GPU. This allows ChaD GPT to run efficiently and quickly, without the need for an external server.

    The inference model used is SmolLM 1.7B, which is a good compromise between performance and size, perfect for my use case.
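
    A rough back-of-the-envelope estimate shows why a model of this size can fit in a few GiB. The sketch below assumes 1.7 billion parameters and a handful of common weight precisions. The precision actually used on this site is not stated, so the numbers are illustrative only, and they ignore activations and the attention cache.

```typescript
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Approximate size of the weights alone, in GiB.
function weightSizeGiB(parameterCount: number, bitsPerWeight: number): number {
  const bytes = (parameterCount * bitsPerWeight) / 8;
  return bytes / BYTES_PER_GIB;
}

const ASSUMED_PARAMS = 1.7e9; // assumed parameter count for a "1.7B" model

const fp16GiB = weightSizeGiB(ASSUMED_PARAMS, 16); // about 3.17 GiB
const int8GiB = weightSizeGiB(ASSUMED_PARAMS, 8); // about 1.58 GiB
const int4GiB = weightSizeGiB(ASSUMED_PARAMS, 4); // about 0.79 GiB
```

    Under these assumptions, weights stored at reduced precision, plus some working memory, are in the same ballpark as a GPU with about 3GiB of memory.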

    The model lacks specific knowledge about this site. That is why ChaD GPT’s knowledge is augmented using RAG (retrieval-augmented generation). Information is stored in a vector database, Qdrant. The site content is split into chunks, and each chunk is embedded into a numerical vector, so that the AI can draw on new information, even in real time, to answer your queries. If you want to know more about him, ask him directly.
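
    The chunking and retrieval steps can be sketched in a few lines. This is a minimal in-memory illustration, not the site's implementation: a plain array stands in for Qdrant, the tiny three-dimensional vectors are made-up stand-ins for real embeddings, and the names `chunkText`, `cosineSimilarity`, and `topK` are hypothetical.

```typescript
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Split text into chunks of at most `size` characters, overlapping by `overlap` characters.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last chunk reached the end
  }
  return chunks;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored chunks most similar to the query embedding.
function topK(query: number[], store: StoredChunk[], k: number): StoredChunk[] {
  return store
    .slice() // copy so the caller's array is not reordered
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding),
    )
    .slice(0, k);
}

// Made-up embeddings; a real system would obtain these from an embedding model.
const demoStore: StoredChunk[] = [
  { text: "Text about projects", embedding: [0.9, 0.1, 0.0] },
  { text: "Text about the AI assistant", embedding: [0.1, 0.9, 0.1] },
  { text: "Text about contact details", embedding: [0.0, 0.2, 0.9] },
];

const demoChunks = chunkText("abcdefghij", 4, 1); // ["abcd", "defg", "ghij"]
const bestMatches = topK([0.2, 0.8, 0.0], demoStore, 1);
// bestMatches[0].text is "Text about the AI assistant"
```

    The retrieved chunk texts are what would then be handed to the language model alongside the user's question.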

    The last important component of ChaD GPT is the chat memory. Being a small model, it has a limited context window. To overcome this limitation, a chat memory system extracts important elements from the conversation and stores them in Neo4j, a graph database.
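
    One way to picture this is a two-part memory: a short window of recent messages that fits the context budget, plus a small store of extracted facts. The sketch below is a hedged illustration only. An in-memory list of subject-relation-object triples stands in for Neo4j, word counts stand in for real token counts, and since the way important elements are extracted is not described here, facts are simply added by hand. All names are hypothetical.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// A fact stored as a graph edge: (subject) -[relation]-> (object).
interface Fact {
  subject: string;
  relation: string;
  object: string;
}

// Crude token estimate: whitespace-separated words (an assumption, not a real tokenizer).
function roughTokenCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Keep the most recent messages whose combined size fits within `budget` tokens.
function trimToBudget(messages: ChatMessage[], budget: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = roughTokenCount(messages[i].content);
    if (used + cost > budget) break;
    kept.unshift(messages[i]); // preserve chronological order
    used += cost;
  }
  return kept;
}

class FactMemory {
  private facts: Fact[] = [];

  remember(fact: Fact): void {
    this.facts.push(fact);
  }

  // All facts with the given subject, i.e. the outgoing edges of one node.
  about(subject: string): Fact[] {
    return this.facts.filter((f) => f.subject === subject);
  }
}

const factMemory = new FactMemory();
factMemory.remember({ subject: "user", relation: "INTERESTED_IN", object: "WebGPU" });

const conversation: ChatMessage[] = [
  { role: "user", content: "Tell me about WebGPU please" },
  { role: "assistant", content: "It gives web pages access to the GPU" },
  { role: "user", content: "Which browsers support it" },
];
const recentMessages = trimToBudget(conversation, 12);
// recentMessages holds the last two messages (8 + 4 = 12 words)
```

    Older messages fall out of the window, but facts saved from them remain available through `factMemory.about("user")`.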

    Requirements

    Currently, most modern browsers support the WebGPU API, including Chrome and Safari, without additional configuration.

    Firefox and Linux users can use Firefox Nightly, which has WebGPU enabled by default.

    You also need a GPU; stronger GPUs allow for faster answers. A GPU with 3GiB of VRAM should be more than enough to run ChaD GPT. This AI assistant is designed to run on most modern devices. If you are using your toaster or a potato instead, don’t complain.

    Some hardware, such as certain MacBook Pro models, may not have a dedicated GPU but may still be able to run ChaD GPT.

    How to use it

    To use the ChaD GPT assistant, simply click on the chat icon in the bottom right corner of the page, or in the top menu bar on the mobile version. A chat window will open. The first time, you need to confirm that you want to download the model, which is about 1.7GiB in size. Once the model is downloaded, you can start chatting with the AI assistant, asking questions or requesting information. The AI assistant will respond with relevant answers based on the data it has been trained on and the information stored in the vector database. ChaD GPT is instructed to only answer questions relevant to me or this site. And he is pretty serious about it: don’t try anything funny or he will get angry!
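
    Combining a topic restriction with information from the vector database might look like the sketch below. The wording of the system instruction, the `buildMessages` helper, and the message layout (the role/content format common to chat models) are assumptions for illustration. The actual prompt used by ChaD GPT is not shown on this page.

```typescript
interface PromptMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical system instruction; the site's real wording is unknown.
const SYSTEM_INSTRUCTION =
  "Only answer questions about the site owner or this website. " +
  "If the question is off-topic, politely decline.";

// Combine the instruction, retrieved chunks, and the user's question into chat messages.
function buildMessages(question: string, retrievedChunks: string[]): PromptMessage[] {
  const contextBlock =
    retrievedChunks.length > 0
      ? retrievedChunks.map((chunk, i) => `[${i + 1}] ${chunk}`).join("\n")
      : "(no relevant context found)";
  return [
    { role: "system", content: `${SYSTEM_INSTRUCTION}\n\nContext:\n${contextBlock}` },
    { role: "user", content: question },
  ];
}

const promptMessages = buildMessages("What technologies does this site use?", [
  "The assistant is built with Transformers.js and WebGPU.",
]);
```

    Numbering the chunks keeps the context readable for a small model and makes it easy to see which passage an answer relied on.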

    If your browser does not support WebGPU, you will see a message indicating that the AI assistant is not available.
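
    A check of this kind can be sketched as follows. `navigator.gpu` and `requestAdapter()` are part of the WebGPU API. Everything else (the `checkWebGPU` helper, the status strings, and the `NavigatorLike` type that lets the function be exercised outside a browser) is hypothetical and not taken from this site's code.

```typescript
// Minimal structural types so the sketch does not depend on browser typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGPUStatus = "available" | "api-missing" | "no-adapter";

// Report whether WebGPU looks usable: the API must exist and an adapter must be returned.
async function checkWebGPU(nav: NavigatorLike): Promise<WebGPUStatus> {
  if (!nav.gpu) {
    return "api-missing"; // the browser does not expose WebGPU
  }
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "available" : "no-adapter"; // null means no suitable GPU was found
}

// In a browser this could be called as: checkWebGPU(navigator as NavigatorLike)
```

    Distinguishing a missing API from a missing adapter lets the page show a more helpful message: the first suggests switching browsers, the second points at the hardware or drivers.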

    Privacy considerations

    ChaD GPT is able to run entirely in your browser, meaning that all data and processing stay on your device and nothing is sent over the internet. However, for my convenience, the current deployment uses private instances of the two databases (Qdrant and Neo4j), which are reached over the internet.

