These notes describe how to run Large Language Models (LLMs) on MS Windows fully isolated from all network connections. The software is not ChatGPT but a similar open-source implementation that can be driven with a variety of downloadable weights (the weights are, in essence, the knowledge of the model).

This approach allows computation with LLMs on air-gapped systems, on temporarily disconnected secure machines, on secure virtual machines, or, as in this case, in Windows Sandbox, the easy-to-use isolated VM built into Windows.

Use of carefully isolated machines is essential for work where valuable or important data or information is being processed, e.g., in legal text searches, security work, etc.

Preparation

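I am using the llamafile project and the Mistral 7B-parameter model. They can be downloaded as follows (in PowerShell; warning: 5GB+ download!). The commands call curl.exe explicitly because plain curl in Windows PowerShell is an alias for Invoke-WebRequest, and -L makes curl follow the redirects that both download sites use:

 curl.exe -L -o llamafile.exe https://github.com/Mozilla-Ocho/llamafile/releases/download/0.2.1/llamafile-server-0.2.1
 curl.exe -L -o mistral.gguf https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf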

This is the only step that needs to be done on an internet-connected computer; these two files are all that is needed.
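Since the two files are later copied by hand into an isolated machine, it can be worth recording a checksum on the connected computer and comparing it after the copy. The following is a minimal sketch, not part of the original workflow: it assumes a Python interpreter is available on both sides, and the function names are my own. It also checks that the weights file begins with the four-byte GGUF magic, which catches the case where an error page was saved instead of the model.

```python
import hashlib
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # a GGUF weights file starts with these four bytes


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    # File names as used in the download commands above.
    for name in ("llamafile.exe", "mistral.gguf"):
        path = Path(name)
        if path.exists():
            print(name, path.stat().st_size, sha256_of(path))
    if Path("mistral.gguf").exists():
        print("GGUF header present:", looks_like_gguf("mistral.gguf"))
```

Running the script once before and once after the copy and comparing the printed digests by eye is enough to confirm the transfer was not truncated.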

Running

I run the model in a completely isolated Windows Sandbox with no network access, using the following configuration file (saved with a .wsb extension):

<Configuration>
  <VGpu>Enable</VGpu>
  <Networking>Disable</Networking>
  <MemoryInMB>11000</MemoryInMB>
</Configuration>
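If the memory limit needs to change for different models, the same configuration can be generated rather than edited by hand. The sketch below is only an illustration under my own assumptions: the function name and defaults are hypothetical, and it emits just the three elements shown above.

```python
import xml.etree.ElementTree as ET


def sandbox_config(memory_mb=11000, vgpu=True, networking=False):
    """Build the Windows Sandbox configuration shown above as an XML string."""
    root = ET.Element("Configuration")
    ET.SubElement(root, "VGpu").text = "Enable" if vgpu else "Disable"
    ET.SubElement(root, "Networking").text = "Enable" if networking else "Disable"
    ET.SubElement(root, "MemoryInMB").text = str(memory_mb)
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Redirect this output into a .wsb file to use it.
    print(sandbox_config())
```

Networking defaults to disabled here on purpose, so that forgetting an argument cannot produce a connected sandbox.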

Then start the sandbox and simply copy and paste the two files downloaded above into it.

The server, with its front-end in a browser (all running locally), can be started with:

llamafile.exe -m mistral.gguf
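Besides the browser front-end, the running server can in principle be driven from a script inside the sandbox. The sketch below rests on several assumptions that should be checked against the llamafile version in use: that the bundled server listens on 127.0.0.1 port 8080, that it offers the llama.cpp-style /completion endpoint taking a JSON body with "prompt" and "n_predict", and that it replies with JSON containing a "content" field. It also assumes a Python interpreter has been copied into the sandbox. The function names are my own.

```python
import json
import urllib.request

# Assumed default address of the local server; adjust if yours differs.
SERVER_URL = "http://127.0.0.1:8080/completion"


def build_request(prompt, n_predict=500, url=SERVER_URL):
    """Prepare a POST request carrying the prompt as a JSON body."""
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, n_predict=500, url=SERVER_URL, timeout=600):
    """Send the prompt to the local server and return the generated text."""
    request = build_request(prompt, n_predict, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # "content" is the assumed name of the field holding the completion.
    return reply.get("content", "")
```

With the server running, complete("[INST]Say hello in one sentence.[/INST]", n_predict=50) should return the model's reply as a string. Only the loopback address is used, so this works with networking disabled.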

Example use

As an example, I used the model to summarise the famous “Worse Is Better” essay by Richard P. Gabriel. Simply save the text of the essay, prefix it with [INST]Summarize the following text: and postfix it with [/INST], then copy and paste the result into the prompt.
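The wrapping step can also be done by a few lines of code instead of by hand. This is a minimal sketch: the function name and the essay.txt file name are hypothetical, and the newline between the instruction and the essay text is my own choice, since the description above does not say what separator to use.

```python
from pathlib import Path


def build_summary_prompt(text, instruction="Summarize the following text:"):
    """Wrap the text in the [INST] ... [/INST] markers described above."""
    # The newline separator is an assumption; a space would also do.
    return f"[INST]{instruction}\n{text.strip()}[/INST]"


if __name__ == "__main__":
    essay = Path("essay.txt")  # hypothetical file holding the saved essay
    if essay.exists():
        print(build_summary_prompt(essay.read_text(encoding="utf-8")))
```

The printed result can then be pasted into the prompt box of the browser front-end.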

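A good set of parameters to run the server seems to be the following, where -c sets the context size in tokens, -ngl the number of model layers offloaded to the GPU, and -n the maximum number of tokens to generate: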

llamafile.exe -m mistral.gguf -c 6700 -ngl 35 -n 500
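The context set with -c has to hold both the prompt and the generated tokens, so with these values roughly 6700 - 500 = 6200 tokens remain for the essay and its instruction. A quick way to guess whether a text will fit is sketched below. The figure of about four characters per token is only a rough rule of thumb for English text and is my assumption, not a property of the Mistral tokenizer; the function names are likewise my own.

```python
CONTEXT_TOKENS = 6700   # value passed with -c
MAX_NEW_TOKENS = 500    # value passed with -n
CHARS_PER_TOKEN = 4     # rough rule of thumb, an assumption


def prompt_budget(context=CONTEXT_TOKENS, new_tokens=MAX_NEW_TOKENS):
    """Tokens left for the prompt once the reply has been reserved."""
    return context - new_tokens


def estimated_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Crude token estimate: character count divided by four, rounded up."""
    return -(-len(text) // chars_per_token)


def fits(text):
    """True if the estimated prompt size is within the budget."""
    return estimated_tokens(text) <= prompt_budget()


if __name__ == "__main__":
    sample = "word " * 1000
    print(prompt_budget(), estimated_tokens(sample), fits(sample))
```

If a text does not fit, the choices are to raise -c (at the cost of memory), lower -n, or split the text into parts.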

The following is the output from the Mistral model in this case, without further comment except that reading the essay is well worth your time:

Thank you for providing this text. It seems like there are two different approaches to software design and implementation, which are known as the MIT approach and the New Jersey approach. The MIT approach emphasizes correctness, consistency, completeness, and simplicity in both the implementation and interface of a design. On the other hand, the New Jersey approach prioritizes simplicity over everything else, even if it means sacrificing correctness or consistency.

It’s interesting to note that the author believes that worse-is-better is actually a better philosophy than the MIT approach when used for software implementation. The author uses an example of Unix and C as examples of systems designed using the New Jersey approach, which are known for being easy to port and requiring fewer resources.

The author also argues that the programmer is conditioned to sacrifice some safety, convenience, and hassle to get good performance and modest resource use when using the New Jersey approach, resulting in programs that work well both on small and large machines. Additionally, the author states that the initial virus (Unix or C) has to be basically good, but once it spreads and becomes portable, there will be pressure to improve it over time.

Overall, this text presents an interesting perspective on software design and implementation philosophies and highlights the trade-offs between different approaches. It’s worth considering these ideas when designing and implementing software systems.

[Screenshot of the model in use]

