Internet in a box

JTH

I've been going down numerous rabbit holes testing Llama AI models, running them in a VM. I'm losing about 10-15% of my resources to overhead, but this is still a beta setup and I don't want to alter the host system.

It started with the idea that I wanted an Internet-in-a-box type of cyberdeck (a fully offline device) for when the world ends.
You can download all of Wikipedia, and of course any other documents you deem relevant to rebuilding the world.
I can then build an extensive library and use purpose-built AI models to sift through it when I need quick indexing or answers.
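
The "sift through it" idea can be sketched as a simple offline keyword index over a local document library. This is a minimal illustration, not anything from the original post; the document ids and texts are made up, and a real build would use a proper search or embedding library.

```python
# Minimal sketch of an offline keyword index over a local document
# library. All document names and contents here are hypothetical.
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

library = {
    "med-01": "Treating dehydration with oral rehydration salts.",
    "farm-01": "Crop rotation keeps soil nitrogen levels healthy.",
    "farm-02": "Composting improves soil structure and drainage.",
}
index = build_index(library)
print(sorted(search(index, "soil")))  # both farming docs match
```

With a library this small an index is overkill, but the same shape scales to a Wikipedia dump, and the search results can then be fed to a local model as context.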

If you have the right data and models, you can build advisors: a doctor, a farmer, etc.

Anyhow, I rebuilt the laptop, since I needed one large partition to store the models and databases in one place. An 8-core AMD Ryzen 7 4700U with 16GB of RAM and 500GB of storage (without a dedicated GPU) can run small Llama 3 8B Instruct models if they are heavily quantized. That's roughly the equivalent of ChatGPT 3.5, but with about half the context memory, so I can run similar (but shorter) tasks.
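
A rough back-of-envelope shows why heavy quantization is what makes an 8B model fit next to the OS in 16GB of RAM. The bits-per-weight figure below is an approximate average for common 4-bit quantization schemes, not an exact spec, and the estimate ignores KV cache and runtime overhead.

```python
# Back-of-envelope: weight memory for an 8B-parameter model at
# different precisions. Bits-per-weight values are rough averages,
# not exact figures for any specific quantization format.
def model_ram_gb(params_billion, bits_per_weight):
    """Approximate weight memory in GB (ignores KV cache and overhead)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

fp16 = model_ram_gb(8, 16)   # unquantized half precision: 16.0 GB, too big
q4 = model_ram_gb(8, 4.5)    # ~4-bit quant: about 4.5 GB of weights

print(f"fp16: {fp16:.1f} GB, 4-bit: {q4:.1f} GB")
```

At half precision the weights alone would consume all 16GB; around 4 bits per weight they drop to roughly 4-5GB, leaving headroom for the context cache and the rest of the system.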
 
Interesting. Good luck.
 
Cool stuff JTH,

So, which models have stood out to you so far? And what is the largest one you've been able to run on your VM?

When I find time this week, I'm thinking of downloading and playing with Qwen3-4B-Thinking-2507 from LM Studio.
 

Nothing stands out just yet; I've only been running basic reasoning tests to compare how they answer questions. Truth be told, it's both better than I expected (but slow) and disappointing, because I know I don't have the specs to run larger models. I'm slamming the CPUs hard, which means that while the models are running I have no room for real-world tasks.

We both know what's going to happen here: I'm going to have to build a machine...

Screenshot_2026-03-23_19-57-41.png
 