Ollama Series EP.1 — In an era where AI tools like ChatGPT and Claude have become essential for work, many organizations are asking: "Is our data truly safe when we send it to AI?" Ollama is the answer for organizations that want to use AI without sending data outside their walls — an open-source tool that makes running Large Language Models (LLMs) on your own machine as easy as running Docker. This article is EP.1 of the Ollama Series, covering the fundamentals — what Ollama is, what it can do, how it differs from Cloud AI, and why Thai organizations should pay attention.
In short — What is Ollama?
- Ollama is an open-source tool that lets you run LLMs on your own machine with a single command
- Supports macOS, Windows, Linux — install in under 5 minutes
- Run popular models instantly: Llama 3.1, Gemma 2, Qwen 2.5, Mistral, DeepSeek-R1, Phi-4
- Data never leaves your machine — ideal for organizations concerned about data privacy and PDPA compliance
- Zero API costs — you only pay for electricity and hardware
- Works offline — no internet connection required
- Built-in REST API — integrate with other applications immediately
What Is Ollama?
Ollama is an open-source tool that makes running Large Language Models (LLMs) on your personal computer or organization's server remarkably simple — no complex Python environment setup, no manual model weight management. Just install Ollama and type `ollama run llama3.1` to get an AI chatbot running on your machine instantly.
Think of Ollama as "Docker for AI Models" — just as Docker made running software in containers easy, Ollama makes running AI models easy. You don't need deep Machine Learning knowledge. Just know which model you want, and run `ollama pull` like you'd run `docker pull`.
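The Docker analogy carries over to the day-to-day workflow. A minimal session might look like this (`llama3.1` is just an example — any model from the Ollama library works the same way):

```shell
# Download a model from the Ollama library (like `docker pull`)
ollama pull llama3.1

# Start an interactive chat with the model (pulls it first if needed)
ollama run llama3.1

# List the models downloaded to this machine
ollama list

# Delete a model you no longer need, freeing disk space
ollama rm llama3.1
```

That handful of commands covers most everyday use — everything else in this series builds on top of them.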
Ollama was developed by a team led by Jeffrey Morgan, first released in 2023, and has grown rapidly. As of April 2026, it has surpassed 100 million downloads and 110,000+ GitHub Stars, making it the world's most popular local AI tool.
Timeline: From Small Project to Local AI Standard
| Date | Event | Significance |
|---|---|---|
| Jul 2023 | Ollama v0.1 launched | macOS support, easy Llama 2 deployment |
| Nov 2023 | Linux support added | Expanded to servers and developer workflows |
| Feb 2024 | Windows support (Preview) | All major OS covered — accessible to everyone |
| Mar 2024 | NVIDIA GPU + AMD GPU support | 5-10x faster with GPU acceleration |
| Apr 2024 | Day-one Llama 3 support | Ollama became the go-to way to try new models |
| Jul 2024 | Structured Output (JSON Mode) | Production-ready for application integration |
| Jan 2025 | DeepSeek-R1 supported immediately | Easiest way to run DeepSeek locally |
| 2025-2026 | Downloads surpass 100 million | Became the world's most popular local AI tool |
Why Local AI? — Cloud AI vs Local AI
Before understanding Ollama, it's important to know that the AI tools we use daily (ChatGPT, Claude, Gemini) are Cloud AI — every question you type is sent to AI company servers abroad for processing. This means all data you submit has already left your organization.
Local AI is the opposite approach — download AI models to your own machine or server, and process everything locally. Data never goes anywhere. Here's a clear comparison:
| Aspect | Cloud AI (ChatGPT, Claude) | Local AI (Ollama) |
|---|---|---|
| Where is data stored? | Foreign servers (US/China) | Your own machine — never leaves your org |
| Internet required? | Yes — won't work without it | No — works fully offline |
| Cost | $20/month (ChatGPT Plus) or pay-per-token | Free — only electricity + hardware costs |
| Model intelligence | GPT-4o, Claude Opus — very smart | Llama 3.1 70B, Qwen 72B — slightly behind |
| Speed | Fast (powerful servers) | Depends on hardware — GPU makes it fast |
| Data privacy compliance | Must verify each provider's DPA | Safe — data never leaves your organization |
| Customization | Limited — only what providers allow | Full control — custom prompts, fine-tuning |
| Best for | General tasks, maximum intelligence needed | Sensitive data, AI experimentation, cost savings |
What Can Ollama Do?
Ollama isn't just a chatbot — it's a platform for running AI models that supports multiple use cases:
1. Chat / Q&A
Use it like ChatGPT instantly — ask questions, summarize text, translate languages, write emails, draft documents. All without data leaving your machine. Ideal for organizations handling sensitive information like enterprise risk data or internal reports.
2. Code Generation
Models like CodeLlama, DeepSeek-Coder, and Qwen2.5-Coder excel at writing code, debugging, and explaining code. Developers can use Ollama as a private AI coding assistant without sending source code to external companies.
3. Document Analysis (RAG)
Use Ollama with RAG (Retrieval-Augmented Generation) to build AI that answers questions from your organization's documents — manuals, policies, or knowledge that's at risk of being lost. We'll cover this in detail in EP.4 of this series.
4. Image Processing (Vision)
Models like LLaVA, Moondream, and Llama 3.2 Vision can read and analyze images — extract data from receipts, analyze charts, or describe photographs. All processed on your own machine.
5. Integration with Other Apps (API)
Ollama includes a built-in REST API for seamless integration with web apps, chatbots, and automation systems. We'll explore this in detail in EP.5.
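As a minimal sketch of that API: Ollama listens on localhost:11434 by default, and a single HTTP call to `/api/generate` returns a completion. Setting `"stream": false` returns the whole answer as one JSON object instead of a token stream (this assumes the Ollama service is running and `llama3.1` has been pulled):

```shell
# Ask the local model a question over HTTP — no data leaves the machine.
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Explain PDPA in one sentence.",
  "stream": false
}'
```

The same endpoint works from any language that can make an HTTP request, which is what makes the web app, chatbot, and automation integrations possible.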
Popular Models on Ollama
Ollama supports over 100 models from its official Model Library. Here are the most popular and relevant ones for organizations:
| Model | Size | RAM Required | Best At |
|---|---|---|---|
| Llama 3.1 8B | 4.7 GB | 8 GB | General purpose, decent Thai support |
| Llama 3.1 70B | 40 GB | 48 GB | Very smart, near GPT-4 level |
| Gemma 2 9B | 5.4 GB | 8 GB | By Google, fast and accurate |
| Qwen 2.5 7B | 4.4 GB | 8 GB | By Alibaba, strong Asian language support |
| Qwen 2.5 72B | 41 GB | 48 GB | Excellent Thai, rivals GPT-4 |
| DeepSeek-R1 8B | 4.9 GB | 8 GB | Reasoning — "thinks before answering" |
| Mistral 7B | 4.1 GB | 8 GB | From France, very fast |
| Phi-4 14B | 8.4 GB | 16 GB | By Microsoft, strong reasoning |
| LLaVA 13B | 8.0 GB | 16 GB | Image understanding (Vision) |
Model Size vs RAM — Simple Rules:
- 7-8B Models (4-5 GB) — need at least 8 GB RAM — runs on most laptops
- 13-14B Models (7-9 GB) — need at least 16 GB RAM
- 70B Models (40 GB) — need 48 GB+ RAM or 48 GB GPU VRAM — requires a powerful workstation or server
- Having a GPU (NVIDIA/AMD) makes it 5-10x faster, but it runs without GPU too (just slower)
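These figures follow a rough rule of thumb (an approximation, not an official Ollama number): a 4-bit quantized model needs about half a gigabyte of memory per billion parameters for its weights, plus headroom for the context window and runtime overhead — which is why the table's actual sizes run slightly above this estimate:

```shell
# Approximate weight size in GB for a Q4-quantized model:
# params (billions) * 4 bits / 8 bits-per-byte
params_b=8
echo "$(( params_b * 4 / 8 )) GB"    # 8B model  -> 4 GB (table lists 4.7 GB)
params_b=70
echo "$(( params_b * 4 / 8 )) GB"    # 70B model -> 35 GB (table lists 40 GB)
```

The gap between the estimate and the table is that overhead — so when sizing a machine, plan for roughly double the weight size in RAM.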
Ollama for Thai Organizations — Why Should You Care?
For Thai organizations, Ollama offers clear advantages in four key areas:
1. PDPA Compliance — Data Stays In-House
Thailand's Personal Data Protection Act (PDPA) requires organizations to carefully manage personal data. Sending employee data, customer information, or financial records to Cloud AI abroad can create legal risks. Ollama solves this directly — all data stays on your machine, nothing goes outside.
2. Long-Term AI Cost Reduction
If your organization has 50 employees using ChatGPT Plus at $20/month each = $1,000/month ($12,000/year or approximately 420,000 THB/year). Setting up one Ollama server (investment of about 100,000-200,000 THB) shared by everyone via Open WebUI would be significantly cheaper in the long run — with no recurring monthly subscriptions.
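The arithmetic behind that claim, as a sketch — assuming the 35 THB/USD rate implied by the 420,000 THB figure, and a 150,000 THB server (the midpoint of the quoted range):

```shell
# Break-even calculation: 50 subscription seats vs one shared server
seats=50; usd_per_seat=20; thb_per_usd=35
yearly_thb=$(( seats * usd_per_seat * 12 * thb_per_usd ))
monthly_thb=$(( seats * usd_per_seat * thb_per_usd ))
echo "Subscriptions: ${yearly_thb} THB/year"             # 420000 THB/year
echo "Break-even: $(( 150000 / monthly_thb )) months"    # 4 months
```

Under these assumptions the server pays for itself in about four months, after which the only recurring costs are electricity and maintenance.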
3. Freely Experiment with New AI Models
Want to try DeepSeek-R1? Run `ollama run deepseek-r1`. Want Llama 3.1? Run `ollama run llama3.1`. Ollama makes experimenting with new AI models free and easy — no account signup, no credit card required.
4. Future-Proof Your AI Strategy
The AI trend is shifting from "using someone else's AI" to "having your own AI." Organizations that start experimenting with local AI today will have an advantage in adapting to the AI era, with both experience and infrastructure ready.
Who Should / Shouldn't Use Ollama
| Good Fit | Not Ideal For |
|---|---|
| Organizations concerned about data privacy / PDPA | Users who need the smartest possible AI (GPT-4o still leads) |
| Developers / IT needing an AI coding assistant | Organizations without IT staff (someone needs to set it up) |
| Organizations wanting to reduce long-term AI costs | Tasks requiring very long document processing (limited context window) |
| R&D teams experimenting with multiple models | Users unfamiliar with command line interfaces |
| Organizations needing AI that works offline | Tasks requiring full multimodal capability (audio+image+text) |
| Government / military agencies that prohibit external data transfer | Machines with less than 8 GB RAM |
Ollama and ERP Systems — How They Connect
For organizations already running ERP systems, Ollama opens the possibility of connecting AI directly to business processes while keeping data in-house. For example:
- Report Summarization: Pull data from ERP and have AI summarize it in plain language instead of reading pages of numbers
- Trend Analysis: Feed sales data, manufacturing costs, or inventory levels to AI for trend analysis and alerts
- Employee AI Assistant: Build a chatbot that answers questions about company policies, workflows, or how to use the ERP system — using RAG to pull from manuals
- Document Verification: Have AI help verify chart of accounts, purchase order accuracy, or contracts before approval
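As a hypothetical sketch of the first use case — the figures, field labels, and model name below are illustrative, not from a real ERP:

```shell
# Assemble ERP sales figures into a plain-language summary request.
prompt="Summarize the sales trend below in two sentences:
Jan: 1,200,000 THB
Feb: 950,000 THB"

# Send it to a locally pulled model (requires Ollama to be running):
# ollama run llama3.1 "$prompt"
echo "$prompt"
```

In a real deployment the figures would come from an ERP export or database query, and the request would go to Ollama's REST API rather than the interactive CLI.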
Saeree ERP + Local AI:
Saeree ERP is currently developing an AI Assistant that will let users query ERP data in natural language. If your organization is interested in AI-powered ERP, our team offers free consultations.
Ollama Series — Read More
Ollama Series — 6 Episodes, Complete Local AI Guide:
- EP.1: What Is Ollama? — Run AI on Your Own Machine (this article)
- EP.2: Install Ollama on Every OS — macOS / Windows / Linux, Run Your First AI in 5 Minutes
- EP.3: Using Ollama for Real — Choosing Models, Writing Prompts, and Creating Modelfiles
- EP.4: Ollama + RAG — Build AI That Answers from Your Documents
- EP.5: Ollama API — Connect AI to Your Apps and Enterprise Systems
- EP.6: Secure Self-Hosted AI — Security & Best Practices
"In an era where data is an organization's most valuable asset, having AI that operates within your own walls isn't an option — it's a necessity."
- Saeree ERP Team
