
What Is Ollama?


Ollama Series EP.1 — In an era where AI tools like ChatGPT and Claude have become essential for work, many organizations are asking: "Is our data truly safe when we send it to AI?" Ollama is the answer for organizations that want to use AI without sending data outside their walls — an open-source tool that makes running Large Language Models (LLMs) on your own machine as easy as running Docker. This article is EP.1 of the Ollama Series, covering the fundamentals — what Ollama is, what it can do, how it differs from Cloud AI, and why Thai organizations should pay attention.

In short — What is Ollama?

  • Ollama is an open-source tool that lets you run LLMs on your own machine with a single command
  • Supports macOS, Windows, Linux — install in under 5 minutes
  • Run popular models instantly: Llama 3.1, Gemma 2, Qwen 2.5, Mistral, DeepSeek-R1, Phi-4
  • Data never leaves your machine — ideal for organizations concerned about data privacy and PDPA compliance
  • Zero API costs — you only pay for electricity and hardware
  • Works offline — no internet connection required
  • Built-in REST API — integrate with other applications immediately

What Is Ollama?

Ollama is an open-source tool that makes running Large Language Models (LLMs) on your personal computer or organization's server remarkably simple — no complex Python environment setup, no manual model weight management. Just install Ollama and type ollama run llama3.1 to get an AI chatbot running on your machine instantly.

Think of Ollama as "Docker for AI Models" — just as Docker made running software in containers easy, Ollama makes running AI models easy. You don't need deep Machine Learning knowledge. Just know which model you want, and run ollama pull like you'd run docker pull.

Ollama was developed by a team led by Jeffrey Morgan, first released in 2023, and has grown rapidly. As of April 2026, it has surpassed 100 million downloads and 110,000+ GitHub Stars, making it the world's most popular local AI tool.

Timeline: From Small Project to Local AI Standard

Date | Event | Significance
Jul 2023 | Ollama v0.1 launched | macOS support, easy Llama 2 deployment
Nov 2023 | Linux support added | Expanded to servers and developer workflows
Feb 2024 | Windows support (Preview) | All major OS covered — accessible to everyone
Mar 2024 | NVIDIA GPU + AMD GPU support | 5-10x faster with GPU acceleration
Apr 2024 | Day-one Llama 3 support | Ollama became the go-to way to try new models
Jul 2024 | Structured Output (JSON Mode) | Production-ready for application integration
Jan 2025 | DeepSeek-R1 supported immediately | Easiest way to run DeepSeek locally
2025-2026 | Downloads surpass 100 million | Became the world's most popular local AI tool

Why Local AI? — Cloud AI vs Local AI

Before understanding Ollama, it's important to know that the AI tools we use daily (ChatGPT, Claude, Gemini) are Cloud AI — every question you type is sent to AI company servers abroad for processing. This means all data you submit has already left your organization.

Local AI is the opposite approach — download AI models to your own machine or server, and process everything locally. Data never goes anywhere. Here's a clear comparison:

Aspect | Cloud AI (ChatGPT, Claude) | Local AI (Ollama)
Where is data stored? | Foreign servers (US/China) | Your own machine — never leaves your org
Internet required? | Yes — won't work without it | No — works fully offline
Cost | $20/month (Plus) or pay-per-token | Free — only electricity + hardware costs
Model intelligence | GPT-4o, Claude Opus — very smart | Llama 3.1 70B, Qwen 72B — slightly behind
Speed | Fast (powerful servers) | Depends on hardware — GPU makes it fast
Data privacy compliance | Must verify each provider's DPA | Safe — data never leaves your organization
Customization | Limited — only what providers allow | Full control — custom prompts, fine-tuning
Best for | General tasks, maximum intelligence needed | Sensitive data, AI experimentation, cost savings

What Can Ollama Do?

Ollama isn't just a chatbot — it's a platform for running AI models that supports multiple use cases:

1. Chat / Q&A

Use it like ChatGPT instantly — ask questions, summarize text, translate languages, write emails, draft documents. All without data leaving your machine. Ideal for organizations handling sensitive information like enterprise risk data or internal reports.

2. Code Generation

Models like CodeLlama, DeepSeek-Coder, and Qwen2.5-Coder excel at writing code, debugging, and explaining code. Developers can use Ollama as a private AI coding assistant without sending source code to external companies.

3. Document Analysis (RAG)

Use Ollama with RAG (Retrieval-Augmented Generation) to build AI that answers questions from your organization's documents — manuals, policies, or knowledge that's at risk of being lost. We'll cover this in detail in EP.4 of this series.

4. Image Processing (Vision)

Models like LLaVA, Moondream, and Llama 3.2 Vision can read and analyze images — extract data from receipts, analyze charts, or describe photographs. All processed on your own machine.

5. Integration with Other Apps (API)

Ollama includes a built-in REST API for seamless integration with web apps, chatbots, and automation systems. We'll explore this in detail in EP.5.
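
As a concrete illustration, here is a minimal sketch of calling Ollama's local REST API from Python using only the standard library. It assumes Ollama is running on its default port (11434) and that the target model has already been pulled; the helper names are our own, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint.

    stream=False requests a single JSON response instead of a token stream.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama server and return the reply text.

    Requires `ollama serve` to be running and the model pulled,
    e.g. `ollama pull llama3.1`.
    """
    payload = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint is plain HTTP on localhost, any language or tool that can make an HTTP POST can integrate with Ollama the same way.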

Popular Models on Ollama

Ollama supports over 100 models from its official Model Library. Here are the most popular and relevant ones for organizations:

Model | Size | RAM Required | Best At
Llama 3.1 8B | 4.7 GB | 8 GB | General purpose, decent Thai support
Llama 3.1 70B | 40 GB | 48 GB | Very smart, near GPT-4 level
Gemma 2 9B | 5.4 GB | 8 GB | By Google, fast and accurate
Qwen 2.5 7B | 4.4 GB | 8 GB | By Alibaba, strong Asian language support
Qwen 2.5 72B | 41 GB | 48 GB | Excellent Thai, rivals GPT-4
DeepSeek-R1 8B | 4.9 GB | 8 GB | Reasoning — "thinks before answering"
Mistral 7B | 4.1 GB | 8 GB | From France, very fast
Phi-4 14B | 8.4 GB | 16 GB | By Microsoft, strong reasoning
LLaVA 13B | 8.0 GB | 16 GB | Image understanding (Vision)

Model Size vs RAM — Simple Rules:

  • 7-8B Models (4-5 GB) — need at least 8 GB RAM — runs on most laptops
  • 13-14B Models (7-9 GB) — need at least 16 GB RAM
  • 70B Models (40 GB) — need 48 GB+ RAM or 48 GB GPU VRAM — requires a powerful workstation or server
  • Having a GPU (NVIDIA/AMD) makes it 5-10x faster, but it runs without GPU too (just slower)
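
The sizing rule above can be sketched as a small helper function. This is only the article's rule of thumb encoded directly, not an official Ollama formula; real requirements vary with quantization and context length.

```python
def min_ram_gb(params_billion: float) -> int:
    """Rough minimum system RAM (GB) for a quantized model, per the rules above:
    7-8B models need 8 GB, 13-14B need 16 GB, and ~70B needs 48 GB+.
    """
    if params_billion <= 8:
        return 8
    if params_billion <= 14:
        return 16
    return 48  # 70B-class models: 48 GB+ RAM or 48 GB GPU VRAM

# Quick check against the model table:
# min_ram_gb(8) -> 8, min_ram_gb(14) -> 16, min_ram_gb(70) -> 48
```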

Ollama for Thai Organizations — Why Should You Care?

For Thai organizations, Ollama offers clear advantages in four key areas:

1. PDPA Compliance — Data Stays In-House

Thailand's Personal Data Protection Act (PDPA) requires organizations to carefully manage personal data. Sending employee data, customer information, or financial records to Cloud AI abroad can create legal risks. Ollama solves this directly — all data stays on your machine, nothing goes outside.

2. Long-Term AI Cost Reduction

If your organization has 50 employees using ChatGPT Plus at $20/month each = $1,000/month ($12,000/year or approximately 420,000 THB/year). Setting up one Ollama server (investment of about 100,000-200,000 THB) shared by everyone via Open WebUI would be significantly cheaper in the long run — with no recurring monthly subscriptions.
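
The arithmetic behind that estimate is simple enough to verify. The sketch below assumes an exchange rate of roughly 35 THB/USD, which is what the article's 420,000 THB figure implies; the function name is ours.

```python
def annual_cloud_cost_thb(users: int, usd_per_user_month: float, thb_per_usd: float) -> float:
    """Annual per-seat subscription cost, converted to THB."""
    return users * usd_per_user_month * 12 * thb_per_usd

monthly_usd = 50 * 20                              # 50 seats x $20 = $1,000/month
annual_usd = monthly_usd * 12                      # $12,000/year
annual_thb = annual_cloud_cost_thb(50, 20, 35)     # ~420,000 THB/year at ~35 THB/USD
```

Against a one-time server investment of 100,000-200,000 THB, the subscription cost overtakes the hardware cost within the first year under these assumptions.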

3. Freely Experiment with New AI Models

Want to try DeepSeek-R1? Run ollama run deepseek-r1. Want Llama 3.1? Run ollama run llama3.1. Ollama makes experimenting with new AI models free and easy — no account signup, no credit card required.

4. Future-Proof Your AI Strategy

The AI trend is shifting from "using someone else's AI" to "having your own AI." Organizations that start experimenting with local AI today will have an advantage in adapting to the AI era, with both experience and infrastructure ready.

Who Should / Shouldn't Use Ollama

Good Fit | Not Ideal For
Organizations concerned about data privacy / PDPA | Users who need the smartest possible AI (GPT-4o still leads)
Developers / IT needing an AI coding assistant | Organizations without IT staff (someone needs to set it up)
Organizations wanting to reduce long-term AI costs | Tasks requiring very long document processing (limited context window)
R&D teams experimenting with multiple models | Users unfamiliar with command line interfaces
Organizations needing AI that works offline | Tasks requiring full multimodal capability (audio+image+text)
Government / military agencies that prohibit external data transfer | Machines with less than 8 GB RAM

Ollama and ERP Systems — How They Connect

For organizations already running ERP systems, Ollama opens the possibility of connecting AI directly to business processes while keeping data in-house. For example:

  • Report Summarization: Pull data from ERP and have AI summarize it in plain language instead of reading pages of numbers
  • Trend Analysis: Feed sales data, manufacturing costs, or inventory levels to AI for trend analysis and alerts
  • Employee AI Assistant: Build a chatbot that answers questions about company policies, workflows, or how to use the ERP system — using RAG to pull from manuals
  • Document Verification: Have AI help verify chart of accounts, purchase order accuracy, or contracts before approval
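
To make the report-summarization idea concrete, here is a minimal sketch of turning ERP rows into a prompt for a local model. The field names (`month`, `revenue_thb`) are hypothetical stand-ins, not Saeree ERP's actual schema.

```python
def build_summary_prompt(rows: list) -> str:
    """Turn ERP rows (hypothetical fields: month, revenue_thb) into a
    plain-language summarization prompt for a local model."""
    lines = [f"{r['month']}: revenue {r['revenue_thb']:,} THB" for r in rows]
    return (
        "Summarize the following monthly sales figures in plain language, "
        "noting any clear trend:\n" + "\n".join(lines)
    )

# Example rows pulled from the ERP database (illustrative figures only)
sales = [
    {"month": "2026-01", "revenue_thb": 1_200_000},
    {"month": "2026-02", "revenue_thb": 1_350_000},
]
prompt = build_summary_prompt(sales)
# The prompt can then be POSTed to a local Ollama server (default port 11434),
# so the figures never leave the machine.
```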

Saeree ERP + Local AI:

Saeree ERP is currently developing an AI Assistant that will enable users to query ERP data using natural language. For organizations interested in AI-powered ERP, feel free to consult our team for free.


"In an era where data is an organization's most valuable asset, having AI that operates within your own walls isn't an option — it's a necessity."

- Saeree ERP Team

Interested in ERP for Your Organization?

Consult with Grand Linux Solution experts — free of charge

Request a Free Demo

Call 02-347-7730 | sale@grandlinux.com


About the Author

Paitoon Butri

Network & Server Security Specialist, Grand Linux Solution Co., Ltd.