
What Is Ollama?


Ollama Series EP.1 — In an era where AI tools like ChatGPT and Claude have become essential for work, many organizations are asking: "Is our data truly safe when we send it to AI?" Ollama is the answer for organizations that want to use AI without sending data outside their walls — an open-source tool that makes running Large Language Models (LLMs) on your own machine as easy as running Docker. This article is EP.1 of the Ollama Series, covering the fundamentals — what Ollama is, what it can do, how it differs from Cloud AI, and why Thai organizations should pay attention.

In short — What is Ollama?

  • Ollama is an open-source tool that lets you run LLMs on your own machine with a single command
  • Supports macOS, Windows, Linux — install in under 5 minutes
  • Run popular models instantly: Llama 3.1, Gemma 2, Qwen 2.5, Mistral, DeepSeek-R1, Phi-4
  • Data never leaves your machine — ideal for organizations concerned about data privacy and PDPA compliance
  • Zero API costs — you only pay for electricity and hardware
  • Works offline — no internet connection required
  • Built-in REST API — integrate with other applications immediately

What Is Ollama?

Ollama is an open-source tool that makes running Large Language Models (LLMs) on your personal computer or organization's server remarkably simple — no complex Python environment setup, no manual model weight management. Just install Ollama and type ollama run llama3.1 to get an AI chatbot running on your machine instantly.

Think of Ollama as "Docker for AI Models" — just as Docker made running software in containers easy, Ollama makes running AI models easy. You don't need deep Machine Learning knowledge. Just know which model you want, and run ollama pull like you'd run docker pull.

Ollama was developed by a team led by Jeffrey Morgan, first released in 2023, and has grown rapidly. As of April 2026, it has surpassed 100 million downloads and 110,000+ GitHub Stars, making it the world's most popular local AI tool.

Timeline: From Small Project to Local AI Standard

Date | Event | Significance
Jul 2023 | Ollama v0.1 launched | macOS support, easy Llama 2 deployment
Nov 2023 | Linux support added | Expanded to servers and developer workflows
Feb 2024 | Windows support (Preview) | All major OS covered — accessible to everyone
Mar 2024 | NVIDIA GPU + AMD GPU support | 5-10x faster with GPU acceleration
Apr 2024 | Day-one Llama 3 support | Ollama became the go-to way to try new models
Jul 2024 | Structured Output (JSON Mode) | Production-ready for application integration
Jan 2025 | DeepSeek-R1 supported immediately | Easiest way to run DeepSeek locally
2025-2026 | Downloads surpass 100 million | Became the world's most popular local AI tool

Why Local AI? — Cloud AI vs Local AI

Before understanding Ollama, it's important to know that the AI tools we use daily (ChatGPT, Claude, Gemini) are Cloud AI — every question you type is sent to AI company servers abroad for processing. This means all data you submit has already left your organization.

Local AI is the opposite approach — download AI models to your own machine or server, and process everything locally. Data never goes anywhere. Here's a clear comparison:

Aspect | Cloud AI (ChatGPT, Claude) | Local AI (Ollama)
Where is data stored? | Foreign servers (US/China) | Your own machine — never leaves your org
Internet required? | Yes — won't work without it | No — works fully offline
Cost | $20/month (Plus) or pay-per-token | Free — only electricity + hardware costs
Model intelligence | GPT-4o, Claude Opus — very smart | Llama 3.1 70B, Qwen 72B — slightly behind
Speed | Fast (powerful servers) | Depends on hardware — GPU makes it fast
Data privacy compliance | Must verify each provider's DPA | Safe — data never leaves your organization
Customization | Limited — only what providers allow | Full control — custom prompts, fine-tuning
Best for | General tasks, maximum intelligence needed | Sensitive data, AI experimentation, cost savings

What Can Ollama Do?

Ollama isn't just a chatbot — it's a platform for running AI models that supports multiple use cases:

1. Chat / Q&A

Use it like ChatGPT instantly — ask questions, summarize text, translate languages, write emails, draft documents. All without data leaving your machine. Ideal for organizations handling sensitive information like enterprise risk data or internal reports.

2. Code Generation

Models like CodeLlama, DeepSeek-Coder, and Qwen2.5-Coder excel at writing code, debugging, and explaining code. Developers can use Ollama as a private AI coding assistant without sending source code to external companies.

3. Document Analysis (RAG)

Use Ollama with RAG (Retrieval-Augmented Generation) to build AI that answers questions from your organization's documents — manuals, policies, or knowledge that's at risk of being lost. We'll cover this in detail in EP.4 of this series.

4. Image Processing (Vision)

Models like LLaVA, Moondream, and Llama 3.2 Vision can read and analyze images — extract data from receipts, analyze charts, or describe photographs. All processed on your own machine.

5. Integration with Other Apps (API)

Ollama includes a built-in REST API for seamless integration with web apps, chatbots, and automation systems. We'll explore this in detail in EP.5.
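
As a concrete illustration, here is a minimal sketch of calling Ollama's local REST API from Python using only the standard library. It assumes Ollama is running on its default port (11434) and that the target model has already been pulled; the helper names are our own, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint.

    stream=False requests a single JSON response instead of a token stream.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama server and return the reply text.

    Requires `ollama serve` to be running and the model pulled,
    e.g. `ollama pull llama3.1`.
    """
    payload = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint is plain HTTP on localhost, any language or tool that can make an HTTP POST can integrate with Ollama the same way.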

Popular Models on Ollama

Ollama supports over 100 models from its official Model Library. Here are the most popular and relevant ones for organizations:

Model | Size | RAM Required | Best At
Llama 3.1 8B | 4.7 GB | 8 GB | General purpose, decent Thai support
Llama 3.1 70B | 40 GB | 48 GB | Very smart, near GPT-4 level
Gemma 2 9B | 5.4 GB | 8 GB | By Google, fast and accurate
Qwen 2.5 7B | 4.4 GB | 8 GB | By Alibaba, strong Asian language support
Qwen 2.5 72B | 41 GB | 48 GB | Excellent Thai, rivals GPT-4
DeepSeek-R1 8B | 4.9 GB | 8 GB | Reasoning — "thinks before answering"
Mistral 7B | 4.1 GB | 8 GB | From France, very fast
Phi-4 14B | 8.4 GB | 16 GB | By Microsoft, strong reasoning
LLaVA 13B | 8.0 GB | 16 GB | Image understanding (Vision)

Model Size vs RAM — Simple Rules:

  • 7-8B Models (4-5 GB) — need at least 8 GB RAM — runs on most laptops
  • 13-14B Models (7-9 GB) — need at least 16 GB RAM
  • 70B Models (40 GB) — need 48 GB+ RAM or 48 GB GPU VRAM — requires a powerful workstation or server
  • Having a GPU (NVIDIA/AMD) makes it 5-10x faster, but it runs without GPU too (just slower)
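
The sizing rule above can be sketched as a small helper function. This is only the article's rule of thumb encoded directly, not an official Ollama formula; real requirements vary with quantization and context length.

```python
def min_ram_gb(params_billion: float) -> int:
    """Rough minimum system RAM (GB) for a quantized model, per the rules above:
    7-8B models need 8 GB, 13-14B need 16 GB, and ~70B needs 48 GB+.
    """
    if params_billion <= 8:
        return 8
    if params_billion <= 14:
        return 16
    return 48  # 70B-class models: 48 GB+ RAM or 48 GB GPU VRAM

# Quick check against the model table:
# min_ram_gb(8) -> 8, min_ram_gb(14) -> 16, min_ram_gb(70) -> 48
```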

Ollama for Thai Organizations — Why Should You Care?

For Thai organizations, Ollama offers clear advantages in four key areas:

1. PDPA Compliance — Data Stays In-House

Thailand's Personal Data Protection Act (PDPA) requires organizations to carefully manage personal data. Sending employee data, customer information, or financial records to Cloud AI abroad can create legal risks. Ollama solves this directly — all data stays on your machine, nothing goes outside.

2. Long-Term AI Cost Reduction

If your organization has 50 employees using ChatGPT Plus at $20/month each = $1,000/month ($12,000/year or approximately 420,000 THB/year). Setting up one Ollama server (investment of about 100,000-200,000 THB) shared by everyone via Open WebUI would be significantly cheaper in the long run — with no recurring monthly subscriptions.
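
The arithmetic behind that estimate is simple enough to verify. The sketch below assumes an exchange rate of roughly 35 THB/USD, which is what the article's 420,000 THB figure implies; the function name is ours.

```python
def annual_cloud_cost_thb(users: int, usd_per_user_month: float, thb_per_usd: float) -> float:
    """Annual per-seat subscription cost, converted to THB."""
    return users * usd_per_user_month * 12 * thb_per_usd

monthly_usd = 50 * 20                              # 50 seats x $20 = $1,000/month
annual_usd = monthly_usd * 12                      # $12,000/year
annual_thb = annual_cloud_cost_thb(50, 20, 35)     # ~420,000 THB/year at ~35 THB/USD
```

Against a one-time server investment of 100,000-200,000 THB, the subscription cost overtakes the hardware cost within the first year under these assumptions.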

3. Freely Experiment with New AI Models

Want to try DeepSeek-R1? Run ollama run deepseek-r1. Want Llama 3.1? Run ollama run llama3.1. Ollama makes experimenting with new AI models free and easy — no account signup, no credit card required.

4. Future-Proof Your AI Strategy

The AI trend is shifting from "using someone else's AI" to "having your own AI." Organizations that start experimenting with local AI today will have an advantage in adapting to the AI era, with both experience and infrastructure ready.

Who Should / Shouldn't Use Ollama

Good Fit | Not Ideal For
Organizations concerned about data privacy / PDPA | Users who need the smartest possible AI (GPT-4o still leads)
Developers / IT needing an AI coding assistant | Organizations without IT staff (someone needs to set it up)
Organizations wanting to reduce long-term AI costs | Tasks requiring very long document processing (limited context window)
R&D teams experimenting with multiple models | Users unfamiliar with command line interfaces
Organizations needing AI that works offline | Tasks requiring full multimodal capability (audio+image+text)
Government / military agencies that prohibit external data transfer | Machines with less than 8 GB RAM

Ollama and ERP Systems — How They Connect

For organizations already running ERP systems, Ollama opens the possibility of connecting AI directly to business processes while keeping data in-house. For example:

  • Report Summarization: Pull data from ERP and have AI summarize it in plain language instead of reading pages of numbers
  • Trend Analysis: Feed sales data, manufacturing costs, or inventory levels to AI for trend analysis and alerts
  • Employee AI Assistant: Build a chatbot that answers questions about company policies, workflows, or how to use the ERP system — using RAG to pull from manuals
  • Document Verification: Have AI help verify chart of accounts, purchase order accuracy, or contracts before approval
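
To make the report-summarization idea concrete, here is a minimal sketch of turning ERP rows into a prompt for a local model. The field names (`month`, `revenue_thb`) are hypothetical stand-ins, not Saeree ERP's actual schema.

```python
def build_summary_prompt(rows: list) -> str:
    """Turn ERP rows (hypothetical fields: month, revenue_thb) into a
    plain-language summarization prompt for a local model."""
    lines = [f"{r['month']}: revenue {r['revenue_thb']:,} THB" for r in rows]
    return (
        "Summarize the following monthly sales figures in plain language, "
        "noting any clear trend:\n" + "\n".join(lines)
    )

# Example rows pulled from the ERP database (illustrative figures only)
sales = [
    {"month": "2026-01", "revenue_thb": 1_200_000},
    {"month": "2026-02", "revenue_thb": 1_350_000},
]
prompt = build_summary_prompt(sales)
# The prompt can then be POSTed to a local Ollama server (default port 11434),
# so the figures never leave the machine.
```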

Saeree ERP + Local AI:

Saeree ERP is currently developing an AI Assistant that will enable users to query ERP data using natural language. For organizations interested in AI-powered ERP, feel free to consult our team for free.


"In an era where data is an organization's most valuable asset, having AI that operates within your own walls isn't an option — it's a necessity."

- Saeree ERP Team

Interested in ERP for Your Organization?

Consult with Grand Linux Solution experts — free of charge

Request a Free Demo

Call 02-347-7730 | sale@grandlinux.com


About the Author

Paitoon Butri

Network & Server Security Specialist, Grand Linux Solution Co., Ltd.