
Ollama: Installation and Getting Started

Ollama is a tool that simplifies running Large Language Models (LLMs) locally. This guide will help you install and start using Ollama in your DevOps environment.

📦 Installation

macOS

# Using Homebrew (recommended)
brew install ollama

# Or using the official script
curl -fsSL https://ollama.ai/install.sh | sh

Linux

# Ubuntu/Debian
curl -fsSL https://ollama.ai/install.sh | sh

# Or download the standalone Linux binary from the project's GitHub releases
# (Ollama is not packaged in the standard Ubuntu/Debian repositories)

Windows

# Using Winget
winget install Ollama.Ollama

# Or download from the official website

Installation verification

ollama --version
# Example output: ollama version is 0.1.x
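
If the service is already running, you can also confirm that the local API answers. This is a minimal check assuming the default port 11434 and the /api/version endpoint of Ollama's HTTP API:

# Optional sanity check of the local API (default port assumed)
curl -s http://localhost:11434/api/version
# Should return a small JSON object containing the installed version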

🚀 Getting Started

Start Ollama

# Start the server (runs in the foreground; keep this terminal open)
ollama serve

# Or on macOS/Linux with launchd/systemd
brew services start ollama  # macOS
sudo systemctl start ollama # Linux
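
On a fresh server it can take a moment before the API responds. A minimal readiness loop, assuming the default bind address localhost:11434:

# Wait until the API answers before pulling models
until curl -sf http://localhost:11434/api/tags > /dev/null; do
    echo "Waiting for Ollama..."
    sleep 1
done
echo "Ollama is up"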

Download your first model

# List locally installed models (empty on a fresh install; browse https://ollama.com/library for available models)
ollama list

# Download a small model to start
ollama pull llama2:7b

# Popular models
ollama pull llama2          # 7B parameters, good balance
ollama pull codellama       # Specialized in code
ollama pull mistral         # Efficient model
ollama pull phi             # Very small, fast
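
Pulls are easy to script. A small sketch (the model list is just an example) that downloads several models and then shows what ended up on disk:

# Pull several models in one go
for model in llama2:7b codellama mistral; do
    ollama pull "$model"
done

# Confirm what is installed and how much disk each model uses
ollama list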

Run a model

# Interactive mode
ollama run llama2

# Once inside, you can ask questions:
# >>> What is Kubernetes?
# >>> Create a bash script for backup
# >>> Explain container concept
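
ollama run also accepts a one-shot prompt as an argument, which is handy in scripts:

# Non-interactive, one-shot prompt (the answer goes to stdout)
ollama run llama2 "What is Kubernetes? Answer in two sentences."

# In interactive mode, /? lists the built-in commands and /bye exits the session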

🛠️ Advanced Usage

Run specific models

# Model variants of different sizes (Llama 2 ships as 7b, 13b, and 70b tags)
ollama run llama2:13b      # Smarter, slower
ollama run llama2:7b       # Balance of speed and quality
ollama run phi             # Much smaller model: very fast, less capable

# Specialized models
ollama run codellama       # For code generation
ollama run mathstral       # For mathematics
ollama run llama2-uncensored # Without content restrictions
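
Bigger variants need noticeably more memory (as a rough rule of thumb, around 8 GB of RAM for 7b models and 16 GB for 13b). A quick check before launching one:

# Quick resource check before launching a larger variant
free -h           # Available RAM
df -h ~/.ollama   # Disk space used by downloaded models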

REST API

# Start the API server (not needed if the service is already running)
ollama serve

# Verify it's running
curl http://localhost:11434/api/tags

# Python usage example (requires the requests package)
import requests

response = requests.post('http://localhost:11434/api/generate',
    json={
        'model': 'llama2',
        'prompt': 'Explain Docker in 3 lines',
        'stream': False
    })

print(response.json()['response'])
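
The API also exposes a chat endpoint, /api/chat, which takes a list of messages instead of a single prompt. A curl sketch with the same llama2 model:

# Chat-style request (multi-turn conversations go in the "messages" array)
curl -s http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "stream": false,
  "messages": [
    {"role": "user", "content": "Give me one kubectl command to list pods in all namespaces"}
  ]
}'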

Model Management

# List installed models
ollama list

# View detailed information
ollama show llama2

# Remove a model
ollama rm llama2:7b

# Copy a model under a new name
ollama cp llama2 my-custom-model

# Create custom model
echo 'FROM llama2
PARAMETER temperature 0.8
PARAMETER top_p 0.9' > Modelfile

ollama create my-model -f Modelfile
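
A Modelfile can also bake in a system prompt. A sketch (the devops-assistant name is just an example), plus how recent versions let you inspect the Modelfile a model was built from:

# Custom model with a built-in system prompt (names are illustrative)
echo 'FROM llama2
SYSTEM You are a concise DevOps assistant that answers with commands first.
PARAMETER temperature 0.3' > Modelfile.devops

ollama create devops-assistant -f Modelfile.devops

# Recent Ollama versions can print the Modelfile a model was built from
ollama show devops-assistant --modelfile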

⚙️ Configuration and Optimization

Environment Variables

# Models directory
export OLLAMA_MODELS=/opt/ollama/models

# Bind address and port (the default is 127.0.0.1:11434)
export OLLAMA_HOST=0.0.0.0:8080

# How long models stay loaded in memory after a request (default is 5 minutes)
export OLLAMA_KEEP_ALIVE=30m

# GPU offloading is automatic; the number of offloaded layers is tuned per model
# via the num_gpu parameter (Modelfile or API options), not an environment variable

System Configuration

# Ollama does not read a config.yaml: runtime behavior is controlled through
# environment variables, and per-model parameters (temperature, top_p, etc.)
# live in Modelfiles as shown above.
# On Linux, persist environment variables with a systemd override:
sudo systemctl edit ollama
# then add, for example:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0:11434"
#   Environment="OLLAMA_MODELS=/opt/ollama/models"
sudo systemctl restart ollama
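
Parameters can also be overridden per request through the API's "options" field instead of a Modelfile. A curl sketch (note the API uses num_predict rather than max_tokens for the output length):

# Per-request parameter overrides via the "options" field
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Summarize what a Kubernetes Deployment does",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "top_p": 0.9,
    "num_predict": 2048
  }
}'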

🔧 DevOps Integration

In Docker

FROM ollama/ollama:latest

# Pre-download models
RUN ollama serve & sleep 5 && \
    ollama pull llama2:7b && \
    ollama pull codellama:7b

EXPOSE 11434
# The base image already sets ENTRYPOINT ["/bin/ollama"], so pass only the subcommand
CMD ["serve"]
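
A rough build-and-run sketch for the image above (the image and volume names are just examples):

# Build and run the image with persistent model storage
docker build -t ollama-preloaded .
docker run -d --name ollama -p 11434:11434 \
    -v ollama-data:/root/.ollama \
    ollama-preloaded
# Add --gpus=all if the NVIDIA Container Toolkit is installed on the host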

In Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
          limits:
            memory: "8Gi"
            cpu: "4"
        volumeMounts:
        - name: models
          mountPath: /root/.ollama/models
      volumes:
      - name: models
        persistentVolumeClaim:
          claimName: ollama-models-pvc
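
Once the manifest is applied (the file name and the Service below are assumptions; the resource names follow the Deployment above), models can be preloaded into the running pod so they land on the persistent volume:

# Deploy, expose inside the cluster, and preload a model
kubectl apply -f ollama-deployment.yaml
kubectl expose deployment ollama --port 11434 --target-port 11434

# Pull a model inside the running pod (it is stored on the mounted PVC)
kubectl exec deploy/ollama -- ollama pull llama2:7b

# Quick test from a workstation
kubectl port-forward deploy/ollama 11434:11434 &
curl http://localhost:11434/api/tags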

Automation Scripts

#!/bin/bash
# setup_ollama.sh

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start service
sudo systemctl enable ollama
sudo systemctl start ollama

# Wait for the service to be ready
sleep 10

# Download essential models
ollama pull llama2:7b
ollama pull codellama:7b
ollama pull mistral:7b

echo "Ollama configured with basic models"

📊 Monitoring and Troubleshooting

View logs

# Service logs
journalctl -u ollama -f

# Ollama logs
tail -f ~/.ollama/logs/server.log
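
For more detail, Ollama supports an OLLAMA_DEBUG environment variable that raises log verbosity. A sketch for a one-off foreground run:

# Verbose logging for a one-off foreground run (stop the background service first)
OLLAMA_DEBUG=1 ollama serve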

Common issues

# If it doesn't start
sudo systemctl status ollama
ps aux | grep ollama

# If it doesn't respond
curl http://localhost:11434/api/tags

# Free GPU/RAM by unloading loaded models
ollama ps             # Show which models are currently loaded
ollama stop llama2    # Unload a specific model (requires a recent Ollama version)

Performance metrics

# View GPU usage
nvidia-smi

# View processes
ps aux --sort=-%mem | head

# Continuous monitoring
watch -n 1 nvidia-smi

🎯 DevOps Use Cases

1. Log analysis

# Analyze application logs
cat app.log | ollama run llama2 "Analyze these logs and find errors:"

# Kubernetes troubleshooting
kubectl logs pod-name | ollama run llama2 "Explain these K8s errors:"
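
Models have a limited context window, so it usually pays to filter the log before piping it in:

# Keep the prompt within the context window by filtering the log first
grep -i error app.log | tail -n 100 | \
    ollama run llama2 "Summarize these errors and suggest likely root causes:"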

2. Code generation

# Automation scripts
ollama run codellama "Create a bash script for PostgreSQL backup"

# IaC configurations
ollama run llama2 "Generate Terraform configuration for an EKS cluster"

3. Documentation

# Create README
ollama run llama2 "Create a README.md for a user REST API"

# Document code
ollama run codellama "Document this Python function:"

🔗 Additional Resources