Initial Commit w/ Host Playbooks

Caleb Braaten 2024-02-06 12:36:51 -08:00
commit 4351c6803b
9 changed files with 346 additions and 0 deletions

README.md Normal file (54 lines)

@ -0,0 +1,54 @@
# Welcome to my Lab 👻
## What is this?
This is a repo that contains all the code and documentation for my homelab/server(s).
My servers run [Alpine Linux](https://www.alpinelinux.org/about/), chosen for its small, lightweight footprint that leaves more resources for my applications. [ZFS](https://openzfs.org/wiki/Main_Page) acts as the storage layer for my applications and data and provides various protections against data loss. (Backups are still important! Don't store anything in one spot unless you're willing to lose it.) New hosts can be configured with [Ansible](https://docs.ansible.com/ansible/latest/getting_started/index.html) and joined at the application layer.
Docker is used to containerize applications, which are deployed with [Nomad](https://www.nomadproject.io/). This repo contains those job specs for you to build off of or use yourself. [Consul](https://www.consul.io/) provides service discovery and the service mesh. [Traefik](https://traefik.io/) acts as the reverse proxy: tags in the Nomad job specs register routes with it, exposing the Consul services to the outside world.
## Getting Started Yourself
### Host Init
The `host_init` directory contains the Ansible playbooks that configure a new host, along with a [readme](./host_init/README.md) that walks you through the process. Make sure you have [installed Ansible](https://docs.ansible.com/ansible/latest/installation_guide/installation_distros.html) if you want to use the playbooks.
On MacOS, you can install Ansible with [Homebrew](https://brew.sh/):
```bash
brew install ansible
```
Once all the playbooks complete, you should be able to access the following web interfaces:
| Service | URL |
| --- | --- |
| Consul | http://\<host>:8500 |
| Nomad | http://\<host>:4646 |
| Traefik | http://\<host>:8080 |
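As a quick check from the command line (a minimal sketch: these are the stock HTTP APIs on the default ports, and the Traefik endpoint assumes the insecure API/dashboard is enabled, as is typical in a lab config):
```bash
curl http://<host>:8500/v1/status/leader   # Consul reports its cluster leader
curl http://<host>:4646/v1/status/leader   # Nomad reports its cluster leader
curl http://<host>:8080/api/overview       # Traefik API overview
```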
### host_config
This directory contains the configuration files for the nodes in your compute cluster. These files are copied to the host during the host init process; you can learn more about bootstrapping a host in the [host_init readme](./host_init/README.md).
These configs should be good starting points to build off of, but you may need to make changes to meet your own requirements (security policies, storage, etc.).
At a minimum, verify the following fields match your environment (a quick check follows the list):
- `node_name` in consul.hcl
- `bind_addr` in consul.hcl
- `client_addr` in consul.hcl
- `host_volume`s in nomad.hcl
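A grep is an easy way to eyeball those values before the configs get copied out (a sketch; paths assume you are at the repo root):
```bash
grep -nE 'node_name|bind_addr|client_addr' host_config/consul.hcl
grep -n 'host_volume' host_config/nomad.hcl
```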
### utils
Rather than editing the configuration files on the remote host, write your changes to the local files and use the playbooks in the `utils` directory to copy them to the remote host and restart the services with the new configuration. This is useful for changing configuration without re-running the whole host init process or connecting to the host and editing files by hand.
To run the utils you will need [Ansible](https://docs.ansible.com/ansible/latest/installation_guide/installation_distros.html) installed. The generic invocation is:
```bash
ansible-playbook -i root@<ipaddress>, ./utils/update<service>Config.yml
```
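For example, to push an updated Consul config (the playbook name here just instantiates the `update<service>Config.yml` pattern above, and the IP is a placeholder):
```bash
ansible-playbook -i root@192.0.2.10, ./utils/updateConsulConfig.yml
```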
Be aware that an error in a config file can cause the service to fail to start, in which case you may need to connect to the host, check the log files, and fix the issue.
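A minimal recovery sketch, assuming OpenRC and the default busybox syslog location on Alpine:
```bash
ssh root@<host>
rc-service consul status        # or nomad / traefik / docker
tail -n 50 /var/log/messages    # busybox syslogd writes here by default
```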
### Nomad Jobs
This is where the nomad job specs are stored. You can learn more about the job specs in the [nomad_jobs readme](./nomad_jobs/README.md).

host_init/0-all.yml Normal file (14 lines)

@ -0,0 +1,14 @@
- name: ZFS Playbook
ansible.builtin.import_playbook: ./1-zfs.yml
- name: Docker Install
ansible.builtin.import_playbook: ./2-docker.yml
- name: Consul Install and Config Transfer
ansible.builtin.import_playbook: ./3-consul.yml
- name: Nomad Install and Config Transfer
ansible.builtin.import_playbook: ./4-nomad.yml
- name: Traefik Install and Config Transfer
ansible.builtin.import_playbook: ./5-traefik.yml

host_init/1-zfs.yml Normal file (16 lines)

@ -0,0 +1,16 @@
- name: Install ZFS on Alpine Linux
hosts: all
tasks:
- name: Install ZFS packages
community.general.apk:
name: "{{ item }}"
state: present
with_items:
- zfs
- name: Load ZFS kernel module
community.general.modprobe:
name: zfs
state: present
persistent: present

host_init/2-docker.yml Normal file (24 lines)

@ -0,0 +1,24 @@
- name: Install Docker on Alpine Linux
hosts: all
tasks:
- name: Enable community packages
ansible.builtin.lineinfile:
path: /etc/apk/repositories
regexp: '^#http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
line: 'http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
state: present
- name: Update apk packages
community.general.apk:
update_cache: true
- name: Install docker with apk
community.general.apk:
name: docker
- name: Start docker service
ansible.builtin.service:
name: docker
state: started
enabled: true

host_init/3-consul.yml Normal file (35 lines)

@ -0,0 +1,35 @@
- name: Install Consul on Alpine Linux
hosts: all
tasks:
- name: Enable community packages
ansible.builtin.lineinfile:
path: /etc/apk/repositories
regexp: '^#http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
line: 'http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
state: present
- name: Update apk packages
community.general.apk:
update_cache: true
- name: Install consul with apk
community.general.apk:
name: consul
- name: Remove default consul config
ansible.builtin.file:
path: /etc/consul/server.json
state: absent
- name: Copy consul config to host
ansible.builtin.copy:
mode: preserve
src: ../host_config/consul.hcl
dest: /etc/consul/server.hcl
- name: Start consul service
ansible.builtin.service:
name: consul
state: started
enabled: true

host_init/4-nomad.yml Normal file (30 lines)

@ -0,0 +1,30 @@
- name: Install Nomad on Alpine Linux
hosts: all
tasks:
- name: Enable community packages
ansible.builtin.lineinfile:
path: /etc/apk/repositories
regexp: '^#http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
line: 'http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
state: present
- name: Update apk packages
community.general.apk:
update_cache: true
- name: Install nomad with apk
community.general.apk:
name: nomad
- name: Copy nomad config to host
ansible.builtin.copy:
mode: preserve
src: ../host_config/nomad.hcl
dest: /etc/nomad.d/server.hcl
- name: Start nomad service
ansible.builtin.service:
name: nomad
state: started
enabled: true

host_init/5-traefik.yml Normal file (30 lines)

@ -0,0 +1,30 @@
- name: Install Traefik on Alpine Linux
hosts: all
tasks:
- name: Enable community packages
ansible.builtin.lineinfile:
path: /etc/apk/repositories
regexp: '^#http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
line: 'http://dl-cdn.alpinelinux.org/alpine/v3.18/community'
state: present
- name: Update apk packages
community.general.apk:
update_cache: true
- name: Install traefik with apk
community.general.apk:
name: traefik
- name: Copy traefik config to host
ansible.builtin.copy:
mode: preserve
src: ../host_config/traefik.yml
dest: /etc/traefik/traefik.yaml
- name: Start traefik service
ansible.builtin.service:
name: traefik
state: started
enabled: true

host_init/README.md Normal file (118 lines)

@ -0,0 +1,118 @@
# Host Initialization
This directory contains the Ansible playbooks used to provision a new host. They install, configure, and start the following:
- ZFS
- Docker
- Consul
- Nomad
- Traefik
Configuration files are stored in the `../host_config` directory. After the initial `host_init` playbooks here have run, the playbooks in `../utils` can transfer those config files again (and with them any changes you have made) and restart the services.
## Alpine Setup
### Downloading and Booting Alpine Linux
1. Download the latest Alpine Linux ISO from [here](https://alpinelinux.org/downloads/).
2. Burn the ISO to a USB drive using [Etcher](https://www.balena.io/etcher/).
3. Boot the USB drive on the host you want to install Alpine Linux on.
If you need more guidance, check out the [Alpine Linux User Handbook](https://docs.alpinelinux.org/user-handbook/0.1a/Installing/medium.html#_downloading) or the [Alpine Linux Installation Guide](https://wiki.alpinelinux.org/wiki/Installation).
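If you'd rather skip Etcher, `dd` can write the ISO directly (a sketch: the ISO path and `/dev/sdX` are placeholders, and this erases the target drive, so double-check the device with `lsblk` first):
```bash
dd if=path/to/alpine.iso of=/dev/sdX bs=4M status=progress && sync
```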
### Installing and Configuring Alpine Linux
#### Initial Setup
1. After booting Alpine Linux, log in as root (no password) and run the following command:
```bash
setup-alpine
```
2. Follow the prompts to configure the system. When asked "Which ssh server?" accept the default of `openssh`.
3. When asked "Allow root ssh login?" respond with `yes` instead of the default `prohibit-password`. We will switch back to `prohibit-password` shortly, but password login is needed first to copy our SSH key to the system.
#### Configure SSH
4. If you do not have an SSH key pair already, generate one on the local machine you will be connecting *from*:
```bash
ssh-keygen
```
> The `.pub` file is the public key and ends up on the remote machine. The other file is the private key and stays on the local machine you are connecting from.
5. Copy the public SSH key to the remote host (run from your local machine; you will be prompted for the remote host's password):
```bash
ssh-copy-id -i ~/.ssh/id_rsa.pub <user>@<host>
```
> Where user is the user on the remote host and host is the remote host's IP address or domain name.
#### Secure System
6. SSH into the remote host:
```bash
ssh root@<host>
```
7. Restrict remote host SSH access:
```bash
vi /etc/ssh/sshd_config
```
8. Set the following configuration option (or use the `sed` one-liner below):
```bash
PermitRootLogin prohibit-password
```
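If you prefer a one-liner over `vi`, `sed` can make the same edit (a sketch that assumes the file currently reads `PermitRootLogin yes` from the setup step):
```bash
sed -i 's/^PermitRootLogin yes/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
```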
9. Restart the SSH service:
```bash
rc-service sshd restart
```
#### Prep Ansible
10. Install Python so Ansible can run on the host:
```bash
apk add python3
```
11. Test your connection from the local machine:
```bash
ansible all -i root@<ipaddress>, -m ping
```
> The trailing `,` is not a typo: it tells Ansible to treat the CLI argument itself as the host list. `<ipaddress>` is the remote host's IP address or domain name.
12. If you got a SUCCESS, you can now run Ansible playbooks!
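If typing the host string gets old, a small inventory file works too (a sketch; the group name and IP are placeholders):
```bash
cat > inventory.ini <<'EOF'
[lab]
192.0.2.10 ansible_user=root
EOF
ansible all -i inventory.ini -m ping
```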
## Running Ansible Playbooks
You can run the playbooks in this directory one at a time and remix them as needed. Ansible also allows playbooks to be composed, so if you just want to replicate what I have, you **only** need to run `./0-all.yml`, which imports all the playbooks and configures a new host.
```bash
ansible-playbook -i root@<ipaddress>, ./0-all.yml
```
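The same form runs any single stage on its own, for example just the ZFS playbook:
```bash
ansible-playbook -i root@<ipaddress>, ./1-zfs.yml
```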
## Storage and ZFS
My host has the following drives:
| Drive | Size | Type | Purpose | Notes |
| --- | --- | --- | --- | --- |
| /dev/nvme0n1 | 16GB | Intel Optane | System | |
| /dev/nvme1n1 | 16GB | Intel Optane | unused | |
| /dev/sda | 512GB | SSD | ZFS Pool | mirrored with /dev/sdb |
| /dev/sdb | 512GB | SSD | ZFS Pool | mirrored with /dev/sda |
| /dev/sdc | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdd | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sde | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdf | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdg | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdh | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
The first Optane drive houses the OS and everything installed through Ansible. The second Optane drive is unused at the moment but may become a SLOG or L2ARC depending on the bottlenecks identified. The SSDs are mirrored and intended for applications with high IOPS or fast read/write requirements. The HDDs are in a raidz2 configuration and intended for binary data storage in a more traditional 'NAS' setting; raidz2 means up to two drives can fail before data becomes irrecoverable.
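For reference, pools matching that layout could be created roughly like so (a sketch: the `ssd` pool name also appears in [zfs.md](./zfs.md), while `tank` and the device names are placeholders; verify devices with `lsblk` before running anything):
```bash
zpool create ssd mirror /dev/sda /dev/sdb                              # mirrored SSD pool
zpool create tank raidz2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh  # 6-disk raidz2
```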
### Nomad Config
While Ceph or SeaweedFS allow for dynamic storage provisioning, that overhead does not make sense when a single host provides all the storage, and plain Nomad host volumes don't require running any additional services. Instead, in the Nomad host config, add the volumes you want along with their respective paths on the host. You can update this at any time and run the `../utils/nomad_config.yml` playbook to update the config on the host, making the volumes available to the scheduler.
Because ZFS is used and storage needs cannot be predicted for everyone, provisioning is not dynamic and must be done manually to meet your requirements. Take a look at [zfs.md](./zfs.md) for some helpful commands to get started.
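End to end, exposing a dataset to Nomad might look like this (a sketch; `ssd/postgres` is a made-up example, and the `chown 70` permission trick is the one noted in [zfs.md](./zfs.md)):
```bash
zfs create ssd/postgres      # new dataset, mounted at /ssd/postgres by default
chown 70 /ssd/postgres       # give the container user write access (see zfs.md)
# then point a host_volume entry at /ssd/postgres in ../host_config/nomad.hcl
# and push the updated config with the utils playbook
```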

host_init/zfs.md Normal file (25 lines)

@ -0,0 +1,25 @@
# ZFS Cheatsheet
| Command | Description |
| --- | --- |
| `zpool status` | Show status of zpools |
| `zpool import ssd` | Import an existing zpool named ssd |
| `zpool create ssd /dev/sda` | Create a zpool named ssd with a single disk /dev/sda |
| `zfs list` | Show some metadata of zpools |
| `zfs create ssd/mydataset` | Create a dataset named mydataset in the zpool ssd |
| `zfs set compression=lz4 ssd/mydataset` | Set compression to lz4 for the dataset mydataset in the zpool ssd |
| `zfs destroy ssd/mydataset` | Delete a dataset named mydataset in the zpool ssd |
> IMPORTANT: If you want to use a ZFS Dataset as a host volume for a nomad job, you will need to set permissions for docker to have read and write access by running `chown 70 /zpool/dataset/path` on the host.
## Notes
- ZFS is a file system and logical volume manager that is scalable, includes data integrity, and has a lot of features for managing storage.
- `zpool` is the command for managing storage pools and `zfs` is the command for managing datasets. Datasets exist within storage pools.
- ZFS has a lot of features such as compression, deduplication, snapshots, and clones (see the snapshot example below).
- ZFS is a good choice for managing storage on a single host, but it is not a distributed file system. If you need a distributed file system, look into Ceph or SeaweedFS.
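Snapshots in particular are worth knowing; a few commands using the `ssd/mydataset` example from the table above:
```bash
zfs snapshot ssd/mydataset@before-upgrade   # point-in-time snapshot
zfs list -t snapshot                        # list existing snapshots
zfs rollback ssd/mydataset@before-upgrade   # roll the dataset back to it
zfs destroy ssd/mydataset@before-upgrade    # remove the snapshot
```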