# Host Initialization
This directory contains the Ansible playbooks used to provision a new host. The playbooks install and load configuration files for the following:
- ZFS
- Docker
- Consul
- Nomad
- Traefik

Configuration files are stored in the `../host_config` directory. After the initial `host_init` playbooks here have been run, the playbooks in the `../utils` directory can transfer those config files (and by extension any changes you make) to the host and restart the affected services.
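For example, a later config update might look like the sketch below. This is only a sketch: `nomad_config.yml` is the one utils playbook named later in this README (other services likely have similar playbooks, but check `../utils` for the actual names), and the inventory syntax is explained in the "Prep Ansible" steps further down.
```bash
# Hedged example of a later update: edit a config file under ../host_config,
# then push it and restart the service with the matching playbook in ../utils.
# nomad_config.yml is the only playbook name confirmed in this README.
ansible-playbook -i root@<ipaddress>, ../utils/nomad_config.yml
```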
## Alpine Setup
### Downloading and Booting Alpine Linux
1. Download the latest Alpine Linux ISO from [here](https://alpinelinux.org/downloads/).
2. Burn the ISO to a USB drive using [Etcher](https://www.balena.io/etcher/).
3. Boot the USB drive on the host you want to install Alpine Linux on.
If you need more guidance, check out the [Alpine Linux User Handbook](https://docs.alpinelinux.org/user-handbook/0.1a/Installing/medium.html#_downloading) or the [Alpine Linux Installation Guide](https://wiki.alpinelinux.org/wiki/Installation).
### Installing and Configuring Alpine Linux
#### Initial Setup
1. After booting Alpine Linux, log in as root (no password) and run the following command:
```bash
setup-alpine
```
2. Follow the prompts to configure the system. When asked "Which ssh server?" accept the default of `openssh`.
3. When asked "Allow root ssh login?" respond with `yes` instead of the default `prohibit-password`. We will switch this back to `prohibit-password` later, but root password login is needed for now to copy our SSH key to the system.
#### Configure SSH
4. If you do not have an ssh key pair already, generate a key pair on the local machine you will be connecting *from*.
```bash
ssh-keygen
```
> The `.pub` file is the public key and ends up on the remote machine. The other file is the private key and stays on the local machine you are connecting from.
5. Copy your public SSH key to the remote host (run from the local machine; you will be prompted for the remote host's password)
```bash
ssh-copy-id -i ~/.ssh/id_rsa.pub <user>@<host>
```
> Where `<user>` is the user on the remote host (here, `root`) and `<host>` is the remote host's IP address or domain name.
#### Secure System
6. SSH into the remote host
```bash
ssh root@<host>
```
7. Restrict remote host SSH access by editing the SSH daemon config
```bash
vi /etc/ssh/sshd_config
```
8. Set the following configuration option:
```bash
PermitRootLogin prohibit-password
```
9. Restart the SSH service
```bash
rc-service sshd restart
```
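As an optional sanity check (not part of the original steps), you can confirm the new policy from your local machine: key-based root login should still work, while password-based root login should now be refused even with the correct password.
```bash
# Key-based root login should still succeed
ssh root@<host> 'echo key auth ok'
# Forcing password auth should now be rejected with "Permission denied"
ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password root@<host>
```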
#### Prep Ansible
10. Install Python, which Ansible requires on the remote host.
```bash
apk add python3
```
11. Test your connection from the local machine
```bash
ansible all -i root@<ipaddress>, -m ping
```
> The trailing `,` is not a typo; it tells Ansible to treat the CLI argument as an inline inventory rather than a path to an inventory file. `<ipaddress>` is the remote host's IP address or domain name. An equivalent inventory-file approach is sketched below.
12. If you got a `SUCCESS` response, you can now run Ansible playbooks!
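If you would rather use a real inventory file than the trailing-comma form, a minimal equivalent looks like this (the `hosts.ini` filename is just an example):
```bash
# Minimal static inventory equivalent to the trailing-comma form above;
# the hosts.ini filename is arbitrary.
cat > hosts.ini <<'EOF'
<ipaddress> ansible_user=root
EOF
ansible all -i hosts.ini -m ping
```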
## Running Ansible Playbooks
You can run the playbooks in this directory one at a time and remix them as needed (a sketch follows the command below). Ansible also allows playbooks to be composed, so if you just want to replicate my setup, you **ONLY** need to run `./0-all.yml` to run all the playbooks and configure a new host.
```bash
ansible-playbook -i root@<ipaddress>, ./0-all.yml
```
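To run one playbook at a time instead, point the same command at an individual playbook; `<playbook>.yml` below is a placeholder, so check this directory for the actual filenames.
```bash
# Same inventory argument, but targeting a single playbook from this directory;
# <playbook>.yml is a placeholder, not an actual filename.
ansible-playbook -i root@<ipaddress>, ./<playbook>.yml
```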
## Storage and ZFS
My host has the following drives:

| Drive | Size | Type | Purpose | Notes |
| --- | --- | --- | --- | --- |
| /dev/nvme0n1 | 16GB | Intel Optane | System | |
| /dev/nvme1n1 | 16GB | Intel Optane | unused | |
| /dev/sda | 512GB | SSD | ZFS Pool | mirror with /dev/sdb |
| /dev/sdb | 512GB | SSD | ZFS Pool | mirror with /dev/sda |
| /dev/sdc | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdd | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sde | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdf | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdg | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |
| /dev/sdh | 4TB | HDD | ZFS Pool | raidz2 with drives c-h |

The first Optane drive houses the OS and the software installed through Ansible. The second Optane drive is unused at the moment but may be used as a SLOG or L2ARC depending on the bottlenecks identified. The SSDs are mirrored and intended for applications with high IOPS or faster read/write requirements. The HDDs are in a raidz2 configuration and intended for bulk binary data storage in a more traditional 'NAS' setting; raidz2 means up to two drives can fail before data becomes irrecoverable.
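For reference, creating pools that match the layout described above might look roughly like the sketch below. The pool names `fast` and `tank` are assumptions (they are not used elsewhere in this repo), and [zfs.md](./zfs.md) is the better reference for the commands actually used.
```bash
# Rough sketch of the described layout; pool names are assumptions.
# Using /dev/sdX names to match the table above; /dev/disk/by-id paths are
# generally more robust for real pools.
zpool create fast mirror /dev/sda /dev/sdb                                      # mirrored SSDs
zpool create tank raidz2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh  # raidz2 HDDs
```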
### Nomad Config
While Ceph or SeaweedFS allow for dynamic storage provisioning, the overhead does not make sense when a single host is being used for storage. Additionally, standard Nomad host volumes don't require any extra services to be run. Instead, in the Nomad host config, be sure to add the host volumes you want and their respective paths on the host (a rough sketch follows). You can update this at any time and run the `../utils/nomad_config.yml` playbook to update the config on the host, making those volumes available to the scheduler.
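A rough sketch of such a stanza is printed below; the volume name and mount path are assumptions, and the stanza belongs inside the `client` block of the Nomad config kept under `../host_config`.
```bash
# Rough sketch (assumed names/paths): a host_volume stanza for the client { }
# block of the Nomad config under ../host_config. Push the edited config with
# the ../utils/nomad_config.yml playbook.
cat <<'EOF'
client {
  host_volume "tank-media" {
    path      = "/tank/media"   # e.g. a ZFS dataset mountpoint on the host
    read_only = false
  }
}
EOF
```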
Because ZFS is used and storage needs cannot be predicted for everyone, provisioning is not dynamic and must be done manually to meet your system requirements. You can take a look at [zfs.md](./zfs.md) for some helpful commands to get started.
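As a starting point, manually provisioning a dataset to back one of those host volumes might look like this (pool, dataset, and property choices are assumptions; [zfs.md](./zfs.md) has more):
```bash
# Hedged sketch: carve a dataset out of an existing pool to back a Nomad host
# volume. Pool/dataset names and the compression choice are assumptions.
zfs create tank/media              # mounts at /tank/media by default
zfs set compression=lz4 tank/media
```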