There’s so much more to creating a website than just writing content. For starters, how are others going to see your beautiful new website?

Well, to show your hard work off to others, you need to host it somehow.

But what does it mean to “host”? And how do you host a website?

These were questions I was asking myself at the start of this project. And so this post will explore hosting, and how I was able to self-host this blog on a Raspberry Pi.

Hosting Defined#

Let’s start at the beginning - what is hosting?

Put simply, web hosting is the process of making your site available to others on the internet.

Visitors access websites through a web address typed into their browser - the web address being the name of the website.

Computers don’t understand this web address though, so it gets translated into an IP (Internet Protocol) address.
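
You can see this translation for yourself with a DNS lookup tool like dig:

dig +short thecodechameleon.io   # prints the IP address the name currently resolves to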

Once the browser has the IP address, it opens a connection with the web server using the address. When the connection is established, the browser requests the site, which the server sends.

So hosting is the process of setting up a domain name, mapping it to an IP address, and configuring a web server to serve content.

Web Servers#

To send our website to a visitor’s browser (to serve the request), our site needs to live on a web server.

A web server accepts requests for web content such as a page, an image or a video, via HTTP (or the more secure HTTPS variant), and responds with the content or an error message.

So to serve a site from the Pi, I needed to install a web server.

I wanted a web server that is efficient, both in terms of memory and power consumption.

A Supercharged Engine#

There are lots of web servers to choose from. You can even write your own if you’d like.

For this project, I chose nginx (pronounced “engine x”), an HTTP web server with many bells and whistles.

As well as being a web server, it can be used as a reverse proxy, content cache, load balancer, TCP/UDP proxy server, and a mail proxy server.

Nginx is very efficient, able to handle more than 10,000 simultaneous connections with a low memory footprint - ideal for the limited resources available on the Pi!

Installation#

Installing on the Pi was easy:

sudo apt update -y
sudo apt upgrade -y
sudo apt install nginx -y

Installing on other distros is straightforward too, as the nginx installation is well documented.

Once installed, I enabled the nginx service so that Nginx starts automatically on boot:

sudo systemctl enable nginx
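
A quick way to confirm nginx is up is to check the service status and request the default page locally:

sudo systemctl status nginx   # should report the service as active (running)
curl http://localhost         # should return nginx's default welcome page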

One Boss, Lots of Workers#

Let’s explore a little how nginx works.

Nginx is designed as a single master process and several worker processes.

The master process reads the configuration file and maintains the worker processes. Worker processes do the actual processing of requests.

The number of worker processes is defined in the config file and may be fixed or adjusted automatically to the number of available CPU cores.
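
For example, the directive in /etc/nginx/nginx.conf looks something like this; auto matches the worker count to the number of CPU cores:

# in /etc/nginx/nginx.conf
worker_processes auto;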

If nginx is already running and the configuration file is changed, the change will not be visible immediately. For it to take effect, either nginx must be restarted, or the command to reload the configuration (nginx -s reload) must be used.
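
It’s worth checking the configuration for errors before reloading, which nginx supports directly:

sudo nginx -t          # test the configuration for syntax errors
sudo nginx -s reload   # tell the master process to re-read it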

As I realised, this is different to changing the content nginx serves (in other words, changing the website).

When content is changed, nginx will automatically start serving the new content to visitors, so there’s no need for a restart.

“Your content, sir”#

To serve content, you need to put your files in a specific directory, and then point nginx to that directory in the config file.

By default, nginx expects content files in the /var/www/html directory, but as I say, you can change this.

The main configuration lives in /etc/nginx/nginx.conf, while on Debian-based systems the per-site config files are usually kept in /etc/nginx/sites-available/.
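
As a rough sketch (the file name, domain, and content path here are just examples), a minimal server block pointing nginx at a directory of static files looks something like this:

# /etc/nginx/sites-available/blog (example file name)
server {
    listen 80;
    server_name example.com;

    root /var/www/blog;   # directory holding the static site
    index index.html;
}

On Debian-based systems the site is then enabled by symlinking it into /etc/nginx/sites-enabled/ and reloading nginx:

sudo ln -s /etc/nginx/sites-available/blog /etc/nginx/sites-enabled/
sudo nginx -t && sudo nginx -s reload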

My Setup#

I have two web servers set up: one for staging and one for the live site.

My staging web server is only accessible from within my home network. I use it to preview new content and layout changes I’ve made to the blog. It’s super handy because I can see changes on PCs and handheld devices.

If I’m happy with the changes, I can push them to the live web server, which is exposed to the internet.

The live web server is a clone of the staging web server, with the exception that the live server listens on port 443 for HTTPS, which is the secure version of HTTP.

I’ll go into more detail on this when I dive into automation in a later post.

Getting Online#

A web server isn’t much use if the content it serves can’t be accessed by others.

If you’re as enthusiastic (or as crazy) as me and decide to self-host, there are a few key steps to getting your web server online.

To identify these steps, let’s understand the journey taken when a visitor views our website:

  1. A visitor searches for the blog using the domain.
  2. The domain is translated into an IP address.
  3. The request is sent to the network the nginx web server is on.
  4. The request is forwarded from within the network to the web server.
  5. The web server returns the static site.

Getting A Domain#

As I said way back at the beginning of this project, one of the first steps to getting online is purchasing a domain.

A web domain is a human-readable address used to access your site on the internet.

You can purchase a domain from a domain registrar, but before doing so you’ll need to come up with a domain name.

Once you have a name for your site, the registrar will check its availability.

A common gotcha is searching for a name under a popular top-level domain like .com or .co.uk. These are extremely popular, so most good names are already taken.

I fell foul of this when obtaining the domain for this site - I wanted it to be thecodechameleon.com, but as that was already taken, it became thecodechameleon.io.

Making A Connection#

Web domains are only useful if they actually send you to the content you expect. So this was the next thing I had to consider - how to send people to this blog when they search for it?

As I said earlier, computers don’t understand the web addresses we type into the browser. They must be translated into a sequence of numbers called an IP address.

No normal person searches for a website using its IP address, so I had to think about how the web domain would be translated to the IP address of the nginx web server running on the Raspberry Pi.

Since the Pi is on my home network, I had to work out a way of exposing it to the internet. This can be risky without taking precautions.

By default, home routers don’t expose devices on their network to the internet. Instead the router acts as a kind of firewall, only allowing specific traffic in and out.

The Pi’s nginx server listens on port 443 (for HTTPS), so in order to access the server from the internet, I had to expose this port.

To do that, let’s discuss port forwarding.

Port Forwarding#

Port forwarding is a network technique that allows external devices to access services on a private network. Since the router acts as a firewall, we need to let incoming requests on port 443 (HTTPS requests) through to the Pi.

Forwarding too many ports creates a large attack surface since they are visible on the internet. Only forward the port(s) you need!

You can set up port forwarding on the admin page of your router.

Something else to consider is that devices on an internal network likely have a dynamic IP address. This means port forwarding rules will break over time as the IP address of the target device changes.

To prevent that, I set the Raspberry Pi to have a static IP address, which you can also do in the admin page of your router.
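
To find the address the Pi currently has (useful when setting up that reservation), you can list its addresses from the Pi itself:

hostname -I   # prints the Pi's current IP address(es) on the local network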

Duck-Duck-DNS#

My ultimate goal was to make www.thecodechameleon.io point to an IP address, so I had to work out a way of mapping the domain to said IP address.

My research pointed me to the Domain Name System (DNS), which is a naming system for computers, services, and other resources on the internet. Think of the DNS as a mapping from domains to IP addresses.

I was able to update the DNS for my domain through the domain registrar I used. Changes to the DNS can take anything from a few minutes to a couple of days to filter through.

To make things more complicated, most ISPs (Internet Service Providers) assign a dynamic external IP address by default, so your public IP address will change from time to time.

This is a problem when trying to set up the DNS for a domain since the address is likely to change and break the link.

What I needed was an automated way of updating the DNS records mapping the domain to my IP address. In the trade this is called dynamic DNS.

To keep the links updated, I used Duck DNS, because it’s free and easy to set up (noticing a theme here?).

To get the link set up, I created a sub-domain with Duck DNS and then linked it to my IP.

To automatically update the DNS records with an ever-changing IP address, Duck DNS has a bash script to use as a cronjob.
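
The gist of it (the subdomain and token below are placeholders) is a small script that calls the Duck DNS update URL, run every few minutes from cron - something along these lines:

# ~/duckdns/duck.sh - push the current external IP to Duck DNS
echo url="https://www.duckdns.org/update?domains=<your-subdomain>&token=<your-token>&ip=" | curl -k -o ~/duckdns/duck.log -K -

# crontab entry to run it every five minutes
*/5 * * * * ~/duckdns/duck.sh >/dev/null 2>&1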

Security#

This is a big part of the hosting process and is an ongoing effort. There were a few key steps I took.

Setting Up SSL Certificates#

SSL (Secure Sockets Layer) is a cryptographic protocol designed to secure communication over the internet.

SSL encrypts data so it can’t be read while in transit, authenticates the identity of websites using digital certificates, and protects against data tampering and interception.

For hosting a website, this meant obtaining an SSL certificate, which is a way of verifying a website as legitimate.

Modern web browsers will flag a site that doesn’t have a valid SSL certificate as insecure, and many will warn visitors before displaying it.1

To get an SSL certificate, you need to go to a certificate authority.

For this blog I use certbot to automate the process of getting and renewing certificates. Certbot retrieves certificates from Let’s Encrypt, a non-profit certificate authority.

Let’s Encrypt certificates are only valid for a maximum of 90 days. This is intentional as it encourages automation.

Fortunately, certbot will take care of this for you. What’s even better is that certbot interoperates with nginx out of the box!

sudo certbot --nginx -d <domain name>
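
For reference, on Debian-based systems (including Raspberry Pi OS) the nginx plugin usually comes from the python3-certbot-nginx package, and the renewal set-up can be checked with a dry run:

sudo apt install certbot python3-certbot-nginx -y
sudo certbot renew --dry-run   # confirms automatic renewal will work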

Setting Up A Firewall#

If you are hosting on your own hardware using port forwarding, you’ll want to make sure that all the ports you’re not exposing are secure. That means setting up a firewall.

To keep things simple, I went with the Uncomplicated Firewall, abbreviated to “ufw”.

Ufw lives up to its name - it’s well documented and really simple to use!

Since the nginx server is listening on port 443, and this port is being forwarded by the router, I had to allow traffic for this port through on the Pi:

sudo ufw allow https

And to disable all other traffic:

sudo ufw default deny incoming
sudo ufw default deny outgoing

This denies all other incoming and outgoing traffic to and from the web server.

Since my web server is running on the Raspberry Pi and I manage and monitor the Pi through SSH, these firewall rules were too restrictive, as they also block SSH access. So I re-enabled SSH access 2:

sudo ufw allow ssh
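
With SSH allowed, the firewall can be switched on and the rules reviewed:

sudo ufw enable
sudo ufw status verbose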

I mentioned that I have a live site and a staging site. To keep the staging site secure, I restricted access to it to a specific IP address on the network:

sudo ufw allow from <IP_ADDRESS> to any port <PORT> proto tcp

Fail2ban#

Another precaution I took was installing fail2ban, which is used to prevent brute-force attacks and bot scans.

Fail2ban works by scanning log files and banning IP addresses that match patterns defined in its config files, with each ban lasting a configurable amount of time.
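
Installing it is a one-liner:

sudo apt install fail2ban -y

The SSH jail can then be tuned with a small override file (the values below are just examples) and the service restarted:

# /etc/fail2ban/jail.local
[sshd]
enabled = true
# ban an address after 5 failed attempts, for an hour (3600 seconds)
maxretry = 5
bantime = 3600

sudo systemctl restart fail2ban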

SSH Hardening#

Finally, another protection I’ve taken (perhaps bordering on paranoia) is hardening SSH access to the Raspberry Pi.

As mentioned earlier, I monitor and administer this site remotely over SSH.

In order to harden SSH connections, I’ve disabled password authentication, which can be brute-forced.

Instead, I’ve configured the Pi to only allow public-key authentication. I’ve added the public keys from a few trusted devices so I can log in to the Pi from them. Apart from being more secure, this also speeds up login!
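
The relevant settings live in the SSH daemon config. A sketch of the approach (the hostname is a placeholder, and keys should be copied over before password logins are switched off):

# run from a trusted device: copy its public key to the Pi
ssh-copy-id pi@<pi-hostname>

# /etc/ssh/sshd_config on the Pi
PasswordAuthentication no
PubkeyAuthentication yes

# apply the change
sudo systemctl restart ssh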

Conclusion#

And that’s it! After following those steps I had a fully functioning website, hosted and maintained entirely by me!

I’ll never forget the first time I typed in www.thecodechameleon.io to the browser after setting everything up. There was a brief pause as the browser fetched the site, and then - boom! - I landed on the homepage.

It was surreal. I was so glad that all of these moving parts were working together.

With everything working, I began writing new posts for the blog.

As I became more familiar with the writing and publishing process, I started to settle into a development workflow, and once that was established, it was time to start automating…


  1. You can use a self-signed certificate, but most browsers will still flag the site as insecure. ↩︎

  2. It’s a good idea to change the default port used for SSH (port 22) to harden against port scanning. This can be done in the SSH daemon config. The custom port will also need to be allowed in ufw. ↩︎