Architecture AWS (Old)

PLEASE NOTE : This AWS architecture is how StakeyRice used to be hosted. I have since switched to a home setup, so please refer to the current Architecture page. I kept this page because it might be useful to others who want to host on AWS.

Stakey Rice uses AWS for its infrastructure, built on VPC, EC2, and the security features that come with them. This website is also hosted on AWS Lightsail. Below is a diagram of our AWS setup :

General Definition

Bastion Node : The only node I can SSH into directly. Only after logging in to this Bastion host can I access the other nodes. It is secured by IP restriction, SSH certificates, 2FA, UFW, and more.

Relay Node 1 and 2 : These two nodes are the public-facing nodes that talk to other relay nodes in the Cardano network. They are hosted in different Availability Zones for redundancy.

Block Producing Node : This is the node with the highest security, because if this node is compromised, your keys and certificates are compromised. It only talks to Relay Node 1 and 2, and nothing else.

Internet Gateway : Only the nodes in the Public Subnet and the Management Subnet have access to the Internet Gateway to talk to the Internet.

Air Gap Machine : This is an Ubuntu machine at home that was connected to the Internet only during the setup phase. Before I put any Cardano CLI on it for key management, it was taken off the Internet. It is not even connected to my home network; it is “Air Gap”. I use a USB stick to transfer files between this machine and the nodes.
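
The roles above boil down to a short allow-list of who may talk to whom. Here is a minimal sketch of that list in Python. The node names (`admin`, `relay1`, `block_producer`, and so on) and the `ssh` / `cardano` labels are hypothetical, chosen only for illustration, and the Bastion's SSH access is treated as a separate management channel from the Cardano node traffic.

```python
# Hypothetical allow-list for the topology described above.
# Node names and the "ssh"/"cardano" labels are illustrative assumptions.
ALLOWED_LINKS = {
    ("admin", "bastion", "ssh"),
    ("bastion", "relay1", "ssh"),
    ("bastion", "relay2", "ssh"),
    ("bastion", "block_producer", "ssh"),
    ("relay1", "internet", "cardano"),
    ("relay2", "internet", "cardano"),
    ("relay1", "block_producer", "cardano"),
    ("relay2", "block_producer", "cardano"),
}


def peers(node, kind):
    """Return every node that `node` shares a link of type `kind` with."""
    found = set()
    for a, b, link_kind in ALLOWED_LINKS:
        if link_kind != kind:
            continue
        if a == node:
            found.add(b)
        elif b == node:
            found.add(a)
    return found


print(sorted(peers("block_producer", "cardano")))
print(sorted(peers("admin", "ssh")))
```

Writing the links down this way makes it easy to check that the Block Producing Node has no direct link to the Internet, and that the Bastion is the only way in for SSH.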

Technical Details

Bastion Node
EC2 t3.micro, 2 vCPU, 1 GB RAM, 8 GB SSD (gp2), Ubuntu Server 20.04.2 LTS
This node only needs basic Ubuntu functionality; as long as it is up and running, you don’t need anything else. It might run on a t3.nano (0.5 GB RAM), but since the price difference between t3.nano and t3.micro is only a few dollars a month, I went with t3.micro. At the time I created this machine, Availability Zone D (Canada) did not offer t3a instances, so I just picked t3. (t3 is Intel based, t3a is AMD based, and t3a is cheaper.)

Relay Nodes and Block Producing Node
EC2 t3a.large, 2 vCPU, 8 GB RAM, 40 GB SSD (gp2), Ubuntu Server 20.04.2 LTS
I actually started with t3.medium, but the node requirement has since changed from 4 GB RAM to 8 GB RAM, so I moved to a t3a.large instance instead. I switched from t3 to t3a (Intel to AMD) because of the lower cost (approximately 10% difference) at similar performance.

Air Gap Machine
Spare PC at home with Ubuntu Server 20.04.2 LTS.
The Air Gap machine is the one that generates all your keys and certificates. All it needs is a copy of your current Cardano binaries, so any PC or Raspberry Pi is more than enough to handle it. I had a spare PC at home, and I wiped it and installed Ubuntu Server 20.04.2 LTS so that it runs the same OS as the Relay Nodes. You can update the OS and the Cardano binaries offline by transferring files with a USB stick; there is no need to connect to the Internet at all.

Security

It took me a while to set up this pool, and I spent more time on security than on the Cardano node setup itself. If I set up the Cardano nodes wrong, they might just not work, but if I set up the security wrong, the whole setup could be compromised. Everything is set up with zero trust, and ports and services are opened only when required.

Network ACL : I have 3 Network ACLs, one for each subnet. They protect each subnet not only from the Internet, but also from the other subnets. Network ACLs protect at the subnet level and are stateless, so return traffic must be allowed explicitly.

Security Groups : I have 3 Security Groups, one assigned to the instances in each subnet. Security Groups provide security at the instance level and are stateful, so replies to allowed connections pass automatically.
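
To make the stateless versus stateful difference concrete, here is a simplified Python sketch. It is only a toy model, not how AWS implements either feature. The port number and the private IP are assumptions for illustration, and real Network ACL rules match on CIDR ranges and port ranges rather than on a list of peers.

```python
class StatelessFilter:
    """Toy model of a Network ACL: each direction is checked against its own rules."""

    def __init__(self, inbound_ports, outbound_peers):
        self.inbound_ports = set(inbound_ports)
        self.outbound_peers = set(outbound_peers)

    def allow_inbound(self, src_ip, dst_port):
        return dst_port in self.inbound_ports

    def allow_reply(self, dst_ip):
        # No memory of earlier packets: the reply needs its own outbound rule.
        return dst_ip in self.outbound_peers


class StatefulFilter:
    """Toy model of a Security Group: replies to accepted connections pass."""

    def __init__(self, inbound_ports):
        self.inbound_ports = set(inbound_ports)
        self.tracked = set()

    def allow_inbound(self, src_ip, dst_port):
        if dst_port in self.inbound_ports:
            self.tracked.add(src_ip)  # remember the accepted connection
            return True
        return False

    def allow_reply(self, dst_ip):
        return dst_ip in self.tracked


NODE_PORT = 3001    # assumed port, for illustration only
PEER = "10.0.1.10"  # hypothetical private IP

acl = StatelessFilter(inbound_ports=[NODE_PORT], outbound_peers=[])
sg = StatefulFilter(inbound_ports=[NODE_PORT])

print(acl.allow_inbound(PEER, NODE_PORT), acl.allow_reply(PEER))
print(sg.allow_inbound(PEER, NODE_PORT), sg.allow_reply(PEER))
```

With only an inbound rule, the stateless filter accepts the incoming packet but drops the reply, while the stateful filter lets the reply through. This is why each Network ACL needs matching rules in both directions.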

UFW (Uncomplicated Firewall) : Every instance, including the Air Gap machine, uses UFW. All of them start with “deny all incoming”, and ports are opened only as needed. Rules are as restrictive as possible. For example, if a connection is needed only between a Relay node and the Block Producing node, I put the node’s static IP into the UFW rule instead of the whole subnet.
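
As a rough illustration of the “one static IP per rule” idea, the Python sketch below builds the UFW commands for a node from a list of allowed peers. The IP addresses and the port are hypothetical placeholders, not values from this pool, and the script only prints the commands rather than running them.

```python
import ipaddress


def ufw_commands(allowed):
    """Build UFW commands from (source_ip, port) pairs, one host per rule."""
    commands = ["ufw default deny incoming"]
    for source_ip, port in allowed:
        # ip_address() rejects CIDR ranges such as "10.0.2.0/24",
        # so a whole subnet can never slip into a rule by accident.
        host = ipaddress.ip_address(source_ip)
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
        commands.append(f"ufw allow from {host} to any port {port} proto tcp")
    return commands


# Hypothetical relay IPs and node port, for illustration only.
for line in ufw_commands([("10.0.1.10", 3001), ("10.0.1.11", 3001)]):
    print(line)
```

Generating the rules from a list keeps them easy to review, and the address check means a typo like a subnet mask fails loudly instead of quietly widening the rule.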

SSH Security : For remote management I need to SSH into the Bastion host first, and from there I can jump to the other nodes. SSH is secured with SSH certificates, restricted to the IP I connect from, protected with 2-factor authentication, and so on.
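
One common way to express “Bastion first, then the other nodes” on the client side is the OpenSSH `ProxyJump` option. The Python sketch below renders such a client config. It is an assumption about how this could be done, not the configuration actually used for this pool, and the host aliases, user name, and IP addresses are all placeholders. It does not cover the certificates, IP restriction, or 2FA mentioned above.

```python
def ssh_config(bastion_ip, user, internal_hosts):
    """Render an OpenSSH client config that routes internal hosts via the bastion."""
    lines = [
        "Host bastion",
        f"    HostName {bastion_ip}",
        f"    User {user}",
    ]
    for alias, private_ip in internal_hosts.items():
        lines += [
            "",
            f"Host {alias}",
            f"    HostName {private_ip}",
            f"    User {user}",
            "    ProxyJump bastion",  # connect through the bastion host
        ]
    return "\n".join(lines) + "\n"


# Placeholder addresses: 203.0.113.10 is from a documentation-only range.
print(ssh_config("203.0.113.10", "ubuntu", {"relay1": "10.0.1.10", "bp": "10.0.3.10"}))
```

With a config like this in `~/.ssh/config`, `ssh relay1` would open the connection through the Bastion in one step, while the internal nodes stay unreachable from the Internet.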

Encryption : The Air Gap machine’s disk and every USB stick used for file transfers are encrypted. If I lose the Air Gap machine or a USB stick, its contents will be useless to anyone else.

The above are well-known security measures you can take to secure your nodes. There are other measures in place as well, but I will not be sharing everything here. Security is never finished, so I will keep looking for further ways to secure this pool.

Future Roadmap

Here are some of the features I will implement :

  • Backup Block Producing Node in Availability Zone B (quick recovery in case of BP node failure, and zone redundancy)
  • Publish node statistics using Grafana (currently used for private monitoring only)
  • Automation to act on node failures.