As we approach ICONSENSUS, we are excited to be a part of the ICON network's decentralization! This comes with much anticipation, as well as challenges from the unknowns. To better prepare for what lies ahead, I am starting a series of articles that share my thoughts on designing resilient nodes for ICON. Resilient nodes will be key to a successful decentralized ICON network, and I think the earlier we discuss different options for making our nodes more resilient, the better off we will be. I look forward to the discussions as we improve together! I will be using Amazon Web Services (AWS) as the example cloud service provider for this study.
On Sunday, June 2, 2019, Google had a severe network outage, resulting in services and applications losing connectivity. If a powerhouse such as Google can go down, then we must be ready as well! “There are only two types of companies: Those that have been hacked and those that will be hacked.” (Robert Mueller, former Director of the FBI.) I think this quote is accurate: no one can design a system that is 100% secure both now and into the future. We must prepare for the threats we know about today and aim to be resilient to the ones we cannot foresee. In preparing a secure system, it is important to use disciplined systems engineering principles, take a layered security approach, and provide for graceful failure and restart.
System failures are not always nefarious. We often think of systems going down only because of hacks, but they can fail for other reasons, such as honest bugs in software. They can also fail because they were under-provisioned for the required workloads. Real-world weather can also damage infrastructure and cause the system to malfunction. All of these must be considered when designing a secure system.
We want to design our ICON node to have high availability and dependability in order to handle the varying levels of requests from the ICON network. We will design our system with a focus on Confidentiality, Integrity, and Availability. Confidentiality refers to protecting information from unauthorized parties. This includes the data in our node, as well as user credentials and anything else that would allow unauthorized access to our node. Integrity is the authenticity of information. It includes protecting information from being modified by unauthorized parties and detecting when such a modification has occurred. It is the ability to be confident that our node has not been tampered with and is functioning properly, without bugs. Availability means the information and services are accessible by authorized parties. In this case, it means that the developers can access our servers to make changes as needed and, most importantly, that the ICON network can successfully access our node.
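As a small illustration of the integrity goal, the sketch below records a SHA-256 digest of a file and later checks whether the file still matches it. The helper names are hypothetical and this is only one possible approach; a real node would also need to protect the stored digest itself from tampering.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: Path, expected_digest: str) -> bool:
    """True if the file still hashes to the digest recorded earlier."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_of(path), expected_digest)
```

The idea is to compute the digest once while the file is known to be good, store it somewhere the node cannot overwrite, and call is_unmodified on a schedule or before each restart.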
In order to encompass these principles, we will design security at every layer. AWS uses a ‘shared responsibility model,’ which means AWS is responsible for the security of the cloud infrastructure and users are responsible for the workloads they deploy on AWS. (Figure 1)
AWS data centers are built in clusters around the world. As of June 2, 2019, there are 66 Availability Zones (AZs) spread across 21 geographic regions (dynamic map available at: https://aws.amazon.com/about-aws/global-infrastructure/).
Each region is isolated from the other regions, so each one can be looked at as a completely separate cloud service provider. Each Availability Zone is physically separated from the others, but the zones within a region are interconnected. Thus, Availability Zones can be used to provide a redundant system architecture within one region, and regional resiliency can be added on top by running node setups in multiple regions. An example would be a node replicated in a second Availability Zone, with that two-node setup then replicated in a second region.
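To make the redundancy argument concrete, here is a minimal sketch that models a placement of nodes across regions and Availability Zones and checks whether at least one node survives any single zone or region outage. The node names and the particular regions and zones are illustrative assumptions, not the setup used in this series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    region: str
    zone: str


# Hypothetical placement: two nodes per region, each in its own zone.
NODES = [
    Node("node-a1", "us-east-1", "us-east-1a"),
    Node("node-a2", "us-east-1", "us-east-1b"),
    Node("node-b1", "eu-west-1", "eu-west-1a"),
    Node("node-b2", "eu-west-1", "eu-west-1b"),
]


def survivors(nodes, failed_region=None, failed_zone=None):
    """Return the nodes still running after a region or zone outage."""
    return [
        n for n in nodes
        if n.region != failed_region and n.zone != failed_zone
    ]


def tolerates_any_single_failure(nodes):
    """True if some node survives every single-zone and single-region outage."""
    zones = {n.zone for n in nodes}
    regions = {n.region for n in nodes}
    zone_ok = all(survivors(nodes, failed_zone=z) for z in zones)
    region_ok = all(survivors(nodes, failed_region=r) for r in regions)
    return zone_ok and region_ok
```

With this placement, losing one zone leaves three nodes and losing a whole region leaves two. A setup confined to one region passes the zone check but fails the region check.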
This sounds difficult, but AWS provides resources and tools to simplify and automate these processes. We will rely on AWS software-defined infrastructure, auto-scaling, and self-healing to do so.
In the examples, I will use a combination of auto-scaling groups and load balancers to balance the ICON network demands across Availability Zones, while using only a single public-facing IP address and appearing as a single node to the ICON network. The auto-scaling group will either scale out (increase in size) or scale in (decrease in size) as demand requires. The load balancer will hold the single public-facing IP address and will balance the ICON demand load across the dynamically changing auto-scaling groups (Figure 3).
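The behavior described above can be sketched in a few lines. This is a toy simulation, not AWS code: in practice Auto Scaling and Elastic Load Balancing are managed services that are configured rather than written by hand. The CPU thresholds, capacity bounds, and instance IDs are assumptions chosen only for illustration.

```python
import itertools


def desired_capacity(current, avg_cpu, minimum=2, maximum=6,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return the new instance count after one scaling decision."""
    if avg_cpu > scale_out_above:
        current += 1  # scale out: add an instance
    elif avg_cpu < scale_in_below:
        current -= 1  # scale in: remove an instance
    # Never leave the configured bounds.
    return max(minimum, min(maximum, current))


class RoundRobinBalancer:
    """Single entry point that spreads requests over the current instances."""

    def __init__(self, instances):
        self.set_instances(instances)

    def set_instances(self, instances):
        # Called whenever the group scales out or in.
        self._cycle = itertools.cycle(list(instances))

    def route(self, request):
        """Return (chosen instance, request) for the next request."""
        return next(self._cycle), request
```

Callers only ever talk to the balancer, so the set of instances behind it can grow or shrink without the ICON network seeing anything other than a single node.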
Additionally, cloud monitoring services and alarms will be set up to automate decisions and to notify the administrator when there are significant changes to the processing load.
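As a rough sketch of how such an alarm might behave, the code below raises an alarm only when a metric stays above a threshold for several consecutive periods, and notifies only when the state changes. The state names mirror the ones CloudWatch uses, but the functions, threshold, and sample values are hypothetical and simplified.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Evaluate the most recent datapoints against a threshold.

    Returns "INSUFFICIENT_DATA" if there are fewer datapoints than
    evaluation periods, "ALARM" if every one of the most recent
    datapoints exceeds the threshold, and "OK" otherwise.
    Assumes evaluation_periods is at least 1.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


def notify_if_changed(previous, current, send):
    """Call `send` only when the alarm state changes; return the current state."""
    if current != previous:
        send(f"alarm state changed: {previous} -> {current}")
    return current
```

Requiring several consecutive breaches keeps a single short spike from paging the administrator, while a sustained rise in load still triggers a notification or a scaling action.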
From an individual node standpoint, the individual instances will be hardened based on security best practices. This will include setting up firewall protection, disabling unused ports, setting up proper authentication measures, and defining fine-grained security groups using the principle of least privilege. I will also set up detective controls for logging and auditing, so that incident response teams can explore the data following an attack.
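One way to picture least privilege for security groups is a small audit that flags any inbound rule open to the whole internet on a port that does not need to be public. The rule structure below is a hypothetical stand-in for real security group rules, and the public port numbers (7100 and 9000) are assumptions that should be checked against the ICON node documentation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    port: int
    cidr: str
    description: str


# Assumption: ports that must be reachable from anywhere to serve the network.
PUBLIC_PORTS = {7100, 9000}


def overly_permissive(rules, public_ports=PUBLIC_PORTS):
    """Return rules that are open to the whole internet on a non-public port."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule.cidr)
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        open_to_world = network.prefixlen == 0
        if open_to_world and rule.port not in public_ports:
            flagged.append(rule)
    return flagged
```

For example, SSH on port 22 restricted to a single administrator address passes, while the same port open to 0.0.0.0/0 is flagged.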
Appropriate network security measures will be put in place to protect against DDoS and other network attacks at the endpoints of the system infrastructure.
Lastly, Amazon Inspector will be used to provide feedback and analysis on the security and resiliency of our setup, alongside external services such as Nessus by Tenable for an external vulnerability assessment. Once the system is secure, we will look into individual penetration testing of the system.
To recap, I am very excited to participate in ICONSENSUS and about the decentralization of the ICON network! I will look to make our ICON node resilient through a disciplined and layered approach that preserves confidentiality, integrity, and availability. We covered a lot of different areas in this post in order to provide an overview of what we intend to look at. We will present more detailed and focused articles on the areas discussed, with examples using AWS services. We look forward to future discussions and hope you enjoy our articles!