ICON Node Resilience

As ICON becomes decentralized, it will be of utmost importance to keep nodes up and running to ensure a secure and well-performing network, as well as happy voters (they don’t get their returns if the node is down). To that end, I’d like to start discussions on different ways we can improve resiliency. This thread starts with some high-level ideas to open the conversation, and then dives into lower-level details that are specific to AWS and Ubuntu nodes. This is by no means a comprehensive list, and I look forward to the discussion!

To begin, I’d like to post an overview of resiliency and why I believe it is important:

https://medium.com/iconcm/iconsensus-resiliency-part-1-introduction-25159997f01a

I posted a new part 2 focusing on system operations (Sys Ops for short) resiliency: https://medium.com/iconcm/iconsensus-resiliency-part-2-sys-ops-security-934de1c999c0?

If others could chime in on Sys Ops for other services (outside of AWS), that would be fantastic for discussion!

We have published a post on this topic: "Updates to Yellow Paper (IISS 2.0) - ICON’s Penalty System" (Steemit).

It is also important to know that there is nothing in this universe that is 100% secure.

There will always be some risk involved, and what ICONists and P-Reps can do is reduce the risk level to a minimum.

It will need a lot of planning, time, and effort to have 99.999% (I should never say 100%) uptime and reliability.
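To put the 99.999% figure in perspective, here is a rough back-of-the-envelope sketch in Python that converts an availability percentage into the downtime it leaves room for over a year. The function name and the 365-day year are assumptions made for illustration; nothing here comes from ICON's own tooling or penalty rules.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Minutes of downtime permitted over the period at the given availability.

    Hypothetical helper for illustration only; assumes a 365-day year by default.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare a few common availability targets.
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime -> {minutes:.1f} minutes of downtime per year")
```

At "five nines" the budget works out to roughly 5.3 minutes per year, which is less time than a typical reboot or manual failover takes. That is why the planning effort mentioned above matters so much.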

I will look into this (we all need to look into this & time is running out). Thanks. :pray: