If you are ready to explore NuRaft, the best starting point to learn using NuRaft is the examples within the examples directory in the repo. There are two examples: echo and calculator, both share some common files. For a hello-world equivalent example, see Quick Start Guide.
The major tasks to use NuRaft include the following:
in_memory_log_store.hxx and in_memory_log_store.cxx provide a log store example for both examples.
Defining log entry format, and state machine actions, and providing state machine snapshot support. See echo/echo_state_machine.hxx and calculator/calc_state_machine.hxx for the examples.
There is a cluster-wide configuration cluster_config, which contains a list of srv_config for each server. Class state_mgr manages the configuration and srv_state. You need to override the base class with custom specifics. See in_memory_state_mgr.hxx for example.
In example_common.hxx there is code to define a server, and initialize raft protocol with parameters (init_raft).
Assuming we have a declaration
raft_params params;
Some important protocol parameters to consider are the following:
-
params.heart_beat_interval_
: a leader will send a heartbeat message to followers if this interval (in milliseconds) has passed since the last replication or heartbeat message. -
params.election_timeout_lower_bound_
andparams.election_timeout_upper_bound_
: they determine the time (in milliseconds, between the lower and upper) a follower will wait before initiating leader election. These three parameters together determine how long a leader failure will be detected. When a leader fails, writes will be temporarily unavailable until a new leader is elected. The average lapse will be about half way between this timeout interval. -
params.reserved_log_items_
: the number of trailing log entries will be preserved when a snapshot is taken. If one member falls behind the others temporarily, it needs to catch up from the leader using the Raft log. If the log is truncated too soon, the member would have to use a snapshot to catch up due to needed log entries being unavailable. Snapshot based recovery is costly, especially if the data volume is large. Keeping enough trailing log entries will help avoid the costly snapshot-based catch up. -
params.snapshot_distance_
: snapshot frequency (in number of log entries). When a member restarts or recovers from a remote snapshot, it will replay the log entries after the snapshot. Frequent snapshots will reduce the number of log entries to play, but incurs more overhead of snapshots. On the other hand, less frequent snapshot may increase the time of restart or catch up with less cost of snapshots.
echo/echo_server.cxx and calculator/calc_server.cxx contains both server and user interface code.
You start servers and then add to the cluster. Launcher can help to start Raft. See add_server()
and server_list()
in example_common.hxx for example.
You remove a server, and then shut it down. Use launcher for shutdown, see server code in response to the quit command.
The examples contain user interface code in the main server. If you use a client-server model, you need to define the communication API, say using gRPC for example. You need to expose server role status to the client for it to differentiate the leader from the followers.
-
Writes will go to the leader only. Raft can be configured to forward requests to the leader, but this may incur more hops, thus longer latency. If a node is no longer a leader, it can return the current leader to the client.
-
Reads from the leader for latest values.
-
Reads from the followers for the values that may be delayed.
For more details and advanced topics, please refer to the How to Use document.