1
00:00:01,640 --> 00:00:08,220
In this session we are going to talk about Redis High Availability. Redis High Availability

2
00:00:08,220 --> 00:00:09,090
is different

3
00:00:09,110 --> 00:00:14,210
from Redis Replication. In Redis Replication, we had

4
00:00:14,210 --> 00:00:20,600
a Master Server and N Slave Nodes, and whenever the Master goes down, you can

5
00:00:20,600 --> 00:00:23,240
connect to a Slave Node to perform Read Only Operations.

6
00:00:23,240 --> 00:00:29,030
However, if you want to convert a Slave Node to a Master Node, you have to take some sort of Manual

7
00:00:29,030 --> 00:00:29,420
Action.

8
00:00:32,620 --> 00:00:40,190
Now there is something called Redis Sentinel. This is software which is bundled with the Redis

9
00:00:40,980 --> 00:00:42,300
Software Package.

10
00:00:42,420 --> 00:00:48,960
So Redis Sentinel is going to provide us a Mechanism using which

11
00:00:49,020 --> 00:00:58,740
we are going to Automate the process of promoting a Slave Node to a Master Node whenever the existing

12
00:00:58,740 --> 00:01:04,150
Master Node is down.

13
00:01:04,150 --> 00:01:10,150
So now, what is the requirement which you need to meet in order to set up this

14
00:01:10,600 --> 00:01:11,380
Redis High Availability.

15
00:01:11,710 --> 00:01:16,840
So one of the key things to note is that you need to install Sentinel on a minimum of

16
00:01:16,840 --> 00:01:19,660
3 Servers. Okay.

17
00:01:19,670 --> 00:01:27,000
Now the reason why you need 3 servers on which to install Redis Sentinel is that there

18
00:01:27,000 --> 00:01:29,180
is something called Quorum.

19
00:01:29,220 --> 00:01:36,780
Okay, so the Quorum Configuration is the minimum number of Sentinel Servers which need to agree

20
00:01:36,780 --> 00:01:43,200
that a Master Node is down before one of the Slave Nodes is converted to a Master Node.

21
00:01:43,830 --> 00:01:44,050
Okay.

22
00:01:44,070 --> 00:01:52,980
So Redis Sentinel is going to have an Election process, and in that Election process the

23
00:01:52,980 --> 00:02:02,000
majority of Redis Sentinels are going to decide that yes, the Master Node is down and it is not

24
00:02:02,000 --> 00:02:02,590
reachable.

25
00:02:02,660 --> 00:02:10,600
And now this is the time they need to promote one of the Slave Nodes to a Master Node. Okay.

26
00:02:10,640 --> 00:02:17,130
So whatever number you provide in the Quorum Configuration, that is the minimum number of Sentinel

27
00:02:17,160 --> 00:02:26,400
Servers, or Sentinel Services, which are going to agree and perform this Election of a new

28
00:02:26,400 --> 00:02:35,210
Master Server.
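
The Quorum rule described above can be sketched in a few lines of Python. This is a toy model, not Sentinel's actual implementation: the function name `can_fail_over` and its parameters are hypothetical. It also assumes, following the Redis Sentinel documentation, that the Quorum is what it takes to flag the Master as down, while starting the failover additionally needs a majority of all known Sentinels.

```python
def can_fail_over(total_sentinels: int, reachable_sentinels: int, quorum: int) -> bool:
    """Return True if the reachable Sentinels could promote a Slave Node.

    Illustrative model only (hypothetical helper, not a Redis API):
    - `quorum` Sentinels must agree that the Master Node is down;
    - a majority of all Sentinels must be reachable to authorize the failover.
    """
    majority = total_sentinels // 2 + 1
    enough_to_flag_down = reachable_sentinels >= quorum
    enough_to_authorize = reachable_sentinels >= majority
    return enough_to_flag_down and enough_to_authorize


# 3 Sentinels, Quorum 2: losing the Master's box still leaves 2 Sentinels.
print(can_fail_over(total_sentinels=3, reachable_sentinels=2, quorum=2))  # True
# 2 Sentinels, Quorum 2: losing one box leaves a single Sentinel.
print(can_fail_over(total_sentinels=2, reachable_sentinels=1, quorum=2))  # False
```

The second call is the reason a setup with fewer than 3 Sentinels cannot survive the loss of one server.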

29
00:02:35,330 --> 00:02:43,630
The other thing to note is that Sentinel runs on a default Port of 26379, whereas Redis

30
00:02:43,670 --> 00:02:48,070
runs on a default Port of 6379.

31
00:02:48,710 --> 00:02:51,770
Both of these ports can be changed from the Configuration File.

32
00:02:52,190 --> 00:03:00,420
So if you remember, the Redis configuration is stored in the redis.conf File and the Sentinel configuration is stored

33
00:03:00,430 --> 00:03:10,240
in the sentinel.conf File. Now both of these files are kept in the same Directory, and there is also a

34
00:03:11,350 --> 00:03:17,800
redis-sentinel utility, which is stored in the same Directory where the redis-server

35
00:03:17,800 --> 00:03:27,740
Utility is available. So you do not need to download and install the Sentinel Software separately.

36
00:03:27,740 --> 00:03:29,440
It comes bundled with the same software.

37
00:03:29,450 --> 00:03:35,960
The only thing is, you need to start the services after doing the Correct Configuration.

38
00:03:35,990 --> 00:03:42,230
Now let's discuss the working of these Nodes. For Redis Sentinel, we are saying that

39
00:03:42,230 --> 00:03:49,700
we need to install it on a minimum of three Nodes, but those Nodes do not need to be completely separate

40
00:03:49,700 --> 00:03:55,400
servers. Okay, they can be the same Redis Servers, on which you can also install Redis Sentinels.

41
00:03:55,460 --> 00:04:00,830
So, in this example you can see we have a Master Node and then we have 2 Slave Nodes, which are S1 and

42
00:04:00,920 --> 00:04:06,950
S2. Okay, and then we have Redis Sentinel 1, which is running on the Master Node, then we have Redis Sentinel

43
00:04:06,950 --> 00:04:12,860
2, which is running on Slave Node 1, and then we have Redis Sentinel 3, which is running on Slave Node 2.

44
00:04:15,170 --> 00:04:16,070
Okay.

45
00:04:16,120 --> 00:04:25,700
And in this case, whenever the Master goes down, you can see the Redis Sentinels are

46
00:04:25,700 --> 00:04:32,650
going to decide that it is time to elect and promote one of the Slave Nodes to a Master.

47
00:04:32,660 --> 00:04:36,990
And in this case they have promoted a Slave Node to a Master Node.

48
00:04:37,500 --> 00:04:38,060
Okay.

49
00:04:38,990 --> 00:04:45,830
And also, when one of the Slave Nodes is converted to a Master Node, any previous Master

50
00:04:45,830 --> 00:04:48,800
Node is going to be automatically changed to a Slave Node.

51
00:04:49,460 --> 00:04:56,080
So whenever this Master Node, which is M1, comes back, it is actually going to become a Slave Node.

52
00:04:56,630 --> 00:05:06,660
Okay, and it is going to become a Slave Node of this newly elected Master Server. So both of these

53
00:05:06,660 --> 00:05:08,670
are Slave Nodes of

54
00:05:08,670 --> 00:05:10,090
this Master Server.

55
00:05:13,760 --> 00:05:20,450
Now there is something called the down-after-milliseconds parameter, which you see here, and this is a configuration

56
00:05:20,450 --> 00:05:22,650
which is done in the sentinel.conf file.

57
00:05:23,780 --> 00:05:31,790
So the down-after-milliseconds number tells Sentinel the number of milliseconds it should

58
00:05:31,970 --> 00:05:35,750
wait after the master becomes unreachable.

59
00:05:35,780 --> 00:05:42,140
Okay, so whatever number you define here, that is the minimum amount of time Sentinel is going to wait for

60
00:05:42,140 --> 00:05:43,310
the master to come back up.

61
00:05:43,340 --> 00:05:50,340
If the master cannot be reached for that amount of time, then it is going to start the Election

62
00:05:50,340 --> 00:05:51,880
process for a New Master.
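
To make the settings mentioned so far concrete, here is a small Python sketch that renders a minimal sentinel.conf. The helper `render_sentinel_conf` is hypothetical, and the master name `mymaster`, the IP `10.0.0.1` and the 5000 ms timeout are assumed example values, not taken from this session. The three directives it writes (`port`, `sentinel monitor` and `sentinel down-after-milliseconds`) are standard Sentinel configuration directives.

```python
def render_sentinel_conf(master_name: str, master_ip: str, quorum: int,
                         down_after_ms: int, master_port: int = 6379,
                         sentinel_port: int = 26379) -> str:
    """Build the text of a minimal sentinel.conf (illustrative helper)."""
    lines = [
        # Port the Sentinel process itself listens on.
        f"port {sentinel_port}",
        # Which Master to watch, and how many Sentinels must agree it is down.
        f"sentinel monitor {master_name} {master_ip} {master_port} {quorum}",
        # How long the Master must be unreachable before it is considered down.
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
    ]
    return "\n".join(lines) + "\n"


# Assumed example values: a Master at 10.0.0.1, Quorum 2, 5 second timeout.
print(render_sentinel_conf("mymaster", "10.0.0.1", quorum=2, down_after_ms=5000))
```

The same rendered text would be placed on each of the 3 Sentinel servers, since every Sentinel monitors the same Master.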

63
00:05:55,040 --> 00:05:59,690
In our example, we are going to work with the same scenario. We are going to have 3

64
00:05:59,690 --> 00:06:05,590
servers, on which we are going to make the same setup, and we will see how all of this works.

65
00:06:05,970 --> 00:06:09,680
OK.

66
00:06:09,870 --> 00:06:16,500
So now let's discuss the different kinds of setup which you can have, and

67
00:06:16,500 --> 00:06:21,060
some of the advantages or disadvantages of all of those scenarios.

68
00:06:21,060 --> 00:06:24,660
Let's try to see what issues we can have and what the mitigation is.

69
00:06:29,240 --> 00:06:36,140
Now the first deployment type which is shown here is a Deployment where we have only 2 Redis Sentinels

70
00:06:36,140 --> 00:06:39,560
running: Redis Sentinel 1 and Redis Sentinel 2.

71
00:06:39,560 --> 00:06:43,620
So this is a Deployment which you should never do. It is not going to work reliably.

72
00:06:47,690 --> 00:06:57,570
The reason is that Quorum is set to 2, and whenever, let's say, we have a problem

73
00:06:57,570 --> 00:06:58,570
with this server.

74
00:06:58,680 --> 00:07:05,460
In that case M1 and RS1, both these services, are going to get impacted. Both of these will

75
00:07:05,460 --> 00:07:06,330
not be available.

76
00:07:06,810 --> 00:07:13,710
So now Redis Sentinel 2 doesn't have the minimum number of Redis Sentinels available to perform

77
00:07:13,710 --> 00:07:22,380
the Election process and convert this Slave Node to a Master Node. So that's the reason it is recommended

78
00:07:22,650 --> 00:07:24,230
that you never do this Deployment.

79
00:07:29,230 --> 00:07:29,540
Okay.

80
00:07:29,540 --> 00:07:39,420
The only case in which this setup is going to work involves the server on which Redis Sentinel

81
00:07:39,420 --> 00:07:41,970
1 is running.

82
00:07:42,000 --> 00:07:42,440
Okay.

83
00:07:42,450 --> 00:07:49,860
That server is somehow available, and Redis Sentinel 1 is also available, and there is some impact

84
00:07:49,950 --> 00:07:52,430
only on the Redis Master Node.

85
00:07:52,530 --> 00:07:58,170
Let's say for some reason the Redis Master Node service is not running, but that server is available

86
00:07:58,620 --> 00:08:03,150
and Redis Sentinel 1 is running and Redis Sentinel 2 is also running.

87
00:08:03,300 --> 00:08:08,190
That is the only scenario in which it is going to work, because in that case Redis Sentinel 1

88
00:08:08,190 --> 00:08:16,180
and Redis Sentinel 2 are going to agree that yes, the master is down, and they are going to promote the Slave

89
00:08:16,180 --> 00:08:17,420
to Master Node.

90
00:08:17,620 --> 00:08:21,170
Okay, because Quorum is set to 2, only when

91
00:08:21,180 --> 00:08:26,460
Redis Sentinel 1 and Redis Sentinel 2 are both available can this process work.

92
00:08:26,880 --> 00:08:33,320
However, in Production, we know that it is very unlikely that you are going to have

93
00:08:33,330 --> 00:08:38,640
a problem only with Redis and no problem with Redis Sentinel whenever there is

94
00:08:38,640 --> 00:08:40,060
any problem on that server.

95
00:08:40,320 --> 00:08:47,730
Okay, so that's the reason this setup is not recommended.

96
00:08:47,860 --> 00:08:50,150
Now let's discuss another deployment.

97
00:08:50,290 --> 00:08:56,500
So in this deployment we have 3 sentinels.

98
00:08:56,510 --> 00:08:59,050
We have a Master and we have 2 Replicas

99
00:09:03,240 --> 00:09:07,100
and here Quorum is set to 2, and this is a good approach

100
00:09:07,110 --> 00:09:09,950
if you have a minimum of 3 servers.

101
00:09:10,870 --> 00:09:11,160
Okay.

102
00:09:11,160 --> 00:09:17,240
So when you have a minimum of 3 servers, what you can do is set up 1 Redis Sentinel on

103
00:09:17,240 --> 00:09:25,170
each of the servers, and one of the servers can become the Redis Master and the remaining 2 servers are

104
00:09:25,190 --> 00:09:35,450
going to be Slave 1 and Slave 2. So in this case, whenever the Master Node goes down, as we have seen, in

105
00:09:35,450 --> 00:09:40,180
that case one of the other nodes is going to become the Master Node.

106
00:09:40,190 --> 00:09:43,100
So let's say M1 is down, and this one is going to become a Master Node.

107
00:09:46,840 --> 00:09:52,870
However, let's say for some reason this master is in one subnet and Slave 1 and Slave 2

108
00:09:52,870 --> 00:10:04,420
are in a different subnet. Let's say the IP of this one begins with something like 10.0.0.*

109
00:10:07,080 --> 00:10:13,280
and then these two Slave Nodes are from a different subnet, so theirs start with, let's say, something

110
00:10:13,860 --> 00:10:18,370
from 10.4.0.*

111
00:10:23,490 --> 00:10:33,800
and for some reason you have an issue with this subnet connecting to this subnet. Okay.

112
00:10:34,930 --> 00:10:42,610
So in this case what is going to happen is that the Redis Sentinels in this whole block, or this set

113
00:10:42,610 --> 00:10:45,930
of users,

114
00:10:46,020 --> 00:10:53,100
are not going to be able to connect to the Master Node, even though the actual service on

115
00:10:53,100 --> 00:10:57,080
the server is still up and running. Okay.

116
00:10:57,090 --> 00:11:01,800
In fact, Redis Sentinel 1 is also running and Redis is also running on this server.

117
00:11:02,310 --> 00:11:09,600
However, since there is a network issue, a network partition, between this subnet

118
00:11:09,600 --> 00:11:15,720
and this subnet, Redis Sentinel 2 and Redis Sentinel 3 are going to

119
00:11:18,330 --> 00:11:23,430
agree that yes, they are not able to reach the Master. So they are going to convert one of the

120
00:11:23,430 --> 00:11:30,570
Slaves to a Master, and then the previous Master Node is going

121
00:11:30,570 --> 00:11:35,880
to be turned into a Slave Node whenever it is available and rejoins this network.

122
00:11:36,930 --> 00:11:43,290
So in this scenario, let's say we have a user, User A, which is connected directly

123
00:11:43,290 --> 00:11:49,650
to the Master Node. So whenever it is performing write operations,

124
00:11:49,740 --> 00:11:56,790
since this Master Node is still a Master Node, it is going to allow all the write operations. But whenever

125
00:11:56,790 --> 00:12:01,450
this issue is resolved and all of these networks are able to connect to each other,

126
00:12:01,450 --> 00:12:06,210
since this master's state has been changed to a Slave Node by Sentinel,

127
00:12:06,810 --> 00:12:15,840
whenever it joins back the network, this M1 is going to be a Slave of this New Master, and whatever data

128
00:12:15,870 --> 00:12:24,060
which has been written by User A is going to get lost. Okay, because this Slave will be trying to get

129
00:12:24,090 --> 00:12:28,020
into the same state which this New Master is in.
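
The data loss described here can be illustrated with a toy Python model in which each node is just a dictionary. Nothing below talks to a real Redis server. The key names and the helper `resync_as_slave` are hypothetical, and copying the whole dataset is a simplification of what a Slave does when it synchronizes with its Master.

```python
# Toy model: each node is just a dict holding its dataset.
old_master = {"counter": "1"}      # M1, cut off by the network partition
new_master = dict(old_master)      # Slave promoted by the Sentinels

# During the partition, User A still writes to the old Master...
old_master["order:42"] = "paid"
# ...while other clients write to the newly elected Master.
new_master["counter"] = "2"


def resync_as_slave(slave: dict, master: dict) -> None:
    """Make the slave's dataset an exact copy of the master's."""
    slave.clear()
    slave.update(master)


# Partition heals: M1 is reconfigured as a Slave of the New Master.
resync_as_slave(old_master, new_master)

print("order:42" in old_master)  # False: User A's write is gone
print(old_master)                # {'counter': '2'}
```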

130
00:12:30,900 --> 00:12:33,650
However, this issue does have a Mitigation.

131
00:12:33,660 --> 00:12:38,460
So there is something called the min-replicas-to-write configuration,

132
00:12:41,480 --> 00:12:42,820
which you can set to 1

133
00:12:42,830 --> 00:12:43,570
in this case.

134
00:12:43,610 --> 00:12:50,930
So whenever you have a scenario like this, if User A is writing data to the Master, the Master will

135
00:12:50,930 --> 00:12:53,630
not have any Slaves. Okay.

136
00:12:53,670 --> 00:12:58,620
It will not have any Slaves to write or replicate its data to.

137
00:12:58,940 --> 00:13:05,960
And since we have set min-replicas-to-write to 1, the Master must have a minimum of 1 replica

138
00:13:06,050 --> 00:13:12,200
to which it replicates its data, and if it is not able to find that one replica, it is not going

139
00:13:12,200 --> 00:13:18,500
to allow any write operation. So whenever User A tries to perform a write operation, it is going

140
00:13:18,500 --> 00:13:19,500
to get an error.

141
00:13:19,700 --> 00:13:22,240
And this is how User A is going to become aware

142
00:13:22,250 --> 00:13:25,260
that yes, there is some issue going on with the Redis Server.

143
00:13:26,000 --> 00:13:34,080
However, User A can still successfully perform all the Read operations.
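
The effect of this setting can be sketched with a toy Python class. `ToyMaster` is a hypothetical stand-in, not a Redis client, and it reduces the rule to a simple count of connected replicas: writes are refused while too few replicas are connected, while reads keep working.

```python
class ToyMaster:
    """Toy stand-in for a Redis Master (hypothetical class, not a Redis client)."""

    def __init__(self, min_replicas_to_write: int) -> None:
        self.min_replicas_to_write = min_replicas_to_write
        self.connected_replicas = 0
        self.data = {}

    def set(self, key: str, value: str) -> None:
        # Refuse the write when too few replicas are connected.
        if self.connected_replicas < self.min_replicas_to_write:
            raise RuntimeError("not enough replicas, write refused")
        self.data[key] = value

    def get(self, key: str):
        # Reads are always served from the local dataset.
        return self.data.get(key)


master = ToyMaster(min_replicas_to_write=1)
master.connected_replicas = 1
master.set("greeting", "hello")    # accepted: one replica is connected

master.connected_replicas = 0      # network partition: no replicas reachable
try:
    master.set("greeting", "bye")
except RuntimeError as exc:
    print("write failed:", exc)

print(master.get("greeting"))      # hello
```

Because the isolated Master rejects the write, User A learns about the problem immediately instead of losing the data later.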

144
00:13:34,180 --> 00:13:43,510
Now once you do this mitigation, there is one more drawback, which is

145
00:13:43,510 --> 00:13:49,910
that whenever both of the replicas are down, the master is not going to accept

146
00:13:49,930 --> 00:13:50,910
writes.

147
00:13:50,950 --> 00:13:58,630
Okay, so let's say we have an issue with S1 and S2 and both of them are not available. In that case the master

148
00:13:58,750 --> 00:14:02,140
server is not going to allow any write operation.

149
00:14:02,140 --> 00:14:10,390
The reason is that min-replicas-to-write is set to 1. So again, whenever you are doing a setup in

150
00:14:10,390 --> 00:14:16,820
your Production Environment, you can assess these scenarios: are you going to face any problem which

151
00:14:16,840 --> 00:14:25,100
can occur because of a Network Partition? Probably you can try to make all the Redis nodes a part of the

152
00:14:25,670 --> 00:14:26,840
same network partition.

153
00:14:30,970 --> 00:14:31,350
Again,

154
00:14:31,360 --> 00:14:36,490
different kinds of issues can produce different behavior.

155
00:14:36,490 --> 00:14:45,080
So it is always best to do good testing before deploying Replication in a Production Environment.

156
00:14:45,080 --> 00:14:51,410
Now let's discuss another scenario. In this scenario we have only 2 Servers available for

157
00:14:51,410 --> 00:14:52,250
Redis.

158
00:14:52,710 --> 00:14:58,780
Okay, so we have made one Redis Server a Master Node and then we have converted one Redis Server

159
00:14:58,780 --> 00:15:00,270
to a Slave Node.

160
00:15:00,290 --> 00:15:04,440
Now in this scenario, you see, we have not installed Redis
Sentinel on the Redis Servers.

161
00:15:04,550 --> 00:15:06,620
So rather what is done here is

162
00:15:09,500 --> 00:15:14,560
there is a set of 3 servers where the application is running, and on

163
00:15:14,560 --> 00:15:16,870
those nodes Redis Sentinels are installed.

164
00:15:20,610 --> 00:15:21,420
Okay, and

165
00:15:21,460 --> 00:15:27,900
these Redis Sentinels are still going to monitor the Master and Slave Nodes and are going to work the same

166
00:15:27,900 --> 00:15:35,280
way which we have seen in this example.

167
00:15:35,290 --> 00:15:41,440
Now the issue which can come up here is when these application servers are totally disconnected

168
00:15:41,440 --> 00:15:48,340
from this Master and Slave Node. So let's say, for some reason, all of these are completely

169
00:15:48,340 --> 00:15:53,310
disconnected from the Master and Slave.

170
00:15:53,530 --> 00:15:56,080
In that case you are going to have issues

171
00:15:59,060 --> 00:16:04,720
and also, if for some reason there is no connectivity between this Application Server and the Master Server,

172
00:16:04,760 --> 00:16:08,990
then you won't be able to use this kind of setup.

173
00:16:19,060 --> 00:16:26,140
Now let's discuss another deployment scenario. In this scenario it is assumed that on the application

174
00:16:26,140 --> 00:16:28,440
side also, you have only 2 servers available.

175
00:16:28,880 --> 00:16:29,170
Okay.

176
00:16:29,200 --> 00:16:30,780
So when I say Application Side,

177
00:16:30,780 --> 00:16:37,630
this can be the servers where you have, let's say, your Web Application running, or any application

178
00:16:37,630 --> 00:16:44,430
running, which is going to connect to the Redis Database. So in this case, we have only 2

179
00:16:44,430 --> 00:16:48,900
servers available for the Application and 2 servers available for the Redis Database.

180
00:16:49,770 --> 00:16:53,730
So in this case we have installed 4 Sentinels.

181
00:16:53,790 --> 00:17:00,420
Okay, so we have Redis Sentinel 1, which is installed on an Application Server, and then we have Redis Sentinel 2,

182
00:17:00,420 --> 00:17:03,670
which is also installed on an Application Server.

183
00:17:03,750 --> 00:17:10,200
Then we have Redis Sentinel 3, which is installed on the same server where currently the Redis master

184
00:17:10,200 --> 00:17:15,910
is installed, and then we have Redis Sentinel 4, which is installed on the same server where

185
00:17:15,970 --> 00:17:23,860
Slave 1 is installed, and in this case we now have 4 Sentinels available.

186
00:17:23,870 --> 00:17:29,200
So that's why the minimum number of Sentinels which need to agree is set to 1.

187
00:17:29,220 --> 00:17:31,640
Oh, I'm sorry, it is set to 3.

188
00:17:31,880 --> 00:17:34,430
So you can see here Quorum is set to 3

189
00:17:37,150 --> 00:17:38,730
and this is also a good approach

190
00:17:38,800 --> 00:17:45,430
if you have a minimum of 3 servers, including your Client Servers and Redis

191
00:17:45,490 --> 00:17:49,140
Servers. By Client Server, what I mean here is the Application Server.

192
00:17:49,750 --> 00:17:56,770
So what you need to achieve, somehow, is to have a minimum of 3 servers available

193
00:17:56,770 --> 00:18:05,330
to install your Redis Sentinels. That is precisely what we are trying to say in all

194
00:18:05,330 --> 00:18:13,050
of the scenarios. Okay, in none of the scenarios have we seen a perfectly fine working model

195
00:18:13,050 --> 00:18:17,130
where we have fewer than 3 Sentinels available.

196
00:18:22,810 --> 00:18:28,720
So that's it about Redis High Availability and Sentinel Deployment. In the next session we are

197
00:18:28,720 --> 00:18:35,800
going to do a Hands-on Session. We are going to set up 3 servers, and on those servers we are

198
00:18:35,800 --> 00:18:38,380
going to set up high availability using Sentinels.
