1
00:00:01,280 --> 00:00:06,080
In this lecture, we're going to take a look at how we can use caches in order to optimize the build

2
00:00:06,100 --> 00:00:06,490
speed.

3
00:00:08,570 --> 00:00:13,850
You have probably noticed that some of the jobs need a lot of time to run, especially the build job;

4
00:00:13,970 --> 00:00:16,610
it needs to download some dependencies before it can run.

5
00:00:16,970 --> 00:00:23,450
And this takes a lot of time. Now, compared to more traditional CI servers like Jenkins,

6
00:00:23,930 --> 00:00:30,170
this extra time it takes to download a Docker image, download the dependencies and everything may seem

7
00:00:30,170 --> 00:00:30,890
like forever.

8
00:00:32,240 --> 00:00:37,190
In case you're wondering why GitLab is behaving in this way, you need to think back to the architecture

9
00:00:37,190 --> 00:00:37,970
of GitLab.

10
00:00:38,600 --> 00:00:43,670
And we said previously that every job starts in a clean environment.

11
00:00:43,670 --> 00:00:50,930
So we download a clean Docker image, and on top of that we have the code from our Git repository.

12
00:00:51,620 --> 00:00:58,310
But we have absolutely no other dependencies, no other code that has been generated previously from

13
00:00:58,310 --> 00:00:59,080
the previous jobs.

14
00:00:59,330 --> 00:01:01,550
So we are always starting new.

15
00:01:01,550 --> 00:01:06,860
Like when you're first setting up your project locally on a machine, you have to run npm install to

16
00:01:06,950 --> 00:01:09,050
get everything installed, all the dependencies.

17
00:01:09,890 --> 00:01:14,990
And this is not only valid for Node in this case; it's valid for any programming language,

18
00:01:15,320 --> 00:01:20,030
because it's rarely the case that you are working on a project and you don't have external dependencies.

19
00:01:20,360 --> 00:01:24,860
And as we know, we do not store external dependencies in our Git repository.

20
00:01:26,800 --> 00:01:32,400
But rest assured, there is a solution for this, and this can make our jobs a bit faster.

21
00:01:33,850 --> 00:01:41,650
Using caches, it is possible to speed up the execution of a job by instructing it to hold onto some files

22
00:01:41,650 --> 00:01:47,320
that we may later need. And if you're wondering, well, what should we cache?

23
00:01:47,860 --> 00:01:52,920
As we said, we need to download all these external project dependencies.

24
00:01:53,200 --> 00:01:59,900
So an ideal candidate for caching is exactly these external project dependencies that we are not storing

25
00:01:59,920 --> 00:02:02,410
in Git and that we need to download all the time.

26
00:02:03,070 --> 00:02:09,550
So if we can instruct GitLab, like, hey, we might need these files soon, so don't throw them away or

27
00:02:09,550 --> 00:02:12,820
don't download them all the time, keep them somewhere.

28
00:02:13,570 --> 00:02:15,780
Then this can make our jobs run faster.

29
00:02:17,780 --> 00:02:23,900
Now, in our case, all these dependencies are located in a folder called node_modules,

30
00:02:24,260 --> 00:02:26,670
and this is exactly the folder that we want to save.

31
00:02:27,530 --> 00:02:29,690
So let's go ahead and change our pipeline a bit.

32
00:02:32,530 --> 00:02:37,270
Now, here we have the build stage, and this is one of the stages that really needs to download all

33
00:02:37,270 --> 00:02:40,660
the dependencies, especially because of this command, npm install.

34
00:02:41,290 --> 00:02:48,300
So for that reason, let's take a look at how we can optimize this and instruct GitLab to store

35
00:02:48,310 --> 00:02:48,850
the files.

36
00:02:49,120 --> 00:02:51,340
And we can do that by using cache.

37
00:02:52,000 --> 00:02:56,050
When we're using cache, we first have to specify a key and a path.

38
00:02:56,920 --> 00:02:59,830
I'm going to start with the path because this is probably the easiest to understand.

39
00:03:00,760 --> 00:03:06,880
We have to tell GitLab what to save, and we are only interested in saving node_modules because

40
00:03:06,880 --> 00:03:12,160
everything which is inside node_modules is something that will be downloaded later, and we would do it

41
00:03:12,160 --> 00:03:12,460
again.

42
00:03:14,890 --> 00:03:17,200
So it is definitely a good idea to save the node modules.

43
00:03:17,920 --> 00:03:20,600
The second part regarding the cache is specifying a key.

44
00:03:20,980 --> 00:03:25,950
Now we need somehow to identify when we can use this cache.

45
00:03:27,730 --> 00:03:34,300
Now we can specify here a key as a string, which can be something like my cache.

46
00:03:36,670 --> 00:03:42,160
But it's actually a good practice to have a cache that is based on a specific branch.

47
00:03:42,880 --> 00:03:49,450
Now, currently we only have one branch, and that is master, and anyway, with npm install, we will check

48
00:03:49,450 --> 00:03:53,350
again if there are any outdated dependencies or any dependencies that do not match

49
00:03:54,360 --> 00:04:01,500
compared to what we have inside node_modules. But using this environment variable that is provided by

50
00:04:01,680 --> 00:04:05,700
GitLab, we can get the reference to the current branch that we are working on.

51
00:04:05,940 --> 00:04:10,050
So this key here, if we are on master, will be master.

52
00:04:14,570 --> 00:04:18,680
Or within a specific branch, it would be something like feature whatever.

53
00:04:19,820 --> 00:04:23,780
It's much easier to use this predefined environment variable from GitLab.

54
00:04:25,220 --> 00:04:29,060
Now, there's one more thing, and this is valid for a lot of things in GitLab.

55
00:04:29,720 --> 00:04:33,350
We can specify this on the job level.

56
00:04:34,870 --> 00:04:39,220
Or we can specify it globally. So if I take this out of here

57
00:04:42,700 --> 00:04:49,120
and move it here, outside of any particular job, this will be the global cache configuration,

58
00:04:49,750 --> 00:04:56,860
and this is actually quite a good thing because not only the build website job needs this cache, but

59
00:04:57,040 --> 00:05:00,790
most likely test website as well, because it's doing npm install.

60
00:05:01,750 --> 00:05:05,830
So for that reason, it might be a good idea to make this cache global.

61
00:05:06,460 --> 00:05:14,170
This also means that this cache will be used when running other jobs, which actually don't need a cache.

62
00:05:14,560 --> 00:05:19,930
So it's totally up to you how you want to configure it, but definitely it's a bit easier to do it like

63
00:05:19,930 --> 00:05:20,230
this.

64
00:05:22,630 --> 00:05:27,370
Now, let's take a look at how this performs and if it was worth building it in.

65
00:05:30,540 --> 00:05:37,440
Let's take a look at the build website job, and right on top you will see FATAL: file does not exist,

66
00:05:37,830 --> 00:05:45,160
and this is right under Checking cache for master, and this is exactly what GitLab is supposed to do.

67
00:05:45,210 --> 00:05:48,270
It's supposed to first check if there's a cache.

68
00:05:49,050 --> 00:05:54,870
And in this case, of course, there is no cache because we haven't actually created one.

69
00:05:55,830 --> 00:05:58,700
So for the first run of this job, this is definitely fine.

70
00:05:59,130 --> 00:06:01,290
Now, let's scroll towards the bottom of it.

71
00:06:05,650 --> 00:06:06,880
You will now see that

72
00:06:08,490 --> 00:06:15,750
GitLab is starting to create a cache, Creating cache master, and it has found a bunch of files. And

73
00:06:15,750 --> 00:06:16,680
what it actually does is this:

74
00:06:16,710 --> 00:06:23,610
It takes all those files, creates a zip archive, and uploads that archive somewhere so it

75
00:06:23,610 --> 00:06:24,750
can later download it.

76
00:06:28,500 --> 00:06:33,960
Now, let's take a look at one of the other jobs in our pipeline, and that is the test website job.

77
00:06:37,970 --> 00:06:45,920
And what test website will do when starting this job is to check the cache, and again the identifier for

78
00:06:45,920 --> 00:06:48,260
the cache is master because this is our branch.

79
00:06:48,500 --> 00:06:51,770
So this is why it says over here Checking cache for master.

80
00:06:52,130 --> 00:06:57,710
And this time it finds a cache because we have actually uploaded a cache previously from the build job.

81
00:06:58,010 --> 00:07:02,330
And it's now downloading this cache, and it doesn't have to download all the dependencies.

82
00:07:03,620 --> 00:07:10,840
And then at the end, it will again save all the files that it has and upload them again.

83
00:07:13,630 --> 00:07:19,780
So in this way, the cache will be updated all the time, and especially the npm install command that

84
00:07:19,780 --> 00:07:26,080
we have here will make sure that the dependencies that we have are up to date all the time, even if we use

85
00:07:26,080 --> 00:07:26,530
a cache.

86
00:07:31,140 --> 00:07:35,050
Now, let's run the pipeline again from the beginning to see if we get any improvements.

87
00:07:35,550 --> 00:07:37,200
I'm going to click here, Run Pipeline.

88
00:07:39,500 --> 00:07:44,300
I'll run it for master, and I don't have any variables that I want to define, so I'll simply click

89
00:07:44,420 --> 00:07:45,080
on Run Pipeline.

90
00:07:52,250 --> 00:07:58,580
Now, if we take a look at the total execution time for the last one and compare it to previous jobs,

91
00:08:00,300 --> 00:08:07,110
we'll see that we have like a 20-second improvement, and that may not seem like a lot, but we also do

92
00:08:07,110 --> 00:08:13,770
not have a lot of dependencies and we have also not configured this properly because we have enabled

93
00:08:13,770 --> 00:08:14,520
the global cache,

94
00:08:14,520 --> 00:08:16,520
but not all the jobs need that global cache.

95
00:08:17,640 --> 00:08:21,990
So for that reason, this is not 100 percent optimal and it can be further improved.

96
00:08:23,970 --> 00:08:29,790
But as a starting point, this is all that you need to know regarding how caches work or are supposed

97
00:08:29,790 --> 00:08:30,180
to work.

98
00:08:31,890 --> 00:08:38,789
There's one final thing I wanted to show you. Sometimes it happens that caches misbehave and

99
00:08:38,909 --> 00:08:42,960
make jobs fail for totally unexpected reasons.

100
00:08:44,100 --> 00:08:49,800
And what you can do from GitLab CI is to clear the caches, and that can be done from here, Clear Runner

101
00:08:49,860 --> 00:08:50,490
Caches.

102
00:08:50,500 --> 00:08:53,440
And if you click it, you will be able to empty the caches.

103
00:08:53,520 --> 00:08:56,560
And then when you start the pipeline again, you will have no caches.

104
00:08:56,850 --> 00:08:59,820
So this is like a hidden thing that you should know about.

105
00:09:00,060 --> 00:09:05,050
If you are massively using caches in your projects, it may happen that you have such problems with

106
00:09:05,070 --> 00:09:07,170
caches, and then this is how you can clear them.

107
00:09:10,640 --> 00:09:16,850
Now, let's recap for a second. We have defined here a global cache, we have defined a key, and using

108
00:09:16,850 --> 00:09:21,950
this environment variable from GitLab, we are setting the key to the current branch where this code

109
00:09:21,950 --> 00:09:23,090
is being saved.

110
00:09:23,990 --> 00:09:26,230
And we have defined only one path that we need.

111
00:09:26,240 --> 00:09:31,460
But in case there are multiple paths, you can simply add one below node_modules or whatever else you

112
00:09:31,460 --> 00:09:36,920
are using and they will be automatically saved by GitLab.

113
00:09:37,340 --> 00:09:44,750
So the way it goes is that GitLab looks for cache entries based on the key specified, and if something

114
00:09:44,750 --> 00:09:50,150
is available, it will be downloaded before the job starts right at the beginning of the execution of

115
00:09:50,150 --> 00:09:56,630
the job and again after the job has finished, the cache will be updated with the latest version in

116
00:09:56,630 --> 00:09:58,160
case something has changed.

