1
00:00:04,740 --> 00:00:06,380
Hello and welcome to this lecture.

2
00:00:06,420 --> 00:00:10,380
And we are learning advanced Docker concepts in this lecture.

3
00:00:10,380 --> 00:00:14,890
We're going to talk about Docker's storage drivers and file systems.

4
00:00:14,940 --> 00:00:23,110
We're going to see where and how Docker stores data and how it manages the file systems of containers. Let

5
00:00:23,110 --> 00:00:30,640
us start with how Docker stores data on the local file system. When you install Docker on a system,

6
00:00:30,850 --> 00:00:38,780
it creates this folder structure at /var/lib/docker. You have multiple folders under it called

7
00:00:38,850 --> 00:00:42,670
aufs, containers, image, volumes, etc.

8
00:00:42,670 --> 00:00:46,540
This is where Docker stores all its data by default.
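As a quick aside, you can inspect this directory yourself on a Linux Docker host; a sketch, assuming the default data-root (the exact set of subfolders varies by Docker version and storage driver):

```shell
# List Docker's data directory (requires root; assumes the default
# data-root of /var/lib/docker has not been changed).
sudo ls /var/lib/docker
```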

9
00:00:46,540 --> 00:00:53,950
When I say data, I mean files related to images and containers running on the Docker host. For example,

10
00:00:54,370 --> 00:01:00,610
all files related to containers are stored under the containers folder and the files related to images

11
00:01:00,670 --> 00:01:03,280
are stored under the image folder.

12
00:01:03,280 --> 00:01:08,250
Any volumes created by the docker containers are created under the volumes folder.

13
00:01:08,260 --> 00:01:10,030
Well don't worry about that for now.

14
00:01:10,030 --> 00:01:12,590
We will come back to that in a bit.

15
00:01:12,610 --> 00:01:19,900
For now let's just understand where Docker stores its files and in what format.

16
00:01:20,010 --> 00:01:26,190
So how exactly does Docker store the files of an image and a container? To understand that, we need

17
00:01:26,190 --> 00:01:30,070
to understand Docker's layered architecture.

18
00:01:30,120 --> 00:01:37,340
Let's quickly recap something we learned: when Docker builds images, it builds these in a layered architecture.

19
00:01:37,440 --> 00:01:44,610
Each line of instruction in the Dockerfile creates a new layer in the Docker image with just the changes

20
00:01:44,610 --> 00:01:46,160
from the previous layer.

21
00:01:46,260 --> 00:01:53,040
For example, the first layer is a base Ubuntu operating system, followed by the second instruction that

22
00:01:53,040 --> 00:01:57,780
creates a second layer which installs all the apt packages.

23
00:01:57,780 --> 00:02:04,080
And then the third instruction creates a third layer with the Python packages, followed by the

24
00:02:04,080 --> 00:02:06,990
fourth layer that copies the source code over.

25
00:02:06,990 --> 00:02:16,410
And then finally the fifth layer that updates the entry point of the image. Since each layer only stores

26
00:02:16,440 --> 00:02:18,780
the changes from the previous layer.

27
00:02:18,810 --> 00:02:21,600
It is reflected in the size as well.
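A minimal sketch of such a five-instruction Dockerfile, written here as a shell heredoc; the package names and file paths are illustrative assumptions, not taken from the lecture:

```shell
# Hypothetical Dockerfile with one image layer per instruction, as described above.
cat > Dockerfile <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y python3 python3-pip
RUN pip3 install flask
COPY app.py /opt/app.py
ENTRYPOINT ["python3", "/opt/app.py"]
EOF
```

Building it with `docker build -t my-app .` would then produce the five layers discussed here.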

28
00:02:21,690 --> 00:02:27,450
If you look at the base Ubuntu image, it is around 120 megabytes in size.

29
00:02:27,540 --> 00:02:35,730
The apt packages that are installed are around 300 MB, and then the remaining layers are small. To understand

30
00:02:35,850 --> 00:02:39,090
the advantages of this layered architecture.

31
00:02:39,090 --> 00:02:46,670
Let's consider a second application. This application has a different Dockerfile but is very similar

32
00:02:46,670 --> 00:02:54,290
to our first application, in that it uses the same base Ubuntu image, uses the same Python and Flask

33
00:02:54,290 --> 00:03:00,450
dependencies, but uses different source code to create a different application,

34
00:03:00,770 --> 00:03:03,620
And so a different entry point as well.

35
00:03:03,620 --> 00:03:09,860
When I run the docker build command to build a new image for this application, since the first three

36
00:03:09,860 --> 00:03:17,240
layers of both the applications are the same, Docker is not going to build the first three layers.

37
00:03:17,330 --> 00:03:25,100
Instead it reuses the same three layers it built for the first application from the cache and only creates

38
00:03:25,130 --> 00:03:33,620
the last two layers with the new sources and the new entry point. This way, Docker builds images faster

39
00:03:33,980 --> 00:03:37,000
and efficiently saves disk space.
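To see this layering and caching in action, you could use commands like these (my-app is a hypothetical image name; docker history lists an image's layers and their sizes):

```shell
# Inspect the layers of a built image and their individual sizes.
docker history my-app

# Rebuild after changing only the source code; the unchanged leading
# layers are reused from the build cache.
docker build -t my-app .
```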

40
00:03:37,010 --> 00:03:43,550
This is also applicable if you were to update your application code. Whenever you update your application

41
00:03:43,550 --> 00:03:51,650
code, such as the app.py in this case, Docker simply reuses all the previous layers from cache and

42
00:03:51,650 --> 00:03:59,270
quickly rebuilds the application image by updating the latest source code, thus saving us a lot of time

43
00:03:59,690 --> 00:04:08,910
during rebuilds and updates. Let's rearrange the layers bottom up so we can understand it better. At the

44
00:04:08,910 --> 00:04:16,440
bottom we have the base Ubuntu layer, then the packages, then the dependencies, and then the source code

45
00:04:16,530 --> 00:04:24,400
of the application, and then the entry point. All of these layers are created when we run the docker build

46
00:04:24,400 --> 00:04:32,140
command to form the final Docker image. So all of these are the Docker image layers.

47
00:04:32,140 --> 00:04:37,870
Once the build is complete, you cannot modify the contents of these layers, and so they are read-only,

48
00:04:37,930 --> 00:04:45,810
and you can only modify them by initiating a new build. When you run a container based off of this image

49
00:04:45,930 --> 00:04:52,680
using the docker run command, Docker creates a container based off of these layers and creates a new

50
00:04:52,770 --> 00:04:55,910
writable layer on top of the image layers.

51
00:04:55,920 --> 00:05:03,460
The writable layer is used to store data created by the container, such as log files written by the applications,

52
00:05:03,520 --> 00:05:10,200
any temporary files generated by the container, or just any file modified by the user on that container.

53
00:05:11,190 --> 00:05:16,020
The life of this layer, though, is only as long as the container is alive.

54
00:05:16,230 --> 00:05:23,010
When the container is destroyed this layer and all of the changes stored in it are also destroyed.

55
00:05:23,010 --> 00:05:31,570
Remember that the same image layer is shared by all containers created using this image. If I were to

56
00:05:31,570 --> 00:05:38,370
log into the newly created container and, say, create a new file called temp.txt,

57
00:05:38,680 --> 00:05:43,720
it would create that file in the container layer, which is read-write.

58
00:05:43,720 --> 00:05:50,290
We just said that the files in the image layer are read-only, meaning you cannot edit anything in those

59
00:05:50,290 --> 00:05:51,640
layers.

60
00:05:51,650 --> 00:05:57,230
Let's take the example of our application code. Since we bake our code into the image,

61
00:05:57,310 --> 00:06:03,480
the code is part of the image layer and as such is read-only. After running a container,

62
00:06:03,520 --> 00:06:08,500
what if I wish to modify the source code to, say, test a change?

63
00:06:08,650 --> 00:06:15,400
Remember the same image layer may be shared between multiple containers created from this image.

64
00:06:15,400 --> 00:06:20,240
So does it mean that I cannot modify this file inside the container?

65
00:06:20,260 --> 00:06:27,880
No, I can still modify this file. But before I save the modified file, Docker automatically creates a

66
00:06:27,880 --> 00:06:33,850
copy of the file in the read-write layer, and I will then be modifying a different version of the file

67
00:06:34,210 --> 00:06:36,190
in the read-write layer.

68
00:06:36,310 --> 00:06:42,100
All future modifications will be done on this copy of the file in the read-write layer.

69
00:06:42,130 --> 00:06:48,730
This is called the copy-on-write mechanism. The image layer being read-only just means that the files in

70
00:06:48,730 --> 00:06:55,450
these layers will not be modified in the image itself, so the image will remain the same all the time

71
00:06:55,840 --> 00:06:59,920
until you rebuild the image using the docker build command.

72
00:07:01,540 --> 00:07:07,630
What happens when we get rid of the container? All of the data that was stored in the container layer

73
00:07:07,810 --> 00:07:09,970
also gets deleted.

74
00:07:09,970 --> 00:07:17,500
The change we made to the app.py and the new temp file we created will also get removed.

75
00:07:17,530 --> 00:07:19,940
So what if we wish to persist this data?

76
00:07:20,110 --> 00:07:25,930
For example, if we were working with a database and we would like to preserve the data created by the

77
00:07:25,930 --> 00:07:34,050
container, we could add a persistent volume to the container. To do this, first create a volume using the

78
00:07:34,050 --> 00:07:35,760
docker volume create command.

79
00:07:36,570 --> 00:07:43,890
So when we run the docker volume create data_volume command, it creates a folder called

80
00:07:43,950 --> 00:07:50,620
data_volume under the /var/lib/docker/volumes directory.

81
00:07:50,680 --> 00:07:56,830
Then when I run the Docker container using the docker run command, I could mount this volume inside the

82
00:07:56,830 --> 00:08:02,070
Docker container's read-write layer using the -v option, like this.

83
00:08:02,260 --> 00:08:09,520
So I would do a docker run -v, then specify my newly created volume name, followed by a colon and

84
00:08:09,520 --> 00:08:16,120
the location inside my container, which is the default location where MySQL stores data, and that

85
00:08:16,120 --> 00:08:18,410
is /var/lib/mysql.

86
00:08:19,030 --> 00:08:26,570
And then the image name, mysql. This will create a new container and mount the data volume we created

87
00:08:26,750 --> 00:08:28,040
into the /var/lib/

88
00:08:28,090 --> 00:08:35,150
mysql folder inside the container, so all data written by the database is in fact stored on the

89
00:08:35,150 --> 00:08:37,930
volume created on the Docker host.
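Put together, the two commands described above would look like this (mysql is the official image name on Docker Hub; /var/lib/mysql is MySQL's default data directory):

```shell
# Create a named volume managed by Docker under /var/lib/docker/volumes.
docker volume create data_volume

# Mount that volume at MySQL's default data directory inside the container.
docker run -v data_volume:/var/lib/mysql mysql
```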

90
00:08:38,420 --> 00:08:43,430
Even if the container is destroyed, the data is still intact.

91
00:08:43,430 --> 00:08:48,890
Now what if you didn't run the docker volume create command to create the volume before the docker run

92
00:08:48,890 --> 00:08:49,970
command?

93
00:08:49,970 --> 00:08:56,990
For example, if I run the docker run command to create a new instance of a mysql container with the

94
00:08:56,990 --> 00:09:04,670
volume data_volume2, which I have not created yet, Docker will automatically create a volume

95
00:09:04,790 --> 00:09:09,700
named data_volume2 and mount it to the container.

96
00:09:10,010 --> 00:09:18,770
You should be able to see all these volumes if you list the contents of the /var/lib/docker/volumes folder.

97
00:09:18,780 --> 00:09:26,370
This is called volume mounting, as we are mounting a volume created by Docker under the /var/lib/docker/

98
00:09:26,370 --> 00:09:27,630
volumes folder.

99
00:09:28,140 --> 00:09:34,830
But what if we had our data already at another location? For example, let's say we have some external

100
00:09:34,830 --> 00:09:42,270
storage on the Docker host at /data, and we would like to store database data on that volume

101
00:09:42,660 --> 00:09:47,000
and not in the default /var/lib/docker/volumes folder.

102
00:09:47,190 --> 00:09:52,370
In that case we would run a container using the docker run -v command.

103
00:09:52,500 --> 00:09:57,870
But in this case we will provide the complete path to the folder we would like to mount.

104
00:09:57,870 --> 00:10:05,250
That is /data/mysql. It will thus create a container and mount the folder

105
00:10:05,520 --> 00:10:07,450
to the container.

106
00:10:07,460 --> 00:10:09,600
This is called bind mounting.
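As a sketch, the bind mount version of the same command:

```shell
# Mount an existing host directory (/data/mysql) instead of a Docker-managed volume.
docker run -v /data/mysql:/var/lib/mysql mysql
```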

107
00:10:09,740 --> 00:10:16,670
So there are two types of mounts: a volume mount and a bind mount. A volume mount mounts a volume from

108
00:10:16,670 --> 00:10:23,200
the volumes directory, and a bind mount mounts a directory from any location on the Docker host.

109
00:10:24,870 --> 00:10:33,480
One final point to note before I let you go: using the -v option is the old style. The new way is to use the --

110
00:10:33,480 --> 00:10:39,170
mount option. The --mount is the preferred way, as it is more verbose.

111
00:10:39,390 --> 00:10:44,670
So you have to specify each parameter in a key=value format.

112
00:10:44,820 --> 00:10:51,780
For example, the previous command can be written with the --mount option as this, using the type, source,

113
00:10:51,870 --> 00:10:53,700
and target options.

114
00:10:53,700 --> 00:11:01,320
The type in this case is bind, the source is the location on my host, and target is the location on my

115
00:11:01,320 --> 00:11:01,950
container.
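The --mount form of the previous bind mount command would look like this:

```shell
# Same bind mount, expressed with the more explicit --mount syntax.
docker run \
  --mount type=bind,source=/data/mysql,target=/var/lib/mysql \
  mysql
```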

116
00:11:05,170 --> 00:11:08,760
So who is responsible for doing all of these operations?

117
00:11:09,040 --> 00:11:11,050
Maintaining the layered architecture.

118
00:11:11,050 --> 00:11:18,730
Creating a writable layer, moving files across layers to enable copy-on-write, etc.? It's the storage drivers.

119
00:11:18,970 --> 00:11:23,480
So Docker uses storage drivers to enable the layered architecture.

120
00:11:23,620 --> 00:11:32,590
Some of the common storage drivers are AUFS, BTRFS, ZFS, Device Mapper, Overlay, and Overlay2.

121
00:11:32,590 --> 00:11:35,660
The selection of the storage driver

122
00:11:35,690 --> 00:11:40,230
depends on the underlying OS being used. For example, with Ubuntu,

123
00:11:40,250 --> 00:11:45,920
the default storage driver is AUFS, whereas this storage driver is not available on other

124
00:11:45,920 --> 00:11:49,220
operating systems like Fedora or CentOS.

125
00:11:49,220 --> 00:11:57,080
In that case, Device Mapper may be a better option. Docker will choose the best storage driver available

126
00:11:57,110 --> 00:12:04,160
automatically based on the operating system. The different storage drivers also provide different performance

127
00:12:04,220 --> 00:12:10,610
and stability characteristics, so you may want to choose one that fits the needs of your application

128
00:12:10,970 --> 00:12:12,680
and your organisation.
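To check which storage driver Docker picked on your own host, one option is:

```shell
# Print the storage driver selected by the Docker daemon (e.g. overlay2).
docker info --format '{{.Driver}}'
```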

129
00:12:12,680 --> 00:12:18,500
If you would like to read more on any of these storage drivers, please refer to the links in the attached

130
00:12:18,560 --> 00:12:20,860
documentation. For now,

131
00:12:20,880 --> 00:12:24,930
that is all from the Docker architecture concepts.

132
00:12:25,020 --> 00:12:28,240
See you in the next lecture.
