WEBVTT 00:00.390 --> 00:06.390 Welcome! In this lecture, you're going to learn how to decode the runes in a UTF-8 encoded string 00:06.390 --> 00:09.810 value manually. But why would you want to do so? 00:10.590 --> 00:15.930 Well, by manually decoding runes, you can gain a better understanding of how the decoding really 00:15.930 --> 00:16.440 works. 00:16.450 --> 00:18.030 And sometimes it is the best way. 00:18.540 --> 00:19.710 All right, let's get started. 00:20.970 --> 00:27.390 Remember this? It contains multibyte runes. As you also learned before, it isn't easy to work 00:27.390 --> 00:29.600 with UTF-8 encoded string values. 00:30.060 --> 00:35.490 I mean, you cannot index or slice them without knowing where their runes start and end. 00:36.150 --> 00:41.900 In this lecture, you're going to learn a new way to find out where the runes in a string start and 00:41.900 --> 00:42.120 end. 00:43.920 --> 00:48.690 Now I'm going to declare a string variable, and I'm going to assign it the Turkish translation of the 00:48.690 --> 00:52.560 opening lines of a sci-fi book, The Hitchhiker's Guide to the Galaxy. 00:52.890 --> 00:57.600 Now I'm going to print it byte by byte to the console using a loop, like so. 01:01.680 --> 01:02.590 Look at the output. 01:02.910 --> 01:04.120 It doesn't look good. 01:04.140 --> 01:08.220 Right? Here it looks fine, but when I print it, it becomes messy. 01:08.640 --> 01:13.380 As you know, this is because some runes span multiple bytes. 01:13.860 --> 01:19.740 So you cannot print a string value that contains multibyte runes by directly looping over its bytes. 01:20.400 --> 01:22.770 You need to print the runes instead of the bytes. 01:23.370 --> 01:25.820 When you want to print a rune using bytes, 01:25.860 --> 01:27.520 it is called decoding. 01:28.050 --> 01:32.520 So first I'm going to put the next byte into a rune variable, like so.
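The byte-by-byte loop described above can be sketched roughly as follows. The lecture's actual Turkish text does not appear in the transcript, so the word "öykü" and the helper name byteByByte are stand-ins chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// byteByByte is a hypothetical helper: it treats every byte of s as if it
// were a complete character, which is what a plain byte loop ends up printing.
func byteByByte(s string) string {
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		// s[i] is a single byte. Converting it to a rune only widens the
		// value; it does not decode the UTF-8 sequence the byte belongs to.
		sb.WriteRune(rune(s[i]))
	}
	return sb.String()
}

func main() {
	text := "öykü" // assumed sample; 'ö' and 'ü' take two bytes each in UTF-8

	fmt.Println(text)             // the original string
	fmt.Println(byteByByte(text)) // garbled: each byte became its own character
	fmt.Println(len(text))        // 6 bytes, although the word has only 4 runes
}
```

Only the single-byte letters survive the loop intact, which matches the messy output the lecture shows.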
01:34.200 --> 01:41.340 As you can see, it doesn't work yet. It still prints gibberish, unnecessarily making a byte value bigger 01:41.340 --> 01:43.050 by converting it to a rune. 01:43.560 --> 01:45.780 Remember, the rune type is four bytes. 01:46.140 --> 01:51.450 So it's clear that converting a byte to a rune doesn't magically solve the decoding problem. 01:52.200 --> 01:53.320 So what should I do? 01:53.850 --> 01:58.850 Well, as I said, I need to find out where every rune starts and ends. 01:59.490 --> 02:00.760 So how can I do that? 02:01.520 --> 02:04.060 Actually, there is a function that can do this. 02:04.440 --> 02:05.400 Let me show it to you. 02:10.320 --> 02:16.560 As you can see, it searches for the first rune and returns it. I'll talk about the second result 02:16.560 --> 02:18.120 value, the size, later. 02:18.750 --> 02:21.720 By the way, DecodeRune accepts a byte slice. 02:21.900 --> 02:25.080 If I had a byte slice, I would be using this function. 02:25.080 --> 02:28.330 But I have a string value instead, so I'm going to use its 02:28.350 --> 02:30.680 sister, DecodeRuneInString. 02:31.950 --> 02:37.650 As you can see, instead of a byte slice, it accepts a string value. They are almost the same. 02:37.900 --> 02:41.160 Always remember this: in Go's standard library, 02:41.460 --> 02:44.400 when you see a function that expects a byte slice, 02:44.700 --> 02:48.270 there may be another function that expects a string value as well. 02:49.150 --> 02:50.140 OK, let me run it. 02:51.040 --> 02:51.970 What's going on here? 02:52.360 --> 02:58.690 It prints the same character over and over again. But why? The problem happens because I'm giving the 02:58.870 --> 03:02.070 DecodeRuneInString function always the same string value. 03:02.350 --> 03:04.060 So it looks at the same string 03:04.070 --> 03:07.270 and always finds the first rune again and again. 
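A small sketch of the behavior just described, assuming the same stand-in word "öykü" (the lecture's own text is not in the transcript). The wrapper firstRune is a hypothetical name; utf8.DecodeRuneInString is the standard library function the lecture refers to.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune is a hypothetical wrapper that returns the first rune of s and
// the number of bytes that rune occupies.
func firstRune(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	text := "öykü" // assumed sample text

	for i := 0; i < 3; i++ {
		// The input never changes, so the result never changes either.
		r, size := firstRune(text)
		fmt.Printf("%c (%d bytes)\n", r, size) // prints "ö (2 bytes)" each time
	}
}
```

Because the same string is passed on every iteration, the call keeps decoding the same leading rune, which is the repeated-character output seen in the lecture.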
03:07.750 --> 03:13.420 For example, here, if I had given the function this string value, it would always return me the first 03:13.420 --> 03:17.230 rune. When the DecodeRuneInString function finds the rune, 03:17.380 --> 03:23.560 the next time I call it, I should give it the next part of the string, like so, so that it will be able 03:23.560 --> 03:25.330 to find the second rune instead. 03:25.750 --> 03:29.710 It's because now the first rune is the second rune of the original string. 03:29.890 --> 03:31.150 So it goes like this 03:32.090 --> 03:33.470 until the string ends. 03:34.440 --> 03:40.340 So how can I know which part of the string I should give to the DecodeRuneInString function? 03:40.910 --> 03:45.420 Remember, DecodeRuneInString also returns the size of the first rune. 03:45.900 --> 03:49.640 So it practically returns where the first rune ends, like so. 03:50.540 --> 03:54.030 So the next byte is where the next rune starts, right? 03:54.810 --> 04:00.630 So by using the size of the first rune, you can easily jump to the next rune, one by one. 04:01.850 --> 04:07.640 OK, first, I'm going to save the size of the first rune, then I'm going to jump to the next rune 04:07.640 --> 04:14.310 like so. I also need to slice the string, like so, so that it will return the remaining part of the string. 04:14.760 --> 04:17.740 OK, there are still problems, but we are getting close. 04:18.410 --> 04:23.870 It looks like it jumps too fast because I jump twice, here and here. 04:24.440 --> 04:31.160 So I need to remove this part because the size value already tells me where the next rune starts. 04:33.390 --> 04:33.860 Perfect. 04:34.110 --> 04:35.070 Now it works. 04:35.460 --> 04:40.710 Let's take a look at this code for the last time. It's because now I'm going to remove almost everything 04:40.710 --> 04:41.640 from this code. 04:42.270 --> 04:47.990 This is because you've learned how rune decoding works behind the scenes, as you've witnessed. 
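The manual decoding loop built up in this part of the lecture might look like the sketch below. The on-screen code is not in the transcript, so the helper name decodeAll and the sample word are assumptions; the key point is that the index advances by the returned size only, with no extra increment.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeAll is a hypothetical helper that decodes s one rune at a time.
func decodeAll(s string) []rune {
	var runes []rune
	for i := 0; i < len(s); {
		// Decode the first rune of the remaining part of the string.
		r, size := utf8.DecodeRuneInString(s[i:])
		runes = append(runes, r)

		// Jump to where the next rune starts. Adding an i++ here as well
		// would "jump twice" and skip bytes.
		i += size
	}
	return runes
}

func main() {
	text := "öykü" // assumed sample text

	for _, r := range decodeAll(text) {
		fmt.Printf("%c ", r)
	}
	fmt.Println()
}
```

For invalid UTF-8 input, DecodeRuneInString reports utf8.RuneError with a size of 1, so the loop still moves forward.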
04:48.180 --> 04:50.880 It's tedious to decode runes manually. 04:51.360 --> 04:54.020 So it's time to switch to the automatic mode. 04:54.330 --> 04:59.550 Go offers us the for range loop as a way of decoding the runes automatically. 04:59.700 --> 05:02.130 Let me convert this into a for range loop. 05:08.260 --> 05:08.740 Perfect. 05:09.160 --> 05:14.210 So as you learned before, for range loops automatically decode the runes in a string value. 05:14.770 --> 05:18.160 However, sometimes decoding manually may still be useful. 05:18.580 --> 05:21.000 Let's take a look at another example. Here, 05:21.310 --> 05:23.560 I have a Turkish word in a byte slice. 05:24.340 --> 05:26.170 Let me print the bytes in hex. 05:30.700 --> 05:35.130 Let's say you want to make the first rune uppercase. How can you do that? 05:35.710 --> 05:41.440 The first two bytes belong to the first rune, but normally you cannot know that without analyzing the 05:41.440 --> 05:41.830 bytes. 05:42.760 --> 05:46.260 So first you need to find the byte size of the first rune. 05:47.080 --> 05:50.310 You can do that by using a for range loop, like so. 05:50.890 --> 05:56.700 However, this loop returns the bytes, not the runes, but you need to get the runes. 05:56.710 --> 06:01.010 So first you need to convert the bytes to a string, like so. 06:01.660 --> 06:04.140 So the range loop will return the runes. 06:04.810 --> 06:09.740 However, remember that converting a byte slice to a string is a costly operation. 06:09.910 --> 06:12.780 So this is not efficient. For now, 06:12.790 --> 06:14.380 let's keep it as it is. 06:15.170 --> 06:21.310 OK, next you need to get the size of the first rune, like so. Here I'm getting the index of the second 06:21.310 --> 06:21.620 rune. 06:21.640 --> 06:26.770 Then I quit. In a valid UTF-8 encoded string value, the second rune 06:26.770 --> 06:28.900 should start where the first rune ends. 
06:28.900 --> 06:34.930 So by using this knowledge here, I find the byte size of the first rune. 06:36.300 --> 06:42.240 Lastly, I'm going to convert the first rune to uppercase by using the bytes package's ToUpper function. 06:42.450 --> 06:44.910 Now I'm going to copy over the first rune. 06:46.570 --> 06:49.840 I also need to convert the byte slice to a string to print it. 06:51.290 --> 06:56.600 As you can see, the first letter is in uppercase, and you can see that only the first two bytes 06:56.600 --> 07:01.590 of the byte slice have changed because only the first two bytes belong to the first rune. 07:02.210 --> 07:06.610 However, using a for range loop here is kind of cumbersome and not efficient. 07:07.310 --> 07:11.750 Fortunately, you can do the same thing using the utf8 package more efficiently. 07:12.320 --> 07:18.490 I just need to call the DecodeRune function to get the size of the first rune, like so. Now I can comment 07:18.500 --> 07:19.730 out the for loop above. 07:20.270 --> 07:24.420 As you can see, I don't need to convert the bytes to a string value anymore. 07:24.950 --> 07:31.580 This code does the same as the above for loop, but it is efficient and it contains less code. Cool. 07:31.790 --> 07:37.120 As you can see, sometimes it is way easier to use the DecodeRune function than a for range loop. 07:37.760 --> 07:43.100 You can find more functions like the DecodeRune function in the utf8 and unicode packages. 07:43.110 --> 07:44.330 Please investigate them. 07:45.180 --> 07:48.090 All right, that's all for now. See you in the next lecture. Bye.
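The two approaches from this last example can be compared in a short sketch. The lecture's actual Turkish word is not in the transcript, so "öykü" is an assumed sample, and firstRuneSizeRange and upperFirst are hypothetical helper names. The in-place copy also assumes the uppercase form occupies the same number of bytes as the original rune, which holds for 'ö' and 'Ö'.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// firstRuneSizeRange finds the byte size of the first rune with a for range
// loop. It has to convert the byte slice to a string first.
func firstRuneSizeRange(word []byte) int {
	size := len(word) // a single-rune word spans the whole slice
	for i := range string(word) {
		if i > 0 {
			size = i // index of the second rune == size of the first rune
			break
		}
	}
	return size
}

// upperFirst uppercases the first rune of word in place, using
// utf8.DecodeRune to get its size without any string conversion.
func upperFirst(word []byte) []byte {
	_, size := utf8.DecodeRune(word)
	// Assumption: the uppercase bytes have the same length as the original.
	copy(word[:size], bytes.ToUpper(word[:size]))
	return word
}

func main() {
	word := []byte("öykü") // assumed sample word

	fmt.Printf("% x\n", word)             // c3 b6 79 6b c3 bc
	fmt.Println(firstRuneSizeRange(word)) // 2

	upperFirst(word)
	fmt.Printf("% x\n", word) // c3 96 79 6b c3 bc
	fmt.Println(string(word)) // Öykü
}
```

Only the first two bytes differ between the two hex dumps, since only those bytes belong to the first rune.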