ml5.js: KNN Classification Part 3

https://www.youtube.com/watch?v=JWsKay58Z2g

Transcript:

(00:00) [BELL RINGS] Welcome back. I'm going to finish this ml5 KNN classification example for like a gesture-based controller. So I just-- in between the last video and this one, I just sort of trained a quick KNN model where if my head is up here, it registers as up. If my head is down here, it registers as down.

(00:19) Over here would be to the left, and over here would be to the right. Now, you know, interestingly enough, because I have a green screen here and I'm a sort of quite a different color than the green screen, I could probably just do this with some image processing and like pixel manipulation to sort of like check where my head is, so with a lot of if statements.

(00:37) That's the sort of interesting thing about like, you can almost think of you doing machine learning as it's like this automatic if statement, I think, is one way of thinking about it. But I also should note that I'm actually almost somewhat surprised that this is working, because you have to remember, the image classification model MobileNet that this is built on is trained off of like a lot of things like dogs and cats and objects.

(01:01) It really has never seen people. There's no images in its database with like a green screen. So I would have actually expected it to actually think that all of these images wherever I am are kind of similar according to the way it thinks about the world. Whereas if I were to like do this, like that's going to be like remarkably different.

(01:21) So there might be some ways that you could use the sort of way, your understanding of how MobileNet was trained and what kind of images were in there to have other images that are the most distinguishable for it to like learn about. But this is something you probably have to do, you could do through research and you could do through trial and error.

(01:44) Interestingly enough, even if I move the computer around, is it still going to work? Up, down, left, right, so I'm really at a huge advantage here having the green screen. But if you're using this to build some kind of kiosk installation and you have-- it doesn't have to be green, but like a fixed background, that's going to allow users to play with it and have it work pretty well.

(02:04) All right, but the next thing that I need to do, right? So what I want to do is two things in this video. Number one, I want to be able to save it. So I should add that I spent all this time training it. As soon as I refresh the page, I'm just going to do it, it's gone. All of that training data is gone.

(02:17) So I want to be able to add saving, so let's add that first. And I'm just going to do that. I'm also just going to keep doing everything in keypress. If key equals S, I'm going to say knn.save, model.json. So I think this is what it is, but basically, everything that it's learned, all of those logits and labels are all sitting in memory, and I can spit them all out into a JSON file.
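A minimal sketch of the save-on-keypress step just described. It assumes `knn` is the KNN classifier object from the video and that it exposes a `save(fileName)` method, as the spoken call suggests. The helper name `handleSaveKey` is made up for illustration; in a p5 sketch it would be called from the key handler with the global `key`.

```javascript
// Hypothetical helper: call it from the p5 key handler as handleSaveKey(key, knn).
// Assumes `knn` has a save(fileName) method, as used in the video.
function handleSaveKey(key, knn) {
  if (key === 's' || key === 'S') {
    // Write every stored logits/label pair out as one JSON download.
    knn.save('model.json');
    return true; // the key was handled
  }
  return false; // some other key; nothing is saved
}
```

Passing the classifier in as a parameter keeps the helper independent of any global state, so it can be tried with a stand-in object before wiring it to the real classifier.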

(02:43) So let's just do that really quickly without a lot of training. So I'm just going to like arbitrarily train a bunch of things up and down. I'm going to click Save. And I'm going to go to the Downloads directory. You can see I now have a model.json file. I'm going to drag it into my code folder, and then I'm going to go over to Visual Studio Code and take a look at it.

(03:02) There it is. Look. This is actually the saved knn model. All of this stuff. You can see there's lots of configuration information up here. This is all the domain of TensorFlow.js, configuration stuff that's super interesting. You could go and we could do a deeper dive into that, but we don't have to worry about that.

(03:19) We can see that all of the numbers of all the logits are all in here. It's just a big file with all the stuff, all of the logits and all the labels and the configuration information that TensorFlow.js and ml5 need. Now, you might be thinking, wow, there's a lot of stuff in there. You know, how big of a file is that? And actually, it's just text, and it's just 561 kilobytes.

(03:40) So even if I had like hundreds and hundreds and hundreds, or thousands of training images, it's going to be fine. You have to remember, it's taking an image file, which we work with all the time, and boiling it down to just 1,000 numbers. So even though it is stored ultimately as text, it's much less data.
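To put rough numbers on "boiling it down to just 1,000 numbers", here is a small arithmetic sketch. The 224 by 224 RGB image size is an assumption (a common MobileNet input size); the actual webcam frame in the video may be larger, which would only make the reduction bigger.

```javascript
// Assumption: a 224 x 224 RGB input, a common MobileNet input size.
const IMAGE_WIDTH = 224;
const IMAGE_HEIGHT = 224;
const CHANNELS = 3;
const LOGITS_PER_IMAGE = 1000; // from the video: each image becomes 1,000 numbers

// Count the raw numbers needed to describe one image.
function valuesPerImage(width, height, channels) {
  return width * height * channels;
}

const pixelValues = valuesPerImage(IMAGE_WIDTH, IMAGE_HEIGHT, CHANNELS);
const reduction = pixelValues / LOGITS_PER_IMAGE;

console.log(`${pixelValues} pixel values vs ${LOGITS_PER_IMAGE} logits per image`);
console.log(`roughly ${Math.round(reduction)}x fewer numbers to store`);
```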

(03:56) And there are ways of optimizing the storage and all sorts of stuff like that. But ultimately, this is a perfectly reasonable thing to do. So now, let's actually do a better job at training it. Let's actually get a better model to train. [BELL RINGS] You might not realize this, but it's about four or five hours in real time that has passed since the moment ago when I was about to do the next thing in this tutorial.

(04:19) I ran into a pretty significant bug where ml5 didn't allow for saving a JSON file above a certain size, with a certain number of training images. And you can see right now, I just tested it. With 76 training images, it just worked. So I'm going to continue this tutorial as if I'm picking up right where I left off, but you should be aware that you're actually going to need a newer version of ml5.

(04:42) The version number will appear right here and be in the video's description in order to get the example to work with more than maybe 40 training images. So just be aware of that particular bug. Now, where was I? What I was about to do was try to train a better model with left, right, up, and down. And I am going to do that now.

(05:02) OK, here we go. I'm going to move-- now the thing is, I just have to-- I actually want us to realize this. Even though this is my left, that's to the right on the screen. So I need to call this to the right. Wait, oh, it's so confusing. No, oh, no, no. I'll figure this out later. I'm going to actually go to my left, which is to the right.

(05:26) It's fine. It's going to be OK. Here we go. Oh, no, let's speed this part up, because I'm going to train it for a while. You don't need to watch all of it. OK, so I trained it for a little while. Let's see how well it performs, and if it's not performing so well, I could add some more images.

(05:45) So this is to the right. Pretty good. This is to the left, pretty good. Where-- if I move off, like I probably should give it just like, I should probably like, let's tell it this is left also, even with just a sliver of me there. And then if I move down, down, down, down, so even with just a sliver of me, let's tell it it's down.

(06:09) OK, now up. Up, up, down, left. That's left, and right. OK, this is a good model. Let's now save the model. So I'm going to press S, and the model is saving. I am going to go to the Downloads folder. I'm going to take a look at it. Now look at this. This is actually 5.5 megabytes. And by the way, using the word model is perhaps a little bit misleading.

(06:37) I probably should have just called it knn.json because a KNN model is just the raw data. It's all of the logits with their labels. There's no neural network involved here at all. There was a network involved in generating the features of the image, but now we're just storing a lot of them so that we can do a nearest neighbor calculation.
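Because the saved file is just stored examples, the idea can be sketched with plain JSON. The `{ label, logits }` record shape below is invented for illustration and is not the actual layout of the file ml5 writes.

```javascript
// Illustrative dataset: each entry pairs a label with its feature numbers.
// Real entries would hold 1,000 logits; three are used here for brevity.
const dataset = [
  { label: 'up', logits: [0.9, 0.1, 0.0] },
  { label: 'down', logits: [0.1, 0.8, 0.2] },
];

// "Saving" is just serializing the stored examples to text.
function serializeDataset(examples) {
  return JSON.stringify(examples);
}

// "Loading" parses the text back; no training step is needed.
function deserializeDataset(text) {
  return JSON.parse(text);
}

const restored = deserializeDataset(serializeDataset(dataset));
console.log(restored.length, restored[0].label);
```

Nothing is learned or fitted when the data comes back in, which is why reloading the file is enough to resume classifying.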

(06:58) All right. So this needs to go into the knn directory. And hopefully we can see that it is here. There it is. It's a really big file. I could click on it. It kind of loads in Visual Studio Code, but I don't really need to look at it. OK, so now, the next thing that I need to do, right? If I refresh this page, it wants me to train again.

(07:20) So the next thing that I need to do is load the model. So I'm actually going to move this. I want to make sure before I load my knn model, the knn database, data set, what I want to do is make sure MobileNet has finished loading. So I actually got to take this out and put this in the model ready function.

(07:43) And then I am going to say knn.load, model.json. And I don't need to ever-- I don't need to train it again. I mean, I can train. I don't need to call go classify in draw the way I did before. That's no longer relevant. I can just call, oh, this needs a callback. So confusing. When the model is loaded, then I can say go classify.

(08:13) So I'm just putting an anonymous function callback in here. So when the first thing is load MobileNet. MobileNet's ready. Load my knn data set. My knn data set is ready. Start classifying images. So let's go back and refresh. Load MobileNet. Oh, I should put in, let's see. I'm gonna say console.log.
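The ordering just described (MobileNet ready, then the KNN data, then classification) can be sketched as nested callbacks. The three function parameters are hypothetical stand-ins, passed in so the ordering logic stands alone; in the video they correspond to loading MobileNet, the knn load call with `model.json`, and the classification function.

```javascript
// Hypothetical loaders are passed in so the ordering logic stands alone.
// Each loader takes a callback and invokes it once its work is finished.
function startWhenReady(loadFeatureExtractor, loadKnnData, startClassifying) {
  loadFeatureExtractor(() => {
    console.log('MobileNet loaded');
    // Only now is it safe to load the saved KNN data.
    loadKnnData(() => {
      console.log('KNN data loaded');
      // Both pieces are ready, so classification can begin.
      startClassifying();
    });
  });
}
```

The inner anonymous function plays the same role as the anonymous callback mentioned in the video: it runs only after the step before it has finished.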

(08:35) I'm gonna say MobileNet, let's clean this up and load it. Then I'm gonna say knn Data Loaded. And so now I'm going to refresh. Model loaded, knn loaded, and then if I go right, left, down, up, we've done it. We now have four classes being triggered by where I put my head. And let's move on to the next part, the exciting part, the sort of the part where you should really stop watching this video and come up with your own creative idea.

(09:06) But just for simplicity, I am going to put an x and a y, just to give you a demo. I'm going to make an x and a y. I'm going to start that x and y at width divided by 2 and height divided by 2. I am going to then, in draw, I'm going to get rid of this. Oh, and by the way, ignore this down here.

(09:30) I put in some extra code here to get this to work right now because it's not working in the ml5 library, but don't worry. It's-- by the time you watch this hopefully, there'll be a version in the ml5 library, and the example that I publish will be the version that doesn't have that extra code in there.

(09:44) But I added the corrected bug fix from ALCA, I put it directly in my code. Oh, boy. This is going to live on forever in this tutorial, but whatever. It's good. Good to see how the process works, and that's this. Your code is going to look like this, not like this. But this is my little hack extra Save function just to get it to work.

(10:01) OK, moving on. In draw, I'm going to say background 0, and I want to draw an ellipse at width divided by 2, height, no. I want to draw an ellipse at the x and y, with a size of 36. I'm making that up. I'm going to make it white. It always comes up here. I'm trying to like shove it down there, but my auto format brings it up.

(10:23) OK. Then, I don't need to draw the video anymore, however I would like to see the video. So I'm going to get rid of video hide, and let's just see what happens here. All right, so we've got the ellipse, and now I want to be able to move it to the left. Now left is really to the right in terms of it following me.

(10:45) But I want it to move the opposite way of how, I want it to mirror myself. I got it. Let's start with up and down. That's easy. OK, so the thing that I need to do is, I need a global variable. Why is this yelling at me? No, no, don't save. I need a global variable to keep track of what the label is, and I'll start that as just an empty string.

(11:05) When I classify, when I get a classification, I want to set that variable. I do want it displayed in the window so I can see it, but then, in draw, whatever that label is-- oh, I wonder if I should move it, let's just do it in draw. There's a variety of different ways I could do this, but I could just say if the label equals up, then y minus minus.

(11:32) Otherwise, if the label is down, then y plus plus. So I'm going to move x and y up and down. Here we go. Up, down. Up, down. Works very nicely. I would probably want to do like more of a physics simulation or something where I'm pushing it, but you get the idea. Now let's add left and right. This is the confusing part.

(11:56) If label equals left-- remember, left is my left, meaning I'm moving this way, but it's actually moving-- I'm moving this way, it's actually going to the right. But I want it to go to the left. I'm going to see myself going the wrong way. Oh, this is very confusing. No, but interaction-wise, I want it to go this way, so I'm going to say x minus minus.
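The label-driven movement can be sketched as a small pure function. The label strings come from the video; the helper names `clamp` and `stepPosition`, the `right` branch, and the clamping to the canvas bounds are filled in here as assumptions (the video adds a p5 `constrain` step for the same purpose shortly afterwards).

```javascript
// Clamp a value to [low, high]; p5's constrain() does the same job.
function clamp(value, low, high) {
  return Math.min(Math.max(value, low), high);
}

// Return the next position for a given label. Any other label
// (for example a "don't move" class) leaves the position unchanged.
function stepPosition(pos, label, width, height) {
  let { x, y } = pos;
  if (label === 'up') y--;
  else if (label === 'down') y++;
  else if (label === 'left') x--;
  else if (label === 'right') x++;
  // Keep the point on the canvas.
  return { x: clamp(x, 0, width), y: clamp(y, 0, height) };
}
```

In a p5 sketch this would be called once per frame in `draw`, with the most recent classification label and the canvas `width` and `height`.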

(12:22) I should mirror the image. Maybe what I'll do, just for the time being is x equals constrain between x, like just don't let it go off the screen. And y equals constrain, just to like leave this in the example. And then how do I mirror the image? I know how to draw it in reverse. Can I just do it in the DOM element? Like CSS magic? [BELL RINGS] Thank you once again to ALCA Design who reminded me that there's a CSS property transform, which I can give it basically what looks like the p5 scale function.

(12:54) I want to reverse the x-axis, keep the y-axis, flip it negative 1. This is some p5 code to apply some CSS, but certainly, I could just put it in HTML, et cetera. But now, when I go back and refresh, as I move to the right, now of course it's the opposite in what you're seeing. Like I could go and like high five myself.
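A sketch of the CSS mirroring step. It assumes `element` is a p5.Element, such as the video element returned by `createCapture`, whose `style(property, value)` method sets an inline CSS property; the helper name `mirrorHorizontally` is made up for illustration.

```javascript
// Assumes `element` is a p5.Element, whose style(property, value)
// method sets an inline CSS property on the underlying DOM node.
function mirrorHorizontally(element) {
  // scale(-1, 1): negate the x-axis, leave the y-axis unchanged.
  element.style('transform', 'scale(-1, 1)');
  return element;
}
```

The same effect could be written directly in a stylesheet as `transform: scale(-1, 1);` on the video element, which is the "put it in HTML" alternative mentioned above.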

(13:14) But what I'm seeing is exactly right. Like if I take myself off, whoops, now I'm not in the picture, OK? So watch this. I'm going to move to the right. Come on. I'm going to try to keep it in the center. That's going to be my game here. I'm not, up a little bit. OK. Down a little bit. I need to do a classification, another label, for like don't move.

(13:38) Right? So I could add one more label. I'm not going to save and train, but let's just see if this works. I really should go, oh, wait. You can't see me anymore. Here I am. I should add, where is my key press? Let's add else if key equals space, knn add example, logits. We'll call it stay. So stay, I don't actually even have to do another if statement, because it has to-- the stay is like not doing anything.

(14:16) So I can run this, and I'm going to give it some training data. Stay, stay, stay, stay, stay. So I'm giving a training, and so now it knows to stay. So I should move to the right, and I'm going to turn myself out right. To the left. And by the way, it's getting it wrong, so I can like just give it some more data.

(14:40) Like that should be left, right? Stay-- come on, stay, stay. Oh, I hit save by accident. Stay. Up, up, down. I can do this all day. All right. This is my result. So now what we have done, just to review. We have looked at loading the MobileNet model. I guess I should bring myself back here as well. We have looked at loading the MobileNet model, which is an image classification model trained on the ImageNet database.

(15:18) We are not using the classifications from that model. Rather, we are taking the logits, that layer before the last layer, before the softmax, before the probabilities are assigned. We are taking that. We are saving a lot of those logits, each paired with a label, and building up a big database. And then when we get new images, we think of all of those images with their labels as being in 1,000-dimensional space.

(15:44) We get a new image. It's somewhere in 1,000-dimensional space. What collection of neighbors is it closest to? And which label is that? That's its label. We get that result, and we do something with it with p5. So thank you for watching this tutorial sequence about ml5, TensorFlow.js, and KNN classification.
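The nearest-neighbor idea in this recap can be sketched in plain JavaScript. This is an illustration under stated assumptions, not ml5's implementation: it uses Euclidean distance and a simple majority vote, which is not necessarily the measure ml5 uses internally, and two-number feature vectors stand in for the 1,000 logits.

```javascript
// Euclidean distance between two equal-length feature vectors.
function distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Classify `query` by majority vote among its k nearest stored examples.
function knnClassify(examples, query, k = 3) {
  const nearest = examples
    .map((ex) => ({ label: ex.label, dist: distance(ex.features, query) }))
    .sort((p, q) => p.dist - q.dist)
    .slice(0, k);

  // Count one vote per neighbor.
  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }

  // Pick the label with the most votes; ties go to the label seen
  // first among the nearest neighbors.
  let best = null;
  for (const label of Object.keys(votes)) {
    if (best === null || votes[label] > votes[best]) best = label;
  }
  return best;
}

// Toy data: two features per example instead of 1,000 logits.
const examples = [
  { label: 'up', features: [0, 1] },
  { label: 'up', features: [0.1, 0.9] },
  { label: 'down', features: [0, -1] },
  { label: 'down', features: [0.1, -0.9] },
];
console.log(knnClassify(examples, [0, 0.8], 3)); // prints "up"
```

Adding more examples for a label that is being misclassified, as done in the video, simply adds more points near the problem region so the vote there changes.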

(16:03) Please, if you make some kind of interesting thing with this, please share it with me. Look in this video's description for a way to do that, but certainly @Shiffman on Twitter is a quick way to do that. Also, share it with @ml5js, so that ml5 can see it. OK, goodbye, thanks for watching. [BELL RINGS] [MUSIC PLAYING]
