AI and Machine Learning for Coders: A Programmer's Guide to Artificial Intelligence

Speaker 1

00:00

The world of AI and machine learning is just exploding, isn't it? And if you're a coder looking in, you might be thinking, okay, how do I actually get started here without needing a PhD?

Speaker 2

00:09

That's a really common question, and honestly, the tools available now have made it much more accessible for developers to jump in.

Speaker 1

00:16

Absolutely, and that's really our mission for this deep dive. We're digging into key parts of Laurence Moroney's book AI and Machine Learning for Coders. We want to pull out the most important bits for you.

Speaker 2

00:28

Think of it as a practical starting point. We're aiming to give you the essentials on deep learning, computer vision, and NLP, basically focusing on how you can use TensorFlow to tackle these things.

Speaker 1

00:39

Yeah, the perspective here is crucial. It's tailored for you, the coder. It's about the tools you can grab now and the problems you can start solving.

Speaker 2

00:47

Right, just like the book intends: equipping you to become an ML developer by focusing on actually doing it, not getting bogged down in just the theory or the super complex math.

Speaker 1

00:57

And people have really responded to that approach. Dominic Monn called it the much-needed practical starting point. Su Fu mentioned how it teaches the key building blocks so you can code AI for PCs, mobile, and the browser.

Speaker 2

01:10

Yeah, Laurence Moroney's vision was clearly about empowering developers, and, like Andrew Ng says in the foreword, great adventures await you. It's an exciting space.

Speaker 1

01:19

Okay, so let's unpack this. Where does machine learning really differ from, say, traditional programming, and what's the main platform we'll be talking about?

Speaker 2

01:28

Well, the fundamental difference is a kind of flip in thinking. In traditional programming, you write explicit rules, the rules act on data, and you get answers.

Speaker 1

01:36

Like coding a game like Breakout. Specifically, you write the logic: how the ball moves, what happens when it hits a brick, scoring, paddle misses. It's all defined rules.

Speaker 2

01:47

Exactly. Or think about activity detection from a wearable. You might write rules like, okay, if speed is over x, it's running. If it's between y and x, it's walking. You define the logic based on the data.

Speaker 1

01:57

But that approach hits a ceiling pretty quickly, right? What if you wanted to detect something way more complex, like golfing?

Speaker 2

02:05

Precisely. How do you write rules for that? The mix of swinging, pausing, walking. It gets incredibly hard, maybe even impossible, to define robust rules that cover every single variation by hand.

Speaker 1

02:18

So the old "rules act on data to give answers" method breaks down when the rules are just too fuzzy or complex to write yourself. And that's where ML comes in.

Speaker 2

02:28

Right. With machine learning, you kind of flip it. You provide the data and you provide the answers. We call those labels. Then the machine learning algorithm figures out the rules or patterns connecting the data to those answers.

Speaker 3

02:39

Ah okay.

Speaker 1

02:39

So for the activity example, you'd give it sensor data from someone walking, running, biking, and golfing, and you'd label those chunks of data: this part is walking, this part is golfing.

Speaker 2

02:48

Yep. And the ML algorithm looks at all that labeled sensor data, maybe acceleration, rotation, time, whatever, and it learns the underlying patterns that distinguish golfing from the others. It derives the complex rules you couldn't realistically write. That's the core shift. It's pretty powerful.

Speaker 1

03:03

And the platform that's really designed to put this power into coders' hands is TensorFlow.

Speaker 2

03:08

TensorFlow, yeah. It's this huge open source platform for building and using ML models. Its real value, I think, is that it handles a lot of the underlying complexity. It implements common algorithms and common patterns.

Speaker 1

03:20

So you, the coder, can focus more on the actual problem you're trying to solve with ML, and less on, say, implementing backpropagation from scratch.

Speaker 2

03:27

Exactly. Focus on the scenario. And TensorFlow is built to be flexible. You can deploy the models you create almost anywhere: web, cloud, mobile apps on Android or iOS, even tiny embedded systems.

Speaker 1

03:39

And how do you typically work with it? Python, installs, IDEs?

Speaker 2

03:42

All of the above, really. You can pip install it in Python, use it in IDEs like PyCharm, or, a really popular way, use cloud environments like Google Colab, which gives you access to GPUs and TPUs without needing the hardware yourself.

Speaker 1

03:57

Okay, let's drill down. What does the simplest possible example of this learning look like? Like the basic building block?

Speaker 2

04:03

Right. So imagine teaching a network a really simple linear relationship like y = 2x - 1. You'd give examples: if x is 1, y is 1; if x is 2, y is 3; if x is 3, y is 5; and so on. A tiny neural network, even one with just a single neuron, can learn this. It does it by adjusting two internal values: a weight it multiplies the input x by, and a bias it adds.
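As a rough illustration of that single neuron, here is a minimal sketch in plain Python rather than TensorFlow, so the weight and bias updates are visible. The training points, learning rate, and step count are assumptions chosen for the example, and the update rule is ordinary gradient descent on mean squared error, not necessarily the exact optimizer setup the book uses.

```python
# Plain-Python sketch of one neuron learning y = 2x - 1.
# The data, learning rate, and step count are illustrative assumptions.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x - 1.0 for x in xs]


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit weight and bias with gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every example with the current parameters.
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error.
        grad_weight = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = 2.0 * sum(errors) / n
        # Nudge both parameters against their gradients.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


weight, bias = train(xs, ys)
prediction = weight * 10.0 + bias  # what the neuron says for x = 10
print(weight, bias, prediction)
```

The learned weight and bias should land very close to 2 and -1. A real framework run with fewer epochs typically stops a little short of the exact values.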

Speaker 1

04:29

So it's basically learning the 2 and the -1 from our equation. Those learned numbers, the weight and bias, those are the parameters of the network.

Speaker 2

04:35

Exactly. And that's a really important distinction for coders getting into ML. Parameters, the weights and biases, are what the network learns from the data. They're different from hyperparameters.

Speaker 1

04:45

Right. Hyperparameters are the knobs you turn before training starts, aren't they? Things that control the learning process itself.

Speaker 2

04:51

Yeah, things like the learning rate, how quickly it adjusts weights, or the number of neurons you decide to put in a layer, or how many epochs, meaning how many times it sees the whole dataset. You experiment with these to get better results. And neurons also usually have something called an activation function. It's like a little function that processes the neuron's output. A common one is ReLU, the rectified linear unit. It basically just passes the value through if

05:15

it's positive and outputs zero otherwise. This adds nonlinearity, which is crucial for learning anything beyond simple lines.
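A tiny sketch of that activation, in plain Python. The `neuron` helper and the example weight and bias are hypothetical names and values used only for illustration.

```python
def relu(value):
    """Pass positive values through unchanged; output zero otherwise."""
    return value if value > 0 else 0.0


def neuron(x, weight, bias):
    """One neuron: weighted input plus bias, followed by ReLU."""
    return relu(weight * x + bias)


# Negative pre-activations are clipped to zero, positive ones pass through.
outputs = [neuron(x, weight=2.0, bias=-1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
print(outputs)
```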

Speaker 1

05:22

Okay, makes sense. We've got the basic shift to learning, the platform, the simplest building block. Let's apply this to a huge area: computer vision, making machines see.

Speaker 2

05:31

So at its core, an image is just a grid of numbers, right? Pixels. A small grayscale image, like one from the Fashion MNIST dataset, might be 28 by 28 pixels. Each pixel has a value, say 0 to 255, for how bright it is.

Speaker 1

05:46

And color images just have more numbers per pixel, usually three, for red, green, and blue.

Speaker 2

05:50

Three channels.

Speaker 3

05:51

Yeah.

Speaker 2

05:51

Now if you try to feed those raw pixel values directly into that simple single-neuron network we just talked about, or even a basic multilayer network, well, it would really struggle, because a simple network doesn't understand spatial structure. It just sees a flat list of numbers. It might learn to recognize, say, a sneaker, if it's exactly like the ones in the training data: same position, same angle.

Speaker 1

06:13

Ah, but if you show it the sneaker slightly rotated, or maybe a different type of sneaker, like a high heel...

Speaker 2

06:19

It might completely fail. It hasn't learned the features that make something a sneaker. It just memorized specific pixel patterns in specific locations.

Speaker 3

06:27

Okay, so that's the problem...

Speaker 1

06:28

...that convolutional neural networks, or CNNs, are designed to solve.

Speaker 2

06:31

Exactly. CNNs are built to automatically find and learn hierarchical features in images, things like edges, textures, shapes, objects, regardless of where they appear in the image.

Speaker 3

06:43

How do they do that? What are the core ideas?

Speaker 2

06:44

Two main operations: convolutions and pooling. Convolutions use small filters, like maybe a three-by-three grid of weights, that slide across the image. Each filter is trained to detect a specific local pattern, maybe a vertical edge or a certain curve or texture.

Speaker 1

07:00

So the filter scans the image and produces a sort of map showing where it found that pattern.

Speaker 2

07:04

Precisely. And applying a filter reduces the image dimensions slightly. A 3x3 filter on a 28x28 image gives you a 26x26 output map.

Speaker 3

07:16

Okay, and pooling?

Speaker 2

07:17

Pooling layers then reduce the size of these feature maps, making the representation smaller and more manageable while keeping the most important information. A common type is max pooling. You might take a 2x2 area and just keep the maximum value, throwing away the other three. This halves the dimensions but keeps the strongest signal for that feature in that region.
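To make the size arithmetic concrete, here is a plain-Python sketch of a 3x3 filter and 2x2 max pooling on a 28x28 grid. The half-dark, half-bright test image and the hand-written vertical-edge filter are assumptions for illustration; in a real CNN the filter weights are learned rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image with no padding and no kernel flip,
    which is how deep learning libraries usually define a convolution."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h = len(feature_map) // 2
    w = len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
            for c in range(w)
        ]
        for r in range(h)
    ]


# Illustrative 28x28 image: dark left half, bright right half.
image = [[0 if c < 14 else 255 for c in range(28)] for _ in range(28)]
# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = convolve2d(image, vertical_edge)  # 26x26
pooled = max_pool2x2(feature_map)               # 13x13
print(len(feature_map), len(feature_map[0]), len(pooled), len(pooled[0]))
```

The feature map is nonzero only in the columns that straddle the edge, and pooling keeps that strong response while halving each dimension.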

Speaker 1

07:36

And by stacking these convolution and pooling layers...

Speaker 2

07:39

The network builds up understanding. Early layers find simple edges. Middle layers combine edges into corners and textures. Deeper layers combine those into parts of objects, and then whole objects. The key thing for you as a coder is that the CNN automates this feature extraction. You don't have to hand code edge detectors anymore.

Speaker 3

07:56

Right, let's make it concrete.

Speaker 1

07:58

The book uses a horse-or-human classifier example.

Speaker 2

08:00

Yeah, that data set uses bigger color images, maybe three hundred by three hundred pixels with three color channels. So the input shape is three hundred by three hundred by three.

Speaker 1

08:08

And since it's just two classes horse or human, it's binary classification. You can use just one neuron in the final output layer.

Speaker 2

08:16

You can, and you typically attach a Sigmoid activation function to it. Sigmoid squashes any input value into a range between zero and one, perfect for probability. You can interpret the output as say, the probability that the image is a human.

Speaker 1

08:30

The source mentioned a specific failure case, though, where a model trained on this data set saw a picture of just like the top half of a person and classified it as a horse. Why would that happen? It seems like a common beginner frustration.

Speaker 2

08:44

It often boils down to the training data and overfitting. If your training set mostly contains full-body pictures of humans standing up and maybe horses in profile, the model learns those specific views. When it sees something unusual, like only the upper body of a person, perhaps in a pose it hasn't seen, it might latch onto some features, maybe texture, maybe background, that it learned were associated with horses in

09:05

the training data. It hasn't generalized the concept of human well enough outside the examples it saw.

Speaker 1

09:10

Okay, so this overfitting. Doing great on training data but failing on new stuff. How do you fight that, especially if you don't have tons of data?

Speaker 2

09:19

Several really good techniques. One is image augmentation. It's clever. During training, you don't just feed the network your original images. You apply random transformations on the fly: maybe rotate the image slightly, zoom in or out a bit, shift it horizontally or vertically, flip it.

Speaker 1

09:35

Ah, so you're essentially creating slightly modified versions of your existing images, making the data set seem bigger and more varied.

Speaker 2

09:42

Exactly. The model learns that a horse is still a horse even if it's tilted or slightly zoomed. It becomes more robust. TensorFlow's ImageDataGenerator makes this super easy to set up.
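The idea behind augmentation can be sketched in plain Python on a tiny grid of pixel values. This is not the ImageDataGenerator API; the helper names, the 50 percent flip chance, and the one-pixel shift limit are all assumptions made for the illustration, and rotation and zoom are left out to keep it short.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, filling vacated cells with `fill`."""
    width = len(image[0])
    return [[fill] * pixels + row[: width - pixels] for row in image]


def augment(image, rng):
    """Return a randomly transformed copy; the original stays untouched."""
    result = [row[:] for row in image]
    if rng.random() < 0.5:                      # flip about half the time
        result = flip_horizontal(result)
    return shift_right(result, rng.randint(0, 1))  # shift by 0 or 1 pixel


image = [[1, 2, 3],
         [4, 5, 6]]
rng = random.Random(0)  # seeded so a run is repeatable
variants = [augment(image, rng) for _ in range(4)]
print(variants)
```

Each call yields a slightly different version of the same picture, which is what makes a small dataset look larger and more varied to the network.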

Speaker 3

09:52

What else?

Speaker 2

09:53

Another huge one is transfer learning. This is incredibly powerful. You might only have a few thousand horse and human images, but other people have trained enormous models, like MobileNet or Inception, on millions of images covering, say, one thousand different categories in the ImageNet dataset. Those massive models have already learned really, really good general-purpose feature extractors

10:15

in their early convolutional layers. They know how to detect edges, textures, shapes, basic object parts, things useful for recognizing any image.

Speaker 1

10:26

So with transfer learning, you take those pre trained layers.

Speaker 2

10:29

Yep, you basically chop off the original final layers that were specific to the one thousand ImageNet classes. You freeze the weights of the early layers so they don't change, and you add your own new classification layers on top, maybe just a couple of layers ending in that single sigmoid neuron for your horse-or-human task.

Speaker 1

10:47

And you only train your new small layers using your smaller data set.

Speaker 2

10:50

Mostly, yes, or sometimes you fine-tune by unfreezing the last few pretrained layers and training them a tiny bit too. But the point is you're leveraging all the knowledge learned from the giant dataset for your specific problem. It's a massive shortcut. TensorFlow Hub is a great place to find these pretrained models.

Speaker 3

11:08

That makes a lot of sense. Any other tricks for overfitting?

Speaker 2

11:10

Dropout regularization is another common one. During training, for each batch of data, you randomly drop out, meaning temporarily ignore, a certain percentage of the neurons in a layer.

Speaker 3

11:20

Wait, you just turn them off? Why?

Speaker 2

11:21

It sounds weird, but it forces the network to learn more robust and redundant representations. It prevents any single neuron from becoming overly specialized or critical for making predictions based on the training data. It encourages the network to distribute the learning. You often see the training accuracy and the validation accuracy, meaning performance on data held back from training, stay much closer together when you use dropout.
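Here is a plain-Python sketch of dropout on one layer's outputs. It uses the common "inverted" form, which scales the surviving values up during training so the layer behaves the same at inference. That scaling detail is not mentioned in the conversation and is included here as an assumption about typical implementations.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of activations while training.

    Survivors are scaled by 1 / (1 - rate) so the expected total stays the
    same; at inference time the activations pass through unchanged.
    """
    if not training:
        return list(activations)
    keep_probability = 1.0 - rate
    return [
        a / keep_probability if rng.random() >= rate else 0.0
        for a in activations
    ]


rng = random.Random(42)  # seeded only to make a run repeatable
layer_output = [1.0] * 1000
dropped = dropout(layer_output, rate=0.5, rng=rng)
print(sum(1 for value in dropped if value == 0.0), "of 1000 neurons dropped")
```

A different random subset is ignored on every batch, so no single neuron can become indispensable.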

Speaker 1

11:46

Okay, got it. So that's a good overview for images. What about the other big area, text, natural language processing? How do machines start to understand language?

Speaker 2

11:54

Well, like images, text needs to be converted into numbers first. The initial step is usually tokenization.

Speaker 1

12:01

Breaking sentences down into words, or maybe even parts of words, and giving each unique piece a number ID, like "the" is 1, "cat" is 2, "sat" is 3.

Speaker 2

12:09

Exactly. You build a vocabulary, a mapping from words to integer tokens. Good tokenizers handle things like punctuation. Maybe "today?" just becomes the token for "today." And crucially, you need a plan for words that weren't in your training vocabulary: an out-of-vocabulary, or OOV, token.

Speaker 1

12:28

Once you have tokens, you turn sentences into sequences of these numbers. "The cat sat" might become the sequence 1, 2, 3.

Speaker 2

12:36

Right. And because neural networks usually need fixed-size inputs, you have to make all your sequences the same length. You either pad shorter sequences, usually with zeros, or you truncate longer ones.
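The tokenize, sequence, and pad steps can be sketched in plain Python as follows. The numbering scheme is an assumption for illustration: 0 is reserved for padding, words get IDs in order of first appearance, and the OOV token takes the next free ID. This sketch pads and truncates at the end of the sequence; library tokenizers may number words differently and may pad at the front by default.

```python
import string

PAD_ID = 0  # reserved for padding (an assumption of this sketch)


def tokenize(sentence):
    """Lowercase, strip punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return sentence.lower().translate(table).split()


def build_vocab(sentences):
    """Map each distinct word to an integer ID, starting at 1."""
    vocab = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def to_sequence(sentence, vocab):
    """Replace words with IDs; unknown words share one OOV ID."""
    oov_id = len(vocab) + 1
    return [vocab.get(word, oov_id) for word in tokenize(sentence)]


def pad_or_truncate(sequence, max_len):
    """Force a fixed length by padding or cutting at the end."""
    return (sequence + [PAD_ID] * max_len)[:max_len]


vocab = build_vocab(["The cat sat", "Is it sunny today?"])
known = to_sequence("The cat sat", vocab)
with_unknown = to_sequence("The dog sat today!", vocab)  # "dog" is OOV
print(pad_or_truncate(known, 5), pad_or_truncate(with_unknown, 3))
```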

Speaker 3

12:46

How do you pick the length?

Speaker 2

12:48

You typically look at the distribution of sentence lengths in your data. Maybe ninety-five percent of your sentences are shorter than, say, eighty-five words, so you might pick eighty-five as your max length to minimize padding while capturing most sentences fully.

Speaker 1

13:00

And sometimes you clean the text first? Remove HTML, maybe common words?

Speaker 2

13:04

Yeah, preprocessing is often important: removing HTML tags, maybe converting to lowercase. Sometimes you remove stop words, common words like "is," "it," or "the" that might not carry much specific meaning for your task. So "Is it sunny today?" might become tokens for just "sunny today."

Speaker 1

13:23

Okay, so we have sequences of numbers. But just assigning arbitrary IDs like 1, 2, 3 doesn't tell the model anything about meaning, right? Cat, 2, isn't inherently related to dog, maybe 50.

Speaker 2

13:35

Exactly. That's where embeddings come in. This is a really key concept in NLP.

Speaker 3

13:38

Right.

Speaker 2

13:39

Embeddings represent words not just as single numbers, but as vectors, lists of numbers in a multidimensional space. Think of it like giving each word coordinates on a complex map.

Speaker 1

13:49

And the idea is that words with similar meanings end up closer together on this map.

Speaker 2

13:52

Precisely. King and queen might have similar vectors. Walking and running might be close. The relationships between words are captured by their relative positions in this embedding space.
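One common way to measure "closeness" between embedding vectors is cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned during training and usually have more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means they point
    in almost the same direction, near 0.0 means they are unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-picked toy vectors, NOT learned embeddings.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "walking": [0.1, 0.2, 0.9],
    "running": [0.2, 0.1, 0.95],
}

royal = cosine_similarity(embeddings["king"], embeddings["queen"])
motion = cosine_similarity(embeddings["walking"], embeddings["running"])
mixed = cosine_similarity(embeddings["king"], embeddings["walking"])
print(royal, motion, mixed)
```

With these toy values, the related pairs score much higher than the unrelated pair, which is the property a trained embedding layer is expected to pick up from usage.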

Speaker 1

14:02

The book uses a cool example with Pride and Prejudice characters, right? Plotting them based on learned dimensions like gender or nobility.

Speaker 2

14:08

Yeah, that's a great way to visualize it. Mr. Darcy and Elizabeth Bennet might be positioned based on these learned semantic features. The key is that the network learns these vector representations. The dimensions aren't predefined. They emerge from how words are used together in the training text.

Speaker 1

14:26

So the network figures out that king is used in similar contexts to queen, but maybe also related to man, while queen is related to woman.

Speaker 2

14:35

Right, and you can even optimize the number of dimensions in your embedding vectors. A rule of thumb is maybe the fourth root of your vocabulary size. So for a few thousand words, maybe seven or eight dimensions is enough, instead of, say, sixteen or thirty-two. It trains faster without losing too much meaning.
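The fourth-root rule of thumb is quick to check numerically. The function name below is hypothetical, and the result is only a starting point to experiment from.

```python
def suggested_embedding_dim(vocab_size):
    """Rule of thumb: about the fourth root of the vocabulary size."""
    return round(vocab_size ** 0.25)


for size in (2000, 4096, 10000):
    print(size, suggested_embedding_dim(size))
```

Vocabularies of a few thousand words come out at seven or eight dimensions, which matches the figures mentioned above.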

Speaker 1

14:52

But if you just average the embedding vectors for all words in a sentence, you lose word order, don't you? It becomes a bag of words.

Speaker 2

14:58

That's a major limitation of simple embedding approaches. Word order is critical in language. "Dog bites man" versus "man bites dog": totally different meanings, same bag of words.

Speaker 1

15:07

So to handle sequence and context, we need something more sophisticated, like recurrent neural networks, RNNs?

Speaker 2

15:13

Exactly. RNNs are designed from the ground up for sequential data. They have a kind of internal memory or state that gets updated as they process each word or token in a sequence. This state carries context from previous words forward.

Speaker 1

15:26

But you mentioned simple RNNs can struggle with long sentences. They might forget important context from the beginning.

Speaker 2

15:33

Yeah, that's the vanishing gradient problem. Essentially, the influence of early words can fade out over long sequences. If you have a sentence like "I grew up in France, so I speak fluent...," the word "France" early on is key to predicting "French" at the end. A simple RNN might lose that connection.

Speaker 1

15:50

Okay, and that's why LSTMs, long short-term memory networks, were developed?

Speaker 2

15:54

Right. LSTMs are a special type of RNN. They have internal mechanisms called gates that explicitly control what information to remember, what to forget, and what to output. This makes them much, much better at capturing long-range dependencies in sequences.

Speaker 1

16:09

And then there are bidirectional LSTMs. How do they improve things?

Speaker 2

16:12

So a standard LSTM reads the sequence from start to end. A bidirectional LSTM has two LSTMs. One reads forwards, the other reads backwards. Then it combines their outputs at each step.

Speaker 1

16:25

Ah, so it gets context from both directions.

Speaker 2

16:27

Exactly. For understanding language, this is often really powerful. Think about sentiment analysis. Sometimes the key word determining the sentiment comes late in the sentence. Or for predicting a missing word, knowing the words that come after it is just as important as knowing the words before it.

Speaker 1

16:45

Like that "I lived in <country>" example with Gaelic, right? Seeing "Gaelic" later helps figure out the country.

Speaker 2

16:49

Precisely. The backward pass provides that future context. And you can feed pretrained embeddings, like GloVe vectors, into these LSTMs to give them a head start on word meaning.

Speaker 1

17:00

Okay, so we can use these models to understand text. What about generating text? How does that work?

Speaker 2

17:05

The core idea is pretty straightforward. Actually, you train a model to predict the next word in a sequence given the preceding words.

Speaker 1

17:12

So if your training data has the sentence "the quick brown fox," you'd create training examples like: input "the," label "quick"; input "the quick," label "brown"; input "the quick brown," label "fox."

Speaker 2

17:24

Exactly. You slide a window across your text corpus, creating these input sequences and their corresponding next-word labels. The labels are usually one-hot encoded: a vector of zeros with a single one at the index corresponding to the correct next word in your vocabulary.
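A plain-Python sketch of building those input and label pairs, with one-hot labels. The four-word vocabulary and the choice to reserve index 0 for padding are assumptions for the example.

```python
def next_word_examples(sentence, vocab):
    """Turn one sentence into (input token IDs, next-word ID) pairs."""
    ids = [vocab[word] for word in sentence.lower().split()]
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


def one_hot(index, size):
    """A vector of zeros with a single 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


# Toy vocabulary; index 0 is left free for padding.
vocab = {"the": 1, "quick": 2, "brown": 3, "fox": 4}
examples = next_word_examples("the quick brown fox", vocab)
labels = [one_hot(label, len(vocab) + 1) for _, label in examples]

for (inputs, _), label in zip(examples, labels):
    print(inputs, "->", label)
```

In practice the variable-length inputs would still be padded to a common length before training.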

Speaker 1

17:39

And the model architecture would be similar: an embedding layer, then maybe an LSTM or bidirectional LSTM?

Speaker 2

17:44

Yep, very common. Then to generate text, you start with a seed text, maybe a word or a phrase. You feed that seed sequence into your trained model. It outputs a probability distribution over all the words in your vocabulary for what the next word is most likely to be.

Speaker 1

18:00

You pick a word based on those probabilities, maybe the most likely one.

Speaker 2

18:04

Usually, yeah, or sometimes you sample from the distribution to get more variety. Then you append that predicted word to your seed text. Now you have a slightly longer sequence.

Speaker 1

18:13

And you feed that new sequence back into the model to predict the next word, and repeat.

Speaker 2

18:17

That's the loop: keep feeding the growing sequence back in, predicting the next word, appending it.
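The generation loop itself is short. In the sketch below a simple bigram counter stands in for the trained network, which is a deliberate simplification: it only looks at the last word, whereas an LSTM would use the whole sequence. The corpus and function names are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count which word follows which; a stand-in for a trained network."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, sequence):
    """Most likely next word after the last word, or None if unseen."""
    candidates = model.get(sequence[-1])
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, seed_text, num_words):
    """Seed, predict, append, and feed the longer sequence back in."""
    sequence = seed_text.lower().split()
    for _ in range(num_words):
        next_word = predict_next(model, sequence)
        if next_word is None:
            break
        sequence.append(next_word)
    return " ".join(sequence)


model = train_bigram_model("the cat sat on the mat the cat sat on the sofa")
generated = generate(model, "the", 5)
print(generated)
```

Always taking the single most likely word makes the output loop back on itself quickly, which is one reason sampling from the distribution is sometimes preferred.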

Speaker 1

18:22

The book had an example using song lyrics, right? Starting with "in the town of Athy."

Speaker 2

18:28

Yeah. And the model correctly predicted "one," which was the actual next word in the song it was trained on. And using different seeds, like "sweet Jeremy saw Dublin," produced other plausible next words based on the patterns learned from the lyrics.

Speaker 1

18:41

Though it's fair to say these simple generation models can often start repeating themselves or outputting stuff that doesn't make much sense after a while.

Speaker 2

18:48

No, definitely, they can descend into gibberish quite quickly. Getting coherent long-form text generation is much harder. It often involves more complex architectures, maybe stacking multiple LSTMs, careful tuning of hyperparameters, and more sophisticated sampling strategies. The generated Shakespearean text in the book is an example of getting something more structured, even if parts are nonsensical.

Speaker 1

19:11

Right. Okay, images, text. What about data where the sequence is time itself? Time series data, like weather or stock prices.

Speaker 2

19:20

Yeah, time series data is everywhere: weather forecasts, stock market trends, sensor readings over time, even something like Moore's law tracking transistor density. It's all data points ordered chronologically.

Speaker 1

19:33

And this kind of data often has specific characteristics you need to understand.

Speaker 2

19:36

Absolutely. There's often an overall trend: is the value generally increasing or decreasing over time? There's seasonality, patterns that repeat at regular intervals. Think daily temperature cycles, weekly website traffic spikes, yearly retail sales patterns. There's autocorrelation, meaning the value at one point in time is correlated with values at previous points, like if it's hot today, it's probably going to be warmish tomorrow, or maybe a predictable decay after some event.

Speaker 1

20:05

And then there's always just random noise, right? Fluctuations you can't really predict.

Speaker 2

20:10

Exactly. Understanding these components helps in modeling.

Speaker 1

20:13

So how do you prep time series data for an ML model? It's not exactly sentences.

Speaker 2

20:18

The standard technique is windowing. You essentially turn the time series prediction problem into a supervised learning problem.

Speaker 3

20:24

How does that work?

Speaker 2

20:25

You create fixed size input sequences or windows of past data points. For example, a window might be the last thirty days of temperature readings, and the corresponding label for that window is typically the value you want to predict, maybe the temperature reading for the next day, day thirty one.
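A minimal plain-Python sketch of that windowing step. The short temperature-like series and the window size of 3 are assumptions chosen to keep the output readable; the 30-day window described above works the same way.

```python
def make_windows(series, window_size):
    """Pair each run of `window_size` values with the value that follows."""
    return [
        (series[i : i + window_size], series[i + window_size])
        for i in range(len(series) - window_size)
    ]


series = [10, 11, 12, 13, 14, 15]
pairs = make_windows(series, window_size=3)
for window, label in pairs:
    print(window, "->", label)
```

Each pair is one supervised training example: the window is the input and the next value is the label.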

Speaker 1

20:43

Okay, so you slide this window across your entire time series history, creating lots of input-window, next-value pairs?

Speaker 2

20:51

Precisely. That becomes your training dataset.

Speaker 1

20:52

And before you build a fancy ML model, you'd probably want some simple baselines to compare against.

Speaker 2

20:58

Definitely, you need to know if your ML model is actually adding value. The simplest baseline is the naive forecast: just predict that the next value will be the same as the last observed value. So predict tomorrow's temperature will be the same as today's.

Speaker 1

21:10

Or maybe a moving average, averaging the values in the last window?

Speaker 2

21:13

Right. That smooths out noise, but often lags behind trends and doesn't capture seasonality well.

Speaker 2

21:19

You calculate the error of these baselines, maybe mean absolute error, MAE, and that gives you a target for your ML model to beat.
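Both baselines and the MAE calculation fit in a few lines of plain Python. The six-point series is made up for the example, and the helper names are hypothetical.

```python
def naive_forecast(series):
    """Predict each value as the one observed just before it."""
    return series[:-1]


def moving_average_forecast(series, window_size):
    """Predict each value as the mean of the `window_size` values before it."""
    return [
        sum(series[i : i + window_size]) / window_size
        for i in range(len(series) - window_size)
    ]


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [20.0, 22.0, 21.0, 23.0, 25.0, 24.0]

# The naive forecast covers every point after the first.
naive_mae = mean_absolute_error(series[1:], naive_forecast(series))
# A 3-point moving average covers every point after the third.
average_mae = mean_absolute_error(series[3:], moving_average_forecast(series, 3))
print(naive_mae, average_mae)
```

On this upward-drifting toy series the moving average scores worse than the naive forecast, a small instance of the lag behind trends mentioned above.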

Speaker 1

21:27

What ML models work well here on this windowed data?

Speaker 2

21:30

You can try basic dense neural networks, DNNs, but architectures that understand sequences are often better. You can use 1D convolutions. Similar to how 2D CNNs find spatial patterns in images, 1D CNNs can find patterns across consecutive timesteps within your window. You need to use causal padding, though, to make sure the convolution only looks at past data, not future data it shouldn't know about.
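Causal padding can be shown with a small plain-Python 1D convolution: the series is padded with zeros on the left only, so the output at each step depends on that step and earlier ones. The two-tap averaging kernel is an assumption for the example; a trained layer would learn its kernel values.

```python
def causal_conv1d(series, kernel):
    """1D convolution whose output at step t uses only inputs up to step t.

    Padding with len(kernel) - 1 zeros on the left keeps the output the same
    length as the input without ever reading future values.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(series)
    return [
        sum(padded[t + j] * kernel[j] for j in range(k))
        for t in range(len(series))
    ]


kernel = [0.5, 0.5]  # averages the previous and the current value
output = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)

# Changing only the last input must leave every earlier output untouched.
changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)
print(output, changed)
```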

Speaker 1

21:54

Makes sense. And RNNs, LSTMs, GRUs?

Speaker 2

21:58

Absolutely. Since time series data is inherently sequential, RNNs are a natural fit. They can maintain state across the timesteps within the window, potentially capturing complex temporal patterns and dependencies that simple DNNs or even CNNs might miss.

Speaker 1

22:11

So again, it's about experimenting with different window sizes, architectures, and hyperparameters to see what works best for your specific time series.

Speaker 2

22:18

Exactly, different data sets will respond better to different approaches.

Speaker 1

22:21

Okay, so we've trained all these amazing models for vision, language, time series. Now the crucial part for a coder: how do you actually use them in an application? Deployment.

Speaker 2

22:30

Right, if you want to run your model directly on a user's device, like in a native mobile app on Android or iOS, or maybe on an embedded system like a Raspberry Pi, the main tool is TensorFlow Lite.

Speaker 1

22:43

TFLite. And why run it on the device, on the edge? Why not just call a server API?

Speaker 2

22:48

Several big advantages. Latency: the prediction happens right there, instantly, with no network delay. Connectivity: it works even if the device is offline. And privacy: the user's data, like an image they want to classify, doesn't have to leave their device, which is huge nowadays.

Speaker 1

23:04

Okay, that makes sense. So how does it work? Let's take our simple y = 2x - 1 model again.

Speaker 2

23:09

So you train your model as usual, probably in Python using TensorFlow. You save the trained model. Then you use a tool called tf.lite.TFLiteConverter to convert your saved TensorFlow model into a special optimized .tflite format.

Speaker 3

23:22

So you get this lightweight .tflite file.

Speaker 2

23:24

Yep, it's usually much smaller. Then in your mobile app code, Java or Kotlin for Android, Swift or Objective-C for iOS, or your embedded system code, like Python or C++ on a Pi, you use the TFLite interpreter library. You load that .tflite file into the interpreter. You prepare your input data, making sure it has the exact shape and data type the model expects. For example, that might be a float32 array containing 10.0 with a shape of [1, 1].

Speaker 3

23:50

Gotta get the details right there.

Speaker 2

23:52

Absolutely. Then you pass that input to the interpreter, invoke it, run the prediction, and it gives you back the output tensor, which would contain something like 18.97.

Speaker 1

24:01

The prediction for y. What about more complex things, like running an image classifier on mobile?

Speaker 2

24:06

The core process is the same: convert to .tflite, use the interpreter. The trickier part is usually getting the image data into the right format before feeding it to the interpreter. How so? Mobile platforms have their own image formats, like Android's Bitmap or iOS's UIImage. Your CNN model, however, probably expects input as, say, a 224 by 224 by 3 tensor of floating point numbers

24:30

normalized between 0 and 1. So you need code to take the native image, resize it, extract the pixel values, maybe dealing with raw byte buffers and bit shifting for colors, normalize them, and arrange them into the correct tensor shape.
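The bit shifting and normalizing step can be sketched in plain Python, assuming each pixel arrives as one packed 32-bit ARGB integer (the layout Android bitmaps commonly expose). Resizing is left out, and the function names are hypothetical.

```python
def argb_to_normalized_rgb(pixel):
    """Unpack one 0xAARRGGBB integer into [r, g, b] floats in 0..1."""
    red = (pixel >> 16) & 0xFF
    green = (pixel >> 8) & 0xFF
    blue = pixel & 0xFF
    return [red / 255.0, green / 255.0, blue / 255.0]


def pixels_to_tensor(pixels, height, width):
    """Arrange a flat, row-major pixel list as a [height][width][3] tensor."""
    return [
        [argb_to_normalized_rgb(pixels[row * width + col]) for col in range(width)]
        for row in range(height)
    ]


# A 2x2 image: opaque red, green, blue, and black.
pixels = [0xFFFF0000, 0xFF00FF00, 0xFF0000FF, 0xFF000000]
tensor = pixels_to_tensor(pixels, height=2, width=2)
print(tensor)
```

On a real device the same shifts and masks would be written in the app's own language, and the result would be copied into the interpreter's input buffer.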

Speaker 1

24:41

All right, there's some data wrangling involved, yeah, and some platform-specific setup for the interpreter itself in the app project.

Speaker 2

24:47

Definitely. TFLite also offers optimization, particularly quantization. During the conversion process, you can tell it to convert the model's weights from 32-bit floats to, say, 8-bit integers. That makes the model much smaller, often a 4x reduction in file size, and it usually runs faster too, maybe a 2-3x speedup on mobile CPUs or specialized hardware like Edge TPUs.

Speaker 3

25:10

Is there a catch?

Speaker 2

25:11

There can be a small loss in accuracy because you're reducing precision. You always need to test the quantized model to see if the accuracy trade off is acceptable for your specific application.
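To see where both the size saving and the precision loss come from, here is a plain-Python sketch of one simple quantization scheme: a single shared scale that maps floats onto signed 8-bit integers. This is an illustrative simplification and not a description of what the TFLite converter does internally; the weights are made up.

```python
def quantize(weights):
    """Map floats to integers in -127..127 using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, restored, worst_error)
```

Each weight now needs 8 bits instead of 32, which is where a roughly 4x size reduction comes from, and the rounding error per weight is at most half the scale. Small weights lose the most relative precision, so testing the quantized model's accuracy matters.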

Speaker 1

25:21

Okay, so TFLite for native and embedded. What if you want ML in the browser or for a Node.js backend?

Speaker 2

25:27

Then you're looking at TensorFlow.js, or TF.js. It lets you define, train, and run ML models entirely in JavaScript.

Speaker 1

25:34

You can actually train models in the browser?

Speaker 2

25:36

You can. You can define a model layer by layer, similar to how you do it in Python, using JavaScript APIs. You'd use TF.js tensors for your data. Training often involves async/await because it happens asynchronously in the browser. You can build that y = 2x - 1 model, or even more complex things like image classifiers or models that work on CSV data, all within JavaScript.

Speaker 3

25:59

That's pretty cool.

Speaker 1

26:00

But maybe the biggest win for web devs is using models someone else has already trained.

Speaker 2

26:04

Absolutely, that's the power of pre-converted JavaScript models. Places like tensorflow.org and TF Hub offer many sophisticated models already converted to the TF.js format.

Speaker 1

26:14

So you can just load them and use them with minimal code?

Speaker 2

26:17

Exactly. There are models for toxicity detection in text, image classification using things like MobileNet, where you can just pass an img tag, and pose detection with PoseNet, which gives you coordinates of body joints from an image or video. You can integrate these powerful features into your web app pretty easily.

Speaker 1

26:33

Just load the model script, write a few lines of JavaScript to run inference.

Speaker 2

26:36

Pretty much, and you can even do transfer learning in TF.js. You can load a pretrained model like MobileNet, use a function like model.infer(img, embedding) to grab the internal feature embeddings for your own images, and then use those embeddings as input to train a small new TF.js model tailored to your specific classification task.

Speaker 3

26:57

Nice.

Speaker 1

26:58

Okay, one more deployment scenario. What if you need a dedicated, scalable server for running predictions, maybe lots of users hitting it at once?

Speaker 2

27:07

For that kind of robust production environment, you'd use TensorFlow Serving. It's specifically designed to deploy TensorFlow models as a high-performance inference server.

Speaker 3

27:15

What does it handle for you?

Speaker 2

27:16

It's built for low latency and high throughput. Crucially, it handles model versioning really well. You can deploy multiple versions of the same model, say v1 and a new v2 you just trained, and TensorFlow Serving can manage serving requests to either version or transition traffic smoothly.

Speaker 3

27:32

So you can A/B test models or roll out updates safely?

Speaker 2

27:35

Exactly. It typically exposes APIs like REST or gRPC. Your application sends an inference request with the data, TensorFlow Serving loads the correct model version, often managed via configuration files, runs the prediction efficiently, and sends the result back. It's built for reliable, scalable production use.
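A REST inference request of that kind is just a URL and a JSON body. The sketch below builds both in plain Python without sending anything. The host, the model name `linear_model`, and version 2 are hypothetical, and port 8501 is the REST port commonly seen in TensorFlow Serving examples, so treat it as an assumption about your deployment.

```python
import json


def build_predict_request(host, model_name, instances, version=None, port=8501):
    """Build the URL and JSON body for a predict call over REST.

    When `version` is given, the request targets that specific model version;
    otherwise the server decides which version to use.
    """
    version_part = f"/versions/{version}" if version is not None else ""
    url = f"http://{host}:{port}/v1/models/{model_name}{version_part}:predict"
    body = json.dumps({"instances": instances})
    return url, body


# Ask a hypothetical version 2 of the y = 2x - 1 model about x = 10.
url, body = build_predict_request("localhost", "linear_model", [[10.0]], version=2)
print(url)
print(body)
```

An application would POST the body to the URL with an HTTP client and read the predictions from the JSON response.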

Speaker 1

27:55

Okay, we've covered a ton of ground, from the basic concepts through vision, language, time series, and now all these ways to deploy models. This really pulls together the practical side from Moroney's book.

Speaker 2

28:07

Yeah, we started with that fundamental shift, learning rules from data instead of writing them, and saw how TensorFlow provides the platform for coders to do that.

Speaker 1

28:14

Then we dove into computer vision: CNNs, and dealing with data limits using augmentation and transfer learning.

Speaker 2

28:20

And NLP: tokenizing, embeddings to capture meaning, and RNNs like LSTMs, especially bidirectional ones, to understand sequence and context, even for generating text.

Speaker 1

28:30

We touched on time series: windowing data, and using CNNs and RNNs for prediction.

Speaker 2

28:34

And finally, getting those models out there: TFLite for devices, TF.js for the web, and TF Serving for scalable back-end inference.

Speaker 1

28:41

It really feels like these tools make AI and ML much more practical for coders today. You don't need to be a deep research scientist to start building things.

Speaker 2

28:51

Exactly. The focus shifts to understanding the techniques, when to use a CNN versus an LSTM, how to prepare your data, how to leverage pretrained models, and applying them to your problems.

Speaker 1

29:03

Which leads to a final thought, maybe. We talked about transfer learning using these powerful pretrained models, and also about running models right on the device with TFLite for speed and privacy. So thinking about those capabilities combined, what kinds of really complex, maybe even personalized, intelligent features could you start imagining for the applications you build? Things that maybe seemed impossible just a few years ago, but are now potentially within reach for a coder using these tools.

Transcript source: Provided by creator in RSS feed