AWS Certified Advanced Networking – Specialty (ANS-C01) Certification Guide: A pragmatic guide to acing the AWS ANS-C01 exam | CyberSecurity Summary podcast

Speaker 1

00:00

Welcome back to the deep dive. Today we're pulling back the curtain on something, well, fundamental yet often kind of unseen: the intricate world of cloud networking on AWS.

Speaker 2

00:11

That's right.

Speaker 1

00:12

You've sent us a stack of sources, looks like excerpts from an AWS certification book, and our mission is pretty clear: distill those crucial insights. We give you a shortcut, really, to being genuinely well informed about the hidden physics of the cloud. Let's unpack this.

Speaker 2

00:29

It is fascinating, isn't it? Because, you know, cloud services can feel almost like magic, but underneath they're built on these deeply rooted networking principles, just with a dynamic global twist. So today, yeah, we're going to try and uncover those aha moments. We'll go from the virtual network interfaces powering your instances all the way to how global traffic is managed and how it's secured, revealing the key bits and, importantly, the unseen challenges that keep everything running.

Speaker 1

00:56

Okay, so where do we even begin with something as big as cloud networking? Maybe starting right at the foundation makes sense.

Speaker 2

01:02

Let's talk about the elastic network interface, the ENI. The ENI.

Speaker 1

01:05

Right. So think of it like the cloud's virtual network card for your EC2 instances. Every single bit of network traffic, in and out, flows through an ENI. Exactly. But the key thing about ENIs, it seems, isn't just that they're virtual NICs. It's their flexibility, right? Absolutely, the flexibility is huge.

Speaker 2

01:21

You can attach them, detach them, in different states too.

Speaker 1

01:25

Like, you can do a hot attachment while the instance is actually running.

Speaker 2

01:29

Yep: hot attachment while it's running, warm attachment if it's stopped, or even cold attachment right when you launch it.

Speaker 1

01:34

Wow, And an instance can have more than one, like connected to different parts of your network.

Speaker 2

01:39

Yeah, definitely. You can have multiple ENIs on a single EC2 instance, each connected to a different VPC subnet, maybe for different security zones or traffic types. That detachment flexibility really changes how you think about high availability.

Speaker 1

01:52

Okay, so that's the interface. What about the addresses themselves inside AWS?

Speaker 2

01:56

Good question. Let's dive into IP addressing. So mostly you'll see VPCs using private IP ranges, you know, the standard RFC 1918 stuff.

Speaker 1

02:06

Right, like 10.x or 172.16.x. Exactly.
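
As a rough illustration of the RFC 1918 ranges just mentioned, the following sketch checks whether a CIDR block falls inside one of the three private IPv4 blocks. The function name `is_rfc1918` is made up for this example; only the standard-library `ipaddress` module is used.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """Return True if the IPv4 CIDR block sits entirely inside an RFC 1918 range."""
    network = ipaddress.ip_network(cidr)
    return any(network.subnet_of(block) for block in RFC1918_BLOCKS)
```

For instance, `is_rfc1918("172.16.5.0/24")` is true, while `172.32.0.0/16` falls just outside the 172.16.0.0/12 block.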

Speaker 2

02:11

But subnets can also allow for the auto-assignment of public IPv4 addresses when you launch an instance.

Speaker 1

02:18

Ah, okay. But wait, if those auto-assigned public IPs can just change, say if you stop and start the instance, how do you deal with services that absolutely need a fixed, unchanging external address, like a web server or something?

Speaker 2

02:34

That's a fantastic question, and that's precisely where Elastic IP addresses, or EIPs, come in. They're indispensable for that piece. They are static public IPv4 addresses. You basically allocate them to your AWS account, not directly to an instance initially.

Speaker 1

02:49

Ah okay, So they belong to the account.

Speaker 2

02:51

Right, and then you can associate an EIP with an ENI or directly with an EC2 instance. The crucial flexibility here is that the EIP isn't permanently tied to that specific piece of hardware, or virtual hardware.

Speaker 1

03:01

So you can move it around. Exactly.

Speaker 2

03:03

If an instance fails or you need to swap something out, you just reassociate that same EIP with a different instance or ENI. It gives you that stable public face for your applications, regardless of the underlying instance churn.

Speaker 1

03:15

That makes a lot of sense. Okay, building on managing IPs efficiently, the sources mentioned something called prefix lists. What are those about? How do they make life simpler?

Speaker 2

03:25

Prefix lists are actually quite clever. They're basically custom managed lists of IP address ranges, or prefixes. You maintain these lists and then you can reference them consistently in your network config.

Speaker 1

03:36

Like in security groups or route tables? Precisely.

Speaker 2

03:40

Instead of typing out or copying and pasting potentially huge lists of IP addresses over and over again, you just refer to the prefix list by its name. It simplifies policy creation immensely.

Speaker 1

03:51

Okay, so you define it once, use it many times. Exactly.

Speaker 2

03:53

And there are two types. You've got AWS-managed prefix lists, which AWS maintains for its own services. That makes it super easy to allow traffic to S3 or DynamoDB, for example.

Speaker 1

04:03

Oh nice.

Speaker 2

04:04

And then you have customer-managed prefix lists, where you define your own groups of IPs. Maybe you create one called all-dev-resources that includes all the CIDR blocks for your development VPCs. It makes managing access way easier.
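
The "define once, reference many times" idea can be sketched in plain Python. This is only a toy model, not the AWS API: the `PrefixList` and `AllowRule` classes, the `is_allowed` helper, and the example CIDRs are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class PrefixList:
    """A named, centrally maintained list of CIDR blocks (toy model)."""
    name: str
    cidrs: list = field(default_factory=list)

    def contains(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        return any(ip in ipaddress.ip_network(cidr) for cidr in self.cidrs)


@dataclass
class AllowRule:
    """A rule that references a prefix list instead of copying its CIDRs."""
    port: int
    source: PrefixList


def is_allowed(rules, address: str, port: int) -> bool:
    return any(r.port == port and r.source.contains(address) for r in rules)


# One list, referenced by two separate rules.
dev = PrefixList("all-dev-resources", ["10.10.0.0/16"])
rules = [AllowRule(443, dev), AllowRule(22, dev)]
```

Because both rules hold a reference to the same list, appending a new CIDR to `dev.cidrs` changes what every rule allows, which is the management benefit described above.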

Speaker 1

04:16

Right, I can see how that would tidy things up. Now for something that sounds a bit more mysterious, the hyperplane. What on earth is that?

Speaker 2

04:24

Yeah, it does sound a bit sci-fi, doesn't it? Think of Hyperplane as, like, the virtual network engine of AWS. It's the massively distributed underlying infrastructure that takes the physical network and slices it up virtually for every customer. That's what makes VPCs and all these services actually work.

Speaker 1

04:42

Okay, the engine behind the scenes. But what's surprising about it?

Speaker 2

04:46

What's surprising or at least really important to understand, is how it operates with these artificial limits. AWS puts these in place to ensure fair resource allocation across all tenants.

Speaker 1

04:56

Limits like bandwidth.

Speaker 2

04:57

Yeah, bandwidth or throughput limits are part of it, and they vary hugely depending on the service. You know, a transit gateway VPC attachment might go up to 50 gigabits per second, whereas a single VPN tunnel might top out at, say, 1.25 Gbps, and Direct Connect, depending on the port, maybe 1, 10, or even 100 Gbps. Now, these numbers kind of show the different scales you're working at.

Speaker 1

05:19

That huge difference between TGW and a VPN tunnel is striking. But you mentioned something else, something trickier than just bandwidth.

Speaker 2

05:29

Yes, and this is the one that catches people out, even experienced folks. It's the packets per second or PPS limitations.

Speaker 1

05:35

Packets per second. Why is that trickier?

Speaker 2

05:37

Because you can often hit the PPS limit before you hit the bandwidth limit, especially with lots of small packets, like certain types of application traffic, or maybe even a DDoS attack using small packets.

Speaker 1

05:48

And what happens?

Speaker 2

05:48

Then you start dropping packets silently. Your bandwidth monitors might look totally fine, nowhere near saturated, but packets are just vanishing into the ether because the hyperplane component handling your traffic can't process them fast enough.
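
A back-of-the-envelope sketch shows why small packets hit the PPS ceiling first. The limits used in the usage note (10 Gbps and 1,000,000 packets per second) are hypothetical numbers chosen for the arithmetic, not published AWS quotas, and the function names are invented for this example.

```python
def achievable_throughput_bps(bandwidth_bps, pps_limit, packet_bytes):
    """Throughput actually reachable when both a bandwidth cap and a PPS cap apply."""
    return min(bandwidth_bps, pps_limit * packet_bytes * 8)


def limiting_factor(bandwidth_bps, pps_limit, packet_bytes):
    """Report which cap is hit first for a given packet size."""
    # Packets per second needed to saturate the bandwidth cap at this packet size.
    pps_to_fill_pipe = bandwidth_bps / (packet_bytes * 8)
    return "pps" if pps_to_fill_pipe > pps_limit else "bandwidth"
```

With those assumed limits, 64-byte packets top out at 512 Mbps (1,000,000 × 64 × 8 bits), about 5% of the nominal bandwidth, so a bandwidth graph would look nearly idle while packets are being dropped. With 1,500-byte packets the bandwidth cap is reached first.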

Speaker 1

06:01

Ouch. So how do you even spot that if there are, as you said, no obvious signs?

Speaker 2

06:06

That is the challenge. It feels like a ghost in the machine. Diagnosing it usually means you need to look beyond just throughput. You need metrics on packet counts, maybe packet drop counters if the service exposes them, or you have to use tools like VPC flow logs or even VPC traffic mirroring, which we can get into later, to try and see what's actually happening at the packet level. It definitely defies traditional bandwidth troubleshooting.

Speaker 1

06:30

Okay, that's a really crucial, subtle point. So we've got the building blocks: ENIs, IPs, limits. But clouds aren't usually isolated islands, right? You need them to talk to each other.

Speaker 2

06:41

Absolutely, Connecting things is key.

Speaker 1

06:43

So moving beyond individual instances, how do we connect these cloud components? What's the simplest way? VPC peering?

Speaker 2

06:50

Yeah, VPC peering is often the starting point. It creates a direct private connection between two VPCs using AWS's backbone. It's pretty straightforward to set up. And actually there aren't explicit throughput limits imposed by the peering connection itself, beyond instance or other limits.

Speaker 1

07:07

Sounds good, but there have to be catches.

Speaker 2

07:09

Right. Oh, definitely, key considerations. First, it's non-transitive, meaning if VPC A is peered with VPC B and VPC B is peered with VPC C, VPC A cannot automatically talk to VPC C just by going through B. There's no implicit routing pass-through. You'd need a separate peering connection directly between A and C. Ah.
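
The non-transitive rule can be sketched as a lookup that only ever checks for a direct peering and never walks through an intermediate VPC. The `can_reach` helper and the VPC names are illustrative assumptions, not an AWS API.

```python
def can_reach(peerings, src, dst):
    """Peering is non-transitive: only a direct src<->dst peering counts."""
    return frozenset((src, dst)) in peerings


# A is peered with B, and B is peered with C.
peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}
```

Here `can_reach(peerings, "A", "C")` is false even though both A and C are peered with B. It only becomes true once a direct A-to-C peering is added to the set.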

Speaker 1

07:29

Okay, so no hub and spoke just using peering. Exactly.

Speaker 2

07:31

And the second big one, maybe even bigger: it absolutely cannot be used if the VPCs have overlapping CIDR ranges.

Speaker 1

07:38

Right, if both VPCs use 10.0.0.0/16, for example.

Speaker 2

07:42

Yep, peering just won't work. Routing across peered VPCs relies on static routes that you have to manually add to the route tables in both VPCs for traffic to flow back and forth.
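
A quick way to test for the overlap problem before attempting a peering is the standard-library `ipaddress` module. The helper name `can_peer` is an assumption for this sketch, and it checks only the CIDR condition, nothing else about peering eligibility.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Return True only if the two VPC CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)
```

Two VPCs that both use `10.0.0.0/16` fail this check, as does a `10.0.0.0/8` VPC against a `10.100.0.0/16` one, since the smaller block sits inside the larger.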

Speaker 1

07:51

That IP overlap thing, that sounds like a potential nightmare. You mentioned company mergers earlier. Imagine trying to connect two company networks that both picked, say, 10.100.0.0/16 independently. Peering is out. Is there any way to make workloads with overlapping IPs talk?

Speaker 2

08:08

You're right, it's a huge challenge in mergers or large organizations. VPC peering, it's a wall there. There is a solution, though it's not perfect: using private NAT gateways.

Speaker 1

08:18

NAT gateways? But usually those are for getting out to the internet.

Speaker 2

08:21

Correct, those are public NAT gateways, but you can also set up private NAT gateways. They allow workloads in one VPC to initiate connections to workloads in another VPC, even if they have overlapping IPs, because the NAT gateway handles the address translation on the way out.

Speaker 1

08:36

Ah, clever, but you said initiate.

Speaker 2

08:39

Yeah, that's the caveat. The communication generally has to be initiated from the side using the NAT gateway. It's not a truly transparent bidirectional connection like peering would be if the IPs didn't overlap. It solves a specific problem, but it's not a universal fix for overlapping CIDRs.

Speaker 1

08:55

Okay, so peering is simple but limited, especially by transitivity and IP overlap. How did AWS address the need for larger, more complex, maybe hub-and-spoke networks in the cloud?

Speaker 2

09:04

Well, the community first came up with solutions like the Transit VPC. This usually involves setting up dedicated EC2 instances running routing software, network virtual appliances or NVAs, in a central VPC to act as a hub.

Speaker 1

09:17

So building your own router in the cloud basically.

Speaker 2

09:19

Pretty much. It worked, but managing those NVAs, worrying about their scaling, high availability, it's complex. So AWS eventually released a managed service to solve this much more elegantly: the AWS Transit Gateway, or TGW. Transit Gateway.

Speaker 1

09:34

Okay, how's that different?

Speaker 2

09:35

TGW acts as a fully managed, highly scalable central cloud router or hub. You attach your VPCs, your VPN connections, your Direct Connect connections all to the TGW. It simplifies inter-VPC connectivity massively and also makes hybrid networking, connecting to on premises, much cleaner. It takes the routing burden off you and puts it into a managed AWS service. A true hub-and-spoke model becomes easy.

Speaker 1

09:59

Got it. So TGW is the modern way for complex connectivity. Now, speaking of hybrid, what about that dedicated link you mentioned, Direct Connect, or DX? Why would someone go for DX instead of just setting up a VPN over the Internet? It seems like VPNs are pretty common.

Speaker 2

10:13

They are common and often sufficient, but Direct Connect offers several really critical advantages, especially for larger enterprises or sensitive workloads. Like what? First, privacy and security: DX provides a dedicated private circuit. Your traffic isn't going over the public Internet, so it can't be snooped on easily. Second, reliability: DX comes with service level agreements, SLAs, promising certain levels of uptime. The public Internet is inherently best effort.

Speaker 1

10:40

Okay, so more secure, more reliable, and?

Speaker 2

10:43

Third, performance. Significantly higher bandwidth is possible. DX connections come in 1 Gbps, 10 Gbps, and now even 100 Gbps flavors, plus you generally get lower and more consistent latency compared to the Internet.

Speaker 1

10:56

One hundred gigs. Wow. And it's literally a physical connection, right? Like a cable?

Speaker 2

11:02

Yes. Fundamentally, you work with AWS or a partner to get a physical cross-connect cable run in a shared data center, a Direct Connect location, between your networking equipment and AWS's equipment. There's even a document involved, the Letter of Authorization and Connecting Facility Assignment, or LOA-CFA, that you use to authorize the data center technicians to make that physical link. It's a tangible piece of your cloud connection.

Speaker 1

11:23

A physical manifestation of the cloud. Okay, that's cool. So DX sounds robust. What if one link isn't enough bandwidth or you need more redundancy? And how do you actually get that physical pipe connected into your virtual network, your VPCs?

Speaker 2

11:36

Great questions. For more bandwidth or redundancy, AWS offers link aggregation groups, or LAGs. This is pretty neat. It lets you bundle multiple physical DX connections together, say four 1 Gbps links, and treat them as a single logical connection with combined bandwidth, like 4 Gbps. It simplifies management too.

Speaker 1

11:55

Ah, like bonding network interfaces.

Speaker 2

11:57

Exactly like that. And to extend that physical connectivity into your actual AWS resources, you use virtual interfaces, or VIFs. VIFs essentially carve up that physical DX connection or LAG into logical pathways using VLAN tagging, standard 802.1Q VLAN tags. This lets you run different types of network traffic over the same physical link.

Speaker 1

12:17

Different types?

Speaker 2

12:18

Yeah, we mainly distinguish between three types. Private VIFs, which are used to connect directly to your VPCs, usually via a component called a virtual private gateway, VGW, or more commonly now a transit gateway. Then there are public VIFs, which let you access public AWS services like S3 or EC2 APIs over your private DX link, bypassing the Internet.

Speaker 1

12:39

Okay, private for VPCs, public for AWS services. What's the third?

Speaker 2

12:43

The third is the transit VIF. This one is specifically designed to connect your Direct Connect link to a transit gateway.

Speaker 1

12:49

Right, for that hub-and-spoke model with TGW. Precisely.

Speaker 2

12:53

And there's a crucial point here about transit VIFs and TGW. I've heard about something called hairpinning that can get really expensive if you're not careful.

Speaker 1

13:01

Ah. Yes, hairpinning. You're absolutely right to bring that up. It's a potentially very costly mistake. If you're using transit gateway with direct connect, it's critical that you use a single transit VIF per TGW connection to your on premises site.

Speaker 2

13:14

Why just one? Because if you have multiple or misconfigure routing, you can end up with hairpinning. This is where traffic comes in from your on-prem network via DX, goes to the TGW, maybe needs to get to another VPC, but instead of routing directly, the TGW routes it back out the same DX connection towards your on-prem router, only for your router to send it immediately back into AWS over DX again to reach the intended VPC.

Speaker 1

13:37

So it makes a U turn back through your own network.

Speaker 2

13:41

Exactly, a totally unnecessary round trip out of AWS and back in. And since you pay for data egress from AWS, that double egress gets incredibly expensive really fast. A single transit VIF per TGW connection point, along with proper route propagation and filtering, prevents this costly detour.

Speaker 1

13:59

Wow. Okay, definitely noted: avoid the hairpin. So, last piece on connectivity. What if you need your VPCs to talk privately to services, could be AWS's own services or maybe third-party SaaS providers you use, but you don't want to go out to the Internet, and you don't want to route through NAT gateways if you can avoid it? What's the play there?

Speaker 2

14:17

That's the perfect use case for AWS PrivateLink. PrivateLink uses a component called a VPC endpoint. It essentially creates a secure private connection directly from your VPC to the service. The service endpoint effectively gets a private IP address within your VPC's address range, making the external service appear as if it's running right there inside your network.

Speaker 1

14:40

Ah. So no Internet gateway, no NAT, no public IPs involved for that service connection.

Speaker 2

14:45

Correct. Traffic stays entirely within the AWS network backbone. It massively improves security by keeping sensitive data off the public Internet, and it simplifies your network architecture because you don't need complex firewall rules or NAT setups just to reach those services privately. It's very powerful for secure service consumption.

Speaker 1

15:03

Okay, that covers a lot of ground on how to connect things, but even with the best designs, things go wrong. Cloud networks, maybe even more than traditional ones, can have these elusive problems. Because so much is abstracted. How do you start peeling back those layers when inevitably something breaks. Let's talk about potential problems first. What kind of things typically bite you in cloud networking?

Speaker 2

15:23

Oh, there's a whole list. We definitely see IP address allocation issues pretty often, like a subnet just runs out of available IPs, IP exhaustion, yeah, or worse, those overlapping CIDR ranges we talked about causing weird routing conflicts if someone tries to connect things that shouldn't be connected. Then there are route scale limitations. AWS services have limits on

15:44

the number of routes they can handle. Exceed those and routes might just disappear, or BGP sessions with your on prem gear might tear down.

Speaker 1

15:51

Okay, limits again. What else?

Speaker 2

15:53

Packet size mismatches. This is a subtle one. Issues with maximum transmission unit MTU or maximum segment size MSS can cause fragmentation. This often doesn't look like a network down problem, but it hits applications. You might see really slow file transfers or some web apps timing out without obvious network errors.
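
The MTU and MSS relationship can be made concrete with a little arithmetic. This sketch assumes IPv4 and TCP headers without options (20 bytes each); the 9001-byte jumbo frame and 1500-byte path MTU in the usage note are example values, and the function names are invented for illustration.

```python
import math

IPV4_HEADER_BYTES = 20  # assumes no IP options
TCP_HEADER_BYTES = 20   # assumes no TCP options


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in a single unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def ipv4_fragments(ip_packet_len: int, path_mtu: int) -> int:
    """Number of IPv4 fragments needed to carry a packet across a smaller-MTU hop."""
    if ip_packet_len <= path_mtu:
        return 1
    payload = ip_packet_len - IPV4_HEADER_BYTES
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (path_mtu - IPV4_HEADER_BYTES) // 8 * 8
    return math.ceil(payload / per_fragment)
```

Under these assumptions a 1500-byte MTU gives an MSS of 1460, and a 9001-byte packet crossing a 1500-byte hop is split into 7 fragments, which hints at why transfers slow down rather than fail outright.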

Speaker 1

16:13

Right, because the network itself is passing packets, just fragmented ones the application struggles with. Exactly.

Speaker 2

16:18

Then we have the hard limits we discussed: bandwidth and throughput limitations, which are usually pretty core quotas, and those tricky PPS limitations causing those silent packet drops that are so hard to diagnose.

Speaker 1

16:29

Still scary.

Speaker 2

16:30

Yeah, and related to that, just general packet loss, maybe due to unreliable transit somewhere between regions, or maybe the end hosts themselves are just overwhelmed and dropping packets. And finally, never underestimate plain old security misconfiguration. A wrong rule in a security group or, more often, a network access control list, a NACL, is a super frequent cause of "it just doesn't connect" problems.

Speaker 1

16:56

That's quite a list. Sounds like troubleshooting could be finding a needle in a haystack. What tools does AWS actually give you to get visibility to see inside this sometimes opaque window.

Speaker 2

17:08

Well, the cornerstone of observability in AWS is definitely Amazon CloudWatch.

Speaker 1

17:12

CloudWatch, right. That's for metrics and logs for pretty much everything. Exactly.

Speaker 2

17:16

It's the central hub. You need to understand its core components. There are namespaces, which are basically containers for metrics from a specific service like EC2 or ELB. Then the metrics themselves. Those are the actual time series data points, like CPUUtilization or NetworkIn. Then you have dimensions, which are key-value pairs that help you filter and group metrics, like instance ID or Auto Scaling group name, and finally periods, which define the time interval over which the

17:42

data is aggregated, like one minute or five minutes. CloudWatch is your main dashboard for performance, health, and setting alarms.
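
To see how namespace, metric, dimension, and period fit together, here is a sketch that only builds the request parameters for a CloudWatch metric query. It makes no AWS call; the helper name `network_in_query`, the instance ID in the usage note, and the choice of the `Sum` statistic are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def network_in_query(instance_id: str, period_seconds: int = 300, hours: int = 1) -> dict:
    """Build parameters describing one EC2 metric query over the last N hours."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",        # container for EC2 metrics
        "MetricName": "NetworkIn",     # the time series itself
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],  # filter
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,      # aggregation interval in seconds
        "Statistics": ["Sum"],
    }
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch").get_metric_statistics(**network_in_query("i-0123456789abcdef0"))`.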

Speaker 1

17:49

So CloudWatch gives you the high-level metrics. But what about seeing the actual traffic flows, like which connections are being allowed or denied? That sounds more like VPC flow logs. Precisely.

Speaker 2

18:01

VPC flow logs give you metadata about the IP traffic flowing through your VPC. They capture information for each flow, like source and destination IPs, ports, protocol, the number of packets and bytes, and crucially, the forwarding decision made by the VPC router, whether the traffic was accepted or rejected.
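
A minimal parser for the default (version 2) flow log record layout might look like the sketch below. The field order follows the default format; the two sample records use made-up account, interface, and address values, and the helper names are assumptions for this example.

```python
# Field order of the default (version 2) flow log record.
DEFAULT_FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Illustrative records only; the IDs and addresses are invented.
SAMPLE_LINES = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]


def parse_flow_record(line: str) -> dict:
    """Split one space-separated record into a field-name -> value mapping."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError("record does not match the default format")
    return dict(zip(DEFAULT_FIELDS, values))


def rejected(lines):
    """Return only the records whose forwarding decision was REJECT."""
    return [rec for rec in map(parse_flow_record, lines) if rec["action"] == "REJECT"]
```

Filtering on the `action` field is usually the first step when chasing an "it just doesn't connect" problem.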

Speaker 1

18:19

That accept/reject status seems key for troubleshooting.

Speaker 2

18:22

It is, but remember, flow logs are not full packet captures. They don't show you the payload, but they give you really valuable insight into network-level decisions, and you can even set up custom formats for flow logs now. Custom formats.

Speaker 1

18:33

How would you use that?

Speaker 2

18:34

Well, for instance, you could include TCP flags in your logs. That might help you troubleshoot specific issues like TCP handshake problems. Are you seeing SYN packets but no SYN-ACKs back? Things like that. It lets you tailor the logs to the problem you're investigating.
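
As a sketch of the handshake check just described, the code below decodes a TCP flags value, assuming the logged field is a bitmask of the standard TCP header flag bits (SYN = 2, ACK = 16, so SYN-ACK = 18). The helper names and the idea of comparing one value per direction are assumptions for illustration.

```python
# Standard TCP header flag bit values.
TCP_FLAG_BITS = {"FIN": 1, "SYN": 2, "RST": 4, "PSH": 8, "ACK": 16, "URG": 32}


def decode_tcp_flags(value: int) -> set:
    """Return the names of the flags set in a bitmask value."""
    return {name for name, bit in TCP_FLAG_BITS.items() if value & bit}


def handshake_unanswered(client_to_server: int, server_to_client: int) -> bool:
    """True if the client sent a SYN but no SYN-ACK was seen coming back."""
    syn_sent = bool(client_to_server & TCP_FLAG_BITS["SYN"])
    syn_ack = TCP_FLAG_BITS["SYN"] | TCP_FLAG_BITS["ACK"]
    syn_ack_seen = (server_to_client & syn_ack) == syn_ack
    return syn_sent and not syn_ack_seen
```

A flow showing a value of 2 in one direction and nothing with both SYN and ACK set in the other is the "SYN but no SYN-ACK" pattern.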

Speaker 1

18:51

That's handy. Now, to make this concrete, the source material had this Trailcats troubleshooting example. Can you walk us through that? It seemed like a good illustration of using these tools systematically.

Speaker 2

19:02

Yeah, the Trailcats example's a classic. They had a website, and it was having these mysterious connectivity problems between two of its back-end servers. So the first thing they did was enable VPC flow logs, but they did it at the ENI level for the servers involved.

Speaker 1

19:17

Okay, looking right at the servers' network interfaces.

Speaker 2

19:20

Right, and those logs showed nothing rejected, all ACCEPT. So the initial thought might be, okay, the network's fine, must be an application problem.

Speaker 1

19:27

A dead end, potentially. Exactly.

Speaker 2

19:29

But they didn't stop there. They widened the scope. They enabled flow logs, but this time at the subnet level.

Speaker 1

19:34

Ah, one level up from the instance ENI.

Speaker 2

19:39

Yep, and boom. The subnet-level logs immediately showed rejected traffic between those two servers.

Speaker 1

19:43

So what did that point to?

Speaker 2

19:44

It pointed directly to a network access control list, or NACL. Because NACLs operate at the subnet boundary, they were blocking the traffic before it even got to the instance's ENI. The ENI-level logs never saw the rejected packets because they never reached the ENI.

Speaker 1

20:01

That's a brilliant example of how changing your observation point widening the scope is critical in cloud troubleshooting.

Speaker 2

20:07

Absolutely, you have to look at the different layers.

Speaker 1

20:09

Okay, so flow logs give metadata, accept/reject. But what if you do need to see the actual packet contents, like you suspect something weird in the payload or you need deep protocol analysis? Is there an equivalent to plugging in Wireshark via a SPAN port, like in a physical data center?

Speaker 2

20:26

Yes, there is. That's VPC traffic mirroring. It essentially provides that SPAN port capability in the cloud.

Speaker 1

20:32

Okay, how does it work?

Speaker 2

20:33

It lets you capture network traffic from a specific source, usually an EC2 instance's ENI, and mirror it, send a copy, to a designated target.

Speaker 1

20:41

What kind of target?

Speaker 2

20:42

The target could be another ENI, maybe on an instance running Wireshark or some security tool. Or it could be a network load balancer, or even a gateway load balancer, which might front a whole fleet of monitoring appliances.

Speaker 1

20:55

And you can control what traffic gets mirrored. You don't want to flood your monitoring tool, right.

Speaker 2

21:00

You use a filter, which is basically an access control list using the standard five-tuple format (source and destination IP, source and destination port, protocol) to specify exactly which flows you want to mirror, and the whole thing, source, target, filter, is tied together in a session.

Speaker 1

21:16

So: source ENI, filter the traffic, send it to a target for analysis. Exactly.

Speaker 2

21:20

It's incredibly powerful for deep packet inspection, security threat analysis, compliance monitoring, and just advanced troubleshooting where flow logs aren't enough. You get the full packet.

Speaker 1

21:31

Got it. And just quickly, you mentioned visibility for larger networks. Transit Gateway Network Manager, TGNM. What's its role?

Speaker 2

21:37

Think of TGNM primarily as a unified dashboard and visualization tool, especially if you have a complex network involving multiple transit gateways, VPCs, VPNs, Direct Connect, maybe even reaching into different AWS regions or connecting to on-premises sites. TGNM helps you see it all. So it draws you a map? Pretty much. It gives you a logical and often a geographical view of your global network topology. You can register your on-prem

22:03

devices and sites too. It helps with monitoring the health and status of your TGW attachments and routes all in one place. It brings that bird's-eye view, which is vital as networks scale.

Speaker 1

22:15

Okay, that makes sense, essential for managing complexity. Now, let's shift gears slightly, but stay related: security. It's non-negotiable, obviously. AWS talks about a layered approach to traffic control, right from the edge all the way down to the instance. How does that layering start way out at the global edge?

Speaker 2

22:30

Right. Before traffic even gets close to your VPC, AWS offers protection. The first line is often AWS Shield. This is primarily for distributed denial of service, or DDoS, protection.

Speaker 1

22:42

Shield. Just protection?

Speaker 2

22:43

Well, Shield Standard is automatically enabled and protects against common network-level DDoS attacks, but Shield Advanced is a paid service that gives you much more protection: 24/7 access to AWS's DDoS Response Team, DRT, detailed attack diagnostics, and importantly, economic protection. AWS can help cover costs incurred due to DDoS-driven spikes in usage on services like CloudFront or load balancers.

Speaker 1

23:07

Oh, okay, insurance against DDoS costs too. What else is out at the edge?

Speaker 2

23:11

Then you have AWS WAF, the web application firewall. This operates at layer 7, the application layer. It helps protect your web applications from common exploits like SQL injection, cross-site scripting (XSS), file inclusion, things that target vulnerabilities in your application code itself. You apply WAF rules typically to CloudFront distributions or application load balancers.

Speaker 1

23:31

So Shield handles the flood, WAF handles the malicious application requests. Got it. Now, bringing that firewall capability inside your VPCs: what about AWS Network Firewall? How does that work?

Speaker 2

23:35

AWS Network Firewall is a managed stateful firewall service that you deploy within your VPCs. It gives you fine-grained control over traffic flowing between subnets, between VPCs, or in and out to the internet or on-prem networks. You deploy firewall endpoints into specific subnets.

Speaker 1

24:01

And how do people typically design with it? The sources mention different patterns.

Speaker 2

24:05

Yeah, there are a few common architectural patterns. One is the distributed design: you put a Network Firewall endpoint in basically every VPC that needs protection. Pros and cons? Pro: granular policy control per VPC, and potentially lower latency, as traffic doesn't have to leave the VPC for inspection. Con: it can get expensive and complex to manage policies across many firewalls.

Speaker 1

24:24

Okay, what's the alternative?

Speaker 2

24:25

The centralized design. Here you create a dedicated security VPC or inspection VPC. All traffic, inter-VPC, Internet ingress/egress, VPN and DX traffic, gets routed through Network Firewall endpoints in this central VPC.

Speaker 1

24:39

Ah, a single choke point for inspection. Exactly.

Speaker 2

24:43

Pro: cost savings, fewer endpoints, centralized management and policy enforcement. Con: it potentially adds latency, as traffic has to detour through the inspection VPC, and that VPC becomes a critical dependency.

Speaker 1

24:55

Makes sense. Any other patterns?

Speaker 2

24:58

There's also often a combination design, trying to get the best of both worlds. Maybe you centralize inspection for east-west, inter-VPC traffic, but you distribute the firewalls for Internet ingress/egress traffic within each VPC to reduce latency for external connections. It's about balancing cost, latency, and manageability.

Speaker 1

25:16

Okay, choices depending on your needs. Now let's drill down to the absolute fundamentals inside a VPC. Remind us again about security groups, SGs, versus network access control lists, NACLs. They both filter traffic. But how are they different, and where do people get tripped up?

Speaker 2

25:32

This is super important and, yeah, confusion here causes a lot of issues. Okay, security groups, or SGs: think of them as stateful firewalls operating at the instance level, really the ENI level.

Speaker 1

25:42

Stateful meaning?

Speaker 2

25:44

Stateful means if you allow an outbound connection from your instance, say on port 80, the SG automatically allows the return traffic back to the instance for that specific connection without needing a separate inbound rule. It understands the connection state. Also with SGs, the order of the rules doesn't matter. All allow rules are evaluated, and importantly, SGs only support allow rules. There's an implicit deny at the end.

Speaker 1

26:08

Okay: stateful, instance level, allow rules only, order doesn't matter. What about NACLs?

Speaker 2

26:14

Network access control lists, or NACLs. These are stateless firewalls operating at the subnet level.

Speaker 1

26:19

Stateless meaning?

Speaker 2

26:20

Stateless means they don't track connection state. If you allow outbound traffic on port 80, you must also have an explicit inbound rule allowing traffic back on the ephemeral ports, typically high-numbered ports, for the response to get through. You need rules for both directions. Ah, more work. Yep, and NACL rules are processed in order, from the lowest-numbered rule to the highest. The first rule that matches the

26:42

traffic is applied, and that's it. Crucially, NACLs support both allow and deny rules, so you can create explicit blocks. There's also an implicit deny at the very end, rule number asterisk.

Speaker 1

26:53

Okay: stateless, subnet level, order matters, allow and deny rules. So where's the common mistake?

Speaker 2

27:01

Often people forget the stateless nature of NACLs and don't add the return traffic rules, or they mess up the rule order, having a deny rule that accidentally blocks traffic they intended to allow because it comes before the allow rule. And because they're at the subnet level, a misconfigured NACL can cut off a whole group of instances.
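
Editor's sketch: the two mistakes described above (a deny rule shadowing an allow rule, and missing return-traffic rules) can be illustrated with a toy model in Python. This is a minimal sketch, not AWS's implementation: `NaclRule`, `nacl_decision`, and `sg_decision` are hypothetical names, rules match on destination port only, the ephemeral range 1024-65535 is an assumption, and security group statefulness (connection tracking) is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NaclRule:
    """Hypothetical, simplified NACL rule that matches on port only."""
    number: int                  # lower numbers are evaluated first
    port_range: Tuple[int, int]  # inclusive (low, high)
    allow: bool


def nacl_decision(rules: List[NaclRule], port: int) -> bool:
    """Evaluate rules in ascending order; the first match wins.

    If nothing matches, the final '*' rule denies the traffic.
    """
    for rule in sorted(rules, key=lambda r: r.number):
        low, high = rule.port_range
        if low <= port <= high:
            return rule.allow
    return False


def sg_decision(allowed_ranges: List[Tuple[int, int]], port: int) -> bool:
    """Security groups hold allow rules only, so order is irrelevant."""
    return any(low <= port <= high for low, high in allowed_ranges)


# Mistake 1: a broad deny numbered before the allow shadows it.
shadowed = [NaclRule(90, (0, 65535), allow=False),
            NaclRule(100, (443, 443), allow=True)]
fixed = [NaclRule(100, (443, 443), allow=True),
         NaclRule(200, (0, 65535), allow=False)]

# Mistake 2: no inbound rule for return traffic on ephemeral ports.
inbound_missing_return = [NaclRule(100, (443, 443), allow=True)]
inbound_with_return = inbound_missing_return + [
    NaclRule(110, (1024, 65535), allow=True)  # assumed ephemeral range
]
```

With these rule sets, `nacl_decision(shadowed, 443)` is denied while `nacl_decision(fixed, 443)` is allowed, and a response arriving on port 50000 only passes `inbound_with_return`.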

Speaker 1

27:18

Got it. Be careful with NACLs. Now, this raises an important question: load balancers. We usually think of them for performance and availability, but how do they fit into the security and traffic flow picture, especially the different types?

Speaker 2

27:30

Absolutely, a critical role for both. Okay, let's break it down. First, Network Load Balancers, or NLBs. These operate down at layer 3/4, the network and transport layers. They look at IP addresses and ports, typically using a 5-tuple hash (source IP, destination IP, source port, destination port, protocol) to distribute connections.
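
Editor's sketch: to make the 5-tuple idea concrete, here is a minimal Python illustration of hashing a flow to pick a target. The `Flow` type, the `pick_target` function, and the use of SHA-256 are assumptions for illustration; the actual hash function an NLB uses is not described in the source.

```python
import hashlib
from typing import List, NamedTuple


class Flow(NamedTuple):
    """The 5-tuple that identifies a connection."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def pick_target(flow: Flow, targets: List[str]) -> str:
    """Map a flow to one target deterministically.

    Every packet of the same flow hashes to the same value, so the
    whole connection sticks to one back end.
    """
    if not targets:
        raise ValueError("at least one target is required")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]
```

The property that matters is determinism: calling `pick_target` twice with the same `Flow` returns the same target, while a new source port (a new connection) may land elsewhere.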

Speaker 1

27:48

Okay, lower level, what's key about them? For security?

Speaker 2

27:51

A key feature of NLBs is that they preserve the original client source IP address when forwarding traffic to the back end instances. This is super important for logging, security analysis, or applying IP-based rules on the back end. Also, because they operate at layer 4, they're often used for inserting third-party network virtual appliances, NVAs, like firewalls or intrusion detection systems, into the traffic path non-disruptively.

Speaker 1

28:14

Okay. NLB: layer 4, keeps client IP, good for NVAs. What about ALBs?

Speaker 2

28:19

Application Load Balancers, or ALBs. These are smarter, operating up at layer 7, the application layer, HTTP and HTTPS. They can make routing decisions based on things like the requested URL path (like /images or /api), host headers, query string parameters, even HTTP method.

Speaker 1

28:34

Much more granular routing.

Speaker 2

28:36

Exactly, and a big function of ALBs is TLS termination. They handle the HTTPS decryption and encryption, offloading that compute-intensive work from your back end web servers.

Speaker 1

28:45

That sounds good for performance. Any security implications?

Speaker 2

28:48

Yes. While offloading TLS is efficient, it does mean the connection between the ALB and your back end instance is typically unencrypted HTTP unless you specifically configure re-encryption. Technically, it breaks end-to-end encryption within your VPC boundary there. Also, because the ALB terminates the connection, the back end instance doesn't see the original client IP directly. It sees the ALB's IP.

Speaker 1

29:11

Ah, so you lose the client IP.

Speaker 2

29:14

You do, unless the ALB adds the X-Forwarded-For HTTP header and your application is configured to read and trust that header to find the original client IP. Many web frameworks do this, but it's an extra step.
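
Editor's sketch: the "read and trust that header" step might look like the following in Python. This is a minimal sketch under stated assumptions: `client_ip_from_xff` and its `trusted_proxies` parameter are hypothetical names, and the approach (walk the list from the right and skip hops you trust) assumes each proxy appends the address it received the connection from. Entries to the left of the last trusted hop can be set by the client and should not be trusted blindly.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip_from_xff(header_value: str,
                       trusted_proxies: Iterable[str]) -> Optional[str]:
    """Return the rightmost X-Forwarded-For entry that is not a trusted proxy.

    trusted_proxies is a list of CIDR strings (hypothetical parameter),
    for example the ranges of a CDN sitting in front of the load balancer.
    Returns None when the header is empty or malformed.
    """
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]
    hops = [part.strip() for part in header_value.split(",") if part.strip()]
    for hop in reversed(hops):
        try:
            address = ipaddress.ip_address(hop)
        except ValueError:
            return None  # malformed entry: stop rather than guess
        if not any(address in network for network in trusted):
            return str(address)
    return None
```

For example, `client_ip_from_xff("198.51.100.9, 203.0.113.7", [])` returns `"203.0.113.7"`: the rightmost entry is the one appended by the load balancer, and the leftmost could have been supplied by the client itself.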

Speaker 1

29:24

Okay. ALB: layer 7, smart routing, TLS termination with caveats, needs X-Forwarded-For for client IP. What's the third type, GLB?

Speaker 2

29:32

Gateway Load Balancers, or GLBs. These are a bit different. They are specialized, built on NLB technology, but designed specifically for simplified service insertion of NVAs, particularly security appliances like firewalls, IPS, IDS, et cetera.

Speaker 1

29:47

How do they simplify it compared to using an NLB for NVAs?

Speaker 2

29:51

GLBs use a special tunneling protocol called Geneve encapsulation. Essentially, when traffic hits the GLB, it wraps the original network packet in another packet, the Geneve packet, and sends it to one of the registered security appliances, the NVAs.

Speaker 1

30:05

Okay, it puts the packet in a package.

Speaker 2

30:07

Why? Because this preserves the entire original packet, headers and all. The security appliance can inspect it fully, see the original source and destination, everything. Then, after inspection, the appliance sends the potentially modified or approved packet back to the GLB, still encapsulated. The GLB unwraps it and sends the original packet on its way.
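
Editor's sketch: the wrap and unwrap steps can be shown with the 8-byte Geneve base header from RFC 8926 (version and option length, flags, protocol type, 24-bit VNI, reserved byte). This is a minimal Python sketch under stated assumptions: `encapsulate` and `decapsulate` are hypothetical helpers, the outer IP/UDP headers (Geneve runs over UDP port 6081) are left out, and no Geneve options are written, although a real deployment may carry metadata in them.

```python
import struct
from typing import Tuple

GENEVE_UDP_PORT = 6081    # IANA-assigned port for Geneve
ETHERTYPE_IPV4 = 0x0800   # protocol type of the wrapped payload
BASE_HEADER_LEN = 8


def encapsulate(inner_packet: bytes, vni: int) -> bytes:
    """Prefix the untouched inner packet with a Geneve base header."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    version_and_opt_len = 0  # version 0, zero option words
    flags = 0
    header = (struct.pack("!BBH", version_and_opt_len, flags, ETHERTYPE_IPV4)
              + vni.to_bytes(3, "big")
              + b"\x00")      # reserved byte
    return header + inner_packet


def decapsulate(geneve_payload: bytes) -> Tuple[int, int, bytes]:
    """Return (vni, protocol_type, inner_packet), skipping any options."""
    if len(geneve_payload) < BASE_HEADER_LEN:
        raise ValueError("payload shorter than the Geneve base header")
    first_byte, _flags, protocol = struct.unpack("!BBH", geneve_payload[:4])
    options_len = (first_byte & 0x3F) * 4  # option length is in 4-byte words
    vni = int.from_bytes(geneve_payload[4:7], "big")
    return vni, protocol, geneve_payload[BASE_HEADER_LEN + options_len:]
```

The point to notice is that `inner_packet` is never parsed or rewritten: whatever goes in comes back out byte for byte, which is why the appliance sees the original source and destination.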

Speaker 1

30:24

Ah, so the NVA doesn't even need to know about the network routing. Exactly, the GLB handles all the routing complexities. The NVA just receives packets, inspects them, and sends them back. It makes the security appliance fleet function like a transparent bump in the wire. Super elegant for deploying security services scalably.

Speaker 2

30:42

That is clever. Shifting from network traffic security to DNS security, equally vital: how does AWS Route 53 help lock down DNS, both for the Internet and internally?

Speaker 1

30:53

Route 53 is AWS's DNS service, and it plays a big role. First, you have public hosted zones. These manage DNS records for your Internet-routable domain names, like your company's website. Standard DNS stuff, right. But then you also have private hosted zones. These are associated with one or more of your VPCs and manage DNS records for internal domain names that should only be resolvable from within those VPCs, like service.internal.corp.

Speaker 2

31:15

Exactly. This allows you to have a split DNS, or split-horizon DNS, architecture. Your internal servers or instances within the VPC can resolve both internal names from the private hosted zone and external Internet names via standard DNS resolution. It keeps your internal namespace private and secure, preventing internal host names or service names from leaking or being resolved externally.

Speaker 1

31:41

That makes sense for separating internal and external views. What about those advanced Route 53 routing policies? The sources listed simple, failover, latency, weighted, geolocation. Can you quickly unpack what each is for?

Speaker 2

31:54

Sure, they offer powerful traffic management capabilities. Simple is just basic round-robin DNS, no health checks. Failover is for active-passive setups. You define a primary record and a secondary record. If the primary resource becomes unhealthy based on Route 53 health checks, Route 53 automatically starts returning the secondary record's IP. Think the old Twitter fail whale page.

Speaker 1

32:14

Okay, high availability.

Speaker 2

32:16

Yep. Latency-based routing is cool. Route 53 has data on network latency from different parts of the Internet to AWS regions. It directs users to the AWS endpoint, like a load balancer or instance in a specific region, that provides the lowest latency for them.

Speaker 1

32:30

Directing users to the closest healthy server, latency-wise.

Speaker 2

32:34

Exactly. Weighted routing lets you distribute traffic across multiple resources based on percentages you define. You could send 90 percent of traffic to your stable version and 10 percent to a new version for A/B testing, or just balance load

32:46

across endpoints unevenly if needed. Useful for rollouts, very. And geolocation routing lets you route traffic based on the user's actual geographic location, like sending all European users to your Frankfurt region servers and all US users to your Virginia region servers. Good for localization or data sovereignty.
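
Editor's sketch: the decision logic behind three of the policies discussed here (failover, latency, weighted) can be approximated in a few lines of Python. This is a minimal sketch, not how Route 53 is implemented: the function names, the health flag, and the latency table are hypothetical inputs, and the weighted picker assumes at least one weight is greater than zero.

```python
import random
from typing import Dict


def failover_answer(primary: str, secondary: str, primary_healthy: bool) -> str:
    """Active-passive: answer with the secondary only when the primary is unhealthy."""
    return primary if primary_healthy else secondary


def latency_answer(latency_ms_by_region: Dict[str, float]) -> str:
    """Answer with the region that has the lowest measured latency for this user."""
    return min(latency_ms_by_region, key=latency_ms_by_region.get)


def weighted_answer(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a record with probability weight / sum(weights)."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

As a usage example, `weighted_answer({"stable": 90, "canary": 10}, random.Random())` returns `"stable"` for roughly nine out of ten queries, which is the 90/10 rollout split mentioned above.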

Speaker 1

33:03

Got it. And finally, what about DNSSEC? What problem does that solve? Is it about encryption?

Speaker 2

33:09

Good question. DNSSEC is not about encrypting DNS queries or responses. Its purpose is authentication and integrity. It uses digital signatures and public key cryptography to allow a DNS resolver, like your ISP's resolver or a public one like Google's, to verify that the DNS records it receives actually came from the authoritative DNS server and haven't been tampered with in transit.

Speaker 1

33:32

So it prevents DNS spoofing or cache poisoning?

Speaker 2

33:35

Exactly. It builds a chain of trust from the root DNS servers down to your domain, ensuring the IP address you get back for a host name is authentic. It's about trust, not confidentiality.

Speaker 1

33:44

Okay, authentication and integrity for DNS, makes sense. Yeah, now let's zoom out again. We've talked about the components, the connections, the security. How is all this modern cloud infrastructure actually built and managed, especially at scale? This brings us to ideas like DevOps and, crucially, infrastructure as code, or IaC. What's the core philosophy here?

Speaker 2

34:04

Right. DevOps, broadly, is about breaking down silos between development and operations teams, focusing on automation, collaboration, and faster, more reliable software delivery. Infrastructure as code, or IaC, is a key practice enabling DevOps for infrastructure.

Speaker 1

34:20

And what's the central idea of IaC?

Speaker 2

34:22

The core idea is managing and provisioning your infrastructure (your servers, networks, databases, load balancers, everything) through machine-readable definition files, using code rather than manual configuration or clicking around in a console. It's about making your infrastructure ephemeral and malleable.

Speaker 1

34:39

Ephemeral and malleable, meaning easily created, destroyed, changed?

Speaker 2

34:44

Precisely. You treat your infrastructure definitions like application code, and this leads to arguably the most game-changing insight of IaC, the concept of immutability.

Speaker 1

34:52

Immutability, that sounds important. What does that actually mean when you're managing cloud resources? Why is it such a big deal for reliability and speed?

Speaker 2

35:00

It's a fundamental paradigm shift. Immutability means that once infrastructure is deployed, say a server or a cluster, you don't make changes to it directly. You don't log in and patch it or reconfigure it in place.

Speaker 1

35:11

So no patching. How does that work?

Speaker 2

35:15

If you need to update or change something (apply a patch, deploy new code, change a config), you don't modify the existing infrastructure. Instead, you build a new instance or set of instances from your updated IaC definition, which includes the patch or new code. Deploy the new set, test it, switch traffic over, and then you simply destroy the old, unchanged infrastructure.
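
Editor's sketch: the build, test, switch, destroy sequence can be modeled in miniature. This is a toy Python simulation, not a deployment tool: `Fleet`, `Environment`, and `deploy` are hypothetical names, and the "tests" are whatever callable the caller passes in. The frozen dataclass stands in for the rule that a deployed fleet is never modified in place.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Fleet:
    """A deployed set of instances; frozen so it cannot be patched in place."""
    version: str  # e.g. the version of the IaC definition that built it


@dataclass
class Environment:
    """Tracks which fleet currently receives traffic."""
    live: Optional[Fleet] = None


def deploy(env: Environment, new_version: str,
           passes_tests: Callable[[Fleet], bool]) -> bool:
    """Build a new fleet, test it, then switch traffic to it.

    The existing fleet is never changed. On failure the candidate is
    discarded and the live fleet keeps serving.
    """
    candidate = Fleet(new_version)  # build from the updated definition
    if not passes_tests(candidate):
        return False                # live fleet untouched
    env.live = candidate            # switch traffic; the old fleet is dropped
    return True
```

A rollback in this model is just another `deploy` call with the previous version string, which mirrors the "redeploy a known good state" idea.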

Speaker 1

35:38

The whole cattle, not pets analogy?

Speaker 2

35:40

Exactly. You don't nurse sick pet servers back to health. You replace the disposable cattle units. Yeah. You treat your infrastructure definitions like source code, store them in version control repositories like Git, use declarative specification documents like YAML for CloudFormation, and rely on automation to deploy consistently.

Speaker 1

35:57

And the benefit isn't just avoiding manual work.

Speaker 2

36:00

No, the core benefit isn't primarily cost savings, although that can happen. It's about speed, consistency, and safety. Deployments become repeatable and predictable. Rollbacks are easier: just redeploy the previous version of the code. Drift between environments is minimized. If there's a problem, you don't spend hours troubleshooting a potentially broken server. You just redeploy a known good state from code in minutes. It fundamentally changes how you approach operations.

Speaker 1

36:25

That's a really powerful concept. Okay, so focusing on AWS's main IaC tool, AWS CloudFormation, how do its templates enable this immutable approach?

Speaker 2

36:36

CloudFormation templates are where you define your AWS resources declaratively, usually in YAML or JSON. You state what you want, like: I want a VPC with this CIDR block, I want two subnets inside it, I want an EC2 instance with this AMI in one subnet. You don't specify the how, like the API calls to make. CloudFormation figures that out.

Speaker 1

36:53

So you declare the desired state?

Speaker 2

36:56

Exactly. And the power comes from features within the templates. You have parameters, which let you pass in values at deployment time, like the VPC CIDR block, or the instance type, or maybe an environment name: dev, staging, prod. This makes templates reusable.

Speaker 1

37:11

Okay, parameters for reusability. What else?

Speaker 2

37:13

And you have intrinsic functions. These are special functions you can use right inside the template code. Things like Ref, which lets you reference the ID or attribute of another resource defined in the same template, like getting the ID of the VPC to create a subnet in it. GetAtt fetches an attribute. Cidr can calculate subnet CIDR blocks based on a main VPC block. Select can pick an item

37:35

from a list, maybe an availability zone. These functions allow you to create dynamic, interconnected infrastructure definitions without hard coding everything.
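
Editor's sketch: here is what a small template using parameters and the intrinsic functions named above might look like. Since CloudFormation accepts JSON as well as YAML, the sketch builds the template as a Python dictionary and serializes it. The logical names (`VpcCidr`, `EnvName`, `Vpc`, `SubnetA`) and default values are assumptions for illustration. The `cidr_preview` helper is a local stand-in to show what a `Fn::Cidr` call with 8 subnet bits is expected to produce; it is not part of CloudFormation.

```python
import ipaddress
import itertools
import json
from typing import List

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {  # values supplied at deployment time
        "VpcCidr": {"Type": "String", "Default": "10.0.0.0/16"},
        "EnvName": {"Type": "String", "Default": "dev",
                    "AllowedValues": ["dev", "staging", "prod"]},
    },
    "Resources": {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": {"Ref": "VpcCidr"},  # Ref to a parameter
                "Tags": [{"Key": "Name", "Value": {"Ref": "EnvName"}}],
            },
        },
        "SubnetA": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},          # Ref to another resource
                # Carve /24 blocks out of the VPC range and take the first.
                "CidrBlock": {"Fn::Select": [0, {"Fn::Cidr": [
                    {"Fn::GetAtt": ["Vpc", "CidrBlock"]}, 2, 8]}]},
                # First availability zone of the current region.
                "AvailabilityZone": {"Fn::Select": [0, {"Fn::GetAZs": ""}]},
            },
        },
    },
}

rendered = json.dumps(template, indent=2)  # text you could hand to CloudFormation


def cidr_preview(ip_block: str, count: int, cidr_bits: int) -> List[str]:
    """Local illustration of the blocks Fn::Cidr is expected to yield."""
    prefix = 32 - cidr_bits
    subnets = ipaddress.ip_network(ip_block).subnets(new_prefix=prefix)
    return [str(net) for net in itertools.islice(subnets, count)]
```

Nothing in the template hard-codes a VPC ID or a subnet range; every value is either a parameter or derived from another resource, which is what makes the same file reusable across dev, staging, and prod.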

Speaker 1

37:42

So templates define the what, parameters make them reusable, and functions add dynamic capabilities.

Speaker 2

37:47

Got it. And quickly, for developers who might prefer Python or Java over writing YAML, what's the AWS Cloud Development Kit, or CDK? CDK is another way to do IaC on AWS. It lets developers define their cloud infrastructure using familiar programming languages: Python, TypeScript, Java, C#, Go. You write code using CDK constructs, which represent AWS resources.

Speaker 1

38:13

So you write Python code to define a VPC?

Speaker 2

38:17

Yep, and then when you run the CDK toolkit, it synthesizes your Python code, or whatever language you used, into a standard AWS CloudFormation template.

Speaker 1

38:25

Ah, so it generates the CloudFormation for you?

Speaker 2

38:29

Exactly. The benefit is you can use the power of your programming language (loops, conditionals, object-oriented programming, existing libraries, code completion in your IDE) to define infrastructure. You can create reusable components and patterns more easily than with raw CloudFormation templates,

38:43

sometimes. It's particularly good for complex applications or for teams already comfortable with those programming languages, allowing them to define both their app and its infrastructure in the same language ecosystem.
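
Editor's sketch: the "code in, template out" idea can be shown without the CDK itself. The plain-Python function below is not the CDK API; `synthesize_vpc` is a hypothetical name, and the /24 subnet size and availability zone names are assumptions. It only illustrates how a loop and a conditional flag in a general-purpose language can generate a CloudFormation-shaped template.

```python
import ipaddress
import json
from typing import Dict, List


def synthesize_vpc(vpc_cidr: str, availability_zones: List[str],
                   public: bool = False) -> Dict:
    """Generate a CloudFormation-style template with one /24 subnet per zone."""
    subnet_cidrs = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=24)
    resources: Dict[str, Dict] = {
        "Vpc": {"Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": vpc_cidr}},
    }
    # A loop replaces copy-pasted subnet definitions.
    for index, (zone, cidr) in enumerate(zip(availability_zones, subnet_cidrs)):
        resources[f"Subnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "AvailabilityZone": zone,
                "CidrBlock": str(cidr),
                "MapPublicIpOnLaunch": public,  # driven by an ordinary flag
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]  # assumed zone names
    print(json.dumps(synthesize_vpc("10.0.0.0/16", zones), indent=2))
```

Adding a fourth zone is a one-item change to the list rather than another block of template text, which is the kind of reuse the speakers attribute to the CDK.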

Speaker 1

38:54

Very cool, leveraging programming skills for infrastructure. Okay, that was quite the journey. So there you have it, a real deep dive pulling back the layers on AWS networking. We went from the basic virtual network cards, the ENIs, and wrestled with the Hyperplane's hidden limits.

Speaker 2

39:10

Those PPS limits, right.

Speaker 1

39:12

Then we looked at all the ways to connect things (peering, TGW, Direct Connect) and the security layers, from Shield and WAF down to SGs and NACLs, plus the nuances of different load balancers, and finally how modern teams actually build and manage all this using infrastructure as code.

Speaker 2

39:30

It really shows how AWS provides this incredibly rich toolkit, right, for control, for monitoring, for automation. It turns what could be overwhelming complexity into manageable patterns, which is what

39:41

ultimately enables all that innovation we see happening in the cloud. Absolutely, and I think understanding these layers, you know, from the foundational ENIs, to the strategic use of transit gateways, all the way to that philosophical shift towards infrastructure as code, it means you're not just, like, using cloud services passively. You're truly starting to harness their underlying power.

Speaker 1

40:00

Yeah, understanding the physics behind the magic.

Speaker 2

40:03

Exactly. It encourages us, encourages you listening, to look beyond the console surface and constantly ask: okay, what hidden network physics might be at play here, and how can we leverage that, or maybe mitigate its effects, to build something even better, even more innovative?

Speaker 1

40:19

That is a powerful thought to leave everyone with, isn't it? How can you leverage or mitigate those hidden physics? Fantastic. Well, thank you for joining us on this deep dive. Until next time, keep that curiosity well fed.

Transcript source: Provided by creator in RSS feed: download file
