Digital Forensic Education: An Experiential Learning Approach

May 27, 2026, 24 min

Episode description

A practical, experiential learning approach to teaching the investigation of digital evidence. Edited by Xiaolu Zhang and Kim-Kwang Raymond Choo, the text details how students apply theoretical knowledge to real-world scenarios, such as solving the DFRWS IoT Forensic Challenge. The sources emphasize the transition of digital forensics from a reactive police necessity to a rigorous academic discipline involving smart devices and big data. The provided excerpts specifically outline a case study involving a simulated drug lab raid where students analyzed data from IoT devices like Amazon Echo, Nest cameras, and smart alarms. Ultimately, the work illustrates how hands-on projects prepare the next generation of practitioners to navigate the rapidly evolving landscape of cybercrime and digital litigation.

You can listen and download our episodes for free on more than 10 different platforms:
https://linktr.ee/cyber_security_summary

Get the Book now from Amazon:
https://www.amazon.com/Digital-Forensic-Education-Experiential-Learning/dp/3030235467?&linkCode=ll2&tag=cvthunderx-20&linkId=76eaf3951fb145cf90b1fc1800991ca0&language=en_US&ref_=as_li_ss_tl

Discover our free courses in tech and cybersecurity, Start learning today:
https://linktr.ee/cybercode_academy

Transcript

Speaker 1

Look around your room right now. Like, seriously, just take a glance around. Do you have a smart speaker sitting on a shelf somewhere?

Speaker 2

Or, you know, maybe a thermostat on the wall, right?

Speaker 1

Or a Wi-Fi-enabled security camera pointing out the front window. Now, I want you to imagine a crime took place right where you are sitting. What story would those devices tell an investigator?

Speaker 2

It's a really fascinating thought experiment, especially for anyone who actually pays attention to how data flows through a modern network. Yeah, totally, because we tend to install these devices just for our own convenience, right? We treat them as these passive tools that are just waiting for an API call or, like, a voice command.

Speaker 1

And they're just asleep until we need them. Exactly, yeah.

Speaker 2

But in reality, they are these silent, unblinking witnesses. They generate this massive, interlocking web of telemetry about our physical environment.

Speaker 1

And untangling that invisible web is exactly what we are doing in this deep dive today. We're looking at excerpts from the book Digital Forensic Education: An Experiential Learning Approach.

Speaker 2

Right, and specifically we're going to unpack how university students tackled the DFRWS.

Speaker 1

That's the Digital Forensics Research Workshop, for those who don't know.

Speaker 2

Yeah, their IoT Forensic Challenge. It is basically a simulated raid on an illegal drug lab, complete with a cast of shady characters and a house that is just packed full of smart home gadgets.

Speaker 1

It's so cool. It serves as a literal masterclass in how everyday network telemetry and device logs are transformed into courtroom evidence.

Speaker 2

The DFRWS challenge is somewhat legendary in the field because it's intentionally messy, like really messy.

Speaker 1

Yeah, they don't hold your hand, not at all.

Speaker 2

So our goal today is to look at the raw mechanics of this investigation. Like, how do digital breadcrumbs, raw packet captures, proprietary SQLite databases, orphaned cache files, how does all that get synthesized into a rock-solid narrative of a crime?

Speaker 1

We're going to trace the steps of these student investigators to see how they reconstructed a really volatile, chaotic physical event strictly from binary data and network logs.

Speaker 2

It's like putting together a puzzle where half the pieces are invisible.

Speaker 1

Literally. But before we kick in the door of this simulated drug lab, we need to talk about the educational philosophy driving this, because teaching digital forensics isn't like teaching chemistry or physics.

Speaker 2

No, it's really not.

Speaker 1

It didn't start in some pristine university ivory tower with established, centuries-old laws.

Speaker 2

Yeah, the history of digital forensics is super unique because it grew almost entirely bottom-up.

Speaker 1

Bottom-up, right. What does that mean in this context?

Speaker 2

Well, it started decades ago simply because law enforcement would seize a machine, realize there was data trapped inside a hard drive, and they literally had to build the extraction tools from scratch.

Speaker 1

Wow. So no instruction manual? None.

Speaker 2

Early practitioners were relying on, like, basic hex editors and raw bitstream imaging long before commercial suites like Cellebrite or XAM even existed. The tools were entirely homegrown out of sheer operational necessity.

Speaker 1

Which means they were essentially building the airplane while flying it. And frankly, even with today's commercial tools, that dynamic hasn't really changed, right? I mean, the technology evolves so fast that relying on traditional textbook learning is basically a recipe for obsolescence.

Speaker 2

Oh, by the time a university actually publishes a textbook on Android file system forensics, the kernel has updated, the encryption architecture has changed, and, you know, your extraction methods are totally broken.

Speaker 1

That is so frustrating.

Speaker 2

It is. That's why the text we are analyzing heavily emphasizes experiential learning, drawing specifically on David Kolb's theory.

Speaker 1

Right, Kolb's theory, which centers on learning as a continuous, iterative cycle. You can't just passively absorb lectures about file systems.

Speaker 2

No. Students have to transition into active experimenters. When you're doing an extraction in the real world, you're inevitably going to encounter a proprietary IoT device or some firmware build you've literally never seen before.

Speaker 1

And the commercial tools just choke on it.

Speaker 2

Exactly. Commercial forensic tools will often just fail to parse the data completely. If you only know how to push a button on a software suite, you're dead in the water.

Speaker 1

Yeah, you have to know what's happening under the hood, you really do.

Speaker 2

You need to know how to drop down to the raw hex level and actually carve the data manually.

Speaker 1

Let's use an analogy here to make this a bit more relatable. Learning digital forensics via textbook is kind of like learning how to cook by reading a recipe.

Speaker 2

That's a good way to put it.

Speaker 1

It's great in theory, but entirely useless if you are suddenly dropped into a chaotic burning kitchen where the stove is broken and half your ingredients are mislabeled.

Speaker 2

Right, you can't just follow step three if step two is on fire. Exactly.

Speaker 1

You have to learn to think on your feet, smell what's burning, and improvise with what you actually have.

Speaker 2

And what's fascinating here is how the instructors designed this specific curriculum to replicate that exact burning kitchen. The students were given a choice.

Speaker 1

Okay, what was the choice?

Speaker 2

They could conduct a standard predictable research paper or dive into this grueling DFRWS challenge. And the challenge is famously difficult because it's completely open ended.

Speaker 1

There's no like answer key at the back of the book.

Speaker 2

Nope, there is no single flag to capture. It perfectly mimics the imperfect, asymmetrical nature of real world investigations where you just have fragmented data and a ticking clock.

Speaker 1

I love that they opted into the chaos. I mean that takes guts. So let's set the scene for this challenge. The setup is straight out of a premium cable drama.

Speaker 2

Oh, it really is. It's very cinematic.

Speaker 1

It's May 17, 2018. At exactly 10:40 a.m., police get an alert about an unsuccessful arson attempt at an illegal drug lab.

Speaker 2

And they arrive on the scene at 10:45, right?

Speaker 1

And the owner of the lab, a guy named Jesse Pinkman, which is obviously a nod to Breaking Bad, but the students have to treat him as a factual target. Well, he's missing. Right, Pinkman is gone.

Speaker 2

But during the initial sweep, police do locate two of Pinkman's known associates, D. Pandana and S. Varga.

Speaker 1

Okay, so they have suspects.

Speaker 2

They do, yeah. And both men admit they have Wi-Fi credentials for the lab's network, but they completely deny any involvement in the raid or the subsequent fire. Right, and they claim Pinkman was super paranoid about rival gangs, so he installed this massive, multi-layered security system and always kept the alarm armed in home mode while he was working inside.

Speaker 1

Okay, which immediately tells the investigator something crucial. If Pinkman was inside when the fire started, the system should have recorded the perimeter breach. Exactly.

Speaker 2

The logs should show someone coming in. So the forensic team walks in to catalog the hardware, and it is just an IoT nightmare.

Speaker 1

It's a gold mine and a nightmare all at once. What do they find?

Speaker 2

They find an iSmartAlarm system, a QBee camera, a Nest camera, an Arlo Pro setup, a Nest Protect smoke detector, an Amazon Echo, a Wink Hub, a Raspberry Pi acting as a rogue router.

Speaker 1

Wow, it's a lot.

Speaker 2

And, crucially, Pinkman's personal Samsung Galaxy S6 Edge.

Speaker 1

Okay, so it's the quintessential fragmented IoT environment. You have a dozen distinct devices and varying communication protocols like Zigbee, Z-Wave, and standard 802.11, all routing to different proprietary cloud servers.

Speaker 2

Right, it's a total mess of different ecosystems trying to talk to each other.

Speaker 1

But wait, I have a massive problem with how the students approached this initially.

Speaker 2

Oh, what's that?

Speaker 1

If the house is literally plastered with high-end security cameras like the Nest, Arlo, and QBee, why did the student group spend the majority of their initial analysis meticulously ripping apart a single Samsung phone? I mean, wouldn't any sane investigator just pull the cloud video feeds or check the local SD cards?

Speaker 2

Well, you would hope so, sure. But in a real-world IoT deployment, the cameras often act as dumb terminals.

Speaker 1

Really? Just dumb terminals?

Speaker 2

Yeah, the video is pushed directly to an AWS bucket or a proprietary cloud server. If the suspect revoked the cloud tokens, or if law enforcement doesn't have a warrant served to Google or Amazon yet, that cloud data is entirely inaccessible.

Speaker 1

Oh, so the physical camera in the house is basically empty.

Speaker 2

Often, yes. Plus, local storage is frequently overwritten or just absent on these models. The smartphone, however, acts as the central orchestrator.

Speaker 1

Got it.

Speaker 2

It holds the API keys, the cached thumbnails, the app sandboxes, and the local network configuration files.

Speaker 1

Okay, so the phone is basically the remote control for the entire physical environment.

Speaker 2

Precisely. But proving that phone is the orchestrator requires some really solid network forensics. Group two didn't just assume the S6 Edge was driving the traffic.

Speaker 1

Right, you can't assume anything.

Speaker 2

They had to actually anchor it to the network topology. So they parsed a massive Wireshark PCAP file, basically a packet capture from the local network, and mapped out all the IP and MAC addresses.

Speaker 1

But MAC addresses can be easily spoofed, right? Especially by someone running a rogue Raspberry Pi router in a literal drug lab. How did they definitively link that specific network traffic back to Pinkman's physical device sitting in the evidence locker?

Speaker 2

By diving into the Android file system itself. They acquired the file system from the S6 Edge and extracted a specific system file located at wifi/.mac.info. A hidden file, exactly. And this file hard-codes the device's true MAC address, which was AC:5F:3E:73:83:78.

Speaker 1

Okay, so they have the hard coded identity, right.

Speaker 2

So by matching that hardware-level file to the MAC address broadcasting in the PCAP, they totally eliminated the spoofing theory. They proved Pinkman's phone wasn't just resting on a table; it was actively routing traffic to the Raspberry Pi and facilitating communication between the environmental sensors and the iSmartAlarm base station.
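
Editor's sketch: the matching step described here can be illustrated in a few lines of Python. This is only a minimal sketch, not the students' actual tooling. It assumes the capture is in the classic little-endian libpcap format with an Ethernet link type (a pcapng or 802.11 capture would need a different parser), and the function names are hypothetical.

```python
import struct

def normalize_mac(text):
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = "".join(ch for ch in text if ch in "0123456789abcdefABCDEF").lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {text!r}")
    return digits

def source_macs(pcap_bytes):
    """Count Ethernet source MACs in a classic little-endian libpcap capture."""
    magic, _major, _minor, _zone, _sigfigs, _snaplen, linktype = struct.unpack_from(
        "<IHHiIII", pcap_bytes, 0
    )
    if magic != 0xA1B2C3D4 or linktype != 1:
        raise ValueError("expected classic libpcap, little-endian, Ethernet link type")
    counts = {}
    offset = 24  # size of the global file header
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from("<IIII", pcap_bytes, offset)
        offset += 16  # size of each per-packet record header
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 14:  # destination (6) + source (6) + EtherType (2)
            src = frame[6:12].hex()
            counts[src] = counts.get(src, 0) + 1
    return counts

def device_seen(mac_info_text, pcap_bytes):
    """Return how many captured frames were sent from the MAC stored on the handset."""
    return source_macs(pcap_bytes).get(normalize_mac(mac_info_text), 0)
```

A call such as `device_seen("AC:5F:3E:73:83:78", capture_bytes)` would return the number of frames whose source address equals the handset's stored address. A non-zero count shows that this address was active on the network; it does not by itself rule out spoofing, which is why the episode stresses tying it to the file recovered from the physical device.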

Speaker 1

That is brilliant. So they essentially found the conductor of the orchestra. Now that they have the layout and the verified devices, the students had to actually reconstruct the timeline of the arson. How are they extracting invisible trigger events to build a real chronological sequence?

Speaker 2

This requires diving deep into the app sandboxes. The students managed to extract a SQLite database belonging to the iSmartAlarm app, specifically a file named TB_IPUDairy.

Speaker 1

Which acts as an audit log for the alarm states, right?

Speaker 2

Yes, exactly. But if you've ever dealt with IoT databases, you know this data isn't neatly formatted for humans. The timestamps are logged in epoch time.

Speaker 1

Oh no. Epoch time, as anyone who works with databases knows, is a massive headache when you're trying to reconstruct a localized physical event. It really is, because the database just gives you a ten-digit integer representing the seconds that have passed since 1970, and it's inherently in UTC.

Speaker 2

And that time zone offset is exactly where forensic timelines live or die. The lab was physically located in the UTC+2 time zone. Oh boy. Right, so for every single entry in that SQLite database, the students had to script a conversion to translate the epoch integer into a human-readable date and then manually apply that two-hour offset.

Speaker 1

And if you missed that offset, your entire timeline of the crime is shifted by one hundred and twenty minutes, which completely destroys your case in court.
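
Editor's sketch: the conversion described here takes only a few lines with Python's standard library. The epoch value below is an illustrative assumption, chosen so that it lands on the 10:34:17 event mentioned later in the episode; it is not taken from the actual database. The sketch also assumes the timestamps are stored in seconds, not milliseconds.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+2 offset for the lab, as stated in the episode.
LAB_TZ = timezone(timedelta(hours=2))

def epoch_to_lab_time(epoch_seconds):
    """Interpret a Unix epoch value (seconds, UTC) and express it in lab-local time."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).astimezone(LAB_TZ)

sample = 1526546057  # illustrative value only
print(epoch_to_lab_time(sample).isoformat())                        # 2018-05-17T10:34:17+02:00
print(datetime.fromtimestamp(sample, tz=timezone.utc).isoformat())  # 2018-05-17T08:34:17+00:00
```

Reading the second line as if it were local time is exactly the 120-minute error the hosts warn about. Keeping every value timezone-aware, rather than adding two hours to a naive datetime, makes that mistake harder to commit silently.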

Speaker 2

Oh, the defense attorney would have a field day.

Speaker 1

Ah.

Speaker 2

But they nailed the conversions, and once they queried that TB_IPUDairy database, they started seeing specific user profiles triggering the alarm states. What kind of profiles? Well, they found logs for a user called TheBoss, which they actually correlated to voice commands issued at the Amazon Echo. They found a profile for JPinkman, naturally. And, crucially, they uncovered a third user ID: pandadodu. pandadodu?
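
Editor's sketch: querying an alarm audit log of this kind can be tried against an in-memory SQLite database. The column names (`ts`, `user_name`, `action`) and the two sample rows are hypothetical placeholders, not the real iSmartAlarm schema; only the use of SQLite's built-in `datetime()` function with the `unixepoch` modifier is standard behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the real app's columns are not described in the episode.
conn.execute("CREATE TABLE TB_IPUDairy (ts INTEGER, user_name TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO TB_IPUDairy VALUES (?, ?, ?)",
    [
        (1526546071, "pandadodu", "disarm"),  # illustrative epoch values
        (1526546057, "TheBoss", "home"),
    ],
)

def alarm_events(connection, utc_offset_hours=2):
    """Return (local_time, user, action) rows in chronological order."""
    modifier = f"{utc_offset_hours:+d} hours"  # e.g. "+2 hours"
    query = (
        "SELECT datetime(ts, 'unixepoch', ?) AS local_time, user_name, action "
        "FROM TB_IPUDairy ORDER BY ts"
    )
    return connection.execute(query, (modifier,)).fetchall()

for row in alarm_events(conn):
    print(row)
```

Letting the database apply the offset keeps the raw epoch values untouched in storage, so the original evidence and the human-readable view never get confused.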

Speaker 1

Wow. That third user ID is highly suspicious given the suspects they have in custody. I mean, D. Pandana, right? Yeah, exactly.

Speaker 2

But the investigation wasn't a straight line. Group one, for instance, spent a massive amount of time analyzing a Google Mail database found on the S6 Edge, hoping to find communications orchestrating the raid.

Speaker 1

Okay, that makes sense, check the emails.

Speaker 2

Yeah, but inside the database columns they hit a wall of compressed blobs, binary large objects.

Speaker 1

Here's where it gets really interesting, though, because, as you mentioned earlier, if a commercial tool fails to parse a proprietary database blob, you have to drop down to the hex level.

Speaker 2

And that's exactly what Group one did. They pulled the raw hexadecimal code of these unreadable blobs and went hunting for file signatures.

Speaker 1

Just manually scanning the hex code. That is tedious work.

Speaker 2

It is, but they identified a specific magic number at the header: 0x789C.

Speaker 1

Wait, 0x789C? In the context of file carving, seeing that immediately tells an investigator that the app developers utilized zlib compression to save space on the mobile device.

Speaker 2

You got it. It's the standard zlib header.

Speaker 1

So they recognized the zlib header, stripped it out, passed the raw payload through a decompression algorithm, and the data just unfolded.

Speaker 2

It was a truly brilliant piece of manual file carving. They successfully extracted forty fully readable HTML emails, and twenty of those emails were system alerts generated by the IoT devices themselves. What kind of alerts? Status updates from the Nest system, subscription alerts, safety summaries, things like that.
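
Editor's sketch: the carving technique described here can be approximated with Python's `zlib` module. This is a simplified illustration, not the students' procedure. It scans a blob for the two bytes `78 9C` and tries to inflate from each hit. Note one difference from the description above: Python's `zlib` expects the header to stay attached, so the sketch does not strip it.

```python
import zlib

ZLIB_MAGIC = b"\x78\x9c"  # zlib header for the default compression level

def carve_zlib_streams(blob):
    """Return (offset, decompressed_bytes) for every complete zlib stream found."""
    results = []
    start = blob.find(ZLIB_MAGIC)
    while start != -1:
        inflater = zlib.decompressobj()
        try:
            data = inflater.decompress(blob[start:])
            # eof is True only if the stream ended cleanly (checksum included);
            # trailing bytes after the stream land in inflater.unused_data.
            if inflater.eof and data:
                results.append((start, data))
        except zlib.error:
            pass  # the two bytes matched by coincidence; keep scanning
        start = blob.find(ZLIB_MAGIC, start + 1)
    return results

# Synthetic blob: junk bytes, one compressed HTML fragment, then more junk.
sample = b"\x00junk" + zlib.compress(b"<html>alert</html>") + b"tail"
print(carve_zlib_streams(sample))
```

Other compression levels produce different second bytes (for example `78 01` or `78 DA`), so a real carver would likely check more than one signature.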

Speaker 1

Okay, that sounds like an absolute gold mine for building a timeline.

Speaker 2

It sounds like one, yeah. But this highlights the sheer frustration of digital forensics. After all that manual hex analysis and decompression, the emails were a total dead end.

Speaker 1

You are kidding me.

Speaker 2

Nope. They provided some background context about the network configuration, sure, but absolutely no operational intelligence regarding the raid or the fire. There's no smoking gun in the inbox.

Speaker 1

Wow. That is brutal. But I guess that dead end forces them to pivot, which really validates the whole experiential learning model we talked about. They hit a wall with the emails, so they pivot their focus to the Amazon Echo.

Speaker 2

Right. They pull a database file called cift_amazon_alexa.db, and inside they locate an audio .wav file that was recorded and cached at exactly 10:22 a.m.

Speaker 1

Okay, 10:22.

Speaker 2

They hit play on the file and it is this incredibly loud, piercing, rhythmic beeping sound.

Speaker 1

Now, human intuition takes over here. You are actively investigating an arson case at a drug lab. You find an audio file of a blaring alarm in the house at 10:22. The immediate assumption is that the fire started at 10:22 and triggered the smoke detector.

Speaker 2

Oh, absolutely. Any traditional detective walking onto that scene listens to that tape and immediately writes down in their notebook: 10:22 a.m., incendiary device deployed.

Speaker 1

Right. But this is digital forensics. You don't trust your ears. You trust the corroborating logs.

Speaker 2

Exactly. And if we connect this to the bigger picture, this is arguably the most critical lesson in the entire DFRWS curriculum. In a distributed IoT environment, isolated data points are inherently untrustworthy.

Speaker 1

You need multiple points of failure, or confirmation, rather.

Speaker 2

Yes, you must synthesize and cross-reference across different vendor ecosystems. The students absolutely refused to accept the audio file at face value. They were digging for the raw telemetry from the Nest Protect smoke detector.

Speaker 1

They found the cache file for it, right? A JSON payload?

Speaker 2

Yeah, a file labeled cache-1332252362.json.

Speaker 1

And JSON payloads in proprietary apps can be deeply nested and heavily obfuscated.

Speaker 2

Right, very much so. But they managed to parse the key-value pairs, and when they extracted the timestamps for the specific event key labeled protect_smoke_warn, that data was completely contradictory to the audio file.

Speaker 1

Oh really, what did it say?

Speaker 2

The Nest Protect did not log a smoke event until 10:36 a.m.
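
Editor's sketch: pulling one event key out of a deeply nested JSON cache can be done with a small recursive walk. The sample payload below is a hypothetical structure invented for illustration; the real Nest cache layout is not described in the episode, and the epoch value is a placeholder for 10:36:00 lab-local time.

```python
import json

def find_key(node, wanted, path=""):
    """Yield (path, value) for every occurrence of `wanted` in nested dicts/lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}/{key}"
            if key == wanted:
                yield here, value
            yield from find_key(value, wanted, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key(item, wanted, f"{path}[{index}]")

# Hypothetical payload shape, for illustration only.
sample = json.loads("""
{"device": {"type": "protect"},
 "events": [{"protect_smoke_warn": 1526546160}]}
""")
print(list(find_key(sample, "protect_smoke_warn")))
```

Recording the path alongside the value matters for reporting: an examiner can state exactly where in the file the timestamp was found, which makes the finding reproducible.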

Speaker 1

10:36. That is a massive fourteen-minute gap.

Speaker 2

Right. The environmental sensor definitively proves there was zero smoke or particulate matter in the air at 10:22.

Speaker 1

Okay, so this forces the students to completely re evaluate the audio file. If it's not a smoke alarm, what is it?

Speaker 2

Well, they cross-referenced the 10:22 timestamp back to the iSmartAlarm logs in that TB_IPUDairy database. They discovered that at the exact second the Echo recorded the beeping, the peripheral door sensor logged that the front door was ajar.

Speaker 1

Oh, I see where this is going.

Speaker 2

While simultaneously a voice command was issued to arm the system.

Speaker 1

That is such a satisfying piece of deduction. The Amazon Echo wasn't recording a fire alarm at all. It accidentally woke up as an open mic and captured the security keypad furiously beeping because Pinkman was trying to arm the system while the front door was wide open.

Speaker 2

Exactly, it completely rewrites the physical timeline. It proves Pinkman was just fumbling with the security system at 10:22, not fleeing an active fire.
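
Editor's sketch: the cross-referencing step can be expressed as a simple window query over events pooled from several devices. The event list below paraphrases what the hosts describe; the seconds values are placeholders, since the episode gives most of these times only to the minute, and the 60-second window is an arbitrary assumption.

```python
from datetime import datetime, timedelta

def at(hhmmss, day="2018-05-17"):
    """Build a lab-local datetime from an HH:MM:SS string."""
    return datetime.strptime(f"{day} {hhmmss}", "%Y-%m-%d %H:%M:%S")

def corroborating(events, anchor_time, window=timedelta(seconds=60)):
    """Return events from any source that fall within `window` of the anchor."""
    return [event for event in events if abs(event[0] - anchor_time) <= window]

events = [
    (at("10:22:00"), "Amazon Echo", "cached audio of rhythmic beeping"),
    (at("10:22:00"), "iSmartAlarm", "front door sensor reports open"),
    (at("10:22:00"), "iSmartAlarm", "voice command to arm the system"),
    (at("10:36:00"), "Nest Protect", "smoke warning logged"),
]

for when, source, what in corroborating(events, at("10:22:00")):
    print(when.time(), source, what)
```

The smoke warning falls outside the window, which is the point of the exercise: the beeping is explained by the alarm events that share its timestamp, not by the fire.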

Speaker 1

It perfectly demonstrates why digital investigations require this relentless skepticism of your own biases. You really have to let the synthesized network data dictate the physical narrative, not the other way around.

Speaker 2

You do. Data first, narrative second.

Speaker 1

Okay, let's unpack this, because with all these disparate logs finally synchronized on a unified timeline (we've got the iSmartAlarm database, the parsed Nest JSON, the offset epoch times), we can actually answer the Attorney General's primary questions.

Speaker 2

Right: who initiated the fire, and what the hell happened to the disabled QBee camera guarding the entrance?

Speaker 1

Let's walk through those final minutes leading up to the arson. The synchronized timeline is incredibly tight here.

Speaker 2

Very tight. So at 10:34 and 17 seconds, the logs indicate the system is set to home mode by the user profile TheBoss.

Speaker 1

And by correlating this with the Amazon Echo logs, we know TheBoss utilizes voice commands, which indicates Pinkman is physically present inside the lab, actively securing the perimeter.

Speaker 2

Right. But a mere fourteen seconds later, at 10:34 and 31 seconds, that exact same SQLite database logs a state change. The alarm is abruptly disarmed.

Speaker 1

Wait, disarmed? And the user token attached to that disarm command? pandadodu. Wow.

Speaker 2

This is immediately followed by the contact sensor on the front door logging an open state. Then the logs show a dark period.

Speaker 1

A dark period? For how long?

Speaker 2

There is a gap of about ninety seconds of completely unrecorded physical activity inside the lab. And then, at 10:36 a.m., the Nest JSON file registers the protect_smoke_warn key.

Speaker 1

The arson attempt is officially underway. Exactly. I mean, it doesn't take a twenty-year homicide veteran to piece that logic together. The user profile pandadodu is glaringly, obviously the individual they have in custody, D. Pandana.

Speaker 2

Yep. The logs prove he authenticated, dropped the alarm system, and opened the door a minute and a half before the accelerant was lit.
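
Editor's sketch: once every source is on the same clock, building the unified timeline is mostly a sort plus a gap calculation. The two alarm timestamps come from the episode. The smoke warning is given only as roughly 10:36, so `10:36:00` below is a placeholder and the last gap merely approximates the "about ninety seconds" the hosts mention.

```python
from datetime import datetime

def build_timeline(events, day="2018-05-17"):
    """Sort (source, HH:MM:SS, description) events and attach the gap to the previous one."""
    parsed = sorted(
        (datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S"), source, what)
        for source, clock, what in events
    )
    timeline, previous = [], None
    for when, source, what in parsed:
        gap = None if previous is None else (when - previous).total_seconds()
        timeline.append((when, source, what, gap))
        previous = when
    return timeline

events = [
    ("Nest Protect cache", "10:36:00", "protect_smoke_warn logged"),  # seconds assumed
    ("iSmartAlarm DB", "10:34:31", "alarm disarmed by pandadodu"),
    ("iSmartAlarm DB", "10:34:17", "home mode set by TheBoss"),
]

for when, source, what, gap in build_timeline(events):
    print(when.time(), f"{source}: {what}", f"(+{gap:.0f}s)" if gap is not None else "")
```

Printing the gaps explicitly makes silent stretches stand out, which is how a "dark period" between the door opening and the smoke warning becomes visible in the first place.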

Speaker 1

The students logically concluded he was the primary infiltrator. The digital breadcrumbs place him actively bypassing the perimeter security at the precise moment the raid initiates.

Speaker 2

They absolutely nailed him, but the Attorney General had one lingering question that really threatened the integrity of the case.

Speaker 1

Right, the QBee camera.

Speaker 2

Yeah, the QBee camera pointing directly at the entry vector was dead. So the question is: did the fire melt the circuitry, or was it intentionally neutralized by the suspects beforehand?

Speaker 1

And this is where the dynamic of having competing student groups really pays off. Group one, the ones who did that amazing manual file carving on the zlib emails, they actually hit a brick wall with the QBee camera.

Speaker 2

They did. They did fantastic work pulling cached thumbnail artifacts from the Arlo base station. They showed unidentified figures moving in the lab, but the QBee camera just remained a total black hole to them.

Speaker 1

But Group two took a different route.

Speaker 2

Group two approached the camera anomaly from a network infrastructure perspective. Remember the Wireshark PCAP file they used to anchor the S6 Edge's MAC address earlier? Right, the packet capture. They went back into those raw packets, but instead of looking for data payloads, they started hunting for dropped connections.

Speaker 1

See, I am naturally skeptical of this approach. How do you prove a camera was intentionally tampered with just by looking at a PCAP file? Couldn't the camera have just experienced, like, a power surge or a flaky Wi-Fi drop?

Speaker 2

That is exactly the assumption defense attorneys love to make. But network traffic has specific behavioral baselines.

Speaker 1

Okay, what does that mean in this context?

Speaker 2

Group two filtered the PCAP for the Raspberry Pi router's automated keep-alive pings. These are just little signals constantly hitting the QBee camera's IP address to make sure it's still there.

Speaker 1

Just saying, "Are you there? Are you there?"

Speaker 2

Exactly. And they noticed that the camera suddenly stopped acknowledging the pings. There was no TCP teardown, no graceful network disconnect, just a sudden, hard drop into a dead state. Oh, and crucially, this anomalous flatline occurred prior to Pandana dropping the alarm at 10:34.
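
Editor's sketch: the "absence of packets" argument can be checked mechanically. The function below works on already-parsed packet summaries, `(timestamp, source_ip, destination_ip)` tuples, and asks whether a host had stopped answering probes before a given event. The IP addresses, the timestamps, and the three-probe threshold are all hypothetical.

```python
def last_reply_and_unanswered(packets, prober_ip, target_ip):
    """Return (time of last reply from target, probes sent since that reply)."""
    last_reply, unanswered = None, 0
    for ts, src, dst in sorted(packets):
        if src == target_ip and dst == prober_ip:
            last_reply, unanswered = ts, 0  # the target answered; reset the counter
        elif src == prober_ip and dst == target_ip:
            unanswered += 1                 # a probe that has not been answered yet
    return last_reply, unanswered

def went_silent_before(packets, prober_ip, target_ip, event_ts, min_unanswered=3):
    """True if the target had ignored several consecutive probes before event_ts."""
    before = [packet for packet in packets if packet[0] < event_ts]
    last_reply, unanswered = last_reply_and_unanswered(before, prober_ip, target_ip)
    return last_reply is not None and unanswered >= min_unanswered

ROUTER, CAMERA = "192.0.2.1", "192.0.2.20"  # placeholder addresses
packets = [(t, ROUTER, CAMERA) for t in (0, 10, 20, 30, 40, 50)]
packets += [(0.01, CAMERA, ROUTER), (10.01, CAMERA, ROUTER)]

print(went_silent_before(packets, ROUTER, CAMERA, event_ts=45))
```

Requiring several consecutive misses, rather than one, is a small guard against the flaky Wi-Fi objection raised above. It still cannot distinguish an unplugged cable from a hardware fault on its own, so the timing relative to the alarm being disarmed carries the weight of the argument.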

Speaker 1

Ah. I see, It's like a hospital heart monitor flatlining before the patient even goes into surgery.

Speaker 2

That's a perfect analogy.

Speaker 1

In network forensics, the absolute absence of packets is just as vital as a data payload. The total silence from that specific IP address preceding the physical breach by Pandana proves the camera didn't melt in the fire at all. It was preemptively and manually disconnected from the network infrastructure. It was premeditated tampering.

Speaker 2

It was. And this raises a really important question about how we structure forensic teams in the real world. Think about it. Group one hyper-focused on visual artifacts and file carving. Group two focused on network topologies and packet analysis.

Speaker 1

Right, and neither group possessed the complete picture independently.

Speaker 2

Exactly. Experiential learning proves that modern digital investigations require serious cross-disciplinary collaboration. You need the host-based forensic analyst parsing databases right alongside the network engineer analyzing traffic flows to build an undeniable case.

Speaker 1

It's not just a lone-wolf hacker staring at a terminal in a dark room anymore. It's a synchronized team attacking the architecture from multiple vectors.

Speaker 2

Yeah, one person is writing Python scripts to offset epoch time, someone else is isolating zlib headers in a hex editor, and a third person is filtering millions of packets until they realize a simple lack of ping responses proves criminal intent.

Speaker 1

And that synthesis is really the ultimate takeaway from this analysis. These students took raw hexadecimal code, abstract network packet captures, and these highly confusing cache files and molded them into a highly specific, undeniable timeline of human behavior.

Speaker 2

They proved D. Pandana, operating under the alias pandadodu, bypassed a perimeter alarm, infiltrated a secure facility, and less than two minutes later triggered an environmental sensor with an arson attempt, all while a key surveillance node had been systematically blinded beforehand.

Speaker 1

Incredible. So what does this all mean for you, listening outside the classroom? It's an incredibly rigorous piece of analytical work by these students, but it brings us right back to the thought experiment we started with.

Speaker 2

The one about your own room.

Speaker 1

Yes, think about your own home network. Every single time you open your smart lock, every time you ask your voice assistant to dim the lights, every time your phone quietly handshakes with your router just to maintain a connection, you are generating a forensic timeline.

Speaker 2

You really are. You are continuously writing an autobiography encoded in proprietary databases that you will probably never have the tools to actually look at.

Speaker 1

We operate in an era where our physical environments are meticulously logged by sensors we explicitly purchased and installed just for our own comfort.

Speaker 2

But, as the DFRWS challenge so clearly demonstrates, those sensors possess absolutely no loyalty to their owners. They simply log the state changes truthfully and coldly.

Speaker 1

Which leaves you with the thought I just haven't been able to shake since reviewing these case files. What's that? As our homes, our vehicles, and our urban infrastructure become perfectly recorded, deeply interconnected IoT environments, what happens to the concept of an unwitnessed event? Wow. If every smart plug, smoke detector, and Wi-Fi router is silently maintaining a database of state changes, is the perfect crime even theoretically possible anymore? Or will the burglars and arsonists of the future have to possess the skill set of a senior network administrator, meticulously spoofing MAC addresses and wiping JSON caches just to get away with kicking in a front door?

Speaker 2

It's a terrifying thought.

Speaker 1

It is definitely something to think about the next time you ask your smart speaker to check the weather.

Transcript source: Provided by creator in RSS feed: download file