Java Memory Model (JMM)

Jan 22, 2026, 29 min

Episode description

RexIDE - https://rex.mindmeld360.com

We explore the fundamental concept of Happens-Before (HB)—not a measure of time, but a formal mathematical relationship that provides the only real guarantee of visibility and ordering in a multi-threaded world. Without an HB edge, your threads are effectively living in different realities, where one thread might see an object has been published but observe its internal fields as stale or uninitialized zeros.

Transcript

Quick note, this episode isn't sponsored. I'm building a new kind of IDE called Rex because existing ones make it hard to work across multiple projects in parallel. I'm sharing it to get feedback from listeners. I'd really love to hear your thoughts. The link is in the description. And now let's move on with today's super interesting episode. Welcome back to the Deep Dive. Today we are tackling a topic that really acts as a kind of dividing line in the career of a

software engineer. Yeah, it really does. It's the difference between someone who you know knows how to write Java syntax, how to get the code to compile, and someone who actually understands what the machine is doing with that code. We are diving into the abyss of the Java Memory Model or the JMM. It really is the final boss of Java interviews, isn't it? It's that layer where the abstraction we all rely on starts to leak. It leaks everywhere. It does. I mean you spend your early

career being told Java is safe. The JVM manages memory for you, don't worry about pointers. And then you open the JMM specification and realize, oh, actually there is a tremendous amount of invisible chaos happening underneath my feet. An invisible chaos is the perfect phrase for it. I was reading through the breakdown of the Java Language specification for this, and honestly the thing that struck me most is that the computer is basically lying to us.

It is, constantly, yeah, and for good reason. We grow up thinking of code as a recipe, right? Step one, break the eggs. Step two, whisk the eggs. Step three, fry the eggs. We assume this strict linear order, of course, but looking at how the JMM works, the computer might decide to heat the pan before it even buys the eggs, provided it knows the result will be the same to you. That is the illusion of serial execution, and honestly, it is a

necessary lie. If the computer actually executed instructions in the strict order you wrote them, modern software would be crawling. It would be unusably slow. Really that slow? Oh, absolutely. The hardware relies on that deception, on that reordering, to deliver the performance you expect. So our mission today is to

dismantle that illusion. We need to explain why the hardware cheats, how it gets away with it, and most importantly, when that cheating causes our programs to crash in ways that seem completely impossible. Right. We're going to cover happens-before, which sounds like a time travel paradox but is actually the law of the land. We're going to talk about volatile, which I feel like almost everyone uses wrong at least once in their career.

Or they use it everywhere because they're scared, which is also wrong. Exactly, that's the other side of the coin. And we'll dissect the infamous double-checked locking bug, which is basically a horror story for developers. A true rite of passage. Plus we'll look at the modern tools like VarHandles. The very sharp knives in the drawer. Let's start with the basics, then. Segment one: the reality check. Why does the hardware reorder my code?

I mean, if I write x = 1 followed by y = 2, why is it so hard for the CPU to just, you know, do that in order? To understand that, you have to understand something called the memory wall. This is the fundamental, massive gap between CPU speed and memory speed. A modern CPU is screaming fast. It can execute billions of cycles per second. It is a Ferrari. But main memory, your RAM sticks? Comparatively, they're ancient. They're just so slow. So if the CPU is a Ferrari, what

is RAM? A horse and buggy. If the CPU is a Ferrari, fetching data from main RAM is like waiting for a postal delivery from the next town over. It takes forever in CPU time. Hundreds, sometimes thousands of CPU cycles can pass in the time it takes to get one piece of data. Wow. So imagine the CPU hits a line of code that says load data from variable A. Variable A is out in main memory.

The CPU has a choice. It can sit there idling, wasting all those hundreds of cycles doing absolutely nothing while it waits for that data to arrive. Which is a massive waste of expensive silicon. A huge waste. Or it can look ahead. It scans the next few lines of your code and sees, oh, the next line is set variable B to 10. Now, variable B is already sitting in the cache, right there in the CPU's local memory.

The CPU says, you know what, I'm going to do this second thing while I wait for the first thing to arrive. So it reorders the instructions purely to hide the latency, to hide that postal delivery time. Exactly. To the CPU, and to the just-in-time compiler, which does similar tricks, the rule is: we can shuffle this deck of instructions however we want as long as the result within the single thread remains correct. That's the key phrase right there: within the single thread. That's everything.

The CPU keeps track of its own mess. It knows, OK, I wrote to B before A even though the code said A before B. So if the user asks for the value of A later, I have to make sure I stall and present the correct result. It maintains what's called as-if-serial semantics. As if. As long as you are the only thread running, you will never, ever know the reordering happened. You live in a happy, consistent, linear world. But the moment we introduce a second thread.

The illusion shatters completely. Yeah, because thread 2 doesn't know about thread 1's internal bookkeeping. Thread 2 just looks at shared memory. If thread 1 wrote B first because it was faster and didn't need a memory fetch, thread 2 might look at memory and see that B has changed, but A is still null or its old value. Even though in the code I wrote, A was clearly assigned first. Correct.
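The scenario just described can be sketched in Java. This is a minimal illustration, not code from the episode: the class name RacyPair and its fields are hypothetical, and the fields are deliberately plain (non-volatile, unsynchronized) so that no happens-before edge exists between a writer thread and a reader thread.

```java
// Hypothetical illustration; names are assumptions, not from the episode.
final class RacyPair {
    int a; // plain field: no cross-thread visibility or ordering guarantee
    int b; // plain field

    // Program order says a is written before b.
    void writer() {
        a = 1;
        b = 2;
    }

    // Called from the SAME thread after writer(), this always sees {1, 2}
    // (as-if-serial semantics). Called from ANOTHER thread with no
    // synchronization, the JMM permits it to observe b == 2 while a is
    // still 0, because nothing orders the two writes for that reader.
    int[] reader() {
        int seenB = b;
        int seenA = a;
        return new int[] { seenA, seenB };
    }
}
```

Within one thread the result is always consistent; the surprising outcome is only allowed, never guaranteed, when reader() runs on a different thread, which is part of why such bugs are hard to reproduce.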

From thread 2's perspective, thread 1 is executing code backwards or in random bursts, and that is where you get those bugs that are literally impossible to reproduce on purpose. We call them Heisenbugs, right? From the Heisenberg uncertainty principle. Yes, because the moment you try to observe them, say by adding a System.out.println or attaching a debugger, that changes the timing. It introduces overhead. And the reordering disappears. Exactly.

The CPU might stop optimizing so aggressively because of the new overhead. So you attach the debugger and the bug vanishes. You take it off and your application crashes an hour later. It's maddening. That is terrifying. It's effectively gaslighting by the processor. It really is. And it's not just reordering. That's only half the problem. The other half is visibility. OK, we have to talk about the physical architecture again.

We tend to assume memory is this one big shared whiteboard that everyone sees at the same time, but it's not. Right, the architecture involves caches: L1, L2, L3 caches on the chip. Think of it this way. You and I are working on a shared project. I am thread A, you are thread B, but we are working in different offices down the hall from each other. Main memory is a bulletin board in the hallway. OK, I like this analogy. I have a notebook on my desk.

That's my super fast L1 cache. You have your own notebook on your desk. I write x = 1 in my notebook. Do you see it? No, I'm in the other room. I only see what's on the bulletin board in the hallway, which is the old value of x. Exactly. And here is the kicker: I am lazy. Or rather, I am efficient. I'm not going to stand up and walk down the hall to the bulletin board every single time I write a number. That takes too much time. That's like a main memory

access, too slow. Right, I'm going to scribble in my notebook for a while. Eventually, maybe when my cache line gets full or I'm forced to, I'll walk down and pin my pages to the board. That action is flushing to main memory. And even if you do pin it, I don't see it immediately, do I? Because I'm just staring at my own notebook. I'm not constantly checking the hallway. Correct. You have to decide to stand up, walk to the hall, and check the board to update your own notebook.

That's invalidating your cache. If we don't have a system, a protocol, to agree on when to walk to the hallway, we're living in two different realities. I think x is 1. You think x is 0, and we are both right according to our local hardware. And the Java Memory Model is basically the schedule for walking to the hallway. It's the set of rules that forces us to communicate. That is a perfect way to put it. The JMM is the contract.

It tells us if you use these specific keywords, the JVM guarantees that thread A will post to the board and thread B will go look at it. Without those keywords, the JVM promises absolutely nothing. The hardware can stay in its local cache forever if it wants to. OK, so let's look at the contract. This brings us to segment 2. The spec uses this term happens before. It sounds very chronological.

Action A happens before action B, but based on what you just said about reordering, getting hung up on time seems like a huge mistake. It's the biggest mistake you can make. Happens before is a legal term, not a physics term. It's about visibility and ordering constraints. If we say action A happens before action B, we are strictly saying one thing. Whatever result was produced in action A is guaranteed to be visible to action B, and that's it.

That's the core guarantee. Even if in physical wall clock time they happen almost simultaneously, or if the hardware did some weird time travel optimization, the final observable result must look as if A happened first. So how do we forge this chain? Because by default, two threads have no happens-before relationship. They're in those separate offices, right? We need synchronization actions to create a link. The most basic one is the program order rule, which we already

discussed. Inside one thread, line 1 happens before line 2. That's the easy one, the one that gives us the illusion of sanity. Exactly. The interesting ones are between threads. The classic mechanism is the monitor lock. This is when we use the synchronized keyword? Yes. The rule is simple but powerful. An unlock on a monitor happens before every subsequent lock on that same monitor. Let's apply the hallway analogy to synchronized. What's happening there? OK.

The monitor is like a specific pass key to the bulletin board. When I finish my synchronized block, I release the lock. The JMM rule forces me to take all my scribbles from my notebook, everything I have done up to that point, and pin them to the board. I flush everything. It's a full sync. A full sync. Now you come along and you want to enter a synchronized block on that same object. You have to acquire the lock. I have to get the

pass key. And that action of acquiring the lock forces you to throw away your stale notebook pages regarding those variables and read fresh ones from the board. You invalidate your cache. So because I unlocked, which means I wrote to the board, and then you locked, which means you read from the board, you are guaranteed to see all of my changes. That makes perfect sense. That's why synchronized is safe. It forces the walk to the hallway. But synchronized is considered

heavy. It stops other threads, it blocks. It can cause contention. What if I don't want to stop the world? Then you look at volatile, and this brings us to segment 3. This is possibly the most misunderstood keyword in the entire Java language. I feel like I see volatile used whenever someone is just guessing. My multithreaded code is buggy, let's throw volatile on some variables and see if it helps. That is voodoo programming.

You're just poking things hoping the magic works, but volatile is much more subtle. It is lighter weight than a lock because it doesn't cause a thread to block. Thread B doesn't have to wait for thread A to finish a big block of code, but it has very strict rules. OK. What's the happens-before rule for volatile? The rule is: a write to a volatile variable happens before any subsequent read of that same variable. OK, let's use the analogy again. I have a volatile boolean called

flag. When I, in thread A, write flag = true, because flag is volatile, I am forced to go to the bulletin board and update it immediately. But here's the critical part, the part most people miss. I don't just update flag, I update everything I wrote before that volatile write. Wait. Everything, even the normal non-volatile variables? Yes. This is the piggybacking effect. This is the aha moment for understanding the JMM. Imagine I write to a normal non-volatile variable: data = 42.

That's just a scribble in my notebook. Then on the very next line I write to a volatile boolean: ready = true. Because ready is volatile, I am forced to go to the hallway, but the rule says I have to flush my state. So I pin up the data = 42 page first and then I pin up the ready = true page. So the volatile variable acts as a sweeper. It pushes all the prior changes along with it. Exactly. It's a memory fence.

It says all writes prior to this point must be made visible before this volatile write can be made visible. It's called a release fence. Now on your side, you're in thread B. You're in a loop checking the value of ready. The moment you read true... Which is a volatile read. Which is a volatile read, it acts as an acquire fence.

The JMM guarantees that you not only see ready is true, but you're also guaranteed to see data = 42. The normal variable piggybacked on the visibility of the volatile one. That is incredibly powerful. You can use one volatile flag to signal that a huge graph of objects is ready to be read without having to make every single field in those objects volatile. Correct. It is the basis of many lock-free algorithms. But here is where you have to be so careful.
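The data = 42 / ready = true handoff described above can be sketched as follows. The field names data and ready come from the discussion; the class name Handoff and the demo() helper are hypothetical, added only to make the sketch runnable.

```java
// Hypothetical sketch of the piggybacking effect; Handoff and demo() are
// illustrative names, not from the episode.
final class Handoff {
    int data;               // plain, non-volatile field
    volatile boolean ready; // volatile flag that carries the visibility

    // Writer side: the plain write comes first in program order, then the
    // volatile write, which acts as the release.
    void publish() {
        data = 42;
        ready = true;
    }

    // Reader side: spin on the volatile read (the acquire). Once it returns
    // true, the earlier write to data is guaranteed to be visible.
    int awaitData() {
        while (!ready) {
            Thread.onSpinWait();
        }
        return data;
    }

    // Runs one writer and one reader thread and returns what the reader saw.
    static int demo() {
        Handoff h = new Handoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = h.awaitData());
        Thread writer = new Thread(h::publish);
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join(); // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If ready were a plain boolean, the loop would be allowed to spin forever or to return a stale data value; the single volatile keyword is what creates the happens-before edge.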

Because volatile guarantees visibility, people assume it guarantees atomicity, and it does not. OK. Let's define the difference clearly. Synchronized provides atomicity. It's like locking the door to the office so only one person can be in there touching the data at the same time. Does volatile do that? Not at all. Volatile guarantees we're all looking at the same page on the bulletin board. It does not guarantee that we won't try to write on that page at the same time and overwrite

each other's handwriting. The classic interview trap is: is count++ thread safe if the integer count is declared volatile? My gut reaction is to say yes, because if I read it, I'm guaranteed to see the very latest value written by another thread. And that's the trap everyone falls into. count++ looks like one instruction, but under the hood it is actually three separate operations. One, read the current value of count. Two, add 1 to that value in a CPU register. Three,

write the new value back to count. A read-modify-write operation. OK, so imagine count is five. I am thread A. I read count. I see five. You're thread B, you read count. You also see 5, because we both have perfect visibility thanks to volatile, right? We're both up to date. Now I add 1 to my local copy, I have 6. At the same time you add 1 to your local copy, you have 6. I write my 6 back to the count variable, it's now 6. Then you write your 6 back to the count variable.

It's still 6. We both did the work but the counter only went up by one. We lost an update because volatile didn't lock the door while we were doing the math. It just made sure we both saw the number 5 on the board before we went back to our desks. Precisely. Yeah, it just made sure we saw the door was open. So volatile is fantastic for status flags. Things like "I am done", "stop working", "configuration loaded": simple booleans or indicators that are written by one thread and read by many.
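The lost update walked through above can be replayed deterministically. The sketch below is a single-threaded replay of that exact interleaving, not a real race: the class name LostUpdateReplay and its local variable names are hypothetical, and each statement is labeled with the thread that would have executed it.

```java
// Hypothetical single-threaded replay of the interleaving described above.
final class LostUpdateReplay {
    static volatile int count = 5;

    static int replay() {
        count = 5;
        int seenByA = count;       // thread A reads 5 (volatile: up to date)
        int seenByB = count;       // thread B reads 5 (also up to date)
        int resultA = seenByA + 1; // A computes 6 in its own register
        int resultB = seenByB + 1; // B computes 6 in its own register
        count = resultA;           // A writes 6
        count = resultB;           // B writes 6, overwriting A's update
        return count;              // two increments, but only 6
    }
}
```

Every read here sees the latest value, which is all volatile promises; the update is lost because the read, the add, and the write are three separate steps.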

It is not for counters, and it is not for any complex business logic where the new value depends on the old value. So if I need to do count++ safely, I need to use an AtomicInteger or a lock? Yes. AtomicInteger uses a special hardware instruction, CAS, or compare-and-swap, which essentially does that whole read-modify-write cycle as one indivisible atomic hardware operation. But that, as you said, is a whole other deep dive.
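Both options mentioned here, an AtomicInteger and a lock, can be sketched side by side. The class SafeCounters and its hammer() helper are hypothetical names used only for illustration; AtomicInteger.incrementAndGet() and the synchronized keyword are standard Java.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: two safe ways to do count++ across threads.
final class SafeCounters {
    private final AtomicInteger atomicCount = new AtomicInteger();
    private int lockedCount; // guarded by this object's monitor

    // Atomic read-modify-write, no blocking.
    void incrementAtomic() {
        atomicCount.incrementAndGet();
    }

    // The monitor makes read, add, and write one critical section.
    synchronized void incrementLocked() {
        lockedCount++;
    }

    int atomic() {
        return atomicCount.get();
    }

    synchronized int locked() {
        return lockedCount;
    }

    // Starts the given number of threads, each incrementing both counters
    // perThread times, and waits for all of them to finish.
    static SafeCounters hammer(int threads, int perThread) {
        SafeCounters c = new SafeCounters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.incrementAtomic();
                    c.incrementLocked();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c;
    }
}
```

With either approach no increment is lost, so four threads doing 10,000 increments each leave both counters at 40,000.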

OK, I want to pivot to the horror story we mentioned in the intro. This seems like a good time for a case study. Segment 4: double-checked locking. This pattern looks so smart on paper, and yet it has caused so much grief. It's the siren song of concurrency. Every intermediate developer thinks they've invented it, and every senior developer has to patiently explain why they need to delete it. Let's set the scene. We have a singleton, some big heavy object.

We don't want to create it when the application starts, we want to create it lazily, the first time someone asks for it. Lazy loading, a standard requirement. So the simple, 100% correct, safe way is to just make the entire getInstance method synchronized. You basically say: lock the door, check if the instance is null, if yes, create it, unlock the door, return it. That works perfectly. It is 100% correct. You can ship that code and sleep well at night. But...

There's always a but. But synchronization implies overhead. In early Java versions it was quite slow, and developers, being clever, thought: why am I paying the cost of locking every single time I want to get this object? I only need to lock once, the very first time I create it. After that it's just a read operation. Reading shouldn't require a lock. That logic sounds solid. The lock is only for the creation path. So they invented double-checked locking. It goes like this.

Step one, check if the instance is null without a lock. If it is not null, just return it. That's the fast path. If it is null, then and only then do you enter a synchronized block. Step two, inside the lock, you check if it's still null. You have to do this because another thread might have beaten you to the lock while you were waiting. The double check, exactly. And if it's still null, step three, you create the object: instance = new Singleton(). It sounds

bulletproof. You're only locking when absolutely necessary. But in Java versions before 5, which was in 2004, and even today, if you don't use volatile correctly, this code is lethal. It's a ticking time bomb. Explain the failure mode, because if I'm inside the synchronized block I'm the only one creating the object. I feel safe. You are safe. Thread A, the creator, is fine. The victim is thread B, who comes along a nanosecond later. Thread B comes along.

It hits that first check, the one outside the lock. In Java bytecode and in machine code, creating an object is not one atomic action, it's effectively three high-level steps. One, allocate memory for the object, grab some bytes from the heap. Two, initialize the object: run the constructor code, set the fields to their initial values. Three, point the instance variable to that newly allocated memory address. OK. Allocate, initialize, publish, in that order. One, two, three. But remember

our old friend reordering. The compiler and the CPU are perfectly allowed to swap steps two and three if it's faster. They might say, you know what, I'll point the instance variable to the memory address first and then I'll run the slow constructor. Wait, so there's a brief moment where the instance variable is not null. It points to a valid memory address, but the constructor hasn't run yet. Precisely.

The memory it points to contains default values: zeros for integers, nulls for any object references. It is a partially constructed object. Now thread B hits that first check: is instance null? Well, step three happened. The reordered write to instance happened. So instance is pointing to a memory address, it is not null. Thread B says great, it's ready. It skips the lock entirely, grabs the object reference and tries to use it. But the

constructor hasn't run. The fields aren't initialized. Exactly. Thread B is holding a hollow object, a shell. It tries to call a method on it, encounters a null field that should have been initialized, and your program throws a NullPointerException. Or worse, it calculates a financial transaction using 0 as the interest rate, because the rate field wasn't set yet. That is insidious. The object exists, but it's a ghost. We

call it unsafe publication. You publish the reference to the object before you finish building the object. So how does the fix work? The notes say you just add volatile to the instance variable. How does one word fix all of this? Right. Remember the happens-before rule for volatile. A volatile write creates a release fence. If the instance field is declared volatile,

the JMM forbids the reordering of the write to instance with any of the operations that happened before it in program order, which includes the constructor initialization. So volatile forces the order back to one, two, three. Allocate, initialize, then publish. Exactly. It ensures that if thread B sees a non-null instance, it is absolutely guaranteed to see the fully constructed, fully initialized object. It cannot see the half-baked version.
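The fixed pattern described above can be sketched like this. The class name LazySingleton and the rate field are hypothetical (rate echoes the interest-rate example from the discussion and is deliberately non-final, so the guarantee in this sketch comes from volatile alone). The local variable is a common refinement to avoid a second volatile read on the fast path, not something the episode specifies.

```java
// Hypothetical sketch of double-checked locking with the volatile fix.
final class LazySingleton {
    // Without volatile, another thread could see a non-null reference to an
    // object whose constructor writes are not yet visible.
    private static volatile LazySingleton instance;

    private int rate; // deliberately non-final for this illustration

    private LazySingleton() {
        this.rate = 5; // the initialization that must not be skipped
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;     // first check, no lock (fast path)
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;           // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;       // volatile write: publish last
                }
            }
        }
        return local;
    }

    int rate() {
        return rate;
    }
}
```

Any thread that reads a non-null instance through the volatile field also sees rate == 5, because the constructor's writes happen before the volatile write that published the reference.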

It's amazing that one keyword has such a profound effect on the CPU pipeline, but the expert notes also mentioned that we probably shouldn't be writing double-checked locking at all anymore. Yeah, it's basically considered an anti-pattern now. It's a great teaching tool for the JMM, but in practice there are better ways. If you want a lazy, thread-safe singleton, just use the initialization-on-demand holder idiom. Which is using a static inner class, right?
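A minimal sketch of the initialization-on-demand holder idiom named here, assuming a hypothetical class called HolderSingleton. The nested Holder class is not initialized until getInstance() first touches it, and the JVM serializes that initialization.

```java
// Hypothetical sketch of the initialization-on-demand holder idiom.
final class HolderSingleton {
    private HolderSingleton() {
    }

    // Initialized lazily, on first access to Holder.INSTANCE, and only once.
    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE; // no explicit lock, no volatile
    }
}
```

There is no explicit locking in user code, so there is nothing to get subtly wrong on the fast path.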

Yes. You create a private static inner class that holds the static final instance. You rely on the JVM's class loader. The Java Language Specification guarantees that class initialization is thread safe. It handles all the locking and synchronization for you behind the scenes. It's cleaner, it's often faster, and it is very hard to mess up. So the takeaway there is: don't try to be clever with low-level locking unless you have a very, very good reason. Let the JVM

do the heavy lifting. 100%. Speaking of letting the JVM help us, let's talk about final fields. This is segment 5. We usually think of final as "I can't reassign this variable", like const in other languages, but in the JMM it has a special superpower regarding visibility. This is one of the coolest and most useful parts of the spec. It's called freeze semantics. We just talked about the nightmare of seeing a partially constructed object. Well, final is the antidote.

The JMM gives a very strong guarantee if a field is declared final and the object is constructed correctly. Define correctly. What can go wrong? It means you didn't let the this reference escape during the constructor. You didn't do something silly like passing this to another object or adding this to a global list inside the constructor code. Why is that

bad? Because if another thread can get a reference to your object while the constructor is still running, then even the final fields might not be visible yet. The freeze happens at the end of the constructor. You let someone look at the object before it was frozen. Got it. So as long as you don't leak this, if you construct it cleanly, then any thread that sees that object at any time is guaranteed to see the correctly initialized values of all its final fields. So they

can never see the default 0 or null value for a final field. Never. Even if there is no synchronization whatsoever. Even if there is a data race to publish the object itself, the final values are guaranteed to be visible and correct. They are frozen into the object. This is why everyone says immutable objects are inherently thread safe? This is the exact reason.

If you make a class where every field is final and there are no setters, you can pass instances of that class between threads without any locks, without any volatile, without any worry at all. It is impossible for another thread to see it in an inconsistent state. That is a massive design tip. If you're struggling with concurrency bugs, maybe the answer isn't more locks, maybe it's to stop using mutable state. Just make everything final. It solves 90% of the problems.
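An all-final class of the kind described above might look like the sketch below. The class InterestRate, its fields, and the withPercent() method are hypothetical names chosen for illustration; the points being shown are that every field is final, the constructor does not leak this, and a change produces a new object rather than mutating the old one.

```java
// Hypothetical immutable value class; all names are illustrative.
final class InterestRate {
    private final String currency;
    private final double percent;

    InterestRate(String currency, double percent) {
        // 'this' is not passed anywhere, so the final-field freeze at the
        // end of the constructor covers every field.
        this.currency = currency;
        this.percent = percent;
    }

    String currency() {
        return currency;
    }

    double percent() {
        return percent;
    }

    // Instead of a setter, return a new immutable object with the new value.
    InterestRate withPercent(double newPercent) {
        return new InterestRate(currency, newPercent);
    }
}
```

A thread that obtains a reference to an InterestRate, even through a data race, can rely on seeing the constructed values of currency and percent rather than null or 0.0.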

You move the problem from "how do I synchronize access to this changing variable?" to "I'll just create a new immutable object with the new value." It's the functional programming approach, and it works beautifully in Java for concurrency. OK. We have covered the safe, recommended stuff. Now we have to go to the danger zone. Segment 6: the modern era. For a long time we had synchronized and volatile, and then there was this hidden, unofficial class called

sun.misc.Unsafe. Unsafe, the forbidden fruit. It gave you direct C-style memory access. You could allocate memory off the Java heap. You could perform atomic operations. It was incredibly fast, incredibly dangerous, and technically unsupported. But

everyone used it. Every major high-performance library, Cassandra, Kafka, Netty, you name it, used it because they needed that last ounce of speed. And Java 9 finally said, OK, we can't stop you, so let's at least give you a standard, supported API, and that's where VarHandle comes from. VarHandle is basically a safe, standard way to do the things Unsafe used to do. It gives you extremely fine-grained control over the JMM memory barriers. With volatile you have a heavy hammer.

You get perfect visibility and perfect ordering always, but sometimes that's too expensive. VarHandle lets you choose your mode of memory access. The notes list these modes. We have plain, opaque, acquire, release, and volatile. Let's walk through them briefly so we know what we're dealing with. Plain mode. Plain mode is the Wild West. It treats the variable as a normal Java variable. No visibility guarantees. Reordering is allowed.

It's as fast as possible, but effectively useless for cross-thread coordination. It's for single-threaded work. Next up, opaque. That's a strange name. Opaque is weird. It basically just tells the compiler: do not delete this line of code. Sometimes the JIT optimizer is so smart it sees you write to a variable that nobody reads nearby, and it just deletes the instruction entirely to save time.

Opaque says, no, really, I need you to perform this memory access, but it still doesn't guarantee other threads see it immediately. It's mostly for ensuring progress, making sure the code actually executes. Then we get to acquire/release. This sounds like the piggybacking effect we talked about earlier with volatile. It is exactly that, but decoupled. This is the sweet spot for library writers.

A standard volatile variable is a two-way street. A volatile write is a release fence and a volatile read is an acquire fence. Acquire/release lets you split them. A setRelease says: make all my previous writes visible to other threads, then write this value. And getAcquire says: read this value and make sure I see all the writes that happened before the corresponding release. So if I only need one-way coordination, like a producer thread pushing data into a queue
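The setRelease/getAcquire pairing described above can be sketched with java.lang.invoke.VarHandle. The class ReleaseAcquireBox, its fields, and the demo() helper are hypothetical; findVarHandle, setRelease, and getAcquire are the standard VarHandle API available since Java 9.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical one-way producer/consumer handoff using release/acquire.
final class ReleaseAcquireBox {
    private static final VarHandle READY;

    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(ReleaseAcquireBox.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private int payload;   // plain field, published via the flag below
    private boolean ready; // not volatile; only accessed through READY

    // Producer: plain write first, then a release store of the flag.
    void produce(int value) {
        payload = value;
        READY.setRelease(this, true);
    }

    // Consumer: acquire loads of the flag; once true is observed, the
    // producer's earlier write to payload is visible.
    int consume() {
        while (!((boolean) READY.getAcquire(this))) {
            Thread.onSpinWait();
        }
        return payload;
    }

    // Runs one producer and one consumer thread; returns the consumed value.
    static int demo(int value) {
        ReleaseAcquireBox box = new ReleaseAcquireBox();
        int[] seen = new int[1];
        Thread consumer = new Thread(() -> seen[0] = box.consume());
        Thread producer = new Thread(() -> box.produce(value));
        consumer.start();
        producer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

This is the same handoff shape as a volatile flag, with weaker (and potentially cheaper) ordering on each side; whether that saves anything measurable depends on the hardware and the JIT.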

for a consumer thread, I can use acquire/release and save the cost of the full volatile fence on one side of the operation. Exactly. You are shaving off nanoseconds, but if you're writing something like the LMAX Disruptor or a high-frequency trading platform, those nanoseconds add up to millions of dollars. But the warning in the expert notes is pretty stark. It says these are sharp tools. Most application developers should stick to volatile and synchronized. Absolutely.

The complexity explodes. You are now manually managing memory barriers. It's incredibly easy to get wrong. There was a case study in the notes about a system where two worker threads try to claim an email to send it. If you use plain or opaque modes via VarHandles because you think, oh, a little race condition won't hurt, it's a benign race, you end up in a disaster. You end up with double sends.

Both workers think they claimed the task because worker A's write to the claimed flag wasn't visible to worker B in time. You have to really, really understand the hardware memory model you're running on to use VarHandles correctly. If you aren't writing a library that lives inside java.util.concurrent, you probably don't need them. That's fair.
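One conventional way to avoid the double send described above is to make the claim a single atomic compare-and-set, so that exactly one worker can win. This is a sketch under stated assumptions, not the system from the case study: the class EmailTask, its tryClaim() method, and the countSenders() helper are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: claiming a task with an atomic compare-and-set.
final class EmailTask {
    private final AtomicBoolean claimed = new AtomicBoolean(false);

    // Returns true for exactly one caller: the one that flips false -> true.
    boolean tryClaim() {
        return claimed.compareAndSet(false, true);
    }

    // Lets several workers race for one task and counts how many "send" it.
    static int countSenders(int workers) {
        EmailTask task = new EmailTask();
        AtomicInteger sends = new AtomicInteger();
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                if (task.tryClaim()) {
                    sends.incrementAndGet(); // stands in for sending the email
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sends.get();
    }
}
```

A separate "check the flag, then set the flag" sequence would leave a window between the two steps; compareAndSet closes it by doing both as one operation.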

It's like owning a blowtorch. It's very useful for a professional plumber, but extremely dangerous for the guy who's just trying to make toast. That is a perfect analogy. Stick to the toaster. Use synchronized and volatile. So we've been deep in the weeds. Let's surface for a moment and summarize. If a listener is driving to work right now and wants to retain the core survival guide for the Java Memory Model, what are the three commandments they need to remember?

OK. Commandment #1: the hardware is not your friend, the JMM spec is. Don't rely on "it works on my machine". Your laptop is likely an x86 processor from Intel or AMD. x86 has a strong memory model. It acts nice. It often keeps things in order even when the spec says it doesn't have to. It's forgiving. Very forgiving. But if you deploy that code to an AWS Graviton instance or a modern Mac, which use ARM chips, those are weak memory models. They reorder aggressively to get

performance. Your code that worked perfectly on your laptop for years will suddenly break in production. You must code to the JMM contract, not the hardware behavior you happen to observe. That's a big one. Testing on x86 does not prove thread safety on ARM. Commandment #2: know your tools and use them for their intended purpose. Use volatile for simple status flags like booleans. Use synchronized or atomic classes for counters and operations where the new state

depends on the old state. Do not mix them up. Volatile is not a lock. It will not protect you from race conditions in read-modify-write sequences. And the third? Commandment #3: embrace immutability. Final fields are the cheat code for safe concurrency. If you can design your objects to be immutable, set all their values in the constructor, and never change them, you bypass this entire headache of visibility and ordering. The JMM does all the work for

you. Freeze semantics for the win. This has been a fascinating look into the engine room of the JVM. It's a bit unsettling to realize that our human concept of time is just a friendly suggestion to a modern computer. And that's the final thought I'd want to leave people with. We humans are obsessed with what's called sequential consistency. We want a single global timeline for the universe. Event A happened at 12:00:00.01. Event B happened at

12:00:00.02, and everyone agrees, right? A single source of truth. But for hardware designers and compiler writers, strict time, strict sequential consistency, is death. It absolutely kills performance. The deeper you go into concurrent systems, the more you realize time in computers isn't a straight line, it's a graph of causal dependencies. The deeper you go into concurrent systems, the more you realize the question isn't "when did this happen?" What is it, though? The question is:

who is allowed to see that this happened? Who is allowed to see. That changes the whole perspective. It's not about clocks, it's about permissions and visibility between threads. Thank you for guiding us through the invisible chaos. It was a pleasure. It's a fun, if scary, topic. And thank you to everyone listening.

Hopefully the next time you type synchronized or volatile, you'll picture that little notebook and the bulletin board in the hallway and appreciate the incredible heavy lifting the JVM is doing to keep your reality consistent. This has been the Deep Dive. We'll see you next time.

Transcript source: Provided by creator in RSS feed