Leo Dion (host): Welcome to another episode of Empower Apps. I'm your host, Leo Dion. Today I'm joined by Mark Schwieterman. Mark, thank you so much for coming on the show. Mark Schwieterman (guest): Thank you. Leo Dion (host): Yeah. Before we get into Daily and what Daily does, I'll let you go ahead and introduce yourself. Mark Schwieterman (guest): Sure. So I've been in tech for a while, primarily did backend stuff in the chemical information industry for quite a while.
Grew up on Apple computers, refocused on that around iOS 2 or 3 or something like that, and that's been my primary focus in recent years. I've been in a few different places and been at Daily for about six months now. I really love working on video stuff and communication technology that helps people connect, as we all saw during recent events. Leo Dion (host): Yes. I can imagine. So you're working right now at Daily.
You want to explain what Daily is exactly and what they do? Mark Schwieterman (guest): Sure. So Daily does pretty much anything related to video. We're real-time video infrastructure for developers. We offer SDKs for the browser in the form of daily-js, and then we have another library, Daily X, which we use for our mobile SDKs: iOS, Android, React Native, and Flutter. We're also just about to release a Python library that can be used for AI workflows.
We have global mesh infrastructure, so you don't have to deal with the server side of things for your apps; you can just get started and start using things immediately. In addition to that, we have low-latency video streaming options where you can composite different things and stream them to an audience, for both video and voice calls. And we also satisfy a variety of compliance requirements, such as SOC 2 and HIPAA, as an example for telehealth
companies and things of that nature. So if you have a problem, we have a solution. Leo Dion (host): Yeah. And it's a growing thing right now. Is it mostly for meetings or conference calls, essentially? Mark Schwieterman (guest): So you can really build whatever you want.
Our systems are built on WebRTC, and we have a product called Prebuilt, which gives you a call experience out of the box, where you can have meetings with other people and create rooms, breakout sessions, and things like that. But really we give you the tools to build pretty much anything you want related to video and streaming. Leo Dion (host): Okay. I'm asking if you do anything with live streaming, Twitch or YouTube or something like that?
Or is it mostly focused on meetings between two or more people? Mark Schwieterman (guest): We can support that too. That's not my personal area of expertise, but I believe there's RTMP and similar protocols and we're compatible with those, so you could use our services to do a broadcast like that. And I'm not positive, but I believe we can also forward things on to other services, but I would have to double-check on that.
Leo Dion (host): Yeah, no problem. Cool. Today we wanted to talk about, you're the head of the iOS SDK, or do you work on all the SDKs? Mark Schwieterman (guest): So we have a client team, and there's two of us, myself and another one of my team members, who are primarily focused on the iOS SDK. But we do jump around; we have a core layer that we need to work on at times, and so we just jump in wherever needed. Leo Dion (host): Okay. Okay.
So let's talk specifically about the iOS SDK. I looked at it, and I gotta say the documentation is solid, really good stuff out there. We'll get into that too, but what makes it unique, developing an SDK for iOS specifically in this video space? Mark Schwieterman (guest): I would say one thing that makes Daily and our SDK unique is the team that we have here. We have a variety of people that are very deep in different areas.
And so one of the features that we just released in the SDK is adaptive HEVC. Typically, WebRTC by default uses VP8 and software encoding, which runs in your app process. With our adaptive HEVC feature, it will actually use the hardware encoders on the device. And that's also what lets us do H.265, which provides significantly better video quality
with lower bandwidth, and there are different characteristics that make it more optimal for delivery and things of that nature. So we have access to those lower-level hardware features that you wouldn't find with more off-the-shelf solutions. And we also like helping our customers solve their problems.
So if there's something that's possible and a customer is interested in doing it, we can usually find a way to make that work for them, and we'll add features and such in support of that. Leo Dion (host): I'm curious about the whole encoding space as far as what you need to do, because my understanding was HEVC is only Apple, or do all platforms support HEVC? How does that work?
And then when you're talking about hardware encoding, I assume you mean the SoC, the A19 or whatever the heck chip we're on, I forgot. A17 is the new one, right? Those have encoders built into the chip itself, correct? Mark Schwieterman (guest): Yes. Correct.
Yeah. So this is very complicated and there are lots of details, so I'm going to make sure that I get everything aligned right here. Basically, with codecs, you typically have open-source versions of things and then closed-source ones, similar to how with MP3 there were open and closed implementations provided by different companies. So in the video space, H.264 and H.265 are not open source, but there are corresponding open versions of those.
And I'd probably get that wrong. I'm almost certain that VP8 is the open-source counterpart of H.264, but that may not actually be the right one. Leo Dion (host): Hehehe. Okay. Mark Schwieterman (guest): As far as interoperability goes with WebRTC, browsers are required to support VP8 and H.264, so as long as you support those, you have compatibility. And then H.265 is not actually supported by most browsers that I am aware of.
And so basically when you stream video, you can have multiple layers, so you can actually have different encodings all in the same stream, more or less. Leo Dion (host): Oh no, I used the wrong word. Mark Schwieterman (guest): I think container is more of a file-format thing. I'm not actually sure what the exact term would be. But as one example, you might publish three different quality levels. So if you think of an iOS app.
For little thumbnails, you just need low resolution, because you don't need 720 by 1280, right? Which your phone can't actually publish if you... Leo Dion (host): Yeah, yeah, I know what you mean. Let's just say hypothetically I've gone through and run ffmpeg or youtube-dl
and it'll tell you all the different streams you can get access to. I think that's what you're getting at: you can get the WebM, or you can get the H.264, or whatever it is. Mark Schwieterman (guest): Exactly. So we include basically two layers. One is H.264 for compatibility, so browsers and devices that can't handle H.265 can use that. And then for H.265, clients that support it will use that.
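For listeners who want to see what that multi-layer idea looks like in code, here is a hedged sketch of configuring multiple send encodings with WebRTC's Objective-C API as imported into Swift. The exact class and property names depend on the WebRTC build you link against, and Daily's actual H.264/H.265 layering is handled inside its SDK rather than by app code like this:

```swift
import WebRTC

// Hypothetical sketch: two quality layers for one outgoing video track.
let low = RTCRtpEncodingParameters()
low.rid = "low"
low.scaleResolutionDownBy = 4          // quarter resolution, fine for thumbnails
low.maxBitrateBps = 150_000

let high = RTCRtpEncodingParameters()
high.rid = "high"
high.maxBitrateBps = 1_500_000

let transceiverInit = RTCRtpTransceiverInit()
transceiverInit.direction = .sendOnly
transceiverInit.sendEncodings = [low, high]
// The transceiver would then be added with
// peerConnection.addTransceiver(with: videoTrack, init: transceiverInit).
```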
Mark Schwieterman (guest): And so that allows us to have higher-quality video, basically, with less resource consumption. And there are some other features. There's this concept of temporal scalability: imagine groups of, I believe, eight frames. You can drop frames to decrease the frame rate, which also reduces the bandwidth that's needed. Leo Dion (host): Yeah. Mark Schwieterman (guest): So broad strokes, that's H.264 and H.265. As far as the encoding goes,
by default that'll happen in your app. So if you were to profile it in Instruments or something, you would see CPU usage and so on in your app process. If you use the hardware encoders, which are on Apple's SoCs, that work goes out to the hardware instead of your app process, and if you profile it you'll see increased activity in mediaserverd, the little process on iOS that manages that. So your video frames actually go over to that process and get encoded before they're streamed out.
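As a point of reference, this is roughly what asking VideoToolbox for an HEVC encoder looks like; it's a minimal sketch, not Daily's implementation, and on iOS the system routes the work to the hardware encoder (via mediaserverd) when one is available for the codec:

```swift
import CoreMedia
import VideoToolbox

var session: VTCompressionSession?
let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 1280,
    height: 720,
    codecType: kCMVideoCodecType_HEVC,   // H.265
    encoderSpecification: nil,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,
    refcon: nil,
    compressionSessionOut: &session
)

if status == noErr, let session {
    // Favor real-time throughput over offline quality, as you would for a call.
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
}
```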
Leo Dion (host): Nice. Nice. So was there anything new that came out with the recent Apple event and the new hardware? Mark Schwieterman (guest): Yeah. I haven't had time to dig into this in detail, but I have seen the new iPhone 15 Pros and, what is it, the A17 Pro? The way they're offset always trips me out.
But that supposedly has an AV1 decoder, and that's the successor to H.265. Also, H.265 and HEVC are basically the same thing. Leo Dion (host): Okay. Okay. Mark Schwieterman (guest): I use them interchangeably. So eventually we will have hardware decoding support. That's not something we can use on the encoding side yet, but it is interesting to see these things being added.
And if encoders come in the future, that's something else we could potentially start leveraging to have even more options. Leo Dion (host): Yeah. Was there anything else you want to talk about when it comes to video encoding and decoding? Or should we talk about the SDK stuff? Mark Schwieterman (guest): I think probably SDK. Leo Dion (host): Alright, let's get into it. You guys deliver an XCFramework, correct?
First of all, let's explain what an XCFramework is to those who don't know, as a mode of delivering a Swift package, so to speak. Mark Schwieterman (guest): Sure. So XCFrameworks came along with SPM, or at least after SPM became more widely used. You can package up binary assets, an XCFramework being one of those, and it lets you combine the platform-specific Mach-O slices for different platforms all into one bundle.
So you can basically distribute a framework that you could then use on iOS, macOS, or whatever platforms you need to support, and you'll have one slice for each supported platform in that overall framework bundle. So it's kind of like a framework of frameworks. Leo Dion (host): And it's a binary, right? Because with Swift packages, people think it's just source code that you import, but you're not doing that. You're shipping an actual binary. What are the advantages of that?
Mark Schwieterman (guest): Let's see here. For one, you don't have to build it, so it doesn't add to your compilation time. But for us it's basically a necessity, just because we depend on WebRTC, which uses an entirely different build system. Leo Dion (host): Ah, okay. Mark Schwieterman (guest): We couldn't create a completely pure-Swift SPM package just because of some of the dependencies that we have to use to build the overall framework.
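For context, distributing a prebuilt XCFramework through SPM is done with a binary target. A minimal sketch of a Package.swift looks like this; the URL and checksum below are placeholders, not Daily's real values:

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "Daily",
    platforms: [.iOS(.v13)],
    products: [
        .library(name: "Daily", targets: ["Daily"])
    ],
    targets: [
        // A binary target points at a zipped XCFramework instead of source code.
        .binaryTarget(
            name: "Daily",
            url: "https://example.com/Daily.xcframework.zip",
            checksum: "0000000000000000000000000000000000000000000000000000000000000000"
        )
    ]
)
```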
Leo Dion (host): Okay, that makes total sense. What was the new kind of library that you can build? I'm losing it now. It's not dynamic, and it's not static, but it's the one in between. Mark Schwieterman (guest): Oh, the new ones. Yeah. I forget what it's called too, but I know what you're talking about. Basically it can be either one, more or less. Leo Dion (host): Have you looked into that at all?
Or is WebRTC like C, or what is it exactly? Mark Schwieterman (guest): C++ and... Leo Dion (host): Okay. Mark Schwieterman (guest): WebRTC is developed and maintained by Google, and the bulk of it is written in C++ for the core layers, but then there are also platform-specific SDKs. So for iOS there's additionally Objective-C Leo Dion (host): Okay. Mark Schwieterman (guest): on top of that. Leo Dion (host): Okay. Okay. Yeah. What's it called?
It's called mergeable libraries. Mark Schwieterman (guest): Mergeable libraries. Yes. Leo Dion (host): Go ahead. Mark Schwieterman (guest): Go ahead. Leo Dion (host): I was just going to say, I know they added support for C++ interop this year. Mark Schwieterman (guest): I've seen that. I've not had a chance to dig into it yet, but that's something that I'm very interested in.
Just because when you work with WebRTC you can use a lot of the classes, but sometimes you need to customize their behavior, and you potentially still need to be able to call C++ code. There are some cases where we've started building our own replacements for some of those standard types. Leo Dion (host): Oh, okay. Mark Schwieterman (guest): We haven't needed that yet, but I can see it being useful. And then you have the benefit that the code you're working on is all Swift.
So you're not having to jump over into other languages quite as often. Leo Dion (host): I had to do... Mark Schwieterman (guest): That's very appealing to me. Leo Dion (host): A while ago I even had a blog post about this, about how to have Swift talk to C++, and I ended up having to create an Objective-C interface in between, and it was a mess. So yeah. Mark Schwieterman (guest): That's my life.
We basically have a core layer that's Objective-C++, and some of our core libraries are Rust and we have to call into that, which is a whole other thing we could probably talk about for a while. Leo Dion (host): So let's talk a little bit about the CI setup. I don't know where your code is stored, but how are you setting things up so that your XCFramework is always built and tested and things like that?
Mark Schwieterman (guest): Sure. We use GitHub and GitHub Actions, and currently we have some colocated runners, and we're actually in the process of setting up some images just to make the maintenance of that easier. And actually, brief tangent, but I noticed that you have something named Bushel that seems like it's potentially in this space. We've been looking at Tartelet, but I'm just curious, is Bushel intended to be a CI solution? Leo Dion (host): No, it's not.
Tartelet is definitely more of the CI solution. I've gone down that rabbit hole. But no, Bushel is more for running on-device, to test your app on device, more so than a CI solution. For now. Mark Schwieterman (guest): So as you probably know, there are license restrictions and such, so you're only allowed to run two virtual machines per macOS license or something of that nature.
And at least so far, from what I've seen with the hosted runners, you can run two VMs, which works pretty nicely. So our build system is effectively triggered through GitHub Actions, but then we also have to build all of our frameworks and we need to be able to build all of the languages, so it gets quite complicated. We have some stuff that runs for Android on other hosted runners, and then for iOS we run the core build.
And so basically we build the core library, which is WebRTC and mediasoup, C++ and C. Then we have the daily-core library, which is built in Rust; that's the stuff we add on top of WebRTC to provide additional features. And then there are some of these customizations, like the adaptive HEVC we were talking about earlier; for that we actually had to modify WebRTC itself, so we have our own custom build of that.
And then to actually call into Rust code from Swift directly, you can't. So we use FFI and bindgen, basically: we generate C bindings to our Rust code and then call that from Swift. Leo Dion (host): Yeah, I'm not familiar with FFI. Do you want to explain a little bit how that works? Mark Schwieterman (guest): So it's foreign function interface.
If you've ever used a dynamic language like Ruby, you've probably seen this where you install certain gems and they'll start compiling things. That's the idea. It's similar in concept to how Objective-C message sends work. Leo Dion (host): A little bit. Yeah. Mark Schwieterman (guest): So you have the self pointer, and then a selector, which defines the method that you're going to call, and then some number of arguments.
And so the way this works is you generate C bindings into, in this case, Rust code, and then to invoke that you pass in a pointer, which is the thing you're actually going to invoke the method on, and additionally pass along some arguments in some other format. We have a serialization format, basically, where we pass a pointer and then a payload that more or less has everything needed to make the actual method invocation.
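To make the pattern concrete, here is a hedged sketch of what calling into Rust through generated C bindings can look like from Swift. The module and function names here are hypothetical stand-ins, not Daily's actual bindings:

```swift
// The Rust side would expose C-compatible functions, for example:
//   CallClientHandle *call_client_new(void);
//   void call_client_join(CallClientHandle *handle, const char *payload);
//   void call_client_free(CallClientHandle *handle);
import DailyCore  // hypothetical C module wrapping the generated header

final class CallClient {
    // Opaque pointer to the Rust object, analogous to the "self" pointer.
    private let handle: OpaquePointer

    init() {
        handle = call_client_new()
    }

    func join(payload: String) {
        // The payload carries the serialized arguments for the invocation.
        payload.withCString { call_client_join(handle, $0) }
    }

    deinit {
        call_client_free(handle)
    }
}
```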
Mark Schwieterman (guest): So in CI, all of that has to be built from scratch, then cleaned, and at the end of it packaged up into a framework. So it can get pretty complicated. Leo Dion (host): Yeah. Okay, let's talk a little more about what the iOS SDK provides. I deep-dived into it; it's really great. Everything's UIKit and easy to plug in. Where do things stand as far as SwiftUI is concerned?
Mark Schwieterman (guest): So we do support SwiftUI. We have a demo app on GitHub, an iOS starter kit, and that is written in SwiftUI. There aren't really native SwiftUI views for this, so, have you had to use UIViewRepresentable or UIViewControllerRepresentable? Leo Dion (host): Yeah. Yep. Mark Schwieterman (guest): So that's basically what we do currently: we package up some of the WebRTC-related views inside a SwiftUI view and expose them that way.
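A minimal sketch of that wrapping, with a hypothetical VideoView standing in for the SDK's WebRTC-backed view, looks like this:

```swift
import SwiftUI
import UIKit

// Hypothetical stand-in for an SDK-provided, WebRTC-backed video view.
final class VideoView: UIView {
    // In a real app this would render a participant's video track.
}

struct ParticipantVideo: UIViewRepresentable {
    func makeUIView(context: Context) -> VideoView {
        VideoView()
    }

    func updateUIView(_ uiView: VideoView, context: Context) {
        // Attach or swap the video track here when SwiftUI state changes.
    }
}
```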
Leo Dion (host): What are some of the concerns folks should have if they're going to start plugging in this SDK? What are some things they should know off the bat? Mark Schwieterman (guest): For the most part, things do just work. It is a little bit different. I would say the number one thing with SwiftUI, and I'm sure you're familiar with this, is that you need to be conscious of changing state in a way that causes views to be re-rendered.
Leo Dion (host): Okay. Mark Schwieterman (guest): Our SDK is very robust and we provide a lot of functionality to do different things, and it's often helpful to have your own domain layer so you can transform things from the fuller model. At a high level we have the concept of a call that you join or leave, and then you have participants in the call. And so that's me, you, whomever.
You don't want to be redrawing your screen if someone who's not even going to talk unmutes and mutes again, because you're not actually going to see that. You want to make sure you have things structured in a way that's going to be effective for you, and that also gives you a seam to simplify things. And then, have you used the SwiftUI preview canvas much? Leo Dion (host): Yes. Yep.
Mark Schwieterman (guest): Okay, that is probably hands down my favorite feature, period. Leo Dion (host): Yeah. Mark Schwieterman (guest): Have you been able to use SwiftUI much in your own work personally? Leo Dion (host): Yeah, I do everything in SwiftUI right now. Mark Schwieterman (guest): Okay. At some of the companies I've worked at, we've needed to support UIKit and legacy apps, and I haven't really been able to use SwiftUI there much.
I've gotten much deeper into it over probably the past year and a half or so, and the productivity benefits of the preview canvas, different devices, not having to build and run different screens and re-navigate, that's just amazing. So that's why I mentioned creating your own domain. Using our SDK with SwiftUI will just work.
But there are huge benefits both to thinking through the performance implications of how you model your own domain, and additionally to structuring things so that you can very easily create structs for use in the preview canvas, just to do your iterative UI development there. Leo Dion (host): Yep, that makes total sense. It reminds me, we just did an episode with Brandon Williams from Point-Free, and he was talking about how... Mark Schwieterman (guest): Ah, love TCA.
Leo Dion (host): He keeps stuff separate so that you don't have to worry about the video working when you run the SwiftUI preview or a simulator, for instance, and that kind of sounds like what you're talking about with the domain stuff. It's exactly the same concept, where you try to keep things separated out. Mark Schwieterman (guest): In fact, I am a huge fan of TCA. I wanted to use it at a previous company but just couldn't, due to backward support.
I actually built something with Daily's frameworks and TCA as one of the first things that I did, just as a personal fun project. The current starter kit that we have doesn't use TCA, because I didn't want to use any third-party dependencies, just to make it more accessible: you already know the stuff you need to know for Apple development, basically. But I would very much like to port that to TCA.
It does a great job of solving all of these sorts of problems that SwiftUI doesn't really have an out-of-the-box solution to. Leo Dion (host): Right, yeah, exactly. Obviously, if you're in UIKit, you're going to do a lot of delegates and things like that. How has that gone, transitioning to using more publishers or async sequences or Combine or any of that stuff? Mark Schwieterman (guest): Mostly good, although we did have a learning experience.
So we support back to, I believe, iOS 13. It might be 14; I think we may have raised it to 14. But we also want to make sure that customers in any type of business can use our framework, and depending on the type of company, they may still need to support UIKit. So we do have a delegate, and for UIKit apps that's probably a good way to go in some cases. Additionally, we also have published properties.
For some of the state, for example, call state is a published property currently. And for participants, we have a big participant struct and that gets published. That can work well if you're doing an MVVM-style UIKit app or SwiftUI. But the one problem with that is that you can't really create a protocol for something that has a published property on it.
So if you're wanting to provide, and I personally would love to provide, a first-party test double that people can use so they don't have to write their own, published properties get in the way of that. One change that we plan on making soon is switching from published properties to explicit publishers.
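A hedged sketch of that change, using hypothetical names rather than Daily's actual API, shows why explicit publishers make a first-party test double straightforward:

```swift
import Combine

// Hypothetical names: CallState and CallClientProtocol are illustrative only.
enum CallState { case idle, joining, joined, left }

protocol CallClientProtocol {
    // A @Published property can't live in a protocol, but an explicit publisher can.
    var callStatePublisher: AnyPublisher<CallState, Never> { get }
}

final class CallClient: CallClientProtocol {
    private let callStateSubject = CurrentValueSubject<CallState, Never>(.idle)

    var callStatePublisher: AnyPublisher<CallState, Never> {
        callStateSubject.eraseToAnyPublisher()
    }

    func join() {
        callStateSubject.send(.joining)
        // ...networking would eventually send(.joined)
    }
}

// A test double for previews or unit tests becomes trivial:
struct StubCallClient: CallClientProtocol {
    var callStatePublisher: AnyPublisher<CallState, Never> {
        Just(.joined).eraseToAnyPublisher()
    }
}
```

On iOS 15 and later, the same publisher can also be consumed as an AsyncSequence through its `values` property, which lines up with the async streams mentioned a bit later in the conversation.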
Leo Dion (host): Oh, interesting. Okay. Yeah, what I've ended up doing is putting a protocol on the thing inside the observed object, having the observed object just be whatever class it's going to be, and the protocol provides whatever you want to expose from the published properties that way. Mark Schwieterman (guest): Oh, like the properties of the object or value that would be published.
Leo Dion (host): Yeah, and then the protocol can have a different implementation for the simulator or whatever. But I've heard that problem from a lot of people, where they're like, I want to use a published property, and... well, we'll get into the new stuff. Mark Schwieterman (guest): Yeah, and that is really interesting, because that's all going to be changing with iOS 17. And personally, do you think Combine is here to stay?
Because it seems like some of it's going into Swift packages, but I don't know. Leo Dion (host): Or do you mean, like, Observation? Mark Schwieterman (guest): So Combine itself, it seems like it was introduced initially in support of SwiftUI.
Leo Dion (host): Right. Mark Schwieterman (guest): But then the async/await Swift concurrency stuff seems to largely be superseding it, and there's the Swift Async Algorithms package, I believe, where similar stuff to what you'd have with Combine publishers is being added. Leo Dion (host): So I'll be honest, excuse me, I haven't deep-dived into the Async Algorithms stuff much.
I think the Observation stuff has really removed a lot of the Combine code that I've been writing. In Bushel, for instance, I've talked about this: I have one piece of Combine right now, and that's because it was something I wrote a year ago that I'm too lazy to migrate over, and it works great. I think the Observation stuff has really made it not necessary.
There are things I miss about it for sure, and I like the functional programming aspect, and I think that's what Async Algorithms provides. But is it here to stay? Probably not. But it's not going to go away anytime soon, I would imagine. Mark Schwieterman (guest): I'm very conflicted, because it took me a while to warm up to Combine. It's just a very different model.
And I feel like as soon as I started to get my head around it, Swift concurrency came around. Have you done much with async sequences? Leo Dion (host): I have not done anything with async sequences. That's my blind spot: async sequences and how they replace Combine. I'm much more comfortable with async/await when it comes to the simple case of an async call to a network where you get something back.
But the async sequence stuff, I'm still, I'll be honest. I had to do something with StoreKit recently and it was like, okay, I guess this is a for loop? But yeah, it's weird. Mark Schwieterman (guest): Yeah, and there's stuff where you basically have to use a task group if you don't want things to serialize, and there are just little things that aren't always immediately intuitive.
I think it's really interesting for SDKs though, because how do you serve a bunch of customers that have a variety of different needs? Leo Dion (host): Yeah, you're screwed, honestly, in a lot of ways. Because if you have to serve somebody on iOS 14, and I still have some apps like that, good luck trying to do any of that stuff. Mark Schwieterman (guest): Where I've settled is, we have, and we'll continue to have, a delegate.
I think we're going to replace the published properties with publishers, and that way people that need them can use them, and then probably add in the async streams at some point. But you can also get those off of publishers, so they're interchangeable, but not exactly. Leo Dion (host): Yeah, you've got to serve everybody in every space. I was going to ask, how does Observation look to you as an alternative to published properties?
Because I've gone all in on Observation with... Mark Schwieterman (guest): Actually, yeah, so I have not used Observation a ton. I am very excited about it for apps, because what I've seen, at least with SwiftUI, is that it's very non-intuitive that changing something non-visible could cause your entire view tree to re-render. And so I think that's basically just going to solve a lot of that. But I haven't used it enough.
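For context, the iOS 17 Observation approach being discussed looks roughly like this minimal sketch; the type and properties are hypothetical:

```swift
import Observation

@Observable
final class CallModel {
    // SwiftUI tracks which of these a view actually reads, so mutating a
    // property no view depends on won't invalidate unrelated views.
    var callState: String = "idle"
    var participantCount: Int = 0
}
```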
Mark Schwieterman (guest): Can you actually use Observable to provide a protocol, for an interface that you want someone else to use? Have you tried anything like that yet? Leo Dion (host): No, I haven't. I'm so stuck on using it the way observed objects are used that I haven't put my observed objects, or my, oh gosh, I knew I was going to say that, my Observation stuff in a protocol. I still do it the old way, where I have a protocol inside the observed object. No, I haven't. You probably could?
I don't know. I would think it would break the macro, but I could totally be wrong on that. I know I've had... Mark Schwieterman (guest): The whole macro thing, that just seems... Leo Dion (host): ...issues with SwiftData and trying to use protocols in that space. Macros are interesting. I love them, but you've got to be careful, because there's stuff you can do with macros that will break them and you don't even know until you deep-dive and see that, oh, it modifies the code this way, and that's why it doesn't work.
So yeah, it'd be interesting to see if you can go that route. Mark Schwieterman (guest): That expansion feature is just magical to me, because property wrappers are opaque and we don't necessarily know how these things work, so just being able to see the actual code... Leo Dion (host): Yeah. I think the biggest advantage of macros is that now we can see how some of that stuff behind the scenes works, and we have access to it.
Like you said, property wrappers or Codable or any of that stuff that was always in the compiler, where we couldn't understand what was going on; now we can do that ourselves, which is awesome. Mark Schwieterman (guest): So I don't know that we'll be adopting Observation, at least in the API. We'll probably do that publisher change that I mentioned first, but my plan is basically to update our starter kit app to... Leo Dion (host): Starter kit. That's what I was going to say.
Yeah. Mark Schwieterman (guest): Yeah. Leo Dion (host): I don't know how to do this otherwise. Okay. Mark Schwieterman (guest): Exactly. I was hoping to do a beta branch, but did not have time to actually do that. But now that iOS 17 is about to drop, it's about time, so I'll probably be doing that soon. And I always like to try something first and then take what works and pull it up.
And so I'll probably just use it directly in the app and then figure out, okay, can we pull this back into the SDK itself in a way that's helpful? Or maybe the app layer is the appropriate place to use it. Leo Dion (host): Were there any other APIs introduced this year where you thought, this is going to be awesome for us at Daily? Mark Schwieterman (guest): Yes. Yes.
Some of which we can't actually use yet, which is currently frustrating, but I think I found something. One thing that's really cool is, have you used Continuity Camera at all? Leo Dion (host): Not yet. No, but I know what it is. It's where you can use the iPhone as a camera on a Mac or a TV. Mark Schwieterman (guest): Exactly, yes. There's actually a third-party app named Camo or something which provided a way to do this. Leo Dion (host): Yeah.
Mark Schwieterman (guest): And Continuity Camera is kind of Apple's first-party version of that. They're now bringing it to tvOS, which for video conferencing is really cool, because you could just sit back and have a conference on your Apple TV and use your phone as the camera for that. And that totally works, something we can maybe come back to.
Although currently it's a bit more difficult to actually get our framework to work on tvOS, due to the way it's built and some of the Rust interactions. Leo Dion (host): Oh, gotcha. Okay. Mark Schwieterman (guest): But I think we have an approach that will work, so that is going to be really cool once we get it working. There's also the addition of, I think it's called UVC, external USB camera support for iPad.
This is something else I want to explore, but in some cases I think it's helpful to be able to have a video conference on an iPad and then have a camera that is not your iPad, similar to how you can do Continuity Camera, but a USB version of that. There's new, improved voice processing too, so for any of these cases where you're worried about background noise and such, that's just going to automatically get better.
And then, just as a fun thing, iOS 17 is adding gesture-based reactions, where it'll recognize gestures that you do on screen and then do a visual effect. Have you played with that at all?
Leo Dion (host): No, but I was just thinking about the Apple event and how we can do the double tap, all these little things that we've gotten from, what do you call it, the Vision Pro. Apple's been working on this stuff on the side and they're just like, here you go, here's a little something for the iPhone you guys can use.
Mark Schwieterman (guest): So, funny thing on that: the watch double tap, I saw that and I was in awe, and in my head I could immediately see the Vision Pro thing when they showed it off, with the pinch gestures, right? It turns out that was an existing accessibility gesture on watchOS already. Leo Dion (host): I just released a podcast episode where we talked about that.
Yeah. Mark Schwieterman (guest): Okay, so you got into that. I will check that out for the full backstory. I just thought that was fascinating, because I was like, oh, this is cool. Leo Dion (host): In the episode it was like, did they do all this with AssistiveTouch? Yeah. No, totally. So, oh, I was going to ask... Mark Schwieterman (guest): So good. Leo Dion (host): Okay, they mentioned this, they had a video about it.
I don't know if you saw this or whether you do anything with it, but did you see the rotating stand thing that they added for the iPhone? Mark Schwieterman (guest): Oh, the nightstand thing, StandBy or something? Leo Dion (host): There's a, they had a whole WWDC video about this. Okay, I'm going to look it up, sorry. But there's a new API, so you can plug into a USB stand and... Mark Schwieterman (guest): Oh yes. Yeah, I did actually see that.
And it would let you remote-control it or something like that? Leo Dion (host): Yeah. What was that called? Mark Schwieterman (guest): I know the video. Yeah, so I've used video conferencing stuff before; I worked at one company remotely and they had cameras in rooms that you could control, so if you're trying to talk to someone you could zoom in or out. And whatever that is, I believe they're now adding support for that.
So I'm imagining a little iPhone-driven robot or something that can roam around and that you can control. It would just be something super fun. Leo Dion (host): I'm trying to look up which one it was. Oh, here it is: DockKit, integrate motorized iPhone stands. Mark Schwieterman (guest): That's it. Leo Dion (host): Yeah. So I thought that was interesting. I don't think we've seen any hardware for it yet, but...
Mark Schwieterman (guest): Not one I've seen. And I don't know, I guess I'd like to get my hands on one of those devices just to play with it. Because when you think about it, a lot of this AI image recognition stuff is getting really good, and in some cases you don't even need to use third-party things for that. So with iOS, you know how iPad has Center Stage?
Leo Dion (host): Yeah. Mark Schwieterman (guest): And the Apple Studio Display as well. You can actually get the region of interest back from that, so it'll give you the rectangle where faces are. Not so long ago you couldn't do that yourself; you had to write your own code. But now you could build an app or something that uses that.
And then potentially use DockKit or something to reorient toward a person or something interesting that we're trying to keep an eye on. Leo Dion (host): Yeah, exactly. Okay, let's step back. We'll stop talking about the fun stuff and we'll talk about the... Mark Schwieterman (guest): All right. Leo Dion (host): So a lot of this, I assume, you have to run through AVFoundation. How was that experience?
What were some curveballs that were just like, wow, this is not what I thought it was going to be? I've done a little bit with AVFoundation, but not a lot. I want to, because it's super cool. But yeah, give me the lowdown on what that experience is like. Mark Schwieterman (guest): So AVFoundation is very capable. I have played with it in the past.
I once actually made a podcast app for dropping audio markers where you might want to edit. Leo Dion (host): Chapters. Oh, okay. Yeah. Mark Schwieterman (guest): Yeah. And you can do all this stuff. I guess that was more of a Core Audio feature, but I would consider it part of that umbrella. You can do virtually anything with it, from building apps that process things offline to doing things live.
There's a lot going on there. With WebRTC specifically, a lot of that is handled for you, but you still have to interact with it, which is strange. The two main classes that you use are AVAudioSession, which is a shared global singleton that other apps can also make changes to, and that is one thing that's tricky.
So when you want to do things like choosing your audio device, you can use the system UI, MPVolumeView I think it's called, as one example; you don't necessarily have to write code for that. But in some cases you actually do want to: as an example, if you want to override things to always use the speaker instead of an attached device, then you have to get in there and change that session.
So we have some code that provides some management of that, but in some cases someone building an app might want to manage it directly too, and you still have to deal with that. You get notifications on a route change, which means any input or output device has changed, or when the session is interrupted, say you get a phone call, the kind of situation where normally the audio might be ducked so you can hear it.
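A minimal sketch of that kind of session management, not the SDK's actual code, might look like this:

```swift
import AVFoundation

func configureAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .videoChat, options: [.allowBluetooth])
    try session.setActive(true)
    // Override the default route (for example the earpiece) and use the speaker.
    try session.overrideOutputAudioPort(.speaker)
}

let center = NotificationCenter.default
center.addObserver(forName: AVAudioSession.routeChangeNotification, object: nil, queue: .main) { _ in
    // Any input or output device changed (headphones plugged in, Bluetooth connected, ...).
}
center.addObserver(forName: AVAudioSession.interruptionNotification, object: nil, queue: .main) { _ in
    // The session was interrupted, for example by an incoming phone call.
}
```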
Mark Schwieterman (guest): Dealing with that is one point of complexity. And then on the video side of things, which is probably what I should have started with, you have AVCaptureDevice, and that's your interface into all of the cameras that you have.
And so WebRTC itself supports that directly, and you can choose between the cameras that you want to use. As one example, you could flip between the camera facing you, the user, and the camera on the other side facing the environment, and you need to switch between those.
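For reference, selecting between those cameras with AVCaptureDevice, the same machinery WebRTC's camera capturer sits on top of, is a one-liner; this is a generic sketch rather than the SDK's API:

```swift
import AVFoundation

// Pick the standard wide-angle camera at a given position, if the device has one.
func camera(at position: AVCaptureDevice.Position) -> AVCaptureDevice? {
    AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: position)
}

let frontCamera = camera(at: .front)   // facing the user
let backCamera = camera(at: .back)     // facing the environment
```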
Mark Schwieterman (guest): We're also working toward adding support for Continuity Camera on tvOS, and there you need to use those APIs slightly differently, because you have to deal with reconnection and... Leo Dion (host): Ah. Mark Schwieterman (guest): ...cameras disappearing on you. The biggest curveball gotcha that I got, which was actually a learning moment for me, has to do with video rotation. You can just use WebRTC out of the box and it handles device rotation and all that. Question:
Most of the apps you've worked on, have they supported device rotation? Leo Dion (host): Probably not, honestly. Mark Schwieterman (guest): Okay. On iPad in particular, that's something that you need to deal with. Leo Dion (host): Yeah, iPad, but that was over 10 years ago. Mark Schwieterman (guest): Yeah. Through coincidence, most of the things I've worked on have supported rotation, so I'm just used to that.
And anyway, this is the gotcha: with WebRTC, all of that works fine for local video if you use the video views that come with it and everything rotates. But if you have an app that doesn't rotate, there's actually an issue, because WebRTC has to rotate the video that it's sending so that it's still the right orientation on the receiver's screen, and on the local device, if the UI doesn't rotate with it, it will rotate the local preview anyway.
And then it ends up being offset. To work around that, there's AVCaptureVideoPreviewLayer, which you can use to preview the way you would expect and then let everything else rotate. That's something I'll actually be integrating into our SDK shortly to handle this better.
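A minimal sketch of that workaround, backing the local preview with AVCaptureVideoPreviewLayer so it stays upright even when the app's UI doesn't rotate, could look like this:

```swift
import AVFoundation
import UIKit

final class LocalPreviewView: UIView {
    // Make the view's backing layer a preview layer instead of a plain CALayer.
    override class var layerClass: AnyClass { AVCaptureVideoPreviewLayer.self }

    var previewLayer: AVCaptureVideoPreviewLayer {
        layer as! AVCaptureVideoPreviewLayer
    }

    func attach(to session: AVCaptureSession) {
        previewLayer.session = session
        previewLayer.videoGravity = .resizeAspectFill
    }
}
```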
Leo Dion (host): Cool. As far as the networking stuff, actually sending the video out, how does that work? Mark Schwieterman (guest): We do not have to write code to do that really much at all; WebRTC pretty much handles it. Leo Dion (host): Does it? Okay. Mark Schwieterman (guest): Yeah, you basically just give it the device, the video frames get passed to WebRTC, and then based on the encodings you have configured, it'll just be streamed out. On the receiving side, more or less everything will be used, but frames can be dropped.
WebRTC itself is very resilient, so it can handle packet loss, and then there's a server component called an SFU that also participates in these interactions. Leo Dion (host): All right, let's jump into future plans. What do you have as far as support for other platforms? Obviously not watchOS, sorry; unless you're going to show video and kill the battery, I don't think you're going to support watchOS anytime soon. But we talked about tvOS; did you want to expand on that at all?
Mark Schwieterman (guest): Yeah. So this is where things can get interesting. WebRTC being C++, you can build it for lots of platforms, and Apple's platforms specifically. However, the platforms have different behaviors, so WebRTC out of the box has its own conditional compilation for macOS and UIKit.
Leo Dion (host): Okay. Mark Schwieterman (guest): As you probably know, the coordinate spaces are flipped, as one example; there are a variety of things that are just a little bit different. Similar to that, you potentially need to support other platforms too, and with Rust in the mix we also need to have platform support there.
The way that Rust works is you build on a certain platform and you can also build for certain targets, but they have to be supported by Rust itself, and they have different tiers of support. The first tier is it just works, but it doesn't include the standard library, which you quite often need if you're doing anything non-trivial. Tier two is the standard library should work, but it's not officially supported.
So the current state of things is basically that Rust does now have tier 2 support for tvOS, so it is possible to build everything with the standard toolchain and have that work. However, it's not actually shipped that way, so you have to set up your own toolchain, more or less a nightly build, to get it working. visionOS, which I am super excited about, is not yet supported by Rust. However, have you heard of marzipanify?
I think that was maybe Steve Troughton-Smith's project, or there are a few of these. Leo Dion (host): Marzipan, the codename for Catalyst a hundred years ago. Yeah. Mark Schwieterman (guest): Yeah. As an example, you'd have an arm64 library that you would run on a device, and previously you couldn't run it on a Mac, until Macs were arm64 too. But then you could, but you still couldn't, because it wasn't actually supported.
But there is actually a way to go in and rewrite the Mach-O slices to basically claim that a library supports a platform that it doesn't. I've been trying to get that working, and I think it might, because with Rust itself the support is really quite similar; there's not something functionally different on a lot of these chips across different device types, it's more just how things are identified. Leo Dion (host): Have you applied for a developer lab?
Mark Schwieterman (guest): No. Leo Dion (host): Okay, I just thought I'd ask. Mark Schwieterman (guest): I really want to, but I feel like I need an app for that, and I don't have one. You mean for visionOS? Leo Dion (host): Yeah. Mark Schwieterman (guest): Did you? Leo Dion (host): Yeah. No, no way. No way. No, I'm not flying to California. Mark Schwieterman (guest): No. Leo Dion (host): Sorry. Mark Schwieterman (guest): I think there's one in New York. Leo Dion (host): I hope so. It'd be nice.
Yeah. Mark Schwieterman (guest): Yeah, in an ideal world I want to get this rewriting-the-slices approach working for tvOS and visionOS. As soon as I do that, I'm going to try to port our app, and I don't know, maybe it's too late at this point, but I would definitely like to take advantage of that if I can find a good scenario in which it makes sense. Leo Dion (host): Anything else you want to mention regarding Daily or future plans before we close out?
Mark Schwieterman (guest): Just, if you're interested in building apps, check us out. We have tons of resources online, lots of educational stuff about WebRTC generally, and we're very interested in helping customers build whatever they want to build. Oh, also, we're just about to release the Daily Python SDK, which is really cool. You can drop it into
Jupyter notebooks and things of that nature that use Python. You just import it and you have working video stuff that you can write in those environments immediately, and you can use it to process video and audio, do transcription, and a variety of other things. So really cool and worth checking out, I think. Leo Dion (host): Yeah. If you're looking for really good documentation, definitely take a look; you guys at Daily have done really impressive stuff.
Great starter kit that gets you going. So definitely take a look at that. Mark, thank you so much for coming on the show. I really appreciate it. Mark Schwieterman (guest): All right. Thanks for your time, and I had a lot of fun, Leo. Thank you. Leo Dion (host): Where can people find you and Daily online? Mark Schwieterman (guest): Daily is daily.co, and for me, if you can spell Schwieterman, just search for Mark Schwieterman and I'm around.
I'm marcisme on GitHub: M-A-R-C-I-S-M-E. Leo Dion (host): Yep. And we'll put links to that in the show notes in case people don't know how to spell Schwieterman. People can find me on X, I hate that, at @leogdion. I'm on Mastodon at @leogdion@c.im. My company is BrightDigit; take a look there. There are some new articles coming out about humane code and CI and all sorts of stuff. So thank you so much for joining me for this episode.
If you're watching this on YouTube, please like and subscribe. If you're listening to this in a podcast player, I would love a review. If there's something you want to talk about, or something you want me to find a guest to talk about, or maybe for me to talk about, let me know. I'd love to hear back. Thank you again, and I look forward to talking to you again. Bye everyone.