This episode is brought to you by PropelAuth. People take authentication for granted, but it's more than how customers log in to your product. It's how you manage your relationship with your users. The easiest way to make authentication your advantage... is to use PropelAuth. PropelAuth is more than just functional. It's powerful. With tools like managed UIs, enterprise SSO, robust user management features, and actionable insights, PropelAuth adapts as your product grows. And the best part?
When auth is effortless, your team can focus on scaling, not troubleshooting. That means more releases, happier customers, and more growth for your business. Save dev time, win over your customers, and propel your business forward with PropelAuth. Check them out at PropelAuth.com. That's P-R-O-P-E-L-A-U-T-H dot com. This episode is sponsored by Kinsta. Between juggling client meetings, managing your website, and keeping up with everyday tasks, who has the time to stress out about website security?
With Kinsta, the technical stuff is taken care of, so you can focus on what you do best. Kinsta provides managed hosting for WordPress, offering lightning-fast load times, top-tier security, and unmatched human-only customer support. Whether you're a business owner, web developer, or running a digital agency, Kinsta makes managing your website easy and efficient.
Kinsta provides enterprise-grade security, being one of the few hosting providers for WordPress with SOC 2 and other certifications, guaranteeing the highest level of security for your website. Thanks to their unlimited free expert-led migrations, Kinsta ensures a smooth transition for you from other hosting providers. Customers have reported 200% faster load times post-migration to their platform.
What I find cool about Kinsta is their fast and reliable customer service and premium features that are included at no extra cost. Ready to experience Kinsta's hosting for yourself? Get your first month free when you sign up at Kinsta.com today. That's K-I-N-S-T-A dot com. One of the reasons that we are excited to bring AI into the real world is to use it as a way to help amplify what humans in the real world are capable of. To be able to actually use this
as a tool for humans to amplify their intelligence or their ability or what they're literally capable of perceiving in the world. We were lucky to be at the forefront of the AI revolution, like sitting in the front seat and seeing where things are going. One of the biggest challenges in building this intelligent system and building this environment is to build systems which can generalize beyond particular sensors, beyond particular tasks. Hi, my name is Ivan. Hi, my name is Nick.
We are the founders of Archetype AI. This is Code Story, a podcast bringing you interviews with tech visionaries. Six months moonlighting. There's nothing on the back end. Who share what it takes to change an industry. I don't exactly know what to do next. It took many goes to get right.
Who built the teams that have their back. The company is its people. The teams help each other achieve more. Most proud of our team. Keeping scalability top of mind. All that infrastructure was a pain. Yes, we've been fighting it as we grow. Total waste of time.
The stories you don't read in the headlines. It's not an easy thing to achieve, my dear. Took it off the shelf and dusted it off and tried it again. Through the ups and downs of the startup life. You need to really want it. It's not just about technology. All this and more, on Code Story. I'm your host, Noah Labhart. And today, how Ivan Poupyrev and Nick Gillian built a physical AI company helping humanity understand the real world in real time. This episode is sponsored by Speakeasy.
Grow your API user adoption and improve engineering velocity with friction-free integration experiences. With Speakeasy's platform, you can now automatically generate SDKs in 10 languages and Terraform providers in minutes. Visit speakeasy.com slash code story and generate your first SDK for free. This message is sponsored by QA Wolf. QA Wolf gets engineering teams to 80% automated end-to-end test coverage and helps them ship five times faster by reducing QA cycles from hours to minutes.
With over 100 five-star reviews on G2 and customer testimonials from SalesLoft, Drata, and AutoTrader, you're in good hands. Join the Wolfpack at QAWolf.com. Ivan Poupyrev grew up in the Soviet Union studying rocket science. He moved to the U.S. in 2013 to pursue the early days of VR. He has worked at Disney and Google and has been focused on merging the physical and digital world for many years.
Prior to this latest revolution, Ivan was one of the skeptics towards the idea of AI. But at Google, he was convinced by one of his now co-founders of the value of this tech. Nick Gillian has been working in real-time machine learning for nearly two decades, but his path here was a bit different. His background is in music and audio engineering. Think the math behind amps, studios, mixers, etc.
During his master's studies, he fell in love with sensors and participated in early development and advancement of this tech alongside machine learning. He built a toolkit, which eventually got the attention of Ivan. Ivan was working at Google, and one of his team members began utilizing the toolkit built by Nick. Upon discovering this, Ivan reached out to Nick to see if he wanted to join this top secret team. And eventually...
Nick convinced Ivan of the power of machine learning and AI. This is the creation story of Archetype AI. At Archetype, we're a physical AI company that's helping humanity make sense of the world. Our big goal is really to empower anyone to take these very powerful AI models and pull them out of the digital realm.
into the physical world where people can actually use this to solve hard problems in the real world through sensors and through these powerful AI models. To do that, we're building Newton. This is our AI foundation model and our platform, which allows really anyone to take the sensors they have in the world and hook them up to this model and then use that for
whatever downstream tasks they might need to do. It could be in robotics, it could be in health and safety, it could be in construction, it could be in insurance, in all of these different areas where a customer has some form of sensor and some form of problem, and they can't solve that problem today with computer vision or with heuristics or signal processing. It's too expensive and too time consuming
to go out and use traditional machine learning, where maybe you have to assemble an entire team of experts and spend a year on data collection, only to ship some very simple use case. And if you slightly change one of the variables in your system, you have to start again and go back to scratch. I went to MIT to do my postdoc there. And I was working with a fantastic group at the Media Lab called Responsive Environments.
There I built something called the Gesture Recognition Toolkit. I took a lot of my algorithms from my PhD and I open sourced them. Probably one of the best things I've done. It's had a lot more impact beyond any of the papers I've written or these type of things. And a few years later, I was working at Samsung.
Ivan called me up one day, and it turns out he'd used this toolkit. One of the folks on his team had used it in the early days of these Google projects. I knew who he was from all of his work in CHI and UIST and these kind of HCI communities, right?
Ivan called me up and said, hey, do you want to come and work on this top secret project at Google? We probably can't talk about it. We'll probably never ship it. It's really crazy. I can't really tell you much about it, but do you want to come? And I was like, sure, let's do it. That was 10 years ago at this point, that was 2014. So I've been working with Ivan for a long time and I've been able to slowly convince him of the power of machine learning and the power of deep learning.
I was quite skeptical about AI because I'm coming from an old-school mathematical background, mathematics, aerospace engineering, and there you have the differential equations, everything's predictable, you can control everything to a single point, completely deterministic. Nick joined the team, and he converted me to belief in the capabilities of deep learning and AI to solve problems which are completely impossible to define formally using classical mathematical techniques.
We had a great time at Google. We built a lot of really key technologies, such as radars and really complex algorithms. We shipped a bunch of products. And then when foundation models came in, I felt that this is one of the biggest problems, which we can talk about later: bringing AI into the physical world, using AI techniques for the physical world, which really will change the way we build computing systems for the physical world. We took a dive and started the company.
This will be interesting, given the history that… you and the team have, and how this has been building over years at different companies and different experiences. I'm curious what you would consider the MVP for Archetype, the first version of what you would call Archetype, how long it took you to get there, and the kind
of tools you're using to bring it to life. And, you know, we've kind of touched on those at a high level, but I want to give space for an MVP answer. We worked together as a team back at Google. So Nick, John, and other co-founders also, we worked together for eight years.
Basically working on this problem, which is foundational: how can you understand the physical world? Humans perceive the world through the natural biological sensors, eyes and ears and touch and other sensors. That's how we know about the physical world. If you have a mobile phone and you have a really powerful sensor,
how much information can you extract from the sensor? And how can you make these algorithms which extract information generalizable and powerful? How can you combine multiple other sensors together, for example, from the phone and from the computer, and make them work together?
So the meaning of this data is very different for the humans, but for the sensor, it's all the same kind of vanilla data. As Nick mentioned, we were lucky to be at the forefront of the AI revolution, like sitting in the front seat and seeing where things are going. So we saw that one of the biggest challenges in building this intelligent system and building this environment is to build systems which can generalize beyond particular sensors, beyond particular tasks.
Before foundation models, before this generative AI became so powerful and commonplace, every time you train the system, you have to collect exact data for the exact tasks you need to recognize. You train the model on the exact task you want the model to perform, and then the model does exactly what you ask the model to do, but it cannot do anything outside of that. And every time you want to add one more task, or there's a little change of environment, or a little change in the device, or you change the shape and suddenly your signal looks slightly different, you have to restart this process again.
For most of the products, for most of the people who want to use sensors and build something beyond a simple switch or a simple position controller, the cost of data collection to build a robust system is just insurmountable. And that, I think, is what stopped all these great ideas people have about smart environments, intelligent spaces, ambient intelligence, IoT. They were all crushed by the need to build every single use case by hand.
Where we saw the opportunity here is to build models which generalize: from a small number of tasks, you can generalize to do tasks which the model has not been trained on, and do them better than narrow models,
and be able to fuse multiple types of sensors together. And these are new kinds of techniques, fundamentally based on transformers, which allow you to train on a very large amount of data. And once you train it, the amount of data may be large, but then you can reuse it much more broadly in general use cases.
So we saw this opportunity to create a true, what we call physical AI or physical intelligence, which would be able to perceive the world just like humans do and be able to make decisions and interpret like humans do. Really, for an MVP of physical AI, obviously you need the AI model, but the key thing about physical AI is that it's really bringing the model into the real world.
So that means that you need to be able to hook up some form of sensor to the model, and you need to be able to hook up the output of the model and route that back to the real world in some way: to trigger a safety alert, for example, if a worker is in a dangerous area, or it could be output to a robot to actuate something, or it could be to generate a report at the end of the day so that the manager of a construction site can optimize their time and so forth.
The MVP of Newton is really this early version of the platform that allows a customer to take some form of sensor, hook it up to our foundation model, and actually get a specific output that is useful to them. So to complete this kind of chain, the full link that allows them to actually take this AI model and actually run it in the real world,
and actually apply it, either in real time or in kind of batch mode, to a specific problem they have. In the long term for Newton, we're thinking about many types of sensors that can be applied to the model and many downstream use cases. For us, the MVP really is picking a subset of sensors and a subset of use cases that we can apply to early customers and really prove that not only the platform and the technology works from a plumbing and an engineering perspective, but that it can actually solve their problem today in the real world.
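As a rough illustration of the sensor-in, action-out chain Nick describes, here is a minimal sketch in Python. Everything in it is hypothetical: the names (SensorReading, interpret, route_output) and the rule logic are illustrative stand-ins for the pattern, not Archetype AI's actual Newton API.

```python
# Hypothetical sketch of the sensor -> model -> real-world-action chain
# described above. These names are illustrative stand-ins, not
# Archetype AI's actual Newton API.

from dataclasses import dataclass

@dataclass
class SensorReading:
    sensor_id: str
    kind: str      # e.g. "camera", "radar", "thermometer"
    payload: dict  # raw measurement

def interpret(reading: SensorReading) -> dict:
    """Stand-in for the foundation model: turn raw sensor data into a
    structured event. A real model would generalize across sensor types."""
    zone = reading.payload.get("zone", "unknown")
    person = reading.payload.get("person_detected", False)
    return {"event": "person_in_zone" if person else "clear", "zone": zone}

def route_output(event: dict) -> None:
    """Route the model's output back into the real world: a safety alert,
    a robot command, or an entry in an end-of-day report."""
    if event["event"] == "person_in_zone" and event["zone"] == "restricted":
        print("SAFETY ALERT: worker in restricted area")
    else:
        print("log:", event)

# Complete the chain: sensor -> model -> action.
for reading in [
    SensorReading("cam-1", "camera", {"zone": "restricted", "person_detected": True}),
    SensorReading("cam-2", "camera", {"zone": "yard", "person_detected": False}),
]:
    route_output(interpret(reading))
```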
When we started putting together what would be this MVP of the model: what kind of answers do people want? First of all, what would be the most common things people would like to know? Number one. And number two, how are they going to interact with the damn thing?
If you're an operator of a crane on some sort of a construction site, are you going to type on your laptop, asking questions and reading the answers? What is that interaction going to be? And the way we did it is, when we started the company, as we started building this foundational model and the infrastructure to connect it, because you have to build a lot of stuff, data collection, how to train it, in parallel, we went out
and talked to, you know, hundreds and hundreds of companies: if you have sensor data, what kind of things are you looking for? And then we ran these big discussions, research, collecting data from the customers. It came down to three things, three very large categories. Companies have sensors which measure the objective state of the world, which means
things like temperature, vibration, gas flow, electricity flow, in different forms and shapes, different ranges. This data is fundamentally really complex. So they want the model to understand the fundamental physical principles behind it, and to be able to predict anomalies,
predict future behaviors, predict when it's going to fail, and be flexible in the sense that they don't have capacity to train the model for a year. It has to very quickly understand what the data is and basically adapt. And the second one... There are very different types of customers who are not looking at the data coming from machines, but at people's behavior,
whether they want to understand the safety and security of their workers, or whether they want to measure the productivity of the constructors, who are digging channels or building some stuff: how much do people work? If productivity drops, why? Is it the weather? Is it the time of the day? Is it too much coffee? Is it not enough coffee? So how can we capture human behavior? And while people have a really good understanding, machines cannot get the subtle cues.
So these are two MVPs, and the third use case is to bring them together. Basically, what you would like to do is to connect the objective measurement of the world, the physically correct measurement of the world, to human behavior, to understand how changes in the physical environment affect people's wellness, well-being, and so on and so forth. And the other way around: how people's operation of the factory changes the performance of machines, and what effect they create.
One way to think about it is almost like the left side of the brain and the right side of the brain. The left side of the brain thinks about the physical world and understands how the physical world works. And the right side explains that. And then they have to be connected. There has to be connecting tissue between the left side and the right side.
You look at the physical world and it can explain it, but also you should be able to generate responses. You should be able to say, I'm actually thinking the response should be this, let's try it, and the model should be able to go and generate these responses. That's how we think about the model. This message is sponsored by SnapTrade. Link end-user brokerage accounts and build world-class investing experiences with SnapTrade's unified brokerage API.
With over $12 billion in connected assets and over 300,000 connected accounts, SnapTrade's API quality and developer experience are second to none. SnapTrade is SOC 2 certified and uses industry-leading security... Get started for free today by visiting snaptrade.com slash codestory. I'm so pumped about my new offshore dev partner, I'm saving a ton, and they even signed an NDA.
Wait, is that our financial statements on the internet? My dev partner has great communication. They can acknowledge an issue or email message super fast after they run it through Google Translate, of course. I have a call set up with my partner right now. Hey, guys. Guys, are you there? Wait, did you just ghost me? Ever been in these situations? Yeah, me too. That is, until I found Tecla. Tecla is world-class, near-shore talent for technology teams who demand the highest standards.
The platform gives you access to over 50,000 vetted Latin American technology experts, including developers, designers, and IT professionals, all fluent in English and all in your time zone. In seven days or less, you can have your tech expert hired through Tecla's fast, efficient hiring process, which helps you find and onboard top candidates.
Get senior level talent at competitive rates, often half the cost of hiring in the US, but aligned with US work culture and time zones, making teamwork easy and productive. Hire the best tech experts today with Tecla. Learn more at tecla.io. That's T-E-C-L-A dot I-O. This episode is sponsored by MailTrap, an email platform developers love. Try for free at mailtrap.io. That's M-A-I-L-T-R-A-P dot I-O.
So then on that MVP, right, the proof, you've got to make certain decisions. And I hear one already, in this is where we're going to start. But I'm curious about decisions and trade-offs you had to make in putting together that early version. And I'm curious about the process you went through to make those decisions and how you coped with them. One of the early things that we learned was how people would actually interact with AI in the physical world,
as opposed to some digital agent or digital assistant, right? If you think about the standard chatbot interface, it's really, it's back and forth, right? There's this kind of... synchronous loop, you ask a question, you get an answer, you follow up, you get another answer, you follow up, and so forth. We looked at these interaction models and how they could be applied to the real world, and they didn't really seem to fit.
If you want to check if your packages arrived today, do you want to just keep going to your bot and asking the bot, hey, did my package arrive? Hey, did my package arrive? You don't want to do that, right? You just want to say, notify me when FedEx comes to the door, or, did the gardeners come this week? For this... We've come up with this third model. So in addition to the assistants and the agents, there's this third interaction model that we're building for physical AI, which we call lenses.
And the concept there is that you can build a lens and you can apply it to a foundation model to basically steer the model and guide the model towards what you're most interested in as a user, right? So that could be more of a safety area, or it could be more of a security area, or it could be a
productivity area, for example. By applying this lens, you can then take real-time sensor data and stream it through the model, and have the lens continuously analyze this data. And then you can actually interact with that alongside the model. The model now starts to become a tool that you can actually use in the world, that you still have some agency over as a user.
You can still always go back to the tool and modify it, or change the lens, which is what we call changing the focus. So you can modify this on the fly, or you can run multiple lenses in parallel. For example, if you're a larger company and you have a large sensor fleet, then you have many stakeholders
that want to look at different types of things with the same sensors, right? You can actually apply many lenses on top of the same model, and they give you very different answers, because they're maybe focused on different aspects or areas, right? And we're finding this concept of lenses is really helping customers. A lens is a nice way for a customer to really put their constraints into Newton and to communicate to Newton what they're most interested in.
Because while our goal is to build a single horizontal platform that can support any type of sensor and many physical AI use cases, most customers have one or two sensors and one or two use cases.
But there's millions of customers, right? So that's where you really see the scale of this. The goal there is: how can we build these lenses so that they allow the customer to actually put a lot of their constraints, and additional auxiliary information, into the lens to really help guide it, while under the hood it's really one foundation model.
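To make the lens idea concrete, here is a minimal sketch under the same caveat: Lens, NewtonClient, and the keyword matching are hypothetical illustrations of the pattern described here, one shared model with several lenses steering it toward different stakeholders' interests, not the real Newton platform.

```python
# Hypothetical sketch of the "lens" interaction model. Lens and
# NewtonClient are illustrative names, not Archetype AI's real API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Lens:
    """Steers a shared foundation model toward one stakeholder's interest."""
    focus: str                       # e.g. "safety" or "productivity"
    on_event: Callable[[str], None]  # fired when the lens surfaces something

class NewtonClient:
    """One foundation model; many lenses can observe the same sensor stream."""
    def __init__(self) -> None:
        self.lenses: list[Lens] = []

    def apply(self, lens: Lens) -> None:
        self.lenses.append(lens)

    def stream(self, readings: list[str]) -> None:
        # A real model would interpret raw sensor data; keyword matching
        # here just stands in to show the control flow.
        for reading in readings:
            for lens in self.lenses:
                if lens.focus in reading:
                    lens.on_event(reading)

model = NewtonClient()
# Two stakeholders apply different lenses to the same model and stream.
model.apply(Lens("safety", lambda e: print("ALERT:", e)))
model.apply(Lens("productivity", lambda e: print("REPORT:", e)))
model.stream(["safety: worker entered crane zone", "productivity: pour completed"])
```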
We started on this when we started the company, and when we tried to explain to people what we were trying to do, well, it was completely obvious to us, but people would completely blank out. What? A physical world foundation model? What is that? It was interesting, because I think ChatGPT made our life much easier. Initially, when we explained to people what we were trying to do, we were saying: think about a ChatGPT for the physical world, a ChatGPT which can understand everything, and you can talk about what happens to you right now, local AI.
I think it goes way beyond ChatGPT. ChatGPT is the first step in the AI revolution, I think. It's going to go much bigger. These different interaction models, the assistants, the agents and the lenses, they can, of course, all work together, right? You can have an agent which is actually running a lens, which is then talking to your assistant as well, right? They are all powerful independently, and you can, of course, combine them together to build these very powerful systems.
But we feel for physical AI, this concept of lenses is really helping guide how you can actually work with AI in the real world, as opposed to some digital entity. This episode is sponsored by Speakeasy. Whether you're growing the user adoption of your public API or streamlining internal development, SDKs can turn the chore of API integration into effortless implementation. Unburden your API users from guessing their way around your API while keeping your team focused on your product.
Shorten the time to live integration and provide a delightful experience for your customers. With Speakeasy's platform, you can now automatically generate up-to-date, robust, idiomatic SDKs in 10 languages and Terraform providers in just a matter of minutes. SDKs are feature rich with type safety, auto retries, and pagination. Everything you need to give your API the developer experience it deserves. Deliver a premium API experience without the premium price tag.
Visit speakeasy.com slash code story to get started and generate your first SDK for free. This message is sponsored by QA Wolf. If slow QA processes bottleneck your software engineering team and you're releasing slower because of it, you need a solution. You need QA Wolf. QA Wolf gets engineering teams to 80% automated end-to-end test coverage and helps them ship five times faster by reducing QA cycles from hours to minutes.
With over 100 five-star reviews on G2 and customer testimonials from SalesLoft, Drata, AutoTrader, and many more, you're in good hands. Ready to ship faster with fewer bugs? Join the Wolf Pack at QAWolf.com to see if they can help you squash the QA bottleneck. This message is sponsored by SnapTrade. Link end-user brokerage accounts and build world-class investing experiences with SnapTrade's unified brokerage API.
With over $12 billion in connected assets and over 300,000 connected accounts, SnapTrade's API quality and developer experience are second to none. SnapTrade is SOC 2 certified and uses industry-leading security... Get started for free today by visiting snaptrade.com slash codestory.
So let's move forward then. You've got what you're describing as the MVP. You have gone through the difficult decision of making the... the hard choices, and how you approached it and how you defined things, and that all makes sense. I'm curious about progression, right? How have you progressed, and how are you planning to progress to mature and grow the product? And, to wrap it in a box a little bit, what I'm looking for is,
you know, how do you go about building your roadmap? What process do you go through to decide that, okay, this is the next thing to build or to address with Archetype, with Newton, with all the things? Archetype AI is the company, and the product we're building is Newton. Newton is both the AI model and the platform around it. So the model is wrapped into the platform,
so that people who want to use Newton don't have to download it from Hugging Face and then figure out how to connect all the sensors and management and data structures and authentication, how to load real-time data. There's a lot of complexity in that, so it's a platform and a model together. One of the biggest learnings for us is that in the digital domain you have models, and the infrastructure is there:
you download the model and it can run, right? It's a relatively straightforward setup. But when we're talking about physical world companies, physical companies, they're kind of like: download the model? Where do we download it? Maybe they have some data centers here, maybe they have on-prem devices. You have to give them a solution. You don't give them just a model; a model is not a solution. We have a clear vision of where we want Newton to go. We've built out a roadmap of how we're going to get there. But what we're really trying to do
is be very careful about which of the customers we work with in these initial pilots to really prove out the model, and to make sure that we really listen to what those customers actually need, like the problems they have, the type of sensors they have, the requirements they're bringing.
We are using that to really help guide the roadmap in terms of which features we accelerate and which features we bring through to these initial releases. So we can actually build something today that people can use. They don't need to wait one year or five years or ten years for some form of robot or some form of magic new sensor to come out. With the fleet of sensors they've already deployed over the last
five or ten years for other technologies: how can they leverage that? How do they take the problems they have today and use Newton to solve them now? Those are really the things we're using to accelerate our roadmap. And then the skill there, or the art, is to make sure that, as we're working with those near-term customers and those real-time pilots,
we're driving everything, as Ivan says, towards these fundamental technologies that can scale properly and that can serve thousands of customers, and we're not just building one-off bespoke solutions for each domain or each sensor type, right? It's about how to really drive that towards a true technology roadmap that builds and scales and grows with every customer we work with. I'm curious about how
you built your team. How do you go about building your team? And what I think is really interesting is what do you look for in those people to indicate that they're the winning horses to join you at Archetype? We're distributed first, right? So... Our team right now is getting close to 25 people. We're really trying to center the team between two major time zones, West Coast, US, and within a few hours of GMT.
We have a team already which is distributed across Europe and across the US. This has allowed us to really look for the best talent in a much wider scope, so we're not just limited to the Bay Area. And it really allows us to bring in the best group of researchers, research engineers,
software engineers, and the people who can help us build all the data tooling and all those critical parts of the system that are fundamental to making the technology work right and making the flywheels turn. First off, we're building this international team, which is also something that we did quite a bit back at Google, and it really worked for us. And there are some strategies there in terms of how to make that work, and I think we've been able to build that up well over our time.
Now, we're not building, like, another mobile app or another chatbot. We're building something new. So I think one of the key things we look for when we're hiring folks is that they can cope with ambiguity and that they're not just repeating what everybody else is doing. It's very important to understand that, and it's very important to be able to replicate that, but they need to be able to think a little bit differently about some of these problems, or be able to cope with, you know, very
open areas, and how to think about that either from a research perspective or from an engineering perspective, so that we can be at the tip of the wave in terms of how to bring AI into the physical world. And we're not waiting for someone else to invent some technology that we can build on top of. We can actually just go out and build it ourselves. So I think that ability to cope with ambiguity is something I definitely look for
when I interview folks and when I work with people. The way you can think about it is people's capability to work without a playbook, because we don't have a playbook for all these things. And maybe the playbooks which have been done before may be harmful for the company. You have to make your own playbook as you go, and some people are not comfortable with that, because if you build your playbook on your own, you're going to fail. And being comfortable with failure is very important.
I think the culture we're trying to build as a company is that we should not be afraid of failure. It's okay to fail. It's very important to be honest about failure. It's okay to fail, but it's not okay not to know why you failed, right? Any failure is a critical piece of knowledge about how to build the system. So being comfortable with that, and being comfortable making your own playbook, knowing there's no playbook, is, I think, what we're looking for.
And I think being hands-on is very important. People should be able to at least try to do everything, at least have the spirit to jump on every problem. In a startup, particularly in a startup, and it's very close to what we had back at Google, because even... People think Google has an infinite amount of money, but that's not true. Everybody who works in corporations knows that maybe the corporation has an infinite amount of money, but you don't. You have a little bucket.
So you have to be extremely careful about who you bring on and how you build. It's almost like a startup, particularly in the innovation field, in building companies. The important thing is being hands-on, but it's also very important that people be comfortable with changes of direction.
You have to do what you have to do, because again, we're not working in a field where we have the map, with all the guideposts there, where you know where to go and you just need to find the optimal way through those guideposts. There's no map; it's wide-open space.
And we also have to react to how the market changes and how the public space changes, because the field develops extremely fast. But we can't be too reactive and just swing in the wind every time something changes. We have to have our own course, but we also have to adjust this course based on what happens. When Jensen Huang comes out at CES 2025
and, in a sense, announces the next big thing, physical AI, which I think he defines as robots and self-driving cars, we have to take account of that and modify our message to reflect that. Being able to work without a playbook, being agile in changing directions, being hands-on, and not being afraid to fail: these four things are critical for anyone building an innovation startup. We hope you've enjoyed part one of this creation story of Archetype. Tune in tomorrow for part two.
Code Story is hosted and produced by Noah Labhart. Be sure to subscribe on Apple Podcasts, Spotify, or the podcasting app of your choice. And when you get a chance, leave us a review. Both things help us out tremendously. And thanks again for listening.