Welcome to The Open Source Way. This is our podcast series, SAP's podcast series about the difference that open source can make. In each episode, we'll talk with experts about open source and why they do it The Open Source Way. I'm your host Karsten Hohage, and in this episode, I'm going to talk to Ingo Kober and Jason Kafka about the Open Component Model, which goes by the acronym OCM. Hi, Ingo. Hi, Jason. Great to have you both here.
Hi, Karsten.
Hi, Karsten.
Okay, let's look at who these guys are. Ingo is a product owner at SAP in the context of Kubernetes Lifecycle Management. He has an overall SAP history of more than 22 years. And what else he says about himself is that outside of the usual working hours you might find him either DJing somewhere in a club in the area of Heidelberg, not like in Sao Paulo or something, but in the area of Heidelberg, or having fun in Diablo 4.
Jason has been with SAP for 14 years and is an Engineering Manager responsible for the Gardener Lifecycle Management team. Fun fact about him: Jason was the guy who made Standup Paddling popular in Germany. Well, at least in Baden-Württemberg that is. Jason, sorry, how did that happen?
Well, I mean, so I think eight years ago or nine years ago, I was on a vacation in the US and there I encountered that sport, and I really liked it a lot. And then I was one of the first guys, I would say, in Germany actually paddling on rivers and lakes, using that Standup Paddleboarding thingy. And I remember this quite well because everybody was looking at me: What are you doing there? Is it a surfboard? Do you need a license for that? Yada yada. So, I think I was one of the first guys doing that here.
You were definitely earlier than I was. I picked that up during the pandemic because it was a thing you could do alone out there on the lake or something. Ingo, DJing? What kind of clubs?
Well, something in the Heidelberg area. Villa Nachttanz, that's a very small club, and I was in the Halle02 also. And some other events in the Neckar water. That's a thing around the Neckar where they put sand and beaches, and they offer local DJs to do stuff there.
As I happen to know Villa Nachttanz, that's more of an alternative location. What kind of music?
It's mostly electronic, like deep house, house, techno.
Okay, but not your dancefloor 90s parties. That wouldn't fit into Villa, right?
No, and I'm not a wedding DJ. Please don't call me that.
Okay, okay. Nothing against any wedding DJs who might be listening. No, of course. Yeah, we have to be careful here.
No, of course not. No offense, please.
And we also have to be careful that we don't keep chit-chatting about something totally different. Because in our pretalks we found out that we can. So Ingo, back to you. OCM - Open Component Model. What is that? Can you give us the shortest possible description?
Yeah, and that's already quite a tricky question, the shortest possible description, because it's a complex topic. And this is why we asked ChatGPT to just read our official website, ocm.software, and come up with a short summary of our website. And what the AI answered is: "The OCM, the Open Component Model, is an open standard for describing software bills of delivery in a technology-agnostic, machine-readable format."
Which is still, I think, quite a bit, yeah, complicated, if you hear that for the first time. So, in another nutshell I would say: "OCM is a machine-readable format to describe software components, including their technical artifacts and how to access these artifacts."
Okay. So, it's basically a sticker on the package, or the template for the sticker on the package, that says: "What's in here?"
Yeah, correct.
All right. And Jason, the purpose, or the use cases, for OCM are?
Well, in the end, I mean, as Ingo said, OCM is a machine-readable format to make software artifacts tangible for machines. That is enabling all sorts of automation based on software artifacts. And we invented OCM in the Gardener CI/CD team to automate our delivery processes, right?
So, initially OCM was used to describe software artifacts of the Gardener, which would then be given to a delivery management, and then to provide them with insights into which software artifacts, docker images, and such were delivered to which location, yeah. And this was basically the first use case we had. And then quickly after that we realized there were way more practical use cases.
For instance, with an OCM description we could describe our artifacts and their location inside different software repositories, and then we could use this to, let's say, automate the whole scanning procedures we had. So I mean, think of OCM as a kind of abstraction layer we put over our software artifacts.
And so OCM describes what the software artifacts are, and then we have a translation which basically takes this description, feeds this to scanning tools, and then those scanners can use this OCM format to perform all sorts of scanning tasks. For the Dev teams that was a super gain in productivity because they no longer had to deal with each and every single scanner, because it was just OCM describing the software artifacts and then exposing this to the various scanner tools.
So, if SGS or whoever comes up with another five scanning tools in the future, our Gardener teams are set up for that. And yeah, so let's say we come to another interesting scenario here. So, if you take a look at the OCM website our slogan here is in the end, "describe, transport and deploy." So, this means you can describe software artifacts with OCM, you can have the tooling around it, which then allows you to ship these artifacts described via OCM to various locations.
And you can also enrich this. And this is what we also do. You can enrich this format with instructions on how to deploy artifacts. So in the end, it's a description which allows you to ask questions like: 'What's inside of this package? How can I deploy you? Where do you come from? Are you real?' Because you can sign it, yeah.
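The "abstraction layer" idea Jason describes, one description feeding many scanners, could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the descriptor layout, the `reference` field, and the two scanner adapters are hypothetical and are not part of OCM or of any real scanning tool.

```python
def artifact_references(descriptor):
    """Collect the artifact locations listed in a descriptor-like dict."""
    return [resource["reference"] for resource in descriptor["resources"]]


def run_scanners(descriptor, scanners):
    """Feed every described artifact to every registered scanner adapter."""
    references = artifact_references(descriptor)
    return {
        scanner_name: [scan(reference) for reference in references]
        for scanner_name, scan in scanners.items()
    }


# Hypothetical adapters: real ones would call the actual scanning tools.
def license_scan(reference):
    return f"license check queued for {reference}"


def vulnerability_scan(reference):
    return f"vulnerability check queued for {reference}"


# Hypothetical descriptor; the names and registry URLs are placeholders.
descriptor = {
    "name": "example.org/demo-app",
    "version": "1.2.3",
    "resources": [
        {"name": "backend", "reference": "registry.example.org/demo/backend:1.2.3"},
        {"name": "chart", "reference": "registry.example.org/demo/chart:1.2.3"},
    ],
}

scanners = {"license": license_scan, "vulnerability": vulnerability_scan}
reports = run_scanners(descriptor, scanners)

for scanner_name, lines in reports.items():
    for line in lines:
        print(f"[{scanner_name}] {line}")
```

Adding a sixth scanner in this sketch means registering one more adapter in `scanners`; the development teams' descriptor stays untouched, which is the productivity gain mentioned in the conversation.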
Okay. Now, you did talk some about scanners and things, and about pipelines. I'm familiar with the build pipelines, but we're not talking about build pipelines here, Ingo. This is a lifecycle management thing, right? So, as you said, it's a transport and deploy thing, rather.
It is a lifecycle management thing, that's correct. In the end, we want to streamline Software Lifecycle Management processes, if you will. Compliance scanning, reporting, delivery, and even the process of deploying stuff can be streamlined, or at least improved, using the Open Component Model.
The reason why we are investing in this topic, and I think Jason touched upon it already, is that we still see a lot of teams at SAP struggling with these often compliance-related requirements when it comes to developing and delivering software. And quite often we see teams tackle the same problems but in different ways. And this is why we believe that if we could agree on at least one way of describing software components, we could really improve SAP developers' lives.
Jason, when we talked before this recording, you, I think, mentioned that one could consider it as "a version of CycloneDX" or of SPDX. Can you explain that?
Yeah. I mean, so when you take a look at OCM the first thing, I mean people always say, 'Oh, that's a bill of materials, a software bill of materials.' And if you're dealing with SBOMs, usually CycloneDX, and SPDX, and other formats will pop up in your head. And yeah, I mean, OCM is kind of, let's say, similar in that it provides a list of artifacts which the software product needs in order to run. But with OCM we went down quite a different road.
So first of all, SPDX and CycloneDX both give you a lot of options. I mean there are a lot of things which you can put in there optionally. I mean it's not that there is a strict rule set which says this, and this, and this value needs to be there. It's a lot of, let's say, options you have there. And OCM is very opinionated. So, we have a strict rule set: What can go inside this description and what shouldn't be there? It's extensible, but we have a really strict rule set here.
And I mean, there is one term we coined here. With OCM, we talk about a software bill of delivery, because we want to make the distinction that we are describing all the high-level artifacts that you need to ship somewhere in order to make your application run. It's like 'Lieferschein' [roughly: delivery note] in German. It's hard to translate, but that's, I would say, a clear distinction to SPDX here.
Okay, and that means beyond SPDX it also has, basically, information about what to do with the package rather than just what's in it. Or did I understand that right?
I mean, if I can just chime in here one more time. So I mean, OCM can provide you with a list of deliverables, right? And it's also extensible to more things. I mean, like I said, you can ask it a few questions: What's inside of you? How can I deploy you? I mean, this is another use case we intended for the OCM format when we deployed it in our Software Lifecycle Management. So, we have another tooling which we developed that's called The Landscaper.
And we put in there, in this OCM descriptor, things called blueprints. So, if Landscaper picks up any artifact described via OCM, then it knows there is a blueprint for how to deploy this. I mean, you could put in there any other things, like a Helm chart, or other installation procedures.
But we think with OCM you have one description package which really describes your application, not only from the content point of view, but it also gives deployment mechanisms the opportunity to pick that up and automatically deploy that stuff. So this is nothing that, for instance, SPDX was intended for. It's just a plain bill of materials. And with OCM we see way more information you can package in there.
It's not only about delivery information, we touched it briefly on the compliance aspect here, we are also thinking--and we are also in talks with our SGS friends here--that we can put more compliance documents in there which are relevant for certain audits. So, we have in focus everything you want to deliver, and basically attest that this thing is the real thing, should be part of this OCM descriptor.
And if it's not in the mandatory core, basically there is still the extensibility by which to add additional information for your individual package.
Yeah, exactly.
All right.
Yeah, you can add CycloneDX manifests in there. You can put VEX format things in there. I mean, that's Vulnerability Exploitability eXchange format things, whatever you want. But in the end, I mean, there is a clear description at the higher level of what this format looks like, so you can read out additional information on the fly.
Okay. I understand it on the conceptual side, I think. Let's maybe turn to, what is that made of? What's its format, and how is it created?
So, the format is usually YAML files, or you could also do JSON, and these files we call 'Component Descriptors'. So yeah, I mean it describes components, so that maybe makes sense. And these software bill of delivery thingies, these files, Component Descriptors, they contain first and foremost a name and a version so that you know what version of a certain component you're actually talking about. These two together make up the unique ID of that component.
So, it should be globally unique so that wherever you are, you can specifically address the specific version of a software component. And then it contains this list of technical resources like docker images, Helm charts, as Jason mentioned already, executables, whatever else you can put in, every technical resource you can think of, but of course only those which are required by this specific software component version to be deployed and run somewhere, right?
You can also put in URLs towards the source code repositories which you have used in order to create the build artifacts, why not? Just as an additional information you can put in, like, any other kind of metadata, actually simple labels, or other file formats, just everything that kind of helps to describe and work with your component.
Think of it, maybe, if you will, like a generic metadata container hosting data that describes your software component in a uniform way, so that other tools and processes can make use of this metadata. And you usually ship that alongside the whole software component. And regarding the question where these files are being created, this usually happens in the build.
So your build pipelines, like Concourse or something, in these, you have to provide a certain set of data like the name for example, like all of the resources that you want to specify in these Component Descriptor files. And if you have set up this once properly, then most of this can be automated, like increasing the version numbers with each build. That's something that you can, of course, automate.
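To make the structure Ingo describes more concrete, here is a minimal Python sketch that assembles such a description as plain data and serializes it to JSON. The field names (`name`, `version`, `resources`, `sources`, `labels`), the component name `example.org/demo-app`, and all URLs are illustrative assumptions, not the official OCM schema; the actual specification is documented at ocm.software.

```python
import json


def make_descriptor(name, version, resources, sources=(), labels=None):
    """Build a descriptor-like dict. Field names are illustrative, not the OCM schema."""
    if not name or not version:
        raise ValueError("both a component name and a version are required")
    return {
        "name": name,
        "version": version,
        "resources": list(resources),   # technical artifacts needed to deploy and run
        "sources": list(sources),       # optional pointers to source repositories
        "labels": dict(labels or {}),   # free-form metadata
    }


def identity(descriptor):
    """Name and version together identify one specific component version."""
    return f"{descriptor['name']}:{descriptor['version']}"


def bump_patch(version):
    """The kind of step a build pipeline can automate: 1.2.3 -> 1.2.4."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


# The component name and the resource list are the deliberate human decisions.
descriptor = make_descriptor(
    name="example.org/demo-app",
    version="1.2.3",
    resources=[
        {"name": "backend", "type": "image",
         "reference": "registry.example.org/demo/backend:1.2.3"},
        {"name": "chart", "type": "helm-chart",
         "reference": "registry.example.org/demo/chart:1.2.3"},
    ],
    sources=[{"name": "repo", "url": "https://git.example.org/demo/app"}],
    labels={"team": "demo"},
)

print(identity(descriptor))
print(json.dumps(descriptor, indent=2))

# A later pipeline run would describe the next version of the same component.
next_descriptor = make_descriptor(
    descriptor["name"], bump_patch(descriptor["version"]), descriptor["resources"]
)
print(identity(next_descriptor))
```

The split mirrors the conversation: the name and the list of resources are chosen once by a person, while the version bump is the part a build pipeline can repeat automatically.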
Okay. But initially, as you now said, increasing the version numbers can be automated. Some initial information still has to be created by humans, right?
Yes, at some point in time you decide on the component name, right? And you have to explicitly decide, usually as a developer, which docker images and Helm charts your software component needs in order to get deployed. And this is a deliberate decision you make at some point in time.
Okay. So, it's not imaginable that this would be fully automated, or is it?
No, I think that cannot be fully automated because it's really a decision which docker image you package within your component.
Well, I mean, this would basically automate the product owner's role completely. I mean there is a guy who has an idea of what the product should look like, and what parts should be part of the product. I mean, and this is something where there's still a human decision in it. There's still a human who plans this product and what the packaging should look like. I mean, there are ways, of course, maybe where you can use automation, but we are far away from that.
How about, if everything that you're using in your package already has OCM information attached to it? Then this would be fully automatic, right?
Yeah, but then you are building another component, aren't you? So, you're building another component based on other components that are available already.
That's an interesting question. I mean, so basically, let's say okay, let's go with this, Karsten. I mean, so let's say, you have basically described everything with OCM and then you would say: 'Hey ChatGPT, please plug this together in a meaningful way.' So, if there is enough meaningful information in these OCM descriptors we could put this in there. I could think of a way, in the future of course, on how something actually could plug this together. Why not? But we haven't actually tried this out yet. But I mean, you see how rapidly AI evolved in the past. So, I think a lot of things are possible in the future, maybe even this.
I just had the abstract thought, if everything was described in a standardized way then the next level, that puts it together, should be able to machine-generate that next higher-level description, right?
Exactly, yeah.
And another question that may be similar is: is that generated once, the OCM description, or does it like pick up things along the way, sort of down the pipeline?
So initially, yes, in each build pipeline run of your software component, for each new version that you generate, you would create another new version of your software component descriptor file. And after the build you could even attach more stuff to it. Yeah, that's possible. So, it's not totally immutable after the first build. And maybe you would relate scan results to the component, maybe you would even like to include scan results as specific files in components.
You could do that, so we are not totally against this. What I would like to point out is that at some point in time, whenever you say, 'My component is now ready,' it should be in an immutable state. Whatever you have created up to that point in time, you should then say: 'No, I'm not making any more changes to the software component itself.'
Yeah, of course I understand that. I think, what I had in my head--and you fortunately now mentioned it--was, exactly, you were talking about scans before. And I thought, okay, these scans have a result. So, there's no point if that result wouldn't be written into it. Or is there?
You can do it like that, but you could also take this globally unique ID and use it as the ID you put in some database. And you just put the scan results into the database as well, together with this unique ID of the software component version that you have just scanned. Then you can use this globally unique ID as a correlation ID, if that's a term people are familiar with, and make the connection to the scan results via this ID. I mean, you always have the ID, you don't necessarily need to change the component descriptor in order to relate these results.
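The correlation-ID approach can be sketched with an in-memory SQLite database: scan results live in their own table and are linked to a component version only through its ID, so the descriptor itself never has to change. The table layout, the ID format `name:version`, and the sample findings below are assumptions made for illustration; the conversation does not specify how such a database would actually look.

```python
import sqlite3


def component_id(name, version):
    """Correlation ID: component name plus version (format is an assumption)."""
    return f"{name}:{version}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scan_results ("
    "component_id TEXT NOT NULL, scanner TEXT NOT NULL, finding TEXT NOT NULL)"
)


def record_finding(conn, cid, scanner, finding):
    """Store a scan result next to the ID, without touching the descriptor."""
    conn.execute(
        "INSERT INTO scan_results (component_id, scanner, finding) VALUES (?, ?, ?)",
        (cid, scanner, finding),
    )


def findings_for(conn, cid):
    """Look up every stored result for one component version, in insertion order."""
    rows = conn.execute(
        "SELECT scanner, finding FROM scan_results "
        "WHERE component_id = ? ORDER BY rowid",
        (cid,),
    )
    return rows.fetchall()


cid = component_id("example.org/demo-app", "1.2.3")
record_finding(conn, cid, "license-scanner", "no issues found")
record_finding(conn, cid, "vulnerability-scanner", "placeholder finding")

for scanner, finding in findings_for(conn, cid):
    print(f"{cid} | {scanner}: {finding}")
```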
Okay, maybe let's not dwell on this for too long because I have the feeling I'm fantasizing about things here. And we actually want to get across, in the first place, a very good concept. So maybe, Jason, why did you, again, come up with it in the first place? And also, why is it run open source?
Well, I will definitely give you an idea of this, but maybe just one remark on our last bit here. Closing off this component descriptor is actually done by signing it. So, when you say I'm done with it, I will not add anything further, you sign it. And that's basically sealing it. And then you basically know this is the thing we have scanned, and that's kind of the connection you can make to this kind of scanning report database.
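The sealing idea can be illustrated with a small Python sketch: compute a signature over a canonical serialization of the descriptor, and any later change becomes detectable. Note the assumptions: this uses an HMAC with a shared secret purely as a stand-in, whereas real signing would typically use an asymmetric key pair, and the conversation does not describe which mechanism OCM itself uses. The descriptor fields and the key are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(descriptor):
    """Serialize deterministically so the same content always yields the same bytes."""
    return json.dumps(descriptor, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(descriptor, key):
    """Return a hex signature over the descriptor (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, canonical_bytes(descriptor), hashlib.sha256).hexdigest()


def is_unchanged(descriptor, key, signature):
    """True only if the descriptor still matches what was sealed."""
    return hmac.compare_digest(seal(descriptor, key), signature)


key = b"demo-only-secret"  # placeholder; never hard-code real keys
descriptor = {
    "name": "example.org/demo-app",
    "version": "1.2.3",
    "resources": ["registry.example.org/demo/backend:1.2.3"],
}

signature = seal(descriptor, key)
print(is_unchanged(descriptor, key, signature))

# Adding a resource after sealing is detected.
tampered = dict(descriptor, resources=descriptor["resources"] + ["extra:latest"])
print(is_unchanged(tampered, key, signature))
```

A scan report that stores the signature alongside the component ID can then state exactly which sealed content was scanned.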
We are also working on that, but it's really just PoC level right now. But we are also following that way. So, coming to your question: Why did we come up with that idea, and especially why did we put it to open source? There are, I would say, two thoughts to it. The first one was, of course, we wanted to validate OCM against other use cases outside of SAP. So, we wanted to see what others are thinking of OCM. I mean, because we were all convinced OCM is a great idea.
But you know how it is. If you're just in your echo chamber, you pat yourself on the shoulder and say: 'Great, best thing ever.' And then you show it to somebody outside and they say: 'That's complete crap, what you made there.' So, I mean, we were of course sure that it's not crap, but we wanted to present this to the outside world and get feedback. We're also not doing this on our own. We are, of course, working here together with our valued partner Weaveworks.
They put a lot of thought into that as well. So, we have here, again, I mean, the validation point. So, that's number one, validate OCM by the outside world. And second, we saw that OCM is basically, it's not only useful within SAP, right? So we see that other companies have similar problems. For instance, when we talk about delivering artifacts. It's not only about shipping the stuff to the private cloud. OCM has private cloud deliveries in its mind.
That basically was one of the first use cases we thought about when we came up with OCM. So, we have tooling which basically takes OCM-described artifacts and ships them completely self-contained into any private environment. And we see that other companies have the same problems, so we think it's of value for them. And they can basically see OCM and say, 'Hey, that's a great thing. We have a similar use case,' and can basically take this, adapt it to their needs, and feed back other ideas and tools. So we, again, think this is a kind of feedback loop which gives us valuable information we can then incorporate into our tooling as well.
Well, I guess in the end, standardization is always a benefit to all who could make use of the standard. So in this case, all the Hyperscalers, anyone who's large enough to run a private, or contained cloud, et cetera, et cetera.
Exactly. Yeah.
Great, sounds cool. So, what's the status of it like right now, and what do you envision it to become?
So, we don't have a major version out yet. We started with OCM, specifically, in the middle of last year, so it's about a year old. We are like celebrating a birthday.
Happy birthday.
Thank you.
But we have a few minor releases out already. And these versions, they are fully usable, and they do provide benefit as they are. Like in creating component descriptors, adding resources, that can all be done on a CLI level. So, writing your shell scripts for that shouldn't be an issue. And I would call it a beta state, yeah?
What we envision for the future is really that OCM becomes an open standard in the industry for describing software components, and then also for performing all kinds of lifecycle management processes on top of these descriptions. If you ask me, ideally in one or two years, we have an ecosystem of tools on top of this OCM description from various other vendors or people in the open-source space. Things that enable processes on top of this OCM specification.
I'm just thinking of an example because, interestingly, just, I think, last week or before last, a colleague asked me about a way to attach some basic compliance information, like security standards, information to entities. Would that be an OCM thing to do, or not?
I mean, so I think we already touched on the purpose of extensibility of the OCM format.
Okay.
You can of course take an SPDX and include this in OCM. So I mean, let's take a step back. So, there was the executive order last year by the US government which, in the end, regulates that every software out there, including software-as-a-service solutions, needs to have an SBOM. So if you ship something, then you need to have an SBOM. And this is something that SAP also needs to comply with, right?
And what you can think of now is, I mean, I think it's not good enough to just put an SBOM into every tar archive you have on your marketplace. It kind of needs to be in a context: Where does this SBOM relate to? What does it relate to? So, you need to have context. You need to have something which models the product - that's OCM. So, when you describe your product with OCM, then basically nothing would stop you from including this fine-grained SBOM like SPDX in your OCM archive as well.
That is definitely something which is possible.
The one thing that kind of works on the back of my mind all the time is, how did the world live without something like this before? Because, I mean, no truck is allowed to go down the road without a bill of materials, what it has loaded. Nothing is allowed to go anywhere without declaring itself.
Well, I mean, you can of course have your classical SBOM. But in the end, it's all about, do you want to have your processes more automated, or do you want to have them with a lot of manual interaction? So of course, things like we do them now, they work. I mean, we see this in a lot of places within SAP. I mean, do you need to change anything? No. If you have enough time and want to spend a lot of money, then go for it.
But let's take, for instance, that we want to know whether a certain product we deliver to some places contains a certain vulnerability. The usual problem is finding out what has actually been shipped where. And we have a lot of Excel sheets involved in a lot of these processes, and what have you. So, we would like to lay the foundation with OCM to have a means to fully automate all these things with a fully described software artifact delivery. That's what we aim for.
Of course, you can do it the old way, but it costs you time, and blood, and money. We want to save a lot of our colleagues time, and invest in something which allows us, or enables us, to build tools running on that format, and doing automated processes.
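The "what has been shipped where" question becomes a simple lookup once deliveries are recorded against component IDs. The following Python sketch shows the idea; the data layout, the landscape names, and the artifact references are invented for illustration and do not come from OCM or from SAP's tooling.

```python
def locations_affected(descriptors, deliveries, vulnerable_reference):
    """Return the sorted locations that received a component containing the artifact."""
    affected_ids = {
        cid
        for cid, descriptor in descriptors.items()
        if vulnerable_reference in descriptor["resources"]
    }
    locations = set()
    for cid, delivered_to in deliveries.items():
        if cid in affected_ids:
            locations.update(delivered_to)
    return sorted(locations)


# Hypothetical component versions and the artifacts they contain.
descriptors = {
    "example.org/demo-app:1.2.3": {"resources": [
        "registry.example.org/demo/backend:1.2.3",
        "registry.example.org/base/tls-lib:3.0.0",
    ]},
    "example.org/demo-app:1.2.4": {"resources": [
        "registry.example.org/demo/backend:1.2.4",
        "registry.example.org/base/tls-lib:3.0.1",
    ]},
    "example.org/other-app:0.9.0": {"resources": [
        "registry.example.org/base/tls-lib:3.0.0",
    ]},
}

# Hypothetical record of which component version went to which location.
deliveries = {
    "example.org/demo-app:1.2.3": ["landscape-eu", "landscape-us"],
    "example.org/demo-app:1.2.4": ["landscape-eu"],
    "example.org/other-app:0.9.0": ["landscape-private"],
}

print(locations_affected(
    descriptors, deliveries, "registry.example.org/base/tls-lib:3.0.0"
))
```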
I forgot the machine-readable part when I asked that question, 'how did the world live without it before?' So basically, it's the change from the truck being stopped by customs to an RFID chip that is just simply being read, right?
Exactly, right. Yeah.
Okay. So with that, as you said 'Exactly' to an analogy, I think, for the first time, let's leave it with that analogy and come to the famous last two questions. And the famous last two questions are, before last question: If anyone is interested to learn more about OCM, where do they go?
Ideally, they just go over to the website ocm.software. We have quite a bunch of tutorials, guides, documentation, contribution guidelines. I mean, it's open source. If you're missing a feature or want to contribute, just go there. We have guidelines on how to contribute. All of our GitHub repos and the GitHub organization are linked there. So, that's the place to go: ocm.software.
Okay. Anything else? Any YouTube tutorials or marketing clips?
We're working on that.
You're working on that.
Yeah.
Okay, we'll probably find out then. Then, famous last question: the three or four main things that you want people to remember from the podcast - one could call it key takeaways.
Yeah, I think, Ingo, what we both, let's say, repeat a lot of times with OCM: we just want to establish a common - I mean for us, at least at SAP, but also outside - a common machine-readable format for describing software artifacts to enable automation, yeah. And OCM is versatile, it's extensible. And like I said, it enables automation of many software lifecycle related processes. And there is a new term we would like to coin here. It creates software bills of delivery, yeah? So, these are the so-called SBoDs. I mean, you can also read this up on our ocm.software webpage. And Karsten, I have to say it, in this cast, in the rules, you have to accept this.
Okay. This is kind of an insider thing. When we talked a little bit before, I kind of said, well, I'm not too fond of these Indiana Jones movies anyway, not even of the old ones. And I was like, these two guys here almost cancelled this recording appointment when I had said that?
It was close to being cancelled, really.
Close, yeah. Tough call.
But as I also said before, I don't have anything against Harrison Ford, he's played some great things. I don't have anything against LucasArts, I love the Star Wars movies. And I don't know, we shouldn't mention any more brands now. So, let's leave it at that. And this is not key takeaways about OCM either, although we could probably recoil the acronym in some way, that would fit that, anyway. So, thank you very much, Jason and Ingo, for being our guests today. It was nice to have you both here.
Yeah. Thank you too, Karsten. Was great to have the opportunity to participate in your podcast.
Thank you. Same for me.
And thank you out there for listening to The Open Source Way. If you enjoyed this episode, please share it. And don't miss our next one, published every last Wednesday of the month. You'll find us on openSAP and in most of those places where you find your other podcasts like Apple Podcasts, Spotify, and the likes. Thanks again and bye bye.
Bye bye.
Thank you.
Bye.