This is Masters in Business with Barry Ritholtz on Bloomberg Radio.
This week on the podcast, strap yourself in, I have another extra special guest. Jon McAuliffe is co-founder and chief investment officer at the Voleon Group. They're a five-billion-dollar hedge fund and one of the earliest shops to ever use machine learning as it applies to trading and investment management decisions. It is a fully systematic approach to using computer horsepower and databases and machine learning and their own predictive engine to make investments and trades, and
it's managed to put together quite a track record. Previously, Jon was at D. E. Shaw, where he ran statistical arbitrage. He is one of the people who worked on the Amazon recommendation engine, and he is currently a professor of
statistics at Berkeley. I don't even know where to begin, other than to say, if you're interested in AI or machine learning or quantitative strategies, this is just a masterclass in how it's done by one of the first people in the space to not only do this sort of machine learning and apply it to investing, but one of the best. I think this is a fascinating conversation and I believe you will find it to be so. So, with no
further ado, my discussion with the Voleon Group's Jon McAuliffe.
Jon McAuliffe, welcome to Bloomberg.
Thanks, Barry. I'm really happy to be here.
So let's talk a little bit about your academic background first. You start out undergrad in computer science and applied mathematics at Harvard before you go on to get a PhD from the University of California, Berkeley. What led to a career in data analysis? How early did you know that's what you wanted to do?
Well, it was a winding path. Actually, I was very interested in international relations and foreign languages when I was finishing high school. In fact, I spent the last year of high school as an exchange student in Germany. And so when I got to college, I was expecting to major in government and go on to maybe work in the foreign service something like that.
Really, so this is a big shift from your original expectations.
Yeah, it took about one semester for me to realize that none of the questions that were being asked in my classes had definitive and correct answers.
Did that frustrate you?
It did frustrate me. Yeah. And so I stayed home over winter... excuse me, I didn't go home. I stayed at college over winter break to try to sort out what the heck I was going to do, because I could see that my plan was in disarray. And I'd always been interested in computers, had played around with computers, never done anything very serious, but I thought I might as well give it a shot, and so in the spring semester I took my first
computer science course. And when you write software, everything has a right answer. It either does what you wanted it to do or...
It doesn't, it does not compile, exactly. So that's really quite fascinating. So what led you from Berkeley to D. E. Shaw? They're one of the first quant shops. How did you get there? What sort of research did you do?
I actually, I spent time at D. E. Shaw in between my undergrad and my PhD program. So it was after Harvard that I went to D. E. Shaw.
Did that light an interest in using machine learning and computers applied to finance or what was that experience like?
Yeah, it made me really interested in and excited about using statistical thinking and data analysis to sort of understand the dynamics of securities prices. Machine learning did not really play a role at that time, I think, not at D. E. Shaw, but, you know, probably nowhere; it was too immature a field in the nineties. But I had already been curious and interested in using these kinds of statistical tools in trading and in investing when I was finishing college and then at D. E. Shaw. You know, I had
brilliant colleagues and we were working on hard problems. So I really, I really got a lot out of it.
It's still one of the top performing hedge funds, one of the earliest quant hedge funds, a great place to absolutely cut your teeth at. So was it Harvard, D. E. Shaw, and then Berkeley?
Yeah, that's right?
And then from Berkeley? How did you end up at Amazon?
I guess I should correct myself. There was a year at Amazon after D. E. Shaw, but before Berkeley.
And am I reading this correctly? The recommendation engine that Amazon uses you helped develop?
I would say I worked on it. You know, it existed, it was in place when I got there, and sort of the things that are familiar about the recommendation engine had already been built by my manager and his colleagues. But I did research on improvements and different ways of forming recommendations. It was funny because at the time, the entire database of purchase history for all of Amazon fit in one twenty-gigabyte file on a disk, so I could just load it on my computer and run.
Now I don't think we could do that anymore. We could not. So, thank goodness for Amazon's cloud services, so you could put, what is it, twenty-five years and hundreds of billions of dollars of transactions there. So my assumption is products like that are highly iterative. The first version is all right, it does a half decent job, and then it gets better, and then it starts to
get almost spookily good. It's like, oh... How much of that is just the size of the database, and how much of that is just a clever algorithm?
Well, that's a great question, because the two are inextricably linked. The way that you make algorithms great is by making them more powerful, more expressive, able to describe lots of different kinds of patterns and relationships. But those kinds of approaches need huge amounts of data in order to correctly sort out what's signal and what's noise. The more expressive
a tool like that is, like a recommender system, the more prone it is to mistake one-time noise for persistent signal, and that is a recurring theme in statistical prediction. It is really the central problem in statistical prediction. So you have it in recommender systems, you have it in predicting price action, in the problems that we solve, and elsewhere.
There was a pretty infamous New York Times article a couple of years ago about Target using its own recommender system and sending out maternity things to people. A dad sees what his young teenage daughter is getting, says what is this, and goes in to yell at them, and it turns out she was pregnant and they had pieced it together. How far of a leap is it from these systems to much more sophisticated machine learning and even large language models?
The answer, it turns out, is that it's a question of scale. That wasn't at all obvious before GPT-3 and ChatGPT. But it just turned out that when you have, for example... GPT is built from a database of sentences in English; it's got a trillion words in it, that database. And when you take a trillion words and you use it to fit a model that has one hundred and seventy-five billion parameters, there is apparently a kind of transition where things become, you know,
frankly astounding. I don't think, I don't think that anybody who isn't astounded is telling the truth.
Right. It's eerie in terms of how sophisticated it is, but it's also kind of surprising in terms of, I guess, what the programmers like to call hallucinations. I guess if you're using the Internet as your base model, hey, there's one or two things on the Internet that are wrong, so of course that's going to show up in something like ChatGPT.
Yeah, you know, underlying it, there's this tool GPT-3 that's really the engine that powers ChatGPT, and that tool has one goal. It's a simple goal. You show it the beginning of a sentence, and it predicts the next word in the sentence, and that's all it is trained to do. I mean, it really is actually that simple.
It's a dumb program that looks smart.
If you like. But the thing about predicting the next word in a sentence is that whether the sequence of words being output is leading to something that is true or false is irrelevant. The only thing that it is trained to do is make highly accurate predictions of next words.
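A minimal sketch of that "predict the next word" idea, as a toy bigram counter rather than anything like GPT-3's actual architecture; the corpus and names here are invented purely for illustration:

```python
from collections import Counter, defaultdict

# Toy version of "show it the beginning of a sentence, predict the next word":
# count which word follows which in a tiny corpus, then guess the most frequent
# continuation. GPT-3 does this with a huge neural network, not raw counts.
corpus = "the cat sat on the mat and the cat ate the fish".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Most common word observed after `word`, or None if never seen."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))   # 'cat' - the word seen most often after 'the'
```

Nothing in this sketch knows or cares whether the output is true, which is exactly the point being made above.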
So when I said it's really very sophisticated... we tend to call this artificial intelligence. But I've read a number of people who said, hey, this really isn't AI, this is something a little more rudimentary.
Yeah, I think, you know, a critic would say that artificial intelligence is a complete misnomer. There's sort of nothing remotely intelligent in the colloquial sense about these systems. And then a common defense in AI research is that artificial intelligence is a moving target. As soon as you build a system that does something quasi magical that was the old yardstick of intelligence, then the goalposts get moved by the people who are supplying the evaluations. And I guess
I would sit somewhere in between. I think the language is unfortunate because it's so easily misconstrued. I wouldn't call the system dumb, and I wouldn't call it smart. It's you know, those are those are not characteristics of these systems.
But it's complex and sophisticated.
It certainly is. It has one hundred and seventy-five billion parameters. If that doesn't fit your definition of complex, you know, what would?
Yeah, that works for me. So in your career timeline, where does Affymetrix fit in, and what was that like?
Yeah. So that was work I did as a summer research intern during my PhD. And that work was about what's called, the problem is called, genotype calling. So genotype calling, I'll explain. Barry, do you have an identical twin? I do not. Okay. So I can safely say your genome is unique in the world. There's no one else who has exactly your genome. On the other hand, if you were to lay your genome and mine alongside each other, lined up, they would be ninety-nine point nine percent identical.
About one position in a thousand is different. But those differences are what caused you to be you and me to be me. So they're obviously of intense kind of scientific and applied interest. And so it's very important to be able to take a sort of a sample of your DNA and quickly produce a profile of all the places that have variability what your particular values are, Okay, And that problem is the genotyping problem.
And this used to be a very expensive, very complex problem to solve, one that we've spent billions of dollars figuring out. Now it's a lot faster, a lot cheaper.
A lot faster. In fact, even the technology I worked on in two thousand and five two thousand and four is multiple generations old and not really what's used anymore.
So let's talk about what you did at Efficient Frontier. Explain what real-time click prediction rules are and how it works for a keyword search.
Sure, the revenue engine that drives Google is search keyword ads, right, So every time you do a search, at the top you see ad ad AD, And so how do those ads get there? Well, actually it's surprising maybe if you don't know about it, but every single time you type in a search term on Google and hit return, a very fast auction takes place, and a whole bunch of companies running software bid electronically to place their ads at
the top of your search results. And more or less, the results that are shown on the page are in order of how much they bid. It's not quite true, but you could think of it as true in
a rough outline. So the first three sponsored results on a Google page go through that auction process, and I think at this point everybody knows what PageRank is for the rest of them. That's right. And that seemed to be Google's secret sauce early on.
Right, Well, you know, to talk about the the ad placement. So the people who are supplying the ad, who are participating in the auctions, they have a problem, which is how much to bid, right, And so how would you decide how much to bid? Well, you want to know basically the probability that somebody is going to click on your ad, and then you would multiply that by how much money you make eventually if they click. And that's kind of an expectation of how much money you'll make.
And so then you gear your bid price to make sure that it's going to be profitable for you. And then so really you have to make a decision about what this click through rate is going to be. You have to predict the click through probability.
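A hedged sketch of that arithmetic: the expected value of an impression is the predicted click-through probability times the value of a click, and a bidder stays profitable by bidding below it. The numbers and the margin parameter here are invented for illustration, not anything from the conversation.

```python
def max_profitable_bid(p_click, value_per_click, margin=0.2):
    """Expected revenue per impression = P(click) * value of a click.
    Bid below that (leaving `margin`) so the ad is profitable in expectation."""
    expected_value = p_click * value_per_click
    return expected_value * (1 - margin)

# A predicted 2% click-through rate on a click worth $40 supports a bid of about $0.64.
print(max_profitable_bid(p_click=0.02, value_per_click=40.0))
```

Get the click-through prediction wrong and the same formula happily recommends unprofitable bids, which is the point Barry makes next.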
So I was going to say, this sounds like it's a very sophisticated application of computer science probability and statistics. And if you do it right, you make money, and if you do it wrong, your ad budget is a money loser.
That's right.
Huh. So tell us a little bit about your doctorate, what you wrote about for your PhD at Berkeley.
Yeah, so we're back to genomes. Actually, this was around the time when I was in my first year of my PhD program, is when the human genome was published in Nature. So it was kind of really the beginning of the explosion of work on kind of high throughput, large scale genetics research. And one really important question after you've sequenced a genome is well, what are all the bits of it doing. You can look at a string
of DNA. It's just made up of these kind of four letters, but you don't want to just know the four letters. They're kind of a code. And some parts of the DNA represent useful stuff that is being turned by your cell into proteins and et cetera, and other parts of the DNA don't appear to have any function at all, and it's really important to know which is
which as a biology researcher. And so it's you know, for a long time before high throughput sequencing, biologists would be in the lab and they would very laboriously look at very tiny segments of DNA and establish what their function was. But now we have the whole human genome sitting on disk, and we would like to be able to just run an analysis on it and have the computer spit out everything that is functional and not functional.
And so that's the problem I worked on. And a really important insight is that you can take advantage of the idea of natural selection and the idea of evolution to help you. And the way you do that is you have the human genome, you sequence a bunch of primate genomes, nearby relatives of the human, and you lay all those genomes on top of each other, and then you look for places where all of the genomes agree. Right,
there hasn't been variation that's happening through mutations. And why hasn't there been? Well, the biggest force that throws out variation is natural selection. If you get a mutation in a part of your genome that really matters, then you're kind of done, and you won't have progeny, and that'll get stamped out. So natural selection is this very
strong force that's causing DNA not to change. And so when you when you make these primate alignments, you can really leverage that fact and look for conservation and use that as a big signal that something is functional.
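A rough sketch of that conservation idea; the sequences below are made up and far shorter than real genomes, but the logic, scanning aligned species and scoring windows by how often every genome agrees, is the gist:

```python
# Aligned stretches of (invented) human, chimp, and gorilla sequence.
alignment = [
    "ACGTACGTACGTAAGT",   # human
    "ACGTACCTACGTAAGT",   # chimp
    "ACGTACTTACGTAAGT",   # gorilla
]

def conservation_scores(alignment, window=4):
    """Fraction of positions in each window where all species carry the same base.
    High scores suggest purifying selection, i.e. a candidate functional region."""
    length = len(alignment[0])
    scores = []
    for start in range(length - window + 1):
        columns = zip(*(seq[start:start + window] for seq in alignment))
        identical = sum(len(set(col)) == 1 for col in columns)
        scores.append(identical / window)
    return scores

print(conservation_scores(alignment))   # scores dip where the species disagree
```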
Huh, really, really interesting. You mentioned our DNA is ninety-nine point ninety-nine, I don't know how many places to the right of the decimal point you would want to go, but very similar. How similar or different are we from, let's say, a chimpanzee? I've always questioned, there's an urban legend that they're practically the same; it always seems like it's overstated. Two percent? So you and I have a point one percent difference; me and the average chimp, it's two point zero percent.
That's exactly right. Yeah, so chimps are essentially our closest non human primate relatives.
Really really quite fascinating. So let's talk a little bit about the firm. You guys were one of the earliest pioneers of machine learning research. Explain a little bit what the firm does.
Sure, so, we run trading strategies investment strategies that are fully automated, so we call them fully systematic, and that means that we have software systems that run every day during market hours, and they take in information about the
characteristics of the securities we're trading. Think of stocks and then they make predictions of how the prices of each security is going to change over time, and then they decide on changes in our inventory, changes in held positions based on those predictions, and then those desired changes are sent into an execution system which automatically carries them out.
So fully automated. Is there supervision or it's kind of running on its own with a couple of checks.
There's lots of human diagnostic supervision, right, So there are people who are watching screens full of instrumentation and telemetry about what the systems are doing. But those people are not taking any actions, right unless there's a problem, right, and then they do.
So let's talk a little bit about how machines learn to identify signals. I'm assuming you start with the giant database that is the history of stock prices, volume movement, etc. And then bring in a lot of additional things to bear. What's the process like developing a particular trading strategy.
Yeah, so, as you're saying, we begin with a very large historical data set of prices and volumes, market data that kind, but importantly all kinds of other information about securities, financial statement data, textual data, analyst data.
So it's everything from prices to fundamentals, everything from earnings to revenue to sales, et cetera. I'm assuming the change and the delta of the change is going to be very significant in that. What about macroeconomic data, what some people call noise, but one would imagine some signal in everything from inflation to interest rates to GDP, firm spending? Are those inputs worthwhile, or how do you think about those?
So we don't hold portfolios that are exposed to those things. So it's really a business decision on our part. We are working with institutional investors who already have as much exposure as they want to things like the market or to well recognized econometric risk factors like value, and so they don't need our help to be exposed to those things. They are very well equipped to handle that part of their investment process. What we're trying to provide is the
most diversification possible. So we want to give them a new return stream which has good and stable returns, but on top of that, importantly, is also not correlated with any of the other return streams that they already that they already have.
That's interesting. So can I assume that you're applying your machine learning methodology across different asset classes, or is it strictly equities?
Oh no. We apply it to equities, to credit, to corporate bonds, and we trade futures contracts, and in the fullness of time, we hope that we will be trading kind of every security in the world.
So, so currently stocks, bonds... When you say futures, I assume commodities, all
kinds of futures contracts.
It's really really interesting. So it could be anything from interest rate swaps to commodities to the full gamut. So, so how different is this approach from what other quant shops do that really focus on equities.
I think it's kind of the same question as asking, well, what do we mean when we say we use machine learning, or that our principles are machine learning principles, and so how does that make us different from the kind of standard approach in quantitative trading? And the answer to the question really comes back to this idea we mentioned a little while ago of how powerful the tools are that you're using to form predictions. Right. So in our business, the thing that we build is
called a prediction rule. Okay, that's our widget. And what a prediction rule does is it takes in a bunch of input, a bunch of information about a stock at a moment in time, and it hands you a guess about how that stock's price is going to change over some future period of time. Okay. And so there is one most important question about prediction rules, which is how complex are they? How much complexity do they have?
Complexity is a colloquial term. It's unfortunately another example of a place where things can be vague or ambiguous because a general purpose word has been borrowed in a technical setting. But when you use the word complexity in statistical prediction, there's a very specific meaning. It means how much expressive power does this prediction rule have? How good a job can it do of approximating what's going on in the
data you show it. Remember, we have these giant historical data sets, and every entry in the data set looks like this: what was going on with the stock at a certain moment in time, its price action, its financials, analyst information. And then what did its price do in the subsequent twenty-four hours or the subsequent
fifteen minutes or whatever. Okay, and so when you talk about the amount of complexity that a prediction rule has, that means how well is it able to capture the relationship between the things that you can show it when you ask it for a prediction, and what actually happens to the price. And naturally you kind of want to use high complexity rules because they have a lot of approximating power. They do a good job of describing anything that's going on. But there are two There are two
disadvantages to high complexity. One is it needs a lot of data, otherwise it gets fooled into thinking that randomness is actually signal. And the other is that it's hard to reason about what's going on under the hood. Right, when you have very simple prediction rules, you can sort
of summarize everything good that they're doing in a sentence. Right, you can look inside them and get a complete understanding of how they behave, and that's not possible with high complexity prediction rules.
So I'm glad you brought up the concept of how easy it is or how frequently you can fool an algorithm or a complex rule, because sometimes the results are just random. And it reminds me of the issue of back testing. No one ever shows you a bad back test. How do you deal with the issue of overfitting and back testing that just is geared towards what already happened and not what might happen in the future.
Yeah, that is, if you like, the million-dollar question in statistical prediction. Okay. And you might find it surprising that relatively straightforward ideas go a long way here. So let me just describe a little scenario of how you can deal with this. All right, we agree we have this big historical data set, right? One thing you could do is just start analyzing the heck out of that data set and find a complicated prediction rule. But you've
already started doing it wrong. The first thing you do, before you even look at the data, is you randomly pick out half of the data and you lock it in a drawer. Okay. And that leaves you with the other half of the data that you haven't locked away. On this half, you get to go hog wild. You build every kind of prediction rule, simple rules, enormously complicated rules,
everything in between. Right, and now you can check how accurate all of these prediction rules that you've built are on the data that they have been looking at, and the answer will always be the same. The most complex rules will look the best. Of course, they have the most expressive power, so naturally they do the best job
of describing what you showed them. The big problem is that what you showed them is a mix of signal and noise, and there's no way you can tell to what extent a complex rule has found the signal versus the noise. All you know is that it has perfectly described the data you showed it. You certainly suspect it must be overfitting if it's doing that well. Okay, so now you freeze all those prediction rules. You're not allowed
to change them in any way anymore. And now you unlock the drawer and you pull out all that data that you've never looked at. You can't overfit data that you never fit. So you take that data and you run it through each of these frozen prediction rules that you built. And now it is not the case at all that the most complex rules look the best. Instead, you'll see a kind of U-shaped behavior where the very simple rules are too simple. They've missed signal; they
left signal on the table. The too-complex rules are also doing badly, because they've captured all the signal but also lots of noise. And then somewhere in the middle is a sweet spot where you've struck the right trade-off between how much expressive power the prediction rule has and how good a job it is doing of avoiding mistaking noise for signal.
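That lock-half-the-data-in-a-drawer procedure is just a holdout split, and the U shape is easy to reproduce on synthetic data. A minimal sketch, with polynomial degree standing in for prediction-rule complexity; everything here is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 240)
y = np.sin(3 * x) + rng.normal(0, 0.4, 240)      # signal plus noise

# "Lock half the data in a drawer": split before fitting anything.
train, test = np.arange(40), np.arange(40, 240)

for degree in (1, 3, 9, 18):                      # increasing complexity
    coeffs = np.polyfit(x[train], y[train], degree)
    err = lambda idx: np.mean((np.polyval(coeffs, x[idx]) - y[idx]) ** 2)
    print(f"degree {degree:2d}  train MSE {err(train):.3f}  held-out MSE {err(test):.3f}")
# Training error only falls as complexity grows; held-out error typically
# traces the U shape described above, with a sweet spot in the middle.
```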
Really, really intriguing. So you guys have built one of the largest specialized machine learning research and development teams in finance. How do you assemble a team like that, and how do you get the brain trust to do the sort of work that's applicable to managing assets?
Well, the short answer is, we spend a huge amount of energy on recruiting and, you know, identifying the sort of premier people in the field of machine learning, both academics and practitioners, and we exhibit a
lot of patience. We wait a really long time to be able to find the people who are really the best, and that matters enormously to us, both from the standpoint of the success of the firm and also because it's something that we value extremely highly, just having great colleagues, brilliant colleagues. You know, I want to work in a place where I can learn from all the people around me.
And you know, when my co-founder Michael Kharitonov and I were talking about starting Voleon, one of the reasons that was on our minds is we wanted to be in control of who we worked with. You know, we really wanted to be able to assemble a group of people who were, you know, as brilliant as we could find, but also, you know, good people, people that we liked, people that we were excited to collaborate with.
So let's talk about some of the fundamental principles Voleon is built on. You reference a prediction-based approach from a paper Leo Breiman wrote called The Two Cultures. Yeah. Tell us a little bit about what the two cultures actually are.
Yeah. So this paper was written about twenty years ago. Leo Breiman was one of the great probabilists and statisticians of his generation, a Berkeley professor, need I say. And you know, Leo had been a practitioner in statistical consulting, actually, for quite some time, in between a UCLA tenured job and returning to academia at Berkeley, and he learned a lot in that time about actually solving prediction problems, instead of hypothetically solving them in sort
of the academic context. And so all of his insights about the difference really culminated in this paper from two thousand that he wrote.
The difference between practical use versus academic theory if you like.
Yeah, And so he identified two schools of thought about solving prediction problems, right, and one school is sort of model based. The idea is there's some stuff you're going to get to observe stock characteristics. Let's say there's a thing you wish you knew future price change, let's say, and there's a box in nature that turns those inputs into the output, right. And in the model based school of thought, you try to open that box, reason about how it must work, make theories. In our case, these
would be sort of econometric theories, financial economics theories. And then those theories have knobs, not many, and you use data to set the knobs, but otherwise you believe the model. And he contrasts that with the machine learning school of thought, which also has the idea of nature's box. The inputs go in, the thing you wish you knew comes out. But in machine learning, you don't try to open the box. You just try to approximate what the box is doing.
And your measure of success is predictive accuracy, and is only predictive accuracy. If you build a gadget and that gadget produces predictions that are really accurate they turn out to look like the thing that nature produces, then that is success. And at the time he wrote the paper, his assessment was ninety eight percent of statistics was taking the model based approach, two percent was taking the machine learning approach.
And are those statistics still valid today or have we shifted quite a bit?
We've shifted quite a bit. And different arenas of prediction problems have different mixes these days. But even in finance, I would say it's probably more like fifty
fifty. Really, that much?
Yeah, I think, you know... And the logical extreme is natural language modeling, which was done for decades and decades in the model-based approach, where you kind of reasoned about linguistic characteristics of how people do dialogue, and those models had some parameters and you fit them with data. And then instead you have, as we said, a database of a trillion words and a tool with one hundred and seventy-five billion parameters, and you run that, and there is no hope of completely
understanding what is going on inside of GPT-3. But nobody complains about that, because the results are astounding. The thing that you get is incredible. And so that is, by analogy, the way that we reason about running systematic investment strategies. At the end of the day, predictive accuracy is what creates returns for investors. Being able to give complete descriptions of exactly how the predictions arise does not in itself create returns for investors. Now, I'm not against
interpretability and simplicity, all else equal. I love interpretability and simplicity, but all else is not equal. If you want the most accurate predictions, you are going to have to sacrifice some amount of simplicity. In fact, this truth is so widespread that Leo gave it a name in his paper. He called it Occam's dilemma. So Occam's razor is the philosophical idea that you should choose the simplest explanation that
fits the facts. Occam's dilemma is the point that in statistical prediction, the simplest approach, even though you wish you could choose it, is not the most accurate approach. If you care about predictive accuracy, if you're putting predictive accuracy first, then you have to embrace a certain amount of complexity and lack of interpretability.
Huh, that's really quite fascinating. So let's talk a little bit about artificial intelligence and large language models. You followed D. E. Shaw with stints in e-commerce and biotech. It seems like this approach to using statistics, probability, and computer science is applicable to so many different fields.
It is. Yeah. I think you're talking about prediction problems, ultimately. So in recommender systems, you can think of the question as being, well, if I had to predict what thing I could show a person that would be most likely to change their behavior and cause them to buy it; it's that kind of prediction problem that motivates recommendations. In biotechnology,
very often we are trying to make predictions about whether someone, let's say, does or doesn't have a condition, a disease, based on lots of information we can gather from high-throughput diagnostic techniques. These days, the key word in biology and
in medicine and biotechnology is high throughput. You're running analyses on an individual that are producing hundreds of thousands of numbers, and you want to be able to take all of that wealth of data and turn it into diagnostic information about that person.
And we've seen AI get applied to pharmaceutical development in ways that people just never really could have imagined just a few short years ago. Is there a field that AI and large language models are not going to touch or is this just the future of everything.
The kinds of fields where you would expect uptake to be slow are where it is hard to assemble large data sets of systematically gathered data. And so any field where it's relatively easy to, at large scale, let's say, produce the same kinds of information that experts are using to make their decisions, you should expect that field to be impacted by these tools, if it hasn't been already.
So you're kind of answering my next question, which is what led you back to investment management. But it seems if there's any field that just generates endless amounts of data.
It's the markets. That's true. And I had already been really interested in the problems of systematic investment strategies from my time working at D. E. Shaw. And so my co-founder Michael Kharitonov and I, you know, we were both in the Bay Area in two thousand and four. He was there because of a firm that he had founded, and I was there finishing my PhD. And we started to talk about the idea of using contemporary machine learning methods to build strategies that would be, you know, really
different from strategies that result from classical techniques. We had met at D. E. Shaw in the nineties and been less excited about this idea then, because the methods were pretty immature. There wasn't actually a giant diversity of data back in the nineties in financial markets, not like there was in two thousand and five. And compute was really still quite expensive in the nineties, whereas in two thousand and five, you know, it had been dropping in the usual Moore's
Law way. And this was even before GPUs. And so when we looked at the problem in two thousand and five, it felt like there was a very live opportunity to do something with a lot of promise that would be really different. And we had the sense that not a lot of people were of the same opinion, and so it seemed like something that we should try.
There was a void. Nothing the market hates more than a vacuum, even an intellectual one. So you mentioned the diversity of various data sources. What don't you consider? Like, how far off of price and volume do you go in the net you're casting for inputs into your systems?
Well, I think we're prepared, as a, you know, as a research principle, we're prepared to consider any data that has some plausible bearing on price formation, on how prices are formed. Now, of course, we're a relatively small group of people with a lot of ideas, and so we have to prioritize. So, you know, in the end we end up pursuing data that, you know, makes a lot of sense. You know, we don't, we don't try...
I mean, can you go as far as politics or the weather, like how far off of prices can you can you look?
So, you know, an example would be the weather. For most securities, you're not going to be very interested in the weather, but for commodities futures you might be. So, you know, that's the kind of reasoning you would apply.
Right, really, really interesting. So let's talk about some of the strategies. You guys are running short and mid-horizon US equities, European equities, Asian equities, mid-horizon US credit, and then cross-asset. So am I right to assume all of these are machine learning based? And how similar or different is the approach to each of those asset classes?
Yeah, they're all machine learning based. The kind of principles that I've described of using as much complexity as you need to maximize predictive accuracy, et cetera. Those principles underlie all the systems. But of course it's trading. Trading corporate bonds is very different from trading equities, and so the implementations reflect that reality.
Huh. So let's talk a little bit about the four-step process that you bring to the systematic approach, and this is off of your site, so it's data, prediction engine, portfolio construction, and execution. Yeah, I'm assuming that is heavily computer and machine learning based at each step along the way. Is that fair?
I think that's fair. I mean, to different degrees. The data gathering, that's, you know, largely a software and kind of operations and infrastructure job.
Do you guys have to spend a lot of time cleaning up that data and making sure... because between CRSP and S&P and Bloomberg, sometimes you'll pull something up and they're just all off a little bit from each other, because they all bring a very different approach to data assembly. How do you make sure everything is consistent and there are no errors or errant inputs throughout?
Yeah, through a lot of effort, essentially. We have an entire group of people who focus on data operations, both for gathering historical data and for the management of the ongoing live data feeds. There's no way around that. I mean, that's just work that you have to do.
You just have to brute force your way through that. And then the prediction engine. Sounds like that's the single most important part of the machine learning process if I'm understanding you correctly, that that's where all the meat of the technology is.
Yeah, I understand the sentiment. I mean, it's worth emphasizing that you do not get to a successful systematic strategy without all the ingredients. You have to have clean data because of the garbage in garbage out. You have to have accurate predictions. But you know, predictions don't automatically translate into returns for investors. Those predictions are kind of the power that drives the portfolio holding part of the system.
So let's talk about that portfolio construction. Given that you have a prediction engine and good data going into it, so you're fairly confident as to the output, how do you then take that output and say, here's how I'm going to build a portfolio based on what this generates?
Yeah, so there are three big ingredients in the portfolio construction. The predictions; what is usually called a risk model in this business, which means some understanding of how volatile prices are across all the securities you're trading, how correlated they are, and, if they have a big movement, how big that movement will be. That's
all the risk model. And then the final ingredient is what's usually called a market impact model, and that means an understanding of how much you are going to push prices away from you when you try to trade. This is a reality of all trading. You buy a lot of a security, you push the price up, you push
it away from you in the unfavorable direction. And in the systems that we run, the predictions that we're trying to capture are about the same size as the effect that we have on the markets when we trade, and so you cannot neglect that impact effect when you're thinking about what portfolios to hold.
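A stripped-down, one-asset sketch of how those three ingredients could combine: a return prediction (alpha), a risk model (volatility), and a quadratic market-impact penalty on the trade, maximized in closed form. The functional form and all coefficients are assumptions for illustration, not Voleon's actual model.

```python
def target_trade(alpha, sigma, current_pos, risk_aversion=1e-4, impact=2e-5):
    """Choose the trade t that maximizes
        alpha*(q0 + t) - risk_aversion*sigma^2*(q0 + t)^2 - impact*t^2
    i.e. predicted gain, minus a risk-model penalty on the resulting position,
    minus a market-impact cost that grows with how much you trade.
    Setting the derivative in t to zero gives the closed form below."""
    numerator = alpha - 2 * risk_aversion * sigma ** 2 * current_pos
    return numerator / (2 * (risk_aversion * sigma ** 2 + impact))

# Predicting +10 bps on a name with 2% volatility, starting from a flat position:
print(target_trade(alpha=0.0010, sigma=0.02, current_pos=0.0))   # ~25 units to buy
# Halve the prediction or double the impact coefficient and the trade shrinks,
# which is why neglecting impact tends to make systems trade too much.
```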
So execution becomes really important. If you're not executing well, you are moving prices away from your profit.
That's right. And it is, you know, probably the single thing that undoes quantitative hedge funds most often: they misunderstand how much they're moving prices. They get too big, they start trading too much, and they sort of blow themselves up.
It's funny that you say that, because as you were describing that, the first name that popped into my head was Long-Term Capital Management, trading these really thinly traded, obscure fixed income products, and everything they bought they sent higher, because there just wasn't any volume in it. And when they needed liquidity, there was none to be had. And you know, that plus no risk management and one hundred x leverage equals a kaboom.
They made a number of mistakes. The book, the book is good. When Genius Failed. Oh, absolutely, love that, fantastically fascinating.
So when you're reading a book like that, somewhere in the back of your head, are you thinking, hey, this is like a what-not-to-do when you're setting up a machine learning fund? How influential is something like that?
Well, one hundred percent. I mean, look, I think the most important adage I've ever heard in my professional life is: good judgment comes from experience; experience comes from bad judgment. So the extent to which you can get good judgment from other people's experience, that is like free tuition. And so we talk a lot about all the mistakes that other people have made. And you know, we do not congratulate ourselves
on having avoided mistakes. We think those people were smart. I mean, look, you read about these events and these people; none of these people were dummies. They were sophisticated Nobel laureates.
Yeah, right, they just didn't have a guidebook on what not to do, which you guys
do. We don't. No, I don't think we do. I mean, apart from reading about it, right. But everybody is undone by a failure that they didn't think of or didn't know about yet. And we're extremely cognizant of that.
Huh. That has to be somewhat humbling, constantly being on the lookout for that blind spot that could disrupt everything.
Yes, yeah, humility is the key ingredient in running these systems.
Really quite amazing. So let's talk a little bit about how academically focused Voleon is. You guys have a pretty deep R and D team internally; you teach at Berkeley. What does it mean for a hedge fund to be academically focused?
What I would say probably is kind of evidence based rather than academically focused. Saying academically focused gives the impression that kind of papers would be the goal or the desired output, and that's not the case at all. We have, you know, a very specific applied problem that we are trying to solve.
Papers are a means to an end.
Papers are you know, we don't write papers for external consumption. We do lots of writing internally, and that's to make sure that that you know, we're keeping track of our own kind of scientific process.
But you're fairly widely published in statistics and machine learning. Yes. What purpose does that serve, other than a calling card for the fund, as well as, hey, I have this idea and I want to see what the rest of my peers think of it? When you put stuff out into the world, what sort of feedback or pushback do you get?
I guess I would have to say, I really do that as kind of a double life of non-financial research. So it's just something that I really enjoy. Principally, what it means is that I get to work with PhD students, and, you know, we have really outstanding PhD students at Berkeley in statistics, and so it's an opportunity for me to do a kind of intellectual work, namely, you know, writing a paper, laying out an argument for public consumption, et cetera, that is otherwise kind of closed off to me.
So not adjacent to what you guys are doing at Voleon, generally? No, no. That's really interesting. So then, I always assumed that that was part of your process for developing new models to apply machine learning to new assets. Take us through the process. How do you go about saying, hey, this is an asset class we don't have exposure to, let's see how to apply what we already know to that specific area?
Yeah, it's a great question. So we're trying as much as possible to get the problem for a new asset class into a familiar setup, into, you know, as standard a setup as we can, and so we know what these systems look like in the world of equities. And so if you're trying to build the same kind of system for corporate bonds, and you start off by saying, well, okay, I need to know, you know, closing prices
or intraday prices for all the bonds, already you have a very big problem in corporate bonds, because there is no live price feed that's showing you a bid-offer quote in the way that there is in equities. And so before you can even get started thinking about predicting how a price is going to change, it would be nice if you knew what the price currently was, and that is already a problem you have to solve in corporate bonds, as opposed to being just an input that you have access to.
The old joke was trading by appointment only. Yeah. And that seems to be a bit of an issue. And there are so many more bond issues than there are equities. Absolutely. Is this just a database challenge, or how do you work through it?
No, it's a statistics problem, but it's a different kind of statistics problem. In this case, we're not yet trying to predict the future of any quantity. We're trying to say, I wish I knew what the fair value of this CUSIP was. I can't see that exactly, because there's no live order book with a bid and an offer that's got lots of liquidity that lets me figure out the fair value. But I do know what...
At best, you have a recent price, maybe not even so recent.
I have lots of related information. I know, you know, this bond, maybe this bond didn't trade today, but it traded a few times yesterday, and I get to see where it traded. I'm in touch with bond dealers, so I know where they've quoted this bond, maybe only on one side, over the last few days. I have some information about the company that issued this bond, et cetera. So I have lots of stuff that's related to the
number that I want to know. I just don't know that number, right? And so what I want to try to do is kind of fill in and do what in statistics or in control we would call a nowcasting problem. Huh. And an analogy, actually, is to automatically controlling an airplane. Surprisingly. Oh, yes. If software is trying to fly an airplane, there are six things that it absolutely has
to know. It has to know the x, y, z of where the plane is, and the x, y, z of its velocity, where it's headed. Right, those are the six most important numbers. Now, nature does not just supply those numbers to you. You cannot know those numbers with perfect exactitude.
But there's lots of instruments on the plane, and there's GPS and all sorts of information that is very closely related to the numbers You wish you knew, and you can use statistics to go from all that stuff that's adjacent to a guess and infill of the thing you wish you knew, And the same goes with the current price of a corporate bond.
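A minimal sketch of that kind of nowcast as a one-dimensional Kalman-style update: start from a prior guess at fair value, then fold in each noisy, possibly stale observation (a last trade, a one-sided dealer quote), trusting it in proportion to how precise it is. All numbers below are invented.

```python
def nowcast(prior, prior_var, observations):
    """Blend a prior estimate of fair value with noisy related observations.
    Each observation is (value, variance); lower variance means more trust."""
    est, var = prior, prior_var
    for obs, obs_var in observations:
        gain = var / (var + obs_var)      # weight on the new observation
        est = est + gain * (obs - est)    # pull the estimate toward it
        var = (1 - gain) * var            # and shrink our uncertainty
    return est, var

# Prior from a model: 98.2, fairly uncertain. Then a stale trade and a dealer quote.
price, uncertainty = nowcast(98.2, 0.50, [(98.6, 0.25), (98.4, 0.10)])
print(round(price, 3), round(uncertainty, 4))   # estimate tightens toward the quotes
```

The same blending logic applies whether the unobserved quantity is a plane's position or a bond's fair value; only the inputs change.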
Huh. That's really kind of interesting. So I'm curious as to how often you start working your way into one particular asset or a particular strategy for that asset and just suddenly realize, oh, this is wildly different than we previously expected, and suddenly you're down a rabbit hole into just wildly unexpected areas. It sounds like that isn't at all uncommon.
It is not uncommon at all.
Huh.
No, it's a nice... you know, there's this kind of wishful thinking that, well, we figured it out in one asset class, in the sense that we have a system that's kind of stable and performing reasonably well, that we have a feel for, and now we want to take that system and somehow replicate it in a different situation. And we're going to standardize the new situation to make it look like
the old situation. That's the principle. That principle kind of quickly goes out the window when you start to make contact with the reality of how the new asset class actually behaves.
So stocks are different than credit, are different than bonds, are different than commodities; they're all like starting fresh. Yeah. What are some of the more surprising things you've learned as you've applied machine learning to totally different asset classes?
Well, I think, you know, corporate bonds provide a lot of examples of this. I mean, the fact that you don't actually really know a good live price or a good live bid-offer, it seems, you know, surprising. I mean, this fact has started to change
over the years. There's been an accelerating electronification of corporate bond trading, and that's been a big advantage for us, actually, because we were kind of first movers, and so we've really benefited from that. So the problem is diminished relative to how it was, you know, six, seven years ago when we started, but relative to equities, it's absolutely still there.
Yeah. So, in other words, if I'm looking at a bond mutual fund or even a bond ETF that's trading during the day, that price is somebody's best approximation of the value of all the bonds inside. But really you don't know the NAV, do you?
You're just kind of guessing. Barry, don't even get me started on bond ETFs.
Really? Because it seems like that would be the first place that would show up. Hey, bond ETFs sound like throughout the day they're gonna be mispriced a little bit, or wildly mispriced.
Well, the bond ETF, there's a sense, if you're a market purist, in which they can't be mispriced, because their price is set by supply and demand in the ETF market, and that's a super liquid market. And so there may be a difference between the market price of the ETF and the NAV of the underlying portfolio, except in many cases with bond ETFs
there's not even a crisply defined underlying portfolio. It turns out that the authorized participants in those ETF markets can negotiate with the fund manager about exactly what the constituents are of the create/redeem baskets, and so it's not even at all clear what you mean when you say that the NAV is this or that relative to the price of the ETF.
So when I asked about what's surprising when you work your way down a rabbit hole: hey, we don't know what the hell's in this bond ETF, trust us, it's all good. That's a pretty big surprise. And I'm only exaggerating a little bit, but that seems like that's kind of shocking.
It is surprising when you find out about it, but you quickly come to understand. If you trade single-name bonds, as we do, you quickly come to understand why bond ETFs work that way.
I recall a couple of years ago there was a big Wall Street Journal article on the GLD ETF, and from that article I learned that GLD was formed because gold dealers had excess gold piling up in their warehouses and they needed a way to move it. So that was kind of shocking about that ETF. Any other space that led to a sort of big surprise as you worked your way into it?
Well, I think ETFs are kind of a good source of these examples. So the volatility ETFs, the, you know, the ETFs that are based on the VIX or that are short the VIX. You may remember several years
ago... I was gonna say, the ones that haven't blown up.
Yeah, right. There was this event called Volmageddon, where...
Those were exchange-traded notes, weren't they? Yeah.
Right. There were these, essentially these investment products that were short the VIX, and the VIX went through a spike that caused them to have to liquidate, which was... I mean, the people who designed the exchange-traded note, they understood that this was a possibility, so they had sort of descriptions in their contract for what it would mean. But yeah, always surprising to watch something suddenly go out of business.
We seem to get a thousand year flood every couple of years. Maybe we shouldn't be calling these things thousand year flood. That's right, that's a that's a big misnomer.
As statisticians, we tell people, you know, if you think that you've experienced a six sigma event, the problem is that you have underestimated sigma.
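The arithmetic behind that quip is easy to check: under a normal distribution a genuine six-sigma event is essentially impossible, so seeing one usually means the sigma estimate was too small. A quick sketch:

```python
from math import erfc, sqrt

def two_sided_tail(k):
    """P(|Z| > k) for a standard normal: the textbook odds of a 'k-sigma' event."""
    return erfc(k / sqrt(2))

print(two_sided_tail(6))       # ~2e-9: shouldn't show up in a lifetime of daily data
print(two_sided_tail(6 / 2))   # ~0.0027: the same move, if true sigma is twice your estimate
```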
That's really interesting. So, given the gap in the world between computer science and investment management, how long is it going to be before that narrows and we start seeing a whole lot more of the sort of work you're doing applied across the board to the world of investing?
Well, I think it's happening. It's been happening for quite a long time. I mean, for example, all of modern portfolio theory. Really, it kind of began in the fifties with, you know, first of all, Markowitz and other people thinking about, you know, what it means to benefit from diversification, and the idea that diversification is
the only free lunch in finance. So I would say that, you know, the idea of thinking in a systematic and scientific way about how to manage and grow wealth, not even just for institutions but also for individuals, is an example of a way that these ideas have had profound effects.
I know, I only have you for a little while longer, So let's jump to our favorite questions that we ask all of our guests, starting with tell us what you're streaming these days? What are you either listening to or watching to keep yourself entertained.
A few things I've been watching recently. The Bear, I don't know if you've heard... So great. So great, right? And I'm from Chicago, as you know. We were just there, yeah. So...
And there are parts of that show that are kind of a love letter to the city. Absolutely, as you get deeper into the series, because it starts out kind of gritty and you're seeing the underside, and then as it progresses, it really becomes like a lovely postcard. Such an amazing show.
So really, really love that show. I was late to Better Call Saul, which I'm finishing up. I think it's as good as Breaking Bad. And maybe one you haven't heard of: there's a show called Mr Inbetween.
Which is on, what, Hulu? Yeah.
It's not Hulu; it's from Australia. It's about a guy who's, you know, a doting father living his life. He's also essentially a muscle man and hit man for local criminals in his part of Australia. But it's a half-hour dark comedy.
Right, so not quite Barry and not quite The Sopranos, somewhere in between.
Yeah, that's exactly it.
Yeah, sounds really interesting. Tell us about your early mentors who helped shape your career.
Well, Barry, I've been lucky to have a lot of people who were, you know, both really smart and talented and willing to, you know, take the time to help me learn and understand things. So actually, my co-founder, Michael Kharitonov, he was kind of my first mentor in finance. He had been at D. E. Shaw for several years when I got there, and he really taught me kind of the ins and outs of market microstructure.
I worked with a couple of people who managed me at D. E. Shaw, Yossi Friedman and Kapil Mathur, who have gone on to hugely successful careers in quantitative finance, and they taught me a lot too. When I did my PhD, my advisor Mike Jordan, who's a kind of world famous machine learning researcher, you know, I learned enormously from him. And there's another professor of statistics who sadly
passed away about fifteen years ago named David Freedman. He was really just an intellectual giant of the twentieth century in probability and statistics. He was, you know, both one of the most brilliant probabilists and also an applied statistician, and this is like a pink diamond kind of combination. It's that rare to find someone who has that kind of technical capability but also understands the pragmatics of actually doing data analysis. He spent a lot of
time as an expert witness. He was the lead statistical consultant for the case on census adjustment that went to the Supreme Court. In fact, how did that go in the end? Uh, you know, the people against adjustment, they won in a unanimous Supreme Court decision. And David Freedman told me, he said, you know, all that work and we only convinced nine people.
But nine people that kind of matter, though. Yeah, exactly.
So it was just, it was kind of a once-in-a-lifetime privilege to get to spend time with someone of that intellectual caliber. And there were others too. I mean, I've been very fortunate that way.
That's quite a list to begin with. Let's talk about books. What are some of your favorites and what are you reading right now?
Uh, well, I'm a big book reader, so I have a long list.
But probably... By the way, this is everybody's favorite section of the podcast. People are always looking for good book recommendations, and if they liked what you said earlier, they're gonna love your book recommendations, so fire away.
So I'm a big fan of kind of modernist dystopian fiction. So a couple of examples of that would be the book Infinite Jest by David Foster Wallace, and The Wind-Up Bird Chronicle by Haruki Murakami. Those are two of my all-time favorite books. There's a, I think, much less well known but beautiful novel, a kind of academic coming-of-age novel, called Stoner by John Williams. A really moving, just a tremendous book. Sort of more dystopia would be White Noise by DeLillo, and kind of the classics that
everybody knows, Nineteen Eighty-Four and Brave New World. Those are two more of my favorites.
Huh, it's funny, when you mentioned The Bear... I'm in the middle of reading a book that I would swear the writers of The Bear leaned on, called Unreasonable Hospitality, by somebody who worked for Danny Meyer's hospitality group, Eleven Madison Park and Gramercy Tavern and all these famous New York haunts. And the scene in The Bear where they overhear a couple say, oh, we visited Chicago and we never had deep dish, so they send the guy
out to get deep dish: there's a part of the book where at Eleven Madison Park these people actually showed up with suitcases. It was the last thing they would eat before heading to the airport. And they said, oh, we ate at all these great places in New York, but we never had a New York hot dog. And what do they do? They send someone out to get a hot dog, they plate it and use all the condiments to make it very special, and it looks like it was ripped right out of The Bear,
or vice versa. But if you're interested in just, hey, how can we disrupt the restaurant business and make it not just about the celebrity chef in the kitchen but the whole experience, it's a fascinating kind of nonfiction book.
That does sound really interesting.
Yeah, really. You mentioned The Bear and it just popped into my head. Any other books you want to mention? That's a good list to start with.
Yeah. My other kind of big interest is science fiction, speculative fiction. Unsurprisingly, right? Sorry, sorry. But there are some classics that I think everybody should read. Ursula Le Guin I love, just amazing. So The Dispossessed and The Left Hand of Darkness, those are just two of the best books I've ever read, period.
The Left Hand of Darkness stays with you for a long time.
Yeah, right, yeah, really, really amazing books. I'm rereading right now Cryptonomicon by Neal Stephenson. And one other thing I try to do is, I have very big gaps in my reading. For example, I've never read Updike, so I started reading the Rabbit
series. They're very much of an era.
Yeah, that's right.
What else? Give us more.
Wow, okay, let's see. George Saunders.
Oh wow, I think, I think you'd love him. His real strength is short fiction. He's written great novels too, but Tenth of December, this is his best collection of fiction. And this is more kind of modern dystopian, kind of comic dystopian stuff.
You keep coming back to dystopia. Fascinating.
I find, you know, it's very different from my day-to-day reality. So I think it's, you know, a great change of pace for me to be able to read this stuff. So, uh,
some science writing, I can tell you. Probably the best science book I ever read is The Selfish Gene by Richard Dawkins. You know, you have a kind of intuitive understanding of genetics and natural selection and Darwin, but the language that Dawkins uses really makes you appreciate just how much the genes are in charge and how little we are; he calls organisms survival machines that the genes have kind of built and exist inside
in order to ensure their propagation. And his whole point of view in that book, it's really eye-opening, makes you think about natural selection and evolution and genetics in a completely different way, even though it's all based on the same kind of facts that
You know, it's just framing.
It's the framing and the perspective that are really that really kind of blow your mind. So it's a great it's a great book to read.
Huh, that's a hell of a list. You've given people a lot of things to start with, and now down to our last two questions. What advice would you give to a recent college grad who is interested in a career in either investment management or machine learning?
Yeah. So, I mean, I work in a very specialized subdomain of finance, so there are a lot of people who are going to be interested in investing and finance that I couldn't give any specific advice to. I have kind of general advice that I think is useful both for finance and even more broadly. This advice is really kind of top-of-Maslow's-pyramid advice. If, you know, you're trying to write your novel and pay the rent while you get it done,
I can't really help you with that. But, you know, if what you care about is building this career, then I would say the number one piece of advice is work with incredible people. Far and away, much more important than what the particular field is or the details of what you're working on is the caliber of the people that you do it with, both in terms of your own satisfaction and how much you learn and
all of that. I think you'll benefit hugely on a personal level from working with incredible people, and if you don't work with people like that, then you're probably going to have a lot of professional unhappiness. So it's kind of either/or.
That's a really intriguing answer. So final question, what do you know about the world of investing, machine learning, large language models, just the application of technology to the field of investing that you wish you knew twenty five years or so ago when you were really first ramping up.
I think one of the most important lessons that I learned, had to learn the hard way, kind of going through and running these systems, was that... it kind of comes back to the point you made earlier about the primacy of prediction rules. It may be true that the most important thing is prediction quality, but there are lots of other very necessary, mandatory ingredients, and I would put kind of risk management at the top of that list.
So I think it's easy to maybe neglect risk management to a certain extent and focus all of your attention on predictive accuracy. But I think it really does turn out that if you don't have high quality risk management to go along with that predictive accuracy, you won't succeed. And I guess I wish I had appreciated that in a really deep way twenty-five years ago.
Jon, this has been really, absolutely fascinating. I don't even know where to begin, other than saying thank you for being so generous with your time and your expertise. We have been speaking with Jon McAuliffe. He is the co-founder and chief investment officer at the five-billion-dollar hedge fund the Voleon Group. If you enjoyed this conversation, well, be sure and check out any of the previous five
hundred we've done over the past nine years. You can find those at iTunes, Spotify, YouTube, or wherever you find your favorite podcasts. Sign up for my daily reading list at ritholtz.com. Follow me on Twitter at Barry underscore Ritholtz, until I get my hacked account at Ritholtz back. I say that, I say that because the process of dealing with the seventeen people left at what was once
Twitter, now X, is unbelievably frustrating and annoying. Follow all of the fine family of podcasts on Twitter at podcast. I would be remiss if I did not thank the crack team that helps put these conversations together each week. Paris Wald is my producer. Atika Valbrun is my project manager. Sean Russo is my director of research. I'm Barry Ritholtz. You've been listening to Masters in Business on Bloomberg Radio.