Q3: Jonathan Larkin – Seeking Alpha? Try MORE Alpha Factors

Dec 05, 2016 · 51 min
Summary

Delving into the next step of quantitative trading, this episode discusses combining alpha factors to build more robust predictive models. Guests explain how "weak classifiers," even those barely better than a coin flip, can be intelligently aggregated using ensemble learning and machine learning algorithms like AdaBoost to reduce noise and amplify signals. The conversation also covers the importance of low correlation between factors, methods for monitoring factor lifespan, and the practical considerations for independent traders applying these advanced techniques.

Episode description

In practice, no one trading model will ever be that good on its own. Luckily statistics has come up with a lot of theory about how you can combine weaker models to create better overall predictions. We’ll discuss how to combine many different trading signals into overall models and some of the practical considerations in doing so.

--

Sponsored by DataCamp.com – For new and intermediate programmers, DataCamp have an ever-growing library of interactive courses that’ll help you to become a data science wizard.

Transcript

Introduction to the Series and Guests

Chat with Traders collaboration with Quantopian, episode three. What's good, traders? Welcome back for part three of six in this series, which is a collaboration with Quantopian. If we've not met in the past, my name is Aaron Fifield. I'm your host, and thank you so much for being here.

Following on from the previous episode, when we spoke about alpha factors, this episode is a continuation as we discuss what comes next in the workflow of many quantitative trading firms, and that is combining alpha factors. So we discuss why this is something which you would want to do, some of the various techniques quants use for aggregating alpha factors, and other topics that surround this subject too. For this episode, Delaney and myself are joined by Jonathan Larkin, who is the Chief Investment Officer at Quantopian.

Jonathan's been trading in the professional space since the late nineties, many of those years spent at JP Morgan. Now, just to remind you of a few things. If there are any questions you have, please go to quantopian.com forward slash questions. That's where you can submit questions and ask anything you're unsure about relative to the topics and the theme of the series. Episode six is the Q&A episode, and that's where we'll answer as many of these questions as possible.

Any links and additional resources which get mentioned during this episode or any other episode in the series have been curated for you at quantopian.com forward slash chat with traders. So that's well worth a look. And you can find the same on the Chat with Traders website also. Now, I've mentioned this previously: this series is sponsored by datacamp.com.

So for those of you who want to learn how to code, specifically in Python or R, I definitely encourage you to visit datacamp.com. They have a whole array of online courses that specifically teach data science: data manipulation, data visualization, probability and statistics, and machine learning, all of which is relevant to quant finance. Their courses vary in difficulty from beginner to intermediate.

So to start any one of their online courses now for free, simply visit datacamp.com and create an account. And if you like, and this would be really cool actually, tweet me at ChatWithTraders and let me know which DataCamp course you plan on starting. I think that'd be really cool to hear about. Again, that's datacamp.com.

Now one last thing, just quickly, there is a small amount of background noise in some parts of this episode. So I do apologize for this, but please just push on. It's really nothing too major.

Jonathan Larkin's Quant Trading Journey

Well, that's all from me. Let's bring in Delaney and Jonathan Larkin. So Delaney, how you going? Good to be speaking again.

Yeah, I'm doing pretty well. I say this every time, but I'm constantly in a state of being massively busy, with a lot of really interesting stuff coming along. So, you know, I can't complain.

Absolutely. Nice. Well, it's a good state to be in. And for this episode, episode three in this six-part mini-series, we've got Jonathan Larkin joining us. Welcome, John. How are you?

Hi, I'm good, thanks. Good to be here.

Excellent. John, I think it would be really cool if we could hear a bit about your background just before we get stuck into the subjects we're going to be discussing today. Tell us a little bit about how you got your start in the world of quant trading and investing. I think that'd be great to hear.

Sure, absolutely. About twenty years ago, out of undergrad, I joined the investment bank JP Morgan in their training program and spent about ten years there in equity derivatives. Equity derivatives is a quantitative discipline and a great learning ground for learning the markets. It can be as technical and as quantitative as you want it to be, but at the end of the day you're also trading underlying stocks, so you get a bit of the fundamental angle as well.

Okay. And I know you've worked at a number of firms since then. Can you tell us about some of the various roles you've had over the years? I know you were at Millennium Partners for quite a few of those.

Sure, absolutely. The first half of my career has been on the sell side and the second half has been on the buy side. So I've been in the hedge fund business now for a little over ten years as well. My experience has been working in multi-strat hedge funds, and within the multi-strat space, the multi-manager hedge fund segment as well. So firms that hire very many individual portfolio managers and aggregate the risk of individual strategies into one commingled vehicle.

Excellent. Well, that's really cool to hear. And of course you're at Quantopian now. What is it you do there exactly?

Sure, I'm the Chief Investment Officer at Quantopian, so I'm the one responsible for putting together the investment vehicles that we're ultimately going to offer out to clients in the near future. So aggregating the strategies that we license from our community of authors and assembling those together into a single investment vehicle.

Very good, man.

Classifiers: Predicting Stock Movement

This episode we're going to be discussing classifiers, and in some ways I think continuing the discussion about factors as well. So I think probably the best place to start is by me asking: what is a classifier, and how would you explain this?

I can take this one, because I'm going to leave the complicated stuff for Jonathan.

You get the softball question.

I get the softball ones. So a classifier is a very common notion in statistics, and it's actually one of the key components in a lot of modern machine learning. The notion of a classifier is quite simple at its base level: it's really just a function that takes in an input and classifies it, so it produces an output of some category. The simplest example might be a binary classifier, which takes in some input and then classifies that input as an instance of one set or another set. There are only two options.

So in a sense, when you think about making predictions on stocks, you can think of that as being a classification problem, because at every point in time you're trying to decide whether the stock is a member of the set of stocks that's going to go up in the future or a member of the set of stocks that's going to go down in the future. And it turns out that once you start thinking about any predictive model as a classifier that's basically trying to pick longs or shorts, it makes it a little easier in other ways to think about your entire workflow, your entire infrastructure. So I'll just give you a real example of a classifier.

A classifier exists within a fire detector, a smoke alarm, right? At every time point, that smoke alarm looks at the input data that it's getting from its sensor and says, I want to classify this as an example of there being a fire or an example of there being no fire. And it just does that repeatedly. A good classifier is going to say no fire when there's no fire and fire when there's fire, but of course it's imperfect, and that's why sometimes you get it ringing when you're just making breakfast.

Okay. So forgive me if you've already kind of explained this in that answer, but why would you want to use a classifier, and how does that actually look? Where does a classifier fit into your algorithm, into your whole trading strategy?

Basically, any model that forecasts stocks as going one direction or another can be thought of as a classifier, because you are classifying stocks as going one direction or the other in the future, namely up or down. So again, it's just a way of thinking about your model, and it's a way of simplifying it a little bit. And it falls into this notion of accepting that you're not going to be super accurate, because when you have a model that predicts future returns, it's really hard to predict the exact future returns, right? Jonathan, having much more experience in the industry than me, may have a different opinion on this, but in my opinion it's generally very difficult to predict the absolute amount of future returns.

And so sometimes what people will do is simplify the problem by saying: rather than predicting the absolute amount, let's just predict whether it's going to go up or go down. And sometimes you can formulate that as an easier problem because you're making a softer claim.

Why Combine Alpha Factors

Right. There are two trends that have really caused this concept of a factor as a classifier to emerge. The first is that historically, and when I say historically I'm talking maybe five, ten years ago, you could build a quant strategy around a single alpha factor or maybe two or three alpha factors. But the world is quite competitive and this is quite a lucrative endeavor. So over time, the ability to generate a strategy out of a very small number of alpha factors has gone away. It's been competed away. So today, strategists and portfolio managers are faced with the task of combining many, many alpha factors into a single prediction or into a single portfolio.

And along this time, there's been quite a number of scientific advances in machine learning, and as it relates to this, specifically something called ensemble learning, which is the process by which multiple single classifiers can be combined in an intelligent way to produce a result which is vastly superior to any one single classifier.

Alpha Factors as Classifiers

Yeah, and I think that's something we're really going to get into during this episode: actually combining multiple alpha factors. Just before we do though, last episode we did speak about alpha factors in general. What relationship do alpha factors have with classifiers? How do these fit together? How does one lead into the other?

So the output of an alpha factor is some kind of ranking or some kind of relative value across your universe of securities. If you have a thousand securities, your alpha factor will be a number per security that is proportional to your model's expected return per security. So you have one value per security. The value could be a rank, or it could be any sort of dimensionless number that comes out of your calculation, as long as the ordering that those numbers produce is proportional to the expected forward return across the universe. And when you have a classifier, you're talking about a much reduced output set.

So Delaney was talking about a binary classifier. You could classify stocks as ones or negative ones, where the ones will be stocks that are expected to significantly outperform and the negative ones could be stocks that you would expect to significantly underperform.

Yeah, and as a follow-up to that, let's say that you have an alpha factor and you've ranked all your stocks based on their expected return, their relative value. And then you go long the ones on the top, equal weighted, and you go short the ones on the bottom, equal weighted. Again, that's just the simple continuation of the strategy that we talked about last time, sometimes referred to as long-short equity or factor-based long-short equity. If you've done that, really that's no different from taking your alpha signal, your score for each stock, and then just converting it to a one or a negative one, depending on whether it's above or below your cutoff. Maybe that's the mean, maybe that's a ten percent cutoff on either side. But really it's the same as if you had said: take my alpha score and convert it to a classifier by saying if it's above half, convert it to one, if it's below half, convert it to negative one, and then go long all the ones and go short all the negative ones. So that's another way of thinking about the connection between an alpha factor and a classifier, based on how you would actually trade off of it.
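The conversion described above, from a per-stock alpha score to a plus one or minus one call, can be sketched in a few lines of Python. This is only an illustration: the function name `scores_to_signals`, the `quantile` cutoff parameter, and the ten made-up scores are assumptions, not anything specified in the episode.

```python
def scores_to_signals(scores, quantile=0.1):
    """Map per-stock alpha scores to +1 (long), -1 (short) or 0 (no position).

    scores: dict of ticker -> score, where a higher score means a higher
            expected forward return.
    quantile: fraction of the universe placed in each of the long and short legs
              (an assumed cutoff rule; the mean is another option).
    """
    ranked = sorted(scores, key=scores.get)        # lowest score first
    n_leg = max(1, int(len(ranked) * quantile))    # number of names per leg
    shorts = set(ranked[:n_leg])
    longs = set(ranked[-n_leg:])
    return {t: 1 if t in longs else -1 if t in shorts else 0 for t in ranked}


# Hypothetical scores for a ten-stock universe.
raw = [0.3, -1.2, 0.8, 2.1, -0.4, 0.0, 1.5, -2.0, 0.6, -0.7]
scores = {"S%d" % i: s for i, s in enumerate(raw)}

signals = scores_to_signals(scores, quantile=0.2)
print(signals)
```

With a 20 percent cutoff on either side, the two highest-scoring names become longs, the two lowest become shorts, and everything in between is left out of the portfolio.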

The Strength of Combined Weak Models

Okay, sure. So, as Jonathan hinted before, you want to combine alpha factors. And I know this is something you wanted to speak about, Delaney: the subject of how a single alpha factor is a weak classifier. So I think it'd be great to hear your thoughts around this.

Yeah, well, I'll just briefly give my thoughts, and then I think it would also be great to have Jonathan speak on this as well. In general in statistics, there's this notion of combining weak predictive models. What that means is, let's say that you have some model and it's okay. It's not great. It gets the answer right more than half the time, but not ninety-five percent of the time. Well, you can think of that as having a model with some signal and some noise, right? The idea is that it's going to have some true signal, plus or minus some noise, and the noise can cause the answer to be off in either direction. And if you have two models and they're both not super great, but then you put them together, the idea is that the signals can add up, but the noise, because it's kind of random and goes both directions, will hopefully start to cancel itself out.

And so oftentimes, by combining multiple models that all have some predictive power, you can reduce the noise around your signal while actually keeping the same signal or even increasing the signal. Another way to think about this is just having multiple lenses, multiple angles onto some problem. You're combining many different perspectives into one much better overall picture.
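The signal-plus-noise picture above can be checked with a small simulation. The sketch below assumes each model's forecast is the same true signal plus independent Gaussian noise, which is a strong simplification, and all names and parameter values are made up for illustration. Under that assumption, averaging N forecasts leaves the signal unchanged while the noise shrinks roughly with the square root of N.

```python
import random
import statistics


def averaged_noise_std(n_models, n_samples=20000, signal=0.1, noise_std=1.0, seed=0):
    """Standard deviation of the error of an equal-weight average of forecasts.

    Each forecast is modelled as signal + independent Gaussian noise
    (an assumption of this sketch, not a claim about real factors).
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_samples):
        forecasts = [signal + rng.gauss(0.0, noise_std) for _ in range(n_models)]
        errors.append(statistics.fmean(forecasts) - signal)
    return statistics.pstdev(errors)


for n in (1, 4, 16):
    # Expect roughly noise_std, noise_std / 2 and noise_std / 4 respectively.
    print(n, round(averaged_noise_std(n), 3))
```

If the noise terms were correlated across models, the reduction would be smaller, which is one way to see why low correlation between factors matters.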

And it's no different with classifiers. If you have multiple classifiers, all of which aren't super great on their own, then just like Jonathan was saying earlier, there's a whole branch of statistics and machine learning which deals with combining these things such that the end result is this new kind of monster classifier, which is just much better than any of them on their own. And in fact, oftentimes you can get it to be better than all of them kind of combined, by this effect of the noise reducing. So if you have a bunch of alpha factors which are not super great on their own, oftentimes there may be opportunities to combine them in intelligent ways which reduce the noise without sacrificing too much of your signal. Is there anything you want to add on to that, Jonathan?

Well, it's hard to improve on Delaney's outstanding explanation. But one thing I'll say is that the concept of weak in the term weak classifier is very important. A lot of single alpha factors have decayed over time, and many market participants think that weak factors are no longer useful or no longer profitable in trading strategies. But part of the magic of ensemble learning is that as long as you have a weak classifier that's marginally better than a fifty percent success rate, that can be additive to a combined model.

Now, when you say over a fifty percent success rate, how is that success rate measured?

It's quite simple. The model makes a prediction, plus one or minus one, plus one for stocks that are outperforming, minus one for stocks that are underperforming, and it's simply the hit rate of those predictions.
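As a rough sketch of the two ideas just discussed, the Python below computes a hit rate from plus one and minus one predictions, and then shows why a classifier only marginally better than fifty percent can still be additive. The second function assumes the classifiers are independent and combined by simple majority vote, which is an idealization chosen for this example and not the ensemble method used in practice. The prediction and outcome lists are made up.

```python
from math import comb


def hit_rate(predictions, outcomes):
    """Fraction of +1/-1 predictions that match the realized +1/-1 outcomes."""
    hits = sum(1 for p, o in zip(predictions, outcomes) if p == o)
    return hits / len(predictions)


def majority_vote_accuracy(p, n):
    """Probability that a majority of n independent classifiers is right,
    when each one is right with probability p. n is assumed to be odd,
    so there are no ties."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


# Hypothetical predictions and realized relative performance for eight stocks.
preds = [1, -1, 1, 1, -1, -1, 1, -1]
realized = [1, -1, -1, 1, 1, -1, 1, 1]
print(hit_rate(preds, realized))  # 5 of the 8 calls are correct

for n in (1, 3, 11, 101):
    print(n, majority_vote_accuracy(0.55, n))
```

The accuracy of the vote climbs steadily as more 55 percent classifiers are added, but only because they are assumed independent here. Correlated classifiers would help much less.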

But for that prediction to be deemed accurate or inaccurate, doesn't there need to be some sort of time factor involved, like the time span for how far ahead it's predicting? That's something that hasn't really come into the picture yet.

So that would be defined as you model out your single alpha factors. With the single alpha factor, as I was saying before, you have some relative ranking across your universe, and that relative ranking is proportional to forward returns. Of course you have to define some kind of period over which you're looking at forward returns, and you would want to be consistent across all of the alpha factors that you're combining.

Weak Factors and Aggregation Methods

Okay, I think that makes sense. So how do you actually go about aggregating multiple alpha factors together? I think that would be interesting to speak about.

So there's a few different ways that you can start approaching this. But something I actually want to touch on briefly, maybe even before we go into that, is just some of the reasons why individual factors may not be very good on their own. I just had a few notes here of high-level reasons why that might be the case. One of them would be something called factor lifespan. Like Jonathan mentioned, when a factor becomes well known it loses its edge, because everybody knows about it and there's no longer this novel signal that you're trading off of. Everybody else has already gotten ahead of it. It's all arbitraged away.

Other reasons that a factor might not be very good on its own: it's possible that you've discovered an alpha factor which has a signal over random, so it has quite a good accuracy rate, but when you actually do, say, a backtest and look at the transaction costs you'd be paying on trying to trade that factor, you find out that to hold a portfolio based on this factor, you'd have to be trading so many stocks that you'd be paying a ton of transaction costs. And so whereas that factor by itself may not be tradable, it may be producing a signal that, when combined with some other factors, would produce a tradable portfolio. And in similar ways, not being able to trade a single factor because of liquidity, not being able to trade a single factor because of market impact, or just not being able to trade a single factor because it's not super accurate by itself are all cases in which, by starting with a factor that by itself is not very good, you could actually add it to a bunch of other factors and come up with a model that is quite good together. So sometimes something may look like garbage on its own, but there may be a use for it, especially if it's uncorrelated with other known factors. So, to start talking about combining them together:

Well, the simplest approach would just be averaging them together and not doing anything else. And I think there's a not insignificant fraction of people who will just average factors together. Then you can start getting more sophisticated. Let's say that you have a dynamic weighting system in which you weight the factors based on some historical metric. The historical metric could be, say, the trailing three-month accuracy of the factors when treated as classifiers. And then you can even go beyond that when talking about some of the newest and most exciting ways to combine factors, which fall into this category of machine learning and, like Jonathan said, ensemble learning. That would be using what's known as a signal-boosting algorithm: a machine learning method which takes multiple factors and attempts to produce, at every time point, good combinations of factors that will give very strong predictions going forward.
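The first two approaches mentioned above, plain averaging and weighting by a trailing accuracy, might look something like the following sketch. The specific weighting rule, each factor's trailing accuracy in excess of fifty percent and floored at zero, is an assumption chosen for illustration, since the episode does not say how the historical metric is turned into weights. The factor names, tickers, and numbers are all hypothetical, and every factor is assumed to score the same set of tickers.

```python
def combine_equal(scores_by_factor):
    """Equal-weight average of the factor scores for each stock."""
    n = len(scores_by_factor)
    tickers = next(iter(scores_by_factor.values())).keys()
    return {t: sum(f[t] for f in scores_by_factor.values()) / n for t in tickers}


def combine_weighted(scores_by_factor, trailing_accuracy):
    """Weight each factor by its trailing accuracy above 50%, floored at zero.

    Falls back to equal weighting if no factor has an edge.
    """
    edge = {name: max(0.0, acc - 0.5) for name, acc in trailing_accuracy.items()}
    total = sum(edge.values())
    if total == 0:
        return combine_equal(scores_by_factor)
    tickers = next(iter(scores_by_factor.values())).keys()
    return {t: sum(edge[name] * f[t] for name, f in scores_by_factor.items()) / total
            for t in tickers}


# Two hypothetical factors that disagree about two hypothetical stocks.
scores_by_factor = {
    "value":    {"AAA": 1.0, "BBB": -1.0},
    "momentum": {"AAA": -1.0, "BBB": 1.0},
}
trailing_accuracy = {"value": 0.52, "momentum": 0.56}  # e.g. trailing three months

print(combine_equal(scores_by_factor))
print(combine_weighted(scores_by_factor, trailing_accuracy))
```

Equal weighting leaves the two factors cancelling out, while the accuracy-weighted version sides with the factor that has recently been right more often.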

Monitoring Alpha Factor Performance

The concept of combining factors is as old as quantitative finance itself. You certainly don't need machine learning to do this, and for thirty years people have done it without machine learning. As Delaney mentioned, simple averaging is fine. Regression-type techniques, where you're predicting the composite return across many factors, are another technique. It's really been in the last five years or so that machine learning has come to the forefront, and it really does outperform a lot of the simpler or more historical techniques.

Okay, now in your response there, Delaney, you brought up an interesting point right at the beginning about the lifespan of your factors. How do you think about the lifespan of your factors? How do you monitor that sort of thing?

Well, people like to talk a lot about alpha, right? And alpha is often one of the metrics used to describe the quality of a factor. In this case, alpha refers to what you have in the factor that's left over when you explain away all the parts that can be explained by known, commonly used factors and techniques. So what do I mean by that? Let's say that you were to take the return stream of a portfolio that was based on trading some factor A, and then you said, okay, well I know that factors B, C, and D are just super well known in the market. Everybody knows about them. And I want to see how much of the return stream from my new factor A is coming from B, C, and D. B, C, and D could also be factors that you were already trading in your model, and therefore you want to make sure that your new factor isn't just a recombination of the factors that are already trading in your model.

You could run a regression using A as the outcome or dependent variable and B, C, and D as the independent input variables. And if the input variables explain A well, then basically you'll have high coefficients and you won't have much alpha. Alpha, of course, in the regression model represents the constant amount of returns that are just not explained by your other input variables. And so what people will do is construct alpha models in which they try to look at how much alpha is coming out of a factor that's unexplained by other known factors. That's one way you could start measuring over time how well a factor is doing, and then also start looking for the decay of the factor as, maybe over time, other people started to learn about the anomaly and other people started to trade it. It starts being eaten away.

And oftentimes I think what you'll see is, if you put in factors that used to be traded, used to work, you'll see periods of higher alpha and then a gradual decay as people discovered it and started trading on it. So that's, in my understanding, one of the common ways you would do that. Obviously the choice of what input variables you'd put into the regression model is not trivial. But that's one of the ways you analyze the lifespan of a factor, as it started out having a high alpha, high explanatory power, and then maybe started to trail off over time. You could also potentially measure the accuracy, like Jonathan was saying: just how often it gets right that a stock is going to go up versus going to go down. Jonathan, are there other things that I'm missing here?
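One way to picture the alpha decay described above is to estimate the regression intercept in consecutive time windows and watch it shrink. The sketch below is a deliberately simplified illustration: it regresses the new factor against a single known factor rather than several (B, C, and D), uses a hand-rolled least-squares fit, and runs on synthetic returns constructed so that the true alpha falls from 0.002 to zero. The function names and all numbers are assumptions.

```python
def ols(x, y):
    """Simple least-squares fit y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def rolling_alpha(known, new, window):
    """Alpha of `new` versus `known` in consecutive non-overlapping windows."""
    return [ols(known[i:i + window], new[i:i + window])[0]
            for i in range(0, len(new) - window + 1, window)]


# Hypothetical returns of one well-known factor over three periods.
known = [0.01, -0.02, 0.015, -0.005] * 3
# Synthetic new-factor returns: half the known factor plus an alpha that decays.
true_alpha = [0.002] * 4 + [0.001] * 4 + [0.0] * 4
new = [0.5 * k + a for k, a in zip(known, true_alpha)]

alphas = rolling_alpha(known, new, window=4)
print([round(a, 4) for a in alphas])
```

On real data the windows would be much longer and the estimates noisy, so a declining sequence of alphas is a hint of decay and not proof of it.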

Understanding Regression Models

I think that's a great explanation, Delaney.

Okay, now when you talk about regression models, I think it might be helpful if you could just explain what regression models are like. What are you referring to there?

Sure. And we'll have links to this stuff up at quantopian.com slash chat with traders. Again, that's going to be our universal link that we'll try to put all of the content on.

Also, in the Quantopian Lecture Series, that's quantopian.com slash lectures, we show examples of regression models being used for precisely this purpose. We actually have primers on regression and multiple regression models, and then primers on factor models, which use a lot of regressions. And the notion is that a regression model just attempts to explain the results in one variable with some other variables. So for example, if you thought that there was a linear dependence between the number of trades that you made and the amount of money that you made, you could put the number of trades that you are making in as an input variable and the amount of money that you're making as an output variable.

And regression would tell you, on average, assuming there's a linear relationship, what multiple of the trades you're making is the amount of money you're making. So if for every trade you make you get two dollars, a regression would tell you that the money you're making is two times the trades you're making. So it's just a way of estimating dependence. And there's many little tricks and gotchas and assumptions that have to be satisfied, and we discuss that in the lectures.

But again, the notion here is really that we're just trying to say: great, you discovered this really interesting new way of ranking stocks. Now how much of it is just coming from stuff that's already known about? Because if you run a regression model and you discover that your factor is very well explained as point five times this known factor plus point five times this other known factor plus one times this third known factor, then you're really not doing anything new, right? There's no alpha there. The alpha's going to be zero because there's nothing left over on the table after you've explained it out by these other factors. And so we're trying to measure the dependency using a regression model.

Beyond that, it's hard to explain without visuals in front of you. So I recommend that people who are interested check out the lecture series at quantopian.com slash lectures. There are a few different ones on linear regression models.

Absolutely.
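Written out, the regression described in this part of the conversation might look like the following. The symbols are notation assumed here for illustration and are not given in the episode: r_A is the return of the new factor, r_B, r_C, and r_D are the returns of three known factors, the betas are the regression coefficients, alpha is the intercept, and epsilon is the unexplained residual.

```latex
r_A(t) = \alpha + \beta_B \, r_B(t) + \beta_C \, r_C(t) + \beta_D \, r_D(t) + \varepsilon(t)

% The spoken example, "point five times this known factor plus point five
% times this other known factor plus one times this third known factor",
% corresponds to:
\beta_B = 0.5, \qquad \beta_C = 0.5, \qquad \beta_D = 1, \qquad \alpha = 0
```

In that case the intercept is zero, which is the sense in which nothing is left over once the known factors have been accounted for.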

Benefits of Combining All Factors

Now, just continuing on the point of aggregating multiple alpha factors, let's say you have an alpha factor which is really good. It's a good alpha factor. Is that something you would still want to combine with weaker alpha factors?

Just from first principles, and discarding any kind of advanced, sophisticated machine learning: in an ideal world, forgetting about the implementation cost of potentially trading some of the weaker factors, if you have factors that have zero correlation to each other, you would always want to combine those. You will always get an incremental benefit in risk and return. Now, in some cases, taking into account real-world barriers such as frictions and trading costs, a very simple combination without machine learning may not give you any benefit. But when you use a modern machine learning technique, as long as your weak alpha factor is somewhat predictive, even marginally positively predictive, it will benefit you in the output of the combined model.

Machine Learning for Factor Aggregation

Now, the term machine learning keeps coming up throughout this conversation. I know we're going to get into machine learning more in episode five.

But as it's come up a number of times, I think it might be helpful if we could just have a quick overview of how machine learning actually aids in aggregating multiple alpha factors.

So the machine learning techniques that we're talking about here are classification techniques. And what these techniques are very, very good at is finding very complex and hidden structures within data. Traditional statistical techniques are good at finding linear relationships between different variables. But the real benefit of machine learning is that you can uncover these nonlinear relationships. By that I mean it's very, very difficult without these techniques to understand how different factors interrelate. When one is predictive, the second one is predicting in the other direction, and the third one is predicting in the other direction, that might be one result. If they're all predicting in the same direction, that might be a different result. And the different result states can be non-obvious, and machine learning is very, very good at uncovering that hidden structure.
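A toy example may help show what a nonlinear relationship between factors means here. The setup below is invented for illustration and does not come from the episode: two factors each give a plus one or minus one signal, and the stock goes up only when the two signals agree. Each factor on its own then looks like a coin flip, and a simple additive combination misses one of the four cases, while a rule that looks at the interaction gets every case right.

```python
from itertools import product


def hit_rate(rule, cases):
    """Fraction of cases where rule(a, b) matches the realized direction y."""
    return sum(1 for a, b, y in cases if rule(a, b) == y) / len(cases)


# Toy world: the outcome is +1 when the two factor signals agree, -1 otherwise.
cases = [(a, b, a * b) for a, b in product((1, -1), repeat=2)]


def only_a(a, b):
    return a                          # trade on the first factor alone


def only_b(a, b):
    return b                          # trade on the second factor alone


def vote_sum(a, b):
    return 1 if a + b > 0 else -1     # additive combination, ties go short


def interaction(a, b):
    return a * b                      # uses how the two signals interrelate


for rule in (only_a, only_b, vote_sum, interaction):
    print(rule.__name__, hit_rate(rule, cases))
```

This agree-or-disagree pattern is the classic case that no weighted sum of the two signals with a threshold can classify perfectly, which is the kind of structure nonlinear methods are meant to pick up.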

Right. And just to be clear, you're talking about machine learning in regard to actually aggregating multiple alpha factors, right? You're not actually talking about machine learning as a means to discover alpha factors in themselves, are you? That's an excellent distinction and is absolutely spot on.

I'm not saying machine learning is invalid as a way to generate a single alpha factor in itself, but where I've seen the most success in practice is in the combination step.

The Power of Low Correlation

Okay. Yeah. Good answer. A little earlier you mentioned correlation. So what considerations need to be made when thinking about the correlation between multiple factors? So correlation is a measure of how one variable explains another variable. So in our sense, we would want low correlation. We would want

the results of one factor to be unrelated to the predictions of another factor. And the thing that I look for in individual strategies is a high Sharpe ratio. The Sharpe ratio is the ratio of the return of a strategy to the volatility of a strategy. And it happens that when you have strategies

that have low correlation with each other, or factors that have low correlation with each other, and you combine these, the returns combine in a linear way, and the volatilities combine

in a sublinear way. They combine actually in relation to the square root of the number of factors. So that is kind of portfolio effects magic: when you add up factors that have low correlation, you get the magical effect of significantly enhanced risk-return on the combination. Yeah, and just to add to that, you can kind of think of this at a meta level as

each factor is really just an asset you can hold in your portfolio. I've referred back to this lecture a few times, but if you go to the Quantopian lecture series and you go to the position concentration risk lecture, it really tries to drive home this notion of why you shouldn't invest in few assets, and also how investing in multiple uncorrelated assets can do precisely what Jonathan was describing,

bringing down the variance and the risk while not sacrificing on returns.

And so the same thing is true for alpha factors. If you consider each alpha factor as kind of the returns that you would get trading on just that alpha factor, then averaging them together will be much the same as averaging together the return streams where you're just trading on that alpha factor, and can produce a return stream in which you're not sacrificing on the returns, but you are bringing down the variance.
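
The square-root effect described above can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not anything from the Quantopian lectures: the return streams are independent Gaussian draws with the same hypothetical mean and volatility, and the function names are made up for this example. Averaging 25 such streams should leave the mean return unchanged while cutting volatility to roughly 1/sqrt(25) of a single stream's.

```python
import math
import random
import statistics

def simulate_factor_returns(n_factors, n_days, mean=0.0004, vol=0.01, seed=0):
    """Independent (uncorrelated) daily return streams, one per hypothetical factor."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, vol) for _ in range(n_days)] for _ in range(n_factors)]

def equal_weight_combine(streams):
    """Average the streams day by day, as if trading each factor with equal capital."""
    n = len(streams)
    return [sum(day) / n for day in zip(*streams)]

streams = simulate_factor_returns(n_factors=25, n_days=5000)
combined = equal_weight_combine(streams)

# Average single-stream statistics versus the combined stream.
single_mean = statistics.mean(statistics.mean(s) for s in streams)
single_vol = statistics.mean(statistics.stdev(s) for s in streams)
combined_mean = statistics.mean(combined)
combined_vol = statistics.stdev(combined)

print(f"mean return: single {single_mean:.6f}, combined {combined_mean:.6f}")
print(f"volatility:  single {single_vol:.6f}, combined {combined_vol:.6f}")
print(f"vol ratio {combined_vol / single_vol:.3f} vs 1/sqrt(25) = {1 / math.sqrt(25):.3f}")
```

The return is preserved by the averaging while the volatility shrinks, which is the enhanced risk-return the speakers refer to. The result relies on the streams being uncorrelated; correlated streams shrink volatility much less.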

Another point that I wanted to make, which is related to that, is in terms of thinking about this correlation at a more intuitive level, for people who may not have had as much mathematical background.

One example that you can use to think about this is, say you're polling people and you're trying to figure out how an election is going to go. Well, if you poll someone and then you poll their neighbors, maybe their neighbors have a very similar perspective on the world as that person.

They have the same outlook, their views and beliefs are likely to be similar, and whatever they tell you is likely to be kind of correlated. There's not going to be a huge amount of difference. But then as you go and poll someone in the next neighborhood, and poll someone in the next state, and poll someone in the next country, you're getting an increasing variety of different perspectives on the same issue. And

basically you can think of this in the same way with factors: if two factors are very correlated with each other, if they kind of live in the same world, they're not going to explain anything new. So that's another kind of intuitive reason why you want to try to

get factors which are very new and different from each other when you're building up a model which combines factors. Because if you're just combining factors that are similar to each other, you're not getting anything new, and the resulting combined model won't really be any better than the individual correlated factors.
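
The polling intuition has a simple closed form under two simplifying assumptions that are not stated in the episode: every factor's return stream has the same volatility, and every pair of streams has the same correlation rho. In that case the volatility of an equal-weight average of N streams is vol * sqrt(1/N + (1 - 1/N) * rho). The Python sketch below (function name hypothetical) tabulates it.

```python
import math

def combined_vol(n_factors, rho, vol=1.0):
    """Volatility of an equal-weight average of n_factors streams.

    Assumes equal volatility per stream and equal pairwise correlation rho.
    """
    return vol * math.sqrt(1.0 / n_factors + (1.0 - 1.0 / n_factors) * rho)

for rho in (0.0, 0.3, 0.9):
    row = ", ".join(f"N={n}: {combined_vol(n, rho):.2f}" for n in (1, 5, 25, 100))
    print(f"rho={rho:.1f} -> {row}")
```

With rho equal to zero the volatility keeps falling as 1/sqrt(N). With any positive rho it can never fall below vol * sqrt(rho), so adding more highly correlated factors changes very little, which matches the point that similar factors add nothing new.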

Number of Factors and Sharpe Ratio

Yeah, that's a really great analogy there, Delaney. I like how you did that. When we talk about combining multiple alpha factors, how many alpha factors are we talking about? What's a good number? Are we talking about combining five or fifty? So this is just kind of a practical answer based on my experience.

The shorter answer is, the more the better. The longer answer is, in equity strategies, you tend to start to get diminishing returns from combination after you reach about 25 factors. Okay, interesting. Now, Jonathan, you said just before that you like to focus on, I think it was, strategies that have a high Sharpe ratio. Would you mind explaining your reasoning for this?

So the Sharpe ratio, to me, is a measure of the quality of the alpha of the strategy. Ultimately we want to produce a vehicle for investors which they find compelling, which they think is very, very competitive versus competing funds. And the Sharpe ratio is an industry standard measure

of the quality of the returns. So we're not interested just in producing a high return over the course of the year; we're interested in producing consistency of return as well. So we care not just about the absolute measure of return, but we care about the volatility of those returns and the path that those returns take.
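
As a minimal sketch of the measure described above (return divided by volatility), the Python function below computes an annualized Sharpe ratio from daily returns. The 252 trading days per year, the zero default risk-free rate, and the two made-up return series are assumptions for illustration only.

```python
import math
import statistics

def sharpe_ratio(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio: mean excess return over its volatility.

    periods_per_year=252 and risk_free_daily=0.0 are assumed conventions.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    vol = statistics.stdev(excess)
    if vol == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    return statistics.mean(excess) / vol * math.sqrt(periods_per_year)

# Two hypothetical strategies with the same average daily return (0.1%).
steady = [0.001, 0.002, 0.000, 0.001, 0.002, 0.000]
bumpy = [0.030, -0.028, 0.025, -0.024, 0.020, -0.017]

print(f"steady: Sharpe = {sharpe_ratio(steady):.2f}")
print(f"bumpy:  Sharpe = {sharpe_ratio(bumpy):.2f}")
```

Both series earn the same on average, but the steadier path scores far higher, which is the sense in which the ratio rewards consistency and not only the absolute return.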

AdaBoost: Amplifying Weak Signals

Okay. Good answer. Now Delaney, I think you wanted to bring up AdaBoost algorithms. Was there anything you wanted to talk about in regard to that subject? So basically AdaBoost is a machine learning algorithm, and specifically it is one of these ensemble algorithms that tries to take multiple

classifiers which aren't very good on their own, which have maybe a signal that's barely better than flipping a coin, and combine them together to produce something that's much better than the sum of the parts. And there are actually some implementations currently being worked on by Thomas Wiecki, who's Quantopian's chief data scientist. And he has posted some examples of this

in the forums on Quantopian, and we'll link to them. Again, if you go to quantopian dot com slash chat with traders, we'll have a collection of all the appropriate links that we talk about in each episode. But the notion is really, and again I'll try to explain it with

an analogy here, and then I'll maybe let Jonathan jump in with some more specific comments. But the analogy that I like to use with these algorithms is really this notion, and it can be a little hard to wrap your head around at first.

But it's really common, especially in computer science, that if you have a process which gets the answer right, say, fifty-one percent of the time, well, rather than running it once and then just believing it based on that one try, which is going to be pretty inaccurate because it only gets it right fifty-one percent of the time,

there's this notion in computer science of kind of signal amplification, in which you run it thousands or tens of thousands of times and you look for the answer that it picks the most.

And usually, if you're choosing, let's say, between two outcomes, and it's going to choose the right outcome fifty-one percent of the time, then if you run it lots and lots of times, it's very likely that it will produce more correct answers than incorrect answers, and so you can pick the thing that it picks more of the time.
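
The amplification argument can be made quantitative with the binomial distribution. The Python sketch below assumes the runs are fully independent and each is right with probability 0.51; the function name is hypothetical. It computes the exact probability that a majority vote over an odd number of runs picks the right answer, working in log space so that large run counts do not underflow.

```python
import math

def prob_majority_correct(n_runs, p_correct):
    """Probability that more than half of n_runs independent runs are correct."""
    if n_runs % 2 == 0:
        raise ValueError("use an odd number of runs so there are no ties")
    log_p, log_q = math.log(p_correct), math.log(1.0 - p_correct)
    total = 0.0
    for k in range(n_runs // 2 + 1, n_runs + 1):
        # log of C(n, k) * p^k * (1 - p)^(n - k), via log-gamma for stability
        log_term = (
            math.lgamma(n_runs + 1) - math.lgamma(k + 1) - math.lgamma(n_runs - k + 1)
            + k * log_p + (n_runs - k) * log_q
        )
        total += math.exp(log_term)
    return total

for n in (1, 101, 1001, 10001):
    print(f"{n:>6} runs -> P(majority correct) = {prob_majority_correct(n, 0.51):.3f}")
```

The probability climbs steadily toward one as the number of runs grows, but only because the runs are assumed independent. Repeating the same correlated guess many times would not help.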

Now, sometimes that can be hard to do because maybe you can't get 10,000 independent runs. You can't run this thing 10,000 times. What a boosting algorithm like AdaBoost tries to do is say, okay, we have a set of, let's just say, experts, and they're all making predictions, or classifiers, and they're all making predictions. And I don't fully trust the predictions that any one of these guys is making.

But can I combine them in an intelligent way, and kind of average together their predictions into a meta prediction, an overall prediction

that doesn't trust any individual expert, but tries to get at what signal exists across this entire set of experts? Yeah, I think, to add, the way that type of algorithm works is you just start by assuming they're all equally valid, and then that assumption will produce predictions in your training set, and you can compare those predictions to your training set.

And then you start to iterate and you start to modify how you're weighting these individual experts, as Delaney was saying. I thought that was a good term. So weighting these individual experts. And then you iterate through different weighting schemes until

you reach a certain threshold of improvement. Yeah, I think definitely for more information around that, it'd be a good idea to check out the link Delaney shared with you just before.
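
The iteration described above can be sketched in a few dozen lines of plain Python. This is a minimal version of the textbook discrete AdaBoost update applied to a fixed pool of weak "experts". It is not the implementation posted on the Quantopian forums, and the synthetic data, the 55 percent hit rate, and every function name are assumptions made for this example. In this formulation the loop reweights the training examples so that later rounds focus on the cases the ensemble still gets wrong, and each chosen expert earns a vote weight (alpha) based on its weighted error.

```python
import math
import random

def make_weak_experts(n_samples, n_experts, hit_rate, seed=0):
    """Synthetic +1/-1 labels and experts that are each right with prob. hit_rate."""
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n_samples)]
    experts = [
        [y if rng.random() < hit_rate else -y for y in labels]
        for _ in range(n_experts)
    ]
    return labels, experts

def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def adaboost_expert_weights(labels, experts, n_rounds=50):
    """Discrete AdaBoost over a fixed pool of experts; returns one vote weight each."""
    n = len(labels)
    sample_w = [1.0 / n] * n          # start by treating every example equally
    alphas = [0.0] * len(experts)     # vote weight accumulated per expert
    for _ in range(n_rounds):
        # Weighted error of each expert under the current example weights.
        errors = [
            sum(w for w, p, y in zip(sample_w, preds, labels) if p != y)
            for preds in experts
        ]
        best = min(range(len(experts)), key=errors.__getitem__)
        eps = min(max(errors[best], 1e-10), 1 - 1e-10)
        if eps >= 0.5:                # no expert beats a coin flip any more
            break
        alpha = 0.5 * math.log((1 - eps) / eps)
        alphas[best] += alpha
        # Up-weight examples the chosen expert got wrong, down-weight the rest.
        sample_w = [
            w * math.exp(-alpha * y * p)
            for w, p, y in zip(sample_w, experts[best], labels)
        ]
        total = sum(sample_w)
        sample_w = [w / total for w in sample_w]
    return alphas

def ensemble_predict(alphas, experts):
    """Sign of the alpha-weighted vote across experts, per example."""
    scores = [sum(a * p for a, p in zip(alphas, votes)) for votes in zip(*experts)]
    return [1 if s >= 0 else -1 for s in scores]

labels, experts = make_weak_experts(n_samples=2000, n_experts=15, hit_rate=0.55)
alphas = adaboost_expert_weights(labels, experts)
best_single = max(accuracy(e, labels) for e in experts)
ensemble_acc = accuracy(ensemble_predict(alphas, experts), labels)
print(f"best single expert accuracy: {best_single:.3f}")
print(f"weighted ensemble accuracy:  {ensemble_acc:.3f}")
```

The accuracies here are measured on the same data used to fit the weights, so they flatter the ensemble. A real test would hold out data the weights never saw.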

Real-World Application and Challenges

All these things which we have discussed up to this point, are these things which can be done on the Quantopian platform, or is there something more required? So everything we've been discussing can indeed be done on the Quantopian platform. So the educational materials and the links

that Delaney had mentioned, those will show a walkthrough from first principles using the Quantopian platform entirely. Right, right. And as we do this podcast and we're stepping through these various components of the workflow of a professional quant, for someone to actually apply this and implement these things,

is that something independent traders and independent quants can realistically do, with a bit of time, with a bit of study and some effort? Or is this a bit more advanced than that? I don't think there's any barrier to entry with understanding the techniques. However, the quant workflow that we're describing, and certainly the factor-type analysis and the creation of alpha factors, those are

factors that predict returns across very many stocks. So you don't need a particularly high hit ratio. With a fifty-five percent or sixty percent hit ratio, you can get an exceptional portfolio return if you diversify that across very many stocks. And

that is something that creates a barrier to entry for independent traders. So you need to have quite a bit of capital in order to be able to trade a portfolio of a thousand stocks. Absolutely. And is there anything you'd like to add on to that, Delaney? No, I mean, I think that's all exactly right. Obviously

the restriction of how much capital is available is an important thing to consider. And then the other thing is, with these factors, you can certainly pare down the universe and make it so that you're selecting ten stocks to go long on and ten stocks to go short on from that ranking. But again, because you're investing in so many fewer assets, generally you have to have a much higher

confidence in, and a much higher accuracy for, your model. So the same factors that work for predicting future returns across these very broad swaths of assets may not be appropriate for people who are trying to trade portfolios of 10 or 50 assets, because when you're trading that many fewer assets,

you need a higher accuracy on your model, otherwise you face more risk. So you might have to develop different models, or maybe just not even use the same workflow.
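
A back-of-the-envelope model shows why fewer positions demand a more accurate model. The Python sketch below rests on strong assumptions that are not from the episode: every position is an independent, equal-sized bet that gains or loses one unit, with no costs, and the function names are hypothetical. It computes the portfolio's mean-to-volatility ratio for a given hit ratio and number of bets, then inverts it to find the hit ratio a smaller portfolio would need to match a broad one.

```python
import math

def mean_to_vol(hit_rate, n_bets):
    """Mean over volatility for n_bets independent, equal-sized +1/-1 bets."""
    edge = 2.0 * hit_rate - 1.0                 # expected payoff per bet
    per_bet_vol = math.sqrt(1.0 - edge * edge)  # std. dev. of a +1/-1 payoff
    return edge / per_bet_vol * math.sqrt(n_bets)

def required_hit_rate(target_ratio, n_bets):
    """Hit rate needed to reach target_ratio with only n_bets bets."""
    r = target_ratio / math.sqrt(n_bets)
    edge = r / math.sqrt(1.0 + r * r)
    return (1.0 + edge) / 2.0

# Benchmark: a 55% hit ratio spread across 1,000 stocks.
broad = mean_to_vol(hit_rate=0.55, n_bets=1000)
print(f"55% hit ratio over 1000 bets -> mean/vol = {broad:.2f}")

for n in (1000, 50, 20):
    needed = required_hit_rate(broad, n)
    print(f"to match that with {n:>4} bets you would need a hit ratio of {needed:.1%}")
```

Real positions are correlated and payoffs are not symmetric, so the exact numbers should not be taken literally. The direction of the effect is the point: the narrower the portfolio, the more accuracy each pick has to carry.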

So the thing that's nice about this workflow is it's just a really nice and general way of thinking about investing, and thinking about how, in general, a model just classifies things as going up or going down in the future. And you can't really get too much simpler than that. And the other thing to remember is, if you discover a factor, even if you can't necessarily trade it,

if it's a good factor, it has value. And just because you can't trade it yourself doesn't mean that it's not worth anything to you. Mm. Yeah, that's a really good point actually. Is there anything else either of you would like to add before we wrap this episode up? I'll just say, again, that this stuff can be pretty difficult to understand on first pass. So like I said, I think, last time,

we put a lot of effort into the lecture series here at Quantopian. We're not going to have full coverage of everything we talked about quite yet. We're going to have coverage of most of what we talked about. But again, there are two links that are important, and I recommend them if at any point something piqued your interest or you didn't quite follow.

Just go to Quantopian.com/lectures for all the lectures. And then we'll also try to link to specific resources, and also that implementation of AdaBoost that Thomas made. If you go to quantopian dot com slash chat with traders, we'll have specific resources there for you. Jonathan, did you have anything else? I just wanted to add that this is not a theoretical discussion at all. These techniques are being used very, very successfully

at some very, very large and famous hedge funds. So we're excited to bring this educational material to our community, and we're excited to see what people will do with it.

Next Episode and Conclusion

For sure. Good points. Good points. Now, we've got episode four coming up next. Delaney, do you want to give us a quick summary of what listeners can expect to hear from that one?

Absolutely. So again, the theme of these episodes is, just when you think you've figured it out, we're going to go a little deeper and show you how we need to actually do more. So with episode four, let's say that you've got a great combined alpha model, and it uses ten factors, and it does some really great weighting between them, and everything's looking awesome.

Well, that by itself, if you just went to trade and you went long and went short based on the predictions of that model, you could still run into situations in which your portfolio could get into very bad states in regard to the risk that you're taking on.

Factors might have biases towards certain sectors; they might have biases towards taking on more longs than shorts. Lots of different things can go wrong. So what we're going to talk about next episode is this notion of portfolio optimization, and how to actually go from these desired factor positions,

the positions the model wants you to take, to actually something that you would be comfortable trading in real life. And for that discussion, we're going to bring in Scott Sanderson, who's a really, really smart engineer here, works at Quantopian, and is currently designing an API that's going to do precisely that. So he's a really good person to talk to about this. His background is actually math and philosophy. So it's always a great discussion with Scott.
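
The kinds of bias mentioned above, leaning toward one sector or holding more longs than shorts, can be spotted with a simple exposure check before any optimization is attempted. The Python sketch below is only an illustration: the tickers, weights, and sector labels are made up, and it is not the Quantopian API that the speakers refer to.

```python
from collections import defaultdict

# Hypothetical desired positions: ticker -> (portfolio weight, sector).
# Positive weights are longs, negative weights are shorts.
positions = {
    "AAA": (0.20, "tech"),
    "BBB": (0.15, "tech"),
    "CCC": (0.10, "energy"),
    "DDD": (-0.10, "energy"),
    "EEE": (-0.05, "health"),
}

def exposure_report(positions):
    """Return net exposure, gross exposure, and net exposure per sector."""
    net = sum(weight for weight, _ in positions.values())
    gross = sum(abs(weight) for weight, _ in positions.values())
    by_sector = defaultdict(float)
    for weight, sector in positions.values():
        by_sector[sector] += weight
    return net, gross, dict(by_sector)

net, gross, by_sector = exposure_report(positions)
print(f"net exposure:   {net:+.2f}  (zero would mean longs and shorts balance)")
print(f"gross exposure: {gross:.2f}")
for sector, weight in sorted(by_sector.items()):
    print(f"  {sector:<7} net {weight:+.2f}")
```

In this made-up book the model is net long and concentrated in one sector, which is the sort of unintended risk a portfolio optimization step would be asked to constrain.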

Cool. Yeah, no, I very much look forward to that. Well, Jonathan, I just want to say a special thank you for coming onto this podcast. I appreciate you coming on and sharing your insight from, was it, twenty-plus years in the industry. So that was great to have you, man. Thank you very much.

Yeah, absolutely. I enjoyed it. Thank you. Thanks for your interest, and for your ongoing support of this series. No trouble. No trouble at all. All right, Delaney, let's leave it at that for now and we'll call this a wrap.

This transcript was generated by Metacast using AI and may contain inaccuracies.