Modern portfolio theory (MPT); efficient frontiers

1. Modern portfolio theory (MPT); efficient frontiers

In this chapter, we'll learn how to use machine learning for portfolio selection with modern portfolio theory, or MPT.

2. Efficient frontier

A portfolio is a group of assets such as stocks and bonds. Modern portfolio theory attempts to find ideal investment portfolios by quantifying returns and risks. Returns are the percent change in the price of assets. Risk is measured via volatility, using the standard deviation of returns. With MPT, we end up with a plot of volatility versus returns. The upper left boundary of that plot is the "efficient frontier", which shows the maximal return for a given level of risk.

3. Joining data

We'll use 3 securities here: the stocks AMD and CHK, and QQQ, which is an exchange-traded fund. pandas' concat() function joins the DataFrames; we supply a list of DataFrames as the first argument. We set axis=1 to stack the DataFrames horizontally. Rows with missing values are dropped using dropna() so each date has all the prices.
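As a minimal sketch of the join described here, the block below builds three tiny stand-in DataFrames and concatenates them. The prices, the dates, and the names amd_df, chk_df, and qqq_df are made up for illustration; only full_df, concat(), axis=1, and dropna() come from the narration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the three price series (values are invented)
dates = pd.bdate_range("2018-01-01", periods=5)
amd_df = pd.DataFrame({"AMD": [11.0, 11.5, 12.0, 11.8, 12.2]}, index=dates)
chk_df = pd.DataFrame({"CHK": [4.0, 4.1, np.nan, 4.2, 4.3]}, index=dates)
qqq_df = pd.DataFrame({"QQQ": [158.0, 159.5, 160.0, 161.2, 162.0]}, index=dates)

# axis=1 places the DataFrames side by side, aligned on the date index;
# dropna() then removes any date where at least one price is missing
full_df = pd.concat([amd_df, chk_df, qqq_df], axis=1).dropna()
print(full_df)
```

The third date has no CHK price in this toy data, so it is dropped and four rows remain.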

4. Calculating returns

We first must decide on a timeframe for rebalancing our portfolios, which is when we'll sell and buy stocks to match our desired portfolio. Here we'll use a monthly timeframe so we have enough data, but a yearly timeframe can be good for US tax purposes. We'll calculate daily returns using pct_change() on full_df and store this in returns_daily; this will be used to estimate volatility. Then we get a DataFrame called monthly_df which contains data from the first business day of each month. We do this by resampling full_df with BMS, which stands for "business month start", and using first() to take the first row in each month. Finally, we get the monthly returns by using pct_change() on monthly_df.
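The return calculations might look roughly like the sketch below. The price data is a synthetic random walk generated only so the example runs, and the trailing dropna() on the monthly returns is an assumption (it removes the first month, which has no prior month to compare against); the names full_df, returns_daily, and monthly_df follow the narration, while returns_monthly is a name chosen here.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices standing in for the real full_df
rng = np.random.default_rng(0)
dates = pd.bdate_range("2018-01-01", "2018-06-29")
steps = rng.normal(0, 0.01, size=(len(dates), 3))
full_df = pd.DataFrame(100 * np.exp(np.cumsum(steps, axis=0)),
                       index=dates, columns=["AMD", "CHK", "QQQ"])

# Daily percent changes, used later to estimate volatility
returns_daily = full_df.pct_change()

# "BMS" = business month start; first() keeps the first row of each month
monthly_df = full_df.resample("BMS").first()

# Monthly percent changes; dropna() (an assumption) drops the first month
returns_monthly = monthly_df.pct_change().dropna()
print(returns_monthly)
```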

5. Covariances

Next, we'll work towards calculating risk, measured by volatility. For portfolios, this is complicated and involves matrix math. We won't get into details, but we must calculate the covariances of our stocks in order to calculate portfolio volatility shortly. For this, we loop through each month and select that month's daily returns. We do this by masking the returns_daily DataFrame to get the values for the month and year that are currently in the loop. Then we use pandas' cov() method to calculate covariances.
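A sketch of the covariance loop, under the assumption that the results are stored in a dictionary keyed by month (the narration later refers to "the dictionary keys from covariances"). The price data is again synthetic, and rtd_idx and returns_monthly are names chosen for this example.

```python
import numpy as np
import pandas as pd

# Synthetic prices so the example is self-contained
rng = np.random.default_rng(1)
dates = pd.bdate_range("2018-01-01", "2018-04-30")
steps = rng.normal(0, 0.01, size=(len(dates), 3))
full_df = pd.DataFrame(100 * np.exp(np.cumsum(steps, axis=0)),
                       index=dates, columns=["AMD", "CHK", "QQQ"])

returns_daily = full_df.pct_change()
returns_monthly = full_df.resample("BMS").first().pct_change().dropna()

covariances = {}
rtd_idx = returns_daily.index
for i in returns_monthly.index:
    # Boolean mask selecting daily returns in the same month and year as i
    mask = (rtd_idx.month == i.month) & (rtd_idx.year == i.year)
    # 3x3 covariance matrix of that month's daily returns
    covariances[i] = returns_daily[mask].cov()

print(covariances[returns_monthly.index[-1]])
```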

6. Generating portfolio weights

Next, we generate portfolios by first generating many sets of portfolio weights for the stocks. We go through each date in our dataset using the dictionary keys from covariances. We save the current date's covariance for use in volatility calculations. Then we go through 5000 iterations and use NumPy's np.random.random() to generate 3 numbers for asset weights. We normalize these weights so they sum to one by dividing by their sum.
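The weight generation on its own can be sketched as follows. The seed and the list name all_weights are additions for this example; the date loop is left out to keep the focus on the normalization step.

```python
import numpy as np

np.random.seed(0)  # seed added only to make the example repeatable

n_assets = 3
all_weights = []
for _ in range(5000):
    # Three uniform random numbers in [0, 1), one per asset
    weights = np.random.random(n_assets)
    # Divide by the sum so the weights add up to 1
    weights /= np.sum(weights)
    all_weights.append(weights)

all_weights = np.array(all_weights)
print(all_weights[:3])
print(all_weights.sum(axis=1)[:3])
```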

7. Calculating returns and volatility

We're going to take the code from the previous slide and add to it. We use the weights to calculate returns by taking the dot product with np.dot(). The volatility calculation is more complex, and involves the covariances and weights, with some dot products and a square root. Don't worry about the math for now, and just use the given code to calculate volatility. Finally, we're using Python's setdefault() method for dictionaries to create new dictionary entries with the date as the key and an empty list as the value. For each randomly generated portfolio, we append the returns, volatility, and weights to the respective dictionaries. We now have a sample of portfolios that we can use to find the "efficient frontier".
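The sketch below puts these steps together for a single date. It assumes the volatility is the standard square root of weights-transposed times covariance times weights, which matches the description of "some dot products and a square root" but is not confirmed by the narration. The monthly returns and the diagonal covariance matrix are invented inputs, and the dictionary names are chosen for this example.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
cols = ["AMD", "CHK", "QQQ"]
date = pd.Timestamp("2018-02-01")

# Invented inputs: one month of returns and a simple diagonal covariance
returns_monthly = pd.DataFrame([[0.02, -0.01, 0.015]], index=[date], columns=cols)
covariances = {date: pd.DataFrame(np.diag([4e-4, 9e-4, 1e-4]),
                                  index=cols, columns=cols)}

portfolio_returns, portfolio_volatility, portfolio_weights = {}, {}, {}
for date in sorted(covariances.keys()):
    cov = covariances[date].to_numpy()
    for _ in range(5000):
        weights = np.random.random(3)
        weights /= np.sum(weights)
        # Portfolio return: weighted sum of the asset returns
        returns = np.dot(weights, returns_monthly.loc[date].to_numpy())
        # Portfolio volatility (assumed formula): sqrt(w^T . cov . w)
        volatility = np.sqrt(np.dot(weights.T, np.dot(cov, weights)))
        # setdefault() creates the empty list the first time a date is seen
        portfolio_returns.setdefault(date, []).append(returns)
        portfolio_volatility.setdefault(date, []).append(volatility)
        portfolio_weights.setdefault(date, []).append(weights)

print(len(portfolio_returns[date]), portfolio_returns[date][0])
```

Using setdefault() avoids a separate "if date not in dict" check before each append.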

8. Plotting the efficient frontier

We now plot the efficient frontier, which shows the maximal return for a given level of risk. A separate 2D plot can be made for each date, so we'll use the latest date from our covariances dictionary. We then make a scatter plot of volatility versus returns with 50% transparency by using a value of 0.5 for alpha.
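A plotting sketch, assuming matplotlib is the plotting library (the narration does not name it). The sample portfolios are regenerated here from invented returns and covariances so the block runs on its own, and the latest date is taken from the results dictionary, which in this example shares its keys with the covariances dictionary.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(0)
# Invented asset returns and covariance matrix, for illustration only
asset_returns = np.array([0.02, -0.01, 0.015])
cov = np.diag([4e-4, 9e-4, 1e-4])

portfolio_returns, portfolio_volatility = {}, {}
for date in [pd.Timestamp("2018-05-01"), pd.Timestamp("2018-06-01")]:
    w = np.random.random((5000, 3))
    w /= w.sum(axis=1, keepdims=True)          # each row sums to 1
    portfolio_returns[date] = w @ asset_returns
    # Row-wise sqrt(w^T . cov . w) for all 5000 portfolios at once
    portfolio_volatility[date] = np.sqrt(np.einsum("ij,jk,ik->i", w, cov, w))

# Use the most recent date
latest = sorted(portfolio_returns.keys())[-1]

fig, ax = plt.subplots()
ax.scatter(x=portfolio_volatility[latest], y=portfolio_returns[latest], alpha=0.5)
ax.set_xlabel("Volatility")
ax.set_ylabel("Returns")
```

Calling plt.show() afterwards displays the figure; the efficient frontier is the upper left edge of the cloud of points.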

9. Calculate MPT portfolios!

Now it's your turn to calculate MPT portfolios!