Machine Learning for Trading - Japanese Market -

I tested whether the algorithm implemented in OMSCS’s Machine Learning for Trading (ML4T) could be applied to Japanese stocks and whether it could actually generate profits.

Overview

For the final project of ML4T, which I took as part of OMSCS, I implemented a trading algorithm using machine learning. The course focused on US stocks, so I wanted to check whether the same approach would also work for Japanese stocks. Details about the course can be found in the article below.
CS 7646: Machine Learning for Trading

Method

To keep the test close to the course assignments, I ran the simulation under the following conditions. To avoid violating the course's policies, I only give an overview of the model and omit its implementation details.

Model: A model based on Random Forest implemented in ML4T
Target: Mitsubishi UFJ Financial Group (8306)
In-sample Period: 2008/1/1 - 2009/12/31
Out-of-sample Period: 2010/1/1 - 2011/12/31
Trading: Long and Short positions can be taken based on predictions.
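
The in-sample and out-of-sample periods above can be separated with date-based slicing in pandas. The following is a minimal sketch, not the code used for this experiment: the helper name split_periods and the synthetic price series are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Periods taken from the conditions listed above.
IN_SAMPLE = ("2008-01-01", "2009-12-31")
OUT_OF_SAMPLE = ("2010-01-01", "2011-12-31")


def split_periods(prices: pd.DataFrame):
    """Return (in_sample, out_of_sample) slices of a date-indexed frame.

    Hypothetical helper for illustration; not part of the course code.
    """
    prices = prices.sort_index()  # label slicing expects ascending dates
    in_sample = prices.loc[IN_SAMPLE[0]:IN_SAMPLE[1]]
    out_of_sample = prices.loc[OUT_OF_SAMPLE[0]:OUT_OF_SAMPLE[1]]
    return in_sample, out_of_sample


# Synthetic placeholder prices (NOT real market data).
dates = pd.bdate_range("2008-01-01", "2011-12-31")
demo = pd.DataFrame({"Close": np.linspace(100.0, 120.0, len(dates))}, index=dates)

train, test = split_periods(demo)
print(train.index.min(), train.index.max())
print(test.index.min(), test.index.max())
```

Keeping the two slices disjoint means the model is never evaluated on dates it was trained on.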

Data

Since the course provided data for US stocks, I needed to find a source for Japanese stock data. After researching, I found that I could obtain Japanese stock data from Stooq using pandas_datareader.
【Python】Obtaining Japanese Stock Price Information Using pandas-datareader

Stooq provides free stock price data and appears to be one of the few sources that offer both US and Japanese stock data. For Japanese stocks, there is about a one-day lag in price updates, which may not be suitable for timely predictions. However, since this model focuses on daily data, I thought this wouldn’t be a significant issue. When actually operating the model and conducting trades, it will be necessary to consider the timing of data acquisition, so it might be better to use a paid service.

Stooq

How to Obtain Data Using pandas_datareader

You can import pandas_datareader and append .JP to the stock code to retrieve the data.

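import datetime as dt

import pandas_datareader.data as pdr

# Stooq identifies Japanese stocks by their stock code plus the ".JP" suffix
symbol = '8306'  # Mitsubishi UFJ Financial Group
ticker_jp = symbol + '.JP'

# In-sample period
start = dt.datetime(2008, 1, 1)
end = dt.datetime(2009, 12, 31)

df = pdr.DataReader(ticker_jp, 'stooq', start, end)
# Make sure the rows are in ascending date order (they may arrive newest first)
df = df.sort_index()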

Note: There appear to be API limitations, and if you perform multiple read operations in a single day, it may suddenly stop returning any responses. In such cases, no error is returned during data retrieval, so caution is required.
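
Because a failed request does not raise an error, it may be worth checking the result explicitly before using it. The following is a minimal sketch: validate_prices is a hypothetical helper name, and the demo frame is synthetic placeholder data rather than a real Stooq response.

```python
import pandas as pd


def validate_prices(df: pd.DataFrame, ticker: str) -> pd.DataFrame:
    """Fail loudly on an empty result and return rows in ascending date order.

    Hypothetical helper for illustration; not part of pandas_datareader.
    """
    if df is None or df.empty:
        raise ValueError(
            f"No rows returned for {ticker}; the request may have been "
            "rate-limited. Try again later."
        )
    return df.sort_index()


# Synthetic placeholder frame with dates in descending order.
raw = pd.DataFrame(
    {"Close": [102.0, 101.0, 100.0]},
    index=pd.to_datetime(["2008-01-08", "2008-01-07", "2008-01-04"]),
)
validated = validate_prices(raw, "8306.JP")
print(validated)
```

In practice, the frame returned by DataReader would be passed to this check right after retrieval, so that an empty response stops the run instead of silently producing an empty backtest.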

You can obtain information such as the opening price, high price, low price, closing price, and volume for each stock as shown below.

[Figure: stock_data]

Result

The simulation results under the above conditions are shown below. The initial capital is normalized to 1, and the graphs show how its value changes over time. The benchmark is a buy-and-hold strategy: the target stock is purchased at the beginning of the period and held until the end.
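
The normalization to 1 and the buy-and-hold benchmark can be sketched as follows. This is a simplified illustration, not the simulator used in the course: it assumes a daily position of +1 (long), -1 (short), or 0 (flat), full exposure rebalanced every day, and no transaction costs. The function name normalized_values and the price values are made up for the example.

```python
import numpy as np
import pandas as pd


def normalized_values(close: pd.Series, position: pd.Series) -> pd.DataFrame:
    """Portfolio value (starting at 1) for a strategy and for buy-and-hold.

    Assumptions (illustrative only): position is +1, -1, or 0 per day,
    exposure is rebalanced daily, and there are no transaction costs.
    """
    daily_ret = close.pct_change().fillna(0.0)
    # A position chosen on day t earns the return of day t+1 (no look-ahead).
    held = position.shift(1).fillna(0.0)
    strategy = (1.0 + held * daily_ret).cumprod()
    benchmark = close / close.iloc[0]  # buy on the first day and hold
    return pd.DataFrame({"strategy": strategy, "benchmark": benchmark})


# Made-up prices and positions (NOT real market data).
idx = pd.bdate_range("2010-01-04", periods=4)
close = pd.Series([100.0, 110.0, 99.0, 108.9], index=idx)
position = pd.Series([1, -1, 1, 0], index=idx)

values = normalized_values(close, position)
print(values)
```

Shifting the position by one day matters: without it, the strategy would be credited with returns from days before the prediction was available.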

[Figure: insample_result]

During the in-sample period used for training, the benchmark ended with a slightly negative return, while the machine learning-based approach achieved a return of approximately 20%.

[Figure: outofsample_result]

For the out-of-sample period, the strategy finished only slightly ahead of the benchmark. The step-like shape of the curve suggests that positions were closed relatively quickly. The market also appears to have moved sideways with few large swings, which may have limited the returns the model could capture.

Applied as-is to the Japanese market, the model did not produce meaningful profits out of sample, which suggests that considerable trial and error would be necessary.

Reflection

I had considered using the model implemented in ML4T for actual investing, but it is now clear that it cannot be used as-is. During the course, the TA also warned that trading with the model without careful consideration is reckless. I had thought that the Japanese market, with its lower trading volumes compared to the US market, might be easier to profit from with algorithmic trading; however, it does not appear to be that straightforward.

The course offered a choice between Random Forest and Q-Learning for the implementation, and I am curious what results Q-Learning would have produced. When I have the opportunity, I would like to try tuning the model and implementing Q-Learning.

Taking the ML4T course has not only allowed me to learn about machine learning itself but also significantly encouraged me to experiment with various approaches. I believe this growth is substantial, and I intend to continue applying what I have learned in the future.