CS 7646: Machine Learning for Trading

For my first course in OMSCS, I took Machine Learning for Trading. It was a course primarily focused on the application of machine learning, and I found it to be highly satisfying.

Overall

In OMSCS, I enrolled in Machine Learning for Trading (hereafter referred to as ML4T).
According to reviews on OMSCentral, it is a highly satisfying course with a moderate workload, which made it seem like a good first course for getting a feel for the program. Note that OMSCS courses are revised every year based on student surveys. For example, the format of the midterm and final exams changed from previous years because of the emergence of Generative AI. Everything I describe below therefore reflects only the term in which I was enrolled, and the course may well change in the future.

The final goal of the course was to implement a profitable investment algorithm using machine learning to predict stock prices. Consequently, the focus was more on practical applications in the real world rather than the theoretical aspects of machine learning. Moreover, since the theme was investment, time was also allocated to financial economics topics such as the Efficient Market Hypothesis and options. The midterm and final exams featured questions that required selecting key considerations for applying machine learning, framed as if one were a machine learning engineer at a hedge fund.

Content

During my enrollment, the course consisted of watching pre-recorded lecture videos and reading reference materials, with 8 assignments in total and 2 open-book exams. One of the reviews on OMSCentral ranked the difficulty of the assignments as shown below, and I believe it is fairly accurate. Assignments with reports tend to require more time. Since I knew in advance which assignments would be heavier, I was able to plan accordingly and start early, which let me work through them in a structured manner. The 8 assignments were not entirely independent; rather, they felt like components that gradually built toward the final goal of implementing the investment algorithm.

Subjective Project Difficulty Ranking: 8 > 3 > 6 > 5 > 1 > 2 > 4

Past syllabi are available at the following site:
CS7646 FALL 2023

The specific content of the assignments is as follows:

Project 1

The first assignment was quite demanding, requiring a report of up to 7 pages. The task was to implement a simple simulation of an even-money casino bet on black or red, and to write up the experimental methodology, results, and discussion in the report. The content was fundamental, covering topics such as calculating expected values and conducting numerical experiments, and it served as an introduction to the later assignments. Anyone with research experience will be familiar with running experiments, summarizing results, and discussing findings logically, and this assignment made clear that the course expects exactly that.
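As a rough illustration of comparing a calculated expectation with a numerical experiment, here is a minimal sketch. It is not the assignment's solution: the function name, the flat 1-unit stake, and the win probability of 18/38 (an American-style wheel) are all assumptions made for the example.

```python
import random

def simulate_even_money_bets(num_bets, win_prob, stake=1.0, seed=0):
    """Return cumulative winnings after each flat-stake even-money bet."""
    rng = random.Random(seed)
    winnings = 0.0
    history = []
    for _ in range(num_bets):
        # A win pays the stake; a loss forfeits it.
        winnings += stake if rng.random() < win_prob else -stake
        history.append(winnings)
    return history

# Assumption: 18 black pockets out of 38 on the wheel.
WIN_PROB = 18 / 38

# Analytical expected value of a single 1-unit bet.
expected_per_bet = WIN_PROB * 1.0 + (1 - WIN_PROB) * (-1.0)

history = simulate_even_money_bets(num_bets=100_000, win_prob=WIN_PROB)
empirical_per_bet = history[-1] / len(history)

print(f"expected per bet:  {expected_per_bet:.4f}")
print(f"empirical per bet: {empirical_per_bet:.4f}")
```

With enough bets, the empirical average should land close to the analytical value, which is the kind of check a report on such an experiment can discuss.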

Project 2

While Project 1 was relatively demanding, Project 2 was very simple and involved performing optimization using scipy.optimize.
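For readers who have not used scipy.optimize, a minimal sketch of a bounded minimization looks like the following. The objective function here is a made-up quadratic chosen only because its minimum is known; it is not the objective from the assignment.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Hypothetical convex objective with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

result = minimize(
    objective,
    x0=np.array([0.0, 0.0]),            # initial guess
    method="SLSQP",                     # supports bounds and constraints
    bounds=[(-5.0, 5.0), (-5.0, 5.0)],
)

print(result.x, result.fun)
```

The same call pattern applies to other objectives: define a function of a parameter vector, pass an initial guess, and read the solution from `result.x`.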

Project 3

The TA had warned us that Project 3 would be intensive, and it did involve a considerable amount of work. The project required implementing Decision Trees, Random Trees, and Random Forests from their theoretical principles, without using libraries, and summarizing the results in a report of up to 7 pages. The aim was to expose each method's tendency to overfit through experiments and to compare the characteristics of the methods.
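To give a feel for what such an implementation involves, here is a minimal sketch of a random regression tree and a bagged ensemble of them, in the spirit of a random forest. The class names, the split rule (a random feature, split at the midpoint of two randomly chosen rows), and the synthetic data are assumptions for illustration, not the course's required design.

```python
import numpy as np

class RandomTree:
    """Regression tree that splits on a random feature at a random midpoint."""

    def __init__(self, leaf_size=1, rng=None):
        self.leaf_size = leaf_size
        self.rng = rng if rng is not None else np.random.default_rng(0)
        self.tree = None

    def fit(self, X, y):
        self.tree = self._build(X, y)
        return self

    def _build(self, X, y):
        # Stop when the node is small enough or all targets agree.
        if len(y) <= self.leaf_size or np.all(y == y[0]):
            return ("leaf", float(np.mean(y)))
        feature = int(self.rng.integers(X.shape[1]))
        i, j = self.rng.choice(len(y), size=2, replace=False)
        split = (X[i, feature] + X[j, feature]) / 2.0
        left = X[:, feature] <= split
        # A split that sends every row one way cannot make progress.
        if left.all() or not left.any():
            return ("leaf", float(np.mean(y)))
        return ("node", feature, split,
                self._build(X[left], y[left]),
                self._build(X[~left], y[~left]))

    def predict(self, X):
        return np.array([self._predict_row(row, self.tree) for row in X])

    def _predict_row(self, row, node):
        while node[0] == "node":
            _, feature, split, left, right = node
            node = left if row[feature] <= split else right
        return node[1]

class BaggedTrees:
    """Average of random trees, each fit on a bootstrap sample."""

    def __init__(self, n_trees=20, leaf_size=1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.trees = [RandomTree(leaf_size, self.rng) for _ in range(n_trees)]

    def fit(self, X, y):
        for tree in self.trees:
            idx = self.rng.integers(len(y), size=len(y))  # sample with replacement
            tree.fit(X[idx], y[idx])
        return self

    def predict(self, X):
        return np.mean([tree.predict(X) for tree in self.trees], axis=0)

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Synthetic noisy data (an assumption for the demo).
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(400, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] + rng.normal(0, 0.3, size=400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

single = RandomTree(leaf_size=1, rng=np.random.default_rng(1)).fit(X_train, y_train)
bagged = BaggedTrees(n_trees=20, leaf_size=1, seed=1).fit(X_train, y_train)
for name, model in [("single tree", single), ("bagged trees", bagged)]:
    print(name,
          "in-sample RMSE:", round(rmse(model.predict(X_train), y_train), 3),
          "out-of-sample RMSE:", round(rmse(model.predict(X_test), y_test), 3))
```

With a leaf size of 1, the single tree memorizes the training rows, so its in-sample error is far below its out-of-sample error. That gap is the overfitting the experiments are meant to reveal, and bagging typically narrows it.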

Project 4

After the heavy assignment came a lighter one, which was considerate. The task was to generate one test dataset on which linear regression reliably achieves high accuracy and another on which Decision Trees do. The assignment tested whether we understood the characteristics of each method.
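The idea can be sketched with two synthetic datasets. This is a hedged illustration only: the generating functions below are my own assumptions, and only an ordinary least-squares fit is shown, to make visible which dataset suits a linear model and which does not.

```python
import numpy as np

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a predict function."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ coef

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))

# Dataset A: the target is an exact linear function of the features.
y_linear = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 1.0

# Dataset B: the target depends on which quadrant a point falls in. Axis-aligned
# splits at 0 separate the quadrants, but no single hyperplane can.
y_quadrant = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)

rmse_linear_data = rmse(fit_linear(X, y_linear)(X), y_linear)
rmse_quadrant_data = rmse(fit_linear(X, y_quadrant)(X), y_quadrant)
print("linear model on dataset A:", rmse_linear_data)
print("linear model on dataset B:", rmse_quadrant_data)
```

The linear fit is essentially exact on dataset A and close to useless on dataset B, whereas a tree that splits each feature at 0 would describe dataset B with a handful of leaves.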

Project 5

From this point onward, the assignments shifted to financial economics. The task was to build a simulator that computes the final return of a sequence of stock trades. This simulator later became a necessary component for running the machine learning profitability simulations, which drew on the outputs of all the previous projects.
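A stripped-down version of such a simulator can be sketched as follows. The function name, the table layout (dates by symbols), the ticker "AAA", and the absence of commissions and market impact are all simplifying assumptions for the example.

```python
import pandas as pd

def compute_portfolio_values(prices, trades, start_cash):
    """Daily portfolio value from prices and trades.

    prices: DataFrame of daily prices (dates x symbols).
    trades: DataFrame of the same shape; shares bought (+) or sold (-) each day.
    """
    # Buying shares reduces cash; selling shares increases it.
    cash_flow = -(trades * prices).sum(axis=1)
    cash = start_cash + cash_flow.cumsum()
    holdings = trades.cumsum()  # shares held at the end of each day
    return (holdings * prices).sum(axis=1) + cash

dates = pd.date_range("2023-01-02", periods=4, freq="D")
prices = pd.DataFrame({"AAA": [10.0, 11.0, 12.0, 11.0]}, index=dates)
trades = pd.DataFrame({"AAA": [100, 0, -100, 0]}, index=dates)

values = compute_portfolio_values(prices, trades, start_cash=10_000.0)
total_return = values.iloc[-1] / values.iloc[0] - 1.0
print(values)
print("total return:", total_return)
```

Here 100 shares are bought at 10 and sold at 12, so the portfolio ends 200 above its starting value of 10,000.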

Project 6

Project 6 was quite tough, because it involved a significant amount of work but allowed only about a week for completion. It required researching five techniques used in technical analysis, implementing them, and simulating returns. A summary of each technique also had to be compiled into a brief report.
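As an example of what implementing a technical indicator looks like in practice, here is a small sketch of two common ones, the price-to-moving-average ratio and momentum. Which five indicators I actually used is not covered here; these two, the function names, and the toy price series are chosen purely for illustration.

```python
import pandas as pd

def price_to_sma(prices, window):
    """Ratio of the price to its simple moving average over `window` days."""
    sma = prices.rolling(window).mean()
    return prices / sma

def momentum(prices, window):
    """Fractional price change over the last `window` days."""
    return prices / prices.shift(window) - 1.0

prices = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0, 14.0])
print(price_to_sma(prices, window=3))
print(momentum(prices, window=3))
```

Both functions return NaN for the first few days, where the lookback window is not yet full, so any strategy built on them has to handle that warm-up period.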

Project 7

In this project, I implemented Q-learning and Dyna, both of which are types of reinforcement learning. Since I had not encountered reinforcement learning before, this assignment was highly educational. The Q-learning part was a simple implementation in which Q-values are updated in a table, and by inspecting the execution results I confirmed that the trained model could navigate a maze and select appropriate routes.
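The table update at the heart of Q-learning can be sketched as below. This is a minimal example under my own assumptions: a hypothetical five-cell corridor stands in for the maze, and the reward values and hyperparameters are arbitrary. Dyna is not shown.

```python
import random

# Hypothetical 1-D corridor: states 0..4, with the goal at state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else -0.01  # small cost per move
    return next_state, reward, next_state == GOAL

def train(episodes=300, alpha=0.2, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: state x action
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore occasionally, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update toward reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy read off the table moves right in every non-goal state, which is the corridor equivalent of finding the route through a maze.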

Project 8

This project was the culmination of everything learned so far: a simulation of investment strategies using machine learning. As in previous projects, the results had to be analyzed and summarized in a report. For the learning method, I could choose either Random Forest or Q-learning, and hyperparameter tuning was required to improve accuracy. The TA pointed out that, thanks to the knowledge accumulated in previous projects, the implementation itself might not take much time, but that tuning could be time-consuming, so we were advised to start early. I therefore began early and implemented Random Forest, which I felt confident about, and was able to submit the assignment comfortably.

Reflection

The four report assignments had specific formatting requirements and allowed up to 7 A4 pages, which was quite burdensome. We were instructed to write as in an actual research paper, describing the experimental methods in enough detail for others to replicate them, so these assignments could serve as a good introduction for those who have never written a paper before.

Regarding the lecture content, implementing methods like Random Forest myself, which I had previously understood only vaguely, deepened my theoretical understanding. Reinforcement learning was unknown territory for me, so learning how it works by implementing it was a valuable opportunity. Finally, combining the techniques I had learned and checking through backtesting how much profit an investment strategy could make was rewarding and made this course one of the