
Chris Mathew

GitHub | Email

About

Hi, welcome to my corner of the Internet! I'm a graduate of the University of Waterloo in Canada, where I studied Mathematics and Accounting. I am interested in using the power of data and software to solve real business problems. I started out as a huge Excel nerd, but I'm always trying to broaden my skillset to address the complexity and size of today's business problems. In my spare time, I work on random projects, pick up new skills, fiddle with new tech, watch basketball, poorly attempt to play basketball, and take walks. Looking forward to exploring more of the world when I get the chance!

Projects

Low-code PDF Extraction

"Pdflow" is a low-code solution to build text extraction sequences for PDF files. When I was a Staff Accountant at KPMG, I built several VBA programs that involved reading text from PDF documents (partnership returns, corporate documents, etc.); I finally decided to build a tool that would help simplify the process of building the extraction sequence itself.

The web app for Pdflow is built using MeteorJS + React. AWS Lambda is used to partition large extraction jobs. Text extraction is performed via the Tesseract OCR Engine. Image pre-processing (grayscale conversion, Otsu thresholding) is used to improve the accuracy of the extraction. Developers can also use the Pdflow API to integrate flows into their own apps. To accommodate Tesseract, I built a Docker image which gets uploaded to AWS ECR and then deployed to Lambda.
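As a rough illustration of the pre-processing step mentioned above, here is a minimal sketch of grayscale conversion followed by Otsu thresholding. It is written in plain Python for readability and is not Pdflow's actual code; the function names (`to_grayscale`, `otsu_threshold`, `binarize`) and the nested-list image format are assumptions made for the example.

```python
def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


def otsu_threshold(gray):
    """Return the threshold (0-255) that maximizes between-class variance."""
    # Histogram of intensity values.
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg = 0  # number of pixels with value <= t
    sum_bg = 0     # sum of intensities of those pixels
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]


# Hypothetical 2x3 image: dark "ink" values (10) on a light background (200).
sample = [[10, 10, 200], [200, 10, 200]]
t = otsu_threshold(sample)
binary = binarize(sample, t)
```

A clean black-and-white image like `binary` is generally easier for an OCR engine to read than a noisy grayscale scan, which is the motivation for thresholding before extraction.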

Pdflow has over 50 users consisting of students, educators, business owners and civic data analysts. Although Stripe integration is included in the app, my users seem more than happy making use of my generous free tier 🥲.


ERP System

I built "Gourd" as an internal ERP system for our start-up, Intelline, before we transitioned to QuickBooks. While most sane people would have used an existing ERP application to run their business, I was determined to improve my web dev skills while building a useful, tailored internal tool for our company.

Gourd includes modules for task tracking, expenditure management, accounting, and budgeting. It also integrates with Tableau (analysis) and Google Drive (uploading/downloading supporting documentation).


Stock Project

I started this project with my friend Tobias after listening to a lecture at UW about stock price reactions to earnings releases. This project scrapes various sources (Yahoo Finance, TipRanks, Questrade, etc.) and tracks daily/intraday metrics on stocks from the following markets: CSE, TSX, TSXV, NASDAQ, NYSE. A scoring algorithm runs continuously and notifies us what to buy or sell.

The scoring algorithm is quite naive at the moment. I am considering using a supervised learning algorithm (likely an ensemble) to model the relationship between the metrics I track and stock performance.

As this is a private project, I am limited in what I can share. Below is an architecture chart to give you an idea of how it works; I have also provided links to a repo with a very small section of the source code and a cropped screenshot of our documentation in Sphinx.

[Image: architecture chart (pc-flow)]

ML Projects

The Colab notebooks linked above contain ML projects I've worked on and things I've learned along the way. I'll continue to add to this repository over time.

  • Housing Regression: predict median house values in California districts
  • MNIST Classifier: accurately classify digits from the MNIST dataset
  • Moons Classifier: grow a Random Forest using an ensemble of Decision Trees
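
To illustrate the ensemble idea behind the Moons Classifier, here is a simplified sketch in plain Python. It is not the notebook's code: it uses depth-1 trees (decision stumps) instead of full Decision Trees, trains each one on a bootstrap sample, and combines them by majority vote. A real Random Forest also samples features at each split. All function names and the toy data are assumptions made for the example.

```python
import random
from collections import Counter


def fit_stump(points, labels):
    """Find the single-feature threshold split with the fewest training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for feature in range(len(points[0])):
        for threshold in sorted({p[feature] for p in points}):
            left = [y for p, y in zip(points, labels) if p[feature] <= threshold]
            right = [y for p, y in zip(points, labels) if p[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, point):
    """Classify one point with a fitted stump."""
    feature, threshold, left_label, right_label = stump
    return left_label if point[feature] <= threshold else right_label


def fit_bagged_stumps(points, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(points)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([points[i] for i in idx], [labels[i] for i in idx]))
    return ensemble


def predict_ensemble(ensemble, point):
    """Return the label predicted by the majority of stumps."""
    votes = Counter(predict_stump(stump, point) for stump in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: class 0 has negative x, class 1 has positive x.
points = [(-3.0, 0.5), (-2.0, -1.0), (-1.0, 0.2), (1.0, 0.1), (2.0, -0.4), (3.0, 0.9)]
labels = [0, 0, 0, 1, 1, 1]
ensemble = fit_bagged_stumps(points, labels)
```

Calling `predict_ensemble(ensemble, (5.0, 0.0))` asks every stump for a vote and returns the most common answer. Averaging over many trees trained on slightly different samples is what makes an ensemble more stable than any single tree.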

Other

  • Pdflooper: Automates batch PDF page insertion/deletion/replacement (demo)
  • Autorun: Send jobs to a remote computer without having to set up a server
  • Cir: Tool to document/explore graph relations between circuit components
  • This Website: built with Next.js, React, TypeScript