Reflection on citibank-statement-to-csv project

Every month when I get my credit card statement I like to go through it and categorise the transactions (a bit like you can in Netbank), so that I can see where and how I am spending my money. The statements come as PDFs, and getting the data into a spreadsheet involved copying and pasting it from the PDF into Notepad++, running a few regular expressions (set up as a macro) to format the data and then copying and pasting the result into the spreadsheet. It wasn’t a difficult process, but I thought it would be fun to try and automate it, even if it turned out like xkcd 1319.

The application was completed last week and is available here with source code here. The following is my short reflection on the project.

Architecture and language

An entirely client-side JavaScript application that runs in the browser is definitely a winner. What can’t you do in a browser now? It also meant I could make the app easily available through

I’ve mentioned that I am a fan of TypeScript, but usually with regard to its strong typing. I also like the improved tooling support. I could write JavaScript without IntelliSense and the like if required, but why would I when better things are available? Computers exist to make our lives easier and free us to do better things, so why not let IDEs support us in developing software? I did try out Sublime for developing this app, but I found it a bit lacking. I probably didn’t have it set up properly, but configuring an editor was not something I wanted to spend time on.

Time taken

The 4 months from the first commit (Feb 11th) to the last (Jun 24th) were more than I had expected at the start. I think the biggest factor was that learning new libraries and tools hindered my momentum early on. From the start I decided to treat this as an opportunity to try new tools and libraries, which was good, but it was frustrating, especially when it seemed to take 2 hours of research to make 5 minutes of actual progress.

The main offenders here were webpack and pdf.js. I found the documentation for both pretty unhelpful at times. For webpack I had to look at how other people configured it and dig into what it was actually doing. For pdf.js I read through all the provided examples and built a prototype to check it could do what I wanted. Using TypeScript also meant that setting up webpack, jasmine and karma was a little bit trickier. I found this boilerplate helpful.


Once I had webpack configured it worked well. The continuous test runner was very quick: by the time I had switched to the console after saving a change, the tests had completed. This fast feedback is something I now miss in my normal C# and Visual Studio development environment.


I took the inside-out approach to coding the StatementParser and CsvConverter classes, but didn’t create any tests for PdfScraper. I probably should have created some dummy PDF files and approached it from a TDD perspective as well.
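As a rough illustration of what working inside out can look like, here is a minimal sketch of two small, pure units that can be checked in isolation before anything touches a PDF. Everything in it is an assumption made for illustration: the Transaction shape, the function names parseTransactionLine and toCsv, and the statement line format ("11 Feb DESCRIPTION 4.50") are hypothetical and are not taken from the project's actual StatementParser or CsvConverter.

```typescript
// Hypothetical shape of one parsed transaction; the real project's types may differ.
interface Transaction {
  date: string;        // e.g. "11 Feb"
  description: string;
  amount: number;
}

// Assumed line format: "<day> <Mon> <description> <amount with 2 decimals>".
const LINE_PATTERN = /^(\d{1,2} [A-Za-z]{3})\s+(.+?)\s+(-?[\d,]+\.\d{2})$/;

// Parse one line of statement text; returns null for lines that are not transactions.
function parseTransactionLine(line: string): Transaction | null {
  const match = LINE_PATTERN.exec(line.trim());
  if (!match) {
    return null;
  }
  return {
    date: match[1],
    description: match[2],
    amount: parseFloat(match[3].replace(/,/g, "")), // drop thousands separators
  };
}

// Quote a field when it contains a comma, a double quote, or a line break.
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// Convert transactions to CSV text with a header row.
function toCsv(transactions: Transaction[]): string {
  const header = "Date,Description,Amount";
  const rows = transactions.map(t =>
    [t.date, t.description, t.amount.toFixed(2)].map(escapeCsvField).join(",")
  );
  return [header].concat(rows).join("\n");
}
```

Because both functions take plain values and return plain values, each can be covered by a handful of jasmine specs with no browser or PDF involved, which is what makes the fast test loop possible.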


Design is definitely not a strong point of mine. Making something look like an existing design is easy, but coming up with an appealing design is challenging. It looks alright now, but that took several hours on its own (yes, even for how simple it is). I decided to use material design concepts and referenced Materialize for the CSS.

Browser quirks

I rhetorically asked earlier what you can’t do in a browser now, but there are still a few quirks. Converting the CSV data into a file and downloading it took two lines of code (thanks, data URIs), but giving the file a meaningful name required 8 and is a bit of a browser hack that may break with future updates.
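For readers who haven't seen this kind of hack, the following is a minimal sketch of one common way to do it, not necessarily the code used in this project. It assumes the widely used trick of clicking a temporary link that carries the download attribute; the function names csvToDataUri and downloadCsv are made up for the example.

```typescript
// Build a data URI from CSV text. encodeURIComponent escapes commas,
// quotes and line breaks so the URI stays valid.
function csvToDataUri(csv: string): string {
  return "data:text/csv;charset=utf-8," + encodeURIComponent(csv);
}

// Browser-only: offer the CSV as a download with a chosen file name by
// clicking a temporary <a download="..."> element.
function downloadCsv(csv: string, fileName: string): void {
  const link = document.createElement("a");
  link.href = csvToDataUri(csv);
  link.download = fileName;        // the download attribute suggests the file name
  link.style.display = "none";
  document.body.appendChild(link); // some browsers ignore clicks on detached links
  link.click();
  document.body.removeChild(link);
}
```

Without the file name, simply navigating to the data URI is enough to trigger a download in many browsers, which is roughly the two-line version. The extra lines exist only to give the browser a name to save under, and they depend on the browser honouring the download attribute.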

Where to now?

When I started the project I was using Excel, but I have since moved to Google Sheets. Google Sheets has a nice API for making changes, so the next step is to fork this project and create a version that adds the data directly into the sheet.
