Why you should blog

For the last few years I have been a career mentor for I.T. students at QUT. As part of that I try to inspire the students to start blogging. In this post, I’m going to detail reasons why you should blog. It’s written with students and graduates in mind, but I think everyone should be blogging. I regret that I was slow to start.

It is intimidating to start. You might think that you don’t have anything worth blogging about, or that your posts won’t be good enough for others. Forget about what others might think.

You should blog for your own personal development.

Blogging will help you work through ideas, you will learn more as you research your posts, and your communication will improve.

Blogging is good for your career. It will show your interest and that you enjoy learning. It will also show that you can communicate and share ideas. It will differentiate you from other students and graduates without a blog.

Forget that others might read your blog and start blogging for your own benefit. Blog about what you are learning, the parts you found challenging and what helped you to overcome that. Blog about what you have built and how you have applied what you are learning.

You are just starting out in your career and that gives you a different perspective from experienced bloggers. Writing from your perspective might allow you to create content that helps others starting out in their career. Sooner or later people will start reading your blog and you will have contributed to the community. You might even find looking at your view count addictive!

Hopefully this convinces you to start and I am always happy to help and review.

If you need more convincing I recommend you read Scott Hanselman’s or Steve Yegge’s or Erik Dietrich’s opinions.


Fixing a CloudFormation rollback loop

Disclaimer: Apologies if some of the details of this are incorrect; I am working off my notes and recollection of what happened at the time. I started to blog about this and made a couple of attempts to confirm the details, but couldn’t reproduce the issue. If you’ve worked with CloudFormation you will understand how slow the attempts to reproduce were.

Problem

I had an issue which resulted in a CloudFormation stack ending up stuck. It failed to update and then failed to roll back the failed update. Attempting the rollback a second time produced the same result. This is how the events appeared in the AWS console.

[Image: failure]

I am not sure which of the changes we made resulted in it getting stuck. I think I was trying to update the AMI and the new AMI was not set up to work correctly in Opsworks.

I couldn’t see a way to resolve this issue in the AWS console so I did some searching and found a way to get out of the rollback loop.

Solution

The AWS command line tools often have more functionality than what is available in the web console. I had a look at the commands available for CloudFormation and found continue-update-rollback, which has a “--resources-to-skip” parameter. Using that command with the ID of the stuck instance, ProcessorOpsworksInstance1, got the rollback to complete. However, I wasn’t quite done.
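As a rough sketch of the command described above, the following Python builds the AWS CLI invocation as an argument list. The stack name "my-stack" and the helper name continue_rollback_command are placeholders made up for illustration; only the continue-update-rollback command, its --stack-name and --resources-to-skip parameters, and the ProcessorOpsworksInstance1 ID come from this post and the AWS CLI.

```python
import shlex
from typing import List


def continue_rollback_command(stack_name: str, resources_to_skip: List[str]) -> List[str]:
    """Build the AWS CLI call that resumes a rollback while skipping stuck resources."""
    command = ["aws", "cloudformation", "continue-update-rollback", "--stack-name", stack_name]
    if resources_to_skip:
        # --resources-to-skip takes one or more space-separated resource IDs.
        command += ["--resources-to-skip", *resources_to_skip]
    return command


# "my-stack" is a hypothetical stack name; the resource ID is the one from this post.
command = continue_rollback_command("my-stack", ["ProcessorOpsworksInstance1"])
print(shlex.join(command))
# To actually run it (needs the AWS CLI installed and credentials configured):
# import subprocess; subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.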

[Image: success]

The following warning is provided with the “--resources-to-skip” parameter:


Warning

Specify this property to skip rolling back resources that AWS CloudFormation can’t successfully roll back. We recommend that you troubleshoot resources before skipping them. AWS CloudFormation sets the status of the specified resources to UPDATE_COMPLETE and continues to roll back the stack. After the rollback is complete, the state of the skipped resources will be inconsistent with the state of the resources in the stack template. Before performing another stack update, you must update the stack or resources to be consistent with each other. If you don’t, subsequent stack updates might fail, and the stack will become unrecoverable.


What you need to do here depends on what changes the failed update was making and what state your stack was left in.

For me the AMI failed to update and a new instance was not started. At the end of the rollback CloudFormation would be expecting a ProcessorOpsworksInstance1 in Opsworks running with the original AMI, but I had an instance that would not start. I decided the safest thing to do would be to recreate the instance so that I would be working from a clean slate in the future. To do this I deleted the Opsworks instance by removing it from the template and updating the stack. I then added it back into the template and updated the stack again. This got me back to where I started.


2017

Happy New Year!

We’re already a week into 2017. There are a few milestones coming up for me this year. I will be turning 30 later in the year and I will also reach a decade in software development (I graduated from university and started my first software job in Nov 2007).

Over the last couple of weeks I’ve been doing some reflection and have decided that a big objective for me this year is to get more out of my involvement in the developer community. I regularly attend events and meetups, and from these I have learnt a lot and have met a lot of great people, but this year I want to do more to promote my own and One Model’s brand. To achieve this I have come up with 2 goals for the year.

1. Deliver a talk at a meetup

I attend a lot of meetups and they all rely on the community for presentations. It is time I gave back and helped provide some content. To be fair I have volunteered to speak before, but this year I need to make sure it happens.

2. Post a blog each week

The aim of this is not quantity over quality, but to get myself thinking about blogging more and to increase my efficiency at writing blog posts. This should encourage me to share more and I may even be able to develop the talk from my blog content. I have several drafts of posts and a bunch of ideas and it is time to get these published.

These aren’t the only things I want to achieve this year, but if I get to the end of the year and haven’t fulfilled these I’ll be disappointed.


Some troubleshooting tips for custom AMI Opsworks instances that fail to start

[Image: broken]

When Opsworks instances fail to start, AWS doesn’t provide any hints about what caused the failure. Here are a couple of things I have found that can cause this failure.

  • If you have logged into the instance and customised it, make sure it has been set up for Opsworks.

 


Some reflection

I finished citibank-statement-to-sheets a while ago (GitHub tells me the last commit was on August 10th) and have been putting extra hours into One Model in the time since. I have some projects in mind that will help me organise and manage our code base, but until now I felt other priorities were more important to complete. Tonight I was excited to start on the first of these projects, Ef6Uml, an app that will allow me to generate a class diagram of our entities on demand. I created the repo, put some details into the README and then started to create the solution and projects when I realised I still need to install update 3, which is how I ended up writing this.

The twelve months that have passed since last September have flown by. On the personal side I got married and took some time off for the wedding and honeymoon. Professionally One Model has grown from just a single developer (me!) to a team of 3 led by me. My role has changed from being the lead developer to managing the team and ensuring they are set up to deliver. I am also a director of the Australian subsidiary and have to manage the administration it requires. One Model is in a good place, both the business and the software, but for this post I am going to focus on the software.

I feel that we are getting the balance of quality to speed right. There are pieces that could be better, but there are pieces that could be worse. In AWS we have a dev environment that developers are administrators of and have full control of. This can mirror production, and with the exception of the data it does. This environment and AWS allow the team to operate with DevOps principles. Our infrastructure is scripted (CloudFormation and OpsWorks). We have continuous integration and delivery (deployment to dev is automatic) through AppVeyor and Octopus Deploy. We didn’t start with all this. We grew from using Elastic Beanstalk and basic deployments directly from AppVeyor, but have built it up over the year and it now allows us to get features and fixes out quickly. This is also not the final destination. We will continue to improve, but it’s where we are at now.

Going forwards I think the next area of improvement for the team is UI acceptance tests. We have a good suite of unit tests that get run as part of the build, but are lacking acceptance tests to ensure the units work together as expected.

A lot of my own personal development has been in managing the team. I’ve been the technical lead before, but not directly responsible for management. Luckily I’ve had a couple of good managers who have inspired and shaped me, and I have been reading plenty of blog posts on various topics. I have read Managing Humans, but my goal is to read more literature on the subject. That said, the team is still small, so I don’t spend a lot of time managing them. Most of my time gets spent in more of a product owner/business analyst role, ensuring that features are ready for the team to start on and have been fleshed out to more than the 5 words on a card in Trello they tend to start out as.

The hardest part of my new role has been hiring. Other than sitting in on a couple of interviews I had no experience in this area. There are a lot of resources on the subject (even specifically on hiring developers) and over the last year I have been developing my approach. I am still having trouble filling the start of the hiring funnel and am relying on recruiters for candidates. I don’t think this is ideal (recruiters aren’t cheap) and candidates are still hard to come by. We need to do more brand awareness. A blog would be good and some more effort into our open source projects wouldn’t hurt. I also attended the AWS developer day in Brisbane and noted a lot of other local companies and startups had everyone wearing branded shirts. That was a missed opportunity for us. I also attend a lot of meetups and can probably put more effort into spruiking One Model there.

I was going to write a little bit about the meetups I have been attending, but this has turned into a pretty long piece already and it is getting late. I am already behind where I want to be on my Ef6Uml project. All I did tonight was create the repo and the solution. Fortunately there is a long weekend coming up to work on it.


Notes from Fog Creek’s “How to Hire the Smartest Programmers” series

Sharing some notes I took from Fog Creek’s video series “How to Hire the Smartest Programmers” that I studied when I first had to start hiring for One Model. While hiring for One Model is different to hiring for Fog Creek, there are some techniques that I have been able to apply to my own process.


Reflection on citibank-statement-to-sheets

A little while ago I finished the next version of my statement parser to push the data into Google Sheets. The following is my reflection on the project.

Ditching the jQuery habit

When I started building this application I took the approach of starting simple and only adding things as they were required, but I didn’t apply this line of thought to jQuery. For whatever reason, despite seeing web development without jQuery being promoted around the web, I didn’t think to start without it. The changes to this version of the application added some complexity so I decided to use Knockout to abstract all the UI interaction. In the process of changing the application I realised jQuery could be removed completely.

Maybe the use of jQuery was too ingrained, because using Knockout and a view model is definitely simpler and I should have taken that approach from the start. Next time I will. I also should have used templates and created multiple view models for the different application states to break it up and remove all the state flag properties from it.

Disappointment with Google’s APIs

At work we are pulling data from APIs and I get to see a lot of poorly designed ones. The worst I have had to experience so far has been ADP’s, where the online documentation is almost completely out of date and they distribute the updated information by emailing a PDF! Coming from that I had high hopes for Google’s API for working with Drive and Sheets, but unfortunately reality did not live up to expectation.

I found the API difficult to work with and in the end regretted trying to use the JavaScript version. I think it would have been easier to just have sent AJAX requests directly. The documentation for the API is not clear on the expected structure of the parameters for many requests and there were several points where I had to repeatedly call a function and look at the error returned to determine how to build the required parameter. This was an underwhelming experience and having to wrestle with this was a point of demotivation in the project.

If I was using a strongly typed language this wouldn’t have been a problem, but for JavaScript it was. I believe all the different libraries and the online documentation are generated from a single source, but there needs to be more online documentation (at least for JavaScript). They could also provide TypeScript definitions. I did start to create some, but decided that I wasn’t going to be able to commit to finishing them.


Avoiding numeric overflows in Redshift decimal multiplication

The size of the resulting data type of calculations with DECIMAL (or NUMERIC) types in Redshift depends on the operands. For multiplication the precision of the result equals the sum of the precision of the two operands plus 1 and the scale of the result equals the sum of the scale of the operands. For example

SELECT CAST(9.9 AS DECIMAL(2, 1)) * CAST(9.9 AS DECIMAL(2, 1))

returns a result of 98.01 with a data type of DECIMAL(5, 2). This page documents how the size of the result data type is calculated*.
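The sizing rule for multiplication can be sketched in a few lines of Python. This is a simplified model of the rule as stated in this post, not Redshift’s actual implementation; the names DecimalType, multiplication_result_type and exceeds_limits are made up for illustration, and the limits of 38 (precision) and 37 (scale) are the maximums quoted later in this post.

```python
from typing import NamedTuple

# Maximum DECIMAL precision and scale, as quoted in this post.
MAX_PRECISION = 38
MAX_SCALE = 37


class DecimalType(NamedTuple):
    precision: int
    scale: int


def multiplication_result_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Apply the rule above: precision = p1 + p2 + 1, scale = s1 + s2."""
    return DecimalType(
        precision=left.precision + right.precision + 1,
        scale=left.scale + right.scale,
    )


def exceeds_limits(result: DecimalType) -> bool:
    """True when the computed type is larger than DECIMAL allows."""
    return result.precision > MAX_PRECISION or result.scale > MAX_SCALE


small = DecimalType(2, 1)
print(multiplication_result_type(small, small))  # DecimalType(precision=5, scale=2)

wide = DecimalType(38, 19)
print(exceeds_limits(multiplication_result_type(wide, wide)))  # True
```

The check depends only on the declared types of the operands, not on the values stored in them, which is why small values in wide columns can still trigger an overflow.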

This change in data type can cause numeric overflow. The following example causes an overflow.

SELECT CAST(1 AS DECIMAL(38, 19)) * CAST(2 AS DECIMAL(38, 19))

The resulting data type would be DECIMAL(77, 38), which causes an overflow as it exceeds the maximum precision and scale (38 and 37 respectively). You might think this example trivial because the scale of the numbers is excessive, but as you add more operands the scale you can safely use drops. Also the actual numbers in a 38 precision column are likely to be larger than 1 or 2 and the larger they are the smaller the maximum scale.

This behaviour is good when the size of the data type is smaller in the result (the first example), but frustrating when you get an error for a result that fits in the data type (the second example). Luckily there is another way to perform the calculation to avoid the error.

The workaround

Multiplication can be converted to division, e.g. 2 x 2 becomes 2 / (1 / 2). Calculations that error can be rewritten as division instead of multiplication to avoid numeric overflows, for example:

SELECT CAST(1 AS DECIMAL(38, 19)) / (1 / CAST(2 AS DECIMAL(38, 19)))

Be aware that fractions that cannot be represented in decimal (such as 1 / 3) will be a fraction off e.g.

SELECT CAST(1 AS DECIMAL(38, 19)) / (1 / CAST(3 AS DECIMAL(38, 19)))

returns 3.0000000000000003.

Additionally to make the calculation act like a multiplication and prevent divide by zero errors I recommend you make use of NVL and NULLIF e.g.

SELECT NVL(CAST("column1" AS DECIMAL(38, 19)) / (1 / CAST(NULLIF("column2", 0) AS DECIMAL(38, 19))), 0)
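The intent of the NVL and NULLIF wrapping can be sketched in Python, with None standing in for SQL NULL. This is only an illustration of the logic; the helper names are made up, and Python’s decimal module does not use the same precision or rounding as Redshift.

```python
from decimal import Decimal
from typing import Optional


def nullif(value: Optional[Decimal], match: Decimal) -> Optional[Decimal]:
    """Like SQL NULLIF: return None (NULL) when value equals match."""
    return None if value is None or value == match else value


def nvl(value: Optional[Decimal], default: Decimal) -> Decimal:
    """Like SQL NVL: replace None (NULL) with a default."""
    return default if value is None else value


def multiply_via_division(left: Optional[Decimal], right: Optional[Decimal]) -> Decimal:
    """Compute left * right as left / (1 / right), returning 0 for zero or NULL operands."""
    divisor = nullif(right, Decimal(0))
    if left is None or divisor is None:
        # In SQL, arithmetic involving NULL yields NULL, which NVL then turns into 0.
        return nvl(None, Decimal(0))
    return nvl(left / (Decimal(1) / divisor), Decimal(0))


print(multiply_via_division(Decimal(2), Decimal(2)) == Decimal(4))  # True
print(multiply_via_division(Decimal(5), Decimal(0)) == Decimal(0))  # True
```

Without the NULLIF step the inner 1 / 0 would raise a divide by zero error; with it, a zero operand gives 0, matching what a multiplication by zero would return.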

Footnote

* The Redshift documentation has formulas for calculating the result size of a divide, but I don’t think they are accurate as division does not produce numeric overflows like multiplication does. According to the documentation

SELECT CAST(1 AS DECIMAL(38, 19)) / (CAST(1 AS DECIMAL(38, 19)) / CAST(2 AS DECIMAL(38, 19)))

would have a result data type of NUMERIC(155, 78), which is far greater than the size of the multiplication’s result data type, but no numeric overflow occurs.


Reflection on citibank-statement-to-csv project

Every month when I get my credit card statement I like to go through it and categorise the transactions (a bit like you can in Netbank), so that I can see where and how I am spending my money. The statements come as PDFs and getting the data into spreadsheets involved copying and pasting data from the PDF into Notepad++, running a few regular expressions (set up as a macro) to format the data and then copying and pasting that into the spreadsheet. It wasn’t a difficult process, but I thought it would be fun to try and automate it, even if that turned out like xkcd 1319.

The application was completed last week and is available here with source code here. The following is my short reflection on the project.

Architecture and language

An entirely client side JavaScript application that runs in the browser is definitely a winner. What can’t you do in a browser now? It also meant I could make the app easily available through github.io.

I’ve mentioned that I am a fan of TypeScript, but usually in regards to having strong typing. I also like the improved tooling support. I could write JavaScript without intellisense etc. if required, but why when better things are available? Computers exist to make our lives easier and free us to do better things, so why not let IDEs support us to develop software? I did try out Sublime for developing this app, but I found it a bit lacking. I probably didn’t have it set up properly, but configuring an IDE was not something I wanted to waste time on.

Time taken

The 4 months from first commit (Feb 11th) to last (Jun 24th) is more than I would have expected at the start. I think the biggest factor in this was that I found learning new libraries and tools hindered my momentum early on. From the start I decided to use this as an opportunity to use new tools and libraries, which was good, but it was frustrating, especially when it seemed to take 2 hours of research to make 5 minutes of actual progress.

The main offenders here were webpack and pdf.js. I found the documentation of these pretty unhelpful at times. For webpack I had to look for the way other people configured it and dig into what it was actually doing, and for pdf.js I read through all the provided examples and built a prototype to check it could do what I wanted. Using TypeScript also meant that setting up webpack, jasmine and karma was a little bit trickier. I found this boilerplate helpful.

Webpack

Once I had webpack configured it worked well. The continuous test runner was very quick: by the time I had changed to the console after saving a change the tests had completed. This fast feedback is something I now miss in my normal C# and Visual Studio development environment.

TDD

I took the inside out approach to code the StatementParser and CsvConverter classes, but didn’t create any tests for PdfScraper. I probably should have created some dummy PDF files and approached it from a TDD perspective as well.

Styling

Definitely not a strong point of mine. Making something look like a design is easy, but coming up with an appealing design is challenging. It looks alright now, but that was several hours on its own (yes, even for how simple it is). I decided to use material design concepts and referenced Materialize for CSS.

Browser quirks

I rhetorically asked before what can’t you do in a browser now, but there are still a few quirks. Converting the CSV data into a file and downloading it was two lines of code (thanks data URIs), but giving the file a meaningful name required 8 and is a bit of a browser hack that may break with future updates.

Where to now?

When I started the project I was using Excel, but I have since moved to Google Sheets. Google Sheets has a nice API for making changes so the next steps are to fork this project and create a version that adds the data directly into the sheet.
