I'm an engineer, but I'm not. Squint sideways at that BEng. and the 10 years or so of professional programming experience on my CV and it's true, but not current. So what does it take to re-legitimise that experience in a modern engineering environment?
I learned programming/coding/engineering originally in three ways:
self-taught - I had BASIC/Z80 assembly/FORTH proficiency early on
formal education: an engineering degree combining Computer Science and Electronic Engineering
professional practice: I've worked with programming and engineering teams for nearly 40 years
In more recent years, I've added a 4th method of accelerated learning through deep-dives to rapidly assimilate new material, then link it to what I already know and consolidate that through practice. This is the technique I used to gain a Microsoft Azure Administrator Associate certification a couple of years ago.
This article presents some observations and accomplishments from a small software project in Python. What I wanted to achieve in this exercise was to explore, in a practical setting, the interplay between different aspects of modern software development, to update my previous experience with what I found, and to experiment with AI co-pilot assistance. By doing this, I hoped to progress some way down the track of being able to legitimately practise as an engineer.
(*) Not everything in a universal sense, but certainly challenging all my assumptions about what I think I know.
What would be useful to learn?
So, if I'm going to cosplay an engineer, what would actually be useful to learn?
something low-cost (preferably free!)
something widely-used
something where it's feasible to deliver value with moderate effort
I decided I would devote a few days of effort to learning something new, and for a number of reasons this ended up being the mix:
Cursor IDE - a modern visual development environment that incorporates an AI co-pilot
Python - a widely-used language that has the added benefit of a simple development Web server setup in Flask
GitHub - online source-code repository
gh and git - Linux command-line tools for source-code management
OAuth - an open standard for authorization focused on developer simplicity.
pylint - a static code analysis tool for the Python programming language
pytest - a framework that makes it easy to write small, readable tests
I chose OAuth as the problem domain as I have implemented this in the past in a limited fashion, and I had in mind a larger project that will require it.
A Focused Exercise
I set out to achieve the following:
An implementation of OAuth that will support Google, Facebook and AppleID authorisation flows (with a placeholder for an email/password mechanism outside of OAuth later)
A remote repository for the source code, with a representative commit history of changes as I developed the solution
A set of unit tests for success and failure
Everything running on my Linux desktop.
The requirement for AppleID OAuth was dropped fairly quickly as the required developer account has a NZ$149 annual fee.
In total, reaching the end-point that I estimated at the outset to be valuable took 12 hours of effort, divided across 3 separate days of learning, practical development and refinement.
Note that in all of this, I am working alone, not as part of a team. In my discussion below, there are a number of areas that link to my separate experience as a team coach and that introduce the added complexity of working with multiple people with different skill sets.
Summary of What Was Learned
I achieved significant personal value from what I learned through this exercise:
The implementation of OAuth works with both the Google and Meta/Facebook providers
I updated my code development environments with a comprehensive toolkit (IDE, lint, test, repo and pipeline)
I learned the rudiments of a new language (python)
I implemented a non-trivial authentication protocol by extending sample code
I can write and execute basic automated unit tests
I explored the capabilities, some limitations, and some of the ethical issues of AI co-pilot tools embedded in software development environments
GitHub Source Code Repository
Category: I know this, I just need to configure things
Complexity: Low
Things to do:
install gh - sudo apt install gh
install git - sudo apt install git
Add SSH keys for remote repository - ssh-keygen, add to ssh-agent, upload public key to GitHub account
Authenticate to GitHub - gh auth login
Authorize git from gh - gh auth setup-git
Create the repo on GitHub
Link to remote repository - git remote add origin <URL on GitHub>
This gets everything connected up - a local development directory on my desktop, command-line tools and secure access to the remote repository.
OAuth
Category: I did this once, long ago
Complexity: Moderate
For the OAuth application, I considered two approaches:
Top-down, standards-based: read the documentation on OAuth and on each provider's implementation, and build a reference implementation, or
Find some sample code to adapt and extend
I found this sample code (Create a Flask Application With Google Login, by Alexander VanTol) that demonstrates a tiny Web application using Flask that implements Google's OAuth flow. My first step was to implement this verbatim and use it as the basis for my first commit to the repo.
Set up the API keys in Google
Install the app dependencies - Flask, requests, oauthlib, pyOpenSSL, Flask-Login
Build and run the code - it works!
Commit the changes as the initial commit to the repo
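The shape of that sample code is worth showing. The sketch below is a compressed rendition of the Google flow, not the tutorial verbatim; the environment-variable names and routes are my own placeholders:

```python
# Minimal sketch of the Google OAuth login flow (illustrative, not the
# tutorial verbatim). Client ID/secret come from the API keys set up in
# the Google console; the env var names here are placeholders.
import os
import requests
from flask import Flask, redirect, request
from oauthlib.oauth2 import WebApplicationClient

GOOGLE_CLIENT_ID = os.environ["GOOGLE_CLIENT_ID"]
GOOGLE_CLIENT_SECRET = os.environ["GOOGLE_CLIENT_SECRET"]
GOOGLE_DISCOVERY_URL = "https://accounts.google.com/.well-known/openid-configuration"

app = Flask(__name__)
client = WebApplicationClient(GOOGLE_CLIENT_ID)

@app.route("/login")
def login():
    # Discover Google's authorization endpoint, then send the browser there
    # with our client ID, callback URI and requested scopes.
    cfg = requests.get(GOOGLE_DISCOVERY_URL, timeout=10).json()
    request_uri = client.prepare_request_uri(
        cfg["authorization_endpoint"],
        redirect_uri=request.base_url + "/callback",
        scope=["openid", "email", "profile"],
    )
    return redirect(request_uri)

@app.route("/login/callback")
def callback():
    # Exchange the authorization code for tokens, then fetch the user's
    # profile from the userinfo endpoint.
    cfg = requests.get(GOOGLE_DISCOVERY_URL, timeout=10).json()
    token_url, headers, body = client.prepare_token_request(
        cfg["token_endpoint"],
        authorization_response=request.url,
        redirect_url=request.base_url,
        code=request.args.get("code"),
    )
    token_response = requests.post(
        token_url, headers=headers, data=body, timeout=10,
        auth=(GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET),
    )
    client.parse_request_body_response(token_response.text)
    uri, headers, body = client.add_token(cfg["userinfo_endpoint"])
    userinfo = requests.get(uri, headers=headers, data=body, timeout=10).json()
    return f"Hello, {userinfo.get('email')}!"

if __name__ == "__main__":
    # oauthlib expects HTTPS; Flask's ad-hoc TLS (which is why pyOpenSSL is
    # in the dependency list) is enough for local development.
    app.run(ssl_context="adhoc")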
This was a reasonable first increment (a couple of hours to this point with tool and environment setup). I'm still editing in vi at this point (I have muscle memory of vi key bindings going back to 1986).
To this base, I'm going to refactor and add:
Meta/Facebook OAuth
A database schema that allows a user to use multiple OAuth methods (sketched below)
Unit tests
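For the database schema, the key idea is to separate the user record from their login identities, so that one person can sign in with Google today and Facebook tomorrow. This is a minimal sqlite3 sketch with illustrative table and column names, not the project's actual schema:

```python
# Sketch of a schema that lets one user link multiple OAuth providers.
# Table and column names are illustrative.
import sqlite3

conn = sqlite3.connect("users.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS users (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT UNIQUE
);

-- One row per (provider, external id); several rows may point at one user,
-- so the same account can be reached through any linked provider.
CREATE TABLE IF NOT EXISTS oauth_identities (
    provider         TEXT NOT NULL,      -- e.g. 'google', 'facebook'
    provider_user_id TEXT NOT NULL,      -- the 'sub'/'id' claim from the provider
    user_id          INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (provider, provider_user_id)
);
""")
conn.commit()
```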
At this point, I'm still learning Python (making lots of mistakes with syntax), not enforcing code format standards, not running unit tests and still editing in vi. Time for a change.
Cursor IDE
Coincidentally, I saw a post from Gergely Orosz (https://bsky.app/profile/gergely.pragmaticengineer.com/post/3lbsfy5654c2m ) talking about the rapid rise in popularity of the Cursor IDE (www.cursor.com). My immediate thought was that this could provide the language formatting support (source code prettifying and linter integration) and familiar key bindings that I needed to smooth the learning path. The added bonus was that I could explore the capabilities and ethical issues associated with the inbuilt AI tools.
Installing Cursor was as simple as downloading an app image and registering an account on the AI back-end. It seamlessly absorbed my existing VS Code extensions and it was simple to add the relevant Python and tooling extensions.
Now there's a new can of worms - while this may solve my code practice issues (guiding, avoiding frustrating syntax mistakes, suggesting code changes etc), there are the ethical issues to consider when using AI co-pilots:
what was the underlying model trained on (whose work)?
what will my contributions be used to train in the future?
what are the overall costs of building, training and operating the model?
how do I characterise my learning and resulting expertise - is it the same as, or different from, the acquisition of skills and knowledge in any of my existing learning styles?
I initially approached the co-pilot as if asking questions in a code review, then progressed to asking questions of it as I would of another team member (another developer, or an engineer with a testing specialty), and finally to asking questions as if I were a team coach.
Overall, I found the co-pilot beneficial, in some expected ways and one unexpected. I encountered the usual concerns of the current stage of LLM technology: the co-pilot occasionally made things up (minor hallucinations), more often made mistakes in its first suggestions (quickly corrected after prompting), tended to agree with my suggestions, and in many cases failed to follow constraints.
The most significant problem I found was that the co-pilot suggested test cases as stubs that passed by default, rather than failed. This could lead to a false sense of progress and if not caught, to the deployment of code that had not been fully tested.
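To make that failure mode concrete, here is the pattern in a hypothetical example (not the co-pilot's actual output), alongside safer alternatives:

```python
import pytest

# What the co-pilot tended to suggest: a stub that silently passes,
# inflating the green count without testing anything.
def test_facebook_login_redirects():
    pass  # pytest reports this as PASSED

# Safer defaults: fail loudly, or skip visibly, until the test is written.
def test_facebook_login_redirects_safe():
    pytest.fail("not yet implemented")

@pytest.mark.skip(reason="not yet implemented")
def test_facebook_callback_rejects_bad_state():
    ...
```

A stub that fails (or skips with a reason) keeps the unfinished work visible in every test run instead of hiding it.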
The second most significant problem is that the co-pilot tends to agree with suggestions, even if they are misguided or wrong. If the engineer lacks experience, or blindly trusts the co-pilot, this could lead to problems with functionality or code quality, or require unnecessary iterations.
Python Tools
Category: This is new to me, I'm going to learn this
Complexity: High
From my earliest experiences, I've learned programming by trying things out - in a tight iterative loop between editor, interpreter/compiler and running the code. Where that code comes from depends on the setting:
top-down architecture - standards documents and user requirements, larger blocks of systems and subsystems decomposing to code modules, classes and functions
bottom-up/organic approach - exploratory, proof-of-concept, learning
The latter approach is where I'm most comfortable - it's how I learned originally and it connects well with my other learning styles (speedreading, note-taking, Google search). The caveat is the one I always remember from a 1980s-era programming course: take care to avoid hacking. Hacking in this case means rapidly and iteratively coding a solution, tending to overfit, and stopping as soon as the code works.
To avoid rapid iterative development becoming hacking, I'm going to employ two kinds of protection. One is behaviour-driven testing (unit tests simulate user behaviour) and the other is automated static code analysis to align language usage with common practice.
I'm going to aim for a threshold score of 8.0+ on pylint (a commonly-used standard for production-quality code). I can achieve this in different ways (each with different complexity):
Run pylint manually after code edits and edit to fix code warnings
Add pylint as a precommit hook
Integrate pylint into the Cursor editor (install pylint VS extension)
Add linting checks to a CI/CD pipeline (e.g. GitHub Actions)
I'm going to make it a requirement (precommit hook) that all unit tests pass before code can be committed to the repo. These tests are run by pytest.
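One way to wire both checks together is a .git/hooks/pre-commit script; git runs any executable file at that path, and it can be written in Python. This is a sketch under those assumptions (the file list is illustrative), not necessarily how my hook is implemented:

```python
#!/usr/bin/env python3
# Sketch of a .git/hooks/pre-commit script (must be made executable).
# Blocks the commit unless pylint scores 8.0+ and all pytest tests pass.
import subprocess
import sys

checks = [
    ["pylint", "--fail-under=8.0", "app.py"],  # 'app.py' is illustrative
    ["pytest", "-q"],
]

for cmd in checks:
    result = subprocess.run(cmd)
    if result.returncode != 0:
        print(f"pre-commit: '{' '.join(cmd)}' failed; commit aborted")
        sys.exit(1)
```

Running the same commands in a CI/CD pipeline (e.g. GitHub Actions) then catches anything a developer's local hook misses.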
Pytest
Category: This is new to me, I'm going to learn this
Complexity: Moderate
I've long been an advocate of test-driven development and behaviour-driven testing. Before the advent of modern tools, this process tended to be largely manual, error-prone and brittle. On the other hand, modern tools introduce new complexity (fixtures, mocking, faking) and it is often unclear in a software team who has responsibility for building and maintaining unit tests.
Here, knowing nothing as a starting point, I relied on the capabilities of the co-pilot to write the unit tests, then reviewed and refined them, categorised them into success and failure cases, and extended them to handle multiple OAuth providers.
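The tests ended up with roughly this shape. The sketch below assumes provider-parameterised routes like /login/&lt;provider&gt; and an importable app module; the names are illustrative rather than the repo's actual code:

```python
# Illustrative shape of the provider-parametrised tests; route and module
# names are assumptions, not the repo's actual code.
import pytest
from app import app  # the Flask application under test

@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as test_client:
        yield test_client

# Success case: /login/<provider> should redirect the browser to the provider.
@pytest.mark.parametrize("provider", ["google", "facebook"])
def test_login_redirects_to_provider(client, provider):
    response = client.get(f"/login/{provider}")
    assert response.status_code == 302

# Failure case: a callback with no authorization code should be rejected.
@pytest.mark.parametrize("provider", ["google", "facebook"])
def test_callback_without_code_fails(client, provider):
    response = client.get(f"/login/{provider}/callback")
    assert response.status_code == 400
```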
Pylint
Category: I know this, I'm just rusty
Complexity: Low
The use of linting tools has been part of software development for a long time, to deal with static code issues that may not be caught by interpreters/compilers.
I found it difficult to get the co-pilot to respect pylint rules when it suggested code samples, even when explicitly prompted. Unguided, this will lead to issues, particularly in a team environment. In my experience with teams, code reviews consisting largely of whitespace or formatting issues are generally treated as low-value or even annoying. To avoid this, my approach was to integrate pylint into the editor (allowing me to catch formatting issues very early) and into precommit hooks and the CI/CD pipeline to catch anything that might have been missed.
This improves two things:
It reduces the frequency of lint-only commits
It helps improve the co-pilot's code suggestions, as any warnings must be addressed before committing
I also found this extremely helpful when learning a new language, to observe the warnings and to refactor my code before committing to align better with common practice.
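As a flavour of that feedback loop, here is a hypothetical before/after (not code from the project), with the relevant pylint message IDs as comments:

```python
# Before: code that runs, but draws a cluster of pylint warnings.
def getUser(id):        # C0103 invalid-name; W0622 'id' shadows a builtin
    import json         # C0415 import-outside-toplevel
    data = json.loads(open("u.json").read())  # R1732 consider-using-with
    return data[id]

# After: refactored to align with common practice.
import json

def get_user(user_id):
    """Return the stored record for user_id."""   # C0116 wanted a docstring
    with open("u.json", encoding="utf-8") as f:   # W1514 wanted an encoding
        data = json.load(f)
    return data[user_id]
```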
Individual vs Team
In this example, all of the work has been done by one person. In modern development, the smallest unit is the team, so what are the dynamics we need to be concerned about, where this individual approach might lead to friction in teams or where there are wider organisational concerns:
code reviews - a team will have its own processes and norms around reviewing code: when reviews happen, and what revisions are necessary
code quality - what happens when moving from proof-of-concept to production-quality?
architecture - the approach here is bottom-up, what needs to happen to integrate with a larger system?
testing - the approach here is for the developer to produce the unit tests; how does this integrate into a team environment where others might have better testing-oriented skills?
pace of delivery / size of increment - how to manage increments of delivery with the associated collaboration for code reviews, testing and integration, so that overall team delivery velocity is maintained and aligned to goals?
Cognitive Impacts
There are a number of factors that can lead to cognitive stress in software development:
complexity of the problem space - having to deal with multiple concepts, inter-related components, large scale, performance or security implications, algorithmic complexity, large numbers of edge cases etc
interaction and collaboration - talking with others about the problem, the approach to solving it, or problems with development can be challenging or even unsafe
large amount of work in progress - context-switching involves significant overhead as a person is interrupted from one task to do another (often higher priority or more urgent) or if they have a habit of starting several tasks but not finishing them
unfinished work - we often carry around a lot of context about unfinished work; getting work to a stopping point allows that context to be released
I found it helpful to be able to ask questions about the code by chatting with the co-pilot.
Chat Transcripts
I exported the chat transcripts using a tool by Thomas Pedersen (https://github.com/thomas-pedersen/cursor-chat-browser)