What I've learned while ramping up on a large software project
In mid-March, I started my new role as a Senior Software Engineer at GitHub. Yes, I started a job in the middle of a pandemic. The company brings on new joiners every two weeks for a structured onboarding process, and my cohort was the first to not be flown to San Francisco for it. I was disappointed to not have an in-person experience, but especially looking back now, it was very much the responsible decision for everyone’s safety. Fortunately, the two-plus months since have exceeded my expectations. I love my team, I’m excited about the work I’m doing, and I had never before deployed software at GitHub’s production scale. There’s a huge amount to learn, and it’s been really fun to feel an acceleration in my own growth in the past few weeks.
GitHub’s main codebase is written in Ruby on Rails. It’s been a while since I’ve been paid to write Ruby on Rails. I briefly worked on an RoR project at Pivotal (for about 6 weeks), but before that, I last touched it during my first-ever junior dev role in 2014. The framework has changed significantly in that time, and, let’s be honest: the last time I worked in this stack, I was not a good programmer.
I had to ramp up quickly, as the only senior-level engineer on the team. I understood that a senior-level Rails programmer is expected to write code that is performant (fast) and stable (bug-free), and that also conforms to established Ruby and Rails conventions. But within the company, I also had to get myself to a point where I could step into a leadership role, by driving bigger-picture technical decisions, providing product roadmap input, and setting the pace of the team. In this blog post, I’ve broken down these very large and open-ended expectations into a few smaller areas of proficiency. (Also, not going to claim that I’m there yet! I’m definitely not. But the feedback I’ve received so far has been encouraging.)
I’ve tried to capture some of my major learnings from the last two months about the things that have accelerated my growth. I’ll call out that some of these things will be specific to working on a large Ruby on Rails monolith, backed by MySQL. By “large”, I actually mean two things: the codebase literally has many lines of code, and also, there are millions of active users who generate enormous amounts of data each day. But, no matter what your current or future technology stacks are, breaking down open-ended learning challenges into small tasks is an evergreen skill to build.
It pays off to level up on the database layer
At the end of my second week, I literally spent 2-3 hours going very slowly through the Codecademy SQL tutorial (for real!). I then moved on to gradually more advanced resources like Julia Evans’ zines. I also asked for help in internal Slack channels, which led me to a great conference talk by my colleague Bryana Knight about SQL gotchas. When your production databases start getting large, you have to internalize a few habits:
- Perform your data fetching and processing completely inside of SQL, or ActiveRecord, and never in application memory if it can be avoided. SQL databases and ActiveRecord are designed to process large lookups, and ActiveRecord ships with helpful methods for batched lookups, such as `find_each` and `find_in_batches`, that keep memory consumption low. At the point that data “enters” application memory, it should be ready for business logic to be performed on it: it should ideally require no further array-based filtering or processing. SQL clauses like GROUP BY are very useful for this! Getting this feedback on PRs made me realize that I had to say goodbye to all my favorite Ruby enumerators like `select` and get better at actually writing SQL.
- Don’t iterate over large tables. On one of my first PRs, I wrote a query that started with `User.where`. GitHub’s Users table is over 40 million records long. I received feedback pretty quickly to never do this, and instead, grab only the users you need by querying a smaller table, and fetching only the associated records. It’s a little like practicing greedy algorithms every day.
- Really learn how SQL JOINs work: you need to be able to explain the differences between an inner join, left join, and outer join. Also, a bare JOIN is an inner join by default in both MySQL and Postgres.
- Learn how subqueries work. I haven’t had to write one yet — but I did have to debug one!
- Know when to use indexes. Previously, I didn’t really understand indexes that well: I was in the camp of “index ALL THE THINGS!” Which turns out to not really be correct. This excellent article by Vaidehi Joshi breaks down indexes and compound indexes extremely well.
- Learn to love EXPLAIN. Again, newbie from the world of small production datasets here — you don’t really need to optimize lookups when you’re not breaking 1 million records in any table. EXPLAIN is your best friend when databases get large.
- Always be challenging yourself to think about database schema design in the context of how that table will be used. What application-level events trigger a write? How fast will it grow? Will it be updated on its own, or in tandem with another table, such that we might need to wrap everything in a transaction?
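To make the first habit concrete, here is a minimal sketch of counting in application memory versus letting the database do it with GROUP BY. It uses Python's built-in sqlite3 module as a stand-in for MySQL and ActiveRecord so that it runs anywhere; the users and stars tables and both function names are hypothetical, not GitHub's schema.

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE stars (id INTEGER PRIMARY KEY, user_id INTEGER, repo TEXT);
""")
conn.executemany(
    "INSERT INTO users (id, login) VALUES (?, ?)",
    [(1, "ada"), (2, "grace"), (3, "linus")],
)
conn.executemany(
    "INSERT INTO stars (user_id, repo) VALUES (?, ?)",
    [(1, "rails"), (1, "ruby"), (2, "rails")],
)


def star_counts_in_memory(conn):
    """Anti-pattern: pull every row into the application, then count there."""
    counts = {}
    for (user_id,) in conn.execute("SELECT user_id FROM stars"):
        counts[user_id] = counts.get(user_id, 0) + 1
    return counts


def star_counts_in_sql(conn):
    """Preferred: the database groups and counts; only the totals come back."""
    rows = conn.execute("SELECT user_id, COUNT(*) FROM stars GROUP BY user_id")
    return dict(rows)
```

Both functions return the same counts, but the second only moves one row per user into application memory. In a Rails app the same idea would go through ActiveRecord's query interface rather than raw SQL.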
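The join and subquery habits above can be sketched in a few queries. This assumes Python's standard-library sqlite3 as a stand-in for MySQL, with hypothetical users and stars tables; the SQL is the point, not the Python around it.

```python
import sqlite3

# Hypothetical tables: three users, one of whom has no stars at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE stars (id INTEGER PRIMARY KEY, user_id INTEGER, repo TEXT);
    INSERT INTO users (id, login) VALUES (1, 'ada'), (2, 'grace'), (3, 'linus');
    INSERT INTO stars (user_id, repo)
        VALUES (1, 'rails'), (2, 'rails'), (2, 'ruby');
""")

# INNER JOIN: only users with at least one matching star row.
inner = conn.execute("""
    SELECT users.login, stars.repo
    FROM users
    INNER JOIN stars ON stars.user_id = users.id
    ORDER BY users.login, stars.repo
""").fetchall()

# LEFT JOIN: every user; the star column is NULL (None) when nothing matches.
left = conn.execute("""
    SELECT users.login, stars.repo
    FROM users
    LEFT JOIN stars ON stars.user_id = users.id
    ORDER BY users.login, stars.repo
""").fetchall()

# Subquery: the inner SELECT produces the ids that the outer query filters on.
ruby_fans = conn.execute("""
    SELECT login FROM users
    WHERE id IN (SELECT user_id FROM stars WHERE repo = 'ruby')
""").fetchall()
```

The only difference between the two join results is the user with no stars: the inner join drops them, while the left join keeps them with an empty repo column.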
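As a rough illustration of the index and EXPLAIN habits above, the sketch below asks the database for its query plan before and after adding an index. It assumes SQLite (via Python's sqlite3 module), whose command is EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN and whose output looks different from MySQL's; the table, the index name, and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stars (id INTEGER PRIMARY KEY, user_id INTEGER, repo TEXT)"
)
conn.executemany(
    "INSERT INTO stars (user_id, repo) VALUES (?, ?)",
    [(i % 100, f"repo-{i}") for i in range(1000)],
)

QUERY = "SELECT repo FROM stars WHERE user_id = 42"


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index on user_id, the only option is to scan the whole table.
before = plan(conn, QUERY)

# With the index in place, the planner can search it instead of scanning.
conn.execute("CREATE INDEX index_stars_on_user_id ON stars (user_id)")
after = plan(conn, QUERY)

print(before)
print(after)
```

The exact wording of the plan varies by SQLite version, but the first plan reports a scan of the table and the second reports a search that names the new index. Reading MySQL's EXPLAIN output is the same exercise with more columns.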
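For the "in tandem with another table" case in the last habit, here is a minimal sketch of wrapping two writes in one transaction so that they succeed or fail together. It assumes Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception escapes; the repos and stars tables and the add_star helper are made up for illustration.

```python
import sqlite3

# Hypothetical schema: a counter column that must stay in sync with stars.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE repos (
        id INTEGER PRIMARY KEY,
        name TEXT,
        star_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE stars (
        id INTEGER PRIMARY KEY,
        repo_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        UNIQUE (repo_id, user_id)
    );
    INSERT INTO repos (id, name) VALUES (1, 'rails');
""")


def add_star(conn, repo_id, user_id):
    """Bump the counter and insert the star together, or do neither."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE repos SET star_count = star_count + 1 WHERE id = ?",
            (repo_id,),
        )
        conn.execute(
            "INSERT INTO stars (repo_id, user_id) VALUES (?, ?)",
            (repo_id, user_id),
        )


add_star(conn, repo_id=1, user_id=10)
try:
    # Same user, same repo: the UNIQUE constraint rejects the INSERT.
    add_star(conn, repo_id=1, user_id=10)
except sqlite3.IntegrityError:
    pass
```

When the second write fails, the counter update from the same call is rolled back too, so the two tables never disagree. ActiveRecord offers the same guarantee through its own transaction blocks.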
My next steps are to dive even deeper with a short reading list, all recommended by colleagues:
- Database Design for Mere Mortals
- High-Performance MySQL
- and possibly a revisit of Designing Data-Intensive Applications.
Approach existing code and patterns with a healthy critical lens
When I was a junior developer, every time I was handed a task that I hadn’t done before, my first instinct was to search all over the codebase for an example of when someone else had done the thing previously, then copy-paste a few blocks of code, substituting in my own variable names and making some small changes. I assumed that whoever wrote that code, at that time, knows more than me right now, so therefore I should follow in the footsteps of their wisdom. That way, I would be less likely to introduce bugs! Right? (Wrong.)
I still definitely search the codebase, but as a senior engineer, it’s not so I can find a pattern to copy. It’s so I can understand where the seams in the code are, try to understand the constraints that the engineer who wrote the code was working with, and make an informed decision about how to proceed. I no longer take legacy code as gospel; rather, it’s a historical artifact of a past state of the world, and its chief value is to give me more information about how I can solve the problem today.
If you discover that there is room for improvement on a historical pattern, should you clean up as you go? In an ideal world, we would always be cleaning up the campground, no matter who came before us. But, in large codebases, it’s likely that each file is either explicitly or informally owned by a specific feature team. Within GitHub, we use CODEOWNERS to automatically ping teams when PRs are opened against files that they’re responsible for. So, if I refactored three different files that contained the same old pattern to make them conform to my new pattern, possibly three other teams would get pinged. In a healthy, safe organization, notifying another team is usually a safe thing to do, but I think it’s an even healthier practice to make incremental improvements in collaboration with the teams that own the code.
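To see why a drive-by refactor fans out, here is a toy model of ownership-based notification. It is only a sketch: the OWNERS map, team names, and file paths are hypothetical, and the simple prefix match stands in for the gitignore-style patterns that real CODEOWNERS files use.

```python
# Hypothetical ownership map; a real CODEOWNERS file uses gitignore-style
# patterns, which this prefix match only approximates.
OWNERS = {
    "app/models/billing/": "@example-org/billing",
    "app/models/search/": "@example-org/search",
    "app/controllers/": "@example-org/web",
}


def teams_pinged(changed_files, owners=OWNERS):
    """Return the set of teams that own at least one changed file."""
    pinged = set()
    for path in changed_files:
        for prefix, team in owners.items():
            if path.startswith(prefix):
                pinged.add(team)
    return pinged


# A tightly scoped feature touches one team's files...
feature = ["app/models/billing/invoice.rb"]

# ...while "fixing the old pattern everywhere" touches three teams' files.
refactor = feature + [
    "app/models/search/query.rb",
    "app/controllers/repos_controller.rb",
]
```

Here teams_pinged(feature) contains one team and teams_pinged(refactor) contains three, which is the review cost that a quick cleanup quietly adds.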
A big refactoring effort while you’re in the middle of developing a tightly-scoped feature is, in my opinion, a process anti-pattern, because it generates invisible work. Refactoring and addressing technical debt more generally should be considered as work, and properly triaged and tracked as such. You can even just write a brief issue to describe the refactor, and perform it immediately after you finish the in-flight work.
An aside: depending on your machine specs, an IDE can buckle under the memory demands of a full-codebase indexed search, if your codebase is hundreds of thousands of lines of code, like mine is. I use the ag command line utility with just a simple string to see likely method calls (it’s not perfect, but it takes seconds, and gets me more than 80% of the way).
Spend time on research: it’s part of the job!
I used to feel a twinge of guilt when I went a whole day without pushing code. But, something I’ve learned over the years is that as you become more senior in your career, you’re more likely to spend longer periods of time between code pushes. You’ll take on more leadership tasks such as breaking down work, performing research and validation, facilitating, writing, and so on. Allowing yourself the breathing room to ingest information slowly, and being kind to yourself, is very much a learned skill that takes practice (at least, for me it is).
When ramping up on a new codebase, those research periods can feel like they take a long time. But it’s an important part of the process to ensure that you are building the correct mental models, so you can provide informed input into open-ended decisions later on.
As part of my ramp-up process on a new area of work, I try to:
- Read all relevant RFCs
- Read all relevant Architectural Decision Records and past design docs
- Read internal technical documentation about generalized best practices, if it exists
  - In our org, this takes the form of style guides and developer-facing documentation about processes like “How to write a SQL migration for the monolith”. Your org’s version of this may be an internal wiki, scattered Google Docs, or, the worst possible place: in someone’s head. If that’s the case, consider writing it down somewhere, and making others aware of where you wrote it down.
- Go through a sampling of previous stories that have been completed
- Have conversations with teammates to gather historical decision-making context
- If needed, take to the internet to see if there are public discussions about technical or product implementation
- If it’s a bug fix request, take stock of all the associated open tickets to find out what’s been attempted previously, and to assess the severity of the issue
- Read the corresponding integration tests and unit tests for the code files that I’ll likely be changing
When I feel like I’ve understood a scoped problem, I start by doing some writing, to double-check that my understanding is correct. I write down my assumptions (usually, as a comment in a GitHub issue). If a problem is ready for development work, I write down the specific tasks that need to be accomplished, identify which tasks can be done in parallel, and call out areas of risk — and how we can mitigate them, perhaps with technical spikes, time-boxed research, or pairing with someone outside of the team who has more expertise.
Be nosy about what other teams are doing
As an “outgoing introvert”, working in a remote-first company has actually been great for me, because asynchronous, text-based communication is my jam. On Slack, it’s easy to lurk in a lot of different channels, and pick up ambient context from the many teams working on other parts of our shared giant codebase.
It really reinforces my learning when I can draw connections. I find it helpful to slowly figure out where my team and my work fit into the bigger picture, not just in terms of an organizational roadmap (this should hopefully exist as a document you can just read, somewhere in a shared Google Drive) but also in terms of communication patterns: the informal stuff that isn’t written down anywhere. But beyond the learning piece, I just like building a more holistic view of what’s going on elsewhere in the organization, for a few reasons:
- It helps me identify interesting teams and people to consider working with in the future
- I learn about new process experiments to try with my team, in the spirit of continuous improvement
- I feel more connected to the wider social fabric of the company, and it improves my morale
When you’re new to a company — at least for the first 90 days — you have a layer of protective armor, which gives you a blank check to ask questions like “Hi, I’m new! Can you tell me what X does?” I deploy this tactic liberally, to adjacent teams, to learn about what they’re working on.
There is a time management piece here, too. You can’t spend all your time picking up on generalized knowledge. You have to balance how you spend your ramp-up time, so you can become effective at technical execution as well. I also acknowledge that butting into a Slack channel full of strangers can be intimidating, particularly if you’re not outgoing, or if you are part of a marginalized identity group that experiences stereotype threat. It’s also highly dependent on the overall level of collaboration of the org. If it’s not already normalized to reach across team lines for help, then this is politically riskier to do during your ramp-up period.
I’ve mentioned this a few times throughout, but I want to reiterate that your mileage will vary based on your company’s culture, the stage of your career, and what you feel personally comfortable doing. Hopefully this helps!