What I Learned from My First Contribution To Node.js Core

As a full-stack JavaScript developer, I've been using Node.js for several years now to build scalable back-end services and command line tools. Node has become an integral part of my development toolkit, so I've always had an interest in understanding how it works under the hood and potentially contributing to the project.

However, despite my curiosity, I kept putting off diving into the Node source code, making excuses about not having enough time or worrying I wouldn't have the necessary low-level programming skills to make a meaningful contribution.

It wasn't until a Node core collaborator personally encouraged me to get involved after seeing one of my talks that I finally decided to take the plunge. With his support, I overcame my initial hesitation and embarked on my first contribution to the Node.js project. Here's what I learned along the way.

The Impact of Node.js

Before diving into my contribution story, it's worth taking a step back to appreciate just how significant Node.js has become. What started as an experiment in using JavaScript outside the browser has turned into a major platform powering everything from startups to the enterprise.

Consider these stats:

Metric                      Value
NPM packages                Over 1.3 million
GitHub stars                Over 80,000
Core contributors           Over 2,600
Weekly downloads            Over 70 million
Websites using Node.js      Over 30 million

Data sources: modulecounts.com, githut.info, npm-stat.com, w3techs.com

Node.js's popularity stems from its unique value proposition – enabling developers to use JavaScript, the most widely known programming language, to write server-side and command line applications with great performance. As someone who writes full-stack JavaScript myself, I've directly benefited from the productivity boost of using a single language and common tooling across the whole stack.

At the same time, Node's growing adoption means that contributing to the project, whether by reporting issues, improving documentation, or submitting code changes, can have a huge impact. With millions of developers and companies depending on Node, every contribution helps make the software better for the broader ecosystem.

Setting Up a Dev Environment

My first challenge was getting my local development environment set up to build and debug Node.js itself. It had been years since I last touched C++, and back then I was using a Windows machine with Visual Studio. These days, I'm fully invested in the VSCode ecosystem on macOS.

Fortunately, the Node.js team maintains excellent documentation on building the project from source. With the help of a VSCode extension for C/C++, I was able to get everything compiled and running without too much fuss.

I did run into some confusion around launching the Node debugger and attaching the LLDB debugger to the process, but I eventually found a configuration that worked well. Being able to hit breakpoints and step through the Node source was a huge help as I started exploring the codebase.

Here's a simplified version of my VSCode debug config:

{
  "name": "Debug Node.js",
  "type": "lldb",
  "request": "launch",
  "program": "${workspaceFolder}/out/Debug/node",
  "args": ["app.js"],
  "cwd": "${workspaceFolder}"
}

With this, I could set a breakpoint in the Node C++ code, launch the debugger, and step through the native parts of the codebase in addition to my own JavaScript.
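
For completeness, the app.js referenced in the "args" field is just whatever script exercises the code path you care about. Since the issue I eventually picked up involved the worker_threads module (more on that below), a throwaway one-liner of my own like this was enough to hit breakpoints in the native Worker code:

const { Worker } = require('worker_threads');

// Spawning a single worker is enough to step through the native
// Worker setup and teardown paths under the debugger.
new Worker(`console.log('hello from a worker')`, { eval: true });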

Taking on My First Issue

With my environment squared away, the next step was finding a way to actually contribute. Again, the Node core collaborator who initially encouraged me to get involved was instrumental here. He pointed me to a relatively self-contained issue related to the new experimental worker_threads module.

A Primer on Worker Threads

If you're not familiar, worker threads are a relatively new feature in Node.js that allow running JavaScript in parallel on independent threads. This is a big deal for a platform that has traditionally relied on a single-threaded event loop model.

With worker threads, CPU-bound tasks can be offloaded to separate threads to execute in parallel without blocking the main event loop. This can lead to significant performance improvements for certain workloads.

Here's a simple example of spinning up a worker thread:

const { Worker } = require('worker_threads');

const worker = new Worker(`
  const { parentPort } = require('worker_threads');
  parentPort.postMessage({ hello: 'world' });
`, { eval: true });

worker.on('message', (msg) => {
  console.log(msg); // Prints { hello: 'world' }
});

In this snippet, the main thread creates a new Worker, passing it some JavaScript code to evaluate in a separate thread. That code posts a message back to the parent thread, which can listen for messages via the worker.on('message') event.
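
The earlier point about offloading CPU-bound work is easier to see with a slightly bigger sketch of my own (not taken from the Node docs). The recursive fib helper and the interval log are purely illustrative; Worker, workerData, and parentPort are the real worker_threads APIs:

const { Worker } = require('worker_threads');

// Run a deliberately slow, CPU-bound Fibonacci calculation on a worker thread.
const worker = new Worker(`
  const { parentPort, workerData } = require('worker_threads');
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`, { eval: true, workerData: 35 });

worker.on('message', (result) => {
  console.log('fib(35) =', result);
});

// The main thread stays responsive while the worker is busy computing.
setInterval(() => console.log('event loop still ticking'), 200).unref();

Running the same fib(35) call directly on the main thread would freeze everything, including that interval, until it finished; with the worker, the ticks keep printing while the result is computed in parallel.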

Under the hood, the worker threads implementation in Node.js uses a combination of native C++ code and JavaScript to implement the threading and communication primitives. And it was in this code that my first contribution opportunity emerged.

Debugging a Race Condition

The specific issue I tackled had to do with a race condition when a worker thread was disposed. There were situations where messages from the worker's stdout (e.g. from console.log statements) could arrive on the main thread after references to the underlying resources had already been cleaned up.

To start, I attempted to put together a minimal test case that reproduced the problem consistently. This proved tricky due to the timing-sensitive nature of race conditions – the bug would manifest on some of my test runs but not others. After many iterations, I finally settled on a test that failed often enough to be useful, if not 100% of the time.

Here's a simplified version of the repro:

const { Worker } = require('worker_threads');

const worker = new Worker(`
  console.log('Inside worker');
`, { eval: true });

worker.on('exit', () => {
  // Worker exited but its console.log might not have been received yet
  worker.terminate();
});

In this case, calling worker.terminate() immediately in the exit event callback didn't guarantee that the 'Inside worker' message had already been processed on the main thread. If it arrived after terminate() had cleaned up the handles, it would try to write to a closed stream.

With a reproducing test case in hand, I dove into debugging to understand the sequence of events that caused the race condition. I learned that in addition to the explicit message channel between the main thread and workers, there's a second internal communication channel used for propagating console.log, console.error, and process.stdout writes from worker threads.

The bug occurred because we were disposing of a worker and releasing references to its resources before ensuring that all pending messages on this internal channel had been processed. If a message arrived after the references were gone, it would try to operate on invalid handles.

Here's a simplified diagram of the race condition:

Main Thread                     Worker Thread

  worker.terminate()
        |
        |                       console.log('...')
        |                             |
        V                             V
  Free resources                Msg sent via internal channel
        |                             |
        V                       Invalid handles error!

My first few attempts at fixing this were admittedly clumsy. I tried adding promises and callbacks to wait for the message queue to empty before disposing of the worker. None of these solutions were very elegant.

After talking it through with the collaborator who was mentoring me, I discovered there was an existing drain() method on the MessagePort class that would synchronously emit any queued messages. This was already being called for the public-facing message channel – I simply needed to call it on the internal channel as well before disposing of the worker.

In the end, the fix was a one-line change:

// worker.cc

void Worker::Dispose() {
  // ...

  env->isolate()->RunMicrotasks();

  // Drain any remaining messages from internal channels
  public_port_->Drain();
  internal_port_->Drain(); // <- The fix!

  // ...
}

By forcing the internal message port to drain before disposing of the worker, we ensured that any pending messages would be processed before the associated resources were freed, preventing the race condition.
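
For what it's worth, here is roughly how the behaviour can be exercised from JavaScript. This isn't the actual regression test that landed in Node core, just a sketch of the pattern: spawn short-lived workers that log right before exiting, terminate each one as soon as it exits, and check that nothing crashes. The stressStdoutDrain name is mine, and note that worker.terminate() returns a promise in recent Node versions.

const { Worker } = require('worker_threads');

async function stressStdoutDrain(iterations = 50) {
  for (let i = 0; i < iterations; i++) {
    const worker = new Worker(`
      console.log('message from worker ' + require('worker_threads').threadId);
    `, { eval: true });

    // Wait for the worker to exit, then terminate immediately -
    // this is the timing window that used to trigger the race.
    await new Promise((resolve) => worker.on('exit', resolve));
    await worker.terminate();
  }
  console.log('done: no crash, all worker output was drained');
}

stressStdoutDrain();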

Although it was a small change in terms of lines of code, tracking down this issue gave me a much deeper appreciation for the challenges of writing thread-safe code and the subtleties of the worker threads implementation.

Key Takeaways

In addition to the satisfaction of seeing my code merged into Node core, I took away some valuable lessons from this experience:

Don't be afraid to try, even if you feel uncertain. Everyone has to start somewhere. The Node core collaborators were incredibly supportive of me as a first-time contributor.

As someone who's been coding professionally for many years, it's easy to feel intimidated when contributing to a large, established project like Node.js. Imposter syndrome is real! But the maintainers truly want to help new contributors succeed. They were patient in explaining things and didn't make me feel bad for any knowledge gaps.

Asking for help is not a sign of weakness. I could have spun my wheels attempting to fix this bug in all the wrong ways if I hadn't reached out to get another perspective.

Getting stuck is part of the process. It's tempting to go heads-down and try to power through on your own, but often the fastest way to get unstuck is to swallow your pride and ask for help. Experienced maintainers have a wealth of context that can quickly point you in the right direction.

Debugging unfamiliar codebases is a skill that you can improve with practice. Each time I dug through the worker_threads implementation, I got faster at tracing the relevant code paths.

At first, diving into a large C++ codebase was daunting, even with my prior C++ experience. But like any skill, debugging ability improves the more you do it. Over time I built up a mental map of the key flows and data structures in the worker threads subsystem.

Contributing to OSS is a fantastic way to learn. I gained a much deeper understanding of Node's architecture and concurrency model through this one bug fix than I ever did by simply using the Node APIs.

It's one thing to know how to use an API, but it's another level of insight to understand how that API is implemented under the hood. Tracing through the Node codebase gave me a newfound appreciation for just how much complexity is being abstracted away by the clean public interfaces.

The OSS community is full of smart, gracious people who want to help you succeed. Nearly everyone I interacted with in the process of making this contribution went out of their way to make me feel welcome.

From code reviews to GitHub discussions to IRC chats, I was struck by how friendly and supportive the Node core contributors were at every step of the process. They welcomed questions, offered tips, and gave constructive feedback to help me improve my submission.

Just the Beginning

In the end, my first contribution to Node.js was fairly small in scope – just a one-line change to fix a race condition. But the impact on my growth as a developer was immense. I confronted my fears, broadened my skill set, and discovered the satisfaction of improving a tool that I use every day.

If you've ever thought about contributing to Node core or any other open source project, I strongly encourage you to give it a try. Find a supportive community, start small, and don't hesitate to ask for help. You might be surprised by how much you're capable of.

I know this is just the beginning of my Node core journey. I still have a ton to learn, but I'm excited to keep collaborating with the amazing team behind this project. I hope sharing my experience might inspire a few other developers to get involved as well. The more diverse perspectives we have shaping the future of Node.js, the healthier the platform will be for everyone.

At the end of the day, contributing to open source is about more than just code. It's about being part of a community and building something bigger than yourself. And that's an opportunity we can all be grateful for.
