How I Built a Twitter Bot That Generates Song Lyrics Using Node.js and Markov Chains

As a musician and programmer, I've always been fascinated by the intersection of technology and creativity. I love experimenting with code to generate art, music, and other content. One of my recent projects was a Twitter bot that automatically generates song lyrics and posts them periodically.

In this in-depth tutorial, I'll walk through how I built my lyric-generating Twitter bot from scratch using Node.js and the Markov chain algorithm. I'll cover everything from setting up a Twitter developer account, to writing the bot code, to deploying it in the cloud for 24/7 operation. By the end, you'll have all the knowledge and code you need to create your own creative Twitter bots!

Project Overview & Components Needed

Here's an overview of what we'll need to build our Twitter lyric bot:

  • A Twitter Developer account and a new application
  • Your programming environment of choice (I used Node.js and Visual Studio Code)
  • The bot's code, which will use the Markov chain algorithm to generate lyrics
  • A source of training data (a large corpus of song lyrics)
  • A cloud platform to deploy and run the bot 24/7 (I chose Heroku)
  • A job scheduler to make the bot post periodically

Don't worry if some of these are new to you – I'll walk through each part step-by-step! The only prerequisites are basic familiarity with JavaScript and using the command line.

Setting Up Your Twitter Developer Account

The first thing we need to do is set up a Twitter Developer account so we can create an application and get the necessary API keys and access tokens.

  1. Go to https://developer.twitter.com and sign in with your normal Twitter account.
  2. Click "Create an app" and fill out the application details. You can call it something like "My Lyric Bot".
  3. On the app's "Keys and tokens" page, generate a new Consumer API key and secret. Then generate a new Access Token and secret. Save all four of these strings somewhere secure, as we'll need them later!

Bot Development Environment

Next we need to set up our local development environment. I recommend using Node.js and a text editor like Visual Studio Code. Make sure you have a recent version of Node and npm installed.

Open up the terminal and make a new directory for the bot project:

mkdir lyric-bot 
cd lyric-bot

Then initialize it as a Node project and install the dependencies we'll need:

npm init -y
npm install twit dotenv

Here's what each of those dependencies does:

  • twit is a Twitter API client library that makes it easy to interact with the Twitter API from our code
  • dotenv allows us to store configuration in a .env file instead of hardcoding API keys

We'll also use fs, Node's built-in filesystem module, to read our lyric data. It ships with Node, so there's nothing to install for it.

Now open this directory in your text editor and we're ready to start coding the bot!

Generating Lyrics with a Markov Chain

The core of our lyric bot is a Markov chain – a statistical model that uses probability to generate sequences of words based on a training text.

Don't worry if you're not familiar with the math behind Markov chains – we can use them without needing to understand all the theory! The key idea is that we'll feed in a large corpus of existing song lyrics as training data. The Markov chain will learn patterns from this data, like which words tend to follow other words. Then we can have it generate new sequences of words which will be statistically similar to the training data.

Here's a high-level look at how it works:

  1. Split the training text into a list of words
  2. Build a probability map showing how often each word is followed by each other word
  3. To generate new text, pick a starting word, then keep picking the next word based on the probabilities from step 2.
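
To make steps 1 and 2 concrete, here's a toy sketch on a single made-up line of text. The sample string and variable names are only for illustration and aren't part of the bot itself.

```javascript
// Toy illustration of steps 1 and 2 on a single made-up line of text.
const sample = 'la la love la la';

// Step 1: split the text into words.
const words = sample.split(/\s+/);

// Step 2: count how often each word is followed by each other word.
const counts = {};
for (let i = 0; i < words.length - 1; i++) {
  const current = words[i];
  const next = words[i + 1];
  counts[current] = counts[current] || {};
  counts[current][next] = (counts[current][next] || 0) + 1;
}

console.log(counts);
// { la: { la: 2, love: 1 }, love: { la: 1 } }
```

Reading the result: after "la", the next word was "la" twice and "love" once, so step 3 would pick "la" about two thirds of the time.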

Here's the code to build the Markov chain (create a new file called markov.js):

const fs = require('fs');

class MarkovMachine {
  constructor(text) {
    // Drop the empty strings that leading or trailing whitespace would produce
    this.words = text.split(/[ \r\n]+/).filter(Boolean);
    this.chains = this.getChains(); 
  }

  getChains() {
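    // Null-prototype objects avoid clashes with built-in keys such as "constructor"
    const chains = Object.create(null);

    for (let i = 0; i < this.words.length - 1; i++) {
      const word = this.words[i];
      const nextWord = this.words[i+1];

      if (!(word in chains)) chains[word] = Object.create(null);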

      if (!(nextWord in chains[word])) {
        chains[word][nextWord] = 0;
      }

      chains[word][nextWord]++;
    } 

    return chains;
  }

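  getNextWord(word) {
    const wordChains = this.chains[word];

    // The last word of the corpus may have no recorded followers;
    // in that case, restart from a random word instead of crashing
    if (!wordChains) return pickRandom(Object.keys(this.chains));

    const words = Object.keys(wordChains);

    // Use the relative frequencies as weights for picking the next word
    const frequencies = words.map(w => wordChains[w]);
    return pickFromFrequencies(words, frequencies);
  }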

  generateText(numWords) {
    const words = [pickRandom(Object.keys(this.chains))];

    for (let i = 0; i < numWords - 1; i++) {
      const prevWord = words[words.length - 1]; 
      const nextWord = this.getNextWord(prevWord);
      words.push(nextWord);
    }

    return words.join(' ');
  } 
}

function pickFromFrequencies(items, frequencies) {
  const sum = frequencies.reduce((a, b) => a + b);
  let count = 0;
  const threshold = Math.random() * sum;

  for (let i = 0; i < frequencies.length; i++) {
    count += frequencies[i];
    if (count > threshold) {
      return items[i];
    }
  }
}

function pickRandom(items) {
  const index = Math.floor(Math.random() * items.length);
  return items[index];
}

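// Export the class so bot.js can require it
module.exports = MarkovMachine;

// Print sample lyrics only when this file is run directly (node markov.js)
if (require.main === module) {
  // Load the lyric text data
  const text = fs.readFileSync('lyrics.txt', 'utf8');

  // Create the Markov machine
  const machine = new MarkovMachine(text);

  // Generate some sample lyrics
  const lyrics = machine.generateText(20);
  console.log(lyrics);
}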

This defines a MarkovMachine class which takes in a large source text, analyzes the word transitions to build a probability map (this.chains), and uses that to generate new sequences of words.

The generateText method is what we'll call to get the bot's lyrics. It picks a random starting word, then keeps choosing each next word at random, weighted by how often it followed the current word in the training text. The numWords argument lets us set how long the generated text should be.
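
If frequency-weighted picking feels abstract, here's a small standalone sketch of the idea. The pickWeighted helper and its injectable rng parameter are my own illustration (not part of markov.js); passing in the random source just makes the behaviour easy to check.

```javascript
// Weighted random choice. `rng` is injectable so the behaviour can be checked
// deterministically; it defaults to Math.random.
function pickWeighted(items, weights, rng = Math.random) {
  const total = weights.reduce((a, b) => a + b, 0);
  const threshold = rng() * total;
  let running = 0;
  for (let i = 0; i < items.length; i++) {
    running += weights[i];
    if (threshold < running) return items[i];
  }
  return items[items.length - 1]; // guard against floating-point edge cases
}

// Tally 10,000 picks for a word that was followed by "la" twice and "love" once.
const tally = { la: 0, love: 0 };
for (let i = 0; i < 10000; i++) {
  tally[pickWeighted(['la', 'love'], [2, 1])]++;
}
console.log(tally); // roughly two thirds "la", one third "love"
```

The less common follower still gets picked sometimes, which is what keeps the generated lyrics from repeating the same phrase forever.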

To use this, we need to provide a source text file of song lyrics. I used a corpus of about 50,000 lines of lyrics from pop songs. The more data you provide, the better results you'll get! Save your lyric data as a text file called lyrics.txt in the same directory.
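
Raw lyric files often need a little tidying before they make good training data. Here's a hypothetical cleanup pass; the rules (dropping bracketed section tags like [Chorus], lowercasing, removing blank lines) are assumptions about what your corpus looks like, so adjust them to fit your data.

```javascript
// Hypothetical cleanup pass; the rules below are assumptions about the corpus.
function cleanLyrics(raw) {
  return raw
    .split(/\r?\n/)
    .map((line) => line.replace(/\[[^\]]*\]/g, '')) // drop tags like [Chorus]
    .map((line) => line.toLowerCase().trim())
    .filter((line) => line.length > 0) // drop blank lines
    .join('\n');
}

// Made-up sample input, just to show the effect.
console.log(cleanLyrics('[Verse 1]\nSing It Loud\n\n[Chorus]\n  Yeah yeah yeah  '));
// sing it loud
// yeah yeah yeah
```

You could run something like this once over your raw data and save the result as lyrics.txt.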

When you run this script, it will load the lyrics, build the Markov model, and spit out a sample of machine-generated lyrics! Of course they won't be perfectly coherent or meaningful, but they should at least vaguely resemble real song lyrics. Here's an example of what mine generated:

she rocky we back your wrong was everlasting place
to me nobody of we ah me
loves had gone were dream she if

Not exactly Grammy-worthy, but it's a start! Feel free to tinker with the Markov chain parameters or use different training data to get better results. But this is already enough for our bot to start posting some amusing non-sequiturs.
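
One parameter worth tinkering with is how many previous words the chain looks at. The sketch below is a variation of my bot's approach, not the code it runs: it keys each transition on the previous two words instead of one. The function names and the demo text are made up for illustration.

```javascript
// Sketch of an order-2 chain: the state is the previous TWO words, not one.
function buildOrder2(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const chains = new Map(); // "word1 word2" -> array of words seen next
  for (let i = 0; i < words.length - 2; i++) {
    const key = `${words[i]} ${words[i + 1]}`;
    if (!chains.has(key)) chains.set(key, []);
    chains.get(key).push(words[i + 2]);
  }
  return chains;
}

function generateOrder2(chains, numWords, rng = Math.random) {
  const keys = [...chains.keys()];
  if (keys.length === 0) return '';
  // Start from a random word pair.
  const output = keys[Math.floor(rng() * keys.length)].split(' ');
  while (output.length < numWords) {
    const followers = chains.get(output.slice(-2).join(' '));
    if (!followers) break; // dead end: this pair only appears at the very end
    // Repeated followers appear multiple times in the array, so a uniform
    // pick is automatically weighted by frequency.
    output.push(followers[Math.floor(rng() * followers.length)]);
  }
  return output.slice(0, numWords).join(' ');
}

const demo = buildOrder2('we will we will rock you we will rock you');
console.log(generateOrder2(demo, 8));
```

Longer contexts tend to read more coherently, but with a small corpus the output can end up copying whole lines from the source, so this works best with plenty of training data.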

Connecting to the Twitter API

Now that we can generate lyric text, we need to wire it up to the Twitter API so our bot can post tweets. Create a new file called bot.js with this code:

const Twit = require('twit');
const fs = require('fs');
const MarkovMachine = require('./markov');

// Load the API keys from .env into process.env
require('dotenv').config();

const T = new Twit({
  consumer_key: process.env.TWITTER_CONSUMER_KEY, 
  consumer_secret: process.env.TWITTER_CONSUMER_SECRET,
  access_token: process.env.TWITTER_ACCESS_TOKEN,
  access_token_secret: process.env.TWITTER_ACCESS_TOKEN_SECRET
});

const machine = new MarkovMachine(fs.readFileSync('lyrics.txt', 'utf8'));

function tweet() {
  const lyrics = machine.generateText(20);

  T.post('statuses/update', { status: lyrics }, (err, data, response) => {
    if (err) {
      console.error(err);
    } else {
      console.log(`Posted: ${lyrics}`);
    }
  });
}

tweet();

This uses the twit library to create a client connected to the Twitter API using the keys and secrets we got from the developer portal. Make sure to create a file called .env with your tokens like this:

TWITTER_CONSUMER_KEY=paste_your_key_here 
TWITTER_CONSUMER_SECRET=paste_your_secret_here
TWITTER_ACCESS_TOKEN=paste_your_token_here
TWITTER_ACCESS_TOKEN_SECRET=paste_your_secret_here
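
A missing or misspelled variable can surface later as a confusing authentication error, so a quick check at startup may save some head-scratching. This is a minimal sketch: missingEnvVars is a hypothetical helper of mine, and the variable names are the four from the .env file above.

```javascript
// The four variable names used in the .env file.
const REQUIRED_VARS = [
  'TWITTER_CONSUMER_KEY',
  'TWITTER_CONSUMER_SECRET',
  'TWITTER_ACCESS_TOKEN',
  'TWITTER_ACCESS_TOKEN_SECRET',
];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env = process.env) {
  return REQUIRED_VARS.filter((name) => !env[name] || env[name].trim() === '');
}

// Demo with a fake environment that only defines one of the four.
console.log(missingEnvVars({ TWITTER_CONSUMER_KEY: 'abc' }));
```

In bot.js, you could call missingEnvVars() right after the dotenv line and stop with a clear message if the returned list isn't empty.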

The tweet function generates some lyrics using our Markov machine and posts them to Twitter using the API client. Right now this will just post once when we run the script.
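
Twenty words will usually fit in a tweet, but long words can add up. Here's a sketch of a guard that trims the text at a word boundary. fitToTweet is a hypothetical helper, 280 is the standard tweet limit, and JavaScript's string length only approximates how Twitter counts characters (emoji and some scripts are weighted differently).

```javascript
const MAX_TWEET_LENGTH = 280; // standard tweet limit; adjust if yours differs

// Trim text to the limit, cutting at the last whole word that fits.
function fitToTweet(text, maxLength = MAX_TWEET_LENGTH) {
  if (text.length <= maxLength) return text;
  // Look one character past the limit so a word ending exactly at the limit is kept.
  const clipped = text.slice(0, maxLength + 1);
  const lastSpace = clipped.lastIndexOf(' ');
  // Fall back to a hard cut if there is no space to break on.
  return lastSpace > 0 ? clipped.slice(0, lastSpace) : text.slice(0, maxLength);
}

console.log(fitToTweet('la la love la la', 10)); // la la love
```

In bot.js this could wrap the generated text, for example { status: fitToTweet(lyrics) }.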

Go ahead and test it out:

node bot.js

Check your bot's Twitter feed and you should see it posted some fresh lyrics!

Deploying and Automating the Bot

The last piece is to deploy our bot to a server so it can post lyrics on a regular schedule. I used Heroku's free tier, which Heroku retired in November 2022, so following the same steps today requires a paid plan.

First create a new file called Procfile with:

worker: node bot.js

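This tells Heroku how to run the bot. Before committing, add .env and node_modules to a .gitignore file so your API keys stay out of the repository. Then commit everything to a new git repository: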

git init
git add .
git commit -m 'initial commit'

If you haven't already, install the Heroku CLI and log in. Then create a new Heroku app:

heroku create my-lyric-bot

Configure the environment variables using your Twitter tokens:

heroku config:set TWITTER_CONSUMER_KEY=your_key_here 
heroku config:set TWITTER_CONSUMER_SECRET=your_secret_here
heroku config:set TWITTER_ACCESS_TOKEN=your_token_here
heroku config:set TWITTER_ACCESS_TOKEN_SECRET=your_token_secret_here

Finally, push the code to Heroku (if your local branch is named master rather than main, push that instead):

git push heroku main

The code is now deployed. Since bot.js posts a single tweet and then exits, the last thing we need is a scheduler to run it periodically. I used the Heroku Scheduler add-on to run the script once every hour:

heroku addons:create scheduler:standard
heroku addons:open scheduler

This will open the scheduler dashboard where you can create a new job to run node bot.js on an hourly frequency.
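
If you'd rather not depend on a scheduler add-on, an alternative is to keep the process running and let Node do the timing. This is a sketch of that idea, not what my bot uses: msUntilNextHour and scheduleHourly are hypothetical helpers, and task stands in for the bot's tweet function.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Milliseconds from `now` until the next top of the hour (measured in UTC).
function msUntilNextHour(now = new Date()) {
  return HOUR_MS - (now.getTime() % HOUR_MS);
}

// Runs `task` at the next top of the hour, then every hour after that.
// `task` is a placeholder for the bot's tweet function.
function scheduleHourly(task) {
  setTimeout(() => {
    task();
    setInterval(task, HOUR_MS);
  }, msUntilNextHour());
}

console.log(`Next run in about ${Math.round(msUntilNextHour() / 60000)} minutes`);
```

With this approach you'd call scheduleHourly(tweet) at the bottom of bot.js instead of the single tweet() call, and the script would need to run as a long-lived process rather than a scheduled job.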

And that's it! Our friendly neighborhood lyric bot is now up and running, tirelessly generating and tweeting quirky song lyrics for the world to enjoy. The cool thing is this same technique can be used to build all sorts of other creative Twitter bots – you're really only limited by your imagination!

Some ideas for upgrading the bot:

  • Improve the lyric quality by using a larger corpus of training data or a more sophisticated language model like GPT-2
  • Add the ability to reply to mentions and interact with followers
  • Use sentiment analysis to detect and match the emotions in the generated lyrics
  • Integrate with a music composition API to pair the lyrics with melodies

I hope this has inspired you to create your own Twitter bots and explore the possibilities of generative art and computer creativity. The full code for my Lyrics Bot is available on GitHub. Feel free to use it as a starting point for your own projects. Happy botting!
