Generate an RSS feed from a Twitter user timeline

I needed to generate a valid RSS feed from a Twitter user’s timeline, but only for tweets that matched a certain pattern. Here’s how I did it using PHP.

First, I added the dependency on the TwitterOAuth library by Abraham Williams:

$ composer require abraham/twitteroauth

This library will handle all of my communication and authentication with Twitter’s API. Then I created a read-only app in my Twitter account and securely noted the four authentication items I would need: the consumer API key and secret, and the access token and secret.

Now, I can quickly bring recent tweets from my target Twitter user into a PHP variable:

require "/path/to/vendor/autoload.php";
use Abraham\TwitterOAuth\TwitterOAuth;

$consumerKey       = "your_key_goes_here"; // Consumer Key
$consumerSecret    = "your_secret_goes_here"; // Consumer Secret
$accessToken       = "your_token_goes_here"; // Access Token
$accessTokenSecret = "your_token_secret_goes_here"; // Access Token Secret

$twitter_username    = 'wearrrichmond';

$connection = new TwitterOAuth( $consumerKey, $consumerSecret, $accessToken, $accessTokenSecret );

// Get the 10 most recent tweets from our target user, excluding replies and retweets
$statuses = $connection->get(
	'statuses/user_timeline',
	array(
		"count" => 10,
		"exclude_replies" => true,
		'include_rts' => false,
		'screen_name' => $twitter_username,
	)
);

My specific use case is that my local public school system doesn’t publish an RSS feed of news updates on its website, but it does tweet those updates with a somewhat standard pattern: the headline of the announcement, possibly followed by an at-mention and/or image, and then including a link back to a PDF file on their website that lives in a certain directory. I wanted to capture these items for use on another site I created to aggregate local news headlines into one place, and it mostly relies on the presence of an RSS feed.

So, I only want to use the tweets that match this pattern in the custom RSS feed. Here’s that snippet:

// For each tweet returned by the API, loop through them
foreach ( $statuses as $tweet ) {

	$permalink = '';
	$title     = '';

	// We only want tweets with URLs
	if ( ! empty( $tweet->entities->urls ) ) {

		// Look for a usable permalink that matches our desired URL pattern, and use the last (or maybe only) one
		foreach ( $tweet->entities->urls as $url ) {

			if ( false !== strpos( $url->expanded_url, 'rcs.k12.in.us/files', 0 ) ) {
				$permalink = $url->expanded_url;

			}
		}

		// If we got a usable permalink, go ahead and fill out the rest of the RSS item
		if ( ! empty( $permalink ) ) {

			// Set the title value from the Tweet text
			$title = $tweet->text;

			// Remove links
			$title = preg_replace( '/\bhttp.*\b/', '', $title );

			// Remove at-mentions
			$title = preg_replace( '/\@\w+\b/', '', $title );

			// Remove whitespace at beginning and end
			$title = trim( $title );

			// TODO: Add the item to the feed here

		}
	}
}

Now we have just the tweets we want, ready to add to an RSS feed. We can use PHP’s built-in SimpleXML extension to do this. In my case, the final output is written to an output text file, which another part of my workflow can then query.

Here’s the final result all put together:

<?php

/**
 * Generate an RSS feed from a Twitter user's timeline
 * Chris Hardie <chris@chrishardie.com>
 */

require "/path/to/vendor/autoload.php";
use Abraham\TwitterOAuth\TwitterOAuth;

$consumerKey       = "your_key_goes_here"; // Consumer Key
$consumerSecret    = "your_secret_goes_here"; // Consumer Secret
$accessToken       = "your_token_goes_here"; // Access Token
$accessTokenSecret = "your_token_secret_goes_here"; // Access Token Secret

$twitter_username    = 'wearrrichmond';
$rss_output_filename = '/path/to/www/rcs-twitter.rss';

$connection = new TwitterOAuth( $consumerKey, $consumerSecret, $accessToken, $accessTokenSecret );

// Get the 10 most recent tweets from our target user, excluding replies and retweets
$statuses = $connection->get(
	'statuses/user_timeline',
	array(
		"count" => 10,
		"exclude_replies" => true,
		'include_rts' => false,
		'screen_name' => $twitter_username,
	)
);

$xml = new SimpleXMLElement( '<rss/>' );
$xml->addAttribute( 'version', '2.0' );
$channel = $xml->addChild( 'channel' );

$channel->addChild( 'title', 'Richmond Community Schools' );
$channel->addChild( 'link', 'http://www.rcs.k12.in.us/' );
$channel->addChild( 'description', 'Richmond Community Schools' );
$channel->addChild( 'language', 'en-us' );

// For each tweet returned by the API, loop through them
foreach ( $statuses as $tweet ) {

	$permalink = '';
	$title     = '';

	// We only want tweets with URLs
	if ( ! empty( $tweet->entities->urls ) ) {

		// Look for a usable permalink that matches our desired URL pattern, and use the last (or maybe only) one
		foreach ( $tweet->entities->urls as $url ) {

			if ( false !== strpos( $url->expanded_url, 'rcs.k12.in.us/files', 0 ) ) {
				$permalink = $url->expanded_url;

			}
		}

		// If we got a usable permalink, go ahead and fill out the rest of the RSS item
		if ( ! empty( $permalink ) ) {

			// Set the title value from the Tweet text
			$title = $tweet->text;

			// Remove links
			$title = preg_replace( '/\bhttp.*\b/', '', $title );

			// Remove at-mentions
			$title = preg_replace( '/\@\w+\b/', '', $title );

			// Remove whitespace at beginning and end
			$title = trim( $title );

			$item = $channel->addChild( 'item' );
			$item->addChild( 'link', $permalink );
			$item->addChild( 'pubDate', date( 'r', strtotime( $tweet->created_at ) ) );
			$item->addChild( 'title', $title );

			// For the description, include both the original Tweet text and a full link to the Tweet itself
			$item->addChild( 'description', $tweet->text . PHP_EOL . 'https://twitter.com/' . $twitter_username . '/status/' . $tweet->id_str . PHP_EOL );

		}
	}
}

$rss_file = fopen( $rss_output_filename, 'w' ) or die ("Unable to open $rss_output_filename!" );
fwrite( $rss_file, $xml->asXML() );
fclose( $rss_file );

exit;

Here’s the same thing as a gist on GitHub.

I set this script up to run as a cron job every hour, which gives me a regularly updated feed based on the Twitter account’s activity.

Several ways this could be improved include:

  • Better escaping and sanitizing of the data that comes back from Twitter
  • Making the filtering of the Tweets more tolerant of changes in the target user’s Tweet structure
  • Genericizing the functionality to support querying multiple Twitter accounts and generating multiple corresponding output feeds
  • Fixing Twitter so that RSS feeds of user timelines are offered on the platform again

If you find this helpful or have a variation on this concept that you use, let me know in the comments!

Running WordPress cron on a multisite instance

For a long time I used the WP Cron Control plugin and an associated cron job to make sure that scheduled actions on my WordPress multisite instance were executed properly. (You should never rely on event execution that is triggered by visits to your website, the WordPress default, IMHO.) But after the upgrade to WordPress 5.4 I noticed that some of my scheduled events in WordPress were not firing on time, sometimes delayed by 10-20 minutes. I did some troubleshooting and got as far as suspecting a weird interaction between that plugin and WordPress 5.4, but never got to the bottom of it.

When I reluctantly went in search of a new solution, I decided to try using WP CLI cron commands, executed via my server’s own cron service. Ryan Hellyer provided most of what I needed in this helpful post, and I extended it a bit for my own purposes.

Here’s the resulting script that I use:

#!/bin/bash

# This script runs all due events on every WordPress site in a Multisite install, using WP CLI.
# To be run by the "www-data" user every minute or so.
#
# Thanks https://geek.hellyer.kiwi/2017/01/29/using-wp-cli-run-cron-jobs-multisite-networks/

PATH_TO_WORDPRESS="/path/to/wordpress"
DEBUG=false
DEBUG_LOG=/var/log/wp-cron

if [ "$DEBUG" = true ]; then
        echo $(date -u) "Running WP Cron for all sites." >> $DEBUG_LOG
fi

for URL in $(wp site list --field=url --path="$PATH_TO_WORDPRESS" --deleted=0 --archived=0)
do
        if [[ $URL == "http"* ]]; then
                if [ "$DEBUG" = true ]; then
                        echo $(date -u) "Running WP Cron for $URL:" >> $DEBUG_LOG
                        wp cron event run --due-now --url="$URL" --path="$PATH_TO_WORDPRESS" >> $DEBUG_LOG
                else
                        wp cron event run --quiet --due-now --url="$URL" --path="$PATH_TO_WORDPRESS"
                fi
        fi
done

Then, in my system crontab:

# Run WordPress Cron For All Sites
*/2 * * * * www-data /bin/bash /path/to/bin/run-wp-cli-cron-for-sites.bash

Yes, I run cron every 2 minutes; there are some sites I operate that require very precise execution times in order to be useful. One implication is that this solution does not scale up very well; if the total execution time of all cron jobs across all sites exceeds 2 minutes, I could quickly run into situations where duplicate jobs are running trying to do the same thing, and that could be bad for performance or worse.
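One way to guard against overlapping runs is a simple lock around the script. Here’s a minimal sketch, not part of the script above: the run_exclusively function name and the /tmp lock path are made up for illustration, and it relies on mkdir being atomic, so only one run at a time can create the lock directory.

```shell
#!/bin/bash

# Hypothetical lock location; any path writable by the cron user would do.
LOCK_DIR="${LOCK_DIR:-/tmp/run-wp-cli-cron.lock}"

# Run the given command only if no other run currently holds the lock.
# Returns 75 without running anything when the lock is already taken.
run_exclusively() {
        # mkdir is atomic: it succeeds for exactly one caller at a time.
        if ! mkdir "$LOCK_DIR" 2>/dev/null; then
                echo "Previous run still in progress; skipping." >&2
                return 75
        fi

        # Run the wrapped command and remember its exit status.
        "$@"
        local status=$?

        # Release the lock so the next scheduled run can proceed.
        rmdir "$LOCK_DIR"
        return $status
}

# Example (path assumed from the crontab entry above):
# run_exclusively /bin/bash /path/to/bin/run-wp-cli-cron-for-sites.bash
```

One caveat: if a run is killed before it reaches rmdir, the lock directory stays behind and has to be removed by hand, so a more robust setup might use a tool like flock instead.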

Generic ‘send to Slack’ shell script

On any given server I maintain, I like to set up a generic “send a message to Slack” shell script that can be called from any other tool or service running on that machine. With it I can log information of interest to a Slack channel for reading or maybe action.

Here’s what send-to-slack.sh usually looks like:

#!/bin/bash -e

message=$1

[ ! -z "$message" ] && curl -X POST -H 'Content-type: application/json' --data "{
              \"text\": \"${message}\"
      }" https://hooks.slack.com/services/12345/67890/abcdefghijklmnop

That last line has the “incoming webhook URL” provided by Slack when you set Slack up to receive messages via incoming webhooks, something that is included in even their most basic/free tier.

Running the script and sending a message to the channel is as simple as $ sh send-to-slack.sh 'My message goes here', and the message shows up in the channel as you would expect.

Once that’s in place and tested, I can call the script from wherever I want on that server. Other shell scripts. Custom WordPress functions. Cron jobs. And so on.

There are many other ways this could be customized or extended. It’s worth noting that this is not necessarily a fully secure way to do things if you have untrusted users who can control the input to the script and the message that gets output…please remember to sanitize your inputs and escape your outputs!
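As a starting point for that kind of escaping, here’s a rough sketch of building the JSON payload more carefully. The json_escape and build_payload function names are hypothetical, and it only handles backslashes, double quotes, newlines, carriage returns and tabs; other control characters would need more work, or a dedicated tool like jq.

```shell
#!/bin/bash

# Escape a string so it can sit inside a JSON string literal.
# Covers backslash, double quote, newline, carriage return and tab only.
json_escape() {
        local s=$1
        s=${s//\\/\\\\}       # backslashes first, so later escapes are not doubled
        s=${s//\"/\\\"}       # double quotes
        s=${s//$'\n'/\\n}     # newlines
        s=${s//$'\r'/\\r}     # carriage returns
        s=${s//$'\t'/\\t}     # tabs
        printf '%s' "$s"
}

# Build the JSON body for a Slack incoming webhook from a raw message.
build_payload() {
        printf '{"text": "%s"}' "$(json_escape "$1")"
}

# Example (webhook URL is the placeholder from the script above):
# curl -X POST -H 'Content-type: application/json' \
#      --data "$(build_payload "$message")" \
#      https://hooks.slack.com/services/12345/67890/abcdefghijklmnop
```

With this in place, a message containing quotes or line breaks still produces valid JSON instead of a broken request.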

 

Put all those email newsletters in an RSS feed

The other day someone told me that they think blogging is dead.

I tried to suppress the sounds of existential pain emanating from deep within my soul, but it still hurt.

Blogging is far from dead, but I also recognize that email newsletters are all the hotness right now when it comes to getting your written thoughts in front of someone else. And I recognize that if you want to follow some kinds of updates from some kinds of people or organizations, you’re going to have to do the email thing.

For a while, I used email filters to manage this issue, dutifully creating or updating them in my setup each time I cringe-fully subscribed to a new email newsletter list after searching in vain for an RSS feed subscribe button. Then I would let them all go into just the right email folder (or maybe even still my inbox) so I could read them when I was in the mood to read blog-posts-as-email-messages on a given subject.

Ugh.

I didn’t like that this approach still created a kind of additional email “to do” burden on me, leaving me with folders to sort, search through and clear out. Newsletter content is usually not actionable or time-sensitive. What I really wanted was to treat all those email newsletter messages like blog post headlines in a separate kind of reader app, available to be read at my leisure. YOU KNOW, LIKE AN RSS FEED READER.

So here’s my current setup:

  1. Newsletter emails go to a dedicated email alias configured at my mail provider, and that’s the address I use to subscribe to lists.
  2. Those messages are forwarded to a Zapier-powered recipe that converts them into items on a custom generated RSS feed.
  3. I subscribe to the RSS feed in my feed reader, Feedly.

Now I can browse the headlines when I want to, read some items and gloss over others, and my email inbox is no longer crowded with articles that aren’t necessarily actionable or time-sensitive for me.

A few things I could do to tweak this setup further:

  • Right now all of my email newsletters go into a single RSS feed. For better categorization and readability, I could break these out into individual feeds.
  • The translation of HTML-only emails (another annoying thing about the email newsletter age) into the RSS feed format supported by Zapier doesn’t always work well. I haven’t really explored a fix for this but it hasn’t affected me much so far.

Also note that Zapier’s pricing structure is such that depending on the number of incoming messages you have, you might need to upgrade to a paid plan.

That’s it. My email inbox has benefitted greatly from this setup, and I hope yours will too.

New WordPress and WooCommerce plugin: Harmonizely Booking Product

I’ve released a new, free plugin for WordPress and WooCommerce, Harmonizely Booking Product. The plugin creates a new WooCommerce product type that allows you to sell access to scheduled appointments on your calendar, using Harmonizely.

Here’s a quick video to show you how it works:

Disclaimer: I am not affiliated with Harmonizely and they did not ask or pay me to create this plugin; I’m just a fan of the service who wanted to create more ways to use it within the WordPress ecosystem. This post does contain some referral links where I may receive a small percentage of any sales that might result from readers clicking through.

There is a growing number of options to handle appointment scheduling, and if you’re in some field where people schedule things with you a lot (consultant, agency, counselor, accountant, lawyer, healthcare professional) I hope you’re looking at those tools to save you some time. One of the main reasons I like and settled on Harmonizely is because they support the open CalDAV standard for calendar connections and syncing, whereas many other services only support Google Calendar or other proprietary connections. (This is especially important to me as a part of advocating for an open web.)

I also like Harmonizely because the service is simple and fast, they regularly release improvements and new features, they have a small and responsive team, and they’ve made their product roadmap public and interactive. Their basic tool is free and they have very affordable pricing for an upgraded version.

Creating this plugin to work with WooCommerce means that anyone who has an existing WooCommerce-powered store can add booking functionality and keep using their existing payment methods, plugins and other settings. I can imagine a content creator who already sells access to video courses or other educational resources might enjoy being able to let users schedule a quick call with them for a small fee, too. Or maybe someone who offers troubleshooting services of some kind can now give their customers a quick way to pay for and schedule an appointment. There are lots of possibilities, and WooCommerce offers tons of flexibility so you can integrate with Stripe, PayPal, Square and other payment processors.

If you want to sell access to your time through a website, I hope you’ll take a look at Harmonizely, WooCommerce, and this new Harmonizely Booking Product plugin. If you have questions or need help, you can submit a support message or open a GitHub issue.

Enjoy!

WP Engine is a great web host for WordPress developers

I’ve been aware of WP Engine’s WordPress hosting offerings for quite a while now, but I only recently had a chance to dive deeply into the features and benefits they offer to WordPress developers, and I was really impressed.

(Disclaimer: I am not affiliated with WP Engine and am not being compensated by them in any way for this review. But this post contains some referral links where I may receive a small percentage of any sales that result from readers clicking through, and where readers may receive a discount on their purchase.)

Some of the things about WP Engine that stood out to me as really helpful and awesome for WordPress developers:

  1. Super-fast, comprehensive site backup snapshots and cloning. The ability to quickly make a copy of an entire production site (with a large DB and tons of media) to a staging version of that site, or just to a backup snapshot, is a huge benefit. Being able to do it at a click of a button without messing around with export/import tools, find-replace operations or similar command line intervention is just awesome, and enables all sorts of other development best practices when it comes to testing changes and having a safety net for production updates. It’s SO fast, usually completing within a minute or two, so you can make backups/clones all day long without delay. It’s better than any other site backup or environment cloning tool I’ve used in the WordPress hosting space.
  2. Deep integration with git repo management. Although the instructions and interface for setting it up need a little expanding and polishing, WP Engine makes it really easy to set up a git repo for a given hosting environment, where changes pushed to its main branch are quickly deployed to the associated environment. They’ve thought through the complexities of exclusions and co-existing with WordPress-initiated core/plugin/theme updates. Add in the GitHub Action to deploy to a WP Engine environment and you’ve got a really sweet development and deployment pipeline setup, all using industry best practices.
  3. Fast and powerful SSH command line access, optimized for security and WP CLI operations. WP Engine seems to understand that command line operations are an essential tool in a WordPress developer or site manager’s toolkit, and they make it really straightforward to use.
  4. Robust system status monitoring and reporting. Whereas some hosts update their system status page well after an impacting event, WP Engine seems to have theirs wired up to show a closer-to-realtime status, and that makes all the difference in not wasting time when troubleshooting or reacting to problems. I also really appreciate that they offer email, Slack and webhook-based notifications for status events, offering endless possibilities with integrating platform events into your development tools and workflows.
  5. Thoughtful tools for keeping WordPress current and secure. WP Engine clearly understands the importance of keeping WordPress core up to date and making sure no insecure plugins or themes are in place any longer than is absolutely necessary. While I think responsibility for these things generally falls to a developer and not the host, I appreciate that they’ve invested in infrastructure here, and I’m sure it benefits them and their support operations in the long run too.
  6. Great support, great communication. Whenever I’ve used the WP Engine support chat they’ve been fast, knowledgeable and straight to the point without being curt. If a question or issue needs input from another internal team, they seem to be able to do that quickly and without any resistance. Their documentation is generally well-written and organized.

In the project I was working on where I finally got to see these features directly in action, I had evaluated a variety of hosts including SiteGround, Pressable, WordPress.com Business, and WP Engine. I picked WP Engine for the above reasons and others, including their focus on WordPress-specific performance optimization.

To be clear, I’m not saying WP Engine is the best WordPress host for every use case, or even most use cases out there. Whether you’re a non-technical WordPress site owner looking for something simple and low-cost, or an enterprise-level site needing something that scales for Super Bowl-level traffic with commensurate high-touch support, there are lots of great options out there that might be a better fit. (Having been a part of Automattic/WordPress.com/WordPress.com VIP and seeing the incredible investment in scalable infrastructure there, I know the details really matter at those different ends of the spectrum; I still frequently recommend their offerings too.)

But for a WordPress developer or small development team deploying custom theme and plugin code to a high-traffic site and wanting great WordPress-specific tools, systems and people to support them in that, WP Engine really stands out as worth a look.

Monitoring WordPress events and status with a custom API endpoint

Let’s say your WordPress site has some set of custom functionality that is important to the overall operation of the site, and you want to know right away if it’s not working as expected, even if the site is otherwise “up” and working fine. There could be anywhere from zero to many things needing attention at any given time, and you don’t want to receive a flood of emails or Slack pings that you have to sort through; you just want a single alert that things are off track, and another notice when they’re back in good shape.

I recently handled this case using transients, a custom REST API endpoint, and the service UptimeRobot. The context for me was a set of functions that regularly retrieve information from a variety of third-party sources; most of the time it goes fine, but between network issues, changes in third-party API endpoints or HTML source code and other possible errors, occasionally these functions would break and need updating.

First, I established an error function that was called any time some aspect of my site’s custom functionality encountered a problem that might need my attention.

public static function record_event_fetch_failure( $source = null, $message = null ) {

	if ( empty( $source ) ) {
		return false;
	}

	$source         = esc_attr( $source );
	$transient_name = 'event_fetch_failure_' . $source;

	$failure_data = array(
		'count'                => 1,
		'last_error_timestamp' => gmdate( 'Y-m-d H:i:sP', time() ),
		'last_error_message'   => esc_html( $message ),
	);

	// If the transient is already there, update it.
	$event_failure_counter = get_transient( $transient_name );
	if ( false !== $event_failure_counter ) {
		$failure_data['count'] = $event_failure_counter['count'] + 1;
	}
	set_transient( $transient_name, $failure_data, 24 * HOUR_IN_SECONDS );

}

When called, this function increments a counter of errors for the given third-party data source and stores that counter in a transient. As my custom functions run, any failures will be recorded for up to 24 hours. If there are no additional failures to increase the counter and extend the transient expiration time, then the failure data will expire, on the assumption that things are back to normal. (You may need to adjust these assumptions for your use case.)

Then, I create a REST API endpoint on the site that allows me to monitor that failure data externally.

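add_action( 'rest_api_init', function() {
	register_rest_route(
		'mysite/v1',
		'/event_fetch_status',
		array(
			'methods'             => 'GET',
			'callback'            => array( $this, 'mysite_event_fetch_status' ),
			// Public endpoint: WordPress 5.5+ expects an explicit permission callback.
			'permission_callback' => '__return_true',
		)
	);
} );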

And then a callback function to determine the content of that API endpoint:

public static function mysite_event_fetch_status() {

	$error_count = 0;

	foreach ( array( 'facebook', 'eventbrite', 'googlecal' ) as $event_source ) {
		$fail_data = get_transient( 'event_fetch_failure_' . $event_source );
		if ( false !== $fail_data ) {
			$error_count += $fail_data['count'];
		}
	}

	if ( 0 < $error_count ) {
		echo sprintf( 'There have been %d recent event fetch errors.', (int) $error_count );
	} else {
		echo 'OK';
	}
}

Now, I have a REST API endpoint available at https://example.com/wp-json/mysite/v1/event_fetch_status that will either return OK if there have been no recent problems, or an error message with a count of recent issues. I could expand that output to include more detail about which third-party services are having issues and what those issues are, but for the purposes of a red versus green monitoring setup, the basics are fine and I can look into the details when I investigate.
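For a quick manual check of the endpoint from the command line, something like the following sketch could work. The is_healthy and check_endpoint function names are made up for illustration, and the logic simply assumes that any response containing OK means healthy and anything else means trouble.

```shell
#!/bin/bash

# Decide whether a response body from the status endpoint counts as healthy.
# Assumption: a healthy response contains "OK"; the error message does not.
is_healthy() {
        case "$1" in
                *OK*) return 0 ;;
                *)    return 1 ;;
        esac
}

# Fetch the endpoint and report the result via the exit status:
# 0 = healthy, 1 = errors reported, 2 = the request itself failed.
check_endpoint() {
        local body
        body=$(curl --silent --fail --max-time 10 "$1") || return 2
        is_healthy "$body"
}

# Example (URL from the post; swap in your own site):
# check_endpoint "https://example.com/wp-json/mysite/v1/event_fetch_status" && echo "All good"
```

Separating the decision (is_healthy) from the network call (check_endpoint) makes it easy to try the matching rule against sample responses without hitting the site.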

Finally, I set up a monitor in UptimeRobot to check that endpoint on a regular basis and notify me if there’s a problem:

UptimeRobot monitor screenshot

Just for good measure, I also create an admin notice in the WordPress admin area with a little more detail about what is failing:

public static function mysite_event_fetch_admin_notice() {

	$error_count    = 0;
	$error_messages = array();

	foreach ( array( 'facebook', 'eventbrite', 'googlecal' ) as $event_source ) {
		$fail_data = get_transient( 'event_fetch_failure_' . $event_source );
		if ( false !== $fail_data ) {
			$error_count     += $fail_data['count'];
			$error_messages[] = $fail_data['last_error_timestamp'] . ': ' . $fail_data['last_error_message'];
		}
	}

	if ( 0 < $error_count ) {
		echo '<div class="notice notice-warning">';
		echo sprintf( '<p>There have been %d recent event fetch errors.</p>', (int) $error_count );
		echo '<ul>';
		foreach ( $error_messages as $message ) {
			echo sprintf( '<li>%s</li>', esc_html( $message ) );
		}
		echo '</ul>';
		echo '</div>';
	}
}

add_action( 'admin_notices', array( $this, 'mysite_event_fetch_admin_notice' ) );

All put together, I will now receive alerts as configured in UptimeRobot when my custom functions have issues.

You could go the typical route of generating an email or Slack message about each problem, but in my experience this can quickly create a lot of one-off monitoring and alerting configurations in your life, and that can lead to you missing important information or being desensitized to the notices. Instead, I find it’s worth trying to manage all of my time-sensitive notifications across all of my various projects and services in one place where possible, and UptimeRobot or similar services offer a lot of flexibility for that.