Unlocking email content into RSS feeds redux with WordPress and Postie

As a part of some local journalism projects I’m exploring, I wanted to have a way to get information that is being emailed around (press releases, newsletters) into a publicly accessible RSS feed.

I’ve already explored this general “unlock email into an RSS feed” workflow using Zapier, but Zapier’s limitations around translating HTML email messages into useful RSS entries led me to explore other options. For a while now I’ve been using Feedly’s paid feature that lets you receive email at a custom address and puts the content into your feed reading experience, and that’s actually been a good solution for me as an individual. (I made sure to set up an address at a domain I control and aliased that to Feedly’s provided address, in case I want to move to another solution later.)

But if we want to help a given audience have better access to information that’s only available in email but is intended to be public, I don’t think it scales well to ask them all to subscribe to the same email newsletter, or to all sign up for a Feedly paid plan. And yet so many organizations continue to use email as a way to distribute information, often instead of a website, and it doesn’t scale well to beg each of them to start (or go back to?) publishing their updates on a website with an RSS feed.

I started looking at using Mailparser for a more generalized solution. Receive the emails, have Mailparser extract the information into a structured, API-queryable format, and then download that information (including any attachments and images) and put it up on a publicly accessible URL somewhere. I knew I needed a way to organize the information coming in according to the email address of the person who sent it, so I’d have to build a small application that managed that categorization during the publishing process.

And then I realized I was basically getting into CMS territory. Publishing text and media on a website. Organizing content by categories and authors. Searchable and sortable. Yeah, I know a tool that already does all that stuff really well: WordPress.

But was I still going to need to build a glue application to process the emails and create WordPress posts?

I was aware of Jetpack’s post by email feature and I think it could work well for some scenarios, but I wanted something a little more purpose-built. I did a little bit of Googling and found Postie, a WordPress plugin that has great features for bringing emails into WordPress posts. I exclaimed many words of delight upon finding it, and continued to be impressed as I looked through the thoughtful documentation, the developer-oriented options for extending and customizing it, and the active support and maintenance that goes into it. The WordPress community is amazing that way. I sent the developer a donation.

So, here’s the new workflow I would use:

  • Email is sent to an alias at my custom domain, which goes to a free email provider with IMAP access.
  • Every half hour, Postie goes out and checks for new email messages and creates pending WordPress posts.
  • I get a notification in a Slack channel about the WordPress post, and can publish it if appropriate.
  • The WordPress site provides a built-in RSS feed of “emails” as posts on the site.

Amazing! But I still wanted a way to organize the incoming emails and resulting posts based on sender, without using Postie’s default method of creating WordPress users for each sender.

So, I created a custom taxonomy, Sources, with some term meta fields that allow me to associate email addresses and to select whether posts from that source should be pending or published by default. (Of course, email can be forged so it’s never safe to depend on the value of the “From” address in an email to authenticate anything important.)

Here are some code snippets used to accomplish this in a custom WordPress plugin I set up to complement Postie.

First, I defined the custom taxonomy:

// Register Custom Taxonomy
function wciu_source_taxonomy() {
	$labels = array(
		'name'                       => _x( 'Sources', 'Taxonomy General Name', 'wci-updates-functionality' ),
		'singular_name'              => _x( 'Source', 'Taxonomy Singular Name', 'wci-updates-functionality' ),
		'menu_name'                  => __( 'Sources', 'wci-updates-functionality' ),
		'all_items'                  => __( 'All Sources', 'wci-updates-functionality' ),
		'parent_item'                => __( 'Parent Source', 'wci-updates-functionality' ),
		'parent_item_colon'          => __( 'Parent Source:', 'wci-updates-functionality' ),
		'new_item_name'              => __( 'New Source Name', 'wci-updates-functionality' ),
		'add_new_item'               => __( 'Add New Source', 'wci-updates-functionality' ),
		'edit_item'                  => __( 'Edit Source', 'wci-updates-functionality' ),
		'update_item'                => __( 'Update Source', 'wci-updates-functionality' ),
		'view_item'                  => __( 'View Source', 'wci-updates-functionality' ),
		'separate_items_with_commas' => __( 'Separate sources with commas', 'wci-updates-functionality' ),
		'add_or_remove_items'        => __( 'Add or remove sources', 'wci-updates-functionality' ),
		'choose_from_most_used'      => __( 'Choose from the most used', 'wci-updates-functionality' ),
		'popular_items'              => __( 'Popular sources', 'wci-updates-functionality' ),
		'search_items'               => __( 'Search Sources', 'wci-updates-functionality' ),
		'not_found'                  => __( 'Not Found', 'wci-updates-functionality' ),
		'no_terms'                   => __( 'No sources', 'wci-updates-functionality' ),
		'items_list'                 => __( 'Sources list', 'wci-updates-functionality' ),
		'items_list_navigation'      => __( 'Sources list navigation', 'wci-updates-functionality' ),
	);

	$rewrite = array(
		'slug'         => 'source',
		'with_front'   => true,
		'hierarchical' => false,
	);

	$args = array(
		'labels'            => $labels,
		'hierarchical'      => false,
		'public'            => true,
		'show_ui'           => true,
		'show_admin_column' => true,
		'show_in_nav_menus' => false,
		'show_tagcloud'     => true,
		'rewrite'           => $rewrite,
		'show_in_rest'      => true,
	);

	register_taxonomy( 'wciu_source', array( 'post' ), $args );

}
add_action( 'init', 'wciu_source_taxonomy', 0 );

Then, I used Fieldmanager (already installed and activated elsewhere) to add some custom fields to that taxonomy:

// Add source default status and email fields to taxonomy
function wciu_source_taxonomy_meta_box() {
	if ( class_exists( 'Fieldmanager_Group' ) && is_admin() ) {
		$tax_meta = new Fieldmanager_Group(
			array(
				'name'           => 'wciu',
				'serialize_data' => false,
				'children'       => array(
					'email'               => new Fieldmanager_TextField(
						array(
							'label'          => 'Email Address',
							'add_more_label' => 'Add another email',
							'serialize_data' => false,
							'limit'          => 0,
						),
					),
					'default_post_status' => new Fieldmanager_Select(
						array(
							'name'          => 'default_post_status',
							'label'         => 'Default Post Status',
							'description'   => 'Should new content from this source be pending review or published by default?',
							'options'       => array(
								'pending' => 'Pending Review',
								'publish' => 'Published',
							),
							'default_value' => 'pending',
						),
					),
				),
			)
		);
		$tax_meta->add_term_meta_box( 'WCI Source Data', array( 'wciu_source' ) );
	}
}
add_action( 'fm_term_wciu_source', 'wciu_source_taxonomy_meta_box' );

Now when I edit a Source term, I can set up all that useful info:

Example Source term edit screen with new custom fields

For some reason, Postie does not store any metadata about the originating email in the post meta, so I decided to add that myself using the filters and actions the developer made available:

// Add the original email author as post meta
function wciu_add_email_to_post_meta( $post_details ) {
	// This should be done via an action hook, not a filter, but Postie doesn't offer one here.
	if ( ! empty( $post_details['email_author'] ) ) {
		add_post_meta( $post_details['ID'], 'wciu_from_email_address', $post_details['email_author'] );
	}

	return $post_details;
}
add_filter( 'postie_post_before', 'wciu_add_email_to_post_meta' );

That creates a post meta field wciu_from_email_address attached to the post that can then be used later to associate the post with one of the Sources in my custom taxonomy. I can also use the custom taxonomy’s selected default publish status to change the post’s status if need be:

// If the post came from an email address we know, associate them and maybe update the post's status
function wciu_maybe_associate_post_with_source( $post ) {
	// See if the post has an email author
	$email_address = get_post_meta( $post['ID'], 'wciu_from_email_address', true );

	if ( ! empty( $email_address ) ) {
		// Get the first source from the source taxonomy with this email address
		$term_args = array(
			'taxonomy'     => 'wciu_source',
			'meta_key'     => 'wciu_email',
			'meta_value'   => $email_address,
			'meta_compare' => 'LIKE',
			'hide_empty'   => false,
			'number'       => 1,
		);

		$term_query = new WP_Term_Query( $term_args );

		// If we found a source with that email author, associate.
		// WP_Term_Query never returns a WP_Error, so check for an empty result set instead.
		if ( ! empty( $term_query->terms ) ) {
			$source = $term_query->terms[0];
			if ( ! empty( $source->slug ) ) {
				wp_set_post_terms( $post['ID'], array( $source->slug ), 'wciu_source' );

				// Update the post's status if necessary
				$intended_post_status = get_term_meta( $source->term_id, 'wciu_default_post_status', true );
				if ( ! empty( $intended_post_status ) && $post['post_status'] !== $intended_post_status ) {
					wp_update_post(
						array(
							'ID'          => $post['ID'],
							'post_status' => $intended_post_status,
						),
					);
				}
			}
		}
	}
}
add_action( 'postie_post_after', 'wciu_maybe_associate_post_with_source' );

Note that for this to work, you have to configure Postie to allow mail from any sender, not just known WordPress users.
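
One caveat with the lookup above: a LIKE meta comparison is a substring match, so a sender of bob@example.com would also match a Source whose stored address is jbob@example.com. Here is a minimal sketch of a stricter check, assuming you have the raw “From” value and a Source’s list of addresses in hand. The function names wciu_normalize_sender() and wciu_sender_matches_source() are hypothetical and not part of Postie or the snippets above.

```php
<?php
/**
 * Hypothetical helper: reduce a raw "From" value to a bare, lowercased address.
 * Returns null when no syntactically valid address can be found.
 */
function wciu_normalize_sender( string $raw ): ?string {
	// Pull the address out of a "Display Name <user@example.com>" style value.
	if ( preg_match( '/<([^<>]+)>/', $raw, $matches ) ) {
		$raw = $matches[1];
	}

	$candidate = strtolower( trim( $raw ) );

	// filter_var() returns the address on success and false on failure.
	$valid = filter_var( $candidate, FILTER_VALIDATE_EMAIL );

	return false === $valid ? null : $valid;
}

/**
 * Hypothetical helper: exact (not substring) comparison of a sender against
 * the list of addresses stored for one Source.
 */
function wciu_sender_matches_source( string $raw_sender, array $source_emails ): bool {
	$sender = wciu_normalize_sender( $raw_sender );
	if ( null === $sender ) {
		return false;
	}

	foreach ( $source_emails as $stored ) {
		if ( wciu_normalize_sender( (string) $stored ) === $sender ) {
			return true;
		}
	}

	return false;
}
```

Inside the plugin, the list of addresses could come from get_term_meta( $source->term_id, 'wciu_email' ), assuming Fieldmanager stores each address as its own term meta row. This only tightens the matching; it does nothing to prove the sender is who the “From” address claims.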

Finally, I found that it was helpful to add information about the email sender to the post editor, for the case where an email came from a new sender and I might need to create a new Source for it:

// Add sender info to post editor
function wciu_add_info_to_submitbox( $post ) {
	$email = get_post_meta( $post->ID, 'wciu_from_email_address', true );
	if ( ! empty( $email ) ) {
		// The address is printed as HTML text, so escape it with esc_html().
		printf(
			'<div class="misc-pub-section misc-pub-uploadedby">From <strong>%s</strong></div>',
			esc_html( $email )
		);
	}

}
add_action( 'post_submitbox_misc_actions', 'wciu_add_info_to_submitbox' );

Now, whenever someone sends an email to the designated address, if they are a known sender, the resulting WordPress post is tagged with the associated Source. And for each Source I create on this website, I get a nice RSS feed ready to go at a URI like /source/community-member/feed/. I can then pull those emails/posts into other applications or feed readers as needed. I can choose to link back to the WordPress site’s full display of the email contents, or just use the content from the RSS feed directly.
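
To illustrate the “pull into other applications” part, here is a rough sketch of reading one of those per-Source feeds with PHP’s SimpleXML extension. It assumes a standard RSS 2.0 document like the one WordPress generates; the function name wciu_parse_feed_items() is hypothetical, and fetching the XML is left to the caller.

```php
<?php
/**
 * Hypothetical helper: turn an RSS 2.0 document into a simple list of items.
 * Returns an empty array when the input is not parseable RSS.
 */
function wciu_parse_feed_items( string $rss_xml ): array {
	// Collect XML parse errors quietly instead of emitting PHP warnings.
	$previous = libxml_use_internal_errors( true );
	$feed     = simplexml_load_string( $rss_xml );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	if ( false === $feed || ! isset( $feed->channel ) ) {
		return array();
	}

	$items = array();
	foreach ( $feed->channel->item as $item ) {
		$items[] = array(
			'title' => (string) $item->title,
			'link'  => (string) $item->link,
			// Empty string when the item has no pubDate element.
			'date'  => (string) $item->pubDate,
		);
	}

	return $items;
}
```

The XML string could come from any HTTP client pointed at a URL like the /source/community-member/feed/ example above. Each returned item links back to the full post on the WordPress site.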

Here’s an example emailed post on my local dev site:

(Again, this is all a little bit dangerous given that we’re talking about taking email from arbitrary senders and possibly publishing it on a website. Malicious senders could use malicious attachments or inappropriate content to ruin your day. Take precautions accordingly.)

I could (and may) extend this application by syncing the Source data from some other application via API, so that I’m not managing that data and those email addresses manually in WordPress.
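
I haven’t built that sync yet, but one way to approach it would be to first work out what needs to change before touching any terms. This is only a sketch under assumptions: the record shape (a list of emails plus a default_post_status, keyed by Source slug) and the function name wciu_plan_source_sync() are made up for illustration, and the external API itself is not shown.

```php
<?php
/**
 * Hypothetical planner: compare Source records from an external system with
 * the ones already stored, keyed by slug, and report what needs to change.
 *
 * Each record is assumed to look like:
 *   array( 'emails' => string[], 'default_post_status' => 'pending' or 'publish' )
 */
function wciu_plan_source_sync( array $remote, array $local ): array {
	// Compare email lists regardless of case, order, whitespace, or duplicates.
	$normalize = static function ( array $emails ): array {
		$emails = array_unique( array_map( 'strtolower', array_map( 'trim', $emails ) ) );
		sort( $emails );
		return $emails;
	};

	$plan = array(
		'create'    => array(),
		'update'    => array(),
		'unchanged' => array(),
	);

	foreach ( $remote as $slug => $record ) {
		if ( ! isset( $local[ $slug ] ) ) {
			$plan['create'][] = $slug;
			continue;
		}

		$same_emails = $normalize( $record['emails'] ) === $normalize( $local[ $slug ]['emails'] );
		$same_status = $record['default_post_status'] === $local[ $slug ]['default_post_status'];

		$plan[ ( $same_emails && $same_status ) ? 'unchanged' : 'update' ][] = $slug;
	}

	return $plan;
}
```

In WordPress, the “create” list could then be passed to wp_insert_term() and the “update” list to update_term_meta(). Sources that exist locally but not remotely are deliberately left alone here.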

This took some figuring out, but I’m really happy with the result so far. Unlocking email content into a structured, programmatically accessible format like RSS is a real need for newsrooms, reporters and others, so I’m excited to have a solution that is relatively repeatable and scalable, and that doesn’t depend on any one proprietary service or tool.

As always, questions and feedback are welcome!

Published by

Chris Hardie

I'm a deep generalist, currently focused on software engineering + writing + the open web.