How To Monitor a Forum for Keywords Using Python and AWS Lambda

by Kevin (@sahin.kevin), March 5th, 2020

Too Long; Didn't Read

Using the Serverless framework, we can quickly create a CRON job with AWS Lambda and Python to check for keywords in a forum topic. We use the very popular Python packages Requests and BeautifulSoup to fetch and parse the HTML code, and we monitor keywords on IndieHackers.com, a popular forum for bootstrapped founders. When a title matches, we send a Slack notification, which only requires creating a Slack app to get a webhook URL. A single deployment command then puts the function online.

While building my product, I check different forums every day to help people with web scraping questions and engage with the community. This is very common for early-stage startups. There are many benefits to engaging with potential customers by answering their questions. First, you get to know them better, which can give you ideas for product development.

Second, you provide value, which makes them trust you.

Some forums have a way to send you alerts about keywords or tags, others don't. Today we are going to see how you can quickly create a CRON job with AWS Lambda and Python to check for keywords in a forum topic.

Prerequisites

In order to scaffold and deploy our project to AWS Lambda, we will use the Serverless framework. It's a great project that makes building and configuring your cloud functions really easy with a simple configuration file. It handles many different clouds (AWS, Google Cloud, Azure...) and different languages. Installation instructions are available in the Serverless documentation.
We will use the very popular Python packages Requests and BeautifulSoup to fetch and parse the HTML code:
 pip install requests
 pip install beautifulsoup4
 pip freeze > requirements.txt
Without the Serverless framework, we would need to package the dependencies into a Zip and upload it to AWS ourselves. Thanks to Serverless, we can use a plugin, serverless-python-requirements, that parses the requirements.txt file and automatically takes care of packaging the dependencies for Lambda. To do so:
 npm init
 npm install --save serverless-python-requirements
Accept all the defaults of npm init, then add this to your serverless.yml:

# serverless.yml
plugins:
  - serverless-python-requirements
custom:
  pythonRequirements:
    dockerizePip: non-linux
You can find more information about this in the documentation of the serverless-python-requirements plugin.

Keyword monitoring

We are going to monitor keywords on IndieHackers.com, a popular forum for bootstrapped founders.
Here is a simple function that checks all titles for "design":
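from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def hello(event, context):
    # The scheme is required, otherwise Requests raises MissingSchema.
    base_url = "https://www.indiehackers.com/"
    r = requests.get(base_url, timeout=10)
    soup = BeautifulSoup(r.text, 'html.parser')

    # Each topic title in the feed is a link with this CSS class.
    matches = soup.select('a.feed-item__title-link')
    keyword = 'design'
    matching_links = []
    for link in matches:
        # Compare in lowercase so that "Design" also matches.
        if keyword in link.text.lower():
            # urljoin avoids a double slash when href starts with "/".
            matching_links.append(urljoin(base_url, link.get('href')))

    response = {
        "statusCode": 200,
        "body": matching_links
    }

    return response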
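The handler above looks for a single hard-coded keyword. As a minimal sketch, assuming you want to watch several keywords at once, the matching step can be pulled out into a small pure function that is easy to test without hitting the network. The names `find_matches` and `KEYWORDS` are hypothetical and not part of the handler above:

```python
# Hypothetical list of keywords to watch; adjust it to your own needs.
KEYWORDS = ["design", "scraping"]


def find_matches(titles_and_links, keywords=KEYWORDS):
    """Return the (title, link) pairs whose title contains any keyword.

    titles_and_links is an iterable of (title, link) tuples, for example
    built from the anchors returned by soup.select(...) in the handler.
    """
    lowered = [k.lower() for k in keywords]
    return [
        (title, link)
        for title, link in titles_and_links
        # Case-insensitive substring match against every keyword.
        if any(k in title.lower() for k in lowered)
    ]
```

Inside the handler, the input could be built with something like `[(a.text, a.get('href')) for a in matches]`.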
Now all we need to do is send a Slack notification (or an email) when something matches our keyword. It's really easy with Slack: you just have to create an app to get a webhook URL, as explained in the Slack documentation on incoming webhooks.
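# WEBHOOK_URL is the incoming webhook URL of your Slack app.
if matching_links:
    payload = {"text": f"Found a topic matching the keyword on Indie Hackers: {matching_links}"}
    # json= serializes the payload and sets the Content-Type header.
    slack_request = requests.post(WEBHOOK_URL, json=payload)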
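As a sketch of how the notification could be made a bit more readable, the message can be built by a helper that lists one link per line and returns nothing when there is no match, so the caller can skip the request. `build_slack_payload` is a hypothetical helper name; the only Slack-specific assumption is the `text` field already used above:

```python
def build_slack_payload(matching_links, keyword):
    """Build the JSON body for a Slack incoming webhook, or None if no match."""
    if not matching_links:
        return None
    lines = [f'Found {len(matching_links)} topic(s) matching "{keyword}" on Indie Hackers:']
    # One link per line keeps the message readable in Slack.
    lines.extend(f"- {link}" for link in matching_links)
    return {"text": "\n".join(lines)}
```

The result can then be passed to `requests.post(WEBHOOK_URL, json=...)`, but only when it is not `None`.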
And voilà, that's it for the code.

Deployment & invocation

To invoke your function once it is deployed:
serverless invoke -f hello --log
To automate the function invocation with a CRON job, add a schedule event to the function in serverless.yml:
functions:
  hello:
    handler: handler.hello
    events:
      - schedule: rate(1 day)
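The `rate(1 day)` value above is an AWS rate expression: a positive integer followed by a unit, where the unit is singular when the value is 1 and plural otherwise. A small sketch of that rule, with `rate_expression` as a hypothetical helper that is not part of Serverless or of any AWS tooling:

```python
VALID_UNITS = ("minute", "hour", "day")


def rate_expression(value, unit):
    """Return an AWS rate expression such as 'rate(1 day)' or 'rate(5 minutes)'."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in VALID_UNITS:
        raise ValueError(f"unit must be one of {VALID_UNITS}")
    # AWS expects the singular unit for 1 and the plural unit otherwise.
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"
```

For instance, checking the forum every six hours instead of once a day would mean writing `rate(6 hours)` in the schedule event.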
There are different ways to write schedule expressions with AWS (rate expressions and cron expressions); the AWS documentation describes them in detail.
And now the deployment command:
serverless deploy
And that's it, easy right? I hope you liked this article. It was a short introduction to the Serverless framework and how easy it is to build simple utility scripts like this. If you like web scraping, I just wrote another article about it, don't hesitate to take a look. Stay tuned for other blog posts about web scraping :)