Over the few months I have been in the process of building a Data Science online school initially focusing on a rigorous introductory course covering modules ranging from linear algebra to SQL and programming. Without turning this into a 'marketing shill'.
I decided to build a fairly simple WhatsApp chatbot that would take in input and return some output. With the main purpose being sending users programming and statistics problems in order to help them better develop their coding and stats skills. In addition to this, opening up a Slack channel where myself or mentors (Data Scientists in this context) could respond to coding issues raised by different learners. After doing some research, I learnt that Twilio released a Twilio WhatsApp API sometime ago. I opted to build a fairly basic flask chatbot using the Twilio WhatsApp API that would allow users to send some keywords, based on the keyword sent, reply with some output. To put it practically, if someone typed the keywords 'python easy', they would receive a random Python challenge that has been categorized as such from a list of challenges I crawled from different sources. The number and keyword(s) would then be sent to a relational database in order to better understand each user's behavior better and possibly tailor questions based on the input receive from different users.Ok, so let's get down to the fun part. I opted to use to act as the interface to my database. The documentation is very well written, making it a lot easier to pick up quickly.The most important part, atleast for the purposes of my use case, is connecting to the SQL database.
app.config['SQLALCHEMY_DATABASE_URI'] = 'dbms_name://user_name:password@db_host:port/db_name'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
From what I read I decided to disable
SQLALCHEMY_TRACK_MODIFICATIONS
to avoid getting the warning message I kept getting. This is a Flask-SQLAlchemy's event notification system that tracks any changes to the SQLAlchemy session and takes extra system resources.Based on the database I had created, I then created a class containing variables that would act as a representation of the columns in the table I created.
class table_name(db.Model):
__tablename__ = 'table_name'
user_ids = db.Column(db.Integer, primary_key = True)
date_col = db.Column(db.DateTime, default = datetime.now(), nullable=False)
key_words = db.Column(db.String(64000))
counter = db.Column(db.Integer())
From my research it is convention for the class to have the same name as that of your table. I then specified the datatypes for each variable, adding a default value in the instance where I deemed that fit. For example in my instance I thought it would make sense for the timestamp to have the default value of the current time particularly since this is recording the time of a given interaction.The next step was building the actual flask app. There are quite a number of tutorials on building apps on that are better written. The most important starting point was defining a /bot endpoint that would listen to each incoming message from a given user that is received by Twilio.
@app.route('/bot', methods=['POST'])
def bot():
With the help of the documentation I used the requests library to retrieve the number and input message from the sender.
incoming_msg = request.values.get('Body', '').lower()
#extract number
number = request.values.get('From', '')
I extracted the input message to check for certain keywords and to send the input message to my database. I also collected the sender's number to keep track of each person's interaction with the app.Each number contained some non numerical values. In order to clean this number for my database I used a regular expression operation to only include numerical values.
cleaned_number = re.sub('[^0-9]', '', number)
Limiting number of requests per 24 hour period
I wanted to cap the number of messages a user could send in a given day to limit the costs I might incur. Each user would only be able to request 5 challenges in a 24 hour period. However, not all requests are equal. The limit would have to exclude help requests for example. In this instance I only created three types of responses. One that would require reading some text and returning a random line from the text. For example if someone types in python easy they would get a random line from the text with the list of 'easy python challenges'. With this type of interaction I would also send the value 1 to the counter column in my table.
def action_control(file_path, incoming_msg):
file = urllib.request.urlopen(file_path)
full_text = [line.decode("utf-8").replace('\n', '') for line in file]
chall = random.choice(full_text)
challenge = ''.join(map(str, chall))
try:
msg.body(challenge)
user_object = table_name()
user_object.user_id = int(cleaned_number)
user_object.key_words = incoming_msg
user_object.counter = 1
db.session.add(user_object)
db.session.commit()
except:
error = 'Sorry, we ran into a mistake somewhere, this will not be added as an attempt. Please try again'
msg.body(error)
user_object = table_name()
user_object.user_ids = int(cleaned_number)
user_object.key_words = incoming_msg
user_object.counter = 0
db.session.add(user_object)
db.session.commit()
The second type of response would be a response where I would send the value 0 to the counter column, following a relatively similar logic with a few changes to factor in reading from a different file type. The third would be a message to factor in a scenario where a user types a keyword that is not recognized by the bot. Instead of rewriting this logic in multiple if statements it made a lot more sense to just create functions I could call in different if statements. For each interaction were the value 1 would be sent to the counter column I added an 'overarching' if statement that would check the total number of interactions a person had made in the last 24 hours before either sending the random challenge or a message indicating that the user had reached his/her limit. In order to check the total number of interactions I used flask-sqlalchemy to query from my database.
total_interactions = Users.query.filter(table_name.user_id == cleaned_number).\
filter(table_name.date_col <= current).\
filter(table_name.date_col >= start).\
filter(table_name.counter == 1).count()
Note: I am using a different table and column names than what I actually used just to make it easier to follow. In this scenario I used user_id to represent the column with the sender's number. This is equivalent to filtering for user_ids where the phone number is the same as the current sender's number, additionally filtering where the date of the interaction is between 23 hours 59 minutes ago and now and the counter is equivalent to 1 (instances were the user made a request that would add to their counter). I would then return the total count of the rows returned from this 'script'.
Lastly, in order to deploy the app to a server on the cloud I downloaded which would run on top of my flask ask created the requirements from all the libraries I used in my virtual environment using and voila.
If you are keen to either help guide people using the app to solve random stats and programming challenges or are keen to have access to the bot feel free to reach out @ [email protected]. or just fill in the access form:
Would be keen to also get any feedback on the bot itself, and potential improvements.