Twitter is a holy grail for data and behavioral science. Data from Twitter can be biased, but it is still a good way to see how the masses react or how a chain of events unfolds. Since almost all the data you share on Twitter is publicly available (within API limits), we can run a lot of analysis on it. Twitter limits the date range of data you can extract, but it lets you stream live tweets, which we can store for
future use.
In this tutorial, we will connect to the Twitter API, get the live Twitter feed, and store it in a SQLite database. We will be using SQLAlchemy, so you can adapt this to the database of your choice.
Let's get started.
Sign up on the Twitter developer platform and create a new app. Once the app is created, get the four keys from the app details, i.e. "customer_key", "customer_secret", "access_token", and "access_secret" (the first two are Twitter's consumer key and consumer secret), and store them in a config file as a dict.
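As a rough sketch, the config file could look like the one below. The dict name `twitter_keys` and its four key names are the ones used in this post; reading the values from environment variables, and the variable names `TWITTER_CONSUMER_KEY` and so on, are only assumptions made here to keep secrets out of the source file.

```python
# config.py (hypothetical layout; only the dict name and its keys come from this post)
import os

# The environment variable names below are assumptions, pick your own.
# You could also paste the key strings directly, but then keep this file
# out of version control.
twitter_keys = {
    "customer_key": os.environ.get("TWITTER_CONSUMER_KEY", ""),
    "customer_secret": os.environ.get("TWITTER_CONSUMER_SECRET", ""),
    "access_token": os.environ.get("TWITTER_ACCESS_TOKEN", ""),
    "access_secret": os.environ.get("TWITTER_ACCESS_SECRET", ""),
}
```

Any script in the same folder can then load the keys with `from config import twitter_keys`.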
Here is the code to get the live feed and store it in a SQLite database:
"""Get a live Twitter stream and store it in a SQLite database."""
import logging
import os

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener  # available in tweepy 3.x

from config import twitter_keys  # custom config file which has the keys as a dict

# Twitter keys
access_token = twitter_keys['access_token']
access_token_secret = twitter_keys['access_secret']
consumer_key = twitter_keys['customer_key']
consumer_secret = twitter_keys['customer_secret']

# make sure the output folder exists before logging or creating the database
os.makedirs("data", exist_ok=True)

# set logger
logging.basicConfig(filename="data/log.log",
                    level=logging.DEBUG,
                    format="%(asctime)s:%(levelname)s:%(message)s")

# database connection
Base = declarative_base()


class Pythondb(Base):
    __tablename__ = 'pythondb'
    # we will only store the following columns
    id = Column(Integer, primary_key=True)
    user_name = Column(String(300))
    status = Column(String(300))
    created_at = Column(String(35))
    fav_count = Column(Integer)


# create an engine that stores data in the local data directory
engine = create_engine('sqlite:///data/twitter_db.db')
# create the table and make a session
Base.metadata.create_all(engine)
DBSession = sessionmaker(bind=engine)
session = DBSession()


# make stream listener class
class StdOutListener(StreamListener):
    def __init__(self, no_of_tweet, session):
        """Stop the stream after no_of_tweet tweets have been stored."""
        super().__init__()
        self.no_of_tweet = no_of_tweet
        self.count = 0
        self.session = session

    def on_status(self, status):
        if self.count < self.no_of_tweet:
            new_data = Pythondb(user_name=status.author.screen_name,
                                status=status.text,
                                # created_at is a datetime; the column is a string
                                created_at=str(status.created_at),
                                fav_count=status.favorite_count)
            self.session.add(new_data)
            self.count += 1
            logging.debug("added {}".format(self.count))
        else:
            # limit reached: save everything and stop the stream
            self.session.commit()
            self.session.close()
            return False

    def on_error(self, status):
        print(status)
        return False


# set up streams
l = StdOutListener(100, session)
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
stream = Stream(auth, l)
# capture data by the keywords: 'python', 'javascript', 'ruby'
# ("async" is a reserved word since Python 3.7; tweepy 3.7+ uses is_async)
stream.filter(track=['python', 'javascript', 'ruby'], is_async=True)
You can use this data to build a dashboard or run an analysis.
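To peek at the first five rows stored in twitter_db, something like the sketch below should work. It uses Python's built-in `sqlite3` module instead of SQLAlchemy just to keep it short; the helper name `first_rows` is made up for this example, while the table and column names are the ones from the `Pythondb` class above.

```python
import os
import sqlite3


def first_rows(db_path, limit=5):
    """Return the first `limit` rows of the pythondb table as tuples."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT id, user_name, status, created_at, fav_count "
            "FROM pythondb ORDER BY id LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    db_path = "data/twitter_db.db"  # path used by the streaming script
    if os.path.exists(db_path):
        for row in first_rows(db_path):
            print(row)
```

From here the rows can go straight into whatever tool you use for the dashboard or the analysis.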
A quote from the book I am reading:
“People love to say, “Give a man a fish, and he’ll eat for a day. Teach a man to fish, and he’ll eat for a lifetime.” What they don’t say is, “And it would be nice if you gave him a fishing rod.” That’s the part of the analogy that’s missing.”― Trevor Noah, Born a Crime: Stories From a South African Childhood