© 2008-23 SmartData Collective. All Rights Reserved.

Predicting the next Viral Tweet

ThemosKalafatis

It is time to use Twitter data for another purpose: can predictive analytics be used to identify which tweets have an increased probability of becoming viral?


First we have to define the problem and see what information we should consider. Every tweet has an author and a piece of content, and is posted on a specific day and time. More specifically, for every tweet we can collect usage data such as:

  • Day of post
  • Time of post
  • Elapsed minutes since the tweet was posted
  • Author of the tweet (Twitter username)
  • Number of followers of the author

and also content information such as:

  • Subject of post
  • Whether the tweet involves a question being asked
  • Whether the tweet contains hashtags
  • Whether the tweet contains a “Please Re-Tweet” directive (or variants)
  • Whether a user is mentioned
  • The text of the tweet itself.
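
The content flags in the list above can be derived from the raw tweet text. The following is a minimal Python sketch, not the method used for this post: the regular expressions, the function name `content_flags` and the "Please Re-Tweet" variants it recognizes are all assumptions made for illustration.

```python
import re

# Hypothetical patterns: the post does not say how these flags were derived.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
# Assumed variants of the "Please Re-Tweet" directive (please/pls/plz + retweet/re-tweet/RT).
PLEASE_RT = re.compile(r"\b(?:please|pls|plz)\s+(?:re-?tweet|rt)\b", re.IGNORECASE)


def content_flags(text: str) -> dict:
    """Return simple yes/no content features for one tweet."""
    return {
        "has_question": "?" in text,                 # crude proxy for "a question is asked"
        "has_hashtag": bool(HASHTAG.search(text)),
        "has_please_rt": bool(PLEASE_RT.search(text)),
        "has_mention": bool(MENTION.search(text)),
    }
```

For example, `content_flags("Please RT: who is going to #strata? cc @someone")` sets all four flags, while a plain sentence sets none. The subject of the post is not covered here, since it needs more than pattern matching.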

Our goal then is to combine the information mentioned above and come up with a predictive model that, given an author, day, time of post and text of a tweet, can tell us whether that tweet has an increased probability of becoming viral.



For this data and text mining exercise (keeping in mind that the tweets were sampled from one website and not from Twitter itself), let’s define what a viral tweet is. After collecting approximately 8,000 tweets from dailyrt.com, the median number of retweets was found to be 17. Here we assume that a tweet is viral if it exceeds 30 retweets (and this specific assumption actually makes the classification task much easier).
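
The labeling rule described above is simple enough to write down directly. This is a small Python sketch; the threshold of 30 comes from the text, while the function name `is_viral` and the retweet counts are made up for illustration and are not the dailyrt.com sample.

```python
from statistics import median

VIRAL_THRESHOLD = 30  # from the post: a tweet with more than 30 retweets counts as viral


def is_viral(retweets: int, threshold: int = VIRAL_THRESHOLD) -> bool:
    """Label a tweet as viral if its retweet count exceeds the threshold."""
    return retweets > threshold


# Made-up retweet counts, for illustration only.
sample_retweets = [3, 12, 17, 25, 31, 90]
labels = [is_viral(n) for n in sample_retweets]
sample_median = median(sample_retweets)  # the same statistic the post reports as 17 for its data
```

Note that a tweet with exactly 30 retweets is not labeled viral under this reading of "exceeds".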

As discussed above, usage data tell us nothing about the content of a tweet. They tell us the name of the author, the number of his/her followers, when the tweet was posted and how many minutes have elapsed since then. Can this information alone predict whether a tweet will become viral? A data mining model (built without using the elapsed time as an input field) predicted whether a tweet would be viral with an overall accuracy of 75.03% and, perhaps as expected, showed that the most important factor in making a tweet viral is its author. Running a process called Feature Selection tells us just that:

[Figure: Feature Selection results for the usage data]
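
To illustrate how a feature-ranking step of this kind can work, here is a minimal Python sketch that scores categorical fields by information gain against the viral label. This is an assumption: the post does not say which Feature Selection method or tool was used, and the rows below are toy data, not the collected tweets.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting the rows by a categorical field."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy rows, for illustration only: here the author fully determines the label.
rows = [
    {"author": "a", "day": "Mon", "viral": True},
    {"author": "a", "day": "Tue", "viral": True},
    {"author": "b", "day": "Mon", "viral": False},
    {"author": "b", "day": "Tue", "viral": False},
]
labels = [r["viral"] for r in rows]
ranking = sorted(
    ((information_gain([r[f] for r in rows], labels), f) for f in ("author", "day")),
    reverse=True,
)
```

In this toy data `ranking` puts `author` first and `day` last. Information gain tends to favor fields with many distinct values, such as usernames, so a real ranking would deserve a check against held-out data.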


But what we have seen so far tells us only one side of the story: the data mining side. With text mining we can see the importance of both words and authors. To do that, each author is appended to the end of his/her tweets, so essentially the author becomes part of each tweet's text. Here is what Feature Selection tells us:

[Figure: Feature Selection results for the tweet text with the author appended]
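
The preprocessing step of appending the author to the tweet text might look like the following Python sketch. The tokenizer, the `@` prefix on the author token and the function names are assumptions made for illustration; the post does not describe its text mining pipeline in this detail.

```python
import re
from collections import Counter

# Hypothetical tokenizer: words, plus #hashtags and @mentions kept as single tokens.
TOKEN = re.compile(r"[@#]?\w+")


def tweet_tokens(text: str, author: str) -> list:
    """Tokenize a tweet and append its author as one extra token."""
    tokens = [t.lower() for t in TOKEN.findall(text)]
    tokens.append("@" + author.lstrip("@").lower())
    return tokens


def document_frequency(tweets) -> Counter:
    """Count in how many tweets each token (word or author) appears."""
    df = Counter()
    for text, author in tweets:
        df.update(set(tweet_tokens(text, author)))
    return df
```

With this representation, words and authors end up in the same feature space, so a single ranking can compare a term such as `jackson` with an author token such as `@mashable`.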

A tweet mentioning Michael Jackson has a high probability of becoming viral, but perhaps it should also be posted by a popular author to make a greater impact. Note also that @mashable and @theonion are at the top of the feature selection list shown above.

The difficult – but also interesting – task is to predict a viral tweet that has an impact not because of its author but because of its content. To do this, the methodology of data collection and analysis differs significantly.

In the next post we will see a model predicting viral tweets in action: we will submit several tweets with their authors, and the model will tell us the probability of each submitted tweet becoming viral.
