Tag Archives: Data

Ethics and One Tweet’s Impact on Digital Ocean

I am speaking at HostCamp (side event to WordCamp Europe) in a couple weeks on the topic of Ethics in WordPress Hosting. I'm not really sure WordPress hosting has any specific differences from web hosting in general when talking about ethics. But ethical behavior in the web hosting space is something I talk about a lot. I also aggressively call out people/companies behaving unethically on this blog in the web hosting space. [1] [2] [3] [4]

As I was writing a response to a short interview to introduce the topic, I tried to think about a relevant example of why ethics matter in web hosting. A very recent event jumped to mind: someone tweeting that Digital Ocean [Reviews] shut down their company.

This tweet was sent by @w3Nicolas.

The stats are staggering:

That's only in the communities I participate in; I was sent the link by multiple people in other groups as well. I'm sure tens of thousands of people, if not more, read about this incident.

 

This is a view into what that tweet did to Digital Ocean's data here on Review Signal (I track Twitter data and sentiment about web hosting companies for the unfamiliar). I pulled the past 30 days of Digital Ocean information.

The tweet was sent on May 31, the 4th data point. We see an enormous jump in tweet volume. The preceding days had an average of 248 tweets per day. May 31 had 2000 and June 1 had 2489 tweets, nearly 10X the normal volume for two days. By June 4, we're down to 274 tweets, a normal volume. The internet outrage machine was out in force and spreading the word.

Digital Ocean responded on Twitter with Moisey Uretsky, a cofounder, intervening to escalate and resolve the issue. Digital Ocean also released a post-mortem on June 4 about what happened as promised (Nice to see a company keep their word and admit mistakes).

What does this have to do with Ethics?

Why did I even write this story and what does it have to do with ethics? The question I was trying to answer when I started thinking about this incident and digging into the data is "Why should hosting companies and those who do business with them care about ethics?"

A lot of developers and entrepreneurs read a story about a guy who was shut down without warning, and then locked out seemingly permanently without being treated fairly. It strikes a chord with people when someone is being treated wrongly/badly with no explanation, especially when it's their livelihood that is impacted. It violates a fundamental moral code of fairness and trust.

The impact for a perceived ethical violation in this case was tens of thousands of people reading a negative story. It generated heated discussions and some very negative comments.

My data showed a tremendous increase in negative messages, with the positive ratio dropping to 34% (Digital Ocean has historically had over 70% positive messages).

They were quick to jump into some of the communities and address the issue. The post-mortem on Twitter received 225 Likes and 62 Retweets. That's 2.4% of the retweets and 4.9% of the Likes that the original tweet received. Addressing the issue and trying to improve had only a tiny fraction of the impact.

I will be clear here: I don't think Digital Ocean acted maliciously or unethically (intentionally). It sounds like a combination of an automated system and a couple of human mistakes led to a very bad outcome for a customer that attracted a lot of attention. The way it was portrayed evoked feelings of an ethical violation of fairness and trust.

Digital Ocean's post-mortem's conclusion:

We wanted to share the specific details around this incident as accurately and quickly as possible to give the community insight into what happened and how we handled it. We recognize the impact this had on a customer, and how this represented a breach of trust for the community, and for that we are deeply sorry. We have a number of takeaways to improve the technical, process, and people missteps that led to this failure. The entire team at DigitalOcean values and remains committed to the global community of developers.

So when companies think about how they should behave, I want to use this example as an argument that people do care about companies behaving ethically and awareness of their behavior can quickly be amplified when a person's story resonates.

Uncovering the Rose Hosting Spam Network on Quora

Welcome back to Dirty, Slimy, Shady Secrets of the Web Hosting Review (Under) World - Episode 3! Read Episode 1 | Episode 2

Today's post features Rose Hosting, who I refuse to link to because their whole business model seems to involve comment spamming this blog and other sources of information. What started with a simple spam comment sent me down a rabbit hole I wasn't prepared for and shed light on a fairly large spam operation that spanned multiple sites. My primary focus became Quora, with a secondary focus on the web hosting review sites also being manipulated.

Visualization of Rose Hosting Quora Spam Network. An interactive version is available at the end of the article.

The Beginning

It started with a simple spam comment.

fakereview1

The poster tries to compliment the post and then drops in a RoseHosting mention and praises it.

But wait, there's an IP address! Looks like they made a mistake this time.

oscarstanley-arin-ip

So Miami Cloud Hosting is who owns the IP space that this comment came from. Let's see what comes up when I ping rosehosting.com

rosehosting-ip-ping

If you go to that IP, rosehosting.com shows up. So it's correct. Also if you look at their DNS:

rosehosting-dns

So we're 12 IPs away on that A record. Let's check out that IP that actually responded on ARIN.

rosehosting-arin-ip

Bingo. Same Miami Cloud Hosting.
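The "12 IPs away" observation comes down to comparing the octets of two IPv4 addresses. Here is a minimal sketch of that comparison, assuming that sharing the first three octets is a reasonable first hint of shared ownership (it is only a hint; the ARIN record is what actually settles it). The function names are hypothetical, and the sample addresses in the usage note come from the documentation-reserved 192.0.2.0/24 range, not from this investigation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two IPv4 addresses share their first three octets.
# This is a rough proximity check only; the whois record for the address
# is the authority on who owns the block.
same_slash24() {
    [ "${1%.*}" = "${2%.*}" ]
}

# Hypothetical helper: print the absolute distance between the last octets.
last_octet_gap() {
    a=${1##*.}
    b=${2##*.}
    if [ "$a" -ge "$b" ]; then
        echo $((a - b))
    else
        echo $((b - a))
    fi
}
```

For example, `same_slash24 192.0.2.10 192.0.2.22 && last_octet_gap 192.0.2.10 192.0.2.22` prints 12. The ownership lookup itself would still be done with a whois query against ARIN, as in the screenshots.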

So: a fakeish looking name, an email with zero Google search results, and a comment coming from the same IP space on the cloud hosting provider that hosts RoseHosting. Pretty damning, but unsurprising to see some astroturfing; many of the bigger players just rely on affiliates to do it for them and look the other way.

But I'm not one to accept shitty behavior in this business and just look the other way.

Digging Deeper

Let's see how many more I can dig up. I recognize the Rose Hosting name and know they've spammed me in the past.

jean-debushy-comment mike-hidemyass-comment pablo-comment

mateo-comment

 

The pattern seems to be emails with nothing associated with them on google. There is a protected twitter account with the same username as Pablo, but that's about it.

Mike uses HideMyAss, a VPN service designed to hide identities. VPNs/anonymity have a lot of value, but they also happen to be abused by spammers a lot. This pattern looks nefarious.

Jean's comment follows the original Oscar comment's template: compliment, rose host spam, compliment.

They all added in HTML with the rel="nofollow" because they probably realized Google can easily see comment spam and cracked down on it. Putting a nofollow link is supposed to preserve your SEO value by not associating it as a spam link (because it's telling Google not to follow it). Why are these supposed customers adding SEO tactics to their comments and trying to hide their identities?

The Boss Man

I also got this email from Bob, who I assume is the owner based on what's listed publicly and the interviews he's done on at least one other review site which I don't trust a bit, and won't link to either.

But it's all class, I want to get listed and pay a lot.

generous-affiliate-program-rosehosting

So at best they are a 'subtle' please promote me for money kind of web hosting company (which almost every host will do). At worst, they are a comment spamming and potentially astroturfing/sockpuppeting web host.

Searching For More

I searched WebHostingTalk, the largest web hosting forum that has run forever and has over 9 million posts.

rosehosting-wht

Just about everyone is talked about here. They have a company account that constantly posts ads. But how is it that in 14 years there are only 2 reviews and most of the threads are asking 'who?' Yet somehow, my blog is getting hordes of accounts recommending them. Another red flag.

Did they learn their lesson on WHT when an account got questioned about sounding like a shill? So the largest forum with 9,000,000 posts has basically nothing about them.

I kept searching and stumbled upon this gem on Twitter

twitter-brandonhimpfen-rosehosting-comment-spam

twitter-brandonhimpfen-rosehosting-discussion

I sense a pattern. Those crazy customers of ours who link to git and tomcat installation tutorials. Carl had a bit of a spamming spree according to Google.

Let's keep digging.

Sockpuppets and Patterns

jeandebushy-reddit-comments

Looks like I found Jean Debushy!

jean-debushy-serchen-review

And again.

jean-debushy-itzgeek-comment-spam

And again. Deep linking their ubuntu VPS on an ubuntu tutorial too, nice SEO tactic.

jean-debushy-quora

It's not a good spam campaign without hitting Quora!

So this name exists solely to promote RoseHosting and it all seemed to happen in October 2015. That's suspicious to say the least.

At this point it became clear that the sockpuppeting is more organized than I originally thought.

Organized Sockpuppets

I started to search the other names I had been spammed from and easily found more bad behavior.

oscar-stanley-disqus

oscar-stanley-discovercloud

Oscar is alive and well, it seems, on Disqus and DiscoverCloud.

The Smoking Gun

Quora was the gold mine for uncovering this spam network. Once I found a couple accounts on Quora, I could go through their history and see who upvoted their posts. It would be practical if you were running a spam network to have many accounts upvoting one another to give yourself more visibility. More upvotes, more traffic, easier for me to track it all down.

I discovered 51 accounts connected to RoseHosting and mapped out how they connected to one another. I took those same names and searched for their re-use across other sites. 10 showed up on Serchen, 3 on HostReview, 6 on DiscoverCloud, 6 on HostAdvice, 3 on TrustPilot, 1 on Reviews.co.uk - all industry review sites being manipulated by these same spam accounts. I also discovered 11 more accounts connected to various review sites and comment spam.

Rose Hosting Quora Spam Network

This graph charts the connections (upvotes) of RoseHosting associated Quora accounts. If you hover over a name it links to the Quora details and any other related content spamming like review sites.

Aftermath

I tried for months to reach out to Quora and have never heard a word from them. I did notice when I last checked (March 28, 2017) that at least some of the accounts have been banned. Maybe someone actually read my email and just didn't have the time to respond.

I have reached out to the web hosting review sites and will update as I hear back from them. The only company that did respond and acknowledged the issue was HostAdvice (not to be confused with HostingAdvice which steals Review Signal content to mislead its visitors).

Sources

Full Data Table Available on Google Docs

 

Bonus

Thanks RoseHosting for having the decency to make sure you spammed this article as well. I am guessing your spammers don't understand irony. Or possibly the English language.

A new comment on the post "Uncovering the Rose Hosting Spam Network on Quora" is waiting for your approval
https://reviewsignal.com/blog/2017/03/31/uncovering-the-rose-hosting-spam-network-on-quora/

Author: Merritt George (IP: 75.86.176.9, cpe-75-86-176-9.wi.res.rr.com)
Email: merritt.george@gmail.com
URL:
Comment:
That's a great article! There sure are interesting parts of web hosting that people don't know about.

So hey I wanted to know if you do reviews on new sites? I was looking around and noticed that my current webhost, <a href="https://www.rosehosting.com/" rel="nofollow">Rose Hosting</a> wasn't listed and that's a shame! In a sea of companies with no scruples, they've stood out to me as a solid company that doesn't resort to shady tactics, delivers quality support, and has great uptime.

Would love to see a benchmark!

 

Bonus: Fake Review Screenshots

carl-williams-serchen-review

 

donald-wilson-discovercloud

 

akila-hostadvice

wesley-hermans-host-advice

emre-hakan-review-discovercloud

pete-williams-serchen

gary-coleman-hostreview

jean-debushy-hostreview

 

dirk-vlaar-serchen

 

The Rise and Fall of A Small Orange

If you're an unhappy A Small Orange customer looking to find a better web host and don't want to read why the quality went down, simply head over to our Web Hosting Reviews and find a better hosting company. 

How did a small web hosting company have such a huge impact on Review Signal?

The Early Days

This story begins in October 2011, a year before Review Signal launched. Review Signal had been collecting data for months and early ratings data was starting to become meaningful. A tiny company was at the top of the rankings. A Small Orange.

The most worrisome part of this revelation was that A Small Orange did not have an affiliate program. Which isn't a requirement at all for a listing on Review Signal.

However, after investing years of work, if the top rated company ended up not having an affiliate program, the business was likely sunk before it even started. So I inquired early and heard back from the CEO at the time, “we don't have an affiliate program and at the moment, we have no plans for one.” This was a potential death knell because the entire business model relies on making at least some money, even though I assumed it would be much lower than my competitors who simply sell their rankings to the highest bidder. But as any entrepreneur knows, almost everything is negotiable if you understand what the other person really wants and why. After talking further with the CEO, he explained his issue with web hosting review websites, “they typically have a pay for ranking sort of model and do it either through set rates or affiliate payouts. It varies. The economics at ASO don't really work out for a standard affiliate program.” A Small Orange didn't want to play the game that every other review site out there did. Pay to play, quality be damned.

This CEO hated the games being played as much as I did.

That was all the opportunity I needed. Review Signal's mission has been to fight against that very same model and I knew I had an early ally who could make this work. We ended up working out a deal to pay three months of whatever plan someone purchased and he put a cap on my potential earnings at $250 before he would review the performance. Considering the most popular plans were $25/year and $5/month, this wasn't going to earn a lot, but at least it might start covering some of the very basic costs. The first month I earned $52.38 on 6 sales for an average of $8.73 per sale with A Small Orange.
At least it was something. And a foot in the door was all I needed to prove this crazy idea called Review Signal might have some legs. A Small Orange opened that door and for that our histories will forever be intertwined.

The Good Times

The next few years were very good. I was their first affiliate. I was their biggest affiliate for many years, bringing in over a thousand new customers. I got to know many of the staff and would consider some of them friends. And A Small Orange continued to be the best rated shared hosting company through 2014. Everyone was happy - their customers, the company and Review Signal. I was happy to recommend them based on the data showing they had incredibly satisfied customers. I had people tell me personally they were very happy with them after signing up because of the data I publish here at Review Signal.


Free Swag and Annual Thank You Card from ASO

The EIG Acquisition

A Small Orange was quietly acquired in 2012. They were acquired by a behemoth in the hosting industry called Endurance International Group (NASDAQ: EIGI) which owns dozens of brands including some of the largest and most well known hosting companies: Blue Host, Host Gator, Host Monster, Just Host, Site5, iPage, Arvixe and more.

EIG has a very bad reputation in the web hosting world. If you ask most industry veterans they will tell you to run to the hills when it comes to EIG. The oft-repeated story is EIG acquires a hosting company, migrates them to their platform and the quality of service falls off a cliff. The best example of this is perhaps their migration to their Provo, UT data-center which had a catastrophic outage in 2013. This outage was huge. The impact dropped four of EIG's largest brands many percentage points in the Review Signal rankings in a single day.  But these major outages continue to happen as recently as November 2015.

In a recent earnings call with shareholders, EIG CEO Hari Ravichandran talked about two recent acquisitions and their plans for them. “We expect to manage these businesses at breakeven to marginally profitable for the rest of the year as we migrate their subscriber bases onto our back-end platform. Once on platform, we expect to reach favorable economics and adjusted EBITDA contribution consistent with our previous framework for realizing synergies from acquisitions.”

The EIG Playbook

EIG's playbook has been to acquire web hosting brands, migrate them to their platform and 'reach favorable economics.' They've been doing it for years and it seems to be working well enough for investors to continue to put money into the company. M&A to grow subscriber bases and economies of scale to lower costs. It's a very simple and straightforward business plan. It doesn't speak to anything beyond spreadsheet math though, such as brand value and customer loyalty. And those are certainly lowered and lost post-EIG acquisition according to all the data we've collected over years and multiple acquired brands. It's callous business accounting, but it makes perfect sense in the race-to-the-bottom industry that is commodity shared hosting.

Review Signal Rating Calculated Pos/(Pos+Neg), without duplicate filtering


You can see all the EIG brands tracked here on Review Signal in the chart above and their acquisition dates below:

iPage - 2009. BlueHost/HostMonster - 2010. JustHost - Feb 2011. NetFirms - March 2011. HostGator - June 2012. A Small Orange  - July 2012. Arvixe - November 2014. Site5 - August 2015.

You'll notice their ratings, in general, are not very good with Site5 (their most recent acquisition) being the exception. iPage was acquired before I started tracking data. BlueHost/HostMonster also had a decline, although the data doesn't start pre-acquisition. JustHost collapses post acquisition. NetFirms has remained consistently mediocre. HostGator collapses with a major outage a year after acquisition. Arvixe collapses a year after being acquired. Site5 is still very recent and hasn't shown any signs of decline yet.

The Expected Decline of A Small Orange

So nearly every industry veteran I talked to expected A Small Orange to collapse. Immediately after acquisition. Except me. I was, am and will continue to be willing to give the benefit of the doubt to a company until I am shown evidence.

For years, post acquisition people were saying ASO's demise was right around the corner. For years, I still waited for that evidence and the prophecy to become true. But it didn't happen.

It often took EIG less than a year to ruin a brand. We don't have to look further than Arvixe for an example of this, which was acquired in November 2014. Today, Arvixe has one of the lowest ratings of any company on Review Signal at a shockingly low 27%.

But A Small Orange continued to chug along. It didn't hear the naysayers or believe itself to be a victim of the EIG curse. Instead, ASO was the best shared host for years post-acquisition. It seemed to have a fair level of autonomy from the EIG conglomerate. The staff I knew there, remained there, and all indications showed they were still the same company.

Until it wasn't.

The Fall of A Small Orange

A Small Orange Historical Rating


The chart above shows Review Signal's rating of A Small Orange. The Blue line is the rating as calculated by [Positive Reviews / (Positive Reviews + Negative Reviews)]. The Red line only calculates the rating from the past 12 months of data. It's slightly different than Review Signal's actual calculation because I am not filtering out duplicates for quick analysis. The difference for A Small Orange is that when you remove the duplicates, the year 2015 had a 43% rating, indicating there were quite a few people writing multiple negative things about A Small Orange.
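The two lines on the chart can be approximated with a few lines of awk. This is a minimal sketch, not Review Signal's production calculation: the input format (one review per line, a date followed by pos or neg), the function name and the cutoff argument are all assumptions made for illustration, and no duplicate filtering is done.

```shell
#!/bin/sh
# Hypothetical input: one review per line, "YYYY-MM-DD pos" or "YYYY-MM-DD neg".
# rating FILE [CUTOFF] prints Pos / (Pos + Neg) as a whole-number percentage.
# With CUTOFF (YYYY-MM-DD), only reviews dated on or after it are counted,
# which approximates the trailing 12-month line. Without it, all reviews count.
rating() {
    awk -v cutoff="${2:-}" '
        (cutoff == "" || $1 >= cutoff) && $2 == "pos" { pos++ }
        (cutoff == "" || $1 >= cutoff) && $2 == "neg" { neg++ }
        END {
            if (pos + neg == 0) { print "n/a"; exit }
            printf "%.0f\n", 100 * pos / (pos + neg)
        }
    ' "$1"
}
```

Calling `rating reviews.txt` would give the all-time number and `rating reviews.txt 2015-01-01` the number for reviews since that date, where reviews.txt is a hypothetical file in the format described above. ISO dates compare correctly as plain strings, which is why no date parsing is needed.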

Sometime in 2015, the A Small Orange that thousands of people trusted and raved about became another EIG brand. I tried to get the inside story. I reached out to the former CEO who sold the company to EIG and became an executive there for a couple years post acquisition. He reached out on my behalf to EIG's PR team to see if they would participate in this story. Both declined to participate.

So, I'm left to speculate on what happened at A Small Orange based on what's been publicly stated by their CEO and watching their strategy unfold for years across many companies/brands. My best guess is EIG finally got involved with A Small Orange. They used to be a distributed/remote team, now all positions they are hiring for are listed as in Texas (their headquarters). I saw a HostGator representative get moved over to ASO's team, so the internal staff was changing and people were being moved from brands with less than stellar reputations to ASO. The former CEO left mid-2014, which likely left a leadership and responsibility gap. ASO could probably run on auto pilot through the end of 2014, but over time having no champion for your brand in upper management eventually will come back to hurt the brand when decisions get made based on simple economics.

Once 2015 rolled around, the service had noticeably declined. The overall rating for A Small Orange in 2015 was 43% (only using 2015 data). For years, they had been in the 70's. It also ended the year with a massive outage for most, if not all, of their VPS customers which has been going on since Christmas. I personally received multiple messages from users of this site asking about what was happening and alerting me to this decline in service quality.

ASO was also responsible for the Arvixe migration that went very poorly and caused the Arvixe brand to tank. I'm not sure why EIG doesn't have a dedicated migration team to handle these types of moves considering how many acquisitions they go through and how large a role it plays in their growth strategy. But that's a whole separate issue.
It's with great disappointment that I have to admit, the A Small Orange that played such a huge role in the founding and success of Review Signal and provided a great service to many thousands of customers is dead. It's become another hollow EIG brand where the quality has gone down to mediocre levels. And that seems perfectly ok to them, because it's probably more profitable for their bottom line.

Going Forward

This story has had a profound impact on Review Signal. One thing that it made painfully obvious is that the ranking algorithm needs its first update since inception. The current ranking treats every review equally. Which was great when this site launched, because time didn't have any opportunity to be a factor yet. But as this site continues to move forward, I need to acknowledge that a significant amount of time has passed between launch and today. A review from the beginning of Review Signal isn't as relevant as one from this past week in determining the current quality of a web hosting company. A Small Orange right now shows up around 64%, which is artificially high because of their long history of good service; it hasn't been brought down yet by the comparatively short (by time scale) decline of the past year. But it's painfully clear that it's not a 64% rating company anymore.

Another thing to note is the graphs here all used a simpler calculation [Pos / (Pos + Neg)] to calculate rating without duplicate filtering. What this means is the difference between the rating here and the actual rating on the live site is a measure of the degree to which people are being positive or negative about a company. If the rating here is higher than the published one, it means people are saying, on average, more than one good thing about the same company. If the rating is below (as is the case in most if not all cases here), it means people are saying more than one negative thing about the company. I'm not sure if this will factor into a new algorithm, but it is something to consider. My intuition says you would see it hinge around 50%: those companies above would likely have more positive supporters, and those below would have detractors.

In the coming months I will try to figure out a better way to generate the ranking number that more fairly represents the current state of a company. My initial thought is to use some sort of time discounting, so that the older the review, the less weight it would carry in the rankings. If anyone has experience working with this or wants to propose/discuss ideas, please reach out - comment here, email me, or tweet @ReviewSignal.
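To make the time discounting idea concrete, here is one possible shape for it: exponential decay with a half-life, so a review loses half its weight every N days. This is a sketch of a single candidate approach under stated assumptions, not the algorithm Review Signal uses or later adopted. The input format (review age in days followed by pos or neg), the function name and the half-life parameter are all hypothetical.

```shell
#!/bin/sh
# Hypothetical input: one review per line, "AGE_IN_DAYS pos" or "AGE_IN_DAYS neg".
# discounted_rating FILE HALF_LIFE_DAYS prints a weighted Pos / (Pos + Neg)
# percentage, where each review's weight is 0.5 ^ (age / half_life).
discounted_rating() {
    awk -v hl="$2" '
        $2 == "pos" || $2 == "neg" {
            w = exp(-log(2) * $1 / hl)   # same as 0.5 ^ (age / half-life)
            total += w
            if ($2 == "pos") pos += w
        }
        END {
            if (total == 0) { print "n/a"; exit }
            printf "%.0f\n", 100 * pos / total
        }
    ' "$1"
}
```

With a 365-day half-life, a brand new positive review (weight 1) and a one-year-old negative review (weight 0.5) give 1 / 1.5, or about 67%, where the unweighted calculation would give 50%. The half-life is the knob to tune: shorter values react faster to a decline like ASO's but make ratings noisier for companies with few recent reviews.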

Playing With Data for Fun and Profit (SlideShare Presentation)

Original post from my personal blog

I was asked to talk at Howard University to their digital business class, which focuses on the use of Social Media, Mobile Apps & Platforms, Data Analytics, and Cloud Computing as strategic assets to be utilized in business.

If anyone is curious to learn about the history of Review Signal and where the idea came from:

I also built a demo to let the students learn about and play with sentiment analysis in real-time. It used the Movie Review corpus from Cornell. It's a very primitive keyword based system but I thought illustrated the concept well.
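For readers who want to see what "keyword based" means in practice, here is a minimal sketch of the general idea. It is not the demo's code: the word lists are made up for illustration (the demo drew on the Cornell Movie Review corpus), and the function name is hypothetical.

```shell
#!/bin/sh
# classify "some text" prints positive, negative or neutral by counting
# how many words appear in two small, made-up keyword lists.
classify() {
    printf '%s\n' "$1" |
        tr '[:upper:]' '[:lower:]' |   # normalize case
        tr -cs 'a-z' '\n' |            # split into one word per line
        awk '
            BEGIN {
                split("good great love fast reliable", p, " ")
                split("bad slow down terrible hate", n, " ")
                for (i in p) pos[p[i]] = 1
                for (i in n) neg[n[i]] = 1
            }
            $0 in pos { score++ }
            $0 in neg { score-- }
            END {
                if (score > 0) print "positive"
                else if (score < 0) print "negative"
                else print "neutral"
            }
        '
}
```

A system this simple has no notion of negation or sarcasm ("not good" still scores as positive), which is part of why it is only useful for illustrating the concept.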

Sentiment Analysis Demo

The demo is close to the first try I made at sentiment analysis. What is in use at Review Signal is infinitely more complex, but if you're curious to learn about sentiment analysis and prefer visual learning, I think it suits that purpose well.

Which Programming Languages, Frameworks and Databases are used at Web Hosting Companies

This is just a small piece of a hopefully bigger article analyzing what I've been looking at in the Review Signal data.

X-Axis: Programming Languages, Frameworks, Databases

Y-Axis: Web Hosting Company

Colors: The deeper the green, the more positive. The deeper the red, the more negative.

What stories do you see?

Long Running Processes in PHP

Here at Review Signal, I use a lot of PHP code and one of the challenges is getting PHP to run for long periods of time.

Here are two sample problems that I deal with at Review Signal that require PHP processes to run for long periods of time or indefinitely:

  1. Data processing – Every night Review Signal crunches millions of pieces of data to update our rankings.
  2. Twitter Streaming API data – this requires a constant connection to the Twitter API to receive messages as they are posted on Twitter

The Tools

One of the best things in PHP 5 is the CLI (Command Line Interface). PHP CLI allows you to run things directly from the command line and doesn't have a time limit built in (under the CLI, max_execution_time defaults to 0, meaning no limit). All the pains of set_time_limit() and playing with php.ini disappear.

If you're going to be working from the command line, you're probably going to need to learn a little bit of bash scripting.

Finally, we will use cron (crontab / cron jobs)

SSH vs Cron Jobs

I need to explain that running something from an SSH session is different from setting up a cron job to run something for you. An SSH session can be a good place to test scripts and run one-time processes, while a cron job is the right way to set up a script you want run regularly.

If I write this line into my SSH session

php myscript.php

It will execute myscript.php. However, my terminal will be locked up until it completes.

You can get around this by pressing ctrl+z (pauses the execution) and then typing 'bg' (backgrounds the process).

For longer running processes, this can be nice, but if you lose your SSH session, it will terminate execution.

You can get around this by using the nohup (no hangup) command.

nohup php myscript.php

nohup allows execution to continue even if you lose the session. So if you use nohup, and then background the process it will finish executing regardless of your SSH session's status.
All of this only matters if you are running things manually from the command line. If you are running scripts with some regularity and using cronjobs, then you do not need to worry about these issues. Since the server itself is executing them, the SSH terminal sessions don't matter.

Update: A few readers reminded me that you can add an ampersand (&) to the end of a command to background it immediately. This avoids having to ctrl+z, bg.

nohup php myscript.php &
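One detail worth knowing: when its output is a terminal, nohup appends the command's output to a file called nohup.out in the current directory. A small wrapper can send the output to a log file of your choosing and report the process ID so you can check on the job later. This is a minimal sketch; the function name and log file name are hypothetical.

```shell
#!/bin/sh
# run_detached LOGFILE COMMAND [ARGS...]
# Starts COMMAND under nohup in the background, sends stdout and stderr
# to LOGFILE, and prints the PID of the background process.
run_detached() {
    log=$1
    shift
    nohup "$@" >"$log" 2>&1 &
    echo $!
}
```

Usage would look like `run_detached myscript.log php myscript.php`, after which `tail -f myscript.log` shows the output as it arrives.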

Sometimes, you make a mistake and run a process without nohup but want it to continue running even if your SSH session disconnects. I've run scripts late at night thinking they would be quick, only to find out they took a lot longer than expected and I had to go home. This trick allows you to run the script as a daemon, so it won't terminate upon SSH session ending.

  1. ctrl+z to stop (pause) the program and get back to the shell
  2. bg to run it in the background
  3. disown -h [job-spec] where [job-spec] is the job number (like %1 for the first running job; find out your number with the jobs command) so that the job isn't killed when the terminal closes

Credit to user Node on StackOverflow

Data Processing with PHP

Since I run this script regularly, I create a bash script which is executed by a cron job.

Example bash script which actually runs the PHP script:

#!/bin/sh
php /path/to/script.php

Example cron job:

0 23 * * * /bin/sh /path/to/bashscript.sh

If you don't know where to put the code above, type 'crontab -e' to edit your cron table and save it. Standard cron has five time fields (minute, hour, day of month, month, day of week) and no seconds field. The 0 23 * * * tells it to run at minute 0 of hour 23, on any day of the month, in any month, on any day of the week.

So now we have a basic script which will run every night at 11pm. It doesn't matter how long it will take to execute, it will simply start every night at that time and run until it's finished.
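Because a cron job runs unattended, it helps to record when each run started and how it ended. The wrapper below is a minimal sketch of that idea, not the script used at Review Signal; the function name, log path and message format are assumptions.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Appends a start line, the command's output, and its exit status to LOGFILE,
# then returns the command's exit status to the caller.
run_logged() {
    log=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') start: $*" >>"$log"
    "$@" >>"$log" 2>&1
    status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit status $status" >>"$log"
    return "$status"
}
```

Inside the script that cron executes, the PHP line would become something like `run_logged /path/to/nightly.log php /path/to/script.php`. A non-zero exit status in the log is then the first place to look when the rankings fail to update.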

Twitter Streaming API

The second problem is more interesting because the PHP script needs to be running to collect data. I want it running all the time. So I have a php script (thank you to Phirehose library) which keeps an open connection to the Twitter API but I can't rely on it to always be running. The server may restart, the script may error out, or other problems could occur.

So my solution has been to create a bash script to make sure the process is running. And if it isn't running, run it.

#!/bin/sh
 
ps aux | grep '[m]yScript.php'
if [ $? -ne 0 ]
then
    php /path/to/myScript.php
fi

Line by line explanation:

#!/bin/sh

So we start with our path to the shell.

ps aux | grep '[m]yScript.php'

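The process list is piped (|) to grep, which searches for '[m]yScript.php'. I use the [m] bracket expression so that grep doesn't match itself. Grep's own process has the search pattern in its command line, so a plain search for myScript.php would always find a result. The pattern [m]yScript.php still matches the text myScript.php, but it does not match the literal string '[m]yScript.php' that appears in grep's own command line.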

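if [ $? -ne 0 ]

This checks the exit status of the last command in the pipeline, which is grep. Grep exits with a non-zero status when it finds no match, so the condition is true when [m]yScript.php does not appear in the process list. Note the space before the closing bracket; the shell requires it.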

then
    php /path/to/myScript.php
fi

These lines are executed if our php script isn't found running. It runs our php script. The conditional is then terminated with fi.

Now, we create a cron job that executes the script above:

* * * * * /bin/sh /path/to/runsForever.sh

So now we have a system that checks every minute to see if myScript.php is running. If it isn't running, it starts it.
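A different technique from the ps and grep check above is to record the process ID in a file when the script starts and then ask the kernel whether that process still exists. This is a swapped-in alternative shown only as a sketch; the function names and the PID file path are hypothetical.

```shell
#!/bin/sh
# is_alive PIDFILE
# Succeeds if PIDFILE is non-empty and names a process we can signal.
# kill -0 sends no signal; it only checks that the process exists.
is_alive() {
    [ -s "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

# ensure_running PIDFILE COMMAND [ARGS...]
# Starts COMMAND in the background and records its PID,
# but only if the previously recorded process is gone.
ensure_running() {
    pidfile=$1
    shift
    if ! is_alive "$pidfile"; then
        "$@" >/dev/null 2>&1 &
        echo $! >"$pidfile"
    fi
}
```

The cron script would then call something like `ensure_running /path/to/myScript.pid php /path/to/myScript.php`. The trade-off is that process IDs can be reused after a reboot or a long uptime, so a stale PID file could point at an unrelated process; the ps and grep approach does not have that problem because it matches on the command name.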

Conclusion

You will notice the Twitter streaming script is just a more advanced version of the data processing. Both of the working versions have a lot more things going on in my live scripts but are beyond the scope of this article. If you are interested in extending them, you may want to look into logging as a first step. What I've learned from years of hands-on practice is that this setup can and does work. I've run php processes for many months on this configuration.
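As a starting point for the logging mentioned above, here is a minimal sketch of a timestamped log helper for a shell script. The names log_msg and LOG_FILE and the default path are assumptions made for this example, not taken from my live scripts.

```shell
#!/bin/sh
# Sketch: append timestamped messages to a log file.
# log_msg and LOG_FILE are hypothetical names chosen for this example.
LOG_FILE=${LOG_FILE:-/tmp/myScript-watchdog.log}

log_msg() {
    # Prefix each message with the current date and time,
    # e.g. "2013-08-02 23:00:01 watchdog checked myScript.php".
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$LOG_FILE"
}

log_msg "watchdog checked myScript.php"
```

Calling log_msg before and after starting the PHP script gives a simple record of when restarts happened.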

Post Mortem of the EIG Outage (August 2, 2013) That Affected BlueHost, HostGator, JustHost and HostMonster

I first wrote about EIG's major outage as it was occurring and had to speculate on a few things before I had the data to support those guesses. This post is a more complete picture of what happened.

Recap

EIG had a major outage on August 2, 2013 that lasted for many hours because core switches in their Provo, Utah datacenter failed. This failure caused customers of BlueHost, HostGator, JustHost and HostMonster to be taken offline.

I speculated as to what would occur after the outage. How would the brands of the affected companies be perceived after such a catastrophic failure? I looked for a comparable event: the GoDaddy DNS outage in September 2012. What I observed from that event was a very quick return to normal volumes of messages and sentiment. GoDaddy regressed to the mean. 

GoDaddy

The charts I used in my original post were lacking. I didn't have time to really collect and analyze all the data, especially sentiment. I could eyeball the historical data and see the ratings bounced back to their original levels but it wasn't a granular look.

godaddy_dns_outage_full

This chart shows the actual outage, tweet volume and sentiment. It's immediately clear that negative sentiment has a huge spike. I also suspect that a lot of the positive messages are actually mis-categorized; Review Signal isn't perfect and things like sarcasm are one of the hardest things for the sentiment analysis algorithms to categorize. The unusual volume lasts three days and then quickly drops back to a normal looking pattern with perhaps a slightly higher baseline volume. The actual rating goes back to hovering around 50%, which GoDaddy's long-term graph hovers around as well.

godaddy_chart

Let's get back to the EIG outage and the affected brands. I am only going to talk about two of the brands, BlueHost and HostGator, in this post because on a granular level, the other two, HostMonster and JustHost, didn't have enough data. The brands without enough data will take more time to develop a clear picture about the effects of the outage.

BlueHost

bluehost_sentiment

I was wrong. So far at least. BlueHost had an overall rating of 57% before August 2. It hasn't broken 50% since the outage. BlueHost did not, or has not yet, regressed back to the mean. What's interesting is that the volume of tweets about BlueHost's outage was more than double that of the similar GoDaddy outage, yet both quickly dropped back to normal volume within days of the event.

I will explore this a bit more, but to do that I need to show you the other brand.

HostGator

hostgator_sentiment

HostGator's outage looks almost identical to GoDaddy's: around 1000 negative messages on the day of the outage and back to normal within days. HostGator appears to have regressed to the mean as quickly as GoDaddy. Its rating has been over 60% for two days, which is in line with pre-crash levels, when its average rating was 62%. HostGator behaved exactly as I predicted.

Weird Conclusions and Speculations

Why hasn't BlueHost regressed to the mean? One explanation, which I was alerted to by a kind reader (Thanks Linda!), is that not all of HostGator's customers were in the Provo, UT data center. So the outage may have disproportionately affected BlueHost customers compared to HostGator customers. BlueHost is also the larger hosting company by number of customers, although not by domain count.

That may explain the volume difference, but I don't think it explains the regression to the mean for one brand and not the other. Presumably the affected customers of both brands should be equally upset. Those lingering feelings should last equally long for both groups of customers.

I can't explain why we haven't seen BlueHost regress, but I can point out a few differences between this outage and the GoDaddy comparison which may be factors. One important factor is duration. GoDaddy's outage lasted 4-5 hours according to reports. The EIG outage lasted from the morning of August 2 until 9 PM. They were reporting 'intermittent instability' into August 3 according to their official website.

I could speculate that the combination of severity, duration and size of the affected brand has caused some sort of more permanent brand damage to BlueHost, but I think that's premature. BlueHost hasn't regressed yet, but I still think it will eventually. A company that large, with such a huge brand and marketing infrastructure will probably recover. I will be watching BlueHost carefully for the next few weeks or months along with the smaller brands to see if it happens. If it doesn't, this will be an interesting case study in branding, communication and perhaps social media.

 

Thank you for reading and if you have any ideas, feedback or suggestions please leave them in the comments below.

Service Interrupted: A Look at the EIG (BlueHost, HostGator, HostMonster, JustHost) Outage through Twitter

I woke up today and quickly found out that one of the major players in the hosting space was having a massive outage.  According to their own blog:

During the morning of August 2, 2013, Endurance International Group’s data center in Provo, UT experienced unexpected issues that impacted customers of bluehost, HostGator, HostMonster and JustHost. Company websites and some phone services were affected as well.

That sounds bad. Really bad. But how bad? Let's take a look at the data:

tweets_per_day_by_company

 

It's pretty clear that today was an outlier. A major outlier for all the affected companies.

Our data collection system here at Review Signal collected over 35,000 tweets today alone about these four companies. That is roughly 14 times the normal amount.

Interestingly enough, there are some very understanding customers out there too. It wasn't all negative.

hostgator_positive

 

How has it affected their rankings?

I must first note that most messages don't make it through our spam filtering systems for a variety of reasons. So despite there being over 35,000 tweets, we did not get 35,000 new reviews. Many of the messages were not up to our quality standards, e.g. retweets, spam, duplicate messages and news. If you are interested in learning more about how we calculate scores and what kinds of messages count, see our How It Works section.

 

BlueHost

I am not sure why, but BlueHost was impacted a lot more than its bigger brother HostGator. BlueHost has 1.9 million domains on their servers. They also received over 15,000 tweets about them today (50% more than HostGator).

BlueHost was rated at 57% (Overall Rating) from over two years worth of data collected. Today they dropped 8% to 49%. There were over 1,500 negative reviews today (Note: Our data was calculated early to write this article, the day isn't fully over yet).

HostGator

HostGator is the largest of the bunch and has 2.15 million domains under management. They seem to have weathered the storm better than their brothers, with fewer tweets about them both in absolute number and relative to their size.

HostGator was rated at 62% (Overall Rating) and dropped 5% to 57%. HostGator received approximately 700 negative reviews today.

HostMonster and JustHost

These are the babies of the bunch. HostMonster has 'only' 700,000 domains and JustHost has barely over 350,000.

HostMonster went from a 56% (Overall Rating) to 48%, which is an 8% decline. JustHost dropped from 46% to 41%, a total of 5%.

Conclusion

Today was a pretty awful day for all the companies above, but some were affected more than others. I don't have any answer as to why that might be. There are many plausible theories, such as there being more BlueHost customers in the Provo, UT data center than customers of the other companies. But without further information, it's only speculation. UPDATE: I was told BlueHost actually has more customers than HostGator, even if HostGator's customers have more domains. That is a simple explanation as to why BlueHost was impacted more.

What I can say is a major screw up definitely impacts a company's reputation. But large companies seem to regress to the mean.

GoDaddy is a good comparison. They had a major DNS outage around September 11-12. It left a noticeable dip on the overall rating but it seemed to bounce back. February's dip is the Super Bowl effect, which brings a lot of attention to them (more negative than positive, but attention nonetheless). The long-term volume of tweets also doesn't appear to be affected after a few days.

godaddy_chart

godaddy_dns_outage

If we use GoDaddy as a benchmark, these companies will probably be back to their usual levels of service within a week, but today and the next couple days will leave a very long term impact on their rating at Review Signal.

Hurricane in the Cloud: How Hurricane Sandy Impacted Web Hosting Companies

I thought it would be really fun to create an infographic about the effects of Hurricane Sandy on the web hosting companies we track. I learned my infographic skills are optimistically rated: very poor.

So here are the interesting stats and trends we saw during Hurricane Sandy:

418 People Tweeted about Sandy and a web hosting company.

119 of those Tweets were talking about Intel's Sandy Bridge technology

15 People were concerned about their web hosting

7 (/15) of those people were worried about Amazon

23 People Claimed to have issues related to Sandy

8 (/23) of those complaints were directed at BlueHost

The Most Popular Sandy Tweets:

"Unsure why Heroku is prepping for Sandy. I thought hurricanes were the strongest kind of cloud." - @tenderlove (42 RTs)

"Linode HQ weathered #sandy only to lose power hours later, late last night. It runs our VoIP phones so no calls until we can work around." -  @linode (15 RTs)

"Laughing Squid founder @ScottBeale is live tweeting post #Sandy recovery from Manhattan. Follow him updates: https://twitter.com/ScottBeale" - @LaughingSquid (9 RTs)

Some Angry/Happy Tweets:

"@GoDaddy UMADBRO?????? UMAD?!?!?!?!?!? I HOPE SANDY COMES AND DESTROYS ALL YOUR SERVERS." - @djdarrenmallett

"With Gawker, HuffPo and others experiencing outages, Sandy is IRL the equivalent of GoDaddy hosting." - @spydergrrl

"wow how in the hell did @linode have 100% uptime in Newark during #Sandy? that's some badass hosting right there." - @procdaddy

"Wow, so Jersey is getting HAMMERED by Hurricane Sandy right now, and yet my @linode in Newark is still up. THAT'S service right there!" - @bill_clark

Other stories we found in the data:

Heroku released regions to help deal with customers who might potentially be affected by Sandy.

We saw BlueHost decided to offer flexible payments to those affected by Sandy.

Finally.

For your entertainment.

You can see my first ever attempt at creating an infographic (in the future, I will hire someone else to design these!)

image