Oct

When Big Data is not so big anymore

                                                   

We are inundated with information. There is so much of it around us that a special term, Big Data, was coined to emphasize its sheer size.

Dealing with a large amount of data is, of course, a problem, and various solutions have been created to address it efficiently.

At nmodes we developed a semantic technology that accurately filters relevant conversations. We applied it to social networks, particularly Twitter. Twitter is a poster child for Big Data: it hosts 500 million conversations every day, a staggering number. And yet we found that for many topics, once they are narrowed down and accurately filtered, there are not that many relevant conversations after all.

No more than 5 people are looking for CRM solutions on Twitter on an average day. Even fewer, two per day on average, are explicitly asking for new web hosting providers, although many more are complaining about their existing providers (which may or may not suggest they are ready to switch or are looking for a new option).

Businesses often come to us asking to find relevant conversations and expecting a large number of results. This is what Big Data is supposed to deliver, they assume. This expectation is likely a product of our ‘keyword search dependency’. Indeed, when we run a keyword search on Twitter, on search engines, or anywhere else, we get a long list of results. The fact that most of them (up to 98% in many cases) are irrelevant is often lost in the visual illusion of having this long, seemingly endless list in front of our eyes.
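To make the contrast concrete, here is a minimal Python sketch of keyword matching versus a crude relevance filter. It is not the nmodes technology: the sample posts are invented for illustration, and the intent cues are an assumed stand-in for real semantic filtering, which would be far more sophisticated.

```python
import re

# Hypothetical sample posts, invented for illustration (not real Twitter data).
POSTS = [
    "Can anyone recommend a good CRM for a small sales team?",
    "Our CRM migration is finally done, what a week.",
    "Hiring: CRM administrator, remote OK.",
    "Looking for a CRM that integrates with Gmail, any suggestions?",
    "Reading an article about the history of CRM software.",
]

# Assumed buying-intent cues; a real semantic filter would not rely on a fixed list.
INTENT_CUES = re.compile(r"\b(recommend|looking for|suggestions?)\b", re.IGNORECASE)


def keyword_search(posts, keyword):
    """Return every post that merely mentions the keyword."""
    return [p for p in posts if keyword.lower() in p.lower()]


def intent_filter(posts, keyword):
    """Keep only keyword matches that also contain an intent cue."""
    return [p for p in keyword_search(posts, keyword) if INTENT_CUES.search(p)]


keyword_hits = keyword_search(POSTS, "CRM")
relevant_hits = intent_filter(POSTS, "CRM")
print(len(keyword_hits), "keyword matches,", len(relevant_hits), "relevant")
```

Every post in this toy sample mentions the keyword, yet only the ones where someone is actually asking for a product survive the filter. That is the same effect, on a small scale, as the long list shrinking to a handful of relevant results.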

With quality solutions that accurately deliver only relevant results, we experience, for the first time, a situation in which there are no longer big lists of random results, only several relevant ones.

This is so much more efficient. It saves time, increases productivity, clarifies the picture, and makes Big Data manageable.  

It is time for businesses to embrace the new approach.

 

Interested in reading more? Check out our other blogs:

Lessons for Businesses from Brazil’s World Cup Disaster

1. The mental, or psychological, state of your team is important: you can put only so much pressure on people before they crack. Brazil’s players didn’t become unqualified professionals overnight. They failed because they were overwhelmed by their country’s expectations, a distorted sense of history, and a belief in a divine right to win. They were too emotionally charged and not in the proper state of mind to compete. So it is better to keep a calm, relaxed atmosphere in your team, even before a launch or an important deadline.

2. Manage customer expectations. Brazil were ramping them up unreasonably. Aggressive messages like “the 6th [title] is coming” and statements by their coach about “two more steps to heaven” backfired massively by creating an unhealthy emotional frenzy in society, which in turn influenced the players (see 1).

3. Logic and organization are the key to successful execution. Germany are not a great team, but they are very well organized. They had a detailed game plan in which every team member knew his task, and several different scenarios were prepared. They were able to adjust when the situation on the field changed and squeeze out maximum advantage. Sounds simple? That’s because it is.


Beware the lure of crowdsourced data

Crowdsourced data can often be inconsistent, messy or downright wrong 

We all like something for nothing; that’s why open source software is so popular. (It’s also why the Pirate Bay exists.) But sometimes things that seem too good to be true are just that.

Repustate is in the text analytics game, which means we need lots and lots of data to model certain characteristics of written text. We need common words, grammar constructs, human-annotated corpora of text, etc. to make our various language models work as quickly and as well as they do.

We recently embarked on the next phase of our text analytics adventure: semantic analysis. Semantic analysis is the process of taking arbitrary text and assigning meaning to the individual, relevant components. For example, being able to identify “apple” as a fruit in the sentence “I went apple picking yesterday” but to identify “Apple” as the company in “I can’t wait for the new Apple product announcement” (note: even though I used title case for the latter example, casing should not matter).
