Oct

When Big Data is not so big anymore

                                                   

We are inundated with information. There is so much of it around us that a special term, Big Data, was coined to emphasize its sheer size.

Dealing with a large amount of data is, of course, a problem, and various solutions have been created to address it efficiently.

At nmodes we developed a semantic technology that accurately filters relevant conversations. We applied it to social networks, particularly Twitter. Twitter is a poster child of Big Data. They have 500 million conversations every day. A staggering number. And yet, we found that for many topics, when they are narrowed down and accurately filtered, there are not that many relevant conversations after all.

No more than five people are looking for CRM solutions on Twitter on an average day. Even fewer - two per day on average - are explicitly asking for new web hosting providers, although many more are complaining about their existing providers (which may or may not suggest they are ready to switch or are looking for a new option).

Businesses often come to us asking to find relevant conversations and expecting a large number of results. This is what Big Data is supposed to deliver, they assume. This expectation is likely a product of our ‘keyword search dependency’. Indeed, when we run a keyword search on Twitter, on search engines, or anywhere else, we get a long list of results. The fact that most of them (up to 98% in many cases) are irrelevant is often lost in the visual illusion of having this long, seemingly endless list in front of our eyes.

With quality solutions that accurately deliver only relevant results we experience, for the first time, a situation where there are no longer big lists of random results, only a few relevant ones.

This is so much more efficient. It saves time, increases productivity, clarifies the picture, and makes Big Data manageable.  

Time for businesses to embrace the new approach.

 

Interested in reading more? Check out our other blogs:

Is Anonymity the Future of the Internet?

Right now we're in a world that sees transparency as the new form of integrity, a world that understands that reputation is everything. Loyalty is somewhat fleeting as consumers, armoured with this incessant flow of knowledge from the web, have the ability to make swift judgements and decisions about individuals, companies and governments, oftentimes to the detriment of the target.

The emergence of social media has forced companies to stop hiding behind the veil of corporate spin and address the very things that the web has thrown at them. Nothing is secret any longer. Even secrets that were once held secure behind invulnerable fortresses now have a strong probability of coming to light.

Is transparency as a norm working? Or, are the results of transparency surfacing a new order that will create yet another tier of acceptance from the masses?

"Anonymity is Authenticity"

Following the death of Rehtaeh Parsons, who, after being assaulted by four boys, was tormented relentlessly by classmates and other kids on social networks, and the suicide of Hannah Smith, who experienced similar torment, it's clear the internet has evolved into an era that gives people free rein to voice an opinion and to use like-minded affiliations to express and further spread that opinion. In these cases, anonymous profiles fed the incessant stream of hateful attacks that eventually wore down both girls' defences.

And while I originally argued that anonymity was a cowardly state that allowed people to be, and feel comfortable being, the anti-self that runs away from accountability, I have since come to see the other side of this coin.

Anonymity is Safe

It becomes clear that humans, while inherently social, are discriminating about what we disclose and with whom we share it.

If transparency breeds contempt, then anonymity should build acceptance

The freedom to express opinion and judgement without feeling guarded, or without fearing that others will link you to a statement, is indeed liberating. And while this free rein may take the form of a soapbox soliloquy or criticisms (and perhaps bullying attacks) against opposing views, there is a large segment of users who want the ability to share a secret, or have a place to vent their frustrations or challenges -- without the fear of reprisal.

Despite the Snowden revelations about the NSA, which showed that nothing on the net is private, the wave of user adoption for applications like Snapchat, Whisper or Secret has not stopped.

Here are some recent stats for Snapchat from Mashable


I've recently downloaded Whisper and my experience has been more than liberating. It has given me an outlet to record my hopes, desires and, more importantly, my anger and not-for-public emotions. Being judged in real life or on social media takes its toll. If my reputation precedes me, then I will be discriminating about what I say in places where my content and identity are linked.

Popular opinion just doesn't matter. It's irrelevant. But I want to track progress in my life: my emotions, my dark moments, my personal observations, my milestones -- all in my own digital diary.

Why shouldn't users have the option to keep part of their identities secret and separate?

It's up to the next generation

What this new medium has created is an endless, volatile loop of positive and negative reinforcement. While transparency has extreme benefits, there are just as many negative consequences that have come as a result of creating this honesty within social channels. Society continues to send the wrong message to Millennials and Gen Zers, warning them to be more discerning and to suppress who they really are as individuals... warning them of the potential consequences should they venture down the wrong path.

How we communicate today poses tremendous issues for this younger generation. Their experiences are grounded in the fear of being vulnerable... fear of being misjudged... fear of not being accepted... fear of being punished. When the next generation grows up, it'll be up to them to shape the landscape and determine how to balance the impacts of transparency and anonymity.


Beware the lure of crowdsourced data

Crowdsourced data can often be inconsistent, messy or downright wrong 

We all like something for nothing; that’s why open source software is so popular. (It’s also why the Pirate Bay exists.) But sometimes things that seem too good to be true are just that.

Repustate is in the text analytics game, which means we need lots and lots of data to model certain characteristics of written text. We need common words, grammar constructs, human-annotated corpora of text, etc. to make our various language models work as quickly and as well as they do.

We recently embarked on the next phase of our text analytics adventure: semantic analysis. Semantic analysis is the process of taking arbitrary text and assigning meaning to the individual, relevant components. For example, being able to identify “apple” as a fruit in the sentence “I went apple picking yesterday” but to identify “Apple” the company in “I can’t wait for the new Apple product announcement” (note: even though I used title case for the latter example, casing should not matter).
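
To make the apple example concrete, here is a toy sketch in Python. It is not how Repustate's semantic analysis works; it only illustrates the idea of picking a meaning from the surrounding words. The cue word lists and the function name classify_apple are made up for illustration, and the text is lowercased first so that casing does not matter.

```python
import re

# Hypothetical context cues, chosen only for this illustration.
FRUIT_CUES = {"picking", "pie", "orchard", "eat", "juice", "tree"}
COMPANY_CUES = {"product", "announcement", "iphone", "mac", "stock", "launch"}


def classify_apple(sentence: str) -> str:
    """Guess which sense of 'apple' a sentence uses from nearby words."""
    # Lowercase first so that casing does not affect the result.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if "apple" not in words:
        return "none"
    # Count how many cue words from each sense appear in the sentence.
    fruit_score = len(words & FRUIT_CUES)
    company_score = len(words & COMPANY_CUES)
    if fruit_score > company_score:
        return "fruit"
    if company_score > fruit_score:
        return "company"
    # No cue words, or a tie: the sketch declines to guess.
    return "unknown"


print(classify_apple("I went apple picking yesterday"))
print(classify_apple("i can't wait for the new apple product announcement"))
```

A real system would need far richer signals than a handful of cue words, which is exactly why large, reliable data sets matter for this kind of work.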
