Nov

Towards smarter data - accuracy and precision

                                                   

There is a huge amount of information out there, and it is growing. To use it efficiently and increase our competitive advantage, we need to evolve and start using information in a smart way, by concentrating on data that drives business value because it is accurate, actionable, and agile. Accuracy is an important measure of the quality of data processing solutions.

How is accuracy calculated?

It is easy to do with structured data, because the requirements can be formalized. It is less obvious with unstructured data, e.g. a stream of social feeds, or any data set that involves natural language. Sentences of natural language are subject to multiple interpretations, and therefore allow a degree of subjectivity. For example, should the sentence ‘I haven’t been on a sea cruise for a long time’ qualify for a data set of people interested in going on a cruise? Both answers, yes and no, seem valid.

For these cases, it has been argued that a consensus approach, which polls data providers, is the best way to judge data accuracy. This approach essentially claims that the attributes with the highest consensus across data providers are the most accurate.
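The consensus idea can be illustrated with a minimal Python sketch. It assumes that each data provider assigns one label to the same message. The provider names and labels are hypothetical, and a simple majority vote is only one possible reading of "highest consensus".

```python
from collections import Counter


def consensus_value(provider_values):
    """Return the most common value and the share of providers that agree on it."""
    counts = Counter(provider_values)
    value, votes = counts.most_common(1)[0]
    return value, votes / len(provider_values)


# Hypothetical labels for the sentence about the sea cruise.
providers = {
    "provider_a": "interested",
    "provider_b": "interested",
    "provider_c": "not_interested",
}

label, agreement = consensus_value(list(providers.values()))
print(label, round(agreement, 2))
```

Note that the result only says what most providers agree on. It does not say whether that label is correct.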

At nmodes we deal with unstructured data all the time because we process natural language messages, primarily from social networks. We do not favor this simplistic approach: it is biased, it invites people to make assumptions based on what they already believe to be true, and it makes no distinction between precision and accuracy. The difference is that precision measures how much of what you selected was right, while accuracy takes into account both what you got right and what you got wrong across all the data. Accuracy is a more inclusive and therefore more valuable characteristic.
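To make the distinction concrete, here is a small Python sketch. It assumes the standard confusion-matrix definitions, precision = TP / (TP + FP) and accuracy = (TP + TN) / total. The eight example labels are made up for illustration and are not nmodes data.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return tp, fp, tn, fn


def precision(predicted, actual):
    """Share of selected items that were correct."""
    tp, fp, _, _ = confusion_counts(predicted, actual)
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(predicted, actual):
    """Share of all decisions (selections and rejections) that were correct."""
    tp, fp, tn, fn = confusion_counts(predicted, actual)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical data: True means "interested in a cruise".
predicted = [True, True, False, False, False, False, False, False]
actual = [True, True, True, True, False, False, False, False]

print(precision(predicted, actual))  # 1.0: everything selected was right
print(accuracy(predicted, actual))   # 0.75: two interested people were missed
```

In this example a system can score perfectly on precision while still missing half of the relevant messages, which only the accuracy figure reveals.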

Our approach is:

a) to validate data against independent third-party sources (typically of academic origin) that contain trusted data sets and reliable demographics. Validating nmodes data against third-party sources allows us to verify that our data achieves the best possible balance of scale and accuracy.

b) to enrich the existing test sets by purposefully including examples that are ambiguous in meaning and intent, and providing additional levels of categorization to cover these examples.

Accuracy is becoming more important as businesses move from the rudimentary data use typical of the first Big Data years to today’s more measured and careful approach. Understanding how it is calculated and the value it brings helps in achieving long-term sustainability and success.

 

Interested in reading more? Check out our other blogs:

nmodes Technology - Overview

                                                       

The ability of nmodes to accurately deliver relevant messages and conversations to businesses is based on its ability to understand these messages and conversations. Once a system understands a sentence or text, it can easily perform the necessary action, e.g. route a sentence about buying a car to the car dealership, or a complaint about purchased furniture to the customer service department of the furniture company.

Understanding the meaning of sentences is the domain of semantics. nmodes has developed a strong semantic technology that stands out in a number of ways.

Here is how nmodes technology is different:

1. Low computational power. We don’t use the methods and algorithms deployed by almost everyone else in this space. The algorithms we use allow us to achieve a high level of accuracy while significantly reducing the computational power required. The most accurate semantic systems, e.g. Google’s or IBM’s, rely on supercomputers. By comparison, our computational requirements are modest in the extreme, yet we successfully compete with these powerhouses in terms of accuracy and quality of results.

2. Private data sources. We work extensively with Twitter and other social networks, yet at the same time we process enterprise data. Working with private data sources means the system must know details specific to that particular data source. For example, when a system handles a web self-service solution for an online electronics store, it learns the names, prices, and other details of all products available at this store.

3. User-driven solution. Our system learns from the user’s input, which makes it extremely flexible and as granular as needed. It supports both generic topics, for example car purchasing, and conversations concentrating on a specific type or model of car.


The Curious Case of AI Technology

                                                         

                                                                 

The notion of Artificial Intelligence has been around for a while.

Yet, unlike other prominent technological innovations such as electric cars or processor speed, its progress has not been linear.

In fact, as far as industrial impact is concerned, there were times when there was seemingly no progress at all.

The widespread fascination with AI started several decades ago, in the 1980s. By then, the pioneering work of Noam Chomsky on computational grammar had led to a belief that human language capabilities in particular, and human intelligence in general, could be straightforwardly turned into algorithms. The expectation was that AI-based programs would have a significant and lasting industrial impact.

But despite unbridled enthusiasm and a significant amount of effort, the practical results were minuscule. The main outcome was disappointment, and AI became something of a dirty word for the next 20 years. The research became mostly confined to scientific labs, and although some notable results were achieved, such as the development of neural networks and the Deep Blue machine beating the reigning world chess champion, the general community was largely unaffected.

The situation started to change about 5-10 years ago with a new wave of industrial research and development.

We are now experiencing something of an AI renaissance, with bots, semantic search, self-service systems, and intelligent assistants like Siri taking over. In addition, the optimists of science confidently predict that we will reach the singularity during our lifetime.

The progress this time seems to be genuine. There are indisputable breakthroughs, but even more impressive is the breadth of industries adopting AI solutions, from social networks to government services to robotics to consumer apps.

For the first time AI is expected to have a huge impact on the community in general.

There is a vibe around AI that hasn’t been felt in years. And with power comes responsibility, as they say: prominent thinkers such as Stephen Hawking have raised their voices about the dangers of powerful AI for humanity. Still, as far as the current topic is concerned, this is all part of the vibe.

Despite the plethora of upcoming opportunities, it is important to observe that we have yet to advance beyond the anticipation stage. AI has not become a major industrial asset, no AI firm has reached unicorn status, and although major industrial players such as IBM are pivoting towards a fully-fledged AI-based model, this has not yet manifested itself in business results.

We are still waiting for AI-based technology to disrupt the global community.

The overall expectation is that it is about to happen. But it hasn’t happened yet.

 
