Christine Lynch, Technical Consultant at Nathean Technologies, looks at the growing importance of data scaling techniques and how big data patterns can help organisations solve frequently occurring problems…
“It is a very sad thing that nowadays there is so little useless information.” – Oscar Wilde
Moore’s Law, which is widely accepted in the computer science world and even taken for granted, says that the processing power and storage capacity of computer chips double, or their prices halve, roughly every 18 months. This has helped the amount of digital information increase tenfold every five years.
This data explosion brings with it new challenges for organisations. While most acknowledge the existence and importance of big data, they are struggling to fully exploit its potential. While new technologies have been developed to store and manage big data, this is only half the battle to gaining insights and value.
The Changing Perception of Big Data
Data has gone from being an unwanted overhead to a central player in business today. As basic programs grew into full applications and applications moved online, the volume of data being generated and processed forced this change in attitude.
Central to this change in attitude is the advancement of hardware. Had hardware not evolved to the point where storing and accessing data became not only more efficient but also cheaper, data in the volumes we see today could not exist. Data is no longer discarded without thought or consideration of its value. Quite the opposite: in today’s online world even a user’s ‘data exhaust’ – the trail of clicks that internet users leave behind – is recorded and exploited.
Scaling Data Stores
The factors discussed have led to the growing importance of data scaling techniques, both in enhancing current systems and in future-proofing greenfield projects for scale. Never before has it been so important for a company to have reliable, fast data stores. New technologies and techniques are emerging to store big data and scale with it, and the traditional relational database management systems (RDBMS) are vastly different from the new NoSQL data stores currently emerging.
Scaling data stores is a complex and varied science, and there are numerous approaches one can take to increase performance and reliability. It used to be the case that one could only beef up a single server to increase performance, but with advancements in both server and communication technology, most large data stores today not only run on powerful servers but are also spread across several machines in a distributed manner. These approaches to scaling can be categorised as follows:
– Vertical scaling: adding more power to a single machine.
– Horizontal scaling: adding more machines to a network of machines.
Vertical scaling techniques include adding more storage, memory or processing power. Horizontal scaling techniques include server clusters with high-spec communication lines, sharding, replication and caching. All of these techniques can be applied to both RDBMS and NoSQL data stores.
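As a minimal illustration of one horizontal scaling technique, sharding, the sketch below routes each record to one of several shards by hashing its key, so the same key always lands on the same machine. The shard count, key names and in-memory dictionaries standing in for separate servers are illustrative assumptions, not any particular data store’s implementation.

```python
import hashlib

NUM_SHARDS = 4
# In a real deployment each shard would be a separate database server;
# plain dictionaries stand in for them here.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str) -> int:
    """Deterministically map a record key to a shard index."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def put(key: str, value) -> None:
    shards[shard_for(key)][key] = value

def get(key: str):
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Ada"})
print(get("user:42"))  # the same key always routes to the same shard
```

Because reads and writes for a given key touch only one shard, total capacity grows roughly with the number of machines; the trade-off is that queries spanning many keys must fan out across shards.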
Big Data Patterns
Design patterns show how templates and frameworks can be applied to solve frequently occurring problems. The evolution of big data has seen the emergence of similar frequently occurring problems. Derick Jose, director at Flutura Decision Sciences and Analytics, calls these big data patterns ‘Workload Patterns’ and believes that they greatly assist in defining solution constructs for businesses.
These distinct and specific big data workload patterns can help businesses identify clearly what their needs are. Once a business has a pattern that fits its use case it is easier to wade through all of the big data technologies, and there are many, in order to find the best fit.
Applying workload patterns to big data solutions greatly reduces the potential complexity of a solution and clarifies the task at hand.
The wave of big data is here to stay and still growing. It has spawned an expanding business sector of data analysts, data scientists and business intelligence professionals and tools. They crunch data in an unprecedented way to help organisations make sense of their proliferating data, often leading to massive cost savings and new business opportunities. According to an article by Kenneth Cukier, the data management industry is worth more than $100 billion and growing at almost 10% a year, roughly twice as fast as the software business as a whole. The monetary value associated with big data means that managing it has never been so important.
The big data phenomenon is commonly described in the technology world using the three Vs: Volume, Velocity and Variety. First introduced by Doug Laney in his article on data management, the model describes how big data management strategies can be defined along these three dimensions: Volume, the physical amount of data an organisation must manage and store; Velocity, the increasing rate at which data flows into an organisation; and Variety, the diverse nature of data coming from different sources.
With this in mind, the need to get value from big data and realise its potential is crucial to successful business development. Organisations need to put in place business intelligence and data science personnel to deal with the variety of their data, the relevant technical structures to deal with its volume, and the ability to adapt easily to its changing velocity. With these in place, an organisation can identify meaningful patterns to plug into a pattern-based strategy and gain significant value from its data.
What are your thoughts on Big Data? Please leave your comments below.