How to become a Big Data Analyst

Anyone who works in the tech industry is aware of the rising demand for Analytics/Machine Learning professionals. More and more organisations are jumping on the data-driven decision-making bandwagon and accumulating large volumes of data about their business. To make sense of all that data, organisations need Big Data Analysts.

Data Analysts have traditionally performed their analysis on pre-formatted data served by IT departments. But serving end customers better and faster calls for real-time or near-real-time Analytics, and the dependency on IT departments then becomes a bottleneck. To understand the influx of data, analysts now need to understand the data streams that ingest millions of records into databases or file systems, the Lambda architecture, and batch processing.

Analysing larger amounts of data also requires skills that range from understanding business complexities, the market and the competitors to a wide range of technical skills in data extraction, data cleaning and transformation, data modelling and statistical methods.

As a relatively new field, Analytics is struggling to meet the market demand for highly skilled Big Data Analysts. Being a Big Data Analyst requires a thorough understanding of data architecture and of the data flow from source systems into the big data platform. One can always stick to a specific industry domain and specialise within it, for example Healthcare Analytics, Marketing Analytics, Financial Analytics, Operations Analytics, People Analytics or Gaming Analytics. But mastering the end-to-end data chain can lead to plenty of opportunities, irrespective of industry domain.

The entire Data and Analytics suite spans the following stages and tools:

  • Data integrations – connecting disparate data sources
  • Data security and governance – ensuring data integrity and access rights
  • Master data management – ensuring consistency and uniformity of data
  • Data Extraction, Transformation and Loading – making raw data business user friendly
  • Hadoop and HDFS – big data storage mechanisms
  • SQL/Hive/Pig – data query languages
  • R/Python – programming languages for data analysis and mining
  • Data science algorithms like Naive Bayes, K-means, AdaBoost etc. – machine learning algorithms for classification and clustering
  • Data Architecture – combining all of the above in an optimised way to deliver business insights
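
As a rough illustration of how the extraction, transformation, loading and querying stages fit together, here is a minimal sketch in Python that uses only the standard library. The sample records, the table name `orders` and the function names are all made up for the example, and an in-memory SQLite database stands in for a real big data platform.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (made-up sample data).
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,alice,4.25
4,carol,7.00
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise customer names and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(amount))
        )
    return cleaned


def load(records):
    """Load: write the cleaned records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def revenue_per_customer(conn):
    """Query: total amount per customer, sorted by customer name."""
    cursor = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


conn = load(transform(extract(RAW_CSV)))
totals = revenue_per_customer(conn)
print(totals)
```

In practice each stage would be handled by dedicated tooling, but the shape of the work is similar: pull raw records, clean them, store them somewhere that can be queried, and only then analyse.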

The new-age data analyst, or versatile Big Data Analyst, is one who understands the complexity of data integrations using APIs, connectors or ETL (Extraction, Transformation and Loading); designs data flows from disparate systems with data security and quality in mind; can code in SQL or Hive and in R or Python; is well acquainted with machine learning algorithms; and has a knack for understanding business complexities.

Since Big Data and Analytics is constantly evolving, anyone aiming for a career in the field must stay well versed in the latest tech stack and architectural breakthroughs. Some ways of doing so:

  • Following knowledgeable industry leaders or big data thought leaders on Twitter
  • Joining Big Data related groups on LinkedIn
  • Following Big Data influencers on LinkedIn
  • Attending events, conferences and seminars on Big Data
  • Connecting with peers within the Big Data industry
  • Last but not least (and probably most important), enrolling in MOOCs (Massive Open Online Courses) and/or reading Big Data books

Since Analytics is a vast field encompassing several operations, one could choose to specialise in one part of the Analytics chain: data engineers specialise in highly scalable data management systems, data scientists in machine learning algorithms, and data architects in the overall data integrations, data flow and storage mechanisms. But to excel and future-proof a career in the world of Big Data, one needs to master more than one area. A data analyst who is acquainted with all the steps involved in data analysis, from data extraction to insights, is an asset to any organisation and will be much sought after!

Four steps to becoming a Data-Driven organisation

Not a day goes by without our LinkedIn news feeds being flooded with mentions of AI and Machine Learning benefitting mankind and changing its ways like never before. The hype surrounding AI and Machine Learning has led most organisations to jump on the bandwagon without proper evaluation. A couple of years ago the term Big Data enjoyed a similarly hyped status, but lately it has been losing its lustre to all the talk about AI and Machine Learning.

The truth, however, is that AI and Big Data need to coexist and converge. Merely collecting and storing huge amounts of data will prove futile unless AI and Analytics are used to generate meaningful insights that help businesses enhance customer experience or increase revenue.

Making an organisation Data-Driven takes time and happens in stages. While there is no sure-fire way to create a Data-Driven organisation, the following steps can help bring about the change:

  1. Strategy – It all starts with a clearly defined strategy stating the Whys, Hows, Whos and Whens. A clear strategy raises awareness of the topic in focus (data, in this case) across the organisation and creates a sense of urgency around the change process. It is imperative that the entire organisation understands the importance and implications of being data-driven, so that people are encouraged to update their skill sets and raise their level of data awareness. A well-rounded data strategy should cover not only the technology required for execution but also the competence and people skills, and the kind of atmosphere, that a data-driven organisation needs to thrive.
  2. People – Just as different kinds of skills are required within a Marketing or a Software organisation, different job roles within a data organisation call for different skill sets. But because of the hype surrounding Machine Learning and AI, and because many companies lack practical data know-how, the tendency is to either hire the wrong people or assign the wrong tasks to the right people! Not everyone in the data organisation has to be a data scientist. People will be needed for data architecture, data infrastructure, data engineering, data science and business analysis. These could very well be the same person, if the organisation is lucky enough. But it is unfair to hire a data engineer and assign them the task of building predictive models, or to hire a data scientist and then tell them to develop BI reports. Strategists will have to spend time understanding the nuances of the skills and expertise required in a data organisation, but it will be worth it in order to retain and grow the talent pool a Data-driven organisation needs.
  3. Patience – Creating a Data-driven organisation requires ample patience and perseverance. If data has not previously been part of the decision-making process, it is most probably not in a state that can be used readily, or there may be little or no data to begin with! In that case, the work has to start with gathering the data required to achieve the business goals. Transaction systems have a very different database design from the data stores used for Analytics, so a design and architecture phase is needed before the data can be analysed. Moreover, as analysts dig into the transaction data, they will almost certainly find that relevant data is missing, run into data retrieval issues, and unearth data quality and integration problems caused by data silos. In a data-driven organisation, all data sources are integrated to provide a single enterprise-wide version of the truth, whether the data concerns customers, sales or marketing. Building a data platform that integrates all business data sources while ensuring data quality, integrity and security is a time-consuming process. Organisations will have to take this lead time into consideration when planning a Data-driven decision-making approach.
  4. Organisational Culture – The purpose of a Data-driven organisation is to empower employees through data and information sharing, so that the organisation can collectively achieve its business goals. This requires employees to be data aware and to stop relying on gut feeling to make decisions, which could be a whole new approach for many. This new way of working requires organisational change management: educating people to use facts and figures to arrive at conclusions and make decisions. An organisation that is already fairly data aware, in the sense that metrics are used to measure certain processes, still has to take steps to use data proactively (read Predictive Analytics) rather than just summarise events that have already happened before it can call itself Data-driven. The CDOs/CMOs need to drive data awareness by showcasing quick wins and success cases of Data-driven approaches, so that data becomes the foundation of every decision-making process.
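
To make the difference between summarising past events and using data proactively a little more concrete, here is a minimal sketch in Python. The monthly sales figures are invented, the function names are hypothetical, and the straight-line trend fitted by ordinary least squares is only a stand-in for whatever predictive model an organisation would actually choose.

```python
from statistics import mean

# Hypothetical monthly sales figures (made-up data for illustration).
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def summarise(values):
    """Descriptive view: report what has already happened."""
    return {"total": sum(values), "average": mean(values)}


def fit_trend(values):
    """Fit a straight line y = intercept + slope * x by ordinary least squares,
    where x is the month index 0, 1, 2, ..."""
    xs = range(len(values))
    x_bar = mean(xs)
    y_bar = mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast_next(values):
    """Predictive view: extrapolate the fitted trend one month ahead."""
    slope, intercept = fit_trend(values)
    return intercept + slope * len(values)


summary = summarise(monthly_sales)
prediction = forecast_next(monthly_sales)
print(summary)
print(prediction)
```

The summary only describes the four months on record, while the forecast gives decision makers an estimate to act on before the next month begins. A real forecast would need far more data and proper validation.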

Some organisations may take longer than others to implement a Data-driven culture, but no organisation becomes Data-driven overnight! If the CDOs can see that the organisation needs a longer incubation period, it is a good idea to start by raising data awareness and introducing a BI/data warehousing team. Leaping directly into AI and hiring data scientists is not recommended: they will be left in the lurch if the organisation and its infrastructure are too rudimentary to make use of their expertise.

A Data-driven organisational culture starts with the right strategy, followed by the right people and technology, with the entire process evaluated and optimised at regular intervals.