Date: 4 January 2017    Author: Sebastian Sakowski PhD

Analytics and Big Data in the Energy Sector

Tremendous advances in computer technology, huge storage capabilities, and fast communication through computer networks have enabled the application of sophisticated statistical methods for knowledge discovery from data warehouses.

©PAP/Marcin Bednarski

In recent years, we have started storing and processing data like never before, and data mining techniques have become vital in various spheres of real life, e.g. the retail industry, finance, and fraud detection. Most electronic information was created only recently, and now we need only look into stored data to discover new knowledge. Nowadays, we can observe many big data and data mining applications in various areas, including the energy industry, chemical engineering, ecosystem modeling, and biomedical research. Data mining and big data techniques are still evolving, and the latest trends in these fields include web mining, text mining, and integration with data warehouse systems.

Analytics in business and industry

Investment banks and asset management firms use advanced mathematical algorithms to analyze huge data sets and make sound investment decisions. Goldman Sachs, one of the biggest investment banks in the world, uses artificial intelligence solutions to improve existing business processes. Meanwhile, JPMorgan, another global leader in financial services, is currently developing big data applications and robotics. These banks have large customer databases and could apply big data analytics to build risk prediction models and to calculate credit risk scores. Another example of applying big data analytics is Netflix, a multinational entertainment company that specializes in streaming media and attempts to predict which movies and series its clients may enjoy.

British Petroleum (BP) is an industry leader in using new computer technology such as big data and data mining. In Houston (USA) it has opened the “Center for High-Performance Computing” (CHPC), which houses one of the world’s largest supercomputers for commercial research. These computers apply big data techniques to support energy industry operations around the world. Using this approach, BP can gather data by means of seismic technology and geophysical sciences; these data can be applied in drilling and could help in oil and natural gas exploration. In addition, this data center opens new possibilities for applying data mining techniques in other areas of business activity, such as midstream and downstream. BP has also recently established the first “Big Data, Data Sciences Instruction and Research Centre” in Azerbaijan, located at ADA University in Baku, which aims to give people an opportunity to share knowledge about big data analytics.

Another oil and gas company, Royal Dutch Shell, collects data from sensors that monitor the low-frequency seismic waves of tectonic activity. These huge data sets are considered alongside information about exploration drilling in order to make recommendations about where to drill. Shell uses advanced computer technology to improve the operation of the machinery used in drilling and to streamline the transportation of oil and gas. Furthermore, Shell applies complex mathematical models to set prices at petrol stations and strives to increase the efficiency of oil and gas distribution.

Polish companies are in the initial phase of exploring big data technology, because data analytics requires investment in computer infrastructure. For this reason, the first projects in this area were developed by Poland’s largest companies. For example, Allegro, the largest online transaction platform in Poland, uses big data analytics to analyze user behavior and to help users find the products they want. In the coming years the biggest Polish companies intend to develop big data analytics, and those operating in the drilling and geophysical services industries have the potential to use big data analytics and pattern recognition techniques, as they process and store huge amounts of seismic data. This move is arguably necessary for companies to excel and compete in today’s global economy. Some Polish companies are able to build global strategies based on innovative technologies such as big data and cloud computing. For example, the Polish company Cloud Technologies has applied big data technology to online advertising and the optimization of advertising campaigns. Poland has great opportunities to develop further in the areas of big data and data mining, as Polish computer scientists enjoy global respect: Polish software developers have won some of the most challenging programming competitions in the world, such as the Google Code Jam and the Microsoft Imagine Cup. In the latest world ranking of the best countries for computer programming, Poland took third place. Therefore, supporting innovative areas of computer science such as big data is crucial from the strategic point of view of the Polish government. It is important both to develop entrepreneurship based on advanced computer techniques and to implement big data technology in Polish companies partly owned by the treasury.

Big data in the energy sector

The energy industry includes four main areas: the petroleum industry, the gas industry, the coal industry, and the electrical power industry. In the oil industry, for example, we can distinguish different use cases for big data analytics in the upstream, midstream, and downstream segments. In upstream, big data analytics is used to store and analyze seismic data and to reduce production failures. In midstream, big data infrastructure is used to store maintenance logs and to analyze transport data in real time. In downstream, big data technology has been employed to optimize gas station automation and to minimize financial risk. Big data techniques and data mining models could also be used by energy companies for high-frequency trading on financial and energy markets. The ability to collect and analyze data such as daily prices and information about trading positions can reduce the transaction risks associated with managing positions on commodity markets. Big data analytics could help commodity traders make better transactions on financial markets and help decision makers respond quickly to rapidly changing commodity markets. Such solutions can provide many advantages, including the discovery of new correlations among commodities, lower transaction costs, and better calculation of transaction risks. Big data techniques can also help trading organizations identify aberrant trading behavior and help regulate financial markets.
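
As a rough illustration of two of the ideas above, finding a correlation between commodities and spotting aberrant trading behavior, the following minimal Python sketch may help. All prices and trade sizes are invented for the example, the function names are hypothetical, and the two-standard-deviation threshold is an arbitrary assumption; a real trading desk would work with far larger data sets and more robust statistics.

```python
from math import sqrt
from statistics import mean, pstdev


def daily_returns(prices):
    """Relative day-over-day change for a list of daily closing prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flag_aberrant(values, threshold=2.0):
    """Indices of values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Invented daily prices for two commodities (illustrative only).
crude = [52.0, 52.6, 53.1, 52.4, 53.8, 54.2, 53.9, 54.8]
gasoline = [1.60, 1.62, 1.64, 1.61, 1.66, 1.67, 1.66, 1.69]

# Correlate the daily returns rather than the raw price levels.
rho = pearson(daily_returns(crude), daily_returns(gasoline))

# Invented trade sizes; the 950 entry is a deliberate outlier.
trade_sizes = [100, 110, 95, 105, 98, 950, 102, 107]
suspicious = flag_aberrant(trade_sizes)

print(f"correlation of daily returns: {rho:.3f}")
print(f"indices of suspicious trades: {suspicious}")
```

Correlating returns rather than price levels is a common choice because two series that merely trend in the same direction would otherwise appear strongly related.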

Most companies collect data from different sources, such as sensors that record technological drilling parameters, demographic data, and social media. Therefore, the integration and management of big data collections are crucial for companies working in international business. The complexity and diversity of this information require algorithmic approaches, implemented by data scientists who can use statistical analysis, reporting services, and neural networks. The energy industry uses business intelligence and data mining techniques to optimize decision making, to improve the efficiency of refineries, and to predict prices on financial markets. In recent years an increasing number of data mining systems have been presented and implemented in various domains. Most energy companies store huge amounts of information both in relational databases (e.g. SQL Server) and as unstructured data (e.g. text documents). This information includes data from IT infrastructure, such as real-time data from industrial sensors and environmental monitoring. Energy companies should use new methods and technologies, e.g. Hadoop and NoSQL, to integrate and analyze data. Hadoop allows for the distributed processing of data on clusters of computers and is designed so that computation scales with the size of the cluster. New technologies such as Hadoop have lowered the cost of data processing, thus making the analysis of large data sets more efficient. Other big data platforms that can be used in the energy industry include IBM’s open source InfoSphere platform, the Microsoft Upstream Reference Architecture (MURA), and the Oracle Architecture Development Process. Analytics in the energy industry provides tools for knowledge management and thereby generates new added value for economic growth. As such, energy companies that wish to use analytics in their businesses can start by developing big data and data mining strategies. Due to the complexity of these issues, big data strategies should be developed by a team composed of academics and experienced data scientists.
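
To make the idea of distributed processing more concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce pattern that Hadoop's MapReduce model is built around. It does not use any Hadoop API: the function names, the log format, and the sensor names are assumptions made purely for illustration, and each inner list merely stands in for a block of data that a real cluster would hold on a separate node.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (sensor_id, reading) pair from one raw log line 'sensor_id,reading'."""
    sensor_id, reading = record.split(",")
    yield sensor_id, float(reading)


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all readings for one sensor into their average."""
    return key, sum(values) / len(values)


# Hypothetical log lines; each inner list stands in for a data block on one node.
partitions = [
    ["pump-1,70.0", "pump-2,64.0"],
    ["pump-1,74.0", "pump-2,66.0", "pump-1,72.0"],
]

# On a cluster, the map calls would run in parallel on the nodes holding each block.
mapped = [pair for block in partitions for line in block for pair in map_phase(line)]
averages = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

print(averages)
```

Because each map call looks at a single record and each reduce call at a single key, the work can be spread over many machines without the steps needing to know about one another, which is what lets this style of processing scale.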

Data mining techniques

Data warehouses store information gathered from different sources, such as databases, computer systems, and social media. The amount and complexity of these data are such that a company must use sophisticated analytics techniques such as machine learning. The main goals of data mining techniques are to optimize business processes and to use data sets to improve business decisions. Data mining methods focus on extracting previously unknown (nontrivial) information from data; they use advanced computer science methods to discover knowledge in large volumes of data and to present the results of the analysis in an easily comprehensible form. Data mining can help a company’s managers make faster and better strategic decisions. These methods make it possible to discover patterns and relationships that occur in the data and then to build a model of the real world. Managers can use these models to make predictions and to improve business processes. Data mining is used to build many kinds of models, e.g. clustering, time series, and classification models, whose main aim is to solve business problems. The most popular applications of analytics in the energy industry are reducing energy costs and monitoring energy consumption.
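
As one possible illustration of a clustering model applied to energy consumption, the sketch below runs a plain k-means procedure on scalar values to separate low-consumption from high-consumption days. The readings are invented, the function name is hypothetical, and the evenly spread starting centroids are a simplification chosen to keep the example deterministic; production systems would normally rely on an established machine learning library.

```python
from statistics import mean


def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalar values (requires k >= 2); returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids evenly across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels


# Invented daily consumption readings in kWh (illustrative only).
daily_kwh = [8.2, 9.1, 7.8, 30.5, 28.9, 8.8, 31.2, 9.4]

centroids, labels = kmeans_1d(daily_kwh, k=2)

for reading, label in zip(daily_kwh, labels):
    group = "low" if centroids[label] == min(centroids) else "high"
    print(f"{reading:5.1f} kWh -> {group} consumption")
```

A manager could use such groups, for instance, to decide which days or sites deserve a closer look when trying to cut energy costs.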

The concept of big data, in turn, is to apply advanced computer analysis in practice by running complex mathematical models on large data sets. The main goal of using big data techniques is to improve the efficiency of strategic decision making. Effective analysis of data allows for quick decision making and quick responses to changes in the business environment.

Future trends

It is estimated that the global market for big data analytics is currently growing six times faster than the IT sector as a whole. There is little doubt that big data analytics has brought additional benefits to businesses and improved the functioning of corporations. Today’s businesses must be able to react more quickly, remain profitable, and focus on high-quality customer service. Company managers should be interested in new analytics methods such as data mining and business intelligence, which affect every industry in the modern world. The implementation of big data and data mining techniques has many advantages for companies, such as improving the efficiency of production processes, integrating data from different sensors, improving management processes, and speeding up exploration and production. Big data and data mining are major disciplines in computer science with a growing industrial impact, which clearly influences the energy industry. It is predicted that big data and data mining technology will face plenty of interesting new challenges in the future, such as finding ways to improve predictions and forecasting models. If companies can use data science and big data tools for making predictions, this is likely to optimize investments and help cut costs. In the near future we will see more smart machines with sophisticated sensors that will enable the collection of huge amounts of data. New products based on big data will be created, and many companies will have to apply big data analytics in order to increase their competitiveness in global markets.

All texts (except images) published by the Warsaw Institute Foundation may be disseminated on condition that their origin is stated.