Leveraging an on-premise, open source Hadoop environment for data analysis, Syngenta built several proof of concept (PoC) data analytics solutions. However, the data processes running on this infrastructure were not integrated, automated nor stable, and the data analytics tools had limited bandwidth. In addition, there was no dedicated processing power or storage for most of the system components and the processing power had to be started and stopped manually to avoid growing operating costs.
Additionally, while the system’s load was predictable most of the time, the PoC platform was unable to scale for data ingestion peaks during harvest periods. These harvest periods are the most demanding and important timeframes for the Syngenta team because the data scientists and applications require a steady supply of data to make business decisions without delay or interruption due to load issues.