acm - an acm publication


  • Big data: developing an open source "big data" cognitive computing platform

    Leveraging diverse data types requires a robust and dynamic approach to systems design, and the needs of a data scientist are as varied as the questions being explored. Compute systems have traditionally focused on the management and analysis of structured data as the driving force of business analytics. As open source platforms have evolved, the ability to apply compute to unstructured information has opened up an array of platforms and tools to the business and technical communities. We have developed a platform that meets analytics users' requirements for both structured and unstructured data. This analytics workbench is organized around acquisition, transformation, and analysis. It uses open source tools such as Nutch, Tika, Elastic, Python, PostgreSQL, and Django to implement a cognitive compute environment that can handle widely diverse data and can leverage the ever-expanding capabilities of infrastructure to provide intelligence augmentation.