@DXWorldExpo: Blog Post

Big Data: Enterprise Class Machine Learning with Spark and MLbase

New Infrastructure Aims at Economical, In-memory, Large Scale Machine Learning

Machine Learning is a critical part of extracting value from Big Data. Choosing the proper model, preparing the data, and getting usable results on large-scale data is a non-trivial exercise. Typically the process consists of prototyping a model in a higher-level, (mostly) single-machine tool such as R, Matlab, or Weka, then recoding it in Java or some other language for large-scale deployment. This process is fairly involved, error-prone, slow, and inefficient.

Existing tools that aim to automate and improve this process are still somewhat immature, and wide-scale enterprise adoption of Machine Learning is still low. Efforts are under way to address this gap, i.e., to make enterprise-class Machine Learning more accessible and easier.

Spark is a new, purpose-built, distributed, in-memory engine that makes it possible to perform compute-intensive jobs on commodity hardware clusters. One of the applications Spark targets, and is especially suitable for, is Machine Learning, a key part of getting actionable insights from Big Data.

Machine Learning is a compute-intensive application, characterized by many iterative passes through the data until an optimal solution is found. Spark is a natural fit for such workloads because the data can stay in memory between passes.
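
To make the "many iterative passes" point concrete, here is a minimal sketch in plain Python. It is not Spark or MLbase code; the function name `fit_slope`, its parameters, and the sample data are all assumptions made for illustration. It fits a one-parameter model by batch gradient descent and reports how many full passes over the data were needed.

```python
# Illustrative only: plain Python, not Spark or MLbase code.
# Fits y = w * x by batch gradient descent, re-reading the same
# in-memory dataset on every pass.

def fit_slope(points, learning_rate=0.01, max_passes=1000, tolerance=1e-9):
    """Return (w, passes_used) for the least-squares fit y = w * x."""
    w = 0.0
    n = len(points)
    for passes_used in range(1, max_passes + 1):
        # One full pass over the data to compute the gradient of the
        # mean squared error with respect to w.
        gradient = sum(2.0 * (w * x - y) * x for x, y in points) / n
        step = learning_rate * gradient
        w -= step
        # Stop once an update no longer changes w meaningfully.
        if abs(step) < tolerance:
            break
    return w, passes_used


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # hypothetical data, y = 2x
slope, passes = fit_slope(data)
print(f"slope={slope:.4f} after {passes} passes")
```

Even this tiny problem takes many passes to converge. When every pass has to reread the dataset from disk, that cost is multiplied by the number of passes, which is the situation an in-memory engine is meant to avoid.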

MLbase, an open source project, is an ML platform implemented on top of Spark that aims to make implementing ML algorithms easier and more productive.

Arguably the most interesting part of MLbase will be the ML Optimizer (not released yet), which will automate the task of choosing models.

Choosing the proper model is a difficult task, and there have been quite a few attempts to automate it. One of the most interesting available products is the Google Prediction API, a cloud service that automatically evaluates, picks, and executes a model on submitted data.
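
As a rough illustration of what automating model choice can mean, the sketch below fits two candidate models on training data and keeps whichever has the lower error on held-out data. This is a generic holdout-validation loop in plain Python, not the method used by the ML Optimizer or the Google Prediction API; every name in it (`fit_mean`, `fit_line`, `choose_model`) and the sample data are hypothetical.

```python
# Illustrative only: a toy model-selection loop in plain Python.
# It is not the MLbase ML Optimizer or the Google Prediction API.

def fit_mean(train):
    """Constant model: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Least-squares line y = a + b * x, computed in closed form."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in train)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mean_squared_error(model, rows):
    """Average squared prediction error of a fitted model on rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def choose_model(candidates, train, validation):
    """Fit each candidate on train; return the name with the lowest validation error."""
    scores = {name: mean_squared_error(fit(train), validation)
              for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data following roughly y = 3x + 1.
train = [(0.0, 1.1), (1.0, 3.9), (2.0, 7.2), (3.0, 9.8)]
validation = [(4.0, 13.1), (5.0, 15.9)]
candidates = {"mean": fit_mean, "line": fit_line}

best_name, validation_scores = choose_model(candidates, train, validation)
print(best_name, validation_scores)
```

A real system would search over many more model families and their tuning parameters, so the number of candidate fits grows quickly and benefits from a distributed engine.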

More Stories By Ranko Mosic

Ranko Mosic, BScEng, specializes in Big Data/Data Architecture consulting services (database/data architecture, machine learning). His clients are in the finance, retail, and telecommunications industries. Ranko welcomes inquiries about his availability for consulting engagements and can be reached at 408-757-0053 or [email protected]
