Wallaroo Labs Promises Easily-Scalable Big Data Processing Infrastructure

Built on Pony

Founded in 2014 and previously called Sendence, the New York-based company changed its name to Wallaroo Labs, after its sole product, last fall. Jain said the name change marked the evolution from a consultancy to a product company — and its staff was more willing to wear a Wallaroo T-shirt than one with the name Sendence.

It set out with several high-level goals in mind:

To allow developers to focus on their business logic rather than distributed computing “plumbing.”
To provide portable, high-performance and low-latency data processing.
To manage in-memory state for the application.
To allow applications to scale as needed, even when live.

It decided to start from scratch, aside from building atop a language called Pony, an object-oriented, actor-model high-performance open source programming language that’s compatible with C.

The software supports two types of sources and sinks: TCP and Kafka. In a blog post about building its own Kafka client, the company explained that Pony provides reliable, low-overhead concurrency with data safety, though it is not a practical option for use with the JVM because of the overhead involved.

The Wallaroo Labs platform supports Go and Python natively with the Go API and Python API, and provides its own processing engine.

“Say you’re a Python developer. You’d use the Python API to implement the business logic: What is this data? How do I write in this data? What operations do I need to do on the data? Apply some machine learning to it. Put in some result or alert. Developers can write these code snippets that are the business logic. Just that. They don’t have to deal with any scaling issues or plumbing issues or messaging issues or if something crashes how to restart it. All that is taken care of by us,” Jain explained.

That code runs inside the Wallaroo Labs engine, which is spread across however many servers you need.

“It could be on your laptop, on 20 machines with AWS, on five machines on premise, on 50 machines on Google Cloud. It just runs wherever it needs to run and that’s totally transparent to the developer,” he said.
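
The division of labor Jain describes can be sketched in plain Python. This is not the Wallaroo API: `decode`, `is_large_order`, `to_alert` and `run_pipeline` are hypothetical names chosen for illustration, and the order data is made up. The developer writes the three small functions, while a stand-in runner plays the part of the engine.

```python
import json
from typing import Callable, Iterable, Iterator, List, Optional

# Business logic: small functions with no knowledge of where they run.
def decode(raw: bytes) -> dict:
    """What is this data? Parse one incoming message."""
    return json.loads(raw.decode("utf-8"))

def is_large_order(order: dict) -> Optional[dict]:
    """What operation do I apply? Keep only orders above a threshold."""
    return order if order["amount"] > 1000 else None

def to_alert(order: dict) -> str:
    """What result do I emit? Format an alert line."""
    return f"ALERT {order['id']} amount={order['amount']}"

# Stand-in for the engine (hypothetical). A real engine would also handle
# distribution, scaling and recovery; this one only chains the steps.
def run_pipeline(source: Iterable[bytes], steps: List[Callable]) -> Iterator:
    for item in source:
        for step in steps:
            item = step(item)
            if item is None:  # a step dropped the message
                break
        else:
            yield item  # reached only when no step dropped the message

messages = [b'{"id": "a1", "amount": 250}', b'{"id": "b2", "amount": 4200}']
alerts = list(run_pipeline(messages, [decode, is_large_order, to_alert]))
```

Nothing in the three business-logic functions refers to servers, sockets or retries, so in principle the same functions could be handed to a runner on a laptop or on a cluster.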

“We wanted to build a framework for the future that combines the best of serverless and the best of the Big Data stack,” Jain said.

Functions-as-a-service offerings such as AWS Lambda are easy to implement and easy to scale, he explained. You implement the business logic, you don’t need to worry about where it’s running or how it scales, and you can work in modern languages like Python, Go and JavaScript. But existing serverless frameworks are not very high performance and cannot handle complicated applications with complex workflows that combine and split different sources of data. And if you’re using serverless, you have cloud lock-in.

Big Data solutions, meanwhile, are harder to use and harder to scale, and they’re mostly written in Java. If you want to write some Python or Golang code, they’re not really designed for that. But they’re portable: you can run them anywhere, on-prem or in any cloud.

Wallaroo is easy to scale, lets you use modern languages like Python and Go, and has much better performance than even the Big Data solutions, according to Jain. It can run anywhere and handle complex applications.

It eliminates the performance hit incurred when developing in Python or Go and then translating to Java for production. That whole setup for sending data back and forth becomes complex and difficult to scale, he said.

Handling State

The actor model approach encapsulates data, minimizes coordination and keeps application state close to the computation.
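
As a rough illustration of that idea, the sketch below simulates actors in a single thread. It does not show Pony or Wallaroo internals; `CounterActor` and `route` are hypothetical names, and the exchange names are sample data. Each actor keeps its state private and changes it only in response to messages, and every key is routed to exactly one actor, so no two actors ever need to coordinate over the same piece of state.

```python
import zlib
from collections import deque

class CounterActor:
    """Owns its state privately; the only way to affect it is a message."""

    def __init__(self):
        self._counts = {}        # state lives next to the computation
        self._mailbox = deque()  # messages wait here until processed

    def send(self, key):
        self._mailbox.append(key)

    def process_all(self):
        # Drain the mailbox one message at a time, so no locks are needed.
        while self._mailbox:
            key = self._mailbox.popleft()
            self._counts[key] = self._counts.get(key, 0) + 1

    def snapshot(self):
        return dict(self._counts)  # hand out a copy, never the state itself

def route(key, actors):
    """Pick the one actor that owns a key; crc32 keeps routing stable across runs."""
    return actors[zlib.crc32(key.encode("utf-8")) % len(actors)]

actors = [CounterActor() for _ in range(3)]
for word in ["nyse", "nasdaq", "nyse", "lse", "nyse"]:
    route(word, actors).send(word)
for actor in actors:
    actor.process_all()

# Keys never overlap between actors, so merging the snapshots is conflict-free.
totals = {}
for actor in actors:
    totals.update(actor.snapshot())
```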

Managing stateful applications is key to Wallaroo, something most serverless frameworks don’t handle well, Jain said.

“You as an application developer don’t have to think about resiliency or scalability of that state. That’s one of the things our engine and API handle for you,” Jain said.

“StateComputation” is one of the building blocks of a Wallaroo application. Updates are written to an event log that can be replayed in case of failure. Exactly-once message processing is another option to eliminate duplicates.
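
A minimal sketch of the replay idea follows, written in plain Python rather than against the Wallaroo API. `ResilientTotals` is a hypothetical class, the log is kept in memory for brevity, and dropping duplicates by message ID is an assumption about how exactly-once behavior might be approximated, not a description of Wallaroo's mechanism.

```python
class ResilientTotals:
    """Keeps running totals per key and can rebuild them from an event log."""

    def __init__(self, log=None):
        self.log = [] if log is None else log  # append-only event log
        self.totals = {}
        self.seen_ids = set()
        # Recovery: replay every logged update to rebuild in-memory state.
        for msg_id, key, amount in self.log:
            self._apply(msg_id, key, amount)

    def _apply(self, msg_id, key, amount):
        self.seen_ids.add(msg_id)
        self.totals[key] = self.totals.get(key, 0) + amount

    def handle(self, msg_id, key, amount):
        """Process one message; return False if it was a duplicate."""
        if msg_id in self.seen_ids:
            return False  # already applied, so a redelivery changes nothing
        self.log.append((msg_id, key, amount))  # log first, then update state
        self._apply(msg_id, key, amount)
        return True

worker = ResilientTotals()
worker.handle(1, "acct-7", 50)
worker.handle(2, "acct-7", 25)
duplicate_applied = worker.handle(2, "acct-7", 25)  # simulated redelivery

# Simulate a crash: a fresh worker rebuilds its state from a copy of the log.
recovered = ResilientTotals(log=list(worker.log))
```

Because the set of seen message IDs is rebuilt during replay as well, a message redelivered after recovery is still recognized as a duplicate.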

The company’s closest competitors are the DIY crowd, which is the biggest chunk, followed by those pursuing serverless and Java developers building on Spark and Storm. Companies with large Java investments are not its target, Jain said.

“There’s a whole continuum. You have people who have more complex workflows or more high-performance needs than serverless. You have people who have complex workflows or more high-performance needs, but they want to work in Golang or Python,” he said. Wallaroo might be the answer for them.

Feature image via Pixabay.

Source: InApps.net

Phu Nguyen
