March 28, 2022 by Phu Nguyen

Nordstrom Builds Flexible Backend Ops with Kubernetes, Spark and JanusGraph

Key Summary

Overview: The article by InApps Technology details how Nordstrom, a leading U.S. retailer, leveraged Kubernetes, Apache Spark, and JanusGraph to build a scalable, flexible backend system to support its e-commerce and retail operations, enhancing data processing, personalization, and operational efficiency.
Nordstrom’s Context:
- Background: A major retailer with a significant online and in-store presence, handling complex workloads like inventory management, customer personalization, and real-time analytics.
- Challenges:
  - Managing high-volume, heterogeneous data from e-commerce transactions, customer interactions, and supply chain.
  - Scaling infrastructure to handle peak traffic (e.g., Black Friday sales).
  - Enabling real-time personalization and recommendations for millions of customers.
  - Integrating diverse data sources for unified analytics and decision-making.
- Goal: Build a robust, scalable backend to support dynamic retail operations while ensuring performance, flexibility, and cost efficiency.
Technologies Used:
- 1. Kubernetes:
  - Description: An open-source platform for container orchestration, managing deployment, scaling, and operations of containerized applications.
  - Role at Nordstrom:
    - Orchestrated microservices for e-commerce applications (e.g., order processing, inventory tracking).
    - Enabled auto-scaling to handle traffic spikes during sales events.
    - Provided fault tolerance with automatic restarts and load balancing.
  - Example: Deployed containerized APIs on Kubernetes to process real-time customer queries.
- 2. Apache Spark:
  - Description: A distributed computing framework for big data processing, excelling at batch and streaming analytics.
  - Role at Nordstrom:
    - Processed large-scale data (e.g., sales, customer behavior) for analytics and ML model training.
    - Handled real-time data streams (e.g., clickstream data) for dynamic personalization.
    - Integrated with Kubernetes for scalable, containerized Spark clusters.
  - Example: Analyzed petabytes of transaction data to generate sales forecasts.
- 3. JanusGraph:
  - Description: An open-source, distributed graph database optimized for complex, highly connected data.
  - Role at Nordstrom:
    - Managed relationships (e.g., customer-product interactions, supply chain networks) for recommendation systems.
    - Supported real-time graph queries for personalized shopping experiences.
    - Scaled across distributed nodes for high availability and performance.
  - Example: Powered a recommendation engine by modeling customer purchase histories as a graph.
How Nordstrom Built Flexible Backend Ops:
- 1. Microservices Architecture with Kubernetes:
  - Implementation: Deployed microservices (e.g., inventory, checkout, user profiles) in containers managed by Kubernetes.
  - Impact: Improved modularity, enabling independent scaling and updates of services.
  - Example: Scaled checkout service during peak sales without affecting inventory APIs.
- 2. Big Data Processing with Spark:
  - Implementation: Ran Spark jobs on Kubernetes clusters to process batch and streaming data for analytics and ML.
  - Impact: Accelerated data processing for real-time insights and predictive models.
  - Example: Processed clickstream data to update product recommendations in real-time.
- 3. Graph-Based Personalization with JanusGraph:
  - Implementation: Stored customer, product, and interaction data in JanusGraph for graph-based queries.
  - Impact: Enabled fast, personalized recommendations by traversing relationship graphs.
  - Example: Suggested products based on a customer’s browsing and purchase history.
- 4. Unified Data Pipeline:
  - Implementation: Integrated Kubernetes, Spark, and JanusGraph into a cohesive pipeline, with Spark feeding processed data to JanusGraph and Kubernetes orchestrating all components.
  - Impact: Streamlined data flow from ingestion to analytics to customer-facing applications.
  - Example: Ingested sales data via Spark, stored relationships in JanusGraph, and served recommendations via Kubernetes-hosted APIs.
- 5. Scalability and Resilience:
  - Implementation: Leveraged Kubernetes auto-scaling, Spark’s distributed processing, and JanusGraph’s fault-tolerant design.
  - Impact: Handled peak loads and ensured uptime during high-traffic events.
  - Example: Maintained performance during a 10x traffic surge on Cyber Monday.
Benefits:
- Scalability: Handled millions of transactions and users during peak retail events.
- Flexibility: Microservices and graph-based systems supported rapid feature development.
- Cost Efficiency:
  - Optimized resource usage with Kubernetes and Spark.
- Personalization: Real-time recommendations improved customer engagement and sales.
- Reliability: Fault-tolerant architecture ensured continuous operation.
Challenges:
- Complexity: Integrating Kubernetes, Spark, and JanusGraph required expertise in distributed systems.
- Resource Management: Optimizing Spark and Kubernetes clusters for cost and performance was challenging.
- Data Consistency: Ensuring consistency across Spark’s analytics and JanusGraph’s graph data required careful design.
- Learning Curve: Teams needed training to leverage JanusGraph’s graph query language (Gremlin).
Security Considerations:
- Encryption: Used TLS for data in transit and encryption at rest for Kubernetes, Spark, and JanusGraph.
- Access Control: Implemented RBAC in Kubernetes and JanusGraph to restrict data access.
- Monitoring: Deployed Prometheus and Grafana to track system health and detect anomalies.
- Network Security: Ran services in private VPCs to limit external exposure.
Use Cases:
- E-commerce: Real-time inventory updates and personalized recommendations during sales events.
- Customer Analytics: Analyzing purchase patterns with Spark for targeted marketing.
- Supply Chain: Modeling supplier relationships in JanusGraph for optimized logistics.
- Fraud Detection: Processing transaction data with Spark to identify suspicious patterns.
Recommendations:
- Use Kubernetes for orchestrating microservices and Spark jobs to ensure scalability.
- Leverage JanusGraph for relationship-driven use cases like recommendations.
- Monitor performance with Prometheus to optimize resource usage and detect issues.

As customers come to expect more flexibility in the way they shop, Nordstrom has been experimenting with ways to optimize its supply chain.

The Seattle-based retailer has been online since 1998. Today it operates 115 department stores, along with what it calls Omni hubs, which store inventory but offer no retail services, and what it calls local stores, which don’t hold inventory but offer an array of retail services. For instance, you can get a pair of pants hemmed there. And then there are “vertical optimized fulfillment centers” that handle subsets of inventory, such as small beauty items (a tube of lipstick, for instance) that are handled differently.

“Our customers expect a lot more flexibility. They want to know when they can get things, where they can get them, at what cost. So that’s what we’re working with in today’s environment,” said senior software engineer Jeff Callahan, speaking recently at ApacheCon North America on how Nordstrom uses the JanusGraph open source graph database, Cassandra and Spark in backend operations.

One idea the company has been working on lately is called cost-based routing. As an example, he showed a slide of Nordstrom stores in the Los Angeles area arranged in a cloverleaf pattern, though that’s not how they actually appear on the map. Suppose a customer orders socks, shoes, and a jacket, each item is located at a store in a different circle of the cloverleaf, and the customer wants to pick all of them up at a store in the fourth circle. The question is how to fulfill that order most efficiently, and at what cost.

“As soon as you look at this across the entire country, … gets really complex really quickly,” he said. “Inventory is constantly selling, moving. So we have to know where the inventory is and how soon it’s going to be available. Staffing can obviously affect these things. So if a store doesn’t have enough staff to go out on the floor, take items to ship, then it’s going to be hard to get it on that truck. …And then of course, in L.A. no less, traffic can definitely be a big factor.”
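The cloverleaf example can be sketched as a shortest-path problem. The following is a minimal illustration only, not Nordstrom's routing logic: the store names and dollar costs are invented, and Dijkstra's algorithm stands in for whatever cost model the real system uses, which would also have to account for inventory, staffing and traffic.

```python
import heapq

# Hypothetical transfer costs (in dollars) between four stores, one per
# circle of the cloverleaf. These names and numbers are made up.
TRANSFER_COST = {
    "store_a": {"store_b": 4.0, "store_d": 9.0},
    "store_b": {"store_a": 4.0, "store_c": 3.0, "store_d": 4.5},
    "store_c": {"store_b": 3.0, "store_d": 6.0},
    "store_d": {"store_a": 9.0, "store_b": 4.5, "store_c": 6.0},
}


def cheapest_cost(graph, source, target):
    """Return the lowest total transfer cost from source to target (Dijkstra)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")  # target is unreachable


def route_order(graph, item_locations, pickup_store):
    """Map each item to the cheapest cost of moving it to the pickup store."""
    return {
        item: cheapest_cost(graph, store, pickup_store)
        for item, store in item_locations.items()
    }


# Socks, shoes and a jacket sit in three different stores; pickup is at the fourth.
order = {"socks": "store_a", "shoes": "store_b", "jacket": "store_c"}
costs = route_order(TRANSFER_COST, order, "store_d")
total = sum(costs.values())
print(costs, total)
```

In this toy data the socks are cheaper to route through a second store than to send directly, which is the kind of non-obvious answer that becomes hard to find by hand once the graph covers the whole country.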

The company is supporting new fulfillment options, including pickup in store, next-day pickup and courier delivery, from a total of 150 sites with a variety of carriers and levels of service. Meeting customer expectations is its first priority, followed by reducing the company’s cost.

The technology also has to serve facility managers, who may need to stop a facility from receiving more orders if it becomes overwhelmed.

Paired with Data Science

The technology has to be flexible enough to support new concepts, such as cost-based routing, or others the business might come up with later. And it must fit within Nordstrom’s existing technology. It was developed in partnership with the data science team.

“Kubernetes is a big part of what we do at Nordstrom,” he said.

JanusGraph “plays nicely with it with a pluggable set of backend components. And it orchestrates a client’s interaction among those backend components.” Nordstrom uses SolrCloud for the indexing system and Cassandra for the data layer. JanusGraph uses ZooKeeper as coordinator.

The JanusGraph data model stores facilities as vertices, transport options as edges and real-time telemetry data as properties. The system includes a graph backend and graph client that Nordstrom created. It adopted an “embedded JanusGraph” pattern for the graph client, which includes JanusGraph libraries and runs in the same JVM as the application. Data pipelines define the flow of data across the backend in support of client services.
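The vertex, edge and property roles described above can be illustrated with plain Python data structures. This is a hedged sketch of the shape of the model only: it does not use the JanusGraph API, and every class, field and property name here (`Facility`, `TransportOption`, `staff_on_floor`, `transit_minutes` and so on) is hypothetical rather than taken from Nordstrom's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Facility:
    """A vertex: one store, hub or fulfillment center."""
    facility_id: str
    kind: str
    properties: dict = field(default_factory=dict)  # real-time telemetry


@dataclass
class TransportOption:
    """An edge: one way of moving goods between two facilities."""
    source: str
    target: str
    carrier: str
    properties: dict = field(default_factory=dict)  # real-time telemetry


class FulfillmentGraph:
    """A tiny in-memory stand-in for the graph backend."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_facility(self, facility):
        self.vertices[facility.facility_id] = facility

    def add_transport(self, option):
        # Reject edges whose endpoints are not known facilities.
        if option.source not in self.vertices or option.target not in self.vertices:
            raise KeyError("both endpoints must be added as facilities first")
        self.edges.append(option)

    def options_from(self, facility_id):
        """Return every transport option leaving the given facility."""
        return [edge for edge in self.edges if edge.source == facility_id]


graph = FulfillmentGraph()
graph.add_facility(Facility("hub_1", "omni_hub", {"staff_on_floor": 12}))
graph.add_facility(Facility("store_1", "local_store", {"accepting_orders": True}))
graph.add_transport(
    TransportOption("hub_1", "store_1", "courier", {"transit_minutes": 45})
)

# Telemetry arrives as property updates on existing vertices and edges.
graph.vertices["hub_1"].properties["staff_on_floor"] = 9
```

Keeping telemetry in properties means the structure of the graph (which facilities exist and how they connect) changes rarely, while the values that drive routing decisions can be updated continuously.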

“The back end is really what we refer to as our solution,” he said.

Cassandra and SolrCloud are each deployed as a single Helm chart.

“You can just single-command deploy that and it’s up and running in Kubernetes. And then the same thing with ZooKeeper. You can express dependencies and boundaries. But it really abstracts the sort of details of that deployment and configuration so that’s very well encapsulated,” he said.

These charts are all rolled into a single Helm chart, so the team can deploy the entire back end with it, come back in about 10 minutes, and it’s ready to run in Kubernetes.

“All these pieces are already configured at the end of that; they interact with each other. It leaves a config map, which is basically a set of properties that clients can use to connect to the back end,” he said.
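A client could consume such a config map roughly as follows. This is a sketch under assumptions: the talk does not show the config map's contents, so the hostnames and values below are invented, and the keys are chosen to follow JanusGraph's configuration naming for a CQL storage backend and a Solr index rather than copied from Nordstrom's deployment.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical config map data. The service hostnames are illustrative only.
CONFIG_MAP_DATA = """
# written by the backend chart (illustrative values)
storage.backend=cql
storage.hostname=cassandra.graph-backend.svc.cluster.local
index.search.backend=solr
index.search.solr.mode=cloud
index.search.solr.zookeeper-url=zookeeper.graph-backend.svc.cluster.local:2181
"""

client_props = parse_properties(CONFIG_MAP_DATA)
for key, value in sorted(client_props.items()):
    print(f"{key} = {value}")
```

The appeal of this pattern is that clients never hard-code backend addresses: whatever the chart deployed, the config map describes it.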

Managing Relationships

Today, the fully populated graph has about 1 million vertices and 100 million edges, which includes SKUs [individual item numbers] as well as all the different shipping options.

“There are relationships there that, once you have those million vertices, just getting those connections, we end up with about 100 billion edges,” he said.

Today, the system processes 100 million daily events through that back end, and the team expects the load to increase by one to two orders of magnitude, he said.
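To put those volumes in perspective, the daily figures can be converted to average per-second rates. The calculation below is simple arithmetic on the numbers quoted in the talk. It assumes events are spread evenly across the day, which real retail traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_events_per_second(daily_events):
    """Convert a daily event count to an average per-second rate."""
    return daily_events / SECONDS_PER_DAY


current_daily = 100_000_000  # 100 million events per day, as quoted
current_rate = average_events_per_second(current_daily)
rate_10x = average_events_per_second(current_daily * 10)    # one order of magnitude
rate_100x = average_events_per_second(current_daily * 100)  # two orders of magnitude

print(f"today: {current_rate:,.0f} events/s")
print(f"10x:   {rate_10x:,.0f} events/s")
print(f"100x:  {rate_100x:,.0f} events/s")
```

On average, that is roughly 1,157 events per second today, rising to somewhere between about 11,600 and 115,700 per second if the expected growth materializes.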

The only real problem the team encountered with JanusGraph was that SparkGraphComputer was blocked when used with the CQL backend, which the team worked around by coding custom Spark jobs.

The company has found about a 10x reduction in actual dollars spent versus DynamoDB, but has yet to determine total cost of ownership, he said.

The pipelines take the form of OLAP (analytics) and OLTP (transactions) style interactions with the graph. The system performs between 20 and 100-plus concurrent interactions with the graph.

Among the lessons learned:

- Complexity is real. “For a system like this, when you’re trying to operate in production, you can’t be naive about it. You just acknowledge that and plan for it,” he said.
- Discipline in DevOps is a huge key. “We had to all agree that that was a priority for us,” he said.
- JAR dependencies could become confusing at times, especially with Spark involved. With several different, complicated projects coming together in this back end, JAR conflicts sometimes created bizarre problems that were difficult to resolve.
- Helm charts were “a huge win.” “I highly recommend it for Kubernetes users,” he said. “It really is configuration as code, so we’re getting source control and tracking. And when you have all these complex systems that individually are hard to manage, having Helm charts to help you really made it much easier.”

Source: InApps.net