How do you remove a background from a video in real time?

Real-time video background removal needs per-frame segmentation, which a trained neural network handles where older motion-based methods fail. Banuba's background changer API does this on-device for both photos and video, so the background can be blurred or replaced live without sending frames to a server.

Is background subtraction done on-device or in the cloud?

With Banuba, it runs fully on-device. No camera frame leaves the user's phone, which is the on-device, no-cloud guarantee buyers most often ask for. The setup and capability details are in the Banuba Face AR SDK documentation.

Does a background removal API slow down a mobile app?

Not meaningfully when it is built for real time. Banuba's segmentation holds a stable 30 FPS on supported hardware, and the Web build starts around a 12 MB baseline depending on enabled features. You can see how it is integrated and benchmark it in Banuba's sample apps.

Can a background removal API for developers work across platforms?

Yes. Banuba ships background subtraction in a single SDK that covers iOS, Android, Flutter, React Native, Web, and desktop, distributed as CocoaPods, Maven, and npm packages. The Banuba GitHub organization hosts the integration samples for each platform.

June 30, 2026

How to Implement Background Subtraction Using an SDK

Background subtraction separates a person from everything behind them in a live camera feed, then replaces or blurs that background in real time. Banuba Face AR SDK is a real-time, on-device face tracking and AR effects SDK that runs at 60 FPS on mid-range mobile hardware with a -90° to +90° head-angle tracking range. That on-device design is what lets background subtraction run frame by frame on a phone instead of round-tripping every frame to a server.

Most teams reach for background subtraction when they are shipping video conferencing, live streaming, or a camera-first social app and discover that training a segmentation network and tuning it across thousands of devices is a multi-quarter project on its own. An SDK collapses that work into an integration task. This guide walks through what background subtraction actually involves, when a from-scratch build is worth it, and how to wire up the feature with Banuba's segmentation pipeline.

Written by Tania Rohachuk.
Technically reviewed by Artem Harytonau.

Originally posted on Tuesday, June 30, 2026

Last updated on Tuesday, June 30, 2026

TL;DR

Background subtraction is real-time person segmentation: the model classifies every pixel as foreground (the user) or background, then your app replaces, blurs, or recolors the background.
Banuba Face AR SDK ships background separation as a built-in neural network, so you call the Virtual Background API rather than train and maintain your own segmentation model.
Banuba's segmentation holds a stable 30 FPS for backgrounds on supported mobile hardware, which keeps the camera preview smooth on mid-range and older devices.
The Banuba Web build adds roughly 12 MB (the BanubaSDK.wasm baseline). The exact size depends on the feature set you enable, so budget it against page-load and bandwidth targets up front.
Processing runs fully on-device, with no frame leaving the phone, which answers the most common privacy requirement Banuba sees from buyers.
One Banuba SDK covers iOS, Android, Flutter, React Native, Web, and desktop, so you build background subtraction once instead of per platform.
A background removal API for developers is the faster path when your differentiation is the product around the camera, not the computer-vision research underneath it.

What is background subtraction, and how does it work?

Background subtraction is the task of deciding, for each pixel in a frame, whether it belongs to the person in front of the camera or to the scene behind them. Classic computer vision did this by modeling a static background and flagging pixels that changed, which broke the moment the camera or lighting moved. Modern background subtraction using deep learning replaces that with a trained segmentation network that recognizes people directly, so it works on a moving handheld camera and holds up across skin tones, clothing, and cluttered rooms.
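To make the failure mode of the classic approach concrete, here is a minimal sketch of static-background modeling on a toy greyscale frame. The pixel values, the threshold of 30, and the function names are illustrative assumptions, not taken from any SDK. It shows that a person is detected correctly, but a simple lighting change flags the whole empty room as foreground.

```python
"""Classic background subtraction against a static background model (toy sketch)."""


def foreground_mask(background, frame, threshold=30):
    """Flag a pixel as foreground when it differs from the background
    model by more than `threshold` grey levels (threshold is an assumption)."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def foreground_ratio(mask):
    """Fraction of pixels flagged as foreground."""
    flat = [v for row in mask for v in row]
    return sum(flat) / len(flat)


# A 3x4 greyscale "empty room" used as the background model.
background = [[100, 100, 100, 100] for _ in range(3)]

# The same room with a bright "person" in the two middle columns.
with_person = [[100, 200, 200, 100] for _ in range(3)]

# The same empty room after the lights are turned up by 50 grey levels.
lights_up = [[p + 50 for p in row] for row in background]

person_ratio = foreground_ratio(foreground_mask(background, with_person))
lighting_ratio = foreground_ratio(foreground_mask(background, lights_up))

print(person_ratio)    # 0.5: only the person's pixels are flagged
print(lighting_ratio)  # 1.0: every pixel of the empty room is misclassified
```

A trained segmentation network avoids this because it looks for the person rather than for change relative to a stored background.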

Once you have a clean per-pixel mask, the background becomes editable. You can blur it for a video call, swap in a branded image for a kiosk, drop in a looping video, or place the user inside a 360-degree virtual environment. Banuba's background subtraction technology separates the person without pixelated borders or blurred-away hair strands, which is usually where naive masks fall apart.
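As a rough illustration of what "editable" means here, the sketch below composites a camera frame over a replacement background using a per-pixel mask. It is a generic alpha-blend on toy greyscale data, not Banuba's internal rendering; the `composite` function and the sample values are assumptions made for the example. A soft mask value such as 0.5 stands in for an edge pixel, for example a hair strand.

```python
def composite(person, replacement, mask):
    """Blend per pixel: out = m * person + (1 - m) * replacement,
    where m is the mask value in [0, 1] (1 = person, 0 = background)."""
    return [
        [round(m * p + (1.0 - m) * r) for p, r, m in zip(p_row, r_row, m_row)]
        for p_row, r_row, m_row in zip(person, replacement, mask)
    ]


camera = [[200, 200, 200, 200]]      # one row of a greyscale camera frame
backdrop = [[0, 0, 0, 0]]            # replacement background (black)
mask = [[1.0, 0.5, 0.0, 0.0]]        # person, soft edge, background, background

result = composite(camera, backdrop, mask)
print(result)  # [[200, 100, 0, 0]]
```

A hard 0/1 mask would force the edge pixel to one side or the other, which is where jagged borders tend to come from.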

This is the same capability people search for as a remove background API or a background removal API. The difference in a live app is that the work has to happen every frame, in real time, on the user's device, not as a one-off call on a single uploaded photo.

Should you build background subtraction from scratch or use an SDK?

Building from scratch means owning the full pipeline: collecting and labeling a segmentation dataset, training a model that generalizes across devices and lighting, compressing it to run in real time on a phone, and then maintaining it as new hardware ships. For a team whose core product is the segmentation model itself, that investment makes sense. For everyone else, it is months of specialized computer-vision work before the first usable frame.

Using an SDK inverts that. The model, the real-time inference, and the cross-platform rendering are already solved, and you integrate the result. The honest tradeoff is customization: you work within the SDK's segmentation behavior and tuning rather than shaping every layer yourself. For the large majority of conferencing, streaming, and social use cases, that ceiling sits well above what the product needs, and the time saved is the whole point.

A useful test: if losing six months to model research would not differentiate your product, an image background removal API or on-device SDK is the pragmatic call. If the segmentation quality itself is your moat, build it.

How do you implement background subtraction with the Banuba Face AR SDK?

Banuba exposes background separation through its Virtual Background API, which is built on the SDK's background-separation neural network. At a high level, the integration is four steps:

1. Add the SDK to your project. Banuba distributes via CocoaPods, Maven, and npm packages, so installation is a few lines in your existing build configuration. Cross-platform teams pull a single dependency that covers iOS, Android, Flutter, React Native, Web, and desktop.
2. Initialize the SDK with your client token and start the camera pipeline, following the setup guides in the Banuba Face AR SDK documentation.
3. Enable the background module on your effect and set what should replace the background, whether that is a blur, a static image, or a video texture. The background separation neural network produces the mask; you supply the replacement.
4. Test against your target devices and tune resolution to your performance budget.
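The last step, tuning resolution to a performance budget, can be sketched as a small selection routine. This is not part of the Banuba SDK: `pick_resolution` is a hypothetical helper, and the timings are made-up numbers standing in for measurements you would collect on your own target devices.

```python
def pick_resolution(measured_ms, target_fps=30):
    """Return the largest (width, height) whose measured per-frame time
    fits the frame budget for `target_fps`, or None if nothing fits."""
    budget_ms = 1000.0 / target_fps
    fitting = [
        (width * height, (width, height))
        for (width, height), ms in measured_ms.items()
        if ms <= budget_ms
    ]
    return max(fitting)[1] if fitting else None


# Hypothetical per-frame processing times (ms) measured on one device.
timings = {(1920, 1080): 48.0, (1280, 720): 29.5, (640, 360): 12.0}

choice = pick_resolution(timings)
print(choice)  # (1280, 720)
```

Running the same routine per device class gives you a simple table of resolution presets rather than one setting that only suits flagship phones.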

You do not need to write segmentation code or assemble a rendering pipeline to get there. Banuba publishes working integration samples for both mobile platforms, including the iOS sample apps and the equivalent Android repository, which an engineer evaluating the SDK will typically open first to see how the background module is wired in. A common reference pattern is a live-streaming app built with Amazon IVS and the Banuba Face AR SDK, where the SDK handles the camera, segmentation, and effects, and the streaming layer handles delivery.

Figure: Banuba's background subtraction example with a virtual 3D background.

What does background subtraction cost in performance and app size?

Two numbers tend to drive the build-versus-buy decision, and both matter more in production than in a demo.

The first is frame rate. Banuba's background subtraction holds a stable 30 FPS for backgrounds on supported mobile hardware, which is the threshold where a camera preview reads as smooth rather than choppy. Sustaining that on mid-range and older phones, not just flagship devices, is what determines how much of your user base gets a usable experience.
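One way to check whether a device actually sustains that rate is to log per-frame processing times and see how many exceed the frame budget. The sketch below assumes you already have such timings from your own instrumentation; the sample values and function names are illustrative, not SDK output.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def over_budget_share(frame_times_ms, fps=30):
    """Fraction of frames whose processing time exceeded the budget."""
    budget = frame_budget_ms(fps)
    slow = sum(1 for t in frame_times_ms if t > budget)
    return slow / len(frame_times_ms)


# Hypothetical per-frame timings (ms) captured during a test session.
samples = [30.0, 31.5, 33.0, 40.0, 29.0, 32.0, 45.0, 30.5]

share = over_budget_share(samples)
print(round(frame_budget_ms(30), 1))  # 33.3
print(share)                          # 0.25
```

An average that looks fine can hide spikes like the 40 ms and 45 ms frames above, so the share of slow frames is often more telling than the mean.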

The second is footprint. The Banuba Web build adds roughly 12 MB at its BanubaSDK.wasm baseline, and the exact size depends on which features you enable. On the web especially, that bundle size feeds directly into page-load speed and bandwidth costs, so it is worth measuring against your performance targets early rather than discovering it after launch.
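A quick back-of-the-envelope calculation shows how that baseline translates into download time. The sketch reads the roughly 12 figure as megabytes (an assumption), uses two illustrative connection speeds, and ignores compression, caching, and latency, so treat the results as upper-bound estimates rather than measurements.

```python
def download_seconds(size_mb, bandwidth_mbps):
    """Transfer time for `size_mb` megabytes over a link of
    `bandwidth_mbps` megabits per second (8 bits per byte)."""
    return size_mb * 8 / bandwidth_mbps


WASM_BASELINE_MB = 12  # approximate baseline quoted in the article

# Illustrative link speeds in megabits per second (assumptions).
slow_mobile = download_seconds(WASM_BASELINE_MB, 5)
broadband = download_seconds(WASM_BASELINE_MB, 50)

print(slow_mobile)  # 19.2
print(broadband)    # 1.92
```

The tenfold gap between the two is the reason to test on the slowest connection you intend to support, not just on office Wi-Fi.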

Because everything runs on-device, there is no per-frame server cost and no cloud dependency, which also removes the latency and privacy questions that come with sending camera frames off the phone.

What can you build with real-time background subtraction?

The same on-device segmentation supports a wide range of camera products. Video conferencing and live streaming apps use it for blur and virtual backgrounds so users can present from anywhere. Social and creator apps combine it with AR masks and beauty effects. Retail and event kiosks drop visitors into branded scenes or 360-degree environments. Because Banuba's background remover works in real time and in both portrait and landscape, the same integration carries across mobile, web, and desktop surfaces without rebuilding the feature for each one.

Conclusion

Background subtraction is a deep-learning problem dressed up as a UI feature: easy to describe, expensive to build well, and unforgiving in real-time conditions across the device landscape. If segmentation quality is not the thing that sets your product apart, an SDK gets you a production-grade result in integration time rather than research time. Banuba's background subtraction runs on-device at a stable 30 FPS with a 12 MB web baseline and one integration across iOS, Android, Flutter, React Native, Web, and desktop, which is usually the difference between shipping the feature this quarter and staffing a computer-vision team for it.
