2026-06-11

The Smart TV in Your Living Room Is an Exit Node for AI's Data Hunger

IncludeSecurity reverse-engineered the Bright Data SDK shipped inside consumer apps: an unauthenticated config turns smart TVs into residential proxy exit nodes that scrape training data for AI, with a 500 MB monthly default of someone else's traffic.

data privacy scraping


Summary

Security firm IncludeSecurity, working with independent researcher Buchodi, reverse-engineered a software development kit (SDK) embedded in consumer apps and laid its runtime behavior out in full. The SDK belongs to Bright Data, a data-collection company that markets itself as running the world’s largest residential proxy network. Its job is specific: it routes requests for the web data AI companies want through an ordinary household’s connection rather than through a datacenter IP range that gets blocked on sight.

The researcher captured 30 days of network traffic and statically analyzed the iOS framework binary (brdsdk.framework, version 1.532.120), so the findings don’t rest on rumor. The device continuously reports its physical state (idle or not, battery, network type, CPU and memory load) to a third-party server, which sends back individual scraping jobs for the device to run against other sites from the household’s own residential IP. The value here isn’t exposing a conspiracy. It’s that reproducible technical evidence pins down something usually stated as a forecast: AI’s appetite for data is already reaching back into consumer hardware. This is a live pipeline, not a prediction.

What happened

Bright Data outsources the supply side of its residential proxy network to an SDK. Publishers, the developers of games, utilities, and connected-TV apps, integrate it, and once they have the user’s “consent,” the user’s phone or smart TV becomes one of the network’s exit nodes. The post documents a Roku app called Petflix as a representative case. Its consent screen says the user is allowing Bright Data to “occasionally” use the device’s free resources and IP address to download public web data.

Setting “occasionally” against the actual config is the report’s sharpest contrast. In the SDK’s publicly queryable configuration, the default monthly WiFi bandwidth budget, max_bw_monthly_wifi, is 200,000,000,000 bytes, or 200 GB. The gap between the dialog’s wording and the engineering default tells its own story: “occasionally” is written for people to read, 200 GB is written for a machine to spend.

The technical chain runs like this. On every launch the SDK fetches a config from clientsdk.bright-sdk.com/sdk_config_ios.json, passing the app’s bundle ID (appid) and the SDK version string (ver). The researcher found this endpoint unauthenticated in any meaningful sense: supply the right bundle ID and version plus any made-up UUID, and the server returns exactly what a real device gets, including feature flags, idle-detection thresholds, per-country bandwidth tiers, and the partner manifest. Anyone can read the mechanism’s full hand.
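To make the shape of that request concrete, here is a minimal sketch that only builds the URL and sends nothing. The host, path, and the appid and ver parameter names come from the report as summarized above; the https scheme and the uuid parameter name are assumptions, since the text says only that a made-up UUID is supplied, and build_config_url is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode
import uuid

# Host and path as described in the report; the https scheme is an assumption.
CONFIG_ENDPOINT = "https://clientsdk.bright-sdk.com/sdk_config_ios.json"


def build_config_url(bundle_id: str, sdk_version: str,
                     device_uuid: Optional[str] = None) -> str:
    """Build (but do not send) a config request URL.

    "appid" and "ver" are named in the report. "uuid" is a hypothetical
    parameter name used here only for illustration.
    """
    params = {
        "appid": bundle_id,
        "ver": sdk_version,
        "uuid": device_uuid or str(uuid.uuid4()),
    }
    return f"{CONFIG_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # com.example.app is a placeholder bundle ID, not a real integration.
    print(build_config_url("com.example.app", "1.532.120"))
```

The point of the sketch is what is absent: nothing in the request ties the caller to a real device, which is what "unauthenticated in any meaningful sense" amounts to.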

After the config fetch, the SDK opens a persistent WebSocket to proxyjs.brdtnet.com:443. One detail is worth keeping: the connection’s TLS certificate carries the common name *.luminatinet.com. Luminati Networks was Bright Data’s name before its publicly announced 2021 rebrand, yet the peer tunnel still runs on the legacy cert. That isn’t trivia. It’s a clean detection pivot the researcher hands you: luminatinet.com / brdtnet.com traffic on your network is specifically the exit-node plane, not some customer legitimately using Bright Data’s customer-facing proxy service.
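As a rough illustration of that pivot, the sketch below checks a hostname, TLS SNI value, or certificate common name against the two domains named above. The function name and the idea of feeding it values from DNS or TLS logs are assumptions for illustration; adapt it to whatever your sensor actually exports.

```python
# Domains named in the report as belonging to the exit-node (peer) plane.
EXIT_NODE_SUFFIXES = ("brdtnet.com", "luminatinet.com")


def is_exit_node_host(name: str) -> bool:
    """Return True if a hostname, SNI value, or cert CN falls under a watched domain.

    Matching is on whole DNS labels, so "notbrdtnet.com" does not match.
    A wildcard CN such as "*.luminatinet.com" matches because it ends
    with ".luminatinet.com".
    """
    host = name.strip().rstrip(".").lower()
    return any(host == s or host.endswith("." + s) for s in EXIT_NODE_SUFFIXES)


if __name__ == "__main__":
    for candidate in ("proxyjs.brdtnet.com", "*.luminatinet.com", "example.org"):
        print(candidate, is_exit_node_host(candidate))
```

Matching on label boundaries rather than raw substrings keeps lookalike domains from producing false positives.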

Once connected, a handshake follows: the server echoes the device’s public IP, assigns a session identifier, then polls device state. The device responds with a stream of telemetry: idle status, WiFi connectivity, mobile network type (LTE/5G), roaming, battery level, whether on battery, whether the screen is on, whether the user is on a call, CPU usage, memory usage, available bandwidth, and more. The researcher’s framing is precise: this is a continuous feed of physical-device state to a third party, gated by a consent dialog whose text is chosen by the host publisher. When the state qualifies, the server pushes cmd_tun frames, the scraping jobs the SDK runs as HTTP requests against third-party sites, sourced from the household’s residential IP.

The researcher adds a line that practitioners will catch: the whole WebSocket protocol is plain JSON with no message signing, no HMAC, no client certificate, no device attestation, only the TLS layer and a server-side IP-reputation filter. For anyone familiar with commercial malware protocol design, this is substantially less secure than typical command-and-control. The comparison isn’t there to alarm. It says the security boundary deciding which device gets work is thin enough to be essentially one TLS layer.
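For contrast, this is roughly what per-message authentication looks like when a protocol has it. It is a generic HMAC-SHA256 sketch using Python's standard library, not anything the SDK does; the frame layout ("body", "mac"), the example payload, and the shared key are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_frame(payload: dict, key: bytes) -> dict:
    """Serialize a payload deterministically and attach an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "mac": tag}


def verify_frame(frame: dict, key: bytes) -> bool:
    """Recompute the tag over the received body and compare in constant time."""
    expected = hmac.new(key, frame["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, frame["mac"])


if __name__ == "__main__":
    key = b"demo-shared-secret"  # hypothetical key, for illustration only
    frame = sign_frame({"cmd": "example", "seq": 1}, key)
    print(verify_frame(frame, key))  # True
    frame["body"] = frame["body"].replace("example", "tampered")
    print(verify_frame(frame, key))  # False
```

Without a check of this kind, the only things standing between a job and a device are the TLS session and the server-side IP-reputation filter the researcher describes.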

Why it matters

Place this back inside AI’s supply chain and the logic snaps into focus. AI companies depend heavily on scraped web content for pre-training, retrieval, agent grounding, and search. But the modern web isn’t scrapeable from a datacenter: Cloudflare, DataDome, and HUMAN, among others, throttle or block requests from known cloud IPs. The workaround is residential proxies. A scraping job routed through a Comcast or T-Mobile subscriber’s connection arrives at the target looking like a normal paying resident.

That is the thesis this investigation actually substantiates: AI’s data demand is pushing surveillance capitalism into the living room. Most prior reporting on proxy supply focused on the illegal side: botnets, trojanized apps, and IoT hardware shipped pre-backdoored. The post notes Krebs reported in October 2025 that a glut of proxies from Aisuru and other sources is fueling large-scale AI-tied data harvesting, and the FBI issued a formal advisory earlier this year. But the legal side, the side operating with user “consent,” has drawn far less scrutiny. The contribution here is to point the lens at legal supply. It does not claim any given publisher’s current shipping app contains the SDK (the author is explicit that the manifest only proves an integration “might have existed at some point,” and per-app verification is required). It does prove that Bright Data ships this partner roster on an unauthenticated public endpoint, and that at least three CTV-focused entities (PlayWorks, CloudTV, Longvision) monetized user devices as exit nodes.

Why a smart TV specifically? The phone-versus-TV table is the load-bearing piece of the argument, and it reads as an argument rather than a mere list. A TV is always plugged in, always on high-speed WiFi, online 24/7 in standby, effectively uncapped on bandwidth, and frequently unattended. It never hits 1% battery, never hops between WiFi networks, never gets locked while the user sleeps. The consent and oversight rows matter most: phones still have a layer of mobile device management (MDM) and endpoint detection and response (EDR), while TVs have virtually none; consent on a phone is text on a screen, while on a TV it’s a legal document navigated by remote-control arrow keys. The researcher’s pointed conclusion: privacy-policy disclosure is the wrong control surface for a TV, because nobody scrolls a legal document with a remote. The smart TV is the “ideal proxy” precisely because it sits in the blind spot between consumer attention and corporate control.

A few config details make the lack of real user control concrete. The idle rules include ignore_screen_on: true and ignore_on_call: true: relay even with the screen on, relay while the user is on a call. “Idle” doesn’t mean the user has stepped away, it only means CPU, memory, and battery sit within the SDK’s thresholds. A dual_pairing map stitches a brand’s iOS, Windows, and macOS installs into one identity, cross-platform identity stitching documented inside a public config file. A sharper bypass: use_netifs: true binds the connection to a physical interface (en0 for WiFi, pdp_ip0 for cellular), sidestepping the user’s configured VPN tunnel interface. The researcher confirmed this empirically with transparent TLS interception. He captured every HTTPS call the SDK made except the peer tunnel, even though port 443 was explicitly redirected to the inspector. So even a user who turns on a VPN to protect themselves still has the exit tunnel leaving from their home’s real IP.
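A small sketch of how idle rules like these could be evaluated shows why "idle" diverges from "the user has stepped away." Only the two flag names, ignore_screen_on and ignore_on_call, come from the config described above. The threshold field names and their values are hypothetical placeholders, and this is not the SDK's actual logic.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    screen_on: bool
    on_call: bool
    cpu_pct: float
    mem_pct: float
    battery_pct: float


@dataclass
class IdleRules:
    # Flag names and values as reported in the public config.
    ignore_screen_on: bool = True
    ignore_on_call: bool = True
    # Hypothetical thresholds: the real names and values are not given here.
    max_cpu_pct: float = 50.0
    max_mem_pct: float = 80.0
    min_battery_pct: float = 20.0


def counts_as_idle(state: DeviceState, rules: IdleRules) -> bool:
    """Decide eligibility for relay work under the given rules."""
    if state.screen_on and not rules.ignore_screen_on:
        return False
    if state.on_call and not rules.ignore_on_call:
        return False
    return (state.cpu_pct <= rules.max_cpu_pct
            and state.mem_pct <= rules.max_mem_pct
            and state.battery_pct >= rules.min_battery_pct)


if __name__ == "__main__":
    # Screen on, user mid-call, but resource usage is low.
    in_use = DeviceState(screen_on=True, on_call=True,
                         cpu_pct=10.0, mem_pct=30.0, battery_pct=90.0)
    print(counts_as_idle(in_use, IdleRules()))
    print(counts_as_idle(in_use, IdleRules(ignore_screen_on=False)))
```

With both ignore flags set, a device in active use still qualifies as long as its resource readings stay inside the thresholds. Flipping either flag is enough to exclude it.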

Builder impact

If you build a product that depends on third-party data, this post is a supply-chain trust alert. Read it as a due-diligence checklist, not as news you skim and forget.

First, the compliance and ethics risk in your data sourcing is real and it propagates down the chain. The “web data” or “proxy service” you buy may well be supplied at the source by exactly this kind of residential exit, propped up by a “consent” you’d have to navigate with a TV remote to find. The post is careful: being in Bright Data’s manifest only proves an integration “might have existed,” and confirming whether a publisher’s currently shipping app carries the SDK in production requires per-app verification. Carry that discipline into your own procurement. Don’t treat a vendor’s marketing as compliance evidence, demand clarity on where the exit traffic supply comes from, and do per-app verification yourself.

Second, expect a backlash to “my device is secretly gathering data for AI,” and design your own consent surface accordingly. This hit Hacker News (234 points), which says it touched a nerve in the developer community. Petflix’s “occasionally” against a 200 GB default is the textbook script for trust collapse. The lesson cuts both ways: if you integrate any third-party component that touches user resources, IP, or device state, “consent” has to rest on a control surface the user can actually understand. Bury it in a privacy policy reachable by remote and you don’t have consent, and when it gets reverse-engineered, the blowback lands on your brand, not the vendor’s.

Third, treat this as a stress test of supply-chain trust. The researcher’s defensive measures double as a builder’s self-audit kit. At the network boundary you can DNS-block or TLS-SNI-filter domains like *.brdtnet.com and *.luminatinet.com. On managed devices you can use MDM to scan installed app binaries for symbols such as BrdWebSocketFacade and BrdNetwork.DNSResolver and prohibit apps containing them on corporate hardware. If you run a device fleet or a corporate network, this is deployable today. But keep the author’s caveat in mind: because use_netifs binds to the physical interface, peer traffic bypasses corporate WiFi when a device is on cellular. Every network-boundary control only works on traffic that crosses your boundary, so device-level binary scanning is the necessary complement.
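The device-level half of that kit can be approximated with a plain byte search, sketched below. The two marker strings are the symbols named above; the helper names are hypothetical, and the assumption that the symbols appear as contiguous ASCII in an unpacked, unencrypted binary may not hold for every build, so treat a miss as inconclusive rather than as a clean result.

```python
from pathlib import Path
from typing import List

# Symbol names cited in the report as indicators of the SDK.
SDK_MARKERS = (b"BrdWebSocketFacade", b"BrdNetwork.DNSResolver")


def find_sdk_markers(data: bytes) -> List[str]:
    """Return the marker strings that occur anywhere in the given bytes."""
    return [marker.decode("ascii") for marker in SDK_MARKERS if marker in data]


def scan_file(path: str) -> List[str]:
    """Read a binary from disk and report which markers it contains."""
    return find_sdk_markers(Path(path).read_bytes())


if __name__ == "__main__":
    import sys
    for target in sys.argv[1:]:
        hits = scan_file(target)
        print(f"{target}: {', '.join(hits) if hits else 'no markers found'}")
```

Run against extracted app binaries, a hit is a reason to look closer, in the same spirit as the per-app verification the author asks for.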

What to ignore

Don’t read this as “your TV is spying on you” or “AI companies installed a camera in your living room.” The post says nothing of the sort. The researcher stresses repeatedly that what’s reported is physical-device-state telemetry (idle, battery, network), and the scraping jobs hit third-party public web pages. The post also explicitly attributes the “no personal information accessed except your IP address” line to Petflix’s own wording. The real problem isn’t content surveillance, it’s the quality of consent and what the device is borrowed to do. Dressing it up as a wiretap conspiracy blurs the part you can actually act on.

Don’t over-interpret the per-country bandwidth table geopolitically either. The config lets Uzbekistan and Oman devices relay down to 1% battery with daily caps 20x the default, while Qatar and the UAE are throttled below default. But the author is candid: “we can only speculate as to why the tiers are drawn this way,” with one reading being market segmentation by grid stability and mobile-data cost. If the original author only goes as far as speculation, downstream readers shouldn’t extrapolate it into a confirmed conclusion. What to remember is the default: worldwide, the default still permits 500 MB of someone else’s traffic per month across your home internet. That number alone makes the point without bolting unproven motive onto it.

Finally, don’t get stuck arguing whether this is illegal. The framing is clear: the post is about the legal supply side. User consent exists, it’s a public commercial product, and it uses documented Apple APIs throughout. What should hold a builder’s attention isn’t legality, it’s how thin the word “legal” has been stretched here. A dialog you can’t scroll with a remote becomes the entire authorization basis for lending your home network to AI’s scraping.

Technical takeaway

The detail engineers should keep is the dual evasion design, what the researcher calls the most interesting artifact. The SDK splits the control plane from the data plane and evades each separately. The control plane (config fetch, telemetry pings) is built on CFNetwork’s CFHTTPMessage primitives, which defeats the URLSession-level instrumentation common in mobile app-sec tooling (method swizzling, network extensions, URLProtocol subclasses) while still respecting the system proxy and so staying visible to TLS-intercepting researchers. The data plane (the peer tunnel) is built on NWConnection bound to a physical interface, which defeats VPNs and guarantees scraping leaves from a residential IP. Both are documented Apple APIs, but the combination means a researcher relying on a single technique only ever sees half the SDK’s behavior. The config also ships a forward-looking flag, http3_enabled: true: the peer tunnel may move from TCP/443 to QUIC-based UDP/443 in a future version, which would break any defender relying on TCP connection tracking. For anyone doing endpoint security or anti-scraping, that’s a signal worth defending against early.
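The QUIC point can be shown with two toy flow rules. The Flow record, the rule names, and the assumption that your sensor can surface a server name for UDP/443 flows are all hypothetical; the watched domains are the ones the report names.

```python
from dataclasses import dataclass

WATCHED_SUFFIXES = ("brdtnet.com", "luminatinet.com")


@dataclass(frozen=True)
class Flow:
    proto: str        # "tcp" or "udp"
    dst_port: int
    server_name: str  # assumed to be available from the sensor


def _watched(name: str) -> bool:
    host = name.rstrip(".").lower()
    return any(host == s or host.endswith("." + s) for s in WATCHED_SUFFIXES)


def tcp_only_rule(flow: Flow) -> bool:
    """A rule written for today's tunnel: TCP/443 to a watched domain."""
    return flow.proto == "tcp" and flow.dst_port == 443 and _watched(flow.server_name)


def transport_agnostic_rule(flow: Flow) -> bool:
    """The same rule extended to cover QUIC, which runs over UDP/443."""
    return (flow.proto in ("tcp", "udp")
            and flow.dst_port == 443
            and _watched(flow.server_name))


if __name__ == "__main__":
    quic_flow = Flow("udp", 443, "proxyjs.brdtnet.com")
    print(tcp_only_rule(quic_flow), transport_agnostic_rule(quic_flow))
```

A rule keyed only to TCP goes silent the moment the tunnel moves to UDP/443, which is why it may be worth widening it before the flag is ever exercised.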

Sources

No official primary source available; this analysis is based on reliable secondary reporting (named outlets, cross-confirmed).