The Musings of Jaime David

The writings of some random dude on the internet

Tag: accessibility

The Absurd Wall Around Picture‑in‑Picture for Music on YouTube
There is something uniquely frustrating about running headfirst into a limitation that feels completely artificial. Not a technical constraint. Not a hardware shortcoming. Not even a genuine legal impossibility. Just a wall, quietly erected, that exists because someone decided it should. YouTube’s refusal to allow music content to run in Picture‑in‑Picture mode on iPhone and iPad is one of those walls. It stands there, immovable, while everything around it suggests that it should not exist at all. Videos can do it. Movies can do it. Shows can do it. Long‑form documentaries, podcasts with visuals, talking heads, gaming streams, and even YouTube’s own ads can shrink down and float obediently in a corner of the screen. But the moment the content is classified as music, suddenly it becomes impossible, impractical, or “too complex” to allow the same basic functionality.

Picture‑in‑Picture, at its core, is not some experimental or bleeding‑edge feature anymore. On iOS and iPadOS, it has been a standard part of the operating system for years. Apple provides native APIs for it. Developers do not have to invent it from scratch, reverse‑engineer obscure behavior, or hack together unstable workarounds. It is documented, supported, and widely used. Countless apps implement it with relative ease. Video players use it. Streaming platforms use it. Even YouTube itself uses it extensively, as long as the content being played falls into the “acceptable” category. Which makes the exclusion of music content not just annoying, but baffling.

What makes this restriction feel especially absurd is how arbitrary the distinction actually is. A music video is still a video. From a technical standpoint, there is no magical difference between a music video and any other video file hosted on YouTube’s servers. The codec is the same. The playback pipeline is the same. The streaming infrastructure is the same. The app already knows how to keep a video playing while the user switches apps. The floating window already exists. The only thing that changes is a label, a category, a business rule. Suddenly, a video that could float freely a moment ago is now locked in place, demanding your full attention or nothing at all.

This is where the frustration really sets in for users, especially on mobile devices like the iPhone and iPad. These are inherently multitasking devices. Apple markets them that way. People use them that way. You listen to music while reading, while writing, while scrolling, while checking messages, while doing literally anything else. Picture‑in‑Picture fits perfectly into that reality. It allows content to remain present without monopolizing the entire screen. It respects the user’s time, attention, and workflow. Blocking music from that experience feels less like a technical oversight and more like a deliberate act of control.

The irony is that YouTube clearly understands the value of Picture‑in‑Picture. They did not reluctantly implement it under pressure. They actively promote it as a premium feature, especially on iOS. They advertise it as part of the YouTube Premium experience, a way to keep videos playing while you use other apps. They know users want this. They know it improves usability. They know it aligns with how people actually use their devices. And yet, when it comes to music, they draw an arbitrary line and pretend that crossing it would somehow break the universe.

From the user’s perspective, this makes absolutely no sense. Music is arguably the most natural candidate for Picture‑in‑Picture. If anything, music needs visuals less than other content. Most people are not actively watching a music video the entire time it plays. The visuals are often secondary, symbolic, or simply background flair. The primary purpose is the audio. If YouTube can keep a floating video window active for a two‑hour podcast where the visuals are a static shot of someone talking into a microphone, then claiming that a four‑minute music video is somehow incompatible with Picture‑in‑Picture strains credulity.

The situation becomes even more absurd when you consider that YouTube already allows background playback for music under certain conditions, again usually tied to Premium. The audio can continue when the screen is off. The audio can continue when you leave the app. The system clearly has no problem handling continuous music playback. So what exactly is the obstacle to letting that same content play in a small, floating window? There isn’t one, at least not a technical one. The infrastructure is already there. The behavior already exists in slightly different forms. The restriction is selective and intentional.

This selective limitation feeds into a broader pattern that many users have noticed over the years with YouTube’s mobile apps, especially on iOS. Features are not withheld because they are impossible. They are withheld because they are useful. The more useful a feature is, the more likely it is to be gated, restricted, or segmented into a paid tier. Picture‑in‑Picture for music feels like a textbook example of this philosophy. By making music playback more inconvenient, YouTube nudges users toward YouTube Music, YouTube Premium, or alternative listening habits that better serve the company’s revenue goals.

But even from a business perspective, the logic feels shortsighted. Frustrating users does not necessarily push them toward paid subscriptions. In many cases, it pushes them away from the platform entirely. When people realize that a basic, expected feature is being withheld for no defensible reason, resentment builds. That resentment does not always translate into loyalty or conversion. Sometimes it translates into users seeking out other platforms, other apps, or other ways of consuming the same content with fewer artificial barriers.

There is also an accessibility dimension to this issue that rarely gets discussed. Picture‑in‑Picture is not just a convenience feature. For some users, it is a usability necessity. People with attention differences, neurodivergent users, or those who rely on multitasking to stay engaged often benefit from having content present without dominating their screen. Music, in particular, can be grounding, regulating, or focus‑enhancing. Denying Picture‑in‑Picture for music disproportionately affects these users, all in service of a categorization rule that exists purely at the platform level.

On iPad especially, the restriction feels almost comical. The iPad is designed for multitasking. Split View, Slide Over, Stage Manager, and Picture‑in‑Picture are core features of the device’s identity. Using YouTube on an iPad and discovering that a lecture, a movie, or a random vlog can float neatly in the corner, while a music video stubbornly refuses to do so, highlights how unnatural the limitation really is. The device is capable. The OS is capable. The app is capable. The content is capable. Only the policy is not.

Some defenders of the status quo might argue that music licensing complicates things, that record labels impose restrictions, or that contracts somehow prohibit Picture‑in‑Picture. But this argument quickly falls apart under scrutiny. Music already plays in the background. Music already streams across devices. Music already appears in countless contexts where the visuals are incidental. If licensing were the real obstacle, we would see far more consistent limitations across playback modes. Instead, what we see is a finely tuned set of restrictions that align suspiciously well with monetization strategies.

The inconsistency becomes even clearer when you compare YouTube to other platforms. Many music and video apps have no problem allowing Picture‑in‑Picture or equivalent behavior for audio‑focused content. Some apps go even further, integrating mini players, persistent controls, and seamless transitions between visual and audio modes. These apps demonstrate, again and again, that there is nothing inherently difficult about letting music coexist with multitasking. YouTube’s refusal to do the same stands out precisely because it is an outlier, not a norm.

There is also a philosophical question at the heart of this issue: who controls how content is consumed? When a platform decides that certain types of content must be consumed in a specific, constrained way, it sends a message about ownership and agency. YouTube hosts the content, but users experience it. Blocking Picture‑in‑Picture for music is a subtle assertion of control, a way of saying that even if your device can do this, even if the OS encourages it, even if it would improve your experience, the platform gets the final say.

This tension between platform control and user autonomy is not new, but it becomes especially visible in cases like this because the justification is so thin. If Picture‑in‑Picture for music genuinely broke something, degraded quality, or introduced instability, users might accept it reluctantly. But when everything else works perfectly and only music is excluded, the explanation rings hollow. It feels less like a technical decision and more like a power move.

The end result is a worse experience for everyone except, perhaps, the balance sheet. Users are forced to keep the YouTube app in the foreground just to listen to a song. They are discouraged from multitasking. They are subtly punished for using YouTube as a music platform rather than switching to a separate, branded music app. All of this friction accumulates, turning what should be a seamless, modern experience into something clunky and outdated.

What makes this particularly frustrating is how easily it could be fixed. There is no need for groundbreaking engineering. No need for new standards. No need for radical redesigns. The feature already exists. The app already supports it. The only thing required is the decision to allow it. Flip the switch. Remove the arbitrary exception. Treat music videos like the videos they are. Respect the reality of how people use their devices.

Until that happens, YouTube’s handling of Picture‑in‑Picture for music will remain a symbol of a broader problem in modern platforms: the tendency to prioritize control and monetization over user experience, even when doing so makes the product objectively worse. It is a reminder that many of the frustrations people feel with large tech platforms are not about bugs or limitations, but about choices. Choices to restrict, to gate, and to complicate things that should be simple.
January 17, 2026
The Silent Failure of OMNY: How the MTA’s “Modern” System Leaves Riders Behind
The MTA sold OMNY as the future. A sleek, contactless, modern payment system designed to replace the MetroCard, speed up commutes, and drag New York’s transit infrastructure into the 21st century. It was marketed as a seamless solution, a smoother way to move millions of people every day, a tap-and-go miracle. Except, as every rider who has actually lived with OMNY knows, this future has been more frustrating than freeing, more glitchy than graceful, and more annoying than any system this essential should ever be.

OMNY scanners suck. And they don’t just suck in the casual way we complain about daily inconveniences. They suck in a deeper, structural, systemic way that reveals exactly how disconnected the MTA is from the actual lived experience of the people who rely on it. When your entire city depends on public transportation the way New York does, when people need those subways and buses to survive, to work, to attend school, to get groceries, to see family, everything about the system matters. And OMNY is simply not good enough for the weight it carries.

What makes OMNY especially aggravating is that it’s not failing at some abstract, futuristic technical dream. It’s failing at the basics. It struggles with the simplest part of its purpose: letting people enter the station. The scanner doesn’t need to do anything complicated. It just has to accept a tap quickly, consistently, and reliably. But it often doesn’t. Instead, it’s slow, it freezes, it glitches, it double-charges, it doesn’t read certain cards, it doesn’t read certain phones, and sometimes it just gives up entirely. The number of times riders have watched the screen blink, stall, or spit out a big red X is embarrassing for a system that cost hundreds of millions of dollars.

Every rider knows the feeling. You approach the turnstile, tap your card or phone, and—nothing. The screen stutters, thinking about it as if it’s weighing some metaphysical question, like “Do I truly want to grant you access to the train?” Meanwhile the person behind you starts shifting impatiently, you try again, maybe the angle was wrong, maybe your phone was too close to your wallet, maybe the scanner is just being finicky today. Finally, after multiple taps, maybe it works. Or maybe it still doesn’t and you have to shame-walk to another turnstile and hope that one isn’t possessed by the same demon.

What was supposed to be faster is somehow slower. What was supposed to be futuristic feels already outdated. What was supposed to be convenient has introduced a whole new category of everyday irritation into the lives of people who already have enough to stress about.

And let’s talk about the double-charging problem, because if OMNY has one defining trait besides unreliability, it’s the way it has absolutely no shame about taking extra money from riders. You tap your phone, it doesn’t register, so you tap again. Except it did register, it just didn’t show it. Or maybe it showed it, but lagged. Or maybe it pretended not to show it but secretly registered it behind the scenes. The end result is the same: overcharges. Invisible mistakes. A system that is supposed to make payment easier instead leads to more confusion, more checking bank statements, more disputes, more money lost.
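
To make that failure mode concrete, here is a minimal sketch in Python of the kind of guard that would keep a retry from turning into a second fare. To be clear, I have no idea how OMNY works internally. The class `TapDeduplicator`, the method `handle_tap`, and the 30-second window are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TapDeduplicator:
    """Hypothetical reader-side guard (not OMNY's real logic).

    A repeat tap from the same payment token inside a short window is
    treated as a retry of the first tap, not as a new fare.
    """

    window_seconds: float = 30.0  # assumed value, not an MTA figure
    _last_charged: dict[str, float] = field(default_factory=dict)

    def handle_tap(self, token: str, now: float) -> str:
        last = self._last_charged.get(token)
        if last is not None and now - last < self.window_seconds:
            # Same card or phone, moments later: report the earlier
            # charge instead of billing again.
            return "already_charged"
        self._last_charged[token] = now
        return "charged"


reader = TapDeduplicator()
first = reader.handle_tap("card-A", now=0.0)   # first tap is billed
retry = reader.handle_tap("card-A", now=5.0)   # impatient second tap
later = reader.handle_tap("card-A", now=40.0)  # a genuinely new ride
```

The obvious trade-off is that a window like this would also block someone deliberately tapping twice to pay for a companion, so a real system would need an explicit way to do that. The point is only that "tell the rider the first tap already went through" is not exotic engineering.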

MetroCard readers were far from perfect, but at least you knew where you stood. A swipe was a swipe. If the swipe didn’t work, it told you instantly. The physicality of it made sense. With OMNY, the tap exists in this weird limbo where the scanner may or may not have captured the transaction, and you’re left guessing until your bank account tells you hours later.

That’s another thing—OMNY relies on banking infrastructure in a way MetroCard never did. OMNY assumes everyone has a contactless debit card, or a credit card, or a smartphone capable of storing digital payment methods. It assumes everyone has stable enough finances that daily transit charges won’t cause problems. It assumes everyone is comfortable letting every ride be tied to their personal financial footprint.

But that is not the reality of millions of riders. The MetroCard system was more equitable. You could buy a card with cash. You could put in $5, $10, $20, whatever you had. You could do it anonymously. You could budget. OMNY pushes people into a world where your commute is something you must tether to your banking identity. It quietly erodes the last remnants of accessible transit anonymity. And when you combine that with the already-existing issues of surveillance, data collection, and the increasing digitization of public life, OMNY becomes not just annoying, but unsettling.

Even the OMNY card—which was supposed to solve the issue for people who don’t use or can’t use digital payment methods—is poorly implemented. Harder to find than MetroCards ever were, more expensive upfront, and confusingly marketed. It’s like the MTA forgot the purpose of transit payment systems: to be simple, affordable, and universally accessible.

And then there’s the placement problem. OMNY scanners are often angled awkwardly. They’re mounted at positions that force people to twist their wrists or contort their phones. Some are too low, some too high. Some are on turnstiles that wobble when you lean your hand against them. For a system reliant on physical motion—tapping—basic ergonomics should have been a priority. It wasn’t.

The worst part is how all of these small issues compound during rush hour. When thousands of people are funneling through a limited number of turnstiles, every delay matters. Every glitch becomes amplified. Every red X becomes a microscopic traffic jam. And people become frustrated with each other, when the real culprit is a system that simply doesn’t work as smoothly as it should.

A truly functional system anticipates the realities of its users. OMNY feels like it was built in a vacuum. Designed by committees who don’t ride trains, approved by people who never experience the daily grind, engineered with assumptions instead of empathy. The MTA saw what other cities were doing—London’s Oyster/contactless hybrid system, for example—and wanted to replicate it. But they overlooked the fact that London’s system works because it is stable, consistent, and thoroughly tested. OMNY feels like the opposite: rushed, buggy, half-baked, and constantly needing “software updates” like some broken app you regret downloading.

The irony is that New Yorkers never asked for this. Riders didn’t demand the death of the MetroCard. They didn’t beg for a contactless system. They didn’t rally for OMNY. This was pushed from above, marketed as progress, and framed as inevitable. But progress is only progress when it actually improves people’s lives. OMNY has not done that. If anything, it has created new layers of friction in a system where friction is the last thing anyone needs.

It’s especially bad for disabled riders. People with mobility issues, tremors, limited reach, or sensory sensitivity often find OMNY’s tap system much harder than MetroCard’s swipe. The scanner requires precision. It requires stillness. It requires a very specific type of movement. And if you don’t tap at the correct distance or angle, it rejects you. For people with disabilities, that’s not just annoying—it’s discriminatory. Technology should expand accessibility, not restrict it.

Then there’s the issue of outages. When MetroCard machines went down, it was annoying, but you could still swipe your existing card. But if OMNY goes down, entire stations can bottleneck. Suddenly every single turnstile turns into a dead end. Riders who are already stressed, late, tired, and overwhelmed now face a new obstacle. A modern system should have redundancy, yet OMNY outages show just how brittle the whole setup really is.

And let’s not ignore another glaring flaw: OMNY eliminates the psychological assurance that a MetroCard provided. You could see your MetroCard balance. You knew exactly how many rides you had left. With OMNY, you just trust that your bank is charging correctly. You trust that the weekly fare cap will trigger. You trust a system that has already proven it struggles with the basics.

Riders shouldn’t have to trust. They should know. That is the purpose of a transit payment tool—to give people certainty. OMNY fails at that in nearly every way.

The frustrating thing is, OMNY could have been better. The concept isn’t inherently bad. Contactless systems can work beautifully when done right. But implementation matters. Execution matters. Testing matters. Listening to riders matters. And the MTA has a long history of rolling things out without ever listening to the people who actually use them.

With MetroCard being phased out, people don’t even have the comfort of choosing which system works better for them. They’re being forced into OMNY, forced into a system that’s not ready, forced into a system that wasn’t built with them in mind. You can’t call something modernization when the end result is inconvenience.

The larger issue is that OMNY represents a trend—the idea that tech is always the answer, that newer is always better, that digital solutions automatically improve quality of life. But sometimes technology complicates things. Sometimes the low-tech option is exactly what a city needs. Sometimes physical infrastructure is more reliable than digital infrastructure. And sometimes, like with OMNY, the push to innovate becomes performative rather than practical.

The MTA wanted to look modern. But looking modern and being effective are two completely different things.

A payment system touching the lives of eight million people a day shouldn’t need multiple taps. It shouldn’t freeze. It shouldn’t introduce anxiety. It shouldn’t rely on bank tech that varies from person to person. It shouldn’t cause people to miss trains. It shouldn’t be unreliable during the busiest hours. It shouldn’t create new forms of financial vulnerability. It shouldn’t overcharge, glitch, or lag.

It should just work. Every time. Instantly. Honestly. Predictably. Consistently. Quietly.

Instead, OMNY has become another symbol of how the city’s infrastructure fails riders—overpromising, underdelivering, and leaving people to deal with the fallout.

And it’s not just a minor annoyance. It’s a reflection of how much we tolerate because we have no choice. New Yorkers deserve better. Riders deserve better. The system deserves better. The future of public transit shouldn’t be defined by inconvenience, frustration, and the feeling of being beta-testers for something that should have been perfected before it ever went live.

OMNY scanners suck not because technology is bad, but because the execution was sloppy, careless, and disconnected from rider experience. And until the MTA acknowledges that, until they commit to real improvements rather than PR campaigns, OMNY will remain what it is now: a daily reminder that modernization means nothing if it doesn’t actually work for the people who need it most.
November 29, 2025
Musing Mondays #5: The Cost of Convenience: How AI Voice Assistants Are Changing Customer Experience
Technology is evolving at a rapid pace, and with it comes a slew of innovations that promise to make our lives easier. One area where this is particularly visible is in the realm of customer service, where automated voice assistants are increasingly replacing human operators. While these systems are designed to streamline processes and improve efficiency, they can also introduce a host of new challenges — particularly for users who rely on certain accommodations or prefer more personalized interactions.

Take Capital One’s recent change to its phone-based voice assistant system, for example. The company has transitioned from a human-like, slow-paced AI to a more robotic-sounding one that speeds through instructions. While the change is likely designed to improve speed and efficiency, it has left many users, especially those with specific needs, frustrated and dissatisfied.

This shift is more than just a matter of convenience; it brings to light critical questions about how technology serves its users. As AI becomes more integrated into our daily lives, we must consider the ways it impacts accessibility, inclusivity, and user experience. What happens when the “smart” systems we rely on start to overlook the diverse ways in which people interact with technology?

Accessibility and the Hidden Costs of “Efficiency”

When a company like Capital One rolls out a new AI voice assistant, the goal is often to create a system that can handle more users faster. And, on the surface, this seems like a win for efficiency. However, for those who are neurodivergent, have sensory sensitivities, or simply need a little extra time to process spoken information, the faster, more robotic assistant is anything but a win.

For many, using keypad inputs or interacting with slower, more human-like assistants was a much more comfortable and effective way to manage tasks like paying bills or checking balances. But the shift to a voice-only system with no alternative can feel alienating. Users are forced into a style of interaction that may not suit their needs, and without proper accommodations, they’re left to adapt — or struggle.

This isn’t an isolated issue. Across the tech industry, from customer service lines to smartphone apps, companies are increasingly opting for voice-first or AI-driven solutions. Yet, in this push for automation, the subtle human element of customer service is often lost — along with the empathy that comes with it.

The Pushback: How Users Are Reacting

As the AI assistant landscape shifts, many users are vocal about their dissatisfaction with these changes. Some argue that AI can never truly replace human interaction, especially when it comes to understanding the needs of a diverse user base.

From Reddit:
One user said:

“The older system let me use the keypad for everything, and I didn’t have to speak at all. Now it forces me to talk even when I don’t want to.”
This user’s frustration reveals the key problem with forcing voice-based interactions: it ignores the reality that some users are not comfortable speaking or may find it difficult to process information quickly.

From X (formerly Twitter):
Another user tweeted:

“I miss the old voice — it felt like it understood I needed time. This new one just speeds through everything.”
Here, the user is expressing a need for more time and a slower pace, something that a robotic-sounding assistant is unable to provide.

From Trustpilot:
A user posted:

“It talks too fast and I can’t even understand the menu options half the time.”
This user points out the speed of the new voice and how it affects comprehension — something especially concerning for those with auditory processing challenges.

From Reddit (again):
One more comment shared:

“This new robot voice is annoying AF. Bring back the old assistant!”
For this user, the problem isn’t just about speed — it’s about how the assistant’s robotic tone makes the experience feel less human and more disconnected.

These reactions aren’t simply complaints; they are signals that AI systems need to evolve alongside the diverse ways people interact with technology. It’s not just about functionality; it’s about understanding the needs of users in a nuanced, empathetic way.

How Tech Companies Can Do Better

While it’s clear that AI and voice assistants are here to stay, it’s essential that companies make their services more inclusive and accessible. The rapid adoption of AI shouldn’t come at the expense of those who rely on alternative methods of interaction.

Here are a few suggestions for how companies like Capital One (and others in the banking and tech sectors) can better serve their customers:
- Offer a Choice of Interaction Methods: Companies should allow users to choose between keypad inputs, voice prompts, and other modes of interaction, ensuring that users can find the method that works best for them.
- Slow Down AI Speech: For users who need extra time to process information, slowing down the speech rate could improve the experience for many people.
- Involve Diverse User Groups in Testing: When developing AI systems, companies should include a range of neurodivergent users and others with accessibility needs in the testing phase, ensuring that the system works for everyone.
- Avoid Over-Promising on Speed: The assumption that faster equals better doesn’t work for everyone. Companies need to be mindful that in the pursuit of speed, they don’t alienate the people who rely on more thoughtful, human-paced interactions.
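
The first two suggestions above are not hard to picture in practice. Here is a rough sketch in Python, assuming the phone system's text-to-speech engine accepts SSML (the W3C speech markup, which includes a `prosody` element for speaking rate and a `break` element for pauses). I don't know what Capital One's system actually runs on, and the function name `build_menu_prompt` and its defaults are my own invention.

```python
from xml.sax.saxutils import escape


def build_menu_prompt(options: dict[str, str],
                      rate: str = "slow",
                      pause_ms: int = 600) -> str:
    """Build an SSML menu prompt that is spoken slowly, pauses between
    options, and always offers the keypad alongside voice.

    Hypothetical helper: assumes an SSML-capable text-to-speech engine.
    """
    parts = []
    for key, label in options.items():
        parts.append(
            f"Press or say {escape(key)} for {escape(label)}."
            f'<break time="{pause_ms}ms"/>'
        )
    body = "".join(parts)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


prompt = build_menu_prompt({"1": "account balance", "2": "make a payment"})
```

Because `rate` and `pause_ms` are plain parameters, they could just as easily be per-caller preferences instead of one speed imposed on everyone.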

Tech for All: Striving for Inclusivity

As AI technology continues to evolve, we must ask ourselves: Who is it really benefiting? A new, faster system may improve efficiency, but if it alienates users who need slower, more customizable options, is it really an improvement?

In a world where we are increasingly dependent on technology for day-to-day tasks, it’s essential that we strive for solutions that are inclusive and accessible for everyone. After all, the most efficient technology is the one that works for everyone, not just those who fit a particular mold.

Have you encountered similar frustrations with voice assistants? Share your experience in the comments below — let’s keep the conversation going about accessibility in AI.
June 9, 2025