<h1>What is WebRTC? Why does it need ‘Signalling’?</h1><p>If you’re new to WebRTC, you must’ve heard that it’s a way to do video calls in your browser without needing to install an app. It’s pretty great!</p><p>However, it uses a bunch of really arcane terminology because it builds upon older technologies such as RTP, RTCP, SDP, ICE, STUN, etc. To understand what WebRTC Signalling is, you must first understand these foundational technologies.</p><p>Readers who are well-versed in this subject might find some of the explanations annoyingly simplistic to read. They will also notice that I am omitting a lot of detail, leading to potentially misleading statements.</p><p>I apologize in advance to these people. I am merely trying to avoid turning this post into a book. If you find a sub-heading too simplistic, please feel free to skip it. :-)</p>
<h2 id="RTP">RTP</h2><p><strong>Real-time Transport Protocol</strong> is a <a href="https://en.wikipedia.org/wiki/Internet_Engineering_Task_Force" target="_blank" rel="noopener">standardized</a> way of taking video or audio data (media) and chopping it up into “packets” (you can literally think of them as packets / parcels) that are sent over the internet using <strong>UDP</strong>. The purpose is to try and deliver them to the destination as quickly as possible.</p><p><strong>UDP</strong> (user datagram protocol) is a packet-based alternative to <strong>TCP</strong> (transmission control protocol), which is connection-based. So when you send something to a destination (<a href="https://en.wikipedia.org/wiki/IP_address" target="_blank" rel="noopener">IP address</a> + port number), it will be delivered if possible but you have no protocol-level mechanism for finding out if it was received, unlike, say, <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_establishment" target="_blank" rel="noopener">TCP ACKs</a>.</p><p>You can think of this as chucking parcels over a wall towards someone whom you can’t see or hear. A bunch of them will probably be lost, and you have no straightforward way to know how many were actually received.</p><p>UDP is used instead of TCP for a number of reasons, but the most important ones are:</p><ol>
<li>
<p>TCP is designed for perfect delivery of all data, so networks will often try too hard to do that and use ridiculous amounts of buffering (sometimes 30 seconds or more!), which leads to latencies that are too large for two people to be able to talk over a call.</p>
</li>
<li>
<p>UDP doesn’t have that problem, but the trade-off is that it gives no guarantees of delivery at all!</p>
<p>You’d be right to wonder why nothing new has been created to be a mid-way point between these two extremes. The reason is that new transport protocols don’t get any uptake because existing systems on the Internet (operating systems, routers, switches, etc) don’t (want to) support them. This is called <a href="https://en.wikipedia.org/wiki/Protocol_ossification" target="_blank" rel="noopener">Protocol ossification</a>, and it's a big problem for the Internet.</p>
<p>Due to this, new protocols are just built on top of UDP and try to add mechanisms to detect packet loss and such. One such mechanism is…</p>
</li>
</ol>
<h2 id="RTCP">RTCP</h2><p><strong>RTP Control Protocol</strong> refers to <a href="https://www.rfc-editor.org/rfc/rfc3550#section-6" target="_blank" rel="noopener">standardized messages</a> (closely related to RTP) that are sent by a media sender to all receivers, and also messages that are sent back by the receiver to the sender (feedback). As you might imagine, this message-passing system has been extended to do a lot of things, but the most important are:</p><ol>
<li>Receivers use this to send feedback to the sender about how many packets were actually received, what the latency was, etc.</li>
<li>Senders send information about the stream to receivers using this, for instance to synchronize audio and video streams (also known as <a href="https://en.wikipedia.org/wiki/Lip_sync" target="_blank" rel="noopener">lipsync</a>), to tell receivers that the stream has ended (a BYE message), etc.</li>
</ol><p>Similar to RTP, these messages are also sent over UDP. You might ask “what if these are lost too”? Good question!</p><p>RTCP packets are sent at regular intervals, so you’d know if you missed one, and network routers and switches will prioritize RTCP packets over other data, so you’re unlikely to lose too many in a row unless there was a complete loss of connectivity.</p><h2 id="Peer">Peer</h2><p>WebRTC is often called a “peer-to-peer” (P2P) protocol. You might’ve heard that phrase in a different context: P2P file transfer, such as Bittorrent.</p><p>The word “peer” contrasts with “server-client” architectures, in which “client” computers can only talk to (or via) “server” computers, not directly to each other.</p><p>We can contrast server-client architecture with peer-to-peer using a real-world example:</p><ul>
<li>If you send a letter to your friend using a postal service, that’s a server-client architecture.</li>
<li>If you leave the letter in your friend’s mailbox yourself, that’s peer-to-peer.</li>
</ul><p>But what if you don’t know what kind of messages the recipient can receive or understand? For that we have…</p>
<h2 id="SDP">SDP</h2><p>Stands for <strong>Session Description Protocol</strong> which is a <a href="https://tools.ietf.org/html/rfc4566" target="_blank" rel="noopener">standardized</a> message format to tell the other side the following:</p><ul>
<li>Whether you want to send and/or receive audio and/or video</li>
<li>How many streams of audio and/or video you want to send / receive</li>
<li>What formats you can send or receive, for audio and/or video</li>
</ul><p>This is called an “offer”. Then the other peer uses the same message format to reply with the same information, which is called an “answer”.</p><p>This constitutes media “negotiation”, also called “SDP exchange”. One side sends an “offer” SDP, the other side replies with an “answer” SDP, and now both sides know what to do.</p><p>As you might expect, there are a bunch of other technical details here, and you can learn all about them from <a href="https://webrtchacks.com/sdp-anatomy/" target="_blank" rel="noopener">this excellent page that explains every little detail</a>. It even explains the format for ICE messages! Which is…</p>
<h2 id="ICE">ICE</h2><p><strong>Interactive Connectivity Establishment</strong> is a <a href="https://www.rfc-editor.org/rfc/rfc5245.html" target="_blank" rel="noopener">standardized</a> mechanism for peers to tell each other how to transmit and receive UDP packets. The simplest way to think of it is that it’s just a list of IP address and port pairs.</p><p>Once both sides have successfully sent each other (“exchanged”) ICE messages, both sides know how to send RTP and RTCP packets to each other.</p><p>Why do we need IP address + port pairs to know how to send and receive packets? For that you need to understand…</p>
<h2 id="How-The-Internet-Works">How The Internet Works</h2><p>If you’re connected to the internet, you always have an <a href="https://en.wikipedia.org/wiki/IP_address" target="_blank" rel="noopener">IP address</a>. That’s usually something like <code>192.168.1.150</code> – a private address that is specific to your local (home) network and has no meaning outside of that. Having someone’s private IP address is basically like having just their house number but no other parts of their address, like the street or the city. Useful if you're living in the same building, but not otherwise.</p>
<p>Most personal devices (computer or phone or whatever) with access to the Internet don’t actually have a <em>public</em> IP address. Picking up the analogy from earlier, a public IP address is the internet equivalent of a full address with a house number, street address, pin code, country.</p>
<p>When you want to connect to (visit) a website, your device actually talks to an ISP (internet service provider) router, which will then talk to the web server on your behalf and ask it for the data (website in this case) that you requested. This process of packet-hopping is called “<a href="https://en.wikipedia.org/wiki/Routing" target="_blank" rel="noopener">routing</a>” of network packets.</p>
<p>This ISP router with a public address is called a NAT (<a href="https://en.wikipedia.org/wiki/Network_address_translation" target="_blank" rel="noopener">Network Address Translator</a>). Like the name suggests, its job is to translate the addresses embedded in packets sent to it from public to private and vice-versa.</p>
<p>Let’s say you want to send a UDP packet to <code>www.google.com</code>. Your browser will resolve that domain to an IP address, say <code>142.250.205.228</code>. Next, it needs a port to send that packet to, and both sides have to pre-agree on that port. Let’s pick <code>16789</code> for now.</p><p>Your device will then allocate a port on your device from which to send this packet, let’s say <code>11111</code>. So the packet header looks a bit like this:</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>192.168.1.150:11111</code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>142.250.205.228:16789</code></td>
</tr>
</tbody>
</table>
<p>Your ISP’s NAT will intercept this packet, and it will replace your private address and port in the <code>From</code> field in the packet header to its own public address, say <code>169.13.42.111</code>, and it will allocate a new sender port, say <code>22222</code>:</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>169.13.42.111:22222</code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>142.250.205.228:16789</code></td>
</tr>
</tbody>
</table>
<p>Due to this, the web server never sees your private address, and all it can see is the public address of the NAT.</p><p>When the server wants to reply, it can send data back to the <code>From</code> address, and it can use the same port that it received the packet on:</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>142.250.205.228:16789</code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>169.13.42.111:22222</code></td>
</tr>
</tbody>
</table>
<p>The NAT remembers that this port <code>22222</code> was recently used for <em>your</em> <code>From</code> address, and it will do the reverse of what it did before:</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>142.250.205.228:16789</code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>192.168.1.150:11111</code></td>
</tr>
</tbody>
</table>
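<p>As a small aside, here is a minimal Python sketch of the device-side view of the tables above. It asks the OS which private address and port it picked for the <code>From</code> field; the destination <code>8.8.8.8:53</code> is just an arbitrary example of a public address, and the NAT’s translation is invisible from the device itself:</p>
<pre><code>import socket

# A minimal sketch: pick a destination and see which private address and
# port the OS allocates for the "From" field of outgoing UDP packets.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.connect(("8.8.8.8", 53))  # UDP "connect" only fixes the destination; nothing is sent yet
print(sock.getsockname())      # e.g. ('192.168.1.150', 11111), the private side of the mapping
sock.send(b"hello")            # fire-and-forget; UDP gives no delivery guarantee
</code></pre>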
<p>And that’s how packets are sent and received by your phone, computer, tablet, whatever, when talking to a server.</p><p>Since at least one side needs to have a public IP address for this to work, how can your phone send messages to your friend’s phone? Both only have private addresses.</p>
<h3 id="Solution-1-Just-Use-A-Server-As-A-Relay">Solution 1: Just Use A Server As A Relay</h3><p>The simplest solution is to have a server in the middle that relays your messages. This is how all text messaging apps such as iMessage, WhatsApp, Instagram, Telegram, etc work.</p><p>You will need to buy a server with a public address, but that’s relatively cheap if you want to send small messages.</p><p>For sending RTP (video and audio) this is accomplished with a TURN (<a href="https://en.wikipedia.org/wiki/Traversal_Using_Relays_around_NAT" target="_blank" rel="noopener">Traversal Using Relays around NAT</a>) server.</p><p>Bandwidth can get expensive very quickly, so you don’t want to always use a TURN server. But this is a fool-proof method to transmit data, so it’s used a backup.</p>
<h3 id="Solution-2-STUN-The-NAT-Into-Doing-What-You-Want">Solution 2: STUN The NAT Into Doing What You Want</h3><p><strong>STUN</strong> stands for “<a href="https://en.wikipedia.org/wiki/STUN" target="_blank" rel="noopener">Simple Traversal of UDP through NATs</a>”, and it works due to a fun trick we can do with most NATs.</p><p>Previously we saw how the NAT will remember the mapping between a “port on its public address” and “your device’s private address and port”. With <a href="https://en.wikipedia.org/wiki/Network_address_translation#Full-cone_NAT" target="_blank" rel="noopener">many NATs</a>, this actually works for <em>any</em> packet sent on that public port by anyone.</p><p>This means if a public server can be used to create such mappings on the NATs of both peers, then the two can send messages to each other from NAT-to-NAT without a relay server!</p><p>Let’s dig into this, and let’s substitute hard-to-follow IP addresses with simple names: <code>AlicePhone</code>, <code>AliceNAT</code>, <code>BobPhone</code>, <code>BobNAT</code>, and finally <code>STUNServer:19302</code>.</p><p>First, <code>AlicePhone</code> follows this sequence:</p><ol>
<li>
<p><code>AlicePhone</code> sends a STUN packet intended for <code>STUNServer:19302</code> using UDP</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>AlicePhone:11111</code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>STUNServer:19302</code></td>
</tr>
</tbody>
</table>
</li>
<li>
<p><code>AliceNAT</code> will intercept this and convert it to:</p>
<table style="border-collapse: collapse; border: 1px dotted white;">
<thead>
<tr>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">From</th>
<th style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;">To</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>AliceNAT:22222 </code></td>
<td style="text-align: left; padding: 5px; border-collapse: collapse; border: 1px dotted white;"><code>STUNServer:19302</code></td>
</tr>
</tbody>
</table>
</li>
<li>
<p>When <code>STUNServer</code> receives this packet, it will know that if someone wants to send a packet to <code>AlicePhone:11111</code>, they could use <code>AliceNAT:22222</code> as the <code>To</code> address. This is an example of an ICE candidate.</p>
</li>
<li>
<p><code>STUNServer</code> will then send a packet back to <code>AlicePhone</code> with this information.</p>
</li>
</ol>
<p>Next, <code>BobPhone</code> does the same sequence and discovers that if someone wants to send a packet to <code>BobPhone:33333</code>, they can use <code>BobNAT:44444</code> as the <code>To</code> address. This is <code>BobPhone</code>’s ICE candidate.</p><p>Now, <code>AlicePhone</code> and <code>BobPhone</code> must exchange these ICE candidates.</p><p>How do they do this? They have no idea how to talk to each other yet.</p><p>The answer is… they <a href="#Solution-1-Just-Use-A-Server-As-A-Relay">Just Use A Server As A Relay</a>! The server used for this purpose is called a <strong>Signalling Server</strong>.</p><p>Note that these are only “candidates” because this mechanism won’t work if one of the two NATs also picks the public port based on the public <code>To</code> address, not just the private <code>From</code> address. This is called a <a href="https://en.wikipedia.org/wiki/Network_address_translation#Symmetric_NAT" target="_blank" rel="noopener">Symmetric NAT</a>, and in these (and other) cases, you have to fall back to <a href="#Solution-1-Just-Use-A-Server-As-A-Relay">TURN</a>.</p>
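<p>To make the idea concrete, here is a toy Python sketch of what STUN does conceptually. It does <em>not</em> speak the real STUN wire format, and <code>stun.example.net:19302</code> is a made-up address: a server with a public IP simply reports back the <code>From</code> address it observed, which is exactly the <code>AliceNAT:22222</code>-style mapping that becomes an ICE candidate:</p>
<pre><code>import socket

# Toy stand-in for STUN (not the real protocol!): the server just echoes back
# the public "From" address it saw, i.e. the mapping the NAT created for you.

def toy_stun_server(port=19302):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        _data, addr = sock.recvfrom(1500)
        sock.sendto(repr(addr).encode(), addr)  # addr is e.g. ('AliceNAT', 22222)

def discover_my_mapping(server=("stun.example.net", 19302)):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"who am I?", server)
    reply, _ = sock.recvfrom(1500)
    print("my public mapping as seen by the server:", reply.decode())
</code></pre>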
<h2 id="Signalling-Server">Signalling Server</h2><p><strong>Signalling</strong> is a technical term that simply means: “a way to pass small messages between peers”. In this case, it’s a way for peers to exchange SDP and ICE candidates.</p><p>Once these small messages have been exchanged, the peers know how to send data to each other over the internet without needing a relay.</p><p>Now open your mind: you could use <em>literally any</em> out of band-mechanism for this. You can use <a href="https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/API_Operations_Amazon_Kinesis_Video_Signaling_Channels.html" target="_blank" rel="noopener">Amazon Kinesis Video Signalling Channels</a>. You can use a custom <a href="https://en.wikipedia.org/wiki/WebSocket" target="_blank" rel="noopener">websocket</a> server or a <a href="https://en.wikipedia.org/wiki/Protocol_Buffers" target="_blank" rel="noopener">ProtoBuf</a> server.</p><p>Heck, Alice and Bob can copy/paste these messages into iMessage on both ends. In theory, you can even use carrier pigeons — it’ll just take a very long time to exchange messages 😉</p>
<p>That’s it, this is what <strong>Signalling</strong> means in a WebRTC context, and why it’s necessary for a successful connection!</p><p>What a <strong>Signalling Server</strong> gives you on top of this is <strong>state management</strong>: checking whether a peer is allowed to send messages to another peer, whether a peer is allowed to join a call, can be invited to a call, which peers are in a call right now, etc.</p><p>Based on your use-case, this part can be really easy to implement or really difficult and full of corner cases. Most people can get away with a really simple protocol, just by adding authorization to <a href="https://gitlab.freedesktop.org/gstreamer/gstreamer/-/blob/main/subprojects/gst-examples/webrtc/signalling/Protocol.md#multi-party-calls-with-a-room" target="_blank" rel="noopener">this multi-party protocol</a> I wrote for the <a href="https://gitlab.freedesktop.org/gstreamer/gstreamer/-/tree/main/subprojects/gst-examples/webrtc/multiparty-sendrecv" target="_blank" rel="noopener">GStreamer WebRTC multiparty send-receive examples</a>. More complex setups, where all peers aren’t equal, require a more bespoke solution.</p><h1>Building GStreamer on Windows the Correct Way</h1><p>For the past 4 years, <a href="https://twitter.com/tp_muller" target="_blank">Tim</a> and I have spent thousands of hours on better Windows support for GStreamer. It started in <a href="http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.html">May 2016</a> when I first wrote about this, and continued with the <a href="http://blog.nirbheek.in/2016/07/building-and-developing-gstreamer-using.html">first draft of the work</a> before it was revised, updated, and upstreamed.<br /></p><p>Since then, we've worked tirelessly to improve Windows support in GStreamer with patches to many projects such as the <a href="https://github.com/mesonbuild/meson/" target="_blank">Meson build system</a>, GStreamer's <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/" target="_blank">Cerbero meta-build system</a>,
and writing build files for several non-GStreamer projects such as x264, openh264, ffmpeg, zlib, bzip2, libffi,
glib, fontconfig, freetype, fribidi, harfbuzz, cairo, pango, gtk, libsrtp, opus, and
many more that I've forgotten.</p><p>More recently, <a href="https://twitter.com/YangSeungha" target="_blank">Seungha</a> has also been working on new GStreamer elements for Windows such as d3d11, mediafoundation, wasapi2, etc. Sometimes we're able to find someone to sponsor all this work, but most of the time it's on our own dime.<br /></p><p>Most of this has been happening in the background; noticed only by people who follow GStreamer development. I think more people should know about the work that's been happening upstream, and the <b>official and supported</b><i> </i>ways to build GStreamer on Windows. Searching for this on Google can be a very confusing experience with the top results being outdated links or just plain clickbait.</p><p>So here's an overview of your options when you want to use GStreamer on Windows:</p><a name="installing"></a><h2 style="text-align: left;">Installing GStreamer on Windows<br /></h2><div style="text-align: left;"> </div><div style="text-align: left;">GStreamer has released MinGW binary installers for Windows since the early 1.0 days using the Cerbero meta-build system which was created by <a href="https://twitter.com/ylatuya" target="_blank">Andoni</a> for the non-upstream "GStreamer SDK" project, which was based on GStreamer 0.10.</div><div style="text-align: left;"> </div><div style="text-align: left;">Today it supports building GStreamer with both MinGW and Visual Studio, and even supports outputting <a href="https://en.wikipedia.org/wiki/Universal_Windows_Platform" target="_blank">UWP</a> packages. So you can actually go and download all of those from the download page:</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><a href="https://gstreamer.freedesktop.org/download/#windows">https://gstreamer.freedesktop.org/download/#windows</a></div><div style="text-align: left;"><br /></div><div style="text-align: left;">This is the easiest way to get started with GStreamer on Windows.</div><div style="text-align: left;"> </div><div style="text-align: left;"><a name="cerbero"></a><h2 style="text-align: left;">Building GStreamer yourself for Deployment</h2></div><div style="text-align: left;"> </div><div style="text-align: left;">If you need to build GStreamer with a custom configuration for deployment, the easiest option is to use Cerbero, which is a meta-build system. It will download all the dependencies for you (including most of the build-tools), build them with Autotools, CMake, or Meson (as appropriate), and output a neat little MSI installer.</div><div style="text-align: left;"> </div><div style="text-align: left;">The README contains all the information you need, including screenshots for how to set things up:</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><a href="https://gitlab.freedesktop.org/gstreamer/cerbero/#description">https://gitlab.freedesktop.org/gstreamer/cerbero/#description</a></div><div style="text-align: left;"><br /></div><div style="text-align: left;">As of a few days ago, after months of work the native Cerbero Windows builds have also been integrated into our <a href="https://gitlab.freedesktop.org/gstreamer/gst-ci/-/merge_requests/334" target="_blank">Continuous Integration pipeline</a> that runs on every merge request, which further improves the quality of our Windows support. 
We already had native Windows CI using gst-build, but this increases our coverage.</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><a name="gst-build"></a><h2>Contributing to GStreamer on Windows</h2></div><div style="text-align: left;"> </div><div style="text-align: left;">If you want to contribute to GStreamer from Windows, the best option is to clone the gstreamer <a href="https://gstreamer.freedesktop.org/documentation/frequently-asked-questions/mono-repository.html" target="_blank">monorepo</a> (derived from gst-build which was created by <a href="https://twitter.com/thiblahute" target="_blank">Thibault</a>), which is basically a meson 'wrapper' project that has all the gstreamer repositories aggregated as subprojects. Once again, the README file is pretty easy to follow and has screenshots for how to set things up:</div><div style="text-align: left;"><br /></div><div style="text-align: left;"><a href="https://gitlab.freedesktop.org/gstreamer/gstreamer#getting-started">https://gitlab.freedesktop.org/gstreamer/gstreamer/</a></div><div style="text-align: left;"><br /></div><div style="text-align: left;">This is also the method used by all GStreamer developers to hack on gstreamer on all platforms, so it should work pretty well out of the box, and it's tested on the CI. If it doesn't work, come poke us on <a href="https://webchat.oftc.net/?channels=%23gstreamer" rel="nofollow" target="_blank">#gstreamer on OFTC IRC</a> (or <a href="https://matrix.to/#_oftc_%23gstreamer:matrix.org" rel="nofollow" target="_blank">the same channel via Matrix</a>) or on the <a href="https://gstreamer.freedesktop.org/lists/" target="_blank">gstreamer mailing list</a>.</div><div style="text-align: left;"> </div><div style="text-align: left;"><h2 id="upstream" style="text-align: left;">It's All Upstream.</h2></div><div style="text-align: left;"> </div><div style="text-align: left;">You don't need any special steps, and you don't need to read complicated blog posts to build GStreamer on Windows. Everything is upstream.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">This post previously contained examples of such articles and posts that are spreading misinformation, but I have removed those paragraphs after discussion with the people who were responsible for them, and to keep this post simple. All I can hope is that it doesn't happen again.<br /></div>
<!--<div style="text-align: left;"> </div><div style="text-align: left;">I feel disappointed when companies come along and pretend like they're the ones who did the work, or write <a href="https://www.collabora.com/news-and-blog/blog/2019/11/26/gstreamer-windows/" rel="nofollow" target="_blank">blog posts with clickbait headlines</a>. You really should've called it something like "How to get more plugins when building on Windows", because right now we get a ton of bad bug reports and <a href="https://lists.freedesktop.org/archives/gstreamer-devel/2020-September/076239.html" target="_blank">confused mailing list questions</a> from users who read your posts.</div><div style="text-align: left;"> </div><div style="text-align: left;">I get it, you want to do SEO, and ok I'll deal with the added support burden. </div><div style="text-align: left;"> </div><div style="text-align: left;"><strike>But it gets me really worked up when you publish <a href="https://www.collabora.com/news-and-blog/blog/2020/09/28/building-gstreamer-text-rendering-overlays-windows/" rel="nofollow" target="_blank">blog posts that are completely unnecessary and actively misrepresent the level of support upstream GStreamer has</a>. News flash: GStreamer has been shipping the pango gstreamer element (which includes <code>textoverlay</code>) on Windows <b><a href="https://gitlab.freedesktop.org/gstreamer/cerbero/-/commit/930e2a60f906832a7700ab7c7f7bf4814a5cc021" target="_blank">for almost a decade</a></b>. </strike></div><div style="text-align: left;"><strike><br /></strike></div><div style="text-align: left;"><strike>I'm glad that they at least <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/-/merge_requests/574" target="_blank">posted the pango bugfixes</a> they did upstream, but representing "I fixed some bugs" as "<a href="https://twitter.com/nirbheek/status/1310930206790619136" target="_blank">hey look you can do textoverlay on windows now</a>" is not a good look.</strike></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><b>Update</b>: after conversation with folks at the company, the latest blog post has been edited and fixed.<br /></div><div style="text-align: left;"><br /></div><div style="text-align: left;">I expect better because I collaborate with many of the people at this company within GStreamer and other projects such as Meson. I'd like to particularly thank <a href="https://twitter.com/xclaesse" target="_blank">Xavier</a> for all his contributions to the Meson build system, and towards finishing up the meson integration in cairo, harfbuzz, etc that we started.<br /></div><div style="text-align: left;"> </div><div style="text-align: left;"><strike>They're great people who understand the importance of working upstream, so I'm not sure why the company they work for is behaving this way. Is there pressure from management? Have they hired some aggressive marketing people? No idea. </strike></div><div style="text-align: left;"><br /></div><div style="text-align: left;"><b>Update</b>: I have been told in private by the company that these incidents were a mistake, and are not part of a deliberate marketing strategy.<br /></div><div style="text-align: left;"> </div><div style="text-align: left;">This sort of marketing is actively harming a FOSS project, and this is not the first time. Please stop. 
Delete <a href="https://twitter.com/Collabora/status/1310662833311539201" target="_blank">the factually wrong tweet</a>, and edit both the company posts so people don't get confused.<br /></div>-->Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com1tag:blogger.com,1999:blog-701969077517001201.post-32306924874975320822020-08-31T22:38:00.006+05:302020-09-09T16:08:24.199+05:30GStreamer 1.18 supports the Universal Windows Platform<div>tl;dr: The GStreamer 1.18 release ships with UWP support out of the box, with official GStreamer binary releases for it. Try out the <strike>1.17.90 pre-release</strike> <a href="https://gstreamer.freedesktop.org/data/pkg/windows/1.18.0/uwp/" target="_blank">1.18.0 release</a> and let us know how it goes! There's also an <a href="https://gitlab.freedesktop.org/seungha.yang/gst-uwp-example" target="_blank">example gstreamer app for UWP</a> that showcases OpenGL support (via ANGLE), audio/video capture, hardware codecs, and WebRTC.</div><div><br /></div><div><h3 style="text-align: left;">Short History Lesson</h3></div><div> </div><div>Last year at the <a href="https://gstreamer.freedesktop.org/conference/2019/" target="_blank">GStreamer Conference in Lyon</a>, I <a href="https://gstconf.ubicast.tv/videos/gstreamer-windows-uwp-and-firefox-on-the-hololens-2/" target="_blank">gave a talk</a> (<a href="https://gstreamer.freedesktop.org/data/events/gstreamer-conference/2019/Nirbheek%20Chauhan%20-%20GStreamer,%20Windows%20UWP,%20and%20Firefox%20on%20the%20HoloLens%202.pdf" target="_blank">slides</a>) about how “Firefox Reality” for the Microsoft <a href="https://en.wikipedia.org/wiki/HoloLens_2" target="_blank">HoloLens 2</a> mixed-reality headset is actually <a href="https://servo.org/" target="_blank">Servo</a>, and it uses GStreamer for all media handling: WebAudio, HTML5 Video, and WebRTC.</div><div><br /></div><div>I also spoke about the work we at <a href="https://www.centricular.com/" target="_blank">Centricular</a> did to port GStreamer to the HoloLens 2. The HoloLens 2 uses the new development target for Windows Store apps: the <a href="https://en.wikipedia.org/wiki/Universal_Windows_Platform" target="_blank">Universal Windows Platform</a>. The majority of win32 APIs have been deprecated, and apps have to use the new <a href="https://en.wikipedia.org/wiki/Windows_Runtime" target="_blank">Windows Runtime</a>, which is a language-agnostic API written from the ground up.<br /></div><div><br /></div><div>So the majority of work went into making sure that Win32 code didn't use deprecated APIs (we used a bunch of them!), and making sure that we could build using the UWP toolchain. Most of that involved two components:<br /></div><div><ul style="text-align: left;"><li><a href="https://gitlab.gnome.org/GNOME/glib" target="_blank">GLib</a>, a cross-platform low-level library / abstraction layer used by GNOME (almost all our win32 code is in here) </li><li><a href="https://gitlab.freedesktop.org/gstreamer/cerbero/" target="_blank">Cerbero</a>, the build aggregator used by GStreamer to build binaries for all platforms supported: Android, iOS, Linux, macOS, Windows (MSVC, MinGW, UWP)</li></ul><div>The target was to port the core of GStreamer, and those plugins with external dependencies that were needed to do playback in <code><audio></code> and <code><video></code> tags. This meant that the only external plugin dependency we needed was FFmpeg, for the gst-libav plugin. 
All this went well, and <a href="https://www.microsoft.com/en-us/p/firefox-reality/9npq78m7nb0r" target="_blank">Firefox Reality successfully shipped</a> with that work.</div><div><br /></div><div><h3 style="text-align: left;">Upstreaming and WebRTC</h3></div><div> </div><div>Building upon that work, for the past few months we've been working on adding support for <a href="http://blog.nirbheek.in/2018/02/gstreamer-webrtc.html" target="_blank">the WebRTC plugin</a>, and also upstreaming as much of the work as possible. This involved a bunch of pieces:</div><div><ol style="text-align: left;"><li>Use only OpenSSL and not GnuTLS in Cerbero because OpenSSL supports targeting UWP. This also had the advantage of moving us from two SSL stacks to one. </li><li>Port a <a href="https://github.com/cisco/libsrtp/pull/495" target="_blank">bunch</a> of <a href="https://github.com/cisco/openh264/pull/3301" target="_blank">external</a> optional <a href="https://gitlab.xiph.org/xiph/opus/-/merge_requests/13" target="_blank">dependencies</a>
to Meson so that they could be built with Meson, which is the easiest way for a cross-platform project to support UWP. If
your Meson project builds on Windows, it will build on UWP
with minimal or no build changes.</li><li>Rebase the GLib patches that I didn't find the time to upstream last year on top of 2.62, split into smaller pieces that will be easier to upstream, update for new Windows SDK changes, remove some of the hacks, and so on.</li><li>Rework and rewrite the Cerbero patches I wrote last year that were in no shape to be upstreamed.</li><li>Ensure that our OpenGL support continues to work using <a href="https://www.nuget.org/packages/ANGLE.WindowsStore.Servo/" target="_blank">Servo's ANGLE UWP port</a><br /></li><li>Write a new plugin for audio capture called wasapi2, great work by <a href="https://medium.com/@seungha.yang" target="_blank">Seungha Yang</a>.</li><li>Write a new plugin for video capture called mfvideosrc as part of the media foundation plugin which is new in GStreamer 1.18, also by Seungha.<br /></li><li>Write <a href="https://gitlab.freedesktop.org/seungha.yang/gst-uwp-example" target="_blank">a new example UWP app</a> to test all this work, <i>also</i> done by Seungha! 😄</li><li>Run the app through the Windows App Certification Kit<br /></li></ol><div>And <a href="https://gitlab.freedesktop.org/groups/gstreamer/-/merge_requests?scope=all&utf8=%E2%9C%93&state=merged&label_name[]=UWP" rel="nofollow" target="_blank">several miscellaneous tasks and bugfixes</a> that we've lost count of.<br /></div><div><br /></div><div>Our highest priority this time around was making sure that everything can be upstreamed to GStreamer, and it was quite a success! Everything needed for WebRTC support on UWP has been merged, and you can use GStreamer in your UWP app by downloading the official GStreamer binaries starting with the 1.18 release.</div><div><br /></div><div>On top of everything in the above list, thanks to Seungha, GStreamer on UWP now also supports:</div><div><ol style="text-align: left;"><li><a href="https://medium.com/@seungha.yang/bringing-microsoft-media-foundation-to-gstreamer-27b1316351ee" target="_blank">Hardware-accelerated encoding via Mediafoundation</a></li><li><a href="https://medium.com/@seungha.yang/windows-dxva2-via-direct3d-11-support-in-gstreamer-1-17-1837ecc1691a" target="_blank">Hardware-accelerated decoding via D3D11/DXVA2</a> </li></ol><p><br /></p></div><div><h3>Try it out!<br /></h3></div><div> </div><div>The <a href="https://gitlab.freedesktop.org/seungha.yang/gst-uwp-example" target="_blank">example gstreamer app</a> I mentioned above showcases all this. Go check it out, and don't forget to read <a href="https://gitlab.freedesktop.org/seungha.yang/gst-uwp-example#example-app-for-using-gstreamer-on-uwp" target="_blank">the README file</a>!</div><div> </div><div style="text-align: left;"><h3>Next Steps</h3></div><div><div> </div><div>The most important next step is to upstream as many of the <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/-/blob/7ce7caaa6f8e315573110f5ab796c19884d44303/recipes/glib.recipe#L138" rel="nofollow" target="_blank">GLib patches we worked on as possible</a>, and then spend time porting a bunch of GLib APIs that we currently stub out when building for UWP.<br /></div><div><br /></div><div>Other than that, <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/-/issues/286" target="_blank">enabling gst-libav</a> is also an interesting task since it will allow apps to use FFmpeg software codecs in their gstreamer UWP app. 
People should use the hardware-accelerated d3d11 decoders and mediafoundation encoders for optimal power consumption and performance, but sometimes it's not possible because codec support is very device-dependent. </div><div><br /></div><div><div style="text-align: left;"><h3>Parting Thoughts<br /></h3></div></div></div><div> </div><div>I'd like to thank Mozilla for sponsoring the bulk of this work. We at Centricular greatly value partners that understand the importance of working with upstream projects, and it has been excellent working with the Servo team members, particularly <a href="https://twitter.com/lastontheboat" target="_blank">Josh Matthews</a>, <a href="https://twitter.com/asajeffrey" target="_blank">Alan Jeffrey</a>, and <a href="https://twitter.com/ManishEarth/" target="_blank">Manish Goregaokar</a>.<br /></div><div><br /></div><div>In the second week of August, <a href="https://blog.mozilla.org/blog/2020/08/11/changing-world-changing-mozilla/" target="_blank">Mozilla restructured</a> and the Servo team was one of the teams that was dissolved. I wish them all the best in their future endeavors, and I can't wait to see what they work on next. <a href="https://talentdirectory.mozilla.org/" target="_blank">They're all brilliant people</a>.</div><div><br /></div><div>Thanks to the forward-looking and community-focused approach of the Servo team, I am confident that <a href="https://github.com/servo/servo/discussions/27575" target="_blank">the project will figure things out</a> to forge its own way forward, and for the same reason, I expect that GStreamer's UWP support will continue to grow.<br /></div></div></div><h1>GStreamer's Meson and Visual Studio Journey</h1><div dir="ltr" style="text-align: left;" trbidi="on">
<br />
Almost 3 years ago, <a href="http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.html">I wrote</a> about how we at <a href="https://www.centricular.com/">Centricular</a> had been working on an experimental port of <a href="https://gstreamer.freedesktop.org/">GStreamer</a> from <a href="https://twitter.com/SpacePootler/status/1119337328558977025">Autotools</a> to the <a href="https://mesonbuild.com/">Meson build system</a> for faster builds on all platforms, and to allow building with Visual Studio on Windows.<br />
<br />
At the time, the response was mixed, and for good reason—Meson was a very new build system, and it needed to work well on all the targets that GStreamer supports, which meant <i>all</i> major operating systems. Meson did aim to support all of those, but a lot of work was required to bring platform support up to speed with the requirements of a non-trivial project like GStreamer.<br />
<br />
<span style="font-size: large;">The Status: Today!</span><br />
<br />
After years of work across several components (Meson, Ninja, Cerbero, etc), GStreamer is being built with Meson on all platforms! Autotools is scheduled to be removed in the next release cycle (1.18). Edit: <a href="https://lists.freedesktop.org/archives/gstreamer-devel/2019-October/073113.html">as of October 2019, Autotools has been removed</a>.<br />
<br />
The <a href="https://gstreamer.freedesktop.org/releases/1.16/">first stable release</a> with this work was 1.16, which was released yesterday. It has already led to a number of new capabilities:<br />
<ul style="text-align: left;">
<li>GStreamer can be <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/#enabling-visual-studio-support" target="_blank">built with Visual Studio</a> on Windows inside <a href="https://gitlab.freedesktop.org/gstreamer/cerbero">Cerbero</a>, which means we now ship official binaries for GStreamer <a href="https://gstreamer.freedesktop.org/download">built with the MSVC toolchain</a>.</li>
<li>From-scratch Cerbero builds are much faster on all platforms, which has aided the implementation of CI-gated merge requests on GitLab.</li>
<li>The developer workflow has been streamlined and is the same on all platforms (Linux, Windows, macOS) using the <a href="https://gitlab.freedesktop.org/gstreamer/gst-build/">gst-build meta-project</a>. The meta-project can also be used for cross-compilation (Android, iOS, Windows, Linux).</li>
<li>The Windows developer workflow no longer requires installing several packages by hand or setting up an MSYS environment. All you need is Git, Python 3, Visual Studio, and 15 min for the initial build.</li>
<li>Profiling on Windows is now possible, and I've personally used it to profile and fix numerous Windows-specific performance issues.</li>
<li>Visual Studio projects that use GStreamer now have debug symbols since we're no longer mixing MinGW and MSVC binaries. This also enables usable crash reports and symbol servers.</li>
<li>We can ship plugins that can only be built with MSVC on Windows, such as the Intel MSDK hardware codec plugin, Directshow plugins, and also easily enable new Windows 10 features in existing plugins such as WASAPI. </li>
<li>iOS bitcode builds are more correct, since Meson is smart enough to know how to disable incompatible compiler options on specific build targets.</li>
<li>The iOS framework now also ships shared libraries in addition to the static libraries. </li>
</ul>
Overall, it's been a huge success and we're really happy with how things have turned out!<br />
<br />
You can download the <a href="https://gstreamer.freedesktop.org/download/">prebuilt MSVC binaries</a>, <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/#windows-setup">reproduce them yourself</a>, or <a href="https://gitlab.freedesktop.org/gstreamer/gst-build/#getting-started">quickly bootstrap a GStreamer development environment</a>. The choice is yours!<br />
<br />
<span style="font-size: large;">Further Musings</span><br />
<br />
While working on this over the years, what's really stood out to me was how this sort of gargantuan task was made possible through the power of community-driven FOSS and <a href="https://www.centricular.com/about">community-focused consultancy</a>.<br />
<br />
Our build system migration quest has been long with valleys full of yaks with thick coats of fur, and it would have been prohibitively expensive for a single entity to sponsor it all. Thanks to the inherently collaborative nature of community FOSS projects, people from various backgrounds and across companies could come together and make this possible.<br />
<br />
There are many other examples of this, but seeing the improbable happen from the inside is something special.<br />
<br />
Special shout-outs to <a href="https://www.zeiss.com/" target="_blank">ZEISS</a>, <a href="https://www.barco.com/" target="_blank">Barco</a>, <a href="https://www.pexip.com/" target="_blank">Pexip</a>, and <a href="https://www.cablecast.tv/" target="_blank">Cablecast.tv</a> for sponsoring various parts of this work!<br />
<br />
Their contributions also made it easier for us to spend thousands more hours of non-sponsored time to fill in the gaps so that all the sponsored work done could be upstreamed in a form that's useful for everyone who uses GStreamer. This sort of thing is, in my opinion, an essential characteristic of being a community-focused consultancy, and we make sure that it always has high priority.</div>
<h1>A simple method of measuring audio latency</h1><div dir="ltr" style="text-align: left;" trbidi="on">
In my <a href="http://blog.nirbheek.in/2018/03/low-latency-audio-on-windows-with.html">previous blog post</a>, I talked about how I improved the latency of GStreamer's default audio capture and render elements on Windows.<br />
<br />
An important part of any such work is a way to accurately measure the latencies in your audio path.<br />
<br />
Ideally, one would use a mechanism that can track your buffers and give you a detailed breakdown of how much latency each component of your system adds. For instance, with an audio pipeline like this:<br />
<br />
audio-capture → filter1 → filter2 → filter3 → audio-output<br />
<br />
If you use GStreamer, you can use the <a href="https://gstreamer.freedesktop.org/documentation/design/tracing.html#print-processing-latencies">latency tracer</a> to measure how much latency filter1 adds, filter2 adds, and so on.<br />
<br />
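For instance, with the GStreamer Python bindings you could enable the latency tracer with environment variables, something along these lines (the pipeline string is just a stand-in for the one above, using <code>identity</code> elements as the "filters"):<br />
<pre><code>import os

# The tracer must be configured before GStreamer is initialized;
# its measurements show up in the debug log.
os.environ["GST_TRACERS"] = "latency"
os.environ["GST_DEBUG"] = "GST_TRACER:7"

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)
pipeline = Gst.parse_launch(
    "autoaudiosrc ! identity name=filter1 ! identity name=filter2 ! "
    "identity name=filter3 ! autoaudiosink"
)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
</code></pre>
<br />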
However, sometimes you need to measure latencies added by components <i>outside</i> of your control, for instance the audio APIs provided by the operating system, the audio drivers, or even the hardware itself. In that case it's really difficult, bordering on impossible, to do an automated breakdown.<br />
<br />
But we do need some way of measuring those latencies, and I needed that for the aforementioned work. Maybe we can get an aggregated (total) number?<br />
<br />
There's a simple way to do that if we can create a loopback connection in the audio setup. What's a <i>loopback</i> you ask?<br />
<br />
<div style="text-align: center;">
<img alt="Ouroboros snake biting its tail" border="0" src="https://upload.wikimedia.org/wikipedia/commons/c/c8/Ouroboros-simple.svg" width="40%" /></div>
<br />
Essentially, if we can redirect the audio output back to the audio input, that's called a loopback. The simplest way to do this is to connect the speaker-out/line-out to the microphone-in/line-in with a two-sided 3.5mm jack.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9QUn38puTN7zZ_ej3ogxYkEN_w06MRgnlfQKnTEfAaWJzE2HFUg42slqHwx5hTHd9wS1HJG_sM9RXmDDe1s59-6rr4LILvKBqx9fmWqA9s2AxB7azP54U8QLEYfL-SocDBndo44mhJtPY/s1600/photo_2018-03-27_13-52-08.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="photo of male-to-male 3.5mm jack connecting speaker-out to mic-in" border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9QUn38puTN7zZ_ej3ogxYkEN_w06MRgnlfQKnTEfAaWJzE2HFUg42slqHwx5hTHd9wS1HJG_sM9RXmDDe1s59-6rr4LILvKBqx9fmWqA9s2AxB7azP54U8QLEYfL-SocDBndo44mhJtPY/s400/photo_2018-03-27_13-52-08.jpg" width="400" /></a></div>
<br />
Now, when we send an audio wave down to the audio output, it'll show up on the audio input.<br />
<br />
Hmm, what if we store the <a href="https://developer.gnome.org/glib/stable/glib-Date-and-Time-Functions.html#g-get-monotonic-time">current time</a> when we send the wave out, and compare it with the current time when we get it back? Well, that's the total end-to-end latency!<br />
<br />
If we send out a wave periodically, we can measure the latency continuously, even as things are switched around or the pipeline is dynamically reconfigured.<br />
<br />
Some of you may notice that this is somewhat similar to how the <code>ping</code> command measures latencies across the Internet.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV62xBnSrkrZIw1zW1c3rdzdrbcuCb3xx_4oyfjORr14kqgJvsAdiTPDaIuErXQVFUafmjuHwXUAYt0CaUrP1aiL_V9mS8qergunJ5DkKfXVW_uFGPgzxTzfQpdQCCMePK-WG_bjJMlIvF/s1600/ping.png" imageanchor="1"><img alt="screenshot of ping to 192.168.1.1" border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV62xBnSrkrZIw1zW1c3rdzdrbcuCb3xx_4oyfjORr14kqgJvsAdiTPDaIuErXQVFUafmjuHwXUAYt0CaUrP1aiL_V9mS8qergunJ5DkKfXVW_uFGPgzxTzfQpdQCCMePK-WG_bjJMlIvF/s1600/ping.png" /></a></div>
<br />
<br />
Just like a network connection, the loopback connection can be lossy or noisy, f.ex. if you use loudspeakers and a microphone instead of a wire, or if you have (ugh) noise in your line. But unlike network packets, we lose all context once the waves leave our pipeline and we have no way of uniquely identifying each wave.<br />
<br />
So the simplest reliable implementation is to have only one wave traveling down the pipeline at a time. If we send a wave out, say, once a second, we can wait about one second for it to show up, and otherwise presume that it was lost.<br />
<br />
That's exactly how the <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad/html/gst-plugins-bad-plugins-audiolatency.html">audiolatency GStreamer plugin</a> that I wrote works! Here you can see its output while measuring the combined latency of the <a href="http://blog.nirbheek.in/2018/03/low-latency-audio-on-windows-with.html">WASAPI source and sink elements</a>:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuRGIBxIx4Y6tWYT2YF3t11Ud69L1TdkK444Bbiji3cP1_wGXssbKhiiR2jA3S1ox13-pdf3uPTucAw0twEX9bq2vECdl3LHTr4gqzQ7p_pJACC5BjdGe4YJZSvonWJq0D7X1tGIZbdfAJ/s1600/wasapi-latency.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuRGIBxIx4Y6tWYT2YF3t11Ud69L1TdkK444Bbiji3cP1_wGXssbKhiiR2jA3S1ox13-pdf3uPTucAw0twEX9bq2vECdl3LHTr4gqzQ7p_pJACC5BjdGe4YJZSvonWJq0D7X1tGIZbdfAJ/s1600/wasapi-latency.png" /></a></div>
<br />
The first measurement will always be wrong because of various implementation details in the audio stack, but the next measurements should all be correct.<br />
<br />
This mechanism does place an upper bound on the latency that we can measure, and on how often we can measure it, but it should be possible to take more frequent measurements by sending a new wave as soon as the previous one was received (with a 1 second timeout). So this is an enhancement that can be done if people need this feature.<br />
<br />
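To give an idea of how the element is used in practice, here is a rough sketch with the GStreamer Python bindings. It assumes your GStreamer build has the audiolatency and wasapi plugins, and the property names are taken from the element documentation, so do verify them against your version:<br />
<pre><code>import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Captured audio flows into audiolatency, and the periodic waves it generates
# flow out to the sink, closing the loop through the physical loopback cable.
# print-latency=true makes it log each round-trip measurement.
pipeline = Gst.parse_launch(
    "wasapisrc low-latency=true ! audioconvert ! audioresample ! "
    "audiolatency print-latency=true ! "
    "audioconvert ! audioresample ! wasapisink low-latency=true"
)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
</code></pre>
<br />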
Hope you find the element useful; <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad/html/gst-plugins-bad-plugins-audiolatency.html#gst-plugins-bad-plugins-audiolatency.description">go forth and measure</a>!</div>
<h1>Low-latency audio on Windows with GStreamer</h1><div dir="ltr" style="text-align: left;" trbidi="on">Digital audio is so ubiquitous that we rarely stop to think or wonder how the gears turn underneath our all-pervasive apps for entertainment. Today we'll look at one specific piece of the machinery: latency.<br />
<br />
Let's say you're making a video of someone's birthday party with an app on your phone. Once the recording starts, you don't care when the app starts writing it to disk—as long as everything is there in the end.<br />
<br />
However, if you're having a Skype call with your friend, it matters a <i>whole lot</i> how long it takes for the video to reach the other end and vice versa. It's impossible to have a conversation if the lag (latency) is too high.<br />
<br />
The difference is, do you need real-time feedback or not?<br />
<br />
Other examples, in order of increasingly stricter latency requirements are: live video streaming, security cameras, augmented reality games such as <a href="https://en.wikipedia.org/wiki/Pok%C3%A9mon_Go" target="_blank">Pokémon Go</a>, multiplayer video games in general, audio effects apps for live music recording, and many many more.<br />
<br />
“But Nirbheek”, you might ask, “why doesn't everyone always ‘immediately’ send/store/show whatever is recorded? Why do people have to worry about latency?” and that's a great question!<br />
<br />
To understand that, check out my previous blog post, <a href="http://blog.nirbheek.in/2018/03/latency-in-digital-audio.html" target="_blank">Latency in Digital Audio</a>. It's also a good primer on analog vs digital audio!<br />
<br />
<h2 style="text-align: left;">Low latency on consumer operating systems</h2><div style="text-align: left;"><br />
</div><div style="text-align: left;">Each operating system has its own set of application APIs for audio, and each has a lower bind on the achievable latency:</div><div style="text-align: left;"><br />
</div><ul style="text-align: left;"><li>Linux has <a href="https://www.alsa-project.org/main/index.php/ALSA_Library_API" target="_blank">alsa-lib</a> (old), <a href="https://en.wikipedia.org/wiki/Pulseaudio" target="_blank">Pulseaudio</a> (standard), <a href="https://en.wikipedia.org/wiki/JACK_Audio_Connection_Kit" target="_blank">JACK</a> (pro-audio), and <a href="https://pipewire.org/" target="_blank">Pipewire</a> (<a href="https://blogs.gnome.org/uraeus/2018/01/26/an-update-on-pipewire-the-multimedia-revolution-an-update/" target="_blank">under development</a>)</li>
<li>macOS and iOS have <a href="https://en.wikipedia.org/wiki/Core_Audio" target="_blank">CoreAudio</a> (standard, pro-audio)</li>
<li>Android has <a href="https://source.android.com/devices/audio/" target="_blank">AudioFlinger</a> (Java API, android.media), <a href="https://en.wikipedia.org/wiki/OpenSL_ES" target="_blank">OpenSL ES</a> (C/C++ API), and <a href="https://source.android.com/devices/audio/aaudio" target="_blank">AAudio</a> (C/C++ API, new, pro-audio)</li>
<li>Windows has <a href="https://en.wikipedia.org/wiki/Directsound" target="_blank">DirectSound</a> (deprecated), <a href="https://en.wikipedia.org/wiki/Technical_features_new_to_Windows_Vista#Audio_stack_architecture" target="_blank">WASAPI</a> (standard), and <a href="https://en.wikipedia.org/wiki/Audio_Stream_Input/Output" target="_blank">ASIO</a> (proprietary, old, pro-audio).</li>
<li>BSDs still use <a href="https://en.wikipedia.org/wiki/Open_Sound_System">OSS</a></li>
</ul><div style="text-align: left;"><br />
</div><div style="text-align: left;">GStreamer already has plugins for almost all of these<a href="#gst-plugins">¹</a> (plus others that aren't listed here), and on Windows, GStreamer has been using the DirectSound API by default for audio capture and output since the very beginning.<br />
<br />
However, the DirectSound API was deprecated in Windows XP, and with Vista, it was removed and replaced with an emulation layer on top of the newly-released WASAPI. As a result, the plugin can't be configured to have less than 200ms of latency, which makes it unsuitable for all the low-latency use-cases mentioned above. The DirectSound API is quite crufty and unnecessarily complex anyway.<br />
<br />
GStreamer is rarely used in video games, but it is widely used for live streaming, audio/video calls, and other real-time applications. Worse, the WASAPI GStreamer plugins had been effectively untouched and unused since their initial implementation in 2008, and were completely broken<a href="#gst-windows">²</a>.<br />
<br />
This left no way to achieve low-latency audio capture or playback on Windows using GStreamer.<br />
<br />
The situation became particularly dire when GStreamer added a new <a href="http://blog.nirbheek.in/2018/02/gstreamer-webrtc.html">implementation of the WebRTC spec</a> in this <a href="https://gstreamer.freedesktop.org/releases/1.14/">release cycle</a>. People trying it out on Windows were going to see much higher latencies than they should.<br />
<br />
Luckily, I rewrote most of the WASAPI plugin code in January and February, and it should now work well on all versions of Windows from Vista to 10! You can get <a href="https://gstreamer.freedesktop.org/data/pkg/windows/1.14.0.1/">binary installers for GStreamer</a> or <a href="https://gstreamer.freedesktop.org/documentation/installing/building-from-source-using-cerbero.html">build it from source</a>.<br />
<br />
<h2 style="text-align: left;">Shared and Exclusive WASAPI</h2><br />
WASAPI allows applications to open sound devices in two modes: <i>shared</i> and <i>exclusive</i>. As the name suggests, <i>shared</i> mode allows multiple applications to output to (or capture from) an audio device at the same time, whereas <i>exclusive</i> mode does not.<br />
<br />
Almost all applications should open audio devices in shared mode. It would be quite disastrous if your YouTube videos played without sound because Spotify decided to open your speakers in exclusive mode.<br />
<br />
In shared mode, the audio engine has to resample and mix audio streams from all the applications that want to output to that device. This increases latency because the engine must maintain its own audio ringbuffer for all this mixing, from which audio buffers are periodically written out to the audio device.<br />
<br />
In theory, hardware mixing could be used if the sound card supports it, but very few sound cards implement that now since it's so cheap to do in software. On Windows, only high-end audio interfaces used for professional audio implement this.<br />
<br />
Another option is to allocate your audio engine buffers directly in the sound card's memory with DMA, but that complicates the implementation and relies on good drivers from hardware manufacturers. Microsoft has tried similar approaches in the past with DirectSound and been burned by it, so it's not a route they took with WASAPI<a href="#ms-audio-history">³</a>.<br />
<br />
On the other hand, some applications know they will be the only ones using a device, and for them all this machinery is a hindrance. This is why <i>exclusive</i> mode exists. In this mode, if the audio driver is implemented correctly, the application's buffers will be directly written out to the sound card, which will yield the lowest possible latency.<br />
<br />
<h2 style="text-align: left;">Audio latency with WASAPI</h2><br />
So what kind of latencies <i>can</i> we get with WASAPI?<br />
<br />
That depends on the <i>device period</i> that is being used. The term <i>device period</i> is a fancy way of saying <i>buffer size</i>; specifically the buffer size that is used in each call to your application that fetches audio data.<br />
<br />
This is the same period with which audio data will be written out to the actual device, so it is the major contributor to latency in the entire machinery.<br />
<br />
If you're using the <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/dd370865">AudioClient</a> interface in WASAPI to initialize your streams, the default period is 10ms. This means the theoretical <i>minimum</i> latency you can get in <i>shared mode</i> would be 10ms (audio engine) + 10ms (driver) = 20ms. In practice, it'll be somewhat higher due to various inefficiencies in the subsystem.<br />
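<br />
For illustration, here's a minimal sketch (not complete code: COM setup, device activation, and error handling are omitted, and <code>client</code> is assumed to be an already-activated <code>IAudioClient</code>) of how an application can query that period:<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">#define COBJMACROS
#include &lt;audioclient.h&gt;

/* Periods are reported in 100-nanosecond units */
REFERENCE_TIME default_period, min_period;

IAudioClient_GetDevicePeriod (client, &amp;default_period, &amp;min_period);
/* default_period is typically 100000, i.e. the 10ms engine period
 * used for shared-mode streams */
</pre>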
<br />
When using <i>exclusive mode</i>, there's no engine latency, so the same number goes down to ~10ms.<br />
<br />
These numbers are decent for most use-cases, but as I explained in my <a href="http://blog.nirbheek.in/2018/03/latency-in-digital-audio.html">previous blog post</a>, this is totally insufficient for pro-audio use-cases such as applying live effects to music recordings. You really need latencies lower than 10ms there.<br />
<br />
<h2 style="text-align: left;">Ultra-low latency with WASAPI</h2><br />
Starting with Windows 10, WASAPI removed most of its aforementioned inefficiencies, and introduced a new interface: <a href="https://msdn.microsoft.com/library/windows/desktop/dn911487">AudioClient3</a>. If you initialize your streams with this interface, and if your audio driver is implemented correctly, you can configure a device period of just <i>2.67ms</i> at 48KHz.<br />
<br />
The best part is that this is the period not just in exclusive mode but <i>also in shared mode</i>, which brings WASAPI almost on par with JACK and CoreAudio.<br />
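<br />
As a rough sketch (same caveats as before: COM setup and error handling omitted, and <code>client</code> is assumed to be an already-activated <code>IAudioClient3</code>), requesting the minimum shared-mode period looks something like this:<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">#define COBJMACROS
#include &lt;audioclient.h&gt;

WAVEFORMATEX *format = NULL;
UINT32 default_p, fundamental_p, min_p, max_p; /* all in frames */

IAudioClient3_GetMixFormat (client, &amp;format);
IAudioClient3_GetSharedModeEnginePeriod (client, format,
    &amp;default_p, &amp;fundamental_p, &amp;min_p, &amp;max_p);
/* With the Windows 10 HD Audio class driver, min_p is typically
 * 128 frames: 128 / 48000 = ~2.67ms */
IAudioClient3_InitializeSharedAudioStream (client,
    AUDCLNT_STREAMFLAGS_EVENTCALLBACK, min_p, format, NULL);
</pre>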
<br />
So that was the good news. Did I mention there's bad news too? Well, now you know.<br />
<br />
The first bit is that these numbers are only achievable if you use Microsoft's implementation of the Intel HD Audio standard for consumer drivers. That part is manageable; you follow <a href="https://blogs.msdn.microsoft.com/matthew_van_eerde/2010/08/23/troubleshooting-how-to-install-the-microsoft-hd-audio-class-driver/">some badly-documented steps</a> and it works out fine.<br />
<br />
Then you realize that if you want to use something more high-end than an Intel HD Audio sound card, you still see 10ms device periods, unless you use <a href="http://www.motu.com/newsitems/windows-wave-rt-support-is-now-shipping">one of the rare</a> pro-audio interfaces whose drivers use the new <a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/audio/understanding-the-wavert-port-driver">WaveRT</a> driver model instead of the old <a href="https://msdn.microsoft.com/en-us/library/windows/hardware/ff538767">WaveCyclic</a> model.<br />
<br />
It seems the pro-audio industry made the decision to stick with ASIO since it already provides <5ms latency. They don't care that the API is proprietary, and that most applications can't actually use it because of that. All the apps that are used in the pro-audio world already work with it.<br />
<br />
The strange part is that all this information is nowhere on the Internet and seems to lie solely in the minds of the Windows audio driver cabals across the US and Europe. It's surprising and frustrating for someone used to working in the open to see such counterproductive information asymmetry, and <a href="https://github.com/kinetiknz/cubeb/issues/324">I'm not the only one</a>.<br />
<br />
This is where I plug open-source and talk about how Linux has had ultra-low latencies for years since all the audio drivers are open-source, follow the same <a href="https://www.kernel.org/doc/html/v4.10/sound/kernel-api/index.html">ALSA driver model</a><a href="#alsa-kernel">⁴</a>, and are constantly improved. JACK is probably the most well-known low-latency audio engine in existence, and was born on Linux. People are even using Pulseaudio these days to work with <5ms latencies.<br />
<br />
But this blog post is about Windows and WASAPI, so let's get back on track.<br />
<br />
To be fair, Microsoft is not to blame here. Decades ago they decided not to work more closely with the companies that write drivers for their standard hardware components, and they're still paying the price for it. Blue screens of death were the most visible fallout, but the current audio situation is an indication that losing control of your platform has more dire consequences.<br />
<br />
There is one more bit of bad news. In my testing, I wasn't able to get glitch-free <i>capture</i> of audio in the source element using the AudioClient3 interface at the minimum configurable latency in shared mode, even with <a href="https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/wasapi/gstwasapiutil.c#n980">critical thread priorities</a>, unless there was nothing else running on the machine.<br />
<br />
As a result, this feature is disabled by default on the source element. This is unfortunate, but not a great loss since the same device period is achievable in exclusive mode without glitches.<br />
<br />
<h2 style="text-align: left;">Measuring WASAPI latencies</h2><br />
Now that we're back from our detour, the executive summary is that the GStreamer WASAPI <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad/html/gst-plugins-bad-plugins-wasapisrc.html#gst-plugins-bad-plugins-wasapisrc.description">source</a> and <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad/html/gst-plugins-bad-plugins-wasapisink.html#gst-plugins-bad-plugins-wasapisink.description">sink</a> elements now use the latest recommended WASAPI interfaces. You should test them out and see how well they work for you!<br />
<br />
By default, a device is opened in shared mode with a conservative latency setting. To force the stream into the lowest latency possible, set <i>low-latency=true</i>. If you're on Windows 10 and want to force-enable/disable the use of the AudioClient3 interface, toggle the <i>use-audioclient3</i> property.<br />
<br />
To open a device in exclusive mode, set <i>exclusive=true</i>. This will ignore the <i>low-latency</i> and <i>use-audioclient3</i> properties since they only apply to shared mode streams. When a device is opened in exclusive mode, the stream will always be configured for the lowest possible latency by WASAPI.<br />
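<br />
For example, pipelines along these lines (using the properties described above; the exact behaviour will depend on your hardware and driver) exercise the different modes:<br />
<br />
<code><br />
$ gst-launch-1.0 audiotestsrc ! wasapisink low-latency=true<br />
<br />
$ gst-launch-1.0 wasapisrc exclusive=true ! queue ! audioconvert ! audioresample ! wasapisink exclusive=true<br />
</code><br />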
<br />
To measure the actual latency in each configuration, you can use the new <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-bad/html/gst-plugins-bad-plugins-audiolatency.html#gst-plugins-bad-plugins-audiolatency.description">audiolatency</a> plugin that I wrote to get hard numbers for the total end-to-end latency including the latency added by the GStreamer audio ringbuffers in the source and sink elements, the WASAPI audio engine (capture and render), the audio driver, and so on.<br />
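<br />
As a sketch, a pipeline along these lines (with the speakers audible to the microphone, or a loopback cable from output to input) will measure the round trip and print it; see the element documentation for the authoritative list of properties:<br />
<br />
<code><br />
$ gst-launch-1.0 wasapisrc low-latency=true ! audiolatency print-latency=true ! wasapisink low-latency=true<br />
</code><br />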
<br />
I look forward to hearing what your numbers are on Windows 7, 8.1, and 10 in all these configurations! ;)<br />
<br />
<a href="" name="gst-plugins"></a><br />
<span style="font-size: x-small;">1. The only ones missing are AAudio, because it's very new, and ASIO, which is a proprietary API with licensing requirements.</span><br />
<a href="" name="gst-windows"></a><br />
<span style="font-size: x-small;">2. It's no secret that although lots of people use GStreamer on Windows, the majority of GStreamer developers work on Linux and macOS. As a result, the Windows plugins haven't always gotten a lot of love. It doesn't help that <a href="https://gstreamer.freedesktop.org/documentation/installing/building-from-source-using-cerbero.html">building GStreamer on Windows</a> can be a daunting task. This is actually one of the major reasons why we're moving to Meson, but I've already <a href="http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.html">written about that elsewhere</a>!</span><br />
<a href="" name="ms-audio-history"></a><br />
<span style="font-size: x-small;">3. My knowledge about the history of the decisions behind the Windows Audio API is spotty, so corrections and expansions on this are most welcome!</span><br />
<a href="" name="alsa-kernel"></a><br />
<span style="font-size: x-small;">4. The ALSA drivers in the Linux kernel should not be confused with the ALSA userspace library.</span></div></div>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com3tag:blogger.com,1999:blog-701969077517001201.post-37037862931355634942018-03-14T06:43:00.000+05:302018-04-05T04:04:59.238+05:30Latency in Digital Audio<div dir="ltr" style="text-align: left;" trbidi="on">
We've come a long way since <a href="https://en.wikipedia.org/wiki/Invention_of_the_telephone" target="_blank">Alexander Graham Bell</a>, and everything's turned digital.<br />
<br />
Compared to analog audio, <a href="https://en.wikipedia.org/wiki/Digital_signal_processing" target="_blank">digital audio processing</a> is extremely versatile, is much easier to design and implement than analog processing, and also adds effectively zero noise along the way. With rising computing power and dropping costs, every operating system has had drivers, engines, and libraries to record, process, play back, transmit, and store audio for over 20 years.<br />
<br />
<div style="text-align: left;">
Today we'll talk about some of the differences between analog and digital audio, and how the widespread use of digital audio adds a new challenge: <i>latency</i>.</div>
<br />
<h2 style="text-align: left;">
Analog vs Digital</h2>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<b>Analog data</b> flows like water through an empty pipe. You open the tap, and the time it takes for the first drop of water to reach you is the latency. When analog audio is transmitted through, say, an <a href="https://en.wikipedia.org/wiki/RCA_connector" target="_blank">RCA cable</a>, the transmission happens at the speed of electricity and your latency is:<br />
<br />
<div style="text-align: center;">
<img alt="wire length/speed of electricity" src="https://nirbheek.in/files/blog/analog-latency.svg" /></div>
<br />
This number is ridiculously small<span class="st">—</span>especially when compared to the speed of sound. An electrical signal takes 0.001 milliseconds to travel 300 metres (984 feet). Sound takes 874 milliseconds (almost a second).<br />
<br />
All analog effects and filters obey similar equations. If you're using, say, an analog pedal with an electric guitar, the signal is transformed continuously by an electrical circuit, so the latency is a function of the wire length (plus capacitors/transistors/etc), and is almost always negligible.<br />
<br />
<b>Digital audio</b> is transmitted in "packets" (buffers) of a particular size, like a <a href="https://en.wikipedia.org/wiki/Bucket_brigade" target="_blank">bucket brigade</a>, but at the speed of electricity. Since the real world is analog, this means to record audio, you must use an <a href="https://en.wikipedia.org/wiki/Analog-to-digital_converter" target="_blank">Analog-Digital Converter</a>. The <abbr title="Analog-Digital Converter">ADC</abbr> <a href="https://en.wikipedia.org/wiki/Quantization_(signal_processing)" target="_blank">quantizes</a> <a href="https://wiki.xiph.org/Videos/A_Digital_Media_Primer_For_Geeks" target="_blank">the signal</a> into digital measurements (samples), packs multiple samples into a buffer, and sends it forward. This means your latency is now: </div>
<br />
<div style="text-align: center;">
<img alt="(wire length/speed of electricity) + buffer size" src="https://nirbheek.in/files/blog/digital-latency.svg" /></div>
<div style="text-align: left;">
<br />
We saw above that the first part is insignificant; what about the second part?<br />
<br />
Latency is measured in time, but buffer size is measured in bytes. For <a href="https://en.wikipedia.org/wiki/Audio_bit_depth" target="_blank">16-bit integer audio</a>, each measurement (sample) is stored as a 16-bit integer, which is 2 bytes. That's the theoretical lower limit on the buffer size. The <a href="https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Sampling_rate" target="_blank">sample rate</a> defines how often measurements are made, and these days is usually 48KHz. This means each sample contains ~0.021ms of audio. To go lower, we need to increase the sample rate to 96KHz or 192KHz.<br />
<br />
However, when general-purpose computers are involved, the buffer size is almost never lower than 32 bytes, and is usually 128 bytes or larger. For <a href="https://en.wikipedia.org/wiki/Multichannel_audio">single-channel</a> 16-bit integer audio at 48KHz, a 32 byte buffer is 0.33ms, and a 128 byte buffer is 1.33ms. This is our buffer size and hence the base latency while recording (or playing) digital audio.<br />
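<br />
To make that arithmetic explicit, here it is written out in C, using the numbers from this example (single-channel 16-bit integer audio at 48KHz with a 128 byte buffer):<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">/* Latency contributed by a single buffer of audio */
double buffer_bytes     = 128;
double bytes_per_sample = 2;      /* 16-bit integer audio */
double channels         = 1;
double sample_rate      = 48000;  /* samples per second */

double samples_per_buffer = buffer_bytes / (bytes_per_sample * channels);
double latency_ms = samples_per_buffer / sample_rate * 1000;  /* ~1.33ms */
</pre>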
<br />
Digital effects operate on individual buffers, and add further latency that depends on how long the CPU takes to process each buffer. Such effects may also add latency if the algorithm itself requires it, but that's equally true of analog effects.<br />
<br />
<h2 style="text-align: left;">
The Digital Age</h2>
<br />
So everyone's using digital. But isn't 1.33ms a lot of additional latency?<br />
<br />
It might seem that way till you think about it in real-world terms. Sound travels less than half a meter (1<span class="st">½</span> feet) in that time, and that sort of delay is completely unnoticeable by humans<span class="st">—</span>otherwise we'd notice people's lips moving before we heard their words.<br />
<br />
In fact, 1.33ms is too small for the majority of audio applications!<br />
<br />
To process such small buffer sizes, you'd have to wake the CPU up <abbr title="1000 / 1.33">750 times a second</abbr>, just for audio. This is highly inefficient, and wastes a lot of power. You really don't want that on your phone or your laptop, and it's completely unnecessary in most cases anyway.<br />
<br />
For instance, your music player will usually use a buffer size of ~200ms, which is just <i>5</i> CPU wakeups per second. Note that this doesn't mean that you will hear sound 200ms after hitting "play". The audio player will just send 200ms of audio to the sound card at once, and playback will begin immediately.<br />
<br />
Of course, you can't do that with live playback such as video calls<span class="st">—y</span>ou can't "read-ahead" data you don't have. You'd have to invent a time machine first. As a result, apps that use real-time communication have to use smaller buffer sizes because that directly affects the latency of live playback.<br />
<br />
That brings us back to efficiency. These apps also need to conserve power, and 1.33ms buffers are really wasteful. Most consumer apps that require low latency use 10-15ms buffers, and that's good enough for things like voice/video calling, video games, notification sounds, and so on.<br />
<br />
<h2 style="text-align: left;">
Ultra Low Latency</h2>
<br />
There's one category left: musicians, sound engineers, and other folk that work in the pro-audio business. For them, 10ms of latency is much too high!<br />
<br />
You usually can't notice a 10ms delay between an event and the sound for it, but when making music, you <i>can</i> hear it when two instruments are out-of-sync by 10ms or if the sound for an instrument you're playing is delayed. Instruments such as the snare drum are more susceptible to this problem than others, which is why the <a href="https://en.wikipedia.org/wiki/Stage_monitor_system" target="_blank">stage monitors</a> used in live concerts must not add any latency.<br />
<br />
The standard in the music business is to use buffers that are 5ms or lower, down to the 0.33ms number that we talked about above.<br />
<br />
Power consumption is absolutely no concern, and the real problems are the accumulation of small amounts of latencies everywhere in your stack, and ensuring that you're able to read buffers from the hardware or write buffers to the hardware fast enough.<br />
<br />
Let's say you're using an app on your computer to apply digital effects to a guitar that you're playing. This involves capturing audio from the line-in port, sending it to the application for processing, and playing it from the sound card to your amp.<br />
<br />
The latencies while capturing and outputting audio are both multiples of the buffer size, so they add up very quickly. The effects app itself will also add a variable amount of latency, and at 1.33ms buffer sizes you will find yourself quickly approaching a 10ms latency from line-in to amp-out. The only way to lower this is to use a smaller buffer size, which is precisely what pro-audio hardware and software enables.<br />
<br />
The second problem is that of CPU scheduling. You need to ensure that the threads that are fetching/sending audio data to the hardware and processing the audio have the highest priority, so that nothing else will steal CPU-time away from them and cause glitching due to buffers arriving late.<br />
<br />
This gets harder as you lower the buffer size because the audio stack has to do more work for each bit of audio. The fact that we're doing this on a general-purpose operating system makes it even harder, and requires implementing <a href="https://en.wikipedia.org/wiki/Real-time_computing" target="_blank">real-time scheduling</a> features across several layers. But that's a story for another time!<br />
<br />
I hope you found this dive into digital audio interesting! My next post <span style="text-decoration: line-through;">will be</span> is about my journey in <a href="http://blog.nirbheek.in/2018/03/low-latency-audio-on-windows-with.html">implementing ultra low latency capture and render on Windows</a> in the <a href="https://msdn.microsoft.com/library/windows/desktop/dd371455.aspx" target="_blank">WASAPI</a> plugin for <a href="https://en.wikipedia.org/wiki/GStreamer" target="_blank">GStreamer</a>. This was already possible on Linux with the JACK GStreamer plugin and on macOS with the CoreAudio GStreamer plugin, so it will be interesting to see how the same problems are solved on Windows. Tune in!</div>
</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com1tag:blogger.com,1999:blog-701969077517001201.post-52121081553209932642018-02-26T14:00:00.000+05:302018-02-27T09:57:11.809+05:30Decoupling GStreamer Pipelines<div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #999999;"><i>This post is best read with some prior familiarity with <a href="https://gstreamer.freedesktop.org/" target="_blank">GStreamer</a> pipelines. If you want to learn more about that, a good place to start is the <a href="https://twitter.com/thaytan/status/956366764543111169" target="_blank">tutorial Jan presented at LCA 2018</a>.</i></span><br />
<br />
<h2 style="text-align: left;">Elevator Pitch</h2><br />
GStreamer was designed with modularity, pluggability, and ease of use in mind, and the structure was somewhat inspired by UNIX pipes. With GStreamer, you start with an idea of what your dataflow will look like, and the pipeline will map that quite closely.<br />
<br />
This is true whether you're working with a simple and static pipeline:<br />
<br />
<code>source ! transform ! sink</code><br />
<br />
Or if you need complex and dynamic pipelines with varying rates of data flow:<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://i.imgur.com/GJC4y2y.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="156" data-original-width="1600" height="37" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjG-fcJCItwqG-9BWYHvsbCSnF7mkHuE9LnLj9QYBVv5jjkVgRr9JtoHlW_GhHUy6yqL2xL6kaiUDtMWJz9IXno0rCOO3LhQt00_7YuMyDP-vXaC7aetsNiSkxFQw_kJqzcgzeQwmGvwSzf/s400/foo.png" width="400" /></a></div><br />
The inherent pluggability of the system allows for quick prototyping and makes a lot of changes simpler than they would be in other systems.<br />
<br />
At the same time, to achieve efficient multimedia processing, one must avoid onerous copying of data, excessive threading, or additional latency. Other features necessary are varying rates of playback, seeking, branching, mixing, non-linear data flow, timing, and much more, but let's keep it simple for now.<br />
<br />
<h2 style="text-align: left;">Modular Multimedia Processing</h2><br />
A naive way to implement this would be to have one thread (or process) for each node, and use shared memory or message-passing. This can achieve high throughput if you use the right APIs for zerocopy message-passing, but because no consumer operating system provides realtime guarantees, low latency will be much harder to achieve and will be jittery.<br />
<br />
So how does GStreamer solve these problems?<br />
<br />
Let's take a look at a simple pipeline to try and understand. We generate a sine wave, encode it with <a href="https://opus-codec.org/" target="_blank">Opus</a>, mux it into an Ogg container, and write it to disk.<br />
<br />
<code><br />
$ gst-launch-1.0 -e audiotestsrc ! opusenc ! oggmux ! filesink location=out.ogg<br />
</code><br />
<br />
How does data make it from one end of this pipeline to the other in GStreamer? The answer lies in <i>source pads</i>, <i>sink pads</i> and the <a href="https://gstreamer.freedesktop.org/documentation/plugin-development/basics/chainfn.html" target="_blank">chain function</a>.<br />
<br />
In this pipeline, the <code>audiotestsrc</code> element has one <i>source pad</i>. <code>opusenc</code> and <code>oggmux</code> have one <i>source pad</i> and one <i>sink pad</i> each, and <code>filesink</code> only has a <i>sink pad</i>. Buffers always move from source pads to sink pads. All elements that receive buffers (with sink pads) must implement a <i>chain function</i> to handle each buffer.<br />
<br />
Zooming in a bit more, to output buffers, an element will call <code>gst_pad_push()</code> on its <i>source pad</i>. This function will figure out what the corresponding <i>sink pad</i> is, and call the chain function of that element with a pointer to the buffer that was pushed earlier. This chain function can then apply a transformation to the buffer and push it (or a new buffer) onward with <code>gst_pad_push()</code> again.<br />
<br />
The net effect of this is that all buffer handling from one end of this pipeline to the other happens <b>in one series of chained function calls</b>. This is a really important detail that allows GStreamer to be efficient by default.<br />
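<br />
To make that concrete, here is a minimal sketch of what a chain function might look like; the <code>MyFilter</code> type and the <code>srcpad</code>/<code>sinkpad</code> names are hypothetical, and a real element would do its processing where the comment is:<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">static GstFlowReturn
my_filter_chain (GstPad * pad, GstObject * parent, GstBuffer * buf)
{
  MyFilter *filter = MY_FILTER (parent);

  /* ... inspect or transform the buffer here ... */

  /* Hand the buffer to whatever sink pad our source pad is linked to;
   * this directly invokes the next element's chain function */
  return gst_pad_push (filter-&gt;srcpad, buf);
}

/* Registered on the sink pad, typically in the element's init function */
gst_pad_set_chain_function (filter-&gt;sinkpad, my_filter_chain);
</pre>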
<br />
<h2 style="text-align: left;">Pipeline Multithreading</h2><br />
Of course, sometimes you <i>want</i> to decouple parts of the pipeline, and that brings us to the simplest mechanism for doing so: the <a href="https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-queue.html#gstreamer-plugins-queue.description" target="_blank"><code>queue</code> element</a>. The most basic use-case for this element is to ensure that the downstream of your pipeline <a href="https://gstreamer.freedesktop.org/documentation/tutorials/basic/multithreading-and-pad-availability.html" target="_blank">runs in a new thread</a>.<br />
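<br />
For example, adding a <code>queue</code> to the pipeline from earlier moves everything downstream of it (the encoder, muxer, and filesink) into its own streaming thread:<br />
<br />
<code><br />
$ gst-launch-1.0 -e audiotestsrc ! queue ! opusenc ! oggmux ! filesink location=out.ogg<br />
</code><br />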
<br />
In some applications, you want even greater decoupling of parts of your pipeline. For instance, if you're reading data from the network, you don't want a network error to bring down your entire pipeline, or if you're working with a hotpluggable device, device removal should be recoverable without needing to restart the pipeline.<br />
<br />
There are various mechanisms to achieve such decoupling: <a href="https://gstreamer.freedesktop.org/documentation/tutorials/basic/short-cutting-the-pipeline.html" target="_blank"><code>appsrc</code>/<code>appsink</code></a>, <code>fdsrc</code>/<code>fdsink</code>, <code>shmsrc</code>/<code>shmsink</code>, <a href="https://www.collabora.com/news-and-blog/blog/2017/11/17/ipcpipeline-splitting-a-gstreamer-pipeline-into-multiple-processes/" target="_blank"><code>ipcpipeline</code></a>, etc. However, each of those have their own limitations and complexities. In particular, <a href="https://gstreamer.freedesktop.org/documentation/application-development/basics/data.html#events" target="_blank">events</a>, <a href="https://gstreamer.freedesktop.org/documentation/design/negotiation.html" target="_blank">negotiation</a>, and <a href="https://gstreamer.freedesktop.org/documentation/tutorials/basic/time-management.html" target="_blank">synchronization</a> usually need to be handled or serialized manually at the boundary.<br />
<br />
<h2 style="text-align: left;">Seamless Pipeline Decoupling</h2><br />
We recently merged a new plugin that makes this job much simpler: <a href="https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/commit/?id=3f7e29d5b32f20dff75a58186533e40bb0ed4081" target="_blank">gstproxy</a>. Essentially, you insert a <code>proxysink</code> element when you want to send data outside your pipeline, and use a <code>proxysrc</code> element to push that data into a different pipeline in the same process.<br />
<br />
The interesting thing about this plugin is that <i>everything</i> is proxied, not just buffers. Events, queries, and hence caps negotiation all happen seamlessly. This is particularly useful when you want to do dynamic reconfiguration of your pipeline, and want the decoupled parts to reconfigure automatically.<br />
<br />
Say you have a pipeline like this:<br />
<br />
<code><br />
pulsesrc ! opusenc ! oggmux ! souphttpclientsink<br />
</code><br />
<br />
Where the <code>souphttpclientsink</code> element is doing a <code>PUT</code> to a remote HTTP server. If the server suddenly closes the connection, you want to be able to immediately reconnect to the same server or a different one without interrupting the recording. One way to do this, would be to use <code>appsrc</code> and <code>appsink</code> to split it into two pipelines:<br />
<br />
<code><br />
pulsesrc ! opusenc ! oggmux ! appsink<br />
<br />
appsrc ! souphttpclientsink<br />
</code><br />
<br />
Now you need to write code to handle buffers that are received on the <code>appsink</code> and then manually push those into <code>appsrc</code>. With the <code>proxy</code> plugin, you split your pipeline like before:<br />
<br />
<code><br />
pulsesrc ! opusenc ! oggmux ! proxysink<br />
<br />
proxysrc ! souphttpclientsink<br />
</code><br />
<br />
Next, we connect the <code>proxysrc</code> and <code>proxysink</code> elements, and GStreamer will automatically push buffers from the first pipeline to the second one.<br />
<br />
<code>g_object_set (psrc, "proxysink", psink, NULL);</code><br />
<br />
<code>proxysink</code> also contains a <code>queue</code>, so the second pipeline will always run in a separate thread.<br />
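<br />
Putting it together, a minimal sketch looks something like this (error handling and cleanup are omitted, and the HTTP URL is just a placeholder). In a real application you may also want to give both pipelines the same clock and base time so that their running times match:<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">#include &lt;gst/gst.h&gt;

int
main (int argc, char *argv[])
{
  GstElement *pipe1, *pipe2, *psink, *psrc;

  gst_init (&amp;argc, &amp;argv);

  pipe1 = gst_parse_launch ("pulsesrc ! opusenc ! oggmux ! "
      "proxysink name=psink", NULL);
  pipe2 = gst_parse_launch ("proxysrc name=psrc ! souphttpclientsink "
      "location=http://example.com/upload.ogg", NULL);

  /* Fetch the proxy elements and link the two pipelines together */
  psink = gst_bin_get_by_name (GST_BIN (pipe1), "psink");
  psrc = gst_bin_get_by_name (GST_BIN (pipe2), "psrc");
  g_object_set (psrc, "proxysink", psink, NULL);

  gst_element_set_state (pipe2, GST_STATE_PLAYING);
  gst_element_set_state (pipe1, GST_STATE_PLAYING);

  /* ... run a GMainLoop, watch both bus objects, handle errors ... */
  return 0;
}
</pre>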
<br />
Another option is the <a href="https://gstreamer.freedesktop.org/documentation/plugins.html"><code>inter</code> plugin</a>. If you use a pair of <code>interaudiosink/interaudiosrc</code> elements, buffers will be automatically moved between pipelines, but those only support raw audio or video, and drop events and queries at the boundary. The <code>proxy</code> elements push pointers to buffers without copying, and they do not care what the contents of the buffers are.<br />
<br />
This example was a trivial one, but with more complex pipelines, you usually have bins that automatically reconfigure themselves according to the events and caps sent by upstream elements; f.ex <code>decodebin</code> and <a href="http://blog.nirbheek.in/2018/02/gstreamer-webrtc.html"><code>webrtcbin</code></a>. This metadata about the buffers is lost when using <code>appsrc</code>/<code>appsink</code>, and similar elements, but is transparently proxied by the <code>proxy</code> elements.<br />
<br />
The <code>ipcpipeline</code> elements also forward buffers, events, queries, etc (not zerocopy, but could be), but they are much more complicated since they were built for splitting pipelines across multiple processes, and are most often used in a security-sensitive context.<br />
<br />
The <code>proxy</code> elements only work when all the split pipelines are within the same process, are much simpler and as a result, more efficient. They should be used when you want graceful recovery from element errors, and your elements are not a vector for security attacks.<br />
<br />
For more details on how to use them, check out the <a href="https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/gst/proxy/gstproxysrc.c?id=HEAD#n22" target="_blank">documentation and example</a>! The online docs will be generated from that when we're closer to the release of GStreamer 1.14. There are a few caveats, but a number of projects are already using it with great success.</div>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com2tag:blogger.com,1999:blog-701969077517001201.post-54605681752583078022018-02-03T11:48:00.001+05:302021-04-28T15:09:16.571+05:30GStreamer has grown a WebRTC implementation<div dir="ltr" style="text-align: left;" trbidi="on"><span style="color: #999999;"><i>In other news, GStreamer is now almost buzzword-compliant! The next blog post on our list: blockchains and smart contracts in GStreamer.</i></span><br />
<br />
Late last year, <a href="https://twitter.com/centricular/status/921727092810592256" target="_blank">we at Centricular announced</a> a new <a href="http://webrtcbydralex.com/index.php/2017/10/21/a-new-webrtc-implementation-is-out/" target="_blank">implementation of WebRTC</a> in GStreamer. Today we're happy to announce that after community review, that work has been <a href="https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/commit/1894293d6378c69548d974d2965e9decc1527654" target="_blank">merged into GStreamer</a> itself! The plugin is called <tt>webrtcbin</tt>, and the library is, naturally, called <tt>gstwebrtc</tt>.<br />
<br />
The implementation has all the basic features, is transparently compatible with other WebRTC stacks (particularly in browsers), and has been well-tested with both Firefox and Chrome.<br />
<br />
Some of the more advanced features such as FEC are already a <a href="https://bugzilla.gnome.org/show_bug.cgi?id=792696" target="_blank">work in progress</a>, and others will be too—if you want them to be! Hop onto IRC on #gstreamer @ Freenode.net or join <a href="https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel" target="_blank">the mailing list</a>.<br />
<br />
<h3 style="text-align: left;">How do I use it?</h3><br />
Currently, the easiest way to use <tt>webrtcbin</tt> is to build GStreamer using either <a href="https://arunraghavan.net/2014/07/quick-start-guide-to-gst-uninstalled-1-x/">gst-uninstalled</a> (Linux and macOS) or <a href="https://gstreamer.freedesktop.org/documentation/installing/building-from-source-using-cerbero.html">Cerbero</a> (Windows, iOS, Android). If you're a patient person, you can follow <a href="https://twitter.com/gstreamer">@gstreamer</a> and wait for GStreamer 1.14 to be released, which will include <a href="https://gstreamer.freedesktop.org/download/">Windows, macOS, iOS, and Android binaries</a>.<br />
<br />
The API currently lacks documentation, so the best way to learn it is to dive into the <a href="https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/tree/1.18/tests/examples/webrtc">source-tree examples</a>. Help on this will be most appreciated! To see how to use GStreamer to do WebRTC with a browser, check out the <a href="https://gitlab.freedesktop.org/gstreamer/gst-examples/-/tree/1.18/webrtc">bidirectional audio-video demos</a>.<br />
<br />
<a name="show-code"></a><h3 style="text-align: left;">Show me the code! <span style="font-size: xx-small; font-weight: normal; vertical-align: text-top;"><a href="#no-code-pls">[skip]</a></span></h3><br />
Here's a quick highlight of the important bits that should get you started if you already know how GStreamer works. This example is in C, but GStreamer also has bindings for <a href="https://coaxion.net/blog/2017/12/gstreamer-rust-bindings-release-0-10-0-gst-plugin-release-0-1-0/">Rust</a>, <a href="https://gitlab.freedesktop.org/gstreamer/gst-python/">Python</a>, <a href="https://github.com/gstreamer-java/gst1-java-core">Java</a>, <a href="https://gitlab.freedesktop.org/gstreamer/gstreamer-sharp/">C#</a>, Vala, and so on.<br />
<br />
Let's say you want to capture video from <a href="https://en.wikipedia.org/wiki/Video4Linux">V4L2</a>, stream it to a webrtc peer, and receive video back from it. The first step is the streaming pipeline, which will look something like this:<br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><pre style="margin: 0; line-height: 125%"><span style="color: #f8f8f2">v4l2src</span> <span style="color: #f92672">!</span> <span style="color: #f8f8f2">queue</span> <span style="color: #f92672">!</span> <span style="color: #f8f8f2">vp8enc</span> <span style="color: #f92672">!</span> <span style="color: #f8f8f2">rtpvp8pay</span> <span style="color: #f92672">!</span>
<span style="color: #f8f8f2">application</span><span style="color: #f92672">/</span><span style="color: #f8f8f2">x</span><span style="color: #f92672">-</span><span style="color: #f8f8f2">rtp,media</span><span style="color: #f92672">=</span><span style="color: #f8f8f2">video,encoding</span><span style="color: #f92672">-</span><span style="color: #f8f8f2">name</span><span style="color: #f92672">=</span><span style="color: #66d9ef">VP8</span><span style="color: #f8f8f2">,payload</span><span style="color: #f92672">=</span><span style="color: #ae81ff">96</span> <span style="color: #f92672">!</span>
<span style="color: #f8f8f2">webrtcbin</span> <span style="color: #f8f8f2">name</span><span style="color: #f92672">=</span><span style="color: #a6e22e">sendrecv</span>
</pre></td></tr>
</table></div><br />
As a short-cut, let's parse the string description to create the pipeline.<br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%">1
2
3
4
5</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #afafdf">GstElement</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">pipe;</span>
<span style="color: #f8f8f2">pipe</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_parse_launch</span> <span style="color: #f8f8f2">(</span><span style="color: #e6db74">"v4l2src ! queue ! vp8enc ! rtpvp8pay ! "</span>
<span style="color: #e6db74">"application/x-rtp,media=video,encoding-name=VP8,payload=96 !"</span>
<span style="color: #e6db74">" webrtcbin name=sendrecv"</span><span style="color: #f8f8f2">,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
</pre></td></tr>
</table></div><br />
Next, we get a reference to the <tt>webrtcbin</tt> element and attach some callbacks to it.<br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%"> 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #afafdf">GstElement</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">webrtc;</span>
<span style="color: #f8f8f2">webrtc</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_bin_get_by_name</span> <span style="color: #f8f8f2">(</span><span style="color: #a6e22e">GST_BIN</span> <span style="color: #f8f8f2">(pipe),</span> <span style="color: #e6db74">"sendrecv"</span><span style="color: #f8f8f2">);</span>
<span style="color: #dfafdf">g_assert</span> <span style="color: #f8f8f2">(webrtc</span> <span style="color: #f92672">!=</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #75715e">/* This is the gstwebrtc entry point where we create the offer.</span>
<span style="color: #75715e"> * It will be called when the pipeline goes to PLAYING. */</span>
<span style="color: #dfafdf">g_signal_connect</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #e6db74">"on-negotiation-needed"</span><span style="color: #f8f8f2">,</span>
<span style="color: #a6e22e">G_CALLBACK</span> <span style="color: #f8f8f2">(on_negotiation_needed),</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #75715e">/* We will transmit this ICE candidate to the remote using some</span>
<span style="color: #75715e"> * signalling. Incoming ICE candidates from the remote need to be</span>
<span style="color: #75715e"> * added by us too. */</span>
<span style="color: #dfafdf">g_signal_connect</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #e6db74">"on-ice-candidate"</span><span style="color: #f8f8f2">,</span>
<span style="color: #a6e22e">G_CALLBACK</span> <span style="color: #f8f8f2">(send_ice_candidate_message),</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #75715e">/* Incoming streams will be exposed via this signal */</span>
<span style="color: #dfafdf">g_signal_connect</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #e6db74">"pad-added"</span><span style="color: #f8f8f2">,</span>
<span style="color: #a6e22e">G_CALLBACK</span> <span style="color: #f8f8f2">(on_incoming_stream),</span> <span style="color: #f8f8f2">pipe);</span>
<span style="color: #75715e">/* Lifetime is the same as the pipeline itself */</span>
<span style="color: #dfafdf">gst_object_unref</span> <span style="color: #f8f8f2">(webrtc);</span>
</pre></td></tr>
</table></div><br />
When the pipeline goes to PLAYING, the <tt>on_negotiation_needed()</tt> callback will be called, and we will ask <tt>webrtcbin</tt> to create an offer which will match the pipeline above.<br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%"> 1
2
3
4
5
6
7
8
9
10</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #afafdf">static</span> <span style="color: #afafdf">void</span>
<span style="color: #f8f8f2">on_negotiation_needed</span> <span style="color: #afafdf">(GstElement</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">webrtc,</span> <span style="color: #afafdf">gpointer</span> <span style="color: #f8f8f2">user_data)</span>
<span style="color: #f8f8f2">{</span>
<span style="color: #afafdf">GstPromise</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">promise;</span>
<span style="color: #f8f8f2">promise</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_promise_new_with_change_func</span> <span style="color: #f8f8f2">(on_offer_created,</span>
<span style="color: #f8f8f2">user_data,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #dfafdf">g_signal_emit_by_name</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #e6db74">"create-offer"</span><span style="color: #f8f8f2">,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">,</span>
<span style="color: #f8f8f2">promise);</span>
<span style="color: #f8f8f2">}</span>
</pre></td></tr>
</table></div><br />
When webrtcbin has created the offer, it will call <tt>on_offer_created()</tt><br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%"> 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #afafdf">static</span> <span style="color: #afafdf">void</span>
<span style="color: #f8f8f2">on_offer_created</span> <span style="color: #f8f8f2">(</span><span style="color: #afafdf">GstPromise</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">promise,</span> <span style="color: #f8f8f2">GstElement</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">webrtc)</span>
<span style="color: #f8f8f2">{</span>
<span style="color: #afafdf">GstWebRTCSessionDescription</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">offer</span> <span style="color: #f92672">=</span> <span style="color: #f8f8f2">NULL</span><span style="color: #f8f8f2">;</span>
<span style="color: #afafdf">const</span> <span style="color: #afafdf">GstStructure</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">reply;</span>
<span style="color: #afafdf">gchar</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">desc;</span>
<span style="color: #f8f8f2">reply</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_promise_get_reply</span> <span style="color: #f8f8f2">(promise);</span>
<span style="color: #dfafdf">gst_structure_get</span> <span style="color: #f8f8f2">(reply,</span> <span style="color: #e6db74">"offer"</span><span style="color: #f8f8f2">,</span>
<span style="color: #e6db74">GST_TYPE_WEBRTC_SESSION_DESCRIPTION,</span>
<span style="color: #f92672">&</span><span style="color: #f8f8f2">offer,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #dfafdf">gst_promise_unref</span> <span style="color: #f8f8f2">(promise);</span>
<span style="color: #75715e">/* We can edit this offer before setting and sending */</span>
<span style="color: #dfafdf">g_signal_emit_by_name</span> <span style="color: #f8f8f2">(webrtc,</span>
<span style="color: #e6db74">"set-local-description"</span><span style="color: #f8f8f2">,</span> <span style="color: #f8f8f2">offer,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #75715e">/* Implement this and send offer to peer using signalling */</span>
<span style="color: #f8f8f2">send_sdp_offer</span> <span style="color: #f8f8f2">(offer);</span>
<span style="color: #dfafdf">gst_webrtc_session_description_free</span> <span style="color: #f8f8f2">(offer);</span>
<span style="color: #f8f8f2">}</span>
</pre></td></tr>
</table></div><br />
Similarly, when we have the SDP <tt>answer</tt> from the remote, we must call <tt>"set-remote-description"</tt> on <tt>webrtcbin</tt>. <br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%">1
2
3
4
5
6
7</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #f8f8f2">answer</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_webrtc_session_description_new</span> <span style="color: #f8f8f2">(</span>
<span style="color: #e6db74">GST_WEBRTC_SDP_TYPE_ANSWER,</span> <span style="color: #f8f8f2">sdp);</span>
<span style="color: #dfafdf">g_assert</span> <span style="color: #f8f8f2">(answer);</span>
<span style="color: #75715e">/* Set remote description on our pipeline */</span>
<span style="color: #dfafdf">g_signal_emit_by_name</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #e6db74">"set-remote-description"</span><span style="color: #f8f8f2">,</span>
<span style="color: #f8f8f2">answer,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
</pre></td></tr>
</table></div><br />
ICE handling is very similar; when the <tt>"on-ice-candidate"</tt> signal is emitted, we get a local ICE candidate which we must <a href="https://gitlab.freedesktop.org/gstreamer/gst-examples/-/blob/1.18/webrtc/sendrecv/gst/webrtc-sendrecv.c#L196">send to the remote</a>. When we have an ICE candidate from the remote, we must <a href="https://gitlab.freedesktop.org/gstreamer/gst-examples/-/blob/1.18/webrtc/sendrecv/gst/webrtc-sendrecv.c#L690">call</a> <tt>"add-ice-candidate"</tt> on <tt>webrtcbin</tt>.<br />
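<br />
As a minimal sketch of both directions (the <tt>send_ice_to_peer()</tt> function is a placeholder for whatever signalling mechanism you use):<br />
<br />
<pre style="background: #272822; color: #f8f8f2; overflow: auto; padding: .2em .6em;">/* Outgoing: webrtcbin emits "on-ice-candidate" with each local candidate,
 * which must be relayed to the remote peer over signalling */
static void
send_ice_candidate_message (GstElement * webrtc, guint mlineindex,
    gchar * candidate, gpointer user_data)
{
  send_ice_to_peer (mlineindex, candidate);
}

/* Incoming: when the remote peer sends us one of its candidates over
 * signalling, hand it to webrtcbin */
g_signal_emit_by_name (webrtc, "add-ice-candidate", mlineindex, candidate);
</pre>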
<br />
There's just one piece left now; handling incoming streams that are sent by the remote. For that, we have <tt>on_incoming_stream()</tt> attached to the <tt>"pad-added"</tt> signal on <tt>webrtcbin</tt>.<br />
<br />
<!-- HTML generated using hilite.me --><div style="background: #272822; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><table><tr><td><pre style="margin: 0; line-height: 125%"> 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15</pre></td><td><pre style="margin: 0; line-height: 125%"><span style="color: #afafdf">static</span> <span style="color: #afafdf">void</span>
<span style="color: #f8f8f2">on_incoming_stream</span> <span style="color: #f8f8f2">(</span><span style="color: #afafdf">GstElement</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">webrtc,</span> <span style="color: #afafdf">GstPad</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">pad,</span>
<span style="color: #afafdf">GstElement</span> <span style="color: #f92672">*</span> <span style="color: #f8f8f2">pipe)</span>
<span style="color: #f8f8f2">{</span>
<span style="color: #afafdf">GstElement</span> <span style="color: #f92672">*</span><span style="color: #f8f8f2">play;</span>
<span style="color: #f8f8f2">play</span> <span style="color: #f92672">=</span> <span style="color: #dfafdf">gst_parse_bin_from_description</span> <span style="color: #f8f8f2">(</span>
<span style="color: #e6db74">"queue ! vp8dec ! videoconvert ! autovideosink"</span><span style="color: #f8f8f2">,</span>
<span style="color: #e6db74">TRUE</span><span style="color: #f8f8f2">,</span> <span style="color: #e6db74">NULL</span><span style="color: #f8f8f2">);</span>
<span style="color: #dfafdf">gst_bin_add</span> <span style="color: #f8f8f2">(</span><span style="color: #a6e22e">GST_BIN</span> <span style="color: #f8f8f2">(pipe),</span> <span style="color: #f8f8f2">play);</span>
<span style="color: #75715e">/* Start displaying video */</span>
<span style="color: #dfafdf">gst_element_sync_state_with_parent</span> <span style="color: #f8f8f2">(play);</span>
<span style="color: #dfafdf">gst_element_link</span> <span style="color: #f8f8f2">(webrtc,</span> <span style="color: #f8f8f2">play);</span>
<span style="color: #f8f8f2">}</span>
</pre></td></tr>
</table></div><br />
That's it! This is what a basic webrtc workflow looks like. Those of you that have used the <tt>PeerConnection</tt> API before will be happy to see that this maps to that quite closely.<br />
<br />
<a name="no-code-pls"></a>The <a href="https://gitlab.freedesktop.org/gstreamer/gst-examples/-/tree/1.18/webrtc">aforementioned demos</a> also include a Websocket signalling server and JS browser components, and I will be doing an in-depth application newbie developer's guide at a later time, so you can <a href="https://twitter.com/nirbheek" target="_blank">follow me @nirbheek</a> to hear when it comes out!<br />
<br />
<h3 style="text-align: left;">Tell me more!</h3><br />
The code is already being used in production in a number of places, such as <a href="http://www.easymile.com/" target="_blank">EasyMile</a>'s autonomous vehicles, and we're excited to see where else the community can take it.<br />
<div name="why-gst" style="text-align: left;"><br />
</div>If you're wondering why we decided a new implementation was needed, read on! For a more detailed discussion of that, you should watch <a href="https://gstconf.ubicast.tv/videos/gstreamer-webrtc/">Matthew Waters' talk</a> from the <a href="https://gstreamer.freedesktop.org/conference/2017/">GStreamer conference last year</a>. It's a great companion for this article!<br />
<br />
But before we can dig into details, we need to lay some foundations first. <br />
<br />
<h3 name="what-is" style="text-align: left;">What is GStreamer, and what is WebRTC? <span style="font-size: xx-small; font-weight: normal; vertical-align: text-top;"><a href="#why-build">[skip]</a></span></h3><div style="text-align: left;"><br />
<b>GStreamer</b> is a cross-platform <a href="https://gstreamer.freedesktop.org/documentation/application-development/introduction/gstreamer.html">open-source multimedia framework</a> that is, in my opinion, the easiest and most flexible way to implement any application that needs to play, record, or transform media-like data across an extremely versatile scale of devices and products. Embedded (IoT, IVI, phones, TVs, …), desktop (video/music players, video recording, non-linear editing, videoconferencing and <a href="https://en.wikipedia.org/wiki/Voice_over_IP">VoIP</a> clients, browsers …), to servers (encode/transcode farms, video/voice conferencing servers, …) and <a href="https://wiki.ligo.org/DASWG/GstLAL">more</a>.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">But what I like the most about GStreamer is the pipeline-based model which solves one of the hardest problems in API design: catering to applications of varying complexity; from the simplest one-liners and quick solutions to those that need several hundreds of thousands of lines of code to implement their full featureset. </div><div style="text-align: left;"><br />
</div><div style="text-align: left;">If you want to learn more about GStreamer, <a href="https://www.youtube.com/watch?v=ZphadMGufY8">Jan Schmidt's tutorial</a> from <a href="http://lca2018.linux.org.au/">Linux.conf.au</a> is a good start.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;"><b>WebRTC</b> is a set of draft specifications that build upon existing <a href="https://en.wikipedia.org/wiki/Real-time_Transport_Protocol">RTP</a>, <a href="https://en.wikipedia.org/wiki/RTP_Control_Protocol">RTCP</a>, <a href="https://en.wikipedia.org/wiki/Session_Description_Protocol">SDP</a>, <a href="https://en.wikipedia.org/wiki/Datagram_Transport_Layer_Security">DTLS</a>, <a href="https://en.wikipedia.org/wiki/Interactive_Connectivity_Establishment" >ICE</a> (and many other) real-time communication specifications and defines an API for making <abbr title="Real-time Communication">RTC</abbr> accessible using browser JS APIs.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">People have been doing real-time communication over <a href="https://en.wikipedia.org/wiki/Internet_Protocol">IP</a> for <a href="https://en.wikipedia.org/wiki/Session_Initiation_Protocol">decades</a> with the previously-listed protocols that WebRTC builds upon. The real innovation of WebRTC was creating a bridge between native applications and webapps by defining a standard, yet flexible, API that browsers can expose to untrusted JavaScript code.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">These specifications are <a href="https://datatracker.ietf.org/wg/rtcweb/documents/">constantly</a> being <a href="https://datatracker.ietf.org/wg/rmcat/documents/">improved upon</a>, which combined with the ubiquitous nature of browsers means WebRTC is fast becoming the standard choice for videoconferencing on all platforms and for most applications.</div><br />
<a name="why-build"></a><h3 style="text-align: left;">Everything is great, let's build amazing apps! <span style="font-size: xx-small; font-weight: normal; vertical-align: text-top;"><a href="#why-gst">[skip]</a></span></h3><div style="text-align: left;"><br />
Not so fast, there's more to the story! For WebApps, the <a href="https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection">PeerConnection API</a> is <a href="https://caniuse.com/#feat=rtcpeerconnection">everywhere</a>. There are some browser-specific quirks as usual, and the API itself keeps changing, but the <a href="https://github.com/webrtc/adapter">WebRTC JS adapter</a> handles most of that. Overall the WebApp experience is mostly 👍.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">Sadly, for native code or applications that need more flexibility than a sandboxed JS app can achieve, there <i>haven't</i> been a lot of great options.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;"><a href="http://webrtc.org/" target="_blank">libwebrtc</a> (Chrome's implementation), <a href="https://janus.conf.meetecho.com/" target="_blank">Janus</a>, <a href="https://www.kurento.org/kurento-architecture" target="_blank">Kurento</a>, and <a href="https://en.wikipedia.org/wiki/OpenWebRTC" target="_blank">OpenWebRTC</a> have traditionally been the main contenders, but after having worked with all of these, we found that each implementation has its own inflexibilities, shortcomings, and constraints.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;"><b>libwebrtc</b> is still the most mature implementation, but it is also the most difficult to work with. Since it's embedded inside Chrome, it's a moving target, the API can be hard to work with, and the project <a href="https://webrtchacks.com/building-webrtc-from-source/" target="_blank">is quite difficult to build and integrate</a>, all of which are obstacles in the way of native or server app developers trying to quickly prototype and try out things.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">It was also not built for multimedia use-cases, so while the webrtc bits are great, the lower layers get in the way of non-browser use-cases and applications. It is quite painful to do anything other than the default "set raw media, transmit" and "receive from remote, get raw media". This means that if you want to use your own filters, or hardware-specific codecs or sinks/sources, you end up having to fork libwebrtc.</div><br />
<div style="text-align: left;">In contrast, <a href="#show-code">as shown above</a>, our implementation gives you full control over this as with any other <a href="https://gstreamer.freedesktop.org/documentation/application-development/introduction/basics.html" target="_blank">GStreamer pipeline</a>.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;"><b>OpenWebRTC</b> by Ericsson was the first attempt to rectify this situation, and it was built on top of GStreamer. The target audience was app developers, and it fit the bill quite well as a proof-of-concept—even though it used a custom API and some of the architectural decisions made it quite inflexible for most other use-cases.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;">However, after an initial flurry of activity around the project, momentum petered out, the project failed to gather a community around itself, and is now <a href="https://www.youtube.com/watch?v=npjOSLCR2hE" target="_blank">effectively dead</a>.</div><div style="text-align: left;"><br />
</div><div style="text-align: left;"><i>Full disclosure: <a href="https://centricular.com/">we</a> worked with Ericsson to polish some of the rough edges around the project immediately prior to its public release.</i><br />
</div><br />
<a name="why-gst"></a><h3 style="text-align: left;">WebRTC in GStreamer — webrtcbin and gstwebrtc</h3><div style="text-align: left;"><br />
Remember how I said the WebRTC standards build upon existing standards and protocols? As it so happens, GStreamer has supported almost all of them for a while now because they were being used for real-time communication, live streaming, and in many other <a href="https://en.wikipedia.org/wiki/Internet_Protocol">IP-based</a> applications. Indeed, that's partly why Ericsson chose it as the base for <abbr title="OpenWebRTC">OWRTC</abbr>.</div><div name="why-gst" style="text-align: left;"><br />
</div><div style="text-align: left;">This combined with the SRTP and DTLS plugins that were written during OWRTC's development meant that<i> </i>our implementation is built upon a solid and well-tested base, and that implementing WebRTC features is not as difficult as one might presume. However, WebRTC is a large collection of standards, and reaching feature-parity with libwebrtc is an ongoing task.<br />
<br />
Lucky for us, <a href="https://github.com/ystreet/">Matthew</a> made some excellent decisions while architecting the internals of webrtcbin, and we follow the PeerConnection specification quite closely, so almost all the missing features involve writing code that would plug into clearly-defined sockets.</div><div name="why-gst" style="text-align: left;"><br />
</div><div style="text-align: left;">We believe what we've been building here is the most flexible, versatile, and easy to use WebRTC implementation out there, and it can only get better as time goes by. Bringing the power of pipeline-based multimedia manipulation to WebRTC opens new doors for interesting, unique, and highly efficient applications.<br />
<br />
To demonstrate this, in the near future we will be publishing articles that dive into how to use the PeerConnection-inspired API exposed by webrtcbin to build various kinds of applications—starting with a CPU-efficient multi-party bidirectional conferencing solution with a mesh topology that can work with any webrtc stack.<br />
<br />
Until next time!</div></div>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com47tag:blogger.com,1999:blog-701969077517001201.post-86125724943451892402016-08-17T21:37:00.003+05:302016-08-17T21:40:06.033+05:30The Meson build system at GUADEC 2016<div dir="ltr" style="text-align: left;" trbidi="on">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGG2Q9YGQHGulI1o8O7aWsZdKaxG9es7B-R-0B6eyfMddlHY9wuDjuO7xE6kn5Orz5YhL44cUb-OB8cWN1gr2ve_8N6p5Hs1vBB7-xzyP-F9KnayQp0eeRzdqmwkmN8ptfz0rOohKscfiT/s1600/centricular-logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="centricular-logo" border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGG2Q9YGQHGulI1o8O7aWsZdKaxG9es7B-R-0B6eyfMddlHY9wuDjuO7xE6kn5Orz5YhL44cUb-OB8cWN1gr2ve_8N6p5Hs1vBB7-xzyP-F9KnayQp0eeRzdqmwkmN8ptfz0rOohKscfiT/s400/centricular-logo.png" title="" /></a>For the third year in a row, <a href="http://www.centricular.com/">Centricular</a> was at <a href="http://2016.guadec.org/">GUADEC</a>, and this year we sponsored the evening party on the <a href="https://2016.guadec.org/2016/08/14/guadec-day-4-sunday/">final day</a> at <a href="http://hoepfner-burghof.de/">Hoepfner’s Burghof</a>! Hopefully everyone enjoyed it as much as we hoped. :)<br />
<br />
The focus for me this year was to try and tell people about the work we've been doing on <a href="http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.html">porting GStreamer to Meson</a> and to that end, I gave a talk on the second day about how to <a href="https://2016.guadec.org/schedule/#abstract-44-making_your_gnome_app_compile_24x_faster">build your GNOME app ~2x faster than before</a>.<br />
<br />
The talk title itself was a bit of a lie, since most of the talk was about how Autotools is a mess and how Meson has excellent features (better syntax!) and in-built support for most of the GNOME infrastructure to make it easy for people to use it. But for some people the attraction is also that Meson provides better support on platforms such as Windows, and improves build times on all platforms massively, ranging from 2x on Linux to 10-15x on Windows.<br />
<br />
Thanks to the excellent people at <a href="http://c3voc.de/">c3voc.de</a>, the talks were all live-streamed, and you can <a href="https://media.ccc.de/v/44-making_your_gnome_app_compile_24x_faster">see my talk at their relive website for GUADEC 2016</a>.<br />
<br />
It was heartening to see that over the past year people have warmed up to the idea of using Meson as a replacement for Autotools. Several people said kind and encouraging words to me and <a href="http://twitter.com/jpakkane/">Jussi</a> over the course of the conference (it helps that GNOME is filled with a friendly community!). We will continue to improve <a href="http://github.com/mesonbuild/meson">Meson</a> and with luck we can get rid of Autotools over time.<br />
<br />
The best approach, as always, is to start with the simple projects, get familiar with the syntax, and <a href="https://github.com/mesonbuild/meson/issues/">report any bugs you find</a>! We look forward to your bugs and pull requests. ;)</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com1tag:blogger.com,1999:blog-701969077517001201.post-26740074134450545152016-07-27T15:03:00.000+05:302019-04-06T05:07:09.071+05:30Building and Developing GStreamer using Visual Studio<div dir="ltr" style="text-align: left;" trbidi="on">
Two months ago, I talked about how we at <a href="http://www.centricular.com/">Centricular</a> have been working on a Meson port of GStreamer and its basic dependencies (glib, libffi, and orc) for various reasons — faster builds, better cross-platform support (particularly Windows), better toolchain support, ease of use, and for a better build system future in general.<br />
<br />
Meson also has built-in support for things like gtk-doc, gobject-introspection, translations, etc. It can even generate Visual Studio project files at build time so projects don't have to expend resources maintaining those separately.<br />
<br />
Today I'm here to share instructions on how to use <a href="https://cgit.freedesktop.org/gstreamer/cerbero">Cerbero</a> (our “aggregating” build system) to build all of GStreamer on Windows using MSVC 2015 (wherever possible). Note that this means you won't see any Meson invocations at all because Cerbero does all that work for you.<br />
<br />
Note that this is still all unofficial and has not been proposed for inclusion upstream. We still have a few issues that need to be ironed out before we can do that¹.<br />
<br />
<span style="font-size: large;"><b>Update: As of March 2019, all these instructions are obsolete since <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/#enabling-visual-studio-support" target="_blank">MSVC support</a> has been merged into <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/" target="_blank">upstream Cerbero</a>. I'm leaving this outdated text as-is for archival purposes.</b></span><br />
<br />
First, you need to set up the environment on Windows by installing a bunch of external tools: Python 2, Python 3, Git, etc. You can find the instructions for that here:<br />
<br />
<a href="https://github.com/centricular/cerbero#windows">https://github.com/centricular/cerbero#windows</a><br />
<br />
This is very similar to the old Cerbero instructions, but some new tools are needed. Once you've done everything there (Visual Studio especially takes a while to fetch and install itself), the next step is fetching Cerbero:<br />
<br />
<code>$ git clone https://github.com/centricular/cerbero.git</code><br />
<br />
This will clone and checkout the <code>meson-1.8</code> branch that will build GStreamer 1.8.x. Next, we bootstrap it:<br />
<br />
<a href="https://github.com/centricular/cerbero#bootstrap">https://github.com/centricular/cerbero#bootstrap</a><br />
<br />
Now we're (finally) ready to build GStreamer. Just invoke the package command:<br />
<br />
<code>python2 cerbero-uninstalled -c config/win32-mixed-msvc.cbc package gstreamer-1.0</code><br />
<br />
This will build all the `recipes` that constitute GStreamer, including the core libraries and all the plugins along with their external dependencies. This comes to about 76 recipes. Out of all these recipes, only the following are ported to Meson and are built with MSVC:<br />
<br />
bzip2.recipe<br />
orc.recipe<br />
libffi.recipe (only 32-bit)<br />
glib.recipe<br />
gstreamer-1.0.recipe<br />
gst-plugins-base-1.0.recipe<br />
gst-plugins-good-1.0.recipe<br />
gst-plugins-bad-1.0.recipe<br />
gst-plugins-ugly-1.0.recipe<br />
<br />
The rest still mostly use Autotools, plain GNU make or cmake. Almost all of these are still built with MinGW. The only exception is libvpx, which uses its custom make-based build system but is built with MSVC.<br />
<br />
Eventually we want to build everything including all external dependencies with MSVC by porting everything to Meson, but as you can imagine it's not an easy task. :-)<br />
<br />
However, even with just these recipes, there is a large improvement in how quickly you can build all of GStreamer inside Cerbero on Windows. For instance, the time required for building <code>gstreamer-1.0.recipe</code>, which builds <code>gstreamer.git</code>, went from 10 minutes to 45 seconds. It is now easier to do <a href="https://github.com/centricular/cerbero-docs/blob/master/start.md#dev-workflow">GStreamer development on Windows</a> since rebuilding doesn't take an inordinate amount of time!<br />
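<br />
As part of that development workflow, you don't have to re-run the full <code>package</code> command every time; a single recipe can be rebuilt on its own with something like the following (<code>buildone</code> rebuilds just that recipe, skipping its dependencies):<br />
<br />
<code>python2 cerbero-uninstalled -c config/win32-mixed-msvc.cbc buildone gstreamer-1.0</code><br />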
<br />
As a further improvement for doing GStreamer development on Windows, for all these recipes (except libffi because of complicated reasons), you can also <a href="https://github.com/centricular/cerbero-docs/blob/master/start.md#vs-projects">generate Visual Studio 2015 project files</a> and use them from within Visual Studio for editing, building, and so on.<br />
<br />
Go ahead, try it out and tell me if it works for you!<br />
<br />
As an aside, I've also been working on some proper in-depth documentation of Cerbero that explains how the tool works, the recipe format, supported configurations, and so on. You can see the <a href="https://github.com/centricular/cerbero-docs/blob/master/start.md">work-in-progress</a> if you wish to.<br />
<br />
<span style="font-size: x-small;">1. Most importantly, the tests cannot be built yet because GStreamer bundles a very old version of libcheck. I'm currently working on fixing that.</span> </div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com1tag:blogger.com,1999:blog-701969077517001201.post-20044437318686484022016-05-23T13:54:00.001+05:302019-04-06T05:08:57.268+05:30GStreamer and Meson: A New Hope<div dir="ltr" style="text-align: left;" trbidi="on">
Anyone who has written a non-trivial project using Autotools has realized that (and wondered why) it requires you to be aware of <abbr title="m4, autoconf, automake, shell, make—add Perl for many projects">5 different languages</abbr>. Once you spend enough time with the innards of the system, you begin to realize that it is nothing short of an astonishing feat of engineering. Engineering that belongs in a museum. Not as part of critical infrastructure.<br />
<br />
Autotools was created in the 1980s and caters to the needs of an entirely different world of software from what we have at present. Worse yet, it carries over accumulated cruft from the past 40 years — ostensibly for better “cross-platform support” but that “support” is mostly for extinct platforms that five people in the whole world remember.<br />
<br />
We've learned how to make it work for most cases that concern FOSS developers on Linux, and it can be made to limp along on other platforms that the majority of people use, but it <a href="http://voices.canonical.com/jussi.pakkanen/2011/09/13/autotools/">does not inspire confidence</a> or really anything except frustration. People will not like your project or contribute to it if the build system takes 10x longer to compile on their platform of choice, does not integrate with the preferred IDE, and requires knowledge arcane enough to be indistinguishable from cargo-cult programming.<br />
<br />
As a result there have been several (terrible) efforts at replacing it and each has been either incomplete, short-sighted, slow, or just plain ugly. During my time as a Gentoo developer in another life, I came in close contact with and developed a keen hatred for each of these alternative build systems. And so I mutely went back to Autotools and learned that I hated it the least of them all.<br />
<br />
Sometime last year, <a href="https://twitter.com/tp_muller">Tim</a> heard about this new build system called ‘<a href="https://github.com/mesonbuild/meson/">Meson</a>’ whose author had created an experimental port of <a href="http://gstreamer.freedesktop.org/">GStreamer</a> that built it in record time.<br />
<br />
Intrigued, he tried it out and found that it finished suspiciously quickly. His first instinct was that it was broken and hadn’t actually built everything! Turns out this build system written in Python 3 with <a href="https://ninja-build.org/">Ninja</a> as the backend actually <b><i>was</i></b> that fast. About 2.5x faster on Linux and 10x faster on Windows for building the <a href="http://cgit.freedesktop.org/gstreamer/gstreamer">core GStreamer repository</a>.<br />
<br />
Upon further investigation, Tim and I found that Meson also has really clean generic cross-compilation support (including iOS and Android), runs natively (and just as quickly) on OS X and Windows, supports GNU, Clang, and MSVC toolchains, and can even (configure and) generate Xcode and Visual Studio project files!<br />
<br />
But the critical thing that convinced me was that the creator <a href="https://twitter.com/jpakkane/">Jussi Pakkanen</a> was genuinely interested in the use-cases of widely-used software such as Qt, GNOME, and GStreamer and had already added support for several tools and idioms that we use — pkg-config, gtk-doc, gobject-introspection, gdbus-codegen, and <a href="https://github.com/mesonbuild/meson/wiki/Gnome-module">so on</a>. The project places strong emphasis on both speed and ease of use and is quite friendly to contributions.<br />
<br />
Over the past few months, Tim and I at Centricular <a href="https://github.com/centricular/">have been working</a> on creating Meson ports for <a href="https://github.com/centricular/gstreamer">most</a> <a href="https://github.com/centricular/gst-plugins-base">of</a> <a href="https://github.com/centricular/gst-plugins-good">the</a> <a href="https://github.com/centricular/gst-plugins-bad">GStreamer</a> <a href="https://github.com/centricular/gst-plugins-ugly">repositories</a> and the fundamental dependencies (<a href="https://github.com/centricular/libffi">libffi</a>, <a href="https://github.com/centricular/glib">glib</a>, <a href="https://github.com/centricular/orc">orc</a>) and improving the MSVC toolchain support in Meson.<br />
<br />
We are proud to report that you can now build GStreamer on Linux using the GNU toolchain and on Windows with either MinGW or MSVC 2015 using Meson build files that ship with the source (building upon Jussi's initial ports).<br />
<br />
Other toolchain/platform combinations haven't been tested yet, but they should work in theory (minus bugs!), and we intend to test and bugfix all the configurations supported by GStreamer (Linux, OS X, Windows, iOS, Android) before proposing it for inclusion as an alternative build system for the GStreamer project.<br />
<br />
You can either grab the source yourself and <a href="https://github.com/mesonbuild/meson/wiki/Quick%20guide">build everything</a>, or use our (with luck, temporary) fork of GStreamer's cross-platform build aggregator <a href="https://github.com/centricular/cerbero">Cerbero</a>.<br />
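<br />
If you go the standalone route, the workflow is the usual Meson one. Roughly, assuming you have Meson and Ninja installed and are inside a checkout that carries the Meson build files:<br />
<br />
<blockquote class="tr_bq"><code>mkdir build && meson build<br />
ninja -C build</code></blockquote>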
<br />
<b>Update:</b> I wrote a new post with detailed steps on how to <a href="http://blog.nirbheek.in/2016/07/building-and-developing-gstreamer-using.html">build using Cerbero and generate Visual Studio project files</a>.<br />
<br />
<span style="font-size: large;"><b>Second update: All this is now upstream, see the <a href="https://gitlab.freedesktop.org/gstreamer/cerbero/#description" target="_blank">upstream Cerbero repository's README</a> </b></span><br />
<br />
Personally, I really hope that Meson gains widespread adoption. Calling Autotools the Xorg of build systems is flattery. It really is just a terrible system. We really need to invest in something that works for us rather than against us.<br />
<br />
PS: If you just want a quick look at what the build system syntax looks like, <a href="https://github.com/centricular/proxy-libintl/blob/master/meson.build">take a look at this</a> or <a href="https://github.com/mesonbuild/meson/wiki/Tutorial">the basic tutorial</a>.</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com14tag:blogger.com,1999:blog-701969077517001201.post-79267308600859911952015-06-12T17:44:00.000+05:302015-06-12T17:47:05.991+05:30एक बच्चों की पद्यक<div dir="ltr" style="text-align: left;" trbidi="on">
आओ मीलो (clap clap clap)<br />
शीलम शालो (clap clap clap)<br />
कच्चा धागा (clap clap clap)<br />
रेस लगा लो (clap clap clap)<br />
<br />
(जल्दी से)<br />
<br />
आओ मीलो शीलम शालो कच्चा धागा रेस लगा लो<br />
<br />
दस पत्ते तोड़े<br />
एक पत्ता कच्चा<br />
हिरन का बच्चा<br />
हिरन गया पानी में<br />
पकड़ा उस्की नानी ने<br />
नानी गयी लंडन<br />
वहां से लाइ कंगन<br />
<br />
कंगन गया टूट (clap)<br />
नानी गयी रूठ (clap)<br />
<br />
(और भी तेज़ी से)<br />
<br />
नानी को मनाएंगे<br />
रस मालाइ खाएंगे<br />
रस मालाइ अच्छी<br />
हमने खाइ मच्छी<br />
मच्छी में निकला कांटा<br />
मम्मी ने मारा चांटा<br />
चांटा लगा ज़ोर से<br />
हमने खाए समोसे<br />
समोसे बढे अच्छे<br />
<br />
नानाजी नमस्ते!</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com0tag:blogger.com,1999:blog-701969077517001201.post-16833825136823003012015-05-03T14:23:00.000+05:302015-05-04T13:13:58.267+05:30A Transcoding Proxy for HTTP Video Streams<div dir="ltr" style="text-align: left;" trbidi="on">
Sometime last year, <a href="http://centricular.com/">we</a> worked on a client project to create a prototype for a server that is, in essence, a "transcoding proxy". It accepts <i>N</i> HTTP client streams and makes them available for an arbitrary number of clients via HTTP GET (and/or over RTP/UDP) in the form of WebM streams. Basically, it's something similar to <a href="https://en.wikipedia.org/wiki/Twitch.tv">Twitch.tv</a>. The terms of our work with the client allowed us to make this work available as Free and Open Source Software, and this blog post is the announcement of its public availability.<br />
<br />
<a href="http://code.centricular.com/soup-transcoding-proxy/" target="_blank">Go and try it out!</a><br />
<br />
<code>git clone http://code.centricular.com/soup-transcoding-proxy/</code><br />
<br />
The purpose of this release is to demonstrate some of the streaming/live transcoding capabilities of GStreamer and the capabilities of <a href="http://wiki.gnome.org/LibSoup" target="_blank">LibSoup</a> as an HTTP server library. Some details about the server follow, but there's <a href="http://code.centricular.com/soup-transcoding-proxy/tree/README" target="_blank">more documentation</a> and <a href="http://code.centricular.com/soup-transcoding-proxy/tree/examples" target="_blank">examples</a> on how to use the server in the git repository. <br />
<br />
In addition to using GStreamer, the server uses the GNOME HTTP library <a href="http://wiki.gnome.org/LibSoup" target="_blank">LibSoup</a> for implementing the HTTP server, which accepts and makes available live HTTP streams. Stress-testing for up to 100 simultaneous clients has been done with the server, with a measured end-to-end stream latency of between 150ms and 250ms depending on the number of clients. This can likely be improved by using codecs with lower latency and so on—after all, the project is just a prototype. :)<br />
<br />
The <i>N</i> client streams sent to the proxy via HTTP PUT/PUSH are transcoded to VP8/Vorbis WebM if needed, but are simply remuxed and passed through if they are in the same format. Optionally, the proxy can also broadcast each client stream to a list of pre-specified hosts via RTP/UDP.<br />
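<br />
To give a flavour of what feeding the proxy looks like, here is a GStreamer one-liner that PUTs a live VP8/Vorbis WebM test stream to it. The host, port, and path below are purely illustrative placeholders; see the examples in the repository for the real endpoints and the session-token handling:<br />
<br />
<blockquote class="tr_bq"><code>gst-launch-1.0 webmmux name=mux streamable=true ! souphttpclientsink location=http://localhost:8080/[session-id] \<br />
&nbsp;&nbsp;videotestsrc is-live=true ! vp8enc deadline=1 ! queue ! mux. \<br />
&nbsp;&nbsp;audiotestsrc is-live=true ! audioconvert ! vorbisenc ! queue ! mux.</code></blockquote>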
<br />
Clients that want to stream video from the server can connect or disconnect at any time, and will get the current stream whenever they (re)connect. The server also accepts HTTP streams via both Chunked-Encoding and fixed-length HTTP PUT requests.<br />
<br />
There is also a JSON-based REST API to interact with the server, as well as in-built validation via a <i>Token Server</i>: a specified host (or address mask) can be whitelisted, which allows it to add or remove <i>session id</i> tokens along with details about the types of streams that the specified <i>session id</i> is allowed to send to the server, and the types of streams that will be made available by the proxy. For more information, see the <a href="http://code.centricular.com/soup-transcoding-proxy/tree/REST-API" target="_blank">REST API documentation</a>.<br />
<br />
We hope you find this example instructive in how to use LibSoup to implement an HTTP server and in using GStreamer for streaming and encoding purposes. Looking forward to hearing from you about it! </div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com2tag:blogger.com,1999:blog-701969077517001201.post-33115996684632880562014-05-24T06:12:00.002+05:302014-05-24T08:50:44.034+05:30Making things better<div dir="ltr" style="text-align: left;" trbidi="on">
When I read <a href="http://mjg59.dreamwidth.org/31714.html" target="_blank">Matthew’s post</a> a week ago about creating desktop features that cater to developers, I found myself agreeing quite strongly with the sentiment put forth, and I started to wonder how we could better integrate development features into GNOME. We’ve so far focused strongly on general ease of use and the use-cases of non-technical users, but as we’ve seen time and again, FOSS projects tend to first become popular on the shoulders of technical users. One must be in a position to attract both kinds of users if one wants broad acceptance and use.<br />
<br />
On the other hand, I found myself disagreeing very strongly with the sentiment in <a href="http://pvanhoof.be/blog/index.php/2014/05/23/lets-make-things-better" target="_blank">Philip Van Hoof’s post</a> yesterday. I found it strange that Philip chose to state that greater focus on development integration goes hand-in-hand with a lesser focus on the <a href="https://wiki.gnome.org/OutreachProgramForWomen" target="_blank">Outreach Program for Women</a>. Surely one’s immediate reaction would be to <i>utilize</i> the manpower (sic) OPW provides to make development integration happen, right? I’m left wondering what sets of biases, prejudices, or misconceptions one must have to conclude otherwise.<br />
<br />
In fact, being in the position to call multiple former OPW participants friends and hence being intimately familiar with their work, I’ve begun to realize that the removal of the “programmers only” requirement that GSoC has, actually leads to a much more holistic approach towards patching the deficiencies that GNOME has.<br />
<br />
Without OPW, would we have had a <a href="https://lwn.net/Articles/563298/" target="_blank">Usability Researcher for GNOME 3</a>? Or a professional Typeface Designer <a href="http://dispatchesfromopw.wordpress.com/" target="_blank">improving the shapes of our UX font, Cantarell and expanding the character set</a>? And surely as programmers and users we understand <a href="http://sindhus.bitbucket.org/summary-opw.html" target="_blank">the importance</a> of <a href="http://gardengnoming.wordpress.com/2013/08/" target="_blank">documentation</a>? To those who want to see some code, <a href="http://techchicblog.wordpress.com/category/gnome/" target="_blank">there's plenty of that to see as well</a>.<br />
<br />
Over the years, GNOME as an organisation has accreted talent and expertise in a wide spread of technical domains. We have the ability to create the most “usably-featured” OS out there — but only with all our arms working <b>together</b>. Cutting one off in the hope that another will become stronger will only result in a gaping, bleeding, wound. </div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com10tag:blogger.com,1999:blog-701969077517001201.post-36265922431289230682013-11-09T13:58:00.000+05:302013-11-09T14:00:03.236+05:30A New Chapter<div dir="ltr" style="text-align: left;" trbidi="on">
Yesterday, my 20-month-long stint at <a href="http://www.collabora.com/" target="_blank">Collabora</a> ended. The company culture, work environment, and perks were brilliant, and working with friendly and extremely competent colleagues was a pleasure.<br />
<br />
Starting today, and for the next couple of months, I'll be spending most of my time on the <a href="http://bharati-braille.pareidolic.in/" target="_blank">various projects</a> that <a href="http://theballot.in/" target="_blank">I've been working on</a>, and on tackling the enormous backlog of <a href="http://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar#Lessons_for_creating_good_open_source_software" target="_blank">itches to scratch</a> that I have accumulated over the past two years. In addition, I'll be looking for (and am available for) part-time consultancy gigs to fill the gaps in-between.<br />
<br />
I'm excited about the possibilities that have opened up for me due to this, and I'm really looking forward to spending more time on GNOME!</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com4tag:blogger.com,1999:blog-701969077517001201.post-87630183627685374462013-05-03T14:58:00.000+05:302013-05-03T14:58:54.126+05:30A FOSS Devanagari to Bharati Braille Converter<div dir="ltr" style="text-align: left;" trbidi="on">
Almost a year ago, I worked with <a href="http://poojasaxena.in/">Pooja</a> on transliterating a Hindi poem to <a href="http://en.wikipedia.org/wiki/Bharati_Braille">Bharati Braille</a> for a <a href="http://typerventions.com/">Type installation</a> at Amar Jyoti School, an institute for the visually-impaired in Delhi. You can read more about that <a href="http://poojasaxena.wordpress.com/2013/05/01/devanagari-to-bharati-braille-converter/">on her blog post about it</a>. While working on that, we were surprised to discover that there were no free (or open source) tools to do the conversion! All we could find was expensive proprietary software, or horribly wrong websites. We had to sit down and manually transliterate each character while keeping in mind the <a href="http://bharati-braille.pareidolic.in/about.html#bb_limitations">idiosyncrasies</a> of the conversion.<br />
<br />
Now, like all programmers who love what they do, I have an urge to reduce the amount of drudgery and repetitive work in my life with automation ;). In addition, we both felt that a free tool to do such a transliteration would be useful for those who work in this field. And so, we decided to work on a website to convert from <a href="http://en.wikipedia.org/wiki/Devanagari">Devanagari</a> (Hindi & Marathi) to Bharati Braille.<br />
<br />
Now, after tons of research and design/coding work, we are proud to announce the first release of our <a href="http://bharati-braille.pareidolic.in/">Devanagari to Bharati Braille converter</a>! You can read more about the converter <a href="http://bharati-braille.pareidolic.in/about.html">here</a>, and download the source code on <a href="https://github.com/pareidolic/bharati-braille">Github</a>.<br />
<br />
If you know anyone who might find this useful, please tell them about it!</div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com3tag:blogger.com,1999:blog-701969077517001201.post-38968806152944536622012-12-06T10:53:00.001+05:302012-12-06T21:28:04.186+05:30Recording VoIP calls using pulseaudio and avconv<div dir="ltr" style="text-align: left;" trbidi="on">
For ages, I've wanted an option in Skype or <a href="http://live.gnome.org/Empathy">Empathy</a> to record my video and voice calls<sup>1</sup>. Text is logged constantly because it doesn't cost much in the form of resources, but voice and video are harder.<br />
<br />
In lieu of integrated support inside Empathy, and also because I mostly use Skype (for various reasons), the workaround I have is to do an <a href="http://en.wikipedia.org/wiki/X11">X11</a> screen grab and encode it to a file. This is not hard at all. A cursory glance at the man page of <a href="http://libav.org/avconv.html#Synopsis">avconv</a> will tell you how to do it:<br />
<br />
<blockquote class="tr_bq">
<code>avconv -s:v [screen-size] -f x11grab -i "$DISPLAY" output_file.mkv</code></blockquote>
<br />
<code>[screen-size]</code> is in the form of <code>1366x768</code> (Width x Height), etc, and you can extend this to record audio by passing the <code>-f pulse -i default</code> flags to avconv<sup>2</sup>—<i>but that's not quite right, is it</i>? Those flags will only record your own voice! You want to record both your own voice <i>and</i> the voices of the people you're talking to. As far as I know, avconv cannot record from multiple audio sources, and hence we must use <a href="http://www.pulseaudio.org/">Pulseaudio</a> to combine all the voices into a single audio source!<br />
<br />
As a side note, I really love Pulseaudio for the very flexible way in which you can manipulate audio streams. I'm baffled by the prevailing sense of dislike that people have towards it! The degree of script-level control you get with Pulseaudio is unparalleled by any other general-purpose audio server<sup>3</sup>. One would expect geeks to like such a tool—especially since all the old bugs with it are now fixed.<br />
<br />
So, the aim is to take my voice coming in through the microphone, and the voices of everyone else coming out of my speakers, and mix them into one audio stream which can be passed to avconv, and encoded into the video file. In technical terms, the voice coming in from the microphone is exposed as an audio <code>source</code>, and the audio for the speakers is going to an audio <code>sink</code>. Pulseaudio allows applications to listen to the audio going into a <code>sink</code> through a <code>monitor source</code>. So in effect, every <code>sink</code> also has a <code>source</code> attached to it. This will be very useful in just a minute.<br />
<br />
The work now boils down to combining two sources together into one single source for avconv. Now, apparently, there's a <a href="http://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#module-combine-sink">Pulseaudio module to combine sinks</a> but there isn't any in-built module to combine <i>sources</i>. So we route both the sources to a <code>module-null-sink</code>, and then <code>monitor</code> it! That's it.<br />
<br />
<blockquote class="tr_bq">
<code><br />
pactl load-module module-null-sink sink_name=combined<br />
pactl load-module module-loopback sink=combined source=[voip-source-id]<br />
pactl load-module module-loopback sink=combined source=[mic-source-id]<br />
avconv -s:v [screen-size" -f x11grab -i "$DISPLAY" -f pulse -i combined.monitor output_file.mkv <br />
</code></blockquote>
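<br />
To fill in <code>[voip-source-id]</code> and <code>[mic-source-id]</code>, ask Pulseaudio for its list of sources: the VoIP audio is the <code>monitor</code> source of whichever sink Skype/Empathy plays to, and the microphone is a regular source. And once you're done recording, unload the modules again (each <code>load-module</code> call above prints the index of the module it loaded):<br />
<br />
<blockquote class="tr_bq"><code># pick the mic source and the .monitor source of the sink your VoIP app plays to<br />
pactl list short sources<br />
# cleanup afterwards, using the indices printed by the load-module calls above<br />
pactl unload-module [module-index]</code></blockquote>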
<br />
<a href="http://nirbheek.in/files/record_screen.sh">Here's a script that does this and more</a> (it also does auto setup and cleanup). Run it, and it should Just Work™.<br />
<br />
Cheers!<br />
<br />
<span style="font-size: x-small;">1. It goes without saying that doing so is a breach of the general expectation of privacy, and must be done with the consent of all parties involved. In some countries, not getting consent may even be illegal.</span><br />
<span style="font-size: x-small;">2. If you don't use Pulseaudio, see the man page of avconv for other options, and stop reading now. The cool stuff requires Pulseaudio. :)</span><br />
<span style="font-size: x-small;">3. I don't count JACK as a general-purpose audio system. It's specialized for a unique pro-audio use case.</span></div>
Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com9tag:blogger.com,1999:blog-701969077517001201.post-5495180571571496902012-08-19T11:01:00.000+05:302012-08-19T11:09:29.119+05:30Android and Disk EncryptionBeginning with Android 3.0 (Honeycomb), Android includes the ability to <a href="http://source.android.com/tech/encryption/android_crypto_implementation.html">transparently encrypt your phone's storage</a> using the phone's settings. Internally, this works by using <a href="http://en.wikipedia.org/wiki/Dm-crypt">dm-crypt</a> — just like every other Linux distro out there. But what I found intriguing about this was that it only allows you to encrypt your phone if you use either a password or a numeric pin to lock your phone.<br />
<br />
This means that the password/pin is shared between the screen lock and dm-crypt. This has a number of consequences which I'll talk about below.<br />
<br />
Now, I understand why this is the default behaviour. Most users rarely, if ever, reboot their phones, and so if the phone has a (separate) passphrase for dm-crypt, we'll see users flooding service centres to get their phone "un-bricked" because they forgot they even <i>had</i> a passphrase. <br />
<br />
What surprises me is that there's <i>no </i>stock method to set a different passphrase for dm-crypt. Even <a href="http://www.cyanogenmod.com/">CyanogenMod</a> doesn't have this feature built-in. The only easy-to-use way I know to do this is by using the <a href="https://play.google.com/store/apps/details?id=org.nick.cryptfs.passwdmanager">Cryptfs Password app</a> (Disclaimer: I haven't actually tried the app itself, so I can make no guarantees about it).<br />
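<br />
(For the adventurous: on a rooted phone you can poke vold directly from a root adb shell, which is, as far as I can tell, what such apps do under the hood. The exact syntax differs between Android versions, so treat this as a rough sketch rather than a recipe:)<br />
<br />
<code>vdc cryptfs changepw [new-passphrase]</code><br />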
<br />
What also surprises me is that Android accepts pin numbers as dm-crypt passphrases, but not lock patterns! This decision makes little sense to me. Pattern locks are almost equivalent to pin numbers because as can be seen below, your pattern lock directly corresponds to a number. <br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigF2LeIeJ_sIZsvZscR31pDj_alhMzkYuTEovXfMjAx_Htz4_FO6P32Cb0wLqUqf2ejtNViIbWPDISFNdMarcrIcdqRd-1iOf3uAtiiODYpz9tKkpbqTyihMTAVfbSaZRYI3nMhAAxHjEX/s1600/pattern_lock_numbers.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigF2LeIeJ_sIZsvZscR31pDj_alhMzkYuTEovXfMjAx_Htz4_FO6P32Cb0wLqUqf2ejtNViIbWPDISFNdMarcrIcdqRd-1iOf3uAtiiODYpz9tKkpbqTyihMTAVfbSaZRYI3nMhAAxHjEX/s320/pattern_lock_numbers.jpg" width="262" /></a></div>
<br />
I say <i>almost</i> equivalent because from each node on that grid you can only access adjacent nodes to create patterns, which reduces the space of possible numbers a bit (and there's no zero). But if this is a problem, then pin numbers shouldn't be allowed either, since a numeric passphrase is trivial to crack, as anyone silly enough to use a numeric bicycle lock has found to their great distress.<br />
<br />
And even if the screen lock is set using a password, the user is extremely unlikely to use anything but a trivial password for securing a screen that they unlock tens of times a day. This again means that the passphrase would be ridiculously easy to brute-force if the attacker has physical access to the phone.<br />
<br />
The level of security for a screen lock is just massively different from the level of security suitable for full-disk encryption. It's really good that the groundwork for this feature has been done, but as it stands the feature is mostly pointless. <br />
<br />
PS: The phone wallpaper in the screenshot was brought to you by <a href="https://play.google.com/store/apps/details?id=org.lucasr.pattrn">Pattrn</a>!Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com4tag:blogger.com,1999:blog-701969077517001201.post-73096398961606000702012-07-19T15:45:00.001+05:302012-07-19T20:35:21.657+05:30GUADEC 2012, A Coruña<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.guadec.org/sites/www.guadec.org/files/banner-125.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.guadec.org/sites/www.guadec.org/files/banner-125.png" style="background-color: white;" /></a></div>
<br />
This will be my first <a href="http://www.guadec.org/">GUADEC</a>, and I'm looking forward to it. Thanks to my employer <a href="http://www.collabora.co.uk/">Collabora</a> for sponsoring this trip!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.collabora.com/logos/collabora-logo-small.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="65" src="http://www.collabora.com/logos/collabora-logo-small.png" style="background-color: white;" width="200" /></a></div>
<br />
<br />
I'll be around from 25th evening to 30th morning. Hope to see you all there. :)Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com0tag:blogger.com,1999:blog-701969077517001201.post-60456028482039016402012-07-01T06:49:00.000+05:302012-07-02T03:28:11.474+05:30Wired, headline click areaSomething has always bothered me about the story links on <a href="http://www.wired.com/">Wired magazine</a>'s home page. Today it bothered me enough to write about it.<br />
<br />
I don't know if they do this on purpose, or whether it's just an oversight. If it's on purpose, I'd love to know the reason why.<br />
<br />
So here's a story, and I hover over the text:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNO49gx4eNgd_gz8RlcHxEBzi47aSFGbAOsOWTFSYSn4kjsZxDdvbC5BF4JPbo84lflLIQTfo2tX3h_flb3uVNRBUa5dohlgQbebaHIKwohOPT2d0JCobMAReKnWfwM1Ycre2rN5Ok8VgX/s1600/Wired-click.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNO49gx4eNgd_gz8RlcHxEBzi47aSFGbAOsOWTFSYSn4kjsZxDdvbC5BF4JPbo84lflLIQTfo2tX3h_flb3uVNRBUa5dohlgQbebaHIKwohOPT2d0JCobMAReKnWfwM1Ycre2rN5Ok8VgX/s1600/Wired-click.png" /></a></div>
<br />
<br />
There you go, it lights up and I can click on it… Oops! I moved my mouse by just a few pixels, and: <br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzHwsnUlmMmnN9NFWiX1Lr13F8E7zFnZ5_ZKyCtcyeDawk83JOwFueNpBi16fUc8GJldnWR_gzHLv3Sor8t_hO9g121YBfvxGCuKMdYL4uZQ_BEAUZ1df-RNd4F1ROoK5t5Utu1WylnX-m/s1600/Wired-noclick.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzHwsnUlmMmnN9NFWiX1Lr13F8E7zFnZ5_ZKyCtcyeDawk83JOwFueNpBi16fUc8GJldnWR_gzHLv3Sor8t_hO9g121YBfvxGCuKMdYL4uZQ_BEAUZ1df-RNd4F1ROoK5t5Utu1WylnX-m/s1600/Wired-noclick.png" /></a></div>
<br />
Ew, I missed! I can't click on it any more.<br />
<br />
Fitts' law, my friends, is being violated. Why are they making it harder for me to click on their headlines?<br />
<br />
<a href="http://www.arstechnica.com/">Ars Technica</a> doesn't seem to have this problem. No matter where I hover inside a headline, it still lights up as a link, and I can click to view it:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPapuCczfA3YHGCqBGjQLhKeQDswHsMrhh6oJFqDSaSWQuF_TeTvvQfnGu_BpfOwFGHpqgVxd6ZbOxEbQw6SceR5628HDu6t-Bi5H9KTbY6KnDhaHsLVnUTyO6m7QBE6GpIa9DnGTqhYT-/s1600/Ars.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPapuCczfA3YHGCqBGjQLhKeQDswHsMrhh6oJFqDSaSWQuF_TeTvvQfnGu_BpfOwFGHpqgVxd6ZbOxEbQw6SceR5628HDu6t-Bi5H9KTbY6KnDhaHsLVnUTyO6m7QBE6GpIa9DnGTqhYT-/s1600/Ars.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
The difference comes because Ars Technica surrounds the entire span with the hyperlink anchor, whereas Wired only surrounds the text. <br />
<br />
I find this somewhat upsetting.<br />
<br />Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com10tag:blogger.com,1999:blog-701969077517001201.post-61760559409927834852012-02-04T04:05:00.000+05:302012-02-04T04:22:27.535+05:30An unintended gem about usability<br />
<blockquote class="tr_bq">
<UU> Somedays, I think why can't we have computers which just work.<br />
<UU> But then I remember that I am a Computer Scientist.<br />
<UU> So, yeah, I guess I understand why.<br />
<Nirbheek> :D</blockquote>
<br />
<i>Quite related to GNOME, really.</i>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com0tag:blogger.com,1999:blog-701969077517001201.post-83903712223994278322011-03-04T15:25:00.000+05:302011-03-04T15:25:23.104+05:30GNOME 3 on Gentoo and related news<div>
<p>As those of you who follow the Gentoo GNOME Overlay know, GNOME 3 is shaping up nicely in the overlay, and runs according to upstream's intentions quite well. Whatever is missing should be filed as <a href="https://bugs.gentoo.org/enter_bug.cgi?product=Gentoo%20Linux&version=unspecified&component=GNOME&rep_platform=All&op_sys=Linux&priority=P2&bug_severity=normal&bug_status=NEW&alias=&bug_file_loc=http%3A%2F%2F&short_desc=%5Bgnome-overlay%5D%20&comment=&commentprivacy=0&keywords=&dependson=&blocked=&maketemplate=Remember%20values%20as%20bookmarkable%20template&form_name=enter_bug&assigned_to=gnome%40gentoo.org">a bug</a> and will be taken care of. :-)</p>
<p>Now that it's been a few days since the <a href="http://live.gnome.org/TwoPointNinetyone">release cycle</a> entered UI freeze, we have been able to evaluate whether or not you folks (i.e., our users) will be able to transition from GNOME 2 to GNOME 3 without too much pain. We came to the conclusion that there is no particular hurry to let go of GNOME 2.32, and that we should wait for things to settle down before unleashing GNOME 3 on our users.</p>
<p>Hence, this blog post serves as a notice for all Gentoo GNOME users about the fact that the addition of the latest GNOME to <acronym title="Roughly equivalent to debian testing/unstable">~arch in the Gentoo portage tree</acronym> will be delayed much more than usual. People who wish to be early-birds and try out GNOME 3 (and help with bugs!) should check out the <a href="http://git.overlays.gentoo.org/gitweb/?p=proj/gnome.git">Gentoo GNOME Overlay</a> (layman -a gnome).</p>
<p>One of the reasons for this is that besides the inevitable (temporary) feature regressions, parts of the design of GNOME 3 are still <a href="http://www.mail-archive.com/gnome-shell-list@gnome.org/msg02527.html">a work-in-progress</a>, and some of the existing designs aren't fully implemented yet. For instance, file management is currently in a half-way state, network-manager-applet is still being used, and the fallback mode needs work. In addition, a11y support in GNOME Shell is incomplete, and from what I can make out, it's not in a "shippable" state.</p>
<p>However, the list will definitely change before the final release. Things are in a fluid state at the moment, with upstream maintainers working hard at fixing bugs before the final freeze (you should help them in this!).</p>
<p>Another reason for the delay is that the influx of GNOME 3 libraries which need to be installed alongside GNOME 2 libraries means that the dependencies of a lot of in-tree ebuilds need to be adjusted. This is mostly straightforward work; except where slotting of libraries was not feasible, and porting to GTK+3 will need to be done. To ease the transition, and allow porting, GNOME 3.0 will probably be added to the tree <acronym title="Equivalent to Debian experimental">hard-masked</acronym>, or stay in the overlay till the work is done.</p>
<p>Looking at how things are moving, the upper limit for when GNOME 3 will get added to the ~arch tree is the 3.2 release. I have a personal stake in this, since I particularly love GTK+3, GSettings, GDBus, and GNOME Shell. I somehow feel an OCD need to see <i>everything</i> ported away from GTK+2/GConf/dbus-glib/bonobo/libunique towards GTK+3/GSettings/GDBus/GtkApplication. :D</p>
<p>In related news, thanks to the efforts of <a href="http://blog.aidecoe.name/">Amadeusz Żołnowski (aidecoe)</a>, <a href="http://cia.vc/stats/author/aidecoe/.message/694">Plymouth is now in the tree</a>! I tried it out, and it seems to work quite well. I'd love to see the Gentoo community create more Gentoo-centric themes for it. The absence of Larry the Cow was sorely felt. :-)</p>
<p>Some of you may remember <a href="http://bheekly.blogspot.com/2010/08/systemd-in-gentoo.html">my last blog post</a>, which was about systemd on Gentoo. Fellow dev <acronym title="Well-known for some minor exploits outside of Gentoo ;)">Greg KH</acronym> has <a href="https://bugs.gentoo.org/show_bug.cgi?id=318365#c141">taken up the mantle</a> of getting systemd into the tree. Thanks to everyone on <a href="https://bugs.gentoo.org/show_bug.cgi?id=318365">the bug report</a> for making systemd work on Gentoo, and thanks to Greg for volunteering to get it into the tree!</p>
<p>Here's to a very exciting 2011 year!</p>
</div>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com9tag:blogger.com,1999:blog-701969077517001201.post-21754604353939879612010-08-26T06:47:00.001+05:302010-08-26T06:48:56.249+05:30Systemd in Gentoo<p>A lot of folks are raving about the next generation in init systems (aka <a href="http://0pointer.de/blog/projects/systemd.html">systemd</a>), and how it's (almost certainly) going to be the <a href="http://lwn.net/Articles/401856/">default init system for Fedora 14</a> (paid article, subscribe to LWN to read! [or wait a week]). It also seems that OpenSuse will be moving to systemd sometime in the near future (don't take my word for this though), and <a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=580814">Debian has at least considered it</a>. It is also well-known that Ubuntu will <b>not</b> be using systemd for the foreseeable future.</p>
<p>So where is Gentoo in all this? Our current init system is <a href="http://packages.gentoo.org/package/sys-apps/baselayout">baselayout-1</a> in the stable tree, and <a href="http://roy.marples.name/projects/openrc">openrc</a> in the ~arch tree. The <a href="http://bugs.gentoo.org/show_bug.cgi?id=318365">maintainer-wanted bug for systemd</a> has been quite active with users posting preliminary ebuilds for it. The bug itself currently has >30 folks CCed (including me), and 86 votes. So users are definitely very interested in seeing systemd in Gentoo. However, it will take a <i>lot</i> of work before systemd can enter portage even as a masked ebuild.</p>
<p>Even after systemd enters portage, it is extremely unlikely that it will become the default init system for reasons that are listed below. Some developers are strongly in favour of moving from baselayout-1 to systemd, while some think it's a pile of crap that Gentoo should stay far away from. Neither of these opinions is shared by the majority of Gentoo devs. (that includes me :-)</p>
<p>In all likelihood, the end result will be that <a href="http://www.gentoo.org/proj/en/base/openrc/">openrc</a> will <b>finally</b> go stable (replacing baselayout-1), and if any developers are willing to spend the massive amount of time and effort required to make systemd usable in Gentoo, systemd will become an optional init system, strongly recommended for desktops/laptops.</p>
<p>Now why can't we throw out baselayout-1 as well as openrc and just use systemd? I was going to make a full list of the reasons, but as I was making it, I realised that I don't know enough details about systemd's requirements, what all it provides, what parts of Baselayout would need to be rewritten, how much porting of the tree (and systemd) would be needed, etc. So instead of hand-waving, I'll just list "needs several volunteer developers" as the blocker for now :)</p>
<p>I'm tempted to list myself as a future volunteer, but I won't do such a thing yet. Rest assured that if I do end up working on this, I'll be sure to blog about it. Although it is probably just a matter of "time" ;)</p>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com6tag:blogger.com,1999:blog-701969077517001201.post-71352085466058606962010-04-20T17:41:00.000+05:302010-04-20T17:41:10.362+05:30Managing alternate browsers<p>For a while now, I've battled with the many problems of having too many tabs open in Firefox. One of them is faced when I want to view a link I saw on IRC, IM, Twitter, etc, and Firefox isn't running.</p>
<p>I click on the link; and wait. And wait. And wait. And finally Firefox pops up! But it's unusable because it's too busy loading 3 windows and 200 tabs. I just want to view this <i>one link</i>, I don't want the other 200 links to be opened!</p>
<p>For this use-case, I usually keep alternate browsers; Epiphany and/or Chromium. So I copy the link, open one of the browsers that <i>won't</i> open a bazillion tabs, and paste the link there. This is obviously somewhat annoying to do; and causes frustration when I forget that I don't have Firefox running and accidentally click on a link. *click* [wait] "huh? OH SHI-" *computer slows down to a crawl*</p>
<p>The final straw was when I realised that <a href="http://code.google.com/p/pino-twitter/">some clients</a> <i>don't let me</i> right-click -> copy links. So I wrote <a href="http://dev.gentoo.org/~nirbheek/files/browser-spawn.sh">a small script</a> to make the right decision for me whenever I click on a link so I can stop worrying about behemoth-launching. Now I've set the script as my "default browser" so that all links are sent straight to it, and the proper action is taken.</p>
<p>What it does is check whether one of your browsers (listed in $ORDER in the script) is already open, and open the link in that. For instance, if Firefox is already open, you're probably using it, and you want the link to open in that. If Firefox is not, but Epiphany is already open, it's best to open it in Epiphany; and so on.<br/>
<br/>
If no browsers are open, it tries to open the link in one of the lightweight "alternate" browsers that you have (listed in $SPAWN in the script).</p>
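<p>The core of the logic is only a few lines of shell. Here's a minimal sketch of the idea (the real script linked above does more, such as setup and cleanup, and the browser lists are of course illustrative):</p>
<blockquote class="tr_bq"><code>#!/bin/sh<br/>
# Browsers to prefer if one of them is already running, in order of preference<br/>
ORDER="firefox epiphany chromium"<br/>
# Lightweight browsers to fall back to if nothing is running yet<br/>
SPAWN="epiphany chromium"<br/>
<br/>
for browser in $ORDER; do<br/>
&nbsp;&nbsp;&nbsp;&nbsp;pgrep -x "$browser" >/dev/null && exec "$browser" "$1"<br/>
done<br/>
for browser in $SPAWN; do<br/>
&nbsp;&nbsp;&nbsp;&nbsp;command -v "$browser" >/dev/null && exec "$browser" "$1"<br/>
done</code></blockquote>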
<p><b>Set the script as your default browser, and it shall feed hungry kittens.</b> It certainly did so for me ;)</p>Nirbheekhttp://www.blogger.com/profile/05472526900877533156noreply@blogger.com11